1
0
Fork 0
alistair23-linux/Documentation
Kirill A. Shutemov c8d78c1823 mm: replace remap_file_pages() syscall with emulation
remap_file_pages(2) was invented to be able efficiently map parts of
huge file into limited 32-bit virtual address space such as in database
workloads.

Nonlinear mappings are pain to support and it seems there's no
legitimate use-cases nowadays since 64-bit systems are widely available.

Let's drop it and get rid of all these special-cased code.

The patch replaces the syscall with emulation which creates new VMA on
each remap_file_pages(), unless they it can be merged with an adjacent
one.

I didn't find *any* real code that uses remap_file_pages(2) to test
emulation impact on.  I've checked Debian code search and source of all
packages in ALT Linux.  No real users: libc wrappers, mentions in
strace, gdb, valgrind and this kind of stuff.

There are few basic tests in LTP for the syscall.  They work just fine
with emulation.

To test performance impact, I've written small test case which
demonstrate pretty much worst case scenario: map 4G shmfs file, write to
begin of every page pgoff of the page, remap pages in reverse order,
read every page.

The test creates 1 million of VMAs if emulation is in use, so I had to
set vm.max_map_count to 1100000 to avoid -ENOMEM.

Before:		23.3 ( +-  4.31% ) seconds
After:		43.9 ( +-  0.85% ) seconds
Slowdown:	1.88x

I believe we can live with that.

Test case:

        #define _GNU_SOURCE
        #include <assert.h>
        #include <stdlib.h>
        #include <stdio.h>
        #include <sys/mman.h>

        #define MB	(1024UL * 1024)
        #define SIZE	(4096 * MB)

        int main(int argc, char **argv)
        {
                unsigned long *p;
                long i, pass;

                for (pass = 0; pass < 10; pass++) {
                        p = mmap(NULL, SIZE, PROT_READ|PROT_WRITE,
                                        MAP_SHARED | MAP_ANONYMOUS, -1, 0);
                        if (p == MAP_FAILED) {
                                perror("mmap");
                                return -1;
                        }

                        for (i = 0; i < SIZE / 4096; i++)
                                p[i * 4096 / sizeof(*p)] = i;

                        for (i = 0; i < SIZE / 4096; i++) {
                                if (remap_file_pages(p + i * 4096 / sizeof(*p), 4096,
                                                0, (SIZE - 4096 * (i + 1)) >> 12, 0)) {
                                        perror("remap_file_pages");
                                        return -1;
                                }
                        }

                        for (i = SIZE / 4096 - 1; i >= 0; i--)
                                assert(p[i * 4096 / sizeof(*p)] == SIZE / 4096 - i - 1);

                        munmap(p, SIZE);
                }

                return 0;
        }

[akpm@linux-foundation.org: fix spello]
[sasha.levin@oracle.com: initialize populate before usage]
[sasha.levin@oracle.com: grab file ref to prevent race while mmaping]
Signed-off-by: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Dave Jones <davej@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Armin Rigo <arigo@tunes.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-10 14:30:30 -08:00
..
ABI perf/core improvements and fixes: 2015-01-28 15:43:15 +01:00
DocBook media updates for v3.19-rc1 2014-12-18 20:14:49 -08:00
EDID drm: Add 800x600 (SVGA) screen resolution to the built-in EDIDs 2014-05-26 12:53:40 +10:00
PCI doc: replace "practise" with "practice" in Documentation 2014-06-19 15:28:56 +02:00
RCU Merge branches 'doc.2015.01.07a', 'fixes.2015.01.15a', 'preempt.2015.01.06a', 'srcu.2015.01.06a', 'stall.2015.01.16a' and 'torture.2015.01.11a' into HEAD 2015-01-15 23:34:34 -08:00
accounting Documentation: use subdir-y to avoid unnecessary built-in.o files 2014-09-26 11:02:55 +02:00
acpi ACPI / GPIO: Document ACPI GPIO mappings API 2014-11-04 21:58:24 +01:00
aoe
arm Merge branch 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm 2014-12-12 15:26:48 -08:00
arm64 arm64: Emulate CP15 Barrier instructions 2014-11-20 16:34:48 +00:00
auxdisplay Documentation: use subdir-y to avoid unnecessary built-in.o files 2014-09-26 11:02:55 +02:00
backlight
blackfin Documentation: add makefiles for more targets 2014-09-26 11:02:56 +02:00
block Merge branch 'for-3.19/core' of git://git.kernel.dk/linux-block 2014-12-13 14:14:23 -08:00
blockdev zram: report maximum used memory 2014-10-09 22:26:02 -04:00
bus-devices
cdrom
cgroups Update of Documentation/cgroups/00-INDEX 2015-01-05 09:19:32 -05:00
connector w1: optional bundling of netlink kernel replies 2014-05-27 13:56:21 -07:00
console
cpu-freq intel_pstate: Add support for HWP 2014-11-12 00:04:38 +01:00
cpuidle
cris
crypto crypto: doc - userspace interface spec 2014-11-13 22:31:38 +08:00
development-process Documentation: remove outdated references to the linux-next wiki 2014-10-28 09:06:11 -04:00
device-mapper dm cache policy mq: simplify ability to promote sequential IO to the cache 2014-11-10 15:25:30 -05:00
devicetree Merge branch 'for-3.20' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata 2015-02-09 19:40:42 -08:00
dmaengine Documentation: dmanegine: move dmatest.txt to dmaengine folder 2014-11-06 11:17:37 +05:30
driver-model PCI changes for the v3.18 merge window: 2014-10-09 15:03:49 -04:00
dvb [media] get_dvb_firmware: Update firmware of ITEtech IT9135 2014-09-21 17:03:04 -03:00
early-userspace
extcon
fault-injection
fb doc: spelling error changes 2014-05-05 15:32:05 +02:00
filesystems fsioctl.c: make generic_block_fiemap() signal-tolerant 2015-02-10 14:30:30 -08:00
firmware_class doc: fix minor typos in firmware_class README 2014-07-17 18:43:40 -07:00
fmc FMC: make eeprom attribute writable 2014-02-28 15:12:08 -08:00
frv
gpio This is the bulk of GPIO changes for the v3.19 series: 2014-12-14 14:05:05 -08:00
hid HID: uhid: update documentation 2014-08-25 03:28:09 -05:00
hwmon hwmon: (ina2xx) Add ina231 compatible string 2015-01-25 21:23:59 -08:00
i2c Documentation: i2c: Use PM ops instead of legacy suspend/resume 2014-12-04 19:09:03 +01:00
i2o
ia64 kvm: Documentation: remove ia64 2014-11-20 11:08:55 +01:00
ide
infiniband IB/mad: add new ioctl to ABI to support new registration options 2014-08-10 20:36:00 -07:00
input Docs changes for the 3.19 merge window 2014-12-12 14:42:48 -08:00
ioctl cxl: Add documentation for userspace APIs 2014-10-08 20:16:19 +11:00
isdn
ja_JP Documentation: Update stable address in Chinese and Japanese translations 2014-04-16 14:13:27 -07:00
kbuild Documentation: kbuild: Improve grammar 2014-08-19 10:02:56 +02:00
kdump kernel: add panic_on_warn 2014-12-10 17:41:10 -08:00
ko_KR
laptops Documentation: update .gitignore files 2014-09-26 11:02:59 +02:00
leds
locking locking/Documentation: Update code path 2015-01-16 09:09:21 +01:00
m68k
memory-devices
metag
mic Documentation: Build mic/mpssd only for x86_64 2014-12-05 11:18:36 -05:00
mips Documentation: au1xxx-ide.c has moved 2014-08-26 09:35:53 +02:00
misc-devices Documentation: use subdir-y to avoid unnecessary built-in.o files 2014-09-26 11:02:55 +02:00
mmc
mn10300
mtd MTD updates for 3.16: 2014-06-11 08:35:34 -07:00
namespaces
netlabel
networking Documentation: Update netlink_mmap.txt 2015-02-02 18:50:00 -08:00
nfc
nios2 Documentation: Add documentation for Nios2 architecture 2014-12-08 12:56:06 +08:00
parisc
pcmcia Documentation: use subdir-y to avoid unnecessary built-in.o files 2014-09-26 11:02:55 +02:00
phy phy: Add new Exynos USB 2.0 PHY driver 2014-03-08 12:39:44 +05:30
platform Documentation: Add list of laptop models supported by the Compal driver 2014-06-10 19:11:06 -04:00
power Merge branch 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm 2014-12-12 15:26:48 -08:00
powerpc cxl: Add documentation for userspace APIs 2014-10-08 20:16:19 +11:00
pps
prctl Documentation: Restrict TSC test code to x86 2014-10-28 08:46:27 -04:00
pti
ptp ptp: restore the makefile for building the test program. 2014-10-24 16:07:10 -04:00
rapidio rapidio/tsi721_dma: rework scatter-gather list handling 2014-08-08 15:57:24 -07:00
s390 s390/docs: Remove sections that are not related to s390 2014-11-18 18:22:59 +01:00
scheduler Documentation/scheduler/sched-deadline.txt: Add minimal main() appendix 2014-09-16 10:23:45 +02:00
scsi Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2014-12-12 10:08:06 -08:00
security Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity into next 2014-11-19 21:36:07 +11:00
serial serial: Fix locking for uart driver set_termios() method 2014-11-05 18:53:54 -08:00
sh
sound ALSA: hda - Add "eapd" model string for AD1986A codec 2014-12-10 14:00:13 +01:00
spi Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/doc 2014-10-07 21:14:57 -04:00
sysctl ipc/msg: increase MSGMNI, remove scaling 2014-12-13 12:42:52 -08:00
target Documentation/target: Update fabric_ops to latest code 2015-01-06 13:46:49 -08:00
thermal Documentation: thermal: document of_cpufreq_cooling_register() 2015-01-06 14:39:17 -04:00
timers Documentation: update .gitignore files 2014-09-26 11:02:59 +02:00
tpm
trace Char/Misc driver patches for 3.19-rc1 2014-12-14 16:43:47 -08:00
usb USB patches for 3.19-rc1 2014-12-14 14:57:16 -08:00
vDSO vdso: don't require 64-bit math in standalone test 2014-10-25 10:53:44 -04:00
video4linux [media] vivid.txt: document new controls 2014-12-16 23:21:37 -02:00
virtual Second round of changes for KVM for arm/arm64 for v3.19; fixes reboot 2014-12-15 13:06:40 +01:00
vm mm: replace remap_file_pages() syscall with emulation 2015-02-10 14:30:30 -08:00
w1 w1: new w1_ds2406 driver 2014-06-19 17:45:14 -07:00
watchdog Documentation: use subdir-y to avoid unnecessary built-in.o files 2014-09-26 11:02:55 +02:00
wimax
x86 x86, entry: Switch stacks on a paranoid entry from userspace 2015-01-02 10:22:45 -08:00
xtensa
zh_CN Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2014-08-06 21:03:53 -07:00
00-INDEX locking/Documentation: Move locking related docs into Documentation/locking/ 2014-08-13 10:32:03 +02:00
BUG-HUNTING
Changes Update old iproute2 and Xen Remus links 2014-12-09 13:38:13 -05:00
CodingStyle CodingStyle: add some more error handling guidelines 2014-12-02 08:55:32 -05:00
DMA-API-HOWTO.txt Documentation: correct parameter error for dma_mapping_error 2014-09-26 11:22:29 +02:00
DMA-API.txt DMA-API: Capitalize "CPU" consistently 2014-05-26 17:28:27 -06:00
DMA-ISA-LPC.txt DMA-API: Clarify physical/bus address distinction 2014-05-20 16:54:21 -06:00
DMA-attributes.txt doc: spelling error changes 2014-05-05 15:32:05 +02:00
HOWTO Documentation: remove outdated references to the linux-next wiki 2014-10-28 09:06:11 -04:00
IPMI.txt ipmi: Add SMBus interface driver (SSIF) 2014-12-11 15:04:11 -06:00
IRQ-affinity.txt
IRQ-domain.txt irqdomain: Introduce new interfaces to support hierarchy irqdomains 2014-11-23 13:01:45 +01:00
IRQ.txt
Intel-IOMMU.txt
Makefile Documentation: add makefiles for more targets 2014-09-26 11:02:56 +02:00
ManagementStyle
SAK.txt
SM501.txt
SecurityBugs
SubmitChecklist
SubmittingDrivers doc: SubmittingPatches: remove dead link, kerneltrap.org no longer works 2014-06-19 15:15:27 +02:00
SubmittingPatches Documentation/SubmittingPatches: Reported-by tags and permission 2014-10-29 08:56:46 -04:00
VGA-softcursor.txt
applying-patches.txt Documentation: change "&" to "and" in Documentation/applying-patches.txt 2014-09-26 11:10:11 +02:00
assoc_array.txt
atomic_ops.txt documentation: Add atomic_long_t to atomic_ops.txt 2014-11-13 10:34:54 -08:00
bad_memory.txt
basic_profiling.txt
bcache.txt
binfmt_misc.txt binfmt_misc: touch up documentation a bit 2014-10-14 02:18:16 +02:00
braille-console.txt
bt8xxgpio.txt
btmrvl.txt
bus-virt-phys-mapping.txt
cachetlb.txt
circular-buffers.txt
clk.txt clk: Change clk_ops->determine_rate to return a clk_hw as the best parent 2014-12-03 16:21:37 -08:00
coccinelle.txt
cpu-hotplug.txt Doc/cpu-hotplug: Specify race-free way to register CPU hotplug callbacks 2014-03-20 13:43:40 +01:00
cpu-load.txt
cputopology.txt
crc32.txt
dcdbas.txt
debugging-modules.txt
debugging-via-ohci1394.txt firewire: revert to 4 GB RDMA, fix protocols using Memory Space 2014-05-29 15:50:30 +02:00
dell_rbu.txt
devices.txt Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media 2014-04-04 09:50:07 -07:00
digsig.txt
dma-buf-sharing.txt Documentation/dma-buf-sharing.txt: update API descriptions 2014-08-28 11:57:24 +05:30
dontdiff Documentation: LLVMLinux: Update Documentation/dontdiff 2014-04-09 13:44:34 -07:00
dynamic-debug-howto.txt doc: spelling error changes 2014-05-05 15:32:05 +02:00
edac.txt Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial into next 2014-06-04 08:50:34 -07:00
efi-stub.txt doc: arm64: add description of EFI stub support 2014-04-30 19:57:05 +01:00
eisa.txt
email-clients.txt Documentation/email-clients.txt: add info about Claws Mail 2014-12-02 11:55:29 -05:00
flexible-arrays.txt
futex-requeue-pi.txt doc: Fix misnamed FUTEX_CMP_REQUEUE_PI op constants 2015-01-19 12:05:32 +01:00
gcov.txt
highuid.txt
hsi.txt Documentation: HSI: Add some general description for the HSI subsystem 2014-05-04 09:49:46 +02:00
hw_random.txt
hwspinlock.txt
init.txt
initrd.txt
intel_txt.txt
io-mapping.txt
io_ordering.txt
iostats.txt
irqflags-tracing.txt asm/system.h: clean asm/system.h from docs 2014-04-07 16:36:11 -07:00
isapnp.txt
java.txt Documentation: update java sample wrapper for java 7 2014-05-25 12:39:00 -07:00
kernel-doc-nano-HOWTO.txt
kernel-docs.txt
kernel-parameters.txt Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input 2015-01-19 04:55:23 +12:00
kernel-per-CPU-kthreads.txt
kmemcheck.txt doc: fix double words 2014-03-21 13:16:58 +01:00
kmemleak.txt Documentation: Add CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF case 2014-10-24 13:59:03 -04:00
kobject.txt kobject: grammar fix 2014-12-08 09:07:11 -05:00
kprobes.txt Documentation/kprobes: add s390 to list of supported architectures 2014-09-09 08:53:27 +02:00
kref.txt
kselftest.txt kselftest: Move the docs to the Documentation dir 2014-11-24 10:49:54 -07:00
ldm.txt
local_ops.txt percpu: update local_ops.txt to reflect this_cpu operations 2014-12-13 12:42:53 -08:00
lockup-watchdogs.txt lockup-watchdogs: Fix a typo 2014-08-26 09:35:52 +02:00
logo.gif
logo.txt
lzo.txt Documentation: lzo: document part of the encoding 2014-09-28 11:08:00 +02:00
magic-number.txt Documentation/serial: Delete obsolete driver documentation 2014-04-16 14:20:34 -07:00
mailbox.txt Documentation: Fix a typo in mailbox.txt 2014-11-03 11:54:50 -05:00
md.txt
media-framework.txt
memory-barriers.txt documentation: Fix smp typo in memory-barriers.txt 2015-01-07 08:59:32 -08:00
memory-hotplug.txt memory-hotplug: add sysfs valid_zones attribute 2014-10-09 22:25:52 -04:00
module-signing.txt Nothing major: the stricter permissions checking for sysfs broke 2014-04-06 09:38:07 -07:00
mono.txt
nommu-mmap.txt
numastat.txt
oops-tracing.txt panic: add TAINT_SOFTLOCKUP 2014-08-08 15:57:24 -07:00
padata.txt
parport-lowlevel.txt
parport.txt
percpu-rw-semaphore.txt
phy.txt phy: improved lookup method 2014-11-21 19:48:50 +05:30
pi-futex.txt
pinctrl.txt pinctrl: clean up after enable refactoring 2014-09-04 10:05:07 +02:00
pnp.txt
preempt-locking.txt
printk-formats.txt lib/vsprintf: add %*pE[achnops] format specifier 2014-10-14 02:18:26 +02:00
pwm.txt pwm: modify PWM_LOOKUP to initialize all struct pwm_lookup members 2014-05-21 11:19:36 +02:00
ramoops.txt pstore-ram: Allow optional mapping with pgprot_noncached 2014-12-11 13:38:31 -08:00
rbtree.txt doc: spelling error changes 2014-05-05 15:32:05 +02:00
remoteproc.txt
rfkill.txt doc: spelling error changes 2014-05-05 15:32:05 +02:00
robust-futex-ABI.txt
robust-futexes.txt doc: spelling error changes 2014-05-05 15:32:05 +02:00
rpmsg.txt
rtc.txt
serial-console.txt
sgi-ioc4.txt
smsc_ece1099.txt
sparse.txt
stable_api_nonsense.txt
stable_kernel_rules.txt stable_kernel_rules: Add pointer to netdev-FAQ for network patches 2014-07-09 15:54:27 -07:00
static-keys.txt
svga.txt
sysfs-rules.txt Documentation/sysfs-rules.txt: Add device attribute error code documentation 2014-09-19 14:44:51 -07:00
sysrq.txt
this_cpu_ops.txt Docs: this_cpu_ops: remove redundant add forms 2014-09-26 11:03:00 +02:00
unaligned-memory-access.txt
unicode.txt
unshare.txt
vfio.txt drivers/vfio: EEH support for VFIO PCI device 2014-08-05 15:28:48 +10:00
vgaarbiter.txt
video-output.txt
vme_api.txt
volatile-considered-harmful.txt
workqueue.txt
xillybus.txt xillybus: Move out of staging 2014-09-23 23:44:16 -07:00
xz.txt
zorro.txt