remarkable-linux/fs
Eric B Munson 6bfde05bf5 hugetlbfs: allow the creation of files suitable for MAP_PRIVATE on the vfs internal mount
This patchset adds a flag to mmap that allows the user to request that an
anonymous mapping be backed with huge pages.  This mapping will borrow
functionality from the huge page shm code to create a file on the kernel
internal mount and use it to approximate an anonymous mapping.  The
MAP_HUGETLB flag is a modifier to MAP_ANONYMOUS and will not work without
both flags being preset.

A new flag is necessary because there is no other way to hook into huge
pages without creating a file on a hugetlbfs mount which wouldn't be
MAP_ANONYMOUS.

To userspace, this mapping will behave just like an anonymous mapping
because the file is not accessible outside of the kernel.

This patchset is meant to simplify the programming model.  Presently there
is a large chunk of boiler platecode, contained in libhugetlbfs, required
to create private, hugepage backed mappings.  This patch set would allow
use of hugepages without linking to libhugetlbfs or having hugetblfs
mounted.

Unification of the VM code would provide these same benefits, but it has
been resisted each time that it has been suggested for several reasons: it
would break PAGE_SIZE assumptions across the kernel, it makes page-table
abstractions really expensive, and it does not provide any benefit on
architectures that do not support huge pages, incurring fast path
penalties without providing any benefit on these architectures.

This patch:

There are two means of creating mappings backed by huge pages:

        1. mmap() a file created on hugetlbfs
        2. Use shm which creates a file on an internal mount which essentially
           maps it MAP_SHARED

The internal mount is only used for shared mappings but there is very
little that stops it being used for private mappings. This patch extends
hugetlbfs_file_setup() to deal with the creation of files that will be
mapped MAP_PRIVATE on the internal hugetlbfs mount. This extended API is
used in a subsequent patch to implement the MAP_HUGETLB mmap() flag.

Signed-off-by: Eric Munson <ebmunson@us.ibm.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Adam Litke <agl@us.ibm.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-22 07:17:41 -07:00
..
9p 9p: remove unnecessary v9fses->options which duplicates the mount string 2009-08-17 16:42:28 -05:00
adfs headers: smp_lock.h redux 2009-07-12 12:22:34 -07:00
affs affs: add ->sync_fs 2009-06-11 21:36:14 -04:00
afs const: make file_lock_operations const 2009-09-22 07:17:25 -07:00
autofs
autofs4 autofs4 - fix missed case when changing to use struct path 2009-08-31 17:44:05 -10:00
befs const: mark remaining super_operations const 2009-09-22 07:17:24 -07:00
bfs headers: smp_lock.h redux 2009-07-12 12:22:34 -07:00
btrfs const: mark remaining inode_operations as const 2009-09-22 07:17:24 -07:00
cachefiles
cifs const: mark remaining inode_operations as const 2009-09-22 07:17:24 -07:00
coda
configfs writeback: add name to backing_dev_info 2009-09-11 09:20:26 +02:00
cramfs
debugfs debugfs: use specified mode to possibly mark files read/write only 2009-06-15 21:30:28 -07:00
devpts devpts: remove module-related code 2009-06-24 08:15:24 -04:00
dlm Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm 2009-09-18 09:19:10 -07:00
ecryptfs const: mark remaining address_space_operations const 2009-09-22 07:17:24 -07:00
efs get rid of BKL in fs/efs 2009-06-17 00:36:36 -04:00
exofs headers: smp_lock.h redux 2009-07-12 12:22:34 -07:00
exportfs
ext2 const: make block_device_operations const 2009-09-22 07:17:25 -07:00
ext3 const: make struct super_block::s_qcop const 2009-09-22 07:17:24 -07:00
ext4 const: make struct super_block::s_qcop const 2009-09-22 07:17:24 -07:00
fat fat: Opencode sync_page_range_nolock() 2009-09-14 17:08:17 +02:00
freevxfs headers: smp_lock.h redux 2009-07-12 12:22:34 -07:00
fscache
fuse Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse 2009-09-18 09:23:03 -07:00
gfs2 Merge branch 'for-2.6.32' of git://git.kernel.dk/linux-2.6-block 2009-09-14 17:55:15 -07:00
hfs headers: smp_lock.h redux 2009-07-12 12:22:34 -07:00
hfsplus headers: smp_lock.h redux 2009-07-12 12:22:34 -07:00
hostfs hostfs: set maximum filesize in superblock for proper LFS support 2009-06-30 18:56:03 -07:00
hpfs headers: smp_lock.h redux 2009-07-12 12:22:34 -07:00
hppfs
hugetlbfs hugetlbfs: allow the creation of files suitable for MAP_PRIVATE on the vfs internal mount 2009-09-22 07:17:41 -07:00
isofs isofs: fix Joliet regression 2009-07-10 19:18:59 -07:00
jbd jbd: Annotate transaction start also for journal_restart() 2009-09-16 17:44:10 +02:00
jbd2 writeback: get rid of wbc->for_writepages 2009-09-16 15:16:18 +02:00
jffs2 const: mark remaining export_operations const 2009-09-22 07:17:24 -07:00
jfs jffs2/jfs/xfs: switch over to 'check_acl' rather than 'permission()' 2009-09-08 11:09:04 -07:00
lockd const: make lock_manager_operations const 2009-09-22 07:17:25 -07:00
minix Making fs/minix/minix.h double including safe 2009-06-22 11:34:42 -07:00
ncpfs NLS: update handling of Unicode 2009-06-15 21:44:43 -07:00
nfs const: make file_lock_operations const 2009-09-22 07:17:25 -07:00
nfs_common
nfsd const: make lock_manager_operations const 2009-09-22 07:17:25 -07:00
nilfs2 const: mark remaining inode_operations as const 2009-09-22 07:17:24 -07:00
nls NLS: update handling of Unicode 2009-06-15 21:44:43 -07:00
notify inotify: update the group mask on mark addition 2009-08-28 12:51:14 -04:00
ntfs mm: replace various uses of num_physpages by totalram_pages 2009-09-22 07:17:38 -07:00
ocfs2 const: make struct super_block::s_qcop const 2009-09-22 07:17:24 -07:00
omfs const: mark remaining inode_operations as const 2009-09-22 07:17:24 -07:00
openpromfs
partitions const: make block_device_operations const 2009-09-22 07:17:25 -07:00
proc oom: fix oom_adjust_write() input sanity check 2009-09-22 07:17:39 -07:00
qnx4 fs/qnx4: sanitize includes 2009-06-11 21:36:12 -04:00
quota const: make struct super_block::s_qcop const 2009-09-22 07:17:24 -07:00
ramfs writeback: add name to backing_dev_info 2009-09-11 09:20:26 +02:00
reiserfs const: make struct super_block::s_qcop const 2009-09-22 07:17:24 -07:00
romfs const: mark remaining inode_operations as const 2009-09-22 07:17:24 -07:00
smbfs
squashfs const: mark remaining super_operations const 2009-09-22 07:17:24 -07:00
sysfs Merge branch 'writeback' of git://git.kernel.dk/linux-2.6-block 2009-09-11 09:17:05 -07:00
sysv get rid of BKL in fs/sysv 2009-06-17 00:36:37 -04:00
ubifs const: mark remaining address_space_operations const 2009-09-22 07:17:24 -07:00
udf udf: Fix possible corruption when close races with write 2009-09-14 19:13:01 +02:00
ufs ufs: sector_t cannot be negative 2009-06-18 13:03:46 -07:00
xfs const: mark remaining super_operations const 2009-09-22 07:17:24 -07:00
aio.c eventfd: revised interface and cleanups 2009-06-30 18:55:58 -07:00
anon_inodes.c fs: Provide empty .set_page_dirty() aop for anon inodes 2009-06-18 14:46:10 +02:00
attr.c
bad_inode.c
binfmt_aout.c
binfmt_elf.c mm: add get_dump_page 2009-09-22 07:17:40 -07:00
binfmt_elf_fdpic.c mm: add get_dump_page 2009-09-22 07:17:40 -07:00
binfmt_em86.c
binfmt_flat.c flat: fix uninitialized ptr with shared libs 2009-08-07 10:39:57 -07:00
binfmt_misc.c
binfmt_script.c
binfmt_som.c
bio-integrity.c block: Create bip slabs with embedded integrity vectors 2009-07-01 10:56:25 +02:00
bio.c block: fix sg SG_DXFER_TO_FROM_DEV regression 2009-07-10 20:31:53 +02:00
block_dev.c const: make block_device_operations const 2009-09-22 07:17:25 -07:00
buffer.c writeback: switch to per-bdi threads for flushing data 2009-09-11 09:20:25 +02:00
char_dev.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6 2009-09-11 09:19:35 -07:00
compat.c exec: do not sleep in TASK_TRACED under ->cred_guard_mutex 2009-09-05 11:30:42 -07:00
compat_binfmt_elf.c
compat_ioctl.c compat_ioctl: hook up compat handler for FIEMAP ioctl 2009-08-07 10:39:56 -07:00
dcache.c sched: Pull up the might_sleep() check into cond_resched() 2009-07-18 15:51:44 +02:00
dcookies.c
direct-io.c
drop_caches.c mm: remove __invalidate_mapping_pages variant 2009-06-16 19:47:43 -07:00
eventfd.c eventfd: revised interface and cleanups 2009-06-30 18:55:58 -07:00
eventpoll.c epoll: fix nested calls support 2009-06-18 13:03:41 -07:00
exec.c perf: Do the big rename: Performance Counters -> Performance Events 2009-09-21 14:28:04 +02:00
fcntl.c headers: smp_lock.h redux 2009-07-12 12:22:34 -07:00
fifo.c
file.c
file_table.c
filesystems.c
fs-writeback.c writeback: fix possible bdi writeback refcounting problem 2009-09-16 15:18:53 +02:00
fs_struct.c
generic_acl.c
inode.c const: mark remaining inode_operations as const 2009-09-22 07:17:24 -07:00
internal.h
ioctl.c fs: Add new pre-allocation ioctls to vfs for compatibility with legacy xfs ioctls 2009-06-24 08:15:27 -04:00
ioprio.c
Kconfig tmpfs: depend on shmem 2009-09-22 07:17:41 -07:00
Kconfig.binfmt
libfs.c vfs: make get_sb_pseudo set s_maxbytes to value that can be cast to signed 2009-08-18 16:31:12 -07:00
locks.c const: make lock_manager_operations const 2009-09-22 07:17:25 -07:00
Makefile
mbcache.c
mpage.c
namei.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6 2009-09-11 08:55:49 -07:00
namespace.c vfs: mnt_want_write_file(): fix special file handling 2009-08-07 10:39:56 -07:00
nfsctl.c
no-block.c
open.c CRED: Add some configurable debugging [try #6] 2009-09-02 21:29:01 +10:00
pipe.c lockdep: Fix lockdep annotation for pipe_double_lock() 2009-07-22 21:14:14 +02:00
pnode.c
pnode.h
posix_acl.c
read_write.c
read_write.h
readdir.c
select.c poll/select: initialize triggered field of struct poll_wqueues 2009-08-15 18:40:11 -07:00
seq_file.c seq_file: add function to write binary data 2009-06-18 13:03:57 -07:00
signalfd.c
splice.c Merge branch 'for-2.6.32' of git://git.kernel.dk/linux-2.6-block 2009-09-14 17:55:15 -07:00
stack.c
stat.c
super.c const: mark remaining super_operations const 2009-09-22 07:17:24 -07:00
sync.c fs: Assign bdi in super_block 2009-09-16 15:18:51 +02:00
timerfd.c
utimes.c
xattr.c VFS: Factor out part of vfs_setxattr so it can be called from the SELinux hook for inode_setsecctx. 2009-09-10 10:11:22 +10:00
xattr_acl.c