Commit graph

955 commits

Author SHA1 Message Date
Linus Torvalds 5166701b36 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs updates from Al Viro:
 "The first vfs pile, with deep apologies for being very late in this
  window.

  Assorted cleanups and fixes, plus a large preparatory part of iov_iter
  work.  There's a lot more of that, but it'll probably go into the next
  merge window - it *does* shape up nicely, removes a lot of
  boilerplate, gets rid of locking inconsistencies between aio_write and
  splice_write and I hope to get Kent's direct-io rewrite merged into
  the same queue, but some of the stuff after this point is having
  (mostly trivial) conflicts with the things already merged into
  mainline and with some I want more testing.

  This one passes LTP and xfstests without regressions, in addition to
  usual beating.  BTW, readahead02 in ltp syscalls testsuite has started
  giving failures since "mm/readahead.c: fix readahead failure for
  memoryless NUMA nodes and limit readahead pages" - might be a false
  positive, might be a real regression..."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
  missing bits of "splice: fix racy pipe->buffers uses"
  cifs: fix the race in cifs_writev()
  ceph_sync_{,direct_}write: fix an oops on ceph_osdc_new_request() failure
  kill generic_file_buffered_write()
  ocfs2_file_aio_write(): switch to generic_perform_write()
  ceph_aio_write(): switch to generic_perform_write()
  xfs_file_buffered_aio_write(): switch to generic_perform_write()
  export generic_perform_write(), start getting rid of generic_file_buffer_write()
  generic_file_direct_write(): get rid of ppos argument
  btrfs_file_aio_write(): get rid of ppos
  kill the 5th argument of generic_file_buffered_write()
  kill the 4th argument of __generic_file_aio_write()
  lustre: don't open-code kernel_recvmsg()
  ocfs2: don't open-code kernel_recvmsg()
  drbd: don't open-code kernel_recvmsg()
  constify blk_rq_map_user_iov() and friends
  lustre: switch to kernel_sendmsg()
  ocfs2: don't open-code kernel_sendmsg()
  take iov_iter stuff to mm/iov_iter.c
  process_vm_access: tidy up a bit
  ...
2014-04-12 14:49:50 -07:00
Linus Torvalds 26c12d9334 Merge branch 'akpm' (incoming from Andrew)
Merge second patch-bomb from Andrew Morton:
 - the rest of MM
 - zram updates
 - zswap updates
 - exit
 - procfs
 - exec
 - wait
 - crash dump
 - lib/idr
 - rapidio
 - adfs, affs, bfs, ufs
 - cris
 - Kconfig things
 - initramfs
 - small amount of IPC material
 - percpu enhancements
 - early ioremap support
 - various other misc things

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (156 commits)
  MAINTAINERS: update Intel C600 SAS driver maintainers
  fs/ufs: remove unused ufs_super_block_third pointer
  fs/ufs: remove unused ufs_super_block_second pointer
  fs/ufs: remove unused ufs_super_block_first pointer
  fs/ufs/super.c: add __init to init_inodecache()
  doc/kernel-parameters.txt: add early_ioremap_debug
  arm64: add early_ioremap support
  arm64: initialize pgprot info earlier in boot
  x86: use generic early_ioremap
  mm: create generic early_ioremap() support
  x86/mm: sparse warning fix for early_memremap
  lglock: map to spinlock when !CONFIG_SMP
  percpu: add preemption checks to __this_cpu ops
  vmstat: use raw_cpu_ops to avoid false positives on preemption checks
  slub: use raw_cpu_inc for incrementing statistics
  net: replace __this_cpu_inc in route.c with raw_cpu_inc
  modules: use raw_cpu_write for initialization of per cpu refcount.
  mm: use raw_cpu ops for determining current NUMA node
  percpu: add raw_cpu_ops
  slub: fix leak of 'name' in sysfs_slab_add
  ...
2014-04-07 16:38:06 -07:00
Fabian Frederick 8ca577223f affs: add mount option to avoid filename truncates
Normal behavior for filenames exceeding specific filesystem limits is to
refuse operation.

The AFFS standard name length is only 30 characters, against 255 for usual
Linux filesystems, so the original implementation truncates filenames by
default.  The AFFS_NO_TRUNCATE define can disable this, but changing it
requires recompiling the module.

This patch adds the 'nofilenametruncate' mount option so that users can
easily activate that feature and avoid a lot of problems (e.g. overwriting
files ...)

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-07 16:36:08 -07:00
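
The two behaviours described in the affs commit above can be sketched in a few lines of Python. This is a simplified model, not the AFFS code: the function name `affs_effective_name` is hypothetical, and the use of ENAMETOOLONG for the strict mode is an assumption based on the usual Linux convention rather than something the commit message states.

```python
import errno

AFFS_MAX_NAME = 30  # AFFS name limit quoted in the commit message


def affs_effective_name(name, nofilenametruncate=False):
    """Return the name that would be stored, modelling both mount behaviours."""
    if len(name) <= AFFS_MAX_NAME:
        return name
    if nofilenametruncate:
        # Assumed error code: refuse the operation instead of truncating.
        raise OSError(errno.ENAMETOOLONG, "File name too long", name)
    # Default behaviour: silently keep only the first 30 characters.
    return name[:AFFS_MAX_NAME]


# Two distinct long names collapse to the same stored name by default,
# which is how one file can end up overwriting another.
a = affs_effective_name("quarterly-report-2014-final-draft-v1.txt")
b = affs_effective_name("quarterly-report-2014-final-draft-v2.txt")
```

With the default behaviour both names map to the same 30-character prefix, while passing `nofilenametruncate=True` makes the over-long name fail instead.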
Andrey Vagin 49d063cb35 proc: show mnt_id in /proc/pid/fdinfo
Currently there is no way to determine which mount point a file was
opened from.  This information is required for properly dumping and
restoring file descriptors in the presence of mount namespaces.  It is
possible for two file descriptors to be opened using the same path while
one fd references a mount point in one namespace and the other fd
references a mount point in another namespace.

$ ls -l /proc/1/fd/1
lrwx------ 1 root root 64 Mar 19 23:54 /proc/1/fd/1 -> /dev/null

$ cat /proc/1/fdinfo/1
pos:	0
flags:	0100002
mnt_id:	16

$ cat /proc/1/mountinfo | grep ^16
16 32 0:4 / /dev rw,nosuid shared:2 - devtmpfs devtmpfs rw,size=1013356k,nr_inodes=253339,mode=755

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Rob Landley <rob@landley.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-07 16:36:04 -07:00
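
The lookup shown in the mnt_id commit above (read `mnt_id` from fdinfo, then find the matching mountinfo line) can be sketched in Python. The helper names `parse_fdinfo` and `find_mount` are hypothetical, and the sample strings are copied from the commit message so the sketch does not depend on a live /proc.

```python
def parse_fdinfo(text):
    """Parse the 'key:<tab>value' lines of /proc/<pid>/fdinfo/<fd>."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def find_mount(mountinfo_text, mnt_id):
    """Return the mountinfo line whose first field equals mnt_id, or None."""
    for line in mountinfo_text.splitlines():
        parts = line.split()
        if parts and parts[0] == str(mnt_id):
            return line
    return None


# Sample data copied from the commit message.
FDINFO = "pos:\t0\nflags:\t0100002\nmnt_id:\t16\n"
MOUNTINFO = ("16 32 0:4 / /dev rw,nosuid shared:2 - devtmpfs devtmpfs "
             "rw,size=1013356k,nr_inodes=253339,mode=755\n")

info = parse_fdinfo(FDINFO)
mount = find_mount(MOUNTINFO, int(info["mnt_id"]))
mount_point = mount.split()[4]  # fifth mountinfo field is the mount point
```

Comparing the whole first field avoids a pitfall of `grep ^16`, which would also match mount IDs such as 160.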
Kirill A. Shutemov 8c6e50b029 mm: introduce vm_ops->map_pages()
Here's new version of faultaround patchset.  It took a while to tune it
and collect performance data.

First patch adds new callback ->map_pages to vm_operations_struct.

->map_pages() is called when the VM asks to map easily accessible pages.
Filesystem should find and map pages associated with offsets from
"pgoff" till "max_pgoff".  ->map_pages() is called with page table
locked and must not block.  If it's not possible to reach a page without
blocking, filesystem should skip it.  Filesystem should use do_set_pte()
to setup page table entry.  Pointer to entry associated with offset
"pgoff" is passed in "pte" field in vm_fault structure.  Pointers to
entries for other offsets should be calculated relative to "pte".

Currently the VM uses ->map_pages only on the read page fault path.  We try
to map FAULT_AROUND_PAGES at a time.  FAULT_AROUND_PAGES is 16 for now.
Performance data for different FAULT_AROUND_ORDER is below.

TODO:
 - implement ->map_pages() for shmem/tmpfs;
 - modify get_user_pages() to be able to use ->map_pages() and implement
   mmap(MAP_POPULATE|MAP_NONBLOCK) on top.

=========================================================================
Tested on 4-socket machine (120 threads) with 128GiB of RAM.

Few real-world workloads. The sweet spot for FAULT_AROUND_ORDER here is
somewhere between 3 and 5. Let's say 4 :)

Linux build (make -j60)
FAULT_AROUND_ORDER		Baseline	1		3		4		5		7		9
	minor-faults		283,301,572	247,151,987	212,215,789	204,772,882	199,568,944	194,703,779	193,381,485
	time, seconds		151.227629483	153.920996480	151.356125472	150.863792049	150.879207877	151.150764954	151.450962358
Linux rebuild (make -j60)
FAULT_AROUND_ORDER		Baseline	1		3		4		5		7		9
	minor-faults		5,396,854	4,148,444	2,855,286	2,577,282	2,361,957	2,169,573	2,112,643
	time, seconds		27.404543757	27.559725591	27.030057426	26.855045126	26.678618635	26.974523490	26.761320095
Git test suite (make -j60 test)
FAULT_AROUND_ORDER		Baseline	1		3		4		5		7		9
	minor-faults		129,591,823	99,200,751	66,106,718	57,606,410	51,510,808	45,776,813	44,085,515
	time, seconds		66.087215026	64.784546905	64.401156567	65.282708668	66.034016829	66.793780811	67.237810413

Two synthetic tests: access every word in file in sequential/random order.
It doesn't improve much after FAULT_AROUND_ORDER == 4.

Sequential access 16GiB file
FAULT_AROUND_ORDER		Baseline	1		3		4		5		7		9
 1 thread
	minor-faults		4,195,437	2,098,275	525,068		262,251		131,170		32,856		8,282
	time, seconds		7.250461742	6.461711074	5.493859139	5.488488147	5.707213983	5.898510832	5.109232856
 8 threads
	minor-faults		33,557,540	16,892,728	4,515,848	2,366,999	1,423,382	442,732		142,339
	time, seconds		16.649304881	9.312555263	6.612490639	6.394316732	6.669827501	6.75078944	6.371900528
 32 threads
	minor-faults		134,228,222	67,526,810	17,725,386	9,716,537	4,763,731	1,668,921	537,200
	time, seconds		49.164430543	29.712060103	12.938649729	10.175151004	11.840094583	9.594081325	9.928461797
 60 threads
	minor-faults		251,687,988	126,146,952	32,919,406	18,208,804	10,458,947	2,733,907	928,217
	time, seconds		86.260656897	49.626551828	22.335007632	17.608243696	16.523119035	16.339489186	16.326390902
 120 threads
	minor-faults		503,352,863	252,939,677	67,039,168	35,191,827	19,170,091	4,688,357	1,471,862
	time, seconds		124.589206333	79.757867787	39.508707872	32.167281632	29.972989292	28.729834575	28.042251622
Random access 1GiB file
 1 thread
	minor-faults		262,636		132,743		34,369		17,299		8,527		3,451		1,222
	time, seconds		15.351890914	16.613802482	16.569227308	15.179220992	16.557356122	16.578247824	15.365266994
 8 threads
	minor-faults		2,098,948	1,061,871	273,690		154,501		87,110		25,663		7,384
	time, seconds		15.040026343	15.096933500	14.474757288	14.289129964	14.411537468	14.296316837	14.395635804
 32 threads
	minor-faults		8,390,734	4,231,023	1,054,432	528,847		269,242		97,746		26,881
	time, seconds		20.430433109	21.585235358	22.115062928	14.872878951	14.880856305	14.883370649	14.821261690
 60 threads
	minor-faults		15,733,258	7,892,809	1,973,393	988,266		594,789		164,994		51,691
	time, seconds		26.577302548	25.692397770	18.728863715	20.153026398	21.619101933	17.745086260	17.613215273
 120 threads
	minor-faults		31,471,111	15,816,616	3,959,209	1,978,685	1,008,299	264,635		96,010
	time, seconds		41.835322703	40.459786095	36.085306105	35.313894834	35.814445675	36.552633793	34.289210594

Touch only one page in page table in 16GiB file
FAULT_AROUND_ORDER		Baseline	1		3		4		5		7		9
 1 thread
	minor-faults		8,372		8,324		8,270		8,260		8,249		8,239		8,237
	time, seconds		0.039892712	0.045369149	0.051846126	0.063681685	0.079095975	0.17652406	0.541213386
 8 threads
	minor-faults		65,731		65,681		65,628		65,620		65,608		65,599		65,596
	time, seconds		0.124159196	0.488600638	0.156854426	0.191901957	0.242631486	0.543569456	1.677303984
 32 threads
	minor-faults		262,388		262,341		262,285		262,276		262,266		262,257		263,183
	time, seconds		0.452421421	0.488600638	0.565020946	0.648229739	0.789850823	1.651584361	5.000361559
 60 threads
	minor-faults		491,822		491,792		491,723		491,711		491,701		491,691		491,825
	time, seconds		0.763288616	0.869620515	0.980727360	1.161732354	1.466915814	3.04041448	9.308612938
 120 threads
	minor-faults		983,466		983,655		983,366		983,372		983,363		984,083		984,164
	time, seconds		1.595846553	1.667902182	2.008959376	2.425380942	2.941368804	5.977807890	18.401846125

This patch (of 2):

Introduce a new vm_ops callback, ->map_pages(), and use it for mapping
easily accessible pages around the fault address.

On read page fault, if filesystem provides ->map_pages(), we try to map up
to FAULT_AROUND_PAGES pages around page fault address in hope to reduce
number of minor page faults.

We call ->map_pages first and use ->fault() as fallback if page by the
offset is not ready to be mapped (cold page cache or something).

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Ning Qu <quning@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-07 16:35:52 -07:00
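
The relationship between FAULT_AROUND_ORDER and FAULT_AROUND_PAGES in the faultaround commit above, and the "pgoff" to "max_pgoff" range handed to ->map_pages(), can be sketched as follows. This is a simplified model under stated assumptions, not the kernel code: it assumes the window is aligned to its own size and clipped to the mapping, and it ignores other limits such as page-table boundaries. The function names are hypothetical.

```python
def fault_around_pages(order):
    """Number of pages mapped per fault-around attempt: 2 ** order."""
    return 1 << order


def fault_around_window(fault_pgoff, vma_first_pgoff, vma_last_pgoff, order=4):
    """Return (pgoff, max_pgoff) for the window containing the faulting page.

    Assumption: the window is aligned down to a multiple of its size and
    then clipped so it never leaves the mapping.
    """
    pages = fault_around_pages(order)
    start = max(fault_pgoff & ~(pages - 1), vma_first_pgoff)
    max_pgoff = min(start + pages - 1, vma_last_pgoff)
    return start, max_pgoff


window = fault_around_window(37, 0, 99)         # fault in the middle of a mapping
clipped = fault_around_window(37, 35, 39)       # small mapping clips the window
single = fault_around_window(5, 0, 99, order=0) # order 0 maps only the faulting page
```

With order 4 each read fault can map up to 16 pages, which is consistent with the single-threaded sequential test above dropping from 4,195,437 to 262,251 minor faults, roughly a 16-fold reduction.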
Linus Torvalds 3021112598 f2fs updates for v3.15
This patch-set includes the following major enhancement patches.
  o introduce large directory support
  o introduce f2fs_issue_flush to merge redundant flush commands
  o merge write IOs as much as possible aligned to the segment
  o add sysfs entries to tune the f2fs configuration
  o use radix_tree for the free_nid_list to reduce in-memory operations
  o remove costly bit operations in f2fs_find_entry
  o enhance the readahead flow for CP/NAT/SIT/SSA blocks
 
 The other bug fixes are as follows.
  o recover xattr node blocks correctly after sudden-power-cut
  o fix to calculate the maximum number of node ids
  o enhance to handle many error cases
 
 And, there are a bunch of cleanups.

Merge tag 'for-f2fs-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs updates from Jaegeuk Kim:
 "This patch-set includes the following major enhancement patches.
   - introduce large directory support
   - introduce f2fs_issue_flush to merge redundant flush commands
   - merge write IOs as much as possible aligned to the segment
   - add sysfs entries to tune the f2fs configuration
   - use radix_tree for the free_nid_list to reduce in-memory operations
   - remove costly bit operations in f2fs_find_entry
   - enhance the readahead flow for CP/NAT/SIT/SSA blocks

  The other bug fixes are as follows:
   - recover xattr node blocks correctly after sudden-power-cut
   - fix to calculate the maximum number of node ids
   - enhance to handle many error cases

  And, there are a bunch of cleanups"

* tag 'for-f2fs-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (62 commits)
  f2fs: fix wrong statistics of inline data
  f2fs: check the acl's validity before setting
  f2fs: introduce f2fs_issue_flush to avoid redundant flush issue
  f2fs: fix to cover io->bio with io_rwsem
  f2fs: fix error path when fail to read inline data
  f2fs: use list_for_each_entry{_safe} for simplyfying code
  f2fs: avoid free slab cache under spinlock
  f2fs: avoid unneeded lookup when xattr name length is too long
  f2fs: avoid unnecessary bio submit when wait page writeback
  f2fs: return -EIO when node id is not matched
  f2fs: avoid RECLAIM_FS-ON-W warning
  f2fs: skip unnecessary node writes during fsync
  f2fs: introduce fi->i_sem to protect fi's info
  f2fs: change reclaim rate in percentage
  f2fs: add missing documentation for dir_level
  f2fs: remove unnecessary threshold
  f2fs: throttle the memory footprint with a sysfs entry
  f2fs: avoid to drop nat entries due to the negative nr_shrink
  f2fs: call f2fs_wait_on_page_writeback instead of native function
  f2fs: introduce nr_pages_to_write for segment alignment
  ...
2014-04-07 10:55:36 -07:00
Jaegeuk Kim 6b4afdd794 f2fs: introduce f2fs_issue_flush to avoid redundant flush issue
Some storage devices show relatively high latencies when completing cache_flush
commands, even though their normal IO speed is pretty high. In such a
case, cache_flush commands need to be merged as much as possible to avoid
issuing them redundantly.
So, this patch introduces a mount option, "-o flush_merge", to mitigate this
overhead.

If this option is enabled by user, F2FS merges the cache_flush commands and then
issues just one cache_flush on behalf of them. Once the single command is
finished, F2FS sends a completion signal to all the pending threads.

Note that this option is useful under workloads consisting of very intensive
concurrent fsync calls on storage that handles cache_flush commands slowly.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-04-07 09:50:58 +09:00
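
The merging idea in the flush_merge commit above (queue the callers, issue one cache_flush for all of them, then signal everyone) can be illustrated with a small single-threaded toy model. The class `FlushMerger` and its methods are hypothetical names for this sketch and say nothing about how f2fs structures its own code or threads.

```python
class FlushMerger:
    """Toy model: many fsync callers share one cache_flush command."""

    def __init__(self, issue_flush):
        self._issue_flush = issue_flush  # callable that performs one device flush
        self._pending = []               # callers waiting for a flush
        self.completed = []              # callers that have been signalled

    def request(self, caller):
        """Queue a caller instead of issuing its own flush."""
        self._pending.append(caller)

    def dispatch(self):
        """Issue a single flush for everything queued, then complete them all."""
        if not self._pending:
            return 0
        batch, self._pending = self._pending, []
        self._issue_flush()              # one command on behalf of the whole batch
        self.completed.extend(batch)
        return len(batch)


flushes = []
merger = FlushMerger(lambda: flushes.append("cache_flush"))
for tid in range(8):
    merger.request(tid)
served = merger.dispatch()
```

Eight queued callers are served by one recorded flush, which is the saving the option aims for when flushes are slow and fsync calls are frequent.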
Linus Torvalds 7df934526c Merge branch 'cross-rename' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs
Pull renameat2 system call from Miklos Szeredi:
 "This adds a new syscall, renameat2(), which is the same as renameat()
  but with a flags argument.

  The purpose of extending rename is to add cross-rename, a symmetric
  variant of rename, which exchanges the two files.  This allows
  interesting things, which were not possible before, for example
  atomically replacing a directory tree with a symlink, etc...  This
  also allows overlayfs and friends to operate on whiteouts atomically.

  Andy Lutomirski also suggested a "noreplace" flag, which disables the
  overwriting behavior of rename.

  These two flags, RENAME_EXCHANGE and RENAME_NOREPLACE are only
  implemented for ext4 as an example and for testing"

* 'cross-rename' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
  ext4: add cross rename support
  ext4: rename: split out helper functions
  ext4: rename: move EMLINK check up
  ext4: rename: create ext4_renament structure for local vars
  vfs: add cross-rename
  vfs: lock_two_nondirectories: allow directory args
  security: add flags to rename hooks
  vfs: add RENAME_NOREPLACE flag
  vfs: add renameat2 syscall
  vfs: rename: use common code for dir and non-dir
  vfs: rename: move d_move() up
  vfs: add d_is_dir()
2014-04-04 14:03:05 -07:00
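
The flag semantics described in the renameat2 pull above can be modelled on a plain dictionary. This sketch is not a binding to the system call: `renameat2_model` is a hypothetical helper, the namespace is a dict mapping names to objects, and the error codes follow the behaviour documented for renameat2(2) rather than anything stated in the message.

```python
import errno

RENAME_NOREPLACE = 1 << 0  # do not overwrite an existing target
RENAME_EXCHANGE = 1 << 1   # atomically swap source and target


def renameat2_model(namespace, old, new, flags=0):
    """Apply rename semantics to a dict mapping path -> object."""
    if flags & ~(RENAME_NOREPLACE | RENAME_EXCHANGE):
        raise OSError(errno.EINVAL, "unknown flag")
    if flags & RENAME_NOREPLACE and flags & RENAME_EXCHANGE:
        raise OSError(errno.EINVAL, "flags are mutually exclusive")
    if old not in namespace:
        raise OSError(errno.ENOENT, "source does not exist")
    if flags & RENAME_EXCHANGE:
        if new not in namespace:
            raise OSError(errno.ENOENT, "target must exist for exchange")
        # Cross-rename: both names stay present, their contents swap.
        namespace[old], namespace[new] = namespace[new], namespace[old]
    elif flags & RENAME_NOREPLACE and new in namespace:
        raise OSError(errno.EEXIST, "target exists")
    else:
        # Plain rename: the target is replaced if it already exists.
        namespace[new] = namespace.pop(old)


# Swap a directory tree with a symlink in one step.
ns = {"site": "directory tree", "site.new": "symlink"}
renameat2_model(ns, "site", "site.new", RENAME_EXCHANGE)
```

After the exchange both names still exist, so no observer ever sees "site" missing, which is what makes replacing a directory tree with a symlink atomic.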
Fabian Frederick 4adeacdf36 Documentation/filesystems/ntfs.txt: remove changelog reference
File was removed in commit 7c821a179f ("Remove fs/ntfs/ChangeLog").

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Acked-by: Anton Altaparmakov <anton@tuxera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-03 16:21:27 -07:00
Ryusuke Konishi 7fac376d78 nilfs2: update project's web site in nilfs2.txt
Project's web site was moved to nilfs.sourceforge.net from
www.nilfs.org.  This updates the site information in
Documentation/filesystems/nilfs2.txt with the new location.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-03 16:21:26 -07:00
Andreas Rohner 2cc88f3a5f nilfs2: implementation of NILFS_IOCTL_SET_SUINFO ioctl
With this ioctl the segment usage entries in the SUFILE can be updated
from userspace.

This is useful, because it allows the userspace GC to modify and update
segment usage entries for specific segments, which enables it to avoid
unnecessary write operations.

If a segment needs to be cleaned, but there is no or very little
reclaimable space in it, the cleaning operation basically degrades to a
useless moving operation.  In the end the only thing that changes is the
location of the data and a timestamp in the segment usage information.
With this ioctl the GC can skip the cleaning and update the segment
usage entries directly instead.

This is basically a shortcut to cleaning the segment.  It is still
necessary to read the segment summary information, but the writing of
the live blocks can be skipped if it's not worth it.

[konishi.ryusuke@lab.ntt.co.jp: add description of NILFS_IOCTL_SET_SUINFO ioctl]
Signed-off-by: Andreas Rohner <andreas.rohner@gmx.net>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-03 16:21:25 -07:00
Johannes Weiner 91b0abe36a mm + fs: store shadow entries in page cache
Reclaim will be leaving shadow entries in the page cache radix tree upon
evicting the real page.  As those pages are found from the LRU, an
iput() can lead to the inode being freed concurrently.  At this point,
reclaim must no longer install shadow pages because the inode freeing
code needs to ensure the page tree is really empty.

Add an address_space flag, AS_EXITING, that the inode freeing code sets
under the tree lock before doing the final truncate.  Reclaim will check
for this flag before installing shadow pages.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Minchan Kim <minchan@kernel.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Bob Liu <bob.liu@oracle.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Luigi Semenzato <semenzato@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Metin Doslu <metin@citusdata.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Ozgun Erdogan <ozgun@citusdata.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Roman Gushchin <klamm@yandex-team.ru>
Cc: Ryan Mallon <rmallon@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-03 16:21:01 -07:00
Al Viro c186afb4db switch ->is_partially_uptodate() to saner arguments
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-01 23:19:19 -04:00
Miklos Szeredi 520c8b1650 vfs: add renameat2 syscall
Add new renameat2 syscall, which is the same as renameat with an added
flags argument.

Pass flags to vfs_rename() and to i_op->rename() as well.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: J. Bruce Fields <bfields@redhat.com>
2014-04-01 17:08:42 +02:00
Masanari Iida df5cbb2783 doc: fix double words
Fix double words "the the" in various files
under Documentation/.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2014-03-21 13:16:58 +01:00
Jaegeuk Kim 58c410351e f2fs: change reclaim rate in percentage
It is more reasonable to determine the reclaiming rate of prefree segments
according to the volume size, which is set to 5% by default.
For example, if the volume is 128GB, the prefree segments are reclaimed
when they reach 6.4GB.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-03-20 22:10:10 +09:00
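
The 5% example in the reclaim-rate commit above works out as follows. The helper name `prefree_reclaim_threshold` is hypothetical, and decimal gigabytes are assumed so that the numbers match the round figures in the message.

```python
GB = 1000 ** 3  # decimal gigabytes (assumption, to match the round numbers)


def prefree_reclaim_threshold(volume_bytes, percent=5):
    """Size of prefree segments at which reclaim starts, as a share of the volume."""
    return volume_bytes * percent // 100


threshold = prefree_reclaim_threshold(128 * GB)
threshold_gb = threshold / GB
```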
Jaegeuk Kim cdfc41c134 f2fs: throttle the memory footprint with a sysfs entry
This patch introduces ram_thresh, a sysfs entry, which controls the memory
footprint used by the free nid list and the nat cache.

Previously, the free nid list was controlled by MAX_FREE_NIDS, while the nat
cache was managed by NM_WOUT_THRESHOLD.
However, this approach cannot be applied dynamically according to the system.

So, this patch adds ram_thresh, which lets users specify the threshold in
units of 1 / 1024 of the total RAM.
For example, if the total ram size is 4GB and the value is left at its default of 10,
f2fs tries to keep the free nids and nat caches from consuming more than
10 * (4GB / 1024) = 40MB.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-03-20 22:10:09 +09:00
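
Working the ram_thresh formula from the commit above through in code makes the units explicit. The function name `ram_thresh_budget` is hypothetical; the formula itself (threshold in units of 1 / 1024 of total RAM, default 10) is taken from the message.

```python
MiB = 1 << 20
GiB = 1 << 30


def ram_thresh_budget(total_ram_bytes, ram_thresh=10):
    """Memory budget for free nids and the nat cache: ram_thresh / 1024 of RAM."""
    return total_ram_bytes * ram_thresh // 1024


budget = ram_thresh_budget(4 * GiB)
budget_mib = budget // MiB
```

For 4GB of RAM, one 1024th is 4MB, so the default value of 10 gives a budget of 40MB.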
Jaegeuk Kim ab9fa662e4 f2fs: add an sysfs entry to control the directory level
This patch adds a sysfs entry to control dir_level, which is used by the large directory support.

The description of this entry is:

 dir_level                    This parameter controls the directory level to
			      support large directory. If a directory has a
			      number of files, it can reduce the file lookup
			      latency by increasing this dir_level value.
			      Otherwise, it needs to decrease this value to
			      reduce the space overhead. The default value is 0.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-02-27 20:31:15 +09:00
Jaegeuk Kim 3843154598 f2fs: introduce large directory support
This patch introduces an i_dir_level field to support large directories.

Previously, f2fs maintained multi-level hash tables to find a dentry quickly
among a bunch of child dentries in a directory, and the hash tables had
the tree structure shown below.

In Documentation/filesystems/f2fs.txt,

----------------------
A : bucket
B : block
N : MAX_DIR_HASH_DEPTH
----------------------

level #0   | A(2B)
           |
level #1   | A(2B) - A(2B)
           |
level #2   | A(2B) - A(2B) - A(2B) - A(2B)
     .     |   .       .       .       .
level #N/2 | A(2B) - A(2B) - A(2B) - A(2B) - A(2B) - ... - A(2B)
     .     |   .       .       .       .
level #N   | A(4B) - A(4B) - A(4B) - A(4B) - A(4B) - ... - A(4B)

But if we can guess that a directory will hold a large number of child files,
we don't need to traverse the tree from level #0 to #N every time.
Since the lower-level tables contain a relatively small number of dentries,
the miss ratio for the target dentry there is likely to be high.

In order to avoid that, we can configure the hash tables sparsely from level #0
like this.

level #0   | A(2B) - A(2B) - A(2B) - A(2B)

level #1   | A(2B) - A(2B) - A(2B) - A(2B) - A(2B) - ... - A(2B)
     .     |   .       .       .       .
level #N/2 | A(2B) - A(2B) - A(2B) - A(2B) - A(2B) - ... - A(2B)
     .     |   .       .       .       .
level #N   | A(4B) - A(4B) - A(4B) - A(4B) - A(4B) - ... - A(4B)

With this structure, we can skip the ineffective tree searches in lower level
hash tables.

This patch adds just a facility for this by introducing i_dir_level in
f2fs_inode.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-02-27 19:56:09 +09:00
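
The two diagrams in the large-directory commit above suggest a simple rule for the number of buckets per level: it doubles with each level, and i_dir_level shifts the starting point. A minimal sketch of that rule follows. The value 63 for N and the cap used for the deeper levels are assumptions made for illustration, since the message only refers to N symbolically.

```python
MAX_DIR_HASH_DEPTH = 63  # assumed value of N; the message only calls it N


def dir_buckets(level, dir_level=0, max_depth=MAX_DIR_HASH_DEPTH):
    """Number of buckets in the hash table at `level`.

    Below the halfway depth the count doubles per level, starting from
    2 ** dir_level at level #0. The cap beyond that point is an assumption.
    """
    if level + dir_level < max_depth // 2:
        return 1 << (level + dir_level)
    return 1 << (max_depth // 2 - 1)


default_layout = [dir_buckets(level) for level in range(3)]
sparse_layout = [dir_buckets(level, dir_level=2) for level in range(3)]
```

With dir_level set to 2, level #0 already has four buckets, matching the second diagram, so a lookup in a large directory skips the small, frequently missing tables at the top of the tree.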
Jiri Kosina d4263348f7 Merge branch 'master' into for-next 2014-02-20 14:54:28 +01:00
Olaf Hering be873ac782 Documentation: update URL to hfsplus Technote 1150
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2014-02-20 14:48:51 +01:00
Henrik Austad 3cf8ca1c25 Documentation/: update 00-INDEX files
Some of the 00-INDEX files are somewhat outdated and some folders do
not contain 00-INDEX at all.  Only outdated indexes (with the notable
exception of spi) are touched here; the 169 folders without 00-INDEX have
not been touched.

New 00-INDEX
 - spi/* was added in a series of commits dating back to 2006

Added files (missing in (*/)00-INDEX)
 - dmatest.txt was added by commit 851b7e16a0 ("dmatest: run test via
   debugfs")
 - this_cpu_ops.txt was added by commit a1b2a555d6 ("percpu: add
   documentation on this_cpu operations")
 - ww-mutex-design.txt was added by commit 040a0a3710 ("mutex: Add
   support for wound/wait style locks")
 - bcache.txt was added by commit cafe563591 ("bcache: A block layer
   cache")
 - kernel-per-CPU-kthreads.txt was added by commit 49717cb404
   ("kthread: Document ways of reducing OS jitter due to per-CPU
   kthreads")
 - phy.txt was added by commit ff76496347 ("drivers: phy: add generic
   PHY framework")
 - block/null_blk was added by commit 12f8f4fc03 ("null_blk:
   documentation")
 - module-signing.txt was added by commit 3cafea3076 ("Add
   Documentation/module-signing.txt file")
 - assoc_array.txt was added by commit 3cb989501c ("Add a generic
   associative array implementation.")
 - arm/IXP4xx was part of the initial repo
 - arm/cluster-pm-race-avoidance.txt was added by commit 7fe31d28e8
   ("ARM: mcpm: introduce helpers for platform coherency exit/setup")
 - arm/firmware.txt was added by commit 7366b92a77 ("ARM: Add
   interface for registering and calling firmware-specific operations")
 - arm/kernel_mode_neon.txt was added by commit 2afd0a0524 ("ARM:
   7825/1: document the use of NEON in kernel mode")
 - arm/tcm.txt was added by commit bc581770cf ("ARM: 5580/2: ARM TCM
   (Tightly-Coupled Memory) support v3")
 - arm/vlocks.txt was added by commit 9762f12d3e ("ARM: mcpm: Add
   baremetal voting mutexes")
 - blackfin/gptimers-example.c, Makefile was added by commit
   4b60779d5e ("Blackfin: add an example showing how to use the
   gptimers API")
 - devicetree/usage-model.txt was added by commit 31134efc68 ("dt:
   Linux DT usage model documentation")
 - fb/api.txt was added by commit fb21c2f428 ("fbdev: Add FOURCC-based
   format configuration API")
 - fb/sm501.txt was added by commit e6a0498071 ("video, sm501: add
   edid and commandline support")
 - fb/udlfb.txt was added by commit 96f8d864af ("fbdev: move udlfb out
   of staging.")
 - filesystems/Makefile was added by commit 1e0051ae48
   ("Documentation/fs/: split txt and source files")
 - filesystems/nfs/nfsd-admin-interfaces.txt was added by commit
   8a4c6e19cf ("nfsd: document kernel interfaces for nfsd
   configuration")
 - ide/warm-plug-howto.txt was added by commit f74c91413e ("ide: add
   warm-plug support for IDE devices (take 2)")
 - laptops/Makefile was added by commit d49129accc
   ("Documentation/laptop/: split txt and source files")
 - leds/leds-blinkm.txt was added by commit b54cf35a7f ("LEDS: add
   BlinkM RGB LED driver, documentation and update MAINTAINERS")
 - leds/ledtrig-oneshot.txt was added by commit 5e417281cd ("leds: add
   oneshot trigger")
 - leds/ledtrig-transient.txt was added by commit 44e1e9f8e7 ("leds:
   add new transient trigger for one shot timer activation")
 - m68k/README.buddha was part of the initial repo
 - networking/LICENSE.(qla3xxx|qlcnic|qlge) was added by commits
   40839129f7, c4e84bde1d, 5a4faa8737
 - networking/Makefile was added by commit 3794f3e812 ("docsrc: build
   Documentation/ sources")
 - networking/i40evf.txt was added by commit 105bf2fe6b ("i40evf: add
   driver to kernel build system")
 - networking/ipsec.txt was added by commit b3c6efbc36 ("xfrm: Add
   file to document IPsec corner case")
 - networking/mac80211-auth-assoc-deauth.txt was added by commit
   3cd7920a2b ("mac80211: add auth/assoc/deauth flow diagram")
 - networking/netlink_mmap.txt was added by commit 5683264c39
   ("netlink: add documentation for memory mapped I/O")
 - networking/nf_conntrack-sysctl.txt was added by commit c9f9e0e159
   ("netfilter: doc: add nf_conntrack sysctl api documentation")
 - networking/team.txt was added by commit 3d249d4ca7 ("net: introduce
   ethernet teaming device")
 - networking/vxlan.txt was added by commit d342894c5d ("vxlan:
   virtual extensible lan")
 - power/runtime_pm.txt was added by commit 5e928f77a0 ("PM: Introduce
   core framework for run-time PM of I/O devices (rev.  17)")
 - power/charger-manager.txt was added by commit 3bb3dbbd56
   ("power_supply: Add initial Charger-Manager driver")
 - RCU/lockdep-splat.txt was added by commit d7bd2d68aa ("rcu:
   Document interpretation of RCU-lockdep splats")
 - s390/kvm.txt was added by commit 5ecee4b ("KVM: s390: API documentation")
 - s390/qeth.txt was added by commit b4d72c08b3 ("qeth: bridgeport
   support - basic control")
 - scheduler/sched-bwc.txt was added by commit 88ebc08ea9 ("sched: Add
   documentation for bandwidth control")
 - scsi/advansys.txt was added by commit 4bd6d7f356 ("[SCSI] advansys:
   Move documentation to Documentation/scsi")
 - scsi/bfa.txt was added by commit 1ec90174bd ("[SCSI] bfa: add
   readme file")
 - scsi/bnx2fc.txt was added by commit 12b8fc10ea ("[SCSI] bnx2fc: Add
   driver documentation")
 - scsi/cxgb3i.txt was added by commit c3673464eb ("[SCSI] cxgb3i: Add
   cxgb3i iSCSI driver.")
 - scsi/hpsa.txt was added by commit 992ebcf14f ("[SCSI] hpsa: Add
   hpsa.txt to Documentation/scsi")
 - scsi/link_power_management_policy.txt was added by commit
   ca77329fb7 ("[libata] Link power management infrastructure")
 - scsi/osd.txt was added by commit 78e0c621de ("[SCSI] osd:
   Documentation for OSD library")
 - scsi/scsi-parameter.txt was created/moved by commit 163475fb11
   ("Documentation: move SCSI parameters to their own text file")
 - serial/driver was part of the initial repo
 - serial/n_gsm.txt was added by commit 323e84122e ("n_gsm: add a
   documentation")
 - timers/Makefile was added by commit 3794f3e812 ("docsrc: build
   Documentation/ sources")
 - virt/kvm/s390.txt was added by commit d9101fca3d ("KVM: s390:
   diagnose call documentation")
 - vm/split_page_table_lock was added by commit 49076ec2cc ("mm:
   dynamically allocate page->ptl if it cannot be embedded to struct
   page")
 - w1/slaves/w1_ds28e04 was added by commit fbf7f7b4e2 ("w1: Add
   1-wire slave device driver for DS28E04-100")
 - w1/masters/omap-hdq was added by commit e0a29382c6 ("hdq:
   documentation for OMAP HDQ")
 - x86/early-microcode.txt was added by commit 0d91ea86a8 ("x86, doc:
   Documentation for early microcode loading")
 - x86/earlyprintk.txt was added by commit a1aade4788 ("x86/doc:
   mini-howto for using earlyprintk=dbgp")
 - x86/entry_64.txt was added by commit 8b4777a4b5 ("x86-64: Document
   some of entry_64.S")
 - x86/pat.txt was added by commit d27554d874 ("x86: PAT
   documentation")

Moved files
 - arm/kernel_user_helpers.txt was moved out of arch/arm/kernel by
   commit 37b8304642 ("ARM: kuser: move interface documentation out of
   the source code")
 - efi-stub.txt was moved out of x86/ and down into Documentation/ in
   commit 4172fe2f8a ("EFI stub documentation updates")
 - laptops/hpfall.c was moved out of hwmon/ and into laptops/ in commit
   efcfed9bad ("Move hp_accel to drivers/platform/x86")
 - commit 5616c23ad9 ("x86: doc: move x86-generic documentation from
   Doc/x86/i386"):
   * x86/usb-legacy-support.txt
   * x86/boot.txt
   * x86/zero_page.txt
 - power/video_extension.txt was moved to acpi in commit 70e66e4df1
   ("ACPI / video: move video_extension.txt to Documentation/acpi")

Removed files (left in 00-INDEX)
 - memory.txt was removed by commit 00ea8990aa ("memory.txt: remove
   stray information")
 - gpio.txt was moved to gpio/ in commit fd8e198cfc ("Documentation:
   gpiolib: document new interface")
 - networking/DLINK.txt was removed by commit 168e06ae26
   ("drivers/net: delete old parallel port de600/de620 drivers")
 - serial/hayes-esp.txt was removed by commit f53a2ade0b ("tty: esp:
   remove broken driver")
 - s390/TAPE was removed by commit 9e280f6693 ("[S390] remove tape
   block docu")
 - vm/locking was removed by commit 57ea8171d2 ("mm: documentation:
   remove hopelessly out-of-date locking doc")
 - laptops/acer-wmi.txt was removed by commit 020036678e ("acer-wmi:
   Delete out-of-date documentation")

Typos/misc issues
 - rpc-server-gss.txt was added as knfsd-rpcgss.txt in commit
   030d794bf4 ("SUNRPC: Use gssproxy upcall for server RPCGSS
   authentication.")
 - commit b88cf73d92 ("net: add missing entries to
   Documentation/networking/00-INDEX")
   * generic-hdlc.txt was added as generic_hdlc.txt
   * spider_net.txt was added as spider-net.txt
 - w1/master/mxc-w1 was added as mxc_w1 by commit a5fd9139f7 ("w1: add
   1-wire master driver for i.MX27 / i.MX31")
 - s390/zfcpdump.txt was added as zfcpdump by commit 6920c12a40
   ("[S390] Add Documentation/s390/00-INDEX.")

Signed-off-by: Henrik Austad <henrik@austad.us>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	[rcu bits]
Acked-by: Rob Landley <rob@landley.net>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Mark Brown <broonie@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Len Brown <len.brown@intel.com>
Cc: James Bottomley <JBottomley@parallels.com>
Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-02-10 16:01:40 -08:00
Linus Torvalds e7651b819e Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs updates from Chris Mason:
 "This is a pretty big pull, and most of these changes have been
  floating in btrfs-next for a long time.  Filipe's properties work is a
  cool building block for inheriting attributes like compression down on
  a per inode basis.

  Jeff Mahoney kicked in code to export filesystem info into sysfs.

  Otherwise, lots of performance improvements, cleanups and bug fixes.

  Looks like there are still a few other small pending incrementals, but
  I wanted to get the bulk of this in first"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (149 commits)
  Btrfs: fix spin_unlock in check_ref_cleanup
  Btrfs: setup inode location during btrfs_init_inode_locked
  Btrfs: don't use ram_bytes for uncompressed inline items
  Btrfs: fix btrfs_search_slot_for_read backwards iteration
  Btrfs: do not export ulist functions
  Btrfs: rework ulist with list+rb_tree
  Btrfs: fix memory leaks on walking backrefs failure
  Btrfs: fix send file hole detection leading to data corruption
  Btrfs: add a reschedule point in btrfs_find_all_roots()
  Btrfs: make send's file extent item search more efficient
  Btrfs: fix to catch all errors when resolving indirect ref
  Btrfs: fix protection between walking backrefs and root deletion
  btrfs: fix warning while merging two adjacent extents
  Btrfs: fix infinite path build loops in incremental send
  btrfs: undo sysfs when open_ctree() fails
  Btrfs: fix snprintf usage by send's gen_unique_name
  btrfs: fix defrag 32-bit integer overflow
  btrfs: sysfs: list the NO_HOLES feature
  btrfs: sysfs: don't show reserved incompat feature
  btrfs: call permission checks earlier in ioctls and return EPERM
  ...
2014-01-30 20:08:20 -08:00
Richard Yao 46bf16c44b Documentation/filesystems/vfs.txt: update file_operations documentation
->readv, ->writev and ->sendfile have been removed while ->show_fdinfo
has been added. The documentation should reflect this.

Signed-off-by: Richard Yao <ryao@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-30 16:56:56 -08:00
David Rientjes 778c14affa mm, oom: base root bonus on current usage
A 3% of system memory bonus is sometimes too excessive in comparison to
other processes.

With commit a63d83f427 ("oom: badness heuristic rewrite"), the OOM
killer tries to avoid killing privileged tasks by subtracting 3% of
overall memory (system or cgroup) from their per-task consumption.  But
as a result, all root tasks that consume less than 3% of overall memory
are considered equal, and so it only takes 33+ privileged tasks pushing
the system out of memory for the OOM killer to do something stupid and
kill dhclient or other root-owned processes.  For example, on a 32G
machine it can't tell the difference between the 1M agetty and the 10G
fork bomb member.

The changelog describes this 3% boost as the equivalent to the global
overcommit limit being 3% higher for privileged tasks, but this is not
the same as discounting 3% of overall memory from _every privileged task
individually_ during OOM selection.

Replace the 3% of system memory bonus with a 3% of current memory usage
bonus.

By giving root tasks a bonus that is proportional to their actual size,
they remain comparable even when relatively small.  In the example
above, the OOM killer will discount the 1M agetty's 256 badness points
down to 179, and the 10G fork bomb's 262144 points down to 183500 points
and make the right choice, instead of discounting both to 0 and killing
agetty because it's first in the task list.
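
The difference between the two bonus schemes can be sketched in a few lines.
This is only an illustration under stated assumptions: the function names,
the 4K page size and the task sizes (1M and 100M on a 32G machine) are
chosen here and are not the figures quoted above, and the sketch ignores
everything else the real badness heuristic takes into account.

```python
PAGE_SIZE = 4096  # assumed page size, in bytes


def badness_old(task_pages, total_pages, is_root):
    """Old scheme: subtract 3% of *overall* memory from a root task."""
    points = task_pages
    if is_root:
        points -= total_pages * 3 // 100
    return max(points, 0)


def badness_new(task_pages, total_pages, is_root):
    """New scheme: subtract 3% of the task's *own* usage instead."""
    points = task_pages
    if is_root:
        points -= points * 3 // 100
    return max(points, 0)


# Hypothetical machine and tasks, expressed in pages.
total = 32 * 2**30 // PAGE_SIZE   # 32G of memory
small = 1 * 2**20 // PAGE_SIZE    # 1M root task
large = 100 * 2**20 // PAGE_SIZE  # 100M root task

# Old scheme: both tasks are below 3% of total memory, so both collapse to 0.
print(badness_old(small, total, True), badness_old(large, total, True))
# New scheme: the scores shrink slightly but stay comparable.
print(badness_new(small, total, True), badness_new(large, total, True))
```

With the proportional bonus the larger task still scores higher than the
smaller one, which is the property the change is after.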

Signed-off-by: David Rientjes <rientjes@google.com>
Reported-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-30 16:56:56 -08:00
Linus Torvalds d9894c228b Merge branch 'for-3.14' of git://linux-nfs.org/~bfields/linux
Pull nfsd updates from Bruce Fields:
 - Handle some loose ends from the vfs read delegation support.
   (For example, nfsd can stop breaking leases on its own in a
    few places where it can now depend on the vfs to.)
 - Make life a little easier for NFSv4-only configurations
   (thanks to Kinglong Mee).
 - Fix some gss-proxy problems (thanks Jeff Layton).
 - miscellaneous bug fixes and cleanup

* 'for-3.14' of git://linux-nfs.org/~bfields/linux: (38 commits)
  nfsd: consider CLAIM_FH when handing out delegation
  nfsd4: fix delegation-unlink/rename race
  nfsd4: delay setting current_fh in open
  nfsd4: minor nfs4_setlease cleanup
  gss_krb5: use lcm from kernel lib
  nfsd4: decrease nfsd4_encode_fattr stack usage
  nfsd: fix encode_entryplus_baggage stack usage
  nfsd4: simplify xdr encoding of nfsv4 names
  nfsd4: encode_rdattr_error cleanup
  nfsd4: nfsd4_encode_fattr cleanup
  minor svcauth_gss.c cleanup
  nfsd4: better VERIFY comment
  nfsd4: break only delegations when appropriate
  NFSD: Fix a memory leak in nfsd4_create_session
  sunrpc: get rid of use_gssp_lock
  sunrpc: fix potential race between setting use_gss_proxy and the upcall rpc_clnt
  sunrpc: don't wait for write before allowing reads from use-gss-proxy file
  nfsd: get rid of unused function definition
  Define op_iattr for nfsd4_open instead using macro
  NFSD: fix compile warning without CONFIG_NFSD_V3
  ...
2014-01-30 10:18:43 -08:00
Qu Wenruo a88998f291 btrfs: Add treelog mount option.
Add treelog mount option to enable tree log with
remount option.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-01-28 13:20:20 -08:00
Qu Wenruo d399167d88 btrfs: Add datasum mount option.
Add datasum mount option to enable checksum with
remount option.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-01-28 13:20:20 -08:00
Qu Wenruo a258af7a3e btrfs: Add datacow mount option.
Add datacow mount option to enable copy-on-write with
remount option.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-01-28 13:20:19 -08:00
Qu Wenruo bd0330ad21 btrfs: Add acl mount option.
Add acl mount option to enable acl with remount option.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-01-28 13:20:19 -08:00
Qu Wenruo 2c9ee85671 btrfs: Add noflushoncommit mount option.
Add noflushoncommit mount option to disable flush on commit with
remount option.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-01-28 13:20:18 -08:00
Qu Wenruo 5303629343 btrfs: Add noenospc_debug mount option.
Add noenospc_debug mount option to disable ENOSPC debug with
remount option.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-01-28 13:20:17 -08:00
Qu Wenruo e07a2ade44 btrfs: Add nodiscard mount option.
Add nodiscard mount option to disable discard with remount option.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-01-28 13:20:17 -08:00
Qu Wenruo fc0ca9af18 btrfs: Add noautodefrag mount option.
Btrfs has autodefrag mount option but no pairing noautodefrag option,
which makes it impossible to disable autodefrag without umount.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-01-28 13:20:16 -08:00
Qu Wenruo 842bef5891 btrfs: Add "barrier" option to support "-o remount,barrier"
Btrfs can be remounted without barrier, but there is no "barrier" option
so nobody can remount btrfs back with barrier on. Only umount and
mount again can re-enable barrier. (Quite awkward)

The mount options in the document are also changed slightly for the
further pairing options changes.
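
As a rough illustration of what the paired options allow, the sketch below
builds a "mount -o remount,<option>" command line for either half of a pair.
The helper names and the mount point are hypothetical; the option names are
the ones from this series together with their pre-existing counterparts. The
command is only constructed, not executed.

```python
# Each option mapped to its opposite; the table is completed in both
# directions below.
PAIRS = {
    "barrier": "nobarrier",
    "autodefrag": "noautodefrag",
    "discard": "nodiscard",
    "enospc_debug": "noenospc_debug",
    "flushoncommit": "noflushoncommit",
    "acl": "noacl",
    "datacow": "nodatacow",
    "datasum": "nodatasum",
    "treelog": "notreelog",
}
OPPOSITE = {**PAIRS, **{off: on for on, off in PAIRS.items()}}


def opposite(option):
    """Return the option that undoes `option`, e.g. nobarrier -> barrier."""
    return OPPOSITE[option]


def remount_command(mountpoint, option):
    """Build (but do not run) the argv that remounts with `option`."""
    if option not in OPPOSITE:
        raise ValueError(f"not a paired option: {option}")
    return ["mount", "-o", f"remount,{option}", mountpoint]


# Hypothetical mount point; prints the command that re-enables barriers.
print(" ".join(remount_command("/mnt/btrfs", opposite("nobarrier"))))
```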

Reported-by: Daniel Blueman <daniel@quora.org>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Mike Fleetwood <mike.fleetwood@googlemail.com>
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-01-28 13:20:16 -08:00
Linus Torvalds 1c2948380b 9p changes for 3.14 merge window
Included are a new cache model for support of mmap,
 and several cleanups across the filesystem and networking
 portions of the code.
 
 Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

Merge tag 'for-3.14-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs

Pull 9p changes from Eric Van Hensbergen:
 "Included are a new cache model for support of mmap, and several
  cleanups across the filesystem and networking portions of the code"

* tag 'for-3.14-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
  9p: update documentation
  9P: introduction of a new cache=mmap model.
  net/9p: remove virtio default hack and set appropriate bits instead
  9p: remove useless 'name' variable and assignment
  9p: fix return value in case in v9fs_fid_xattr_set()
  9p: remove useless variable and assignment
  9p: remove useless assignment
  9p: remove unused 'super_block' struct pointer
  9p: remove never used return variable
  9p: remove unused 'p9_fid' struct pointer
  9p: remove unused 'p9_client' struct pointer
2014-01-26 10:55:41 -08:00
Eric Van Hensbergen b871866e4a 9p: update documentation
quick pass to update the documentation to include instructions for
the new cache=mmap mode as well as clean up some out-of-date bits.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2014-01-24 10:55:21 -06:00
Fabian Frederick 50114c1103 Documentation/filesystems/00-INDEX: updates
Add the following documentation-files with description :
 -autofs4-mount-control.txt
 -btrfs.txt
 -debugfs.txt
 -devpts.txt
 -fiemap.txt
 -gfs2-glocks.txt
 -gfs2-uevents.txt
 -omfs.txt
 -path-lookup.txt
 -qnx6.txt
 -quota.txt
 -squashfs.txt
 -sysfs-tagging.txt
 -ubifs.txt
 -xfs-delayed-logging-design.txt
 -xfs-self-describing-metadata.txt

Add the following documentation directories with description :
 -caching
 -cifs (replacing cifs.txt)
 -pohmelfs

Remove the following documentation-files reference:
 -dentry-locking.txt
 -reiser4.txt

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23 16:37:01 -08:00
Andre Richter c108373290 Documentation/filesystems/sysfs.txt: fix device_attribute declaration
Fix a wrong device_attribute declaration example.

Signed-off-by: Andre Richter <andre.o.richter@gmail.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23 16:37:00 -08:00
Vyacheslav Dubeyko d623a9420c nilfs2: add comments for ioctls
Add comments for ioctls in fs/nilfs2/ioctl.c file and describe NILFS2
specific ioctls in Documentation/filesystems/nilfs2.txt.

Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com>
Reviewed-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: Wenliang Fan <fanwlexca@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23 16:37:00 -08:00
Linus Torvalds 0d90d63872 f2fs updates for v3.14
This patch-set includes the following major enhancement patches.
 o support inline_data
 o refactor bio operations such as merge operations and rw type assignment
 o enhance the direct IO path
 o enhance bio operations
 o truncate a node page when it becomes obsolete
 o add sysfs entries: small_discards, max_victim_search, and in-place-update
 o add a sysfs entry to control max_victim_search
 
 The other bug fixes are as follows.
 o fix a bug in truncate_partial_nodes
 o avoid warnings during sparse and build process
 o fix error handling flows
 o fix potential bit overflows
 
 And, there are a bunch of cleanups.

Merge tag 'for-f2fs-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs updates from Jaegeuk Kim:
 "In this round, a couple of sysfs entries were introduced to tune the
  f2fs at runtime.

  In addition, f2fs starts to support inline_data and improves the
  read/write performance in some workloads by refactoring bio-related
  flows.

  This patch-set includes the following major enhancement patches.
   - support inline_data
   - refactor bio operations such as merge operations and rw type
     assignment
   - enhance the direct IO path
   - enhance bio operations
   - truncate a node page when it becomes obsolete
   - add sysfs entries: small_discards, max_victim_search, and
     in-place-update
   - add a sysfs entry to control max_victim_search

  The other bug fixes are as follows.
   - fix a bug in truncate_partial_nodes
   - avoid warnings during sparse and build process
   - fix error handling flows
   - fix potential bit overflows

  And, there are a bunch of cleanups"

* tag 'for-f2fs-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (95 commits)
  f2fs: drop obsolete node page when it is truncated
  f2fs: introduce NODE_MAPPING for code consistency
  f2fs: remove the orphan block page array
  f2fs: add help function META_MAPPING
  f2fs: move a branch for code redability
  f2fs: call mark_inode_dirty to flush dirty pages
  f2fs: clean checkpatch warnings
  f2fs: missing REQ_META and REQ_PRIO when sync_meta_pages(META_FLUSH)
  f2fs: avoid f2fs_balance_fs call during pageout
  f2fs: add delimiter to seperate name and value in debug phrase
  f2fs: use spinlock rather than mutex for better speed
  f2fs: move alloc new orphan node out of lock protection region
  f2fs: move grabing orphan pages out of protection region
  f2fs: remove the needless parameter of f2fs_wait_on_page_writeback
  f2fs: update documents and a MAINTAINERS entry
  f2fs: add a sysfs entry to control max_victim_search
  f2fs: improve write performance under frequent fsync calls
  f2fs: avoid to read inline data except first page
  f2fs: avoid to left uninitialized data in page when read inline data
  f2fs: fix truncate_partial_nodes bug
  ...
2014-01-23 09:21:09 -08:00
Linus Torvalds bb1281f2aa Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial tree updates from Jiri Kosina:
 "Usual rocket science stuff from trivial.git"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits)
  neighbour.h: fix comment
  sched: Fix warning on make htmldocs caused by wait.h
  slab: struct kmem_cache is protected by slab_mutex
  doc: Fix typo in USB Gadget Documentation
  of/Kconfig: Spelling s/one/once/
  mkregtable: Fix sscanf handling
  lp5523, lp8501: comment improvements
  thermal: rcar: comment spelling
  treewide: fix comments and printk msgs
  IXP4xx: remove '1 &&' from a condition check in ixp4xx_restart()
  Documentation: update /proc/uptime field description
  Documentation: Fix size parameter for snprintf
  arm: fix comment header and macro name
  asm-generic: uaccess: Spelling s/a ny/any/
  mtd: onenand: fix comment header
  doc: driver-model/platform.txt: fix a typo
  drivers: fix typo in DEVTMPFS_MOUNT Kconfig help text
  doc: Fix typo (acces_process_vm -> access_process_vm)
  treewide: Fix typos in printk
  drivers/gpu/drm/qxl/Kconfig: reformat the help text
  ...
2014-01-22 21:21:55 -08:00
Rik van Riel 34e431b0ae /proc/meminfo: provide estimated available memory
Many load balancing and workload placing programs check /proc/meminfo to
estimate how much free memory is available.  They generally do this by
adding up "free" and "cached", which was fine ten years ago, but is
pretty much guaranteed to be wrong today.

It is wrong because Cached includes memory that is not freeable as page
cache, for example shared memory segments, tmpfs, and ramfs, and it does
not include reclaimable slab memory, which can take up a large fraction
of system memory on mostly idle systems with lots of files.

Currently, the amount of memory that is available for a new workload,
without pushing the system into swap, can be estimated from MemFree,
Active(file), Inactive(file), and SReclaimable, as well as the "low"
watermarks from /proc/zoneinfo.

However, this may change in the future, and user space really should not
be expected to know kernel internals to come up with an estimate for the
amount of free memory.

It is more convenient to provide such an estimate in /proc/meminfo.  If
things change in the future, we only have to change it in one place.
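
For illustration, this is roughly the kind of calculation user space would
otherwise have to do by hand from the fields named above. It is a sketch, not
the kernel's implementation: the function names are made up here, the low
watermark is passed in as a single number in kB rather than read from
/proc/zoneinfo, and the way page cache and reclaimable slab are discounted
(at most half of each, capped at the watermark) is an assumption.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a {name: value_in_kB} dict."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])
    return fields


def estimate_available_kb(fields, low_watermark_kb):
    """Estimate memory usable by a new workload without swapping."""
    # Free memory, minus the reserve the kernel keeps below "low".
    available = fields["MemFree"] - low_watermark_kb

    # File-backed page cache is mostly reclaimable, but not all of it.
    pagecache = fields["Active(file)"] + fields["Inactive(file)"]
    pagecache -= min(pagecache // 2, low_watermark_kb)

    # The same reasoning applies to reclaimable slab.
    slab = fields["SReclaimable"]
    slab -= min(slab // 2, low_watermark_kb)

    return max(available + pagecache + slab, 0)


# Made-up sample values, in kB.
SAMPLE = """\
MemFree:         1000000 kB
Active(file):     400000 kB
Inactive(file):   600000 kB
SReclaimable:     100000 kB
"""

print(estimate_available_kb(parse_meminfo(SAMPLE), low_watermark_kb=50000))
```

Every program that needs this number would have to repeat logic like this and
keep it in sync with kernel internals, which is the argument for exporting the
estimate from one place.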

Signed-off-by: Rik van Riel <riel@redhat.com>
Reported-by: Erik Mouw <erik.mouw_2@nxp.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-21 16:19:43 -08:00
Jaegeuk Kim 3bac380c90 f2fs: update documents and a MAINTAINERS entry
This patch adds some missing descriptions of sysfs entries in
 - Documentation/ABI/testing/sysfs-fs-f2fs
 - Documentation/filesystems/f2fs.txt.

It also adds an entry for the F2FS documentation in MAINTAINERS.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-01-09 21:00:06 +09:00
Rob Landley 49457896d5 Documentation: update /proc/uptime field description
/proc/uptime has two fields, update description to describe both
ala http://lkml.indiana.edu/hypermail/linux/kernel/1312.3/01454.html
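
A minimal sketch of reading the two fields, assuming the first is the system
uptime in seconds and the second is the time spent idle in seconds; the
function name and the sample line are made up for illustration.

```python
def parse_uptime(text):
    """Split a /proc/uptime-style line into (uptime_s, idle_s) floats."""
    uptime_s, idle_s = text.split()[:2]
    return float(uptime_s), float(idle_s)


# Made-up sample; on a real system the text would come from /proc/uptime.
uptime_s, idle_s = parse_uptime("350735.47 234388.90\n")
print(uptime_s, idle_s)
```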

Signed-off-by: Rob Landley <rob@landley.net>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2014-01-02 10:53:04 +01:00
Huajun Li e4024e86c2 f2fs: update f2fs Documentation
This patch describes the inline_data support in f2fs document.

Signed-off-by: Huajun Li <huajun.li@intel.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-12-26 20:40:44 +09:00
Jaegeuk Kim ba0697ec98 f2fs: add description about small_discards in document
This patch adds a description of small_discards to the f2fs document.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-12-23 10:18:07 +09:00
Jaegeuk Kim 216fbd6443 f2fs: introduce sysfs entry to control in-place-update policy
This patch introduces new sysfs entries for users to control the policy of
in-place-updates, namely IPU, in f2fs.

Sometimes f2fs suffers from performance degradation due to its out-of-place
update policy that produces many additional node block writes.
If the storage performance is very dependent on the amount of data writes
instead of IO patterns, we'd better drop this out-of-place update policy.

This patch suggests 5 policies and their triggering conditions as follows.

[sysfs entry name = ipu_policy]

0: F2FS_IPU_FORCE       all the time,
1: F2FS_IPU_SSR         if SSR mode is activated,
2: F2FS_IPU_UTIL        if FS utilization is over threshold,
3: F2FS_IPU_SSR_UTIL    if SSR mode is activated and FS utilization is over
                        threshold,
4: F2FS_IPU_DISABLE     disable IPU. (=default option)

[sysfs entry name = min_ipu_util]

This parameter controls the threshold to trigger in-place-updates.
The number indicates the percentage of filesystem utilization, and is used
by the F2FS_IPU_UTIL and F2FS_IPU_SSR_UTIL policies.

For more details, see need_inplace_update() in segment.h.
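
The policy table above can be read as a small decision function. The sketch
below is not the code in segment.h; the Python names, the boolean
`ssr_active` argument and the strict "greater than" comparison against
min_ipu_util are assumptions made for illustration.

```python
from enum import IntEnum


class IpuPolicy(IntEnum):
    """Values accepted by the ipu_policy sysfs entry, as listed above."""
    FORCE = 0
    SSR = 1
    UTIL = 2
    SSR_UTIL = 3
    DISABLE = 4


def need_inplace_update(policy, ssr_active, utilization, min_ipu_util):
    """Decide whether a write should be done in place.

    `utilization` and `min_ipu_util` are percentages of filesystem usage.
    """
    policy = IpuPolicy(policy)
    over_threshold = utilization > min_ipu_util
    if policy == IpuPolicy.FORCE:
        return True
    if policy == IpuPolicy.SSR:
        return ssr_active
    if policy == IpuPolicy.UTIL:
        return over_threshold
    if policy == IpuPolicy.SSR_UTIL:
        return ssr_active and over_threshold
    return False  # IpuPolicy.DISABLE: always write out of place


# Example: policy 3 needs both SSR mode and utilization over the threshold.
print(need_inplace_update(3, ssr_active=True, utilization=80, min_ipu_util=70))
```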

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-12-23 10:18:07 +09:00
Stefan Weil 507da6a1f3 doc: Fix typo (acces_process_vm -> access_process_vm)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2013-12-19 15:12:21 +01:00
J. Bruce Fields 4bd8eabc29 nfsd4: update 4.1 nfsd status documentation
This has gone a little stale.

Reported-by: Christoph Hellwig <hch@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-12-10 20:35:57 -05:00