Commit graph

208 commits

Author SHA1 Message Date
Jaegeuk Kim 836b5a6356 f2fs: issue discard with finally produced len and minlen
This patch determines to issue discard commands by comparing given minlen and
the length of produced final candidates.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-05-28 15:41:39 -07:00
Jaegeuk Kim a66cdd9855 f2fs: introduce discard_map for f2fs_trim_fs
This patch adds a bitmap for discard issues from f2fs_trim_fs.
There-in rule is to issue discard commands only for invalidated blocks
after mount.
Once mount is done, f2fs_trim_fs trims out whole invalid area.
After ehn, it will not issue and discrads redundantly.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-05-28 15:41:39 -07:00
Jaegeuk Kim 05ca3632e5 f2fs: add sbi and page pointer in f2fs_io_info
This patch adds f2fs_sb_info and page pointers in f2fs_io_info structure.
With this change, we can reduce a lot of parameters for IO functions.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-05-28 15:41:32 -07:00
Jaegeuk Kim 8ce67cb07d f2fs: add some tracepoints to debug volatile and atomic writes
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-04-10 15:08:47 -07:00
Jaegeuk Kim 21cb1d99bc f2fs: fix to cover sentry_lock for block allocation
In the following call stack, f2fs changes the bitmap for dirty segments and # of
dirty sentries without grabbing sit_i->sentry_lock.
This can result in mismatch on bitmap and # of dirty sentries, since if there
are some direct_io operations.

In allocate_data_block,
 - __allocate_new_segments
  - mutex_lock(&curseg->curseg_mutex);
  - s_ops->allocate_segment
   - new_curseg/change_curseg
    - reset_curseg
     - __set_sit_entry_type
      - __mark_sit_entry_dirty
       - set_bit(dirty_sentries_bitmap)
       - dirty_sentries++;

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-04-10 15:08:43 -07:00
Chao Yu b28c3f9493 f2fs: fix to issue small discard in real-time mode discard
Now in f2fs, we share functions and structures for batch mode and real-time mode
discard. For real-time mode discard, in shared function add_discard_addrs, we
will use uninitialized trim_minlen in struct cp_control to compare with length
of contiguous free blocks to decide whether skipping discard fragmented freespace
or not, this makes us ignore small discard sometimes. Fix it.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Reviewed-by : Changman Lee <cm224.lee@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-04-10 15:08:31 -07:00
Wanpeng Li 2b11a74b21 f2fs: don't need to collect dirty sit entries and flush journal when there's no dirty sit entries
Don't need to collect dirty sit entries and flush sit journal to sit
 entries when there's no dirty sit entries. This patch check dirty_sentries
 earlier just like flush_nat_entries.

Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-04-10 15:08:29 -07:00
Chao Yu 1dcc336b02 f2fs: enable rb-tree extent cache
This patch enables rb-tree based extent cache in f2fs.

When we mount with "-o extent_cache", f2fs will try to add recently accessed
page-block mappings into rb-tree based extent cache as much as possible, instead
of original one extent info cache.

By this way, f2fs can support more effective cache between dnode page cache and
disk. It will supply high hit ratio in the cache with fewer memory when dnode
page cache are reclaimed in environment of low memory.

Storage: Sandisk sd card 64g
1.append write file (offset: 0, size: 128M);
2.override write file (offset: 2M, size: 1M);
3.override write file (offset: 4M, size: 1M);
...
4.override write file (offset: 48M, size: 1M);
...
5.override write file (offset: 112M, size: 1M);
6.sync
7.echo 3 > /proc/sys/vm/drop_caches
8.read file (size:128M, unit: 4k, count: 32768)
(time dd if=/mnt/f2fs/128m bs=4k count=32768)

Extent Hit Ratio:
		before		patched
Hit Ratio	121 / 1071	1071 / 1071

Performance:
		before		patched
real    	0m37.051s	0m35.556s
user    	0m0.040s	0m0.026s
sys     	0m2.990s	0m2.251s

Memory Cost:
		before		patched
Tree Count:	0		1 (size: 24 bytes)
Node Count:	0		45 (size: 1440 bytes)

v3:
 o retest and given more details of test result.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-03-03 09:58:47 -08:00
Chao Yu 1a118ccfd6 f2fs: use spinlock for segmap_lock instead of rwlock
rwlock can provide better concurrency when there are much more readers than
writers because readers can hold the rwlock simultaneously.

But now, for segmap_lock rwlock in struct free_segmap_info, there is only one
reader 'mount' from below call path:
->f2fs_fill_super
  ->build_segment_manager
    ->build_dirty_segmap
      ->init_dirty_segmap
        ->find_next_inuse
          read_lock
          ...
          read_unlock

Now that our concurrency can not be improved since there is no other reader for
this lock, we do not need to use rwlock_t type for segmap_lock, let's replace it
with spinlock_t type.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-02-11 17:04:51 -08:00
Jaegeuk Kim 60a3b782b1 f2fs: avoid variable length array
Instead of using variable length array, this patch let preallocate memory for
them.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-02-11 17:04:50 -08:00
Jaegeuk Kim f7ef9b83b5 f2fs: introduce macros to convert bytes and blocks in f2fs
This patch adds two macros for transition between byte and block offsets.
Currently, f2fs only supports 4KB blocks, so use the default size for now.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-02-11 17:04:48 -08:00
Jaegeuk Kim bba681cbb2 f2fs: introduce a batched trim
This patch introduces a batched trimming feature, which submits split discard
commands.

This is to avoid long latency due to huge trim commands.
If fstrim was triggered ranging from 0 to the end of device, we should lock
all the checkpoint-related mutexes, resulting in very long latency.

Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-02-11 17:04:44 -08:00
Jaegeuk Kim 119ee91445 f2fs: split UMOUNT and FASTBOOT flags
This patch adds FASTBOOT flag into checkpoint as follows.

 - CP_UMOUNT_FLAG is set when system is umounted.
 - CP_FASTBOOT_FLAG is set when intermediate checkpoint having node summaries
   was done.

So, if you get CP_UMOUNT_FLAG from checkpoint, the system was umounted cleanly.
Instead, if there was sudden-power-off, you can get CP_FASTBOOT_FLAG or nothing.

Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-02-11 17:04:41 -08:00
Jaegeuk Kim 38aa0889b2 f2fs: align direct_io'ed data to section
This patch aligns the start block address of a file for direct io to the f2fs's
section size.

Some flash devices manage an over 4KB-sized page as a write unit, and if the
direct_io'ed data are written but not aligned to that unit, the performance can
be degraded due to the partial page copies.

Thus, since f2fs has a section that is well aligned to FTL units, we can align
the block address to the section size so that f2fs avoids this misalignment.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-01-09 17:02:27 -08:00
Jaegeuk Kim e1509cf294 f2fs: clean up to remove parameter
This patch uses dn->data_blkaddr as a parameter for the destination block
address.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-01-09 17:02:26 -08:00
Changman Lee b9a2c25207 f2fs: add block count by in-place-update in stat info
This patch adds block count by in-place-update in stat.

Signed-off-by: Changman Lee <cm224.lee@samsung.com>
Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-01-09 17:02:25 -08:00
Jaegeuk Kim 9e4ded3f30 f2fs: activate f2fs_trace_pid
This patch activates f2fs_trace_pid.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-01-09 17:02:24 -08:00
Jaegeuk Kim cf04e8eb55 f2fs: use f2fs_io_info to clean up messy parameters during IO path
This patch cleans up parameters on IO paths.
The key idea is to use f2fs_io_info adding a parameter, block address, and then
use this structure as parameters.

Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-01-09 17:02:23 -08:00
Chao Yu 3fa06d7bc9 f2fs: readahead contiguous current summary blocks in checkpoint
Let's add readahead code for reading contiguous compact/normal summary blocks
in checkpoint, then we will gain better performance in mount procedure.

Changes from v1
  o remove inappropriate 'unlikely' in npages_for_summary_flush.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-01-09 17:02:23 -08:00
Jaegeuk Kim 042b7816aa f2fs: remove unnecessary call to invalidate inmemory pages
Now we use inmemory pages for atomic write only and provide abort procedure,
we don't need to truncate them explicitly.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-01-09 17:02:22 -08:00
Jaegeuk Kim d7bc2484b8 f2fs: fix small discards not to issue redundantly
The ckpt_valid_map and cur_valid_map are synced by seg_info_to_raw_sit.

In the case of small discards, the candidates are selected before sync,
while fitrim selects candidates after sync.

So, for small discards, we need to add candidates only just being obsoleted.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-01-09 17:02:22 -08:00
Jaegeuk Kim 1e84371ffe f2fs: change atomic and volatile write policies
This patch adds two new ioctls to release inmemory pages grabbed by atomic
writes.
 o f2fs_ioc_abort_volatile_write
  - If transaction was failed, all the grabbed pages and data should be written.
 o f2fs_ioc_release_volatile_write
  - This is to enhance the performance of PERSIST mode in sqlite.

In order to avoid huge memory consumption which causes OOM, this patch changes
volatile writes to use normal dirty pages, instead blocked flushing to the disk
as long as system does not suffer from memory pressure.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-01-09 17:02:22 -08:00
Jaegeuk Kim 70c640b1d6 f2fs: don't need to call lock_op and lock_page for abort
We don't need to call lock_op and lock_page at the aborting path.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-01-09 17:02:21 -08:00
Jaegeuk Kim 88a70a69c0 f2fs: fix wrong condition check to trigger f2fs_sync_fs
If there is not enough available memory, we need to trigger f2fs_sync_fs.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-01-09 17:02:21 -08:00
Jaegeuk Kim 8dcf2ff721 f2fs: count the number of inmemory pages
This patch adds counting # of inmemory pages in the page cache.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-12-08 10:35:15 -08:00
Jaegeuk Kim 0722b1011a f2fs: set page private for inmemory pages for truncation
The inmemory pages should be handled by invalidate_page since it needs to be
released int the truncation path.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-12-08 10:35:14 -08:00
Jaegeuk Kim 9be32d72be f2fs: do retry operations with cond_resched
This patch revists retrial paths in f2fs.
The basic idea is to use cond_resched instead of retrying from the very early
stage.

Suggested-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-12-08 10:35:05 -08:00
Jaegeuk Kim 0341845efc f2fs: fix livelock calling f2fs_iget during f2fs_evict_inode
In f2fs_evict_inode,
 commit_inmemory_pages
   f2fs_gc
     f2fs_iget
       iget_locked
         -> wait for inode free

Here, if the inode is same as the one to be evicted, f2fs should wait forever.
Actually, we should not call f2fs_balance_fs during f2fs_evict_inode to avoid
this.

But, the commit_inmem_pages calls f2fs_balance_fs by default, even if
f2fs_evict_inode wants to free inmemory pages only.

Hence, this patch adds to trigger f2fs_balance_fs only when there is something
to write.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-23 21:51:57 -08:00
Changman Lee c9ee00857c f2fs: fix wrong data structure when create slab
It used nat_entry_set when create slab for sit_entry_set.

Signed-off-by: Changman Lee <cm224.lee@samsung.com>
Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-23 21:48:49 -08:00
Jaegeuk Kim e5e7ea3c86 f2fs: control the memory footprint used by ino entries
This patch adds to control the memory footprint used by ino entries.
This will conduct best effort, not strictly.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-06 15:24:46 -08:00
Jaegeuk Kim a344b9fda0 f2fs: disable roll-forward when active_logs = 2
The roll-forward mechanism should be activated when the number of active
logs is not 2.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-05 20:05:53 -08:00
Jaegeuk Kim adf4983bde f2fs: send discard commands in larger extent
If there is a chance to make a huge sized discard command, we don't need
to split it out, since each blkdev_issue_discard should wait one at a
time.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-04 17:34:13 -08:00
Jaegeuk Kim e3fb1b794b f2fs: do not discard data protected by the previous checkpoint
We should not discard any data protected by the previous checkpoint all
the time.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-03 16:07:38 -08:00
Jaegeuk Kim ca4b02eeed f2fs: call write_checkpoint under disabled gc
During the write_checkpoint, we should avoid f2fs_gc trigger to avoid any
filesystem consistency.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-03 16:07:37 -08:00
Gu Zheng 2cc2218611 f2fs: use current_sit_addr to replace the open code
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-03 16:07:37 -08:00
Gu Zheng 52aca07425 f2fs: rename f2fs_set/clear_bit to f2fs_test_and_set/clear_bit
Rename f2fs_set/clear_bit to f2fs_test_and_set/clear_bit, which mean
set/clear bit and return the old value, for better readability.

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-03 16:07:36 -08:00
Jan Kara 9bd27ae4aa f2fs: avoid returning uninitialized value to userspace from f2fs_trim_fs()
If user specifies too low end sector for trimming, f2fs_trim_fs() will
use uninitialized value as a number of trimmed blocks and returns it to
userspace. Initialize number of trimmed blocks early to avoid the
problem.

Coverity-id: 1248809
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-03 16:07:35 -08:00
Jaegeuk Kim 4a257ed677 f2fs: avoid build warning
This patch removes build warning.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-03 16:07:30 -08:00
Jaegeuk Kim cbcb2872e3 f2fs: invalidate inmemory page
If user truncates file's data, we should truncate inmemory pages too.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-03 16:07:29 -08:00
Jaegeuk Kim 34ba94bac9 f2fs: do not make dirty any inmemory pages
This patch let inmemory pages be clean all the time.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-03 16:07:29 -08:00
Jaegeuk Kim 88b88a6679 f2fs: support atomic writes
This patch introduces a very limited functionality for atomic write support.
In order to support atomic write, this patch adds two ioctls:
 o F2FS_IOC_START_ATOMIC_WRITE
 o F2FS_IOC_COMMIT_ATOMIC_WRITE

The database engine should be aware of the following sequence.
1. open
 -> ioctl(F2FS_IOC_START_ATOMIC_WRITE);
2. writes
  : all the written data will be treated as atomic pages.
3. commit
 -> ioctl(F2FS_IOC_COMMIT_ATOMIC_WRITE);
  : this flushes all the data blocks to the disk, which will be shown all or
  nothing by f2fs recovery procedure.
4. repeat to #2.

The IO pattens should be:

  ,- START_ATOMIC_WRITE                  ,- COMMIT_ATOMIC_WRITE
 CP | D D D D D D | FSYNC | D D D D | FSYNC ...
                      `- COMMIT_ATOMIC_WRITE

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-10-06 17:39:50 -07:00
Jaegeuk Kim 7cd8558baa f2fs: check the use of macros on block counts and addresses
This patch cleans up the existing and new macros for readability.

Rule is like this.

         ,-----------------------------------------> MAX_BLKADDR -,
         |  ,------------- TOTAL_BLKS ----------------------------,
         |  |                                                     |
         |  ,- seg0_blkaddr   ,----- sit/nat/ssa/main blkaddress  |
block    |  | (SEG0_BLKADDR)  | | | |   (e.g., MAIN_BLKADDR)      |
address  0..x................ a b c d .............................
            |                                                     |
global seg# 0...................... m .............................
            |                       |                             |
            |                       `------- MAIN_SEGS -----------'
            `-------------- TOTAL_SEGS ---------------------------'
                                    |                             |
 seg#                               0..........xx..................

= Note =
 o GET_SEGNO_FROM_SEG0 : blk address -> global segno
 o GET_SEGNO           : blk address -> segno
 o START_BLOCK         : segno -> starting block address

Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-30 15:34:47 -07:00
Jaegeuk Kim 4b2fecc846 f2fs: introduce FITRIM in f2fs_ioctl
This patch introduces FITRIM in f2fs_ioctl.
In this case, f2fs will issue small discards and prefree discards as many as
possible for the given area.

Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-30 15:06:09 -07:00
Jaegeuk Kim 9b5f136fd4 f2fs: change the ipu_policy option to enable combinations
This patch changes the ipu_policy setting to use any combination of orthogonal policies.

Signed-off-by: Changman Lee <cm224.lee@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-23 11:10:24 -07:00
Chao Yu 55cf9cb63f f2fs: support large sector size
Block size in f2fs is 4096 bytes, so theoretically, f2fs can support 4096 bytes
sector device at maximum. But now f2fs only support 512 bytes size sector, so
block device such as zRAM which uses page cache as its block storage space will
not be mounted successfully as mismatch between sector size of zRAM and sector
size of f2fs supported.

In this patch we support large sector size in f2fs, so block device with sector
size of 512/1024/2048/4096 bytes can be supported in f2fs.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-23 11:10:20 -07:00
Jaegeuk Kim 90a893c749 f2fs: use MAX_BIO_BLOCKS(sbi)
This patch cleans up a simple macro.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-23 11:10:18 -07:00
Jaegeuk Kim c1ce1b02bb f2fs: give an option to enable in-place-updates during fsync to users
If user wrote F2FS_IPU_FSYNC:4 in /sys/fs/f2fs/ipu_policy, f2fs_sync_file
only starts to try in-place-updates.
And, if the number of dirty pages is over /sys/fs/f2fs/min_fsync_blocks, it
keeps out-of-order manner. Otherwise, it triggers in-place-updates.

This may be used by storage showing very high random write performance.

For example, it can be used when,

Seq. writes (Data) + wait + Seq. writes (Node)

is pretty much slower than,

Rand. writes (Data)

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-16 04:10:44 -07:00
Gu Zheng 721bd4d5c3 f2fs: use lock-less list(llist) to simplify the flush cmd management
We use flush cmd control to collect many flush cmds, and flush them
together. In this case, we use two list to manage the flush cmds
(collect and dispatch), and one spin lock is used to protect this.
In fact, the lock-less list(llist) is very suitable to this case,
and we use simplify this routine.

-
v2:
-use llist_for_each_entry_safe to fix possible use-after-free issue.
-remove the unused field from struct flush_cmd.
Thanks for Yu's suggestion.
-

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-09 13:15:06 -07:00
Chao Yu 184a5cd2ce f2fs: refactor flush_sit_entries codes for reducing SIT writes
In commit aec71382c6 ("f2fs: refactor flush_nat_entries codes for reducing NAT
writes"), we descripte the issue as below:

"Although building NAT journal in cursum reduce the read/write work for NAT
block, but previous design leave us lower performance when write checkpoint
frequently for these cases:
1. if journal in cursum has already full, it's a bit of waste that we flush all
   nat entries to page for persistence, but not to cache any entries.
2. if journal in cursum is not full, we fill nat entries to journal util
   journal is full, then flush the left dirty entries to disk without merge
   journaled entries, so these journaled entries may be flushed to disk at next
   checkpoint but lost chance to flushed last time."

Actually, we have the same problem in using SIT journal area.

In this patch, firstly we will update sit journal with dirty entries as many as
possible. Secondly if there is no space in sit journal, we will remove all
entries in journal and walk through the whole dirty entry bitmap of sit,
accounting dirty sit entries located in same SIT block to sit entry set. All
entry sets are linked to list sit_entry_set in sm_info, sorted ascending order
by count of entries in set. Later we flush entries in set which have fewest
entries into journal as many as we can, and then flush dense set with merged
entries to disk.

In this way we can use sit journal area more effectively, also we will reduce
SIT update, result in gaining in performance and saving lifetime of flash
device.

In my testing environment, it shows this patch can help to reduce SIT block
update obviously.

virtual machine + hard disk:
fsstress -p 20 -n 400 -l 5
		sit page num	cp count	sit pages/cp
based		2006.50		1349.75		1.486
patched		1566.25		1463.25		1.070

Our latency of merging op is small when handling a great number of dirty SIT
entries in flush_sit_entries:
latency(ns)	dirty sit count
36038		2151
49168		2123
37174		2232

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-09 13:15:05 -07:00
Chao Yu d3a14afd5e f2fs: remove unneeded sit_i in macro SIT_BLOCK_OFFSET/START_SEGNO
sit_i in macro SIT_BLOCK_OFFSET/START_SEGNO is not used, remove it.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-09 13:15:05 -07:00
Jaegeuk Kim ec325b5270 f2fs: handle bug cases by letting fsck.f2fs initiate
This patch adds to handle corner buggy cases for fsck.f2fs.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-09 13:15:03 -07:00
Jaegeuk Kim 05796763b8 f2fs: add BUG cases to initiate fsck.f2fs
This patch replaces BUG cases with f2fs_bug_on to remain fsck.f2fs information.
And it implements some void functions to initiate fsck.f2fs too.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-09 13:15:03 -07:00
Jaegeuk Kim 9850cf4a89 f2fs: need fsck.f2fs when f2fs_bug_on is triggered
If any f2fs_bug_on is triggered, fsck.f2fs is needed.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-09 13:15:02 -07:00
Jaegeuk Kim 4081363fbe f2fs: introduce F2FS_I_SB, F2FS_M_SB, and F2FS_P_SB
This patch adds three inline functions to clean up dirty casting codes.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-03 17:37:13 -07:00
Jaegeuk Kim 202095a7a0 f2fs: remove rewrite_node_page
I think we need to let the dirty node pages remain in the page cache instead
of rewriting them in their places.
So, after done with successful recovery, write_checkpoint will flush all of them
through the normal write path.
Through this, we can avoid potential error cases in terms of block allocation.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-08-21 13:57:02 -07:00
arter97 e1c4204520 f2fs: fix typo
Fix typo and some grammatical errors.

The words "filesystem" and "readahead" are being used without the space treewide.

Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-08-19 10:01:33 -07:00
Chao Yu b65ee14818 f2fs: use for_each_set_bit to simplify the code
This patch uses for_each_set_bit to simplify some codes in f2fs.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-08-04 13:20:53 -07:00
Dongho Sim 33be828ada f2fs: remove redundant lines in allocate_data_block
There are redundant lines in allocate_data_block.

In this function, we call refresh_sit_entry with old seg and old curseg.
After that, we call locate_dirty_segment with old curseg.

But, the new address is always allocated from old curseg and
we call locate_dirty_segment with old curseg in refresh_sit_entry.
So, we do not need to call locate_dirty_segment with old curseg again.

We've discussed like below:

Jaegeuk said:
 "When considering SSR, we need to take care of the following scenario.
  - old segno : X
  - new address : Z
  - old curseg : Y
  This means, a new block is supposed to be written to Z from X.
  And Z is newly allocated in the same path from Y.

  In that case, we should trigger locate_dirty_segment for Y, since
  it was a current_segment and can be dirty owing to SSR.
  But that was not included in the dirty list."

Changman said:
 "We already choosed old curseg(Y) and then we allocate new address(Z) from old
  curseg(Y). After that we call refresh_sit_entry(old address, new address).
  In the funcation, we call locate_dirty_segment with old seg and old curseg.
  So calling locate_dirty_segment after refresh_sit_entry again is redundant."

Jaegeuk said:
 "Right. The new address is always allocated from old_curseg."

Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Dongho Sim <dh.sim@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-07-30 15:38:59 -07:00
Jaegeuk Kim 24a9ee0fa3 f2fs: add tracepoint for f2fs_issue_flush
This patch adds a tracepoint for f2fs_issue_flush.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-07-30 14:13:36 -07:00
Jaegeuk Kim cf2271e781 f2fs: avoid retrying wrong recovery routine when error was occurred
This patch eliminates the propagation of recovery errors to the next mount.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-07-30 14:13:35 -07:00
Jaegeuk Kim 0f7b2abd18 f2fs: add nobarrier mount option
This patch adds a mount option, nobarrier, in f2fs.
The assumption in here is that file system keeps the IO ordering, but
doesn't care about cache flushes inside the storages.

Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-07-29 05:27:48 -07:00
Chao Yu 6b2920a513 f2fs: use inner macro and function to clean up codes
In this patch we use below inner macro and function to clean up codes.
1. ADDRS_PER_PAGE
2. SM_I
3. f2fs_readonly

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-07-09 14:04:26 -07:00
Fabian Frederick b434babf85 f2fs: replace count*size kzalloc by kcalloc
kcalloc manages count*sizeof overflow.

Cc: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Cc: linux-f2fs-devel@lists.sourceforge.net
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-07-09 14:04:25 -07:00
Chao Yu 50e1f8d221 f2fs: avoid to access NULL pointer in issue_flush_thread
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=75861

Denis 2014-05-10 11:28:59 UTC reported:
"F2FS-fs (mmcblk0p28): mounting..
 Unable to handle kernel NULL pointer dereference at virtual address 00000018
 ...
 [<c0a2f678>] (_raw_spin_lock+0x3c/0x70) from [<c03a0330>] (issue_flush_thread+0x50/0x17c)
 [<c03a0330>] (issue_flush_thread+0x50/0x17c) from [<c01b4064>] (kthread+0x98/0xa4)
 [<c01b4064>] (kthread+0x98/0xa4) from [<c0108060>] (kernel_thread_exit+0x0/0x8)"

This patch assign cmd_control_info in sm_info before issue_flush_thread is being
created, so this make sure that issue flush thread will have no chance to access
invalid info in fcc.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Reviewed-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-07-09 05:59:55 -07:00
Chao Yu 8bc6f60e3f f2fs: remove unused variables in f2fs_sm_info
Remove unused variables in struct f2fs_sm_info.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-07-09 05:57:57 -07:00
Chao Yu adf8d90b6a f2fs: avoid to use slab memory in f2fs_issue_flush for efficiency
If we use slab memory in f2fs_issue_flush(), we will face memory pressure and
latency time caused by racing of kmem_cache_{alloc,free}.

Let's alloc memory in stack instead of slab.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-05-08 18:23:21 +09:00
Gu Zheng 2163d19815 f2fs: introduce help function {create,destroy}_flush_cmd_control
Introduce help function {create,destroy}_flush_cmd_control to clean up
the create/destory flush merge operation.

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-05-07 10:21:57 +09:00
Gu Zheng a688b9d9e5 f2fs: introduce struct flush_cmd_control to wrap the flush_merge fields
Split the flush_merge fields from sm_i, and use the new struct flush_cmd_control
to wrap it, so that we can igonre these fileds if flush_merge is disable, and
it alse can the structs more neat.

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-05-07 10:21:56 +09:00
Gu Zheng 876dc59eb1 f2fs: add the flush_merge handle in the remount flow
Add the *remount* handle of flush_merge option, so that the users
can enable flush_merge in the runtime, such as the underlying device
handles the cache_flush command relatively slowly.

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-05-07 10:21:55 +09:00
Jaegeuk Kim 1e87a78d95 f2fs: avoid to conduct roll-forward due to the remained garbage blocks
The f2fs always scans the next chain of direct node blocks.
But some garbage blocks are able to be remained due to no discard support or
SSR triggers.
This occasionally wreaks recovering wrong inodes that were used or BUG_ONs
due to reallocating node ids as follows.

When mount this f2fs image:
http://linuxtesting.org/downloads/f2fs_fault_image.zip
BUG_ON is triggered in f2fs driver (messages below are generated on
kernel 3.13.2; for other kernels output is similar):

kernel BUG at fs/f2fs/node.c:215!
 Call Trace:
 [<ffffffffa032ebad>] recover_inode_page+0x1fd/0x3e0 [f2fs]
 [<ffffffff811446e7>] ? __lock_page+0x67/0x70
 [<ffffffff81089990>] ? autoremove_wake_function+0x50/0x50
 [<ffffffffa0337788>] recover_fsync_data+0x1398/0x15d0 [f2fs]
 [<ffffffff812b9e5c>] ? selinux_d_instantiate+0x1c/0x20
 [<ffffffff811cb20b>] ? d_instantiate+0x5b/0x80
 [<ffffffffa0321044>] f2fs_fill_super+0xb04/0xbf0 [f2fs]
 [<ffffffff811b861e>] ? mount_bdev+0x7e/0x210
 [<ffffffff811b8769>] mount_bdev+0x1c9/0x210
 [<ffffffffa0320540>] ? validate_superblock+0x210/0x210 [f2fs]
 [<ffffffffa031cf8d>] f2fs_mount+0x1d/0x30 [f2fs]
 [<ffffffff811b9497>] mount_fs+0x47/0x1c0
 [<ffffffff81166e00>] ? __alloc_percpu+0x10/0x20
 [<ffffffff811d4032>] vfs_kern_mount+0x72/0x110
 [<ffffffff811d6763>] do_mount+0x493/0x910
 [<ffffffff811615cb>] ? strndup_user+0x5b/0x80
 [<ffffffff811d6c70>] SyS_mount+0x90/0xe0
 [<ffffffff8166f8d9>] system_call_fastpath+0x16/0x1b

Found by Linux File System Verification project (linuxtesting.org).

Reported-by: Andrey Tsyvarev <tsyvarev@ispras.ru>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-05-07 10:21:54 +09:00
Gu Zheng b270ad6f0a f2fs: enable flush_merge only in f2fs is not read-only
Enable flush_merge only in f2fs is not read-only, so does the mount
option show.

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-05-07 10:21:54 +09:00
Gu Zheng 197d46476c f2fs: use __GFP_ZERO to avoid appending set-NULL
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-05-07 10:21:53 +09:00
Gu Zheng a4ed23f2f1 f2fs: put the bio when issue_flush completed
Put the bio when the flush cmd issued, it also can fix the following
kmemleak:
unreferenced object 0xffff8800270c73c0 (size 200):
  comm "f2fs_flush-7:0", pid 27161, jiffies 4312127988 (age 988.503s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 40 07 81 19 01 88 ff ff  ........@.......
    01 00 00 00 00 00 00 f0 11 14 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff81559866>] kmemleak_alloc+0x72/0x96
    [<ffffffff81156f7e>] slab_post_alloc_hook+0x28/0x2a
    [<ffffffff811595b1>] kmem_cache_alloc+0xec/0x157
    [<ffffffff8111924d>] mempool_alloc_slab+0x15/0x17
    [<ffffffff81119513>] mempool_alloc+0x71/0x138
    [<ffffffff81193548>] bio_alloc_bioset+0x93/0x18c
    [<ffffffffa040f857>] issue_flush_thread+0x8d/0x145 [f2fs]
    [<ffffffff8107ac16>] kthread+0xba/0xc2
    [<ffffffff81571b2c>] ret_from_fork+0x7c/0xb0
    [<ffffffffffffffff>] 0xffffffffffffffff

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-05-07 10:21:53 +09:00
Jaegeuk Kim 6b4afdd794 f2fs: introduce f2fs_issue_flush to avoid redundant flush issue
Some storage devices show relatively high latencies to complete cache_flush
commands, even though their normal IO speed is prettry much high. In such
the case, it needs to merge cache_flush commands as much as possible to avoid
issuing them redundantly.
So, this patch introduces a mount option, "-o flush_merge", to mitigate such
the overhead.

If this option is enabled by user, F2FS merges the cache_flush commands and then
issues just one cache_flush on behalf of them. Once the single command is
finished, F2FS sends a completion signal to all the pending threads.

Note that, this option can be used under a workload consisting of very intensive
concurrent fsync calls, while the storage handles cache_flush commands slowly.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-04-07 09:50:58 +09:00
Jaegeuk Kim ce23447fe5 f2fs: fix to cover io->bio with io_rwsem
In the f2fs_wait_on_page_writeback, io->bio should be covered by io_rwsem.
Otherwise, the bio pointer can become a dangling pointer due to data races.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-04-02 09:56:27 +09:00
Chao Yu 2d7b822ad9 f2fs: use list_for_each_entry{_safe} for simplyfying code
This patch use list_for_each_entry{_safe} instead of list_for_each{_safe} for
simplfying code.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-04-02 09:56:27 +09:00
Chao Yu df0f8dc0e1 f2fs: avoid unnecessary bio submit when wait page writeback
This patch introduce is_merged_page() to check whether current page is merged
in f2fs bio cache. When page is not in cache, we can avoid submitting bio cache,
resulting in having more chance to merge pages.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-04-01 18:53:41 +09:00
Jaegeuk Kim 58c410351e f2fs: change reclaim rate in percentage
It is more reasonable to determine the reclaiming rate of prefree segments
according to the volume size, which is set to 5% by default.
For example, if the volume is 128GB, the prefree segments are reclaimed
when the number reaches to 6.4GB.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-03-20 22:10:10 +09:00
Chao Yu e4fc5fbfc9 f2fs: avoid to return incorrect errno of read_normal_summaries
We should return error number of read_normal_summaries instead of -EINVAL when
read_normal_summaries failed.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-03-18 09:29:53 +09:00
Gu Zheng d653788a43 f2fs: optimize restore_node_summary slightly
Previously, we ra_sum_pages to pre-read contiguous pages as more
as possible, and if we fail to alloc more pages, an ENOMEM error
will be reported upstream, even though we have alloced some pages
yet. In fact, we can use the available pages to do the job partly,
and continue the rest in the following circle. Only reporting ENOMEM
upstream if we really can not alloc any available page.

And another fix is ignoring dealing with the following pages if an
EIO occurs when reading page from page_list.

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Reviewed-by: Chao Yu <chao2.yu@samsung.com>
[Jaegeuk Kim: modify the flow for better neat code]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-03-10 18:45:15 +09:00
Gu Zheng e8512d2e0c f2fs: remove the unused ctor argument of f2fs_kmem_cache_create()
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-03-10 18:45:14 +09:00
Chao Yu 662befda25 f2fs: introduce ra_meta_pages to readahead CP/NAT/SIT pages
This patch help us to cleanup the readahead code by merging ra_{sit,nat}_pages
function into ra_meta_pages.
Additionally the new function is used to readahead cp block in
recover_orphan_inodes.

Change log from v1:
 o fix a deadloop bug pointed by Jaegeuk Kim.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-02-17 14:58:53 +09:00
Jaegeuk Kim 491c0854b4 f2fs: clean up with a macro
This patch adds GET_BLKOFF_FROM_SEG0 to clean up some codes.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-02-17 14:58:52 +09:00
Jaegeuk Kim 5e443818fa f2fs: handle dirty segments inside refresh_sit_entry
This patch cleans up the refresh_sit_entry to handle locate_dirty_segments.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-02-17 14:58:52 +09:00
Gu Zheng 9df27d982d f2fs: add help function META_MAPPING
Introduce help function META_MAPPING() to get the cache meta blocks'
address space.

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-01-22 18:41:07 +09:00
Chris Fries 6c311ec6c2 f2fs: clean checkpatch warnings
Fixed a variety of trivial checkpatch warnings.  The only delta should
be some minor formatting on log strings that were split / too long.

Signed-off-by: Chris Fries <cfries@motorola.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-01-20 10:27:12 +09:00
Yuan Zhong 5514f0aadd f2fs: remove the needless parameter of f2fs_wait_on_page_writeback
"boo sync" parameter is never referenced in f2fs_wait_on_page_writeback.
We should remove this parameter.

Signed-off-by: Yuan Zhong <yuan.mark.zhong@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-01-14 17:45:54 +09:00
Jaegeuk Kim fb5566da91 f2fs: improve write performance under frequent fsync calls
When considering a bunch of data writes with very frequent fsync calls, we
are able to think the following performance regression.

N: Node IO, D: Data IO, IO scheduler: cfq

Issue    pending IOs
	 D1 D2 D3 D4
 D1         D2 D3 D4 N1
 D2            D3 D4 N1 N2
 N1            D3 D4 N2 D1
 --> N1 can be selected by cfq becase of the same priority of N and D.
     Then D3 and D4 would be delayed, resuling in performance degradation.

So, when processing the fsync call, it'd better give higher priority to data IOs
than node IOs by assigning WRITE and WRITE_SYNC respectively.
This patch improves the random wirte performance with frequent fsync calls by up
to 10%.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-01-08 11:16:20 +09:00
Gu Zheng 7e8f23081a f2fs: remove the rw_flag domain from f2fs_io_info
When using the f2fs_io_info in the low level, we still need to merge the
rw and rw_flag, so use the rw to hold all the io flags directly,
and remove the rw_flag field.

ps.It is based on the previous patch:
f2fs: move all the bio initialization into __bio_alloc

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-12-23 10:18:07 +09:00
Jaegeuk Kim bfad7c2d40 f2fs: introduce a new direct_IO write path
Previously, f2fs doesn't support direct IOs with high performance, which throws
every write requests via the buffered write path, resulting in highly
performance degradation due to memory opeations like copy_from_user.

This patch introduces a new direct IO path in which every write requests are
processed by generic blockdev_direct_IO() with enhanced get_block function.

The get_data_block() in f2fs handles:
1. if original data blocks are allocates, then give them to blockdev.
2. otherwise,
  a. preallocate requested block addresses
  b. do not use extent cache for better performance
  c. give the block addresses to blockdev

This policy induces that:
- new allocated data are sequentially written to the disk
- updated data are randomly written to the disk.
- f2fs gives consistency on its file meta, not file data.

Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-12-23 10:18:07 +09:00
Jaegeuk Kim 216fbd6443 f2fs: introduce sysfs entry to control in-place-update policy
This patch introduces new sysfs entries for users to control the policy of
in-place-updates, namely IPU, in f2fs.

Sometimes f2fs suffers from performance degradation due to its out-of-place
update policy that produces many additional node block writes.
If the storage performance is very dependant on the amount of data writes
instead of IO patterns, we'd better drop this out-of-place update policy.

This patch suggests 5 polcies and their triggering conditions as follows.

[sysfs entry name = ipu_policy]

0: F2FS_IPU_FORCE       all the time,
1: F2FS_IPU_SSR         if SSR mode is activated,
2: F2FS_IPU_UTIL        if FS utilization is over threashold,
3: F2FS_IPU_SSR_UTIL    if SSR mode is activated and FS utilization is over
                        threashold,
4: F2FS_IPU_DISABLE    disable IPU. (=default option)

[sysfs entry name = min_ipu_util]

This parameter controls the threshold to trigger in-place-updates.
The number indicates percentage of the filesystem utilization, and used by
F2FS_IPU_UTIL and F2FS_IPU_SSR_UTIL policies.

For more details, see need_inplace_update() in segment.h.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-12-23 10:18:07 +09:00
Jaegeuk Kim 458e6197c3 f2fs: refactor bio->rw handling
This patch introduces f2fs_io_info to mitigate the complex parameter list.

struct f2fs_io_info {
	enum page_type type;		/* contains DATA/NODE/META/META_FLUSH */
	int rw;				/* contains R/RS/W/WS */
	int rw_flag;			/* contains REQ_META/REQ_PRIO */
}

1. f2fs_write_data_pages
 - DATA
 - WRITE_SYNC is set when wbc->WB_SYNC_ALL.

2. sync_node_pages
 - NODE
 - WRITE_SYNC all the time

3. sync_meta_pages
 - META
 - WRITE_SYNC all the time
 - REQ_META | REQ_PRIO all the time

 ** f2fs_submit_merged_bio() handles META_FLUSH.

4. ra_nat_pages, ra_sit_pages, ra_sum_pages
 - META
 - READ_SYNC

Cc: Fan Li <fanofcode.li@samsung.com>
Cc: Changman Lee <cm224.lee@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-12-23 10:18:06 +09:00
Fan Li 63a0b7cb33 f2fs: merge pages with the same sync_mode flag
Previously f2fs submits most of write requests using WRITE_SYNC, but f2fs_write_data_pages
submits last write requests by sync_mode flags callers pass.

This causes a performance problem since continuous pages with different sync flags
can't be merged in cfq IO scheduler(thanks yu chao for pointing it out), and synchronous
requests often take more time.

This patch makes the following modifies to DATA writebacks:

1. every page will be written back using the sync mode caller pass.
2. only pages with the same sync mode can be merged in one bio request.

These changes are restricted to DATA pages.Other types of writebacks are modified
To remain synchronous.

In my test with tiotest, f2fs sequence write performance is improved by about 7%-10% ,
and this patch has no obvious impact on other performance tests.

Signed-off-by: Fan Li <fanofcode.li@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-12-23 10:18:06 +09:00
Jaegeuk Kim 93dfe2ac51 f2fs: refactor bio-related operations
This patch integrates redundant bio operations on read and write IOs.

1. Move bio-related codes to the top of data.c.
2. Replace f2fs_submit_bio with f2fs_submit_merged_bio, which handles read
   bios additionally.
3. Introduce __submit_merged_bio to submit the merged bio.
4. Change f2fs_readpage to f2fs_submit_page_bio.
5. Introduce f2fs_submit_page_mbio to integrate previous submit_read_page and
   submit_write_page.

Reviewed-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Reviewed-by: Chao Yu <chao2.yu@samsung.com >
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-12-23 10:18:05 +09:00
Jaegeuk Kim 187b5b8b3d f2fs: remove the own bi_private allocation
Previously f2fs allocates its own bi_private data structure all the time even
though we don't use it. But, can we remove this bi_private allocation?

This patch removes such the additional bi_private allocation.

1. Retrieve f2fs_sb_info from its page->mapping->host->i_sb.
 - This removes the usecases of bi_private in end_io.

2. Use bi_private only when we really need it.
 - The bi_private is used only when the checkpoint procedure is conducted.
 - When conducting the checkpoint, f2fs submits a META_FLUSH bio to wait its bio
completion.
 - Since we have no dependancies to remove bi_private now, let's just use
 bi_private pointer as the completion pointer.

Reviewed-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-12-23 10:18:05 +09:00
Jaegeuk Kim f9a4e6df52 f2fs: bug fix on bit overflow from 32bits to 64bits
This patch fixes some bit overflows by the shift operations.

Dan Carpenter reported potential bugs on bit overflows as follows.

fs/f2fs/segment.c:910 submit_write_page()
	warn: should 'blk_addr << ((sbi)->log_blocksize - 9)' be a 64 bit type?
fs/f2fs/checkpoint.c:429 get_valid_checkpoint()
	warn: should '1 << ()' be a 64 bit type?
fs/f2fs/data.c:408 f2fs_readpage()
	warn: should 'blk_addr << ((sbi)->log_blocksize - 9)' be a 64 bit type?
fs/f2fs/data.c:457 submit_read_page()
	warn: should 'blk_addr << ((sbi)->log_blocksize - 9)' be a 64 bit type?
fs/f2fs/data.c:525 get_data_block_ro()
	warn: should 'i << blkbits' be a 64 bit type?

Bug-Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-12-23 10:18:04 +09:00
Changman Lee 03232305ff f2fs: send REQ_META or REQ_PRIO when reading meta area
Let's send REQ_META or REQ_PRIO when reading meta area such as NAT/SIT
etc.

Signed-off-by: Changman Lee <cm224.lee@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-12-23 10:18:03 +09:00
Jaegeuk Kim a709f4a2f2 f2fs: add detailed information of bio types in the tracepoints
This patch inserts information of bio types in more detail.
So, we can now see REQ_META and REQ_PRIO too.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-12-23 10:18:03 +09:00
Chao Yu 74de593af7 f2fs: read contiguous sit entry pages by merging for mount performance
Previously we read sit entries page one by one, this method lost the chance
of reading contiguous page together. So we read pages as contiguous as
possible for better mount performance.

change log:
 o merge judgements/use 'Continue' or 'Break' instead of 'Goto' as Gu Zheng
   suggested.
 o add mark_page_accessed() before release page to delay VM reclaiming.
 o remove '*order' for simplification of function as Jaegeuk Kim suggested.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
[Jaegeuk Kim: fix a bug on the block address calculation]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-12-23 10:18:02 +09:00
Chao Yu d4d288bc72 f2fs: adds a tracepoint for f2fs_submit_read_bio
This patch adds a tracepoint for f2fs_submit_read_bio.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
[Jaegeuk Kim: integrate tracepoints of f2fs_submit_read(_write)_bio]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-12-23 10:18:02 +09:00