alistair23-linux/include
Jan Kara 5afced3bf2 writeback: Avoid skipping inode writeback
Inode's i_io_list list head is used to attach inode to several different
lists - wb->{b_dirty, b_dirty_time, b_io, b_more_io}. When flush worker
prepares a list of inodes to write back, e.g. for sync(2), it moves inodes
to b_io list. Thus it is critical for sync(2) data integrity guarantees
that inode is not requeued to any other writeback list when inode is
queued for processing by flush worker. That's the reason why
writeback_single_inode() does not touch i_io_list (unless the inode is
completely clean) and why __mark_inode_dirty() does not touch i_io_list
if I_SYNC flag is set.

However, there are two flaws in the current logic:

1) When inode has only I_DIRTY_TIME set but it is already queued in b_io
list due to sync(2), concurrent __mark_inode_dirty(inode, I_DIRTY_SYNC)
can still move inode back to b_dirty list resulting in skipping
writeback of inode time stamps during sync(2).

2) When inode is on b_dirty_time list and writeback_single_inode() races
with __mark_inode_dirty() like:

writeback_single_inode()		__mark_inode_dirty(inode, I_DIRTY_PAGES)
  inode->i_state |= I_SYNC
  __writeback_single_inode()
					  inode->i_state |= I_DIRTY_PAGES;
					  if (inode->i_state & I_SYNC)
					    bail
  if (!(inode->i_state & I_DIRTY_ALL))
  - not true so nothing done

We end up with an I_DIRTY_PAGES inode on the b_dirty_time list, and thus
standard background writeback will not write back this inode, leading to
possible dirty throttling stalls etc. (thanks to Martijn Coenen for this
analysis).

Fix these problems by tracking whether inode is queued in b_io or
b_more_io lists in a new I_SYNC_QUEUED flag. When this flag is set, we
know flush worker has queued inode and we should not touch i_io_list.
On the other hand we also know that once flush worker is done with the
inode it will requeue the inode to appropriate dirty list. When
I_SYNC_QUEUED is not set, __mark_inode_dirty() can (and must) move inode
to appropriate dirty list.

Reported-by: Martijn Coenen <maco@android.com>
Reviewed-by: Martijn Coenen <maco@android.com>
Tested-by: Martijn Coenen <maco@android.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Fixes: 0ae45f63d4 ("vfs: add support for a lazytime mount option")
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
2020-06-15 09:18:45 +02:00
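
The rule described in the commit message above can be sketched as a small
userspace model. This is only an illustration of the idea, not the kernel
code: every name below (the M_* flags, enum wb_list, struct model_inode and
the model_* functions) is hypothetical, the flag values are arbitrary, and
locking is left out entirely.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's inode state bits.
 * The numeric values are illustrative only. */
enum {
	M_DIRTY_SYNC  = 1u << 0,
	M_DIRTY_PAGES = 1u << 1,
	M_DIRTY_TIME  = 1u << 2,
	M_SYNC_QUEUED = 1u << 3,	/* models I_SYNC_QUEUED */
};

/* Which writeback list the inode's i_io_list is linked on in this model. */
enum wb_list { WB_NONE, WB_DIRTY, WB_DIRTY_TIME, WB_IO, WB_MORE_IO };

struct model_inode {
	unsigned int state;
	enum wb_list list;
};

/* Pick the dirty list that matches the inode's current dirty bits. */
enum wb_list model_dirty_list_for(unsigned int state)
{
	if (state & (M_DIRTY_SYNC | M_DIRTY_PAGES))
		return WB_DIRTY;
	if (state & M_DIRTY_TIME)
		return WB_DIRTY_TIME;
	return WB_NONE;
}

/* Flush worker queues the inode for IO and records that fact. */
void model_queue_io(struct model_inode *inode)
{
	inode->list = WB_IO;
	inode->state |= M_SYNC_QUEUED;
}

/* Model of marking an inode dirty: always update the dirty bits, but
 * leave list placement alone while the flush worker has it queued. */
void model_mark_dirty(struct model_inode *inode, unsigned int flags)
{
	inode->state |= flags;
	if (inode->state & M_SYNC_QUEUED)
		return;		/* flush worker will requeue it later */
	inode->list = model_dirty_list_for(inode->state);
}

/* Flush worker is done: clear the bits it wrote and the queued flag,
 * then requeue according to whatever dirty bits remain. */
void model_requeue_done(struct model_inode *inode, unsigned int written)
{
	inode->state &= ~(written | M_SYNC_QUEUED);
	inode->list = model_dirty_list_for(inode->state);
}
```

In this model, an inode with only M_DIRTY_TIME that has been passed to
model_queue_io() stays on WB_IO when model_mark_dirty(inode, M_DIRTY_SYNC)
runs concurrently, which corresponds to flaw 1. Once
model_requeue_done(inode, M_DIRTY_TIME) runs, it lands on WB_DIRTY. An
inode that is not queued moves from WB_DIRTY_TIME to WB_DIRTY as soon as
M_DIRTY_PAGES is set, which corresponds to flaw 2.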
acpi Merge branches 'acpi-apei', 'acpi-pmic', 'acpi-video' and 'acpi-dptf' 2020-06-01 17:19:43 +02:00
asm-generic mm: consolidate pte_index() and pte_offset_*() definitions 2020-06-09 09:39:14 -07:00
clocksource
crypto crypto: lib/sha1 - fold linux/cryptohash.h into crypto/sha.h 2020-05-08 15:32:17 +10:00
drm drm: remove drm specific kmap_atomic code 2020-06-04 19:06:22 -07:00
dt-bindings This is the bulk of pin control changes for the v5.8 2020-06-07 16:13:43 -07:00
keys keys: Implement update for the big_key type 2020-06-02 17:22:31 +01:00
kunit Documentation: test.h - fix warnings 2020-05-22 14:33:59 -06:00
kvm KVM: arm64: vgic-v3: Take cpu_if pointer directly instead of vcpu 2020-05-28 11:57:10 +01:00
linux writeback: Avoid skipping inode writeback 2020-06-15 09:18:45 +02:00
math-emu
media media updates for v5.8-rc1 2020-06-03 20:59:38 -07:00
misc
net inet_connection_sock: clear inet_num out of destroy helper 2020-06-04 15:59:56 -07:00
pcmcia pcmcia: Replace zero-length array with flexible-array 2020-05-18 10:28:31 +02:00
ras
rdma dynamic_debug: add an option to enable dynamic debug for modules only 2020-06-08 11:05:56 -07:00
scsi SCSI misc on 20200605 2020-06-05 15:11:50 -07:00
soc pci-v5.8-changes 2020-06-06 11:01:58 -07:00
sound ASoC: Updates for v5.8 2020-06-01 20:26:07 +02:00
target scsi: target: tcmu: Make pgr_support and alua_support attributes writable 2020-05-07 22:39:22 -04:00
trace f2fs-for-5.8-rc1 2020-06-09 11:28:59 -07:00
uapi Changes in gfs2: 2020-06-08 12:47:09 -07:00
vdso
video
xen mm: reorder includes after introduction of linux/pgtable.h 2020-06-09 09:39:13 -07:00