1
0
Fork 0
Commit Graph

2144 Commits (8620d18575af2f2406554a6190e7d12a663474b4)

Author SHA1 Message Date
Bob Peterson 27874115b0 gfs2: read-only mounts should grab the sd_freeze_gl glock
[ Upstream commit b780cc615b ]

Before this patch, only read-write mounts would grab the freeze
glock in read-only mode, as part of gfs2_make_fs_rw. So the freeze
glock was never initialized. That meant requests to freeze, which
request the glock in EX, were granted without any state transition.
That meant you could mount a gfs2 file system, which is currently
frozen on a different cluster node, in read-only mode.

This patch makes read-only mounts lock the freeze glock in SH mode,
which will block for file systems that are frozen on another node.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-07-22 09:32:53 +02:00
Bob Peterson fbaf0137df gfs2: fix use-after-free on transaction ail lists
[ Upstream commit 83d060ca8d ]

Before this patch, transactions could be merged into the system
transaction by function gfs2_merge_trans(), but the transaction ail
lists were never merged. Because the ail flushing mechanism can run
separately, bd elements can be attached to the transaction's buffer
list during the transaction (trans_add_meta, etc) but quickly moved
to its ail lists. Later, in function gfs2_trans_end, the transaction
can be freed (by gfs2_trans_end) while it still has bd elements
queued to its ail lists, which can cause it to either lose track of
the bd elements altogether (memory leak) or worse, reference the bd
elements after the parent transaction has been freed.

Although I've not seen any serious consequences, the problem becomes
apparent with the previous patch's addition of:

	gfs2_assert_warn(sdp, list_empty(&tr->tr_ail1_list));

to function gfs2_trans_free().

This patch adds logic into gfs2_merge_trans() to move the merged
transaction's ail lists to the sdp transaction. This prevents the
use-after-free. To do this properly, we need to hold the ail lock,
so we pass sdp into the function instead of the transaction itself.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-24 17:50:40 +02:00
Bob Peterson 08904df10d gfs2: Allow lock_nolock mount to specify jid=X
[ Upstream commit ea22eee4e6 ]

Before this patch, a simple typo accidentally added \n to the jid=
string for lock_nolock mounts. This made it impossible to mount a
gfs2 file system with a journal other than journal0. Thus:

mount -tgfs2 -o hostdata="jid=1" <device> <mount pt>

Resulted in:
mount: wrong fs type, bad option, bad superblock on <device>

In most cases this is not a problem. However, for debugging and
testing purposes we sometimes want to test the integrity of other
journals. This patch removes the unnecessary \n and thus allows
lock_nolock users to specify an alternate journal.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-24 17:50:37 +02:00
Andreas Gruenbacher f8713c2cb0 gfs2: Even more gfs2_find_jhead fixes
[ Upstream commit 20be493b78 ]

Fix several issues in the previous gfs2_find_jhead fix:
* When updating @blocks_submitted, @block refers to the first block block not
  submitted yet, not the last block submitted, so fix an off-by-one error.
* We want to ensure that @blocks_submitted is far enough ahead of @blocks_read
  to guarantee that there is in-flight I/O.  Otherwise, we'll eventually end up
  waiting for pages that haven't been submitted, yet.
* It's much easier to compare the number of blocks added with the number of
  blocks submitted to limit the maximum bio size.
* Even with bio chaining, we can keep adding blocks until we reach the maximum
  bio size, as long as we stop at a page boundary.  This simplifies the logic.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-17 16:40:21 +02:00
Andreas Gruenbacher 49388448ed gfs2: Grab glock reference sooner in gfs2_add_revoke
[ Upstream commit f4e2f5e1a5 ]

This patch rearranges gfs2_add_revoke so that the extra glock
reference is added earlier on in the function to avoid races in which
the glock is freed before the new reference is taken.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-03 08:21:10 +02:00
Bob Peterson fd5516ea82 gfs2: move privileged user check to gfs2_quota_lock_check
[ Upstream commit 4ed0c30811 ]

Before this patch, function gfs2_quota_lock checked if it was called
from a privileged user, and if so, it bypassed the quota check:
superuser can operate outside the quotas.
That's the wrong place for the check because the lock/unlock functions
are separate from the lock_check function, and you can do lock and
unlock without actually checking the quotas.

This patch moves the check to gfs2_quota_lock_check.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-03 08:21:09 +02:00
Bob Peterson 4adb7a2b31 Revert "gfs2: Don't demote a glock until its revokes are written"
[ Upstream commit b14c94908b ]

This reverts commit df5db5f9ee.

This patch fixes a regression: patch df5db5f9ee allowed function
run_queue() to bypass its call to do_xmote() if revokes were queued for
the glock. That's wrong because its call to do_xmote() is what is
responsible for calling the go_sync() glops functions to sync both
the ail list and any revokes queued for it. By bypassing the call,
gfs2 could get into a stand-off where the glock could not be demoted
until its revokes are written back, but the revokes would not be
written back because do_xmote() was never called.

It "sort of" works, however, because there are other mechanisms like
the log flush daemon (logd) that can sync the ail items and revokes,
if it deems it necessary. The problem is: without file system pressure,
it might never deem it necessary.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-05-27 17:46:44 +02:00
Andreas Gruenbacher f256dea077 gfs2: More gfs2_find_jhead fixes
[ Upstream commit aa83da7f47 ]

It turns out that when extending an existing bio, gfs2_find_jhead fails to
check if the block number is consecutive, which leads to incorrect reads for
fragmented journals.

In addition, limit the maximum bio size to an arbitrary value of 2 megabytes:
since commit 07173c3ec2 ("block: enable multipage bvecs"), if we just keep
adding pages until bio_add_page fails, bios will grow much larger than useful,
which pins more memory than necessary with barely any additional performance
gains.

Fixes: f4686c26ec ("gfs2: read journal in large chunks")
Cc: stable@vger.kernel.org # v5.2+
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-05-20 08:20:22 +02:00
Andreas Gruenbacher 171bf6ef03 gfs2: Another gfs2_walk_metadata fix
[ Upstream commit 566a2ab3c9 ]

Make sure we don't walk past the end of the metadata in gfs2_walk_metadata: the
inode holds fewer pointers than indirect blocks.

Slightly clean up gfs2_iomap_get.

Fixes: a27a0c9b6a ("gfs2: gfs2_walk_metadata fix")
Cc: stable@vger.kernel.org # v5.3+
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-05-20 08:20:17 +02:00
Bob Peterson c5bcaacd06 gfs2: Don't demote a glock until its revokes are written
[ Upstream commit df5db5f9ee ]

Before this patch, run_queue would demote glocks based on whether
there are any more holders. But if the glock has pending revokes that
haven't been written to the media, giving up the glock might end in
file system corruption if the revokes never get written due to
io errors, node crashes and fences, etc. In that case, another node
will replay the metadata blocks associated with the glock, but
because the revoke was never written, it could replay that block
even though the glock had since been granted to another node who
might have made changes.

This patch changes the logic in run_queue so that it never demotes
a glock until its count of pending revokes reaches zero.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-04-17 10:50:03 +02:00
Bob Peterson 46bbc5526d gfs2: Do log_flush in gfs2_ail_empty_gl even if ail list is empty
[ Upstream commit 9ff7828935 ]

Before this patch, if gfs2_ail_empty_gl saw there was nothing on
the ail list, it would return and not flush the log. The problem
is that there could still be a revoke for the rgrp sitting on the
sd_log_le_revoke list that's been recently taken off the ail list.
But that revoke still needs to be written, and the rgrp_go_inval
still needs to call log_flush_wait to ensure the revokes are all
properly written to the journal before we relinquish control of
the glock to another node. If we give the glock to another node
before we have this knowledge, the node might crash and its journal
replayed, in which case the missing revoke would allow the journal
replay to replay the rgrp over top of the rgrp we already gave to
another node, thus overwriting its changes and corrupting the
file system.

This patch makes gfs2_ail_empty_gl still call gfs2_log_flush rather
than returning.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-04-17 10:50:03 +02:00
Al Viro 9719442f9e gfs2_atomic_open(): fix O_EXCL|O_CREAT handling on cold dcache
commit 2103913265 upstream.

with the way fs/namei.c:do_last() had been done, ->atomic_open()
instances needed to recognize the case when existing file got
found with O_EXCL|O_CREAT, either by falling back to finish_no_open()
or failing themselves.  gfs2 one didn't.

Fixes: 6d4ade986f (GFS2: Add atomic_open support)
Cc: stable@kernel.org # v3.11
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-18 07:17:51 +01:00
Andreas Gruenbacher ae35ac3c4b gfs2: fix O_SYNC write handling
commit 6e5e41e2dc upstream.

In gfs2_file_write_iter, for direct writes, the error checking in the buffered
write fallback case is incomplete.  This can cause inode write errors to go
undetected.  Fix and clean up gfs2_file_write_iter along the way.

Based on a proposed fix by Christoph Hellwig <hch@lst.de>.

Fixes: 967bcc91b0 ("gfs2: iomap direct I/O support")
Cc: stable@vger.kernel.org # v4.19+
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-11 04:35:35 -08:00
Christoph Hellwig 6373486908 gfs2: move setting current->backing_dev_info
commit 4c0e8dda60 upstream.

Set current->backing_dev_info just around the buffered write calls to
prepare for the next fix.

Fixes: 967bcc91b0 ("gfs2: iomap direct I/O support")
Cc: stable@vger.kernel.org # v4.19+
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-11 04:35:35 -08:00
Abhi Das c61b93fae6 gfs2: fix gfs2_find_jhead that returns uninitialized jhead with seq 0
commit 7582026f6f upstream.

When the first log header in a journal happens to have a sequence
number of 0, a bug in gfs2_find_jhead() causes it to prematurely exit,
and return an uninitialized jhead with seq 0. This can cause failures
in the caller. For instance, a mount fails in one test case.

The correct behavior is for it to continue searching through the journal
to find the correct journal head with the highest sequence number.

Fixes: f4686c26ec ("gfs2: read journal in large chunks")
Cc: stable@vger.kernel.org # v5.2+
Signed-off-by: Abhi Das <adas@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-11 04:35:35 -08:00
Andreas Gruenbacher 73774def78 gfs2: Another gfs2_find_jhead fix
commit eed0f953b9 upstream.

On filesystems with a block size smaller than the page size,
gfs2_find_jhead can split a page across two bios (for example, when
blocks are not allocated consecutively).  When that happens, the first
bio that completes will unlock the page in its bi_end_io handler even
though the page hasn't been read completely yet.  Fix that by using a
chained bio for the rest of the page.

While at it, clean up the sector calculation logic in
gfs2_log_alloc_bio.  In gfs2_find_jhead, simplify the disk block and
offset calculation logic and fix a variable name.

Fixes: f4686c26ec ("gfs2: read journal in large chunks")
Cc: stable@vger.kernel.org # v5.2+
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-05 21:22:40 +00:00
Arnd Bergmann de1605c603 gfs2: add compat_ioctl support
commit 8d09807048 upstream.

Out of the four ioctl commands supported on gfs2, only FITRIM
works in compat mode.

Add a proper handler based on the ext4 implementation.

Fixes: 6ddc5c3ddf ("gfs2: getlabel support")
Reviewed-by: Bob Peterson <rpeterso@redhat.com>
Cc: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-17 19:48:52 +01:00
Bob Peterson 0007f536dc gfs2: fix glock reference problem in gfs2_trans_remove_revoke
commit fe5e7ba11f upstream.

Commit 9287c6452d fixed a situation in which gfs2 could use a glock
after it had been freed. To do that, it temporarily added a new glock
reference by calling gfs2_glock_hold in function gfs2_add_revoke.
However, if the bd element was removed by gfs2_trans_remove_revoke, it
failed to drop the additional reference.

This patch adds logic to gfs2_trans_remove_revoke to properly drop the
additional glock reference.

Fixes: 9287c6452d ("gfs2: Fix occasional glock use-after-free")
Cc: stable@vger.kernel.org # v5.2+
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-21 11:04:34 +01:00
Andreas Gruenbacher e697fd14db gfs2: Multi-block allocations in gfs2_page_mkwrite
commit f53056c430 upstream.

In gfs2_page_mkwrite's gfs2_allocate_page_backing helper, try to
allocate as many blocks at once as we need.  Pass in the size of the
requested allocation.

Fixes: 35af80aef9 ("gfs2: don't use buffer_heads in gfs2_allocate_page_backing")
Cc: stable@vger.kernel.org # v5.3+
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-21 11:04:33 +01:00
Andrew Price d5798141fd gfs2: Fix initialisation of args for remount
When gfs2 was converted to use fs_context, the initialisation of the
mount args structure to the currently active args was lost with the
removal of gfs2_remount_fs(), so the checks of the new args on remount
became checks against the default values instead of the current ones.
This caused unexpected remount behaviour and test failures (xfstests
generic/294, generic/306 and generic/452).

Reinstate the args initialisation, this time in gfs2_init_fs_context()
and conditional upon fc->purpose, as that's the only time we get control
before the mount args are parsed in the remount process.

Fixes: 1f52aa08d1 ("gfs2: Convert gfs2 to fs_context")
Signed-off-by: Andrew Price <anprice@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-10-30 12:16:53 +01:00
Andrew Price 30aecae86e gfs2: Fix memory leak when gfs2meta's fs_context is freed
gfs2 and gfs2meta share an ->init_fs_context function which allocates an
args structure stored in fc->fs_private. gfs2 registers a ->free
function to free this memory when the fs_context is cleaned up, but
there was not one registered for gfs2meta, causing a leak.

Register a ->free function for gfs2meta. The existing gfs2_fc_free
function does what we need.

Reported-by: syzbot+c2fdfd2b783754878fb6@syzkaller.appspotmail.com
Fixes: 1f52aa08d1 ("gfs2: Convert gfs2 to fs_context")
Signed-off-by: Andrew Price <anprice@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-10-24 16:20:43 +02:00
Linus Torvalds 0b36c9eed2 Merge branch 'work.mount3' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull more mount API conversions from Al Viro:
 "Assorted conversions of options parsing to new API.

  gfs2 is probably the most serious one here; the rest is trivial stuff.

  Other things in what used to be #work.mount are going to wait for the
  next cycle (and preferably go via git trees of the filesystems
  involved)"

* 'work.mount3' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  gfs2: Convert gfs2 to fs_context
  vfs: Convert spufs to use the new mount API
  vfs: Convert hypfs to use the new mount API
  hypfs: Fix error number left in struct pointer member
  vfs: Convert functionfs to use the new mount API
  vfs: Convert bpf to use the new mount API
2019-09-24 12:33:34 -07:00
Andrew Price 1f52aa08d1 gfs2: Convert gfs2 to fs_context
Convert gfs2 and gfs2meta to fs_context. Removes the duplicated vfs code
from gfs2_mount and instead uses the new vfs_get_block_super() before
switching the ->root to the appropriate dentry.

The mount option parsing has been converted to the new API and error
reporting for invalid options has been made more precise at the same
time.

All of the mount/remount code has been moved into ops_fstype.c

Signed-off-by: Andrew Price <anprice@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: cluster-devel@redhat.com
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-09-18 22:47:05 -04:00
Bob Peterson f0b444b349 gfs2: clear buf_in_tr when ending a transaction in sweep_bh_for_rgrps
In function sweep_bh_for_rgrps, which is a helper for punch_hole,
it uses variable buf_in_tr to keep track of when it needs to commit
pending block frees on a partial delete that overflows the
transaction created for the delete. The problem is that the
variable was initialized at the start of function sweep_bh_for_rgrps
but it was never cleared, even when starting a new transaction.

This patch reinitializes the variable when the transaction is
ended, so the next transaction starts out with it cleared.

Fixes: d552a2b9b3 ("GFS2: Non-recursive delete")
Cc: stable@vger.kernel.org # v4.12+
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-09-17 16:50:50 +02:00
Andreas Gruenbacher b473bc2dcd gfs2: Improve mmap write vs. truncate consistency
On filesystems with a block size smaller than PAGE_SIZE, page_mkwrite is
called for each memory-mapped page before that page can be written to.
When such a memory-mapped file is truncated down to size x which is not
a multiple of the page size and then back to a larger size, the page
straddling size x can end up with a partial block mapping.  In that
case, make sure to mark that page read-only so that page_mkwrite will be
called before the page can be written to the next time.

(There is no point in marking the page straddling size x read-only when
truncating down as writing to memory beyond the end of the file will
result in SIGBUS instead of growing the file.)

Fixes xfstests generic/029, generic/030 on filesystems with a block size
smaller than PAGE_SIZE.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-09-06 22:54:23 +02:00
Bob Peterson ad26967b9a gfs2: Use async glocks for rename
Because s_vfs_rename_mutex is not cluster-wide, multiple nodes can
reverse the roles of which directories are "old" and which are "new" for
the purposes of rename. This can cause deadlocks where two nodes end up
waiting for each other.

There can be several layers of directory dependencies across many nodes.

This patch fixes the problem by acquiring all gfs2_rename's inode glocks
asychronously and waiting for all glocks to be acquired.  That way all
inodes are locked regardless of the order.

The timeout value for multiple asynchronous glocks is calculated to be
the total of the individual wait times for each glock times two.

Since gfs2_exchange is very similar to gfs2_rename, both functions are
patched in the same way.

A new async glock wait queue, sd_async_glock_wait, keeps a list of
waiters for these events. If gfs2's holder_wake function detects an
async holder, it wakes up any waiters for the event. The waiter only
tests whether any of its requests are still pending.

Since the glocks are sent to dlm asychronously, the wait function needs
to check to see which glocks, if any, were granted.

If a glock is granted by dlm (and therefore held), its minimum hold time
is checked and adjusted as necessary, as other glock grants do.

If the event times out, all glocks held thus far must be dequeued to
resolve any existing deadlocks.  Then, if there are any outstanding
locking requests, we need to loop around and wait for dlm to respond to
those requests too.  After we release all requests, we return -ESTALE to
the caller (vfs rename) which loops around and retries the request.

    Node1           Node2
    ---------       ---------
1.  Enqueue A       Enqueue B
2.  Enqueue B       Enqueue A
3.  A granted
6.                  B granted
7.  Wait for B
8.                  Wait for A
9.                  A times out (since Node 1 holds A)
10.                 Dequeue B (since it was granted)
11.                 Wait for all requests from DLM
12. B Granted (since Node2 released it in step 10)
13. Rename
14. Dequeue A
15.                 DLM Grants A
16.                 Dequeue A (due to the timeout and since we
                    no longer have B held for our task).
17. Dequeue B
18.                 Return -ESTALE to vfs
19.                 VFS retries the operation, goto step 1.

This release-all-locks / acquire-all-locks may slow rename / exchange
down as both nodes struggle in the same way and do the same thing.
However, this will only happen when there is contention for the same
inodes, which ought to be rare.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-09-04 20:22:17 +02:00
Andreas Gruenbacher 01123cf17c gfs2: create function gfs2_glock_update_hold_time
This patch moves the code that updates glock minimum hold
time to a separate function. This will be called by a future
patch.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2019-09-04 20:22:17 +02:00
Bob Peterson bc74aaefdd gfs2: separate holder for rgrps in gfs2_rename
Before this patch, gfs2_rename added a holder for the rgrp glock to
its array of holders, ghs. There's nothing wrong with that, but this
patch separates it into a separate holder. This is done to ensure
it's always locked last as per the proper glock lock ordering,
and also to pave the way for a future patch in which we will
lock the non-rgrp glocks asynchronously.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-09-04 20:22:17 +02:00
Markus Elfring bccaef9073 gfs2: Delete an unnecessary check before brelse()
The brelse() function tests whether its argument is NULL and then
returns immediately.  Thus the test around the call is not needed.

This issue was detected by using the Coccinelle software.

[The same applies to brelse() in gfs2_dir_no_add (which Coccinelle
apparently missed), so fix that as well.]

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-09-04 20:22:17 +02:00
Andreas Gruenbacher 45eb05042d gfs2: Minor PAGE_SIZE arithmetic cleanups
Replace divisions by PAGE_SIZE with shifts by PAGE_SHIFT and similar.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-09-04 20:22:06 +02:00
Andreas Gruenbacher 8f0daef5f7 gfs2: Fix recovery slot bumping
Get rid of the assumption that the number of slots can at most increase by
RECOVER_SIZE_INC (16) in set_recover_size.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-09-03 09:42:41 +02:00
Bob Peterson 98fb057487 gfs2: Fix possible fs name overflows
This patch fixes three places in which temporary character buffers
could overflow due to the addition of the file system id from patch
3792ce973f. Thanks to Dan Carpenter for pointing it out.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-09-03 09:42:41 +02:00
Bob Peterson 8c5ca11710 gfs2: untangle the logic in gfs2_drevalidate
Before this patch, function gfs2_drevalidate was a horrific tangle of
unreadable labels, cases and goto statements. This patch tries to
simplify the logic and make it more readable.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-09-03 09:42:41 +02:00
Andreas Gruenbacher 0a6a4abc84 gfs2: Always mark inode dirty in fallocate
When allocating space with fallocate, always update the file timestamps
and mark the inode dirty, no matter if the FALLOC_FL_KEEP_SIZE flag is
set or not.  The inode needs to be marked dirty so that a subsequent
fsync will pick it up and any new allocations will make it to disk.
Filesystems like xfs and ext4 always update the timestamps, so make
gfs2 behave the same way.

Fixes xfstest generic/483.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-09-03 09:41:42 +02:00
Andreas Gruenbacher d40312598d gfs2: Minor gfs2_alloc_inode cleanup
In gfs2_alloc_inode, when kmem_cache_alloc cannot allocate a new object, return
NULL immediately.  The code currently relies on the fact that i_inode is the
first member in struct gfs2_inode and so ip and &ip->i_inode evaluate to the
same address, but that isn't immediately obvious.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Bob Peterson <rpeterso@redhat.com>
2019-08-09 17:00:52 +01:00
Christoph Hellwig 2257e468a6 gfs2: implement gfs2_block_zero_range using iomap_zero_range
iomap handles all the nitty-gritty details of zeroing a file
range for us, so use the proper helper.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Bob Peterson <rpeterso@redhat.com>
2019-08-09 17:00:51 +01:00
Andreas Gruenbacher 72d36d0529 gfs2: Add support for IOMAP_ZERO
Add support for the IOMAP_ZERO iomap operation so that iomap_zero_range will
work as expected.  In the IOMAP_ZERO case, the caller of iomap_zero_range is
responsible for taking an exclusive glock on the inode, so we need no
additional locking in gfs2_iomap_begin.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Bob Peterson <rpeterso@redhat.com>
2019-08-09 17:00:50 +01:00
Andreas Gruenbacher 34aad20bc3 gfs2: gfs2_iomap_begin cleanup
Following commit d0a22a4b03 ("gfs2: Fix iomap write page reclaim deadlock"),
gfs2_iomap_begin and gfs2_iomap_begin_write can be further cleaned up and the
split between those two functions can be improved.

With suggestions from Christoph Hellwig.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Bob Peterson <rpeterso@redhat.com>
2019-08-09 17:00:49 +01:00
Andreas Gruenbacher a27a0c9b6a gfs2: gfs2_walk_metadata fix
It turns out that the current version of gfs2_metadata_walker suffers
from multiple problems that can cause gfs2_hole_size to report an
incorrect size.  This will confuse fiemap as well as lseek with the
SEEK_DATA flag.

Fix that by changing gfs2_hole_walker to compute the metapath to the
first data block after the hole (if any), and compute the hole size
based on that.

Fixes xfstest generic/490.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Bob Peterson <rpeterso@redhat.com>
Cc: stable@vger.kernel.org # v4.18+
2019-08-09 16:56:12 +01:00
Andreas Gruenbacher 706cb5492c gfs2: Inode dirtying fix
With the recent iomap write page reclaim deadlock fix, it turns out that the
GLF_DIRTY flag isn't always set when it needs to be anymore: previously, this
happened as a side effect of always adding the inode buffer head to the current
transaction with gfs2_trans_add_meta, but this isn't happening consistently
anymore.  Fix by removing an additional unnecessary gfs2_trans_add_meta call
and by setting the GLF_DIRTY flag in gfs2_iomap_end.

(The GLF_DIRTY flag causes inode_go_sync to flush the transaction log when
syncing out the glock of that inode.  When the flag isn't set, inode_go_sync
will skip inodes, including ones with an i_state of I_DIRTY_PAGES, which will
lead to cluster incoherency.)

In addition, in gfs2_iomap_page_done, if the metadata has changed, mark the
inode as I_DIRTY_DATASYNC to have the inode added to the current transaction:
we don't expect metadata to change here, but let's err on the safe side.

Fixes: d0a22a4b03 ("gfs2: Fix iomap write page reclaim deadlock");
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-07-31 18:51:50 +02:00
Linus Torvalds 5010fe9f09 New for 5.3:
- Standardize parameter checking for the SETFLAGS and FSSETXATTR ioctls
   (which were the file attribute setters for ext4 and xfs and have now
   been hoisted to the vfs)
 - Only allow the DAX flag to be set on files and directories.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEUzaAxoMeQq6m2jMV+H93GTRKtOsFAl0aJgMACgkQ+H93GTRK
 tOuKkg//SJaxcB63uVPZk9hDraYTmyo9OXRRX6X9WwDKPTWwa88CUwS1ny1QF7Mt
 zMkgzG2/y2Rs9PQ0ARoPbh1hNb2CXnvA+xnzUEev1MW6UN/nTFMZEOPn2ZQ+DxQE
 gg/0U56kKgtjtXzBZVpTgHzSETivdXwHxFW3hiTtyRXg+4ulgDIZLOjN2wRB+Pdb
 X8ZmM6MqKOTbhQEXlw13TlCKBzoMjC1w4UU4rkZPjoSjAaUWiPfrk/XU7qgguf9p
 v1dbSN2dADQ19jzZ1dmggXnlJsRMZjk/ls5rxJlB5DHDbh6YgnA2TE+tYrtH28eB
 uyKfD+RQnMzRVdmH8PsMQRQQFXR2UYyprVP7a6wi6TkB+gytn7sR5uT4sbAhmhcF
 TiTYfYNRXzemHCewyOwOsUE/7oCeiJcdbqiPAHHD/jYLZfRjSXDcGzz3+7ZYZ3GO
 hRxUhpxHPbkmK4T2OxhzReCbRsLN/0BeEcDdLkNWmi2FTh3V1gYzMGkgI9wsVbsd
 pHjoGIHbMPWqktF/obuGq96WVfYBBaWJ6WNzQqKT4dQYAJBW2omxitXQHLpi6cjt
 hG5ncxa3cPpWx4t3Lx2hb0TPS7RyYvuoQIcS/Me2RWioxrwWrgnOqdHFfLEwWpfN
 jRowdWiGgOIsq8hMt7qycmGCXzbgsbaA/7oRqh8TiwM9taPOM4c=
 =uH2E
 -----END PGP SIGNATURE-----

Merge tag 'vfs-fix-ioctl-checking-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull common SETFLAGS/FSSETXATTR parameter checking from Darrick Wong:
 "Here's a patch series that sets up common parameter checking functions
  for the FS_IOC_SETFLAGS and FS_IOC_FSSETXATTR ioctl implementations.

  The goal here is to reduce the amount of behaviorial variance between
  the filesystems where those ioctls originated (ext2 and XFS,
  respectively) and everybody else.

   - Standardize parameter checking for the SETFLAGS and FSSETXATTR
     ioctls (which were the file attribute setters for ext4 and xfs and
     have now been hoisted to the vfs)

   - Only allow the DAX flag to be set on files and directories"

* tag 'vfs-fix-ioctl-checking-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  vfs: only allow FSSETXATTR to set DAX flag on files and dirs
  vfs: teach vfs_ioc_fssetxattr_check to check extent size hints
  vfs: teach vfs_ioc_fssetxattr_check to check project id info
  vfs: create a generic checking function for FS_IOC_FSSETXATTR
  vfs: create a generic checking and prep function for FS_IOC_SETFLAGS
2019-07-12 16:54:37 -07:00
Linus Torvalds f632a8170a Driver Core and debugfs changes for 5.3-rc1
Here is the "big" driver core and debugfs changes for 5.3-rc1
 
 It's a lot of different patches, all across the tree due to some api
 changes and lots of debugfs cleanups.  Because of this, there is going
 to be some merge issues with your tree at the moment, I'll follow up
 with the expected resolutions to make it easier for you.
 
 Other than the debugfs cleanups, in this set of changes we have:
 	- bus iteration function cleanups (will cause build warnings
 	  with s390 and coresight drivers in your tree)
 	- scripts/get_abi.pl tool to display and parse Documentation/ABI
 	  entries in a simple way
 	- cleanups to Documenatation/ABI/ entries to make them parse
 	  easier due to typos and other minor things
 	- default_attrs use for some ktype users
 	- driver model documentation file conversions to .rst
 	- compressed firmware file loading
 	- deferred probe fixes
 
 All of these have been in linux-next for a while, with a bunch of merge
 issues that Stephen has been patient with me for.  Other than the merge
 issues, functionality is working properly in linux-next :)
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXSgpnQ8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ykcwgCfS30OR4JmwZydWGJ7zK/cHqk+KjsAnjOxjC1K
 LpRyb3zX29oChFaZkc5a
 =XrEZ
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core and debugfs updates from Greg KH:
 "Here is the "big" driver core and debugfs changes for 5.3-rc1

  It's a lot of different patches, all across the tree due to some api
  changes and lots of debugfs cleanups.

  Other than the debugfs cleanups, in this set of changes we have:

   - bus iteration function cleanups

   - scripts/get_abi.pl tool to display and parse Documentation/ABI
     entries in a simple way

   - cleanups to Documenatation/ABI/ entries to make them parse easier
     due to typos and other minor things

   - default_attrs use for some ktype users

   - driver model documentation file conversions to .rst

   - compressed firmware file loading

   - deferred probe fixes

  All of these have been in linux-next for a while, with a bunch of
  merge issues that Stephen has been patient with me for"

* tag 'driver-core-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (102 commits)
  debugfs: make error message a bit more verbose
  orangefs: fix build warning from debugfs cleanup patch
  ubifs: fix build warning after debugfs cleanup patch
  driver: core: Allow subsystems to continue deferring probe
  drivers: base: cacheinfo: Ensure cpu hotplug work is done before Intel RDT
  arch_topology: Remove error messages on out-of-memory conditions
  lib: notifier-error-inject: no need to check return value of debugfs_create functions
  swiotlb: no need to check return value of debugfs_create functions
  ceph: no need to check return value of debugfs_create functions
  sunrpc: no need to check return value of debugfs_create functions
  ubifs: no need to check return value of debugfs_create functions
  orangefs: no need to check return value of debugfs_create functions
  nfsd: no need to check return value of debugfs_create functions
  lib: 842: no need to check return value of debugfs_create functions
  debugfs: provide pr_fmt() macro
  debugfs: log errors when something goes wrong
  drivers: s390/cio: Fix compilation warning about const qualifiers
  drivers: Add generic helper to match by of_node
  driver_find_device: Unify the match function with class_find_device()
  bus_find_device: Unify the match callback with class_find_device
  ...
2019-07-12 12:24:03 -07:00
Linus Torvalds 0248a8be6d Some relatively minor changes for gfs2:
- An initial batch of obvious cleanups and fixes from Bob's
    recovery patch queue.
  - Two iomap conversion patches and some cleanups from Christoph
    Hellwig.
  - A cosmetic cleanup from Kefeng Wang (Huawei).
  - Another minor fix and cleanup by me.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJdJmA1AAoJENW/n+sDE2U69jgQAJGXJj+iyT7kKS2vQE15W+ff
 MdT9amtWokFBhLLz5fTI+OZ8segJPScsGuJuoZ8/5tW/3w3MMn9u96h+BD9Ww9On
 ByRG1+zMkWPyAgI2RCDytP7WQ/enJrjFxYEEGNgwfkGLZ6aDnoZ4/8DlK8MQ+6SE
 IeNFmV3jU8f4If5pLLZ352akTBhAOLC3InKv/DSHtq4QZqoxZimNQhqTuDW7uHKX
 2YwE1+NC36dBfJz70dJqb6YPoUdEn4qklQPe6jj0mlMb38uXo08dKuE3atnRmbxQ
 9cOwHlB1W5DiRb4Fg/aim+XOnrBFhPVT9kktKTwztaS/Rc6N8gUi9Est79Qz9vEK
 2vDBammJ0/zB6+ogUyv4cVN9hcgOQInyo5yv5aJvhaIl/WrVUF11rdrj35a0vgeW
 8oU0kD9h9jHPew1wCOPhS4138Qc9sDAqyeYIvAQ80W1VePw88kQ7xc9WwKyHmRlX
 XjU1REZ+4P/nCycIht9L0ow1xpqHoutCmFVNhE/dbkUoJSMJyh2xNvwTPAYE0EZe
 53oAtmBtYPhPDaghPpReHbVWq5F/OeUoRucLaugkkSZ96lbVAFQPr0TgP6GW1KS0
 rL+Epa+eG1+8qeiA4HtKG7aqdHx9vQe43RGoGzknUmG4PTQHHRuz7rNUD0jQayoi
 D/wEiuL///x10vb8cD0P
 =5lRq
 -----END PGP SIGNATURE-----

Merge tag 'gfs2-for-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2

Pull gfs2 updates from Andreas Gruenbacher:
 "Some relatively minor changes for gfs2:

   - An initial batch of obvious cleanups and fixes from Bob's recovery
     patch queue.

   - Two iomap conversion patches and some cleanups from Christoph
     Hellwig.

   - A cosmetic cleanup from Kefeng Wang (Huawei).

   - Another minor fix and cleanup by me"

* tag 'gfs2-for-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
  gfs2: Remove unused gfs2_iomap_alloc argument
  gfs2: don't use buffer_heads in gfs2_allocate_page_backing
  gfs2: use iomap_bmap instead of generic_block_bmap
  gfs2: mark stuffed_readpage static
  gfs2: merge gfs2_writepage_common into gfs2_writepage
  gfs2: merge gfs2_writeback_aops and gfs2_ordered_aops
  gfs2: remove the unused gfs2_stuffed_write_end function
  gfs2: use page_offset in gfs2_page_mkwrite
  gfs2: replace more printk with calls to fs_info and friends
  gfs2: dump fsid when dumping glock problems
  gfs2: simplify gfs2_freeze by removing case
  gfs2: Rename SDF_SHUTDOWN to SDF_WITHDRAWN
  gfs2: Warn when a journal replay overwrites a rgrp with buffers
  gfs2: log which portion of the journal is replayed
  gfs2: eliminate tr_num_revoke_rm
  gfs2: kthread and remount improvements
  gfs2: Use IS_ERR_OR_NULL
  gfs2: Clean up freeing struct gfs2_sbd
2019-07-10 21:20:05 -07:00
Linus Torvalds a47f5c56b2 New for 5.3:
- Only mark inode dirty at the end of writing to a file (instead of once
   for every page written).
 - Fix for an accounting error in the page_done callback.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEUzaAxoMeQq6m2jMV+H93GTRKtOsFAl0Y20cACgkQ+H93GTRK
 tOtoog//VAJ91Fnr7A/HOQGxmfGZrpCAV+pdwUARS2ZwbrNmrg3aGsolPzSEPgKm
 b6Zk75CPtUi9MnOAUI6d8404rv3pdhWm5rY7+QF02/JQZYUFt+uAVqJyZXxO6nDm
 egNii0iYlS9WBZyDVPMskxaySD5j8+FbCxsLVak5LIk7kwxcMUdS4Zx/jfWv2XVa
 DUls+QpcgrF+Bg8ItOnlmfBQOOwe8/vgKlzUm1PIp4Wdt8QroBMLs+BKu8ZJaOD9
 AZ4RKi2wv2ctZ6yjhq5t4dQAfFYzgHzYMB7iAxIqog4O8XASYyIuTjq0HCJrv634
 /FiDrHUaEqOdVacgXp8GMSGneDEejNiBBZPbeCd0J8YkIhskvcrZpF5gJL9PlcBa
 TcXekTHJyhFrVOJTt36adAzuS1RIFPyX1QdZKQtOGCns3nxQjTyugNYpoWBUprFF
 4BG/s4KCW4zAXFw7cnPtA2LrZr5N+nrSdCbq5kp2cN+tlhSKcmYEfXkrzsCqsPrz
 v/wMzDDpPj68rWObl+8TFSItGBQ3Z4aiXiE1xyO3mUZvKWbZ171/XWYNX5y3Curd
 NHybSYPBuw0RWWVBM+/gYHNHJcjK3chVzZVqRtTYIidAhH9vBzskFjNQ/DZp+vxs
 qO8gyzE1GyiVPkgyKMhmEYNmbw3rCns4xsWm78oY9Wgb8eOUvx0=
 =so9b
 -----END PGP SIGNATURE-----

Merge tag 'iomap-5.3-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull iomap updates from Darrick Wong:
 "There are a few fixes for gfs2 but otherwise it's pretty quiet so far.

   - Only mark inode dirty at the end of writing to a file (instead of
     once for every page written).

   - Fix for an accounting error in the page_done callback"

* tag 'iomap-5.3-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  iomap: fix page_done callback for short writes
  fs: fold __generic_write_end back into generic_write_end
  iomap: don't mark the inode dirty in iomap_write_end
2019-07-10 20:29:45 -07:00
Andreas Gruenbacher bb4cb25dd3 gfs2: Remove unused gfs2_iomap_alloc argument
Remove the unused flags argument of gfs2_iomap_alloc.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-07-04 17:24:25 +02:00
Christoph Hellwig 35af80aef9 gfs2: don't use buffer_heads in gfs2_allocate_page_backing
Rewrite gfs2_allocate_page_backing to call gfs2_iomap_get_alloc and operate on
struct iomap directly.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-07-03 14:45:18 +02:00
Christoph Hellwig 7770c93a46 gfs2: use iomap_bmap instead of generic_block_bmap
No need to indirect through get_blocks and buffer_heads when we can just use
the iomap version.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-07-03 14:45:18 +02:00
Christoph Hellwig 378b6cbfb8 gfs2: mark stuffed_readpage static
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-07-03 14:45:18 +02:00
Christoph Hellwig 59c01c5046 gfs2: merge gfs2_writepage_common into gfs2_writepage
There is no need to keep these two functions separate.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-07-03 14:45:18 +02:00
Christoph Hellwig eadd753580 gfs2: merge gfs2_writeback_aops and gfs2_ordered_aops
The only difference between the two is that gfs2_ordered_aops sets the
set_page_dirty method to __set_page_dirty_buffers, but given that
__set_page_dirty_buffers is the default, if no method is set, there is no need
to to do that.  Merge the two sets of operations into one.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-07-03 14:45:09 +02:00