Commit graph

94 commits

Author SHA1 Message Date
Trond Myklebust 80f9642724 NFSv4.x: Enforce the ca_maxresponsesize_cached on the back channel
We have no duplicate reply cache, so we always set the back channel
ca_maxresponsesize_cached to zero when negotiating the session.
That means we should always error out as soon as we see the server
set args->csa_cachethis.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-01-24 17:12:48 -05:00
Trond Myklebust f74a834a0e NFSv4.x: CB_SEQUENCE should return NFS4ERR_DELAY if still executing
See RFC5661 Section 2.10.6.2: if retrying a request, and the old one is
still in progress, we must return NFS4ERR_DELAY as the reply to sequence.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-01-24 17:12:48 -05:00
Trond Myklebust f4f58ed19b NFSv4.x: Remove hard coded slotids in callback channel
Instead, use the values encoded in the slot table itself.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-01-24 17:12:47 -05:00
Trond Myklebust 4b0934baf9 NFSv4.1/pNFS: Fix a race in initiate_file_draining()
Peng Tao points out that the call to pnfs_mark_matching_lsegs_return()
could race with pnfs_put_lseg(), in which case the layout segment is
cleared, but no layoutreturn will be sent.
Fix is to replace the call to pnfs_mark_matching_lsegs_invalid().

Reported-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-01-04 12:36:12 -05:00
Trond Myklebust b20135d0b2 NFSv4.1/pNFS: Don't queue up a new commit if the layout segment is invalid
If the layout segment is invalid, then we should not be adding more
write requests to the commit list. Instead, those writes should be
replayed after requesting a new layout.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-31 15:55:35 -05:00
Trond Myklebust e07db907eb NFSv4: List stateid information in the callback tracepoints
The stateid is extremely valuable when debugging.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28 14:33:05 -05:00
Trond Myklebust e0d9243048 NFSv4.1/pNFS: Don't return NFS4ERR_DELAY unnecessarily in CB_LAYOUTRECALL
If the client is promising to return the layout ASAP, then there is no
need to return DELAY and have the server retry. Instead default to the
normal procedure described in RFC5661.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28 14:33:05 -05:00
Trond Myklebust 41c9127d6d NFSv4.1/pNFS: Ensure we enforce RFC5661 Section 12.5.5.2.1
The RFC requires us to check if the server is recalling a stateid that we
haven't yet received. If so, tell it to wait.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28 14:33:04 -05:00
Trond Myklebust fc7ff36747 pNFS: If we have to delay the layout callback, mark the layout for return
If the client needs to delay the layout callback, then speed up the recall
process by marking the remaining layout segments to be actively returned
by the client.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28 14:33:04 -05:00
Trond Myklebust 0654cc726f NFSv4.1/pNFS: Add a helper to mark the layout as returned
This ensures that we don't reuse the stateid if a layout return or
implied layout return means that we've returned all layout segments

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28 14:33:04 -05:00
Kinglong Mee 39de493e88 NFS: Remove unneeded NFS_DEBUG checking before define NFSDBG_FACILITY
It's not needed to checking NFS_DEBUG before define NFSDBG_FACILITY, remove it.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-10-21 15:49:23 -05:00
Trond Myklebust 249b2eef64 NFSv4: Add a tracepoint for CB_LAYOUTRECALL
Only support for single file layoutrecall for now.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-08-25 14:40:06 -04:00
Trond Myklebust 7cd148610a NFSv4: Add a tracepoint for CB_GETATTR
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-08-25 14:40:05 -04:00
Anna Schumaker 3f10a6af4b NFS: Remove nfs41_server_notify_{target|highest}_slotid_update()
All these functions do is call nfs41_ping_server() without adding
anything.  Let's remove them and give nfs41_ping_server() a better name
instead.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-08-17 13:32:00 -05:00
Trond Myklebust 4e54ab8d8c NFS: Ensure that we update the sequence id under the slot table lock
Fix a callback slot table regression.

Fixes: e937ee714b ("nfs: Only update callback sequnce id when CB_SEQUENCE success")
Cc: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-06-11 21:15:52 -04:00
Kinglong Mee 0579c8d208 nfs: Initialize cb_sequenceres information before validate_seqid()
For a cb_layoutrecall replay, nfsd got CB_SEQUENCE status of zero,
but all informations of cb_sequenceres are zero too !!!

validate_seqid() return NFS4ERR_RETRY_UNCACHED_REP for a replay,
and skip the initlize cb_sequenceres.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-06-11 21:09:06 -04:00
Kinglong Mee e937ee714b nfs: Only update callback sequnce id when CB_SEQUENCE success
When testing pnfs layout, nfsd got error NFS4ERR_SEQ_MISORDERED.
It is caused by nfs return NFS4ERR_DELAY before validate_seqid(),
don't update the sequnce id, but nfsd updates the sequnce id !!!

According to RFC5661 20.9.3,
" If CB_SEQUENCE returns an error, then the state of the slot
  (sequence ID, cached reply) MUST NOT change. "

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-06-11 14:00:40 -04:00
Trond Myklebust b1c0df5fad NFSv4.1: Don't set up a backchannel if the server didn't agree to do so
If the server doesn't agree to out backchannel setup request, then
don't set one up.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-02-18 12:30:47 -08:00
Weston Andros Adamson cb1410c71e NFS: fix subtle change in COMMIT behavior
Recent work in the pgio layer made it possible for there to be more than one
request per page. This caused a subtle change in commit behavior, because
write.c:nfs_commit_unstable_pages compares the number of *pages* waiting for
writeback against the number of requests on a commit list to choose when to
send a COMMIT in a non-blocking flush.

This is probably hard to hit in normal operation - you have to be using
rsize/wsize < PAGE_SIZE, or pnfs with lots of boundaries that are not page
aligned to have a noticeable change in behavior.

Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-11-24 17:00:42 -05:00
Christoph Hellwig 84c9dee3ad pnfs: enable CB_NOTIFY_DEVICEID support
This code has been around for a while, but never was enabled, although
it is in a working shape.

Note that we implement NOTIFY_DEVICEID4_CHANGE identical to
NOTIFY_DEVICEID4_DELETE.  Given that in either case we can't do anything
but preventing further lookups of a given device ID there isn't much difference
in semantics for the two.  For the delete case the server MUST ensure that
there are no outstanding layouts, while for the change case it doesn't, but
that has little relevance to the client.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-09-12 13:33:50 -04:00
Christoph Hellwig c88953d87f pnfs: add return_range method
If a layout driver keeps per-inode state outside of the layout segments it
needs to be notified of any layout returns or recalls on an inode, and not
just about the freeing of layout segments.  Add a method to acomplish this,
which will allow the block layout driver to handle the case of truncated
and re-expanded files properly.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-09-10 12:47:03 -07:00
Christoph Hellwig 7c5d187581 pnfs: force a layout commit when encountering busy segments during recall
Expedite layout recall processing by forcing a layout commit when
we see busy segments.  Without it the layout recall might have to wait
until the VM decided to start writeback for the file, which can introduce
long delays.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-09-10 12:47:02 -07:00
Trond Myklebust 9a7fe9e890 NFSv4.1: Minor optimisation in get_layout_by_fh_locked()
If the filehandles match, but the igrab() fails, or the layout is
freed before we can get it, then just return NULL.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-02-19 21:21:06 -05:00
Trond Myklebust 27999f2530 NFSv4.1: Ensure that the layout recall callback matches layout stateids
It is not sufficient to compare filehandles when we receive a layout
recall from the server; we also need to check that the layout stateids
match.

Reported-by: shaobingqing <shaobingqing@bwstor.com.cn>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-02-19 21:21:05 -05:00
Chuck Lever e8d92382dd NFS: When displaying session slot numbers, use "%u" consistently
Clean up, since slot and sequence numbers are all unsigned anyway.

Among other things, squelch compiler warnings:

linux/fs/nfs/nfs4proc.c: In function ‘nfs4_setup_sequence’:
linux/fs/nfs/nfs4proc.c:703:2: warning: signed and unsigned type in
	conditional expression [-Wsign-compare]

and

linux/fs/nfs/nfs4session.c: In function ‘nfs4_alloc_slot’:
linux/fs/nfs/nfs4session.c:151:31: warning: signed and unsigned type in
	conditional expression [-Wsign-compare]

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-09-03 15:26:30 -04:00
Trond Myklebust 2f92ae343e NFSv4.1: Add tracepoints for debugging slot table operations
Add tracepoints to nfs41_setup_sequence and nfs41_sequence_done
to track session and slot table state changes.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-08-22 08:58:27 -04:00
Trond Myklebust ca8acf8d84 NFSv4: Add tracepoints for debugging delegations
Set up tracepoints to track when delegations are set, reclaimed,
returned by the client, or recalled by the server.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-08-22 08:58:24 -04:00
Trond Myklebust 959d921f5e Merge branch 'labeled-nfs' into linux-next
* labeled-nfs:
  NFS: Apply v4.1 capabilities to v4.2
  NFS: Add in v4.2 callback operation
  NFS: Make callbacks minor version generic
  Kconfig: Add Kconfig entry for Labeled NFS V4 client
  NFS: Extend NFS xattr handlers to accept the security namespace
  NFS: Client implementation of Labeled-NFS
  NFS: Add label lifecycle management
  NFS:Add labels to client function prototypes
  NFSv4: Extend fattr bitmaps to support all 3 words
  NFSv4: Introduce new label structure
  NFSv4: Add label recommended attribute and NFSv4 flags
  NFSv4.2: Added NFS v4.2 support to the NFS client
  SELinux: Add new labeling type native labels
  LSM: Add flags field to security_sb_set_mnt_opts for in kernel mount data.
  Security: Add Hook to test if the particular xattr is part of a MAC model.
  Security: Add hook to calculate context based on a negative dentry.
  NFS: Add NFSv4.2 protocol constants

Conflicts:
	fs/nfs/nfs4proc.c
2013-06-28 16:29:51 -04:00
Bryan Schumaker 459de2edb9 NFS: Make callbacks minor version generic
I found a few places that hardcode the minor version number rather than
making it dependent on the protocol the callback came in over.  This
patch makes it easier to add new minor versions in the future.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-06-08 16:20:18 -04:00
Andy Adamson 774d5f14ee NFSv4.1 Fix a pNFS session draining deadlock
On a CB_RECALL the callback service thread flushes the inode using
filemap_flush prior to scheduling the state manager thread to return the
delegation. When pNFS is used and I/O has not yet gone to the data server
servicing the inode, a LAYOUTGET can preceed the I/O. Unlike the async
filemap_flush call, the LAYOUTGET must proceed to completion.

If the state manager starts to recover data while the inode flush is sending
the LAYOUTGET, a deadlock occurs as the callback service thread holds the
single callback session slot until the flushing is done which blocks the state
manager thread, and the state manager thread has set the session draining bit
which puts the inode flush LAYOUTGET RPC to sleep on the forechannel slot
table waitq.

Separate the draining of the back channel from the draining of the fore channel
by moving the NFS4_SESSION_DRAINING bit from session scope into the fore
and back slot tables.  Drain the back channel first allowing the LAYOUTGET
call to proceed (and fail) so the callback service thread frees the callback
slot. Then proceed with draining the forechannel.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-05-20 14:20:14 -04:00
Trond Myklebust 826e001308 NFSv4: Fix CB_RECALL_ANY to only return delegations that are not in use
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-05 17:03:57 -04:00
Trond Myklebust fd9a8d7160 NFSv4.1: Fix bulk recall and destroy of layouts
The current code in pnfs_destroy_all_layouts() assumes that removing
the layout from the server->layouts list is sufficient to make it
invisible to other processes. This ignores the fact that most
users access the layout through the nfs_inode->layout...
There is further breakage due to lack of reference counting of the
layouts, meaning that the whole thing Oopses at the drop of a hat.

The code in initiate_bulk_draining() is almost correct, and can be
used as a model for pnfs_destroy_all_layouts(), so move that
code to pnfs.c, and refactor the code to allow us to choose between
a single filesystem bulk recall, and a recall of all layouts.
Also note that initiate_bulk_draining() currently calls iput() while
holding locks. Fix that too.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
2013-02-14 13:22:50 -05:00
Nickolai Zeldovich ecf0eb9edb nfs: avoid dereferencing null pointer in initiate_bulk_draining
Fix an inverted null pointer check in initiate_bulk_draining().

Signed-off-by: Nickolai Zeldovich <nickolai@csail.mit.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org [>= 3.7]
2013-01-05 14:26:51 -05:00
Trond Myklebust 73e39aaa83 NFSv4.1: Cleanup move session slot management to fs/nfs/nfs4session.c
NFSv4.1 session management is getting complex enough to deserve
a separate file.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-06 00:30:45 +01:00
Trond Myklebust ac0748359a NFSv4.1: CB_RECALL_SLOT must schedule a sequence op after updating targets
RFC5661 requires us to make sure that the server knows we've updated
our slot table size by sending at least one SEQUENCE op containing the
new 'highest_slotid' value.
We can do so using the 'CHECK_LEASE' functionality of the state
manager.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-06 00:30:43 +01:00
Trond Myklebust afa296103e NFSv4.1: Remove the state manager code to resize the slot table
The state manager no longer needs any special machinery to stop the
session flow and resize the slot table. It is all done on the fly by
the SEQUENCE op code now.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-06 00:30:43 +01:00
Trond Myklebust 1b285ff16a NFSv4.1: Allow the server to recall all but one slot
If the server wants to leave us with only one slot, or it wants
to "shrink" our slot table to something larger than we have now,
then so be it.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-06 00:30:42 +01:00
Trond Myklebust d5fb4ce33e NFSv4.1: Don't confuse target_highest_slotid and max_slots in cb_recall_slot
Don't confuse the table size and the target_highest_slotid...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-06 00:30:41 +01:00
Trond Myklebust ce008c4bb9 NFSv4.1: Fix nfs4_callback_recallslot to work with dynamic slot allocation
Ensure that the NFSv4.1 CB_RECALL_SLOT callback updates the slot table
target max slotid safely.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-06 00:30:37 +01:00
Trond Myklebust 464ee9f966 NFSv4.1: Ensure that the client tracks the server target_highest_slotid
Dynamic slot allocation in NFSv4.1 depends on the client being able to
track the server's target value for the highest slotid in the
slot table.  See the reference in Section 2.10.6.1 of RFC5661.

To avoid ordering problems in the case where 2 SEQUENCE replies contain
conflicting updates to this target value, we also introduce a generation
counter, to track whether or not an RPC containing a SEQUENCE operation
was launched before or after the last update.

Also rename the nfs4_slot_table target_max_slots field to
'target_highest_slotid' to avoid confusion with a slot
table size or number of slots.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-06 00:29:47 +01:00
Trond Myklebust 4ea8fed593 NFSv4: Get rid of unnecessary BUG_ON()s
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-11-04 14:43:39 -05:00
Trond Myklebust 9c6263819f NFSv4.1: Clean up the removal of pnfs_layout_hdr from the server list
Move the code into pnfs_free_layout_hdr(), and add checks to
get_layout_by_fh_locked to ensure that they don't reference a layout
that is being freed.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28 16:03:14 -04:00
Trond Myklebust 70c3bd2bdf NFSv4.1: Cleanup; add "pnfs_" prefix to get_layout_hdr() and put_layout_hdr()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28 16:03:07 -04:00
Trond Myklebust 49a85061b0 NFSv4.1: Cleanup add a "pnfs_" prefix to mark_matching_lsegs_invalid
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28 16:03:06 -04:00
Trond Myklebust 36281caa83 NFSv4: Further clean-ups of delegation stateid validation
Change the name to reflect what we're really doing: testing two
stateids for whether or not they match according the the rules in
RFC3530 and RFC5661.
Move the code from callback_proc.c to nfs4proc.c

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-06 10:32:44 -05:00
Trond Myklebust 8e663f0e5f NFSv4.1: Fix matching of the stateids when returning a delegation
nfs41_validate_delegation_stateid is broken if we supply a stateid with
a non-zero sequence id. Instead of trying to match the sequence id,
the function assumes that we always want to error. While this is
true for a delegation callback, it is not true in general.

Also fix a typo in nfs4_callback_recall.

Reported-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-06 10:32:44 -05:00
Trond Myklebust 2446ab6070 SUNRPC: Use RCU to dereference the rpc_clnt.cl_xprt field
A migration event will replace the rpc_xprt used by an rpc_clnt.  To
ensure this can be done safely, all references to cl_xprt must now use
a form of rcu_dereference().

Special care is taken with rpc_peeraddr2str(), which returns a pointer
to memory whose lifetime is the same as the rpc_xprt.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
[ cel: fix lockdep splats and layering violations ]
[ cel: forward ported to 3.4 ]
[ cel: remove rpc_max_reqs(), add rpc_net_ns() ]
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-02 15:36:38 -05:00
Trond Myklebust 0cb3284b53 NFSv4.1: Get rid of NFS4CLNT_LAYOUTRECALL
The NFS4CLNT_LAYOUTRECALL bit is a long-term impediment to scalability. It
basically stops all other recalls by a given server once any layout recall
is requested.

If the recall is for a different file, then we don't care.
If the recall applies to the same file, then we're in one of two situations:
Either we are in the case of a replay of an existing request, in which case
the session is supposed to deal with matters, or we are dealing with a
completely different request, in which case we should just try to process
it.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-01 11:17:50 -05:00
Stanislav Kinsbursky c7add9a972 NFS: search for client session id in proper network namespace
Network namespace is taken from request transport and passed as a part of
cb_process_state structure.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-02-06 18:48:04 -05:00
Benny Halevy d36b7cf7c6 pnfs: clean up initiate_file_draining layout lookup
Fixes the following compiler warning:

fs/nfs/callback_proc.c: In function 'do_callback_layoutrecall':
fs/nfs/callback_proc.c:115:26: warning: 'lo' may be used uninitialized in this function

Reported-by: Jim Rees <rees@umich.edu>
Signed-off-by: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-01-31 19:28:22 -05:00