1
0
Fork 0
Commit Graph

85 Commits (8a68badd12a8b708a02d54cd5aac4d07851a6d5a)

Author SHA1 Message Date
Thomas Gleixner 1f3a8e49d8 ktime: Get rid of ktime_equal()
No point in going through loops and hoops instead of just comparing the
values.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
2016-12-25 17:21:23 +01:00
Thomas Gleixner 2456e85535 ktime: Get rid of the union
ktime is a union because the initial implementation stored the time in
scalar nanoseconds on 64 bit machine and in a endianess optimized timespec
variant for 32bit machines. The Y2038 cleanup removed the timespec variant
and switched everything to scalar nanoseconds. The union remained, but
become completely pointless.

Get rid of the union and just keep ktime_t as simple typedef of type s64.

The conversion was done with coccinelle and some manual mopping up.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
2016-12-25 17:21:22 +01:00
Weston Andros Adamson 1c48cee83b pNFS/flexfiles: delete deviceid, don't mark inactive
Instead of marking a device inactive, remove it from the cache entirely.

Flexfiles has a way to report errors back to the server, so we don't want
to stop devices from being tried again for 120 seconds.

Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-19 17:29:45 -05:00
Trond Myklebust d9152114f7 pNFS/flexfiles: Ensure we have enough buffer for layoutreturn
The flexfiles client can piggyback both layout errors and layoutstats
as part of the layoutreturn. Both these payloads can get large, with
20 layout error entries taking up about 1.2K, and 4 layoutstats entries
taking up another 1K.
This patch allows a maximum payload of 4k by allocating a full page.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-09 20:26:59 -05:00
Trond Myklebust 5ba6a09e92 pNFS/flexfiles: Remove a redundant parameter in ff_layout_encode_ioerr()
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-09 20:26:58 -05:00
Fred Isaman 65990d1afb pNFS/flexfiles: Fix a deadlock on LAYOUTGET
We encountered a deadlock where the SEQUENCE that accompanied the
LAYOUTGET triggered a session drain, while ff_layout_alloc_lseg
triggered a GETDEVICEINFO.  The GETDEVICEINFO hung waiting for the
session drain, while the LAYOUTGET held the slot waiting for
alloc_lseg to finish.
  Avoid this by moving the call to nfs4_find_get_deviceid out of
ff_layout_alloc_lseg and into nfs4_ff_layout_prepare_ds.

Signed-off-by: Fred Isaman <fred.isaman@gmail.com>
[dros@primarydata.com: pNFS/flexfiles: fix races in ff_layout_mirror_valid]
Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-08 21:49:57 -05:00
Trond Myklebust 230bc962a6 pNFS/flexfiles: Support sending layoutstats in layoutreturn
Add the ability to send an array of layoutstats entries as part of
layoutreturn.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-03 15:37:46 -05:00
Trond Myklebust 422c93c881 pNFS/flexfiles: Minor refactoring before adding iostats to layoutreturn
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-03 15:37:45 -05:00
Trond Myklebust 2f8220c16e NFS: Fix up read of mirror stats
Need to lock while reading in order to ensure 64-bit reads are correct.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-03 15:37:44 -05:00
Trond Myklebust 08e2e5bc6c pNFS/flexfiles: Clean up layoutstats
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-03 15:37:44 -05:00
Trond Myklebust 5b9b3c855a pNFS/flexfiles: Refactor encoding of the layoutreturn payload
Add the layout error payload to the flexfiles layoutreturn private
data, and set up the encoding mechanisms. This is a refactoring in
preparation for adding the layout iostats payload.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-03 15:37:43 -05:00
Trond Myklebust 06946c6a3d pNFS/flexfiles: Only send layoutstats updates for mirrors that were updated
If there have been no reads or writes to a given mirror since the last
layoutstats update, then don't resend the same data.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-02 11:42:58 -05:00
Trond Myklebust 46c98c6d1b pNFS/flexfiles: Don't attempt to send layoutstats if there are no entries
If the list of mirrors is empty, then don't send an RPC call.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-02 11:42:58 -05:00
Trond Myklebust 94e5c571fc pNFS: Get rid of unnecessary layout parameter in encode_layoutreturn callback
The parameter is already present in the "args" structure.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-01 17:21:44 -05:00
Trond Myklebust 54e4a0dfa2 pNFS: Fix a deadlock between read resends and layoutreturn
We must not call nfs_pageio_init_read() on a new nfs_pageio_descriptor
while holding a reference to a layout segment, as that can deadlock
pnfs_update_layout().

Fixes: d67ae825a5 ("pnfs/flexfiles: Add the FlexFile Layout Driver")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: stable@vger.kernel.org # v4.0+
2016-12-01 17:21:38 -05:00
Trond Myklebust 41020b671a NFSv4.x: Allow callers of nfs_remove_bad_delegation() to specify a stateid
Allow the callers of nfs_remove_bad_delegation() to specify the stateid
that needs to be marked as bad.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Tested-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-09-27 14:33:37 -04:00
Trond Myklebust 3dc147359e pNFS/flexfiles: Fix an Oopsable condition when connection to the DS fails
If the attempt to connect to a DS fails inside ff_layout_pg_init_read or
ff_layout_pg_init_write, then we currently end up clearing the layout
segment carried by the struct nfs_pageio_descriptor, causing an Oops
when we later call into ff_layout_read_pagelist/ff_layout_write_pagelist.

The fix is to ensure we return the layout and then retry.

Fixes: 446ca21953 ("pNFS/flexfiles: When initing reads or writes, we...")
Cc: stable@vger.kernel.org # v4.7+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-08-29 15:21:16 -04:00
Trond Myklebust 1c8d477a77 pNFS/flexfiles: Fix layoutstat periodic reporting
Putting the periodicity timer in the mirror instances is causing
non-scalable reporting behaviour and missed reporting intervals.
When you recall layouts and/or implement client side mirroring, it
leads to consecutive reports with only a few ms between RPC calls.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Fixes: d0379a5d06 ("pNFS/flexfiles: Support server-supplied...")
2016-08-14 23:01:10 -04:00
Trond Myklebust 2e18d4d822 pNFS: Files and flexfiles always need to commit before layoutcommit
So ensure that we mark the layout for commit once the write is done,
and then ensure that the commit to ds is finished before sending
layoutcommit.

Note that by doing this, we're able to optimise away the commit
for the case of servers that don't need layoutcommit in order to
return updated attributes.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-07-05 19:08:01 -04:00
Trond Myklebust bc28e1c2e3 pNFS/flexfiles: Clean up calls to pnfs_set_layoutcommit()
Let's just have one place where we check ff_layout_need_layoutcommit().

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-07-05 18:52:26 -04:00
Trond Myklebust c001c87a63 pNFS/flexfiles: Fix layoutcommit after a commit to DS
We should always do a layoutcommit after commit to DS, except if
the layout segment we're using has set FF_FLAGS_NO_LAYOUTCOMMIT.

Fixes: d67ae825a5 ("pnfs/flexfiles: Add the FlexFile Layout Driver")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-07-05 18:52:26 -04:00
Tom Haynes c7d73af2d2 pnfs: pnfs_update_layout needs to consider if strict iomode checking is on
As flexfiles has FF_FLAGS_NO_READ_IO, there is a need to generically
support enforcing that a IOMODE_RW segment will not allow READ I/O.

Signed-off-by: Tom Haynes <loghyr@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-26 08:40:56 -04:00
Tom Haynes 602c4cd452 nfs/flexfiles: Use the layout segment for reading unless it a IOMODE_RW and reading is disabled
Signed-off-by: Tom Haynes <loghyr@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-26 08:40:51 -04:00
Jeff Layton 094069f1d9 flexfiles: remove pointless setting of NFS_LAYOUT_RETURN_REQUESTED
Setting just the NFS_LAYOUT_RETURN_REQUESTED flag doesn't do anything,
unless there are lsegs that are also being marked for return. At the
point where that happens this flag is also set, so these set_bit calls
don't do anything useful.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-17 15:48:11 -04:00
Jeff Layton ee26bdd680 pnfs: don't merge new ff lsegs with ones that have LAYOUTRETURN bit set
Otherwise, we'll end up returning layouts that we've just received if
the client issues a new LAYOUTGET prior to the LAYOUTRETURN.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-17 15:48:09 -04:00
Tom Haynes 446ca21953 pNFS/flexfiles: When initing reads or writes, we might have to retry connecting to DSes
If we are initializing reads or writes and can not connect to a DS, then
check whether or not IO is allowed through the MDS. If it is allowed,
reset to the MDS. Else, fail the layout segment and force a retry
of a new layout segment.

Signed-off-by: Tom Haynes <loghyr@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-17 15:48:08 -04:00
Tom Haynes 3b13b4b311 pNFS/flexfiles: When checking for available DSes, conditionally check for MDS io
Whenever we check to see if we have the needed number of DSes for the
action, we may also have to check to see whether IO is allowed to go to
the MDS or not.

[jlayton: fix merge conflict due to lack of localio patches here]

Signed-off-by: Tom Haynes <loghyr@primarydata.com>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-17 15:48:08 -04:00
Trond Myklebust 75bf47ebf6 pNFS/flexfile: Fix erroneous fall back to read/write through the MDS
This patch fixes a problem whereby the pNFS client falls back to doing
reads and writes through the metadata server even when the layout flag
FF_FLAGS_NO_IO_THRU_MDS is set.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-17 15:48:07 -04:00
Trond Myklebust 93b717fd81 NFSv4: Label stateids with the type
In order to more easily distinguish what kind of stateid we are dealing
with, introduce a type that can be used to label the stateid structure.

The label will be useful both for debugging, but also when dealing with
operations like SETATTR, READ and WRITE that can take several different
types of stateid as arguments.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-17 15:48:06 -04:00
Jeff Layton 3064b6861d nfs: have flexfiles mirror keep creds for both ro and rw layouts
A mirror can be shared between multiple layouts, even with different
iomodes. That makes stats gathering simpler, but it causes a problem
when we get different creds in READ vs. RW layouts.

The current code drops the newer credentials onto the floor when this
occurs. That's problematic when you fetch a READ layout first, and then
a RW. If the READ layout doesn't have the correct creds to do a write,
then writes will fail.

We could just overwrite the READ credentials with the RW ones, but that
would break the ability for the server to fence the layout for reads if
things go awry. We need to be able to revert to the earlier READ creds
if the RW layout is returned afterward.

The simplest fix is to just keep two sets of creds per mirror. One for
READ layouts and one for RW, and then use the appropriate set depending
on the iomode of the layout segment.

Also fix up some RCU nits that sparse found.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-09 09:05:40 -04:00
Jeff Layton 90a0be00e9 nfs: get a reference to the credential in ff_layout_alloc_lseg
We're just as likely to have allocation problems here as we would if we
delay looking up the credential like we currently do. Fix the code to
get a rpc_cred reference early, as soon as the mirror is set up.

This allows us to eliminate the mirror early if there is a problem
getting an rpc credential. This also allows us to drop the uid/gid
from the layout_mirror struct as well.

In the event that we find an existing mirror where this one would go, we
swap in the new creds unconditionally, and drop the reference to the old
one.

Note that the old ff_layout_update_mirror_cred function wouldn't set
this pointer unless the DS version was 3, but we don't know what the DS
version is at this point. I'm a little unclear on why it did that as you
still need creds to talk to v4 servers as well. I have the code set
it regardless of the DS version here.

Also note the change to using generic creds instead of calling
lookup_cred directly. With that change, we also need to populate the
group_info pointer in the acred as some functions expect that to never
be NULL. Instead of allocating one every time however, we can allocate
one when the module is loaded and share it since the group_info is
refcounted.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-09 09:05:40 -04:00
Jeff Layton 57f3f4c0cd nfs: have ff_layout_get_ds_cred take a reference to the cred
In later patches, we're going to want to allow the creds to be updated
when we get a new layout with updated creds. Have this function take
a reference to the cred that is later put once the call has been
dispatched.

Also, prepare for this change by ensuring we follow RCU rules when
getting a reference to the cred as well.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-09 09:05:40 -04:00
Dave Wysochanski fe238e601d NFS: Save struct inode * inside nfs_commit_info to clarify usage of i_lock
Commit ea2cf22 created nfs_commit_info and saved &inode->i_lock inside
this NFS specific structure.  This obscures the usage of i_lock.
Instead, save struct inode * so later it's clear the spinlock taken is
i_lock.

Should be no functional change.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-09 09:05:40 -04:00
Trond Myklebust 2370abdab5 NFS: Cleanup - rename NFS_LAYOUT_RETURN_BEFORE_CLOSE
NFS_LAYOUT_RETURN_BEFORE_CLOSE is being used to signal that a
layoutreturn is needed, either due to a layout recall or to a
layout error. Rename it to NFS_LAYOUT_RETURN_REQUESTED in order
to clarify its purpose.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-01-27 20:40:05 -05:00
Trond Myklebust 6d45c042f3 Merge branch 'bugfixes'
* bugfixes:
  pNFS/flexfiles: Fix an XDR encoding bug in layoutreturn
  pNFS/flexfiles: Improve merging of errors in LAYOUTRETURN
2016-01-22 11:02:36 -05:00
Trond Myklebust 082fa37d13 pNFS/flexfiles: Fix an XDR encoding bug in layoutreturn
We must not skip encoding the statistics, or the server will see an
XDR encoding error.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: stable@vger.kernel.org # 4.0+
2016-01-22 11:01:44 -05:00
Trond Myklebust daaadd2283 Merge branch 'bugfixes'
* bugfixes:
  SUNRPC: Fixup socket wait for memory
  SUNRPC: Fix a missing break in rpc_anyaddr()
  pNFS/flexfiles: Fix an Oopsable typo in ff_mirror_match_fh()
  NFS: Fix attribute cache revalidation
  NFS: Ensure we revalidate attributes before using execute_ok()
  NFS: Flush reclaim writes using FLUSH_COND_STABLE
  NFS: Background flush should not be low priority
  NFSv4.1/pnfs: Fixup an lo->plh_block_lgets imbalance in layoutreturn
  NFSv4: Don't perform cached access checks before we've OPENed the file
  NFS: Allow the combination pNFS and labeled NFS
  NFS42: handle layoutstats stateid error
  nfs: Fix race in __update_open_stateid()
  nfs: fix missing assignment in nfs4_sequence_done tracepoint
2016-01-07 18:45:36 -05:00
Trond Myklebust 942e3d72a6 Merge branch 'pnfs_generic'
* pnfs_generic:
  NFSv4.1/pNFS: Cleanup constify struct pnfs_layout_range arguments
  NFSv4.1/pnfs: Cleanup copying of pnfs_layout_range structures
  NFSv4.1/pNFS: Cleanup pnfs_mark_matching_lsegs_invalid()
  NFSv4.1/pNFS: Fix a race in initiate_file_draining()
  NFSv4.1/pNFS: pnfs_error_mark_layout_for_return() must always return layout
  NFSv4.1/pNFS: pnfs_mark_matching_lsegs_return() should set the iomode
  NFSv4.1/pNFS: Use nfs4_stateid_copy for copying stateids
  NFSv4.1/pNFS: Don't pass stateids by value to pnfs_send_layoutreturn()
  NFS: Relax requirements in nfs_flush_incompatible
  NFSv4.1/pNFS: Don't queue up a new commit if the layout segment is invalid
  NFS: Allow multiple commit requests in flight per file
  NFS/pNFS: Fix up pNFS write reschedule layering violations and bugs
  NFSv4: List stateid information in the callback tracepoints
  NFSv4.1/pNFS: Don't return NFS4ERR_DELAY unnecessarily in CB_LAYOUTRECALL
  NFSv4.1/pNFS: Ensure we enforce RFC5661 Section 12.5.5.2.1
  pNFS: If we have to delay the layout callback, mark the layout for return
  NFSv4.1/pNFS: Add a helper to mark the layout as returned
  pNFS: Ensure nfs4_layoutget_prepare returns the correct error
2016-01-04 13:19:55 -05:00
Trond Myklebust dc602dd706 NFS/pNFS: Fix up pNFS write reschedule layering violations and bugs
The flexfiles layout in particular, seems to want to poke around in the
O_DIRECT flags when retransmitting.
This patch sets up an interface to allow it to call back into O_DIRECT
to handle retransmission correctly. It also fixes a potential bug whereby
we could change the behaviour of O_DIRECT if an error is already pending.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-31 13:50:42 -05:00
Trond Myklebust 86fb449b07 pNFS/flexfiles: Fix an Oopsable typo in ff_mirror_match_fh()
Jeff reports seeing an Oops in ff_layout_alloc_lseg. Turns out
copy+paste has played cruel tricks on a nested loop.

Reported-by: Jeff Layton <jeff.layton@primarydata.com>
Cc: stable@vger.kernel.org # 4.3+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-30 10:57:01 -05:00
Trond Myklebust 4d0ac22109 pNFS/flexfiles: Ensure we record layoutstats even if RPC is terminated early
Currently, we will only record the layoutstats correctly if the
RPC call successfully obtains a slot. If we exit before that
happens, then we may find ourselves starting the busy timer through
the call in ff_layout_(read|write)_prepare_layoutstats, but never stopping it.

The same thing happens if we're doing DA-DS.

The fix is to ensure that we catch these cases in the rpc_release()
callback.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28 14:32:41 -05:00
Trond Myklebust 37e9ed22b1 pNFS: Add flag to track if we've called nfs4_ff_layout_stat_io_start_read/write
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28 14:32:41 -05:00
Trond Myklebust 7eeea16797 pNFS/flexfiles: Fix a statistics gathering imbalance
When we replay a failed read, write or commit to the dataserver, we
need to ensure that we call ff_layout_read_prepare_v3(),
ff_layout_write_prepare_v3 or ff_layout_commit_prepare_v3() so that we
reset the statistics.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28 14:32:40 -05:00
Trond Myklebust 2e5b29f044 pNFS/flexfiles: Don't prevent flexfiles client from retrying LAYOUTGET
Fix a bug in which flexfiles clients are falling back to I/O through the
MDS even when the FF_FLAGS_NO_IO_THRU_MDS flag is set.

The flexfiles client will always report errors through the LAYOUTRETURN
and/or LAYOUTERROR mechanisms, so it should normally be safe for it
to retry the LAYOUTGET until it fails or succeeds.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28 14:32:40 -05:00
Peng Tao 141b9b59ed pnfs/flexfiles: count io stat in rpc_count_stats callback
If client ever restarts IO due to some errors, we'll endup
mis-counting IO stats if we do the counting in .rpc_done
callback. Move it to .rpc_count_stats callback that is only
called when releasing RPC.

Signed-off-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28 14:32:39 -05:00
Peng Tao c22eeb8697 pnfs/flexfiles: do not mark delay-like status as DS failure
We just need to delay and retry in these cases.

Signed-off-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28 14:32:39 -05:00
Peng Tao d600ad1f2b NFS41: pop some layoutget errors to application
For ERESTARTSYS/EIO/EROFS/ENOSPC/E2BIG in layoutget, we
should just bail out instead of hiding the error and
retrying inband IO.

Change all the call sites to pop the error all the way up.

Signed-off-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28 14:32:36 -05:00
Trond Myklebust d0379a5d06 pNFS/flexfiles: Support server-supplied layoutstats sampling period
Some servers want to be able to control the frequency with which clients
report layoutstats, for instance, in order to monitor QoS for a particular
file or set of file. In order to support this, the flexfiles layout allows
the server to pass this info as a hint in the layout payload.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28 14:32:36 -05:00
Trond Myklebust 260074cd84 pNFS/flexfiles: Add support for FF_FLAGS_NO_IO_THRU_MDS
For loosely coupled pNFS/flexfiles systems, there is often no advantage
at all in going through the MDS for I/O, since the MDS is subject to
the same limitations as all other clients when talking to DSes. If a
DS is unresponsive, I/O through the MDS will fail.

For such systems, the only scalable solution is to have the pNFS clients
retry doing pNFS, and so the protocol now provides a flag that allows
the pNFS server to signal this.

If LAYOUTGET returns FF_FLAGS_NO_IO_THRU_MDS, then we should assume that
the MDS wants the client to retry using these devices, even if they were
previously marked as being unavailable. To do so, we add a helper,
ff_layout_mark_devices_valid() that will be called from layoutget.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-11-02 13:50:37 -05:00
Trond Myklebust 135444126a pNFS/flexfiles: When mirrored, retry failed reads by switching mirrors
If the pNFS/flexfiles file is mirrored, and a read to one mirror fails,
then we should bump the mirror index, so that we retry to a different
mirror. Once we've iterated through all mirrors and all failed, we can
return the layout and issue a new LAYOUTGET.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-11-02 13:50:35 -05:00