Commit graph

598 commits

Author SHA1 Message Date
Trond Myklebust 961a828df6 SUNRPC: Fix potential races in xprt_lock_write_next()
We have to ensure that the wake up from the waitqueue and the assignment
of xprt->snd_task are atomic. We can do this by assigning the snd_task
while under the waitqueue spinlock.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-01-31 19:28:08 -05:00
Trond Myklebust 536e43d12b NFS: Optimise away unnecessary setattrs for open(O_TRUNC);
Currently, we will correctly optimise away a truncate that doesn't
change the file size. However, in the case of open(O_TRUNC), we
also want to optimise away the time changes.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-01-31 19:28:07 -05:00
Trond Myklebust 48c22eb210 NFS: Move struct nfs_unique_id into struct nfs_seqid_counter
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-01-31 19:28:07 -05:00
Trond Myklebust 9d12b216aa NFSv41: Add a new helper nfs4_init_sequence()
Clean up

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-01-31 19:28:06 -05:00
Trond Myklebust d2d7ce28a2 NFSv4: Replace lock_owner->ld_id with an ida based allocator
Again, We're unlikely to ever need more than 2^31 simultaneous lock
owners, so let's replace the custom allocator.

Now that there are no more users, we can also get rid of the custom
allocator code.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-01-31 18:20:28 -05:00
Trond Myklebust 9157c31dd6 NFSv4: Replace state_owner->so_owner_id with an ida based allocator
We're unlikely to ever need more than 2^31 simultaneous open owners,
so let's replace the custom allocator with the generic ida allocator.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-01-31 18:20:28 -05:00
Trond Myklebust d1e284d50a NFSv4: Clean up nfs4_get_state_owner
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-01-31 18:20:28 -05:00
Peng Tao de040beccd NFS4: fix compile warnings in nfs4proc.c
compile in nfs-for-3.3 branch shows following warnings. Fix it here.

fs/nfs/nfs4proc.c: In function ‘__nfs4_get_acl_uncached’:
fs/nfs/nfs4proc.c:3589: warning: format ‘%ld’ expects type ‘long int’, but argument 4 has type ‘size_t’
fs/nfs/nfs4proc.c:3589: warning: format ‘%ld’ expects type ‘long int’, but argument 6 has type ‘size_t’

Signed-off-by: Peng Tao <peng_tao@emc.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-01-12 16:31:51 -05:00
Linus Torvalds 57eccf1c2a Merge branch 'nfs-for-3.3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
* 'nfs-for-3.3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  NFSv4: Change the default setting of the nfs4_disable_idmapping parameter
  NFSv4: Save the owner/group name string when doing open
  NFS: Remove pNFS bloat from the generic write path
  pnfs-obj: Must return layout on IO error
  pnfs-obj: pNFS errors are communicated on iodata->pnfs_error
  NFS: Cache state owners after files are closed
  NFS: Clean up nfs4_find_state_owners_locked()
  NFSv4: include bitmap in nfsv4 get acl data
  nfs: fix a minor do_div portability issue
  NFSv4.1: cleanup comment and debug printk
  NFSv4.1: change nfs4_free_slot parameters for dynamic slots
  NFSv4.1: cleanup init and reset of session slot tables
  NFSv4.1: fix backchannel slotid off-by-one bug
  nfs: fix regression in handling of context= option in NFSv4
  NFS - fix recent breakage to NFS error handling.
  NFS: Retry mounting NFSROOT
  SUNRPC: Clean up the RPCSEC_GSS service ticket requests
2012-01-10 14:57:40 -08:00
Trond Myklebust 6926afd192 NFSv4: Save the owner/group name string when doing open
...so that we can do the uid/gid mapping outside the asynchronous RPC
context.
This fixes a bug in the current NFSv4 atomic open code where the client
isn't able to determine what the true uid/gid fields of the file are,
(because the asynchronous nature of the OPEN call denies it the ability
to do an upcall) and so fills them with default values, marking the
inode as needing revalidation.
Unfortunately, in some cases, the VFS will do some additional sanity
checks on the file, and may override the server's decision to allow
the open because it sees the wrong owner/group fields.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-01-07 13:22:46 -05:00
Andy Adamson bf118a342f NFSv4: include bitmap in nfsv4 get acl data
The NFSv4 bitmap size is unbounded: a server can return an arbitrary
sized bitmap in an FATTR4_WORD0_ACL request.  Replace using the
nfs4_fattr_bitmap_maxsz as a guess to the maximum bitmask returned by a server
with the inclusion of the bitmap (xdr length plus bitmasks) and the acl data
xdr length to the (cached) acl page data.

This is a general solution to commit e5012d1f "NFSv4.1: update
nfs4_fattr_bitmap_maxsz" and fixes hitting a BUG_ON in xdr_shrink_bufhead
when getting ACLs.

Fix a bug in decode_getacl that returned -EINVAL on ACLs > page when getxattr
was called with a NULL buffer, preventing ACL > PAGE_SIZE from being retrieved.

Cc: stable@kernel.org
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-01-05 10:42:42 -05:00
Andy Adamson 0b1c8fc43c NFSv4.1: cleanup comment and debug printk
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-01-05 10:42:41 -05:00
Andy Adamson aabd0b40b3 NFSv4.1: change nfs4_free_slot parameters for dynamic slots
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-01-05 10:42:41 -05:00
Andy Adamson aacd553727 NFSv4.1: cleanup init and reset of session slot tables
We are either initializing or resetting a session. Initialize or reset
the session slot tables accordingly.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-01-05 10:42:40 -05:00
Rafael J. Wysocki b00f4dc5ff Merge branch 'master' into pm-sleep
* master: (848 commits)
  SELinux: Fix RCU deref check warning in sel_netport_insert()
  binary_sysctl(): fix memory leak
  mm/vmalloc.c: remove static declaration of va from __get_vm_area_node
  ipmi_watchdog: restore settings when BMC reset
  oom: fix integer overflow of points in oom_badness
  memcg: keep root group unchanged if creation fails
  nilfs2: potential integer overflow in nilfs_ioctl_clean_segments()
  nilfs2: unbreak compat ioctl
  cpusets: stall when updating mems_allowed for mempolicy or disjoint nodemask
  evm: prevent racing during tfm allocation
  evm: key must be set once during initialization
  mmc: vub300: fix type of firmware_rom_wait_states module parameter
  Revert "mmc: enable runtime PM by default"
  mmc: sdhci: remove "state" argument from sdhci_suspend_host
  x86, dumpstack: Fix code bytes breakage due to missing KERN_CONT
  IB/qib: Correct sense on freectxts increment and decrement
  RDMA/cma: Verify private data length
  cgroups: fix a css_set not found bug in cgroup_attach_proc
  oprofile: Fix uninitialized memory access when writing to writing to oprofilefs
  Revert "xen/pv-on-hvm kexec: add xs_reset_watches to shutdown watches from old kernel"
  ...

Conflicts:
	kernel/cgroup_freezer.c
2011-12-21 21:59:45 +01:00
Trond Myklebust 652f89f64f NFSv4: Do not accept delegated opens when a delegation recall is in effect
...and report the servers that try to return a delegation when the client
is using the CLAIM_DELEG_CUR open mode. That behaviour is explicitly
forbidden in RFC3530.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-12-09 19:05:58 -05:00
Jeff Layton d310310cbf Freezer / sunrpc / NFS: don't allow TASK_KILLABLE sleeps to block the freezer
Allow the freezer to skip wait_on_bit_killable sleeps in the sunrpc
layer. This should allow suspend and hibernate events to proceed, even
when there are RPC's pending on the wire.

Also, wrap the TASK_KILLABLE sleeps in NFS layer in freezer_do_not_count
and freezer_count calls. This allows the freezer to skip tasks that are
sleeping while looping on EJUKEBOX or NFS4ERR_DELAY sorts of errors.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-12-06 22:12:27 +01:00
Trond Myklebust a6f498a891 NFS: Fix a regression in the referral code
Fix a regression that was introduced by commit
0c2e53f11a (NFS: Remove the unused
"lookupfh()" version of nfs4_proc_lookup()).

In the case where the lookup gets an NFS4ERR_MOVED, we want to return
the result of nfs4_get_referral(). Instead, that value is getting
clobbered by the call to nfs4_handle_exception()...

Reported-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de>
Tested-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-11-07 16:20:00 -05:00
Jeff Layton 1788ea6e3b nfs: when attempting to open a directory, fall back on normal lookup (try #5)
commit d953126 changed how nfs_atomic_lookup handles an -EISDIR return
from an OPEN call. Prior to that patch, that caused the client to fall
back to doing a normal lookup. When that patch went in, the code began
returning that error to userspace. The d_revalidate codepath however
never had the corresponding change, so it was still possible to end up
with a NULL ctx->state pointer after that.

That patch caused a regression. When we attempt to open a directory that
does not have a cached dentry, that open now errors out with EISDIR. If
you attempt the same open with a cached dentry, it will succeed.

Fix this by reverting the change in nfs_atomic_lookup and allowing
attempts to open directories to fall back to a normal lookup

Also, add a NFSv4-specific f_ops->open routine that just returns
-ENOTDIR. This should never be called if things are working properly,
but if it ever is, then the dprintk may help in debugging.

To facilitate this, a new file_operations field is also added to the
nfs_rpc_ops struct.

Cc: stable@kernel.org
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-11-04 16:39:04 -04:00
Peng Tao 92407e75ce nfs4: serialize layoutcommit
Current pnfs_layoutcommit_inode can not handle parallel layoutcommit.
And as Trond suggested , there is no need for client to optimize for
parallel layoutcommit. So add NFS_INO_LAYOUTCOMMITTING flag to
mark inflight layoutcommit and serialize lalyoutcommit with it.
Also mark_inode_dirty_sync if pnfs_layoutcommit_inode fails to issue
layoutcommit.

Reported-by: Vitaliy Gusev <gusev.vitaliy@nexenta.com>
Signed-off-by: Peng Tao <peng_tao@emc.com>
Signed-off-by: Jim Rees <rees@umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-10-31 11:51:28 -04:00
Trond Myklebust d00c5d4386 NFS: Get rid of nfs_restart_rpc()
It can trivially be replaced with rpc_restart_call_prepare.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-10-19 13:58:30 -07:00
Trond Myklebust 08ef7bd3bc NFSv4: Translate NFS4ERR_BADNAME into ENOENT when applied to a lookup
Both LOOKUP and OPEN operations may return NFS4ERR_BADNAME if we send a
an invalid name as a filename argument. As far as the application is
concerned, it just has to know that the file doesn't exist, and so
ENOENT would be the appropriate reply. We should only return EINVAL
if the filename is being used to _create_ a new object on the
remote filesystem.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-10-18 16:13:51 -07:00
Trond Myklebust 0c2e53f11a NFS: Remove the unused "lookupfh()" version of nfs4_proc_lookup()
...and also remove the associated nfs_v4_clientops entry.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-10-18 16:13:51 -07:00
Trond Myklebust a9a4a87a59 NFS: Use the inode->i_version to cache NFSv4 change attribute information
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-10-18 09:14:34 -07:00
Trond Myklebust 042b60beb4 NFSv4: renewd needs to be able to handle the NFS4ERR_CB_PATH_DOWN error
The NFSv4 spec does not specify that the server must repeat that error,
so in order to avoid having the delegations revoked, we should handle
it immediately.

Also note that NFS4ERR_CB_PATH_DOWN does in fact renew the lease...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-08-24 15:07:37 -04:00
Trond Myklebust 2f60ea6b8c NFSv4: The NFSv4.0 client must send RENEW calls if it holds a delegation
RFC3530 states that if the client holds a delegation, then it is obliged
to continue to send RENEW calls once every lease period in order to allow
the server to return NFS4ERR_CB_PATH_DOWN if the callback path is
unreachable.

This is not required for NFSv4.1, since the server can at any time set
the SEQ4_STATUS_CB_PATH_DOWN_SESSION in any SEQUENCE operation.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-08-24 15:07:37 -04:00
Trond Myklebust 8534d4ec05 NFSv4: nfs4_proc_renew should be declared static
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-08-24 15:07:37 -04:00
Trond Myklebust b569ad3492 NFSv4: nfs4_proc_async_renew should use a GFP_NOFS allocation
We shouldn't allow the renew daemon to do direct reclaim on the NFS
partition.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-08-24 15:07:35 -04:00
Andy Adamson db29c08909 pnfs: cleanup_layoutcommit
This gives layout driver a chance to cleanup structures they put in at
encode_layoutcommit.

Signed-off-by: Andy Adamson <andros@netapp.com>
[fixup layout header pointer for layoutcommit]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Benny Halevy <bhalevy@tonian.com>
[rm inode and pnfs_layout_hdr args from cleanup_layoutcommit()]
Signed-off-by: Jim Rees <rees@umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-07-31 12:18:15 -04:00
Fred Isaman dae100c2b1 pnfs: ask for layout_blksize and save it in nfs_server
Block layout needs it to determine IO size.

Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
Signed-off-by: Tao Guo <glorioustao@gmail.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: Jim Rees <rees@umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-07-31 12:18:15 -04:00
Andy Adamson 7f11d8d38d pnfs: GETDEVICELIST
The block driver uses GETDEVICELIST

Signed-off-by: Andy Adamson <andros@netapp.com>
[pass struct nfs_server * to getdevicelist]
[get machince creds for getdevicelist]
[fix getdevicelist decode sizing]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: Jim Rees <rees@umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-07-31 12:18:15 -04:00
Peng Tao a9bae5666d pnfs: let layoutcommit handle a list of lseg
There can be multiple lseg per file, so layoutcommit should be
able to handle it.

[Needed in v3.0]
CC: Stable Tree <stable@kernel.org>
Signed-off-by: Peng Tao <peng_tao@emc.com>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Jim Rees <rees@umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-07-31 12:18:15 -04:00
Stephen Rothwell 5f00bcb38e Merge branch 'master' into devel and apply fixup from Stephen Rothwell:
vfs/nfs: fixup for nfs_open_context change

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-07-25 14:53:52 -04:00
Al Viro 3d4ff43d89 nfs_open_context doesn't need struct path either
just dentry, please...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20 01:43:44 -04:00
Al Viro 82a2c1b77a nfs4_opendata doesn't need struct path either
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20 01:43:42 -04:00
Al Viro 643168c2dc nfs4_closedata doesn't need to mess with struct path
instead of path_get()/path_put(), we can just use nfs_sb_{,de}active()
to pin the superblock down.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20 01:43:41 -04:00
Trond Myklebust a56aaa02b1 NFSv4.1: Clean up layoutreturn
Since we take a reference to it, we really ought to pass the a pointer to
the layout header in the arguments instead of assuming that
NFS_I(inode)->layout will forever point to the correct object.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-07-12 13:40:29 -04:00
Bryan Schumaker f062eb6ced NFS: test and free stateids during recovery
When recovering open files and locks, the stateid should be tested
against the server and freed if it is invalid.  This patch adds new
recovery functions for NFS v4.1.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-07-12 13:40:28 -04:00
Bryan Schumaker 9aeda35fd6 NFS: added FREE_STATEID call
FREE_STATEID is used to tell the server that we want to free a stateid
that no longer has any locks associated with it.  This allows the client
to reclaim locks without encountering edge conditions documented in
section 8.4.3 of RFC 5661.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-07-12 13:40:28 -04:00
Bryan Schumaker 7d9747947a NFS: Added TEST_STATEID call
This patch adds in the xdr for doing a TEST_STATEID call with a single
stateid. RFC 5661 allows multiple stateids to be tested in a single
call, but only testing one keeps things simpler for now.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-07-12 13:40:27 -04:00
Bryan Schumaker fca78d6d2c NFS: Add SECINFO_NO_NAME procedure
If the client is using NFS v4.1, then we can use SECINFO_NO_NAME to find
the secflavor for the initial mount.  If the server doesn't support
SECINFO_NO_NAME then I fall back on the "guess and check" method used
for v4.0 mounts.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-07-12 13:40:27 -04:00
Weston Andros Adamson 78fe0f41d9 NFS: use scope from exchange_id to skip reclaim
can be skipped if the "eir_server_scope" from the exchange_id proc differs from
previous calls.

Also, in the future server_scope will be useful for determining whether client
trunking is available

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-07-12 13:40:27 -04:00
Fred Isaman a2e1d4f2e5 nfs4.1: fix several problems with _pnfs_return_layout
_pnfs_return_layout had the following problems:

- it did not call pnfs_free_lseg_list on all paths
- it unintentionally did a forgetful return when there was no outstanding io
- it raced with concurrent LAYOUTGETS

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-06-15 11:24:30 -04:00
Andy Adamson 533eb4611c NFSv4.1: allow nfs_fhget to succeed with mounted on fileid
Commit 28331a46d8 "Ensure we request the
ordinary fileid when doing readdirplus"
changed the meaning of NFS_ATTR_FATTR_FILEID which used to be set when
FATTR4_WORD1_MOUNTED_ON_FILED was requested.

Allow nfs_fhget to succeed with only a mounted on fileid when crossing
a mountpoint or a referral.

Ask for the fileid of the absent file system if mounted_on_fileid is not
supported.

Signed-off-by: Andy Adamson <andros@netapp.com>
cc:stable@kernel.org [2.6.39]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-06-15 11:24:29 -04:00
Benny Halevy c9c30dd5f7 NFSv4.1: deprecate headerpadsz in CREATE_SESSION
We don't support header padding yet so better off ditching it

Reported-by: Sid Moore <learnmost@gmail.com>
Signed-off-by: Benny Halevy <benny@tonian.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-06-15 11:24:28 -04:00
Linus Torvalds cd1acdf172 Merge branch 'pnfs-submit' of git://git.open-osd.org/linux-open-osd
* 'pnfs-submit' of git://git.open-osd.org/linux-open-osd: (32 commits)
  pnfs-obj: pg_test check for max_io_size
  NFSv4.1: define nfs_generic_pg_test
  NFSv4.1: use pnfs_generic_pg_test directly by layout driver
  NFSv4.1: change pg_test return type to bool
  NFSv4.1: unify pnfs_pageio_init functions
  pnfs-obj: objlayout_encode_layoutcommit implementation
  pnfs: encode_layoutcommit
  pnfs-obj: report errors and .encode_layoutreturn Implementation.
  pnfs: encode_layoutreturn
  pnfs: layoutret_on_setattr
  pnfs: layoutreturn
  pnfs-obj: osd raid engine read/write implementation
  pnfs: support for non-rpc layout drivers
  pnfs-obj: define per-inode private structure
  pnfs: alloc and free layout_hdr layoutdriver methods
  pnfs-obj: objio_osd device information retrieval and caching
  pnfs-obj: decode layout, alloc/free lseg
  pnfs-obj: pnfs_osd XDR client implementation
  pnfs-obj: pnfs_osd XDR definitions
  pnfs-obj: objlayoutdriver module skeleton
  ...
2011-05-29 14:10:13 -07:00
Benny Halevy 8a1636c459 pnfs: layoutret_on_setattr
With the objects layout security model, we have object capabilities
that are associated with the layout and we anticipate that the server
will issue a cb_layoutrecall for any setattr that changes security
related attributes (user/group/mode/acl) or truncates the file.

Therefore, the layout is returned before issuing the setattr to avoid
the anticipated cb_layoutrecall.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
2011-05-29 20:54:36 +03:00
Benny Halevy cbe8260369 pnfs: layoutreturn
NFSv4.1 LAYOUTRETURN implementation

Currently, does not support layout-type payload encoding.

Signed-off-by: Alexandros Batsakis <batsakis@netapp.com>
Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com>
Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Marc Eshel <eshel@almaden.ibm.com>
Signed-off-by: Zhang Jingwang <zhangjingwang@nrchpc.ac.cn>
[call pnfs_return_layout right before pnfs_destroy_layout]
[remove assert_spin_locked from pnfs_clear_lseg_list]
[remove wait parameter from the layoutreturn path.]
[remove return_type field from nfs4_layoutreturn_args]
[remove range from nfs4_layoutreturn_args]
[no need to send layoutcommit from _pnfs_return_layout]
[don't wait on sync layoutreturn]
[fix layout stateid in layoutreturn args]
[fixed NULL deref in _pnfs_return_layout]
[removed recaim member of nfs4_layoutreturn_args]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
2011-05-29 20:54:36 +03:00
Benny Halevy d20581aa4b pnfs: support for non-rpc layout drivers
Non-rpc layout driver such as for objects and blocks
implement their own I/O path and error handling logic.
Therefore bypass NFS-based error handling for these layout drivers.

[fix lseg ref-count bugs, and null de-refs]
[Fall out from: non-rpc layout drivers]
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
[get rid of PNFS_USE_RPC_CODE]
[get rid of __nfs4_write_done_cb]
[revert useless change in nfs4_write_done_cb]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
2011-05-29 20:53:51 +03:00
Trond Myklebust 0ced63d1a2 NFSv4: Handle expired stateids when the lease is still valid
Currently, if the server returns NFS4ERR_EXPIRED in reply to a READ or
WRITE, but the RENEW test determines that the lease is still active, we
fail to recover and end up looping forever in a READ/WRITE + RENEW death
spiral.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org
2011-05-27 17:42:01 -04:00
Andy Adamson a8a4ae3a89 NFSv41: Resend on NFS4ERR_RETRY_UNCACHED_REP
Free the slot and resend the RPC with new session <slot#,seq#>.

For nfs4_async_handle_error, return -EAGAIN and set the task->tk_status to 0
to restart the async rpc in the rpc_restart_call_prepare state which resets
the slot.

For nfs4_handle_exception, retrying a call that uses nfs4_call_sync will
reset the slot via nfs41_call_sync_prepare.

For open/close/lock/locku/delegreturn/layoutcommit/unlink/rename/write
cachethis is true, so these operations will not trigger an
NFS4ERR_RETRY_UNCACHED_REP.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-05-11 14:01:33 -04:00
Trond Myklebust 1bd714f2a1 NFSv4: Ensure that clientid and session establishment can time out
The following patch ensures that we do not get permanently trapped in
the RPC layer when trying to establish a new client id or session.
This again ensures that the state manager can finish in a timely
fashion when the last filesystem to reference the nfs_client exits.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-04-24 14:29:33 -04:00
Trond Myklebust fd954ae124 NFSv4.1: Don't loop forever in nfs4_proc_create_session
If a server for some reason keeps sending NFS4ERR_DELAY errors, we can end
up looping forever inside nfs4_proc_create_session, and so the usual
mechanisms for detecting if the nfs_client is dead don't work.

Fix this by ensuring that we loop inside the nfs4_state_manager thread
instead.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-04-24 14:28:18 -04:00
Bryan Schumaker fb8a5ba811 NFSv4: Handle NFS4ERR_WRONGSEC outside of nfs4_handle_exception()
I only want to try other secflavors during an initial mount if
NFS4ERR_WRONGSEC is returned.  nfs4_handle_exception() could
potentially map other errors to EPERM, so we should handle this
error specially for correctness.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-04-18 17:06:00 -04:00
Bryan Schumaker 468f86134e NFSv4.1: Don't update sequence number if rpc_task is not sent
If we fail to contact the gss upcall program, then no message will
be sent to the server.  The client still updated the sequence number,
however, and this lead to NFS4ERR_SEQ_MISMATCH for the next several
RPC calls.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-04-18 17:05:48 -04:00
Bryan Schumaker 9b7160c55a NFS: don't negotiate when user specifies sec flavor
We were always attempting sec flavor negotiation, even if the user
told us a specific sec flavor to use.  If that sec flavor fails,
we should return an error rather than continuing with sec flavor
negotiation.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-04-13 15:12:23 -04:00
Bryan Schumaker 801a16dc7b NFS: Attempt mount with default sec flavor first
nfs4_lookup_root() is already configured to use either RPC_AUTH_UNIX
or a user specified flavor (through -o sec=<whatever>).  We should
use this flavor first, and only attempt negotiation if it fails
with -EPERM.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-04-13 15:12:23 -04:00
Bryan Schumaker 0fabee243a NFS: flav_array honors NFS_MAX_SECFLAVORS
NFS_MAX_SECFLAVORS should already take into account RPC_AUTH_UNIX
and RPC_AUTH_NULL, so we don't need to set aside extra slots
for them.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-04-13 15:12:22 -04:00
Bryan Schumaker d1a8016a2d NFS: Fix infinite loop in gss_create_upcall()
There can be an infinite loop if gss_create_upcall() is called without
the userspace program running.  To prevent this, we return -EACCES if
we notice that pipe_version hasn't changed (indicating that the pipe
has not been opened).

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-04-13 15:12:22 -04:00
Bryan Schumaker 37adb89fad NFS: Change initial mount authflavor only when server returns NFS4ERR_WRONGSEC
When attempting an initial mount, we should only attempt other
authflavors if AUTH_UNIX receives a NFS4ERR_WRONGSEC error.
This allows other errors to be passed back to userspace programs.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-04-07 13:19:40 -07:00
Trond Myklebust 0acd220192 Merge branch 'nfs-for-2.6.39' into nfs-for-next 2011-03-24 17:03:14 -04:00
Weston Andros Adamson 35124a0994 Cleanup XDR parsing for LAYOUTGET, GETDEVICEINFO
changes LAYOUTGET and GETDEVICEINFO XDR parsing to:
 - not use vmap, which doesn't work on incoherent archs
 - use xdr_stream parsing for all xdr

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-24 17:01:41 -04:00
Andy Adamson ef31153786 NFSv4.1 convert layoutcommit sync to boolean
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-24 15:49:48 -04:00
Bryan Schumaker 8f70e95f9f NFS: Determine initial mount security
When sec=<something> is not presented as a mount option,
we should attempt to determine what security flavor the
server is using.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-24 13:52:42 -04:00
Bryan Schumaker 7ebb931598 NFS: use secinfo when crossing mountpoints
A submount may use different security than the parent
mount does.  We should figure out what sec flavor the
submount uses at mount time.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-24 13:52:42 -04:00
Bryan Schumaker 5a5ea0d485 NFS: Add secinfo procedure
This patch adds the nfs4 operation secinfo as a
valid nfs rpc operation.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-24 13:52:41 -04:00
Bryan Schumaker 7c5130588d NFS: lookup supports alternate client
A later patch will need to perform a lookup using an
alternate client with a different security flavor.
This patch adds support for doing that on NFS v4.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-24 13:52:41 -04:00
Bryan Schumaker e73b83f270 NFS: convert call_sync() to a function
This patch changes nfs4_call_sync() from a macro into a
static inline function.  As a macro, the call_sync()
function will not do any type checking and depends
on the sequence arguments always having the same name.
As a function, we get to have type checking and can
rename the arguments if we so choose.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-24 13:52:41 -04:00
Andy Adamson 863a3c6c68 NFSv4.1: layoutcommit
The filelayout driver sends LAYOUTCOMMIT only when COMMIT goes to
the data server (as opposed to the MDS) and the data server WRITE
is not NFS_FILE_SYNC.

Only whole file layout support means that there is only one IOMODE_RW layout
segment.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Alexandros Batsakis <batsakis@netapp.com>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com>
Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
Signed-off-by: Mingyang Guo <guomingyang@nrchpc.ac.cn>
Signed-off-by: Tao Guo <guotao@nrchpc.ac.cn>
Signed-off-by: Zhang Jingwang <zhangjingwang@nrchpc.ac.cn>
Tested-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-23 15:29:04 -04:00
Fred Isaman 988b6dceb0 NFSv4.1: remove GETATTR from ds commits
Any COMMIT compound directed to a data server needs to have the
GETATTR calls suppressed.  We here, make sure the field we are testing
(data->lseg) is set and refcounted correctly.

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-23 15:29:03 -04:00
Fred Isaman 5f452431e2 NFSv4.1: add callback to nfs4_commit_done
Add a callback that the pnfs layout driver can use to do its own
error handling of the data server's COMMIT response.

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-23 15:29:03 -04:00
Trond Myklebust b064eca2cf NFSv4: Send unmapped uid/gids to the server when using auth_sys
The new behaviour is enabled using the new module parameter
'nfs4_disable_idmapping'.

Note that if the server rejects an unmapped uid or gid, then
the client will automatically switch back to using the idmapper.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-11 15:39:27 -05:00
Trond Myklebust 3ddeb7c5c6 NFSv4: Propagate the error NFS4ERR_BADOWNER to nfs4_do_setattr
This will be required in order to switch uid/gid mapping back on if the
admin has tried to disable it.

Note that we also propagate NFS4ERR_BADNAME at the same time, in order to
work around a Linux server bug.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-11 15:39:27 -05:00
Fred Isaman a69aef1496 NFSv4.1: pnfs filelayout driver write
Allows the pnfs filelayout driver to write to the data servers.

Note that COMMIT to data servers will be implemented in a future
patch.  To avoid improper behavior, for the moment any WRITE to a data
server that would also require a COMMIT to the data server is sent
NFS_FILE_SYNC.

Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com>
Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
Signed-off-by: Mingyang Guo <guomingyang@nrchpc.ac.cn>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-11 15:38:44 -05:00
Fred Isaman 7ffd10640d NFSv4.1: remove GETATTR from ds writes
Any WRITE compound directed to a data server needs to have the
GETATTR calls suppressed.

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-11 15:38:44 -05:00
Fred Isaman b029bc9b08 NFSv4.1: add callback to nfs4_write_done
Add callback that pnfs layout driver can use to do its own handling
of data server WRITE response.

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-11 15:38:43 -05:00
Andy Adamson cbdabc7f8b NFSv4.1: filelayout async error handler
Use our own async error handler.
Mark the layout as failed and retry i/o through the MDS on specified errors.

Update the mds_offset in nfs_readpage_retry so that a failed short-read retry
to a DS gets correctly resent through the MDS.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-11 15:38:43 -05:00
Andy Adamson dc70d7b318 NFSv4.1: filelayout read
Attempt a pNFS file layout read by setting up the nfs_read_data struct and
calling nfs_initiate_read with the data server rpc client and the
filelayout rpc call ops.

Error handling is implemented in a subsequent patch.

Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com>
Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Mingyang Guo <guomingyang@nrchpc.ac.cn>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Tested-by: Guo Mingyang <guomingyang@nrchpc.ac.cn>
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-11 15:38:43 -05:00
Andy Adamson d83217c135 NFSv4.1: data server connection
Introduce a data server set_client and init session following the
nfs4_set_client and  nfs4_init_session convention.

Once a new nfs_client is on the nfs_client_list, the nfs_client cl_cons_state
serializes access to creating an nfs_client struct with matching properties.

Use the new nfs_get_client() that initializes new clients.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-11 15:38:42 -05:00
Andy Adamson 45a52a0207 NFS move nfs_client initialization into nfs_get_client
Now nfs_get_client returns an nfs_client ready to be used no matter if it was
found or created.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-11 15:38:41 -05:00
Huang Weiyi 57df216bd8 nfs4: remove duplicated #include
Remove duplicated #include('s) in
  fs/nfs/nfs4proc.c

Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-11 15:18:37 -05:00
Trond Myklebust ecac799a5e NFSv4: Fix the setlk error handler
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-11 15:18:36 -05:00
Trond Myklebust b4410c2f7f NFSv4.1: Fix the handling of the SEQUENCE status bits
We want SEQUENCE status bits to be handled by the state manager in order
to avoid threading issues.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-11 15:18:35 -05:00
Trond Myklebust 0400a6b0cb NFSv4/4.1: Fix nfs4_schedule_state_recovery abuses
nfs4_schedule_state_recovery() should only be used when we need to force
the state manager to check the lease. If we just want to start the
state manager in order to handle a state recovery situation, we should be
using nfs4_schedule_state_manager().

This patch fixes the abuses of nfs4_schedule_state_recovery() by replacing
its use with a set of helper functions that do the right thing.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-11 15:18:22 -05:00
Andy Adamson c34c32ea97 NFSv4.1 reclaim complete must wait for completion
Signed-off-by: Andy Adamson <andros@netapp.com>
[Trond: fix whitespace errors]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-10 15:05:01 -05:00
Ricardo Labiaga 7d6d63d642 NFSv4.1: Retry CREATE_SESSION on NFS4ERR_DELAY
Fix bug where we currently retry the EXCHANGEID call again, eventhough
we already have a valid clientid.  Instead, delay and retry the CREATE_SESSION
call.

Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-10 15:04:59 -05:00
Jovi Zhang 43b7c3f051 nfs: fix compilation warning
this commit fix compilation warning as following:
linux-2.6/fs/nfs/nfs4proc.c:3265: warning: comparison of distinct pointer types lacks a cast

Signed-off-by: Jovi Zhang <bookjovi@gmail.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-10 15:04:56 -05:00
Trond Myklebust bf294b41ce SUNRPC: Close a race in __rpc_wait_for_completion_task()
Although they run as rpciod background tasks, under normal operation
(i.e. no SIGKILL), functions like nfs_sillyrename(), nfs4_proc_unlck()
and nfs4_do_close() want to be fully synchronous. This means that when we
exit, we want all references to the rpc_task to be gone, and we want
any dentry references etc. held by that task to be released.

For this reason these functions call __rpc_wait_for_completion_task(),
followed by rpc_put_task() in the expectation that the latter will be
releasing the last reference to the rpc_task, and thus ensuring that the
callback_ops->rpc_release() has been called synchronously.

This patch fixes a race which exists due to the fact that
rpciod calls rpc_complete_task() (in order to wake up the callers of
__rpc_wait_for_completion_task()) and then subsequently calls
rpc_put_task() without ensuring that these two steps are done atomically.

In order to avoid adding new spin locks, the patch uses the existing
waitqueue spin lock to order the rpc_task reference count releases between
the waiting process and rpciod.
The common case where nobody is waiting for completion is optimised for by
checking if the RPC_TASK_ASYNC flag is cleared and/or if the rpc_task
reference count is 1: in those cases we drop trying to grab the spin lock,
and immediately free up the rpc_task.

Those few processes that need to put the rpc_task from inside an
asynchronous context and that do not care about ordering are given a new
helper: rpc_put_task_async().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-10 15:04:52 -05:00
Neil Horman e9e3d724e2 nfs4: Ensure that ACL pages sent over NFS were not allocated from the slab (v3)
The "bad_page()" page allocator sanity check was reported recently (call
chain as follows):

  bad_page+0x69/0x91
  free_hot_cold_page+0x81/0x144
  skb_release_data+0x5f/0x98
  __kfree_skb+0x11/0x1a
  tcp_ack+0x6a3/0x1868
  tcp_rcv_established+0x7a6/0x8b9
  tcp_v4_do_rcv+0x2a/0x2fa
  tcp_v4_rcv+0x9a2/0x9f6
  do_timer+0x2df/0x52c
  ip_local_deliver+0x19d/0x263
  ip_rcv+0x539/0x57c
  netif_receive_skb+0x470/0x49f
  :virtio_net:virtnet_poll+0x46b/0x5c5
  net_rx_action+0xac/0x1b3
  __do_softirq+0x89/0x133
  call_softirq+0x1c/0x28
  do_softirq+0x2c/0x7d
  do_IRQ+0xec/0xf5
  default_idle+0x0/0x50
  ret_from_intr+0x0/0xa
  default_idle+0x29/0x50
  cpu_idle+0x95/0xb8
  start_kernel+0x220/0x225
  _sinittext+0x22f/0x236

It occurs because an skb with a fraglist was freed from the tcp
retransmit queue when it was acked, but a page on that fraglist had
PG_Slab set (indicating it was allocated from the Slab allocator (which
means the free path above can't safely free it via put_page.

We tracked this back to an nfsv4 setacl operation, in which the nfs code
attempted to fill convert the passed in buffer to an array of pages in
__nfs4_proc_set_acl, which gets used by the skb->frags list in
xs_sendpages.  __nfs4_proc_set_acl just converts each page in the buffer
to a page struct via virt_to_page, but the vfs allocates the buffer via
kmalloc, meaning the PG_slab bit is set.  We can't create a buffer with
kmalloc and free it later in the tcp ack path with put_page, so we need
to either:

1) ensure that when we create the list of pages, no page struct has
   PG_Slab set

 or

2) not use a page list to send this data

Given that these buffers can be multiple pages and arbitrarily sized, I
think (1) is the right way to go.  I've written the below patch to
allocate a page from the buddy allocator directly and copy the data over
to it.  This ensures that we have a put_page free-able page for every
entry that winds up on an skb frag list, so it can be safely freed when
the frame is acked.  We do a put page on each entry after the
rpc_call_sync call so as to drop our own reference count to the page,
leaving only the ref count taken by tcp_sendpages.  This way the data
will be properly freed when the ack comes in

Successfully tested by myself to solve the above oops.

Note, as this is the result of a setacl operation that exceeded a page
of data, I think this amounts to a local DOS triggerable by an
uprivlidged user, so I'm CCing security on this as well.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Trond Myklebust <Trond.Myklebust@netapp.com>
CC: security@kernel.org
CC: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-04 17:28:52 -08:00
Andy Adamson c7a360b05b NFS construct consistent co_ownerid for v4.1
As stated in section 2.4 of RFC 5661, subsequent instances of the client need
to present the same co_ownerid. Concatinate the client's IP dot address,
host name, and the rpc_auth pseudoflavor to form the co_ownerid.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-01-25 22:49:14 -05:00
Andy Adamson 357f54d6b3 NFS fix the setting of exchange id flag
Indicate support for referrals. Do not set any PNFS roles. Check the flags
returned by the server for validity. Do not use exchange flags from an old
client ID instance when recovering a client ID.

Update the EXCHID4_FLAG_XXX set to RFC 5661.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-01-11 14:17:09 -05:00
Trond Myklebust d035c36c58 NFSv4: Ensure continued open and lockowner name uniqueness
In order to enable migration support, we will want to move some of the
structures that are subject to migration into the struct nfs_server.
In particular, if we are to move the state_owner and state_owner_id to
being a per-filesystem structure, then we should label the resulting
open/lock owners with a per-filesytem label to ensure global uniqueness.

This patch does so by adding the super block s_dev to the open/lock owner
name.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-01-06 16:03:13 -05:00
Fred Isaman f7e8917a67 pnfs: layout roc code
A layout can request return-on-close.  How this interacts with the
forgetful model of never sending LAYOUTRETURNS is a bit ambiguous.
We forget any layouts marked roc, and wait for them to be completely
forgotten before continuing with the close.  In addition, to compensate
for races with any inflight LAYOUTGETs, and the fact that we do not get
any layout stateid back from the server, we set the barrier to the worst
case scenario of current_seqid + number of outstanding LAYOUTGETS.

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-01-06 14:46:32 -05:00
Fred Isaman cf7d63f1f9 pnfs: serialize LAYOUTGET(openstateid)
We shouldn't send a LAYOUTGET(openstateid) unless all outstanding RPCs
using the previous stateid are completed.  This requires choosing the
stateid to encode earlier, so we can abort if one is not available (we
want to use the open stateid, but a LAYOUTGET is already out using
it), and adding a count of the number of outstanding rpc calls using
layout state (which for now consist solely of LAYOUTGETs).

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-01-06 14:46:31 -05:00
Fred Isaman c31663d4a1 pnfs: layoutget rpc code cleanup
No functional changes, just some code minor code rearrangement and
comments.

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-01-06 14:46:31 -05:00
Fred Isaman daaa82d1c7 pnfs: remove unnecessary field lgp->status
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-01-06 14:46:30 -05:00
Andy Adamson 42acd02182 NFS add session back channel draining
Currently session draining only drains the fore channel.
The back channel processing must also be drained.

Use the back channel highest_slot_used to indicate that a callback is being
processed by the callback thread.  Move the session complete to be per channel.

When the session is draininig, wait for any current back channel processing
to complete and stop all new back channel processing by returning NFS4ERR_DELAY
to the back channel client.

Drain the back channel, then the fore channel.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-01-06 14:46:25 -05:00
Andy Adamson f4eecd5da3 NFS implement v4.0 callback_ident
Use the small id to pointer translator service to provide a unique callback
identifier per SETCLIENTID call used to identify the v4.0 callback service
associated with the clientid.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-01-06 14:46:24 -05:00
Aneesh Kumar K.V 64c2ce8b72 nfsv4: Switch to generic xattr handling code
This patch make nfsv4 use the generic xattr handling code
to get the nfsv4 acl. This will help us to add richacl
support to nfsv4 in later patches

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-01-04 13:10:41 -05:00
Aneesh Kumar K.V a8a5da996d nfs: Set MS_POSIXACL always
We want to skip VFS applying mode for NFS. So set MS_POSIXACL always
and selectively use umask. Ideally we would want to use umask only
when we don't have inheritable ACEs set. But NFS currently don't
allow to send umask to the server. So this is best what we can do
and this is consistent with NFSv3

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-01-04 13:10:40 -05:00
Trond Myklebust 1174dd1f89 NFSv4: Convert a few commas into semicolons...
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-12-21 11:51:27 -05:00
J. Bruce Fields 611c96c8f7 nfs4: fix units bug causing hang on recovery
Note that cl_lease_time is in jiffies.  This can cause a very long wait
in the NFS4ERR_CLID_INUSE case.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-12-21 11:51:24 -05:00
Aneesh Kumar K.V 08a22b392a nfs: Discard ACL cache on mode update
An update of mode bits can result in ACL value being changed. We need
to mark the acl cache invalid when we update mode. Similarly we need
to update file attribute when we change ACL value

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-12-07 19:30:42 -05:00
Trond Myklebust ac39612824 NFS: readdir shouldn't read beyond the reply returned by the server
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-11-15 20:44:29 -05:00
Geert Uytterhoeven 12364a4f05 nfs4: The difference of 2 pointers is ptrdiff_t
On m68k, which is 32-bit:

fs/nfs/nfs4proc.c: In function ‘nfs41_sequence_done’:
fs/nfs/nfs4proc.c:432: warning: format ‘%ld’ expects type ‘long int’, but argument 3 has type ‘int’
fs/nfs/nfs4proc.c: In function ‘nfs4_setup_sequence’:
fs/nfs/nfs4proc.c:576: warning: format ‘%ld’ expects type ‘long int’, but argument 5 has type ‘int’

On 32-bit, ptrdiff_t is int; on 64-bit, ptrdiff_t is long.

Introduced by commit dfb4f30983 ("NFSv4.1: keep
seq_res.sr_slot as pointer rather than an index")

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-10-28 15:49:29 -04:00
J. Bruce Fields 43c2e885be nfs4: fix channel attribute sanity-checks
The sanity checks here are incorrect; in the worst case they allow
values that crash the client.

They're also over-reliant on the preprocessor.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-10-25 22:19:41 -04:00
Andy Adamson b1f69b754e NFSv4.1: pnfs: add LAYOUTGET and GETDEVICEINFO infrastructure
Add the ability to actually send LAYOUTGET and GETDEVICEINFO.  This also adds
in the machinery to handle layout state and the deviceid cache.  Note that
GETDEVICEINFO is not called directly by the generic layer.  Instead it
is called by the drivers while parsing the LAYOUTGET opaque data in response
to an unknown device id embedded therein.  RFC 5661 only encodes
device ids within the driver-specific opaque data.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Dean Hildebrand <dhildebz@umich.edu>
Signed-off-by: Marc Eshel <eshel@almaden.ibm.com>
Signed-off-by: Mike Sager <sager@netapp.com>
Signed-off-by: Ricardo Labiaga <ricardo.labiaga@netapp.com>
Signed-off-by: Tao Guo <guotao@nrchpc.ac.cn>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-10-24 18:07:10 -04:00
Andy Adamson 504913fbc8 NFS: ask for layouttypes during v4 fsinfo call
This information will be used to determine which layout driver,
if any, to use for subsequent IO on this filesystem.  Each driver
is assigned an integer id, with 0 reserved to indicate no driver.

The server can in theory return multiple ids.  However, our current
client implementation only notes the first entry and ignores the
rest.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-10-24 18:07:09 -04:00
Ricardo Labiaga 55b6e7742d Ask for time_delta during fsinfo probe
Used by the client to determine if the server has a granular enough
time stamp.

Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-10-24 18:00:04 -04:00
Bryan Schumaker 82f2e5472e NFS: Readdir plus in v4
By requsting more attributes during a readdir, we can mimic the readdir plus
operation that was in NFSv3.

To test, I ran the command `ls -lU --color=none` on directories with various
numbers of files.  Without readdir plus, I see this:

n files |    100    |   1,000   |  10,000   |  100,000  | 1,000,000
--------+-----------+-----------+-----------+-----------+----------
real    | 0m00.153s | 0m00.589s | 0m05.601s | 0m56.691s | 9m59.128s
user    | 0m00.007s | 0m00.007s | 0m00.077s | 0m00.703s | 0m06.800s
sys     | 0m00.010s | 0m00.070s | 0m00.633s | 0m06.423s | 1m10.005s
access  | 3         | 1         | 1         | 4         | 31
getattr | 2         | 1         | 1         | 1         | 1
lookup  | 104       | 1,003     | 10,003    | 100,003   | 1,000,003
readdir | 2         | 16        | 158       | 1,575     | 15,749
total   | 111       | 1,021     | 10,163    | 101,583   | 1,015,784

With readdir plus enabled, I see this:

n files |    100    |   1,000   |  10,000   |  100,000  | 1,000,000
--------+-----------+-----------+-----------+-----------+----------
real    | 0m00.115s | 0m00.206s | 0m01.079s | 0m12.521s | 2m07.528s
user    | 0m00.003s | 0m00.003s | 0m00.040s | 0m00.290s | 0m03.296s
sys     | 0m00.007s | 0m00.020s | 0m00.120s | 0m01.357s | 0m17.556s
access  | 3         | 1         | 1         | 1         | 7
getattr | 2         | 1         | 1         | 1         | 1
lookup  | 4         | 3         | 3         | 3         | 3
readdir | 6         | 62        | 630       | 6,300     | 62,993
total   | 15        | 67        | 635       | 6,305     | 63,004

Readdir plus disabled has about a 16x increase in the number of rpc calls and
is 4 - 5 times slower on large directories.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-10-23 15:27:37 -04:00
Bryan Schumaker 56e4ebf877 NFS: readdir with vmapped pages
We can use vmapped pages to read more information from the network at once.
This will reduce the number of calls needed to complete a readdir.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
[trondmy: Added #include for linux/vmalloc.h> in fs/nfs/dir.c]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-10-23 15:27:35 -04:00
Trond Myklebust 168667c43b NFSv4: The state manager must ignore EKEYEXPIRED.
Otherwise, we cannot recover state correctly.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-10-23 15:27:28 -04:00
Trond Myklebust ae1007d37e NFSv4: Don't call nfs4_state_mark_reclaim_reboot() from error handlers
In the case of a server reboot, the state recovery thread starts by calling
nfs4_state_end_reclaim_reboot() in order to avoid edge conditions when
the server reboots while the client is in the middle of recovery.

However, if the client has already marked the nfs4_state as requiring
reboot recovery, then the above behaviour will cause the recovery thread to
treat the open as if it was part of such an edge condition: the open will
be recovered as if it was part of a lease expiration (and all the locks
will be lost).
Fix is to remove the call to nfs4_state_mark_reclaim_reboot from
nfs4_async_handle_error(), and nfs4_handle_exception(). Instead we leave it
to the recovery thread to do this for us.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org
2010-10-19 19:42:33 -04:00
Trond Myklebust b0ed9dbc24 NFSv4: Fix open recovery
NFSv4 open recovery is currently broken: since we do not clear the
state->flags states before attempting recovery, we end up with the
'can_open_cached()' function triggering. This again leads to no OPEN call
being put on the wire.

Reported-by: Sachin Prabhu <sprabhu@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org
2010-10-19 19:41:55 -04:00
Benny Halevy dfb4f30983 NFSv4.1: keep seq_res.sr_slot as pointer rather than an index
Having to explicitly initialize sr_slotid to NFS4_MAX_SLOT_TABLE
resulted in numerous bugs.  Keeping the current slot as a pointer
to the slot table is more straight forward and robust as it's
implicitly set up to NULL wherever the seq_res member is initialized
to zeroes.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-09-24 09:17:01 -04:00
Trond Myklebust d688e11007 NFSv4.1: Fix the slotid initialisation in nfs_async_rename()
This fixes an Oopsable condition that was introduced by commit
d3d4152a5d (nfs: make sillyrename an async
operation)

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-09-21 16:52:40 -04:00
Jeff Layton d3d4152a5d nfs: make sillyrename an async operation
A synchronous rename can be interrupted by a SIGKILL. If that happens
during a sillyrename operation, it's possible for the rename call to
be sent to the server, but the task exits before processing the
reply. If this happens, the sillyrenamed file won't get cleaned up
during nfs_dentry_iput and the server is left with a dangling .nfs* file
hanging around.

Fix this problem by turning sillyrename into an asynchronous operation
and have the task doing the sillyrename just wait on the reply. If the
task is killed before the sillyrename completes, it'll still proceed
to completion.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-09-17 17:31:57 -04:00
Jeff Layton e8582a8b96 nfs: standardize the rename response container
Right now, v3 and v4 have their own variants. Create a standard struct
that will work for v3 and v4. v2 doesn't get anything but a simple error
and so isn't affected by this.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-09-17 17:31:06 -04:00
Jeff Layton 920769f031 nfs: standardize the rename args container
Each NFS version has its own version of the rename args container.
Standardize them on a common one that's identical to the one NFSv4
uses.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-09-17 17:30:25 -04:00
Trond Myklebust 2b484297e4 NFS: Add an 'open_context' element to struct nfs_rpc_ops
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-09-17 10:56:51 -04:00
Trond Myklebust c0204fd2b8 NFS: Clean up nfs4_proc_create()
Remove all remaining references to the struct nameidata from the low level
NFS layers. Again pass down a partially initialised struct nfs_open_context
when we want to do atomic open+create.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-09-17 10:56:51 -04:00
Trond Myklebust 535918f141 NFSv4: Further cleanups for nfs4_open_revalidate()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-09-17 10:56:51 -04:00
Trond Myklebust b8d4caddd8 NFSv4: Clean up nfs4_open_revalidate
Remove references to 'struct nameidata' from the low-level open_revalidate
code, and replace them with a struct nfs_open_context which will be
correctly initialised upon success.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-09-17 10:56:51 -04:00
Trond Myklebust f46e0bd34e NFSv4: Further minor cleanups for nfs4_atomic_open()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-09-17 10:56:50 -04:00
Trond Myklebust cd9a1c0e5a NFSv4: Clean up nfs4_atomic_open
Start moving the 'struct nameidata' dependent code out of the lower level
NFS code in preparation for the removal of open intents.

Instead of the struct nameidata, we pass down a partially initialised
struct nfs_open_context that will be fully initialised by the atomic open
upon success.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-09-17 10:56:50 -04:00
Trond Myklebust 0a377cff94 NFS: Fix an Oops in the NFSv4 atomic open code
Adam Lackorzynski reports:

with 2.6.35.2 I'm getting this reproducible Oops:

[  110.825396] BUG: unable to handle kernel NULL pointer dereference at
(null)
[  110.828638] IP: [<ffffffff811247b7>] encode_attrs+0x1a/0x2a4
[  110.828638] PGD be89f067 PUD bf18f067 PMD 0
[  110.828638] Oops: 0000 [#1] SMP
[  110.828638] last sysfs file: /sys/class/net/lo/operstate
[  110.828638] CPU 2
[  110.828638] Modules linked in: rtc_cmos rtc_core rtc_lib amd64_edac_mod
i2c_amd756 edac_core i2c_core dm_mirror dm_region_hash dm_log dm_snapshot
sg sr_mod usb_storage ohci_hcd mptspi tg3 mptscsih mptbase usbcore nls_base
[last unloaded: scsi_wait_scan]
[  110.828638]
[  110.828638] Pid: 11264, comm: setchecksum Not tainted 2.6.35.2 #1
[  110.828638] RIP: 0010:[<ffffffff811247b7>]  [<ffffffff811247b7>]
encode_attrs+0x1a/0x2a4
[  110.828638] RSP: 0000:ffff88003bf5b878  EFLAGS: 00010296
[  110.828638] RAX: ffff8800bddb48a8 RBX: ffff88003bf5bb18 RCX:
0000000000000000
[  110.828638] RDX: ffff8800be258800 RSI: 0000000000000000 RDI:
ffff88003bf5b9f8
[  110.828638] RBP: 0000000000000000 R08: ffff8800bddb48a8 R09:
0000000000000004
[  110.828638] R10: 0000000000000003 R11: ffff8800be779000 R12:
ffff8800be258800
[  110.828638] R13: ffff88003bf5b9f8 R14: ffff88003bf5bb20 R15:
ffff8800be258800
[  110.828638] FS:  0000000000000000(0000) GS:ffff880041e00000(0063)
knlGS:00000000556bd6b0
[  110.828638] CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
[  110.828638] CR2: 0000000000000000 CR3: 00000000be8ef000 CR4:
00000000000006e0
[  110.828638] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[  110.828638] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
0000000000000400
[  110.828638] Process setchecksum (pid: 11264, threadinfo
ffff88003bf5a000, task ffff88003f232210)
[  110.828638] Stack:
[  110.828638]  0000000000000000 ffff8800bfbcf920 0000000000000000
0000000000000ffe
[  110.828638] <0> 0000000000000000 0000000000000000 0000000000000000
0000000000000000
[  110.828638] <0> 0000000000000000 0000000000000000 0000000000000000
0000000000000000
[  110.828638] Call Trace:
[  110.828638]  [<ffffffff81124c1f>] ? nfs4_xdr_enc_setattr+0x90/0xb4
[  110.828638]  [<ffffffff81371161>] ? call_transmit+0x1c3/0x24a
[  110.828638]  [<ffffffff813774d9>] ? __rpc_execute+0x78/0x22a
[  110.828638]  [<ffffffff81371a91>] ? rpc_run_task+0x21/0x2b
[  110.828638]  [<ffffffff81371b7e>] ? rpc_call_sync+0x3d/0x5d
[  110.828638]  [<ffffffff8111e284>] ? _nfs4_do_setattr+0x11b/0x147
[  110.828638]  [<ffffffff81109466>] ? nfs_init_locked+0x0/0x32
[  110.828638]  [<ffffffff810ac521>] ? ifind+0x4e/0x90
[  110.828638]  [<ffffffff8111e2fb>] ? nfs4_do_setattr+0x4b/0x6e
[  110.828638]  [<ffffffff8111e634>] ? nfs4_do_open+0x291/0x3a6
[  110.828638]  [<ffffffff8111ed81>] ? nfs4_open_revalidate+0x63/0x14a
[  110.828638]  [<ffffffff811056c4>] ? nfs_open_revalidate+0xd7/0x161
[  110.828638]  [<ffffffff810a2de4>] ? do_lookup+0x1a4/0x201
[  110.828638]  [<ffffffff810a4733>] ? link_path_walk+0x6a/0x9d5
[  110.828638]  [<ffffffff810a42b6>] ? do_last+0x17b/0x58e
[  110.828638]  [<ffffffff810a5fbe>] ? do_filp_open+0x1bd/0x56e
[  110.828638]  [<ffffffff811cd5e0>] ? _atomic_dec_and_lock+0x30/0x48
[  110.828638]  [<ffffffff810a9b1b>] ? dput+0x37/0x152
[  110.828638]  [<ffffffff810ae063>] ? alloc_fd+0x69/0x10a
[  110.828638]  [<ffffffff81099f39>] ? do_sys_open+0x56/0x100
[  110.828638]  [<ffffffff81027a22>] ? ia32_sysret+0x0/0x5
[  110.828638] Code: 83 f1 01 e8 f5 ca ff ff 48 83 c4 50 5b 5d 41 5c c3 41
57 41 56 41 55 49 89 fd 41 54 49 89 d4 55 48 89 f5 53 48 81 ec 18 01 00 00
<8b> 06 89 c2 83 e2 08 83 fa 01 19 db 83 e3 f8 83 c3 18 a8 01 8d
[  110.828638] RIP  [<ffffffff811247b7>] encode_attrs+0x1a/0x2a4
[  110.828638]  RSP <ffff88003bf5b878>
[  110.828638] CR2: 0000000000000000
[  112.840396] ---[ end trace 95282e83fd77358f ]---

We need to ensure that the O_EXCL flag is turned off if the user doesn't
set O_CREAT.

Cc: stable@kernel.org
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-08-18 09:25:42 -04:00
Davidlohr Bueso 5d7ca35a18 nfs: Remove redundant NULL check upon kfree()
Signed-off-by: Davidlohr Bueso <dave@gnu.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-08-11 12:42:15 -04:00
Trond Myklebust d05dd4e98f NFS: Fix the NFS users of rpc_restart_call()
Fix up those functions that depend on knowing whether or not
rpc_restart_call is successful or not.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-08-03 22:06:44 -04:00
Trond Myklebust a6f03393ec NFSv4: Get rid of the bogus RPC_ASSASSINATED(task) checks
There is no real reason to have RPC_ASSASSINATED() checks in the NFS code.
As far as it is concerned, this is just an RPC error...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-08-03 22:06:43 -04:00
Trond Myklebust 452e93523d NFSv4: Clean up the process of renewing the NFSv4 lease
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-08-03 22:06:42 -04:00
Trond Myklebust 14516c3a30 NFSv4.1: Handle NFS4ERR_DELAY on SEQUENCE correctly
In RFC5661, an NFS4ERR_DELAY error on a SEQUENCE operation has the special
meaning that the server is not finished processing the request. In this
case we want to just retry the request without touching the slot.

Also fix a bug whereby we would fail to update the sequence id if the
server returned any error other than NFS_OK/NFS4ERR_DELAY.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-08-03 22:06:42 -04:00
Trond Myklebust 77041ed9b4 NFSv4: Ensure the lockowners are labelled using the fl_owner and/or fl_pid
flock locks want to be labelled using the process pid, while posix locks
want to be labelled using the fl_owner.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-07-30 14:46:10 -04:00
Trond Myklebust d3c7b7ccc1 NFSv4: Add support for the RELEASE_LOCKOWNER operation
This is needed by NFSv4.0 servers in order to keep the number of locking
stateids at a manageable level.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-07-30 14:46:10 -04:00
Trond Myklebust 1f0e890dba NFSv4: Clean up struct nfs4_state_owner
The 'so_delegations' list appears to be unused.

Also eliminate so_client. If we already have so_server, we can get to the
nfs_client structure.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-06-24 15:11:43 -04:00
Trond Myklebust 1055d76d91 NFSv4.1: There is no need to init the session more than once...
Set up a flag to ensure that is indeed the case.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-06-22 13:24:03 -04:00
Trond Myklebust fe74ba3a8d NFSv41: Cleanup for nfs4_alloc_session.
There is no reason to change the nfs_client state every time we allocate a
new session. Move that line into nfs4_init_client_minor_version.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-06-22 13:24:03 -04:00
Trond Myklebust d77d76ffb6 NFSv41: Clean up exclusive create
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-06-22 13:24:03 -04:00
Trond Myklebust e047a10c12 NFSv41: Fix nfs_async_inode_return_delegation() ugliness
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-06-22 13:24:02 -04:00
Trond Myklebust c48f4f3541 NFSv41: Convert the various reboot recovery ops etc to minor version ops
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-06-22 13:24:02 -04:00
Trond Myklebust 97dc135947 NFSv41: Clean up the NFSv4.1 minor version specific operations
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-06-22 13:24:02 -04:00
Trond Myklebust a2118c33aa NFSv41: Don't store session state in the nfs_client->cl_state
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-06-22 13:24:02 -04:00
Trond Myklebust df8964554a NFSv41: Further cleanup for nfs4_sequence_done
Instead of testing if the nfs_client has a session, we should be testing if
the struct nfs4_sequence_res was set up with one.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-06-22 13:24:02 -04:00
Trond Myklebust 035168ab39 NFSv4.1: Make nfs4_setup_sequence take a nfs_server argument
In anticipation of the day when we have per-filesystem sessions, and also
in order to allow the session to change in the event of a filesystem
migration event.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-06-22 13:24:02 -04:00
Trond Myklebust 71ac6da994 NFSv4.1: Merge the nfs41_proc_async_sequence() and nfs4_proc_sequence()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-06-22 13:24:01 -04:00
Trond Myklebust aa5190d0ed NFSv4: Kill nfs4_async_handle_error() abuses by NFSv4.1
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-06-22 13:24:01 -04:00
Trond Myklebust d185a334c7 NFSv4.1: Simplify nfs41_sequence_done()
Nobody uses the rpc_status parameter.

It is not obvious why we need the struct nfs_client argument either, when
we already have that information in the session.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-06-22 13:24:01 -04:00
Trond Myklebust 2a6e26cdb8 NFSv4.1: Clean up nfs4_setup_sequence
Firstly, there is little point in first zeroing out the entire struct
nfs4_sequence_res, and then initialising all fields save one. Just
initialise the last field to zero...

Secondly, nfs41_setup_sequence() has only 2 possible return values: 0, or
-EAGAIN, so there is no 'terminate rpc task' case.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-06-22 13:24:01 -04:00
Trond Myklebust d5f8d3fe72 NFSv41: Fix a memory leak in nfs41_proc_async_sequence()
If the call to rpc_call_async() fails, then the arguments will not be
freed, since there will be no call to nfs41_sequence_call_done

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-06-22 13:24:01 -04:00
Trond Myklebust 8535b2be51 NFSv4: Don't use GFP_KERNEL allocations in state recovery
We do not want to have the state recovery thread kick off and wait for a
memory reclaim, since that may deadlock when the writebacks end up
waiting for the state recovery thread to complete.

The safe thing is therefore to use GFP_NOFS in all open, close,
delegation return, lock, etc. operations that may be called by the
state recovery thread.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:33 -04:00
Chuck Lever 9bc4e3ca46 NFS: Calldata for nfs4_renew_done()
I'm about to change task->tk_start from a jiffies value to a ktime_t
value in order to make RPC RTT reporting more precise.

Recently (commit dc96aef9) nfs4_renew_done() started to reference
task->tk_start so that a jiffies value no longer had to be passed
from nfs4_proc_async_renew().  This allowed the calldata to point to
an nfs_client instead.

Changing task->tk_start to a ktime_t value makes it effectively
useless for renew timestamps, so we need to restore the pre-dc96aef9
logic that provided a jiffies "start" timestamp to nfs4_renew_done().

Both an nfs_client pointer and a timestamp need to be passed to
nfs4_renew_done(), so create a new nfs_renewdata structure that
contains both, resembling what is already done for delegreturn,
lock, and unlock.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:32 -04:00