1
0
Fork 0
Commit Graph

499 Commits (redonkable)

Author SHA1 Message Date
Pavel Shilovsky 84b15c4e15 CIFS: Properly process SMB3 lease breaks
[ Upstream commit 9bd4540836 ]

Currenly we doesn't assume that a server may break a lease
from RWH to RW which causes us setting a wrong lease state
on a file and thus mistakenly flushing data and byte-range
locks and purging cached data on the client. This leads to
performance degradation because subsequent IOs go directly
to the server.

Fix this by propagating new lease state and epoch values
to the oplock break handler through cifsFileInfo structure
and removing the use of cifsInodeInfo flags for that. It
allows to avoid some races of several lease/oplock breaks
using those flags in parallel.

Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01 13:17:21 +02:00
Aurelien Aptel e76e39f7c6 cifs: fix rename() by ensuring source handle opened with DELETE bit
commit 86f740f2ae upstream.

To rename a file in SMB2 we open it with the DELETE access and do a
special SetInfo on it. If the handle is missing the DELETE bit the
server will fail the SetInfo with STATUS_ACCESS_DENIED.

We currently try to reuse any existing opened handle we have with
cifs_get_writable_path(). That function looks for handles with WRITE
access but doesn't check for DELETE, making rename() fail if it finds
a handle to reuse. Simple reproducer below.

To select handles with the DELETE bit, this patch adds a flag argument
to cifs_get_writable_path() and find_writable_file() and the existing
'bool fsuid_only' argument is converted to a flag.

The cifsFileInfo struct only stores the UNIX open mode but not the
original SMB access flags. Since the DELETE bit is not mapped in that
mode, this patch stores the access mask in cifs_fid on file open,
which is accessible from cifsFileInfo.

Simple reproducer:

	#include <stdio.h>
	#include <stdlib.h>
	#include <sys/types.h>
	#include <sys/stat.h>
	#include <fcntl.h>
	#include <unistd.h>
	#define E(s) perror(s), exit(1)

	int main(int argc, char *argv[])
	{
		int fd, ret;
		if (argc != 3) {
			fprintf(stderr, "Usage: %s A B\n"
			"create&open A in write mode, "
			"rename A to B, close A\n", argv[0]);
			return 0;
		}

		fd = openat(AT_FDCWD, argv[1], O_WRONLY|O_CREAT|O_SYNC, 0666);
		if (fd == -1) E("openat()");

		ret = rename(argv[1], argv[2]);
		if (ret) E("rename()");

		ret = close(fd);
		if (ret) E("close()");

		return ret;
	}

$ gcc -o bugrename bugrename.c
$ ./bugrename /mnt/a /mnt/b
rename(): Permission denied

Fixes: 8de9e86c67 ("cifs: create a helper to find a writeable handle by path name")
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-12 13:00:18 +01:00
Vincent Whitchurch d65b067c25 CIFS: Fix task struct use-after-free on reconnect
commit f1f27ad745 upstream.

The task which created the MID may be gone by the time cifsd attempts to
call the callbacks on MIDs from cifs_reconnect().

This leads to a use-after-free of the task struct in cifs_wake_up_task:

 ==================================================================
 BUG: KASAN: use-after-free in __lock_acquire+0x31a0/0x3270
 Read of size 8 at addr ffff8880103e3a68 by task cifsd/630

 CPU: 0 PID: 630 Comm: cifsd Not tainted 5.5.0-rc6+ #119
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
 Call Trace:
  dump_stack+0x8e/0xcb
  print_address_description.constprop.5+0x1d3/0x3c0
  ? __lock_acquire+0x31a0/0x3270
  __kasan_report+0x152/0x1aa
  ? __lock_acquire+0x31a0/0x3270
  ? __lock_acquire+0x31a0/0x3270
  kasan_report+0xe/0x20
  __lock_acquire+0x31a0/0x3270
  ? __wake_up_common+0x1dc/0x630
  ? _raw_spin_unlock_irqrestore+0x4c/0x60
  ? mark_held_locks+0xf0/0xf0
  ? _raw_spin_unlock_irqrestore+0x39/0x60
  ? __wake_up_common_lock+0xd5/0x130
  ? __wake_up_common+0x630/0x630
  lock_acquire+0x13f/0x330
  ? try_to_wake_up+0xa3/0x19e0
  _raw_spin_lock_irqsave+0x38/0x50
  ? try_to_wake_up+0xa3/0x19e0
  try_to_wake_up+0xa3/0x19e0
  ? cifs_compound_callback+0x178/0x210
  ? set_cpus_allowed_ptr+0x10/0x10
  cifs_reconnect+0xa1c/0x15d0
  ? generic_ip_connect+0x1860/0x1860
  ? rwlock_bug.part.0+0x90/0x90
  cifs_readv_from_socket+0x479/0x690
  cifs_read_from_socket+0x9d/0xe0
  ? cifs_readv_from_socket+0x690/0x690
  ? mempool_resize+0x690/0x690
  ? rwlock_bug.part.0+0x90/0x90
  ? memset+0x1f/0x40
  ? allocate_buffers+0xff/0x340
  cifs_demultiplex_thread+0x388/0x2a50
  ? cifs_handle_standard+0x610/0x610
  ? rcu_read_lock_held_common+0x120/0x120
  ? mark_lock+0x11b/0xc00
  ? __lock_acquire+0x14ed/0x3270
  ? __kthread_parkme+0x78/0x100
  ? lockdep_hardirqs_on+0x3e8/0x560
  ? lock_downgrade+0x6a0/0x6a0
  ? lockdep_hardirqs_on+0x3e8/0x560
  ? _raw_spin_unlock_irqrestore+0x39/0x60
  ? cifs_handle_standard+0x610/0x610
  kthread+0x2bb/0x3a0
  ? kthread_create_worker_on_cpu+0xc0/0xc0
  ret_from_fork+0x3a/0x50

 Allocated by task 649:
  save_stack+0x19/0x70
  __kasan_kmalloc.constprop.5+0xa6/0xf0
  kmem_cache_alloc+0x107/0x320
  copy_process+0x17bc/0x5370
  _do_fork+0x103/0xbf0
  __x64_sys_clone+0x168/0x1e0
  do_syscall_64+0x9b/0xec0
  entry_SYSCALL_64_after_hwframe+0x49/0xbe

 Freed by task 0:
  save_stack+0x19/0x70
  __kasan_slab_free+0x11d/0x160
  kmem_cache_free+0xb5/0x3d0
  rcu_core+0x52f/0x1230
  __do_softirq+0x24d/0x962

 The buggy address belongs to the object at ffff8880103e32c0
  which belongs to the cache task_struct of size 6016
 The buggy address is located 1960 bytes inside of
  6016-byte region [ffff8880103e32c0, ffff8880103e4a40)
 The buggy address belongs to the page:
 page:ffffea000040f800 refcount:1 mapcount:0 mapping:ffff8880108da5c0
 index:0xffff8880103e4c00 compound_mapcount: 0
 raw: 4000000000010200 ffffea00001f2208 ffffea00001e3408 ffff8880108da5c0
 raw: ffff8880103e4c00 0000000000050003 00000001ffffffff 0000000000000000
 page dumped because: kasan: bad access detected

 Memory state around the buggy address:
  ffff8880103e3900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  ffff8880103e3980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 >ffff8880103e3a00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                           ^
  ffff8880103e3a80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  ffff8880103e3b00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ==================================================================

This can be reliably reproduced by adding the below delay to
cifs_reconnect(), running find(1) on the mount, restarting the samba
server while find is running, and killing find during the delay:

  	spin_unlock(&GlobalMid_Lock);
  	mutex_unlock(&server->srv_mutex);

 +	msleep(10000);
 +
  	cifs_dbg(FYI, "%s: issuing mid callbacks\n", __func__);
  	list_for_each_safe(tmp, tmp2, &retry_list) {
  		mid_entry = list_entry(tmp, struct mid_q_entry, qhead);

Fix this by holding a reference to the task struct until the MID is
freed.

Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
CC: Stable <stable@vger.kernel.org>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-01 09:34:37 +00:00
Ronnie Sahlberg 2685410d1e cifs: move cifsFileInfo_put logic into a work-queue
[ Upstream commit 32546a9586 ]

This patch moves the final part of the cifsFileInfo_put() logic where we
need a write lock on lock_sem to be processed in a separate thread that
holds no other locks.
This is to prevent deadlocks like the one below:

> there are 6 processes looping to while trying to down_write
> cinode->lock_sem, 5 of them from _cifsFileInfo_put, and one from
> cifs_new_fileinfo
>
> and there are 5 other processes which are blocked, several of them
> waiting on either PG_writeback or PG_locked (which are both set), all
> for the same page of the file
>
> 2 inode_lock() (inode->i_rwsem) for the file
> 1 wait_on_page_writeback() for the page
> 1 down_read(inode->i_rwsem) for the inode of the directory
> 1 inode_lock()(inode->i_rwsem) for the inode of the directory
> 1 __lock_page
>
>
> so processes are blocked waiting on:
>   page flags PG_locked and PG_writeback for one specific page
>   inode->i_rwsem for the directory
>   inode->i_rwsem for the file
>   cifsInodeInflock_sem
>
>
>
> here are the more gory details (let me know if I need to provide
> anything more/better):
>
> [0 00:48:22.765] [UN]  PID: 8863   TASK: ffff8c691547c5c0  CPU: 3
> COMMAND: "reopen_file"
>  #0 [ffff9965007e3ba8] __schedule at ffffffff9b6e6095
>  #1 [ffff9965007e3c38] schedule at ffffffff9b6e64df
>  #2 [ffff9965007e3c48] rwsem_down_write_slowpath at ffffffff9af283d7
>  #3 [ffff9965007e3cb8] legitimize_path at ffffffff9b0f975d
>  #4 [ffff9965007e3d08] path_openat at ffffffff9b0fe55d
>  #5 [ffff9965007e3dd8] do_filp_open at ffffffff9b100a33
>  #6 [ffff9965007e3ee0] do_sys_open at ffffffff9b0eb2d6
>  #7 [ffff9965007e3f38] do_syscall_64 at ffffffff9ae04315
> * (I think legitimize_path is bogus)
>
> in path_openat
>         } else {
>                 const char *s = path_init(nd, flags);
>                 while (!(error = link_path_walk(s, nd)) &&
>                         (error = do_last(nd, file, op)) > 0) {  <<<<
>
> do_last:
>         if (open_flag & O_CREAT)
>                 inode_lock(dir->d_inode);  <<<<
>         else
> so it's trying to take inode->i_rwsem for the directory
>
>      DENTRY           INODE           SUPERBLK     TYPE PATH
> ffff8c68bb8e79c0 ffff8c691158ef20 ffff8c6915bf9000 DIR  /mnt/vm1_smb/
> inode.i_rwsem is ffff8c691158efc0
>
> <struct rw_semaphore 0xffff8c691158efc0>:
>         owner: <struct task_struct 0xffff8c6914275d00> (UN -   8856 -
> reopen_file), counter: 0x0000000000000003
>         waitlist: 2
>         0xffff9965007e3c90     8863   reopen_file      UN 0  1:29:22.926
>   RWSEM_WAITING_FOR_WRITE
>         0xffff996500393e00     9802   ls               UN 0  1:17:26.700
>   RWSEM_WAITING_FOR_READ
>
>
> the owner of the inode.i_rwsem of the directory is:
>
> [0 00:00:00.109] [UN]  PID: 8856   TASK: ffff8c6914275d00  CPU: 3
> COMMAND: "reopen_file"
>  #0 [ffff99650065b828] __schedule at ffffffff9b6e6095
>  #1 [ffff99650065b8b8] schedule at ffffffff9b6e64df
>  #2 [ffff99650065b8c8] schedule_timeout at ffffffff9b6e9f89
>  #3 [ffff99650065b940] msleep at ffffffff9af573a9
>  #4 [ffff99650065b948] _cifsFileInfo_put.cold.63 at ffffffffc0a42dd6 [cifs]
>  #5 [ffff99650065ba38] cifs_writepage_locked at ffffffffc0a0b8f3 [cifs]
>  #6 [ffff99650065bab0] cifs_launder_page at ffffffffc0a0bb72 [cifs]
>  #7 [ffff99650065bb30] invalidate_inode_pages2_range at ffffffff9b04d4bd
>  #8 [ffff99650065bcb8] cifs_invalidate_mapping at ffffffffc0a11339 [cifs]
>  #9 [ffff99650065bcd0] cifs_revalidate_mapping at ffffffffc0a1139a [cifs]
> #10 [ffff99650065bcf0] cifs_d_revalidate at ffffffffc0a014f6 [cifs]
> #11 [ffff99650065bd08] path_openat at ffffffff9b0fe7f7
> #12 [ffff99650065bdd8] do_filp_open at ffffffff9b100a33
> #13 [ffff99650065bee0] do_sys_open at ffffffff9b0eb2d6
> #14 [ffff99650065bf38] do_syscall_64 at ffffffff9ae04315
>
> cifs_launder_page is for page 0xffffd1e2c07d2480
>
> crash> page.index,mapping,flags 0xffffd1e2c07d2480
>       index = 0x8
>       mapping = 0xffff8c68f3cd0db0
>   flags = 0xfffffc0008095
>
>   PAGE-FLAG       BIT  VALUE
>   PG_locked         0  0000001
>   PG_uptodate       2  0000004
>   PG_lru            4  0000010
>   PG_waiters        7  0000080
>   PG_writeback     15  0008000
>
>
> inode is ffff8c68f3cd0c40
> inode.i_rwsem is ffff8c68f3cd0ce0
>      DENTRY           INODE           SUPERBLK     TYPE PATH
> ffff8c68a1f1b480 ffff8c68f3cd0c40 ffff8c6915bf9000 REG
> /mnt/vm1_smb/testfile.8853
>
>
> this process holds the inode->i_rwsem for the parent directory, is
> laundering a page attached to the inode of the file it's opening, and in
> _cifsFileInfo_put is trying to down_write the cifsInodeInflock_sem
> for the file itself.
>
>
> <struct rw_semaphore 0xffff8c68f3cd0ce0>:
>         owner: <struct task_struct 0xffff8c6914272e80> (UN -   8854 -
> reopen_file), counter: 0x0000000000000003
>         waitlist: 1
>         0xffff9965005dfd80     8855   reopen_file      UN 0  1:29:22.912
>   RWSEM_WAITING_FOR_WRITE
>
> this is the inode.i_rwsem for the file
>
> the owner:
>
> [0 00:48:22.739] [UN]  PID: 8854   TASK: ffff8c6914272e80  CPU: 2
> COMMAND: "reopen_file"
>  #0 [ffff99650054fb38] __schedule at ffffffff9b6e6095
>  #1 [ffff99650054fbc8] schedule at ffffffff9b6e64df
>  #2 [ffff99650054fbd8] io_schedule at ffffffff9b6e68e2
>  #3 [ffff99650054fbe8] __lock_page at ffffffff9b03c56f
>  #4 [ffff99650054fc80] pagecache_get_page at ffffffff9b03dcdf
>  #5 [ffff99650054fcc0] grab_cache_page_write_begin at ffffffff9b03ef4c
>  #6 [ffff99650054fcd0] cifs_write_begin at ffffffffc0a064ec [cifs]
>  #7 [ffff99650054fd30] generic_perform_write at ffffffff9b03bba4
>  #8 [ffff99650054fda8] __generic_file_write_iter at ffffffff9b04060a
>  #9 [ffff99650054fdf0] cifs_strict_writev.cold.70 at ffffffffc0a4469b [cifs]
> #10 [ffff99650054fe48] new_sync_write at ffffffff9b0ec1dd
> #11 [ffff99650054fed0] vfs_write at ffffffff9b0eed35
> #12 [ffff99650054ff00] ksys_write at ffffffff9b0eefd9
> #13 [ffff99650054ff38] do_syscall_64 at ffffffff9ae04315
>
> the process holds the inode->i_rwsem for the file to which it's writing,
> and is trying to __lock_page for the same page as in the other processes
>
>
> the other tasks:
> [0 00:00:00.028] [UN]  PID: 8859   TASK: ffff8c6915479740  CPU: 2
> COMMAND: "reopen_file"
>  #0 [ffff9965007b39d8] __schedule at ffffffff9b6e6095
>  #1 [ffff9965007b3a68] schedule at ffffffff9b6e64df
>  #2 [ffff9965007b3a78] schedule_timeout at ffffffff9b6e9f89
>  #3 [ffff9965007b3af0] msleep at ffffffff9af573a9
>  #4 [ffff9965007b3af8] cifs_new_fileinfo.cold.61 at ffffffffc0a42a07 [cifs]
>  #5 [ffff9965007b3b78] cifs_open at ffffffffc0a0709d [cifs]
>  #6 [ffff9965007b3cd8] do_dentry_open at ffffffff9b0e9b7a
>  #7 [ffff9965007b3d08] path_openat at ffffffff9b0fe34f
>  #8 [ffff9965007b3dd8] do_filp_open at ffffffff9b100a33
>  #9 [ffff9965007b3ee0] do_sys_open at ffffffff9b0eb2d6
> #10 [ffff9965007b3f38] do_syscall_64 at ffffffff9ae04315
>
> this is opening the file, and is trying to down_write cinode->lock_sem
>
>
> [0 00:00:00.041] [UN]  PID: 8860   TASK: ffff8c691547ae80  CPU: 2
> COMMAND: "reopen_file"
> [0 00:00:00.057] [UN]  PID: 8861   TASK: ffff8c6915478000  CPU: 3
> COMMAND: "reopen_file"
> [0 00:00:00.059] [UN]  PID: 8858   TASK: ffff8c6914271740  CPU: 2
> COMMAND: "reopen_file"
> [0 00:00:00.109] [UN]  PID: 8862   TASK: ffff8c691547dd00  CPU: 6
> COMMAND: "reopen_file"
>  #0 [ffff9965007c3c78] __schedule at ffffffff9b6e6095
>  #1 [ffff9965007c3d08] schedule at ffffffff9b6e64df
>  #2 [ffff9965007c3d18] schedule_timeout at ffffffff9b6e9f89
>  #3 [ffff9965007c3d90] msleep at ffffffff9af573a9
>  #4 [ffff9965007c3d98] _cifsFileInfo_put.cold.63 at ffffffffc0a42dd6 [cifs]
>  #5 [ffff9965007c3e88] cifs_close at ffffffffc0a07aaf [cifs]
>  #6 [ffff9965007c3ea0] __fput at ffffffff9b0efa6e
>  #7 [ffff9965007c3ee8] task_work_run at ffffffff9aef1614
>  #8 [ffff9965007c3f20] exit_to_usermode_loop at ffffffff9ae03d6f
>  #9 [ffff9965007c3f38] do_syscall_64 at ffffffff9ae0444c
>
> closing the file, and trying to down_write cifsi->lock_sem
>
>
> [0 00:48:22.839] [UN]  PID: 8857   TASK: ffff8c6914270000  CPU: 7
> COMMAND: "reopen_file"
>  #0 [ffff9965006a7cc8] __schedule at ffffffff9b6e6095
>  #1 [ffff9965006a7d58] schedule at ffffffff9b6e64df
>  #2 [ffff9965006a7d68] io_schedule at ffffffff9b6e68e2
>  #3 [ffff9965006a7d78] wait_on_page_bit at ffffffff9b03cac6
>  #4 [ffff9965006a7e10] __filemap_fdatawait_range at ffffffff9b03b028
>  #5 [ffff9965006a7ed8] filemap_write_and_wait at ffffffff9b040165
>  #6 [ffff9965006a7ef0] cifs_flush at ffffffffc0a0c2fa [cifs]
>  #7 [ffff9965006a7f10] filp_close at ffffffff9b0e93f1
>  #8 [ffff9965006a7f30] __x64_sys_close at ffffffff9b0e9a0e
>  #9 [ffff9965006a7f38] do_syscall_64 at ffffffff9ae04315
>
> in __filemap_fdatawait_range
>                         wait_on_page_writeback(page);
> for the same page of the file
>
>
>
> [0 00:48:22.718] [UN]  PID: 8855   TASK: ffff8c69142745c0  CPU: 7
> COMMAND: "reopen_file"
>  #0 [ffff9965005dfc98] __schedule at ffffffff9b6e6095
>  #1 [ffff9965005dfd28] schedule at ffffffff9b6e64df
>  #2 [ffff9965005dfd38] rwsem_down_write_slowpath at ffffffff9af283d7
>  #3 [ffff9965005dfdf0] cifs_strict_writev at ffffffffc0a0c40a [cifs]
>  #4 [ffff9965005dfe48] new_sync_write at ffffffff9b0ec1dd
>  #5 [ffff9965005dfed0] vfs_write at ffffffff9b0eed35
>  #6 [ffff9965005dff00] ksys_write at ffffffff9b0eefd9
>  #7 [ffff9965005dff38] do_syscall_64 at ffffffff9ae04315
>
>         inode_lock(inode);
>
>
> and one 'ls' later on, to see whether the rest of the mount is available
> (the test file is in the root, so we get blocked up on the directory
> ->i_rwsem), so the entire mount is unavailable
>
> [0 00:36:26.473] [UN]  PID: 9802   TASK: ffff8c691436ae80  CPU: 4
> COMMAND: "ls"
>  #0 [ffff996500393d28] __schedule at ffffffff9b6e6095
>  #1 [ffff996500393db8] schedule at ffffffff9b6e64df
>  #2 [ffff996500393dc8] rwsem_down_read_slowpath at ffffffff9b6e9421
>  #3 [ffff996500393e78] down_read_killable at ffffffff9b6e95e2
>  #4 [ffff996500393e88] iterate_dir at ffffffff9b103c56
>  #5 [ffff996500393ec8] ksys_getdents64 at ffffffff9b104b0c
>  #6 [ffff996500393f30] __x64_sys_getdents64 at ffffffff9b104bb6
>  #7 [ffff996500393f38] do_syscall_64 at ffffffff9ae04315
>
> in iterate_dir:
>         if (shared)
>                 res = down_read_killable(&inode->i_rwsem);  <<<<
>         else
>                 res = down_write_killable(&inode->i_rwsem);
>

Reported-by: Frank Sorenson <sorenson@redhat.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04 19:18:26 +01:00
Pavel Shilovsky 4324961126 CIFS: Fix NULL pointer dereference in mid callback
commit 86a7964be7 upstream.

There is a race between a system call processing thread
and the demultiplex thread when mid->resp_buf becomes NULL
and later is being accessed to get credits. It happens when
the 1st thread wakes up before a mid callback is called in
the 2nd one but the mid state has already been set to
MID_RESPONSE_RECEIVED. This causes NULL pointer dereference
in mid callback.

Fix this by saving credits from the response before we
update the mid state and then use this value in the mid
callback rather then accessing a response buffer.

Cc: Stable <stable@vger.kernel.org>
Fixes: ee258d7915 ("CIFS: Move credit processing to mid callbacks for SMB3")
Tested-by: Frank Sorenson <sorenson@redhat.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-21 11:04:46 +01:00
Dave Wysochanski d46b0da7a3 cifs: Fix cifsInodeInfo lock_sem deadlock when reconnect occurs
There's a deadlock that is possible and can easily be seen with
a test where multiple readers open/read/close of the same file
and a disruption occurs causing reconnect.  The deadlock is due
a reader thread inside cifs_strict_readv calling down_read and
obtaining lock_sem, and then after reconnect inside
cifs_reopen_file calling down_read a second time.  If in
between the two down_read calls, a down_write comes from
another process, deadlock occurs.

        CPU0                    CPU1
        ----                    ----
cifs_strict_readv()
 down_read(&cifsi->lock_sem);
                               _cifsFileInfo_put
                                  OR
                               cifs_new_fileinfo
                                down_write(&cifsi->lock_sem);
cifs_reopen_file()
 down_read(&cifsi->lock_sem);

Fix the above by changing all down_write(lock_sem) calls to
down_write_trylock(lock_sem)/msleep() loop, which in turn
makes the second down_read call benign since it will never
block behind the writer while holding lock_sem.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Suggested-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed--by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2019-10-24 21:35:04 -05:00
Steve French d0959b080b smb3: remove noisy debug message and minor cleanup
Message was intended only for developer temporary build
In addition cleanup two minor warnings noticed by Coverity
and a trivial change to workaround a sparse warning

Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2019-10-08 18:19:40 -07:00
Steve French c3ca78e217 smb3: pass mode bits into create calls
We need to populate an ACL (security descriptor open context)
on file and directory correct.  This patch passes in the
mode.  Followon patch will build the open context and the
security descriptor (from the mode) that goes in the open
context.

Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
2019-09-26 02:06:42 -05:00
Paulo Alcantara (SUSE) 8eecd1c2e5 cifs: Add support for root file systems
Introduce a new CONFIG_CIFS_ROOT option to handle root file systems
over a SMB share.

In order to mount the root file system during the init process, make
cifs.ko perform non-blocking socket operations while mounting and
accessing it.

Cc: Steve French <smfrench@gmail.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Paulo Alcantara (SUSE) <paulo@paulo.ac>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-09-16 11:43:38 -05:00
Steve French 3e7a02d478 smb3: allow disabling requesting leases
In some cases to work around server bugs or performance
problems it can be helpful to be able to disable requesting
SMB2.1/SMB3 leases on a particular mount (not to all servers
and all shares we are mounted to). Add new mount parm
"nolease" which turns off requesting leases on directory
or file opens.  Currently the only way to disable leases is
globally through a module load parameter. This is more
granular.

Suggested-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
CC: Stable <stable@vger.kernel.org>
2019-09-16 11:43:38 -05:00
Steve French 1b63f1840e smb3: display max smb3 requests in flight at any one time
Displayed in /proc/fs/cifs/Stats once for each
socket we are connected to.

This allows us to find out what the maximum number of
requests that had been in flight (at any one time). Note that
/proc/fs/cifs/Stats can be reset if you want to look for
maximum over a small period of time.

Sample output (immediately after mount):

Resources in use
CIFS Session: 1
Share (unique mount targets): 2
SMB Request/Response Buffer: 1 Pool size: 5
SMB Small Req/Resp Buffer: 1 Pool size: 30
Operations (MIDs): 0

0 session 0 share reconnects
Total vfs operations: 5 maximum at one time: 2

Max requests in flight: 2
1) \\localhost\scratch
SMBs: 18
Bytes read: 0  Bytes written: 0
...

Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2019-09-16 11:43:38 -05:00
Steve French 563317ec30 smb3: enable offload of decryption of large reads via mount option
Disable offload of the decryption of encrypted read responses
by default (equivalent to setting this new mount option "esize=0").

Allow setting the minimum encrypted read response size that we
will choose to offload to a worker thread - it is now configurable
via on a new mount option "esize="

Depending on which encryption mechanism (GCM vs. CCM) and
the number of reads that will be issued in parallel and the
performance of the network and CPU on the client, it may make
sense to enable this since it can provide substantial benefit when
multiple large reads are in flight at the same time.

Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
2019-09-16 11:43:38 -05:00
Steve French 35cf94a397 smb3: allow parallelizing decryption of reads
decrypting large reads on encrypted shares can be slow (e.g. adding
multiple milliseconds per-read on non-GCM capable servers or
when mounting with dialects prior to SMB3.1.1) - allow parallelizing
of read decryption by launching worker threads.

Testing to Samba on localhost showed 25% improvement.
Testing to remote server showed very large improvement when
doing more than one 'cp' command was called at one time.

Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
2019-09-16 11:43:38 -05:00
Steve French 4f5c10f1ad smb3: allow skipping signature verification for perf sensitive configurations
Add new mount option "signloosely" which enables signing but skips the
sometimes expensive signing checks in the responses (signatures are
calculated and sent correctly in the SMB2/SMB3 requests even with this
mount option but skipped in the responses).  Although weaker for security
(and also data integrity in case a packet were corrupted), this can provide
enough of a performance benefit (calculating the signature to verify a
packet can be expensive especially for large packets) to be useful in
some cases.

Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2019-09-16 11:43:38 -05:00
Steve French 41e033fecd smb3: add mount option to allow RW caching of share accessed by only 1 client
If a share is known to be only to be accessed by one client, we
can aggressively cache writes not just reads to it.

Add "cache=" option (cache=singleclient) for mounting read write shares
(that will not be read or written to from other clients while we have
it mounted) in order to improve performance.

Signed-off-by: Steve French <stfrench@microsoft.com>
2019-09-16 11:43:38 -05:00
Steve French 83bbfa706d smb3: add mount option to allow forced caching of read only share
If a share is immutable (at least for the period that it will
be mounted) it would be helpful to not have to revalidate
dentries repeatedly that we know can not be changed remotely.

Add "cache=" option (cache=ro) for mounting read only shares
in order to improve performance in cases in which we know that
the share will not be changing while it is in use.

Signed-off-by: Steve French <stfrench@microsoft.com>
2019-09-16 11:43:37 -05:00
Steve French 9fe5ff1c5d smb3: do not send compression info by default
Since in theory a server could respond with compressed read
responses even if not requested on read request (assuming that
a compression negcontext is sent in negotiate protocol) - do
not send compression information during negotiate protocol
unless the user asks for compression explicitly (compression
is experimental), and add a mount warning that compression
is experimental.

Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2019-07-07 22:37:43 -05:00
Steve French 412094a8fb smb3: add new mount option to retrieve mode from special ACE
There is a special ACE used by some servers to allow the mode
bits to be stored.  This can be especially helpful in scenarios
in which the client is trusted, and access checking on the
client vs the POSIX mode bits is sufficient.

Add mount option to allow enabling this behavior.
Follow on patch will add support for chmod and queryinfo
(stat) by retrieving the POSIX mode bits from the special
ACE, SID: S-1-5-88-3

See e.g.
https://docs.microsoft.com/en-us/previous-versions/windows/it-pro/windows-server-2008-R2-and-2008/hh509017(v=ws.10)

Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2019-07-07 22:37:43 -05:00
Steve French 73cf8085dc cifs: simplify code by removing CONFIG_CIFS_ACL ifdef
SMB3 ACL support is needed for many use cases now and should not be
ifdeffed out, even for SMB1 (CIFS).  Remove the CONFIG_CIFS_ACL
ifdef so ACL support is always built into cifs.ko

Signed-off-by: Steve French <stfrench@microsoft.com>
2019-07-07 22:37:43 -05:00
Steve French 6552d6a026 cifs: Fix check for matching with existing mount
If we mount the same share twice, we check the flags to see if the
second mount matches the earlier mount, but we left some flags out.

Signed-off-by: Steve French <stfrench@microsoft.com>
2019-07-07 22:37:43 -05:00
Ronnie Sahlberg 487317c994 cifs: add spinlock for the openFileList to cifsInodeInfo
We can not depend on the tcon->open_file_lock here since in multiuser mode
we may have the same file/inode open via multiple different tcons.

The current code is race prone and will crash if one user deletes a file
at the same time a different user opens/create the file.

To avoid this we need to have a spinlock attached to the inode and not the tcon.

RHBZ:  1580165

CC: Stable <stable@vger.kernel.org>
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2019-06-13 14:21:09 -05:00
Ronnie Sahlberg dece44e381 cifs: add support for SEEK_DATA and SEEK_HOLE
Add llseek op for SEEK_DATA and SEEK_HOLE.
Improves xfstests/285,286,436,445,448 and 490

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-05-15 22:27:53 -05:00
Ronnie Sahlberg ebaf546a55 SMB3: Clean up query symlink when reparse point
Two of the common symlink formats use reparse points
(unlike mfsymlinks and also unlike the SMB1 posix
extensions).  This is the first part of the fixes
to allow these reparse points (NFS style and Windows
symlinks) to be resolved properly as symlinks by the
client.

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-05-07 23:24:55 -05:00
Steve French 26ea888f62 Negotiate and save preferred compression algorithms
New negotiate context (3) allows the server and client to
negotiate which compression algorithms to use. Add support
for this and save it off in the server structure.

Also now displayed in /proc/fs/cifs/DebugData (see below example
to Windows 10) where compression algoirthm "LZ77" was negotiated:

Servers:
Number of credits: 326 Dialect 0x311 COMPRESS_LZ77 signed
1) Name: 192.168.92.17 Uses: 1 Capability: 0x300067	Session Status: 1 TCP status: 1 Instance: 1

See MS-XCA and MS-SMB2 2.2.3.1 for more details.

Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2019-05-07 23:24:55 -05:00
Ronnie Sahlberg 392e1c5dc9 cifs: rename and clarify CIFS_ASYNC_OP and CIFS_NO_RESP
The flags were named confusingly.
CIFS_ASYNC_OP now just means that we will not block waiting for credits
to become available so we thus rename this to be CIFS_NON_BLOCKING.

Change CIFS_NO_RESP to CIFS_NO_RSP_BUF to clarify that we will actually get a
response from the server but we will not get/do not want a response buffer.

Delete CIFSSMBNotify. This is an SMB1 function that is not used.

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2019-05-07 23:24:55 -05:00
Ronnie Sahlberg d69cb728e7 cifs: fix credits leak for SMB1 oplock breaks
For SMB1 oplock breaks we would grab one credit while sending the PDU
but we would never relese the credit back since we will never receive a
response to this from the server. Eventuallt this would lead to a hang
once all credits are leaked.

Fix this by defining a new flag CIFS_NO_SRV_RSP which indicates that there
is no server response to this command and thus we need to add any credits back
immediately after sending the PDU.

CC: Stable <stable@vger.kernel.org> #v5.0+
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-05-07 23:24:55 -05:00
Ronnie Sahlberg 2f3ebaba13 cifs: add fiemap support
Useful for improved copy performance as well as for
applications which query allocated ranges of sparse
files.

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-05-07 23:24:55 -05:00
Aurelien Aptel d070f9dd62 CIFS: check CIFS_MOUNT_NO_DFS when trying to reuse existing sb
if we mount A then mount A again with nodfs, we shouldn't reuse the
superblock. document the purpose of the defines as well.

there are most likely more flags that needs to be added to this mask,
in fact the logic to find them should be which flag should
be *ignored* when trying to reuse an existing sb.

Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-05-07 23:24:54 -05:00
Steve French 433b8dd767 SMB3: Track total time spent on roundtrips for each SMB3 command
Also track minimum and maximum time by command in /proc/fs/cifs/Stats

Signed-off-by: Steve French <stfrench@microsoft.com>
2019-05-07 23:24:54 -05:00
Aurelien Aptel b98749cac4 CIFS: keep FileInfo handle live during oplock break
In the oplock break handler, writing pending changes from pages puts
the FileInfo handle. If the refcount reaches zero it closes the handle
and waits for any oplock break handler to return, thus causing a deadlock.

To prevent this situation:

* We add a wait flag to cifsFileInfo_put() to decide whether we should
  wait for running/pending oplock break handlers

* We keep an additionnal reference of the SMB FileInfo handle so that
  for the rest of the handler putting the handle won't close it.
  - The ref is bumped everytime we queue the handler via the
    cifs_queue_oplock_break() helper.
  - The ref is decremented at the end of the handler

This bug was triggered by xfstest 464.

Also important fix to address the various reports of
oops in smb2_push_mandatory_locks

Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
CC: Stable <stable@vger.kernel.org>
2019-04-16 09:38:38 -05:00
Steve French ca567eb2b3 SMB3: Allow persistent handle timeout to be configurable on mount
Reconnecting after server or network failure can be improved
(to maintain availability and protect data integrity) by allowing
the client to choose the default persistent (or resilient)
handle timeout in some use cases.  Today we default to 0 which lets
the server pick the default timeout (usually 120 seconds) but this
can be problematic for some workloads.  Add the new mount parameter
to cifs.ko for SMB3 mounts "handletimeout" which enables the user
to override the default handle timeout for persistent (mount
option "persistenthandles") or resilient handles (mount option
"resilienthandles").  Maximum allowed is 16 minutes (960000 ms).
Units for the timeout are expressed in milliseconds. See
section 2.2.14.2.12 and 2.2.31.3 of the MS-SMB2 protocol
specification for more information.

Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
CC: Stable <stable@vger.kernel.org>
2019-04-01 14:33:36 -05:00
Aurelien Aptel c847dccfbd CIFS: make mknod() an smb_version_op
This cleanup removes cifs specific code from SMB2/SMB3 code paths
which is cleaner and easier to maintain as the code to handle
special files is improved.  Below is an example creating special files
using 'sfu' mount option over SMB3 to Windows (with this patch)
(Note that to Samba server, support for saving dos attributes
has to be enabled for the SFU mount option to work).

In the future this will also make implementation of creating
special files as reparse points easier (as Windows NFS server does
for example).

   root@smf-Thinkpad-P51:~# stat -c "%F" /mnt2/char
   character special file

   root@smf-Thinkpad-P51:~# stat -c "%F" /mnt2/block
   block special file

Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2019-03-14 19:32:36 -05:00
Steve French 6552580286 cifs: minor documentation updates
Also updated a comment describing use of the GlobalMid_Lock

Signed-off-by: Steve French <stfrench@microsoft.com>
2019-03-14 19:32:36 -05:00
Ronnie Sahlberg b0f6df737a cifs: cache FILE_ALL_INFO for the shared root handle
When we open the shared root handle also ask for FILE_ALL_INFORMATION since
we can do this at zero cost as part of a compound.
Cache this information as long as the lease is held and return and serve any
future requests from cache.

This allows us to serve "stat /<mountpoint>" directly from cache and avoid
a network roundtrip.  Since clients often want to do this quite a lot
this improve performance slightly.

As an example: xfstest generic/533 performs 43 stat operations on the root
of the share while it is run. Which are eliminated with this patch.

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2019-03-14 19:32:35 -05:00
Ronnie Sahlberg b227d215de cifs: wait_for_free_credits() make it possible to wait for >=1 credits
Change wait_for_free_credits() to allow waiting for >=1 credits instead of just
a single credit.

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2019-03-14 19:32:35 -05:00
Pavel Shilovsky 9a1c67e8d5 CIFS: Adjust MTU credits before reopening a file
Currently we adjust MTU credits before sending an IO request
and after reopening a file. This approach doesn't allow the
reopen routine to use existing credits that are not needed
for IO. Reorder credit adjustment and reopening a file to
use credits available to the client more efficiently. Also
unwrap complex if statement into few pieces to improve
readability.

Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-03-05 18:10:01 -06:00
Pavel Shilovsky 34f4deb7c5 CIFS: Respect reconnect in non-MTU credits calculations
Every time after a session reconnect we don't need to account for
credits obtained in previous sessions. Make use of the recently
added cifs_credits structure to properly calculate credits for
non-MTU requests the same way we did for MTU ones.

Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-03-05 18:10:01 -06:00
Pavel Shilovsky 335b7b62ff CIFS: Respect reconnect in MTU credits calculations
Every time after a session reconnect we don't need to account for
credits obtained in previous sessions. Introduce new struct cifs_credits
which contains both credits value and reconnect instance of the
time those credits were taken. Modify a routine that add credits
back to handle the reconnect instance by assuming zero credits
if the reconnect happened after the credits were obtained and
before we decided to add them back due to some errors during sending.

This patch fixes the MTU credits cases. The subsequent patch
will handle non-MTU ones.

Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-03-05 18:10:01 -06:00
Pavel Shilovsky 66265f134a CIFS: Count SMB3 credits for malformed pending responses
Even if a response is malformed, we should count credits
granted by the server to avoid miscalculations and unnecessary
reconnects due to client or server bugs. If the response has
been received partially, the session will be reconnected anyway
on the next iteration of the demultiplex thread, so counting
credits for such cases shouldn't break things.

Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-03-04 20:06:39 -06:00
Pavel Shilovsky c781af7e0c CIFS: Do not skip SMB2 message IDs on send failures
When we hit failures during constructing MIDs or sending PDUs
through the network, we end up not using message IDs assigned
to the packet. The next SMB packet will skip those message IDs
and continue with the next one. This behavior may lead to a server
not granting us credits until we use the skipped IDs. Fix this by
reverting the current ID to the original value if any errors occur
before we push the packet through the network stack.

This patch fixes the generic/310 test from the xfs-tests.

Cc: <stable@vger.kernel.org> # 4.19.x
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-03-04 20:06:12 -06:00
Steve French e8506d25f7 smb3: make default i/o size for smb3 mounts larger
We negotiate rsize mounts (and it can be overridden by user) to
typically 4MB, so using larger default I/O sizes from userspace
(changing to 1MB default i/o size returned by stat) the
performance is much better (and not just for long latency
network connections) in most use cases for SMB3 than the default I/O
size (which ends up being 128K for cp and can be even smaller for cp).
This can be 4x slower or worse depending on network latency.

By changing inode->blocksize from 32K (which was perhaps ok
for very old SMB1/CIFS) to a larger value, 1MB (but still less than
max size negotiated with the server which is 4MB, in order to minimize
risk) it significantly increases performance for the
noncached case, and slightly increases it for the cached case.
This can be changed by the user on mount (specifying bsize=
values from 16K to 16MB) to tune better for performance
for applications that depend on blocksize.

Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
CC: Stable <stable@vger.kernel.org>
2019-03-04 20:05:35 -06:00
Pavel Shilovsky 9a66396f18 CIFS: Fix error paths in writeback code
This patch aims to address writeback code problems related to error
paths. In particular it respects EINTR and related error codes and
stores and returns the first error occurred during writeback.

Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Acked-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-01-11 07:14:40 -06:00
Pavel Shilovsky 8a26f0f781 CIFS: Fix credits calculation for cancelled requests
If a request is cancelled, we can't assume that the server returns
1 credit back. Instead we need to wait for a response and process
the number of credits granted by the server.

Create a separate mid callback for cancelled request, parse the number
of credits in a response buffer and add them to the client's credits.
If the didn't get a response (no response buffer available) assume
0 credits granted. The latter most probably happens together with
session reconnect, so the client's credits are adjusted anyway.

Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-01-11 07:14:40 -06:00
Paulo Alcantara 93d5cb517d cifs: Add support for failover in cifs_reconnect()
After failing to reconnect to original target, it will retry any
target available from DFS cache.

Signed-off-by: Paulo Alcantara <palcantara@suse.de>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2018-12-28 10:13:11 -06:00
Paulo Alcantara 1c780228e9 cifs: Make use of DFS cache to get new DFS referrals
This patch will make use of DFS cache routines where appropriate and
do not always request a new referral from server.

Signed-off-by: Paulo Alcantara <palcantara@suse.de>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2018-12-28 10:09:46 -06:00
Paulo Alcantara 54be1f6c1c cifs: Add DFS cache routines
* Add new dfs_cache.[ch] files

* Add new /proc/fs/cifs/dfscache file
  - dump current cache when read
  - clear current cache when writing "0" to it

* Add delayed_work to periodically refresh cache entries

The new interface will be used for caching DFS referrals, as well as
supporting client target failover.

The DFS cache is a hashtable that maps UNC paths to cache entries.

A cache entry contains:
- the UNC path it is mapped on
- how much the the UNC path the entry consumes
- flags
- a Time-To-Live after which the entry expires
- a list of possible targets (linked lists of UNC paths)
- a "hint target" pointing the last known working target or the first
  target if none were tried. This hint lets cifs.ko remember and try
  working targets first.

* Looking for an entry in the cache is done with dfs_cache_find()
  - if no valid entries are found, a DFS query is made, stored in the
    cache and returned
  - the full target list can be copied and returned to avoid race
    conditions and looped on with the help with the
    dfs_cache_tgt_iterator

* Updating the target hint to the next target is done with
  dfs_cache_update_tgthint()

These functions have a dfs_cache_noreq_XXX() version that doesn't
fetches referrals if no entries are found. These versions don't
require the tcp/ses/tcon/cifs_sb parameters as a result.

Expired entries cannot be used and since they have a pretty short TTL
[1] in order for them to be useful for failover the DFS cache adds a
delayed work called periodically to keep them fresh.

Since we might not have available connections to issue the referral
request when refreshing we need to store volume_info structs with
credentials and other needed info to be able to connect to the right
server.

1: Windows defaults: 5mn for domain-based referrals, 30mn for regular
links

Signed-off-by: Paulo Alcantara <palcantara@suse.de>
Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2018-12-28 10:05:58 -06:00
Paulo Alcantara e7b602f437 cifs: Save TTL value when parsing DFS referrals
This will be needed by DFS cache.

Signed-off-by: Paulo Alcantara <palcantara@suse.de>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2018-12-23 23:49:00 -06:00
Kenneth D'souza 4a3b38aec5 Add vers=3.0.2 as a valid option for SMBv3.0.2
Technically 3.02 is not the dialect name although that is more familiar to
many, so we should also accept the official dialect name (3.0.2 vs. 3.02)
in vers=

Signed-off-by: Kenneth D'souza <kdsouza@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2018-12-23 22:39:29 -06:00
Long Li 6e6e2b86c2 CIFS: Add support for direct I/O read
With direct I/O read, we transfer the data directly from transport layer to
the user data buffer.

Change in v3: add support for kernel AIO

Change in v4:
Refactor common read code to __cifs_readv for direct and non-direct I/O.
Retry on direct I/O failure.

Signed-off-by: Long Li <longli@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2018-11-02 14:09:41 -05:00
Steve French dfe33f9abc smb3: allow more detailed protocol info on open files for debugging
In order to debug complex problems it is often helpful to
have detailed information on the client and server view
of the open file information.  Add the ability for root to
view the list of smb3 open files and dump the persistent
handle and other info so that it can be more easily
correlated with server logs.

Sample output from "cat /proc/fs/cifs/open_files"

 # Version:1
 # Format:
 # <tree id> <persistent fid> <flags> <count> <pid> <uid> <filename> <mid>
 0x5 0x800000378 0x8000 1 7704 0 some-file 0x14
 0xcb903c0c 0x84412e67 0x8000 1 7754 1001 rofile 0x1a6d
 0xcb903c0c 0x9526b767 0x8000 1 7720 1000 file 0x1a5b
 0xcb903c0c 0x9ce41a21 0x8000 1 7715 0 smallfile 0xd67

Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2018-11-02 14:09:41 -05:00