1
0
Fork 0
Commit Graph

1723 Commits (redonkable)

Author SHA1 Message Date
Tang Wenji 7ee0031786 target: Fix return sense reason in target_scsi3_emulate_pr_out
The sense reason should be TCM_PARAMETER_LIST_LENGTH_ERROR when
parmeter length error.

Also the cdb[1] & 0x1f has been assigned to local variable sa,
so use sa instead of it.

Signed-off-by: Tang Wenji <tang.wenji@zte.com.cn>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-09 20:59:17 -07:00
Tang Wenji 388fe6996b target: Fix cmd size for PR-OUT in passthrough_parse_cdb
The cmd size should be 4bytes form byte5 to byte8 when CDB opcode
is PERSISTENT_RESERVE_OUT in SPC3 and SPC4

(Also fix up the same in spc_parse_cdb - MNC)

Signed-off-by: Tang Wenji <tang.wenji@zte.com.cn>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-09 20:58:49 -07:00
Bryant G. Ly de8c5221aa tcmu: Fix dev_config_store
Currently when there is a reconfig, the uio_info->name
does not get updated to reflect the change in the dev_config
name change.

On restart tcmu-runner there will be a mismatch between
the dev_config string in uio and the tcmu structure that contains
the string. When this occurs it'll reload the one in uio
and you lose the reconfigured device path.

v2: Created a helper function for the updating of uio_info

Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-09 20:57:56 -07:00
Damien Le Moal 016a5fec19 target: pscsi: Introduce TYPE_ZBC support
TYPE_ZBC host managed zoned block devices are also block devices
despite the non-standard device type (14h). Handle them similarly to
regular TYPE_DISK devices.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:49 -07:00
Damien Le Moal e5dc9a7055 target: Use macro for WRITE_VERIFY_32 operation codes
Add WRITE_VERIFY_32 definition to scsi prototypes and use this macro
definition isntead of the hard coded value.

(Drop WRITE_VERIFY_16 that's already part of another patch - nab)

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:49 -07:00
Mike Christie 402242c904 target: fix SAM_STAT_BUSY/TASK_SET_FULL handling
If the scsi status was not SAM_STAT_GOOD or there was no transport
sense, we would ignore the scsi status and do a generic not ready
LUN communication failure check condition failure.

The problem is that LUN COMM failure is treated as a hard error
sometimes and will cause apps to get IO errors instead of the OS's SCSI
layer retrying. For example, the tcmu daemon will return SAM_STAT_QUEUE_FULL
when memory runs low and can still make progress but wants the initiator to
reduce the work load. Windows will fail this error directly the app
instead of retrying.

This patch is based on Nick's "target/iblock: Use -EAGAIN/-ENOMEM to
propigate SAM BUSY/TASK_SET_FULL" patch here:

http://comments.gmane.org/gmane.linux.scsi.target.devel/11301

but instead of only setting SAM_STAT_GOOD, SAM_STAT_TASK_SET_FULL
and SAM_STAT_BUSY as success, it sets all non check condition status
as success so they are passed back to the initiator, so passthrough
type backends can return all SCSI status codes. Since only passthrough
uses this, I was not sure if we wanted to add checks for non-passthrough
and specific codes.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:48 -07:00
Mike Christie 1a44417548 target: remove transport_complete
transport_complete is no longer used, so drop the code.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:47 -07:00
Mike Christie dce6ce8cfb pscsi: finish cmd processing from pscsi_req_done
This patch performs the pscsi_transport_complete operations from
pscsi_req_done. It looks like the only difference the transport_complete
callout provides is that it is called under t_state_lock which seems to
only be needed for the SCF_TRANSPORT_TASK_SENSE bit handling. We can
now use transport_copy_sense_to_cmd to handle the se_cmd sense bits, and
we can then drop the code where we have to copy the request info to
the pscsi_plugin_task for transport_complete use.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:46 -07:00
Mike Christie 406f74c20d tcmu: fix sense handling during completion
We were just copying the sense to the cmd sense_buffer and
did not implement a transport_complete or set the
SCF_TRANSPORT_TASK_SENSE, so the sense was ignored.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:46 -07:00
Mike Christie c6d66aba98 target: add helper to copy sense to se_cmd buffer
This adds a helper to copy sense from backend module buffer to
the se_cmd's sense buffer.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:45 -07:00
Mike Christie 9fe3698450 target: do not require a transport_complete for SCF_TRANSPORT_TASK_SENSE
tcmu needs to pass raw sense to target_complete_cmd, but a a
transport_complete callout is akward to implement for it.

This moves the check for SCF_TRANSPORT_TASK_SENSE so any backend
can pass sense.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:44 -07:00
Colin Ian King c82ff239ec target: make device_mutex and device_list static
Variables device_mutex and device_list static are local to the source,
so make them static.

Cleans up sparse warnings:
"symbol 'device_list' was not declared. Should it be static?"
"symbol 'device_mutex' was not declared. Should it be static?"

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:44 -07:00
Xiubo Li 9d62bc0e6d tcmu: Fix flushing cmd entry dcache page
When feeding the tcmu's cmd ring, we need to flush the dcache page
for the cmd entry to make sure these kernel stores are visible to
user space mappings of that page.

For the none PAD cmd entry, this will be flushed at the end of the
tcmu_queue_cmd_ring().

Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:43 -07:00
Mike Christie 9260695d65 tcmu: fix multiple uio open/close sequences
If the uio device is open and closed multiple times, the
kref count will be off due to tcmu_release getting called
multiple times for each close. This patch integrates
Wenji Tang's patch to add a kref_get on open that now
matches the kref_put done on tcmu_release and adds
a kref_put in tcmu_destroy_device to match the kref_get
done in succesful tcmu_configure_device calls.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Cc: Wenji Tang <tang.wenji@zte.com.cn>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:42 -07:00
Mike Christie 531283ff75 tcmu: drop configured check in destroy
destroy_device is only called if we have successfully run
configure_device, so drop the duplicate tcmu_dev_configured check.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:41 -07:00
Mike Christie be50f538e9 target: remove g_device_list
g_device_list is no longer needed because we now use the idr code
for lookups and seaches.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:41 -07:00
Mike Christie 6906d008b4 xcopy: loop over devices using idr helper
This converts the xcopy code to use the idr helper. The next patch
will drop the g_device_list and make g_device_mutex local to the
target_core_device.c file.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:40 -07:00
Mike Christie b1943fd454 target: add helper to iterate over devices
This adds a wrapper around idr_for_each so the xcopy code can loop over
the devices in the next patch.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:39 -07:00
Mike Christie b3af66e243 tcmu: perfom device add, del and reconfig synchronously
This makes the device add, del reconfig operations sync. It fixes
the issue where for add and reconfig, we do not know if userspace
successfully completely the operation, so we leave invalid kernel
structs or report incorrect status for the config/reconfig operations.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:39 -07:00
Mike Christie 85441e6b8c target: add helper to find se_device by dev_index
This adds a helper to find a se_device by dev_index. It will
be used in the next patches so tcmu's netlink interface can
execute commands on specific devices.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:38 -07:00
Mike Christie 0a5eee647b target: use idr for se_device dev index
In the next patches we will add tcmu netlink support that allows
userspace to send commands to target_core_user. To execute operations
on a se_device/tcmu_dev we need to be able to look up a dev by any old
id. This patch replaces the se_device->dev_index with a idr created
id.

The next patches will also remove the g_device_list and replace it with
the idr.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:37 -07:00
Mike Christie 926347061e target: break up free_device callback
With this patch free_device is now used to free what is allocated in the
alloc_device callback and destroy_device tears down the resources that are
setup in the configure_device callback.

This patch will be needed in the next patch where tcmu needs
to be able to look up the device in the destroy callback.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:37 -07:00
Mike Christie 2d76443e02 tcmu: reconfigure netlink attr changes
1. TCMU_ATTR_TYPE is too generic when it describes only the
reconfiguration type, so rename to TCMU_ATTR_RECONFIG_TYPE.

2. Only return the reconfig type when it is a
TCMU_CMD_RECONFIG_DEVICE command.

3. CONFIG_* type is not needed. We can pass the value along with an
ATTR to userspace, so it does not need to read sysfs/configfs.

4. Fix leak in tcmu_dev_path_store and rename to dev_config to
reflect it is more than just a path that can be changed.

6. Don't update kernel struct value if netlink sending fails.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: "Bryant G. Ly" <bryantly@linux.vnet.ibm.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:36 -07:00
Colin Ian King 5821783bca tcmu: make array tcmu_attrib_attrs static const
The array tcmu_attrib_attrs does not need to be in global scope, so make
it static.

Cleans up sparse warning:
"symbol 'tcmu_attrib_attrs' was not declared. Should it be static?"

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:34 -07:00
Xiubo Li 07932a023a tcmu: Fix module removal due to stuck unmap_thread thread again
Because the unmap code just after the schdule() returned may take
a long time and if the kthread_stop() is fired just when in this
routine, the module removal maybe stuck too.

Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:33 -07:00
Jiang Yi 1d6ef27659 target: Fix COMPARE_AND_WRITE caw_sem leak during se_cmd quiesce
This patch addresses a COMPARE_AND_WRITE se_device->caw_sem leak,
that would be triggered during normal se_cmd shutdown or abort
via __transport_wait_for_tasks().

This would occur because target_complete_cmd() would catch this
early and do complete_all(&cmd->t_transport_stop_comp), but since
target_complete_ok_work() or target_complete_failure_work() are
never called to invoke se_cmd->transport_complete_callback(),
the COMPARE_AND_WRITE specific callbacks never release caw_sem.

To address this special case, go ahead and release caw_sem
directly from target_complete_cmd().

(Remove '&& success' from check, to release caw_sem regardless
 of scsi_status - nab)

Signed-off-by: Jiang Yi <jiangyilism@gmail.com>
Cc: <stable@vger.kernel.org> # 3.14+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:32 -07:00
Bryant G. Ly 8a45885c15 tcmu: Add Type of reconfig into netlink
This patch adds more info about the attribute being changed,
so that usersapce can easily figure out what is happening.

Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Reviewed-By: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:32 -07:00
Bryant G. Ly ee01825220 tcmu: Make dev_config configurable
This allows for userspace to change the device path after
it has been created. Thus giving the user the ability to change
the path. The use case for this is to allow for virtual optical
to have media change.

Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Reviewed-By: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:31 -07:00
Bryant G. Ly 801fc54d5d tcmu: Make dev_size configurable via userspace
Allow tcmu backstores to be able to set the device size
after it has been configured via set attribute.

Part of support in userspace to support certain backstores
changing device size.

Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Reviewed-By: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:30 -07:00
Bryant G. Ly 1068be7bd4 tcmu: Add netlink for device reconfiguration
This gives tcmu the ability to handle events that can cause
reconfiguration, such as resize, path changes, write_cache, etc...

Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Reviewed-By: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:30 -07:00
Bryant G. Ly 9a8bb60650 tcmu: Support emulate_write_cache
This will enable the toggling of write_cache in tcmu through targetcli-fb

Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Reviewed-By: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:29 -07:00
Bart Van Assche 8fa4011e0d target/iscsi: Remove dead code from iscsit_process_scsi_cmd()
If an iSCSI command is rejected before iscsit_process_scsi_cmd()
is called, .reject_reason is set but iscsit_process_scsi_cmd() is
not called. This means that the "if (cmd->reject_reason) ..." code
in this function can be removed without changing the behavior of
this function.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Andy Grover <agrover@redhat.com>
Cc: David Disseldorp <ddiss@suse.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:28 -07:00
Bart Van Assche d1c26857cd target/iscsi: Simplify iscsit_free_cmd()
Since .se_tfo is only set if a command has been submitted to
the LIO core, check .se_tfo instead of .iscsi_opcode. Since
__iscsit_free_cmd() only affects SCSI commands but not TMFs,
calling that function for TMFs does not change behavior. This
patch does not change the behavior of iscsit_free_cmd().

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Andy Grover <agrover@redhat.com>
Cc: David Disseldorp <ddiss@suse.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:27 -07:00
Bart Van Assche 4412a67131 target/iscsi: Remove second argument of __iscsit_free_cmd()
Initialize .data_direction to DMA_NONE in iscsit_allocate_cmd()
such that the second argument of __iscsit_free_cmd() can be left
out. Note: this patch causes the first part of __iscsit_free_cmd()
no longer to be skipped for TMFs. That's fine since no data
segments are associated with TMFs.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Andy Grover <agrover@redhat.com>
Cc: David Disseldorp <ddiss@suse.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:26 -07:00
Bart Van Assche 4c1f0e6539 target/tcm_loop: Make TMF processing slightly faster
Target drivers must guarantee that struct se_cmd and struct se_tmr_req
exist as long as target_tmr_work() is in progress. This is why the
tcm_loop driver today passes 1 as second argument to
transport_generic_free_cmd() from inside the TMF code. Instead of
making the TMF code wait, make the TMF code obtain two references
(SCF_ACK_KREF) and drop one reference from inside the .check_stop_free()
callback.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.com>
Cc: David Disseldorp <ddiss@suse.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:25 -07:00
Bart Van Assche 75f141aaf4 target/tcm_loop: Use target_submit_tmr() instead of open-coding this function
Use target_submit_tmr() instead of open-coding this function. The
only functional change is that TMFs are now added to sess_cmd_list,
something the current code does not do. This behavior change is a
bug fix because it makes LUN RESETs wait for other TMFs that are in
progress for the same LUN.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.com>
Cc: David Disseldorp <ddiss@suse.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:25 -07:00
Bart Van Assche d17203c411 target/tcm_loop: Replace a waitqueue and a counter by a completion
This patch simplifies the implementation of the tcm_loop driver
but does not change its behavior.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.com>
Cc: David Disseldorp <ddiss@suse.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:24 -07:00
Bart Van Assche 4d3895d5ea target/tcm_loop: Merge struct tcm_loop_cmd and struct tcm_loop_tmr
This patch simplifies the tcm_loop implementation but does not
change any functionality.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.com>
Cc: David Disseldorp <ddiss@suse.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:23 -07:00
Bart Van Assche c00e622023 target: Introduce a function that shows the command state
Introduce target_show_cmd() and use it where appropriate. If
transport_wait_for_tasks() takes too long, make it show the
state of the command it is waiting for.

(Add missing brackets around multi-line conditions - nab)

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Andy Grover <agrover@redhat.com>
Cc: David Disseldorp <ddiss@suse.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 22:58:04 -07:00
Nicholas Bellinger 03db016a1b iscsi-target: Kill left-over iscsi_target_do_cleanup
With commit 25cdda95fd in place to address the initial login
PDU asynchronous socket close OOPs, go ahead and kill off the
left-over iscsi_target_do_cleanup() and ->login_cleanup_work.

Reported-by: Mike Christie <mchristi@redhat.com>
Cc: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 22:58:04 -07:00
Bart Van Assche d877d7275b target: Fix a deadlock between the XCOPY code and iSCSI session shutdown
Move the code for parsing an XCOPY command from the context of
the iSCSI receiver thread to the context of the XCOPY workqueue.
Keep the simple XCOPY checks in the context of the iSCSI receiver
thread. Move the code for allocating and freeing struct xcopy_op
from the code that parses an XCOPY command to its caller.

This patch fixes the following deadlock:

======================================================
[ INFO: possible circular locking dependency detected ]
4.10.0-rc7-dbg+ #1 Not tainted
-------------------------------------------------------
rmdir/13321 is trying to acquire lock:
 (&sess->cmdsn_mutex){+.+.+.}, at: [<ffffffffa02cb47d>] iscsit_free_all_ooo_cmdsns+0x2d/0xb0 [iscsi_target_mod]

but task is already holding lock:
 (&sb->s_type->i_mutex_key#14){++++++}, at: [<ffffffff811c6e20>] vfs_rmdir+0x50/0x140

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:
-> #1 (&sb->s_type->i_mutex_key#14){++++++}:
 lock_acquire+0x71/0x90
 down_write+0x3f/0x70
 configfs_depend_item+0x3a/0xb0 [configfs]
 target_depend_item+0x13/0x20 [target_core_mod]
 target_xcopy_locate_se_dev_e4+0xdd/0x1a0 [target_core_mod]
 target_do_xcopy+0x34b/0x970 [target_core_mod]
 __target_execute_cmd+0x22/0xa0 [target_core_mod]
 target_execute_cmd+0x233/0x2c0 [target_core_mod]
 iscsit_execute_cmd+0x208/0x270 [iscsi_target_mod]
 iscsit_sequence_cmd+0x10b/0x190 [iscsi_target_mod]
 iscsit_get_rx_pdu+0x37d/0xcd0 [iscsi_target_mod]
 iscsi_target_rx_thread+0x6e/0xa0 [iscsi_target_mod]
 kthread+0x102/0x140
 ret_from_fork+0x31/0x40

-> #0 (&sess->cmdsn_mutex){+.+.+.}:
 __lock_acquire+0x10e6/0x1260
 lock_acquire+0x71/0x90
 mutex_lock_nested+0x5f/0x670
 iscsit_free_all_ooo_cmdsns+0x2d/0xb0 [iscsi_target_mod]
 iscsit_close_session+0xac/0x200 [iscsi_target_mod]
 lio_tpg_close_session+0x9f/0xb0 [iscsi_target_mod]
 target_shutdown_sessions+0xc3/0xd0 [target_core_mod]
 core_tpg_del_initiator_node_acl+0x91/0x140 [target_core_mod]
 target_fabric_nacl_base_release+0x20/0x30 [target_core_mod]
 config_item_release+0x5a/0xc0 [configfs]
 config_item_put+0x1d/0x1f [configfs]
 configfs_rmdir+0x1a6/0x300 [configfs]
 vfs_rmdir+0xb7/0x140
 do_rmdir+0x1f4/0x200
 SyS_rmdir+0x11/0x20
 entry_SYSCALL_64_fastpath+0x23/0xc6

other info that might help us debug this:

 Possible unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock(&sb->s_type->i_mutex_key#14);
                               lock(&sess->cmdsn_mutex);
                               lock(&sb->s_type->i_mutex_key#14);
  lock(&sess->cmdsn_mutex);

 *** DEADLOCK ***

3 locks held by rmdir/13321:
 #0:  (sb_writers#10){.+.+.+}, at: [<ffffffff811e1aff>] mnt_want_write+0x1f/0x50
 #1:  (&default_group_class[depth - 1]#2/1){+.+.+.}, at: [<ffffffff811cc8ce>] do_rmdir+0x15e/0x200
 #2:  (&sb->s_type->i_mutex_key#14){++++++}, at: [<ffffffff811c6e20>] vfs_rmdir+0x50/0x140

stack backtrace:
CPU: 2 PID: 13321 Comm: rmdir Not tainted 4.10.0-rc7-dbg+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
Call Trace:
 dump_stack+0x86/0xc3
 print_circular_bug+0x1c7/0x220
 __lock_acquire+0x10e6/0x1260
 lock_acquire+0x71/0x90
 mutex_lock_nested+0x5f/0x670
 iscsit_free_all_ooo_cmdsns+0x2d/0xb0 [iscsi_target_mod]
 iscsit_close_session+0xac/0x200 [iscsi_target_mod]
 lio_tpg_close_session+0x9f/0xb0 [iscsi_target_mod]
 target_shutdown_sessions+0xc3/0xd0 [target_core_mod]
 core_tpg_del_initiator_node_acl+0x91/0x140 [target_core_mod]
 target_fabric_nacl_base_release+0x20/0x30 [target_core_mod]
 config_item_release+0x5a/0xc0 [configfs]
 config_item_put+0x1d/0x1f [configfs]
 configfs_rmdir+0x1a6/0x300 [configfs]
 vfs_rmdir+0xb7/0x140
 do_rmdir+0x1f4/0x200
 SyS_rmdir+0x11/0x20
 entry_SYSCALL_64_fastpath+0x23/0xc6

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Andy Grover <agrover@redhat.com>
Cc: David Disseldorp <ddiss@suse.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 22:58:00 -07:00
Bart Van Assche a85d667e58 target: Use {get,put}_unaligned_be*() instead of open coding these functions
Introduce the function get_unaligned_be24(). Use {get,put}_unaligned_be*()
where appropriate. This patch does not change any functionality.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Cc: Andy Grover <agrover@redhat.com>
Cc: David Disseldorp <ddiss@suse.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 22:57:59 -07:00
Bart Van Assche f2b72d6a8e target: Fix transport_init_se_cmd()
Avoid that aborting a command before it has been submitted onto
a workqueue triggers the following warning:

INFO: trying to register non-static key.
the code is fine but needs lockdep annotation.
turning off the locking correctness validator.
CPU: 3 PID: 46 Comm: kworker/u8:1 Not tainted 4.12.0-rc2-dbg+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
Workqueue: tmr-iblock target_tmr_work [target_core_mod]
Call Trace:
 dump_stack+0x86/0xcf
 register_lock_class+0xe8/0x570
 __lock_acquire+0xa1/0x11d0
 lock_acquire+0x59/0x80
 flush_work+0x42/0x2b0
 __cancel_work_timer+0x10c/0x180
 cancel_work_sync+0xb/0x10
 core_tmr_lun_reset+0x352/0x740 [target_core_mod]
 target_tmr_work+0xd6/0x130 [target_core_mod]
 process_one_work+0x1ca/0x3f0
 worker_thread+0x49/0x3b0
 kthread+0x109/0x140
 ret_from_fork+0x31/0x40

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.com>
Cc: David Disseldorp <ddiss@suse.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 22:57:59 -07:00
Bart Van Assche 9f2f342892 target: Remove se_device.dev_list
The last user of se_device.dev_list was removed through commit
0fd97ccf45 ("target: kill struct se_subsystem_dev"). Hence
also remove se_device.dev_list.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Andy Grover <agrover@redhat.com>
Cc: David Disseldorp <ddiss@suse.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 22:57:58 -07:00
Bart Van Assche 3e182db787 target: Use symbolic value for WRITE_VERIFY_16
Now that a symbolic value has been introduced for WRITE_VERIFY_16,
use it. This patch does not change any functionality.

References: commit c2d26f18dc ("target: Add WRITE_VERIFY_16")
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Cc: Andy Grover <agrover@redhat.com>
Cc: David Disseldorp <ddiss@suse.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 22:57:57 -07:00
Nicholas Bellinger 5465e7d3b9 target: Add TARGET_SCF_LOOKUP_LUN_FROM_TAG support for ABORT_TASK
This patch introduces support in target_submit_tmr() for locating a
unpacked_lun from an existing se_cmd->tag during ABORT_TASK.

When TARGET_SCF_LOOKUP_LUN_FROM_TAG is set, target_submit_tmr()
will do the extra lookup via target_lookup_lun_from_tag() and
subsequently invoke transport_lookup_tmr_lun() so a proper
percpu se_lun->lun_ref is taken before workqueue dispatch into
se_device->tmr_wq happens.

Aside from the extra target_lookup_lun_from_tag(), the existing
code-path remains unchanged.

Reviewed-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Quinn Tran <quinn.tran@cavium.com>
Cc: Mike Christie <mchristi@redhat.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 22:57:56 -07:00
Nicholas Bellinger eeb64d239e target: Add support for TMR percpu reference counting
This patch introduces TMR percpu reference counting using
se_lun->lun_ref in transport_lookup_tmr_lun(), following
how existing non TMR per se_lun reference counting works
within transport_lookup_cmd_lun().

It also adds explicit transport_lun_remove_cmd() calls to
drop the reference in the three tmr related locations that
invoke transport_cmd_check_stop_to_fabric();

   - target_tmr_work() during normal ->queue_tm_rsp()
   - target_complete_tmr_failure() during error ->queue_tm_rsp()
   - transport_generic_handle_tmr() during early failure

Also, note the exception paths in transport_generic_free_cmd()
and transport_cmd_finish_abort() already check SCF_SE_LUN_CMD,
and will invoke transport_lun_remove_cmd() when necessary.

Reviewed-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Quinn Tran <quinn.tran@cavium.com>
Cc: Mike Christie <mchristi@redhat.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 22:57:55 -07:00
Jiang Yi 12f66e4a0f target: reject COMPARE_AND_WRITE if emulate_caw is not set
In struct se_dev_attrib, there is a field emulate_caw exposed
as a /sys/kernel/config/target/core/$HBA/$DEV/attrib/.

If this field is set zero, it means the corresponding struct se_device
does not support the scsi cmd COMPARE_AND_WRITE

In function sbc_parse_cdb(), go ahead and reject scsi COMPARE_AND_WRITE
if emulate_caw is not set, because it has been explicitly disabled
from user-space.

(Make pr_err ratelimited - nab)

Signed-off-by: Jiang Yi <jiangyilism@gmail.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 22:57:54 -07:00
Linus Torvalds 89fbf5384d Merge branch 'work.read_write' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull read/write updates from Al Viro:
 "Christoph's fs/read_write.c series - consolidation and cleanups"

* 'work.read_write' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  nfsd: remove nfsd_vfs_read
  nfsd: use vfs_iter_read/write
  fs: implement vfs_iter_write using do_iter_write
  fs: implement vfs_iter_read using do_iter_read
  fs: move more code into do_iter_read/do_iter_write
  fs: remove __do_readv_writev
  fs: remove do_compat_readv_writev
  fs: remove do_readv_writev
2017-07-05 14:35:57 -07:00
Linus Torvalds 5518b69b76 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
 "Reasonably busy this cycle, but perhaps not as busy as in the 4.12
  merge window:

   1) Several optimizations for UDP processing under high load from
      Paolo Abeni.

   2) Support pacing internally in TCP when using the sch_fq packet
      scheduler for this is not practical. From Eric Dumazet.

   3) Support mutliple filter chains per qdisc, from Jiri Pirko.

   4) Move to 1ms TCP timestamp clock, from Eric Dumazet.

   5) Add batch dequeueing to vhost_net, from Jason Wang.

   6) Flesh out more completely SCTP checksum offload support, from
      Davide Caratti.

   7) More plumbing of extended netlink ACKs, from David Ahern, Pablo
      Neira Ayuso, and Matthias Schiffer.

   8) Add devlink support to nfp driver, from Simon Horman.

   9) Add RTM_F_FIB_MATCH flag to RTM_GETROUTE queries, from Roopa
      Prabhu.

  10) Add stack depth tracking to BPF verifier and use this information
      in the various eBPF JITs. From Alexei Starovoitov.

  11) Support XDP on qed device VFs, from Yuval Mintz.

  12) Introduce BPF PROG ID for better introspection of installed BPF
      programs. From Martin KaFai Lau.

  13) Add bpf_set_hash helper for TC bpf programs, from Daniel Borkmann.

  14) For loads, allow narrower accesses in bpf verifier checking, from
      Yonghong Song.

  15) Support MIPS in the BPF selftests and samples infrastructure, the
      MIPS eBPF JIT will be merged in via the MIPS GIT tree. From David
      Daney.

  16) Support kernel based TLS, from Dave Watson and others.

  17) Remove completely DST garbage collection, from Wei Wang.

  18) Allow installing TCP MD5 rules using prefixes, from Ivan
      Delalande.

  19) Add XDP support to Intel i40e driver, from Björn Töpel

  20) Add support for TC flower offload in nfp driver, from Simon
      Horman, Pieter Jansen van Vuuren, Benjamin LaHaise, Jakub
      Kicinski, and Bert van Leeuwen.

  21) IPSEC offloading support in mlx5, from Ilan Tayari.

  22) Add HW PTP support to macb driver, from Rafal Ozieblo.

  23) Networking refcount_t conversions, From Elena Reshetova.

  24) Add sock_ops support to BPF, from Lawrence Brako. This is useful
      for tuning the TCP sockopt settings of a group of applications,
      currently via CGROUPs"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1899 commits)
  net: phy: dp83867: add workaround for incorrect RX_CTRL pin strap
  dt-bindings: phy: dp83867: provide a workaround for incorrect RX_CTRL pin strap
  cxgb4: Support for get_ts_info ethtool method
  cxgb4: Add PTP Hardware Clock (PHC) support
  cxgb4: time stamping interface for PTP
  nfp: default to chained metadata prepend format
  nfp: remove legacy MAC address lookup
  nfp: improve order of interfaces in breakout mode
  net: macb: remove extraneous return when MACB_EXT_DESC is defined
  bpf: add missing break in for the TCP_BPF_SNDCWND_CLAMP case
  bpf: fix return in load_bpf_file
  mpls: fix rtm policy in mpls_getroute
  net, ax25: convert ax25_cb.refcount from atomic_t to refcount_t
  net, ax25: convert ax25_route.refcount from atomic_t to refcount_t
  net, ax25: convert ax25_uid_assoc.refcount from atomic_t to refcount_t
  net, sctp: convert sctp_ep_common.refcnt from atomic_t to refcount_t
  net, sctp: convert sctp_transport.refcnt from atomic_t to refcount_t
  net, sctp: convert sctp_chunk.refcnt from atomic_t to refcount_t
  net, sctp: convert sctp_datamsg.refcnt from atomic_t to refcount_t
  net, sctp: convert sctp_auth_bytes.refcnt from atomic_t to refcount_t
  ...
2017-07-05 12:31:59 -07:00
Dmitry Monakhov 128b6f9fdd t10-pi: Move opencoded contants to common header
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-07-03 16:56:25 -06:00
Linus Torvalds c6b1e36c8f Merge branch 'for-4.13/block' of git://git.kernel.dk/linux-block
Pull core block/IO updates from Jens Axboe:
 "This is the main pull request for the block layer for 4.13. Not a huge
  round in terms of features, but there's a lot of churn related to some
  core cleanups.

  Note this depends on the UUID tree pull request, that Christoph
  already sent out.

  This pull request contains:

   - A series from Christoph, unifying the error/stats codes in the
     block layer. We now use blk_status_t everywhere, instead of using
     different schemes for different places.

   - Also from Christoph, some cleanups around request allocation and IO
     scheduler interactions in blk-mq.

   - And yet another series from Christoph, cleaning up how we handle
     and do bounce buffering in the block layer.

   - A blk-mq debugfs series from Bart, further improving on the support
     we have for exporting internal information to aid debugging IO
     hangs or stalls.

   - Also from Bart, a series that cleans up the request initialization
     differences across types of devices.

   - A series from Goldwyn Rodrigues, allowing the block layer to return
     failure if we will block and the user asked for non-blocking.

   - Patch from Hannes for supporting setting loop devices block size to
     that of the underlying device.

   - Two series of patches from Javier, fixing various issues with
     lightnvm, particular around pblk.

   - A series from me, adding support for write hints. This comes with
     NVMe support as well, so applications can help guide data placement
     on flash to improve performance, latencies, and write
     amplification.

   - A series from Ming, improving and hardening blk-mq support for
     stopping/starting and quiescing hardware queues.

   - Two pull requests for NVMe updates. Nothing major on the feature
     side, but lots of cleanups and bug fixes. From the usual crew.

   - A series from Neil Brown, greatly improving the bio rescue set
     support. Most notably, this kills the bio rescue work queues, if we
     don't really need them.

   - Lots of other little bug fixes that are all over the place"

* 'for-4.13/block' of git://git.kernel.dk/linux-block: (217 commits)
  lightnvm: pblk: set line bitmap check under debug
  lightnvm: pblk: verify that cache read is still valid
  lightnvm: pblk: add initialization check
  lightnvm: pblk: remove target using async. I/Os
  lightnvm: pblk: use vmalloc for GC data buffer
  lightnvm: pblk: use right metadata buffer for recovery
  lightnvm: pblk: schedule if data is not ready
  lightnvm: pblk: remove unused return variable
  lightnvm: pblk: fix double-free on pblk init
  lightnvm: pblk: fix bad le64 assignations
  nvme: Makefile: remove dead build rule
  blk-mq: map all HWQ also in hyperthreaded system
  nvmet-rdma: register ib_client to not deadlock in device removal
  nvme_fc: fix error recovery on link down.
  nvmet_fc: fix crashes on bad opcodes
  nvme_fc: Fix crash when nvme controller connection fails.
  nvme_fc: replace ioabort msleep loop with completion
  nvme_fc: fix double calls to nvme_cleanup_cmd()
  nvme-fabrics: verify that a controller returns the correct NQN
  nvme: simplify nvme_dev_attrs_are_visible
  ...
2017-07-03 10:34:51 -07:00
David S. Miller b079115937 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
A set of overlapping changes in macvlan and the rocker
driver, nothing serious.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-30 12:43:08 -04:00
Christoph Hellwig abbb65899a fs: implement vfs_iter_write using do_iter_write
De-dupliate some code and allow for passing the flags argument to
vfs_iter_write.  Additionally it now properly updates timestamps.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-06-29 17:49:23 -04:00
Christoph Hellwig 18e9710ee5 fs: implement vfs_iter_read using do_iter_read
De-dupliate some code and allow for passing the flags argument to
vfs_iter_read.  Additional it properly updates atime now.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-06-29 17:49:23 -04:00
Bart Van Assche ca18d6f769 block: Make most scsi_req_init() calls implicit
Instead of explicitly calling scsi_req_init() after blk_get_request(),
call that function from inside blk_get_request(). Add an
.initialize_rq_fn() callback function to the block drivers that need
it. Merge the IDE .init_rq_fn() function into .initialize_rq_fn()
because it is too small to keep it as a separate function. Keep the
scsi_req_init() call in ide_prep_sense() because it follows a
blk_rq_init() call.

References: commit 82ed4db499 ("block: split scsi_request out of struct request")
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Omar Sandoval <osandov@fb.com>
Cc: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-20 19:27:14 -06:00
yuan linyu de77b966ce net: introduce __skb_put_[zero, data, u8]
follow Johannes Berg, semantic patch file as below,
@@
identifier p, p2;
expression len;
expression skb;
type t, t2;
@@
(
-p = __skb_put(skb, len);
+p = __skb_put_zero(skb, len);
|
-p = (t)__skb_put(skb, len);
+p = __skb_put_zero(skb, len);
)
... when != p
(
p2 = (t2)p;
-memset(p2, 0, len);
|
-memset(p, 0, len);
)

@@
identifier p;
expression len;
expression skb;
type t;
@@
(
-t p = __skb_put(skb, len);
+t p = __skb_put_zero(skb, len);
)
... when != p
(
-memset(p, 0, len);
)

@@
type t, t2;
identifier p, p2;
expression skb;
@@
t *p;
...
(
-p = __skb_put(skb, sizeof(t));
+p = __skb_put_zero(skb, sizeof(t));
|
-p = (t *)__skb_put(skb, sizeof(t));
+p = __skb_put_zero(skb, sizeof(t));
)
... when != p
(
p2 = (t2)p;
-memset(p2, 0, sizeof(*p));
|
-memset(p, 0, sizeof(*p));
)

@@
expression skb, len;
@@
-memset(__skb_put(skb, len), 0, len);
+__skb_put_zero(skb, len);

@@
expression skb, len, data;
@@
-memcpy(__skb_put(skb, len), data, len);
+__skb_put_data(skb, data, len);

@@
expression SKB, C, S;
typedef u8;
identifier fn = {__skb_put};
fresh identifier fn2 = fn ## "_u8";
@@
- *(u8 *)fn(SKB, S) = C;
+ fn2(SKB, C);

Signed-off-by: yuan linyu <Linyu.Yuan@alcatel-sbell.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20 13:30:14 -04:00
Jason A. Donenfeld 6787ab81b2 iscsi: ensure RNG is seeded before use
It's not safe to use weak random data here, especially for the challenge
response randomness. Since we're always in process context, it's safe to
simply wait until we have enough randomness to carry out the
authentication correctly.

While we're at it, we clean up a small memleak during an error
condition.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: "Nicholas A. Bellinger" <nab@linux-iscsi.org>
Cc: Lee Duncan <lduncan@suse.com>
Cc: Chris Leech <cleech@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2017-06-19 22:06:28 -04:00
NeilBrown 011067b056 blk: replace bioset_create_nobvec() with a flags arg to bioset_create()
"flags" arguments are often seen as good API design as they allow
easy extensibility.
bioset_create_nobvec() is implemented internally as a variation in
flags passed to __bioset_create().

To support future extension, make the internal structure part of the
API.
i.e. add a 'flags' argument to bioset_create() and discard
bioset_create_nobvec().

Note that the bio_split allocations in drivers/md/raid* do not need
the bvec mempool - they should have used bioset_create_nobvec().

Suggested-by: Christoph Hellwig <hch@infradead.org>
Reviewed-by: Christoph Hellwig <hch@infradead.org>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-18 12:40:59 -06:00
Johannes Berg d58ff35122 networking: make skb_push & __skb_push return void pointers
It seems like a historic accident that these return unsigned char *,
and in many places that means casts are required, more often than not.

Make these functions return void * and remove all the casts across
the tree, adding a (u8 *) cast only where the unsigned char pointer
was used directly, all done with the following spatch:

    @@
    expression SKB, LEN;
    typedef u8;
    identifier fn = { skb_push, __skb_push, skb_push_rcsum };
    @@
    - *(fn(SKB, LEN))
    + *(u8 *)fn(SKB, LEN)

    @@
    expression E, SKB, LEN;
    identifier fn = { skb_push, __skb_push, skb_push_rcsum };
    type T;
    @@
    - E = ((T *)(fn(SKB, LEN)))
    + E = fn(SKB, LEN)

    @@
    expression SKB, LEN;
    identifier fn = { skb_push, __skb_push, skb_push_rcsum };
    @@
    - fn(SKB, LEN)[0]
    + *(u8 *)fn(SKB, LEN)

Note that the last part there converts from push(...)[0] to the
more idiomatic *(u8 *)push(...).

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-16 11:48:40 -04:00
Johannes Berg 4df864c1d9 networking: make skb_put & friends return void pointers
It seems like a historic accident that these return unsigned char *,
and in many places that means casts are required, more often than not.

Make these functions (skb_put, __skb_put and pskb_put) return void *
and remove all the casts across the tree, adding a (u8 *) cast only
where the unsigned char pointer was used directly, all done with the
following spatch:

    @@
    expression SKB, LEN;
    typedef u8;
    identifier fn = { skb_put, __skb_put };
    @@
    - *(fn(SKB, LEN))
    + *(u8 *)fn(SKB, LEN)

    @@
    expression E, SKB, LEN;
    identifier fn = { skb_put, __skb_put };
    type T;
    @@
    - E = ((T *)(fn(SKB, LEN)))
    + E = fn(SKB, LEN)

which actually doesn't cover pskb_put since there are only three
users overall.

A handful of stragglers were converted manually, notably a macro in
drivers/isdn/i4l/isdn_bsdcomp.c and, oddly enough, one of the many
instances in net/bluetooth/hci_sock.c. In the former file, I also
had to fix one whitespace problem spatch introduced.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-16 11:48:39 -04:00
Jens Axboe 8f66439eec Linux 4.12-rc5
-----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJZPdbLAAoJEHm+PkMAQRiGx4wH/1nCjfnl6fE8oJ24/1gEAOUh
 biFdqJkYZmlLYHVtYfLm4Ueg4adJdg0wx6qM/4RaAzmQVvLfDV34bc1qBf1+P95G
 kVF+osWyXrZo5cTwkwapHW/KNu4VJwAx2D1wrlxKDVG5AOrULH1pYOYGOpApEkZU
 4N+q5+M0ce0GJpqtUZX+UnI33ygjdDbBxXoFKsr24B7eA0ouGbAJ7dC88WcaETL+
 2/7tT01SvDMo0jBSV0WIqlgXwZ5gp3yPGnklC3F4159Yze6VFrzHMKS/UpPF8o8E
 W9EbuzwxsKyXUifX2GY348L1f+47glen/1sedbuKnFhP6E9aqUQQJXvEO7ueQl4=
 =m2Gx
 -----END PGP SIGNATURE-----

Merge tag 'v4.12-rc5' into for-4.13/block

We've already got a few conflicts and upcoming work depends on some of the
changes that have gone into mainline as regression fixes for this series.

Pull in 4.12-rc5 to resolve these conflicts and make it easier on down stream
trees to continue working on 4.13 changes.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-12 08:30:13 -06:00
Christoph Hellwig 4e4cbee93d block: switch bios to blk_status_t
Replace bi_error with a new bi_status to allow for a clear conversion.
Note that device mapper overloaded bi_error with a private value, which
we'll have to keep arround at least for now and thus propagate to a
proper blk_status_t value.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2017-06-09 09:27:32 -06:00
Christoph Hellwig 2a842acab1 block: introduce new block status code type
Currently we use nornal Linux errno values in the block layer, and while
we accept any error a few have overloaded magic meanings.  This patch
instead introduces a new  blk_status_t value that holds block layer specific
status codes and explicitly explains their meaning.  Helpers to convert from
and to the previous special meanings are provided for now, but I suspect
we want to get rid of them in the long run - those drivers that have a
errno input (e.g. networking) usually get errnos that don't know about
the special block layer overloads, and similarly returning them to userspace
will usually return somethings that strictly speaking isn't correct
for file system operations, but that's left as an exercise for later.

For now the set of errors is a very limited set that closely corresponds
to the previous overloaded errno values, but there is some low hanging
fruite to improve it.

blk_status_t (ab)uses the sparse __bitwise annotations to allow for sparse
typechecking, so that we can easily catch places passing the wrong values.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2017-06-09 09:27:32 -06:00
Nicholas Bellinger eceb4459df iscsi-target: Avoid holding ->tpg_state_lock during param update
As originally reported by Jia-Ju, iscsit_tpg_enable_portal_group()
holds iscsi_portal_group->tpg_state_lock while updating AUTHMETHOD
via iscsi_update_param_value(), which performs a GFP_KERNEL
allocation.

However, since iscsit_tpg_enable_portal_group() is already protected
by iscsit_get_tpg() -> iscsi_portal_group->tpg_access_lock in it's
parent caller, ->tpg_state_lock only needs to be held when setting
TPG_STATE_ACTIVE.

Reported-by: Jia-Ju Bai <baijiaju1990@163.com>
Reviewed-by: Jia-Ju Bai <baijiaju1990@163.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-06-08 23:26:38 -07:00
Nicholas Bellinger 9ae0e9ade5 target/configfs: Kill se_lun->lun_link_magic
Instead of using a hardcoded magic value in se_lun when verifying
a target config_item symlink source during target_fabric_mappedlun_link(),
go ahead and use target_fabric_port_item_ops directly instead.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Mike Christie <mchristi@redhat.com>
Cc: Hannes Reinecke <hare@suse.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-06-08 23:26:38 -07:00
Nicholas Bellinger c17cd24959 target/configfs: Kill se_device->dev_link_magic
Instead of using a hardcoded magic value in se_device when verifying
a target config_item symlink source during target_fabric_port_link(),
go ahead and use target_core_dev_item_ops directly instead.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Mike Christie <mchristi@redhat.com>
Cc: Hannes Reinecke <hare@suse.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-06-08 23:26:38 -07:00
Nicholas Bellinger 2237498f0b target/iblock: Convert WRITE_SAME to blkdev_issue_zeroout
The people who are actively using iblock_execute_write_same_direct() are
doing so in the context of ESX VAAI BlockZero, together with
EXTENDED_COPY and COMPARE_AND_WRITE primitives.

In practice though I've not seen any users of IBLOCK WRITE_SAME for
anything other than VAAI BlockZero, so just using blkdev_issue_zeroout()
when available, and falling back to iblock_execute_write_same() if the
WRITE_SAME buffer contains anything other than zeros should be OK.

(Hook up max_write_zeroes_sectors to signal LBPRZ feature bit in
 target_configure_unmap_from_queue - nab)

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Mike Christie <mchristi@redhat.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Jens Axboe <axboe@fb.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-06-08 23:26:37 -07:00
Gustavo A. R. Silva fb418240ec target: remove dead code
Local variable _ret_ is assigned to a constant value and it is never
updated again. Remove this variable and the dead code it guards.

Addresses-Coverity-ID: 140761
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Reviewed-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-06-08 23:26:37 -07:00
Nicholas Bellinger abb85a9b51 iscsi-target: Reject immediate data underflow larger than SCSI transfer length
When iscsi WRITE underflow occurs there are two different scenarios
that can happen.

Normally in practice, when an EDTL vs. SCSI CDB TRANSFER LENGTH
underflow is detected, the iscsi immediate data payload is the
smaller SCSI CDB TRANSFER LENGTH.

That is, when a host fabric LLD is using a fixed size EDTL for
a specific control CDB, the SCSI CDB TRANSFER LENGTH and actual
SCSI payload ends up being smaller than EDTL.  In iscsi, this
means the received iscsi immediate data payload matches the
smaller SCSI CDB TRANSFER LENGTH, because there is no more
SCSI payload to accept beyond SCSI CDB TRANSFER LENGTH.

However, it's possible for a malicous host to send a WRITE
underflow where EDTL is larger than SCSI CDB TRANSFER LENGTH,
but incoming iscsi immediate data actually matches EDTL.

In the wild, we've never had a iscsi host environment actually
try to do this.

For this special case, it's wrong to truncate part of the
control CDB payload and continue to process the command during
underflow when immediate data payload received was larger than
SCSI CDB TRANSFER LENGTH, so go ahead and reject and drop the
bogus payload as a defensive action.

Note this potential bug was originally relaxed by the following
for allowing WRITE underflow in MSFT FCP host environments:

   commit c72c525022
   Author: Roland Dreier <roland@purestorage.com>
   Date:   Wed Jul 22 15:08:18 2015 -0700

      target: allow underflow/overflow for PR OUT etc. commands

Cc: Roland Dreier <roland@purestorage.com>
Cc: Mike Christie <mchristi@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: <stable@vger.kernel.org> # v4.3+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-06-08 22:25:29 -07:00
Nicholas Bellinger 105fa2f44e iscsi-target: Fix delayed logout processing greater than SECONDS_FOR_LOGOUT_COMP
This patch fixes a BUG() in iscsit_close_session() that could be
triggered when iscsit_logout_post_handler() execution from within
tx thread context was not run for more than SECONDS_FOR_LOGOUT_COMP
(15 seconds), and the TCP connection didn't already close before
then forcing tx thread context to automatically exit.

This would manifest itself during explicit logout as:

[33206.974254] 1 connection(s) still exist for iSCSI session to iqn.1993-08.org.debian:01:3f5523242179
[33206.980184] INFO: NMI handler (kgdb_nmi_handler) took too long to run: 2100.772 msecs
[33209.078643] ------------[ cut here ]------------
[33209.078646] kernel BUG at drivers/target/iscsi/iscsi_target.c:4346!

Normally when explicit logout attempt fails, the tx thread context
exits and iscsit_close_connection() from rx thread context does the
extra cleanup once it detects conn->conn_logout_remove has not been
cleared by the logout type specific post handlers.

To address this special case, if the logout post handler in tx thread
context detects conn->tx_thread_active has already been cleared, simply
return and exit in order for existing iscsit_close_connection()
logic from rx thread context do failed logout cleanup.

Reported-by: Bart Van Assche <bart.vanassche@sandisk.com>
Tested-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Mike Christie <mchristi@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: stable@vger.kernel.org # 3.14+
Tested-by: Gary Guo <ghg@datera.io>
Tested-by: Chu Yuan Lin <cyl@datera.io>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-06-08 22:25:14 -07:00
Nicholas Bellinger 73d4e580cc target: Fix kref->refcount underflow in transport_cmd_finish_abort
This patch fixes a se_cmd->cmd_kref underflow during CMD_T_ABORTED
when a fabric driver drops it's second reference from below the
target_core_tmr.c based callers of transport_cmd_finish_abort().

Recently with the conversion of kref to refcount_t, this bug was
manifesting itself as:

[705519.601034] refcount_t: underflow; use-after-free.
[705519.604034] INFO: NMI handler (kgdb_nmi_handler) took too long to run: 20116.512 msecs
[705539.719111] ------------[ cut here ]------------
[705539.719117] WARNING: CPU: 3 PID: 26510 at lib/refcount.c:184 refcount_sub_and_test+0x33/0x51

Since the original kref atomic_t based kref_put() didn't check for
underflow and only invoked the final callback when zero was reached,
this bug did not manifest in practice since all se_cmd memory is
using preallocated tags.

To address this, go ahead and propigate the existing return from
transport_put_cmd() up via transport_cmd_finish_abort(), and
change transport_cmd_finish_abort() + core_tmr_handle_tas_abort()
callers to only do their local target_put_sess_cmd() if necessary.

Reported-by: Bart Van Assche <bart.vanassche@sandisk.com>
Tested-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Mike Christie <mchristi@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Himanshu Madhani <himanshu.madhani@qlogic.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: stable@vger.kernel.org # 3.14+
Tested-by: Gary Guo <ghg@datera.io>
Tested-by: Chu Yuan Lin <cyl@datera.io>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-06-08 22:24:18 -07:00
Ganesh Goudar 1dec4cec9f cxgb4: Fix tids count for ipv6 offload connection
the adapter consumes two tids for every ipv6 offload
connection be it active or passive, calculate tid usage
count accordingly.

Also change the signatures of relevant functions to get
the address family.

Signed-off-by: Rizwan Ansari <rizwana@chelsio.com>
Signed-off-by: Varun Prakash <varun@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-07 15:14:34 -04:00
Jiang Yi 5e0cf5e6c4 iscsi-target: Always wait for kthread_should_stop() before kthread exit
There are three timing problems in the kthread usages of iscsi_target_mod:

 - np_thread of struct iscsi_np
 - rx_thread and tx_thread of struct iscsi_conn

In iscsit_close_connection(), it calls

 send_sig(SIGINT, conn->tx_thread, 1);
 kthread_stop(conn->tx_thread);

In conn->tx_thread, which is iscsi_target_tx_thread(), when it receive
SIGINT the kthread will exit without checking the return value of
kthread_should_stop().

So if iscsi_target_tx_thread() exit right between send_sig(SIGINT...)
and kthread_stop(...), the kthread_stop() will try to stop an already
stopped kthread.

This is invalid according to the documentation of kthread_stop().

(Fix -ECONNRESET logout handling in iscsi_target_tx_thread and
 early iscsi_target_rx_thread failure case - nab)

Signed-off-by: Jiang Yi <jiangyilism@gmail.com>
Cc: <stable@vger.kernel.org> # v3.12+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-05-31 15:12:57 -07:00
Nicholas Bellinger 25cdda95fd iscsi-target: Fix initial login PDU asynchronous socket close OOPs
This patch fixes a OOPs originally introduced by:

   commit bb048357da
   Author: Nicholas Bellinger <nab@linux-iscsi.org>
   Date:   Thu Sep 5 14:54:04 2013 -0700

   iscsi-target: Add sk->sk_state_change to cleanup after TCP failure

which would trigger a NULL pointer dereference when a TCP connection
was closed asynchronously via iscsi_target_sk_state_change(), but only
when the initial PDU processing in iscsi_target_do_login() from iscsi_np
process context was blocked waiting for backend I/O to complete.

To address this issue, this patch makes the following changes.

First, it introduces some common helper functions used for checking
socket closing state, checking login_flags, and atomically checking
socket closing state + setting login_flags.

Second, it introduces a LOGIN_FLAGS_INITIAL_PDU bit to know when a TCP
connection has dropped via iscsi_target_sk_state_change(), but the
initial PDU processing within iscsi_target_do_login() in iscsi_np
context is still running.  For this case, it sets LOGIN_FLAGS_CLOSED,
but doesn't invoke schedule_delayed_work().

The original NULL pointer dereference case reported by MNC is now handled
by iscsi_target_do_login() doing a iscsi_target_sk_check_close() before
transitioning to FFP to determine when the socket has already closed,
or iscsi_target_start_negotiation() if the login needs to exchange
more PDUs (eg: iscsi_target_do_login returned 0) but the socket has
closed.  For both of these cases, the cleanup up of remaining connection
resources will occur in iscsi_target_start_negotiation() from iscsi_np
process context once the failure is detected.

Finally, to handle to case where iscsi_target_sk_state_change() is
called after the initial PDU procesing is complete, it now invokes
conn->login_work -> iscsi_target_do_login_rx() to perform cleanup once
existing iscsi_target_sk_check_close() checks detect connection failure.
For this case, the cleanup of remaining connection resources will occur
in iscsi_target_do_login_rx() from delayed workqueue process context
once the failure is detected.

Reported-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Tested-by: Mike Christie <mchristi@redhat.com>
Cc: Mike Christie <mchristi@redhat.com>
Reported-by: Hannes Reinecke <hare@suse.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Varun Prakash <varun@chelsio.com>
Cc: <stable@vger.kernel.org> # v3.12+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-05-31 15:12:31 -07:00
Mike Christie f3cdbe39b2 tcmu: fix crash during device removal
We currently do

tcmu_free_device ->tcmu_netlink_event(TCMU_CMD_REMOVED_DEVICE) ->
uio_unregister_device -> kfree(tcmu_dev).

The problem is that the kernel does not wait for userspace to
do the close() on the uio device before freeing the tcmu_dev.
We can then hit a race where the kernel frees the tcmu_dev before
userspace does close() and so when close() -> release -> tcmu_release
is done, we try to access a freed tcmu_dev.

This patch made over the target-pending master branch moves the freeing
of the tcmu_dev to when the last reference has been dropped.

This also fixes a leak where if tcmu_configure_device was not called on a
device we did not free udev->name which was allocated at tcmu_alloc_device time.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-05-23 19:50:49 -07:00
Nicholas Bellinger 4ff83daa02 target: Re-add check to reject control WRITEs with overflow data
During v4.3 when the overflow/underflow check was relaxed by
commit c72c525022:

  commit c72c525022
  Author: Roland Dreier <roland@purestorage.com>
  Date:   Wed Jul 22 15:08:18 2015 -0700

       target: allow underflow/overflow for PR OUT etc. commands

to allow underflow/overflow for Windows compliance + FCP, a
consequence was to allow control CDBs to process overflow
data for iscsi-target with immediate data as well.

As per Roland's original change, continue to allow underflow
cases for control CDBs to make Windows compliance + FCP happy,
but until overflow for control CDBs is supported tree-wide,
explicitly reject all control WRITEs with overflow following
pre v4.3.y logic.

Reported-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Roland Dreier <roland@purestorage.com>
Cc: <stable@vger.kernel.org> # v4.3+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-05-15 20:20:29 -07:00
Linus Torvalds bd1286f964 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull SCSI target updates from Nicholas Bellinger:
 "Things were a lot more calm than previously expected. It's primarily
  fixes in various areas, with most of the new functionality centering
  around TCMU backend driver work that Xiubo Li has been driving.

  Here's the summary on the feature side:

   - Make T10-PI verify configurable for emulated (FILEIO + RD) backends
    (Dmitry Monakhov)
   - Allow target-core/TCMU pass-through to use in-kernel SPC-PR logic
    (Bryant Ly + MNC)
   - Add TCMU support for growing ring buffer size (Xiubo Li + MNC)
   - Add TCMU support for global block data pool (Xiubo Li + MNC)

  and on the bug-fix side:

   - Fix COMPARE_AND_WRITE non GOOD status handling for READ phase
    failures (Gary Guo + nab)
   - Fix iscsi-target hang with explicitly changing per NodeACL
    CmdSN number depth with concurrent login driven session
    reinstatement.  (Gary Guo + nab)
   - Fix ibmvscsis fabric driver ABORT task handling (Bryant Ly)
   - Fix target-core/FILEIO zero length handling (Bart Van Assche)

  Also, there was an OOPs introduced with the WRITE_VERIFY changes that
  I ended up reverting at the last minute, because as not unusual Bart
  and I could not agree on the fix in time for -rc1. Since it's specific
  to a conformance test, it's been reverted for now.

  There is a separate patch in the queue to address the underlying
  control CDB write overflow regression in >= v4.3 separate from the
  WRITE_VERIFY revert here, that will be pushed post -rc1"

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (30 commits)
  Revert "target: Fix VERIFY and WRITE VERIFY command parsing"
  IB/srpt: Avoid that aborting a command triggers a kernel warning
  IB/srpt: Fix abort handling
  target/fileio: Fix zero-length READ and WRITE handling
  ibmvscsis: Do not send aborted task response
  tcmu: fix module removal due to stuck thread
  target: Don't force session reset if queue_depth does not change
  iscsi-target: Set session_fall_back_to_erl0 when forcing reinstatement
  target: Fix compare_and_write_callback handling for non GOOD status
  tcmu: Recalculate the tcmu_cmd size to save cmd area memories
  tcmu: Add global data block pool support
  tcmu: Add dynamic growing data area feature support
  target: fixup error message in target_tg_pt_gp_tg_pt_gp_id_store()
  target: fixup error message in target_tg_pt_gp_alua_access_type_store()
  target/user: PGR Support
  target: Add WRITE_VERIFY_16
  Documentation/target: add an example script to configure an iSCSI target
  target: Use kmalloc_array() in transport_kmap_data_sg()
  target: Use kmalloc_array() in compare_and_write_callback()
  target: Improve size determinations in two functions
  ...
2017-05-12 11:44:13 -07:00
Nicholas Bellinger 984a9d4c40 Revert "target: Fix VERIFY and WRITE VERIFY command parsing"
This reverts commit 0e2eb7d12e

  Author: Bart Van Assche <bart.vanassche@sandisk.com>
  Date:   Thu Mar 30 10:12:39 2017 -0700

      target: Fix VERIFY and WRITE VERIFY command parsing

This patch broke existing behaviour for WRITE_VERIFY because
it dropped the original SCF_SCSI_DATA_CDB assignment for
bytchk = 0 so target_cmd_size_check() no longer rejected
this case, allowing an overflow case to trigger an OOPs
in iscsi-target.

Since the short term and long term fixes are still being
discussed, revert it for now since it's late in the merge
window and try again in v4.13-rc1.

Conflicts:
	drivers/target/target_core_sbc.c

Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-05-11 01:01:05 -07:00
Bart Van Assche 59ac9c0781 target/fileio: Fix zero-length READ and WRITE handling
This patch fixes zero-length READ and WRITE handling in target/FILEIO,
which was broken a long time back by:

Since:

  commit d81cb44726
  Author: Paolo Bonzini <pbonzini@redhat.com>
  Date:   Mon Sep 17 16:36:11 2012 -0700

      target: go through normal processing for all zero-length commands

which moved zero-length READ and WRITE completion out of target-core,
to doing submission into backend driver code.

To address this, go ahead and invoke target_complete_cmd() for any
non negative return value in fd_do_rw().

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Andy Grover <agrover@redhat.com>
Cc: David Disseldorp <ddiss@suse.de>
Cc: <stable@vger.kernel.org> # v3.7+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-05-07 16:05:16 -07:00
Mike Christie d906d8af28 tcmu: fix module removal due to stuck thread
We need to do a kthread_should_stop to check when kthread_stop has been
called.

This was a regression added in

b6df4b79a5
tcmu: Add global data block pool support

so not sure if you wanted to merge it in with that patch or what.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-05-04 20:01:41 -07:00
Nicholas Bellinger 46861cdd80 target: Don't force session reset if queue_depth does not change
Keeping in the idempotent nature of target_core_fabric_configfs.c,
if a queue_depth value is set and it's the same as the existing
value, don't attempt to force session reinstatement.

Reported-by: Raghu Krishnamurthy <rk@datera.io>
Cc: Raghu Krishnamurthy <rk@datera.io>
Tested-by: Gary Guo <ghg@datera.io>
Cc: Gary Guo <ghg@datera.io>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-05-04 20:01:40 -07:00
Nicholas Bellinger 197b806ae5 iscsi-target: Set session_fall_back_to_erl0 when forcing reinstatement
While testing modification of per se_node_acl queue_depth forcing
session reinstatement via lio_target_nacl_cmdsn_depth_store() ->
core_tpg_set_initiator_node_queue_depth(), a hung task bug triggered
when changing cmdsn_depth invoked session reinstatement while an iscsi
login was already waiting for session reinstatement to complete.

This can happen when an outstanding se_cmd descriptor is taking a
long time to complete, and session reinstatement from iscsi login
or cmdsn_depth change occurs concurrently.

To address this bug, explicitly set session_fall_back_to_erl0 = 1
when forcing session reinstatement, so session reinstatement is
not attempted if an active session is already being shutdown.

This patch has been tested with two scenarios.  The first when
iscsi login is blocked waiting for iscsi session reinstatement
to complete followed by queue_depth change via configfs, and
second when queue_depth change via configfs us blocked followed
by a iscsi login driven session reinstatement.

Note this patch depends on commit d36ad77f70 to handle multiple
sessions per se_node_acl when changing cmdsn_depth, and for
pre v4.5 kernels will need to be included for stable as well.

Reported-by: Gary Guo <ghg@datera.io>
Tested-by: Gary Guo <ghg@datera.io>
Cc: Gary Guo <ghg@datera.io>
Cc: <stable@vger.kernel.org> # v4.1+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-05-04 20:01:39 -07:00
Nicholas Bellinger a71a5dc7f8 target: Fix compare_and_write_callback handling for non GOOD status
Following the bugfix for handling non SAM_STAT_GOOD COMPARE_AND_WRITE
status during COMMIT phase in commit 9b2792c3da, the same bug exists
for the READ phase as well.

This would manifest first as a lost SCSI response, and eventual
hung task during fabric driver logout or re-login, as existing
shutdown logic waited for the COMPARE_AND_WRITE se_cmd->cmd_kref
to reach zero.

To address this bug, compare_and_write_callback() has been changed
to set post_ret = 1 and return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE
as necessary to signal failure status.

Reported-by: Bill Borsari <wgb@datera.io>
Cc: Bill Borsari <wgb@datera.io>
Tested-by: Gary Guo <ghg@datera.io>
Cc: Gary Guo <ghg@datera.io>
Cc: <stable@vger.kernel.org> # v4.1+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-05-04 20:01:39 -07:00
Xiubo Li fe25cc3479 tcmu: Recalculate the tcmu_cmd size to save cmd area memories
For the "struct tcmu_cmd_entry" in cmd area, the minimum size
will be sizeof(struct tcmu_cmd_entry) == 112 Bytes. And it could
fill about (sizeof(struct rsp) - sizeof(struct req)) /
sizeof(struct iovec) == 68 / 16 ~= 4 data regions(iov[4]) by
default.

For most tcmu_cmds, the data block indexes allocated from the
data area will be continuous. And for the continuous blocks they
will be merged into the same region using only one iovec. For
the current code, it will always allocates the same number of
iovecs with blocks for each tcmu_cmd, and it will wastes much
memories.

For example, when the block size is 4K and the DATA_OUT buffer
size is 64K, and the regions needed is less than 5(on my
environment is almost 99.7%). The current code will allocate
about 16 iovecs, and there will be (16 - 4) * sizeof(struct
iovec) = 192 Bytes cmd area memories wasted.

Here adds two helpers to calculate the base size and full size
of the tcmu_cmd. And will recalculate them again when it make sure
how many iovs is needed before insert it to cmd area.

Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com>
Acked-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-05-02 20:39:38 -07:00
Xiubo Li b6df4b79a5 tcmu: Add global data block pool support
For each target there will be one ring, when the target number
grows larger and larger, it could eventually runs out of the
system memories.

In this patch for each target ring, currently for the cmd area
the size will be fixed to 8MB and for the data area the size
will grow from 0 to max 256K * PAGE_SIZE(1G for 4K page size).

For all the targets' data areas, they will get empty blocks
from the "global data block pool", which has limited to 512K *
PAGE_SIZE(2G for 4K page size) for now.

When the "global data block pool" has been used up, then any
target could wake up the unmap thread routine to shrink other
targets' data area memories. And the unmap thread routine will
always try to truncate the ring vma from the last using block
offset.

When user space has touched the data blocks out of tcmu_cmd
iov[], the tcmu_page_fault() will try to return one zeroed blocks.

Here we move the timeout's tcmu_handle_completions() into unmap
thread routine, that's to say when the timeout fired, it will
only do the tcmu_check_expired_cmd() and then wake up the unmap
thread to do the completions() and then try to shrink its idle
memories. Then the cmdr_lock could be a mutex and could simplify
this patch because the unmap_mapping_range() or zap_* may go to
sleep.

Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com>
Signed-off-by: Jianfei Hu <hujianfei@cmss.chinamobile.com>
Acked-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-05-01 22:58:04 -07:00
Xiubo Li 141685a391 tcmu: Add dynamic growing data area feature support
Currently for the TCMU, the ring buffer size is fixed to 64K cmd
area + 1M data area, and this will be bottlenecks for high iops.

The struct tcmu_cmd_entry {} size is fixed about 112 bytes with
iovec[N] & N <= 4, and the size of struct iovec is about 16 bytes.

If N == 0, the ratio will be sizeof(cmd entry) : sizeof(datas) ==
112Bytes : (N * 4096)Bytes = 28 : 0, no data area is need.

If 0 < N <=4, the ratio will be sizeof(cmd entry) : sizeof(datas)
== 112Bytes : (N * 4096)Bytes = 28 : (N * 1024), so the max will
be 28 : 1024.

If N > 4, the sizeof(cmd entry) will be [(N - 4) *16 + 112] bytes,
and its corresponding data size will be [N * 4096], so the ratio
of sizeof(cmd entry) : sizeof(datas) == [(N - 4) * 16 + 112)Bytes
: (N * 4096)Bytes == 4/1024 - 12/(N * 1024), so the max is about
4 : 1024.

When N is bigger, the ratio will be smaller.

As the initial patch, we will set the cmd area size to 2M, and
the cmd area size to 32M. The TCMU will dynamically grows the data
area from 0 to max 32M size as needed.

The cmd area memory will be allocated through vmalloc(), and the
data area's blocks will be allocated individually later when needed.

The allocated data area block memory will be managed via radix tree.
For now the bitmap still be the most efficient way to search and
manage the block index, this could be update later.

Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com>
Signed-off-by: Jianfei Hu <hujianfei@cmss.chinamobile.com>
Acked-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-05-01 22:57:36 -07:00
Hannes Reinecke 3d035237a5 target: fixup error message in target_tg_pt_gp_tg_pt_gp_id_store()
When setting up an ALUA target port group with an invalid ID the
error message

kstrtoul() returned -22 for tg_pt_gp_id

is displayed, which is not really helpful.
Convert it to something sane.
And while we're at it, join the messages onto a single line.

Signed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Bart van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-05-01 22:21:53 -07:00
Hannes Reinecke c0dcafd8c5 target: fixup error message in target_tg_pt_gp_alua_access_type_store()
When setting up a target the error message:

Unable to do set ##_name ALUA state on non valid tg_pt_gp ID: 0

is displayed.
Apparently concatenation doesn't work in a string; one should be using
implicit string concatenation here.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Bart van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-05-01 22:21:49 -07:00
Bryant G. Ly 4ec5bf0ea8 target/user: PGR Support
This adds initial PGR support for just TCMU, since tcmu doesn't
have the necessary IT_NEXUS info to process PGR in userspace,
so have those commands be processed in kernel.

HA support is not available yet, we will work on it if this patch
is acceptable.

Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-05-01 22:21:45 -07:00
Bryant G. Ly c2d26f18dc target: Add WRITE_VERIFY_16
This patch addresses clients who needs write_verify_16 for
large volume groups such as AIX.

Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-05-01 22:21:40 -07:00
Markus Elfring df6751f340 target: Use kmalloc_array() in transport_kmap_data_sg()
A multiplication for the size determination of a memory allocation
indicated that an array data structure should be processed.
Thus use the corresponding function "kmalloc_array".

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-05-01 22:21:33 -07:00
Markus Elfring f318aef55f target: Use kmalloc_array() in compare_and_write_callback()
* A multiplication for the size determination of a memory allocation
  indicated that an array data structure should be processed.
  Thus use the corresponding function "kmalloc_array".

  This issue was detected by using the Coccinelle software.

* Replace the specification of a data structure by a pointer dereference
  to make the corresponding size determination a bit safer according to
  the Linux coding style convention.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-05-01 22:21:32 -07:00
Markus Elfring 5d68fb72d3 target: Improve size determinations in two functions
Replace the specification of two data structures by pointer dereferences
as the parameter for the operator "sizeof" to make the corresponding size
determinations a bit safer according to the Linux coding style convention.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-05-01 22:21:30 -07:00
Markus Elfring fbc4040bf6 target: Delete error messages for failed memory allocations
The script "checkpatch.pl" pointed information out like the following.

WARNING: Possible unnecessary 'out of memory' message

Thus remove such statements here.

Link: http://events.linuxfoundation.org/sites/events/files/slides/LCJ16-Refactor_Strings-WSang_0.pdf
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-05-01 22:21:28 -07:00
Markus Elfring 55ec409202 target: Use kcalloc() in two functions
* Multiplications for the size determination of memory allocations
  indicated that array data structures should be processed.
  Thus use the corresponding function "kcalloc".

  This issue was detected by using the Coccinelle software.

* Replace the specification of data structures by pointer dereferences
  to make the corresponding size determination a bit safer according to
  the Linux coding style convention.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-05-01 22:21:27 -07:00
Markus Elfring 3829f38169 iscsi-target: Improve size determinations in four functions
Replace the specification of four data structures by pointer dereferences
as the parameter for the operator "sizeof" to make the corresponding size
determinations a bit safer according to the Linux coding style convention.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-05-01 22:21:08 -07:00
Markus Elfring c46e22f106 iscsi-target: Delete error messages for failed memory allocations
The script "checkpatch.pl" pointed information out like the following.

WARNING: Possible unnecessary 'out of memory' message

Thus remove such statements here.

Link: http://events.linuxfoundation.org/sites/events/files/slides/LCJ16-Refactor_Strings-WSang_0.pdf
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-05-01 22:21:07 -07:00
Markus Elfring f172511032 iscsi-target: Use kcalloc() in iscsit_allocate_iovecs()
* A multiplication for the size determination of a memory allocation
  indicated that an array data structure should be processed.
  Thus use the corresponding function "kcalloc".

  This issue was detected by using the Coccinelle software.

* Replace the specification of a data structure by a pointer dereference
  to make the corresponding size determination a bit safer according to
  the Linux coding style convention.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-05-01 22:21:05 -07:00
Dmitry Monakhov 056e8924a0 tcm: make pi data verification configurable
Currently ramdisk and fileio always perform PI verification
before and after backend IO. This approach is not very flexible.
Because some one may want to postpone this work to other layers in
IO stack. For example if we want to test blk_integrity_profile

testcase:
dee408c868
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-05-01 22:20:58 -07:00