Commit graph

1295 commits

Author SHA1 Message Date
Colin Ian King 5c665aeb65 scsi: lpfc: don't dereference localport before it has been null checked
localport is being dereferenced to assign lport and then immediately
afterwards localport is being sanity checked to see if it is null.  Fix
this by only dereferencing localport until after it has been null
checked.

Detected by CoverityScan, CID#1463038 ("Dereference before null check")

Fixes: 3a8cefbfc5ee ("scsi: lpfc: Beef up stat counters for debug")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-01-03 22:52:43 -05:00
James Smart b996ce3996 scsi: lpfc: correct sg_seg_cnt attribute min vs default
Prior patch mixed up what argument in the macro was what, so min value
was placed as the "default" argument, and the default value was placed
as the "min" argument. Thus, when the default was applied, it looked
like the default was smaller than the allowed min.

Swap argument postions to correct.

[mkp: fixed checkpatch warning]

Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-20 21:27:30 -05:00
James Smart 2f7005debe scsi: lpfc: update driver version to 11.4.0.6
Update the driver version to 11.4.0.6

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-20 21:11:48 -05:00
James Smart 4b056682d8 scsi: lpfc: Beef up stat counters for debug
If log verbose in not turned on, its hard to tell when certain error
paths get hit. Add stats counters and corresponding logic to
debugfs/sysfs to aid understanding what paths were traversed.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-20 21:11:48 -05:00
James Smart 3fd78355cd scsi: lpfc: Fix infinite wait when driver unregisters a remote NVME port.
When unregistering a remote port the lpfc driver would eventually wait
for the remoteport_unreg done callback. But the driver never completed
the io aborts that would allow the connections to terminate thus the
unreg done callback was never issued.  Turns out the coding style of the
driver allowed for the wait to occur on the same cpu that the deferred
isr is called on. The blocking for the wait, blocked the isr, and as the
isr didn't run, the io aborts wouldn't finish.

Turns out there was never a good reason to block waiting for the unreg
done in the first place. The driver can continue execution and the ref
counting within the driver will do the right thing.

Resolve by removing the wait and patching up a few cases where the ref
counting didn't look right - mainly cases where the remote port comes
back before the aborts had completed and the unreg done had been
called. Additionally, a few places which used pointer values to guide
driver actions weren't protected by lock, so correct those.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-20 21:11:47 -05:00
James Smart e06351a002 scsi: lpfc: Fix issues connecting with nvme initiator
In the lpfc discovery engine, when as a nvme target, where the driver
was performing mailbox io with the adapter for port login when a NVME
PRLI is received from the host. Rather than queue and eventually get
back to sending a response after the mailbox traffic, the driver
rejected the io with an error response.

Turns out this particular initiator didn't like the rejection values
(unable to process command/command in progress) so it never attempted a
retry of the PRLI. Thus the host never established nvme connectivity
with the lpfc target.

By changing the rejection values (to Logical Busy/nothing more), the
initiator accepted the response and would retry the PRLI, resulting in
nvme connectivity.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-20 21:11:47 -05:00
James Smart 9de416ac67 scsi: lpfc: Fix SCSI LUN discovery when SCSI and NVME enabled
When enabled for both SCSI and NVME support, and connected pt2pt to a
SCSI only target, the driver nodelist entry for the remote port is left
in PRLI_ISSUE state and no SCSI LUNs are discovered. Works fine if only
configured for SCSI support.

Error was due to some of the prli points still reflecting the need to
send only 1 PRLI. On a lot of fabric configs, targets were NVME only,
which meant the fabric-reported protocol attributes were only telling
the driver one protocol or the other. Thus things worked fine. With
pt2pt, the driver must send a PRLI for both protocols as there are no
hints on what the target supports. Thus pt2pt targets were hitting the
multiple PRLI issues.

Complete the dual PRLI support. Track explicitly whether scsi (fcp) or
nvme prli's have been sent. Accurately track protocol support detected
on each node as reported by the fabric or probed by PRLI traffic.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-20 21:11:46 -05:00
James Smart a51e41b671 scsi: lpfc: Increase SCSI CQ and WQ sizes.
Increased the sizes of the SCSI WQ's and CQ's so that SCSI operation is
similar to that used by NVME. However, size increase restricted only to
those newer adapters that can support the larger WQE size, thus bigger
queue sizes.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-20 21:11:46 -05:00
James Smart b95e29b75d scsi: lpfc: Fix receive PRLI handling
Handling a rcv'ed PRLI incorrectly can cause the ndlp to end up in the
wrong state or the driver to ACC and PRLI when it should send LS_RJT.

The cause was due to the driver not properly looking at the PRLI type
and taking the multiple protocol support into consideration.

Resolved by adding checks in the various PRLI receive points to validate
PRLI type and reject if not valid for the enabled protocols and mode
(host vs target).

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-20 21:11:45 -05:00
James Smart cbc5de1b8a scsi: lpfc: Fix -EOVERFLOW behavior for NVMET and defer_rcv
The driver is all set to handle the defer_rcv api for the nvmet_fc
transport, yet didn't properly recognize the return status when the
defer_rcv occurred. The driver treated it simply as an error and aborted
the io. Several residual issues occurred at that point.

Finish the defer_rcv support: recognize the return status when the io
request is being handled in a deferred style. This stops the rogue
aborts; Replenish the async cmd rcv buffer in the deferred receive if
needed.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-20 21:11:45 -05:00
James Smart cf1a1d3e2d scsi: lpfc: Fix random heartbeat timeouts during heavy IO
NVME targets appear to randomly disconnect from the initiator when
running heavy IO.

The error is due to the host aggregate (across all controllers) io load
was beyond the maximum exchange count for nvme on the adapter. The
driver was properly returning a resource busy status, but the io load
was so great heartbeat commands would be bounced and not have a
successful retry within the fuzz amount for the nvme heartbeat (yes, a
very high io load!). Thus the target was terminating the controller due
to a keep alive failure.

Resolve by reserving a few exchanges (by counters) which can be used
when the adapter is out of normal exchanges and the command is a NVME
heartbeat command. As counters are used, while the reserved command is
outstanding, as soon as any other exchange completes, the counters are
adjusted and the reserved count is replenished. The heartbeat completes
execution in a normal fashion.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-20 21:11:44 -05:00
James Smart ba48077f23 scsi: lpfc: update driver version to 11.4.0.5
Update the driver version to 11.4.0.5

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-04 20:32:55 -05:00
James Smart 81e6a63728 scsi: lpfc: small sg cnt cleanup
The logic for sg_seg_cnt is a bit convoluted. This patch tries to clean
up a couple of areas, especially around the +2 and +1 logic.

This patch:

- Cleans up the lpfc_sg_seg_cnt attribute to specify a real minimum
  rather than making the minimum be whatever the default is.

- Removes the hardcoding of +2 (for the number of elements we use in a
  sgl for cmd iu and rsp iu) and +1 (an additional entry to compensate
  for nvme's reduction of io size based on a possible partial page)
  logic in sg list initialization. In the case where the +1 logic is
  referenced in host and target io checks, use the values set in the
  transport template as that value was properly set.

There can certainly be more done in this area and it will be addressed
in combined host/target driver effort.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-04 20:32:55 -05:00
James Smart c3725bdcdf scsi: lpfc: Fix driver handling of nvme resources during unload
During driver unload, the driver may crash due to NULL pointers.  The
NULL pointers were due to the driver not protecting itself sufficiently
during some of the teardown paths.  Additionally, the driver was not
waiting for and cleanup up nvme io resources. As such, the driver wasn't
making the callbacks to the transport, stalling the transports
association teardown.

This patch waits for io clean up before tearding down and adds checks
for possible NULL pointers.

Cc: <stable@vger.kernel.org> # 4.12+
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-04 20:32:55 -05:00
James Smart 3386f4bdd2 scsi: lpfc: Fix crash during driver unload with running nvme traffic
When the driver is unloading, the nvme transport could be in the process
of submitting new requests, will send abort requests to terminate
associations, or may make LS-related requests.  The driver's abort and
request entry points currently is ignorant of the unloading state and is
starting the requests even though the infrastructure to complete them
continues to teardown.

Change the entry points for new requests to check whether unloading and
if so, reject the requests. Abort routines check unloading, and if so,
noop the request. An abort is noop'd as the teardown paths are already
aborting/terminating the io outstanding at the time the teardown
initiated.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-04 20:32:55 -05:00
James Smart add9d6be3d scsi: lpfc: Correct driver deregistrations with host nvme transport
The driver's interaction with the host nvme transport has been incorrect
for a while. The driver did not wait for the unregister callbacks
(waited only 5 jiffies). Thus the driver may remove objects that may be
referenced by subsequent abort commands from the transport, and the
actual unregister callback was effectively a noop. This was especially
problematic if the driver was unloaded.

The driver now waits for the unregister callbacks, as it should, before
continuing with teardown.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-04 20:32:54 -05:00
James Smart 3b5bde69bc scsi: lpfc: correct port registrations with nvme_fc
The driver currently registers any remote port that has NVME support.
It should only be registering target ports.

Register only target ports.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-04 20:32:54 -05:00
James Smart 4938250ebd scsi: lpfc: Linux LPFC driver does not process all RSCNs
During RSCN storms, the driver does not rediscover some targets.  The
driver marks some RSCN as to be handled after the ones it's working
on. The driver missed processing some deferred RSCN.

Move where the driver checks for deferred RSCNs and initiate deferred
RSCN handling if the flag was set. Also revise nport state within the
RSCN confirm routine. Add some state data to a possible debug print to
aid future debugging.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-04 20:32:54 -05:00
James Smart b7e50c536e scsi: lpfc: Fix ndlp ref count for pt2pt mode issue RSCN
pt2pt ndlp ref count prematurely goes to 0. There was reference removed
that should only be removed if connected to a switch, not if in
point-to-point mode.

Add a mode check before the reference remove.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-04 20:32:54 -05:00
James Smart bcb24f6577 scsi: lpfc: Adjust default value of lpfc_nvmet_mrq
The current default for async hw receive queues is 1, which presents
issues under heavy load as number of queues influence the available
async receive buffer limits.

Raise the default to the either the current hw limit (16) or the number
of hw qs configured (io channel value).

Revise the attribute definition for mrq to better reflect what we do for
hw queues. E.g. 0 means default to optimal (# of cpus), non-zero
specifies a specific limit. Before this change, mrq=0 meant target mode
was disabled. As 0 now has a different meaning, rework the if tests to
use the better nvmet_support check.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-04 20:32:54 -05:00
James Smart 07d494f753 scsi: lpfc: Fix display for debugfs queInfo
Display for lpfc/fnX/iDiag/queInfo isn't formatted perfectly.  Corrected
the format strings for the queue info debug messages.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-04 20:32:54 -05:00
James Smart d33d0eb28b scsi: lpfc: Driver fails to detect direct attach storage array
The driver does not respond to PLOGI from the direct attach target.  The
driver uses incorrect S_ID in CONFIG_LINK, after FLOGI completion

Correct by issuing CONFIG_LINK with the correct S_ID after receiving the
PLOGI from the target

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-04 20:32:54 -05:00
James Smart d73154ba32 scsi: lpfc: Raise maximum NVME sg list size for 256 elements
Raise the maximum NVME sg list size allowed to 256 elements.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-04 20:32:54 -05:00
James Smart 422c4cb7e9 scsi: lpfc: Fix NVME LS abort_xri
Performing an LS abort results in the following message being seen:
  0603 Invalid CQ subtype 6: 00000300 22000002 ffff0016 d0050000
and the associated exchange is not properly freed.

The code did not recognize the exchange type that was aborted, thus it
was not properly handled.

Correct by adding the NVME LS ELS type to the exchange types that are
recognized.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-04 20:32:54 -05:00
James Smart e4b9794efd scsi: lpfc: Fix crash after bad bar setup on driver attachment
In test cases where an instance of the driver is detached and
reattached, the driver will crash on reattachment. There is a compound
if statement that will skip over the bar setup if the pci_resource_start
call is not successful. The driver erroneously returns success to its
bar setup in this scenario even though the bars aren't properly
configured.

Rework the offending code segment for proper initialization steps.  If
the pci_resource_start call fails, -ENOMEM is now returned.

Sample stack:

rport-5:0-10: blocked FC remote port time out: removing rport
BUG: unable to handle kernel NULL pointer dereference at           (null)
... lpfc_sli4_wait_bmbx_ready+0x32/0x70 [lpfc]
...
...  RIP: 0010:...  ... lpfc_sli4_wait_bmbx_ready+0x32/0x70 [lpfc]
 Call Trace:
  ... lpfc_sli4_post_sync_mbox+0x106/0x4d0 [lpfc]
  ... ? __alloc_pages_nodemask+0x176/0x420
  ... ? __kmalloc+0x2e/0x230
  ... lpfc_sli_issue_mbox_s4+0x533/0x720 [lpfc]
  ... ? mempool_alloc+0x69/0x170
  ... ? dma_generic_alloc_coherent+0x8f/0x140
  ... lpfc_sli_issue_mbox+0xf/0x20 [lpfc]
  ... lpfc_sli4_driver_resource_setup+0xa6f/0x1130 [lpfc]
  ... ? lpfc_pci_probe_one+0x23e/0x16f0 [lpfc]
  ... lpfc_pci_probe_one+0x445/0x16f0 [lpfc]
  ... local_pci_probe+0x45/0xa0
  ... work_for_cpu_fn+0x14/0x20
  ... process_one_work+0x17a/0x440

Cc: <stable@vger.kernel.org> # 4.12+
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-04 20:32:54 -05:00
James Smart 8a5ca109a3 scsi: lpfc: Handle XRI_ABORTED_CQE in soft IRQ
XRI_ABORTED_CQE completions were not being handled in the fast path.
They were being queued and deferred to the lpfc worker thread for
processing. This is an artifact of the driver design prior to moving
queue processing out of the isr and into a workq element. Now that queue
processing is already in a deferred context, remove this artifact and
process them directly.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-04 20:32:53 -05:00
James Smart 81b96eda5f scsi: lpfc: Expand WQE capability of every NVME hardware queue
Hardware queues are a fast staging area to push commands into the
adapter.  The adapter should drain them extremely quickly. However,
under heavy io load, the host cpu is pushing commands faster than the
drain rate of the adapter causing the driver to resource busy commands.

Enlarge the hardware queue (wq & cq) to support a larger number of queue
entries (4x the prior size) before backpressure. Enlarging the queue
requires larger contiguous buffers (16k) per logical page for the
hardware. This changed calling sequences that were expecting 4K page
sizes that now must pass a parameter with the page sizes. It also
required use of a new version of an adapter command that can vary the
page size values.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-04 20:32:53 -05:00
James Smart c73455e1b5 scsi: lpfc: FLOGI failures are reported when connected to a private loop.
When the HBA is connected to a private loop, the driver reports FLOGI
loop-open failure as functional error. This is an expected condition.

Mark loop-open failure as a warning instead of error.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-04 20:32:53 -05:00
Dan Carpenter 9816ef6ecb scsi: lpfc: Use after free in lpfc_rq_buf_free()
The error message dereferences "rqb_entry" so we need to print it first
and then free the buffer.

Fixes: 6c621a2229 ("scsi: lpfc: Separate NVMET RQ buffer posting from IO resources SGL/iocbq/context")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-11-28 23:46:06 -05:00
Linus Torvalds 670ffccb2f SCSI misc on 20171114
This is mostly updates of the usual suspects: lpfc, qla2xxx, hisi_sas,
 megaraid_sas, pm80xx, mpt3sas, be2iscsi, hpsa. and a host of minor
 updates.
 
 There's no major behaviour change or additions to the core in all of
 this, so the potential for regressions should be small (biggest
 potential being in the scsi error handler changes).
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJaCxtCAAoJEAVr7HOZEZN4d9EQAI+OHP6ss6zjKKC21c9jNPcH
 NhLrNv37gHg/LA2VXeUEL9RGUjCGLIUrI4HsrxzkFAMLKP4TkshMs8/2RvczY+Sa
 VpayPqVybEKLIS6ipQyM1SLIQff2nvtDVcN/T+8z1lkk45TrbA6ZGuwUwd2aJyEA
 2V2wtg51ObnL0Nr9QPPll0JrtL1AnCZyRlu9XrwTZuuSBZwk93opIuuvbZm/3dVg
 Ir4GSS4Y+PuHIfu4cxqdsPMdzRdY9I2me1YiE4jeFSn1/VTAjL4HBz7fO9eITT42
 VhXSpDz1XvFsa9dJ0ubkqoALpJzCfOcBw+EuGvSydLEvOBoEVwMccdfaD9lT1zc5
 L9e1Z5qqJoq7hTA6xTXCYfWG73I9HYvljtmc8yudKHhADOdnSTUXhaO6uBF0RNqD
 OxPSA1RZwRx3c6lDOcK6BTtvLAkTEuYKdrWSKJi0w+QXJAyQ6etqbmsKpmPdRim7
 Z4ZSpJFro2gyo9gcdJO0ykTG+z3U7Z/ay1sNgnuprsv+eU/QjUdlAPl18o79EkRf
 H54zZggZ4wC6q/cFVVt4Vx+V+oqIeu38s7NDXS9UltLoTZPm2EzDW6pXd/38Z4Tf
 a1oBAUET8kYLC90P8sVZxUIHZjITlpgDbyE2Lq00PMYXhk8S4IxF0aMN5RvVqzUv
 +7N2HrHkSSgG1nhw1t+E
 =3O85
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "This is mostly updates of the usual suspects: lpfc, qla2xxx, hisi_sas,
  megaraid_sas, pm80xx, mpt3sas, be2iscsi, hpsa. and a host of minor
  updates.

  There's no major behaviour change or additions to the core in all of
  this, so the potential for regressions should be small (biggest
  potential being in the scsi error handler changes)"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (203 commits)
  scsi: lpfc: Fix hard lock up NMI in els timeout handling.
  scsi: mpt3sas: remove a stray KERN_INFO
  scsi: mpt3sas: cleanup _scsih_pcie_enumeration_event()
  scsi: aacraid: use timespec64 instead of timeval
  scsi: scsi_transport_fc: add 64GBIT and 128GBIT port speed definitions
  scsi: qla2xxx: Suppress a kernel complaint in qla_init_base_qpair()
  scsi: mpt3sas: fix dma_addr_t casts
  scsi: be2iscsi: Use kasprintf
  scsi: storvsc: Avoid excessive host scan on controller change
  scsi: lpfc: fix kzalloc-simple.cocci warnings
  scsi: mpt3sas: Update mpt3sas driver version.
  scsi: mpt3sas: Fix sparse warnings
  scsi: mpt3sas: Fix nvme drives checking for tlr.
  scsi: mpt3sas: NVMe drive support for BTDHMAPPING ioctl command and log info
  scsi: mpt3sas: Add-Task-management-debug-info-for-NVMe-drives.
  scsi: mpt3sas: scan and add nvme device after controller reset
  scsi: mpt3sas: Set NVMe device queue depth as 128
  scsi: mpt3sas: Handle NVMe PCIe device related events generated from firmware.
  scsi: mpt3sas: API's to remove nvme drive from sml
  scsi: mpt3sas: API 's to support NVMe drive addition to SML
  ...
2017-11-14 16:23:44 -08:00
Linus Torvalds e2c5923c34 Merge branch 'for-4.15/block' of git://git.kernel.dk/linux-block
Pull core block layer updates from Jens Axboe:
 "This is the main pull request for block storage for 4.15-rc1.

  Nothing out of the ordinary in here, and no API changes or anything
  like that. Just various new features for drivers, core changes, etc.
  In particular, this pull request contains:

   - A patch series from Bart, closing the whole on blk/scsi-mq queue
     quescing.

   - A series from Christoph, building towards hidden gendisks (for
     multipath) and ability to move bio chains around.

   - NVMe
        - Support for native multipath for NVMe (Christoph).
        - Userspace notifications for AENs (Keith).
        - Command side-effects support (Keith).
        - SGL support (Chaitanya Kulkarni)
        - FC fixes and improvements (James Smart)
        - Lots of fixes and tweaks (Various)

   - bcache
        - New maintainer (Michael Lyle)
        - Writeback control improvements (Michael)
        - Various fixes (Coly, Elena, Eric, Liang, et al)

   - lightnvm updates, mostly centered around the pblk interface
     (Javier, Hans, and Rakesh).

   - Removal of unused bio/bvec kmap atomic interfaces (me, Christoph)

   - Writeback series that fix the much discussed hundreds of millions
     of sync-all units. This goes all the way, as discussed previously
     (me).

   - Fix for missing wakeup on writeback timer adjustments (Yafang
     Shao).

   - Fix laptop mode on blk-mq (me).

   - {mq,name} tupple lookup for IO schedulers, allowing us to have
     alias names. This means you can use 'deadline' on both !mq and on
     mq (where it's called mq-deadline). (me).

   - blktrace race fix, oopsing on sg load (me).

   - blk-mq optimizations (me).

   - Obscure waitqueue race fix for kyber (Omar).

   - NBD fixes (Josef).

   - Disable writeback throttling by default on bfq, like we do on cfq
     (Luca Miccio).

   - Series from Ming that enable us to treat flush requests on blk-mq
     like any other request. This is a really nice cleanup.

   - Series from Ming that improves merging on blk-mq with schedulers,
     getting us closer to flipping the switch on scsi-mq again.

   - BFQ updates (Paolo).

   - blk-mq atomic flags memory ordering fixes (Peter Z).

   - Loop cgroup support (Shaohua).

   - Lots of minor fixes from lots of different folks, both for core and
     driver code"

* 'for-4.15/block' of git://git.kernel.dk/linux-block: (294 commits)
  nvme: fix visibility of "uuid" ns attribute
  blk-mq: fixup some comment typos and lengths
  ide: ide-atapi: fix compile error with defining macro DEBUG
  blk-mq: improve tag waiting setup for non-shared tags
  brd: remove unused brd_mutex
  blk-mq: only run the hardware queue if IO is pending
  block: avoid null pointer dereference on null disk
  fs: guard_bio_eod() needs to consider partitions
  xtensa/simdisk: fix compile error
  nvme: expose subsys attribute to sysfs
  nvme: create 'slaves' and 'holders' entries for hidden controllers
  block: create 'slaves' and 'holders' entries for hidden gendisks
  nvme: also expose the namespace identification sysfs files for mpath nodes
  nvme: implement multipath access to nvme subsystems
  nvme: track shared namespaces
  nvme: introduce a nvme_ns_ids structure
  nvme: track subsystems
  block, nvme: Introduce blk_mq_req_flags_t
  block, scsi: Make SCSI quiesce and resume work reliably
  block: Add the QUEUE_FLAG_PREEMPT_ONLY request queue flag
  ...
2017-11-14 15:32:19 -08:00
James Smart 6ddcf0a30a lpfc: tie in to new dev_loss_tmo interface in nvme transport
This patch calls the new nvme transport routine for dev_loss_tmo
whenever the SCSI fc transport calls the lldd to make a dynamic
change to a remote ports dev_loss_tmo.

Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-11-10 19:53:25 -07:00
Dick Kennedy 341b2aa833 scsi: lpfc: Fix hard lock up NMI in els timeout handling.
System crashed due to a hard lockup at lpfc_els_timeout_handler+0x128.

The els ring's txcmplq list is corrupted: the last element in the list
does not point back the the head causing a loop. Issue is the els
processing path for sli4 hbas are using the hbalock instead of the
ring_lock for removing elements from the txcmplq list.

Use the adapter SLI_REV to determine which lock should be used for
removing iocbqs from the els rings txcmplq.

note: the future refactoring will address this so that we don't have
this ugly type-based lock code.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-11-08 18:25:12 -05:00
Vasyl Gomonovych 1c356ec5e9 scsi: lpfc: fix kzalloc-simple.cocci warnings
drivers/scsi/lpfc/lpfc_debugfs.c:5460:22-29: WARNING: kzalloc should be used for phba -> nvmeio_trc, instead of kmalloc/memset
drivers/scsi/lpfc/lpfc_debugfs.c:2230:20-27: WARNING: kzalloc should be used for phba -> nvmeio_trc, instead of kmalloc/memset

 Use kzalloc rather than kmalloc followed by memset with 0

Generated by: scripts/coccinelle/api/alloc/kzalloc-simple.cocci

Signed-off-by: Vasyl Gomonovych <gomonovych@gmail.com>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-11-03 12:26:09 -04:00
Kees Cook f22eb4d31c scsi: lpfc: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: James Smart <james.smart@broadcom.com>
Cc: Dick Kennedy <dick.kennedy@broadcom.com>
Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-11-01 11:27:07 -07:00
Dan Carpenter 41319e4f62 scsi: lpfc: Fix a precedence bug in lpfc_nvme_io_cmd_wqe_cmpl()
The ! has higher precedence than the & operation.  I've added
parenthesis so this works as intended.

Fixes: 952c303b32 ("scsi: lpfc: Ensure io aborts interlocked with the target.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-16 22:43:54 -04:00
Dick Kennedy f6cab3452b scsi: lpfc: change version to 11.4.0.4
Change version to 11.4.0.4

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-02 22:46:42 -04:00
James Smart 29bfd55a9c scsi: lpfc: correct nvme sg segment count check
The internal cfg flag is actually smaller, by 1 (for a partial page
sge), than the sg list maintained by the driver. Thus the check on sg
segments errored out when it shouldn't have

Ensure the check is +1

Note: having a value that is less than what it really is is bogus.
Correcting it now would be a significant rework. Add this item to the
list to be refactored in the merge with efct.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-02 22:46:42 -04:00
Dick Kennedy 1abcb3718b scsi: lpfc: Fix oops of nvme host during driver unload.
When running NVME io as a NVME host, if the driver is unloaded there
would be oops in lpfc_sli4_issue_wqe.

When unloading, controllers are torn down and the transport initiates
set_property commands to reset the controller and issues aborts to
terminate existing io.  The drivers nvme abort and fcp io submit
routines needed to recognize the driver is unloading and fail the new
requests. It didn't, resulting in the oops.

Revise the ls and fcp io submit routines to detect the unloading state
and properly handle their cleanup.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-02 22:46:41 -04:00
Dick Kennedy 6ad8c07a2f scsi: lpfc: Extend RDP support
Support RDP and Multiple Frames

If the remote Nport is not logged in, the driver would not populate all
the descriptors in the RDP response payload. Doing so would create a
payload length that requires multiple frames due to exceeding the
default rx buffer size without an explicit login. Currently FC-LS
explicitly states the RDP response must be a single frame sequence.
Thus we did not violate the standard.

Recently, a modification to FC-LS was accepted which allows multi-frame
sequences and all vendors have indicated they are interoperable with the
change. As such, extend RDP support with the additional fields and send
a multi-frame sequence.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-02 22:46:41 -04:00
Dick Kennedy 952c303b32 scsi: lpfc: Ensure io aborts interlocked with the target.
Before releasing nvme io back to the io stack for possible retry on
other paths, ensure the io termination is interlocked with the target
device by ensuring the entire ABTS-LS protocol is complete.

Additionally, FC-NVME ABTS-LS protocol does not use RRQ. Remove RRQ
behavior from ABTS-LS.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-02 22:46:40 -04:00
Dick Kennedy 184fc2b9a8 scsi: lpfc: Fix secure firmware updates
Firmware update fails with: status x17 add_status x56 on the final write

If multiple DMA buffers are used for the download, some firmware revs
have difficulty with signatures and crcs split across the dma buffer
boundaries.  Resolve by making all writes be a single 4k page in length.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-02 22:46:40 -04:00
Dick Kennedy b7672ae681 scsi: lpfc: Fix crash in lpfc_nvme_fcp_io_submit during LIP
The driver is seeing a NULL pointer in lpfc_nvme_fcp_io_submit.  This
was ultimately due to a transport AER being sent on a terminated
controller, thus some of the values were not set. In case we're in a
system without a corrected transport and in case a race condition occurs
where we enter the routine as the teardown is happening in a separate
thread, validate the parameters before starting the io.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-02 22:46:39 -04:00
Dick Kennedy 42270dce9d scsi: lpfc: Disable NPIV support if NVME is enabled
The initial implementation of NVME didn't merge with NPIV support.  As
such, there are several issues if NPIV is used with NVME. For now,
ensure that if NVME is enabled then NPIV is not enabled.

Support for NPIV with NVME will be added in the near future.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-02 22:46:39 -04:00
Dick Kennedy e7981a2c72 scsi: lpfc: Fix oops if nvmet_fc_register_targetport fails
if nvmet targetport registration fails, the driver encounters a NULL
pointer oops in lpfc_hb_timeout_handler.

To fix: if registration fails, ensure nvmet_support is cleared on the
port structure.

Also enhanced the log message on failure.

Cc: <stable@vger.kernel.org> # 4.12+
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-02 22:46:38 -04:00
Dick Kennedy cf4c8c8610 scsi: lpfc: Revise NVME module parameter descriptions for better clarity
The descriptions for lpfc_xri_split and lpfc_enable_fc4_type were
poor. Revise for better understanding:

  lpfc_xri_split - Percentage of FCP XRI resources versus NVME
  lpfc_enable_fc4_type - Enable FC4 Protocol support - FCP / NVME

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-02 22:46:38 -04:00
James Smart c578f6f4b9 scsi: lpfc: Set missing abort context
Always set ctxp->state to LPFC_NVMET_STE_ABORT if ABORT op gets called

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-02 22:46:37 -04:00
James Smart e3246a123d scsi: lpfc: Reduce log spew on controller reconnects
There are several log messages that report abnormal terminations that by
default are marked warn. These are typically the result of failures due
to invalid controller state or abort completions. They are all natural
when a controller resets.

Unfortunately, as they are logged by default, it makes the admin very
concerned.

Convert the messages to Info.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-02 22:46:37 -04:00
Dick Kennedy 8e036a9497 scsi: lpfc: Fix FCP hba_wqidx assignment
The driver is encountering  oops in lpfc_sli_calc_ring.

The driver is setting hba_wqidx for FCP based on the policy in use for
NVME. The two may not be the same.  Change to set the wqidx based on the
FCP policy.

Cc: <stable@vger.kernel.org> # 4.12+
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-02 22:46:36 -04:00
Dick Kennedy f485c18db2 scsi: lpfc: Move CQ processing to a soft IRQ
Under heavy target nvme load duration, the lpfc irq handler is
encountering cpu lockup warnings.

Convert the driver to a shortened ISR handler which identifies the
interrupting condition then schedules a workq thread to process the
completion queue the interrupt was for. This moves all the real work
into the workq element.

As nvmet_fc upcalls are no longer in ISR context, don't set the feature
flags

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-02 22:46:36 -04:00
Dick Kennedy c8a4ce0bf3 scsi: lpfc: Make ktime sampling more accurate
Need to make ktime samples more accurate

If ktime is turned on in the middle of an IO, the max calculation could
be misleading. Base sampling on the start time of the IO as opposed to
ktime_on.

Make ISR ktime timestamps be from when CQE is read instead of EQE.
Added additional sanity checks when deciding whether to accept an IO
sample or not.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-02 22:46:35 -04:00
Dick Kennedy e8bcf0ae4c scsi: lpfc: PLOGI failures during NPIV testing
Local Reject/Invalid RPI errors seen during discovery.

Temporary RPI cleanup was occurring regardless of SLI rev. It's only
necessary on SLI-4.

Adjust the test for whether cleanup is necessary.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-02 22:46:35 -04:00
Dick Kennedy 2299e4323d scsi: lpfc: Fix warning messages when NVME_TARGET_FC not defined
Warning messages when NVME_TARGET_FC not defined on ppc builds

The lpfc_nvmet_replenish_context() function is only meaningful when NVME
target mode enabled. Surround the function body with ifdefs for target
mode enablement.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-02 22:46:34 -04:00
Dick Kennedy 2b75d0f934 scsi: lpfc: Fix lpfc nvme host rejecting IO with Not Ready message
In a link bounce scenario, a condition can occur where the discovery
engine swaps an ndlp structure (address change for an nport). While the
swap was successfully executed by the discovery engine, the driver did
not properly detect a change in the ndlp bound to the nvme rport.  This
error resulted in the nvme host transport issuing an IO to the correct
nvme rport, but the lpfc driver addressed a ndlp with an NLP_UNUSED
status and failed the io. This resulting it it looking like there were
missing namespaces and applications failed due to io errors.

To fix, in lpfc_nvme_register_rport, rework the "rebind" case to break
the nvme rport<->ndlp association when the ndlp already has an
nrport. Then rebind the rport to the correct ndlp data and backpointers.

[mkp: typo]

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-02 22:46:34 -04:00
Dick Kennedy 1234a6d54f scsi: lpfc: Fix crash receiving ELS while detaching driver
The driver crashes when attempting to use a freed ndpl pointer.

The pci_remove_one handler runs on a separate kernel thread. The order
of the removal is starting by freeing all of the ndlps and then
disabling interrupts. In between these two events the driver can still
receive an ELS and process it. When it tries to use the ndlp pointer
will be NULL

Change the order of the pci_remove_one vs disable interrupts so that
interrupts are disabled before the ndlp's are freed.

Cc: <stable@vger.kernel.org> # 4.12+
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-02 22:46:33 -04:00
Dick Kennedy 401bb4169d scsi: lpfc: fix pci hot plug crash in list_add call
During pci hot plug, the kernel crashes in a list_add_call

The lookup by tag function will return null if the IOCB is out of range
or does not have the on txcmplq flag set.

Fix: Check for null return from lookup by tag.

Cc: <stable@vger.kernel.org> # 4.12+
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-02 22:46:32 -04:00
Dick Kennedy 1901762f2c scsi: lpfc: fix pci hot plug crash in timer management routines
During pci hot plug, the kernel crashes in timer management code.

The sli4 remove_one handler is not stoping the timers as it starts to
remove the port so that it can be swapped.

Fix: Stop the timers early in the handler routine.

Note: Fix in SLI-4 only. SLI-3 already stopped the timers properly.

Cc: <stable@vger.kernel.org> # 4.12+
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-02 22:46:32 -04:00
Linus Torvalds 0b33ce72ea SCSI fixes on 20170930
Eight mostly minor fixes for recently discovered issues in drivers.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJZz2OGAAoJEAVr7HOZEZN4NTQQAL1avebL1Abau5ODtGTciJrZ
 I4QqUuJ0PGo9AKWIA/s34cY7kdPt2qhIoUJY0bLKdv8QtR5MDqfJ6c+2ianKenWm
 EkDKYBQ2csBqzH9VN0tEIPQhvLr7oxvCgEDjfvusxWoUX4AgSv2bMaxIpgs4GUxS
 hOH4fg/6A0HuhP/HjtYfd89DBdhhKOPU6oRx6YLF4ctlfZxAcALsvVjNAGaJzZX8
 Bpv2K/xWOwb/UghjJvxv1wYZ+AL0BHAFVZilBFpyX+ClRhmU9d9SVYs43CbwSu+Y
 41qIAYLJXrls6CSWJMTEo/+mhbPMQBS6Q3wSewOaNZXkx4comjJHpx/k2HsiVk7K
 ObN9eIXNzyby132pIIHc9ZRSpJRTr0/jjb9FetneZlLHaNabtIf67JA0j4fIoCEh
 Qb73uR030mCNvQ+xvK8DxF9UE1zQXnXiJWoIH9NrGtFtknQOmFlfFfmo/gMDI8w6
 aYkxuHdqn2oFjKDuZOQ4zYa8ptcZAAml64gj2YiNiwgNosEpt9UMJ5WRknrGJ7po
 jHohzKyKkSykjOxgOL1Noh5d7AoWPtJ1Qbg+Leyg1WLnRGHTgAYYmv1LabOdMhbM
 SIMhNRBFQunCBKTyes/RB2Nykl1OXipnxIaRgSrRhsEiTOVutmyy7jr6BbZSB2vY
 XBpadk38SGSsZva+Sfk0
 =v2G5
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Eight mostly minor fixes for recently discovered issues in drivers"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: ILLEGAL REQUEST + ASC==27 => target failure
  scsi: aacraid: Add a small delay after IOP reset
  scsi: scsi_transport_fc: Also check for NOTPRESENT in fc_remote_port_add()
  scsi: scsi_transport_fc: set scsi_target_id upon rescan
  scsi: scsi_transport_iscsi: fix the issue that iscsi_if_rx doesn't parse nlmsg properly
  scsi: aacraid: error: testing array offset 'bus' after use
  scsi: lpfc: Don't return internal MBXERR_ERROR code from probe function
  scsi: aacraid: Fix 2T+ drives on SmartIOC-2000
2017-09-30 12:50:56 -07:00
Thomas Meyer df2f7729f2 scsi: lpfc: Cocci spatch "pool_zalloc-simple"
Use *_pool_zalloc rather than *_pool_alloc followed by memset with 0.
Found by coccinelle spatch "api/alloc/pool_zalloc-simple.cocci"

Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-09-25 19:05:38 -04:00
James Smart 8e009ce846 lpfc: remove use of FC-specific error codes
The lpfc driver uses the FC-specific error when it needed to return an
error to the FC-NVME transport. Convert to use a generic value instead.

Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-09-25 08:56:05 -06:00
Stefano Brivio 5c756065e4 scsi: lpfc: Don't return internal MBXERR_ERROR code from probe function
Internal error codes happen to be positive, thus the PCI driver core
won't treat them as failure, but we do. This would cause a crash later
on as lpfc_pci_remove_one() is called (e.g. as shutdown function).

Fixes: 6d368e5321 ("[SCSI] lpfc 8.3.24: Add resource extent support")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Acked-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-09-15 21:15:32 -04:00
Colin Ian King 858e51e8cb scsi: lpfc: remove redundant null check on eqe
The pointer eqe is always non-null inside the while loop, so the check
to see if eqe is NULL is redudant and hence can be removed.

Detected by CoverityScan CID#1248693 ("Logically Dead Code")

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-09-15 15:39:11 -04:00
Linus Torvalds 572c01ba19 SCSI misc on 20170907
This is mostly updates of the usual suspects: lpfc, qla2xxx, hisi_sas, megaraid_sas, zfcp and a host of minor updates.
 
 The major driver change here is the elimination of the block based
 cciss driver in favour of the SCSI based hpsa driver (which now drives
 all the legacy cases cciss used to be required for).  Plus a reset
 handler clean up and the redo of the SAS SMP handler to use bsg lib.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJZscDNAAoJEAVr7HOZEZN4DWIQAK/UkkrvKpV/jLATM/yi7CoL
 QidY86Hmwwl7A9HQ+2fjLfAsye0xcCzRwkucKK90IP5b4pefHhiJJfiMKAAe3TUW
 xstnY5z5jaOhDG4nyJFoSm5fH5qXkMnJ8NZRK8f6Qg5yBN5dStEKqoBboNsz4KBI
 md7idw0mbp5i2GXlJwSpc5eDS97GiPL6WkwgGaGKfXF1NDau0GbEdjijfz55haCD
 pMhY7WJh/71RfOq/1ThXT1Z3khOlVcKXrkdO+602n7zh/klRBRtBC8m2a6xCfZPj
 n7Pb/s0jhCQPd+e/Xtv7WEbY8uNOCrGoVgZ6U5EGrT5IeTfep24ackYqerjMhE63
 esi4BJY8lUP9SGleLMgjYWyCHdmxBJRa7UI614DWN/H0QoGP6j/2EzGoi5Fw04vC
 H8/+aqPPWZc9KUBioRYo8xWO8YgMqL2eyXY+Tc9cwxqAe2T6k/NC1zJVgDFKXfzb
 QoWW4v9NNmYwf5vL/7tNgkeTMFQV66yUR7dR3SGTSk8UIrJ40ok0JyUAsDg86ZAH
 BfMkWwhWQ6Byoel0Y7Ti88T49Cox/64r/I0ux06Qgg99+KpRLT7z20+GLIEHgXxg
 116C39rgvYKqzc7W8RCyj8qSROuMVzg6QFbB6n+1PEsYIX2O8A2Re3jdS34q2LbX
 aBDm/Lfdl4kkJrV9xY6P
 =nQUG
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "This is mostly updates of the usual suspects: lpfc, qla2xxx, hisi_sas,
  megaraid_sas, zfcp and a host of minor updates.

  The major driver change here is the elimination of the block based
  cciss driver in favour of the SCSI based hpsa driver (which now drives
  all the legacy cases cciss used to be required for). Plus a reset
  handler clean up and the redo of the SAS SMP handler to use bsg lib"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (279 commits)
  scsi: scsi-mq: Always unprepare before requeuing a request
  scsi: Show .retries and .jiffies_at_alloc in debugfs
  scsi: Improve requeuing behavior
  scsi: Call scsi_initialize_rq() for filesystem requests
  scsi: qla2xxx: Reset the logo flag, after target re-login.
  scsi: qla2xxx: Fix slow mem alloc behind lock
  scsi: qla2xxx: Clear fc4f_nvme flag
  scsi: qla2xxx: add missing includes for qla_isr
  scsi: qla2xxx: Fix an integer overflow in sysfs code
  scsi: aacraid: report -ENOMEM to upper layer from aac_convert_sgraw2()
  scsi: aacraid: get rid of one level of indentation
  scsi: aacraid: fix indentation errors
  scsi: storvsc: fix memory leak on ring buffer busy
  scsi: scsi_transport_sas: switch to bsg-lib for SMP passthrough
  scsi: smartpqi: remove the smp_handler stub
  scsi: hpsa: remove the smp_handler stub
  scsi: bsg-lib: pass the release callback through bsg_setup_queue
  scsi: Rework handling of scsi_device.vpd_pg8[03]
  scsi: Rework the code for caching Vital Product Data (VPD)
  scsi: rcu: Introduce rcu_swap_protected()
  ...
2017-09-07 21:11:05 -07:00
Arnd Bergmann 5fe5a6c9ac scsi: lpfc: avoid false-positive gcc-8 warning
This is an interesting regression with gcc-8, showing a harmless warning
for correct code:

In file included from include/linux/kernel.h:13:0,
                 ...
                 from drivers/scsi/lpfc/lpfc_debugfs.c:23:
include/linux/printk.h:301:2: error: 'eq' may be used uninitialized in this function [-Werror=maybe-uninitialized]
  printk(KERN_ERR pr_fmt(fmt), ##__VA_ARGS__)
  ^~~~~~
In file included from drivers/scsi/lpfc/lpfc_debugfs.c:58:0:
drivers/scsi/lpfc/lpfc_debugfs.h:451:31: note: 'eq' was declared here

I managed to reduce the warning into a small test case for gcc-8 that I
reported in the gcc bugzilla[1].

As a workaround, this changes the logic to move the two assignments of
'eq' out of the conditions and instead make the index conditional.  This
works for all configurations I tried and avoids adding a bogus
initialization.

Acked-by: James Smart <james.smart@broadcom.com>
Link: [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81958
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-25 18:26:52 -04:00
Arnd Bergmann 7362617319 scsi: lpfc: avoid an unused function warning
The only reference to lpfc_nvmet_replenish_context() is inside of an
disabled:

drivers/scsi/lpfc/lpfc_nvmet.c:1457:1: error: 'lpfc_nvmet_replenish_context' defined but not used [-Werror=unused-function]

This replaces the preprocessor conditional with a C condition, so the
compiler can see that the function is intentionally unused.

Fixes: 9a38e4f1c82f ("scsi: lpfc: Fix MRQ > 1 context list handling")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-25 18:26:28 -04:00
Dick Kennedy 610448367c scsi: lpfc: lpfc version bump 11.4.0.3
Update driver version to 11.4.0.3

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:44 -04:00
Maurizio Lombardi 286871a666 scsi: lpfc: fix "integer constant too large" error on 32bit archs
cc1: warnings being treated as errors
drivers/scsi/lpfc/lpfc_init.c: In function 'lpfc_get_wwpn':
drivers/scsi/lpfc/lpfc_init.c:3253: error: integer constant is too large for 'long' type

Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:44 -04:00
James Smart 44fd7fe3dd scsi: lpfc: Add Buffer to Buffer credit recovery support
Add Buffer to buffer credit recovery support to the driver.  This is a
negotiated feature with the peer that allows for both sides to detect
dropped RRDY's and FC Frames and recover credit.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:43 -04:00
James Smart d58734f05f scsi: lpfc: remove console log clutter
Change hw queue binding messages to info - not error.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:43 -04:00
Dick Kennedy 6b486ce9ee scsi: lpfc: Fix bad sgl reposting after 2nd adapter reset
Port issue was fixed, the hbacmd reset would take more than 8 minutes to
complete.

There were conflicting NVME SGL posting/reposting responsibilities
between lpfc_online()/lpfc_sli4_hba_setup() and
lpfc_nvme_create_localport().  The lpfc_online() causes a REPOST on
existing NVME SGLs which is not released during the fc port reset.
However, lpfc_nvme_create_localport() wants to allocate new NVME buffers
and post them. Both cancelled out each other which had a side effect of
hosing the mailbox handling that was used to remove the sgl lists -
causing multiple 60s mbx timeouts.

Fix by preserving all SGL lists over the fc port reset.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:42 -04:00
Dick Kennedy a145fda381 scsi: lpfc: Fix nvme target failure after 2nd adapter reset
The nonrecovery occurred because the lpfc nvme initiator function did
not reestablish its localport creation with the nvme host transport in
lpfc_oneline.  Because of that, an NVME rport binding could not take
place.

Corrected by recreating the localport in the adapter reset recovery
routine.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:42 -04:00
Dick Kennedy c6e0c92506 scsi: lpfc: Fix relative offset error on large nvmet target ios
If the nvmet_fc transport breaks an io into multiple sequences, the
driver will improperly set the relative offset on the 2nd through N
sequences.

Correct by properly formatting the hw cmd so the relative offset is
picked up from the hw cmd.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:41 -04:00
Dick Kennedy 66d7ce93a0 scsi: lpfc: Fix MRQ > 1 context list handling
Various oops including cpu LOCKUPs were seen.

For asynchronously received ius where the driver must assign exchange
resources, the resources were on a single get (free) list and put list
(finished, waiting to be put on get list). As all cpus are sharing the
lists, an interrupt for a receive frame may have to wait for all the
other cpus to place their done work onto the put list before it can
acquire the lock to pull from the list.

Fix by breaking the resource lists into per-cpu lists or at least more
than 1 list with cpu's sharing the lists). A cpu would allocate from the
free list for its own cpu, and put its done work on the its own put list
- avoiding the contention. As cpu load may vary, when empty, a cpu may
grab from another cpu, thereby changing resource distribution.  But
searching for a resource only occurs on 1 or a few cpus until a single
resource can be allocated. if the condition reoccurs, it starts looking
at a different cpu.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:41 -04:00
Dick Kennedy e3e2863def scsi: lpfc: Limit amount of work processed in IRQ
Various oops being seen on being in the ISR too long and cpu lockups,
when under heavy load.

The amount of work being posted off of completion queues kept the ISR
running almost all the time

Correct the issue by limiting the amount of work per iteration.

[mkp: typo]

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:40 -04:00
Dick Kennedy 176de5bb20 scsi: lpfc: Correct issues with FAWWN and FDISCs
When using fabric-assigned WWNs, the switch doesn't like copy of the
FLOGI payload, which includes valid VVL bits, to be used as the FDISC
payload.

Rather than wait for corrected switch firmware, ensure the VVL bits are
marked invalid on FDISCs.

[mkp: typo]

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:39 -04:00
Dick Kennedy 991f0c0e33 scsi: lpfc: Fix NVME PRLI handling during RSCN
A race condition was found whereby the initiator would receive the RSCN
for a new NVME device before it had a chance to register its FC4 support
with the fabric. Thus, when queried by the initiator, it would see that
the target supported FC-NVME.

Corrected by making the assumption that the target always supports
FC-NVME thus a PRLI is sent. It's ok for the target to reject it.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:39 -04:00
Dick Kennedy 4b40d02b8b scsi: lpfc: Fix crash in lpfc nvmet when fc port is reset
In adapter reset tests, an oops was seen with a NULL pointer in
lpfc_free_rq_buffer+0x20/0x60

The driver is failing to properly repost the nvmet sgl list when
recovering from the reset. Thus the driver eventually trys to walk an
errant buffer list.

Corrected the sgl buffer recovery as well as strengthening the
initialization of the bufferlist.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:38 -04:00
Dick Kennedy 4adc041b4d scsi: lpfc: Fix duplicate NVME rport entries and namespaces.
After lip, the driver sometimes would have two rports for the same
device, allowing the namespaces to be duplicated by nvme.

In lpfc_plogi_confirm_nport() the driver was not swapping the nrport
maintained by the ndlp's undergoing address swapping. This allowed the
2nd rport to sneak in as it was considered a separate device.

This patch adds the fixes to Swap the nrport in each ndlp and take care
of the reference counts on the ndlps similar to FCP rports.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:38 -04:00
Dick Kennedy 8db1c2b3e7 scsi: lpfc: Fix handling of FCP and NVME FC4 types in Pt2Pt topology
After link bounce in a NVME Pt2Pt config, the driver managed to map the
same nport twice, resulting in multiple device nodes for the same
namespace.

In Pt2Pt, the driver must send PRLI's for both (scsi) FCP and NVME
rather than using fabric aids. The driver was inconsistent on handling
various PRLI completions, especially rejects, which had reject codes
cross the different protocol PRLI completions.

Fixed to perform the following: if nvmet mode (fc port can only be a
nvme target) - rejects all unsolicitly FCP PRLI's. Never issues a FCP
PRLI.

The multiple protocol PRLI's are sent simultaneously. However, driver
will now only state transition after both PRLI's are complete. New flags
were added to aid tracking the responses from the different PRLI's.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:37 -04:00
Dick Kennedy cd22d6057c scsi: lpfc: Correct return error codes to align with nvme_fc transport
Modify driver return error codes to align with host nvme transport.

Driver isn't returning Exxx error codes to properly reflect out of
resource or connectivity conditions (-EBUSY), yet there were hard error
conditions returning -EBUSY.

Ensure the following situations return the proper return code:

 - Temporary failures or temporary resource availability: -EBUSY

 - Connectivity issues: -ENODEV

All others are treated as hard errors and return an -Exxx value that
indicates the type of error.

Also, lpfc_sli4_issue_wqe() was modified to not translate error from
-Exxx to WQE state.  This allows lpfc_nvme_fcp_io_submit() routine to
just return whatever -E value was returned from other routines.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:37 -04:00
Dick Kennedy ffb70cd6b6 scsi: lpfc: convert info messages to standard messages
Transitioned some informational discovery messages to now always be
displayed when log_verbose is set.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:36 -04:00
Dick Kennedy bb6a8a2c24 scsi: lpfc: Fix oops when NVME Target is discovered in a nonNVME environment
lpfc oops when it discovers a NVME target but is configured for SCSI
only operation. Oops is in lpfc_nvme_register_port+0x33/0x300.

The localport is not valid so it should not have been referenced.

Added validity check for localport

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:36 -04:00
Dick Kennedy d2aa48761e scsi: lpfc: Fix rediscovery on switch blade pull
When the switch blade is pulled out then plugged back in, the driver
does not issue a PLOGI to the target

When the switch blade is pulled out, it does not reset the link. The
driver ends up issuing a LOGO to the target, and finally sees devloss.
Since the driver believes that a LOGO is outstanding, it does not issue
a PLOGI to the target upon link up

Correct by placing the ndlp in UNUSED state When devloss happens in
LOGO_ISSUE state.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:35 -04:00
Dick Kennedy 2877cbffb7 scsi: lpfc: Fix loop mode target discovery
The driver does not discover targets when in loop mode.

The NLP type is correctly getting set when a fabric connection is
detected but, not for loop. The unknown NLP type means that the driver
does not issue a PRLI when in loop topology. Thus target discovery
fails.

Fix by checking the topology during discovery.  If it is loop, set the
NLP FC4 type to FCP.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:35 -04:00
Dick Kennedy 1fe68477d2 scsi: lpfc: Fix plogi collision that causes illegal state transition
Message "0271 Illegal State Transition: node" seen in logs, all luns are
unuseable for that target.

A window exists in the rcv_plogi path where if the state is plogi issue
but the driver has not issued a plogi, then two reglogins will be sent
for the same RPI. The first one to complete will advance the state to
prli issue the second one will be detected as an illegal state, and
leave the node in an unusable state.

Correct the completion routine for the PLOGI ACC that detects the state
change when the driver starts discovery on the node again and drop the
REGLOGIN mailbox command.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:34 -04:00
Gustavo A. R. Silva 44ed33e6c5 scsi: lpfc: remove useless code in lpfc_sli4_bsg_link_diag_test
Remove variable assignments. The value stored in local variable _rc_ is
overwritten at line 2448:rc = lpfc_sli4_bsg_set_link_diag_state(phba,
0); before it can be used.

Addresses-Coverity-ID: 1226935
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:09 -04:00
James Smart 5073842093 lpfc: support nvmet_fc defer_rcv callback
Currently, calls to nvmet_fc_rcv_fcp_req() always copied the
FC-NVME cmd iu to a temporary buffer before returning, allowing
the driver to immediately repost the buffer to the hardware.

To address timing conditions on queue element structures vs async
command reception, the nvmet_fc transport occasionally may need to
hold on to the command iu buffer for a short period. In these cases,
the nvmet_fc_rcv_fcp_req() will return a special return code
(-EOVERFLOW). In these cases, the LLDD must delay until the new
defer_rcv lldd callback is called before recycling the buffer back
to the hw.

This patch adds support for the new nvmet_fc transport defer_rcv
callback and recognition of the new error code when passing commands
to the transport.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-08-10 11:19:05 +02:00
Romain Perier 771db5c0e3 scsi: lpfc: Replace PCI pool old API
The PCI pool API is deprecated. This commit replaces the PCI pool old
API by the appropriate function with the DMA pool API. It also updates
some comments, accordingly.

Signed-off-by: Romain Perier <romain.perier@collabora.com>
Reviewed-by: Peter Senna Tschudin <peter.senna@collabora.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-07 14:04:01 -04:00
Linus Torvalds 130568d5ea Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull more block updates from Jens Axboe:
 "This is a followup for block changes, that didn't make the initial
  pull request. It's a bit of a mixed bag, this contains:

   - A followup pull request from Sagi for NVMe. Outside of fixups for
     NVMe, it also includes a series for ensuring that we properly
     quiesce hardware queues when browsing live tags.

   - Set of integrity fixes from Dmitry (mostly), fixing various issues
     for folks using DIF/DIX.

   - Fix for a bug introduced in cciss, with the req init changes. From
     Christoph.

   - Fix for a bug in BFQ, from Paolo.

   - Two followup fixes for lightnvm/pblk from Javier.

   - Depth fix from Ming for blk-mq-sched.

   - Also from Ming, performance fix for mtip32xx that was introduced
     with the dynamic initialization of commands"

* 'for-linus' of git://git.kernel.dk/linux-block: (44 commits)
  block: call bio_uninit in bio_endio
  nvmet: avoid unneeded assignment of submit_bio return value
  nvme-pci: add module parameter for io queue depth
  nvme-pci: compile warnings in nvme_alloc_host_mem()
  nvmet_fc: Accept variable pad lengths on Create Association LS
  nvme_fc/nvmet_fc: revise Create Association descriptor length
  lightnvm: pblk: remove unnecessary checks
  lightnvm: pblk: control I/O flow also on tear down
  cciss: initialize struct scsi_req
  null_blk: fix error flow for shared tags during module_init
  block: Fix __blkdev_issue_zeroout loop
  nvme-rdma: unconditionally recycle the request mr
  nvme: split nvme_uninit_ctrl into stop and uninit
  virtio_blk: quiesce/unquiesce live IO when entering PM states
  mtip32xx: quiesce request queues to make sure no submissions are inflight
  nbd: quiesce request queues to make sure no submissions are inflight
  nvme: kick requeue list when requeueing a request instead of when starting the queues
  nvme-pci: quiesce/unquiesce admin_q instead of start/stop its hw queues
  nvme-loop: quiesce/unquiesce admin_q instead of start/stop its hw queues
  nvme-fc: quiesce/unquiesce admin_q instead of start/stop its hw queues
  ...
2017-07-11 15:36:52 -07:00
Linus Torvalds 9031114841 SCSI misc on 20170704
This is mostly updates of the usual suspects: lpfc, qla2xxx, bnx2fc,
 qedf, hpsa, hisi_sas, smartpqi, cxlflash, aacraid, csiostor along with
 a host of minor and miscellaneous changes.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJZW7JMAAoJEAVr7HOZEZN4IM4P/AqBtvH+6Lo1Eb+3A/HnHskK
 hIxVxgBxaw3fhW5AegDfVCvrdVTTEkCB/g5CIKN8NCWEx6OGmCX0Lu6lnjld9BOZ
 cTlPtzNwFGlgHrz34LwCc3vlc5ovMpTQBrpGAQpGGWoAZIP+c3ilEihIYTEMNCsN
 dmjI71AigDE5g6X1OT361IJ1gydkjfG41IcRe/jlMtEgRNdy3B2JVIdATL89Pw4b
 0uZO3uUTn8EGEKUdyJZCNpie7sGZv8u2LhA+Znby2C4h3bwWNV/d0p7ped4xrQY5
 yVpZEUbYVdcOOYBgeBJlfwOhvjRQTdxeK4d7W9XTb+AQf30F3DgSepdMCdf3BjVt
 KnQvBOTxyidB8xsCL46wlxxNew3qoUtaKoY88WUOOnnJwU5U7hlRtPkf/eO2i5QF
 +k7fxUpFfkBTS4I2gXnyGWpmSoxwJerd0knojSOjrjJcAlcgM65+pocUAea/0Dpr
 SsfL2sTb12gk6bkF9UlRv8/4aSsWYb92WW1nbTt2nFRXncPNN5Qzc3lGj//36O+b
 2bka+aSKVAFoNAnQ1pUE8EJxSboy5q7y4509iZzO/Fom+pVuzBROm5fmrpcOE5dP
 IjW7gqSFB6578tnNiK049rrrPja5wkUa+Ptc8s0FjPAVyIDrp2RN+f2nljOBBhW8
 3Z1nXMG0eFqvb5taLtfZ
 =D9QX
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "This is mostly updates of the usual suspects: lpfc, qla2xxx, bnx2fc,
  qedf, hpsa, hisi_sas, smartpqi, cxlflash, aacraid, csiostor along with
  a host of minor and miscellaneous changes"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (276 commits)
  qla2xxx: Fix NVMe entry_type for iocb packet on BE system
  scsi: qla2xxx: avoid unused-function warning
  scsi: snic: fix a couple of spelling mistakes/typos
  scsi: qla2xxx: fix a bunch of typos and spelling mistakes
  scsi: lpfc: don't double count abort errors
  scsi: lpfc: spin_lock_irq() is not nestable
  scsi: hisi_sas: optimise DMA slot memory
  scsi: ibmvfc: constify dev_pm_ops structures.
  scsi: ibmvscsi: constify dev_pm_ops structures.
  scsi: cxlflash: Update debug prints in reset handlers
  scsi: cxlflash: Update send_tmf() parameters
  scsi: cxlflash: Avoid double free of character device
  scsi: Add STARGET_CREATED_REMOVE state to scsi_target_state
  scsi: ses: do not add a device to an enclosure if enclosure_add_links() fails.
  scsi: ufs: flush eh_work when eh_work scheduled.
  scsi: qla2xxx: Protect access to qpair members with qpair->qp_lock
  scsi: sun_esp: fix device reference leaks
  scsi: fnic: changing queue command to return result DID_IMM_RETRY when rport is init
  scsi: fnic: correct speed display and add support for 25,40 and 100G
  scsi: fnic: added timestamp reporting in fnic debug stats
  ...
2017-07-06 12:10:33 -07:00
Linus Torvalds 3bad2f1c67 Merge branch 'work.misc-set_fs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull misc user access cleanups from Al Viro:
 "The first pile is assorted getting rid of cargo-culted access_ok(),
  cargo-culted set_fs() and field-by-field copyouts.

  The same description applies to a lot of stuff in other branches -
  this is just the stuff that didn't fit into a more specific topical
  branch"

* 'work.misc-set_fs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  Switch flock copyin/copyout primitives to copy_{from,to}_user()
  fs/fcntl: return -ESRCH in f_setown when pid/pgid can't be found
  fs/fcntl: f_setown, avoid undefined behaviour
  fs/fcntl: f_setown, allow returning error
  lpfc debugfs: get rid of pointless access_ok()
  adb: get rid of pointless access_ok()
  isdn: get rid of pointless access_ok()
  compat statfs: switch to copy_to_user()
  fs/locks: don't mess with the address limit in compat_fcntl64
  nfsd_readlink(): switch to vfs_get_link()
  drbd: ->sendpage() never needed set_fs()
  fs/locks: pass kernel struct flock to fcntl_getlk/setlk
  fs: locks: Fix some troubles at kernel-doc comments
2017-07-05 13:13:32 -07:00
Dmitry Monakhov 128b6f9fdd t10-pi: Move opencoded contants to common header
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-07-03 16:56:25 -06:00
Dan Carpenter 88e9617420 scsi: lpfc: don't double count abort errors
If lpfc_nvmet_unsol_fcp_issue_abort() fails then we accidentally
increment "tgtp->xmt_abort_rsp_error" and then two lines later we
increment it a second time.

Fixes: 547077a44b ("scsi: lpfc: Adding additional stats counters for nvme.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-07-01 17:09:11 -04:00
Dan Carpenter c4031db72b scsi: lpfc: spin_lock_irq() is not nestable
We're calling spin_lock_irq() multiple times, the problem is that on the
first spin_unlock_irq() then we will re-enable IRQs and we don't want
that.

Fixes: 966bb5b711 ("scsi: lpfc: Break up IO ctx list into a separate get and put list")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-07-01 17:08:41 -04:00
Al Viro ca1579f6c6 Merge remote-tracking branch 'jl/locks-4.13' into work.misc-set_fs 2017-06-26 23:52:33 -04:00
James Smart cb45e5295a scsi: lpfc: fix refcount error on node list
A change in remote port removal introduced a spurious put which can
cause a premature structure teardown. The affects were most notable when
the driver attempted to unload as a null pointer would be hit.

Fix by removing the unnecessary put.

Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-26 15:01:01 -04:00
James Smart 00cefeb964 scsi: lpfc: Fix nvme io stoppage after link bounce
On link down, transport is calling driver to abort outstanding ios.
Driver erroneously rejects the abort if the port indicates it isn't
logged in - which will be the case after the link down. Thus, the io
can't clean up. This prevents reconnection at the transport level.

Fix by allowing abort to proceed.

Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-26 15:01:00 -04:00
James Smart b609b1c753 scsi: lpfc: update to revision to 11.4.0.1
Set lpfc driver revision to 11.4.0.1

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-19 21:41:12 -04:00
James Smart d6564e5260 scsi: lpfc: Driver responds LS_RJT to Beacon Off ELS - Linux
Beacon OFF from switch is rejected by driver.

Driver fails Beacon OFF if frequency is set to 0. As per fc-ls spec,
status, capability, frequency and duration fields are only applicable
for Beacon ON.

Remove frequency and type checks. Reject Beacon ON if duration is non
zero.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-19 21:41:03 -04:00
James Smart 4550f9c75e scsi: lpfc: Fix crash in lpfc_sli_ringtxcmpl_put when nvmet gets an abort request.
When running nvme detach-ns /dev/nvme0n1 -n 1 command, the nvmet lpfc
driver crashes with this stack dump:

kernel BUG at /root/NVME/lpfc_8.4/lpfc_sli.c:1393!
invalid opcode: 0000 [#1] SMP
Workqueue: nvmet-fc-cpu0 nvmet_fc_do_work_on_cpu [nvmet_fc]
 lpfc_sli4_issue_wqe+0x357/0x440 [lpfc]
 lpfc_nvmet_xmt_fcp_abort+0x36b/0x5c0 [lpfc]
 nvmet_fc_abort_op+0x30/0x50 [nvmet_fc]
 nvmet_fc_do_work_on_cpu+0xd9/0x130 [nvmet_fc]
 process_one_work+0x14e/0x410
 worker_thread+0x116/0x490
 kthread+0xc7/0xe0
 ret_from_fork+0x3f/0x70

Crash is due to an uninitialized iocbq->vport pointer.

Explicitly set the iocbq->vport field to phba->pport in
lpfc_nvmet_sol_fcp_issue_abort as it does all abort iocbq initialization
in the routine.  Using phba->pport is ok because target does not support
NPIV instances.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-19 21:40:53 -04:00