1
0
Fork 0
Commit Graph

4175 Commits (55e18b18335132cb21d815e09c376fca891f5af2)

Author SHA1 Message Date
Neel Desai 53526500f3 IB/hfi1: Permanently enable P_Key checking in HFI
Ingress and egress port P_Key checking should always be performed for
HFIs. This patch will enable ingress and egress P_Key checking when
the port is initialized and will ignore the P_Key information sent by
the FM in the port info structure which is meant to be used only by the
switch.

Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Neel Desai <neel.desai@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-28 13:56:21 -04:00
Stuart Summers 98b9ee2002 IB/hfi1: Cache neighbor secure data after link up
Secure data is transferred across the link during verify
cap. This includes Neighbor Guid, Type, and Port Number.
This transfer is not guaranteed to complete until the 8051
firmware has completed processing of the state_complete
frame. Move the consumption of this data from verify cap
handling to link up handling to ensure the data is finalized.

Additionally, do not notify the SM that the link is up until
after this data is actually available.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Stuart Summers <john.s.summers@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-28 13:56:20 -04:00
Neel Desai 03e80e9a57 IB/hfi1: Adjust high temperature warning for QSFP cable
When we receive a QSFP_HIGH_TEMP_ALARM or QSFP_HIGH_TEMP_WARNING
interrupt, print a "QSFP cable temperature too high" message.

Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Neel Desai <neel.desai@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-28 13:56:20 -04:00
Tadeusz Struk 22546b741a IB/hfi1: Fix softlockup issue
Soft lockups can occur because the mad processing on different CPUs acquire
the spin lock dc8051_lock:

[534552.835870]  [<ffffffffa026f993>] ? read_dev_port_cntr.isra.37+0x23/0x160 [hfi1]
[534552.835880]  [<ffffffffa02775af>] read_dev_cntr+0x4f/0x60 [hfi1]
[534552.835893]  [<ffffffffa028d7cd>] pma_get_opa_portstatus+0x64d/0x8c0 [hfi1]
[534552.835904]  [<ffffffffa0290e7d>] hfi1_process_mad+0x48d/0x18c0 [hfi1]
[534552.835908]  [<ffffffff811dc1f1>] ? __slab_free+0x81/0x2f0
[534552.835936]  [<ffffffffa024c34e>] ? ib_mad_recv_done+0x21e/0xa30 [ib_core]
[534552.835939]  [<ffffffff811dd153>] ? __kmalloc+0x1f3/0x240
[534552.835947]  [<ffffffffa024c3fb>] ib_mad_recv_done+0x2cb/0xa30 [ib_core]
[534552.835955]  [<ffffffffa0237c85>] __ib_process_cq+0x55/0xd0 [ib_core]
[534552.835962]  [<ffffffffa0237d70>] ib_cq_poll_work+0x20/0x60 [ib_core]
[534552.835964]  [<ffffffff810a7f3b>] process_one_work+0x17b/0x470
[534552.835966]  [<ffffffff810a8d76>] worker_thread+0x126/0x410
[534552.835969]  [<ffffffff810a8c50>] ? rescuer_thread+0x460/0x460
[534552.835971]  [<ffffffff810b052f>] kthread+0xcf/0xe0
[534552.835974]  [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140
[534552.835977]  [<ffffffff81696418>] ret_from_fork+0x58/0x90
[534552.835980]  [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140

This issue is made worse when the 8051 is busy and the reads take longer.
Fix by using a non-spinning lock procure.

Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Reviewed-by: Mike Marciszyn <mike.marciniszyn@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-28 13:56:15 -04:00
Mike Marciniszyn b6eac931b9 IB/hfi1: Prevent kernel QP post send hard lockups
The driver progress routines can call cond_resched() when
a timeslice is exhausted and irqs are enabled.

If the ULP had been holding a spin lock without disabling irqs and
the post send directly called the progress routine, the cond_resched()
could yield allowing another thread from the same ULP to deadlock
on that same lock.

Correct by replacing the current hfi1_do_send() calldown with a unique
one for post send and adding an argument to hfi1_do_send() to indicate
that the send engine is running in a thread.   If the routine is not
running in a thread, avoid calling cond_resched().

CC: <stable@vger.kernel.org> # 4.7.x-
Fixes: Commit 831464ce4b ("IB/hfi1: Don't call cond_resched in atomic mode when sending packets")
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-28 13:48:01 -04:00
Don Hiatt 3d591099a0 IB/hfi1: Use defines from common headers
Move FECN and BECN related defines to common header files

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-28 13:48:01 -04:00
Don Hiatt cb42705792 IB/hfi1: Add functions to parse 9B headers
These inline functions improve code readability by
enabling callers to read specific fields from the
header without knowledge of byte offsets.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-28 13:48:01 -04:00
Dasaratharaman Chandramouli aad559c21d IB/hfi1: Rename hdr2sc to hfi1_9B_get_sc5
The function really returned the 5-bit sc value from
the header and rhf. hdr2sc didn't quite describe what it did.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-28 13:48:01 -04:00
Sebastian Sanchez 3ca4fbc84a IB/hfi1: Return SC2VL mappings to FM with VL15 instead of ILLEGAL_VL
VL15 in the SC2VL table is used to indicate an invalid SC
for the FM, however, internally the driver remaps SCs from
VL15 to ILLEGAL_VL to prevent error counts. This mapping
confuses the FM when performing a sweep, making it return
a table mismatch error. Have SMA convert ILLEGAL_VL
to VL15 entries for the SC2VL table queries.

Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-28 13:48:01 -04:00
Michael J. Ruhl db730894f4 IB/hfi1: Validate the TID count before using it
Improve the safety of the code by validating the user supplied
tidcnt before use.

Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-28 13:48:01 -04:00
Michael J. Ruhl aad9ff97dd IB/rdmavt/hfi1/qib: Use the MGID and MLID for multicast addressing
The Infiniband spec defines "A multicast address is defined by a
MGID and a MLID" (section 10.5).

The current code only uses the MGID for identifying multicast groups.
Update the driver to be compliant with this definition.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-28 13:48:01 -04:00
Michael J. Ruhl 72fb70f5a3 IB/hfi1: Correct MulticastMask/CollectiveMask info to SMA output
The FM uses the values of MulticastMask and CollectiveMask to
determine the number of bits for net masks. The current values of
0 and 0 are incorrect.  The values should be 4 and 1.  Updated the
necessary code to reflect the specified values.

Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-28 13:48:01 -04:00
Geliang Tang 2443c6cc92 IB/qib: use setup_timer
Use setup_timer() instead of init_timer() to simplify the code.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-28 13:19:14 -04:00
Geliang Tang 79d9df5618 IB/nes: use setup_timer
Use setup_timer() instead of init_timer() to simplify the code.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-28 13:19:13 -04:00
Geliang Tang 96ff2c11c5 IB/i40iw: use setup_timer
Use setup_timer() instead of init_timer() to simplify the code.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-28 13:19:05 -04:00
Leon Romanovsky 1e0729348f IB/nes: Fix incorrect type in assignment
Fix mismatch between types, wqe_words are in le32 format, while opcode
in CPU format.

The following sparse warnings are helped to find it:

drivers/infiniband/hw/nes/nes_hw.c:3058:24: warning: incorrect type in assignment (different base types)
drivers/infiniband/hw/nes/nes_hw.c:3058:24:    expected unsigned int [unsigned] [assigned] [usertype] opcode
drivers/infiniband/hw/nes/nes_hw.c:3058:24:    got restricted __le32 <noident>

CC: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-28 13:11:43 -04:00
Leon Romanovsky 313e16d5af IB/usnic: Simplify the code to balance loc/unlock calls
Simplify code in find_free_vf_and_create_qp_grp() to avoid sparse error
regarding call to unlock in the block other than lock was called.

drivers/infiniband/hw/usnic/usnic_ib_verbs.c:206:9: warning: context imbalance
			in 'find_free_vf_and_create_qp_grp' - different lock
			contexts for basic block

CC: Christian Benvenuti <benve@cisco.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-28 13:11:43 -04:00
Leon Romanovsky f5029e752b Ib/usnic: Explicitly include usnic headers
Sparse tool complains about undeclared symbols in usnic_ib_verbs.c
and usnic_ib_sysfs.c This is caused by lack of direct include of
appropriate usnic_ib_verbs.h and usnic_ib_sysfs.h, where all
these functions were declared.

Simple include eliminates 30 warnings similar to the below one:

drivers/infiniband/hw/usnic/usnic_ib_sysfs.c:304:6: warning: symbol
				'usnic_ib_sysfs_unregister_usdev' was
				not declared. Should it be static?

CC: Christian Benvenuti <benve@cisco.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-28 13:11:43 -04:00
Pan Bian 9ef63f31ad iw_cxgb4: check return value of alloc_skb
Function alloc_skb() will return a NULL pointer when there is no enough
memory. However, the return value of alloc_skb() is directly used
without validation in function send_fw_pass_open_req(). This patches
checks the return value of alloc_skb() against NULL.

Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-28 13:09:55 -04:00
Amrani, Ram b6acd71fef RDMA/qedr: add support for send+invalidate in poll CQ
Split the poll responder CQ into two functions.
Add support for send+invalidate in poll CQ.

Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-28 12:47:57 -04:00
Amrani, Ram 4dd72636c9 RDMA/qedr: destroy CQ only after HW releases it
Wait for all relevant CNQ interrupts before freeing the CQ.
Don't invoke completion handlers for a destroyed CQ.

Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-28 12:47:57 -04:00
Amrani, Ram 942b3b2c41 RDMA/qedr: enhance destroy flow for GSI QP
Avoid attempting to release irrelevant (and unused) resources for GSI QP.

Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-28 12:47:57 -04:00
Amrani, Ram f92faaba11 RDMA/qedr: properly check atomic capabilities
After checking the path upwards towards root complex, actualy check
root complex atomic_req capability, and not our own NIC.
Verify that the PCIe device control register's atomic egress block
is cleared in the path.
Verify that the PCIe version is at least 2.

Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-28 12:47:57 -04:00
Amrani, Ram 08c4cf51e3 RDMA/qedr: reset access control when registering a MR
Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-28 12:47:57 -04:00
Eric Biggers cda37124f4 fs: constify tree_descr arrays passed to simple_fill_super()
simple_fill_super() is passed an array of tree_descr structures which
describe the files to create in the filesystem's root directory.  Since
these arrays are never modified intentionally, they should be 'const' so
that they are placed in .rodata and benefit from memory protection.
This patch updates the function signature and all users, and also
constifies tree_descr.name.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-04-26 23:54:06 -04:00
Artemy Kovalyov db570d7dea IB/mlx5: Add ODP support to MW
Internally MW implemented as KLM MKey and filled by userspace UMR
postsends.  Handle pagefault trigered by operations on this MKeys.

Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-25 15:40:28 -04:00
Artemy Kovalyov 1b7dbc26fc IB/mlx5: Extract page fault code
To make page fault handling code more flexible
split pagefault_single_data_segment() function.
Keep MR resolution in pagefault_single_data_segment() and
move actual updates into pagefault_single_mr().

Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-25 15:40:28 -04:00
Artemy Kovalyov b2ac91885b IB/mlx5: Add contiguous ODP support
Currenlty ODP supports only regular MMU pages.
Add ODP support for regions consisting of physically contiguous chunks
of arbitrary order (huge pages for instance) to improve performance.

Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-25 15:40:28 -04:00
Artemy Kovalyov 4df4a5bac3 IB/mlx5: Decrease verbosity level of ODP errors
Decrease verbosity level of ODP error flows messages to debug level.
Remove one redundant print since debug level message already exists in
this flow.

Fixes: d9aaed8387 ('{net,IB}/mlx5: Refactor page fault handling')
Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-25 15:40:28 -04:00
Artemy Kovalyov 523791d7c5 IB/mlx5: Fix implicit MR GC
When implicit MR's leaf MKey becomes unused, i.e. when it's
last page being released my MMU invalidation it is marked as "dying"
and scheduled for release by garbage collector.
Currentle consequent page fault may remove "dying" flag.
Treat leaf MKey as non-existent once it was scheduled to removal
by GC.

Fixes: 81713d3788 ('IB/mlx5: Add implicit MR support')
Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-25 15:40:28 -04:00
Artemy Kovalyov 438b228e03 IB/mlx5: Fix UMR size calculation
Translation table updates of large UMR may require multiple post send
operations. The last operations can be in various lengths, but current
code set them to be the same length.

Fixes: 7d0cc6edcc ('IB/mlx5: Add MR cache for large UMR regions')
Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-25 15:40:28 -04:00
Artemy Kovalyov bd174fc2ca IB/mlx5: Fix function updating xlt emergency path
In memory shortage path we fall back to use spare buffer.
mlx5_ib_update_xlt() called from ib_uverbs_reg_mr when ibmr.ucontext
not initialized yet.

Scenario how to test it:
1. trigger memory exhaustion so __get_free_pages(GFP_KERNEL, 4) will fail
2. register MR
3. there should be no kernel oops

Fixes: 7d0cc6edcc ('IB/mlx5: Add MR cache for large UMR regions')
Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-25 15:40:28 -04:00
Artemy Kovalyov 3e7e1193e2 IB: Replace ib_umem page_size by page_shift
Size of pages are held by struct ib_umem in page_size field.

It is better to store it as an exponent, because page size by nature
is always power-of-two and used as a factor, divisor or ilog2's argument.

The conversion of page_size to be page_shift allows to have portable
code and avoid following error while compiling on ARM:

  ERROR: "__aeabi_uldivmod" [drivers/infiniband/core/ib_core.ko] undefined!

CC: Selvin Xavier <selvin.xavier@broadcom.com>
CC: Steve Wise <swise@chelsio.com>
CC: Lijun Ou <oulijun@huawei.com>
CC: Shiraz Saleem <shiraz.saleem@intel.com>
CC: Adit Ranadive <aditr@vmware.com>
CC: Dennis Dalessandro <dennis.dalessandro@intel.com>
CC: Ram Amrani <Ram.Amrani@Cavium.com>
Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Acked-by: Ram Amrani <Ram.Amrani@cavium.com>
Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Acked-by: Selvin Xavier <selvin.xavier@broadcom.com>
Acked-by: Selvin Xavier <selvin.xavier@broadcom.com>
Acked-by: Adit Ranadive <aditr@vmware.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-25 15:40:28 -04:00
Christoph Hellwig 21c433a74b IB/hfi1: Use pcie_flr() instead of duplicating it
Tested-by: Jakub Byczkowski <jakub.byczkowski@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Doug Ledford <dledford@redhat.com>
2017-04-25 14:36:19 -05:00
Ira Weiny e8ea95af87 IB/hfi: Fix up comments in engine mapping
Fix off by 1 error in comments documenting the sdma and send context
mappings.

Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-25 15:24:51 -04:00
Arnd Bergmann 5b0ff9a007 infiniband: hns: avoid gcc-7.0.1 warning for uninitialized data
hns_roce_v1_cq_set_ci() calls roce_set_bit() on an uninitialized field,
which will then change only a few of its bits, causing a warning with
the latest gcc:

infiniband/hw/hns/hns_roce_hw_v1.c: In function 'hns_roce_v1_cq_set_ci':
infiniband/hw/hns/hns_roce_hw_v1.c:1854:23: error: 'doorbell[1]' is used uninitialized in this function [-Werror=uninitialized]
  roce_set_bit(doorbell[1], ROCEE_DB_OTHERS_H_ROCEE_DB_OTH_HW_SYNS_S, 1);

The code is actually correct since we always set all bits of the
port_vlan field, but gcc correctly points out that the first
access does contain uninitialized data.

This initializes the field to zero first before setting the
individual bits.

Fixes: 9a4435375c ("IB/hns: Add driver files for hns RoCE driver")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-25 15:16:38 -04:00
Yuval Shaia 4d6f28591f {net,IB}/{rxe,usnic}: Utilize generic mac to eui32 function
This logic seems to be duplicated in (at least) three separate files.
Move it to one place so code can be re-use.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
2017-04-25 14:21:34 -04:00
Yuval Shaia a7c81326ca IB/usnic: Remove unused functions
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-25 14:12:14 -04:00
Ganesh Goudar e821303c42 iw_cxgb4: Use dsgl by default
Enable the use of dsgl by default and determine whether dsgl is
supported from lld info.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Bharat Potnuri <bharat@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-25 14:04:41 -04:00
Doug Ledford 374cb8610a RDMA/bnxt_re: Use IS_ERR_OR_NULL where appropriate
Constructs such as if (ptr && !IS_ERR(ptr)) can be shorted to
just !IS_ERR_OR_NULL(ptr) instead.  Make substitutions in the bnxt_re
driver where appropriate.

Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-25 14:00:59 -04:00
Colin Ian King ebbd1dfb26 RDMA/bnxt_re: remove redundant initialization of rc to zero
rc is initialized to zero but is then updated by calls to
bnxt_qplib_free_fast_reg_page_list and/or bnxt_qpliob_free_mrw
so the initialization is redundant and can be removed.

Detected with CoverityScan, CID#1408448 ("Unused Value")

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-25 13:36:59 -04:00
Noa Osherovich f1b65df5a2 IB/mlx5: Add support for active_width and active_speed in RoCE
Add missing calculation and translation of active_width and
active_speed for RoCE.

Fixes: 3f89a643eb ('IB/mlx5: Extend query_device/port to ...')
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-21 12:58:41 -04:00
Noa Osherovich 50f22fd8ec IB/mlx5: Set mlx5_query_roce_port's return value to void
In case of an error, the properties reported to user
are zeroed out, so no need for a return value.

Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-21 12:54:51 -04:00
Moni Shoua 12f8fedef2 IB/mlx5: Set correct SL in completion for RoCE
There is a difference when parsing a completion entry between Ethernet
and IB ports. When link layer is Ethernet the bits describe the type of
L3 header in the packet. In the case when link layer is Ethernet and VLAN
header is present the value of SL is equal to the 3 UP bits in the VLAN
header. If VLAN header is not present then the SL is undefined and consumer
of the completion should check if IB_WC_WITH_VLAN is set.

While that, this patch also fills the vlan_id field in the completion if
present.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-21 12:29:31 -04:00
Parav Pandit e1f24a79f4 IB/mlx5: Support congestion related counters
This patch adds support to query the congestion related hardware counters
through new command and links them with other hw counters being available
in hw_counters sysfs location.

In order to reuse existing infrastructure it renames related q_counter
data structures to more generic counters to reflect q_counters and
congestion counters and maybe some other counters in the future.

New hardware counters:
 * rp_cnp_handled - CNP packets handled by the reaction point
 * rp_cnp_ignored - CNP packets ignored by the reaction point
 * np_cnp_sent    - CNP packets sent by notification point to respond to
                     CE marked RoCE packets
 * np_ecn_marked_roce_packets - CE marked RoCE packets received by
                                notification point

It also avoids returning ENOSYS which is specific for invalid
system call and produces the following checkpatch.pl warning.

WARNING: ENOSYS means 'invalid syscall nr' and nothing else
+		return -ENOSYS;

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-21 12:29:31 -04:00
Leon Romanovsky a43402af1e IB/mthca: Check validity of output parameter pointer
The mthca driver didn't check supplied pointer to functions
mthca_cmd_poll() and mthca_cmd_wait(). This caused to the following
smatch errors:

drivers/infiniband/hw/mthca/mthca_cmd.c:371 mthca_cmd_poll() error: we previously assumed 'out_param' could be null (see line 353)
drivers/infiniband/hw/mthca/mthca_cmd.c:454 mthca_cmd_wait() error: we previously assumed 'out_param' could be null (see line 432)

In reality all callers of these functions are setting out_is_imm
flag are providing pointer too. However it is better to check
again to remove smatch errors to achieve warning free subsystem.

Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-21 12:29:31 -04:00
Slava Shwartsman a22ed86cff IB/mlx5: Add drop flow steering rule support
A drop rule is described by an action drop and no destination.
If a user specified IB_FLOW_SPEC_ACTION_DROP then set the action
to MLX5_FLOW_CONTEXT_ACTION_DROP and clear the destination.

Signed-off-by: Slava Shwartsman <slavash@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-21 12:29:31 -04:00
Ariel Levkovich 19cc75249a IB/mlx5: Use IP version matching to classify IP traffic
This change adds the ability for flow steering to classify IPv4/6
packets with MPLS tag (Ethertype 0x8847 and 0x8848) as standard IP
packets and hit IPv4/6 classifed steering rules.

When user added a flow rule with IP classification, driver was
implicitly adding ethertype matching to the created rule in order
to distinguish between IPv4 and IPv6 protocols.
Since IP packets with MPLS tag header have MPLS ethertype, they missed
the rule and ended up hitting the default filters.
Such behavior prevented from MPLS packets to undergo inbound traffic
load balancing flows (if such were defined by configuring RSS) to
achieve higher throughput - the way that non-MPLS IP packets performed.

Since our device is able to look past the MPLS tag and identify the
next protocol we introduce this solution which replaces Ethertype
matching by the device's capability to perform IP version parsing
and matching in order to distinguish between IPv4 and IPv6.
Therefore, whenever a flow with IP spec is added and device support IP
version matching, driver will implicitly add IP version matching to the
rule (Based on the IP spec type) without Ethertype matching which will
cause relevant MPLS tagged packets to hit this rule as well.
Otherwise (device doesn't support IP version matching), we fall back to
setting Ethertype matching.

If the user's filters specify an L2 ethertype and an IP spec
the rule will then match both the ethertype and the IP version.

The device's support for IP version matching is reported by the
device via dedicated capability bit in query_device_cap and named
outer/inner_ip_version.

Signed-off-by: Ariel Levkovich <lariel@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-21 12:26:05 -04:00
Ariel Levkovich 0f750966dc IB/mlx5: Add inner spec and IPv6 validation in user's flow attribute list
This change fixes an incomplete validation of the user's
flow attributes list.

Previous implementation validated only matching of IPv4 Ethertype
to IPv4 spec of outer headers (in case both Ethernet with specified
Ethertype and IP specs were present) and lacked the validation of:
1. Matching of IPv6 Ethertype in Ethernet spec (if such exists) to an
   IPv6 protocol spec (if such exists).
2. Validation of Ethertype to IP protocol matching on inner headers specs.
Which could cause some combinations of unmatching Ethernet and IP
protocols to pass validation and apply on the device.

The fix adds validation of IPv6 Ethertype and IP spec as well as
performing the scan on both outer and inner attributes.

Fixes: 038d2ef875 ("Add flow steering support")
Signed-off-by: Ariel Levkovich <lariel@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-21 12:26:05 -04:00
Bodong Wang 44f2e99ecd IB/mlx5: Fix wrong use of kfree at bad flow in create_cq_user
The kfree was called to free cqb, while it should free *cqb.

Fixes: 1cbe6fc86c ("IB/mlx5: Add support for CQE compressing")
Signed-off-by: Bodong Wang <bodong@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-21 12:26:05 -04:00
Maor Gottlieb 00b7c2abb6 IB/mlx5: Enlarge autogroup flow table
In order to enlarge the flow group size to 8k, we decrease
the number of flow group types to 6 and increase the flow
table size to 64k.

Flow group size is calculated as follow:
  group_size = table_size / (#group_types + 1)

Fixes: 038d2ef875 ('IB/mlx5: Add flow steering support')
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-21 12:26:05 -04:00
Maor Gottlieb dac388ef4c IB/mlx5: Check supported flow table size
Check that the required flow table size is supported
by device. Return ENOMEM error if no space left.

In addition change the create flow table routine
to return ENOMEM instead of ENOSPC.

Fixes: 038d2ef875 ('IB/mlx5: Add flow steering support')
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-21 12:26:05 -04:00
Maor Gottlieb 1377661298 IB/mlx5: Change vma from shared to private
Anonymous VMA (->vm_ops == NULL) cannot be shared, otherwise
it would lead to SIGBUS.

Remove the shared flags from the vma after we change it to be
anonymous.

This is easily reproduced by doing modprobe -r while running a
user-space application such as raw_ethernet_bw.

Fixes: 7c2344c3bb ('IB/mlx5: Implements disassociate_ucontext API')
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-21 12:26:05 -04:00
Maor Gottlieb ecc7d83be3 IB/mlx5: Take write semaphore when changing the vma struct
When the driver disassociate user context, it changes the vma to
anonymous by setting the vm_ops to null and zap the vma ptes.

In order to avoid race in the kernel, we need to take write lock
before we change the vma entries.

Fixes: 7c2344c3bb ('IB/mlx5: Implements disassociate_ucontext API')
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-21 12:26:05 -04:00
Maor Gottlieb ca37a664a8 IB/mlx4: Change vma from shared to private
Anonymous VMA (->vm_ops == NULL) cannot be shared, otherwise
it would lead to SIGBUS.

Remove the shared flags from the vma after we change it to be
anonymous.

This is easily reproduced by doing modprobe -r while running a
user-space application such as raw_ethernet_bw.

Fixes: ae184ddeca ('IB/mlx4_ib: Disassociate support')
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-21 12:26:05 -04:00
Maor Gottlieb 22c3653d04 IB/mlx4: Take write semaphore when changing the vma struct
When the driver disassociate user context, it changes the vma to
anonymous by setting the vm_ops to null and zap the vma ptes.

In order to avoid race in the kernel, we need to take write lock
before we change the vma entries.

Fixes: ae184ddeca ('IB/mlx4_ib: Disassociate support')
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-21 12:26:05 -04:00
Jack Morgenstein fb7a91746a IB/mlx4: Reduce SRIOV multicast cleanup warning message to debug level
A warning message during SRIOV multicast cleanup should have actually been
a debug level message. The condition generating the warning does no harm
and can fill the message log.

In some cases, during testing, some tests were so intense as to swamp the
message log with these warning messages, causing a stall in the console
message log output task. This stall caused an NMI to be sent to all CPUs
(so that they all dumped their stacks into the message log).
Aside from the message flood causing an NMI, the tests all passed.

Once the message flood which caused the NMI is removed (by reducing the
warning message to debug level), the NMI no longer occurs.

Sample message log (console log) output illustrating the flood and
resultant NMI (snippets with comments and modified with ... instead
of hex digits, to satisfy checkpatch.pl):

 <mlx4_ib> _mlx4_ib_mcg_port_cleanup: ... WARNING: group refcount 1!!!...
 *** About 4000 almost identical lines in less than one second ***
 <mlx4_ib> _mlx4_ib_mcg_port_cleanup: ... WARNING: group refcount 1!!!...
 INFO: rcu_sched detected stalls on CPUs/tasks: { 17} (...)
 *** { 17} above indicates that CPU 17 was the one that stalled ***
 sending NMI to all CPUs:
 ...
 NMI backtrace for cpu 17
 CPU: 17 PID: 45909 Comm: kworker/17:2
 Hardware name: HP ProLiant DL360p Gen8, BIOS P71 09/08/2013
 Workqueue: events fb_flashcursor
 task: ffff880478...... ti: ffff88064e...... task.ti: ffff88064e......
 RIP: 0010:[ffffffff81......]  [ffffffff81......] io_serial_in+0x15/0x20
 RSP: 0018:ffff88064e257cb0  EFLAGS: 00000002
 RAX: 0000000000...... RBX: ffffffff81...... RCX: 0000000000......
 RDX: 0000000000...... RSI: 0000000000...... RDI: ffffffff81......
 RBP: ffff88064e...... R08: ffffffff81...... R09: 0000000000......
 R10: 0000000000...... R11: ffff88064e...... R12: 0000000000......
 R13: 0000000000...... R14: ffffffff81...... R15: 0000000000......
 FS:  0000000000......(0000) GS:ffff8804af......(0000) knlGS:000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080......
 CR2: 00007f2a2f...... CR3: 0000000001...... CR4: 0000000000......
 DR0: 0000000000...... DR1: 0000000000...... DR2: 0000000000......
 DR3: 0000000000...... DR6: 00000000ff...... DR7: 0000000000......
 Stack:
 ffff88064e...... ffffffff81...... ffffffff81...... 0000000000......
 ffffffff81...... ffff88064e...... ffffffff81...... ffffffff81......
 ffffffff81...... ffff88064e...... ffffffff81...... 0000000000......
 Call Trace:
[<ffffffff813d099b>] wait_for_xmitr+0x3b/0xa0
[<ffffffff813d0b5c>] serial8250_console_putchar+0x1c/0x30
[<ffffffff813d0b40>] ? serial8250_console_write+0x140/0x140
[<ffffffff813cb5fa>] uart_console_write+0x3a/0x80
[<ffffffff813d0aae>] serial8250_console_write+0xae/0x140
[<ffffffff8107c4d1>] call_console_drivers.constprop.15+0x91/0xf0
[<ffffffff8107d6cf>] console_unlock+0x3bf/0x400
[<ffffffff813503cd>] fb_flashcursor+0x5d/0x140
[<ffffffff81355c30>] ? bit_clear+0x120/0x120
[<ffffffff8109d5fb>] process_one_work+0x17b/0x470
[<ffffffff8109e3cb>] worker_thread+0x11b/0x400
[<ffffffff8109e2b0>] ? rescuer_thread+0x400/0x400
[<ffffffff810a5aef>] kthread+0xcf/0xe0
[<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
[<ffffffff81645858>] ret_from_fork+0x58/0x90
[<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
Code: 48 89 e5 d3 e6 48 63 f6 48 03 77 10 8b 06 5d c3 66 0f 1f 44 00 00 66 66 66 6

As indicated in the stack trace above, the console output task got swamped.

Fixes: b9c5d6a643 ("IB/mlx4: Add multicast group (MCG) paravirtualization for SR-IOV")
Cc: <stable@vger.kernel.org> # v3.6+
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-21 12:26:05 -04:00
Jack Morgenstein 99e68909d5 IB/mlx4: Fix ib device initialization error flow
In mlx4_ib_add, procedure mlx4_ib_alloc_eqs is called to allocate EQs.

However, in the mlx4_ib_add error flow, procedure mlx4_ib_free_eqs is not
called to free the allocated EQs.

Fixes: e605b743f3 ("IB/mlx4: Increase the number of vectors (EQs) available for ULPs")
Cc: <stable@vger.kernel.org> # v3.4+
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-21 12:26:05 -04:00
Majd Dibbiny dd77abf8a0 IB/mlx4: Support RAW Ethernet when RoCE is disabled
On some environments, such as certain SR-IOV VF configurations, RoCE
isn't supported for mlx4 Ethernet ports. Currently the driver will
not open IB device on that port.

This is problematic since we do want user-space RAW Ethernet QPs functionality
to remain in place. For that end, enhance the relevant driver flows such that we
do create a device instance in that case.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-21 12:26:05 -04:00
Doug Ledford 339e7575ad cxgb4: Convert PDBG to pr_debug the second
A couple spots were missed in the original patch to implement this
change.  Add those spots.

Fixes: a9a42886d0 (cxgb4: Convert PDBG to pr_debug)
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20 22:18:54 -04:00
Markus Elfring 4418b27b52 IB/hns: Use kcalloc() in hns_roce_buddy_init()
* Multiplications for the size determination of memory allocations
  indicated that array data structures should be processed.
  Thus use the corresponding function "kcalloc".

  This issue was detected by using the Coccinelle software.

* Replace the specification of data types by pointer dereferences
  to make the corresponding size determinations a bit safer according to
  the Linux coding style convention.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20 16:31:49 -04:00
Markus Elfring e1d717de5d IB/hns: Use kmalloc_array() in hns_roce_cmd_use_events()
* A multiplication for the size determination of a memory allocation
  indicated that an array data structure should be processed.
  Thus use the corresponding function "kmalloc_array".

  This issue was detected by using the Coccinelle software.

* Replace the specification of a data structure by a pointer dereference
  to make the corresponding size determination a bit safer according to
  the Linux coding style convention.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20 16:31:49 -04:00
Markus Elfring db6f0289f5 IB/hfi1: Coding style improvement (make sizeof use safer)
Replace the specification of a data structure by a reference to
the desired member as the parameter for the operator "sizeof" to make
the corresponding size determination a bit safer according to
the Linux coding style convention.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20 16:25:04 -04:00
Markus Elfring e036c2006c IB/hfi1: Remove intermediate var in hfi1_user_sdma_alloc_queues()
* Pass a product for a call of the function "vmalloc_user" without storing
  it in an intermediate variable.

* Delete the local variable "memsize" which became unnecessary with
  this refactoring.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20 16:24:05 -04:00
Markus Elfring 147d84e1e3 IB/hfi1: Use kcalloc() in hfi1_user_sdma_alloc_queues()
* Multiplications for the size determination of memory allocations
  indicated that array data structures should be processed.
  Thus reuse the corresponding function "kcalloc".

  This issue was detected by using the Coccinelle software.

* Replace the specification of a data type by a pointer dereference
  to make the corresponding size determination a bit safer according to
  the Linux coding style convention.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20 16:23:25 -04:00
Markus Elfring 4076e5187d IB/hfi1: Use kcalloc() in hfi1_user_exp_rcv_init()
* A multiplication for the size determination of a memory allocation
  indicated that an array data structure should be processed.
  Thus reuse the corresponding function "kcalloc".

  This issue was detected by using the Coccinelle software.

* Replace the specification of a data type by a pointer dereference
  to make the corresponding size determination a bit safer according to
  the Linux coding style convention.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20 16:23:25 -04:00
Joe Perches a9a42886d0 cxgb4: Convert PDBG to pr_debug
Use a more typical logging style.

Miscellanea:

o Obsolete the c4iw_debug module parameter
o Coalesce formats
o Realign arguments

Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20 16:14:13 -04:00
Joe Perches 700456bd25 cxgb4: Use more common logging style
Convert printks to pr_<level>

Miscellanea:

o Coalesce formats
o Realign arguments

Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20 16:13:20 -04:00
Joe Perches b7b37ee0e1 cxgb3: Convert PDBG to pr_debug
Using the normal mechanism, not an indirected one, is clearer.

Miscellanea:

o Coalesce formats
o Realign arguments

Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20 16:13:20 -04:00
Joe Perches 46b2d4e8ec cxgb3: Use more common logging style
Convert printks to pr_<level>

Miscellanea:

o Coalesce formats
o Realign arguments

Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20 16:13:20 -04:00
Vishwanathapura, Niranjana 64551ede6c IB/hfi1: VNIC SDMA support
HFI1 VNIC SDMA support enables transmission of VNIC packets over SDMA.
Map VNIC queues to SDMA engines and support halting and wakeup of the
VNIC queues.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20 15:19:41 -04:00
Vishwanathapura, Niranjana 2280740f01 IB/hfi1: Virtual Network Interface Controller (VNIC) HW support
HFI1 HW specific support for VNIC functionality.
Dynamically allocate a set of contexts for VNIC when the first vnic
port is instantiated. Allocate VNIC contexts from user contexts pool
and return them back to the same pool while freeing up. Set aside
enough MSI-X interrupts for VNIC contexts and assign them when the
contexts are allocated. On the receive side, use an RSM rule to
spread TCP/UDP streams among VNIC contexts.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Andrzej Kacprowski <andrzej.kacprowski@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20 15:19:35 -04:00
Vishwanathapura, Niranjana d4829ea603 IB/hfi1: OPA_VNIC RDMA netdev support
Add support to create and free OPA_VNIC rdma netdev devices.
Implement netstack interface functionality including xmit_skb,
receive side NAPI etc. Also implement rdma netdev control functions.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Andrzej Kacprowski <andrzej.kacprowski@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20 12:03:12 -04:00
Doug Ledford 23790ba2d7 Merge branch 'k.o/for-4.12' into k.o/for-4.12-rdma-netdevice 2017-04-20 12:00:41 -04:00
Erez Shitrit 93d576af3c hw/mlx5: Add New bit to check over QP creation
Add check for bit IB_QP_CREATE_NETIF_QP while creating QP.

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-17 11:08:32 -04:00
Saeed Mahameed 258545449b net/mlx5e: IPoIB, Xmit flow
Implement mlx5e's IPoIB SKB transmit using the helper functions provided
by mlx5e ethernet tx flow, the only difference in the code between
mlx5e_xmit and mlx5i_xmit is that IPoIB has some extra fields to fill
(UD datagram segment) in the TX descriptor (WQE) and it doesn't need to
have any vlan handling.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-17 11:08:31 -04:00
David S. Miller 6f14f443d3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Mostly simple cases of overlapping changes (adding code nearby,
a function whose name changes, for example).

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-06 08:24:51 -07:00
Don Hiatt 243d9f436f IB/hfi1: Add transmit fault injection feature
Add ability to fault packets on transmit by opcode.
Dropping by packet can be achieved by setting the mask to 0.

In order to drop non-verbs traffic we set PbcInsertHrc
to NONE (0x2). The packet will still be delivered to
the receiving node but a KHdrHCRCErr (KDETH packet
with a bad HCRC) will be triggered and the packet will
not be delivered to the correct context.

In order to drop regular verbs traffic we set the
PbcTestEbp flag. The packet will still be delivered
to the receiving node but a 'late ebp error' will
be triggered and will be dropped.

A global toggle (/sys/kernel/debug/hfi1/hfi1_X/fault_suppress_err)
has been added to suppress the error messages on the receive
node when a packet was faulted on the sending node.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-05 14:45:09 -04:00
Don Hiatt 0181ce31b2 IB/hfi1: Add receive fault injection feature
Add fault injection capability:
  - Drop packets unconditionally (fault_by_packet)
  - Drop packets based on opcode (fault_by_opcode)

This feature reacts to the global FAULT_INJECTION
config flag.

The faulting traces have been added:
  - misc/fault_opcode
  - misc/fault_packet

See 'Documentation/fault-injection/fault-injection.txt'
for details.

Examples:
  - Dropping packets by opcode:
    /sys/kernel/debug/hfi1/hfi1_X/fault_opcode
	# Enable fault
	echo Y > fault_by_opcode
	# Setprobability of dropping (0-100%)
	# echo 25 > probability
	# Set opcode
	echo 0x64 > opcode
	# Number of times to fault
	echo 3 > times
	# An optional mask allows you to fault
	# a range of opcodes
	echo 0xf0 > mask
    /sys/kernel/debug/hfi1/hfi1_X/fault_stats
    contains a value in parentheses to indicate
    number of each opcode dropped.

  - Dropping packets unconditionally
    /sys/kernel/debug/hfi1/hfi1_X/fault_packet
	# Enable fault
	echo Y > fault_by_packet
    /sys/kernel/debug/hfi1/hfi1_X/fault_packet/fault_stats
    contains the number of packets dropped.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-05 14:45:09 -04:00
Michael J. Ruhl f7b4263372 IB/hfi1: Ensure VL index is within bounds
Improve the safety of the code and ensure the array cannot be indexed
out of bounds when picking the CPU for a given SDMA engine.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-05 14:45:09 -04:00
Sebastian Sanchez 5f14e4e667 IB/rdmavt, IB/hfi1: Fix timer migration regressions
RC timeout counter isn't getting incremented.
Increment counter and add the trace for it.

Fixes: 87c23b4ab018 ("IB/rdmavt: Adding timer logic to rdmavt")
Reviewed-by: Brian Welty <brian.welty@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-05 14:45:09 -04:00
Michael J. Ruhl 5e6e94244b IB/hfi1: Add a patch value to the firmware version string
The HFI firmware now includes a patch level in its version.
Updating the necessary code to include the patch version in the
firmware string.

Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-05 14:45:09 -04:00
Easwar Hariharan fb897ad315 IB/hfi1: Check for QSFP presence before attempting reads
Attempting to read the status of a QSFP cable creates noise in the logs
and misses out on setting an appropriate Offline/Disabled Reason if the
cable is not plugged in. Check for this prior to attempting the read and
attendant retries.

Fixes: 673b975f1f ("IB/hfi1: Add QSFP sanity pre-check")
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-05 14:45:09 -04:00
Tadeusz Struk 62eed66e98 IB/hfi1: Protect the global dev_cntr_names and port_cntr_names
Protect the global dev_cntr_names and port_cntr_names with the global
mutex as they are allocated and freed in a function called per device.
Otherwise there is a danger of double free and memory leaks.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-05 14:45:09 -04:00
Tadeusz Struk 5d6f08afdd IB/hfi1: Check device id early during init
If there is a wrong device passed to the driver it should fail early,
without trying to initialize the device only to find out that it has
an invalid device later during the init.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-05 14:45:09 -04:00
Mike Marciniszyn 9260b3541f IB/rdmavt: Add swqe completion trace
The following fields are available for filter/trace:
- wqe
- wr_id
- qpn
- qpt
- length
- idx
- ssn
- (wr)opcode
- (wr)send_flags

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-05 14:45:09 -04:00
Mike Marciniszyn 43a474aadb IB/rdmavt, IB/hfi1, IB/qib: Make wc opcode translation driver dependent
The work to create a completion helper moved the translation of send
wqe operations to completion opcodes to rdmvat.

This precludes having driver dependent operations.  Make the translation
driver dependent by doing the translation in the driver prior to the
rvt_qp_swqe_complete() call using restored translation tables.

Fixes: Commit f2dc9cdce8 ("IB/rdmavt: Add a send completion helper")
Fixes: Commit 0771da5a6e ("IB/hfi1,IB/qib: Use new send completion helper")
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-05 14:45:09 -04:00
Sebastian Sanchez 5a52a7acf7 IB/hfi1: NULL pointer dereference when freeing rhashtable
A NULL pointer dereference occurs when the driver
is unloaded, and the SDMA rhashtable is freed if
the rhashtable_init() function has not been called.
Prevent this by changing sdma_rht to be a pointer
to a dynamically allocated hash table. The NULL-ness
of the pointer serves as an indication that the hash
table was initialized and that it needs to be
destroyed.

Fixes: 0cb2aa690c ("IB/hfi1: Add sysfs interface for affinity setup")
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-05 14:45:09 -04:00
Michael J. Ruhl 8688426ba6 IB/hfi1: Cache registers during state change
When the LCB is going offline, inopportune port queries can cause
benign error messages to be logged.  To deal with this, cache the
registers just before setting the LCB to offline, allowing queries to
return without eliciting the error.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-05 14:45:09 -04:00
Michael J. Ruhl 0519c520dc IB/hfi1: Race hazard avoidance in user SDMA driver
Set the errcode before the state and add the smb_wmb() to avoid a
potential race condition with the user.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Michael Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-05 14:45:09 -04:00
Dean Luick ec8a142327 IB/hfi1: Force logical link down
If the logical link state does not read as down when
the physical link state is offline, force it to down.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Jakub Byczkowski <jakub.byczkowski@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-05 14:45:09 -04:00
Mark Brown cd6ce4a573 IB/hns: Explicitly include linux/of.h
hns_roce_hw_v1.c uses DT interfaces but relies on implict inclusion of
linux/of.h which means that changes in other headers could break the
build, as happened in -next for arm64 today.  Add an explicit include.

Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-05 13:45:53 -04:00
Florian Westphal 282ccf6efb drivers: add explicit interrupt.h includes
These files all use functions declared in interrupt.h, but currently rely
on implicit inclusion of this file (via netns/xfrm.h).

That won't work anymore when the flow cache is removed so include that
header where needed.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-30 11:05:34 -07:00
Greg Kroah-Hartman 57c0eabbd5 Merge 4.11-rc4 into char-misc-next
We want the char-misc fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-27 09:13:04 +02:00
Arnd Bergmann f6aafac184 IB/qib: fix false-postive maybe-uninitialized warning
aarch64-linux-gcc-7 complains about code it doesn't fully understand:

drivers/infiniband/hw/qib/qib_iba7322.c: In function 'qib_7322_txchk_change':
include/asm-generic/bitops/non-atomic.h:105:35: error: 'shadow' may be used uninitialized in this function [-Werror=maybe-uninitialized]

The code is right, and despite trying hard, I could not come up with a version
that I liked better than just adding a fake initialization here to shut up the
warning.

Fixes: f931551baf ("IB/qib: Add new qib driver for QLogic PCIe InfiniBand adapters")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-03-24 22:44:29 -04:00
Dan Carpenter 004d18ea99 RDMA/ocrdma: fix a type issue in ocrdma_put_pd_num()
We want to return zero on success or negative error codes.  The type
should be int and not u8.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-03-24 21:11:15 -04:00
Aditya Sarwade b172679b0d RDMA/vmw_pvrdma: Activate device on ethernet link up
Restore device state when ethernet link changes to active.

Acked-by: George Zhang <georgezhang@vmware.com>
Acked-by: Jorgen Hansen <jhansen@vmware.com>
Acked-by: Bryan Tan <bryantan@vmware.com>
Signed-off-by: Aditya Sarwade <asarwade@vmware.com>
Signed-off-by: Adit Ranadive <aditr@vmware.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-03-24 20:49:53 -04:00
Adit Ranadive e51c2fb033 RDMA/vmw_pvrdma: Dont hardcode QP header page
Moved the header page count to a macro.

Reported-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Adit Ranadive <aditr@vmware.com>
Reviewed-by: Aditya Sarwade <asarwade@vmware.com>
Tested-by: Andrew Boyer <andrew.boyer@dell.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-03-24 20:49:53 -04:00
Adit Ranadive 6332dee83d RDMA/vmw_pvrdma: Cleanup unused variables
Removed the unused nreq and redundant index variables.
Moved hardcoded async and cq ring pages number to macro.

Reported-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Adit Ranadive <aditr@vmware.com>
Reviewed-by: Aditya Sarwade <asarwade@vmware.com>
Tested-by: Andrew Boyer <andrew.boyer@dell.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-03-24 20:49:53 -04:00
Shiraz Saleem 871a8623d3 i40iw: Receive netdev events post INET_NOTIFIER state
Netdev notification events are de-registered only when all
client iwdev instances are removed. If a single client is closed
and re-opened, netdev events could arrive even before the Control
Queue-Pair (CQP) is created, causing a NULL pointer dereference crash
in i40iw_get_cqp_request. Fix this by allowing netdev event
notification only after we have reached the INET_NOTIFIER state with
respect to device initialization.

Reported-by: Stefan Assmann <sassmann@redhat.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-03-24 16:23:29 -04:00
Logan Gunthorpe 985087157c infiniband: utilize the new cdev_set_parent function
This replaces the suspect looking cdev.kobj.parent lines with the
equivalent cdev_set_parent function. This is a straightforward change
that's largely cosmetic but it does push the kobj.parent ownership
into char_dev.c where it belongs.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-21 06:44:33 +01:00
Mintz, Yuval be086e7c53 qed*: Utilize Firmware 8.15.3.0
This patch advances the qed* drivers into using the newer firmware -
This solves several firmware bugs, mostly related [but not limited to]
various init/deinit issues in various offloaded protocols.

It also introduces a major 4-Cached SGE change in firmware, which can be
seen in the storage drivers' changes.

In addition, this firmware is required for supporting the new QL41xxx
series of adapters; While this patch doesn't add the actual support,
the firmware contains the necessary initialization & firmware logic to
operate such adapters [actual support would be added later on].

Changes from Previous versions:
-------------------------------
 - V2 - fix kbuild-test robot warnings

Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com>
Signed-off-by: Manish Rangankar <Manish.Rangankar@cavium.com>
Signed-off-by: Chad Dupuis <Chad.Dupuis@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-13 15:33:09 -07:00
Ingo Molnar 0881e7bd34 sched/headers: Prepare to move the get_task_struct()/put_task_struct() and related APIs from <linux/sched.h> to <linux/sched/task.h>
But first update usage sites with the new header dependency.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:40 +01:00
Ingo Molnar 589ee62844 sched/headers: Prepare to remove the <linux/mm_types.h> dependency from <linux/sched.h>
Update code that relied on sched.h including various MM types for them.

This will allow us to remove the <linux/mm_types.h> include from <linux/sched.h>.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:37 +01:00
Ingo Molnar 3f07c01441 sched/headers: Prepare for new header dependencies before moving code to <linux/sched/signal.h>
We are going to split <linux/sched/signal.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder <linux/sched/signal.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:29 +01:00
Ingo Molnar 6e84f31522 sched/headers: Prepare for new header dependencies before moving code to <linux/sched/mm.h>
We are going to split <linux/sched/mm.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder <linux/sched/mm.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.

The APIs that are going to be moved first are:

   mm_alloc()
   __mmdrop()
   mmdrop()
   mmdrop_async_fn()
   mmdrop_async()
   mmget_not_zero()
   mmput()
   mmput_async()
   get_task_mm()
   mm_access()
   mm_release()

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:28 +01:00
Ingo Molnar 0c98d344fe sched/core: Remove the tsk_cpus_allowed() wrapper
So the original intention of tsk_cpus_allowed() was to 'future-proof'
the field - but it's pretty ineffectual at that, because half of
the code uses ->cpus_allowed directly ...

Also, the wrapper makes the code longer than the original expression!

So just get rid of it. This also shrinks <linux/sched.h> a bit.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:24 +01:00
Vegard Nossum f1f1007644 mm: add new mmgrab() helper
Apart from adding the helper function itself, the rest of the kernel is
converted mechanically using:

  git grep -l 'atomic_inc.*mm_count' | xargs sed -i 's/atomic_inc(&\(.*\)->mm_count);/mmgrab\(\1\);/'
  git grep -l 'atomic_inc.*mm_count' | xargs sed -i 's/atomic_inc(&\(.*\)\.mm_count);/mmgrab\(\&\1\);/'

This is needed for a later patch that hooks into the helper, but might
be a worthwhile cleanup on its own.

(Michal Hocko provided most of the kerneldoc comment.)

Link: http://lkml.kernel.org/r/20161218123229.22952-1-vegard.nossum@oracle.com
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27 18:43:48 -08:00
Masahiro Yamada b8a14f3379 scripts/spelling.txt: add "overwriten" pattern and fix typo instances
Fix typos and add the following to the scripts/spelling.txt:

  overwrien||overwritten

Link: http://lkml.kernel.org/r/1481573103-11329-30-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27 18:43:47 -08:00
Linus Torvalds ac1820fb28 This is a tree wide change and has been kept separate for that reason.
Bart Van Assche noted that the ib DMA mapping code was significantly
 similar enough to the core DMA mapping code that with a few changes
 it was possible to remove the IB DMA mapping code entirely and
 switch the RDMA stack to use the core DMA mapping code.  This resulted
 in a nice set of cleanups, but touched the entire tree.  This branch
 will be submitted separately to Linus at the end of the merge window
 as per normal practice for tree wide changes like this.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJYo06oAAoJELgmozMOVy/d9Z8QALedWHdu98St1L0u2c8sxnR9
 2zo/4sF5Vb9u7FpmdIX32L4SQ9s9KhPE8Qp8NtZLf9v10zlDebIRJDpXknXtKooV
 CAXxX4sxBXV27/UrhbZEfXiPrmm6ccJFyIfRnMU6NlMqh2AtAsRa5AC2/RMp8oUD
 Med97PFiF0o6TD22/UH1VFbRpX1zjaKyqm7a3as5sJfzNA+UGIZAQ7Euz8000DKZ
 xCgVLTEwS0FmOujtBkCst7xa9TjuqR1HLOB4DdGvAhP6BHdz2yamM7Qmh9NN+NEX
 0BtjsuXomtn6j6AszGC+bpipCZh3NUigcwoFAARXCYFHibBvo4DPdFeGsraFgXdy
 1+KyR8CCeQG3Aly5Vwr264RFPGkGpwMj8PsBlXgQVtrlg4rriaCzOJNmIIbfdADw
 ftqhxBOzReZw77aH2s+9p2ILRfcAmPqhynLvFGFo9LBvsik8LVso7YgZN0xGxwcI
 IjI/XGC8UskPVsIZBIYA6sl2bYzgOjtBIHiXjRrPlW3uhduIXLrvKFfLPP/5XLAG
 ehLXK+J0bfsyY9ClmlNS8oH/WdLhXAyy/KNmnj5bRRm9qg6BRJR3bsOBhZJODuoC
 XgEXFfF6/7roNESWxowff7pK0rTkRg/m/Pa4VQpeO+6NWHE7kgZhL6kyIp5nKcwS
 3e7mgpcwC+3XfA/6vU3F
 =e0Si
 -----END PGP SIGNATURE-----

Merge tag 'for-next-dma_ops' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma

Pull rdma DMA mapping updates from Doug Ledford:
 "Drop IB DMA mapping code and use core DMA code instead.

  Bart Van Assche noted that the ib DMA mapping code was significantly
  similar enough to the core DMA mapping code that with a few changes it
  was possible to remove the IB DMA mapping code entirely and switch the
  RDMA stack to use the core DMA mapping code.

  This resulted in a nice set of cleanups, but touched the entire tree
  and has been kept separate for that reason."

* tag 'for-next-dma_ops' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (37 commits)
  IB/rxe, IB/rdmavt: Use dma_virt_ops instead of duplicating it
  IB/core: Remove ib_device.dma_device
  nvme-rdma: Switch from dma_device to dev.parent
  RDS: net: Switch from dma_device to dev.parent
  IB/srpt: Modify a debug statement
  IB/srp: Switch from dma_device to dev.parent
  IB/iser: Switch from dma_device to dev.parent
  IB/IPoIB: Switch from dma_device to dev.parent
  IB/rxe: Switch from dma_device to dev.parent
  IB/vmw_pvrdma: Switch from dma_device to dev.parent
  IB/usnic: Switch from dma_device to dev.parent
  IB/qib: Switch from dma_device to dev.parent
  IB/qedr: Switch from dma_device to dev.parent
  IB/ocrdma: Switch from dma_device to dev.parent
  IB/nes: Remove a superfluous assignment statement
  IB/mthca: Switch from dma_device to dev.parent
  IB/mlx5: Switch from dma_device to dev.parent
  IB/mlx4: Switch from dma_device to dev.parent
  IB/i40iw: Remove a superfluous assignment statement
  IB/hns: Switch from dma_device to dev.parent
  ...
2017-02-25 13:45:43 -08:00
Dave Jiang 11bac80004 mm, fs: reduce fault, page_mkwrite, and pfn_mkwrite to take only vmf
->fault(), ->page_mkwrite(), and ->pfn_mkwrite() calls do not need to
take a vma and vmf parameter when the vma already resides in vmf.

Remove the vma parameter to simplify things.

[arnd@arndb.de: fix ARM build]
  Link: http://lkml.kernel.org/r/20170125223558.1451224-1-arnd@arndb.de
Link: http://lkml.kernel.org/r/148521301778.19116.10840599906674778980.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@suse.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-24 17:46:54 -08:00
Linus Torvalds af17fe7a63 Mellanox specific updates for 4.11 merge window
Because the Mellanox code required being based on a net-next tree,
 I keept it separate from the remainder of the RDMA stack submission
 that is based on 4.10-rc3.
 
 This branch contains:
 
 - Various mlx4 and mlx5 fixes and minor changes
 - Support for adding a tag match rule to flow specs
 - Support for cvlan offload operation for raw ethernet QPs
 - A change to the core IB code to recognize raw eth capabilities and
   enumerate them (touches non-Mellanox code)
 - Implicit On-Demand Paging memory registration support
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJYrx+WAAoJELgmozMOVy/du70P/1kpW2xY9Le04c3K7na2XOYl
 AUVIDrW/8Go63tpOaM7jBT3k4GlwVFr3IOmBpS24KbW/THxjhyUeP5L5+z2x+go+
 jkQOgtPWWEHr5zP3MzsNyB8fDx1YQOnJwEXxybQRW/cbw4CLjnhP+ezd6FdV/3Yy
 pPEqDVlAErzvNweG+n2r1pjcUbR8uneC3inyMLnyzUBz4CHKmC8fgD3/qJIM+DNb
 gtFT5xHFIXKCigWdQ/EwsTDcHub43V8OXlI5sO7loG6vToOUATMkjI4oOUNhDmYS
 X7XLN3yRK9QHEfb5kutXIZEWzTGh7LiFtUYGaNNYqqzDfSiMRc9NC5kTOfplEXDV
 Uo+AGb6Fh1zYIOzNk7o+tazIv3LaLv6+Fcm+9bbe0VUIqasaylsePqaTwMuIzx/I
 xP5nitmd5lbYo8WdlasVdG6mH1DlJEUbU30v4DpmTpxCP6jGpog7lexyGyF3TgzS
 NhnG0IiIClWh3WQ2/GdsFK/obIdFkpLeASli1hwD81vzPfly9zc2YpgqydZI3WCr
 q6hTXYnANcP6+eciCpQPO7giRdXdiKey08Uoq/2jxb7Qbm4daG6UwopjvH9/lm1F
 m6UDaDvzNYm+Rx+bL/+KSx9JO9+fJB1L51yCmvLGpWi6yJI4ZTfanHNMBsCua46N
 Kev/DSpIAzX1WOBkte+a
 =rspQ
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma

Pull Mellanox rdma updates from Doug Ledford:
 "Mellanox specific updates for 4.11 merge window

  Because the Mellanox code required being based on a net-next tree, I
  keept it separate from the remainder of the RDMA stack submission that
  is based on 4.10-rc3.

  This branch contains:

   - Various mlx4 and mlx5 fixes and minor changes

   - Support for adding a tag match rule to flow specs

   - Support for cvlan offload operation for raw ethernet QPs

   - A change to the core IB code to recognize raw eth capabilities and
     enumerate them (touches non-Mellanox code)

   - Implicit On-Demand Paging memory registration support"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (40 commits)
  IB/mlx5: Fix configuration of port capabilities
  IB/mlx4: Take source GID by index from HW GID table
  IB/mlx5: Fix blue flame buffer size calculation
  IB/mlx4: Remove unused variable from function declaration
  IB: Query ports via the core instead of direct into the driver
  IB: Add protocol for USNIC
  IB/mlx4: Support raw packet protocol
  IB/mlx5: Support raw packet protocol
  IB/core: Add raw packet protocol
  IB/mlx5: Add implicit MR support
  IB/mlx5: Expose MR cache for mlx5_ib
  IB/mlx5: Add null_mkey access
  IB/umem: Indicate that process is being terminated
  IB/umem: Update on demand page (ODP) support
  IB/core: Add implicit MR flag
  IB/mlx5: Support creation of a WQ with scatter FCS offload
  IB/mlx5: Enable QP creation with cvlan offload
  IB/mlx5: Enable WQ creation and modification with cvlan offload
  IB/mlx5: Expose vlan offloads capabilities
  IB/uverbs: Enable QP creation with cvlan offload
  ...
2017-02-23 11:27:49 -08:00
Linus Torvalds 4cc4b9323f First set of updates for 4.11 kernel merge window
- Add new Broadcom bnxt_re RoCE driver
 - rxe driver updates
 - ioctl cleanups
 - ETH_P_IBOE declaration cleanup
 - IPoIB changes
 - Add port state cache
 - Allow srpt driver to accept guids as port names in config
 - Update to hfi1 driver
 - Update to srp driver
 - Lots of misc. minor changes all over
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJYrfewAAoJELgmozMOVy/dFnEP/2Qe7NqXRqxLS0ZqsQseFHgQ
 jd236E7R/XtQQTE3PTcrWL0mq0DRF6tMEjfhUASKTbZVfCBTniJAoXYrvWhN/STq
 LxAdigdV/0SPbxO3r9B1Xvk2v5BySaIBkaUDvcEXzT4e7UVQwZgxDkhhsYeY0Z/r
 9bNB5760PzW8uO5cctXccNcWztZnW0IUZuAHVfQCPjZ7svoGwLnNDW6YQx+FsEkW
 tbPdzMXX8VKHlC5RcKbfOOBjdNyrUpWl+uvWEc/7mazKscp4yKVFZL7PcxqPJSfd
 aKdfqXYawhjZZpyws8Kn0rhkfT7xWKD/y9G5STykRJPj9/n1BDScFkmyDQhtP5bJ
 GANzdgH0z7Dt9LkcAs86A8EVBbIdbdT2cpPVu7t0uWEIsJw/O5ThKpgjnrrTm6m+
 89tgqLZooifTEsdj4UkZoyktrD3J9LSNZkgVmWtRn01W3oYFOPbdM4TmBZtg+/Yl
 VGmOJEHMEsNuJBcJcOuRJ1MVz2LebXmPUcB0RXzgmHHgulZ/DqoOtlpg5JNmJcr5
 wpw/yppkBop4V4+etJBlzDsZNmZZlX+AY0ZLqQJsDHNszDjwXgAy5Rn5FYIdMyk4
 ff0FKb5dzASSxHRDxAsu2uoGaREM0NkpA0UYiIZbepGLSO8PuFG2ScQ6qzU47vqu
 9SEzOaaQY2S2uqFFFnYp
 =ugNm
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma

Pull rdma updates from Doug Ledford:
 "First set of updates for 4.11 kernel merge window

   - Add new Broadcom bnxt_re RoCE driver
   - rxe driver updates
   - ioctl cleanups
   - ETH_P_IBOE declaration cleanup
   - IPoIB changes
   - Add port state cache
   - Allow srpt driver to accept guids as port names in config
   - Update to hfi1 driver
   - Update to srp driver
   - Lots of misc minor changes all over"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (114 commits)
  RDMA/bnxt_re: fix for "bnxt_en: Update to firmware interface spec 1.7.0."
  rdma_cm: fail iwarp accepts w/o connection params
  IB/srp: Drain the send queue before destroying a QP
  IB/core: Add support for draining IB_POLL_DIRECT completion queues
  IB/srp: Improve an error path
  IB/srp: Make a diagnostic message more informative
  IB/srp: Document locking conventions
  IB/srp: Fix race conditions related to task management
  IB/srp: Avoid that duplicate responses trigger a kernel bug
  IB/SRP: Avoid using IB_MR_TYPE_SG_GAPS
  RDMA/qedr: Fix some error handling
  RDMA/bnxt_re: add DCB dependency
  IB/hns: include linux/module.h
  IB/vmw_pvrdma: Expose vendor error to ULPs
  vmw_pvrdma: switch to pci_alloc_irq_vectors
  IB/hfi1: use size_t for passing array length
  IB/ipoib: Remove redudant label
  IB/ipoib: remove the unnecessary memory free
  IB/mthca: switch to pci_alloc_irq_vectors
  IB/hfi1: Code reuse with memdup_copy
  ...
2017-02-23 08:27:57 -08:00
Stephen Rothwell db690328a7 RDMA/bnxt_re: fix for "bnxt_en: Update to firmware interface spec 1.7.0."
When the firmware interface spec was updated, a constant element was
renamed.  The rename missed the instances in the bnxt_re driver
because it wasn't upstream yet.  This updates the bnxt_re driver
with the rename.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-22 15:40:17 -05:00
Linus Torvalds 3051bf36c2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
 "Highlights:

   1) Support TX_RING in AF_PACKET TPACKET_V3 mode, from Sowmini
      Varadhan.

   2) Simplify classifier state on sk_buff in order to shrink it a bit.
      From Willem de Bruijn.

   3) Introduce SIPHASH and it's usage for secure sequence numbers and
      syncookies. From Jason A. Donenfeld.

   4) Reduce CPU usage for ICMP replies we are going to limit or
      suppress, from Jesper Dangaard Brouer.

   5) Introduce Shared Memory Communications socket layer, from Ursula
      Braun.

   6) Add RACK loss detection and allow it to actually trigger fast
      recovery instead of just assisting after other algorithms have
      triggered it. From Yuchung Cheng.

   7) Add xmit_more and BQL support to mvneta driver, from Simon Guinot.

   8) skb_cow_data avoidance in esp4 and esp6, from Steffen Klassert.

   9) Export MPLS packet stats via netlink, from Robert Shearman.

  10) Significantly improve inet port bind conflict handling, especially
      when an application is restarted and changes it's setting of
      reuseport. From Josef Bacik.

  11) Implement TX batching in vhost_net, from Jason Wang.

  12) Extend the dummy device so that VF (virtual function) features,
      such as configuration, can be more easily tested. From Phil
      Sutter.

  13) Avoid two atomic ops per page on x86 in bnx2x driver, from Eric
      Dumazet.

  14) Add new bpf MAP, implementing a longest prefix match trie. From
      Daniel Mack.

  15) Packet sample offloading support in mlxsw driver, from Yotam Gigi.

  16) Add new aquantia driver, from David VomLehn.

  17) Add bpf tracepoints, from Daniel Borkmann.

  18) Add support for port mirroring to b53 and bcm_sf2 drivers, from
      Florian Fainelli.

  19) Remove custom busy polling in many drivers, it is done in the core
      networking since 4.5 times. From Eric Dumazet.

  20) Support XDP adjust_head in virtio_net, from John Fastabend.

  21) Fix several major holes in neighbour entry confirmation, from
      Julian Anastasov.

  22) Add XDP support to bnxt_en driver, from Michael Chan.

  23) VXLAN offloads for enic driver, from Govindarajulu Varadarajan.

  24) Add IPVTAP driver (IP-VLAN based tap driver) from Sainath Grandhi.

  25) Support GRO in IPSEC protocols, from Steffen Klassert"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1764 commits)
  Revert "ath10k: Search SMBIOS for OEM board file extension"
  net: socket: fix recvmmsg not returning error from sock_error
  bnxt_en: use eth_hw_addr_random()
  bpf: fix unlocking of jited image when module ronx not set
  arch: add ARCH_HAS_SET_MEMORY config
  net: napi_watchdog() can use napi_schedule_irqoff()
  tcp: Revert "tcp: tcp_probe: use spin_lock_bh()"
  net/hsr: use eth_hw_addr_random()
  net: mvpp2: enable building on 64-bit platforms
  net: mvpp2: switch to build_skb() in the RX path
  net: mvpp2: simplify MVPP2_PRS_RI_* definitions
  net: mvpp2: fix indentation of MVPP2_EXT_GLOBAL_CTRL_DEFAULT
  net: mvpp2: remove unused register definitions
  net: mvpp2: simplify mvpp2_bm_bufs_add()
  net: mvpp2: drop useless fields in mvpp2_bm_pool and related code
  net: mvpp2: remove unused 'tx_skb' field of 'struct mvpp2_tx_queue'
  net: mvpp2: release reference to txq_cpu[] entry after unmapping
  net: mvpp2: handle too large value in mvpp2_rx_time_coal_set()
  net: mvpp2: handle too large value handling in mvpp2_rx_pkts_coal_set()
  net: mvpp2: remove useless arguments in mvpp2_rx_{pkts, time}_coal_set
  ...
2017-02-22 10:15:09 -08:00
Linus Torvalds 42e1b14b6e Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
 "The main changes in this cycle were:

   - Implement wraparound-safe refcount_t and kref_t types based on
     generic atomic primitives (Peter Zijlstra)

   - Improve and fix the ww_mutex code (Nicolai Hähnle)

   - Add self-tests to the ww_mutex code (Chris Wilson)

   - Optimize percpu-rwsems with the 'rcuwait' mechanism (Davidlohr
     Bueso)

   - Micro-optimize the current-task logic all around the core kernel
     (Davidlohr Bueso)

   - Tidy up after recent optimizations: remove stale code and APIs,
     clean up the code (Waiman Long)

   - ... plus misc fixes, updates and cleanups"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (50 commits)
  fork: Fix task_struct alignment
  locking/spinlock/debug: Remove spinlock lockup detection code
  lockdep: Fix incorrect condition to print bug msgs for MAX_LOCKDEP_CHAIN_HLOCKS
  lkdtm: Convert to refcount_t testing
  kref: Implement 'struct kref' using refcount_t
  refcount_t: Introduce a special purpose refcount type
  sched/wake_q: Clarify queue reinit comment
  sched/wait, rcuwait: Fix typo in comment
  locking/mutex: Fix lockdep_assert_held() fail
  locking/rtmutex: Flip unlikely() branch to likely() in __rt_mutex_slowlock()
  locking/rwsem: Reinit wake_q after use
  locking/rwsem: Remove unnecessary atomic_long_t casts
  jump_labels: Move header guard #endif down where it belongs
  locking/atomic, kref: Implement kref_put_lock()
  locking/ww_mutex: Turn off __must_check for now
  locking/atomic, kref: Avoid more abuse
  locking/atomic, kref: Use kref_get_unless_zero() more
  locking/atomic, kref: Kill kref_sub()
  locking/atomic, kref: Add kref_read()
  locking/atomic, kref: Add KREF_INIT()
  ...
2017-02-20 13:23:30 -08:00
Christophe Jaillet 4cd33aafe4 RDMA/qedr: Fix some error handling
'qedr_alloc_pbl_tbl()' can not return NULL.

In qedr_init_user_queue():
 - simplify the test for the return value, no need to test for NULL
 - propagate the error pointer if needed, otherwise 0 (success) is returned.
   This is spurious.

In init_mr_info():
 - test the return value with IS_ERR
 - propagate the error pointer if needed instead of an exlictit -ENOMEM.
   This is a no-op as the only error pointer that we can have here is
   already -ENOMEM

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Ram Amrani <Ram.Amrani@cavium.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19 09:27:29 -05:00
Arnd Bergmann e57f774db1 RDMA/bnxt_re: add DCB dependency
When CONFIG_DCB is disabled, we get a link error:

drivers/infiniband/built-in.o: In function `bnxt_re_setup_qos':
trace.c:(.text+0x155774): undefined reference to `dcb_ieee_getapp_mask'
trace.c:(.text+0x155774): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `dcb_ieee_getapp_mask'
trace.c:(.text+0x155794): undefined reference to `dcb_ieee_getapp_mask'
trace.c:(.text+0x155794): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `dcb_ieee_getapp_mask'

Like the other drivers that use this function, a Kconfig dependency is
the correct fix.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19 09:27:29 -05:00
Arnd Bergmann 3ecc16c82c IB/hns: include linux/module.h
I ran into a build error on arm64 randconfig testing:

infiniband/hw/hns/hns_roce_main.c:539:1: error: data definition has no type or storage class [-Werror]
infiniband/hw/hns/hns_roce_main.c:539:1: error: type defaults to 'int' in declaration of 'MODULE_DEVICE_TABLE' [-Werror=implicit-int]
infiniband/hw/hns/hns_roce_main.c:539:1: error: parameter names (without types) in function declaration [-Werror]
infiniband/hw/hns/hns_roce_main.c:979:226: error: data definition has no type or storage class [-Werror]
infiniband/hw/hns/hns_roce_main.c:979:226: error: type defaults to 'int' in declaration of 'module_init' [-Werror=implicit-int]
infiniband/hw/hns/hns_roce_main.c:979:1: error: parameter names (without types) in function declaration [-Werror]

Including the module.h makes it build again.

Fixes: 9a4435375c ("IB/hns: Add driver files for hns RoCE driver")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19 09:27:28 -05:00
Yuval Shaia c67294b70b IB/vmw_pvrdma: Expose vendor error to ULPs
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Acked-by: Adit Ranadive <aditr@vmware.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19 09:27:28 -05:00
Christoph Hellwig 7bf3976d6c vmw_pvrdma: switch to pci_alloc_irq_vectors
.. and greatly clean up the irq handling boilerplate code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Adit Ranadive <aditr@vmware.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19 09:27:20 -05:00
Arnd Bergmann 64b2ae74e8 IB/hfi1: use size_t for passing array length
gcc-7 produces a mysterious warning about the size argument being potentially out
of range:

drivers/infiniband/hw/hfi1/verbs.c: In function 'init_cntr_names':
drivers/infiniband/hw/hfi1/verbs.c:1644:2: error: 'memcpy': specified size between 18446744071562067968 and 18446744073709551615 exceeds maximum object size 9223372036854775807 [-Werror=stringop-overflow=]

This seems to refer to a the case where an 64-bit size_t gets truncated
into a negative 'int' and subsequently turned into a high 64-bit number
again.

The fix is clearly to use size_t here, which matches the type that gets
used for this value elsewhere.

Fixes: b7481944b0 ("IB/hfi1: Show statistics counters under IB stats interface")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19 09:18:47 -05:00
Christoph Hellwig f50cccdd03 IB/mthca: switch to pci_alloc_irq_vectors
Trivial switch to the new API for this driver.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19 09:18:45 -05:00
Michael J. Ruhl 1bb0d7b781 IB/hfi1: Code reuse with memdup_copy
Update several usages of kmalloc/user_copy to memdup_copy and
memdup_copy_nul.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19 09:18:45 -05:00
Don Hiatt 832666c163 IB/hfi1, qib, rdmavt: Move AETH defines to rdma/ib_hdrs.h
Rename RVT AETH defines and export in rdma/ib_hdrs.h

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19 09:18:44 -05:00
Michael J. Ruhl db069ecb5d IB/hfi1: Do not set physical link state if DC is in the shutdown state
If the DC is in shutdown state, the set link state function will return
an error.  Since this is not a failure in this state, make sure to
only call set link state if the DC is on.

Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19 09:18:43 -05:00
Jakub Byczkowski c27aad00d1 IB/hfi1: Modify logging frequency of DCC errors
Use rate-limit state to limit number of messages logged
to kernel message buffer for DCC errors. Add new macro
dd_dev_info_ratelimited for that propose. Replace all
dd_dev_info calls in handle_dcc_err function with
rate-limited version.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jakub Byczkowski <jakub.byczkowski@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19 09:18:43 -05:00
Mike Marciniszyn f9215b5e53 IB/rdmavt, IB/hfi1, IB/qib: Correct ack count for passive (RTR) QPs
The send complete for RC QPs mismanages the ack count when the
responder side is only in RTR.

A QP in that state cannot send requests, but it can be the target
for operations that elicit responses.

Adjust the RC completion logic to correct the count maintenance
by reflecting RECV_OK in a new state test.

Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19 09:18:42 -05:00
Brian Welty 3fc4a0906f IB/qib: Updates to use rdmavt's SGE helper routines
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Brian Welty <brian.welty@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19 09:18:42 -05:00
Brian Welty 1198fcea8a IB/hfi1, rdmavt: Move SGE state helper routines into rdmavt
To improve code reuse, add small SGE state helper routines to rdmavt_mr.h.
Leverage these in hfi1, including refactoring of hfi1_copy_sge.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Brian Welty <brian.welty@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19 09:18:41 -05:00
Brian Welty 0128fceaf9 IB/hfi1, rdmavt: Update copy_sge to use boolean arguments
Convert copy_sge and related SGE state functions to use boolean.
For determining if QP is in user mode, add helper function in rdmavt_qp.h.
This is used to determine if QP needs the last byte ordering.
While here, change rvt_pd.user to a boolean.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Brian Welty <brian.welty@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19 09:18:41 -05:00
Venkata Sandeep Dhanalakota b4238e7057 IB/qib: Use new rdmavt timers
Reduce qib code footprint by using the rdmavt timers.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Brian Welty <brian.welty@intel.com>
Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19 09:18:40 -05:00
Venkata Sandeep Dhanalakota 56acbbfb46 IB/hfi1: Use new rdmavt timers
Reduce hfi1 code footprint by using the rdmavt timers.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Brian Welty <brian.welty@intel.com>
Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19 09:18:39 -05:00
Brian Welty 696513e8cf IB/hfi1, qib, rdmavt: Move AETH credit functions into rdmavt
Add rvt_compute_aeth() and rvt_get_credit() as shared functions in
rdmavt, moved from hfi1/qib logic.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Brian Welty <brian.welty@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19 09:18:38 -05:00
Brian Welty beb5a04267 IB/hfi1, qib, rdmavt: Move two IB event functions into rdmavt
Add rvt_rc_error() and rvt_comm_est() as shared functions in
rdmavt, moved from hfi1/qib logic.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Brian Welty <brian.welty@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19 09:18:38 -05:00
Sebastian Sanchez c03c08d50b IB/hfi1: Check upper-case EFI variables
The EFI variable that provides board ID is named
by the PCI address of the device, which is published
in upper-case, while the HFI1 driver reads the EFI
variable in lower-case.
This prevents returning the correct board id when
queried through sysfs. Read EFI variables in
upper-case if the lower-case read fails.

Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19 09:18:37 -05:00
Sebastian Sanchez 76327627be IB/hfi1: Reduce oversized fields in struct hfi1_packet
Some fields in struct hfi1_packet are oversized.
Reduce them.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19 09:18:37 -05:00
Mike Marciniszyn d7c76e91aa IB/hfi1: Add additional fields to qp_stats
The r_psn and s_rnr_retry are missing.

Add with this patch.

Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19 09:18:36 -05:00
Sebastian Sanchez b448bf9a0d IB/hfi1: Allocate context data on memory node
There are some memory allocation calls in hfi1_create_ctxtdata()
that do not use the numa function parameter. This
can cause cache lines to be filled over QPI.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19 09:18:36 -05:00
Sebastian Sanchez f3e862cb68 IB/hfi1: Access hfi1_ibport through rcd pointer
Receive code paths use the QP's device and port
number to access the struct hfi1_ibport. When an
instance of struct hfi1_ctxtdata is present, it can
be used to access struct hfi1_ibport through a pointer.
This makes struct hfi1_ibport lookup time faster as an
array doesn't have to be indexed and access fields in
other cache-lines.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19 09:18:35 -05:00
Mike Marciniszyn a8715b97d6 IB/hfi1: Correct error calldown locking
The resource specific wait locking missed correcting the lock
for the notify_error_qp() calldown.

The code is fixed to correctly use the iowait lock field to protect
the head that is protected by that lock.

Fixes: Commit 4e045572e2 ("IB/hfi1: Add unique txwait_lock for txreq events")
Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19 09:18:34 -05:00
Easwar Hariharan 39e2afa8d0 IB/hfi1: Use static CTLE with Preset 6 for integrated HFIs
After extended testing, it was found that the previous PCIe Gen
3 recipe, which used adaptive CTLE with Preset 4, could cause an
NMI/Surprise Link Down in about 1 in 100 to 1 in 1000 power cycles on
some platforms. New EV data combined with extensive empirical data
indicates that the new recipe should use static CTLE with Preset 6 for
all integrated silicon SKUs.

Fixes: c3f8de0b33 ("IB/hfi1: Add static PCIe Gen3 CTLE tuning")
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19 09:18:34 -05:00
Mike Marciniszyn eb04ff09d8 IB/hfi1: Ensure read of producer s_head is correct
The read of s_head in the hfi1_make_rc_req() and
qib_make_rc_req() lack the necesary barrier instuctions.

Correct other ACCESS_ONCE() warnings in the same file.

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19 09:18:33 -05:00
Mike Marciniszyn a82a7fcd1f IB/hfi1: Process qp wait list in IRQ thread periodically
In the event that the IRQ thread is extremely busy, the
processing of an rcd wait list can be delayed by quite
a bit until the IRQ thread completes its work.

The QP reset reference count wait can then appear to be stuck, thus
causing up a QP destroy to emit the hung task diagnostic.

Fix by processing the qp wait list periodically from the thread.  The
interval is a multiple (currently 4) of the MAX_PKT_RECV.

Also, reduce some of the excessive inlining.   The guidelines
are per packet is ok inline, otherwise the choice is based on
likelyhood of execution.

Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19 09:18:32 -05:00
Mike Marciniszyn 4fcf1de5a7 IB/hfi1: Correct defered count after processing qp_wait_list
The qp_wait_list processing leaves the defered ack count
at its prior value.

This can result in a premature send of an ack.

Fixed by unconditionally reseting the defered ack count
in hfi1_send_rc_ack().

Fixes: Commit 7c091e5c06 ("staging/rdma/hfi1: add ACK coalescing logic")
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19 09:18:32 -05:00
Ganesh Goudar 192539f4ce iw_cxgb4: clean up send_connect()
Clean up send_connect() and make use of t6 specific
active open request struct.

Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Bharat Teja <bharat@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19 09:18:30 -05:00
Doug Ledford 6dd7abae71 Merge branch 'k.o/for-4.10-rc' into HEAD 2017-02-19 09:18:21 -05:00
Eli Cohen cdbe33d0f8 IB/mlx5: Fix configuration of port capabilities
When the "ib_virt" cap is set, configuration of port capabilities need
to be done through mlx5_core_modify_hca_vport_context.
Since modify_hca_vport_context accepts mask and value, there is no need
to read the port capabilities and calculate the new cap values so we
avoid the mutex when ib_virt is set.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-15 09:29:37 -05:00
Talat Batheesh a748d60df3 IB/mlx4: Take source GID by index from HW GID table
Previously, we used the HW GID index in order to search the source GID
in the software GID cached table. In some cases, for example when
the MAC Address of the network interface is changed, the GID cached table
saves the old-IPv6-link-local GID at the end of the table.

When returning the old MAC address, the software GID cached table tries
to add the new IPv6-link-local GID, and when it identifies that the GID
already exists, the software GID cached does not add it. Thus a mismatch
occurs between the HW and the SW GID tables.

It resulted with sending traffic with the wrong source GID.

This commit fixes the issue by taking both from the HW table.

The problem can be reproduced with the following scenario:
Client:
    # ifconfig ens6 2.2.2.5
    # ifconfig ens6 inet6 add 2001:0db8:0:f101::5/64
    # ifconfig ens6 hw ether f4:52:14:61:a0:71
    # ifconfig ens6 inet6 del 2001:0db8:0:f101::5/64
    # ifconfig ens6 inet6 add 2001:0db8:0:f101::5/64
    # ucmatose -f ipv6 -b 2001:0db8:0:f101::5 -s 2001:0db8:0:f101::6 -p 20156
Server:
    # ucmatose -f ipv6 -b 2001:0db8:0:f101::6 -p 20156

Fixes: 4c3eb3ca13 ('IB/mlx4: Add VLAN support for IBoE')
Signed-off-by: Talat Batheesh <talatb@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-14 11:44:42 -05:00
Eli Cohen d8030b0de0 IB/mlx5: Fix blue flame buffer size calculation
A blue flame register is comprised of two buffers of equal size.

Fixes: 5fe9dec0d0 ("IB/mlx5: Use blue flame register allocator in mlx5_ib")
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Noa Osherovich <noaos@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-14 11:44:42 -05:00