redonkable/alistair23-linux

Author	SHA1	Message	Date
Michal Kalderon	ae3488ff37	qed: Add ll2 connection for processing unaligned MPA packets This patch adds only the establishment and termination of the ll2 connection that handles unaligned MPA packets. Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 10:21:26 -07:00
Michal Kalderon	6f34a284f3	qed: Add LL2 slowpath handling For iWARP unaligned MPA flow, a slowpath event of flushing an MPA connection that entered an unaligned state is required. The flush ramrod is received on the ll2 queue, and a pre-registered callback function is called to handle the flush event. Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 10:21:26 -07:00
Michal Kalderon	89d6511309	qed: Add the source of a packet sent on an iWARP ll2 connection When a packet is sent back to iWARP FW via the tx ll2 connection the FW needs to know the source of the packet. Whether it is OOO or unaligned MPA related. Since OOO is implemented entirely inside the ll2 code (and shared with iSCSI), packets are marked as IN_ORDER inside the ll2 code. For unaligned mpa the value will be determined in the iWARP code and sent on the pkt->vlan field. Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 10:21:26 -07:00
Michal Kalderon	6df60fe703	qed: Fix initialization of ll2 offload feature enable_ip_cksum, enable_l4_cksum, calc_ip_len were added in the commit stated below but not passed through to FW. This was OK until now as they weren't used, but they are required for the iWARP unaligned flow. Fixes: 7c7973b2ae27 ("qed: LL2 to use packed information for tx") Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 10:21:26 -07:00
Michal Kalderon	77caa792f5	qed: Add ll2 option for dropping a tx packet The option of sending a packet on the ll2 and dropping it exists in hardware and was not used until now, thus not exposed. The iWARP unaligned MPA flow requires this functionality for flushing the tx queue. Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 10:21:26 -07:00
Michal Kalderon	ed468ebee0	qed: Add ll2 ability of opening a secondary queue When more than one ll2 queue is opened ( that is not an OOO queue ) ll2 code does not have enough information to determine whether the queue is the main one or not, so a new field is added to the acquire input data to expose the control of determining whether the queue is the main queue or a secondary queue. Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 10:21:26 -07:00
Michal Kalderon	f5823fe689	qed: Add ll2 option to limit the number of bds per packet iWARP uses 3 ll2 connections, and the maximum number of bds is known during connection setup. This patch modifies the static array in the ll2_tx_packet descriptor to be a flexible array and significantly reduces memory size. In addition, some redundant fields in the ll2_tx_packet were removed, which also contributed to decreasing the descriptor size. Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 10:21:26 -07:00
Yotam Gigi	593bc28ae2	mlxsw: spectrum_switchdev: Support bridge mrouter notifications Support the SWITCHDEV_ATTR_ID_BRIDGE_MROUTER port attribute switchdev notification. To do that, add the mrouter flag to struct mlxsw_sp_bridge_device, which indicates whether the bridge device was set to be mrouter port. This field is set when: - A new bridge is created, where the value is taken from the kernel bridge value. - A switchdev SWITCHDEV_ATTR_ID_BRIDGE_MROUTER notification is sent. In addition, change the bridge MID entries to include the router port when the bridge device is configured to be mrouter port. The MID entries are updated in the following cases: - When a new MID entry is created, update the router port according to the bridge mrouter state. - When a SWITCHDEV_ATTR_ID_BRIDGE_MROUTER notification is sent, update all the bridge's MID entries. This is aligned with the case where a bridge slave is configured to be mrouter port. Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Reviewed-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 10:18:11 -07:00
Yotam Gigi	c4db953f00	mlxsw: spectrum_switchdev: Add support for router port in SMID entries In Spectrum, MDB entries point to MID entries, that indicate which ports a packet should be forwarded to. Add the support in creating MID entries that forward the packet to the Spectrum router port. This will be later used to handle the bridge mrouter port switchdev notifications. Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Reviewed-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 10:18:11 -07:00
Yotam Gigi	b35750f191	mlxsw: spectrum: router: Export the mlxsw_sp_router_port function In Spectrum hardware, the router port is a virtual port that is the gateway to the routing mechanism. Hence, in order for a packet to be L3 forwarded, it must first be L2 forwarded to the router port inside the hardware. Further patches in this patchset are going to introduce support in bridge device used as an mrouter port. In this case, the router port index will be needed in order to update the MDB entries to include the router port. Thus, export the mlxsw_sp_router_port function, which returns the index of the Spectrum router port. Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Reviewed-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 10:18:11 -07:00
Emil Tantilov	b64666ae00	ixgbe: fix crash when injecting AER after failed reset In case where AER recovery fails the device is left in a down state. Consecutive AER error injection can lead to a double IRQ free. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-09 10:09:05 -07:00
Alexander Duyck	b4ded8327f	ixgbe: Update adaptive ITR algorithm The following change is meant to update the adaptive ITR algorithm to better support the needs of the network. Specifically with this change what I have done is make it so that our ITR algorithm will try to prevent either starving a socket buffer for memory in the case of Tx, or overrunning an Rx socket buffer on receive. In addition a side effect of the calculations used is that we should function better with new features such as XDP which can handle small packets at high rates without needing to lock us into NAPI polling mode. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-09 10:07:50 -07:00
Emil Tantilov	c3aec05dfe	ixgbe: fix the FWSM.PT check in ixgbe_mng_present() Bits other than FWSM.PT can be set in IXGBE_SWFW_MODE_MASK making the previous check invalid. Change the check for MNG present to be only based on FWSM.PT bit. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-09 10:05:19 -07:00
Emil Tantilov	dcfd6b839c	ixgbe: fix use of uninitialized padding This patch is resolving Coverity hits where padding in a structure could be used uninitialized. - Initialize fwd_cmd.pad/2 before ixgbe_calculate_checksum() - Initialize buffer.pad2/3 before ixgbe_hic_unlocked() Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-09 10:04:06 -07:00
Jesper Dangaard Brouer	86e2349422	ixgbe: add counter for times Rx pages gets allocated, not recycled The ixgbe driver has a page recycle scheme based around the RX-ring queue, where an RX page is shared between two packets. Based on the refcnt, the driver can determine if the RX-page is currently only used by a single packet; if so, it can directly refill/recycle the RX-slot with the opposite "side" of the page. While this is a clever trick, it is hard to determine when this recycling is successful and when it fails. Add a counter, available via ethtool --statistics as 'alloc_rx_page', which counts the number of times the recycle fails and the real page allocator is invoked. When interpreting the stats, do remember that every alloc will serve two packets. The counter is collected per rx_ring, but is summed and ethtool exported as 'alloc_rx_page'. It would be relevant to know which rx_ring cannot keep up, but that can be exported later if someone experiences a need for this. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-09 10:02:38 -07:00
Jakub Kicinski	2de1be1db2	nfp: bpf: pass dst register to ld_field instruction ld_field instruction is a bit special because the encoding uses two source registers and one of them becomes the output. We do need to pass the dst register to our encoding helpers though, otherwise the "write both banks" flag will not be observed. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 09:51:03 -07:00
Jakub Kicinski	2e85d3884f	nfp: bpf: byte swap the instructions Device expects the instructions in little endian. Make sure we byte swap on big endian hosts. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 09:51:03 -07:00
Jakub Kicinski	1c03e03f9b	nfp: bpf: pad code with valid nops We need to append up to 8 nops after last instruction to make sure the CPU will not fetch garbage instructions with invalid ECC if the code store was not initialized. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 09:51:03 -07:00
Jakub Kicinski	fd068ddc88	nfp: bpf: calculate code store ECC In the initial PoC firmware I simply disabled ECC on the instruction store. Do the ECC calculation for generated instructions in the driver. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 09:51:03 -07:00
Jakub Kicinski	18e53b6cb9	nfp: bpf: move to datapath ABI version 2 Datapath ABI version 2 stores the packet information in LMEM instead of NNRs. We also have strict restrictions on which GPRs we can use. Only GPRs 0-23 are reserved for BPF. Adjust the static register locations and "ABI" registers. Note that packet length is packed with other info so we have to extract it into one of the scratch registers, OTOH since LMEM can be used in restricted operands we don't have to extract packet pointer. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 09:51:03 -07:00
Jakub Kicinski	995e101ffa	nfp: bpf: encode extended LM pointer operands Most instructions have special fields which allow switching between base and extended Local Memory pointers. Introduce those to register encoding, we will use the extra LM pointers to access high addresses of the stack. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 09:51:03 -07:00
Jakub Kicinski	9f15d0f438	nfp: bpf: encode LMEM accesses NFP LMEM is a large, indirectly accessed register file. There are two basic indirect access registers. Each access operation may either use offset (up to 8 or 16 words) or perform post decrement/increment. Add encodings of LMEM indexes as instruction operands. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 09:51:03 -07:00
Jakub Kicinski	8afd9c961e	nfp: add more white space to the instruction defines We need to add longer OP_* defines, move the values away. Purely whitespace commit. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 09:51:02 -07:00
Jakub Kicinski	509144e250	nfp: bpf: remove packet marking support Temporarily drop support for skb->mark. We are primarily focusing on XDP offload, and implementing skb->mark on the new datapath has lower priority. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 09:51:02 -07:00
Jakub Kicinski	226e0e94ce	nfp: bpf: remove register rename Remove the register renumbering optimization. To implement calling map and other helpers we need more strict register layout. We can't freely reassign register numbers. This will have the effect of running in 4 context/thread mode, which should be OK since we are moving towards integrating the BPF closer with FW app datapath anyway, and the target datapath itself runs in 4 context mode. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 09:51:02 -07:00
Jakub Kicinski	3cae131933	nfp: bpf: encode all 64bit shifts Add encodings of all 64bit shift operations. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 09:51:02 -07:00
Jakub Kicinski	2a15bb1aba	nfp: bpf: move software reg helpers and cmd table out of translator Move the software reg helpers and some static data to nfp_asm.c. They are related to the previous patch, but move is done in a separate commit for ease of review. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 09:51:02 -07:00
Jakub Kicinski	b3f868df3c	nfp: bpf: use the power of sparse to check we encode registers right Define a new __bitwise type for software representation of registers. This will allow us to catch incorrect parameter types using sparse. Accessors we define also allow us to return correct enum type and therefore ensure all switches handle all register types. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 09:51:02 -07:00
Jakub Kicinski	a52b35c39e	nfp: bpf: lift the single-port limitation Limiting the eBPF offload to a single port was a workaround required for the PoC application FW which has not been released externally. It's not necessary any more. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 09:51:02 -07:00
Jakub Kicinski	3a4b0129bf	nfp: output control messages to trace_devlink_hwmsg() Use standard devlink trace point to allow tracing of control messages. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 09:51:02 -07:00
Yunsheng Lin	1db9b1bf82	net: hns3: Cleanup for non-static function in hns3 driver This patch fixes the following warning from sparse: warning: symbol 'hns3_set_multicast_list' was not declared. Should it be static. hns3_set_multicast_list turns out to be not used, so delete it. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 09:46:54 -07:00
Yunsheng Lin	a90bb9a5ea	net: hns3: Cleanup for endian issue in hns3 driver This patch fixes a lot of endian issues detected by sparse. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 09:46:54 -07:00
Yunsheng Lin	d44f9b631f	net: hns3: Cleanup for struct that used to send cmd to firmware The hclge_tm module has already added _cmd to the end of struct that used to send cmd to firmware. This will help us finding the endian issues. This patch adds the _cmd to the end of struct that used to send cmd to firmware in hclge_main module. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 09:46:54 -07:00
Yunsheng Lin	5392902d33	net: hns3: Consistently using GENMASK in hns3 driver This patch uses GENMASK to generate bit mask whenever possible in hns3 driver. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 09:46:54 -07:00
Yunsheng Lin	56cf68c730	net: hns3: Cleanup indentation for Kconfig in the hisilicon folder This patch fixes a few indentation issues in the Kconfig file in the hisilicon folder. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 09:46:54 -07:00
Yunsheng Lin	9780cb97af	net: hns3: Add hns3_get_handle macro in hns3 driver There are many places that will need to get the handle of netdev, so add a macro to get the handle of netdev. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 09:46:54 -07:00
Yunsheng Lin	5bca3b94df	net: hns3: Cleanup for shifting true in hns3 driver This patch fixes a shifting true in hclge_main module. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 09:46:53 -07:00
Emil Tantilov	761c2a48c7	ixgbe: split Tx/Rx ring clearing for ethtool loopback test Commit: fed21bcee7a5 ("ixgbe: Don't bother clearing buffer memory for descriptor rings") exposed some issues with the logic in the current implementation of ixgbe_clean_test_rings() that are being addressed in this patch: - Split the clearing of the Tx and Rx rings in separate loops. Previously both Tx and Rx rings were cleared in a rx_desc->wb.upper.length based loop which could lead to issues if for whatever reason packets were received outside of the frames transmitted for the loopback test. - Add check for IXGBE_TXD_STAT_DD to avoid clearing the rings if the transmits have not completed by the time we enter ixgbe_clean_test_rings() - Exit early on ixgbe_check_lbtest_frame() failure. This change fixes a crash during ethtool diagnostic (ethtool -t). Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-09 09:41:25 -07:00
Emil Tantilov	c69be946d6	ixgbe: add error checks when initializing the PHY Ignoring errors when attempting to identify the PHY can lead to a crash. Specifically in the case of FW controlled PHYs where the PHY read/write operations are set to NULL. Removed redundant comment. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-09 09:38:11 -07:00
Shannon Nelson	f5a71caa17	ixgbe: restore normal RSS after last macvlan offload is removed Just like when the last VF is removed, we need to restore normal operations after the last macvlan offload is removed, else we get stuck in single queue operations. To test: ethtool -l eth1 # note the number of queues in use, ~= cpus ethtool -K eth1 l2-fwd-offload on ip link add mv1 link eth1 type macvlan mode bridge ip link set dev mv1 up ip link del mv1 ethtool -l eth1 # are we back to the same # of queues, or stuck on 1? Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-09 09:36:31 -07:00
Bhumika Goyal	2e033eace7	ixgbe: declare ixgbe_mac_operations structures as const Declare ixgbe_mac_operations structures as const as they are only stored in the mac_ops field of ixgbe_info structure. This field is of type const and therefore ixgbe_mac_operations structure can be made const too. Signed-off-by: Bhumika Goyal <bhumirks@gmail.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-09 09:34:27 -07:00
Emil Tantilov	2e22a75c55	ixgbe: Clear SWFW_SYNC register during init Added clearing of SW resource bits in the SW/FW synchronization register to ixgbe_init_swfw_sync_X540(). Updated ixgbe_acquire_swfw_sync_X540 SW Manageability host interface resource bit error case to match the error handling of the other SW resource bits. Which is to release the SW resource bits if SW times out while attempting to acquire the resource. This allows the driver to load in cases where the semaphore bits could be stuck after a reset or a crash. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-09 09:28:23 -07:00
John Fastabend	8e679021c5	ixgbe: incorrect XDP ring accounting in ethtool tx_frame param Changing the TX ring parameters with an XDP program attached may cause the XDP queues to be cleared and the TX rings to be incorrectly configured. Fix by doing correct ring accounting in setup call. Fixes: `33fdc82f08` ("ixgbe: add support for XDP_TX action") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-09 08:02:47 -07:00
Ding Tianhong	5e0fac63a6	net: ixgbe: Use new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag The ixgbe driver uses a compile-time check to determine whether it can send TLPs to the Root Port with the Relaxed Ordering Attribute set, which is inconvenient. Now that the new flag PCI_DEV_FLAGS_NO_RELAXED_ORDERING has been added to the kernel, we can check bit 4 in the PCIe Device Control register to determine whether we should use the Relaxed Ordering Attributes or not, so use this new way in the ixgbe driver. Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Acked-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-09 07:43:06 -07:00
Ding Tianhong	f4986d250a	Revert commit `1a8b6d76dc` ("net:add one common config...") The new flag PCI_DEV_FLAGS_NO_RELAXED_ORDERING has been added to indicate that Relaxed Ordering Attributes (RO) should not be used for Transaction Layer Packets (TLP) targeted toward these affected Root Port, it will clear the bit4 in the PCIe Device Control register, so the PCIe device drivers could query PCIe configuration space to determine if it can send TLPs to Root Port with the Relaxed Ordering Attributes set. With this new flag we don't need the config ARCH_WANT_RELAX_ORDER to control the Relaxed Ordering Attributes for the ixgbe drivers just like the commit `1a8b6d76dc` ("net:add one common config...") did, so revert this commit. Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-09 07:43:06 -07:00
Sabrina Dubroca	a39221ce96	ixgbe: fix masking of bits read from IXGBE_VXLANCTRL register In ixgbe_clear_udp_tunnel_port(), we read the IXGBE_VXLANCTRL register and then try to mask some bits out of the value, using the logical instead of bitwise and operator. Fixes: `a21d0822ff` ("ixgbe: add support for geneve Rx offload") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-09 07:43:06 -07:00
Mark D Rustad	e0f06bba96	ixgbe: Return error when getting PHY address if PHY access is not supported In cases where PHY register access is not supported, don't mislead a caller into thinking that it is supported by returning a PHY address. Instead, return -EOPNOTSUPP when PHY access is not supported. Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-09 07:43:06 -07:00
Christos Gkekas	c49c777f9c	qed: Delete redundant check on dcb_app priority dcb_app priority is unsigned thus checking whether it is less than zero is redundant. Signed-off-by: Christos Gkekas <chris.gekas@gmail.com> Acked-By: Tomer Tayar <Tomer.Tayar@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-08 21:21:02 -07:00
Christos Gkekas	c778c32118	net: ethernet: stmmac: Clean up dead code Many macros in dwmac-ipq806x are unused and should be removed. Moreover gmac->id is an unsigned variable and therefore checking whether it is less than zero is redundant. Signed-off-by: Christos Gkekas <chris.gekas@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-08 21:19:07 -07:00
David S. Miller	51a0c00c6b	Merge tag 'mlx5-updates-2017-10-06' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Saeed Mahameed says: ==================== Mellanox, mlx5 updates 2017-10-06 This series includes some shared code updates for kernel 4.15 to both net-next and rdma-next trees. The series includes mlx5 low level flow steering updates and optimizations to support firmware command parallelism for flow steering requests from Maor Gottlieb and two other small fixes from Matan and Maor. One fix from Matan adds error handling for when the destination list of the flow steering rule is full. Maor introduced a patch to avoid NULL pointer dereference on steering cleanup. Then Some refactoring patches needed by the series for code sharing purposes. and split the Flow Table Entry (FTE) and Flow Group (FG) creation code to two parts: 1) Object allocation - allocate the steering node and initialize its resources. 2) The firmware command execution. This change will give us the ability to take write lock on the parent node (e.g. FG for FTE creating) only on the software data struct allocation and creation part of the procedure where the synchronization is really required, and will allow us to execute multiple firmware commands simultaneously and overcome the firmware bottleneck. Refactor the locking scheme of the mlx5 core flow steering as follows: 1) Replace the mutex lock with readers-writers semaphore and take the write lock only when necessary (e.g. allocating a new flow table entry index or adding a node to the parent's children list). When we try to find a suitable child in the parent's children list (e.g. search for flow group with the same match_criteria of the rule) then we only take the read lock. 2) Add versioning mechanism - each steering entity (FT, FG, FTE, DST) will have an incremental version. The version is increased when the entity is changed (e.g. when a new FTE was added to FG - the FG's version is increased). 
Versioning is used in order to determine if the last traverse of an entity's children is valid or a rescan under write lock is required. Last patch adds FGs and FTEs memory pool, It is useful because these objects are not small and could be allocated/deallocated many times. This support improves the insertion rate of steering rules from ~5k/sec to ~40k/sec. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-08 21:07:11 -07:00
Haiyang Zhang	0518ec4f9d	hv_netvsc: Add ethtool handler to set and get TCP hash levels The patch supports the options to switch TCP hash level between L3 and L4 by ethtool command. TCP over IPv4 and v6 can be set differently. The default hash level is L4. We currently only allow switching TX hash level from within the guests. For example, for TCP over IPv4 on eth0: To include TCP port numbers in hashing: ethtool -N eth0 rx-flow-hash tcp4 sdfn To exclude TCP port numbers in hashing: ethtool -N eth0 rx-flow-hash tcp4 sd To show TCP hash level: ethtool -n eth0 rx-flow-hash tcp4 Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-08 10:11:01 -07:00
Haiyang Zhang	486e398105	hv_netvsc: Change the hash level variable to bit flags This simplifies the logic and makes it easier to add more options. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-08 10:11:01 -07:00
Ido Schimmel	9b63ef88d3	mlxsw: spectrum: Propagate extack further for bridge enslavements The code that actually takes care of bridge offload introduces a few more non-trivial constraints with regards to bridge enslavements. Propagate extack there to indicate the reason. $ ip link add link enp1s0np1 name enp1s0np1.10 type vlan id 10 $ ip link add link enp1s0np1 name enp1s0np1.20 type vlan id 20 $ ip link add name br0 type bridge $ ip link set dev enp1s0np1.10 master br0 $ ip link set dev enp1s0np1.20 master br0 Error: spectrum: Can not bridge VLAN uppers of the same port. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-08 10:07:21 -07:00
Ido Schimmel	c1f2c6d025	mlxsw: spectrum: Add extack for VLAN enslavements Similar to physical ports, enslavement of VLAN devices can also fail. Use extack to indicate why the enslavement failed. $ ip link add link enp1s0np1 name enp1s0np1.10 type vlan id 10 $ ip link add name bond0 type bond mode 802.3ad $ ip link set dev enp1s0np1.10 master bond0 Error: spectrum: VLAN devices only support bridge and VRF uppers. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-08 10:07:21 -07:00
Ido Schimmel	a69518cf0b	mlxsw: spectrum_router: Avoid expensive lookup during route removal In commit `fc922bb0dd` ("mlxsw: spectrum_router: Use one LPM tree for all virtual routers") I increased the scale of supported VRFs by having all of them share the same LPM tree. In order to avoid look-ups for prefix lengths that don't exist, each route removal would trigger an aggregation across all the active virtual routers to see which prefix lengths are in use and which aren't and structure the tree accordingly. With the way the data structures are currently laid out, this is a very expensive operation. When performed repeatedly - due to the invocation of the abort mechanism - and with enough VRFs, this can result in a hung task. For now, avoid this optimization until it can be properly re-added in net-next. Fixes: `fc922bb0dd` ("mlxsw: spectrum_router: Use one LPM tree for all virtual routers") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: David Ahern <dsa@cumulusnetworks.com> Tested-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-08 10:05:27 -07:00
Jonathan Toppins	0d7b70e836	bnxt_en: don't consider building bnxt_tc.o if option not enabled Instead of zeroing out bnxt_tc.c with an #ifdef foo, don't compile the file when the option is not enabled. Now make and the preprocessor do not have to waste time compiling a no-op. Signed-off-by: Jonathan Toppins <jtoppins@redhat.com> Acked-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-07 03:00:26 +01:00
David S. Miller	f5333f80c3	Merge branch '40GbE' of ra.kernel.org:/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 40GbE Intel Wired LAN Driver Updates 2017-10-06 This series contains updates to i40e and i40evf only. Rami fixes a typo in the code comments. Mitch adds an ethtool private flag to control source pruning to resolve an issue where our default behavior is to enable source pruning which breaks ARP monitoring in channel bonding. Fixes a couple of register definitions, which were incorrect. Jake fixes an issue with multiple logical CPUs per core (simultaneous multithreading - SMT) and how we set an affinity hint based on the v_idx of that q_vector, which is an incremental value and might lead to multiple offline CPUs being assigned to a q_vector. Instead, we should only assign hints for CPUs which are online, so look to use cpumask_local_spread(). Also fixed a VF VLAN tag stripping issue, where the flag created to change this feature was seen as unchangeable. Lastly, organized and re-numbered the feature flags. Alan re-enables PTP L4 for XL710 devices with firmware version 6.0 or greater, now that the previous bug in the older firmware is fixed. Implements the PCI error handlers for reset_prepare() and reset_done() to allow us to handle function level resets. Alice cleans up code that was added to the incorrect function during a merge. Filip adds a change to display an error message when a module is inserted that does not meet the thermal requirements, Talking Heads "Burning Down the House" comes to mind. Also fixed a flow director filter issue where a variable was not being cleared which stores the filter number to be removed from the list when the firmware refused to add the requested filter. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-06 23:29:38 +01:00
Guillaume Nault	6151b8b37b	ppp: fix race in ppp device destruction ppp_release() tries to ensure that netdevices are unregistered before decrementing the unit refcount and running ppp_destroy_interface(). This is all fine as long as the device is unregistered by ppp_release(): the unregister_netdevice() call, followed by rtnl_unlock(), guarantees that the unregistration process completes before rtnl_unlock() returns. However, the device may be unregistered by other means (like ppp_nl_dellink()). If this happens right before ppp_release() calling rtnl_lock(), then ppp_release() has to wait for the concurrent unregistration code to release the lock. But rtnl_unlock() releases the lock before completing the device unregistration process. This allows ppp_release() to proceed and eventually call ppp_destroy_interface() before the unregistration process completes. Calling free_netdev() on this partially unregistered device will BUG(): ------------[ cut here ]------------ kernel BUG at net/core/dev.c:8141! invalid opcode: 0000 [#1] SMP CPU: 1 PID: 1557 Comm: pppd Not tainted 4.14.0-rc2+ #4 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1.fc26 04/01/2014 Call Trace: ppp_destroy_interface+0xd8/0xe0 [ppp_generic] ppp_disconnect_channel+0xda/0x110 [ppp_generic] ppp_unregister_channel+0x5e/0x110 [ppp_generic] pppox_unbind_sock+0x23/0x30 [pppox] pppoe_connect+0x130/0x440 [pppoe] SYSC_connect+0x98/0x110 ? do_fcntl+0x2c0/0x5d0 SyS_connect+0xe/0x10 entry_SYSCALL_64_fastpath+0x1a/0xa5 RIP: free_netdev+0x107/0x110 RSP: ffffc28a40573d88 ---[ end trace ed294ff0cc40eeff ]--- We could set the ->needs_free_netdev flag on PPP devices and move the ppp_destroy_interface() logic in the ->priv_destructor() callback. But that'd be quite intrusive as we'd first need to unlink from the other channels and units that depend on the device (the ones that used the PPPIOCCONNECT and PPPIOCATTACH ioctls). Instead, we can just let the netdevice hold a reference on its ppp_file. 
This reference is dropped in ->priv_destructor(), at the very end of the unregistration process, so that neither ppp_release() nor ppp_disconnect_channel() can call ppp_destroy_interface() in the interim. Reported-by: Beniamino Galvani <bgalvani@redhat.com> Fixes: `8cb775bc0a` ("ppp: fix device unregistration upon netns deletion") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-06 10:16:34 -07:00
Bjorn Helgaas	d2746fe538	bnx2x: Use pci_ari_enabled() instead of local copy Use pci_ari_enabled() from the PCI core instead of the identical local copy bnx2x_ari_enabled(). No functional change intended. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-06 10:06:06 -07:00
Pieter Jansen van Vuuren	f8b7b0a6b1	nfp: add set tcp and udp header action flower offload Previously we did not have offloading support for set TCP/UDP actions. This patch enables TC flower offload of set TCP/UDP sport and dport actions. Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-06 09:56:36 -07:00
Pieter Jansen van Vuuren	354b82bb32	nfp: add set ipv6 source and destination address Previously we did not have offloading support for set IPv6 actions. This patch enables TC flower offload of set IPv6 src and dst address actions. Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-06 09:56:36 -07:00
Pieter Jansen van Vuuren	c0b1bd9a8b	nfp: add set ipv4 header action flower offload Previously we did not have offloading support for set IPv4 actions. This patch enables TC flower offload of set IPv4 src and dst address actions. Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-06 09:56:36 -07:00
Pieter Jansen van Vuuren	da83d8fe58	nfp: add set ethernet header action flower offload Previously we did not have offloading support for set ethernet actions. This patch enables TC flower offload of set ethernet actions. Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-06 09:56:35 -07:00
Pieter Jansen van Vuuren	fc53b4a701	nfp: add IPv6 ttl and tos match offloading support Previously matching on IPv6 ttl and tos fields was not offloaded. This patch enables offloading IPv6 ttl and tos as match fields. Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-06 09:56:35 -07:00
Pieter Jansen van Vuuren	a1e9203cc6	nfp: add IPv4 ttl and tos match offloading support Previously matching on IPv4 ttl and tos fields was not offloaded. This patch enables offloading IPv4 ttl and tos as match fields. Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-06 09:56:35 -07:00
Pieter Jansen van Vuuren	bb055c198d	nfp: add mpls match offloading support Previously MPLS match offloading was not supported. This patch enables MPLS match offloading support for label, bos and tc fields. Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-06 09:56:35 -07:00
Jacob Keller	b74f571f59	i40e/i40evf: organize and re-number feature flags Now that we've reduced the number of flags, organize similar flags together and re-number them accordingly. Since we don't yet have more than 32 flags, we'll use a u32 for both the hw_features and flag field. Should we gain more flags in the future, we may need to convert to a u64 or separate flags out into two fields. One alternative approach considered, but not implemented here, was to use an enumeration for the flag variables, and create a macro I40E_FLAG() which used string concatenation to generate BIT_ULL values. This has the advantage of making the actual bit values compile-time dynamic so that we do not need to worry about matching the order to the bit value. However, this does produce a high level of code churn, and makes it more difficult to read a dumped flags value when debugging. Change-ID: I8653fff69453cd547d6fe98d29dfa9d8710387d1 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-06 08:11:32 -07:00
Jacob Keller	a5340d933e	i40e: ignore skb->xmit_more when deciding to set RS bit Since commit `6a7fded776` ("i40e: Fix RS bit update in Tx path and disable force WB workaround") we've tried to "optimize" setting the RS bit based around skb->xmit_more. This same logic was refactored in commit `1dc8b53879` ("i40e: Reorder logic for coalescing RS bits"), but ultimately was not functionally changed. Using skb->xmit_more in this way is incorrect, because in certain circumstances we may see a large number of skbs in sequence with xmit_more set. This leads to a performance loss as the hardware does not writeback anything for those packets, which delays the time it takes for us to respond to the stack transmit requests. This significantly impacts UDP performance, especially when layered with multiple devices, such as bonding, VLANs, and vnet setups. This was not noticed until now because it is difficult to create a setup which reproduces the issue. It was discovered in a UDP_STREAM test in a VM, connected using a vnet device to a bridge, which is connected to a bonded pair of X710 ports in active-backup mode with a VLAN. These layered devices seem to compound the number of skbs transmitted at once by the qdisc. Additionally, the problem can be masked by reducing the ITR value. Since the original commit does not provide strong justification for this RS bit "optimization", revert to the previous behavior of setting the RS bit every 4th packet. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-06 08:11:32 -07:00
Jacob Keller	0a3b4f702f	i40evf: enable support for VF VLAN tag stripping control A recent commit 809481484e5d ("i40e/i40evf: support for VF VLAN tag stripping control") added support for VFs to negotiate the control of VLAN tag stripping. This should have allowed VFs to disable the feature. Unfortunately, the flag was set only in netdev->feature flags and not in netdev->hw_features. This ultimately causes the stack to assume that it cannot change the flag, so it was unchangeable and marked as [fixed] in the ethtool -k output. Fix this by setting the feature in hw_features first, just as we do for the PF code. This enables ethtool -K to disable the feature correctly, and fully enables user control of the VLAN tag stripping feature. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-06 08:11:32 -07:00
Mariusz Stachura	052b93d0c2	i40e: do not enter PHY debug mode while setting LEDs behaviour The previous implementation of the LED set/get functions required entering PHY debug mode in order to prevent access to it from FW and SW at the same time. A reset of all ports was an unwanted side effect. Signed-off-by: Mariusz Stachura <mariusz.stachura@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-06 08:11:32 -07:00
Alan Brady	19b7960b2d	i40e: implement split PCI error reset handler This patch implements the PCI error handler reset_prepare and reset_done. This allows us to handle function level reset. Without this patch we are unable to perform and recover from an FLR correctly and this will cause VFs to be unable to recover from an FLR on the PF. Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-06 08:11:32 -07:00
Filip Sadowski	013df598d6	i40e: Properly maintain flow director filters list When there is no space for more flow director filters and user requested to add a new one it is rejected by firmware and automatically removed from the filter list maintained by driver. This behaviour is correct. Afterwards existing filter can be removed making free slot for the new one. This however causes the newly added filter to be accepted by firmware but removed from driver filter list resulting in not showing after issuing 'ethtool -n <dev_name>'. This happened due to not clearing the variable pf->fd_inv which stores filter number to be removed from the list when firmware refused to add the requested filter. It caused the filter with this specific ID to be constantly removed once it was added to the list although it has been accepted by firmware and effectively applied to the NIC. It was fixed by clearing pf->fd_inv variable after removal of the filter from the list when it was rejected by firmware. Signed-off-by: Filip Sadowski <filip.sadowski@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-06 08:11:32 -07:00
Filip Sadowski	9a858178ef	i40e: Display error message if module does not meet thermal requirements This patch causes error message to be displayed when NIC detects insertion of module that does not meet thermal requirements. Signed-off-by: Filip Sadowski <filip.sadowski@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-06 08:11:31 -07:00
Alice Michael	7f66182263	i40e: fix merge error This patch removes some code that was accidentally added to the wrong function with a merge error. Fixes: `c53934c6d1` ("i40e: fix: do not sleep in netdev_ops") Signed-off-by: Alice Michael <alice.michael@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-06 08:11:31 -07:00
Jesse Brandeburg	bd6cd4e6dd	i40e/i40evf: use DECLARE_BITMAP for state When using set_bit and friends, we should be using actual bitmaps, and fix all the locations where we might access it. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-06 08:11:31 -07:00
Mitch Williams	0a0d9af5bc	i40e: fix incorrect register definition This register was defined incorrectly. Fix the increment value to 8, and replace the iterator with _i to make the definition consistent with other statistics registers. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-06 08:11:31 -07:00
Mitch Williams	60518a0489	i40e: redfine I40E_PHY_TYPE_MAX Since I40E_PHY_TYPE_MAX is used as an iterator, usually combined with some sort of bit-shifting, it should only include actual PHY types and not error cases. Move it up in the enum declaration so that loops only iterate across valid PHY types. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-06 08:11:31 -07:00
Alan Brady	c3d26b75c2	i40e: re-enable PTP L4 capabilities for XL710 if FW >6.0 Starting with XL710 FW 5.3 PTP L4 was disabled for XL710 due to a bug. The bug has since been resolved in XL710 FW >6.0 and PTP L4 can now be re-enabled on those devices with updated firmware. Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-06 08:11:31 -07:00
Jacob Keller	be664cbefc	i40e/i40evf: spread CPU affinity hints across online CPUs only Currently, when setting up the IRQ for a q_vector, we set an affinity hint based on the v_idx of that q_vector. Meaning a loop iterates on v_idx, which is an incremental value, and the cpumask is created based on this value. This is a problem in systems with multiple logical CPUs per core (like in simultaneous multithreading (SMT) scenarios). If we disable some logical CPUs, by turning SMT off for example, we will end up with a sparse cpu_online_mask, i.e., only the first CPU in a core is online, and incremental filling in q_vector cpumask might lead to multiple offline CPUs being assigned to q_vectors. Example: if we have a system with 8 cores each one containing 8 logical CPUs (SMT == 8 in this case), we have 64 CPUs in total. But if SMT is disabled, only the 1st CPU in each core remains online, so the cpu_online_mask in this case would have only 8 bits set, in a sparse way. In the general case, when SMT is off the cpu_online_mask has only C bits set: 0, N, 2N, ..., (C-1)*N where C == # of cores; N == # of logical CPUs per core. In our example, only bits 0, 8, 16, 24, 32, 40, 48, 56 would be set. Instead, we should only assign hints for CPUs which are online. Even better, the kernel already provides a function, cpumask_local_spread() which takes an index and returns a CPU, spreading the interrupts across local NUMA nodes first, and then remote ones if necessary. Since we generally have a 1:1 mapping between vectors and CPUs, there is no real advantage to spreading vectors to local CPUs first. In order to avoid mismatch of the default XPS hints, we'll pass -1 so that it spreads across all CPUs without regard to the node locality. Note that we don't need to change the q_vector->affinity_mask as this is initialized to cpu_possible_mask, until an actual affinity is set and then notified back to us. 
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-06 08:11:31 -07:00
Mitch Williams	64615b5418	i40e: add private flag to control source pruning By default, our devices do source pruning, that is, they drop receive packets that have the source MAC matching one of the receive filters. Unfortunately, this breaks ARP monitoring in channel bonding, as the bonding driver expects devices to receive ARPs containing their own source address. Add an ethtool private flag to control this feature. Also, remove the netif_running() check when we process our private flags. It's OK to reset when the device is closed and in most cases we need the reset to apply these changes. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-06 08:11:31 -07:00
Rami Rosen	ec2f25d203	i40e: fix a typo in i40e_pf documentation This patch fixes a typo in i40e_pf object documentation; num_req_vfs refers to the number of VFs requested for the PF. Signed-off-by: Rami Rosen <rami.rosen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-06 08:11:31 -07:00
Johannes Berg	753d179ad0	Merge remote-tracking branch 'net-next/master' into mac80211-next Merging this brings in the timer_setup() change, which allows me to apply Kees's mac80211 changes for it. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2017-10-06 11:46:55 +02:00
Colin Ian King	d009313c99	net: qcom/emac: make function emac_isr static The function emac_isr is local to the source and does not need to be in global scope, so make it static. Cleans up sparse warnings: symbol 'emac_isr' was not declared. Should it be static? Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-05 21:27:02 -07:00
David S. Miller	53954cf8c5	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Just simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-05 18:19:22 -07:00
David Ahern	e58376e1df	mlxsw: spectrum: Add extack messages for enslave failures mlxsw fails device enslavement for a number of reasons. Use the extack facility to return an error message to the user stating why the enslave is failing. Messages are prefixed with "spectrum" so users know it is a constraint imposed by the hardware driver. For example: $ ip li add br0.11 link br0 type vlan id 11 $ ip li set swp11 master br0 Error: spectrum: Enslaving a port to a device that already has an upper device is not supported. Signed-off-by: David Ahern <dsahern@gmail.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Tested-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-04 21:39:34 -07:00
David Ahern	759088bda2	net: bonding: Add extack messages for some enslave failures A number of bond_enslave errors are logged using the netdev_err API. Return those messages to userspace via the extack facility. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-04 21:39:34 -07:00
David Ahern	de3baa3ed7	net: vrf: Add extack messages for enslave errors Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-04 21:39:33 -07:00
David Ahern	42ab19ee90	net: Add extack to upper device linking Add extack arg to netdev_upper_dev_link and netdev_master_upper_dev_link Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-04 21:39:33 -07:00
David Ahern	33eaf2a6eb	net: Add extack to ndo_add_slave Pass extack to do_set_master and down to ndo_add_slave Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-04 21:39:33 -07:00
Colin Ian King	ebf6b13142	cxgb4vf: make a couple of functions static The functions t4vf_link_down_rc_str and t4vf_handle_get_port_info are local to the source and do not need to be in global scope, so make them static. Cleans up sparse warnings: symbol 't4vf_link_down_rc_str' was not declared. Should it be static? symbol 't4vf_handle_get_port_info' was not declared. Should it be static? Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-04 10:31:55 -07:00
Simon Horman	4d86d38186	ravb: RX checksum offload Add support for RX checksum offload. This is enabled by default and may be disabled and re-enabled using ethtool: # ethtool -K eth0 rx off # ethtool -K eth0 rx on The RAVB provides a simple checksumming scheme which appears to be completely compatible with CHECKSUM_COMPLETE: sum of all packet data after the L2 header is appended to packet data; this may be trivially read by the driver and used to update the skb accordingly. In terms of performance throughput is close to gigabit line-rate both with and without RX checksum offload enabled. Perf output, however, appears to indicate that significantly less time is spent in do_csum(). This is as expected. Test results with RX checksum offload enabled: # /usr/bin/perf_3.16 record -o /run/perf.data -a netperf -t TCP_MAERTS -H 10.4.3.162 MIGRATED TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.4.3.162 () port 0 AF_INET : demo enable_enobufs failed: getprotobyname Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 
10^6bits/sec 87380 16384 16384 10.00 937.54 Summary of output of perf report: 18.28% ksoftirqd/0 [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore 10.34% ksoftirqd/0 [kernel.kallsyms] [k] __pi_memcpy 9.83% ksoftirqd/0 [kernel.kallsyms] [k] ravb_poll 7.89% ksoftirqd/0 [kernel.kallsyms] [k] skb_put 4.01% ksoftirqd/0 [kernel.kallsyms] [k] dev_gro_receive 3.37% netperf [kernel.kallsyms] [k] __arch_copy_to_user 3.17% swapper [kernel.kallsyms] [k] arch_cpu_idle 2.55% swapper [kernel.kallsyms] [k] tick_nohz_idle_enter 2.04% ksoftirqd/0 [kernel.kallsyms] [k] __pi___inval_dcache_area 2.03% swapper [kernel.kallsyms] [k] _raw_spin_unlock_irq 1.96% ksoftirqd/0 [kernel.kallsyms] [k] __netdev_alloc_skb 1.59% ksoftirqd/0 [kernel.kallsyms] [k] __slab_alloc.isra.83 Test results without RX checksum offload enabled: # /usr/bin/perf_3.16 record -o /run/perf.data -a netperf -t TCP_MAERTS -H 10.4.3.162 MIGRATED TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.4.3.162 () port 0 AF_INET : demo enable_enobufs failed: getprotobyname Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 87380 16384 16384 10.00 940.20 Summary of output of perf report: 17.10% ksoftirqd/0 [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore 10.99% ksoftirqd/0 [kernel.kallsyms] [k] __pi_memcpy 8.87% ksoftirqd/0 [kernel.kallsyms] [k] ravb_poll 8.16% ksoftirqd/0 [kernel.kallsyms] [k] skb_put 7.42% ksoftirqd/0 [kernel.kallsyms] [k] do_csum 3.91% ksoftirqd/0 [kernel.kallsyms] [k] dev_gro_receive 2.31% swapper [kernel.kallsyms] [k] arch_cpu_idle 2.16% ksoftirqd/0 [kernel.kallsyms] [k] __pi___inval_dcache_area 2.14% ksoftirqd/0 [kernel.kallsyms] [k] __netdev_alloc_skb 1.93% netperf [kernel.kallsyms] [k] __arch_copy_to_user 1.79% swapper [kernel.kallsyms] [k] tick_nohz_idle_enter 1.63% ksoftirqd/0 [kernel.kallsyms] [k] __slab_alloc.isra.83 Above results collected on an R-Car Gen 3 Salvator-X/r8a7796 ES1.0. 
Also tested on a R-Car Gen 3 Salvator-X/r8a7795 ES1.0. By inspection this also appears to be compatible with the ravb found on R-Car Gen 2 SoCs, however, this patch is currently untested on such hardware. Signed-off-by: Simon Horman <horms+renesas@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-04 10:26:05 -07:00
Ganesh Goudar	acd669a8f6	cxgb4: add new T6 pci device id's Add 0x6085 T6 device id. Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-03 21:42:39 -07:00
David S. Miller	af14827fa3	Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 100GbE Intel Wired LAN Driver Updates 2017-10-03 This series contains updates to fm10k only. Jake provides the majority of the changes in this series, starting with using fm10k_prepare_for_reset() if we lose PCIe link. Before we would detach the device and close the netdev, which left a lot of items still active, such as the Tx/Rx resources. This could cause problems where register reads would return potentially invalid values and would result in unknown driver behavior, so call fm10k_prepare_for_reset() much like we do for suspend/resume cycles. This will attempt to shut down as much as possible to prevent possible issues. Then replaced the PCI specific legacy power management hooks with the new generic power management hooks for both suspend and hibernate. Introduced a workqueue item which monitors a queue of MAC and VLAN requests since a large number of MAC address or VLAN updates at once can overload the mailbox with too many messages. Fixed a cppcheck warning by properly declaring the min_rate and max_rate variables in the declaration and definition for .ndo_set_vf_bw, rather than using "unused" for the minimum rates. Joe Perches fixes the backward logic when using net_ratelimit(). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-03 16:25:05 -07:00
David Wu	05946876f0	net: stmmac: dwmac-rk: Add RK3128 GMAC support Add constants and callback functions for the dwmac on rk3128 soc. As can be seen, the base structure is the same, only registers and the bits in them moved slightly. Signed-off-by: David Wu <david.wu@rock-chips.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-03 15:39:56 -07:00
Mahesh Bandewar	4d2c0cda07	bonding: speed/duplex update at NETDEV_UP event Some NIC drivers don't have correct speed/duplex settings at the time they send NETDEV_UP notification and that messes up the bonding state. Especially 802.3ad mode which is very sensitive to these settings. In the current implementation we invoke bond_update_speed_duplex() when we receive NETDEV_UP, however, ignore the return value. If the values we get are invalid (UNKNOWN), then slave gets removed from the aggregator with speed and duplex set to UNKNOWN while link is still marked as UP. This patch fixes this scenario. Also 802.3ad mode is sensitive to these conditions while other modes are not, so making sure that it doesn't change the behavior for other modes. Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-03 14:32:25 -07:00
Aleksander Morgado	63ba395cd7	rndis_host: support Novatel Verizon USB730L Treat the ef/04/01 interface class/subclass/protocol combination used by the Novatel Verizon USB730L (1410:9030) as a possible RNDIS interface. T: Bus=01 Lev=02 Prnt=02 Port=01 Cnt=02 Dev#= 17 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 3 P: Vendor=1410 ProdID=9030 Rev=03.10 S: Manufacturer=Novatel Wireless S: Product=MiFi USB730L S: SerialNumber=0123456789ABCDEF C: #Ifs= 3 Cfg#= 1 Atr=80 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 1 Cls=ef(misc ) Sub=04 Prot=01 Driver=rndis_host I: If#= 1 Alt= 0 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=rndis_host I: If#= 2 Alt= 0 #EPs= 1 Cls=03(HID ) Sub=00 Prot=00 Driver=usbhid Once the network interface is brought up, the user just needs to run a DHCP client to get IP address and routing setup. As a side note, other Novatel Verizon USB730L models with the same vid:pid end up exposing a standard ECM interface which doesn't require any other kernel update to make it work. Signed-off-by: Aleksander Morgado <aleksander@aleksander.es> Reviewed-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-03 14:30:46 -07:00
Dan Carpenter	b5c7d4e54c	mlxsw: spectrum: Add missing error code on allocation failure We accidentally return success if the kmalloc_array() call fails. Fixes: `0e14c7777a` ("mlxsw: spectrum: Add the multicast routing hardware logic") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-03 10:26:58 -07:00
Dan Carpenter	b508e0b6e4	mlxsw: spectrum: Fix check for IS_ERR() instead of NULL mlxsw_afa_block_create() doesn't return error pointers, it returns NULL on error. Fixes: `0e14c7777a` ("mlxsw: spectrum: Add the multicast routing hardware logic") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-03 10:26:58 -07:00
Colin Ian King	360cc342c9	net: dsa: mt7530: make functions mt7530_phy_write static The function mt7530_phy_write is local to the source and does not need to be in global scope, so make it static. Cleans up sparse warnings: symbol 'mt7530_phy_write' was not declared. Should it be static? Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-03 10:20:12 -07:00
Colin Ian King	161ae6b04d	net: dsa: lan9303: make functions lan9303_mdio_phy_{read|write} static The functions lan9303_mdio_phy_write and lan9303_mdio_phy_read are local to the source and do not need to be in global scope, so make them static. Cleans up sparse warnings: symbol 'lan9303_mdio_phy_write' was not declared. Should it be static? symbol 'lan9303_mdio_phy_read' was not declared. Should it be static? Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-03 10:20:12 -07:00
Yotam Gigi	f60c254998	mlxsw: spectrum: mr: Support trap-and-forward routes Add the support of trap-and-forward route action in the multicast routing offloading logic. A route will be set to trap-and-forward action if one (or more) of its output interfaces is not offload-able, i.e. does not have a valid Spectrum RIF. This way, a route with mixed output VIFs list, which contains both offload-able and un-offload-able devices can go through partial offloading in hardware, and the rest will be done in the kernel ipmr module. Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-03 10:06:30 -07:00
Yotam Gigi	607feadef8	mlxsw: spectrum: mr_tcam: Add trap-and-forward multicast route In addition to the current multicast route actions, which include trap route action and a forward route action, add the trap-and-forward multicast route action, and implement it in the multicast routing hardware logic. To implement that, add a trap-and-forward ACL action as the last action in the route flexible action set. The used trap is the ACL2 trap, which marks the packets with offload_mr_forward_mark, to prevent the packet from being forwarded again by the kernel. Note: At that stage the offloading logic does not support trap-and-forward multicast routes. This patch adds the support only in the hardware logic. Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-03 10:06:30 -07:00
Yotam Gigi	a0040c8c93	mlxsw: spectrum: Add trap for multicast trap-and-forward routes When a multicast route is configured with trap-and-forward action, the packets should be marked with skb->offload_mr_fwd_mark, in order to prevent the packets from being forwarded again by the kernel ipmr module. Due to this, it is not possible to use the already existing multicast trap (MLXSW_TRAP_ID_ACL1) as the packet should be marked differently. Add the MLXSW_TRAP_ID_ACL2 which is for trap-and-forward multicast routes, and set the offload_mr_fwd_mark skb field in its handler. Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-03 10:06:30 -07:00
Yotam Gigi	2678724355	mlxsw: acl: Introduce ACL trap and forward action Use trap/discard flex action to implement trap and forward. The action will later be used for multicast routing, as the multicast routing mechanism is done using ACL flexible actions in Spectrum hardware. Using that action, it will be possible to implement a trap-and-forward route. Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-03 10:06:30 -07:00
Arjun Vynipadath	a047fbae23	cxgb4: Update comment for min_mtu We have lost a comment for minimum mtu value set for netdevice with commit `d894be57ca` ("ethernet: use net core MTU range checking in more drivers"). Updating it accordingly. Signed-off-by: Arjun Vynipadath <arjun@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-03 09:55:41 -07:00
Jacob Keller	3e256ac5b1	fm10k: fix mis-ordered parameters in declaration for .ndo_set_vf_bw We've had support for setting both a minimum and maximum bandwidth via .ndo_set_vf_bw since commit `883a9ccbae` ("fm10k: Add support for SR-IOV to driver", 2014-09-20). Likely because we do not support minimum rates, the declaration mis-ordered the "unused" parameter, which causes warnings when analyzed with cppcheck. Fix this warning by properly declaring the min_rate and max_rate variables in the declaration and definition (rather than using "unused"). Also rename "rate" to max_rate so as to clarify that we only support setting the maximum rate. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-03 09:00:04 -07:00
Jacob Keller	87be98927e	fm10k: prefer %s and __func__ for diagnostic prints Don't hard code the function names in the diagnostic output when these reset related routines fail. Instead, use %s and __func__ so that future refactors don't need to change the print outs. Additionally, while we are here, add missing function header comments for the new reset_prepare and reset_done function handlers. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-03 08:55:20 -07:00
Joe Perches	c0ad8ef3df	fm10k: Fix misuse of net_ratelimit() Correct the backward logic using !net_ratelimit() Miscellanea: o Add a blank line before the error return label Signed-off-by: Joe Perches <joe@perches.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-03 08:51:27 -07:00
Jacob Keller	ef57ab791c	fm10k: bump version number Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-03 08:49:52 -07:00
Jacob Keller	1f5c27e528	fm10k: use the MAC/VLAN queue for VF<->PF MAC/VLAN requests Now that we have a working MAC/VLAN queue for handling MAC/VLAN messages from the netdev, replace the default handler for the VF<->PF messages. This new handler is very similar to the default code, but uses the MAC/VLAN queue instead of sending the message directly. Unfortunately we can't easily re-use the default code, so we'll just replace the entire function. This ensures that a VF requesting a large number of VLANs or MAC addresses does not start a reset cycle, as explained in the commit which introduced the message queue. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Ngai-mint Kwan <ngai-mint.kwan@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-03 08:39:17 -07:00
Jacob Keller	fc9173682d	fm10k: introduce a message queue for MAC/VLAN messages Under some circumstances, when dealing with a large number of MAC address or VLAN updates at once, the fm10k driver, particularly the VFs can overload the mailbox with too many messages at once. This results in a mailbox timeout, which causes the driver to initiate a reset. During the reset, we re-send all the same messages that originally caused the timeout. This results in a cycle of resets each triggering a future reset. To fix or avoid this, we introduce a workqueue item which monitors a queue of MAC and VLAN requests. These requests are queued to the end of the list, and we process as a FIFO periodically. Initially we only handle requests for the netdev, but we do handle unicast MAC addresses, multicast MAC addresses, and update VLAN requests. A future patch will add support to use this queue for handling MAC update requests from the VF<->PF mailbox. The MAC/VLAN work item will keep checking to make sure that each request does not overflow the mailbox and cause a timeout. If it might, then the work item will reschedule itself a short time later. This avoids any reset cycle, since we never send the message if the mailbox is not ready. As an alternative, we tried increasing the mailbox message FIFO, but this just delays the problem and results in needless memory waste on the system. Our new message queue is dynamically allocated so only uses as much memory as it needs. Additionally, it need not be contiguous like the Tx and Rx FIFOs. Note that this patch chose to only create a queue for MAC and VLAN messages, since these are the only messages sent in a large enough volume to cause the reset loop. Other messages are very unlikely to overflow the mailbox Tx FIFO so easily. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-03 08:25:36 -07:00
Jacob Keller	8249c47c6b	fm10k: use generic PM hooks instead of legacy PCIe power hooks Replace the PCI specific legacy power management hooks with the new generic power management hooks which work properly for both suspend and hibernate. The new generic system is better and properly handles the lower level PCIe power management rather than forcing the driver to handle it. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-03 08:19:33 -07:00
Jacob Keller	b4fcd43661	fm10k: use spinlock to implement mailbox lock Let's not re-invent the locking wheel. Remove our bitlock and use a proper spinlock instead. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-03 08:12:44 -07:00
Jacob Keller	0b40f45748	fm10k: prepare_for_reset() when we lose PCIe Link If we lose PCIe link, such as when an unannounced PFLR event occurs, or when a device is surprise removed, we currently detach the device and close the netdev. This unfortunately leaves a lot of things still active, such as the msix_mbx_pf IRQ, and Tx/Rx resources. This can cause problems because the register reads will return potentially invalid values which may result in unknown driver behavior. Begin the process of resetting using fm10k_prepare_for_reset(), much in the same way as the suspend and resume cycle does. This will attempt to shutdown as much as possible, in order to prevent possible issues. A naive implementation for this has issues, because there are now multiple flows calling the reset logic and setting a reset bit. This would cause problems, because the "re-attach" routine might call fm10k_handle_reset() prior to the reset actually finishing. Instead, we'll add state bits to indicate which flow actually initiated the reset. For the general reset flow, we'll assume that if someone else is resetting that we do not need to handle it at all, so it does not need its own state bit. For the suspend case, we will simply issue a warning indicating that we are attempting to recover from this case when resuming. For the detached subtask, we'll simply refuse to re-attach until we've actually initiated a reset as part of that flow. Finally, we'll stop attempting to manage the mailbox subtask when we're detached, since there's nothing we can do if we don't have a PCIe address. Overall this produces a much cleaner shutdown and recovery cycle for a PCIe surprise remove event. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-03 08:06:44 -07:00
David S. Miller	4efac6ff4d	Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 40GbE Intel Wired LAN Driver Updates 2017-10-02 This series contains updates to i40e and i40evf. Shannon Nelson fixes an issue where when a machine has more CPUs than queue pairs, the counting gets a "little funky" and turns off Flow Director. So to correct it, limit the number of LAN queues initially allocated to be sure there are some left for Flow Director and other features. Lihong cleans up dead code by removing a condition check which cannot ever be true. Christophe Jaillet fixes a potential NULL pointer dereference, which could happen if kzalloc() fails. Filip corrects the reporting of supported link modes, which was incorrect for some NICs. Added support for 'ethtool -m' command, which displays information about QSFP+ modules. Mariusz adds functions to read/write the LED registers to control the LEDS, instead of accessing the registers directly whenever the LEDs need to be controlled. Jake fixes a regression where we introduced a scheduling while atomic, so introduce a separate helper function which will manage its own need for the mac_filter_hash_lock. Also cleaned up the "PF" parameter in i40e_vc_disable_vf() since it is never used and is not needed. Fixed a rare case where it is possible that a reset does not occur when i40e_vc_disable_vf() is called, so modify i40e_reset_vf() to return a bool to indicate whether it reset or not so that i40e_vc_disable_vf() can wait until a reset actually occurs. Alan adds the ability for the VF to request more or less underlying allocated queues from the PF. Fixes the incorrect method for clearing the vf_states variable with a NULL assignment, when we should be using atomic bitops since we don't actually want to clear all the flags. Fixed a resource leak, where the PF driver fails to inform clients of a VF reset because we were incorrectly checking the I40E_VF_STATE_PRE_ENABLE bit. Mitch converts i40evf_map_rings_to_vectors() to a void function since it cannot fail and allows us to clean up the checks for the function return value. Scott enables the driver(s) to pass traffic with VLAN tags using the 802.1ad Ethernet protocol. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-02 15:16:03 -07:00
Scott Peterson	ab243ec940	i40e: Stop dropping 802.1ad tags - eth proto 0x88a8 Enable i40e to pass traffic with VLAN tags using the 802.1ad ethernet protocol ID (0x88a8). This requires NIC firmware providing version 1.7 of the API. With older NIC firmware 802.1ad tagged packets will continue to be dropped. No VLAN offloads nor RSS are supported for 802.1ad VLANs. Signed-off-by: Scott Peterson <scott.d.peterson@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 12:46:36 -07:00
Alan Brady	c53d11f669	i40e: fix client notify of VF reset Currently there is a bug in which the PF driver fails to inform clients of a VF reset which then causes clients to leak resources. The bug exists because we were incorrectly checking the I40E_VF_STATE_PRE_ENABLE bit. When a VF is first init we go through a reset to initialize variables and allocate resources but we don't want to inform clients of this first reset since the client isn't fully enabled yet so we set a state bit signifying we're in a "pre-enabled" client state. During the first reset we should be clearing the bit, allowing all following resets to notify the client of the reset when the bit is not set. This patch fixes the issue by negating the 'test_and_clear_bit' check to accurately reflect the behavior we want. Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 12:46:36 -07:00
Alan Brady	41d0a4d0c8	i40e: fix handling of vf_states variable Currently we inappropriately clear the vf_states variable with a null assignment. This is problematic because we should be using atomic bitops on this variable and we don't actually want to clear all the flags. We should just clear the ones we know we want to clear. Additionally remove the I40E_VF_STATE_FCOEENA bit because it is no longer being used. Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 12:46:35 -07:00
Mitch Williams	1b7b7596ae	i40e: make i40evf_map_rings_to_vectors void This function cannot fail, so why is it returning a value? And why are we checking it? Why shouldn't we just make it void? Why is this commit message made up of only questions? Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 12:46:35 -07:00
Alan Brady	5b36e8d04b	i40evf: Enable VF to request an alternate queue allocation Currently the VF gets a default number of allocated queues from HW on init and it could choose to enable or disable those allocated queues. This makes it such that the VF can request more or less underlying allocated queues from the PF. First the VF negotiates the number of queues it wants that can be supported by the PF and if successful asks for a reset. During reset the PF will reallocate the HW queues for the VF and will then remap the new queues. Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 12:46:35 -07:00
Jacob Keller	d43d60e5eb	i40e: ensure reset occurs when disabling VF It is possible although rare that we may not reset when i40e_vc_disable_vf() is called. This can lead to some weird circumstances with some values not being properly set. Modify i40e_reset_vf() to return a code indicating whether it reset or not. Now, i40e_vc_disable_vf() can wait until a reset actually occurs. If it fails to free up within a reasonable time frame we'll display a warning message. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 12:46:35 -07:00
Jacob Keller	f18d20218a	i40e: make use of i40e_vc_disable_vf Replace i40e_vc_notify_vf_reset and i40e_reset_vf with a call to i40e_vc_disable_vf which does this exact thing. This matches similar code patterns throughout the driver. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 12:46:35 -07:00
Jacob Keller	eeeddbb806	i40e: drop i40e_pf *pf from i40e_vc_disable_vf() It's never used, and the vf structure could get back to the PF if necessary. Let's just drop the extra unneeded parameter. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 12:46:35 -07:00
Jacob Keller	ba4e003d29	i40e: don't hold spinlock while resetting VF When we refactored handling of the PVID in commit `9af52f60b2` ("i40e: use (add|rm)_vlan_all_mac helper functions when changing PVID") we introduced a scheduling while atomic regression. This occurred because we now held the spinlock across a call to i40e_reset_vf(), which results in a usleep_range() call that triggers a scheduling while atomic bug. This was rare as it only occurred if the user configured a VLAN on a VF and also attempted to reconfigure the VF from the host system with a port VLAN. We do need to hold the lock while calling i40e_is_vsi_in_vlan(), but we should not be holding it while we reset the VF. We'll fix this by introducing a separate helper function i40e_vsi_has_vlans which checks whether we have a PVID and whether the VSI has configured VLANs. This helper function will manage its own need for the mac_filter_hash_lock. Then, we can move the acquiring of the spinlock until after we reset the VF, which ensures that we do not sleep while holding the lock. Using a separate function like this makes the code more clear and is easier to read than attempting to release and re-acquire the spinlock when we reset the VF. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 12:46:35 -07:00
Mariusz Stachura	00f6c2f5e2	i40e: use admin queue for setting LEDs behavior Instead of accessing register directly, use newly added AQC in order to blink LEDs. Introduce and utilize a new flag to prevent excessive API version checking. Signed-off-by: Mariusz Stachura <mariusz.stachura@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 12:46:35 -07:00
Filip Sadowski	9c0e5caf63	i40e: Add support for 'ethtool -m' This patch adds support for 'ethtool -m' command which displays information about (Q)SFP+ module plugged into NIC's cage. Signed-off-by: Filip Sadowski <filip.sadowski@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 12:46:35 -07:00
Filip Sadowski	d60bcc7980	i40e: Fix reporting of supported link modes This patch fixes incorrect reporting of supported link modes on some NICs. Signed-off-by: Filip Sadowski <filip.sadowski@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 12:46:35 -07:00
Christophe JAILLET	54902349ee	i40e: Fix a potential NULL pointer dereference If 'kzalloc()' fails, a NULL pointer will be dereferenced. Return an error code (-ENOMEM) instead. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 12:46:35 -07:00
Lihong Yang	5872866e16	i40e: remove logically dead code This patch removes the !vf condition check that cannot be true in i40e_ndo_set_vf_trust function Detected by CoverityScan, CID 1397531 Logically dead code Signed-off-by: Lihong Yang <lihong.yang@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 12:46:35 -07:00
Shannon Nelson	e50d5751c8	i40e: limit lan queue count in large CPU count machine When a machine has more CPUs than queue pairs, e.g. 512 cores, the counting gets a little funky and turns off Flow Director with the message: not enough queues for Flow Director. Flow Director feature is disabled This patch limits the number of lan queues initially allocated to be sure we have some left for FD and other features. Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 12:46:34 -07:00
David S. Miller	d9601be13c	Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 100GbE Intel Wired LAN Driver Updates 2017-10-02 This series contains updates to fm10k only. Jake provides all but one of the changes in this series. Most are small fixes, starting with ensuring prompt transmission of messages queued up after each VF message is received and handled. Fix a possible race condition between the watchdog task and the processing of mailbox messages by just checking whether the mailbox is still open. Fix a couple of GCC v7 warnings, including misspelled "fall through" comments and warnings about possible truncation of calls to snprintf(). Cleaned up a convoluted bitshift and read for the PFVFLRE register. Fixed a potential divide by zero when finding the proper r_idx. Markus Elfring fixes an issue which was found using Coccinelle, where we should have been using seq_putc() instead of seq_puts(). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-02 11:59:34 -07:00
Amir Levy	e69b6c02b4	net: Add support for networking over Thunderbolt cable ThunderboltIP is a protocol created by Apple to tunnel IP/ethernet traffic over a Thunderbolt cable. The protocol consists of configuration phase where each side sends ThunderboltIP login packets (the protocol is determined by UUID in the XDomain packet header) over the configuration channel. Once both sides get positive acknowledgment to their login packet, they configure high-speed DMA path accordingly. This DMA path is then used to transmit and receive networking traffic. This patch creates a virtual ethernet interface the host software can use in the same way as any other networking interface. Once the interface is brought up successfully network packets get tunneled over the Thunderbolt cable to the remote host and back. The connection is terminated by sending a ThunderboltIP logout packet over the configuration channel. We do this when the network interface is brought down by user or the driver is unloaded. Signed-off-by: Amir Levy <amir.jer.levy@intel.com> Signed-off-by: Michael Jamet <michael.jamet@intel.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Yehezkel Bernat <yehezkel.bernat@intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-02 11:24:42 -07:00
Petr Machata	85f44a15b1	mlxsw: spectrum_router: Drop a redundant condition Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-02 11:20:22 -07:00
Petr Machata	7ff176f81d	mlxsw: spectrum_router: Fix a typo Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-02 11:20:22 -07:00
Petr Machata	de0f43c01a	mlxsw: spectrum_router: Track RIF of IPIP next hops When considering whether to set RTNH_F_OFFLOAD flag on an IPv6 route, mlxsw_sp_fib6_entry_offload_set() looks up the mlxsw_sp_nexthop corresponding to a given route, and decides based on whether the next hop's offloaded flag was set. When looking for the matching next hop, it also takes into account the device of the route, which must match next hop's RIF. IPIP next hops however hitherto didn't set the RIF. As a result, IPv6 routes forwarding traffic to IP-in-IP netdevices are never marked as offloaded, even when they actually are. Thus track RIF of IPIP next hops the same way as that of ETHERNET next hops. Fixes: `8f28a30976` ("mlxsw: spectrum_router: Support IPv6 overlay encap") Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-02 11:18:57 -07:00
Petr Machata	28a04c7b7b	mlxsw: spectrum_router: Move VRF refcounting When creating a new RIF, bumping RIF count of the containing VR is the last thing to be done. Symmetrically, when destroying a RIF, RIF count is first dropped and only then the rest of the cleanup proceeds. That's a problem for loopback RIFs. Those hold two VR references: one for overlay and one for underlay. mlxsw_sp_rif_destroy() releases the overlay one, and the deconfigure() callback the underlay one. But if both overlay and underlay are the same, and if there are no other artifacts holding the VR alive, this put actually destroys the VR. Later on, when mlxsw_sp_rif_destroy() calls mlxsw_sp_vr_put() for the same VR, the VR will already have been released and the kernel crashes with NULL pointer dereference. The underlying problem is that the RIF under destruction ends up referencing the overlay VR much longer than it claims: all the way until the call to mlxsw_sp_vr_put(). So line up the reference counting properly to reflect this. Make corresponding changes in mlxsw_sp_rif_create() as well for symmetry. Fixes: `6ddb7426a7` ("mlxsw: spectrum_router: Introduce loopback RIFs") Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-02 11:18:57 -07:00
Jacob Keller	04914390f5	fm10k: prevent race condition of __FM10K_SERVICE_SCHED Although very unlikely, it is possible that cancel_work_sync() may stop the service_task before it actually started. In this case, the __FM10K_SERVICE_SCHED bit will never be cleared. This results in the service task being unable to reschedule in the future. Add a helper function which sets the service disable bit, waits for the service task to stop and clears the schedule bit, thus avoiding the race condition. We know the schedule bit is safe to clear because the cancel_work_sync() guarantees the service task is not running. Add a helper function also to restart the service task, for symmetry. This is not strictly needed but helps the mental model of how to stop and start the service task. This race could only happen in fm10k_suspend/fm10k_resume as this is the only place where the service task is actually restarted. Thus, suspend/resume testing would be ideal. However, note that the chance of this happening is very slim as the service event is scheduled for immediate execution, and you would have to trigger a suspend at almost the exact same time as the service task was scheduled. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 08:10:54 -07:00
Jacob Keller	65b0a469e9	fm10k: move fm10k_prepare_for_reset and fm10k_handle_reset A future patch needs these functions defined earlier in the file. Move them above where they will be called. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 08:09:18 -07:00
Jacob Keller	dd5eede2b7	fm10k: avoid divide by zero in rare cases when device is resetting It is possible that under rare circumstances the device is undergoing a reset, such as when a PFLR occurs, and the device may be transmitting simultaneously. In this case, we might attempt to divide by zero when finding the proper r_idx. Instead, let's read the num_tx_queues once, and make sure it's non-zero. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 08:07:57 -07:00
Jacob Keller	d876c1583b	fm10k: don't loop while resetting VFs due to VFLR event We've always had a really weird looping construction for resetting VFs. We read the VFLRE register and reset the VF if the corresponding bit is set, which makes sense. However we loop continuously until we no longer have any bits left set. At first this makes sense, as a sort of "keep trying until we succeed" concept. Unfortunately this causes a problem if we happen to surprise remove while this code is executing, because in this case we'll always read all 1s for the VFLRE register. This results in a hard lockup on the CPU because the loop will never terminate. Because our own reset function will always clear the VFLR event register (except when we've lost PCIe link, obviously), there is no real reason to loop. In practice, we'll loop over once and find that no VFs are pending anymore. Let's just check once. Since we clear the notification when we reset, there's no benefit to the loop. Additionally, there shouldn't be a race as future VFLRE events should trigger an interrupt. Additionally, we didn't warn or do anything in the looped case anyways. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 08:06:30 -07:00
Jacob Keller	4abf01b43b	fm10k: simplify reading PFVFLRE register We're doing a really convoluted bitshift and read for the PFVFLRE register. Just reading the PFVFLRE(1), shifting it by 32, then reading PFVFLRE(0) should be sufficient. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 08:04:57 -07:00
Jacob Keller	8bac58be17	fm10k: avoid needless delay when loading driver When we load the driver, we set the last_reset to be in the future, which delays the initial driver reset. Additionally, the service task isn't scheduled to run automatically until the timer runs out. This causes a needless delay of the first reset to begin talking to the switch manager. We can avoid this by simply not setting last_reset and immediately scheduling the service task while in probe. This allows the device to wake up faster, and avoids this delay. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 07:57:42 -07:00
Jacob Keller	523a0b558d	fm10k: add missing fall through comment Newer versions of GCC starting with 7 now additionally warn when a case statement may fall through without an explicit comment mentioning it. Add such a comment to silence the warning, as this is expected. Unfortunately the comment must come directly before the next case statement, so we put it outside the #ifdef. Otherwise, the compiler cannot properly detect it and thus the warning is displayed regardless. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 07:54:00 -07:00
Jacob Keller	b94dd008c4	fm10k: avoid possible truncation of q_vector->name New versions of GCC since version 7 began warning about possible truncation of calls to snprintf. We can fix this and avoid false positives. First, we should pass the full buffer size to snprintf, because it guarantees a NULL character as part of its passed length, so passing len-1 is simply wasting a byte of possible storage. Second, if we make the ri and ti variables unsigned, the compiler is able to correctly reason that the value never gets larger than 256, so it doesn't need to warn about the full space required to print a signed integer. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 07:46:57 -07:00
Jacob Keller	375ce90eab	fm10k: fix typos on fall through comments Newer versions of GCC since version 7 now warn when a case statement may fall through without an explicit comment. "Fallthough" does not count as it is misspelled. Fix the typos for these comments to appease the new warnings. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 07:42:15 -07:00
Jacob Keller	5c66d1251d	fm10k: stop spurious link down messages when Tx FIFO is full In fm10k_get_host_state_generic, we check the mailbox tx_read() function to ensure that the mailbox is still open. This function also checks to make sure we have space to transmit another message. Unfortunately, if we just recently sent a bunch of messages (such as enabling hundreds of VLANs on a VF) this can result in a race where the watchdog task thinks the link went down just because we haven't had time to process all these messages yet. Instead, let's just check whether the mailbox is still open. This ensures that we don't race with the Tx FIFO, and we only link down once the mailbox is not open. This is safe, because if the FIFO fills up and we're unable to send a message for too long, we'll end up triggering the timeout detection which results in a reset. Additionally, since we still check to ensure the mailbox state is OPEN, we'll transition to link down whenever the mailbox closes as well. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 07:40:31 -07:00
Markus Elfring	95f49d4bde	fm10k: Use seq_putc() in fm10k_dbg_desc_break() Two single characters should be put into a sequence. Thus use the corresponding function "seq_putc". This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 07:28:57 -07:00
Jacob Keller	b52b7f7059	fm10k: reschedule service event if we stall the PF<->SM mailbox When we are handling PF<->VF mailbox messages, it is possible that the VF will send us so many messages that the PF<->SM FIFO will fill up. In this case, we stop the loop and wait until the service event is rescheduled. Normally this should happen due to an interrupt. But it is possible that we don't get another interrupt for a while and it isn't until the service timer actually reschedules us. Instead, simply reschedule immediately which will cause the service event to be run again as soon as we exit. This ensures that we promptly handle all of the PF<->VF messages with minimal delay, while still giving time for the SM mailbox to drain. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 07:25:47 -07:00
Jacob Keller	17a9180994	fm10k: ensure we process SM mbx when processing VF mbx When we process VF mailboxes, the driver is likely going to also queue up messages to the switch manager. This process merely queues up the FIFO, but doesn't actually begin the transmission process. Because we hold the mailbox lock during this VF processing, the PF<->SM mailbox is not getting processed at this time. Ensure that we actually process the PF<->SM mailbox in between each PF<->VF mailbox. This should ensure prompt transmission of the messages queued up after each VF message is received and handled. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 07:24:48 -07:00
Colin Ian King	45bfbc013b	mlxsw: spectrum: fix uninitialized value in err In the unlikely event that mfc->mfc_un.res.ttls[i] is 255 for all values of i from 0 to MAXIVS-1, the err is not set at all and hence has a garbage value on the error return at the end of the function, so initialize it to 0. Also, the error return check on err and goto to err: inside the for loop makes it impossible for err to be zero at the end of the for loop, so we can remove the redundant err check at the end of the loop. Detected by CoverityScan CID#1457207 ("Unitialized scalar value") Fixes: `c011ec1bbf` ("mlxsw: spectrum: Add the multicast routing offloading logic") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-01 23:05:54 -07:00

71280 commits