redonkable/alistair23-linux

Author	SHA1	Message	Date
Wei Wang	e873e4b9cc	ipv6: use fib6_info_hold_safe() when necessary In the code path where only rcu read lock is held, e.g. in the route lookup code path, it is not safe to directly call fib6_info_hold() because the fib6_info may already have been deleted but still exists in the rcu grace period. Holding reference to it could cause double free and crash the kernel. This patch adds a new function fib6_info_hold_safe() and replace fib6_info_hold() in all necessary places. Syzbot reported 3 crash traces because of this. One of them is: 8021q: adding VLAN 0 to HW filter on device team0 IPv6: ADDRCONF(NETDEV_CHANGE): team0: link becomes ready dst_release: dst:(____ptrval____) refcnt:-1 dst_release: dst:(____ptrval____) refcnt:-2 WARNING: CPU: 1 PID: 4845 at include/net/dst.h:239 dst_hold include/net/dst.h:239 [inline] WARNING: CPU: 1 PID: 4845 at include/net/dst.h:239 ip6_setup_cork+0xd66/0x1830 net/ipv6/ip6_output.c:1204 dst_release: dst:(____ptrval____) refcnt:-1 Kernel panic - not syncing: panic_on_warn set ... 
CPU: 1 PID: 4845 Comm: syz-executor493 Not tainted 4.18.0-rc3+ #10 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113 panic+0x238/0x4e7 kernel/panic.c:184 dst_release: dst:(____ptrval____) refcnt:-2 dst_release: dst:(____ptrval____) refcnt:-3 __warn.cold.8+0x163/0x1ba kernel/panic.c:536 dst_release: dst:(____ptrval____) refcnt:-4 report_bug+0x252/0x2d0 lib/bug.c:186 fixup_bug arch/x86/kernel/traps.c:178 [inline] do_error_trap+0x1fc/0x4d0 arch/x86/kernel/traps.c:296 dst_release: dst:(____ptrval____) refcnt:-5 do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:316 invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:992 RIP: 0010:dst_hold include/net/dst.h:239 [inline] RIP: 0010:ip6_setup_cork+0xd66/0x1830 net/ipv6/ip6_output.c:1204 Code: c1 ed 03 89 9d 18 ff ff ff 48 b8 00 00 00 00 00 fc ff df 41 c6 44 05 00 f8 e9 2d 01 00 00 4c 8b a5 c8 fe ff ff e8 1a f6 e6 fa <0f> 0b e9 6a fc ff ff e8 0e f6 e6 fa 48 8b 85 d0 fe ff ff 48 8d 78 RSP: 0018:ffff8801a8fcf178 EFLAGS: 00010293 RAX: ffff8801a8eba5c0 RBX: 0000000000000000 RCX: ffffffff869511e6 RDX: 0000000000000000 RSI: ffffffff869515b6 RDI: 0000000000000005 RBP: ffff8801a8fcf2c8 R08: ffff8801a8eba5c0 R09: ffffed0035ac8338 R10: ffffed0035ac8338 R11: ffff8801ad6419c3 R12: ffff8801a8fcf720 R13: ffff8801a8fcf6a0 R14: ffff8801ad6419c0 R15: ffff8801ad641980 ip6_make_skb+0x2c8/0x600 net/ipv6/ip6_output.c:1768 udpv6_sendmsg+0x2c90/0x35f0 net/ipv6/udp.c:1376 inet_sendmsg+0x1a1/0x690 net/ipv4/af_inet.c:798 sock_sendmsg_nosec net/socket.c:641 [inline] sock_sendmsg+0xd5/0x120 net/socket.c:651 ___sys_sendmsg+0x51d/0x930 net/socket.c:2125 __sys_sendmmsg+0x240/0x6f0 net/socket.c:2220 __do_sys_sendmmsg net/socket.c:2249 [inline] __se_sys_sendmmsg net/socket.c:2246 [inline] __x64_sys_sendmmsg+0x9d/0x100 net/socket.c:2246 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe 
RIP: 0033:0x446ba9 Code: e8 cc bb 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 eb 08 fc ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007fb39a469da8 EFLAGS: 00000246 ORIG_RAX: 0000000000000133 RAX: ffffffffffffffda RBX: 00000000006dcc54 RCX: 0000000000446ba9 RDX: 00000000000000b8 RSI: 0000000020001b00 RDI: 0000000000000003 RBP: 00000000006dcc50 R08: 00007fb39a46a700 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 45c828efc7a64843 R13: e6eeb815b9d8a477 R14: 5068caf6f713c6fc R15: 0000000000000001 Dumping ftrace buffer: (ftrace buffer empty) Kernel Offset: disabled Rebooting in 86400 seconds.. Fixes: `93531c6743` ("net/ipv6: separate handling of FIB entries from dst based routes") Reported-by: syzbot+902e2a1bcd4f7808cef5@syzkaller.appspotmail.com Reported-by: syzbot+8ae62d67f647abeeceb9@syzkaller.appspotmail.com Reported-by: syzbot+3f08feb14086930677d0@syzkaller.appspotmail.com Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 11:19:02 -07:00
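The commit above describes taking a reference only when the object has not already dropped to zero references. A minimal userspace sketch of that "hold unless zero" idea follows, using C11 atomics. The names `struct entry`, `entry_hold_safe()` and `entry_put()` are hypothetical stand-ins, not the kernel's `fib6_info` API, and the sketch does not model RCU grace periods.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted object whose memory is freed later. */
struct entry {
    atomic_int refcnt;
};

/* Takes a reference only while the count is still non-zero. */
bool entry_hold_safe(struct entry *e)
{
    int old = atomic_load(&e->refcnt);

    while (old != 0) {
        /* On failure, old is refreshed with the current value and we retry. */
        if (atomic_compare_exchange_weak(&e->refcnt, &old, old + 1))
            return true;
    }
    return false; /* the last reference is gone; the caller must not use e */
}

/* Drops a reference; returns true when the caller dropped the last one. */
bool entry_put(struct entry *e)
{
    int old = atomic_fetch_sub(&e->refcnt, 1);

    assert(old > 0); /* a put without a matching hold would go negative */
    return old == 1;
}
```

A plain unconditional increment would move a count of 0 back to 1, after which a second release can free the object twice, which is the failure mode the commit message describes.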
YueHaibing	fd800f6464	wan/fsl_ucc_hdlc: use IS_ERR_VALUE() to check return value of qe_muram_alloc qe_muram_alloc() returns an unsigned long, which should not be compared with zero to detect failure. Check it using IS_ERR_VALUE() instead. Fixes: `c19b6d246a` ("drivers/net: support hdlc function for QE-UCC") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 11:07:10 -07:00
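To illustrate why a "less than zero" test cannot work on an unsigned return value, here is a small userspace sketch. `sketch_is_err_value()`, `sketch_alloc()` and `sketch_setup()` are hypothetical names; the range check mirrors the usual kernel convention that the top 4095 values of `unsigned long` encode negative errno codes, which is assumed here rather than taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define SKETCH_MAX_ERRNO 4095UL

/* True when v lies in the top SKETCH_MAX_ERRNO values of unsigned long. */
bool sketch_is_err_value(unsigned long v)
{
    return v >= (unsigned long)-SKETCH_MAX_ERRNO;
}

/* Hypothetical allocator: returns an offset, or -ENOMEM cast to unsigned long. */
unsigned long sketch_alloc(bool fail)
{
    return fail ? (unsigned long)-ENOMEM : 0x100UL;
}

/* Caller pattern: returns 0 on success, a negative errno on failure. */
int sketch_setup(bool fail)
{
    unsigned long off = sketch_alloc(fail);

    if (sketch_is_err_value(off))
        return -ENOMEM;
    assert(off < (unsigned long)-SKETCH_MAX_ERRNO); /* a usable offset */
    return 0;
}
```

Because the error code is stored in an unsigned type, it shows up as a very large positive number, so only a range check of this kind can detect it.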
David S. Miller	5302a84e37	linux-can-fixes-for-4.18-20180723 Merge tag 'linux-can-fixes-for-4.18-20180723' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2018-07-23 this is a pull request of 12 patches for net/master. The patch by Stephane Grosjean for the peak_canfd CAN driver fixes a problem with older firmware. The next patch is by Roman Fietze and fixes the setup of the CCCR register in the m_can driver. Nicholas Mc Guire's patch for the mpc5xxx_can driver adds missing error checking. The two patches by Faiz Abbas fix the runtime resume and clean up the probe function in the m_can driver. The last 7 patches by Anssi Hannula fix several problems in the xilinx_can driver. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 11:02:21 -07:00
David S. Miller	6a525818b1	Merge branch 'smc-next' Ursula Braun says: ==================== net/smc: patches 2018-07-23 here are some small patches for SMC: Just the first patch contains a functional change. It makes it possible to distinguish between the SMCR and SMCD modes on s390 when monitoring SMC sockets. The remaining patches are cleanups without functional changes. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 10:57:14 -07:00
Ursula Braun	48bf523177	net/smc: remove local variable page in smc_rx_splice() The page map address is already stored in the RMB descriptor. There is no need to derive it from the cpu_addr value. Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 10:57:14 -07:00
Ursula Braun	144ce4b9b5	net/smc: use DECLARE_BITMAP for rtokens_used_mask Link group field rtokens_used_mask is a bitmap. Use macro DECLARE_BITMAP for its definition. Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 10:57:14 -07:00
Stefan Raspl	00e5fb263f	net/smc: add function to get link group from link Replace a frequently used construct with a more readable variant, reducing the code. It might also come in handy when we start to support more than a single link per link group. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 10:57:14 -07:00
Stefan Raspl	bac6de7b63	net/smc: eliminate cursor read and write calls The functions to read and write cursors are exclusively used to copy cursors. Therefore switch to a respective function instead. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 10:57:14 -07:00
Karsten Graul	c601171d7a	net/smc: provide smc mode in smc_diag.c Rename field diag_fallback into diag_mode and set the smc mode of a connection explicitly. Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 10:57:14 -07:00
Petr Machata	9a2ad36238	selftests: forwarding: gre_multipath: Drop IPv6 tests Support for device-only IPv6 multipath next hops was dropped in commit `33bd5ac54d` ("net/ipv6: Revert attempt to simplify route replace and append") and as of commit `b5d2d75e07` ("net/ipv6: Do not allow device only routes via the multipath API"), attempts to add a next hop like that yield an explicit diagnostic. Correspondingly, drop the IPv6 parts of GRE multipath test that are supposed to test that code. Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 09:46:52 -07:00
YueHaibing	7fa41efac1	ipv6: sr: Use kmemdup instead of duplicating it in parse_nla_srh Replace calls to kmalloc followed by a memcpy with a direct call to kmemdup. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 09:39:07 -07:00
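The replacement described above folds an allocate-then-copy pair into one call. A userspace analogue is sketched below; `sketch_memdup()` is a hypothetical helper built on `malloc()` and `memcpy()`, not the kernel's `kmemdup()`, and it takes no allocation flags.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Allocates len bytes and copies src into them; returns NULL on failure. */
void *sketch_memdup(const void *src, size_t len)
{
    void *p;

    assert(src != NULL);
    p = malloc(len);
    if (p)
        memcpy(p, src, len);
    return p;
}
```

The caller still has to check for NULL and free the copy, but it can no longer forget the copy step or pass mismatched lengths to the two calls.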
David S. Miller	f8b2990fd9	Merge branch 'net-bridge-add-support-for-backup-port' Nikolay Aleksandrov says: ==================== net: bridge: add support for backup port This set introduces a new bridge port option that allows any port to have any other port (in the same bridge of course) as its backup and traffic will be forwarded to the backup port when the primary goes down. This is mainly used in MLAG and EVPN setups where we have a peerlink path which is a backup of many (or even all) ports and is a participating bridge port itself. There's more detailed information in patch 02. Patch 01 just prepares the port sysfs code for options that take a raw value. The main issues that this set solves are scalability and fallback latency. We have used similar code for over 6 months now to bring the fallback latency of the backup peerlink down and avoid fdb notification storms. Also due to the nature of master devices such a setup is currently not possible, and last but not least having tens of thousands of fdbs requires thousands of calls to switch. I've also CCed our MLAG experts that have been using a similar option. Roopa also adds: "Two switches acting in a MLAG pair are connected by the peerlink interface which is a bridge port. the config on one of the switches looks like the below. The other switch also has a similar config. eth0 is connected to one port on the server. And the server is connected to both switches. br0 -- team0---eth0 | -- switch-peerlink switch-peerlink becomes the failover/backup port when say team0 to the server goes down. Today, when team0 goes down, control plane has to withdraw all the fdb entries pointing to team0 and re-install the fdb entries pointing to switch-peerlink...and restore the fdb entries when team0 comes back up again. and this is the problem we are trying to solve.
This also becomes necessary when multihoming is implemented by a standard like E-VPN https://tools.ietf.org/html/rfc8365#section-8 where the 'switch-peerlink' is an overlay vxlan port (like nikolay mentions in his patch commit). In these implementations, the fdb scale can be much larger. On why bond failover cannot be used here: the point that nikolay was alluding to is, switch-peerlink in the above example is a bridge port and is a failover/backup port for more than one or all ports in the bridge br0. And you cannot enslave switch-peerlink into a second level team with other bridge ports. Hence a multi layered team device is not an option (FWIW, switch-peerlink is also a teamed interface to the peer switch)." v3: Added Roopa's explanation and diagram v2: In patch 01 use kstrdup/kfree to avoid casting the const buf. In order to avoid using GFP_ATOMIC or always allocating I kept the spinlock inside each branch. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 09:32:15 -07:00
Nikolay Aleksandrov	2756f68c31	net: bridge: add support for backup port This patch adds a new port attribute - IFLA_BRPORT_BACKUP_PORT, which allows setting a backup port to be used for known unicast traffic if the port has gone carrier down. The backup pointer is rcu protected and set only under RTNL, a counter is maintained so when deleting a port we know how many other ports reference it as a backup and we remove it from all. Also the pointer is in the first cache line which is hot at the time of the check and thus in the common case we only add one more test. The backup port will be used only for the non-flooding case since it's a part of the bridge and the flooded packets will be forwarded to it anyway. To remove the forwarding just send a 0/non-existing backup port. This is used to avoid numerous scalability problems when using MLAG most notably if we have thousands of fdbs one would need to change all of them on port carrier going down which takes too long and causes a storm of fdb notifications (and again when the port comes back up). In a Multi-chassis Link Aggregation setup usually hosts are connected to two different switches which act as a single logical switch. Those switches usually have a control and backup link between them called peerlink which might be used for communication in case a host loses connectivity to one of them. We need a fast way to failover in case a host port goes down and currently none of the solutions (like bond) can fulfill the requirements because the participating ports are actually the "master" devices and must have the same peerlink as their backup interface and at the same time all of them must participate in the bridge device. As Roopa noted it's normal practice in routing called fast re-route where a precalculated backup path is used when the main one is down. Another use case of this is with EVPN, having a single vxlan device which is backup of every port. 
Due to the nature of master devices it's not currently possible to use one device as a backup for many and still have all of them participate in the bridge (which is master itself). More detailed information about MLAG is available at the link below. https://docs.cumulusnetworks.com/display/DOCS/Multi-Chassis+Link+Aggregation+-+MLAG Further explanation and a diagram by Roopa: Two switches acting in a MLAG pair are connected by the peerlink interface which is a bridge port. the config on one of the switches looks like the below. The other switch also has a similar config. eth0 is connected to one port on the server. And the server is connected to both switches. br0 -- team0---eth0 | -- switch-peerlink Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 09:32:15 -07:00
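The forwarding decision described in the backup port commit can be sketched as a single extra test on the destination port. This is an illustrative userspace model only: `struct sk_port` and `sk_pick_egress()` are hypothetical names, and the RCU protection and reference counter mentioned in the commit message are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified bridge port. */
struct sk_port {
    bool carrier_ok;
    struct sk_port *backup; /* NULL when no backup port is configured */
};

/* Picks the egress port for known unicast traffic destined to 'to'. */
struct sk_port *sk_pick_egress(struct sk_port *to)
{
    assert(to != NULL);
    /* Redirect only when the primary has lost carrier and a backup exists. */
    if (!to->carrier_ok && to->backup)
        return to->backup;
    return to;
}
```

With this shape, no forwarding database entry has to be rewritten when a port loses carrier; only the per-packet port choice changes.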
Nikolay Aleksandrov	a5f3ea54f3	net: bridge: add support for raw sysfs port options This patch adds a new alternative store callback for port sysfs options which takes a raw value (buf) and can use it directly. It is needed for the backup port sysfs support since we have to pass the device by its name. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 09:32:15 -07:00
Anssi Hannula	8ebd83bdb0	can: xilinx_can: fix power management handling There are several issues with the suspend/resume handling code of the driver: - The device is attached and detached in the runtime_suspend() and runtime_resume() callbacks if the interface is running. However, during xcan_chip_start() the interface is considered running, causing the resume handler to incorrectly call netif_start_queue() at the beginning of xcan_chip_start(), and on xcan_chip_start() error return the suspend handler detaches the device leaving the user unable to bring-up the device anymore. - The device is not brought properly up on system resume. A reset is done and the code tries to determine the bus state after that. However, after reset the device is always in Configuration mode (down), so the state checking code does not make sense and communication will also not work. - The suspend callback tries to set the device to sleep mode (low-power mode which monitors the bus and brings the device back to normal mode on activity), but then immediately disables the clocks (possibly before the device reaches the sleep mode), which does not make sense to me. If a clean shutdown is wanted before disabling clocks, we can just bring it down completely instead of only sleep mode. Reorganize the PM code so that only the clock logic remains in the runtime PM callbacks and the system PM callbacks contain the device bring-up/down logic. This makes calling the runtime PM callbacks during e.g. xcan_chip_start() safe. The system PM callbacks now simply call common code to start/stop the HW if the interface was running, replacing the broken code from before. xcan_chip_stop() is updated to use the common reset code so that it will wait for the reset to complete. Reset also disables all interrupts so do not do that separately. Also, the device_may_wakeup() checks are removed as the driver does not have wakeup support. Tested on Zynq-7000 integrated CAN. 
Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi> Cc: Michal Simek <michal.simek@xilinx.com> Cc: <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2018-07-23 14:34:46 +02:00
Anssi Hannula	2f4f0f338c	can: xilinx_can: fix incorrect clear of non-processed interrupts xcan_interrupt() clears ERROR|RXOFLW|BSOFF|ARBLST interrupts if any of them is asserted. This does not take into account that some of them could have been asserted between interrupt status read and interrupt clear, therefore clearing them without handling them. Fix the code to only clear those interrupts that it knows are asserted and therefore going to be processed in xcan_err_interrupt(). Fixes: `b1201e44f5` ("can: xilinx CAN controller support") Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi> Cc: Michal Simek <michal.simek@xilinx.com> Cc: <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2018-07-23 14:34:46 +02:00
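The race described above can be reproduced with a simulated status register. In this sketch the bit positions, `struct fake_dev` and both handlers are hypothetical and do not reflect the real xilinx_can register layout; only the difference between clearing a fixed group and clearing the observed bits is the point.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit layout, for illustration only. */
#define IRQ_ERROR   (1u << 0)
#define IRQ_RXOFLW  (1u << 1)
#define IRQ_BSOFF   (1u << 2)
#define IRQ_ARBLST  (1u << 3)
#define IRQ_ERR_ALL (IRQ_ERROR | IRQ_RXOFLW | IRQ_BSOFF | IRQ_ARBLST)

/* Simulated interrupt status register. */
struct fake_dev {
    uint32_t isr;
};

static void fake_clear(struct fake_dev *d, uint32_t mask)
{
    assert(d != 0);
    d->isr &= ~mask;
}

/* Buggy: clears the whole error group, including bits never observed. */
uint32_t handle_buggy(struct fake_dev *d, uint32_t snapshot)
{
    if (snapshot & IRQ_ERR_ALL)
        fake_clear(d, IRQ_ERR_ALL);
    return snapshot & IRQ_ERR_ALL; /* the bits that actually get handled */
}

/* Fixed: clears only the bits present in the snapshot being handled. */
uint32_t handle_fixed(struct fake_dev *d, uint32_t snapshot)
{
    uint32_t pending = snapshot & IRQ_ERR_ALL;

    if (pending)
        fake_clear(d, pending);
    return pending;
}
```

If a bus-off bit is raised after the snapshot is taken, the buggy handler wipes it without handling it, while the fixed handler leaves it pending for the next pass.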
Anssi Hannula	8399799725	can: xilinx_can: fix RX overflow interrupt not being enabled RX overflow interrupt (RXOFLW) is disabled even though xcan_interrupt() processes it. This means that an RX overflow interrupt will only be processed when another interrupt gets asserted (e.g. for RX/TX). Fix that by enabling the RXOFLW interrupt. Fixes: `b1201e44f5` ("can: xilinx CAN controller support") Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi> Cc: Michal Simek <michal.simek@xilinx.com> Cc: <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2018-07-23 14:34:46 +02:00
Anssi Hannula	620050d9c2	can: xilinx_can: keep only 1-2 frames in TX FIFO to fix TX accounting The xilinx_can driver assumes that the TXOK interrupt only clears after it has been acknowledged as many times as there have been successfully sent frames. However, the documentation does not mention such behavior, instead saying just that the interrupt is cleared when the clear bit is set. Similarly, testing seems to also suggest that it is immediately cleared regardless of the amount of frames having been sent. Performing some heavy TX load and then going back to idle has the tx_head drifting further away from tx_tail over time, steadily reducing the amount of frames the driver keeps in the TX FIFO (but not to zero, as the TXOK interrupt always frees up space for 1 frame from the driver's perspective, so frames continue to be sent) and delaying the local echo frames. The TX FIFO tracking is also otherwise buggy as it does not account for TX FIFO being cleared after software resets, causing BUG!, TX FIFO full when queue awake! messages to be output. There does not seem to be any way to accurately track the state of the TX FIFO for local echo support while using the full TX FIFO. The Zynq version of the HW (but not the soft-AXI version) has watermark programming support and with it an additional TX-FIFO-empty interrupt bit. Modify the driver to only put 1 frame into TX FIFO at a time on soft-AXI and 2 frames at a time on Zynq. On Zynq the TXFEMP interrupt bit is used to detect whether 1 or 2 frames have been sent at interrupt processing time. Tested with the integrated CAN on Zynq-7000 SoC. The 1-frame-FIFO mode was also tested. An alternative way to solve this would be to drop local echo support but keep using the full TX FIFO. v2: Add FIFO space check before TX queue wake with locking to synchronize with queue stop. This avoids waking the queue when xmit() had just filled it. 
v3: Keep local echo support and reduce the amount of frames in FIFO instead as suggested by Marc Kleine-Budde. Fixes: `b1201e44f5` ("can: xilinx CAN controller support") Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi> Cc: <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2018-07-23 14:34:46 +02:00
Anssi Hannula	877e0b7594	can: xilinx_can: fix recovery from error states not being propagated The xilinx_can driver contains no mechanism for propagating recovery from CAN_STATE_ERROR_WARNING and CAN_STATE_ERROR_PASSIVE. Add such a mechanism by factoring the handling of XCAN_STATE_ERROR_PASSIVE and XCAN_STATE_ERROR_WARNING out of xcan_err_interrupt and checking for recovery after RX and TX if the interface is in one of those states. Tested with the integrated CAN on Zynq-7000 SoC. Fixes: `b1201e44f5` ("can: xilinx CAN controller support") Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi> Cc: <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2018-07-23 14:34:45 +02:00
Anssi Hannula	32852c561b	can: xilinx_can: fix RX loop if RXNEMP is asserted without RXOK If the device gets into a state where RXNEMP (RX FIFO not empty) interrupt is asserted without RXOK (new frame received successfully) interrupt being asserted, xcan_rx_poll() will continue to try to clear RXNEMP without actually reading frames from RX FIFO. If the RX FIFO is not empty, the interrupt will not be cleared and napi_schedule() will just be called again. This situation can occur when: (a) xcan_rx() returns without reading RX FIFO due to an error condition. The code tries to clear both RXOK and RXNEMP but RXNEMP will not clear due to a frame still being in the FIFO. The frame will never be read from the FIFO as RXOK is no longer set. (b) A frame is received between xcan_rx_poll() reading interrupt status and clearing RXOK. RXOK will be cleared, but RXNEMP will again remain set as the new message is still in the FIFO. I'm able to trigger case (b) by flooding the bus with frames under load. There does not seem to be any benefit in using both RXNEMP and RXOK in the way the driver does, and the polling example in the reference manual (UG585 v1.10 18.3.7 Read Messages from RxFIFO) also says that either RXOK or RXNEMP can be used for detecting incoming messages. Fix the issue and simplify the RX processing by only using RXNEMP without RXOK. Tested with the integrated CAN on Zynq-7000 SoC. Fixes: `b1201e44f5` ("can: xilinx CAN controller support") Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi> Cc: <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2018-07-23 14:34:45 +02:00
Anssi Hannula	2574fe5451	can: xilinx_can: fix device dropping off bus on RX overrun The xilinx_can driver performs a software reset when an RX overrun is detected. This causes the device to enter Configuration mode where no messages are received or transmitted. The documentation does not mention any need to perform a reset on an RX overrun, and testing by inducing an RX overflow also indicated that the device continues to work just fine without a reset. Remove the software reset. Tested with the integrated CAN on Zynq-7000 SoC. Fixes: `b1201e44f5` ("can: xilinx CAN controller support") Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi> Cc: <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2018-07-23 14:34:45 +02:00
Faiz Abbas	54e4a0c486	can: m_can: Move accessing of message ram to after clocks are enabled MCAN message ram should only be accessed once clocks are enabled. Therefore, move the call to parse/init the message ram to after clocks are enabled. Signed-off-by: Faiz Abbas <faiz_abbas@ti.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2018-07-23 14:34:45 +02:00
Faiz Abbas	1675bee3e7	can: m_can: Fix runtime resume call pm_runtime_get_sync() returns a 1 if the state of the device is already 'active'. This is not a failure case and should return a success. Therefore fix error handling for pm_runtime_get_sync() call such that it returns success when the value is 1. Also cleanup the TODO for using runtime PM for sleep mode as that is implemented. Signed-off-by: Faiz Abbas <faiz_abbas@ti.com> Cc: <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2018-07-23 14:34:45 +02:00
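The error-handling change described above comes down to testing for a negative value instead of any non-zero value. The sketch below models this with a hypothetical `fake_get_sync()` that simply returns its argument (negative on error, 0 when the device was resumed, 1 when it was already active); it is not the runtime PM API itself.

```c
#include <assert.h>

/* Hypothetical stand-in: <0 on error, 0 when resumed, 1 if already active. */
int fake_get_sync(int result)
{
    assert(result <= 1);
    return result;
}

/* Buggy check: treats the "already active" value 1 as a failure. */
int open_buggy(int result)
{
    int err = fake_get_sync(result);

    if (err)
        return err;
    return 0;
}

/* Fixed check: only negative values are errors. */
int open_fixed(int result)
{
    int err = fake_get_sync(result);

    if (err < 0)
        return err;
    return 0;
}
```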
Nicholas Mc Guire	b5c1a23b17	can: mpc5xxx_can: check of_iomap return before use of_iomap() can return NULL so that return needs to be checked and NULL treated as failure. While at it also take care of the missing of_node_put() in the error path. Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org> Fixes: commit `afa17a500a` ("net/can: add driver for mscan family & mpc52xx_mscan") Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2018-07-23 14:34:45 +02:00
Roman Fietze	393753b217	can: m_can.c: fix setup of CCCR register: clear CCCR NISO bit before checking can.ctrlmode Inside m_can_chip_config(), when setting up the new value of the CCCR, the CCCR_NISO bit is not cleared like the others, CCCR_TEST, CCCR_MON, CCCR_BRSE and CCCR_FDOE, before checking the can.ctrlmode bits for CAN_CTRLMODE_FD_NON_ISO. This way once the controller was configured for CAN_CTRLMODE_FD_NON_ISO, this mode could never be cleared again. This fix is only relevant for controllers with version 3.1.x or 3.2.x. Older versions do not support NISO. Signed-off-by: Roman Fietze <roman.fietze@telemotive.de> Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2018-07-23 14:34:45 +02:00
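The sticky-bit problem in the CCCR setup can be shown with plain bit arithmetic. The bit positions and the names `SK_CCCR_NISO`, `SK_MODE_FD_NON_ISO`, `cccr_buggy()` and `cccr_fixed()` are assumptions made for this sketch; the real register layout and `can.ctrlmode` flags are not reproduced.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit positions, for illustration only. */
#define SK_CCCR_NISO       (1u << 0)
#define SK_MODE_FD_NON_ISO (1u << 0)

/* Buggy: NISO is only ever set, so a previous setting survives forever. */
uint32_t cccr_buggy(uint32_t cccr, uint32_t ctrlmode)
{
    if (ctrlmode & SK_MODE_FD_NON_ISO)
        cccr |= SK_CCCR_NISO;
    return cccr;
}

/* Fixed: clear NISO first, then set it again only when requested. */
uint32_t cccr_fixed(uint32_t cccr, uint32_t ctrlmode)
{
    cccr &= ~SK_CCCR_NISO;
    assert(!(cccr & SK_CCCR_NISO)); /* start from a known-clear bit */
    if (ctrlmode & SK_MODE_FD_NON_ISO)
        cccr |= SK_CCCR_NISO;
    return cccr;
}
```

Starting from a register value that already has the bit set, the buggy variant keeps it even when the mode is no longer requested, which matches the "could never be cleared again" behavior in the commit message.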
Stephane Grosjean	5d4c94ed9f	can: peak_canfd: fix firmware < v3.3.0: limit allocation to 32-bit DMA addr only The DMA logic in firmwares < v3.3.0 embedded in the PCAN-PCIe FD cards family is not capable of handling a mix of 32-bit and 64-bit logical addresses. If the board is equipped with 2 or 4 CAN ports, then such a situation might lead to a PCIe Bus Error "Malformed TLP" packet as well as "irq xx: nobody cared" issue. This patch adds a workaround that requests only 32-bit DMA addresses when these might be allocated outside of the 4 GB area. This issue has been fixed in firmware v3.3.0 and later. Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com> Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2018-07-23 14:34:45 +02:00
YueHaibing	0a78c3803d	net: mediatek: use dma_zalloc_coherent instead of allocator/memset Use dma_zalloc_coherent instead of dma_alloc_coherent followed by memset 0. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-22 20:51:40 -07:00
Linus Torvalds	d72e90f33a	Linux 4.18-rc6	2018-07-22 14:12:20 -07:00
Linus Torvalds	7441308421	NVMe fixes for 4.18-rc6: - fix a regression in 4.18 that causes a memory leak on probe failure (Keith Busch) - fix a deadlock in the passthrough ioctl code (Scott Bauer) - don't enable AENs if not supported (Weiping Zhang) - fix an old regression in metadata handling in the passthrough ioctl code (Roland Dreier) Merge tag 'nvme-for-4.18' of git://git.infradead.org/nvme Pull NVMe fixes from Christoph Hellwig: - fix a regression in 4.18 that causes a memory leak on probe failure (Keith Busch) - fix a deadlock in the passthrough ioctl code (Scott Bauer) - don't enable AENs if not supported (Weiping Zhang) - fix an old regression in metadata handling in the passthrough ioctl code (Roland Dreier) * tag 'nvme-for-4.18' of git://git.infradead.org/nvme: nvme: fix handling of metadata_len for NVME_IOCTL_IO_CMD nvme: don't enable AEN if not supported nvme: ensure forward progress during Admin passthru nvme-pci: fix memory leak on probe failure	2018-07-22 13:21:45 -07:00
Linus Torvalds	165ea0d1c2	Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs fixes from Al Viro: "Fix several places that screw up cleanups after failures halfway through opening a file (one open-coding filp_clone_open() and getting it wrong, two misusing alloc_file()). That part is -stable fodder from the 'work.open' branch. And Christoph's regression fix for uapi breakage in aio series; include/uapi/linux/aio_abi.h shouldn't be pulling in the kernel definition of sigset_t, the reason for doing so in the first place had been bogus - there's no need to expose struct __aio_sigset in aio_abi.h at all" * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: aio: don't expose __aio_sigset in uapi ocxlflash_getfile(): fix double-iput() on alloc_file() failures cxl_getfile(): fix double-iput() on alloc_file() failures drm_mode_create_lease_ioctl(): fix open-coded filp_clone_open()	2018-07-22 12:04:51 -07:00
Al Viro	f88a333b44	alpha: fix osf_wait4() breakage kernel_wait4() expects a userland address for status - it's only rusage that goes as a kernel one (and needs a copyout afterwards) [ Also, fix the prototype of kernel_wait4() to have that __user annotation - Linus ] Fixes: `92ebce5ac5` ("osf_wait4: switch to kernel_wait4()") Cc: stable@kernel.org # v4.13+ Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2018-07-22 11:51:30 -07:00
Randy Dunlap	c9ce1fa1c2	net: prevent ISA drivers from building on PPC32 Prevent drivers from building on PPC32 if they use isa_bus_to_virt(), isa_virt_to_bus(), or isa_page_to_bus(), which are not available and thus cause build errors. ../drivers/net/ethernet/3com/3c515.c: In function 'corkscrew_open': ../drivers/net/ethernet/3com/3c515.c:824:9: error: implicit declaration of function 'isa_virt_to_bus'; did you mean 'virt_to_bus'? [-Werror=implicit-function-declaration] ../drivers/net/ethernet/amd/lance.c: In function 'lance_rx': ../drivers/net/ethernet/amd/lance.c:1203:23: error: implicit declaration of function 'isa_bus_to_virt'; did you mean 'bus_to_virt'? [-Werror=implicit-function-declaration] ../drivers/net/ethernet/amd/ni65.c: In function 'ni65_init_lance': ../drivers/net/ethernet/amd/ni65.c:585:20: error: implicit declaration of function 'isa_virt_to_bus'; did you mean 'virt_to_bus'? [-Werror=implicit-function-declaration] ../drivers/net/ethernet/cirrus/cs89x0.c: In function 'net_open': ../drivers/net/ethernet/cirrus/cs89x0.c:897:20: error: implicit declaration of function 'isa_virt_to_bus'; did you mean 'virt_to_bus'? [-Werror=implicit-function-declaration] Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-22 11:12:29 -07:00
Jakub Kicinski	07300f774f	nfp: avoid buffer leak when FW communication fails After device is stopped we reset the rings by moving all free buffers to positions [0, cnt - 2], and clear the position cnt - 1 in the ring. We then proceed to clear the read/write pointers. This means that if we try to reset the ring again the code will assume that the next to fill buffer is at position 0 and swap it with cnt - 1. Since we previously cleared position cnt - 1 it will lead to leaking the first buffer and leaving ring in a bad state. This scenario can only happen if FW communication fails, in which case the ring will never be used again, so the fact it's in a bad state will not be noticed. Buffer leak is the only problem. Don't try to move buffers in the ring if the read/write pointers indicate the ring was never used or have already been reset. nfp_net_clear_config_and_disable() is now fully idempotent. Found by code inspection, FW communication failures are very rare, and reconfiguring a live device is not common either, so it's unlikely anyone has ever noticed the leak. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-22 10:58:52 -07:00
Jakub Kicinski	042f882556	nfp: bring back support for offloading shared blocks Now that we have offload replay infrastructure added by commit `326367427c` ("net: sched: call reoffload op on block callback reg") and flows are guaranteed to be removed correctly, we can revert commit `951a8ee6de` ("nfp: reject binding to shared blocks"). Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-22 10:58:52 -07:00
John Hurley	b809ec869b	nfp: flower: ensure dead neighbour entries are not offloaded Previously only the neighbour state was checked to decide if an offloaded entry should be removed. However, there can be situations when the entry is dead but still marked as valid. This can lead to dead entries not being removed from fw tables or even incorrect data being added. Check the entry dead bit before deciding if it should be added to or removed from fw neighbour tables. Fixes: `8e6a9046b6` ("nfp: flower vxlan neighbour offload") Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-22 10:55:01 -07:00
David S. Miller	0fb8d5a030	Merge branch 'vxlan-fix-default-fdb-entry-user-space-notify-ordering-race' Roopa Prabhu says: ==================== vxlan: fix default fdb entry user-space notify ordering/race Problem: In vxlan_newlink, a default fdb entry is added before register_netdev. The default fdb creation function notifies user-space of the fdb entry on the vxlan device which user-space does not know about yet. (RTM_NEWNEIGH goes before RTM_NEWLINK for the same ifindex). This series fixes the user-space netlink notification ordering issue with the following changes: - decouple fdb notify from fdb create. - Move fdb notify after register_netdev. - modify rtnl_configure_link to allow configuring a link early. - Call rtnl_configure_link in vxlan newlink handler to notify userspace about the newlink before fdb notify and hence avoiding the user-space race. ==================== Fixes: `afbd8bae9c` ("vxlan: add implicit fdb entry for default destination") Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>	2018-07-22 10:52:37 -07:00
Roopa Prabhu	e99465b952	vxlan: fix default fdb entry netlink notify ordering during netdev create Problem: In vxlan_newlink, a default fdb entry is added before register_netdev. The default fdb creation function also notifies user-space of the fdb entry on the vxlan device which user-space does not know about yet. (RTM_NEWNEIGH goes before RTM_NEWLINK for the same ifindex). This patch fixes the user-space netlink notification ordering issue with the following changes: - decouple fdb notify from fdb create. - Move fdb notify after register_netdev. - Call rtnl_configure_link in vxlan newlink handler to notify userspace about the newlink before fdb notify and hence avoiding the user-space race. Fixes: `afbd8bae9c` ("vxlan: add implicit fdb entry for default destination") Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-22 10:52:37 -07:00
Roopa Prabhu	f6e0538586	vxlan: make netlink notify in vxlan_fdb_destroy optional Add a new option do_notify to vxlan_fdb_destroy to make sending netlink notify optional. Used by a later patch. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-22 10:52:37 -07:00
Roopa Prabhu	7431016b10	vxlan: add new fdb alloc and create helpers - Add new vxlan_fdb_alloc helper - rename existing vxlan_fdb_create into vxlan_fdb_update: because it really creates or updates an existing fdb entry - move new fdb creation into a separate vxlan_fdb_create Main motivation for this change is to introduce the ability to decouple vxlan fdb creation and notify, used in a later patch. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-22 10:52:37 -07:00
Roopa Prabhu	5025f7f7d5	rtnetlink: add rtnl_link_state check in rtnl_configure_link rtnl_configure_link sets dev->rtnl_link_state to RTNL_LINK_INITIALIZED and unconditionally calls __dev_notify_flags to notify user-space of dev flags. current call sequence for rtnl_configure_link rtnetlink_newlink rtnl_link_ops->newlink rtnl_configure_link (unconditionally notifies userspace of default and new dev flags) If a newlink handler wants to call rtnl_configure_link early, we will end up with duplicate notifications to user-space. This patch fixes rtnl_configure_link to check rtnl_link_state and call __dev_notify_flags with gchanges = 0 if already RTNL_LINK_INITIALIZED. Later in the series, this patch will help the following sequence where a driver implementing newlink can call rtnl_configure_link to initialize the link early. makes the following call sequence work: rtnetlink_newlink rtnl_link_ops->newlink (vxlan) -> rtnl_configure_link (initializes link and notifies user-space of default dev flags) rtnl_configure_link (updates dev flags if requested by user ifm and notifies user-space of new dev flags) Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-22 10:52:37 -07:00
Florian Westphal	6e56830776	atl1c: reserve min skb headroom Got crash report with following backtrace: BUG: unable to handle kernel paging request at ffff8801869daffe RIP: 0010:[<ffffffff816429c4>] [<ffffffff816429c4>] ip6_finish_output2+0x394/0x4c0 RSP: 0018:ffff880186c83a98 EFLAGS: 00010283 RAX: ffff8801869db00e ... [<ffffffff81644cdc>] ip6_finish_output+0x8c/0xf0 [<ffffffff81644d97>] ip6_output+0x57/0x100 [<ffffffff81643dc9>] ip6_forward+0x4b9/0x840 [<ffffffff81645566>] ip6_rcv_finish+0x66/0xc0 [<ffffffff81645db9>] ipv6_rcv+0x319/0x530 [<ffffffff815892ac>] netif_receive_skb+0x1c/0x70 [<ffffffffc0060bec>] atl1c_clean+0x1ec/0x310 [atl1c] ... The bad access is in neigh_hh_output(), at skb->data - 16 (HH_DATA_MOD). atl1c driver provided skb with no headroom, so 14 bytes (ethernet header) got pulled, but then 16 are copied. Reserve NET_SKB_PAD bytes headroom, like netdev_alloc_skb(). Compile tested only; I lack hardware. Fixes: `7b70176421` ("atl1c: Fix misuse of netdev_alloc_skb in refilling rx ring") Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-22 10:28:26 -07:00
Vitaly Kuznetsov	2d408c0d45	xen-netfront: fix queue name setting Commit `f599c64fdf` ("xen-netfront: Fix race between device setup and open") changed the initialization order: xennet_create_queues() now happens before we do register_netdev() so using netdev->name in xennet_init_queue() is incorrect, we end up with the following in /proc/interrupts: 60: 139 0 xen-dyn -event eth%d-q0-tx 61: 265 0 xen-dyn -event eth%d-q0-rx 62: 234 0 xen-dyn -event eth%d-q1-tx 63: 1 0 xen-dyn -event eth%d-q1-rx and this looks ugly. Actually, using early netdev name (even when it's already set) is also not ideal: nowadays we tend to rename eth devices and queue name may end up not corresponding to the netdev name. Use nodename from xenbus device for queue naming: this can't change in VM's lifetime. Now /proc/interrupts looks like 62: 202 0 xen-dyn -event device/vif/0-q0-tx 63: 317 0 xen-dyn -event device/vif/0-q0-rx 64: 262 0 xen-dyn -event device/vif/0-q1-tx 65: 17 0 xen-dyn -event device/vif/0-q1-rx Fixes: `f599c64fdf` ("xen-netfront: Fix race between device setup and open") Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Ross Lagerwall <ross.lagerwall@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-22 10:27:02 -07:00
Randy Dunlap	be5a8ffa9c	net/dsa/realtek: add MODULE_LICENSE() Add MODULE_LICENSE() to net/dsa/realtek.o to fix build warning message. WARNING: modpost: missing MODULE_LICENSE() in drivers/net/dsa/realtek.o Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-22 10:25:48 -07:00
Nikolay Aleksandrov	5b3df17723	bonding: don't cast const buf in sysfs store As was recently discussed [1], let's avoid casting the const buf in bonding_sysfs_store_option and use kstrndup/kfree instead. [1] http://lists.openwall.net/netdev/2018/07/22/25 Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-22 10:09:30 -07:00
David S. Miller	fb42c838b6	Merge branch 'TX-used-ring-batched-updating-for-vhost' Jason Wang says: ==================== TX used ring batched updating for vhost This series implements batched updating of the used ring for TX, which helps reduce cache contention on the used ring. The idea is to first split the datacopy path from zerocopy and batch only for datacopy, because zerocopy already supports its own batching. TX PPS increased by 25.8%, and Netperf TCP does not show obvious differences. Splitting the datapath will also help future work such as in-order completion. ==================== Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-22 09:43:31 -07:00
Jason Wang	4afb52c2af	vhost_net: batch update used ring for datacopy TX Like commit `e2b3b35eb9` ("vhost_net: batch used ring update in rx"), this patch implements batched used ring updates for datacopy TX (zerocopy already does some kind of batching). Testpmd transmission from guest to host (XDP_DROP on tap) shows a 25.8% improvement (from ~3.1Mpps to ~3.9Mpps) on a Broadwell i7-5600U CPU @ 2.60GHz machine. Netperf TCP tests do not show obvious differences. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-22 09:43:31 -07:00
Jason Wang	d0d8697187	vhost_net: rename VHOST_RX_BATCH to VHOST_NET_BATCH A more generic name which could be used for TX as well. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-22 09:43:31 -07:00
Jason Wang	09c3248938	vhost_net: rename vhost_rx_signal_used() to vhost_net_signal_used() Rename for reusing this for TX. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-22 09:43:31 -07:00
Jason Wang	0d20bdf34d	vhost_net: split out datacopy logic Instead of mixing zerocopy and datacopy logic, this patch splits the datacopy logic out. This results in more compact code, and ad-hoc optimizations can be done on top more easily. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-22 09:43:31 -07:00
Jason Wang	c92a8a8cb7	vhost_net: introduce tx_can_batch() Introduce tx_can_batch() to determine whether TX could be batched. This will help to reduce the code duplication in the future. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-22 09:43:31 -07:00
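
The ring-reset fix described in commit `07300f774f` ("nfp: avoid buffer leak when FW communication fails") above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the driver code: `toy_ring`, `toy_ring_reset()`, `toy_ring_count_bufs()` and `RING_CNT` are hypothetical names, and the ring holds plain pointers instead of DMA buffers. It shows the idea from the commit message: the empty slot is moved to position cnt - 1 and the pointers are cleared, and a second reset is skipped when both pointers are already zero, so no buffer is overwritten.

```c
#include <assert.h>
#include <stddef.h>

#define RING_CNT 4u /* hypothetical ring size; must be a power of two */

/* Hypothetical toy model of an RX ring that always holds cnt - 1 buffers. */
struct toy_ring {
	void *bufs[RING_CNT];
	unsigned int wr_p; /* free-running write pointer */
	unsigned int rd_p; /* free-running read pointer */
};

/* Reset the ring: move the single empty slot to index cnt - 1 and clear the
 * pointers. The early return makes a repeated reset a no-op.
 */
static void toy_ring_reset(struct toy_ring *r)
{
	unsigned int wr_idx, last_idx;

	/* Both pointers at zero: never used, or already reset. */
	if (r->wr_p == 0 && r->rd_p == 0)
		return;

	/* The slot at the write index is the empty one; fill it with the
	 * buffer from the last slot, then clear the last slot.
	 */
	wr_idx = r->wr_p & (RING_CNT - 1);
	last_idx = RING_CNT - 1;
	r->bufs[wr_idx] = r->bufs[last_idx];
	r->bufs[last_idx] = NULL;

	r->wr_p = 0;
	r->rd_p = 0;
}

/* Count the buffers still referenced by the ring. */
static unsigned int toy_ring_count_bufs(const struct toy_ring *r)
{
	unsigned int i, n = 0;

	for (i = 0; i < RING_CNT; i++)
		if (r->bufs[i])
			n++;
	return n;
}
```

In this model, dropping the early return would make a second reset copy the already-cleared last slot over index 0, which is the leak the commit message describes.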

769513 commits