1
0
Fork 0
Commit Graph

872776 Commits (4eda4e0096842764d725bcfd77471a419832b074)

Author SHA1 Message Date
Nicholas Nunley 4eda4e0096 iavf: initialize ITRN registers with correct values
Since commit 92418fb147 ("i40e/i40evf: Use usec value instead of reg
value for ITR defines") the driver tracks the interrupt throttling
intervals in single usec units, although the actual ITRN registers are
programmed in 2 usec units. Most register programming flows in the driver
correctly handle the conversion, although it is currently not applied when
the registers are initialized to their default values. Most of the time
this doesn't present a problem since the default values are usually
immediately overwritten through the standard adaptive throttling mechanism,
or updated manually by the user, but if adaptive throttling is disabled and
the interval values are left alone then the incorrect value will persist.

Since the intended default interval of 50 usecs (vs. 100 usecs as
programmed) performs better for most traffic workloads, this can lead to
performance regressions.

This patch adds the correct conversion when writing the initial values to
the ITRN registers.

Signed-off-by: Nicholas Nunley <nicholas.d.nunley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-11-08 16:10:51 -08:00
Colin Ian King 615457a226 ice: fix potential infinite loop because loop counter being too small
Currently the for-loop counter i is a u8 however it is being checked
against a maximum value hw->num_tx_sched_layers which is a u16. Hence
there is a potential wrap-around of counter i back to zero if
hw->num_tx_sched_layers is greater than 255.  Fix this by making i
a u16.

Addresses-Coverity: ("Infinite loop")
Fixes: b36c598c99 ("ice: Updates to Tx scheduler code")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-11-08 16:10:51 -08:00
Eric Dumazet 1b53d64435 net: fix data-race in neigh_event_send()
KCSAN reported the following data-race [1]

The fix will also prevent the compiler from optimizing out
the condition.

[1]

BUG: KCSAN: data-race in neigh_resolve_output / neigh_resolve_output

write to 0xffff8880a41dba78 of 8 bytes by interrupt on cpu 1:
 neigh_event_send include/net/neighbour.h:443 [inline]
 neigh_resolve_output+0x78/0x480 net/core/neighbour.c:1474
 neigh_output include/net/neighbour.h:511 [inline]
 ip_finish_output2+0x4af/0xe40 net/ipv4/ip_output.c:228
 __ip_finish_output net/ipv4/ip_output.c:308 [inline]
 __ip_finish_output+0x23a/0x490 net/ipv4/ip_output.c:290
 ip_finish_output+0x41/0x160 net/ipv4/ip_output.c:318
 NF_HOOK_COND include/linux/netfilter.h:294 [inline]
 ip_output+0xdf/0x210 net/ipv4/ip_output.c:432
 dst_output include/net/dst.h:436 [inline]
 ip_local_out+0x74/0x90 net/ipv4/ip_output.c:125
 __ip_queue_xmit+0x3a8/0xa40 net/ipv4/ip_output.c:532
 ip_queue_xmit+0x45/0x60 include/net/ip.h:237
 __tcp_transmit_skb+0xe81/0x1d60 net/ipv4/tcp_output.c:1169
 tcp_transmit_skb net/ipv4/tcp_output.c:1185 [inline]
 __tcp_retransmit_skb+0x4bd/0x15f0 net/ipv4/tcp_output.c:2976
 tcp_retransmit_skb+0x36/0x1a0 net/ipv4/tcp_output.c:2999
 tcp_retransmit_timer+0x719/0x16d0 net/ipv4/tcp_timer.c:515
 tcp_write_timer_handler+0x42d/0x510 net/ipv4/tcp_timer.c:598
 tcp_write_timer+0xd1/0xf0 net/ipv4/tcp_timer.c:618

read to 0xffff8880a41dba78 of 8 bytes by interrupt on cpu 0:
 neigh_event_send include/net/neighbour.h:442 [inline]
 neigh_resolve_output+0x57/0x480 net/core/neighbour.c:1474
 neigh_output include/net/neighbour.h:511 [inline]
 ip_finish_output2+0x4af/0xe40 net/ipv4/ip_output.c:228
 __ip_finish_output net/ipv4/ip_output.c:308 [inline]
 __ip_finish_output+0x23a/0x490 net/ipv4/ip_output.c:290
 ip_finish_output+0x41/0x160 net/ipv4/ip_output.c:318
 NF_HOOK_COND include/linux/netfilter.h:294 [inline]
 ip_output+0xdf/0x210 net/ipv4/ip_output.c:432
 dst_output include/net/dst.h:436 [inline]
 ip_local_out+0x74/0x90 net/ipv4/ip_output.c:125
 __ip_queue_xmit+0x3a8/0xa40 net/ipv4/ip_output.c:532
 ip_queue_xmit+0x45/0x60 include/net/ip.h:237
 __tcp_transmit_skb+0xe81/0x1d60 net/ipv4/tcp_output.c:1169
 tcp_transmit_skb net/ipv4/tcp_output.c:1185 [inline]
 __tcp_retransmit_skb+0x4bd/0x15f0 net/ipv4/tcp_output.c:2976
 tcp_retransmit_skb+0x36/0x1a0 net/ipv4/tcp_output.c:2999
 tcp_retransmit_timer+0x719/0x16d0 net/ipv4/tcp_timer.c:515
 tcp_write_timer_handler+0x42d/0x510 net/ipv4/tcp_timer.c:598

Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.0-rc3+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-08 13:59:53 -08:00
Stefano Garzarella ad8a722035 vsock/virtio: fix sock refcnt holding during the shutdown
The "42f5cda5eaf4" commit rightly set SOCK_DONE on peer shutdown,
but there is an issue if we receive the SHUTDOWN(RDWR) while the
virtio_transport_close_timeout() is scheduled.
In this case, when the timeout fires, the SOCK_DONE is already
set and the virtio_transport_close_timeout() will not call
virtio_transport_reset() and virtio_transport_do_close().
This causes that both sockets remain open and will never be released,
preventing the unloading of [virtio|vhost]_transport modules.

This patch fixes this issue, calling virtio_transport_reset() and
virtio_transport_do_close() when we receive the SHUTDOWN(RDWR)
and there is nothing left to read.

Fixes: 42f5cda5ea ("vsock/virtio: set SOCK_DONE on peer shutdown")
Cc: Stephen Barber <smbarber@chromium.org>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-08 12:17:50 -08:00
David S. Miller b05f5b4a9b Three small fixes:
* we hit a failure path bug related to
    ieee80211_txq_setup_flows()
  * also use kvmalloc() to make that less likely
  * fix a timing value shortly after boot (during
    INITIAL_JIFFIES)
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEH1e1rEeCd0AIMq6MB8qZga/fl8QFAl3FYVEACgkQB8qZga/f
 l8RNHg//bVYynSBqMYDx+zjIG6fdlrI/zMAJIO9mjQmwNgUw67dSeO6fXhji9ngS
 jAhXaeJAj9qB++SxDncsOo4pWi+1IhuU1Zr+88WqpQvR78FT/5g0Jn5pNDkIOAT+
 w9w+RmK5iGmiyqyubKi/FGCX8F5WBVJWWIso4cctTWIAGGIOkNc47ZjfmoGZmxK3
 z9LhU98lGP1ZYXCOTx9kXOo4yf9GbC2vAC/JKmWc9jg1moDqbno/WNcc8TgDl9Z8
 935VzaZ1lt/2cWrCpjg8pJqPK4TMAIXdzYoUP1KixMjAI7wQVaUfiuIbOY+uXUmq
 zUzalSWuOBsaK2pKGfD7KbSekozc61A9APRhU5Onlco9mfVVqtfqsZiczCGgJk+I
 9e0uhYKlABSdkhNor6hcuF1s+lzxFT6ikB7rKUYns2w2EYLDwIWeOjUMPGRIhDGa
 86KwQQvMNxeyArMnsO4lI1W6vpSGO2BZ6H5SR6JsJN4B1Wb9UGslrMwTlbkUqqZz
 FEaT+8sKM3oqxadBOWCCynIPayiCaNSWxvN1aiV2nM1gyjXLzv+HFBhUpJUzMYpI
 jq1tI/410Ba2T9feieo/+7ekIQAMvqU10Kilol63n5osGAupl4Y8jPA7dSbkF3+F
 PZFvIgdv584UXnMVIysuBYm/oFoilpE4L51XxBPeRaJcl8gK8Wo=
 =jkY+
 -----END PGP SIGNATURE-----

Merge tag 'mac80211-for-net-2019-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211

Johannes Berg says:

====================
Three small fixes:
 * we hit a failure path bug related to
   ieee80211_txq_setup_flows()
 * also use kvmalloc() to make that less likely
 * fix a timing value shortly after boot (during
   INITIAL_JIFFIES)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-08 11:37:24 -08:00
Alexander Sverdlin e4dd560803 net: ethernet: octeon_mgmt: Account for second possible VLAN header
Octeon's input ring-buffer entry has 14 bits-wide size field, so to account
for second possible VLAN header max_mtu must be further reduced.

Fixes: 109cc16526 ("ethernet/cavium: use core min/max MTU checking")
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-08 11:18:40 -08:00
Ahmed Zaki 285531f9e6 mac80211: fix station inactive_time shortly after boot
In the first 5 minutes after boot (time of INITIAL_JIFFIES),
ieee80211_sta_last_active() returns zero if last_ack is zero. This
leads to "inactive time" showing jiffies_to_msecs(jiffies).

 # iw wlan0 station get fc:ec:da:64:a6:dd
 Station fc:ec:da:64:a6:dd (on wlan0)
	inactive time:	4294894049 ms
	.
	.
	connected time:	70 seconds

Fix by returning last_rx if last_ack == 0.

Signed-off-by: Ahmed Zaki <anzaki@gmail.com>
Link: https://lore.kernel.org/r/20191031121243.27694-1-anzaki@gmail.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-11-08 09:17:28 +01:00
Toke Høiland-Jørgensen 71e67c3bd1 net/fq_impl: Switch to kvmalloc() for memory allocation
The FQ implementation used by mac80211 allocates memory using kmalloc(),
which can fail; and Johannes reported that this actually happens in
practice.

To avoid this, switch the allocation to kvmalloc() instead; this also
brings fq_impl in line with all the FQ qdiscs.

Fixes: 557fc4a098 ("fq: add fair queuing framework")
Reported-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20191105155750.547379-1-toke@redhat.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-11-08 09:11:49 +01:00
Johannes Berg 6dd47d9754 mac80211: fix ieee80211_txq_setup_flows() failure path
If ieee80211_txq_setup_flows() fails, we don't clean up LED
state properly, leading to crashes later on, fix that.

Fixes: dc8b274f09 ("mac80211: Move up init of TXQs")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-by: Toke Høiland-Jørgensen <toke@toke.dk>
Link: https://lore.kernel.org/r/20191105154110.1ccf7112ba5d.I0ba865792446d051867b33153be65ce6b063d98c@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-11-08 09:11:33 +01:00
David Ahern e0a312629f ipv4: Fix table id reference in fib_sync_down_addr
Hendrik reported routes in the main table using source address are not
removed when the address is removed. The problem is that fib_sync_down_addr
does not account for devices in the default VRF which are associated
with the main table. Fix by updating the table id reference.

Fixes: 5a56a0b3a4 ("net: Don't delete routes in different VRFs")
Reported-by: Hendrik Donner <hd@os-cillation.de>
Signed-off-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-07 16:14:36 -08:00
Eric Dumazet 1bef4c223b ipv6: fixes rt6_probe() and fib6_nh->last_probe init
While looking at a syzbot KCSAN report [1], I found multiple
issues in this code :

1) fib6_nh->last_probe has an initial value of 0.

   While probably okay on 64bit kernels, this causes an issue
   on 32bit kernels since the time_after(jiffies, 0 + interval)
   might be false ~24 days after boot (for HZ=1000)

2) The data-race found by KCSAN
   I could use READ_ONCE() and WRITE_ONCE(), but we also can
   take the opportunity of not piling-up too many rt6_probe_deferred()
   works by using instead cmpxchg() so that only one cpu wins the race.

[1]
BUG: KCSAN: data-race in find_match / find_match

write to 0xffff8880bb7aabe8 of 8 bytes by interrupt on cpu 1:
 rt6_probe net/ipv6/route.c:663 [inline]
 find_match net/ipv6/route.c:757 [inline]
 find_match+0x5bd/0x790 net/ipv6/route.c:733
 __find_rr_leaf+0xe3/0x780 net/ipv6/route.c:831
 find_rr_leaf net/ipv6/route.c:852 [inline]
 rt6_select net/ipv6/route.c:896 [inline]
 fib6_table_lookup+0x383/0x650 net/ipv6/route.c:2164
 ip6_pol_route+0xee/0x5c0 net/ipv6/route.c:2200
 ip6_pol_route_output+0x48/0x60 net/ipv6/route.c:2452
 fib6_rule_lookup+0x3d6/0x470 net/ipv6/fib6_rules.c:117
 ip6_route_output_flags_noref+0x16b/0x230 net/ipv6/route.c:2484
 ip6_route_output_flags+0x50/0x1a0 net/ipv6/route.c:2497
 ip6_dst_lookup_tail+0x25d/0xc30 net/ipv6/ip6_output.c:1049
 ip6_dst_lookup_flow+0x68/0x120 net/ipv6/ip6_output.c:1150
 inet6_csk_route_socket+0x2f7/0x420 net/ipv6/inet6_connection_sock.c:106
 inet6_csk_xmit+0x91/0x1f0 net/ipv6/inet6_connection_sock.c:121
 __tcp_transmit_skb+0xe81/0x1d60 net/ipv4/tcp_output.c:1169
 tcp_transmit_skb net/ipv4/tcp_output.c:1185 [inline]
 tcp_xmit_probe_skb+0x19b/0x1d0 net/ipv4/tcp_output.c:3735

read to 0xffff8880bb7aabe8 of 8 bytes by interrupt on cpu 0:
 rt6_probe net/ipv6/route.c:657 [inline]
 find_match net/ipv6/route.c:757 [inline]
 find_match+0x521/0x790 net/ipv6/route.c:733
 __find_rr_leaf+0xe3/0x780 net/ipv6/route.c:831
 find_rr_leaf net/ipv6/route.c:852 [inline]
 rt6_select net/ipv6/route.c:896 [inline]
 fib6_table_lookup+0x383/0x650 net/ipv6/route.c:2164
 ip6_pol_route+0xee/0x5c0 net/ipv6/route.c:2200
 ip6_pol_route_output+0x48/0x60 net/ipv6/route.c:2452
 fib6_rule_lookup+0x3d6/0x470 net/ipv6/fib6_rules.c:117
 ip6_route_output_flags_noref+0x16b/0x230 net/ipv6/route.c:2484
 ip6_route_output_flags+0x50/0x1a0 net/ipv6/route.c:2497
 ip6_dst_lookup_tail+0x25d/0xc30 net/ipv6/ip6_output.c:1049
 ip6_dst_lookup_flow+0x68/0x120 net/ipv6/ip6_output.c:1150
 inet6_csk_route_socket+0x2f7/0x420 net/ipv6/inet6_connection_sock.c:106
 inet6_csk_xmit+0x91/0x1f0 net/ipv6/inet6_connection_sock.c:121
 __tcp_transmit_skb+0xe81/0x1d60 net/ipv4/tcp_output.c:1169

Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 18894 Comm: udevd Not tainted 5.4.0-rc3+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

Fixes: cc3a86c802 ("ipv6: Change rt6_probe to take a fib6_nh")
Fixes: f547fac624 ("ipv6: rate-limit probes for neighbourless routes")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-07 16:13:13 -08:00
Salil Mehta bf5a6b4c47 net: hns: Fix the stray netpoll locks causing deadlock in NAPI path
This patch fixes the problem of the spin locks, originally
meant for the netpoll path of hns driver, causing deadlock in
the normal NAPI poll path. The issue happened due to the presence
of the stray leftover spin lock code related to the netpoll,
whose support was earlier removed from the HNS[1], got activated
due to enabling of NET_POLL_CONTROLLER switch.

Earlier background:
The netpoll handling code originally had this bug(as identified
by Marc Zyngier[2]) of wrong spin lock API being used which did
not disable the interrupts and hence could cause locking issues.
i.e. if the lock were first acquired in context to thread like
'ip' util and this lock if ever got later acquired again in
context to the interrupt context like TX/RX (Interrupts could
always pre-empt the lock holding task and acquire the lock again)
and hence could cause deadlock.

Proposed Solution:
1. If the netpoll was enabled in the HNS driver, which is not
   right now, we could have simply used spin_[un]lock_irqsave()
2. But as netpoll is disabled, therefore, it is best to get rid
   of the existing locks and stray code for now. This should
   solve the problem reported by Marc.

[1] https://git.kernel.org/torvalds/c/4bd2c03be7
[2] https://patchwork.ozlabs.org/patch/1189139/

Fixes: 4bd2c03be7 ("net: hns: remove ndo_poll_controller")
Cc: lipeng <lipeng321@huawei.com>
Cc: Yisen Zhuang <yisen.zhuang@huawei.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: David S. Miller <davem@davemloft.net>
Reported-by: Marc Zyngier <maz@kernel.org>
Acked-by: Marc Zyngier <maz@kernel.org>
Tested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-07 16:12:15 -08:00
Aleksander Morgado e497df686e net: usb: qmi_wwan: add support for DW5821e with eSIM support
Exactly same layout as the default DW5821e module, just a different
vid/pid.

The QMI interface is exposed in USB configuration #1:

P:  Vendor=413c ProdID=81e0 Rev=03.18
S:  Manufacturer=Dell Inc.
S:  Product=DW5821e-eSIM Snapdragon X20 LTE
S:  SerialNumber=0123456789ABCDEF
C:  #Ifs= 6 Cfg#= 1 Atr=a0 MxPwr=500mA
I:  If#=0x0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
I:  If#=0x1 Alt= 0 #EPs= 1 Cls=03(HID  ) Sub=00 Prot=00 Driver=usbhid
I:  If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I:  If#=0x3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I:  If#=0x4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I:  If#=0x5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option

Signed-off-by: Aleksander Morgado <aleksander@aleksander.es>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-07 15:49:49 -08:00
Oliver Neukum 332f989a3b CDC-NCM: handle incomplete transfer of MTU
A malicious device may give half an answer when asked
for its MTU. The driver will proceed after this with
a garbage MTU. Anything but a complete answer must be treated
as an error.

V2: used sizeof as request by Alexander

Reported-and-tested-by: syzbot+0631d878823ce2411636@syzkaller.appspotmail.com
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-07 15:27:06 -08:00
Pan Bian 025ec40b81 nfc: netlink: fix double device reference drop
The function nfc_put_device(dev) is called twice to drop the reference
to dev when there is no associated local llcp. Remove one of them to fix
the bug.

Fixes: 52feb444a9 ("NFC: Extend netlink interface for LTO, RW, and MIUX parameters support")
Fixes: d9b8d8e19b ("NFC: llcp: Service Name Lookup netlink interface")
Signed-off-by: Pan Bian <bianpan2016@163.com>
Reviewed-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-07 15:23:19 -08:00
Pan Bian 99a8efbb6e NFC: st21nfca: fix double free
The variable nfcid_skb is not changed in the callee nfc_hci_get_param()
if error occurs. Consequently, the freed variable nfcid_skb will be
freed again, resulting in a double free bug. Set nfcid_skb to NULL after
releasing it to fix the bug.

Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-06 21:48:29 -08:00
Huazhong Tan 648db0514a net: hns3: add compatible handling for command HCLGE_OPC_PF_RST_DONE
Since old firmware does not support HCLGE_OPC_PF_RST_DONE, it will
return -EOPNOTSUPP to the driver when received this command. So
for this case, it should just print a warning and return success
to the caller.

Fixes: 72e2fb0799 ("net: hns3: clear reset interrupt status in hclge_irq_handle()")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-06 21:46:56 -08:00
David S. Miller c78806f31f mlx5-fixes-2019-11-06
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAl3DQ0sACgkQSD+KveBX
 +j4nKgf/XnUZGR4PHGcEfalc/p67J7jmi7SN5Xuts8uyUTgqMGni/SkWRl+S1mQy
 8dl5kZK/JxBhAzzlVIcXpcmkfeXydcrm1NHMPCOMsnUrAWAEP0lE1lfLBkX1zWPi
 OHb3+nziJHc0s4AI/Haca7JOj8HjBSOPpv3Udg6f53leizo31pK2FSiBASc5bRoX
 EUNMD3eFAap7fjBMVLzO/JeQXxgjaqb9qU3AKopy34cg+AInZ+ZH2S++aoT7RG3J
 MCPOnrSxLmhA4BUu5I81vfgFNxxPP65vI7NPIhSMY7+pOQJNPIc1p08XvIpQ9X2o
 5o/rQ9tVxq2NZqcMB+b6tiHVFYu7Kw==
 =OmLZ
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-fixes-2019-11-06' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahamees says:

====================
Mellanox, mlx5 fixes 2019-11-06

This series introduces some fixes to mlx5 driver.

Please pull and let me know if there is any problem.

No -stable this time.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-06 21:39:48 -08:00
Heiner Kallweit 9c6850fea3 r8169: fix page read in r8168g_mdio_read
Functions like phy_modify_paged() read the current page, on Realtek
PHY's this means reading the value of register 0x1f. Add special
handling for reading this register, similar to what we do already
in r8168g_mdio_write(). Currently we read a random value that by
chance seems to be 0 always.

Fixes: a2928d2864 ("r8169: use paged versions of phylib MDIO access functions")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-06 21:36:48 -08:00
David S. Miller 58b87d21fe Merge branch 'stmmac-fixes'
Jose Abreu says:

====================
net: stmmac: Fixes for -net

Misc fixes for stmmac.

Patch 1/11 and 2/11, use the correct variable type for bitrev32() calls.

Patch 3/11, fixes the random failures the we were seing when running selftests.

Patch 4/11, prevents a crash that can occur when receiving AVB packets and with
SPH feature enabled on XGMAC.

Patch 5/11, fixes the correct settings for CBS on XGMAC.

Patch 6/11, corrects the interpretation of AVB feature on XGMAC.

Patch 7/11, disables Flow Control for AVB enabled queues on XGMAC.

Patch 8/11, disables MMC interrupts on XGMAC, preventing a storm of interrupts.

Patch 9/11, fixes the number of packets that were being taken into account in
the RX path cleaning function.

Patch 10/11, fixes an incorrect descriptor setting that could cause IP
misbehavior.

Patch 11/11, fixes the IOC generation mechanism when multiple descriptors
are used.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-06 21:22:56 -08:00
Jose Abreu 7df4a3a76d net: stmmac: Fix the TX IOC in xmit path
IOC bit must be only set in the last descriptor. Move the logic up a
little bit to make sure it's set in the correct descriptor.

Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-06 21:22:55 -08:00
Jose Abreu b2f071995b net: stmmac: Fix TSO descriptor with Enhanced Addressing
When using addressing > 32 bits the TSO first descriptor only has the
header so we can't set the payload field for this descriptor. Let's
reset the variable so that buffer 2 value is zero.

Fixes: a993db88d1 ("net: stmmac: Enable support for > 32 Bits addressing in XGMAC")
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-06 21:22:55 -08:00
Jose Abreu cda4985a3e net: stmmac: Fix the packet count in stmmac_rx()
Currently, stmmac_rx() is counting the number of descriptors but it
should count the number of packets as specified by the NAPI limit.

Fix this.

Fixes: ec222003bd ("net: stmmac: Prepare to add Split Header support")
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-06 21:22:55 -08:00
Jose Abreu aeb18dd076 net: stmmac: xgmac: Disable MMC interrupts by default
MMC interrupts were being enabled, which is not what we want because it
will lead to a storm of interrupts that are not handled at all. Fix it
by disabling all MMC interrupts for XGMAC.

Fixes: b6cdf09f51 ("net: stmmac: xgmac: Implement MMC counters")
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-06 21:22:55 -08:00
Jose Abreu 132f2f20c9 net: stmmac: xgmac: Disable Flow Control when 1 or more queues are in AV
When in AVB mode we need to disable flow control to prevent MAC from
pausing in TX side.

Fixes: ec6ea8e3ee ("net: stmmac: Add CBS support in XGMAC2")
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-06 21:22:55 -08:00
Jose Abreu 08c1ac3bcb net: stmmac: xgmac: Fix AV Feature detection
Fix incorrect precedence of operators. For reference: AV implies AV
Feature but RAV implies only RX side AV Feature. As we want full AV
features we need to check RAV.

Fixes: c2b69474d6 ("net: stmmac: xgmac: Correct RAVSEL field interpretation")
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-06 21:22:55 -08:00
Jose Abreu 97add93fbc net: stmmac: xgmac: Fix TSA selection
When we change between Transmission Scheduling Algorithms, we need to
clear previous values so that the new chosen algorithm is correctly
selected.

Fixes: ec6ea8e3ee ("net: stmmac: Add CBS support in XGMAC2")
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-06 21:22:55 -08:00
Jose Abreu 96147375d4 net: stmmac: xgmac: Only get SPH header len if available
Split Header length is only available when L34T == 0. Fix this by
correctly checking if L34T is zero before trying to get Header length.

Fixes: 67afd6d1cf ("net: stmmac: Add Split Header support and enable it in XGMAC cores")
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-06 21:22:55 -08:00
Jose Abreu eeb9d74516 net: stmmac: selftests: Prevent false positives in filter tests
In L2 tests that filter packets by destination MAC address we need to
prevent false positives that can occur if we add an address that
collides with the existing ones.

To fix this, lets manually check if the new address to be added is
already present in the NIC and use a different one if so. For Hash
filtering this also envolves converting the address to the hash.

Fixes: 091810dbde ("net: stmmac: Introduce selftests support")
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-06 21:22:55 -08:00
Jose Abreu 3d00e45d49 net: stmmac: xgmac: bitrev32 returns u32
The bitrev32 function returns an u32 var, not an int. Fix it.

Fixes: 0efedbf11f ("net: stmmac: xgmac: Fix XGMAC selftests")
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-06 21:22:55 -08:00
Jose Abreu 4d7c47e34f net: stmmac: gmac4: bitrev32 returns u32
The bitrev32 function returns an u32 var, not an int. Fix it.

Fixes: 477286b53f ("stmmac: add GMAC4 core support")
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-06 21:22:55 -08:00
David S. Miller 53ba60afb1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net:

1) Missing register size validation in bitwise and cmp offloads.

2) Fix error code in ip_set_sockfn_get() when copy_to_user() fails,
   from Dan Carpenter.

3) Oneliner to copy MAC address in IPv6 hash:ip,mac sets, from
   Stefano Brivio.

4) Missing policy validation in ipset with NL_VALIDATE_STRICT,
   from Jozsef Kadlecsik.

5) Fix unaligned access to private data area of nf_tables instructions,
   from Lukas Wunner.

6) Relax check for object updates, reported as a regression by
   Eric Garver, patch from Fernando Fernandez Mancera.

7) Crash on ebtables dnat extension when used from the output path.
   From Florian Westphal.

8) Fix bogus EOPNOTSUPP when updating basechain flags.

9) Fix bogus EBUSY when updating a basechain that is already offloaded.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-06 21:16:55 -08:00
Ursula Braun 98f3375505 net/smc: fix ethernet interface refcounting
If a pnet table entry is to be added mentioning a valid ethernet
interface, but an invalid infiniband or ISM device, the dev_put()
operation for the ethernet interface is called twice, resulting
in a negative refcount for the ethernet interface, which disables
removal of such a network interface.

This patch removes one of the dev_put() calls.

Fixes: 890a2cb4a9 ("net/smc: rework pnet table")
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-06 17:44:32 -08:00
David S. Miller 9990a79d8f Merge branch 'net-tls-add-a-TX-lock'
Jakub Kicinski says:

====================
net/tls: add a TX lock

Some time ago Pooja and Mallesham started reporting crashes with
an async accelerator. After trying to poke the existing logic into
shape I came to the conclusion that it can't be trusted, and to
preserve our sanity we should just add a lock around the TX side.

First patch removes the sk_write_pending checks from the write
space callbacks. Those don't seem to have a logical justification.

Patch 2 adds the TX lock and patch 3 associated test (which should
hang with current net).

Mallesham reports that even with these fixes applied the async
accelerator workload still occasionally hangs waiting for socket
memory. I suspect that's strictly related to the way async crypto
is integrated in TLS, so I think we should get these into net or
net-next and move from there.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-06 17:33:32 -08:00
Jakub Kicinski 41098af59d selftests/tls: add test for concurrent recv and send
Add a test which spawns 16 threads and performs concurrent
send and recv calls on the same socket.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-06 17:33:32 -08:00
Jakub Kicinski 79ffe6087e net/tls: add a TX lock
TLS TX needs to release and re-acquire the socket lock if send buffer
fills up.

TLS SW TX path currently depends on only allowing one thread to enter
the function by the abuse of sk_write_pending. If another writer is
already waiting for memory no new ones are allowed in.

This has two problems:
 - writers don't wake other threads up when they leave the kernel;
   meaning that this scheme works for single extra thread (second
   application thread or delayed work) because memory becoming
   available will send a wake up request, but as Mallesham and
   Pooja report with larger number of threads it leads to threads
   being put to sleep indefinitely;
 - the delayed work does not get _scheduled_ but it may _run_ when
   other writers are present leading to crashes as writers don't
   expect state to change under their feet (same records get pushed
   and freed multiple times); it's hard to reliably bail from the
   work, however, because the mere presence of a writer does not
   guarantee that the writer will push pending records before exiting.

Ensuring wakeups always happen will make the code basically open
code a mutex. Just use a mutex.

The TLS HW TX path does not have any locking (not even the
sk_write_pending hack), yet it uses a per-socket sg_tx_data
array to push records.

Fixes: a42055e8d2 ("net/tls: Add support for async encryption of records for performance")
Reported-by: Mallesham  Jatharakonda <mallesh537@gmail.com>
Reported-by: Pooja Trivedi <poojatrivedi@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-06 17:33:32 -08:00
Jakub Kicinski 02b1fa07bb net/tls: don't pay attention to sk_write_pending when pushing partial records
sk_write_pending being not zero does not guarantee that partial
record will be pushed. If the thread waiting for memory times out
the pending record may get stuck.

In case of tls_device there is no path where parial record is
set and writer present in the first place. Partial record is
set only in tls_push_sg() and tls_push_sg() will return an
error immediately. All tls_device callers of tls_push_sg()
will return (and not wait for memory) if it failed.

Fixes: a42055e8d2 ("net/tls: Add support for async encryption of records for performance")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-06 17:33:32 -08:00
Vladimir Oltean 17fdd7638c net: mscc: ocelot: fix __ocelot_rmw_ix prototype
The "read-modify-write register index" function is declared with a
confusing prototype: the "mask" and "reg" arguments are swapped.

Fortunately, this does not affect callers so far. Both arguments are
u32, and the wrapper macros (ocelot_rmw_ix etc) have the arguments in
the correct order (the one from ocelot_io.c).

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-06 15:34:12 -08:00
David S. Miller 9f8f35076c Merge branch 'Bonding-fixes-for-Ocelot-switch'
Vladimir Oltean says:

====================
Bonding fixes for Ocelot switch

This series fixes 2 issues with bonding in a system that integrates the
ocelot driver, but the ports that are bonded do not actually belong to
ocelot.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-06 15:16:17 -08:00
Claudiu Manoil 3b3eed8eec net: mscc: ocelot: fix NULL pointer on LAG slave removal
lag_upper_info may be NULL on slave removal.

Fixes: dc96ee3730 ("net: mscc: ocelot: add bonding support")
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-06 15:16:17 -08:00
Claudiu Manoil 7afb3e575e net: mscc: ocelot: don't handle netdev events for other netdevs
The check that the event is actually for this device should be moved
from the "port" handler to the net device handler.

Otherwise the port handler will deny bonding configuration for other
net devices in the same system (like enetc in the LS1028A) that don't
have the lag_upper_info->tx_type restriction that ocelot has.

Fixes: dc96ee3730 ("net: mscc: ocelot: add bonding support")
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-06 15:16:17 -08:00
Dmytro Linkin 950d3af70e net/mlx5e: Use correct enum to determine uplink port
For vlan push action, if eswitch flow source capability is enabled, flow
source value compared with MLX5_VPORT_UPLINK enum, to determine uplink
port. This lead to syndrome in dmesg if try to add vlan push action.
For example:
 $ tc filter add dev vxlan0 ingress protocol ip prio 1 flower \
       enc_dst_port 4789 \
       action tunnel_key unset pipe \
       action vlan push id 20 pipe \
       action mirred egress redirect dev ens1f0_0
 $ dmesg
 ...
 [ 2456.883693] mlx5_core 0000:82:00.0: mlx5_cmd_check:756:(pid 5273): SET_FLOW_TABLE_ENTRY(0x936) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0xa9c090)
Use the correct enum value MLX5_FLOW_CONTEXT_FLOW_SOURCE_UPLINK.

Fixes: bb204dcf39fe ("net/mlx5e: Determine source port properly for vlan push action")
Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com>
Reviewed-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-11-06 14:03:55 -08:00
Alex Vesker 260986fcff net/mlx5: DR, Fix memory leak during rule creation
During rule creation hw_ste_arr was not freed.

Fixes: 41d0707415 ("net/mlx5: DR, Expose steering rule functionality")
Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-11-06 14:03:54 -08:00
Alex Vesker 22f83150f0 net/mlx5: DR, Fix memory leak in modify action destroy
The rewrite data was no freed.

Fixes: 9db810ed2d ("net/mlx5: DR, Expose steering action functionality")
Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-11-06 14:03:54 -08:00
Roi Dayan f382b0df69 net/mlx5e: Fix eswitch debug print of max fdb flow
The value is already the calculation so remove the log prefix.

Fixes: e52c280240 ("net/mlx5: E-Switch, Add chains and priorities")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Eli Britstein <elibr@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-11-06 14:03:54 -08:00
David S. Miller cc59dbcc5d Merge branch 'net-bcmgenet-restore-internal-EPHY-support'
Doug Berger says:

====================
net: bcmgenet: restore internal EPHY support (part 2)

This is a follow up to my previous submission (see [1]).

The first commit provides what is intended to be a complete solution
for the issues that can result from insufficient clocking of the MAC
during reset of its state machines. It should be backported to the
stable releases.

It is intended to replace the partial solution of commit 1f51548627
("net: bcmgenet: soft reset 40nm EPHYs before MAC init") which is
reverted by the second commit of this series and should not be back-
ported as noted in [2].

The third commit corrects a timing hazard with a polled PHY that can
occur when the MAC resumes and also when a v3 internal EPHY is reset
by the change in commit 25382b991d ("net: bcmgenet: reset 40nm EPHY
on energy detect"). It is expected that commit 25382b991d be back-
ported to stable first before backporting this commit.

[1] https://lkml.org/lkml/2019/10/16/1706
[2] https://lkml.org/lkml/2019/10/31/749
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-06 10:46:22 -08:00
Doug Berger 0686bd9d5e net: bcmgenet: reapply manual settings to the PHY
The phy_init_hw() function may reset the PHY to a configuration
that does not match manual network settings stored in the phydev
structure. If the phy state machine is polled rather than event
driven this can create a timing hazard where the phy state machine
might alter the settings stored in the phydev structure from the
value read from the BMCR.

This commit follows invocations of phy_init_hw() by the bcmgenet
driver with invocations of the genphy_config_aneg() function to
ensure that the BMCR is written to match the settings held in the
phydev structure. This prevents the risk of manual settings being
accidentally altered.

Fixes: 1c1008c793 ("net: bcmgenet: add main driver file")
Signed-off-by: Doug Berger <opendmb@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-06 10:46:22 -08:00
Doug Berger 6b6d017fcc Revert "net: bcmgenet: soft reset 40nm EPHYs before MAC init"
This reverts commit 1f51548627.

This commit improved the chances of the umac resetting cleanly by
ensuring that the PHY was restored to its normal operation prior
to resetting the umac. However, there were still cases when the
PHY might not be driving a Tx clock to the umac during this window
(e.g. when the PHY detects no link).

The previous commit now ensures that the unimac receives clocks
from the MAC during its reset window so this commit is no longer
needed. This commit also has an unintended negative impact on the
MDIO performance of the UniMAC MDIO interface because it is used
before the MDIO interrupts are reenabled, so it should be removed.

Signed-off-by: Doug Berger <opendmb@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-06 10:46:22 -08:00
Doug Berger 3a55402c93 net: bcmgenet: use RGMII loopback for MAC reset
As noted in commit 28c2d1a7a0 ("net: bcmgenet: enable loopback
during UniMAC sw_reset") the UniMAC must be clocked while sw_reset
is asserted for its state machines to reset cleanly.

The transmit and receive clocks used by the UniMAC are derived from
the signals used on its PHY interface. The bcmgenet MAC can be
configured to work with different PHY interfaces including MII,
GMII, RGMII, and Reverse MII on internal and external interfaces.
Unfortunately for the UniMAC, when configured for MII the Tx clock
is always driven from the PHY which places it outside of the direct
control of the MAC.

The earlier commit enabled a local loopback mode within the UniMAC
so that the receive clock would be derived from the transmit clock
which addressed the observed issue with an external GPHY disabling
it's Rx clock. However, when a Tx clock is not available this
loopback is insufficient.

This commit implements a workaround that leverages the fact that
the MAC can reliably generate all of its necessary clocking by
enterring the external GPHY RGMII interface mode with the UniMAC in
local loopback during the sw_reset interval. Unfortunately, this
has the undesirable side efect of the RGMII GTXCLK signal being
driven during the same window.

In most configurations this is a benign side effect as the signal
is either not routed to a pin or is already expected to drive the
pin. The one exception is when an external MII PHY is expected to
drive the same pin with its TX_CLK output creating output driver
contention.

This commit exploits the IEEE 802.3 clause 22 standard defined
isolate mode to force an external MII PHY to present a high
impedance on its TX_CLK output during the window to prevent any
contention at the pin.

The MII interface is used internally with the 40nm internal EPHY
which agressively disables its clocks for power savings leading to
incomplete resets of the UniMAC and many instabilities observed
over the years. The workaround of this commit is expected to put
an end to those problems.

Fixes: 1c1008c793 ("net: bcmgenet: add main driver file")
Signed-off-by: Doug Berger <opendmb@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-06 10:46:21 -08:00
Tariq Toukan 2836654a27 Documentation: TLS: Add missing counter description
Add TLS TX counter description for the handshake retransmitted
packets that triggers the resync procedure then skip it, going
into the regular transmit flow.

Fixes: 46a3ea9807 ("net/mlx5e: kTLS, Enhance TX resync flow")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-05 18:34:06 -08:00