1
0
Fork 0
Commit Graph

798406 Commits (ea010070d0a7497253d5a6f919f6dd107450b31a)

Author SHA1 Message Date
shamir rabinovitch ea010070d0 net/rds: fix warn in rds_message_alloc_sgs
redundant copy_from_user in rds_sendmsg system call expose rds
to issue where rds_rdma_extra_size walk the rds iovec and and
calculate the number pf pages (sgs) it need to add to the tail of
rds message and later rds_cmsg_rdma_args copy the rds iovec again
and re calculate the same number and get different result causing
WARN_ON in rds_message_alloc_sgs.

fix this by doing the copy_from_user only once per rds_sendmsg
system call.

When issue occur the below dump is seen:

WARNING: CPU: 0 PID: 19789 at net/rds/message.c:316 rds_message_alloc_sgs+0x10c/0x160 net/rds/message.c:316
Kernel panic - not syncing: panic_on_warn set ...
CPU: 0 PID: 19789 Comm: syz-executor827 Not tainted 4.19.0-next-20181030+ #101
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x244/0x39d lib/dump_stack.c:113
 panic+0x2ad/0x55c kernel/panic.c:188
 __warn.cold.8+0x20/0x45 kernel/panic.c:540
 report_bug+0x254/0x2d0 lib/bug.c:186
 fixup_bug arch/x86/kernel/traps.c:178 [inline]
 do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:271
 do_invalid_op+0x36/0x40 arch/x86/kernel/traps.c:290
 invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:969
RIP: 0010:rds_message_alloc_sgs+0x10c/0x160 net/rds/message.c:316
Code: c0 74 04 3c 03 7e 6c 44 01 ab 78 01 00 00 e8 2b 9e 35 fa 4c 89 e0 48 83 c4 08 5b 41 5c 41 5d 41 5e 41 5f 5d c3 e8 14 9e 35 fa <0f> 0b 31 ff 44 89 ee e8 18 9f 35 fa 45 85 ed 75 1b e8 fe 9d 35 fa
RSP: 0018:ffff8801c51b7460 EFLAGS: 00010293
RAX: ffff8801bc412080 RBX: ffff8801d7bf4040 RCX: ffffffff8749c9e6
RDX: 0000000000000000 RSI: ffffffff8749ca5c RDI: 0000000000000004
RBP: ffff8801c51b7490 R08: ffff8801bc412080 R09: ffffed003b5c5b67
R10: ffffed003b5c5b67 R11: ffff8801dae2db3b R12: 0000000000000000
R13: 000000000007165c R14: 000000000007165c R15: 0000000000000005
 rds_cmsg_rdma_args+0x82d/0x1510 net/rds/rdma.c:623
 rds_cmsg_send net/rds/send.c:971 [inline]
 rds_sendmsg+0x19a2/0x3180 net/rds/send.c:1273
 sock_sendmsg_nosec net/socket.c:622 [inline]
 sock_sendmsg+0xd5/0x120 net/socket.c:632
 ___sys_sendmsg+0x7fd/0x930 net/socket.c:2117
 __sys_sendmsg+0x11d/0x280 net/socket.c:2155
 __do_sys_sendmsg net/socket.c:2164 [inline]
 __se_sys_sendmsg net/socket.c:2162 [inline]
 __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2162
 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x44a859
Code: e8 dc e6 ff ff 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 6b cb fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f1d4710ada8 EFLAGS: 00000297 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00000000006dcc28 RCX: 000000000044a859
RDX: 0000000000000000 RSI: 0000000020001600 RDI: 0000000000000003
RBP: 00000000006dcc20 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000297 R12: 00000000006dcc2c
R13: 646e732f7665642f R14: 00007f1d4710b9c0 R15: 00000000006dcd2c
Kernel Offset: disabled
Rebooting in 86400 seconds..

Reported-by: syzbot+26de17458aeda9d305d8@syzkaller.appspotmail.com
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: shamir rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 10:27:58 -08:00
David S. Miller c6f4075e2f wireless-drivers fixes for 4.20
Last set of fixes for 4.20. All (except the mt76 fix) of these are
 important fixes to user reported problems and pretty small in size.
 
 rtlwifi
 
 * fix skb leak
 
 mwifiex
 
 * revert a commit from v4.19 due to problems with locking
 
 mt76
 
 * fix a potential NULL derenfence
 
 * add entry to MAINTAINERS
 
 iwlwifi
 
 * fix a firmware crash which was a regression introduced in v4.20-rc4
 
 ath10k
 
 * fix a firmware crash with wcn3990 firmware
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJcGlNvAAoJEG4XJFUm622boJ0H/AjDASkdxSG4n+s1wCmsUizE
 jK9YwS2dOc1oVSzGPm35GccFgkc4x1IpS1qoTo1rykVD7Cn6G7mj6eqy8vTiJ24S
 6HPhdkS5YynWvYfbQmuhhhaQ6ew8/dJzqhPJAH1/kPfDsy08qUaVl5tGw4CwP4Pp
 Ufvrn8VNmRbrAg9Jq28+xmANNURtKNtHmXI5WWuSvA4LzkdEz2aebG8ynOwTrVW6
 f6NHTDeJ5tWrswr0EWGjBbGRfb8TlacOxqzytrFYFuYHH3WiqCQZdsvRBazvryqJ
 szrYMlI+bSN4m3ZEAtEWlZYTU1YNPwbRDXc+GYkbDr19YKevHxi+46SvXa36cJM=
 =ECaW
 -----END PGP SIGNATURE-----

Merge tag 'wireless-drivers-for-davem-2018-12-19' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers

Kalle Valo says:

====================
wireless-drivers fixes for 4.20

Last set of fixes for 4.20. All (except the mt76 fix) of these are
important fixes to user reported problems and pretty small in size.

rtlwifi

* fix skb leak

mwifiex

* revert a commit from v4.19 due to problems with locking

mt76

* fix a potential NULL derenfence

* add entry to MAINTAINERS

iwlwifi

* fix a firmware crash which was a regression introduced in v4.20-rc4

ath10k

* fix a firmware crash with wcn3990 firmware
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 08:39:18 -08:00
David S. Miller 49ce708be6 Just three fixes:
* fix a memory leak in an error path
  * fix TXQs in interface teardown
  * free fraglist if we used it internally
    before returning SKB
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEH1e1rEeCd0AIMq6MB8qZga/fl8QFAlwaBJAACgkQB8qZga/f
 l8QvBg/9ESMGPa4F+AdXfvwlDWHkb3fBuv8E1HAkiDiV3G0eyziOwqIo4mSD49fT
 tSM+3AFBe9D97O0Y669qnKrcqSWVpGY31hI2MmxskcwwJ+BVgf9M33GjvHa548Bt
 c3DEKVLDIfOfQG9+LySeNNz8kRipqIe11WDjpoHZ1i2LtV/uEGOC+cSAxTRwqxpG
 HyjF2GBz24PVz69eGNODQNIWgIkkGCil8b8BCED94jUio015t7H+b3hzWDDC83m1
 pJwK/Uq4UkJzpjQykSf9805blB0h7PdXo8CSrNNw9+PnEVcvu+81IqD0q5VfZL1a
 CePmM43ud59kx1DtPQ0LB4XSIJX7PPiMskK9ZoPUduDvOsaJU32gFmK/Rvm45Xdt
 8mQrZM8EsBLCFq8M/Gk89QadXByItUD8B6jZcJgMEddUOa5QPnCaodTQqqO2I8ky
 T76FXRegTOMFeGhr/NrWAw3xxCY8TZqwB4P9F4juoKCa9Wz0b1dbIrD7nx1SajoA
 jhb795CTczMKCzFywNIh97fKa2yp0YQO0/EQDOFrbYxEyQozSr+cudU4EBN6B08i
 A4rVRAAPTlwfFz5TSr7gvT/JHlsku+kfjFAY1tmSWxLgWxFJb3cc0Uj1ak05bmv4
 fdANk9g81U4W/bUBKqmeIIrYW0icXHtGZBLX0Khpcx3KQSH11vw=
 =X7vr
 -----END PGP SIGNATURE-----

Merge tag 'mac80211-for-davem-2018-12-19' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211

Johannes Berg says:

====================
Just three fixes:
 * fix a memory leak in an error path
 * fix TXQs in interface teardown
 * free fraglist if we used it internally
   before returning SKB
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 08:34:46 -08:00
Rakesh Pillai 53884577fb ath10k: skip sending quiet mode cmd for WCN3990
HL2.0 firmware does not support setting quiet mode.  If the host driver sends
the quiet mode setting command to the HL2.0 firmware, it crashes with the below
signature.

fatal error received: err_qdi.c:456:EX:wlan_process:1:WLAN RT:207a:PC=b001b4f0

The quiet mode command support is exposed by the firmware via thermal throttle
wmi service. Enable ath10k thermal support if thermal throttle wmi service bit
is set.  10.x firmware versions support this feature by default, but
unfortunately do not advertise the support via service flags, hence have to
manually set the service flag in ath10k_core_compat_services().

Tested on QCA988X with 10.2.4.70.9-2. Also tested on WCN3990.

Co-developed-by: Govind Singh <govinds@codeaurora.org>
Co-developed-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Rakesh Pillai <pillair@codeaurora.org>
Signed-off-by: Govind Singh <govinds@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-12-19 15:37:01 +02:00
Sara Sharon 34b1e0e9ef mac80211: free skb fraglist before freeing the skb
mac80211 uses the frag list to build AMSDU. When freeing
the skb, it may not be really freed, since someone is still
holding a reference to it.
In that case, when TCP skb is being retransmitted, the
pointer to the frag list is being reused, while the data
in there is no longer valid.
Since we will never get frag list from the network stack,
as mac80211 doesn't advertise the capability, we can safely
free and nullify it before releasing the SKB.

Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-12-19 09:40:17 +01:00
Johannes Berg d350a0f431 nl80211: fix memory leak if validate_pae_over_nl80211() fails
If validate_pae_over_nl80211() were to fail in nl80211_crypto_settings(),
we might leak the 'connkeys' allocation. Fix this.

Fixes: 64bf3d4bc2 ("nl80211: Add CONTROL_PORT_OVER_NL80211 attribute")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-12-19 09:40:17 +01:00
David S. Miller 3061169a47 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Alexei Starovoitov says:

====================
pull-request: bpf 2018-12-18

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) promote bpf_perf_event.h to mandatory UAPI header, from Masahiro.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 22:30:33 -08:00
Myungho Jung 78abe3d0df net/smc: fix TCP fallback socket release
clcsock can be released while kernel_accept() references it in TCP
listen worker. Also, clcsock needs to wake up before released if TCP
fallback is used and the clcsock is blocked by accept. Add a lock to
safely release clcsock and call kernel_sock_shutdown() to wake up
clcsock from accept in smc_release().

Reported-by: syzbot+0bf2e01269f1274b4b03@syzkaller.appspotmail.com
Reported-by: syzbot+e3132895630f957306bc@syzkaller.appspotmail.com
Signed-off-by: Myungho Jung <mhjungk@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 22:02:51 -08:00
Colin Ian King f7db2beb4c vxge: ensure data0 is initialized in when fetching firmware version information
Currently variable data0 is not being initialized so a garbage value is
being passed to vxge_hw_vpath_fw_api and this value is being written to
the rts_access_steer_data0 register.  There are other occurrances where
data0 is being initialized to zero (e.g. in function
vxge_hw_upgrade_read_version) so I think it makes sense to ensure data0
is initialized likewise to 0.

Detected by CoverityScan, CID#140696 ("Uninitialized scalar variable")

Fixes: 8424e00dfd ("vxge: serialize access to steering control register")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 22:00:40 -08:00
Juergen Gross d81c5054a5 xen/netfront: tolerate frags with no data
At least old Xen net backends seem to send frags with no real data
sometimes. In case such a fragment happens to occur with the frag limit
already reached the frontend will BUG currently even if this situation
is easily recoverable.

Modify the BUG_ON() condition accordingly.

Tested-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 21:59:20 -08:00
Kunihiko Hayashi 8742beb50f net: phy: Fix the issue that netif always links up after resuming
Even though the link is down before entering hibernation,
there is an issue that the network interface always links up after resuming
from hibernation.

If the link is still down before enabling the network interface,
and after resuming from hibernation, the phydev->state is forcibly set
to PHY_UP in mdio_bus_phy_restore(), and the link becomes up.

In suspend sequence, only if the PHY is attached, mdio_bus_phy_suspend()
calls phy_stop_machine(), and mdio_bus_phy_resume() calls
phy_start_machine().
In resume sequence, it's enough to do the same as mdio_bus_phy_resume()
because the state has been preserved.

This patch fixes the issue by calling phy_start_machine() in
mdio_bus_phy_restore() in the same way as mdio_bus_phy_resume().

Fixes: bc87922ff5 ("phy: Move PHY PM operations into phy_device")
Suggested-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 21:48:15 -08:00
Jason Martinsen 15515aaaa6 lan78xx: Resolve issue with changing MAC address
Current state for the lan78xx driver does not allow for changing the
MAC address of the interface, without either removing the module (if
you compiled it that way) or rebooting the machine.  If you attempt to
change the MAC address, ifconfig will show the new address, however,
the system/interface will not respond to any traffic using that
configuration.  A few short-term options to work around this are to
unload the module and reload it with the new MAC address, change the
interface to "promisc", or reboot with the correct configuration to
change the MAC.

This patch enables the ability to change the MAC address via fairly normal means...
ifdown <interface>
modify entry in /etc/network/interfaces OR a similar method
ifup <interface>
Then test via any network communication, such as ICMP requests to gateway.

My only test platform for this patch has been a raspberry pi model 3b+.

Signed-off-by: Jason Martinsen <jasonmartinsen@msn.com>

-----

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 21:40:59 -08:00
Bryan Whitehead 0db7d253e9 lan743x: Expand phy search for LAN7431
The LAN7431 uses an external phy, and it can be found anywhere in
the phy address space. This patch uses phy address 1 for LAN7430
only. And searches all addresses otherwise.

Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 21:35:56 -08:00
David S. Miller 59fc137ebd Merge branch 'vxlan-Various-fixes'
Petr Machata says:

====================
vxlan: Various fixes

This patch set contains three fixes for the vxlan driver.

Patch #1 fixes handling of offload mark on replaced VXLAN FDB entries. A
way to trigger this is to replace the FDB entry with one that can not be
offloaded. A future patch set should make it possible to veto such FDB
changes. However the FDB might still fail to be offloaded due to another
issue, and the offload mark should reflect that.

Patch #2 fixes problems in __vxlan_dev_create() when a call to
rtnl_configure_link() fails. These failures would be tricky to hit on a
real system, the most likely vector is through an error in vxlan_open().
However, with the abovementioned vetoing patchset, vetoing the created
entry would trigger the same problems (and be easier to reproduce).

Patch #3 fixes a problem in vxlan_changelink(). In situations where the
default remote configured in the FDB table (if any) does not exactly
match the remote address configured at the VXLAN device, changing the
remote address breaks the default FDB entry. Patch #4 is then a self
test for this issue.

v3:
- Patch #2:
    - Reuse the same errout block for both cleanup paths. Use a bool to
      decide whether the unregister_netdevice() call should be made.

v2:
- Drop former patch #3
- Patch #2:
    - Delete the default entry before calling unregister_netdevice(). That
      takes care of former patch #3, hence tweak the commit message to
      mention that problem as well.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 21:18:26 -08:00
Petr Machata 55cbe07942 selftests: net: Add test_vxlan_fdb_changelink.sh
Add a test to exercise the fix from the previous patch.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 21:18:26 -08:00
Petr Machata ce5e098f7a vxlan: changelink: Fix handling of default remotes
Default remotes are stored as FDB entries with an Ethernet address of
00:00:00:00:00:00. When a request is made to change a remote address of
a VXLAN device, vxlan_changelink() first deletes the existing default
remote, and then creates a new FDB entry.

This works well as long as the list of default remotes matches exactly
the configuration of a VXLAN remote address. Thus when the VXLAN device
has a remote of X, there should be exactly one default remote FDB entry
X. If the VXLAN device has no remote address, there should be no such
entry.

Besides using "ip link set", it is possible to manipulate the list of
default remotes by using the "bridge fdb". It is therefore easy to break
the above condition. Under such circumstances, the __vxlan_fdb_delete()
call doesn't delete the FDB entry itself, but just one remote. The
following vxlan_fdb_create() then creates a new FDB entry, leading to a
situation where two entries exist for the address 00:00:00:00:00:00,
each with a different subset of default remotes.

An even more obvious breakage rooted in the same cause can be observed
when a remote address is configured for a VXLAN device that did not have
one before. In that case vxlan_changelink() doesn't remove any remote,
and just creates a new FDB entry for the new address:

$ ip link add name vx up type vxlan id 2000 dstport 4789
$ bridge fdb ap dev vx 00:00:00:00:00:00 dst 192.0.2.20 self permanent
$ bridge fdb ap dev vx 00:00:00:00:00:00 dst 192.0.2.30 self permanent
$ ip link set dev vx type vxlan remote 192.0.2.30
$ bridge fdb sh dev vx | grep 00:00:00:00:00:00
00:00:00:00:00:00 dst 192.0.2.30 self permanent <- new entry, 1 rdst
00:00:00:00:00:00 dst 192.0.2.20 self permanent <- orig. entry, 2 rdsts
00:00:00:00:00:00 dst 192.0.2.30 self permanent

To fix this, instead of calling vxlan_fdb_create() directly, defer to
vxlan_fdb_update(). That has logic to handle the duplicates properly.
Additionally, it also handles notifications, so drop that call from
changelink as well.

Fixes: 0241b83673 ("vxlan: fix default fdb entry netlink notify ordering during netdev create")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 21:18:26 -08:00
Petr Machata 6db9246871 vxlan: Fix error path in __vxlan_dev_create()
When a failure occurs in rtnl_configure_link(), the current code
calls unregister_netdevice() to roll back the earlier call to
register_netdevice(), and jumps to errout, which calls
vxlan_fdb_destroy().

However unregister_netdevice() calls transitively ndo_uninit, which is
vxlan_uninit(), and that already takes care of deleting the default FDB
entry by calling vxlan_fdb_delete_default(). Since the entry added
earlier in __vxlan_dev_create() is exactly the default entry, the
cleanup code in the errout block always leads to double free and thus a
panic.

Besides, since vxlan_fdb_delete_default() always destroys the FDB entry
with notification enabled, the deletion of the default entry is notified
even before the addition was notified.

Instead, move the unregister_netdevice() call after the manual destroy,
which solves both problems.

Fixes: 0241b83673 ("vxlan: fix default fdb entry netlink notify ordering during netdev create")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 21:18:25 -08:00
Petr Machata 6ad0b5a4e0 vxlan: Unmark offloaded bit on replaced FDB entries
When rdst of an offloaded FDB entry is replaced, it certainly isn't
offloaded anymore. Drivers are notified about such replacements, and can
re-mark the entry as offloaded again if they so wish. However until a
driver does so explicitly, assume a replaced FDB entry is not offloaded.

Note that replaces coming via vxlan_fdb_external_learn_add() are always
immediately followed by an explicit offload marking.

Fixes: 0efe117333 ("vxlan: Support marking RDSTs as offloaded")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 21:18:25 -08:00
David S. Miller a9d6d897f3 Merge branch 'macb-DMA-race-fixes'
Anssi Hannula says:

====================
net: macb: DMA race condition fixes

Here are a couple of race condition fixes for the macb driver. The first
two are for issues observed at runtime on real HW.

v2:
- added received Tested-bys and Acked-bys to the first two patches
- in patch 3/3, moved the timestamp protection barrier closer to the
  timestamp reads
- in patch 3/3, removed unnecessary move of the addr assignment in
  gem_rx() to keep the patch minimal for maximum clarity
- in patch 3/3, clarified commit message and comments

The 3/3 is the same one I improperly sent last week as a standalone
patch.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 16:17:49 -08:00
Anssi Hannula 6e0af29806 net: macb: add missing barriers when reading descriptors
When reading buffer descriptors on RX or on TX completion, an
RX_USED/TX_USED bit is checked first to ensure that the descriptors have
been populated, i.e. the ownership has been transferred. However, there
are no memory barriers to ensure that the data protected by the
RX_USED/TX_USED bit is up-to-date with respect to that bit.

Specifically:

- TX timestamp descriptors may be loaded before ctrl is loaded for the
  TX_USED check, which is racy as the descriptors may be updated between
  the loads, causing old timestamp descriptor data to be used.

- RX ctrl may be loaded before addr is loaded for the RX_USED check,
  which is racy as a new frame may be written between the loads, causing
  old ctrl descriptor data to be used.
  This issue exists for both macb_rx() and gem_rx() variants.

Fix the races by adding DMA read memory barriers on those paths and
reordering the reads in macb_rx().

I have not observed any actual problems in practice caused by these
being missing, though.

Tested on a ZynqMP based system.

Fixes: 89e5785fc8 ("[PATCH] Atmel MACB ethernet driver")
Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
Cc: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 16:17:48 -08:00
Anssi Hannula 8159ecab0d net: macb: fix dropped RX frames due to a race
Bit RX_USED set to 0 in the address field allows the controller to write
data to the receive buffer descriptor.

The driver does not ensure the ctrl field is ready (cleared) when the
controller sees the RX_USED=0 written by the driver. The ctrl field might
only be cleared after the controller has already updated it according to
a newly received frame, causing the frame to be discarded in gem_rx() due
to unexpected ctrl field contents.

A message is logged when the above scenario occurs:

  macb ff0b0000.ethernet eth0: not whole frame pointed by descriptor

Fix the issue by ensuring that when the controller sees RX_USED=0 the
ctrl field is already cleared.

This issue was observed on a ZynqMP based system.

Fixes: 4df95131ea ("net/macb: change RX path for GEM")
Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
Tested-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Cc: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 16:17:48 -08:00
Anssi Hannula e100a897bf net: macb: fix random memory corruption on RX with 64-bit DMA
64-bit DMA addresses are split in upper and lower halves that are
written in separate fields on GEM. For RX, bit 0 of the address is used
as the ownership bit (RX_USED). When the RX_USED bit is unset the
controller is allowed to write data to the buffer.

The driver does not guarantee that the controller already sees the upper
half when the RX_USED bit is cleared, possibly resulting in the
controller writing an incoming frame to an address with an incorrect
upper half and therefore possibly corrupting unrelated system memory.

Fix that by adding the necessary DMA memory barrier between the writes.

This corruption was observed on a ZynqMP based system.

Fixes: fff8019a08 ("net: macb: Add 64 bit addressing support for GEM")
Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
Acked-by: Harini Katakam <harini.katakam@xilinx.com>
Tested-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Cc: Nicolas Ferre <nicolas.ferre@microchip.com>
Cc: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 16:17:48 -08:00
Davide Caratti e2c4cf7f98 net: Use __kernel_clockid_t in uapi net_stamp.h
Herton reports the following error when building a userspace program that
includes net_stamp.h:

 In file included from foo.c:2:
 /usr/include/linux/net_tstamp.h:158:2: error: unknown type name
 ‘clockid_t’
   clockid_t clockid; /* reference clockid */
   ^~~~~~~~~

Fix it by using __kernel_clockid_t in place of clockid_t.

Fixes: 80b14dee2b ("net: Add a new socket option for a future transmit time.")
Cc: Timothy Redaelli <tredaelli@redhat.com>
Reported-by: Herton R. Krzesinski <herton@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Tested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 15:59:29 -08:00
Claudiu Beznea 4298388574 net: macb: restart tx after tx used bit read
On some platforms (currently detected only on SAMA5D4) TX might stuck
even the pachets are still present in DMA memories and TX start was
issued for them. This happens due to race condition between MACB driver
updating next TX buffer descriptor to be used and IP reading the same
descriptor. In such a case, the "TX USED BIT READ" interrupt is asserted.
GEM/MACB user guide specifies that if a "TX USED BIT READ" interrupt
is asserted TX must be restarted. Restart TX if used bit is read and
packets are present in software TX queue. Packets are removed from software
TX queue if TX was successful for them (see macb_tx_interrupt()).

Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 15:57:07 -08:00
Dan Carpenter b26322d2ac net: stmmac: Fix an error code in probe()
The function should return an error if create_singlethread_workqueue()
fails.

Fixes: 34877a15f7 ("net: stmmac: Rework and fix TX Timeout code")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 15:45:28 -08:00
Cong Wang 3c6306d440 tipc: check group dests after tipc_wait_for_cond()
Similar to commit 143ece654f ("tipc: check tsk->group in tipc_wait_for_cond()")
we have to reload grp->dests too after we re-take the sock lock.
This means we need to move the dsts check after tipc_wait_for_cond()
too.

Fixes: 75da2163db ("tipc: introduce communication groups")
Reported-and-tested-by: syzbot+99f20222fc5018d2b97a@syzkaller.appspotmail.com
Cc: Ying Xue <ying.xue@windriver.com>
Cc: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 15:44:23 -08:00
Dan Carpenter f07d427689 qed: Fix an error code qed_ll2_start_xmit()
We accidentally deleted the code to set "rc = -ENOMEM;" and this patch
adds it back.

Fixes: d2201a2159 ("qed: No need for LL2 frags indication")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 15:43:44 -08:00
Antoine Tenart 0067917720 net: mvpp2: 10G modes aren't supported on all ports
The mvpp2_phylink_validate() function sets all modes that are
supported by a given PPv2 port. A recent change made all ports to
advertise they support 10G modes in certain cases. This is not true,
as only the port #0 can do so. This patch fixes it.

Fixes: 01b3fd5ac9 ("net: mvpp2: fix detection of 10G SFP modules")
Cc: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 14:48:15 -08:00
Jorgen Hansen a915b982d8 VSOCK: Send reset control packet when socket is partially bound
If a server side socket is bound to an address, but not in the listening
state yet, incoming connection requests should receive a reset control
packet in response. However, the function used to send the reset
silently drops the reset packet if the sending socket isn't bound
to a remote address (as is the case for a bound socket not yet in
the listening state). This change fixes this by using the src
of the incoming packet as destination for the reset packet in
this case.

Fixes: d021c34405 ("VSOCK: Introduce VM Sockets")
Reviewed-by: Adit Ranadive <aditr@vmware.com>
Reviewed-by: Vishnu Dasa <vdasa@vmware.com>
Signed-off-by: Jorgen Hansen <jhansen@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 11:53:42 -08:00
David S. Miller fde9cd69a5 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:

====================
pull request (net): ipsec 2018-12-18

1) Fix error return code in xfrm_output_one()
   when no dst_entry is attached to the skb.
   From Wei Yongjun.

2) The xfrm state hash bucket count reported to
   userspace is off by one. Fix from Benjamin Poirier.

3) Fix NULL pointer dereference in xfrm_input when
   skb_dst_force clears the dst_entry.

4) Fix freeing of xfrm states on acquire. We use a
   dedicated slab cache for the xfrm states now,
   so free it properly with kmem_cache_free.
   From Mathias Krause.

Please pull or let me know if there are problems.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 11:43:26 -08:00
David S. Miller 8d013b7910 Merge branch 'mlxsw-VXLAN-and-firmware-flashing-fixes'
Ido Schimmel says:

====================
mlxsw: VXLAN and firmware flashing fixes

Patch #1 fixes firmware flashing failures by increasing the time period
after which the driver fails the transaction with the firmware. The
problem is explained in detail in the commit message.

Patch #2 adds a missing trap for decapsulated ARP packets. It is
necessary for VXLAN routing to work.

Patch #3 fixes a memory leak during driver reload caused by NULLing a
pointer before kfree().

Please consider patch #1 for 4.19.y
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 09:17:39 -08:00
Ido Schimmel 5edb7e8bd5 mlxsw: spectrum_nve: Fix memory leak upon driver reload
The pointer was NULLed before freeing the memory, resulting in a memory
leak. Trace from kmemleak:

unreferenced object 0xffff88820ae36528 (size 512):
  comm "devlink", pid 5374, jiffies 4295354033 (age 10829.296s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<00000000a43f5195>] kmem_cache_alloc_trace+0x1be/0x330
    [<00000000312f8140>] mlxsw_sp_nve_init+0xcb/0x1ae0
    [<0000000009201d22>] mlxsw_sp_init+0x1382/0x2690
    [<000000007227d877>] mlxsw_sp1_init+0x1b5/0x260
    [<000000004a16feec>] __mlxsw_core_bus_device_register+0x776/0x1360
    [<0000000070ab954c>] mlxsw_devlink_core_bus_device_reload+0x129/0x220
    [<00000000432313d5>] devlink_nl_cmd_reload+0x119/0x1e0
    [<000000003821a06b>] genl_family_rcv_msg+0x813/0x1150
    [<00000000d54d04c0>] genl_rcv_msg+0xd1/0x180
    [<0000000040543d12>] netlink_rcv_skb+0x152/0x3c0
    [<00000000efc4eae8>] genl_rcv+0x2d/0x40
    [<00000000ea645603>] netlink_unicast+0x52f/0x740
    [<00000000641fca1a>] netlink_sendmsg+0x9c7/0xf50
    [<00000000fed4a4b8>] sock_sendmsg+0xbe/0x120
    [<00000000d85795a9>] __sys_sendto+0x397/0x620
    [<00000000c5f84622>] __x64_sys_sendto+0xe6/0x1a0

Fixes: 6e6030bd54 ("mlxsw: spectrum_nve: Implement common NVE core")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 09:17:39 -08:00
Ido Schimmel 5d5043917a mlxsw: spectrum: Add trap for decapsulated ARP packets
After a packet was decapsulated it is classified to the relevant FID
based on its VNI and undergoes L2 forwarding.

Unlike regular (non-encapsulated) ARP packets, Spectrum does not trap
decapsulated ARP packets during L2 forwarding and instead can only trap
such packets in the underlay router during decapsulation.

Add this missing packet trap, which is required for VXLAN routing when
the MAC of the target host is not known.

Fixes: b02597d513 ("mlxsw: spectrum: Add NVE packet traps")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 09:17:38 -08:00
Shalom Toledo cf0b70e71b mlxsw: core: Increase timeout during firmware flash process
During the firmware flash process, some of the EMADs get timed out, which
causes the driver to send them again with a limit of 5 retries. There are
some situations in which 5 retries is not enough and the EMAD access fails.
If the failed EMAD was related to the flashing process, the driver fails
the flashing.

The reason for these timeouts during firmware flashing is cache misses in
the CPU running the firmware. In case the CPU needs to fetch instructions
from the flash when a firmware is flashed, it needs to wait for the
flashing to complete. Since flashing takes time, it is possible for pending
EMADs to timeout.

Fix by increasing EMADs' timeout while flashing firmware.

Fixes: ce6ef68f43 ("mlxsw: spectrum: Implement the ethtool flash_device callback")
Signed-off-by: Shalom Toledo <shalomt@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 09:17:38 -08:00
Sara Sharon a50e5fb8db mac80211: fix a kernel panic when TXing after TXQ teardown
Recently TXQ teardown was moved earlier in ieee80211_unregister_hw(),
to avoid a use-after-free of the netdev data. However, interfaces
aren't fully removed at the point, and cfg80211_shutdown_all_interfaces
can for example, TX a deauth frame. Move the TXQ teardown to the
point between cfg80211_shutdown_all_interfaces and the free of
netdev queues, so we can be sure they are torn down before netdev
is freed, but after there is no ongoing TX.

Fixes: 77cfaf52ec ("mac80211: Run TXQ teardown code before de-registering interfaces")
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-12-18 13:03:32 +01:00
Vivien Didelot a5f3932646 net: dsa: mv88e6xxx: set ethtool regs version
Currently the ethtool_regs version is set to 0 for all DSA drivers.

Use this field to store the chip ID to simplify the pretty dump of
any interfaces registered by the "dsa" driver.

Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-17 23:29:00 -08:00
David S. Miller b33299017c Merge branch 'net-SO_TIMESTAMPING-fixes'
Willem de Bruijn says:

====================
net: SO_TIMESTAMPING fixes

Fix two omissions:

- tx timestamping is missing for AF_INET6/SOCK_RAW/IPPROTO_RAW
- SOF_TIMESTAMPING_OPT_ID is missing for IPPROTO_RAW, PF_PACKET, CAN

Discovered while expanding the selftest in

  tools/testing/selftests/networking/timestamping/txtimestamp.c

Will send the test patchset to net-next once the fixes make it to that
branch. For now, it is available at

  https://github.com/wdebruij/linux/commits/txtimestamp-test-1
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-17 23:27:01 -08:00
Willem de Bruijn 8f932f762e net: add missing SOF_TIMESTAMPING_OPT_ID support
SOF_TIMESTAMPING_OPT_ID is supported on TCP, UDP and RAW sockets.
But it was missing on RAW with IPPROTO_IP, PF_PACKET and CAN.

Add skb_setup_tx_timestamp that configures both tx_flags and tskey
for these paths that do not need corking or use bytestream keys.

Fixes: 09c2d251b7 ("net-timestamp: add key to disambiguate concurrent datagrams")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-17 23:27:00 -08:00
Willem de Bruijn fbfb2321e9 ipv6: add missing tx timestamping on IPPROTO_RAW
Raw sockets support tx timestamping, but one case is missing.

IPPROTO_RAW takes a separate packet construction path. raw_send_hdrinc
has an explicit call to sock_tx_timestamp, but rawv6_send_hdrinc does
not. Add it.

Fixes: 11878b40ed ("net-timestamp: SOCK_RAW and PING timestamping")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-17 23:27:00 -08:00
Vivien Didelot 255fe81a6a MAINTAINERS: change my email address
Make my Gmail address the primary one from now on.

Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-17 15:05:00 -08:00
Masahiro Yamada bcb671c2fa bpf: promote bpf_perf_event.h to mandatory UAPI header
Since commit c895f6f703 ("bpf: correct broken uapi for
BPF_PROG_TYPE_PERF_EVENT program type"), all architectures
(except um) are required to have bpf_perf_event.h in uapi/asm.

Add it to mandatory-y so "make headers_install" can check it.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-12-17 22:00:10 +01:00
Emmanuel Grumbach eca1e56cee iwlwifi: mvm: don't send GEO_TX_POWER_LIMIT to old firmwares
Old firmware versions don't support this command. Sending it
to any firmware before -41.ucode will crash the firmware.

This fixes https://bugzilla.kernel.org/show_bug.cgi?id=201975

Fixes: 66e839030f ("iwlwifi: fix wrong WGDS_WIFI_DATA_SIZE")
CC: <stable@vger.kernel.org> #4.19+
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-12-17 15:15:22 +02:00
Marcin Wojtas e735fd55b9 net: mvneta: fix operation for 64K PAGE_SIZE
Recent changes in the mvneta driver reworked allocation
and handling of the ingress buffers to use entire pages.
Apart from that in SW BM scenario the HW must be informed
via PRXDQS about the biggest possible incoming buffer
that can be propagated by RX descriptors.

The BufferSize field was filled according to the MTU-dependent
pkt_size value. Later change to PAGE_SIZE broke RX operation
when usin 64K pages, as the field is simply too small.

This patch conditionally limits the value passed to the BufferSize
of the PRXDQS register, depending on the PAGE_SIZE used.
On the occasion remove now unused frag_size field of the mvneta_port
structure.

Fixes: 562e2f467e ("net: mvneta: Improve the buffer allocation method for SWBM")
Signed-off-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-16 12:39:59 -08:00
David S. Miller 369a094d50 Merge branch 'hns-fixes'
Peng Li says:

====================
net: hns: Code improvements & fixes for HNS driver

This patchset introduces some code improvements and fixes
for the identified problems in the HNS driver.

Every patch is independent.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-16 12:07:32 -08:00
Yonglong Liu 6adafc356e net: hns: Fix ping failed when use net bridge and send multicast
Create a net bridge, add eth and vnet to the bridge. The vnet is used
by a virtual machine. When ping the virtual machine from the outside
host and the virtual machine send multicast at the same time, the ping
package will lost.

The multicast package send to the eth, eth will send it to the bridge too,
and the bridge learn the mac of eth. When outside host ping the virtual
mechine, it will match the promisc entry of the eth which is not expected,
and the bridge send it to eth not to vnet, cause ping lost.

So this patch change promisc tcam entry position to the END of 512 tcam
entries, which indicate lower priority. And separate one promisc entry to
two: mc & uc, to avoid package match the wrong tcam entry.

Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-16 12:07:32 -08:00
Yonglong Liu 726ae5c9e5 net: hns: Add mac pcs config when enable|disable mac
In some case, when mac enable|disable and adjust link, may cause hard to
link(or abnormal) between mac and phy. This patch adds the code for rx PCS
to avoid this bug.

Disable the rx PCS when driver disable the gmac, and enable the rx PCS
when driver enable the mac.

Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-16 12:07:32 -08:00
Yonglong Liu 7e74a19ca5 net: hns: Fix ntuple-filters status error.
The ntuple-filters features is forced on by chip.
But it shows "ntuple-filters: off [fixed]" when use ethtool.
This patch make it correct with "ntuple-filters: on [fixed]".

Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-16 12:07:32 -08:00
Yonglong Liu a57275d355 net: hns: Avoid net reset caused by pause frames storm
There will be a large number of MAC pause frames on the net,
which caused tx timeout of net device. And then the net device
was reset to try to recover it. So that is not useful, and will
cause some other problems.

So need doubled ndev->watchdog_timeo if device watchdog occurred
until watchdog_timeo up to 40s and then try resetting to recover
it.

When collecting dfx information such as hardware registers when tx timeout.
Some registers for count were cleared when read. So need move this task
before update net state which also read the count registers.

Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-16 12:07:32 -08:00
Yonglong Liu c82bd077e1 net: hns: Free irq when exit from abnormal branch
1.In "hns_nic_init_irq", if request irq fail at index i,
  the function return directly without releasing irq resources
  that already requested.

2.In "hns_nic_net_up" after "hns_nic_init_irq",
  if exceptional branch occurs, irqs that already requested
  are not release.

Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-16 12:07:32 -08:00
Yonglong Liu 31f6b61d81 net: hns: Clean rx fbd when ae stopped.
If there are packets in hardware when changing the speed or duplex,
it may cause hardware hang up.

This patch adds the code to wait rx fbd clean up when ae stopped.

Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-16 12:07:32 -08:00