Commit graph

798361 commits

Author SHA1 Message Date
Vivien Didelot a5f3932646 net: dsa: mv88e6xxx: set ethtool regs version
Currently the ethtool_regs version is set to 0 for all DSA drivers.

Use this field to store the chip ID to simplify the pretty dump of
any interfaces registered by the "dsa" driver.

Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-17 23:29:00 -08:00
David S. Miller b33299017c Merge branch 'net-SO_TIMESTAMPING-fixes'
Willem de Bruijn says:

====================
net: SO_TIMESTAMPING fixes

Fix two omissions:

- tx timestamping is missing for AF_INET6/SOCK_RAW/IPPROTO_RAW
- SOF_TIMESTAMPING_OPT_ID is missing for IPPROTO_RAW, PF_PACKET, CAN

Discovered while expanding the selftest in

  tools/testing/selftests/networking/timestamping/txtimestamp.c

Will send the test patchset to net-next once the fixes make it to that
branch. For now, it is available at

  https://github.com/wdebruij/linux/commits/txtimestamp-test-1
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-17 23:27:01 -08:00
Willem de Bruijn 8f932f762e net: add missing SOF_TIMESTAMPING_OPT_ID support
SOF_TIMESTAMPING_OPT_ID is supported on TCP, UDP and RAW sockets.
But it was missing on RAW with IPPROTO_IP, PF_PACKET and CAN.

Add skb_setup_tx_timestamp that configures both tx_flags and tskey
for these paths that do not need corking or use bytestream keys.

Fixes: 09c2d251b7 ("net-timestamp: add key to disambiguate concurrent datagrams")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-17 23:27:00 -08:00
Willem de Bruijn fbfb2321e9 ipv6: add missing tx timestamping on IPPROTO_RAW
Raw sockets support tx timestamping, but one case is missing.

IPPROTO_RAW takes a separate packet construction path. raw_send_hdrinc
has an explicit call to sock_tx_timestamp, but rawv6_send_hdrinc does
not. Add it.

Fixes: 11878b40ed ("net-timestamp: SOCK_RAW and PING timestamping")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-17 23:27:00 -08:00
Vivien Didelot 255fe81a6a MAINTAINERS: change my email address
Make my Gmail address the primary one from now on.

Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-17 15:05:00 -08:00
Marcin Wojtas e735fd55b9 net: mvneta: fix operation for 64K PAGE_SIZE
Recent changes in the mvneta driver reworked allocation
and handling of the ingress buffers to use entire pages.
Apart from that in SW BM scenario the HW must be informed
via PRXDQS about the biggest possible incoming buffer
that can be propagated by RX descriptors.

The BufferSize field was filled according to the MTU-dependent
pkt_size value. Later change to PAGE_SIZE broke RX operation
when usin 64K pages, as the field is simply too small.

This patch conditionally limits the value passed to the BufferSize
of the PRXDQS register, depending on the PAGE_SIZE used.
On the occasion remove now unused frag_size field of the mvneta_port
structure.

Fixes: 562e2f467e ("net: mvneta: Improve the buffer allocation method for SWBM")
Signed-off-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-16 12:39:59 -08:00
David S. Miller 369a094d50 Merge branch 'hns-fixes'
Peng Li says:

====================
net: hns: Code improvements & fixes for HNS driver

This patchset introduces some code improvements and fixes
for the identified problems in the HNS driver.

Every patch is independent.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-16 12:07:32 -08:00
Yonglong Liu 6adafc356e net: hns: Fix ping failed when use net bridge and send multicast
Create a net bridge, add eth and vnet to the bridge. The vnet is used
by a virtual machine. When ping the virtual machine from the outside
host and the virtual machine send multicast at the same time, the ping
package will lost.

The multicast package send to the eth, eth will send it to the bridge too,
and the bridge learn the mac of eth. When outside host ping the virtual
mechine, it will match the promisc entry of the eth which is not expected,
and the bridge send it to eth not to vnet, cause ping lost.

So this patch change promisc tcam entry position to the END of 512 tcam
entries, which indicate lower priority. And separate one promisc entry to
two: mc & uc, to avoid package match the wrong tcam entry.

Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-16 12:07:32 -08:00
Yonglong Liu 726ae5c9e5 net: hns: Add mac pcs config when enable|disable mac
In some case, when mac enable|disable and adjust link, may cause hard to
link(or abnormal) between mac and phy. This patch adds the code for rx PCS
to avoid this bug.

Disable the rx PCS when driver disable the gmac, and enable the rx PCS
when driver enable the mac.

Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-16 12:07:32 -08:00
Yonglong Liu 7e74a19ca5 net: hns: Fix ntuple-filters status error.
The ntuple-filters features is forced on by chip.
But it shows "ntuple-filters: off [fixed]" when use ethtool.
This patch make it correct with "ntuple-filters: on [fixed]".

Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-16 12:07:32 -08:00
Yonglong Liu a57275d355 net: hns: Avoid net reset caused by pause frames storm
There will be a large number of MAC pause frames on the net,
which caused tx timeout of net device. And then the net device
was reset to try to recover it. So that is not useful, and will
cause some other problems.

So need doubled ndev->watchdog_timeo if device watchdog occurred
until watchdog_timeo up to 40s and then try resetting to recover
it.

When collecting dfx information such as hardware registers when tx timeout.
Some registers for count were cleared when read. So need move this task
before update net state which also read the count registers.

Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-16 12:07:32 -08:00
Yonglong Liu c82bd077e1 net: hns: Free irq when exit from abnormal branch
1.In "hns_nic_init_irq", if request irq fail at index i,
  the function return directly without releasing irq resources
  that already requested.

2.In "hns_nic_net_up" after "hns_nic_init_irq",
  if exceptional branch occurs, irqs that already requested
  are not release.

Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-16 12:07:32 -08:00
Yonglong Liu 31f6b61d81 net: hns: Clean rx fbd when ae stopped.
If there are packets in hardware when changing the speed or duplex,
it may cause hardware hang up.

This patch adds the code to wait rx fbd clean up when ae stopped.

Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-16 12:07:32 -08:00
Yonglong Liu 5778b13b64 net: hns: Fixed bug that netdev was opened twice
After resetting dsaf to try to repair chip error such as ecc error,
the net device will be open if net interface is up. But at this time
if there is the users set the net device up with the command ifconfig,
the net device will be opened twice consecutively.

Function napi_enable was called when open device. And Kernel panic will
be occurred if it was called twice consecutively. Such as follow:
static inline void napi_enable(struct napi_struct *n)
{
         BUG_ON(!test_bit(NAPI_STATE_SCHED, &n->state));
         smp_mb__before_clear_bit();
         clear_bit(NAPI_STATE_SCHED, &n->state);
}

[37255.571996] Kernel panic - not syncing: BUG!
[37255.595234] Call trace:
[37255.597694] [<ffff80000008ab48>] dump_backtrace+0x0/0x1a0
[37255.603114] [<ffff80000008ad08>] show_stack+0x20/0x28
[37255.608187] [<ffff8000009c4944>] dump_stack+0x98/0xb8
[37255.613258] [<ffff8000009c149c>] panic+0x10c/0x26c
[37255.618070] [<ffff80000070f134>] hns_nic_net_up+0x30c/0x4e0
[37255.623664] [<ffff80000070f39c>] hns_nic_net_open+0x94/0x12c
[37255.629346] [<ffff80000084be78>] __dev_open+0xf4/0x168
[37255.634504] [<ffff80000084c1ac>] __dev_change_flags+0x98/0x15c
[37255.640359] [<ffff80000084c29c>] dev_change_flags+0x2c/0x68
[37255.769580] [<ffff8000008dc400>] devinet_ioctl+0x650/0x704
[37255.775086] [<ffff8000008ddc38>] inet_ioctl+0x98/0xb4
[37255.780159] [<ffff800000827b7c>] sock_do_ioctl+0x44/0x84
[37255.785490] [<ffff800000828e04>] sock_ioctl+0x248/0x30c
[37255.790737] [<ffff80000026dc6c>] do_vfs_ioctl+0x480/0x618
[37255.796156] [<ffff80000026de94>] SyS_ioctl+0x90/0xa4
[37255.801139] SMP: stopping secondary CPUs
[37255.805079] kbox: catch panic event.
[37255.809586] collected_len = 128928, LOG_BUF_LEN_LOCAL = 131072
[37255.816103] flush cache 0xffff80003f000000  size 0x800000
[37255.822192] flush cache 0xffff80003f000000  size 0x800000
[37255.828289] flush cache 0xffff80003f000000  size 0x800000
[37255.834378] kbox: no notify die func register. no need to notify
[37255.840413] ---[ end Kernel panic - not syncing: BUG!

This patchset fix this bug according to the flag NIC_STATE_DOWN.

Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-16 12:07:31 -08:00
Yonglong Liu 4ad26f117b net: hns: Some registers use wrong address according to the datasheet.
According to the hip06 datasheet:
1.Six registers use wrong address:
  RCB_COM_SF_CFG_INTMASK_RING
  RCB_COM_SF_CFG_RING_STS
  RCB_COM_SF_CFG_RING
  RCB_COM_SF_CFG_INTMASK_BD
  RCB_COM_SF_CFG_BD_RINT_STS
  DSAF_INODE_VC1_IN_PKT_NUM_0_REG
2.The offset of DSAF_INODE_VC1_IN_PKT_NUM_0_REG should be
  0x103C + 0x80 * all_chn_num
3.The offset to show the value of DSAF_INODE_IN_DATA_STP_DISC_0_REG
  is wrong, so the value of DSAF_INODE_SW_VLAN_TAG_DISC_0_REG will be
  overwrite

These registers are only used in "ethtool -d", so that did not cause ndev
to misfunction.

Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-16 12:07:31 -08:00
Yonglong Liu 308c6cafde net: hns: All ports can not work when insmod hns ko after rmmod.
There are two test cases:
1. Remove the 4 modules:hns_enet_drv/hns_dsaf/hnae/hns_mdio,
   and install them again, must use "ifconfig down/ifconfig up"
   command pair to bring port to work.

   This patch calls phy_stop function when init phy to fix this bug.

2. Remove the 2 modules:hns_enet_drv/hns_dsaf, and install them again,
   all ports can not use anymore, because of the phy devices register
   failed(phy devices already exists).

   Phy devices are registered when hns_dsaf installed, this patch
   removes them when hns_dsaf removed.

The two cases are sometimes related, fixing the second case also requires
fixing the first case, so fix them together.

Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-16 12:07:31 -08:00
Yonglong Liu 4e1d4be681 net: hns: Incorrect offset address used for some registers.
According to the hip06 Datasheet:
1. The offset of INGRESS_SW_VLAN_TAG_DISC should be 0x1A00+4*all_chn_num
2. The offset of INGRESS_IN_DATA_STP_DISC should be 0x1A50+4*all_chn_num

Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-16 12:07:31 -08:00
Eric Dumazet 8203e2d844 net: clear skb->tstamp in forwarding paths
Sergey reported that forwarding was no longer working
if fq packet scheduler was used.

This is caused by the recent switch to EDT model, since incoming
packets might have been timestamped by __net_timestamp()

__net_timestamp() uses ktime_get_real(), while fq expects packets
using CLOCK_MONOTONIC base.

The fix is to clear skb->tstamp in forwarding paths.

Fixes: 80b14dee2b ("net: Add a new socket option for a future transmit time.")
Fixes: fb420d5d91 ("tcp/fq: move back to CLOCK_MONOTONIC")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Sergey Matyukevich <geomatsi@gmail.com>
Tested-by: Sergey Matyukevich <geomatsi@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-15 13:24:21 -08:00
Robert P. J. Day 15c6d8e565 mod_devicetable.h: correct kerneldoc typo, "PHYSID2" -> "MII_PHYSID2"
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-15 12:10:20 -08:00
Michal Kubecek ade446403b net: ipv4: do not handle duplicate fragments as overlapping
Since commit 7969e5c40d ("ip: discard IPv4 datagrams with overlapping
segments.") IPv4 reassembly code drops the whole queue whenever an
overlapping fragment is received. However, the test is written in a way
which detects duplicate fragments as overlapping so that in environments
with many duplicate packets, fragmented packets may be undeliverable.

Add an extra test and for (potentially) duplicate fragment, only drop the
new fragment rather than the whole queue. Only starting offset and length
are checked, not the contents of the fragments as that would be too
expensive. For similar reason, linear list ("run") of a rbtree node is not
iterated, we only check if the new fragment is a subset of the interval
covered by existing consecutive fragments.

v2: instead of an exact check iterating through linear list of an rbtree
node, only check if the new fragment is subset of the "run" (suggested
by Eric Dumazet)

Fixes: 7969e5c40d ("ip: discard IPv4 datagrams with overlapping segments.")
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-15 11:50:40 -08:00
Jörgen Storvist 1986af16e8 qmi_wwan: Added support for Telit LN940 series
Added support for the Telit LN940 series cellular modules QMI interface.
QMI_QUIRK_SET_DTR quirk requied for Qualcomm MDM9x40 chipset.

Signed-off-by: Jörgen Storvist <jorgen.storvist@gmail.com>
Acked-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-15 11:46:43 -08:00
Jörgen Storvist 110a1cc28b qmi_wwan: Added support for Fibocom NL668 series
Added support for Fibocom NL668 series QMI interface.
Using QMI_QUIRK_SET_DTR required for Qualcomm MDM9x07 chipsets.

Signed-off-by: Jörgen Storvist <jorgen.storvist@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-15 11:31:59 -08:00
David S. Miller 10589a568f Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Alexei Starovoitov says:

====================
pull-request: bpf 2018-12-15

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) fix liveness propagation of callee saved registers, from Jakub.

2) fix overflow in bpf_jit_limit knob, from Daniel.

3) bpf_flow_dissector api fix, from Stanislav.

4) bpf_perf_event api fix on powerpc, from Sandipan.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-15 10:58:32 -08:00
Cong Wang 143ece654f tipc: check tsk->group in tipc_wait_for_cond()
tipc_wait_for_cond() drops socket lock before going to sleep,
but tsk->group could be freed right after that release_sock().
So we have to re-check and reload tsk->group after it wakes up.

After this patch, tipc_wait_for_cond() returns -ERESTARTSYS when
tsk->group is NULL, instead of continuing with the assumption of
a non-NULL tsk->group.

(It looks like 'dsts' should be re-checked and reloaded too, but
it is a different bug.)

Similar for tipc_send_group_unicast() and tipc_send_group_anycast().

Reported-by: syzbot+10a9db47c3a0e13eb31c@syzkaller.appspotmail.com
Fixes: b7d4263551 ("tipc: introduce flow control for group broadcast messages")
Fixes: ee106d7f94 ("tipc: introduce group anycast messaging")
Fixes: 27bd9ec027 ("tipc: introduce group unicast messaging")
Cc: Ying Xue <ying.xue@windriver.com>
Cc: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-14 15:48:49 -08:00
Dave Taht 65cab850f0 net: Allow class-e address assignment via ifconfig ioctl
While most distributions long ago switched to the iproute2 suite
of utilities, which allow class-e (240.0.0.0/4) address assignment,
distributions relying on busybox, toybox and other forms of
ifconfig cannot assign class-e addresses without this kernel patch.

While CIDR has been obsolete for 2 decades, and a survey of all the
open source code in the world shows the IN_whatever macros are also
obsolete... rather than obsolete CIDR from this ioctl entirely, this
patch merely enables class-e assignment, sanely.

Signed-off-by: Dave Taht <dave.taht@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-14 15:39:31 -08:00
Gustavo A. R. Silva 69d2c86766 ip6mr: Fix potential Spectre v1 vulnerability
vr.mifi is indirectly controlled by user-space, hence leading to
a potential exploitation of the Spectre variant 1 vulnerability.

This issue was detected with the help of Smatch:

net/ipv6/ip6mr.c:1845 ip6mr_ioctl() warn: potential spectre issue 'mrt->vif_table' [r] (local cap)
net/ipv6/ip6mr.c:1919 ip6mr_compat_ioctl() warn: potential spectre issue 'mrt->vif_table' [r] (local cap)

Fix this by sanitizing vr.mifi before using it to index mrt->vif_table'

Notice that given that speculation windows are large, the policy is
to kill the speculation on the first load and not worry if it can be
completed with a dependent load/store [1].

[1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-14 15:34:28 -08:00
Arnd Bergmann 51367e423c w90p910_ether: remove incorrect __init annotation
The get_mac_address() function is normally inline, but when it is
not, we get a warning that this configuration is broken:

WARNING: vmlinux.o(.text+0x4aff00): Section mismatch in reference from the function w90p910_ether_setup() to the function .init.text:get_mac_address()
The function w90p910_ether_setup() references
the function __init get_mac_address().
This is often because w90p910_ether_setup lacks a __init

Remove the __init to make it always do the right thing.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-14 14:42:51 -08:00
Lepton Wu 8236b08cf5 VSOCK: bind to random port for VMADDR_PORT_ANY
The old code always starts from fixed port for VMADDR_PORT_ANY. Sometimes
when VMM crashed, there is still orphaned vsock which is waiting for
close timer, then it could cause connection time out for new started VM
if they are trying to connect to same port with same guest cid since the
new packets could hit that orphaned vsock. We could also fix this by doing
more in vhost_vsock_reset_orphans, but any way, it should be better to start
from a random local port instead of a fixed one.

Signed-off-by: Lepton Wu <ytht.net@gmail.com>
Reviewed-by: Jorgen Hansen <jhansen@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-14 14:40:19 -08:00
Mario Limonciello 9c27369f4a r8152: Add support for MAC address pass through on RTL8153-BND
All previous docks and dongles that have supported this feature use
the RTL8153-AD chip.

RTL8153-BND is a new chip that will be used in upcoming Dell type-C docks.
It should be added to the whitelist of devices to activate MAC address
pass through.

Per confirming with Realtek all devices containing RTL8153-BND should
activate MAC pass through and there won't use pass through bit on efuse
like in RTL8153-AD.

Signed-off-by: Mario Limonciello <mario.limonciello@dell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-14 14:37:45 -08:00
Atul Gupta 0c3a16be70 crypto/chelsio/chtls: send/recv window update
recalculated send and receive window using linkspeed.
Determine correct value of eck_ok from SYN received and
option configured on local system.

Signed-off-by: Atul Gupta <atul.gupta@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-14 13:40:42 -08:00
Atul Gupta 848dd1c1cb crypto/chelsio/chtls: macro correction in tx path
corrected macro used in tx path. removed redundant hdrlen
and check for !page in chtls_sendmsg

Signed-off-by: Atul Gupta <atul.gupta@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-14 13:39:39 -08:00
Atul Gupta 6422ccc5fb crypto/chelsio/chtls: listen fails with multiadapt
listen fails when more than one tls capable device is
registered. tls_hw_hash is called for each dev which loops
again for each cdev_list causing listen failure. Hence
call chtls_listen_start/stop for specific device than loop over all
devices.

Signed-off-by: Atul Gupta <atul.gupta@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-14 13:39:39 -08:00
Atul Gupta df9d4a1780 net/tls: sleeping function from invalid context
HW unhash within mutex for registered tls devices cause sleep
when called from tcp_set_state for TCP_CLOSE. Release lock and
re-acquire after function call with ref count incr/dec.
defined kref and fp release for tls_device to ensure device
is not released outside lock.

BUG: sleeping function called from invalid context at
kernel/locking/mutex.c:748
in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper/7
INFO: lockdep is turned off.
CPU: 7 PID: 0 Comm: swapper/7 Tainted: G        W  O
Call Trace:
 <IRQ>
 dump_stack+0x5e/0x8b
 ___might_sleep+0x222/0x260
 __mutex_lock+0x5c/0xa50
 ? vprintk_emit+0x1f3/0x440
 ? kmem_cache_free+0x22d/0x2a0
 ? tls_hw_unhash+0x2f/0x80
 ? printk+0x52/0x6e
 ? tls_hw_unhash+0x2f/0x80
 tls_hw_unhash+0x2f/0x80
 tcp_set_state+0x5f/0x180
 tcp_done+0x2e/0xe0
 tcp_rcv_state_process+0x92c/0xdd3
 ? lock_acquire+0xf5/0x1f0
 ? tcp_v4_rcv+0xa7c/0xbe0
 ? tcp_v4_do_rcv+0x70/0x1e0

Signed-off-by: Atul Gupta <atul.gupta@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-14 13:39:39 -08:00
Atul Gupta 6c0563e442 net/tls: Init routines in create_ctx
create_ctx is called from tls_init and tls_hw_prot
hence initialize function pointers in common routine.

Signed-off-by: Atul Gupta <atul.gupta@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-14 13:39:39 -08:00
Nathan Chancellor 2ab4c3426c drivers: net: xgene: Remove unnecessary forward declarations
Clang warns:

drivers/net/ethernet/apm/xgene/xgene_enet_main.c:33:36: warning:
tentative array definition assumed to have one element
static const struct acpi_device_id xgene_enet_acpi_match[];
                                   ^
1 warning generated.

Both xgene_enet_acpi_match and xgene_enet_of_match are defined before
their uses at the bottom of the file so this is unnecessary. When
CONFIG_ACPI is disabled, ACPI_PTR becomes NULL so xgene_enet_acpi_match
doesn't need to be defined.

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-14 13:35:27 -08:00
Cong Wang fb83ed496b tipc: compare remote and local protocols in tipc_udp_enable()
When TIPC_NLA_UDP_REMOTE is an IPv6 mcast address but
TIPC_NLA_UDP_LOCAL is an IPv4 address, a NULL-ptr deref is triggered
as the UDP tunnel sock is initialized to IPv4 or IPv6 sock merely
based on the protocol in local address.

We should just error out when the remote address and local address
have different protocols.

Reported-by: syzbot+eb4da3a20fad2e52555d@syzkaller.appspotmail.com
Cc: Ying Xue <ying.xue@windriver.com>
Cc: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-14 13:28:03 -08:00
Cong Wang acb4a33e98 tipc: fix a double kfree_skb()
tipc_udp_xmit() drops the packet on error, there is no
need to drop it again.

Fixes: ef20cd4dd1 ("tipc: introduce UDP replicast")
Reported-and-tested-by: syzbot+eae585ba2cc2752d3704@syzkaller.appspotmail.com
Cc: Ying Xue <ying.xue@windriver.com>
Cc: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-14 13:23:44 -08:00
Cong Wang 15ef70e286 tipc: use lock_sock() in tipc_sk_reinit()
lock_sock() must be used in process context to be race-free with
other lock_sock() callers, for example, tipc_release(). Otherwise
using the spinlock directly can't serialize a parallel tipc_release().

As it is blocking, we have to hold the sock refcnt before
rhashtable_walk_stop() and release it after rhashtable_walk_start().

Fixes: 07f6c4bc04 ("tipc: convert tipc reference table to use generic rhashtable")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Ying Xue <ying.xue@windriver.com>
Cc: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-14 13:16:33 -08:00
Jakub Kicinski d3e8869ec8 net: netlink: rename NETLINK_DUMP_STRICT_CHK -> NETLINK_GET_STRICT_CHK
NETLINK_DUMP_STRICT_CHK can be used for all GET requests,
dumps as well as doit handlers.  Replace the DUMP in the
name with GET make that clearer.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-14 11:44:31 -08:00
Sudarsana Reddy Kalluru c3db8d5310 qed: Fix command number mismatch between driver and the mfw
The value for OEM_CFG_UPDATE command differs between driver and the
Management firmware (mfw). Fix this gap with adding a reserved field.

Fixes: cac6f69154 ("qed: Add support for Unified Fabric Port.")
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-13 21:48:14 -08:00
David S. Miller 38ed22351c mlx5-fixes-2018-12-13
-----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJcEiXXAAoJEEg/ir3gV/o+xxQH/jGS6GQrty9puL40sXKE4ryK
 GKVSwbKLnu4WatPcLe/CGxJwRoCVUUKfsbprsuy+oYxhUd5M5h6qu215kZgsfdDC
 CfxznbEj3o6mN+ZyZjL+F2gEG6e96bpkKvayb2a9YYQaY/9+V6CMKh3U5jHI+Q2V
 D4ToCjWbMXkQ0jsevfpV9W2T0DNfXIa5QGvsvH+D+PoLF/ke/YALZd+yJT0Cm9N2
 /vp0qvnHnZO5YavEThN2Lh8f9AY/pAwuGgTrPyddfkZC/6QLH+c8948gF14V+03l
 Nz78uSpurQ97X511UBcabSQB0xTCHSAw3/SKCVSxFrpMvkhj4mTbjohwc567Gy4=
 =YxUz
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-fixes-2018-12-13' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

mlx5-fixes-2018-12-13

Subject: [pull request][net 0/9] Mellanox, mlx5 fixes 2018-12-13
Saeed Mahameed says:

====================
This series introduces some fixes to the mlx5 core and mlx5e netdevice
driver.

=======
Conflict with net-next: When merged with net-next this series will
cause a moderate conflict:

1) in drivers/net/ethernet/mellanox/mlx5/core/en_tc.c (2 hunks)
Take hunks from net only and just replace *attr->mirror_count to *attr->split_count
1.1) there is one more instance of slow_attr->mirror_count to be replaced
with slow_attr->split_count, it doesn't appear in the conflict, it will
cause a compilation error if left out.
2) in mlx5_ifc.h, take hunks only from net.

Example for the merge resolution can be found at:
https://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git/commit/?h=merge/mlx5-fixes&id=48830adf29804d85d77ed8a251d625db0eb5b8a8
branch merge/mlx5-fixes of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
(I simply merged this pull request tag into net-next and resolved the conflict)

I don't know if it's ok with you, but to save your time, you can just:
git pull git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux merge/mlx5-fixes
Into net-next, before your next net merge, and you will have a clean
merge of net into net-next (at least for mlx5 files).
======

Please pull and let me know if there's any problem.

For -stable v4.18
338d615be484 ('net/mlx5e: Cancel DIM work on close SQ')
91f40f9904ad ('net/mlx5e: RX, Verify MPWQE stride size is in range')

For -stable v4.19
c5c7e1c41bbe ('net/mlx5e: Remove unused UDP GSO remaining counter')
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-13 19:21:53 -08:00
Jakub Kicinski 7640ead939 bpf: verifier: make sure callees don't prune with caller differences
Currently for liveness and state pruning the register parentage
chains don't include states of the callee.  This makes some sense
as the callee can't access those registers.  However, this means
that READs done after the callee returns will not propagate into
the states of the callee.  Callee will then perform pruning
disregarding differences in caller state.

Example:

   0: (85) call bpf_user_rnd_u32
   1: (b7) r8 = 0
   2: (55) if r0 != 0x0 goto pc+1
   3: (b7) r8 = 1
   4: (bf) r1 = r8
   5: (85) call pc+4
   6: (15) if r8 == 0x1 goto pc+1
   7: (05) *(u64 *)(r9 - 8) = r3
   8: (b7) r0 = 0
   9: (95) exit

   10: (15) if r1 == 0x0 goto pc+0
   11: (95) exit

Here we acquire unknown state with call to get_random() [1].  Then
we store this random state in r8 (either 0 or 1) [1 - 3], and make
a call on line 5.  Callee does nothing but a trivial conditional
jump (to create a pruning point).  Upon return caller checks the
state of r8 and either performs an unsafe read or not.

Verifier will first explore the path with r8 == 1, creating a pruning
point at [11].  The parentage chain for r8 will include only callers
states so once verifier reaches [6] it will mark liveness only on states
in the caller, and not [11].  Now when verifier walks the paths with
r8 == 0 it will reach [11] and since REG_LIVE_READ on r8 was not
propagated there it will prune the walk entirely (stop walking
the entire program, not just the callee).  Since [6] was never walked
with r8 == 0, [7] will be considered dead and replaced with "goto -1"
causing hang at runtime.

This patch weaves the callee's explored states onto the callers
parentage chain.  Rough parentage for r8 would have looked like this
before:

[0] [1] [2] [3] [4] [5]   [10]      [11]      [6]      [7]
     |           |      ,---|----.    |        |        |
  sl0:         sl0:    / sl0:     \ sl0:      sl0:     sl0:
  fr0: r8 <-- fr0: r8<+--fr0: r8   `fr0: r8  ,fr0: r8<-fr0: r8
                       \ fr1: r8 <- fr1: r8 /
                        \__________________/

after:

[0] [1] [2] [3] [4] [5]   [10]      [11]      [6]      [7]
     |           |          |         |        |        |
   sl0:         sl0:      sl0:       sl0:      sl0:     sl0:
   fr0: r8 <-- fr0: r8 <- fr0: r8 <- fr0: r8 <-fr0: r8<-fr0: r8
                          fr1: r8 <- fr1: r8

Now the mark from instruction 6 will travel through callees states.

Note that we don't have to connect r0 because its overwritten by
callees state on return and r1 - r5 because those are not alive
any more once a call is made.

v2:
 - don't connect the callees registers twice (Alexei: suggestion & code)
 - add more details to the comment (Ed & Alexei)
v1: don't unnecessarily link caller saved regs (Jiong)

Fixes: f4d7e40a5b ("bpf: introduce function calls (verification)")
Reported-by: David Beckett <david.beckett@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-12-13 10:35:40 -08:00
Tal Gilboa fa2bf86bab net/mlx5e: Cancel DIM work on close SQ
TXQ SQ closure is followed by closing the corresponding CQ. A pending
DIM work would try to modify the now non-existing CQ.
This would trigger an error:
[85535.835926] mlx5_core 0000:af:00.0: mlx5_cmd_check:769:(pid 124399):
MODIFY_CQ(0x403) op_mod(0x0) failed, status bad resource state(0x9), syndrome (0x1d7771)

Fix by making sure to cancel any pending DIM work before destroying the SQ.

Fixes: cbce4f4447 ("net/mlx5e: Enable adaptive-TX moderation")
Signed-off-by: Tal Gilboa <talgi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-13 01:24:44 -08:00
Mikhael Goikhman d13b224f43 net/mlx5e: Remove unused UDP GSO remaining counter
Remove tx_udp_seg_rem counter from ethtool output, as it is no longer
being updated in the driver's data flow.

Fixes: 3f44899ef2 ("net/mlx5e: Use PARTIAL_GSO for UDP segmentation")
Signed-off-by: Mikhael Goikhman <migo@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-13 01:24:44 -08:00
Or Gerlitz 61c806dafe net/mlx5e: Avoid encap flows deletion attempt the 1st time a neigh is resolved
Currently, we are deleting offloaded encap flows in case the relevant neigh
becomes unconnected while the encap is valid (a sign that it used to be
connected), or if the curr neigh mac is different from the cached mac
(a sign that the remote side changed their mac).

The 2nd check also applies when the neigh becomes connected on the 1st
time (we start with zero mac). Before the offending commit, the deleting
handler was practically no op, as no flows were offloaded. But since
that commit, we offload neigh-less encap flows to slow path.

Under mirroring scheme, we go into the delete handler, attempt to unoffload a
mirror rule which was never set (as we were offloading to slow path) and crash.

Fix that by calling the delete handler only when the encap is valid,
which covers both cases mentioned above.

Fixes: 5dbe906ff1 ('net/mlx5e: Use a slow path rule instead if vxlan neighbour isn't available')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-13 01:24:44 -08:00
Or Gerlitz 154e62abe9 net/mlx5e: Properly initialize flow attributes for slow path eswitch rule deletion
When a neighbour is resolved, we delete the goto slow path rule from HW.

The eswitch flow attributes where not properly initialized on that case,
hence we mess up the eswitch refcounts for chain zero (the default one).

Fix that along with making sure to use semicolons and not commas on that code;

Fixes: 5dbe906ff1 ('net/mlx5e: Use a slow path rule instead if vxlan neighbour isn't available')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-13 01:24:44 -08:00
Or Gerlitz d14f6f2a84 net/mlx5e: Avoid overriding the user provided priority for offloaded tc rules
Just a leftover which was wrongly left there, remove it while spawning
a message to suggest firmware upgrade.

Fixes: bf07aa730a ('net/mlx5e: Support offloading tc priorities and chains for eswitch flows')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-13 01:24:44 -08:00
Or Gerlitz e88afe759a net/mlx5e: Err if asked to mirror a goto chain tc eswitch rule
Currently we are not supporting this and not err-ing on that either.

For now, just err if asked to do that.

Fixes: bf07aa730a ('net/mlx5e: Support offloading tc priorities and chains for eswitch flows')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reported-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Eli Britstein <elibr@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-13 01:24:44 -08:00
Moshe Shemesh e1c15b62b7 net/mlx5e: RX, Verify MPWQE stride size is in range
Add check of MPWQE stride size is within range supported by HW. In case
calculated MPWQE stride size exceed range, linear SKB can't be used and
we should use non linear MPWQE instead.

Fixes: 619a8f2a42 ("net/mlx5e: Use linear SKB in Striding RQ")
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-13 01:24:44 -08:00
Gavi Teitz 8956f0014e net/mlx5e: Fix default amount of channels for VF representors
The default amount of channels a representor opens was erroneously
changed from one to the maximum amount of channels, restore to its
intended value.

Fixes: 779d986d60 ("net/mlx5e: Do not ignore netdevice TX/RX queues number")
Signed-off-by: Gavi Teitz <gavi@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-13 01:24:44 -08:00