Commit graph

71820 commits

Author SHA1 Message Date
Mahesh Bandewar a190d04db9 ipvlan: introduce 'private' attribute for all existing modes.
IPvlan has always operated in bridge mode. However there are scenarios
where each slave should be able to talk through the master device but
not necessarily across each other. Think of an environment where each
of a namespace is a private and independant customer. In this scenario
the machine which is hosting these namespaces neither want to tell who
their neighbor is nor the individual namespaces care to talk to neighbor
on short-circuited network path.

This patch implements the mode that is very similar to the 'private' mode
in macvlan where individual slaves can send and receive traffic through
the master device, just that they can not talk among slave devices.

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-29 18:39:57 +09:00
Wei Yongjun 2660d226d9 net: aquantia: Make local functions static
Fixes the following sparse warnings:

drivers/net/ethernet/aquantia/atlantic/aq_ethtool.c:224:5: warning:
 symbol 'aq_ethtool_get_coalesce' was not declared. Should it be static?
drivers/net/ethernet/aquantia/atlantic/aq_ethtool.c:245:5: warning:
 symbol 'aq_ethtool_set_coalesce' was not declared. Should it be static?

Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-29 17:53:00 +09:00
Florian Fainelli 5c1a6eaf0d net: dsa: b53: Export b53_configure_vlan()
bcm_sf2 and b53 replicate the same operations: clear all VLANs and set
their ports to the default VLAN tag (1 for these devices) so export the
b53 function doing just that.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-29 12:13:30 +09:00
Felix Manlunas 641da8ed3d liquidio: get rid of false alarm "Unknown cmd 27" in dmesg
Creating a macvtap interface with the liquidio VF driver as lower device
causes this alarming message to show up in dmesg:

    liquidio_link_ctrl_cmd_completion Unknown cmd 27

That's actually a false alarm because cmd 27 is the value of the macro
OCTNET_CMD_SET_UC_LIST which is known.  It's a control command sent from
host to NIC firmware to set the unicast MAC address list of the macvtap
lower device.

Make the false alarm go away by adding a case for OCTNET_CMD_SET_UC_LIST
in liquidio_link_ctrl_cmd_completion().

Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: Raghu Vatsavayi <raghu.vatsavayi@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-29 12:12:33 +09:00
Haiyang Zhang a6fb6aa3cf hv_netvsc: Set tx_table to equal weight after subchannels open
In some cases, like internal vSwitch, the host doesn't provide
send indirection table updates. This patch sets the table to be
equal weight after subchannels are all open. Otherwise, all workload
will be on one TX channel.

As tested, this patch has largely increased the throughput over
internal vSwitch.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-29 12:09:23 +09:00
Matteo Croce 90e229ef61 ppp: allow usage in namespaces
Check for CAP_NET_ADMIN with ns_capable() instead of capable()
to allow usage of ppp in user namespace other than the init one.

Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-29 11:55:32 +09:00
David S. Miller 87e3de1e4e Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
1GbE Intel Wired LAN Driver Updates 2017-10-27

This patchset is a proposal of how the Traffic Control subsystem can
be used to offload the configuration of the Credit Based Shaper
(defined in the IEEE 802.1Q-2014 Section 8.6.8.2) into supported
network devices.

As part of this work, we've assessed previous public discussions
related to TSN enabling: patches from Henrik Austad (Cisco), the
presentation from Eric Mann at Linux Plumbers 2012, patches from
Gangfeng Huang (National Instruments) and the current state of the
OpenAVNU project (https://github.com/AVnu/OpenAvnu/).

Overview
========

Time-sensitive Networking (TSN) is a set of standards that aim to
address resources availability for providing bandwidth reservation and
bounded latency on Ethernet based LANs. The proposal described here
aims to cover mainly what is needed to enable the following standards:
802.1Qat and 802.1Qav.

The initial target of this work is the Intel i210 NIC, but other
controllers' datasheet were also taken into account, like the Renesas
RZ/A1H RZ/A1M group and the Synopsis DesignWare Ethernet QoS
controller.

Proposal
========

Feature-wise, what is covered here is the configuration interfaces for
HW implementations of the Credit-Based shaper (CBS, 802.1Qav). CBS is
a per-queue shaper. Given that this feature is related to traffic
shaping, and that the traffic control subsystem already provides a
queueing discipline that offloads config into the device driver (i.e.
mqprio), designing a new qdisc for the specific purpose of offloading
the config for the CBS shaper seemed like a good fit.

For steering traffic into the correct queues, we use the socket option
SO_PRIORITY and then a mechanism to map priority to traffic classes /
Tx queues. The qdisc mqprio is currently used in our tests.

As for the CBS config interface, this patchset is proposing a new
qdisc called 'cbs'. Its 'tc' cmd line is:

$ tc qdisc add dev IFACE parent ID cbs locredit N hicredit M sendslope S \
     idleslope I

   Note that the parameters for this qdisc are the ones defined by the
   802.1Q-2014 spec, so no hardware specific functionality is exposed here.

Per-stream shaping, as defined by IEEE 802.1Q-2014 Section 34.6.1, is
not yet covered by this proposal.

v2: Merged patch 6 of the original series into patch 4 based on feedback
    from David Miller.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-29 11:20:28 +09:00
Arjun Vynipadath c69fe40780 cxgb3: Check and handle the dma mapping errors
This patch adds checks at approprate places whether *dma_map*() call has
succeeded or not.

Original Work by: Santosh Rastapur <santosh@chelsio.com>
Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-29 11:10:17 +09:00
Francois Romieu 509708310c r8169: Add support for interrupt coalesce tuning (ethtool -C)
Kirr: In particular with

	ethtool -C <ifname> rx-usecs 0 rx-frames 0

now it is possible to disable RX delays when NIC usage requires low-latency.

See this thread for context:

	https://www.spinics.net/lists/netdev/msg217665.html

My specific case is that:

We have many computers with gigabit Realtek NICs. For 2 such computers
connected to a gigabit store-and-forward switch the minimum round-trip
time for small pings (`ping -i 0 -w 3 -s 56 -q peer`) is ~ 30μs.

However it turned out that when Ethernet frame length transitions 127 ->
128 bytes (`ping -i 0 -w 3 -s {81 -> 82} -q peer`) the lowest RTT
transitions step-wise to ~ 270μs.

As David Light said this is RX interrupt mitigation done by NIC which creates
the latency. For workloads when low-latency is required with e.g. Intel,
BCM etc NIC drivers one just uses `ethtool -C rx-usecs ...` to reduce
the time NIC delays before interrupting CPU, but it turned out
`ethtool -C` is not supported by r8169 driver.

Like Stéphane ANCELOT I've traced the problem down to IntrMitigate being
hardcoded to != 0 for our chips (we have 8168 based NICs):

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/realtek/r8169.c#n5460
static void rtl_hw_start_8169(struct net_device *dev) {
        ...
        /*
         * Undocumented corner. Supposedly:
         * (TxTimer << 12) | (TxPackets << 8) | (RxTimer << 4) | RxPackets
         */
        RTL_W16(IntrMitigate, 0x0000);

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/realtek/r8169.c#n6346
static void rtl_hw_start_8168(struct net_device *dev) {
        ...
        RTL_W16(IntrMitigate, 0x5151);

and then I've also found

	https://www.spinics.net/lists/netdev/msg217665.html

and original Francois' patch:

	https://www.spinics.net/lists/netdev/msg217984.html
	https://www.spinics.net/lists/netdev/msg218207.html

So could we please finally get support for tuning r8169 interrupt
coalescing in tree? (so that next poor soul who hits the problem does
not need to go all the way to dig into driver sources and internet
wildly and finally patch locally

        -RTL_W16(IntrMitigate, 0x5151);
        +RTL_W16(IntrMitigate, 0x5100);

guessing whether it is right or not and also having to care to deploy
the patch everywhere it needs to be used, etc...).

To do so I've took original Francois's patch from 2012 and reworked it a bit:

- updated to latest net-next.git;
- adjusted scaling setup based on feedback from Hayes to pick up scaling
  vector depending not only on link speed but also on CPlusCmd[0:1] and to
  adjust CPlusCmd[0:1] correspondingly when setting timings;
- improved a bit (I think so) error handling.

I've tested the patch on "RTL8168d/8111d" (XID 083000c0) and with it and
`ethtool -C rx-usecs 0 rx-frames 0` on both ends it improves:

- minimum RTT latency:

        ~270μs ->  ~30μs (small packet),
        ~330μs -> ~110μs (full 1.5K ethernet frame)

- average RTT latency:

        ~480μs ->  ~50μs (small packet),
        ~560μs -> ~125μs (full 1.5K ethernet frame)

( before:

        root@neo1:# ping -i 0 -w 3 -s 82 -q neo2
        PING neo2.kirr.nexedi.com (192.168.102.21) 82(110) bytes of data.

        --- neo2.kirr.nexedi.com ping statistics ---
        5906 packets transmitted, 5905 received, 0% packet loss, time 2999ms
        rtt min/avg/max/mdev = 0.274/0.485/0.607/0.026 ms, ipg/ewma 0.508/0.489 ms

        root@neo1:# ping -i 0 -w 3 -s 1472 -q neo2
        PING neo2.kirr.nexedi.com (192.168.102.21) 1472(1500) bytes of data.

        --- neo2.kirr.nexedi.com ping statistics ---
        5073 packets transmitted, 5073 received, 0% packet loss, time 2999ms
        rtt min/avg/max/mdev = 0.330/0.566/0.710/0.028 ms, ipg/ewma 0.591/0.544 ms

  after:

        root@neo1# ping -i 0 -w 3 -s 82 -q neo2
        PING neo2.kirr.nexedi.com (192.168.102.21) 82(110) bytes of data.

        --- neo2.kirr.nexedi.com ping statistics ---
        45815 packets transmitted, 45815 received, 0% packet loss, time 3000ms
        rtt min/avg/max/mdev = 0.036/0.051/0.368/0.010 ms, ipg/ewma 0.065/0.053 ms

        root@neo1:# ping -i 0 -w 3 -s 1472 -q neo2
        PING neo2.kirr.nexedi.com (192.168.102.21) 1472(1500) bytes of data.

        --- neo2.kirr.nexedi.com ping statistics ---
        21250 packets transmitted, 21250 received, 0% packet loss, time 3000ms
        rtt min/avg/max/mdev = 0.112/0.125/0.390/0.007 ms, ipg/ewma 0.141/0.125 ms

  the small -> 1.5K latency growth is understandable as it takes ~15μs
  to transmit 1.5K on 1Gbps on the wire and with 2 hosts and 1 switch
  and ICMP ECHO + ECHO reply the packet has to travel 4 ethernet
  segments which is already 60μs;

  probably something a bit else is also there as e.g. on Linux, even
  with `cpupower frequency-set -g performance`, on some computers I've
  noticed the kernel can be spending more time in software-only mode
  when incoming packets go in less frequently. E.g. this program can
  demonstrate the effect for ICMP ECHO processing:

  https://lab.nexedi.com/kirr/bcc/blob/43cfc13b/tools/pinglat.py

  (later this was found to be partly due to C-states exit latencies) )

We have this patch running in our testing setup for 1 months already
without any issues observed.

It remains to be clarified whether RX and TX timers use the same base.
For now I've set them equally, but Francois's original patch version
suggests it could be not the same.

I've got no feedback at all to my original posting of this patch and questions

	https://www.spinics.net/lists/netdev/msg457173.html

neither from Francois, nor from any people from Realtek during one month.

So I suggest we simply apply it to net-next.git now.

Cc: Francois Romieu <romieu@fr.zoreil.com>
Cc: Hayes Wang <hayeswang@realtek.com>
Cc: Realtek linux nic maintainers <nic_swsd@realtek.com>
Cc: David Laight <David.Laight@ACULAB.COM>
Cc: Stéphane ANCELOT <sancelot@free.fr>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Kirill Smelkov <kirr@nexedi.com>
Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-29 11:07:58 +09:00
Girish Moodalbail dea6e19f4e tap: reference to KVA of an unloaded module causes kernel panic
The commit 9a393b5d59 ("tap: tap as an independent module") created a
separate tap module that implements tap functionality and exports
interfaces that will be used by macvtap and ipvtap modules to create
create respective tap devices.

However, that patch introduced a regression wherein the modules macvtap
and ipvtap can be removed (through modprobe -r) while there are
applications using the respective /dev/tapX devices. These applications
cause kernel to hold reference to /dev/tapX through 'struct cdev
macvtap_cdev' and 'struct cdev ipvtap_dev' defined in macvtap and ipvtap
modules respectively. So,  when the application is later closed the
kernel panics because we are referencing KVA that is present in the
unloaded modules.

----------8<------- Example ----------8<----------
$ sudo ip li add name mv0 link enp7s0 type macvtap
$ sudo ip li show mv0 |grep mv0| awk -e '{print $1 $2}'
  14:mv0@enp7s0:
$ cat /dev/tap14 &
$ lsmod |egrep -i 'tap|vlan'
macvtap                16384  0
macvlan                24576  1 macvtap
tap                    24576  3 macvtap
$ sudo modprobe -r macvtap
$ fg
cat /dev/tap14
^C

<...system panics...>
BUG: unable to handle kernel paging request at ffffffffa038c500
IP: cdev_put+0xf/0x30
----------8<-----------------8<----------

The fix is to set cdev.owner to the module that creates the tap device
(either macvtap or ipvtap). With this set, the operations (in
fs/char_dev.c) on char device holds and releases the module through
cdev_get() and cdev_put() and will not allow the module to unload
prematurely.

Fixes: 9a393b5d59 (tap: tap as an independent module)
Signed-off-by: Girish Moodalbail <girish.moodalbail@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 19:17:21 +09:00
Kees Cook 267146d447 drivers/net: smsc: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: "yuval.shaia@oracle.com" <yuval.shaia@oracle.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Philippe Reynes <tremyfr@gmail.com>
Cc: Allen Pais <allen.lkml@gmail.com>
Cc: Tobias Klauser <tklauser@distanz.ch>
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 19:09:50 +09:00
Kees Cook 8089c6f477 drivers/net: packetengines: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Allen Pais <allen.lkml@gmail.com>
Cc: yuan linyu <Linyu.Yuan@alcatel-sbell.com.cn>
Cc: Philippe Reynes <tremyfr@gmail.com>
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 19:09:49 +09:00
Kees Cook 15735c9d8a drivers/net: natsemi: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Allen Pais <allen.lkml@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Philippe Reynes <tremyfr@gmail.com>
Cc: Wei Yongjun <weiyongjun1@huawei.com>
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 19:09:49 +09:00
Kees Cook 0365b047de drivers/net: mellanox: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: Saeed Mahameed <saeedm@mellanox.com>
Cc: Matan Barak <matanb@mellanox.com>
Cc: Leon Romanovsky <leonro@mellanox.com>
Cc: netdev@vger.kernel.org
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 19:09:49 +09:00
Kees Cook 34309b36e4 drivers/net: korina: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Roman Yeryomin <leroi.lists@gmail.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 19:09:49 +09:00
Kees Cook 8b3718dc2c drivers/net: fealnx: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: "yuval.shaia@oracle.com" <yuval.shaia@oracle.com>
Cc: Allen Pais <allen.lkml@gmail.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Philippe Reynes <tremyfr@gmail.com>
Cc: Johannes Berg <johannes.berg@intel.com>
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 19:09:49 +09:00
Kees Cook 9cb618c295 drivers/net: dlink: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: Denis Kirjanov <kda@linux-powerpc.org>
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Denis Kirjanov <kda@linux-powerpc.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 19:09:49 +09:00
Kees Cook 0e23daeb64 drivers/net: chelsio/cxgb*: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: Santosh Raspatur <santosh@chelsio.com>
Cc: Ganesh Goudar <ganeshgr@chelsio.com>
Cc: Casey Leedom <leedom@chelsio.com>
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 19:09:49 +09:00
Kees Cook 70a42ac1c2 drivers/net: appletalk/cops: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: Allen Pais <allen.lkml@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: David Howells <dhowells@redhat.com>
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 19:09:49 +09:00
Kees Cook c6c52ba151 drivers/net: amd: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Allen Pais <allen.lkml@gmail.com>
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 19:09:49 +09:00
Kees Cook c63144e4dd drivers/net: 8390: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 19:09:49 +09:00
Jason Wang 63b9ab65bd tuntap: properly align skb->head before building skb
An unaligned alloc_frag->offset caused by previous allocation will
result an unaligned skb->head. This will lead unaligned
skb_shared_info and then unaligned dataref which requires to be
aligned for accessing on some architecture. Fix this by aligning
alloc_frag->offset before the frag refilling.

Fixes: 0bbd7dad34 ("tun: make tun_build_skb() thread safe")
Cc: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Cc: Wei Wei <dotweiba@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Reported-by: Wei Wei <dotweiba@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 19:05:28 +09:00
Bhadram Varka a830405ee4 stmmac: copy unicast mac address to MAC registers
Currently stmmac driver not copying the valid ethernet
MAC address to MAC registers. This patch takes care
of updating the MAC register with MAC address.

Signed-off-by: Bhadram Varka <vbhadram@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 19:04:29 +09:00
Pablo Cascón a267eaebfc nfp: inform the VF driver needs to be restarted after changing the MAC
Add message to inform the VF MAC was changed and the need to restart
the VF driver for the changes to be effective.

Signed-off-by: Pablo Cascón <pablo.cascon@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 18:59:48 +09:00
Felix Manlunas aa28667cfb liquidio: fix kernel panic in VF driver
Doing ifconfig down on VF driver in the middle of receiving line rate
traffic causes a kernel panic:

    LiquidIO_VF 0000:02:00.3: should not come here should not get rx when poll mode = 0 for vf
    BUG: unable to handle kernel NULL pointer dereference at           (null)
    .
    .
    .
    Call Trace:
     <IRQ>
     ? tasklet_action+0x102/0x120
     __do_softirq+0x91/0x292
     irq_exit+0xb6/0xc0
     do_IRQ+0x4f/0xd0
     common_interrupt+0x93/0x93
     </IRQ>
    RIP: 0010:cpuidle_enter_state+0x142/0x2f0
    RSP: 0018:ffffffffa6403e20 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff59
    RAX: 0000000000000000 RBX: 0000000000000003 RCX: 000000000000001f
    RDX: 0000000000000000 RSI: 000000002ab7519f RDI: 0000000000000000
    RBP: ffffffffa6403e58 R08: 0000000000000084 R09: 0000000000000018
    R10: ffffffffa6403df0 R11: 00000000000003c7 R12: 0000000000000003
    R13: ffffd27ebd806800 R14: ffffffffa64d40d8 R15: 0000007be072823f
     cpuidle_enter+0x17/0x20
     call_cpuidle+0x23/0x40
     do_idle+0x18c/0x1f0
     cpu_startup_entry+0x64/0x70
     rest_init+0xa5/0xb0
     start_kernel+0x45e/0x46b
     x86_64_start_reservations+0x24/0x26
     x86_64_start_kernel+0x6f/0x72
     secondary_startup_64+0xa5/0xa5
    Code:  Bad RIP value.
    RIP:           (null) RSP: ffff9246ed003f28
    CR2: 0000000000000000
    ---[ end trace 92731e80f31b7d7d ]---
    Kernel panic - not syncing: Fatal exception in interrupt
    Kernel Offset: 0x24000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
    ---[ end Kernel panic - not syncing: Fatal exception in interrupt

Reason is:  in the function assigned to net_device_ops->ndo_stop, the steps
for bringing down the interface are done in the wrong order.  The step that
notifies the NIC firmware to stop forwarding packets to host is done too
late.  Fix it by moving that step to the beginning.

Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: Raghu Vatsavayi <raghu.vatsavayi@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 18:52:46 +09:00
Michael Chan 952c5719aa bnxt_en: Fix randconfig build errors.
Fix undefined symbols when CONFIG_VLAN_8021Q or CONFIG_INET is not set.

Fixes: 8c95f773b4 ("bnxt_en: add support for Flower based vxlan encap/decap offload")
Reported-by: Jakub Kicinski <kubakici@wp.pl>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 18:24:15 +09:00
Andre Guedes 05f9d3e1ae igb: Add support for CBS offload
This patch adds support for Credit-Based Shaper (CBS) qdisc offload
from Traffic Control system. This support enable us to leverage the
Forwarding and Queuing for Time-Sensitive Streams (FQTSS) features
from Intel i210 Ethernet Controller. FQTSS is the former 802.1Qav
standard which was merged into 802.1Q in 2014. It enables traffic
prioritization and bandwidth reservation via the Credit-Based Shaper
which is implemented in hardware by i210 controller.

The patch introduces the igb_setup_tc() function which implements the
support for CBS qdisc hardware offload in the IGB driver. CBS offload
is the only traffic control offload supported by the driver at the
moment.

FQTSS transmission mode from i210 controller is automatically enabled
by the IGB driver when the CBS is enabled for the first hardware
queue. Likewise, FQTSS mode is automatically disabled when CBS is
disabled for the last hardware queue. Changing FQTSS mode requires NIC
reset.

FQTSS feature is supported by i210 controller only.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Tested-by: Henrik Austad <henrik@austad.us>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-27 09:49:36 -07:00
Intiyaz Basha c859e21a35 liquidio: xmit_more support
Defer ringing the Tx doorbell if skb->xmit_more is set unless the Tx queue
is full or stopped.  To keep latency low, use a deferral limit of 8
packets.  We chose 8 because Octeon can fetch at most 8 packets in a single
PCI read, and our tests show that 8 results in low latency.

Signed-off-by: Intiyaz Basha <intiyaz.basha@cavium.com>
Signed-off-by: Satanand Burla <satananda.burla@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 00:35:23 +09:00
John Allen 2a1bf51111 ibmvnic: Fix failover error path for non-fatal resets
For all non-fatal reset conditions, the hypervisor will send a failover when
we attempt to initialize the crq and the vnic client is expected to handle
that failover instead of the existing non-fatal reset. To handle this, we
need to return from init with a return code that indicates that we have hit
this case.

Signed-off-by: John Allen <jallen@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 00:23:58 +09:00
John Allen c26eba03e4 ibmvnic: Update reset infrastructure to support tunable parameters
Update ibmvnic reset infrastructure to include a new reset option that will
allow changing of tunable parameters. There currently is no way to request
different capabilities from the vnic server on the fly so this patch
achieves this by resetting the driver and attempting to log in with the
requested changes. If the reset operation fails, the old values of the
tunable parameters are stored in the "fallback" struct and we attempt to
login with the fallback values.

Signed-off-by: John Allen <jallen@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 00:23:58 +09:00
Subash Abhinov Kasiviswanathan ca32fb034c net: qualcomm: rmnet: Add support for GRO
Add gro_cells so that rmnet devices can call gro_cells_receive
instead of netif_receive_skb.

Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 00:10:23 +09:00
Subash Abhinov Kasiviswanathan 192c4b5d48 net: qualcomm: rmnet: Add support for 64 bit stats
Implement 64 bit per cpu stats.

Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 00:10:23 +09:00
Subash Abhinov Kasiviswanathan 85355d775f net: qualcomm: rmnet: Always assign rmnet dev in deaggregation path
The rmnet device needs to assigned for all packets in the
deaggregation path based on the mux id, so the check is not needed.

Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 00:10:23 +09:00
Subash Abhinov Kasiviswanathan 2ffbbf0f91 net: qualcomm: rmnet: Fix the return value of rmnet_rx_handler()
Since packet is always consumed by rmnet_rx_handler(), we always
return RX_HANDLER_CONSUMED. There is no need to pass on this
value through multiple functions.

Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 00:10:23 +09:00
David S. Miller 8ab190fbe9 Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue
Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2017-10-26

This series contains fixes to e1000, igb, ixgbe and i40e.

Vincenzo Maffione fixes a potential race condition which would result in
the interface being up but transmits are disabled in the hardware.

Colin Ian King fixes a possible NULL pointer dereference in e1000, which
was found by Coverity.

Jean-Philippe Brucker fixes a possible kernel panic when a driver cannot
map a transmit buffer, which is caused by an erroneous test.

Alex provides a fix for ixgbe, which is a partial revert of the commit
ffed21bcee ("ixgbe: Don't bother clearing buffer memory for descriptor rings")
because the previous commit messed up the exception handling path by
adding the count back in when we did not need to.  Also fixed a typo,
where the transmit ITR setting was being used to determine if we were
using adaptive receive interrupt moderation or not.  Lastly, fixed a
memory leak by including programming descriptors in the cleaned count.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 00:05:34 +09:00
Sathya Perla cd66358e52 bnxt_en: alloc tc_info{} struct only when tc flower is enabled
TC flower is not enabled on VFs and when there's no FW support.
Alloc the tc_info{} struct at init time only when TC flower is being
enabled.

Signed-off-by: Sathya Perla <sathya.perla@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 00:02:45 +09:00
Sathya Perla 5a84acbebb bnxt_en: query cfa flow stats periodically to compute 'lastused' attribute
This patch implements periodic querying of cfa flow stats
in batches to compute the 'lastused' attribute of TC flow stats.

Signed-off-by: Sathya Perla <sathya.perla@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 00:02:45 +09:00
Sathya Perla f484f6782e bnxt_en: add hwrm FW cmds for cfa_encap_record and decap_filter
Add routines for issuing the hwrm_cfa_encap_record_alloc/free
and hwrm_cfa_decap_filter_alloc/free FW cmds needed for
supporting vxlan encap/decap offload.

Signed-off-by: Sathya Perla <sathya.perla@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 00:02:45 +09:00
Sathya Perla 8c95f773b4 bnxt_en: add support for Flower based vxlan encap/decap offload
This patch adds IPv4 vxlan encap/decap action support to TC-flower
offload.

For vxlan encap, the driver maintains a tunnel encap hash-table.
When a new flow with a tunnel encap action arrives, this table
is looked up; if an encap entry exists, it uses the already
programmed encap_record_handle as the tunnel_handle in the
hwrm_cfa_flow_alloc cmd. Else, a new encap node is added and the
L2 header fields are queried via a route lookup.
hwrm_cfa_encap_record_alloc cmd is used to create a new encap
record and the encap_record_handle is used as the tunnel_handle
while adding the flow.

For vxlan decap, the driver maintains a tunnel decap hash-table.
When a new flow with a tunnel decap action arrives, this table
is looked up; if a decap entry exists, it uses the already
programmed decap_filter_handle as the tunnel_handle in the
hwrm_cfa_flow_alloc cmd. Else, a new decap node is added and
a decap_filter_handle is alloc'd via the hwrm_cfa_decap_filter_alloc
cmd. This handle is used as the tunnel_handle while adding the flow.

The code to issue the HWRM FW cmds is introduced in a follow-up patch.

Signed-off-by: Sathya Perla <sathya.perla@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 00:02:45 +09:00
Michael Chan f8503969d2 bnxt_en: Refactor and simplify coalescing code.
The mapping of the ethtool coalescing parameters to hardware parameters
is now done in bnxt_hwrm_set_coal_params().  The same function can
handle both RX and TX settings.  The code is now more clear.  Some
adjustments have been made to get better hardware settings.  The
coal_frames setting is now accurately set in hardware.  The max_timer
is set to coal_ticks value.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 00:02:45 +09:00
Michael Chan 18775aa8a9 bnxt_en: Reorganize the coalescing parameters.
The current IRQ coalescing logic is a little messy.  The ethtool
parameters are mapped to hardware parameters in a way that is difficult
to understand.  The first step is to better organize the parameters
by adding the new structure bnxt_coal.  The structure is used by both
the RX and TX sets of coalescing parameters.

Adjust the default coal_ticks to 14 us and 28 us for RX and TX.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 00:02:45 +09:00
Vasundhara Volam 49f7972fd1 bnxt_en: Add ethtool reset method
This is a firmware internal reset after driver is unloaded.

Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 00:02:45 +09:00
Michael Chan 7eb9bb3a0c bnxt_en: Check maximum supported MTU from firmware.
Some NICs have a firmware enforced maximum MTU setting by management
firmware.  Set up netdev->max_mtu accordingly.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 00:02:44 +09:00
Michael Chan c1a7bdff17 bnxt_en: Optimize .ndo_set_mac_address() for VFs.
No need to call bnxt_approve_mac() which will send a message to the
PF if the MAC address hasn't changed.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 00:02:44 +09:00
Michael Chan 431aa1eb20 bnxt_en: Get firmware package version one time.
The current code retrieves the firmware package version from firmware
everytime ethtool -i is run.  There is no reason to do that as the
firmware will not change while the driver is loaded.  Get the version
once at init time.

Also, display the full 4-part firmware version string and remove the
less useful interface spec version.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 00:02:44 +09:00
Michael Chan e0ad8fc598 bnxt_en: Check for zero length value in bnxt_get_nvram_item().
Return -EINVAL if the length is zero and not proceed to do essentially
nothing.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 00:02:44 +09:00
Rob Miller 618784e3ee bnxt_en: adding PCI ID for SMARTNIC VF support
Signed-off-by: Rob Miller <rmiller@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 00:02:44 +09:00
Ray Jui 8ed693b7bb bnxt_en: Add PCIe device ID for bcm58804
Add new PCIe device ID and chip number for bcm58804

Signed-off-by: Ray Jui <ray.jui@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 00:02:44 +09:00
Michael Chan 57922b0a2f bnxt_en: Update firmware interface to 1.8.3.1
Vxlan encap/decap filters are added to this firmware spec.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 00:02:44 +09:00
Vivien Didelot 02bc6e546e net: dsa: introduce dsa_user_ports helper
Introduce a dsa_user_ports() helper to return the ds->enabled_port_mask
mask which is more explicit. This will also minimize diffs when touching
this internal mask.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 00:00:09 +09:00
Vivien Didelot 4a5b85ffe2 net: dsa: use dsa_is_user_port everywhere
Most of the DSA code still check ds->enabled_port_mask directly to
inspect a given port type instead of using the provided dsa_is_user_port
helper. Change this.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 00:00:09 +09:00
Vivien Didelot 2b3e9891cb net: dsa: rename dsa_is_normal_port helper
This patch renames dsa_is_normal_port to dsa_is_user_port because "user"
is the correct term in the DSA terminology, not "normal".

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 00:00:09 +09:00
Vivien Didelot 91dee14481 net: dsa: mv88e6xxx: skip unused ports
The unused ports are currently configured in normal mode. This does not
prevent the switch from being functional, but it is unnecessary. Skip
unused ports.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 00:00:09 +09:00
Vivien Didelot bff7b688d5 net: dsa: add dsa_is_unused_port helper
As the comment above the chunk states, the b53 driver attempts to
disable the unused ports. But using ds->enabled_port_mask is misleading,
because this mask reports in fact the user ports.

To avoid confusion and fix this, this patch introduces an explicit
dsa_is_unused_port helper which ensures the corresponding bit is not
masked in any of the switch port masks.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 00:00:09 +09:00
Gustavo A. R. Silva 5bca178eed net: faraday: ftmac100: Use BUG_ON instead of if condition followed by BUG.
Notice that in this particular case unlikely() is already being called
inside BUG_ON macro.

This issue was detected with the help of Coccinelle.

Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27 23:53:14 +09:00
Gustavo A. R. Silva 4fa112f6b5 net: bcmgenet: Use BUG_ON instead of if condition followed by BUG
Use BUG_ON instead of if condition followed by BUG.

Something to notice in this particular case is that unlikely()
is already being called inside BUG_ON macro.

This issue was detected with the help of Coccinelle.

Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27 23:52:36 +09:00
Rahul Lakkireddy 6f92a6544f cxgb4: collect hardware misc dumps
Collect path mtu, PM stats, TP clock info, congestion control, and VPD
data dumps.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27 23:48:30 +09:00
Rahul Lakkireddy 08c4901bfe cxgb4: collect hardware scheduler dumps
Collect hardware TX traffic scheduler and pace tables.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27 23:48:30 +09:00
Rahul Lakkireddy db8cd7ce20 cxgb4: collect PBT tables dump
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27 23:48:30 +09:00
Rahul Lakkireddy b289593e13 cxgb4: collect MPS-TCAM dump
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27 23:48:29 +09:00
Rahul Lakkireddy 9030e49897 cxgb4: collect TID info dump
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27 23:48:29 +09:00
Rahul Lakkireddy 28b445561f cxgb4: collect RSS dumps
Collect RSS table and RSS VF configuration dumps.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27 23:48:29 +09:00
Rahul Lakkireddy 3044d0fb01 cxgb4: collect CIM queue configuration dump
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27 23:48:29 +09:00
Rahul Lakkireddy 27887bc7cb cxgb4: collect hardware LA dumps
Collect CIM, CIM_MA, ULP_RX, TP, CIM_PIF, and ULP_TX logic analyzer
dumps.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27 23:48:29 +09:00
Jose Abreu 6d9f0790af net: stmmac: First Queue must always be in DCB mode
According to DWMAC databook the first queue operating mode
must always be in DCB.

As MTL_QUEUE_DCB = 1, we need to always set the first queue
operating mode to DCB otherwise driver will think that queue
is in AVB mode (because MTL_QUEUE_AVB = 0).

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27 23:32:56 +09:00
Egil Hjelmeland 356c3e9afa net: dsa: lan9303: Move struct lan9303 to include/linux/dsa/lan9303.h
The next patch require net/dsa/tag_lan9303.c to access struct lan9303.
Therefore move struct lan9303 definitions from drivers/net/dsa/lan9303.h
to new file include/linux/dsa/lan9303.h.

Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27 23:30:53 +09:00
Nogah Frankel 3e8c1fd318 mlxsw: reg: Avoid magic number in PPCNT
Replace recurring magic number in PPCNT register with a define.

Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27 23:25:55 +09:00
Nogah Frankel 9deef43ddf mlxsw: spectrum: Change stats cache to be local
Change the HW stats cache to be local. Rename it for better clarity.
It holds the results of the last result of HW stats that are being read
periodically, in order to have answer for stats request immediately.

Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27 23:25:55 +09:00
Jose Abreu 4894ac6b6c net: stmmac: dwc-qos-eth: Fix typo in DT bindings parsing
According to DT bindings documentation we are expecting a
property called "snps,read-requests" but we are parsing
instead a property called "read,read-requests".

This is clearly a typo. Fix it.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27 23:23:19 +09:00
David S. Miller 5be9541a09 Merge tag 'mlx5-fixes-2017-10-26' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:

====================
Mellanox, mlx5 fixes 2017-10-26

The series includes some misc fixes for mlx5 core and etherent driver.
Please pull and let me know if there's any problem.

For -Stable:
net/mlx5e: Properly deal with encap flows add/del under neigh update (kernels >= 4.12)
net/mlx5: Fix health work queue spin lock to IRQ safe  (kernels >= 4.13)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27 22:23:41 +09:00
Felix Manlunas 392209fa83 liquidio: deprecate 1-bit flag indicating watchdog kernel thread is running
Deprecate the 1-bit flag (bit 2 in the SLI_SCRATCH_1 Octeon register) that
indicates that the liquidio watchdog kernel thread is running for this NIC.
Reason is:  it is incompatible with the firmware's use for SLI_SCRATCH_1.

In lieu of checking that now-deprecated flag, check the value of
oct_dev->adapter_refcount to determine whether or not to create the
watchdog kernel thread.

Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: Raghu Vatsavayi <raghu.vatsavayi@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27 22:13:19 +09:00
Florian Fainelli c0c21458d7 net: systemport: Check DSA notifier master against ourself
Check that the master network device that is signaled through the DSA
notifier is actually going to be ourself, otherwise, we could just be
de-referencing garbage from other drivers.

Fixes: 84ff33eeb23d ("net: systemport: Establish DSA network device queue mapping")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27 22:12:35 +09:00
Kees Cook c58320de51 drivers/net: arcnet: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: Michael Grzeschik <m.grzeschik@pengutronix.de>
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27 12:09:16 +09:00
Kees Cook 032cfd66af drivers/net: wan/sdla: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: Allen Pais <allen.lkml@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Tobias Klauser <tklauser@distanz.ch>
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27 12:09:16 +09:00
Kees Cook 605ea2f935 drivers/net: wan/lmc: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: Allen Pais <allen.lkml@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27 12:09:16 +09:00
Kees Cook e2009be038 drivers/net: wan/dscc4: Remove unused timer
This removes an entirely unused timer, which avoids needing to convert it
to timer_setup().

Cc: Francois Romieu <romieu@fr.zoreil.com>
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27 12:09:16 +09:00
Kees Cook c37631c7f6 drivers/net: sxgbe: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: Byungho An <bh74.an@samsung.com>
Cc: Girish K S <ks.giri@samsung.com>
Cc: Vipul Pandya <vipul.pandya@samsung.com>
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27 12:09:16 +09:00
Kees Cook 9de36ccf08 drivers/net: realtek: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: Realtek linux nic maintainers <nic_swsd@realtek.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: David Howells <dhowells@redhat.com>
Cc: Jay Vosburgh <jay.vosburgh@canonical.com>
Cc: Allen Pais <allen.lkml@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Tobias Klauser <tklauser@distanz.ch>
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27 12:09:16 +09:00
Kees Cook 97815186d4 drivers/net: nuvoton: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: Wan ZongShun <mcuos.com@gmail.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27 12:09:15 +09:00
Kees Cook 3248f77fa3 drivers/net: netronome: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jiri Pirko <jiri@mellanox.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Simon Horman <simon.horman@netronome.com>
Cc: oss-drivers@netronome.com
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27 12:09:15 +09:00
Kees Cook 0eba23bbce drivers/net: hippi: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: Jes Sorensen <jes@trained-monkey.org>
Cc: linux-hippi@sunsite.dk
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27 12:09:15 +09:00
Kees Cook f6fd8918f0 drivers/net: hamradio/yam: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly. Initialization was entirely missing.

Cc: Jean-Paul Roubelat <jpr@f6fbb.org>
Cc: linux-hams@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27 12:09:15 +09:00
Kees Cook 550acfb37f drivers/net: can: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: Wolfgang Grandegger <wg@grandegger.com>
Cc: Marc Kleine-Budde <mkl@pengutronix.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Allen Pais <allen.lkml@gmail.com>
Cc: linux-can@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27 12:09:15 +09:00
Kees Cook 0ff624fbfe drivers/net: 3com/3c515: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Johannes Berg <johannes.berg@intel.com>
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27 12:09:15 +09:00
Corentin Labbe a8ff8ccb45 net: stmmac: sun8i: Restore the compatibles
The original dwmac-sun8i DT bindings have some issue on how to handle
integrated PHY and was reverted in last RC of 4.13.
But now we have a solution so we need to get back that was reverted.

This patch restore compatibles about dwmac-sun8i
This reverts commit ad4540cc5a ("net: stmmac: sun8i: Remove the compatibles")

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27 11:58:49 +09:00
Corentin Labbe 634db83b82 net: stmmac: dwmac-sun8i: Handle integrated/external MDIOs
The Allwinner H3 SoC have two distinct MDIO bus, only one could be
active at the same time.
The selection of the active MDIO bus are done via some bits in the EMAC
register of the system controller.

This patch implement this MDIO switch via a custom MDIO-mux.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27 11:58:49 +09:00
Corentin Labbe b5beecb580 net: stmmac: snps, dwmac-mdio MDIOs are automatically registered
stmmac bindings docs said that its mdio node must have
compatible = "snps,dwmac-mdio";
Since dwmac-sun8i does not have any good reasons to not doing it, all
their MDIO node must have it.

Since these compatible is automatically registered, dwmac-sun8i compatible
does not need to be in need_mdio_ids.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27 11:58:49 +09:00
Alexander Duyck 62b4c6694d i40e: Add programming descriptors to cleaned_count
This patch updates the i40e driver to include programming descriptors in
the cleaned_count. Without this change it becomes possible for us to leak
memory as we don't trigger a large enough allocation when the time comes to
allocate new buffers and we end up overwriting a number of rx_buffers equal
to the number of programming descriptors we encountered.

Fixes: 0e626ff7cc ("i40e: Fix support for flow director programming status")
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Anders K. Pedersen <akp@cohaesio.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-26 07:42:58 -07:00
Alexander Duyck 10781348ca i40e: Fix incorrect use of tx_itr_setting when checking for Rx ITR setup
It looks like there was either a copy/paste error or just a typo that
resulted in the Tx ITR setting being used to determine if we were using
adaptive Rx interrupt moderation or not.

This patch fixes the typo.

Fixes: 65e87c0398 ("i40evf: support queue-specific settings for interrupt moderation")
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-26 07:42:58 -07:00
Alexander Duyck 069db9cd0b ixgbe: Fix Tx map failure path
This patch is a partial revert of "ixgbe: Don't bother clearing buffer
memory for descriptor rings". Specifically I messed up the exception
handling path a bit and this resulted in us incorrectly adding the count
back in when we didn't need to.

In order to make this simpler I am reverting most of the exception handling
path change and instead just replacing the bit that was handled by the
unmap_and_free call.

Fixes: ffed21bcee ("ixgbe: Don't bother clearing buffer memory for descriptor rings")
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-26 07:42:58 -07:00
Jean-Philippe Brucker 104ba83363 igb: Fix TX map failure path
When the driver cannot map a TX buffer, instead of rolling back
gracefully and retrying later, we currently get a panic:

[  159.885994] igb 0000:00:00.0: TX DMA map failed
[  159.886588] Unable to handle kernel paging request at virtual address ffff00000a08c7a8
               ...
[  159.897031] PC is at igb_xmit_frame_ring+0x9c8/0xcb8

Fix the erroneous test that leads to this situation.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-26 07:42:58 -07:00
Colin Ian King 5983587c8c e1000: avoid null pointer dereference on invalid stat type
Currently if the stat type is invalid then data[i] is being set
either by dereferencing a null pointer p, or it is reading from
an incorrect previous location if we had a valid stat type
previously.  Fix this by skipping over the read of p on an invalid
stat type.

Detected by CoverityScan, CID#113385 ("Explicit null dereferenced")

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-26 07:42:57 -07:00
Vincenzo Maffione 44c445c3d1 e1000: fix race condition between e1000_down() and e1000_watchdog
This patch fixes a race condition that can result into the interface being
up and carrier on, but with transmits disabled in the hardware.
The bug may show up by repeatedly IFF_DOWN+IFF_UP the interface, which
allows e1000_watchdog() interleave with e1000_down().

    CPU x                           CPU y
    --------------------------------------------------------------------
    e1000_down():
        netif_carrier_off()
                                    e1000_watchdog():
                                        if (carrier == off) {
                                            netif_carrier_on();
                                            enable_hw_transmit();
                                        }
        disable_hw_transmit();
                                    e1000_watchdog():
                                        /* carrier on, do nothing */

Signed-off-by: Vincenzo Maffione <v.maffione@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-26 07:42:57 -07:00
Girish Moodalbail 78e0ea6791 tap: double-free in error path in tap_open()
Double free of skb_array in tap module is causing kernel panic. When
tap_set_queue() fails we free skb_array right away by calling
skb_array_cleanup(). However, later on skb_array_cleanup() is called
again by tap_sock_destruct through sock_put(). This patch fixes that
issue.

Fixes: 362899b872 (macvtap: switch to use skb array)
Signed-off-by: Girish Moodalbail <girish.moodalbail@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-26 17:57:39 +09:00
Egil Hjelmeland 3c91b0c1de net: dsa: lan9303: Do not disable switch fabric port 0 at .probe
Make the LAN9303 work when lan9303_probe() is called twice.

For some unknown reason the LAN9303 switch fail to forward data when switch
fabric port 0 TX is disabled during probe. (Write of LAN9303_MAC_TX_CFG_0
in lan9303_disable_processing_port().)

In that situation the switch fabric seem to receive frames, because the ALR
is learning addresses. But no frames are transmitted on any of the ports.

In our system lan9303_probe() is called twice, first time
dsa_register_switch() return -EPROBE_DEFER. As an experiment, modified the
code to skip writing LAN9303_MAC_TX_CFG_0, port 0 during the first probe.
Then the switch works as expected.

Resolve the problem by not calling lan9303_disable_processing_port() on
port 0 during probe. Ports 1 and 2 are still disabled.

Although unsatisfying that the exact failure mechanism is not known,
the patch should not cause any harm.

Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-26 17:40:32 +09:00
Rahul Lakkireddy acfdf7eabe cxgb4: fix overflow in collecting IBQ and OBQ dump
Destination buffer already has offset added.  So, don't add offset
again.

Fetch actual size of configured OBQ from hardware, instead of using
hardcoded value.

Fixes: 7c075ce221 ("cxgb4: collect IBQ and OBQ dumps")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-26 17:31:28 +09:00
Lipeng c3b6f755fd net: hns3: fix the bug when reuse command description in hclge_add_mac_vlan_tbl
When reusing a command description read from HW, driver should set
IN_VLD bit, WR bit and NO_INTR bit. If IN_VLD bit and NO_INTR bit
are not set, the command fails and driver prints error message:

[  135.261284] hns3 0000:7d:00.0: cmdq execute failed for get_mac_vlan_cmd_status,status=2.
[  135.270983] hns3 0000:7d:00.0: add mac addr failed for cmd_send, ret =-5.

This patch fixes the bug.
Fixes: 46a3df9 (net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support)

Signed-off-by: Lipeng <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-26 17:25:35 +09:00
Lipeng a17dcf3f01 net: hns3: fix a bug in hclge_uninit_client_instance
HNS3 driver initialize hdev->roce_client and vport->roce.client in
hclge_init_client_instance, and need set hdev->roce_client and
vport->roce.client NULL.

If do not set them NULL when uninit, it will fail in the scene:
insmod hns3.ko, hns-roce.ko, hns-roce-hw-v3.ko successfully, but
rmmod hns3.ko after rmmod hns-roce-hw-v2.ko and hns-roce.ko.
This patch fixes the issue.

Fixes: 46a3df9 (net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support)

Signed-off-by: Lipeng <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-26 17:25:35 +09:00
Lipeng 3a46f34d20 net: hns3: add nic_client check when initialize roce base information
Roce driver works base on HNS3 driver.If insmod Roce driver before
NIC driver there is a error because do not check nic_client. This patch
adds nic_client check when initialize roce base information.

Fixes: 46a3df9 (net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support)

Signed-off-by: Lipeng <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-26 17:25:35 +09:00
Lipeng 7036d26f32 net: hns3: fix the bug of hns3_set_txbd_baseinfo
The SC bits of TX BD mean switch control. For this area, value 0
indicates no switch control, the packet is routed according to the
forwarding table. Value 1 indicates that the packet is transmitted
to the network bypassing the forwarding table.

As HNS3 driver need support VF later, VF conmunicate with its own
PF need forwarding table. This patch sets SC bits of TX BD 0 and use
forwarding table.

Fixes: 76ad4f0 (net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC)

Signed-off-by: Lipeng <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-26 17:25:35 +09:00