Commit Graph

92539 Commits (3c8e42a7935f720c403612abf896d962a7aa1495)

Author SHA1 Message Date
Nikolay Aleksandrov 772c344dbb net: ipmr: add getlink support
Currently there's no way to dump the VIF table for an ipmr table other
than the default (via proc). This is a major issue when debugging ipmr
issues and in general it is good to know which interfaces are
configured. This patch adds support for RTM_GETLINK for the ipmr family
so we can dump the VIF table and the ipmr table's current config for
each table. We're protected by rtnl so no need to acquire RCU or
mrt_lock.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-08 14:38:59 -04:00
Arkadi Sharshevsky 9fe8bcec0d net: bridge: Receive notification about successful FDB offload
When a new static FDB is added to the bridge a notification is sent to
the driver for offload. In case of successful offload the driver should
notify the bridge back, which in turn should mark the FDB as offloaded.

Currently, "externally learned" is treated as equivalent to "offloaded",
which is not correct because FDBs that are added from user-space are
also marked as externally learned. A new flag is introduced to indicate
that an FDB was successfully offloaded.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-08 14:16:25 -04:00
Arkadi Sharshevsky 6b26b51b1d net: bridge: Add support for notifying devices about FDB add/del
Currently the bridge doesn't notify the underlying devices about new
FDBs learned. The FDB sync is placed on the switchdev notifier chain
because devices may potentially learn FDB that are not directly related
to their ports, for example:

1. Mixed SW/HW bridge - FDBs that point to the ASICs external devices
                        should be offloaded as CPU traps in order to
                        perform forwarding in slow path.
2. EVPN - Externally learned FDBs for the vtep device.

Notification is sent only about static FDB add/del, because this is
currently the only scenario supported by switch drivers.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Reviewed-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-08 14:16:25 -04:00
Arkadi Sharshevsky dc0ecabd62 net: switchdev: Add support for querying supported bridge flags by hardware
This is done as a preparation stage before setting the bridge port flags
from the bridge code. Currently the device can be queried for the bridge
flags state, but the querier cannot distinguish whether a flag is disabled
or not supported at all. Thus, add a new attr and a bitmask that carry
per-flag support information.

Drivers that support bridge offload but do not support bridge flags
should return a zeroed bitmask.
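
The distinction between "disabled" and "not supported" can be sketched in
plain C as follows. This is only an illustration: the flag bits, the enum
and the function name are hypothetical and are not the switchdev attribute
or the bridge flag definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits, for illustration only. */
#define EX_FLAG_LEARNING (1u << 0)
#define EX_FLAG_FLOOD    (1u << 1)

enum ex_flag_state { EX_UNSUPPORTED, EX_DISABLED, EX_ENABLED };

/*
 * Classify a single flag from the two masks a device could report:
 * 'supported' says which flags the hardware can handle at all,
 * 'enabled' says which of them are currently set.
 */
static enum ex_flag_state ex_flag_state(uint32_t supported, uint32_t enabled,
					uint32_t flag)
{
	/* The caller must pass exactly one flag bit. */
	assert(flag != 0 && (flag & (flag - 1)) == 0);

	if (!(supported & flag))
		return EX_UNSUPPORTED;
	return (enabled & flag) ? EX_ENABLED : EX_DISABLED;
}
```

With a zeroed support mask every flag is classified as unsupported, which
is the behaviour asked of drivers that do not handle bridge flags.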

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-08 14:16:23 -04:00
David S. Miller 7eca9cc539 RxRPC rewrite
-----BEGIN PGP SIGNATURE-----
 
 iQIVAwUAWThq9/Sw1s6N8H32AQLfQhAAikphSQnfbT4SkZsVmcZefNMlThGgX2EE
 5nDNsDiZnXqAOY5ivMnLlb7JXjby2Ckb3coTa8gVK2RmvgIOqGAVdKqYNJQNqYvi
 +plwZFHlx+qWBbQRmucAfGorhmdoG3mRyksHHcpeQ4c/9bcfOJXY9QwAwiSZcPXl
 RDS5QsNVI0nKL/PB8hbKBSp+40/joMJFVSAnBn5X/zxyL5jcoj0Gj7HXj/EKnlfq
 qO5GiheISjJJ47cTR+J3JXl1OrJqG0Dd17BdgK85S+G2bWy9o7MsotMKd1XHHIkQ
 IxuQ7oUa3QVKNUF+Lp1Kxx7ve/V6PPzbaFAY2RGyqwImD4iy2dBNpfgzL4/3rpT3
 IeFBP57N5f2J2EBKeA90GOXVB71LN520e9WytjjD+NMcyJHaFKjjv4xbr5lUhRPp
 6psJHLld6s92NwwPN4YVcT7RrqMFxPC0NmD8xymrm+XnKizdvJQ9TMbD+33nhlV3
 yf1DDYBtPq8/hVyMmgywwy/la8KSCv3pybu1GcXx5MsTAoqLOeXcUcWr2d/ljTsg
 m5xRtjbsw200exf65lc+083W/xXRFGQ9XbFvCPqcefQ+LSE3A4yInTEyzMl0X4WC
 2ciqgM11TYrexw+OwDM5oXQWmp58GZlpSDNlvXvWK8RsCJxwYPrF2Fw8/fw7/wcK
 7EVfvAA+j0k=
 =0fbW
 -----END PGP SIGNATURE-----

Merge tag 'rxrpc-rewrite-20170607-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

David Howells says:

====================
rxrpc: Tx length parameter

Here's a set of patches that allows someone initiating a client call with
AF_RXRPC to indicate upfront the total amount of data that will be
transmitted.  This will allow AF_RXRPC to encrypt directly from source
buffer to packet rather than having to copy into the buffer and only
encrypt when it's full (the encrypted portion of the packet starts with a
length and so we can't encrypt until we know what the length will be).

The three patches are:

 (1) Provide a means of finding out what control message types are actually
     supported.  EINVAL is reported if an unsupported cmsg type is seen, so
     we don't want to set the new cmsg unless we know it will be accepted.

 (2) Consolidate some stuff into a struct to reduce the parameter count on
     the function that parses the cmsg buffer.

 (3) Introduce the RXRPC_TX_LENGTH cmsg.  This can be provided on the first
     sendmsg() that contributes data to a client call request or a service
     call reply.  If provided, the user must provide exactly that amount of
     data or an error will be incurred.

Changes in version 2:

 (*) struct rxrpc_send_params::tx_total_len should be s64 not u64.  Thanks to
     Julia Lawall for reporting this.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-08 11:41:41 -04:00
Eric Dumazet 0604475119 tcp: add TCPMemoryPressuresChrono counter
DRAM supply shortage and poor memory pressure tracking in the TCP
stack make any change in SO_SNDBUF/SO_RCVBUF (or equivalent autotuning
limits) and tcp_mem[] quite hazardous.

The TCPMemoryPressures SNMP counter indicates that tcp_mem sysctl
limits are being hit, but it only tracks the number of transitions.

If TCP stack behavior under stress were perfect:
1) It would maintain memory usage close to the limit.
2) Memory pressure state would be entered for short times.

We certainly prefer 100 events lasting 10ms compared to one event
lasting 200 seconds.

This patch adds a new SNMP counter tracking cumulative duration of
memory pressure events, given in ms units.

$ cat /proc/sys/net/ipv4/tcp_mem
3088    4117    6176
$ grep TCP /proc/net/sockstat
TCP: inuse 180 orphan 0 tw 2 alloc 234 mem 4140
$ nstat -n ; sleep 10 ; nstat |grep Pressure
TcpExtTCPMemoryPressures        1700
TcpExtTCPMemoryPressuresChrono  5209

v2: Used EXPORT_SYMBOL_GPL() instead of EXPORT_SYMBOL() as David
instructed.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-08 11:26:19 -04:00
Eric Dumazet 5d2ed0521a tcp: Namespaceify sysctl_tcp_timestamps
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-08 10:53:29 -04:00
Eric Dumazet 9bb37ef00e tcp: Namespaceify sysctl_tcp_window_scaling
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-08 10:53:29 -04:00
Eric Dumazet f930103421 tcp: Namespaceify sysctl_tcp_sack
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-08 10:53:28 -04:00
Eric Dumazet eed29f17f0 tcp: add a struct net parameter to tcp_parse_options()
We want to move some TCP sysctls to net namespaces in the future.

tcp_window_scaling, tcp_sack and tcp_timestamps being fetched
from tcp_parse_options(), we need to pass an extra parameter.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-08 10:53:28 -04:00
Jiri Pirko a5fcf8a6c9 net: propagate tc filter chain index down the ndo_setup_tc call
We need to push the chain index down to the drivers, so they know
which chain the rule belongs to. For now, no driver supports
multichain offload, so only chain 0 is supported. This is needed to
prevent chain squashes during offload for now. Later this will be used
to implement multichain offload.
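
A driver-side check along these lines could look like the sketch below.
It is a minimal illustration only: the request structure, the function
name and the choice of -EOPNOTSUPP as the error code are assumptions, not
the ndo_setup_tc signature.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical offload request carrying the chain index. */
struct ex_tc_request {
	uint32_t chain_index;
};

/* Accept only chain 0; refuse every other chain instead of squashing it. */
static int ex_setup_tc(const struct ex_tc_request *req)
{
	assert(req != NULL);

	if (req->chain_index != 0)
		return -EOPNOTSUPP;	/* assumed error code */

	/* A real driver would program the rule into hardware here. */
	return 0;
}
```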

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-08 09:55:53 -04:00
David Howells e754eba685 rxrpc: Provide a cmsg to specify the amount of Tx data for a call
Provide a control message that can be specified on the first sendmsg() of a
client call or the first sendmsg() of a service response to indicate the
total length of the data to be transmitted for that call.

Currently, because the length of the payload of an encrypted DATA packet is
encrypted in front of the data, the packet cannot be encrypted until we
know how much data it will hold.

By specifying the length at the beginning of the transmit phase, each DATA
packet length can be set before we start loading data from userspace (where
several sendmsg() calls may contribute to a particular packet).

An error will be returned if too little or too much data is presented in
the Tx phase.
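
The length check can be modelled in user-space C roughly as follows. This
is a sketch under stated assumptions: the structure, the function name,
the use of -1 for "no length announced" and the -EMSGSIZE error code are
all hypothetical; only the signed 64-bit length comes from the series
description.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-call state; -1 means no total length was announced. */
struct ex_call {
	int64_t tx_total_len;	/* bytes still expected from the sender */
};

/*
 * Account for 'len' bytes handed over by one send operation.
 * 'last' is true when the sender says no more data will follow.
 */
static int ex_queue_data(struct ex_call *call, int64_t len, bool last)
{
	assert(call != NULL && len >= 0);

	if (call->tx_total_len < 0)
		return 0;		/* nothing announced, nothing to check */
	if (len > call->tx_total_len)
		return -EMSGSIZE;	/* too much data (assumed error code) */
	if (last && len != call->tx_total_len)
		return -EMSGSIZE;	/* too little data (assumed error code) */

	call->tx_total_len -= len;
	return 0;
}
```

Because the remaining length is known before any data is loaded, each
packet's length field can be fixed up front, which is the point of the
new control message.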

Signed-off-by: David Howells <dhowells@redhat.com>
2017-06-07 17:15:46 +01:00
David Howells 515559ca21 rxrpc: Provide a getsockopt call to query what cmsgs types are supported
Provide a getsockopt() call that can query what cmsg types are supported by
AF_RXRPC.
2017-06-07 17:15:46 +01:00
David S. Miller 216fe8f021 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Just some simple overlapping changes in marvell PHY driver
and the DSA core code.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-06 22:20:08 -04:00
Russell King c125ca0918 net: phy: add XAUI and 10GBASE-KR PHY connection types
XAUI allows XGMII to reach an extended distance by using a XGXS layer at
each end of the MAC to PHY link, operating over four Serdes lanes.

10GBASE-KR is a single lane Serdes backplane ethernet connection method
with autonegotiation on the link.  Some PHYs use this to connect to the
ethernet interface at 10G speeds, switching to other connection types
when utilising slower speeds.

10GBASE-KR is also used for XFI and SFI to connect to XFP and SFP fiber
modules.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-06 21:14:13 -04:00
Russell King 002ba7058a net: phy: hook up clause 45 autonegotiation restart
genphy_restart_aneg() can only restart autonegotiation on clause 22
PHYs.  Add a phy_restart_aneg() function which selects between the
clause 22 and clause 45 restart functionality depending on the PHY
type and whether the Clause 45 PHY supports the Clause 22 register set.
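
The selection described above can be sketched as a small decision
function. The structure and names below are hypothetical stand-ins, not
the phylib types; the sketch only shows which restart path would be
chosen.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a PHY device. */
struct ex_phy {
	bool is_c45;		/* PHY is accessed through clause 45 */
	bool has_c22_regs;	/* clause 45 PHY also has the clause 22 registers */
};

enum ex_restart { EX_RESTART_C22, EX_RESTART_C45 };

/* Use the clause 45 restart only when the clause 22 registers are absent. */
static enum ex_restart ex_pick_restart(const struct ex_phy *phy)
{
	assert(phy != NULL);

	if (phy->is_c45 && !phy->has_c22_regs)
		return EX_RESTART_C45;
	return EX_RESTART_C22;
}
```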

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-06 21:14:13 -04:00
Russell King 5acde34a5a net: phy: add 802.3 clause 45 support to phylib
Add generic helpers for 802.3 clause 45 PHYs for >= 10Gbps support.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-06 21:14:13 -04:00
Linus Torvalds b29794ec95 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Made TCP congestion control documentation match current reality,
    from Anmol Sarma.

 2) Various build warning and failure fixes from Arnd Bergmann.

 3) Fix SKB list leak in ipv6_gso_segment().

 4) Use after free in ravb driver, from Eugeniu Rosca.

 5) Don't use udp_poll() in ping protocol driver, from Eric Dumazet.

 6) Don't crash in PCI error recovery of cxgb4 driver, from Guilherme
    Piccoli.

 7) _SRC_NAT_DONE_BIT needs to be cleared using atomics, from Liping
    Zhang.

 8) Use after free in vxlan deletion, from Mark Bloch.

 9) Fix ordering of NAPI poll enabled in ethoc driver, from Max
    Filippov.

10) Fix stmmac hangs with TSO, from Niklas Cassel.

11) Fix crash in CALIPSO ipv6, from Richard Haines.

12) Clear nh_flags properly on mpls link up. From Roopa Prabhu.

13) Fix regression in sk_err socket error queue handling, noticed by
    ping applications. From Soheil Hassas Yeganeh.

14) Update mlx4/mlx5 MAINTAINERS information.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (78 commits)
  net: stmmac: fix a broken u32 less than zero check
  net: stmmac: fix completely hung TX when using TSO
  net: ethoc: enable NAPI before poll may be scheduled
  net: bridge: fix a null pointer dereference in br_afspec
  ravb: Fix use-after-free on `ifconfig eth0 down`
  net/ipv6: Fix CALIPSO causing GPF with datagram support
  net: stmmac: ensure jumbo_frm error return is correctly checked for -ve value
  Revert "sit: reload iphdr in ipip6_rcv"
  i40e/i40evf: proper update of the page_offset field
  i40e: Fix state flags for bit set and clean operations of PF
  iwlwifi: fix host command memory leaks
  iwlwifi: fix min API version for 7265D, 3168, 8000 and 8265
  iwlwifi: mvm: clear new beacon command template struct
  iwlwifi: mvm: don't fail when removing a key from an inexisting sta
  iwlwifi: pcie: only use d0i3 in suspend/resume if system_pm is set to d0i3
  iwlwifi: mvm: fix firmware debug restart recording
  iwlwifi: tt: move ucode_loaded check under mutex
  iwlwifi: mvm: support ibss in dqa mode
  iwlwifi: mvm: Fix command queue number on d0i3 flow
  iwlwifi: mvm: rs: start using LQ command color
  ...
2017-06-06 14:30:17 -07:00
David Rientjes abb2ea7dfd compiler, clang: suppress warning for unused static inline functions
GCC explicitly does not warn for unused static inline functions for
-Wunused-function.  The manual states:

	Warn whenever a static function is declared but not defined or
	a non-inline static function is unused.

Clang does warn for static inline functions that are unused.

It turns out that suppressing the warnings avoids potentially complex
#ifdef directives, which also reduces LOC.

Suppress the warning for clang.
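
For a single function, one way to silence this kind of warning is the
"unused" function attribute, which both GCC and Clang accept. The sketch
below is an illustration only and is not necessarily how this patch
suppresses the warning; the helper name is hypothetical.

```c
#include <assert.h>

/*
 * Hypothetical helper that some configurations may never call.  The
 * attribute tells the compiler that leaving it unused is intentional,
 * so no #ifdef is needed around the definition.
 */
static inline __attribute__((unused)) int ex_percent(int part, int whole)
{
	assert(whole > 0);
	return part * 100 / whole;
}
```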

Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-06 14:09:22 -07:00
Martin KaFai Lau 1e27097690 bpf: Add BPF_OBJ_GET_INFO_BY_FD
A single BPF_OBJ_GET_INFO_BY_FD cmd is used to obtain the info
for both bpf_prog and bpf_map.  The kernel can figure out whether
the fd is associated with a bpf_prog or a bpf_map.

The suggested struct bpf_prog_info and struct bpf_map_info are
not meant to be complete lists; that is not the goal of this patch.
New fields can be added in future patches.

The focus of this patch is to create the interface,
BPF_OBJ_GET_INFO_BY_FD cmd for exposing the bpf_prog's and
bpf_map's info.

The obj's info, which will be extended (and get bigger) over time, is
separated from the bpf_attr to avoid bloating the bpf_attr.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Alexei Starovoitov <ast@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-06 15:41:24 -04:00
Martin KaFai Lau 783d28dd11 bpf: Add jited_len to struct bpf_prog
Add jited_len to struct bpf_prog.  It will be
useful for the struct bpf_prog_info which will
be added in a later patch.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Alexei Starovoitov <ast@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-06 15:41:24 -04:00
Martin KaFai Lau bd5f5f4ecb bpf: Add BPF_MAP_GET_FD_BY_ID
Add BPF_MAP_GET_FD_BY_ID command to allow user to get a fd
from a bpf_map's ID.

bpf_map_inc_not_zero() is added and is called with map_idr_lock
held.

__bpf_map_put() is also added which has the 'bool do_idr_lock'
param to decide if the map_idr_lock should be acquired when
freeing the map->id.

In the error path of bpf_map_inc_not_zero(), it may have to
call __bpf_map_put(map, false) which does not need
to take the map_idr_lock when freeing the map->id.
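
The "increment unless zero" idea behind such a helper can be modelled in
user-space C11 atomics as below. This is a sketch of the general pattern,
assuming a plain integer reference count; it is not the kernel
implementation and the function name is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Take a reference only if the object is still live, i.e. its count has
 * not already dropped to zero.  Returns true when a reference was taken.
 */
static bool ex_ref_inc_not_zero(atomic_int *refcnt)
{
	int old = atomic_load(refcnt);

	assert(old >= 0);
	while (old != 0) {
		/* On failure 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(refcnt, &old, old + 1))
			return true;
	}
	return false;	/* object is already being torn down */
}
```

A lookup by ID needs this form because the ID can still be found for a
short time after the last reference has been dropped.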

It is currently limited to CAP_SYS_ADMIN; we can consider
lifting that restriction in follow-up patches.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Alexei Starovoitov <ast@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-06 15:41:23 -04:00
Martin KaFai Lau b16d9aa4c2 bpf: Add BPF_PROG_GET_FD_BY_ID
Add BPF_PROG_GET_FD_BY_ID command to allow user to get a fd
from a bpf_prog's ID.

bpf_prog_inc_not_zero() is added and is called with prog_idr_lock
held.

__bpf_prog_put() is also added which has the 'bool do_idr_lock'
param to decide if the prog_idr_lock should be acquired when
freeing the prog->id.

In the error path of bpf_prog_inc_not_zero(), it may have to
call __bpf_prog_put(prog, false) which does not need
to take the prog_idr_lock when freeing the prog->id.

It is currently limited to CAP_SYS_ADMIN; we can consider
lifting that restriction in follow-up patches.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Alexei Starovoitov <ast@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-06 15:41:23 -04:00
Martin KaFai Lau 34ad5580f8 bpf: Add BPF_(PROG|MAP)_GET_NEXT_ID command
This patch adds BPF_PROG_GET_NEXT_ID and BPF_MAP_GET_NEXT_ID
to allow userspace to iterate all bpf_prog IDs and bpf_map IDs.

The API is trying to be consistent with the existing
BPF_MAP_GET_NEXT_KEY.
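
The iteration pattern can be modelled without the syscall as follows.
This is a sketch under stated assumptions: the array of live IDs, both
function names and the -ENOENT end marker (chosen by analogy with
BPF_MAP_GET_NEXT_KEY) are hypothetical and do not show the bpf() calling
convention.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Model of a "get next ID" query: report the smallest live ID that is
 * greater than start_id, or -ENOENT when there is none.  IDs are >= 1.
 */
static int ex_get_next_id(const uint32_t *ids, size_t n, uint32_t start_id,
			  uint32_t *next_id)
{
	uint32_t best = 0;	/* 0 is never a valid ID, so it means "none" */

	assert(next_id != NULL);
	for (size_t i = 0; i < n; i++)
		if (ids[i] > start_id && (best == 0 || ids[i] < best))
			best = ids[i];

	if (best == 0)
		return -ENOENT;
	*next_id = best;
	return 0;
}

/* Walk all IDs by starting at 0 and feeding each result back in. */
static size_t ex_count_ids(const uint32_t *ids, size_t n)
{
	uint32_t id = 0;
	size_t count = 0;

	while (ex_get_next_id(ids, n, id, &id) == 0)
		count++;
	return count;
}
```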

It is currently limited to CAP_SYS_ADMIN; we can consider
lifting that restriction in follow-up patches.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Alexei Starovoitov <ast@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-06 15:41:23 -04:00
Martin KaFai Lau f3f1c054c2 bpf: Introduce bpf_map ID
This patch generates a unique ID for each created bpf_map.
The approach is similar to the earlier patch for bpf_prog ID.

It is worth noting that the bpf_map's ID and bpf_prog's ID
are in two independent ID spaces and both have the same valid range:
[1, INT_MAX).

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Alexei Starovoitov <ast@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-06 15:41:22 -04:00
Martin KaFai Lau dc4bb0e235 bpf: Introduce bpf_prog ID
This patch generates a unique ID for each BPF_PROG_LOAD-ed prog.
It is worth noting that each BPF_PROG_LOAD-ed prog will have
a different ID even if they have the same bpf instructions.

The ID is generated by the existing idr_alloc_cyclic().
The ID ranges over [1, INT_MAX).  It is allocated in a cyclic manner,
so an ID will get reused every 2 billion BPF_PROG_LOAD.

The bpf_prog_alloc_id() is done after bpf_prog_select_runtime()
because the jit process may have allocated a new prog.  Hence,
we need to ensure the value of pointer 'prog' will not be changed
any more before storing the prog to the prog_idr.

After bpf_prog_select_runtime(), the prog is read-only.  Hence,
the id is stored in 'struct bpf_prog_aux'.
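
Cyclic allocation can be illustrated with a toy model. The sketch below
is not idr_alloc_cyclic(): it uses a tiny fixed range in place of
[1, INT_MAX), and every name in it is hypothetical. It only shows why a
freed ID is not handed out again until the search wraps around.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EX_ID_END 8	/* exclusive upper bound; stands in for INT_MAX */

struct ex_idr {
	bool used[EX_ID_END];
	int next;	/* where the next search starts */
};

/* Return a free ID in [1, EX_ID_END), searching cyclically, or -1. */
static int ex_alloc_cyclic(struct ex_idr *idr)
{
	assert(idr != NULL);

	if (idr->next < 1 || idr->next >= EX_ID_END)
		idr->next = 1;	/* wrap around to the start of the range */

	for (int i = 0; i < EX_ID_END - 1; i++) {
		int id = 1 + (idr->next - 1 + i) % (EX_ID_END - 1);

		if (!idr->used[id]) {
			idr->used[id] = true;
			idr->next = id + 1;
			return id;
		}
	}
	return -1;	/* every ID is in use */
}

static void ex_free_id(struct ex_idr *idr, int id)
{
	assert(idr != NULL && id >= 1 && id < EX_ID_END);
	idr->used[id] = false;
}
```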

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Alexei Starovoitov <ast@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-06 15:41:22 -04:00
yuval.shaia@oracle.com f8fe997546 net: phy: Delete unused function phy_ethtool_gset
It's unused, so remove it.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-06 15:12:28 -04:00
Jiri Pirko 5a4d1fee2f net: sched: introduce helper to identify gact trap action
Introduce a helper called is_tcf_gact_trap which could be used to
tell if the action is gact trap or not.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Yotam Gigi <yotamg@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-06 12:45:23 -04:00
Jiri Pirko e25ea21ffa net: sched: introduce a TRAP control action
There is a need to instruct the HW offloaded path to push certain matched
packets to cpu/kernel for further analysis. So this patch introduces a
new TRAP control action to TC.

For kernel datapath, this action does not make much sense. So with the
same logic as in HW, new TRAP behaves similarly to STOLEN. The skb is just
dropped in the datapath (and virtually ejected to an upper level, which
does not exist in case of kernel).

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Yotam Gigi <yotamg@mellanox.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-06 12:45:23 -04:00
Linus Torvalds 84c6c3035b media fixes for v4.12-rc4
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJZNnBiAAoJEAhfPr2O5OEV9bwP/1bus9tAw3AT+HxRSIaFFX8+
 DMDmJ6nZ4WQJ4fI04tKkUjpl+G2ImDGshdGgLht/YpaJRd6KgPqV+zWrAVX5/0e1
 mLyhjaALuk5M//JbkxEP95SWBOZ6SCIWlV/5oQRTNI86kO0gISxoCAsbumKlSSUC
 qTFmbmPp9siFpS43eZjVcgYIbwFx75qvLTc1+JRvxa2VhtMB5d4xYnXSpxlCvduj
 NN14KiphBgCOvyMQsi4q3H6ma8EL0sEtaukqPzXOnz6GGAIUUbDA23APM5H0LIIZ
 kYhO9ooez4dz1094ex1zSS/uQq2ogCTv7ShQseddNbHhOFG7Aq30AXLMEWeHaNp1
 fFb28CY3CBpNaYfjePbqIs8KKg3JxmJGmCGgW65p40UGUo1Itbpci5MqN8BjQAI8
 Ks1rf+V4iYQTr4QmQJQqCyJCljrsQbGMKZ9I67pmqfbqDunlH43Zr88DEWPv3rbW
 qac6U1vh108UHE/1KRZFjzvo31ToP+f+AwyVTXVeIi6vba2gvC8ASCJnZ/nGtO74
 Eb/GR0DtqvYGE6sXohbMywZ+8wRR6CdRVDC4YotQwaoghwnH10WPLg3JahECVMu7
 MbDtVvUHjbJ18cqwCW+J01gcuQxH/8Lx07T9T+pUFFanPBT7phPiQ/UAEPL1e3XO
 e4nFwX9h78wISBdy8Yx7
 =+jBV
 -----END PGP SIGNATURE-----

Merge tag 'media/v4.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media fixes from Mauro Carvalho Chehab:
 "Some bug fixes:

   - Don't fail build if atomisp has warnings

   - Some CEC Kconfig changes to allow it to be used by DRM without
     media dependencies

   - A race fix at RC initialization code

   - A driver fix at rainshadow-cec

  IMHO, the one that affects most people in this series is a build fix:
  if you try to build the Kernel with W=1 or using gcc7 and
  all[yes|mod]config, build will fail due to -Werror at atomisp
  makefiles"

* tag 'media/v4.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  [media] rc-core: race condition during ir_raw_event_register()
  [media] cec: drop MEDIA_CEC_DEBUG
  [media] cec: rename MEDIA_CEC_NOTIFIER to CEC_NOTIFIER
  [media] cec: select CEC_CORE instead of depend on it
  [media] rainshadow-cec: ensure exit_loop is intialized
  [media] atomisp: don't treat warnings as errors
2017-06-06 09:37:44 -07:00
David S. Miller bb36314054 RxRPC rewrite
-----BEGIN PGP SIGNATURE-----
 
 iQIVAwUAWTZjxvSw1s6N8H32AQKwUQ/8CPF6CFwn+oS7cTkkI27sKaW43tTWyxxl
 1qAXjeI5dqrnmR+xW5Xu06HwO8TQKfum5dvXJLse5y15ttbK9/fevRW1IzcYxeHQ
 YcR414c0akIZ72hJ93LZypmLwlhEicZs4dZXrUs6f6WuqFLwYrt4K1MyrY8Bt+bM
 +a2yLVToF4L0nI/aAhoU0Hh0sNv4AP/PKrLWEzhLDq1Q6xBiQHSsrHLPOPkJ9QqA
 KOZWSjZJj8j7gSBoXtMwQiBxV76KptbksYQFLpy3EwL/r7z1qBPI0TOAKnLDLs5Q
 cDHf2uSUTrgfO7TIg02/SJcHm+8s0p3K585E9iK5JZ6BMjdSRfKR14nJdlWyXdZ5
 EvvEA7AlUpukHVv+CP+03sdBfkZ3PSb4sAQ+CbwY30SKwL1fRE26NW0fZa5lSmUt
 E1ixCxHPJXPnSZJAa5kePdWDgQjn2qJI+3Zh+jw0yaQ+rAgpP4M95xckeWdU9PKg
 8uFMM7Z1h70PnmVV3nX603MqyVivpKEZKHKTQgqGz4BvB1ZEu9noLTfwQCodXtns
 /8/8sVD65L4/SpHr1AM3Y+v7483bHth8edAI0k/QZerdKGImR+enrYBoSZ53QkEf
 TG8pvK74Tdpw2LQJsUIDvL5+oBO4FtPNOmT4UHbotenrVkF/4laIFcCVPW58scG1
 mB8kAUS+bzs=
 =M7Pr
 -----END PGP SIGNATURE-----

Merge tag 'rxrpc-rewrite-20170606' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

David Howells says:

====================
rxrpc: Support service upgrade

Here's a set of patches that allow AF_RXRPC to support the AuriStor service
upgrade facility.  This allows the server to change the service ID
requested to an upgraded service if the client requests it upon the
initiation of a connection.

This is used by the AuriStor AFS-compatible servers to implement IPv6
handling and improved facilities by providing improved volume location,
volume, protection, file and cache management services.  Note that certain
parts of the AFS protocol carry hard-coded IPv4 addresses.

The reason AuriStor does it this way is that probing the improved service
ID first will not incur an ABORT or any other response on some servers if
the server is not listening on it - and so one has to employ a timeout.

This is implemented in the server by allowing an AF_RXRPC server to call
bind() twice on a socket to allow it to listen on two service IDs and then
call setsockopt() to instruct the server to upgrade one into the other if
the client requests it (by setting userStatus to 1 on the first DATA packet
on a connection).  If the upgrade occurs, all further operations on that
connection are done with the new service ID.  AF_RXRPC has to handle this
automatically as connections are not exposed to userspace.

Clients can request this facility by setting an RXRPC_UPGRADE_SERVICE
command in the sendmsg() control buffer and then observing the resultant
service ID in the msg_addr returned by recvmsg().  This should only be used
to probe the service.  Clients should then use the returned service ID in
all subsequent communications with that server.  Note that the kernel will
not retain this information should the connection expire from its cache.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-06 12:05:57 -04:00
Linus Torvalds ba7b2387ad Merge branch 'for-4.12-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fixes from Tejun Heo:
 "Two cgroup fixes. One to address RCU delay of cpuset removal affecting
  userland visible behaviors. The other fixes a race condition between
  controller disable and cgroup removal"

* 'for-4.12-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cpuset: consider dying css as offline
  cgroup: Prevent kill_css() from being called more than once
2017-06-05 15:37:03 -07:00
yuval.shaia@oracle.com 82c01a84d5 net/{mii, smsc}: Make mii_ethtool_get_link_ksettings and smc_netdev_get_ecmd return void
Make the return type void since these functions never return a meaningful value.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-05 11:00:42 -04:00
Rosen, Rami 4e2ec43654 genetlink: remove ops_list from genetlink header.
commit d91824c08f ("genetlink: register family ops as array") removed the
ops_list member from both genl_family and genl_ops; while the
documentation of genl_family was updated accordingly by this patch,
ops_list remained in the documentation of the genl_ops object.
This patch fixes it by removing ops_list from genl_ops documentation.

Signed-off-by: Rami Rosen <rami.rosen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-05 10:54:55 -04:00
Anmol Sarma 1e0ce2a1ee net: Update TCP congestion control documentation
Update tcp.txt to fix mandatory congestion control ops and default
CCA selection. Also, fix comment in tcp.h for undo_cwnd.

Signed-off-by: Anmol Sarma <me@anmolsarma.in>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-05 10:53:24 -04:00
David Howells 4e255721d1 rxrpc: Add service upgrade support for client connections
Make it possible for a client to use AuriStor's service upgrade facility.

The client does this by adding an RXRPC_UPGRADE_SERVICE control message to
the first sendmsg() of a call.  This takes no parameters.

When recvmsg() starts returning data from the call, the service ID field in
the returned msg_name will reflect the result of the upgrade attempt.  If
the upgrade was ignored, srx_service will match what was set in the
sendmsg(); if the upgrade happened the srx_service will be altered to
indicate the service the server upgraded to.

Note that:

 (1) The choice of upgrade service is up to the server

 (2) Further client calls to the same server that would share a connection
     are blocked if an upgrade probe is in progress.

 (3) This should only be used to probe the service.  Clients should then
     use the returned service ID in all subsequent communications with that
     server (and not set the upgrade).  Note that the kernel will not
     retain this information should the connection expire from its cache.

 (4) If a server that supports upgrading is replaced by one that doesn't,
     whilst a connection is live, and if the replacement is running, say,
     OpenAFS 1.6.4 or older or an older IBM AFS, then the replacement
     server will not respond to packets sent to the upgraded connection.

     At this point, calls will time out and the server must be reprobed.

Signed-off-by: David Howells <dhowells@redhat.com>
2017-06-05 14:30:49 +01:00
David Howells 4722974d90 rxrpc: Implement service upgrade
Implement AuriStor's service upgrade facility.  There are three problems
that this is meant to deal with:

 (1) Various of the standard AFS RPC calls have IPv4 addresses in their
     requests and/or replies - but there's no room for including IPv6
     addresses.

 (2) Definition of IPv6-specific RPC operations in the standard operation
     sets has not yet been achieved.

 (3) One could envision the creation of a new service on the same port as
     the original service.  The new service could implement improved
     operations - and the client could try this first, falling back to the
     original service if it's not there.

     Unfortunately, certain servers ignore packets addressed to a service
     they don't implement and don't respond in any way - not even with an
     ABORT.  This means that the client must then wait for the call timeout
     to occur.

What service upgrade does is to see if the connection is marked as being
'upgradeable' and if so, change the service ID in the server and thus the
request and reply formats.  Note that the upgrade isn't mandatory - a
server that supports only the original call set will ignore the upgrade
request.

In the protocol, the procedure is then as follows:

 (1) To request an upgrade, the first DATA packet in a new connection must
     have the userStatus set to 1 (this is normally 0).  The userStatus
     value is normally ignored by the server.

 (2) If the server doesn't support upgrading, the reply packets will
     contain the same service ID as for the first request packet.

 (3) If the server does support upgrading, all future reply packets on that
     connection will contain the new service ID and the new service ID will
     be applied to *all* further calls on that connection as well.

 (4) The RPC op used to probe the upgrade must take the same request data
     as the shadow call in the upgrade set (but may return a different
     reply).  GetCapability RPC ops were added to all standard sets for
     just this purpose.  Ops where the request formats differ cannot be
     used for probing.

 (5) The client must wait for completion of the probe before sending any
     further RPC ops to the same destination.  It should then use the
     service ID that recvmsg() reported back in all future calls.

 (6) The shadow service must have call definitions for all the operation
     IDs defined by the original service.


To support service upgrading, a server should:

 (1) Call bind() twice on its AF_RXRPC socket before calling listen().
     Each bind() should supply a different service ID, but the transport
     addresses must be the same.  This allows the server to receive
     requests with either service ID.

 (2) Enable automatic upgrading by calling setsockopt(), specifying
     RXRPC_UPGRADEABLE_SERVICE and passing in a two-member array of
     unsigned shorts as the argument:

	unsigned short optval[2];

     This specifies a pair of service IDs.  They must be different and must
     match the service IDs bound to the socket.  Member 0 is the service ID
     to upgrade from and member 1 is the service ID to upgrade to.
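
The constraints on the optval pair can be sketched as a userspace check
that a server might run before calling setsockopt().  This is an assumption
about how one could validate the pair, not the kernel's implementation;
service_is_bound() and upgrade_pair_valid() are hypothetical helpers.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: is 'id' among the service IDs bound to the socket? */
bool service_is_bound(const unsigned short *bound, size_t n, unsigned short id)
{
	for (size_t i = 0; i < n; i++)
		if (bound[i] == id)
			return true;
	return false;
}

/*
 * Check an upgrade pair against the rules stated above:
 * optval[0] (upgrade from) and optval[1] (upgrade to) must differ
 * and both must be bound to the socket.
 */
bool upgrade_pair_valid(const unsigned short *bound, size_t n,
			const unsigned short optval[2])
{
	if (optval[0] == optval[1])
		return false;
	return service_is_bound(bound, n, optval[0]) &&
	       service_is_bound(bound, n, optval[1]);
}
```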

Signed-off-by: David Howells <dhowells@redhat.com>
2017-06-05 14:30:49 +01:00
Talat Batheesh 6dc06c08be net/mlx4: Fix the check in attaching steering rules
Our previous patch (cited below) introduced a regression
for RAW Eth QPs.

Fix it by checking if the QP number provided by user-space
exists, hence allowing steering rules to be added for valid
QPs only.

Fixes: 89c557687a ("net/mlx4_en: Avoid adding steering rules with invalid ring")
Reported-by: Or Gerlitz <gerlitz.or@gmail.com>
Signed-off-by: Talat Batheesh <talatb@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-04 23:10:05 -04:00
Mintz, Yuval cbb8a12c08 qed: VF XDP support
The final addition on the qed front -
 - VFs would now require their PFs to provide multiple CIDs
 - Based on the availability of connections from PF, determine whether
   XDP is feasible and share it with qede via dev_info.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-04 23:08:31 -04:00
Mintz, Yuval 08bc8f15e6 qed: Multiple qzone queues for VFs
This adds the infrastructure for supporting VFs that want to open
multiple transmission queues on the same queue-zone.
At this point, there are no VFs that actually request this functionality,
but later patches would remedy that.

 a. VF and PF would communicate the capability during ACQUIRE;
    Legacy VFs would continue on behaving as they do today

 b. PF would communicate number of supported CIDs to the VF
    and would enforce said limitation

 c. Whenever VF passes a request for a given queue configuration
    it would also pass an associated index within said queue-zone

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-04 23:08:31 -04:00
Mintz, Yuval f604b17d7f qed*: L2 interface to use the SB structures directly
As part of an effort toward a cleaner separation between qed and the
protocol drivers, the L2 interface is to use the SB structure for
initialization purposes opaquely.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-04 23:08:30 -04:00
Jason A. Donenfeld 48a1df6533 skbuff: return -EMSGSIZE in skb_to_sgvec to prevent overflow
This is a defense-in-depth measure in response to bugs like
4d6fa57b4d ("macsec: avoid heap overflow in skb_to_sgvec"). There's
not only a potential overflow of sglist items, but also a stack overflow
potential, so we fix this by limiting the amount of recursion this function
is allowed to do. Not actually providing a bounded base case is a future
disaster that we can easily avoid here.

As a small matter of housekeeping, we take this opportunity to move the
documentation comment over the actual function the documentation is for.

While this could be implemented by using an explicit stack of skbuffs,
when implementing this, the function complexity increased considerably,
and I don't think such complexity and bloat is actually worth it. So,
instead I built this and tested it on x86, x86_64, ARM, ARM64, and MIPS,
and measured the stack usage there. I also reverted the recent MIPS
changes that give it a separate IRQ stack, so that I could experience
some worst-case situations. I found that limiting it to 24 layers deep
yielded a good stack usage with room for safety, as well as being much
deeper than any driver actually ever creates.
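
The idea of a bounded recursion that fails with -EMSGSIZE can be sketched
in plain C as follows.  This is a toy model and not the skb_to_sgvec()
code: struct buf, count_segs() and MAX_DEPTH are hypothetical names, and
only the limit of 24 is taken from the text above.

```c
#include <errno.h>
#include <stddef.h>

#define MAX_DEPTH 24	/* the limit quoted in the message above */

/* Hypothetical stand-in for a buffer with nested child buffers. */
struct buf {
	int nsegs;		/* segments held directly by this buffer */
	struct buf *child;	/* first nested buffer, or NULL */
	struct buf *next;	/* next sibling buffer, or NULL */
};

/*
 * Count the segments reachable from 'b', refusing to recurse more than
 * MAX_DEPTH levels and refusing to exceed 'room' scatterlist slots.
 * Returns the number of segments, or -EMSGSIZE on either overflow.
 */
int count_segs(const struct buf *b, int room, unsigned int depth)
{
	int used = b->nsegs;

	if (used > room)
		return -EMSGSIZE;

	for (const struct buf *c = b->child; c; c = c->next) {
		int ret;

		if (depth >= MAX_DEPTH)
			return -EMSGSIZE;
		ret = count_segs(c, room - used, depth + 1);
		if (ret < 0)
			return ret;
		used += ret;
	}
	return used;
}
```

The top-level caller passes a depth of 0, so both the slot overflow and the
excessive nesting surface as the same error code.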

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: David Howells <dhowells@redhat.com>
Cc: Sabrina Dubroca <sd@queasysnail.net>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-04 23:01:47 -04:00
Eric Dumazet 77d4b1d369 net: ping: do not abuse udp_poll()
Alexander reported various KASAN messages triggered in recent kernels.

The problem is that ping sockets should not use udp_poll() in the first
place, and recent changes in UDP stack finally exposed this old bug.

Fixes: c319b4d76b ("net: ipv4: add IPPROTO_ICMP socket kind")
Fixes: 6d0bfe2261 ("net: ipv6: Add IPv6 support to the ping socket.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Sasha Levin <alexander.levin@verizon.com>
Cc: Solar Designer <solar@openwall.com>
Cc: Vasiliy Kulikov <segoon@openwall.com>
Cc: Lorenzo Colitti <lorenzo@google.com>
Acked-By: Lorenzo Colitti <lorenzo@google.com>
Tested-By: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-04 22:56:55 -04:00
Teng Qin b7d3ed5be9 bpf: update perf event helper functions documentation
This commit updates documentation of the bpf_perf_event_output and
bpf_perf_event_read helpers to match their implementation.

Signed-off-by: Teng Qin <qinteng@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-04 21:58:16 -04:00
Alexei Starovoitov f91840a32d perf, bpf: Add BPF support to all perf_event types
Allow BPF_PROG_TYPE_PERF_EVENT program types to attach to all
perf_event types, including HW_CACHE, RAW, and dynamic pmu events.
Only tracepoint/kprobe events are treated differently: they require the
BPF_PROG_TYPE_TRACEPOINT/BPF_PROG_TYPE_KPROBE program types respectively.

Also add support for reading all event counters using
bpf_perf_event_read() helper.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-04 21:58:01 -04:00
Sowmini Varadhan 5071034e4a neigh: Really delete an arp/neigh entry on "ip neigh delete" or "arp -d"
The command
  # arp -s 62.2.0.1 a:b:c:d:e:f dev eth2
adds an entry like the following (listed by "arp -an")
  ? (62.2.0.1) at 0a:0b:0c:0d:0e:0f [ether] PERM on eth2
but the symmetric deletion command
  # arp -i eth2 -d 62.2.0.1
does not remove the PERM entry from the table, and instead leaves behind
  ? (62.2.0.1) at <incomplete> on eth2

The reason is that there is a refcnt of 1 for the arp_tbl itself
(neigh_alloc starts off the entry with a refcnt of 1), thus
the neigh_release() call from arp_invalidate() will (at best) just
decrement the ref to 1, but will never actually free it from the
table.

To fix this, we need to do something like neigh_forced_gc: if
the refcnt is 1 (i.e., on the table's ref), remove the entry from
the table and free it. This patch refactors and shares common code
between neigh_forced_gc and the newly added neigh_remove_one.
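
The "remove only when the table holds the last reference" rule can be
illustrated with a miniature list.  This is a sketch under simplifying
assumptions (no locking, no hashing); struct entry and remove_one() are
hypothetical and are not the kernel's struct neighbour or
neigh_remove_one().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature of a table entry. */
struct entry {
	int refcnt;		/* 1 means only the table holds a reference */
	struct entry *next;
};

/*
 * Unlink 'victim' from the list at '*head' and drop the table's
 * reference, but only if the table holds the sole reference.
 * Returns true if the entry was removed and may now be freed.
 */
bool remove_one(struct entry **head, struct entry *victim)
{
	for (struct entry **pp = head; *pp; pp = &(*pp)->next) {
		if (*pp != victim)
			continue;
		if (victim->refcnt != 1)
			return false;	/* still in use elsewhere */
		*pp = victim->next;
		victim->next = NULL;
		victim->refcnt = 0;
		return true;
	}
	return false;			/* not in this table */
}
```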

A similar issue exists for IPv6 Neighbor Cache entries, and is fixed
in a similar manner by this patch.

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Reviewed-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-04 21:37:18 -04:00
Florian Fainelli 14be36c2c9 net: dsa: Initialize all CPU and enabled ports masks in dsa_ds_parse()
There was no reason for duplicating the code that initializes
ds->enabled_port_mask in both dsa_parse_ports_dn() and
dsa_parse_ports(), instead move this to dsa_ds_parse() which is early
enough before ops->setup() has run.

While at it, we can now make dsa_is_cpu_port() check ds->cpu_port_mask
which is a step towards being multi-CPU port capable.
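
A mask-based CPU port test might look like the sketch below.  struct sw and
is_cpu_port() are hypothetical stand-ins for the real DSA types, and the
32-bit mask width is an assumption made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical cut-down switch state; only the two masks matter here. */
struct sw {
	uint32_t cpu_port_mask;		/* one bit per CPU port */
	uint32_t enabled_port_mask;	/* one bit per enabled port */
};

/* True if 'port' is flagged in the CPU port mask; several bits may be set. */
bool is_cpu_port(const struct sw *ds, unsigned int port)
{
	return port < 32 && (ds->cpu_port_mask & (UINT32_C(1) << port)) != 0;
}
```

Because the test is against a mask rather than a single port number, more
than one port can be reported as a CPU port.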

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-04 20:05:15 -04:00
Or Gerlitz 4d80cc0aaa net/sched: cls_flower: add support for matching on ip tos and ttl
Benefit from the support of ip header fields dissection and
allow users to set rules matching on ipv4 tos and ttl or
ipv6 traffic-class and hoplimit.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-04 18:12:24 -04:00
Or Gerlitz 518d8a2e9b net/flow_dissector: add support for dissection of misc ip header fields
Add support for dissection of ip tos and ttl and ipv6 traffic-class
and hoplimit. Both are dissected into the same struct.

Uses a call to the ip dissection function similar to those for tcp, arp and others.
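
How both address families can fill one struct is sketched below in
userspace C.  This is not the flow dissector code: struct ip_misc and
dissect_ip_misc() are hypothetical names, and only the header field offsets
come from the IPv4 and IPv6 header layouts.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical key; one struct serves both address families. */
struct ip_misc {
	uint8_t tos;	/* IPv4 TOS or IPv6 traffic class */
	uint8_t ttl;	/* IPv4 TTL or IPv6 hop limit */
};

/* Returns 0 on success, -1 if the header is too short or not IPv4/IPv6. */
int dissect_ip_misc(const uint8_t *hdr, size_t len, struct ip_misc *key)
{
	if (len < 1)
		return -1;

	switch (hdr[0] >> 4) {		/* version nibble */
	case 4:
		if (len < 20)
			return -1;
		key->tos = hdr[1];
		key->ttl = hdr[8];
		return 0;
	case 6:
		if (len < 40)
			return -1;
		/* traffic class straddles the first two bytes */
		key->tos = (uint8_t)((hdr[0] << 4) | (hdr[1] >> 4));
		key->ttl = hdr[7];
		return 0;
	default:
		return -1;
	}
}
```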

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-04 18:12:23 -04:00
Hans Verkuil e94c32818d [media] cec: rename MEDIA_CEC_NOTIFIER to CEC_NOTIFIER
This config option is strictly speaking independent of the
media subsystem since it can be used by drm as well.

Besides, it looks odd when drivers select CEC_CORE and
MEDIA_CEC_NOTIFIER, that's inconsistent naming.

Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2017-06-04 15:23:35 -03:00