1
0
Fork 0
alistair23-linux/drivers/net
Fred Lotter 28abe57962 nfp: flower: cmsg rtnl locks can timeout reify messages
Flower control message replies are handled in different locations. The truly
high priority replies are handled in the BH (tasklet) context, while the
remaining replies are handled in a predefined Linux work queue. The work
queue handler orders replies into high and low priority groups, and always
start servicing the high priority replies within the received batch first.

Reply Type:			Rtnl Lock:	Handler:

CMSG_TYPE_PORT_MOD		no		BH tasklet (mtu)
CMSG_TYPE_TUN_NEIGH		no		BH tasklet
CMSG_TYPE_FLOW_STATS		no		BH tasklet
CMSG_TYPE_PORT_REIFY		no		WQ high
CMSG_TYPE_PORT_MOD		yes		WQ high (link/mtu)
CMSG_TYPE_MERGE_HINT		yes		WQ low
CMSG_TYPE_NO_NEIGH		no		WQ low
CMSG_TYPE_ACTIVE_TUNS		no		WQ low
CMSG_TYPE_QOS_STATS		no		WQ low
CMSG_TYPE_LAG_CONFIG		no		WQ low

A subset of control messages can block waiting for an rtnl lock (from both
work queue priority groups). The rtnl lock is heavily contended for by
external processes such as systemd-udevd, systemd-network and libvirtd,
especially during netdev creation, such as when flower VFs and representors
are instantiated.

Kernel netlink instrumentation shows that external processes (such as
systemd-udevd) often use successive rtnl_trylock() sequences, which can result
in an rtnl_lock() blocked control message to starve for longer periods of time
during rtnl lock contention, i.e. netdev creation.

In the current design a single blocked control message will block the entire
work queue (both priorities), and introduce a latency which is
nondeterministic and dependent on system wide rtnl lock usage.

In some extreme cases, one blocked control message at exactly the wrong time,
just before the maximum number of VFs are instantiated, can block the work
queue for long enough to prevent VF representor REIFY replies from getting
handled in time for the 40ms timeout.

The firmware will deliver the total maximum number of REIFY message replies in
around 300us.

Only REIFY and MTU update messages require replies within a timeout period (of
40ms). The MTU-only updates are already done directly in the BH (tasklet)
handler.

Move the REIFY handler down into the BH (tasklet) in order to resolve timeouts
caused by a blocked work queue waiting on rtnl locks.

Signed-off-by: Fred Lotter <frederik.lotter@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-07 18:05:50 +02:00
..
appletalk
arcnet arcnet: com20020-isa: Mark expected switch fall-throughs 2019-07-29 10:23:59 -07:00
bonding bonding: Add vlan tx offload to hw_enc_features 2019-08-08 22:01:44 -07:00
caif caif-hsi: fix possible deadlock in cfhsi_exit_module() 2019-07-17 11:58:56 -07:00
can Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2019-08-06 17:11:59 -07:00
dsa net: dsa: microchip: add KSZ8563 compatibility string 2019-08-31 23:36:37 -07:00
ethernet nfp: flower: cmsg rtnl locks can timeout reify messages 2019-09-07 18:05:50 +02:00
fddi The main MIPS changes for a pretty light v5.3 cycle, including: 2019-07-17 09:42:03 -07:00
fjes fjes: no need to check return value of debugfs_create functions 2019-06-22 16:43:08 -07:00
hamradio net/hamradio/6pack: Fix the size of a sk_buff used in 'sp_bump()' 2019-09-07 15:46:28 +02:00
hippi hippi: Remove call to memset after pci_alloc_consistent 2019-07-15 11:06:27 -07:00
hyperv hv_netvsc: Fix a warning of suspicious RCU usage 2019-08-09 13:42:19 -07:00
ieee802154 Merge branch 'ieee802154-for-davem-2019-08-24' of git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan 2019-08-24 13:46:57 -07:00
ipvlan Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-06-07 09:29:14 -07:00
netdevsim netdevsim: Restore per-network namespace accounting for fib entries 2019-08-11 20:59:19 -07:00
phy net: phylink: Fix flow control resolution 2019-09-07 17:26:13 +02:00
plip Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-06-07 11:00:14 -07:00
ppp compat_ioctl: pppoe: fix PPPOEIOCSFWD handling 2019-07-30 14:42:13 -07:00
slip
team team: Add vlan tx offload to hw_enc_features 2019-08-08 22:41:41 -07:00
usb r8152: remove calling netif_napi_del 2019-08-28 16:02:32 -07:00
vmxnet3 vmxnet3: Remove call to memset after dma_alloc_coherent 2019-07-15 11:06:27 -07:00
wan net: wan: sdla: Mark expected switch fall-through 2019-07-29 13:57:58 -07:00
wimax wimax/i2400m: fix a memory leak bug 2019-08-18 14:11:28 -07:00
wireless rsi: fix a double free bug in rsi_91x_deinit() 2019-09-03 16:54:48 +03:00
xen-netback xen/netback: Reset nr_frags before freeing skb 2019-08-08 18:01:24 -07:00
Kconfig
LICENSE.SRC
Makefile
Space.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
dummy.c
eql.c
geneve.c SPDX update for 5.2-rc6 2019-06-21 09:58:42 -07:00
gtp.c gtp: add missing gtp_encap_disable_sock() in gtp_encap_enable() 2019-07-07 18:42:48 -07:00
ifb.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
loopback.c loopback: fix lockdep splat 2019-07-03 11:24:38 -07:00
macsec.c macsec: fix checksumming after decryption 2019-07-02 14:12:29 -07:00
macvlan.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-06-07 11:00:14 -07:00
macvtap.c
mdio.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
mii.c
net_failover.c
netconsole.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 153 2019-05-30 11:26:32 -07:00
nlmon.c
ntb_netdev.c
rionet.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
sb1000.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
sungem_phy.c
tap.c coallocate socket_wq with socket itself 2019-07-08 19:25:19 -07:00
thunderbolt.c
tun.c tun: mark small packets as owned by the tap sock 2019-07-25 11:38:32 -07:00
veth.c veth: Support bulk XDP_TX 2019-06-25 14:26:54 +02:00
virtio_net.c virtio_net: enable napi_tx by default 2019-06-14 19:34:27 -07:00
vrf.c vrf: make sure skb->data contains ip header to make routing 2019-07-21 13:32:51 -07:00
vsockmon.c
vxlan.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-07-08 19:48:57 -07:00
xen-netfront.c