alistair23-linux/net
Julian Anastasov 0c122fc90d ipvs: allow connection reuse for unconfirmed conntrack
[ Upstream commit f0a5e4d7a5 ]

YangYuxi reports that connection reuse
causes a one-second delay when a SYN hits
an existing connection in TIME_WAIT state.
This delay was added to give both the IPVS
connection and the corresponding conntrack
time to expire. It was considered a rare case
at the time, but it causes problems in
some environments such as Kubernetes.

As nf_conntrack_tcp_packet() can decide to
release the conntrack in TIME_WAIT state and
replace it with a fresh NEW conntrack, we
can use this to allow rescheduling just by
tuning our check: if the conntrack is
confirmed, we cannot schedule it to a different
real server and the one-second delay still
applies, but if a new conntrack was created,
we are free to select a new real server without
any delay.
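
The decision described above can be modelled in user space as a
minimal sketch. This is not the kernel patch itself: the enum, the
function names and the single boolean input are hypothetical and
only illustrate the rule "confirmed conntrack keeps the delay,
fresh conntrack may be rescheduled at once".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical verdicts for a SYN that hits an existing IPVS
 * connection in TIME_WAIT state. Not kernel identifiers. */
enum reuse_verdict {
	/* Old conntrack is still attached: keep the existing behaviour,
	 * so the one-second delay applies while both entries expire. */
	REUSE_KEEP_DELAY,
	/* A fresh conntrack was created for this packet: a new real
	 * server can be selected without any delay. */
	REUSE_RESCHEDULE_NOW,
};

/* ct_confirmed models whether the conntrack seen with the packet is
 * already confirmed (assumption: this is the only input that matters
 * for the sketch; the real check has more conditions). */
enum reuse_verdict decide_reuse(bool ct_confirmed)
{
	return ct_confirmed ? REUSE_KEEP_DELAY : REUSE_RESCHEDULE_NOW;
}

/* Usage example: both branches of the rule. */
void reuse_examples(void)
{
	assert(decide_reuse(true) == REUSE_KEEP_DELAY);
	assert(decide_reuse(false) == REUSE_RESCHEDULE_NOW);
}
```

The sketch deliberately reduces the check to one flag so that the
two outcomes named in the text are easy to see side by side.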

YangYuxi lists some of the problem reports:

- One second connection delay in masquerading mode:
https://marc.info/?t=151683118100004&r=1&w=2

- IPVS low throughput #70747
https://github.com/kubernetes/kubernetes/issues/70747

- Apache Bench can fill up ipvs service proxy in seconds #544
https://github.com/cloudnativelabs/kube-router/issues/544

- Additional 1s latency in `host -> service IP -> pod`
https://github.com/kubernetes/kubernetes/issues/90854

Fixes: f719e3754e ("ipvs: drop first packet to redirect conntrack")
Co-developed-by: YangYuxi <yx.atom1@gmail.com>
Signed-off-by: YangYuxi <yx.atom1@gmail.com>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Reviewed-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-08-19 08:16:10 +02:00
6lowpan 6lowpan: no need to check return value of debugfs_create functions 2019-07-06 12:50:01 +02:00
9p net/9p: validate fds in p9_fd_open 2020-08-11 15:33:36 +02:00
802 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
8021q vlan: vlan_changelink() should propagate errors 2020-01-12 12:21:50 +01:00
appletalk appletalk: Fix atalk_proc_init() return path 2020-08-11 15:33:40 +02:00
atm net: atm: Reduce the severity of logging in unlink_clip_vcc 2019-11-18 17:08:20 -08:00
ax25 AX.25: Prevent integer overflows in connect and sendmsg 2020-07-31 18:39:31 +02:00
batman-adv batman-adv: Revert "disable ethtool link speed detection when auto negotiation off" 2020-06-22 09:30:56 +02:00
bluetooth Bluetooth: add a mutex lock to avoid UAF in do_enale_set 2020-08-19 08:15:59 +02:00
bpf bpf/flow_dissector: support flags in BPF_PROG_TEST_RUN 2019-07-25 18:00:41 -07:00
bpfilter net/bpfilter: remove superfluous testing message 2020-04-21 09:04:53 +02:00
bridge bridge: mcast: Fix MLD2 Report IPv6 payload length check 2020-07-22 09:32:46 +02:00
caif net: use skb_queue_empty_lockless() in poll() handlers 2019-10-28 13:33:41 -07:00
can can: j1939: j1939_sk_bind(): take priv after lock is held 2019-12-31 16:45:56 +01:00
ceph libceph: don't omit recovery_deletes in target_copy() 2020-07-22 09:33:17 +02:00
core bpf: sockmap: Require attach_bpf_fd when detaching a program 2020-08-07 09:34:02 +02:00
dcb treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 201 2019-05-30 11:29:52 -07:00
dccp dccp: Fix possible memleak in dccp_init and dccp_fini 2020-06-17 16:40:32 +02:00
decnet net: add bool confirm_neigh parameter for dst_ops.update_pmtu 2020-01-04 19:18:58 +01:00
dns_resolver KEYS: Don't write out to userspace while holding key semaphore 2020-04-23 10:36:45 +02:00
dsa net: dsa: declare lockless TX feature for slave ports 2020-06-03 08:21:38 +02:00
ethernet net: add annotations on hh->hh_len lockless accesses 2020-01-09 10:20:06 +01:00
hsr hsr: check protocol version in hsr_newlink() 2020-04-21 09:04:44 +02:00
ieee802154 nl802154: add missing attribute validation for dev_type 2020-03-18 07:17:44 +01:00
ife net: Fix Kconfig indentation 2019-09-26 08:56:17 +02:00
ipv4 tcp: apply a floor of 1 for RTT samples from TCP timestamps 2020-08-11 15:33:41 +02:00
ipv6 ipv6: Fix nexthop refcnt leak when creating ipv6 route info 2020-08-11 15:33:39 +02:00
iucv net/af_iucv: mark expected switch fall-throughs 2019-07-29 10:26:14 -07:00
kcm kcm: disable preemption in kcm_parse_func_strparser() 2019-09-27 10:27:14 +02:00
key xfrm: policy: match with both mark and mask on user interfaces 2020-08-05 09:59:44 +02:00
l2tp l2tp: remove skb_dst_set() from l2tp_xmit_skb() 2020-07-22 09:32:47 +02:00
l3mdev ipv6: convert major tx path to use RT6_LOOKUP_F_DST_NOREF 2019-06-23 13:24:17 -07:00
lapb Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-06-17 20:20:36 -07:00
llc llc: make sure applications use ARPHRD_ETHER 2020-07-22 09:32:47 +02:00
mac80211 mac80211: mesh: Free pending skb when destroying a mpath 2020-08-05 09:59:48 +02:00
mac802154 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 174 2019-05-30 11:26:41 -07:00
mpls net: ipv6_stub: use ip6_dst_lookup_flow instead of ip6_dst_lookup 2019-12-18 16:08:42 +01:00
ncsi net/ncsi: Disable global multicast filter 2019-09-19 18:04:40 -07:00
netfilter ipvs: allow connection reuse for unconfirmed conntrack 2020-08-19 08:16:10 +02:00
netlabel netlabel: cope with NULL catmap 2020-05-20 08:20:08 +02:00
netlink genetlink: remove genl_bind 2020-07-22 09:32:46 +02:00
netrom net: netrom: Fix potential nr_neigh refcnt leak in nr_add_node 2020-04-29 16:33:08 +02:00
nfc nfc: add missing attribute validation for vendor subcommand 2020-03-18 07:17:46 +01:00
nsh treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
openvswitch openvswitch: Prevent kernel-infoleak in ovs_ct_put_key() 2020-08-11 15:33:41 +02:00
packet net/packet: tpacket_rcv: avoid a producer race condition 2020-04-01 11:01:35 +02:00
phonet net: use skb_queue_empty_lockless() in poll() handlers 2019-10-28 13:33:41 -07:00
psample net: psample: fix skb_over_panic 2019-12-04 22:30:54 +01:00
qrtr qrtr: orphan socket in qrtr_release() 2020-07-31 18:39:30 +02:00
rds rds: Prevent kernel-infoleak in rds_notify_queue_get() 2020-08-05 09:59:44 +02:00
rfkill rfkill: Fix incorrect check to avoid NULL pointer dereference 2020-01-12 12:21:33 +01:00
rose net: core: add generic lockdep keys 2019-10-24 14:53:48 -07:00
rxrpc rxrpc: Fix race between recvmsg and sendmsg on immediate call failure 2020-08-11 15:33:40 +02:00
sched sched: consistently handle layer3 header accesses in the presence of VLANs 2020-07-22 09:32:48 +02:00
sctp sctp: shrink stream outq when fails to do addstream reconf 2020-07-31 18:39:31 +02:00
smc net/smc: cancel event worker during device removal 2020-03-18 07:17:59 +01:00
strparser Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-06-22 08:59:24 -04:00
sunrpc nfsd: Fix NFSv4 READ on RDMA when using readv 2020-08-11 15:33:42 +02:00
switchdev treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
tipc tipc: block BH before using dst_cache 2020-06-03 08:21:03 +02:00
tls bpf: Fix running sk_skb program types with ktls 2020-06-22 09:31:12 +02:00
unix af_unix: add compat_ioctl support 2020-01-17 19:48:52 +01:00
vmw_vsock vsock/virtio: annotate 'the_virtio_vsock' RCU pointer 2020-07-29 10:18:31 +02:00
wimax wimax: no need to check return value of debugfs_create functions 2019-08-10 15:25:47 -07:00
wireless cfg80211: check vendor command doit pointer before use 2020-08-11 15:33:38 +02:00
x25 net/x25: Fix null-ptr-deref in x25_disconnect 2020-08-05 09:59:44 +02:00
xdp xdp: Fix xsk_generic_xmit errno 2020-06-24 17:50:44 +02:00
xfrm xfrm: policy: match with both mark and mask on user interfaces 2020-08-05 09:59:44 +02:00
Kconfig net: Fix CONFIG_NET_CLS_ACT=n and CONFIG_NFT_FWD_NETDEV={y, m} build 2020-04-01 11:02:18 +02:00
Makefile net: split out functions related to registering inflight socket files 2019-02-28 08:24:23 -07:00
compat.c uio: make import_iovec()/compat_import_iovec() return bytes on success 2019-05-31 15:30:03 -06:00
socket.c compat_ioctl: handle SIOCOUTQNSD 2020-01-17 19:48:52 +01:00
sysctl_net.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00