1
0
Fork 0
alistair23-linux/net/ipv4
Neal Cardwell add880d788 tcp: fix cwnd-limited bug for TSO deferral where we send nothing
[ Upstream commit 299bcb55ec ]

When cwnd is not a multiple of the TSO skb size of N*MSS, we can get
into persistent scenarios where we have the following sequence:

(1) ACK for full-sized skb of N*MSS arrives
  -> tcp_write_xmit() transmit full-sized skb with N*MSS
  -> move pacing release time forward
  -> exit tcp_write_xmit() because pacing time is in the future

(2) TSQ callback or TCP internal pacing timer fires
  -> try to transmit next skb, but TSO deferral finds remainder of
     available cwnd is not big enough to trigger an immediate send
     now, so we defer sending until the next ACK.

(3) repeat...

So we can get into a case where we never mark ourselves as
cwnd-limited for many seconds at a time, even with
bulk/infinite-backlog senders, because:

o In case (1) above, every time in tcp_write_xmit() we have enough
cwnd to send a full-sized skb, we are not fully using the cwnd
(because cwnd is not a multiple of the TSO skb size). So every time we
send data, we are not cwnd limited, and so in the cwnd-limited
tracking code in tcp_cwnd_validate() we mark ourselves as not
cwnd-limited.

o In case (2) above, every time in tcp_write_xmit() that we try to
transmit the "remainder" of the cwnd but defer, we set the local
variable is_cwnd_limited to true, but we do not send any packets, so
sent_pkts is zero, so we don't call the cwnd-limited logic to update
tp->is_cwnd_limited.

Fixes: ca8a226343 ("tcp: make cwnd-limited checks measurement-based, and gentler")
Reported-by: Ingemar Johansson <ingemar.s.johansson@ericsson.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20201209035759.1225145-1-ncardwell.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-21 13:27:04 +01:00
..
bpfilter SPDX update for 5.2-rc2, round 1 2019-05-21 12:33:38 -07:00
netfilter netfilter: use actual socket sk rather than skb sk when routing harder 2020-11-18 19:20:17 +01:00
Kconfig vti[6]: fix packet tx through bpf_redirect() in XinY cases 2020-04-01 11:02:05 +02:00
Makefile net: Initial nexthop code 2019-05-28 21:37:30 -07:00
af_inet.c net: remove empty inet_exit_net 2019-08-19 18:22:54 -07:00
ah4.c xfrm: remove type and offload_type map from xfrm_state_afinfo 2019-06-06 08:34:50 +02:00
arp.c Exempt multicast addresses from five-second neighbor lifetime 2020-11-24 13:28:56 +01:00
cipso_ipv4.c netlabel: cope with NULL catmap 2020-05-20 08:20:08 +02:00
datagram.c inet: stop leaking jiffies on the wire 2019-11-01 14:57:52 -07:00
devinet.c devinet: fix memleak in inetdev_init() 2020-06-10 20:24:54 +02:00
esp4.c xfrm: remove get_mtu indirection from xfrm_type 2019-07-01 06:16:40 +02:00
esp4_offload.c xfrm: remove the xfrm_state_put call becofe going to out_reset 2020-06-03 08:21:32 +02:00
fib_frontend.c ipv4: fix error return code in rtm_to_fib_config() 2020-12-21 13:27:03 +01:00
fib_lookup.h ipv4: Use accessors for fib_info nexthop data 2019-06-04 19:26:49 -07:00
fib_notifier.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
fib_rules.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-06-07 11:00:14 -07:00
fib_semantics.c net: Fix the arp error in some cases 2020-06-30 15:36:44 -04:00
fib_trie.c ipv4: Silence suspicious RCU usage warning 2020-09-12 14:18:54 +02:00
fou.c fou: Fix IPv6 netlink policy 2020-01-29 16:45:22 +01:00
gre_demux.c gre: fix uninit-value in __iptunnel_pull_header 2020-03-18 07:17:38 +01:00
gre_offload.c net: gre: recompute gre csum for sctp over gre tunnels 2020-08-11 15:33:40 +02:00
icmp.c icmp: randomize the global rate limiter 2020-10-29 09:57:27 +01:00
igmp.c net: fix __ip_mc_inc_group usage 2019-08-20 12:48:06 -07:00
inet_connection_sock.c net: refactor bind_bucket fastreuse into helper 2020-08-19 08:16:23 +02:00
inet_diag.c inet_diag: Fix error path to cancel the meseage in inet_req_diag_fill() 2020-11-24 13:28:56 +01:00
inet_fragment.c inet: frags: re-introduce skb coalescing for local delivery 2019-08-08 15:55:10 -07:00
inet_hashtables.c net: initialize fastreuse on inet_inherit_port 2020-08-19 08:16:23 +02:00
inet_timewait_sock.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
inetpeer.c inetpeer: fix data-race in inet_putpeer / inet_putpeer 2020-01-04 19:18:39 +01:00
ip_forward.c ipv4: Revert removal of rt_uses_gateway 2019-09-20 18:23:33 -07:00
ip_fragment.c inet: frags: re-introduce skb coalescing for local delivery 2019-08-08 15:55:10 -07:00
ip_gre.c ip_gre: set dev->hard_header_len and dev->needed_headroom properly 2020-10-29 09:58:04 +01:00
ip_input.c netfilter: drop bridge nf reset from nf_reset 2019-10-01 18:42:15 +02:00
ip_options.c netfilter: nf_tables: add support for matching IPv4 options 2019-06-21 18:35:51 +02:00
ip_output.c ip: fix tos reflection in ack and reset packets 2020-09-26 18:03:12 +02:00
ip_sockglue.c ip_sockglue: Fix missing-check bug in ip_ra_control() 2019-05-25 11:00:50 -07:00
ip_tunnel.c ip_tunnel: fix over-mtu packet send fail without TUNNEL_DONT_FRAGMENT flags 2020-11-10 12:37:25 +01:00
ip_tunnel_core.c ip_tunnel: allow not to count pkts on tstats by setting skb's dev to NULL 2019-06-18 20:48:45 -04:00
ip_vti.c ip_vti: receive ipip packet by calling ip_tunnel_rcv 2020-06-03 08:21:34 +02:00
ipcomp.c Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next 2019-07-05 15:01:15 -07:00
ipconfig.c ipconfig: add carrier_timeout kernel parameter 2019-02-01 15:24:13 -08:00
ipip.c net: ipip: fix wrong address family in init error path 2020-06-03 08:20:52 +02:00
ipmr.c net: don't return invalid table id error when we fall back to PF_UNSPEC 2020-06-03 08:20:41 +02:00
ipmr_base.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-05-07 17:22:09 -07:00
metrics.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
netfilter.c netfilter: use actual socket sk rather than skb sk when routing harder 2020-11-18 19:20:17 +01:00
netlink.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
nexthop.c nexthop: Fix performance regression in nexthop deletion 2020-10-29 09:57:26 +01:00
ping.c ipv4: fill fl4_icmp_{type,code} in ping_v4_sendmsg 2020-07-22 09:32:47 +02:00
proc.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-06-17 20:20:36 -07:00
protocol.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
raw.c netfilter: drop bridge nf reset from nf_reset 2019-10-01 18:42:15 +02:00
raw_diag.c inet_diag: return classid for all socket types 2020-03-18 07:17:38 +01:00
route.c ipv4: Fix tos mask in inet_rtm_getroute() 2020-12-08 10:40:25 +01:00
syncookies.c net: Update window_clamp if SOCK_RCVBUF is set 2020-11-18 19:20:32 +01:00
sysctl_net_ipv4.c tcp: correct read of TFO keys on big endian systems 2020-08-19 08:16:23 +02:00
tcp.c tcp: Prevent low rmem stalls with SO_RCVLOWAT. 2020-11-01 12:01:04 +01:00
tcp_bbr.c tcp: only postpone PROBE_RTT if RTT is < current min_rtt estimate 2020-11-24 13:28:59 +01:00
tcp_bic.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
tcp_bpf.c bpf, sockmap: Ensure SO_RCVBUF memory is observed on ingress redirect 2020-11-24 13:29:08 +01:00
tcp_cdg.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
tcp_cong.c tcp: Set INET_ECN_xmit configuration in tcp_reinit_congestion_control 2020-12-08 10:40:24 +01:00
tcp_cubic.c tcp_cubic: fix spurious HYSTART_DELAY exit upon drop in min RTT 2020-06-30 15:36:47 -04:00
tcp_dctcp.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
tcp_dctcp.h tcp: refactor DCTCP ECN ACK handling 2018-10-10 22:26:00 -07:00
tcp_diag.c tcp: annotate tp->write_seq lockless reads 2019-10-13 10:13:08 -07:00
tcp_fastopen.c tcp: correct read of TFO keys on big endian systems 2020-08-19 08:16:23 +02:00
tcp_highspeed.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
tcp_htcp.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
tcp_hybla.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
tcp_illinois.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
tcp_input.c tcp: select sane initial rcvq_space.space for big MSS 2020-12-21 13:27:04 +01:00
tcp_ipv4.c tcp: fix receive window update in tcp_add_backlog() 2020-10-14 10:33:06 +02:00
tcp_lp.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
tcp_metrics.c tcp: refactor setting the initial congestion window 2019-05-01 11:47:54 -04:00
tcp_minisocks.c tcp: annotate tp->snd_nxt lockless reads 2019-10-13 10:13:08 -07:00
tcp_nv.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
tcp_offload.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
tcp_output.c tcp: fix cwnd-limited bug for TSO deferral where we send nothing 2020-12-21 13:27:04 +01:00
tcp_rate.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
tcp_recovery.c tcp: introduce tcp_skb_timestamp_us() helper 2018-09-21 19:37:59 -07:00
tcp_scalable.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
tcp_timer.c tcp: add rcu protection around tp->fastopen_rsk 2019-10-13 10:13:08 -07:00
tcp_ulp.c bpf: Sockmap/tls, push write_space updates through ulp updates 2020-01-23 08:22:45 +01:00
tcp_vegas.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
tcp_vegas.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
tcp_veno.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
tcp_westwood.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
tcp_yeah.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
tunnel4.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
udp.c udp: fix the proto value passed to ip_protocol_deliver_rcu for the segments 2020-12-21 13:27:03 +01:00
udp_diag.c inet_diag: return classid for all socket types 2020-03-18 07:17:38 +01:00
udp_impl.h udp: add missing rehash callback to udplite 2019-01-17 15:01:08 -08:00
udp_offload.c net: udp: fix UDP header access on Fast/frag0 UDP GRO 2020-11-18 19:20:32 +01:00
udp_tunnel.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
udplite.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
xfrm4_input.c xfrm: reset transport header back to network header after all input transforms ahave been applied 2018-09-04 10:26:30 +02:00
xfrm4_output.c xfrm: Always set XFRM_TRANSFORMED in xfrm{4,6}_output_finish 2020-04-29 16:33:11 +02:00
xfrm4_policy.c net: add bool confirm_neigh parameter for dst_ops.update_pmtu 2020-01-04 19:18:58 +01:00
xfrm4_protocol.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
xfrm4_state.c xfrm: remove eth_proto value from xfrm_state_afinfo 2019-06-06 08:34:50 +02:00
xfrm4_tunnel.c xfrm: remove type and offload_type map from xfrm_state_afinfo 2019-06-06 08:34:50 +02:00