1
0
Fork 0
remarkable-linux/net/ipv4
Eric Dumazet e114a710aa tcp: fix cwnd limited checking to improve congestion control
Yuchung discovered tcp_is_cwnd_limited() was returning false in
slow start phase even if the application filled the socket write queue.

All congestion modules take into account tcp_is_cwnd_limited()
before increasing cwnd, so this behavior limits slow start from
probing the bandwidth at full speed.

The problem is that even if write queue is full (aka we are _not_
application limited), cwnd can be under utilized if TSO should auto
defer or TCP Small queues decided to hold packets.

So the in_flight can be kept to smaller value, and we can get to the
point tcp_is_cwnd_limited() returns false.

With TCP Small Queues and FQ/pacing, this issue is more visible.

We fix this by having tcp_cwnd_validate(), which is supposed to track
such things, take into account unsent_segs, the number of segs that we
are not sending at the moment due to TSO or TSQ, but intend to send
real soon. Then when we are cwnd-limited, remember this fact while we
are processing the window of ACKs that comes back.

For example, suppose we have a brand new connection with cwnd=10; we
are in slow start, and we send a flight of 9 packets. By the time we
have received ACKs for all 9 packets we want our cwnd to be 18.
We implement this by setting tp->lsnd_pending to 9, and
considering ourselves to be cwnd-limited while cwnd is less than
twice tp->lsnd_pending (2*9 -> 18).

This makes tcp_is_cwnd_limited() more understandable, by removing
the GSO/TSO kludge, that tried to work around the issue.

Note the in_flight parameter can be removed in a followup cleanup
patch.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-02 17:54:35 -04:00
..
netfilter ipv4, fib: pass LOOPBACK_IFINDEX instead of 0 to flowi4_iif 2014-04-16 15:05:11 -04:00
Kconfig net: neighbour: Remove CONFIG_ARPD 2013-09-03 21:41:43 -04:00
Makefile xfrm4: Add IPsec protocol multiplexer 2014-02-25 07:04:16 +01:00
af_inet.c net: Replace u64_stats_fetch_begin_bh to u64_stats_fetch_begin_irq 2014-03-14 22:41:36 -04:00
ah4.c ah4: Use the IPsec protocol multiplexer API 2014-02-25 07:04:17 +01:00
arp.c ipv4: arp: update neighbour address when a gratuitous arp is received and arp_accept is set 2014-01-02 00:08:38 -05:00
cipso_ipv4.c ipv4: ERROR: code indent should use tabs where possible 2013-12-26 13:43:21 -05:00
datagram.c net: Remove FLOWI_FLAG_CAN_SLEEP 2013-12-06 07:24:39 +01:00
devinet.c ipv4: Fix runtime WARNING in rtmsg_ifa() 2014-02-06 20:02:15 -08:00
esp4.c esp4: Use the IPsec protocol multiplexer API 2014-02-25 07:04:17 +01:00
fib_frontend.c ipv4, fib: pass LOOPBACK_IFINDEX instead of 0 to flowi4_iif 2014-04-16 15:05:11 -04:00
fib_lookup.h ipv4: make fib_detect_death static 2013-12-28 17:01:46 -05:00
fib_rules.c inet: fix NULL pointer Oops in fib(6)_rule_suppress 2013-12-10 17:54:23 -05:00
fib_semantics.c ipv4, fib: pass LOOPBACK_IFINDEX instead of 0 to flowi4_iif 2014-04-16 15:05:11 -04:00
fib_trie.c seq_file: remove "%n" usage from seq_file users 2013-11-15 09:32:20 +09:00
gre_demux.c ip_tunnel: Fix dst ref-count. 2014-03-26 15:18:40 -04:00
gre_offload.c net/ipv4: don't use module_init in non-modular gre_offload 2014-01-16 16:08:27 -08:00
icmp.c ipv4: introduce hardened ip_no_pmtu_disc mode 2014-01-13 11:22:55 -08:00
igmp.c net: replace macros net_random and net_srandom with direct calls to prandom 2014-01-14 15:15:25 -08:00
inet_connection_sock.c net: replace macros net_random and net_srandom with direct calls to prandom 2014-01-14 15:15:25 -08:00
inet_diag.c inet_diag: fix inet_diag_dump_icsk() to use correct state for timewait sockets 2014-01-13 22:35:46 -08:00
inet_fragment.c inet: frag: make sure forced eviction removes all frags 2014-03-06 15:28:45 -05:00
inet_hashtables.c inet: convert inet_ehash_secret and ipv6_hash_secret to net_get_random_once 2013-10-19 19:45:35 -04:00
inet_lro.c lro: remove dead code 2013-12-29 16:34:25 -05:00
inet_timewait_sock.c tcp/dccp: remove twchain 2013-10-08 23:19:24 -04:00
inetpeer.c inetpeer_gc_worker: trivial cleanup 2014-04-26 12:52:28 -04:00
ip_forward.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-02-19 01:24:22 -05:00
ip_fragment.c net: Add utility functions to clear rxhash 2013-12-17 16:36:21 -05:00
ip_gre.c gre: add x-netns support 2014-04-23 14:53:36 -04:00
ip_input.c net: Fix memory leak if TPROXY used with TCP early demux 2014-01-27 16:22:11 -08:00
ip_options.c ipv4: Use predefined value for readability 2014-04-28 13:28:43 -04:00
ip_output.c ipv4: add a sock pointer to dst->output() path. 2014-04-15 13:47:15 -04:00
ip_sockglue.c ipv4: yet another new IP_MTU_DISCOVER option IP_PMTUDISC_OMIT 2014-02-26 15:51:00 -05:00
ip_tunnel.c ip_tunnel: use the right netns in ioctl handler 2014-04-16 15:16:02 -04:00
ip_tunnel_core.c ipv4: add a sock pointer to dst->output() path. 2014-04-15 13:47:15 -04:00
ip_vti.c vti: don't allow to add the same tunnel twice 2014-04-12 17:03:11 -04:00
ipcomp.c ipcomp4: Use the IPsec protocol multiplexer API 2014-02-25 07:04:17 +01:00
ipconfig.c ipv4: ipconfig.c: add parentheses in an if statement 2014-02-14 00:14:23 -05:00
ipip.c ipv4: be friend with drop monitor 2014-01-18 23:08:02 -08:00
ipmr.c ipv4, fib: pass LOOPBACK_IFINDEX instead of 0 to flowi4_iif 2014-04-16 15:05:11 -04:00
netfilter.c netfilter: remove double colon 2014-02-19 11:41:25 +01:00
ping.c net: ipv4: current group_info should be put after using. 2014-04-13 22:52:58 -04:00
proc.c tcp: snmp stats for Fast Open, SYN rtx, and data pkts 2014-03-03 15:58:03 -05:00
protocol.c net: remove outdated comment for ipv4 and ipv6 protocol handler 2013-11-28 18:47:51 -05:00
raw.c ipv6: honor IPV6_PKTINFO with v4 mapped addresses on sendmsg 2014-02-19 16:28:42 -05:00
route.c ipv4, route: pass 0 instead of LOOPBACK_IFINDEX to fib_validate_source() 2014-04-16 15:05:12 -04:00
syncookies.c ipv4: fix checkpatch error "space prohibited" 2013-12-26 13:43:21 -05:00
sysctl_net_ipv4.c ipv4: introduce ip_dst_mtu_maybe_forward and protect forwarding path against pmtu spoofing 2014-01-13 11:22:54 -08:00
tcp.c tcp: Add a TCP_FASTOPEN socket option to get a max backlog on its listner 2014-04-20 18:18:54 -04:00
tcp_bic.c tcp: properly handle stretch acks in slow start 2013-11-04 19:57:59 -05:00
tcp_cong.c tcp: fix cwnd limited checking to improve congestion control 2014-05-02 17:54:35 -04:00
tcp_cubic.c tcp: switch rtt estimations to usec resolution 2014-02-26 17:08:40 -05:00
tcp_diag.c inet_diag: Rename inet_diag_req into inet_diag_req_v2 2012-01-11 12:56:06 -08:00
tcp_fastopen.c tcp: enable sockets to use MSG_FASTOPEN by default 2013-11-04 19:57:47 -05:00
tcp_highspeed.c tcp: remove unused min_cwnd member of tcp_congestion_ops 2014-02-13 18:22:34 -05:00
tcp_htcp.c tcp: properly handle stretch acks in slow start 2013-11-04 19:57:59 -05:00
tcp_hybla.c tcp: switch rtt estimations to usec resolution 2014-02-26 17:08:40 -05:00
tcp_illinois.c tcp: switch rtt estimations to usec resolution 2014-02-26 17:08:40 -05:00
tcp_input.c tcp: make tcp_cwnd_application_limited() static 2014-04-20 18:18:56 -04:00
tcp_ipv4.c net: Fix use after free by removing length arg from sk_data_ready callbacks. 2014-04-11 16:15:36 -04:00
tcp_lp.c tcp: switch rtt estimations to usec resolution 2014-02-26 17:08:40 -05:00
tcp_memcontrol.c cgroup: drop const from @buffer of cftype->write_string() 2014-03-19 10:23:54 -04:00
tcp_metrics.c tcp: switch rtt estimations to usec resolution 2014-02-26 17:08:40 -05:00
tcp_minisocks.c net: Fix use after free by removing length arg from sk_data_ready callbacks. 2014-04-11 16:15:36 -04:00
tcp_offload.c tcp: do not export tcp_gso_segment() and tcp_gro_receive() 2014-01-14 18:53:48 -08:00
tcp_output.c tcp: fix cwnd limited checking to improve congestion control 2014-05-02 17:54:35 -04:00
tcp_probe.c tcp: switch rtt estimations to usec resolution 2014-02-26 17:08:40 -05:00
tcp_scalable.c tcp: remove unused min_cwnd member of tcp_congestion_ops 2014-02-13 18:22:34 -05:00
tcp_timer.c tcp: snmp stats for Fast Open, SYN rtx, and data pkts 2014-03-03 15:58:03 -05:00
tcp_vegas.c tcp: switch rtt estimations to usec resolution 2014-02-26 17:08:40 -05:00
tcp_vegas.h net: ipv4/ipv6: Remove extern from function prototypes 2013-10-19 19:12:11 -04:00
tcp_veno.c tcp: switch rtt estimations to usec resolution 2014-02-26 17:08:40 -05:00
tcp_westwood.c tcp: remove unused min_cwnd member of tcp_congestion_ops 2014-02-13 18:22:34 -05:00
tcp_yeah.c tcp: switch rtt estimations to usec resolution 2014-02-26 17:08:40 -05:00
tunnel4.c net: Convert printks to pr_<level> 2012-03-11 23:42:51 -07:00
udp.c ipv6: honor IPV6_PKTINFO with v4 mapped addresses on sendmsg 2014-02-19 16:28:42 -05:00
udp_diag.c netlink: rename ssk to sk in struct netlink_skb_params 2013-04-19 14:57:56 -04:00
udp_impl.h net: ipv4/ipv6: Remove extern from function prototypes 2013-10-19 19:12:11 -04:00
udp_offload.c net/ipv4: Use proper RCU APIs for writer-side in udp_offload.c 2014-02-04 20:01:55 -08:00
udplite.c net: ipv4: Standardize prefixes for message logging 2012-03-12 17:05:21 -07:00
xfrm4_input.c xfrm4: Add IPsec protocol multiplexer 2014-02-25 07:04:16 +01:00
xfrm4_mode_beet.c ipv4: ERROR: code indent should use tabs where possible 2013-12-26 13:43:21 -05:00
xfrm4_mode_transport.c
xfrm4_mode_tunnel.c xfrm4: Remove xfrm_tunnel_notifier 2014-02-25 07:04:18 +01:00
xfrm4_output.c ipv4: add a sock pointer to dst->output() path. 2014-04-15 13:47:15 -04:00
xfrm4_policy.c xfrm: Introduce xfrm_input_afinfo to access the the callbacks properly 2014-03-14 07:28:07 +01:00
xfrm4_protocol.c xfrm: Introduce xfrm_input_afinfo to access the the callbacks properly 2014-03-14 07:28:07 +01:00
xfrm4_state.c inet: make no_pmtu_disc per namespace and kill ipv4_config 2013-12-18 16:58:20 -05:00
xfrm4_tunnel.c sit: add IPv4 over IPv4 support 2013-05-31 17:19:05 -07:00