alistair23-linux/net/ipv4
Yuchung Cheng ddf1af6fa0 tcp: new delivery accounting
This patch changes the accounting of how many packets are
newly acked or sacked when the sender receives an ACK.

The current approach basically computes

   newly_acked_sacked = (prior_packets - prior_sacked) -
                        (tp->packets_out - tp->sacked_out)

   where prior_packets and prior_sacked out are snapshot
   at the beginning of the ACK processing.

The new approach tracks the delivery information via a new
TCP state variable "delivered" which monotically increases
as new packets are delivered in order or out-of-order.

The reason for this change is that the current approach is
brittle that produces negative or inaccurate estimate.

   1) For non-SACK connections, an ACK that advances the SND.UNA
   could reset the DUPACK counters (tp->sacked_out) in
   tcp_process_loss() or tcp_fastretrans_alert(). This inflates
   the inflight suddenly and causes under-estimate or even
   negative estimate. Here is a real example:

                   before   after (processing ACK)
   packets_out     75       73
   sacked_out      23        0
   ca state        Loss     Open

   The old approach computes (75-23) - (73 - 0) = -21 delivered
   while the new approach computes 1 delivered since it
   considers the 2nd-24th packets are delivered OOO.

   2) MSS change would re-count packets_out and sacked_out so
   the estimate is in-accurate and can even become negative.
   E.g., the inflight is doubled when MSS is halved.

   3) Spurious retransmission signaled by DSACK is not accounted

The new approach is simpler and more robust. For SACK connections,
tp->delivered increments as packets are being acked or sacked in
SACK and ACK processing.

For non-sack connections, it's done in tcp_remove_reno_sacks() and
tcp_add_reno_sack(). When an ACK advances the SND.UNA, tp->delivered
is incremented by the number of packets ACKed (less the current
number of DUPACKs received plus one packet hole).  Upon receiving
a DUPACK, tp->delivered is incremented assuming one out-of-order
packet is delivered.

Upon receiving a DSACK, tp->delivered is incremtened assuming one
retransmission is delivered in tcp_sacktag_write_queue().

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-07 14:09:51 -05:00
..
netfilter inet: frag: Always orphan skbs inside ip_defrag() 2016-01-28 16:00:46 -08:00
af_inet.c net: add validation for the socket syscall protocol argument 2015-12-14 16:09:30 -05:00
ah4.c
arp.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-10-20 06:08:27 -07:00
cipso_ipv4.c
datagram.c
devinet.c netlink: Rightsize IFLA_AF_SPEC size calculation 2015-10-21 19:15:20 -07:00
esp4.c
fib_frontend.c net: Flush local routes when device changes vrf association 2015-12-13 23:58:44 -05:00
fib_lookup.h
fib_rules.c
fib_semantics.c net: Fix prefsrc lookups 2015-11-04 21:34:37 -05:00
fib_trie.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-02-01 15:56:08 -08:00
fou.c udp: restrict offloads to one namespace 2016-01-10 17:28:24 -05:00
gre_demux.c
gre_offload.c ipv6: gre: support SIT encapsulation 2015-10-26 22:01:18 -07:00
icmp.c
igmp.c ipv4: igmp: Allow removing groups from a removed interface 2015-12-03 12:07:05 -05:00
inet_connection_sock.c tcp: ensure proper barriers in lockless contexts 2015-11-15 18:36:38 -05:00
inet_diag.c net: diag: support v4mapped sockets in inet_diag_find_one_icsk() 2016-01-20 18:51:31 -08:00
inet_fragment.c inet: kill unused skb_free op 2016-01-05 22:25:57 -05:00
inet_hashtables.c tcp/dccp: fix hashdance race for passive sessions 2015-10-23 05:42:21 -07:00
inet_lro.c
inet_timewait_sock.c
inetpeer.c
ip_forward.c
ip_fragment.c inet: frag: Always orphan skbs inside ip_defrag() 2016-01-28 16:00:46 -08:00
ip_gre.c ip_tunnel: Move stats update to iptunnel_xmit() 2015-12-25 23:32:23 -05:00
ip_input.c ipv4: early demux should be aware of fragments 2016-01-29 15:14:20 -05:00
ip_options.c
ip_output.c net: preserve IP control block during GSO segmentation 2016-01-15 14:35:24 -05:00
ip_sockglue.c ipv4: fix a potential deadlock in mcast getsockopt() path 2015-11-04 21:29:59 -05:00
ip_tunnel.c ip_tunnel: Move stats update to iptunnel_xmit() 2015-12-25 23:32:23 -05:00
ip_tunnel_core.c ipv4: fix endianness warnings in ip_tunnel_core.c 2016-01-08 21:30:43 -05:00
ip_vti.c ip_tunnel: Move stats update to iptunnel_xmit() 2015-12-25 23:32:23 -05:00
ipcomp.c
ipconfig.c ipv4: ipconfig: avoid unused ic_proto_used symbol 2016-01-29 19:39:09 -08:00
ipip.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-12-31 18:20:10 -05:00
ipmr.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-12-03 21:09:12 -05:00
Kconfig ipv4+ipv6: Make INET*_ESP select CRYPTO_ECHAINIV 2016-01-25 10:45:41 -08:00
Makefile net: drop tcp_memcontrol.c 2016-01-20 17:09:18 -08:00
netfilter.c
ping.c ipv4: eliminate lock count warnings in ping.c 2016-01-08 21:30:43 -05:00
proc.c
protocol.c
raw.c net: Propagate lookup failure in l3mdev_get_saddr to caller 2016-01-04 22:58:30 -05:00
route.c
syncookies.c net: Allow accepted sockets to be bound to l3mdev domain 2015-12-18 14:43:38 -05:00
sysctl_net_ipv4.c net: drop tcp_memcontrol.c 2016-01-20 17:09:18 -08:00
tcp.c tcp: do not enqueue skb with SYN flag 2016-02-06 03:11:59 -05:00
tcp_bic.c
tcp_cdg.c
tcp_cong.c
tcp_cubic.c
tcp_dctcp.c tcp: allow dctcp alpha to drop to zero 2015-10-23 02:46:52 -07:00
tcp_diag.c net: diag: Support destroying TCP sockets. 2015-12-15 23:26:52 -05:00
tcp_fastopen.c tcp: fastopen: call tcp_fin() if FIN present in SYNACK 2016-02-06 16:49:58 -05:00
tcp_highspeed.c
tcp_htcp.c
tcp_hybla.c
tcp_illinois.c
tcp_input.c tcp: new delivery accounting 2016-02-07 14:09:51 -05:00
tcp_ipv4.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-02-01 15:56:08 -08:00
tcp_lp.c
tcp_metrics.c
tcp_minisocks.c tcp: honour SO_BINDTODEVICE for TW_RST case too 2015-12-22 17:03:05 -05:00
tcp_offload.c
tcp_output.c net: tcp_memcontrol: simplify linkage between socket and page counter 2016-01-14 16:00:49 -08:00
tcp_probe.c
tcp_recovery.c tcp: use RACK to detect losses 2015-10-21 07:00:53 -07:00
tcp_scalable.c
tcp_timer.c ipv4: Namespecify the tcp_keepalive_intvl sysctl knob 2016-01-10 17:32:09 -05:00
tcp_vegas.c
tcp_vegas.h
tcp_veno.c
tcp_westwood.c
tcp_yeah.c tcp_yeah: don't set ssthresh below 2 2016-01-11 17:25:16 -05:00
tunnel4.c
udp.c udp: fix potential infinite loop in SO_REUSEPORT logic 2016-01-19 13:52:25 -05:00
udp_diag.c soreuseport: setsockopt SO_ATTACH_REUSEPORT_[CE]BPF 2016-01-04 22:49:59 -05:00
udp_impl.h
udp_offload.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-01-11 23:55:43 -05:00
udp_tunnel.c ip_tunnel: Move stats update to iptunnel_xmit() 2015-12-25 23:32:23 -05:00
udplite.c
xfrm4_input.c
xfrm4_mode_beet.c
xfrm4_mode_transport.c
xfrm4_mode_tunnel.c
xfrm4_output.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-10-24 06:54:12 -07:00
xfrm4_policy.c Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec 2015-12-22 16:26:31 -05:00
xfrm4_protocol.c
xfrm4_state.c
xfrm4_tunnel.c