remarkable-linux/net
Nandita Dukkipati 6ba8a3b19e tcp: Tail loss probe (TLP)
This patch series implement the Tail loss probe (TLP) algorithm described
in http://tools.ietf.org/html/draft-dukkipati-tcpm-tcp-loss-probe-01. The
first patch implements the basic algorithm.

TLP's goal is to reduce tail latency of short transactions. It achieves
this by converting retransmission timeouts (RTOs) occuring due
to tail losses (losses at end of transactions) into fast recovery.
TLP transmits one packet in two round-trips when a connection is in
Open state and isn't receiving any ACKs. The transmitted packet, aka
loss probe, can be either new or a retransmission. When there is tail
loss, the ACK from a loss probe triggers FACK/early-retransmit based
fast recovery, thus avoiding a costly RTO. In the absence of loss,
there is no change in the connection state.

PTO stands for probe timeout. It is a timer event indicating
that an ACK is overdue and triggers a loss probe packet. The PTO value
is set to max(2*SRTT, 10ms) and is adjusted to account for delayed
ACK timer when there is only one oustanding packet.

TLP Algorithm

On transmission of new data in Open state:
  -> packets_out > 1: schedule PTO in max(2*SRTT, 10ms).
  -> packets_out == 1: schedule PTO in max(2*RTT, 1.5*RTT + 200ms)
  -> PTO = min(PTO, RTO)

Conditions for scheduling PTO:
  -> Connection is in Open state.
  -> Connection is either cwnd limited or no new data to send.
  -> Number of probes per tail loss episode is limited to one.
  -> Connection is SACK enabled.

When PTO fires:
  new_segment_exists:
    -> transmit new segment.
    -> packets_out++. cwnd remains same.

  no_new_packet:
    -> retransmit the last segment.
       Its ACK triggers FACK or early retransmit based recovery.

ACK path:
  -> rearm RTO at start of ACK processing.
  -> reschedule PTO if need be.

In addition, the patch includes a small variation to the Early Retransmit
(ER) algorithm, such that ER and TLP together can in principle recover any
N-degree of tail loss through fast recovery. TLP is controlled by the same
sysctl as ER, tcp_early_retrans sysctl.
tcp_early_retrans==0; disables TLP and ER.
		 ==1; enables RFC5827 ER.
		 ==2; delayed ER.
		 ==3; TLP and delayed ER. [DEFAULT]
		 ==4; TLP only.

The TLP patch series have been extensively tested on Google Web servers.
It is most effective for short Web trasactions, where it reduced RTOs by 15%
and improved HTTP response time (average by 6%, 99th percentile by 10%).
The transmitted probes account for <0.5% of the overall transmissions.

Signed-off-by: Nandita Dukkipati <nanditad@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-12 08:30:34 -04:00
..
9p Revert parts of "hlist: drop the node parameter from iterators" 2013-03-08 15:05:34 -08:00
802 mrp: make mrp_rcv static 2013-02-11 14:16:26 -05:00
8021q net: proc: change proc_net_remove to remove_proc_entry 2013-02-18 14:53:08 -05:00
appletalk hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
atm hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
ax25 hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
batman-adv batman-adv: verify tt len does not exceed packet len 2013-03-11 22:59:47 +01:00
bluetooth hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
bridge bridge: using for_each_set_bit to simplify the code 2013-03-12 08:04:09 -04:00
caif CAIF: fix indentation for function arguments 2013-03-07 16:24:45 -05:00
can net: can: af_can.c: Fix checkpatch warnings 2013-03-11 07:40:17 -04:00
ceph Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client 2013-02-28 17:43:09 -08:00
core Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2013-03-12 05:52:22 -04:00
dcb Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2013-03-12 05:52:22 -04:00
dccp Driver core patches for 3.9-rc1 2013-02-21 12:05:51 -08:00
decnet hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
dns_resolver Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security 2012-12-16 15:40:50 -08:00
dsa dsa: make dsa_switch_setup check for valid port names 2013-01-21 15:40:12 -05:00
ethernet net: split eth_mac_addr for better error handling 2013-01-21 14:07:44 -05:00
ieee802154 6lowpan: Fix endianness issue in is_addr_link_local(). 2013-03-10 16:49:35 -04:00
ipv4 tcp: Tail loss probe (TLP) 2013-03-12 08:30:34 -04:00
ipv6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2013-03-12 05:52:22 -04:00
ipx hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
irda net/irda: Raise dtr in non-blocking open 2013-03-06 02:47:05 -05:00
iucv hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
key afkey: fix a typo 2013-03-07 16:26:45 -05:00
l2tp Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2013-03-05 18:42:29 -08:00
lapb net/lapb: remove depends on CONFIG_EXPERIMENTAL 2013-01-11 11:40:01 -08:00
llc hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
mac80211 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem 2013-03-06 10:21:17 -05:00
mac802154 Driver core patches for 3.9-rc1 2013-02-21 12:05:51 -08:00
netfilter Merge branch 'master' of git://1984.lsi.us.es/nf 2013-03-07 15:20:02 -05:00
netlabel netlabel: fix build problems when CONFIG_IPV6=n 2013-03-08 11:33:51 -05:00
netlink hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
netrom hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
nfc hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
openvswitch hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
packet hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
phonet hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
rds net/rds: zero last byte for strncpy 2013-03-08 00:35:44 -05:00
rfkill rfkill: don't use [delayed_]work_pending() 2012-12-28 13:40:16 -08:00
rose hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
rxrpc Driver core patches for 3.9-rc1 2013-02-21 12:05:51 -08:00
sched Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2013-03-12 05:52:22 -04:00
sctp Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2013-03-05 18:42:29 -08:00
sunrpc fs: Limit sys_mount to only request filesystem modules. 2013-03-03 19:36:31 -08:00
tipc hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
unix hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
vmw_vsock VSOCK: Don't reject PF_VSOCK protocol 2013-02-18 15:02:51 -05:00
wimax
wireless Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem 2013-03-06 10:21:17 -05:00
x25 hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
xfrm hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
compat.c make get_file() return its argument 2012-09-26 21:10:25 -04:00
Kconfig Driver core patches for 3.9-rc1 2013-02-21 12:05:51 -08:00
Makefile VSOCK: Introduce VM Sockets 2013-02-10 19:41:08 -05:00
nonet.c
socket.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-02-26 20:16:07 -08:00
sysctl_net.c user_ns: get rid of duplicate code in net_ctl_permissions 2012-11-18 20:32:45 -05:00