alistair23-linux/net
Guenter Roeck 5e016cbf6c ipv4: Don't drop redirected route cache entry unless PTMU actually expired
TCP sessions over IPv4 can get stuck if routers between endpoints
do not fragment packets but implement PMTU instead, and we are using
those routers because of an ICMP redirect.

Setup is as follows

       MTU1    MTU2   MTU1
    A--------B------C------D

with MTU1 > MTU2. A and D are endpoints, B and C are routers. B and C
implement PMTU and drop packets larger than MTU2 (for example because
DF is set on all packets). TCP sessions are initiated between A and D.
There is packet loss between A and D, causing frequent TCP
retransmits.

After the number of retransmits on a TCP session reaches tcp_retries1,
tcp calls dst_negative_advice() prior to each retransmit. This results
in route cache entries for the peer to be deleted in
ipv4_negative_advice() if the Path MTU is set.

If the outstanding data on an affected TCP session is larger than
MTU2, packets sent from the endpoints will be dropped by B or C, and
ICMP NEEDFRAG will be returned. A and D receive NEEDFRAG messages and
update PMTU.

Before the next retransmit, tcp will again call dst_negative_advice(),
causing the route cache entry (with correct PMTU) to be deleted. The
retransmitted packet will be larger than MTU2, causing it to be
dropped again.

This sequence repeats until the TCP session aborts or is terminated.

Problem is fixed by removing redirected route cache entries in
ipv4_negative_advice() only if the PMTU is expired.

Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-21 20:55:13 -07:00
..
9p 9p: Change the name of new protocol from 9p2010.L to 9p2000.L 2010-03-13 08:57:29 -06:00
802
8021q net: Potential null skb->dev dereference 2010-03-18 21:16:45 -07:00
appletalk
atm
ax25
bluetooth Bluetooth: Fix kernel crash on L2CAP stress tests 2010-03-21 05:49:36 +01:00
bridge bridge: Make first arg to deliver_clone const. 2010-03-16 14:37:47 -07:00
can
core net: Potential null skb->dev dereference 2010-03-18 21:16:45 -07:00
dcb const: struct nla_policy 2010-02-18 14:30:18 -08:00
dccp net-2.6 [Bug-Fix][dccp]: fix oops caused after failed initialisation 2010-03-15 16:00:50 -07:00
decnet net: Add checking to rcu_dereference() primitives 2010-02-25 09:41:03 +01:00
dsa
econet
ethernet
ieee802154
ipv4 ipv4: Don't drop redirected route cache entry unless PTMU actually expired 2010-03-21 20:55:13 -07:00
ipv6 net: ipmr/ip6mr: fix potential out-of-bounds vif_table access 2010-03-19 22:47:22 -07:00
ipx
irda const: struct nla_policy 2010-02-18 14:30:18 -08:00
iucv
key xfrm: SP lookups signature with mark 2010-02-22 16:21:12 -08:00
lapb
llc net: backlog functions rename 2010-03-05 13:34:03 -08:00
mac80211 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 2010-03-13 14:50:18 -08:00
netfilter netfilter: ctnetlink: fix reliable event delivery if message building fails 2010-03-20 14:29:03 -07:00
netlabel net: remove INIT_RCU_HEAD() usage 2010-02-17 00:03:27 -08:00
netlink netlink: fix NETLINK_RECV_NO_ENOBUFS in netlink_set_err() 2010-03-20 14:29:03 -07:00
netrom
packet af_packet: move strict addr_len check right before dev_[mc/unicast]_[add/del] 2010-03-03 01:04:38 -08:00
phonet phonet: use for_each_set_bit() 2010-03-15 16:00:47 -07:00
rds
rfkill rfkill: Add support for KEY_RFKILL 2010-03-02 14:28:49 -05:00
rose
rxrpc
sched
sctp Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 2010-03-13 14:50:18 -08:00
sunrpc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 2010-03-13 14:50:18 -08:00
tipc tipc: fix lockdep warning on address assignment 2010-03-16 14:15:45 -07:00
unix AF_UNIX: update locking comment 2010-02-18 14:12:06 -08:00
wanrouter
wimax const: struct nla_policy 2010-02-18 14:30:18 -08:00
wireless Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 2010-02-25 23:26:21 -08:00
x25 net: backlog functions rename 2010-03-05 13:34:03 -08:00
xfrm ipsec: Fix bogus bundle flowi 2010-03-03 01:04:37 -08:00
compat.c
Kconfig
Makefile
nonet.c
socket.c
sysctl_net.c
TUNABLE