alistair23-linux/include/net
Eric Dumazet 6746960140 ipv6: RTAX_FEATURE_ALLFRAG causes inefficient TCP segment sizing
Quoting Tore Anderson from:
https://bugzilla.kernel.org/show_bug.cgi?id=42572

When RTAX_FEATURE_ALLFRAG is set on a route, the effective TCP segment
size does not take into account the size of the IPv6 Fragmentation
header that needs to be included in outbound packets, causing every
transmitted TCP segment to be fragmented across two IPv6 packets, the
latter of which will only contain 8 bytes of actual payload.

RTAX_FEATURE_ALLFRAG is typically set on a route in response to
receiving an ICMPv6 Packet Too Big message indicating a Path MTU of less
than 1280 bytes. 1280 bytes is the minimum IPv6 MTU; however, ICMPv6
PTBs with MTU < 1280 are still valid, in particular when an IPv6
packet is sent to an IPv4 destination through a stateless translator.
Any ICMPv4 Need To Fragment packets originated from the IPv4 part of
the path will be translated to ICMPv6 PTB which may then indicate an
MTU of less than 1280.

The Linux kernel refuses to reduce the effective MTU to anything below
1280 bytes; instead it sets it to exactly 1280 bytes and also sets
RTAX_FEATURE_ALLFRAG. However, the TCP segment size appears
to be set to 1240 bytes (1280 Path MTU - 40 bytes of IPv6 header),
instead of 1232 (additionally taking into account the 8 bytes required
by the IPv6 Fragmentation extension header).

This in turn results in rather inefficient transmission, as every
transmitted TCP segment now is split in two fragments containing
1232+8 bytes of payload.
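
The arithmetic above can be sketched in a few lines of C. This is only an
illustration of the numbers quoted in the report, not the kernel code touched
by the patch; the constant and function names (segment_budget, spill_bytes)
are hypothetical, and "segment size" is used in the same sense as in the
report (Path MTU minus IPv6-level headers).

```c
#include <assert.h>
#include <stdbool.h>

/* Sizes quoted in the report; the names are illustrative, not kernel API. */
enum {
    IPV6_HDR_BYTES      = 40, /* fixed IPv6 header */
    IPV6_FRAG_HDR_BYTES = 8   /* IPv6 Fragmentation extension header */
};

/*
 * Bytes available per packet once IPv6-level headers are subtracted.
 * When allfrag is true, every packet also carries a Fragmentation header.
 */
static unsigned int segment_budget(unsigned int path_mtu, bool allfrag)
{
    unsigned int overhead = IPV6_HDR_BYTES +
                            (allfrag ? IPV6_FRAG_HDR_BYTES : 0);

    assert(path_mtu > overhead);
    return path_mtu - overhead;
}

/*
 * Bytes that do not fit in the first fragment when a segment was sized
 * without accounting for the Fragmentation header.
 */
static unsigned int spill_bytes(unsigned int segment, unsigned int path_mtu)
{
    unsigned int per_fragment = segment_budget(path_mtu, true);

    return segment > per_fragment ? segment - per_fragment : 0;
}
```

With a Path MTU of 1280, segment_budget() gives 1240 without the
Fragmentation header and 1232 with it, and spill_bytes(1240, 1280) gives the
8 bytes that end up alone in the second fragment.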

After this patch, all outgoing packets that include a
Fragmentation header are "atomic" or "non-fragmented" fragments,
i.e., they have both Offset=0 and More Fragments=0.
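
For readers who want to check a capture for such "atomic" fragments, the test
can be sketched as below. It assumes the 16-bit offset/flags field of the
Fragmentation header has already been converted to host byte order; the
macro and function names are hypothetical, and the bit layout (13-bit
offset, 2 reserved bits, 1 More Fragments bit) follows the IPv6
specification.

```c
#include <stdbool.h>
#include <stdint.h>

/* Layout of the 16-bit offset/flags field, in host byte order. */
#define FRAG_OFFSET_MASK 0xFFF8u /* upper 13 bits: offset in 8-byte units */
#define FRAG_MORE_FLAG   0x0001u /* lowest bit: More Fragments (M) */

/*
 * A fragment is "atomic" when it is both the first fragment (Offset=0)
 * and the last one (More Fragments=0). The two reserved bits are ignored.
 */
static bool is_atomic_fragment(uint16_t frag_off_host)
{
    return (frag_off_host & (FRAG_OFFSET_MASK | FRAG_MORE_FLAG)) == 0;
}
```

A caller reading the field straight from a packet would pass it through
ntohs() first.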

With help from David S. Miller

Reported-by: Tore Anderson <tore@fud.no>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Tom Herbert <therbert@google.com>
Tested-by: Tore Anderson <tore@fud.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-27 00:03:34 -04:00
9p 9p: Reduce object size with CONFIG_NET_9P_DEBUG 2012-01-05 10:51:44 -06:00
bluetooth Bluetooth: Fix userspace compatibility issue with mgmt interface 2012-04-05 15:05:51 -03:00
caif caif-hsi: robust frame aggregation for HSI 2012-04-13 11:37:36 -04:00
irda Fix common misspellings 2011-03-31 11:26:23 -03:00
iucv af_iucv: add shutdown for HS transport 2012-03-07 22:52:24 -08:00
netfilter net: Convert nf_conntrack_proto to use register_net_sysctl 2012-04-20 21:22:30 -04:00
netns net ipv6: Don't use sysctl tables with .child entries. 2012-04-20 21:22:29 -04:00
nfc Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem 2012-04-18 14:27:48 -04:00
phonet net: dont hold rtnl mutex during netlink dump callbacks 2011-05-02 15:26:28 -07:00
sctp net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
tc_act net/sched: add ACT_CSUM action to update packets checksums 2010-08-20 01:42:59 -07:00
act_api.h net: sched: constify tcf_proto and tc_action 2011-07-06 02:52:16 -07:00
addrconf.h net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
af_ieee802154.h
af_rxrpc.h net: Remove __KERNEL__ cpp checks from include/net 2011-04-24 10:54:56 -07:00
af_unix.h net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
ah.h ipsec: update MAX_AH_AUTH_LEN to support sha512 2011-01-13 21:48:25 -08:00
arp.h ipv4: Eliminate spurious argument to __ipv4_neigh_lookup 2012-02-15 17:48:35 -05:00
atmclip.h atm: clip: Use device neigh support on top of "arp_tbl". 2011-11-30 18:51:03 -05:00
ax25.h net ax25: Fix the build when sysctl support is disabled. 2012-04-23 22:14:47 -04:00
ax88796.h
cfg80211-wext.h cfg80211: remove unused wext handler exports 2011-08-08 14:26:29 -04:00
cfg80211.h cfg80211: enforce lack of interface combinations 2012-04-16 14:16:58 -04:00
checksum.h
cipso_ipv4.h doc: Update the email address for Paul Moore in various source files 2011-08-01 17:58:33 -07:00
cls_cgroup.h Merge commit 'v2.6.36-rc7' into core/rcu 2010-10-07 09:43:45 +02:00
compat.h net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
datalink.h
dcbevent.h dcb: Add stub routines for !CONFIG_DCB 2011-10-06 15:49:51 -04:00
dcbnl.h net/dcb: Add an optional max rate attribute 2012-04-05 05:08:04 -04:00
dn.h net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
dn_dev.h decnet: RCU conversion and get rid of dev_base_lock 2010-11-08 13:50:08 -08:00
dn_fib.h net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
dn_neigh.h
dn_nsp.h net: use __packed annotation 2010-06-03 03:21:52 -07:00
dn_route.h net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
dsa.h dsa: Include linux/if_ether.h to fix build error 2011-12-01 11:41:06 -05:00
dsfield.h
dst.h ipv6: fix problem with expired dst cache 2012-04-13 12:58:29 -04:00
dst_ops.h net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
esp.h
ethoc.h
fib_rules.h fib_rules: __rcu annotates ctarget 2010-10-27 11:37:32 -07:00
flow.h ipv4: reset flowi parameters on route connect 2012-02-04 19:29:48 -05:00
flow_keys.h flow_dissector: use a 64bit load/store 2011-11-29 13:17:03 -05:00
garp.h garp: remove last synchronize_rcu() call 2011-05-12 17:46:56 -04:00
gen_stats.h Fix common misspellings 2011-03-31 11:26:23 -03:00
genetlink.h net: Deinline __nlmsg_put and genlmsg_put. -7k code on i386 defconfig. 2012-01-30 15:22:06 -05:00
gre.h PPTP: PPP over IPv4 (Point-to-Point Tunneling Protocol) 2010-08-21 23:05:39 -07:00
icmp.h net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
ieee80211_radiotap.h wireless: move ieee80211chan2mhz macro 2011-11-11 12:32:50 -05:00
ieee802154.h 6LoWPAN: add fragmentation support 2011-11-14 00:19:42 -05:00
ieee802154_netdev.h
if_inet6.h net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
inet6_connection_sock.h tcp: bind() use stronger condition for bind_conflict 2012-04-14 15:28:55 -04:00
inet6_hashtables.h net: use IS_ENABLED(CONFIG_IPV6) 2011-12-11 18:25:16 -05:00
inet_common.h inet, inet6: make tcp_sendmsg() and tcp_sendpage() through inet_sendmsg() and inet_sendpage() 2010-07-12 20:21:46 -07:00
inet_connection_sock.h ipv6: RTAX_FEATURE_ALLFRAG causes inefficient TCP segment sizing 2012-04-27 00:03:34 -04:00
inet_ecn.h inet: add rfc 3168 extract in front of INET_ECN_encapsulate() 2011-10-22 01:25:23 -04:00
inet_frag.h fragment: add fast path for in-order fragments 2010-06-30 13:44:29 -07:00
inet_hashtables.h atomic: use <linux/atomic.h> 2011-07-26 16:49:47 -07:00
inet_sock.h net: implement IP_RECVTOS for IP_PKTOPTIONS 2012-02-13 00:46:41 -05:00
inet_timewait_sock.h inet: remove rcu protection on tw_net 2011-12-14 13:34:55 -05:00
inetpeer.h route: Remove redirect_genid 2012-03-08 00:30:32 -08:00
ip.h net: Delete all remaining instances of ctl_path 2012-04-20 21:22:30 -04:00
ip6_checksum.h
ip6_fib.h ipv6: clean up rt6_clean_expires 2012-04-17 22:31:59 -04:00
ip6_route.h net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
ip6_tunnel.h tunnels: add _rcu annotations 2010-10-25 13:09:45 -07:00
ip_fib.h net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
ip_vs.h net: Delete all remaining instances of ctl_path 2012-04-20 21:22:30 -04:00
ipcomp.h
ipconfig.h
ipip.h tunnel: implement 64 bits statistics 2012-04-14 14:47:05 -04:00
ipv6.h net: Delete all remaining instances of ctl_path 2012-04-20 21:22:30 -04:00
ipx.h net: Remove __KERNEL__ cpp checks from include/net 2011-04-24 10:54:56 -07:00
iw_handler.h Fix common misspellings 2011-03-31 11:26:23 -03:00
lapb.h wan: make LAPB callbacks const 2011-09-16 19:20:20 -04:00
lib80211.h include: replace linux/module.h with "struct module" wherever possible 2011-10-31 19:32:32 -04:00
llc.h atomic: use <linux/atomic.h> 2011-07-26 16:49:47 -07:00
llc_c_ac.h
llc_c_ev.h net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
llc_c_st.h
llc_conn.h
llc_if.h
llc_pdu.h bonding,llc: Fix structure sizeof incompatibility for some PDUs 2011-05-13 15:13:24 -04:00
llc_s_ac.h
llc_s_ev.h
llc_s_st.h
llc_sap.h
mac80211.h mac80211: declare ieee80211_ave_rssi as EXPORT 2012-04-23 15:37:41 -04:00
mip6.h net: use __packed annotation 2010-06-03 03:21:52 -07:00
mld.h
ndisc.h Treat ND option 31 as userland (DNSSL support) 2012-04-12 15:56:57 -04:00
neighbour.h net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
net_namespace.h net sysctl: Add place holder functions for when sysctl support is compiled out of the kernel. 2012-04-23 19:24:28 -04:00
net_ratelimit.h net: Kill ratelimit.h dependency in linux/net.h 2011-05-27 13:41:33 -04:00
netdma.h
netevent.h net: Remove __KERNEL__ cpp checks from include/net 2011-04-24 10:54:56 -07:00
netlabel.h doc: Update the email address for Paul Moore in various source files 2011-08-01 17:58:33 -07:00
netlink.h netlink: Delete all NLA_PUT*() macros. 2012-04-02 04:33:45 -04:00
netprio_cgroup.h netprio_cgroup: fix wrong memory access when NETPRIO_CGROUP=m 2012-02-10 15:08:57 -05:00
netrom.h
nexthop.h
nl802154.h
p8022.h
ping.h net: ping: fix build failure 2011-05-17 14:16:58 -04:00
pkt_cls.h net: Fix range checks in tcf_valid_offset(). 2010-12-21 12:43:16 -08:00
pkt_sched.h net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
protocol.h net: use IS_ENABLED(CONFIG_IPV6) 2011-12-11 18:25:16 -05:00
psnap.h
raw.h include/net/raw.h: Convert raw_seq_private macro to inline 2010-09-08 13:42:22 -07:00
rawv6.h net: Remove __KERNEL__ cpp checks from include/net 2011-04-24 10:54:56 -07:00
red.h net_sched: red: Make minor corrections to comments 2012-04-16 23:53:11 -04:00
regulatory.h cfg80211: pass DFS region to drivers through reg_notifier() 2011-11-21 16:20:41 -05:00
request_sock.h tcp: Change possible SYN flooding messages 2011-09-15 14:49:43 -04:00
rose.h rose: Add length checks to CALL_REQUEST parsing 2011-03-27 17:59:04 -07:00
route.h net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
rtnetlink.h rtnetlink: ops->get_tx_queue() cannot take a const 'tb'. 2012-04-13 14:21:04 -04:00
sch_generic.h net: Make qdisc_skb_cb upper size bound explicit. 2012-02-09 13:50:34 -05:00
scm.h af_unix: dont send SCM_CREDENTIALS by default 2011-09-28 13:29:50 -04:00
secure_seq.h tcp: add const qualifiers where possible 2011-10-21 05:22:42 -04:00
slhc_vj.h
snmp.h Merge branch 'for-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu 2012-01-09 13:08:28 -08:00
sock.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2012-04-23 23:15:17 -04:00
stp.h
tcp.h ipv6: RTAX_FEATURE_ALLFRAG causes inefficient TCP segment sizing 2012-04-27 00:03:34 -04:00
tcp_memcontrol.h cgroup: remove cgroup_subsys argument from callbacks 2012-02-02 09:20:22 -08:00
tcp_states.h
timewait_sock.h BUG: headers with BUG/BUG_ON etc. need linux/bug.h 2012-03-04 17:54:34 -05:00
transp_v6.h net: relax PKTINFO non local ipv6 udp xmit check 2011-08-30 17:39:01 -04:00
udp.h net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
udplite.h net: ipv4: Standardize prefixes for message logging 2012-03-12 17:05:21 -07:00
wext.h
wimax.h net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
wpan-phy.h BUG: headers with BUG/BUG_ON etc. need linux/bug.h 2012-03-04 17:54:34 -05:00
x25.h net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
x25device.h
xfrm.h xfrm: Stop using NLA_PUT*(). 2012-04-02 04:33:45 -04:00