alistair23-linux/net/ipv4
Andrey Vagin dbde497966 tcp: don't update snd_nxt, when a socket is switched from repair mode
snd_nxt must be updated synchronously with sk_send_head.  Otherwise
tp->packets_out may be updated incorrectly, what may bring a kernel panic.

Here is a kernel panic from my host.
[  103.043194] BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
[  103.044025] IP: [<ffffffff815aaaaf>] tcp_rearm_rto+0xcf/0x150
...
[  146.301158] Call Trace:
[  146.301158]  [<ffffffff815ab7f0>] tcp_ack+0xcc0/0x12c0

Before this panic a tcp socket was restored. This socket had sent and
unsent data in the write queue. Sent data was restored in repair mode,
then the socket was switched from reapair mode and unsent data was
restored. After that the socket was switched back into repair mode.

In that moment we had a socket where write queue looks like this:
snd_una    snd_nxt   write_seq
   |_________|________|
             |
	  sk_send_head

After a second switching from repair mode the state of socket was
changed:

snd_una          snd_nxt, write_seq
   |_________ ________|
             |
	  sk_send_head

This state is inconsistent, because snd_nxt and sk_send_head are not
synchronized.

Bellow you can find a call trace, how packets_out can be incremented
twice for one skb, if snd_nxt and sk_send_head are not synchronized.
In this case packets_out will be always positive, even when
sk_write_queue is empty.

tcp_write_wakeup
	skb = tcp_send_head(sk);
	tcp_fragment
		if (!before(tp->snd_nxt, TCP_SKB_CB(buff)->end_seq))
			tcp_adjust_pcount(sk, skb, diff);
	tcp_event_new_data_sent
		tp->packets_out += tcp_skb_pcount(skb);

I think update of snd_nxt isn't required, when a socket is switched from
repair mode.  Because it's initialized in tcp_connect_init. Then when a
write queue is restored, snd_nxt is incremented in tcp_event_new_data_sent,
so it's always is in consistent state.

I have checked, that the bug is not reproduced with this patch and
all tests about restoring tcp connections work fine.

Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-19 16:14:20 -05:00
..
netfilter Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nftables 2013-11-04 19:48:57 -05:00
af_inet.c inet: fix a UFO regression 2013-11-08 02:07:59 -05:00
ah4.c ipv4: properly refresh rtable entries on pmtu/redirect events 2013-06-03 00:07:42 -07:00
arp.c net: neighbour: Remove CONFIG_ARPD 2013-09-03 21:41:43 -04:00
cipso_ipv4.c cipso: don't follow a NULL pointer when setsockopt() is called 2012-07-18 09:01:12 -07:00
datagram.c ipv4: fix possible seqlock deadlock 2013-11-14 17:31:14 -05:00
devinet.c net: igmp: Allow user-space configuration of igmp unsolicited report interval 2013-08-09 11:27:46 -07:00
esp4.c net: esp{4,6}: get rid of struct esp_data 2013-10-29 06:39:42 +01:00
fib_frontend.c fib_trie: remove duplicated rcu lock 2013-10-18 13:53:59 -04:00
fib_lookup.h net: ipv4/ipv6: Remove extern from function prototypes 2013-10-19 19:12:11 -04:00
fib_rules.c fib_rules: fix suppressor names and default values 2013-08-03 10:40:23 -07:00
fib_semantics.c fib: Use const struct nl_info * in rtmsg_fib 2013-10-18 14:42:15 -04:00
fib_trie.c fib_trie: only calc for the un-first node 2013-10-10 00:08:07 -04:00
gre_demux.c ipv4: generalize gre_handle_offloads 2013-10-19 19:36:18 -04:00
gre_offload.c ipip: add GSO/TSO support 2013-10-19 19:36:19 -04:00
icmp.c ipv4: processing ancillary IP_TOS or IP_TTL 2013-09-28 15:21:52 -07:00
igmp.c ipv4 igmp: use in_dev_put in timer handlers instead of __in_dev_put 2013-09-30 22:28:56 -07:00
inet_connection_sock.c inet: rename ir_loc_port to ir_num 2013-10-10 14:37:35 -04:00
inet_diag.c inet_diag: use sock_gen_put() 2013-10-17 15:02:02 -04:00
inet_fragment.c inet: remove old fragmentation hash initializing 2013-10-23 17:01:41 -04:00
inet_hashtables.c inet: convert inet_ehash_secret and ipv6_hash_secret to net_get_random_once 2013-10-19 19:45:35 -04:00
inet_lro.c ipv4: replace ip_fast_csum with csum_replace2 2013-03-15 09:12:25 -04:00
inet_timewait_sock.c tcp/dccp: remove twchain 2013-10-08 23:19:24 -04:00
inetpeer.c ip: generate unique IP identificator if local fragmentation is allowed 2013-09-19 14:11:15 -04:00
ip_forward.c ipv4: introduce rt_uses_gateway 2012-10-08 17:42:36 -04:00
ip_fragment.c ipv4: initialize ip4_frags hash secret as late as possible 2013-10-23 17:01:40 -04:00
ip_gre.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2013-08-16 15:37:26 -07:00
ip_input.c net: add SNMP counters tracking incoming ECN bits 2013-08-08 22:24:59 -07:00
ip_options.c net/ipv4: Ensure that location of timestamp option is stored 2013-03-12 05:35:39 -04:00
ip_output.c ipv4: introduce new IP_MTU_DISCOVER mode IP_PMTUDISC_INTERFACE 2013-11-05 21:52:27 -05:00
ip_sockglue.c ipv4: introduce new IP_MTU_DISCOVER mode IP_PMTUDISC_INTERFACE 2013-11-05 21:52:27 -05:00
ip_tunnel.c core/dev: do not ignore dmac in dev_forward_skb() 2013-11-14 02:39:53 -05:00
ip_tunnel_core.c ipv4: generalize gre_handle_offloads 2013-10-19 19:36:18 -04:00
ip_vti.c xfrm: Release dst if this dst is improper for vti tunnel 2013-11-19 15:50:57 -05:00
ipcomp.c ipv4: properly refresh rtable entries on pmtu/redirect events 2013-06-03 00:07:42 -07:00
ipconfig.c ipconfig: add informative timeout messages while waiting for carrier 2013-04-02 14:35:33 -04:00
ipip.c ipip: add GSO/TSO support 2013-10-19 19:36:19 -04:00
ipmr.c ip: generate unique IP identificator if local fragmentation is allowed 2013-09-19 14:11:15 -04:00
Kconfig net: neighbour: Remove CONFIG_ARPD 2013-09-03 21:41:43 -04:00
Makefile net: gre: move GSO functions to gre_offload 2013-07-03 14:37:39 -07:00
netfilter.c netfilter: add my copyright statements 2013-04-18 20:27:55 +02:00
ping.c ping: prevent NULL pointer dereference on write to msg_name 2013-11-18 16:02:03 -05:00
proc.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2013-08-16 15:37:26 -07:00
protocol.c ipv4: Disallow non-namespace aware protocols to register. 2013-02-05 14:42:23 -05:00
raw.c inet: prevent leakage of uninitialized memory to user in recv syscalls 2013-11-18 15:12:03 -05:00
route.c ipv4: introduce new IP_MTU_DISCOVER mode IP_PMTUDISC_INTERFACE 2013-11-05 21:52:27 -05:00
syncookies.c inet: split syncookie keys for ipv4 and ipv6 and initialize with net_get_random_once 2013-10-19 19:45:35 -04:00
sysctl_net_ipv4.c tcp: properly handle stretch acks in slow start 2013-11-04 19:57:59 -05:00
tcp.c tcp: tsq: restore minimal amount of queueing 2013-11-14 16:25:14 -05:00
tcp_bic.c tcp: properly handle stretch acks in slow start 2013-11-04 19:57:59 -05:00
tcp_cong.c tcp: properly handle stretch acks in slow start 2013-11-04 19:57:59 -05:00
tcp_cubic.c tcp: properly handle stretch acks in slow start 2013-11-04 19:57:59 -05:00
tcp_diag.c inet_diag: Rename inet_diag_req into inet_diag_req_v2 2012-01-11 12:56:06 -08:00
tcp_fastopen.c tcp: enable sockets to use MSG_FASTOPEN by default 2013-11-04 19:57:47 -05:00
tcp_highspeed.c tcp: properly handle stretch acks in slow start 2013-11-04 19:57:59 -05:00
tcp_htcp.c tcp: properly handle stretch acks in slow start 2013-11-04 19:57:59 -05:00
tcp_hybla.c tcp: properly handle stretch acks in slow start 2013-11-04 19:57:59 -05:00
tcp_illinois.c tcp: properly handle stretch acks in slow start 2013-11-04 19:57:59 -05:00
tcp_input.c tcp: properly handle stretch acks in slow start 2013-11-04 19:57:59 -05:00
tcp_ipv4.c ipv4: introduce new IP_MTU_DISCOVER mode IP_PMTUDISC_INTERFACE 2013-11-05 21:52:27 -05:00
tcp_lp.c tcp: properly handle stretch acks in slow start 2013-11-04 19:57:59 -05:00
tcp_memcontrol.c tcp_memcontrol: Kill struct tcp_memcontrol 2013-10-21 18:43:02 -04:00
tcp_metrics.c genetlink: make all genl_ops users const 2013-11-14 17:10:41 -05:00
tcp_minisocks.c ipv6: make lookups simpler and faster 2013-10-09 00:01:25 -04:00
tcp_offload.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2013-11-04 13:48:30 -05:00
tcp_output.c tcp: don't update snd_nxt, when a socket is switched from repair mode 2013-11-19 16:14:20 -05:00
tcp_probe.c ipv6: make lookups simpler and faster 2013-10-09 00:01:25 -04:00
tcp_scalable.c tcp: properly handle stretch acks in slow start 2013-11-04 19:57:59 -05:00
tcp_timer.c tcp: temporarily disable Fast Open on SYN timeout 2013-10-29 22:50:41 -04:00
tcp_vegas.c tcp: properly handle stretch acks in slow start 2013-11-04 19:57:59 -05:00
tcp_vegas.h net: ipv4/ipv6: Remove extern from function prototypes 2013-10-19 19:12:11 -04:00
tcp_veno.c tcp: properly handle stretch acks in slow start 2013-11-04 19:57:59 -05:00
tcp_westwood.c tcp: refactor F-RTO 2013-03-21 11:47:50 -04:00
tcp_yeah.c tcp: properly handle stretch acks in slow start 2013-11-04 19:57:59 -05:00
tunnel4.c net: Convert printks to pr_<level> 2012-03-11 23:42:51 -07:00
udp.c inet: prevent leakage of uninitialized memory to user in recv syscalls 2013-11-18 15:12:03 -05:00
udp_diag.c netlink: rename ssk to sk in struct netlink_skb_params 2013-04-19 14:57:56 -04:00
udp_impl.h net: ipv4/ipv6: Remove extern from function prototypes 2013-10-19 19:12:11 -04:00
udp_offload.c ipip: add GSO/TSO support 2013-10-19 19:36:19 -04:00
udplite.c net: ipv4: Standardize prefixes for message logging 2012-03-12 17:05:21 -07:00
xfrm4_input.c net: Add skb_unclone() helper function. 2013-02-15 15:10:37 -05:00
xfrm4_mode_beet.c ipsec: be careful of non existing mac headers 2012-02-23 16:50:45 -05:00
xfrm4_mode_transport.c
xfrm4_mode_tunnel.c Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next 2013-09-30 15:24:57 -04:00
xfrm4_output.c xfrm: revert ipv4 mtu determination to dst_mtu 2013-08-26 12:40:53 +02:00
xfrm4_policy.c xfrm: Fix null pointer dereference when decoding sessions 2013-11-01 07:08:46 +01:00
xfrm4_state.c xfrm: make local error reporting more robust 2013-08-14 13:07:12 +02:00
xfrm4_tunnel.c sit: add IPv4 over IPv4 support 2013-05-31 17:19:05 -07:00