redonkable/alistair23-linux

Author	SHA1	Message	Date
David S. Miller	b7d6a43211	Merge branch 'net-next-2.6_20100423a/br/br_multicast_v3' of git://git.linux-ipv6.org/gitroot/yoshfuji/linux-2.6-next	2010-04-23 23:37:24 -07:00
Brian Haley	4b340ae20d	IPv6: Complete IPV6_DONTFRAG support Finally add support to detect a local IPV6_DONTFRAG event and return the relevant data to the user if they've enabled IPV6_RECVPATHMTU on the socket. The next recvmsg() will return no data, but have an IPV6_PATHMTU as ancillary data. Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-23 23:35:29 -07:00
Brian Haley	13b52cd446	IPv6: Add dontfrag argument to relevant functions Add dontfrag argument to relevant functions for IPV6_DONTFRAG support, as well as allowing the value to be passed-in via ancillary cmsg data. Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-23 23:35:28 -07:00
Brian Haley	793b147316	IPv6: data structure changes for new socket options Add underlying data structure changes and basic setsockopt() and getsockopt() support for IPV6_RECVPATHMTU, IPV6_PATHMTU, and IPV6_DONTFRAG. IPV6_PATHMTU is actually fully functional at this point. Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-23 23:35:28 -07:00
Jiri Pirko	3a73702863	l2tp_eth: fix memory allocation Since .size is set properly in "struct pernet_operations l2tp_eth_net_ops", allocating space for "struct l2tp_eth_net" by hand is not correct, even causes memory leakage. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-23 16:37:33 -07:00
Jiri Pirko	e773aaff82	l2tp: fix memory allocation Since .size is set properly in "struct pernet_operations l2tp_net_ops", allocating space for "struct l2tp_net" by hand is not correct, even causes memory leakage. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-23 16:37:32 -07:00
John W. Linville	3b51cc996e	Merge branch 'master' into for-davem Conflicts: drivers/net/wireless/ath/ath9k/phy.c drivers/net/wireless/iwlwifi/iwl-6000.c drivers/net/wireless/iwlwifi/iwl-debugfs.c	2010-04-23 14:43:45 -04:00
YOSHIFUJI Hideaki	08b202b672	bridge br_multicast: IPv6 MLD support. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2010-04-23 13:35:56 +09:00
YOSHIFUJI Hideaki	8ef2a9a598	bridge br_multicast: Make functions less ipv4 dependent. Introduce struct br_ip{} to store ip address and protocol and make functions more generic so that we can support both IPv4 and IPv6 with less pain. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2010-04-23 13:35:55 +09:00
YOSHIFUJI Hideaki	6e7cb83707	ipv6 mcast: Introduce include/net/mld.h for MLD definitions. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2010-04-23 13:35:55 +09:00
Eric Dumazet	fda48a0d7a	tcp: bind() fix when many ports are bound Port autoselection done by kernel only works when number of bound sockets is under a threshold (typically 30000). When this threshold is over, we must check if there is a conflict before exiting first loop in inet_csk_get_port() Change inet_csk_bind_conflict() to forbid two reuse-enabled sockets to bind on same (address,port) tuple (with a non ANY address) Same change for inet6_csk_bind_conflict() Reported-by: Gaspar Chilingarov <gasparch@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Evgeniy Polyakov <zbr@ioremap.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-22 19:06:06 -07:00
Andrew Hendry	5ebfbc06aa	X25: Add if_x25.h and x25 to device identifiers V2 Feedback from John Hughes. - Add header for userspace implementations such as xot/xoe to use - Use explicit values for interface stability - No changes to driver patches V1 - Use identifiers instead of magic numbers for X25 layer 3 to device interface. - Also fixed checkpatch notes on updated code. [ Add new user header to include/linux/Kbuild -DaveM ] Signed-off-by: Andrew Hendry <andrew.hendry@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-22 16:12:36 -07:00
Paul LeoNerd Evans	40eaf96271	net: Socket filter ancillary data access for skb->dev->type Add an SKF_AD_HATYPE field to the packet ancillary data area, giving access to skb->dev->type, as reported in the sll_hatype field. When capturing packets on a PF_PACKET/SOCK_RAW socket bound to all interfaces, there doesn't appear to be a way for the filter program to actually find out the underlying hardware type the packet was captured on. This patch adds that ability. This patch also handles the case where skb->dev can be NULL, such as on netlink sockets. Signed-off-by: Paul Evans <leonerd@leonerd.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-22 16:05:44 -07:00
Tom Herbert	aa2ea0586d	tcp: fix outsegs stat for TSO segments Account for TSO segments of an skb in TCP_MIB_OUTSEGS counter. Without doing this, the counter can be off by orders of magnitude from the actual number of segments sent. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-22 16:00:00 -07:00
Dan Carpenter	24acc68956	rdma: potential ERR_PTR dereference In the original code, the "goto out" calls "rdma_destroy_id(cm_id);" That isn't needed here and would cause problems because "cm_id" is an ERR_PTR. The new code just returns directly. Signed-off-by: Dan Carpenter <error27@gmail.com> Acked-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-22 15:57:26 -07:00
Dan Carpenter	80032cffb9	rtnetlink: potential ERR_PTR dereference In the original code, if rtnl_create_link() returned an ERR_PTR then that would get passed to rtnl_configure_link() which dereferences it. Signed-off-by: Dan Carpenter <error27@gmail.com> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-22 15:57:26 -07:00
Stephen Hemminger	e802af9cab	IPv6: Generic TTL Security Mechanism (final version) This patch adds IPv6 support for RFC5082 Generalized TTL Security Mechanism. Note to users of mapped addresses: the IPV6 and IPV4 socket options are separate. The server does have to deal with both IPv4 and IPv6 socket options and the client has to handle the difference for each family. On client: int ttl = 255; getaddrinfo(argv[1], argv[2], &hint, &result); for (rp = result; rp != NULL; rp = rp->ai_next) { s = socket(rp->ai_family, rp->ai_socktype, rp->ai_protocol); if (s < 0) continue; if (rp->ai_family == AF_INET) { setsockopt(s, IPPROTO_IP, IP_TTL, &ttl, sizeof(ttl)); } else if (rp->ai_family == AF_INET6) { setsockopt(s, IPPROTO_IPV6, IPV6_UNICAST_HOPS, &ttl, sizeof(ttl)); } if (connect(s, rp->ai_addr, rp->ai_addrlen) == 0) { ... On server: int minttl = 255 - maxhops; getaddrinfo(NULL, port, &hints, &result); for (rp = result; rp != NULL; rp = rp->ai_next) { s = socket(rp->ai_family, rp->ai_socktype, rp->ai_protocol); if (s < 0) continue; if (rp->ai_family == AF_INET6) setsockopt(s, IPPROTO_IPV6, IPV6_MINHOPCOUNT, &minttl, sizeof(minttl)); setsockopt(s, IPPROTO_IP, IP_MINTTL, &minttl, sizeof(minttl)); if (bind(s, rp->ai_addr, rp->ai_addrlen) == 0) break ... Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-22 15:24:53 -07:00
David S. Miller	9ccb897594	net: Orphan and de-dst skbs earlier in xmit path. This way GSO packets don't get handled differently. With help from Eric Dumazet. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>	2010-04-22 01:02:07 -07:00
Eric Dumazet	e326bed2f4	rps: immediate send IPI in process_backlog() If some skb are queued to our backlog, we are delaying IPI sending at the end of net_rx_action(), increasing latencies. This defeats the queueing, since we want to quickly dispatch packets to the pool of worker cpus, then eventually deeply process our packets. It's better to send IPI before processing our packets in upper layers, from process_backlog(). Change the _and_disable_irq suffix to _and_enable_irq(), since we enable local irq in net_rps_action(), sorry for the confusion. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-22 00:22:45 -07:00
Jiri Olsa	f4f914b580	net: ipv6 bind to device issue The issue arises when 2 NICs are both assigned the same IPv6 global address. If a sender binds to a particular NIC (SO_BINDTODEVICE), the outgoing traffic is sent via the first one found. The bound device is thus not taken into account during routing. From the ip6_route_output function: If the binding address is multicast, linklocal or loopback, the RT6_LOOKUP_F_IFACE bit is set, but not for a global address. So binding a global address will neglect the SO_BINDTODEVICE-bound device, because the fib6_rule_lookup function path won't check the flowi::oif field and takes the first route that fits. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Scott Otto <scott.otto@alcatel-lucent.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-21 22:59:24 -07:00
Johannes Berg	b002a86109	ethernet: print protocol in host byte order Eric's recent patch added __force, but this place would seem to require actually doing a byte order conversion so the printk is consistent across architectures. Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-21 22:57:19 -07:00
Eric Dumazet	9a20e3197e	net: Introduce skb_orphan_try() At this point, skb->destructor is not the original one (stored in DEV_GSO_CB(skb)->destructor) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-21 22:54:08 -07:00
Shan Wei	f2228f785a	ipv6: allow to send packet after receiving ICMPv6 Too Big message with MTU field less than IPV6_MIN_MTU According to RFC2460, after a node receives a Too Big message reporting a PMTU less than the IPv6 Minimum Link MTU, the PMTU is set to the IPv6 Minimum Link MTU (1280) and a fragment header should always be included. After receiving an ICMPv6 Too Big message reporting a PMTU less than the IPv6 Minimum Link MTU, sctp can't send any data/control chunk whose total length, including the IPv6 header and IPv6 extension headers, is less than IPV6_MIN_MTU (1280 bytes). The failure occurred in ip6_fragment(); the reason is as follows (taking a SHUTDOWN chunk as an example): sctp_packet_transmit (SHUTDOWN chunk, len=16 byte) |------sctp_v6_xmit (local_df=0) |------ip6_xmit |------ip6_output (dst_allfrag is true) |------ip6_fragment With local_df=0, ip6_fragment() drops the packet and returns EMSGSIZE. The patch fixes this by adding a check of skb->len. In this case, IPv6 does not fragment the upper-protocol data; it only adds a fragment header before it. Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-21 22:48:26 -07:00
andrew hendry	2cec6b014d	X25 fix dead unaccepted sockets 1, An X25 program binds and listens 2, calls arrive waiting to be accepted 3, Program exits without accepting 4, Sockets time out but don't get correctly cleaned up 5, cat /proc/net/x25/socket shows the dead sockets with bad inode fields. This line borrowed from AX25 sets the dying socket so the timers clean up later. Signed-off-by: Andrew Hendry <andrew.hendry@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-21 16:31:50 -07:00
Nicolas Dichtel	bc8e4b954e	xfrm6: ensure to use the same dev when building a bundle When building a bundle, we set dst.dev and rt6.rt6i_idev. We must ensure to set the same device for both fields. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-21 16:25:30 -07:00
Eric Dumazet	989a297920	fasync: RCU and fine grained locking kill_fasync() uses a central rwlock, a candidate for RCU conversion, to avoid cache line ping pongs on SMP. fasync_remove_entry() and fasync_add_entry() can disable IRQs on a short section instead of during the whole list scan. Use a spinlock per fasync_struct to synchronize kill_fasync_rcu() and fasync_{remove|add}_entry(). This spinlock is IRQ safe, so sock_fasync() doesn't need its own implementation and can use fasync_helper(), to reduce code size and complexity. We can remove __kill_fasync() direct use in net/socket.c, and rename it to kill_fasync_rcu(). Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-21 16:19:29 -07:00
David S. Miller	e5700aff14	tcp: Mark v6 response packets as CHECKSUM_PARTIAL Otherwise we only get the checksum right for data-less TCP responses. Noticed by Herbert Xu. Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-21 14:59:20 -07:00
David S. Miller	f71b70e115	tcp: Fix ipv6 checksumming on response packets for real. Commit `6651ffc8e8` ("ipv6: Fix tcp_v6_send_response transport header setting.") fixed one half of why ipv6 tcp response checksums were invalid, but it's not the whole story. If we're going to use CHECKSUM_PARTIAL for these things (which we are since commit `2e8e18ef52` "tcp: Set CHECKSUM_UNNECESSARY in tcp_init_nondata_skb"), we can't be setting buff->csum as we always have been here in tcp_v6_send_response. We need to leave it at zero. Kill that line and checksums are good again. Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-21 01:57:01 -07:00
David S. Miller	87eb367003	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/wireless/iwlwifi/iwl-6000.c net/core/dev.c	2010-04-21 01:14:25 -07:00
David Howells	05d17608a6	net: Fix an RCU warning in dev_pick_tx() Fix the following RCU warning in dev_pick_tx(): =================================================== [ INFO: suspicious rcu_dereference_check() usage. ] --------------------------------------------------- net/core/dev.c:1993 invoked rcu_dereference_check() without protection! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 0 2 locks held by swapper/0: #0: (&idev->mc_ifc_timer){+.-...}, at: [<ffffffff81039e65>] run_timer_softirq+0x17b/0x278 #1: (rcu_read_lock_bh){.+....}, at: [<ffffffff812ea3eb>] dev_queue_xmit+0x14e/0x4dc stack backtrace: Pid: 0, comm: swapper Not tainted 2.6.34-rc5-cachefs #4 Call Trace: <IRQ> [<ffffffff810516c4>] lockdep_rcu_dereference+0xaa/0xb2 [<ffffffff812ea4f6>] dev_queue_xmit+0x259/0x4dc [<ffffffff812ea3eb>] ? dev_queue_xmit+0x14e/0x4dc [<ffffffff81052324>] ? trace_hardirqs_on+0xd/0xf [<ffffffff81035362>] ? local_bh_enable_ip+0xbc/0xc1 [<ffffffff812f0954>] neigh_resolve_output+0x24b/0x27c [<ffffffff8134f673>] ip6_output_finish+0x7c/0xb4 [<ffffffff81350c34>] ip6_output2+0x256/0x261 [<ffffffff81052324>] ? trace_hardirqs_on+0xd/0xf [<ffffffff813517fb>] ip6_output+0xbbc/0xbcb [<ffffffff8135bc5d>] ? fib6_force_start_gc+0x2b/0x2d [<ffffffff81368acb>] mld_sendpack+0x273/0x39d [<ffffffff81368858>] ? mld_sendpack+0x0/0x39d [<ffffffff81052099>] ? mark_held_locks+0x52/0x70 [<ffffffff813692fc>] mld_ifc_timer_expire+0x24f/0x288 [<ffffffff81039ed6>] run_timer_softirq+0x1ec/0x278 [<ffffffff81039e65>] ? run_timer_softirq+0x17b/0x278 [<ffffffff813690ad>] ? mld_ifc_timer_expire+0x0/0x288 [<ffffffff81035531>] ? __do_softirq+0x69/0x140 [<ffffffff8103556a>] __do_softirq+0xa2/0x140 [<ffffffff81002e0c>] call_softirq+0x1c/0x28 [<ffffffff81004b54>] do_softirq+0x38/0x80 [<ffffffff81034f06>] irq_exit+0x45/0x47 [<ffffffff810177c3>] smp_apic_timer_interrupt+0x88/0x96 [<ffffffff810028d3>] apic_timer_interrupt+0x13/0x20 <EOI> [<ffffffff810488dd>] ? 
__atomic_notifier_call_chain+0x0/0x86 [<ffffffff810096bf>] ? mwait_idle+0x6e/0x78 [<ffffffff810096b6>] ? mwait_idle+0x65/0x78 [<ffffffff810011cb>] cpu_idle+0x4d/0x83 [<ffffffff81380b05>] rest_init+0xb9/0xc0 [<ffffffff81380a4c>] ? rest_init+0x0/0xc0 [<ffffffff8168dcf0>] start_kernel+0x392/0x39d [<ffffffff8168d2a3>] x86_64_start_reservations+0xb3/0xb7 [<ffffffff8168d38b>] x86_64_start_kernel+0xe4/0xeb An rcu_dereference() should be an rcu_dereference_bh(). Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-21 01:09:44 -07:00
David S. Miller	e04997b13a	Merge branch 'master' of /home/davem/src/GIT/linux-2.6/	2010-04-21 00:50:39 -07:00
Herbert Xu	6651ffc8e8	ipv6: Fix tcp_v6_send_response transport header setting. My recent patch to remove the open-coded checksum sequence in tcp_v6_send_response broke it as we did not set the transport header pointer on the new packet. Actually, there is code there trying to set the transport header properly, but it sets it for the wrong skb ('skb' instead of 'buff'). This bug was introduced by commit `a8fdf2b331` ("ipv6: Fix tcp_v6_send_response(): it didn't set skb transport header") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-21 00:47:15 -07:00
Rami Rosen	ccb7c7732e	net: Remove two unnecessary exports (skbuff). There is no need to export skb_under_panic() and skb_over_panic() in skbuff.c, since these methods are used only in skbuff.c; this patch removes these two exports. It also marks these functions as 'static' and removes their extern declarations from include/linux/skbuff.h Signed-off-by: Rami Rosen <ramirose@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-20 22:39:53 -07:00
Eric Dumazet	0eae88f31c	net: Fix various endianness glitches Sparse can help us find endianness bugs, but we need to make some cleanups to be able to more easily spot real bugs. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-20 19:06:52 -07:00
Eric Dumazet	8eabf95cb1	bridge: add a missing ntohs() grec_nsrcs is in network order; we should convert it to host order in br_multicast_igmp3_report() Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-20 18:51:57 -07:00
David S. Miller	e46754f8c9	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6	2010-04-20 17:57:56 -07:00
Eric Dumazet	aa39514516	net: sk_sleep() helper Define a new function to return the waitqueue of a "struct sock". static inline wait_queue_head_t *sk_sleep(struct sock *sk) { return sk->sk_sleep; } Change all read occurrences of sk_sleep by a call to this function. Needed for a future RCU conversion. sk_sleep won't be a field directly available. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-20 16:37:13 -07:00
Juuso Oikarinen	7bdfcaaff5	mac80211: Fix ieee80211_sta_conn_mon_timer with hw connection monitoring When IEEE80211_HW_CONNECTION_MONITOR is configured by the driver, starting of ieee80211_sta_conn_mon_timer should be prevented, as it is then not needed. This is currently partially the case. As it seems, when a probe-response is received from the AP the timer is still restarted, thus restarting the host based connection keep-alive mechanism. These probe-responses happen at least when scanning while associated. Fix this by preventing starting of the ieee80211_sta_conn_mon_timer in the ieee80211_rx_mgmt_probe_resp function. Signed-off-by: Juuso Oikarinen <juuso.oikarinen@nokia.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-04-20 11:52:40 -04:00
Holger Schurig	1289723ef2	mac80211: sample survey implementation for mac80211 & hwsim This adds the survey function to both mac80211 itself and to mac80211_hwsim. For the latter driver, we simply invent some noise level. A real driver which cannot determine the real channel noise MUST NOT report any noise, especially not a magically conjured one :-) Signed-off-by: Holger Schurig <holgerschurig@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-04-20 11:50:52 -04:00
Daniel Yingqiang Ma	03ceedea97	ath9k: Group Key fix for VAPs When I set up multiple VAPs with ath9k, I encountered an issue where traffic may be lost after a while. The detailed phenomenon is 1. After a while the clients connected to one of these VAPs will get into a state where no broadcast/multicast packets can be transferred successfully while unicast packets can be transferred normally. 2. Minutes later the unicast packet transfer will fail as well, because the ARP entry has expired and it can't be refreshed due to the broadcast trouble. It's caused by the group key being overwritten. Someone discussed this issue on the ath9k-devel mailing list before, but nobody has worked out a fix yet. I referred to the method in madwifi, and made a patch for ath9k. The method is to set the high bit of the sender (AP)'s address, and associate that MAC with the group key. It requires that the hardware supports multicast frame key search. This seems to be true for AR9160. Not sure whether it's the correct way to fix this issue. But it seems to work in my test. The patch is attached, feel free to revise it. Signed-off-by: Daniel Yingqiang ma <yma.cool@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-04-20 11:50:51 -04:00
Jiri Pirko	ab9304717f	net: emphasize rtnl lock required in call_netdevice_notifiers Since netdev_chain is guarded by rtnl_lock, ASSERT_RTNL should be present here to make sure that all callers of call_netdevice_notifiers do the locking properly. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-20 01:45:37 -07:00
Eric Dumazet	b249dcb82d	rps: consistent rxhash In case we compute a software skb->rxhash, we can generate a consistent hash : Its value will be the same in both flow directions. This helps some workloads, like conntracking, since the same state needs to be accessed in both directions. tbench + RFS + this patch gives better results than tbench with default kernel configuration (no RPS, no RFS) Also fixed some sparse warnings. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-20 01:18:06 -07:00
Eric Dumazet	e36fa2f7e9	rps: cleanups struct softnet_data holds many queues, so consistent use "sd" name instead of "queue" is better. Adds a rps_ipi_queued() helper to cleanup enqueue_to_backlog() Adds a _and_irq_disable suffix to net_rps_action() name, as David suggested. incr_input_queue_head() becomes input_queue_head_incr() Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-20 01:18:05 -07:00
Eric Dumazet	f5acb907dc	rps: static functions store_rps_map() & store_rps_dev_flow_table_cnt() are static. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-19 14:40:57 -07:00
Johannes Berg	67e0f39277	mac80211: add missing newline One HT debugging printk is missing a newline, add it. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-04-19 16:45:20 -04:00
Juuso Oikarinen	3393a608c4	mac80211: Prevent running sta_cleanup timer unnecessarily The sta_cleanup timer is used to periodically expire buffered frames from the tx buf. The timer is executing periodically, regardless of the need for it. This is wasting resources. Fix this simply by not restarting the sta_cleanup timer if the tx buffer was empty. Restart the timer when there is some more tx-traffic. Cc: Janne Ylälehto <janne.ylalehto@nokia.com> Signed-off-by: Juuso Oikarinen <juuso.oikarinen@nokia.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-04-19 16:41:42 -04:00
Johannes Berg	2aab4c273a	mac80211: fix stopping RX BA session from timer Kalle reported that his system deadlocks since my recent work in this area. The reason quickly became apparent: we try to cancel_timer_sync() a timer from within itself. Fix that by making the function aware of the context it is called from. Reported-by: Kalle Valo <kvalo@adurom.com> Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Tested-by: Kalle Valo <kvalo@adurom.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-04-19 16:41:42 -04:00
Reinette Chatre	fe6f212ce1	mac80211: pass HT changes to driver when off channel Since "mac80211: make off-channel work generic" drivers have not been notified of configuration changes after association or authentication. This caused more dependence on current state to ensure driver will be notified when configuration changes occur. One such problem arises if off-channel is in progress when HT information changes. Since HT is only enabled on the "oper_channel" the driver will never be notified of this change. Usually the driver is notified soon after of a BSS information change (BSS_CHANGED_HT) ... but since the driver did not get a notification that this is a HT channel the new BSS information does not make sense. Fix this by also changing the off-channel information when HT is enabled and thus cause driver to be notified correctly. This fixes a problem in 4965 when associated with 5GHz 40MHz channel. Without this patch the system can associate but is unable to transfer any data, not even ping. See http://bugzilla.intellinuxwireless.org/show_bug.cgi?id=2158 Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-04-19 16:34:11 -04:00
Johannes Berg	b4bb5c3fd9	mac80211: remove bogus TX agg state assignment When the addba timer expires but has no work to do, it should not affect the state machine. If it does, TX will not see the successfully established session and we can also crash trying to re-establish the session. Cc: stable@kernel.org [2.6.32, 2.6.33] Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-04-19 16:34:11 -04:00
Eric Dumazet	88751275b8	rps: shortcut net_rps_action() net_rps_action() is a bit expensive on NR_CPUS=64..4096 kernels, even if RPS is not active. Tom Herbert used two bitmasks to hold information needed to send IPI, but a single LIFO list seems more appropriate. Move all RPS logic into net_rps_action() to cleanup net_rx_action() code (remove two ifdefs) Move rps_remote_softirq_cpus into softnet_data to share its first cache line, filling an existing hole. In a future patch, we could call net_rps_action() from process_backlog() to make sure we send IPI before handling this cpu backlog. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-19 13:20:34 -07:00
Linus Torvalds	375db4810b	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: gigaset: include cleanup cleanup packet : remove init_net restriction WAN: flush tx_queue in hdlc_ppp to prevent panic on rmmod hw_driver. ip: Fix ip_dev_loopback_xmit() net: dev_pick_tx() fix fib: suppress lockdep-RCU false positive in FIB trie. tun: orphan an skb on tx forcedeth: fix tx limit2 flag check iwlwifi: work around bogus active chains detection	2010-04-19 07:27:45 -07:00
Eric Dumazet	fc6055a5ba	net: Introduce skb_orphan_try() A transmitted skb might be attached to a socket and a destructor, for memory accounting purposes. Traditionally, this destructor is called at tx completion time, when the skb is freed. When tx completion is performed by a cpu other than the sender, this forces some cache lines to change ownership. XPS was an attempt to give tx completion to the initial cpu. David's idea is to call the destructor right before giving the skb to the device (call to ndo_start_xmit()). Because device queues are usually small, orphaning the skb before tx completion is not a big deal. Some drivers already do this; we could do it at an upper level. There is one known exception to this early orphaning, called tx timestamping. It needs to keep a reference to the socket until the device can give a hardware or software timestamp. This patch adds a skb_orphan_try() helper, to centralize all exceptions to early orphaning in one spot, and uses it in dev_hard_start_xmit(). "tbench 16" results on a Nehalem machine (2 X5570 @ 2.93GHz) before: Throughput 4428.9 MB/sec 16 procs after: Throughput 4448.14 MB/sec 16 procs UDP should get even better results, its destructor being more complex, since SOCK_USE_WRITE_QUEUE is not set (four atomic ops instead of one) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-18 02:39:41 -07:00
Eric Dumazet	9958da0501	net: remove time limit in process_backlog() - There is no point in enforcing a time limit in process_backlog(), since other napi instances don't follow the same rule. We can exit after only one packet processed... The normal quota of 64 packets per napi instance should be the norm, and net_rx_action() already has its own time limit. Note : /proc/sys/net/core/dev_weight can be used to tune this 64 default value. - Use DEFINE_PER_CPU_ALIGNED for softnet_data definition. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-18 02:36:13 -07:00
Eric Dumazet	8770acf049	rps: rps_sock_flow_table is mostly read Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-17 00:54:36 -07:00
Tom Herbert	fec5e652e5	rfs: Receive Flow Steering This patch implements receive flow steering (RFS). RFS steers received packets for layer 3 and 4 processing to the CPU where the application for the corresponding flow is running. RFS is an extension of Receive Packet Steering (RPS). The basic idea of RFS is that when an application calls recvmsg (or sendmsg) the application's running CPU is stored in a hash table that is indexed by the connection's rxhash, which is stored in the socket structure. The rxhash is passed in skbs received on the connection from netif_receive_skb. For each received packet, the associated rxhash is used to look up the CPU in the hash table; if a valid CPU is set then the packet is steered to that CPU using the RPS mechanisms. The complication of this simple approach is that it would potentially allow out-of-order (OOO) packets. If threads are thrashing around CPUs or multiple threads are trying to read from the same sockets, a quickly changing CPU value in the hash table could cause rampant OOO packets-- we consider this a non-starter. To avoid OOO packets, this solution implements two types of hash tables: rps_sock_flow_table and rps_dev_flow_table. rps_sock_flow_table is a global hash table. Each entry is just a CPU number and it is populated in recvmsg and sendmsg as described above. This table contains the "desired" CPUs for flows. rps_dev_flow_table is specific to each device queue. Each entry contains a CPU and a tail queue counter. The CPU is the "current" CPU for a matching flow. The tail queue counter holds the value of a tail queue counter for the associated CPU's backlog queue at the time of last enqueue for a flow matching the entry. Each backlog queue has a queue head counter which is incremented on dequeue, and so a queue tail counter is computed as queue head count + queue length. When a packet is enqueued on a backlog queue, the current value of the queue tail counter is saved in the hash entry of the rps_dev_flow_table. 
And now the trick: when selecting the CPU for RPS (get_rps_cpu) the rps_sock_flow table and the rps_dev_flow table for the RX queue are consulted. When the desired CPU for the flow (found in the rps_sock_flow table) does not match the current CPU (found in the rps_dev_flow table), the current CPU is changed to the desired CPU if one of the following is true: - The current CPU is unset (equal to RPS_NO_CPU) - Current CPU is offline - The current CPU's queue head counter >= queue tail counter in the rps_dev_flow table. This checks if the queue tail has advanced beyond the last packet that was enqueued using this table entry. This guarantees that all packets queued using this entry have been dequeued, thus preserving in-order delivery. Making each queue have its own rps_dev_flow table has two advantages: 1) the tail queue counters will be written on each receive, so keeping the table local to the interrupting CPU is good for locality. 2) this allows lockless access to the table-- the CPU number and queue tail counter need to be accessed together under mutual exclusion from netif_receive_skb; we assume that this is only called from device napi_poll, which is non-reentrant. This patch implements RFS for TCP and connected UDP sockets. It should be usable for other flow oriented protocols. There are two configuration parameters for RFS. The "rps_flow_entries" kernel init parameter sets the number of entries in the rps_sock_flow_table; the per rxqueue sysfs entry "rps_flow_cnt" contains the number of entries in the rps_dev_flow table for the rxqueue. Both are rounded to a power of two. The obvious benefit of RFS (over just RPS) is that it achieves CPU locality between the receive processing for a flow and the application's processing; this can result in increased performance (higher pps, lower latency). The benefits of RFS are dependent on cache hierarchy, application load, and other factors. On simple benchmarks, we don't necessarily see improvement and sometimes see degradation. 
However, for more complex benchmarks and for applications where cache pressure is much higher this technique seems to perform very well. Below are some benchmark results which show the potential benefit of this patch. The netperf test has 500 instances of netperf TCP_RR test with 1 byte req. and resp. The RPC test is a request/response test similar in structure to the netperf RR test with 100 threads on each host, but does more work in userspace than netperf. e1000e on 8 core Intel No RFS or RPS 104K tps at 30% CPU No RFS (best RPS config): 290K tps at 63% CPU RFS 303K tps at 61% CPU RPC test tps CPU% 50/90/99% usec latency Latency StdDev No RFS/RPS 103K 48% 757/900/3185 4472.35 RPS only: 174K 73% 415/993/2468 491.66 RFS 223K 73% 379/651/1382 315.61 Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-16 16:01:27 -07:00
Daniel Lezcano	1c4f019732	packet : remove init_net restriction The af_packet protocol is used by Perl to do ioctls as reported by Stephane Riviere: "Net::RawIP relies on SIOCGIFADDR et SIOCGIFHWADDR to get the IP and MAC addresses of the network interface." But in a new network namespace these ioctls fail because they are disabled for any namespace other than init_net_ns. These two lines should not be there, as af_inet and af_packet have been namespace aware for a long time now. I suppose we forgot to remove these lines because we sent the af_packet first, before af_inet was supported. Signed-off-by: Daniel Lezcano <daniel.lezcano@free.fr> Reported-by: Stephane Riviere <stephane.riviere@regis-dgac.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-16 15:41:04 -07:00
Nishant Sarmukadam	7834704be4	cfg80211: Avoid sending IWEVASSOCREQIE and IWEVASSOCRESPIE events with NULL event body In a scenario, where a cfg80211 driver (station mode) does not send assoc request and assoc response IEs in cfg80211_connect_result after a successful association to an AP, cfg80211 sends IWEVASSOCREQIE and IWEVASSOCRESPIE to the user space application with NULL data. This can cause an issue at the event recipient. An example of this is when cfg80211 sends IWEVASSOCREQIE and IWEVASSOCRESPIE events with NULL event body to wpa_supplicant. The wpa_supplicant overwrites the assoc request and assoc response IEs for this station with NULL data. If the association is WPA/WPA2, the wpa_supplicant is not able to generate EAPOL handshake messages, since the IEs are NULL. With the patch, req_ie and resp_ie will be NULL by avoiding the assignment if the driver has not sent the IEs to cfg80211. The event sending code sends the events only if resp_ie and req_ie are not NULL. This will ensure that the events are not sent with NULL event body. Signed-off-by: Nishant Sarmukadam <nishants@marvell.com> Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-04-16 15:32:00 -04:00
Shan Wei	b5d4399823	ipv6: fix the comment of ip6_xmit() ip6_xmit() is used by upper transport protocol. Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-15 23:36:38 -07:00
Shan Wei	4e15ed4d93	net: replace ipfragok with skb->local_df As Herbert Xu said: we should be able to simply replace ipfragok with skb->local_df. commit f88037(sctp: Drop ipfargok in sctp_xmit function) has dropped ipfragok and set local_df value properly. The patch kills the ipfragok parameter of .queue_xmit(). Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-15 23:36:37 -07:00
Shan Wei	0eecb78494	ipv6: cancel to setting local_df in ip6_xmit() commit f88037(sctp: Drop ipfargok in sctp_xmit function) has dropped ipfragok and set local_df value properly. So the change of commit 77e2f1(ipv6: Fix ip6_xmit to send fragments if ipfragok is true) is not needed. So the patch removes that change. Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-15 23:36:37 -07:00
Joe Perches	a4fbf8415c	net/l2tp/l2tp_debugfs.c: Convert NIPQUAD to %pI4 Signed-off-by: Joe Perches <joe@perches.com> Acked-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-15 15:37:13 -07:00
David S. Miller	3eb14b944f	Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6	2010-04-15 14:31:06 -07:00
Eric Dumazet	e30b38c298	ip: Fix ip_dev_loopback_xmit() Eric Paris got the following trace with a linux-next kernel [ 14.203970] BUG: using smp_processor_id() in preemptible [00000000] code: avahi-daemon/2093 [ 14.204025] caller is netif_rx+0xfa/0x110 [ 14.204035] Call Trace: [ 14.204064] [<ffffffff81278fe5>] debug_smp_processor_id+0x105/0x110 [ 14.204070] [<ffffffff8142163a>] netif_rx+0xfa/0x110 [ 14.204090] [<ffffffff8145b631>] ip_dev_loopback_xmit+0x71/0xa0 [ 14.204095] [<ffffffff8145b892>] ip_mc_output+0x192/0x2c0 [ 14.204099] [<ffffffff8145d610>] ip_local_out+0x20/0x30 [ 14.204105] [<ffffffff8145d8ad>] ip_push_pending_frames+0x28d/0x3d0 [ 14.204119] [<ffffffff8147f1cc>] udp_push_pending_frames+0x14c/0x400 [ 14.204125] [<ffffffff814803fc>] udp_sendmsg+0x39c/0x790 [ 14.204137] [<ffffffff814891d5>] inet_sendmsg+0x45/0x80 [ 14.204149] [<ffffffff8140af91>] sock_sendmsg+0xf1/0x110 [ 14.204189] [<ffffffff8140dc6c>] sys_sendmsg+0x20c/0x380 [ 14.204233] [<ffffffff8100ad82>] system_call_fastpath+0x16/0x1b While the current linux-2.6 kernel doesn't emit this warning, the bug is latent and might cause unexpected failures. ip_dev_loopback_xmit() runs in process context, preemption enabled, so must call netif_rx_ni() instead of netif_rx(), to make sure that we process pending software interrupt. Same change for ip6_dev_loopback_xmit() Reported-by: Eric Paris <eparis@redhat.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-15 14:25:22 -07:00
David S. Miller	791f58c064	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/ipmr-2.6	2010-04-15 14:14:05 -07:00
John W. Linville	5c01d56693	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem Conflicts: Documentation/feature-removal-schedule.txt drivers/net/wireless/ath/ath5k/phy.c drivers/net/wireless/wl12xx/wl1271_main.c	2010-04-15 16:21:34 -04:00
Patrick McHardy	8de53dfbf9	ipv4: ipmr: fix NULL pointer deref during unres queue destruction Fix an oversight in ipmr_destroy_unres() - the net pointer is unconditionally initialized to NULL, resulting in a NULL pointer dereference later on. Fix by adding a net pointer to struct mr_table and using it in ipmr_destroy_unres(). Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-04-15 13:31:29 +02:00
Patrick McHardy	b0ebb739a8	ipv4: ipmr: fix invalid cache resolving when adding a non-matching entry The patch to convert struct mfc_cache to list_heads (ipv4: ipmr: convert struct mfc_cache to struct list_head) introduced a bug when adding new cache entries that don't match any unresolved entries. The unres queue is searched for a matching entry, which is then resolved. When no matching entry is present, the iterator points to the head of the list, but is treated as a matching entry. Use a separate variable to indicate that a matching entry was found. Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-04-15 13:31:29 +02:00
Patrick McHardy	66496d4973	ipv4: ipmr: fix IP_MROUTE_MULTIPLE_TABLES Kconfig dependencies IP_MROUTE_MULTIPLE_TABLES should depend on IP_MROUTE. Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-04-15 13:31:29 +02:00
Eric Dumazet	8728c544a9	net: dev_pick_tx() fix When dev_pick_tx() caches tx queue_index on a socket, we must check socket dst_entry matches skb one, or risk a crash later, as reported by Denys Fedorysychenko, if old packets are in flight during a route change, involving devices with different number of queues. Bug introduced by commit `a4ee3ce3` (net: Use sk_tx_queue_mapping for connected sockets) Reported-by: Denys Fedorysychenko <nuclearcat@nuclearcat.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-15 01:27:11 -07:00
Eric Dumazet	b0e28f1eff	net: netif_rx() must disable preemption Eric Paris reported netif_rx() is calling smp_processor_id() from preemptible context, in particular when caller is ip_dev_loopback_xmit(). RPS commit added this smp_processor_id() call, this patch makes sure preemption is disabled. rps_get_cpus() wants rcu_read_lock() anyway, we can do it a bit earlier. Reported-by: Eric Paris <eparis@redhat.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-15 00:14:07 -07:00
Eric Dumazet	4eaa0e3c86	fib: suppress lockdep-RCU false positive in FIB trie. Followup of commit `634a4b20` Allow tnode_get_child_rcu() to be called either under rcu_read_lock() protection or with RTNL held. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-14 16:13:29 -07:00
David S. Miller	dad1e54b12	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/pcmcia/smc91c92_cs.c drivers/net/virtio_net.c	2010-04-14 05:01:33 -07:00
Patrick McHardy	f0ad0860d0	ipv4: ipmr: support multiple tables This patch adds support for multiple independent multicast routing instances, named "tables". Userspace multicast routing daemons can bind to a specific table instance by issuing a setsockopt call using a new option MRT_TABLE. The table number is stored in the raw socket data and affects all following ipmr setsockopt(), getsockopt() and ioctl() calls. By default, a single table (RT_TABLE_DEFAULT) is created with a default routing rule pointing to it. Newly created pimreg devices have the table number appended ("pimregX"), with the exception of devices created in the default table, which are named just "pimreg" for compatibility reasons. Packets are directed to a specific table instance using routing rules, similar to how regular routing rules work. Currently iif, oif and mark are supported as keys; source and destination addresses could be supported additionally. Example usage: - bind pimd/xorp/... to a specific table: uint32_t table = 123; setsockopt(fd, IPPROTO_IP, MRT_TABLE, &table, sizeof(table)); - create routing rules directing packets to the new table: # ip mrule add iif eth0 lookup 123 # ip mrule add oif eth0 lookup 123 Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-13 14:49:34 -07:00
Patrick McHardy	0c12295a74	ipv4: ipmr: move mroute data into seperate structure Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-13 14:49:34 -07:00
Patrick McHardy	862465f2e7	ipv4: ipmr: convert struct mfc_cache to struct list_head Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-13 14:49:33 -07:00
Patrick McHardy	d658f8a0e6	ipv4: ipmr: remove net pointer from struct mfc_cache Now that cache entries in unres_queue don't need to be distinguished by their network namespace pointer anymore, we can remove it from struct mfc_cache and pass the namespace as function argument to the functions that need it. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-13 14:49:33 -07:00
Patrick McHardy	e258beb22f	ipv4: ipmr: move unres_queue and timer to per-namespace data The unres_queue is currently shared between all namespaces. Following patches will additionally allow creating multiple multicast routing tables in each namespace. Having a single shared queue for all these users seems excessive; move the queue and the cleanup timer to the per-namespace data to unshare it. As a side-effect, this fixes a bug in the seq file iteration functions: the first entry returned is always from the current namespace, entries returned after that may belong to any namespace. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-13 14:49:32 -07:00
Patrick McHardy	0f87b1dd01	net: fib_rules: decouple address families from real address families Decouple the address family values used for fib_rules from the real address families in socket.h. This allows to use fib_rules for code that is not a real address family without increasing AF_MAX/NPROTO. Values up to 127 are reserved for real address families and map directly to the corresponding AF value, values starting from 128 are for other uses. rtnetlink is changed to invoke the AF_UNSPEC dumpit/doit handlers for these families. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-13 14:49:31 -07:00
Patrick McHardy	28bb17268b	net: fib_rules: set family in fib_rule_hdr centrally All fib_rules implementations need to set the family in their ->fill() functions. Since the value is available to the generic fib_nl_fill_rule() function, set it there. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-13 14:49:30 -07:00
Patrick McHardy	d8a566beaa	net: fib_rules: consolidate IPv4 and DECnet ->default_pref() functions. Both functions are equivalent, consolidate them since a following patch needs a third implementation for multicast routing. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-13 14:49:30 -07:00
Linus Torvalds	465de2ba71	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (25 commits) smc91c92_cs: define multicast_table as unsigned char can: avoids a false warning e1000e: stop cleaning when we reach tx_ring->next_to_use igb: restrict WoL for 82576 ET2 Quad Port Server Adapter virtio_net: missing sg_init_table Revert "tcp: Set CHECKSUM_UNNECESSARY in tcp_init_nondata_skb" iwlwifi: need check for valid qos packet before free tcp: Set CHECKSUM_UNNECESSARY in tcp_init_nondata_skb udp: fix for unicast RX path optimization myri10ge: fix rx_pause in myri10ge_set_pauseparam net: corrected documentation for hardware time stamping stmmac: use resource_size() x.25 attempts to negotiate invalid throughput x25: Patch to fix bug 15678 - x25 accesses fields beyond end of packet. bridge: Fix IGMP3 report parsing cnic: Fix crash during bnx2x MTU change. qlcnic: fix set mac addr r6040: fix r6040_multicast_list vhost-net: fix vq_memory_access_ok error checking ath9k: fix double calls to ath_radio_enable ...	2010-04-13 11:32:48 -07:00
stephen hemminger	5611551103	dst: don't inline dst_ifdown The function dst_ifdown is called in only two places, and in a non-performance-critical code path, so there is no reason to inline it. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-13 03:32:44 -07:00
Eric Dumazet	acbbc07145	net: uninline skb_bond_should_drop() skb_bond_should_drop() is too big to be inlined. This patch reduces kernel text size, and its compilation time as well (shrinking include/linux/netdevice.h) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-13 03:32:42 -07:00
Eric Dumazet	4ffa87012e	can: avoids a false warning At this point optlen == sizeof(sfilter) but some compilers are dumb. Reported-by: Németh Márton <nm127@freemail.h Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Oliver Hartkopp <oliver@hartkopp.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-13 03:03:14 -07:00
stephen hemminger	8595805aaf	IPv6: only notify protocols if address is compeletely gone The notifier for address down should only be called if address is completely gone, not just being marked as tentative on link transition. The code in net-next would cause bonding/sctp/s390 to see address disappear on link down, but they would never see it reappear on link up. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-13 02:29:28 -07:00
stephen hemminger	d1f84c63a4	ipv6: additional ref count for hash list unnecessary Since an address in hash list has to already have a ref count, no additional ref count is needed. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-13 02:29:28 -07:00
stephen hemminger	27bdb2abcc	IPv6: keep tentative addresses in hash table When link goes down, want address to be preserved but in a tentative state, therefore it has to stay in hash list. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-13 02:29:27 -07:00
stephen hemminger	93fa159abe	IPv6: keep route for tentative address Recent changes preserve IPv6 address when link goes down (good). But would cause address to point to dead dst entry (bad). The simplest fix is to just not delete route if address is being held for later use. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-13 02:29:27 -07:00
Eric Dumazet	b6c6712a42	net: sk_dst_cache RCUification With latest CONFIG_PROVE_RCU stuff, I felt more comfortable to make this work. sk->sk_dst_cache is currently protected by a rwlock (sk_dst_lock) This rwlock is readlocked for a very small amount of time, and dst entries are already freed after RCU grace period. This calls for RCU again :) This patch converts sk_dst_lock to a spinlock, and uses RCU for readers. __sk_dst_get() is supposed to be called with rcu_read_lock() or if socket locked by user, so use appropriate rcu_dereference_check() condition (rcu_read_lock_held() || sock_owned_by_user(sk)) This patch avoids two atomic ops per tx packet on UDP connected sockets, for example, and permits sk_dst_lock to be much less dirtied. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-13 01:41:33 -07:00
Eric Dumazet	7a161ea924	net: Dont use netdev_warn() Don't use netdev_warn() in dev_cap_txqueue() and get_rps_cpu() so that we can catch the following warnings without a crash. bond0.2240 received packet on queue 6, but number of RX queues is 1 bond0.2240 received packet on queue 11, but number of RX queues is 1 Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-13 01:41:32 -07:00
Richard Cochran	ed85b565b8	packet: support for TX time stamps on RAW sockets Enable the SO_TIMESTAMPING socket infrastructure for raw packet sockets. We introduce PACKET_TX_TIMESTAMP for the control message cmsg_type. Similar support for UDP and CAN sockets was added in commit `51f31cabe3` Signed-off-by: Richard Cochran <richard.cochran@omicron.at> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-13 01:30:48 -07:00
Linus Torvalds	fedfb947b2	Merge branch 'for-2.6.34' of git://linux-nfs.org/~bfields/linux * 'for-2.6.34' of git://linux-nfs.org/~bfields/linux: svcrdma: RDMA support not yet compatible with RPC6	2010-04-12 18:34:56 -07:00
David S. Miller	2e8e18ef52	tcp: Set CHECKSUM_UNNECESSARY in tcp_init_nondata_skb Back in commit `04a0551c87` ("loopback: Drop obsolete ip_summed setting") we stopped setting CHECKSUM_UNNECESSARY in the loopback xmit. This is because such a setting was a lie since it implies that the checksum field of the packet is properly filled in. Instead what happens normally is that CHECKSUM_PARTIAL is set and skb->csum is calculated as needed. But this was only happening for TCP data packets (via the skb->ip_summed assignment done in tcp_sendmsg()). It doesn't happen for non-data packets like ACKs etc. Fix this by setting skb->ip_summed in the common non-data packet constructor. It already is setting skb->csum to zero. But this reminds us that we still have things like ip_output.c's ip_dev_loopback_xmit() which sets skb->ip_summed to the value CHECKSUM_UNNECESSARY, which Herbert's patch teaches us is not valid. So we'll have to address that at some point too. Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-11 15:29:13 -07:00
Herbert Xu	bb29624614	inet: Remove unused send_check length argument This patch removes the unused length argument from the send_check function in struct inet_connection_sock_af_ops. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Tested-by: Yinghai <yinghai.lu@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-11 15:29:09 -07:00
Herbert Xu	8ad50d96db	tcp: Handle CHECKSUM_PARTIAL for SYNACK packets for IPv6 This patch moves the common code between tcp_v6_send_check and tcp_v6_gso_send_check into a new function __tcp_v6_send_check. It then uses the new function in tcp_v6_send_synack as well as tcp_v6_send_response so that they handle CHECKSUM_PARTIAL properly. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Tested-by: Yinghai <yinghai.lu@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-11 15:29:08 -07:00
Herbert Xu	419f9f8960	tcp: Handle CHECKSUM_PARTIAL for SYNACK packets for IPv4 This patch moves the common code between tcp_v4_send_check and tcp_v4_gso_send_check into a new function __tcp_v4_send_check. It then uses the new function in tcp_v4_send_synack so that it handles CHECKSUM_PARTIAL properly. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Tested-by: Yinghai <yinghai.lu@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-11 15:29:08 -07:00
David S. Miller	871039f02f	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/stmmac/stmmac_main.c drivers/net/wireless/wl12xx/wl1271_cmd.c drivers/net/wireless/wl12xx/wl1271_main.c drivers/net/wireless/wl12xx/wl1271_spi.c net/core/ethtool.c net/mac80211/scan.c	2010-04-11 14:53:53 -07:00
David S. Miller	4a1032faac	Merge branch 'master' of /home/davem/src/GIT/linux-2.6/	2010-04-11 02:44:30 -07:00
David S. Miller	ae4e8d63b5	Revert "tcp: Set CHECKSUM_UNNECESSARY in tcp_init_nondata_skb" This reverts commit `2626419ad5`. It causes regressions for people with IGB cards. Connection requests don't complete etc. The true cause of the issue is still not known, but we should sort this out in net-next-2.6 not net-2.6 Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-11 02:40:49 -07:00
Teemu Paasikivi	68dd5b7a45	mac80211: check whether scan is in progress before queueing scan_work As scan_work is queued from work_work it needs to be checked if scan has been started during execution of work_work. Otherwise, when hw scan is used, the stack gets an error about hw being busy with an ongoing scan. This causes the stack to abort the scan without notifying the driver about it. This leads to a situation where the hw is scanning and the stack thinks it's not. Then when the scan finishes, the stack will complain with warnings. Signed-off-by: Teemu Paasikivi <ext-teemu.3.paasikivi@nokia.com> Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-04-09 13:43:11 -04:00
Luis R. Rodriguez	c15cf5fcf9	mac80211: fix typo for LDPC capability Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-04-09 13:43:09 -04:00
Zhu Yi	39184b151c	mac80211: delay skb linearising in rx decryption We delay the skb linearising in ieee80211_rx_h_decrypt so that frames that do not require software decryption are not linearized. We are safe to do this because ieee80211_get_mmie_keyidx() only requires to touch nonlinear data for management frames, which are already linearized before getting here. Cc: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Zhu Yi <yi.zhu@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-04-09 13:43:09 -04:00
Johannes Berg	b5878a2dc5	mac80211: enhance tracing Enhance tracing by adding tracing for a variety of callbacks that the drivers call, and also for internal calls (currently limited to queue status). This can aid debugging what is going on in mac80211 in interaction with drivers, since we can now see what drivers call and not just what mac80211 calls in the driver. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-04-08 15:24:13 -04:00
Javier Cardona	97ad9139fd	mac80211: Moved mesh action codes to a more visible location Grouped mesh action codes together with the other action codes in ieee80211.h. Signed-off-by: Javier Cardona <javier@cozybit.com> Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-04-08 15:24:07 -04:00
David S. Miller	2626419ad5	tcp: Set CHECKSUM_UNNECESSARY in tcp_init_nondata_skb Back in commit `04a0551c87` ("loopback: Drop obsolete ip_summed setting") we stopped setting CHECKSUM_UNNECESSARY in the loopback xmit. This is because such a setting was a lie since it implies that the checksum field of the packet is properly filled in. Instead what happens normally is that CHECKSUM_PARTIAL is set and skb->csum is calculated as needed. But this was only happening for TCP data packets (via the skb->ip_summed assignment done in tcp_sendmsg()). It doesn't happen for non-data packets like ACKs etc. Fix this by setting skb->ip_summed in the common non-data packet constructor. It already is setting skb->csum to zero. But this reminds us that we still have things like ip_output.c's ip_dev_loopback_xmit() which sets skb->ip_summed to the value CHECKSUM_UNNECESSARY, which Herbert's patch teaches us is not valid. So we'll have to address that at some point too. Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-08 11:32:30 -07:00
Jorge Boncompte [DTI2]	1223c67c09	udp: fix for unicast RX path optimization Commits `5051ebd275` and `5051ebd275` ("ipv[46]: udp: optimize unicast RX path") broke some programs. After upgrading a L2TP server to 2.6.33 it started to fail, tunnels going up and down, after the 10th tunnel came up. My modified rp-l2tp uses a global unconnected socket bound to (INADDR_ANY, 1701) and one connected socket per tunnel after parameter negotiation. After ten sockets were open and due to mixed parameters to udp[46]_lib_lookup2() kernel started to drop packets. Signed-off-by: Jorge Boncompte [DTI2] <jorge@dti2.net> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-08 11:29:13 -07:00
John W. Linville	0f2df9eac7	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 into merge Conflicts: Documentation/feature-removal-schedule.txt drivers/net/wireless/ath/ath5k/phy.c drivers/net/wireless/iwlwifi/iwl-4965.c drivers/net/wireless/iwlwifi/iwl-agn.c drivers/net/wireless/iwlwifi/iwl-core.c drivers/net/wireless/iwlwifi/iwl-core.h drivers/net/wireless/iwlwifi/iwl-tx.c	2010-04-08 13:34:54 -04:00
chavey	97f8aefbbf	net: fix ethtool coding style errors and warnings Fix coding style errors and warnings output while running checkpatch.pl on the files net/core/ethtool.c and include/linux/ethtool.h Signed-off-by: chavey <chavey@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-07 21:54:42 -07:00
John Hughes	ddd0451fc8	x.25 attempts to negotiate invalid throughput The current X.25 code has some bugs in throughput negotiation: 1. It does negotiation in all cases, usually there is no need 2. It incorrectly attempts to negotiate the throughput class in one direction only. There are separate throughput classes for input and output and if either is negotiated both must be negotiated. This is bug https://bugzilla.kernel.org/show_bug.cgi?id=15681 This bug was first reported by Daniel Ferenci to the linux-x25 mailing list on 6/8/2004, but is still present. The current (2.6.34) x.25 code doesn't seem to know that the X.25 throughput facility includes two values, one for the required throughput outbound, one for inbound. This causes it to attempt to negotiate throughput 0x0A, which is throughput 9600 inbound and the illegal value "0" for inbound throughput. Because of this some X.25 devices (e.g. Cisco 1600) refuse to connect to Linux X.25. The following patch fixes this behaviour. Unless the user specifies a required throughput it does not attempt to negotiate. If the user does not specify a throughput it accepts the suggestion of the remote X.25 system. If the user requests a throughput then it validates both the input and output throughputs and correctly negotiates them with the remote end. Signed-off-by: John Hughes <john@calva.com> Tested-by: Andrew Hendry <andrew.hendry@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-07 21:33:02 -07:00
John Hughes	f5eb917b86	x25: Patch to fix bug 15678 - x25 accesses fields beyond end of packet. Here is a patch to stop X.25 examining fields beyond the end of the packet. For example, when a simple CALL ACCEPTED was received: 10 10 0f x25_parse_facilities was attempting to decode the FACILITIES field, but this packet contains no facilities field. Signed-off-by: John Hughes <john@calva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-07 21:29:25 -07:00
Herbert Xu	fd218cf955	bridge: Fix IGMP3 report parsing The IGMP3 report parsing is looking at the wrong address for group records. This patch fixes it. Reported-by: Banyeer <banyeer@yahoo.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-07 21:20:47 -07:00
Eric Dumazet	298b9e44be	net: include linux/proc_fs.h in dev_addr_lists.c As pointed out by Randy Dunlap, we must include linux/proc_fs.h in net/core/dev_addr_lists.c, regardless of CONFIG_PROC_FS Reported-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-07 16:46:36 -07:00
David S. Miller	005c93b5d8	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6	2010-04-07 16:41:03 -07:00
Johannes Berg	8c11e4ab09	mac80211: fix paged RX crypto WEP crypto was broken, but upon finding the problem it is evident that other things were broken by the paged RX patch as well. To fix it, for now move the linearising in front. This means that we linearise all frames, which is not at all what we want, but at least it fixes the problem for now. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Acked-by: Zhu Yi <yi.zhu@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-04-07 16:26:25 -04:00
Johannes Berg	54297e4d60	mac80211: fix some RX aggregation locking A few places in mac80211 do not currently acquire the sta lock for RX aggregation, but they should. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-04-07 14:38:06 -04:00
Johannes Berg	098a607091	mac80211: clean up/fix aggregation code The aggregation code has a number of quirks, like inventing an unneeded WLAN_BACK_TIMER value and leaking memory under certain circumstances during station destruction. Fix these issues by using the regular aggregation session teardown code and blocking new aggregation sessions, all before the station is really destructed. As a side effect, this gets rid of the long code block to destroy aggregation safely. Additionally, rename tid_state_rx which can only have the values IDLE and OPERATIONAL to tid_active_rx to make it easier to understand that there is no bitwise stuff going on on the RX side -- the TX side remains because it needs to keep track of the driver and peer states. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-04-07 14:38:05 -04:00
Johannes Berg	618f356b95	mac80211: rename WLAN_STA_SUSPEND to WLAN_STA_BLOCK_BA I want to use it during station destruction as well so rename it to WLAN_STA_BLOCK_BA which is also the only use of it now. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-04-07 14:38:04 -04:00
Johannes Berg	66b0470aee	mac80211: remove ieee80211_sta_stop_rx_ba_session All callers of ieee80211_sta_stop_rx_ba_session can just call __ieee80211_stop_rx_ba_session instead because they already have the station struct, so do that and remove ieee80211_sta_stop_rx_ba_session. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-04-07 14:38:03 -04:00
Johannes Berg	2b43ae6daf	mac80211: remove irq disabling for sta lock All other places except one in the TX path, which has BHs disabled, and it also cannot be locked from interrupts so disabling IRQs is not necessary. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-04-07 14:38:01 -04:00
Johannes Berg	e64b379574	mac80211: fix station destruction problem When a station w/o a key is destroyed, or when a driver submits work for a station and thereby references it again, it seems like potentially we could reference the station structure while it is being destroyed. Wait for an RCU grace period to elapse before finishing destroying the station after we have removed the station from the driver and from the hash table etc., even in the case where no key is associated with the station. Also, there's no point in deleting the plink timer here since it'll be properly deleted just a bit later. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-04-07 14:38:00 -04:00
Jouni Malinen	d5cdfacb35	cfg80211: Add local-state-change-only auth/deauth/disassoc cfg80211 is quite strict on allowing authentication and association commands only in certain states. In order to meet these requirements, user space applications may need to clear authentication or association state in some cases. Currently, this can be done with deauth/disassoc command, but that ends up sending out Deauthentication or Disassociation frame unnecessarily. Add a new nl80211 attribute to allow this sending of the frame be skipped, but with all other deauth/disassoc operations being completed. Similar state change is also needed for IEEE 802.11r FT protocol in the FT-over-DS case which does not use Authentication frame exchange in a transition to another BSS. For this to work with cfg80211, an authentication entry needs to be created for the target BSS without sending out an Authentication frame. The nl80211 authentication command can be used for this purpose, too, with the new attribute to indicate that the command is only for changing local state. This enables wpa_supplicant to complete FT-over-DS transition successfully. Signed-off-by: Jouni Malinen <j@w1.fi> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-04-07 14:37:56 -04:00
Timo Teräs	8e4795605d	flow: delayed deletion of flow cache entries Speed up lookups by freeing flow cache entries later. After virtualizing flow cache entry operations, the flow cache may now end up calling policy or bundle destructor which can be slowish. As gc_list is more effective with double linked list, the flow cache is converted to use common hlist and list macros where appropriate. Signed-off-by: Timo Teras <timo.teras@iki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-07 03:43:20 -07:00
Timo Teräs	285ead175c	xfrm: remove policy garbage collection Policies are now properly reference counted and destroyed from all code paths. The delayed gc is just an overhead now and can be removed. Signed-off-by: Timo Teras <timo.teras@iki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-07 03:43:19 -07:00
Timo Teräs	80c802f307	xfrm: cache bundles instead of policies for outgoing flows __xfrm_lookup() is called for each packet transmitted out of system. The xfrm_find_bundle() does a linear search which can kill system performance depending on how many bundles are required per policy. This modifies __xfrm_lookup() to store bundles directly in the flow cache. If we did not get a hit, we just create a new bundle instead of doing slow search. This means that we can now get multiple xfrm_dst's for same flow (on per-cpu basis). Signed-off-by: Timo Teras <timo.teras@iki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-07 03:43:19 -07:00
Timo Teräs	fe1a5f031e	flow: virtualize flow cache entry methods This allows to validate the cached object before returning it. It also allows to destruct object properly, if the last reference was held in flow cache. This is also a preparation for caching bundles in the flow cache. In return for virtualizing the methods, we save on: - not having to regenerate the whole flow cache on policy removal: each flow matching a killed policy gets refreshed as the getter function notices it smartly. - we do not have to call flow_cache_flush from policy gc, since the flow cache now properly deletes the object if it had any references Signed-off-by: Timo Teras <timo.teras@iki.fi> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-07 03:43:18 -07:00
David S. Miller	4a35ecf8bf	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/bonding/bond_main.c drivers/net/via-velocity.c drivers/net/wireless/iwlwifi/iwl-agn.c	2010-04-06 23:53:30 -07:00
Hagen Paul Pfeifer	842509b859	socket: remove duplicate declaration of struct timespec struct timespec ts was already defined. Reuse the previously defined one and reduce the memory footprint on the stack by 16 bytes. Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-06 19:50:20 -07:00
Jon Paul Maloy	c6537d6742	TIPC: Updated topology subscription protocol according to latest spec This patch makes it explicit in the API that all fields in subscriptions and events exchanged with the Topology Server must be in network byte order. It also ensures that all fields of a subscription are compared when cancelling a subscription, in order to avoid inadvertent cancelling of the wrong subscription. Finally, the tipc module version is updated to 2.0.0, to reflect the API change. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-06 19:50:19 -07:00
Jouni Malinen	d211e90e28	mac80211: Fix robust management frame handling (MFP) Commit e34e09401ee9888dd662b2fca5d607794a56daf2 incorrectly removed use of ieee80211_has_protected() from the management frame case and in practice, made this validation drop all Action frames when MFP is enabled. This should have only been done for frames with Protected field set to zero. Signed-off-by: Jouni Malinen <j@w1.fi> Cc: stable@kernel.org Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-04-06 16:49:33 -04:00
Johannes Berg	0379185b6c	mac80211: annotate station rcu dereferences The new RCU lockdep support warns about these in some contexts -- make it aware of the locks used to protect all this. Different locks are used in different contexts which unfortunately means we can't get perfect checking. Also remove rcu_dereference() from two places that don't actually dereference the pointers. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-04-06 15:53:30 -04:00
Javier Cardona	1cb561f837	mac80211: Handle mesh action frames in ieee80211_rx_h_action This fixes the problem introduced in commit `8404080568` which broke mesh peer link establishment. changes: v2 Added missing break (Johannes) v3 Broke original patch into two (Johannes) Signed-off-by: Javier Cardona <javier@cozybit.com> Cc: stable@kernel.org Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-04-06 15:53:28 -04:00
Linus Torvalds	cb4361c1dc	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (37 commits) smc91c92_cs: fix the problem of "Unable to find hardware address" r8169: clean up my printk uglyness net: Hook up cxgb4 to Kconfig and Makefile cxgb4: Add main driver file and driver Makefile cxgb4: Add remaining driver headers and L2T management cxgb4: Add packet queues and packet DMA code cxgb4: Add HW and FW support code cxgb4: Add register, message, and FW definitions netlabel: Fix several rcu_dereference() calls used without RCU read locks bonding: fix potential deadlock in bond_uninit() net: check the length of the socket address passed to connect(2) stmmac: add documentation for the driver. stmmac: fix kconfig for crc32 build error be2net: fix bug in vlan rx path for big endian architecture be2net: fix flashing on big endian architectures be2net: fix a bug in flashing the redboot section bonding: bond_xmit_roundrobin() fix drivers/net: Add missing unlock net: gianfar - align BD ring size console messages net: gianfar - initialize per-queue statistics ...	2010-04-06 08:34:06 -07:00
YOSHIFUJI Hideaki / 吉藤英明	2f787b0b76	mac80211: Ensure initializing private mc_list in prepare_multicast(). Fix kernel panic by NULL pointer dereference in the context of ieee80211_ops->prepare_multicast(). This bug was introduced by commit 22bedad3c.. ("net: convert multicast list to list_head"). Call __hw_addr_init() in ieee80211_alloc_hw() to initialize list_head of private device multicast list, like we do in bond_init(). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Reviewed-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-06 00:12:30 -07:00
Eric Dumazet	e4008276fd	net: Add a missing local_irq_enable() As noticed by Changli Gao, we must call local_irq_enable() after rps_unlock() Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-05 15:42:39 -07:00
Tom Herbert	5a6d234e73	rps: fixed missed rps_unlock Fix spin_unlock_irq which needs to be rps_unlock. Signed-off-by: Tom Herbert <therbert@google.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-05 14:37:55 -07:00
Linus Torvalds	749d229761	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: 9p: saving negative to unsigned char 9p: return on mutex_lock_interruptible() 9p: Creating files with names too long should fail with ENAMETOOLONG. 9p: Make sure we are able to clunk the cached fid on umount 9p: drop nlink remove fs/9p: Clunk the fid resulting from partial walk of the name 9p: documentation update 9p: Fix setting of protocol flags in v9fs_session_info structure.	2010-04-05 13:42:54 -07:00
Dan Carpenter	3dc9fef67f	9p: saving negative to unsigned char Saving -EINVAL as unsigned char truncates the high bits and changes it into 234 instead of -22. This breaks the test for "if (ret == -EINVAL)" in parse_opts(). Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2010-04-05 14:37:28 -05:00
Tom Tucker	bade732a28	svcrdma: RDMA support not yet compatible with RPC6 RPC6 requires that it be possible to create endpoints that listen exclusively for IPv4 or IPv6 connection requests. This is not currently supported by the RDMA API. This fixes a server RDMA regression introduced by `37498292a` "NFSD: Create PF_INET6 listener in write_ports". Signed-off-by: Tom Tucker<tom@opengridcomputing.com> Tested-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-04-05 12:10:22 -04:00
Aneesh Kumar K.V	6d96d3ab7a	9p: Make sure we are able to clunk the cached fid on umount dcache prune happens on umount. So we cannot mark the client status disconnected. That would prevent a 9p call to the server. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2010-04-05 10:37:36 -05:00
Eric Dumazet	7bddd0db62	l2tp: unmanaged L2TPv3 tunnels fixes Followup to commit `789a4a2c` (l2tp: Add support for static unmanaged L2TPv3 tunnels) One missing init in l2tp_tunnel_sock_create() could access random kernel memory, and a bit field should be unsigned. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-04 01:02:46 -07:00
Brian Haley	486f50ca79	SCTP: Change to use ipv6_addr_copy() Change SCTP IPv6 code to use ipv6_addr_copy() Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-03 15:10:21 -07:00
Eric Dumazet	1f8438a853	icmp: Account for ICMP out errors When ip_append() fails because of socket limit or memory shortage, increment ICMP_MIB_OUTERRORS counter, so that "netstat -s" can report these errors. LANG=C netstat -s | grep "ICMP messages failed" 0 ICMP messages failed For IPV6, implement ICMP6_MIB_OUTERRORS counter as well. # grep Icmp6OutErrors /proc/net/dev_snmp6/* /proc/net/dev_snmp6/eth0:Icmp6OutErrors 0 /proc/net/dev_snmp6/lo:Icmp6OutErrors 0 Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-03 15:09:04 -07:00
David S. Miller	f66ef2d064	l2tp: Fix L2TP_DEBUGFS ifdef tests. We have to check CONFIG_L2TP_DEBUGFS_MODULE as well as CONFIG_L2TP_DEBUGFS. Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-03 15:01:37 -07:00
David S. Miller	f481c0d862	l2tp: Add missing semicolon to MODULE_ALIAS() in l2tp_netlink.c Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-03 14:58:07 -07:00
James Chapman	789a4a2c61	l2tp: Add support for static unmanaged L2TPv3 tunnels This patch adds support for static (unmanaged) L2TPv3 tunnels, where the tunnel socket is created by the kernel rather than being created by userspace. This means L2TP tunnels and sessions can be created manually, without needing an L2TP control protocol implemented in userspace. This might be useful where the user wants a simple ethernet over IP tunnel. A patch to iproute2 adds a new command set under "ip l2tp" to make use of this feature. This will be submitted separately. Signed-off-by: James Chapman <jchapman@katalix.com> Reviewed-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-03 14:56:08 -07:00
James Chapman	0ad6614048	l2tp: Add debugfs files for dumping l2tp debug info The existing pppol2tp driver exports debug info to /proc/net/pppol2tp. Rather than adding info to that file for the new functionality added in this patch series, we add new files in debugfs, leaving the old /proc file for backwards compatibility (L2TPv2 only). Currently only one file is provided: l2tp/tunnels, which lists internal debug info for all l2tp tunnels and sessions. More files may be added later. The info is for debug and problem analysis only - userspace apps should use netlink to obtain status about l2tp tunnels and sessions. Although debugfs does not support net namespaces, the tunnels and sessions dumped in l2tp/tunnels are only those in the net namespace of the process reading the file. Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-03 14:56:07 -07:00
James Chapman	d9e31d17ce	l2tp: Add L2TP ethernet pseudowire support This driver presents a regular net_device for each L2TP ethernet pseudowire instance. These interfaces are named l2tpethN by default, though userspace can specify an alternative name when the L2TP session is created, if preferred. When the pseudowire is established, regular Linux networking utilities may be used to configure the interface, i.e. give it IP address info or add it to a bridge. Any data passed over the interface is carried over an L2TP tunnel. Signed-off-by: James Chapman <jchapman@katalix.com> Reviewed-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-03 14:56:06 -07:00
James Chapman	e02d494d2c	l2tp: Convert rwlock to RCU Reader/write locks are discouraged because they are slower than spin locks. So this patch converts the rwlocks used in the per_net structs to rcu. Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-03 14:56:06 -07:00
James Chapman	309795f4be	l2tp: Add netlink control API for L2TP In L2TPv3, we need to create/delete/modify/query L2TP tunnel and session contexts. The number of parameters is significant. So let's use netlink. Userspace uses this API to control L2TP tunnel/session contexts in the kernel. The previous pppol2tp driver was managed using [gs]etsockopt(). This API is retained for backwards compatibility. Unlike L2TPv2 which carries only PPP frames, L2TPv3 can carry raw ethernet frames or other frame types and these do not always have an associated socket family. Therefore, we need a way to use L2TP sessions that doesn't require a socket type for each supported frame type. Hence netlink is used. Signed-off-by: James Chapman <jchapman@katalix.com> Reviewed-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-03 14:56:05 -07:00
James Chapman	f408e0ce40	netlink: Export genl_lock() API for use by modules This lets kernel modules which use genl netlink APIs serialize netlink processing. Signed-off-by: James Chapman <jchapman@katalix.com> Reviewed-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-03 14:56:05 -07:00
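
Commit 3dc9fef67f in the list above ("9p: saving negative to unsigned char") describes -EINVAL turning into 234 when stored in an unsigned char, which breaks a later comparison against -EINVAL. The sketch below reproduces only that arithmetic, not the kernel code: it uses Python's ctypes to mimic C integer storage, and the helper names and the hard-coded EINVAL value of 22 are assumptions chosen for illustration.

```python
import ctypes

EINVAL = 22  # assumed value, matching the "-22" quoted in the commit message


def store_in_unsigned_char(value: int) -> int:
    """Mimic assigning an int to a C unsigned char: only the low 8 bits survive."""
    return ctypes.c_ubyte(value).value


def store_in_int(value: int) -> int:
    """Mimic assigning an int to a C int: small negative values are preserved."""
    return ctypes.c_int(value).value


narrow = store_in_unsigned_char(-EINVAL)
wide = store_in_int(-EINVAL)

print(narrow)             # value seen after the narrowing store
print(narrow == -EINVAL)  # the comparison the commit says was broken
print(wide == -EINVAL)    # the same comparison with a signed int
```

With an 8-bit unsigned type, -22 wraps to 256 - 22 = 234, so a test of the form `ret == -EINVAL` can never succeed; storing the value in a signed int keeps the comparison meaningful.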

15228 commits