Commit graph

13014 commits

Author SHA1 Message Date
Rémi Denis-Courmont f249fb7830 Fix error return for setsockopt(SO_TIMESTAMPING)
I guess it should be -EINVAL rather than EINVAL. I have not checked
when the bug came in. Perhaps a candidate for -stable?

Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-20 08:23:36 -07:00
Andy Grover cb24405e67 RDS: Refactor end of __conn_create for readability
Add a comment for what's going on. Remove negative logic.
I find this much easier to understand quickly, although
there are a few lines duplicated.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-20 08:03:17 -07:00
Andy Grover ed9e352a35 RDS/IW: Remove dead code
In iWARP code, node_type will always be RNIC

Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-20 08:03:15 -07:00
Andy Grover 404bb72a56 RDS/IW: Remove page_shift variable from iwarp transport
The existing code treated page_shift as a variable, when in fact we
always want to have the fastreg page size be the same as the arch's
page size -- and it is, so this doesn't need to be a variable.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-20 08:03:14 -07:00
Andy Grover a870d62726 RDS/IB: Always use PAGE_SIZE for FMR page size
While FMRs allow significant flexibility in what size of pages they can use,
we really just want FMR pages to match CPU page size. Roland says we can
count on this always being supported, so this simplifies things.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-20 08:03:13 -07:00
Andy Grover edacaeae52 RDS: Fix completion notifications on blocking sockets
Completion or congestion notifications were not being checked
if the socket went to sleep. This patch fixes that.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-20 08:03:12 -07:00
Andy Grover fdf6e6b4af RDS/IB: Drop connection when a fatal QP event is received
Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-20 08:03:11 -07:00
Andy Grover 68cb01c1ba RDS/IB: Disable flow control in sysctl and explain why
Backwards compatibility with rds 3.0 causes protocol-
based flow control to be disabled as a side-effect.

I don't want to pull out FC support from the IB transport
but I do want to document and keep the sysctl consistent
if possible.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-20 08:03:10 -07:00
Andy Grover e11d912a7d RDS/IB: Move tx/rx ring init and refill to later
Since RDS 3.0 and 3.1 have different packet formats,
we need to wait until after protocol negotiation
is complete to layout the rx buffers.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-20 08:03:08 -07:00
Andy Grover 9099707ded RDS: Don't set c_version in __rds_conn_create()
Protocol negotiation is logically a property of the
transports, so rds core need not set it.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-20 08:03:08 -07:00
Andy Grover 597ddd50e1 RDS/IB: Rename byte_len to data_len to enhance readability
Of course len is in bytes. Calling it data_len hopefully indicates
a little better what the variable is actually for.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-20 08:03:07 -07:00
Andy Grover 92c330b9e9 RDS/RDMA: Fix cut-n-paste errors in printks in rdma_transport.c
Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-20 08:03:05 -07:00
Andy Grover 8dacd57e7e RDS/IB: Fix printk to indicate remote IP, not local
Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-20 08:03:04 -07:00
Andy Grover 02a6a2592e RDS/IB: Handle connections using RDS 3.0 wire protocol
The big differences between RDS 3.0 and 3.1 are protocol-level
flow control, and with 3.1 the header is in front of the data. The header
always ends up in the header buffer, and the data goes in the data page.

In 3.0 our "header" is a trailer, and will end up either in the data
page, the header buffer, or split across the two. Since 3.1 is backwards-
compatible with 3.0, we need to continue to support these cases. This
patch does that -- if using RDS 3.0 wire protocol, it will copy the header
from wherever it ended up into the header buffer.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-20 08:03:03 -07:00
Andy Grover 9ddbcfa098 RDS/IB: Improve RDS protocol version checking
RDS on IB uses privdata to do protocol version negotiation. Apparently
the IB stack will return a larger privdata buffer than the struct we were
expecting. Just to be extra-sure, this patch adds some checks in this area.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-20 08:03:02 -07:00
Andy Grover 3ba23ade46 RDS: Set retry_count to 2 and make modifiable via modparam
This will be default cause IB connections to failover faster,
but allow a longer retry count to be used if desired.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-20 08:03:01 -07:00
John Dykstra e547bc1ecc tcp: Use correct peer adr when copying MD5 keys
When the TCP connection handshake completes on the passive
side, a variety of state must be set up in the "child" sock,
including the key if MD5 authentication is being used.  Fix TCP
for both address families to label the key with the peer's
destination address, rather than the address from the listening
sock, which is usually the wildcard.

Reported-by:   Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: John Dykstra <john.dykstra1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-20 07:49:08 -07:00
John Dykstra e3afe7b75e tcp: Fix MD5 signature checking on IPv4 mapped sockets
Fix MD5 signature checking so that an IPv4 active open
to an IPv6 socket can succeed.  In particular, use the
correct address family's signature generation function
for the SYN/ACK.

Reported-by:   Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: John Dykstra <john.dykstra1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-20 07:49:07 -07:00
Jarek Poplawski b902e57352 ipv4: fib_trie: Use tnode_get_child_rcu() and node_parent_rcu() in lookups
While looking for other fib_trie problems reported by Pawel Staszewski
I noticed there are a few uses of tnode_get_child() and node_parent()
in lookups instead of their rcu versions.

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-20 07:39:31 -07:00
Jarek Poplawski be916cdebe ipv4: Fix inflate_threshold_root automatically
During large updates there could be triggered warnings like: "Fix
inflate_threshold_root. Now=25 size=11 bits" if inflate() of the root
node isn't finished in 10 loops. It should be much rarer now, after
changing the threshold from 15 to 25, and a temporary problem, so
this patch tries to handle it automatically using a fix variable to
increase by one inflate threshold for next root resizes (up to the 35
limit, max fix = 10). The fix variable is decreased when root's
inflate() finishes below 7 loops (even if some other, smaller table/
trie is updated -- for simplicity the fix variable is global for now).

Reported-by: Pawel Staszewski <pstaszewski@itcare.pl>
Reported-by: Jorge Boncompte [DTI2] <jorge@dti2.net>
Tested-by: Pawel Staszewski <pstaszewski@itcare.pl>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-20 07:39:29 -07:00
Jarek Poplawski c3059477fc ipv4: Use synchronize_rcu() during trie_rebalance()
During trie_rebalance() we free memory after resizing with call_rcu(),
but large updates, especially with PREEMPT_NONE configs, can cause
memory stresses, so this patch calls synchronize_rcu() in
tnode_free_flush() after each sync_pages to guarantee such freeing
(especially before resizing the root node).

The value of sync_pages = 128 is based on Pawel Staszewski's tests as
the lowest which doesn't hinder updating times. (For testing purposes
there was a sysfs module parameter to change it on demand, but it's
removed until we're sure it could be really useful.)

The patch is based on suggestions by: Paul E. McKenney
<paulmck@linux.vnet.ibm.com>

Reported-by: Pawel Staszewski <pstaszewski@itcare.pl>
Tested-by: Pawel Staszewski <pstaszewski@itcare.pl>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-20 07:39:25 -07:00
Herbert Xu 2e477c9bd2 vlan: Propagate physical MTU changes
When the physical MTU changes we want to ensure that all existing
VLAN device MTUs do not exceed the new underlying MTU.  This patch
adds that propagation.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-20 07:35:37 -07:00
Eric Dumazet c482c56857 udp: cleanups
Pure style cleanups.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-17 09:47:31 -07:00
David S. Miller da8120355e Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/wireless/orinoco/main.c
2009-07-16 20:21:24 -07:00
Eric Dumazet 4dc6dc7162 net: sock_copy() fixes
Commit e912b1142b
(net: sk_prot_alloc() should not blindly overwrite memory)
took care of not zeroing whole new socket at allocation time.

sock_copy() is another spot where we should be very careful.
We should not set refcnt to a non null value, until
we are sure other fields are correctly setup, or
a lockless reader could catch this socket by mistake,
while not fully (re)initialized.

This patch puts sk_node & sk_refcnt to the very beginning
of struct sock to ease sock_copy() & sk_prot_alloc() job.

We add appropriate smp_wmb() before sk_refcnt initializations
to match our RCU requirements (changes to sock keys should
be committed to memory before sk_refcnt setting)

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-16 18:05:26 -07:00
David S. Miller 95aa1fe4ab Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6 2009-07-16 17:34:50 -07:00
Eric Dumazet 941297f443 netfilter: nf_conntrack: nf_conntrack_alloc() fixes
When a slab cache uses SLAB_DESTROY_BY_RCU, we must be careful when allocating
objects, since slab allocator could give a freed object still used by lockless
readers.

In particular, nf_conntrack RCU lookups rely on ct->tuplehash[xxx].hnnode.next
being always valid (ie containing a valid 'nulls' value, or a valid pointer to next
object in hash chain.)

kmem_cache_zalloc() setups object with NULL values, but a NULL value is not valid
for ct->tuplehash[xxx].hnnode.next.

Fix is to call kmem_cache_alloc() and do the zeroing ourself.

As spotted by Patrick, we also need to make sure lookup keys are committed to
memory before setting refcount to 1, or a lockless reader could get a reference
on the old version of the object. Its key re-check could then pass the barrier.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-07-16 14:03:40 +02:00
Patrick McHardy aa6a03eb0a netfilter: xt_osf: fix nf_log_packet() arguments
The first argument is the address family, the second one the hook
number.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-07-16 14:01:54 +02:00
Lothar Waßmann b13bb2e993 net/can: add module alias to can protocol drivers
Add appropriate MODULE_ALIAS() to facilitate autoloading of can protocol drivers

Signed-off-by: Lothar Wassmann <LW@KARO-electronics.de>
Acked-by: Oliver Hartkopp <oliver@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-15 11:20:38 -07:00
Lothar Waßmann f7e5cc0c40 net/can bugfix: use after free bug in can protocol drivers
Fix a use after free bug in can protocol drivers

The release functions of the can protocol drivers lack a call to
sock_orphan() which leads to referencing freed memory under certain
circumstances.

This patch fixes a bug reported here:
https://lists.berlios.de/pipermail/socketcan-users/2009-July/000985.html

Signed-off-by: Lothar Wassmann <LW@KARO-electronics.de>
Acked-by: Oliver Hartkopp <oliver@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-15 11:20:37 -07:00
Johannes Berg 1dacc76d00 net/compat/wext: send different messages to compat tasks
Wireless extensions have the unfortunate problem that events
are multicast netlink messages, and are not independent of
pointer size. Thus, currently 32-bit tasks on 64-bit platforms
cannot properly receive events and fail with all kinds of
strange problems, for instance wpa_supplicant never notices
disassociations, due to the way the 64-bit event looks (to a
32-bit process), the fact that the address is all zeroes is
lost, it thinks instead it is 00:00:00:00:01:00.

The same problem existed with the ioctls, until David Miller
fixed those some time ago in an heroic effort.

A different problem caused by this is that we cannot send the
ASSOCREQIE/ASSOCRESPIE events because sending them causes a
32-bit wpa_supplicant on a 64-bit system to overwrite its
internal information, which is worse than it not getting the
information at all -- so we currently resort to sending a
custom string event that it then parses. This, however, has a
severe size limitation we are frequently hitting with modern
access points; this limitation would can be lifted after this
patch by sending the correct binary, not custom, event.

A similar problem apparently happens for some other netlink
users on x86_64 with 32-bit tasks due to the alignment for
64-bit quantities.

In order to fix these problems, I have implemented a way to
send compat messages to tasks. When sending an event, we send
the non-compat event data together with a compat event data in
skb_shinfo(main_skb)->frag_list. Then, when the event is read
from the socket, the netlink code makes sure to pass out only
the skb that is compatible with the task. This approach was
suggested by David Miller, my original approach required
always sending two skbs but that had various small problems.

To determine whether compat is needed or not, I have used the
MSG_CMSG_COMPAT flag, and adjusted the call path for recv and
recvfrom to include it, even if those calls do not have a cmsg
parameter.

I have not solved one small part of the problem, and I don't
think it is necessary to: if a 32-bit application uses read()
rather than any form of recvmsg() it will still get the wrong
(64-bit) event. However, neither do applications actually do
this, nor would it be a regression.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-15 08:53:39 -07:00
Johannes Berg 4f45b2cd4e wext: optimise, comment and fix event sending
The current function for sending events first allocates the
event stream buffer, and then an skb to copy the event stream
into. This can be done in one go. Also, the current function
leaks kernel data to userspace in a 4 uninitialised bytes,
initialise those explicitly. Finally also add a few useful
comments, as opposed to the current comments.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-15 08:53:37 -07:00
Johannes Berg b333b3d228 wireless extensions: make netns aware
This makes wireless extensions netns aware. The
tasklet sending the events is converted to a work
struct so that we can rtnl_lock() in it.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-15 08:53:32 -07:00
Andreas Jaggi ee686ca919 gre: fix ToS/DiffServ inherit bug
Fixes two bugs:
- ToS/DiffServ inheritance was unintentionally activated when using impair fixed ToS values
- ECN bit was lost during ToS/DiffServ inheritance

Signed-off-by: Andreas Jaggi <aj@open.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-14 09:35:59 -07:00
Tobias Klauser 97fd5bc7f2 net: Rename lookup_neigh_params function
Rename lookup_neigh_params to lookup_neigh_parms as the struct is named
neigh_parms and all other functions dealing with the struct carry
neigh_parms in their names.

Signed-off-by: Tobias Klauser <klto@zhaw.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-13 11:17:49 -07:00
Changli Gao df5ede8258 net: remove redundant sched/ in net/Makefile
Remove redundant sched/ in net/Makefile.

sched/ is contained in previous:
obj-$(CONFIG_NET) += ethernet/ 802/ sched/ netlink/,
so the later
obj-$(CONFIG_NET_SCHED) += sched/
isn't necessary.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
----
Makefile | 1 -
1 file changed, 1 deletion(-)
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-12 20:11:14 -07:00
Sridhar Samudrala ba73542585 udpv6: Handle large incoming UDP/IPv6 packets and support software UFO
- validate and forward GSO UDP/IPv6 packets from untrusted sources.
- do software UFO if the outgoing device doesn't support UFO.

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-12 14:29:29 -07:00
Sridhar Samudrala 7ea2f2c5a6 udpv6: Remove unused skb argument of ipv6_select_ident()
- move ipv6_select_ident() inline function to ipv6.h and remove the unused
  skb argument

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-12 14:29:28 -07:00
Sridhar Samudrala c31d532690 udpv6: Fix gso_size setting in ip6_ufo_append_data
- fix gso_size setting for ipv6 fragment to be a multiple of 8 bytes.

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-12 14:29:26 -07:00
Sridhar Samudrala 493c6be3fe udpv6: Fix HW checksum support for outgoing UFO packets
- add HW checksum support for outgoing large UDP/IPv6 packets destined for
  a UFO enabled device.

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-12 14:29:24 -07:00
Sridhar Samudrala d7ca4cc01f udpv4: Handle large incoming UDP/IPv4 packets and support software UFO.
- validate and forward GSO UDP/IPv4 packets from untrusted sources.
- do software UFO if the outgoing device doesn't support UFO.

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-12 14:29:21 -07:00
Johannes Berg 30ffee8480 net: move and export get_net_ns_by_pid
The function get_net_ns_by_pid(), to get a network
namespace from a pid_t, will be required in cfg80211
as well. Therefore, let's move it to net_namespace.c
and export it. We can't make it a static inline in
the !NETNS case because it needs to verify that the
given pid even exists (and return -ESRCH).

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-12 14:03:28 -07:00
Johannes Berg 134e63756d genetlink: make netns aware
This makes generic netlink network namespace aware. No
generic netlink families except for the controller family
are made namespace aware, they need to be checked one by
one and then set the family->netnsok member to true.

A new function genlmsg_multicast_netns() is introduced to
allow sending a multicast message in a given namespace,
for example when it applies to an object that lives in
that namespace, a new function genlmsg_multicast_allns()
to send a message to all network namespaces (for objects
that do not have an associated netns).

The function genlmsg_multicast() is changed to multicast
the message in just init_net, which is currently correct
for all generic netlink families since they only work in
init_net right now. Some will later want to work in all
net namespaces because they do not care about the netns
at all -- those will have to be converted to use one of
the new functions genlmsg_multicast_allns() or
genlmsg_multicast_netns() whenever they are made netns
aware in some way.

After this patch families can easily decide whether or
not they should be available in all net namespaces. Many
genl families us it for objects not related to networking
and should therefore be available in all namespaces, but
that will have to be done on a per family basis.

Note that this doesn't touch on the checkpoint/restart
problem where network namespaces could be used, genl
families and multicast groups are numbered globally and
I see no easy way of changing that, especially since it
must be possible to multicast to all network namespaces
for those families that do not care about netns.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-12 14:03:27 -07:00
Johannes Berg 11a28d373e net: make namespace iteration possible under RCU
All we need to take care of is using proper RCU list
add/del primitives and inserting a synchronize_rcu()
at one place to make sure the exit notifiers are run
after everybody has stopped iterating the list.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-12 14:03:25 -07:00
Johannes Berg 6c04bb18dd netlink: use call_rcu for netlink_change_ngroups
For the network namespace work in generic netlink I need
to be able to call this function under rcu_read_lock(),
otherwise the locking becomes a nightmare and more locks
would be needed. Instead, just embed a struct rcu_head
(actually a struct listeners_rcu_head that also carries
the pointer to the memory block) into the listeners
memory so we can use call_rcu() instead of synchronising
and then freeing. No rcu_barrier() is needed since this
code cannot be modular.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-12 14:03:24 -07:00
Johannes Berg 487420df79 netlink: remove unused exports
I added those myself in commits b4ff4f04 and 84659eb5,
but I see no reason now why they should be exported,
only generic netlink uses them which cannot be modular.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-12 14:03:19 -07:00
Sascha Hlusiak f2ba025b20 sit: fix regression: do not release skb->dst before xmit
The sit module makes use of skb->dst in it's xmit function, so since
93f154b594 ("net: release dst entry in dev_hard_start_xmit()") sit
tunnels are broken, because the flag IFF_XMIT_DST_RELEASE is not
unset.

This patch unsets that flag for sit devices to fix this
regression.

Signed-off-by: Sascha Hlusiak <contact@saschahlusiak.de>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-11 20:30:52 -07:00
Eric Dumazet e51a67a9c8 net: ip_push_pending_frames() fix
After commit 2b85a34e91
(net: No more expensive sock_hold()/sock_put() on each tx)
we do not take any more references on sk->sk_refcnt on outgoing packets.

I forgot to delete two __sock_put() from ip_push_pending_frames()
and ip6_push_pending_frames().

Reported-by: Emil S Tantilov <emils.tantilov@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Tested-by: Emil S Tantilov <emils.tantilov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-11 20:26:21 -07:00
Eric Dumazet e912b1142b net: sk_prot_alloc() should not blindly overwrite memory
Some sockets use SLAB_DESTROY_BY_RCU, and our RCU code correctness
depends on sk->sk_nulls_node.next being always valid. A NULL
value is not allowed as it might fault a lockless reader.

Current sk_prot_alloc() implementation doesnt respect this hypothesis,
calling kmem_cache_alloc() with __GFP_ZERO. Just call memset() around
the forbidden field.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-11 20:26:19 -07:00
Johannes Berg 0b20633d96 cfg80211: disallow configuring unsupported interfaces
In order to force drivers to advertise their interface
types, don't just disallow creating new interfaces with
unadvertised types but also disallow setting them UP.
Additionally, add some validation on the operations the
drivers support.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:02:33 -04:00
Johannes Berg 79c97e97ae cfg80211: clean up naming once and for all
We've named the registered devices 'drv' sometimes,
thinking of "driver", which is not what it is, it's
the internal representation of a wiphy, i.e. a
device. Let's clean up the naming once and and use
'rdev' aka 'registered device' everywhere.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:02:33 -04:00
Johannes Berg 667503ddcb cfg80211: fix locking
Over time, a lot of locking issues have crept into
the smarts of cfg80211, so e.g. scan completion can
race against a new scan, IBSS join can race against
leaving an IBSS, etc.

Introduce a new per-interface lock that protects
most of the per-interface data that we need to keep
track of, and sprinkle assertions about that lock
everywhere. Some things now need to be offloaded to
work structs so that we don't require being able to
sleep in functions the drivers call. The exception
to that are the MLME callbacks (rx_auth etc.) that
currently only mac80211 calls because it was easier
to do that there instead of in cfg80211, and future
drivers implementing those calls will, if they ever
exist, probably need to use a similar scheme like
mac80211 anyway...

In order to be able to handle _deauth and _disassoc
properly, introduce a cookie passed to it that will
determine locking requirements.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:02:32 -04:00
Johannes Berg 4f5dadcebb cfg80211: fix MFP bug, sparse warnings
sparse warns about a number of things, and one of them
(use_mfp shadowed variable) actually is a bug, fix all
of them.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:02:32 -04:00
Johannes Berg 4d0c8aead3 cfg80211: properly name driver locking
Currently we call that cfg80211_put_dev(), but that is
misleading. With the new convention of using 'rdev' for
registered_device variables, also call that function
cfg80211_unlock_rdev().

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:02:32 -04:00
Johannes Berg c1e6fb1aad cfg80211: warn again on spurious deauth
The original code in mac80211 could send a deauth
frame under certain circumstances even if nothing
had ever requested an authentication. This has been
fixed with the rework there, so cfg80211 can now
warn again about spurious events to catch possible
future drivers or mac80211 regressions.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:02:31 -04:00
Johannes Berg cb0b4beb93 cfg80211: mlme API must be able to sleep
After the mac80211 mlme cleanup, we can require that
the MLME functions in cfg80211 can sleep. This will
simplify future work in cfg80211 a lot.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:02:31 -04:00
Johannes Berg 7848547561 cfg80211: fix netdev down problem
We shouldn't be looking at the ssid_len for non-IBSS,
and for IBSS we should also return an error on trying
to leave an IBSS while not in or joining an IBSS.

This fixes an issue where we wouldn't disconnect() on
an interface being taken down since there's no SSID
configured this way.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:02:31 -04:00
Johannes Berg c9cf01226e mac80211: refactor the WEP code to be directly usable
The new key work for cfg80211 will only give us the WEP
key for shared auth to do that authentication, and not
via the regular key settings, so we need to be able to
encrypt a single frame in software, and that without a
key struct. Thus, refactor the WEP code to not require
a key structure but use the key, len and idx directly.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:02:31 -04:00
Johannes Berg 77fdaa12ce mac80211: rework MLME for multiple authentications
Sit tight. This shakes up the world as you know
it. Let go of your spaghetti tongs, they will no
longer be required, the horrible statemachine in
net/mac80211/mlme.c is no more...

With the cfg80211 SME mac80211 now has much less
to keep track of, but, on the other hand, for FT
it needs to be able to keep track of at least one
authentication being in progress while associated.
So convert from a single state machine to having
small ones for all the different things we need to
do. For real FT it will still need work wrt. PS,
but this should be a good step.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:02:30 -04:00
Johannes Berg a7c1cfc961 mac80211: remove dead code from mlme
The ap_capab and last_probe struct members are unused.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:02:30 -04:00
Johannes Berg 3e5d7649a6 cfg80211: let SME control reassociation vs. association
Since we don't really know that well in the kernel,
let's let the SME control whether it wants to use
reassociation or not, by allowing it to give the
previous BSSID in the associate() parameters.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:02:30 -04:00
Johannes Berg 1be491fca1 rfkill: prep for rfkill API changes
We've designed the /dev/rfkill API in a way that we
can increase the event struct by adding members at
the end, should it become necessary. To validate the
events, userspace and the kernel need to have the
proper event size to check for -- when reading from
the other end they need to verify that it's at least
version 1 of the event API, with the current struct
size, so define a constant for that and make the
code a little more 'future proof'.

Not that I expect that we'll have to change the event
size any time soon, but it's better to write the code
in a way that lends itself to extending.

Due to the current size of the event struct, the code
is currently equivalent, but should the event struct
ever need to be increased the new code might not need
changing.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:02:29 -04:00
Samuel Ortiz 6c230c0270 cfg80211: check for current_bss from giwrate
When connecting to an ESSID manually, we may not set the BSSID, and thus
wdev->wext.connect.bssid will be NULL.
wdev->current_bss is always updated when a connection is established so we
should check it first.

Signed-off-by: Samuel Ortiz <samuel.ortiz@intel.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:02:27 -04:00
Helmut Schaa 96f7e73938 mac80211: shorten the passive dwell time for sw scans
mac80211's software scan implementation uses a passive dwell time of
(HZ / 5) which means we stay 200ms on each passive channel. Compared
to iwlwifi's hw scan and the old ipw* drivers which use values around
120ms this is quite long.

Reducing the passive dwell time from 200ms to 125ms should save us
something around a second on cards capable of 11a and we should still be
able to catch beacons from most access points (assuming a ~100ms beacon
interval).

Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:02:27 -04:00
Johannes Berg 9834c079d1 cfg80211: fix giwrange
"cfg80211: Advertise ciphers via WE according to driver capability"
unfortunately broke iwrange -- it used the variable c
that needs to be 0 for the channel list.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:01:54 -04:00
Johannes Berg 3dc27d25f2 nl80211: limit to one pairwise cipher for associate()
In this case, only one cipher makes sense, unlike for
connect() where it may be possible to have the card or
driver select.

No changes to mac80211 due to the way the structs are
laid out -- but the loop in net/mac80211/cfg.c will
degrade to just zero or one passes.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:01:54 -04:00
Johannes Berg 0a9b5e1795 cfg80211: refuse authenticating to same BSSID twice
It is possible that there are different BSS structs with
the same BSSID, but we cannot authenticate with multiple
of them them because we need the BSSID to be unique for
deauthenticating/disassociating.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:01:54 -04:00
Johannes Berg 19957bb399 cfg80211: keep track of BSSes
In order to avoid problems with BSS structs going away
while they're in use, I've long wanted to make cfg80211
keep track of them. Without the SME, that wasn't doable
but now that we have the SME we can do this too. It can
keep track of up to four separate authentications and
one association, regardless of whether it's controlled
by the cfg80211 SME or the userspace SME.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:01:53 -04:00
Johannes Berg 517357c685 cfg80211: assimilate and export ieee80211_bss_get_ie
This function from mac80211 seems generally useful, and
I will need it in cfg80211 soon.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:01:53 -04:00
Johannes Berg 0eb14647fc cfg80211: reset auth algorithm
When the interface is brought down, we need to
reset the auth algorithm because wpa_supplicant
doesn't reset it, and then we fail to use shared
key auth when required later.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:01:53 -04:00
Johannes Berg e45cd82ace cfg80211: send events for userspace SME
When the userspace SME is in control, we are currently not sending
events, but this means that any userspace applications using wext
or nl80211 to receive events will not know what's going on unless
they can also interpret the nl80211 assoc event. Since we have all
the required code, let the SME follow events from the userspace
SME, this even means that you will be refused to connect() while
the userspace SME is in control and connected.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:01:53 -04:00
Johannes Berg ab1faead50 mac80211: remove dead code, clean up
With mac80211 now always controlled by an external SME,
a lot of code is dead -- SSID, BSSID, channel selection
is always done externally, etc. Additionally, rename
IEEE80211_STA_TKIP_WEP_USED to IEEE80211_STA_DISABLE_11N
and clean up the code a bit.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:01:53 -04:00
Johannes Berg 6dc1cb0319 mac80211: remove auth algorithm retry
The automatic auth algorithm issue is now solved in
cfg80211, so mac80211 no longer needs code to try
different algorithms -- just using whatever cfg80211
asked for is good.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:01:52 -04:00
Johannes Berg ac00326e9d mac80211: re-add HT disabling
The IEEE80211_STA_TKIP_WEP_USED flag is used internally to
disable HT when WEP or TKIP are used. Now that cfg80211 is
giving us the required information, we can set the flag
appropriately again.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:01:52 -04:00
Johannes Berg 8990646d2f cfg80211: implement get_wireless_stats
By dropping the noise reporting, we can implement
wireless stats in cfg80211. We also make the
handler return NULL if we have no information,
which is possible thanks to the recent wext change.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:01:52 -04:00
Johannes Berg 9930380f0b cfg80211: implement IWRATE
For now, let's implement that using a very hackish way:
simply mirror the wext API in the cfg80211 API. This
will have to be changed later when we implement proper
bitrate API.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:01:52 -04:00
Johannes Berg ab737a4f7d cfg80211: implement IWAP for WDS
This implements siocsiwap/giwap for WDS mode.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:01:52 -04:00
Johannes Berg bc92afd920 cfg80211: implement iwpower
Just on/off and timeout, and with a hacky cfg80211 method
until we figure out what we want, though this is probably
sufficient as we want to use pm_qos for wifi everywhere.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:01:51 -04:00
Johannes Berg f21293549f cfg80211: managed mode wext compatibility
This adds code to make it possible to use the cfg80211
connect() API with wireless extensions, and because the
previous patch added emulation of that API with auth()
and assoc(), by extension also supports wext on that.
At the same time, removes code from mac80211 for wext,
but doesn't yet clean up mac80211's mlme code more.

Signed-off-by: Samuel Ortiz <samuel.ortiz@intel.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:01:51 -04:00
Johannes Berg 6829c878ec cfg80211: emulate connect with auth/assoc
This adds code to cfg80211 so that drivers (mac80211 right
now) that don't implement connect but rather auth/assoc can
still be used with the nl80211 connect command. This will
also be necessary for the wext compat code.

Signed-off-by: Samuel Ortiz <samuel.ortiz@intel.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:01:51 -04:00
Samuel Ortiz b23aa676ab cfg80211: connect/disconnect API
This patch introduces the cfg80211 connect/disconnect API.
The goal here is to run the AUTH and ASSOC steps in one call.
This is needed for some fullmac cards that run both steps
directly from the target, after the host driver sends a
connect command.

Additionally, all the new crypto parameters for connect()
are now also valid for associate() -- although associate
requires the IEs to be used, the information can be useful
for drivers and should be given.

Signed-off-by: Samuel Ortiz <samuel.ortiz@intel.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:01:51 -04:00
Johannes Berg 3f65b24536 mac80211: remove an unused function declaration
The ieee80211_scan_results function hasn't existed for a
long time now, so its declaration should be removed as
well.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:01:50 -04:00
Johannes Berg aff89a9b90 cfg80211: introduce nl80211 testmode command
This introduces a new NL80211_CMD_TESTMODE for testing
and calibration use with nl80211. There's no multiplexing
like like iwpriv had, and the command is not available by
default, it needs to be explicitly enabled in Kconfig and
shouldn't be enabled in most kernels.

The command requires a wiphy index or interface index to
identify the device to operate on, and the new TESTDATA
attribute. There also is API for sending replies to the
command, and testmode multicast messages (on a testmode
multicast group).

I've also updated mac80211 to be able to pass through the
command to the driver, since it itself doesn't implement
the testmode command.

Additionally, to give people an idea of how to use the
command, I've added a little code to hwsim that makes use
of the new command to set the powersave mode, this is
currently done via debugfs and should remain there, and
the testmode command only serves as an example of how to
use this best -- with nested netlink attributes in the
TESTDATA attribute. A hwsim testmode tool can be found at
http://git.sipsolutions.net/hwsim.git/. This tool is BSD
licensed so people can easily use it as a basis for their
own internal fabrication and validation tools.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:01:50 -04:00
Johannes Berg 5121ea0481 wext: constify extra argument to wireless_send_event
This is never changed by the function, so can be marked const.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:01:49 -04:00
Johannes Berg 0575606b08 mac80211: tell SME about real auth state
When the auth algorithm is rejected, but we don't have
another one to try, we will eventually retry but that
isn't useful -- we'll then do it again and again until
we eventually give up. Instead, we should let the SME
know and go into disabled state. The same applies for
situations where the AP rejects with any other status
code.

Additionally, when trying the next auth algorithm, we
should reset the auth_tries so that just a single lost
frame doesn't lead to us giving up on the third auth
algorithm.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:01:49 -04:00
Johannes Berg 7ebbe6bd51 cfg80211: remove wireless_dev->bssid
This variable isn't necessary -- the wext code keeps
track of the BSSID itself, and otherwise we have
current_bss.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:01:49 -04:00
Johannes Berg e6d6e3420d cfg80211: use proper allocation flags
Instead of hardcoding GFP_ATOMIC everywhere, add a
new function parameter that gets the flags from the
caller. Obviously then I need to update all callers
(all of them in mac80211), and it turns out that now
it's ok to use GFP_KERNEL in almost all places.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:01:49 -04:00
Johannes Berg dad8233021 nl80211: clean up function definitions
I don't like the 'extern' keyword much when it's not
necessary, it makes lines rather long.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:01:48 -04:00
Johannes Berg 2a783c136b cfg80211: move break statement to correct place
Move a break statement to the correct place _after_ the
#endif, otherwise w/o WIRELESS_EXT things break badly.
Also, while touching this code, do a cleanup and assign
dev->ieee80211_ptr to a new variable.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:01:48 -04:00
Johannes Berg 898324025f wext: default to y
The way I initially thought we could do wireless extensions
is by making all the compat code in cfg80211 be independent
of CONFIG_WIRELESS_EXT, but this is turning out to not be
feasible. Therefore, fix the Kconfig help text and make the
option default to yes, so people won't get a nasty surprise
when mac80211 will get rid of its 'select WIRELESS_EXT' any
time now.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:01:48 -04:00
Johannes Berg d3cebbdced mac80211: fix todo lock
The key todo lock can be taken from different locks
that require it to be _bh to avoid lock inversion
due to (soft)irqs.

This should fix the two problems reported by Bob and
Gabor:
http://mid.gmane.org/20090619113049.GB18956@hash.localnet
http://mid.gmane.org/4A3FA376.8020307@openwrt.org

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: Bob Copeland <me@bobcopeland.com>
Cc: Gabor Juhos <juhosg@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:01:48 -04:00
Johannes Berg df2b35b65b wext: allow returning NULL stats
Currently, wext drivers cannot return NULL for stats even though
that would make the ioctl return -EOPNOTSUPP because that would
mean they are no longer listed in /proc/net/wireless. This patch
changes the wext core's behaviour to list them if they have any
wireless_handlers, but only show their stats when available, so
that drivers can start returning NULL if stats are currently not
available, reducing confusion for e.g. IBSS.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:01:48 -04:00
Johannes Berg f58d4ed98b cfg80211: send wext MLME-MICHAELMICFAILURE.indication
Instead of having mac80211 do it itself.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:01:47 -04:00
David Kilroy 27bea66c22 cfg80211: infer WPA and WPA2 support from TKIP and CCMP
Signed-off-by: David Kilroy <kilroyd@googlemail.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:01:42 -04:00
David Kilroy 2ab658f9ce cfg80211: set WE encoding size based on available ciphers
Only set the sizes for WEP40 and WEP104.

Signed-off-by: David Kilroy <kilroyd@googlemail.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:01:42 -04:00
David Kilroy 51cd4aabd0 cfg80211: allow drivers that can't scan for specific ssids
Signed-off-by: David Kilroy <kilroyd@googlemail.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:01:42 -04:00
David Kilroy 3daf097594 cfg80211: Advertise ciphers via WE according to driver capability
Signed-off-by: David Kilroy <kilroyd@googlemail.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 15:01:42 -04:00
Johannes Berg 83f5e2cf79 cfg80211: prohibit scanning the same channel more than once
It isn't very useful to scan the same channel more than once
during a given scan, and some hardware (notably iwlwifi) can
only scan a limited number of channels at a time. To prevent
any overflows, simply disallow scanning any channel multiple
times in a given scan command. This is a small change in the
userspace ABI, but the only user, wpa_supplicant, never asks
for a scan with the same frequency listed twice.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 14:57:54 -04:00
Johannes Berg 386aa23dd5 mac80211: improve per-sta debugfs
We had code for a number of files, that we didn't publish
in debugfs, fix that. Also make the agg_status file layout
more readable and add more information to it.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 14:57:54 -04:00
Johannes Berg f1d58c2521 mac80211: push rx status into skb->cb
Within mac80211, we often need to copy the rx status into
skb->cb. This is wasteful, as drivers could be building it
in there to start with. This patch changes the API so that
drivers are expected to pass the RX status in skb->cb, now
accessible as IEEE80211_SKB_RXCB(skb). It also updates all
drivers to pass the rx status in there, but only by making
them memcpy() it into place before the call to the receive
function (ieee80211_rx(_irqsafe)). Each driver can now be
optimised on its own schedule.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 14:57:54 -04:00
Johannes Berg a538e2d5a3 cfg80211: issue netlink notification when scan starts
To ease multiple apps working together smoothly,
send a notification when a scan is started.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 14:57:53 -04:00
Johannes Berg e36d56b648 cfg80211: pass netdev to change_virtual_intf
If there was a reason I'm passing the ifidx I cannot
remember it any more and don't see one now, so let's
just pass the pointer itself.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-10 14:57:38 -04:00
David S. Miller e5a8a896f5 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2009-07-09 20:18:24 -07:00
Jiri Olsa a57de0b433 net: adding memory barrier to the poll and receive callbacks
Adding memory barrier after the poll_wait function, paired with
receive callbacks. Adding fuctions sock_poll_wait and sk_has_sleeper
to wrap the memory barrier.

Without the memory barrier, following race can happen.
The race fires, when following code paths meet, and the tp->rcv_nxt
and __add_wait_queue updates stay in CPU caches.

CPU1                         CPU2

sys_select                   receive packet
  ...                        ...
  __add_wait_queue           update tp->rcv_nxt
  ...                        ...
  tp->rcv_nxt check          sock_def_readable
  ...                        {
  schedule                      ...
                                if (sk->sk_sleep && waitqueue_active(sk->sk_sleep))
                                        wake_up_interruptible(sk->sk_sleep)
                                ...
                             }

If there was no cache the code would work ok, since the wait_queue and
rcv_nxt are opposit to each other.

Meaning that once tp->rcv_nxt is updated by CPU2, the CPU1 either already
passed the tp->rcv_nxt check and sleeps, or will get the new value for
tp->rcv_nxt and will return with new data mask.
In both cases the process (CPU1) is being added to the wait queue, so the
waitqueue_active (CPU2) call cannot miss and will wake up CPU1.

The bad case is when the __add_wait_queue changes done by CPU1 stay in its
cache, and so does the tp->rcv_nxt update on CPU2 side.  The CPU1 will then
endup calling schedule and sleep forever if there are no more data on the
socket.

Calls to poll_wait in following modules were ommited:
	net/bluetooth/af_bluetooth.c
	net/irda/af_irda.c
	net/irda/irnet/irnet_ppp.c
	net/mac80211/rc80211_pid_debugfs.c
	net/phonet/socket.c
	net/rds/af_rds.c
	net/rfkill/core.c
	net/sunrpc/cache.c
	net/sunrpc/rpc_pipe.c
	net/tipc/socket.c

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-09 17:06:57 -07:00
Anton Vorontsov 1b614fb9a0 netpoll: Fix carrier detection for drivers that are using phylib
Using early netconsole and gianfar driver this error pops up:

  netconsole: timeout waiting for carrier

It appears that net/core/netpoll.c:netpoll_setup() is using
cond_resched() in a loop waiting for a carrier.

The thing is that cond_resched() is a no-op when system_state !=
SYSTEM_RUNNING, and so drivers/net/phy/phy.c's state_queue is never
scheduled, therefore link detection doesn't work.

I belive that the main problem is in cond_resched()[1], but despite
how the cond_resched() story ends, it might be a good idea to call
msleep(1) instead of cond_resched(), as suggested by Andrew Morton.

[1] http://lkml.org/lkml/2009/7/7/463

Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-08 20:09:44 -07:00
David S. Miller d2daeabf62 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2009-07-08 18:13:13 -07:00
Anton Vorontsov bff38771e1 netpoll: Introduce netpoll_carrier_timeout kernel option
Some PHYs require longer timeouts for carrier detection, and
auto-negotiation process may take indefinite amount of time.

It may be inconvenient to force longer timeouts for sane PHYs,
so let's introduce a kernel command line option.

Since we're using module_param(), the option also can be
changed in runtime.

Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-08 11:10:56 -07:00
Jarek Poplawski 345aa03120 ipv4: Fix fib_trie rebalancing, part 4 (root thresholds)
Pawel Staszewski wrote:
<blockquote>
Some time ago i report this:
http://bugzilla.kernel.org/show_bug.cgi?id=6648

and now with 2.6.29 / 2.6.29.1 / 2.6.29.3 and 2.6.30 it back
dmesg output:
oprofile: using NMI interrupt.
Fix inflate_threshold_root. Now=15 size=11 bits
...
Fix inflate_threshold_root. Now=15 size=11 bits

cat /proc/net/fib_triestat
Basic info: size of leaf: 40 bytes, size of tnode: 56 bytes.
Main:
        Aver depth:     2.28
        Max depth:      6
        Leaves:         276539
        Prefixes:       289922
        Internal nodes: 66762
          1: 35046  2: 13824  3: 9508  4: 4897  5: 2331  6: 1149  7: 5
9: 1  18: 1
        Pointers: 691228
Null ptrs: 347928
Total size: 35709  kB
</blockquote>

It seems, the current threshold for root resizing is too aggressive,
and it causes misleading warnings during big updates, but it might be
also responsible for memory problems, especially with non-preempt
configs, when RCU freeing is delayed long after call_rcu.

It should be also mentioned that because of non-atomic changes during
resizing/rebalancing the current lookup algorithm can miss valid leaves
so it's additional argument to shorten these activities even at a cost
of a minimally longer searching.

This patch restores values before the patch "[IPV4]: fib_trie root
node settings", commit: 965ffea43d from
v2.6.22.

Pawel's report:
<blockquote>
I dont see any big change of (cpu load or faster/slower
routing/propagating routes from bgpd or something else) - in avg there
is from 2% to 3% more of CPU load i dont know why but it is - i change
from "preempt" to "no preempt" 3 times and check this my "mpstat -P ALL
1 30"
always avg cpu load was from 2 to 3% more compared to "no preempt"
[...]
cat /proc/net/fib_triestat
Basic info: size of leaf: 20 bytes, size of tnode: 36 bytes.
Main:
        Aver depth:     2.44
        Max depth:      6
        Leaves:         277814
        Prefixes:       291306
        Internal nodes: 66420
          1: 32737  2: 14850  3: 10332  4: 4871  5: 2313  6: 942  7: 371  8: 3  17: 1
        Pointers: 599098
Null ptrs: 254865
Total size: 18067  kB
</blockquote>

According to this and other similar reports average depth is slightly
increased (~0.2), and root nodes are shorter (log 17 vs. 18), but
there is no visible performance decrease. So, until memory handling is
improved or added parameters for changing this individually, this
patch resets to safer defaults.

Reported-by: Pawel Staszewski <pstaszewski@itcare.pl>
Reported-by: Jorge Boncompte [DTI2] <jorge@dti2.net>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Tested-by: Pawel Staszewski <pstaszewski@itcare.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-08 10:46:45 -07:00
Luciano Coelho 3938b45c1c mac80211: minstrel: avoid accessing negative indices in rix_to_ndx()
If rix is not found in mi->r[], i will become -1 after the loop.  This value
is eventually used to access arrays, so we were accessing arrays with a
negative index, which is obviously not what we want to do.  This patch fixes
this potential problem.

Signed-off-by: Luciano Coelho <luciano.coelho@nokia.com>
Acked-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-07 12:55:28 -04:00
Johannes Berg 2dce4c2b5f cfg80211: fix refcount leak
The code in cfg80211's cfg80211_bss_update erroneously
grabs a reference to the BSS, which means that it will
never be freed.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: stable@kernel.org [2.6.29, 2.6.30]
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-07 12:55:28 -04:00
Andrey Yurovsky 59615b5f9d mac80211: fix allocation in mesh_queue_preq
We allocate a PREQ queue node in mesh_queue_preq, however the allocation
may cause us to sleep.  Use GFP_ATOMIC to prevent this.

[ 1869.126498] BUG: scheduling while atomic: ping/1859/0x10000100
[ 1869.127164] Modules linked in: ath5k mac80211 ath
[ 1869.128310] Pid: 1859, comm: ping Not tainted 2.6.30-wl #1
[ 1869.128754] Call Trace:
[ 1869.129293]  [<c1023a2b>] __schedule_bug+0x48/0x4d
[ 1869.129866]  [<c13b5533>] __schedule+0x77/0x67a
[ 1869.130544]  [<c1026f2e>] ? release_console_sem+0x17d/0x185
[ 1869.131568]  [<c807cf47>] ? mesh_queue_preq+0x2b/0x165 [mac80211]
[ 1869.132318]  [<c13b5b3e>] schedule+0x8/0x1f
[ 1869.132807]  [<c1023c12>] __cond_resched+0x16/0x2f
[ 1869.133478]  [<c13b5bf0>] _cond_resched+0x27/0x32
[ 1869.134191]  [<c108a370>] kmem_cache_alloc+0x1c/0xcf
[ 1869.134714]  [<c10273ae>] ? printk+0x15/0x17
[ 1869.135670]  [<c807cf47>] mesh_queue_preq+0x2b/0x165 [mac80211]
[ 1869.136731]  [<c807d1f8>] mesh_nexthop_lookup+0xee/0x12d [mac80211]
[ 1869.138130]  [<c807417e>] ieee80211_xmit+0xe6/0x2b2 [mac80211]
[ 1869.138935]  [<c80be46d>] ? ath5k_hw_setup_rx_desc+0x0/0x66 [ath5k]
[ 1869.139831]  [<c80c97bc>] ? ath5k_tasklet_rx+0xba/0x506 [ath5k]
[ 1869.140863]  [<c8075191>] ieee80211_subif_start_xmit+0x6c9/0x6e4
[mac80211]
[ 1869.141665]  [<c105cf1c>] ? handle_level_irq+0x78/0x9d
[ 1869.142390]  [<c12e3f93>] dev_hard_start_xmit+0x168/0x1c7
[ 1869.143092]  [<c12f1f17>] __qdisc_run+0xe1/0x1b7
[ 1869.143612]  [<c12e25ff>] qdisc_run+0x18/0x1a
[ 1869.144248]  [<c12e62f4>] dev_queue_xmit+0x16a/0x25a
[ 1869.144785]  [<c13b6dcc>] ? _read_unlock_bh+0xe/0x10
[ 1869.145465]  [<c12eacdb>] neigh_resolve_output+0x19c/0x1c7
[ 1869.146182]  [<c130e2da>] ? ip_finish_output+0x0/0x51
[ 1869.146697]  [<c130e2a0>] ip_finish_output2+0x182/0x1bc
[ 1869.147358]  [<c130e327>] ip_finish_output+0x4d/0x51
[ 1869.147863]  [<c130e9d5>] ip_output+0x80/0x85
[ 1869.148515]  [<c130cc49>] dst_output+0x9/0xb
[ 1869.149141]  [<c130dec6>] ip_local_out+0x17/0x1a
[ 1869.149632]  [<c130e0bc>] ip_push_pending_frames+0x1f3/0x255
[ 1869.150343]  [<c13247ff>] raw_sendmsg+0x5e6/0x667
[ 1869.150883]  [<c1033c55>] ? insert_work+0x6a/0x73
[ 1869.151834]  [<c8071e00>] ?
ieee80211_invoke_rx_handlers+0x17da/0x1ae8 [mac80211]
[ 1869.152630]  [<c132bd68>] inet_sendmsg+0x3b/0x48
[ 1869.153232]  [<c12d7deb>] __sock_sendmsg+0x45/0x4e
[ 1869.153740]  [<c12d8537>] sock_sendmsg+0xb8/0xce
[ 1869.154519]  [<c80be46d>] ? ath5k_hw_setup_rx_desc+0x0/0x66 [ath5k]
[ 1869.155289]  [<c1036b25>] ? autoremove_wake_function+0x0/0x30
[ 1869.155859]  [<c115992b>] ? __copy_from_user_ll+0x11/0xce
[ 1869.156573]  [<c1159d99>] ? copy_from_user+0x31/0x54
[ 1869.157235]  [<c12df646>] ? verify_iovec+0x40/0x6e
[ 1869.157778]  [<c12d869a>] sys_sendmsg+0x14d/0x1a5
[ 1869.158714]  [<c8072c40>] ? __ieee80211_rx+0x49e/0x4ee [mac80211]
[ 1869.159641]  [<c80c83fe>] ? ath5k_rxbuf_setup+0x6d/0x8d [ath5k]
[ 1869.160543]  [<c80be46d>] ? ath5k_hw_setup_rx_desc+0x0/0x66 [ath5k]
[ 1869.161434]  [<c80beba4>] ? ath5k_hw_get_rxdp+0xe/0x10 [ath5k]
[ 1869.162319]  [<c80c97bc>] ? ath5k_tasklet_rx+0xba/0x506 [ath5k]
[ 1869.163063]  [<c1005627>] ? enable_8259A_irq+0x40/0x43
[ 1869.163594]  [<c101edb8>] ? __dequeue_entity+0x23/0x27
[ 1869.164793]  [<c100187a>] ? __switch_to+0x2b/0x105
[ 1869.165442]  [<c1021d5f>] ? finish_task_switch+0x5b/0x74
[ 1869.166129]  [<c12d963a>] sys_socketcall+0x14b/0x17b
[ 1869.166612]  [<c1002b95>] syscall_call+0x7/0xb

Signed-off-by: Andrey Yurovsky <andrey@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-07 12:55:27 -04:00
Jiri Slaby 1f5fc70a25 Wireless: nl80211, fix lock imbalance
Don't forget to unlock cfg80211_mutex in one fail path of
nl80211_set_wiphy.

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-07 12:55:25 -04:00
Mark Smith 482d804cb4 econet: use NET_RX_SUCCESS instead of magic number 0 for econet_rcv successful return
Signed-off-by: Mark Smith <markzzzsmith@yahoo.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-06 18:07:59 -07:00
Mark Smith 5c91face51 ipv6: correct return on ipv6_rcv() packet drop
The routine ipv6_rcv() uses magic number 0 for a return when it drops a
packet. This corresponds to NET_RX_SUCCESS, which is obviously
incorrect. Correct this by using NET_RX_DROP instead.

ps. It isn't exactly clear who the IPv6 maintainers are, apologies if
I've missed any.

Signed-off-by: Mark Smith <markzzzsmith@yahoo.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-06 18:07:55 -07:00
Wei Yongjun 1bc4ee4088 sctp: fix warning at inet_sock_destruct() while release sctp socket
Commit 'net: Move rx skb_orphan call to where needed' broken sctp protocol
with warning at inet_sock_destruct(). Actually, sctp can do this right with
sctp_sock_rfree_frag() and sctp_skb_set_owner_r_frag() pair.

    sctp_sock_rfree_frag(skb);
    sctp_skb_set_owner_r_frag(skb, newsk);

This patch not revert the commit d55d87fdff,
instead remove the sctp_sock_rfree_frag() function.

------------[ cut here ]------------
WARNING: at net/ipv4/af_inet.c:151 inet_sock_destruct+0xe0/0x142()
Modules linked in: sctp ipv6 dm_mirror dm_region_hash dm_log dm_multipath
scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd [last unloaded: scsi_wait_scan]
Pid: 1808, comm: sctp_test Not tainted 2.6.31-rc2 #40
Call Trace:
 [<c042dd06>] warn_slowpath_common+0x6a/0x81
 [<c064a39a>] ? inet_sock_destruct+0xe0/0x142
 [<c042dd2f>] warn_slowpath_null+0x12/0x15
 [<c064a39a>] inet_sock_destruct+0xe0/0x142
 [<c05fde44>] __sk_free+0x19/0xcc
 [<c05fdf50>] sk_free+0x18/0x1a
 [<ca0d14ad>] sctp_close+0x192/0x1a1 [sctp]
 [<c0649f7f>] inet_release+0x47/0x4d
 [<c05fba4d>] sock_release+0x19/0x5e
 [<c05fbab3>] sock_close+0x21/0x25
 [<c049c31b>] __fput+0xde/0x189
 [<c049c3de>] fput+0x18/0x1a
 [<c049988f>] filp_close+0x56/0x60
 [<c042f422>] put_files_struct+0x5d/0xa1
 [<c042f49f>] exit_files+0x39/0x3d
 [<c043086a>] do_exit+0x1a5/0x5dd
 [<c04a86c2>] ? d_kill+0x35/0x3b
 [<c0438fa4>] ? dequeue_signal+0xa6/0x115
 [<c0430d05>] do_group_exit+0x63/0x8a
 [<c0439504>] get_signal_to_deliver+0x2e1/0x2f9
 [<c0401d9e>] do_notify_resume+0x7c/0x6b5
 [<c043f601>] ? autoremove_wake_function+0x0/0x34
 [<c04a864e>] ? __d_free+0x3d/0x40
 [<c04a867b>] ? d_free+0x2a/0x3c
 [<c049ba7e>] ? vfs_write+0x103/0x117
 [<c05fc8fa>] ? sys_socketcall+0x178/0x182
 [<c0402a56>] work_notifysig+0x13/0x19
---[ end trace 9db92c463e789fba ]---

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-06 12:47:08 -07:00
Patrick McHardy ec634fe328 net: convert remaining non-symbolic return values in ndo_start_xmit() functions
This patch converts the remaining occurences of raw return values to their
symbolic counterparts in ndo_start_xmit() functions that were missed by the
previous automatic conversion.

Additionally code that assumed the symbolic value of NETDEV_TX_OK to be zero
is changed to explicitly use NETDEV_TX_OK.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-05 19:23:38 -07:00
Cyrill Gorcunov 1490fd8947 net, bridge: align br_nf_ops assignment
No functional change -- just for easier reading.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-05 19:16:14 -07:00
oscar.medina@motorola.com 6650613d33 tipc: Add socket options to get number of queued messages
This patch allows a TIPC application to determine the number of messages
currently waiting in a socket's receive queue (TIPC_SOCK_RECVQ_DEPTH) or
in all TIPC socket receive queues (TIPC_NODE_RECVQ_DEPTH).

Signed-off-by: Oscar Medina <oscar.medina@motorola.com>
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-05 19:16:11 -07:00
Patrick McHardy 6ed106549d net: use NETDEV_TX_OK instead of 0 in ndo_start_xmit() functions
This patch is the result of an automatic spatch transformation to convert
all ndo_start_xmit() return values of 0 to NETDEV_TX_OK.

Some occurences are missed by the automatic conversion, those will be
handled in a seperate patch.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-05 19:16:04 -07:00
Florian Westphal 0e8635a8e1 net: remove NET_RX_BAD and NET_RX_CN* defines
almost no users in the tree; and the few that use them treat them
like NET_RX_DROP.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-05 19:15:35 -07:00
Stephane Contri 1ded3f59f3 dsa: fix 88e6xxx statistics counter snapshotting
The bit that tells us whether a statistics counter snapshot operation
has completed is located in the GLOBAL register block, not in the
GLOBAL2 register block, so fix up mv88e6xxx_stats_wait() to poll the
right register address.

Signed-off-by: Stephane Contri <Stephane.Contri@grassvalley.com>
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Cc: stable@kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-05 18:03:35 -07:00
Brian Haley a1ed05263b IPv6: preferred lifetime of address not getting updated
There's a bug in addrconf_prefix_rcv() where it won't update the
preferred lifetime of an IPv6 address if the current valid lifetime
of the address is less than 2 hours (the minimum value in the RA).

For example, If I send a router advertisement with a prefix that
has valid lifetime = preferred lifetime = 2 hours we'll build
this address:

3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000
    inet6 2001:1890:1109:a20:217:8ff:fe7d:4718/64 scope global dynamic
       valid_lft 7175sec preferred_lft 7175sec

If I then send the same prefix with valid lifetime = preferred
lifetime = 0 it will be ignored since the minimum valid lifetime
is 2 hours:

3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000
    inet6 2001:1890:1109:a20:217:8ff:fe7d:4718/64 scope global dynamic
       valid_lft 7161sec preferred_lft 7161sec

But according to RFC 4862 we should always reset the preferred lifetime
even if the valid lifetime is invalid, which would cause the address
to immediately get deprecated.  So with this patch we'd see this:

5: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000
    inet6 2001:1890:1109:a20:21f:29ff:fe5a:ef04/64 scope global deprecated dynamic
       valid_lft 7163sec preferred_lft 0sec

The comment winds-up being 5x the size of the code to fix the problem.

Update the preferred lifetime of IPv6 addresses derived from a prefix
info option in a router advertisement even if the valid lifetime in
the option is invalid, as specified in RFC 4862 Section 5.5.3e.  Fixes
an issue where an address will not immediately become deprecated.
Reported by Jens Rosenboom.

Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-03 19:10:13 -07:00
Wei Yongjun 59cae0092e xfrm6: fix the proto and ports decode of sctp protocol
The SCTP pushed the skb above the sctp chunk header, so the
check of pskb_may_pull(skb, nh + offset + 1 - skb->data) in
_decode_session6() will never return 0 and the ports decode
of sctp will always fail. (nh + offset + 1 - skb->data < 0)

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-03 19:10:10 -07:00
Wei Yongjun c615c9f3f3 xfrm4: fix the ports decode of sctp protocol
The SCTP pushed the skb data above the sctp chunk header, so the check
of pskb_may_pull(skb, xprth + 4 - skb->data) in _decode_session4() will
never return 0 because xprth + 4 - skb->data < 0, the ports decode of
sctp will always fail.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-03 19:10:06 -07:00
Abhishek Kulkarni 15da4b1612 net/9p: Fix crash due to bad mount parameters.
It is not safe to use match_int without checking the token type returned
by match_token (especially when the token type returned is Opt_err and
args is empty). Fix it.

Signed-off-by: Abhishek Kulkarni <adkulkar@umail.iu.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-02 13:17:01 -07:00
Eric W. Biederman f8a68e752b Revert "ipv4: arp announce, arp_proxy and windows ip conflict verification"
This reverts commit 73ce7b01b4.

After discovering that we don't listen to gratuitious arps in 2.6.30
I tracked the failure down to this commit.

The patch makes absolutely no sense.  RFC2131 RFC3927 and RFC5227.
are all in agreement that an arp request with sip == 0 should be used
for the probe (to prevent learning) and an arp request with sip == tip
should be used for the gratitous announcement that people can learn
from.

It appears the author of the broken patch got those two cases confused
and modified the code to drop all gratuitous arp traffic.  Ouch!

Cc: stable@kernel.org
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-30 19:47:08 -07:00
Jarek Poplawski 008440e3ad ipv4: Fix fib_trie rebalancing, part 3
Alas current delaying of freeing old tnodes by RCU in trie_rebalance
is still not enough because we can free a top tnode before updating a
t->trie pointer.

Reported-by: Pawel Staszewski <pstaszewski@itcare.pl>
Tested-by: Pawel Staszewski <pstaszewski@itcare.pl>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-30 12:48:38 -07:00
Wei Yongjun ff0ac74afb sctp: xmit sctp packet always return no route error
Commit 'net: skb->dst accessors'(adf30907d6)
broken the sctp protocol stack, the sctp packet can never be sent out after
Eric Dumazet's patch, which have typo in the sctp code.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Vlad Yasevich <vladisalv.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-29 19:41:53 -07:00
Wei Yongjun 1802571b98 xfrm: use xfrm_addr_cmp() instead of compare addresses directly
Clean up to use xfrm_addr_cmp() instead of compare addresses directly.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-29 19:41:46 -07:00
Herbert Xu 6828b92bd2 tcp: Do not tack on TSO data to non-TSO packet
If a socket starts out on a non-TSO route, and then switches to
a TSO route, then we will tack on data to the tail of the tx queue
even if it started out life as non-TSO.  This is suboptimal because
all of it will then be copied and checksummed unnecessarily.

This patch fixes this by ensuring that skb->ip_summed is set to
CHECKSUM_PARTIAL before appending extra data beyond the MSS.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-29 19:41:43 -07:00
Herbert Xu 8e5b9dda99 tcp: Stop non-TSO packets morphing into TSO
If a socket starts out on a non-TSO route, and then switches to
a TSO route, then the tail on the tx queue can morph into a TSO
packet, causing mischief because the rest of the stack does not
expect a partially linear TSO packet.

This patch fixes this by ensuring that skb->ip_summed is set to
CHECKSUM_PARTIAL before declaring a packet as TSO.

Reported-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-29 19:41:39 -07:00
David S. Miller 9c0346bd08 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lowpan/lowpan 2009-06-29 19:23:53 -07:00
David S. Miller 53bd9728bf Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6 2009-06-29 19:22:31 -07:00
Dmitry Eremin-Solenikov dfd06fe824 nl802154: add module license and description
Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
2009-06-29 18:20:28 +04:00
Dmitry Eremin-Solenikov 932c1329ac nl802154: fix Oops in ieee802154_nl_get_dev
ieee802154_nl_get_dev() lacks check for the existance of the device
that was returned by dev_get_XXX, thus resulting in Oops for non-existing
devices. Fix it.

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
2009-06-29 18:20:27 +04:00
Jan Engelhardt d6d3f08b0f netfilter: xtables: conntrack match revision 2
As reported by Philip, the UNTRACKED state bit does not fit within
the 8-bit state_mask member. Enlarge state_mask and give status_mask
a few more bits too.

Reported-by: Philip Craig <philipc@snapgear.com>
References: http://markmail.org/thread/b7eg6aovfh4agyz7
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-06-29 14:31:46 +02:00
Patrick McHardy a3a9f79e36 netfilter: tcp conntrack: fix unacknowledged data detection with NAT
When NAT helpers change the TCP packet size, the highest seen sequence
number needs to be corrected. This is currently only done upwards, when
the packet size is reduced the sequence number is unchanged. This causes
TCP conntrack to falsely detect unacknowledged data and decrease the
timeout.

Fix by updating the highest seen sequence number in both directions after
packet mangling.

Tested-by: Krzysztof Piotr Oledzki <ole@ans.pl>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-06-29 14:07:56 +02:00
Herbert Xu ff780cd8f2 gro: Flush GRO packets in napi_disable_pending path
When NAPI is disabled while we're in net_rx_action, we end up
calling __napi_complete without flushing GRO packets.  This is
a bug as it would cause the GRO packets to linger, of course it
also literally BUGs to catch error like this :)

This patch changes it to napi_complete, with the obligatory IRQ
reenabling.  This should be safe because we've only just disabled
IRQs and it does not materially affect the test conditions in
between.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-26 19:27:04 -07:00
Herbert Xu 71f9dacd2e inet: Call skb_orphan before tproxy activates
As transparent proxying looks up the socket early and assigns
it to the skb for later processing, we must drop any existing
socket ownership prior to that in order to distinguish between
the case where tproxy is active and where it is not.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-26 19:22:37 -07:00
Jesper Dangaard Brouer 4a27096bbe mac80211: Use rcu_barrier() on unload.
The mac80211 module uses rcu_call() thus it should use rcu_barrier()
on module unload.

The rcu_barrier() is placed in mech.c ieee80211_stop_mesh() which is
invoked from ieee80211_stop() in case vif.type == NL80211_IFTYPE_MESH_POINT.

Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-26 13:51:36 -07:00
Jesper Dangaard Brouer 75de874f5c sunrpc: Use rcu_barrier() on unload.
The sunrpc module uses rcu_call() thus it should use rcu_barrier() on
module unload.

Have not verified that the possibility for new call_rcu() callbacks
has been disabled.  As a hint for checking, the functions calling
call_rcu() (unx_destroy_cred and generic_destroy_cred) are
registered as crdestroy function pointer in struct rpc_credops.

Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-26 13:51:34 -07:00
Jesper Dangaard Brouer 473c22d759 bridge: Use rcu_barrier() instead of syncronize_net() on unload.
When unloading modules that uses call_rcu() callbacks, then we must
use rcu_barrier().  This module uses syncronize_net() which is not
enough to be sure that all callback has been completed.

Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-26 13:51:32 -07:00
Jesper Dangaard Brouer 1f2ccd00f2 ipv6: Use rcu_barrier() on module unload.
The ipv6 module uses rcu_call() thus it should use rcu_barrier() on
module unload.

Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-26 13:51:31 -07:00
Jesper Dangaard Brouer 10e8544801 decnet: Use rcu_barrier() on module unload.
The decnet module unloading as been disabled with a '#if 0' statement,
because it have had issues.

We add a rcu_barrier() anyhow for correctness.

The maintainer (Chrissie Caulfield) will look into the unload issue
when time permits.

Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Chrissie Caulfield <christine.caulfield@googlemail.com>
Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-26 13:51:27 -07:00
Jens Rosenboom a1faa69810 ipv6: avoid wraparound for expired preferred lifetime
Avoid showing wrong high values when the preferred lifetime of an address
is expired.

Signed-off-by: Jens Rosenboom <me@jayr.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-25 20:03:50 -07:00
Wei Yongjun 1ac530b355 tcp: missing check ACK flag of received segment in FIN-WAIT-2 state
RFC0793 defined that in FIN-WAIT-2 state if the ACK bit is off drop
the segment and return[Page 72]. But this check is missing in function
tcp_timewait_state_process(). This cause the segment with FIN flag but
no ACK has two diffent action:

Case 1:
    Node A                      Node B
              <-------------    FIN,ACK
                                (enter FIN-WAIT-1)
    ACK       ------------->
                                (enter FIN-WAIT-2)
    FIN       ------------->    discard
                                (move sk to tw list)

Case 2:
    Node A                      Node B
              <-------------    FIN,ACK
                                (enter FIN-WAIT-1)
    ACK       ------------->
                                (enter FIN-WAIT-2)
                                (move sk to tw list)
    FIN       ------------->

              <-------------    ACK

This patch fixed the problem.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-25 20:03:15 -07:00
Jesper Dangaard Brouer 308ff823eb nf_conntrack: Use rcu_barrier()
RCU barriers, rcu_barrier(), is inserted two places.

 In nf_conntrack_expect.c nf_conntrack_expect_fini() before the
 kmem_cache_destroy().  Firstly to make sure the callback to the
 nf_ct_expect_free_rcu() code is still around.  Secondly because I'm
 unsure about the consequence of having in flight
 nf_ct_expect_free_rcu/kmem_cache_free() calls while doing a
 kmem_cache_destroy() slab destroy.

 And in nf_conntrack_extend.c nf_ct_extend_unregister(), inorder to
 wait for completion of callbacks to __nf_ct_ext_free_rcu(), which is
 invoked by __nf_ct_ext_add().  It might be more efficient to call
 rcu_barrier() in nf_conntrack_core.c nf_conntrack_cleanup_net(), but
 thats make it more difficult to read the code (as the callback code
 in located in nf_conntrack_extend.c).

Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-06-25 16:32:52 +02:00
Rémi Denis-Courmont 2be6fa4c7e Phonet: generate Netlink RTM_DELADDR when destroying a device
Netlink address deletion events were not sent when a network device
vanished neither when Phonet was unloaded.

Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-25 02:58:16 -07:00
Rémi Denis-Courmont c7a1a4c80f Phonet: publicize the Netlink notification function
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-25 02:58:15 -07:00
Herbert Xu 245acb8772 ipsec: Fix name of CAST algorithm
Our CAST algorithm is called cast5, not cast128.  Clearly nobody
has ever used it :)

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-24 18:03:10 -07:00