1
0
Fork 0
Commit Graph

21791 Commits (b72061a3cb0f1c5aa3f919e2dadb4a1631773e7a)

Author SHA1 Message Date
Shriram Rajagopalan c3059be16c net/sched: sch_plug - Queue traffic until an explicit release command
The qdisc supports two operations - plug and unplug. When the
qdisc receives a plug command via netlink request, packets arriving
henceforth are buffered until a corresponding unplug command is received.
Depending on the type of unplug command, the queue can be unplugged
indefinitely or selectively.

This qdisc can be used to implement output buffering, an essential
functionality required for consistent recovery in checkpoint based
fault-tolerance systems. Output buffering enables speculative execution
by allowing generated network traffic to be rolled back. It is used to
provide network protection for Xen Guests in the Remus high availability
project, available as part of Xen.

This module is generic enough to be used by any other system that wishes
to add speculative execution and output buffering to its applications.

This module was originally available in the linux 2.6.32 PV-OPS tree,
used as dom0 for Xen.

For more information, please refer to http://nss.cs.ubc.ca/remus/
and http://wiki.xensource.com/xenwiki/Remus

Changes in V3:
  * Removed debug output (printk) on queue overflow
  * Added TCQ_PLUG_RELEASE_INDEFINITE - that allows the user to
    use this qdisc, for simple plug/unplug operations.
  * Use of packet counts instead of pointers to keep track of
    the buffers in the queue.

Signed-off-by: Shriram Rajagopalan <rshriram@cs.ubc.ca>
Signed-off-by: Brendan Cully <brendan@cs.ubc.ca>
[author of the code in the linux 2.6.32 pvops tree]
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-07 12:54:56 -05:00
David S. Miller 17b8a74f00 Merge branch 'tipc_net-next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux 2012-02-07 12:31:01 -05:00
Allan Stephens dff10e9e63 tipc: Minor optimization to rejection of connection-based messages
Modifies message rejection logic so that TIPC doesn't attempt to
send a FIN message to the rejecting port if it is known in advance
that there is no such message because the rejecting port doesn't exist.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-02-06 16:59:19 -05:00
Allan Stephens 3175bd9add tipc: Eliminate alteration of publication key during name table purging
Removes code that alters the publication key of a name table entry
that is being forcibly purged from TIPC's name table after contact
with the publishing node has been lost.

Current TIPC ensures that all defunct names are purged before
re-establishing contact with a failed node.  There used to be a risk
that the publication might be accidentally deleted because it might be
re-added to the name table before the purge operation was completed.
But now there is no longer a need to ensure that the new key is different
than the old one.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-02-06 16:59:19 -05:00
Allan Stephens 63e7f1ac28 tipc: Prevent loss of fragmented messages over broadcast link
Modifies broadcast link so that an incoming fragmented message is not
lost if reassembly cannot begin because there currently is no buffer
big enough to hold the entire reassembled message. The broadcast link
now ignores the first fragment completely, which causes the sending node
to retransmit the first fragment so that reassembly can be re-attempted.

Previously, the sender would have had no reason to retransmit the 1st
fragment, so we would never have a chance to re-try the allocation.

To do this cleanly without duplicaton, a new bclink_accept_pkt()
function is introduced.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-02-06 16:59:19 -05:00
Allan Stephens b76b27cad5 tipc: Prevent loss of fragmented messages over unicast links
Modifies unicast link endpoint logic so an incoming fragmented message
is not lost if reassembly cannot begin because there is no buffer big
enough to hold the entire reassembled message. The link endpoint now
ignores the first fragment completely, which causes the sending node to
retransmit the first fragment so that reassembly can be re-attempted.

Previously, the sender would have had no reason to retransmit the 1st
fragment, so we would never have a chance to re-try the allocation.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
2012-02-06 16:59:19 -05:00
Allan Stephens 1ec2bb0840 tipc: Remove obsolete broadcast tag capability
Eliminates support for the broadcast tag field, which is no longer
used by broadcast link NACK messages.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-02-06 16:59:18 -05:00
Allan Stephens 7a54d4a99d tipc: Major redesign of broadcast link ACK/NACK algorithms
Completely redesigns broadcast link ACK and NACK mechanisms to prevent
spurious retransmit requests in dual LAN networks, and to prevent the
broadcast link from stalling due to the failure of a receiving node to
acknowledge receiving a broadcast message or request its retransmission.

Note: These changes only impact the timing of when ACK and NACK messages
are sent, and not the basic broadcast link protocol itself, so inter-
operability with nodes using the "classic" algorithms is maintained.

The revised algorithms are as follows:

1) An explicit ACK message is still sent after receiving 16 in-sequence
messages, and implicit ACK information continues to be carried in other
unicast link message headers (including link state messages).  However,
the timing of explicit ACKs is now based on the receiving node's absolute
network address rather than its relative network address to ensure that
the failure of another node does not delay the ACK beyond its 16 message
target.

2) A NACK message is now typically sent only when a message gap persists
for two consecutive incoming link state messages; this ensures that a
suspected gap is not confirmed until both LANs in a dual LAN network have
had an opportunity to deliver the message, thereby preventing spurious NACKs.
A NACK message can also be generated by the arrival of a single link state
message, if the deferred queue is so big that the current message gap
cannot be the result of "normal" mis-ordering due to the use of dual LANs
(or one LAN using a bonded interface). Since link state messages typically
arrive at different nodes at different times the problem of multiple nodes
issuing identical NACKs simultaneously is inherently avoided.

3) Nodes continue to "peek" at NACK messages sent by other nodes. If
another node requests retransmission of a message gap suspected (but not
yet confirmed) by the peeking node, the peeking node forgets about the
gap and does not generate a duplicate retransmit request. (If the peeking
node subsequently fails to receive the lost message, later link state
messages will cause it to rediscover and confirm the gap and send another
NACK.)

4) Message gap "equality" is now determined by the start of the gap only.
This is sufficient to deal with the most common cases of message loss,
and eliminates the need for complex end of gap computations.

5) A peeking node no longer tries to determine whether it should send a
complementary NACK, since the most common cases of message loss don't
require it to be sent. Consequently, the node no longer examines the
"broadcast tag" field of a NACK message when peeking.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-02-06 16:59:18 -05:00
Allan Stephens b98158e3b3 tipc: Add missing locks in broadcast link statistics accumulation
Ensures that all attempts to update broadcast link statistics are done
only while holding the lock that protects the link's main data structures,
to prevent interference by simultaneous updates caused by messages
arriving on other interfaces.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-02-06 16:59:18 -05:00
Allan Stephens 0232c5a566 tipc: Fix bug in broadcast link duplicate message statistics
Modifies broadcast link so that it increments the "received duplicate
message" count if an incoming message cannot be added to the deferred
message queue because it is already present in the queue. (The aligns
broadcast link behavior with that of TIPC's unicast links.)

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-02-06 16:59:17 -05:00
Allan Stephens 8a275a6a30 tipc: Fix node lock reclamation issues in broadcast link reception
Fixes a pair of problems in broadcast link message reception code
relating to the reclamation of the node lock after consuming an
in-sequence message.

1) Now retests to see if the sending node is still up after reclaiming
   the node lock, and bails out if it is non-operational.

2) Now manipulates the node's deferred message queue only after
   reclaiming the node lock, rather than using queue head pointer
   information that was cached previously.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-02-06 16:59:17 -05:00
Allan Stephens 57732560d1 tipc: Add missing broadcast link lock when sending NACK
Ensures that any attempt to send a NACK message over TIPC's broadcast
link has exclusive access to the link's main data structures, to prevent
interference with a simultaneous attempt to send other broadcast link
traffic (such as application-generated multicast messages).

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-02-06 16:59:17 -05:00
Allan Stephens 47361c87c5 tipc: Fix problem with broadcast link synchronization between nodes
Corrects a problem in which a link endpoint that activates as the
result of receiving a RESET/STATE sequence of link protocol messages
fails to properly record the broadcast link status information about
the node to which it is now communicating with. (The problem does
not occur with the more common RESET/ACTIVATE sequence of messages.)
The fix ensures that the broadcast link status info is updated after
the RESET message resets the link endpoint, rather than before, thereby
preventing new information from being overwritten by the reset operation.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-02-06 16:59:16 -05:00
Allan Stephens 9349931371 tipc: Ensure broadcast link re-acquires node after link failure
Fix a bug that can prevent TIPC from sending broadcast messages to a node
if contact with the node is lost and then regained. The problem occurs if
the broadcast link first clears the flag indicating the node is part of the
link's distribution set (when it loses contact with the node), and later
fails to restore the flag (when contact is regained); restoration fails
if contact with the node is regained by implicit unicast link activation
triggered by the arrival of a data message, rather than explicitly by the
arrival of a link activation message.

The broadcast link now uses separate fields to track whether a node is
theoretically capable of receiving broadcast messages versus whether it is
actually part of the link's distribution set. The former member is updated
by the receipt of link protocol messages, which can occur at any time; the
latter member is updated only when contact with the node is gained or lost.
This change also permits the simplification of several conditional
expressions since the broadcast link's "supported" field can now only be
set if there are working links to the associated node.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-02-06 16:59:16 -05:00
Allan Stephens 4d75313ce9 tipc: Prevent broadcast link stalling in dual LAN environments
Ensure that sequence number information about incoming broadcast link
messages is initialized only by the activation of the first link to a
given cluster node.  Previously, a race condition allowed reset and/or
activation messages for a second link to re-initialize this sequence
number information with obsolete values. This could trigger TIPC to
request the retransmission of previously acknowledged broadcast link
messages from that node, resulting in broadcast link processing becoming
stalled if the node had already released one or more of those messages
and was unable to perform the required retransmission.

Thanks to Laser <gotolaser@gmail.com> for identifying this problem
and assisting in the development of this fix.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-02-06 16:59:16 -05:00
Allan Stephens 92d2c905b4 tipc: Prevent transmission of outdated link protocol messages
Ensures that a link endpoint discards any previously deferred link
protocol message whenever it attempts to send a new one.

Previously, it was possible for a link protocol message that was unsent
due to congestion to be transmitted after newer protocol messages had
been sent. The stale link protocol message might then cause the receiving
link endpoint to malfunction because of its outdated conent.

Thanks to Osamu Kaminuma [okaminum@avaya.com] for diagnosing the problem
and contributing a prototype patch.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-02-06 16:59:15 -05:00
Allan Stephens 8809b255a9 tipc: improve the link deferred queue insertion algorithm
Re-code the algorithm for inserting an out-of-sequence message into
a unicast or broadcast link's deferred message queue.  It remains
functionally equivalent but should be easier to understand/maintain.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-02-06 16:59:15 -05:00
David S. Miller 59d74026fa Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next 2012-02-06 15:35:14 -05:00
David S. Miller a0417fa3a1 net: Make qdisc_skb_cb upper size bound explicit.
Just like skb->cb[], so that qdisc_skb_cb can be encapsulated inside
of other data structures.

This is intended to be used by IPoIB so that it can remember
addressing information stored at hard_header_ops->create() time that
it can fetch when the packet gets to the transmit routine.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-06 15:14:37 -05:00
John W. Linville 8926574c4d Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem
Conflicts:
	drivers/net/wireless/rtlwifi/rtl8192se/sw.c
2012-02-06 14:26:39 -05:00
Jesper Juhl 1f0b6702b5 caif: caifdev is never used in net/caif/caif_dev.c::transmit() - remove it.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-05 17:53:04 -05:00
Jesper Juhl 22b6a2eb90 decnet: remove unused variable from dn_output()
The variable 'neigh' is assigned to, but otherwise completely
unused. So let's remove it.

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-05 17:44:55 -05:00
David S. Miller dd48dc34fe Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2012-02-04 16:39:32 -05:00
Neil Horman 5962b35c1d netprio_cgroup: Fix obo in get_prioidx
It was recently pointed out to me that the get_prioidx function sets a bit in
the prioidx map prior to checking to see if the index being set is out of
bounds.  This patch corrects that, avoiding the possiblity of us writing beyond
the end of the array

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reported-by: Stanislaw Gruszka <sgruszka@redhat.com>
CC: Stanislaw Gruszka <sgruszka@redhat.com>
CC: "David S. Miller" <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-04 16:30:24 -05:00
sjur.brandeland@stericsson.com 576f3cc7fb caif: Add drop count for caif_net device.
Count dropped packets in CAIF Netdevice.

Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-04 16:06:27 -05:00
sjur.brandeland@stericsson.com 4a695823b5 caif: Kill debugfs vars for caif socket
Kill off the debug-fs exposed varaibles from caif_socket.

Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-04 16:06:27 -05:00
John W. Linville 157ca9eae9 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem 2012-02-03 14:14:07 -05:00
Dmitry Tarnyagin ba7605745d caif: Bugfix double kfree_skb upon xmit failure
SKB is freed twice upon send error. The Network stack consumes SKB even
when it returns error code.

Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-02 14:35:12 -05:00
sjur.brandeland@stericsson.com b01377a420 caif: Bugfix list_del_rcu race in cfmuxl_ctrlcmd.
Always use cfmuxl_remove_uplayer when removing a up-layer.
cfmuxl_ctrlcmd() can be called independently and in parallel with
cfmuxl_remove_uplayer(). The race between them could cause list_del_rcu
to be called on a node which has been already taken out from the list.
That lead to a (rare) crash on accessing poisoned node->prev inside
list_del_rcu.

This fix ensures that deletion are done holding the same lock.

Reported-by: Dmitry Tarnyagin <dmitry.tarnyagin@stericsson.com>
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-02 14:35:12 -05:00
Jason Wang c43b874d5d tcp: properly initialize tcp memory limits
Commit 4acb4190 tries to fix the using uninitialized value
introduced by commit 3dc43e3,  but it would make the
per-socket memory limits too small.

This patch fixes this and also remove the redundant codes
introduced in 4acb4190.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Glauber Costa <glommer@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-02 14:34:41 -05:00
David S. Miller 7161c76f0d atm: clip: Convert over to dst_neigh_lookup().
CLIP only support ipv4, and this is evidenced by the fact that it
is a device specific extension of arp_tbl, so this conversion is
pretty straightforward.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-01 17:23:11 -05:00
David S. Miller 3329bdfc40 decnet: Add missing neigh->ha locking to dn_neigh_output_packet()
Basically, mirror the logic in neigh_connected_output().

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-01 16:34:25 -05:00
David S. Miller f79d52c254 ipv6: Remove never used function inet6_ac_check().
It went from unused, to commented out, and never changing after
that.

Just get rid of it, if someone wants it they can unearth it from
the history.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-01 16:14:17 -05:00
Joe Perches 7b6cd1ce72 PATCH V2 net-next] net: dev: Convert printks to pr_<level>
Use the current logging style.
Coalesce formats where appropriate.
Update grammar where appropriate.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-01 15:57:08 -05:00
Eliad Peller 07ae2dfcf4 mac80211: timeout a single frame in the rx reorder buffer
The current code checks for stored_mpdu_num > 1, causing
the reorder_timer to be triggered indefinitely, but the
frame is never timed-out (until the next packet is received)

Signed-off-by: Eliad Peller <eliad@wizery.com>
Cc: <stable@vger.kernel.org>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-02-01 15:26:00 -05:00
Ben Hutchings 786f528119 ethtool: Null-terminate filename passed to ethtool_ops::flash_device
The parameters for ETHTOOL_FLASHDEV include a filename, which ought to
be null-terminated.  Currently the only driver that implements
ethtool_ops::flash_device attempts to add a null terminator if
necessary, but does it wrongly.  Do it in the ethtool core instead.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-01 14:47:17 -05:00
Arun Sharma efcdbf24fd net: Disambiguate kernel message
Some of our machines were reporting:

TCP: too many of orphaned sockets

even when the number of orphaned sockets was well below the
limit.

We print a different message depending on whether we're out
of TCP memory or there are too many orphaned sockets.

Also move the check out of line and cleanup the messages
that were printed.

Signed-off-by: Arun Sharma <asharma@fb.com>
Suggested-by: Mohan Srinivasan <mohan@fb.com>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: David Miller <davem@davemloft.net>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-01 14:41:50 -05:00
Joe Perches 6f7062457f netpoll: Neaten MAX_SKB_SIZE macro
Add the types in the packet layout order.

Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-01 14:20:27 -05:00
Joe Perches e6ec26935a netpoll: Convert printks to np_<level> and add pr_fmt
Use a more current message logging style.
Add pr_fmt to prefix dmesg output with "netpoll: "
Add macros to print np->name.

Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-01 14:20:27 -05:00
Shawn Lu 658ddaaf66 tcp: md5: RST: getting md5 key from listener
TCP RST mechanism is broken in TCP md5(RFC2385). When
connection is gone, md5 key is lost, sending RST
without md5 hash is deem to ignored by peer. This can
be a problem since RST help protocal like bgp to fast
recove from peer crash.

In most case, users of tcp md5, such as bgp and ldp,
have listener on both sides to accept connection from peer.
md5 keys for peers are saved in listening socket.

There are two cases in finding md5 key when connection is
lost:
1.Passive receive RST: The message is send to well known port,
tcp will associate it with listner. md5 key is gotten from
listener.

2.Active receive RST (no sock): The message is send to ative
side, there is no socket associated with the message. In this
case, finding listener from source port, then find md5 key from
listener.

we are not loosing sercuriy here:
packet is checked with md5 hash. No RST is generated
if md5 hash doesn't match or no md5 key can be found.

Signed-off-by: Shawn Lu <shawn.lu@ericsson.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-01 12:43:54 -05:00
Dan Carpenter 5b11b2e4bd xfrm6: remove unneeded NULL check in __xfrm6_output()
We don't check for NULL consistently in __xfrm6_output().  If "x" were
NULL here it would lead to an OOPs later.  I asked Steffen Klassert
about this and he suggested that we remove the NULL check.

On 10/29/11, Steffen Klassert <steffen.klassert@secunet.com> wrote:
>> net/ipv6/xfrm6_output.c
>>    148
>>    149		if ((x && x->props.mode == XFRM_MODE_TUNNEL) &&
>>                           ^
>
> x can't be null here. It would be a bug if __xfrm6_output() is called
> without a xfrm_state attached to the skb. I think we can just remove
> this null check.

Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-01 02:52:48 -05:00
Eric Dumazet a8afca0329 tcp: md5: protects md5sig_info with RCU
This patch makes sure we use appropriate memory barriers before
publishing tp->md5sig_info, allowing tcp_md5_do_lookup() being used from
tcp_v4_send_reset() without holding socket lock (upcoming patch from
Shawn Lu)

Note we also need to respect rcu grace period before its freeing, since
we can free socket without this grace period thanks to
SLAB_DESTROY_BY_RCU

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Shawn Lu <shawn.lu@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-01 02:11:47 -05:00
Eric Dumazet 5f3d9cb296 tcp: md5: use sock_kmalloc() to limit md5 keys
There is no limit on number of MD5 keys an application can attach to a
tcp socket.

This patch adds a per tcp socket limit based
on /proc/sys/net/core/optmem_max

With current default optmem_max values, this allows about 150 keys on
64bit arches, and 88 keys on 32bit arches.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-01-31 16:11:48 -05:00
Eric Dumazet a915da9b69 tcp: md5: rcu conversion
In order to be able to support proper RST messages for TCP MD5 flows, we
need to allow access to MD5 keys without locking listener socket.

This conversion is a nice cleanup, and shrinks size of timewait sockets
by 80 bytes.

IPv6 code reuses generic code found in IPv4 instead of duplicating it.

Control path uses GFP_KERNEL allocations instead of GFP_ATOMIC.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Shawn Lu <shawn.lu@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-01-31 12:14:00 -05:00
Eric Dumazet a2d91241a8 tcp: md5: remove obsolete md5_add() method
We no longer use md5_add() method from struct tcp_sock_af_ops

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-01-31 12:13:59 -05:00
Helmut Schaa 4f3eb0ba48 mac80211: Move num_sta_ps counter decrement after synchronize_rcu
Unted the assumption that the sta struct is still accessible before the
synchronize_rcu call we should move the num_sta_ps counter decrement
after synchronize_rcu to avoid incorrect decrements if num_sta_ps.

Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-01-30 15:48:28 -05:00
Simon Wunderlich 19468413e8 mac80211: add support for mcs masks
* Handle MCS masks set by the user.
* Match rates provided by the rate control algorithm to the mask set,
  also in HT mode, and switch back to legacy mode if necessary.
* add debugfs files to observate the rate selection

Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-01-30 15:48:26 -05:00
Simon Wunderlich 24db78c05b nl80211: add support for mcs masks
Allow to set mcs masks through nl80211. We also allow to set MCS
rates but no legacy rates (and vice versa).

Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-01-30 15:48:25 -05:00
Helmut Schaa 608383bfc0 mac80211: Fix incorrect num_sta_ps decrement in ap_sta_ps_end
If the driver blocked this specific STA with the help of
ieee80211_sta_block_awake we won't clear WLAN_STA_PS_STA later but
still decrement num_sta_ps. Hence, the next data frame from this
STA will trigger ap_sta_ps_end again and also decrement num_sta_ps
again leading to an incorrect num_sta_ps counter.

This can result in problems with powersaving clients not waking up
from PS because the TIM calculation might be skipped due to the
incorrect num_sta_ps counter.

Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-01-30 15:48:20 -05:00
Helmut Schaa 2ab694d302 mac80211: Fix incorrect num_sta_ps decrement in __sta_info_destroy
When WLAN_STA_PS_DRIVER is set by ieee80211_sta_block_awake the
num_sta_ps counter is not incremented. Hence, we shouldn't decrement
it in __sta_info_destroy if only WLAN_STA_PS_DRIVER is set. This
could result in an incorrect num_sta_ps counter leading to strange side
effects with associated powersaving clients.

Fix this by only decrementing num_sta_ps when WLAN_STA_PS_STA was set
before.

Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-01-30 15:48:18 -05:00