Commit graph

7829 commits

Author SHA1 Message Date
YOSHIFUJI Hideaki b50660f1fe [IP] UDP: Use SEQ_START_TOKEN.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-31 19:38:15 -07:00
YOSHIFUJI Hideaki 4c7966b86b [IPV6] MCAST: Ensure to check multicast listener(s).
In ip6_mc_input(), we need to check whether we have listener(s) for
the packet.

After commit ae7bf20a63, all packets
for multicast destinations are delivered to upper layer if
IFF_PROMISC or IFF_ALLMULTI is set.

In fact, bug was rather ancient; the original (before the commit)
intent of the dev->flags check was to skip the ipv6_chk_mcast_addr()
call, assuming L2 filters packets appropriately, but it was even not
true.

Let's explicitly check our multicast list.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Acked-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-31 19:30:45 -07:00
David S. Miller 9f09243890 [LLC]: Kill llc_station_mac_sa symbol export.
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 19:51:40 -07:00
David S. Miller e8e16b706e [INET]: inet_frag_evictor() must run with BH disabled
Based upon a lockdep trace from Dave Jones.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 17:30:18 -07:00
Joonwoo Park a5a04819c5 [LLC]: station source mac address
kill unnecessary llc_station_mac_sa.

Signed-off-by: Joonwoo Park <joonwpark81@gmail.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 16:28:36 -07:00
Joonwoo Park 27785d83e4 [LLC]: bogus llc packet length
discard llc packet which has bogus packet length.

Signed-off-by: Joonwoo Park <joonwpark81@gmail.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 16:27:33 -07:00
Herbert Xu 2ba2506ca7 [NET]: Add preemption point in qdisc_run
The qdisc_run loop is currently unbounded and runs entirely in a
softirq.  This is bad as it may create an unbounded softirq run.

This patch fixes this by calling need_resched and breaking out if
necessary.

It also adds a break out if the jiffies value changes since that would
indicate we've been transmitting for too long which starves other
softirqs.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 16:25:26 -07:00
Rusty Russell 32aced7509 [NET]: Don't send ICMP_FRAG_NEEDED for GSO packets
Commit 9af3912ec9 ("[NET] Move DF check
to ip_forward") added a new check to send ICMP fragmentation needed
for large packets.

Unlike the check in ip_finish_output(), it doesn't check for GSO.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 16:23:19 -07:00
Robert P. J. Day d5fb2962c6 bluetooth: replace deprecated RW_LOCK_UNLOCKED macros
The older RW_LOCK_UNLOCKED macros defeat lockdep state tracing so
replace them with the newer __RW_LOCK_UNLOCKED macros.

Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 16:17:38 -07:00
Patrick McHardy 3480c63bdf [LLC]: Restrict LLC sockets to root
LLC currently allows users to inject raw frames, including IP packets
encapsulated in SNAP. While Linux doesn't handle IP over SNAP, other
systems do. Restrict LLC sockets to root similar to packet sockets.

[ Modified Patrick's patch to use CAP_NEW_RAW --DaveM ]

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-27 20:28:10 -07:00
Denis V. Lunev 8eeee8b152 [NETFILTER]: Replate direct proc_fops assignment with proc_create call.
This elliminates infamous race during module loading when one could lookup
proc entry without proc_fops assigned.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-27 16:55:53 -07:00
Thomas Graf 920fc941a9 [ESP]: Ensure IV is in linear part of the skb to avoid BUG() due to OOB access
ESP does not account for the IV size when calling pskb_may_pull() to
ensure everything it accesses directly is within the linear part of a
potential fragment. This results in a BUG() being triggered when the
both the IPv4 and IPv6 ESP stack is fed with an skb where the first
fragment ends between the end of the esp header and the end of the IV.

This bug was found by Dirk Nehring <dnehring@gmx.net> .

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-27 16:08:03 -07:00
Herbert Xu 732c8bd590 [IPSEC]: Fix BEET output
The IPv6 BEET output function is incorrectly including the inner
header in the payload to be protected.  This causes a crash as
the packet doesn't actually have that many bytes for a second
header.

The IPv4 BEET output on the other hand is broken when it comes
to handling an inner IPv6 header since it always assumes an
inner IPv4 header.

This patch fixes both by making sure that neither BEET output
function touches the inner header at all.  All access is now
done through the protocol-independent cb structure.  Two new
attributes are added to make this work, the IP header length
and the IPv4 option length.  They're filled in by the inner
mode's output function.

Thanks to Joakim Koskela for finding this problem.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-26 16:51:09 -07:00
Pavel Emelyanov 7c0ecc4c4f [ICMP]: Dst entry leak in icmp_send host re-lookup code (v2).
Commit 8b7817f3a9 ([IPSEC]: Add ICMP host
relookup support) introduced some dst leaks on error paths: the rt
pointer can be forgotten to be put. Fix it bu going to a proper label.

Found after net namespace's lo refused to unregister :) Many thanks to 
Den for valuable help during debugging.

Herbert pointed out, that xfrm_lookup() will put the rtable in case
of error itself, so the first goto fix is redundant.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-26 02:27:09 -07:00
Robert P. J. Day 5c2e2e239e [AX25]: Remove obsolete references to BKL from TODO file.
Given that there are no apparent calls to lock_kernel() or
unlock_kernel() under net/ax25, delete the TODO reference related to
that.

Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-26 02:14:38 -07:00
Patrick McHardy 61ee6bd487 [NET]: Fix multicast device ioctl checks
SIOCADDMULTI/SIOCDELMULTI check whether the driver has a set_multicast_list
method to determine whether it supports multicast. Drivers implementing
secondary unicast support use set_rx_mode however.

Check for both dev->set_multicast_mode and dev->set_rx_mode to determine
multicast capabilities.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-26 02:12:11 -07:00
David S. Miller 8c7230f781 [IRDA]: Store irnet_socket termios properly.
It should be a "struct ktermios" not a "struct termios".

Based upon a build warning reported by Stephen Rothwell.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-26 00:55:50 -07:00
Patrick McHardy 0ed21b321a [VLAN]: Don't copy ALLMULTI/PROMISC flags from underlying device
Changing these flags requires to use dev_set_allmulti/dev_set_promiscuity
or dev_change_flags. Setting it directly causes two unwanted effects:

- the next dev_change_flags call will notice a difference between
  dev->gflags and the actual flags, enable promisc/allmulti
  mode and incorrectly update dev->gflags

- this keeps the underlying device in promisc/allmulti mode until
  the VLAN device is deleted

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-26 00:15:17 -07:00
Kazunori MIYAZAWA df9dcb4588 [IPSEC]: Fix inter address family IPsec tunnel handling.
Signed-off-by: Kazunori MIYAZAWA <kazunori@miyazawa.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-24 14:51:51 -07:00
Pavel Emelyanov fa86d322d8 [NEIGH]: Fix race between pneigh deletion and ipv6's ndisc_recv_ns (v3).
Proxy neighbors do not have any reference counting, so any caller
of pneigh_lookup (unless it's a netlink triggered add/del routine)
should _not_ perform any actions on the found proxy entry. 

There's one exception from this rule - the ipv6's ndisc_recv_ns() 
uses found entry to check the flags for NTF_ROUTER.

This creates a race between the ndisc and pneigh_delete - after 
the pneigh is returned to the caller, the nd_tbl.lock is dropped 
and the deleting procedure may proceed.

One of the fixes would be to add a reference counting, but this
problem exists for ndisc only. Besides such a patch would be too 
big for -rc4.

So I propose to introduce a __pneigh_lookup() which is supposed
to be called with the lock held and use it in ndisc code to check
the flags on alive pneigh entry.


Changes from v2:
As David noticed, Exported the __pneigh_lookup() to ipv6 module. 
The checkpatch generates a warning on it, since the EXPORT_SYMBOL 
does not follow the symbol itself, but in this file all the 
exports come at the end, so I decided no to break this harmony.

Changes from v1:
Fixed comments from YOSHIFUJI - indentation of prototype in header
and the pndisc_check_router() name - and a compilation fix, pointed
by Daniel - the is_routed was (falsely) considered as uninitialized
by gcc.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-24 14:48:59 -07:00
Martin Devera 8f3ea33a50 sch_htb: fix "too many events" situation
HTB is event driven algorithm and part of its work is to apply
scheduled events at proper times. It tried to defend itself from
livelock by processing only limited number of events per dequeue.
Because of faster computers some users already hit this hardcoded
limit.

This patch limits processing up to 2 jiffies (why not 1 jiffie ?
because it might stop prematurely when only fraction of jiffie
remains).

Signed-off-by: Martin Devera <devik@cdi.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-23 22:00:38 -07:00
Wang Chen dbee0d3f46 [ATM]: When proc_create() fails, do some error handling work and return -ENOMEM.
Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-23 21:45:36 -07:00
Julia Lawall 53a6201fdf [9P] net/9p/trans_fd.c: remove unused variable
The variable cb is initialized but never used otherwise.

The semantic patch that makes this change is as follows:
(http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@@
type T;
identifier i;
constant C;
@@

(
extern T i;
|
- T i;
  <+... when != i
- i = C;
  ...+>
)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-22 18:05:33 -07:00
Julia Lawall 421f099bc5 [IPV6] net/ipv6/ndisc.c: remove unused variable
The variable hlen is initialized but never used otherwise.

The semantic patch that makes this change is as follows:
(http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@@
type T;
identifier i;
constant C;
@@

(
extern T i;
|
- T i;
  <+... when != i
- i = C;
  ...+>
)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-22 18:04:16 -07:00
Stephen Hemminger 6440cc9e0f [IPV4] fib_trie: fix warning from rcu_assign_poinger
This gets rid of a warning caused by the test in rcu_assign_pointer.
I tried to fix rcu_assign_pointer, but that devolved into a long set
of discussions about doing it right that came to no real solution.
Since the test in rcu_assign_pointer for constant NULL would never
succeed in fib_trie, just open code instead.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-22 17:59:58 -07:00
Herbert Xu 69d1506731 [TCP]: Let skbs grow over a page on fast peers
While testing the virtio-net driver on KVM with TSO I noticed
that TSO performance with a 1500 MTU is significantly worse
compared to the performance of non-TSO with a 16436 MTU.  The
packet dump shows that most of the packets sent are smaller
than a page.

Looking at the code this actually is quite obvious as it always
stop extending the packet if it's the first packet yet to be
sent and if it's larger than the MSS.  Since each extension is
bound by the page size, this means that (given a 1500 MTU) we're
very unlikely to construct packets greater than a page, provided
that the receiver and the path is fast enough so that packets can
always be sent immediately.

The fix is also quite obvious.  The push calls inside the loop
is just an optimisation so that we don't end up doing all the
sending at the end of the loop.  Therefore there is no specific
reason why it has to do so at MSS boundaries.  For TSO, the
most natural extension of this optimisation is to do the pushing
once the skb exceeds the TSO size goal.

This is what the patch does and testing with KVM shows that the
TSO performance with a 1500 MTU easily surpasses that of a 16436
MTU and indeed the packet sizes sent are generally larger than
16436.

I don't see any obvious downsides for slower peers or connections,
but it would be prudent to test this extensively to ensure that
those cases don't regress.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-22 15:47:05 -07:00
Pavel Emelyanov 7512cbf6ef [DLCI]: Fix tiny race between module unload and sock_ioctl.
This is a narrow pedantry :) but the dlci_ioctl_hook check and call
should not be parted with the mutex lock.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 15:58:52 -07:00
Phil Oester 12b101555f [IPV4]: Fix null dereference in ip_defrag
Been seeing occasional panics in my testing of 2.6.25-rc in ip_defrag.
Offending line in ip_defrag is here:

	net = skb->dev->nd_net

where dev is NULL.  Bisected the problem down to commit
ac18e7509e ([NETNS][FRAGS]: Make the
inet_frag_queue lookup work in namespaces).  

Below patch (idea from Patrick McHardy) fixes the problem for me.

Signed-off-by: Phil Oester <kernel@linuxace.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 15:01:50 -07:00
YOSHIFUJI Hideaki 38fe999e22 [IPV6] KCONFIG: Fix description about IPV6_TUNNEL.
Based on notice from "Colin" <colins@sjtu.edu.cn>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-20 16:13:58 -07:00
Patrick McHardy 607bfbf2d5 [TCP]: Fix shrinking windows with window scaling
When selecting a new window, tcp_select_window() tries not to shrink
the offered window by using the maximum of the remaining offered window
size and the newly calculated window size. The newly calculated window
size is always a multiple of the window scaling factor, the remaining
window size however might not be since it depends on rcv_wup/rcv_nxt.
This means we're effectively shrinking the window when scaling it down.


The dump below shows the problem (scaling factor 2^7):

- Window size of 557 (71296) is advertised, up to 3111907257:

IP 172.2.2.3.33000 > 172.2.2.2.33000: . ack 3111835961 win 557 <...>

- New window size of 514 (65792) is advertised, up to 3111907217, 40 bytes
  below the last end:

IP 172.2.2.3.33000 > 172.2.2.2.33000: . 3113575668:3113577116(1448) ack 3111841425 win 514 <...>

The number 40 results from downscaling the remaining window:

3111907257 - 3111841425 = 65832
65832 / 2^7 = 514
65832 % 2^7 = 40

If the sender uses up the entire window before it is shrunk, this can have
chaotic effects on the connection. When sending ACKs, tcp_acceptable_seq()
will notice that the window has been shrunk since tcp_wnd_end() is before
tp->snd_nxt, which makes it choose tcp_wnd_end() as sequence number.
This will fail the receivers checks in tcp_sequence() however since it
is before it's tp->rcv_wup, making it respond with a dupack.

If both sides are in this condition, this leads to a constant flood of
ACKs until the connection times out.

Make sure the window is never shrunk by aligning the remaining window to
the window scaling factor.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-20 16:11:27 -07:00
Jarek Poplawski 8a455b087c netpoll: zap_completion_queue: adjust skb->users counter
zap_completion_queue() retrieves skbs from completion_queue where they have
zero skb->users counter.  Before dev_kfree_skb_any() it should be non-zero
yet, so it's increased now.

Reported-and-tested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-20 16:07:27 -07:00
Fabio Checconi 2bec008ca9 bridge: use time_before() in br_fdb_cleanup()
In br_fdb_cleanup() next_timer and this_timer are in jiffies, so they
should be compared using the time_after() macro.

Signed-off-by: Fabio Checconi <fabio@gandalf.sssup.it>
Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-20 15:54:58 -07:00
Vlad Yasevich 270637abff [SCTP]: Fix a race between module load and protosw access
There is a race is SCTP between the loading of the module
and the access by the socket layer to the protocol functions.
In particular, a list of addresss that SCTP maintains is
not initialized prior to the registration with the protosw.
Thus it is possible for a user application to gain access
to SCTP functions before everything has been initialized.
The problem shows up as odd crashes during connection
initializtion when we try to access the SCTP address list.

The solution is to refactor how we do registration and
initialize the lists prior to registering with the protosw.
Care must be taken since the address list initialization
depends on some other pieces of SCTP initialization.  Also
the clean-up in case of failure now also needs to be refactored.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Acked-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-20 15:17:14 -07:00
Daniel Hokka Zakrisson d0ebf13359 [NETFILTER]: ipt_recent: sanity check hit count
If a rule using ipt_recent is created with a hit count greater than
ip_pkt_list_tot, the rule will never match as it cannot keep track
of enough timestamps. This patch makes ipt_recent refuse to create such
rules.

With ip_pkt_list_tot's default value of 20, the following can be used
to reproduce the problem.

nc -u -l 0.0.0.0 1234 &
for i in `seq 1 100`; do echo $i | nc -w 1 -u 127.0.0.1 1234; done

This limits it to 20 packets:
iptables -A OUTPUT -p udp --dport 1234 -m recent --set --name test \
         --rsource
iptables -A OUTPUT -p udp --dport 1234 -m recent --update --seconds \
         60 --hitcount 20 --name test --rsource -j DROP

While this is unlimited:
iptables -A OUTPUT -p udp --dport 1234 -m recent --set --name test \
         --rsource
iptables -A OUTPUT -p udp --dport 1234 -m recent --update --seconds \
         60 --hitcount 21 --name test --rsource -j DROP

With the patch the second rule-set will throw an EINVAL.

Reported-by: Sean Kennedy <skennedy@vcn.com>
Signed-off-by: Daniel Hokka Zakrisson <daniel@hozac.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-20 15:07:10 -07:00
Roel Kluin 6aebb9b280 [NETFILTER]: nf_conntrack_h323: logical-bitwise & confusion in process_setup()
logical-bitwise & confusion

Signed-off-by: Roel Kluin <12o3l@tiscali.nl>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-20 15:06:23 -07:00
David S. Miller 2f633928cb Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 2008-03-17 23:44:31 -07:00
Al Viro 5e226e4d90 [IPV4]: esp_output() misannotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-17 22:50:23 -07:00
Al Viro 9534f035ee [8021Q]: vlan_dev misannotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-17 22:49:48 -07:00
Al Viro 27724426a9 [SUNRPC]: net/* NULL noise
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-17 22:48:03 -07:00
Al Viro bc92dd194d [SCTP]: fix misannotated __sctp_rcv_asconf_lookup()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-17 22:47:32 -07:00
Al Viro 0382b9c354 [PKT_SCHED]: annotate cls_u32
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-17 22:46:46 -07:00
Al Viro e6f1cebf71 [NET] endianness noise: INADDR_ANY
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-17 22:44:53 -07:00
Linus Torvalds 609eb39c8d Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (47 commits)
  [SCTP]: Fix local_addr deletions during list traversals.
  net: fix build with CONFIG_NET=n
  [TCP]: Prevent sending past receiver window with TSO (at last skb)
  rt2x00: Add new D-Link USB ID
  rt2x00: never disable multicast because it disables broadcast too
  libertas: fix the 'compare command with itself' properly
  drivers/net/Kconfig: fix whitespace for GELIC_WIRELESS entry
  [NETFILTER]: nf_queue: don't return error when unregistering a non-existant handler
  [NETFILTER]: nfnetlink_queue: fix EPERM when binding/unbinding and instance 0 exists
  [NETFILTER]: nfnetlink_log: fix EPERM when binding/unbinding and instance 0 exists
  [NETFILTER]: nf_conntrack: replace horrible hack with ksize()
  [NETFILTER]: nf_conntrack: add \n to "expectation table full" message
  [NETFILTER]: xt_time: fix failure to match on Sundays
  [NETFILTER]: nfnetlink_log: fix computation of netlink skb size
  [NETFILTER]: nfnetlink_queue: fix computation of allocated size for netlink skb.
  [NETFILTER]: nfnetlink: fix ifdef in nfnetlink_compat.h
  [NET]: include <linux/types.h> into linux/ethtool.h for __u* typedef
  [NET]: Make /proc/net a symlink on /proc/self/net (v3)
  RxRPC: fix rxrpc_recvmsg()'s returning of msg_name
  net/enc28j60: oops fix
  ...
2008-03-12 13:08:09 -07:00
Tom Tucker 3fedb3c5a8 SVCRDMA: Fix erroneous BUG_ON in send_write
The assertion that checks for sge context overflow is
incorrectly hard-coded to 32. This causes a kernel bug
check when using big-data mounts. Changed the BUG_ON to
use the computed value RPCSVC_MAXPAGES.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-12 12:37:34 -07:00
Tom Tucker c48cbb405c SVCRDMA: Add xprt refs to fix close/unmount crash
RDMA connection shutdown on an SMP machine can cause a kernel crash due
to the transport close path racing with the I/O tasklet.

Additional transport references were added as follows:
- A reference when on the DTO Q to avoid having the transport
  deleted while queued for I/O.
- A reference while there is a QP able to generate events.
- A reference until the DISCONNECTED event is received on the CM ID

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-12 12:37:34 -07:00
Chidambar 'ilLogict' Zinnoury 22626216c4 [SCTP]: Fix local_addr deletions during list traversals.
Since the lists are circular, we need to explicitely tag
the address to be deleted since we might end up freeing
the list head instead.  This fixes some interesting SCTP
crashes.

Signed-off-by: Chidambar 'ilLogict' Zinnoury <illogict@online.fr>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-11 18:05:02 -07:00
Ilpo Järvinen 5ea3a74806 [TCP]: Prevent sending past receiver window with TSO (at last skb)
With TSO it was possible to send past the receiver window when the skb
to be sent was the last in the write queue while the receiver window
is the limiting factor. One can notice that there's a loophole in the
tcp_mss_split_point that lacked a receiver window check for the
tcp_write_queue_tail() if also cwnd was smaller than the full skb.

Noticed by Thomas Gleixner <tglx@linutronix.de> in form of "Treason
uncloaked! Peer ... shrinks window .... Repaired."  messages (the peer
didn't actually shrink its window as the message suggests, we had just
sent something past it without a permission to do so).

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Tested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-11 17:55:27 -07:00
Patrick McHardy 94be1a3f36 [NETFILTER]: nf_queue: don't return error when unregistering a non-existant handler
Commit ce7663d84:

[NETFILTER]: nfnetlink_queue: don't unregister handler of other subsystem

changed nf_unregister_queue_handler to return an error when attempting to
unregister a queue handler that is not identical to the one passed in.
This is correct in case we really do have a different queue handler already
registered, but some existing userspace code always does an unbind before
bind and aborts if that fails, so try to be nice and return success in
that case.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-10 16:45:05 -07:00
Patrick McHardy 914afea84e [NETFILTER]: nfnetlink_queue: fix EPERM when binding/unbinding and instance 0 exists
Similar to the nfnetlink_log problem, nfnetlink_queue incorrectly
returns -EPERM when binding or unbinding to an address family and
queueing instance 0 exists and is owned by a different process. Unlike
nfnetlink_log it previously completes the operation, but it is still
incorrect.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-10 16:44:36 -07:00
Patrick McHardy b7047a1c88 [NETFILTER]: nfnetlink_log: fix EPERM when binding/unbinding and instance 0 exists
When binding or unbinding to an address family, the res_id is usually set
to zero. When logging instance 0 already exists and is owned by a different
process, this makes nfunl_recv_config return -EPERM without performing
the bind operation.

Since no operation on the foreign logging instance itself was requested,
this is incorrect. Move bind/unbind commands before the queue instance
permissions checks.

Also remove an incorrect comment.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-10 16:44:13 -07:00