redonkable/alistair23-linux

Author	SHA1	Message	Date
Johannes Berg	79c97e97ae	cfg80211: clean up naming once and for all We've named the registered devices 'drv' sometimes, thinking of "driver", which is not what it is, it's the internal representation of a wiphy, i.e. a device. Let's clean up the naming once and and use 'rdev' aka 'registered device' everywhere. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 15:02:33 -04:00
Johannes Berg	667503ddcb	cfg80211: fix locking Over time, a lot of locking issues have crept into the smarts of cfg80211, so e.g. scan completion can race against a new scan, IBSS join can race against leaving an IBSS, etc. Introduce a new per-interface lock that protects most of the per-interface data that we need to keep track of, and sprinkle assertions about that lock everywhere. Some things now need to be offloaded to work structs so that we don't require being able to sleep in functions the drivers call. The exception to that are the MLME callbacks (rx_auth etc.) that currently only mac80211 calls because it was easier to do that there instead of in cfg80211, and future drivers implementing those calls will, if they ever exist, probably need to use a similar scheme like mac80211 anyway... In order to be able to handle _deauth and _disassoc properly, introduce a cookie passed to it that will determine locking requirements. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 15:02:32 -04:00
Johannes Berg	4f5dadcebb	cfg80211: fix MFP bug, sparse warnings sparse warns about a number of things, and one of them (use_mfp shadowed variable) actually is a bug, fix all of them. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 15:02:32 -04:00
Johannes Berg	4d0c8aead3	cfg80211: properly name driver locking Currently we call that cfg80211_put_dev(), but that is misleading. With the new convention of using 'rdev' for registered_device variables, also call that function cfg80211_unlock_rdev(). Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 15:02:32 -04:00
Johannes Berg	c1e6fb1aad	cfg80211: warn again on spurious deauth The original code in mac80211 could send a deauth frame under certain circumstances even if nothing had ever requested an authentication. This has been fixed with the rework there, so cfg80211 can now warn again about spurious events to catch possible future drivers or mac80211 regressions. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 15:02:31 -04:00
Johannes Berg	cb0b4beb93	cfg80211: mlme API must be able to sleep After the mac80211 mlme cleanup, we can require that the MLME functions in cfg80211 can sleep. This will simplify future work in cfg80211 a lot. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 15:02:31 -04:00
Johannes Berg	7848547561	cfg80211: fix netdev down problem We shouldn't be looking at the ssid_len for non-IBSS, and for IBSS we should also return an error on trying to leave an IBSS while not in or joining an IBSS. This fixes an issue where we wouldn't disconnect() on an interface being taken down since there's no SSID configured this way. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 15:02:31 -04:00
Johannes Berg	c9cf01226e	mac80211: refactor the WEP code to be directly usable The new key work for cfg80211 will only give us the WEP key for shared auth to do that authentication, and not via the regular key settings, so we need to be able to encrypt a single frame in software, and that without a key struct. Thus, refactor the WEP code to not require a key structure but use the key, len and idx directly. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 15:02:31 -04:00
Johannes Berg	77fdaa12ce	mac80211: rework MLME for multiple authentications Sit tight. This shakes up the world as you know it. Let go of your spaghetti tongs, they will no longer be required, the horrible statemachine in net/mac80211/mlme.c is no more... With the cfg80211 SME mac80211 now has much less to keep track of, but, on the other hand, for FT it needs to be able to keep track of at least one authentication being in progress while associated. So convert from a single state machine to having small ones for all the different things we need to do. For real FT it will still need work wrt. PS, but this should be a good step. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 15:02:30 -04:00
Johannes Berg	a7c1cfc961	mac80211: remove dead code from mlme The ap_capab and last_probe struct members are unused. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 15:02:30 -04:00
Johannes Berg	3e5d7649a6	cfg80211: let SME control reassociation vs. association Since we don't really know that well in the kernel, let's let the SME control whether it wants to use reassociation or not, by allowing it to give the previous BSSID in the associate() parameters. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 15:02:30 -04:00
Johannes Berg	1be491fca1	rfkill: prep for rfkill API changes We've designed the /dev/rfkill API in a way that we can increase the event struct by adding members at the end, should it become necessary. To validate the events, userspace and the kernel need to have the proper event size to check for -- when reading from the other end they need to verify that it's at least version 1 of the event API, with the current struct size, so define a constant for that and make the code a little more 'future proof'. Not that I expect that we'll have to change the event size any time soon, but it's better to write the code in a way that lends itself to extending. Due to the current size of the event struct, the code is currently equivalent, but should the event struct ever need to be increased the new code might not need changing. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 15:02:29 -04:00
Samuel Ortiz	6c230c0270	cfg80211: check for current_bss from giwrate When connecting to an ESSID manually, we may not set the BSSID, and thus wdev->wext.connect.bssid will be NULL. wdev->current_bss is always updated when a connection is established so we should check it first. Signed-off-by: Samuel Ortiz <samuel.ortiz@intel.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 15:02:27 -04:00
Helmut Schaa	96f7e73938	mac80211: shorten the passive dwell time for sw scans mac80211's software scan implementation uses a passive dwell time of (HZ / 5) which means we stay 200ms on each passive channel. Compared to iwlwifi's hw scan and the old ipw* drivers which use values around 120ms this is quite long. Reducing the passive dwell time from 200ms to 125ms should save us something around a second on cards capable of 11a and we should still be able to catch beacons from most access points (assuming a ~100ms beacon interval). Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 15:02:27 -04:00
Johannes Berg	9834c079d1	cfg80211: fix giwrange "cfg80211: Advertise ciphers via WE according to driver capability" unfortunately broke iwrange -- it used the variable c that needs to be 0 for the channel list. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 15:01:54 -04:00
Johannes Berg	3dc27d25f2	nl80211: limit to one pairwise cipher for associate() In this case, only one cipher makes sense, unlike for connect() where it may be possible to have the card or driver select. No changes to mac80211 due to the way the structs are laid out -- but the loop in net/mac80211/cfg.c will degrade to just zero or one passes. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 15:01:54 -04:00
Johannes Berg	0a9b5e1795	cfg80211: refuse authenticating to same BSSID twice It is possible that there are different BSS structs with the same BSSID, but we cannot authenticate with multiple of them them because we need the BSSID to be unique for deauthenticating/disassociating. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 15:01:54 -04:00
Johannes Berg	19957bb399	cfg80211: keep track of BSSes In order to avoid problems with BSS structs going away while they're in use, I've long wanted to make cfg80211 keep track of them. Without the SME, that wasn't doable but now that we have the SME we can do this too. It can keep track of up to four separate authentications and one association, regardless of whether it's controlled by the cfg80211 SME or the userspace SME. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 15:01:53 -04:00
Johannes Berg	517357c685	cfg80211: assimilate and export ieee80211_bss_get_ie This function from mac80211 seems generally useful, and I will need it in cfg80211 soon. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 15:01:53 -04:00
Johannes Berg	0eb14647fc	cfg80211: reset auth algorithm When the interface is brought down, we need to reset the auth algorithm because wpa_supplicant doesn't reset it, and then we fail to use shared key auth when required later. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 15:01:53 -04:00
Johannes Berg	e45cd82ace	cfg80211: send events for userspace SME When the userspace SME is in control, we are currently not sending events, but this means that any userspace applications using wext or nl80211 to receive events will not know what's going on unless they can also interpret the nl80211 assoc event. Since we have all the required code, let the SME follow events from the userspace SME, this even means that you will be refused to connect() while the userspace SME is in control and connected. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 15:01:53 -04:00
Johannes Berg	ab1faead50	mac80211: remove dead code, clean up With mac80211 now always controlled by an external SME, a lot of code is dead -- SSID, BSSID, channel selection is always done externally, etc. Additionally, rename IEEE80211_STA_TKIP_WEP_USED to IEEE80211_STA_DISABLE_11N and clean up the code a bit. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 15:01:53 -04:00
Johannes Berg	6dc1cb0319	mac80211: remove auth algorithm retry The automatic auth algorithm issue is now solved in cfg80211, so mac80211 no longer needs code to try different algorithms -- just using whatever cfg80211 asked for is good. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 15:01:52 -04:00
Johannes Berg	ac00326e9d	mac80211: re-add HT disabling The IEEE80211_STA_TKIP_WEP_USED flag is used internally to disable HT when WEP or TKIP are used. Now that cfg80211 is giving us the required information, we can set the flag appropriately again. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 15:01:52 -04:00
Johannes Berg	8990646d2f	cfg80211: implement get_wireless_stats By dropping the noise reporting, we can implement wireless stats in cfg80211. We also make the handler return NULL if we have no information, which is possible thanks to the recent wext change. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 15:01:52 -04:00
Johannes Berg	9930380f0b	cfg80211: implement IWRATE For now, let's implement that using a very hackish way: simply mirror the wext API in the cfg80211 API. This will have to be changed later when we implement proper bitrate API. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 15:01:52 -04:00
Johannes Berg	ab737a4f7d	cfg80211: implement IWAP for WDS This implements siocsiwap/giwap for WDS mode. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 15:01:52 -04:00
Johannes Berg	bc92afd920	cfg80211: implement iwpower Just on/off and timeout, and with a hacky cfg80211 method until we figure out what we want, though this is probably sufficient as we want to use pm_qos for wifi everywhere. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 15:01:51 -04:00
Johannes Berg	f21293549f	cfg80211: managed mode wext compatibility This adds code to make it possible to use the cfg80211 connect() API with wireless extensions, and because the previous patch added emulation of that API with auth() and assoc(), by extension also supports wext on that. At the same time, removes code from mac80211 for wext, but doesn't yet clean up mac80211's mlme code more. Signed-off-by: Samuel Ortiz <samuel.ortiz@intel.com> Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 15:01:51 -04:00
Johannes Berg	6829c878ec	cfg80211: emulate connect with auth/assoc This adds code to cfg80211 so that drivers (mac80211 right now) that don't implement connect but rather auth/assoc can still be used with the nl80211 connect command. This will also be necessary for the wext compat code. Signed-off-by: Samuel Ortiz <samuel.ortiz@intel.com> Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 15:01:51 -04:00
Samuel Ortiz	b23aa676ab	cfg80211: connect/disconnect API This patch introduces the cfg80211 connect/disconnect API. The goal here is to run the AUTH and ASSOC steps in one call. This is needed for some fullmac cards that run both steps directly from the target, after the host driver sends a connect command. Additionally, all the new crypto parameters for connect() are now also valid for associate() -- although associate requires the IEs to be used, the information can be useful for drivers and should be given. Signed-off-by: Samuel Ortiz <samuel.ortiz@intel.com> Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 15:01:51 -04:00
Johannes Berg	3f65b24536	mac80211: remove an unused function declaration The ieee80211_scan_results function hasn't existed for a long time now, so its declaration should be removed as well. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 15:01:50 -04:00
Johannes Berg	aff89a9b90	cfg80211: introduce nl80211 testmode command This introduces a new NL80211_CMD_TESTMODE for testing and calibration use with nl80211. There's no multiplexing like like iwpriv had, and the command is not available by default, it needs to be explicitly enabled in Kconfig and shouldn't be enabled in most kernels. The command requires a wiphy index or interface index to identify the device to operate on, and the new TESTDATA attribute. There also is API for sending replies to the command, and testmode multicast messages (on a testmode multicast group). I've also updated mac80211 to be able to pass through the command to the driver, since it itself doesn't implement the testmode command. Additionally, to give people an idea of how to use the command, I've added a little code to hwsim that makes use of the new command to set the powersave mode, this is currently done via debugfs and should remain there, and the testmode command only serves as an example of how to use this best -- with nested netlink attributes in the TESTDATA attribute. A hwsim testmode tool can be found at http://git.sipsolutions.net/hwsim.git/. This tool is BSD licensed so people can easily use it as a basis for their own internal fabrication and validation tools. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 15:01:50 -04:00
Johannes Berg	5121ea0481	wext: constify extra argument to wireless_send_event This is never changed by the function, so can be marked const. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 15:01:49 -04:00
Johannes Berg	0575606b08	mac80211: tell SME about real auth state When the auth algorithm is rejected, but we don't have another one to try, we will eventually retry but that isn't useful -- we'll then do it again and again until we eventually give up. Instead, we should let the SME know and go into disabled state. The same applies for situations where the AP rejects with any other status code. Additionally, when trying the next auth algorithm, we should reset the auth_tries so that just a single lost frame doesn't lead to us giving up on the third auth algorithm. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 15:01:49 -04:00
Johannes Berg	7ebbe6bd51	cfg80211: remove wireless_dev->bssid This variable isn't necessary -- the wext code keeps track of the BSSID itself, and otherwise we have current_bss. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 15:01:49 -04:00
Johannes Berg	e6d6e3420d	cfg80211: use proper allocation flags Instead of hardcoding GFP_ATOMIC everywhere, add a new function parameter that gets the flags from the caller. Obviously then I need to update all callers (all of them in mac80211), and it turns out that now it's ok to use GFP_KERNEL in almost all places. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 15:01:49 -04:00
Johannes Berg	dad8233021	nl80211: clean up function definitions I don't like the 'extern' keyword much when it's not necessary, it makes lines rather long. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 15:01:48 -04:00
Johannes Berg	2a783c136b	cfg80211: move break statement to correct place Move a break statement to the correct place _after_ the #endif, otherwise w/o WIRELESS_EXT things break badly. Also, while touching this code, do a cleanup and assign dev->ieee80211_ptr to a new variable. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 15:01:48 -04:00
Johannes Berg	898324025f	wext: default to y The way I initially thought we could do wireless extensions is by making all the compat code in cfg80211 be independent of CONFIG_WIRELESS_EXT, but this is turning out to not be feasible. Therefore, fix the Kconfig help text and make the option default to yes, so people won't get a nasty surprise when mac80211 will get rid of its 'select WIRELESS_EXT' any time now. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 15:01:48 -04:00
Johannes Berg	d3cebbdced	mac80211: fix todo lock The key todo lock can be taken from different locks that require it to be _bh to avoid lock inversion due to (soft)irqs. This should fix the two problems reported by Bob and Gabor: http://mid.gmane.org/20090619113049.GB18956@hash.localnet http://mid.gmane.org/4A3FA376.8020307@openwrt.org Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Cc: Bob Copeland <me@bobcopeland.com> Cc: Gabor Juhos <juhosg@openwrt.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 15:01:48 -04:00
Johannes Berg	df2b35b65b	wext: allow returning NULL stats Currently, wext drivers cannot return NULL for stats even though that would make the ioctl return -EOPNOTSUPP because that would mean they are no longer listed in /proc/net/wireless. This patch changes the wext core's behaviour to list them if they have any wireless_handlers, but only show their stats when available, so that drivers can start returning NULL if stats are currently not available, reducing confusion for e.g. IBSS. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 15:01:48 -04:00
Johannes Berg	f58d4ed98b	cfg80211: send wext MLME-MICHAELMICFAILURE.indication Instead of having mac80211 do it itself. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 15:01:47 -04:00
David Kilroy	27bea66c22	cfg80211: infer WPA and WPA2 support from TKIP and CCMP Signed-off-by: David Kilroy <kilroyd@googlemail.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 15:01:42 -04:00
David Kilroy	2ab658f9ce	cfg80211: set WE encoding size based on available ciphers Only set the sizes for WEP40 and WEP104. Signed-off-by: David Kilroy <kilroyd@googlemail.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 15:01:42 -04:00
David Kilroy	51cd4aabd0	cfg80211: allow drivers that can't scan for specific ssids Signed-off-by: David Kilroy <kilroyd@googlemail.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 15:01:42 -04:00
David Kilroy	3daf097594	cfg80211: Advertise ciphers via WE according to driver capability Signed-off-by: David Kilroy <kilroyd@googlemail.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 15:01:42 -04:00
Johannes Berg	83f5e2cf79	cfg80211: prohibit scanning the same channel more than once It isn't very useful to scan the same channel more than once during a given scan, and some hardware (notably iwlwifi) can only scan a limited number of channels at a time. To prevent any overflows, simply disallow scanning any channel multiple times in a given scan command. This is a small change in the userspace ABI, but the only user, wpa_supplicant, never asks for a scan with the same frequency listed twice. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 14:57:54 -04:00
Johannes Berg	386aa23dd5	mac80211: improve per-sta debugfs We had code for a number of files, that we didn't publish in debugfs, fix that. Also make the agg_status file layout more readable and add more information to it. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 14:57:54 -04:00
Johannes Berg	f1d58c2521	mac80211: push rx status into skb->cb Within mac80211, we often need to copy the rx status into skb->cb. This is wasteful, as drivers could be building it in there to start with. This patch changes the API so that drivers are expected to pass the RX status in skb->cb, now accessible as IEEE80211_SKB_RXCB(skb). It also updates all drivers to pass the rx status in there, but only by making them memcpy() it into place before the call to the receive function (ieee80211_rx(_irqsafe)). Each driver can now be optimised on its own schedule. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 14:57:54 -04:00
Johannes Berg	a538e2d5a3	cfg80211: issue netlink notification when scan starts To ease multiple apps working together smoothly, send a notification when a scan is started. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 14:57:53 -04:00
Johannes Berg	e36d56b648	cfg80211: pass netdev to change_virtual_intf If there was a reason I'm passing the ifidx I cannot remember it any more and don't see one now, so let's just pass the pointer itself. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-10 14:57:38 -04:00
David S. Miller	e5a8a896f5	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6	2009-07-09 20:18:24 -07:00
Jiri Olsa	a57de0b433	net: adding memory barrier to the poll and receive callbacks Adding memory barrier after the poll_wait function, paired with receive callbacks. Adding fuctions sock_poll_wait and sk_has_sleeper to wrap the memory barrier. Without the memory barrier, following race can happen. The race fires, when following code paths meet, and the tp->rcv_nxt and __add_wait_queue updates stay in CPU caches. CPU1 CPU2 sys_select receive packet ... ... __add_wait_queue update tp->rcv_nxt ... ... tp->rcv_nxt check sock_def_readable ... { schedule ... if (sk->sk_sleep && waitqueue_active(sk->sk_sleep)) wake_up_interruptible(sk->sk_sleep) ... } If there was no cache the code would work ok, since the wait_queue and rcv_nxt are opposit to each other. Meaning that once tp->rcv_nxt is updated by CPU2, the CPU1 either already passed the tp->rcv_nxt check and sleeps, or will get the new value for tp->rcv_nxt and will return with new data mask. In both cases the process (CPU1) is being added to the wait queue, so the waitqueue_active (CPU2) call cannot miss and will wake up CPU1. The bad case is when the __add_wait_queue changes done by CPU1 stay in its cache, and so does the tp->rcv_nxt update on CPU2 side. The CPU1 will then endup calling schedule and sleep forever if there are no more data on the socket. Calls to poll_wait in following modules were ommited: net/bluetooth/af_bluetooth.c net/irda/af_irda.c net/irda/irnet/irnet_ppp.c net/mac80211/rc80211_pid_debugfs.c net/phonet/socket.c net/rds/af_rds.c net/rfkill/core.c net/sunrpc/cache.c net/sunrpc/rpc_pipe.c net/tipc/socket.c Signed-off-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-07-09 17:06:57 -07:00
Anton Vorontsov	1b614fb9a0	netpoll: Fix carrier detection for drivers that are using phylib Using early netconsole and gianfar driver this error pops up: netconsole: timeout waiting for carrier It appears that net/core/netpoll.c:netpoll_setup() is using cond_resched() in a loop waiting for a carrier. The thing is that cond_resched() is a no-op when system_state != SYSTEM_RUNNING, and so drivers/net/phy/phy.c's state_queue is never scheduled, therefore link detection doesn't work. I belive that the main problem is in cond_resched()[1], but despite how the cond_resched() story ends, it might be a good idea to call msleep(1) instead of cond_resched(), as suggested by Andrew Morton. [1] http://lkml.org/lkml/2009/7/7/463 Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-07-08 20:09:44 -07:00
David S. Miller	d2daeabf62	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6	2009-07-08 18:13:13 -07:00
Anton Vorontsov	bff38771e1	netpoll: Introduce netpoll_carrier_timeout kernel option Some PHYs require longer timeouts for carrier detection, and auto-negotiation process may take indefinite amount of time. It may be inconvenient to force longer timeouts for sane PHYs, so let's introduce a kernel command line option. Since we're using module_param(), the option also can be changed in runtime. Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-07-08 11:10:56 -07:00
Jarek Poplawski	345aa03120	ipv4: Fix fib_trie rebalancing, part 4 (root thresholds) Pawel Staszewski wrote: <blockquote> Some time ago i report this: http://bugzilla.kernel.org/show_bug.cgi?id=6648 and now with 2.6.29 / 2.6.29.1 / 2.6.29.3 and 2.6.30 it back dmesg output: oprofile: using NMI interrupt. Fix inflate_threshold_root. Now=15 size=11 bits ... Fix inflate_threshold_root. Now=15 size=11 bits cat /proc/net/fib_triestat Basic info: size of leaf: 40 bytes, size of tnode: 56 bytes. Main: Aver depth: 2.28 Max depth: 6 Leaves: 276539 Prefixes: 289922 Internal nodes: 66762 1: 35046 2: 13824 3: 9508 4: 4897 5: 2331 6: 1149 7: 5 9: 1 18: 1 Pointers: 691228 Null ptrs: 347928 Total size: 35709 kB </blockquote> It seems, the current threshold for root resizing is too aggressive, and it causes misleading warnings during big updates, but it might be also responsible for memory problems, especially with non-preempt configs, when RCU freeing is delayed long after call_rcu. It should be also mentioned that because of non-atomic changes during resizing/rebalancing the current lookup algorithm can miss valid leaves so it's additional argument to shorten these activities even at a cost of a minimally longer searching. This patch restores values before the patch "[IPV4]: fib_trie root node settings", commit: `965ffea43d` from v2.6.22. Pawel's report: <blockquote> I dont see any big change of (cpu load or faster/slower routing/propagating routes from bgpd or something else) - in avg there is from 2% to 3% more of CPU load i dont know why but it is - i change from "preempt" to "no preempt" 3 times and check this my "mpstat -P ALL 1 30" always avg cpu load was from 2 to 3% more compared to "no preempt" [...] cat /proc/net/fib_triestat Basic info: size of leaf: 20 bytes, size of tnode: 36 bytes. Main: Aver depth: 2.44 Max depth: 6 Leaves: 277814 Prefixes: 291306 Internal nodes: 66420 1: 32737 2: 14850 3: 10332 4: 4871 5: 2313 6: 942 7: 371 8: 3 17: 1 Pointers: 599098 Null ptrs: 254865 Total size: 18067 kB </blockquote> According to this and other similar reports average depth is slightly increased (~0.2), and root nodes are shorter (log 17 vs. 18), but there is no visible performance decrease. So, until memory handling is improved or added parameters for changing this individually, this patch resets to safer defaults. Reported-by: Pawel Staszewski <pstaszewski@itcare.pl> Reported-by: Jorge Boncompte [DTI2] <jorge@dti2.net> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Tested-by: Pawel Staszewski <pstaszewski@itcare.pl> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-07-08 10:46:45 -07:00
Luciano Coelho	3938b45c1c	mac80211: minstrel: avoid accessing negative indices in rix_to_ndx() If rix is not found in mi->r[], i will become -1 after the loop. This value is eventually used to access arrays, so we were accessing arrays with a negative index, which is obviously not what we want to do. This patch fixes this potential problem. Signed-off-by: Luciano Coelho <luciano.coelho@nokia.com> Acked-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-07 12:55:28 -04:00
Johannes Berg	2dce4c2b5f	cfg80211: fix refcount leak The code in cfg80211's cfg80211_bss_update erroneously grabs a reference to the BSS, which means that it will never be freed. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Cc: stable@kernel.org [2.6.29, 2.6.30] Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-07 12:55:28 -04:00
Andrey Yurovsky	59615b5f9d	mac80211: fix allocation in mesh_queue_preq We allocate a PREQ queue node in mesh_queue_preq, however the allocation may cause us to sleep. Use GFP_ATOMIC to prevent this. [ 1869.126498] BUG: scheduling while atomic: ping/1859/0x10000100 [ 1869.127164] Modules linked in: ath5k mac80211 ath [ 1869.128310] Pid: 1859, comm: ping Not tainted 2.6.30-wl #1 [ 1869.128754] Call Trace: [ 1869.129293] [<c1023a2b>] __schedule_bug+0x48/0x4d [ 1869.129866] [<c13b5533>] __schedule+0x77/0x67a [ 1869.130544] [<c1026f2e>] ? release_console_sem+0x17d/0x185 [ 1869.131568] [<c807cf47>] ? mesh_queue_preq+0x2b/0x165 [mac80211] [ 1869.132318] [<c13b5b3e>] schedule+0x8/0x1f [ 1869.132807] [<c1023c12>] __cond_resched+0x16/0x2f [ 1869.133478] [<c13b5bf0>] _cond_resched+0x27/0x32 [ 1869.134191] [<c108a370>] kmem_cache_alloc+0x1c/0xcf [ 1869.134714] [<c10273ae>] ? printk+0x15/0x17 [ 1869.135670] [<c807cf47>] mesh_queue_preq+0x2b/0x165 [mac80211] [ 1869.136731] [<c807d1f8>] mesh_nexthop_lookup+0xee/0x12d [mac80211] [ 1869.138130] [<c807417e>] ieee80211_xmit+0xe6/0x2b2 [mac80211] [ 1869.138935] [<c80be46d>] ? ath5k_hw_setup_rx_desc+0x0/0x66 [ath5k] [ 1869.139831] [<c80c97bc>] ? ath5k_tasklet_rx+0xba/0x506 [ath5k] [ 1869.140863] [<c8075191>] ieee80211_subif_start_xmit+0x6c9/0x6e4 [mac80211] [ 1869.141665] [<c105cf1c>] ? handle_level_irq+0x78/0x9d [ 1869.142390] [<c12e3f93>] dev_hard_start_xmit+0x168/0x1c7 [ 1869.143092] [<c12f1f17>] __qdisc_run+0xe1/0x1b7 [ 1869.143612] [<c12e25ff>] qdisc_run+0x18/0x1a [ 1869.144248] [<c12e62f4>] dev_queue_xmit+0x16a/0x25a [ 1869.144785] [<c13b6dcc>] ? _read_unlock_bh+0xe/0x10 [ 1869.145465] [<c12eacdb>] neigh_resolve_output+0x19c/0x1c7 [ 1869.146182] [<c130e2da>] ? ip_finish_output+0x0/0x51 [ 1869.146697] [<c130e2a0>] ip_finish_output2+0x182/0x1bc [ 1869.147358] [<c130e327>] ip_finish_output+0x4d/0x51 [ 1869.147863] [<c130e9d5>] ip_output+0x80/0x85 [ 1869.148515] [<c130cc49>] dst_output+0x9/0xb [ 1869.149141] [<c130dec6>] ip_local_out+0x17/0x1a [ 1869.149632] [<c130e0bc>] ip_push_pending_frames+0x1f3/0x255 [ 1869.150343] [<c13247ff>] raw_sendmsg+0x5e6/0x667 [ 1869.150883] [<c1033c55>] ? insert_work+0x6a/0x73 [ 1869.151834] [<c8071e00>] ? ieee80211_invoke_rx_handlers+0x17da/0x1ae8 [mac80211] [ 1869.152630] [<c132bd68>] inet_sendmsg+0x3b/0x48 [ 1869.153232] [<c12d7deb>] __sock_sendmsg+0x45/0x4e [ 1869.153740] [<c12d8537>] sock_sendmsg+0xb8/0xce [ 1869.154519] [<c80be46d>] ? ath5k_hw_setup_rx_desc+0x0/0x66 [ath5k] [ 1869.155289] [<c1036b25>] ? autoremove_wake_function+0x0/0x30 [ 1869.155859] [<c115992b>] ? __copy_from_user_ll+0x11/0xce [ 1869.156573] [<c1159d99>] ? copy_from_user+0x31/0x54 [ 1869.157235] [<c12df646>] ? verify_iovec+0x40/0x6e [ 1869.157778] [<c12d869a>] sys_sendmsg+0x14d/0x1a5 [ 1869.158714] [<c8072c40>] ? __ieee80211_rx+0x49e/0x4ee [mac80211] [ 1869.159641] [<c80c83fe>] ? ath5k_rxbuf_setup+0x6d/0x8d [ath5k] [ 1869.160543] [<c80be46d>] ? ath5k_hw_setup_rx_desc+0x0/0x66 [ath5k] [ 1869.161434] [<c80beba4>] ? ath5k_hw_get_rxdp+0xe/0x10 [ath5k] [ 1869.162319] [<c80c97bc>] ? ath5k_tasklet_rx+0xba/0x506 [ath5k] [ 1869.163063] [<c1005627>] ? enable_8259A_irq+0x40/0x43 [ 1869.163594] [<c101edb8>] ? __dequeue_entity+0x23/0x27 [ 1869.164793] [<c100187a>] ? __switch_to+0x2b/0x105 [ 1869.165442] [<c1021d5f>] ? finish_task_switch+0x5b/0x74 [ 1869.166129] [<c12d963a>] sys_socketcall+0x14b/0x17b [ 1869.166612] [<c1002b95>] syscall_call+0x7/0xb Signed-off-by: Andrey Yurovsky <andrey@cozybit.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-07 12:55:27 -04:00
Jiri Slaby	1f5fc70a25	Wireless: nl80211, fix lock imbalance Don't forget to unlock cfg80211_mutex in one fail path of nl80211_set_wiphy. Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-07-07 12:55:25 -04:00
Mark Smith	482d804cb4	econet: use NET_RX_SUCCESS instead of magic number 0 for econet_rcv successful return Signed-off-by: Mark Smith <markzzzsmith@yahoo.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-07-06 18:07:59 -07:00
Mark Smith	5c91face51	ipv6: correct return on ipv6_rcv() packet drop The routine ipv6_rcv() uses magic number 0 for a return when it drops a packet. This corresponds to NET_RX_SUCCESS, which is obviously incorrect. Correct this by using NET_RX_DROP instead. ps. It isn't exactly clear who the IPv6 maintainers are, apologies if I've missed any. Signed-off-by: Mark Smith <markzzzsmith@yahoo.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-07-06 18:07:55 -07:00
Wei Yongjun	1bc4ee4088	sctp: fix warning at inet_sock_destruct() while release sctp socket Commit 'net: Move rx skb_orphan call to where needed' broken sctp protocol with warning at inet_sock_destruct(). Actually, sctp can do this right with sctp_sock_rfree_frag() and sctp_skb_set_owner_r_frag() pair. sctp_sock_rfree_frag(skb); sctp_skb_set_owner_r_frag(skb, newsk); This patch not revert the commit `d55d87fdff`, instead remove the sctp_sock_rfree_frag() function. ------------[ cut here ]------------ WARNING: at net/ipv4/af_inet.c:151 inet_sock_destruct+0xe0/0x142() Modules linked in: sctp ipv6 dm_mirror dm_region_hash dm_log dm_multipath scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd [last unloaded: scsi_wait_scan] Pid: 1808, comm: sctp_test Not tainted 2.6.31-rc2 #40 Call Trace: [<c042dd06>] warn_slowpath_common+0x6a/0x81 [<c064a39a>] ? inet_sock_destruct+0xe0/0x142 [<c042dd2f>] warn_slowpath_null+0x12/0x15 [<c064a39a>] inet_sock_destruct+0xe0/0x142 [<c05fde44>] __sk_free+0x19/0xcc [<c05fdf50>] sk_free+0x18/0x1a [<ca0d14ad>] sctp_close+0x192/0x1a1 [sctp] [<c0649f7f>] inet_release+0x47/0x4d [<c05fba4d>] sock_release+0x19/0x5e [<c05fbab3>] sock_close+0x21/0x25 [<c049c31b>] __fput+0xde/0x189 [<c049c3de>] fput+0x18/0x1a [<c049988f>] filp_close+0x56/0x60 [<c042f422>] put_files_struct+0x5d/0xa1 [<c042f49f>] exit_files+0x39/0x3d [<c043086a>] do_exit+0x1a5/0x5dd [<c04a86c2>] ? d_kill+0x35/0x3b [<c0438fa4>] ? dequeue_signal+0xa6/0x115 [<c0430d05>] do_group_exit+0x63/0x8a [<c0439504>] get_signal_to_deliver+0x2e1/0x2f9 [<c0401d9e>] do_notify_resume+0x7c/0x6b5 [<c043f601>] ? autoremove_wake_function+0x0/0x34 [<c04a864e>] ? __d_free+0x3d/0x40 [<c04a867b>] ? d_free+0x2a/0x3c [<c049ba7e>] ? vfs_write+0x103/0x117 [<c05fc8fa>] ? sys_socketcall+0x178/0x182 [<c0402a56>] work_notifysig+0x13/0x19 ---[ end trace 9db92c463e789fba ]--- Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-07-06 12:47:08 -07:00
Patrick McHardy	ec634fe328	net: convert remaining non-symbolic return values in ndo_start_xmit() functions This patch converts the remaining occurences of raw return values to their symbolic counterparts in ndo_start_xmit() functions that were missed by the previous automatic conversion. Additionally code that assumed the symbolic value of NETDEV_TX_OK to be zero is changed to explicitly use NETDEV_TX_OK. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-07-05 19:23:38 -07:00
Cyrill Gorcunov	1490fd8947	net, bridge: align br_nf_ops assignment No functional change -- just for easier reading. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-07-05 19:16:14 -07:00
oscar.medina@motorola.com	6650613d33	tipc: Add socket options to get number of queued messages This patch allows a TIPC application to determine the number of messages currently waiting in a socket's receive queue (TIPC_SOCK_RECVQ_DEPTH) or in all TIPC socket receive queues (TIPC_NODE_RECVQ_DEPTH). Signed-off-by: Oscar Medina <oscar.medina@motorola.com> Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-07-05 19:16:11 -07:00
Patrick McHardy	6ed106549d	net: use NETDEV_TX_OK instead of 0 in ndo_start_xmit() functions This patch is the result of an automatic spatch transformation to convert all ndo_start_xmit() return values of 0 to NETDEV_TX_OK. Some occurences are missed by the automatic conversion, those will be handled in a seperate patch. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-07-05 19:16:04 -07:00
Florian Westphal	0e8635a8e1	net: remove NET_RX_BAD and NET_RX_CN* defines almost no users in the tree; and the few that use them treat them like NET_RX_DROP. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-07-05 19:15:35 -07:00
Stephane Contri	1ded3f59f3	dsa: fix 88e6xxx statistics counter snapshotting The bit that tells us whether a statistics counter snapshot operation has completed is located in the GLOBAL register block, not in the GLOBAL2 register block, so fix up mv88e6xxx_stats_wait() to poll the right register address. Signed-off-by: Stephane Contri <Stephane.Contri@grassvalley.com> Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Cc: stable@kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>	2009-07-05 18:03:35 -07:00
Brian Haley	a1ed05263b	IPv6: preferred lifetime of address not getting updated There's a bug in addrconf_prefix_rcv() where it won't update the preferred lifetime of an IPv6 address if the current valid lifetime of the address is less than 2 hours (the minimum value in the RA). For example, If I send a router advertisement with a prefix that has valid lifetime = preferred lifetime = 2 hours we'll build this address: 3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000 inet6 2001:1890:1109:a20:217:8ff:fe7d:4718/64 scope global dynamic valid_lft 7175sec preferred_lft 7175sec If I then send the same prefix with valid lifetime = preferred lifetime = 0 it will be ignored since the minimum valid lifetime is 2 hours: 3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000 inet6 2001:1890:1109:a20:217:8ff:fe7d:4718/64 scope global dynamic valid_lft 7161sec preferred_lft 7161sec But according to RFC 4862 we should always reset the preferred lifetime even if the valid lifetime is invalid, which would cause the address to immediately get deprecated. So with this patch we'd see this: 5: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000 inet6 2001:1890:1109:a20:21f:29ff:fe5a:ef04/64 scope global deprecated dynamic valid_lft 7163sec preferred_lft 0sec The comment winds-up being 5x the size of the code to fix the problem. Update the preferred lifetime of IPv6 addresses derived from a prefix info option in a router advertisement even if the valid lifetime in the option is invalid, as specified in RFC 4862 Section 5.5.3e. Fixes an issue where an address will not immediately become deprecated. Reported by Jens Rosenboom. Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-07-03 19:10:13 -07:00
Wei Yongjun	59cae0092e	xfrm6: fix the proto and ports decode of sctp protocol The SCTP pushed the skb above the sctp chunk header, so the check of pskb_may_pull(skb, nh + offset + 1 - skb->data) in _decode_session6() will never return 0 and the ports decode of sctp will always fail. (nh + offset + 1 - skb->data < 0) Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-07-03 19:10:10 -07:00
Wei Yongjun	c615c9f3f3	xfrm4: fix the ports decode of sctp protocol The SCTP pushed the skb data above the sctp chunk header, so the check of pskb_may_pull(skb, xprth + 4 - skb->data) in _decode_session4() will never return 0 because xprth + 4 - skb->data < 0, the ports decode of sctp will always fail. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-07-03 19:10:06 -07:00
Abhishek Kulkarni	15da4b1612	net/9p: Fix crash due to bad mount parameters. It is not safe to use match_int without checking the token type returned by match_token (especially when the token type returned is Opt_err and args is empty). Fix it. Signed-off-by: Abhishek Kulkarni <adkulkar@umail.iu.edu> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-07-02 13:17:01 -07:00
Eric W. Biederman	f8a68e752b	Revert "ipv4: arp announce, arp_proxy and windows ip conflict verification" This reverts commit `73ce7b01b4`. After discovering that we don't listen to gratuitious arps in 2.6.30 I tracked the failure down to this commit. The patch makes absolutely no sense. RFC2131 RFC3927 and RFC5227. are all in agreement that an arp request with sip == 0 should be used for the probe (to prevent learning) and an arp request with sip == tip should be used for the gratitous announcement that people can learn from. It appears the author of the broken patch got those two cases confused and modified the code to drop all gratuitous arp traffic. Ouch! Cc: stable@kernel.org Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-30 19:47:08 -07:00
Jarek Poplawski	008440e3ad	ipv4: Fix fib_trie rebalancing, part 3 Alas current delaying of freeing old tnodes by RCU in trie_rebalance is still not enough because we can free a top tnode before updating a t->trie pointer. Reported-by: Pawel Staszewski <pstaszewski@itcare.pl> Tested-by: Pawel Staszewski <pstaszewski@itcare.pl> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-30 12:48:38 -07:00
Wei Yongjun	ff0ac74afb	sctp: xmit sctp packet always return no route error Commit 'net: skb->dst accessors'(`adf30907d6`) broken the sctp protocol stack, the sctp packet can never be sent out after Eric Dumazet's patch, which have typo in the sctp code. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Vlad Yasevich <vladisalv.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-29 19:41:53 -07:00
Wei Yongjun	1802571b98	xfrm: use xfrm_addr_cmp() instead of compare addresses directly Clean up to use xfrm_addr_cmp() instead of compare addresses directly. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-29 19:41:46 -07:00
Herbert Xu	6828b92bd2	tcp: Do not tack on TSO data to non-TSO packet If a socket starts out on a non-TSO route, and then switches to a TSO route, then we will tack on data to the tail of the tx queue even if it started out life as non-TSO. This is suboptimal because all of it will then be copied and checksummed unnecessarily. This patch fixes this by ensuring that skb->ip_summed is set to CHECKSUM_PARTIAL before appending extra data beyond the MSS. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-29 19:41:43 -07:00
Herbert Xu	8e5b9dda99	tcp: Stop non-TSO packets morphing into TSO If a socket starts out on a non-TSO route, and then switches to a TSO route, then the tail on the tx queue can morph into a TSO packet, causing mischief because the rest of the stack does not expect a partially linear TSO packet. This patch fixes this by ensuring that skb->ip_summed is set to CHECKSUM_PARTIAL before declaring a packet as TSO. Reported-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-29 19:41:39 -07:00
David S. Miller	9c0346bd08	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lowpan/lowpan	2009-06-29 19:23:53 -07:00
David S. Miller	53bd9728bf	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6	2009-06-29 19:22:31 -07:00
Dmitry Eremin-Solenikov	dfd06fe824	nl802154: add module license and description Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>	2009-06-29 18:20:28 +04:00
Dmitry Eremin-Solenikov	932c1329ac	nl802154: fix Oops in ieee802154_nl_get_dev ieee802154_nl_get_dev() lacks check for the existance of the device that was returned by dev_get_XXX, thus resulting in Oops for non-existing devices. Fix it. Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>	2009-06-29 18:20:27 +04:00
Jan Engelhardt	d6d3f08b0f	netfilter: xtables: conntrack match revision 2 As reported by Philip, the UNTRACKED state bit does not fit within the 8-bit state_mask member. Enlarge state_mask and give status_mask a few more bits too. Reported-by: Philip Craig <philipc@snapgear.com> References: http://markmail.org/thread/b7eg6aovfh4agyz7 Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-06-29 14:31:46 +02:00
Patrick McHardy	a3a9f79e36	netfilter: tcp conntrack: fix unacknowledged data detection with NAT When NAT helpers change the TCP packet size, the highest seen sequence number needs to be corrected. This is currently only done upwards, when the packet size is reduced the sequence number is unchanged. This causes TCP conntrack to falsely detect unacknowledged data and decrease the timeout. Fix by updating the highest seen sequence number in both directions after packet mangling. Tested-by: Krzysztof Piotr Oledzki <ole@ans.pl> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-06-29 14:07:56 +02:00
Herbert Xu	ff780cd8f2	gro: Flush GRO packets in napi_disable_pending path When NAPI is disabled while we're in net_rx_action, we end up calling __napi_complete without flushing GRO packets. This is a bug as it would cause the GRO packets to linger, of course it also literally BUGs to catch error like this :) This patch changes it to napi_complete, with the obligatory IRQ reenabling. This should be safe because we've only just disabled IRQs and it does not materially affect the test conditions in between. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-26 19:27:04 -07:00
Herbert Xu	71f9dacd2e	inet: Call skb_orphan before tproxy activates As transparent proxying looks up the socket early and assigns it to the skb for later processing, we must drop any existing socket ownership prior to that in order to distinguish between the case where tproxy is active and where it is not. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-26 19:22:37 -07:00
Jesper Dangaard Brouer	4a27096bbe	mac80211: Use rcu_barrier() on unload. The mac80211 module uses rcu_call() thus it should use rcu_barrier() on module unload. The rcu_barrier() is placed in mech.c ieee80211_stop_mesh() which is invoked from ieee80211_stop() in case vif.type == NL80211_IFTYPE_MESH_POINT. Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-26 13:51:36 -07:00
Jesper Dangaard Brouer	75de874f5c	sunrpc: Use rcu_barrier() on unload. The sunrpc module uses rcu_call() thus it should use rcu_barrier() on module unload. Have not verified that the possibility for new call_rcu() callbacks has been disabled. As a hint for checking, the functions calling call_rcu() (unx_destroy_cred and generic_destroy_cred) are registered as crdestroy function pointer in struct rpc_credops. Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-26 13:51:34 -07:00
Jesper Dangaard Brouer	473c22d759	bridge: Use rcu_barrier() instead of syncronize_net() on unload. When unloading modules that uses call_rcu() callbacks, then we must use rcu_barrier(). This module uses syncronize_net() which is not enough to be sure that all callback has been completed. Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-26 13:51:32 -07:00
Jesper Dangaard Brouer	1f2ccd00f2	ipv6: Use rcu_barrier() on module unload. The ipv6 module uses rcu_call() thus it should use rcu_barrier() on module unload. Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-26 13:51:31 -07:00
Jesper Dangaard Brouer	10e8544801	decnet: Use rcu_barrier() on module unload. The decnet module unloading as been disabled with a '#if 0' statement, because it have had issues. We add a rcu_barrier() anyhow for correctness. The maintainer (Chrissie Caulfield) will look into the unload issue when time permits. Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Chrissie Caulfield <christine.caulfield@googlemail.com> Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-26 13:51:27 -07:00
Jens Rosenboom	a1faa69810	ipv6: avoid wraparound for expired preferred lifetime Avoid showing wrong high values when the preferred lifetime of an address is expired. Signed-off-by: Jens Rosenboom <me@jayr.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-25 20:03:50 -07:00
Wei Yongjun	1ac530b355	tcp: missing check ACK flag of received segment in FIN-WAIT-2 state RFC0793 defined that in FIN-WAIT-2 state if the ACK bit is off drop the segment and return[Page 72]. But this check is missing in function tcp_timewait_state_process(). This cause the segment with FIN flag but no ACK has two diffent action: Case 1: Node A Node B <------------- FIN,ACK (enter FIN-WAIT-1) ACK -------------> (enter FIN-WAIT-2) FIN -------------> discard (move sk to tw list) Case 2: Node A Node B <------------- FIN,ACK (enter FIN-WAIT-1) ACK -------------> (enter FIN-WAIT-2) (move sk to tw list) FIN -------------> <------------- ACK This patch fixed the problem. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-25 20:03:15 -07:00
Jesper Dangaard Brouer	308ff823eb	nf_conntrack: Use rcu_barrier() RCU barriers, rcu_barrier(), is inserted two places. In nf_conntrack_expect.c nf_conntrack_expect_fini() before the kmem_cache_destroy(). Firstly to make sure the callback to the nf_ct_expect_free_rcu() code is still around. Secondly because I'm unsure about the consequence of having in flight nf_ct_expect_free_rcu/kmem_cache_free() calls while doing a kmem_cache_destroy() slab destroy. And in nf_conntrack_extend.c nf_ct_extend_unregister(), inorder to wait for completion of callbacks to __nf_ct_ext_free_rcu(), which is invoked by __nf_ct_ext_add(). It might be more efficient to call rcu_barrier() in nf_conntrack_core.c nf_conntrack_cleanup_net(), but thats make it more difficult to read the code (as the callback code in located in nf_conntrack_extend.c). Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-06-25 16:32:52 +02:00
Rémi Denis-Courmont	2be6fa4c7e	Phonet: generate Netlink RTM_DELADDR when destroying a device Netlink address deletion events were not sent when a network device vanished neither when Phonet was unloaded. Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-25 02:58:16 -07:00
Rémi Denis-Courmont	c7a1a4c80f	Phonet: publicize the Netlink notification function Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-25 02:58:15 -07:00
Herbert Xu	245acb8772	ipsec: Fix name of CAST algorithm Our CAST algorithm is called cast5, not cast128. Clearly nobody has ever used it :) Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-24 18:03:10 -07:00
Linus Torvalds	09ce42d316	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: bnx2: Fix the behavior of ethtool when ONBOOT=no qla3xxx: Don't sleep while holding lock. qla3xxx: Give the PHY time to come out of reset. ipv4 routing: Ensure that route cache entries are usable and reclaimable with caching is off net: Move rx skb_orphan call to where needed ipv6: Use correct data types for ICMPv6 type and code net: let KS8842 driver depend on HAS_IOMEM can: let SJA1000 driver depend on HAS_IOMEM netxen: fix firmware init handshake netxen: fix build with without CONFIG_PM netfilter: xt_rateest: fix comparison with self netfilter: xt_quota: fix incomplete initialization netfilter: nf_log: fix direct userspace memory access in proc handler netfilter: fix some sparse endianess warnings netfilter: nf_conntrack: fix conntrack lookup race netfilter: nf_conntrack: fix confirmation race condition netfilter: nf_conntrack: death_by_timeout() fix	2009-06-24 10:01:12 -07:00
Neil Horman	b6280b47a7	ipv4 routing: Ensure that route cache entries are usable and reclaimable with caching is off When route caching is disabled (rt_caching returns false), We still use route cache entries that are created and passed into rt_intern_hash once. These routes need to be made usable for the one call path that holds a reference to them, and they need to be reclaimed when they're finished with their use. To be made usable, they need to be associated with a neighbor table entry (which they currently are not), otherwise iproute_finish2 just discards the packet, since we don't know which L2 peer to send the packet to. To do this binding, we need to follow the path a bit higher up in rt_intern_hash, which calls arp_bind_neighbour, but not assign the route entry to the hash table. Currently, if caching is off, we simply assign the route to the rp pointer and are reutrn success. This patch associates us with a neighbor entry first. Secondly, we need to make sure that any single use routes like this are known to the garbage collector when caching is off. If caching is off, and we try to hash in a route, it will leak when its refcount reaches zero. To avoid this, this patch calls rt_free on the route cache entry passed into rt_intern_hash. This places us on the gc list for the route cache garbage collector, so that when its refcount reaches zero, it will be reclaimed (Thanks to Alexey for this suggestion). I've tested this on a local system here, and with these patches in place, I'm able to maintain routed connectivity to remote systems, even if I set /proc/sys/net/ipv4/rt_cache_rebuild_count to -1, which forces rt_caching to return false. Signed-off-by: Neil Horman <nhorman@redhat.com> Reported-by: Jarek Poplawski <jarkao2@gmail.com> Reported-by: Maxime Bizon <mbizon@freebox.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-23 16:36:26 -07:00
Herbert Xu	d55d87fdff	net: Move rx skb_orphan call to where needed In order to get the tun driver to account packets, we need to be able to receive packets with destructors set. To be on the safe side, I added an skb_orphan call for all protocols by default since some of them (IP in particular) cannot handle receiving packets destructors properly. Now it seems that at least one protocol (CAN) expects to be able to pass skb->sk through the rx path without getting clobbered. So this patch attempts to fix this properly by moving the skb_orphan call to where it's actually needed. In particular, I've added it to skb_set_owner_[rw] which is what most users of skb->destructor call. This is actually an improvement for tun too since it means that we only give back the amount charged to the socket when the skb is passed to another socket that will also be charged accordingly. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Tested-by: Oliver Hartkopp <olver@hartkopp.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-23 16:36:25 -07:00
Brian Haley	d5fdd6babc	ipv6: Use correct data types for ICMPv6 type and code Change all the code that deals directly with ICMPv6 type and code values to use u8 instead of a signed int as that's the actual data type. Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-23 04:31:07 -07:00
Linus Torvalds	7e0338c0de	Merge branch 'for-2.6.31' of git://fieldses.org/git/linux-nfsd * 'for-2.6.31' of git://fieldses.org/git/linux-nfsd: (60 commits) SUNRPC: Fix the TCP server's send buffer accounting nfsd41: Backchannel: minorversion support for the back channel nfsd41: Backchannel: cleanup nfs4.0 callback encode routines nfsd41: Remove ip address collision detection case nfsd: optimise the starting of zero threads when none are running. nfsd: don't take nfsd_mutex twice when setting number of threads. nfsd41: sanity check client drc maxreqs nfsd41: move channel attributes from nfsd4_session to a nfsd4_channel_attr struct NFS: kill off complicated macro 'PROC' sunrpc: potential memory leak in function rdma_read_xdr nfsd: minor nfsd_vfs_write cleanup nfsd: Pull write-gathering code out of nfsd_vfs_write nfsd: track last inode only in use_wgather case sunrpc: align cache_clean work's timer nfsd: Use write gathering only with NFSv2 NFSv4: kill off complicated macro 'PROC' NFSv4: do exact check about attribute specified knfsd: remove unreported filehandle stats counters knfsd: fix reply cache memory corruption knfsd: reply cache cleanups ...	2009-06-22 12:55:50 -07:00
Linus Torvalds	df36b439c5	Merge branch 'for-2.6.31' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6 * 'for-2.6.31' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (128 commits) nfs41: sunrpc: xprt_alloc_bc_request() should not use spin_lock_bh() nfs41: Move initialization of nfs4_opendata seq_res to nfs4_init_opendata_res nfs: remove unnecessary NFS_INO_INVALID_ACL checks NFS: More "sloppy" parsing problems NFS: Invalid mount option values should always fail, even with "sloppy" NFS: Remove unused XDR decoder functions NFS: Update MNT and MNT3 reply decoding functions NFS: add XDR decoder for mountd version 3 auth-flavor lists NFS: add new file handle decoders to in-kernel mountd client NFS: Add separate mountd status code decoders for each mountd version NFS: remove unused function in fs/nfs/mount_clnt.c NFS: Use xdr_stream-based XDR encoder for MNT's dirpath argument NFS: Clean up MNT program definitions lockd: Don't bother with RPC ping for NSM upcalls lockd: Update NSM state from SM_MON replies NFS: Fix false error return from nfs_callback_up() if ipv6.ko is not available NFS: Return error code from nfs_callback_up() to user space NFS: Do not display the setting of the "intr" mount option NFS: add support for splice writes nfs41: Backchannel: CB_SEQUENCE validation ...	2009-06-22 12:53:06 -07:00
Linus Torvalds	5165aece0e	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (43 commits) via-velocity: Fix velocity driver unmapping incorrect size. mlx4_en: Remove redundant refill code on RX mlx4_en: Removed redundant check on lso header size mlx4_en: Cancel port_up check in transmit function mlx4_en: using stop/start_all_queues mlx4_en: Removed redundant skb->len check mlx4_en: Counting all the dropped packets on the TX side usbnet cdc_subset: fix issues talking to PXA gadgets Net: qla3xxx, remove sleeping in atomic ipv4: fix NULL pointer + success return in route lookup path isdn: clean up documentation index cfg80211: validate station settings cfg80211: allow setting station parameters in mesh cfg80211: allow adding/deleting stations on mesh ath5k: fix beacon_int handling MAINTAINERS: Fix Atheros pattern paths ath9k: restore PS mode, before we put the chip into FULL SLEEP state. ath9k: wait for beacon frame along with CAB acer-wmi: fix rfkill conversion ath5k: avoid PCI FATAL interrupts by restoring RETRY_TIMEOUT disabling ...	2009-06-22 11:57:09 -07:00
Patrick McHardy	4d900f9df5	netfilter: xt_rateest: fix comparison with self As noticed by T�r�k Edwin <edwintorok@gmail.com>: Compiling the kernel with clang has shown this warning: net/netfilter/xt_rateest.c:69:16: warning: self-comparison always results in a constant value ret &= pps2 == pps2; ^ Looking at the code: if (info->flags & XT_RATEEST_MATCH_BPS) ret &= bps1 == bps2; if (info->flags & XT_RATEEST_MATCH_PPS) ret &= pps2 == pps2; Judging from the MATCH_BPS case it seems to be a typo, with the intention of comparing pps1 with pps2. http://bugzilla.kernel.org/show_bug.cgi?id=13535 Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-06-22 14:17:12 +02:00
Jan Engelhardt	6d62182fea	netfilter: xt_quota: fix incomplete initialization Commit v2.6.29-rc5-872-gacc738f ("xtables: avoid pointer to self") forgot to copy the initial quota value supplied by iptables into the private structure, thus counting from whatever was in the memory kmalloc returned. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-06-22 14:16:45 +02:00
Patrick McHardy	2495561928	netfilter: nf_log: fix direct userspace memory access in proc handler Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-06-22 14:15:30 +02:00
Patrick McHardy	f9ffc31251	netfilter: fix some sparse endianess warnings net/netfilter/xt_NFQUEUE.c:46:9: warning: incorrect type in assignment (different base types) net/netfilter/xt_NFQUEUE.c:46:9: expected unsigned int [unsigned] [usertype] ipaddr net/netfilter/xt_NFQUEUE.c:46:9: got restricted unsigned int net/netfilter/xt_NFQUEUE.c:68:10: warning: incorrect type in assignment (different base types) net/netfilter/xt_NFQUEUE.c:68:10: expected unsigned int [unsigned] <noident> net/netfilter/xt_NFQUEUE.c:68:10: got restricted unsigned int net/netfilter/xt_NFQUEUE.c:69:10: warning: incorrect type in assignment (different base types) net/netfilter/xt_NFQUEUE.c:69:10: expected unsigned int [unsigned] <noident> net/netfilter/xt_NFQUEUE.c:69:10: got restricted unsigned int net/netfilter/xt_NFQUEUE.c:70:10: warning: incorrect type in assignment (different base types) net/netfilter/xt_NFQUEUE.c:70:10: expected unsigned int [unsigned] <noident> net/netfilter/xt_NFQUEUE.c:70:10: got restricted unsigned int net/netfilter/xt_NFQUEUE.c:71:10: warning: incorrect type in assignment (different base types) net/netfilter/xt_NFQUEUE.c:71:10: expected unsigned int [unsigned] <noident> net/netfilter/xt_NFQUEUE.c:71:10: got restricted unsigned int net/netfilter/xt_cluster.c:20:55: warning: incorrect type in return expression (different base types) net/netfilter/xt_cluster.c:20:55: expected unsigned int net/netfilter/xt_cluster.c:20:55: got restricted unsigned int const [usertype] ip net/netfilter/xt_cluster.c:20:55: warning: incorrect type in return expression (different base types) net/netfilter/xt_cluster.c:20:55: expected unsigned int net/netfilter/xt_cluster.c:20:55: got restricted unsigned int const [usertype] ip Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-06-22 14:15:02 +02:00
Patrick McHardy	8d8890b775	netfilter: nf_conntrack: fix conntrack lookup race The RCU protected conntrack hash lookup only checks whether the entry has a refcount of zero to decide whether it is stale. This is not sufficient, entries are explicitly removed while there is at least one reference left, possibly more. Explicitly check whether the entry has been marked as dying to fix this. Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-06-22 14:14:41 +02:00
Patrick McHardy	5c8ec910e7	netfilter: nf_conntrack: fix confirmation race condition New connection tracking entries are inserted into the hash before they are fully set up, namely the CONFIRMED bit is not set and the timer not started yet. This can theoretically lead to a race with timer, which would set the timeout value to a relative value, most likely already in the past. Perform hash insertion as the final step to fix this. Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-06-22 14:14:16 +02:00
Eric Dumazet	8cc20198cf	netfilter: nf_conntrack: death_by_timeout() fix death_by_timeout() might delete a conntrack from hash list and insert it in dying list. nf_ct_delete_from_lists(ct); nf_ct_insert_dying_list(ct); I believe a (lockless) reader could catch ct while doing a lookup and miss the end of its chain. (nulls lookup algo must check the null value at the end of lookup and should restart if the null value is not the expected one. cf Documentation/RCU/rculist_nulls.txt for details) We need to change nf_conntrack_init_net() and use a different "null" value, guaranteed not being used in regular lists. Choose very large values, since hash table uses [0..size-1] null values. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-06-22 14:13:55 +02:00
Ricardo Labiaga	e9f0298558	nfs41: sunrpc: xprt_alloc_bc_request() should not use spin_lock_bh() xprt_alloc_bc_request() is always called in soft interrupt context. Grab the spin_lock instead of the bottom half spin_lock. Softirqs do not preempt other softirqs running on the same processor, so there is no need to disable bottom halves. Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-06-20 14:55:39 -04:00
David S. Miller	c3da63f357	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6	2009-06-20 01:16:40 -07:00
Neil Horman	73e42897e8	ipv4: fix NULL pointer + success return in route lookup path Don't drop route if we're not caching I recently got a report of an oops on a route lookup. Maxime was testing what would happen if route caching was turned off (doing so by setting making rt_caching always return 0), and found that it triggered an oops. I looked at it and found that the problem stemmed from the fact that the route lookup routines were returning success from their lookup paths (which is good), but never set the *rp pointer to anything (which is bad). This happens because in rt_intern_hash, if rt_caching returns false, we call rt_drop and return 0. This almost emulates slient success. What we should be doing is assigning rp = rt and _not_ dropping the route. This way, during slow path lookups, when we create a new route cache entry, we don't immediately discard it, rather we just don't add it into the cache hash table, but we let this one lookup use it for the purpose of this route request. Maxime has tested and reports it prevents the oops. There is still a subsequent routing issue that I'm looking into further, but I'm confident that, even if its related to this same path, this patch makes sense to take. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-20 01:15:16 -07:00
Johannes Berg	a97f4424fb	cfg80211: validate station settings When I disallowed interfering with stations on non-AP interfaces, I not only forget mesh but also managed interfaces which need this for the authorized flag. Let's actually validate everything properly. This fixes an nl80211 regression introduced by the interfering, under which wpa_supplicant -Dnl80211 could not properly connect. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-06-19 11:50:24 -04:00
Andrey Yurovsky	9a5e8bbc8f	cfg80211: allow setting station parameters in mesh Mesh Point interfaces can also set parameters, for example plink_open is used to manually establish peer links from user-space (currently via iw). Add Mesh Point to the check in nl80211_set_station. Signed-off-by: Andrey Yurovsky <andrey@cozybit.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-06-19 11:50:24 -04:00
Andrey Yurovsky	155cc9e4b1	cfg80211: allow adding/deleting stations on mesh Commit b2a151a288 added a check that prevents adding or deleting stations on non-AP interfaces. Adding and deleting stations is supported for Mesh Point interfaces, so add Mesh Point to that check as well. Signed-off-by: Andrey Yurovsky <andrey@cozybit.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-06-19 11:50:23 -04:00
Alan Jenkins	464902e812	rfkill: export persistent attribute in sysfs This information allows userspace to implement a hybrid policy where it can store the rfkill soft-blocked state in platform non-volatile storage if available, and if not then file-based storage can be used. Some users prefer platform non-volatile storage because of the behaviour when dual-booting multiple versions of Linux, or if the rfkill setting is changed in the BIOS setting screens, or if the BIOS responds to wireless-toggle hotkeys itself before the relevant platform driver has been loaded. Signed-off-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk> Acked-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-06-19 11:50:18 -04:00
Alan Jenkins	06d5caf47e	rfkill: don't restore software blocked state on persistent devices The setting of the "persistent" flag is also made more explicit using a new rfkill_init_sw_state() function, instead of special-casing rfkill_set_sw_state() when it is called before registration. Suspend is a bit of a corner case so we try to get away without adding another hack to rfkill-input - it's going to be removed soon. If the state does change over suspend, users will simply have to prod rfkill-input twice in order to toggle the state. Userspace policy agents will be able to implement a more consistent user experience. For example, they can avoid the above problem if they toggle devices individually. Then there would be no "global state" to get out of sync. Currently there are only two rfkill drivers with persistent soft-blocked state. thinkpad-acpi already checks the software state on resume. eeepc-laptop will require modification. Signed-off-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk> CC: Marcel Holtmann <marcel@holtmann.org> Acked-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-06-19 11:50:17 -04:00
Alan Jenkins	7fa20a7f60	rfkill: rfkill_set_block() when suspended nitpick If we return after fiddling with the state, userspace will see the wrong state and rfkill_set_sw_state() won't work until the next call to rfkill_set_block(). At the moment rfkill_set_block() will always be called from rfkill_resume(), but this will change in future. Also, presumably the point of this test is to avoid bothering devices which may be suspended. If we don't want to call set_block(), we probably don't want to call query() either :-). Signed-off-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-06-19 11:50:17 -04:00
Dmitry Baryshkov	25502bda07	ieee802154: use standard routine for printing dumps Use print_hex_dump_bytes instead of self-written dumping function for outputting packet dumps. Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-19 00:18:43 -07:00
Hendrik Brueckner	0ea920d211	af_iucv: Return -EAGAIN if iucv msg limit is exceeded If the iucv message limit for a communication path is exceeded, sendmsg() returns -EAGAIN instead of -EPIPE. The calling application can then handle this error situtation, e.g. to try again after waiting some time. For blocking sockets, sendmsg() waits up to the socket timeout before returning -EAGAIN. For the new wait condition, a macro has been introduced and the iucv_sock_wait_state() has been refactored to this macro. Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-19 00:10:40 -07:00
Hendrik Brueckner	bb664f49f8	af_iucv: Change if condition in sendmsg() for more readability Change the if condition to exit sendmsg() if the socket in not connected. Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-19 00:10:39 -07:00
Trond Myklebust	47fcb03fef	SUNRPC: Fix the TCP server's send buffer accounting Currently, the sunrpc server is refusing to allow us to process new RPC calls if the TCP send buffer is 2/3 full, even if we do actually have enough free space to guarantee that we can send another request. The following patch fixes svc_tcp_has_wspace() so that we only stop processing requests if we know that the socket buffer cannot possibly fit another reply. It also fixes the tcp write_space() callback so that we only clear the SOCK_NOSPACE flag when the TCP send buffer is less than 2/3 full. This should ensure that the send window will grow as per the standard TCP socket code. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-06-18 19:58:51 -07:00
Trond Myklebust	1f84603c09	Merge branch 'devel-for-2.6.31' into for-2.6.31 Conflicts: fs/nfs/client.c fs/nfs/super.c	2009-06-18 18:13:44 -07:00
Linus Torvalds	d2aa455037	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (55 commits) netxen: fix tx ring accounting netxen: fix detection of cut-thru firmware mode forcedeth: fix dma api mismatches atm: sk_wmem_alloc initial value is one net: correct off-by-one write allocations reports via-velocity : fix no link detection on boot Net / e100: Fix suspend of devices that cannot be power managed TI DaVinci EMAC : Fix rmmod error net: group address list and its count ipv4: Fix fib_trie rebalancing, part 2 pkt_sched: Update drops stats in act_police sky2: version 1.23 sky2: add GRO support sky2: skb recycling sky2: reduce default transmit ring sky2: receive counter update sky2: fix shutdown synchronization sky2: PCI irq issues sky2: more receive shutdown sky2: turn off pause during shutdown ... Manually fix trivial conflict in net/core/skbuff.c due to kmemcheck	2009-06-18 14:07:15 -07:00
Eric Dumazet	81e2a3d5b7	atm: sk_wmem_alloc initial value is one commit `2b85a34e91` (net: No more expensive sock_hold()/sock_put() on each tx) changed initial sk_wmem_alloc value. This broke net/atm since this protocol assumed a null initial value. This patch makes necessary changes. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-18 00:29:12 -07:00
Eric Dumazet	31e6d363ab	net: correct off-by-one write allocations reports commit `2b85a34e91` (net: No more expensive sock_hold()/sock_put() on each tx) changed initial sk_wmem_alloc value. We need to take into account this offset when reporting sk_wmem_alloc to user, in PROC_FS files or various ioctls (SIOCOUTQ/TIOCOUTQ) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-18 00:29:12 -07:00
Jiri Pirko	31278e7147	net: group address list and its count This patch is inspired by patch recently posted by Johannes Berg. Basically what my patch does is to group list and a count of addresses into newly introduced structure netdev_hw_addr_list. This brings us two benefits: 1) struct net_device becames a bit nicer. 2) in the future there will be a possibility to operate with lists independently on netdevices (with exporting right functions). I wanted to introduce this patch before I'll post a multicast lists conversion. Signed-off-by: Jiri Pirko <jpirko@redhat.com> drivers/net/bnx2.c \| 4 +- drivers/net/e1000/e1000_main.c \| 4 +- drivers/net/ixgbe/ixgbe_main.c \| 6 +- drivers/net/mv643xx_eth.c \| 2 +- drivers/net/niu.c \| 4 +- drivers/net/virtio_net.c \| 10 ++-- drivers/s390/net/qeth_l2_main.c \| 2 +- include/linux/netdevice.h \| 17 +++-- net/core/dev.c \| 130 ++++++++++++++++++-------------------- 9 files changed, 89 insertions(+), 90 deletions(-) Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-18 00:29:08 -07:00
Jarek Poplawski	7b85576d15	ipv4: Fix fib_trie rebalancing, part 2 My previous patch, which explicitly delays freeing of tnodes by adding them to the list to flush them after the update is finished, isn't strict enough. It treats exceptionally tnodes without parent, assuming they are newly created, so "invisible" for the read side yet. But the top tnode doesn't have parent as well, so we have to exclude all exceptions (at least until a better way is found). Additionally we need to move rcu assignment of this node before flushing, so the return type of the trie_rebalance() function is changed. Reported-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-18 00:28:51 -07:00
Jarek Poplawski	b964758050	pkt_sched: Update drops stats in act_police Action police statistics could be misleading because drops are not shown when expected. With feedback from: Jamal Hadi Salim <hadi@cyberus.ca> Reported-by: Pawel Staszewski <pstaszewski@itcare.pl> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Acked-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-17 18:56:45 -07:00
Stephen Hemminger	603a8bbe62	skbuff: don't corrupt mac_header on skb expansion The skb mac_header field is sometimes NULL (or ~0u) as a sentinel value. The places where skb is expanded add an offset which would change this flag into an invalid pointer (or offset). Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-17 18:46:41 -07:00
Stephen Hemminger	19633e129c	skbuff: skb_mac_header_was_set is always true on >32 bit Looking at the crash in log_martians(), one suspect is that the check for mac header being set is not correct. The value of mac_header defaults to 0 on allocation, therefore skb_mac_header_was_set will always be true on platforms using NET_SKBUFF_USES_OFFSET. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-17 18:46:03 -07:00
Trond Myklebust	301933a0ac	Merge commit 'linux-pnfs/nfs41-for-2.6.31' into nfsv41-for-2.6.31	2009-06-17 17:59:58 -07:00
Ricardo Labiaga	dd2b63d049	nfs41: Rename rq_received to rq_reply_bytes_recvd The 'rq_received' member of 'struct rpc_rqst' is used to track when we have received a reply to our request. With v4.1, the backchannel can now accept callback requests over the existing connection. Rename this field to make it clear that it is only used for tracking reply bytes and not all bytes received on the connection. Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com>	2009-06-17 14:11:40 -07:00
Rahul Iyer	343952fa5a	nfs41: Get the rpc_xprt * from the rpc_rqst instead of the rpc_clnt. Obtain the rpc_xprt from the rpc_rqst so that calls and callback replies can both use the same code path. A client needs the rpc_xprt in order to reply to a callback. Signed-off-by: Rahul Iyer <iyer@netapp.com> Signed-off-by: Ricardo Labiaga <ricardo.labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com>	2009-06-17 14:11:34 -07:00
Benny Halevy	8f97524235	nfs41: create a svc_xprt for nfs41 callback thread and use for incoming callbacks Signed-off-by: Benny Halevy <bhalevy@panasas.com>	2009-06-17 14:11:31 -07:00
Andy Adamson	9c9f3f5fa6	nfs41: sunrpc: add a struct svc_xprt pointer to struct svc_serv for backchannel use This svc_xprt is passed on to the callback service thread to be later used to processes incoming svc_rqst's Signed-off-by: Benny Halevy <bhalevy@panasas.com>	2009-06-17 14:11:31 -07:00
Benny Halevy	7652e5a09b	nfs41: sunrpc: provide functions to create and destroy a svc_xprt for backchannel use For nfs41 callbacks we need an svc_xprt to process requests coming up the backchannel socket as rpc_rqst's that are transformed into svc_rqst's that need a rq_xprt to be processed. The svc_{udp,tcp}_create methods are too heavy for this job as svc_create_socket creates an actual socket to listen on while for nfs41 we're "reusing" the fore channel's socket. Signed-off-by: Benny Halevy <bhalevy@panasas.com>	2009-06-17 14:11:30 -07:00
Ricardo Labiaga	4d6bbb6233	nfs41: Backchannel bc_svc_process() Implement the NFSv4.1 backchannel service. Invokes the common callback processing logic svc_process_common() to authenticate the call and dispatch the appropriate NFSv4.1 XDR decoder and operation procedure. It then invokes bc_send() to send the reply over the same connection. bc_send() is implemented in a separate patch. At this time there is no slot validation or reply cache handling. [nfs41: Preallocate rpc_rqst receive buffer for handling callbacks] Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [Move bc_svc_process() declaration to correct patch] Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com>	2009-06-17 14:11:29 -07:00
Ricardo Labiaga	1cad7ea6fe	nfs41: Refactor svc_process() net/sunrpc/svc.c:svc_process() is used by the NFSv4 callback service to process RPC requests arriving over connections initiated by the server. NFSv4.1 supports callbacks over the backchannel on connections initiated by the client. This patch refactors svc_process() so that common code can also be used by the backchannel. Signed-off-by: Ricardo Labiaga <ricardo.labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com>	2009-06-17 14:11:28 -07:00
Ricardo Labiaga	0d90ba1cd4	nfs41: Backchannel callback service helper routines Executes the backchannel task on the RPC state machine using the existing open connection previously established by the client. Signed-off-by: Ricardo Labiaga <ricardo.labiaga@netapp.com> nfs41: Add bc_svc.o to sunrpc Makefile. [nfs41: bc_send() does not need to be exported outside RPC module] [nfs41: xprt_free_bc_request() need not be exported outside RPC module] Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [Update copyright] Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com>	2009-06-17 14:11:28 -07:00
Ricardo Labiaga	55ae1aabfb	nfs41: Add backchannel processing support to RPC state machine Adds rpc_run_bc_task() which is called by the NFS callback service to process backchannel requests. It performs similar work to rpc_run_task() though "schedules" the backchannel task to be executed starting at the call_trasmit state in the RPC state machine. It also introduces some miscellaneous updates to the argument validation, call_transmit, and transport cleanup functions to take into account that there are now forechannel and backchannel tasks. Backchannel requests do not carry an RPC message structure, since the payload has already been XDR encoded using the existing NFSv4 callback mechanism. Introduce a new transmit state for the client to reply on to backchannel requests. This new state simply reserves the transport and issues the reply. In case of a connection related error, disconnects the transport and drops the reply. It requires the forechannel to re-establish the connection and the server to retransmit the request, as stated in NFSv4.1 section 2.9.2 "Client and Server Transport Behavior". Note: There is no need to loop attempting to reserve the transport. If EAGAIN is returned by xprt_prepare_transmit(), return with tk_status == 0, setting tk_action to call_bc_transmit. rpc_execute() will invoke it again after the task is taken off the sleep queue. [nfs41: rpc_run_bc_task() need not be exported outside RPC module] [nfs41: New call_bc_transmit RPC state] Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: Backchannel: No need to loop in call_bc_transmit()] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [rpc_count_iostats incorrectly exits early] Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [Convert rpc_reply_expected() to inline function] [Remove unnecessary BUG_ON()] [Rename variable] Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com>	2009-06-17 14:11:24 -07:00
Trond Myklebust	88b5ed73bc	SUNRPC: Fix a missing "break" option in xs_tcp_setup_socket() In the case of -EADDRNOTAVAIL and/or unhandled connection errors, we want to get rid of the existing socket and retry immediately, just as the comment says. Currently we end up sleeping for a minute, due to the missing "break" statement. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-06-17 13:22:57 -07:00
Ricardo Labiaga	44b98efdd0	nfs41: New xs_tcp_read_data() Handles RPC replies and backchannel callbacks. Traditionally the NFS client has expected only RPC replies on its open connections. With NFSv4.1, callbacks can arrive over an existing open connection. This patch refactors the old xs_tcp_read_request() into an RPC reply handler: xs_tcp_read_reply(), a new backchannel callback handler: xs_tcp_read_callback(), and a common routine to read the data off the transport: xs_tcp_read_common(). The new xs_tcp_read_callback() queues callback requests onto a queue where the callback service (a separate thread) is listening for the processing. This patch incorporates work and suggestions from Rahul Iyer (iyer@netapp.com) and Benny Halevy (bhalevy@panasas.com). xs_tcp_read_callback() drops the connection when the number of expected callbacks is exceeded. Use xprt_force_disconnect(), ensuring tasks on the pending queue are awaken on disconnect. [nfs41: Keep track of RPC call/reply direction with a flag] [nfs41: Preallocate rpc_rqst receive buffer for handling callbacks] Signed-off-by: Ricardo Labiaga <ricardo.labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: sunrpc: xs_tcp_read_callback() should use xprt_force_disconnect()] Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [Moves embedded #ifdefs into #ifdef function blocks] Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com>	2009-06-17 13:06:16 -07:00
Ricardo Labiaga	fb7a0b9add	nfs41: New backchannel helper routines This patch introduces support to setup the callback xprt on the client side. It allocates/ destroys the preallocated memory structures used to process backchannel requests. At setup time, xprt_setup_backchannel() is invoked to allocate one or more rpc_rqst structures and substructures. This ensures that they are available when an RPC callback arrives. The rpc_rqst structures are maintained in a linked list attached to the rpc_xprt structure. We keep track of the number of allocations so that they can be correctly removed when the channel is destroyed. When an RPC callback arrives, xprt_alloc_bc_request() is invoked to obtain a preallocated rpc_rqst structure. An rpc_xprt structure is returned, and its RPC_BC_PREALLOC_IN_USE bit is set in rpc_xprt->bc_flags. The structure is removed from the the list since it is now in use, and it will be later added back when its user is done with it. After the RPC callback replies, the rpc_rqst structure is returned by invoking xprt_free_bc_request(). This clears the RPC_BC_PREALLOC_IN_USE bit and adds it back to the list, allowing it to be reused by a subsequent RPC callback request. To be consistent with the reception of RPC messages, the backchannel requests should be placed into the 'struct rpc_rqst' rq_rcv_buf, which is then in turn copied to the 'struct rpc_rqst' rq_private_buf. [nfs41: Preallocate rpc_rqst receive buffer for handling callbacks] Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [Update copyright notice and explain page allocation] Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com>	2009-06-17 13:06:14 -07:00
Ricardo Labiaga	f9acac1a47	nfs41: Initialize new rpc_xprt callback related fields Signed-off-by: Ricardo Labiaga <ricardo.labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com>	2009-06-17 13:06:14 -07:00
Ricardo Labiaga	f4a2e418bf	nfs41: Process the RPC call direction Reading and storing the RPC direction is a three step process. 1. xs_tcp_read_calldir() reads the RPC direction, but it will not store it in the XDR buffer since the 'struct rpc_rqst' is not yet available. 2. The 'struct rpc_rqst' is obtained during the TCP_RCV_COPY_DATA state. This state need not necessarily be preceeded by the TCP_RCV_READ_CALLDIR. For example, we may be reading a continuation packet to a large reply. Therefore, we can't simply obtain the 'struct rpc_rqst' during the TCP_RCV_READ_CALLDIR state and assume it's available during TCP_RCV_COPY_DATA. This patch adds a new TCP_RCV_READ_CALLDIR flag to indicate the need to read the RPC direction. It then uses TCP_RCV_COPY_CALLDIR to indicate the RPC direction needs to be saved after the 'struct rpc_rqst' has been allocated. 3. The 'struct rpc_rqst' is obtained by the xs_tcp_read_data() helper functions. xs_tcp_read_common() then saves the RPC direction in the XDR buffer if TCP_RCV_COPY_CALLDIR is set. This will happen when we're reading the data immediately after the direction was read. xs_tcp_read_common() then clears this flag. [was nfs41: Skip past the RPC call direction] Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: sunrpc: Add RPC direction back into the XDR buffer] Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: sunrpc: Don't skip past the RPC call direction] Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com>	2009-06-17 12:43:46 -07:00
Ricardo Labiaga	18dca02aeb	nfs41: Add ability to read RPC call direction on TCP stream. NFSv4.1 callbacks can arrive over an existing connection. This patch adds the logic to read the RPC call direction (call or reply). It does this by updating the state machine to look for the call direction invoking xs_tcp_read_calldir(...) after reading the XID. [nfs41: Keep track of RPC call/reply direction with a flag] As per 11/14/08 review of RFC 53/85. Add a new flag to track whether the incoming message is an RPC call or an RPC reply. TCP_RPC_REPLY is set in the 'struct sock_xprt' tcp_flags in xs_tcp_read_calldir() if the message is an RPC reply sent on the forechannel. It is cleared if the message is an RPC request sent on the back channel. Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com>	2009-06-17 12:43:45 -07:00
Andy Adamson	aae2006e9b	nfs41: sunrpc: Export the call prepare state for session reset Signed-off-by: Andy Adamson<andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-06-17 12:25:07 -07:00
Eric Dumazet	c564039fd8	net: sk_wmem_alloc has initial value of one, not zero commit `2b85a34e91` (net: No more expensive sock_hold()/sock_put() on each tx) changed initial sk_wmem_alloc value. Some protocols check sk_wmem_alloc value to determine if a timer must delay socket deallocation. We must take care of the sk_wmem_alloc value being one instead of zero when no write allocations are pending. Reported by Ingo Molnar, and full diagnostic from David Miller. This patch introduces three helpers to get read/write allocations and a followup patch will use these helpers to report correct write allocations to user. Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-17 04:31:25 -07:00
David Howells	519d25679e	RxRPC: Don't attempt to reuse aborted connections Connections that have seen a connection-level abort should not be reused as the far end will just abort them again; instead a new connection should be made. Connection-level aborts occur due to such things as authentication failures. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-06-16 21:20:14 -07:00
Linus Torvalds	517d08699b	Merge branch 'akpm' * akpm: (182 commits) fbdev: bf54x-lq043fb: use kzalloc over kmalloc/memset fbdev: bfin: fix __dev{init,exit} markings fbdev: bfin: drop unnecessary calls to memset fbdev: bfin-t350mcqb-fb: drop unused local variables fbdev: blackfin has __raw I/O accessors, so use them in fb.h fbdev: s1d13xxxfb: add accelerated bitblt functions tcx: use standard fields for framebuffer physical address and length fbdev: add support for handoff from firmware to hw framebuffers intelfb: fix a bug when changing video timing fbdev: use framebuffer_release() for freeing fb_info structures radeon: P2G2CLK_ALWAYS_ONb tested twice, should 2nd be P2G2CLK_DAC_ALWAYS_ONb? s3c-fb: CPUFREQ frequency scaling support s3c-fb: fix resource releasing on error during probing carminefb: fix possible access beyond end of carmine_modedb[] acornfb: remove fb_mmap function mb862xxfb: use CONFIG_OF instead of CONFIG_PPC_OF mb862xxfb: restrict compliation of platform driver to PPC Samsung SoC Framebuffer driver: add Alpha Channel support atmel-lcdc: fix pixclock upper bound detection offb: use framebuffer_alloc() to allocate fb_info struct ... Manually fix up conflicts due to kmemcheck in mm/slab.c	2009-06-16 19:50:13 -07:00
Christoph Lameter	62bc62a873	page allocator: use a pre-calculated value instead of num_online_nodes() in fast paths num_online_nodes() is called in a number of places but most often by the page allocator when deciding whether the zonelist needs to be filtered based on cpusets or the zonelist cache. This is actually a heavy function and touches a number of cache lines. This patch stores the number of online nodes at boot time and updates the value when nodes get onlined and offlined. The value is then used in a number of important paths in place of num_online_nodes(). [rientjes@google.com: do not override definition of node_set_online() with macro] Signed-off-by: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Mel Gorman <mel@csn.ul.ie> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Dave Hansen <dave@linux.vnet.ibm.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-06-16 19:47:35 -07:00
Linus Torvalds	b3fec0fe35	Merge branch 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/vegard/kmemcheck * 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/vegard/kmemcheck: (39 commits) signal: fix __send_signal() false positive kmemcheck warning fs: fix do_mount_root() false positive kmemcheck warning fs: introduce __getname_gfp() trace: annotate bitfields in struct ring_buffer_event net: annotate struct sock bitfield c2port: annotate bitfield for kmemcheck net: annotate inet_timewait_sock bitfields ieee1394/csr1212: fix false positive kmemcheck report ieee1394: annotate bitfield net: annotate bitfields in struct inet_sock net: use kmemcheck bitfields API for skbuff kmemcheck: introduce bitfield API kmemcheck: add opcode self-testing at boot x86: unify pte_hidden x86: make _PAGE_HIDDEN conditional kmemcheck: make kconfig accessible for other architectures kmemcheck: enable in the x86 Kconfig kmemcheck: add hooks for the page allocator kmemcheck: add hooks for page- and sg-dma-mappings kmemcheck: don't track page tables ...	2009-06-16 13:09:51 -07:00
David S. Miller	14ebaf81e1	x25: Fix sleep from timer on socket destroy. If socket destuction gets delayed to a timer, we try to lock_sock() from that timer which won't work. Use bh_lock_sock() in that case. Signed-off-by: David S. Miller <davem@davemloft.net> Tested-by: Ingo Molnar <mingo@elte.hu>	2009-06-16 05:40:30 -07:00
Ursula Braun	c23cad923b	[S390] PM: af_iucv power management callbacks. Patch establishes a dummy afiucv-device to make sure af_iucv is notified as iucv-bus device about suspend/resume. The PM freeze callback severs all iucv pathes of connected af_iucv sockets. The PM thaw/restore callback switches the state of all previously connected sockets to IUCV_DISCONN. Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-06-16 10:31:19 +02:00
Ursula Braun	672e405b60	[S390] pm: iucv power management callbacks. Patch calls the PM callback functions of iucv-bus devices, which are responsible for removal of their established iucv pathes. The PM freeze callback for the first iucv-bus device disables all iucv interrupts except the connection severed interrupt. The PM freeze callback for the last iucv-bus device shuts down iucv. The PM thaw callback for the first iucv-bus device re-enables iucv if it has been shut down during freeze. If freezing has been interrupted, it re-enables iucv interrupts according to the needs of iucv-exploiters. The PM restore callback for the first iucv-bus device re-enables iucv. Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-06-16 10:31:17 +02:00
Ursula Braun	6c005961c1	[S390] iucv: establish reboot notifier To guarantee a proper cleanup, patch adds a reboot notifier to the iucv base code, which disables iucv interrupts, shuts down established iucv pathes, and removes iucv declarations for z/VM. Checks have to be added to the iucv-API functions, whether iucv-buffers removed at reboot time are still declared. Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-06-16 10:31:17 +02:00
Christian Engelmayer	59fb30660b	sunrpc: potential memory leak in function rdma_read_xdr In case the check on ch_count fails the cleanup path is skipped and the previously allocated memory 'rpl_map', 'chl_map' is not freed. Reported by Coverity. Signed-off-by: Christian Engelmayer <christian.engelmayer@frequentis.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-06-15 19:34:32 -07:00
Anton Blanchard	6aad89c837	sunrpc: align cache_clean work's timer Align cache_clean work. Signed-off-by: Anton Blanchard <anton@samba.org> Cc: Neil Brown <neilb@suse.de> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-06-15 18:14:58 -07:00
J. Bruce Fields	7eef4091a6	Merge commit 'v2.6.30' into for-2.6.31	2009-06-15 18:08:07 -07:00
Johannes Berg	1fa6f4af9f	mac80211: fix wext bssid/ssid setting When changing to a new BSSID or SSID, the code in ieee80211_set_disassoc() needs to have the old data still valid to be able to disconnect and clean up properly. Currently, however, the old data is thrown away before ieee80211_set_disassoc() is ever called, so fix that by calling the function _before_ the old data is overwritten. This is (one of) the issue(s) causing mac80211 to hold cfg80211's BSS structs forever, and them thus being returned in scan results after they're long gone. http://www.intellinuxwireless.org/bugzilla/show_bug.cgi?id=2015 Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-06-15 15:05:59 -04:00
Johannes Berg	7e9debe978	mac80211: disconnect when user changes channel If we do not disconnect when a channel switch is requested, we end up eventually detection beacon loss from the AP and then disconnecting, without ever really telling the AP, so we might just as well disconnect right away. Additionally, this fixes a problem with iwlwifi where the driver will clear some internal state on channel changes like this and then get confused when we actually go clear that state from mac80211. It may look like this patch drops the no-IBSS check, but that is already handled by cfg80211 in the wext handler it provides for IBSS (cfg80211_ibss_wext_siwfreq). Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-06-15 15:05:58 -04:00
Johannes Berg	db2e6bd4e9	mac80211: add queue debugfs file I suspect that some driver bugs can cause queues to be stopped while they shouldn't be, but it's hard to find out whether that is the case or not without having any visible information about the queues. This adds a file to debugfs that allows us to see the queues' statuses. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-06-15 15:05:57 -04:00
Jouni Malinen	68f2d02669	mac80211: Do not try to associate with an empty SSID It looks like some programs (e.g., NM) are setting an empty SSID with SIOCSIWESSID in some cases. This seems to trigger mac80211 to try to associate with an invalid configuration (wildcard SSID) which will result in failing associations (or odd issues, potentially including kernel panic with some drivers) if the AP were to actually accept this anyway). Only start association process if the SSID is actually set. This speeds up connection with NM in number of cases and avoids sending out broken association request frames. Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-06-15 15:05:51 -04:00
Vegard Nossum	722f2a6c87	Merge commit 'linus/master' into HEAD Conflicts: MAINTAINERS Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>	2009-06-15 15:50:49 +02:00
Vegard Nossum	a98b65a3ad	net: annotate struct sock bitfield 2009/2/24 Ingo Molnar <mingo@elte.hu>: > ok, this is the last warning i have from today's overnight -tip > testruns - a 32-bit system warning in sock_init_data(): > > [ 2.610389] NET: Registered protocol family 16 > [ 2.616138] initcall netlink_proto_init+0x0/0x170 returned 0 after 7812 usecs > [ 2.620010] WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (f642c184) > [ 2.624002] 010000000200000000000000604990c000000000000000000000000000000000 > [ 2.634076] i i i i i i u u i i i i i i i i i i i i i i i i i i i i i i i i > [ 2.641038] ^ > [ 2.643376] > [ 2.644004] Pid: 1, comm: swapper Not tainted (2.6.29-rc6-tip-01751-g4d1c22c-dirty #885) > [ 2.648003] EIP: 0060:[<c07141a1>] EFLAGS: 00010282 CPU: 0 > [ 2.652008] EIP is at sock_init_data+0xa1/0x190 > [ 2.656003] EAX: 0001a800 EBX: f6836c00 ECX: 00463000 EDX: c0e46fe0 > [ 2.660003] ESI: f642c180 EDI: c0b83088 EBP: f6863ed8 ESP: c0c412ec > [ 2.664003] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 > [ 2.668003] CR0: 8005003b CR2: f682c400 CR3: 00b91000 CR4: 000006f0 > [ 2.672003] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 > [ 2.676003] DR6: ffff4ff0 DR7: 00000400 > [ 2.680002] [<c07423e5>] __netlink_create+0x35/0xa0 > [ 2.684002] [<c07443cc>] netlink_kernel_create+0x4c/0x140 > [ 2.688002] [<c072755e>] rtnetlink_net_init+0x1e/0x40 > [ 2.696002] [<c071b601>] register_pernet_operations+0x11/0x30 > [ 2.700002] [<c071b72c>] register_pernet_subsys+0x1c/0x30 > [ 2.704002] [<c0bf3c8c>] rtnetlink_init+0x4c/0x100 > [ 2.708002] [<c0bf4669>] netlink_proto_init+0x159/0x170 > [ 2.712002] [<c0101124>] do_one_initcall+0x24/0x150 > [ 2.716002] [<c0bbf3c7>] do_initcalls+0x27/0x40 > [ 2.723201] [<c0bbf3fc>] do_basic_setup+0x1c/0x20 > [ 2.728002] [<c0bbfb8a>] kernel_init+0x5a/0xa0 > [ 2.732002] [<c0103e47>] kernel_thread_helper+0x7/0x10 > [ 2.736002] [<ffffffff>] 0xffffffff We fix this false positive by annotating the bitfield in struct sock. Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>	2009-06-15 15:49:36 +02:00
Vegard Nossum	9e337b0fb3	net: annotate inet_timewait_sock bitfields The use of bitfields here would lead to false positive warnings with kmemcheck. Silence them. (Additionally, one erroneous comment related to the bitfield was also fixed.) Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>	2009-06-15 15:49:32 +02:00
Vegard Nossum	fe55f6d5c0	net: use kmemcheck bitfields API for skbuff Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>	2009-06-15 15:49:25 +02:00
David S. Miller	9cbc1cb8cd	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6 Conflicts: Documentation/feature-removal-schedule.txt drivers/scsi/fcoe/fcoe.c net/core/drop_monitor.c net/core/net-traces.c	2009-06-15 03:02:23 -07:00
Jarek Poplawski	ca44d6e60f	pkt_sched: Rename PSCHED_US2NS and PSCHED_NS2US Let's use TICKS instead of US, so PSCHED_TICKS2NS and PSCHED_NS2TICKS (like in PSCHED_TICKS_PER_SEC already) to avoid misleading. Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-15 02:31:47 -07:00
Jarek Poplawski	e0f7cb8c8c	ipv4: Fix fib_trie rebalancing While doing trie_rebalance(): resize(), inflate(), halve() RCU free tnodes before updating their parents. It depends on RCU delaying the real destruction, but if RCU readers start after call_rcu() and before parent update they could access freed memory. It is currently prevented with preempt_disable() on the update side, but it's not safe, except maybe classic RCU, plus it conflicts with memory allocations with GFP_KERNEL flag used from these functions. This patch explicitly delays freeing of tnodes by adding them to the list, which is flushed after the update is finished. Reported-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-15 02:31:29 -07:00
Linus Torvalds	489f7ab6c1	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (31 commits) trivial: remove the trivial patch monkey's name from SubmittingPatches trivial: Fix a typo in comment of addrconf_dad_start() trivial: usb: fix missing space typo in doc trivial: pci hotplug: adding __init/__exit macros to sgi_hotplug trivial: Remove the hyphen from git commands trivial: fix ETIMEOUT -> ETIMEDOUT typos trivial: Kconfig: .ko is normally not included in module names trivial: SubmittingPatches: fix typo trivial: Documentation/dell_rbu.txt: fix typos trivial: Fix Pavel's address in MAINTAINERS trivial: ftrace:fix description of trace directory trivial: unnecessary (void*) cast removal in sound/oss/msnd.c trivial: input/misc: Fix typo in Kconfig trivial: fix grammo in bus_for_each_dev() kerneldoc trivial: rbtree.txt: fix rb_entry() parameters in sample code trivial: spelling fix in ppc code comments trivial: fix typo in bio_alloc kernel doc trivial: Documentation/rbtree.txt: cleanup kerneldoc of rbtree.txt trivial: Miscellaneous documentation typo fixes trivial: fix typo milisecond/millisecond for documentation and source comments. ...	2009-06-14 13:46:25 -07:00
Marcel Holtmann	1a097181ee	Bluetooth: Fix Kconfig issue with RFKILL integration Since the re-write of the RFKILL subsystem it is no longer good to just select RFKILL, but it is important to add a proper depends on rule. Based on a report by Alexander Beregalov <a.beregalov@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2009-06-14 15:30:51 +02:00
Tom Goff	403dbb97f6	PIM-SM: namespace changes IPv4: - make PIM register vifs netns local - set the netns when a PIM register vif is created - make PIM available in all network namespaces (if CONFIG_IP_PIMSM_V2) by adding the protocol handler when multicast routing is initialized IPv6: - make PIM register vifs netns local - make PIM available in all network namespaces (if CONFIG_IPV6_PIMSM_V2) by adding the protocol handler when multicast routing is initialized Signed-off-by: Tom Goff <thomas.goff@boeing.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-14 03:16:13 -07:00
Timo Teräs	e61a4b634a	ipv4: update ARPD help text Removed the statements about ARP cache size as this config option does not affect it. The cache size is controlled by neigh_table gc thresholds. Remove also expiremental and obsolete markings as the API originally intended for arp caching is useful for implementing ARP-like protocols (e.g. NHRP) in user space and has been there for a long enough time. Signed-off-by: Timo Teras <timo.teras@iki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-13 23:36:32 -07:00
Eric Dumazet	125bb8f563	net: use a deferred timer in rt_check_expire For the sake of power saver lovers, use a deferrable timer to fire rt_check_expire() As some big routers cache equilibrium depends on garbage collection done in time, we take into account elapsed time between two rt_check_expire() invocations to adjust the amount of slots we have to check. Based on an initial idea and patch from Tero Kristo Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Tero Kristo <tero.kristo@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-13 23:36:31 -07:00
David S. Miller	eaae44d248	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6	2009-06-13 16:43:28 -07:00
Joe Perches	3dd5d7e3ba	x_tables: Convert printk to pr_err Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-06-13 12:32:39 +02:00
Pablo Neira Ayuso	dd7669a92c	netfilter: conntrack: optional reliable conntrack event delivery This patch improves ctnetlink event reliability if one broadcast listener has set the NETLINK_BROADCAST_ERROR socket option. The logic is the following: if an event delivery fails, we keep the undelivered events in the missed event cache. Once the next packet arrives, we add the new events (if any) to the missed events in the cache and we try a new delivery, and so on. Thus, if ctnetlink fails to deliver an event, we try to deliver them once we see a new packet. Therefore, we may lose state transitions but the userspace process gets in sync at some point. At worst case, if no events were delivered to userspace, we make sure that destroy events are successfully delivered. Basically, if ctnetlink fails to deliver the destroy event, we remove the conntrack entry from the hashes and we insert them in the dying list, which contains inactive entries. Then, the conntrack timer is added with an extra grace timeout of random32() % 15 seconds to trigger the event again (this grace timeout is tunable via /proc). The use of a limited random timeout value allows distributing the "destroy" resends, thus, avoiding accumulating lots "destroy" events at the same time. Event delivery may re-order but we can identify them by means of the tuple plus the conntrack ID. The maximum number of conntrack entries (active or inactive) is still handled by nf_conntrack_max. Thus, we may start dropping packets at some point if we accumulate a lot of inactive conntrack entries that did not successfully report the destroy event to userspace. During my stress tests consisting of setting a very small buffer of 2048 bytes for conntrackd and the NETLINK_BROADCAST_ERROR socket flag, and generating lots of very small connections, I noticed very few destroy entries on the fly waiting to be resend. A simple way to test this patch consist of creating a lot of entries, set a very small Netlink buffer in conntrackd (+ a patch which is not in the git tree to set the BROADCAST_ERROR flag) and invoke `conntrack -F'. For expectations, no changes are introduced in this patch. Currently, event delivery is only done for new expectations (no events from expectation expiration, removal and confirmation). In that case, they need a per-expectation event cache to implement the same idea that is exposed in this patch. This patch can be useful to provide reliable flow-accouting. We still have to add a new conntrack extension to store the creation and destroy time. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-06-13 12:30:52 +02:00
Pablo Neira Ayuso	9858a3ae1d	netfilter: conntrack: move helper destruction to nf_ct_helper_destroy() This patch moves the helper destruction to a function that lives in nf_conntrack_helper.c. This new function is used in the patch to add ctnetlink reliable event delivery. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-06-13 12:28:22 +02:00
Pablo Neira Ayuso	a0891aa6a6	netfilter: conntrack: move event caching to conntrack extension infrastructure This patch reworks the per-cpu event caching to use the conntrack extension infrastructure. The main drawback is that we consume more memory per conntrack if event delivery is enabled. This patch is required by the reliable event delivery that follows to this patch. BTW, this patch allows you to enable/disable event delivery via /proc/sys/net/netfilter/nf_conntrack_events in runtime, although you can still disable event caching as compilation option. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-06-13 12:26:29 +02:00
Patrick McHardy	65cb9fda32	netfilter: nf_conntrack: use mod_timer_pending() for conntrack refresh Use mod_timer_pending() instead of atomic sequence of del_timer()/ add_timer(). mod_timer_pending() does not rearm an inactive timer, so we don't need the conntrack lock anymore to make sure we don't accidentally rearm a timer of a conntrack which is in the process of being destroyed. With this change, we don't need to take the global lock anymore at all, counter updates can be performed under the per-conntrack lock. Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-06-13 12:21:49 +02:00
Patrick McHardy	266d07cb1c	netfilter: nf_log: fix sleeping function called from invalid context Fix regression introduced by `17625274` "netfilter: sysctl support of logger choice": BUG: sleeping function called from invalid context at /mnt/s390test/linux-2.6-tip/arch/s390/include/asm/uaccess.h:234 in_atomic(): 1, irqs_disabled(): 0, pid: 3245, name: sysctl CPU: 1 Not tainted 2.6.30-rc8-tipjun10-02053-g39ae214 #1 Process sysctl (pid: 3245, task: 000000007f675da0, ksp: 000000007eb17cf0) 0000000000000000 000000007eb17be8 0000000000000002 0000000000000000 000000007eb17c88 000000007eb17c00 000000007eb17c00 0000000000048156 00000000003e2de8 000000007f676118 000000007eb17f10 0000000000000000 0000000000000000 000000007eb17be8 000000000000000d 000000007eb17c58 00000000003e2050 000000000001635c 000000007eb17be8 000000007eb17c30 Call Trace: (�<00000000000162e6>� show_trace+0x13a/0x148) �<00000000000349ea>� __might_sleep+0x13a/0x164 �<0000000000050300>� proc_dostring+0x134/0x22c �<0000000000312b70>� nf_log_proc_dostring+0xfc/0x188 �<0000000000136f5e>� proc_sys_call_handler+0xf6/0x118 �<0000000000136fda>� proc_sys_read+0x26/0x34 �<00000000000d6e9c>� vfs_read+0xac/0x158 �<00000000000d703e>� SyS_read+0x56/0x88 �<0000000000027f42>� sysc_noemu+0x10/0x16 Use the nf_log_mutex instead of RCU to fix this. Reported-and-tested-by: Maran Pakkirisamy <maranpsamy@in.ibm.com> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-06-13 12:21:10 +02:00
Patrick McHardy	5b54814022	net: use symbolic values for ndo_start_xmit() return codes Convert magic values 1 and -1 to NETDEV_TX_BUSY and NETDEV_TX_LOCKED respectively. 0 (NETDEV_TX_OK) is not changed to keep the noise down, except in very few cases where its in direct proximity to one of the other values. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-13 01:18:50 -07:00
Patrick McHardy	81fbbf6040	net: fix network drivers ndo_start_xmit() return values (part 7) Fix up ATM drivers that return an errno value to qdisc_restart(), causing qdisc_restart() to print a warning an requeue/retransmit the skb. - lec: condition can only be remedied by userspace, until that retransmissions Compile tested only. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-13 01:18:43 -07:00
Masatake YAMATO	590a9887a2	trivial: Fix a typo in comment of addrconf_dad_start() Signed-off-by: Masatake YAMATO <yamato@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2009-06-12 18:01:51 +02:00
Pavel Machek	4737f0978d	trivial: Kconfig: .ko is normally not included in module names .ko is normally not included in Kconfig help, make it consistent. Signed-off-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2009-06-12 18:01:50 +02:00
Martin Olsson	6d60f9dfc8	trivial: Fix paramater/parameter typo in dmesg and source comments Signed-off-by: Martin Olsson <martin@minimum.se> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2009-06-12 18:01:46 +02:00
Michael S. Tsirkin	d2a7ddda9f	virtio: find_vqs/del_vqs virtio operations This replaces find_vq/del_vq with find_vqs/del_vqs virtio operations, and updates all drivers. This is needed for MSI support, because MSI needs to know the total number of vectors upfront. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (+ lguest/9p compile fixes)	2009-06-12 22:16:36 +09:30
Rusty Russell	9499f5e7ed	virtio: add names to virtqueue struct, mapping from devices to queues. Add a linked list of all virtqueues for a virtio device: this helps for debugging and is also needed for upcoming interface change. Also, add a "name" field for clearer debug messages. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2009-06-12 22:16:36 +09:30
Michał Mirosław	da6782927d	bridge: Simplify interface for ATM LANE This patch changes FDB entry check for ATM LANE bridge integration. There's no point in holding a FDB entry around SKB building. br_fdb_get()/br_fdb_put() pair are changed into single br_fdb_test_addr() hook that checks if the addr has FDB entry pointing to other port to the one the request arrived on. FDB entry refcounting is removed as it's not used anywhere else. Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-11 21:03:21 -07:00
John Dykstra	746e6ad23c	[PATCH] net core: Some interface flags not returned by SIOCGIFFLAGS Commit `b00055aacd` " [NET] core: add RFC2863 operstate" defined new interface flag values. Its documentation specified that these flags could be accessed from user space via SIOCGIFFLAGS. However, this does not work because the new flags do not fit in that ioctl's argument width. Change the documentation to match the code's behavior. Also change the source to explicitly show the truncation. This _should_ have no effect on executable code, and did not with gcc 4.2.4 generating x86 code. A new ioctl could be defined to return all interface flags to user space. However, since this has been broken for three years with no one complaining, there doesn't seem much need. They are still accessible via netlink. Reported-by: "Fredrik Arnerup" <fredrik.arnerup@edgeware.tv> Signed-off-by: John Dykstra <john.dykstra1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-11 20:57:21 -07:00
David S. Miller	adf76cfe24	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6	2009-06-11 20:00:44 -07:00
David S. Miller	3ee40c376a	Merge branch 'linux-2.6.31.y' of git://git.kernel.org/pub/scm/linux/kernel/git/inaky/wimax	2009-06-11 17:11:33 -07:00
Patrick McHardy	24992eacd8	netfilter: ip_tables: fix build error Fix build error introduced by commit `bb70dfa5` (netfilter: xtables: consolidate comefrom debug cast access): net/ipv4/netfilter/ip_tables.c: In function 'ipt_do_table': net/ipv4/netfilter/ip_tables.c:421: error: 'comefrom' undeclared (first use in this function) net/ipv4/netfilter/ip_tables.c:421: error: (Each undeclared identifier is reported only once net/ipv4/netfilter/ip_tables.c:421: error: for each function it appears in.) Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-06-12 01:53:09 +02:00

... 2 3 4 5 6 ...

13014 commits