alistair23-linux

redonkable

Author	SHA1	Message	Date
Samuel Ortiz	0b456c418a	NFC: Remove the static supported_se field Supported secure elements are typically found during a discovery process initiated when the NFC controller is up and running. For a given NFC chipset there can be many configurations (embedded SE or not, with or without a SIM card wired to the NFC controller SWP interface, etc...) and thus driver code will never know before hand which SEs are available. So we remove this field, it will be replaced by a real SE discovery mechanism. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2013-06-14 13:44:19 +02:00
Frederic Danis	391d8a2da7	NFC: Add NCI over SPI receive Before any operation, driver interruption is de-asserted to prevent race condition between TX and RX. Transaction starts by emitting "Direct read" and acknowledged mode bytes. Then packet length is read allowing to allocate correct NCI socket buffer. After that payload is retrieved. A delay after the transaction can be added. This delay is determined by the driver during nci_spi_allocate_device() call and can be 0. If acknowledged mode is set: - CRC of header and payload is checked - if frame reception fails (CRC error): NACK is sent - if received frame has ACK or NACK flag: unblock nci_spi_send() Payload is passed to NCI module. At the end, driver interruption is re asserted. Signed-off-by: Frederic Danis <frederic.danis@linux.intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2013-06-14 13:44:16 +02:00
Frederic Danis	ee9596d467	NFC: Add NCI over SPI send Before any operation, driver interruption is de-asserted to prevent race condition between TX and RX. The NCI over SPI header is added in front of NCI packet. If acknowledged mode is set, CRC-16-CCITT is added to the packet. Then the packet is forwarded to SPI module to be sent. A delay after the transaction is added. This delay is determined by the driver during nci_spi_allocate_device() call and can be 0. After data has been sent, driver interruption is re-asserted. If acknowledged mode is set, nci_spi_send will block until acknowledgment is received. Signed-off-by: Frederic Danis <frederic.danis@linux.intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2013-06-14 13:44:15 +02:00
Frederic Danis	8a00a61b0e	NFC: Add basic NCI over SPI The NFC Forum defines a transport interface based on Serial Peripheral Interface (SPI) for the NFC Controller Interface (NCI). This module implements the SPI transport of NCI, calling SPI module directly to read/write data to NFC controller (NFCC). NFCC driver should provide functions performing device open and close. It should also provide functions asserting/de-asserting interruption to prevent TX/RX race conditions. NFCC driver can also fix a delay between transactions if needed by the hardware. Signed-off-by: Frederic Danis <frederic.danis@linux.intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2013-06-14 13:44:03 +02:00
Neil Horman	c5c7774d7e	sctp: fully initialize sctp_outq in sctp_outq_init In commit `2f94aabd9f` (refactor sctp_outq_teardown to insure proper re-initalization) we modified sctp_outq_teardown to use sctp_outq_init to fully re-initalize the outq structure. Steve West recently asked me why I removed the q->error = 0 initalization from sctp_outq_teardown. I did so because I was operating under the impression that sctp_outq_init would properly initalize that value for us, but it doesn't. sctp_outq_init operates under the assumption that the outq struct is all 0's (as it is when called from sctp_association_init), but using it in __sctp_outq_teardown violates that assumption. We should do a memset in sctp_outq_init to ensure that the entire structure is in a known state there instead. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Reported-by: "West, Steve (NSN - US/Fort Worth)" <steve.west@nsn.com> CC: Vlad Yasevich <vyasevich@gmail.com> CC: netdev@vger.kernel.org CC: davem@davemloft.net Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-13 18:05:24 -07:00
Rony Efraim	1d8faf48c7	net/core: Add VF link state control Add netlink directives and ndo entry to allow for controling VF link, which can be in one of three states: Auto - VF link state reflects the PF link state (default) Up - VF link state is up, traffic from VF to VF works even if the actual PF link is down Down - VF link state is down, no traffic from/to this VF, can be of use while configuring the VF Signed-off-by: Rony Efraim <ronye@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-13 17:51:04 -07:00
Eric Dumazet	ca4ec90b31	htb: reorder struct htb_class fields for performance htb_class structures are big, and source of false sharing on SMP. By carefully splitting them in two parts, we can improve performance. I got 9 % performance increase on a 24 threads machine, with 200 concurrent netperf in TCP_RR mode, using a HTB hierarchy of 4 classes. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-13 17:17:02 -07:00
Willem de Bruijn	5f121b9a83	net-rps: fixes for rps flow limit Caught by sparse: - __rcu: missing annotation to sd->flow_limit - __user: direct access in cpumask_scnprintf Also - add endline character when printing bitmap if room in buffer - avoid bucket overflow by reducing FLOW_LIMIT_HISTORY The last item warrants some explanation. The hashtable buckets are subject to overflow if FLOW_LIMIT_HISTORY is larger than or equal to bucket size, since all packets may end up in a single bucket. The current (rather arbitrary) history value of 256 happens to match the buffer size (u8). As a result, with a single flow, the first 128 packets are accepted (correct), the second 128 packets dropped (correct) and then the history[] array has filled, so that each subsequent new packet causes an increment in the bucket for new_flow plus a decrement for old_flow: a steady state. This is fine if packets are dropped, as the steady state goes away as soon as a mix of traffic reappears. But, because the 256th packet overflowed the bucket to 0: no packets are dropped. Instead of explicitly adding an overflow check, this patch changes FLOW_LIMIT_HISTORY to never be able to overflow a single bucket. Reported-by: Fengguang Wu <fengguang.wu@intel.com> (first item) Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-13 17:13:05 -07:00
Samuel Ortiz	a395298c9c	NFC: HCI: Follow a positive code path in the HCI ops implementations Exiting on the error case is more typical to the kernel coding style. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2013-06-14 00:26:10 +02:00
Eric Lapuyade	9a695d23aa	NFC: HCI: Implement fw_upload ops This is a simple forward to the HCI driver. When driver is done with the operation, it shall directly notify NFC Core by calling nfc_fw_upload_done(). Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2013-06-14 00:26:09 +02:00
Eric Lapuyade	9674da8759	NFC: Add firmware upload netlink command As several NFC chipsets can have their firmwares upgraded and reflashed, this patchset adds a new netlink command to trigger that the driver loads or flashes a new firmware. This will allows userspace triggered firmware upgrade through netlink. The firmware name or hint is passed as a parameter, and the driver will eventually fetch the firmware binary through the request_firmware API. The cmd can only be executed when the nfc dev is not in use. Actual firmware loading/flashing is an asynchronous operation. Result of the operation shall send a new event up to user space through the nfc dev multicast socket. During operation, the nfc dev is not openable and thus not usable. Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2013-06-14 00:26:08 +02:00
Frederic Danis	1095e69f47	NFC: NCI: Fix skb->dev usage skb->dev is used for carrying a net_device pointer and not an nci_dev pointer. Remove usage of skb-dev to carry nci_dev and replace it by parameter in nci_recv_frame(), nci_send_frame() and driver send() functions. NfcWilink driver is also updated to use those functions. Signed-off-by: Frederic Danis <frederic.danis@linux.intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2013-06-14 00:25:53 +02:00
Johan Hedberg	59f45d576a	Bluetooth: Fix conditions for HCI_Delete_Stored_Link_Key Even though the HCI_Delete_Stored_Link_Key command is mandatory for 1.1 and later controllers some controllers do not seem to support it properly as was witnessed by one Broadcom based controller: < HCI Command: Delete Stored Link Key (0x03\|0x0012) plen 7 bdaddr 00:00:00:00:00:00 all 1 > HCI Event: Command Complete (0x0e) plen 4 Delete Stored Link Key (0x03\|0x0012) ncmd 1 status 0x11 deleted 0 Error: Unsupported Feature or Parameter Value Luckily this same controller also doesn't list the command in its supported commands bit mask (counting from 0 bit 7 of octet 6): < HCI Command: Read Local Supported Commands (0x04\|0x0002) plen 0 > HCI Event: Command Complete (0x0e) plen 68 Read Local Supported Commands (0x04\|0x0002) ncmd 1 status 0x00 Commands: ffffffffffff1ffffffffffff30fffff3f Therefore, it makes sense to move sending of HCI_Delete_Stored_Link_Key to after receiving the supported commands response and to only send it if its respective bit in the mask is set. The downside of this is that we no longer send the HCI_Delete_Stored_Link_Key command for Bluetooth 1.1 controllers since HCI_Read_Local_Supported_Command was introduced in version 1.2, but this is an acceptable penalty as the command in question shouldn't affect critical behavior. Reported-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Tested-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2013-06-13 13:05:40 -04:00
Anderson Lizardo	300b962e52	Bluetooth: Fix crash in l2cap_build_cmd() with small MTU If a too small MTU value is set with ioctl(HCISETACLMTU) or by a bogus controller, memory corruption happens due to a memcpy() call with negative length. Fix this crash on either incoming or outgoing connections with a MTU smaller than L2CAP_HDR_SIZE + L2CAP_CMD_HDR_SIZE: [ 46.885433] BUG: unable to handle kernel paging request at f56ad000 [ 46.888037] IP: [<c03d94cd>] memcpy+0x1d/0x40 [ 46.888037] pdpt = 0000000000ac3001 pde = 00000000373f8067 *pte = 80000000356ad060 [ 46.888037] Oops: 0002 [#1] SMP DEBUG_PAGEALLOC [ 46.888037] Modules linked in: hci_vhci bluetooth virtio_balloon i2c_piix4 uhci_hcd usbcore usb_common [ 46.888037] CPU: 0 PID: 1044 Comm: kworker/u3:0 Not tainted 3.10.0-rc1+ #12 [ 46.888037] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 [ 46.888037] Workqueue: hci0 hci_rx_work [bluetooth] [ 46.888037] task: f59b15b0 ti: f55c4000 task.ti: f55c4000 [ 46.888037] EIP: 0060:[<c03d94cd>] EFLAGS: 00010212 CPU: 0 [ 46.888037] EIP is at memcpy+0x1d/0x40 [ 46.888037] EAX: f56ac1c0 EBX: fffffff8 ECX: 3ffffc6e EDX: f55c5cf2 [ 46.888037] ESI: f55c6b32 EDI: f56ad000 EBP: f55c5c68 ESP: f55c5c5c [ 46.888037] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 [ 46.888037] CR0: 8005003b CR2: f56ad000 CR3: 3557d000 CR4: 000006f0 [ 46.888037] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 [ 46.888037] DR6: ffff0ff0 DR7: 00000400 [ 46.888037] Stack: [ 46.888037] fffffff8 00000010 00000003 f55c5cac f8c6a54c ffffffff f8c69eb2 00000000 [ 46.888037] f4783cdc f57f0070 f759c590 1001c580 00000003 0200000a 00000000 f5a88560 [ 46.888037] f5ba2600 f5a88560 00000041 00000000 f55c5d90 f8c6f4c7 00000008 f55c5cf2 [ 46.888037] Call Trace: [ 46.888037] [<f8c6a54c>] l2cap_send_cmd+0x1cc/0x230 [bluetooth] [ 46.888037] [<f8c69eb2>] ? l2cap_global_chan_by_psm+0x152/0x1a0 [bluetooth] [ 46.888037] [<f8c6f4c7>] l2cap_connect+0x3f7/0x540 [bluetooth] [ 46.888037] [<c019b37b>] ? trace_hardirqs_off+0xb/0x10 [ 46.888037] [<c01a0ff8>] ? mark_held_locks+0x68/0x110 [ 46.888037] [<c064ad20>] ? mutex_lock_nested+0x280/0x360 [ 46.888037] [<c064b9d9>] ? __mutex_unlock_slowpath+0xa9/0x150 [ 46.888037] [<c01a118c>] ? trace_hardirqs_on_caller+0xec/0x1b0 [ 46.888037] [<c064ad08>] ? mutex_lock_nested+0x268/0x360 [ 46.888037] [<c01a125b>] ? trace_hardirqs_on+0xb/0x10 [ 46.888037] [<f8c72f8d>] l2cap_recv_frame+0xb2d/0x1d30 [bluetooth] [ 46.888037] [<c01a0ff8>] ? mark_held_locks+0x68/0x110 [ 46.888037] [<c064b9d9>] ? __mutex_unlock_slowpath+0xa9/0x150 [ 46.888037] [<c01a118c>] ? trace_hardirqs_on_caller+0xec/0x1b0 [ 46.888037] [<f8c754f1>] l2cap_recv_acldata+0x2a1/0x320 [bluetooth] [ 46.888037] [<f8c491d8>] hci_rx_work+0x518/0x810 [bluetooth] [ 46.888037] [<f8c48df2>] ? hci_rx_work+0x132/0x810 [bluetooth] [ 46.888037] [<c0158979>] process_one_work+0x1a9/0x600 [ 46.888037] [<c01588fb>] ? process_one_work+0x12b/0x600 [ 46.888037] [<c015922e>] ? worker_thread+0x19e/0x320 [ 46.888037] [<c015922e>] ? worker_thread+0x19e/0x320 [ 46.888037] [<c0159187>] worker_thread+0xf7/0x320 [ 46.888037] [<c0159090>] ? rescuer_thread+0x290/0x290 [ 46.888037] [<c01602f8>] kthread+0xa8/0xb0 [ 46.888037] [<c0656777>] ret_from_kernel_thread+0x1b/0x28 [ 46.888037] [<c0160250>] ? flush_kthread_worker+0x120/0x120 [ 46.888037] Code: c3 90 8d 74 26 00 e8 63 fc ff ff eb e8 90 55 89 e5 83 ec 0c 89 5d f4 89 75 f8 89 7d fc 3e 8d 74 26 00 89 cb 89 c7 c1 e9 02 89 d6 <f3> a5 89 d9 83 e1 03 74 02 f3 a4 8b 5d f4 8b 75 f8 8b 7d fc 89 [ 46.888037] EIP: [<c03d94cd>] memcpy+0x1d/0x40 SS:ESP 0068:f55c5c5c [ 46.888037] CR2: 00000000f56ad000 [ 46.888037] ---[ end trace 0217c1f4d78714a9 ]--- Signed-off-by: Anderson Lizardo <anderson.lizardo@openbossa.org> Cc: stable@vger.kernel.org Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2013-06-13 13:05:39 -04:00
Eric Dumazet	d3b6f61418	ip_tunnel: remove __net_init/exit from exported functions If CONFIG_NET_NS is not set then __net_init is the same as __init and __net_exit is the same as __exit. These functions will be removed from memory after the module loads or is removed. Functions that are exported for use by other functions should never be labeled for removal. Bug introduced by commit `c544193214` ("GRE: Refactor GRE tunneling code.") Reported-by: Steinar H. Gunderson <sgunderson@bigfoot.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-13 03:00:59 -07:00
Ilan Peer	38745c7414	mac80211: Fix VHT bandwidth change event Signed-off-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-06-13 11:58:47 +02:00
Alexander Bondar	817cee7675	mac80211: track AP's beacon rate and give it to the driver Track the AP's beacon rate in the scan BSS data and in the interface configuration to let the drivers know which rate the AP is using. This information may be used by drivers, in our case to let the firmware optimise beacon RX. Signed-off-by: Alexander Bondar <alexander.bondar@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-06-13 11:58:47 +02:00
Saurabh Mohan	baafc77b32	net/ipv4: ip_vti clear skb cb before tunneling. If users apply shaper to vti tunnel then it will cause a kernel crash. The problem seems to be due to the vti_tunnel_xmit function not clearing skb->opt field before passing the packet to xfrm tunneling code. Signed-off-by: Saurabh Mohan <saurabh@vyatta.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-13 02:47:46 -07:00
Yuchung Cheng	85f16525a2	tcp: properly send new data in fast recovery in first RTT Linux sends new unset data during disorder and recovery state if all (suspected) lost packets have been retransmitted ( RFC5681, section 3.2 step 1 & 2, RFC3517 section 4, NexSeg() Rule 2). One requirement is to keep the receive window about twice the estimated sender's congestion window (tcp_rcv_space_adjust()), assuming the fast retransmits repair the losses in the next round trip. But currently it's not the case on the first round trip in either normal or Fast Open connection, beucase the initial receive window is identical to (expected) sender's initial congestion window. The fix is to double it. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-13 02:46:29 -07:00
Guillaume Nault	a6f79d0f26	l2tp: Fix sendmsg() return value PPPoL2TP sockets should comply with the standard send*() return values (i.e. return number of bytes sent instead of 0 upon success). Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-13 02:39:04 -07:00
Guillaume Nault	55b92b7a11	l2tp: Fix PPP header erasure and memory leak Copy user data after PPP framing header. This prevents erasure of the added PPP header and avoids leaking two bytes of uninitialised memory at the end of skb's data buffer. Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-13 02:39:04 -07:00
Joe Perches	fe2c6338fd	net: Convert uses of typedef ctl_table to struct ctl_table Reduce the uses of this unnecessary typedef. Done via perl script: $ git grep --name-only -w ctl_table net \| \ xargs perl -p -i -e '\ sub trim { my ($local) = @_; $local =~ s/(^\s+\|\s+$)//g; return $local; } \ s/\b(?<!struct\s)ctl_table\b(\s\\s*\|\s+\w+)/"struct ctl_table " . trim($1)/ge' Reflow the modified lines that now exceed 80 columns. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-13 02:36:09 -07:00
Flavio Leitner	194f4a6df2	net: make all team port device link events urgent Since team functionality relies heavily on userspace daemon, we need to deliver event to userspace via Netlink as quick as possible. So make all team port device link events urgent. Signed-off-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-13 02:31:41 -07:00
Daniel Borkmann	2dc85bf323	packet: packet_getname_spkt: make sure string is always 0-terminated uaddr->sa_data is exactly of size 14, which is hard-coded here and passed as a size argument to strncpy(). A device name can be of size IFNAMSIZ (== 16), meaning we might leave the destination string unterminated. Thus, use strlcpy() and also sizeof() while we're at it. We need to memset the data area beforehand, since strlcpy does not padd the remaining buffer with zeroes for user space, so that we do not possibly leak anything. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-13 01:38:36 -07:00
Wu Fengguang	a06a2d378d	net: ping_check_bind_addr() etc. can be static net/ipv4/ping.c:286:5: sparse: symbol 'ping_check_bind_addr' was not declared. Should it be static? net/ipv4/ping.c:355:6: sparse: symbol 'ping_set_saddr' was not declared. Should it be static? net/ipv4/ping.c:370:6: sparse: symbol 'ping_clear_saddr' was not declared. Should it be static? net/ipv6/ping.c:60:5: sparse: symbol 'dummy_ipv6_recv_error' was not declared. Should it be static? net/ipv6/ping.c:64:5: sparse: symbol 'dummy_ip6_datagram_recv_ctl' was not declared. Should it be static? net/ipv6/ping.c:69:5: sparse: symbol 'dummy_icmpv6_err_convert' was not declared. Should it be static? net/ipv6/ping.c:73:6: sparse: symbol 'dummy_ipv6_icmp_error' was not declared. Should it be static? net/ipv6/ping.c:75:5: sparse: symbol 'dummy_ipv6_chk_addr' was not declared. Should it be static? net/ipv6/ping.c:201:5: sparse: symbol 'ping_v6_seq_show' was not declared. Should it be static? Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-13 01:36:41 -07:00
Ben Greear	e562078a19	mac80211: Ensure tid_start_tx is protected by sta->lock All accesses of the tid_start_tx lock should be protected by sta->lock if there is any chance that another thread could still be accessing the sta object. Signed-off-by: Ben Greear <greearb@candelatech.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-06-13 10:27:07 +02:00
David S. Miller	e86c986137	Included change: - fix "rtnl locked" concurrent executions by using rtnl_lock instead of rtnl_trylock. This fix enables batman-adv initialisation to do not fail just because somewhere else in the system another code path is holding the rtnl lock. It is easy to see the problem when batman-adv is trying to start together with other networking components. - fix the routing protocol forwarding policy by enhancing the duplicate control packet detection. When the right circumstances trigger the issue, some nodes in the network become totally unreachable, so breaking the mesh connectivity. - fix the Bridge Loop Avoidance component by not running the originator address change handling routine when the component is disabled. The routine was generating useless packets that were sent over the network. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (GNU/Linux) iQIcBAABCAAGBQJRtkFtAAoJEADl0hg6qKeOBHUP/1Ni8juzvoQwR/gYjq0p2lHM LvkZ/KI74QP96rfDog+nhP9fcOgjSt2MF+sSa6is92RavrKFZQyO2J/GKkrHu2HA gaLKxN6S28Hi2qVbVXgSqT9RQ3XvpzIaojtNvb0tC1onzGzLtd6V5FURIq0FRHvN RvP4w+1HwH2CsQlgjQq1OPwUllVqTUzGYH0fl/U+0mw7h+q0ZWCA1IZln/t08xjl ViCCydbD1Th2tgK7uzFg8X3EJZN1CkrBWflb7X3YK5zeps1NC+l4OuUOK+K2L+fx vCMu603FXKi+SjM24d+eGJx6kQPCapYThIrp1qy43SLkNazRIAmgbZpndme0QP/8 eSUozWAusWIESJI3Krneh3i70agMeg2MK4nAp51z54j52urDlOGURyNf7TkieaT4 Vti5QG0poXncIb1XQ+yaKDCORwkn18QjmtfNmCCgT2YF91pOSYCrlgONi65K6DIs F4eDk7sTgHAIgYO/XEet/V5p06SO86ksF/C13Dqug64s3rkw9ejqgLZBEy3OH1AF IFgws3qE6GiSiXLMiiheplBYD51au+V1Jihqvw/lo2JzlOw4PRNRYsaQgVaUH/MJ jupEjA8V0swMtDIi6ixcPE/P60OJR41VuT8gVGWbrTKnHZ0yyIfgZwPcLZaQ2X0e EIlTJdtS7lVpleZ2C/H1 =oGCV -----END PGP SIGNATURE----- Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge Included change: - fix "rtnl locked" concurrent executions by using rtnl_lock instead of rtnl_trylock. This fix enables batman-adv initialisation to do not fail just because somewhere else in the system another code path is holding the rtnl lock. It is easy to see the problem when batman-adv is trying to start together with other networking components. - fix the routing protocol forwarding policy by enhancing the duplicate control packet detection. When the right circumstances trigger the issue, some nodes in the network become totally unreachable, so breaking the mesh connectivity. - fix the Bridge Loop Avoidance component by not running the originator address change handling routine when the component is disabled. The routine was generating useless packets that were sent over the network. Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-13 01:26:54 -07:00
Simon Horman	2c928e0e8d	sctp: Correct byte order of access to skb->{network, transport}_header Corrects an byte order conflict introduced by `158874cac6` ("sctp: Correct access to skb->{network, transport}_header"). The values in question are host byte order. Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-13 01:21:50 -07:00
Gao feng	ca15febfe9	netlink: make compare exist all the time Commit `da12c90e09` "netlink: Add compare function for netlink_table" only set compare at the time we create kernel netlink, and reset compare to NULL at the time we finially release netlink socket, but netlink_lookup wants the compare exist always. So we should set compare after we allocate nl_table, and never reset it. make comapre exist all the time. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-13 00:45:48 -07:00
Linus Torvalds	26e04462c8	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking update from David Miller: 1) Fix dump iterator in nfnl_acct_dump() and ctnl_timeout_dump() to dump all objects properly, from Pablo Neira Ayuso. 2) xt_TCPMSS must use the default MSS of 536 when no MSS TCP option is present. Fix from Phil Oester. 3) qdisc_get_rtab() looks for an existing matching rate table and uses that instead of creating a new one. However, it's key matching is incomplete, it fails to check to make sure the ->data[] array is identical too. Fix from Eric Dumazet. 4) ip_vs_dest_entry isn't fully initialized before copying back to userspace, fix from Dan Carpenter. 5) Fix ubuf reference counting regression in vhost_net, from Jason Wang. 6) When sock_diag dumps a socket filter back to userspace, we have to translate it out of the kernel's internal representation first. From Nicolas Dichtel. 7) davinci_mdio holds a spinlock while calling pm_runtime, which sleeps. Fix from Sebastian Siewior. 8) Timeout check in sh_eth_check_reset is off by one, from Sergei Shtylyov. 9) If sctp socket init fails, we can NULL deref during cleanup. Fix from Daniel Borkmann. 10) netlink_mmap() does not propagate errors properly, from Patrick McHardy. 11) Disable powersave and use minstrel by default in ath9k. From Sujith Manoharan. 12) Fix a regression in that SOCK_ZEROCOPY is not set on tuntap sockets which prevents vhost from being able to use zerocopy. From Jason Wang. 13) Fix race between port lookup and TX path in team driver, from Jiri Pirko. 14) Missing length checks in bluetooth L2CAP packet parsing, from Johan Hedberg. 15) rtlwifi fails to connect to networking using any encryption method other than WPA2. Fix from Larry Finger. 16) Fix iwlegacy build due to incorrect CONFIG_* ifdeffing for power management stuff. From Yijing Wang. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (35 commits) b43: stop format string leaking into error msgs ath9k: Use minstrel rate control by default Revert "ath9k_hw: Update rx gain initval to improve rx sensitivity" ath9k: Disable PowerSave by default net: wireless: iwlegacy: fix build error for il_pm_ops rtlwifi: Fix a false leak indication for PCI devices wl12xx/wl18xx: scan all 5ghz channels wl12xx: increase minimum singlerole firmware version required wl12xx: fix minimum required firmware version for wl127x multirole rtlwifi: rtl8192cu: Fix problem in connecting to WEP or WPA(1) networks mwifiex: debugfs: Fix out of bounds array access Bluetooth: Fix mgmt handling of power on failures Bluetooth: Fix missing length checks for L2CAP signalling PDUs Bluetooth: btmrvl: support Marvell Bluetooth device SD8897 Bluetooth: Fix checks for LE support on LE-only controllers team: fix checks in team_get_first_port_txable_rcu() team: move add to port list before port enablement team: check return value of team_get_port_by_index_rcu() for NULL tuntap: set SOCK_ZEROCOPY flag during open netlink: fix error propagation in netlink_mmap() ...	2013-06-12 17:18:29 -07:00
Johannes Berg	661eb3811d	mac80211: fix TX aggregation TID struct leak Ben reports that kmemleak is saying TX aggregation TID structs are leaked. Given his workload, I suspect that they're leaked because stations are destroyed before their aggregation sessions get a chance to start. Fix this by simply freeing structs that are not used yet. Reported-by: Ben Greear <greearb@candelatech.com> Tested-by: Ben Greear <greearb@candelatech.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-06-13 00:16:37 +02:00
Eric Dumazet	7c0cadc69c	udp: fix two sparse errors commit `ba418fa357` ("soreuseport: UDP/IPv4 implementation") added following sparse errors : net/ipv4/udp.c:433:60: warning: cast from restricted __be16 net/ipv4/udp.c:433:60: warning: incorrect type in argument 1 (different base types) net/ipv4/udp.c:433:60: expected unsigned short [unsigned] [usertype] val net/ipv4/udp.c:433:60: got restricted __be16 [usertype] sport net/ipv4/udp.c:433:60: warning: cast from restricted __be16 net/ipv4/udp.c:433:60: warning: cast from restricted __be16 net/ipv4/udp.c:514:60: warning: cast from restricted __be16 net/ipv4/udp.c:514:60: warning: incorrect type in argument 1 (different base types) net/ipv4/udp.c:514:60: expected unsigned short [unsigned] [usertype] val net/ipv4/udp.c:514:60: got restricted __be16 [usertype] sport net/ipv4/udp.c:514:60: warning: cast from restricted __be16 net/ipv4/udp.c:514:60: warning: cast from restricted __be16 Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-12 15:03:24 -07:00
Eric Dumazet	5b9b626377	gro: remove a sparse error Fix following sparse error : net/ipv4/af_inet.c:1410:59: warning: restricted __be16 degrades to integer added in commit `db8caf3dbc` ("gro: should aggregate frames without DF") Reported-by: kbuild test robot <fengguang.wu@intel.com> From: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-12 15:03:24 -07:00
David S. Miller	4a2e667ac1	Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next John W. Linville says: ==================== This pull request is intended for the 3.11 stream... One big highlight is the cw1200 driver the ST-E CW1100 & CW1200 WLAN chipsets. This one has been lingering for a while, lacking some review comments. Once started getting pulled into linux-next, it got a bit more attention and a number of improvements were made over the initial cut. No doubt there will be more changes ahead, but I think it is looking alright at this point. Along with that, there is the usual flurry of updates to the mac80211 core and the iwlwifi, mwifiex, ath9k, rt2x00, wil6210, and other drivers. A few of the highlights are some rt2x00 refactoring/cleanup by Gabor Juhos, some rt2800 hardware support enhancements by Stanislaw Gruszka, some iwlwifi power management updates from Alexander Bondar, some enhanced bcma SPROM support from Rafał Miłecki, and a variety of other things here and there. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-12 14:23:41 -07:00
Eric Dumazet	c70eba7453	igmp: fix new sparse errors Fix following sparse errors : net/ipv4/igmp.c:1222:25: warning: cast from restricted __be32 net/ipv4/igmp.c🔢31: warning: incorrect type in assignment (different address spaces) net/ipv4/igmp.c🔢31: expected struct ip_mc_list [noderef] <asn:4>next_hash net/ipv4/igmp.c🔢31: got struct ip_mc_list <noident> net/ipv4/igmp.c:1250:31: warning: incorrect type in assignment (different address spaces) net/ipv4/igmp.c:1250:31: expected struct ip_mc_list [noderef] <asn:4>next_hash net/ipv4/igmp.c:1250:31: got struct ip_mc_list <noident> net/ipv4/igmp.c:2380:37: warning: cast from restricted __be32 These were added by commit `e989707135` ("igmp: hash a hash table to speedup ip_check_mc_rcu()") Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-12 14:14:55 -07:00
John W. Linville	812fd64596	Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Conflicts: drivers/net/wireless/iwlwifi/mvm/mac80211.c	2013-06-12 15:39:05 -04:00
John W. Linville	861bca265e	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless Conflicts: drivers/net/wireless/ath/ath9k/Kconfig net/mac80211/iface.c	2013-06-12 14:35:23 -04:00
John W. Linville	6bb7aabf73	Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211	2013-06-12 14:26:48 -04:00
Linus Torvalds	8d7a8fe2ce	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull ceph fixes from Sage Weil: "There is a pair of fixes for double-frees in the recent bundle for 3.10, a couple of fixes for long-standing bugs (sleep while atomic and an endianness fix), and a locking fix that can be triggered when osds are going down" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: rbd: fix cleanup in rbd_add() rbd: don't destroy ceph_opts in rbd_add() ceph: ceph_pagelist_append might sleep while atomic ceph: add cpu_to_le32() calls when encoding a reconnect capability libceph: must hold mutex for reset_changed_osds()	2013-06-12 08:28:19 -07:00
John W. Linville	42d887a680	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem	2013-06-12 10:57:04 -04:00
Johan Hedberg	96570ffcca	Bluetooth: Fix mgmt handling of power on failures If hci_dev_open fails we need to ensure that the corresponding mgmt_set_powered command gets an appropriate response. This patch fixes the missing response by adding a new mgmt_set_powered_failed function that's used to indicate a power on failure to mgmt. Since a situation with the device being rfkilled may require special handling in user space the patch uses a new dedicated mgmt status code for this. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Cc: stable@vger.kernel.org Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2013-06-12 10:20:55 -04:00
Johan Hedberg	cb3b3152b2	Bluetooth: Fix missing length checks for L2CAP signalling PDUs There has been code in place to check that the L2CAP length header matches the amount of data received, but many PDU handlers have not been checking that the data received actually matches that expected by the specific PDU. This patch adds passing the length header to the specific handler functions and ensures that those functions fail cleanly in the case of an incorrect amount of data. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2013-06-12 10:20:54 -04:00
Johan Hedberg	757aee0f71	Bluetooth: Fix checks for LE support on LE-only controllers LE-only controllers do not support extended features so any kind of host feature bit checks do not make sense for them. This patch fixes code used for both single-mode (LE-only) and dual-mode (BR/EDR/LE) to use the HCI_LE_ENABLED flag instead of the "Host LE supported" feature bit for LE support tests. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2013-06-12 10:20:54 -04:00
Phil Oester	b396966c46	netfilter: xt_TCPMSS: Fix missing fragmentation handling Similar to commit `bc6bcb59` ("netfilter: xt_TCPOPTSTRIP: fix possible mangling beyond packet boundary"), add safe fragment handling to xt_TCPMSS. Signed-off-by: Phil Oester <kernel@linuxace.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-06-12 11:06:19 +02:00
Phil Oester	70d19f805f	netfilter: xt_TCPMSS: Fix IPv6 default MSS too As a followup to commit `409b545a` ("netfilter: xt_TCPMSS: Fix violation of RFC879 in absence of MSS option"), John Heffner points out that IPv6 has a higher MTU than IPv4, and thus a higher minimum MSS. Update TCPMSS target to account for this, and update RFC comment. While at it, point to more recent reference RFC1122 instead of RFC879. Signed-off-by: Phil Oester <kernel@linuxace.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-06-12 11:04:41 +02:00
Daniel Borkmann	7a6e288d27	pktgen: ipv6: numa: consolidate skb allocation to pktgen_alloc_skb We currently allow for numa-node aware skb allocation only within the fill_packet_ipv4() path, but not in fill_packet_ipv6(). Consolidate that code to a common allocation helper to enable numa-node aware skb allocation for ipv6, and use it in both paths. This also makes both functions a bit more readable. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-12 00:47:25 -07:00
Daniel Borkmann	da5bab079f	net: udp4: move GSO functions to udp_offload Similarly to TCP offloading and UDPv6 offloading, move all related UDPv4 functions to udp_offload.c to make things more explicit. Also, by this, we can make those functions static. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-12 00:47:25 -07:00
Shawn Bohrer	946d3bd723	igmp: remove unnecessary in_device member zeroing ip_mc_init_dev() is passed a freshly kzalloc'd in_device so it is unnecessary to explicitly zero out the members. Signed-off-by: Shawn Bohrer <sbohrer@rgmadvisors.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-12 00:41:15 -07:00
Eric Dumazet	e989707135	igmp: hash a hash table to speedup ip_check_mc_rcu() After IP route cache removal, multicast applications using a lot of multicast addresses hit a O(N) behavior in ip_check_mc_rcu() Add a per in_device hash table to get faster lookup. This hash table is created only if the number of items in mc_list is above 4. Reported-by: Shawn Bohrer <sbohrer@rgmadvisors.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Shawn Bohrer <sbohrer@rgmadvisors.com> Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-12 00:25:23 -07:00
Eric Dumazet	64153ce0a7	net_sched: htb: do not setup default rate estimators With a thousand htb classes, est_timer() spends ~5 million cpu cycles and throws out cpu cache, because each htb class has a default rate estimator (est 4sec 16sec). Most users do not use default rate estimators, so switch htb to not setup ones. Add a module parameter (htb_rate_est) so that users relying on this default rate estimator can revert the behavior. echo 1 >/sys/module/sch_htb/parameters/htb_rate_est Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-12 00:14:21 -07:00
Simon Wunderlich	795d855d56	mac80211: Fix rate control mask matching call The order of parameters was mixed up, introduced in commit "mac80211: improve the rate control API" Cc: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-06-12 09:12:43 +02:00
Simon Wunderlich	a6b368f6ca	mac80211: abort CAC in stop_ap() When a CAC is running and stop_ap is called (e.g. when hostapd is killed while performing CAC), the CAC must be aborted immediately. Otherwise ieee80211_stop_ap() will try to stop it when it's too late - wdev->channel is already NULL and the abort event can not be generated. Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-06-12 09:12:43 +02:00
Johannes Berg	35d865afbb	mac80211: work around broken APs not including HT info There are some APs, notably 2G/3G/4G Wifi routers, specifically the "Onda PN51T", "Vodafone PocketWiFi 2", "ZTE MF60" and a similar T-Mobile branded device [1] that erroneously don't include all the needed information in (re)association response frames. Work around this by assuming the information is the same as it was in the beacon or probe response and using the data from there instead. This fixes https://bugzilla.kernel.org/show_bug.cgi?id=58881. [1] https://bbs.archlinux.org/viewtopic.php?pid=1277305 Note that this requires marking the first ieee802_11_parse_elems() argument const, otherwise we'd get a compiler warning. Cc: stable@vger.kernel.org Reported-and-tested-by: Michal Zajac <manwe@manwe.pl> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-06-12 09:11:54 +02:00
Eric Dumazet	130d3d68b5	net_sched: psched_ratecfg_precompute() improvements Before allowing 64bits bytes rates, refactor psched_ratecfg_precompute() to get better comments and increased accuracy. rate_bps field is renamed to rate_bytes_ps, as we only have to worry about bytes per second. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ben Greear <greearb@candelatech.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-11 22:39:47 -07:00
John W. Linville	3899ba90a4	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem Conflicts: drivers/net/wireless/ath/ath9k/debug.c net/mac80211/iface.c	2013-06-11 14:48:32 -04:00
Johannes Berg	940d0ac9db	cfg80211: fix rtnl leak in wiphy dump error cases In two wiphy dump error cases, most often when the dump allocation must be increased, the RTNL is leaked. This quickly results in a complete system lockup. Release the RTNL correctly. Reported-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-06-11 16:52:39 +02:00
Antonio Quartulli	ea141b75ae	nl80211: allow sending CMD_FRAME without specifying any frequency Users may want to send a frame on the current channel without specifying it. This is particularly useful for the correct implementation of the IBSS/RSN support in wpa_supplicant which requires to receive and send AUTH frames. Make mgmt_tx pass a NULL channel to the driver if none has been specified by the user. Signed-off-by: Antonio Quartulli <antonio@open-mesh.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-06-11 15:01:36 +02:00
Antonio Quartulli	f7aeb6fb1a	mac80211: make mgmt_tx accept a NULL channel cfg80211 passes a NULL channel to mgmt_tx if the frame has to be sent on the one currently in use by the device. Make the implementation of mgmt_tx correctly handle this case. Fail if offchan is required. Signed-off-by: Antonio Quartulli <antonio@open-mesh.com> [fix RCU locking] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-06-11 15:01:24 +02:00
Jouni Malinen	3d124ea27a	cfg80211: fix VHT TDLS peer AID verification I (Johannes) accidentally applied the first version of the patch ("Allow TDLS peer AID to be configured for VHT"). Now apply just the changes between v1 and v2 to get the AID verification and prefer the new attribute over the old one. Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-06-11 14:36:43 +02:00
Ashok Nagarajan	ffb3cf3000	{nl,mac,cfg}80211: Allow user to configure basic rates for mesh Currently mesh uses mandatory rates as the default basic rates. Allow basic rates to be configured during mesh join. Basic rates are applied only if channel is also provided with mesh join command. Signed-off-by: Ashok Nagarajan <ashok@cozybit.com> [some whitespace fixes, refuse basic rates w/o channel] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-06-11 14:24:36 +02:00
Colleen Twitty	66de671374	mac80211: expire mesh peers based on mesh configuration The time it takes to see the peer link expire may differ by a minute since sta_expire() is run once a minute as a mesh housekeeping task. Signed-off-by: Colleen Twitty <colleen@cozybit.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-06-11 14:16:29 +02:00
Colleen Twitty	8e7c053853	{nl,cfg}80211: make peer link expiration time configurable If a STA has a peer that it hasn't seen any tx activity from for a certain length of time, the peer link is expired. This means the inactive STA is removed from the list of peers and that STA is not considered a peer again unless it re-peers. Previously, this inactivity time was always 30 minutes. Now, add it to the mesh configuration and allow it to be configured. Retain 30 minutes as a default value. Signed-off-by: Colleen Twitty <colleen@cozybit.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-06-11 14:16:29 +02:00
Thomas Pedersen	ecccd072b0	mac80211: fix mesh deadlock The patch "cfg80211/mac80211: use cfg80211 wdev mutex in mac80211" introduced several deadlocks by converting the ifmsh->mtx to wdev->mtx. Solve these by: 1. drop the cancel_work_sync() in ieee80211_stop_mesh(). Instead make the mesh work conditional on whether the mesh is running or not. 2. lock the mesh work with sdata_lock() to protect beacon updates and prevent races with wdev->mesh_id_len or cfg80211. Signed-off-by: Thomas Pedersen <thomas@cozybit.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-06-11 13:14:42 +02:00
Patrick McHardy	7cdbac71f9	netlink: fix error propagation in netlink_mmap() Return the error if something went wrong instead of unconditionally returning 0. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-11 02:52:47 -07:00
Eric Dumazet	45203a3b38	net_sched: add 64bit rate estimators struct gnet_stats_rate_est contains u32 fields, so the bytes per second field can wrap at 34360Mbit. Add a new gnet_stats_rate_est64 structure to get 64bit bps/pps fields, and switch the kernel to use this structure natively. This structure is dumped to user space as a new attribute : TCA_STATS_RATE_EST64 Old tc command will now display the capped bps (to 34360Mbit), instead of wrapped values, and updated tc command will display correct information. Old tc command output, after patch : eric:~# tc -s -d qd sh dev lo qdisc pfifo 8001: root refcnt 2 limit 1000p Sent 80868245400 bytes 1978837 pkt (dropped 0, overlimits 0 requeues 0) rate 34360Mbit 189696pps backlog 0b 0p requeues 0 This patch carefully reorganizes "struct Qdisc" layout to get optimal performance on SMP. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-11 02:51:03 -07:00
Daniel Borkmann	1abd165ed7	net: sctp: fix NULL pointer dereference in socket destruction While stress testing sctp sockets, I hit the following panic: BUG: unable to handle kernel NULL pointer dereference at 0000000000000020 IP: [<ffffffffa0490c4e>] sctp_endpoint_free+0xe/0x40 [sctp] PGD 7cead067 PUD 7ce76067 PMD 0 Oops: 0000 [#1] SMP Modules linked in: sctp(F) libcrc32c(F) [...] CPU: 7 PID: 2950 Comm: acc Tainted: GF 3.10.0-rc2+ #1 Hardware name: Dell Inc. PowerEdge T410/0H19HD, BIOS 1.6.3 02/01/2011 task: ffff88007ce0e0c0 ti: ffff88007b568000 task.ti: ffff88007b568000 RIP: 0010:[<ffffffffa0490c4e>] [<ffffffffa0490c4e>] sctp_endpoint_free+0xe/0x40 [sctp] RSP: 0018:ffff88007b569e08 EFLAGS: 00010292 RAX: 0000000000000000 RBX: ffff88007db78a00 RCX: dead000000200200 RDX: ffffffffa049fdb0 RSI: ffff8800379baf38 RDI: 0000000000000000 RBP: ffff88007b569e18 R08: ffff88007c230da0 R09: 0000000000000001 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: ffff880077990d00 R14: 0000000000000084 R15: ffff88007db78a00 FS: 00007fc18ab61700(0000) GS:ffff88007fc60000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000020 CR3: 000000007cf9d000 CR4: 00000000000007e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Stack: ffff88007b569e38 ffff88007db78a00 ffff88007b569e38 ffffffffa049fded ffffffff81abf0c0 ffff88007db78a00 ffff88007b569e58 ffffffff8145b60e 0000000000000000 0000000000000000 ffff88007b569eb8 ffffffff814df36e Call Trace: [<ffffffffa049fded>] sctp_destroy_sock+0x3d/0x80 [sctp] [<ffffffff8145b60e>] sk_common_release+0x1e/0xf0 [<ffffffff814df36e>] inet_create+0x2ae/0x350 [<ffffffff81455a6f>] __sock_create+0x11f/0x240 [<ffffffff81455bf0>] sock_create+0x30/0x40 [<ffffffff8145696c>] SyS_socket+0x4c/0xc0 [<ffffffff815403be>] ? do_page_fault+0xe/0x10 [<ffffffff8153cb32>] ? page_fault+0x22/0x30 [<ffffffff81544e02>] system_call_fastpath+0x16/0x1b Code: 0c c9 c3 66 2e 0f 1f 84 00 00 00 00 00 e8 fb fe ff ff c9 c3 66 0f 1f 84 00 00 00 00 00 55 48 89 e5 53 48 83 ec 08 66 66 66 66 90 <48> 8b 47 20 48 89 fb c6 47 1c 01 c6 40 12 07 e8 9e 68 01 00 48 RIP [<ffffffffa0490c4e>] sctp_endpoint_free+0xe/0x40 [sctp] RSP <ffff88007b569e08> CR2: 0000000000000020 ---[ end trace e0d71ec1108c1dd9 ]--- I did not hit this with the lksctp-tools functional tests, but with a small, multi-threaded test program, that heavily allocates, binds, listens and waits in accept on sctp sockets, and then randomly kills some of them (no need for an actual client in this case to hit this). Then, again, allocating, binding, etc, and then killing child processes. This panic then only occurs when ``echo 1 > /proc/sys/net/sctp/auth_enable'' is set. The cause for that is actually very simple: in sctp_endpoint_init() we enter the path of sctp_auth_init_hmacs(). There, we try to allocate our crypto transforms through crypto_alloc_hash(). In our scenario, it then can happen that crypto_alloc_hash() fails with -EINTR from crypto_larval_wait(), thus we bail out and release the socket via sk_common_release(), sctp_destroy_sock() and hit the NULL pointer dereference as soon as we try to access members in the endpoint during sctp_endpoint_free(), since endpoint at that time is still NULL. Now, if we have that case, we do not need to do any cleanup work and just leave the destruction handler. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-11 02:49:05 -07:00
Peter Pan(潘卫平)	b41abb42bf	net: pass correct parameter to skb_headers_offset_update() Since commit 1a37e412a022(net: Use 16bits for _headers fields of struct skbuff), skb->_header are relative to skb->head, so copy_skb_header() should not call skb_headers_offset_update() now, and we should pass correct parameter to skb_headers_offset_update() in pskb_expand_head() and skb_copy_expand(). Signed-off-by: Weiping Pan <panweiping3@gmail.com> Reviewed-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-11 02:48:06 -07:00
Gao feng	da12c90e09	netlink: Add compare function for netlink_table As we know, netlink sockets are private resource of net namespace, they can communicate with each other only when they in the same net namespace. this works well until we try to add namespace support for other subsystems which use netlink. Don't like ipv4 and route table.., it is not suited to make these subsytems belong to net namespace, Such as audit and crypto subsystems,they are more suitable to user namespace. So we must have the ability to make the netlink sockets in same user namespace can communicate with each other. This patch adds a new function pointer "compare" for netlink_table, we can decide if the netlink sockets can communicate with each other through this netlink_table self-defined compare function. The behavior isn't changed if we don't provide the compare function for netlink_table. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-11 02:39:42 -07:00
Vlad Yasevich	867a59436f	bridge: Add a flag to control unicast packet flood. Add a flag to control flood of unicast traffic. By default, flood is on and the bridge will flood unicast traffic if it doesn't know the destination. When the flag is turned off, unicast traffic without an FDB will not be forwarded to the specified port. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-11 02:04:32 -07:00
Vlad Yasevich	9ba18891f7	bridge: Add flag to control mac learning. Allow user to control whether mac learning is enabled on the port. By default, mac learning is enabled. Disabling mac learning will cause new dynamic FDB entries to not be created for a particular port. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-11 02:04:32 -07:00
Nicolas Dichtel	ed13998c31	sock_diag: fix filter code sent to userspace Filters need to be translated to real BPF code for userland, like SO_GETFILTER. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-10 22:23:32 -07:00
Cong Wang	30f3a40f9a	net: remove last caller of skb_tail_offset() and itself Similar to the following commits: commit `00f97da17a` (netpoll: fix position of network header) commit `525cebedb3` (pktgen: Fix position of ip and udp header) using skb_tail_offset() seems not correct since the offset is based on head pointer. With the last caller removed, skb_tail_offset() can be killed finally. Cc: Thomas Graf <tgraf@suug.ch> Cc: Daniel Borkmann <dborkmann@redhat.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-10 22:22:23 -07:00
Eliezer Tamir	d30e383bb8	tcp: add low latency socket poll support. Adds low latency socket poll support for TCP. In tcp_v[46]_rcv() add a call to sk_mark_ll() to copy the napi_id from the skb to the sk. In tcp_recvmsg(), when there is no data in the socket we busy-poll. This is a good example of how to add busy-poll support to more protocols. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com> Acked-by: Eric Dumazet <edumazet@google.com> Tested-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-10 21:22:36 -07:00
Eliezer Tamir	a5b50476f7	udp: add low latency socket poll support Add upport for busy-polling on UDP sockets. In __udp[46]_lib_rcv add a call to sk_mark_ll() to copy the napi_id from the skb into the sk. This is done at the earliest possible moment, right after we identify which socket this skb is for. In __skb_recv_datagram When there is no data and the user tries to read we busy poll. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com> Acked-by: Eric Dumazet <edumazet@google.com> Tested-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-10 21:22:36 -07:00
Eliezer Tamir	0602129286	net: add low latency socket poll Adds an ndo_ll_poll method and the code that supports it. This method can be used by low latency applications to busy-poll Ethernet device queues directly from the socket code. sysctl_net_ll_poll controls how many microseconds to poll. Default is zero (disabled). Individual protocol support will be added by subsequent patches. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com> Acked-by: Eric Dumazet <edumazet@google.com> Tested-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-10 21:22:35 -07:00
Eliezer Tamir	af12fa6e46	net: add napi_id and hash Adds a napi_id and a hashing mechanism to lookup a napi by id. This will be used by subsequent patches to implement low latency Ethernet device polling. Based on a code sample by Eric Dumazet. Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-10 21:22:35 -07:00
Linus Torvalds	1b79821fe7	zero copy error fix -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) Comment: GPGTools - http://gpgtools.org iQIcBAABAgAGBQJRtjfAAAoJEDZk62b0Tg6xEHcP/jtZbupCJBVmjvI5zBDjn4NT RnOoaZLz4ks1OTj2WkhkdMe7/dNh+Dlq/w9F9nw4jOZhkSN90cgDZJqK/otHqh3R GpXEwNS5NLysTu9qlA2Kd49C77KynafEUtFR4k0zSB/q8Uh+g8D480bps1xBeIUx ATlBSuYPy6SZM1PWqgllnP7IzATQtaHJSQ8uXCkfSneG+dj+13Fq9LngH7DCpKA6 T4bU8Mx8kXlR0RcVtjjqFAf7TmOe05XG/WhgsaX5gGcpxuZ5FajCaIZeuD1rQ63K jLKA56VxUGa5FU6ZBZH0I+52xhdceByJdFcTW8X8fhXU6m82NYhKnMv64QZ+UjT0 F2vNlzW7SkYvnMkjvxor/JLyi0OKIrMfElPYcuMW8pU5AVV9Hzf0M1NHLCeM2CNn JoghVzOPcUSistYldAJkivCcGi9sEgNadTsBrU1TM8FwkaHboUXprNf8xn2SRm1V iD/25/voPqbsVPtDSO9YsVEi6NldLx07yQwJy34jJ1D5Qw4MPtfPlUTLT17ikCc4 kjY1fiov+HOcEeCse1X3IxqEB3RK/KTR6NUqOk1yY2DlLxU1ZhJfrpelOTDV9iXc 5Ra71ebsBYb2gQqk6+do/IUxKMiLBGrt5l2KLq2ZdhWYn8Fr8O7B5gaX9BH8aOwH L5ROAeZxkgSJBjuKMcsi =+vsA -----END PGP SIGNATURE----- Merge tag '9p-3.10-bug-fix-1' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs Pull net/9p bug fix from Eric Van Hensbergen: "zero copy error fix" * tag '9p-3.10-bug-fix-1' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: net/9p: Handle error in zero copy request correctly for 9p2000.u	2013-06-10 17:35:25 -07:00
Pablo Neira Ayuso	ed82c43732	netfilter: xt_TCPOPTSTRIP: don't use tcp_hdr() In (`bc6bcb5` netfilter: xt_TCPOPTSTRIP: fix possible mangling beyond packet boundary), the use of tcp_hdr was introduced. However, we cannot assume that skb->transport_header is set for non-local packets. Cc: Florian Westphal <fw@strlen.de> Reported-by: Phil Oester <kernel@linuxace.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-06-11 01:55:07 +02:00
David S. Miller	d88210910a	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf Pablo Neira Ayuso says: ==================== The following patchset contains four fixes for Netfilter and one fix for IPVS, they are: * Fix data leak to user-space via getsockopt IP_VS_SO_GET_DESTS, from Dan Carpenter. * Fix xt_TCPMSS if no TCP MSS is specified in syn packets, to avoid the violation of RFC879, from Phil Oester. * Fix incomplete dump of objects via nfnetlink_acct and nfnetlink_cttimeout, from myself. * Fix missing HW protocol in packets passed to user-space via NFQUEUE, from myself. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-10 13:30:33 -07:00
Dan Carpenter	a8241c6351	ipvs: info leak in __ip_vs_get_dest_entries() The entry struct has a 2 byte hole after ->port and another 4 byte hole after ->stats.outpkts. You must have CAP_NET_ADMIN in your namespace to hit this information leak. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-06-10 14:53:00 +02:00
Simon Wunderlich	d5b4c93e67	batman-adv: Don't handle address updates when bla is disabled The bridge loop avoidance has a hook to handle address updates of the originator. These should not be handled when bridge loop avoidance is disabled - it might send some bridge loop avoidance packets which should not appear if bla is disabled. Signed-off-by: Simon Wunderlich <simon@open-mesh.com> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2013-06-10 08:42:18 +02:00
Simon Wunderlich	7c24bbbeab	batman-adv: forward late OGMs from best next hop When a packet is received from another node first and later from the best next hop, this packet is dropped. However the first OGM was sent with the BATADV_NOT_BEST_NEXT_HOP flag and thus dropped by neighbors. The late OGM from the best neighbor is then dropped because it is a duplicate. If this situation happens constantly, a node might end up not forwarding the "valid" OGMs anymore, and nodes behind will starve from not getting valid OGMs. Fix this by refining the duplicate checking behaviour: The actions should depend on whether it was a duplicate for a neighbor only or for the originator. OGMs which are not duplicates for a specific neighbor will now be considered in batadv_iv_ogm_forward(), but only actually forwarded for the best next hop. Therefore, late OGMs from the best next hop are forwarded now and not dropped as duplicates anymore. Signed-off-by: Simon Wunderlich <simon@open-mesh.com> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2013-06-10 08:42:17 +02:00
Matthias Schiffer	ac16d1484e	batman-adv: wait for rtnl in batadv_store_mesh_iface instead of failing if it is taken The rtnl_lock in batadv_store_mesh_iface has been converted to a rtnl_trylock some time ago to avoid a possible deadlock between rtnl and s_active on removal of the sysfs nodes. The behaviour introduced by that was quite confusing as it could lead to the sysfs store to fail, making batman-adv setup scripts unreliable. As recently the sysfs removal was postponed to a worker not running with the rtnl taken, the deadlock can't occur any more and it is safe to change the trylock back to a lock to make the sysfs store reliable again. Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net> Reviewed-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2013-06-10 08:42:16 +02:00
Greg Kroah-Hartman	38a4671cad	Merge 3.10-rc5 into char-misc-next	2013-06-08 22:34:53 -07:00
Pablo Neira Ayuso	c05cdb1b86	netlink: allow large data transfers from user-space I can hit ENOBUFS in the sendmsg() path with a large batch that is composed of many netlink messages. Here that limit is 8 MBytes of skbuff data area as kmalloc does not manage to get more than that. While discussing atomic rule-set for nftables with Patrick McHardy, we decided to put all rule-set updates that need to be applied atomically in one single batch to simplify the existing approach. However, as explained above, the existing netlink code limits us to a maximum of ~20000 rules that fit in one single batch without hitting ENOBUFS. iptables does not have such limitation as it is using vmalloc. This patch adds netlink_alloc_large_skb() which is only used in the netlink_sendmsg() path. It uses alloc_skb if the memory requested is <= one memory page, that should be the common case for most subsystems, else vmalloc for higher memory allocations. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-07 16:26:34 -07:00
Eric Dumazet	40edeff6e1	net_sched: qdisc_get_rtab() must check data[] array qdisc_get_rtab() should check not only the keys in struct tc_ratespec, but also the full data[] array. "tc ... linklayer atm " only perturbs values in the 256 slots array. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-07 15:24:04 -07:00
Daniel Borkmann	28850dc7c7	net: tcp: move GRO/GSO functions to tcp_offload Would be good to make things explicit and move those functions to a new file called tcp_offload.c, thus make this similar to tcpv6_offload.c. While moving all related functions into tcp_offload.c, we can also make some of them static, since they are only used there. Also, add an explicit registration function. Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-07 14:39:05 -07:00
Daniel Borkmann	5ee9859157	net: minor: tcp: use tcp_skb_mss helper in tcp_tso_segment We have the minimal inline helper tcp_skb_mss to access skb_shinfo(skb)->gso_size, so also use it here to get mss. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-07 14:39:05 -07:00
Pablo Neira Ayuso	7b8dfe289f	netfilter: nfnetlink_queue: fix missing HW protocol Locally generated IPv4 and IPv6 traffic gets skb->protocol unset, thus passing zero. ip6tables -I OUTPUT -j NFQUEUE libmnl/examples/netfilter# ./nf-queue 0 & ping6 ::1 packet received (id=1 hw=0x0000 hook=3) ^^^^^^ Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-06-07 18:55:20 +02:00
David S. Miller	93a306aef5	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Merge 'net' into 'net-next' to get the MSG_CMSG_COMPAT regression fix. Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-06 23:39:26 -07:00
Trond Myklebust	9ec2ef53b9	SUNRPC: Remove redundant call to rpc_set_running() in __rpc_execute() The RPC_TASK_RUNNING flag will always have been set in rpc_make_runnable() once we get past the test for out_of_line_wait_on_bit() returning ERESTARTSYS. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-06-06 16:24:40 -04:00
Trond Myklebust	0053a8e65c	SUNRPC: Remove unused function rpc_queue_empty Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-06-06 16:24:39 -04:00
Trond Myklebust	a76580fbf0	SUNRPC: Fix a potential race in rpc_execute If the rpc_task is asynchronous, it could theoretically finish executing on the workqueue it was assigned by rpc_make_runnable() before we get round to testing RPC_IS_ASYNC() in rpc_execute. In practice, however, all the existing callers hold a reference to the rpc_task, so this can't happen today... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-06-06 16:24:38 -04:00
Andy Lutomirski	a7526eb5d0	net: Unbreak compat_sys_{send,recv}msg I broke them in this commit: commit `1be374a051` Author: Andy Lutomirski <luto@amacapital.net> Date: Wed May 22 14:07:44 2013 -0700 net: Block MSG_CMSG_COMPAT in send(m)msg and recv(m)msg This patch adds __sys_sendmsg and __sys_sendmsg as common helpers that accept MSG_CMSG_COMPAT and blocks MSG_CMSG_COMPAT at the syscall entrypoints. It also reverts some unnecessary checks in sys_socketcall. Apparently I was suffering from underscore blindness the first time around. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Tested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-06 11:52:14 -07:00
David S. Miller	143554ace8	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next Conflicts: net/netfilter/nf_log.c The conflict in nf_log.c is that in 'net' we added CONFIG_PROC_FS protection around foo_proc_entry() calls to fix a build failure, whereas in Pablo's tree a guard if() test around a call is remove_proc_entry() was removed. Trivially resolved. Pablo Neira Ayuso says: ==================== The following patchset contains the first batch of Netfilter/IPVS updates for your net-next tree, they are: * Three patches with improvements and code refactorization for nfnetlink_queue, from Florian Westphal. * FTP helper now parses replies without brackets, as RFC1123 recommends, from Jeff Mahoney. * Rise a warning to tell everyone about ULOG deprecation, NFLOG has been already in the kernel tree for long time and supersedes the old logging over netlink stub, from myself. * Don't panic if we fail to load netfilter core framework, just bail out instead, from myself. * Add cond_resched_rcu, used by IPVS to allow rescheduling while walking over big hashtables, from Simon Horman. * Change type of IPVS sysctl_sync_qlen_max sysctl to avoid possible overflow, from Zhang Yanfei. * Use strlcpy instead of strncpy to skip zeroing of already initialized area to write the extension names in ebtables, from Chen Gang. * Use already existing per-cpu notrack object from xt_CT, from Eric Dumazet. * Save explicit socket lookup in xt_socket now that we have early demux, also from Eric Dumazet. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-06 01:03:06 -07:00
Fan Du	4c4d41f200	xfrm: add LINUX_MIB_XFRMACQUIREERROR statistic counter When host ping its peer, ICMP echo request packet triggers IPsec policy, then host negotiates SA secret with its peer. After IKE installed SA for OUT direction, but before SA for IN direction installed, host get ICMP echo reply from its peer. At the time being, the SA state for IN direction could be XFRM_STATE_ACQ, then the received packet will be dropped after adding LINUX_MIB_XFRMINSTATEINVALID statistic. Adding a LINUX_MIB_XFRMACQUIREERROR statistic counter for such scenario when SA in larval state is much clearer for user than LINUX_MIB_XFRMINSTATEINVALID which indicates the SA is totally bad. Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2013-06-06 06:45:55 +02:00
David S. Miller	6bc19fb82d	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Merge 'net' bug fixes into 'net-next' as we have patches that will build on top of them. This merge commit includes a change from Emil Goode (emilgoode@gmail.com) that fixes a warning that would have been introduced by this merge. Specifically it fixes the pingv6_ops method ipv6_chk_addr() to add a "const" to the "struct net_device *dev" argument and likewise update the dummy_ipv6_chk_addr() declaration. Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-05 16:37:30 -07:00
Andy Shevchenko	4cd5773a2a	net: core: move mac_pton() to lib/net_utils.c Since we have at least one user of this function outside of CONFIG_NET scope, we have to provide this function independently. The proposed solution is to move it under lib/net_utils.c with corresponding configuration variable and select wherever it is needed. Signed-off-by: Andy Shevchenko <andy.shevchenko@gmail.com> Reported-by: Arnd Bergmann <arnd@arndb.de> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-06-05 12:00:27 -07:00
Phil Oester	409b545ac1	netfilter: xt_TCPMSS: Fix violation of RFC879 in absence of MSS option The clamp-mss-to-pmtu option of the xt_TCPMSS target can cause issues connecting to websites if there was no MSS option present in the original SYN packet from the client. In these cases, it may add a MSS higher than the default specified in RFC879. Fix this by never setting a value > 536 if no MSS option was specified by the client. This closes netfilter's bugzilla #662. Signed-off-by: Phil Oester <kernel@linuxace.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-06-05 13:59:22 +02:00
Florian Westphal	7f87712c01	netfilter: nfnetlink_queue: only add CAP_LEN attr when needed CAP_LEN contains the size of the network packet we're queueing to userspace, i.e. normally it is the same as the NFQA_PAYLOAD attribute len. Include it only in the unlikely case when NFQA_PAYLOAD is truncated due to copy_range limitations. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-06-05 12:40:54 +02:00
Florian Westphal	9cefbbc9c8	netfilter: nfnetlink_queue: cleanup copy_range usage For every packet queued, we check if configured copy_range is 0, and treat that as 'copy entire packet'. We can move this check to the queue configuration, and can set copy_range appropriately. Also, convert repetitive '0xffff - NLA_HDRLEN' to a macro. [ queue initialization still used 0xffff, although its harmless since the initial setting is overwritten on queue config ] Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-06-05 12:40:28 +02:00
Pablo Neira Ayuso	37bc4f8dfa	netfilter: nfnetlink_cttimeout: fix incomplete dumping of objects Fix broken incomplete object dumping if the list of objects does not fit into one single netlink message. Reported-by: Gabriel Lazar <Gabriel.Lazar@com.utcluj.ro> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-06-05 12:36:37 +02:00
Pablo Neira Ayuso	991a6b735f	netfilter: nfnetlink_acct: fix incomplete dumping of objects Fix broken incomplete object dumping if the list of objects does not fit into one single netlink message. Reported-by: Gabriel Lazar <Gabriel.Lazar@com.utcluj.ro> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-06-05 12:36:36 +02:00
Linus Torvalds	4d3797d7e1	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Fix timeouts with direct mode authentication in mac80211, from Stanislaw Gruszka. 2) Aggregation sessions can deadlock in ath9k, from Felix Fietkau. 3) Netfilter's xt_addrtype doesn't work with ipv6 due to route lookups creating undesirable cache entries, from Florian Westphal. 4) Fix netfilter's ipt_ULOG from generating non-NULL terminated strings. 5) Fix netdev transmit queue crashes in mac80211, from Johannes Berg. 6) Fix copy and paste error in 802.11 stack that broke reporting of 64-bit station tx statistics, from Felix Fietkau. 7) When qlge_probe fails, it leaks the netdev. Fix from Wei Yongjun. 8) SKB control block (where we store the IP options information, amongst other things) must be cleared properly otherwise ICMP sending can crash for IP tunnels. Fix from Eric Dumazet. 9) Verification of Energy Efficient Ether support was coded wrongly, the test was inversed. Fix from Giuseppe CAVALLARO. 10) TCP handles redirects improperly because the wrong flow key is used for the route lookup. From Michal Kubecek. 11) Don't interpret MSG_CMSG_COMPAT from userspace, fix from Andy Lutomirski. 12) The new AF_VSOCK was missing from the lockdep string table, fix from Federico Vaga. 13) be2net doesn't handle checksumming of IP fragments properly, from Somnath Kotur. 14) Fix several bugs in the device address list code that lead to crashes and other misbehaviors. From Jay Vosburgh. 15) Fix ipv6 segmentation handling of fragmented GRE tunnel traffic, from Pravin B Shalr. 16) Fix usage of stale policies in IPSEC layer, from Paul Moore. 17) Fix team driver dump of ports when there are a large number of them, from Jiri Pirko. 18) Fix softlockups in UDP ipv4 socket lookup causes by and error in the hlist_nulls_for_each_entry_rcu() macro. From Eric Dumazet. 19) Fix several regressions added by the high rate accuracy changes to the htb packet scheduler. From Eric Dumazet. 20) Fix DMA'ing onto the stack in esd_usb2 and peak_usb CAN drivers, from Olivier Sobrie and Marc Kleine-Budde. 21) Fix unremovable network devices due to missing route pointer installation in the per-device ipv6 address list entries. From Gao feng. 22) Apply the tg3 5719 DMA workaround on 5720 chips as well, otherwise we get stalls. From Nithin Sujir. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (68 commits) net_sched: htb: do not mix 1ns and 64ns time units net: fix sk_buff head without data area tg3: Add read dma workaround for 5720 net: ethernet: xilinx_emaclite: set protocol selector bits when writing ANAR bnx2x: Fix bridged GSO for 57710/57711 chips net: fec: add fallback to random MAC address bnx2x: fix TCP offload for tunneling ipv4 over ipv6 ipv6: assign rt6_info to inet6_ifaddr in init_loopback net/mlx4_core: Keep VF assigned MAC in the PF admin table net/mlx4_en: Handle unassigned VF MAC address correctly net/mlx4_core: Return -EPROBE_DEFER when a VF is probed before PF is sufficiently initialized net/mlx4_en: Fix adaptive moderation cq update net: can: peak_usb: Do not do dma on the stack net: can: esd_usb2: Do not do dma on the stack net: can: kvaser_usb: fix reception on "USBcan Pro" and "USBcan R" type hardware. net_sched: restore "overhead xxx" handling net: force a reload of first item in hlist_nulls_for_each_entry_rcu hyperv: Fix vlan_proto setting in netvsc_recv_callback() team: fix port list dump for big number of ports list: introduce list_first_entry_or_null ...	2013-06-05 19:19:04 +09:00
Johannes Berg	780b40df12	wireless: fix kernel-doc Some kernel-doc fixes for forgotten fields and renamed things. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-06-05 09:33:32 +02:00
Alexander Bondar	989c6505cd	mac80211: Use suitable semantics for beacon availability indication Currently beacon availability upon association is marked by have_beacon flag of assoc_data structure that becomes unavailable when association completes. However beacon availability indication is required also after association to inform a driver. Currently dtim_period parameter is used for this purpose. Move have_beacon flag to another structure, persistant throughout a interface's life cycle. Use suitable sematics for beacon availability indication. Signed-off-by: Alexander Bondar <alexander.bondar@intel.com> [fix another instance of BSS_CHANGED_DTIM_PERIOD in docs] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-06-05 09:12:20 +02:00
Alexander Bondar	482a9c74fa	mac80211: fix powersave bug and clean up ieee80211_rx_bss_info ieee80211_rx_bss_info() deals with dtim_period setting and PS update when associated. Move all these to another locations cleaning this function. Also, the current implementation is buggy because when it calls ieee80211_recalc_ps() bss_conf->dtim_period is notset properly yet and thus nothing will happen. Signed-off-by: Alexander Bondar <alexander.bondar@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-06-05 08:52:47 +02:00
Eric Dumazet	5343a7f8be	net_sched: htb: do not mix 1ns and 64ns time units commit `56b765b79` ("htb: improved accuracy at high rates") added another regression for low rates, because it mixes 1ns and 64ns time units. So the maximum delay (mbuffer) was not 60 second, but 937 ms. Lets convert all time fields to 1ns as 64bit arches are becoming the norm. Reported-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-04 17:44:07 -07:00
Amerigo Wang	00f97da17a	netpoll: fix position of network header Similar to the problem in pktgen, netpoll uses skb_tail_offset() too, as the code is copied from pktgen. Also use return values of skb_put() directly, this will simiplify the code. Reported-by: Thomas Graf <tgraf@suug.ch> Cc: Thomas Graf <tgraf@suug.ch> Cc: Daniel Borkmann <dborkmann@redhat.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-04 17:37:04 -07:00
Thomas Graf	525cebedb3	pktgen: Fix position of ip and udp header skb_set_network_header() expects an offset based on the data pointer whereas skb_tail_offset() also includes the headroom. This resulted in the ip header being written in a wrong location. Use return values of skb_put() directly and rely on skb->len to set mac, network, and transport header. Cc: Simon Horman <horms@verge.net.au> Cc: Daniel Borkmann <dborkmann@redhat.com> Assisted-by: Daniel Borkmann <dborkmann@redhat.com> Signed-off-by: Thomas Graf <tgraf@suug.ch> Acked-by: Daniel Borkmann <dborkmann@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-04 17:35:45 -07:00
Pablo Neira	5e71d9d77c	net: fix sk_buff head without data area Eric Dumazet spotted that we have to check skb->head instead of skb->data as skb->head points to the beginning of the data area of the skbuff. Similarly, we have to initialize the skb->head pointer, not skb->data in __alloc_skb_head. After this fix, netlink crashes in the release path of the sk_buff, so let's fix that as well. This bug was introduced in (`0ebd0ac` net: add function to allocate sk_buff head without data area). Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-04 17:26:49 -07:00
Yan Burman	600fed5e97	net/ethtool: Fix comment regarding location of dev_ethtool() call Signed-off-by: Yan Burman <yanb@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-04 17:10:03 -07:00
Cong Wang	c26d6b46da	ping: always initialize ->sin6_scope_id and ->sin6_flowinfo If we don't need scope id, we should initialize it to zero. Same for ->sin6_flowinfo. Cc: Lorenzo Colitti <lorenzo@google.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Acked-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-04 16:58:42 -07:00
Gao feng	534c877928	ipv6: assign rt6_info to inet6_ifaddr in init_loopback Commit `25fb6ca4ed` "net IPv6 : Fix broken IPv6 routing table after loopback down-up" forgot to assign rt6_info to the inet6_ifaddr. When disable the net device, the rt6_info which allocated in init_loopback will not be destroied in __ipv6_ifa_notify. This will trigger the waring message below [23527.916091] unregister_netdevice: waiting for tap0 to become free. Usage count = 1 Reported-by: Arkadiusz Miskiewicz <a.miskiewicz@gmail.com> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-04 16:57:41 -07:00
Baruch Siach	430f03cde2	net: mark netdev_create_hash __net_init netdev_create_hash() is only called from netdev_init() which is marked __net_init. Signed-off-by: Baruch Siach <baruch@tkos.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-04 16:47:27 -07:00
Jean Sacren	4960c2c6fa	Kconfig: remove dangling references to the deleted file Commit `202dc3fc59` (Documentation: remove obsolete networking/multicast.txt file) deleted the obsolete file. After the file has been removed, clean up a couple of places where references to the deleted file were made so that users wouldn't be confused when they consult the Help menu. Signed-off-by: Jean Sacren <sakiwit@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-04 15:17:39 -07:00
Jean Sacren	ebd4687af7	xfrm: simplify the exit path of xfrm_output_one() Clean up unnecessary assignment and jump. While there, fix up the label name. Signed-off-by: Jean Sacren <sakiwit@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-04 15:17:38 -07:00
Johannes Berg	9b881963c1	cfg80211: make wiphy index start at 0 again The change to use atomic_inc_return() for assigning the wiphy index made the first wiphy index 1 instead of 0. This is fine, but we all habitually type "phy0" when we're testing, so make it go back to 0 instead of 1 by subtracting 1 from the index. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-06-04 22:28:23 +02:00
Johannes Berg	256c90dedf	cfg80211: fix potential deadlock regression My big locking cleanups caused a problem by registering the rfkill instance with the RTNL held, while the callback also acquires the RTNL. This potentially causes a deadlock since the two locks used (rfkill mutex and RTNL) can be acquired in two different orders. Fix this by (un)registering rfkill without holding the RTNL. This needs to be done after the device struct is registered, but that can also be done w/o holding the RTNL. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-06-04 22:22:41 +02:00
Lorenzo Colitti	d862e54614	net: ipv6: Implement /proc/net/icmp6. The format is based on /proc/net/icmp and /proc/net/{udp,raw}6. Compiles and displays reasonable results with CONFIG_IPV6={n,m,y} Couldn't figure out how to test without CONFIG_PROC_FS enabled. Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-04 12:56:14 -07:00
Lorenzo Colitti	8cc785f6f4	net: ipv4: make the ping /proc code AF-independent Introduce a ping_seq_afinfo structure (similar to its UDP equivalent) and use it to make some of the ping /proc functions address-family independent. Rename the remaining ping /proc functions from ping_* to ping_v4_*. Compiles and displays reasonable results with CONFIG_IPV6={n,m,y} Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-04 12:56:14 -07:00
Lorenzo Colitti	17ef66afc0	net: ipv6: Unify {raw,udp}6_sock_seq_show. udp6_sock_seq_show and raw6_sock_seq_show are identical, except the UDP version displays ports and the raw version displays the protocol. Refactor most of the code in these two functions into a new common ip6_dgram_sock_seq_show function, in preparation for using it to display ICMPv6 sockets as well. Also reduce the indentation in parts of include/net/transp_v6.h to improve readability. Compiles and displays reasonable results with CONFIG_IPV6={n,m,y} Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-04 12:56:14 -07:00
Johannes Berg	3430140ad9	regulatory: use proper enum return value get_reg_request_treatment() returns 0 in one case but is defined to return an enum, use the proper value REG_REQ_OK. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-06-04 14:35:07 +02:00
Johannes Berg	ceca7b7121	cfg80211: separate internal SME implementation The current internal SME implementation in cfg80211 is very mixed up with the MLME handling, which has been causing issues for a long time. There are three things that the implementation has to provide: * a basic SME implementation for nl80211's connect() call (for drivers implementing auth/assoc, which is really just mac80211) and wireless extensions * MLME events for the userspace SME * SME events (connected, disconnected etc.) for all different SME implementation possibilities (driver, cfg80211 and userspace) To achieve these goals it isn't necessary to track the software SME's connection status outside of it's state (which is the part that caused many issues.) Instead, track it only in the SME data (wdev->conn) and in the general case only track whether the wdev is connected or not (via wdev->current_bss.) Also separate the internal implementation to not have callbacks from the SME events, but rather call it from the API functions that the driver (or rather mac80211) calls. This separates the code better. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-06-04 13:03:11 +02:00
Johannes Berg	6ff57cf888	cfg80211/mac80211: clean up cfg80211 SME APIs Do some cleanups in the cfg80211 SME APIs, which are only used by mac80211. Most of these functions get a frame passed, and there isn't really any reason to export multiple functions as cfg80211 can check the frame type instead, do that. Additionally, the API functions have confusing names like cfg80211_send_...() which was meant to indicate that it sends an event to userspace, but gets a bit confusing when there's both TX and RX and they're not all clearly labeled. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-06-04 13:03:10 +02:00
Pontus Fuchs	ff40b425f0	mac80211: set IEEE80211_TX_CTL_REQ_TX_STATUS on nullframes The connection monitor needs to know the tx status of nullframes to work properly. Signed-off-by: Pontus Fuchs <pontus.fuchs@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-06-04 12:52:39 +02:00
Johannes Berg	9c90a9f64c	nl80211: remove bogus genlmsg_end() error checking genlmsg_end() can't return an error since it returns the skb length so remove checks treating the return value as an error code. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-06-04 12:47:08 +02:00
Felix Fietkau	d6d23de278	mac80211: add a tx control flag to indicate PS-Poll/uAPSD response Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-06-04 12:41:36 +02:00
Johannes Berg	964dc9e2c3	cfg80211: take WoWLAN support information out of wiphy struct There's no need to take up the space for devices that don't support WoWLAN, and most drivers can even make the support data static const (except where it's modified at runtime.) Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-06-03 18:43:34 +02:00
Jacob Minshall	e05ecccdf7	mac80211: set mesh formation field properly Cap max peerings at 63 in accordance with IEEE-2012 8.4.2.100.7. Triggers a beacon regeneration every time the number of peerings changes. Previously this would only happen if the "accepting peerings" bit changed. Signed-off-by: Jacob Minshall <jacob@cozybit.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-06-03 17:03:18 +02:00
Thomas Pedersen	866403a7bd	mac80211: don't check local mesh TTL on TX nl80211 has already verified the mesh TTL on setting the mesh config, so no need to check it again in mac80211. Signed-off-by: Thomas Pedersen <thomas@cozybit.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-06-03 16:53:51 +02:00
Johannes Berg	ed405be5cb	mac80211: fix sdata locking around __ieee80211_request_smps My cfg80211/mac80211 locking unification broke the sdata locking in ieee80211_set_power_mgmt, it needs to acquire the lock for __ieee80211_request_smps(). Add the locking. Reported-by: Jakub Kicinski <kubakici@wp.pl> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-06-03 13:51:59 +02:00
Cong Wang	9a99d4a50c	icmp: avoid allocating large struct on stack struct icmp_bxm is a large struct, reduce stack usage by allocating it on heap. Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Joe Perches <joe@perches.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-03 00:28:44 -07:00
Rami Rosen	08578d8d4e	] icmp: fix icmp_unreach() comment. ICMP_PARAMETERPROB is handled by icmp_unreach(); This patch adds ICMP_PARAMETERPROB to the list of ICMP message types handled by icmp_unreach(). Signed-off-by: Rami Rosen <ramirose@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-03 00:27:15 -07:00
Timo Teräs	5aad1de5ea	ipv4: use separate genid for next hop exceptions commit `13d82bf5` (ipv4: Fix flushing of cached routing informations) added the support to flush learned pmtu information. However, using rt_genid is quite heavy as it is bumped on route add/change and multicast events amongst other places. These can happen quite often, especially if using dynamic routing protocols. While this is ok with routes (as they are just recreated locally), the pmtu information is learned from remote systems and the icmp notification can come with long delays. It is worthy to have separate genid to avoid excessive pmtu resets. Cc: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Timo Teräs <timo.teras@iki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-03 00:07:43 -07:00
Timo Teräs	f016229e30	ipv4: rate limit updating of next hop exceptions with same pmtu The tunnel devices call update_pmtu for each packet sent, this causes contention on the fnhe_lock. Ignore the pmtu update if pmtu is not actually changed, and there is still plenty of time before the entry expires. Signed-off-by: Timo Teräs <timo.teras@iki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-03 00:07:43 -07:00
Timo Teräs	387aa65a89	ipv4: properly refresh rtable entries on pmtu/redirect events This reverts commit `05ab86c5` (xfrm4: Invalidate all ipv4 routes on IPsec pmtu events). Flushing all cached entries is not needed. Instead, invalidate only the related next hop dsts to recheck for the added next hop exception where needed. This also fixes a subtle race due to bumping generation id's before updating the pmtu. Cc: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Timo Teräs <timo.teras@iki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-03 00:07:42 -07:00
Eric Dumazet	01cb71d2d4	net_sched: restore "overhead xxx" handling commit `56b765b79` ("htb: improved accuracy at high rates") broke the "overhead xxx" handling, as well as the "linklayer atm" attribute. tc class add ... htb rate X ceil Y linklayer atm overhead 10 This patch restores the "overhead xxx" handling, for htb, tbf and act_police The "linklayer atm" thing needs a separate fix. Reported-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Vimalkumar <j.vimal@gmail.com> Cc: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-02 22:22:35 -07:00
Paul Moore	e4c1721642	xfrm: force a garbage collection after deleting a policy In some cases after deleting a policy from the SPD the policy would remain in the dst/flow/route cache for an extended period of time which caused problems for SELinux as its dynamic network access controls key off of the number of XFRM policy and state entries. This patch corrects this problem by forcing a XFRM garbage collection whenever a policy is sucessfully removed. Reported-by: Ondrej Moris <omoris@redhat.com> Signed-off-by: Paul Moore <pmoore@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-31 17:30:07 -07:00
Nicolas Dichtel	32b8a8e59c	sit: add IPv4 over IPv4 support This patch adds the support of IPv4 over Ipv4 for the module sit. The gain of this feature is to be able to have 4in4 and 6in4 over the same interface instead of having one interface for 6in4 and another for 4in4 even if encapsulation addresses are the same. To avoid conflicting with ipip module, sit IPv4 over IPv4 protocol is registered with a smaller priority. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-31 17:19:05 -07:00
Nicolas Dichtel	bf3d6a8f79	iptunnel: specify protocol outside IP header Before this patch, ip_tunnel_xmit() was using the field protocol from the IP header passed into argument. There is no functional change, this patch prepares the support of IPv4 over IPv4 for module sit. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-31 17:19:05 -07:00
Pravin B Shelar	1e2bd517c1	udp6: Fix udp fragmentation for tunnel traffic. udp6 over GRE tunnel does not work after to GRE tso changes. GRE tso handler passes inner packet but keeps track of outer header start in SKB_GSO_CB(skb)->mac_offset. udp6 fragment need to take care of outer header, which start at the mac_offset, while adding fragment header. This bug is introduced by commit `68c3316311` (GRE: Add TCP segmentation offload for GRE). Reported-by: Dmitry Kravkov <dkravkov@gmail.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Tested-by: Dmitry Kravkov <dmitry@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-31 17:06:07 -07:00
Cong Wang	35d0461061	net: clean up skb headers code commit `1a37e412a0` (net: Use 16bits for _headers fields of struct skbuff) converts skb->_header to u16, some #if NET_SKBUFF_DATA_USES_OFFSET are now useless, and to be safe, we could just use "X = (typeof(X)) ~0U;" as suggested by David. Cc: David S. Miller <davem@davemloft.net> Cc: Simon Horman <horms@verge.net.au> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-31 17:02:47 -07:00
Jay Vosburgh	b190a50875	net/core: dev_mc_sync_multiple calls wrong helper The dev_mc_sync_multiple function is currently calling __hw_addr_sync, and not __hw_addr_sync_multiple. This will result in addresses only being synced to the first device from the set. Corrected by calling the _multiple variant. Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Reviewed-by: Vlad Yasevich <vyasevic@redhat.com> Tested-by: Shawn Bohrer <sbohrer@rgmadvisors.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-31 16:56:56 -07:00
Jay Vosburgh	29ca2f8fcc	net/core: __hw_addr_sync_one / _multiple broken Currently, __hw_addr_sync_one is called in a loop by __hw_addr_sync_multiple to sync each of a "from" device's hw addresses to a "to" device. __hw_addr_sync_one calls __hw_addr_add_ex to attempt to add each address. __hw_addr_add_ex is called with global=false, and sync=true. __hw_addr_add_ex checks to see if the new address matches an address already on the list. If so, it tests global and sync. In this case, sync=true, and it then checks if the address is already synced, and if so, returns 0. This 0 return causes __hw_addr_sync_one to increment the sync_cnt and refcount for the "from" list's address entry, even though the address is already synced and has a reference and sync_cnt. This will cause the sync_cnt and refcount to increment without bound every time an addresses is added to the "from" device and synced to the "to" device. The fix here has two parts: First, when __hw_addr_add_ex finds the address already exists and is synced, return -EEXIST instead of 0. Second, __hw_addr_sync_one checks the error return for -EEXIST, and if so, it (a) does not add a refcount/sync_cnt, and (b) returns 0 itself so that __hw_addr_sync_multiple will not return an error. Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Reviewed-by: Vlad Yasevich <vyasevic@redhat.com> Tested-by: Shawn Bohrer <sbohrer@rgmadvisors.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-31 16:56:56 -07:00
Jay Vosburgh	60ba834c2f	net/core: __hw_addr_unsync_one "from" address not marked synced When an address is added to a subordinate interface (the "to" list), the address entry in the "from" list is not marked "synced" as the entry added to the "to" list is. When performing the unsync operation (e.g., dev_mc_unsync), __hw_addr_unsync_one calls __hw_addr_del_entry with the "synced" parameter set to true for the case when the address reference is being released from the "from" list. This causes a test inside to fail, with the result being that the reference count on the "from" address is not properly decremeted and the address on the "from" list will never be freed. Correct this by having __hw_addr_unsync_one call the __hw_addr_del_entry function with the "sync" flag set to false for the "remove from the from list" case. Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Reviewed-by: Vlad Yasevich <vyasevic@redhat.com> Tested-by: Shawn Bohrer <sbohrer@rgmadvisors.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-31 16:56:56 -07:00
Jay Vosburgh	9747ba6636	net/core: __hw_addr_create_ex does not initialize sync_cnt The sync_cnt field is not being initialized, which can result in arbitrary values in the field. Fixed by initializing it to zero. Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Reviewed-by: Vlad Yasevich <vyasevic@redhat.com> Tested-by: Shawn Bohrer <sbohrer@rgmadvisors.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-31 16:56:56 -07:00
Simon Horman	938177e9f3	netfilter: Correct calculation using skb->tail and skb-network_header This corrects an regression introduced by "net: Use 16bits for *_headers fields of struct skbuff" when NET_SKBUFF_DATA_USES_OFFSET is not set. In that case skb->tail will be a pointer whereas skb->network_header will be an offset from head. This is corrected by using wrappers that ensure that calculations are always made using pointers. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Reported-by: Chen Gang <gang.chen@asianux.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-31 16:38:25 -07:00
Nicolas Dichtel	fda3f402f4	snmp6: remove IPSTATS_MIB_CSUMERRORS This stat is not relevant in IPv6, there is no checksum in IPv6 header. Just leave a comment to explain the hole. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-31 16:26:49 -07:00
Eric Dumazet	db8caf3dbc	gro: should aggregate frames without DF GRO on IPv4 doesn't aggregate frames if they don't have DF bit set. Some servers use IP_MTU_DISCOVER/IP_PMTUDISC_PROBE, so linux receivers are unable to aggregate this kind of traffic. The right thing to do is to allow aggregation as long as the DF bit has same value on all segments. bnx2x LRO does this correctly. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jerry Chu <hkchu@google.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Ben Hutchings <bhutchings@solarflare.com> Reviewed-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-31 16:25:56 -07:00
David Majnemer	c3f1dbaf6e	net: Update RFS target at poll for tcp/udp The current state of affairs is that read()/write() will setup RFS (Receive Flow Steering) for internet protocol sockets while poll()/epoll() does not. When poll() gets called with a TCP or UDP socket, we should update the flow target. This permits to RFS (if enabled) to select the appropriate CPU for following incoming packets. Note: Only connected UDP sockets can benefit from RFS. Signed-off-by: David Majnemer <majnemer@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Paul Turner <pjt@google.com> Cc: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-31 16:24:43 -07:00
David S. Miller	fb68e2f438	Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless John W. Linville says: ==================== Please pull this batch of fixes intended for the 3.10 stream... Regarding the NFC bits, Samuel says: "This is the first batch of NFC fixes for 3.10, and it contains: - 3 fixes for the NFC MEI support: * We now depend on the correct Kconfig symbol. * We register an MEI event callback whenever we enable an NFC device, otherwise we fail to read anything after an enable/disable cycle. * We only disable an MEI device from its disable mey_phy_ops, preventing useless consecutive disable calls. - An NFC Makefile cleanup, as I forgot to remove a commented out line when moving the LLCP code to the NFC top level directory." As for the mac80211 bits, Johannes says: "This time I have a fix from Stanislaw for a stupid mistake I made in the auth/assoc timeout changes, a fix from Felix for 64-bit traffic counters and one from Helmut for address mask handling in mac80211. I also have a few fixes myself for four different crashes reported by a few people." And Johannes says this about the iwlwifi bit: "This fixes a brown paper-bag bug that we really should've caught in review. More details in the changelog for the fix." On top of that... Arend van Spriel and Hante Meuleman cooperate to send a series of AP and P2P mode fixes for brcmfmac. Gabor Juhos corrects a register offset for AR9550, avoiding a bus error. Dan Carpenter provides a fixup to some dmesg output in the atmel driver. And, finally... Felix Fietkau not only gives us a trio of small AR934x fixes, but also refactors the ath9k aggregation session start/stop handling (using the generic mac80211 support) in order to avoid a deadlock. Please let me know if there are problems! ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-31 02:38:19 -07:00
Simon Horman	aef6de511a	sctp: Correct byte order of access to skb->{network, transport}_header Corrects an byte order conflict introduced by "sctp: Correct access to skb->{network, transport}_header". All the values in question are host byte order. Reported-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-31 01:41:36 -07:00
Yuchung Cheng	c7d9d6a185	tcp: undo on DSACK during recovery If the receiver supports DSACK, sender can detect false recoveries and revert cwnd reductions triggered by either severe network reordering or concurrent reordering and loss event. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-30 18:06:11 -07:00
Yuchung Cheng	7026b912f9	tcp: fix undo on partial ack in recovery Upon detecting spurious fast retransmit via timestamps during recovery, use PRR to clock out new data packet instead of retransmission. Once all retransmission are proven spurious, the sender then reverts the cwnd reduction and congestion state to open or disorder. The current code does the opposite: it undoes cwnd as soon as any retransmission is spurious and continues to retransmit until all data are acked. This nullifies the point to undo the cwnd because the sender is still retransmistting spuriously. This patch fixes it. The undo_ssthresh argument of tcp_undo_cwnd_reductiuon() is no longer needed and is removed. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-30 18:06:11 -07:00
Yuchung Cheng	6a63df46a7	tcp: refactor undo functions Refactor and relocate various functions or variables to prepare the undo fix. Remove some unused function arguments. Rename tcp_undo_cwr to tcp_undo_cwnd_reduction to be consistent with the rest of CWR related function names. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-30 18:06:11 -07:00
Yuchung Cheng	6804973ffb	tcp: consolidate PRR packet accounting This patch series fixes an undo bug in fast recovery: the sender mistakenly undos the cwnd too early but continues fast retransmits until all pending data are acked. This also multiplies the SNMP stat PARTIALUNDO events by the degree of the network reordering. The first patch prepares the fix by consolidating the accounting of newly_acked_sacked in tcp_cwnd_reduction(), instead of updating newly_acked_sacked everytime sacked_out is adjusted. Also pass acked and prior_unsacked as const type because they are readonly in the rest of recovery processing. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-30 18:06:11 -07:00
Linus Torvalds	4203afc3fb	Merge branch 'for-3.10' of git://linux-nfs.org/~bfields/linux Pull nfsd fixes from Bruce Fields: "A couple minor fixes for the (new to 3.10) gss-proxy code. And one regression from user-namespace changes. (XBMC clients were doing something admittedly weird--sending -1 gid's--but something that we used to allow.)" * 'for-3.10' of git://linux-nfs.org/~bfields/linux: svcrpc: fix failures to handle -1 uid's and gid's svcrpc: implement O_NONBLOCK behavior for use-gss-proxy svcauth_gss: fix error code in use_gss_proxy()	2013-05-31 09:48:56 +09:00
David S. Miller	73ce00d4d6	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf Pablo Neira Ayuso says: ==================== The following patchset contains Netfilter/IPVS fixes for 3.10-rc3, they are: * fix xt_addrtype with IPv6, from Florian Westphal. This required a new hook for IPv6 functions in the netfilter core to avoid hard dependencies with the ipv6 subsystem when this match is only used for IPv4. * fix connection reuse case in IPVS. Currently, if an reused connection are directed to the same server. If that server is down, those connection would fail. Therefore, clear the connection and choose a new server among the available ones. * fix possible non-nul terminated string sent to user-space if ipt_ULOG is used as the default netfilter logging stub, from Chen Gang. * fix mark logging of IPv6 packets in xt_LOG, from Michal Kubecek. This bug has been there since 2.6.26. * Fix breakage ip_vs_sh due to incorrect structure layout for RCU, from Jan Beulich. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-30 16:38:38 -07:00
David S. Miller	a230e03384	Included changes: - reduce broadcast overhead on non-lossy wired links - fix typos in kernel doc - use eth_hdr() when possible - use netdev_allock_skb_ip_align() and don't deal with NET_IP_ALIGN - change VID semantic in the BLA component - other minor cleanups and code refactoring -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (GNU/Linux) iQIcBAABCAAGBQJRpVCoAAoJEADl0hg6qKeOC0MP/ROgdprMTQNSFnvJLo9YZ4qP EZpEwPqU8gCmnOKdQJ/KYSiyAX/iFjwTBgfelq1ekAbUmzV1y/cLOjHgZuOhM7Bk 5xEiC5Yunj+5Q14f1rW6uDs8+B84gnl4Iow9j6aAlOW6g0BAf3Pb9caN7sSheh/w dKUHQKaDHaNMrHVerEHRDkzlWSJes6YIYMKgbX1mwMGddMtFJbMKIvylzilLXUl6 3jLlwC8XDvgH+0ygLvuFWEhOwkSdWTo32J92VE/WSTpDPyZYvXj/r+/e/U2Pi3q2 2owariF4PCr+4cc2tcDOPIh/zRdt0OTCZ2lW/z2CZbCCGZ3AYoQ8ue4hulf9AyZS 9Ff/wdgMmPN/aM+t7+QtKrKAcCBb9JqY3eVRL6jeBUF+SP0O/dfVDaQreQExy+iF H0LhsE6eqMFaucjZit/yZ2SpR6kMo42BmCZjLJVujm0REUsqaq3GKolzrRGNDX91 Wzc/eZkqyES3R6rMrbvW0iqRAAPiUi6YOv9f8MoYbqXNOcmJJdyoMVWthAEFOnFB 4qJw2O5HlROL2TmOKw7cR4TPDX9reWN0OdZrSLabEXYkqVhoqB08ojSLAZbHVTe4 h3EEjGc8eaIpAYb5B68JE41ZNEZFIUlwOCMxeZsk2LV05LTOHCmrUPbF+n/psag9 9zt30TKyPu2b/+XFyOfd =22js -----END PGP SIGNATURE----- Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge Included changes: - reduce broadcast overhead on non-lossy wired links - fix typos in kernel doc - use eth_hdr() when possible - use netdev_allock_skb_ip_align() and don't deal with NET_IP_ALIGN - change VID semantic in the BLA component - other minor cleanups and code refactoring Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-30 15:53:18 -07:00
Jan Beulich	a70b9641e6	ipvs: ip_vs_sh: fix build kfree_rcu() requires offsetof(..., rcu_head) < 4096, which can get violated with a sufficiently high CONFIG_IP_VS_SH_TAB_BITS. Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-05-29 17:50:39 +02:00
John W. Linville	50cc1cab4c	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem	2013-05-29 10:59:35 -04:00
J. Bruce Fields	afe3c3fd53	svcrpc: fix failures to handle -1 uid's and gid's As of `f025adf191` "sunrpc: Properly decode kuids and kgids in RPC_AUTH_UNIX credentials" any rpc containing a -1 (0xffff) uid or gid would fail with a badcred error. Reported symptoms were xmbc clients failing on upgrade of the NFS server; examination of the network trace showed them sending -1 as the gid. Reported-by: Julian Sikorski <belegdol@gmail.com> Tested-by: Julian Sikorski <belegdol@gmail.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-05-29 10:37:47 -04:00
David Herrmann	4e713cdffb	HID: Bluetooth: hidp: register HID devices async While l2cap_user callbacks are running, the whole hci_dev is locked. Even if we would add more fine-grained locking to HCI core, it would still be called from the non-reentrant rx work-queue and thus block the event processing. However, if we want to perform synchronous I/O during HID device registration (eg., to perform device-detection), we need the HCI core to be able to dispatch incoming data. Therefore, we now move device-registration to a separate worker. The HCI core can continue running and we add devices asynchronously in another kernel thread. Device removal is synchronized and waits for the worker to exit before calling the usual device removal functions. If l2cap_user->remove is called before the thread registered the devices, we set "terminate" to true and the thread will skip it. If l2cap_user->remove is called after it, we notice this as the device is no longer in HIDP_SESSION_PREPARING state and simply unregister the device as we did before. There is no new deadlock as we now call hidp_session_add_dev() with one lock less held (the HCI lock) and it cannot itself call back into HCI as it was called with the HCI-lock held before. One might wonder whether this can block during device unregistration. But we set "terminate" to true and wake the HIDP thread up _before_ unregistering the HID/input devices. Therefore, all pending HID I/O operations are canceled. All further I/O attempts will fail with ENODEV or EIO. So all latency we can get are few context-switches, but no timeouts or blocking I/O waits! This change also prepares for a long standing HID bug. All HID devices that register power_supply devices need to be able to handle callbacks during registration (a power_supply oddity that cannot easily be fixed). So with this patch available, we can allow HID I/O during registration by calling the recently introduced hid_device_io_start/stop helpers, which currently are a no-op for bluetooth due to this locking. Note that we cannot do the same for input devices. input-core doesn't allow us to call input_event() asynchronously to input_register_device(), which HID-core kindly allows (for good reasons). Fixing input-core to allow this isn't as easy as it sounds and is, beside simplifying HIDP, not really an improvement. Hence, we still register input devices synchronously as we did before. Only HID devices are registered asynchronously. Signed-off-by: David Herrmann <dh.herrmann@gmail.com> Acked-by: Jiri Kosina <jkosina@suse.cz> Acked-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk> Tested-by: Daniel Nicoletti <dantti12@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2013-05-29 15:19:11 +02:00
Michal Kubeček	d660164d79	netfilter: xt_LOG: fix mark logging for IPv6 packets In dump_ipv6_packet(), the "recurse" parameter is zero only if dumping contents of a packet embedded into an ICMPv6 error message. Therefore we want to log packet mark if recurse is non-zero, not when it is zero. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-05-29 12:29:18 +02:00
Johannes Berg	f4d57941bf	mac80211: always send multicast on CAB queue If the driver advertised support for a CAB queue, then we should put all multicast frames there, otherwise sending them can be racy with clients going to sleep while we TX a frame. To avoid this, always TX multicast frames on the multicast queue. It seems like even drivers not using the queue framework might want to do this which would mean also moving the IEEE80211_TX_CTL_SEND_AFTER_DTIM flag assignment, but it also seems that drivers behave differently here so that just moving it wouldn't be a good idea. It'd be better to modify those drivers to use the queue framework. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-29 11:04:48 +02:00
Felix Fietkau	31eba5bc56	mac80211: support active monitor interfaces Support them only if the driver advertises support for them via IEEE80211_HW_SUPPORTS_ACTIVE_MONITOR. Unlike normal monitor interfaces, they are added to the driver, along with their MAC address. Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-29 09:11:56 +02:00
Felix Fietkau	e057d3c31b	cfg80211: support an active monitor interface flag An active monitor interface is one that is used for communication (via injection). It is expected to ACK incoming unicast packets. This is useful for running various 802.11 testing utilities that associate to an AP via injection and manage the state in user space. Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-29 09:11:44 +02:00
Federico Vaga	456db6a4d4	net/core/sock.c: add missing VSOCK string in af_family_*_key_strings The three arrays of strings: af_family_key_strings, af_family_slock_key_strings and af_family_clock_key_strings have not VSOCK's string Signed-off-by: Federico Vaga <federico.vaga@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-28 23:58:49 -07:00
Andy Lutomirski	1be374a051	net: Block MSG_CMSG_COMPAT in send(m)msg and recv(m)msg To: linux-kernel@vger.kernel.org Cc: x86@kernel.org, trinity@vger.kernel.org, Andy Lutomirski <luto@amacapital.net>, netdev@vger.kernel.org, "David S. Miller" <davem@davemloft.net> Subject: [PATCH 5/5] net: Block MSG_CMSG_COMPAT in send(m)msg and recv(m)msg MSG_CMSG_COMPAT is (AFAIK) not intended to be part of the API -- it's a hack that steals a bit to indicate to other networking code that a compat entry was used. So don't allow it from a non-compat syscall. This prevents an oops when running this code: int main() { int s; struct sockaddr_in addr; struct msghdr hdr; char highpage = mmap((void)(TASK_SIZE_MAX - 4096), 4096, PROT_READ \| PROT_WRITE, MAP_PRIVATE \| MAP_ANONYMOUS \| MAP_FIXED, -1, 0); if (highpage == MAP_FAILED) err(1, "mmap"); s = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP); if (s == -1) err(1, "socket"); addr.sin_family = AF_INET; addr.sin_port = htons(1); addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK); if (connect(s, (struct sockaddr)&addr, sizeof(addr)) != 0) err(1, "connect"); void *evil = highpage + 4096 - COMPAT_MSGHDR_SIZE; printf("Evil address is %p\n", evil); if (syscall(__NR_sendmmsg, s, evil, 1, MSG_CMSG_COMPAT) < 0) err(1, "sendmmsg"); return 0; } Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-28 23:55:41 -07:00
Simon Horman	7cc4619005	net, ipv4, ipv6: Correct assignment of skb->network_header to skb->tail This corrects an regression introduced by "net: Use 16bits for *_headers fields of struct skbuff" when NET_SKBUFF_DATA_USES_OFFSET is not set. In that case skb->tail will be a pointer however skb->network_header is now an offset. This patch corrects the problem by adding a wrapper to return skb tail as an offset regardless of the value of NET_SKBUFF_DATA_USES_OFFSET. It seems that skb->tail that this offset may be more than 64k and some care has been taken to treat such cases as an error. Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-28 23:49:07 -07:00
Simon Horman	158874cac6	sctp: Correct access to skb->{network, transport}_header This corrects an regression introduced by "net: Use 16bits for *_headers fields of struct skbuff" when NET_SKBUFF_DATA_USES_OFFSET is not set. In that case sk_buff_data_t will be a pointer, however, skb->{network,transport}_header is now __u16. Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-28 23:49:07 -07:00
Simon Horman	f7c0c2ae84	ipv4: Correct comparisons and calculations using skb->tail and skb-transport_header This corrects an regression introduced by "net: Use 16bits for *_headers fields of struct skbuff" when NET_SKBUFF_DATA_USES_OFFSET is not set. In that case skb->tail will be a pointer whereas skb->transport_header will be an offset from head. This is corrected by using wrappers that ensure that comparisons and calculations are always made using pointers. Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-28 23:49:07 -07:00
Simon Horman	29a3cad5c6	ipv6: Correct comparisons and calculations using skb->tail and skb-transport_header This corrects an regression introduced by "net: Use 16bits for *_headers fields of struct skbuff" when NET_SKBUFF_DATA_USES_OFFSET is not set. In that case skb->tail will be a pointer whereas skb->transport_header will be an offset from head. This is corrected by using wrappers that ensure that comparisons and calculations are always made using pointers. Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-28 23:49:07 -07:00
Simon Horman	ced14f6804	net: Correct comparisons and calculations using skb->tail and skb-transport_header This corrects an regression introduced by "net: Use 16bits for *_headers fields of struct skbuff" when NET_SKBUFF_DATA_USES_OFFSET is not set. In that case skb->tail will be a pointer whereas skb->transport_header will be an offset from head. This is corrected by using wrappers that ensure that comparisons and calculations are always made using pointers. Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-28 23:49:07 -07:00
Cong Wang	75538c2b85	net: always pass struct netdev_notifier_info to netdevice notifiers commit `351638e7de` (net: pass info struct via netdevice notifier) breaks booting of my KVM guest, this is due to we still forget to pass struct netdev_notifier_info in several places. This patch completes it. Cc: Jiri Pirko <jiri@resnulli.us> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-28 21:58:54 -07:00
Simon Wunderlich	6715fd3f05	batman-adv: Start new development cycle Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2013-05-29 02:44:56 +02:00
Martin Hundebøll	e91ecfc64a	batman-adv: Move call to batadv_nc_skb_forward() from routing.c to send.c The call to batadv_nc_skb_forward() fits better in batadv_send_skb_to_orig(), as this is where the actual next hop is looked up. To let the caller of batadv_send_skb_to_orig() know wether the skb is transmitted, buffered or failed, the return value is changed from boolean to int. Signed-off-by: Martin Hundebøll <martin@hundeboll.net> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2013-05-29 02:44:55 +02:00
Antonio Quartulli	5f80df6705	batman-adv: print the VID properly Since the MSB bits of any vid variable are now used for storing flags, print the vid properly by taking the flags away and printing -1 in case of VID representing no real VLAN. Signed-off-by: Antonio Quartulli <antonio@open-mesh.com> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>	2013-05-29 02:44:55 +02:00
Antonio Quartulli	eb2deb6b39	batman-adv: change VID semantic in the BLA code In order to make batman-adv fully vlan aware later, the semantic used for variables storing the VLAN ID values has to be changed in order to be adapted to the new one which will be used batman-adv wide. In particular, the VID has to be an "_unsigned_ short int" and its 4 MSB will be used as a flag bitfield, while the remaining 12 bits are used to store the real VID value Signed-off-by: Antonio Quartulli <antonio@open-mesh.com> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Acked-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>	2013-05-29 02:44:55 +02:00
Marek Lindner	aa27c31265	batman-adv: do not print orig nodes without nc neighbors on nc table print Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2013-05-29 02:44:55 +02:00
Linus Lüssing	e54c77f08e	batman-adv: Remove unnecessary INIT_HLIST_NODE() calls There's no need to for an explicit hlist_node initialization if it is added to a list right away, like it's the case with the hlist_add_head()s here. Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2013-05-29 02:44:55 +02:00
Antonio Quartulli	d4ff40f683	batman-adv: pass a 16bit long flag argument to tt_global_add() it may be the case that we want to store some local TT client flags in a global entry, therefore the tt_global_add needs to get a proper argument for this Signed-off-by: Antonio Quartulli <ordex@autistici.org> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>	2013-05-29 02:44:55 +02:00
Antonio Quartulli	41ab6c4891	batman-adv: don't deal with NET_IP_ALIGN manually Instead of dealing with NET_IP_ALIGN during allocation and headroom reservation, it is possible to use netdev_alloc_skb_ip_align() which transparently allocate and reserve the correct amount of data Signed-off-by: Antonio Quartulli <ordex@autistici.org> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>	2013-05-29 02:44:54 +02:00
Antonio Quartulli	3abe4adbfb	batman-adv: refactor batadv_tt_local_event() Instead of passing a generic combination of flags as argument, it is easier to pass the entire tt_common structure (containing the flags already set) plus a bitfield of event flags that will be unified with the already existing ones before inserting the client in the event queue. In this way invocations of the modified function can be simplified. Signed-off-by: Antonio Quartulli <antonio@open-mesh.com> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>	2013-05-29 02:44:54 +02:00
Antonio Quartulli	d989661732	batman-adv: move batadv_slide_own_bcast_window to bat_iv_ogm.c batadv_slide_own_bcast_window() is used only in bat_iv_ogm.c and it is currently touching only batman_iv specific attributes. Move it into bat_iv_ogm.c and make it static. Signed-off-by: Antonio Quartulli <antonio@open-mesh.com> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>	2013-05-29 02:44:54 +02:00
Antonio Quartulli	24a5deeb8a	batman-adv: move ring_buffer helper functions in bat_iv_ogm the two lonely ring_buffer helper functions are used by the bat_iv_ogm module only and therefore they can be moved inside it. Reported-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>	2013-05-29 02:44:54 +02:00
Antonio Quartulli	7ed4be9523	batman-adv: use eth_hdr() when it makes sense Instead of casting the result of skb_mac_header() to "struct ethhdr *" every time, the eth_hdr inline function can be use to beautify the code and improve its readability. Signed-off-by: Antonio Quartulli <ordex@autistici.org> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>	2013-05-29 02:44:54 +02:00
Antonio Quartulli	7db3fc291b	batman-adv: don't initialise batman_iv private members in hard-interface.c hard-interface.c has to do not contain any routing algorithm specific code. Allocate the hard-interface with kzalloc() and remove any useless and algorithm specific member initialisation Signed-off-by: Antonio Quartulli <antonio@open-mesh.com> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>	2013-05-29 02:44:54 +02:00
Antonio Quartulli	38dc40ef52	batman-adv: do not silently ignore wrong condition Only one neigh_node per orig_node should match a given neighbor address, therefore, if more than one matching neigh_node is found, a WARNING has to be triggered to let the user know that something is wrong in the originator state instead of silently skipping the error. Signed-off-by: Antonio Quartulli <ordex@autistici.org> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>	2013-05-29 02:44:54 +02:00
Antonio Quartulli	a3b81b67de	batman-adv: don't check compat version twice Compatibility version is checked upon packet reception before calling any handler. For this reason it does need to be checked once more in the handler itself. Signed-off-by: Antonio Quartulli <antonio@open-mesh.com> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>	2013-05-29 02:44:53 +02:00
Antonio Quartulli	281581d3e7	batman-adv: don't check the source address twice The source address has already been checked in batadv_check_management_packet() upon packet reception and therefore it does not need to be checked again in ogm_process() Signed-off-by: Antonio Quartulli <antonio@open-mesh.com> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>	2013-05-29 02:44:53 +02:00
Antonio Quartulli	d1dc30739c	batman-adv: slightly improve neighbor creation debug message print the interface along with the new neighbor mac address Signed-off-by: Antonio Quartulli <antonio@open-mesh.com> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>	2013-05-29 02:44:53 +02:00
Antonio Quartulli	863dd7a82a	batman-adv: drop useless argument seqno in neighbor creation the sequence number is not stored in struct neigh_node, therefore there is no need to pass such value to the neigh_node creation procedure. At the moment the value is only used by a debug message, but given the fact that the seqno is not related to the neighbor object, it is better to print it elsewhere. Signed-off-by: Antonio Quartulli <antonio@open-mesh.com> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>	2013-05-29 02:44:53 +02:00
Marek Lindner	93178018eb	batman-adv: fix typos in kernel doc & comments Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2013-05-29 02:44:53 +02:00
Matthias Schiffer	caf65bfcc5	batman-adv: send each broadcast only once on non-wireless interfaces While it makes sense to send each broadcast thrice on 802.11 (WLAN) interfaces as broadcasts are often unreliable on these, there is no reason to do so on other interface types. The increased the overhead can be harmful on low-bandwidth links like VPN connections over slow internet lines, therefore it is better to reduce the number of broadcast packets sent on non-wireless links to one. Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2013-05-29 02:44:53 +02:00
Matthias Schiffer	de68d1003d	batman-adv: split batadv_is_wifi_iface() into two functions Previously batadv_is_wifi_iface() did two things at once: looking up a net_device from an interface index, and determining if it is a wifi device. The second part is useful itself when the caller already has a net_device reference. Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2013-05-29 02:44:53 +02:00
J. Bruce Fields	b161c14440	svcrpc: implement O_NONBLOCK behavior for use-gss-proxy Somebody noticed LTP was complaining about O_NONBLOCK opens of /proc/net/rpc/use-gss-proxy succeeding and then a following read hanging. I'm not convinced LTP really has any business opening random proc files and expecting them to behave a certain way. Maybe this isn't really a bug. But in any case the O_NONBLOCK behavior could be useful for someone that wants to test whether gss-proxy is up without waiting. Reported-by: Jan Stancek <jstancek@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-05-28 16:46:51 -04:00
David S. Miller	06ecf24bdf	net: Fix build warnings after mac_header and transport_header became __u16. net/core/skbuff.c: In function ‘__alloc_skb_head’: net/core/skbuff.c:203:2: warning: large integer implicitly truncated to unsigned type [-Woverflow] net/core/skbuff.c: In function ‘__alloc_skb’: net/core/skbuff.c:279:2: warning: large integer implicitly truncated to unsigned type [-Woverflow] net/core/skbuff.c:280:2: warning: large integer implicitly truncated to unsigned type [-Woverflow] net/core/skbuff.c: In function ‘build_skb’: net/core/skbuff.c:348:2: warning: large integer implicitly truncated to unsigned type [-Woverflow] net/core/skbuff.c:349:2: warning: large integer implicitly truncated to unsigned type [-Woverflow] Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-28 13:15:50 -07:00
Timo Teräs	6c8b4e3ff8	arp: flush arp cache on IFF_NOARP change IFF_NOARP affects what kind of neighbor entries are created (nud NOARP or nud INCOMPLETE). If the flag changes, flush the arp cache to refresh all entries. Signed-off-by: Timo Teräs <timo.teras@iki.fi> Signed-off-by: Jiri Pirko <jiri@resnulli.us> v2->v3: shortened notifier_info struct name Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-28 13:11:02 -07:00
Jiri Pirko	be9efd3653	net: pass changed flags along with NETDEV_CHANGE event Use new netdevice notifier infrastructure to pass along changed flags. Signed-off-by: Timo Teräs <timo.teras@iki.fi> Signed-off-by: Jiri Pirko <jiri@resnulli.us> v2->v3: shortened notifier_info struct name Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-28 13:11:01 -07:00
Jiri Pirko	351638e7de	net: pass info struct via netdevice notifier So far, only net_device * could be passed along with netdevice notifier event. This patch provides a possibility to pass custom structure able to provide info that event listener needs to know. Signed-off-by: Jiri Pirko <jiri@resnulli.us> v2->v3: fix typo on simeth shortened dev_getter shortened notifier_info struct name v1->v2: fix notifier_call parameter in call_netdevice_notifier() Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-28 13:11:01 -07:00
John W. Linville	d61bdbf123	Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211	2013-05-28 13:38:53 -04:00
Aneesh Kumar K.V	6390460af8	net/9p: Handle error in zero copy request correctly for 9p2000.u For zero copy request, error will be encoded in the user space buffer. So copy the error code correctly using copy_from_user. Here we use the extra bytes we allocate for zero copy request. If total error details are more than P9_ZC_HDR_SZ - 7 bytes, we return -EFAULT. The patch also avoid a memory allocation in the error path. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2013-05-28 11:47:57 -05:00
Aneesh Kumar K.V	42fe6484c6	net/9p: Handle error in zero copy request correctly for 9p2000.u For zero copy request, error will be encoded in the user space buffer. So copy the error code correctly using copy_from_user. Here we use the extra bytes we allocate for zero copy request. If total error details are more than P9_ZC_HDR_SZ - 7 bytes, we return -EFAULT. The patch also avoid a memory allocation in the error path. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2013-05-28 09:28:42 -05:00
Aneesh Kumar K.V	535bcd3c4e	net/9p: Use virtio transpart as the default transport Make the default 9p experience better by defaulting to virtio transport if present. These days most of the users are using 9p in a virtualized setup Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2013-05-28 09:28:42 -05:00
Aneesh Kumar K.V	095e7999c0	net/9p: Make 9P2000.L the default protocol for 9p file system If we dont' specify a protocol version default to 9P2000.L. 9P2000.L have better support for posix semantic and is where all the recent development is happening. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2013-05-28 09:28:42 -05:00
Michal Kubecek	f96ef988cc	ipv4: fix redirect handling for TCP packets Unlike ipv4_redirect() and ipv4_sk_redirect(), ip_do_redirect() doesn't call __build_flow_key() directly but via ip_rt_build_flow_key() wrapper. This leads to __build_flow_key() getting pointer to IPv4 header of the ICMP redirect packet rather than pointer to the embedded IPv4 header of the packet initiating the redirect. As a result, handling of ICMP redirects initiated by TCP packets is broken. Issue was introduced by `4895c771c` ("ipv4: Add FIB nexthop exceptions.") Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-27 23:39:19 -07:00
dingtianhong	da6e378ba9	netpoll: remove return value from netpoll_rx_disable() The netpoll_rx_disable() will always return 0, it is no use and looks wordy, so remove the unnecessary code and get rid of it in _dev_open and _dev_close. Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-27 23:18:50 -07:00
Simon Horman	0d89d2035f	MPLS: Add limited GSO support In the case where a non-MPLS packet is received and an MPLS stack is added it may well be the case that the original skb is GSO but the NIC used for transmit does not support GSO of MPLS packets. The aim of this code is to provide GSO in software for MPLS packets whose skbs are GSO. SKB Usage: When an implementation adds an MPLS stack to a non-MPLS packet it should do the following to skb metadata: * Set skb->inner_protocol to the old non-MPLS ethertype of the packet. skb->inner_protocol is added by this patch. * Set skb->protocol to the new MPLS ethertype of the packet. * Set skb->network_header to correspond to the end of the L3 header, including the MPLS label stack. I have posted a patch, "[PATCH v3.29] datapath: Add basic MPLS support to kernel" which adds MPLS support to the kernel datapath of Open vSwtich. That patch sets the above requirements in datapath/actions.c:push_mpls() and was used to exercise this code. The datapath patch is against the Open vSwtich tree but it is intended that it be added to the Open vSwtich code present in the mainline Linux kernel at some point. Features: I believe that the approach that I have taken is at least partially consistent with the handling of other protocols. Jesse, I understand that you have some ideas here. I am more than happy to change my implementation. This patch adds dev->mpls_features which may be used by devices to advertise features supported for MPLS packets. A new NETIF_F_MPLS_GSO feature is added for devices which support hardware MPLS GSO offload. Currently no devices support this and MPLS GSO always falls back to software. Alternate Implementation: One possible alternate implementation is to teach netif_skb_features() and skb_network_protocol() about MPLS, in a similar way to their understanding of VLANs. I believe this would avoid the need for net/mpls/mpls_gso.c and in particular the calls to __skb_push() and __skb_push() in mpls_gso_segment(). I have decided on the implementation in this patch as it should not introduce any overhead in the case where mpls_gso is not compiled into the kernel or inserted as a module. MPLS GSO suggested by Jesse Gross. Based in part on "v4 GRE: Add TCP segmentation offload for GRE" by Pravin B Shelar. Cc: Jesse Gross <jesse@nicira.com> Cc: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-27 22:50:59 -07:00
Johannes Berg	6abb9cb99f	cfg80211: make WoWLAN configuration available to drivers Make the current WoWLAN configuration available to drivers at runtime. This isn't really useful for the normal WoWLAN behaviour and accessing it can also be racy, but drivers may use it for testing the WoWLAN device behaviour while the host stays up & running to observe the device. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-27 15:10:58 +02:00
Jeff Mahoney	4e7dba99c9	netfilter: Implement RFC 1123 for FTP conntrack The FTP conntrack code currently only accepts the following format for the 227 response for PASV: 227 Entering Passive Mode (148,100,81,40,31,161). It doesn't accept the following format from an obscure server: 227 Data transfer will passively listen to 67,218,99,134,50,144 From RFC 1123: The format of the 227 reply to a PASV command is not well standardized. In particular, an FTP client cannot assume that the parentheses shown on page 40 of RFC-959 will be present (and in fact, Figure 3 on page 43 omits them). Therefore, a User-FTP program that interprets the PASV reply must scan the reply for the first digit of the host and port numbers. This patch adds support for the RFC 1123 clarification by: - Allowing a search filter to specify NUL as the terminator so that try_number will return successfully if the array of numbers has been filled when an unexpected character is encountered. - Using space as the separator for the 227 reply and then scanning for the first digit of the number sequence. The number sequence is parsed out using the existing try_rfc959 but with a NUL terminator. References: https://bugzilla.novell.com/show_bug.cgi?id=466279 References: http://bugzilla.netfilter.org/show_bug.cgi?id=574 Reported-by: Mark Post <mpost@novell.com> Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Patrick McHardy <kaber@trash.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: netfilter-devel@vger.kernel.org Cc: netfilter@vger.kernel.org Cc: coreteam@netfilter.org Cc: netdev@vger.kernel.org Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-05-27 13:32:43 +02:00
Grzegorz Lyczba	dc7b3eb900	ipvs: Fix reuse connection if real server is dead Expire cached connection for new TCP/SCTP connection if real server is down. Otherwise, IPVS uses the dead server for the reused connection, instead of a new working one. Signed-off-by: Grzegorz Lyczba <grzegorz.lyczba@gmail.com> Acked-by: Hans Schillstrom <hans@schillstrom.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-05-27 13:00:45 +02:00
Helmut Schaa	ac20976dca	mac80211: Allow single vif mac address change with addr_mask When changing the MAC address of a single vif mac80211 will check if the new address fits into the address mask specified by the driver. This only needs to be done when using multiple BSSIDs. Hence, check the new address only against all other vifs. Also fix the MAC address assignment on new interfaces if the user changed the address of a vif such that perm_addr is not covered by addr_mask anymore. Resolves: https://bugzilla.kernel.org/show_bug.cgi?id=57371 Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com> Signed-off-by: Jakub Kicinski <kubakici@wp.pl> Reported-by: Alessandro Lannocca <alessandro.lannocca@gmail.com> Cc: Alessandro Lannocca <alessandro.lannocca@gmail.com> Cc: Bruno Randolf <br1@thinktube.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-27 11:26:48 +02:00
Johannes Berg	c8aa22db01	mac80211: close AP_VLAN interfaces before unregistering all Since Eric's commit `efe117ab8` ("Speedup ieee80211_remove_interfaces") there's a bug in mac80211 when it unregisters with AP_VLAN interfaces up. If the AP_VLAN interface was registered after the AP it belongs to (which is the typical case) and then we get into this code path, unregister_netdevice_many() will crash because it isn't prepared to deal with interfaces being closed in the middle of it. Exactly this happens though, because we iterate the list, find the AP master this AP_VLAN belongs to and dev_close() the dependent VLANs. After this, unregister_netdevice_many() won't pick up the fact that the AP_VLAN is already down and will do it again, causing a crash. Cc: stable@vger.kernel.org [2.6.33+] Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-27 11:26:47 +02:00
Johannes Berg	1351c5d3b1	mac80211: assign AP_VLAN hw queues correctly A lot of code in mac80211 assumes that the hw queues are set up correctly for all interfaces (except for monitor) but this isn't true for AP_VLAN interfaces. Fix this by copying the AP master configuration when an AP VLAN is brought up, after this the AP interface can't change its configuration any more and needs to be brought down to change it, which also forces AP_VLAN interfaces down, so just copying in open() is sufficient. Reported-by: Jouni Malinen <j@w1.fi> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-27 11:26:47 +02:00
Florian Westphal	9d5242b192	netfilter: nfnetlink_queue: avoid peer_portid test The portid is set to NETLINK_CB(skb).portid at create time. The run-time check will always be false. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-05-26 22:05:11 +02:00
Linus Torvalds	89ff77837a	NFS client bugfixes for 3.10 - Stable fix to prevent an rpc_task wakeup race - Fix a NFSv4.1 session drain deadlock - Fix a NFSv4/v4.1 mount regression when not running rpc.gssd - Ensure auth_gss pipe detection works in namespaces - Fix SETCLIENTID fallback if rpcsec_gss is not available -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.13 (GNU/Linux) iQIcBAABAgAGBQJRomGNAAoJEGcL54qWCgDyzUIQALj4hsOtdcmCCAWu0m+Pr+Gz /6rpO2xJ5cLWyTYS2J9wmPkU+mb/g/OB/OhANyQQsDbfoW3e27TGaXRB8hUs29vk LHTkFcrBTi4ohlw2Gyb6LWDF6cyHkC7cpH7tfIjLDff77V/qK1uqo41MtYpqUvy3 41tMoFOhvpwsy7HuKqVyYtNlwxnXrdJ5XvK0ycaQYQ9t8om9Hxu/scCgLfl/qwRr eVhIesC2/Y1eC46UK7/TWCC7aTLkvi85UQY+fywMkajt/gPzjjh6nuhc45aOT9d7 pc0sXgIs8fLEjC+CRa0ojrziJZCHZY93U1kzA+CotRqPq76nPeFeG/1VhViKKb4C 1AU9IA4ntViUqNDQiGrCxE6ZeMewTOKksrCErwSS16Hv9Gqh4SHq84R/bF6MFgBc dqQsexUbf/vaWHmuosARgbyc/QcBXDwWwbkq0hXSWbHA8vpc8/Lw1wkp49eyKz0V 0zwJ9xInakXr5/DIC72IKEF0Dg26L+GTXWLmDZVcdpVBp5A403trxUKckcsNa/Xf FXaGMhyv2ZfV4AohW1Z8klkLiMHt4dr+YB7SQAgR/gb81p3odgKgMz51PfmHxFqs 4nxwRQqWplBTGiLrOFSgaRC50jLqEloUweWC2qf56KYEb9N6M6XDIXhllLIMyWLD 94cvihpKj+/ecKH5kQWF =B8go -----END PGP SIGNATURE----- Merge tag 'nfs-for-3.10-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull NFS client bugfixes from Trond Myklebust: - Stable fix to prevent an rpc_task wakeup race - Fix a NFSv4.1 session drain deadlock - Fix a NFSv4/v4.1 mount regression when not running rpc.gssd - Ensure auth_gss pipe detection works in namespaces - Fix SETCLIENTID fallback if rpcsec_gss is not available * tag 'nfs-for-3.10-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFS: Fix SETCLIENTID fallback if GSS is not available SUNRPC: Prevent an rpc_task wakeup race NFSv4.1 Fix a pNFS session draining deadlock SUNRPC: Convert auth_gss pipe detection to work in namespaces SUNRPC: Faster detection if gssd is actually running SUNRPC: Fix a bug in gss_create_upcall	2013-05-26 12:33:05 -07:00
Eric Dumazet	a622260254	ip_tunnel: fix kernel panic with icmp_dest_unreach Daniel Petre reported crashes in icmp_dst_unreach() with following call graph: #3 [ffff88003fc03938] __stack_chk_fail at ffffffff81037f77 #4 [ffff88003fc03948] icmp_send at ffffffff814d5fec #5 [ffff88003fc03ae8] ipv4_link_failure at ffffffff814a1795 #6 [ffff88003fc03af8] ipgre_tunnel_xmit at ffffffff814e7965 #7 [ffff88003fc03b78] dev_hard_start_xmit at ffffffff8146e032 #8 [ffff88003fc03bc8] sch_direct_xmit at ffffffff81487d66 #9 [ffff88003fc03c08] __qdisc_run at ffffffff81487efd #10 [ffff88003fc03c48] dev_queue_xmit at ffffffff8146e5a7 #11 [ffff88003fc03c88] ip_finish_output at ffffffff814ab596 Daniel found a similar problem mentioned in http://lkml.indiana.edu/hypermail/linux/kernel/1007.0/00961.html And indeed this is the root cause : skb->cb[] contains data fooling IP stack. We must clear IPCB in ip_tunnel_xmit() sooner in case dst_link_failure() is called. Or else skb->cb[] might contain garbage from GSO segmentation layer. A similar fix was tested on linux-3.9, but gre code was refactored in linux-3.10. I'll send patches for stable kernels as well. Many thanks to Daniel for providing reports, patches and testing ! Reported-by: Daniel Petre <daniel.petre@rcs-rds.ro> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-25 23:26:30 -07:00
Joe Perches	c48b22daa6	tcp: Remove 2 indentation levels in tcp_rcv_state_process case TCP_FIN_WAIT1 can also be simplified by reversing tests and adding breaks; Add braces after case and move automatic definitions. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-25 23:22:18 -07:00
Joe Perches	61eb900352	tcp: Remove another indentation level in tcp_rcv_state_process case TCP_SYN_RECV: can have another indentation level removed by converting if (acceptable) { ...; } else { return 1; } to if (!acceptable) return 1; ...; Reflow code and comments to fit 80 columns. Another pure cleanup patch. Signed-off-by: Joe Perches <joe@perches.com> Improved-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-25 23:22:18 -07:00
Eric Dumazet	1f6afc8108	tcp: remove one indentation level in tcp_rcv_state_process() Remove one level of indentation 'introduced' in commit `c3ae62af8e` (tcp: should drop incoming frames without ACK flag set) if (true) { ... } @acceptable variable is a boolean. This patch is a pure cleanup. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-25 23:22:18 -07:00
Jiri Pirko	42e52bf9e3	net: add netnotifier event for upper device change Now when upper device is changed, event is not propagated via RT Netlink to userspace. Userspace might never now about the change. Fix this by adding upper-device-change notifier event. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-25 23:12:19 -07:00
Lorenzo Colitti	6d0bfe2261	net: ipv6: Add IPv6 support to the ping socket. This adds the ability to send ICMPv6 echo requests without a raw socket. The equivalent ability for ICMPv4 was added in 2011. Instead of having separate code paths for IPv4 and IPv6, make most of the code in net/ipv4/ping.c dual-stack and only add a few IPv6-specific bits (like the protocol definition) to a new net/ipv6/ping.c. Hopefully this will reduce divergence and/or duplication of bugs in the future. Caveats: - Setting options via ancillary data (e.g., using IPV6_PKTINFO to specify the outgoing interface) is not yet supported. - There are no separate security settings for IPv4 and IPv6; everything is controlled by /proc/net/ipv4/ping_group_range. - The proc interface does not yet display IPv6 ping sockets properly. Tested with a patched copy of ping6 and using raw socket calls. Compiles and works with all of CONFIG_IPV6={n,m,y}. Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-25 21:07:49 -07:00
Zhang Yanfei	0799567424	ipvs: change type of netns_ipvs->sysctl_sync_qlen_max This member of struct netns_ipvs is calculated from nr_free_buffer_pages so change its type to unsigned long in case of overflow. Also, type of its related proc var sync_qlen_max and the return type of function sysctl_sync_qlen_max() should be changed to unsigned long, too. Besides, the type of ipvs_master_sync_state->sync_queue_len should be changed to unsigned long accordingly. Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: Julian Anastasov <ja@ssi.bg> Cc: David Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Simon Horman <horms@verge.net.au>	2013-05-26 08:17:33 +09:00
David S. Miller	e6ff4c75f9	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Merge net into net-next because some upcoming net-next changes build on top of bug fixes that went into net. Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-24 16:48:28 -07:00
Johannes Berg	83739b03de	cfg80211: remove some locked wrappers from sme API By making all the API functions require wdev locking we can clean up the API a bit, getting rid of the locking version of each function. This also decreases the size of cfg80211 by a small amount. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-25 00:02:22 +02:00
Johannes Berg	91bf9b26fc	cfg80211: remove some locked wrappers from mlme API By making all the API functions require wdev locking we can clean up the API a bit, getting rid of the locking version of each function. This also decreases the size of cfg80211 by a small amount. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-25 00:02:20 +02:00
Johannes Berg	38fd2143fa	regulatory: remove reg_mutex The reg_mutex is similar to the ones I just removed in cfg80211 but even less useful since it protects global data, and we hold the RTNL in all places (except module unload) already. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-25 00:02:19 +02:00
Johannes Berg	db2424c58e	regulatory: use RCU in regulatory_hint_11d() Since it just does a quick check of the last regulatory request, the function doesn't have to hold the reg mutex but can use RCU instead. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-25 00:02:18 +02:00
Johannes Berg	1cdd59ce8d	cfg80211: simplify and correct P2P-Device scan check If the driver for some reason successfully finishes scanning while in p2p_stop_device(), cfg80211 will still set it to aborted. Simplify this code using the new 'notified' value and only mark it aborted in case the driver didn't notify cfg80211 at all (in which case we also leak the request to not crash, this is a driver bug.) Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-25 00:02:17 +02:00
Johannes Berg	8d61ffa5e0	cfg80211/mac80211: use cfg80211 wdev mutex in mac80211 Using separate locks in cfg80211 and mac80211 has always caused issues, for example having to unlock in places in mac80211 to call cfg80211, which even needed a framework to make cfg80211 calls after some functions returned etc. Additionally, I suspect some issues people have reported with the cfg80211 state getting confused could be due to such issues, when cfg80211 is asking mac80211 to change state but mac80211 is in the process of telling cfg80211 that the state changed (in another way.) Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-25 00:02:16 +02:00
Johannes Berg	5fe231e873	cfg80211: vastly simplify locking Virtually all code paths in cfg80211 already (need to) hold the RTNL. As such, there's little point in having another four mutexes for various parts of the code, they just cause lock ordering issues (and much of the time, the RTNL and a few of the others need thus be held.) Simplify all this by getting rid of the extra four mutexes and just use the RTNL throughout. Only a few code changes were needed to do this and we can get rid of a work struct for bonus points. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-25 00:02:15 +02:00
Johannes Berg	73810b77de	cfg80211: use atomic_t for wiphy counter There's no need to lock, we can just use an atomic_t. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-25 00:02:13 +02:00
Johannes Berg	9f419f3851	cfg80211: move cfg80211_get_dev_from_ifindex under wext The function is only used and needed by the wext code for scanning, so move it there. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-25 00:02:13 +02:00
Johannes Berg	dde7dc759b	Merge remote-tracking branch 'mac80211/master' into mac80211-next	2013-05-25 00:01:30 +02:00
Ashok Nagarajan	b422c6cd7e	{cfg,mac}80211: move mandatory rates calculation to cfg80211 Move mandatory rates calculation to cfg80211, shared with non mac80211 drivers. Signed-off-by: Ashok Nagarajan <ashok@cozybit.com> [extend documentation] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-24 23:54:43 +02:00
Ashok Nagarajan	d4a5a48976	mac80211: Move mesh estab_plinks outside mesh_stats debug group As estab_plinks is not a statistics member, don't show its debug information along with other mesh stat members Signed-off-by: Ashok Nagarajan <ashok@cozybit.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-24 23:18:42 +02:00
Jouni Malinen	5e4b6f5698	cfg80211: Allow TDLS peer AID to be configured for VHT VHT uses peer AID in the PARTIAL_AID field in TDLS frames. The current design for TDLS is to first add a dummy STA entry before completing TDLS Setup and then update information on this STA entry based on what was received from the peer during the setup exchange. In theory, this could use NL80211_ATTR_STA_AID to set the peer AID just like this is used in AP mode to set the AID of an association station. However, existing cfg80211 validation rules prevent this attribute from being used with set_station operation. To avoid interoperability issues between different kernel and user space version combinations, introduce a new nl80211 attribute for the purpose of setting TDLS peer AID. This attribute can be used in both the new_station and set_station operations. It is not supposed to be allowed to change the AID value during the lifetime of the STA entry, but that validation is left for drivers to do in the change_station callback. Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-24 22:36:28 +02:00
Linus Torvalds	eb3d33900a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: "It's been a while since my last pull request so quite a few fixes have piled up." Indeed. 1) Fix nf_{log,queue} compilation with PROC_FS disabled, from Pablo Neira Ayuso. 2) Fix data corruption on some tg3 chips with TSO enabled, from Michael Chan. 3) Fix double insertion of VLAN tags in be2net driver, from Sarveshwar Bandi. 4) Don't have TCP's MD5 support pass > PAGE_SIZE page offsets in scatter-gather entries into the crypto layer, the crypto layer can't handle that. From Eric Dumazet. 5) Fix lockdep splat in 802.1Q MRP code, also from Eric Dumazet. 6) Fix OOPS in netfilter log module when called from conntrack, from Hans Schillstrom. 7) FEC driver needs to use netif_tx_{lock,unlock}_bh() rather than the non-BH disabling variants. From Fabio Estevam. 8) TCP GSO can generate out-of-order packets, fix from Eric Dumazet. 9) vxlan driver doesn't update 'used' field of fdb entries when it should, from Sridhar Samudrala. 10) ipv6 should use kzalloc() to allocate inet6 socket cork options, otherwise we can OOPS in ip6_cork_release(). From Eric Dumazet. 11) Fix races in bonding set mode, from Nikolay Aleksandrov. 12) Fix checksum generation regression added by "r8169: fix 8168evl frame padding.", from Francois Romieu. 13) ip_gre can look at stale SKB data pointer, fix from Eric Dumazet. 14) Fix checksum handling when GSO is enabled in bnx2x driver with certain chips, from Yuval Mintz. 15) Fix double free in batman-adv, from Martin Hundebøll. 16) Fix device startup synchronization with firmware in tg3 driver, from Nithin Sujit. 17) perf networking dropmonitor doesn't work at all due to mixed up trace parameter ordering, from Ben Hutchings. 18) Fix proportional rate reduction handling in tcp_ack(), from Nandita Dukkipati. 19) IPSEC layer doesn't return an error when a valid state is detected, causing an OOPS. Fix from Timo Teräs. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (85 commits) be2net: bug fix on returning an invalid nic descriptor tcp: xps: fix reordering issues net: Revert unused variable changes. xfrm: properly handle invalid states as an error virtio_net: enable napi for all possible queues during open tcp: bug fix in proportional rate reduction. net: ethernet: sun: drop unused variable net: ethernet: korina: drop unused variable net: ethernet: apple: drop unused variable qmi_wwan: Added support for Cinterion's PLxx WWAN Interface perf: net_dropmonitor: Remove progress indicator perf: net_dropmonitor: Use bisection in symbol lookup perf: net_dropmonitor: Do not assume ordering of dictionaries perf: net_dropmonitor: Fix symbol-relative addresses perf: net_dropmonitor: Fix trace parameter order net: fec: use a more proper compatible string for MVF type device qlcnic: Fix updating netdev->features qlcnic: remove netdev->trans_start updates within the driver qlcnic: Return proper error codes from probe failure paths tg3: Update version to 3.132 ...	2013-05-24 08:27:32 -07:00
Oleksij Rempel	786677d100	mac80211: add STBC flag for radiotap Some chips can tell us if received frame was encoded with STBC or not. To make this information available in user space we can use updated radiotap specification: http://www.radiotap.org/defined-fields/MCS This patch will set number of STBC encoded spatial streams (Nss). The HAVE_STBC flag should be provided by driver. Signed-off-by: Oleksij Rempel <linux@rempel-privat.de> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-24 12:07:25 +02:00
Vlad Yasevich	161f65ba35	bridge: Set vlan_features to allow offloads on vlans. When vlan device is configured on top of the brige, it does not support any offload capabilities because the bridge device does not initiliaze vlan_fatures. Set vlan_fatures to be equivalent to hw_fatures. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-23 18:54:30 -07:00
Eric Dumazet	547669d483	tcp: xps: fix reordering issues commit `3853b5841c` ("xps: Improvements in TX queue selection") introduced ooo_okay flag, but the condition to set it is slightly wrong. In our traces, we have seen ACK packets being received out of order, and RST packets sent in response. We should test if we have any packets still in host queue. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-23 18:29:20 -07:00
Johannes Berg	4c8a9d4bfa	mac80211: close AP_VLAN interfaces before unregistering all Since Eric's commit `efe117ab8` ("Speedup ieee80211_remove_interfaces") there's a bug in mac80211 when it unregisters with AP_VLAN interfaces up. If the AP_VLAN interface was registered after the AP it belongs to (which is the typical case) and then we get into this code path, unregister_netdevice_many() will crash because it isn't prepared to deal with interfaces being closed in the middle of it. Exactly this happens though, because we iterate the list, find the AP master this AP_VLAN belongs to and dev_close() the dependent VLANs. After this, unregister_netdevice_many() won't pick up the fact that the AP_VLAN is already down and will do it again, causing a crash. Cc: stable@vger.kernel.org [2.6.33+] Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-24 01:06:09 +02:00
Johannes Berg	5f38a11274	mac80211: assign AP_VLAN hw queues correctly A lot of code in mac80211 assumes that the hw queues are set up correctly for all interfaces (except for monitor) but this isn't true for AP_VLAN interfaces. Fix this by copying the AP master configuration when an AP VLAN is brought up, after this the AP interface can't change its configuration any more and needs to be brought down to change it, which also forces AP_VLAN interfaces down, so just copying in open() is sufficient. Reported-by: Jouni Malinen <j@w1.fi> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-23 23:11:33 +02:00
Felix Fietkau	4325d724cd	cfg80211: fix reporting 64-bit station info tx bytes Copy & paste mistake - STATION_INFO_TX_BYTES64 is the name of the flag, not NL80211_STA_INFO_TX_BYTES64. Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-23 22:08:18 +02:00
Johannes Berg	2b436312f0	mac80211: fix queue handling crash The code I added in "mac80211: don't start new netdev queues if driver stopped" crashes for monitor and AP VLAN interfaces because while they have a netdev, they don't have queues set up by the driver. To fix the crash, exclude these from queue accounting here and just start their netdev queues unconditionally. For monitor, this is the best we can do, as we can redirect frames there to any other interface and don't know which one that will since it can be different for each frame. For AP VLAN interfaces, we can do better later and actually properly track the queue status. Not doing this is really a separate bug though. Reported-by: Ilan Peer <ilan.peer@intel.com> Reported-by: Jouni Malinen <j@w1.fi> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-23 21:04:38 +02:00
Johannes Berg	c815797663	cfg80211: check wdev->netdev in connection work If a P2P-Device is present and another virtual interface triggers the connection work, the system crash because it tries to check if the P2P-Device's netdev (which doesn't exist) is up. Skip any wdevs that have no netdev to fix this. Cc: stable@vger.kernel.org Reported-by: YanBo <dreamfly281@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-23 18:12:38 +02:00
Chen Gang	4f36ea6eed	netfilter: ipt_ULOG: fix non-null terminated string in the nf_log path If nf_log uses ipt_ULOG as logging output, we can deliver non-null terminated strings to user-space since the maximum length of the prefix that is passed by nf_log is NF_LOG_PREFIXLEN but pm->prefix is 32 bytes long (ULOG_PREFIX_LEN). This is actually happening already from nf_conntrack_tcp if ipt_ULOG is used, since it is passing strings longer than 32 bytes. Signed-off-by: Chen Gang <gang.chen@asianux.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-05-23 14:25:40 +02:00
Simon Horman	a38e5e230e	ipvs: use cond_resched_rcu() helper when walking connections This avoids the situation where walking of a large number of connections may prevent scheduling for a long time while also avoiding excessive calls to rcu_read_unlock() and rcu_read_lock(). Note that in the case of !CONFIG_PREEMPT_RCU this will add a call to cond_resched(). Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Acked-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-05-23 14:23:18 +02:00
Pablo Neira Ayuso	de94c4591b	netfilter: {ipt,ebt}_ULOG: rise warning on deprecation This target has been superseded by NFLOG. Spot a warning so we prepare removal in a couple of years. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Gao feng <gaofeng@cn.fujitsu.com>	2013-05-23 14:23:16 +02:00
Pablo Neira Ayuso	6d11cfdba5	netfilter: don't panic on error while walking through the init path Don't panic if we hit an error while adding the nf_log or pernet netfilter support, just bail out. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Gao feng <gaofeng@cn.fujitsu.com>	2013-05-23 14:22:30 +02:00
Florian Westphal	2a7851bffb	netfilter: add nf_ipv6_ops hook to fix xt_addrtype with IPv6 Quoting https://bugzilla.netfilter.org/show_bug.cgi?id=812: [ ip6tables -m addrtype ] When I tried to use in the nat/PREROUTING it messes up the routing cache even if the rule didn't matched at all. [..] If I remove the --limit-iface-in from the non-working scenario, so just use the -m addrtype --dst-type LOCAL it works! This happens when LOCAL type matching is requested with --limit-iface-in, and the default ipv6 route is via the interface the packet we test arrived on. Because xt_addrtype uses ip6_route_output, the ipv6 routing implementation creates an unwanted cached entry, and the packet won't make it to the real/expected destination. Silently ignoring --limit-iface-in makes the routing work but it breaks rule matching (--dst-type LOCAL with limit-iface-in is supposed to only match if the dst address is configured on the incoming interface; without --limit-iface-in it will match if the address is reachable via lo). The test should call ipv6_chk_addr() instead. However, this would add a link-time dependency on ipv6. There are two possible solutions: 1) Revert the commit that moved ipt_addrtype to xt_addrtype, and put ipv6 specific code into ip6t_addrtype. 2) add new "nf_ipv6_ops" struct to register pointers to ipv6 functions. While the former might seem preferable, Pablo pointed out that there are more xt modules with link-time dependeny issues regarding ipv6, so lets go for 2). Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-05-23 11:58:55 +02:00
Chen Gang	8bc14d25ff	bridge: netfilter: using strlcpy() instead of strncpy() 'name' has already set all zero when it is defined, so not need let strncpy() to pad it again. 'name' is a string, better always let is NUL terminated, so use strlcpy() instead of strncpy(). Signed-off-by: Chen Gang <gang.chen@asianux.com> Acked-by: Bart De Schuymer <bdschuym@pandora.be> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-05-23 11:24:57 +02:00
Eric Dumazet	00028aa370	netfilter: xt_socket: use IP early demux With IP early demux added in linux-3.6, we perform TCP lookup in IP layer before iptables hooks. We can avoid doing a second lookup in xt_socket. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-05-23 11:09:53 +02:00
Eric Dumazet	27e7190efd	netfilter: xt_CT: optimize XT_CT_NOTRACK The percpu untracked ct are not currently used for XT_CT_NOTRACK. xt_ct_tg_check()/xt_ct_target() provides a single ct. Thats not optimal as the ct->ct_general.use cache line will bounce among cpus. Use the intended [1] thing : xt_ct_target() should select the percpu object. [1] Refs : commit `5bfddbd46a` ("netfilter: nf_conntrack: IPS_UNTRACKED bit") commit `b3c5163fe0` ("netfilter: nf_conntrack: per_cpu untracking") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-05-23 11:09:29 +02:00
Timo Teräs	497574c72c	xfrm: properly handle invalid states as an error The error exit path needs err explicitly set. Otherwise it returns success and the only caller, xfrm_output_resume(), would oops in skb_dst(skb)->ops derefence as skb_dst(skb) is NULL. Bug introduced in commit `bb65a9cb` (xfrm: removes a superfluous check and add a statistic). Signed-off-by: Timo Teräs <timo.teras@iki.fi> Cc: Li RongQing <roy.qing.li@gmail.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-23 01:20:07 -07:00
Cong Wang	8892475386	ipv6: use ipv6_addr_scope() helper ipv6_addr_type(&addr)&IPV6_ADDR_SCOPE_MASK could be replaced by ipv6_addr_scope(), which is slightly faster. Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-23 01:17:47 -07:00
Cong Wang	7996c799ae	ipv6: use ipv6_addr_any() helper ipv6_addr_any() is a faster way to determine if an addr is ipv6 any addr, no need to compute the addr type. Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-23 01:17:47 -07:00
Nandita Dukkipati	35f079ebbc	tcp: bug fix in proportional rate reduction. This patch is a fix for a bug triggering newly_acked_sacked < 0 in tcp_ack(.). The bug is triggered by sacked_out decreasing relative to prior_sacked, but packets_out remaining the same as pior_packets. This is because the snapshot of prior_packets is taken after tcp_sacktag_write_queue() while prior_sacked is captured before tcp_sacktag_write_queue(). The problem is: tcp_sacktag_write_queue (tcp_match_skb_to_sack() -> tcp_fragment) adjusts the pcount for packets_out and sacked_out (MSS change or other reason). As a result, this delta in pcount is reflected in (prior_sacked - sacked_out) but not in (prior_packets - packets_out). This patch does the following: 1) initializes prior_packets at the start of tcp_ack() so as to capture the delta in packets_out created by tcp_fragment. 2) introduces a new "previous_packets_out" variable that snapshots packets_out right before tcp_clean_rtx_queue, so pkts_acked can be correctly computed as before. 3) Computes pkts_acked using previous_packets_out, and computes newly_acked_sacked using prior_packets. Signed-off-by: Nandita Dukkipati <nanditad@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-23 00:10:09 -07:00
Eric Dumazet	e43ac79a4b	sch_tbf: segment too big GSO packets If a GSO packet has a length above tbf burst limit, the packet is currently silently dropped. Current way to handle this is to set the device in non GSO/TSO mode, or setting high bursts, and its sub optimal. We can actually segment too big GSO packets, and send individual segments as tbf parameters allow, allowing for better interoperability. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ben Hutchings <ben@decadent.org.uk> Cc: Jiri Pirko <jiri@resnulli.us> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-23 00:06:40 -07:00
Simon Horman	1cdbcb7957	net: Loosen constraints for recalculating checksum in skb_segment() This is a generic solution to resolve a specific problem that I have observed. If the encapsulation of an skb changes then ability to offload checksums may also change. In particular it may be necessary to perform checksumming in software. An example of such a case is where a non-GRE packet is received but is to be encapsulated and transmitted as GRE. Another example relates to my proposed support for for packets that are non-MPLS when received but MPLS when transmitted. The cost of this change is that the value of the csum variable may be checked when it previously was not. In the case where the csum variable is true this is pure overhead. In the case where the csum variable is false it leads to software checksumming, which I believe also leads to correct checksums in transmitted packets for the cases described above. Further analysis: This patch relies on the return value of can_checksum_protocol() being correct and in turn the return value of skb_network_protocol(), used to provide the protocol parameter of can_checksum_protocol(), being correct. It also relies on the features passed to skb_segment() and in turn to can_checksum_protocol() being correct. I believe that this problem has not been observed for VLANs because it appears that almost all drivers, the exception being xgbe, set vlan_features such that that the checksum offload support for VLAN packets is greater than or equal to that of non-VLAN packets. I wonder if the code in xgbe may be an oversight and the hardware does support checksumming of VLAN packets. If so it may be worth updating the vlan_features of the driver as this patch will force such checksums to be performed in software rather than hardware. Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-22 15:06:13 -07:00
Cong Wang	6b7df111ec	bridge: send query as soon as leave is received Continue sending queries when leave is received if the user marks it as a querier. Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Adam Baker <linux@baker-net.org.uk> Signed-off-by: Cong Wang <amwang@redhat.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-22 14:54:37 -07:00
Cong Wang	9f00b2e7cf	bridge: only expire the mdb entry when query is received Currently we arm the expire timer when the mdb entry is added, however, this causes problem when there is no querier sent out after that. So we should only arm the timer when a corresponding query is received, as suggested by Herbert. And he also mentioned "if there is no querier then group subscriptions shouldn't expire. There has to be at least one querier in the network for this thing to work. Otherwise it just degenerates into a non-snooping switch, which is OK." Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Adam Baker <linux@baker-net.org.uk> Signed-off-by: Cong Wang <amwang@redhat.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-22 14:54:37 -07:00
Cong Wang	1c8ad5bfa2	bridge: use the bridge IP addr as source addr for querier Quote from Adam: "If it is believed that the use of 0.0.0.0 as the IP address is what is causing strange behaviour on other devices then is there a good reason that a bridge rather than a router shouldn't be the active querier? If not then using the bridge IP address and having the querier enabled by default may be a reasonable solution (provided that our querier obeys the election rules and shuts up if it sees a query from a lower IP address that isn't 0.0.0.0). Just because a device is the elected querier for IGMP doesn't appear to mean it is required to perform any other routing functions." And introduce a new troggle for it, as suggested by Herbert. Suggested-by: Adam Baker <linux@baker-net.org.uk> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Adam Baker <linux@baker-net.org.uk> Signed-off-by: Cong Wang <amwang@redhat.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-22 14:54:37 -07:00
Trond Myklebust	a3c3cac5d3	SUNRPC: Prevent an rpc_task wakeup race The lockless RPC_IS_QUEUED() test in __rpc_execute means that we need to be careful about ordering the calls to rpc_test_and_set_running(task) and rpc_clear_queued(task). If we get the order wrong, then we may end up testing the RPC_TASK_RUNNING flag after __rpc_execute() has looped and changed the state of the rpc_task. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org	2013-05-22 14:55:32 -04:00
Martin Hundebøll	f69ae770e7	batman-adv: Avoid double freeing of bat_counters On errors in batadv_mesh_init(), bat_counters will be freed in both batadv_mesh_free() and batadv_softif_init_late(). This patch fixes this by returning earlier from batadv_softif_init_late() in case of errors in batadv_mesh_init() and by setting bat_counters to NULL after freeing. Signed-off-by: Martin Hundebøll <martin@hundeboll.net> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2013-05-21 21:34:36 +02:00
chaoting fan	b6040f9706	sunrpc: the cache_detail in cache_is_valid is unused any more The cache_detail(*detail) in function cache_is_valid is not used any more. Signed-off-by: fanchaoting <fanchaoting@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-05-21 11:02:02 -04:00
Paul Bolle	7c055881de	NFC: Remove commented out LLCP related Makefile line The Kconfig symbol NFC_LLCP was removed in commit `30cc458765` ("NFC: Move LLCP code to the NFC top level diirectory"). But the reference to its macro in this Makefile was only commented out. Remove it now. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2013-05-21 10:47:41 +02:00
Eric Dumazet	71cea17ed3	tcp: md5: remove spinlock usage in fast path TCP md5 code uses per cpu variables but protects access to them with a shared spinlock, which is a contention point. [ tcp_md5sig_pool_lock is locked twice per incoming packet ] Makes things much simpler, by allocating crypto structures once, first time a socket needs md5 keys, and not deallocating them as they are really small. Next step would be to allow crypto allocations being done in a NUMA aware way. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-20 14:00:42 -07:00
Willem de Bruijn	99bbc70741	rps: selective flow shedding during softnet overflow A cpu executing the network receive path sheds packets when its input queue grows to netdev_max_backlog. A single high rate flow (such as a spoofed source DoS) can exceed a single cpu processing rate and will degrade throughput of other flows hashed onto the same cpu. This patch adds a more fine grained hashtable. If the netdev backlog is above a threshold, IRQ cpus track the ratio of total traffic of each flow (using 4096 buckets, configurable). The ratio is measured by counting the number of packets per flow over the last 256 packets from the source cpu. Any flow that occupies a large fraction of this (set at 50%) will see packet drop while above the threshold. Tested: Setup is a muli-threaded UDP echo server with network rx IRQ on cpu0, kernel receive (RPS) on cpu0 and application threads on cpus 2--7 each handling 20k req/s. Throughput halves when hit with a 400 kpps antagonist storm. With this patch applied, antagonist overload is dropped and the server processes its complete load. The patch is effective when kernel receive processing is the bottleneck. The above RPS scenario is a extreme, but the same is reached with RFS and sufficient kernel processing (iptables, packet socket tap, ..). Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-20 13:48:04 -07:00
John W. Linville	ba7c96bec5	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem	2013-05-20 15:19:01 -04:00
Eric Dumazet	96f5a846bd	ip_gre: fix a possible crash in ipgre_err() Another fix needed in ipgre_err(), as parse_gre_header() might change skb->head. Bug added in commit `c544193214` (GRE: Refactor GRE tunneling code.) Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-20 00:18:52 -07:00
Yuchung Cheng	3e59cb0ddf	tcp: remove bad timeout logic in fast recovery tcp_timeout_skb() was intended to trigger fast recovery on timeout, unfortunately in reality it often causes spurious retransmission storms during fast recovery. The particular sign is a fast retransmit over the highest sacked sequence (SND.FACK). Currently the RTO timer re-arming (as in RFC6298) offers a nice cushion to avoid spurious timeout: when SND.UNA advances the sender re-arms RTO and extends the timeout by icsk_rto. The sender does not offset the time elapsed since the packet at SND.UNA was sent. But if the next (DUP)ACK arrives later than ~RTTVAR and triggers tcp_fastretrans_alert(), then tcp_timeout_skb() will mark any packet sent before the icsk_rto interval lost, including one that's above the highest sacked sequence. Most likely a large part of scorebard will be marked. If most packets are not lost then the subsequent DUPACKs with new SACK blocks will cause the sender to continue to retransmit packets beyond SND.FACK spuriously. Even if only one packet is lost the sender may falsely retransmit almost the entire window. The situation becomes common in the world of bufferbloat: the RTT continues to grow as the queue builds up but RTTVAR remains small and close to the minimum 200ms. If a data packet is lost and the DUPACK triggered by the next data packet is slightly delayed, then a spurious retransmission storm forms. As the original comment on tcp_timeout_skb() suggests: the usefulness of this feature is questionable. It also wastes cycles walking the sack scoreboard and is actually harmful because of false recovery. It's time to remove this. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Nandita Dukkipati <nanditad@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-19 23:51:17 -07:00
Rusty Russell	d2f83e9078	Hoist memcpy_fromiovec/memcpy_toiovec into lib/ ERROR: "memcpy_fromiovec" [drivers/vhost/vhost_scsi.ko] undefined! That function is only present with CONFIG_NET. Turns out that crypto/algif_skcipher.c also uses that outside net, but it actually needs sockets anyway. In addition, commit `6d4f0139d6` added CONFIG_NET dependency to CONFIG_VMCI for memcpy_toiovec, so hoist that function and revert that commit too. socket.h already includes uio.h, so no callers need updating; trying only broke things fo x86_64 randconfig (thanks Fengguang!). Reported-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2013-05-20 10:24:22 +09:30
Chen Gang	ff0102ee10	net: irda: using kzalloc() instead of kmalloc() to avoid strncpy() issue. 'discovery->data.info' length is 22, NICKNAME_MAX_LEN is 21, so the strncpy() will always left the last byte of 'discovery->data.info' uninitialized. When 'text' length is longer than 21 (NICKNAME_MAX_LEN), if still left the last byte of 'discovery->data.info' uninitialized, the next strlen() will cause issue. Also 'discovery->data' is 'struct irda_device_info' which defined in "include/uapi/...", it may copy to user mode, so need whole initialized. All together, need use kzalloc() instead of kmalloc() to initialize all members firstly. Signed-off-by: Chen Gang <gang.chen@asianux.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-19 15:10:47 -07:00
Nicolas Dichtel	caeaba7900	ipv6: add support of peer address This patch adds the support of peer address for IPv6. For example, it is possible to specify the remote end of a 6inY tunnel. This was already possible in IPv4: ip addr add ip1 peer ip2 dev dev1 The peer address is specified with IFA_ADDRESS and the local address with IFA_LOCAL (like explained in include/uapi/linux/if_addr.h). Note that the API is not changed, because before this patch, it was not possible to specify two different addresses in IFA_LOCAL and IFA_REMOTE. There is a small change for the dump: if the peer is different from ::, IFA_ADDRESS will contain the peer address instead of the local address. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-19 15:09:26 -07:00
Paul Moore	6b21e1b77d	netlabel: improve domain mapping validation The net/netlabel/netlabel_domainhash.c:netlbl_domhsh_add() function does not properly validate new domain hash entries resulting in potential problems when an administrator attempts to add an invalid entry. One such problem, as reported by Vlad Halilov, is a kernel BUG (found in netlabel_domainhash.c:netlbl_domhsh_audit_add()) when adding an IPv6 outbound mapping with a CIPSO configuration. This patch corrects this problem by adding the necessary validation code to netlbl_domhsh_add() via the newly created netlbl_domhsh_validate() function. Ideally this patch should also be pushed to the currently active -stable trees. Reported-by: Vlad Halilov <vlad.halilov@gmail.com> Signed-off-by: Paul Moore <pmoore@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-19 14:49:55 -07:00
Eric Dumazet	284041ef21	ipv6: fix possible crashes in ip6_cork_release() commit `0178b695fd` ("ipv6: Copy cork options in ip6_append_data") added some code duplication and bad error recovery, leading to potential crash in ip6_cork_release() as kfree() could be called with garbage. use kzalloc() to make sure this wont happen. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Neal Cardwell <ncardwell@google.com>	2013-05-18 12:55:45 -07:00
Nicolas Dichtel	57b354e66b	dev: remove duplicate 'skb->dev = dev' in dev_forward_skb() This was added by commit `59b9997bab` (Revert "net: maintain namespace isolation between vlan and real device"). In fact, before the initial commit - the one that is reverted -, this statement was not present. 'skb->dev = dev' is already done in eth_type_trans(), which is call just after. Spotted-by: Alain Ritoux <alain.ritoux@6wind.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-17 18:23:08 -07:00
Alex Elder	14d2f38df6	libceph: must hold mutex for reset_changed_osds() An osd client has a red-black tree describing its osds, and occasionally we would get crashes due to one of these trees tree becoming corrupt somehow. The problem turned out to be that reset_changed_osds() was being called without protection of the osd client request mutex. That function would call __reset_osd() for any osd that had changed, and __reset_osd() would call __remove_osd() for any osd with no outstanding requests, and finally __remove_osd() would remove the corresponding entry from the red-black tree. Thus, the tree was getting modified without having any lock protection, and was vulnerable to problems due to concurrent updates. This appears to be the only osd tree updating path that has this problem. It can be fairly easily fixed by moving the call up a few lines, to just before the request mutex gets dropped in kick_requests(). This resolves: http://tracker.ceph.com/issues/5043 Cc: stable@vger.kernel.org # 3.4+ Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2013-05-17 12:45:40 -05:00
Stanislaw Gruszka	6211dd12da	mac80211: fix direct probe auth We send direct probe to broadcast address, as some APs do not respond to unicast PROBE frames when unassociated. Broadcast frames are not acked, so we can not use that for trigger MLME state machine, but we need to use old timeout mechanism. This fixes authentication timed out like below: [ 1024.671974] wlan6: authenticate with 54:e6:fc:98:63:fe [ 1024.694125] wlan6: direct probe to 54:e6:fc:98:63:fe (try 1/3) [ 1024.695450] wlan6: direct probe to 54:e6:fc:98:63:fe (try 2/3) [ 1024.700586] wlan6: send auth to 54:e6:fc:98:63:fe (try 3/3) [ 1024.701441] wlan6: authentication with 54:e6:fc:98:63:fe timed out With fix, we have: [ 4524.198978] wlan6: authenticate with 54:e6:fc:98:63:fe [ 4524.220692] wlan6: direct probe to 54:e6:fc:98:63:fe (try 1/3) [ 4524.421784] wlan6: send auth to 54:e6:fc:98:63:fe (try 2/3) [ 4524.423272] wlan6: authenticated [ 4524.423811] wlan6: associate with 54:e6:fc:98:63:fe (try 1/3) [ 4524.427492] wlan6: RX AssocResp from 54:e6:fc:98:63:fe (capab=0x431 status=0 aid=1) Cc: stable@vger.kernel.org # 3.9 Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-17 13:59:18 +02:00
Linus Lüssing	72822225bd	batman-adv: Fix rcu_barrier() miss due to double call_rcu() in TT code rcu_barrier() only waits for the currently scheduled rcu functions to finish - it won't wait for any function scheduled via another call_rcu() within an rcu scheduled function. Unfortunately our batadv_tt_orig_list_entry_free_ref() does just that, via a batadv_orig_node_free_ref() call, leading to our rcu_barrier() call potentially missing such a batadv_orig_node_free_ref(). This patch fixes this issue by calling the batadv_orig_node_free_rcu() directly from the rcu callback, removing the unnecessary, additional call_rcu() layer here. Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Acked-by: Antonio Quartulli <ordex@autistici.org>	2013-05-17 09:54:28 +02:00
Eric Dumazet	d2cf43674e	tcp: speedup tcp_fixup_rcvbuf() tcp_fixup_rcvbuf() contains a loop to estimate initial socket rcv space needed for a given mss. With large MTU (like 64K on lo), we can loop ~500 times and consume a lot of cpu cycles. perf top of 200 concurrent netperf -t TCP_CRR 5.62% netperf [kernel.kallsyms] [k] tcp_init_buffer_space 1.71% netperf [kernel.kallsyms] [k] _raw_spin_lock 1.55% netperf [kernel.kallsyms] [k] kmem_cache_free 1.51% netperf [kernel.kallsyms] [k] tcp_transmit_skb 1.50% netperf [kernel.kallsyms] [k] tcp_ack Lets use a 100% factor, and remove the loop. 100% is needed anyway for tcp_adv_win_scale=1 default value, and is also the maximum factor. Refs: commit `b49960a05e` ("tcp: change tcp_adv_win_scale and tcp_rmem[2]") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-16 15:19:45 -07:00
Eric Dumazet	6ff50cd555	tcp: gso: do not generate out of order packets GSO TCP handler has following issues : 1) ooo_okay from original GSO packet is duplicated to all segments 2) segments (but the last one) are orphaned, so transmit path can not get transmit queue number from the socket. This happens if GSO segmentation is done before stacked device for example. Result is we can send packets from a given TCP flow to different TX queues (if using multiqueue NICS). This generates OOO problems and spurious SACK & retransmits. Fix this by keeping socket pointer set for all segments. This means that every segment must also have a destructor, and the original gso skb truesize must be split on all segments, to keep precise sk->sk_wmem_alloc accounting. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Maciej Żenczykowski <maze@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-16 14:43:40 -07:00
David S. Miller	5c4b274981	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf Pablo Neira Ayuso says: ==================== The following patchset contains three Netfilter fixes and update for the MAINTAINER file for your net tree, they are: * Fix crash if nf_log_packet is called from conntrack, in that case both interfaces are NULL, from Hans Schillstrom. This bug introduced with the logging netns support in the previous merge window. * Fix compilation of nf_log and nf_queue without CONFIG_PROC_FS, from myself. This bug was introduced in the previous merge window with the new netns support for the netfilter logging infrastructure. * Fix possible crash in xt_TCPOPTSTRIP due to missing sanity checkings to validate that the TCP header is well-formed, from myself. I can find this bug in 2.6.25, probably it's been there since the beginning. I'll pass this to -stable. * Update MAINTAINER file to point to new nf trees at git.kernel.org, remove Harald and use M: instead of P: (now obsolete tag) to keep Jozsef in the list of people. Please, consider pulling this. Thanks! ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-16 14:32:42 -07:00
Alexander Bondar	ce85788846	mac80211: enable power save only if DTIM period is available Generally, the DTIM period is available after a beacon has been received, and if no beacon has been received enabling powersave is problematic anyway for synchronisation. Since some drivers may require the DTIM period for powersave, don't enable powersave until it becomes available in case the scan/association managed to not receive a beacon. Signed-off-by: Alexander Bondar <alexander.bondar@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-16 23:01:29 +02:00
Colleen Twitty	0d4261ad5d	mac80211: enable Auth Protocol Identifier on mesh config. Previously the mesh_auth_id was disabled. Instead set the correct mesh authentication bit based on the mesh setup. Signed-off-by: Colleen Twitty <colleen@cozybit.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-16 22:39:43 +02:00
Colleen Twitty	6e16d90b52	cfg80211: Userspace may inform kernel of mesh auth method. Authentication takes place in userspace, but the beacon is generated in the kernel. Allow userspace to inform the kernel of the authentication method so the appropriate mesh config IE can be set prior to beacon generation when joining the MBSS. Signed-off-by: Colleen Twitty <colleen@cozybit.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-16 22:39:43 +02:00
Johannes Berg	7ade703604	cfg80211: use C99 initialisers to simplify code a bit Use C99 initialisers for the auth, deauth and disassoc requests to simplify the code. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-16 22:39:42 +02:00
Johannes Berg	bd500af223	mac80211: write memcpy differently for smatch There's no real difference between *array and array, but the former confuses smatch so write it differently. The generated code is exactly the same. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-16 22:39:42 +02:00
Johannes Berg	4325f6caad	wireless: move crypto constants to ieee80211.h mac80211 and the Intel drivers all define crypto constants, move them to ieee80211.h instead. Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-16 22:39:41 +02:00
Johannes Berg	04a161f460	mac80211: fix HT beacon-based channel switch handling When an HT AP is advertising channel switch in a beacon, it doesn't (and shouldn't, according to 802.11-2012 Table 8-20) include a secondary channel offset element. The only possible interpretation is that the previous secondary channel offset remains valid, so use that when switching channel based only on beacon information. VHT requires the Wide Bandwidth Channel Switch subelement to be present in the Channel Switch Wrapper element, so the code for that is probably ok (see 802.11ac Draft 4, 8.4.2.165.) Reported-by: Sujith Manoharan <c_manoha@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-16 22:39:40 +02:00
Marcel Holtmann	fb4e156886	nl80211: Add generic netlink module alias for cfg80211/nl80211 To support auto-loading of wireless modules from netlink users, add module alias for nl80211 family. This also adds NL80211_GENL_NAME constant to define the "nl80211" netlink family name as part of uapi. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-16 22:39:39 +02:00
Vladimir Kondratiev	55300a13d2	cfg80211: add 60GHz regulatory class Add regulatory class for 60GHz band, according to the last specification. Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-16 22:39:38 +02:00
Felix Fietkau	ef0621e805	mac80211: add support for per-chain signal strength reporting Signed-off-by: Felix Fietkau <nbd@openwrt.org> [fix unit documentation] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-16 22:39:38 +02:00
Felix Fietkau	119363c7dc	cfg80211: add support for per-chain signal strength reporting Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-16 22:39:37 +02:00
Johannes Berg	e248ad3020	cfg80211: fix sending WoWLAN TCP wakeup settings The code sending the current WoWLAN TCP wakeup settings in nl80211_send_wowlan_tcp() is not closing the nested attribute, thus causing the parser to get confused on the receiver side in userspace (iw). Fix this. Cc: stable@vger.kernel.org [3.9] Reported-by: Deepak Arora <deepakx.arora@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-16 22:38:09 +02:00
Johannes Berg	de3d43a37d	mac80211: report deauth to cfg80211 for local state change Even if the frame isn't transmitted to the AP, we need to report it to cfg80211 so the state there can be updated correctly. Reported-by: Jouni Malinen <j@w1.fi> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-16 22:38:08 +02:00
Johannes Berg	2430816b4e	cfg80211: fix interface down/disconnect state handling When the interface goes down, there's no need to call cfg80211_mlme_down() after __cfg80211_disconnect() as the latter will call the former (if appropriate.) Also, in __cfg80211_disconnect(), if the cfg80211 SME isn't used, __cfg80211_disconnected() may still need to be called (depending on the current state) so that the SME state gets cleared. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-16 22:38:08 +02:00
Johannes Berg	2b9ccd4e43	mac80211: fix AP-mode frame matching In AP mode, ignore frames with mis-matched BSSID that aren't multicast or sent to the correct destination. This fixes reporting public action frames to userspace multiple times on multiple virtual AP interfaces. Cc: stable@vger.kernel.org Reported-by: Jouni Malinen <j@w1.fi> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-16 22:38:07 +02:00
Ilan Peer	a838490b49	nl80211: Add wdev identifier to some nl80211 notifications Adding the attributes fixes an issue with P2P Device not working properly for management frame TX. Signed-off-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-16 22:38:06 +02:00
Johannes Berg	655914ab86	mac80211: use just spin_lock() in ieee80211_get_tkip_p2k() ieee80211_get_tkip_p2k() may be called with interrupts disabled, so spin_unlock_bh() isn't safe and leads to warnings. Since it's always called with BHs disabled already, just use spin_lock(). Cc: stable@vger.kernel.org Reported-by: Milan Kocian <milon@wq.cz> Acked-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-16 22:38:06 +02:00
Felix Fietkau	f6b3d85f7f	mac80211: fix spurious RCU warning and update documentation Document rx vs tx status concurrency requirements. Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-16 22:38:05 +02:00
Johannes Berg	3670946fe2	mac80211: fix HT beacon-based channel switch handling When an HT AP is advertising channel switch in a beacon, it doesn't (and shouldn't, according to 802.11-2012 Table 8-20) include a secondary channel offset element. The only possible interpretation is that the previous secondary channel offset remains valid, so use that when switching channel based only on beacon information. VHT requires the Wide Bandwidth Channel Switch subelement to be present in the Channel Switch Wrapper element, so the code for that is probably ok (see 802.11ac Draft 4, 8.4.2.165.) Reported-by: Sujith Manoharan <c_manoha@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-16 22:38:04 +02:00
Johannes Berg	b8360ab8d2	mac80211: fix IEEE80211_SDATA_DISCONNECT_RESUME Since commit `12e7f51702`, IEEE80211_SDATA_DISCONNECT_RESUME no longer worked as it would simply never be tested. Restore a bit of the code removed there and in `9b7d72c104` to make it work again. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-16 22:38:04 +02:00
Johannes Berg	a92eecbbea	cfg80211: fix WoWLAN wakeup tracing If the device reports a non-wireless wakeup reason, the tracing code crashes trying to dereference a NULL pointer. Fix this by checking the pointer on all accesses and also add a non_wireless tag to the event. Cc: stable@vger.kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-16 22:38:03 +02:00
Johannes Berg	03cd7e4e1e	cfg80211: fix wiphy_register error path If rfkill_register() fails in wiphy_register() the struct device is unregistered but everything else isn't (regulatory, debugfs) and we even leave the wiphy instance on all internal lists even though it will likely be freed soon, which is clearly a problem. Fix this by cleaning up properly. Cc: stable@vger.kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-05-16 22:38:02 +02:00
Pablo Neira Ayuso	bc6bcb59dd	netfilter: xt_TCPOPTSTRIP: fix possible mangling beyond packet boundary This target assumes that tcph->doff is well-formed, that may be well not the case. Add extra sanity checkings to avoid possible crash due to read/write out of the real packet boundary. After this patch, the default action on malformed TCP packets is to drop them. Moreover, fragments are skipped. Reported-by: Rafal Kupka <rkupka@telemetry.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-05-16 17:35:53 +02:00
Trond Myklebust	2aed8b476f	SUNRPC: Convert auth_gss pipe detection to work in namespaces This seems to have been overlooked when we did the namespace conversion. If a container is running a legacy version of rpc.gssd then it will be disrupted if the global 'pipe_version' is set by a container running the new version of rpc.gssd. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-05-16 06:17:54 -07:00
Trond Myklebust	abfdbd53a4	SUNRPC: Faster detection if gssd is actually running Recent changes to the NFS security flavour negotiation mean that we have a stronger dependency on rpc.gssd. If the latter is not running, because the user failed to start it, then we time out and mark the container as not having an instance. We then use that information to time out faster the next time. If, on the other hand, the rpc.gssd successfully binds to an rpc_pipe, then we mark the container as having an rpc.gssd instance. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-05-16 06:15:41 -07:00
Linus Torvalds	109c3c0292	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull Ceph fixes from Sage Weil: "Yes, this is a much larger pull than I would like after -rc1. There are a few things included: - a few fixes for leaks and incorrect assertions - a few patches fixing behavior when mapped images are resized - handling for cloned/layered images that are flattened out from underneath the client The last bit was non-trivial, and there is some code movement and associated cleanup mixed in. This was ready and was meant to go in last week but I missed the boat on Friday. My only excuse is that I was waiting for an all clear from the testing and there were many other shiny things to distract me. Strictly speaking, handling the flatten case isn't a regression and could wait, so if you like we can try to pull the series apart, but Alex and I would much prefer to have it all in as it is a case real users will hit with 3.10." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (33 commits) rbd: re-submit flattened write request (part 2) rbd: re-submit write request for flattened clone rbd: re-submit read request for flattened clone rbd: detect when clone image is flattened rbd: reference count parent requests rbd: define parent image request routines rbd: define rbd_dev_unparent() rbd: don't release write request until necessary rbd: get parent info on refresh rbd: ignore zero-overlap parent rbd: support reading parent page data for writes rbd: fix parent request size assumption libceph: init sent and completed when starting rbd: kill rbd_img_request_get() rbd: only set up watch for mapped images rbd: set mapping read-only flag in rbd_add() rbd: support reading parent page data rbd: fix an incorrect assertion condition rbd: define rbd_dev_v2_header_info() rbd: get rid of trivial v1 header wrappers ...	2013-05-15 13:36:19 -07:00
Trond Myklebust	d36ccb9cec	SUNRPC: Fix a bug in gss_create_upcall If wait_event_interruptible_timeout() is successful, it returns the number of seconds remaining until the timeout. In that case, we should be retrying the upcall. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-05-15 10:49:58 -07:00
J. Bruce Fields	2fccbd9cc0	sunrpc: server back channel needs no rpcbind method XPRT_BOUND is set on server backchannel xprts by xs_setup_bc_tcp() (using xprt_set_bound()), and is never cleared, so ->rpcbind() will never need to be called. Reported-by: "Myklebust, Trond" <Trond.Myklebust@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-05-15 09:27:02 -04:00
Hans Schillstrom	8cdb46da06	netfilter: log: netns NULL ptr bug when calling from conntrack Since (`69b34fb` netfilter: xt_LOG: add net namespace support for xt_LOG), we hit this: [ 4224.708977] BUG: unable to handle kernel NULL pointer dereference at 0000000000000388 [ 4224.709074] IP: [<ffffffff8147f699>] ipt_log_packet+0x29/0x270 when callling log functions from conntrack both in and out are NULL i.e. the net pointer is invalid. Adding struct net *net in call to nf_logfn() will secure that there always is a vaild net ptr. Reported as netfilter's bugzilla bug 818: https://bugzilla.netfilter.org/show_bug.cgi?id=818 Reported-by: Ronald <ronald645@gmail.com> Signed-off-by: Hans Schillstrom <hans@schillstrom.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-05-15 14:11:07 +02:00
Eric Dumazet	faff57a92b	net/802/mrp: fix lockdep splat commit `fb745e9a03` ("net/802/mrp: fix possible race condition when calling mrp_pdu_queue()") introduced a lockdep splat. [ 19.735147] ================================= [ 19.735235] [ INFO: inconsistent lock state ] [ 19.735324] 3.9.2-build-0063 #4 Not tainted [ 19.735412] --------------------------------- [ 19.735500] inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage. [ 19.735592] rmmod/1840 [HC0[0]:SC0[0]:HE1:SE1] takes: [ 19.735682] (&(&app->lock)->rlock#2){+.?...}, at: [<f862bb5b>] mrp_uninit_applicant+0x69/0xba [mrp] app->lock is normally taken under softirq context, so disable BH to avoid the splat. Reported-by: Denys Fedoryshchenko <denys@visp.net.lb> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: David Ward <david.ward@ll.mit.edu> Cc: Cong Wang <amwang@redhat.com> Tested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-14 13:02:30 -07:00
Eric Dumazet	54d27fcb33	tcp: fix tcp_md5_hash_skb_data() TCP md5 communications fail [1] for some devices, because sg/crypto code assume page offsets are below PAGE_SIZE. This was discovered using mlx4 driver [2], but I suspect loopback might trigger the same bug now we use order-3 pages in tcp_sendmsg() [1] Failure is giving following messages. huh, entered softirq 3 NET_RX ffffffff806ad230 preempt_count 00000100, exited with 00000101? [2] mlx4 driver uses order-2 pages to allocate RX frags Reported-by: Matt Schnall <mischnal@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Bernhard Beck <bbeck@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-14 11:32:04 -07:00
Linus Torvalds	dbbffe6898	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: "Several small bug fixes all over: 1) be2net driver uses wrong payload length when submitting MAC list get requests to the chip. From Sathya Perla. 2) Fix mwifiex memory leak on driver unload, from Amitkumar Karwar. 3) Prevent random memory access in batman-adv, from Marek Lindner. 4) batman-adv doesn't check for pskb_trim_rcsum() errors, also from Marek Lindner. 5) Fix fec crashes on rapid link up/down, from Frank Li. 6) Fix inner protocol grovelling in GSO, from Pravin B Shelar. 7) Link event validation fix in qlcnic from Rajesh Borundia. 8) Not all FEC chips can support checksum offload, fix from Shawn Guo. 9) EXPORT_SYMBOL + inline doesn't make any sense, from Denis Efremov. 10) Fix race in passthru mode during device removal in macvlan, from Jiri Pirko. 11) Fix RCU hash table lookup socket state race in ipv6, leading to NULL pointer derefs, from Eric Dumazet. 12) Add several missing HAS_DMA kconfig dependencies, from Geert Uyttterhoeven. 13) Fix bogus PCI resource management in 3c59x driver, from Sergei Shtylyov. 14) Fix info leak in ipv6 GRE tunnel driver, from Amerigo Wang. 15) Fix device leak in ipv6 IPSEC policy layer, from Cong Wang. 16) DMA mapping leak fix in qlge from Thadeu Lima de Souza Cascardo. 17) Missing iounmap on probe failure in bna driver, from Wei Yongjun." * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (40 commits) bna: add missing iounmap() on error in bnad_init() qlge: fix dma map leak when the last chunk is not allocated xfrm6: release dev before returning error ipv6,gre: do not leak info to user-space virtio_net: use default napi weight by default emac: Fix EMAC soft reset on 460EX/GT 3c59x: fix PCI resource management caif: CAIF_VIRTIO should depend on HAS_DMA net/ethernet: MACB should depend on HAS_DMA net/ethernet: ARM_AT91_ETHER should depend on HAS_DMA net/wireless: ATH9K should depend on HAS_DMA net/ethernet: STMMAC_ETH should depend on HAS_DMA net/ethernet: NET_CALXEDA_XGMAC should depend on HAS_DMA ipv6: do not clear pinet6 field macvlan: fix passthru mode race between dev removal and rx path ipv4: ip_output: remove inline marking of EXPORT_SYMBOL functions net/mlx4: Strengthen VLAN tags/priorities enforcement in VST mode net/mlx4_core: Add missing report on VST and spoof-checking dev caps net: fec: enable hardware checksum only on imx6q-fec qlcnic: Fix validation of link event command. ...	2013-05-13 13:25:36 -07:00
Alex Elder	c10ebbf55b	libceph: init sent and completed when starting The rbd code has a need to be able to restart an osd request that has already been started and completed once before. This currently wouldn't work right because the osd client code assumes an osd request will be started exactly once Certain fields in a request are never cleared and this leads to trouble if you try to reuse it. Specifically, the r_sent, r_got_reply, and r_completed fields are never cleared. The r_sent field records the osd incarnation at the time the request was sent to that osd. If that's non-zero, the message won't get re-mapped to a target osd properly, and won't be put on the unsafe requests list the first time it's sent as it should. The r_got_reply field is used in handle_reply() to ensure the reply to a request is processed only once. And the r_completed field is used for lingering requests to avoid calling the callback function every time the osd client re-sends the request on behalf of its initiator. Each osd request passes through ceph_osdc_start_request() when responsibility for the request is handed over to the osd client for completion. We can safely zero these three fields there each time a request gets started. One last related change--clear the r_linger flag when a request is no longer registered as a linger request. This resolves: http://tracker.ceph.com/issues/5026 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-13 12:52:23 -05:00
Dan Carpenter	625cdd78d1	svcauth_gss: fix error code in use_gss_proxy() This should return zero on success and -EBUSY on error so the type needs to be int instead of bool. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-05-12 14:56:30 -04:00
Colin Cross	2b15af6f95	af_unix: use freezable blocking calls in read Avoid waking up every thread sleeping in read call on an AF_UNIX socket during suspend and resume by calling a freezable blocking call. Previous patches modified the freezer to avoid sending wakeups to threads that are blocked in freezable blocking calls. This call was selected to be converted to a freezable call because it doesn't hold any locks or release any resources when interrupted that might be needed by another freezing task or a kernel driver during suspend, and is a common site where idle userspace tasks are blocked. Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Colin Cross <ccross@android.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-05-12 14:16:23 +02:00
Colin Cross	416ad3c9c0	freezer: add unsafe versions of freezable helpers for NFS NFS calls the freezable helpers with locks held, which is unsafe and will cause lockdep warnings when `6aa9707` "lockdep: check that no locks held at freeze time" is reapplied (it was reverted in `dbf520a`). NFS shouldn't be doing this, but it has long-running syscalls that must hold a lock but also shouldn't block suspend. Until NFS freeze handling is rewritten to use a signal to exit out of the critical section, add new *_unsafe versions of the helpers that will not run the lockdep test when `6aa9707` is reapplied, and call them from NFS. In practice the likley result of holding the lock while freezing is that a second task blocked on the lock will never freeze, aborting suspend, but it is possible to manufacture a case using the cgroup freezer, the lock, and the suspend freezer to create a deadlock. Silencing the lockdep warning here will allow problems to be found in other drivers that may have a more serious deadlock risk, and prevent new problems from being added. Signed-off-by: Colin Cross <ccross@android.com> Acked-by: Pavel Machek <pavel@ucw.cz> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-05-12 14:16:21 +02:00
Cong Wang	84c4a9dfbf	xfrm6: release dev before returning error We forget to call dev_put() on error path in xfrm6_fill_dst(), its caller doesn't handle this. Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-11 17:40:15 -07:00
Amerigo Wang	5dbd506843	ipv6,gre: do not leak info to user-space There is a hole in struct ip6_tnl_parm2, so we have to zero the struct on stack before copying it to user-space. Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-11 17:40:14 -07:00
Eric Dumazet	f77d602124	ipv6: do not clear pinet6 field We have seen multiple NULL dereferences in __inet6_lookup_established() After analysis, I found that inet6_sk() could be NULL while the check for sk_family == AF_INET6 was true. Bug was added in linux-2.6.29 when RCU lookups were introduced in UDP and TCP stacks. Once an IPv6 socket, using SLAB_DESTROY_BY_RCU is inserted in a hash table, we no longer can clear pinet6 field. This patch extends logic used in commit `fcbdf09d96` ("net: fix nulls list corruptions in sk_prot_alloc") TCP/UDP/UDPLite IPv6 protocols provide their own .clear_sk() method to make sure we do not clear pinet6 field. At socket clone phase, we do not really care, as cloning the parent (non NULL) pinet6 is not adding a fatal race. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-11 16:26:38 -07:00
David S. Miller	fb863b8bea	Included changes: - fix parsing of user typed protocol string to avoid random memory access in some cases - check pskb_trim_rcsum() return value - prevent DAT from sending ARP replies when not needed - reorder the main clean up routine to prevent race conditions -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (GNU/Linux) iQIcBAABCAAGBQJRi33dAAoJEADl0hg6qKeOdigP/iWWnwQ51RY6QXRskPKd0AuB 0KNXKES5O7xViSFPD35tZLhCYn+2XVYOuV0JvECIaBxYnB7ezNCPVldLFFJDVjyF jqtMVafdq3+nPEhoWcJRHzVtJMpOr9fjm/A+fA2dk3sxJFVYaHLKQ9KBRyFQVFgB LpH8llLYew2ND/TYiCJ7rUbZeYM++Ui1qIs+UYlYqbS6hBuK3UicwX/kr/PLwHsh Fyd4U3PQ/nfLVbLK65x5jTjLaKh/CnEhyuQ4F51zVrNZIuj/yqBuc94k6jGGDZfb lGnbCSTHCXfLJV7ykXirDmoFOZXPAhHu2eJ68y4GKb5P/iZwpAWvLRIo96eENehh TjRhNOBV0OVQfeUKZi9Y/cJUuGxQ2Wvic4uUo+If3LCi3A1rwPklBFy6Txjy0hjC jmFQJqsbHO+TF4abFPsw67ublzjrq2WScn8OgNDwMs7TXqEr5DGY5CojlRYJ9cV9 aUSZAXjf0YhCzQXyUIN23sJaxzC1IQ6rOoe5D1B9TuKxBs0hlw0o0C8LiOO2/SVU QS0koaNq9RvW23mTiJ/23WF7UVycFpPHj5lUc232FlzboVkT0R2kKhPzPCfLdS+C za3O8TFNsipUi6kf4HTOeZVn6zZw0qrVt15tTMrn+n1q6Sz3DC0bf2RrYLw7fh4Y xsJtTb/m9+iyX4q/YZJ4 =n0yL -----END PGP SIGNATURE----- Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge Included changes: - fix parsing of user typed protocol string to avoid random memory access in some cases - check pskb_trim_rcsum() return value - prevent DAT from sending ARP replies when not needed - reorder the main clean up routine to prevent race conditions Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-11 16:23:44 -07:00
Denis Efremov	2fbd967973	ipv4: ip_output: remove inline marking of EXPORT_SYMBOL functions EXPORT_SYMBOL and inline directives are contradictory to each other. The patch fixes this inconsistency. Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Denis Efremov <yefremov.denis@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-11 16:12:44 -07:00
Linus Torvalds	c4cc75c332	Merge git://git.infradead.org/users/eparis/audit Pull audit changes from Eric Paris: "Al used to send pull requests every couple of years but he told me to just start pushing them to you directly. Our touching outside of core audit code is pretty straight forward. A couple of interface changes which hit net/. A simple argument bug calling audit functions in namei.c and the removal of some assembly branch prediction code on ppc" * git://git.infradead.org/users/eparis/audit: (31 commits) audit: fix message spacing printing auid Revert "audit: move kaudit thread start from auditd registration to kaudit init" audit: vfs: fix audit_inode call in O_CREAT case of do_last audit: Make testing for a valid loginuid explicit. audit: fix event coverage of AUDIT_ANOM_LINK audit: use spin_lock in audit_receive_msg to process tty logging audit: do not needlessly take a lock in tty_audit_exit audit: do not needlessly take a spinlock in copy_signal audit: add an option to control logging of passwords with pam_tty_audit audit: use spin_lock_irqsave/restore in audit tty code helper for some session id stuff audit: use a consistent audit helper to log lsm information audit: push loginuid and sessionid processing down audit: stop pushing loginid, uid, sessionid as arguments audit: remove the old depricated kernel interface audit: make validity checking generic audit: allow checking the type of audit message in the user filter audit: fix build break when AUDIT_DEBUG == 2 audit: remove duplicate export of audit_enabled Audit: do not print error when LSMs disabled ...	2013-05-11 14:29:11 -07:00
Linus Torvalds	2dbd3cac87	Merge branch 'for-3.10' of git://linux-nfs.org/~bfields/linux Pull nfsd fixes from Bruce Fields: "Small fixes for two bugs and two warnings" * 'for-3.10' of git://linux-nfs.org/~bfields/linux: nfsd: fix oops when legacy_recdir_name_error is passed a -ENOENT error SUNRPC: fix decoding of optional gss-proxy xdr fields SUNRPC: Refactor gssx_dec_option_array() to kill uninitialized warning nfsd4: don't allow owner override on 4.1 CLAIM_FH opens	2013-05-10 09:28:55 -07:00
Linus Torvalds	8cbc95ee74	More NFS client bugfixes for 3.10 - Ensure that we match the 'sec=' mount flavour against the server list - Fix the NFSv4 byte range locking in the presence of delegations - Ensure that we conform to the NFSv4.1 spec w.r.t. freeing lock stateids - Fix a pNFS data server connection race -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.13 (GNU/Linux) iQIcBAABAgAGBQJRit1yAAoJEGcL54qWCgDyD9EQAKgb37dXhGt7OXBRBP4EY/T8 xJZ2tmdDZ6etLFJVftqCv05hBvyfilPLK0E9zg/zW/kvkKxYQ/fykvpzBR/+Q7KF quOmjDHLhDTXBnXzPg1HEoeTaXI2/a8CdjpxxEkthD4+FaKlyCXM+EFtA9orT9ZI oM+aNaqEzTjoQyryTFMcHxAvsrqjnZBa0MT6Fh45HaLaijV7CdDWoj6gjy6Lc3Al 4wHeT8QrZTp/NfIN16uykFZjeWwul4N9upu+CI2V8ZDMEit6JDYX4sl5tB41PzYW audDBcu0waSqoVQ2mJ5OHoYGZf0wopMUFaAst+tn0pQvwWUfTjD8XtO8uOgeMNoz 2S+XxUC2qhSMszwNBVSmwe2LtSAyHiw32Md4hqkLYDH2c7tk8bJPKDXZJACBzJS7 O1aMmOgWar8+nmzvmXFeU804SxBykV1V8UgtXWp5IwC36V0HAYnM5xtHwXBR7HWe lnuVHVdux7ySeAyrs2aMdKk7SAw5OC//WW8qoEF5USDEIljeoBzA+IYu9n91Hg2b ufnsyxumGJ6dZ0iU2nJVoLagRaZcm6kOhnxcegMpb9IH2+RLCQNef09lj2iklm2j mJA4o2lkVEHOswg/NwKn/I4ho8tbNNb8v//S5KiqrYhiiqZhOzu3RRtFeZi91iac P/g+hPzfuGnmwcoCEUSa =5zpc -----END PGP SIGNATURE----- Merge tag 'nfs-for-3.10-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull more NFS client bugfixes from Trond Myklebust: - Ensure that we match the 'sec=' mount flavour against the server list - Fix the NFSv4 byte range locking in the presence of delegations - Ensure that we conform to the NFSv4.1 spec w.r.t. freeing lock stateids - Fix a pNFS data server connection race * tag 'nfs-for-3.10-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFS4.1 Fix data server connection race NFSv3: match sec= flavor against server list NFSv4.1: Ensure that we free the lock stateid on the server NFSv4: Convert nfs41_free_stateid to use an asynchronous RPC call SUNRPC: Don't spam syslog with "Pseudoflavor not found" messages NFSv4.x: Fix handling of partially delegated locks	2013-05-09 10:24:54 -07:00
Antonio Quartulli	a436186035	batman-adv: reorder clean up routine in order to avoid race conditions nc_worker accesses the originator table during its periodic work, but since the originator table is freed before stopping the worker this leads to a global protection fault. Fix this by killing the worker (in nc_free) before freeing the originator table. Moreover tidy up the entire clean up routine by running all the subcomponents freeing procedures first and then killing the TT and the originator tables at the end. Signed-off-by: Antonio Quartulli <ordex@autistici.org> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>	2013-05-09 12:39:45 +02:00
Antonio Quartulli	88e48d7b33	batman-adv: make DAT drop ARP requests targeting local clients In the outgoing ARP request snooping routine in DAT, ARP Request sent by local clients which are supposed to be replied by other local clients can be silently dropped. The destination host will reply by itself through the LAN and therefore there is no need to involve DAT. Reported-by: Carlos Quijano <carlos@crqgestion.es> Signed-off-by: Antonio Quartulli <ordex@autistici.org> Tested-by: Carlos Quijano <carlos@crqgestion.es> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>	2013-05-09 12:39:45 +02:00
Marek Lindner	7da19971a9	batman-adv: check return value of pskb_trim_rcsum() Reported-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Acked-by: Martin Hundebøll <martin@hundeboll.net> Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2013-05-09 12:39:44 +02:00
Marek Lindner	293c9c1cef	batman-adv: check proto length before accessing proto string buffer batadv_param_set_ra() strips the trailing '\n' from the supplied string buffer without checking the length of the buffer first. This patches avoids random memory access and associated potential crashes. Reported-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2013-05-09 12:39:44 +02:00
Pravin B Shelar	19acc32725	gso: Handle Trans-Ether-Bridging protocol in skb_network_protocol() Rather than having logic to calculate inner protocol in every tunnel gso handler move it to gso code. This simplifies code. Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Cong Wang <amwang@redhat.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-08 13:13:30 -07:00
J. Bruce Fields	fb43f11c66	SUNRPC: fix decoding of optional gss-proxy xdr fields The current code works, but sort of by accident: it obviously didn't intend the error return to be interpreted as "true". Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-05-07 17:45:20 -04:00
Linus Torvalds	51a26ae7a1	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: "Just a small pile of fixes" 1) Fix race conditions in IP fragmentation LRU list handling, from Konstantin Khlebnikov. 2) vfree() is no longer verboten in interrupts, so deferring is pointless, from Al Viro. 3) Conversion from mutex to semaphore in netpoll left trylock test inverted, caught by Dan Carpenter. 4) 3c59x uses wrong base address when releasing regions, from Sergei Shtylyov. 5) Bounds checking in TIPC from Dan Carpenter. 6) Fastopen cookies should not be expired as aggressively as other TCP metrics. From Eric Dumazet. 7) Fix retrieval of MAC address in ibmveth, from Ben Herrenschmidt. 8) Don't use "u16" in virtio user headers, from Stephen Hemminger * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: tipc: potential divide by zero in tipc_link_recv_fragment() tipc: add a bounds check in link_recv_changeover_msg() net/usb: new driver for RTL8152 3c59x: fix freeing nonexistent resource on driver unload netpoll: inverted down_trylock() test rps_dev_flow_table_release(): no need to delay vfree() fib_trie: no need to delay vfree() net: frag, fix race conditions in LRU list maintenance tcp: do not expire TCP fastopen cookies net/eth/ibmveth: Fixup retrieval of MAC address virtio: don't expose u16 in userspace api	2013-05-06 15:51:10 -07:00
Dan Carpenter	6bf15191f6	tipc: potential divide by zero in tipc_link_recv_fragment() The worry here is that fragm_sz could be zero since it comes from skb->data. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-06 16:16:52 -04:00
Dan Carpenter	cb4b102f0a	tipc: add a bounds check in link_recv_changeover_msg() The bearer_id here comes from skb->data and it can be a number from 0 to 7. The problem is that the ->links[] array has only 2 elements so I have added a range check. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-06 16:16:52 -04:00
Linus Torvalds	91f8575685	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull Ceph changes from Alex Elder: "This is a big pull. Most of it is culmination of Alex's work to implement RBD image layering, which is now complete (yay!). There is also some work from Yan to fix i_mutex behavior surrounding writes in cephfs, a sync write fix, a fix for RBD images that get resized while they are mapped, and a few patches from me that resolve annoying auth warnings and fix several bugs in the ceph auth code." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (254 commits) rbd: fix image request leak on parent read libceph: use slab cache for osd client requests libceph: allocate ceph message data with a slab allocator libceph: allocate ceph messages with a slab allocator rbd: allocate image object names with a slab allocator rbd: allocate object requests with a slab allocator rbd: allocate name separate from obj_request rbd: allocate image requests with a slab allocator rbd: use binary search for snapshot lookup rbd: clear EXISTS flag if mapped snapshot disappears rbd: kill off the snapshot list rbd: define rbd_snap_size() and rbd_snap_features() rbd: use snap_id not index to look up snap info rbd: look up snapshot name in names buffer rbd: drop obj_request->version rbd: drop rbd_obj_method_sync() version parameter rbd: more version parameter removal rbd: get rid of some version parameters rbd: stop tracking header object version rbd: snap names are pointer to constant data ...	2013-05-06 13:11:19 -07:00
Dan Carpenter	a3dbbc2bab	netpoll: inverted down_trylock() test The return value is reversed from mutex_trylock(). Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-06 11:06:52 -04:00
Al Viro	243198d09f	rps_dev_flow_table_release(): no need to delay vfree() The same story as with fib_trie patch - vfree() from RCU callbacks is legitimate now. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-06 11:06:51 -04:00
Al Viro	0020356355	fib_trie: no need to delay vfree() Now that vfree() can be called from interrupt contexts, there's no need to play games with schedule_work() to escape calling vfree() from RCU callbacks. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-06 11:06:51 -04:00
Konstantin Khlebnikov	b56141ab34	net: frag, fix race conditions in LRU list maintenance This patch fixes race between inet_frag_lru_move() and inet_frag_lru_add() which was introduced in commit `3ef0eb0db4` ("net: frag, move LRU list maintenance outside of rwlock") One cpu already added new fragment queue into hash but not into LRU. Other cpu found it in hash and tries to move it to the end of LRU. This leads to NULL pointer dereference inside of list_move_tail(). Another possible race condition is between inet_frag_lru_move() and inet_frag_lru_del(): move can happens after deletion. This patch initializes LRU list head before adding fragment into hash and inet_frag_lru_move() doesn't touches it if it's empty. I saw this kernel oops two times in a couple of days. [119482.128853] BUG: unable to handle kernel NULL pointer dereference at (null) [119482.132693] IP: [<ffffffff812ede89>] __list_del_entry+0x29/0xd0 [119482.136456] PGD 2148f6067 PUD 215ab9067 PMD 0 [119482.140221] Oops: 0000 [#1] SMP [119482.144008] Modules linked in: vfat msdos fat 8021q fuse nfsd auth_rpcgss nfs_acl nfs lockd sunrpc ppp_async ppp_generic bridge slhc stp llc w83627ehf hwmon_vid snd_hda_codec_hdmi snd_hda_codec_realtek kvm_amd k10temp kvm snd_hda_intel snd_hda_codec edac_core radeon snd_hwdep ath9k snd_pcm ath9k_common snd_page_alloc ath9k_hw snd_timer snd soundcore drm_kms_helper ath ttm r8169 mii [119482.152692] CPU 3 [119482.152721] Pid: 20, comm: ksoftirqd/3 Not tainted 3.9.0-zurg-00001-g9f95269 #132 To Be Filled By O.E.M. To Be Filled By O.E.M./RS880D [119482.161478] RIP: 0010:[<ffffffff812ede89>] [<ffffffff812ede89>] __list_del_entry+0x29/0xd0 [119482.166004] RSP: 0018:ffff880216d5db58 EFLAGS: 00010207 [119482.170568] RAX: 0000000000000000 RBX: ffff88020882b9c0 RCX: dead000000200200 [119482.175189] RDX: 0000000000000000 RSI: 0000000000000880 RDI: ffff88020882ba00 [119482.179860] RBP: ffff880216d5db58 R08: ffffffff8155c7f0 R09: 0000000000000014 [119482.184570] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88020882ba00 [119482.189337] R13: ffffffff81c8d780 R14: ffff880204357f00 R15: 00000000000005a0 [119482.194140] FS: 00007f58124dc700(0000) GS:ffff88021fcc0000(0000) knlGS:0000000000000000 [119482.198928] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [119482.203711] CR2: 0000000000000000 CR3: 00000002155f0000 CR4: 00000000000007e0 [119482.208533] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [119482.213371] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [119482.218221] Process ksoftirqd/3 (pid: 20, threadinfo ffff880216d5c000, task ffff880216d3a9a0) [119482.223113] Stack: [119482.228004] ffff880216d5dbd8 ffffffff8155dcda 0000000000000000 ffff000200000001 [119482.233038] ffff8802153c1f00 ffff880000289440 ffff880200000014 ffff88007bc72000 [119482.238083] 00000000000079d5 ffff88007bc72f44 ffffffff00000002 ffff880204357f00 [119482.243090] Call Trace: [119482.248009] [<ffffffff8155dcda>] ip_defrag+0x8fa/0xd10 [119482.252921] [<ffffffff815a8013>] ipv4_conntrack_defrag+0x83/0xe0 [119482.257803] [<ffffffff8154485b>] nf_iterate+0x8b/0xa0 [119482.262658] [<ffffffff8155c7f0>] ? inet_del_offload+0x40/0x40 [119482.267527] [<ffffffff815448e4>] nf_hook_slow+0x74/0x130 [119482.272412] [<ffffffff8155c7f0>] ? inet_del_offload+0x40/0x40 [119482.277302] [<ffffffff8155d068>] ip_rcv+0x268/0x320 [119482.282147] [<ffffffff81519992>] __netif_receive_skb_core+0x612/0x7e0 [119482.286998] [<ffffffff81519b78>] __netif_receive_skb+0x18/0x60 [119482.291826] [<ffffffff8151a650>] process_backlog+0xa0/0x160 [119482.296648] [<ffffffff81519f29>] net_rx_action+0x139/0x220 [119482.301403] [<ffffffff81053707>] __do_softirq+0xe7/0x220 [119482.306103] [<ffffffff81053868>] run_ksoftirqd+0x28/0x40 [119482.310809] [<ffffffff81074f5f>] smpboot_thread_fn+0xff/0x1a0 [119482.315515] [<ffffffff81074e60>] ? lg_local_lock_cpu+0x40/0x40 [119482.320219] [<ffffffff8106d870>] kthread+0xc0/0xd0 [119482.324858] [<ffffffff8106d7b0>] ? insert_kthread_work+0x40/0x40 [119482.329460] [<ffffffff816c32dc>] ret_from_fork+0x7c/0xb0 [119482.334057] [<ffffffff8106d7b0>] ? insert_kthread_work+0x40/0x40 [119482.338661] Code: 00 00 55 48 8b 17 48 b9 00 01 10 00 00 00 ad de 48 8b 47 08 48 89 e5 48 39 ca 74 29 48 b9 00 02 20 00 00 00 ad de 48 39 c8 74 7a <4c> 8b 00 4c 39 c7 75 53 4c 8b 42 08 4c 39 c7 75 2b 48 89 42 08 [119482.343787] RIP [<ffffffff812ede89>] __list_del_entry+0x29/0xd0 [119482.348675] RSP <ffff880216d5db58> [119482.353493] CR2: 0000000000000000 Oops happened on this path: ip_defrag() -> ip_frag_queue() -> inet_frag_lru_move() -> list_move_tail() -> __list_del_entry() Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Cc: Florian Westphal <fw@strlen.de> Cc: Eric Dumazet <edumazet@google.com> Cc: David S. Miller <davem@davemloft.net> Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-06 11:06:51 -04:00
Geert Uytterhoeven	9fd40c5a66	SUNRPC: Refactor gssx_dec_option_array() to kill uninitialized warning net/sunrpc/auth_gss/gss_rpc_xdr.c: In function ‘gssx_dec_option_array’: net/sunrpc/auth_gss/gss_rpc_xdr.c:258: warning: ‘creds’ may be used uninitialized in this function Return early if count is zero, to make it clearer to the compiler (and the casual reviewer) that no more processing is done. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-05-06 08:54:06 -04:00
Pablo Neira Ayuso	e778f56e2f	netfilter: nf_{log,queue}: fix compilation without CONFIG_PROC_FS This patch fixes the following compilation error: net/netfilter/nf_log.c:373:38: error: 'struct netns_nf' has no member named 'proc_netfilter' if procfs is not set. The netns support for nf_log, nfnetlink_log and nfnetlink_queue_core requires CONFIG_PROC_FS in the removal path of their respective /proc interface since net->nf.proc_netfilter is undefined in that case. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Gao feng <gaofeng@cn.fujitsu.com>	2013-05-06 12:28:01 +02:00
Eric Dumazet	efeaa5550e	tcp: do not expire TCP fastopen cookies TCP metric cache expires entries after one hour. This probably make sense for TCP RTT/RTTVAR/CWND, but not for TCP fastopen cookies. Its better to try previous cookie. If it appears to be obsolete, server will send us new cookie anyway. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-05 16:58:02 -04:00
Linus Torvalds	1aaf6d3d3d	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Several routines do not use netdev_features_t to hold such bitmasks, fixes from Patrick McHardy and Bjørn Mork. 2) Update cpsw IRQ software state and the actual HW irq enabling in the correct order. From Mugunthan V N. 3) When sending tipc packets to multiple bearers, we have to make copies of the SKB rather than just giving the original SKB directly. Fix from Gerlando Falauto. 4) Fix race with bridging topology change timer, from Stephen Hemminger. 5) Fix TCPv6 segmentation handling in GRE and VXLAN, from Pravin B Shelar. 6) Endian bug in USB pegasus driver, from Dan Carpenter. 7) Fix crashes on MTU reduction in USB asix driver, from Holger Eitzenberger. 8) Don't allow the kernel to BUG() just because the user puts some crap in an AF_PACKET mmap() ring descriptor. Fix from Daniel Borkmann. 9) Don't use variable sized arrays on the stack in xen-netback, from Wei Liu. 10) Fix stats reporting and an unbalanced napi_disable() in be2net driver. From Somnath Kotur and Ajit Khaparde. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (25 commits) cxgb4: fix error recovery when t4_fw_hello returns a positive value sky2: Fix crash on receiving VLAN frames packet: tpacket_v3: do not trigger bug() on wrong header status asix: fix BUG in receive path when lowering MTU net: qmi_wwan: Add Telewell TW-LTE 4G usbnet: pegasus: endian bug in write_mii_word() vxlan: Fix TCPv6 segmentation. gre: Fix GREv4 TCPv6 segmentation. bridge: fix race with topology change timer tipc: pskb_copy() buffers when sending on more than one bearer tipc: tipc_bcbearer_send(): simplify bearer selection tipc: cosmetic: clean up comments and break a long line drivers: net: cpsw: irq not disabled in cpsw isr in particular sequence xen-netback: better names for thresholds xen-netback: avoid allocating variable size array on stack xen-netback: remove redundent parameter in netbk_count_requests be2net: Fix to fail probe if MSI-X enable fails for a VF be2net: avoid napi_disable() when it has not been enabled be2net: Fix firmware download for Lancer be2net: Fix to receive Multicast Packets when Promiscuous mode is enabled on certain devices ...	2013-05-04 20:10:04 -07:00
Daniel Borkmann	8da3056c04	packet: tpacket_v3: do not trigger bug() on wrong header status Jakub reported that it is fairly easy to trigger the BUG() macro from user space with TPACKET_V3's RX_RING by just giving a wrong header status flag. We already had a similar situation in commit `7f5c3e3a80` (``af_packet: remove BUG statement in tpacket_destruct_skb'') where this was the case in the TX_RING side that could be triggered from user space. So really, don't use BUG() or BUG_ON() unless there's really no way out, and i.e. don't use it for consistency checking when there's user space involved, no excuses, especially not if you're slapping the user with WARN + dump_stack + BUG all at once. The two functions are of concern: prb_retire_current_block() [when block status != TP_STATUS_KERNEL] prb_open_block() [when block_status != TP_STATUS_KERNEL] Calls to prb_open_block() are guarded by ealier checks if block_status is really TP_STATUS_KERNEL (racy!), but the first one BUG() is easily triggable from user space. System behaves still stable after they are removed. Also remove that yoda condition entirely, since it's already guarded. Reported-by: Jakub Zawadzki <darkjames-ws@darkjames.pl> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-03 16:10:33 -04:00
Pravin B Shelar	0d05535d41	vxlan: Fix TCPv6 segmentation. This patch set correct skb->protocol so that inner packet can lookup correct gso handler. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-03 16:08:59 -04:00

... 5 6 7 8 9 ...

28821 Commits (0a8d3e2412841c6b1dab1006fd5f7ab5b689db21)