Commit graph

32276 commits

Author SHA1 Message Date
Florian Westphal 177943260a 6lowpan: reassembly: un-export local functions
most of these are only used locally, make them static.
fold lowpan_expire_frag_queue into its caller, its small enough.

Cc: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-13 16:10:11 -04:00
Heiner Kallweit ecab67015e ipv6: Avoid unnecessary temporary addresses being generated
tmp_prefered_lft is an offset to ifp->tstamp, not now. Therefore
age needs to be added to the condition.

Age calculation in ipv6_create_tempaddr is different from the one
in addrconf_verify and doesn't consider ADDRCONF_TIMER_FUZZ_MINUS.
This can cause age in ipv6_create_tempaddr to be less than the one
in addrconf_verify and therefore unnecessary temporary address to
be generated.
Use age calculation as in addrconf_modify to avoid this.

Signed-off-by: Heiner Kallweit <heiner.kallweit@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-13 15:49:14 -04:00
Yang Yingliang d59b7d8059 net_sched: return nla_nest_end() instead of skb->len
nla_nest_end() already has return skb->len, so replace
return skb->len with return nla_nest_end instead().

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-13 15:39:20 -04:00
Jan Beulich f9708b4302 consolidate duplicate code is skb_checksum_setup() helpers
consolidate duplicate code is skb_checksum_setup() helpers

Realizing that the skb_maybe_pull_tail() calls in the IP-protocol
specific portions of both helpers are terminal ones (i.e. no further
pulls are expected), their maximum size to be pulled can be made match
their minimal size needed, thus making the code identical and hence
possible to be moved into another helper.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Paul Durrant <paul.durrant@citrix.com>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-13 15:16:29 -04:00
John W. Linville 42775a34d2 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem
Conflicts:
	drivers/net/wireless/ath/ath9k/recv.c
2014-03-13 14:21:43 -04:00
Daniel Borkmann 433131ba03 net: sctp: remove NULL check in sctp_assoc_update_retran_path
This is basically just to let Coverity et al shut up. Remove an
unneeded NULL check in sctp_assoc_update_retran_path().

It is safe to remove it, because in sctp_assoc_update_retran_path()
we iterate over the list of transports, our own transport which is
asoc->peer.retran_path included. In the iteration, we skip the
list head element and transports in state SCTP_UNCONFIRMED.

Such transports came from peer addresses received in INIT/INIT-ACK
address parameters. They are not yet confirmed by a heartbeat and
not available for data transfers.

We know however that in the list of transports, even if it contains
such elements, it at least contains our asoc->peer.retran_path as
well, so even if next to that element, we only encounter
SCTP_UNCONFIRMED transports, we are always going to fall back to
asoc->peer.retran_path through sctp_trans_elect_best(), as that is
for sure not SCTP_UNCONFIRMED as per fbdf501c93 ("sctp: Do no
select unconfirmed transports for retransmissions").

Whenever we call sctp_trans_elect_best() it will give us a non-NULL
element back, and therefore when we break out of the loop, we are
guaranteed to have a non-NULL transport pointer, and can remove
the NULL check.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-13 13:23:52 -04:00
Arnd Bergmann 52d3ef5c25 Bluetooth: make sure 6LOWPAN_IPHC is built-in if needed
Commit 975508879 "Bluetooth: make bluetooth 6lowpan as an option"
ensures that 6LOWPAN_IPHC is turned on when we have BT_6LOWPAN
enabled in Kconfig, but it allows building the IPHC code as
a loadable module even if the entire Bluetooth stack is built-in,
and that causes a link error.

We can solve that by moving the 'select' statement into CONFIG_BT,
which is a "tristate" option to enforce that 6LOWPAN_IPHC can
only be a module if BT also is a module.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-03-13 07:05:10 -07:00
Joe Perches b80edf0b52 netfilter: Convert uses of __constant_<foo> to <foo>
The use of __constant_<foo> has been unnecessary for quite awhile now.

Make these uses consistent with the rest of the kernel.

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-03-13 14:13:19 +01:00
Matthew Leach dbb490b965 net: socket: error on a negative msg_namelen
When copying in a struct msghdr from the user, if the user has set the
msg_namelen parameter to a negative value it gets clamped to a valid
size due to a comparison between signed and unsigned values.

Ensure the syscall errors when the user passes in a negative value.

Signed-off-by: Matthew Leach <matthew.leach@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 16:29:24 -04:00
Jon Paul Maloy 5c311421a2 tipc: eliminate redundant lookups in registry
As an artefact from the native interface, the message sending functions
in the port takes a port ref as first parameter, and then looks up in
the registry to find the corresponding port pointer. This despite the
fact that the only currently existing caller, tipc_sock, already knows
this pointer.

We change the signature of these functions to take a struct tipc_port*
argument, and remove the redundant lookups.

We also remove an unmotivated extra lookup in the function
socket.c:auto_connect(), and, as the lookup functions tipc_port_deref()
and ref_deref() now become unused, we remove these two functions.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 15:53:49 -04:00
Jon Paul Maloy 58ed944241 tipc: align usage of variable names and macros in socket
The practice of naming variables in TIPC is inconistent, sometimes
even within the same file.

In this commit we align variable names and declarations within
socket.c, and function and macro names within socket.h. We also
reduce the number of conversion macros to two, in order to make
usage less obsure.

These changes are purely cosmetic.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 15:53:49 -04:00
Jon Paul Maloy 3b4f302d85 tipc: eliminate redundant locking
The three functions tipc_portimportance(), tipc_portunreliable() and
tipc_portunreturnable() and their corresponding tipc_set* functions,
are all grabbing port_lock when accessing the targeted port. This is
unnecessary in the current code, since these calls only are made from
within socket downcalls, already protected by sock_lock.

We remove the redundant locking. Also, since the functions now become
trivial one-liners, we move them to port.h and make them inline.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 15:53:49 -04:00
Jon Paul Maloy 24be34b5a0 tipc: eliminate upcall function pointers between port and socket
Due to the original one-to-many relation between port and user API
layers, upcalls to the API have been performed via function pointers,
installed in struct tipc_port at creation. Since this relation now
always is one-to-one, we can instead use ordinary function calls.

We remove the function pointers 'dispatcher' and ´wakeup' from
struct tipc_port, and replace them with calls to the renamed
functions tipc_sk_rcv() and tipc_sk_wakeup().

At the same time we change the name and signature of the functions
tipc_createport() and tipc_deleteport() to reflect their new role
as mere initialization/destruction functions.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 15:53:49 -04:00
Jon Paul Maloy 8826cde655 tipc: aggregate port structure into socket structure
After the removal of the tipc native API the relation between
a tipc_port and its API types is strictly one-to-one, i.e, the
latter can now only be a socket API. There is therefore no need
to allocate struct tipc_port and struct sock independently.

In this commit, we aggregate struct tipc_port into struct tipc_sock,
hence saving both CPU cycles and structure complexity.

There are no functional changes in this commit, except for the
elimination of the separate allocation/freeing of tipc_port.
All other changes are just adaptatons to the new data structure.

This commit also opens up for further code simplifications and
code volume reduction, something we will do in later commits.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 15:53:49 -04:00
Jon Paul Maloy f9fef18c6d tipc: remove redundant 'peer_name' field in struct tipc_sock
The field 'peer_name' in struct tipc_sock is redundant, since
this information already is available from tipc_port, to which
tipc_sock has a reference.

We remove the field, and ensure that peer node and peer port
info instead is fetched via the functions that already exist
for this purpose.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 15:53:49 -04:00
Jon Paul Maloy 978813ee89 tipc: replace reference table rwlock with spinlock
The lock for protecting the reference table is declared as an
RWLOCK, although it is only used in write mode, never in read
mode.

We redefine it to become a spinlock.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 15:53:49 -04:00
Steffen Klassert 4a93f5095a flowcache: Fix resource leaks on namespace exit.
We leak an active timer, the hotcpu notifier and all allocated
resources when we exit a namespace. Fix this by introducing a
flow_cache_fini() function where we release the resources before
we exit.

Fixes: ca925cf153 ("flowcache: Make flow cache name space aware")
Reported-by: Jakub Kicinski <moorray3@wp.pl>
Tested-by: Jakub Kicinski <moorray3@wp.pl>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Fan Du <fan.du@windriver.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 15:31:18 -04:00
Joe Perches 184593c734 tipc: Convert uses of __constant_<foo> to <foo>
The use of __constant_<foo> has been unnecessary for quite awhile now.

Make these uses consistent with the rest of the kernel.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 15:28:06 -04:00
Joe Perches ec633eb5ff ieee802154: Convert uses of __constant_<foo> to <foo>
The use of __constant_<foo> has been unnecessary for quite awhile now.

Make these uses consistent with the rest of the kernel.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 15:28:06 -04:00
Joe Perches 2b8837aeaa net: Convert uses of __constant_<foo> to <foo>
The use of __constant_<foo> has been unnecessary for quite awhile now.

Make these uses consistent with the rest of the kernel.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 15:28:06 -04:00
Joe Perches f0e78826e4 8021q: Convert uses of __constant_<foo> to <foo>
The use of __constant_<foo> has been unnecessary for quite awhile now.

Make these uses consistent with the rest of the kernel.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 15:28:06 -04:00
Joe Perches dcf4adbfdc Bluetooth: Convert uses of __constant_<foo> to <foo>
The use of __constant_<foo> has been unnecessary for quite awhile now.

Make these uses consistent with the rest of the kernel.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-03-12 11:10:17 -07:00
Florian Westphal 14e1a97776 netfilter: connlimit: use kmem_cache for conn objects
We might allocate thousands of these (one object per connection).
Use distinct kmem cache to permit simplte tracking on how many
objects are currently used by the connlimit match via the sysfs.

Reviewed-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-03-12 13:55:03 +01:00
Florian Westphal 3bcc5fdf1b netfilter: connlimit: move insertion of new element out of count function
Allows easier code-reuse in followup patches.

Reviewed-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-03-12 13:55:02 +01:00
Florian Westphal d9ec4f1ee2 netfilter: connlimit: improve packet-to-closed-connection logic
Instead of freeing the entry from our list and then adding
it back again in the 'packet to closing connection' case just keep the
matching entry around.  Also drop the found_ct != NULL test as
nf_ct_tuplehash_to_ctrack is just container_of().

Reviewed-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-03-12 13:55:01 +01:00
Florian Westphal 15cfd52895 netfilter: connlimit: factor hlist search into new function
Simplifies followup patch that introduces separate locks for each of
the hash slots.

Reviewed-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-03-12 13:55:01 +01:00
Eric Dumazet fba373d2bb pkt_sched: add cond_resched() to class and qdisc dump
We have seen delays of more than 50ms in class or qdisc dumps, in case
device is under high TX stress, even with the prior 4KB per skb limit.

Add cond_resched() to give a chance to higher prio tasks to get cpu.

Signed-off-by; Eric Dumazet <edumazet@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-11 23:54:23 -04:00
Eric Dumazet 15dc36ebbb pkt_sched: do not use rcu in tc_dump_qdisc()
Like all rtnetlink dump operations, we hold RTNL in tc_dump_qdisc(),
so we do not need to use rcu protection to protect list of netdevices.

This will allow preemption to occur, thus reducing latencies.
Following patch adds explicit cond_resched() calls.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-11 23:54:23 -04:00
Linus Lüssing 20a599bec9 bridge: multicast: enable snooping on general queries only
Without this check someone could easily create a denial of service
by injecting multicast-specific queries to enable the bridge
snooping part if no real querier issuing periodic general queries
is present on the link which would result in the bridge wrongly
shutting down ports for multicast traffic as the bridge did not learn
about these listeners.

With this patch the snooping code is enabled upon receiving valid,
general queries only.

Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-11 23:22:10 -04:00
Linus Lüssing 9ed973cc40 bridge: multicast: add sanity check for general query destination
General IGMP and MLD queries are supposed to have the multicast
link-local all-nodes address as their destination according to RFC2236
section 9, RFC3376 section 4.1.12/9.1, RFC2710 section 8 and RFC3810
section 5.1.15.

Without this check, such malformed IGMP/MLD queries can result in a
denial of service: The queries are ignored by most IGMP/MLD listeners
therefore they will not respond with an IGMP/MLD report. However,
without this patch these malformed MLD queries would enable the
snooping part in the bridge code, potentially shutting down the
according ports towards these hosts for multicast traffic as the
bridge did not learn about these listeners.

Reported-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-11 23:22:10 -04:00
Eric Dumazet c3f9b01849 tcp: tcp_release_cb() should release socket ownership
Lars Persson reported following deadlock :

-000 |M:0x0:0x802B6AF8(asm) <-- arch_spin_lock
-001 |tcp_v4_rcv(skb = 0x8BD527A0) <-- sk = 0x8BE6B2A0
-002 |ip_local_deliver_finish(skb = 0x8BD527A0)
-003 |__netif_receive_skb_core(skb = 0x8BD527A0, ?)
-004 |netif_receive_skb(skb = 0x8BD527A0)
-005 |elk_poll(napi = 0x8C770500, budget = 64)
-006 |net_rx_action(?)
-007 |__do_softirq()
-008 |do_softirq()
-009 |local_bh_enable()
-010 |tcp_rcv_established(sk = 0x8BE6B2A0, skb = 0x87D3A9E0, th = 0x814EBE14, ?)
-011 |tcp_v4_do_rcv(sk = 0x8BE6B2A0, skb = 0x87D3A9E0)
-012 |tcp_delack_timer_handler(sk = 0x8BE6B2A0)
-013 |tcp_release_cb(sk = 0x8BE6B2A0)
-014 |release_sock(sk = 0x8BE6B2A0)
-015 |tcp_sendmsg(?, sk = 0x8BE6B2A0, ?, ?)
-016 |sock_sendmsg(sock = 0x8518C4C0, msg = 0x87D8DAA8, size = 4096)
-017 |kernel_sendmsg(?, ?, ?, ?, size = 4096)
-018 |smb_send_kvec()
-019 |smb_send_rqst(server = 0x87C4D400, rqst = 0x87D8DBA0)
-020 |cifs_call_async()
-021 |cifs_async_writev(wdata = 0x87FD6580)
-022 |cifs_writepages(mapping = 0x852096E4, wbc = 0x87D8DC88)
-023 |__writeback_single_inode(inode = 0x852095D0, wbc = 0x87D8DC88)
-024 |writeback_sb_inodes(sb = 0x87D6D800, wb = 0x87E4A9C0, work = 0x87D8DD88)
-025 |__writeback_inodes_wb(wb = 0x87E4A9C0, work = 0x87D8DD88)
-026 |wb_writeback(wb = 0x87E4A9C0, work = 0x87D8DD88)
-027 |wb_do_writeback(wb = 0x87E4A9C0, force_wait = 0)
-028 |bdi_writeback_workfn(work = 0x87E4A9CC)
-029 |process_one_work(worker = 0x8B045880, work = 0x87E4A9CC)
-030 |worker_thread(__worker = 0x8B045880)
-031 |kthread(_create = 0x87CADD90)
-032 |ret_from_kernel_thread(asm)

Bug occurs because __tcp_checksum_complete_user() enables BH, assuming
it is running from softirq context.

Lars trace involved a NIC without RX checksum support but other points
are problematic as well, like the prequeue stuff.

Problem is triggered by a timer, that found socket being owned by user.

tcp_release_cb() should call tcp_write_timer_handler() or
tcp_delack_timer_handler() in the appropriate context :

BH disabled and socket lock held, but 'owned' field cleared,
as if they were running from timer handlers.

Fixes: 6f458dfb40 ("tcp: improve latencies of timer triggered events")
Reported-by: Lars Persson <lars.persson@axis.com>
Tested-by: Lars Persson <lars.persson@axis.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-11 16:45:59 -04:00
Michael S. Tsirkin 1fd819ecb9 skbuff: skb_segment: orphan frags before copying
skb_segment copies frags around, so we need
to copy them carefully to avoid accessing
user memory after reporting completion to userspace
through a callback.

skb_segment doesn't normally happen on datapath:
TSO needs to be disabled - so disabling zero copy
in this case does not look like a big deal.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-11 16:26:38 -04:00
Michael S. Tsirkin 1a4cedaf65 skbuff: skb_segment: s/fskb/list_skb/
fskb is unrelated to frag: it's coming from
frag_list. Rename it list_skb to avoid confusion.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-11 16:26:38 -04:00
Michael S. Tsirkin df5771ffef skbuff: skb_segment: s/skb/head_skb/
rename local variable to make it easier to tell at a glance that we are
dealing with a head skb.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-11 16:26:38 -04:00
Michael S. Tsirkin 4e1beba12d skbuff: skb_segment: s/skb_frag/frag/
skb_frag can in fact point at either skb
or fskb so rename it generally "frag".

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-11 16:26:38 -04:00
Michael S. Tsirkin 8cb19905e9 skbuff: skb_segment: s/frag/nskb_frag/
frag points at nskb, so name it appropriately

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-11 16:26:38 -04:00
Andre Guedes 4340a124de Bluetooth: Enable duplicates filter in background scan
To avoid flooding the host with useless advertising reports during
background scan, we enable the duplicates filter from controller.

However, enabling duplicates filter requires a small change in
background scan routine in order to fix the following scenario:
  1) Background scan is running.
  2) A device disconnects and starts advertising.
  3) Before host gets the disconnect event, the advertising is reported
     to host. Since there is no pending LE connection at that time,
     nothing happens.
  4) Host gets the disconnection event and adds a pending connection.
  5) No advertising is reported (since controller is filtering) and the
     connection is never established.

So, to address this scenario, we should always restart background scan
to unsure we don't miss any advertising report (due to duplicates
filter).

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-03-11 12:59:04 -07:00
Andrew Earl 27539bc441 Bluetooth: Fix aborting eSCO connection in case of error 0x20
Add additional error case to attempt alternative configuration for SCO. Error
occurs with Intel BT controller where fallback is not attempted as the error
0x20 Unsupported LMP Parameter value is not included in the list of errors
where a retry should be attempted.
The problem also affects PTS test case TC_HF_ACS_BV_05_I.

See the HCI log below for details:
< HCI Command: Setup Synchronous Connection (0x01|0x0028) plen 17
    handle 256 voice setting 0x0060 ptype 0x0380
> HCI Event: Command Status (0x0f) plen 4
    Setup Synchronous Connection (0x01|0x0028) status 0x00 ncmd 1
> HCI Event: Max Slots Change (0x1b) plen 3
    handle 256 slots 1
> HCI Event: Synchronous Connect Complete (0x2c) plen 17
    status 0x20 handle 0 bdaddr 00:80:98:09:0B:19 type eSCO
    Error: Unsupported LMP Parameter Value
< HCI Command: Setup Synchronous Connection (0x01|0x0028) plen 17
    handle 256 voice setting 0x0060 ptype 0x0380
> HCI Event: Command Status (0x0f) plen 4
    Setup Synchronous Connection (0x01|0x0028) status 0x00 ncmd 1
> HCI Event: Max Slots Change (0x1b) plen 3
    handle 256 slots 5
> HCI Event: Synchronous Connect Complete (0x2c) plen 17
    status 0x20 handle 0 bdaddr 00:80:98:09:0B:19 type eSCO
    Error: Unsupported LMP Parameter Value
< HCI Command: Setup Synchronous Connection (0x01|0x0028) plen 17
    handle 256 voice setting 0x0060 ptype 0x03c8
> HCI Event: Command Status (0x0f) plen 4
    Setup Synchronous Connection (0x01|0x0028) status 0x00 ncmd 1
> HCI Event: Max Slots Change (0x1b) plen 3
    handle 256 slots 1
> HCI Event: Synchronous Connect Complete (0x2c) plen 17
    status 0x00 handle 257 bdaddr 00:80:98:09:0B:19 type eSCO
    Air mode: CVSD

See btmon log for further details:
> HCI Event (0x0f) plen 4 [hci0] 44.888063
      Setup Synchronous Connection (0x01|0x0028) ncmd 1
        Status: Success (0x00)
> HCI Event (0x1b) plen 3 [hci0] 44.893064
        Handle: 256
        Max slots: 1
> HCI Event (0x2c) plen 17 [hci0] 44.942080
        Status: Unsupported LMP Parameter Value (0x20)
        Handle: 0
        Address: 00:1B:DC:06:04:B0 (OUI 00-1B-DC)
        Link type: eSCO (0x02)
        Transmission interval: 0x00
        Retransmission window: 0x01
        RX packet length: 0
        TX packet length: 0
        Air mode: CVSD (0x02)
> HCI Event (0x1b) plen 3 [hci0] 44.948054
        Handle: 256
        Max slots: 5

Signed-off-by: Andrew Earl <andrewx.earl@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-03-11 08:16:45 -07:00
Alexander Aring 9755088797 Bluetooth: make bluetooth 6lowpan as an option
Currently you can have bluetooth 6lowpan without ipv6 enabled. This
doesn't make any sense. With this patch you can disable/enable bluetooth
6lowpan support at compile time.

The current bluetooth 6lowpan implementation doesn't check the return
value of 6lowpan function. Nevertheless I added -EOPNOTSUPP as return value
if 6lowpan bluetooth is disabled.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-03-11 07:54:55 -07:00
Li RongQing 090f1166c6 ipv6: ip6_forward: perform skb->pkt_type check at the beginning
Packets which have L2 address different from ours should be
already filtered before entering into ip6_forward().

Perform that check at the beginning to avoid processing such packets.

Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-11 00:37:42 -04:00
Peter Boström dd38743b4c vlan: Set correct source MAC address with TX VLAN offload enabled
With TX VLAN offload enabled the source MAC address for frames sent using the
VLAN interface is currently set to the address of the real interface. This is
wrong since the VLAN interface may be configured with a different address.

The bug was introduced in commit 2205369a31
("vlan: Fix header ops passthru when doing TX VLAN offload.").

This patch sets the source address before calling the create function of the
real interface.

Signed-off-by: Peter Boström <peter.bostrom@netrounds.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10 22:21:51 -04:00
Eric Dumazet d32d9bb85c flowcache: restore a single flow_cache kmem_cache
It is not legal to create multiple kmem_cache having the same name.

flowcache can use a single kmem_cache, no need for a per netns
one.

Fixes: ca925cf153 ("flowcache: Make flow cache name space aware")
Reported-by: Jakub Kicinski <moorray3@wp.pl>
Tested-by: Jakub Kicinski <moorray3@wp.pl>
Tested-by: Fan Du <fan.du@windriver.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10 21:45:11 -04:00
Mark A. Greer ceeee42d85 NFC: digital: Rename Type V tags to Type 5 tags
According to the latest draft specification from
the NFC-V committee, ISO/IEC 15693 tags will be
referred to as "Type 5" tags and not "Type V"
tags anymore.  Make the code reflect the new
terminology.

Signed-off-by: Mark A. Greer <mgreer@animalcreek.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2014-03-11 00:40:59 +01:00
Eric Dumazet 2818fa0fa0 pkt_sched: fq: do not hold qdisc lock while allocating memory
Resizing fq hash table allocates memory while holding qdisc spinlock,
with BH disabled.

This is definitely not good, as allocation might sleep.

We can drop the lock and get it when needed, we hold RTNL so no other
changes can happen at the same time.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes: afe4fd0624 ("pkt_sched: fq: Fair Queue packet scheduler")
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10 16:17:52 -04:00
Eric Dumazet 431a91242d tcp: timestamp SYN+DATA messages
All skb in socket write queue should be properly timestamped.

In case of FastOpen, we special case the SYN+DATA 'message' as we
queue in socket wrote queue the two fallback skbs:

1) SYN message by itself.
2) DATA segment by itself.

We should make sure these skbs have proper timestamps.

Add a WARN_ON_ONCE() to eventually catch future violations.

Fixes: 740b0f1841 ("tcp: switch rtt estimations to usec resolution")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10 16:15:54 -04:00
Alexander Aring 3772ab1d37 6lowpan: reassembly: fix access of ctl table entry
Correct offset is 3 of the 6lowpanfrag_max_datagram_size value in proc
entry ctl table and not 2.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10 16:03:03 -04:00
Linus Torvalds e6a4b6f5ea Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs fixes from Al Viro.

Clean up file table accesses (get rid of fget_light() in favor of the
fdget() interface), add proper file position locking.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  get rid of fget_light()
  sockfd_lookup_light(): switch to fdget^W^Waway from fget_light
  vfs: atomic f_pos accesses as per POSIX
  ocfs2 syncs the wrong range...
2014-03-10 12:57:26 -07:00
Eric Dumazet 37314363cd pkt_sched: move the sanity test in qdisc_list_add()
The WARN_ON(root == &noop_qdisc)) added in qdisc_list_add()
can trigger in normal conditions when devices are not up.
It should be done only right before the list_add_tail() call.

Fixes: e57a784d8c ("pkt_sched: set root qdisc before change() in attach_default_qdiscs()")
Reported-by: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Tested-by: Mirco Tischler <mt-ml@gmx.de>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10 15:44:21 -04:00
Eric Dumazet 746e349980 l2tp: fix unused variable warning
net/l2tp/l2tp_core.c:1111:15: warning: unused variable
'sk' [-Wunused-variable]

Fixes: 31c70d5956 ("l2tp: keep original skb ownership")
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10 15:32:24 -04:00
Eric Dumazet 9063e21fb0 netlink: autosize skb lengthes
One known problem with netlink is the fact that NLMSG_GOODSIZE is
really small on PAGE_SIZE==4096 architectures, and it is difficult
to know in advance what buffer size is used by the application.

This patch adds an automatic learning of the size.

First netlink message will still be limited to ~4K, but if user used
bigger buffers, then following messages will be able to use up to 16KB.

This speedups dump() operations by a large factor and should be safe
for legacy applications.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Thomas Graf <tgraf@suug.ch>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10 13:56:26 -04:00
Al Viro 00e188ef6a sockfd_lookup_light(): switch to fdget^W^Waway from fget_light
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-03-10 11:44:41 -04:00
Marcel Holtmann 53ac6ab612 Bluetooth: Make LTK and CSRK only persisent when bonding
In case the pairable option has been disabled, the pairing procedure
does not create keys for bonding. This means that these generated keys
should not be stored persistently.

For LTK and CSRK this is important to tell userspace to not store these
new keys. They will be available for the lifetime of the device, but
after the next power cycle they should not be used anymore.

Inform userspace to actually store the keys persistently only if both
sides request bonding.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-03-10 14:57:33 +02:00
Nikolay Aleksandrov 52a4c6404f selinux: add gfp argument to security_xfrm_policy_alloc and fix callers
security_xfrm_policy_alloc can be called in atomic context so the
allocation should be done with GFP_ATOMIC. Add an argument to let the
callers choose the appropriate way. In order to do so a gfp argument
needs to be added to the method xfrm_policy_alloc_security in struct
security_operations and to the internal function
selinux_xfrm_alloc_user. After that switch to GFP_ATOMIC in the atomic
callers and leave GFP_KERNEL as before for the rest.
The path that needed the gfp argument addition is:
security_xfrm_policy_alloc -> security_ops.xfrm_policy_alloc_security ->
all users of xfrm_policy_alloc_security (e.g. selinux_xfrm_policy_alloc) ->
selinux_xfrm_alloc_user (here the allocation used to be GFP_KERNEL only)

Now adding a gfp argument to selinux_xfrm_alloc_user requires us to also
add it to security_context_to_sid which is used inside and prior to this
patch did only GFP_KERNEL allocation. So add gfp argument to
security_context_to_sid and adjust all of its callers as well.

CC: Paul Moore <paul@paul-moore.com>
CC: Dave Jones <davej@redhat.com>
CC: Steffen Klassert <steffen.klassert@secunet.com>
CC: Fan Du <fan.du@windriver.com>
CC: David S. Miller <davem@davemloft.net>
CC: LSM list <linux-security-module@vger.kernel.org>
CC: SELinux list <selinux@tycho.nsa.gov>

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-03-10 08:30:02 +01:00
Nikolay Aleksandrov 87536a81e1 net: af_key: fix sleeping under rcu
There's a kmalloc with GFP_KERNEL in a helper
(pfkey_sadb2xfrm_user_sec_ctx) used in pfkey_compile_policy which is
called under rcu_read_lock. Adjust pfkey_sadb2xfrm_user_sec_ctx to have
a gfp argument and adjust the users.

CC: Dave Jones <davej@redhat.com>
CC: Steffen Klassert <steffen.klassert@secunet.com>
CC: Fan Du <fan.du@windriver.com>
CC: David S. Miller <davem@davemloft.net>

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-03-10 08:27:22 +01:00
dingtianhong be14cc98e9 vlan: use use ether_addr_equal_64bits to instead of ether_addr_equal
Ether_addr_equal_64bits is more efficient than ether_addr_equal, and
can be used when each argument is an array within a structure that
contains at least two bytes of data beyond the array, so it is safe
to use it for vlan, and make sense for fast path.

Cc: Joe Perches <joe@perches.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-09 19:03:51 -04:00
dingtianhong 375f67df28 vlan: slight optimization for vlan_do_receive()
According Joe's suggestion, maybe it'd be faster to add an unlikely to
the test for PCKET_OTHERHOST, so I add it and see whether the performance
could be better, although the differences is so small and negligible, but
it is hard to catch that any lower device would set the skb type to
PACKET_OTHERHOST, so most of time, I think it make sense to add unlikely
for the test.

Cc: Joe Perches <joe@perches.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-09 19:03:51 -04:00
Marcel Holtmann 7ee4ea3692 Bluetooth: Add support for handling signature resolving keys
The connection signature resolving key (CSRK) is used for attribute
protocol signed write procedures. This change generates a new local
key during pairing and requests the peer key as well.

Newly generated key and received key will be provided to userspace
using the New Signature Resolving Key management event.

The Master CSRK can be used for verification of remote signed write
PDUs and the Slave CSRK can be used for sending signed write PDUs
to the remote device.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-03-09 21:39:50 +02:00
Eric Dumazet 2d8d40afd1 pkt_sched: fq: do not hold qdisc lock while allocating memory
Resizing fq hash table allocates memory while holding qdisc spinlock,
with BH disabled.

This is definitely not good, as allocation might sleep.

We can drop the lock and get it when needed, we hold RTNL so no other
changes can happen at the same time.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes: afe4fd0624 ("pkt_sched: fq: Fair Queue packet scheduler")
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-08 19:09:10 -05:00
Patrick McHardy a4c2e8beba netfilter: nft_nat: fix family validation
The family in the NAT expression is basically completely useless since
we have it available during runtime anyway. Nevertheless it is used to
decide the NAT family, so at least validate it properly. As we don't
support cross-family NAT, it needs to match the family of the table the
expression exists in.

Unfortunately we can't remove it completely since we need to dump it for
userspace (*sigh*), so at least reduce the memory waste.

Additionally clean up the module init function by removing useless
temporary variables.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-03-08 12:35:19 +01:00
Patrick McHardy d46f2cd260 netfilter: nft_ct: remove family from struct nft_ct
Since we have the context available during destruction again, we can
remove the family from the private structure.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-03-08 12:35:19 +01:00
Patrick McHardy ab9da5c19f netfilter: nf_tables: restore notifications for anonymous set destruction
Since we have the context available again, we can restore notifications
for destruction of anonymous sets.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-03-08 12:35:18 +01:00
Patrick McHardy 62472bcefb netfilter: nf_tables: restore context for expression destructors
In order to fix set destruction notifications and get rid of unnecessary
members in private data structures, pass the context to expressions'
destructor functions again.

In order to do so, replace various members in the nft_rule_trans structure
by the full context.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-03-08 12:35:17 +01:00
Patrick McHardy a36e901cf6 netfilter: nf_tables: clean up nf_tables_trans_add() argument order
The context argument logically comes first, and this is what every other
function dealing with contexts does.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-03-08 12:35:16 +01:00
Alexander Aring 37147652cf 6lowpan: reassembly: fix return of init function
This patch adds a missing return after fragmentation init. Otherwise we
register a sysctl interface and deregister it afterwards which makes no
sense.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-07 17:11:19 -05:00
David S. Miller 3894004289 RxRPC development
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIVAwUAUxX4KBOxKuMESys7AQJqkRAAmxU1GEGESCE/F/U3Hm9pRFeg9kj6+7BO
 7vPrUzB5KEnu4eXjCc60a+BVmV0fJRiS7zsFWzmcKgjIZzJdobbStD944tdz0kIx
 w7RFKOr+D6+zyo0zr1Fj0DwlhvuOpfBsisYtNC5J7OTWc3whgnIkroLt16FNFasu
 gxe7oo+KHiTXIJuIf6E/p5iK1te7CUsIIc1v01hgVVD91Udk4odWyl288BzzcXuC
 5WFYaMnOG+9ndo+4/mCt5dN0FDYL2a1djyknPdXP6HOSBoqVBjPxFGxuN8O9HFMV
 ghWjNG72YGDk0jy6ghiQijdoxE4l2iy0/gYGzSiCMSp59aHHlkq/XIhOoZWD96Nw
 TAoo0PbTFrqRzfvOQ8lMULDneB0KP20gI/AUbgnU3CGataALntAKiVOhrOtXaG21
 trE77omDA5FtMlQ60DabTl6RveMZXCZ+LwHPuZ0Euxo4uykmDLdOChZADGoqZA+5
 VzI1Qa4YK+9B/fkgUyiIzS4syCHGISFIIQ7/Srr26UMlpuK4WDPA9a11hKxrbrkA
 JxEkAk9i/KD3/XsWWfxEMPgNarwHOyDQQ4/XYKx20O2N9RDrao+cNw6ghlwZS+mo
 bmx/sqWmMDaQoMNqL8Q+bq2W7omG8wNgv1pn6MVnFSfiJHAMDhNZkBzp0uCLGFpA
 8WYj9OBcUgY=
 =PK2i
 -----END PGP SIGNATURE-----

Merge tag 'rxrpc-devel-20140304' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

David Howells says:

====================
net-next: AF_RXRPC fixes and development

Here are some AF_RXRPC fixes:

 (1) Fix to remove incorrect checksum calculation made during recvmsg().  It's
     unnecessary to try to do this there since we check the checksum before
     reading the RxRPC header from the packet.

 (2) Fix to prevent the sending of an ABORT packet in response to another
     ABORT packet and inducing a storm.

 (3) Fix UDP MTU calculation from parsing ICMP_FRAG_NEEDED packets where we
     don't handle the ICMP packet not specifying an MTU size.

And development patches:

 (4) Add sysctls for configuring RxRPC parameters, specifically various delays
     pertaining to ACK generation, the time before we resend a packet for
     which we don't receive an ACK, the maximum time a call is permitted to
     live and the amount of time transport, connection and dead call
     information is cached.

 (5) Improve ACK packet production by adjusting the handling of ACK_REQUESTED
     packets, ignoring the MORE_PACKETS flag, delaying the production of
     otherwise immediate ACK_IDLE packets and delaying all ACK_IDLE production
     (barring the call termination) to half a second.

 (6) Add more sysctl parameters to expose the Rx window size, the maximum
     packet size that we're willing to receive and the number of jumbo rxrpc
     packets we're willing to handle in a single UDP packet.

 (7) Request ACKs on alternate DATA packets so that the other side doesn't
     wait till we fill up the Tx window.

 (8) Use a RCU hash table to look up the rxrpc_call for an incoming packet
     rather than stepping through a hierarchy involving several spinlocks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-07 16:08:59 -05:00
John W. Linville 97bd5f0054 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem 2014-03-07 15:09:32 -05:00
Eric Dumazet 31c70d5956 l2tp: keep original skb ownership
There is no reason to orphan skb in l2tp.

This breaks things like per socket memory limits, TCP Small queues...

Fix this before more people copy/paste it.

This is very similar to commit 8f646c922d
("vxlan: keep original skb ownership")

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-07 14:37:55 -05:00
Eric Dumazet 2196269242 tcp: do not leak non zero tstamp in output packets
Usage of skb->tstamp should remain private to TCP stack
(only set on packets on write queue, not on cloned ones)

Otherwise, packets given to loopback interface with a non null tstamp
can confuse netif_rx() / net_timestamp_check()

Other possibility would be to clear tstamp in loopback_xmit(),
as done in skb_scrub_packet()

Fixes: 740b0f1841 ("tcp: switch rtt estimations to usec resolution")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-07 14:32:57 -05:00
Gustavo Padovan 0753c182ef Bluetooth: Fix skb allocation check for A2MP
vtable's method alloc_skb() needs to return a ERR_PTR in case of err and
not a NULL.

Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-03-07 20:09:41 +02:00
Johan Hedberg 2606ecbc48 Bluetooth: Fix expected key count debug logs
The debug logs for reporting a discrepancy between the expected amount
of keys and the actually received amount of keys got these value mixed
up. This patch fixes the issue.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-03-07 09:49:12 -08:00
Patrick McHardy ce6eb0d7c8 netfilter: nft_hash: bug fixes and resizing
The hash set type is very broken and was never meant to be merged in this
state. Missing RCU synchronization on element removal, leaking chain
refcounts when used as a verdict map, races during lookups, a fixed table
size are probably just some of the problems. Luckily it is currently
never chosen by the kernel when the rbtree type is also available.

Rewrite it to be usable.

The new implementation supports automatic hash table resizing using RCU,
based on Paul McKenney's and Josh Triplett's algorithm "Optimized Resizing
For RCU-Protected Hash Tables" described in [1].

Resizing doesn't require a second list head in the elements, it works by
chosing a hash function that remaps elements to a predictable set of buckets,
only resizing by integral factors and

- during expansion: linking new buckets to the old bucket that contains
  elements for any of the new buckets, thereby creating imprecise chains,
  then incrementally seperating the elements until the new buckets only
  contain elements that hash directly to them.

- during shrinking: linking the hash chains of all old buckets that hash
  to the same new bucket to form a single chain.

Expansion requires at most the number of elements in the longest hash chain
grace periods, shrinking requires a single grace period.

Due to the requirement of having hash chains/elements linked to multiple
buckets during resizing, homemade single linked lists are used instead of
the existing list helpers, that don't support this in a clean fashion.
As a side effect, the amount of memory required per element is reduced by
one pointer.

Expansion is triggered when the load factors exceeds 75%, shrinking when
the load factor goes below 30%. Both operations are allowed to fail and
will be retried on the next insertion or removal if their respective
conditions still hold.

[1] http://dl.acm.org/citation.cfm?id=2002181.2002192

Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-03-07 11:42:07 +01:00
Jesper Dangaard Brouer 93bb0ceb75 netfilter: conntrack: remove central spinlock nf_conntrack_lock
nf_conntrack_lock is a monolithic lock and suffers from huge contention
on current generation servers (8 or more core/threads).

Perf locking congestion is clear on base kernel:

-  72.56%  ksoftirqd/6  [kernel.kallsyms]    [k] _raw_spin_lock_bh
   - _raw_spin_lock_bh
      + 25.33% init_conntrack
      + 24.86% nf_ct_delete_from_lists
      + 24.62% __nf_conntrack_confirm
      + 24.38% destroy_conntrack
      + 0.70% tcp_packet
+   2.21%  ksoftirqd/6  [kernel.kallsyms]    [k] fib_table_lookup
+   1.15%  ksoftirqd/6  [kernel.kallsyms]    [k] __slab_free
+   0.77%  ksoftirqd/6  [kernel.kallsyms]    [k] inet_getpeer
+   0.70%  ksoftirqd/6  [nf_conntrack]       [k] nf_ct_delete
+   0.55%  ksoftirqd/6  [ip_tables]          [k] ipt_do_table

This patch change conntrack locking and provides a huge performance
improvement.  SYN-flood attack tested on a 24-core E5-2695v2(ES) with
10Gbit/s ixgbe (with tool trafgen):

 Base kernel:   810.405 new conntrack/sec
 After patch: 2.233.876 new conntrack/sec

Notice other floods attack (SYN+ACK or ACK) can easily be deflected using:
 # iptables -A INPUT -m state --state INVALID -j DROP
 # sysctl -w net/netfilter/nf_conntrack_tcp_loose=0

Use an array of hashed spinlocks to protect insertions/deletions of
conntracks into the hash table. 1024 spinlocks seem to give good
results, at minimal cost (4KB memory). Due to lockdep max depth,
1024 becomes 8 if CONFIG_LOCKDEP=y

The hash resize is a bit tricky, because we need to take all locks in
the array. A seqcount_t is used to synchronize the hash table users
with the resizing process.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-03-07 11:41:13 +01:00
Jesper Dangaard Brouer ca7433df3a netfilter: conntrack: seperate expect locking from nf_conntrack_lock
Netfilter expectations are protected with the same lock as conntrack
entries (nf_conntrack_lock).  This patch split out expectations locking
to use it's own lock (nf_conntrack_expect_lock).

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-03-07 11:41:01 +01:00
Jesper Dangaard Brouer e1b207dac1 netfilter: avoid race with exp->master ct
Preparation for disconnecting the nf_conntrack_lock from the
expectations code.  Once the nf_conntrack_lock is lifted, a race
condition is exposed.

The expectations master conntrack exp->master, can race with
delete operations, as the refcnt increment happens too late in
init_conntrack().  Race is against other CPUs invoking
->destroy() (destroy_conntrack()), or nf_ct_delete() (via timeout
or early_drop()).

Avoid this race in nf_ct_find_expectation() by using atomic_inc_not_zero(),
and checking if nf_ct_is_dying() (path via nf_ct_delete()).

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-03-07 11:40:47 +01:00
Jesper Dangaard Brouer b7779d06f9 netfilter: conntrack: spinlock per cpu to protect special lists.
One spinlock per cpu to protect dying/unconfirmed/template special lists.
(These lists are now per cpu, a bit like the untracked ct)
Add a @cpu field to nf_conn, to make sure we hold the appropriate
spinlock at removal time.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-03-07 11:40:38 +01:00
Jesper Dangaard Brouer b476b72a0f netfilter: trivial code cleanup and doc changes
Changes while reading through the netfilter code.

Added hint about how conntrack nf_conn refcnt is accessed.
And renamed repl_hash to reply_hash for readability

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-03-07 11:40:04 +01:00
Pablo Neira Ayuso 52af2bfcc0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/horms/ipvs-next
Via Simon Horman:

====================
* Whitespace cleanup spotted by checkpatch.pl from Tingwei Liu.
* Section conflict cleanup, basically removal of one wrong __read_mostly,
  from Andi Kleen.
====================

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-03-07 11:37:11 +01:00
Nicolas Dichtel 870a2df4ca xfrm: rename struct xfrm_filter
iproute2 already defines a structure with that name, let's use another one to
avoid any conflict.

CC: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-03-07 08:12:37 +01:00
Tingwei Liu 411fd527bc ipvs: Reduce checkpatch noise in ip_vs_lblc.c
Add whitespace after operator and put open brace { on the previous line

Cc: Tingwei Liu <liutingwei@hisense.com>
Cc: lvs-devel@vger.kernel.org
Signed-off-by: Tingwei Liu <tingw.liu@gmail.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
2014-03-07 12:36:21 +09:00
Andi Kleen c61b0c1328 sections, ipvs: Remove useless __read_mostly for ipvs genl_ops
const __read_mostly does not make any sense, because const
data is already read-only. Remove the __read_mostly
for the ipvs genl_ops. This avoids a LTO
section conflict compile problem.

Cc: Wensong Zhang <wensong@linux-vs.org>
Cc: Simon Horman <horms@verge.net.au>
Cc: Patrick McHardy <kaber@trash.net>
Cc: lvs-devel@vger.kernel.org
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
2014-03-07 12:36:21 +09:00
Sabrina Dubroca c88507fbad ipv6: don't set DST_NOCOUNT for remotely added routes
DST_NOCOUNT should only be used if an authorized user adds routes
locally. In case of routes which are added on behalf of router
advertisments this flag must not get used as it allows an unlimited
number of routes getting added remotely.

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-06 17:29:27 -05:00
Eric Dumazet 6f542efcbc net_sched: htb: do not acquire qdisc lock in dump operations
htb_dump() and htb_dump_class() do not strictly need to acquire
qdisc lock to fetch qdisc and/or class parameters.

We hold RTNL and no changes can occur.

This reduces by 50% qdisc lock pressure while doing tc qdisc|class dump
operations.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-06 17:24:54 -05:00
Alexander Aring cefc8c8a7c 6lowpan: move 6lowpan header to include/net
This header is used by bluetooth and ieee802154 branch. This patch
move this header to the include/net directory to avoid a use of a
relative path in include.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-06 17:21:38 -05:00
Alexander Aring 91c6922745 6lowpan: add missing include of net/ipv6.h
The 6lowpan.h file contains some static inline function which use
internal ipv6 api structs. Add a include of ipv6.h to be sure that it's
known before.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-06 17:21:38 -05:00
Anton Nayshtut d2d273ffab ipv6: Fix exthdrs offload registration.
Without this fix, ipv6_exthdrs_offload_init doesn't register IPPROTO_DSTOPTS
offload, but returns 0 (as the IPPROTO_ROUTING registration actually succeeds).

This then causes the ipv6_gso_segment to drop IPv6 packets with IPPROTO_DSTOPTS
header.

The issue detected and the fix verified by running MS HCK Offload LSO test on
top of QEMU Windows guests, as this test sends IPv6 packets with
IPPROTO_DSTOPTS.

Signed-off-by: Anton Nayshtut <anton@swortex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-06 16:35:55 -05:00
Anton Blanchard 0a13404dd3 net: unix socket code abuses csum_partial
The unix socket code is using the result of csum_partial to
hash into a lookup table:

	unix_hash_fold(csum_partial(sunaddr, len, 0));

csum_partial is only guaranteed to produce something that can be
folded into a checksum, as its prototype explains:

 * returns a 32-bit number suitable for feeding into itself
 * or csum_tcpudp_magic

The 32bit value should not be used directly.

Depending on the alignment, the ppc64 csum_partial will return
different 32bit partial checksums that will fold into the same
16bit checksum.

This difference causes the following testcase (courtesy of
Gustavo) to sometimes fail:

#include <sys/socket.h>
#include <stdio.h>

int main()
{
	int fd = socket(PF_LOCAL, SOCK_STREAM|SOCK_CLOEXEC, 0);

	int i = 1;
	setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &i, 4);

	struct sockaddr addr;
	addr.sa_family = AF_LOCAL;
	bind(fd, &addr, 2);

	listen(fd, 128);

	struct sockaddr_storage ss;
	socklen_t sslen = (socklen_t)sizeof(ss);
	getsockname(fd, (struct sockaddr*)&ss, &sslen);

	fd = socket(PF_LOCAL, SOCK_STREAM|SOCK_CLOEXEC, 0);

	if (connect(fd, (struct sockaddr*)&ss, sslen) == -1){
		perror(NULL);
		return 1;
	}
	printf("OK\n");
	return 0;
}

As suggested by davem, fix this by using csum_fold to fold the
partial 32bit checksum into a 16bit checksum before using it.

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: stable@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-06 16:19:33 -05:00
Florian Westphal e588e2f286 inet: frag: make sure forced eviction removes all frags
Quoting Alexander Aring:
  While fragmentation and unloading of 6lowpan module I got this kernel Oops
  after few seconds:

  BUG: unable to handle kernel paging request at f88bbc30
  [..]
  Modules linked in: ipv6 [last unloaded: 6lowpan]
  Call Trace:
   [<c012af4c>] ? call_timer_fn+0x54/0xb3
   [<c012aef8>] ? process_timeout+0xa/0xa
   [<c012b66b>] run_timer_softirq+0x140/0x15f

Problem is that incomplete frags are still around after unload; when
their frag expire timer fires, we get crash.

When a netns is removed (also done when unloading module), inet_frag
calls the evictor with 'force' argument to purge remaining frags.

The evictor loop terminates when accounted memory ('work') drops to 0
or the lru-list becomes empty.  However, the mem accounting is done
via percpu counters and may not be accurate, i.e. loop may terminate
prematurely.

Alter evictor to only stop once the lru list is empty when force is
requested.

Reported-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Reported-by: Alexander Aring <alex.aring@gmail.com>
Tested-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-06 15:28:45 -05:00
David S. Miller f7324acd98 tcp: Use NET_ADD_STATS instead of NET_ADD_STATS_BH in tcp_event_new_data_sent()
Can be invoked from non-BH context.

Based upon a patch by Eric Dumazet.

Fixes: f19c29e3e3 ("tcp: snmp stats for Fast Open, SYN rtx, and data pkts")
Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-06 15:19:43 -05:00
Erik Hugne 2892505ea1 tipc: don't log disabled tasklet handler errors
Failure to schedule a TIPC tasklet with tipc_k_signal because the
tasklet handler is disabled is not an error. It means TIPC is
currently in the process of shutting down. We remove the error
logging in this case.

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-06 14:46:24 -05:00
Erik Hugne 1bb8dce57f tipc: fix memory leak during module removal
When the TIPC module is removed, the tasklet handler is disabled
before all other subsystems. This will cause lingering publications
in the name table because the node_down tasklets responsible to
clean up publications from an unreachable node will never run.
When the name table is shut down, these publications are detected
and an error message is logged:
tipc: nametbl_stop(): orphaned hash chain detected
This is actually a memory leak, introduced with commit
993b858e37 ("tipc: correct the order
of stopping services at rmmod")

Instead of just logging an error and leaking memory, we free
the orphaned entries during nametable shutdown.

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-06 14:46:24 -05:00
Erik Hugne edcc0511b5 tipc: drop subscriber connection id invalidation
When a topology server subscriber is disconnected, the associated
connection id is set to zero. A check vs zero is then done in the
subscription timeout function to see if the subscriber have been
shut down. This is unnecessary, because all subscription timers
will be cancelled when a subscriber terminates. Setting the
connection id to zero is actually harmful because id zero is the
identity of the topology server listening socket, and can cause a
race that leads to this socket being closed instead.

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-06 14:46:23 -05:00
Ying Xue fe8e464939 tipc: avoid to unnecessary process switch under non-block mode
When messages are received via tipc socket under non-block mode,
schedule_timeout() is called in tipc_wait_for_rcvmsg(), that is,
the process of receiving messages will be scheduled once although
timeout value passed to schedule_timeout() is 0. The same issue
exists in accept()/wait_for_accept(). To avoid this unnecessary
process switch, we only call schedule_timeout() if the timeout
value is non-zero.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-06 14:46:23 -05:00
Ying Xue 4652edb70e tipc: fix connection refcount leak
When tipc_conn_sendmsg() calls tipc_conn_lookup() to query a
connection instance, its reference count value is increased if
it's found. But subsequently if it's found that the connection is
closed, the work of sending message is not queued into its server
send workqueue, and the connection reference count is not decreased.
This will cause a reference count leak. To reproduce this problem,
an application would need to open and closes topology server
connections with high intensity.

We fix this by immediately decrementing the connection reference
count if a send fails due to the connection being closed.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-06 14:46:23 -05:00
Ying Xue 6d4ebeb4df tipc: allow connection shutdown callback to be invoked in advance
Currently connection shutdown callback function is called when
connection instance is released in tipc_conn_kref_release(), and
receiving packets and sending packets are running in different
threads. Even if connection is closed by the thread of receiving
packets, its shutdown callback may not be called immediately as
the connection reference count is non-zero at that moment. So,
although the connection is shut down by the thread of receiving
packets, the thread of sending packets doesn't know it. Before
its shutdown callback is invoked to tell the sending thread its
connection has been closed, the sending thread may deliver
messages by tipc_conn_sendmsg(), this is why the following error
information appears:

"Sending subscription event failed, no memory"

To eliminate it, allow connection shutdown callback function to
be called before connection id is removed in tipc_close_conn(),
which makes the sending thread know the truth in time that its
socket is closed so that it doesn't send message to it. We also
remove the "Sending XXX failed..." error reporting for topology
and config services.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-06 14:46:23 -05:00
Guillaume Nault 9e9cb6221a l2tp: fix userspace reception on plain L2TP sockets
As pppol2tp_recv() never queues up packets to plain L2TP sockets,
pppol2tp_recvmsg() never returns data to userspace, thus making
the recv*() system calls unusable.

Instead of dropping packets when the L2TP socket isn't bound to a PPP
channel, this patch adds them to its reception queue.

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-06 14:25:39 -05:00
Guillaume Nault bb5016eac1 l2tp: fix manual sequencing (de)activation in L2TPv2
Commit e0d4435f "l2tp: Update PPP-over-L2TP driver to work over L2TPv3"
broke the PPPOL2TP_SO_SENDSEQ setsockopt. The L2TP header length was
previously computed by pppol2tp_l2t_header_len() before each call to
l2tp_xmit_skb(). Now that header length is retrieved from the hdr_len
session field, this field must be updated every time the L2TP header
format is modified, or l2tp_xmit_skb() won't push the right amount of
data for the L2TP header.

This patch uses l2tp_session_set_header_len() to adjust hdr_len every
time sequencing is (de)activated from userspace (either by the
PPPOL2TP_SO_SENDSEQ setsockopt or the L2TP_ATTR_SEND_SEQ netlink
attribute).

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-06 14:25:39 -05:00
John W. Linville 73c8daa083 Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 2014-03-06 13:47:16 -05:00
Hannes Frederic Sowa e90c14835b inet: remove now unused flag DST_NOPEER
Commit e688a60480 ("net: introduce DST_NOPEER dst flag") introduced
DST_NOPEER because because of crashes in ipv6_select_ident called from
udp6_ufo_fragment.

Since commit 916e4cf46d ("ipv6: reuse ip6_frag_id from
ip6_ufo_append_data") we don't call ipv6_select_ident any more from
ip6_ufo_append_data, thus this flag lost its purpose and can be removed.

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-06 13:15:52 -05:00
Heiko Carstens 3a49a0f718 net/compat: convert to COMPAT_SYSCALL_DEFINE with changing parameter types
In order to allow the COMPAT_SYSCALL_DEFINE macro generate code that
performs proper zero and sign extension convert all 64 bit parameters
to their corresponding 32 bit compat counterparts.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2014-03-06 16:30:45 +01:00
Heiko Carstens 361d93c46f net/compat: convert to COMPAT_SYSCALL_DEFINE
Convert all compat system call functions where all parameter types
have a size of four or less than four bytes, or are pointer types
to COMPAT_SYSCALL_DEFINE.
The implicit casts within COMPAT_SYSCALL_DEFINE will perform proper
zero and sign extension to 64 bit of all parameters if needed.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2014-03-06 15:35:11 +01:00
Josh Hunt 07cf8f5ae2 netfilter: ipset: add forceadd kernel support for hash set types
Adds a new property for hash set types, where if a set is created
with the 'forceadd' option and the set becomes full the next addition
to the set may succeed and evict a random entry from the set.

To keep overhead low eviction is done very simply. It checks to see
which bucket the new entry would be added. If the bucket's pos value
is non-zero (meaning there's at least one entry in the bucket) it
replaces the first entry in the bucket. If pos is zero, then it continues
down the normal add process.

This property is useful if you have a set for 'ban' lists where it may
not matter if you release some entries from the set early.

Signed-off-by: Josh Hunt <johunt@akamai.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2014-03-06 09:31:43 +01:00
Ilia Mirkin 6843bc3c56 netfilter: ipset: move registration message to init from net_init
Commit 1785e8f473 ("netfiler: ipset: Add net namespace for ipset") moved
the initialization print into net_init, which can get called a lot due
to namespaces. Move it back into init, reduce to pr_info.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2014-03-06 09:31:43 +01:00
Vytas Dauksa 4d0e5c076d netfilter: ipset: add markmask for hash:ip,mark data type
Introduce packet mark mask for hash:ip,mark data type. This allows to
set mark bit filter for the ip set.

Change-Id: Id8dd9ca7e64477c4f7b022a1d9c1a5b187f1c96e

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2014-03-06 09:31:42 +01:00
Vytas Dauksa 3b02b56cd5 netfilter: ipset: add hash:ip,mark data type to ipset
Introduce packet mark support with new ip,mark hash set. This includes
userspace and kernelspace code, hash:ip,mark set tests and man page
updates.

The intended use of ip,mark set is similar to the ip:port type, but for
protocols which don't use a predictable port number. Instead of port
number it matches a firewall mark determined by a layer 7 filtering
program like opendpi.

As well as allowing or blocking traffic it will also be used for
accounting packets and bytes sent for each protocol.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2014-03-06 09:31:42 +01:00
Fengguang Wu 9562cf28d1 netfilter: ipset: Add hash: fix coccinelle warnings
net/netfilter/ipset/ip_set_hash_netnet.c:115:8-9: WARNING: return of 0/1 in function 'hash_netnet4_data_list' with return type bool
/c/kernel-tests/src/cocci/net/netfilter/ipset/ip_set_hash_netnet.c:338:8-9: WARNING: return of 0/1 in function 'hash_netnet6_data_list' with return type bool

Return statements in functions returning bool should use
true/false instead of 1/0.
Generated by: coccinelle/misc/boolreturn.cocci

Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2014-03-06 09:31:42 +01:00
Sergey Popovich 35f6e63abe netfilter: ipset: Follow manual page behavior for SET target on list:set
ipset(8) for list:set says:
  The match will try to find a matching entry in the sets and the
  target will try to add an entry to the first set to which it can
  be added.

However real behavior is bit differ from described. Consider example:

 # ipset create test-1-v4 hash:ip family inet
 # ipset create test-1-v6 hash:ip family inet6
 # ipset create test-1 list:set
 # ipset add test-1 test-1-v4
 # ipset add test-1 test-1-v6

 # iptables  -A INPUT -p tcp --destination-port 25 -j SET --add-set test-1 src
 # ip6tables -A INPUT -p tcp --destination-port 25 -j SET --add-set test-1 src

And then when iptables/ip6tables rule matches packet IPSET target
tries to add src from packet to the list:set test-1 where first
entry is test-1-v4 and the second one is test-1-v6.

For IPv4, as it first entry in test-1 src added to test-1-v4
correctly, but for IPv6 src not added!

Placing test-1-v6 to the first element of list:set makes behavior
correct for IPv6, but brokes for IPv4.

This is due to result, returned from ip_set_add() and ip_set_del() from
net/netfilter/ipset/ip_set_core.c when set in list:set equires more
parameters than given or address families do not match (which is this
case).

It seems wrong returning 0 from ip_set_add() and ip_set_del() in
this case, as 0 should be returned only when an element successfuly
added/deleted to/from the set, contrary to ip_set_test() which
returns 0 when no entry exists and >0 when entry found in set.

Signed-off-by: Sergey Popovich <popovich_sergei@mail.ru>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2014-03-06 09:31:41 +01:00
Daniel Borkmann c485658bae net: sctp: fix skb leakage in COOKIE ECHO path of chunk->auth_chunk
While working on ec0223ec48 ("net: sctp: fix sctp_sf_do_5_1D_ce to
verify if we/peer is AUTH capable"), we noticed that there's a skb
memory leakage in the error path.

Running the same reproducer as in ec0223ec48 and by unconditionally
jumping to the error label (to simulate an error condition) in
sctp_sf_do_5_1D_ce() receive path lets kmemleak detector bark about
the unfreed chunk->auth_chunk skb clone:

Unreferenced object 0xffff8800b8f3a000 (size 256):
  comm "softirq", pid 0, jiffies 4294769856 (age 110.757s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    89 ab 75 5e d4 01 58 13 00 00 00 00 00 00 00 00  ..u^..X.........
  backtrace:
    [<ffffffff816660be>] kmemleak_alloc+0x4e/0xb0
    [<ffffffff8119f328>] kmem_cache_alloc+0xc8/0x210
    [<ffffffff81566929>] skb_clone+0x49/0xb0
    [<ffffffffa0467459>] sctp_endpoint_bh_rcv+0x1d9/0x230 [sctp]
    [<ffffffffa046fdbc>] sctp_inq_push+0x4c/0x70 [sctp]
    [<ffffffffa047e8de>] sctp_rcv+0x82e/0x9a0 [sctp]
    [<ffffffff815abd38>] ip_local_deliver_finish+0xa8/0x210
    [<ffffffff815a64af>] nf_reinject+0xbf/0x180
    [<ffffffffa04b4762>] nfqnl_recv_verdict+0x1d2/0x2b0 [nfnetlink_queue]
    [<ffffffffa04aa40b>] nfnetlink_rcv_msg+0x14b/0x250 [nfnetlink]
    [<ffffffff815a3269>] netlink_rcv_skb+0xa9/0xc0
    [<ffffffffa04aa7cf>] nfnetlink_rcv+0x23f/0x408 [nfnetlink]
    [<ffffffff815a2bd8>] netlink_unicast+0x168/0x250
    [<ffffffff815a2fa1>] netlink_sendmsg+0x2e1/0x3f0
    [<ffffffff8155cc6b>] sock_sendmsg+0x8b/0xc0
    [<ffffffff8155d449>] ___sys_sendmsg+0x369/0x380

What happens is that commit bbd0d59809 clones the skb containing
the AUTH chunk in sctp_endpoint_bh_rcv() when having the edge case
that an endpoint requires COOKIE-ECHO chunks to be authenticated:

  ---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ---------->
  <------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] ---------
  ------------------ AUTH; COOKIE-ECHO ---------------->
  <-------------------- COOKIE-ACK ---------------------

When we enter sctp_sf_do_5_1D_ce() and before we actually get to
the point where we process (and subsequently free) a non-NULL
chunk->auth_chunk, we could hit the "goto nomem_init" path from
an error condition and thus leave the cloned skb around w/o
freeing it.

The fix is to centrally free such clones in sctp_chunk_destroy()
handler that is invoked from sctp_chunk_free() after all refs have
dropped; and also move both kfree_skb(chunk->auth_chunk) there,
so that chunk->auth_chunk is either NULL (since sctp_chunkify()
allocs new chunks through kmem_cache_zalloc()) or non-NULL with
a valid skb pointer. chunk->skb and chunk->auth_chunk are the
only skbs in the sctp_chunk structure that need to be handeled.

While at it, we should use consume_skb() for both. It is the same
as dev_kfree_skb() but more appropriately named as we are not
a device but a protocol. Also, this effectively replaces the
kfree_skb() from both invocations into consume_skb(). Functions
are the same only that kfree_skb() assumes that the frame was
being dropped after a failure (e.g. for tools like drop monitor),
usage of consume_skb() seems more appropriate in function
sctp_chunk_destroy() though.

Fixes: bbd0d59809 ("[SCTP]: Implement the receive and verification of AUTH chunk")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Vlad Yasevich <yasevich@gmail.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-05 20:40:25 -05:00
Linus Lüssing 6565b9eeef bridge: multicast: add sanity check for query source addresses
MLD queries are supposed to have an IPv6 link-local source address
according to RFC2710, section 4 and RFC3810, section 5.1.14. This patch
adds a sanity check to ignore such broken MLD queries.

Without this check, such malformed MLD queries can result in a
denial of service: The queries are ignored by any MLD listener
therefore they will not respond with an MLD report. However,
without this patch these malformed MLD queries would enable the
snooping part in the bridge code, potentially shutting down the
according ports towards these hosts for multicast traffic as the
bridge did not learn about these listeners.

Reported-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-05 20:40:24 -05:00
David S. Miller 67ddc87f16 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/wireless/ath/ath9k/recv.c
	drivers/net/wireless/mwifiex/pcie.c
	net/ipv6/sit.c

The SIT driver conflict consists of a bug fix being done by hand
in 'net' (missing u64_stats_init()) whilst in 'net-next' a helper
was created (netdev_alloc_pcpu_stats()) which takes care of this.

The two wireless conflicts were overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-05 20:32:02 -05:00
Nikolay Aleksandrov 24b9bf43e9 net: fix for a race condition in the inet frag code
I stumbled upon this very serious bug while hunting for another one,
it's a very subtle race condition between inet_frag_evictor,
inet_frag_intern and the IPv4/6 frag_queue and expire functions
(basically the users of inet_frag_kill/inet_frag_put).

What happens is that after a fragment has been added to the hash chain
but before it's been added to the lru_list (inet_frag_lru_add) in
inet_frag_intern, it may get deleted (either by an expired timer if
the system load is high or the timer sufficiently low, or by the
fraq_queue function for different reasons) before it's added to the
lru_list, then after it gets added it's a matter of time for the
evictor to get to a piece of memory which has been freed leading to a
number of different bugs depending on what's left there.

I've been able to trigger this on both IPv4 and IPv6 (which is normal
as the frag code is the same), but it's been much more difficult to
trigger on IPv4 due to the protocol differences about how fragments
are treated.

The setup I used to reproduce this is: 2 machines with 4 x 10G bonded
in a RR bond, so the same flow can be seen on multiple cards at the
same time. Then I used multiple instances of ping/ping6 to generate
fragmented packets and flood the machines with them while running
other processes to load the attacked machine.

*It is very important to have the _same flow_ coming in on multiple CPUs
concurrently. Usually the attacked machine would die in less than 30
minutes, if configured properly to have many evictor calls and timeouts
it could happen in 10 minutes or so.

An important point to make is that any caller (frag_queue or timer) of
inet_frag_kill will remove both the timer refcount and the
original/guarding refcount thus removing everything that's keeping the
frag from being freed at the next inet_frag_put.  All of this could
happen before the frag was ever added to the LRU list, then it gets
added and the evictor uses a freed fragment.

An example for IPv6 would be if a fragment is being added and is at
the stage of being inserted in the hash after the hash lock is
released, but before inet_frag_lru_add executes (or is able to obtain
the lru lock) another overlapping fragment for the same flow arrives
at a different CPU which finds it in the hash, but since it's
overlapping it drops it invoking inet_frag_kill and thus removing all
guarding refcounts, and afterwards freeing it by invoking
inet_frag_put which removes the last refcount added previously by
inet_frag_find, then inet_frag_lru_add gets executed by
inet_frag_intern and we have a freed fragment in the lru_list.

The fix is simple, just move the lru_add under the hash chain locked
region so when a removing function is called it'll have to wait for
the fragment to be added to the lru_list, and then it'll remove it (it
works because the hash chain removal is done before the lru_list one
and there's no window between the two list adds when the frag can get
dropped). With this fix applied I couldn't kill the same machine in 24
hours with the same setup.

Fixes: 3ef0eb0db4 ("net: frag, move LRU list maintenance outside of
rwlock")

CC: Florian Westphal <fw@strlen.de>
CC: Jesper Dangaard Brouer <brouer@redhat.com>
CC: David S. Miller <davem@davemloft.net>

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-05 20:31:42 -05:00
Claudio Takahasi 5981a8821b Bluetooth: Fix removing Long Term Key
This patch fixes authentication failure on LE link re-connection when
BlueZ acts as slave (peripheral). LTK is removed from the internal list
after its first use causing PIN or Key missing reply when re-connecting
the link. The LE Long Term Key Request event indicates that the master
is attempting to encrypt or re-encrypt the link.

Pre-condition: BlueZ host paired and running as slave.
How to reproduce(master):

  1) Establish an ACL LE encrypted link
  2) Disconnect the link
  3) Try to re-establish the ACL LE encrypted link (fails)

> HCI Event: LE Meta Event (0x3e) plen 19
      LE Connection Complete (0x01)
        Status: Success (0x00)
        Handle: 64
        Role: Slave (0x01)
...
@ Device Connected: 00:02:72:DC:29:C9 (1) flags 0x0000
> HCI Event: LE Meta Event (0x3e) plen 13
      LE Long Term Key Request (0x05)
        Handle: 64
        Random number: 875be18439d9aa37
        Encryption diversifier: 0x76ed
< HCI Command: LE Long Term Key Request Reply (0x08|0x001a) plen 18
        Handle: 64
        Long term key: 2aa531db2fce9f00a0569c7d23d17409
> HCI Event: Command Complete (0x0e) plen 6
      LE Long Term Key Request Reply (0x08|0x001a) ncmd 1
        Status: Success (0x00)
        Handle: 64
> HCI Event: Encryption Change (0x08) plen 4
        Status: Success (0x00)
        Handle: 64
        Encryption: Enabled with AES-CCM (0x01)
...
@ Device Disconnected: 00:02:72:DC:29:C9 (1) reason 3
< HCI Command: LE Set Advertise Enable (0x08|0x000a) plen 1
        Advertising: Enabled (0x01)
> HCI Event: Command Complete (0x0e) plen 4
      LE Set Advertise Enable (0x08|0x000a) ncmd 1
        Status: Success (0x00)
> HCI Event: LE Meta Event (0x3e) plen 19
      LE Connection Complete (0x01)
        Status: Success (0x00)
        Handle: 64
        Role: Slave (0x01)
...
@ Device Connected: 00:02:72:DC:29:C9 (1) flags 0x0000
> HCI Event: LE Meta Event (0x3e) plen 13
      LE Long Term Key Request (0x05)
        Handle: 64
        Random number: 875be18439d9aa37
        Encryption diversifier: 0x76ed
< HCI Command: LE Long Term Key Request Neg Reply (0x08|0x001b) plen 2
        Handle: 64
> HCI Event: Command Complete (0x0e) plen 6
      LE Long Term Key Request Neg Reply (0x08|0x001b) ncmd 1
        Status: Success (0x00)
        Handle: 64
> HCI Event: Disconnect Complete (0x05) plen 4
        Status: Success (0x00)
        Handle: 64
        Reason: Authentication Failure (0x05)
@ Device Disconnected: 00:02:72:DC:29:C9 (1) reason 0

Signed-off-by: Claudio Takahasi <claudio.takahasi@openbossa.org>
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-03-05 19:41:20 +02:00
Johannes Berg 864a6040f3 mac80211: clear sequence/fragment number in QoS-null frames
Avoid leaking data by sending uninitialized memory and setting an
invalid (non-zero) fragment number (the sequence number is ignored
anyway) by setting the seq_ctrl field to zero.

Cc: stable@vger.kernel.org
Fixes: 3f52b7e328 ("mac80211: mesh power save basics")
Fixes: ce662b44ce ("mac80211: send (QoS) Null if no buffered frames")
Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-03-05 15:49:54 +01:00
Alexander Aring 6092c79fd0 ieee802154: fix whitespace issues in Kconfig
This patch fixes some whitespace issues in Kconfig files of IEEE
802.15.4 subsytem.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-04 20:12:44 -05:00
Simon Wunderlich 960d97f951 cfg80211: add MPLS and 802.21 classification
MPLS labels may contain traffic control information, which should be
evaluated and used by the wireless subsystem if present.

Also check for IEEE 802.21 which is always network control traffic.

Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-04 13:51:06 -05:00
John W. Linville f3b6a488a6 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless
Conflicts:
	drivers/net/wireless/ath/ath9k/recv.c
	drivers/net/wireless/mwifiex/pcie.c
2014-03-04 13:05:12 -05:00
Linus Torvalds c3bebc71c4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Fix memory leak in ieee80211_prep_connection(), sta_info leaked on
    error.  From Eytan Lifshitz.

 2) Unintentional switch case fallthrough in nft_reject_inet_eval(),
    from Patrick McHardy.

 3) Must check if payload lenth is a power of 2 in
    nft_payload_select_ops(), from Nikolay Aleksandrov.

 4) Fix mis-checksumming in xen-netfront driver, ip_hdr() is not in the
    correct place when we invoke skb_checksum_setup().  From Wei Liu.

 5) TUN driver should not advertise HW vlan offload features in
    vlan_features.  Fix from Fernando Luis Vazquez Cao.

 6) IPV6_VTI needs to select NET_IPV_TUNNEL to avoid build errors, fix
    from Steffen Klassert.

 7) Add missing locking in xfrm_migrade_state_find(), we must hold the
    per-namespace xfrm_state_lock while traversing the lists.  Fix from
    Steffen Klassert.

 8) Missing locking in ath9k driver, access to tid->sched must be done
    under ath_txq_lock().  Fix from Stanislaw Gruszka.

 9) Fix two bugs in TCP fastopen.  First respect the size argument given
    to tcp_sendmsg() in the fastopen path, and secondly prevent
    tcp_send_syn_data() from potentially using order-5 allocations.
    From Eric Dumazet.

10) Fix handling of default neigh garbage collection params, from Jiri
    Pirko.

11) Fix cwnd bloat and over-inflation of RTT when transmit segmentation
    is in use.  From Eric Dumazet.

12) Missing initialization of Realtek r8169 driver's statistics
    seqlocks.  Fix from Kyle McMartin.

13) Fix RTNL assertion failures in 802.3ad and AB ARP monitor of bonding
    driver, from Ding Tianhong.

14) Bonding slave release race can cause divide by zero, fix from
    Nikolay Aleksandrov.

15) Overzealous return from neigh_periodic_work() causes reachability
    time to not be computed.  Fix from Duain Jiong.

16) Fix regression in ipv6_find_hdr(), it should not return -ENOENT when
    a specific target is specified and found.  From Hans Schillstrom.

17) Fix VLAN tag stripping regression in BNA driver, from Ivan Vecera.

18) Tail loss probe can calculate bogus RTTs due to missing packet
    marking on retransmit.  Fix from Yuchung Cheng.

19) We cannot do skb_dst_drop() in iptunnel_pull_header() because
    multicast loopback detection in later code paths need access to
    skb_rtable().  Fix from Xin Long.

20) The macvlan driver regresses in that it propagates lower device
    offload support disables into itself, causing severe slowdowns when
    running over a bridge.  Provide the software offloads always on
    macvlan devices to deal with this and the regression is gone.  From
    Vlad Yasevich.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (103 commits)
  macvlan: Add support for 'always_on' offload features
  net: sctp: fix sctp_sf_do_5_1D_ce to verify if we/peer is AUTH capable
  ip_tunnel:multicast process cause panic due to skb->_skb_refdst NULL pointer
  net: cpsw: fix cpdma rx descriptor leak on down interface
  be2net: isolate TX workarounds not applicable to Skyhawk-R
  be2net: Fix skb double free in be_xmit_wrokarounds() failure path
  be2net: clear promiscuous bits in adapter->flags while disabling promiscuous mode
  be2net: Fix to reset transparent vlan tagging
  qlcnic: dcb: a couple off by one bugs
  tcp: fix bogus RTT on special retransmission
  hsr: off by one sanity check in hsr_register_frame_in()
  can: remove CAN FD compatibility for CAN 2.0 sockets
  can: flexcan: factor out soft reset into seperate funtion
  can: flexcan: flexcan_remove(): add missing netif_napi_del()
  can: flexcan: fix transition from and to freeze mode in chip_{,un}freeze
  can: flexcan: factor out transceiver {en,dis}able into seperate functions
  can: flexcan: fix transition from and to low power mode in chip_{en,dis}able
  can: flexcan: flexcan_open(): fix error path if flexcan_chip_start() fails
  can: flexcan: fix shutdown: first disable chip, then all interrupts
  USB AX88179/178A: Support D-Link DUB-1312
  ...
2014-03-04 08:44:32 -08:00
Tim Smith 7727640cc3 af_rxrpc: Keep rxrpc_call pointers in a hashtable
Keep track of rxrpc_call structures in a hashtable so they can be
found directly from the network parameters which define the call.

This allows incoming packets to be routed directly to a call without walking
through hierarchy of peer -> transport -> connection -> call and all the
spinlocks that that entailed.

Signed-off-by: Tim Smith <tim@electronghost.co.uk>
Signed-off-by: David Howells <dhowells@redhat.com>
2014-03-04 10:36:53 +00:00
David S. Miller 48235515c4 Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless
John W. Linville says:

====================
Please pull this batch of fixes intended for the 3.14 stream...

For the mac80211 bits, Johannes says:

"This time I have a fix to get out of an 'infinite error state' in case
regulatory domain updates failed and two fixes for VHT associations: one
to not disconnect immediately when the AP uses more bandwidth than the
new regdomain would allow after a change due to association country
information getting used, and one for an issue in the code where
mac80211 doesn't correctly ignore a reserved field and then uses an HT
instead of VHT association."

For the iwlwifi bits, Emmanuel says:

"Johannes fixes a long standing bug in the AMPDU status reporting.
Max fixes the listen time which was way too long and causes trouble
to several APs."

Along with those, Bing Zhao marks the mwifiex_usb driver as _not_
supporting USB autosuspend after a number of problems with that have
been reported.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-03 16:42:47 -05:00
Daniel Borkmann ec0223ec48 net: sctp: fix sctp_sf_do_5_1D_ce to verify if we/peer is AUTH capable
RFC4895 introduced AUTH chunks for SCTP; during the SCTP
handshake RANDOM; CHUNKS; HMAC-ALGO are negotiated (CHUNKS
being optional though):

  ---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ---------->
  <------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] ---------
  -------------------- COOKIE-ECHO -------------------->
  <-------------------- COOKIE-ACK ---------------------

A special case is when an endpoint requires COOKIE-ECHO
chunks to be authenticated:

  ---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ---------->
  <------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] ---------
  ------------------ AUTH; COOKIE-ECHO ---------------->
  <-------------------- COOKIE-ACK ---------------------

RFC4895, section 6.3. Receiving Authenticated Chunks says:

  The receiver MUST use the HMAC algorithm indicated in
  the HMAC Identifier field. If this algorithm was not
  specified by the receiver in the HMAC-ALGO parameter in
  the INIT or INIT-ACK chunk during association setup, the
  AUTH chunk and all the chunks after it MUST be discarded
  and an ERROR chunk SHOULD be sent with the error cause
  defined in Section 4.1. [...] If no endpoint pair shared
  key has been configured for that Shared Key Identifier,
  all authenticated chunks MUST be silently discarded. [...]

  When an endpoint requires COOKIE-ECHO chunks to be
  authenticated, some special procedures have to be followed
  because the reception of a COOKIE-ECHO chunk might result
  in the creation of an SCTP association. If a packet arrives
  containing an AUTH chunk as a first chunk, a COOKIE-ECHO
  chunk as the second chunk, and possibly more chunks after
  them, and the receiver does not have an STCB for that
  packet, then authentication is based on the contents of
  the COOKIE-ECHO chunk. In this situation, the receiver MUST
  authenticate the chunks in the packet by using the RANDOM
  parameters, CHUNKS parameters and HMAC_ALGO parameters
  obtained from the COOKIE-ECHO chunk, and possibly a local
  shared secret as inputs to the authentication procedure
  specified in Section 6.3. If authentication fails, then
  the packet is discarded. If the authentication is successful,
  the COOKIE-ECHO and all the chunks after the COOKIE-ECHO
  MUST be processed. If the receiver has an STCB, it MUST
  process the AUTH chunk as described above using the STCB
  from the existing association to authenticate the
  COOKIE-ECHO chunk and all the chunks after it. [...]

Commit bbd0d59809 introduced the possibility to receive
and verification of AUTH chunk, including the edge case for
authenticated COOKIE-ECHO. On reception of COOKIE-ECHO,
the function sctp_sf_do_5_1D_ce() handles processing,
unpacks and creates a new association if it passed sanity
checks and also tests for authentication chunks being
present. After a new association has been processed, it
invokes sctp_process_init() on the new association and
walks through the parameter list it received from the INIT
chunk. It checks SCTP_PARAM_RANDOM, SCTP_PARAM_HMAC_ALGO
and SCTP_PARAM_CHUNKS, and copies them into asoc->peer
meta data (peer_random, peer_hmacs, peer_chunks) in case
sysctl -w net.sctp.auth_enable=1 is set. If in INIT's
SCTP_PARAM_SUPPORTED_EXT parameter SCTP_CID_AUTH is set,
peer_random != NULL and peer_hmacs != NULL the peer is to be
assumed asoc->peer.auth_capable=1, in any other case
asoc->peer.auth_capable=0.

Now, if in sctp_sf_do_5_1D_ce() chunk->auth_chunk is
available, we set up a fake auth chunk and pass that on to
sctp_sf_authenticate(), which at latest in
sctp_auth_calculate_hmac() reliably dereferences a NULL pointer
at position 0..0008 when setting up the crypto key in
crypto_hash_setkey() by using asoc->asoc_shared_key that is
NULL as condition key_id == asoc->active_key_id is true if
the AUTH chunk was injected correctly from remote. This
happens no matter what net.sctp.auth_enable sysctl says.

The fix is to check for net->sctp.auth_enable and for
asoc->peer.auth_capable before doing any operations like
sctp_sf_authenticate() as no key is activated in
sctp_auth_asoc_init_active_key() for each case.

Now as RFC4895 section 6.3 states that if the used HMAC-ALGO
passed from the INIT chunk was not used in the AUTH chunk, we
SHOULD send an error; however in this case it would be better
to just silently discard such a maliciously prepared handshake
as we didn't even receive a parameter at all. Also, as our
endpoint has no shared key configured, section 6.3 says that
MUST silently discard, which we are doing from now onwards.

Before calling sctp_sf_pdiscard(), we need not only to free
the association, but also the chunk->auth_chunk skb, as
commit bbd0d59809 created a skb clone in that case.

I have tested this locally by using netfilter's nfqueue and
re-injecting packets into the local stack after maliciously
modifying the INIT chunk (removing RANDOM; HMAC-ALGO param)
and the SCTP packet containing the COOKIE_ECHO (injecting
AUTH chunk before COOKIE_ECHO). Fixed with this patch applied.

Fixes: bbd0d59809 ("[SCTP]: Implement the receive and verification of AUTH chunk")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Vlad Yasevich <yasevich@gmail.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-03 16:39:36 -05:00
David S. Miller 82f1918351 linux-can-fixes-for-3.14-20140303
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iEYEABECAAYFAlMUhCcACgkQjTAFq1RaXHMCSACdFy5OoMTtHjuPuQe5RH4Lu7rP
 tM4AnRs2kviQRLhs92HgXuLWAN9QmRX4
 =Y65u
 -----END PGP SIGNATURE-----

Merge tag 'linux-can-fixes-for-3.14-20140303' of git://gitorious.org/linux-can/linux-can

linux-can-fixes-for-3.14-20140303

Marc Kleine-Budde says:

====================
this is a pull request of 8 patches. Oliver Hartkopp contributes a patch which
removes the CAN FD compatibility for CAN 2.0 sockets, as it turns out that this
compatibility has some conceptual cornercases. The remaining 7 patches are by
me, they address a problem in the flexcan driver. When shutting down the
interface ("ifconfig can0 down") under heavy network load the whole system will
hang. This series reworks the actual sequence in close() and the transition
from and to the low power modes of the CAN controller.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-03 16:05:04 -05:00
Yuchung Cheng f19c29e3e3 tcp: snmp stats for Fast Open, SYN rtx, and data pkts
Add the following snmp stats:

TCPFastOpenActiveFail: Fast Open attempts (SYN/data) failed beacuse
the remote does not accept it or the attempts timed out.

TCPSynRetrans: number of SYN and SYN/ACK retransmits to break down
retransmissions into SYN, fast-retransmits, timeout retransmits, etc.

TCPOrigDataSent: number of outgoing packets with original data (excluding
retransmission but including data-in-SYN). This counter is different from
TcpOutSegs because TcpOutSegs also tracks pure ACKs. TCPOrigDataSent is
more useful to track the TCP retransmission rate.

Change TCPFastOpenActive to track only successful Fast Opens to be symmetric to
TCPFastOpenPassive.

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Nandita Dukkipati <nanditad@google.com>
Signed-off-by: Lawrence Brakmo <brakmo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-03 15:58:03 -05:00
Xin Long 10ddceb22b ip_tunnel:multicast process cause panic due to skb->_skb_refdst NULL pointer
when ip_tunnel process multicast packets, it may check if the packet is looped
back packet though 'rt_is_output_route(skb_rtable(skb))' in ip_tunnel_rcv(),
but before that , skb->_skb_refdst has been dropped in iptunnel_pull_header(),
so which leads to a panic.

fix the bug: https://bugzilla.kernel.org/show_bug.cgi?id=70681

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-03 15:56:40 -05:00
Hiroaki SHIMODA a135e598c4 sch_tbf: Remove holes in struct tbf_sched_data.
On x86_64 we have 3 holes in struct tbf_sched_data.

The member peak_present can be replaced with peak.rate_bytes_ps,
because peak.rate_bytes_ps is set only when peak is specified in
tbf_change(). tbf_peak_present() is introduced to test
peak.rate_bytes_ps.

The member max_size is moved to fill 32bit hole.

Signed-off-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-03 15:43:47 -05:00
Yuchung Cheng c84a57113f tcp: fix bogus RTT on special retransmission
RTT may be bogus with tall loss probe (TLP) when a packet
is retransmitted and latter (s)acked without TCPCB_SACKED_RETRANS flag.

For example, TLP calls __tcp_retransmit_skb() instead of
tcp_retransmit_skb(). The skb timestamps are updated but the sacked
flag is not marked with TCPCB_SACKED_RETRANS. As a result we'll
get bogus RTT in tcp_clean_rtx_queue() or in tcp_sacktag_one() on
spurious retransmission.

The fix is to apply the sticky flag TCP_EVER_RETRANS to enforce Karn's
check on RTT sampling. However this will disable F-RTO if timeout occurs
after TLP, by resetting undo_marker in tcp_enter_loss(). We relax this
check to only if any pending retransmists are still in-flight.

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Nandita Dukkipati <nanditad@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-03 15:33:02 -05:00
Dan Carpenter de39d7a4f3 hsr: off by one sanity check in hsr_register_frame_in()
This is a sanity check and we never pass invalid values so this patch
doesn't change anything.  However the node->time_in[] array has
HSR_MAX_SLAVE (2) elements and not HSR_MAX_DEV (3).

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-03 15:29:42 -05:00
John W. Linville 9ffd2e9a17 Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next 2014-03-03 14:59:12 -05:00
John W. Linville 0c6a4812a0 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem 2014-03-03 14:34:45 -05:00
Luis R. Rodriguez 255e25b0e5 cfg80211: allow reprocessing of pending requests
In certain situations we want to trigger reprocessing
of the last regulatory hint. One situation in which
this makes sense is the case where the cfg80211 was
built-in to the kernel, CFG80211_INTERNAL_REGDB was not
enabled and the CRDA binary is on a partition not availble
during early boot. In such a case we want to be able to
re-process the same request at some other point.

When we are asked to re-process the same request we need
to be careful to not kfree it, addresses that.

Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: Luis R. Rodriguez <mcgrof@do-not-panic.com>
[rename function]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-03-03 15:07:33 +01:00
Chun-Yeow Yeoh b8ff416bc9 mac80211: add missing update on rx status VHT flag
Add missing update on the rx status vht flag of the last
data packet. Otherwise, cfg80211_calculate_bitrate_vht
may not consider the channel width resulting in wrong
calculation of the received bitrate.

Signed-off-by: Chun-Yeow Yeoh <yeohchunyeow@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-03-03 15:07:33 +01:00
Michal Kazior 37fa2bdd16 mac80211: refactor channel switch function
The function was quite big. This splits out beacon
updating into a separate function for improved
maintenance and extension.

Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-03-03 15:07:29 +01:00
Oliver Hartkopp 821047c405 can: remove CAN FD compatibility for CAN 2.0 sockets
In commit e2d265d3b5 (canfd: add support for CAN FD in CAN_RAW sockets)
CAN FD frames with a payload length up to 8 byte are passed to legacy
sockets where the CAN FD support was not enabled by the application.

After some discussions with developers at a fair this well meant feature
leads to confusion as no clean switch for CAN / CAN FD is provided to the
application programmer. Additionally a compatibility like this for legacy
CAN_RAW sockets requires some compatibility handling for the sending, e.g.
make CAN2.0 frames a CAN FD frame with BRS at transmission time (?!?).

This will become a mess when people start to develop applications with
real CAN FD hardware. This patch reverts the bad compatibility code
together with the documentation describing the removed feature.

Acked-by: Stephane Grosjean <s.grosjean@peak-system.com>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2014-03-03 14:29:52 +01:00
Johannes Berg bc00a91d62 cfg80211: remove racy beacon_interval assignment
In case of AP mode, the beacon interval is already reset to
zero inside cfg80211_stop_ap(), and in the other modes it
isn't relevant. Remove the assignment to remove a potential
race since the assignment isn't properly locked.

Reported-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-03-03 14:18:20 +01:00
Eliad Peller 1c37a72c1b mac80211: consider virtual mon when calculating min_def
When calculating the current max bw required for
a channel context, we didn't consider the virtual
monitor interface, resulting in its channel context
being narrower than configured.

This broke monitor mode with iwlmvm, which uses the
minimal width.

Reported-by: Ido Yariv <idox.yariv@intel.com>
Signed-off-by: Eliad Peller <eliadx.peller@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-03-03 13:48:13 +01:00
Alexander Aring b6f82fc05d 6lowpan: use memcpy to set tag value in fraghdr
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-02 19:19:44 -05:00
Alexander Aring 0234a63248 6lowpan: remove initialization of tag value
The initialization of the tag value doesn't matter at begin of
fragmentation. This patch removes the initialization to zero.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-02 19:19:44 -05:00
Alexander Aring 4c7f778e56 6lowpan: fix type of datagram size parameter
Datagram size value is u16 because we convert it to host byte order
and we need to read it. Only the tag value belongs to __be16 type.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-02 19:19:44 -05:00
Alexander Aring 7240cdec60 6lowpan: handling 6lowpan fragmentation via inet_frag api
This patch drops the current way of 6lowpan fragmentation on receiving
side and replace it with a implementation which use the inet_frag api.
The old fragmentation handling has some race conditions and isn't
rfc4944 compatible. Also adding support to match fragments on
destination address, source address, tag value and datagram_size
which is missing in the current implementation.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-28 17:05:22 -05:00
Alexander Aring d57fec84fb 6lowpan: fix some checkpatch issues
Detected with:

./scripts/checkpatch.pl --strict -f net/ieee802154/6lowpan_rtnl.c

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-28 17:05:22 -05:00
Alexander Aring 01348b3448 6lowpan: move 6lowpan.c to 6lowpan_rtnl.c
We have a 6lowpan.c file and 6lowpan.ko file. To avoid confusing we
should move 6lowpan.c to 6lowpan_rtnl.c. Then we can support multiple
source files for 6lowpan module.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-28 17:05:21 -05:00
Alexander Aring 02600d0de6 6lowpan: change tag type to __be16
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-28 17:05:21 -05:00
Alexander Aring 96cb3eb7a1 6lowpan: fix fragmentation on sending side
This patch fix the fragmentation on sending side according to rfc4944.

Also add improvement to use the full payload of a PDU which calculate
the nearest divided to 8 payload length for the fragmentation datagram
size attribute.

The main issue is that the datagram size of fragmentation header use the
ipv6 payload length, but rfc4944 says it's the ipv6 payload length inclusive
network header size (and transport header size if compressed).

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-28 17:05:21 -05:00
Alexander Aring 349aa7bc29 6lowpan: add uncompress header size function
This patch add a lookup function for uncompressed 6LoWPAN header
size. This is needed to estimate the real size after uncompress the
6LoWPAN header.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-28 17:05:21 -05:00
Daniel Borkmann 52f1454f62 packet: allow to transmit +4 byte in TX_RING slot for VLAN case
Commit 57f89bfa21 ("network: Allow af_packet to transmit +4 bytes
for VLAN packets.") added the possibility for non-mmaped frames to
send extra 4 byte for VLAN header so the MTU increases from 1500 to
1504 byte, for example.

Commit cbd89acb9e ("af_packet: fix for sending VLAN frames via
packet_mmap") attempted to fix that for the mmap part but was
reverted as it caused regressions while using eth_type_trans()
on output path.

Lets just act analogous to 57f89bfa21 and add a similar logic
to TX_RING. We presume size_max as overcharged with +4 bytes and
later on after skb has been built by tpacket_fill_skb() check
for ETH_P_8021Q header on packets larger than normal MTU. Can
be easily reproduced with a slightly modified trafgen in mmap(2)
mode, test cases:

 { fill(0xff, 12) const16(0x8100) fill(0xff, <1504|1505>) }
 { fill(0xff, 12) const16(0x0806) fill(0xff, <1500|1501>) }

Note that we need to do the test right after tpacket_fill_skb()
as sockets can have PACKET_LOSS set where we would not fail but
instead just continue to traverse the ring.

Reported-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Ben Greear <greearb@candelatech.com>
Cc: Phil Sutter <phil@nwl.cc>
Tested-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-28 16:52:02 -05:00
John W. Linville b95eddbb90 Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 2014-02-28 13:42:54 -05:00
Johan Hedberg 81ad6fd969 Bluetooth: Remove unnecessary stop_scan_complete function
The stop_scan_complete function was used as an intermediate step before
doing the actual connection creation. Since we're using hci_request
there's no reason to have this extra function around, i.e. we can simply
put both HCI commands into the same request.

The single task that the intermediate function had, i.e. indicating
discovery as stopped is now taken care of by a new
HCI_LE_SCAN_INTERRUPTED flag which allows us to do the discovery state
update when the stop scan command completes.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-28 10:28:17 -08:00
Johan Hedberg 317ac8cb3f Bluetooth: Fix trying to disable scanning twice
The discovery process has a timer for disabling scanning, however
scanning might be disabled through other means too like the auto-connect
process.  We should therefore ensure that the timer is never active
after sending a HCI command to disable scanning.

There was some existing code in stop_scan_complete trying to avoid the
timer when a connect request interrupts a discovery procedure, but the
other way around was not covered. This patch covers both scenarios by
canceling the timer as soon as we get a successful command complete for
the disabling HCI command.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-28 10:28:16 -08:00
Johan Hedberg e3098be40b Bluetooth: Delay LTK encryption to let remote receive all keys
Some devices may refuse to re-encrypt with the LTK if they haven't
received all our keys yet. This patch adds a 250ms delay before
attempting re-encryption with the LTK.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-28 08:17:46 -08:00
Johan Hedberg 38ccdc9332 Bluetooth: Re-encrypt link after receiving an LTK
It's not strictly speaking required to re-encrypt a link once we receive
an LTK since the connection is already encrypted with the STK. However,
re-encrypting with the LTK allows us to verify that we've received an
LTK that actually works.

This patch updates the SMP code to request encrypting with the LTK in
case we're in master role and waits until the key refresh complete event
before notifying user space of the distributed keys.

A new flag is also added for the SMP context to ensure that we
re-encryption only once in case of multiple calls to smp_distribute_keys.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-28 08:17:46 -08:00
Johan Hedberg 9489eca4ab Bluetooth: Add timeout for LE connection attempts
LE connection attempts do not have a controller side timeout in the same
way as BR/EDR has (in form of the page timeout). Since we always do
scanning before initiating connections the attempts are always expected
to succeed in some reasonable time.

This patch adds a timer which forces a cancellation of the connection
attempt within 20 seconds if it has not been successful by then. This
way we e.g. ensure that mgmt_pair_device times out eventually and gives
an error response.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-28 07:56:42 -08:00
Johan Hedberg b1cd5fd937 Bluetooth: Use hdev->init/resp_addr values for smp_c1 function
Now that we have nicely tracked values of the initiator and responder
address information we can pass that directly to the smp_c1 function
without worrying e.g. about who initiated the connection. This patch
updates the two places in smp.c to use the new variables.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-28 07:53:08 -08:00
Johan Hedberg cb1d68f7a3 Bluetooth: Track LE initiator and responder address information
For SMP we need the local and remote addresses (and their types) that
were used to establish the connection. These may be different from the
Identity Addresses or even the current RPA. To guarantee that we have
this information available and it is correct track these values
separately from the very beginning of the connection.

For outgoing connections we set the values as soon as we get a
successful command status for HCI_LE_Create_Connection (for which the
patch adds a command status handler function) and for incoming
connections as soon as we get a LE Connection Complete HCI event.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-28 07:53:07 -08:00
Johan Hedberg b46e003089 Bluetooth: Fix updating connection state to BT_CONNECT too early
We shouldn't update the hci_conn state to BT_CONNECT until the moment
that we're ready to send the initiating HCI command for it. If the
connection has the BT_CONNECT state too early the code responsible for
updating the local random address may incorrectly think there's a
pending connection in progress and refuse to update the address.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-28 07:53:07 -08:00
Johan Hedberg 8d97250ea2 Bluetooth: Add protections for updating local random address
Different controllers behave differently when HCI_Set_Random_Address is
called while they are advertising or have a HCI_LE_Create_Connection in
progress. Some take the newly written address into use for the pending
operation while others use the random address that we had at the time
that the operation started.

Due to this undefined behavior and for the fact that we want to reliably
determine the initiator address of all connections for the sake of SMP
it's best to simply prevent the random address update if we have these
problematic operations in progress.

This patch adds a set_random_addr() helper function for the use of
hci_update_random_address which contains the necessary checks for
advertising and ongoing LE connections.

One extra thing we need to do is to clear the HCI_ADVERTISING flag in
the enable_advertising() function before sending any commands. Since
re-enabling advertising happens by calling first disable_advertising()
and then enable_advertising() all while having the HCI_ADVERTISING flag
set. Clearing the flag lets the set_random_addr() function know that
it's safe to write a new address at least as far as advertising is
concerned.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-28 07:53:06 -08:00
Johan Hedberg 759331d7cc Bluetooth: Fix clearing SMP keys if pairing fails
If SMP fails we should not leave any keys (LTKs or IRKs) hanging around
the internal lists. This patch adds the necessary code to
smp_chan_destroy to remove any keys we may have in case of pairing
failure.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-28 12:36:10 +02:00
Marcel Holtmann fe39c7b2da Bluetooth: Use __le64 type for LE random numbers
The random numbers in Bluetooth Low Energy are 64-bit numbers and should
also be little endian since the HCI specification is little endian.

Change the whole Low Energy pairing to use __le64 instead of a byte
array.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-02-28 12:36:04 +02:00
Johan Hedberg a3172b7eb4 Bluetooth: Add timer to force power off
If some of the cleanup commands caused by mgmt_set_powered(off) never
complete we should still force the adapter to be powered down. This is
rather easy to do since hdev->power_off is already a delayed work
struct. This patch schedules this delayed work if at least one HCI
command was sent by the cleanup procedure.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-27 23:41:07 -08:00
Johan Hedberg c9910d0fb4 Bluetooth: Fix disconnecting connections in non-connected states
When powering off and disconnecting devices we should also consider
connections which have not yet reached the BT_CONNECTED state. They may
not have a valid handle yet and simply sending a HCI_Disconnect will not
work.

This patch updates the code to either disconnect, cancel connection
creation or reject incoming connection creation based on the current
conn->state value as well as the link type in question.

When the power off procedure results in canceling connection attempts
instead of disconnecting connections we get a connection failed event
instead of a disconnection event. Therefore, we also need to have extra
code in the mgmt_connect_failed function to check if we should proceed
with the power off or not.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-27 23:35:08 -08:00
Marcel Holtmann 0f36b589e4 Bluetooth: Track LE white list modification via HCI commands
When the LE white list gets changed via HCI commands make sure that
the internal storage of the white list entries gets updated.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-02-28 09:31:31 +02:00
Marcel Holtmann d2ab0ac18d Bluetooth: Add support for storing LE white list entries
The current LE white list entries require storing in the HCI controller
structure. So provide a storage and access functions for it. In addition
export the current list via debugfs.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-02-28 09:31:20 +02:00
Marcel Holtmann 747d3f0301 Bluetooth: Clear all LE white list entries when powering controller
When starting up a controller make sure that all LE white list entries
are cleared. Normally the HCI Reset takes care of this. This is just
in case no HCI Reset has been executed.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-02-28 09:31:14 +02:00
Marcel Holtmann c9507490ab Bluetooth: Make hci_blacklist_clear function static
The hci_blacklist_clear function is not used outside of hci_core.c and
can be made static.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-02-28 09:31:05 +02:00
Hans Schillstrom accfe0e356 ipv6: ipv6_find_hdr restore prev functionality
The commit 9195bb8e38 ("ipv6: improve
ipv6_find_hdr() to skip empty routing headers") broke ipv6_find_hdr().

When a target is specified like IPPROTO_ICMPV6 ipv6_find_hdr()
returns -ENOENT when it's found, not the header as expected.

A part of IPVS is broken and possible also nft_exthdr_eval().
When target is -1 which it is most cases, it works.

This patch exits the do while loop if the specific header is found
so the nexthdr could be returned as expected.

Reported-by: Art -kwaak- van Breemen <ard@telegraafnet.nl>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
CC:Ansis Atteka <aatteka@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-27 18:27:26 -05:00
Duan Jiong feff9ab2e7 neigh: recompute reachabletime before returning from neigh_periodic_work()
If the neigh table's entries is less than gc_thresh1, the function
will return directly, and the reachabletime will not be recompute,
so the reachabletime can be guessed.

Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-27 18:21:17 -05:00
David S. Miller 352063c839 Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless
John W. Linville says:

====================
Regarding the mac80211 bits, Johannes says:

"This time, I have a fix from Arik for scheduled scan recovery (something
that only recently went into the tree), a memory leak fix from Eytan and
a small regulatory bugfix from Inbal. The EAPOL change from Felix makes
rekeying more stable while lots of traffic is flowing, and there's
Emmanuel's and my fixes for a race in the code handling powersaving
clients."

Regarding the NFC bits, Samuel says:

"We only have one candidate for 3.14 fixes, and this is a NCI NULL
pointer dereference introduced during the 3.14 merge window."

Regarding the iwlwifi bits, Emmanuel says:

"This should fix an issue raised in iwldvm when we have lots of
association failures.  There is a bugzilla for this bug - it hasn't
been validated by the user, but I hope it will do the trick."

Beyond that...

Amitkumar Karwar brings two mwifiex fixes, one to avoid a NULL pointer
dereference and another to address an improperly timed interrupt.

Arend van Spriel gives us a brcmfmac fix to avoid a crash during
scatter-gather packet transfers.

Avinash Patila offers an mwifiex to avoid an invalid memory access
when a device is removed.

Bing Zhao delivers a simple fix to avoid a naming conflict between
libertas and mwifiex.

Felix Fietkau provides a trio of ath9k fixes that properly account
for sequence numbering in ps-poll frames, reduce the rate for false
positives during baseband hang detection, and fix a regression related
to rx descriptor handling.

James Cameron shows us a libertas fix to ignore zero-length IEs when
processing scan results.

Kirill Tkhai brings a hostap fix to avoid prematurely freeing a timer.

Stanislaw Gruszka fixes an ath9k locking problem.

Sujith Manoharan addresses ETSI compliance for a device handled by
ath9k by adjusting the minimum CCA power threshold values.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-27 17:42:43 -05:00
Duan Jiong 5e2c21dceb neigh: directly goto out after setting nud_state to NUD_FAILED
Because those following if conditions will not be matched.

Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-27 16:40:38 -05:00
David S. Miller 8e1f40ec77 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next
Steffen Klassert says:

====================
This is the rework of the IPsec virtual tunnel interface
for ipv4 to support inter address family tunneling and
namespace crossing. The only change to the last RFC version
is a compile fix for an odd configuration where CONFIG_XFRM
is set but CONFIG_INET is not set.

1) Add and use a IPsec protocol multiplexer.

2) Add xfrm_tunnel_skb_cb to the skb common buffer
   to store a receive callback there.

3) Make vti work with i_key set by not including the i_key
   when comupting the hash for the tunnel lookup in case of
   vti tunnels.

4) Update ip_vti to use it's own receive hook.

5) Remove xfrm_tunnel_notifier, this is replaced by the IPsec
   protocol multiplexer.

6) We need to be protocol family indepenent, so use the on xfrm_lookup
   returned dst_entry instead of the ipv4 rtable in vti_tunnel_xmit().

7) Add support for inter address family tunneling.

8) Check if the tunnel endpoints of the xfrm state and the vti interface
   are matching and return an error otherwise.

8) Enable namespace crossing tor vti devices.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-27 16:31:54 -05:00
David S. Miller 23187212e7 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:

====================
1) Build fix for ip_vti when NET_IP_TUNNEL is not set.
   We need this set to have ip_tunnel_get_stats64()
   available.

2) Fix a NULL pointer dereference on sub policy usage.
   We try to access a xfrm_state from the wrong array.

3) Take xfrm_state_lock in xfrm_migrate_state_find(),
   we need it to traverse through the state lists.

4) Clone states properly on migration, otherwise we crash
   when we migrate a state with aead algorithm attached.

5) Fix unlink race when between thread context and timer
   when policies are deleted.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-27 16:19:41 -05:00
Lorenzo Colitti bf439b3154 net: ipv6: ping: Use socket mark in routing lookup
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-27 16:08:46 -05:00
John W. Linville 8e2a89c515 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem 2014-02-27 15:05:51 -05:00
Johannes Berg cb66498160 mac80211: fix association to 20/40 MHz VHT networks
When a VHT network uses 20 or 40 MHz as per the HT operation
information, the channel center frequency segment 0 field in
the VHT operation information is reserved, so ignore it.

This fixes association with such networks when the AP puts 0
into the field, previously we'd disconnect due to an invalid
channel with the message
wlan0: AP VHT information is invalid, disable VHT

Cc: stable@vger.kernel.org
Fixes: f2d9d270c1 ("mac80211: support VHT association")
Reported-by: Tim Nelson <tim.l.nelson@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-02-27 20:53:01 +01:00
Hiroaki SHIMODA 724b9e1d75 sch_tbf: Fix potential memory leak in tbf_change().
The allocated child qdisc is not freed in error conditions.
Defer the allocation after user configuration turns out to be
valid and acceptable.

Fixes: cc106e441a ("net: sched: tbf: fix the calculation of max_size")
Signed-off-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
Cc: Yang Yingliang <yangyingliang@huawei.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-27 12:53:50 -05:00
Johan Hedberg a1f4c3188b Bluetooth: Add hci_copy_identity_address convenience function
The number of places needing the local Identity Address are starting to
grow so it's better to have a single place for the logic of determining
it. This patch adds a convenience function for getting the Identity
Address and updates the two current places needing this to use it.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-27 08:50:21 -08:00
Johan Hedberg 56ed2cb88c Bluetooth: Add tracking of advertising address type
To know the real source address for incoming connections (needed e.g.
for SMP) we should store the own_address_type parameter that was used
for the last HCI_LE_Write_Advertising_Parameters command. This patch
adds a proper command complete handler for the command and stores the
address type in a new adv_addr_type variable in the hci_dev struct.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-27 08:50:21 -08:00
Lukasz Rymanowski d3a2541d83 Bluetooth: Fix response on confirm_name
According to mgmt-api.txt, in case of confirm name command,
cmd_complete should be always use as a response. Not command status
as it is now for failures.
Using command complete on failure is actually better as client might
be interested in device address for which confirm name failed.

Signed-off-by: Lukasz Rymanowski <lukasz.rymanowski@tieto.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-27 08:41:09 -08:00
Pablo Neira Ayuso 0768b3b3d2 netfilter: nf_tables: add optional user data area to rules
This allows us to store user comment strings, but it could be also
used to store any kind of information that the user application needs
to link to the rule.

Scratch 8 bits for the new ulen field that indicates the length the
user data area. 4 bits from the handle (so it's 42 bits long, according
to Patrick, it would last 139 years with 1000 new rules per second)
and 4 bits from dlen (so the expression data area is 4K, which seems
sufficient by now even considering the compatibility layer).

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Patrick McHardy <kaber@trash.net>
2014-02-27 16:56:00 +01:00
Andre Guedes dd2ef8e274 Bluetooth: Update background scan parameters
If new scanning parameters are set while background scan is running,
we should restart background scanning so these parameters are updated.

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-26 19:41:35 -08:00
Andre Guedes 8ef30fd3d1 Bluetooth: Create hci_req_add_le_passive_scan helper
This patches creates the public hci_req_add_le_passive_scan helper so
it can be re-used outside hci_core.c in the next patch.

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-26 19:41:35 -08:00
Andre Guedes 7d474e06ef Bluetooth: Add le_auto_conn file on debugfs
This patch adds to debugfs the le_auto_conn file. This file will be
used to test LE auto connection infrastructure.

This file accept writes in the following format:
  "add <address> <address_type> [auto_connect]"
  "del <address> <address_type>"
  "clr"

The <address type> values are:
  * 0 for public address
  * 1 for random address

The [auto_connect] values are (for more details see struct hci_
conn_params):
  * 0 for disabled (default)
  * 1 for always
  * 2 for link loss

So for instance, if you want the kernel autonomously establishes
connections with device AA:BB:CC:DD:EE:FF (public address) every
time the device enters in connectable mode (starts advertising),
you should run the command:
$ echo "add AA:BB:CC:DD:EE:FF 0 1" > /sys/kernel/debug/bluetooth/hci0/le_auto_conn

To delete the connection parameters for that device, run the command:
$ echo "del AA:BB:CC:DD:EE:FF 0" > /sys/kernel/debug/bluetooth/hci0/le_auto_conn

To clear the connection parameters list, run the command:
$ echo "clr" > /sys/kernel/debug/bluetooth/hci0/le_auto_conn

Finally. to get the list of connection parameters configured in kernel,
read the le_auto_conn file:
$ cat /sys/kernel/debug/bluetooth/hci0/le_auto_conn

This file is created only if LE is enabled.

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-26 19:41:35 -08:00
Andre Guedes 5b906a84a5 Bluetooth: Support resolvable private addresses
Only identity addresses are inserted into hdev->pend_le_conns. So,
in order to support resolvable private addresses in auto connection
mechanism, we should resolve the address before checking for pending
connections.

Thus, this patch adds an extra check in check_pending_le_conn() and
updates 'addr' and 'addr_type' variables before hci_pend_le_conn_
lookup().

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-26 19:41:35 -08:00
Andre Guedes a9b0a04c2a Bluetooth: Connection parameters and resolvable address
We should only accept connection parameters from identity addresses
(public or random static). Thus, we should check the address type
in hci_conn_params_add().

Additionally, since the IRK is removed during unpair, we should also
remove the connection parameters from that device.

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-26 19:41:35 -08:00
Andre Guedes 6046dc3e06 Bluetooth: Auto connection and power on
When hdev is closed (e.g. Mgmt power off command, RFKILL or controller
is reset), the ongoing active connections are silently dropped by the
controller (no Disconnection Complete Event is sent to host). For that
reason, the devices that require HCI_AUTO_CONN_ALWAYS are not added to
hdev->pend_le_conns list and they won't auto connect.

So to fix this issue, during hdev closing, we remove all pending LE
connections. After adapter is powered on, we add a pending LE connection
for each HCI_AUTO_CONN_ALWAYS address.

This way, the auto connection mechanism works propely after a power
off and power on sequence as well as RFKILL block/unblock.

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-26 19:41:35 -08:00
Andre Guedes c54c3860e3 Bluetooth: Temporarily stop background scanning on discovery
If the user sends a mgmt start discovery command while the background
scanning is running, we should temporarily stop it. Once the discovery
finishes, we start the background scanning again.

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-26 19:41:35 -08:00
Andre Guedes cef952ce76 Bluetooth: Connection parameters and auto connection
This patch modifies hci_conn_params_add() and hci_conn_params_del() so
they also add/delete pending LE connections according to the auto_
connect option. This way, background scan is automatically triggered/
untriggered when connection parameters are added/removed.

For instance, when a new connection parameters with HCI_AUTO_CONN_ALWAYS
option is added and we are not connected to the device, we add a pending
LE connection for that device.

Likewise, when the connection parameters are updated we add or delete
a pending LE connection according to its new auto_connect option.

Finally, when the connection parameter is deleted we also delete the
pending LE connection (if any).

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-26 19:41:34 -08:00
Andre Guedes 9fcb18ef3a Bluetooth: Introduce LE auto connect options
This patch introduces the LE auto connection options: HCI_AUTO_CONN_
ALWAYS and HCI_AUTO_CONN_LINK_LOSS. Their working mechanism are
described as follows:

The HCI_AUTO_CONN_ALWAYS option configures the kernel to always re-
establish the connection, no matter the reason the connection was
terminated. This feature is required by some LE profiles such as
HID over GATT, Health Thermometer and Blood Pressure. These profiles
require the host autonomously connect to the device as soon as it
enters in connectable mode (start advertising) so the device is able
to delivery notifications or indications.

The BT_AUTO_CONN_LINK_LOSS option configures the kernel to re-
establish the connection in case the connection was terminated due
to a link loss. This feature is required by the majority of LE
profiles such as Proximity, Find Me, Cycling Speed and Cadence and
Time.

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-26 19:41:34 -08:00
Andre Guedes a4790dbd43 Bluetooth: Introduce LE auto connection infrastructure
This patch introduces the LE auto connection infrastructure which
will be used to implement the LE auto connection options.

In summary, the auto connection mechanism works as follows: Once the
first pending LE connection is created, the background scanning is
started. When the target device is found in range, the kernel
autonomously starts the connection attempt. If connection is
established successfully, that pending LE connection is deleted and
the background is stopped.

To achieve that, this patch introduces the hci_update_background_scan()
which controls the background scanning state. This function starts or
stops the background scanning based on the hdev->pend_le_conns list. If
there is no pending LE connection, the background scanning is stopped.
Otherwise, we start the background scanning.

Then, every time a pending LE connection is added we call hci_update_
background_scan() so the background scanning is started (in case it is
not already running). Likewise, every time a pending LE connection is
deleted we call hci_update_background_scan() so the background scanning
is stopped (in case this was the last pending LE connection) or it is
started again (in case we have more pending LE connections). Finally,
we also call hci_update_background_scan() in hci_le_conn_failed() so
the background scan is restarted in case the connection establishment
fails. This way the background scanning keeps running until all pending
LE connection are established.

At this point, resolvable addresses are not support by this
infrastructure. The proper support is added in upcoming patches.

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-26 19:41:34 -08:00
Andre Guedes 77a77a30ae Bluetooth: Introduce hdev->pend_le_conn list
This patch introduces the hdev->pend_le_conn list which holds the
device addresses the kernel should autonomously connect. It also
introduces some helper functions to manipulate the list.

The list and helper functions will be used by the next patch which
implements the LE auto connection infrastructure.

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-26 19:41:34 -08:00
Andre Guedes 6f77d8c757 Bluetooth: Move address type conversion to outside hci_connect_le
This patch moves address type conversion (L2CAP address type to HCI
address type) to outside hci_connect_le. This way, we avoid back and
forth address type conversion in a comming patch.

So hci_connect_le() now expects 'dst_type' parameter in HCI address
type convention.

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-26 19:41:34 -08:00
Andre Guedes 04a6c5898e Bluetooth: Refactor HCI connection code
hci_connect() is a very simple and useless wrapper of hci_connect_acl
and hci_connect_le functions. Addtionally, all places where hci_connect
is called the link type value is passed explicitly. This way, we can
safely delete hci_connect, declare hci_connect_acl and hci_connect_le
in hci_core.h and call them directly.

No functionality is changed by this patch.

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-26 19:41:34 -08:00
Andre Guedes c99ed8343c Bluetooth: Remove unused function
This patch removes hci_create_le_conn() since it is not used anymore.

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-26 19:41:33 -08:00
Andre Guedes 2acf3d9066 Bluetooth: Stop scanning on LE connection
Some LE controllers don't support scanning and creating a connection
at the same time. So we should always stop scanning in order to
establish the connection.

Since we may prematurely stop the discovery procedure in favor of
the connection establishment, we should also cancel hdev->le_scan_
disable delayed work and set the discovery state to DISCOVERY_STOPPED.

This change does a small improvement since it is not mandatory the
user stops scanning before connecting anymore. Moreover, this change
is required by upcoming LE auto connection mechanism in order to work
properly with controllers that don't support background scanning and
connection establishment at the same time.

In future, we might want to do a small optimization by checking if
controller is able to scan and connect at the same time. For now,
we want the simplest approach so we always stop scanning (even if
the controller is able to carry out both operations).

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-26 19:41:33 -08:00
Andre Guedes 06c053fb54 Bluetooth: Declare le_conn_failed in hci_core.h
This patch adds the "hci_" prefix to le_conn_failed() helper and
declares it in hci_core.h so it can be reused in hci_event.c.

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-26 19:41:33 -08:00
Andre Guedes b1efcc2870 Bluetooth: Create hci_req_add_le_scan_disable helper
This patch moves stop LE scanning duplicate code to one single
place and reuses it. This will avoid more duplicate code in
upcoming patches.

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-26 19:41:33 -08:00
Eric Dumazet 740b0f1841 tcp: switch rtt estimations to usec resolution
Upcoming congestion controls for TCP require usec resolution for RTT
estimations. Millisecond resolution is simply not enough these days.

FQ/pacing in DC environments also require this change for finer control
and removal of bimodal behavior due to the current hack in
tcp_update_pacing_rate() for 'small rtt'

TCP_CONG_RTT_STAMP is no longer needed.

As Julian Anastasov pointed out, we need to keep user compatibility :
tcp_metrics used to export RTT and RTTVAR in msec resolution,
so we added RTT_US and RTTVAR_US. An iproute2 patch is needed
to use the new attributes if provided by the kernel.

In this example ss command displays a srtt of 32 usecs (10Gbit link)

lpk51:~# ./ss -i dst lpk52
Netid  State      Recv-Q Send-Q   Local Address:Port       Peer
Address:Port
tcp    ESTAB      0      1         10.246.11.51:42959
10.246.11.52:64614
         cubic wscale:6,6 rto:201 rtt:0.032/0.001 ato:40 mss:1448
cwnd:10 send
3620.0Mbps pacing_rate 7240.0Mbps unacked:1 rcv_rtt:993 rcv_space:29559

Updated iproute2 ip command displays :

lpk51:~# ./ip tcp_metrics | grep 10.246.11.52
10.246.11.52 age 561.914sec cwnd 10 rtt 274us rttvar 213us source
10.246.11.51

Old binary displays :

lpk51:~# ip tcp_metrics | grep 10.246.11.52
10.246.11.52 age 561.914sec cwnd 10 rtt 250us rttvar 125us source
10.246.11.51

With help from Julian Anastasov, Stephen Hemminger and Yuchung Cheng

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Larry Brakmo <brakmo@google.com>
Cc: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26 17:08:40 -05:00
Johan Hedberg 4bd6d38e7f Bluetooth: Remove unneeded "force" parameter from smp_distribute_keys()
Now that to-be-received keys are properly tracked we no-longer need the
"force" parameter to smp_distribute_keys(). It was essentially acting as
an indicator whether all keys have been received, but now it's just
redundant together with smp->remote_key_dist.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-26 13:46:04 -08:00
Johan Hedberg efabba37fe Bluetooth: Simplify logic for checking for SMP completion
Now that smp->remote_key_dist is tracking the keys we're still waiting
for we can use it to simplify the logic for checking whether we're done
with key distribution or not. At the same time the reliance on the
"force" parameter of smp_distribute_keys goes away and it can completely
be removed in a subsequent patch.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-26 13:46:04 -08:00
Johan Hedberg 9747a9f317 Bluetooth: Track not yet received keys in SMP
To make is easier to track which keys we've received and which ones
we're still waiting for simply clear the corresponding key bits from
smp->remote_key_dist as they get received. This will allow us to
simplify the code for checking for SMP completion in subsequent patches.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-26 13:46:03 -08:00
Hannes Frederic Sowa 0b95227a7b ipv6: yet another new IPV6_MTU_DISCOVER option IPV6_PMTUDISC_OMIT
This option has the same semantic as IP_PMTUDISC_OMIT for IPv4 which
got recently introduced. It doesn't honor the path mtu discovered by the
host but in contrary to IPV6_PMTUDISC_INTERFACE allows the generation of
fragments if the packet size exceeds the MTU of the outgoing interface
MTU.

Fixes: 93b36cf342 ("ipv6: support IPV6_PMTU_INTERFACE on sockets")
Cc: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26 15:51:01 -05:00
Hannes Frederic Sowa 1b34657635 ipv4: yet another new IP_MTU_DISCOVER option IP_PMTUDISC_OMIT
IP_PMTUDISC_INTERFACE has a design error: because it does not allow the
generation of fragments if the interface mtu is exceeded, it is very
hard to make use of this option in already deployed name server software
for which I introduced this option.

This patch adds yet another new IP_MTU_DISCOVER option to not honor any
path mtu information and not accepting new icmp notifications destined for
the socket this option is enabled on. But we allow outgoing fragmentation
in case the packet size exceeds the outgoing interface mtu.

As such this new option can be used as a drop-in replacement for
IP_PMTUDISC_DONT, which is currently in use by most name server software
making the adoption of this option very smooth and easy.

The original advantage of IP_PMTUDISC_INTERFACE is still maintained:
ignoring incoming path MTU updates and not honoring discovered path MTUs
in the output path.

Fixes: 482fc6094a ("ipv4: introduce new IP_MTU_DISCOVER mode IP_PMTUDISC_INTERFACE")
Cc: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26 15:51:00 -05:00
Hannes Frederic Sowa 69647ce46a ipv4: use ip_skb_dst_mtu to determine mtu in ip_fragment
ip_skb_dst_mtu mostly falls back to ip_dst_mtu_maybe_forward if no socket
is attached to the skb (in case of forwarding) or determines the mtu like
we do in ip_finish_output, which actually checks if we should branch to
ip_fragment. Thus use the same function to determine the mtu here, too.

This is important for the introduction of IP_PMTUDISC_OMIT, where we
want the packets getting cut in pieces of the size of the outgoing
interface mtu. IPv6 already does this correctly.

Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26 15:51:00 -05:00
Timo Teräs a960ff81f0 neigh: probe application via netlink in NUD_PROBE
iproute2 arpd seems to expect this as there's code and comments
to handle netlink probes with NUD_PROBE set. It is used to flush
the arpd cached mappings.

opennhrp instead turns off unicast probes (so it can handle all
neighbour discovery). Without this change it will not see NUD_PROBE
probes and cannot reconfirm the mapping. Thus currently neigh entry
will just fail and can cause few packets dropped until broadcast
discovery is restarted.

Earlier discussion on the subject:
http://marc.info/?t=139305877100001&r=1&w=2

Signed-off-by: Timo Teräs <timo.teras@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26 15:46:25 -05:00
Bjørn Mork 84a3e72c3a ipv6: log src and dst along with "udp checksum is 0"
These info messages are rather pointless without any means to identify
the source of the bogus packets.  Logging the src and dst addresses and
ports may help a bit.

Cc: Joe Perches <joe@perches.com>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26 15:46:25 -05:00
Amir Vadai 3f85944fe2 net: Add sysfs file for port number
Add a sysfs file to enable user space to query the device
port number used by a netdevice instance. This is needed for
devices that have multiple ports on the same PCI function.

Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26 15:38:06 -05:00
Florian Westphal 8e165e2034 net: tcp: add mib counters to track zero window transitions
Three counters are added:
- one to track when we went from non-zero to zero window
- one to track the reverse
- one counter incremented when we want to announce zero window,
  but can't because we would shrink current window.

Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26 15:23:30 -05:00
Eric Dumazet 9a9bfd032f net: tcp: use NET_INC_STATS()
While LINUX_MIB_TCPSPURIOUS_RTX_HOSTQUEUES can only be incremented
in tcp_transmit_skb() from softirq (incoming message or timer
activation), it is better to use NET_INC_STATS() instead of
NET_INC_STATS_BH() as tcp_transmit_skb() can be called from process
context.

This will avoid copy/paste confusion when/if we want to add
other SNMP counters in tcp_transmit_skb()

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26 15:19:47 -05:00
David Howells e8388eb103 af_rxrpc: Request an ACK for every alternate DATA packet
Set the RxRPC header flag to request an ACK packet for every odd-numbered DATA
packet unless it's the last one (which implicitly requests an ACK anyway).
This is similar to how librx appears to work.

If we don't do this, we'll send out a full window of packets and then just sit
there until the other side gets bored and sends an ACK to indicate that it's
been idle for a while and has received no new packets.

Requesting a lot of ACKs shouldn't be a problem as ACKs should be merged when
possible.

As AF_RXRPC currently works, it will schedule an ACK to be generated upon
receipt of a DATA packet with the ACK-request packet set - and in the time
taken to schedule this in a work queue, several other packets are likely to
arrive and then all get ACK'd together.

Signed-off-by: David Howells <dhowells@redhat.com>
2014-02-26 17:25:07 +00:00
David Howells 817913d8cd af_rxrpc: Expose more RxRPC parameters via sysctls
Expose RxRPC parameters via sysctls to control the Rx window size, the Rx MTU
maximum size and the number of packets that can be glued into a jumbo packet.

More info added to Documentation/networking/rxrpc.txt.

Signed-off-by: David Howells <dhowells@redhat.com>
2014-02-26 17:25:07 +00:00
David Howells 9823f39a17 af_rxrpc: Improve ACK production
Improve ACK production by the following means:

 (1) Don't send an ACK_REQUESTED ack immediately even if the RXRPC_MORE_PACKETS
     flag isn't set on a data packet that has also has RXRPC_REQUEST_ACK set.

     MORE_PACKETS just means that the sender just emptied its Tx data buffer.
     More data will be forthcoming unless RXRPC_LAST_PACKET is also flagged.

     It is possible to see runs of DATA packets with MORE_PACKETS unset that
     aren't waiting for an ACK.

     It is therefore better to wait a small instant to see if we can combine an
     ACK for several packets.

 (2) Don't send an ACK_IDLE ack immediately unless we're responding to the
     terminal data packet of a call.

     Whilst sending an ACK_IDLE mid-call serves to let the other side know
     that we won't be asking it to resend certain Tx buffers and that it can
     discard them, spamming it with loads of acks just because we've
     temporarily run out of data just distracts it.

 (3) Put the ACK_IDLE ack generation timeout up to half a second rather than a
     single jiffy.  Just because we haven't been given more data immediately
     doesn't mean that more isn't forthcoming.  The other side may be busily
     finding the data to send to us.

Signed-off-by: David Howells <dhowells@redhat.com>
2014-02-26 17:25:07 +00:00
David Howells 5873c0834f af_rxrpc: Add sysctls for configuring RxRPC parameters
Add sysctls for configuring RxRPC protocol handling, specifically controls on
delays before ack generation, the delay before resending a packet, the maximum
lifetime of a call and the expiration times of calls, connections and
transports that haven't been recently used.

More info added in Documentation/networking/rxrpc.txt.

Signed-off-by: David Howells <dhowells@redhat.com>
2014-02-26 17:25:06 +00:00
David Howells 6c9a2d3202 af_rxrpc: Fix UDP MTU calculation from ICMP_FRAG_NEEDED
AF_RXRPC sends UDP packets with the "Don't Fragment" bit set in an attempt to
determine the maximum packet size between the local socket and the peer by
invoking the generation of ICMP_FRAG_NEEDED packets.

Once a packet is sent with the "Don't Fragment" bit set, it is then
inconvenient to break it up as that requires recalculating all the rxrpc serial
and sequence numbers and reencrypting all the fragments, so we switch off the
"Don't Fragment" service temporarily and send the bounced packet again.  Future
packets then use the new MTU.

That's all fine.  The problem lies in rxrpc_UDP_error_report() where the code
that deals with ICMP_FRAG_NEEDED packets lives.  Packets of this type have a
field (ee_info) to indicate the maximum packet size at the reporting node - but
sometimes ee_info isn't filled in and is just left as 0 and the code must allow
for this.

When ee_info is 0, the code should take the MTU size we're currently using and
reduce it for the next packet we want to send.  However, it takes ee_info
(which is known to be 0) and tries to reduce that instead.

This was discovered by Coverity.

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2014-02-26 17:25:01 +00:00
Steffen Klassert 3a9016f97f xfrm: Fix unlink race when policies are deleted.
When a policy is unlinked from the lists in thread context,
the xfrm timer can fire before we can mark this policy as dead.
So reinitialize the bydst hlist, then hlist_unhashed() will
notice that this policy is not linked and will avoid a
doulble unlink of that policy.

Reported-by: Xianpeng Zhao <673321875@qq.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-26 09:52:02 +01:00
Mike Pecovnik 46833a86f7 net: Fix permission check in netlink_connect()
netlink_sendmsg() was changed to prevent non-root processes from sending
messages with dst_pid != 0.
netlink_connect() however still only checks if nladdr->nl_groups is set.
This patch modifies netlink_connect() to check for the same condition.

Signed-off-by: Mike Pecovnik <mike.pecovnik@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-25 18:35:14 -05:00
Hannes Frederic Sowa 91a48a2e85 ipv4: ipv6: better estimate tunnel header cut for correct ufo handling
Currently the UFO fragmentation process does not correctly handle inner
UDP frames.

(The following tcpdumps are captured on the parent interface with ufo
disabled while tunnel has ufo enabled, 2000 bytes payload, mtu 1280,
both sit device):

IPv6:
16:39:10.031613 IP (tos 0x0, ttl 64, id 3208, offset 0, flags [DF], proto IPv6 (41), length 1300)
    192.168.122.151 > 1.1.1.1: IP6 (hlim 64, next-header Fragment (44) payload length: 1240) 2001::1 > 2001::8: frag (0x00000001:0|1232) 44883 > distinct: UDP, length 2000
16:39:10.031709 IP (tos 0x0, ttl 64, id 3209, offset 0, flags [DF], proto IPv6 (41), length 844)
    192.168.122.151 > 1.1.1.1: IP6 (hlim 64, next-header Fragment (44) payload length: 784) 2001::1 > 2001::8: frag (0x00000001:0|776) 58979 > 46366: UDP, length 5471

We can see that fragmentation header offset is not correctly updated.
(fragmentation id handling is corrected by 916e4cf46d ("ipv6: reuse
ip6_frag_id from ip6_ufo_append_data")).

IPv4:
16:39:57.737761 IP (tos 0x0, ttl 64, id 3209, offset 0, flags [DF], proto IPIP (4), length 1296)
    192.168.122.151 > 1.1.1.1: IP (tos 0x0, ttl 64, id 57034, offset 0, flags [none], proto UDP (17), length 1276)
    192.168.99.1.35961 > 192.168.99.2.distinct: UDP, length 2000
16:39:57.738028 IP (tos 0x0, ttl 64, id 3210, offset 0, flags [DF], proto IPIP (4), length 792)
    192.168.122.151 > 1.1.1.1: IP (tos 0x0, ttl 64, id 57035, offset 0, flags [none], proto UDP (17), length 772)
    192.168.99.1.13531 > 192.168.99.2.20653: UDP, length 51109

In this case fragmentation id is incremented and offset is not updated.

First, I aligned inet_gso_segment and ipv6_gso_segment:
* align naming of flags
* ipv6_gso_segment: setting skb->encapsulation is unnecessary, as we
  always ensure that the state of this flag is left untouched when
  returning from upper gso segmenation function
* ipv6_gso_segment: move skb_reset_inner_headers below updating the
  fragmentation header data, we don't care for updating fragmentation
  header data
* remove currently unneeded comment indicating skb->encapsulation might
  get changed by upper gso_segment callback (gre and udp-tunnel reset
  encapsulation after segmentation on each fragment)

If we encounter an IPIP or SIT gso skb we now check for the protocol ==
IPPROTO_UDP and that we at least have already traversed another ip(6)
protocol header.

The reason why we have to special case GSO_IPIP and GSO_SIT is that
we reset skb->encapsulation to 0 while skb_mac_gso_segment the inner
protocol of GSO_UDP_TUNNEL or GSO_GRE packets.

Reported-by: Wolfgang Walter <linux@stwm.de>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-25 18:27:06 -05:00
Johan Hedberg a9a58f8612 Bluetooth: Ignore IRKs with no Identity Address
The Core Specification (4.1) leaves room for sending an SMP Identity
Address Information PDU with an all-zeros BD_ADDR value. This
essentially means that we would not have an Identity Address for the
device and the only means of identifying it would be the IRK value
itself.

Due to lack of any known implementations behaving like this it's best to
keep our implementation as simple as possible as far as handling such
situations is concerned. This patch updates the Identity Address
Information handler function to simply ignore the IRK received from such
a device.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-25 12:30:41 -08:00
Johan Hedberg a4858cb942 Bluetooth: Fix advertising address type when toggling connectable
When the connectable setting is toggled using mgmt_set_connectable the
HCI_CONNECTABLE flag will only be set once the related HCI commands
succeed. When determining what kind of advertising to do we need to
therefore also check whether there is a pending Set Connectable command
in addition to the current flag value.

The enable_advertising function was already taking care of this for the
advertising type with the help of the get_adv_type function, but was
failing to do the same for the address type selection. This patch
converts the get_adv_type function to be more generic in that it returns
the expected connectable state and updates the enable_advertising
function to use the return value both for the advertising type as well
as the advertising address type.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-25 10:02:53 -08:00
Andrzej Kaczmarek ede81a2a12 Bluetooth: Fix NULL pointer dereference when sending data
When trying to allocate skb for new PDU, l2cap_chan is unlocked so we
can sleep waiting for memory as otherwise there's possible deadlock as
fixed in e454c84464. However, in a6a5568c03 lock was moved from socket
to channel level and it's no longer safe to just unlock and lock again
without checking l2cap_chan state since channel can be disconnected
when lock is not held.

This patch adds missing checks for l2cap_chan state when returning from
call which allocates skb.

Scenario is easily reproducible by running rfcomm-tester in a loop.

BUG: unable to handle kernel NULL pointer dereference at         (null)
IP: [<ffffffffa0442169>] l2cap_do_send+0x29/0x120 [bluetooth]
PGD 0
Oops: 0000 [#1] SMP
Modules linked in:
CPU: 7 PID: 4038 Comm: krfcommd Not tainted 3.14.0-rc2+ #15
Hardware name: Dell Inc. OptiPlex 790/0HY9JP, BIOS A10 11/24/2011
task: ffff8802bdd731c0 ti: ffff8801ec986000 task.ti: ffff8801ec986000
RIP: 0010:[<ffffffffa0442169>]  [<ffffffffa0442169>] l2cap_do_send+0x29/0x120
RSP: 0018:ffff8801ec987ad8  EFLAGS: 00010202
RAX: 0000000000000000 RBX: ffff8800c5796800 RCX: 0000000000000000
RDX: ffff880410e7a800 RSI: ffff8802b6c1da00 RDI: ffff8800c5796800
RBP: ffff8801ec987af8 R08: 00000000000000c0 R09: 0000000000000300
R10: 000000000000573b R11: 000000000000573a R12: ffff8802b6c1da00
R13: 0000000000000000 R14: ffff8802b6c1da00 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff88042dce0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000041257c000 CR4: 00000000000407e0
Stack:
 ffff8801ec987d78 ffff8800c5796800 ffff8801ec987d78 0000000000000000
 ffff8801ec987ba8 ffffffffa0449e37 0000000000000004 ffff8801ec987af0
 ffff8801ec987d40 0000000000000282 0000000000000000 ffffffff00000004
Call Trace:
 [<ffffffffa0449e37>] l2cap_chan_send+0xaa7/0x1120 [bluetooth]
 [<ffffffff81770100>] ? _raw_spin_unlock_bh+0x20/0x40
 [<ffffffffa045188b>] l2cap_sock_sendmsg+0xcb/0x110 [bluetooth]
 [<ffffffff81652b0f>] sock_sendmsg+0xaf/0xc0
 [<ffffffff810a8381>] ? update_curr+0x141/0x200
 [<ffffffff810a8961>] ? dequeue_entity+0x181/0x520
 [<ffffffff81652b60>] kernel_sendmsg+0x40/0x60
 [<ffffffffa04a8505>] rfcomm_send_frame+0x45/0x70 [rfcomm]
 [<ffffffff810766f0>] ? internal_add_timer+0x20/0x50
 [<ffffffffa04a8564>] rfcomm_send_cmd+0x34/0x60 [rfcomm]
 [<ffffffffa04a8605>] rfcomm_send_disc+0x75/0xa0 [rfcomm]
 [<ffffffffa04aacec>] rfcomm_run+0x8cc/0x1a30 [rfcomm]
 [<ffffffffa04aa420>] ? rfcomm_check_accept+0xc0/0xc0 [rfcomm]
 [<ffffffff8108e3a9>] kthread+0xc9/0xe0
 [<ffffffff8108e2e0>] ? flush_kthread_worker+0xb0/0xb0
 [<ffffffff817795fc>] ret_from_fork+0x7c/0xb0
 [<ffffffff8108e2e0>] ? flush_kthread_worker+0xb0/0xb0
Code: 00 00 66 66 66 66 90 55 48 89 e5 48 83 ec 20 f6 05 d6 a3 02 00 04
RIP  [<ffffffffa0442169>] l2cap_do_send+0x29/0x120 [bluetooth]
 RSP <ffff8801ec987ad8>
CR2: 0000000000000000

Signed-off-by: Andrzej Kaczmarek <andrzej.kaczmarek@tieto.com>
Acked-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-25 10:02:53 -08:00
Ilan Peer 7c8d5e03ac cfg80211: send stop AP event only due to internal reason
Commit "nl80211: send event when AP operation is stopped" added an
event to notify user space that an AP interface has been stopped, to
handle cases such as suspend etc. The event is sent regardless
if the stop AP flow was triggered by user space or due to internal state
change.

This might cause issues with wpa_supplicant/hostapd flows that consider
stop AP flow as a synchronous one, e.g., AP/GO channel change in the
absence of CSA support. In such cases, the flow will restart the AP
immediately after the stop AP flow is done, and only handle the stop
AP event after the current flow is done, and as a result stop the AP
again.

Change the current implementation to only send the event in case the
stop AP was triggered due to an internal reason.

Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-02-25 17:34:18 +01:00
Janusz Dziedzic 31559f35c5 cfg80211: DFS get CAC time from regulatory database
Send Channel Availability Check time as a parameter
of start_radar_detection() callback.
Get CAC time from regulatory database.

Signed-off-by: Janusz Dziedzic <janusz.dziedzic@tieto.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-02-25 17:32:54 +01:00
Janusz Dziedzic 089027e57c cfg80211: regulatory: allow getting DFS CAC time from userspace
Introduce DFS CAC time as a regd param, configured per REG_RULE and
set per channel in cfg80211. DFS CAC time is close connected with
regulatory database configuration. Instead of using hardcoded values,
get DFS CAC time form regulatory database. Pass DFS CAC time to user
mode (mainly for iw reg get, iw list, iw info). Allow setting DFS CAC
time via CRDA. Add support for internal regulatory database.

Signed-off-by: Janusz Dziedzic <janusz.dziedzic@tieto.com>
[rewrap commit log]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-02-25 17:29:25 +01:00
Janusz Dziedzic fb5c96368f cfg80211: regulatory: allow user to set world regdomain
Allow to set world regulatory domain in case of user
request (iw reg set 00).

Signed-off-by: Janusz Dziedzic <janusz.dziedzic@tieto.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-02-25 16:27:45 +01:00
Janusz Dziedzic 092008abee cfg80211: regulatory: reset regdomain in case of error
Reset regdomain to world regdomain in case
of errors in set_regdom() function.

This will fix a problem with such scenario:
- iw reg set US
- iw reg set 00
- iw reg set US
The last step always fail and we get deadlock
in kernel regulatory code. Next setting new
regulatory wasn't possible due to:

Pending regulatory request, waiting for it to be processed...

Signed-off-by: Janusz Dziedzic <janusz.dziedzic@tieto.com>
Acked-by: Luis R. Rodriguez <mcgrof@do-not-panic.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-02-25 16:27:04 +01:00
Johannes Berg 1226d25870 cfg80211: regulatory: simplify uevent sending
There's no need for the struct device_type with the uevent function
etc., just fill the country alpha2 when sending the event.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-02-25 15:44:44 +01:00
Florian Westphal 39111fd261 netfilter: nfnetlink_log: remove unused code
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-02-25 11:30:01 +01:00
Patrick McHardy e0abdadcc6 netfilter: nf_tables: accept QUEUE/DROP verdict parameters
Allow userspace to specify the queue number or the errno code for QUEUE
and DROP verdicts.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-02-25 11:29:26 +01:00
Patrick McHardy 67a8fc27cc netfilter: nf_tables: add nft_dereference() macro
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-02-25 11:29:23 +01:00
Patrick McHardy 0eb5db7ad3 netfilter: nfnetlink: add rcu_dereference_protected() helpers
Add a lockdep_nfnl_is_held() function and a nfnl_dereference() macro for
RCU dereferences protected by a NFNL subsystem mutex.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-02-25 11:29:21 +01:00
Patrick McHardy 3e90ebd3c9 netfilter: ip_set: rename nfnl_dereference()/nfnl_set()
The next patch will introduce a nfnl_dereference() macro that actually
checks that the appropriate mutex is held and therefore needs a
subsystem argument.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-02-25 11:29:18 +01:00
Steffen Klassert 895de9a348 vti4: Enable namespace changing
vti4 is now fully namespace aware, so allow namespace changing
for vti devices

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-25 07:04:19 +01:00
Steffen Klassert 6e2de802af vti4: Check the tunnel endpoints of the xfrm state and the vti interface
The tunnel endpoints of the xfrm_state we got from the xfrm_lookup
must match the tunnel endpoints of the vti interface. This patch
ensures this matching.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-25 07:04:19 +01:00
Steffen Klassert 78a010cca0 vti4: Support inter address family tunneling.
With this patch we can tunnel ipv6 traffic via a vti4
interface. A vti4 interface can now have an ipv6 address
and ipv6 traffic can be routed via a vti4 interface.
The resulting traffic is xfrm transformed and tunneled
throuhg ipv4 if matching IPsec policies and states are
present.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-25 07:04:19 +01:00
Steffen Klassert a34cd4f319 vti4: Use the on xfrm_lookup returned dst_entry directly
We need to be protocol family indepenent to support
inter addresss family tunneling with vti. So use a
dst_entry instead of the ipv4 rtable in vti_tunnel_xmit.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-25 07:04:18 +01:00
Steffen Klassert 9994bb8e1e xfrm4: Remove xfrm_tunnel_notifier
This was used from vti and is replaced by the IPsec protocol
multiplexer hooks. It is now unused, so remove it.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-25 07:04:18 +01:00
Steffen Klassert df3893c176 vti: Update the ipv4 side to use it's own receive hook.
With this patch, vti uses the IPsec protocol multiplexer to
register it's own receive side hooks for ESP, AH and IPCOMP.

Vti now does the following on receive side:

1. Do an input policy check for the IPsec packet we received.
   This is required because this packet could be already
   prosecces by IPsec, so an inbuond policy check is needed.

2. Mark the packet with the i_key. The policy and the state
   must match this key now. Policy and state belong to the outer
   namespace and policy enforcement is done at the further layers.

3. Call the generic xfrm layer to do decryption and decapsulation.

4. Wait for a callback from the xfrm layer to properly clean the
   skb to not leak informations on namespace and to update the
   device statistics.

On transmit side:

1. Mark the packet with the o_key. The policy and the state
   must match this key now.

2. Do a xfrm_lookup on the original packet with the mark applied.

3. Check if we got an IPsec route.

4. Clean the skb to not leak informations on namespace
   transitions.

5. Attach the dst_enty we got from the xfrm_lookup to the skb.

6. Call dst_output to do the IPsec processing.

7. Do the device statistics.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-25 07:04:18 +01:00
Steffen Klassert 6d608f06e3 ip_tunnel: Make vti work with i_key set
Vti uses the o_key to mark packets that were transmitted or received
by a vti interface. Unfortunately we can't apply different marks
to in and outbound packets with only one key availabe. Vti interfaces
typically use wildcard selectors for vti IPsec policies. On forwarding,
the same output policy will match for both directions. This generates
a loop between the IPsec gateways until the ttl of the packet is
exceeded.

The gre i_key/o_key are usually there to find the right gre tunnel
during a lookup. When vti uses the i_key to mark packets, the tunnel
lookup does not work any more because vti does not use the gre keys
as a hash key for the lookup.

This patch workarounds this my not including the i_key when comupting
the hash for the tunnel lookup in case of vti tunnels.

With this we have separate keys available for the transmitting and
receiving side of the vti interface.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-25 07:04:18 +01:00
Steffen Klassert 70be6c91c8 xfrm: Add xfrm_tunnel_skb_cb to the skb common buffer
IPsec vti_rcv needs to remind the tunnel pointer to
check it later at the vti_rcv_cb callback. So add
this pointer to the IPsec common buffer, initialize
it and check it to avoid transport state matching of
a tunneled packet.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-25 07:04:17 +01:00
Steffen Klassert d099160e02 ipcomp4: Use the IPsec protocol multiplexer API
Switch ipcomp4 to use the new IPsec protocol multiplexer.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-25 07:04:17 +01:00
Steffen Klassert e5b56454e0 ah4: Use the IPsec protocol multiplexer API
Switch ah4 to use the new IPsec protocol multiplexer.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-25 07:04:17 +01:00
Steffen Klassert 827789cbd7 esp4: Use the IPsec protocol multiplexer API
Switch esp4 to use the new IPsec protocol multiplexer.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-25 07:04:17 +01:00
Steffen Klassert 3328715e6c xfrm4: Add IPsec protocol multiplexer
This patch add an IPsec protocol multiplexer. With this
it is possible to add alternative protocol handlers as
needed for IPsec virtual tunnel interfaces.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-02-25 07:04:16 +01:00
Joe Perches 0409114282 bridge: netfilter: Use ether_addr_copy
Convert the uses of memcpy to ether_addr_copy because
for some architectures it is smaller and faster.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24 19:16:44 -05:00
Joe Perches e5a727f663 bridge: Use ether_addr_copy and ETH_ALEN
Convert the more obvious uses of memcpy to ether_addr_copy.

There are still uses of memcpy that could be converted but
these addresses are __aligned(2).

Convert a couple uses of 6 in gr_private.h to ETH_ALEN.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24 19:16:43 -05:00
Eric Dumazet d10473d4e3 tcp: reduce the bloat caused by tcp_is_cwnd_limited()
tcp_is_cwnd_limited() allows GSO/TSO enabled flows to increase
their cwnd to allow a full size (64KB) TSO packet to be sent.

Non GSO flows only allow an extra room of 3 MSS.

For most flows with a BDP below 10 MSS, this results in a bloat
of cwnd reaching 90, and an inflate of RTT.

Thanks to TSO auto sizing, we can restrict the bloat to the number
of MSS contained in a TSO packet (tp->xmit_size_goal_segs), to keep
original intent without performance impact.

Because we keep cwnd small, it helps to keep TSO packet size to their
optimal value.

Example for a 10Mbit flow, with low TCP Small queue limits (no more than
2 skb in qdisc/device tx ring)

Before patch :

lpk51:~# ./ss -i dst lpk52:44862 | grep cwnd
         cubic wscale:6,6 rto:215 rtt:15.875/2.5 mss:1448 cwnd:96
ssthresh:96
send 70.1Mbps unacked:14 rcv_space:29200

After patch :

lpk51:~# ./ss -i dst lpk52:52916 | grep cwnd
         cubic wscale:6,6 rto:206 rtt:5.206/0.036 mss:1448 cwnd:15
ssthresh:14
send 33.4Mbps unacked:4 rcv_space:29200

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Nandita Dukkipati <nanditad@google.com>
Cc: Van Jacobson <vanj@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24 19:13:38 -05:00
Mathias Krause 72f8e06f3e pktgen: document all supported flags
The documentation misses a few of the supported flags. Fix this. Also
respect the dependency to CONFIG_XFRM for the IPSEC flag.

Cc: Fan Du <fan.du@windriver.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24 18:54:26 -05:00
Mathias Krause 0945574750 pktgen: simplify error handling in pgctrl_write()
The 'out' label is just a relict from previous times as pgctrl_write()
had multiple error paths. Get rid of it and simply return right away
on errors.

Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24 18:54:26 -05:00
Mathias Krause 20b0c718c3 pktgen: fix out-of-bounds access in pgctrl_write()
If a privileged user writes an empty string to /proc/net/pktgen/pgctrl
the code for stripping the (then non-existent) '\n' actually writes the
zero byte at index -1 of data[]. The then still uninitialized array will
very likely fail the command matching tests and the pr_warning() at the
end will therefore leak stack bytes to the kernel log.

Fix those issues by simply ensuring we're passed a non-empty string as
the user API apparently expects a trailing '\n' for all commands.

Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24 18:54:25 -05:00
David S. Miller 1f5a7407e4 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next
Steffen Klassert says:

====================
1) Introduce skb_to_sgvec_nomark function to add further data to the sg list
   without calling sg_unmark_end first. Needed to add extended sequence
   number informations. From Fan Du.

2) Add IPsec extended sequence numbers support to the Authentication Header
   protocol for ipv4 and ipv6. From Fan Du.

3) Make the IPsec flowcache namespace aware, from Fan Du.

4) Avoid creating temporary SA for every packet when no key manager is
   registered. From Horia Geanta.

5) Support filtering of SA dumps to show only the SAs that match a
   given filter. From Nicolas Dichtel.

6) Remove caching of xfrm_policy_sk_bundles. The cached socket policy bundles
   are never used, instead we create a new cache entry whenever xfrm_lookup()
   is called on a socket policy. Most protocols cache the used routes to the
   socket, so this caching is not needed.

7)  Fix a forgotten SADB_X_EXT_FILTER length check in pfkey, from Nicolas
    Dichtel.

8) Cleanup error handling of xfrm_state_clone.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24 18:13:33 -05:00
Frederic Weisbecker c46fff2a3b smp: Rename __smp_call_function_single() to smp_call_function_single_async()
The name __smp_call_function_single() doesn't tell much about the
properties of this function, especially when compared to
smp_call_function_single().

The comments above the implementation are also misleading. The main
point of this function is actually not to be able to embed the csd
in an object. This is actually a requirement that result from the
purpose of this function which is to raise an IPI asynchronously.

As such it can be called with interrupts disabled. And this feature
comes at the cost of the caller who then needs to serialize the
IPIs on this csd.

Lets rename the function and enhance the comments so that they reflect
these properties.

Suggested-by: Christoph Hellwig <hch@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Jens Axboe <axboe@fb.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-02-24 14:47:15 -08:00
Frederic Weisbecker fce8ad1568 smp: Remove wait argument from __smp_call_function_single()
The main point of calling __smp_call_function_single() is to send
an IPI in a pure asynchronous way. By embedding a csd in an object,
a caller can send the IPI without waiting for a previous one to complete
as is required by smp_call_function_single() for example. As such,
sending this kind of IPI can be safe even when irqs are disabled.

This flexibility comes at the expense of the caller who then needs to
synchronize the csd lifecycle by himself and make sure that IPIs on a
single csd are serialized.

This is how __smp_call_function_single() works when wait = 0 and this
usecase is relevant.

Now there don't seem to be any usecase with wait = 1 that can't be
covered by smp_call_function_single() instead, which is safer. Lets look
at the two possible scenario:

1) The user calls __smp_call_function_single(wait = 1) on a csd embedded
   in an object. It looks like a nice and convenient pattern at the first
   sight because we can then retrieve the object from the IPI handler easily.

   But actually it is a waste of memory space in the object since the csd
   can be allocated from the stack by smp_call_function_single(wait = 1)
   and the object can be passed an the IPI argument.

   Besides that, embedding the csd in an object is more error prone
   because the caller must take care of the serialization of the IPIs
   for this csd.

2) The user calls __smp_call_function_single(wait = 1) on a csd that
   is allocated on the stack. It's ok but smp_call_function_single()
   can do it as well and it already takes care of the allocation on the
   stack. Again it's more simple and less error prone.

Therefore, using the underscore prepend API version with wait = 1
is a bad pattern and a sign that the caller can do safer and more
simple.

There was a single user of that which has just been converted.
So lets remove this option to discourage further users.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Jens Axboe <axboe@fb.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-02-24 14:47:09 -08:00
John W. Linville db18014f65 Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next 2014-02-24 15:04:35 -05:00
Johan Hedberg 8b064a3ad3 Bluetooth: Clean up HCI state when doing power off
To be friendly to user space and to behave well with controllers that
lack a proper internal power off procedure we should try to clean up as
much state as possible before requesting the HCI driver to power off.

This patch updates the power off procedure that's triggered by
mgmt_set_powered to clean any scan modes, stop LE scanning and
advertising and to disconnect any open connections.

The asynchronous cleanup procedure uses the HCI request framework,
however since HCI_Disconnect is only covered until its Command Status
event we need some extra tracking/waiting of disconnections. This is
done by monitoring when hci_conn_count() indicates that there are no
more connections.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-24 11:10:36 -08:00
Johan Hedberg 7c4cfab808 Bluetooth: Don't clear HCI_ADVERTISING when powering off
Once mgmt_set_powered(off) is updated to clear the scan mode we should
not just blindly clear the HCI_ADVERTISING flag in mgmt_advertising()
but first check if there is a pending set_powered operation.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-02-24 11:10:36 -08:00