1
0
Fork 0
Commit Graph

33894 Commits (06b8ab55289345ab191bf4bf0e4acc6d4bdf293d)

Author SHA1 Message Date
Nikolay Aleksandrov 2e404f632f inet: frags: use INET_FRAG_EVICTED to prevent icmp messages
Now that we have INET_FRAG_EVICTED we might as well use it to stop
sending icmp messages in the "frag_expire" functions instead of
stripping INET_FRAG_FIRST_IN from their flags when evicting.
Also fix the comment style in ip6_expire_frag_queue().

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-02 15:31:31 -07:00
Nikolay Aleksandrov f926e23660 inet: frags: fix function declaration alignments in inet_fragment
Fix a couple of functions' declaration alignments.

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-02 15:31:31 -07:00
Nikolay Aleksandrov 06aa8b8a03 inet: frags: rename last_in to flags
The last_in field has been used to store various flags different from
first/last frag in so give it a more descriptive name: flags.

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-02 15:31:31 -07:00
Nikolay Aleksandrov d2373862b3 inet: frags: use INC_STATS_BH in the ipv6 reassembly code
Softirqs are already disabled so no need to do it again, thus let's be
consistent and use the IP6_INC_STATS_BH variant.

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-02 15:31:31 -07:00
Duan Jiong 188b1210f3 ipv4: remove nested rcu_read_lock/unlock
ip_local_deliver_finish() already have a rcu_read_lock/unlock, so
the rcu_read_lock/unlock is unnecessary.

See the stack below:
ip_local_deliver_finish
	|
	|
	->icmp_rcv
		|
		|
		->icmp_socket_deliver

Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-02 15:27:35 -07:00
Alexei Starovoitov 7ae457c1e5 net: filter: split 'struct sk_filter' into socket and bpf parts
clean up names related to socket filtering and bpf in the following way:
- everything that deals with sockets keeps 'sk_*' prefix
- everything that is pure BPF is changed to 'bpf_*' prefix

split 'struct sk_filter' into
struct sk_filter {
	atomic_t        refcnt;
	struct rcu_head rcu;
	struct bpf_prog *prog;
};
and
struct bpf_prog {
        u32                     jited:1,
                                len:31;
        struct sock_fprog_kern  *orig_prog;
        unsigned int            (*bpf_func)(const struct sk_buff *skb,
                                            const struct bpf_insn *filter);
        union {
                struct sock_filter      insns[0];
                struct bpf_insn         insnsi[0];
                struct work_struct      work;
        };
};
so that 'struct bpf_prog' can be used independent of sockets and cleans up
'unattached' bpf use cases

split SK_RUN_FILTER macro into:
    SK_RUN_FILTER to be used with 'struct sk_filter *' and
    BPF_PROG_RUN to be used with 'struct bpf_prog *'

__sk_filter_release(struct sk_filter *) gains
__bpf_prog_release(struct bpf_prog *) helper function

also perform related renames for the functions that work
with 'struct bpf_prog *', since they're on the same lines:

sk_filter_size -> bpf_prog_size
sk_filter_select_runtime -> bpf_prog_select_runtime
sk_filter_free -> bpf_prog_free
sk_unattached_filter_create -> bpf_prog_create
sk_unattached_filter_destroy -> bpf_prog_destroy
sk_store_orig_filter -> bpf_prog_store_orig_filter
sk_release_orig_filter -> bpf_release_orig_filter
__sk_migrate_filter -> bpf_migrate_filter
__sk_prepare_filter -> bpf_prepare_filter

API for attaching classic BPF to a socket stays the same:
sk_attach_filter(prog, struct sock *)/sk_detach_filter(struct sock *)
and SK_RUN_FILTER(struct sk_filter *, ctx) to execute a program
which is used by sockets, tun, af_packet

API for 'unattached' BPF programs becomes:
bpf_prog_create(struct bpf_prog **)/bpf_prog_destroy(struct bpf_prog *)
and BPF_PROG_RUN(struct bpf_prog *, ctx) to execute a program
which is used by isdn, ppp, team, seccomp, ptp, xt_bpf, cls_bpf, test_bpf

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-02 15:03:58 -07:00
Alexei Starovoitov 8fb575ca39 net: filter: rename sk_convert_filter() -> bpf_convert_filter()
to indicate that this function is converting classic BPF into eBPF
and not related to sockets

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-02 15:02:38 -07:00
Alexei Starovoitov 4df95ff488 net: filter: rename sk_chk_filter() -> bpf_check_classic()
trivial rename to indicate that this functions performs classic BPF checking

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-02 15:02:38 -07:00
Alexei Starovoitov 009937e78a net: filter: rename sk_filter_proglen -> bpf_classic_proglen
trivial rename to better match semantics of macro

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-02 15:02:38 -07:00
Alexei Starovoitov 278571baca net: filter: simplify socket charging
attaching bpf program to a socket involves multiple socket memory arithmetic,
since size of 'sk_filter' is changing when classic BPF is converted to eBPF.
Also common path of program creation has to deal with two ways of freeing
the memory.

Simplify the code by delaying socket charging until program is ready and
its size is known

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-02 15:02:37 -07:00
James Morris 103ae675b1 Merge branch 'next' of git://git.infradead.org/users/pcmoore/selinux into next 2014-08-02 22:58:02 +10:00
Thomas Graf 0dc1362562 netfilter: nf_tables: Avoid duplicate call to nft_data_uninit() for same key
nft_del_setelem() currently calls nft_data_uninit() twice on the same
key. Once to release the key which is guaranteed to be NFT_DATA_VALUE
and a second time in the error path to which it falls through.

The second call has been harmless so far though because the type
passed is always NFT_DATA_VALUE which is currently a no-op.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-08-01 18:14:49 +02:00
Paul Moore 4fbe63d1c7 netlabel: shorter names for the NetLabel catmap funcs/structs
Historically the NetLabel LSM secattr catmap functions and data
structures have had very long names which makes a mess of the NetLabel
code and anyone who uses NetLabel.  This patch renames the catmap
functions and structures from "*_secattr_catmap_*" to just "*_catmap_*"
which improves things greatly.

There are no substantial code or logic changes in this patch.

Signed-off-by: Paul Moore <pmoore@redhat.com>
Tested-by: Casey Schaufler <casey@schaufler-ca.com>
2014-08-01 11:17:37 -04:00
Paul Moore d960a6184a netlabel: fix the catmap walking functions
The two NetLabel LSM secattr catmap walk functions didn't handle
certain edge conditions correctly, causing incorrect security labels
to be generated in some cases.  This patch corrects these problems and
converts the functions to use the new _netlbl_secattr_catmap_getnode()
function in order to reduce the amount of repeated code.

Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
Tested-by: Casey Schaufler <casey@schaufler-ca.com>
2014-08-01 11:17:29 -04:00
Paul Moore 4b8feff251 netlabel: fix the horribly broken catmap functions
The NetLabel secattr catmap functions, and the SELinux import/export
glue routines, were broken in many horrible ways and the SELinux glue
code fiddled with the NetLabel catmap structures in ways that we
probably shouldn't allow.  At some point this "worked", but that was
likely due to a bit of dumb luck and sub-par testing (both inflicted
by yours truly).  This patch corrects these problems by basically
gutting the code in favor of something less obtuse and restoring the
NetLabel abstractions in the SELinux catmap glue code.

Everything is working now, and if it decides to break itself in the
future this code will be much easier to debug than the code it
replaces.

One noteworthy side effect of the changes is that it is no longer
necessary to allocate a NetLabel catmap before calling one of the
NetLabel APIs to set a bit in the catmap.  NetLabel will automatically
allocate the catmap nodes when needed, resulting in less allocations
when the lowest bit is greater than 255 and less code in the LSMs.

Cc: stable@vger.kernel.org
Reported-by: Christian Evans <frodox@zoho.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Tested-by: Casey Schaufler <casey@schaufler-ca.com>
2014-08-01 11:17:17 -04:00
Paul Moore 41c3bd2039 netlabel: fix a problem when setting bits below the previously lowest bit
The NetLabel category (catmap) functions have a problem in that they
assume categories will be set in an increasing manner, e.g. the next
category set will always be larger than the last.  Unfortunately, this
is not a valid assumption and could result in problems when attempting
to set categories less than the startbit in the lowest catmap node.
In some cases kernel panics and other nasties can result.

This patch corrects the problem by checking for this and allocating a
new catmap node instance and placing it at the front of the list.

Cc: stable@vger.kernel.org
Reported-by: Christian Evans <frodox@zoho.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Tested-by: Casey Schaufler <casey@schaufler-ca.com>
2014-08-01 11:17:03 -04:00
Duan Jiong 4330487acf net: use inet6_iif instead of IP6CB()->iif
Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-31 22:37:06 -07:00
Vlad Yasevich fcdfe3a7fa net: Correctly set segment mac_len in skb_segment().
When performing segmentation, the mac_len value is copied right
out of the original skb.  However, this value is not always set correctly
(like when the packet is VLAN-tagged) and we'll end up copying a bad
value.

One way to demonstrate this is to configure a VM which tags
packets internally and turn off VLAN acceleration on the forwarding
bridge port.  The packets show up corrupt like this:
16:18:24.985548 52:54:00🆎be:25 > 52:54:00:26:ce:a3, ethertype 802.1Q
(0x8100), length 1518: vlan 100, p 0, ethertype 0x05e0,
        0x0000:  8cdb 1c7c 8cdb 0064 4006 b59d 0a00 6402 ...|...d@.....d.
        0x0010:  0a00 6401 9e0d b441 0a5e 64ec 0330 14fa ..d....A.^d..0..
        0x0020:  29e3 01c9 f871 0000 0101 080a 000a e833)....q.........3
        0x0030:  000f 8c75 6e65 7470 6572 6600 6e65 7470 ...unetperf.netp
        0x0040:  6572 6600 6e65 7470 6572 6600 6e65 7470 erf.netperf.netp
        0x0050:  6572 6600 6e65 7470 6572 6600 6e65 7470 erf.netperf.netp
        0x0060:  6572 6600 6e65 7470 6572 6600 6e65 7470 erf.netperf.netp
        ...

This also leads to awful throughput as GSO packets are dropped and
cause retransmissions.

The solution is to set the mac_len using the values already available
in then new skb.  We've already adjusted all of the header offset, so we
might as well correctly figure out the mac_len using skb_reset_mac_len().
After this change, packets are segmented correctly and performance
is restored.

CC: Eric Dumazet <edumazet@google.com>
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-31 22:28:39 -07:00
Tobias Klauser 74e83b23f2 netlink: Use PAGE_ALIGNED macro
Use PAGE_ALIGNED(...) instead of IS_ALIGNED(..., PAGE_SIZE).

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-31 22:05:28 -07:00
Duan Jiong 7304fe4681 net: fix the counter ICMP_MIB_INERRORS/ICMP6_MIB_INERRORS
When dealing with ICMPv[46] Error Message, function icmp_socket_deliver()
and icmpv6_notify() do some valid checks on packet's length, but then some
protocols check packet's length redaudantly. So remove those duplicated
statements, and increase counter ICMP_MIB_INERRORS/ICMP6_MIB_INERRORS in
function icmp_socket_deliver() and icmpv6_notify() respectively.

In addition, add missed counter in udp6/udplite6 when socket is NULL.

Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-31 22:04:18 -07:00
Jason Gunthorpe 299ee123e1 sctp: Fixup v4mapped behaviour to comply with Sock API
The SCTP socket extensions API document describes the v4mapping option as
follows:

8.1.15.  Set/Clear IPv4 Mapped Addresses (SCTP_I_WANT_MAPPED_V4_ADDR)

   This socket option is a Boolean flag which turns on or off the
   mapping of IPv4 addresses.  If this option is turned on, then IPv4
   addresses will be mapped to V6 representation.  If this option is
   turned off, then no mapping will be done of V4 addresses and a user
   will receive both PF_INET6 and PF_INET type addresses on the socket.
   See [RFC3542] for more details on mapped V6 addresses.

This description isn't really in line with what the code does though.

Introduce addr_to_user (renamed addr_v4map), which should be called
before any sockaddr is passed back to user space. The new function
places the sockaddr into the correct format depending on the
SCTP_I_WANT_MAPPED_V4_ADDR option.

Audit all places that touched v4mapped and either sanely construct
a v4 or v6 address then call addr_to_user, or drop the
unnecessary v4mapped check entirely.

Audit all places that call addr_to_user and verify they are on a sycall
return path.

Add a custom getname that formats the address properly.

Several bugs are addressed:
 - SCTP_I_WANT_MAPPED_V4_ADDR=0 often returned garbage for
   addresses to user space
 - The addr_len returned from recvmsg was not correct when
   returning AF_INET on a v6 socket
 - flowlabel and scope_id were not zerod when promoting
   a v4 to v6
 - Some syscalls like bind and connect behaved differently
   depending on v4mapped

Tested bind, getpeername, getsockname, connect, and recvmsg for proper
behaviour in v4mapped = 1 and 0 cases.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Tested-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-31 21:49:06 -07:00
David S. Miller a173e550c2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains netfilter updates for net-next, they are:

1) Add the reject expression for the nf_tables bridge family, this
   allows us to send explicit reject (TCP RST / ICMP dest unrech) to
   the packets matching a rule.

2) Simplify and consolidate the nf_tables set dumping logic. This uses
   netlink control->data to filter out depending on the request.

3) Perform garbage collection in xt_hashlimit using a workqueue instead
   of a timer, which is problematic when many entries are in place in
   the tables, from Eric Dumazet.

4) Remove leftover code from the removed ulog target support, from
   Paul Bolle.

5) Dump unmodified flags in the netfilter packet accounting when resetting
   counters, so userspace knows that a counter was in overquota situation,
   from Alexey Perevalov.

6) Fix wrong usage of the bitwise functions in nfnetlink_acct, also from
   Alexey.

7) Fix a crash when adding new set element with an empty NFTA_SET_ELEM_LIST
   attribute.

This patchset also includes a couple of cleanups for xt_LED from
Duan Jiong and for nf_conntrack_ipv4 (using coccinelle) from
Himangi Saraogi.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-31 14:09:14 -07:00
Banerjee, Debabrata 388070faa1 tcp: don't require root to read tcp_metrics
commit d23ff7016 (tcp: add generic netlink support for tcp_metrics) introduced
netlink support for the new tcp_metrics, however it restricted getting of
tcp_metrics to root user only. This is a change from how these values could
have been fetched when in the old route cache. Unless there's a legitimate
reason to restrict the reading of these values it would be better if normal
users could fetch them.

Cc: Julian Anastasov <ja@ssi.bg>
Cc: linux-kernel@vger.kernel.org

Signed-off-by: Debabrata Banerjee <dbanerje@akamai.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-31 14:07:37 -07:00
Chuck Lever 8079fb785e xprtrdma: Handle additional connection events
Commit 38ca83a5 added RDMA_CM_EVENT_TIMEWAIT_EXIT. But that status
is relevant only for consumers that re-use their QPs on new
connections. xprtrdma creates a fresh QP on reconnection, so that
event should be explicitly ignored.

Squelch the alarming "unexpected CM event" message.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Tested-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2014-07-31 16:22:59 -04:00
Chuck Lever a779ca5fa7 xprtrdma: Remove RPCRDMA_PERSISTENT_REGISTRATION macro
Clean up.

RPCRDMA_PERSISTENT_REGISTRATION was a compile-time switch between
RPCRDMA_REGISTER mode and RPCRDMA_ALLPHYSICAL mode.  Since
RPCRDMA_REGISTER has been removed, there's no need for the extra
conditional compilation.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Tested-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2014-07-31 16:22:59 -04:00
Chuck Lever 282191cb72 xprtrdma: Make rpcrdma_ep_disconnect() return void
Clean up: The return code is used only for dprintk's that are
already redundant.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Tested-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2014-07-31 16:22:58 -04:00
Chuck Lever bb96193d91 xprtrdma: Schedule reply tasklet once per upcall
Minor optimization: grab rpcrdma_tk_lock_g and disable hard IRQs
just once after clearing the receive completion queue.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Tested-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2014-07-31 16:22:58 -04:00
Chuck Lever 2e84522c2e xprtrdma: Allocate each struct rpcrdma_mw separately
Currently rpcrdma_buffer_create() allocates struct rpcrdma_mw's as
a single contiguous area of memory. It amounts to quite a bit of
memory, and there's no requirement for these to be carved from a
single piece of contiguous memory.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Tested-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2014-07-31 16:22:57 -04:00
Chuck Lever f590e878c5 xprtrdma: Rename frmr_wr
Clean up: Name frmr_wr after the opcode of the Work Request,
consistent with the send and local invalidation paths.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Tested-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2014-07-31 16:22:57 -04:00
Chuck Lever dab7e3b8da xprtrdma: Disable completions for LOCAL_INV Work Requests
Instead of relying on a completion to change the state of an FRMR
to FRMR_IS_INVALID, set it in advance. If an error occurs, a completion
will fire anyway and mark the FRMR FRMR_IS_STALE.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Tested-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2014-07-31 16:22:57 -04:00
Chuck Lever 050557220e xprtrdma: Disable completions for FAST_REG_MR Work Requests
Instead of relying on a completion to change the state of an FRMR
to FRMR_IS_VALID, set it in advance. If an error occurs, a completion
will fire anyway and mark the FRMR FRMR_IS_STALE.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Tested-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2014-07-31 16:22:56 -04:00
Chuck Lever 440ddad51b xprtrdma: Don't post a LOCAL_INV in rpcrdma_register_frmr_external()
Any FRMR arriving in rpcrdma_register_frmr_external() is now
guaranteed to be either invalid, or to be targeted by a queued
LOCAL_INV that will invalidate it before the adapter processes
the FAST_REG_MR being built here.

The problem with current arrangement of chaining a LOCAL_INV to the
FAST_REG_MR is that if the transport is not connected, the LOCAL_INV
is flushed and the FAST_REG_MR is flushed. This leaves the FRMR
valid with the old rkey. But rpcrdma_register_frmr_external() has
already bumped the in-memory rkey.

Next time through rpcrdma_register_frmr_external(), a LOCAL_INV and
FAST_REG_MR is attempted again because the FRMR is still valid. But
the rkey no longer matches the hardware's rkey, and a memory
management operation error occurs.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Tested-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2014-07-31 16:22:56 -04:00
Chuck Lever ddb6bebcc6 xprtrdma: Reset FRMRs after a flushed LOCAL_INV Work Request
When a LOCAL_INV Work Request is flushed, it leaves an FRMR in the
VALID state. This FRMR can be returned by rpcrdma_buffer_get(), and
must be knocked down in rpcrdma_register_frmr_external() before it
can be re-used.

Instead, capture these in rpcrdma_buffer_get(), and reset them.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Tested-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2014-07-31 16:22:55 -04:00
Chuck Lever 9f9d802a28 xprtrdma: Reset FRMRs when FAST_REG_MR is flushed by a disconnect
FAST_REG_MR Work Requests update a Memory Region's rkey. Rkey's are
used to block unwanted access to the memory controlled by an MR. The
rkey is passed to the receiver (the NFS server, in our case), and is
also used by xprtrdma to invalidate the MR when the RPC is complete.

When a FAST_REG_MR Work Request is flushed after a transport
disconnect, xprtrdma cannot tell whether the WR actually hit the
adapter or not. So it is indeterminant at that point whether the
existing rkey is still valid.

After the transport connection is re-established, the next
FAST_REG_MR or LOCAL_INV Work Request against that MR can sometimes
fail because the rkey value does not match what xprtrdma expects.

The only reliable way to recover in this case is to deregister and
register the MR before it is used again. These operations can be
done only in a process context, so handle it in the transport
connect worker.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Tested-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2014-07-31 16:22:55 -04:00
Chuck Lever c2922c0235 xprtrdma: Properly handle exhaustion of the rb_mws list
If the rb_mws list is exhausted, clean up and return NULL so that
call_allocate() will delay and try again.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Tested-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2014-07-31 16:22:55 -04:00
Chuck Lever 3111d72c7c xprtrdma: Chain together all MWs in same buffer pool
During connection loss recovery, need to visit every MW in a
buffer pool. Any MW that is in use by an RPC will not be on the
rb_mws list.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Tested-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2014-07-31 16:22:54 -04:00
Chuck Lever c93e986a29 xprtrdma: Back off rkey when FAST_REG_MR fails
If posting a FAST_REG_MR Work Reqeust fails, revert the rkey update
to avoid subsequent IB_WC_MW_BIND_ERR completions.

Suggested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2014-07-31 16:22:54 -04:00
Chuck Lever 0dbb4108a6 xprtrdma: Unclutter struct rpcrdma_mr_seg
Clean ups:
 - make it obvious that the rl_mw field is a pointer -- allocated
   separately, not as part of struct rpcrdma_mr_seg
 - promote "struct {} frmr;" to a named type
 - promote the state enum to a named type
 - name the MW state field the same way other fields in
   rpcrdma_mw are named

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Tested-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2014-07-31 16:22:54 -04:00
Chuck Lever 539431a437 xprtrdma: Don't invalidate FRMRs if registration fails
If FRMR registration fails, it's likely to transition the QP to the
error state. Or, registration may have failed because the QP is
_already_ in ERROR.

Thus calling rpcrdma_deregister_external() in
rpcrdma_create_chunks() is useless in FRMR mode: the LOCAL_INVs just
get flushed.

It is safe to leave existing registrations: when FRMR registration
is tried again, rpcrdma_register_frmr_external() checks if each FRMR
is already/still VALID, and knocks it down first if it is.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Tested-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2014-07-31 16:22:53 -04:00
Chuck Lever a7bc211ac9 xprtrdma: On disconnect, don't ignore pending CQEs
xprtrdma is currently throwing away queued completions during
a reconnect. RPC replies posted just before connection loss, or
successful completions that change the state of an FRMR, can be
missed.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Tested-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2014-07-31 16:22:53 -04:00
Chuck Lever 6ab59945f2 xprtrdma: Update rkeys after transport reconnect
Various reports of:

  rpcrdma_qp_async_error_upcall: QP error 3 on device mlx4_0
		ep ffff8800bfd3e848

Ensure that rkeys in already-marshalled RPC/RDMA headers are
refreshed after the QP has been replaced by a reconnect.

BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=249
Suggested-by: Selvin Xavier <Selvin.Xavier@Emulex.Com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Tested-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2014-07-31 16:22:53 -04:00
Chuck Lever 43e9598817 xprtrdma: Limit data payload size for ALLPHYSICAL
When the client uses physical memory registration, each page in the
payload gets its own array entry in the RPC/RDMA header's chunk list.

Therefore, don't advertise a maximum payload size that would require
more array entries than can fit in the RPC buffer where RPC/RDMA
headers are built.

BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=248
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Tested-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2014-07-31 16:22:52 -04:00
Chuck Lever 73806c8832 xprtrdma: Protect ia->ri_id when unmapping/invalidating MRs
Ensure ia->ri_id remains valid while invoking dma_unmap_page() or
posting LOCAL_INV during a transport reconnect. Otherwise,
ia->ri_id->device or ia->ri_id->qp is NULL, which triggers a panic.

BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=259
Fixes: ec62f40 'xprtrdma: Ensure ia->ri_id->qp is not NULL when reconnecting'
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Tested-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2014-07-31 16:22:52 -04:00
Chuck Lever 5fc83f470d xprtrdma: Fix panic in rpcrdma_register_frmr_external()
seg1->mr_nsegs is not yet initialized when it is used to unmap
segments during an error exit. Use the same unmapping logic for
all error exits.

"if (frmr_wr.wr.fast_reg.length < len) {" used to be a BUG_ON check.
The broken code will never be executed under normal operation.

Fixes: c977dea (xprtrdma: Remove BUG_ON() call sites)
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Tested-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2014-07-31 16:22:52 -04:00
Toshiaki Makita 47fab41ab5 bridge: Don't include NDA_VLAN for FDB entries with vid 0
An FDB entry with vlan_id 0 doesn't mean it is used in vlan 0, but used when
vlan_filtering is disabled.

There is inconsistency around NDA_VLAN whose payload is 0 - even if we add
an entry by RTM_NEWNEIGH without any NDA_VLAN, and even though adding an
entry with NDA_VLAN 0 is prohibited, we get an entry with NDA_VLAN 0 by
RTM_GETNEIGH.

Dumping an FDB entry with vlan_id 0 shouldn't include NDA_VLAN.

Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-31 12:18:44 -07:00
Pablo Neira Ayuso 7d5570ca89 netfilter: nf_tables: check for unset NFTA_SET_ELEM_LIST_ELEMENTS attribute
Otherwise, the kernel oopses in nla_for_each_nested when iterating over
the unset attribute NFTA_SET_ELEM_LIST_ELEMENTS in the
nf_tables_{new,del}setelem() path.

netlink: 65524 bytes leftover after parsing attributes in process `nft'.
[...]
Oops: 0000 [#1] SMP
[...]
CPU: 2 PID: 6287 Comm: nft Not tainted 3.16.0-rc2+ #169
RIP: 0010:[<ffffffffa0526e61>]  [<ffffffffa0526e61>] nf_tables_newsetelem+0x82/0xec [nf_tables]
[...]
Call Trace:
 [<ffffffffa05178c4>] nfnetlink_rcv+0x2e7/0x3d7 [nfnetlink]
 [<ffffffffa0517939>] ? nfnetlink_rcv+0x35c/0x3d7 [nfnetlink]
 [<ffffffff8137d300>] netlink_unicast+0xf8/0x17a
 [<ffffffff8137d6a5>] netlink_sendmsg+0x323/0x351
[...]

Fix this by returning -EINVAL if this attribute is not set, which
doesn't make sense at all since those commands are there to add and to
delete elements from the set.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-07-31 21:11:43 +02:00
Alexey Perevalov b6d0468804 netfilter: nfnetlink_acct: avoid using NFACCT_F_OVERQUOTA with bit helper functions
Bit helper functions were used for manipulation with NFACCT_F_OVERQUOTA,
but they are accepting pit position, but not a bit mask. As a result
not a third bit for NFACCT_F_OVERQUOTA was set, but forth. Such
behaviour was dangarous and could lead to unexpected overquota report
result.

Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-07-31 19:55:47 +02:00
David S. Miller ccda4a77f3 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next
Steffen Klassert says:

====================
pull request (net-next): ipsec-next 2014-07-30

This is the last pull request for ipsec-next before I'll be
off for two weeks starting on friday. David, can you please
take urgent ipsec patches directly into net/net-next during
this time?

1) Error handling simplifications for vti and vti6.
   From Mathias Krause.

2) Remove a duplicate semicolon after a return statement.
   From Christoph Paasch.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-30 20:05:54 -07:00
Pablo Neira 34c5bd66e5 net: filter: don't release unattached filter through call_rcu()
sk_unattached_filter_destroy() does not always need to release the
filter object via rcu. Since this filter is never attached to the
socket, the caller should be responsible for releasing the filter
in a safe way, which may not necessarily imply rcu.

This is a short summary of clients of this function:

1) xt_bpf.c and cls_bpf.c use the bpf matchers from rules, these rules
   are removed from the packet path before the filter is released. Thus,
   the framework makes sure the filter is safely removed.

2) In the ppp driver, the ppp_lock ensures serialization between the
   xmit and filter attachment/detachment path. This doesn't use rcu
   so deferred release via rcu makes no sense.

3) In the isdn/ppp driver, it is called from isdn_ppp_release()
   the isdn_ppp_ioctl(). This driver uses mutex and spinlocks, no rcu.
   Thus, deferred rcu makes no sense to me either, the deferred releases
   may be just masking the effects of wrong locking strategy, which
   should be fixed in the driver itself.

4) In the team driver, this is the only place where the rcu
   synchronization with unattached filter is used. Therefore, this
   patch introduces synchronize_rcu() which is called from the
   genetlink path to make sure the filter doesn't go away while packets
   are still walking over it. I think we can revisit this once struct
   bpf_prog (that only wraps specific bpf code bits) is in place, then
   add some specific struct rcu_head in the scope of the team driver if
   Jiri thinks this is needed.

Deferred rcu release for unattached filters was originally introduced
in 302d663 ("filter: Allow to create sk-unattached filters").

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-30 19:56:27 -07:00
Thomas Graf 80019d310f net: Remove unlikely() for WARN_ON() conditions
No need for the unlikely(), WARN_ON() and BUG_ON() internally use
unlikely() on the condition.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-30 17:41:47 -07:00
Christoph Paasch 1f74e613de tcp: Fix integer-overflow in TCP vegas
In vegas we do a multiplication of the cwnd and the rtt. This
may overflow and thus their result is stored in a u64. However, we first
need to cast the cwnd so that actually 64-bit arithmetic is done.

Then, we need to do do_div to allow this to be used on 32-bit arches.

Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Laight <David.Laight@ACULAB.COM>
Cc: Doug Leith <doug.leith@nuim.ie>
Fixes: 8d3a564da3 (tcp: tcp_vegas cong avoid fix)
Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-30 17:31:06 -07:00
Christoph Paasch 45a07695bc tcp: Fix integer-overflows in TCP veno
In veno we do a multiplication of the cwnd and the rtt. This
may overflow and thus their result is stored in a u64. However, we first
need to cast the cwnd so that actually 64-bit arithmetic is done.

A first attempt at fixing 76f1017757 ([TCP]: TCP Veno congestion
control) was made by 159131149c (tcp: Overflow bug in Vegas), but it
failed to add the required cast in tcp_veno_cong_avoid().

Fixes: 76f1017757 ([TCP]: TCP Veno congestion control)
Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-30 17:31:06 -07:00
Anish Bhatt 16eecd9be4 dcbnl : Fix misleading dcb_app->priority explanation
Current explanation of dcb_app->priority is wrong. It says priority is
expected to be a 3-bit unsigned integer which is only true when working with
DCBx-IEEE. Use of dcb_app->priority by DCBx-CEE expects it to be 802.1p user
priority bitmap. Updated accordingly

This affects the cxgb4 driver, but I will post those changes as part of a
larger changeset shortly.

Fixes: 3e29027af4 ("dcbnl: add support for ieee8021Qaz attributes")
Signed-off-by: Anish Bhatt <anish@chelsio.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-30 17:21:05 -07:00
Dmitry Popov 95cb574598 ip_tunnel(ipv4): fix tunnels with "local any remote $remote_ip"
Ipv4 tunnels created with "local any remote $ip" didn't work properly since
7d442fab0 (ipv4: Cache dst in tunnels). 99% of packets sent via those tunnels
had src addr = 0.0.0.0. That was because only dst_entry was cached, although
fl4.saddr has to be cached too. Every time ip_tunnel_xmit used cached dst_entry
(tunnel_rtable_get returned non-NULL), fl4.saddr was initialized with
tnl_params->saddr (= 0 in our case), and wasn't changed until iptunnel_xmit().

This patch adds saddr to ip_tunnel->dst_cache, fixing this issue.

Reported-by: Sergey Popov <pinkbyte@gentoo.org>
Signed-off-by: Dmitry Popov <ixaphire@qrator.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-30 15:18:58 -07:00
David S. Miller f139c74a8d Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-30 13:25:49 -07:00
Johan Hedberg 82c295b1b0 Bluetooth: Always use non-bonding requirement when not bondable
When we're not bondable we should never send any other SSP
authentication requirement besides one of the non-bonding ones.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-30 19:28:41 +02:00
Johan Hedberg b2939475eb Bluetooth: Rename pairable mgmt setting to bondable
This setting maps to the HCI_BONDABLE flag which tracks whether we're
bondable or not. Therefore, rename the mgmt setting and respective
command accordingly.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-30 19:28:41 +02:00
Johan Hedberg b6ae8457ac Bluetooth: Rename HCI_PAIRABLE to HCI_BONDABLE
The HCI_PAIRABLE flag isn't actually controlling whether we're pairable
but whether we're bondable. Therefore, rename it accordingly.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-30 19:28:41 +02:00
Marcel Holtmann bdb9434664 Bluetooth: Fix sparse warning from HID new leds handling
The new leds bit handling produces this spares warning.

  CHECK   net/bluetooth/hidp/core.c
net/bluetooth/hidp/core.c:156:60: warning: dubious: x | !y

Just fix it by doing an explicit x << 0 shift operation.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-07-30 19:28:41 +02:00
Johan Hedberg 6f78fd4bb9 Bluetooth: Fix check for connected state when pairing
Both BT_CONNECTED and BT_CONFIG state mean that we have a baseband link
available. We should therefore check for either of these when pairing
and deciding whether to call hci_conn_security() directly.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-30 19:28:41 +02:00
Marcel Holtmann 3fa71fe0b9 6lowpan: iphc: Fix parenthesis alignments which off-by-one
CHECK: Alignment should match open parenthesis
+	if (((hdr->flow_lbl[0] & 0x0F) == 0) &&
+	     (hdr->flow_lbl[1] == 0) && (hdr->flow_lbl[2] == 0)) {

CHECK: Alignment should match open parenthesis
+		if ((hdr->priority == 0) &&
+		   ((hdr->flow_lbl[0] & 0xF0) == 0)) {

CHECK: Alignment should match open parenthesis
+		if ((hdr->priority == 0) &&
+		   ((hdr->flow_lbl[0] & 0xF0) == 0)) {

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-07-30 19:28:40 +02:00
Marcel Holtmann 9ab9bb009c 6lowpan: iphc: Fix missing braces for if statement
CHECK: braces {} should be used on all arms of this statement
+	if ((iphc0 & 0x03) != LOWPAN_IPHC_TTL_I)
[...]
+	else {
[...]

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-07-30 19:28:40 +02:00
Marcel Holtmann 26fff593cd 6lowpan: iphc: Fix missing blank line after variable declarations
WARNING: Missing a blank line after declarations
+		struct sk_buff *new;
+		if (uncompress_udp_header(skb, &uh))

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-07-30 19:28:40 +02:00
Marcel Holtmann 7fc4cfda75 6lowpan: iphc: Fix issues with alignment matching open parenthesis
This patch fixes all the issues with alignment matching of open
parenthesis found by checkpatch.pl and makes them follow the
network coding style now.

CHECK: Alignment should match open parenthesis
+static int uncompress_addr(struct sk_buff *skb,
+				struct in6_addr *ipaddr, const u8 address_mode,

CHECK: Alignment should match open parenthesis
+static int uncompress_context_based_src_addr(struct sk_buff *skb,
+						struct in6_addr *ipaddr,

CHECK: Alignment should match open parenthesis
+static int skb_deliver(struct sk_buff *skb, struct ipv6hdr *hdr,
+		struct net_device *dev, skb_delivery_cb deliver_skb)

CHECK: Alignment should match open parenthesis
+	new = skb_copy_expand(skb, sizeof(struct ipv6hdr), skb_tailroom(skb),
+								GFP_ATOMIC);

CHECK: Alignment should match open parenthesis
+	raw_dump_table(__func__, "raw skb data dump before receiving",
+			new->data, new->len);

CHECK: Alignment should match open parenthesis
+lowpan_uncompress_multicast_daddr(struct sk_buff *skb,
+		struct in6_addr *ipaddr,

CHECK: Alignment should match open parenthesis
+	raw_dump_inline(NULL, "Reconstructed ipv6 multicast addr is",
+				ipaddr->s6_addr, 16);

CHECK: Alignment should match open parenthesis
+int lowpan_process_data(struct sk_buff *skb, struct net_device *dev,
+		const u8 *saddr, const u8 saddr_type, const u8 saddr_len,

CHECK: Alignment should match open parenthesis
+	raw_dump_table(__func__, "raw skb data dump uncompressed",
+				skb->data, skb->len);

CHECK: Alignment should match open parenthesis
+		err = uncompress_addr(skb, &hdr.saddr, tmp, saddr,
+					saddr_type, saddr_len);

CHECK: Alignment should match open parenthesis
+		err = uncompress_addr(skb, &hdr.daddr, tmp, daddr,
+					daddr_type, daddr_len);

CHECK: Alignment should match open parenthesis
+		pr_debug("dest: stateless compression mode %d dest %pI6c\n",
+			tmp, &hdr.daddr);

CHECK: Alignment should match open parenthesis
+		raw_dump_table(__func__, "raw UDP header dump",
+				      (u8 *)&uh, sizeof(uh));

CHECK: Alignment should match open parenthesis
+	raw_dump_table(__func__, "raw header dump", (u8 *)&hdr,
+							sizeof(hdr));

CHECK: Alignment should match open parenthesis
+int lowpan_header_compress(struct sk_buff *skb, struct net_device *dev,
+			unsigned short type, const void *_daddr,

CHECK: Alignment should match open parenthesis
+	raw_dump_table(__func__, "raw skb network header dump",
+		skb_network_header(skb), sizeof(struct ipv6hdr));

CHECK: Alignment should match open parenthesis
+	raw_dump_table(__func__,
+			"sending raw skb network uncompressed packet",

CHECK: Alignment should match open parenthesis
+	if (((hdr->flow_lbl[0] & 0x0F) == 0) &&
+	     (hdr->flow_lbl[1] == 0) && (hdr->flow_lbl[2] == 0)) {

WARNING: quoted string split across lines
+			pr_debug("dest address unicast link-local %pI6c "
+				"iphc1 0x%02x\n", &hdr->daddr, iphc1);

CHECK: Alignment should match open parenthesis
+	raw_dump_table(__func__, "raw skb data dump compressed",
+				skb->data, skb->len);

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-07-30 19:28:40 +02:00
Marcel Holtmann 89f534905a 6lowpan: iphc: Fix block comments to match networking style
This patch fixes all the block comment issues found by checkpatch.pl and
makes them match the network style now.

WARNING: networking block comments don't use an empty /* line, use /* Comment...
+/*
+ * Based on patches from Jon Smirl <jonsmirl@gmail.com>

WARNING: networking block comments don't use an empty /* line, use /* Comment...
+/*
+ * Uncompress address function for source and

WARNING: networking block comments don't use an empty /* line, use /* Comment...
+/*
+ * Uncompress address function for source context

WARNING: networking block comments don't use an empty /* line, use /* Comment...
+		/*
+		 * UDP lenght needs to be infered from the lower layers

WARNING: networking block comments don't use an empty /* line, use /* Comment...
+	/*
+	 * Traffic Class and FLow Label carried in-line

WARNING: networking block comments don't use an empty /* line, use /* Comment...
+	/*
+	 * Traffic class carried in-line

WARNING: networking block comments don't use an empty /* line, use /* Comment...
+	/*
+	 * Flow Label carried in-line

WARNING: networking block comments don't use an empty /* line, use /* Comment...
+		/*
+		 * replace the compressed UDP head by the uncompressed UDP

WARNING: networking block comments don't use an empty /* line, use /* Comment...
+	/*
+	 * As we copy some bit-length fields, in the IPHC encoding bytes,

WARNING: networking block comments don't use an empty /* line, use /* Comment...
+	/*
+	 * Traffic class, flow label

WARNING: networking block comments don't use an empty /* line, use /* Comment...
+	/*
+	 * Hop limit

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-07-30 19:28:40 +02:00
Alexander Aring b2e3a479a6 6lowpan: iphc: remove check on null
This memory is placed on stack and can't be null so remove the check on
null.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-30 19:28:39 +02:00
Alexander Aring 556a5bfc03 6lowpan: iphc: use ipv6 api to check address scope
This patch removes the own implementation to check of link-layer,
broadcast and any address type and use the IPv6 api for that.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-30 19:28:39 +02:00
Alexander Aring 85c71240a3 6lowpan: iphc: cleanup use of lowpan_push_hc_data
This patch uses the lowpan_push_hc_data functions in several places
where we can use it. The lowpan_push_hc_data was introduced in some
previous patches.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-30 19:28:39 +02:00
Alexander Aring 4ebc960f94 6lowpan: iphc: cleanup use of lowpan_fetch_skb
We introduced the lowpan_fetch_skb function in some previous patches for
6lowpan to have a generic fetch function. This patch drops the old
function and use the generic lowpan_fetch_skb one.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-30 19:28:39 +02:00
Alexander Aring 8ec1d9be32 6lowpan: iphc: use sizeof in udp uncompression
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-30 19:28:39 +02:00
Alexander Aring 84ca5e036f 6lowpan: iphc: rename hc06_ptr pointer to hc_ptr
The hc06_ptr pointer variable stands for header compression draft-06. We
are mostly rfc complaint. This patch rename the variable to normal hc_ptr.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-30 19:28:39 +02:00
Johan Hedberg 616d55be4c Bluetooth: Fix SMP context tracking leading to a kernel crash
The HCI_CONN_LE_SMP_PEND flag is supposed to indicate whether we have an
SMP context or not. If the context creation fails, or some other error
is indicated between setting the flag and creating the context the flag
must be cleared first.

This patch ensures that smp_chan_create() clears the flag in case of
allocation failure as well as reorders code in smp_cmd_security_req()
that could lead to returning an error between setting the flag and
creating the context.

Without the patch the following kind of kernel crash could be observed
(this one because of unacceptable authentication requirements in a
Security Request):

[  +0.000855] kernel BUG at net/bluetooth/smp.c:606!
[  +0.000000] invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
[  +0.000000] CPU: 0 PID: 58 Comm: kworker/u5:2 Tainted: G        W     3.16.0-rc1+ #785
[  +0.008391] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[  +0.000000] Workqueue: hci0 hci_rx_work
[  +0.000000] task: f4dc8f90 ti: f4ef0000 task.ti: f4ef0000
[  +0.000000] EIP: 0060:[<c13432b6>] EFLAGS: 00010246 CPU: 0
[  +0.000000] EIP is at smp_chan_destroy+0x1e/0x145
[  +0.000709] EAX: f46db870 EBX: 00000000 ECX: 00000000 EDX: 00000005
[  +0.000000] ESI: f46db870 EDI: f46db870 EBP: f4ef1dc0 ESP: f4ef1db0
[  +0.000000]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[  +0.000000] CR0: 8005003b CR2: b666b0b0 CR3: 00022000 CR4: 00000690
[  +0.000000] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[  +0.000000] DR6: fffe0ff0 DR7: 00000400
[  +0.000000] Stack:
[  +0.000000]  00000005 f17b7840 f46db870 f4ef1dd4 f4ef1de4 c1343441 c134342e 00000000
[  +0.000000]  c1343441 00000005 00000002 00000000 f17b7840 f4ef1e38 c134452a 00002aae
[  +0.000000]  01ef1e00 00002aae f46bd980 f46db870 00000039 ffffffff 00000007 f4ef1e34
[  +0.000000] Call Trace:
[  +0.000000]  [<c1343441>] smp_failure+0x64/0x6c
[  +0.000000]  [<c134342e>] ? smp_failure+0x51/0x6c
[  +0.000000]  [<c1343441>] ? smp_failure+0x64/0x6c
[  +0.000000]  [<c134452a>] smp_sig_channel+0xad6/0xafc
[  +0.000000]  [<c1053b61>] ? vprintk_emit+0x343/0x366
[  +0.000000]  [<c133f34e>] l2cap_recv_frame+0x1337/0x1ac4
[  +0.000000]  [<c133f34e>] ? l2cap_recv_frame+0x1337/0x1ac4
[  +0.000000]  [<c1172307>] ? __dynamic_pr_debug+0x3e/0x40
[  +0.000000]  [<c11702a1>] ? debug_smp_processor_id+0x12/0x14
[  +0.000000]  [<c1340bc9>] l2cap_recv_acldata+0xe8/0x239
[  +0.000000]  [<c1340bc9>] ? l2cap_recv_acldata+0xe8/0x239
[  +0.000000]  [<c1169931>] ? __const_udelay+0x1a/0x1c
[  +0.000000]  [<c131f120>] hci_rx_work+0x1a1/0x286
[  +0.000000]  [<c137244e>] ? mutex_unlock+0x8/0xa
[  +0.000000]  [<c131f120>] ? hci_rx_work+0x1a1/0x286
[  +0.000000]  [<c1038fe5>] process_one_work+0x128/0x1df
[  +0.000000]  [<c1038fe5>] ? process_one_work+0x128/0x1df
[  +0.000000]  [<c10392df>] worker_thread+0x222/0x2de
[  +0.000000]  [<c10390bd>] ? process_scheduled_works+0x21/0x21
[  +0.000000]  [<c103d34c>] kthread+0x82/0x87
[  +0.000000]  [<c1040000>] ? create_new_namespaces+0x90/0x105
[  +0.000000]  [<c13738e1>] ret_from_kernel_thread+0x21/0x30
[  +0.000000]  [<c103d2ca>] ? __kthread_parkme+0x50/0x50
[  +0.000000] Code: 65 f4 89 f0 5b 5e 5f 5d 8d 67 f8 5f c3 57 8d 7c 24 08 83 e4 f8 ff 77 fc 55 89 e5 57 89 c7 56 53 52 8b 98 e0 00 00 00 85 db 75 02 <0f> 0b 8b b3 80 00 00 00 8b 00 c1 ee 03 83 e6 01 89 f2 e8 ef 09
[  +0.000000] EIP: [<c13432b6>] smp_chan_destroy+0x1e/0x145 SS:ESP 0068:f4ef1db0

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-30 19:28:38 +02:00
Alexey Perevalov d24675cb1f netfilter: nfnetlink_acct: dump unmodified nfacct flags
NFNL_MSG_ACCT_GET_CTRZERO modifies dumped flags, in this case
client see unmodified (uncleared) counter value and cleared
overquota state - end user doesn't know anything about overquota state,
unless end user subscribed on overquota report.

Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-07-30 18:16:56 +02:00
Eric W. Biederman 728dba3a39 namespaces: Use task_lock and not rcu to protect nsproxy
The synchronous syncrhonize_rcu in switch_task_namespaces makes setns
a sufficiently expensive system call that people have complained.

Upon inspect nsproxy no longer needs rcu protection for remote reads.
remote reads are rare.  So optimize for same process reads and write
by switching using rask_lock instead.

This yields a simpler to understand lock, and a faster setns system call.

In particular this fixes a performance regression observed
by Rafael David Tinoco <rafael.tinoco@canonical.com>.

This is effectively a revert of Pavel Emelyanov's commit
cf7b708c8d Make access to task's nsproxy lighter
from 2007.  The race this originialy fixed no longer exists as
do_notify_parent uses task_active_pid_ns(parent) instead of
parent->nsproxy.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2014-07-29 18:08:50 -07:00
Karoly Kemeny c54a5e0247 ipv4: clean up cast warning in do_ip_getsockopt
Sparse warns because of implicit pointer cast.

v2: subject line correction, space between "void" and "*"

Signed-off-by: Karoly Kemeny <karoly.kemeny@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-29 16:31:16 -07:00
Wei Yongjun ad025a56a5 tipc: remove duplicated include from socket.c
Remove duplicated include.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-29 15:51:14 -07:00
Himangi Saraogi 27446442a8 net/udp_offload: Use IS_ERR_OR_NULL
This patch introduces the use of the macro IS_ERR_OR_NULL in place of
tests for NULL and IS_ERR.

The following Coccinelle semantic patch was used for making the change:

@@
expression e;
@@

- e == NULL || IS_ERR(e)
+ IS_ERR_OR_NULL(e)
 || ...

Signed-off-by: Himangi Saraogi <himangi774@gmail.com>
Acked-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-29 15:31:56 -07:00
Himangi Saraogi d0e992aa02 openvswitch: Use IS_ERR_OR_NULL
This patch introduces the use of the macro IS_ERR_OR_NULL in place of
tests for NULL and IS_ERR.

The following Coccinelle semantic patch was used for making the change:

@@
expression e;
@@

- e == NULL || IS_ERR(e)
+ IS_ERR_OR_NULL(e)
 || ...

Signed-off-by: Himangi Saraogi <himangi774@gmail.com>
Acked-by: Julia Lawall <julia.lawall@lip6.fr>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-29 15:31:56 -07:00
Himangi Saraogi 5a8dbf03dd net/ipv4: Use IS_ERR_OR_NULL
This patch introduces the use of the macro IS_ERR_OR_NULL in place of
tests for NULL and IS_ERR.

The following Coccinelle semantic patch was used for making the change:

@@
expression e;
@@

- e == NULL || IS_ERR(e)
+ IS_ERR_OR_NULL(e)
 || ...

Signed-off-by: Himangi Saraogi <himangi774@gmail.com>
Acked-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-29 15:31:56 -07:00
Trond Myklebust 518776800c SUNRPC: Allow svc_reserve() to notify TCP socket that space has been freed
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-07-29 16:10:20 -04:00
Trond Myklebust c7fb3f0631 SUNRPC: svc_tcp_write_space: don't clear SOCK_NOSPACE prematurely
If requests are queued in the socket inbuffer waiting for an
svc_tcp_has_wspace() requirement to be satisfied, then we do not want
to clear the SOCK_NOSPACE flag until we've satisfied that requirement.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-07-29 16:10:19 -04:00
Trond Myklebust 0971374e28 SUNRPC: Reduce contention in svc_xprt_enqueue()
Ensure that all calls to svc_xprt_enqueue() except svc_xprt_received()
check the value of XPT_BUSY, before attempting to grab spinlocks etc.
This is to avoid situations such as the following "perf" trace,
which shows heavy contention on the pool spinlock:

    54.15%            nfsd  [kernel.kallsyms]        [k] _raw_spin_lock_bh
                      |
                      --- _raw_spin_lock_bh
                         |
                         |--71.43%-- svc_xprt_enqueue
                         |          |
                         |          |--50.31%-- svc_reserve
                         |          |
                         |          |--31.35%-- svc_xprt_received
                         |          |
                         |          |--18.34%-- svc_tcp_data_ready
...

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-07-29 16:10:15 -04:00
Andrey Ryabinin 40eea803c6 net: sendmsg: fix NULL pointer dereference
Sasha's report:
	> While fuzzing with trinity inside a KVM tools guest running the latest -next
	> kernel with the KASAN patchset, I've stumbled on the following spew:
	>
	> [ 4448.949424] ==================================================================
	> [ 4448.951737] AddressSanitizer: user-memory-access on address 0
	> [ 4448.952988] Read of size 2 by thread T19638:
	> [ 4448.954510] CPU: 28 PID: 19638 Comm: trinity-c76 Not tainted 3.16.0-rc4-next-20140711-sasha-00046-g07d3099-dirty #813
	> [ 4448.956823]  ffff88046d86ca40 0000000000000000 ffff880082f37e78 ffff880082f37a40
	> [ 4448.958233]  ffffffffb6e47068 ffff880082f37a68 ffff880082f37a58 ffffffffb242708d
	> [ 4448.959552]  0000000000000000 ffff880082f37a88 ffffffffb24255b1 0000000000000000
	> [ 4448.961266] Call Trace:
	> [ 4448.963158] dump_stack (lib/dump_stack.c:52)
	> [ 4448.964244] kasan_report_user_access (mm/kasan/report.c:184)
	> [ 4448.965507] __asan_load2 (mm/kasan/kasan.c:352)
	> [ 4448.966482] ? netlink_sendmsg (net/netlink/af_netlink.c:2339)
	> [ 4448.967541] netlink_sendmsg (net/netlink/af_netlink.c:2339)
	> [ 4448.968537] ? get_parent_ip (kernel/sched/core.c:2555)
	> [ 4448.970103] sock_sendmsg (net/socket.c:654)
	> [ 4448.971584] ? might_fault (mm/memory.c:3741)
	> [ 4448.972526] ? might_fault (./arch/x86/include/asm/current.h:14 mm/memory.c:3740)
	> [ 4448.973596] ? verify_iovec (net/core/iovec.c:64)
	> [ 4448.974522] ___sys_sendmsg (net/socket.c:2096)
	> [ 4448.975797] ? put_lock_stats.isra.13 (./arch/x86/include/asm/preempt.h:98 kernel/locking/lockdep.c:254)
	> [ 4448.977030] ? lock_release_holdtime (kernel/locking/lockdep.c:273)
	> [ 4448.978197] ? lock_release_non_nested (kernel/locking/lockdep.c:3434 (discriminator 1))
	> [ 4448.979346] ? check_chain_key (kernel/locking/lockdep.c:2188)
	> [ 4448.980535] __sys_sendmmsg (net/socket.c:2181)
	> [ 4448.981592] ? trace_hardirqs_on_caller (kernel/locking/lockdep.c:2600)
	> [ 4448.982773] ? trace_hardirqs_on (kernel/locking/lockdep.c:2607)
	> [ 4448.984458] ? syscall_trace_enter (arch/x86/kernel/ptrace.c:1500 (discriminator 2))
	> [ 4448.985621] ? trace_hardirqs_on_caller (kernel/locking/lockdep.c:2600)
	> [ 4448.986754] SyS_sendmmsg (net/socket.c:2201)
	> [ 4448.987708] tracesys (arch/x86/kernel/entry_64.S:542)
	> [ 4448.988929] ==================================================================

This reports means that we've come to netlink_sendmsg() with msg->msg_name == NULL and msg->msg_namelen > 0.

After this report there was no usual "Unable to handle kernel NULL pointer dereference"
and this gave me a clue that address 0 is mapped and contains valid socket address structure in it.

This bug was introduced in f3d3342602
(net: rework recvmsg handler msg_name and msg_namelen logic).
Commit message states that:
	"Set msg->msg_name = NULL if user specified a NULL in msg_name but had a
	 non-null msg_namelen in verify_iovec/verify_compat_iovec. This doesn't
	 affect sendto as it would bail out earlier while trying to copy-in the
	 address."
But in fact this affects sendto when address 0 is mapped and contains
socket address structure in it. In such case copy-in address will succeed,
verify_iovec() function will successfully exit with msg->msg_namelen > 0
and msg->msg_name == NULL.

This patch fixes it by setting msg_namelen to 0 if msg_name == NULL.

Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Eric Dumazet <edumazet@google.com>
Cc: <stable@vger.kernel.org>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-29 12:20:22 -07:00
WANG Cong 9c5ff24f96 vlan: fail early when creating netdev named config
Similarly, vlan will create  /proc/net/vlan/<dev>, so when we
create dev with name "config", it will confict with
/proc/net/vlan/config.

Reported-by: Stephane Chazelas <stephane.chazelas@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-29 11:43:50 -07:00
WANG Cong a317a2f19d ipv6: fail early when creating netdev named all or default
We create a proc dir for each network device, this will cause
conflicts when the devices have name "all" or "default".

Rather than emitting an ugly kernel warning, we could just
fail earlier by checking the device name.

Reported-by: Stephane Chazelas <stephane.chazelas@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-29 11:43:50 -07:00
WANG Cong 20e61da7ff ipv4: fail early when creating netdev named all or default
We create a proc dir for each network device, this will cause
conflicts when the devices have name "all" or "default".

Rather than emitting an ugly kernel warning, we could just
fail earlier by checking the device name.

Reported-by: Stephane Chazelas <stephane.chazelas@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-29 11:43:50 -07:00
Willem de Bruijn 4d276eb6a4 net: remove deprecated syststamp timestamp
The SO_TIMESTAMPING API defines three types of timestamps: software,
hardware in raw format (hwtstamp) and hardware converted to system
format (syststamp). The last has been deprecated in favor of combining
hwtstamp with a PTP clock driver. There are no active users in the
kernel.

The option was device driver dependent. If set, but without hardware
support, the correct behavior is to return zero in the relevant field
in the SCM_TIMESTAMPING ancillary message. Without device drivers
implementing the option, this field is effectively always zero.

Remove the internal plumbing to dissuage new drivers from implementing
the feature. Keep the SOF_TIMESTAMPING_SYS_HARDWARE flag, however, to
avoid breaking existing applications that request the timestamp.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-29 11:39:50 -07:00
Willem de Bruijn 68a360e82e packet: remove deprecated syststamp timestamp
No device driver will ever return an skb_shared_info structure with
syststamp non-zero, so remove the branch that tests for this and
optionally marks the packet timestamp as TP_STATUS_TS_SYS_HARDWARE.

Do not remove the definition TP_STATUS_TS_SYS_HARDWARE, as processes
may refer to it.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-29 11:39:50 -07:00
John W. Linville a1ae52c203 Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next 2014-07-29 10:32:36 -04:00
John W. Linville ec87652694 NFC: 3.17 pull request
This is the NFC pull request for 3.17.
 This is a rather quiet one, we have:
 
 - A new driver from ST Microelectronics for their NCI ST21NFCB,
   including device tree  support.
 
 - p2p support for the ST21NFCA driver
 
 - A few fixes an enhancements for the NFC digital layer
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJT0CJhAAoJEIqAPN1PVmxKtnAP/jOu1cP/EsegdByqmEumjSGQ
 mBv7wC+/r91wV6Hny8Ij8uxyN+/QfxF75nwPjTrPSApo7mHOlF8FeY0/EktxJazo
 c/NIAntNwREIbpkc68CIQNrYr9YFkKEM+7lCV1ImALUb/CPfiH7Fx7vGdhkCKqkc
 B8etkLyeJtl9EqSZM7GI2YrEbPzPEWLk2ydVQ9BccxvN0I8rc29DnD+DNR5sL6Wa
 7bEmZF5J/GnVErvnS8uHDGgerpzlFAj7MONrsxADZCHPie0F87T3wXAGwKlaBY6R
 nOde0YCS749JxN1c2AdmgIadwo5XqjeSbjQ1g8L8HJTWY8Tl9Vw8GoQFN9qhHkSM
 gkOya77n0R8SZWoJzo3BFBMpsncVG2cJIsnwpRBIWMUzg76mGe7Fzl21KCXD8xyy
 xvy8Ar3QKPDTu6uvLNEPk9+cfl/JxQAoLNL30eeZGBDBSg/g2ptNiBYNLXvVoEtU
 B/xyTdmA1SXQnOKGKNxjFCo+WZDXSoTrWeml/uvBhprVAj6YS3K/imc9EiL4zcD7
 72iNGZbIZRfw91x7VbbQ5Nb8PEyYsLef8ztUFM9HgliNgIDMHaQHoXYwmD4264uO
 RGTETdHYQb0ltX3HsBiIgTNF+Y1sJUVP3TyEhjSk58TpCy/S6v9YjcT9RYmNpuPk
 YftdxfapAKV0EJhcQc1r
 =wR0y
 -----END PGP SIGNATURE-----

Merge tag 'nfc-next-3.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-next

Samuel Ortiz <sameo@linux.intel.com> says:

"NFC: 3.17 pull request

This is the NFC pull request for 3.17.
This is a rather quiet one, we have:

- A new driver from ST Microelectronics for their NCI ST21NFCB,
  including device tree  support.

- p2p support for the ST21NFCA driver

- A few fixes an enhancements for the NFC digital layer"

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2014-07-29 10:31:20 -04:00
Eric Dumazet 04ca6973f7 ip: make IP identifiers less predictable
In "Counting Packets Sent Between Arbitrary Internet Hosts", Jeffrey and
Jedidiah describe ways exploiting linux IP identifier generation to
infer whether two machines are exchanging packets.

With commit 73f156a6e8 ("inetpeer: get rid of ip_id_count"), we
changed IP id generation, but this does not really prevent this
side-channel technique.

This patch adds a random amount of perturbation so that IP identifiers
for a given destination [1] are no longer monotonically increasing after
an idle period.

Note that prandom_u32_max(1) returns 0, so if generator is used at most
once per jiffy, this patch inserts no hole in the ID suite and do not
increase collision probability.

This is jiffies based, so in the worst case (HZ=1000), the id can
rollover after ~65 seconds of idle time, which should be fine.

We also change the hash used in __ip_select_ident() to not only hash
on daddr, but also saddr and protocol, so that ICMP probes can not be
used to infer information for other protocols.

For IPv6, adds saddr into the hash as well, but not nexthdr.

If I ping the patched target, we can see ID are now hard to predict.

21:57:11.008086 IP (...)
    A > target: ICMP echo request, seq 1, length 64
21:57:11.010752 IP (... id 2081 ...)
    target > A: ICMP echo reply, seq 1, length 64

21:57:12.013133 IP (...)
    A > target: ICMP echo request, seq 2, length 64
21:57:12.015737 IP (... id 3039 ...)
    target > A: ICMP echo reply, seq 2, length 64

21:57:13.016580 IP (...)
    A > target: ICMP echo request, seq 3, length 64
21:57:13.019251 IP (... id 3437 ...)
    target > A: ICMP echo reply, seq 3, length 64

[1] TCP sessions uses a per flow ID generator not changed by this patch.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Jeffrey Knockel <jeffk@cs.unm.edu>
Reported-by: Jedidiah R. Crandall <crandall@cs.unm.edu>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Hannes Frederic Sowa <hannes@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-28 18:46:34 -07:00
Jon Paul Maloy 13e9b9972f tipc: make tipc_buf_append() more robust
As per comment from David Miller, we try to make the buffer reassembly
function more resilient to user errors than it is today.

- We check that the "*buf" parameter always is set, since this is
  mandatory input.

- We ensure that *buf->next always is set to NULL before linking in
  the buffer, instead of relying of the caller to have done this.

- We ensure that the "tail" pointer in the head buffer's control
  block is initialized to NULL when the first fragment arrives.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-28 18:34:01 -07:00
Jun Zhao 545469f7a5 neighbour : fix ndm_type type error issue
ndm_type means L3 address type, in neighbour proxy and vxlan, it's RTN_UNICAST.
NDA_DST is for netlink TLV type, hence it's not right value in this context.

Signed-off-by: Jun Zhao <mypopydev@gmail.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-28 17:52:17 -07:00
David S. Miller 3fd0202a0d Merge tag 'master-2014-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next
John W. Linville says:

====================
pull request: wireless-next 2014-07-25

Please pull this batch of updates intended for the 3.17 stream!

For the mac80211 bits, Johannes says:

"We have a lot of TDLS patches, among them a fix that should make hwsim
tests happy again. The rest, this time, is mostly small fixes."

For the Bluetooth bits, Gustavo says:

"Some more patches for 3.17. The most important change here is the move of
the 6lowpan code to net/6lowpan. It has been agreed with Davem that this
change will go through the bluetooth tree. The rest are mostly clean up and
fixes."

and,

"Here follows some more patches for 3.17. These are mostly fixes to what
we've sent to you before for next merge window."

For the iwlwifi bits, Emmanuel says:

"I have the usual amount of BT Coex stuff. Arik continues to work
on TDLS and Ariej contributes a few things for HS2.0. I added a few
more things to the firmware debugging infrastructure. Eran fixes a
small bug - pretty normal content."

And for the Atheros bits, Kalle says:

"For ath6kl me and Jessica added support for ar6004 hw3.0, our latest
version of ar6004.

For ath10k Janusz added a printout so that it's easier to check what
ath10k kconfig options are enabled. He also added a debugfs file to
configure maximum amsdu and ampdu values. Also we had few fixes as
usual."

On top of that is the usual large batch of various driver updates --
brcmfmac, mwifiex, the TI drivers, and wil6210 all get some action.
RafaƂ has also been very busy with b43 and related updates.

Also, I pulled the wireless tree into this in order to resolve a
merge conflict...

P.S.  The change to fs/compat_ioctl.c reflects a name change in a
Bluetooth header file...
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-28 17:36:25 -07:00
Johan Hedberg 3bd2724010 Bluetooth: Fix incorrectly disabling page scan when toggling connectable
If we have entries in the whitelist we shouldn't disable page scanning
when disabling connectable mode. This patch adds the necessary check to
the Set Connectable command handler.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-28 20:13:32 +02:00
Johan Hedberg 204e399003 Bluetooth: Fix clearing HCI_PSCAN flag
This patch fixes a typo in the hci_cc_write_scan_enable() function where
we want to clear the HCI_PSCAN flag if the SCAN_PAGE bit of the HCI
command parameter was not set.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-28 16:50:52 +02:00
Ingo Molnar ca5bc6cd5d Merge branch 'sched/urgent' into sched/core, to merge fixes before applying new changes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-28 10:03:00 +02:00
Nikolay Aleksandrov 1bab4c7507 inet: frag: set limits and make init_net's high_thresh limit global
This patch makes init_net's high_thresh limit to be the maximum for all
namespaces, thus introducing a global memory limit threshold equal to the
sum of the individual high_thresh limits which are capped.
It also introduces some sane minimums for low_thresh as it shouldn't be
able to drop below 0 (or > high_thresh in the unsigned case), and
overall low_thresh should not ever be above high_thresh, so we make the
following relations for a namespace:
init_net:
 high_thresh - max(not capped), min(init_net low_thresh)
 low_thresh - max(init_net high_thresh), min (0)

all other namespaces:
 high_thresh = max(init_net high_thresh), min(namespace's low_thresh)
 low_thresh = max(namespace's high_thresh), min(0)

The major issue with having low_thresh > high_thresh is that we'll
schedule eviction but never evict anything and thus rely only on the
timers.

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-27 22:34:36 -07:00
Florian Westphal ab1c724f63 inet: frag: use seqlock for hash rebuild
rehash is rare operation, don't force readers to take
the read-side rwlock.

Instead, we only have to detect the (rare) case where
the secret was altered while we are trying to insert
a new inetfrag queue into the table.

If it was changed, drop the bucket lock and recompute
the hash to get the 'new' chain bucket that we have to
insert into.

Joint work with Nikolay Aleksandrov.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-27 22:34:36 -07:00
Florian Westphal e3a57d18b0 inet: frag: remove periodic secret rebuild timer
merge functionality into the eviction workqueue.

Instead of rebuilding every n seconds, take advantage of the upper
hash chain length limit.

If we hit it, mark table for rebuild and schedule workqueue.
To prevent frequent rebuilds when we're completely overloaded,
don't rebuild more than once every 5 seconds.

ipfrag_secret_interval sysctl is now obsolete and has been marked as
deprecated, it still can be changed so scripts won't be broken but it
won't have any effect. A comment is left above each unused secret_timer
variable to avoid confusion.

Joint work with Nikolay Aleksandrov.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-27 22:34:36 -07:00