1
0
Fork 0
Commit Graph

535048 Commits (ffe8690c85b8426db7783064724d106702f1b1e8)

Author SHA1 Message Date
Kaixu Xia ffe8690c85 perf: add the necessary core perf APIs when accessing events counters in eBPF programs
This patch add three core perf APIs:
 - perf_event_attrs(): export the struct perf_event_attr from struct
   perf_event;
 - perf_event_get(): get the struct perf_event from the given fd;
 - perf_event_read_local(): read the events counters active on the
   current CPU;
These APIs are needed when accessing events counters in eBPF programs.

The API perf_event_read_local() comes from Peter and I add the
corresponding SOB.

Signed-off-by: Kaixu Xia <xiakaixu@huawei.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-09 22:50:05 -07:00
David S. Miller f1d5ca4344 Merge branch 'mv88e6xxx-switchdev-fdb'
Vivien Didelot says:

====================
net: dsa: mv88e6xxx: support switchdev FDB objects

This patchset refactors the DSA and mv88e6xxx code to use the switchdev FDB
objects.

The first two patches add minor but necessary changes to switchdev, the third
one implements the switchdev glue in DSA for FDB routines, and the remaining
ones refactor the FDB access functions in the mv88e6xxx code.

Below is an usage example (ports 0-2 belongs to br0, ports 3-4 belongs to br1):

    # bridge fdb add 3c:97:0e:11:30:6e dev swp2
    # bridge fdb add 3c:97:0e:11:40:78 dev swp3
    # bridge fdb add 3c:97:0e:11:50:86 dev swp4
    # bridge fdb del 3c:97:0e:11:40:78 dev swp3
    # bridge fdb
    01:00:5e:00:00:01 dev eth0 self permanent
    01:00:5e:00:00:01 dev eth1 self permanent
    00:50:d2:10:78:15 dev swp0 master br0 permanent
    3c:97:0e:11:30:6e dev swp2 self static
    00:50:d2:10:78:15 dev swp3 master br1 permanent
    3c:97:0e:11:50:86 dev swp4 self static
    # cat /sys/kernel/debug/dsa0/atu
    # DB   T/P  Vec State Addr
    # 001  Port 004   e   3c:97:0e:11:30:6e
    # 004  Port 010   e   3c:97:0e:11:50:86

For the 88E6xxx switches, FIDs 1 to num_ports will be reserved for non-bridged
ports and bridge groups, and the remaining will be later used by VLANs.

This change is necessary to welcome the support for hardware VLANs (which will
follow soon).

Changes in v2:

 - remove ndo_bridge_{get,set,del}link from switchdev/DSA glue code

 - use ether_addr_copy instead of memcpy for MAC addresses

 - constify MAC address in port_fdb_{add,del}

 - split the mv88e6xxx code refactoring into several patches
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-09 22:48:10 -07:00
Vivien Didelot 878205101f net: dsa: mv88e6xxx: rework FDB add/del operations
Add a low level function for the ATU Load operation, and provide FDB add
and delete wrappers functions.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-09 22:48:09 -07:00
Vivien Didelot 6630e23617 net: dsa: mv88e6xxx: rework FDB getnext operation
This commit adds a low level _mv88e6xxx_atu_getnext function and helpers
to rewrite the mv88e6xxx_port_fdb_getnext operation.

A mv88e6xxx_atu_entry structure is added for convenient access to the
hardware, and GLOBAL_ATU_FID is defined instead of the raw 0x01 value.

The previous implementation did not handle the eventual trunk mapping.
If the related bit is set, then the ATU data register would contain the
trunk ID, and not the port vector.

Check this in the FDB getnext operation and do not handle it (yet).

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-09 22:48:09 -07:00
Vivien Didelot 395059fb92 net: dsa: mv88e6xxx: rename ATU MAC accessors
Rename the __mv88e6xxx_{read,write}_addr functions to more explicit
_mv88e6xxx_atu_mac_{read,write} functions, which also respect the single
underscore convention used in the file (meaning SMI lock must be held).

In the meantime, define their MAC address parameters as an array of
ETH_ALEN bytes instead of a char pointer.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-09 22:48:09 -07:00
Vivien Didelot 368b1d9c10 net: dsa: mv88e6xxx: extend fid mask
The driver currently manages one FID per port (or bridge group), with a
mask of DSA_MAX_PORTS bits, where 0 means that the FID is in use.

The Marvell 88E6xxx switches support up to 4094 FIDs (from 1 to 0xfff;
FID 0 means that multiple address databases are not being used).

This patch changes the fid_mask for an fid_bitmap of 4096 bits.

>From now on, FIDs 1 to num_ports are reserved for non-bridged ports and
bridge groups (a bridge group gets the FID of its first member). The
remaining bits will be reserved for VLAN entries.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-09 22:48:09 -07:00
Vivien Didelot 55045ddded net: dsa: add support for switchdev FDB objects
Remove the fdb_{add,del,getnext} function pointer in favor of new
port_fdb_{add,del,getnext}.

Implement the switchdev_port_obj_{add,del,dump} functions in DSA to
support the SWITCHDEV_OBJ_PORT_FDB objects.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-09 22:48:09 -07:00
Vivien Didelot 890248261a net: switchdev: support static FDB addresses
This patch adds a is_static boolean to the switchdev_obj_fdb structure,
in order to set the ndm_state to either NUD_NOARP or NUD_REACHABLE.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-09 22:48:09 -07:00
Vivien Didelot 1525c386a1 net: switchdev: change fdb addr for a byte array
The address in the switchdev_obj_fdb structure is currently represented
as a pointer. Replacing it for a 6-byte array allows switchdev to carry
addresses directly read from hardware registers, not stored by the
switch chip driver (as in Rocker).

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-09 22:48:08 -07:00
Masanari Iida 4933d85c51 net:wimax: Fix doucble word "the the" in networking.xml
This patch fix a double word "the the"
in Documentation/DocBook/networking.xml and
Documentation/DocBook/networking/API-Wimax-report-rfkill-sw.html.

These files are generated from comment in source, so I had to
fix the typo in net/wimax/io-rfkill.c

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-09 22:43:52 -07:00
Tom Herbert 10e4ea7514 net: Fix race condition in store_rps_map
There is a race condition in store_rps_map that allows jump label
count in rps_needed to go below zero. This can happen when
concurrently attempting to set and a clear map.

Scenario:

1. rps_needed count is zero
2. New map is assigned by setting thread, but rps_needed count _not_ yet
   incremented (rps_needed count still zero)
2. Map is cleared by second thread, old_map set to that just assigned
3. Second thread performs static_key_slow_dec, rps_needed count now goes
   negative

Fix is to increment or decrement rps_needed under the spinlock.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-07 15:56:56 -07:00
Wenyu Zhang e05176a328 openvswitch: Make 100 percents packets sampled when sampling rate is 1.
When sampling rate is 1, the sampling probability is UINT32_MAX. The packet
should be sampled even the prandom32() generate the number of UINT32_MAX.
And none packet need be sampled when the probability is 0.

Signed-off-by: Wenyu Zhang <wenyuz@vmware.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-07 12:00:38 -07:00
Alexei Starovoitov da8b43c0e1 vxlan: combine VXLAN_FLOWBASED into VXLAN_COLLECT_METADATA
IFLA_VXLAN_FLOWBASED is useless without IFLA_VXLAN_COLLECT_METADATA,
so combine them into single IFLA_VXLAN_COLLECT_METADATA flag.
'flowbased' doesn't convey real meaning of the vxlan tunnel mode.
This mode can be used by routing, tc+bpf and ovs.
Only ovs is strictly flow based, so 'collect metadata' is a better
name for this tunnel mode.

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-07 11:46:34 -07:00
David S. Miller e03c512841 Merge branch 'rds-tcp-netns'
Sowmini Varadhan says:

====================
RDS-TCP: Network namespace support

This patch series contains the set of changes to correctly set up
the infra for PF_RDS sockets that use TCP as the transport in multiple
network namespaces.

Patch 1 in the series is the minimal set of changes to allow
a single instance of RDS-TCP to run in any (i.e init_net or other) net
namespace.  The changes in this patch set ensure that the execution of
'modprobe [-r] rds_tcp' sets up the kernel TCP sockets
relative to the current netns, so that RDS applications can send/recv
packets from that netns, and the netns can later be deleted cleanly.

Patch 2 of the series further allows multiple RDS-TCP instances,
one per network namespace. The changes in this patch allows dynamic
creation/tear-down of RDS-TCP client and server sockets  across all
current and future namespaces.

v2 changes from RFC sent out earlier:
    David Ahern comments in patch 1, net_device notifier in patch 2,
    patch 3 broken off and submitted separately.
v3: Cong Wang review comments.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-07 11:29:58 -07:00
Sowmini Varadhan 467fa15356 RDS-TCP: Support multiple RDS-TCP listen endpoints, one per netns.
Register pernet subsys init/stop functions that will set up
and tear down per-net RDS-TCP listen endpoints. Unregister
pernet subusys functions on 'modprobe -r' to clean up these
end points.

Enable keepalive on both accept and connect socket endpoints.
The keepalive timer expiration will ensure that client socket
endpoints will be removed as appropriate from the netns when
an interface is removed from a namespace.

Register a device notifier callback that will clean up all
sockets (and thus avoid the need to wait for keepalive timeout)
when the loopback device is unregistered from the netns indicating
that the netns is getting deleted.

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-07 11:29:58 -07:00
Sowmini Varadhan d5a8ac28a7 RDS-TCP: Make RDS-TCP work correctly when it is set up in a netns other than init_net
Open the sockets calling sock_create_kern() with the correct struct net
pointer, and use that struct net pointer when verifying the
address passed to rds_bind().

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-07 11:29:57 -07:00
David S. Miller 1ebd08a7e5 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2015-08-05

This series contains updates to i40e, i40evf and e1000e.

Anjali adds support for x772 devices to i40e and i40evf.  With the added
support, x772 supports offloading of the outer UDP transmit and receive
checksum for tunneled packets.  Also supports evicting ATR filters in the
hardware, so update the driver with this new feature set.

Raanan provides several fixes for e1000e, first rectifies the Energy
Efficient Ethernet in Sx code so that it only applies to parts that
actually support EEE in Sx.  Fix whitespace and moved ICH8 related define
to the proper context.  Fixed the ASPM locking which was reported by
Bjorn Helgaas.  Fix a workaround implementation for systime which could
experience a large non-linear increment of the systime value when
checking for overflow.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-07 00:19:09 -07:00
Jason A. Donenfeld d92cff89a0 net_dbg_ratelimited: turn into no-op when !DEBUG
The pr_debug family of functions turns into a no-op when -DDEBUG is not
specified, opting instead to call "no_printk", which gets compiled to a
no-op (but retains gcc's nice warnings about printf-style arguments).

The problem with net_dbg_ratelimited is that it is defined to be a
variant of net_ratelimited_function, which expands to essentially:

    if (net_ratelimit())
        pr_debug(fmt, ...);

When DEBUG is not defined, then this becomes,

    if (net_ratelimit())
        ;

This seems benign, except it isn't. Firstly, there's the obvious
overhead of calling net_ratelimit needlessly, which does quite some book
keeping for the rate limiting. Given that the pr_debug and
net_dbg_ratelimited family of functions are sprinkled liberally through
performance critical code, with developers assuming they'll be compiled
out to a no-op most of the time, we certainly do not want this needless
book keeping. Secondly, and most visibly, even though no debug message
is printed when DEBUG is not defined, if there is a flood of
invocations, dmesg winds up peppered with messages such as
"net_ratelimit: 320 callbacks suppressed". This is because our
aforementioned net_ratelimit() function actually prints this text in
some circumstances. It's especially odd to see this when there isn't any
other accompanying debug message.

So, in sum, it doesn't make sense to have this function's current
behavior, and instead it should match what every other debug family of
functions in the kernel does with !DEBUG -- nothing.

This patch replaces calls to net_dbg_ratelimited when !DEBUG with
no_printk, keeping with the idiom of all the other debug print helpers.

Also, though not strictly neccessary, it guards the call with an if (0)
so that all evaluation of any arguments are sure to be compiled out.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-06 23:51:30 -07:00
Roopa Prabhu 3dcb615e68 af_mpls: add null dev check in find_outdev
This patch adds null dev check for the 'cfg->rc_via_table ==
NEIGH_LINK_TABLE or dev_get_by_index() failed' case

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-06 22:03:58 -07:00
David S. Miller 3c62181840 Merge branch 'test-bpf-next'
Nicolas Schichan says:

====================
test_bpf improvements

Please find below the patch series with my latest changes to test_bpf.

The first patch checks for unexpected NULL generated skbs before
running the filter.

The second patch adds the possibility for tests to generate fragmented
skbs.

The third patch tests LD_ABS and LD_IND on fragmented skbs.

The fourth patch adds the possibility to restrict the tests being run
by specifying the name/id/range of the test(s) to run via module
parameters.

The fifth patch tests LD_ABS and LD_IND on non fragmented skbs with
various sizes and alignments.

The sixth and final patch checks that the interpreter or JIT correctly
resets A and X to 0.

This serie is against today's net-next tree.

Changes in V2:

* move declaration of 'ptr' in if() block in patch 2/6.

* fix various typos in patch 4/6

* rework default init of test_range array and cleanup exclude_test()
  return condition in patch 4/6.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-06 22:02:32 -07:00
Nicolas Schichan 86bf1721b2 test_bpf: add tests checking that JIT/interpreter sets A and X to 0.
It is mandatory for the JIT or interpreter to reset the A and X
registers to 0 before running the filter. Check that it is the case on
various ALU and JMP instructions.

Signed-off-by: Nicolas Schichan <nschichan@freebox.fr>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-06 22:02:32 -07:00
Nicolas Schichan 08fcb08fc0 test_bpf: add more tests for LD_ABS and LD_IND.
This exerces the LD_ABS and LD_IND instructions for various sizes and
alignments. This also checks that X when used as an offset to a
BPF_IND instruction first in a filter is correctly set to 0.

Signed-off-by: Nicolas Schichan <nschichan@freebox.fr>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-06 22:02:32 -07:00
Nicolas Schichan d2648d4e26 test_bpf: add module parameters to filter the tests to run.
When developping on the interpreter or a particular JIT, it can be
interesting to restrict the tests list to a specific test or a
particular range of tests.

This patch adds the following module parameters to the test_bpf module:

* test_name=<string>: only the specified named test will be run.

* test_id=<number>: only the test with the specified id will be run
  (see the output of test_bpf without parameters to get the test id).

* test_range=<number>,<number>: only the tests within IDs in the
  specified id range are run (see the output of test_bpf without
  parameters to get the test ids).

Any invalid range, test id or test name will result in -EINVAL being
returned and no tests being run.

Signed-off-by: Nicolas Schichan <nschichan@freebox.fr>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-06 22:02:32 -07:00
Nicolas Schichan 2cf1ad7593 test_bpf: test LD_ABS and LD_IND instructions on fragmented skbs.
These new tests exercise various load sizes and offsets crossing the
head/fragment boundary.

Signed-off-by: Nicolas Schichan <nschichan@freebox.fr>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-06 22:02:32 -07:00
Nicolas Schichan bac142acb9 test_bpf: allow tests to specify an skb fragment.
This introduce a new test->aux flag (FLAG_SKB_FRAG) to tell the
populate_skb() function to add a fragment to the test skb containing
the data specified in test->frag_data).

Signed-off-by: Nicolas Schichan <nschichan@freebox.fr>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-06 22:02:31 -07:00
Nicolas Schichan e34684f88e test_bpf: avoid oopsing the kernel when generate_test_data() fails.
Signed-off-by: Nicolas Schichan <nschichan@freebox.fr>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-06 22:02:31 -07:00
David S. Miller c71b5ad06e Merge branch 'mlx5e-next'
Amir Vadai says:

====================
net/mlx5e: Driver updates 04-Aug-2015

This patchset introduces two features to the ConnectX-4 driver: Patch 8/8
("Support physical port counters") exposes some hardware counters through
ethtool. Rest of the patches are preparation and usage of what we call
light-weight netdev open/close. Some flows that used to be in the ndo_open/stop
are moved to the PCI probe/remove flows - i.e. we will make the netdev
open/close operations more "light-weight".

The benefits of this change are:
1) Reduce the execution time of the stop/open operations.
2) Avoid saving SW shadows of resource configurations that must
   persist through stop/open operations (e.g flow table steering
   rules), and avoid deleting/applying them from/to the device upon
   netdev stop/open.
3) Avoid synchronizing threads that access those resources with the
   netdev stop/open threads.

Instead of create/destroy the resource during netdev open/stop, This patchset
changes the behavior such that upon netdev stop, traffic is redirected to a
"Drop RQ" (a RQ that silently drops, at the NIC HW level all incoming traffic).
After redirecting the traffic, RX/TX software resources could be destroyed.
During netdev open, the RX/TX rings are created and traffic is redirected to
the RX rings.

Patchset was applied and tested over commit ba7591d ("ebpf: add skb->hash to
offset map for usage in {cls, act}_bpf or filters")
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-06 22:00:59 -07:00
Gal Pressman efea389d3c net/mlx5_core: Support physical port counters
Added physical port counters in the following standard formats to
ethtool statistics:
  - IEEE 802.3
  - RFC2863
  - RFC2819

Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-06 22:00:59 -07:00
Achiad Shochat 9b37b07fcb net/mlx5e: Take advantage of the light-weight netdev open/stop
Now that TIRs, TISs and flow tables are kept alive while the netdev is
stopped (after executing ndo_stop()) we can do the following
improvements:

- Obsolete the active_vlans SW shadow.
- Do not delete/add flow table rules upon ndo_stop/open.
  In addition to simplifying the flow, this change also fastens
  the ndo_open/close operations.
- Obsolete synchronization of threads accessing the flow tables
  with the netdev stop/open threads.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-06 22:00:59 -07:00
Achiad Shochat 1cefa326ff net/mlx5e: Disable async events before unregister_netdev()
It does not make sense to allow events while the netdev is
unregistered.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-06 22:00:58 -07:00
Achiad Shochat 40ab6a6ebe net/mlx5e: Rename/move functions following the ndo_stop flow change
Rename some functions that used to be invoked upon ndo_open/stop and
are now invoked upon create/destroy_netdev() in order to better hint
their place in the flow.

Change some functions location in the file so that functions involved
in ndo_open/stop flow will not be interleaved with other functions.

This is a cosmetic change, no logical change here.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-06 22:00:58 -07:00
Achiad Shochat 5c50368f38 net/mlx5e: Light-weight netdev open/stop
Create/destroy TIRs, TISs and flow tables upon PCI probe/remove rather
than upon the netdev ndo_open/stop.

Upon ndo_stop(), redirect all RX traffic to the (lately introduced)
"Drop RQ" and then close only the RX/TX rings, leaving the TIRs,
TISs and flow tables alive.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-06 22:00:58 -07:00
Achiad Shochat d9eea403ca net/mlx5_core: Introduce access function to modify RSS/LRO params
To be used by the mlx5 Eth driver in following commit.

This is in preparation for netdev "light-weight" open/stop flow
change described in previous commit.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-06 22:00:58 -07:00
Achiad Shochat 50cfa25aba net/mlx5e: Introduce the "Drop RQ"
RX traffic routed to this RQ will be silently dropped, at the NIC HW
level.

This is in preparation for netdev "light-weight" open/stop flow
change described in previous commit.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-06 22:00:58 -07:00
Achiad Shochat 4cbeaff54f net/mlx5e: Unify the RX flow
Generally an RX packet flows through the following objects:
Flow table --> TIR --> RQT --> RQ

Where:
- TIR stands for "Transport Interface Receive", defining the RSS and
  LRO paramaters.
- RQT stands for "RQ Table", implementing the RSS indirection table.
- RQ stands for "Receive Queue"

For flows that do not need LRO, nor RSS, the driver made a shortcut to
the above RX flow by pointing to the RQ directly from the TIR, yielding
this flow:
Flow table --> TIR --> RQ

In this commit we remove this shortcut by "inserting" a single-RQ RQT
between the TIR and the RQ, i.e RX packets will reach the same RQ but
will go through an RQT of size 1, pointing to just a single RQ.

This way the RX traffic re-direction to/from the "Drop RQ" will be more
uniform (AKA "one flow"), as it will involve only RQTs re-direction and
no TIRs re-direction.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-06 22:00:58 -07:00
David S. Miller adc4cc99b7 Merge branch 'cpsw-next'
Mugunthan V N says:

====================
CPSW interrupt handling cleanup and performance improvement

This patch series removes the irq controller disable interrupt and
adding a napi for tx event handling which improves the performance by
~180Mbps on dra7-evm

[  5] local 192.168.10.116 port 5001 connected with 192.168.10.165 port 44176
[  5]  0.0-60.0 sec  1.48 GBytes   210 Mbits/sec
[  4] local 192.168.10.116 port 5001 connected with 192.168.10.165 port 33257
[  4]  0.0-60.0 sec  2.71 GBytes   386 Mbits/sec

Changes from initial version:
* Added a patch to have napi only for first interface as there is
  no use of having seperate napis for each interface as the
  interrupt is shared by both interface and only one napi is
  scheduled for each interrupt.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-06 21:59:27 -07:00
Mugunthan V N 32a7432c0f drivers: net: cpsw: add separate napi for tx
Instead of processing tx events in isr adding separate napi for
tx which improves performance by ~180Mbps with
omap2plus_defconfig on DRA74x platform. Also cleaning up rx napis
by renaming to napi_rx for better understanding the code.

Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-06 21:59:27 -07:00
Mugunthan V N d354eb85d6 drivers: net: cpsw: dual_emac: simplify napi usage
Since interrupt is shared between the two ethernet interface and
in isr only one napi is scheduled at an instance so having two
napis doesn't make any difference. So making napi also as a
common resource for the dual ethernet interfaces.

Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-06 21:59:27 -07:00
Mugunthan V N 870915feab drivers: net: cpsw: remove disable_irq/enable_irq as irq can be masked from cpsw itself
CPSW interrupts can be disabled by masking CPSW interrupts and
clearing interrupt by writing appropriate EOI. So removing all
disable_irq/enable_irq as discussed in [1]

[1] http://patchwork.ozlabs.org/patch/492741/

Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-06 21:59:27 -07:00
Dan Carpenter 5a9348b54d mpls: small cleanup in inet/inet6_fib_lookup_dev()
We recently changed this code from returning NULL to returning ERR_PTR.
There are some left over NULL assignments which we can remove.  We can
preserve the error code from ip_route_output() instead of always
returning -ENODEV.  Also these functions use a mix of gotos and direct
returns.  There is no cleanup necessary so I changed the gotos to
direct returns.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Acked-by: Robert Shearman <rshearma@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-06 21:56:55 -07:00
David S. Miller 02b5242847 Merge branch 'bnx2x-cnic-bnx2fc-bd-support'
Yuval Mintz says:

====================
bnx2x, cnic, bnx2fc: add support for BD

Commit 230d00eb4b ("bnx2x: new Multi-function mode - BD") added support
for a new multi-function mode, but it added only the support required by
bnx2x for L2 interfaces.

This adds the required changes to support the new multi-function mode in
the offloaded storage protocols.

Dave,

Please consider applying this series to `net-next'.

Do notice that this involves non-networking driver changes -
but sending this as a single series seemed like the best approach as
we had to have bnx2x changes to support the new functionality.
If this is problematic, please tell us what's the preferred solution here.

Changes from previous versions
------------------------------

 - From v1 - no actual changes; v1 failed to reach netdev so in order to
   keep things in line I've termed this one v2.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-06 21:54:13 -07:00
Joe Carnuccio 2971ff67bd bnx2fc: Read npiv table from nvram and create vports.
Signed-off-by: Joe Carnuccio <joe.carnuccio@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-06 21:54:13 -07:00
Yuval Mintz 97ac4ef78e bnx2x: Add BD support for storage
Commit 230d00eb4b ("bnx2x: new Multi-function mode - BD") adds support
for the new mode in bnx2x. This expands this support by implementing
APIs required by our storage drivers to support that mode.

Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-06 21:54:13 -07:00
Adheer Chandravanshi 9b8d504402 cnic: Add the interfaces to get FC-NPIV table.
Signed-off-by: Adheer Chandravanshi <adheer.chandravanshi@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-06 21:54:12 -07:00
Tej Parkash eddb755428 cnic: Populate upper layer driver state in MFW
Signed-off-by: Tej Parkash <tej.parkash@qlogic.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-06 21:54:12 -07:00
Scott Feldman ff14702844 rocker: use netdev_err after register_netdev
After successful register_netdev, we can use netdev_err rather the more
generic dev_err.

Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-06 21:47:57 -07:00
Scott Feldman 6c4f7780a5 rocker: NULL port if port probe fails
Set port to NULL if port probe fails so we don't try to remove partially
initialized port on port probe err cleanup path.

Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-06 21:47:57 -07:00
Raanan Avargil d2d7d4e4a6 e1000e: Increase driver version number
Signed-off-by: Raanan Avargil <raanan.avargil@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-08-05 16:54:04 -07:00
Raanan Avargil 37b12910dd e1000e: Fix tight loop implementation of systime read algorithm
Change the algorithm. Read systimel twice and check for overflow.
If there was no overflow, use the first value.
If there was an overflow, read systimeh again and use the second
systimel value.

Signed-off-by: Raanan Avargil <raanan.avargil@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-08-05 16:53:48 -07:00
Raanan Avargil 2758f9edb7 e1000e: Fix incorrect ASPM locking
This patch fixes wrong locking usage.
In the context of slot reset, we should use lock.
And during resume, there is no need of lock.

Reported-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Raanan Avargil <raanan.avargil@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-08-05 16:53:47 -07:00