428221 Commits (58ed944241794087df1edadfa66795c966bf1604)

Author SHA1 Message Date
Jon Paul Maloy 58ed944241 tipc: align usage of variable names and macros in socket
The practice of naming variables in TIPC is inconsistent, sometimes
even within the same file.

In this commit we align variable names and declarations within
socket.c, and function and macro names within socket.h. We also
reduce the number of conversion macros to two, in order to make
usage less obscure.

These changes are purely cosmetic.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 15:53:49 -04:00
Jon Paul Maloy 3b4f302d85 tipc: eliminate redundant locking
The three functions tipc_portimportance(), tipc_portunreliable() and
tipc_portunreturnable() and their corresponding tipc_set* functions,
are all grabbing port_lock when accessing the targeted port. This is
unnecessary in the current code, since these calls are made only from
within socket downcalls, already protected by sock_lock.

We remove the redundant locking. Also, since the functions now become
trivial one-liners, we move them to port.h and make them inline.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 15:53:49 -04:00
Jon Paul Maloy 24be34b5a0 tipc: eliminate upcall function pointers between port and socket
Due to the original one-to-many relation between port and user API
layers, upcalls to the API have been performed via function pointers,
installed in struct tipc_port at creation. Since this relation now
always is one-to-one, we can instead use ordinary function calls.

We remove the function pointers 'dispatcher' and 'wakeup' from
struct tipc_port, and replace them with calls to the renamed
functions tipc_sk_rcv() and tipc_sk_wakeup().

At the same time we change the name and signature of the functions
tipc_createport() and tipc_deleteport() to reflect their new role
as mere initialization/destruction functions.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 15:53:49 -04:00
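The change described in 24be34b5a0 can be pictured with a minimal user-space sketch. All names below (port_before_sketch, sk_rcv_sketch, deliver_before, deliver_after) are hypothetical stand-ins and not the TIPC code itself; the sketch only contrasts an upcall through a pointer installed at creation with an ordinary call that becomes possible once there is exactly one consumer.

```c
/* Hypothetical, simplified port: the "before" layout carries an upcall pointer. */
struct port_before_sketch {
	int (*dispatcher)(int msg);
};

/* Stand-in for the socket-layer receive function. */
static int sk_rcv_sketch(int msg)
{
	return msg + 1;
}

/* Before: indirect upcall through the pointer installed at creation. */
static int deliver_before(const struct port_before_sketch *p, int msg)
{
	return p->dispatcher(msg);
}

/* After: the relation is one-to-one, so the consumer is called directly. */
static int deliver_after(int msg)
{
	return sk_rcv_sketch(msg);
}
```

Both paths reach the same function; the direct call simply drops the per-port pointer and the indirection.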
Jon Paul Maloy 8826cde655 tipc: aggregate port structure into socket structure
After the removal of the tipc native API the relation between
a tipc_port and its API types is strictly one-to-one, i.e., the
latter can now only be a socket API. There is therefore no need
to allocate struct tipc_port and struct sock independently.

In this commit, we aggregate struct tipc_port into struct tipc_sock,
hence saving both CPU cycles and structure complexity.

There are no functional changes in this commit, except for the
elimination of the separate allocation/freeing of tipc_port.
All other changes are just adaptations to the new data structure.

This commit also opens up for further code simplifications and
code volume reduction, something we will do in later commits.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 15:53:49 -04:00
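As an illustration of the aggregation described in 8826cde655, the following is a minimal sketch under stated assumptions: the struct names, fields, and the port_to_sock() helper are hypothetical simplifications, not the kernel definitions. It shows one structure embedded in another, so that a single allocation covers both and the enclosing object can be recovered from a pointer to the embedded member.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct tipc_port_sketch {
	unsigned int ref;
	int connected;
};

struct tipc_sock_sketch {
	int sk_state;                 /* stands in for the generic socket part */
	struct tipc_port_sketch port; /* embedded, not separately allocated */
};

/* Recover the enclosing socket from a pointer to its embedded port. */
static struct tipc_sock_sketch *port_to_sock(struct tipc_port_sketch *p)
{
	return (struct tipc_sock_sketch *)((char *)p -
			offsetof(struct tipc_sock_sketch, port));
}
```

The pointer arithmetic is the same idea as the kernel's container_of() macro, written out by hand so the sketch compiles outside the kernel tree.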
Jon Paul Maloy f9fef18c6d tipc: remove redundant 'peer_name' field in struct tipc_sock
The field 'peer_name' in struct tipc_sock is redundant, since
this information already is available from tipc_port, to which
tipc_sock has a reference.

We remove the field, and ensure that peer node and peer port
info instead is fetched via the functions that already exist
for this purpose.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 15:53:49 -04:00
Jon Paul Maloy 978813ee89 tipc: replace reference table rwlock with spinlock
The lock for protecting the reference table is declared as an
RWLOCK, although it is only used in write mode, never in read
mode.

We redefine it to become a spinlock.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 15:53:49 -04:00
Steffen Klassert 4a93f5095a flowcache: Fix resource leaks on namespace exit.
We leak an active timer, the hotcpu notifier and all allocated
resources when we exit a namespace. Fix this by introducing a
flow_cache_fini() function where we release the resources before
we exit.

Fixes: ca925cf153 ("flowcache: Make flow cache name space aware")
Reported-by: Jakub Kicinski <moorray3@wp.pl>
Tested-by: Jakub Kicinski <moorray3@wp.pl>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Fan Du <fan.du@windriver.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 15:31:18 -04:00
Joe Perches 1f36fc74d8 lg-vl600: Convert uses of __constant_<foo> to <foo>
The use of __constant_<foo> has been unnecessary for quite a while now.

Make these uses consistent with the rest of the kernel.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 15:28:06 -04:00
Joe Perches ceffc4acfc xilinx: Convert uses of __constant_<foo> to <foo>
The use of __constant_<foo> has been unnecessary for quite a while now.

Make these uses consistent with the rest of the kernel.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 15:28:06 -04:00
Joe Perches b779d0afcc brocade: Convert uses of __constant_<foo> to <foo>
The use of __constant_<foo> has been unnecessary for quite a while now.

Make these uses consistent with the rest of the kernel.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 15:28:06 -04:00
Joe Perches 184593c734 tipc: Convert uses of __constant_<foo> to <foo>
The use of __constant_<foo> has been unnecessary for quite a while now.

Make these uses consistent with the rest of the kernel.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 15:28:06 -04:00
Joe Perches ec633eb5ff ieee802154: Convert uses of __constant_<foo> to <foo>
The use of __constant_<foo> has been unnecessary for quite a while now.

Make these uses consistent with the rest of the kernel.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 15:28:06 -04:00
Joe Perches 2b8837aeaa net: Convert uses of __constant_<foo> to <foo>
The use of __constant_<foo> has been unnecessary for quite a while now.

Make these uses consistent with the rest of the kernel.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 15:28:06 -04:00
Joe Perches f0e78826e4 8021q: Convert uses of __constant_<foo> to <foo>
The use of __constant_<foo> has been unnecessary for quite a while now.

Make these uses consistent with the rest of the kernel.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 15:28:06 -04:00
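The __constant_<foo> conversions above are presumably safe because the plain byte-order helpers already fold constant arguments at compile time. The sketch below illustrates that pattern in user space; the macro and function names are hypothetical, and it relies on the GCC/Clang builtin __builtin_constant_p, which is an assumption about the compiler.

```c
#include <stdint.h>

/* Pure-expression swap: the compiler can fold it for constant arguments. */
#define SWAB16_CONST_SKETCH(x) \
	((uint16_t)((((uint16_t)(x) & 0x00ffU) << 8) | \
		    (((uint16_t)(x) & 0xff00U) >> 8)))

/* Run-time swap for values not known at compile time. */
static inline uint16_t swab16_runtime_sketch(uint16_t x)
{
	return (uint16_t)((x << 8) | (x >> 8));
}

/* One entry point: pick the foldable form when the argument is constant. */
#define SWAB16_SKETCH(x) \
	(__builtin_constant_p((uint16_t)(x)) ? \
		SWAB16_CONST_SKETCH(x) : swab16_runtime_sketch((uint16_t)(x)))
```

With a single macro that handles both cases, callers have no reason to spell out a separate constant-only variant.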
Claudiu Manoil b338ce270e gianfar: Fix multi-queue support checks @probe()
priv is not instantiated at gfar_of_init() time, when
parsing the DT for info on supported HW queues.  Before
the netdev can be allocated, the number of supported
queues must be known.  Because the number of supported
queues depends on device type, move the compatibility
checks before netdev allocation.  Local vars are used
to hold the operation mode info before netdev allocation.
This fixes the null accesses for priv->.., in gfar_of_init.

Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 00:47:52 -04:00
hayeswang 4f1d4d54f9 r8152: support dumping the hw counters
Add support for dumping the tally counter via ethtool.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 00:09:09 -04:00
Thomas Stilwell 48d5dbaf94 ieee802154: at86rf230: add support for rf233 chip
The rf233 and rf231 are sufficiently similar that we can treat
rf233 like rf231.

rf233 is missing some features that rf231 has, but we don't currently
make use of them so there's nothing to handle differently yet.

Should we add support in the future for rf231 *_NOCLK or SLEEP states,
or PAD_IO drive strength, exceptions will need to be made for rf233.

Signed-off-by: Thomas Stilwell <stilwellt@openlabs.co>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 00:05:36 -04:00
David S. Miller 62cf4be989 Merge branch 'pkt_sched_cond_resched'
Eric Dumazet says:

====================
pkt_sched: allow scheduling points

We have seen delays of more than 50ms in class or qdisc dumps, in case
device is under high TX stress, even with the prior 4KB per skb limit.

With the new 16KB limit, this could translate to 200ms delays.

Add cond_resched() to give a chance to higher prio tasks to get cpu.

But before doing so, we need to remove the rcu locking from tc_dump_qdisc()
as David spotted.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-11 23:54:56 -04:00
Eric Dumazet fba373d2bb pkt_sched: add cond_resched() to class and qdisc dump
We have seen delays of more than 50ms in class or qdisc dumps, in case
device is under high TX stress, even with the prior 4KB per skb limit.

Add cond_resched() to give a chance to higher prio tasks to get cpu.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-11 23:54:23 -04:00
Eric Dumazet 15dc36ebbb pkt_sched: do not use rcu in tc_dump_qdisc()
Like all rtnetlink dump operations, we hold RTNL in tc_dump_qdisc(),
so we do not need to use rcu protection to protect list of netdevices.

This will allow preemption to occur, thus reducing latencies.
Following patch adds explicit cond_resched() calls.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-11 23:54:23 -04:00
stephen hemminger a19a7ec8fc bonding: force cast of IP address in options
The option code is taking IP address and putting it into a generic
container. Force cast to silence sparse warnings.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-11 16:37:14 -04:00
stephen hemminger 693350c2ff netdev: set __percpu attribute on netdev_alloc_pcpu_stats
This patch fixes sparse warnings in vlan driver.
It propagates the sparse __percpu attribute from alloc_percpu
into netdev_alloc_pcpu_stats. I expect it may trigger additional
sparse warnings from other drivers that are missing the __percpu
attribute.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-11 16:37:14 -04:00
Li RongQing 090f1166c6 ipv6: ip6_forward: perform skb->pkt_type check at the beginning
Packets which have L2 address different from ours should be
already filtered before entering into ip6_forward().

Perform that check at the beginning to avoid processing such packets.

Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-11 00:37:42 -04:00
hayeswang fcb308d529 r8152: add skb_cow_head
Call skb_cow_head() before editing the tx packet header. The header
would be reallocated if it is shared.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10 22:23:00 -04:00
Tobias Klauser 8dc43ddc9f net: eth: cpsw: Use net_device_stats from struct net_device
Instead of using its own copy of struct net_device_stats in struct
cpsw_priv, use stats from struct net_device. Also remove the thus
unnecessary .ndo_get_stats function, as it just returns dev->stats,
which is the default.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Acked-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10 21:53:01 -04:00
Eric Dumazet d32d9bb85c flowcache: restore a single flow_cache kmem_cache
It is not legal to create multiple kmem_cache having the same name.

flowcache can use a single kmem_cache, no need for a per netns
one.

Fixes: ca925cf153 ("flowcache: Make flow cache name space aware")
Reported-by: Jakub Kicinski <moorray3@wp.pl>
Tested-by: Jakub Kicinski <moorray3@wp.pl>
Tested-by: Fan Du <fan.du@windriver.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10 21:45:11 -04:00
Gu Zheng 5812521be0 net: add a pre-check of net_ns in sk_change_net()
We do not need to switch the net_ns if the target net_ns is the same
as the current one, so here we add a pre-check of net_ns to avoid
this as David suggested.

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10 16:29:48 -04:00
Eric Dumazet 431a91242d tcp: timestamp SYN+DATA messages
All skb in socket write queue should be properly timestamped.

In case of FastOpen, we special case the SYN+DATA 'message' as we
queue in socket write queue the two fallback skbs:

1) SYN message by itself.
2) DATA segment by itself.

We should make sure these skbs have proper timestamps.

Add a WARN_ON_ONCE() to eventually catch future violations.

Fixes: 740b0f1841 ("tcp: switch rtt estimations to usec resolution")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10 16:15:54 -04:00
Haiyang Zhang 99d3016de4 hyperv: Change the receive buffer size for legacy hosts
Due to a bug in the Hyper-V host version 2008R2, we need to use a slightly smaller
receive buffer size, otherwise the buffer will not be accepted by the legacy hosts.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10 16:11:26 -04:00
Alexander Aring 3772ab1d37 6lowpan: reassembly: fix access of ctl table entry
The correct offset of the 6lowpanfrag_max_datagram_size value in the
proc entry ctl table is 3, not 2.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10 16:03:03 -04:00
David S. Miller e3ca64948b Merge branch 'hyperv-next'
K. Y. Srinivasan says:

====================
Drivers: net: hyperv: Enable various offloads

This patch set enables both checksum as well as segmentation offload.
As part of this effort I have enabled scatter gather I/O as well.

In version 2 of these patches, I addressed comments from David Miller and
Dan Carpenter.

In this version I have addressed the latest comments from David Miller.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10 15:52:17 -04:00
KY Srinivasan 77bf548794 Drivers: net: hyperv: Enable large send offload
Enable segmentation offload.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10 15:51:37 -04:00
KY Srinivasan 08cd04bf6d Drivers: net: hyperv: Enable send side checksum offload
Enable send side checksum offload.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10 15:51:37 -04:00
KY Srinivasan e3d605ed44 Drivers: net: hyperv: Enable receive side IP checksum offload
Enable receive side checksum offload.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10 15:51:37 -04:00
KY Srinivasan 4a0e70ae5e Drivers: net: hyperv: Enable offloads on the host
Prior to enabling guest side offloads, enable the offloads on the host.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10 15:51:37 -04:00
KY Srinivasan 8a00251a36 Drivers: net: hyperv: Cleanup the send path
In preparation for enabling offloads, cleanup the send path.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10 15:51:37 -04:00
KY Srinivasan 54a7357f7a Drivers: net: hyperv: Enable scatter gather I/O
Cleanup the code and enable scatter gather I/O.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10 15:51:36 -04:00
Tim Harvey 3ee2f8ce1a sky2: allow mac to come from dt
The driver reads the mac address from the device registers which would
need to have been programmed by the bootloader.  This patch adds
the ability to pull the mac from devicetree via the pci device dt node.

Signed-off-by: Tim Harvey <tharvey@gateworks.com>
Cc: netdev@vger.kernel.org
Cc: devicetree@vger.kernel.org
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Rob Herring <robh+dt@kernel.org>

Changes since v2:
 - eliminated use of stack tmpaddr per feedback

Changes since v1:
 - simplified based on feedback
 - fixed formatting
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10 15:40:30 -04:00
Eric Dumazet 746e349980 l2tp: fix unused variable warning
net/l2tp/l2tp_core.c:1111:15: warning: unused variable
'sk' [-Wunused-variable]

Fixes: 31c70d5956 ("l2tp: keep original skb ownership")
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10 15:32:24 -04:00
Kleber Sacilotto de Souza c120e9e030 IB/mlx5_core: remove unreachable function call in module init
The call to mlx5_health_cleanup() in the module init function can never
be reached. Removing it.

Signed-off-by: Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com>
Acked-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10 15:23:22 -04:00
Eric Dumazet 9063e21fb0 netlink: autosize skb lengthes
One known problem with netlink is the fact that NLMSG_GOODSIZE is
really small on PAGE_SIZE==4096 architectures, and it is difficult
to know in advance what buffer size is used by the application.

This patch adds an automatic learning of the size.

First netlink message will still be limited to ~4K, but if user used
bigger buffers, then following messages will be able to use up to 16KB.

This speeds up dump() operations by a large factor and should be safe
for legacy applications.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Thomas Graf <tgraf@suug.ch>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10 13:56:26 -04:00
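The size learning described in 9063e21fb0 can be sketched as follows. This is a simplified user-space model, not the netlink implementation: the struct, the function names, and the exact 4096-byte starting value are assumptions for illustration (the commit only says "~4K"), while the 16KB cap comes from the commit message.

```c
#include <stddef.h>

#define INITIAL_LIMIT_SKETCH 4096u   /* assumed stand-in for the ~4K first message */
#define MAX_LEARNED_SKETCH   16384u  /* 16KB cap named in the commit message */

/* Hypothetical per-socket state: largest receive buffer seen so far. */
struct nl_sock_sketch {
	size_t max_recvmsg_len;
};

/* Called on each receive: remember the biggest user buffer, capped at 16KB. */
static void note_recvmsg(struct nl_sock_sketch *s, size_t user_len)
{
	if (user_len > s->max_recvmsg_len)
		s->max_recvmsg_len = user_len;
	if (s->max_recvmsg_len > MAX_LEARNED_SKETCH)
		s->max_recvmsg_len = MAX_LEARNED_SKETCH;
}

/* Size to use for the next dump message: never below the initial limit. */
static size_t next_dump_size(const struct nl_sock_sketch *s)
{
	return s->max_recvmsg_len > INITIAL_LIMIT_SKETCH ?
		s->max_recvmsg_len : INITIAL_LIMIT_SKETCH;
}
```

A legacy application that only ever passes small buffers keeps getting messages at the initial size, which matches the commit's claim that the change should be safe for existing users.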
Edward Cree cd84ff4da1 sfc: Use ether_addr_copy and eth_broadcast_addr
Faster than memcpy/memset on some architectures.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10 13:53:37 -04:00
David S. Miller 19433646fe Merge branch 'gianfar-next'
Claudiu Manoil says:

====================
gianfar: Tx timeout issue

There's an older Tx timeout issue showing up on etsec2 devices
with 2 CPUs.  I pinned this issue down to processing overhead
incurred by supporting multiple Tx/Rx rings, as explained in
the 2nd patch below.  But before this, there's also a concurrency
issue leading to Rx/Tx spurious interrupts, addressed by the
'Tx NAPI' patch below.
The Tx timeout can be triggered with multiple Tx flows,
'iperf -c -N 8' commands, on a 2 CPUs etsec2 based (P1020) board.

Before the patches:
"""
root@p1020rdb-pc:~# iperf -c 172.16.1.3 -n 1000M -P 8 &
[...]
root@p1020rdb-pc:~# NETDEV WATCHDOG: eth1 (fsl-gianfar): transmit queue 1 timed out
WARNING: at net/sched/sch_generic.c:279
Modules linked in:
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.13.0-rc3-03386-g89ea59c #23
task: ed84ef40 ti: ed868000 task.ti: ed868000
NIP: c04627a8 LR: c04627a8 CTR: c02fb270
REGS: ed869d00 TRAP: 0700   Not tainted  (3.13.0-rc3-03386-g89ea59c)
MSR: 00029000 <CE,EE,ME>  CR: 44000022  XER: 20000000
[...]

root@p1020rdb-pc:~# [ ID] Interval       Transfer     Bandwidth
[  5]  0.0-19.3 sec  1000 MBytes    434 Mbits/sec
[  8]  0.0-39.7 sec  1000 MBytes    211 Mbits/sec
[  9]  0.0-40.1 sec  1000 MBytes    209 Mbits/sec
[  3]  0.0-40.2 sec  1000 MBytes    209 Mbits/sec
[ 10]  0.0-59.0 sec  1000 MBytes    142 Mbits/sec
[  7]  0.0-74.6 sec  1000 MBytes    112 Mbits/sec
[  6]  0.0-74.7 sec  1000 MBytes    112 Mbits/sec
[  4]  0.0-74.7 sec  1000 MBytes    112 Mbits/sec
[SUM]  0.0-74.7 sec  7.81 GBytes    898 Mbits/sec

root@p1020rdb-pc:~# ifconfig eth1
eth1      Link encap:Ethernet  HWaddr 00:04:9f:00:13:01
          inet addr:172.16.1.1  Bcast:172.16.255.255  Mask:255.255.0.0
          inet6 addr: fe80::204:9fff:fe00:1301/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:708722 errors:0 dropped:0 overruns:0 frame:0
          TX packets:8717849 errors:6 dropped:0 overruns:1470 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:58118018 (55.4 MiB)  TX bytes:274069482 (261.3 MiB)
          Base address:0xa000

"""

After applying the patches:
"""
root@p1020rdb-pc:~# iperf -c 172.16.1.3 -n 1000M -P 8 &
[...]
root@p1020rdb-pc:~# [ ID] Interval       Transfer     Bandwidth
[  9]  0.0-70.5 sec  1000 MBytes    119 Mbits/sec
[  5]  0.0-70.5 sec  1000 MBytes    119 Mbits/sec
[  6]  0.0-70.7 sec  1000 MBytes    119 Mbits/sec
[  4]  0.0-71.0 sec  1000 MBytes    118 Mbits/sec
[  8]  0.0-71.1 sec  1000 MBytes    118 Mbits/sec
[  3]  0.0-71.2 sec  1000 MBytes    118 Mbits/sec
[ 10]  0.0-71.3 sec  1000 MBytes    118 Mbits/sec
[  7]  0.0-71.3 sec  1000 MBytes    118 Mbits/sec
[SUM]  0.0-71.3 sec  7.81 GBytes    942 Mbits/sec

root@p1020rdb-pc:~# ifconfig eth1
eth1      Link encap:Ethernet  HWaddr 00:04:9f:00:13:01
          inet addr:172.16.1.1  Bcast:172.16.255.255  Mask:255.255.0.0
          inet6 addr: fe80::204:9fff:fe00:1301/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:728446 errors:0 dropped:0 overruns:0 frame:0
          TX packets:8690057 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:59732650 (56.9 MiB)  TX bytes:271554306 (258.9 MiB)
          Base address:0xa000
"""
v2: PATCH 2:
    Replaced CPP check with run-time condition to
    limit the number of queues. Updated comments.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10 13:17:44 -04:00
Claudiu Manoil 71ff9e3df7 gianfar: Use Single-Queue polling for "fsl,etsec2"
For the "fsl,etsec2" compatible models the driver currently
supports 8 Tx and Rx DMA rings (aka HW queues).  However, there
are only 2 pairs of Rx/Tx interrupt lines, as these controllers
are integrated in low power SoCs with 2 CPUs at most.  As a result,
there are at most 2 NAPI instances that have to service multiple
Tx and Rx queues for these devices.  This complicates the NAPI
polling routine having to iterate over the multiple Rx/Tx queues
hooked to the same interrupt lines.  And there's also an overhead
at HW level, as the controller needs to service all the 8 Tx rings
in a round robin manner.  The combined overhead shows up for multi
parallel Tx flows transmitted by the kernel stack, when the driver
usually starts returning NETDEV_TX_BUSY leading to NETDEV WATCHDOG
Tx timeout triggering if the Tx path is congested for too long.

As an alternative, this patch makes the driver support only one
Tx/Rx DMA ring per NAPI instance (per interrupt group or pair
of Tx/Rx interrupt lines) by default.  The simplified single queue
polling routine (gfar_poll_sq) will be the default napi poll routine
for the etsec2 devices too.  Some adjustments needed to be made to
link the Tx/Rx HW queues with each NAPI instance (2 in this case).
The gfar_poll_sq() is already successfully used by older SQ_SG_MODE
(single interrupt group) controllers.
This patch fixes Tx timeout triggering under heavy Tx traffic load
(i.e. iperf -c -P 8) for the "fsl,etsec2" (currently the only
MQ_MG_MODE devices).  There's also a significant memory footprint
reduction by supporting 2 Rx/Tx DMA rings (at most), instead of 8,
for these devices.

Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10 13:17:22 -04:00
Claudiu Manoil aeb12c5ef7 gianfar: Separate out the Tx interrupt handling (Tx NAPI)
There are some concurrency issues on devices w/ 2 CPUs related
to the handling of Rx and Tx interrupts.  eTSEC has separate
interrupt lines for Rx and Tx but a single imask register
to mask these interrupts and a single NAPI instance to handle
both Rx and Tx work.  As a result, the Rx and Tx ISRs are
identical, both are invoking gfar_schedule_cleanup(), however
both handlers can be entered at the same time when the Rx and
Tx interrupts are taken by different CPUs.  In this case
spurious interrupts (SPU) show up (in /proc/interrupts)
indicating a concurrency issue.  Also, Tx overruns followed
by Tx timeout have been observed under heavy Tx traffic load.

To address these issues, the schedule cleanup ISR part has
been changed to handle the Rx and Tx interrupts independently.
The patch adds a separate NAPI poll routine for Tx cleanup to
be triggered independently by the Tx confirmation interrupts
only.  Existing poll functions are modified to handle only
the Rx path processing.  The Tx poll routine does not need a
budget, since Tx processing doesn't consume NAPI budget, and
hence it is registered with minimum NAPI weight.
NAPI scheduling does not require locking since there are
different NAPI instances between the Rx and Tx confirmation
paths now.
So, the patch fixes the occurrence of spurious Rx/Tx interrupts.
Tx overruns also occur less frequently now.

Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10 13:17:22 -04:00
dingtianhong be14cc98e9 vlan: use use ether_addr_equal_64bits to instead of ether_addr_equal
ether_addr_equal_64bits is more efficient than ether_addr_equal, and
can be used when each argument is an array within a structure that
contains at least two bytes of data beyond the array, so it is safe
to use it for vlan, and it makes sense for the fast path.

Cc: Joe Perches <joe@perches.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-09 19:03:51 -04:00
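The reason for the "two bytes of data beyond the array" requirement in be14cc98e9 can be seen in a small user-space sketch. The function name mac_equal_64bits is hypothetical and this is not the kernel helper; it only illustrates comparing a 6-byte address with one 64-bit load per operand and ignoring the two trailing bytes, which is why those bytes must be readable.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * Compare two 6-byte MAC addresses using 8-byte loads.
 * Both buffers must have at least 8 readable bytes; bytes 6 and 7
 * are loaded but masked out of the comparison.
 */
static bool mac_equal_64bits(const uint8_t a[8], const uint8_t b[8])
{
	static const uint8_t mask_bytes[8] = {
		0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x00, 0x00
	};
	uint64_t x, y, mask;

	/* memcpy keeps the loads free of alignment and aliasing problems. */
	memcpy(&x, a, sizeof(x));
	memcpy(&y, b, sizeof(y));
	/* Building the mask from bytes makes it correct on either endianness. */
	memcpy(&mask, mask_bytes, sizeof(mask));

	return ((x ^ y) & mask) == 0;
}
```

A plain 6-byte comparison needs no padding, so the 64-bit form is only appropriate where the surrounding structure guarantees the extra bytes, as the commit message argues for vlan.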
dingtianhong 375f67df28 vlan: slight optimization for vlan_do_receive()
Following Joe's suggestion, add an unlikely() to the test for
PACKET_OTHERHOST, since it may be faster. The measured performance
difference is small and negligible, but it is rare for any lower
device to set the skb type to PACKET_OTHERHOST, so it makes sense to
mark the test as unlikely.

Cc: Joe Perches <joe@perches.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-09 19:03:51 -04:00
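For readers unfamiliar with the annotation used in 375f67df28, here is a minimal sketch of a branch hint. The macro name unlikely_sketch, the enum, and needs_recheck() are hypothetical; the hint is built on the GCC/Clang builtin __builtin_expect, which is an assumption about the compiler. The hint changes only how the compiler lays out the code, never the result of the test.

```c
#include <stdbool.h>

/* Tell the compiler the condition is expected to be false most of the time. */
#define unlikely_sketch(x) __builtin_expect(!!(x), 0)

/* Hypothetical packet classes for the illustration. */
enum pkt_type_sketch {
	PKT_HOST_SKETCH = 0,
	PKT_OTHERHOST_SKETCH = 3
};

/* Returns true when the rare "other host" path has to be taken. */
static bool needs_recheck(int pkt_type)
{
	if (unlikely_sketch(pkt_type == PKT_OTHERHOST_SKETCH))
		return true;  /* cold path */
	return false;         /* hot path */
}
```

Because the hint is advisory, a wrong guess costs some performance but cannot cause incorrect behavior, which fits the commit's cautious wording about the benefit.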
Eric Dumazet 2d8d40afd1 pkt_sched: fq: do not hold qdisc lock while allocating memory
Resizing fq hash table allocates memory while holding qdisc spinlock,
with BH disabled.

This is definitely not good, as allocation might sleep.

We can drop the lock and take it again when needed; we hold RTNL, so
no other changes can happen at the same time.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes: afe4fd0624 ("pkt_sched: fq: Fair Queue packet scheduler")
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-08 19:09:10 -05:00
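The pattern in 2d8d40afd1 (allocate first, lock only for the swap) can be sketched in user space as below. Everything here is a hypothetical model: the table struct, the C11 atomic_flag spinlock, and resize_sketch() are stand-ins, and the sketch omits the rehashing of existing entries that a real hash-table resize would do under the lock. It also assumes, as the commit does via RTNL, that only one resizer runs at a time.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical table protected by a simple spinlock. */
struct table_sketch {
	atomic_flag lock;
	size_t nbuckets;
	int *buckets;
};

static void lock_sketch(atomic_flag *l)
{
	while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
		; /* spin */
}

static void unlock_sketch(atomic_flag *l)
{
	atomic_flag_clear_explicit(l, memory_order_release);
}

/* Resize: the allocation may block, so it happens before taking the lock. */
static int resize_sketch(struct table_sketch *t, size_t new_n)
{
	int *fresh = calloc(new_n, sizeof(*fresh));
	int *old;

	if (!fresh)
		return -1;

	lock_sketch(&t->lock);   /* short critical section: pointer swap only */
	old = t->buckets;
	t->buckets = fresh;
	t->nbuckets = new_n;
	unlock_sketch(&t->lock);

	free(old);               /* release the old array outside the lock */
	return 0;
}
```

Keeping the allocator out of the critical section means the lock is never held across a call that might sleep.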
David S. Miller d85ea93ffb Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next
Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates

This series contains updates to e1000e, ixgbevf and igb.

The majority of this series contains fixes and cleanups to e1000e,
most notably:

Todd provides a fix to PTP in e1000e which adds a lock in e1000e_phc_adjfreq
to prevent concurrent changes to TIMINCA and SYSTIMH/L.  Then provides an
igb fix to use ARRAY_SIZE for array size calculation.

David provides the remaining e1000e patches, which contain:
 - cleanup of pointer references that are no longer used
 - fix an issue on systems with Management Engine enabled with the
   ethernet cable unplugged
 - fix an issue on 82579 where enabling EEE LPI sooner than one second
   after link up causes link issues on some switches
 - refactor the power management flows to prevent the suspend path from
   being executed twice when hibernating
 - refactor the runtime power management so that it no longer interferes
   with the functionality of Energy Efficient Ethernet when enabled, and
   to stop the device from repeatedly flipping between suspend and
   resume while the interface is administratively down
 - enable the feature PHY Ultra Low Power Mode which is a power saving
   feature that reduces the power consumption of the PHY when a cable is
   not connected
 - fix the ethtool offline tests for 82579 parts
 - fix SHRA register access for 82579 parts which was introduced by
   previous commit c3a0dce35a "e1000e: fix overrun of PHY RAR array"

Florian provides a fix for ixgbevf where skb->pkt_type was being checked
like a bitmask, but it is not a bitmask.

Fix an issue reported by Stephen Hemminger where there was a warning
about code defined but never used if IGB_HWMON is not defined.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-08 18:49:29 -05:00
Jeff Kirsher 9b143d11a4 igb: fix warning if !CONFIG_IGB_HWMON
Fix warning about code defined but never used if IGB_HWMON is not defined.

Reported-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-03-08 00:36:55 -08:00