1
0
Fork 0
Commit Graph

415174 Commits (8f646c922d55047ecd6c65ada49ead88ed0db61e)

Author SHA1 Message Date
Eric Dumazet 8f646c922d vxlan: keep original skb ownership
Sathya Perla posted a patch trying to address following problem :

<quote>
 The vxlan driver sets itself as the socket owner for all the TX flows
 it encapsulates (using vxlan_set_owner()) and assigns it's own skb
 destructor. This causes all tunneled traffic to land up on only one TXQ
 as all encapsulated skbs refer to the vxlan socket and not the original
 socket.  Also, the vxlan skb destructor breaks some functionality for
 tunneled traffic like wmem accounting and as TCP small queues and
 FQ/pacing packet scheduler.
</quote>

I reworked Sathya patch and added some explanations.

vxlan_xmit() can avoid one skb_clone()/dev_kfree_skb() pair
and gain better drop monitor accuracy, by calling kfree_skb() when
appropriate.

The UDP socket used by vxlan to perform encapsulation of xmit packets
do not need to be alive while packets leave vxlan code. Its better
to keep original socket ownership to get proper feedback from qdisc and
NIC layers.

We use skb->sk to

A) control amount of bytes/packets queued on behalf of a socket, but
prior vxlan code did the skb->sk transfert without any limit/control
on vxlan socket sk_sndbuf.

B) security purposes (as selinux) or netfilter uses, and I do not think
anything is prepared to handle vxlan stacked case in this area.

By not changing ownership, vxlan tunnels behave like other tunnels.
As Stephen mentioned, we might do the same change in L2TP.

Reported-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-06 16:37:09 -05:00
Eric Dumazet 996b175e39 tcp: out_of_order_queue do not use its lock
TCP out_of_order_queue lock is not used, as queue manipulation
happens with socket lock held and we therefore use the lockless
skb queue routines (as __skb_queue_head())

We can use __skb_queue_head_init() instead of skb_queue_head_init()
to make this more consistent.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-06 16:34:34 -05:00
Veaceslav Falico 0b23810d8c bonding: fix kstrtou8() return value verification in num_peer_notif
It returns 0 in case of success, !0 error otherwise. Fix the improper error
verification.

Fixes: 2c9839c143 ("bonding: add num_grat_arp attribute netlink support")
CC: sfeldma@cumulusnetworks.com
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Acked-by: Scott Feldman <sfeldma@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-06 16:26:00 -05:00
hayeswang 31ca1decfb r8152: replace the return value of rtl_ops_init
Replace the boolean value with the error code for the return value
of the rtl_ops_init().

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-06 16:24:09 -05:00
hayeswang e3ad412ad8 r8152: move the actions of saving the information of the device
Some information of the device may be used in other functions. Move
the relative code to make sure it would be initialzed correctly
before using it.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-06 16:24:09 -05:00
hayeswang 45f4a19f6d r8152: replace some tabs with spaces
Replace the tabs of the variables declaration with the spaces.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-06 16:24:09 -05:00
Vijay Subramanian d4b36210c2 net: pkt_sched: PIE AQM scheme
Proportional Integral controller Enhanced (PIE) is a scheduler to address the
bufferbloat problem.

>From the IETF draft below:
" Bufferbloat is a phenomenon where excess buffers in the network cause high
latency and jitter. As more and more interactive applications (e.g. voice over
IP, real time video streaming and financial transactions) run in the Internet,
high latency and jitter degrade application performance. There is a pressing
need to design intelligent queue management schemes that can control latency and
jitter; and hence provide desirable quality of service to users.

We present here a lightweight design, PIE(Proportional Integral controller
Enhanced) that can effectively control the average queueing latency to a target
value. Simulation results, theoretical analysis and Linux testbed results have
shown that PIE can ensure low latency and achieve high link utilization under
various congestion situations. The design does not require per-packet
timestamp, so it incurs very small overhead and is simple enough to implement
in both hardware and software.  "

Many thanks to Dave Taht for extensive feedback, reviews, testing and
suggestions. Thanks also to Stephen Hemminger and Eric Dumazet for reviews and
suggestions.  Naeem Khademi and Dave Taht independently contributed to ECN
support.

For more information, please see technical paper about PIE in the IEEE
Conference on High Performance Switching and Routing 2013. A copy of the paper
can be found at ftp://ftpeng.cisco.com/pie/.

Please also refer to the IETF draft submission at
http://tools.ietf.org/html/draft-pan-tsvwg-pie-00

All relevant code, documents and test scripts and results can be found at
ftp://ftpeng.cisco.com/pie/.

For problems with the iproute2/tc or Linux kernel code, please contact Vijay
Subramanian (vijaynsu@cisco.com or subramanian.vijay@gmail.com) Mythili Prabhu
(mysuryan@cisco.com)

Signed-off-by: Vijay Subramanian <subramanian.vijay@gmail.com>
Signed-off-by: Mythili Prabhu <mysuryan@cisco.com>
CC: Dave Taht <dave.taht@bufferbloat.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-06 15:13:01 -05:00
David S. Miller 83111e7fe8 netfilter: Fix build failure in nfnetlink_queue_core.c.
net/netfilter/nfnetlink_queue_core.c: In function 'nfqnl_put_sk_uidgid':
net/netfilter/nfnetlink_queue_core.c:304:35: error: 'TCP_TIME_WAIT' undeclared (first use in this function)
net/netfilter/nfnetlink_queue_core.c:304:35: note: each undeclared identifier is reported only once for each function it appears in
make[3]: *** [net/netfilter/nfnetlink_queue_core.o] Error 1

Just a missing include of net/tcp_states.h

Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-06 13:36:06 -05:00
David S. Miller 9aa28f2b71 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nftables
Pablo Neira Ayuso says: <pablo@netfilter.org>

====================
nftables updates for net-next

The following patchset contains nftables updates for your net-next tree,
they are:

* Add set operation to the meta expression by means of the select_ops()
  infrastructure, this allows us to set the packet mark among other things.
  From Arturo Borrero Gonzalez.

* Fix wrong format in sscanf in nf_tables_set_alloc_name(), from Daniel
  Borkmann.

* Add new queue expression to nf_tables. These comes with two previous patches
  to prepare this new feature, one to add mask in nf_tables_core to
  evaluate the queue verdict appropriately and another to refactor common
  code with xt_NFQUEUE, from Eric Leblond.

* Do not hide nftables from Kconfig if nfnetlink is not enabled, also from
  Eric Leblond.

* Add the reject expression to nf_tables, this adds the missing TCP RST
  support. It comes with an initial patch to refactor common code with
  xt_NFQUEUE, again from Eric Leblond.

* Remove an unused variable assignment in nf_tables_dump_set(), from Michal
  Nazarewicz.

* Remove the nft_meta_target code, now that Arturo added the set operation
  to the meta expression, from me.

* Add help information for nf_tables to Kconfig, also from me.

* Allow to dump all sets by specifying NFPROTO_UNSPEC, similar feature is
  available to other nf_tables objects, requested by Arturo, from me.

* Expose the table usage counter, so we can know how many chains are using
  this table without dumping the list of chains, from Tomasz Bursztyka.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-06 13:29:30 -05:00
David S. Miller 6a8c4796df Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next
Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates

This series contains updates to i40e only.

Majority of this series contains patches from Greg and Mitch to fix
up or add functionality to the PF/VF driver interactions.  Notably,
a fix for SR-IOV VF port VLAN which resolved the problem of port VLAN
configurations not being persistent across VF driver loads and unloads
and enable/disable of the feature.  Also do not enable the default port
on the VEB, which is designed only to bridge the PF to an Open vSwitch
or bridge.  Another fix to resolve a possible memory corruption
condition where ARQ messages are written to random memory locations.
Fix a problem where the 'ip link show' command would display stale
link address information after the link address was set via the 'ip
link set' command.

Anjali provides several patches, one which saves information that can
be used while cleaning the Tx ring and useful in detecting Tx hangs.
Then provides a fixes to the admin queue shutdown function to ensure
we are shutting down the queue in the shutdown path and ensure ASQ is
alive before issuing the admin queue command.

Shannon provides a fix for get/update vsi params where the incorrect
struct was being used.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-06 13:25:58 -05:00
Anjali Singhai Jain 80f6428fa1 i40e: Do not allow AQ calls from ndo-ops
If the device is not in a working state avoid making admin
queue (AQ) calls that rely on a working AQ.

Change-Id: Ifbba6d257b3a5b51bfe92938c04088c0baa21433
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-01-06 03:56:28 -08:00
Anjali Singhai Jain 37f0be6d29 i40e: check asq alive before notify
Driver needs to make sure the send queue is alive before
trying to use it.

Chagne-Id: I9bd1f6159c45c98e63f562e3a8dfb57edfe50e13
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-01-06 03:48:36 -08:00
Anjali Singhai Jain e1860d8f1c i40e: Admin queue shutdown fixes
Always call the AQ call to shutdown the queue in the shutdown path.

Check ASQ is alive before issuing the AQ command since we might be
resetting to recover from a bad state in which case we should not
issue the AQ command.

Use the register variable for length so it can be used by PF, VF
and GL AQ commands.

Change-Id: Ic3d305687ea3f1a6afa84e864b7a27bd38a9af32
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-01-06 03:26:23 -08:00
Greg Rose b774c7dd75 i40e: Hide the Port VLAN VLAN ID
The VF VSI Port VLAN settings still allow the user to view VLAN tag in
the descriptor.  Fix the settings to hide the VLAN ID from the VF. The
VF is not supposed to be aware it is on a VLAN in the Port VLAN
scenario.

Change-Id: I976f2bacb455dbb750f8c53a781c689f02cb8907
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-01-06 03:15:17 -08:00
Shannon Nelson f5ac8579f0 i40e: use correct struct for get and update vsi params
The get_vsi_params and update_vsi_params functions were using a
different command struct that just happened to have an seid element in
the right place and so worked correctly anyway.  This patch fixes the
functions to use the right data struct.

There is no actual logic change.

Change-Id: I513b5e1dc293dfd5b2ba4fa443cbdbfa608d9d19
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-01-06 03:08:34 -08:00
Greg Rose f657a6e131 i40e: Fix VF driver MAC address configuration
Fix a problem where the 'ip link show' command would display stale
link address information after the link address was set via the 'ip
link set' command.  In addition, fix problem with the user being
allowed to overwrite the administratively set VF MAC address.

Change-Id: I669ed14e55f2b633ef7b456b713632b08468671c
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-01-06 02:57:05 -08:00
Mitch Williams 7efa84b7ab i40e: support VFs on PFs other than 0
When communicating with VF devices over the AQ, the FW refers to the
VF by its global VF ID, not local the VF ID with reference to its
parent PF. Since the global and local VF IDs are identical for PF 0,
the code worked correctly on PF 0.

However, we cannot just use global IDs throughout the code as most of
the other references to the VF (VSI setup, register offsets, etc.)
require the local VF ID. Instead, we just add or subtract our base VF
ID when sending and receiving AQ messages.

Change-Id: I92f4332b4876bc68b2f9af9ebf48761f63b6bd97
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-01-06 02:46:49 -08:00
Mitch Williams f7414531a0 i40e: acknowledge VFLR when disabling SR-IOV
When SR-IOV is disabled, the (now nonexistent) virtual function
devices undergo a VFLR event. We don't need to handle this event
because the VFs are gone, but we do need to tell the HW that they are
complete. This fixes an issue with a phantom VFLR and broken VFs when
SR-IOV is re-enabled.

Change-Id: I7580b49ded0158172a85b14661ec212af77000c8
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-01-06 02:39:10 -08:00
Mitch Williams 3197ce220c i40e: don't allocate zero size
Shockingly, the compiler didn't flag this uninitialized variable. This
fixes a potential memory corruption condition where ARQ messages are
written to random memory locations.

Change-Id: Iac82f4562d2bf3f42df3f3b2163d9cbed2160135
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-01-06 02:32:26 -08:00
Mitch Williams 5a9769c827 i40e: use struct assign instead of memcpy
Use struct assignment rather than an expensive memory copy.

Change-Id: I1d18d510774dfd41a9c1250cdef238a4187528f5
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-01-06 02:20:10 -08:00
Greg Rose 567472640c i40e: Do not enable default port on the VEB
Enabling the default port on the VEB causes all outgoing traffic from
virtual functions to be copied to the physical function.  The default
port is only supposed to be used if you wish to bridge the physical
function to a SW switch such as Open vSwitch or the Linux bridge. That
allows the SW switch to route traffic to VMs that are not using a
virtual function.

Eventually we'll want to implement the ndo_fdb_add, ndo_fdb_del, and
ndo_fdb_dump functions.  The ndo_fdb_add function would set the
default port on the VEB in those cases where the MAC/VLAN address
filters have overflowed.  Normally we would not want to use it.

Change-Id: I3990f0384fff2840c4e43bc0955dd0b701380852
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-01-06 02:13:22 -08:00
Mitch Williams b141d6196c i40e: avoid unnecessary register read
We don't need to read the base VF id. It's already stashed in the HW
struct.

Change-Id: Ib81e2f76fc40b12c966e014a856b481912cafefc
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-01-06 02:06:51 -08:00
Jesse Brandeburg 1b60f3c416 i40e: fix whitespace
Trivial whitespace fix.

Change-Id: Ib7c70891a33c4b3d200c69367549d0dbdee0f076
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-01-06 02:00:10 -08:00
Greg Rose 6c12fcbf18 i40e: Fix SR-IOV VF port VLAN
This patch fixes two different problems.
1) The port VLAN configuration was not persistent across VF driver
   loads and unloads.

2) The port VLAN configuration was only correct the first time it was
   set. Switching the port VLAN on and off would cause subsequent VLAN
   configurations to be corrupted in the VSI.  Ensure that the correct
   bits are being set for the VSI port VLAN configuration.

Change-Id: I7ebf5329f77eb8d73ccd3324eb346b3abeea737d
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-01-05 20:59:01 -08:00
Anjali Singhai Jain 298deef1f4 i40e: Record dma buffer info for dummy packets
Save information that we can use while cleaning the tx ring. Also record
the time_stamp since we will need it to check tx hangs.

Change-Id: Ia3f1c17f6fec9bcb7fef2542d77eac7f6c4f115c
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-01-05 18:20:28 -08:00
Eyal Perry b912b2f8fc net/mlx4_core: Warn if device doesn't have enough PCI bandwidth
Check if the device get enough bandwidth from the entire PCI chain to satisfy
its capabilities. This patch determines the PCIe device's bandwidth capabilities
by reading its PCIe Link Capabilities registers and then call the
pcie_get_minimum_link function to ensure that the adapter is hooked into a slot
which is capable of providing the necessary bandwidth capabilities.

Signed-off-by: Eyal Perry <eyalpe@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-05 20:37:05 -05:00
David S. Miller 3a2e15df50 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next
Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates

This series contains updates to i40e only.

Anjali provides two cleanups to remove unnecessary code and a fix
to resolve debugfs dumping only half the NVM.  Then provides a fix
to ethtool NVM reads where shadow RAM was used instead of actual
NVM reads.

Jesse provides a couple of fixes, one removes custom i40e functions
which duplicate existing kernel functionality.  Second fixes constant
cast issues by replacing __constant_htons with htons.

Mitch provides a couple of fixes for the VF interfaces in i40e.  First
provides a fix to guard against VF message races with can cause a panic.
Second fix reinitializes the buffer size each time we clean the ARQ,
because subsequent messages can be truncated. Lastly adds functionality
to enable/disable ICR 0 dynamically.

Vasu adds a simple guard against multiple includes of the i40e_txrx.h
file.

Shannon provides a couple of fixes, first fix swaps a couple of lines
around in the error handling if the allocation for the VSI array fails.
Second fixes an issue where we try to free the q_vector that has not
been setup which can panic the kernel.

David provides a patch to save off the point to memory and the length
of 2 structs used in the admin queue in order to store all info about
allocated kernel memory.

Neerav fixes ring allocation where allocation and clearing of rings
for a VSI should be using the alloc_queue_pairs and not num_queue_pairs.
Then removes the unused define for multi-queue enabled.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-05 20:31:01 -05:00
Hannes Frederic Sowa 1e85c9b66d 8021q: make vlan_pcpu_stats visible without CONFIG_VLAN_8021Q
macvlan needs vlan_pcpu_stats so make it visible even if compiling
without VLAN_8021Q support. Otherwise a very long compiler error happens.

Fixes: cdf3e274cf ("macvlan: unify macvlan_pcpu_stats and vlan_pcpu_stats")
Cc: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-By: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-05 20:27:55 -05:00
Daniel Borkmann a48d4bb0b0 net: netdev_kobject_init: annotate with __init
netdev_kobject_init() is only being called from __init context,
that is, net_dev_init(), so annotate it with __init as well, thus
the kernel can take this as a hint that the function is used only
during the initialization phase and free up used memory resources
after its invocation.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-05 20:27:54 -05:00
David S. Miller 855404efae Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
netfilter/IPVS updates for net-next

The following patchset contains Netfilter updates for your net-next tree,
they are:

* Add full port randomization support. Some crazy researchers found a way
  to reconstruct the secure ephemeral ports that are allocated in random mode
  by sending off-path bursts of UDP packets to overrun the socket buffer of
  the DNS resolver to trigger retransmissions, then if the timing for the
  DNS resolution done by a client is larger than usual, then they conclude
  that the port that received the burst of UDP packets is the one that was
  opened. It seems a bit aggressive method to me but it seems to work for
  them. As a result, Daniel Borkmann and Hannes Frederic Sowa came up with a
  new NAT mode to fully randomize ports using prandom.

* Add a new classifier to x_tables based on the socket net_cls set via
  cgroups. These includes two patches to prepare the field as requested by
  Zefan Li. Also from Daniel Borkmann.

* Use prandom instead of get_random_bytes in several locations of the
  netfilter code, from Florian Westphal.

* Allow to use the CTA_MARK_MASK in ctnetlink when mangling the conntrack
  mark, also from Florian Westphal.

* Fix compilation warning due to unused variable in IPVS, from Geert
  Uytterhoeven.

* Add support for UID/GID via nfnetlink_queue, from Valentina Giusti.

* Add IPComp extension to x_tables, from Fan Du.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-05 20:18:50 -05:00
Anjali Singhai Jain c3f0c4fedf i40e: remove un-necessary io-write
Driver needs to clean PBA only when interrupts are turned off and we
are polling instead.

Change-Id: Ic0c1da761bd3abe7f73b1cc8bcddf8e3a232fd0f
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-01-05 01:46:29 -08:00
Anjali Singhai Jain 7b86228902 i40e: Remove unnecessary prototypes
These functions don't need a prototype as they are defined
in the file before they are called.

Change-Id: Ie17ffad4a29a9c0df434c4ebc4681128a6095c65
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-01-05 01:38:03 -08:00
Neerav Parikh 9f52987b05 i40e: I40E_FLAG_MQ_ENABLED is not used
Remove references to I40E_FLAG_MQ_ENABLED from the code
as it doesn't seem to be used anywhere.

Change-Id: I4c89fb65b2cdd26fbb0c58fccbbb4b03f0e5f1b3
Signed-off-by: Neerav Parikh <Neerav.Parikh@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-01-05 01:30:36 -08:00
Neerav Parikh d739764406 i40e: Fix ring allocation
The allocation and clearing of rings for a VSI should be
using the alloc_queue_pairs and not num_queue_pairs.

The alloc_queue_pairs per VSI is a pre-allocated number
of queues assigned to a VSI; based on number of TCs enabled
only certain number of queues may be used from that. This
is mainly valid only for the LAN VSI case as that is the
only VSI that may be enabled with multiple traffic classes.
In the future the number of TCs may change based on DCBX
configuration.

The actual number of queues that are enabled/configured is
based on the number of TCs enabled for a given VSI and that
is stored in num_queue_pairs.

With this change num_[tr]x_queues is unused so remove them.

Change-Id: I9c2f84778bb25f7313c630e9b002a0caa883ce29
Signed-off-by: Neerav Parikh <Neerav.Parikh@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-01-05 00:32:09 -08:00
Shannon Nelson 78681b1f87 i40e: catch unset q_vector
Don't try to free a q_vector that hasn't been set up as it can
panic the kernel.

Change-Id: I0650cc6c441d0779788c522c790293c276d14fbc
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-01-04 23:54:42 -08:00
David Cassard 90bb776ae5 i40e: keep allocated memory in structs
Save both a pointer to memory and the length in order to store all
info about allocated kernel memory.  This patch changes some adminq
allocations to preserve the full i40e_dma_mem/i40e_virt_mem structs
for every allocation.

Change-Id: Ibcf96159aba4ba61f839d16d87d19478df28e630
Signed-off-by: David Cassard <david.g.cassard@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-01-04 23:46:24 -08:00
Shannon Nelson 04b03013a5 i40e: fix error handling when alloc of vsi array fails
Swap a couple lines around in the error handling if the kzalloc() for
the pf->vsi array fails.  This was causing a kernel BUG because the
call to i40e_clear_interrupt_scheme() was assuming the pf->vsi[] array
existed.  In this fix it is possible that i40e_reset_interrupt_capability()
will get called twice, but this is a safe action.

Change-Id: I939163ccaa89baac7511556d36bc873864c35ae1
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-01-04 23:12:55 -08:00
Mitch Williams 2f0191238d i40e: reinit buffer size each time
When cleaning the ARQ, we must reinitialize the buffer size each time we
go through the loop, because i40e_clean_arq_element returns the message
length in the same field. Without this change, subsequent messages can
be truncated to the length of the previous message.

Change-Id: Ic9c32ff843faf0fc3196d21351a1c3a60c6158eb
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-01-04 22:59:27 -08:00
Mitch Williams 2ef28cfb09 i40e: use functions to enable and disable icr 0
Introduce i40e_irq_dynamic_disable_icr0 and use it and its previously-
extant counterpart when appropriate.

Change-Id: Ieb4037874fba2e96fc2354b34a97a3cb8f6490f3
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-01-04 22:51:18 -08:00
Vasu Dev 36fac58180 i40e: add header file flag _I40E_TXRX_H_
Add an include header guard to guard against multiple includes

Change-Id: I73efa03efc912d2047edab903c7caed05b444da2
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-01-04 22:44:37 -08:00
Mitch Williams 6c1b5bff5a i40e: guard against vf message races
When disabling and enabling VFs on a live system with the VF driver
loaded, it's possible to receive an admin queue message from the VF
driver at an inconvenient time, e.g. when the associated data structures
aren't present or configured. This causes a rather inconvenient panic.

To guard against this, we change the order of when we set num_alloc_vfs
when turning off SR-IOV, and then gate processing of any VF messages
based upon that value. Likewise, when enabling VFs, we shut off the
relevant interrupt until configuration is complete.

Change-Id: I0c172c056616c2bebd78bbc807ab446eb484deea
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-01-04 22:37:57 -08:00
Jesse Brandeburg 0e2fe46ca7 i40e: fix constant cast issues
replace __constant_htons with htons

Change-Id: I123a5318bae34c8b004c71db07c56f137c685849
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-01-04 22:31:22 -08:00
Anjali Singhai Jain e5e0a5db4c i40e: Change the ethtool NVM read method to use AQ
Earlier we were reading Shadow RAM (copy of the NVM) which can differ
from the actual NVM. Use AQ instead to read the actual NVM.

Change-Id: Ia0f2773b722db77d093f738c068af872be69bbd4
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-01-04 22:22:26 -08:00
Jesse Brandeburg f62b5060d6 i40e: fix mac address checking
Remove custom i40e functions around ethernet addresses that are
duplicating already existing kernel functionality.

Also ends up fixing a bug with multicast addresses.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-01-04 22:13:45 -08:00
Anjali Singhai Jain a45e88c9db i40e: Dump the whole NVM, not half
Debugfs was reading exactly half the number of words, fix it.

Change-Id: Ieb217f3c6dca455d44e50a0dc61a6664c0cb2265
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-01-04 22:06:20 -08:00
David S. Miller a1d4b03a07 Merge branch 'bgmac'
bgmac: add initial support for core rev 4 on ARM BCM47xx

====================
This adds support for core rev 4 and ARM BCM47XX.
With an other fix to the platform code I am now getting over 200 MBit/s
with this Ethernet driver, the DMA problems are solved are unrelated
to bgmac.

v3:
   - moved flags calculation for bcma_core_enable() into if block
   - remove hard coding of phy address to BGMAC_PHY_NOREGS

v2: add changed suggested by Rafał
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-04 20:25:55 -05:00
Hauke Mehrtens 6df4aff972 bgmac: add support for Northstar SoC (BCM4707, BCM53018)
This adds support for the Northstar SoC. This SoC does not have a PMU in
bcma and no register on it should be called. In addition it support 2.5
GBit/s Ethernet to the PHY.

This GMAC core is not fully working there are still problems with the
DMA controller.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-04 20:25:20 -05:00
Hauke Mehrtens 622a521fa4 bgmac: reset all cores on Northstar SoC
On the Northstar SoC (BCM4707 and BCM53018) we have to enable all GMAC
cores when we just want to use on. We iterate over all the cores and
activate them.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Acked-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-04 20:25:19 -05:00
Hauke Mehrtens 48e07fbe07 bgmac: add support for new BGMAC_CMDCFG_SR position on core rev >= 4
The BGMAC_CMDCFG_SR register is at a different position on core rev >= 4
We do not know where this register is on a rev 5 or higher core, I have
newer seen such a core.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Acked-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-04 20:25:19 -05:00
Hauke Mehrtens 56ceecde1f bgmac: initialize the DMA controller of core rev >= 4
The DMA controller used in the device supported by GMAC with core rev
>= 4 has some new options which are now set to the default values used
in the Broadcom SDK.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Acked-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-04 20:25:19 -05:00