Commit Graph

147 Commits (98efd69493b9d4b02353a552af8ffaaf30de8af4)

Author SHA1 Message Date
Alexander Duyck 98efd69493 i40e/i40evf: Add support for using order 1 pages with a 3K buffer
There are situations where adding padding to the front and back of an Rx
buffer will require additional space.  Specifically, if NET_IP_ALIGN is
non-zero, or the MTU size is larger than 7.5K, we would need to use 2K
buffers, which leaves us with no room for the padding.

To preemptively address these cases I am adding support for 3K buffers to
the Rx path so that we can provide the additional padding needed in the
event of NET_IP_ALIGN being non-zero or a cache line being greater than
64 bytes.
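
As a rough sketch of the intended buffer-size selection (illustrative
only; the helper name and exact conditions here are not from the patch):

	/* hypothetical helper: fall back to 3K buffers (order-1 pages on
	 * 4K PAGE_SIZE systems) whenever the extra padding is needed
	 */
	static inline unsigned int i40e_rx_buf_len_sketch(void)
	{
		if (NET_IP_ALIGN || L1_CACHE_BYTES > 64)
			return 3072;	/* room for front/back padding */
		return 2048;		/* the existing default */
	}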

Change-ID: I938bc1ba611285428df39a613cd66f98e60b55c7
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-04-08 02:53:50 -07:00
Alexander Duyck fa2343e903 i40e/i40evf: Break i40e_fetch_rx_buffer up to allow for reuse of frag code
This patch is meant to clean up the code in preparation for adding
support for build_skb.  Specifically, we deconstruct i40e_fetch_rx_buffer
into several functions so that those functions can later be reused when
we add a path for build_skb.

In particular, this change splits out the code for adding a page to an
existing skb.

Change-ID: Iab1efbab6b8b97cb60ab9fdd0be1d37a056a154d
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-29 02:15:06 -07:00
Alexander Duyck a0cfc3130e i40e/i40evf: Pull out code for cleaning up Rx buffers
This patch pulls out the code responsible for handling buffer recycling and
page counting and distributes it through several functions.  This allows us
to commonize the bits that handle either freeing or recycling the buffers.

As far as page count tracking goes, one change to the logic is that
pagecnt_bias is decremented as soon as we call i40e_get_rx_buffer.  It is
then the responsibility of the function that pulls the data to either
increment pagecnt_bias, if the buffer can be recycled as-is, or to update
page_offset so that we are pointing at the correct location for placement
of the next buffer.
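
Schematically, the flow described above looks like this (a sketch, not
the literal patch; the recycle condition is illustrative):

	rx_buffer = i40e_get_rx_buffer(rx_ring, size);
	/* pagecnt_bias was just decremented inside i40e_get_rx_buffer */

	if (can_recycle_as_is) {		/* illustrative condition */
		rx_buffer->pagecnt_bias++;	/* hand the reference back */
	} else {
		/* point at the placement for the next buffer */
		rx_buffer->page_offset += truesize;
	}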

Change-ID: Ibac576360cb7f0b1627f2a993d13c1a8a2bf60af
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-29 02:15:06 -07:00
Alexander Duyck 9a064128fc i40e/i40evf: Pull code for grabbing and syncing rx_buffer from fetch_buffer
This patch pulls the code responsible for fetching the Rx buffer and
synchronizing DMA into a function, specifically called i40e_get_rx_buffer.

The general idea is to allow for better code reuse by pulling this out of
i40e_fetch_rx_buffer.  We dropped a couple of prefetches, since the time
between the prefetch being issued and the data being accessed was too
small for them to be useful.
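
The extracted helper looks roughly like this (simplified from the patch):

	static struct i40e_rx_buffer *i40e_get_rx_buffer(struct i40e_ring *rx_ring,
							 const unsigned int size)
	{
		struct i40e_rx_buffer *rx_buffer;

		rx_buffer = &rx_ring->rx_bi[rx_ring->next_to_clean];
		prefetchw(rx_buffer->page);

		/* sync the used portion of the buffer for CPU access */
		dma_sync_single_range_for_cpu(rx_ring->dev, rx_buffer->dma,
					      rx_buffer->page_offset, size,
					      DMA_FROM_DEVICE);

		return rx_buffer;
	}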

Change-ID: I4885fce4b2637dbedc8e16431169d23d3d7e79b9
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-29 02:15:06 -07:00
Alexander Duyck d57c0e08c7 i40e/i40evf: Use length to determine if descriptor is done
This change makes it so that we use the length of the packet instead of
the DD status bit to determine whether a new descriptor is ready to be
processed.  The obvious advantage is that it cuts down on reads, since we
don't really need the DD bit at all: the size going from zero to a
non-zero value is enough to tell us that the packet has been completed.
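
In code form the check amounts to something like (sketch; the mask and
shift macros are the driver's existing ones):

	qword = le64_to_cpu(rx_desc->wb.qword1.status_error_len);
	size = (qword & I40E_RXD_QW1_LENGTH_PBUF_MASK) >>
	       I40E_RXD_QW1_LENGTH_PBUF_SHIFT;
	if (!size)
		break;	/* hardware hasn't written this descriptor back yet */

	/* don't read the rest of the descriptor until after the size
	 * field we just tested
	 */
	dma_rmb();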

Change-ID: Iebdf9cdb36c454ef092df27199b92ad09c374231
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-29 02:15:06 -07:00
Preethi Banala b1cb07db6e i40evf: enforce descriptor write-back mechanism for VF
The current driver mode is to use a write-back mechanism for the head
register which indicates transmit completions. The VF driver needs to be
able to work on hardware that exclusively uses descriptor write-back, so
change the default driver mode of operation to descriptor write-back for
VF. In our analysis, performance wasn't significantly different with
either write-back method.

Change-ID: Ia92e4ec77c2df8dc4515c71d53746d57d77759af
Signed-off-by: Preethi Banala <preethi.banala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-29 02:15:06 -07:00
Alexander Duyck a5b268e4b1 i40e/i40evf: Clean-up process_skb_fields
This is a minor clean-up to make the i40e/i40evf process_skb_fields
function look a little more like what we have in igb.  The Rx checksum
function called out a need for skb->protocol but I can't see where it
actually needs it.  I am assuming this is something that was likely
refactored out some time ago as the Rx checksum code has gone through a few
rewrites.

Change-ID: I0b4668a34d90b61b66ded7c7c26e19a3e2d06251
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-27 16:47:43 -07:00
Alexander Duyck 741b8b832a i40e/i40evf: Fix use after free in Rx cleanup path
We need to reset skb back to NULL when we have freed it in the Rx cleanup
path.  I found one spot where this wasn't occurring, so this patch fixes
it.
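
The shape of the fix is simply (sketch):

	dev_kfree_skb_any(skb);
	/* was missing: without this the next loop iteration would touch
	 * the freed skb
	 */
	skb = NULL;
	continue;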

Change-ID: Iaca68934200732cd4a63eb0bd83b539c95f8c4dd
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-27 16:45:14 -07:00
Alexander Duyck 1793668c3b i40e/i40evf: Update code to better handle incrementing page count
Update the driver code so that we do bulk updates of the page reference
count instead of incrementing it by one reference at a time.  The
advantage to doing this is that we cut down on atomic operations, which
in turn should give us a slight improvement in cycles per packet.  In
addition, if we eventually move this over to using build_skb the gains
will be more noticeable.

I also found and fixed a store forwarding stall where we were assigning
"*new_buff = *old_buff".  By breaking it up into individual field copies
we can avoid the stall, and as a result performance is slightly improved.
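
A sketch of both changes (the bulk reference count shown here follows the
ixgbe-style approach; the exact constant is illustrative):

	/* take a block of references once, track usage via pagecnt_bias */
	page_ref_add(page, USHRT_MAX - 1);
	bi->pagecnt_bias = USHRT_MAX;

	/* copy field by field instead of "*new_buff = *old_buff" to
	 * avoid the store forwarding stall
	 */
	new_buff->dma		= old_buff->dma;
	new_buff->page		= old_buff->page;
	new_buff->page_offset	= old_buff->page_offset;
	new_buff->pagecnt_bias	= old_buff->pagecnt_bias;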

Change-ID: I1d3880dece4133eca3c32423b04a5467321ccc52
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-27 16:45:13 -07:00
Alexander Duyck 59605bc096 i40e/i40evf: Add support for mapping pages with DMA attributes
This patch adds support for DMA_ATTR_SKIP_CPU_SYNC and
DMA_ATTR_WEAK_ORDERING.  By enabling both of these for the Rx path we
are able to see performance improvements on architectures that implement
either one, since page mapping and unmapping then only has to sync what
is actually being used instead of the entire buffer.  In addition,
enabling the weak ordering attribute provides a performance improvement
on architectures that can associate a memory ordering with a DMA buffer,
such as SPARC.
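
A sketch of the Rx page mapping with the attributes named above:

	#define I40E_RX_DMA_ATTR \
		(DMA_ATTR_SKIP_CPU_SYNC | DMA_ATTR_WEAK_ORDERING)

	dma = dma_map_page_attrs(rx_ring->dev, page, 0, PAGE_SIZE,
				 DMA_FROM_DEVICE, I40E_RX_DMA_ATTR);
	if (dma_mapping_error(rx_ring->dev, dma)) {
		__free_pages(page, 0);
		return false;
	}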

Change-ID: If176824e8231c5b24b8a5d55b339a6026738fc75
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-15 02:00:14 -07:00
Jacob Keller b9c015d421 i40e: mark the value passed to csum_replace_by_diff as __wsum
Fix, or rather, avoid a sparse warning caused by the fact that
csum_replace_by_diff expects to receive a __wsum value. Since the
calculation appears to work, simply typecast the passed paylen value to
__wsum to avoid the warning.

This seems pretty fishy since __wsum was obviously annotated as
a separate type on purpose, so this throws the entire calculation into
question. Since it currently appears to behave as expected, the typecast
is probably safe.
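
The change itself is essentially one cast, along the lines of:

	/* l4 is the driver's union of L4 header pointers */
	csum_replace_by_diff(&l4.tcp->check,
			     (__force __wsum)htonl(paylen));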

Change-ID: I4fdc5cddd589abc16098176e8a61127e761488f4
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-02-18 20:35:36 -08:00
Carolyn Wyborny 3c234c4709 i40e: Fix Adaptive ITR enabling
This patch fixes a bug introduced with the addition of the per queue
ITR feature support in ethtool.  With that addition, functions were
added which converted the ITR settings to binary values.  The IS_ENABLED
macros that run on those values check whether a bit is set or not, and
with the value being binary the check always returned ITR disabled,
which prevented any updating of the ITR rate.  This patch fixes the
problem by changing the functions to return the current ITR value
instead and renaming them to better reflect their function.  These
functions now provide a value which will be accurately assessed and
update the ITR as intended.

Change-ID: I14f1d088d052e27f652aaa3113e186415ddea1fc
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-02-18 20:35:36 -08:00
Scott Peterson 9b37c93731 i40e/i40evf: eliminate i40e_pull_tail()
Reorganize the i40e_pull_tail() logic, doing it in i40e_add_rx_frag()
where it's cheaper.  The igb driver does this the same way.

Also renames i40e_page_is_reserved() to reflect what it actually
tests.

Change-ID: Icd9cc507aae1fcdc02308b3a09034111b4c24071
Signed-off-by: Scott Peterson <scott.d.peterson@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-02-11 20:39:01 -08:00
Scott Peterson e72e56597b i40e/i40evf: Moves skb from i40e_rx_buffer to i40e_ring
This patch reduces the size of struct i40e_rx_buffer by one pointer,
and makes the i40e driver a little more consistent with the igb driver
in terms of packets that span buffers.

We do this by moving the skb field from struct i40e_rx_buffer to
struct i40e_ring. We pass the skb we already have (or NULL if we
don't) to i40e_fetch_rx_buffer(), which skips the skb allocation if we
already have one for this packet.
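
An illustrative subset of the two structures after the change (not the
full definitions):

	struct i40e_rx_buffer {
		dma_addr_t dma;
		struct page *page;
		unsigned int page_offset;
		/* struct sk_buff *skb;  -- removed by this patch */
	};

	struct i40e_ring {
		/* ...existing fields... */
		struct sk_buff *skb;	/* packet in progress, carried
					 * across clean_rx_irq calls */
	};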

Change-ID: I4ad48a531844494ba0c5d8e1a62209a057f661b0
Signed-off-by: Scott Peterson <scott.d.peterson@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-02-11 20:39:01 -08:00
Scott Peterson 7987dcd7b9 i40e/i40evf: Limit DMA sync of RX buffers to actual packet size
On packet RX, we perform a DMA sync for CPU before passing the
packet up.  Here we limit that sync to the actual length of the
incoming packet, rather than always syncing the entire buffer.
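
The sync then looks roughly like:

	dma_sync_single_range_for_cpu(rx_ring->dev, rx_buffer->dma,
				      rx_buffer->page_offset,
				      size,	/* packet length from the
						 * descriptor, not the full
						 * buffer length */
				      DMA_FROM_DEVICE);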

Change-ID: I626aaf6c37275a8ce9e81efcaa773f327b331487
Signed-off-by: Scott Peterson <scott.d.peterson@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-02-11 20:39:01 -08:00
Alexander Duyck 52ea3e8020 i40e: Quick refactor to start moving data off stack and into Tx buffer info
This patch does some quick work to pull some of the data off of the stack
and hopefully start storing it in the Tx buffer info section of the Tx
ring.  Ideally we should be moving away from having to store much of
anything on the stack and can just maintain it all in the descriptor rings.

Change-ID: I4b4715ea1920e122502482b3f9e56a9a6cb1e9fe
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-02-02 22:41:10 -08:00
Alexander Duyck 6beb84a73e i40e/i40evf: napi_poll must return the work done
Currently i40e_napi_poll() returns 0 when it has completely cleaned the
Rx rings, but this fouls budget accounting in the core code.

Fix this by returning the actual work done, capped at budget - 1, since
the core doesn't allow returning the full budget when the driver modifies
the NAPI status.
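
The tail of the poll routine then ends up roughly as:

	/* rings fully cleaned: re-enable interrupts and report the work
	 * done, capped at budget - 1 to keep core accounting correct
	 */
	i40e_update_enable_itr(vsi, q_vector);
	return min(work_done, budget - 1);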

This is based on a similar change that was made for the ixgbe driver by
Paolo Abeni.

Change-ID: Ic3d93ad2fa2fc8ce3164bc461e69367da0f9173b
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-12-06 20:46:13 -08:00
Alexander Duyck 1dc8b53879 i40e: Reorder logic for coalescing RS bits
This patch reorders the logic at the end of i40e_tx_map to address the
fact that the logic was rather convoluted and much larger than it needed
to be.

In order to try and coalesce the code paths I have updated some of the
comments and repurposed some of the variables in order to reduce
unnecessary overhead.

This patch does the following:
1.  Quit tracking skb->xmit_more with a flag, just max out packet_stride
2.  Drop tail_bump and do_rs and instead just use desc_count and td_cmd
3.  Pull comments from ixgbe that make the need for wmb() more explicit.

Change-ID: Ic7da85ec75043c634e87fef958109789bcc6317c
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-10-31 14:26:40 -07:00
Alexander Duyck 99dad8b34c i40e: Drop redundant Rx descriptor processing code
This patch cleans up several pieces of redundant code in the Rx clean-up
paths.

The first bit is that hdr_addr and status_err_len occupy the same portion
of the Rx descriptor, so they represent the same value.  As such there is
no point in zeroing the same location twice.  I'm dropping the second
spot where we are updating the value to 0 so that we only have 1 write
for this value instead of 2.

The second piece is the check for the DD bit in the packet.  We only need
to check for a non-zero value for status_err_len, because if the device
is done with the descriptor it will have written something back and the
DD bit is just one piece of it.  In addition I have moved the reading of
the Rx descriptor bits related to rx_ptype down so that they are below
the dma_rmb() call, so that we are guaranteed that we don't have any
funky 64b-on-32b reads causing ordering issues.

Change-ID: I256e44a025d3c64a7224aaaec37c852bfcb1871b
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-10-28 23:28:39 -07:00
Alan Brady 96db776a36 i40e/i40evf: fix interrupt affinity bug
There exists a bug in which a 'perfect storm' can occur and cause
interrupts to fail to be correctly affinitized. This causes unexpected
behavior and has a substantial impact on performance when it happens.

The bug occurs if there is heavy traffic, any number of CPUs that have
an i40e interrupt are pegged at 100%, and the interrupt affinity for
those CPUs is changed.  Instead of moving to the new CPU, the interrupt
continues to be polled while there is heavy traffic.

The bug is most readily realized as the driver is first brought up and
all interrupts start on CPU0. If there is heavy traffic and the
interrupt starts polling before the interrupt is affinitized, the
interrupt will be stuck on CPU0 until traffic stops. The bug, however,
can also be triggered more simply by affinitizing all the interrupts
to a single CPU and then attempting to move any of those interrupts off
while there is heavy traffic.

This patch fixes the bug by registering for update notifications from
the kernel when the interrupt affinity changes. When that fires, we
cache the intended affinity mask. Then, while polling, if the cpu is
pegged at 100% and we failed to clean the rings, we check to make sure
we have the correct affinity and stop polling if we're firing on the
wrong CPU.  When the kernel successfully moves the interrupt, it will
start polling on the correct CPU. The performance impact is minimal
since the only time this section gets executed is when performance is
already compromised by the CPU.
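
A sketch of the polling-side check (simplified; helper names follow the
driver where known):

	if (!clean_complete) {
		int cpu_id = smp_processor_id();

		/* rings not clean: only keep polling if we are still on
		 * a CPU in the cached affinity mask
		 */
		if (!cpumask_test_cpu(cpu_id, &q_vector->affinity_mask)) {
			napi_complete_done(napi, work_done);
			i40e_force_wb(vsi, q_vector);	/* re-fire the IRQ */
			return budget - 1;
		}
		return budget;	/* correct CPU: continue polling */
	}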

Change-ID: I4410a880159b9dba1f8297aa72bef36dca34e830
Signed-off-by: Alan Brady <alan.brady@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-10-28 23:28:39 -07:00
Jacob Keller 65e87c0398 i40evf: support queue-specific settings for interrupt moderation
In commit a75e8005d5 ("i40e: queue-specific settings for interrupt
moderation") the i40e driver gained support for setting interrupt
moderation values per queue. This patch adds support for this feature
to the i40evf driver as well. In addition, a few changes are made to
the i40e implementation to add function header documentation comments.

This behaves in a similar fashion to the implementation in i40e. Thus,
requesting the moderation value when no queue is provided will report
queue 0 value, while setting the value without a queue will set all
queues at once.

Change-ID: I1f310a57c8e6c84a8524c178d44d1b7a6d3a848e
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-09-24 22:50:23 -07:00
Alexander Duyck e486bdfd7c i40e/i40evf: Add txring_txq function to match fm10k and ixgbe
This patch adds a txring_txq function which allows us to convert an
i40e_ring/i40evf_ring to a netdev_tx_queue structure.  This way we can
avoid making a multi-line function call in all the spots that need
access to it.
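
The helper is essentially a one-liner:

	static inline struct netdev_queue *txring_txq(const struct i40e_ring *ring)
	{
		return netdev_get_tx_queue(ring->netdev, ring->queue_index);
	}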

Change-ID: Ic063b71d8b92ea406d2c32e798c8e2b02809d65b
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-09-24 22:22:17 -07:00
Alexander Duyck 64bfd68eae i40e: Fix Flow Director raw_buf cleanup
The Tx cleanup flow was incorrectly assuming it could check for the flow
director bits after it had unmapped the buffer.  However in this case it
results in us trying to free a raw_buf as though it is an sk_buff.

To fix this I am moving up the flag test for the FD_SB bit so that when
we find a non-NULL skb or raw_buf value we then check the flag and use
the appropriate call to free the buffer.
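
Sketched, the corrected cleanup order is (skb and raw_buf share storage,
so one non-NULL check covers both):

	if (tx_buffer->skb) {
		/* test the flag first, then pick the matching free call */
		if (tx_buffer->tx_flags & I40E_TX_FLAGS_FD_SB)
			kfree(tx_buffer->raw_buf);
		else
			dev_kfree_skb_any(tx_buffer->skb);
	}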

Change-ID: I6284034ba1ea87c9922e56f6eb3181f7f09bddde
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-09-24 22:15:12 -07:00
Alexander Duyck 841493a3f6 i40e: Limit TX descriptor count in cases where frag size is greater than 16K
The i40e driver was incorrectly assuming that we would always be pulling
no more than 1 descriptor from each fragment.  It is in fact possible to
end up with a case where 2 descriptors' worth of data are pulled when a
frame is larger than one of the pieces generated by aligning the payload
to 4K, or when pieces are smaller than 16K.

To adjust for this we just need to make certain to test all the way to
the end of the fragments, as it is possible for us to span 2 descriptors
in the block before us, so we need to guarantee that even the last 6
descriptors have enough data to fill a full frame.

Change-ID: Ic2ecb4d6b745f447d334e66c14002152f50e2f99
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-09-22 22:33:41 -07:00
Carolyn Wyborny ffeac83685 i40e: refactor tail_bump check
This patch refactors the tail bump check.

Change-ID: Ide0e19171d67d90cb2b06b8dcd4fa791ae120160
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-08-19 21:24:28 -07:00
David S. Miller da54bb13c0 Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Conflicts:
	drivers/net/ethernet/intel/i40e/i40e_main.c

Jeff Kirsher says:

====================
40GbE Intel Wired LAN Driver Updates 2016-07-22

This series contains updates to i40e and i40evf.

Heinrich Schuchardt found a possible null pointer being dereferenced in
i40e_debug_aq(), fixed the issue by doing the variable assignment after
we are sure the pointer is not null.

Avinash fixed an issue where, when link was down, we were not showing
the correct advertised link modes.

Mitch cleans up a useless initializer since the variable is assigned
right away.  Refactors the receive filter handling to properly track
filter adds and deletes so the driver will not lose filters during a
reset and up/down cycles.  Also added a tracking mechanism so that the
driver knows when to enter and leave promiscuous mode.

Catherine removes a device id which is not needed (or used).  Moves
a mutex lock since we need to lock the client list around the
i40e_client_release() call to prevent the release from interrupting
the client instances while they are being added.

Joshua adds Hyper-V specific VF device ids.

Amitoj Kaur Chawla cleans up a redundant memset() call before a memcpy().

Stefan Assmann adds the missing link advertise for some x710 NICs.

Tushar Dave fixes an issue found on SPARC, where a PF reset clears MAC
filters; if a platform-specific MAC address is used, the driver has to
explicitly write the default MAC address back to the MAC filters,
otherwise all incoming traffic destined for the default MAC address
will be dropped after reset.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25 10:43:07 -07:00
Mitch Williams 88dc9e6fed i40e/i40evf: remove useless initializer
This initializer isn't needed because the variable is assigned right
away.

Change-ID: I6ce3edb3f4e0364db248a7a0bcc62ca95c01d941
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-07-22 00:07:03 -07:00
Alexander Duyck 858296c878 i40e/i40evf: Fix i40e_rx_checksum
There are a couple of issues I found in i40e_rx_checksum while doing some
recent testing.  As a result I have found the Rx checksum logic is pretty
much broken and is reporting the checksum as valid for tunnels in cases
where it is not.

First, the inner types are not the correct values to use to test for
whether a tunnel is present.  In addition, the inner protocol types are
not a bitmask, so performing an OR of the values doesn't make sense.  I
have instead changed the code so that the inner protocol types are used
to determine if we report CHECKSUM_UNNECESSARY or not.  For anything that
does not end in UDP, TCP, or SCTP it doesn't make much sense to report a
checksum offload since it won't contain a checksum anyway.

This leaves us with the need to set the csum_level based on some value.
For that purpose I am using the tunnel_type field.  If the tunnel type is
GRENAT or greater then this means we have a GRE or UDP tunnel with an inner
header.  In the case of GRE or UDP we will have a possible checksum present
so for this reason it should be safe to set the csum_level to 1 to indicate
that we are reporting the state of the inner header.
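
The resulting logic, roughly as described:

	switch (decoded.inner_prot) {
	case I40E_RX_PTYPE_INNER_PROT_TCP:
	case I40E_RX_PTYPE_INNER_PROT_UDP:
	case I40E_RX_PTYPE_INNER_PROT_SCTP:
		skb->ip_summed = CHECKSUM_UNNECESSARY;
		/* fall through */
	default:
		break;
	}

	/* tunnel with an inner header: report that we validated the
	 * inner checksum
	 */
	if (decoded.tunnel_type >= I40E_RX_PTYPE_TUNNEL_IP_GRENAT)
		skb->csum_level = 1;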

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-07-14 23:17:45 -07:00
Alexander Duyck bf2d1df395 intel: Add support for IPv6 IP-in-IP offload
This patch adds support for offloading IPXIP6 type packets that represent
either IPv4 or IPv6 encapsulated inside of an IPv6 outer IP header.  In
addition with this change we should also be able to support FOU
encapsulated traffic with outer IPv6 headers.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-20 19:25:52 -04:00
Tom Herbert 7e13318daa net: define gso types for IPx over IPv4 and IPv6
This patch defines two new GSO definitions SKB_GSO_IPXIP4 and
SKB_GSO_IPXIP6 along with corresponding NETIF_F_GSO_IPXIP4 and
NETIF_F_GSO_IPXIP6. These are used to described IP in IP
tunnel and what the outer protocol is. The inner protocol
can be deduced from other GSO types (e.g. SKB_GSO_TCPV4 and
SKB_GSO_TCPV6). The GSO types of SKB_GSO_IPIP and SKB_GSO_SIT
are removed (these are both instances of SKB_GSO_IPXIP4).
SKB_GSO_IPXIP6 will be used when support for GSO with IP
encapsulation over IPv6 is added.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-20 18:03:15 -04:00
Jesse Brandeburg ab9ad98eb5 i40evf: refactor receive routine
This is part 2 of the Rx refactor series, just including
changes to i40evf.

This refactor aligns the receive routine with the one in ixgbe, which
was highly optimized.  This reduces the code we have to maintain and
allows for a (hopefully) more readable and maintainable Rx hot path.

In order to do this:
- consolidate the receive path into a single function that doesn't
  use packet split but *does* use pages for Rx buffers.
- remove the old _1buf routine
- consolidate several routines into helper functions
- remove VF ethtool control over packet split
- remove priv_flags interface since it is unused

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-05-05 22:42:58 -07:00
Jesse Brandeburg 19b85e677d i40evf: Drop packet split receive routine
As part of preparation for the rx-refactor, remove the
packet split receive routine and ancillary code.

Some of the split related context set up code stays in
i40e_virtchnl_pf.c in case an older VF driver tries to load
and still wants to use packet split.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-05-05 22:31:23 -07:00
Jesse Brandeburg f8a952cb40 i40e/i40evf: Refactor tunnel interpretation
Refactor the interpretation of a tunnel.  This removes
some code and lets us start using the hardware's parsing.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-05-05 18:24:10 -07:00
Alexander Duyck 1c7b4a23d1 i40e/i40evf: Add support for GSO partial with UDP_TUNNEL_CSUM and GRE_CSUM
This patch makes it so that i40e and i40evf can use GSO_PARTIAL to support
segmentation for frames with checksums enabled in outer headers.  As a
result we can now send data over these types of tunnels at over 20Gb/s
versus the 12Gb/s that was previously possible on my system.

The advantage with the i40e parts is that this offload is mostly
transparent as the hardware still deals with the inner and/or outer IPv4
headers so the IP ID is still incrementing for both when this offload is
performed.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-05-01 17:05:16 -07:00
Jesse Brandeburg a149f2c323 i40e/i40evf: Only offload VLAN tag if enabled
The driver was offloading the VLAN tag into the skb any time there was
a VLAN tag, regardless of whether hardware stripping was enabled.  Just
check to make sure it's enabled before put_tag.
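
The check amounts to (sketch):

	if ((rx_ring->netdev->features & NETIF_F_HW_VLAN_CTAG_RX) &&
	    (vlan_tag & VLAN_VID_MASK))
		__vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q), vlan_tag);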

Change-Id: Ife95290c06edd9a616393b38679923938b382241
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-04-27 13:05:47 -07:00
Alexander Duyck 577389a5db i40e/i40evf: Add support for IPIP and SIT offloads
Looking over the documentation it turns out enabling IPIP and SIT offloads
for i40e is pretty straightforward.  As such I decided to enable them with
this patch.  In my testing I am seeing an improvement of 8 to 10 Gb/s
for IPIP and SIT tunnels with this offload enabled.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-04-26 03:29:49 -07:00
David S. Miller 1602f49b58 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts were two cases of simple overlapping changes,
nothing serious.

In the UDP case, we need to add a hlist_add_tail_rcu()
to linux/rculist.h, because we've moved UDP socket handling
away from using nulls lists.

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-23 18:51:33 -04:00
Alexander Duyck 3f3f7cb875 i40e/i40evf: Limit TSO to 7 descriptors for payload instead of 8 per packet
This patch addresses a bug introduced based on my interpretation of the
XL710 datasheet.  Specifically section 8.4.1 states that "A single transmit
packet may span up to 8 buffers (up to 8 data descriptors per packet
including both the header and payload buffers)."  It then later goes on to
say that each segment for a TSO obeys the previous rule, however it then
refers to TSO header and the segment payload buffers.

I believe the actual limit for fragments with TSO and a skbuff that has
payload data in the header portion of the buffer is actually only 7
fragments as the skb->data portion counts as 2 buffers, one for the TSO
header, and one for a segment payload buffer.

Fixes: 2d37490b82 ("i40e/i40evf: Rewrite logic for 8 descriptor per packet check")
Reported-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-04-13 19:59:23 -07:00
Jesse Brandeburg 1f15d66712 i40e/i40evf: Faster RX via avoiding FCoE
As it turns out, calling into other files from the hot path hurts
performance a lot.  In this case, the majority of the time we call
"check FCoE" the packet is *not* FCoE, yet the call was taking 5% of
our total cycles spent on receive.

Change-ID: I080552c26e7060bc7b78504dc2763f6f0b3d8c76
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-04-06 18:26:23 -07:00
Jesse Brandeburg 84b079928a i40e/i40evf: Drop unused tx_ring argument
Some of the tx_ring arguments can be deleted since they are not used.

Change-ID: I99275b0f191d7f63ec2f05061919904940c36f31
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-04-06 18:15:17 -07:00
Jesse Brandeburg d1bd743b5b i40e/i40evf: Move stack var deeper
A local variable could be moved down into the context where it is used.

Change-ID: I9caba9e1eacf921037077f2665cbce83fd8e95d6
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-04-06 18:10:57 -07:00
Alexander Duyck 24d41e5e2c i40e/i40evf: Fix TSO checksum pseudo-header adjustment
With IPv4 and IPv6 now using the same format for checksums based on the
length of the frame we need to update the i40e and i40evf drivers so that
they correctly account for lengths greater than or equal to 64K.

With this patch the driver should now correctly update checksums for frames
up to 16776960 in length which should be more than large enough for all
possible TSO frames in the near future.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-04-05 20:34:51 -07:00
Jesse Brandeburg 4ea623922d i40e/i40evf: Fix casting in transmit code
Simple cast to fix a sparse warning.

Fixes: commit 5453205cd0 ("i40e/i40evf: Enable support for
SKB_GSO_UDP_TUNNEL_CSUM")

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-04-05 02:05:34 -07:00
Alexander Duyck a619afe814 i40e/i40evf: Add support for bulk free in Tx cleanup
This patch enables bulk Tx clean for skbs.  In order to enable it we need
to pass the napi_budget value as that is used to determine if we are truly
running in NAPI mode or if we are simply calling the routine from netpoll
with a budget of 0.  In order to avoid adding too many more variables I
thought it best to pass the VSI directly in a fashion similar to what we do
on igb and ixgbe with the q_vector.
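
The key call is napi_consume_skb(), which batches frees under real NAPI
and falls back to an immediate free when the budget is 0 (netpoll):

	/* napi_budget == 0 means netpoll; napi_consume_skb handles both */
	napi_consume_skb(tx_buf->skb, napi_budget);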

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-04-05 01:58:53 -07:00
Alexander Duyck f2edaaaa39 i40e/i40evf: Fix handling of boolean logic in polling routines
In the polling routines for i40e and i40evf we were using bitwise operators
to avoid the side effects of the logical operators, specifically the fact
that if the first case is true with "||" we skip the second case, or if it
is false with "&&" we skip the second case.  This fixes an earlier patch
that converted the bitwise operators over to the logical operators and
instead replaces the entire thing with just an if statement since it should
be more readable what we are trying to do this way.

Fixes: 1a36d7fadd ("i40e/i40evf: use logical operators, not bitwise")
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-04-05 01:54:12 -07:00
Alexander Duyck 5c4654daf2 i40e/i40evf: Allow up to 12K bytes of data per Tx descriptor instead of 8K
From what I can tell the practical limitation on the size of the Tx data
buffer is the fact that the Tx descriptor is limited to 14 bits.  As such
we cannot use 16K as is typically used on the other Intel drivers.  However
artificially limiting ourselves to 8K can be expensive as this means that
we will consume up to 10 descriptors (1 context, 1 for header, and 9 for
payload, non-8K aligned) in a single send.

I propose that we can reduce this by increasing the maximum data for a 4K
aligned block to 12K.  We can reduce the descriptors used for a 32K aligned
block by 1 by increasing the size like this.  In addition we still have the
4K - 1 of space that is still unused.  We can use this as a bit of extra
padding when dealing with data that is not aligned to 4K.
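
In macro form the limits described above come out roughly as:

	#define I40E_MAX_READ_REQ_SIZE	4096
	#define I40E_MAX_DATA_PER_TXD	(16 * 1024 - 1)	/* 14-bit size field */
	#define I40E_MAX_DATA_PER_TXD_ALIGNED \
		(I40E_MAX_DATA_PER_TXD & ~(I40E_MAX_READ_REQ_SIZE - 1)) /* 12K */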

By aligning the descriptors after the first to 4K we can improve the
efficiency of PCIe accesses, as we can avoid using byte enables and can
fetch full TLP transactions after the first fetch of the buffer.  Below
are the results of testing before and after with this patch:

Recv   Send   Send                         Utilization      Service Demand
Socket Socket Message  Elapsed             Send     Recv    Send    Recv
Size   Size   Size     Time    Throughput  local    remote  local   remote
bytes  bytes  bytes    secs.   10^6bits/s  % S      % U     us/KB   us/KB
Before:
87380  16384  16384    10.00     33682.24  20.27    -1.00   0.592   -1.00
After:
87380  16384  16384    10.00     34204.08  20.54    -1.00   0.590   -1.00

So the net result of this patch is that we have a small gain in throughput
due to a reduction in overhead for putting together the frame.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-04-05 01:28:44 -07:00
Alexander Duyck 3bc67973e8 i40e/i40evf: Move Tx checksum closer to TSO
On all of the other Intel drivers we place checksum handling close to
TSO, as the two have a significant amount in common and doing so helps to
reduce the decision tree for how to handle the frame: the first check in
TSO is whether checksumming is offloaded, and if it is not we can skip
_BOTH_ TSO and Tx checksum offload based on a single check.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-02-18 23:30:19 -08:00
Alexander Duyck 2d37490b82 i40e/i40evf: Rewrite logic for 8 descriptor per packet check
This patch is meant to rewrite the logic for how we determine if we can
transmit the frame or if it needs to be linearized.

The previous code for this function was using a mix of division and
modulus as part of computing whether we need to take the slow path.
Instead I have replaced this with a simple sliding window which tells us
whether the frame could cause a single packet to span more descriptors
than we can handle.

The logic for the scan is fairly simple.  If any given group of 6 fragments
is less than gso_size - 1 then it is possible for us to have one byte
coming out of the first fragment, 6 fragments, and one or more bytes coming
out of the last fragment.  This gives us a total of 8 fragments
which exceeds what we can allow so we send such frames to be linearized.

Arguably the use of modulus might be more exact as the approach I propose
may generate some false positives.  However the likelihood of us taking much
of a hit for those false positives is fairly low, and I would rather not
add more overhead in the case where we are receiving a frame composed of 4K
pages.
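
A simplified sketch of the sliding-window idea (the real code walks the
window in a single pass; this version re-sums each window for clarity):

	static bool chk_linearize_sketch(const struct sk_buff *skb)
	{
		unsigned int gso_size = skb_shinfo(skb)->gso_size;
		unsigned int nr_frags = skb_shinfo(skb)->nr_frags;
		unsigned int i, j, sum;

		if (!gso_size || nr_frags < 7)
			return false;	/* cannot exceed the limit */

		/* if any 6 consecutive frags sum to less than gso_size - 1,
		 * one segment could touch 8 descriptors: a byte from the
		 * frag before, the 6 frags, and a byte from the frag after
		 */
		for (i = 0; i + 6 <= nr_frags; i++) {
			sum = 0;
			for (j = i; j < i + 6; j++)
				sum += skb_frag_size(&skb_shinfo(skb)->frags[j]);
			if (sum < gso_size - 1)
				return true;	/* needs skb_linearize() */
		}
		return false;
	}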

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-02-18 23:27:05 -08:00
Alexander Duyck 4ec441df25 i40e/i40evf: Break up xmit_descriptor_count from maybe_stop_tx
In an upcoming patch I would like to have access to the descriptor count
used for the data portion of the frame.  For this reason I am splitting up
the descriptor count function from the function that stops the ring.

Also in order to try and reduce unnecessary duplication of code I am moving
the slow-path portions of the code out of being inline calls so that we can
just jump to them and process them instead of having to build them into
each function that calls them.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-02-18 23:23:51 -08:00
Alexander Duyck 5453205cd0 i40e/i40evf: Enable support for SKB_GSO_UDP_TUNNEL_CSUM
The XL722 has support for providing the outer UDP tunnel checksum on
transmits.  Make use of this feature to support segmenting UDP tunnels with
outer checksums enabled.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-02-18 14:46:23 -08:00