Commit Graph

58 Commits (2dba07975acbc30c77a22d38030ee5a86ab8a748)

Author SHA1 Message Date
Ilan Tayari 2ac9cfe782 net/mlx5e: IPSec, Add Innova IPSec offload TX data path
In the TX data path, prepend a special metadata ethertype which
instructs the hardware to perform cryptography.

In addition, fill the Software Parser segment in the TX descriptor so
that the hardware can parse the ESP protocol and perform TX checksum
offload on the inner payload.

Support GSO by providing the inverse of gso_size in the metadata.
This allows the FPGA to update the ESP header (seqno and seqiv) on the
resulting packets, by deriving each packet's number within the GSO
batch from the TCP sequence number.

Note that for GSO SKBs, the stack does not include an ESP trailer,
unlike the non-GSO case.

Signed-off-by: Ilan Tayari <ilant@mellanox.com>
Signed-off-by: Yossi Kuperman <yossiku@mellanox.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@mellanox.com>
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-27 16:36:48 +03:00
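A minimal sketch of the metadata prepend described in the commit above; the struct layout, the 0x8CE4 ethertype value, and the helper name are illustrative assumptions, not the driver's exact definitions.

#include <linux/err.h>
#include <linux/etherdevice.h>
#include <linux/skbuff.h>

#define MLX5E_METADATA_ETHER_TYPE 0x8CE4	/* value assumed for this sketch */

struct mlx5e_ipsec_metadata {
	unsigned char syndrome;
	unsigned char raw[5];	/* e.g. carries the inverse gso_size for GSO skbs */
	__be16 ethertype;	/* lines up with the packet's original ethertype */
} __packed;

/* Prepend the metadata "ethertype" that tells the HW to apply crypto. */
static struct mlx5e_ipsec_metadata *mlx5e_ipsec_add_metadata(struct sk_buff *skb)
{
	struct mlx5e_ipsec_metadata *mdata;
	struct ethhdr *eth;

	if (unlikely(skb_cow_head(skb, sizeof(*mdata))))
		return ERR_PTR(-ENOMEM);

	eth = (struct ethhdr *)skb_push(skb, sizeof(*mdata));
	skb->mac_header -= sizeof(*mdata);
	mdata = (struct mlx5e_ipsec_metadata *)(eth + 1);

	/* Slide the MAC addresses to the new front; the 8-byte metadata
	 * ends exactly on the original ethertype, which stays in place.
	 */
	memmove(skb->data, skb->data + sizeof(*mdata), 2 * ETH_ALEN);
	eth->h_proto = cpu_to_be16(MLX5E_METADATA_ETHER_TYPE);

	memset(mdata->raw, 0, sizeof(mdata->raw));
	return mdata;
}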
Ilan Tayari 4b67379376 net/mlx5: Make get_cqe routine not ethernet-specific
Move mlx5e_get_cqe routine to wq.h and rename it to
mlx5_cqwq_get_cqe.

This allows it to be used by other CQ users outside of the
ethernet driver code.

A later patch in this patchset will make use of it from
FPGA code for the FPGA high-speed connection.

Signed-off-by: Ilan Tayari <ilant@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-27 16:36:47 +03:00
Erez Shitrit 4ec5cf781b net/mlx5e: IPoIB, Get more TX statistics
Add missing counters (bytes, packets, gso, xmit_more) in the TX flow
for IPoIB traffic.

Fixes: 58545449b7b ("net/mlx5e: IPoIB, Xmit flow")
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-19 18:40:20 +03:00
Saeed Mahameed 4301ba7b3e net/mlx5e: IPoIB, Move to a separate directory
The IPoIB netdevice driver was introduced only in the previous kernel
release, and it is growing in terms of features and LOC; move it to a
separate directory.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-19 18:40:20 +03:00
Or Gerlitz e53eef6350 net/mlx5: Align to match opening parenthesis
Fixed checkpatch complaints of the form:

 CHECK: Alignment should match open parenthesis

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-16 00:12:40 +03:00
Stephen Hemminger 8bf3198a5e mlx5: fix warning about missing prototype
Fix sparse warnings about missing prototypes. The RX/TX code paths
define functions whose prototypes are in ipoib.h.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-04-22 20:26:42 +03:00
Saeed Mahameed 258545449b net/mlx5e: IPoIB, Xmit flow
Implement mlx5e's IPoIB SKB transmit using the helper functions
provided by the mlx5e ethernet TX flow. The only difference between
mlx5e_xmit and mlx5i_xmit is that IPoIB has some extra fields to fill
in the TX descriptor (WQE), namely the UD datagram segment, and it
doesn't need any vlan handling.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-17 11:08:31 -04:00
Saeed Mahameed 77bdf8950b net/mlx5e: Xmit flow break down
Break current mlx5e xmit flow into smaller blocks (helper functions)
in order to reuse them for IPoIB SKB transmission.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-17 11:08:31 -04:00
Saeed Mahameed 6a9764efb2 net/mlx5e: Isolate open_channels from priv->params
In order to have a clean separation between the channel resource
creation flows and the currently active mlx5e netdev parameters, make
sure each resource creation function does not access priv->params and
works only on a fresh set of parameters.

For this we add a new mlx5e_params field to the mlx5e_channels
structure (see the sketch below) and pass it down to
mlx5e_open_{cq,rq,sq} and so on.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
2017-03-27 15:08:18 +03:00
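The shape of the separation, roughly (field names are assumed): the channels object carries its own snapshot of the parameters it was built from, so creation code never reaches back into priv->params.

struct mlx5e_channels {
	struct mlx5e_channel **c;	/* the channel array itself */
	unsigned int           num;
	struct mlx5e_params    params;	/* the fresh set these were built from */
};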
Saeed Mahameed acc6c5953a net/mlx5e: Split open/close channels to stages
As a foundation for a safe config flow, introduce a simple, clear API
(open then activate), where "open" handles the heavy, unsafe creation
operations and "activate" is fast and fail-safe, enabling the newly
created channels.

For this we split the RQ/TXQ-SQ and channel open/close flows into
open => activate and deactivate => close.

This will simplify the ability to have fail safe configuration changes
in downstream patches as follows:

int make_new_config(struct mlx5e_params *new_params)
{
	old_channels = current_active_channels;

	new_channels = create_channels(new_params);
	if (!new_channels)
		return -ENOMEM;	/* failed, but current channels are still active :) */

	deactivate_channels(old_channels);	/* can't fail */
	activate_channels(new_channels);	/* can't fail */
	close_channels(old_channels);
	current_active_channels = new_channels;

	return 0;	/* success */
}

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
2017-03-27 15:08:17 +03:00
Saeed Mahameed 3139104861 net/mlx5e: Different SQ types
The different SQ types (tx, xdp, ico) are growing apart; separate them
and remove the unwanted parts of each one, to simplify the data path
and make better use of the data cache.

Remove the DB union from the SQ structures, since it is no longer
needed now that each SQ type has its own data type.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 19:11:46 -07:00
Saeed Mahameed 864b2d7153 net/mlx5e: Generalize tx helper functions for different SQ types
The next patches will introduce different SQ types. In preparation,
generalize some TX helper functions to work with more basic SQ
parameters, so that they can be reused by the different SQ types.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 19:11:46 -07:00
Saeed Mahameed 1c4bf94045 net/mlx5e: Move XDP completion functions to rx file
XDP code belongs to the RX path; move mlx5e_poll_xdp_tx_cq and
mlx5e_free_xdp_tx_descs to en_rx.c.

Rename them to mlx5e_poll_xdpsq_cq and mlx5e_free_xdpsq_descs.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 19:11:45 -07:00
Saeed Mahameed 6982ab6097 net/mlx5e: Xmit, no write combining
mlx5e netdev Blue Flame (write combining) support demands a lot of
overhead for a small latency gain in some special cases, and this
overhead hurts the common case.

Here we remove xmit Blue Flame support by creating all bfregs without
write combining for all SQs, and we remove a lot of BF logic and
conditions from the xmit data path.

Simplify mlx5e_tx_notify_hw (the doorbell function) by removing the
BF-related code and the memory barrier needed for WC-mapped SQ
doorbell buffers, which no longer exist.

Performance improvement:
System: Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz

Test case                   Before      After     Improvement
---------------------------------------------------------------
TX packets (24 threads)     50Mpps      54Mpps    8%

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 19:11:45 -07:00
Gal Pressman d3a4e4da54 net/mlx5e: Count GSO packets correctly
TX packet statistics (the 'tx_packets' counter) used to count a GSO
packet as one, even though it contains multiple segments.
This patch increments the counter by the number of segments, aligning
the driver with the behavior of other drivers in the stack.

Note that no information is lost by this patch, thanks to the existing
'tx_tso_packets' counter.

Before, ethtool showed:
$ ethtool -S ens6 | egrep "tx_packets|tx_tso_packets"
     tx_packets: 61340
     tx_tso_packets: 60954
     tx_packets_phy: 2451115

Now, we will see the more logical statistics:
$ ethtool -S ens6 | egrep "tx_packets|tx_tso_packets"
     tx_packets: 2451115
     tx_tso_packets: 60954
     tx_packets_phy: 2451115

Fixes: e586b3b0ba ("net/mlx5: Ethernet Datapath files")
Signed-off-by: Gal Pressman <galp@mellanox.com>
Cc: kernel-team@fb.com
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-22 12:11:13 -07:00
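A sketch of the counting change described above (the stats structure and field names are assumed): a GSO skb now accounts for all of its segments.

static inline void mlx5e_tx_count_packets(struct mlx5e_sq_stats *stats,
					  struct sk_buff *skb)
{
	if (skb_is_gso(skb)) {
		/* one GSO skb becomes gso_segs packets on the wire */
		stats->packets += skb_shinfo(skb)->gso_segs;
		stats->tso_packets++;	/* feeds 'tx_tso_packets' */
	} else {
		stats->packets++;
	}
}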
Saeed Mahameed a6f402e499 net/mlx5e: Tx, no inline copy on ConnectX-5
ConnectX-5 and later HW generations report min inline mode ==
MLX5_INLINE_MODE_NONE, which means the driver is not required to copy
packet headers into the inline fields of the TX WQE.

When inline is not required, vlan insertion is handled in the TX
descriptor rather than by an inline copy.

For the LSO case the driver is still required to copy the headers, so
the HW can duplicate them onto the wire.

This will improve CPU utilization and boost TX performance.

Tested with pktgen burst single flow:
CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
HCA: Mellanox Technologies MT28800 Family [ConnectX-5 Ex]

Before: 15.1Mpps
After:  17.2Mpps
Improvement: 14%

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
2017-02-06 18:20:17 +02:00
Saeed Mahameed 2b31f7ae5f net/mlx5: TX WQE update
Add new TX WQE fields for ConnectX-5 vlan insertion support: type and
vlan_tci. When type = MLX5_ETH_WQE_INSERT_VLAN, the HW inserts the
vlan and prio fields (vlan_tci) into the packet.

Those bits and the inline header fields are mutually exclusive, and
valid only when:
MLX5_CAP_ETH(mdev, wqe_inline_mode) == MLX5_CAP_INLINE_MODE_NOT_REQUIRED
and MLX5_CAP_ETH(mdev, wqe_vlan_insert),
which will be set in ConnectX-5 and later HW generations.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
2017-02-06 18:20:16 +02:00
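Roughly, the descriptor's inline-header words gain a second interpretation; this layout sketch only approximates the upstream struct.

struct mlx5_wqe_eth_seg {
	u8	rsvd0[4];
	u8	cs_flags;	/* checksum offload flags */
	u8	rsvd1;
	__be16	mss;
	__be32	rsvd2;
	union {
		struct {
			__be16 sz;	/* inline header size */
			u8     start[2];
		} inline_hdr;
		struct {
			__be16 type;		/* MLX5_ETH_WQE_INSERT_VLAN */
			__be16 vlan_tci;	/* vlan id + prio to insert */
		} insert;
	};
};

On xmit, when inline is not required, the driver would then set
eseg->insert.type = cpu_to_be16(MLX5_ETH_WQE_INSERT_VLAN) and
eseg->insert.vlan_tci = cpu_to_be16(skb_vlan_tag_get(skb)).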
Mohamad Haj Yahia c0f1147d14 net/mlx5e: Change the SQ/RQ operational state to positive logic
With the negative logic (i.e. a FLUSH state), after an RQ/SQ reopen
there is a time interval in which the RQ/SQ is not really ready, yet
the state claims it is not in FLUSH state, because the initial SQ/RQ
struct memory starts out as zeros.
Change the state to indicate whether the SQ/RQ is opened, and set the
READY state only after all the SQ/RQ resources have been prepared.

Fixes: 6e8dd6d6f4 ("net/mlx5e: Don't wait for SQ completions on close")
Fixes: f2fde18c52 ("net/mlx5e: Don't wait for RQ completions on close")
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-06 11:44:45 -05:00
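The flip in sketch form (the bit name is assumed): a freshly zeroed SQ/RQ struct now reads as "not ready" instead of accidentally reading as "not flushing".

enum {
	MLX5E_SQ_STATE_ENABLED,	/* set only once all SQ resources are ready */
};

static inline bool mlx5e_sq_is_ready(const unsigned long *state)
{
	/* kzalloc'ed state starts as 0 == not enabled, which is safe */
	return test_bit(MLX5E_SQ_STATE_ENABLED, state);
}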
Saeed Mahameed b5503b994e net/mlx5e: XDP TX forwarding support
Add support for XDP_TX forwarding from XDP programs. Using XDP, the
user can now loop packets back out of the same port.

We create a dedicated TX SQ for each channel, which serves XDP
programs that return the XDP_TX action by looping packets back to the
wire directly from the channel's RQ RX path.

For that, RX pages now need to be mapped bi-directionally; on an
XDP_TX action we sync the page back to the device and then queue it on
the SQ for transmission.  The XDP xmit frame function reports back to
the RX path whether the page was consumed (transmitted); if so, the RX
path forgets about that page as if it were released to the stack.
Later, on XDP TX completion, the page is released back to the page
cache.

For simplicity, this patch hits a doorbell on every XDP TX packet.

The next patch will introduce an xmit_more-like mechanism that queues
up more than one packet into the SQ without notifying the hardware;
once the RX napi loop is done, we hit the doorbell once for all the
XDP TX packets from that loop.  This should drastically improve XDP TX
performance.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-22 02:51:41 -04:00
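A sketch of the XDP_TX hand-off described above; mlx5e_sq_has_room_for(), mlx5e_build_xdp_wqe(), mlx5e_tx_notify_hw() and the field names are assumptions for illustration.

static bool mlx5e_xmit_xdp_frame(struct mlx5e_sq *sq,
				 struct mlx5e_dma_info *di,
				 unsigned int data_offset,
				 unsigned int len)
{
	if (unlikely(!mlx5e_sq_has_room_for(sq, 1)))
		return false;	/* not consumed: RX keeps the page */

	/* RX pages are mapped DMA_BIDIRECTIONAL, so data the XDP program
	 * may have rewritten can be handed back to the device.
	 */
	dma_sync_single_for_device(sq->pdev, di->addr + data_offset, len,
				   DMA_BIDIRECTIONAL);

	mlx5e_build_xdp_wqe(sq, di, data_offset, len);
	mlx5e_tx_notify_hw(sq);	/* one doorbell per packet, for now */

	return true;	/* consumed: page is released on XDP TX completion */
}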
Saeed Mahameed f10b7cc770 net/mlx5e: Have a clear separation between different SQ types
Make a clear separation between regular SQ (TXQ) and ICO SQ creation
and destruction, and move their mutual information structures into a
union.

Don't allocate the redundant TXQ skb/wqe_info/dma_fifo arrays for the
ICO SQ. And give the ICO SQ a different SQ edge than the TXQ SQ, to be
more accurate.

In preparation for XDP TX support.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-22 02:51:41 -04:00
Tariq Toukan 0dbf657c39 net/mlx5e: Fix xmit_more counter race issue
Update the xmit_more counter before notifying the HW,
to prevent a possible use-after-free of the skb.

Fixes: c8cf78fe10 ("net/mlx5e: Add ethtool counter for TX xmit_more")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-08 16:15:28 -07:00
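The ordering constraint, sketched (helper and field names assumed): once the doorbell is rung, a completion on another CPU may free the skb, so anything read from the skb must happen first.

static void mlx5e_tx_finish(struct mlx5e_sq *sq, struct sk_buff *skb,
			    struct mlx5_wqe_ctrl_seg *cseg)
{
	sq->stats.xmit_more += skb->xmit_more;	/* read skb BEFORE the doorbell */

	if (!skb->xmit_more || netif_xmit_stopped(sq->txq))
		mlx5e_notify_hw(sq, cseg);
	/* the skb may already be freed here - don't touch it again */
}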
Tariq Toukan c8cf78fe10 net/mlx5e: Add ethtool counter for TX xmit_more
Add a counter in ethtool for the number of times that
TX xmit_more was used.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-28 23:24:15 -04:00
Saeed Mahameed 6e8dd6d6f4 net/mlx5e: Don't wait for SQ completions on close
Instead of asking the firmware to flush the SQ (Send Queue) via
asynchronous completions when it is moved to error, handle the SQ
flush manually (mlx5e_free_tx_descs), just as we did when the SQ flush
timed out or on tx_timeout.

This reduces the SQ flush time and speeds up the interface down
procedure.

Move mlx5e_free_tx_descs to the end of en_tx.c for TX critical-code
locality.

Fixes: 29429f3300 ('net/mlx5e: Timeout if SQ doesn't flush during close')
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-28 23:24:15 -04:00
Hadar Hen Zion ae76715d15 net/mlx5e: Check the minimum inline header mode before xmit
Each send queue (SQ) has an inline mode that defines the minimal
required inline headers in the SQ WQE.
Before sending each packet, check that the minimum required headers
are copied into the WQE.

Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25 17:53:40 -07:00
Rana Shahout 7ccdd0841b net/mlx5e: Fix select queue callback
The default fallback function used by mlx5e's select queue can return
any TX queue in the range [0..dev->num_real_tx_queues).

The current implementation assumes that the fallback function returns
a number in the range [0..number of channels). In fact,
dev->num_real_tx_queues = (number of channels) * dev->num_tc,
which exceeds the expected range when num_tc is configured, and could
lead to crashes.

To fix this: if num_tc is not configured, we can safely return the
fallback suggestion; otherwise we reciprocal_scale the fallback result
to normalize it into the desired range (see the sketch below).

Fixes: 08fb1dacdd ('net/mlx5e: Support DCBNL IEEE ETS')
Signed-off-by: Rana Shahout <ranas@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reported-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 06:12:04 -04:00
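The fix, approximately (the priv fields and the mapping table are assumed names); reciprocal_scale() maps an arbitrary u32 into [0, n) without a division.

static u16 mlx5e_select_queue_sketch(struct net_device *dev,
				     struct sk_buff *skb,
				     select_queue_fallback_t fallback)
{
	struct mlx5e_priv *priv = netdev_priv(dev);
	int channel_ix = fallback(dev, skb);	/* in [0, num_real_tx_queues) */
	int up = skb_vlan_tag_present(skb) ?
		 skb->vlan_tci >> VLAN_PRIO_SHIFT : 0;

	if (priv->params.num_tc == 1)
		return channel_ix;	/* both ranges coincide: safe as-is */

	/* normalize back into [0, num_channels) before the tc lookup */
	channel_ix = reciprocal_scale(channel_ix, priv->params.num_channels);

	return priv->channeltc_to_txq_map[channel_ix][up];
}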
Matthew Finlay e3a19b53cb net/mlx5e: Copy all L2 headers into inline segment
ConnectX-4 Lx uses an inline WQE mode that currently defaults to
requiring that the entire L2 header be included in the WQE.
This patch fixes mlx5e_get_inline_hdr_size() to account for all L2
headers (VLAN, QinQ, etc.) using skb_network_offset(skb).

Fixes: e586b3b0ba ("net/mlx5: Ethernet Datapath files")
Signed-off-by: Matthew Finlay <matt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 06:12:04 -04:00
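The gist of the fix (sq->min_inline is an assumed field name): inline up to the start of the network header instead of a fixed ETH_HLEN, so stacked VLAN/QinQ tags are always covered.

static inline u16 mlx5e_get_inline_hdr_size(struct mlx5e_sq *sq,
					    struct sk_buff *skb)
{
	/* skb_network_offset() spans ETH_HLEN plus any VLAN/QinQ tags */
	return max_t(u16, sq->min_inline, skb_network_offset(skb));
}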
Daniel Jurgens 29429f3300 net/mlx5e: Timeout if SQ doesn't flush during close
Avoid an infinite loop by timing out waiting for the SQ to flush. Also
clean up the TX descriptors if that happens.

Fixes: f62b8bb8f2 ('net/mlx5: Extend mlx5_core to support ConnectX-4 Ethernet functionality')
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 06:12:03 -04:00
Gal Pressman bfe6d8d1d4 net/mlx5e: Reorganize ethtool statistics
Categorize and reorganize the ethtool statistics counters by renaming
them to "rx_*" and "tx_*" and removing redundant and duplicated
counters; this way they are easier to grasp and more user friendly.

Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 04:28:47 -04:00
Eli Cohen 0ca00fc1f8 net/mlx5e: Fix blue flame quota logic
Blue flame is a latency enhancement feature that allows the driver to
write the packet data directly to the NIC's registers, thus making the
read of the packet data from host memory redundant.

We maintain a quota for blue flame, which is reloaded whenever we
identify that the hardware is processing send requests fast enough
that, by the time we post the next send request, it has already
processed all the pending ones. This indicates that the hardware can
process more blue flame requests efficiently. The blue flame quota is
decremented whenever we send using blue flame.

The current code erroneously clears the quota if we did not use blue
flame for the current post-send operation; we fix it here (see the
sketch below).

Fixes: 88a85f99e5 ('net/mlx5e: TX latency optimization to save DMA reads')
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-09 22:06:27 -07:00
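In sketch form (bf_budget is an assumed field name), the bug and the fix differ in a single line:

static void mlx5e_update_bf_quota(struct mlx5e_sq *sq, bool used_bf)
{
	if (used_bf)
		sq->bf_budget--;	/* spend quota only when BF was used */

	/* buggy version: sq->bf_budget = used_bf ? sq->bf_budget - 1 : 0;
	 * i.e. any non-BF send wrongly zeroed the remaining quota
	 */
}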
Tariq Toukan bc77b240b3 net/mlx5e: Add fragmented memory support for RX multi packet WQE
If the allocation of a linear (physically contiguous) MPWQE fails,
we allocate a fragmented MPWQE.

This is implemented via the device's UMR (User Memory Registration),
which allows registering multiple memory fragments with the ConnectX
hardware as one contiguous buffer.
UMR registration is an asynchronous operation and is done via
ICO SQs.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-21 15:09:05 -04:00
Tariq Toukan d3c9bc2743 net/mlx5e: Added ICO SQs
Add an ICO (Internal Control Operations) SQ per channel, to be used
for driver-internal operations such as memory registration for
fragmented memory and NOP requests upon ifconfig up.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-21 15:09:05 -04:00
Jesper Dangaard Brouer 8ec736e556 mlx5: use napi_consume_skb API to get bulk free operations
Bulk freeing of SKBs happens transparently via the napi_consume_skb()
API call. The napi budget parameter is needed by napi_consume_skb() to
detect whether it was called from netpoll.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-13 22:35:36 -04:00
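At the TX-completion call site this is a one-line swap; the wrapper below merely names the pieces (napi_budget is the budget the driver's napi poll function received, 0 when invoked from netpoll).

static inline void mlx5e_consume_tx_skb(struct sk_buff *skb, int napi_budget)
{
	/* bulk free in napi context; budget == 0 (netpoll) frees at once */
	napi_consume_skb(skb, napi_budget);	/* was: dev_kfree_skb(skb) */
}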
David S. Miller 810813c47a Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Several cases of overlapping changes, as well as one instance
(vxlan) of a bug fix in 'net' overlapping with code movement
in 'net-next'.

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-08 12:34:12 -05:00
Gal Pressman b081da5ee1 net/mlx5e: Add rx/tx bytes software counters
Sum up RX/TX bytes in software, as we do for RX/TX packets, to be
reported in an upcoming statistics fix.

Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-02 14:37:26 -05:00
Tariq Toukan 59a7c2fd33 net/mlx5e: Remove wrong poll CQ optimization
With the MLX5E_CQ_HAS_CQES optimization flag, the following buggy
flow might occur:
- Suppose RX is always busy, TX has a single packet every second.
- We poll a single TX cqe and clear its flag.
- We never arm it again as RX is always busy.
- TX CQ flag is never changed, and new TX cqes are not polled.

We revert this optimization.

Fixes: e586b3b0ba ('net/mlx5: Ethernet Datapath files')
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-02 14:37:24 -05:00
Moshe Lazer 0ba422410b net/mlx5: Fix global UAR mapping
Avoid double mapping of I/O mapped memory: a device page may be
mapped as non-cached (NC) or as write-combining (WC). The code before
this fix tried to map it both WC and NC, contrary to what is stated in
Intel's software developer manual.

Here we remove the global WC mapping of all UARs,
"dev->priv.bf_mapping", since the UAR mapping should be decided per
UAR (e.g. we want different mappings for EQs and CQs vs. QPs).

The caller now has to choose whether or not to map via the
write-combining API.

mlx5e SQs will choose write-combining in order to perform
BlueFlame writes.

Fixes: 88a85f99e5 ('TX latency optimization to save DMA reads')
Signed-off-by: Moshe Lazer <moshel@mellanox.com>
Reviewed-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01 17:28:00 -05:00
Matthew Finlay 89db09eb59 net/mlx5e: Add TX inner packet counters
Add TSO and TX checksum counters for tunneled, inner packets.

Signed-off-by: Matthew Finlay <matt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-24 13:50:22 -05:00
Matthew Finlay 9879515895 net/mlx5e: Add TX stateless offloads for tunneling
Add support for TSO and TX checksum when using hw-assisted tunneled
offloads.

Signed-off-by: Matthew Finlay <matt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-24 13:50:22 -05:00
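A sketch of the flag selection (the MLX5_ETH_WQE_* constants follow upstream naming; the stats field is assumed): encapsulated skbs ask the HW to fill the inner checksums too.

static void mlx5e_txwqe_build_cs_flags(struct mlx5_wqe_eth_seg *eseg,
				       struct mlx5e_sq_stats *stats,
				       struct sk_buff *skb)
{
	if (likely(skb->ip_summed == CHECKSUM_PARTIAL)) {
		eseg->cs_flags = MLX5_ETH_WQE_L3_CSUM;
		if (skb->encapsulation) {
			eseg->cs_flags |= MLX5_ETH_WQE_L3_INNER_CSUM |
					  MLX5_ETH_WQE_L4_INNER_CSUM;
			stats->csum_offload_inner++;	/* inner-offload counter */
		} else {
			eseg->cs_flags |= MLX5_ETH_WQE_L4_CSUM;
		}
	}
}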
Saeed Mahameed 08fb1dacdd net/mlx5e: Support DCBNL IEEE ETS
Support the ndo_setup_tc callback and the methods needed for multi
TC/UP support, and remove default_vlan_prio from mlx5e_priv: it is
always 0, and is replaced with a hardcoded "0" in the new select-queue
method.

For that we now create MAX_NUM_TC TISs (one per prio) on netdevice
creation, instead of priv->params.num_tc, which was always 1.

So far each channel had a single TXQ; now each channel has a TXQ per
TC (Traffic Class).

Added en_dcbnl.c which implements the set/get DCBNL IEEE ETS,
set/get dcbx and registers the mlx5e dcbnl ops.

We still use the kernel's default TXQ selection method to select the
channel to transmit through but now we use our own method to select
the TXQ inside the channel based on VLAN priority.

In mlx5, as opposed to mlx4, tc group N gets lower priority than
tc group N+1.

CC: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Rana Shahout <ranas@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-24 13:50:21 -05:00
Eran Ben Elisha ef9814deaf net/mlx5e: Add HW timestamping (TS) support
Add support for enabling/disabling HW timestamping for incoming
and/or outgoing packets. To enable/disable HW timestamping, the
appropriate ioctl should be used. Currently only
HWTSTAMP_FILTER_ALL/NONE and HWTSTAMP_TX_ON/OFF are supported. Make
all the relevant changes in the RX/TX flows to honor the TS request
and place HW timestamps into the relevant structures.

Add internal clock for converting hardware timestamp to nanoseconds. In
addition, add a service task to catch internal clock overflow, to make
sure timestamping is accurate.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-05 14:11:50 -05:00
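The conversion step, sketched with the kernel's timecounter API (how the timecounter is initialized from the device clock is omitted here): the service task periodically reads the counter so an overflow never goes unnoticed.

#include <linux/timecounter.h>

/* raw CQE timestamp (device clock ticks) -> nanoseconds */
static u64 mlx5e_ts_to_ns(struct timecounter *tc, u64 cqe_ts)
{
	return timecounter_cyc2time(tc, cqe_ts);
}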
Achiad Shochat 34802a42b3 net/mlx5e: Do not modify the TX SKB
If the SKB is cloned, or has an elevated user count, someone else can
be looking at it at the same time.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-05 14:11:50 -05:00
Achiad Shochat d4e28cbd24 net/mlx5e: Use the right DMA free function on TX path
On the xmit path we use skb_frag_dma_map(), which uses
dma_map_page(), while upon completion we dma-unmap the skb fragments
using dma_unmap_single() rather than dma_unmap_page().

To fix this, we now save the DMA map type on the xmit path and use
this info to call the right DMA unmap method upon TX completion (see
the sketch below).

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-15 18:43:40 -05:00
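The fix in sketch form, close to what the commit describes (names assumed): remember how each buffer was mapped so the unmap call matches.

enum mlx5e_dma_map_type {
	MLX5E_DMA_MAP_SINGLE,	/* headers, mapped with dma_map_single() */
	MLX5E_DMA_MAP_PAGE,	/* frags, mapped with skb_frag_dma_map() */
};

struct mlx5e_sq_dma {
	dma_addr_t		addr;
	u32			size;
	enum mlx5e_dma_map_type	type;
};

static void mlx5e_tx_dma_unmap(struct device *pdev, struct mlx5e_sq_dma *dma)
{
	switch (dma->type) {
	case MLX5E_DMA_MAP_SINGLE:
		dma_unmap_single(pdev, dma->addr, dma->size, DMA_TO_DEVICE);
		break;
	case MLX5E_DMA_MAP_PAGE:
		dma_unmap_page(pdev, dma->addr, dma->size, DMA_TO_DEVICE);
		break;
	}
}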
Saeed Mahameed ba6c4c0944 net/mlx5e: Fix inline header size calculation
mlx5e_get_inline_hdr_size didn't take the vlan insertion into the
inline WQE segment into account.
This could lead to a max-inline violation in cases where
skb_headlen(skb) + VLAN_HLEN >= sq->max_inline.

Fixes: 3ea4891db8 ("net/mlx5e: Fix LSO vlan insertion")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-15 18:43:40 -05:00
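In sketch form (helper and field names assumed): the VLAN tag the driver itself inserts grows the inline copy by VLAN_HLEN, so it must be counted before checking against sq->max_inline.

static u16 mlx5e_inline_hdr_size_vlan_aware(struct mlx5e_sq *sq,
					    struct sk_buff *skb, u16 ihs)
{
	if (skb_vlan_tag_present(skb))
		ihs += VLAN_HLEN;	/* the tag the driver will insert */

	return ihs;	/* caller compares this against sq->max_inline */
}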
Achiad Shochat 3ea4891db8 net/mlx5e: Fix LSO vlan insertion
Consider the impact of vlan insertion on the header copy size for LSO
packets as well.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-03 10:41:51 -05:00
Achiad Shochat e4cf27bd9c net/mlx5e: Re-enable client vlan TX acceleration
This reverts commit cd58c714ac "net/mlx5e: Disable client vlan TX acceleration".

Bring back the client vlan insertion offload; the original
performance issue was found and fixed in the next patch.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-03 10:41:51 -05:00
Rana Shahout 5283af899a net/mlx5e: Avoid accessing NULL pointer at ndo_select_queue
To avoid multiplication/division operations on the data path, we hold
a {channel, tc} ==> txq mapping table.
This table used to live inside the channel object, which is destroyed
during some configuration operations (e.g. an MTU change). So if
ndo_select_queue runs during such a configuration operation, it may
access a NULL channel pointer and cause a kernel panic.
To fix this, we moved the {channel, tc} ==> txq mapping table outside
the channel object so that it remains available during such
configuration operations.

Signed-off-by: Rana Shahout <ranas@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-25 13:45:09 -07:00
Achiad Shochat 88a85f99e5 net/mlx5e: TX latency optimization to save DMA reads
A regular TX WQE execution involves two or more DMA reads -
one to fetch the WQE, and another one per WQE gather entry.

These DMA reads obviously increase the TX latency.
There are two mlx5 mechanisms to bypass these DMA reads:
1) Inline WQE
2) Blue Flame (BF)

An inline WQE contains a whole packet, thus saves the DMA read/s
of the regular WQE gather entry/s. Inline WQE support was already
added in the previous commit.

A BF WQE is written directly to the device I/O mapped memory, thus
enables saving the DMA read that fetches the WQE.

The BF WQE I/O write must be in cache line granularity, thus uses
the CPU write combining mechanism.
A BF WQE I/O write acts also as a TX doorbell for notifying the
device of new TX WQEs.
A BF WQE is written to the same I/O mapped address as the regular TX
doorbell, thus this address is being mapped twice - once by ioremap()
and once by io_mapping_map_wc().

While both mechanisms reduce the TX latency, they both consume more CPU
cycles than a regular WQE:
- A BF WQE must still be written to host memory, in addition to being
  written directly to the device I/O mapped memory.
- An inline WQE involves copying the SKB data into it.

To handle this tradeoff, we introduce here a heuristic algorithm that
strives to avoid using these two mechanisms in case the TX queue is
being back-pressured by the device, and limit their usage rate otherwise.

An inline WQE will always be "Blue Flamed" (written directly to the
device I/O mapped memory) while a BF WQE may not be inlined (may contain
gather entries).

Preliminary testing using netperf UDP_RR shows that the latency goes down
from 17.5us to 16.9us, while the message rate (tested with pktgen) stays
the same.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-27 00:29:17 -07:00
Achiad Shochat 58d522912a net/mlx5e: Support TX packet copy into WQE
AKA inline WQE.
A TX latency optimization to save data gather DMA reads.
Controlled by ETHTOOL_TX_COPYBREAK.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-27 00:29:17 -07:00
Achiad Shochat a1f5a1a87a net/mlx5e: Pop cq outside mlx5e_get_cqe
Separate mlx5e_get_cqe() from mlx5_cqwq_pop(); this helps code
readability and CQ buffer management.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-24 00:42:41 -07:00
Achiad Shochat e33910548a net/mlx5e: Remove mlx5e_cq.sqrq back-pointer
Use container_of() instead.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-24 00:42:39 -07:00
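A sketch of the back-pointer removal: because the CQ is embedded in its owner, the owner is recoverable from the member address alone.

struct mlx5e_sq {
	/* ... other fields ... */
	struct mlx5e_cq cq;	/* embedded member */
};

static inline struct mlx5e_sq *mlx5e_sq_from_cq(struct mlx5e_cq *cq)
{
	return container_of(cq, struct mlx5e_sq, cq);
}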