1
0
Fork 0
Commit Graph

967812 Commits (0afe0a998c40085a6342e1aeb4c510cccba46caf)

Author SHA1 Message Date
Lorenzo Bianconi 2f9d09394d net: mvneta: Add xdp tx return bulking support
Convert mvneta driver to xdp_return_frame_bulk APIs.

XDP_REDIRECT (upstream codepath): 275Kpps
XDP_REDIRECT (upstream codepath + bulking APIs): 284Kpps

Co-developed-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/9af8014006d022fc0fec78cdaa71beb56999750d.1605267335.git.lorenzo@kernel.org
2020-11-14 02:29:00 +01:00
Lorenzo Bianconi 7886244736 net: page_pool: Add bulk support for ptr_ring
Introduce the capability to batch page_pool ptr_ring refill since it is
usually run inside the driver NAPI tx completion loop.

Suggested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Co-developed-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Link: https://lore.kernel.org/bpf/08dd249c9522c001313f520796faa777c4089e1c.1605267335.git.lorenzo@kernel.org
2020-11-14 02:29:00 +01:00
Lorenzo Bianconi 8965398713 net: xdp: Introduce bulking for xdp tx return path
XDP bulk APIs introduce a defer/flush mechanism to return
pages belonging to the same xdp_mem_allocator object
(identified via the mem.id field) in bulk to optimize
I-cache and D-cache since xdp_return_frame is usually run
inside the driver NAPI tx completion loop.
The bulk queue size is set to 16 to be aligned to how
XDP_REDIRECT bulking works. The bulk is flushed when
it is full or when mem.id changes.
xdp_frame_bulk is usually stored/allocated on the function
call-stack to avoid locking penalties.
Current implementation considers only page_pool memory model.

Suggested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Co-developed-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Link: https://lore.kernel.org/bpf/e190c03eac71b20c8407ae0fc2c399eda7835f49.1605267335.git.lorenzo@kernel.org
2020-11-14 02:28:59 +01:00
Jisheng Zhang bb3222f71b net: stmmac: platform: use optional clk/reset get APIs
Use the devm_reset_control_get_optional() and devm_clk_get_optional()
rather than open coding them.

Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Link: https://lore.kernel.org/r/20201112092606.5173aa6f@xhacker.debian
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-13 16:31:52 -08:00
Heiner Kallweit ca1ab89cd2 r8169: improve rtl_tx
We can simplify the for() condition and eliminate variable tx_left.
The change also considers that tp->cur_tx may be incremented by a
racing rtl8169_start_xmit().
In addition replace the write to tp->dirty_tx and the following
smp_mb() with an equivalent call to smp_store_mb(). This implicitly
adds a WRITE_ONCE() to the write.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/c2e19e5e-3d3f-d663-af32-13c3374f5def@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-13 16:29:07 -08:00
Heiner Kallweit 95f3c5458d r8169: use READ_ONCE in rtl_tx_slots_avail
tp->dirty_tx and tp->cur_tx may be changed by a racing rtl_tx() or
rtl8169_start_xmit(). Use READ_ONCE() to annotate the races and ensure
that the compiler doesn't use cached values.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/5676fee3-f6b4-84f2-eba5-c64949a371ad@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-13 16:28:59 -08:00
Jakub Kicinski 2caf08e757 Merge branch 'net-ipa-two-fixes'
Alex Elder says:

====================
net: ipa: two fixes

This small series makes two fixes to the IPA code:
  - While reviewing something else I found that one of the resource
    limits on the SDM845 used the wrong value.  The first patch
    fixes this.  The correct value allocates more resources of this
    type for IPA to use, and otherwise does not change behavior.
  - When the IPA-resident microcontroller starts up it generates an
    event, which triggers an AP interrupt.  The event merely
    provides some information for logging, which we don't support.
    We already ignore the event, and that's harmless.  So this
    patch explicitly ignores it rather than issuing a warning when
    it occurs.
====================

Link: https://lore.kernel.org/r/20201112121157.19784-1-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-13 15:37:10 -08:00
Alex Elder 0a5096ec2a net: ipa: ignore the microcontroller log event
The IPA-resident microcontroller has the ability to log various
activity in an area of IPA shared memory.  When the microcontroller
starts it generates an event to the AP to provide information about
the log.

We don't support reading this log, and we can safely ignore the
event.  So do that rather than treating the log info event we
receive as "unsupported."

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-13 15:37:07 -08:00
Alex Elder 3ce6da1b2e net: ipa: fix source packet contexts limit
I have discovered that the maximum number of source packet contexts
configured for SDM845 is incorrect.  Fix this error.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-13 15:37:07 -08:00
Jakub Kicinski 992c75ae2f Merge branch 'sfc-further-ef100-encap-tso-features'
Edward Cree says:

====================
sfc: further EF100 encap TSO features

This series adds support for GRE and GRE_CSUM TSO on EF100 NICs, as
 well as improving the handling of UDP tunnel TSO.
====================

Link: https://lore.kernel.org/r/eda2de73-edf2-8b92-edb9-099ebda09ebc@solarflare.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-13 15:33:37 -08:00
Edward Cree c5122cf584 sfc: support GRE TSO on EF100
We can treat SKB_GSO_GRE almost exactly the same as UDP tunnels, except
 that we don't want to edit the outer UDP len (as there isn't one).
For SKB_GSO_GRE_CSUM, we have to use GSO_PARTIAL as the device doesn't
 support offload of non-UDP outer L4 checksums.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Martin Habets <mhabets@solarflare.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
2020-11-13 15:33:30 -08:00
Edward Cree 42bfd69a9f sfc: correctly support non-partial GSO_UDP_TUNNEL_CSUM on EF100
By asking the HW for the correct edits, we can make UDP tunnel TSO
 work without needing GSO_PARTIAL.  So don't specify it in our
 netdev->gso_partial_features.
However, retain GSO_PARTIAL support, as this will be used for other
 protocols later.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Martin Habets <mhabets@solarflare.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
2020-11-13 15:33:27 -08:00
Edward Cree dc8d2512e6 sfc: extend bitfield macros to 19 fields
Our TSO descriptors got even more fussy.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Martin Habets <mhabets@solarflare.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
2020-11-13 15:33:03 -08:00
Jakub Kicinski 72ac50b206 Merge branch 'net-ipa-gsi-register-consolidation'
Alex Elder says:

====================
net: ipa: GSI register consolidation

This series rearranges and consolidates some GSI register
definitions.  Its general aim is to make things more
consistent, by:
  - Using enumerated types to define the values held in GSI register
    fields
  - Defining field values in "gsi_reg.h", together with the
    definition of the register (and field) that holds them
  - Format enumerated type members consistently, with hexidecimal
    numeric values, and assignments aligned on the same column

There is one checkpatch "CHECK" warning requesting a blank line; I
ignored that because my intention was to group certain definitions.
====================

Link: https://lore.kernel.org/r/20201110215922.23514-1-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-13 15:13:46 -08:00
Alex Elder 4730ab1c1d net: ipa: use enumerated types for GSI field values
Replace constants defined with an "_FVAL" suffix with values defined
in enumerated types, to be consistent with other usage in the driver.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-13 15:13:41 -08:00
Alex Elder cec2076e43 net: ipa: move GSI command opcode values into "gsi_reg.h"
The gsi_ch_cmd_opcode, gsi_evt_cmd_opcode, and gsi_generic_cmd_opcode
enumerated types are values that fields in the GSI command registers
can take on.  Move their definitions out of "gsi.c" and into "gsi_reg.h",
alongside the definition of registers they are associated with.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-13 15:13:41 -08:00
Alex Elder 7b0ac8f651 net: ipa: move GSI error values into "gsi_reg.h"
The gsi_err_code and gsi_err_type enumerated types are values that
fields in the GSI ERROR_LOG register can take on.  Move their
definitions out of "gsi.c" and into "gsi_reg.h", alongside the
definition of the ERROR_LOG register offset and field symbols.

Drop the "_ERR" suffix in the names of the gsi_err_code members.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-13 15:13:41 -08:00
Alex Elder 9ed8c2a92d net: ipa: move channel type values into "gsi_reg.h"
The gsi_channel_type enumerated type define values used for the
channel type/protocol for event rings and channels.  Move its
definition out of "gsi.c" and into "gsi_reg.h", alongside the
definition of the CH_C_CNTXT_0 register offset and its fields.
Add a comment near the definition of the EV_CH_E_CNTXT_0 register
indicating this type is used for its EV_CHTYPE field.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-13 15:13:41 -08:00
Alex Elder 46dda53ef7 net: ipa: use common value for channel type and protocol
The numeric values that represent the event ring channel type are
identical to the values that represent the matching protocol used
for a channel.  Use a new gsi_channel_type enumerated type to
represent the values programmed for both cases, using "CHANNEL_TYPE"
in member names in place of "EVT_CHTYPE" and "CHANNEL_PROTOCOL".

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-13 15:13:41 -08:00
Alex Elder 6c6358cca6 net: ipa: define GSI interrupt types with enums
Define the GSI global interrupt types with an enumerated type whose
values are the bit positions representing the global interrupt types.

Similarly, define the GSI general interrupt types with an enumerated
type whose values are the bit positions of general interrupt types.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-13 15:13:40 -08:00
Wenlin Kang 2f51e5758d tipc: fix -Wstringop-truncation warnings
Replace strncpy() with strscpy(), fixes the following warning:

In function 'bearer_name_validate',
    inlined from 'tipc_enable_bearer' at net/tipc/bearer.c:246:7:
net/tipc/bearer.c:141:2: warning: 'strncpy' specified bound 32 equals destination size [-Wstringop-truncation]
  strncpy(name_copy, name, TIPC_MAX_BEARER_NAME);
  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Wenlin Kang <wenlin.kang@windriver.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Link: https://lore.kernel.org/r/20201112093442.8132-1-wenlin.kang@windriver.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-13 14:17:49 -08:00
Jakub Kicinski f8fd36b95e Some updates:
* injection/radiotap updates for new test capabilities
  * remove WDS support - even years ago when we turned
    it off by default it was already basically unusable
  * support for HE (802.11ax) rates for beacons
  * support for some vendor-specific HE rates
  * many other small features/cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEH1e1rEeCd0AIMq6MB8qZga/fl8QFAl+uSqIACgkQB8qZga/f
 l8QBOw/6AwlcQWMjqdb6H/QRORA81E4tX2+alHbeBai7KSI+9E1Jtakmn5qKQ4iH
 IjpNWPsclj4zKhgbKaariIn/bZEk8OhzmDpssnHTMpuo3iuCmuzFaDdZd9Uun2Ad
 tr3bqfHaom1MhWRF/FuBSHcnk599qRnsk+RY7/6dhjiPlWOWJvsfpuo1KblVoFWU
 wYDX+W2oYDAx44O/6AGJ0Zctwf6m7Kyzb2aMIqv2fwacBoDvyVdTIT/4NroV9INI
 QvIY4Gi8hoCDQX39zwaxSWOq7uFLYHwUozzZxktS5c4N3eSVFs80jmdiQiMKmKRQ
 A+R+ZcuFBcC+6+Wt4x+20T2mF6pUvSaIDA4jegCbDL4jQlp+023XTMlV42cnpP0z
 hFZgBWJszLnLtj4KW/v3sXefZ1Pxl0WD4BHNqz8SMzMUaWalrXP4Gt2bnjB7Bx1N
 2M/DjW570eNZeZ9ZFcvkwHysCWMzHKmh5sPXnOitrs4s2hweIrO7wnMlYVLAGF1J
 m8jUoqpI9Cc7dFEg0inaSIddcjobcx9i2eG14zaZnXj0t8WqAbQqI0Lw/mipWXFY
 7DfdjFULI+Yru46TAFbiisFo/2dlijxrIr3d3QK21Cwklb3BPhpiDf83q6HYhNpB
 xPs38OCZaNdSL7TwNRcuZ2jmBCf+48SYgse85HQOgdD2QzJv6dU=
 =TGgF
 -----END PGP SIGNATURE-----

Merge tag 'mac80211-next-for-net-next-2020-11-13' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next

Johannes Berg says:

====================
Some updates:
 * injection/radiotap updates for new test capabilities
 * remove WDS support - even years ago when we turned
   it off by default it was already basically unusable
 * support for HE (802.11ax) rates for beacons
 * support for some vendor-specific HE rates
 * many other small features/cleanups

* tag 'mac80211-next-for-net-next-2020-11-13' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next: (21 commits)
  nl80211: fix kernel-doc warning in the new SAE attribute
  cfg80211: remove WDS code
  mac80211: remove WDS-related code
  rt2x00: remove WDS code
  b43legacy: remove WDS code
  b43: remove WDS code
  carl9170: remove WDS code
  ath9k: remove WDS code
  wireless: remove CONFIG_WIRELESS_WDS
  mac80211: assure that certain drivers adhere to DONT_REORDER flag
  mac80211: don't overwrite QoS TID of injected frames
  mac80211: adhere to Tx control flag that prevents frame reordering
  mac80211: add radiotap flag to assure frames are not reordered
  mac80211: save HE oper info in BSS config for mesh
  cfg80211: add support to configure HE MCS for beacon rate
  nl80211: fix beacon tx rate mask validation
  nl80211/cfg80211: fix potential infinite loop
  cfg80211: Add support to calculate and report 4096-QAM HE rates
  cfg80211: Add support to configure SAE PWE value to drivers
  ieee80211: Add definition for WFA DPP
  ...
====================

Link: https://lore.kernel.org/r/20201113101148.25268-1-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-13 12:03:22 -08:00
KP Singh 6f100640ca bpf: Expose bpf_d_path helper to sleepable LSM hooks
Sleepable hooks are never called from an NMI/interrupt context, so it
is safe to use the bpf_d_path helper in LSM programs attaching to these
hooks.

The helper is not restricted to sleepable programs and merely uses the
list of sleepable hooks as the initial subset of LSM hooks where it can
be used.

Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20201113005930.541956-3-kpsingh@chromium.org
2020-11-13 15:47:21 +01:00
KP Singh 423f16108c bpf: Augment the set of sleepable LSM hooks
Update the set of sleepable hooks with the ones that do not trigger
a warning with might_fault() when exercised with the correct kernel
config options enabled, i.e.

	DEBUG_ATOMIC_SLEEP=y
	LOCKDEP=y
	PROVE_LOCKING=y

This means that a sleepable LSM eBPF program can be attached to these
LSM hooks. A new helper method bpf_lsm_is_sleepable_hook is added and
the set is maintained locally in bpf_lsm.c

Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20201113005930.541956-2-kpsingh@chromium.org
2020-11-13 15:45:54 +01:00
Alexei Starovoitov 904709f63b Merge branch 'bpf: Enable bpf_sk_storage for FENTRY/FEXIT/RAW_TP'
Martin KaFai says:

====================

This set is to allow the FENTRY/FEXIT/RAW_TP tracing program to use
bpf_sk_storage.  The first two patches are a cleanup.  The last patch is
tests.  Patch 3 has the required kernel changes to
enable bpf_sk_storage for FENTRY/FEXIT/RAW_TP.

Please see individual patch for details.

v2:
- Rename some of the function prefix from sk_storage to bpf_sk_storage
- Use prefix check instead of substr check
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-11-12 18:39:57 -08:00
Martin KaFai Lau 53632e1119 bpf: selftest: Use bpf_sk_storage in FENTRY/FEXIT/RAW_TP
This patch tests storing the task's related info into the
bpf_sk_storage by fentry/fexit tracing at listen, accept,
and connect.  It also tests the raw_tp at inet_sock_set_state.

A negative test is done by tracing the bpf_sk_storage_free()
and using bpf_sk_storage_get() at the same time.  It ensures
this bpf program cannot load.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20201112211320.2587537-1-kafai@fb.com
2020-11-12 18:39:28 -08:00
Martin KaFai Lau 8e4597c627 bpf: Allow using bpf_sk_storage in FENTRY/FEXIT/RAW_TP
This patch enables the FENTRY/FEXIT/RAW_TP tracing program to use
the bpf_sk_storage_(get|delete) helper, so those tracing programs
can access the sk's bpf_local_storage and the later selftest
will show some examples.

The bpf_sk_storage is currently used in bpf-tcp-cc, tc,
cg sockops...etc which is running either in softirq or
task context.

This patch adds bpf_sk_storage_get_tracing_proto and
bpf_sk_storage_delete_tracing_proto.  They will check
in runtime that the helpers can only be called when serving
softirq or running in a task context.  That should enable
most common tracing use cases on sk.

During the load time, the new tracing_allowed() function
will ensure the tracing prog using the bpf_sk_storage_(get|delete)
helper is not tracing any bpf_sk_storage*() function itself.
The sk is passed as "void *" when calling into bpf_local_storage.

This patch only allows tracing a kernel function.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20201112211313.2587383-1-kafai@fb.com
2020-11-12 18:39:28 -08:00
Martin KaFai Lau e794bfddb8 bpf: Rename some functions in bpf_sk_storage
Rename some of the functions currently prefixed with sk_storage
to bpf_sk_storage.  That will make the next patch have fewer
prefix check and also bring the bpf_sk_storage.c to a more
consistent function naming.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: KP Singh <kpsingh@google.com>
Link: https://lore.kernel.org/bpf/20201112211307.2587021-1-kafai@fb.com
2020-11-12 18:39:27 -08:00
Martin KaFai Lau 9e838b02b0 bpf: Folding omem_charge() into sk_storage_charge()
sk_storage_charge() is the only user of omem_charge().
This patch simplifies it by folding omem_charge() into
sk_storage_charge().

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: KP Singh <kpsingh@google.com>
Link: https://lore.kernel.org/bpf/20201112211301.2586255-1-kafai@fb.com
2020-11-12 18:39:27 -08:00
Jakub Kicinski e1d9d7b913 Merge https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 16:54:48 -08:00
Daniel Borkmann 0a58a65cc0 Merge branch 'bpf-ptrs-beyond-pkt-end'
Alexei Starovoitov says:

====================
v1->v2:
- removed set-but-unused variable.
- added Jiri's Tested-by.

In some cases LLVM uses the knowledge that branch is taken to optimze the code
which causes the verifier to reject valid programs.
Teach the verifier to recognize that
r1 = skb->data;
r1 += 10;
r2 = skb->data_end;
if (r1 > r2) {
  here r1 points beyond packet_end and subsequent
  if (r1 > r2) // always evaluates to "true".
}
====================

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2020-11-13 01:42:57 +01:00
Alexei Starovoitov cb62d34019 selftests/bpf: Add asm tests for pkt vs pkt_end comparison.
Add few assembly tests for packet comparison.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20201111031213.25109-4-alexei.starovoitov@gmail.com
2020-11-13 01:42:11 +01:00
Alexei Starovoitov 9cc873e858 selftests/bpf: Add skb_pkt_end test
Add a test that currently makes LLVM generate assembly code:

$ llvm-objdump -S skb_pkt_end.o
0000000000000000 <main_prog>:
; 	if (skb_shorter(skb, ETH_IPV4_TCP_SIZE))
       0:	61 12 50 00 00 00 00 00	r2 = *(u32 *)(r1 + 80)
       1:	61 14 4c 00 00 00 00 00	r4 = *(u32 *)(r1 + 76)
       2:	bf 43 00 00 00 00 00 00	r3 = r4
       3:	07 03 00 00 36 00 00 00	r3 += 54
       4:	b7 01 00 00 00 00 00 00	r1 = 0
       5:	2d 23 02 00 00 00 00 00	if r3 > r2 goto +2 <LBB0_2>
       6:	07 04 00 00 0e 00 00 00	r4 += 14
; 	if (skb_shorter(skb, ETH_IPV4_TCP_SIZE))
       7:	bf 41 00 00 00 00 00 00	r1 = r4
0000000000000040 <LBB0_2>:
       8:	b4 00 00 00 ff ff ff ff	w0 = -1
; 	if (!(ip = get_iphdr(skb)))
       9:	2d 23 05 00 00 00 00 00	if r3 > r2 goto +5 <LBB0_6>
; 	proto = ip->protocol;
      10:	71 12 09 00 00 00 00 00	r2 = *(u8 *)(r1 + 9)
; 	if (proto != IPPROTO_TCP)
      11:	56 02 03 00 06 00 00 00	if w2 != 6 goto +3 <LBB0_6>
; 	if (tcp->dest != 0)
      12:	69 12 16 00 00 00 00 00	r2 = *(u16 *)(r1 + 22)
      13:	56 02 01 00 00 00 00 00	if w2 != 0 goto +1 <LBB0_6>
; 	return tcp->urg_ptr;
      14:	69 10 26 00 00 00 00 00	r0 = *(u16 *)(r1 + 38)
0000000000000078 <LBB0_6>:
; }
      15:	95 00 00 00 00 00 00 00	exit

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20201111031213.25109-3-alexei.starovoitov@gmail.com
2020-11-13 01:42:11 +01:00
Alexei Starovoitov 6d94e741a8 bpf: Support for pointers beyond pkt_end.
This patch adds the verifier support to recognize inlined branch conditions.
The LLVM knows that the branch evaluates to the same value, but the verifier
couldn't track it. Hence causing valid programs to be rejected.
The potential LLVM workaround: https://reviews.llvm.org/D87428
can have undesired side effects, since LLVM doesn't know that
skb->data/data_end are being compared. LLVM has to introduce extra boolean
variable and use inline_asm trick to force easier for the verifier assembly.

Instead teach the verifier to recognize that
r1 = skb->data;
r1 += 10;
r2 = skb->data_end;
if (r1 > r2) {
  here r1 points beyond packet_end and
  subsequent
  if (r1 > r2) // always evaluates to "true".
}

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20201111031213.25109-2-alexei.starovoitov@gmail.com
2020-11-13 01:42:11 +01:00
Guillaume Nault e865802357 selftests: set conf.all.rp_filter=0 in bareudp.sh
When working on the rp_filter problem, I didn't realise that disabling
it on the network devices didn't cover all cases: rp_filter could also
be enabled globally in the namespace, in which case it would drop
packets, even if the net device has rp_filter=0.

Fixes: 1ccd58331f ("selftests: disable rp_filter when testing bareudp")
Fixes: bbbc7aa45e ("selftests: add test script for bareudp tunnels")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Link: https://lore.kernel.org/r/f2d459346471f163b239aa9d63ce3e2ba9c62895.1605107012.git.gnault@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 16:14:38 -08:00
Jakub Kicinski e7086213f7 Merge branch 'mlxsw-spectrum-prepare-for-xm-implementation-prefix-insertion-and-removal'
Ido Schimmel says:

====================
mlxsw: spectrum: Prepare for XM implementation - prefix insertion and removal

Jiri says:

This is a preparation patchset for follow-up support of boards with
extended mezzanine (XM), which is going to allow extended (scale-wise)
router offload.

XM requires a separate PRM register named XMDR to be used instead of
RALUE to insert/update/remove FIB entries. Therefore, this patchset
extends the previously introduces low-level ops to be able to have
XM-specific FIB entry config implementation.

Currently the existing original RALUE implementation is moved to "basic"
low-level ops.

Unlike legacy router, insertion/update/removal of FIB entries into XM
could be done in bulks up to 4 items in a single PRM register write.
That is why this patchset implements "an op context", that allows the
future XM ops implementation to squash multiple FIB events to single
register write. For that, the way in which the FIB events are processed
by the work queue has to be changed.

The conversion from 1:1 FIB event - work callback call to event queue is
implemented in patch #3.

Patch #4 introduces "an op context" that will allow in future to squash
multiple FIB events into one XMDR register write. Patch #12 converts it
from stack to be allocated per instance.

Existing RALUE manipulations are pushed to ops in patch #10.

Patch #13 is introducing a possibility for low-level implementation to
have per FIB entry private memory.

The rest of the patches are either cosmetics or smaller preparations.
====================

Link: https://lore.kernel.org/r/20201110094900.1920158-1-idosch@idosch.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 15:55:26 -08:00
Jiri Pirko 173f14cda3 mlxsw: spectrum_router: Introduce FIB entry update op
Follow-up patchset introducing XMDR implementation is going to need
to distinguish write and update ops. Therefore introduce "update op"
and call "write op" only when new FIB entry is inserted.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 15:55:22 -08:00
Jiri Pirko a005a7fe2f mlxsw: spectrum_router: Track FIB entry committed state and skip uncommitted on delete
In case bulking is used, the entry that was previously added may not
be yet committed to the HW as it waits in the queue for bulk send. For
such entries, skip the deletion.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 15:55:22 -08:00
Jiri Pirko ae9ce81aa7 mlxsw: spectrum_router: Introduce fib_entry priv for low-level ops
Prepare for the low-level ops that need to store some data alongside
the fib_entry and introduce a per-fib_entry priv for ll ops.
The priv is reference counted as in the follow-up patch it is going
to be saved in pack() function and used later on in commit() even in
case the related fib_entry gets freed in the middle.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 15:55:21 -08:00
Jiri Pirko 91d20d71b2 mlxsw: spectrum_router: Have FIB entry op context allocated for the instance
Get the max size needed for FIB entry op context and allocate it once
for the instance. Use it repeatedly from the scheduled work.
By this, allow to extend the context to hold more data than it is wise
to do when it was on the stack. Make sure to signalize that the context
needs to be initialized in case families of subsequent FIB entries differ.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 15:55:21 -08:00
Jiri Pirko 505cd65c66 mlxsw: spectrum_router: Prepare work context for possible bulking
For XMDR register it is possible to carry multiple FIB entry
operations in a single write. However the FW does not restrict mixing
the types of operations, make the code easier and indicate the bulking
is ok only in case the bulk contains FIB operations of the same family
and event.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 15:55:21 -08:00
Jiri Pirko 7f5c4090e4 mlxsw: spectrum: Push RALUE packing and writing into low-level router ops
With follow-up introduction of XM implementation, XMDR register is
going to be optionally used instead of RALUE register. Push the RALUE
packing helpers and write call into low-level router ops.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 15:55:21 -08:00
Jiri Pirko 1a9c21d5f7 mlxsw: spectrum_router: Use RALUE pack helper from abort function
Unify the RALUE register payload packing and use the
__mlxsw_sp_fib_entry_ralue_pack() helper from
__mlxsw_sp_router_set_abort_trap().

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 15:55:21 -08:00
Jiri Pirko 1a7fcdf75d mlxsw: reg: Allow to pass NULL pointer to mlxsw_reg_ralue_pack4/6()
In preparation for the change that is going to be done in the next
patch, allow to pass NULL pointer to mlxsw_reg_ralue_pack4() and
mlxsw_reg_ralue_pack6() helpers.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 15:55:20 -08:00
Jiri Pirko 0c1d6b2694 mlxsw: spectrum_router: Pass destination IP as a pointer to mlxsw_reg_ralue_pack4()
Instead of passing destination IP as a u32 value, pass it as pointer to
u32. Avoid using local variable for the pointer store.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 15:55:20 -08:00
Jiri Pirko d271cf9f29 mlxsw: spectrum: Export RALUE pack helper and use it from IPIP
As the RALUE packing is going to be put into op, make the user from
IPIP code use the same helper as the router code does.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 15:55:20 -08:00
Jiri Pirko 0f6b66011a mlxsw: spectrum_router: Push out RALUE pack into separate helper
As the RALUE packing is going to be pushed into an op, in preparation
for that push the code into a separate function in the meantime.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 15:55:20 -08:00
Jiri Pirko 2d5bd7a111 mlxsw: spectrum: Propagate context from work handler containing RALUE payload
Currently, RALUE payload is defined locally in the function that is
calling the register write. With introduction of alternative register to
RALUE, XMDR, it has to be possible to put multiple FIB entry
operations into single register write.

So in order to prepare for that, have per-work entry operation context
and propagate it all the way down to the functions writing RALUE.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 15:55:19 -08:00
Jiri Pirko c1b290d594 mlxsw: spectrum_router: Introduce FIB event queue instead of separate works
Currently, every FIB event is queued-up as a separate work to be
processed. However, that allows to process only one FIB entry per work
callback.

In preparation of future XMDR register bulking of multiple FIB entries,
convert to FIB event queue. Implement this by a list_head, adding new
events to the end of the list in the FIB notify callback. That allows to
process multiple events from the list inside the work callback.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 15:55:19 -08:00
Jiri Pirko d57ff02286 mlxsw: spectrum_router: Use RALUE-independent op arg
Since the write/delete of FIB entry is going to be implemented by XMDR
register for XM implementation, introduce RALUE-independent enum for op
so the enum could be used in both RALUE and XMDR.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 15:55:19 -08:00