1
0
Fork 0
remarkable-linux/include
Magnus Karlsson a9744f7ca2 xsk: fix potential race in SKB TX completion code
There is a potential race in the TX completion code for the SKB
case. One process enters the sendmsg code of an AF_XDP socket in order
to send a frame. The execution eventually trickles down to the driver
that is told to send the packet. However, it decides to drop the
packet due to some error condition (e.g., rings full) and frees the
SKB. This will trigger the SKB destructor and a completion will be
sent to the AF_XDP user space through its
single-producer/single-consumer queues.

At the same time a TX interrupt has fired on another core and it
dispatches the TX completion code in the driver. It does its HW
specific things and ends up freeing the SKB associated with the
transmitted packet. This will trigger the SKB destructor and a
completion will be sent to the AF_XDP user space through its
single-producer/single-consumer queues. With a pseudo call stack, it
would look like this:

Core 1:
sendmsg() being called in the application
  netdev_start_xmit()
    Driver entered through ndo_start_xmit
      Driver decides to free the SKB for some reason (e.g., rings full)
        Destructor of SKB called
          xskq_produce_addr() is called to signal completion to user space

Core 2:
TX completion irq
  NAPI loop
    Driver irq handler for TX completions
      Frees the SKB
        Destructor of SKB called
          xskq_produce_addr() is called to signal completion to user space

We now have a violation of the single-producer/single-consumer
principle for our queues as there are two threads trying to produce at
the same time on the same queue.

Fixed by introducing a spin_lock in the destructor. In regards to the
performance, I get around 1.74 Mpps for txonly before and after the
introduction of the spinlock. There is of course some impact due to
the spin lock but it is in the less significant digits that are too
noisy for me to measure. But let us say that the version without the
spin lock got 1.745 Mpps in the best case and the version with 1.735
Mpps in the worst case, then that would mean a maximum drop in
performance of 0.5%.

Fixes: 35fcde7f8d ("xsk: support for Tx")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-07-02 18:37:12 -07:00
..
acpi ACPI / processor: Finish making acpi_processor_ppc_has_changed() void 2018-06-20 10:50:40 +02:00
asm-generic locking/qspinlock: Fix build for anonymous union in older GCC compilers 2018-06-22 04:19:16 +02:00
clocksource
crypto Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL 2018-06-28 10:40:47 -07:00
drm drm for v4.18-rc1 2018-06-06 08:16:33 -07:00
dt-bindings ARM: SoC driver updates 2018-06-11 18:15:22 -07:00
keys docs: Fix some broken references 2018-06-15 18:10:01 -03:00
kvm KVM: arm/arm64: Bump VGIC_V3_MAX_CPUS to 512 2018-05-25 12:29:27 +01:00
linux Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-07-02 11:18:28 -07:00
math-emu
media media: v4l2-core: push taking ioctl mutex down to ioctl handler 2018-05-28 16:31:44 -04:00
memory
misc ocxl: Expose the thread_id needed for wait on POWER9 2018-06-03 20:40:32 +10:00
net xsk: fix potential race in SKB TX completion code 2018-07-02 18:37:12 -07:00
pcmcia
ras
rdma 4.18-rc 2018-06-21 07:22:30 +09:00
scsi SCSI misc on 20180610 2018-06-10 13:01:12 -07:00
soc ARM: SoC: late updates 2018-06-11 18:19:45 -07:00
sound sound updates for 4.18 2018-06-06 09:08:38 -07:00
target scsi: target: transport should handle st FM/EOM/ILI reads 2018-05-18 12:22:48 -04:00
trace NFS client updates for Linux 4.18 2018-06-12 10:09:03 -07:00
uapi Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-07-02 11:18:28 -07:00
video fbdev changes for v4.18: 2018-06-17 05:00:24 +09:00
xen xen: fixes for 4.18-rc2 2018-06-23 20:44:11 +08:00