1
0
Fork 0
alistair23-linux/net/core
Eric Dumazet 3226b158e6 net: avoid 32 x truesize under-estimation for tiny skbs
Both virtio net and napi_get_frags() allocate skbs
with a very small skb->head

While using page fragments instead of a kmalloc backed skb->head might give
a small performance improvement in some cases, there is a huge risk of
under estimating memory usage.

For both GOOD_COPY_LEN and GRO_MAX_HEAD, we can fit at least 32 allocations
per page (order-3 page in x86), or even 64 on PowerPC

We have been tracking OOM issues on GKE hosts hitting tcp_mem limits
but consuming far more memory for TCP buffers than instructed in tcp_mem[2]

Even if we force napi_alloc_skb() to only use order-0 pages, the issue
would still be there on arches with PAGE_SIZE >= 32768

This patch makes sure that small skb head are kmalloc backed, so that
other objects in the slab page can be reused instead of being held as long
as skbs are sitting in socket queues.

Note that we might in the future use the sk_buff napi cache,
instead of going through a more expensive __alloc_skb()

Another idea would be to use separate page sizes depending
on the allocated length (to never have more than 4 frags per page)

I would like to thank Greg Thelen for his precious help on this matter,
analysing crash dumps is always a time consuming task.

Fixes: fd11a83dd3 ("net: Pull out core bits of __netdev_alloc_skb and add __napi_alloc_skb")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://lore.kernel.org/r/20210113161819.1155526-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-14 10:51:52 -08:00
..
Makefile ethtool: move to its own directory 2019-12-12 17:07:05 -08:00
bpf_sk_storage.c bpf: Expose bpf_sk_storage_* to iterator programs 2020-12-04 22:32:40 +01:00
datagram.c net: datagram: fix some kernel-doc markups 2020-11-17 14:15:03 -08:00
datagram.h net/core: Allow the compiler to verify declaration and definition consistency 2019-03-27 13:49:44 -07:00
dev.c net: make sure devices go through netdev_wait_all_refs 2021-01-08 19:27:41 -08:00
dev_addr_lists.c net: core: add nested_level variable in net_device 2020-09-28 15:00:15 -07:00
dev_ioctl.c net: dev_ioctl: remove redundant initialization of variable err 2020-11-03 17:49:26 -08:00
devlink.c net: core: devlink: simplify the return expression of devlink_nl_cmd_trap_set_doit() 2020-12-08 16:22:54 -08:00
drop_monitor.c genetlink: move to smaller ops wherever possible 2020-10-02 19:11:11 -07:00
dst.c net: Correct the comment of dst_dev_put() 2020-09-10 13:28:57 -07:00
dst_cache.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
failover.c failover: allow name change on IFF_UP slave interfaces 2019-04-10 22:12:26 -07:00
fib_notifier.c net: fib_notifier: propagate extack down to the notifier block callback 2019-10-04 11:10:56 -07:00
fib_rules.c treewide: rename nla_strlcpy to nla_strscpy. 2020-11-16 08:08:54 -08:00
filter.c bpf: Only provide bpf_sock_from_file with CONFIG_NET 2020-12-08 18:23:36 -08:00
flow_dissector.c net: core: fix spelling typo in flow_dissector.c 2020-11-07 15:52:21 -08:00
flow_offload.c net: flow_offload: Fix memory leak for indirect flow block 2020-12-09 16:08:33 -08:00
gen_estimator.c net_sched: gen_estimator: extend packet counter to 64bit 2019-11-06 21:51:36 -08:00
gen_stats.c docs: networking: convert gen_stats.txt to ReST 2020-04-28 14:39:46 -07:00
gro_cells.c gro_cells: reduce number of synchronize_net() calls 2020-11-25 11:28:12 -08:00
hwbm.c net: hwbm: Make the hwbm_pool lock a mutex 2019-06-09 19:40:10 -07:00
link_watch.c net: Add IF_OPER_TESTING 2020-04-20 12:43:24 -07:00
lwt_bpf.c lwt_bpf: Replace preempt_disable() with migrate_disable() 2020-12-07 11:53:40 -08:00
lwtunnel.c net: ipv6: add rpl sr tunnel 2020-03-29 22:30:57 -07:00
neighbour.c net: neighbor: fix a crash caused by mod zero 2020-12-28 14:49:48 -08:00
net-procfs.c net-sysfs: add backlog len and CPU id to softnet data 2020-09-21 13:56:37 -07:00
net-sysfs.c net-sysfs: take the rtnl lock when accessing xps_rxqs_map and num_tc 2020-12-28 13:26:46 -08:00
net-sysfs.h net-sysfs: add netdev_change_owner() 2020-02-26 20:07:25 -08:00
net-traces.c page_pool: add tracepoints for page_pool with details need by XDP 2019-06-19 11:23:13 -04:00
net_namespace.c fixes-v5.11 2020-12-14 16:40:27 -08:00
netclassid_cgroup.c net: Remove the err argument from sock_from_file 2020-12-04 22:32:40 +01:00
netevent.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
netpoll.c net: Have netpoll bring-up DSA management interface 2020-11-18 11:04:11 -08:00
netprio_cgroup.c net: Remove the err argument from sock_from_file 2020-12-04 22:32:40 +01:00
page_pool.c net: page_pool: Add bulk support for ptr_ring 2020-11-14 02:29:00 +01:00
pktgen.c pktgen: Fix inconsistent of format with argument type in pktgen.c 2020-10-01 18:45:23 -07:00
ptp_classifier.c ptp: Add generic ptp v2 header parsing function 2020-08-19 16:07:49 -07:00
request_sock.c tcp: add rcu protection around tp->fastopen_rsk 2019-10-13 10:13:08 -07:00
rtnetlink.c net: make free_netdev() more lenient with unregistering devices 2021-01-08 19:27:41 -08:00
scm.c fs: Add receive_fd() wrapper for __receive_fd() 2020-07-13 11:03:44 -07:00
secure_seq.c crypto: lib/sha1 - remove unnecessary includes of linux/cryptohash.h 2020-05-08 15:32:17 +10:00
skbuff.c net: avoid 32 x truesize under-estimation for tiny skbs 2021-01-14 10:51:52 -08:00
skmsg.c bpf, sockmap: Avoid failures from skb_to_sgvec when skb has frag_list 2020-11-18 00:14:04 +01:00
sock.c net: Remove the err argument from sock_from_file 2020-12-04 22:32:40 +01:00
sock_diag.c bpf, net: Rework cookie generator as per-cpu one 2020-09-30 11:50:35 -07:00
sock_map.c bpf: Eliminate rlimit-based memory accounting for sockmap and sockhash maps 2020-12-02 18:32:47 -08:00
sock_reuseport.c udp: Prevent reuseport_select_sock from reading uninitialized socks 2021-01-08 19:15:40 -08:00
stream.c tcp: make sure EPOLLOUT wont be missed 2019-08-19 13:07:43 -07:00
sysctl_net_core.c net: add option to not create fall-back tunnels in root-ns as well 2020-08-28 06:52:44 -07:00
timestamping.c net: Introduce a new MII time stamping interface. 2019-12-25 19:51:33 -08:00
tso.c net: tso: add UDP segmentation support 2020-06-18 20:46:23 -07:00
utils.c net: Fix skb->csum update in inet_proto_csum_replace16(). 2020-01-24 20:54:30 +01:00
xdp.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-12-11 22:29:38 -08:00