1
0
Fork 0
Commit Graph

709160 Commits (c6e218035913e14952b04ceecf1a543205106fdb)

Author SHA1 Message Date
Jiri Pirko fa71212e91 net: sched: remove unused is_classid_clsact_ingress/egress helpers
These helpers are no longer in use by drivers, so remove them.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 03:04:08 +01:00
Jiri Pirko d58d31a118 net: sched: remove unused classid field from tc_cls_common_offload
It is no longer used by the drivers, so remove it.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 03:04:08 +01:00
Jiri Pirko 8d26d5636d net: sched: avoid ndo_setup_tc calls for TC_SETUP_CLS*
All drivers are converted to use block callbacks for TC_SETUP_CLS*.
So it is now safe to remove the calls to ndo_setup_tc from cls_*

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 03:04:08 +01:00
Jiri Pirko 6b3eb752b4 dsa: Convert ndo_setup_tc offloads to block callbacks
Benefit from the newly introduced block callback infrastructure and
convert ndo_setup_tc calls for matchall offloads to block callbacks.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 03:04:08 +01:00
Jiri Pirko 90d97315b3 nfp: bpf: Convert ndo_setup_tc offloads to block callbacks
Benefit from the newly introduced block callback infrastructure and
convert ndo_setup_tc calls for bpf offloads to block callbacks.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 03:04:08 +01:00
Jiri Pirko 363fc53b8b nfp: flower: Convert ndo_setup_tc offloads to block callbacks
Benefit from the newly introduced block callback infrastructure and
convert ndo_setup_tc calls for flower offloads to block callbacks.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 03:04:08 +01:00
Jiri Pirko 855afa0932 mlx5e_rep: Convert ndo_setup_tc offloads to block callbacks
Benefit from the newly introduced block callback infrastructure and
convert ndo_setup_tc calls for flower offloads to block callbacks.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 03:04:08 +01:00
Jiri Pirko 6ea30f8a97 ixgbe: Convert ndo_setup_tc offloads to block callbacks
Benefit from the newly introduced block callback infrastructure and
convert ndo_setup_tc calls for u32 offloads to block callbacks.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 03:04:08 +01:00
Jiri Pirko cd019e91a8 cxgb4: Convert ndo_setup_tc offloads to block callbacks
Benefit from the newly introduced block callback infrastructure and
convert ndo_setup_tc calls for flower and u32 offloads to block callbacks.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 03:04:07 +01:00
Jiri Pirko 9e0fd15dd6 bnxt: Convert ndo_setup_tc offloads to block callbacks
Benefit from the newly introduced block callback infrastructure and
convert ndo_setup_tc calls for flower offloads to block callbacks.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 03:04:07 +01:00
Jiri Pirko d6c862baaf mlx5e: Convert ndo_setup_tc offloads to block callbacks
Benefit from the newly introduced block callback infrastructure and
convert ndo_setup_tc calls for flower offloads to block callbacks.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 03:04:07 +01:00
Jiri Pirko eb49cfaa6b mlxsw: spectrum: Convert ndo_setup_tc offloads to block callbacks
Benefit from the newly introduced block callback infrastructure and
convert ndo_setup_tc calls for matchall and flower offloads to block
callbacks.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 03:04:07 +01:00
Jiri Pirko 3f7889c4c7 net: sched: cls_bpf: call block callbacks for offload
Use the newly introduced callbacks infrastructure and call block
callbacks alongside with the existing per-netdev ndo_setup_tc.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 03:04:07 +01:00
Jiri Pirko 245dc5121a net: sched: cls_u32: call block callbacks for offload
Use the newly introduced callbacks infrastructure and call block
callbacks alongside with the existing per-netdev ndo_setup_tc.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 03:04:07 +01:00
Jiri Pirko 7746041192 net: sched: cls_u32: swap u32_remove_hw_knode and u32_remove_hw_hnode
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 03:04:07 +01:00
Jiri Pirko 2447a96f88 net: sched: cls_matchall: call block callbacks for offload
Use the newly introduced callbacks infrastructure and call block
callbacks alongside with the existing per-netdev ndo_setup_tc.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 03:04:07 +01:00
Jiri Pirko 208c0f4b52 net: sched: use tc_setup_cb_call to call per-block callbacks
Extend the tc_setup_cb_call entrypoint function originally used only for
action egress devices callbacks to call per-block callbacks as well.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 03:04:07 +01:00
Jiri Pirko acb674428c net: sched: introduce per-block callbacks
Introduce infrastructure that allows drivers to register callbacks that
are called whenever tc would offload inserted rule for a specific block.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 03:04:06 +01:00
Jiri Pirko 6e40cf2d4d net: sched: use extended variants of block_get/put in ingress and clsact qdiscs
Use previously introduced extended variants of block get and put
functions. This allows to specify a binder types specific to clsact
ingress/egress which is useful for drivers to distinguish who actually
got the block.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 03:04:06 +01:00
Jiri Pirko 8c4083b30e net: sched: add block bind/unbind notif. and extended block_get/put
Introduce new type of ndo_setup_tc message to propage binding/unbinding
of a block to driver. Call this ndo whenever qdisc gets/puts a block.
Alongside with this, there's need to propagate binder type from qdisc
code down to the notifier. So introduce extended variants of
block_get/put in order to pass this info.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 03:04:06 +01:00
Matteo Croce 197df02cb3 udp: make some messages more descriptive
In the UDP code there are two leftover error messages with very few meaning.
Replace them with a more descriptive error message as some users
reported them as "strange network error".

Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 02:52:35 +01:00
Stefano Brivio 772e97b57a geneve: Fix function matching VNI and tunnel ID on big-endian
On big-endian machines, functions converting between tunnel ID
and VNI use the three LSBs of tunnel ID storage to map VNI.

The comparison function eq_tun_id_and_vni(), on the other hand,
attempted to map the VNI from the three MSBs. Fix it by using
the same check implemented on LE, which maps VNI from the three
LSBs of tunnel ID.

Fixes: 2e0b26e103 ("geneve: Optimize geneve device lookup.")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: Jakub Sitnicki <jkbs@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 02:50:42 +01:00
David S. Miller c69d75ae15 linux-can-fixes-for-4.14-20171019
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEE4bay/IylYqM/npjQHv7KIOw4HPYFAlnohyMTHG1rbEBwZW5n
 dXRyb25peC5kZQAKCRAe/sog7Dgc9jRFCACmSBz2H1yW5VfjrN3ZgoEoUo8P5qCZ
 7mXvTstTASF20T7kFqNS2K94CMVjBYRUIHvihp0LmPmPAmpmQRNedAssWuZpMflw
 xhaH8T8YWbyzHU2MBZP9WtTjrTilPEPcjvFSgYw6wpHW7VC3S/ffEN8Mj3ymR6bW
 KRw7gemXpvYUxPrGGCzyvFy4G+KptyPcAD6I8ceDUtdkwK4TXGYqEAoSqxtAIUHX
 dgRLzheOv8sDh+dM+ZNpH6UUkzK0gVHQ40lgXydZazEwPPq9zdj2Ec/6qpR5fkUH
 dnCIhzLNiCJOMGSfzkf0Esef1zBpU8oJmILbgf97CdayaW9cA1EUleGW
 =HYTx
 -----END PGP SIGNATURE-----

Merge tag 'linux-can-fixes-for-4.14-20171019' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
pull-request: can 2017-10-19

this is a pull request of 11 patches for the upcoming 4.14 release.

There are 6 patches by ZHU Yi for the flexcan driver, that work around
the CAN error handling state transition problems found in various
incarnations of the flexcan IP core.

The patch by Colin Ian King fixes a potential NULL pointer deref in the
CAN broad cast manager (bcm). One patch by me replaces a direct deref of a RCU
protected pointer by rcu_access_pointer. My second patch adds missing
OOM error handling in af_can. A patch by Stefan Mätje for the esd_usb2
driver fixes the dlc in received RTR frames. And the last patch is by
Wolfgang Grandegger, it fixes a busy loop in the gs_usb driver in case
it runs out of TX contexts.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 02:30:31 +01:00
Paolo Abeni b65f164d37 ipv6: let trace_fib6_table_lookup() dereference the fib table
The perf traces for ipv6 routing code show a relevant cost around
trace_fib6_table_lookup(), even if no trace is enabled. This is
due to the fib6_table de-referencing currently performed by the
caller.

Let's the tracing code pay this overhead, passing to the trace
helper the table pointer. This gives small but measurable
performance improvement under UDP flood.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 02:23:38 +01:00
David S. Miller f730cc9fee Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next
Johan Hedberg says:

====================
pull request: bluetooth-next 2017-10-19

Here's the first bluetooth-next pull request targeting the 4.15 kernel
release.

 - Multiple fixes & improvements to the hci_bcm driver
 - DT improvements, e.g. new local-bd-address property
 - Fixes & improvements to ECDH usage. Private key is now generated by
   the crypto subsystem.
 - gcc-4.9 warning fixes

Please let me know if there are any issues pulling. Thanks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 02:22:19 +01:00
Dexuan Cui b4562ca792 hv_sock: add locking in the open/close/release code paths
Without the patch, when hvs_open_connection() hasn't completely established
a connection (e.g. it has changed sk->sk_state to SS_CONNECTED, but hasn't
inserted the sock into the connected queue), vsock_stream_connect() may see
the sk_state change and return the connection to the userspace, and next
when the userspace closes the connection quickly, hvs_release() may not see
the connection in the connected queue; finally hvs_open_connection()
inserts the connection into the queue, but we won't be able to purge the
connection for ever.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Cathy Avery <cavery@redhat.com>
Cc: Rolf Neugebauer <rolf.neugebauer@docker.com>
Cc: Marcelo Cerri <marcelo.cerri@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 02:21:08 +01:00
Gavin Shan 0a90e25198 net/ncsi: Fix length of GVI response packet
The length of GVI (GetVersionInfo) response packet should be 40 instead
of 36. This issue was found from /sys/kernel/debug/ncsi/eth0/stats.

 # ethtool --ncsi eth0 swstats
     :
 RESPONSE     OK       TIMEOUT  ERROR
 =======================================
 GVI          0        0        2

With this applied, no error reported on GVI response packets:

 # ethtool --ncsi eth0 swstats
     :
 RESPONSE     OK       TIMEOUT  ERROR
 =======================================
 GVI          2        0        0

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Samuel Mendoza-Jonas <sam@mendozajonas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 01:56:38 +01:00
Gavin Shan 52b4c8627f net/ncsi: Enforce failover on link monitor timeout
The NCSI channel has been configured to provide service if its link
monitor timer is enabled, regardless of its state (inactive or active).
So the timeout event on the link monitor indicates the out-of-service
on that channel, for which a failover is needed.

This sets NCSI_DEV_RESHUFFLE flag to enforce failover on link monitor
timeout, regardless the channel's original state (inactive or active).
Also, the link is put into "down" state to give the failing channel
lowest priority when selecting for the active channel. The state of
failing channel should be set to active in order for deinitialization
and failover to be done.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Samuel Mendoza-Jonas <sam@mendozajonas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 01:56:38 +01:00
Gavin Shan 100ef01f3e net/ncsi: Disable HWA mode when no channels are found
When there are no NCSI channels probed, HWA (Hardware Arbitration)
mode is enabled. It's not correct because HWA depends on the fact:
NCSI channels exist and all of them support HWA mode. This disables
HWA when no channels are probed.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Samuel Mendoza-Jonas <sam@mendozajonas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 01:56:38 +01:00
Samuel Mendoza-Jonas 0795fb2021 net/ncsi: Stop monitor if channel times out or is inactive
ncsi_channel_monitor() misses stopping the channel monitor in several
places that it should, causing a WARN_ON_ONCE() to trigger when the
monitor is re-started later, eg:

[  459.040000] WARNING: CPU: 0 PID: 1093 at net/ncsi/ncsi-manage.c:269 ncsi_start_channel_monitor+0x7c/0x90
[  459.040000] CPU: 0 PID: 1093 Comm: kworker/0:3 Not tainted 4.10.17-gaca2fdd #140
[  459.040000] Hardware name: ASpeed SoC
[  459.040000] Workqueue: events ncsi_dev_work
[  459.040000] [<80010094>] (unwind_backtrace) from [<8000d950>] (show_stack+0x20/0x24)
[  459.040000] [<8000d950>] (show_stack) from [<801dbf70>] (dump_stack+0x20/0x28)
[  459.040000] [<801dbf70>] (dump_stack) from [<80018d7c>] (__warn+0xe0/0x108)
[  459.040000] [<80018d7c>] (__warn) from [<80018e70>] (warn_slowpath_null+0x30/0x38)
[  459.040000] [<80018e70>] (warn_slowpath_null) from [<803f6a08>] (ncsi_start_channel_monitor+0x7c/0x90)
[  459.040000] [<803f6a08>] (ncsi_start_channel_monitor) from [<803f7664>] (ncsi_configure_channel+0xdc/0x5fc)
[  459.040000] [<803f7664>] (ncsi_configure_channel) from [<803f8160>] (ncsi_dev_work+0xac/0x474)
[  459.040000] [<803f8160>] (ncsi_dev_work) from [<8002d244>] (process_one_work+0x1e0/0x450)
[  459.040000] [<8002d244>] (process_one_work) from [<8002d510>] (worker_thread+0x5c/0x570)
[  459.040000] [<8002d510>] (worker_thread) from [<80033614>] (kthread+0x124/0x164)
[  459.040000] [<80033614>] (kthread) from [<8000a5e8>] (ret_from_fork+0x14/0x2c)

This also updates the monitor instead of just returning if
ncsi_xmit_cmd() fails to send the get-link-status command so that the
monitor properly times out.

Fixes: e6f44ed6d0 "net/ncsi: Package and channel management"

Signed-off-by: Samuel Mendoza-Jonas <sam@mendozajonas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 01:56:38 +01:00
Samuel Mendoza-Jonas 6850d0f8b2 net/ncsi: Fix AEN HNCDSC packet length
Correct the value of the HNCDSC AEN packet.
Fixes: 7a82ecf4cf "net/ncsi: NCSI AEN packet handler"

Signed-off-by: Samuel Mendoza-Jonas <sam@mendozajonas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 01:56:38 +01:00
Eric Dumazet 164a5e7ad5 ipv4: ipv4_default_advmss() should use route mtu
ipv4_default_advmss() incorrectly uses the device MTU instead
of the route provided one. IPv6 has the proper behavior,
lets harmonize the two protocols.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 01:55:05 +01:00
Eric Dumazet 509c7a1ecc packet: avoid panic in packet_getsockopt()
syzkaller got crashes in packet_getsockopt() processing
PACKET_ROLLOVER_STATS command while another thread was managing
to change po->rollover

Using RCU will fix this bug. We might later add proper RCU annotations
for sparse sake.

In v2: I replaced kfree(rollover) in fanout_add() to kfree_rcu()
variant, as spotted by John.

Fixes: a9b6391814 ("packet: rollover statistics")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: John Sperbeck <jsperbeck@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 01:51:34 +01:00
David S. Miller 520d0d75dd Merge branch 'ieee802154-for-davem-2017-10-18' of git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan-next
Stefan Schmidt says:

====================
pull-request: ieee802154 2017-10-18

Please find below a pull request from the ieee802154 subsystem for net-next.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 01:45:56 +01:00
David Ahern 3c75f9b1b4 spectrum: Convert fib event handlers to use container_of on info arg
Use container_of to convert the generic fib_notifier_info into
the event specific data structure.

Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 01:45:17 +01:00
Eric Dumazet ba233b3474 tcp: fix tcp_send_syn_data()
syn_data was allocated by sk_stream_alloc_skb(), meaning
its destructor and _skb_refdst fields are mangled.

We need to call tcp_skb_tsorted_anchor_cleanup() before
calling kfree_skb() or kernel crashes.

Bug was reported by syzkaller bot.

Fixes: e2080072ed ("tcp: new list for sent but unacked skbs for RACK recovery")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 01:44:05 +01:00
David S. Miller 27188af5ab Merge branch 'ipv6-fixes-for-RTF_CACHE-entries'
Paolo Abeni says:

====================
ipv6: fixes for RTF_CACHE entries

This series addresses 2 different but related issues with RTF_CACHE
introduced by the recent refactory.

patch 1 restore the gc timer for such routes
patch 2 removes the aged out dst from the fib tree, properly coping with pMTU
routes

v1 -> v2:
 - dropped the  for ip route show cache
 - avoid touching dst.obsolete when the dst is aged out

v2 -> v3:
 - take care of pMTU exceptions
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 01:39:11 +01:00
Paolo Abeni 1859bac04f ipv6: remove from fib tree aged out RTF_CACHE dst
The commit 2b760fcf5c ("ipv6: hook up exception table to store
dst cache") partially reverted the commit 1e2ea8ad37 ("ipv6: set
dst.obsolete when a cached route has expired").

As a result, RTF_CACHE dst referenced outside the fib tree will
not be removed until the next sernum change; dst_check() does not
fail on aged-out dst, and dst->__refcnt can't decrease: the aged
out dst will stay valid for a potentially unlimited time after the
timeout expiration.

This change explicitly removes RTF_CACHE dst from the fib tree when
aged out. The rt6_remove_exception() logic will then obsolete the
dst and other entities will drop the related reference on next
dst_check().

pMTU exceptions are not aged-out, and are removed from the exception
table only when the - usually considerably longer - ip6_rt_mtu_expires
timeout expires.

v1 -> v2:
  - do not touch dst.obsolete in rt6_remove_exception(), not needed
v2 -> v3:
  - take care of pMTU exceptions, too

Fixes: 2b760fcf5c ("ipv6: hook up exception table to store dst cache")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Wei Wang <weiwan@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 01:39:10 +01:00
Paolo Abeni b886d5f2f2 ipv6: start fib6 gc on RTF_CACHE dst creation
After the commit 2b760fcf5c ("ipv6: hook up exception table
to store dst cache"), the fib6 gc is not started after the
creation of a RTF_CACHE via a redirect or pmtu update, since
fib6_add() isn't invoked anymore for such dsts.

We need the fib6 gc to run periodically to clean the RTF_CACHE,
or the dst will stay there forever.

Fix it by explicitly calling fib6_force_start_gc() on successful
exception creation. gc_args->more accounting will ensure that
the gc timer will run for whatever time needed to properly
clean the table.

v2 -> v3:
 - clarified the commit message

Fixes: 2b760fcf5c ("ipv6: hook up exception table to store dst cache")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Wei Wang <weiwan@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 01:39:10 +01:00
Eric Dumazet c92e8c02fe tcp/dccp: fix ireq->opt races
syzkaller found another bug in DCCP/TCP stacks [1]

For the reasons explained in commit ce1050089c ("tcp/dccp: fix
ireq->pktopts race"), we need to make sure we do not access
ireq->opt unless we own the request sock.

Note the opt field is renamed to ireq_opt to ease grep games.

[1]
BUG: KASAN: use-after-free in ip_queue_xmit+0x1687/0x18e0 net/ipv4/ip_output.c:474
Read of size 1 at addr ffff8801c951039c by task syz-executor5/3295

CPU: 1 PID: 3295 Comm: syz-executor5 Not tainted 4.14.0-rc4+ #80
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:16 [inline]
 dump_stack+0x194/0x257 lib/dump_stack.c:52
 print_address_description+0x73/0x250 mm/kasan/report.c:252
 kasan_report_error mm/kasan/report.c:351 [inline]
 kasan_report+0x25b/0x340 mm/kasan/report.c:409
 __asan_report_load1_noabort+0x14/0x20 mm/kasan/report.c:427
 ip_queue_xmit+0x1687/0x18e0 net/ipv4/ip_output.c:474
 tcp_transmit_skb+0x1ab7/0x3840 net/ipv4/tcp_output.c:1135
 tcp_send_ack.part.37+0x3bb/0x650 net/ipv4/tcp_output.c:3587
 tcp_send_ack+0x49/0x60 net/ipv4/tcp_output.c:3557
 __tcp_ack_snd_check+0x2c6/0x4b0 net/ipv4/tcp_input.c:5072
 tcp_ack_snd_check net/ipv4/tcp_input.c:5085 [inline]
 tcp_rcv_state_process+0x2eff/0x4850 net/ipv4/tcp_input.c:6071
 tcp_child_process+0x342/0x990 net/ipv4/tcp_minisocks.c:816
 tcp_v4_rcv+0x1827/0x2f80 net/ipv4/tcp_ipv4.c:1682
 ip_local_deliver_finish+0x2e2/0xba0 net/ipv4/ip_input.c:216
 NF_HOOK include/linux/netfilter.h:249 [inline]
 ip_local_deliver+0x1ce/0x6e0 net/ipv4/ip_input.c:257
 dst_input include/net/dst.h:464 [inline]
 ip_rcv_finish+0x887/0x19a0 net/ipv4/ip_input.c:397
 NF_HOOK include/linux/netfilter.h:249 [inline]
 ip_rcv+0xc3f/0x1820 net/ipv4/ip_input.c:493
 __netif_receive_skb_core+0x1a3e/0x34b0 net/core/dev.c:4476
 __netif_receive_skb+0x2c/0x1b0 net/core/dev.c:4514
 netif_receive_skb_internal+0x10b/0x670 net/core/dev.c:4587
 netif_receive_skb+0xae/0x390 net/core/dev.c:4611
 tun_rx_batched.isra.50+0x5ed/0x860 drivers/net/tun.c:1372
 tun_get_user+0x249c/0x36d0 drivers/net/tun.c:1766
 tun_chr_write_iter+0xbf/0x160 drivers/net/tun.c:1792
 call_write_iter include/linux/fs.h:1770 [inline]
 new_sync_write fs/read_write.c:468 [inline]
 __vfs_write+0x68a/0x970 fs/read_write.c:481
 vfs_write+0x18f/0x510 fs/read_write.c:543
 SYSC_write fs/read_write.c:588 [inline]
 SyS_write+0xef/0x220 fs/read_write.c:580
 entry_SYSCALL_64_fastpath+0x1f/0xbe
RIP: 0033:0x40c341
RSP: 002b:00007f469523ec10 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000718000 RCX: 000000000040c341
RDX: 0000000000000037 RSI: 0000000020004000 RDI: 0000000000000015
RBP: 0000000000000086 R08: 0000000000000000 R09: 0000000000000000
R10: 00000000000f4240 R11: 0000000000000293 R12: 00000000004b7fd1
R13: 00000000ffffffff R14: 0000000020000000 R15: 0000000000025000

Allocated by task 3295:
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
 save_stack+0x43/0xd0 mm/kasan/kasan.c:447
 set_track mm/kasan/kasan.c:459 [inline]
 kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:551
 __do_kmalloc mm/slab.c:3725 [inline]
 __kmalloc+0x162/0x760 mm/slab.c:3734
 kmalloc include/linux/slab.h:498 [inline]
 tcp_v4_save_options include/net/tcp.h:1962 [inline]
 tcp_v4_init_req+0x2d3/0x3e0 net/ipv4/tcp_ipv4.c:1271
 tcp_conn_request+0xf6d/0x3410 net/ipv4/tcp_input.c:6283
 tcp_v4_conn_request+0x157/0x210 net/ipv4/tcp_ipv4.c:1313
 tcp_rcv_state_process+0x8ea/0x4850 net/ipv4/tcp_input.c:5857
 tcp_v4_do_rcv+0x55c/0x7d0 net/ipv4/tcp_ipv4.c:1482
 tcp_v4_rcv+0x2d10/0x2f80 net/ipv4/tcp_ipv4.c:1711
 ip_local_deliver_finish+0x2e2/0xba0 net/ipv4/ip_input.c:216
 NF_HOOK include/linux/netfilter.h:249 [inline]
 ip_local_deliver+0x1ce/0x6e0 net/ipv4/ip_input.c:257
 dst_input include/net/dst.h:464 [inline]
 ip_rcv_finish+0x887/0x19a0 net/ipv4/ip_input.c:397
 NF_HOOK include/linux/netfilter.h:249 [inline]
 ip_rcv+0xc3f/0x1820 net/ipv4/ip_input.c:493
 __netif_receive_skb_core+0x1a3e/0x34b0 net/core/dev.c:4476
 __netif_receive_skb+0x2c/0x1b0 net/core/dev.c:4514
 netif_receive_skb_internal+0x10b/0x670 net/core/dev.c:4587
 netif_receive_skb+0xae/0x390 net/core/dev.c:4611
 tun_rx_batched.isra.50+0x5ed/0x860 drivers/net/tun.c:1372
 tun_get_user+0x249c/0x36d0 drivers/net/tun.c:1766
 tun_chr_write_iter+0xbf/0x160 drivers/net/tun.c:1792
 call_write_iter include/linux/fs.h:1770 [inline]
 new_sync_write fs/read_write.c:468 [inline]
 __vfs_write+0x68a/0x970 fs/read_write.c:481
 vfs_write+0x18f/0x510 fs/read_write.c:543
 SYSC_write fs/read_write.c:588 [inline]
 SyS_write+0xef/0x220 fs/read_write.c:580
 entry_SYSCALL_64_fastpath+0x1f/0xbe

Freed by task 3306:
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
 save_stack+0x43/0xd0 mm/kasan/kasan.c:447
 set_track mm/kasan/kasan.c:459 [inline]
 kasan_slab_free+0x71/0xc0 mm/kasan/kasan.c:524
 __cache_free mm/slab.c:3503 [inline]
 kfree+0xca/0x250 mm/slab.c:3820
 inet_sock_destruct+0x59d/0x950 net/ipv4/af_inet.c:157
 __sk_destruct+0xfd/0x910 net/core/sock.c:1560
 sk_destruct+0x47/0x80 net/core/sock.c:1595
 __sk_free+0x57/0x230 net/core/sock.c:1603
 sk_free+0x2a/0x40 net/core/sock.c:1614
 sock_put include/net/sock.h:1652 [inline]
 inet_csk_complete_hashdance+0xd5/0xf0 net/ipv4/inet_connection_sock.c:959
 tcp_check_req+0xf4d/0x1620 net/ipv4/tcp_minisocks.c:765
 tcp_v4_rcv+0x17f6/0x2f80 net/ipv4/tcp_ipv4.c:1675
 ip_local_deliver_finish+0x2e2/0xba0 net/ipv4/ip_input.c:216
 NF_HOOK include/linux/netfilter.h:249 [inline]
 ip_local_deliver+0x1ce/0x6e0 net/ipv4/ip_input.c:257
 dst_input include/net/dst.h:464 [inline]
 ip_rcv_finish+0x887/0x19a0 net/ipv4/ip_input.c:397
 NF_HOOK include/linux/netfilter.h:249 [inline]
 ip_rcv+0xc3f/0x1820 net/ipv4/ip_input.c:493
 __netif_receive_skb_core+0x1a3e/0x34b0 net/core/dev.c:4476
 __netif_receive_skb+0x2c/0x1b0 net/core/dev.c:4514
 netif_receive_skb_internal+0x10b/0x670 net/core/dev.c:4587
 netif_receive_skb+0xae/0x390 net/core/dev.c:4611
 tun_rx_batched.isra.50+0x5ed/0x860 drivers/net/tun.c:1372
 tun_get_user+0x249c/0x36d0 drivers/net/tun.c:1766
 tun_chr_write_iter+0xbf/0x160 drivers/net/tun.c:1792
 call_write_iter include/linux/fs.h:1770 [inline]
 new_sync_write fs/read_write.c:468 [inline]
 __vfs_write+0x68a/0x970 fs/read_write.c:481
 vfs_write+0x18f/0x510 fs/read_write.c:543
 SYSC_write fs/read_write.c:588 [inline]
 SyS_write+0xef/0x220 fs/read_write.c:580
 entry_SYSCALL_64_fastpath+0x1f/0xbe

Fixes: e994b2f0fb ("tcp: do not lock listener to process SYN packets")
Fixes: 079096f103 ("tcp/dccp: install syn_recv requests into ehash table")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 01:33:19 +01:00
Linus Torvalds 9c323bff13 Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:
 "Three fixes this time around:

   - ensure sparse realises that we're building for a 32-bit arch on
     64-bit hosts.

   - use the correct instruction for semihosting on v7m (nommu) CPUs.

   - reserve address 0 to prevent the first page of memory being used on
     nommu systems"

* 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: 8704/1: semihosting: use proper instruction on v7m processors
  ARM: 8701/1: fix sparse flags for build on 64bit machines
  ARM: 8700/1: nommu: always reserve address 0 away
2017-10-20 18:20:17 -04:00
Linus Torvalds 545ea16f7c ARM: SoC fixes for 4.14
Here is another set of bugfixes for ARM SoCs, mostly harmless:
 
  - A boot regression fix on ux500
  - PCIe interrupts on NXP i.MX7 and on Marvell Armada 7K/8K were
    wired up wrong, in different ways
  - Armada XP support for large memory never worked
  - The socfpga reset controller now builds on 64-bit
  - minor device tree corrections on gemini, mvebu, r-pi 3,
    rockchip and at91
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIVAwUAWeppyWCrR//JCVInAQIfVQ//cQDz9jv//4tMhBJeLxpcTe62U0HL3gG/
 pCUXUxUKQxTTuvUm2Ecx+YpeFfS4hDEOAVi8wDAgJ6yBhW+jrIasiVq5XaspR6/C
 DicscLdW3YJ6hjBfk87mbC7F6wu8aTzZa4xkwjJ1L0XbNSq7oOBaoff8dhMNnzC8
 w1HuLu/laAaTEhiHZ1M/hkjx9VxA2j0AQvhdV7Ebrh3Wk+2wYB1QhWngLgHBIZsC
 1VVOtHCtRtfrIBSnHjjx/Wvcwln6fUNUXJGgp3K/oJeNE6M2D2FxA0ylB6YiYmmf
 jRdXypqiE6xjRJa7yru/Q1LfLzt5dQibSQYNsU+ljJ7Wp/3V9F1Ms43hXROunPRN
 Exebvi7BDjGO+/MxgD/FLFltndJRsmnZPD0+M8+zWcHIrC1N1ULrBO8HfdIWxm/x
 nNOvx17dEUCg8azwFlmfjxXIq4p/4qv+LzbkFVPOlPf1y4xzvMthkAyKiN1vGVoO
 d5O7HeTc4ciPG/qxcL4zdwDo4qgeMTDJNrTRJ+4oxjkKxHDHH3gpYXKdKWCFXT+W
 5/Y+kIYEQChNNRdXqQ1Xx4xgdZaHfu6J3QrfNs0291yjLDUyCjk4FtDZBzlWO/SH
 F8rE79QOwYbZGgth2b/RqX4/IxAzE7wviUVlBUja4NcPcDkwk2hPHqiyfkYmXaTn
 a80A7sWIjSM=
 =M8dW
 -----END PGP SIGNATURE-----

Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes from Arnd Bergmann:
 "Here is another set of bugfixes for ARM SoCs, mostly harmless:

   - a boot regression fix on ux500

   - PCIe interrupts on NXP i.MX7 and on Marvell Armada 7K/8K were wired
     up wrong, in different ways

   - Armada XP support for large memory never worked

   - the socfpga reset controller now builds on 64-bit

   - minor device tree corrections on gemini, mvebu, r-pi 3, rockchip
     and at91"

* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM: ux500: Fix regression while init PM domains
  ARM: dts: fix PCLK name on Gemini and MOXA ART
  arm64: dts: rockchip: fix typo in iommu nodes
  arm64: dts: rockchip: correct vqmmc voltage for rk3399 platforms
  ARM: dts: imx7d: Invert legacy PCI irq mapping
  bus: mbus: fix window size calculation for 4GB windows
  ARM: dts: at91: sama5d2: add ADC hw trigger edge type
  ARM: dts: at91: sama5d2_xplained: enable ADTRG pin
  ARM: dts: at91: at91-sama5d27_som1: fix PHY ID
  ARM: dts: bcm283x: Fix console path on RPi3
  reset: socfpga: fix for 64-bit compilation
  ARM: dts: Fix I2C repeated start issue on Armada-38x
  arm64: dts: marvell: fix interrupt-map property for Armada CP110 PCIe controller
  arm64: dts: salvator-common: add 12V regulator to backlight
  ARM: dts: sun6i: Fix endpoint IDs in second display pipeline
  arm64: allwinner: a64: pine64: Use dcdc1 regulator for mmc0
2017-10-20 18:17:43 -04:00
Arnd Bergmann 6bf99a6cb6 Allwinner fixes for 4.14
Two fixes, one for the A31 DRM binding, and one for a missing regulator on
 the pine MMC controller.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJZ6axFAAoJEBx+YmzsjxAgRPQQAL8SNIbznqcV1ncW1SXBH/hg
 W1UCfclJfCJ1nrctsfXIgDpIJAGjVR+PEh8kBHgyqknqLJ6bQpJOqfHzSZk+SWF4
 1NuosdxLMW9V9wrwzNUUYD6Jh3VoJAgKDcWBPeY9eUvvLq6wnzXmSXPBtTUlXuNp
 XcXoT7TCSlUZ0rvJKPe2ON+BH1hjYhNnHs07TN2x2lbYQMbEcLLzqBOyfxESzQ5w
 hAb8gpJhGSDAk2pJXtyviSNokx5fqSePnKmfPNG42QHXq+cvt6aCcosqZ9u3OuAx
 eNTVTZPlvnQ/GfjEouG4NTjYbv5cXdN8itqaSypMeN+8xpOJ/mFDa8K/vzyzF2Kr
 6svpe4SC0YB6z4YtKFLR0Q6a/MlgMNq02WW5l+oq8e44pwyPRYFeTNNP8yD9ZO0k
 xlhgNyo+/KIXGx6XBga27x3IyaWopGslLK/UjG4El0jOAPISiuZcbF6GCsFht7dk
 YSVEVQ842v2iX817kaDy1zGTOy0b9j9/AOu6ctZlsP9XM9YaMxi7pFrb+UFu6FSJ
 yRR8TNZjjrSMsuDc8yrbH0/nWcgkmQtYXa2iQ4/2ILlW63zrm2yM7w8CKygyZ25D
 NhuK/yQ+PEsKYDAJ3s3T3hUbirAmQx3KUmv6Jr7UUPaj6V5a62Jl/pt5+oSlGQm7
 EBkvmIV+XXwmqng7Y/WF
 =Cvge
 -----END PGP SIGNATURE-----

Merge tag 'sunxi-fixes-for-4.14' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into fixes

Pull "Allwinner fixes for 4.14" from Maxime Ripard:

Two fixes, one for the A31 DRM binding, and one for a missing regulator on
the pine MMC controller.

* tag 'sunxi-fixes-for-4.14' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux:
  ARM: dts: sun6i: Fix endpoint IDs in second display pipeline
  arm64: allwinner: a64: pine64: Use dcdc1 regulator for mmc0
2017-10-20 22:24:48 +02:00
Kees Cook 1c9fec470b waitid(): Avoid unbalanced user_access_end() on access_ok() error
As pointed out by Linus and David, the earlier waitid() fix resulted in
a (currently harmless) unbalanced user_access_end() call.  This fixes it
to just directly return EFAULT on access_ok() failure.

Fixes: 96ca579a1e ("waitid(): Add missing access_ok() checks")
Acked-by: David Daney <david.daney@cavium.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-20 15:32:54 -04:00
David S. Miller 7f9ad2ace1 Merge branch 'bpf-lsm-hooks'
Chenbo Feng says:

====================
bpf: security: New file mode and LSM hooks for eBPF object permission control

Much like files and sockets, eBPF objects are accessed, controlled, and
shared via a file descriptor (FD). Unlike files and sockets, the
existing mechanism for eBPF object access control is very limited.
Currently there are two options for granting accessing to eBPF
operations: grant access to all processes, or only CAP_SYS_ADMIN
processes. The CAP_SYS_ADMIN-only mode is not ideal because most users
do not have this capability and granting a user CAP_SYS_ADMIN grants too
many other security-sensitive permissions. It also unnecessarily allows
all CAP_SYS_ADMIN processes access to eBPF functionality. Allowing all
processes to access to eBPF objects is also undesirable since it has
potential to allow unprivileged processes to consume kernel memory, and
opens up attack surface to the kernel.

Adding LSM hooks maintains the status quo for systems which do not use
an LSM, preserving compatibility with userspace, while allowing security
modules to choose how best to handle permissions on eBPF objects. Here
is a possible use case for the lsm hooks with selinux module:

The network-control daemon (netd) creates and loads an eBPF object for
network packet filtering and analysis. It passes the object FD to an
unprivileged network monitor app (netmonitor), which is not allowed to
create, modify or load eBPF objects, but is allowed to read the traffic
stats from the map.

Selinux could use these hooks to grant the following permissions:
allow netd self:bpf_map { create read write};
allow netmonitor netd:fd use;
allow netmonitor netd:bpf_map read;

In this patch series, A file mode is added to bpf map to store the
accessing mode. With this file mode flags, the map can be obtained read
only, write only or read and write. With the help of this file mode,
several security hooks can be added to the eBPF syscall implementations
to do permissions checks. These LSM hooks are mainly focused on checking
the process privileges before it obtains the fd for a specific bpf
object. No matter from a file location or from a eBPF id. Besides that,
a general check hook is also implemented at the start of bpf syscalls so
that each security module can have their own implementation on the reset
of bpf object related functionalities.

In order to store the ownership and security information about eBPF
maps, a security field pointer is added to the struct bpf_map. And the
last two patch set are implementation of selinux check on these hooks
introduced, plus an additional check when eBPF object is passed between
processes using unix socket as well as binder IPC.

Change since V1:

 - Whitelist the new bpf flags in the map allocate check.
 - Added bpf selftest for the new flags.
 - Added two new security hooks for copying the security information from
   the bpf object security struct to file security struct
 - Simplified the checking action when bpf fd is passed between processes.

 Change since V2:

 - Fixed the line break problem for map flags check
 - Fixed the typo in selinux check of file mode.
 - Merge bpf_map and bpf_prog into one selinux class
 - Added bpf_type and bpf_sid into file security struct to store the
   security information when generate fd.
 - Add the hook to bpf_map_new_fd and bpf_prog_new_fd.

 Change since V3:

 - Return the actual error from security check instead of -EPERM
 - Move the hooks into anon_inode_getfd() to avoid get file again after
   bpf object file is installed with fd.
 - Removed the bpf_sid field inside file_scerity_struct to reduce the
   cache size.

 Change since V4:

 - Rename bpf av prog_use to prog_run to distinguish from fd_use.
 - Remove the bpf_type field inside file_scerity_struct and use bpf fops
   to indentify bpf object instead.

 Change since v5:

 - Fixed the incorrect selinux class name for SECCLASS_BPF

 Change since v7:

 - Fixed the build error caused by xt_bpf module.
 - Add flags check for bpf_obj_get() and bpf_map_get_fd_by_id() to make it
   uapi-wise.
 - Add the flags field to the bpf_obj_get_user function when BPF_SYSCALL
   is not configured.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-20 13:33:00 +01:00
Chenbo Feng f66e448cfd selinux: bpf: Add addtional check for bpf object file receive
Introduce a bpf object related check when sending and receiving files
through unix domain socket as well as binder. It checks if the receiving
process have privilege to read/write the bpf map or use the bpf program.
This check is necessary because the bpf maps and programs are using a
anonymous inode as their shared inode so the normal way of checking the
files and sockets when passing between processes cannot work properly on
eBPF object. This check only works when the BPF_SYSCALL is configured.

Signed-off-by: Chenbo Feng <fengc@google.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Reviewed-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-20 13:32:59 +01:00
Chenbo Feng ec27c3568a selinux: bpf: Add selinux check for eBPF syscall operations
Implement the actual checks introduced to eBPF related syscalls. This
implementation use the security field inside bpf object to store a sid that
identify the bpf object. And when processes try to access the object,
selinux will check if processes have the right privileges. The creation
of eBPF object are also checked at the general bpf check hook and new
cmd introduced to eBPF domain can also be checked there.

Signed-off-by: Chenbo Feng <fengc@google.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-20 13:32:59 +01:00
Chenbo Feng afdb09c720 security: bpf: Add LSM hooks for bpf object related syscall
Introduce several LSM hooks for the syscalls that will allow the
userspace to access to eBPF object such as eBPF programs and eBPF maps.
The security check is aimed to enforce a per object security protection
for eBPF object so only processes with the right priviliges can
read/write to a specific map or use a specific eBPF program. Besides
that, a general security hook is added before the multiplexer of bpf
syscall to check the cmd and the attribute used for the command. The
actual security module can decide which command need to be checked and
how the cmd should be checked.

Signed-off-by: Chenbo Feng <fengc@google.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-20 13:32:59 +01:00
Chenbo Feng e043325b30 bpf: Add tests for eBPF file mode
Two related tests are added into bpf selftest to test read only map and
write only map. The tests verified the read only and write only flags
are working on hash maps.

Signed-off-by: Chenbo Feng <fengc@google.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-20 13:32:59 +01:00
Chenbo Feng 6e71b04a82 bpf: Add file mode configuration into bpf maps
Introduce the map read/write flags to the eBPF syscalls that returns the
map fd. The flags is used to set up the file mode when construct a new
file descriptor for bpf maps. To not break the backward capability, the
f_flags is set to O_RDWR if the flag passed by syscall is 0. Otherwise
it should be O_RDONLY or O_WRONLY. When the userspace want to modify or
read the map content, it will check the file mode to see if it is
allowed to make the change.

Signed-off-by: Chenbo Feng <fengc@google.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-20 13:32:59 +01:00