alistair23-linux

redonkable

Author	SHA1	Message	Date
David S. Miller	a2b4eb55ca	Merge branch 'bpf-misc' Daniel Borkmann says: ==================== Misc BPF improvements This last series for this window adds various misc improvements to BPF, one is to mark registered map and prog types as __ro_after_init, another one for removing cBPF stubs in eBPF JITs and moving the stub to the core and last also improving JITs is to make generated images visible to the kernel and kallsyms, so they can be seen in traces. For details, please have a look at the individual patches. Thanks a lot! ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-17 13:40:06 -05:00
Daniel Borkmann	74451e66d5	bpf: make jited programs visible in traces Long standing issue with JITed programs is that stack traces from function tracing check whether a given address is kernel code through {__,}kernel_text_address(), which checks for code in core kernel, modules and dynamically allocated ftrace trampolines. But what is still missing is BPF JITed programs (interpreted programs are not an issue as __bpf_prog_run() will be attributed to them), thus when a stack trace is triggered, the code walking the stack won't see any of the JITed ones. The same for address correlation done from user space via reading /proc/kallsyms. This is read by tools like perf, but the latter is also useful for permanent live tracing with eBPF itself in combination with stack maps when other eBPF types are part of the callchain. See offwaketime example on dumping stack from a map. This work tries to tackle that issue by making the addresses and symbols known to the kernel. The lookup from *kernel_text_address() is implemented through a latched RB tree that can be read under RCU in fast-path that is also shared for symbol/size/offset lookup for a specific given address in kallsyms. The slow-path iteration through all symbols in the seq file done via RCU list, which holds a tiny fraction of all exported ksyms, usually below 0.1 percent. Function symbols are exported as bpf_prog_<tag>, in order to aide debugging and attribution. This facility is currently enabled for root-only when bpf_jit_kallsyms is set to 1, and disabled if hardening is active in any mode. The rationale behind this is that still a lot of systems ship with world read permissions on kallsyms thus addresses should not get suddenly exposed for them. If that situation gets much better in future, we always have the option to change the default on this. Likewise, unprivileged programs are not allowed to add entries there either, but that is less of a concern as most such programs types relevant in this context are for root-only anyway. If enabled, call graphs and stack traces will then show a correct attribution; one example is illustrated below, where the trace is now visible in tooling such as perf script --kallsyms=/proc/kallsyms and friends. Before: 7fff8166889d bpf_clone_redirect+0x80007f0020ed (/lib/modules/4.9.0-rc8+/build/vmlinux) f5d80 __sendmsg_nocancel+0xffff006451f1a007 (/usr/lib64/libc-2.18.so) After: 7fff816688b7 bpf_clone_redirect+0x80007f002107 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fffa0575728 bpf_prog_33c45a467c9e061a+0x8000600020fb (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fffa07ef1fc cls_bpf_classify+0x8000600020dc (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff81678b68 tc_classify+0x80007f002078 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff8164d40b __netif_receive_skb_core+0x80007f0025fb (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff8164d718 __netif_receive_skb+0x80007f002018 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff8164e565 process_backlog+0x80007f002095 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff8164dc71 net_rx_action+0x80007f002231 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff81767461 __softirqentry_text_start+0x80007f0020d1 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff817658ac do_softirq_own_stack+0x80007f00201c (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff810a2c20 do_softirq+0x80007f002050 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff810a2cb5 __local_bh_enable_ip+0x80007f002085 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff8168d452 ip_finish_output2+0x80007f002152 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff8168ea3d ip_finish_output+0x80007f00217d (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff8168f2af ip_output+0x80007f00203f (/lib/modules/4.9.0-rc8+/build/vmlinux) [...] 7fff81005854 do_syscall_64+0x80007f002054 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff817649eb return_from_SYSCALL_64+0x80007f002000 (/lib/modules/4.9.0-rc8+/build/vmlinux) f5d80 __sendmsg_nocancel+0xffff01c484812007 (/usr/lib64/libc-2.18.so) Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-17 13:40:05 -05:00
Daniel Borkmann	9383191da4	bpf: remove stubs for cBPF from arch code Remove the dummy bpf_jit_compile() stubs for eBPF JITs and make that a single __weak function in the core that can be overridden similarly to the eBPF one. Also remove stale pr_err() mentions of bpf_jit_compile. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-17 13:40:04 -05:00
Daniel Borkmann	c78f8bdfa1	bpf: mark all registered map/prog types as __ro_after_init All map types and prog types are registered to the BPF core through bpf_register_map_type() and bpf_register_prog_type() during init and remain unchanged thereafter. As by design we don't (and never will) have any pluggable code that can register to that at any later point in time, lets mark all the existing bpf_{map,prog}_type_list objects in the tree as __ro_after_init, so they can be moved to read-only section from then onwards. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-17 13:40:04 -05:00
Roopa Prabhu	afcb50ba7f	bridge: vlan_tunnel: explicitly reset metadata attrs to NULL on failure Fixes: `efa5356b0d` ("bridge: per vlan dst_metadata netlink support") Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-17 13:33:41 -05:00
Tobias Klauser	6850f8b509	net: bgmac: store MAC address directly in netdev->dev_addr After commit `34a5102c32` ("net: bgmac: allocate struct bgmac just once & don't copy it") the mac_addr member of struct bgmac is no longer necessary to pass the MAC address to bgmac_enet_probe(). Instead it can directly be stored in netdev->dev_addr. Also use eth_hw_addr_random() instead of eth_random_addr() in case a random MAC is nedded. This will make sure netdev->addr_assign_type will be properly set. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Acked-by: Jon Mason <jon.mason@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-17 13:03:39 -05:00
David S. Miller	3105dfb2a9	wireless-drivers-next patches for 4.11 Mostly small fixes, not really any new features. Major changes: ath10k * when trying older firmware versions don't confuse user with error messages ath9k * fix crash in AP mode (regression) * fix relayfs crash (regression) * fix initialisation with AR9340 and AR9550 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJYpamYAAoJEG4XJFUm622b7FAIAKLgTCq5S3mNFGsuA+r+2v8L HwDSmIF1bd/L74h2i08rHuEycU3h92RXPCYFZT8bR1fbU58BZlO4YLc0Ey9ug5v7 1uc2lB00j2CJJ97XuW5HK3ZGA9jv/V5P+N8GM0Z1UdxG7J12h++MSVu7lKyDrPus tPrA4C+/iANbPiSJpqgS+Zeyd6qUHa+udTNTnL8XxuztKLrAH5DDiHdxiGIpDmKt HhyAeK0A+vJYdcGCIzD2RXpw/VXUvmF/p7WiR6/3LYRXNNpZoC+U/MdLKh5LBj1C hpdlNBmqV+Wf7g7W3CrM3ZVZm8s+3ad8+2rIiW9lqE/UEivVmAZ70iuSppqC/nE= =WwLM -----END PGP SIGNATURE----- Merge tag 'wireless-drivers-next-for-davem-2017-02-16' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next Kalle Valo says: ==================== wireless-drivers-next patches for 4.11 Mostly small fixes, not really any new features. Major changes: ath10k * when trying older firmware versions don't confuse user with error messages ath9k * fix crash in AP mode (regression) * fix relayfs crash (regression) * fix initialisation with AR9340 and AR9550 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-17 13:00:45 -05:00
Tobias Klauser	6d6a505a1e	net: ethoc: Use eth_hw_addr_random() Use eth_hw_addr_random() to set a random dev_addr and update addr_assign_type instead of open-coding it. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-17 12:42:24 -05:00
David S. Miller	e1c151a479	Merge branch 'rhashtable-allocation-failure-during-insertion' Herbert Xu says: ==================== rhashtable: Handle table allocation failure during insertion v2 - Added Ack to patch 2. Fixed RCU annotation in code path executed by rehasher by using rht_dereference_bucket. v1 - This series tackles the problem of table allocation failures during insertion. The issue is that we cannot vmalloc during insertion. This series deals with this by introducing nested tables. The first two patches removes manual hash table walks which cannot work on a nested table. The final patch introduces nested tables. I've tested this with test_rhashtable and it appears to work. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-17 12:28:44 -05:00
Herbert Xu	da20420f83	rhashtable: Add nested tables This patch adds code that handles GFP_ATOMIC kmalloc failure on insertion. As we cannot use vmalloc, we solve it by making our hash table nested. That is, we allocate single pages at each level and reach our desired table size by nesting them. When a nested table is created, only a single page is allocated at the top-level. Lower levels are allocated on demand during insertion. Therefore for each insertion to succeed, only two (non-consecutive) pages are needed. After a nested table is created, a rehash will be scheduled in order to switch to a vmalloced table as soon as possible. Also, the rehash code will never rehash into a nested table. If we detect a nested table during a rehash, the rehash will be aborted and a new rehash will be scheduled. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-17 12:28:35 -05:00
Herbert Xu	40f9f43970	tipc: Fix tipc_sk_reinit race conditions There are two problems with the function tipc_sk_reinit. Firstly it's doing a manual walk over an rhashtable. This is broken as an rhashtable can be resized and if you manually walk over it during a resize then you may miss entries. Secondly it's missing memory barriers as previously the code used spinlocks which provide the barriers implicitly. This patch fixes both problems. Fixes: `07f6c4bc04` ("tipc: convert tipc reference table to...") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-17 12:28:35 -05:00
Herbert Xu	98687f426b	gfs2: Use rhashtable walk interface in glock_hash_walk The function glock_hash_walk walks the rhashtable by hand. This is broken because if it catches the hash table in the middle of a rehash, then it will miss entries. This patch replaces the manual walk by using the rhashtable walk interface. Fixes: `88ffbf3e03` ("GFS2: Use resizable hash table for glocks") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-17 12:28:22 -05:00
Jisheng Zhang	4581be42fc	net: mvneta: make mvneta_eth_tool_ops static The mvneta_eth_tool_ops is only used internally in mvneta driver, so make it static. Signed-off-by: Jisheng Zhang <jszhang@marvell.com> Acked-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-17 12:16:29 -05:00
David S. Miller	4b5026ade5	Merge branch 'net-sched-reflect-hw-offload-in-classifiers' Or Gerlitz says: ==================== net/sched: Reflect HW offload status in classifiers Currently there is no way of querying whether a filter is offloaded to HW or not when using "both" policy (where none of skip_sw or skip_hw flags are set by user-space). Added two new flags, "in hw" and "not in hw" such that user space can determine if a filter is actually offloaded to hw. The "in hw" UAPI semantics was chosen so it's similar to the "skip hw" flag logic. If none of these two flags are set, this signals running over older kernel. As an example, add one vlan push + fwd rule, one matchall rule and one u32 rule without any flags, and another vlan + fwd skip_sw rule, such that the different TC classifier attempt to offload all of them -- all over mlx5 SRIOV VF rep: flower skip_sw indev eth2_0 src_mac e4:11:22:33:44:50 dst_mac e4:1d:2d:a5:f3:9d action vlan push id 52 action mirred egress redirect dev eth2 flower indev eth2_0 src_mac e4:11:22:33:44:50 dst_mac e4:11:22:33:44:51 action vlan push id 53 action mirred egress redirect dev eth2 u32 ht 800: flowid 800:1 match ip src 192.168.1.0/24 action drop Since that VF rep doesn't offload matchall/u32 and can currently offload only one vlan push rule we expect three of the rules not to be offloaded: filter protocol ip pref 99 u32 filter protocol ip pref 99 u32 fh 800: ht divisor 1 filter protocol ip pref 99 u32 fh 800::1 order 1 key ht 800 bkt 0 flowid 800:1 not in_hw match c0a80100/ffffff00 at 12 action order 1: gact action drop random type none pass val 0 index 8 ref 1 bind 1 filter protocol all pref 49150 matchall filter protocol all pref 49150 matchall handle 0x1 not in_hw action order 1: mirred (Egress Mirror to device veth1) pipe index 27 ref 1 bind 1 filter protocol ip pref 49151 flower filter protocol ip pref 49151 flower handle 0x1 indev eth2_0 dst_mac e4:11:22:33:44:51 src_mac e4:11:22:33:44:50 eth_type ipv4 not in_hw action order 1: vlan push id 53 protocol 802.1Q priority 0 pipe index 20 ref 1 bind 1 action order 2: mirred (Egress Redirect to device eth2) stolen index 26 ref 1 bind 1 filter protocol ip pref 49152 flower filter protocol ip pref 49152 flower handle 0x1 indev eth2_0 dst_mac e4:1d:2d:a5:f3:9d src_mac e4:11:22:33:44:50 eth_type ipv4 skip_sw in_hw action order 1: vlan push id 52 protocol 802.1Q priority 0 pipe index 19 ref 1 bind 1 action order 2: mirred (Egress Redirect to device eth2) stolen index 25 ref 1 bind 1 v3 --> v4 changes: - removed extra parenthesis (Dave) v2 --> v3 changes: - fixed the matchall dump flags patch to do proper checks (Jakub) - added the same proper checks to flower where they were missing - that flower patch was added as #1 and hence all the other patches are offed-by-one v1 --> v2 changes: - applied feedback from Jakub and Dave -- where none of the skip flags were set, the suggested approach didn't allow user space to distringuish between old kernel to a case when offloading to HW worked fine. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-17 12:08:07 -05:00
Or Gerlitz	5cecb6cc00	net/sched: cls_bpf: Reflect HW offload status BPF classifier support for the "in hw" offloading flags. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Amir Vadai <amir@vadai.me> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-17 12:08:06 -05:00
Or Gerlitz	24d3dc6d27	net/sched: cls_u32: Reflect HW offload status U32 support for the "in hw" offloading flags. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Amir Vadai <amir@vadai.me> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-17 12:08:06 -05:00
Or Gerlitz	c7d2b2f5ee	net/sched: cls_matchall: Reflect HW offloading status Matchall support for the "in hw" offloading flags. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Amir Vadai <amir@vadai.me> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-17 12:08:06 -05:00
Or Gerlitz	55593960d0	net/sched: cls_flower: Reflect HW offload status Flower support for the "in hw" offloading flags. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Amir Vadai <amir@vadai.me> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-17 12:08:05 -05:00
Or Gerlitz	e696028acc	net/sched: Reflect HW offload status Currently there is no way of querying whether a filter is offloaded to HW or not when using "both" policy (where none of skip_sw or skip_hw flags are set by user-space). Add two new flags, "in hw" and "not in hw" such that user space can determine if a filter is actually offloaded to hw or not. The "in hw" UAPI semantics was chosen so it's similar to the "skip hw" flag logic. If none of these two flags are set, this signals running over older kernel. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Amir Vadai <amir@vadai.me> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-17 12:08:05 -05:00
Or Gerlitz	7a335adad8	net/sched: cls_matchall: Dump the classifier flags The classifier flags are not dumped to user-space, do that. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-17 12:08:05 -05:00
Or Gerlitz	749e6720d2	net/sched: cls_flower: Properly handle classifier flags dumping Dump the classifier flags only if non zero and make sure to check the return status of the handler that puts them into the netlink msg. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-17 12:08:04 -05:00
Philippe Reynes	ad8e963ca6	net: oki-semi: pch_gbe: use new api ethtool_{get\|set}_link_ksettings The ethtool api {get\|set}_settings is deprecated. We move this driver to new api {get\|set}_link_ksettings. As I don't have the hardware, I'd be very pleased if someone may test this patch. Signed-off-by: Philippe Reynes <tremyfr@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-17 12:00:14 -05:00
Ivan Khoronzhuk	9fe9aa0b73	net: ethernet: ti: cpsw: correct ale dev to cpsw The ale is a property of cpsw, so change dev to cpsw->dev, aka pdev->dev, to be consistent. Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-17 11:12:32 -05:00
Philippe Reynes	0fa9e2899c	net: nvidia: forcedeth: use new api ethtool_{get\|set}_link_ksettings The ethtool api {get\|set}_settings is deprecated. We move this driver to new api {get\|set}_link_ksettings. As I don't have the hardware, I'd be very pleased if someone may test this patch. Signed-off-by: Philippe Reynes <tremyfr@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-17 11:05:33 -05:00
David S. Miller	af6e2b5b8c	Merge branch 'ptp-attribute-cleanup' Dmitry Torokhov says: ==================== PTP attribute handling cleanup PTP core was creating some attributes, such as "period" and "fifo", and the entire "pins" attribute group, after creating class deevice, which creates a race for userspace: uevent may arrive before all attributes are created. This series of patches switches PTP to use is_visible() to control visibility of attributes in a group, and device_create_with_groups() to ensure that attributes are created before we notify userspace of a new device. v2: - added Richard's acked-by to patch #1 - removed use of kmalloc_array in favor of kcalloc in patch #2 at Richard's request - added a cover letter v1: - initial patch set ==================== Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-17 11:03:07 -05:00
Dmitry Torokhov	85a66e5501	ptp: create "pins" together with the rest of attributes Let's switch to using device_create_with_groups(), which will allow us to create "pins" attribute group together with the rest of ptp device attributes, and before userspace gets notified about ptp device creation. Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-17 11:03:06 -05:00
Dmitry Torokhov	af59e717d5	ptp: use is_visible method to hide unused attributes Instead of creating selected attributes after the device is created (and after userspace potentially seen uevent), lets use attribute group is_visible() method to control which attributes are shown. This will allow us to create all attributes (except "pins" group, which will be taken care of later) before userspace gets notified about new ptp class device. Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-17 11:03:06 -05:00
Dmitry Torokhov	6f7aa56bae	ptp: use kcalloc when allocating arrays kcalloc is more semantically correct when allocating arrays of objects, and overflow-safe. Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-17 11:03:06 -05:00
Dmitry Torokhov	882f312dc0	ptp: do not explicitly set drvdata in ptp_clock_register() We do not need explicitly call dev_set_drvdata(), as it is done for us by device_create(). Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-17 11:03:05 -05:00
Eric Dumazet	01f0f42534	mlx4: do not fire tasklet unless necessary All rx and rx netdev interrupts are handled by respectively by mlx4_en_rx_irq() and mlx4_en_tx_irq() which simply schedule a NAPI. But mlx4_eq_int() also fires a tasklet to service all items that were queued via mlx4_add_cq_to_tasklet(), but this handler was not called unless user cqe was handled. This is very confusing, as "mpstat -I SCPU ..." show huge number of tasklet invocations. This patch saves this overhead, by carefully firing the tasklet directly from mlx4_add_cq_to_tasklet(), removing four atomic operations per IRQ. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tariq Toukan <tariqt@mellanox.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-17 10:55:22 -05:00
David S. Miller	99d5ceeea5	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2017-02-16 1) Make struct xfrm_input_afinfo const, nothing writes to it. From Florian Westphal. 2) Remove all places that write to the afinfo policy backend and make the struct const then. From Florian Westphal. 3) Prepare for packet consuming gro callbacks and add ESP GRO handlers. ESP packets can be decapsulated at the GRO layer then. It saves a round through the stack for each ESP packet. Please note that this has a merge coflict between commit `63fca65d08` ("net: add confirm_neigh method to dst_ops") from net-next and `3d7d25a68e` ("xfrm: policy: remove garbage_collect callback") `a2817d8b27` ("xfrm: policy: remove family field") from ipsec-next. The conflict can be solved as it is done in linux-next. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-16 21:25:49 -05:00
David S. Miller	5237b9dde3	Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 10GbE Intel Wired LAN Driver Updates 2017-02-16 This series contains updates to ixgbe only. Tony updates the driver to advertise 2.5Gb and 5.0Gb if the adapter supports it. Stephen Hemminger renames our dcbnl_ops since it is global to ixgbe_dcbnl_ops to avoid namespace issues. Mark updates the driver version based on the recent changes. Alex has the remainder of the changes, starting with consolidating functions that represent logical steps in the receive process so we can later update them more easily (and align with igb). Modify the receive path to only synchronize the length of the frame versus the entire buffer. Provided performance improvements by adding support for DMA_ATTR_SKIP_CPU_SYNC and DMA_ATTR_WEAK_ORDERING. Also made additional performance gains by batching the page count updates instead of doing them one at a time. Adjusted the receive path to use 3k buffers with 8k backing them in order to support build_skb with jumbo frames. Made additional driver improvements by using the length of the packet instead of the DD status to determine if a new descriptor is ready to be processed, which cuts down on reads. To reduce code duplication, pulled apart the receive path into separate functions. Added support for providing a buffer with headroom and tailroom to allow for shared info for NET_SKB_PAD and NET_IP_ALIGN. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-16 21:19:44 -05:00
David S. Miller	3f64116a83	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2017-02-16 19:34:01 -05:00
Ganesh Goudar	f3caf8618b	cxgb4: Remove redundant code in t4_uld_clean_up() Remove variable rxq_info and also remove redundant assignment to it. Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-16 14:32:52 -05:00
Ganesh Goudar	5d071c24f0	cxgb4: Add new T5 and T6 pci device id's Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-16 14:32:52 -05:00
Arjun V	45da1ca2e2	cxgb4: Increase max number of tc u32 links Make max number of supported tc u32 links equal to max number of filters supported by hardware. Signed-off-by: Arjun V <arjun@chelsio.com> Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-16 14:32:52 -05:00
Linus Torvalds	4695daefba	media fixes for v4.10-rc9 -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJYpefcAAoJEAhfPr2O5OEVeswP/RgA7lHk9cVl0f2srK/7rTcc kd8IIfrHhQmHTRJOllVKdz5Rwl9Eih3FHs7xukfHzP3nr4ZfkEoj8ZEPnF6C1W0v TivhxBJyyhKIx3g/oIl6OOXJeBLRgRCENrwoKjbRVDtuDcxpfm8d2P2NjdoPDN+6 zSDkMgqw4QW/8fqhoVpoEdtIjS0/BBS/Qob0QIpDpzSk1QtNsv0Ra83sYEGNHX/5 f+jw25XQZiYZJFw8cNCEmmOTlan5yuCKS9gjjMm7le07jFLjS/nKL+e9zwrm7WXN 4tuoxwcNCtKRzjdXyPG4up5LN0PTckQeZ/ust4w9N9wh+/ssr1n6p4qGMieVWQpE vXgekn/1mjOpZ9VNY+ciq4IZXNLjqeWPODQiNrgyfrbGN5oN6PKoIYW8ubGd8YvA 86FEA7zJq6/I+Yne1PqvlLo3tWXquV7tVEGsZnatWe2ZnfvadsVMkGPi/KzgGjNj PPg/lBm/8uuPjlHrAZmo0VvT/To1BQLRjKZGbb1TnC6bZU/pEJpdRy4b5mGQAyOP ZkXeuGae3JaxWgyYVzzbjANcfxZcc4qNp7FHdG/7yKXU3w8qsZgNgiWKyPw2Xw/F 0Xx7wH0MI1Q+KlbuQi6TNGjMigX8VpqZXv2ql8r8KsUlg33FwMFNF0pe8ofPPc+8 BP0WVM1iUInBVd5jxIYi =aXBS -----END PGP SIGNATURE----- Merge tag 'media/v4.10-5' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fix from Mauro Carvalho Chehab: "A regression fix that makes the Siano driver to work again after the CONFIG_VMAP_STACK change" * tag 'media/v4.10-5' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: [media] siano: make it work again with CONFIG_VMAP_STACK	2017-02-16 10:22:41 -08:00
Miklos Szeredi	5a81e6a171	vfs: fix uninitialized flags in splice_to_pipe() Flags (PIPE_BUF_FLAG_PACKET, PIPE_BUF_FLAG_GIFT) could remain on the unused part of the pipe ring buffer. Previously splice_to_pipe() left the flags value alone, which could result in incorrect behavior. Uninitialized flags appears to have been there from the introduction of the splice syscall. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Cc: <stable@vger.kernel.org> # 2.6.17+ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-02-16 09:09:02 -08:00
Linus Torvalds	58f6eaee7b	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse fixes from Miklos Szeredi: "Fix a use after free bug introduced in 4.2 and using an uninitialized value introduced in 4.9" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: fix uninitialized flags in pipe_buffer fuse: fix use after free issue in fuse_dev_do_read()	2017-02-16 09:05:34 -08:00
Linus Torvalds	aa6fba55cc	pci-v4.10-fixes-4 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJYpbp5AAoJEFmIoMA60/r8zZkP/2FlVvsdjF+khX/pWndPiFcW tm1IN0rt/2Y+eizK+w7pmkcXSH6IO/djLdqxtZngaVmN5Ov+2+IezCPP88YQi8Rm NbEVP6tRe7EjTYi9qTCdu88W2cmHyv+xHruZPtTUVPiR1waMo5ewgNJZhvFvGho0 fuB6CyQ6H7NfV9oviTfRYbos0qVIFmje/DA16SvI1vhmGrBxRraIpNpjVnKEgPLC d5e5Pj6HquF6AtrI+HjCgkckH77lAAmlSkvxxdxlv4tOVhgYQjFoHgztwzNApYs7 h6vSxfrPfSvwbUgHDaNzosOtc1lhnyXr0s4nql0ffhTvw01YkJeetWy9HYfhFCBC 2neWaTOW0AShap22a92JfBFmKOu+712qIA2BW0mkfTjGK4iZIbRhI6z49VPQO+Fx liymiDDRdjZUEHdKfa5AUIAX4hBRn1C9UUi8Pfi3QHg8uYPH68238qcrZoxfpTBZ kZk4T/kEhVsTZu7Bn7Yl7WzSwnLMJsDPDzqInJuABrGZUtlxyDT+gUu4hU3LfPjI S9DreCZp9K062j5Xw/EszDlcFjEbFDf0CgX1444mYEW1fNAgSJ7p0j26bktyYDCs cxmA9GMcqEtrxFJeV19kcVGgw63aiDNitbIg+Gz0ap3GP2ybtEzQzWdyHuej39yX PYbSvJZbeprFhmowQuhp =+P65 -----END PGP SIGNATURE----- Merge tag 'pci-v4.10-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fix from Bjorn Helgaas: "Add back pcie_pme_remove() so we free the IRQ when removing PCIe port devices; previously the leaked IRQ caused an MSI BUG_ON" * tag 'pci-v4.10-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI/PME: Restore pcie_pme_driver.remove	2017-02-16 09:03:37 -08:00
Linus Torvalds	3c7a9f32f9	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) In order to avoid problems in the future, make cgroup bpf overriding explicit using BPF_F_ALLOW_OVERRIDE. From Alexei Staovoitov. 2) LLC sets skb->sk without proper skb->destructor and this explodes, fix from Eric Dumazet. 3) Make sure when we have an ipv4 mapped source address, the destination is either also an ipv4 mapped address or ipv6_addr_any(). Fix from Jonathan T. Leighton. 4) Avoid packet loss in fec driver by programming the multicast filter more intelligently. From Rui Sousa. 5) Handle multiple threads invoking fanout_add(), fix from Eric Dumazet. 6) Since we can invoke the TCP input path in process context, without BH being disabled, we have to accomodate that in the locking of the TCP probe. Also from Eric Dumazet. 7) Fix erroneous emission of NETEVENT_DELAY_PROBE_TIME_UPDATE when we aren't even updating that sysctl value. From Marcus Huewe. 8) Fix endian bugs in ibmvnic driver, from Thomas Falcon. [ This is the second version of the pull that reverts the nested rhashtable changes that looked a bit too scary for this late in the release - Linus ] * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (27 commits) rhashtable: Revert nested table changes. ibmvnic: Fix endian errors in error reporting output ibmvnic: Fix endian error when requesting device capabilities net: neigh: Fix netevent NETEVENT_DELAY_PROBE_TIME_UPDATE notification net: xilinx_emaclite: fix freezes due to unordered I/O net: xilinx_emaclite: fix receive buffer overflow bpf: kernel header files need to be copied into the tools directory tcp: tcp_probe: use spin_lock_bh() uapi: fix linux/if_pppol2tp.h userspace compilation errors packet: fix races in fanout_add() ibmvnic: Fix initial MTU settings net: ethernet: ti: cpsw: fix cpsw assignment in resume kcm: fix a null pointer dereference in kcm_sendmsg() net: fec: fix multicast filtering hardware setup ipv6: Handle IPv4-mapped src to in6addr_any dst. ipv6: Inhibit IPv4-mapped src address on the wire. net/mlx5e: Disable preemption when doing TC statistics upcall rhashtable: Add nested tables tipc: Fix tipc_sk_reinit race conditions gfs2: Use rhashtable walk interface in glock_hash_walk ...	2017-02-16 08:37:18 -08:00
Miklos Szeredi	84588a93d0	fuse: fix uninitialized flags in pipe_buffer Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: `d82718e348` ("fuse_dev_splice_read(): switch to add_to_pipe()") Cc: <stable@vger.kernel.org> # 4.9+	2017-02-16 15:08:20 +01:00
Alexander Duyck	ffed21bcee	ixgbe: Don't bother clearing buffer memory for descriptor rings This patch makes it so that we don't need to bother with clearing the memory out for the descriptor rings. The general idea is to only free buffers associated with buffers in use which are located between the next_to_clean and next_to_use or next_to_alloc values. Everything outside of those regions can be safely ignored since they should have no buffers associated with them. The advantage to doing things this way is that is should speed up bring-up and tear-down of the rings. Specifically we can avoid the 512 or more cycles required to memset the rings in tear-down. In the bring-up phase we then clear the memory as a part of initialization. The general idea is that the clearing in initialization can act as a prefetch of sorts for the buffer info structures so they are in the local CPU when we go to populate them. This should help to improve overall time needed to perform a suspend/resume. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-16 04:02:44 -08:00
Alexander Duyck	6f429223b3	ixgbe: Add support for build_skb This patch adds build_skb support to the Rx path. There are several advantages to this change. 1. It avoids the memcpy and skb->head allocation for small packets which improves performance by about 5% in my tests. 2. It avoids the memcpy, skb->head allocation, and eth_get_headlen for larger packets improving performance by about 10% in my tests. 3. For VXLAN packets it allows the full header to be in skb->data which improves the performance by as much as 30% in some of my tests. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-16 04:02:44 -08:00
Alexander Duyck	2ccdf26ff6	ixgbe: Add private flag to control buffer mode Since there are potential drawbacks to the new Rx allocation approach I thought it best to add a "chicken bit" so that we can turn the feature off if in the event that a problem is found. It also provides a means of validating the legacy Rx path in the event that we are forced to fall back. At some point in the future when we are convinced we don't need it anymore we might be able to drop the legacy-rx flag. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-16 04:02:44 -08:00
Alexander Duyck	2de6aa3a66	ixgbe: Add support for padding packet This patch adds support for providing a buffer with headroom and tailroom to allow for shared info, NET_SKB_PAD, and NET_IP_ALIGN. With this combined with the DMA changes we can start using build_skb to build frames around an incoming Rx buffer instead of having to memcpy the headers. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-16 04:02:44 -08:00
Alexander Duyck	3fd218767f	ixgbe: Break out Rx buffer page management We are going to be expanding the number of Rx paths in the driver. Instead of duplicating all that code I am pulling it apart into separate functions so that we don't have so much code duplication. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-16 04:02:44 -08:00
Alexander Duyck	c3630cc40b	ixgbe: Use length to determine if descriptor is done This change makes it so that we use the length of the packet instead of the DD status bit to determine if a new descriptor is ready to be processed. The obvious advantage is that it cuts down on reads as we don't really even need the DD bit if going from a 0 to a non-zero value on size is enough to inform us that the packet has been completed. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-16 04:02:44 -08:00
Alexander Duyck	4f4542bfb3	ixgbe: Make use of order 1 pages and 3K buffers independent of FCoE In order to support build_skb with jumbo frames it will be necessary to use 3K buffers for the Rx path with 8K pages backing them. This is needed on architectures that implement 4K pages because we can't support 2K buffers plus padding in a 4K page. In the case of systems that support page sizes larger than 4K the 3K attribute will only be applied to FCoE as we can fall back to using just 2K buffers and adding the padding. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-16 04:02:44 -08:00
Alexander Duyck	1b56cf49f5	ixgbe: Update code to better handle incrementing page count Batch the page count updates instead of doing them one at a time. By doing this we can improve the overall performance as the atomic increment operations can be expensive due to the fact that on x86 they are locked operations which can cause stalls. By doing bulk updates we can consolidate the stall which should help to improve the overall receive performance. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-16 04:02:44 -08:00

1 2 3 4 5 ...

651521 Commits (a2b4eb55ca81cd1968a942a593debddb3255d68a) All Branches Search

651521 Commits (a2b4eb55ca81cd1968a942a593debddb3255d68a)

All Branches