remarkable-linux

redonkable

Author	SHA1	Message	Date
Toke Høiland-Jørgensen	50f08edf98	ath9k: Switch to using mac80211 intermediate software queues. This switches ath9k over to using the mac80211 intermediate software queueing mechanism for data packets. It removes the queueing inside the driver, except for the retry queue, and instead pulls from mac80211 when a packet is needed. The retry queue is used to store a packet that was pulled but can't be sent immediately. The old code path in ath_tx_start that would queue packets has been removed completely, as has the qlen limit tunables (since there's no longer a queue in the driver to limit). The mac80211 intermediate software queues offer significant latency reductions, and this patch allows ath9k to realise them. The exact gains from this varies with the test scenario, but in an access point scenario we have seen latency reductions ranging from 1/3 to as much as an order of magnitude. We also achieve slightly better aggregation. Median latency (ping) figures with this patch applied at the access point, with two high-rate stations and one low-rate station (HT20 5Ghz), running a Flent rtt_fair_var_up test with one TCP flow and one ping flow going to each station: Fast station Slow station Default pfifo_fast qdisc: 430.4 ms 638.7 ms fq_codel qdisc on iface: 35.5 ms 211.8 ms This patch set: 22.4 ms 38.2 ms Median aggregation sizes over the same test: Default pfifo_fast qdisc: 9.5 pkts 1.9 pkts fq_codel qdisc on iface: 11.2 pkts 1.9 pkts This patch set: 13.9 pkts 1.9 pkts This patch is based on Tim's original patch set, but reworked quite thoroughly. Cc: Tim Shepard <shep@alum.mit.edu> Cc: Felix Fietkau <nbd@nbd.name> Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk> Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>	2016-11-15 17:00:04 +02:00
Colin Ian King	14acebc33e	ath9k_htc: fix minor mistakes in dev_err messages Add missing space in a dev_err message and join wrapped text so it does not span multiple lines. Fix spelling mistake on "unknown". Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>	2016-11-15 16:57:47 +02:00
Martin Blumenstingl	138b41253d	ath9k: parse the device configuration from an OF node This allows setting the MAC address and specifying that the firmware will be requested from userspace (because there might not be a hardware EEPROM connected to the chip) for ath9k based PCI devices using the device tree. There is some out-of-tree code to "convert devicetree to ath9k_platform_data" (for example in OpenWrt and LEDE) which becomes obsolete with this patch. Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>	2016-11-15 16:55:42 +02:00
Martin Blumenstingl	b40ded2ad7	ath9k: add a helper to get the string representation of ath_bus_type This can be used when the ath_bus_type has to be presented in a log message or firmware filename. Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>	2016-11-15 16:55:37 +02:00
Martin Blumenstingl	fc383ffdb9	Documentation: dt: net: add ath9k wireless device binding Add documentation how devicetree can be used to configure ath9k based devices. Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>	2016-11-15 16:55:33 +02:00
Vittorio Gambaletta (VittGam)	79e57dd113	ath9k: Really fix LED polarity for some Mini PCI AR9220 MB92 cards. The active_high LED of my Wistron DNMA-92 is still being recognized as active_low on 4.7.6 mainline. When I was preparing my former commit `0f9edcdd88` ("ath9k: Fix LED polarity for some Mini PCI AR9220 MB92 cards.") to fix that I must have somehow messed up with testing, because I tested the final version of that patch before sending it, and it was apparently working; but now it is not working on 4.7.6 mainline. I initially added the PCI_DEVICE_SUB section for 0x0029/0x2096 above the PCI_VDEVICE section for 0x0029; but then I moved the former below the latter after seeing how 0x002A sections were sorted in the file. This turned out to be wrong: if a generic PCI_VDEVICE entry (that has both subvendor and subdevice IDs set to PCI_ANY_ID) is put before a more specific one (PCI_DEVICE_SUB), then the generic PCI_VDEVICE entry will match first and will be used. With this patch, 0x0029/0x2096 has finally got active_high LED on 4.7.6. While I'm at it, let's fix 0x002A too by also moving its generic definition below its specific ones. Fixes: `0f9edcdd88` ("ath9k: Fix LED polarity for some Mini PCI AR9220 MB92 cards.") Cc: <stable@vger.kernel.org> #4.7+ Signed-off-by: Vittorio Gambaletta <linuxbugs@vittgam.net> [kvalo@qca.qualcomm.com: improve the commit log based on email discussions] Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>	2016-11-15 16:52:16 +02:00
Kalle Valo	b9824cb891	Merge ath-next from git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git ath.git patches for 4.10. Major changes: ath10k * allow setting coverage class for first generation cards * read regulatory domain from ACPI ath9k * disable RNG by default	2016-11-09 02:15:17 +02:00
Kalle Valo	3f8247c8c4	* Finalize and enable dynamic queue allocation; * Use dev_coredumpmsg() to prevent locking the driver; * Small fix to pass the AID to the FW; * Use FW PS decisions with multi-queue; -----BEGIN PGP SIGNATURE----- iQIcBAABCAAGBQJYEGRBAAoJEKFHnKIaPMX63wAP+wSdixtt6iKzKL+RUHf+UXoz Qg+Kh8XCg6JzYYoTuWb00DgFZv6sBcca+PMv+/EREShZz+/hhQlxz5PPuA63Yx+7 ZwO0UdtRP3QsbukvVJcpnEnoWwpmqfnSW2R3pgQirWQAjSiNJvrXkQh4LDu+C89F Bv/UyIo+oUNX5cCUavoR7meoau2GKLCz3B5vNZgLDBfNeOuV7KyVwuvdb7suL9C7 hf+Q/zi1BBU2p2xYLwrb9AuKURrqWqI+i8zZZ0OzgD3w61QqMQ+k4x4vpY1MupwF eblR2wjmo51p1OmbzswjzvIqgeu8EUt+w0mzmfq8j8FMv+zdiZ4tbjnrYZC1s1jw yxVkRUO9otZSoHD2C5sJsXlzCs2CFEbmpU3iTa/WV/QMWAclbvubEZwKpVQAp05D tX9p0PTB8Zefuw84K2jcg0AuSPTDJLB7ojwdIEzYI6m/+O6N6pn9ymQ/TWzl0oyC XzPGnhELa6FTreIOfen75eHusDa3mcYwb1NntGFbSrouS/2KUpUusOBgDLC2P97x cHPv2ADOI4xPrMA8vMnVs7Sd8RT+hb61gBz37b/s+M5n74WwDLEpZdbEDD8xaHW9 wcYiiFnB53WXcUQdreQ6IbanDWH4MufpMeDQ72lXxrH+VvzMgfsMxwGCb1sUYy7h XRWyhhjEgagj4/w2WpqK =D+eo -----END PGP SIGNATURE----- Merge tag 'iwlwifi-next-for-kalle-2016-10-25-2' of git://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-next * Finalize and enable dynamic queue allocation; * Use dev_coredumpmsg() to prevent locking the driver; * Small fix to pass the AID to the FW; * Use FW PS decisions with multi-queue;	2016-10-27 18:29:18 +03:00
Elad Raz	6edf10173a	devlink: Prevent port_type_set() callback when it's not needed When a port_type_set() is been called and the new port type set is the same as the old one, just return success. Signed-off-by: Elad Raz <eladr@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-26 17:30:32 -04:00
Stefan Richter	89ab88b01b	firewire: net: set initial MTU = 1500 unconditionally, fix IPv6 on some CardBus cards firewire-net, like the older eth1394 driver, reduced the initial MTU to less than 1500 octets if the local link layer controller's asynchronous packet reception limit was lower. This is bogus, since this reception limit does not have anything to do with the transmission limit. Neither did this reduction affect the TX path positively, nor could it prevent link fragmentation at the RX path. Many FireWire CardBus cards have a max_rec of 9, causing an initial MTU of 1024 - 16 = 1008. RFC 2734 and RFC 3146 allow a minimum max_rec = 8, which would result in an initial MTU of 512 - 16 = 496. On such cards, IPv6 could only be employed if the MTU was manually increased to 1280 or more, i.e. IPv6 would not work without intervention from userland. We now always initialize the MTU to 1500, which is the default according to RFC 2734 and RFC 3146. On a VIA VT6316 based CardBus card which was affected by this, changing the MTU from 1008 to 1500 also increases TX bandwidth by 6 %. RX remains unaffected. CC: netdev@vger.kernel.org CC: linux1394-devel@lists.sourceforge.net CC: Jarod Wilson <jarod@redhat.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-26 17:28:50 -04:00
Stefan Richter	5d48f00d83	firewire: net: fix maximum possible MTU Commit `b3e3893e12` ("net: use core MTU range checking in misc drivers") mistakenly introduced an upper limit for firewire-net's MTU based on the local link layer controller's reception capability. Revert this. Neither RFC 2734 nor our implementation impose any particular upper limit. Actually, to be on the safe side and to make the code explicit, set ETH_MAX_MTU = 65535 as upper limit now. (I replaced sizeof(struct rfc2734_header) by the equivalent RFC2374_FRAG_HDR_SIZE in order to avoid distracting long/int conversions.) Fixes: b3e3893e1253('net: use core MTU range checking in misc drivers') CC: netdev@vger.kernel.org CC: linux1394-devel@lists.sourceforge.net CC: Jarod Wilson <jarod@redhat.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Acked-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-26 17:28:50 -04:00
Wei Yongjun	e2897b8238	net: netcp: add missing of_node_put() in netcp_probe() This node pointer is returned by of_get_child_by_name() with refcount incremented in this function. of_node_put() on it before exitting this function. This is detected by Coccinelle semantic patch. Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-26 17:21:59 -04:00
Wei Yongjun	f850b4a77d	net: ena: use setup_timer() and mod_timer() Use setup_timer() instead of init_timer(), being the preferred/standard way to set a timer up. Also, quoting the mod_timer() function comment: -> mod_timer() is a more efficient way to update the expire field of an active timer (if the timer is inactive it will be activated). Use setup_timer and mod_timer to setup and arm a timer, to make the code cleaner and easier to read. Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-26 17:21:59 -04:00
Wei Yongjun	675a6ceefc	amd-xgbe: Fix error return code in xgbe_probe() Fix to return error code -ENODEV from the DMA is not supported error handling case instead of 0, as done elsewhere in this function. Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-26 17:21:59 -04:00
Wei Yongjun	0942170f32	net: ns83820: use dev_kfree_skb_irq instead of kfree_skb It is not allowed to call kfree_skb() from hardware interrupt context or with interrupts being disabled, spin_lock_irqsave() make sure always in irq disable context. So the kfree_skb() should be replaced with dev_kfree_skb_irq(). This is detected by Coccinelle semantic patch. Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-26 17:21:59 -04:00
Wei Yongjun	a24a9d7aca	net: eth: altera: Fix error return code in altera_tse_probe() Fix to return error code -EINVAL from the error handling case instead of 0, as done elsewhere in this function. Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-26 17:21:59 -04:00
Wei Yongjun	68497a87c4	net: dsa: mv88e6xxx: use setup_timer to simplify the code Use setup_timer function instead of initializing timer with the function and data fields. Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-26 17:21:59 -04:00
Wei Yongjun	1aaa87aff6	net: netcp: drop kfree for memory allocated with devm_kzalloc It's not necessary to free memory allocated with devm_kzalloc in the remove path and using kfree leads to a double free. Fixes: `84640e27f2` ("net: netcp: Add Keystone NetCP core ethernet driver") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-26 17:21:59 -04:00
Sven Eckelmann	701470baba	batman-adv: Revert "use core MTU range checking in misc drivers" The maximum MTU is defined via the slave devices of an batman-adv interface. Thus it is not possible to calculate the max_mtu during the creation of the batman-adv device when no slave devices are attached. Doing so would for example break non-fragmentation setups which then (incorrectly) allow an MTU of 1500 even when underlying device cannot transport 1500 bytes + batman-adv headers. Checking the dynamically calculated max_mtu via the minimum of the slave devices MTU during .ndo_change_mtu is also used by the bridge interface. Cc: Jarod Wilson <jarod@redhat.com> Fixes: `b3e3893e12` ("net: use core MTU range checking in misc drivers") Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-26 17:19:07 -04:00
David S. Miller	f806f77247	Merge branch 'BCM54612E' Xo Wang says: ==================== Broadcom BCM54612E support This series is based on tip of torvalds/master. The first patch adds register definitions from Broadcom docs. The second patch adds the BCM54612E PHY ID, flags, and device-specific RGMII internal delay initialization. I tested on a custom board with an Aspeed AST2500 SOC with its second MAC connected to this PHY. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-26 17:15:27 -04:00
Xo Wang	d92ead16be	net: phy: broadcom: Add support for BCM54612E This PHY has internal delays enabled after reset. This clears the internal delay enables unless the interface specifically requests them. Signed-off-by: Xo Wang <xow@google.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Joel Stanley <joel@jms.id.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-26 17:15:26 -04:00
Xo Wang	3cf25904fe	net: phy: broadcom: Update Auxiliary Control Register macros Add the RXD-to-RXC skew (delay) time bit in the Miscellaneous Control shadow register and a mask for the shadow selector field. Remove a re-definition of MII_BCM54XX_AUXCTL_SHDWSEL_AUXCTL. Signed-off-by: Xo Wang <xow@google.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Joel Stanley <joel@jms.id.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-26 17:15:26 -04:00
Eric Dumazet	73e42ff72a	sch_htb: do not report fake rate estimators When I prepared commit `d250a5f90e` ("pkt_sched: gen_estimator: Dont report fake rate estimators"), htb still had an implicit rate estimator for all its classes. Then later, I made this rate estimator optional in commit `64153ce0a7` ("net_sched: htb: do not setup default rate estimators"), but I forgot to update htb use of gnet_stats_copy_rate_est() After this patch, "tc -s qdisc ..." no longer report fake rate estimators for HTB classes. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-26 17:13:49 -04:00
Aviya Erenfeld	7e62a699aa	iwlwifi: mvm: use dev_coredumpsg() iwlmvm currently uses dev_coredumpm() to collect multiple buffers, but this has the downside of pinning the module until the coredump expires, if the data isn't read by any userspace. Avoid this by using the new dev_coredumpsg() method. We still copy the data from the old way of generating it, but neither hold on to vmalloc'ed data for a long time, nor do we pin the module now. Signed-off-by: Aviya Erenfeld <aviya.erenfeld@intel.com> [rewrite commit message] Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com>	2016-10-26 11:04:15 +03:00
Liad Kaufman	7948b87308	iwlwifi: mvm: enable dynamic queue allocation mode New firmwares support dynamic queue allocation (DQA), which enables on-demand allocation of queues per RA/TID, instead of allocating them statically per vif. This allows an AP to send, for instance, BE traffic to STA2 even if it also needs to send traffic to a sleeping STA1, without being blocked by the sleeping station. The implementation in the driver is now ready, so we can enable this feature by default when running firmwares that support it. Signed-off-by: Liad Kaufman <liad.kaufman@intel.com> [reworded the commit message] Signed-off-by: Luca Coelho <luciano.coelho@intel.com>	2016-10-26 10:27:44 +03:00
Cyrill Gorcunov	432490f9d4	net: ip, diag -- Add diag interface for raw sockets In criu we are actively using diag interface to collect sockets present in the system when dumping applications. And while for unix, tcp, udp[lite], packet, netlink it works as expected, the raw sockets do not have. Thus add it. v2: - add missing sock_put calls in raw_diag_dump_one (by eric.dumazet@) - implement @destroy for diag requests (by dsa@) v3: - add export of raw_abort for IPv6 (by dsa@) - pass net-admin flag into inet_sk_diag_fill due to changes in net-next branch (by dsa@) v4: - use @pad in struct inet_diag_req_v2 for raw socket protocol specification: raw module carries sockets which may have custom protocol passed from socket() syscall and sole @sdiag_protocol is not enough to match underlied ones - start reporting protocol specifed in socket() call when sockets are raw ones for the same reason: user space tools like ss may parse this attribute and use it for socket matching v5 (by eric.dumazet@): - use sock_hold in raw_sock_get instead of atomic_inc, we're holding (raw_v4_hashinfo\|raw_v6_hashinfo)->lock when looking up so counter won't be zero here. v6: - use sdiag_raw_protocol() helper which will access @pad structure used for raw sockets protocol specification: we can't simply rename this member without breaking uapi v7: - sine sdiag_raw_protocol() helper is not suitable for uapi lets rather make an alias structure with proper names. __check_inet_diag_req_raw helper will catch if any of structure unintentionally changed. CC: David S. Miller <davem@davemloft.net> CC: Eric Dumazet <eric.dumazet@gmail.com> CC: David Ahern <dsa@cumulusnetworks.com> CC: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> CC: James Morris <jmorris@namei.org> CC: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> CC: Patrick McHardy <kaber@trash.net> CC: Andrey Vagin <avagin@openvz.org> CC: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-23 19:35:24 -04:00
Thomas Graf	f76a9db351	lwt: Remove unused len field The field is initialized by ILA and MPLS but never used. Remove it. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-23 17:45:01 -04:00
Andrey Vagin	7281a66590	net: allow to kill a task which waits net_mutex in copy_new_ns net_mutex can be locked for a long time. It may be because many namespaces are being destroyed or many processes decide to create a network namespace. Both these operations are heavy, so it is better to have an ability to kill a process which is waiting net_mutex. Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andrei Vagin <avagin@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-23 17:33:39 -04:00
Shmulik Ladkani	d65f2fa680	net/sched: em_meta: Fix 'meta vlan' to correctly recognize zero VID frames META_COLLECTOR int_vlan_tag() assumes that if the accel tag (vlan_tci) is zero, then no vlan accel tag is present. This is incorrect for zero VID vlan accel packets, making the following match fail: tc filter add ... basic match 'meta(vlan mask 0xfff eq 0)' ... Apparently 'int_vlan_tag' was implemented prior VLAN_TAG_PRESENT was introduced in `05423b2` "vlan: allow null VLAN ID to be used" (and at time introduced, the 'vlan_tx_tag_get' call in em_meta was not adapted). Fix, testing skb_vlan_tag_present instead of testing skb_vlan_tag_get's value. Fixes: `05423b2413` ("vlan: allow null VLAN ID to be used") Fixes: `1a31f2042e` ("netsched: Allow meta match on vlan tag on receive") Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-23 17:31:25 -04:00
David S. Miller	1bac938141	Merge branch 'mlxsw-cosmetics-plus-res-mgmt-rewrite' Jiri Pirko says: ==================== mlxsw: Driver update Mostly cosmetics and small resource values management rewrite. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-23 17:21:35 -04:00
Jiri Pirko	c1a3831121	mlxsw: Convert resources into array Since the number of resources is going to get much bigger, ease up the addition by simly defining IDs. Convert the existing structure members to a set array, one for validity, one for values. Introduce a set of getters and setters for easy access. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-23 17:21:29 -04:00
Jiri Pirko	f38a2314e5	mlxsw: cmd: Push resource query defines to cmd.h Push cmd resource query related defines to cmd.h where they belong. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-23 17:21:29 -04:00
Jiri Pirko	8e9658d567	mlxsw: reg: Generare register names automatically Extend the MLXSW_REG_DEFINE macro to store register name in string form. Use this string later on instead of hard coded string values. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-23 17:21:29 -04:00
Jiri Pirko	21978dcfc8	mlxsw: reg: Use helper macro to define registers Save some code and also prepare to easily carry name in string form. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-23 17:21:29 -04:00
Jiri Pirko	412791df39	mlxsw: item: Make char *buf arg constant for getters Enforce const for getter buf args. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-23 17:21:28 -04:00
Jiri Pirko	fe0612dc90	mlxsw: item: Make struct mlxsw_item args const These should be const, so enforce it. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-23 17:21:28 -04:00
David S. Miller	67dc159673	Merge branch 'bpf-numa-id' Daniel Borkmann says: ==================== Add BPF numa id helper This patch set adds a helper for retrieving current numa node id and a test case for SO_REUSEPORT. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-22 17:06:09 -04:00
Daniel Borkmann	3c2c3c16aa	reuseport, bpf: add test case for bpf_get_numa_node_id The test case is very similar to reuseport_bpf_cpu, only that here we select socket members based on current numa node id. # numactl -H available: 2 nodes (0-1) node 0 cpus: 0 1 2 3 4 5 12 13 14 15 16 17 node 0 size: 128867 MB node 0 free: 120080 MB node 1 cpus: 6 7 8 9 10 11 18 19 20 21 22 23 node 1 size: 96765 MB node 1 free: 87504 MB node distances: node 0 1 0: 10 20 1: 20 10 # ./reuseport_bpf_numa ---- IPv4 UDP ---- send node 0, receive socket 0 send node 1, receive socket 1 send node 1, receive socket 1 send node 0, receive socket 0 ---- IPv6 UDP ---- send node 0, receive socket 0 send node 1, receive socket 1 send node 1, receive socket 1 send node 0, receive socket 0 ---- IPv4 TCP ---- send node 0, receive socket 0 send node 1, receive socket 1 send node 1, receive socket 1 send node 0, receive socket 0 ---- IPv6 TCP ---- send node 0, receive socket 0 send node 1, receive socket 1 send node 1, receive socket 1 send node 0, receive socket 0 SUCCESS Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-22 17:05:52 -04:00
Daniel Borkmann	2d0e30c30f	bpf: add helper for retrieving current numa node id Use case is mainly for soreuseport to select sockets for the local numa node, but since generic, lets also add this for other networking and tracing program types. Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-22 17:05:52 -04:00
David S. Miller	a10b91b8b8	Merge branch 'udpmem' Paolo Abeni says: ==================== udp: refactor memory accounting This patch series refactor the udp memory accounting, replacing the generic implementation with a custom one, in order to remove the needs for locking the socket on the enqueue and dequeue operations. The socket backlog usage is dropped, as well. The first patch factor out pieces of some queue and memory management socket helpers, so that they can later be used by the udp memory accounting functions. The second patch adds the memory account helpers, without using them. The third patch replacse the old rx memory accounting path for udp over ipv4 and udp over ipv6. In kernel UDP users are updated, as well. The memory accounting schema is described in detail in the individual patch commit message. The performance gain depends on the specific scenario; with few flows (and little contention in the original code) the differences are in the noise range, while with several flows contending the same socket, the measured speed-up is relevant (e.g. even over 100% in case of extreme contention) Many thanks to Eric Dumazet for the reiterated reviews and suggestions. v5 -> v6: - do not orphan the skb on enqueue, skb_steal_sock() already did the work for us v4 -> v5: - use the receive queue spin lock to protect the memory accounting - several minor clean-up v3 -> v4: - simplified the locking schema, always use a plain spinlock v2 -> v3: - do not set the now unsed backlog_rcv callback v1 -> v2: - changed slighly the memory accounting schema, we now perform lazy reclaim - fixed forward_alloc updating issue - fixed memory counter integer overflows ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-22 17:05:13 -04:00
Paolo Abeni	850cbaddb5	udp: use it's own memory accounting schema Completely avoid default sock memory accounting and replace it with udp-specific accounting. Since the new memory accounting model encapsulates completely the required locking, remove the socket lock on both enqueue and dequeue, and avoid using the backlog on enqueue. Be sure to clean-up rx queue memory on socket destruction, using udp its own sk_destruct. Tested using pktgen with random src port, 64 bytes packet, wire-speed on a 10G link as sender and udp_sink as the receiver, using an l4 tuple rxhash to stress the contention, and one or more udp_sink instances with reuseport. nr readers Kpps (vanilla) Kpps (patched) 1 170 440 3 1250 2150 6 3000 3650 9 4200 4450 12 5700 6250 v4 -> v5: - avoid unneeded test in first_packet_length v3 -> v4: - remove useless sk_rcvqueues_full() call v2 -> v3: - do not set the now unsed backlog_rcv callback v1 -> v2: - add memory pressure support - fixed dropwatch accounting for ipv6 Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-22 17:05:05 -04:00
Paolo Abeni	f970bd9e3a	udp: implement memory accounting helpers Avoid using the generic helpers. Use the receive queue spin lock to protect the memory accounting operation, both on enqueue and on dequeue. On dequeue perform partial memory reclaiming, trying to leave a quantum of forward allocated memory. On enqueue use a custom helper, to allow some optimizations: - use a plain spin_lock() variant instead of the slightly costly spin_lock_irqsave(), - avoid dst_force check, since the calling code has already dropped the skb dst - avoid orphaning the skb, since skb_steal_sock() already did the work for us The above needs custom memory reclaiming on shutdown, provided by the udp_destruct_sock(). v5 -> v6: - don't orphan the skb on enqueue v4 -> v5: - replace the mem_lock with the receive queue spin lock - ensure that the bh is always allowed to enqueue at least a skb, even if sk_rcvbuf is exceeded v3 -> v4: - reworked memory accunting, simplifying the schema - provide an helper for both memory scheduling and enqueuing v1 -> v2: - use a udp specific destrctor to perform memory reclaiming - remove a couple of helpers, unneeded after the above cleanup - do not reclaim memory on dequeue if not under memory pressure - reworked the fwd accounting schema to avoid potential integer overflow Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-22 17:05:05 -04:00
Paolo Abeni	f8c3bf00d4	net/socket: factor out helpers for memory and queue manipulation Basic sock operations that udp code can use with its own memory accounting schema. No functional change is introduced in the existing APIs. v4 -> v5: - avoid whitespace changes v2 -> v4: - avoid exporting __sock_enqueue_skb v1 -> v2: - avoid export sock_rmem_free Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-22 17:05:05 -04:00
Jarod Wilson	8b1efc0f83	net: remove MTU limits on a few ether_setup callers These few drivers call ether_setup(), but have no ndo_change_mtu, and thus were overlooked for changes to MTU range checking behavior. They previously had no range checks, so for feature-parity, set their min_mtu to 0 and max_mtu to ETH_MAX_MTU (65535), instead of the 68 and 1500 inherited from the ether_setup() changes. Fine-tuning can come after we get back to full feature-parity here. CC: netdev@vger.kernel.org Reported-by: Asbjoern Sloth Toennesen <asbjorn@asbjorn.st> CC: Asbjoern Sloth Toennesen <asbjorn@asbjorn.st> CC: R Parameswaran <parameswaran.r7@gmail.com> Signed-off-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-21 13:57:50 -04:00
Vitaly Kuznetsov	e8f0a89cd7	hv_netvsc: fix a race between netvsc_send() and netvsc_init_buf() Fix in commit `8809883482` ("hv_netvsc: set nvdev link after populating chn_table") turns out to be incomplete. A crash in netvsc_get_next_send_section() is observed on mtu change when the device is under load. The race I identified is: if we get to netvsc_send() after we set net_device_ctx->nvdev link in netvsc_device_add() but before we finish netvsc_connect_vsp()->netvsc_init_buf() send_section_map is not allocated and we crash. Unfortunately we can't set net_device_ctx->nvdev link after the netvsc_init_buf() call as during the negotiation we need to receive packets and on the receive path we check for it. It would probably be possible to split nvdev into a pair of nvdev_in and nvdev_out links and check them accordingly in get_outbound_net_device()/ get_inbound_net_device() but this looks like an overkill. Check that send_section_map is allocated in netvsc_send(). Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-21 11:27:31 -04:00
David S. Miller	9e58c5dc4b	Merge branch 'MTU-core-range-checking-more' Jarod Wilson says: ==================== net: use core MTU range checking everywhere This stack of patches should get absolutely everything in the kernel converted from doing their own MTU range checking to the core MTU range checking. This second spin includes alterations to hopefully fix all concerns raised with the first, as well as including some additional changes to drivers and infrastructure where I completely missed necessary updates. These have all been built through the 0-day build infrastructure via the (rebasing) master branch at https://github.com/jarodwilson/linux-muck, which at the time of the most recent compile across 147 configs, was based on net-next at commit `7b1536ef0a`. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-20 14:51:11 -04:00
Jarod Wilson	b96f9afee4	ipv4/6: use core net MTU range checking ipv4/ip_tunnel: - min_mtu = 68, max_mtu = 0xFFF8 - dev->hard_header_len - t_hlen - preserve all ndo_change_mtu checks for now to prevent regressions ipv6/ip6_tunnel: - min_mtu = 68, max_mtu = 0xFFF8 - dev->hard_header_len - preserve all ndo_change_mtu checks for now to prevent regressions ipv6/ip6_vti: - min_mtu = 1280, max_mtu = 65535 - remove redundant vti6_change_mtu ipv6/sit: - min_mtu = 1280, max_mtu = 0xFFF8 - t_hlen - remove redundant ipip6_tunnel_change_mtu CC: netdev@vger.kernel.org CC: "David S. Miller" <davem@davemloft.net> CC: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> CC: James Morris <jmorris@namei.org> CC: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> CC: Patrick McHardy <kaber@trash.net> Signed-off-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-20 14:51:10 -04:00
Jarod Wilson	46b3ef4cdf	s390/net: use net core MTU range checking ctcm: - min_mtu = 576, max_mtu = 65527 netiucv: - min_mtu = 576, max_mtu = 65535 qeth: - min_mtu = 64, max_mtu = 65535 CC: netdev@vger.kernel.org CC: linux-s390@vger.kernel.org CC: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-20 14:51:10 -04:00
Jarod Wilson	b3e3893e12	net: use core MTU range checking in misc drivers firewire-net: - set min/max_mtu - remove fwnet_change_mtu nes: - set max_mtu - clean up nes_netdev_change_mtu xpnet: - set min/max_mtu - remove xpnet_dev_change_mtu hippi: - set min/max_mtu - remove hippi_change_mtu batman-adv: - set max_mtu - remove batadv_interface_change_mtu - initialization is a little async, not 100% certain that max_mtu is set in the optimal place, don't have hardware to test with rionet: - set min/max_mtu - remove rionet_change_mtu slip: - set min/max_mtu - streamline sl_change_mtu um/net_kern: - remove pointless ndo_change_mtu hsi/clients/ssi_protocol: - use core MTU range checking - remove now redundant ssip_pn_set_mtu ipoib: - set a default max MTU value - Note: ipoib's actual max MTU can vary, depending on if the device is in connected mode or not, so we'll just set the max_mtu value to the max possible, and let the ndo_change_mtu function continue to validate any new MTU change requests with checks for CM or not. Note that ipoib has no min_mtu set, and thus, the network core's mtu > 0 check is the only lower bounds here. mptlan: - use net core MTU range checking - remove now redundant mpt_lan_change_mtu fddi: - min_mtu = 21, max_mtu = 4470 - remove now redundant fddi_change_mtu (including export) fjes: - min_mtu = 8192, max_mtu = 65536 - The max_mtu value is actually one over IP_MAX_MTU here, but the idea is to get past the core net MTU range checks so fjes_change_mtu can validate a new MTU against what it supports (see fjes_support_mtu in fjes_hw.c) hsr: - min_mtu = 0 (calls ether_setup, max_mtu is 1500) f_phonet: - min_mtu = 6, max_mtu = 65541 u_ether: - min_mtu = 14, max_mtu = 15412 phonet/pep-gprs: - min_mtu = 576, max_mtu = 65530 - remove redundant gprs_set_mtu CC: netdev@vger.kernel.org CC: linux-rdma@vger.kernel.org CC: Stefan Richter <stefanr@s5r6.in-berlin.de> CC: Faisal Latif <faisal.latif@intel.com> CC: linux-rdma@vger.kernel.org CC: Cliff Whickman <cpw@sgi.com> CC: Robin Holt <robinmholt@gmail.com> CC: Jes Sorensen <jes@trained-monkey.org> CC: Marek Lindner <mareklindner@neomailbox.ch> CC: Simon Wunderlich <sw@simonwunderlich.de> CC: Antonio Quartulli <a@unstable.cc> CC: Sathya Prakash <sathya.prakash@broadcom.com> CC: Chaitra P B <chaitra.basappa@broadcom.com> CC: Suganath Prabu Subramani <suganath-prabu.subramani@broadcom.com> CC: MPT-FusionLinux.pdl@broadcom.com CC: Sebastian Reichel <sre@kernel.org> CC: Felipe Balbi <balbi@kernel.org> CC: Arvid Brodin <arvid.brodin@alten.se> CC: Remi Denis-Courmont <courmisch@gmail.com> Signed-off-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-20 14:51:10 -04:00
Jarod Wilson	d0c2c9973e	net: use core MTU range checking in virt drivers hyperv_net: - set min/max_mtu, per Haiyang, after rndis_filter_device_add virtio_net: - set min/max_mtu - remove virtnet_change_mtu vmxnet3: - set min/max_mtu xen-netback: - min_mtu = 0, max_mtu = 65517 xen-netfront: - min_mtu = 0, max_mtu = 65535 unisys/visor: - clean up defines a little to not clash with network core or add redundat definitions CC: netdev@vger.kernel.org CC: virtualization@lists.linux-foundation.org CC: "K. Y. Srinivasan" <kys@microsoft.com> CC: Haiyang Zhang <haiyangz@microsoft.com> CC: "Michael S. Tsirkin" <mst@redhat.com> CC: Shrikrishna Khare <skhare@vmware.com> CC: "VMware, Inc." <pv-drivers@vmware.com> CC: Wei Liu <wei.liu2@citrix.com> CC: Paul Durrant <paul.durrant@citrix.com> CC: David Kershner <david.kershner@unisys.com> Signed-off-by: Jarod Wilson <jarod@redhat.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-20 14:51:09 -04:00

1 2 3 4 5 ...

632964 Commits (50f08edf98096a68f01ff4566b605a25bf8e42ce) All Branches Search

632964 Commits (50f08edf98096a68f01ff4566b605a25bf8e42ce)

All Branches