redonkable/remarkable-linux

Author	SHA1	Message	Date
Alexander Duyck	4b8164467b	i40e: Add common function for finding VSI by type This patch adds a common method for finding a VSI by type. The main motivation for doing this is that the Flow Director path actually had two ways of handling this, one stopped on first match and one did not. This patch makes it so that all callers of this function will get the same approach for finding a VSI. Change-ID: Ibf25de8acd8466582520694424aa87da66965fbd Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Bimmy Pujari <bimmy.pujari@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-10-31 14:26:40 -07:00
Jacob Keller	7d3f04af69	i40evf: avoid an extra msleep while Remove the second call to msleep outside the loop, and move the msleep within the loop as the first step. This guarantees that a single loop will wait the minimum time first, and then after the reset finishes we no longer need an extra msleep. Change-ID: Ib2086f0a142402b614f67846bc091754203a0b9a Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-10-31 14:26:40 -07:00
Jacob Keller	124905012d	i40e: replace PTP Rx timestamp hang logic The current Rx timestamp hang logic is not very robust because it does not notice a register is hung until all four timestamps have been latched and we wait a full 5 seconds. Replace this logic with a newer Rx hang detection based on storing the jiffies when we first notice a receive timestamp event. We store each register's time separately, along with a flag indicating if it is currently latched. Upon first transitioning to latch, we will update the latch_events[i] jiffies value. This indicates the time we first noticed this event. The watchdog routine will simply check that the either the flag has been cleared, or we have passed at least one second. In this case, it is able to clear the Rx timestamp register under the assumption that it was for a dropped frame. The benefit if this strategy is that we should be able to detect and clear out stalled RXTIME_H registers before we exhaust the supply of 4, and avoid complete stall of Rx timestamp events. Change-ID: Id55458c0cd7a5dd0c951ff2b8ac0b2509364131f Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-10-31 14:26:40 -07:00
Jacob Keller	195512629c	i40e: use a mutex instead of spinlock in PTP user entry points We need a locking mechanism to protect the hardware SYSTIME register which is split over 2 values, and has internal hardware latching. We can't allow multiple accesses at the same time. However.... The spinlock_t is overkill here, especially use of spin_lock_irqsave, since every PTP access will halt hardirqs. Notice that the only places which need the SYSTIME value are user context and are capable of sleeping. Thus, it is safe to use a mutex here instead of the spinlock. Change-ID: I971761a89b58c6aad953590162e85a327fbba232 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-10-31 14:26:40 -07:00
Jacob Keller	144ed17630	i40e: correct check for reading TSYNINDX from the receive descriptor When hardware has taken a timestamp for a received packet, it indicates which RXTIME register the timestamp was placed in by some bits in the receive descriptor. It uses 3 bits, one to indicate if the descriptor index is valid (ie: there was a timestamp) and 2 bits to indicate which of the 4 registers to read. However, the driver currently does not check the TSYNVALID bit and only checks the index. It assumes a zero index means no timestamp, and a non zero index means a timestamp occurred. While this appears to be true, it prevents ever reading a timestamp in RXTIME[0], and causes the first timestamp the device captures to be ignored. Fix this by using the TSYNVALID bit correctly as the true indicator of whether the packet has an associated timestamp. Also rename the variable rsyn to tsyn as this is more descriptive and matches the register names. Change-ID: I4437e8f3a3df2c2ddb458b0fb61420f3dafc4c12 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-10-31 14:26:40 -07:00
Jacob Keller	0093631966	i40e: remove duplicate add/delete adminq command code for filters We duplicate some code around adding and deleting filters using the adminq interface. This is prone to errors in case there are bugs. Use functions which extract the logic to their own portion so that we don't duplicate it twice in code. Change-ID: I60d68aeb887976787dec00b23ab386a106e61465 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-10-31 14:26:40 -07:00
Jacob Keller	cbebb85f30	i40e: avoid looping to check whether we're in VLAN mode We determine that a VSI is in vlan_mode whenever it has any filters with a VLAN other than -1 (I40E_VLAN_ALL). The previous method of doing so was to perform a loop whenever we needed the check. However, we can notice that only place where filters are added (i40e_add_filter) can change the condition from false to true, and the only place we can return to false is in i40e_vsi_sync_filters_subtask. Thus, we can remove the loop and use a boolean directly. Doing this avoids looping over filters repeatedly especially while we're already inside a loop over all the filters. This should reduce the latency of filter operations throughout the driver. Change-ID: Iafde08df588da2a2ea666997d05e11fad8edc338 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-10-31 14:26:40 -07:00
Alan Brady	84f5ca6cf4	i40e: fix MAC filters when removing VLANs Currently there exists a bug where adding at least one VLAN and then removing all VLANs leaves the mac filters for the VSI with an incorrect value for 'vid' which indicates the mac filter's VLAN status. The current implementation for handling the removal of VLANs is wrong for a couple reasons. The first is that when i40e_vsi_kill_vlan iterates through the MAC filters, it fails to account for the MAC filter status; i.e. it's not accommodating for filters that are about to be deleted. The second problem is that MAC filters can be deleted in other places (specifically i40e_set_rx_mode). Thus if it occurs that all the VLAN MAC filters get deleted we need to switch out of VLAN mode, but the code path through i40e_vsi_kill_vlan has already been executed and we're now stuck in VLAN mode. This patch fixes the issue by removing the check from i40e_vsi_kill_vlan and puts the check instead in i40e_sync_vsi_filters where we're guaranteed to see all filter deletions and can properly detect when we need to switch out of VLAN mode. Change-ID: Ib38fe6034b356eee9a0e20b8a9eeed5ff2debcd9 Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-10-31 14:26:40 -07:00
Jacob Keller	4a2ce27bb5	i40e: properly cleanup on allocation failure in i40e_sync_vsi_filters Currently, we fail to correctly restore filters on the temporary add list when we fail to allocate memory either for deletion or addition. Replace calls to "goto out;" with calls to a new location that correctly handles memory allocation failures. Note that it is safe for us to call i40e_undo_filter_entries on the tmp_del_list even after we've deleted filters because at this point it will be empty, so we don't need to separate the logic for add and delete failure. Change-Id: Iee107fd219c6e03e2fd9645c2debf8e8384a8521 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-10-31 14:26:40 -07:00
Jacob Keller	278e7d0b9d	i40e: store MAC/VLAN filters in a hash with the MAC Address as key Replace the mac_filter_list with a static size hash table of 8bits. The primary advantage of this is a decrease in latency of operations related to searching for specific MAC filters, including .set_rx_mode. Using a linked list resulted in several locations which were O(n^2). Using a hash table should give us latency growth closer to O(n*log(n)). Change-ID: I5330bd04053b880e670210933e35830b95948ebb Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com>	2016-10-31 14:26:40 -07:00
Jacob Keller	290d255719	i40e: implement __i40e_del_filter and use where applicable When inside a loop where we call i40e_del_filter we use an O(n^2) pattern where i40e_del_filter calls i40e_find_filter for us. We can avoid this O(n^2) logic by factoring a function, __i40e_del_filter() out from the i40e_del_filter code. This allows us to re-use the delete logic where appropriate without having to search for the filter twice. This new function benefits several functions including i40e_vsi_add_vlan, i40e_vsi_kill_vlan, i40e_del_mac_vlan_all, and i40e_vsi_release. Change-ID: I75fabe0f53bf73f56b80d342e5fdcfcc28f4d3eb Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-10-31 14:26:40 -07:00
Jacob Keller	57b341d666	i40e: When searching all MAC/VLAN filters, ignore removed filters When adding new MAC address filters, the driver determines if it should behave in VLAN mode (where all MAC addresses get assigned to every existing VLAN) or in non-VLAN mode where MAC addresses get assigned the VLAN_ANY identifier. Under some circumstances it is possible that a VLAN has been marked for removal (such that all filters of that VLAN are set to I40E_FILTER_REMOVE), and a subsequent call to i40e_put_mac_in_vlan may occur prior to the driver subtask that syncs filters to the hardware. In this case, we may add filters to the new removed VLAN, even though it should have been removed. This is most obvious when first adding a new VLAN. We will delete all filters which are in I40E_VLAN_ANY (-1) and then re-add them as in VLAN 0 (untagged). Then before we sync filters, we will add new MAC address filter, which will be added to every VLAN that exists. Unfortunately, this will include I40E_VLAN_ANY, so we will end up incorrectly adding filters to the -1 VLAN. This can be fixed by simply skipping all filters which are marked for removal. A similar check is not necessary in i40e_del_mac_all_vlan, since we are deleting, and any filter which we find already marked for removal would simply be deleted again, which doesn't cause any issues. Change-Id: I7962154013ce02fe950584690aeeb3ed853d0086 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-10-31 14:26:40 -07:00
Jacob Keller	5feb3d7b0c	i40e: refactor i40e_put_mac_in_vlan to avoid changing f->vlan When a PVID has been assigned to a VSI, the function i40e_put_mac_in_vlan arbitrarily modifies all filters to have the same VLAN. This is obviously incorrect because it could be modifying active filters without putting them into the NEW state. The correct method is to remove then re-add filters which is already done in the code where we assign the PVID. Fix this issue and a few other minor nits at the same time. First, when we have a PVID don't even bother looping and simply add the filter with the PVID immediately. In the case of the loop, we now can remove several checks. We also don't need to use i40e_find_filter first before calling i40e_add_filter, since i40e_add_filter implicitly does a lookup already. Finally, update the return semantics of this function so that on failure to add a filter it returns NULL, but on success, it returns the last filter added. Otherwise, we're just returning the last filter in the list. An alternative fix might be to return 0 or an error code, but this is pretty invasive to every call site. Change-ID: I2325dfd843aec76d89fb0d7cb0e7c4f290a34840 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-10-31 14:26:40 -07:00
Jacob Keller	35ec2ff37c	i40e: move i40e_put_mac_in_vlan and i40e_del_mac_all_vlan A future patch will be modifying these functions and making a call to a static function which currently is defined after these functions. Move them in a separate patch to ease review and ensure the moved code is correct. Change-ID: I2ca7fd4e10c0c07ed2291db1ea41bf5987fc6474 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-10-31 14:26:40 -07:00
Jacob Keller	6622f5cdba	i40e: make use of __dev_uc_sync and __dev_mc_sync The kernel provides __dev_uc_sync and __dev_mc_sync in order for drivers which need individual notification of add and delete for each filter. These functions allow us to vastly simplify our .set_rx_mode handler. We need to implement two functions for sync and unsync which add and remove filters respectively. This change avoids a very complex and inefficient algorithm which resulted in an abnormal latency for the .set_rx_mode NDO operation. The resulting code after this change is more readable, more efficient, and less code. Due to the callback signature used by these functions we also must update several other functions to take a const u8 * pointer. Change-Id: I2ca7fd4e10c0c07ed2291db1ea41bf5987fc6474 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-10-31 14:26:40 -07:00
Jacob Keller	1bc87e807a	i40e: drop is_vf and is_netdev fields in struct i40e_mac_filter Originally the is_vf and is_netdev fields were added in order to distinguish between VF and netdev filters in a single VSI. However, it can be noted that we use separate VSI for SRIOV VFs and for netdev VSI. Thus, since a single VSI should only ever have one type of filter, we can simply remove the checks and remove the typing. In a similar fashion, we can note that the only remaining way to get multiple filters of a single type is through a debug command that was added to debugfs. This command is useless in practice, and results in causing bugs if we keep counter tracking but lose the is_vf and is_netdev protections as desired above. Since the only time we'd actually have a counter value besides 0 and 1 is through use of this debugfs hook, we can remove this unnecessary command, and the entire counter logic it required. We vastly simplify mac filters by removing (a) the distinction between VF and netdev filters (b) counting logic (c) the ability to add and remove filters bypassing the stack via debugfs Change-ID: Idf916dd2a1159b1188ddbab5bef6b85ea6bf27d9 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-10-31 14:26:40 -07:00
Colin Ian King	ff00f3a967	i40e: Add missing \n to end of dev_err message Trival fix, dev_err message is missing a \n, so add it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-10-31 14:26:40 -07:00
David S. Miller	17a032b7bf	Merge branch 'bridge-PIM-hello' Nikolay Aleksandrov says: ==================== bridge: add support for PIM hello router ports The first 3 patches of this set do minor cleanups and add some helpers to the PIM header file. Patch 4 adds a way to detect mcast router ports via PIM hello messages, they're marked as temporary and are not considered for querier. There's more detailed information in patch 4's commit message. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-31 16:18:31 -04:00
Nikolay Aleksandrov	91b02d3d13	bridge: mcast: add router port on PIM hello message When we receive a PIM Hello message on a port we can consider that it has a multicast router attached, thus it is correct to add it to the router list. The only catch is it shouldn't be considered for a querier. Using Daniel's description: leaf-11 leaf-12 leaf-13 \ \| / bridge-1 / \ host-11 host-12 - all ports in bridge-1 are in a single vlan aware bridge - leaf-11 is the IGMP querier - leaf-13 is the PIM DR - host-11 TXes packets to 226.10.10.10 - bridge-1 only forwards the 226.10.10.10 traffic out the port to leaf-11, it should also forward this traffic out the port to leaf-13 Suggested-by: Daniel Walton <dwalton@cumulusnetworks.com> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-31 16:18:30 -04:00
Nikolay Aleksandrov	56245cae19	net: pim: add all RFC7761 message types Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-31 16:18:30 -04:00
Nikolay Aleksandrov	20bb6ce987	net: pim: add a helper to check for IPv4 all pim routers address Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-31 16:18:30 -04:00
Nikolay Aleksandrov	556d299fcb	net: pim: add common pimhdr struct and helpers Add the common pimhdr structure and helpers to access it, also cleanup the format of the header file. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-31 16:18:29 -04:00
David S. Miller	f3a6f592e9	Merge branch 'qed-next' Yuval Mintz says: ==================== qed*: Patch series This series does several things. The bigger changes: - Add new notification APIs [& Defaults] for various fields. The series then utilizes some of those qed <-> qede APIs to bass WoL support upon. - Change the resource allocation scheme to receive the values from management firmware, instead of equally sharing resources between functions [that might not need those]. That would, e.g., allow us to configure additional filters to network interfaces in presence of storage [PCI] functions from same adapter. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-31 15:52:37 -04:00
Tomer Tayar	2edbff8dcb	qed: Learn resources from management firmware Currently, each interfaces assumes it receives an equal portion of HW/FW resources, but this is wasteful - different partitions [and specifically, parititions exposing different protocol support] might require different resources. Implement a new resource learning scheme where the information is received directly from the management firmware [which has knowledge of all of the functions and can serve as arbiter]. Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-31 15:52:36 -04:00
Mintz, Yuval	5a1f965aac	qed: Use VF-queue feature Driver sets several restrictions about the number of supported VFs according to available HW/FW resources. This creates a problem as there are constellations which can't be supported [as limitation don't accurately describe the resources], as well as holes where enabling IOV would fail due to supposed lack of resources. This introduces a new interal feature - vf-queues, which would be used to lift some of the restriction and accurately enumerate the queues that can be used by a given PF's VFs. Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-31 15:52:36 -04:00
Mintz, Yuval	6927e82680	qed: Learn of RDMA capabilities per-device Today, RDMA capabilities are learned from management firmware which provides a per-device indication for all interfaces. Newer management firmware is capable of providing a per-device indication [would later be extended to either RoCE/iWARP]. Try using this newer learning mechanism, but fallback in case management firmware is too old to retain current functionality. Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-31 15:52:36 -04:00
Mintz, Yuval	d7455f6e44	qede: Decouple ethtool caps from qed While the qed_lm_maps is closely tied with the QED_LM_* defines, when iterating over the array use actual size instead of the qed define to prevent future possible issues. Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-31 15:52:35 -04:00
Mintz, Yuval	14d39648cb	qed*: Add support for WoL Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-31 15:52:35 -04:00
Mintz, Yuval	7a4b21b7d1	qed: Add nvram selftest Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-31 15:52:35 -04:00
Sudarsana Kalluru	0fefbfbaad	qed*: Management firmware - notifications and defaults Management firmware is interested in various tidbits about the driver - including the driver state & several configuration related fields [MTU, primtary MAC, etc.]. This adds the necessray logic to update MFW with such configurations, some of which are passed directly via qed while for others APIs are provide so that qede would be able to later configure if needed. This also introduces a new default configuration for MTU which would replace the default inherited by being an ethernet device. Signed-off-by: Sudarsana Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-31 15:52:35 -04:00
Julia Lawall	89d9123e8e	solos-pci: use permission-specific DEVICE_ATTR variants Use DEVICE_ATTR_RW for read-write attributes. This simplifies the source code, improves readbility, and reduces the chance of inconsistencies. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @rw@ declarer name DEVICE_ATTR; identifier x,x_show,x_store; @@ DEVICE_ATTR(x, $0644\\|S_IRUGO\|S_IWUSR$, x_show, x_store); @script:ocaml@ x << rw.x; x_show << rw.x_show; x_store << rw.x_store; @@ if not (x^"_show" = x_show && x^"_store" = x_store) then Coccilib.include_match false @@ declarer name DEVICE_ATTR_RW; identifier rw.x,rw.x_show,rw.x_store; @@ - DEVICE_ATTR(x, $0644\\|S_IRUGO\|S_IWUSR$, x_show, x_store); + DEVICE_ATTR_RW(x); // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-31 15:32:12 -04:00
Julia Lawall	63215705a6	ptp: use permission-specific DEVICE_ATTR variants Use DEVICE_ATTR_RO for read only attributes. This simplifies the source code, improves readbility, and reduces the chance of inconsistencies. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @ro@ declarer name DEVICE_ATTR; identifier x,x_show; @@ DEVICE_ATTR(x, $0444\\|S_IRUGO$, x_show, NULL); @script:ocaml@ x << ro.x; x_show << ro.x_show; @@ if not (x^"_show" = x_show) then Coccilib.include_match false @@ declarer name DEVICE_ATTR_RO; identifier ro.x,ro.x_show; @@ - DEVICE_ATTR(x, $0444\\|S_IRUGO$, x_show, NULL); + DEVICE_ATTR_RO(x); // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-31 15:32:12 -04:00
Daniel Borkmann	0f98621bef	bpf, inode: add support for symlinks and fix mtime/ctime While commit `bb35a6ef7d` ("bpf, inode: allow for rename and link ops") added support for hard links that can be used for prog and map nodes, this work adds simple symlink support, which can be used f.e. for directories also when unpriviledged and works with cmdline tooling that understands S_IFLNK anyway. Since the switch in `e27f4a942a` ("bpf: Use mount_nodev not mount_ns to mount the bpf filesystem"), there can be various mount instances with mount_nodev() and thus hierarchy can be flattened to facilitate object sharing. Thus, we can keep bpf tooling also working by repointing paths. Most of the functionality can be used from vfs library operations. The symlink is stored in the inode itself, that is in i_link, which is sufficient in our case as opposed to storing it in the page cache. While at it, I noticed that bpf_mkdir() and bpf_mkobj() don't update the directories mtime and ctime, so add a common helper for it called bpf_dentry_finalize() that takes care of it for all cases now. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-31 15:28:11 -04:00
Aaron Young	8778b27664	ldmvsw: tx queue stuck in stopped state after LDC reset The following patch fixes an issue with the ldmvsw driver where the network connection of a guest domain becomes non-functional after the guest domain has panic'd and rebooted. The root cause was determined to be from the following series of events: 1. Guest domain panics - resulting in the guest no longer processing network packets (from ldmvsw driver) 2. The ldmvsw driver (in the control domain) eventually exerts flow control due to no more available tx drings and stops the tx queue for the guest domain 3. The LDC of the network connection for the guest is reset when the guest domain reboots after the panic. 4. The LDC reset event is received by the ldmvsw driver and the ldmvsw responds by clearing the tx queue for the guest. 5. ldmvsw waits indefinitely for a DATA ACK from the guest - which is the normal method to re-enable the tx queue. But the ACK never comes because the tx queue was cleared due to the LDC reset. To fix this issue, in addition to clearing the tx queue, re-enable the tx queue on a LDC reset. This prevents the ldmvsw from getting caught in this deadlocked state of waiting for a DATA ACK which will never come. Signed-off-by: Aaron Young <Aaron.Young@oracle.com> Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-31 15:20:29 -04:00
David S. Miller	04f762e8f2	Merge branch 'xps-DCB' Alexander Duyck says: ==================== Add support for XPS when using DCB This patch series enables proper isolation between traffic classes when using XPS while DCB is enabled. Previously enabling XPS would cause the traffic to be potentially pulled from one traffic class into another on egress. This change essentially multiplies the XPS map by the number of traffic classes and allows us to do a lookup per traffic class for a given CPU. To guarantee the isolation I invalidate the XPS map for any queues that are moved from one traffic class to another, or if we change the number of traffic classes. v2: Added sysfs to display traffic class Replaced do/while with for loop Cleaned up several other for for loops throughout the patch ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-31 15:00:48 -04:00
Alexander Duyck	184c449f91	net: Add support for XPS with QoS via traffic classes This patch adds support for setting and using XPS when QoS via traffic classes is enabled. With this change we will factor in the priority and traffic class mapping of the packet and use that information to correctly select the queue. This allows us to define a set of queues for a given traffic class via mqprio and then configure the XPS mapping for those queues so that the traffic flows can avoid head-of-line blocking between the individual CPUs if so desired. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-31 15:00:48 -04:00
Alexander Duyck	6234f87407	net: Refactor removal of queues from XPS map and apply on num_tc changes This patch updates the code for removing queues from the XPS map and makes it so that we can apply the code any time we change either the number of traffic classes or the mapping of a given block of queues. This way we avoid having queues pulling traffic from a foreign traffic class. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-31 15:00:48 -04:00
Alexander Duyck	8d059b0f6f	net: Add sysfs value to determine queue traffic class Add a sysfs attribute for a Tx queue that allows us to determine the traffic class for a given queue. This will allow us to more easily determine this in the future. It is needed as XPS will take the traffic class for a group of queues into account in order to avoid pulling traffic from one traffic class into another. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-31 15:00:47 -04:00
Alexander Duyck	9cf1f6a8c4	net: Move functions for configuring traffic classes out of inline headers The functions for configuring the traffic class to queue mappings have other effects that need to be addressed. Instead of trying to export a bunch of new functions just relocate the functions so that we can instrument them directly with the functionality they will need. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-31 15:00:47 -04:00
Gao Feng	20861f26e3	driver: tun: Use new macro SOCK_IOC_TYPE instead of literal number 0x89 The current codes use _IOC_TYPE(cmd) == 0x89 to check if the cmd is one socket ioctl command like SIOCGIFHWADDR. But the literal number 0x89 may confuse readers. So create one macro SOCK_IOC_TYPE to enhance the readability. Signed-off-by: Gao Feng <fgao@ikuai8.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-31 10:56:47 -04:00
Andrey Vagin	c62cce2cae	net: add an ioctl to get a socket network namespace Each socket operates in a network namespace where it has been created, so if we want to dump and restore a socket, we have to know its network namespace. We have a socket_diag to get information about sockets, it doesn't report sockets which are not bound or connected. This patch introduces a new socket ioctl, which is called SIOCGSKNS and used to get a file descriptor for a socket network namespace. A task must have CAP_NET_ADMIN in a target network namespace to use this ioctl. Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andrei Vagin <avagin@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-31 10:56:36 -04:00
David S. Miller	2a43ca0aa9	mv643xx_eth: Properly resolve merge conflict. The second SET_NETDEV_DEV() in the hunk should be removed. Reported-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-31 10:33:08 -04:00
David S. Miller	d8e4aa0707	mv643xx_eth: Fix merge error. One merge conflict block wasn't resolved. Reported-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-31 08:59:27 -04:00
David S. Miller	0a6ce1e3c1	Mellanox ConnectX-4/Connect-IB shared code (IB & ETH part) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJYFfxWAAoJEORje4g2clinr4MQAMUMoO4akLhppyUuLiT2ZZHc KshFwUF5RnZq6c2qXSxTVhfajyG6q71EwQNGaAMeGsjoCXM+u99Adgp/lYkXxbzY e4W3gCvjGWIik5d3JRVq07XutVpeJIc/Qvvc5zc1lUYNR5f71iBw538eG9ic4PXi 4/CpRvcsa8Z9sbtKPcjHwQQRd4ewx/KAD6QyOsVz9GgkBeNMYag3SO731DYSkjRC MYK85arNC1JUE/MHQKIfYQvjiJVfEyt2FvC8v9tW+bhzP6dAzxRY0yd8ZFJtCiYH GFGy8vdeCA/0dFRD5cYKPKBiFwUbRC8bt2lLC5ZoUic2nZ23LlO67uDkaPDRYckt oyhErFRX6Q/goqKFCI4tLUoSBF1bhy9EnbWyOWmcW7qpXRD3VCclS0Ctr++yJnv2 bhhlID56f+dX+rnW/OAERrk8MdVHo5xBUzQ8ZAAF3WDP9LqW+qlYVrEvrFqFIeFM OCGUbW2xsZaHMZRyx0K6068hy8O4EujjgC9PARi65rrZAAwxlDm4ElJEYvzXZVXA YoMeXiZGrhoj7+h8OorV0TyB+7mUgxFNlq0tCoi193QS+zIuQqf3XYNmuQGCUflF YdzZs7/9LpANN4e5yDTCq3CIr8yYv9sRrdnSX0iShvbFBAKClLxpjj2xkpayHFVR 8CRvlb5O1v1fIwSn2z/I =D5Ix -----END PGP SIGNATURE----- Merge tag 'shared-for-4.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma Saeed Mahameed says: ==================== Mellanox mlx5 core driver updates 2016-10-25 This series contains some updates and fixes of mlx5 core and IB drivers with the addition of two features that demand new low level commands and infrastructure updates. - SRIOV VF max rate limit support - mlx5e tc support for FWD rules with counter. Needed for both net and rdma subsystems. Updates and Fixes: From Saeed Mahameed (2): - mlx5 IB: Skip handling unknown mlx5 events - Add ConnectX-5 PCIe 4.0 VF device ID From Artemy Kovalyov (2): - Update struct mlx5_ifc_xrqc_bits - Ensure SRQ physical address structure endianness From Eugenia Emantayev (1): - Fix length of async_event_mask New Features: From Mohamad Haj Yahia (3): mlx5 SRIOV VF max rate limit support - Introduce TSAR manipulation firmware commands - Introduce E-switch QoS management - Add SRIOV VF max rate configuration support From Mark Bloch (7): mlx5e Tc support for FWD rule with counter - Don't unlock fte while still using it - Use fte status to decide on firmware command - Refactor find_flow_rule - Group similar rules under the same fte - Add multi dest support - Add option to add fwd rule with counter - mlx5e tc support for FWD rule with counter Mark here fixed two trivial issues with the flow steering core, and did some refactoring in the flow steering API to support adding mulit destination rules to the same hardware flow table entry at once. In the last two patches added the ability to populate a flow rule with a flow counter to the same flow entry. V2: Dropped some patches that added new structures without adding any usage of them. Added SRIOV VF max rate configuration support patch that introduces the usage of the TSAR infrastructure. Added flow steering fixes and refactoring in addition to mlx5 tc support for forward rule with counter. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-30 17:31:12 -04:00
Philippe Reynes	d46b634945	net: bonding: use new api ethtool_{get\|set}_link_ksettings The ethtool api {get\|set}_settings is deprecated. We move this driver to new api {get\|set}_link_ksettings. Signed-off-by: Philippe Reynes <tremyfr@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-30 17:23:39 -04:00
David S. Miller	2701c3b39d	Merge branch 'mlxsw-IB' Jiri Pirko says: ==================== mlxsw: Add Infiniband support for Mellanox switches This patchset adds basic Infiniband support for SwitchX-2, Switch-IB and Switch-IB-2 ASIC drivers. SwitchX-2 ASIC is VPI capable, which means each port can be either Ethernet or Infiniband. When the port is configured as Infiniband, the Subnet Management Agent (SMA) is managed by the SwitchX-2 firmware and not by the host. Port configuration, MTU and more are configured remotely by the Subnet Manager (SM). Usage: $ devlink port show pci/0000:03:00.0/1: type eth netdev eth0 pci/0000:03:00.0/3: type eth netdev eth1 pci/0000:03:00.0/5: type eth netdev eth2 pci/0000:03:00.0/6: type eth netdev eth3 pci/0000:03:00.0/8: type eth netdev eth4 $ devlink port set pci/0000:03:00.0/1 type ib $ devlink port show pci/0000:03:00.0/1: type ib Switch-IB (FDR) and Switch-IB-2 (EDR 100Gbs) ASICs are Infiniband-only switches. The support provided in the mlxsw_switchib.ko driver is port initialization only. The firmware running in the Silicon implements the SMA. Please note that this patchset does only very basic port initialization. ib_device or RDMA implementations are not part of this patchset. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-30 16:50:20 -04:00
Elad Raz	d1ba526384	mlxsw: switchib: Introduce SwitchIB and SwitchIB silicon driver SwitchIB and SwitchIB-2 are Infiniband switches with up to 36 ports. This driver initialize the hardware and Firmware which implements the IB management and connection with the SM. Signed-off-by: Elad Raz <eladr@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-30 16:50:17 -04:00
Elad Raz	64b92b0114	mlxsw: switchx2: Add IB port support SwitchX-2 is IB capable device. This patch add a support to change the port type between Ethernet and Infiniband. When the port is set to IB, the FW implements the Subnet Management Agent (SMA) manage the port. All port attributes can be control remotely by the SM. Usage: $ devlink port show pci/0000:03:00.0/1: type eth netdev eth0 pci/0000:03:00.0/3: type eth netdev eth1 pci/0000:03:00.0/5: type eth netdev eth2 pci/0000:03:00.0/6: type eth netdev eth3 pci/0000:03:00.0/8: type eth netdev eth4 $ devlink port set pci/0000:03:00.0/1 type ib $ devlink port show pci/0000:03:00.0/1: type ib Signed-off-by: Elad Raz <eladr@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-30 16:50:17 -04:00
Elad Raz	03ddb787c7	mlxsw: switchx2: Add eth prefix to port create and remove Since we are about to add Infiniband port remove and create we will add "eth" prefix to port create and remove APIs. Signed-off-by: Elad Raz <eladr@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-30 16:50:17 -04:00
Elad Raz	0c81ea5db2	mlxsw: core: Add port type (Eth/IB) set API Add "port_type_set" API to mlxsw core. The core layer send the change type callback to the port along with it's private information. Signed-off-by: Elad Raz <eladr@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-30 16:50:17 -04:00

... 2 3 4 5 6 ...

634605 commits