Commit 0a5539f661 ("bpf: Provide a linux/types.h override
for bpf selftests.") caused a build failure for tools/testing/selftest/bpf
because of some missing types:
$ make -C tools/testing/selftests/bpf/
...
In file included from /home/yhs/work/net-next/tools/testing/selftests/bpf/test_pkt_access.c:8:
../../../include/uapi/linux/bpf.h:170:3: error: unknown type name '__aligned_u64'
__aligned_u64 key;
...
/usr/include/linux/swab.h:160:8: error: unknown type name '__always_inline'
static __always_inline __u16 __swab16p(const __u16 *p)
...
The type __aligned_u64 is defined in linux:include/uapi/linux/types.h.
The fix is to copy missing type definition into
tools/testing/selftests/bpf/include/uapi/linux/types.h.
Adding additional include "string.h" resolves __always_inline issue.
Fixes: 0a5539f661 ("bpf: Provide a linux/types.h override for bpf selftests.")
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- A few request-based DM and DM multipath fixes for issues that were
made when merging Christoph's changes with Bart's changes for 4.12
- A DM bufio unsigned overflow fix
- A couple pure fixes for the DM cache target.
- Various very small tweaks to the DM cache target that enable
considerable speed improvements in the face of continuous IO. Given
that the cache target was significantly reworked for 4.12 I see no
reason to sit on these advances until 4.13 considering the favorable
results associated with such minimalist tweaks.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJZHJ6hAAoJEMUj8QotnQNazb4H/2i+OdNU3qEv0wg0Vf2fdWbU
M0FYLDwuEBx+/RCBoJbjLY4enfup2Ak4ykxivt1gDZL4sBY8bsf/jxjgTKD1icp4
tV+6tLFbzVZwp2JtDgbWJ0FuYEfINxNwVJYRUY6dbgsWQPCxKwYAYnYa102no78t
pqykpj/jfQB5ru5bNDVC/KcV8fj+3mc4H7IJxGeEnVzoXyW6wsnvP+6FqOHnuocE
HU2zCzit1nfLtI7eL3I3B5nQKVtkPZoR/ILFyo7viU1EKA9zNEjkLI7EOKrMvPNC
b0H6lqMkoBIWDw22sxkk/utpywFVqQ1K7kyAWTa1XsSsSvritSmZP2k+dCEkk/U=
=5lrV
-----END PGP SIGNATURE-----
Merge tag 'for-4.12/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:
- a couple DM thin provisioning fixes
- a few request-based DM and DM multipath fixes for issues that were
made when merging Christoph's changes with Bart's changes for 4.12
- a DM bufio unsigned overflow fix
- a couple pure fixes for the DM cache target.
- various very small tweaks to the DM cache target that enable
considerable speed improvements in the face of continuous IO. Given
that the cache target was significantly reworked for 4.12 I see no
reason to sit on these advances until 4.13 considering the favorable
results associated with such minimalist tweaks.
* tag 'for-4.12/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm cache: handle kmalloc failure allocating background_tracker struct
dm bufio: make the parameter "retain_bytes" unsigned long
dm mpath: multipath_clone_and_map must not return -EIO
dm mpath: don't return -EIO from dm_report_EIO
dm rq: add a missing break to map_request
dm space map disk: fix some book keeping in the disk space map
dm thin metadata: call precommit before saving the roots
dm cache policy smq: don't do any writebacks unless IDLE
dm cache: simplify the IDLE vs BUSY state calculation
dm cache: track all IO to the cache rather than just the origin device's IO
dm cache policy smq: stop preemptively demoting blocks
dm cache policy smq: put newly promoted entries at the top of the multiqueue
dm cache policy smq: be more aggressive about triggering a writeback
dm cache policy smq: only demote entries in bottom half of the clean multiqueue
dm cache: fix incorrect 'idle_time' reset in IO tracker
Pull i2c fixes from Wolfram Sang:
"Here are some bugfixes from I2C, especially removing a wrongly
displayed error message for all i2c muxes"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: xgene: Set ACPI_COMPANION_I2C
i2c: mv64xxx: don't override deferred probing when getting irq
i2c: mux: only print failure message on error
i2c: mux: reg: rename label to indicate what it does
i2c: mux: reg: put away the parent i2c adapter on probe failure
Andrew Lunn says:
====================
net: phy: marvell: Checkpatch cleanup
I will be contributing a few new features to the Marvell PHY driver
soon. Start by making the code mostly checkpatch clean. There should
not be any functional changes. Just comments set into the correct
format, missing blank lines, turn some comparisons around, and
refactoring to reduce indentation depth.
There is still one camel in the code, but it actually makes sense, so
leave it in piece.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Makes the code a bit more readable, and solves quite a few checkpatch
warnings of lines longer than 80 characters.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Break big functions up by using a number of smaller helper
function. Solves some of the over 80 lines warnings, by reducing the
indentation level.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Avoid multiple assignments
Comparisons should place the constant on the right side of the test
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove the extra blank lines, add one in where recommended.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use net style comment blocks, and wrap one block with long lines.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet says:
====================
tcp: TCP TS option use 1 ms clock
TCP Timestamps option is defined in RFC 7323
Traditionally on linux, it has been tied to the internal
'jiffy' variable, because it had been a cheap and good enough
generator.
Unfortunately some distros use HZ=250 or even HZ=100 leading
to not very useful TCP timestamps.
For TCP flows in the DC, Google has used usec resolution for more
than two years with great success [1].
RCVBUF autotuning is more precise.
This series converts tp->tcp_mstamp to a plain u64 value storing
a 1 usec TCP clock.
This choice will allow us to upstream the 1 usec TS option as
discussed in IETF 97.
Kathleen Nichols [2] and others advocate for 1ms TS clocks for
network analysis. (1ms being the lowest value supported by RFC 7323.)
[1] https://www.ietf.org/proceedings/97/slides/slides-97-tcpm-tcp-options-for-low-latency-00.pdf
[2] http://netseminar.stanford.edu/seminars/02_02_17.pdf
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
TCP Timestamps option is defined in RFC 7323
Traditionally on linux, it has been tied to the internal
'jiffies' variable, because it had been a cheap and good enough
generator.
For TCP flows on the Internet, 1 ms resolution would be much better
than 4ms or 10ms (HZ=250 or HZ=100 respectively)
For TCP flows in the DC, Google has used usec resolution for more
than two years with great success [1]
Receive size autotuning (DRS) is indeed more precise and converges
faster to optimal window size.
This patch converts tp->tcp_mstamp to a plain u64 value storing
a 1 usec TCP clock.
This choice will allow us to upstream the 1 usec TS option as
discussed in IETF 97.
[1] https://www.ietf.org/proceedings/97/slides/slides-97-tcpm-tcp-options-for-low-latency-00.pdf
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After this patch, all uses of tcp_time_stamp will require
a change when we introduce 1 ms and/or 1 us TCP TS option.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tcp_time_stamp will become slightly more expensive soon,
cache its value.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This CC does not need 1 ms tcp_time_stamp and can use
the jiffy based 'timestamp'.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This place wants to use tcp_jiffies32, this is good enough.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tcp_time_stamp will no longer be tied to jiffies.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use tcp_jiffies32 instead of tcp_time_stamp, since
tcp_time_stamp will soon be only used for TCP TS option.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use tcp_jiffies32 instead of tcp_time_stamp, since
tcp_time_stamp will soon be only used for TCP TS option.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use tcp_jiffies32 instead of tcp_time_stamp, since
tcp_time_stamp will soon be only used for TCP TS option.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use tcp_jiffies32 instead of tcp_time_stamp, since
tcp_time_stamp will soon be only used for TCP TS option.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use tcp_jiffies32 instead of tcp_time_stamp to feed
tp->snd_cwnd_stamp.
tcp_time_stamp will soon be a litle bit more expensive
than simply reading 'jiffies'.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use tcp_jiffies32 instead of tcp_time_stamp to feed
tp->lsndtime.
tcp_time_stamp will soon be a litle bit more expensive
than simply reading 'jiffies'.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use our own macro instead of abusing tcp_time_stamp
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We abuse tcp_time_stamp for two different cases :
1) base to generate TCP Timestamp options (RFC 7323)
2) A 32bit version of jiffies since some TCP fields
are 32bit wide to save memory.
Since we want in the future to have 1ms TCP TS clock,
regardless of HZ value, we want to cleanup things.
tcp_jiffies32 is the truncated jiffies value,
which will be used only in places where we want a 'host'
timestamp.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Idea is to later convert tp->tcp_mstamp to a full u64 counter
using usec resolution, so that we can later have fine
grained TCP TS clock (RFC 7323), regardless of HZ value.
We try to refresh tp->tcp_mstamp only when necessary.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We still need to initialize err to -EINVAL for
the case where 'opt' is NULL in dsmark_init().
Fixes: 6529eaba33 ("net: sched: introduce tcf block infractructure")
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko says:
====================
net: sched: introduce multichain support for filters
Currently, each classful qdisc holds one chain of filters.
This chain is traversed and each filter could be matched on, which
may lead to execution of list of actions. One of such action
could be "reclassify", which would "reset" the processing of the
filter chain.
So this filter chain could be looked at as a flat table.
Sometimes it is convenient for user to configure a hierarchy
of tables. Example usecase is encapsulation.
Hierarchy of tables is a common way how it is done in HW pipelines.
So it is much more convenient to offload this.
This patchset contains two major patches:
8/10 - This patch introduces the support for having multiple
chains of filters.
10/10 - This patch adds new control action to allow going to specified chain
The rest of the patches are smaller or bigger depencies of those 2.
Please see individual patch descriptions for details.
Corresponding iproute2 patches are appended as a reply to this cover letter.
Simple example:
$ tc qdisc add dev eth0 ingress
$ tc filter add dev eth0 parent ffff: protocol ip pref 33 flower dst_mac 52:54:00:3d:c7:6d action goto chain 11
$ tc filter add dev eth0 parent ffff: protocol ip pref 22 chain 11 flower dst_ip 192.168.40.1 action drop
$ tc filter show dev eth0 root
filter parent ffff: protocol ip pref 33 flower chain 0
filter parent ffff: protocol ip pref 33 flower chain 0 handle 0x1
dst_mac 52:54:00:3d:c7:6d
eth_type ipv4
action order 1: gact action goto chain 11
random type none pass val 0
index 2 ref 1 bind 1
filter parent ffff: protocol ip pref 22 flower chain 11
filter parent ffff: protocol ip pref 22 flower chain 11 handle 0x1
eth_type ipv4
dst_ip 192.168.40.1
action order 1: gact action drop
random type none pass val 0
index 3 ref 1 bind 1
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce new type of termination action called "goto_chain". This allows
user to specify a chain to be processed. This action type is
then processed as a return value in tcf_classify loop in similar
way as "reclassify" is, only it does not reset to the first filter
in chain but rather reset to the first filter of the desired chain.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tp pointer will be needed by the next patch in order to get the chain.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of having only one filter per block, introduce a list of chains
for every block. Create chain 0 by default. UAPI is extended so the user
can specify which chain he wants to change. If the new attribute is not
specified, chain 0 is used. That allows to maintain backward
compatibility. If chain does not exist and user wants to manipulate with
it, new chain is created with specified index. Also, when last filter is
removed from the chain, the chain is destroyed.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since there will be multiple chains to dump, push chain dumping code to
a separate function.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce struct tcf_chain object and set of helpers around it. Wraps up
insertion, deletion and search in the filter chain.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Call the helper from the function rather than to always adjust the
return value of the function.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The use of "nprio" variable in tc_ctl_tfilter is a bit cryptic and makes
a reader wonder what is going on for a while. So help him to understand
this priority allocation dance a litte bit better.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make the name consistent with the rest of the helpers around.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, the filter chains are direcly put into the private structures
of qdiscs. In order to be able to have multiple chains per qdisc and to
allow filter chains sharing among qdiscs, there is a need for common
object that would hold the chains. This introduces such object and calls
it "tcf_block".
Helpers to get and put the blocks are provided to be called from
individual qdisc code. Also, the original filter_list pointers are left
in qdisc privs to allow the entry into tcf_block processing without any
added overhead of possible multiple pointer dereference on fast path.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move tc_classify function to cls_api.c where it belongs, rename it to
fit the namespace.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn says:
====================
net: dsa: Sort various lists
As we gain more DSA drivers and tagging protocols, the lists are
getting a bit unruly. Do some sorting.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
With more drivers being added, it is time to sort the drivers to
impose some order.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With more tag protocols being added, regain some order by sorting the
entries in various places.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan says:
====================
bnxt_en: DCBX fixes.
2 bug fixes for the case where the NIC's firmware DCBX agent is enabled.
With these fixes, we will return the proper information to lldpad.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Otherwise, all the host based DCBX settings from lldpad will fail if the
firmware DCBX agent is running.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the current code, bnxt_dcb_init() is called too early before we
determine if the firmware DCBX agent is running or not. As a result,
we are not setting the DCB_CAP_DCBX_HOST and DCB_CAP_DCBX_LLD_MANAGED
flags properly to report to DCBNL.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If CONFIG_INET is not set, net/core/sock.c can not compile :
net/core/sock.c: In function ‘skb_orphan_partial’:
net/core/sock.c:1810:2: error: implicit declaration of function
‘skb_is_tcp_pure_ack’ [-Werror=implicit-function-declaration]
if (skb_is_tcp_pure_ack(skb))
^
Fix this by always including <net/tcp.h>
Fixes: f6ba8d33cf ("netem: fix skb_orphan_partial()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ftrace function_graph time measurements of a given function is not
accurate according to those recorded by ftrace using the function
filters. This change pulls the x86_64 fix from 'commit 722b3c7469
("ftrace/graph: Trace function entry before updating index")' into the
sparc specific prepare_ftrace_return which stops ftrace from
counting interrupted tasks in the time measurement.
Example measurements for select_task_rq_fair running "hackbench 100
process 1000":
| tracing/trace_stat/function0 | function_graph
Before patch | 2.802 us | 4.255 us
After patch | 2.749 us | 3.094 us
Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Greetings,
GCC 7 introduced the -Wstringop-overflow flag to detect buffer overflows
in calls to string handling functions [1][2]. Due to the way
``empty_zero_page'' is declared in arch/sparc/include/setup.h, this
causes a warning to trigger at compile time in the function mem_init(),
which is subsequently converted to an error. The ensuing patch fixes
this issue and aligns the declaration of empty_zero_page to that of
other architectures. Thank you.
Cheers,
Orlando.
[1] https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02308.html
[2] https://gcc.gnu.org/gcc-7/changes.html
Signed-off-by: Orlando Arias <oarias@knights.ucf.edu>
--------------------------------------------------------------------------------
Signed-off-by: David S. Miller <davem@davemloft.net>
An incorrect huge page alignment check caused
mmap failure for 64K pages when MAP_FIXED is used
with address not aligned to HPAGE_SIZE.
Orabug: 25885991
Fixes: dcd1912d21 ("sparc64: Add 64K page size support")
Signed-off-by: Nitin Gupta <nitin.m.gupta@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the function assigned to .ndo_set_vf_mac, check the validity of the
vfidx argument before proceeding to tell the firmware to set the VF MAC
address.
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>