Commit graph

753459 commits

Author SHA1 Message Date
Stefan Raspl 2ef4f27ad0 smc: allocate RMBs as compound pages
Preparatory work for splice() support.

Signed-off-by: Stefan Raspl <raspl@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com><
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-04 11:45:06 -04:00
Stefan Raspl b51fa1b135 smc: make smc_rx_wait_data() generic
Turn smc_rx_wait_data into a generic function that can be used at various
instances to wait on traffic to complete with varying criteria.

Signed-off-by: Stefan Raspl <raspl@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com><
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-04 11:45:06 -04:00
Stefan Raspl c8b8ec8e0d smc: simplify abort logic
Some of the conditions to exit recv() are common in two pathes - cleaning up
code by moving the check up so we have it only once.

Signed-off-by: Stefan Raspl <raspl@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com><
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-04 11:45:06 -04:00
David S. Miller a7b15ab887 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Overlapping changes in selftests Makefile.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-04 09:58:56 -04:00
David S. Miller b05f03b232 Merge branch 'sh_eth-complain-on-access-to-unimplemented-TSU-registers'
Sergei Shtylyov says:

====================
sh_eth: complain on access to unimplemented TSU registers

Here's a set of 2 patches against DaveM's 'net-next.git' repo. The 1st patch
routes TSU_POST<n> register accesses thru sh_eth_tsu_{read|write}() and the 2nd
added WARN_ON() unimplemented register to those functions. I'm going to deal with
TSU_ADR{H|L}<n> registers in a later series...
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-04 09:11:50 -04:00
Sergei Shtylyov 627a0d20fa sh_eth: WARN_ON() access to unimplemented TSU register
Commit 3365711df0 ("sh_eth: WARN on access to a register not implemented
in a particular chip") added  WARN_ON() to sh_eth_{read|write}() but not
to sh_eth_tsu_{read|write}(). Now that we've routed almost all TSU register
accesses  (except TSU_ADR{H|L}<n> -- which are special) thru the latter
pair of accessors, it makes sense to check for the unimplemented TSU
registers as well...

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-04 09:11:45 -04:00
Sergei Shtylyov 77cb065fde sh_eth: use TSU register accessors for TSU_POST<n>
There's no particularly good reason TSU_POST<n> registers get accessed
circumventing sh_eth_tsu_{read|write}() -- start using those, removing
(badly named) sh_eth_tsu_get_post_reg_offset(),  while at it...

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-04 09:11:45 -04:00
Linus Torvalds 1504269814 linux-kselftest-4.17-rc4
This Kselftest update for 4.17-rc4 consists of a fix for a syntax error
 in the script that runs selftests. Mathieu Desnoyers found this bug in
 the script on systems running GNU Make 3.8 or older.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJa63FsAAoJEAsCRMQNDUMcnOsP/1bpgIW8zDxhjviswVnFAXtU
 C0W5gs7jPOZsAOrIMpdJx6eN9WokS75yPa18cNwvKEZ6CEj5E1PkaCySdBvVAqgU
 ACxRV1NA3e6zsKxMXhZhC+O1+KJKTg7HaGspICSVbCBO2GRdrUixiLoVOSyx8VVZ
 XQr0dP9DKorSdpaf77xY6gmGIf7zifc/+jCVp+foMG4Cmdvsq7N8GHb/iR6Tj4Gu
 L88hW43qquCzo2q4zBZTbo5GUz9r0ctst0/HRAyci+M6gIQ9GwZbE5KplZT8iQVM
 KLP6umAQEVP1VDeq1TZs4s0+NQ6/p8h7MtxDrdmIYoQ0usNTH5oBqYg4FWHdYzPG
 aj6ahbOnim6B0SsO1+3SARSfY9a/o3BmDwdH5HhxhP9of11UVscqRJFnncHvIUen
 XpEWrW7E4jnMb8tiwD/ewu718QoWhr/9OG5PzF+vsCaF2Xw/gJB05NSH8oAVk38q
 B4WzyGoYZjuSUaJw/8PUm8I/33cDr9H9jQRtMvgmYXvxPbpKXRPAz4IY4LJeSx6J
 sZ9NsT+ZiXm4mnNjJI9Oim3yOYY5FEZ/KTk4vDNDulX3H/QxfN9A3R5bfcaiF3Fw
 pTqc/lf6PPLtqGPNss1DU8fyGkvXsfYt4bI2ctOqegsP5+xs8NUrJMuzVSuHvC7a
 1poSVAQ6vORC46zgwrQc
 =l8ln
 -----END PGP SIGNATURE-----

Merge tag 'linux-kselftest-4.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kselftest fixes from Shuah Khan:
 "This Kselftest update for 4.17-rc4 consists of a fix for a syntax
  error in the script that runs selftests. Mathieu Desnoyers found this
  bug in the script on systems running GNU Make 3.8 or older"

* tag 'linux-kselftest-4.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests: Fix lib.mk run_tests target shell script
2018-05-03 19:26:51 -10:00
Linus Torvalds e523a2562a Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Various sockmap fixes from John Fastabend (pinned map handling,
    blocking in recvmsg, double page put, error handling during redirect
    failures, etc.)

 2) Fix dead code handling in x86-64 JIT, from Gianluca Borello.

 3) Missing device put in RDS IB code, from Dag Moxnes.

 4) Don't process fast open during repair mode in TCP< from Yuchung
    Cheng.

 5) Move address/port comparison fixes in SCTP, from Xin Long.

 6) Handle add a bond slave's master into a bridge properly, from
    Hangbin Liu.

 7) IPv6 multipath code can operate on unitialized memory due to an
    assumption that the icmp header is in the linear SKB area. Fix from
    Eric Dumazet.

 8) Don't invoke do_tcp_sendpages() recursively via TLS, from Dave
    Watson.

9) Fix memory leaks in x86-64 JIT, from Daniel Borkmann.

10) RDS leaks kernel memory to userspace, from Eric Dumazet.

11) DCCP can invoke a tasklet on a freed socket, take a refcount. Also
    from Eric Dumazet.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (78 commits)
  dccp: fix tasklet usage
  smc: fix sendpage() call
  net/smc: handle unregistered buffers
  net/smc: call consolidation
  qed: fix spelling mistake: "offloded" -> "offloaded"
  net/mlx5e: fix spelling mistake: "loobpack" -> "loopback"
  tcp: restore autocorking
  rds: do not leak kernel memory to user land
  qmi_wwan: do not steal interfaces from class drivers
  ipv4: fix fnhe usage by non-cached routes
  bpf: sockmap, fix error handling in redirect failures
  bpf: sockmap, zero sg_size on error when buffer is released
  bpf: sockmap, fix scatterlist update on error path in send with apply
  net_sched: fq: take care of throttled flows before reuse
  ipv6: Revert "ipv6: Allow non-gateway ECMP for IPv6"
  bpf, x64: fix memleak when not converging on calls
  bpf, x64: fix memleak when not converging after image
  net/smc: restrict non-blocking connect finish
  8139too: Use disable_irq_nosync() in rtl8139_poll_controller()
  sctp: fix the issue that the cookie-ack with auth can't get processed
  ...
2018-05-03 18:57:03 -10:00
Linus Torvalds bb609316d4 Merge branch 'parisc-4.17-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc fixes from Helge Deller:
 "Fix two section mismatches, convert to read_persistent_clock64(), add
  further documentation regarding the HPMC crash handler and make
  bzImage the default build target"

* 'parisc-4.17-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Fix section mismatches
  parisc: drivers.c: Fix section mismatches
  parisc: time: Convert read_persistent_clock() to read_persistent_clock64()
  parisc: Document rules regarding checksum of HPMC handler
  parisc: Make bzImage default build target
2018-05-03 18:31:19 -10:00
Eric Dumazet a8d7aa17bb dccp: fix tasklet usage
syzbot reported a crash in tasklet_action_common() caused by dccp.

dccp needs to make sure socket wont disappear before tasklet handler
has completed.

This patch takes a reference on the socket when arming the tasklet,
and moves the sock_put() from dccp_write_xmit_timer() to dccp_write_xmitlet()

kernel BUG at kernel/softirq.c:514!
invalid opcode: 0000 [#1] SMP KASAN
Dumping ftrace buffer:
   (ftrace buffer empty)
Modules linked in:
CPU: 1 PID: 17 Comm: ksoftirqd/1 Not tainted 4.17.0-rc3+ #30
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:tasklet_action_common.isra.19+0x6db/0x700 kernel/softirq.c:515
RSP: 0018:ffff8801d9b3faf8 EFLAGS: 00010246
dccp_close: ABORT with 65423 bytes unread
RAX: 1ffff1003b367f6b RBX: ffff8801daf1f3f0 RCX: 0000000000000000
RDX: ffff8801cf895498 RSI: 0000000000000004 RDI: 0000000000000000
RBP: ffff8801d9b3fc40 R08: ffffed0039f12a95 R09: ffffed0039f12a94
dccp_close: ABORT with 65423 bytes unread
R10: ffffed0039f12a94 R11: ffff8801cf8954a3 R12: 0000000000000000
R13: ffff8801d9b3fc18 R14: dffffc0000000000 R15: ffff8801cf895490
FS:  0000000000000000(0000) GS:ffff8801daf00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b2bc28000 CR3: 00000001a08a9000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 tasklet_action+0x1d/0x20 kernel/softirq.c:533
 __do_softirq+0x2e0/0xaf5 kernel/softirq.c:285
dccp_close: ABORT with 65423 bytes unread
 run_ksoftirqd+0x86/0x100 kernel/softirq.c:646
 smpboot_thread_fn+0x417/0x870 kernel/smpboot.c:164
 kthread+0x345/0x410 kernel/kthread.c:238
 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:412
Code: 48 8b 85 e8 fe ff ff 48 8b 95 f0 fe ff ff e9 94 fb ff ff 48 89 95 f0 fe ff ff e8 81 53 6e 00 48 8b 95 f0 fe ff ff e9 62 fb ff ff <0f> 0b 48 89 cf 48 89 8d e8 fe ff ff e8 64 53 6e 00 48 8b 8d e8
RIP: tasklet_action_common.isra.19+0x6db/0x700 kernel/softirq.c:515 RSP: ffff8801d9b3faf8

Fixes: dc841e30ea ("dccp: Extend CCID packet dequeueing interface")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Cc: dccp@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-03 15:14:57 -04:00
David S. Miller 31140b47fe Merge branch 'smc-fixes'
Ursula Braun says:

====================
net/smc: fixes 2018/05/03

here are smc fixes for 2 problems:
 * receive buffers in SMC must be registered. If registration fails
   these buffers must not be kept within the link group for reuse.
   Patch 1 is a preparational patch; patch 2 contains the fix.
 * sendpage: do not hold the sock lock when calling kernel_sendpage()
             or sock_no_sendpage()
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-03 14:47:32 -04:00
Stefan Raspl bda27ff5c4 smc: fix sendpage() call
The sendpage() call grabs the sock lock before calling the default
implementation - which tries to grab it once again.

Signed-off-by: Stefan Raspl <raspl@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com><
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-03 14:47:31 -04:00
Karsten Graul a6920d1d13 net/smc: handle unregistered buffers
When smc_wr_reg_send() fails then tag (regerr) the affected buffer and
free it in smc_buf_unuse().

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-03 14:47:31 -04:00
Karsten Graul e63a5f8c19 net/smc: call consolidation
Consolidate the call to smc_wr_reg_send() in a new function.
No functional changes.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-03 14:47:31 -04:00
Colin Ian King df80b8fb3c qed: fix spelling mistake: "offloded" -> "offloaded"
Trivial fix to spelling mistake in DP_NOTICE message

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-03 14:45:51 -04:00
David S. Miller 62264f99fb Merge branch 'bridge-FDB-Notify-about-removal-of-non-user-added-entries'
Petr Machata says:

====================
bridge: FDB: Notify about removal of non-user-added entries

Device drivers may generally need to keep in sync with bridge's FDB. In
particular, for its offload of tc mirror action where the mirrored-to
device is a gretap device, mlxsw needs to listen to a number of events,
FDB events among the others. SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE would be
a natural notification in that case.

However, for removal of FDB entries added due to device activity (as
opposed to explicit addition through "bridge fdb add" or similar), there
are no notifications.

Thus in patch #1, add the "added_by_user" field to switchdev
notifications sent for FDB activity. Adapt drivers to ignore activity on
non-user-added entries, to maintain the current behavior. Specifically
in case of mlxsw, allow mlxsw_sp_span_respin() call for any and all FDB
updates.

In patch #2, change the bridge driver to actually emit notifications for
these FDB entries. Take care not to send notification for bridge
updates that itself originate in SWITCHDEV_FDB_*_TO_BRIDGE events.

Changes from v1 to v2:
- Instead of introducing a new variant of fdb_delete(), add a new
  parameter to the existing function.
- Name the parameter swdev_notify, not notify.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-03 13:46:48 -04:00
Petr Machata 161d82de1f net: bridge: Notify about !added_by_user FDB entries
Do not automatically bail out on sending notifications about activity on
non-user-added FDB entries. Instead, notify about this activity except
for cases where the activity itself originates in a notification, to
avoid sending duplicate notifications.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Acked-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-03 13:46:47 -04:00
Petr Machata 816a3bed95 switchdev: Add fdb.added_by_user to switchdev notifications
The following patch enables sending notifications also for events on FDB
entries that weren't added by the user. Give the drivers the information
necessary to distinguish between the two origins of FDB entries.

To maintain the current behavior, have switchdev-implementing drivers
bail out on notifications about non-user-added FDB entries. In case of
mlxsw driver, allow a call to mlxsw_sp_span_respin() so that SPAN over
bridge catches up with the changed FDB.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Acked-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-03 13:46:47 -04:00
David S. Miller 0e913f28ba Merge branch 'mlxsw-Introduce-support-for-CQEv1-2'
Ido Schimmel says:

====================
mlxsw: Introduce support for CQEv1/2

Jiri says:

Current SwitchX2 and Spectrum FWs support CQEv0 and that is what we
implement in mlxsw. Spectrum FW also supports CQE v1 and v2.
However, Spectrum-2 won't support CQEv0. Prepare for it and setup the
CQE versions to use according to what is queried from FW.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-03 13:44:43 -04:00
Jiri Pirko 41107685b9 mlxsw: pci: Check number of CQEs for CQE version 2
Check number of CQEs for CQE version 2 reported by QUERY_AQ_CAP command.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-03 13:44:43 -04:00
Jiri Pirko 8404f6f2e8 mlxsw: pci: Allow to use CQEs of version 1 and version 2
Use previously added resources to query FW support for multiple versions
of CQEs. Use the biggest version supported. For SDQs, it has no sense to
use version 2 as it does not introduce any new features, but it is
twice the size of CQE version 1.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-03 13:44:43 -04:00
Jiri Pirko b76550bbed mlxsw: pci: Introduce helpers to work with multiple CQE versions
Introduce definitions of fields in CQE version 1 and 2. Also, introduce
common helpers that would call appropriate version-specific helpers
according to the version enum passed.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-03 13:44:42 -04:00
Jiri Pirko 9b934a3bdc mlxsw: resources: Add CQE versions resources
Add resources that FW uses to report supported CQE versions.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-03 13:44:42 -04:00
Nikolay Aleksandrov faa1cd8298 net: bridge: avoid duplicate notification on up/down/change netdev events
While handling netdevice events, br_device_event() sometimes uses
br_stp_(disable|enable)_port which unconditionally send a notification,
but then a second notification for the same event is sent at the end of
the br_device_event() function. To avoid sending duplicate notifications
in such cases, check if one has already been sent (i.e.
br_stp_enable/disable_port have been called).
The patch is based on a change by Satish Ashok.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-03 13:40:54 -04:00
David S. Miller 2e51855194 Merge branch 'selftests-forwarding-sysctl'
Petr Machata says:

====================
selftests: forwarding: Updates to sysctl handling

Some selftests need to adjust sysctl settings. In order to be neutral to
the system that the test is run on, it is a good practice to change back
to the original setting after the test ends. That involves some
boilerplate that can be abstracted away.

In patch #1, introduce two functions, sysctl_set() and sysctl_restore().
The former stores the current value of a given setting, and sets a new
value. The latter restores the setting to the previously-stored value.

In patch #2, use these wrappers in a number of tests.

Additionally in patch #3, fix a problem in mirror_gre_nh.sh, which
neglected to set a sysctl that's crucial for the test to work.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-03 13:37:03 -04:00
Petr Machata 7eaaf0bc52 selftests: forwarding: mirror_gre_nh: Unset RP filter
The test fails to work if reverse-path filtering is in effect on the
mirrored-to host interface, or for all interfaces.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-03 13:37:02 -04:00
Petr Machata d51d10aa1d selftests: forwarding: Use sysctl_set(), sysctl_restore()
Instead of hand-managing the sysctl set and restore, use the wrappers
sysctl_set() and sysctl_restore() to do the bookkeeping automatically.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-03 13:37:02 -04:00
Petr Machata f5ae57784b selftests: forwarding: lib: Add sysctl_set(), sysctl_restore()
Add two helper functions: sysctl_set() to change the value of a given
sysctl setting, and sysctl_restore() to change it back to what it was.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-03 13:37:02 -04:00
Colin Ian King 4e11581c27 net/mlx5e: fix spelling mistake: "loobpack" -> "loopback"
Trivial fix to spelling mistake in netdev_err error message

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-03 12:58:17 -04:00
David S. Miller 41f405460a Merge branch 'selftests-forwarding-Two-enhancements'
Ido Schimmel says:

====================
selftests: forwarding: Two enhancements

First patch increases the maximum deviation in the multipath tests which
proved to be too low in some cases.

Second patch allows user to run only specific tests from each file using
the TESTS environment variable. This granularity is needed in setups
where not all the tests can pass.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-03 12:54:32 -04:00
Ido Schimmel 0eb8053c14 selftests: forwarding: Allow running specific tests
Similar to commit a511858c75 ("selftests: fib_tests: Allow user to run
a specific test"), allow user to run only a subset of the tests using
the TESTS environment variable.

This is useful when not all the tests can pass on a given system.

Example:
# export TESTS="ping_ipv4 ping_ipv6"
# ./bridge_vlan_aware.sh
TEST: ping					[PASS]
TEST: ping6					[PASS]

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-03 12:54:31 -04:00
Ido Schimmel 9413248753 selftests: forwarding: Increase maximum deviation in multipath test
We sometimes observe failures in the test due to too large discrepancy
between the measured and expected ratios. For example:

TEST: ECMP                                                          [FAIL]
        Too large discrepancy between expected and measured ratios
        INFO: Expected ratio 1.00 Measured ratio 1.11

Fix this by allowing an up to 15% deviation between both ratios.

Another possibility is to increase the number of generated flows, but
this will prolong the execution time of the test, which is already quite
high.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-03 12:54:31 -04:00
Ganesh Goudar 5f110899ae cxgb4: update latest firmware version supported
Change t4fw_version.h to update latest firmware version
number to 1.19.1.0.

Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-03 12:52:32 -04:00
Linus Torvalds c15f6d8d47 Fix an incorrect warning selection introduces in the last merge window.
-----BEGIN PGP SIGNATURE-----
 
 iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAlrqtmcLHGhjaEBsc3Qu
 ZGUACgkQD55TZVIEUYMpdxAAvCIYzIDVItIUU/yRjDaek1adEhcIVOr8D+qr3F+v
 M0Ct2KGvd7zG2YwinMJ/O8SXurOaIpIJtmkRc1ZdAnGKNTevGOLo9F339vd+LmIq
 f6Jg8BcrysTPEVF20sGRlL8fF9osJqEubIgYrkf3G/u1E4uRmumgTnNyuyXZua0B
 aS0FGtDtGiUjF0VtXKiWisBQE1Uxz2nJzu53Zw9kLT5Wev68kTcUqzF/cNDFRqwT
 RQt9FkBcq7iYmsCcJ41zWqIbFZ1TnLPuJ0rE05pcymcOMdI18ysTUDC/q8hdaFHR
 686TyHVMC9B7FB3nAdvPBuyX5+KGXTTzQlyxf8/eLr6mm7PTW5VCNqWLD5JB9GXd
 VLVIwjq/4J2wkY3gW1MEBRWcKPl5rlTlUNBwhxC5MHzyfRjz90Fv9x8htLBBeDZ5
 YO0g7igWqkZpYve9Sc3W/HbcG+DEwR21OJIFlAEGL8z0ECVL8gmKh22fHPbuBn/K
 kOvMTdovYV/XgH8tvkHkSeGKiBvdGuCATrC2Th3YmrRmG6wsZD2R9rUsYXBL+Abe
 i7KsH+mzq/QAl3wi/pROWdxqY4IVA5OllMCxZ9EtAY1Tv4OQdMQd2WY20NgNor/l
 9oYNwsDfHsLDnplyxUQpayTjNRqwFKAH9Vz1k0d8adukzcze5aSOAoCCEsjpNe2Q
 0Ds=
 =jdZ7
 -----END PGP SIGNATURE-----

Merge tag 'dma-mapping-4.17-4' of git://git.infradead.org/users/hch/dma-mapping

Pull dma-mapping fix from Christoph Hellwig:
 "Fix an incorrect warning selection introduced in the last merge
  window"

* tag 'dma-mapping-4.17-4' of git://git.infradead.org/users/hch/dma-mapping:
  swiotlb: fix inversed DMA_ATTR_NO_WARN test
2018-05-03 06:27:39 -10:00
Eric Dumazet 114f39feab tcp: restore autocorking
When adding rb-tree for TCP retransmit queue, we inadvertently broke
TCP autocorking.

tcp_should_autocork() should really check if the rtx queue is not empty.

Tested:

Before the fix :
$ nstat -n;./netperf -H 10.246.7.152 -Cc -- -m 500;nstat | grep AutoCork
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.246.7.152 () port 0 AF_INET
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

540000 262144    500    10.00      2682.85   2.47     1.59     3.618   2.329
TcpExtTCPAutoCorking            33                 0.0

// Same test, but forcing TCP_NODELAY
$ nstat -n;./netperf -H 10.246.7.152 -Cc -- -D -m 500;nstat | grep AutoCork
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.246.7.152 () port 0 AF_INET : nodelay
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

540000 262144    500    10.00      1408.75   2.44     2.96     6.802   8.259
TcpExtTCPAutoCorking            1                  0.0

After the fix :
$ nstat -n;./netperf -H 10.246.7.152 -Cc -- -m 500;nstat | grep AutoCork
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.246.7.152 () port 0 AF_INET
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

540000 262144    500    10.00      5472.46   2.45     1.43     1.761   1.027
TcpExtTCPAutoCorking            361293             0.0

// With TCP_NODELAY option
$ nstat -n;./netperf -H 10.246.7.152 -Cc -- -D -m 500;nstat | grep AutoCork
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.246.7.152 () port 0 AF_INET : nodelay
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

540000 262144    500    10.00      5454.96   2.46     1.63     1.775   1.174
TcpExtTCPAutoCorking            315448             0.0

Fixes: 75c119afe1 ("tcp: implement rb-tree based retransmit queue")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Michael Wenig <mwenig@vmware.com>
Tested-by: Michael Wenig <mwenig@vmware.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Michael Wenig <mwenig@vmware.com>
Tested-by: Michael Wenig <mwenig@vmware.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-03 11:28:50 -04:00
Sun Lianwen 7ccbdff13e ip6_gre: correct the function name in ip6gre_tnl_addr_conflict() comment
The function name is wrong in ip6gre_tnl_addr_conflict() comment, which
use ip6_tnl_addr_conflict instead of ip6gre_tnl_addr_conflict.

Signed-off-by: Sun Lianwen <sunlw.fnst@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-03 11:28:24 -04:00
Eric Dumazet eb80ca476e rds: do not leak kernel memory to user land
syzbot/KMSAN reported an uninit-value in put_cmsg(), originating
from rds_cmsg_recv().

Simply clear the structure, since we have holes there, or since
rx_traces might be smaller than RDS_MSG_RX_DGRAM_TRACE_MAX.

BUG: KMSAN: uninit-value in copy_to_user include/linux/uaccess.h:184 [inline]
BUG: KMSAN: uninit-value in put_cmsg+0x600/0x870 net/core/scm.c:242
CPU: 0 PID: 4459 Comm: syz-executor582 Not tainted 4.16.0+ #87
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:17 [inline]
 dump_stack+0x185/0x1d0 lib/dump_stack.c:53
 kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067
 kmsan_internal_check_memory+0x135/0x1e0 mm/kmsan/kmsan.c:1157
 kmsan_copy_to_user+0x69/0x160 mm/kmsan/kmsan.c:1199
 copy_to_user include/linux/uaccess.h:184 [inline]
 put_cmsg+0x600/0x870 net/core/scm.c:242
 rds_cmsg_recv net/rds/recv.c:570 [inline]
 rds_recvmsg+0x2db5/0x3170 net/rds/recv.c:657
 sock_recvmsg_nosec net/socket.c:803 [inline]
 sock_recvmsg+0x1d0/0x230 net/socket.c:810
 ___sys_recvmsg+0x3fb/0x810 net/socket.c:2205
 __sys_recvmsg net/socket.c:2250 [inline]
 SYSC_recvmsg+0x298/0x3c0 net/socket.c:2262
 SyS_recvmsg+0x54/0x80 net/socket.c:2257
 do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
 entry_SYSCALL_64_after_hwframe+0x3d/0xa2

Fixes: 3289025aed ("RDS: add receive message trace used by application")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Cc: linux-rdma <linux-rdma@vger.kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-03 11:26:14 -04:00
Bjørn Mork 5697db4a69 qmi_wwan: do not steal interfaces from class drivers
The USB_DEVICE_INTERFACE_NUMBER matching macro assumes that
the { vendorid, productid, interfacenumber } set uniquely
identifies one specific function.  This has proven to fail
for some configurable devices. One example is the Quectel
EM06/EP06 where the same interface number can be either
QMI or MBIM, without the device ID changing either.

Fix by requiring the vendor-specific class for interface number
based matching.  Functions of other classes can and should use
class based matching instead.

Fixes: 03304bcb5e ("net: qmi_wwan: use fixed interface number matching")
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-03 11:25:03 -04:00
David S. Miller 3d8d38dfea Merge branch 'act_csum-get_fill_size'
Craig Dillabaugh says:

====================
Update csum tc action for batch operation.

This patchset includes two patches the first updating act_csum.c
to include the get_fill_size routine required for batch operation, and
the second including updated TDC tests for the feature.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-03 11:15:59 -04:00
Craig Dillabaugh 2c785b388f tc-testing: Updated csum action tests batch create w/wo cookies.
Signed-off-by: Craig Dillabaugh <cdillaba@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-03 11:15:58 -04:00
Craig Dillabaugh 29e6eee192 net sched: Implemented get_fill_size routine for act_csum.
Signed-off-by: Craig Dillabaugh <cdillaba@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-03 11:15:58 -04:00
Linus Torvalds f4ef6a438c Various fixes in tracing:
- Tracepoints should not give warning on OOM failures
 
  - Use special field for function pointer in trace event
 
  - Fix igrab issues in uprobes
 
  - Fixes to the new histogram triggers
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCWuoYdBQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qtFnAP9X4+AVDQH0VfsMLSc9D+rK6WmcRIhv
 q8J2gNPv3anM+AD/SFXWGO4ihN+0KDw/TqmJxESNEybq47vTZ/s5lM6A4gQ=
 =fQbj
 -----END PGP SIGNATURE-----

Merge tag 'trace-v4.17-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing fixes from Steven Rostedt:
 "Various fixes in tracing:

   - Tracepoints should not give warning on OOM failures

   - Use special field for function pointer in trace event

   - Fix igrab issues in uprobes

   - Fixes to the new histogram triggers"

* tag 'trace-v4.17-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracepoint: Do not warn on ENOMEM
  tracing: Add field modifier parsing hist error for hist triggers
  tracing: Add field parsing hist error for hist triggers
  tracing: Restore proper field flag printing when displaying triggers
  tracing: initcall: Ordered comparison of function pointers
  tracing: Remove igrab() iput() call from uprobes.c
  tracing: Fix bad use of igrab in trace_uprobe.c
2018-05-02 17:38:37 -10:00
Linus Torvalds ecd649b340 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input updates from Dmitry Torokhov:
 "Just a few driver fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: atmel_mxt_ts - add missing compatible strings to OF device table
  Input: atmel_mxt_ts - fix the firmware update
  Input: atmel_mxt_ts - add touchpad button mapping for Samsung Chromebook Pro
  MAINTAINERS: Rakesh Iyer can't be reached anymore
  Input: hideep_ts - fix a typo in Kconfig
  Input: alps - fix reporting pressure of v3 trackstick
  Input: leds - fix out of bound access
  Input: synaptics-rmi4 - fix an unchecked out of memory error path
2018-05-02 17:34:42 -10:00
Julian Anastasov 94720e3aee ipv4: fix fnhe usage by non-cached routes
Allow some non-cached routes to use non-expired fnhe:

1. ip_del_fnhe: moved above and now called by find_exception.
The 4.5+ commit deed49df73 expires fnhe only when caching
routes. Change that to:

1.1. use fnhe for non-cached local output routes, with the help
from (2)

1.2. allow __mkroute_input to detect expired fnhe (outdated
fnhe_gw, for example) when do_cache is false, eg. when itag!=0
for unicast destinations.

2. __mkroute_output: keep fi to allow local routes with orig_oif != 0
to use fnhe info even when the new route will not be cached into fnhe.
After commit 839da4d989 ("net: ipv4: set orig_oif based on fib
result for local traffic") it means all local routes will be affected
because they are not cached. This change is used to solve a PMTU
problem with IPVS (and probably Netfilter DNAT) setups that redirect
local clients from target local IP (local route to Virtual IP)
to new remote IP target, eg. IPVS TUN real server. Loopback has
64K MTU and we need to create fnhe on the local route that will
keep the reduced PMTU for the Virtual IP. Without this change
fnhe_pmtu is updated from ICMP but never exposed to non-cached
local routes. This includes routes with flowi4_oif!=0 for 4.6+ and
with flowi4_oif=any for 4.14+).

3. update_or_create_fnhe: make sure fnhe_expires is not 0 for
new entries

Fixes: 839da4d989 ("net: ipv4: set orig_oif based on fib result for local traffic")
Fixes: d6d5e999e5 ("route: do not cache fib route info on local routes with oif")
Fixes: deed49df73 ("route: check and remove route cache when we get route")
Cc: David Ahern <dsahern@gmail.com>
Cc: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-02 22:52:35 -04:00
Linus Torvalds 3b6f979319 SCSI fixes on 20180502
Three small bug fixes: an illegally overlapping memcmp in target code,
 a potential infinite loop in isci under certain rare phy conditions
 and an ATA queue depth (performance) correction for storvsc.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCWunNPiYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishdyJAQDGWGGN
 i5BJOr1c8BNEMHHuJIYvpgNJDLQd5tfqxieZgAD/e1RQn1nIjxnoIrduOAP+so8u
 XvGtP79yTO2yxz7VP48=
 =HASg
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Three small bug fixes: an illegally overlapping memcmp in target code,
  a potential infinite loop in isci under certain rare phy conditions
  and an ATA queue depth (performance) correction for storvsc"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: target: Fix fortify_panic kernel exception
  scsi: isci: Fix infinite loop in while loop
  scsi: storvsc: Set up correct queue depth values for IDE devices
2018-05-02 16:38:17 -10:00
David S. Miller e002434e88 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2018-05-03

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) Several BPF sockmap fixes mostly related to bugs in error path
   handling, that is, a bug in updating the scatterlist length /
   offset accounting, a missing sk_mem_uncharge() in redirect
   error handling, and a bug where the outstanding bytes counter
   sg_size was not zeroed, from John.

2) Fix two memory leaks in the x86-64 BPF JIT, one in an error
   path where we still don't converge after image was allocated
   and another one where BPF calls are used and JIT passes don't
   converge, from Daniel.

3) Minor fix in BPF selftests where in test_stacktrace_build_id()
   we drop useless args in urandom_read and we need to add a missing
   newline in a CHECK() error message, from Song.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-02 20:42:44 -04:00
Alexei Starovoitov b5b6ff7302 Merge branch 'bpf-sockmap-fixes'
John Fastabend says:

====================
When I added the test_sockmap to selftests I mistakenly changed the
test logic a bit. The result of this was on redirect cases we ended up
choosing the wrong sock from the BPF program and ended up sending to a
socket that had no receive handler. The result was the actual receive
handler, running on a different socket, is timing out and closing the
socket. This results in errors (-EPIPE to be specific) on the sending
side. Typically happening if the sender does not complete the send
before the receive side times out. So depending on timing and the size
of the send we may get errors. This exposed some bugs in the sockmap
error path handling.

This series fixes the errors. The primary issue is we did not do proper
memory accounting in these cases which resulted in missing a
sk_mem_uncharge(). This happened in the redirect path and in one case
on the normal send path. See the three patches for the details.

The other take-away from this is we need to fix the test_sockmap and
also add more negative test cases. That will happen in bpf-next.

Finally, I tested this using the existing test_sockmap program, the
older sockmap sample test script, and a few real use cases with
Cilium. All of these seem to be in working correctly.

v2: fix compiler warning, drop iterator variable 'i' that is no longer
    used in patch 3.
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-05-02 15:30:46 -07:00
John Fastabend abaeb096ca bpf: sockmap, fix error handling in redirect failures
When a redirect failure happens we release the buffers in-flight
without calling a sk_mem_uncharge(), the uncharge is called before
dropping the sock lock for the redirecte, however we missed updating
the ring start index. When no apply actions are in progress this
is OK because we uncharge the entire buffer before the redirect.
But, when we have apply logic running its possible that only a
portion of the buffer is being redirected. In this case we only
do memory accounting for the buffer slice being redirected and
expect to be able to loop over the BPF program again and/or if
a sock is closed uncharge the memory at sock destruct time.

With an invalid start index however the program logic looks at
the start pointer index, checks the length, and when seeing the
length is zero (from the initial release and failure to update
the pointer) aborts without uncharging/releasing the remaining
memory.

The fix for this is simply to update the start index. To avoid
fixing this error in two locations we do a small refactor and
remove one case where it is open-coded. Then fix it in the
single function.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-05-02 15:30:45 -07:00
John Fastabend fec51d40ea bpf: sockmap, zero sg_size on error when buffer is released
When an error occurs during a redirect we have two cases that need
to be handled (i) we have a cork'ed buffer (ii) we have a normal
sendmsg buffer.

In the cork'ed buffer case we don't currently support recovering from
errors in a redirect action. So the buffer is released and the error
should _not_ be pushed back to the caller of sendmsg/sendpage. The
rationale here is the user will get an error that relates to old
data that may have been sent by some arbitrary thread on that sock.
Instead we simple consume the data and tell the user that the data
has been consumed. We may add proper error recovery in the future.
However, this patch fixes a bug where the bytes outstanding counter
sg_size was not zeroed. This could result in a case where if the user
has both a cork'ed action and apply action in progress we may
incorrectly call into the BPF program when the user expected an
old verdict to be applied via the apply action. I don't have a use
case where using apply and cork at the same time is valid but we
never explicitly reject it because it should work fine. This patch
ensures the sg_size is zeroed so we don't have this case.

In the normal sendmsg buffer case (no cork data) we also do not
zero sg_size. Again this can confuse the apply logic when the logic
calls into the BPF program when the BPF programmer expected the old
verdict to remain. So ensure we set sg_size to zero here as well. And
additionally to keep the psock state in-sync with the sk_msg_buff
release all the memory as well. Previously we did this before
returning to the user but this left a gap where psock and sk_msg_buff
states were out of sync which seems fragile. No additional overhead
is taken here except for a call to check the length and realize its
already been freed. This is in the error path as well so in my
opinion lets have robust code over optimized error paths.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-05-02 15:30:45 -07:00