635481 commits

Author SHA1 Message Date
Matthew Wilcox 53855d10f4 radix tree test suite: fix compilation
Patch "lib/radix-tree: Convert to hotplug state machine" breaks the test
suite as it adds a call to cpuhp_setup_state_nocalls() which is not
currently emulated in the test suite.  Add it, and delete the emulation
of the old CPU hotplug mechanism.

Link: http://lkml.kernel.org/r/1480369871-5271-36-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-07 17:10:00 -08:00
Sergey Senozhatsky 5c7e9ccd91 zram: restrict add/remove attributes to root only
The zram hot_add sysfs attribute is a very 'special' attribute - reading
from it creates a new uninitialized zram device.  This file, by
mistake, can be read by a 'normal' user at the moment, while only root
must be able to create a new zram device, therefore the hot_add attribute
must have S_IRUSR mode, not S_IRUGO.

[akpm@linux-foundation.org: s/sence/sense/, reflow comment to use 80 cols]
Fixes: 6566d1a32b ("zram: add dynamic device add/remove functionality")
Link: http://lkml.kernel.org/r/20161205155845.20129-1-sergey.senozhatsky@gmail.com
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reported-by: Steven Allen <steven@stebalien.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: <stable@vger.kernel.org>    [4.2+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-07 17:10:00 -08:00
Linus Torvalds ea5a9eff96 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "Misc fixes: a core dumping crash fix, a guess-unwinder regression fix,
  plus three build warning fixes"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/unwind: Fix guess-unwinder regression
  x86/build: Annotate die() with noreturn to fix build warning on clang
  x86/platform/olpc: Fix resume handler build warning
  x86/apic/uv: Silence a shift wrapping warning
  x86/coredump: Always use user_regs_struct for compat_elf_gregset_t
2016-12-07 11:39:27 -08:00
Linus Torvalds 68f5503bdc Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fix from Ingo Molnar:
 "An autogroup nice level adjustment bug fix"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/autogroup: Fix 64-bit kernel nice level adjustment
2016-12-07 11:35:55 -08:00
Linus Torvalds bf7f1c7e2f Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "A bogus warning fix, a counter width handling fix affecting certain
  machines, plus a oneliner hw-enablement patch for Knights Mill CPUs"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/core: Remove invalid warning from list_update_cgroup_event()
  perf/x86: Fix full width counter, counter overflow
  perf/x86/intel: Enable C-state residency events for Knights Mill
2016-12-07 11:32:19 -08:00
Linus Torvalds 5b43f97f3f Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixes from Ingo Molnar:
 "Two rtmutex race fixes (which miraculously never triggered, that we
  know of), plus two lockdep printk formatting regression fixes"

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  lockdep: Fix report formatting
  locking/rtmutex: Use READ_ONCE() in rt_mutex_owner()
  locking/rtmutex: Prevent dequeue vs. unlock race
  locking/selftest: Fix output since KERN_CONT changes
2016-12-07 11:27:33 -08:00
Linus Torvalds 407cf05d46 Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull objtool fix from Ingo Molnar:
 "A single late breaking fix for objtool"

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  objtool: Fix bytes check of lea's rex_prefix
2016-12-07 10:56:00 -08:00
Linus Torvalds ce779d6b5b Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
Pull fuse fix from Miklos Szeredi:
 "Fix a regression spotted by Jeff Layton"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  fuse: fix clearing suid, sgid for chown()
2016-12-07 08:45:23 -08:00
Linus Torvalds f27c2f69cc Revert "default exported asm symbols to zero"
This reverts commit 8ab2ae655b.

I loved that commit because of how it explained what the problem with
newer versions of binutils was, but the actual patch itself turns out
to not work very well.

It has two problems:

 - a zero CRC value isn't actually right.  It happens to work for the
   case where both sides of the equation fail at giving the symbol a
   crc, but there are cases where the users of the exported symbol get
   the right crc (due to seeing the C declarations), but the actual
   exporting itself does not (due to the whole weak asm symbol issue).

   So then the module load fails after all - we did have a crc for the
   symbol, but we couldn't match it with the loaded module.

 - it seems that the alpha assembler has special semantics for the
   '.set' directive, and on alpha it doesn't actually set the value of
   the specified symbol at all, it is instead used to set various
   assembly modes (eg ".set noat" and ".set noreorder").

   So using ".set" to set the symbol value would just cause build
   failures on alpha.

I'm sure we'll find some other workaround for these issues (hopefully
that involves getting rid of modversions entirely some day, but people
are also talking about just using smarter tools).  But for now we'll
just fall back on commit faaae2a581 ("Re-enable CONFIG_MODVERSIONS in
a slightly weaker form") that just lets a missing crc through.

Reported-by: Jan Stancek <jstancek@redhat.com>
Reported-by: Philip Müller <philm@manjaro.org>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-07 08:39:00 -08:00
Linus Torvalds a0ac402cfc Don't feed anything but regular iovec's to blk_rq_map_user_iov
In theory we could map other things, but there's a reason that function
is called "user_iov".  Using anything else (like splice can do) just
confuses it.

Reported-and-tested-by: Johannes Thumshirn <jthumshirn@suse.de>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-07 08:23:35 -08:00
Linus Torvalds bc3913a537 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc
Pull sparc fix from David Miller:
 "A use-before-NULL-check from Dan Carpenter"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  dbri: move dereference after check for NULL
2016-12-06 09:24:11 -08:00
Dan Carpenter 163117e8d4 dbri: move dereference after check for NULL
We accidentally introduced a dereference before the NULL check in
xmit_descs() as part of silencing a GCC warning.
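
The ordering issue can be sketched as follows. The struct and function
names are hypothetical stand-ins, not the dbri driver's real code; the
point is only that the field is read after the NULL check, never in an
initializer placed above it.

```c
#include <stddef.h>

/* Hypothetical stand-in for per-stream state; not the driver's real type. */
struct stream_info {
	int first_desc;
};

/* Returns the first descriptor index, or -1 when there is no stream.
 * Reading info->first_desc in a declaration above the check would
 * dereference the pointer before it has been validated. */
int first_desc_or_error(const struct stream_info *info)
{
	int first;

	if (info == NULL)
		return -1;

	first = info->first_desc;
	return first;
}
```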

Fixes: 16f46050e7 ("dbri: Fix compiler warning")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-06 12:18:22 -05:00
Linus Torvalds da1b466fa4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) When dcbnl_cee_fill() fails to push a new netlink attribute, it
    returns 0 instead of an error code. From Pan Bian.

 2) Two suffix handling fixes to FIB trie code, from Alexander Duyck.

 3) bnxt_hwrm_stat_ctx_alloc() goes through all the trouble of setting
    and maintaining a return code 'rc' but fails to actually return it.
    Also from Pan Bian.

 4) ping socket ICMP handler needs to validate ICMP header length, from
    Kees Cook.

 5) caif_sktinit_module() has this interesting logic:

        int err = sock_register(...);
        if (!err)
                return err;
        return 0;

    Just return sock_register()'s return value directly which is the
    only possible correct thing to do.

 6) Two bnx2x driver fixes from Yuval Mintz, return a reasonable
    estimate from get_ringparam() ethtool op when interface is down and
    avoid trying to use UDP port based tunneling on 577xx chips.

 7) Fix ep93xx_eth crash on module unload from Florian Fainelli.

 8) Missing uapi exports, from Stephen Hemminger.

 9) Don't schedule work from sk_destruct(), because the socket will be
    freed upon return from that function. From Herbert Xu.

10) Buggy drivers, of which we know there is at least one, can send a
    huge packet into the TCP stack but forget to set the gso_size in the
    SKB, which causes all kinds of problems.

    Correct this when it happens, and emit a one-time warning with the
    device name included so that it can be diagnosed more easily.

    From Marcelo Ricardo Leitner.

11) virtio-net doing DMA off the stack causes hiccups with VMAP_STACK,
    fix from Andy Lutomirski.

12) Fix fec driver compilation with CONFIG_M5272, from Nikita
    Yushchenko.

13) mlx5 fixes from Kamal Heib, Saeed Mahameed, and Mohamad Haj Yahia.
    (erroneously flushing queues on error, module parameter validation,
    etc)
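
The logic quoted in item 5 can be checked with a small sketch.
fake_register() is a hypothetical stand-in for sock_register(), assumed
only to return 0 on success and a negative value on failure.

```c
/* Hypothetical stand-in for sock_register(): returns 0 on success or a
 * negative errno-style value on failure. */
int fake_register(int result)
{
	return result;
}

/* The quoted logic: both paths end up returning 0, so a failure is lost. */
int init_buggy(int result)
{
	int err = fake_register(result);

	if (!err)
		return err;
	return 0;
}

/* The fix described in item 5: propagate the return value directly. */
int init_fixed(int result)
{
	return fake_register(result);
}
```

Calling init_buggy(-16) yields 0, while init_fixed(-16) hands the error
back to the caller.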

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (34 commits)
  net/mlx5e: Change the SQ/RQ operational state to positive logic
  net/mlx5e: Don't flush SQ on error
  net/mlx5e: Don't notify HW when filling the edge of ICO SQ
  net/mlx5: Fix query ISSI flow
  net/mlx5: Remove duplicate pci dev name print
  net/mlx5: Verify module parameters
  net: fec: fix compile with CONFIG_M5272
  be2net: Add DEVSEC privilege to SET_HSW_CONFIG command.
  virtio-net: Fix DMA-from-the-stack in virtnet_set_mac_address()
  tcp: warn on bogus MSS and try to amend it
  uapi glibc compat: fix outer guard of net device flags enum
  net: stmmac: clear reset value of snps, wr_osr_lmt/snps, rd_osr_lmt before writing
  netlink: Do not schedule work from sk_destruct
  uapi: export nf_log.h
  uapi: export tc_skbmod.h
  net: ep93xx_eth: Do not crash unloading module
  bnx2x: Prevent tunnel config for 577xx
  bnx2x: Correct ringparam estimate when DOWN
  isdn: hisax: set error code on failure
  net: bnx2x: fix improper return value
  ...
2016-12-06 09:06:51 -08:00
Linus Torvalds 10d20bd25e shmem: fix shm fallocate() list corruption
The shmem hole punching with fallocate(FALLOC_FL_PUNCH_HOLE) does not
want to race with generating new pages by faulting them in.

However, the wait-queue used to delay the page faulting has a serious
problem: the wait queue head (in shmem_fallocate()) is allocated on the
stack, and the code expects that "wake_up_all()" will make sure that all
the queue entries are gone before the stack frame is de-allocated.

And that is not at all necessarily the case.

Yes, a normal wake-up sequence will remove the wait-queue entry that
caused the wakeup (see "autoremove_wake_function()"), but the key
wording there is "that caused the wakeup".  When there are multiple
possible wakeup sources, the wait queue entry may well stay around.

And _particularly_ in a page fault path, we may be faulting in new pages
from user space while we also have other things going on, and there may
well be other pending wakeups.

So despite the "wake_up_all()", it's not at all guaranteed that all list
entries are removed from the wait queue head on the stack.

Fix this by introducing a new wakeup function that removes the list
entry unconditionally, even if the target process had already woken up
for other reasons.  Use that "synchronous" function to set up the
waiters in shmem_fault().

This problem has never been seen in the wild afaik, but Dave Jones has
reported it on and off while running trinity.  We thought we fixed the
stack corruption with the blk-mq rq_list locking fix (commit
7fe311302f: "blk-mq: update hardware and software queues for sleeping
alloc"), but it turns out there was _another_ stack corruptor hiding
in the trinity runs.

Vegard Nossum (also running trinity) was able to trigger this one fairly
consistently, and made us look once again at the shmem code due to the
faults often being in that area.

Reported-and-tested-by: Vegard Nossum <vegard.nossum@oracle.com>
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-06 08:59:05 -08:00
David S. Miller 32f16e142d Merge branch 'mlx5-fixes'
Saeed Mahameed says:

====================
Mellanox 100G mlx5 fixes 2016-12-04

Some bug fixes for mlx5 core and mlx5e driver.

v1->v2:
 - replace "uint" with "unsigned int"
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-06 11:44:45 -05:00
Mohamad Haj Yahia c0f1147d14 net/mlx5e: Change the SQ/RQ operational state to positive logic
When using the negative logic (i.e. FLUSH state), after the RQ/SQ reopen
there is a time interval in which the RQ/SQ is not really ready, yet the
state indicates that it's not in FLUSH state because the initial SQ/RQ struct
memory starts as zeros.
Now the state indicates whether the SQ/RQ is opened, and we
set the READY state only after all the SQ/RQ resources have been prepared.
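
A minimal sketch of why zero-initialized state is a problem for negative
logic. The struct, bit values and helper names are hypothetical and do not
mirror the mlx5e driver's real definitions.

```c
#include <stdbool.h>

/* Hypothetical queue state word. */
struct queue {
	unsigned long state;
};

#define QUEUE_FLUSH (1UL << 0)	/* negative logic: set while unusable */
#define QUEUE_READY (1UL << 1)	/* positive logic: set once fully set up */

/* Negative logic: "usable unless FLUSH is set". */
bool usable_negative_logic(const struct queue *q)
{
	return !(q->state & QUEUE_FLUSH);
}

/* Positive logic: "usable only once READY is set". */
bool usable_positive_logic(const struct queue *q)
{
	return (q->state & QUEUE_READY) != 0;
}

/* Called after all resources have been prepared. */
void queue_mark_ready(struct queue *q)
{
	q->state |= QUEUE_READY;
}
```

A freshly zeroed struct queue looks usable under the negative-logic test
even though nothing has been set up yet; under the positive-logic test it
stays unusable until queue_mark_ready() runs.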

Fixes: 6e8dd6d6f4 ("net/mlx5e: Don't wait for SQ completions on close")
Fixes: f2fde18c52 ("net/mlx5e: Don't wait for RQ completions on close")
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-06 11:44:45 -05:00
Saeed Mahameed 3c8591d593 net/mlx5e: Don't flush SQ on error
We are doing SQ descriptors cleanup in driver.

Fixes: 6e8dd6d6f4 ("net/mlx5e: Don't wait for SQ completions on close")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-06 11:44:44 -05:00
Saeed Mahameed b8335d91b4 net/mlx5e: Don't notify HW when filling the edge of ICO SQ
We are going to do this a couple of steps ahead anyway.

Fixes: d3c9bc2743 ("net/mlx5e: Added ICO SQs")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-06 11:44:44 -05:00
Kamal Heib f9c14e4674 net/mlx5: Fix query ISSI flow
In old FWs query ISSI command is not supported and for some of those FWs
it might fail with status other than "MLX5_CMD_STAT_BAD_OP_ERR".

In such case instead of failing the driver load, we will treat any FW
status other than 0 for Query ISSI FW command as ISSI not supported and
assume ISSI=0 (most basic driver/FW interface).

In case of driver syndrome (query ISSI failure by driver) we will fail
driver load.

Fixes: f62b8bb8f2 ('net/mlx5: Extend mlx5_core to support ConnectX-4 Ethernet functionality')
Signed-off-by: Kamal Heib <kamalh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-06 11:44:44 -05:00
Kamal Heib 9e5b2fc1d3 net/mlx5: Remove duplicate pci dev name print
Remove duplicate pci dev name printing from mlx5_core_warn/dbg.

Fixes: 5a7883989b ('net/mlx5_core: Improve mlx5 messages')
Signed-off-by: Kamal Heib <kamalh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-06 11:44:43 -05:00
Kamal Heib f663ad9862 net/mlx5: Verify module parameters
Verify the mlx5_core module parameters by making sure that they are in
the expected range and if they aren't restore them to their default
values.

Fixes: 9603b61de1 ('mlx5: Move pci device handling from mlx5_ib to mlx5_core')
Signed-off-by: Kamal Heib <kamalh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-06 11:44:43 -05:00
Nikita Yushchenko f85de66663 net: fec: fix compile with CONFIG_M5272
Commit 80cca775cd ("net: fec: cache statistics while device is down")
introduced unconditional statistics-related actions.

However, when the driver is compiled with CONFIG_M5272, statistics-related
definitions do not exist, which results in build errors.

Fix that by adding explicit handling of !defined(CONFIG_M5272) case.

Fixes: 80cca775cd ("net: fec: cache statistics while device is down")
Signed-off-by: Nikita Yushchenko <nikita.yoush@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-06 11:40:15 -05:00
Venkat Duvvuru d14584d919 be2net: Add DEVSEC privilege to SET_HSW_CONFIG command.
OPCODE_COMMON_GET_FN_PRIVILEGES is returning only DEVSEC
privilege (Unrestricted Administrative Privilege) for Lancer NIC functions.
So, driver is failing SET_HSW_CONFIG command, as DEVSEC privilege was not
set in the privilege bitmap. This patch fixes the problem by setting DEVSEC
privilege in SET_HSW_CONFIG’s privilege bitmap.

Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Signed-off-by: Suresh Reddy <suresh.reddy@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-06 11:39:41 -05:00
Andy Lutomirski e37e2ff350 virtio-net: Fix DMA-from-the-stack in virtnet_set_mac_address()
With CONFIG_VMAP_STACK=y, virtnet_set_mac_address() can be passed a
pointer to the stack and it will OOPS.  Copy the address to the heap
to prevent the crash.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Laura Abbott <labbott@redhat.com>
Reported-by: zbyszek@in.waw.pl
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-06 11:38:43 -05:00
Marcelo Ricardo Leitner dcb17d22e1 tcp: warn on bogus MSS and try to amend it
There have been some reports lately about TCP connection stalls caused
by NIC drivers that aren't setting gso_size on aggregated packets on rx
path. This causes TCP to assume that the MSS is actually the size of the
aggregated packet, which is invalid.

Although the proper fix is to be done in each driver, it's often hard
and cumbersome to debug, arrive at the root cause and report or fix
it.

This patch amends this situation in two ways. First, it adds a warning
when this situation occurs, so it gives a hint to those trying to
debug this. It also limits the maximum probed MSS to the advertised MSS,
as it should never be any higher than that.
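
The two measures can be sketched in userspace C. This is an illustration
under assumptions, not the patch itself: clamp_probed_mss() and its
parameters are hypothetical, and the one-time warning is emulated with a
static flag and fprintf().

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: never let the probed segment size exceed the
 * advertised MSS, and warn once (with the device name) when it would. */
unsigned int clamp_probed_mss(unsigned int probed_len,
			      unsigned int advertised_mss,
			      const char *dev_name)
{
	static bool warned;

	if (probed_len <= advertised_mss)
		return probed_len;

	if (!warned) {
		warned = true;
		fprintf(stderr, "%s: bogus segment size %u, clamping to %u\n",
			dev_name ? dev_name : "unknown",
			probed_len, advertised_mss);
	}
	return advertised_mss;
}
```

An aggregated 65000-byte packet with no gso_size would be clamped to an
advertised MSS of 1460 here, while an ordinary 536-byte segment passes
through unchanged.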

The result is that the connection may not have the best performance ever
but it shouldn't stall, and the admin will have a hint on what to look
for.

Tested with virtio by forcing gso_size to 0.

v2: updated msg per David's suggestion
v3: use skb_iif to find the interface and also log its name, per Eric
    Dumazet's suggestion. As the skb may be backlogged and the interface
    gone by then, we need to check if the number still has a meaning.
v4: use helper tcp_gro_dev_warn() and avoid pr_warn_once inside __once, per
    David's suggestion

Cc: Jonathan Maxwell <jmaxwell37@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-06 11:01:19 -05:00
Jonas Gorski efc4515482 uapi glibc compat: fix outer guard of net device flags enum
Fix a wrong condition preventing the higher net device flags
IFF_LOWER_UP etc. from being defined if net/if.h is included before
linux/if.h.

The comment makes it clear the intention was to allow partial
definition with either part.

This fixes compilation of userspace programs trying to use
IFF_LOWER_UP, IFF_DORMANT or IFF_ECHO.

Fixes: 4a91cb61bb ("uapi glibc compat: fix compile errors when glibc net/if.h included before linux/if.h")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Reviewed-by: Mikko Rapeli <mikko.rapeli@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-06 10:43:32 -05:00
Niklas Cassel 6b3374cb1c net: stmmac: clear reset value of snps,wr_osr_lmt/snps,rd_osr_lmt before writing
WR_OSR_LMT and RD_OSR_LMT have a reset value of 1.
Since the reset value wasn't cleared before writing, the value in the
register would be incorrect if specifying an uneven value for
snps,wr_osr_lmt/snps,rd_osr_lmt.

Zero is a valid value for the properties, since the databook specifies:
maximum outstanding requests = WR_OSR_LMT + 1.

We do not want to change the behavior for existing users when the
property is missing. Therefore, default to 1 if the property is missing,
since that is the same as the reset value.
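
A sketch of the clear-before-write pattern, assuming the field was
previously updated with a plain bitwise OR. The shift, mask and function
names are hypothetical and do not reflect the real register layout.

```c
#include <stdint.h>

/* Hypothetical field layout for illustration only. */
#define OSR_LMT_SHIFT 16
#define OSR_LMT_MASK  (UINT32_C(0xf) << OSR_LMT_SHIFT)

/* OR-ing into a field that still holds its reset value keeps stale bits. */
uint32_t set_lmt_without_clear(uint32_t reg, uint32_t lmt)
{
	return reg | ((lmt << OSR_LMT_SHIFT) & OSR_LMT_MASK);
}

/* Clear the field first, then write the requested value. */
uint32_t set_lmt_with_clear(uint32_t reg, uint32_t lmt)
{
	reg &= ~OSR_LMT_MASK;
	return reg | ((lmt << OSR_LMT_SHIFT) & OSR_LMT_MASK);
}

/* Read the field back out of the register value. */
uint32_t get_lmt(uint32_t reg)
{
	return (reg & OSR_LMT_MASK) >> OSR_LMT_SHIFT;
}
```

Starting from a field that holds the reset value 1, writing 2 without
clearing reads back as 3 in this sketch, while the clearing variant reads
back as 2; writing 0 is also only possible with the clearing variant.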

Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-06 10:33:48 -05:00
Miklos Szeredi c01638f5d9 fuse: fix clearing suid, sgid for chown()
Basically, the pjdfstests set the mode of a file to 06555, and then
chown it (as root) to a new uid/gid. Prior to commit a09f99edde ("fuse:
fix killing s[ug]id in setattr"), fuse would send down a setattr with both
the uid/gid change and a new mode.  Now, it just sends down the uid/gid
change.

Technically this is NOTABUG, since POSIX doesn't _require_ that we clear
these bits for a privileged process, but Linux (wisely) has done that and I
think we don't want to change that behavior here.

This is caused by the use of should_remove_suid(), which will always return
0 when the process has CAP_FSETID.

In fact we really don't need to be calling should_remove_suid() at all,
since we've already been told that we should remove the suid; we just
don't want to use a (very) stale mode for that.

This patch should fix the above as well as simplify the logic.

Reported-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: a09f99edde ("fuse: fix killing s[ug]id in setattr")
Cc: <stable@vger.kernel.org>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
2016-12-06 16:18:45 +01:00
Dmitry Vyukov f943fe0faf lockdep: Fix report formatting
Since commit:

  4bcc595ccd ("printk: reinstate KERN_CONT for printing continuation lines")

printk() requires KERN_CONT to continue log messages. Lots of printk()
calls in lockdep.c and print_ip_sym() don't have it. As a result lockdep
reports are completely messed up.

Add missing KERN_CONT and inline print_ip_sym() where necessary.

Example of a messed up report:

  0-rc5+ #41 Not tainted
  -------------------------------------------------------
  syz-executor0/5036 is trying to acquire lock:
   (
  rtnl_mutex
  ){+.+.+.}
  , at:
  [<ffffffff86b3d6ac>] rtnl_lock+0x1c/0x20
  but task is already holding lock:
   (
  &net->packet.sklist_lock
  ){+.+...}
  , at:
  [<ffffffff873541a6>] packet_diag_dump+0x1a6/0x1920
  which lock already depends on the new lock.
  the existing dependency chain (in reverse order) is:
  -> #3
   (
  &net->packet.sklist_lock
  +.+...}
  ...

Without this patch all scripts that parse kernel bug reports are broken.

Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: andreyknvl@google.com
Cc: aryabinin@virtuozzo.com
Cc: joe@perches.com
Cc: syzkaller@googlegroups.com
Link: http://lkml.kernel.org/r/1480343083-48731-1-git-send-email-dvyukov@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-12-06 10:40:08 +01:00
David Carrillo-Cisneros 8fc31ce889 perf/core: Remove invalid warning from list_update_cgroup_event()
The warning introduced in commit:

  864c2357ca ("perf/core: Do not set cpuctx->cgrp for unscheduled cgroups")

assumed that a cgroup switch always precedes list_del_event. This is
not the case. Remove warning.

Make sure that cpuctx->cgrp is NULL until a cgroup event is sched in
or ctx->nr_cgroups == 0.

Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi V Shankar <ravi.v.shankar@intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1480841177-27299-1-git-send-email-davidcc@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-12-06 09:44:29 +01:00
Peter Zijlstra (Intel) 7f612a7f0b perf/x86: Fix full width counter, counter overflow
Lukasz reported that perf stat counters overflow handling is broken on KNL/SLM.

Both these parts have full_width_write set, and that does indeed have
a problem. In order to deal with counter wrap, we must sample the
counter at at least half the counter period (see also the sampling
theorem) such that we can unambiguously reconstruct the count.

However commit:

  069e0c3c40 ("perf/x86/intel: Support full width counting")

sets the sampling interval to the full period, not half.

Fixing that exposes another issue, in that we must not sign extend the
delta value when we shift it right; the counter cannot have
decremented after all.

With both these issues fixed, counter overflow functions correctly
again.
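
Both points can be sketched in plain C. The helper names are hypothetical
and the code is an illustration of the arithmetic, not the perf
implementation: the delta is computed at the counter's width and shifted
back as an unsigned value, and the largest sampling period is kept to half
the counter range.

```c
#include <stdint.h>

/* Delta between two raw readings of a counter that is `width` bits wide
 * (1..64). Shifting both readings to the top of a 64-bit word makes the
 * subtraction wrap at the counter width. The result is shifted back as an
 * unsigned value, so a large delta is not sign extended into a negative
 * number. */
uint64_t counter_delta(uint64_t prev_raw, uint64_t new_raw, unsigned int width)
{
	unsigned int shift = 64 - width;
	uint64_t delta = (new_raw << shift) - (prev_raw << shift);

	return delta >> shift;
}

/* Largest period that still lets a wrap be reconstructed unambiguously:
 * half of the counter range. */
uint64_t max_sample_period(unsigned int width)
{
	uint64_t mask = (width == 64) ? UINT64_MAX
				      : ((UINT64_C(1) << width) - 1);

	return mask >> 1;
}
```

For a 48-bit counter that wraps from 0xFFFFFFFFFFF0 to 0x10, the delta is
0x20, and a delta with the top counter bit set stays a large positive
number.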

Reported-by: Lukasz Odzioba <lukasz.odzioba@intel.com>
Tested-by: Liang, Kan <kan.liang@intel.com>
Tested-by: Odzioba, Lukasz <lukasz.odzioba@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: stable@vger.kernel.org
Fixes: 069e0c3c40 ("perf/x86/intel: Support full width counting")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-12-06 09:44:28 +01:00
Piotr Luc 1dba23b12f perf/x86/intel: Enable C-state residency events for Knights Mill
Knights Mill is close enough to Knights Landing that the patch reuses
the C-state residency support of the latter.

Signed-off-by: Piotr Luc <piotr.luc@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/20161201000853.18260-1-piotr.luc@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-12-06 09:44:27 +01:00
Jiri Slaby 69042bf200 objtool: Fix bytes check of lea's rex_prefix
For the "lea %(rsp), %rbp" case, we check if there is a rex_prefix.
But we check 'bytes' which is insn_byte_t[4] in rex_prefix (insn_field
structure). Therefore, the check is always true.

Instead, check 'nbytes' which is the right one.
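
The distinction can be sketched with a simplified, hypothetical mirror of
the field described above (a small byte array plus a byte count); it is not
the real insn_field definition.

```c
#include <stdbool.h>

/* Hypothetical, simplified mirror of the field: the raw prefix bytes and
 * the number of bytes that were actually decoded. */
struct insn_field_sketch {
	unsigned char bytes[4];
	unsigned char nbytes;
};

/* An array member used as a condition decays to a non-NULL pointer, so a
 * test on `bytes` can never be false. The count is what tells whether a
 * prefix is present. */
bool has_prefix(const struct insn_field_sketch *field)
{
	return field->nbytes != 0;
}
```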

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20161205105551.25917-1-jslaby@suse.cz
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-12-06 09:20:59 +01:00
Herbert Xu ed5d7788a9 netlink: Do not schedule work from sk_destruct
It is wrong to schedule a work from sk_destruct using the socket
as the memory reserve because the socket will be freed immediately
after the return from sk_destruct.

Instead we should do the deferral prior to sk_free.

This patch does just that.

Fixes: 707693c8a4 ("netlink: Call cb->done from a worker thread")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-05 19:43:42 -05:00
stephen hemminger ffe3bb85c1 uapi: export nf_log.h
File is in uapi directory but not being copied on
 make install_headers

Fixes commit 4ec9c8fbbc22 ("netfilter: nft_log: complete
NFTA_LOG_FLAGS attr support").

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-05 19:43:02 -05:00
stephen hemminger ad55885829 uapi: export tc_skbmod.h
Fixes commit 735cffe5d800 ("net_sched: Introduce skbmod action")
Not used by iproute2 but maybe in future.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-05 19:43:02 -05:00
Florian Fainelli c823abac17 net: ep93xx_eth: Do not crash unloading module
When we unload the ep93xx_eth, whether we have opened the network
interface or not, we will either hit a kernel paging request error, or a
simple NULL pointer de-reference because:

- if ep93xx_open has been called, we have created a valid DMA mapping
  for ep->descs, when we call ep93xx_stop, we also call
  ep93xx_free_buffers, ep->descs now has a stale value

- if ep93xx_open has not been called, we have a NULL pointer for
  ep->descs, so performing any operation against that address just won't
  work

Fix this by adding a NULL pointer check for ep->descs which means that
ep93xx_free_buffers() was able to successfully tear down the descriptors
and free the DMA cookie as well.

Fixes: 1d22e05df8 ("[PATCH] Cirrus Logic ep93xx ethernet driver")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-05 15:35:39 -05:00
David S. Miller 34e0f2c2d8 Merge branch 'bnx2x-fixes'
Yuval Mintz says:

====================
bnx2x: fixes series

Two unrelated fixes for bnx2x - the first one is nice-to-have,
while the other fixes fatal behaviour in older adapters.

Please consider applying them to `net'.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-05 15:08:40 -05:00
Mintz, Yuval 360d9df2ac bnx2x: Prevent tunnel config for 577xx
Only the 578xx adapters are capable of configuring UDP ports for
the purpose of tunnelling - doing the same on 577xx might lead to
a firmware assertion.
We're already not claiming support for any related feature for such
devices, but we also need to prevent the configuration of the UDP
ports to the device in this case.

Fixes: f34fa14cc0 ("bnx2x: Add vxlan RSS support")
Reported-by: Anikina Anna <anikina@gmail.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-05 15:08:40 -05:00
Mintz, Yuval 65870fa77f bnx2x: Correct ringparam estimate when DOWN
Until interface is up [and assuming ringparams weren't explicitly
configured] when queried for the size of its rings bnx2x would
claim they're the maximal size by default.
That is incorrect as by default the maximal number of buffers would
be equally divided between the various rx rings.

This prevents the user from actually setting the number of elements
on each rx ring to be of maximal size prior to transitioning the
interface into up state.

To fix this, make a rough estimation about the number of buffers.
It wouldn't always be accurate, but it would be much better than
current estimation and would allow users to increase number of
buffers during early initialization of the interface.
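
One way such an estimate could look, as a rough sketch only: the total buffer budget and the function name below are assumptions made for illustration, and the actual driver may compute the value differently.

```c
/* Hypothetical total rx buffer budget shared by all rings. */
#define DEMO_MAX_RX_TOTAL 4096u

/*
 * Report the per-ring size while the interface is down.
 * 'configured' is non-zero if the user set the ring size explicitly.
 */
unsigned int demo_rx_ring_estimate(unsigned int configured,
                                   unsigned int num_rx_rings)
{
    if (configured)
        return configured; /* an explicit setting always wins */

    if (num_rx_rings == 0)
        num_rx_rings = 1; /* avoid dividing by zero */

    /* By default the budget is split equally between the rx rings. */
    return DEMO_MAX_RX_TOTAL / num_rx_rings;
}
```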

Reported-by: Seymour, Shane <shane.seymour@hpe.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-05 15:08:39 -05:00
Pan Bian 9a53682b34 isdn: hisax: set error code on failure
In function hfc4s8s_probe(), the value of return variable err should be
negative on failures. However, when the call to request_region() returns
NULL, the value of err is 0. This patch fixes the bug, assigning
"-EBUSY" to err on the path that request_region() fails.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188931

Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-05 15:03:42 -05:00
Pan Bian 005f7e68e7 net: bnx2x: fix improper return value
Macro BNX2X_ALLOC_AND_SET(arr, lbl, func) calls kmalloc() to allocate
memory, and jumps to label "lbl" if the allocation fails. Label "lbl"
first cleans memory and then returns variable rc. Before calling the
macro, the value of variable rc is 0. Because 0 means no error, the
callers of bnx2x_init_firmware() may be misled. This patch fixes the bug,
assigning "-ENOMEM" to rc before calling macro BNX2X_ALLOC_AND_SET().
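
The pattern can be sketched as follows. `DEMO_ALLOC_OR_GOTO`, `demo_alloc()` and `demo_init_firmware()` are hypothetical names modeled on the description above, not the driver's actual macro or function:

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical allocator: a size of 0 simulates an allocation failure. */
void *demo_alloc(size_t size)
{
    return size ? malloc(size) : NULL;
}

/* Allocate into ptr and jump to lbl if the allocation fails. */
#define DEMO_ALLOC_OR_GOTO(ptr, size, lbl) \
    do {                                   \
        (ptr) = demo_alloc(size);          \
        if (!(ptr))                        \
            goto lbl;                      \
    } while (0)

int demo_init_firmware(size_t a_size, size_t b_size)
{
    void *a = NULL, *b = NULL;
    int rc;

    /* Preset the error code: the macro jumps without touching rc. */
    rc = -ENOMEM;
    DEMO_ALLOC_OR_GOTO(a, a_size, fail);
    DEMO_ALLOC_OR_GOTO(b, b_size, fail);

    /* A real driver would keep the buffers; the sketch releases them. */
    free(b);
    free(a);
    return 0;

fail:
    free(b); /* free(NULL) is a no-op */
    free(a);
    return rc;
}
```

Without the preset, the `fail` label would return the initial 0 and the caller would treat a failed allocation as success.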

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=189141

Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-05 15:03:34 -05:00
Pan Bian 0ff18d2d36 net: ethernet: qlogic: set error code on failure
When calling dma_mapping_error(), the value of return variable rc is 0.
And when the call returns an unexpected value, rc is not set to a
negative errno. Thus, it will return 0 on the error path, and its
callers cannot detect the bug. This patch fixes the bug, assigning
"-ENOMEM" to rc.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=189041

Signed-off-by: Pan Bian <bianpan2016@163.com>
Acked-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-05 14:54:18 -05:00
Pan Bian 7cf6156633 atm: fix improper return value
It returns variable "error" when ioremap_nocache() returns a NULL
pointer. The value of "error" is 0 then, which will mislead the callers
to believe that there is no error. This patch fixes the bug, returning
"-ENOMEM".

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=189021

Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-05 14:53:46 -05:00
Pan Bian 8ad3ba9345 net: irda: set error code on failures
When the calls to kzalloc() fail, the value of return variable ret may
be 0. 0 means success in this context. This patch fixes the bug,
assigning "-ENOMEM" to ret before calling kzalloc().

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188971

Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-05 14:53:08 -05:00
Pan Bian c79e167c3c net: caif: remove ineffective check
The check of the return value of sock_register() is ineffective.
"if(!err)" seems to be a typo. It is better to propagate the error code
to the callers of caif_sktinit_module(). This patch removes the check
statement and directly returns the result of sock_register().
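
A before/after sketch of the change, assuming the original check had the shape the message implies. `demo_sock_register()` is a hypothetical stand-in for sock_register(), and both init functions are illustrative only:

```c
#include <errno.h>

/* Hypothetical registration call: 0 on success, negative errno on failure. */
int demo_sock_register(int fail)
{
    return fail ? -EEXIST : 0;
}

/* Before: the inverted check only fires on success, so errors are lost. */
int demo_init_before(int fail)
{
    int err = demo_sock_register(fail);

    if (!err)
        return err; /* err is 0 here */
    return 0;       /* a real failure is reported as success */
}

/* After: propagate the result directly to the caller. */
int demo_init_after(int fail)
{
    return demo_sock_register(fail);
}
```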

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188751
Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-05 14:48:48 -05:00
Kees Cook 0eab121ef8 net: ping: check minimum size on ICMP header length
Prior to commit c0371da604 ("put iov_iter into msghdr") in v3.19, there
was no check that the iovec contained enough bytes for an ICMP header,
and the read loop would walk across neighboring stack contents. Since the
iov_iter conversion, bad arguments are noticed, but the returned error is
EFAULT. Returning EINVAL is a clearer error and also solves the problem
prior to v3.19.
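
The length check can be sketched in userspace as below. The function name and the fixed 8-byte header length are assumptions for illustration; the kernel code works on a msghdr rather than a flat buffer:

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* An ICMP echo header is 8 bytes: type, code, checksum, id, sequence. */
#define DEMO_ICMP_HDR_LEN 8

/*
 * Copy the ICMP header out of a caller-supplied payload.
 * Rejects payloads too short to hold a full header, so the copy
 * can never read past the end of the caller's buffer.
 */
int demo_copy_icmp_header(void *hdr, const void *payload, size_t len)
{
    if (len < DEMO_ICMP_HDR_LEN)
        return -EINVAL;

    memcpy(hdr, payload, DEMO_ICMP_HDR_LEN);
    return 0;
}
```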

This was found using trinity with KASAN on v3.18:

BUG: KASAN: stack-out-of-bounds in memcpy_fromiovec+0x60/0x114 at addr ffffffc071077da0
Read of size 8 by task trinity-c2/9623
page:ffffffbe034b9a08 count:0 mapcount:0 mapping:          (null) index:0x0
flags: 0x0()
page dumped because: kasan: bad access detected
CPU: 0 PID: 9623 Comm: trinity-c2 Tainted: G    BU         3.18.0-dirty #15
Hardware name: Google Tegra210 Smaug Rev 1,3+ (DT)
Call trace:
[<ffffffc000209c98>] dump_backtrace+0x0/0x1ac arch/arm64/kernel/traps.c:90
[<ffffffc000209e54>] show_stack+0x10/0x1c arch/arm64/kernel/traps.c:171
[<     inline     >] __dump_stack lib/dump_stack.c:15
[<ffffffc000f18dc4>] dump_stack+0x7c/0xd0 lib/dump_stack.c:50
[<     inline     >] print_address_description mm/kasan/report.c:147
[<     inline     >] kasan_report_error mm/kasan/report.c:236
[<ffffffc000373dcc>] kasan_report+0x380/0x4b8 mm/kasan/report.c:259
[<     inline     >] check_memory_region mm/kasan/kasan.c:264
[<ffffffc00037352c>] __asan_load8+0x20/0x70 mm/kasan/kasan.c:507
[<ffffffc0005b9624>] memcpy_fromiovec+0x5c/0x114 lib/iovec.c:15
[<     inline     >] memcpy_from_msg include/linux/skbuff.h:2667
[<ffffffc000ddeba0>] ping_common_sendmsg+0x50/0x108 net/ipv4/ping.c:674
[<ffffffc000dded30>] ping_v4_sendmsg+0xd8/0x698 net/ipv4/ping.c:714
[<ffffffc000dc91dc>] inet_sendmsg+0xe0/0x12c net/ipv4/af_inet.c:749
[<     inline     >] __sock_sendmsg_nosec net/socket.c:624
[<     inline     >] __sock_sendmsg net/socket.c:632
[<ffffffc000cab61c>] sock_sendmsg+0x124/0x164 net/socket.c:643
[<     inline     >] SYSC_sendto net/socket.c:1797
[<ffffffc000cad270>] SyS_sendto+0x178/0x1d8 net/socket.c:1761

CVE-2016-8399

Reported-by: Qidan He <i@flanker017.me>
Fixes: c319b4d76b ("net: ipv4: add IPPROTO_ICMP socket kind")
Cc: stable@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-05 13:35:38 -05:00
Linus Torvalds d9d04527c7 powerpc fixes for 4.9 #7
Four fixes, the first for code we merged this cycle and three that are also
 going to stable:
 
  - On 64-bit Book3E we were not placing the .text section where we said we would
    in the asm.
  - We broke building the boot wrapper on some 32-bit toolchains.
  - Lazy icache flushing was broken on pre-POWER5 machines.
  - One of the error paths in our EEH code would lead to a deadlock.
 
 Thanks to:
   Andrew Donnellan, Ben Hutchings, Benjamin Herrenschmidt, Nicholas Piggin.

Merge tag 'powerpc-4.9-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
 "Four fixes, the first for code we merged this cycle and three that are
  also going to stable:

   - On 64-bit Book3E we were not placing the .text section where we
     said we would in the asm.

   - We broke building the boot wrapper on some 32-bit toolchains.

   - Lazy icache flushing was broken on pre-POWER5 machines.

   - One of the error paths in our EEH code would lead to a deadlock.

  Thanks to: Andrew Donnellan, Ben Hutchings, Benjamin Herrenschmidt,
  Nicholas Piggin"

* tag 'powerpc-4.9-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/64: Fix placement of .text to be immediately following .head.text
  powerpc/eeh: Fix deadlock when PE frozen state can't be cleared
  powerpc/mm: Fix lazy icache flush on pre-POWER5
  powerpc/boot: Fix build failure in 32-bit boot wrapper
2016-12-05 10:30:12 -08:00
Pan Bian 4606c9e8c5 atm: lanai: set error code when ioremap fails
In function lanai_dev_open(), when the call to ioremap() fails, the
value of return variable result is 0. 0 means no error in this context.
This patch fixes the bug, assigning "-ENOMEM" to result when ioremap()
returns a NULL pointer.
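
A minimal sketch of the fix pattern, with `demo_ioremap()` and `demo_dev_open()` as hypothetical stand-ins for ioremap() and the driver's open routine:

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for ioremap(): returns NULL when mapping fails. */
void *demo_ioremap(int ok)
{
    static char fake_mmio[16];

    return ok ? fake_mmio : NULL;
}

int demo_dev_open(int map_ok)
{
    int result = 0;
    void *base;

    base = demo_ioremap(map_ok);
    if (!base) {
        result = -ENOMEM; /* without this, the error path returns 0 */
        goto error;
    }

    /* ... remaining setup would use 'base' here ... */
    return 0;

error:
    return result;
}
```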

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188791

Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-05 13:27:33 -05:00
Pan Bian 51920830d9 net: usb: set error code when usb_alloc_urb fails
In function lan78xx_probe(), variable ret takes the errno code on
failures. However, when the call to usb_alloc_urb() fails, its value
remains 0. 0 indicates success in this context, which is inconsistent
with the execution result. This patch fixes the bug, assigning
"-ENOMEM" to ret when usb_alloc_urb() returns a NULL pointer.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188771

Signed-off-by: Pan Bian <bianpan2016@163.com>
Acked-by: Woojung Huh <woojung.huh@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-05 13:27:15 -05:00