1
0
Fork 0
Commit Graph

336256 Commits (05bf3aac930752408bf38a3f070061fc5f1b9c73)

Author SHA1 Message Date
Jiang Liu 05bf3aac93 VFIO: fix out of order labels for error recovery in vfio_pci_init()
The two labels for error recovery in function vfio_pci_init() is out of
order, so fix it.

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2012-12-07 13:43:51 -07:00
Jiang Liu de2b3eeafb VFIO: use ACCESS_ONCE() to guard access to dev->driver
Comments from dev_driver_string(),
/* dev->driver can change to NULL underneath us because of unbinding,
 * so be careful about accessing it.
 */

So use ACCESS_ONCE() to guard access to dev->driver field.

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2012-12-07 13:43:50 -07:00
Jiang Liu 9df7b25ab7 VFIO: unregister IOMMU notifier on error recovery path
On error recovery path in function vfio_create_group(), it should
unregister the IOMMU notifier for the new VFIO group. Otherwise it may
cause invalid memory access later when handling bus notifications.

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2012-12-07 13:43:50 -07:00
Alex Williamson 2007722a60 vfio-pci: Re-order device reset
Move the device reset to the end of our disable path, the device
should already be stopped from pci_disable_device().  This also allows
us to manipulate the save/restore to avoid the save/reset/restore +
save/restore that we had before.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2012-12-07 13:43:50 -07:00
Fengguang Wu 3a1f7041dd vfio: simplify kmalloc+copy_from_user to memdup_user
Generated by: coccinelle/api/memdup_user.cocci

Acked-by: Julia Lawall <julia.lawall@lip6.fr>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2012-12-07 13:43:49 -07:00
Linus Torvalds 1afa471706 MMC fixes for 3.7:
- sdhci-s3c: Fix runtime PM regression against 3.7-rc1
  - sh-mmcif: Fix oops against 3.6
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJQwOsQAAoJEHNBYZ7TNxYMkiAQAKfrOAFIe2eH+Z651iowbZxs
 WUV5eocfZPDnahxpkOQmK7Oa+BA7Kqqa8bql0+8CXkrXNh5zDZ6uuK4M8fWWYIAp
 uRgLetjC3/jB9dBXP9oxNBJ6oXcq/fuwx33Zvxln1FVDGy15glCeJ9O2lHhcwiB2
 I5ezzOTyU1CogqofuSAh27/9Y6CagDuTBceoeW7ADYD+4nGNX/HET68T9WTaaR2p
 CM22ogAyMUyCBfACCT50m1LkzmhXq1cn6AptdZzHOhGuQ7QP0L2Ej1LBgTAIIoV+
 fJ4Nd7Ht8Xk5arnLBrQYtP8YOg5J0P6bt/Y8Dbqv4gM+vtsBIQ5mHRIFebMpz0to
 3zoH8m+Qq1/Ko+zT063FQNC3gRHQLyKurEIPTJO/IxhzIiw8MBT3bwR7rEGO1uCt
 8CW633dzu0SjG9Le4A5dLRjQ8s93HKhfCiqXm4/xkdPNgBLRQYDf09Z+G6RTIoaQ
 O7D6JjbgnGVdzy6HwuWK35LN74kyc8+rACDfbE9KMmtfYt94x6TMBfhQ/rTP5grv
 pBfLWHyRqScBLpE6RTXFm2FEAR5y1iCWRrZUtJwjJ4rqfgc4li3odvn08+5O2m7v
 PPEk+OKJSrsSu3f5kJSk4KKYPWFB9MqK+V6dKDjPvJ6u3afmPvx7USgt/zTxtp9c
 7Met6CmklOgokwhnFlrN
 =+gTB
 -----END PGP SIGNATURE-----

Merge tag 'mmc-fixes-for-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc

Pull MMC fixes from Chris Ball:
 "Two small regression fixes:

   - sdhci-s3c: Fix runtime PM regression against 3.7-rc1
   - sh-mmcif: Fix oops against 3.6"

* tag 'mmc-fixes-for-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc:
  mmc: sh-mmcif: avoid oops on spurious interrupts (second try)
  Revert misapplied "mmc: sh-mmcif: avoid oops on spurious interrupts"
  mmc: sdhci-s3c: fix missing clock for gpio card-detect
2012-12-07 09:15:20 -08:00
Mel Gorman 18a2f371f5 tmpfs: fix shared mempolicy leak
This fixes a regression in 3.7-rc, which has since gone into stable.

Commit 00442ad04a ("mempolicy: fix a memory corruption by refcount
imbalance in alloc_pages_vma()") changed get_vma_policy() to raise the
refcount on a shmem shared mempolicy; whereas shmem_alloc_page() went
on expecting alloc_page_vma() to drop the refcount it had acquired.
This deserves a rework: but for now fix the leak in shmem_alloc_page().

Hugh: shmem_swapin() did not need a fix, but surely it's clearer to use
the same refcounting there as in shmem_alloc_page(), delete its onstack
mempolicy, and the strange mpol_cond_copy() and __mpol_cond_copy() -
those were invented to let swapin_readahead() make an unknown number of
calls to alloc_pages_vma() with one mempolicy; but since 00442ad04a,
alloc_pages_vma() has kept refcount in balance, so now no problem.

Reported-and-tested-by: Tommi Rantala <tt.rantala@gmail.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-06 11:56:43 -08:00
Johannes Weiner c702418f8a mm: vmscan: do not keep kswapd looping forever due to individual uncompactable zones
When a zone meets its high watermark and is compactable in case of
higher order allocations, it contributes to the percentage of the node's
memory that is considered balanced.

This requirement, that a node be only partially balanced, came about
when kswapd was desparately trying to balance tiny zones when all bigger
zones in the node had plenty of free memory.  Arguably, the same should
apply to compaction: if a significant part of the node is balanced
enough to run compaction, do not get hung up on that tiny zone that
might never get in shape.

When the compaction logic in kswapd is reached, we know that at least
25% of the node's memory is balanced properly for compaction (see
zone_balanced and pgdat_balanced).  Remove the individual zone checks
that restart the kswapd cycle.

Otherwise, we may observe more endless looping in kswapd where the
compaction code loops back to reclaim because of a single zone and
reclaim does nothing because the node is considered balanced overall.

See for example

  https://bugzilla.redhat.com/show_bug.cgi?id=866988

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-and-tested-by: Thorsten Leemhuis <fedora@leemhuis.info>
Reported-by: Jiri Slaby <jslaby@suse.cz>
Tested-by: John Ellson <john.ellson@comcast.net>
Tested-by: Zdenek Kabelac <zkabelac@redhat.com>
Tested-by: Bruno Wolff III <bruno@wolff.to>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-06 11:29:57 -08:00
Mel Gorman 60177d31d2 mm: compaction: validate pfn range passed to isolate_freepages_block
Commit 0bf380bc70 ("mm: compaction: check pfn_valid when entering a
new MAX_ORDER_NR_PAGES block during isolation for migration") added a
check for pfn_valid() when isolating pages for migration as the scanner
does not necessarily start pageblock-aligned.

Since commit c89511ab2f ("mm: compaction: Restart compaction from near
where it left off"), the free scanner has the same problem.  This patch
makes sure that the pfn range passed to isolate_freepages_block() is
within the same block so that pfn_valid() checks are unnecessary.

In answer to Henrik's wondering why others have not reported this:
reproducing this requires a large enough hole with the right aligment to
have compaction walk into a PFN range with no memmap.  Size and
alignment depends in the memory model - 4M for FLATMEM and 128M for
SPARSEMEM on x86.  It needs a "lucky" machine.

Reported-by: Henrik Rydberg <rydberg@euromail.se>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-06 11:17:33 -08:00
Guennadi Liakhovetski 91ab252ac5 mmc: sh-mmcif: avoid oops on spurious interrupts (second try)
On some systems, e.g., kzm9g, MMCIF interfaces can produce spurious
interrupts without any active request. To prevent the Oops, that results
in such cases, don't dereference the mmc request pointer until we make
sure, that we are indeed processing such a request.

Reported-by: Tetsuyuki Kobayashi <koba@kmckk.co.jp>
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Tested-by: Tetsuyuki Kobayashi <koba@kmckk.co.jp>
Cc: stable@vger.kernel.org
Signed-off-by: Chris Ball <cjb@laptop.org>
2012-12-06 13:54:35 -05:00
Chris Ball 6984f3c31b Revert misapplied "mmc: sh-mmcif: avoid oops on spurious interrupts"
This reverts commit 8464dd52d3, which was a misapplied debugging
version of the patch, not the final patch itself.

Signed-off-by: Chris Ball <cjb@laptop.org>
Cc: stable@vger.kernel.org
2012-12-06 13:54:34 -05:00
Heiko Stübner fe007c02f9 mmc: sdhci-s3c: fix missing clock for gpio card-detect
2abeb5c5de ("Add clk_(enable/disable) in runtime suspend/resume")
added the capability to stop the clocks when the device is runtime
suspended, but forgot to handle the case of the card-detect using
an external gpio.

Therefore in the case that runtime-pm is enabled, start the io-clock
when a card is inserted and stop it again once it is removed.

Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Chris Ball <cjb@laptop.org>
2012-12-06 13:54:33 -05:00
Linus Torvalds 04c5decdc0 Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Pull MIPS fixes from Ralf Baechle:
 "These are the fixes for the N32 syscall bugs found by Al, an
  extraneous break that broke detection for R3000 and R3081 processors,
  an endless loop processing signals for kernel task (x86 received the
  same fix a while ago) and a fix for transparent huge page which took
  ages to track down because it was so hard to come up with a workable
  test case."

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
  MIPS: Fix endless loop when processing signals for kernel tasks
  MIPS: R3000/R3081: Fix CPU detection.
  MIPS: N32: Fix signalfd4 syscall entry point
  MIPS: N32: Fix preadv(2) and pwritev(2) entry points.
  MIPS: Avoid mcheck by flushing page range in huge_ptep_set_access_flags()
2012-12-06 08:42:13 -08:00
Linus Torvalds d91fa97128 Merge branch 'more-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
Pull build fix from Rusty Russell:
 "Tim Gardner <tim.gardner@canonical.com> writes:
  > It is $(obj)/oid_registry.o that is dependent on $(obj)/oid_registry_data.c.
  > The object file cannot be built until $(obj)/oid_registry_data.c has been
  > generated.
  >
  > A periodic and hard to reproduce parallel build failure is due to
  > this incorrect lib/Makefile dependency. The compile error is completely
  > disingenuous.
  >
  >   GEN     lib/oid_registry_data.c
  > Compiling 49 OIDs
  >   CC      lib/oid_registry.o
  > gcc: error: lib/oid_registry.c: No such file or directory
  > gcc: fatal error: no input files
  > compilation terminated.
  > make[3]: *** [lib/oid_registry.o] Error 4

  I can't reproduce it either.  It's completely weird; nothing ever
  removes lib/oid_registry.c, so either gcc is giving the wrong message
  or it's a weird fs with a very odd race.

  But your version is definitely more correct than the previous one,
  so..."

* 'more-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  lib/Makefile: Fix oid_registry build dependency
2012-12-06 08:39:57 -08:00
Linus Torvalds 54d1ae492f Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
Pull module signing fixes from Rusty Russell:
 "David gave me these a month ago, during my git workflow churn :("

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  ASN.1: Fix an indefinite length skip error
  MODSIGN: Don't use enum-type bitfields in module signature info block
2012-12-06 08:29:08 -08:00
Linus Torvalds cfd1f032f9 Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull watchdog fix from Thomas Gleixner:
 "Trivial CPU hotplug regression fix for the watchdog code"

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  watchdog: Fix CPU hotplug regression
2012-12-06 08:27:11 -08:00
Tim Gardner 527897ccd9 lib/Makefile: Fix oid_registry build dependency
It is $(obj)/oid_registry.o that is dependent on $(obj)/oid_registry_data.c.
The object file cannot be built until $(obj)/oid_registry_data.c has been
generated.

A periodic and hard to reproduce parallel build failure is due to
this incorrect lib/Makefile dependency. The compile error is completely
disingenuous.

  GEN     lib/oid_registry_data.c
Compiling 49 OIDs
  CC      lib/oid_registry.o
gcc: error: lib/oid_registry.c: No such file or directory
gcc: fatal error: no input files
compilation terminated.
make[3]: *** [lib/oid_registry.o] Error 4

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: David Howells <dhowells@redhat.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-12-06 17:25:01 +10:30
Dmitry Adamushko c90e6fbb22 MIPS: Fix endless loop when processing signals for kernel tasks
The problem occurs [1] when a kernel-mode task returns from a system
call with a pending signal.

A real-life scenario is a child of 'khelper' returning from a failed
kernel_execve() in ____call_usermodehelper() [ kernel/kmod.c ].
kernel_execve() fails due to a pending SIGKILL, which is the result of
"kill -9 -1" (at least, busybox's init does it upon reboot).

The loop is as follows:

* syscall_exit_work:
 - work_pending:            // start_of_the_loop
 - work_notifysig:
   - do_notify_resume()
     - do_signal()
       - if (!user_mode(regs)) return;
 - resume_userspace         // TIF_SIGPENDING is still set
 - work_pending             // so we call work_pending => goto
                            // start_of_the_loop

More information can be found in another LKML thread:
http://www.serverphorums.com/read.php?12,457826

[1] The problem was also reproduced on !CONFIG_VM86 x86, and the
following fix was accepted.

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=29a2e2836ff9ea65a603c89df217f4198973a74f

Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/3571/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2012-12-05 19:59:00 +01:00
Ralf Baechle 2d33976fb3 MIPS: R3000/R3081: Fix CPU detection.
Broken since e05ea74fc56f347f872ef9946d27c53e8bf20864 (lmo) rsp.
cea7e2dfde (kernel.org) [MIPS: Sort out CPU
type to name translation.]  These CPUs are no longer very popular to say
the least ...

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Reported-by: Murphy McCauley <murphy.mccauley@gmail.com>
2012-12-05 19:58:54 +01:00
Ralf Baechle 97daa76801 MIPS: N32: Fix signalfd4 syscall entry point
This needs to use the compat entry point or it's going to fail on big
endian systems.

Noticed by Al Viro.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2012-12-05 19:58:48 +01:00
Dan Carpenter 27d7c2a006 vfs: clear to the end of the buffer on partial buffer reads
READ is zero so the "rw & READ" test is always false.  The intended test
was "((rw & RW_MASK) == READ)".

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-05 10:32:59 -08:00
David Howells f3537f91f9 ASN.1: Fix an indefinite length skip error
Fix an error in asn1_find_indefinite_length() whereby small definite length
elements of size 0x7f are incorrecly classified as non-small.  Without this
fix, an error will be given as the length of the length will be perceived as
being very much greater than the maximum supported size.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-12-05 11:27:39 +10:30
David Howells 12e130b045 MODSIGN: Don't use enum-type bitfields in module signature info block
Don't use enum-type bitfields in the module signature info block as we can't be
certain how the compiler will handle them.  As I understand it, it is arch
dependent, and it is possible for the compiler to rearrange them based on
endianness and to insert a byte of padding to pad the three enums out to four
bytes.

Instead use u8 fields for these, which the compiler should emit in the right
order without padding.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-12-05 11:27:24 +10:30
Thomas Gleixner 8d4516904b watchdog: Fix CPU hotplug regression
Norbert reported:
"3.7-rc6 booted with nmi_watchdog=0 fails to suspend to RAM or
 offline CPUs. It's reproducable with a KVM guest and physical
 system."

The reason is that commit bcd951cf(watchdog: Use hotplug thread
infrastructure) missed to take this into account. So the cpu offline
code gets stuck in the teardown function because it accesses non
initialized data structures.

Add a check for watchdog_enabled into that path to cure the issue.

Reported-and-tested-by: Norbert Warmuth <nwarmuth@t-online.de>
Tested-by: Joseph Salisbury <joseph.salisbury@canonical.com>
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1211231033230.2701@ionos
Link: http://bugs.launchpad.net/bugs/1079534
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-12-04 19:56:59 +01:00
Linus Torvalds df2fc246c8 Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
Pull module fixes from Rusty Russell:
 "Module signing build fixes for blackfin and metag"

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  modsign: add symbol prefix to certificate list
  linux/kernel.h: define SYMBOL_PREFIX
2012-12-04 09:32:12 -08:00
Linus Torvalds 70dcc535bd Fixes for 2 brown-paperbag bugs introduced this merge window by the fastmap
code:
 
 1. The UBI background thread got stuck when a bit-flip happened because free
    LEBs was not removed from the "free" tree when we started using it.
 2. I/O debugging checks did not work because we called a sleeping function in
    atomic context.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJQvgPKAAoJECmIfjd9wqK0JgQQAKVMRSmE1GehkcqtOG+binQQ
 FeasewaarkyRhZzkLy0Mw25c/1csjZ8bL/rbht75yjrhPM72Lu5S6pTYmnqQ2kFX
 phVIyDNeMeGxmI55s3rlTc8HMVvwIpFUHQEYgWiXfse4R2I2Du9NY2G9UWJ3hvUL
 X4x039YSLrI+SawnF5cxe59FV54aS0cbZ6Dlltvyw7oKqt0lStt/RG2V5MaUOpx+
 DW7qCXJm3VJ2TkUb/Goyop1R5BznOtFEdRJXeXfefJ6+sMyQCO+5WBMq3JMF0x5K
 RmlJ4ebsk5oqerKDFmtjE7TwFiaMvGpFOxcomIJuP5kyoDKtv7VqczkGtIegGJ0+
 cAKBLSUQP4tTWoq066soEJmGKpIqyrqrhdGTMJdKSRKNIzCdRkUtY/GUhAxZmSEz
 Pszmy1Dbfo8Vw8aAgoMSzSiHTbLX1e1GbwXNDHA/SMKq5drMfa+mvSWSOEfcoYBV
 PXcUETOfTIHvNBiSlTeyx3W4BFC74kwHizqdIqpCWSC9GrUG/ckC2FEQcDFe1Lqf
 PfavXn1VM6IyDAnjaCPh+ZycA9YtIb0eV7RSiT5F/nI7FAXH1+/2YHMYnnUz+BUM
 cdS4DR7bsy3a1QGG+M1OZOZlxy8brWrH3tBEeWRRVeywr/Ij24g7mcG3SHS4uEuF
 JCevqZJ069/xxCSiRpoy
 =fs+0
 -----END PGP SIGNATURE-----

Merge tag 'upstream-3.7-rc9' of git://git.infradead.org/linux-ubi

Pull UBI changes from Artem Bityutskiy:
 "Fixes for 2 brown-paperbag bugs introduced this merge window by the
  fastmap code:

   1.  The UBI background thread got stuck when a bit-flip happened
       because free LEBs was not removed from the "free" tree when we
       started using it.
   2.  I/O debugging checks did not work because we called a sleeping
       function in atomic context."

* tag 'upstream-3.7-rc9' of git://git.infradead.org/linux-ubi:
  UBI: dont call ubi_self_check_all_ff() in __wl_get_peb()
  UBI: remove PEB from free tree in get_peb_for_wl()
2012-12-04 09:15:51 -08:00
Linus Torvalds ca50496eb4 Merge branch 'for-3.7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Pull workqueue fixes from Tejun Heo:
 "So, safe fixes my ass.

  Commit 8852aac25e ("workqueue: mod_delayed_work_on() shouldn't queue
  timer on 0 delay") had the side-effect of performing delayed_work
  sanity checks even when @delay is 0, which should be fine for any sane
  use cases.

  Unfortunately, megaraid was being overly ingenious.  It seemingly
  wanted to use cancel_delayed_work_sync() before cancel_work_sync() was
  introduced, but didn't want to waste the space for full delayed_work
  as it was only going to use 0 @delay.  So, it only allocated space for
  struct work_struct and then cast it to struct delayed_work and passed
  it into delayed_work functions - truly awesome engineering tradeoff to
  save some bytes.

  Xiaotian fixed it by making megraid allocate full delayed_work for
  now.  It should be converted to use work_struct and cancel_work_sync()
  but I think we better do that after 3.7.

  I added another commit to change BUG_ON()s in __queue_delayed_work()
  to WARN_ON_ONCE()s so that the kernel doesn't crash even if there are
  more such abuses."

* 'for-3.7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: convert BUG_ON()s in __queue_delayed_work() to WARN_ON_ONCE()s
  megaraid: fix BUG_ON() from incorrect use of delayed work
2012-12-04 09:02:45 -08:00
Ralf Baechle d5563715a3 MIPS: N32: Fix preadv(2) and pwritev(2) entry points.
By using the native syscall entry point the kernel was also expecting
64-bit iovec structures.

This is broken since ddd9e91b71 [preadv/
pwritev: MIPS: Add preadv(2) and pwritev(2) syscalls.] which originally
added these two syscalls.  I walked through piles of code, including
libc and couldn't find anything that would have worked around the issue
so this change the API to what it should always have been.

Noticed and patch suggested by Al Viro.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2012-12-04 17:59:39 +01:00
Linus Torvalds 609e3ff3ff Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc
Pull sparc fixes from David Miller:
 "Two small fixes for Sparc, nobody uses sparc, so these are low risk :-)

   1) Piggyback is too picky about the symbol types that _start and _end
      have in the final kernel image, and it thus breaks with newer
      binutils.  Future proof by getting rid of the symbol type checks.

   2) exit_group() should kill register windows on sparc64 the same way
      we do for plain exit().  Thanks to Al Viro for spotting this."

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sparc: Fix piggyback with newer binutils.
  sparc64: exit_group should kill register windows just like plain exit.
2012-12-04 08:42:29 -08:00
Linus Torvalds 57302e0ddf vfs: avoid "attempt to access beyond end of device" warnings
The block device access simplification that avoided accessing the (racy)
block size information (commit bbec0270bdd8: "blkdev_max_block: make
private to fs/buffer.c") no longer checks the maximum block size in the
block mapping path.

That was _almost_ as simple as just removing the code entirely, because
the readers and writers all check the size of the device anyway, so
under normal circumstances it "just worked".

However, the block size may be such that the end of the device may
straddle one single buffer_head.  At which point we may still want to
access the end of the device, but the buffer we use to access it
partially extends past the end.

The 'bd_set_size()' function intentionally sets the block size to avoid
this, but mounting the device - or setting the block size by hand to
some other value - can modify that block size.

So instead, teach 'submit_bh()' about the special case of the buffer
head straddling the end of the device, and turning such an access into a
smaller IO access, avoiding the problem.

This, btw, also means that unlike before, we can now access the whole
device regardless of device block size setting.  So now, even if the
device size is only 512-byte aligned, we can read and write even the
last sector even when having a much bigger block size for accessing the
rest of the device.

So with this, we could now get rid of the 'bd_set_size()' block size
code entirely - resulting in faster IO for the common case - but that
would be a separate patch.

Reported-and-tested-by: Romain Francoise <romain@orebokech.com>
Reporeted-and-tested-by: Meelis Roos <mroos@linux.ee>
Reported-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-04 08:25:11 -08:00
Tejun Heo fc4b514f27 workqueue: convert BUG_ON()s in __queue_delayed_work() to WARN_ON_ONCE()s
8852aac25e ("workqueue: mod_delayed_work_on() shouldn't queue timer on
0 delay") unexpectedly uncovered a very nasty abuse of delayed_work in
megaraid - it allocated work_struct, casted it to delayed_work and
then pass that into queue_delayed_work().

Previously, this was okay because 0 @delay short-circuited to
queue_work() before doing anything with delayed_work.  8852aac25e
moved 0 @delay test into __queue_delayed_work() after sanity check on
delayed_work making megaraid trigger BUG_ON().

Although megaraid is already fixed by c1d390d8e6 ("megaraid: fix
BUG_ON() from incorrect use of delayed work"), this patch converts
BUG_ON()s in __queue_delayed_work() to WARN_ON_ONCE()s so that such
abusers, if there are more, trigger warning but don't crash the
machine.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Xiaotian Feng <xtfeng@gmail.com>
2012-12-04 07:58:47 -08:00
David Daney ac53c4fca4 MIPS: Avoid mcheck by flushing page range in huge_ptep_set_access_flags()
Problem:

1) Huge page mapping of anonymous memory is initially invalid.  Will be
   faulted in by copy-on-write mechanism.

2) Userspace attempts store at the end of the huge mapping.

3) TLB Refill exception handler fill TLB with a normal (4K sized)
   invalid page at the end of the huge mapping virtual address range.

4) Userspace restarted, and re-attempts the store at the end of the
   huge mapping.

5) Page from #3 is invalid, we get a fault and go to the hugepage
   fault handler.  This tries to map a huge page and calls
   huge_ptep_set_access_flags() to install the mapping.

6) We just call the generic ptep_set_access_flags() to set up the page
   tables, but the flush there assumes a normal (4K sized) page and
   only tries to flush the first part of the huge page virtual address
   out of the TLB, since the existing entry from step #3 doesn't
   conflict, nothing is flushed.

7) We attempt to load the mapping into the TLB, but because it
   conflicts with the entry from step #3, we get a Machine Check
   exception.

The fix: Flush the entire rage covered by the huge page in
huge_ptep_set_access_flags(), and remove the optimization in
local_flush_tlb_range() so that the flush actually does the correct
thing.

Signed-off-by: David Daney <david.daney@cavium.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Cc: Hillf Danton <dhillf@gmail.com>
Patchwork: https://patchwork.linux-mips.org/patch/4661/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
(cherry picked from commit dd617f258cc39d36be26afee9912624a2d23112c)
2012-12-04 16:57:54 +01:00
Xiaotian Feng c1d390d8e6 megaraid: fix BUG_ON() from incorrect use of delayed work
megaraid use INIT_WORK to declare a hotplug_work, but cast the
hotplug_work from work_struct to delayed_work and
schedule_delayed_work on it.  This is very dangerous, as other part of
delayed_work might be kernel memories allocated by others.

With commit 8852aac ("workqueue: mod_delayed_work_on() shouldn't queue
timer on 0 delay"), schedule_delayed_work() will check dwork->timer
before queue_work even when @delay is 0, this causes megaraid code to
hit the BUG_ON() in workqueue code.  Change megaraid code to use
delayed work.

Signed-off-by: Xiaotian Feng <dannyfeng@tencent.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Neela Syam Kolli <megaraidlinux@lsi.com>
Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Cc: linux-scsi@vger.kernel.org
2012-12-04 07:29:47 -08:00
Richard Weinberger 894aef2157 UBI: dont call ubi_self_check_all_ff() in __wl_get_peb()
As ubi_self_check_all_ff() might sleep we are not allowed
to call it from atomic context.
For now we call it only from ubi_wl_get_peb().
There are some code paths where it would also make sense,
but these paths are currently atomic and only enabled
when fastmap is used.

Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2012-12-04 16:04:31 +02:00
Richard Weinberger ed4b7021cb UBI: remove PEB from free tree in get_peb_for_wl()
If UBI is built without fastmap, get_peb_for_wl() has to
remove the PEB manially from the free tree.
Otherwise the requested PEB lives in two trees.

Reported-by: Zach Sadecki <zsadecki@itwatchdogs.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2012-12-04 16:04:16 +02:00
David S. Miller 0032c85745 sparc: Fix piggyback with newer binutils.
Newer versions of binutils mark '_end' as 'B' instead of 'A' for
whatever reason.

To be honest, the piggyback code doesn't actually care what kind
of symbol _start and _end are, it just wants to find them and
record the address.

So remove the type from the match strings.

Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-03 11:24:25 -08:00
Linus Torvalds b69f0859dc Linux 3.7-rc8 2012-12-03 11:22:37 -08:00
David S. Miller de7531e857 sparc64: exit_group should kill register windows just like plain exit.
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-03 11:17:57 -08:00
Linus Torvalds b52c6402b5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac
Pull EDAC fixes from Mauro Carvalho Chehab:
 "One EDAC core fix, and a few driver fixes (i7300, i9275x, i7core)."

* git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac:
  i7core_edac: fix panic when accessing sysfs files
  i7300_edac: Fix error flag testing
  edac: Fix the dimm filling for csrows-based layouts
  i82975x_edac: Fix dimm label initialization
2012-12-03 11:16:37 -08:00
Linus Torvalds 4ba0032984 Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media fixes from Mauro Carvalho Chehab:
 "Some driver fixes for s5p/exynos (mostly race fixes)"

* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  [media] s5p-mfc: Handle multi-frame input buffer
  [media] s5p-mfc: Bug fix of timestamp/timecode copy mechanism
  [media] exynos-gsc: Add missing video device vfl_dir flag initialization
  [media] exynos-gsc: Fix settings for input and output image RGB type
  [media] exynos-gsc: Don't use mutex_lock_interruptible() in device release()
  [media] fimc-lite: Don't use mutex_lock_interruptible() in device release()
  [media] s5p-fimc: Don't use mutex_lock_interruptible() in device release()
  [media] s5p-fimc: Prevent race conditions during subdevs registration
2012-12-03 11:13:32 -08:00
Al Viro 25a3bc6bd1 [parisc] open(2) compat bug
In commit 9d73fc2d64 ("open*(2) compat fixes (s390, arm64)") I said:
>
> 	The usual rules for open()/openat()/open_by_handle_at() are
> 1) native 32bit - don't force O_LARGEFILE in flags
> 2) native 64bit - force O_LARGEFILE in flags
> 3) compat on 64bit host - as for native 32bit
> 4) native 32bit ABI for 64bit system (mips/n32, x86/x32) - as for native 64bit
>
> There are only two exceptions - s390 compat has open() forcing O_LARGEFILE and
> arm64 compat has open_by_handle_at() doing the same thing.  The same binaries
> on native host (s390/31 and arm resp.) will *not* force O_LARGEFILE, so IMO
> both are emulation bugs.

Three exceptions, actually - parisc open() is another case like that.
Native 32bit won't force O_LARGEFILE, the same binary on parisc64 will.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-03 11:13:09 -08:00
Mike Galbraith fd8ef11730 Revert "sched, autogroup: Stop going ahead if autogroup is disabled"
This reverts commit 800d4d30c8.

Between commits 8323f26ce3 ("sched: Fix race in task_group()") and
800d4d30c8 ("sched, autogroup: Stop going ahead if autogroup is
disabled"), autogroup is a wreck.

With both applied, all you have to do to crash a box is disable
autogroup during boot up, then reboot..  boom, NULL pointer dereference
due to commit 800d4d30c8 not allowing autogroup to move things, and
commit 8323f26ce3 making that the only way to switch runqueues:

  BUG: unable to handle kernel NULL pointer dereference at           (null)
  IP: [<ffffffff81063ac0>] effective_load.isra.43+0x50/0x90
  Pid: 7047, comm: systemd-user-se Not tainted 3.6.8-smp #7 MEDIONPC MS-7502/MS-7502
  RIP: effective_load.isra.43+0x50/0x90
  Process systemd-user-se (pid: 7047, threadinfo ffff880221dde000, task ffff88022618b3a0)
  Call Trace:
    select_task_rq_fair+0x255/0x780
    try_to_wake_up+0x156/0x2c0
    wake_up_state+0xb/0x10
    signal_wake_up+0x28/0x40
    complete_signal+0x1d6/0x250
    __send_signal+0x170/0x310
    send_signal+0x40/0x80
    do_send_sig_info+0x47/0x90
    group_send_sig_info+0x4a/0x70
    kill_pid_info+0x3a/0x60
    sys_kill+0x97/0x1a0
    ? vfs_read+0x120/0x160
    ? sys_read+0x45/0x90
    system_call_fastpath+0x16/0x1b
  Code: 49 0f af 41 50 31 d2 49 f7 f0 48 83 f8 01 48 0f 46 c6 48 2b 07 48 8b bf 40 01 00 00 48 85 ff 74 3a 45 31 c0 48 8b 8f 50 01 00 00 <48> 8b 11 4c 8b 89 80 00 00 00 49 89 d2 48 01 d0 45 8b 59 58 4c
  RIP  [<ffffffff81063ac0>] effective_load.isra.43+0x50/0x90
   RSP <ffff880221ddfbd8>
  CR2: 0000000000000000

Signed-off-by: Mike Galbraith <efault@gmx.de>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Yong Zhang <yong.zhang0@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: stable@vger.kernel.org # 2.6.39+
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-03 11:10:24 -08:00
Linus Torvalds d3594ea2b3 Merge branch 'block-dev'
Merge 'block-dev' branch.

I was going to just mark everything here for stable and leave it to the
3.8 merge window, but having decided on doing another -rc, I migth as
well merge it now.

This removes the bd_block_size_semaphore semaphore that was added in
this release to fix a race condition between block size changes and
block IO, and replaces it with atomicity guaratees in fs/buffer.c
instead, along with simplifying fs/block-dev.c.

This removes more lines than it adds, makes the code generally simpler,
and avoids the latency/rt issues that the block size semaphore
introduced for mount.

I'm not happy with the timing, but it wouldn't be much better doing this
during the merge window and then having some delayed back-port of it
into stable.

* block-dev:
  blkdev_max_block: make private to fs/buffer.c
  direct-io: don't read inode->i_blkbits multiple times
  blockdev: remove bd_block_size_semaphore again
  fs/buffer.c: make block-size be per-page and protected by the page lock
2012-12-03 10:53:25 -08:00
James Hogan 84ecfd15f5 modsign: add symbol prefix to certificate list
Add the arch symbol prefix (if applicable) to the asm definition of
modsign_certificate_list and modsign_certificate_list_end. This uses the
recently defined SYMBOL_PREFIX which is derived from
CONFIG_SYMBOL_PREFIX.

This fixes the build of module signing on the blackfin and metag
architectures.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: David Howells <dhowells@redhat.com>
Cc: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-12-03 13:06:25 +10:30
James Hogan cbdbf2abb7 linux/kernel.h: define SYMBOL_PREFIX
Define SYMBOL_PREFIX to be the same as CONFIG_SYMBOL_PREFIX if set by
the architecture, or "" otherwise. This avoids the need for ugly #ifdefs
whenever symbols are referenced in asm blocks.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Joe Perches <joe@perches.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Jean Delvare <khali@linux-fr.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-12-03 13:05:54 +10:30
Linus Torvalds 7e5530af11 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) 8139cp leaks memory in error paths, from Francois Romieu.

 2) do_tcp_sendpages() cannot handle order > 0 pages, but they can
    certainly arrive there now, fix from Eric Dumazet.

 3) Race condition and sysfs fixes in bonding from Nikolay Aleksandrov.

 4) Remain-on-Channel fix in mac80211 from Felix Liao.

 5) CCK rate calculation fix in iwlwifi, from Emmanuel Grumbach.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  8139cp: fix coherent mapping leak in error path.
  tcp: fix crashes in do_tcp_sendpages()
  bonding: fix race condition in bonding_store_slaves_active
  bonding: make arp_ip_target parameter checks consistent with sysfs
  bonding: fix miimon and arp_interval delayed work race conditions
  mac80211: fix remain-on-channel (non-)cancelling
  iwlwifi: fix the basic CCK rates calculation
2012-12-02 16:39:00 -08:00
Linus Torvalds 4ccc804586 Single bugfix for raid1/raid10.
Fixes a recently introduced deadlock.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iQIVAwUAULVcJjnsnt1WYoG5AQKJjw//WXClUwmYi7FD9u8OfqLKsmvJCB8rTQFL
 RQTkm/AGwXMTrmV2iifR7Hpm14wP7pbwwiFNbLv4cw8W8ldt+4PfjCRTyoSwH5HT
 +evDAxmtuTfhznGn9fWCJClmrng0+W0ir3Bmkju+u35orCx97+98Cgv4rVAeYZ0R
 TR59g10y0c1QQuLPXoe3J5iVKuXjrlW4USIGRzkaqKmSZa9LGLETLE9V/RQtcdVD
 HB+uVMEMIGfibWSa918yWRbje8tGYeFyWOrxLs6eS/MdMEWPOdYgesxMErN+8/9q
 ZPYc+wIbGdYfzn8RcuAlE10ZMkS/6eNCn/O5ztBrM9Iyztecv3TKxNzb1S9RHppZ
 ze9d7qfX5kDhwc7YPikPZlnP4CDElDjaPzb0jSyy6FwNzWV45YuC9D5n4xGPOgcC
 83ORlSzMcv6NOFZc8HjrV4NFYE4Dezm0sThFPMEkY2FfLzIztg1H5Q0k0bvfxtqa
 yzCaQtuGjMhsbcLELqHCXFNHFhBVaetuFKAPRnynnkgSDMiZVjmV5/rsapy+qBON
 4BSI7Shwq5jn1xrqVd6ylLic5nkFIGuU7jZ15VftzP3ggQxmSLuq8Q7ZOZ+cXdZ9
 bQTkZyrwsIp8qxBE9DCi33VDDcUNiSvCnTdr18XPAzJ0DQIQ4hhlmLxGGj8BV5f4
 KKOqr3RBP6I=
 =fV2f
 -----END PGP SIGNATURE-----

Merge tag 'md-3.7-fixes' of git://neil.brown.name/md

Pull md bugfix from NeilBrown:
 "Single bugfix for raid1/raid10.

  Fixes a recently introduced deadlock."

* tag 'md-3.7-fixes' of git://neil.brown.name/md:
  md/raid1{,0}: fix deadlock in bitmap_unplug.
2012-12-02 16:24:31 -08:00
Al Viro 9d73fc2d64 open*(2) compat fixes (s390, arm64)
The usual rules for open()/openat()/open_by_handle_at() are
 1) native 32bit - don't force O_LARGEFILE in flags
 2) native 64bit - force O_LARGEFILE in flags
 3) compat on 64bit host - as for native 32bit
 4) native 32bit ABI for 64bit system (mips/n32, x86/x32) - as for
    native 64bit

There are only two exceptions - s390 compat has open() forcing
O_LARGEFILE and arm64 compat has open_by_handle_at() doing the same
thing.  The same binaries on native host (s390/31 and arm resp.) will
*not* force O_LARGEFILE, so IMO both are emulation bugs.

Objections? The fix is obvious...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-02 10:46:38 -08:00
Linus Torvalds 3c46f3d640 Merge branch 'for-3.7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Pull  late workqueue fixes from Tejun Heo:
 "Unfortunately, I have two really late fixes.  One was for a
  long-standing bug and queued for 3.8 but I found out about a
  regression introduced during 3.7-rc1 two days ago, so I'm sending out
  the two fixes together.

  The first (long-standing) one is rescuer_thread() entering exit path
  w/ TASK_INTERRUPTIBLE.  It only triggers on workqueue destructions
  which isn't very frequent and the exit path can usually survive being
  called with TASK_INTERRUPT, so it was hidden pretty well.  Apparently,
  if you're reiserfs, this could lead to the exiting kthread sleeping
  indefinitely holding a mutex, which is never good.

  The fix is simple - restoring TASK_RUNNING before returning from the
  kthread function.

  The second one is introduced by the new mod_delayed_work().
  mod_delayed_work() was missing special case handling for 0 delay.
  Instead of queueing the work item immediately, it queued the timer
  which expires on the closest next tick.  Some users of the new
  function converted from "[__]cancel_delayed_work() +
  queue_delayed_work()" combination became unhappy with the extra delay.

  Block unplugging led to noticeably higher number of context switches
  and intel 6250 wireless failed to associate with WPA-Enterprise
  network.  The fix, again, is fairly simple.  The 0 delay special case
  logic from queue_delayed_work_on() should be moved to
  __queue_delayed_work() which is shared by both queue_delayed_work_on()
  and mod_delayed_work_on().

  The first one is difficult to trigger and the failure mode for the
  latter isn't completely catastrophic, so missing these two for 3.7
  wouldn't make it a disastrous release, but both bugs are nasty and the
  fixes are fairly safe"

* 'for-3.7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: mod_delayed_work_on() shouldn't queue timer on 0 delay
  workqueue: exit rescuer_thread() as TASK_RUNNING
2012-12-01 17:55:13 -08:00
françois romieu 892a925e42 8139cp: fix coherent mapping leak in error path.
cp_open
[...]
        rc = cp_alloc_rings(cp);
        if (rc)
                return rc;

cp_alloc_rings
[...]
        mem = dma_alloc_coherent(&cp->pdev->dev, CP_RING_BYTES,
                                 &cp->ring_dma, GFP_KERNEL);

- cp_alloc_rings never frees the coherent mapping it allocates
- neither do cp_open when cp_alloc_rings fails

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-01 20:39:17 -05:00