Commit graph

344450 commits

Author SHA1 Message Date
David Howells 0c6e686ce4 MN10300: Get rid of unused variable from ASB2305 PCI code
Get rid of an unused variable in pcibios_fixup_device_resources() which leads
to the following warning:

arch/mn10300/unit-asb2305/pci.c: In function 'pcibios_fixup_device_resources':
arch/mn10300/unit-asb2305/pci.c:324:24: warning: unused variable 'region' [-Wunused-variable]

Whilst we're at it, merge the two integer variable declarations into one line.

Signed-off-by: David Howells <dhowells@redhat.com>
2012-12-12 15:46:15 +00:00
David Howells 00a2f91518 MN10300: ASB2305 PCI code needs linux/irq.h
ASB2305 PCI code needs to #include linux/irq.h for XIRQ1 so that it can set
the CPU interrupt priority level on the PCI interrupt.

Signed-off-by: David Howells <dhowells@redhat.com>
2012-12-12 15:46:15 +00:00
Kautuk Consul 3d7b6a674e mn10300/mm/fault.c: Port OOM changes to do_page_fault
Commit d065bd810b
(mm: retry page fault when blocking on disk transfer) and
commit 37b23e0525
(x86,mm: make pagefault killable)

The above commits introduced changes into the x86 pagefault handler
for making the page fault handler retryable as well as killable.

These changes reduce the mmap_sem hold time, which is crucial
during OOM killer invocation.

Port these changes to mn10300.

Signed-off-by: Kautuk Consul <consul.kautuk@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2012-12-12 15:46:15 +00:00
David Howells 83c2dc15ce MN10300: Handle cacheable PCI regions in pci_iomap()
Handle cacheable PCI regions in pci_iomap().  If IORESOURCE_CACHEABLE is set
then we AND away the 0x20000000 "flag".

Signed-off-by: David Howells <dhowells@redhat.com>
2012-12-12 15:46:14 +00:00
Mark Salter 0369c360e5 MN10300: fix debug polling in ttySM driver
The debug polling interface for the SoC serial ports did not work in the case
where the serial ports were not also used as a console. In that case, the
uart driver startup function will not be called so tx and rx would not be
enabled in the hardware control register. Also, vdma interrupts would not be
enabled which the poll_get_char function relied on. This patch makes sure that
the rx and tx enables are set as a consequence of the uart set_termios call
which is the only initialization done for the debug polling interface. Also,
the poll_get_char now handles the case where vdma interrupts are not enabled.

Signed-off-by: Mark Salter <msalter@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2012-12-12 15:46:14 +00:00
Mark Salter 97a70b1439 MN10300: ttySM: clean up unnecessary casting
The ttySM uart data register pointers are declared as void* pointers. Change
them to u8* pointers so we don't need to use casts in the code.

Signed-off-by: Mark Salter <msalter@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2012-12-12 15:46:14 +00:00
Mark Salter 8f0bcbcab0 MN10300: fix SMP synchronization between txdma and serial driver
The SoC serial port driver uses a high priority interrupt to handle tx of
characters in the tx ring buffer. The driver needs to disable/enable this IRQ
from outside of irq context. The original code to do this is not foolproof on
SMP machines because the IRQ running on one core could still access the serial
port for a short time after the driver running on another core disables the
interrupt. This patch adds a flag to tell the IRQ handler that the driver
wants to disable the interrupt. After seeing the flag, the IRQ handler will
immediately disable the interrupt and exit. After setting the flag, the driver
will wait for interrupt to be disabled by the IRQ handler.

Signed-off-by: Mark Salter <msalter@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2012-12-12 15:46:14 +00:00
Mark Salter 8d160027ff MN10300: fix serial port vdma irq setup for SMP
The builtin SoC serial ports have no FIFOs and use a virtual DMA mechanism
based on high priority IRQs to avoid overruns. These high priority interrupts
are pinned to cpu#0 on SMP systems. This patch fixes a problem with SMP where
the set_intr_level() interface is used to set the priority for these IRQs. The
set_intr_level() function sets priority on the local cpu but on SMP systems,
this code may be run on some other cpu than the one handling the interrupts.
Instead of setting interrupt level explicitly, this patch uses a special
irq_chip for these interrupts so that the mask/unmask methods can set the
interrupt level implicitly.

Signed-off-by: Mark Salter <msalter@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2012-12-12 15:46:14 +00:00
Mark Salter 7d361cb754 MN10300: cleanup IRQ affinity setting
The irq_set_affinity handler for the mn10300 cpu pic had some hard-coded IRQs
which were not to be migrated from one cpu to another. This patch cleans those
up by using a combination of IRQF_NOBALANCING and specialized irq chips with
no irq_set_affinity handler. This maintains the previous behavior by using
generic IRQ interfaces rather than hard coding IRQ numbers in the default
irq_set_affinity handler.

Signed-off-by: Mark Salter <msalter@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2012-12-12 15:46:14 +00:00
David Howells c98c406eb2 MN10300: ttySM: Use memory barriers correctly in circular buffer logic
Use memory barriers correctly in the circular buffer logic used in the driver,
as documented in Documentation/circular-buffers.txt.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Mark Salter <msalter@redhat.com>
2012-12-12 15:46:14 +00:00
Linus Torvalds d07e43d70e Merge branch 'omap-serial' of git://git.linaro.org/people/rmk/linux-arm
Pull ARM OMAP serial updates from Russell King:
 "This series is a major reworking of the OMAP serial driver code fixing
  various bugs in the hardware-assisted flow control, extending up into
  serial_core for a couple of issues.  These fixes have been done as a
  set of progressive changes and transformations in the hope that no new
  bugs will be introduced by this series.

  The problems are many-fold, from the driver not being informed about
  updated settings, to the driver not knowing what the intentions of the
  upper layers are.

  The first four patches tackle the serial_core layer, allowing it to
  provide the necessary information to drivers, and the remaining
  patches allow the OMAP serial driver to take advantage of this.

  This brings hardware assisted RTS/CTS and XON/OFF flow control into a
  useful state.

  These patches have been in linux-next for most of the last cycle;
  indeed they predate the previous merge window.  They've also been
  posted to the OMAP people."

* 'omap-serial' of git://git.linaro.org/people/rmk/linux-arm: (21 commits)
  SERIAL: omap: fix hardware assisted flow control
  SERIAL: omap: simplify (2)
  SERIAL: omap: move xon/xoff setting earlier
  SERIAL: omap: always set TCR
  SERIAL: omap: simplify
  SERIAL: omap: don't read back LCR/MCR/EFR
  SERIAL: omap: serial_omap_configure_xonxoff() contents into set_termios
  SERIAL: omap: configure xon/xoff before setting modem control lines
  SERIAL: omap: remove OMAP_UART_SYSC_RESET and OMAP_UART_FIFO_CLR
  SERIAL: omap: move driver private definitions and structures to driver
  SERIAL: omap: remove 'irq_pending' bitfield
  SERIAL: omap: fix MCR TCRTLR bit handling
  SERIAL: omap: fix set_mctrl() breakage
  SERIAL: omap: no need to re-read EFR
  SERIAL: omap: remove setting of EFR SCD bit
  SERIAL: omap: allow hardware assisted IXANY mode to be disabled
  SERIAL: omap: allow hardware assisted rts/cts modes to be disabled
  SERIAL: core: add throttle/unthrottle callbacks for hardware assisted flow control
  SERIAL: core: add hardware assisted h/w flow control support
  SERIAL: core: add hardware assisted s/w flow control support
  ...

Conflicts:
	drivers/tty/serial/omap-serial.c
2012-12-12 07:45:16 -08:00
Takashi Iwai 6eb827d235 ALSA: hda - Move runtime PM check to runtime_idle callback
The runtime_idle callback is the right place to check the suspend
capability, but currently we do it wrongly in the runtime_suspend
callback.  This leads to a kernel error message like:
   pci_pm_runtime_suspend(): azx_runtime_suspend+0x0/0x50 [snd_hda_intel] returns -11
and the runtime PM core would even repeat the attempts.

Reported-and-tested-by: Borislav Petkov <bp@alien8.de>
Cc: <stable@vger.kernel.org> [v3.7]
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2012-12-12 14:22:13 +01:00
Takashi Iwai 63a077e276 ALSA: hda - Add stereo-dmic fixup for Acer Aspire One 522
Acer Aspire One 522 has the infamous digital mic unit that needs the
phase inversion fixup for stereo.

Bugzilla: https://bugzilla.novell.com/show_bug.cgi?id=715737

Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2012-12-12 12:12:55 +01:00
Takashi Iwai c5c215232d ALSA: hda - Avoid doubly suspend after vga switcheroo
The HD-audio driver artificially calls the suspend and the resume code
path in the VGA switcheroo state changes.  When a machine goes to
suspend, it tries to suspend the device again, and it stalls at
snd_power_wait().

This patch adds checks whether the devices were already in (forced)
suspend in PM callbacks for avoiding the doubly suspend.

Reported-by: Daniel J Blueman <daniel@quora.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2012-12-12 11:32:56 +01:00
Denis Washington 1d31affbef ALSA: usb-audio: Enable S/PDIF on the ASUS Xonar U3
The only required change is to extend the existing Xonar U1
mixer quirks to the U3, which seems to be controlled the same
way.

Signed-off-by: Denis Washington <denisw@online.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2012-12-12 11:32:54 +01:00
Takashi Iwai cc5ede3efd ALSA: hda - Check validity of CORB/RIRB WP reads
When the HD-audio controller is disabled (e.g. via vga switcheroo) but
the driver is still accessing it, it spews floods of "spurious
response" kernel messages.  It's because CORB/RIRB WP reads 0xff, and
the driver tries to fill up until this number.

This patch changes the CORB/RIRB WP reads to word instead of byte, and
add the check of the read value.  If it's 0xffff, the controller is
supposed to be disabled, so the further action will be skipped.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
2012-12-12 11:32:33 +01:00
Mengdong Lin fa348da53b ALSA: hda - use usleep_range in link reset and change timeout check
Reducing the time on HDA link reset can help to reduce the driver loading
time. So we replace msleep with usleep_range to get more accurate time
control and change the value to a smaller one. And a 100ms timeout is set
for both entering and exiting the link reset.

Signed-off-by: Xingchao Wang <xingchao.wang@intel.com>
Signed-off-by: Mengdong Lin <mengdong.lin@intel.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2012-12-12 11:03:12 +01:00
Zhang Rui 1f53ef17d3 Thermal: Fix DEFAULT_THERMAL_GOVERNOR
Fix DEFAULT_THERMAL_GOVERNOR to be consistant with the
default governor selected in kernel config file.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2012-12-12 15:34:48 +08:00
Zhang Rui d567c686ae Thermal: fix a NULL pointer dereference when generic thermal layer is built as a module
[   12.761956] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
[   12.762016] IP: [<ffffffffa0005277>] handle_thermal_trip+0x47/0x130 [thermal_sys]
[   12.762060] PGD 1fec74067 PUD 1fee5b067 PMD 0
[   12.762127] Oops: 0000 [#1] SMP
[   12.762177] Modules linked in: hid_generic crc32c_intel usbhid hid firewire_ohci(+) e1000e(+) firewire_core crc_itu_t xhci_hcd(+) thermal(+) fan thermal_sys hwmon
[   12.762423] CPU 1
[   12.762443] Pid: 187, comm: modprobe Tainted: G       A     3.7.0-thermal-module+ #25                  /DH77DF
[   12.762496] RIP: 0010:[<ffffffffa0005277>]  [<ffffffffa0005277>] handle_thermal_trip+0x47/0x130 [thermal_sys]
[   12.762682] RSP: 0018:ffff8801fe7ddc18  EFLAGS: 00010282
[   12.762704] RAX: 0000000000000000 RBX: ffff8801ff3e9c00 RCX: ffff8801fdc39800
[   12.762728] RDX: ffff8801fe7ddc24 RSI: 0000000000000001 RDI: ffff8801ff3e9c00
[   12.762764] RBP: ffff8801fe7ddc48 R08: 0000000004000000 R09: ffffffffa001f568
[   12.762797] R10: ffffffff81363083 R11: 0000000000000001 R12: 0000000000000001
[   12.762832] R13: 0000000000000000 R14: 0000000000000001 R15: ffff8801fde73e68
[   12.762866] FS:  00007f5548516700(0000) GS:ffff88021f240000(0000) knlGS:0000000000000000
[   12.762912] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   12.762946] CR2: 0000000000000018 CR3: 00000001fefe2000 CR4: 00000000001407e0
[   12.762979] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   12.763014] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[   12.763048] Process modprobe (pid: 187, threadinfo ffff8801fe7dc000, task ffff8801fe5bdb40)
[   12.763095] Stack:
[   12.763122]  0000000000019640 00000000fdc39800 ffff8801fe7ddc48 ffff8801ff3e9c00
[   12.763225]  0000000000000002 0000000000000000 ffff8801fe7ddc78 ffffffffa00053e7
[   12.763338]  ffff8801ff3e9c00 0000000000006c98 ffffffffa0007480 ffff8801ff3e9c00
[   12.763440] Call Trace:
[   12.763470]  [<ffffffffa00053e7>] thermal_zone_device_update+0x77/0xa0 [thermal_sys]
[   12.763515]  [<ffffffffa0006d38>] thermal_zone_device_register+0x788/0xa88 [thermal_sys]
[   12.763562]  [<ffffffffa001f394>] acpi_thermal_add+0x360/0x4c8 [thermal]
[   12.763598]  [<ffffffff8133902a>] acpi_device_probe+0x50/0x190
[   12.763632]  [<ffffffff811bd793>] ? sysfs_create_link+0x13/0x20
[   12.763666]  [<ffffffff813cc41b>] driver_probe_device+0x7b/0x240
[   12.763699]  [<ffffffff813cc68b>] __driver_attach+0xab/0xb0
[   12.763732]  [<ffffffff813cc5e0>] ? driver_probe_device+0x240/0x240
[   12.763766]  [<ffffffff813ca836>] bus_for_each_dev+0x56/0x90
[   12.763799]  [<ffffffff813cbf4e>] driver_attach+0x1e/0x20
[   12.763831]  [<ffffffff813cbac0>] bus_add_driver+0x190/0x290
[   12.763864]  [<ffffffffa0022000>] ? 0xffffffffa0021fff
[   12.763896]  [<ffffffff813ccbea>] driver_register+0x7a/0x160
[   12.763928]  [<ffffffffa0022000>] ? 0xffffffffa0021fff
[   12.763960]  [<ffffffff813399fb>] acpi_bus_register_driver+0x43/0x45
[   12.763995]  [<ffffffffa002203a>] acpi_thermal_init+0x3a/0x42 [thermal]
[   12.764029]  [<ffffffff8100207f>] do_one_initcall+0x3f/0x170
[   12.764063]  [<ffffffff810b1a5f>] sys_init_module+0x8f/0x200
[   12.764097]  [<ffffffff815ff259>] system_call_fastpath+0x16/0x1b
[   12.764129] Code: 48 8b 87 c8 02 00 00 41 89 f4 48 8d 55 dc ff 50 28 44 8b 6d dc 41 8d 45 fe 83 f8 01 76 5e 48 8b 83 d8 02 00 00 44 89 e6 48 89 df <ff> 50 18 4c 8d a3 10 03 00 00 4c 89 e7 e8 87 f1 5e e1 8b 83 bc
[   12.765164] RIP  [<ffffffffa0005277>] handle_thermal_trip+0x47/0x130 [thermal_sys]
[   12.765223]  RSP <ffff8801fe7ddc18>
[   12.765252] CR2: 0000000000000018
[   12.765284] ---[ end trace 7723294cdfb00d2a ]---

This is because thermal_zone_device_update() is invoked before
any thermal governors being registered.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2012-12-12 15:23:54 +08:00
Anton Vorontsov 76d8a23b12 Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux
The merge is merely to fix conflicts before sending a pull request.

Conflicts:
	drivers/power/ab8500_btemp.c
	drivers/power/ab8500_charger.c
	drivers/power/ab8500_fg.c
	drivers/power/abx500_chargalg.c
	drivers/power/max8925_power.c

Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
2012-12-11 22:15:57 -08:00
Eric Dumazet 1abbe1394a pkt_sched: avoid requeues if possible
With BQL being deployed, we can more likely have following behavior :

We dequeue a packet from qdisc in dequeue_skb(), then we realize target
tx queue is in XOFF state in sch_direct_xmit(), and we have to hold the
skb into gso_skb for later.

This shows in stats (tc -s qdisc dev eth0) as requeues.

Problem of these requeues is that high priority packets can not be
dequeued as long as this (possibly low prio and big TSO packet) is not
removed from gso_skb.

At 1Gbps speed, a full size TSO packet is 500 us of extra latency.

In some cases, we know that all packets dequeued from a qdisc are
for a particular and known txq :

- If device is non multi queue
- For all MQ/MQPRIO slave qdiscs

This patch introduces a new qdisc flag, TCQ_F_ONETXQUEUE to mark
this capability, so that dequeue_skb() is allowed to dequeue a packet
only if the associated txq is not stopped.

This indeed reduce latencies for high prio packets (or improve fairness
with sfq/fq_codel), and almost remove qdisc 'requeues'.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-12 00:16:47 -05:00
David Woodhouse cae49ede00 solos-pci: fix double-free of TX skb in DMA mode
We weren't clearing card->tx_skb[port] when processing the TX done interrupt.
If there wasn't another skb ready to transmit immediately, this led to a
double-free because we'd free it *again* next time we did have a packet to
send.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: stable@kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-12 00:16:47 -05:00
Linus Torvalds 1ebaf4f4e6 Merge branch 'x86-timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 timer update from Ingo Molnar:
 "This tree includes HPET fixes and also implements a calibration-free,
  TSC match driven APIC timer interrupt mode: 'TSC deadline mode'
  supported in SandyBridge and later CPUs."

* 'x86-timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: hpet: Fix inverted return value check in arch_setup_hpet_msi()
  x86: hpet: Fix masking of MSI interrupts
  x86: apic: Use tsc deadline for oneshot when available
2012-12-11 20:01:33 -08:00
Linus Torvalds 743aa456c1 Merge branch 'x86-nuke386-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull "Nuke 386-DX/SX support" from Ingo Molnar:
 "This tree removes ancient-386-CPUs support and thus zaps quite a bit
  of complexity:

    24 files changed, 56 insertions(+), 425 deletions(-)

  ... which complexity has plagued us with extra work whenever we wanted
  to change SMP primitives, for years.

  Unfortunately there's a nostalgic cost: your old original 386 DX33
  system from early 1991 won't be able to boot modern Linux kernels
  anymore.  Sniff."

I'm not sentimental.  Good riddance.

* 'x86-nuke386-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, 386 removal: Document Nx586 as a 386 and thus unsupported
  x86, cleanups: Simplify sync_core() in the case of no CPUID
  x86, 386 removal: Remove CONFIG_X86_POPAD_OK
  x86, 386 removal: Remove CONFIG_X86_WP_WORKS_OK
  x86, 386 removal: Remove CONFIG_INVLPG
  x86, 386 removal: Remove CONFIG_BSWAP
  x86, 386 removal: Remove CONFIG_XADD
  x86, 386 removal: Remove CONFIG_CMPXCHG
  x86, 386 removal: Remove CONFIG_M386 from Kconfig
2012-12-11 19:59:32 -08:00
Linus Torvalds a05a4e24dc Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 topology discovery improvements from Ingo Molnar:
 "These changes improve topology discovery on AMD CPUs.

  Right now this feeds information displayed in
  /sys/devices/system/cpu/cpuX/cache/indexY/* - but in the future we
  could use this to set up a better scheduling topology."

* 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, cacheinfo: Base cache sharing info on CPUID 0x8000001d on AMD
  x86, cacheinfo: Make use of CPUID 0x8000001d for cache information on AMD
  x86, cacheinfo: Determine number of cache leafs using CPUID 0x8000001d on AMD
  x86: Add cpu_has_topoext
2012-12-11 19:58:29 -08:00
Linus Torvalds e9a5a91971 Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cleanups from Ingo Molnar:
 "Small cleanups."

* 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: Fix the error of using "const" in gen-insn-attr-x86.awk
  x86, apic: Cleanup cfg->domain setup for legacy interrupts
  x86: Remove dead hlt_use_halt code
2012-12-11 19:57:56 -08:00
Linus Torvalds 74b8423345 Merge branch 'x86-bsp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 BSP hotplug changes from Ingo Molnar:
 "This tree enables CPU#0 (the boot processor) to be onlined/offlined on
  x86, just like any other CPU.  Enabled on Intel CPUs for now.

  Allowing this required the identification and fixing of latent CPU#0
  assumptions (such as CPU#0 initializations, etc.) in the x86
  architecture code, plus the identification of barriers to
  BSP-offlining, such as active PIC interrupts which can only be
  serviced on the BSP.

  It's behind a default-off option, and there's a debug option that
  allows the automatic testing of this feature.

  The motivation of this feature is to allow and prepare for true
  CPU-hotplug hardware support: recent changes to MCE support enable us
  to detect a deteriorating but not yet hard-failing L1/L2 cache on a
  CPU that could be soft-unplugged - or a failing L3 cache on a
  multi-socket system.

  Note that true hardware hot-plug is not yet fully enabled by this,
  because that requires a special platform wakeup sequence to be sent to
  the freshly powered up CPU#0.  Future patches for this are planned,
  once such a platform exists.  Chicken and egg"

* 'x86-bsp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, topology: Debug CPU0 hotplug
  x86/i387.c: Initialize thread xstate only on CPU0 only once
  x86, hotplug: Handle retrigger irq by the first available CPU
  x86, hotplug: The first online processor saves the MTRR state
  x86, hotplug: During CPU0 online, enable x2apic, set_numa_node.
  x86, hotplug: Wake up CPU0 via NMI instead of INIT, SIPI, SIPI
  x86-32, hotplug: Add start_cpu0() entry point to head_32.S
  x86-64, hotplug: Add start_cpu0() entry point to head_64.S
  kernel/cpu.c: Add comment for priority in cpu_hotplug_pm_callback
  x86, hotplug, suspend: Online CPU0 for suspend or hibernate
  x86, hotplug: Support functions for CPU0 online/offline
  x86, topology: Don't offline CPU0 if any PIC irq can not be migrated out of it
  x86, Kconfig: Add config switch for CPU0 hotplug
  doc: Add x86 CPU0 online/offline feature
2012-12-11 19:56:33 -08:00
Linus Torvalds 5074474737 Merge branch 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 boot changes from Ingo Molnar:
 "Two small changes: a cleanup and allow CONFIG_X86_MPPARSE to be turned
  off on SFI as well."

* 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  arch/x86/Kconfig: Allow turning off CONFIG_X86_MPPARSE when either ACPI or SFI is present
  x86/boot/doc: Fix grammar and typo in boot.txt
2012-12-11 19:55:58 -08:00
Linus Torvalds 0019fab355 Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 asm changes from Ingo Molnar:
 "Two fixlets and a cleanup."

* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86_32: Return actual stack when requesting sp from regs
  x86: Don't clobber top of pt_regs in nested NMI
  x86/asm: Clean up copy_page_*() comments and code
2012-12-11 19:55:20 -08:00
Michael Chan fda4d85d61 bnx2: Fix accidental reversions.
Commit 4ce45e0246
"bnx2: Add BNX2 prefix to CHIP ID and name macros"

accidentally reverted 2 commits to use pci_ioumap() and to make
pci_error_handlers const.  This fixes those mistakes.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-11 21:28:25 -05:00
Linus Torvalds b64c5fda38 Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core timer changes from Ingo Molnar:
 "It contains continued generic-NOHZ work by Frederic and smaller
  cleanups."

* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  time: Kill xtime_lock, replacing it with jiffies_lock
  clocksource: arm_generic: use this_cpu_ptr per-cpu helper
  clocksource: arm_generic: use integer math helpers
  time/jiffies: Make clocksource_jiffies static
  clocksource: clean up parse_pmtmr()
  tick: Correct the comments for tick_sched_timer()
  tick: Conditionally build nohz specific code in tick handler
  tick: Consolidate tick handling for high and low res handlers
  tick: Consolidate timekeeping handling code
2012-12-11 18:22:46 -08:00
Linus Torvalds f57d54bab6 Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
 "The biggest change affects group scheduling: we now track the runnable
  average on a per-task entity basis, allowing a smoother, exponential
  decay average based load/weight estimation instead of the previous
  binary on-the-runqueue/off-the-runqueue load weight method.

  This will inevitably disturb workloads that were in some sort of
  borderline balancing state or unstable equilibrium, so an eye has to
  be kept on regressions.

  For that reason the new load average is only limited to group
  scheduling (shares distribution) at the moment (which was also hurting
  the most from the prior, crude weight calculation and whose scheduling
  quality wins most from this change) - but we plan to extend this to
  regular SMP balancing as well in the future, which will simplify and
  speed up things a bit.

  Other changes involve ongoing preparatory work to extend NOHZ to the
  scheduler as well, eventually allowing completely irq-free user-space
  execution."

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (33 commits)
  Revert "sched/autogroup: Fix crash on reboot when autogroup is disabled"
  cputime: Comment cputime's adjusting code
  cputime: Consolidate cputime adjustment code
  cputime: Rename thread_group_times to thread_group_cputime_adjusted
  cputime: Move thread_group_cputime() to sched code
  vtime: Warn if irqs aren't disabled on system time accounting APIs
  vtime: No need to disable irqs on vtime_account()
  vtime: Consolidate a bit the ctx switch code
  vtime: Explicitly account pending user time on process tick
  vtime: Remove the underscore prefix invasion
  sched/autogroup: Fix crash on reboot when autogroup is disabled
  cputime: Separate irqtime accounting from generic vtime
  cputime: Specialize irq vtime hooks
  kvm: Directly account vtime to system on guest switch
  vtime: Make vtime_account_system() irqsafe
  vtime: Gather vtime declarations to their own header file
  sched: Describe CFS load-balancer
  sched: Introduce temporary FAIR_GROUP_SCHED dependency for load-tracking
  sched: Make __update_entity_runnable_avg() fast
  sched: Update_cfs_shares at period edge
  ...
2012-12-11 18:21:38 -08:00
Steven Rostedt e1a6c3d748 ktest: Test if target machine is up before install
Sometimes a test kernel will crash or hang on reboot (this is even more
apparent when testing a config without CGROUPS on a box running
systemd). When this happens, on the next iteration of installing a
kernel, ktest will fail when it tries to install.

Have ktest do a check to see if the target can be connected to via ssh
before it tries to install. If it can't connect, then reboot again.
This time the reboot will fail because it can't connect and will force a
power cycle.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-12-11 21:19:41 -05:00
Linus Torvalds da830e589a Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "These are late-v3.7 pending fixes for tracing."

Fix up trivial conflict in kernel/trace/ring_buffer.c: the NULL pointer
fix clashed with the change of type of the 'ret' variable.

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  ring-buffer: Fix race between integrity check and readers
  ring-buffer: Fix NULL pointer if rb_set_head_page() fails
  ftrace: Clear bits properly in reset_iter_read()
2012-12-11 18:18:58 -08:00
Linus Torvalds 090f8ccba3 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
 "Lots of activity:

   211 files changed, 8328 insertions(+), 4116 deletions(-)

  most of it on the tooling side.

  Main changes:

   * ftrace enhancements and fixes from Steve Rostedt.

   * uprobes fixes, cleanups and preparation for the ARM port from Oleg
     Nesterov.

   * UAPI fixes, from David Howels - prepares the arch/x86 UAPI
     transition

   * Separate perf tests into multiple objects, one per test, from Jiri
     Olsa.

   * Make hardware event translations available in sysfs, from Jiri
     Olsa.

   * Fixes to /proc/pid/maps parsing, preparatory to supporting data
     maps, from Namhyung Kim

   * Implement ui_progress for GTK, from Namhyung Kim

   * Add framework for automated perf_event_attr tests, where tools with
     different command line options will be run from a 'perf test', via
     python glue, and the perf syscall will be intercepted to verify
     that the perf_event_attr fields set by the tool are those expected,
     from Jiri Olsa

   * Add a 'link' method for hists, so that we can have the leader with
     buckets for all the entries in all the hists.  This new method is
     now used in the default 'diff' output, making the sum of the
     'baseline' column be 100%, eliminating blind spots.

   * libtraceevent fixes for compiler warnings trying to make perf it
     build on some distros, like fedora 14, 32-bit, some of the warnings
     really pointed to real bugs.

   * Add a browser for 'perf script' and make it available from the
     report and annotate browsers.  It does filtering to find the
     scripts that handle events found in the perf.data file used.  From
     Feng Tang

   * perf inject changes to allow showing where a task sleeps, from
     Andrew Vagin.

   * Makefile improvements from Namhyung Kim.

   * Add --pre and --post command hooks in 'stat', from Peter Zijlstra.

   * Don't stop synthesizing threads when one vanishes, this is for the
     existing threads when we start a tool like trace.

   * Use sched:sched_stat_runtime to provide a thread summary, this
     produces the same output as the 'trace summary' subcommand of
     tglx's original "trace" tool.

   * Support interrupted syscalls in 'trace'

   * Add an event duration column and filter in 'trace'.

   * There are references to the man pages in some tools, so try to
     build Documentation when installing, warning the user if that is
     not possible, from Borislav Petkov.

   * Give user better message if precise is not supported, from David
     Ahern.

   * Try to find cross-built objdump path by using the session
     environment information in the perf.data file header, from Irina
     Tirdea, original patch and idea by Namhyung Kim.

   * Diplays more output on features check for make V=1, so that one can
     figure out what is happening by looking at gcc output, etc.  From
     Jiri Olsa.

   * Add on_exit implementation for systems without one, e.g.  Android,
     from Bernhard Rosenkraenzer.

   * Only process events for vcpus of interest, helps handling large
     number of events, from David Ahern.

   * Cross compilation fixes for Android, from Irina Tirdea.

   * Add documentation on compiling for Android, from Irina Tirdea.

   * perf diff improvements from Jiri Olsa.

   * Target (task/user/cpu/syswide) handling improvements, from Namhyung
     Kim.

   * Add support in 'trace' for tracing workload given by command line,
     from Namhyung Kim.

   * ... and much more."

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (194 commits)
  uprobes: Use percpu_rw_semaphore to fix register/unregister vs dup_mmap() race
  perf evsel: Introduce is_group_member method
  perf powerpc: Use uapi/unistd.h to fix build error
  tools: Pass the target in descend
  tools: Honour the O= flag when tool build called from a higher Makefile
  tools: Define a Makefile function to do subdir processing
  perf ui: Always compile browser setup code
  perf ui: Add ui_progress__finish()
  perf ui gtk: Implement ui_progress functions
  perf ui: Introduce generic ui_progress helper
  perf ui tui: Move progress.c under ui/tui directory
  perf tools: Add basic event modifier sanity check
  perf tools: Omit group members from perf_evlist__disable/enable
  perf tools: Ensure single disable call per event in record comand
  perf tools: Fix 'disabled' attribute config for record command
  perf tools: Fix attributes for '{}' defined event groups
  perf tools: Use sscanf for parsing /proc/pid/maps
  perf tools: Add gtk.<command> config option for launching GTK browser
  perf tools: Fix compile error on NO_NEWT=1 build
  perf hists: Initialize all of he->stat with zeroes
  ...
2012-12-11 18:14:31 -08:00
Linus Torvalds aefb058b0c Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Ingo Molnar:
 "Affinity fixes and a nested threaded IRQ handling fix."

* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq: Always force thread affinity
  irq: Set CPU affinity right on thread creation
  genirq: Provide means to retrigger parent
2012-12-11 18:12:06 -08:00
Linus Torvalds 37ea95a959 Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU update from Ingo Molnar:
 "The major features of this tree are:

     1. A first version of no-callbacks CPUs.  This version prohibits
        offlining CPU 0, but only when enabled via CONFIG_RCU_NOCB_CPU=y.
        Relaxing this constraint is in progress, but not yet ready
        for prime time.  These commits were posted to LKML at
        https://lkml.org/lkml/2012/10/30/724.

     2. Changes to SRCU that allows statically initialized srcu_struct
        structures.  These commits were posted to LKML at
        https://lkml.org/lkml/2012/10/30/296.

     3. Restructuring of RCU's debugfs output.  These commits were posted
        to LKML at https://lkml.org/lkml/2012/10/30/341.

     4. Additional CPU-hotplug/RCU improvements, posted to LKML at
        https://lkml.org/lkml/2012/10/30/327.
        Note that the commit eliminating __stop_machine() was judged to
        be too-high of risk, so is deferred to 3.9.

     5. Changes to RCU's idle interface, most notably a new module
        parameter that redirects normal grace-period operations to
        their expedited equivalents.  These were posted to LKML at
        https://lkml.org/lkml/2012/10/30/739.

     6. Additional diagnostics for RCU's CPU stall warning facility,
        posted to LKML at https://lkml.org/lkml/2012/10/30/315.
        The most notable change reduces the
        default RCU CPU stall-warning time from 60 seconds to 21 seconds,
        so that it once again happens sooner than the softlockup timeout.

     7. Documentation updates, which were posted to LKML at
        https://lkml.org/lkml/2012/10/30/280.
        A couple of late-breaking changes were posted at
        https://lkml.org/lkml/2012/11/16/634 and
        https://lkml.org/lkml/2012/11/16/547.

     8. Miscellaneous fixes, which were posted to LKML at
        https://lkml.org/lkml/2012/10/30/309.

     9. Finally, a fix for an lockdep-RCU splat was posted to LKML
        at https://lkml.org/lkml/2012/11/7/486."

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (49 commits)
  context_tracking: New context tracking susbsystem
  sched: Mark RCU reader in sched_show_task()
  rcu: Separate accounting of callbacks from callback-free CPUs
  rcu: Add callback-free CPUs
  rcu: Add documentation for the new rcuexp debugfs trace file
  rcu: Update documentation for TREE_RCU debugfs tracing
  rcu: Reduce default RCU CPU stall warning timeout
  rcu: Fix TINY_RCU rcu_is_cpu_rrupt_from_idle check
  rcu: Clarify memory-ordering properties of grace-period primitives
  rcu: Add new rcutorture module parameters to start/end test messages
  rcu: Remove list_for_each_continue_rcu()
  rcu: Fix batch-limit size problem
  rcu: Add tracing for synchronize_sched_expedited()
  rcu: Remove old debugfs interfaces and also RCU flavor name
  rcu: split 'rcuhier' to each flavor
  rcu: split 'rcugp' to each flavor
  rcu: split 'rcuboost' to each flavor
  rcu: split 'rcubarrier' to each flavor
  rcu: Fix tracing formatting
  rcu: Remove the interface "rcudata.csv"
  ...
2012-12-11 18:10:49 -08:00
Linus Torvalds de0c276b31 Merge branches 'core-locking-for-linus' and 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull trivial fix branches from Ingo Molnar.

Cleanup in __get_key_name, and a timer comment fixlet.

* 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  lockdep: Use KSYM_NAME_LEN'ed buffer for __get_key_name()

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  timers, sched: Correct the comments for tick_sched_timer()
2012-12-11 18:09:18 -08:00
Linus Torvalds 608ff1a210 Merge branch 'akpm' (Andrew's patchbomb)
Merge misc updates from Andrew Morton:
 "About half of most of MM.  Going very early this time due to
  uncertainty over the coreautounifiednumasched things.  I'll send the
  other half of most of MM tomorrow.  The rest of MM awaits a slab merge
  from Pekka."

* emailed patches from Andrew Morton: (71 commits)
  memory_hotplug: ensure every online node has NORMAL memory
  memory_hotplug: handle empty zone when online_movable/online_kernel
  mm, memory-hotplug: dynamic configure movable memory and portion memory
  drivers/base/node.c: cleanup node_state_attr[]
  bootmem: fix wrong call parameter for free_bootmem()
  avr32, kconfig: remove HAVE_ARCH_BOOTMEM
  mm: cma: remove watermark hacks
  mm: cma: skip watermarks check for already isolated blocks in split_free_page()
  mm, oom: fix race when specifying a thread as the oom origin
  mm, oom: change type of oom_score_adj to short
  mm: cleanup register_node()
  mm, mempolicy: remove duplicate code
  mm/vmscan.c: try_to_freeze() returns boolean
  mm: introduce putback_movable_pages()
  virtio_balloon: introduce migration primitives to balloon pages
  mm: introduce compaction and migration for ballooned pages
  mm: introduce a common interface for balloon pages mobility
  mm: redefine address_space.assoc_mapping
  mm: adjust address_space_operations.migratepage() return code
  arch/sparc/kernel/sys_sparc_64.c: s/COLOUR/COLOR/
  ...
2012-12-11 18:05:37 -08:00
Steven Rostedt 1892517056 ktest: Fix breakage from change of oldnoconfig to olddefconfig
Commit fb16d891 "kconfig: replace 'oldnoconfig' with 'olddefconfig', and
keep the old name", changed ktest's default config update from
oldnoconfig to olddefconfig without adding oldnoconfig as a backup.
The make oldnoconfig works much better than its backup of:
   yes '' | make oldconfig

But due to this change, and the fact that ktest is used to build lots of
older kernels (and for bisects), it forgoes the oldnoconfig completely.

Cc: Adam Lee <adam8157@gmail.com>
Cc: Michal Marek <mmarek@suse.cz>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-12-11 20:23:22 -05:00
Lai Jiangshan 74d42d8fe1 memory_hotplug: ensure every online node has NORMAL memory
Old memory hotplug code and new online/movable may cause a online node
don't have any normal memory, but memory-management acts bad when we have
nodes which is online but don't have any normal memory.  Example: it may
cause a bound task fail on all kernel allocation and cause the task can't
create task or create other kernel object.

So we disable non-normal-memory-node here, we will enable it when we
prepared.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-11 17:22:28 -08:00
Lai Jiangshan e455a9b92d memory_hotplug: handle empty zone when online_movable/online_kernel
Make online_movable/online_kernel can empty a zone or can move memory to a
empty zone.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-11 17:22:28 -08:00
Lai Jiangshan 511c2aba8f mm, memory-hotplug: dynamic configure movable memory and portion memory
Add online_movable and online_kernel for logic memory hotplug.  This is
the dynamic version of "movablecore" & "kernelcore".

We have the same reason to introduce it as to introduce "movablecore" &
"kernelcore".  It has the same motive as "movablecore" & "kernelcore", but
it is dynamic/running-time:

o We can configure memory as kernelcore or movablecore after boot.

  Userspace workload is increased, we need more hugepage, we can't use
  "online_movable" to add memory and allow the system use more
  THP(transparent-huge-page), vice-verse when kernel workload is increase.

  Also help for virtualization to dynamic configure host/guest's memory,
  to save/(reduce waste) memory.

  Memory capacity on Demand

o When a new node is physically online after boot, we need to use
  "online_movable" or "online_kernel" to configure/portion it as we
  expected when we logic-online it.

  This configuration also helps for physically-memory-migrate.

o all benefit as the same as existed "movablecore" & "kernelcore".

o Preparing for movable-node, which is very important for power-saving,
  hardware partitioning and high-available-system(hardware fault
  management).

(Note, we don't introduce movable-node here.)

Action behavior:
When a memoryblock/memorysection is onlined by "online_movable", the kernel
will not have directly reference to the page of the memoryblock,
thus we can remove that memory any time when needed.

When it is online by "online_kernel", the kernel can use it.
When it is online by "online", the zone type doesn't changed.

Current constraints:
Only the memoryblock which is adjacent to the ZONE_MOVABLE
can be online from ZONE_NORMAL to ZONE_MOVABLE.

[akpm@linux-foundation.org: use min_t, cleanups]
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-11 17:22:28 -08:00
Lai Jiangshan fcf07d22f0 drivers/base/node.c: cleanup node_state_attr[]
use [index] = init_value
use N_xxxxx instead of hardcode.

Make it more readability and easier to add new state.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-11 17:22:28 -08:00
Joonsoo Kim 81df9bff26 bootmem: fix wrong call parameter for free_bootmem()
It is strange that alloc_bootmem() returns a virtual address and
free_bootmem() requires a physical address.  Anyway, free_bootmem()'s
first parameter should be physical address.

There are some call sites for free_bootmem() with virtual address.  So fix
them.

[akpm@linux-foundation.org: improve free_bootmem() and free_bootmem_pate() documentation]
Signed-off-by: Joonsoo Kim <js1304@gmail.com>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-11 17:22:28 -08:00
Joonsoo Kim e9b2e78c6a avr32, kconfig: remove HAVE_ARCH_BOOTMEM
There is no code for CONFIG_HAVE_ARCH_BOOTMEM, so remove it.

Signed-off-by: Joonsoo Kim <js1304@gmail.com>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-11 17:22:28 -08:00
Marek Szyprowski bc357f431c mm: cma: remove watermark hacks
Commits 2139cbe627 ("cma: fix counting of isolated pages") and
d95ea5d18e ("cma: fix watermark checking") introduced a reliable
method of free page accounting when memory is being allocated from CMA
regions, so the workaround introduced earlier by commit 49f223a9cd
("mm: trigger page reclaim in alloc_contig_range() to stabilise
watermarks") can be finally removed.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Mel Gorman <mel@csn.ul.ie>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-11 17:22:27 -08:00
Marek Szyprowski 2e30abd173 mm: cma: skip watermarks check for already isolated blocks in split_free_page()
Since commit 2139cbe627 ("cma: fix counting of isolated pages") free
pages in isolated pageblocks are not accounted to NR_FREE_PAGES counters,
so watermarks check is not required if one operates on a free page in
isolated pageblock.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Mel Gorman <mel@csn.ul.ie>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-11 17:22:27 -08:00
David Rientjes e1e12d2f31 mm, oom: fix race when specifying a thread as the oom origin
test_set_oom_score_adj() and compare_swap_oom_score_adj() are used to
specify that current should be killed first if an oom condition occurs in
between the two calls.

The usage is

	short oom_score_adj = test_set_oom_score_adj(OOM_SCORE_ADJ_MAX);
	...
	compare_swap_oom_score_adj(OOM_SCORE_ADJ_MAX, oom_score_adj);

to store the thread's oom_score_adj, temporarily change it to the maximum
score possible, and then restore the old value if it is still the same.

This happens to still be racy, however, if the user writes
OOM_SCORE_ADJ_MAX to /proc/pid/oom_score_adj in between the two calls.
The compare_swap_oom_score_adj() will then incorrectly reset the old value
prior to the write of OOM_SCORE_ADJ_MAX.

To fix this, introduce a new oom_flags_t member in struct signal_struct
that will be used for per-thread oom killer flags.  KSM and swapoff can
now use a bit in this member to specify that threads should be killed
first in oom conditions without playing around with oom_score_adj.

This also allows the correct oom_score_adj to always be shown when reading
/proc/pid/oom_score.

Signed-off-by: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: Anton Vorontsov <anton.vorontsov@linaro.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-11 17:22:27 -08:00
David Rientjes a9c58b907d mm, oom: change type of oom_score_adj to short
The maximum oom_score_adj is 1000 and the minimum oom_score_adj is -1000,
so this range can be represented by the signed short type with no
functional change.  The extra space this frees up in struct signal_struct
will be used for per-thread oom kill flags in the next patch.

Signed-off-by: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: Anton Vorontsov <anton.vorontsov@linaro.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-11 17:22:27 -08:00