Commit graph

1048 commits

Author SHA1 Message Date
Ananth N Mavinakayanahalli 9ec4b1f356 [PATCH] kprobes: fix single-step out of line - take2
Now that PPC64 has no-execute support, here is a second try to fix the
single step out of line during kprobe execution.  Kprobes on x86_64 already
solved this problem by allocating an executable page and using it as the
scratch area for stepping out of line.  Reuse that.

Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-27 15:23:52 -07:00
Linus Torvalds d3b8a1a849 Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/usb-2.6 2005-06-27 15:13:26 -07:00
Andrea Arcangeli ffaa8bd6c9 [PATCH] seccomp: tsc disable
I believe at least for seccomp it's worth to turn off the tsc, not just for
HT but for the L2 cache too.  So it's up to you, either you turn it off
completely (which isn't very nice IMHO) or I recommend to apply this below
patch.

This has been tested successfully on x86-64 against current cogito
repository (i686 compiles so I didn't bother testing ;).  People selling
the cpu through cpushare may appreciate this bit for a peace of mind.

There's no way to get any timing info anymore with this applied
(gettimeofday is forbidden of course).  The seccomp environment is
completely deterministic so it can't be allowed to get timing info, it has
to be deterministic so in the future I can enable a computing mode that
does a parallel computing for each task with server side transparent
checkpointing and verification that the output is the same from all the 2/3
seller computers for each task, without the buyer even noticing (for now
the verification is left to the buyer client side and there's no
checkpointing, since that would require more kernel changes to track the
dirty bits but it'll be easy to extend once the basic mode is finished).

Eliminating a cold-cache read of the cr4 global variable will save one
cacheline during the tlb flush while making the code per-cpu-safe at the
same time.  Thanks to Mikael Pettersson for noticing the tlb flush wasn't
per-cpu-safe.

The global tlb flush can run from irq (IPI calling do_flush_tlb_all) but
it'll be transparent to the switch_to code since the IPI won't make any
change to the cr4 contents from the point of view of the interrupted code
and since it's now all per-cpu stuff, it will not race.  So no need to
disable irqs in switch_to slow path.

Signed-off-by: Andrea Arcangeli <andrea@cpushare.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-27 15:11:44 -07:00
Benjamin Herrenschmidt 6ae3db110e [PATCH] ppc64: Add missing exports
This patch adds a couple of missing symbol exports.  flush_dcache_page is
used by the AGP driver and rtc_lock by the RTC driver.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-27 15:11:43 -07:00
Benjamin Herrenschmidt 8c8709334c [PATCH] ppc32: Remove CONFIG_PMAC_PBOOK
This patch removes CONFIG_PMAC_PBOOK (PowerBook support).  This is now
split into CONFIG_PMAC_MEDIABAY for the actual hotswap bay that some
powerbooks have, CONFIG_PM for power management related code, and just left
out of any CONFIG_* option for some generally useful stuff that can be used
on non-laptops as well.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-27 15:11:43 -07:00
Benjamin Herrenschmidt e4ee69c8c1 [PATCH] ppc32: Bump PMU interrupt priority
The Power Management Unit on PowerMacs is very sensitive to timeouts during
async message exchanges.  It uses rather crude protocol based on a shift
register with an interrupt and is almost continuously exchanging messages with
the host CPU on laptops.

This patch adds a routine to the open_pic driver to be able to select a PMU
driver so that it bumps it's interrupt priority to above the normal level.

This will allow PMU interrupts to occur while another interrupt is pending,
and thus reduce the risk of machine beeing abruptly shutdown by the PMU due to
a timeout in PMU communication caused by excessive interrupt latency.  The
problem is very rare, and usually just doesn't happen, but it is still useful
to make things even more robust.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-27 15:11:42 -07:00
Marcelo Tosatti bb16574681 [PATCH] 8xx: avoid "dcbst" misbehaviour with unpopulated TLB
The proposed _tlbie call at update_mmu_cache() is safe because:

Addresses for which update_mmu_cache() gets invocated are never inside the
static kernel virtual mapping, meaning that there is no risk for the
_tlbie() here to be thrashing the pinned entry, as Dan suspected.

The intermediate TLB state in which this bug can be triggered is not
visible by userspace or any other contexts, except the page fault handling
path.  So there is no need to worry about userspace dcbxxx users.

The other solution to this is to avoid dcbst misbehaviour in the first
place, which involves changing in-kernel "dcbst" callers to use 8xx
specific SPR's.

Summary:

On 8xx, cache control instructions (particularly "dcbst" from
flush_dcache_icache) fault as write operation if there is an unpopulated
TLB entry for the address in question.  To workaround that, we invalidate
the TLB here, thus avoiding dcbst misbehaviour.

Signed-off-by: Marcelo Tosatti <marcelo.tosatti@cyclades.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-27 15:11:42 -07:00
Yoichi Yuasa d4b3a80e39 [PATCH] mips: fixed try_to_freeze build error
arch/mips/kernel/signal.c: In function 'do_signal':
arch/mips/kernel/signal.c:460: error: too many arguments to function 'try_to_freeze'

Signed-off-by: Yoichi Yuasa <yuasa@hh.iij4u.or.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-27 15:11:42 -07:00
Kumar Gala 9c4142a133 [PATCH] ppc32: Fix compiling of sandpoint platform
Lost a curly brace in translation.  Everything is better now.

Signed-off-by: Matt McClintock <msm@freescale.com>
Signed-off-by: Kumar Gala <kumar.gala@freescale.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-27 15:11:41 -07:00
Linus Torvalds 8b789b7d7e Merge master.kernel.org:/home/rmk/linux-2.6-arm 2005-06-27 15:00:10 -07:00
David Brownell 313980c927 [PATCH] USB: omap_udc updates (mostly cleanups)
Various USB patches, mostly for portability:

  - Fifo mode 1 didn't work previously (oopsed), so now it's fixed and
    (why not) defines even more endpoints for composite devices.

  - OMAP 1710 doesn't have an internal transceiver.

  - Small PM update:  if the USB link is suspended, don't disconnect on
    entry to deep sleep.

  - Be more correct about handling zero length control reads.  OMAP
    seems to mis-handle that protocol peculiarity though; best avoided.

  - Platform device resources (for UDC and OTG controllers) now use
    physical addresses, so /proc/iomem is more consistent.

  - Minor cleanups, notably (by volume) for "sparse" NULL warnings.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
2005-06-27 14:43:41 -07:00
Jens Axboe 22e2c507c3 [PATCH] Update cfq io scheduler to time sliced design
This updates the CFQ io scheduler to the new time sliced design (cfq
v3).  It provides full process fairness, while giving excellent
aggregate system throughput even for many competing processes.  It
supports io priorities, either inherited from the cpu nice value or set
directly with the ioprio_get/set syscalls.  The latter closely mimic
set/getpriority.

This import is based on my latest from -mm.

Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-27 14:33:29 -07:00
Russell King f3bb742640 [PATCH] ARM: Update mach-types
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-06-27 14:49:10 +01:00
Russell King a013053d49 [PATCH] ARM: Move memmap freeing into init.c
It doesn't make sense for this to be in mm-armv.c now that 26-bit
ARM support is no longer integrated into arch/arm.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-06-27 14:16:47 +01:00
Russell King a343e6075a [PATCH] ARM: Move PGD kernel page table initialisation
It doesn't make sense to have the PGD kernel pointers initialisation
separate from the PGD user pointers, especially when we clean the
data cache over the whole range.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-06-27 14:08:56 +01:00
Russell King 2ea83398b7 [PATCH] ARM: Add VST idle loop call
This call allows the dynamic tick support to reprogram the timer
immediately before the CPU idles.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-06-27 14:04:05 +01:00
Russell King 99bcc05908 [PATCH] ARM: Add missed AAEC2000 file
My scripts missed committing this file.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-06-27 13:59:43 +01:00
Linus Torvalds 41b6c37326 Merge master.kernel.org:/home/rmk/linux-2.6-arm 2005-06-26 17:28:24 -07:00
Lennert Buytenhek 26799e675e [PATCH] ARM: 2757/1: remove ixdp2400_init_irq from ixdp2800 code
Patch from Lennert Buytenhek

Compiling one kernel that supports both ixdp2400 and ixdp2800 gives
an error, as a copy of the ixdp2400 irq init routing accidentally
ended up in ixdp2800.c somehow.

Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: Deepak Saxena
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-06-26 22:24:19 +01:00
Lennert Buytenhek baaf7ed179 [PATCH] ARM: 2756/1: add ixp2000 msf mapping
Patch from Lennert Buytenhek

Add a mapping for the ixp2400 and ixp2800 msf unit.  The msf is the
ixp2000's 'media and switch fabric' unit, which handles the networking
part of the chip.

Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: Deepak Saxena
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-06-26 22:24:17 +01:00
Russell King 09b8b5f843 [PATCH] ARM: Add SA_TIMER flag to timer interrupts
VST needs to know which timer handler is for the timer interrupt.
Mark all timer interrupts with the SA_TIMER flag.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-06-26 17:06:36 +01:00
Kumar Gala 7d681b23d6 [PATCH] ppc32: Fix MPC83xx IPIC external interrupt pending register offset
The pending registers for IRQ1-IRQ7 were pointing to the interrupt pending
register instead of the external one.

Signed-off-by: Tony Li <Tony.Li@freescale.com>
Signed-off-by: Kumar Gala <kumar.gala@freescale.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-26 08:43:19 -07:00
Andrew Morton bdb94f3a78 [PATCH] arm: swsusp build fix
Another swsusp fixup.

Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-26 08:43:19 -07:00
Linus Torvalds 6f0dcb72d6 Fix up try_to_freeze() usage in arch/i386/kernel/signal.c
The parentheses were missing. Noted by Pavel Machek.
2005-06-25 20:09:12 -07:00
Linus Torvalds 2031d0f586 Merge Christoph's freeze cleanup patch 2005-06-25 17:16:53 -07:00
Christoph Lameter 3e1d1d28d9 [PATCH] Cleanup patch for process freezing
1. Establish a simple API for process freezing defined in linux/include/sched.h:

   frozen(process)		Check for frozen process
   freezing(process)		Check if a process is being frozen
   freeze(process)		Tell a process to freeze (go to refrigerator)
   thaw_process(process)	Restart process
   frozen_process(process)	Process is frozen now

2. Remove all references to PF_FREEZE and PF_FROZEN from all
   kernel sources except sched.h

3. Fix numerous locations where try_to_freeze is manually done by a driver

4. Remove the argument that is no longer necessary from two function calls.

5. Some whitespace cleanup

6. Clear potential race in refrigerator (provides an open window of PF_FREEZE
   cleared before setting PF_FROZEN, recalc_sigpending does not check
   PF_FROZEN).

This patch does not address the problem of freeze_processes() violating the rule
that a task may only modify its own flags by setting PF_FREEZE. This is not clean
in an SMP environment. freeze(process) is therefore not SMP safe!

Signed-off-by: Christoph Lameter <christoph@lameter.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 17:10:13 -07:00
Christophe Lucas f90e7185ee [PATCH] printk: arch/i386/mm/pgtable.c
printk() calls should include appropriate KERN_* constant.

Signed-off-by: Christophe Lucas <clucas@rotomalug.org>
Signed-off-by: Domen Puncer <domen@coderock.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:25:08 -07:00
Christophe Lucas ee48dd5799 [PATCH] printk: arch/i386/mm/ioremap.c
printk() calls should include appropriate KERN_* constant.

Signed-off-by: Christophe Lucas <clucas@rotomalug.org>
Signed-off-by: Domen Puncer <domen@coderock.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:25:08 -07:00
Olaf Hering f14c6fd0fc [PATCH] update comment about gzip scratch size
fix a comment about the array size.

Signed-off-by: Olaf Hering <olh@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:25:07 -07:00
Adrian Bunk 9e84d1c36a [PATCH] i386: cleanup boot_cpu_logical_apicid variables
There are currently two different boot_cpu_logical_apicid variables:
- a global one in mpparse.c
- a static one in smpboot.c

Of these two, only the one in smpboot.c might be used (through
boot_cpu_apicid).

This patch therefore removes the one in mpparse.c .

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrey Panin <pazke@donpac.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:25:05 -07:00
Domen Puncer f45494480f [PATCH] x86_64: coding style and whitespace fixups
Remove some of the unnecessary differences between arch/i386 and
arch/x86_64.  This patch fixes more whitespace issues, some miscellaneous
typos, a wrong URL and a factually incorrect statement about the current
boot sector code.

Signed-off-by: Domen Puncer <domen@coderock.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:25:02 -07:00
Jesper Juhl 4ae6673e02 [PATCH] get rid of redundant NULL checks before kfree() in arch/i386/
Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:25:00 -07:00
Domen Puncer 40086ea17e [PATCH] arch/i386/crypto/aes.c: fix sparse warnings
Signed-off-by: Alexey Dobriyan <adobriyan@mail.ru>
Signed-off-by: Domen Puncer <domen@coderock.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:59 -07:00
Domen Puncer c7c5844526 [PATCH] arch/i386/mm/fault.c: fix sparse warnings
Signed-off-by: Alexey Dobriyan <adobriyan@mail.ru>
Signed-off-by: Domen Puncer <domen@coderock.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:59 -07:00
Domen Puncer 77617bd806 [PATCH] arch/i386/kernel/apm.c: fix sparse warnings
Signed-off-by: Alexey Dobriyan <adobriyan@mail.ru>
Signed-off-by: Domen Puncer <domen@coderock.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:58 -07:00
Domen Puncer 3f3ae3471f [PATCH] arch/i386/kernel/traps.c: fix sparse warnings
Signed-off-by: Alexey Dobriyan <adobriyan@mail.ru>
Signed-off-by: Domen Puncer <domen@coderock.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:58 -07:00
randy_dunlap 09dbb4768c [PATCH] x86-64: add memcpy/memset prototypes
Put function prototypes for memset() and memcpy() ahead of where
there are used, to kill sparse warnings:

arch/x86_64/boot/compressed/../../../../lib/inflate.c:317:3: warning: undefined identifier 'memset'
arch/x86_64/boot/compressed/../../../../lib/inflate.c:601:11: warning: undefined identifier 'memcpy'
arch/x86_64/boot/compressed/misc.c:151:2: warning: undefined identifier 'memcpy'
arch/x86_64/boot/compressed/../../../../lib/inflate.c:317:3: warning: call with no type!
arch/x86_64/boot/compressed/../../../../lib/inflate.c:601:17: warning: call with no type!
arch/x86_64/boot/compressed/misc.c:151:9: warning: call with no type!

Signed-off-by: randy_dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:57 -07:00
Maneesh Soni 72414d3f1d [PATCH] kexec code cleanup
o Following patch provides purely cosmetic changes and corrects CodingStyle
  guide lines related certain issues like below in kexec related files

  o braces for one line "if" statements, "for" loops,
  o more than 80 column wide lines,
  o No space after "while", "for" and "switch" key words

o Changes:
  o take-2: Removed the extra tab before "case" key words.
  o take-3: Put operator at the end of line and space before "*/"

Signed-off-by: Maneesh Soni <maneesh@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:55 -07:00
Alexander Nyberg 4f339ecb30 [PATCH] kdump: Save trap information for later analysis
If we are faulting in kernel it is quite possible this will lead to a
panic.  Save trap number, cr2 (in case of page fault) and error_code in the
current thread (these fields already exist for signal delivery but are not
used here).

This helps later kdump crash analyzing from user-space (a script has been
submitted to dig this info out in gdb).

Signed-off-by: Alexander Nyberg <alexn@telia.com>
Cc: <fastboot@lists.osdl.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:55 -07:00
Alexander Nyberg 6e274d1443 [PATCH] kdump: Use real pt_regs from exception
Makes kexec_crashdump() take a pt_regs * as an argument.  This allows to
get exact register state at the point of the crash.  If we come from direct
panic assertion NULL will be passed and the current registers saved before
crashdump.

This hooks into two places:
die(): check the conditions under which we will panic when calling
do_exit and go there directly with the pt_regs that caused the fatal
fault.

die_nmi(): If we receive an NMI lockup while in the kernel use the
pt_regs and go directly to crash_kexec(). We're probably nested up badly
at this point so this might be the only chance to escape with proper
information.

Signed-off-by: Alexander Nyberg <alexn@telia.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:54 -07:00
Vivek Goyal 2030eae52b [PATCH] Retrieve elfcorehdr address from command line
This patch adds support for retrieving the address of elf core header if one
is passed in command line.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:53 -07:00
Vivek Goyal 60e64d46a5 [PATCH] kdump: Routines for copying dump pages
This patch provides the interfaces necessary to read the dump contents,
treating it as a high memory device.

Signed off by Hariprasad Nellitheertha <hari@in.ibm.com>
Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:53 -07:00
Vivek Goyal 5f016456c9 [PATCH] kdump: Kconfig
- config option CONFIG_CRASH_DUMP

- Made it dependent on HIGHMEM.  This is required as capture kernel treats
  the previous kernel's memory as high memmory and stitches a PTE for
  accessing it.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:53 -07:00
Vivek Goyal 92aa63a5a1 [PATCH] kdump: Retrieve saved max pfn
This patch retrieves the max_pfn being used by previous kernel and stores it
in a safe location (saved_max_pfn) before it is overwritten due to user
defined memory map.  This pfn is used to make sure that user does not try to
read the physical memory beyond saved_max_pfn.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:52 -07:00
Vivek Goyal a3ea8ac846 [PATCH] Kexec: Kexec on panic fix with nmi watchdog enabled
o Problem: Kexec on panic hangs if first kernel is booted with nmi_watchdog
  command line parameter. This problem occurs because kexec crash shutdown
  code replaces the NMI callback handler. This handler saves the cpu register
  states and halts the cpu. If system is booted with nmi_watchdog parameter,
  then crashing cpu also runs this nmi handler and halts itself.

o This patch fixes the problem by keeping a track of crashing cpu and not
  executing the new nmi handler on crashing cpu.

o There is a dependence on smp_processor_id() function which might return
  insane value for cpu, if cpu field of thread_info is corrupted.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:52 -07:00
Vivek Goyal 4d55476c3f [PATCH] kdump: NMI handler segment selector, stack pointer fix
CPU does not save ss and esp on stack if execution was already in kernel mode
at the time of NMI occurrence.  This leads to saving of erractic values for ss
and esp.  This patch fixes the issue.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:52 -07:00
Vivek Goyal 625f1c8219 [PATCH] Kdump: Export crash notes section address through sysfs
o Following patch exports kexec global variable "crash_notes" to user space
  through sysfs as kernel attribute in /sys/kernel.

Signed-off-by: Maneesh Soni <maneesh@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:51 -07:00
Heiko Carstens cf13f0eaff [PATCH] kexec: s390 support
Add kexec support for s390 architecture.

From: Milton Miller <miltonm@bga.com>

- Fix passing of first argument to relocate_kernel assembly.
- Fix Kconfig description.
- Remove wrong comment and comments that describe obvious things.
- Allow only KEXEC_TYPE_DEFAULT as image type -> dump not supported.

Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:51 -07:00
R Sharada fce0d57403 [PATCH] ppc64: kexec support for ppc64
This patch implements the kexec support for ppc64 platforms.

A couple of notes:

1)  We copy the pages in virtual mode, using the full base kernel
    and a statically allocated stack.   At kexec_prepare time we
    scan the pages and if any overlap our (0, _end[]) range we
    return -ETXTBSY.

    On PowerPC 64 systems running in LPAR (logical partitioning)
    mode, only a small region of memory, referred to as the RMO,
    can be accessed in real mode.  Since Linux runs with only one
    zone of memory in the memory allocator, and it can be orders of
    magnitude more memory than the RMO, looping until we allocate
    pages in the source region is not feasible.  Copying in virtual
    means we don't have to write a hash table generation and call
    hypervisor to insert translations, instead we rely on the pinned
    kernel linear mapping.  The kernel already has move to linked
    location built in, so there is no requirement to load it at 0.

    If we want to load something other than a kernel, then a stub
    can be written to copy a linear chunk in real mode.

2)  The start entry point gets passed parameters from the kernel.
    Slaves are started at a fixed address after copying code from
    the entry point.

    All CPUs get passed their firmware assigned physical id in r3
    (most calling conventions use this register for the first
    argument).

    This is used to distinguish each CPU from all other CPUs.
    Since firmware is not around, there is no other way to obtain
    this information other than to pass it somewhere.

    A single CPU, referred to here as the master and the one executing
    the kexec call, branches to start with the address of start in r4.
    While this can be calculated, we have to load it through a gpr to
    branch to this point so defining the register this is contained
    in is free.  A stack of unspecified size is available at r1
    (also common calling convention).

    All remaining running CPUs are sent to start at absolute address
    0x60 after copying the first 0x100 bytes from start to address 0.
    This convention was chosen because it matches what the kernel
    has been doing itself.  (only gpr3 is defined).

    Note: This is not quite the convention of the kexec bootblock v2
    in the kernel.  A stub has been written to convert between them,
    and we may adjust the kernel in the future to allow this directly
    without any stub.

3)  Destination pages can be placed anywhere, even where they
    would not be accessible in real mode.  This will allow us to
    place ram disks above the RMO if we choose.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: R Sharada <sharada@in.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:51 -07:00
R Sharada f4c82d5132 [PATCH] ppc64 kexec: native hash clear
Add code to clear the hash table and invalidate the tlb for native (SMP,
non-LPAR) mode.  Supports 16M and 4k pages.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: R Sharada <sharada@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:51 -07:00
Eric W. Biederman 70765aa4bd [PATCH] kexec: kexec ppc support
I have tweaked this patch slightly to handle an empty list
of pages to relocate passed to relocate_new_kernel.  And
I have added ppc_md.machine_crash_shutdown.  To keep up with
the changes in the generic kexec infrastructure.

From: Albert Herranz <albert_herranz@yahoo.es>

The following patch adds support for kexec on the ppc32 platform.

Non-OpenFirmware based platforms are likely to work directly without
additional changes on the kernel side.  The kexec-tools userland package
may need to be slightly updated, though.

For OpenFirmware based machines, additional work is still needed on the
kernel side before kexec support is ready.  Benjamin Herrenschmidt is
kindly working on that part.

In order for a ppc platform to use the kexec kernel services it must
implement some ppc_md hooks.  Otherwise, kexec will be explicitly disabled,
as suggested by benh.

There are 3+1 new ppc_md hooks that a platform supporting kexec may
implement.  Two of them are mandatory for kexec to work.  See
include/asm-ppc/machdep.h for details.

- machine_kexec_prepare(image)

  This function is called to make any arrangements to the image before it
  is loaded.

  This hook _MUST_ be provided by a platform in order to activate kexec
  support for that platform.  Otherwise, the platform is considered to not
  support kexec and the kexec_load system call will fail (that makes all
  existing platforms by default non-kexec'able).

- machine_kexec_cleanup(image)

  This function is called to make any cleanups on image after the loaded
  image data it is freed.  This hook is optional.  A platform may or may
  not provide this hook.

- machine_kexec(image)

  This function is called to perform the _actual_ kexec.  This hook
  _MUST_ be provided by a platform in order to activate kexec support for
  that platform.

  If a platform provides machine_kexec_prepare but forgets to provide
  machine_kexec, a kexec will fall back to a reboot.

  A ready-to-use machine_kexec_simple() generic function is provided to,
  hopefully, simplify kexec adoption for embedded platforms.  A platform
  may call this function from its specific machine_kexec hook, like this:

void myplatform_kexec(struct kimage *image)
{
        machine_kexec_simple(image);
}

- machine_shutdown()

  This function is called to perform any machine specific shutdowns, not
  already done by drivers.  This hook is optional.  A platform may or may
  not provide this hook.

An example (trimmed) platform specific module for a platform supporting
kexec through the existing machine_kexec_simple follows:

/* ... */

#ifdef CONFIG_KEXEC
int myplatform_kexec_prepare(struct kimage *image)
{
        /* here, we can place additional preparations
*/
        return 0; /* yes, we support kexec */
}

void myplatform_kexec(struct kimage *image)
{
        machine_kexec_simple(image);
}
#endif /* CONFIG_KEXEC */

/* ... */

void __init
platform_init(unsigned long r3, unsigned long r4,
unsigned long r5,
              unsigned long r6, unsigned long r7)
{

/* ... */

#ifdef CONFIG_KEXEC
        ppc_md.machine_kexec_prepare =
myplatform_kexec_prepare;
        ppc_md.machine_kexec         =
myplatform_kexec;
#endif /* CONFIG_KEXEC */

/* ... */

}

The kexec ppc kernel support has been heavily tested on the GameCube Linux
port, and, as reported in the fastboot mailing list, it has been tested too
on a Moto 82xx ppc by Rick Richardson.

Signed-off-by: Albert Herranz <albert_herranz@yahoo.es>
Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:51 -07:00
Eric W. Biederman 5f5609df0c [PATCH] crashdump: x86_64: crashkernel option
This is the x86_64 implementation of the crashkernel option.  It reserves
a window of memory very early in the bootup process, so we never use
it for anything but the kernel to switch to when the running
kernel panics.

In addition to reserving this memory a resource structure is registered
so looking at /proc/iomem it is clear what happened to that memory.

ISSUES:
Is it possible to implement this in a architecture generic way?
What should be done with architectures that always use an iommu and
thus don't report their RAM memory resources in /proc/iomem?

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:50 -07:00
Eric W. Biederman 5234f5eb04 [PATCH] kexec: x86_64 kexec implementation
This is the x86_64 implementation of machine kexec.  32bit compatibility
support has been implemented, and machine_kexec has been enhanced to not care
about the changing internal kernel paget table structures.

From: Alexander Nyberg <alexn@dsv.su.se>

      build fix

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:50 -07:00
Eric W. Biederman d89559589a [PATCH] kexec: x86_64: factor out apic shutdown code
Factor out the apic and smp shutdown code from machine_restart so it can be
called by in the kexec reboot path as well.

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:50 -07:00
Eric W. Biederman 1bc3b91aee [PATCH] crashdump: x86 crashkernel option
This is the x86 implementation of the crashkernel option.  It reserves a
window of memory very early in the bootup process, so we never use it for
anything but the kernel to switch to when the running kernel panics.

In addition to reserving this memory a resource structure is registered so
looking at /proc/iomem it is clear what happened to that memory.

ISSUES:
Is it possible to implement this in a architecture generic way?
What should be done with architectures that always use an iommu and
thus don't report their RAM memory resources in /proc/iomem?

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:50 -07:00
Eric W. Biederman 63d30298ef [PATCH] kexec: x86 shutdown APICs during crash_shutdown
In the case of a crash/panic an architecture specific function
machine_crash_shutdown is called.  This patch adds to the x86 machine_crash
function the standard kernel code for shutting down apics.

Every line of code added to that function increases the risk that we will call
code after a kernel panic that is not safe.

This patch should not make it to the stable kernel without a being reviewed a
lot more.  It is unclear how much a hardned kernel can take when it comes to
misconfigured apics.  So since a normal kernel has problems this patch does a
clean shutdown.

It is my expectation this patch will be dropped from future generations of the
kexec work.  But for the moment it is a crutch to keep from breaking
everything.

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:50 -07:00
Eric W. Biederman 2c818b45a2 [PATCH] kexec: x86: snapshot registers during crash shutdown
After the kernel panics if we wish to generate an entire machine core file it
is very nice to know the register state at the time the machine crashed.

After long discussion it was realized that if you are going to be saving the
information anyway it is reasonable to store the information in a format that
it will be used and recognized in so the register state is stored in the
standard ELF note format.

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:49 -07:00
Eric W. Biederman c4ac4263a0 [PATCH] crashdump: x86: add NMI handler to capture other CPUs
One of the dangers when switching from one kernel to another is what happens
to all of the other cpus that were running in the crashed kernel.  In an
attempt to avoid that problem this patch adds a nmi handler and attempts to
shoot down the other cpus by sending them non maskable interrupts.

The code then waits for 1 second or until all known cpus have stopped running
and then jumps from the running kernel that has crashed to the kernel in
reserved memory.

The kernel spin loop is used for the delay as that should behave continue to
be safe even in after a crash.

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:49 -07:00
Eric W. Biederman 5033cba087 [PATCH] kexec: x86 kexec core
This is the i386 implementation of kexec.

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:49 -07:00
Eric W. Biederman dd2a13054f [PATCH] kexec: x86: factor out apic shutdown code
Factor out the apic and smp shutdown code from machine_restart so it can be
called by in the kexec reboot path as well.

By switching to the bootstrap cpu by default on reboot I can delete/simplify
some motherboard fixups well.

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:49 -07:00
Eric W. Biederman d0537508a9 [PATCH] kexec: x86_64: add CONFIG_PHYSICAL_START
For one kernel to report a crash another kernel has created we need
to have 2 kernels loaded simultaneously in memory.  To accomplish this
the two kernels need to built to run at different physical addresses.

This patch adds the CONFIG_PHYSICAL_START option to the x86_64 kernel
so we can do just that.  You need to know what you are doing and
the ramifications are before changing this value, and most users
won't care so I have made it depend on CONFIG_EMBEDDED

bzImage kernels will work and run at a different address when compiled
with this option but they will still load at 1MB.  If you need a kernel
loaded at a different address as well you need to boot a vmlinux.

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:48 -07:00
Vivek Goyal 8a9190853c [PATCH] kexec: reserve Bootmem fix for booting nondefault location kernel
This patch fixes a problem with reserving memory during boot up of a kernel
built for non-default location.  Currently boot memory allocator reserves
the memory required by kernel image, boot allocaotor bitmap etc.  It
assumes that kernel is loaded at 1MB (HIGH_MEMORY hard coded to 1024*1024).
 But kernel can be built for non-default locatoin, hence existing
hardcoding will lead to reserving unnecessary memory.  This patch fixes it.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:48 -07:00
Eric W. Biederman 3d345e3fc9 [PATCH] kexec: x86: add CONFIG_PYSICAL_START
For one kernel to report a crash another kernel has created we need
to have 2 kernels loaded simultaneously in memory.  To accomplish this
the two kernels need to built to run at different physical addresses.

This patch adds the CONFIG_PHYSICAL_START option to the x86 kernel
so we can do just that.  You need to know what you are doing and
the ramifications are before changing this value, and most users
won't care so I have made it depend on CONFIG_EMBEDDED

bzImage kernels will work and run at a different address when compiled
with this option but they will still load at 1MB.  If you need a kernel
loaded at a different address as well you need to boot a vmlinux.

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:48 -07:00
Eric W. Biederman 5ded01e83e [PATCH] kexec: x86_64: vmlinux: fix physical addresses
The vmlinux on x86_64 does not report the correct physical address of
the kernel.  Instead in the physical address field it currently
reports the virtual address of the kernel.

This is patch is a bug fix that corrects vmlinux to report the
proper physical addresses.

This is potentially a help for crash dump analysis tools.

This definitiely allows bootloaders that load vmlinux as a standard
ELF executable.  Bootloaders directly loading vmlinux become of
practical importance when we consider the kexec on panic case.

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:47 -07:00
Eric W. Biederman ad0d75ebac [PATCH] kexec: x86: vmlinux: fix physical addresses
The vmlinux on i386 does not report the correct physical address of
the kernel.  Instead in the physical address field it currently
reports the virtual address of the kernel.

This is patch is a bug fix that corrects vmlinux to report the
proper physical addresses.

This is potentially a help for crash dump analysis tools.

This definitiely allows bootloaders that load vmlinux as a standard
ELF executable.  Bootloaders directly loading vmlinux become of
practical importance when we consider the kexec on panic case.

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:47 -07:00
Eric W. Biederman 208fb93162 [PATCH] kexec: x86_64: restore apic virtual wire mode on shutdown
When coming out of apic mode attempt to set the appropriate
apic back into virtual wire mode.  This improves on previous versions
of this patch by by never setting bot the local apic and the ioapic
into veritual wire mode.

This code looks at data from the mptable to see if an ioapic has
an ExtInt input to make this decision.  A future improvement
is to figure out which apic or ioapic was in virtual wire mode
at boot time and to remember it.  That is potentially a more accurate
method, of selecting which apic to place in virutal wire mode.

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:47 -07:00
Eric W. Biederman 650927ef8a [PATCH] kexec: x86: resture apic virtual wire mode on shutdown
When coming out of apic mode attempt to set the appropriate
apic back into virtual wire mode.  This improves on previous versions
of this patch by by never setting bot the local apic and the ioapic
into veritual wire mode.

This code looks at data from the mptable to see if an ioapic has
an ExtInt input to make this decision.  A future improvement
is to figure out which apic or ioapic was in virtual wire mode
at boot time and to remember it.  That is potentially a more accurate
method, of selecting which apic to place in virutal wire mode.

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:47 -07:00
Eric W. Biederman 719e711050 [PATCH] kexec: x86_64: add i8259 shutdown method
From: Eric W. Biederman <ebiederm@xmission.com

The following patch simply adds a shutdown method to the x86_64 i8259 code.

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:46 -07:00
Eric W. Biederman cee5dab485 [PATCH] kexec: x86: i8259 shutdown: disable interrupts
From: Eric W. Biederman <ebiederm@xmission.com>

This patch disables interrupt generation from the legacy pic on reboot.  Now
that there is a sys_device class it should not be called while drivers are
still using interrupts.

There is a report about this breaking ACPI power off on some systems.
http://bugme.osdl.org/show_bug.cgi?id=4041
However the final comment seems to exonerate this code.  So until
I get more information I believe that was a false positive.

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:46 -07:00
Eric W. Biederman 70adada428 [PATCH] kexec: x86_64: e820 64bit fix
From: Eric W. Biederman <ebiederm@xmission.com>

It is ok to reserve resources > 4G on x86_64 struct resource is 64bit now :)

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:46 -07:00
Eric W. Biederman 9635b47d91 [PATCH] kexec: x86: local apic fix
From: "Maciej W. Rozycki" <macro@linux-mips.org>

Fix a kexec problem whcih causes local APIC detection failure.

The problem is detect_init_APIC() is called early, before the command line
have been processed.  Therefore "lapic" (and "nolapic") have not been seen,
yet.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:46 -07:00
Ingo Molnar cc19ca86a0 [PATCH] consolidate PREEMPT options into kernel/Kconfig.preempt
This patch consolidates the CONFIG_PREEMPT and CONFIG_PREEMPT_BKL
preemption options into kernel/Kconfig.preempt.  This, besides reducing
source-code, also enables more centralized tweaking of preemption related
options.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:45 -07:00
Dinakar Guniguntala 7f1867a5b3 [PATCH] Dynamic sched domains: ia64 changes
ia64 changes similar to kernel/sched.c.

Signed-off-by: Dinakar Guniguntala <dino@in.ibm.com>
Acked-by: Paul Jackson <pj@sgi.com>
Acked-by: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:45 -07:00
Nick Piggin 687f1661d3 [PATCH] sched: sched tuning
Do some basic initial tuning.

Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:42 -07:00
Paul E. McKenney b2b1866006 [PATCH] RCU: clean up a few remaining synchronize_kernel() calls
2.6.12-rc6-mm1 has a few remaining synchronize_kernel()s, some (but not
all) in comments.  This patch changes these synchronize_kernel() calls (and
comments) to synchronize_rcu() or synchronize_sched() as follows:

- arch/x86_64/kernel/mce.c mce_read(): change to synchronize_sched() to
  handle races with machine-check exceptions (synchronize_rcu() would not cut
  it given RCU implementations intended for hardcore realtime use.

- drivers/input/serio/i8042.c i8042_stop(): change to synchronize_sched() to
  handle races with i8042_interrupt() interrupt handler.  Again,
  synchronize_rcu() would not cut it given RCU implementations intended for
  hardcore realtime use.

- include/*/kdebug.h comments: change to synchronize_sched() to handle races
  with NMIs.  As before, synchronize_rcu() would not cut it...

- include/linux/list.h comment: change to synchronize_rcu(), since this
  comment is for list_del_rcu().

- security/keys/key.c unregister_key_type(): change to synchronize_rcu(),
  since this is interacting with RCU read side.

- security/keys/process_keys.c install_session_keyring(): change to
  synchronize_rcu(), since this is interacting with RCU read side.

Signed-off-by: "Paul E. McKenney" <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:38 -07:00
Michael Holzheu 66a464dbc8 [PATCH] s390: debug feature changes
This patch changes the memory allocation method for the s390 debug feature.
Trace buffers had been allocated using the get_free_pages() function before.
Therefore it was not possible to get big memory areas in a running system due
to memory fragmentation.  Now the trace buffers are subdivided into several
subbuffers with pagesize.  Therefore it is now possible to allocate more
memory for the trace buffers and more trace records can be written.

In addition to that, dynamic specification of the size of the trace buffers is
implemented.  It is now possible to change the size of a trace buffer using a
new debugfs file instance.  When writing a number into this file, the trace
buffer size is changed to 'number * pagesize'.

In the past all the traces could be obtained from userspace by accessing files
in the "proc" filesystem.  Now with debugfs we have a new filesystem which
should be used for debugging purposes.  This patch moves the debug feature
from procfs to debugfs.

Since the interface of debug_register() changed, all device drivers, which use
the debug feature had to be adjusted.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:37 -07:00
Christian Borntraeger 6b979de395 [PATCH] s390: add vmcp interface
Add interface to issue VM control program commands.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:37 -07:00
Heiko Carstens 77fa22450d [PATCH] s390: improved machine check handling
Improved machine check handling.  Kernel is now able to receive machine checks
while in kernel mode (system call, interrupt and program check handling).
Also register validation is now performed.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:37 -07:00
Jeff Dike 29d56cfe3c [PATCH] uml: hot-unplug code cleanup
Clean up the hot-unplugging code.  There is now an id procedure which is
called to figure out what device we're talking to.  The error messages from
that are now done from mconsole_remove instead of the driver.  remove is now
called with the device number, after it has been checked, so doesn't need to
do sanity checking on it.

Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:36 -07:00
Jeff Dike fc47a0d18a [PATCH] uml: time initialization tidying
user_time_init_skas and user_time_init_tt were essentially the same.  So, this
merges them, deleting the mode-specific functions and declarations.

Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:35 -07:00
Jeff Dike 026549d284 [PATCH] uml: always disable kmalloc during shutdown
kmalloc wasn't being disabled during panic.  This patch ensures that, no
matter how UML is exiting, it is disabled.  This matters because part of the
cleanup is to remove the umid file, which involves readdir, which calls
malloc.  This must map to libc malloc, rather than kmalloc or vmalloc.

Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:35 -07:00
Jeff Dike a6f4e3cf75 [PATCH] uml: fix timer initialization
In skas mode, the call to uml_idle_timer permanently shut off the virtual
timer, resulting in no timer ticks to anything but the idle thread.  This is
likely the cause of the soft lockups that are seen sporadically in recent
UMLs.

Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:35 -07:00
Jeff Dike e0877f07e8 [PATCH] uml: fork cleanup
Fix the do_fork calling convention: normal arch pass the regs and the new sp
value to do_fork instead of NULL.

Currently the arch-independent code ignores these values, while the UML code
(actually it's copy_thread) gets the right values by itself.

With this patch, things are fixed up.

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:35 -07:00
Jesper Juhl 41f2148a67 [PATCH] uml: kfree cleanup
Here's a small patch to remove a few unnessesary NULL pointer checks before
kfree() in arch/um/drivers/daemon_user.c

Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk>
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:35 -07:00
Andrew Morton 350d5bd84e [PATCH] uml: fix sizeof usage
Size of pointer doesn't seem right, but maybe my solution isn't either
(sig_size maybe?).

Signed-off-by: Domen Puncer <domen@coderock.org>
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:34 -07:00
Pavel Machek 8d783b3e02 [PATCH] swsusp: clean assembly parts
This patch fixes register saving so that each register is only saved once,
and adds missing saving of %cr8 on x86-64.  Some reordering so that
save/restore is more logical/safer (segment registers should be restored
after gdt).

Signed-off-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:33 -07:00
Pavel Machek 343c3f6428 [PATCH] s-t-RAM: load gdt the right way
Sleep code uses wrong version of lgdt, that does the wrong thing when
gdt is beyond 16MB or so.

Signed-off-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:32 -07:00
Pavel Machek 648be31881 [PATCH] swsusp: kill config_pm_disk
CONFIG_PM_DISK is long gone, but it still managed to survived at few
places.

Signed-off-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:32 -07:00
Li Shaohua 5a72e04df5 [PATCH] suspend/resume SMP support
Using CPU hotplug to support suspend/resume SMP.  Both S3 and S4 use
disable/enable_nonboot_cpus API.  The S4 part is based on Pavel's original S4
SMP patch.

Signed-off-by: Li Shaohua<shaohua.li@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:32 -07:00
Shaohua Li a9fa06c26f [PATCH] set cpu_state for CPU hotplug (ia64)
Dead CPU notifies online CPU that it's dead using cpu_state variable.
After switching to physical cpu hotplug, we forgot setting the variable.
This patch fixes it.  Currently only __cpu_die uses it.  We changed other
locations for consistency in case others use it.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Acked-by: Ashok Raj <ashok.raj@intel.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:31 -07:00
Ashok Raj a02c4cb67e [PATCH] x86_64: Provide ability to choose using shortcuts for IPI in flat mode.
This patch provides an option to switch broadcast or use mask version for
sending IPI's.  If CONFIG_HOTPLUG_CPU is defined, we choose not to use
broadcast shortcuts by default, otherwise we choose broadcast mode as default.

both cases, one can change this via startup cmd line option, to choose
no-broadcast mode.

	no_ipi_broadcast=1

This is provided on request from Andi Kleen, since he doesnt agree with
replacing IPI shortcuts as a solution for CPU hotplug.  Without removing
broadcast IPI's, it would mean lots of new code for __cpu_up() path, which
would acheive the same results.

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Acked-by: Andi Kleen <ak@muc.de>
Acked-by: Zwane Mwaikambo <zwane@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:31 -07:00
Ashok Raj 884d9e40b4 [PATCH] x86_64: Dont use broadcast shortcut to make it cpu hotplug safe.
Broadcast IPI's provide un-expected behaviour for cpu hotplug.  CPU's in
offline state also end up receiving the IPI.  Once the cpus become online they
receive these stale IPI's which are bad and introduce unexpected behaviour.

This is easily avoided by not sending a broadcast and addressing just the
CPU's in online map.  Doing prelim cycle counts it appears there is no big
overhead and numbers seem around 0x3000-0x3900 on an average on x86 and x86_64
systems with CPUS running 3G, both for broadcast and mask version of the
API's.

The shortcuts are useful only for flat mode (where the perf shows no
degradation), and in cluster mode, its unicast anyway.  Its simpler to just
not use broadcast anymore.

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Acked-by: Andi Kleen <ak@muc.de>
Acked-by: Zwane Mwaikambo <zwane@arm.linux.org.uk>
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:31 -07:00
Ashok Raj cb0cd8d49a [PATCH] x86_64: CPU hotplug sibling map cleanup
This patch is a minor cleanup to the cpu sibling/core map.  It is required
that this setup happens on a per-cpu bringup time.

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Acked-by: Andi Kleen <ak@muc.de>
Acked-by: Zwane Mwaikambo <zwane@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:31 -07:00
Ashok Raj 76e4f660d9 [PATCH] x86_64: CPU hotplug support
Experimental CPU hotplug patch for x86_64
  -----------------------------------------
This supports logical CPU online and offline.
- Test with maxcpus=1, and then kick other cpu's off to test if init code
  is all cleaned up. CONFIG_SCHED_SMT works as well.
- idle threads are forked on demand from keventd threads for clean startup

TBD:
1. Not tested on a real NUMA machine (tested with numa=fake=2)
2. Handle ACPI pieces for physical hotplug support.

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Acked-by: Andi Kleen <ak@muc.de>
Acked-by: Zwane Mwaikambo <zwane@arm.linux.org.uk>
Signed-off-by: Shaohua.li<shaohua.li@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:30 -07:00
Ashok Raj e6982c671c [PATCH] x86_64: Change init sections for CPU hotplug support
This patch adds __cpuinit and __cpuinitdata sections that need to exist past
boot to support cpu hotplug.

Caveat: This is done *only* for EM64T CPU Hotplug support, on request from
Andi Kleen.  Much of the generic hotplug code in kernel, and none of the other
archs that support CPU hotplug today, i386, ia64, ppc64, s390 and parisc dont
mark sections with __cpuinit, but only mark them as __devinit, and
__devinitdata.

If someone is motivated to change generic code, we need to make sure all
existing hotplug code does not break, on other arch's that dont use __cpuinit,
and __cpudevinit.

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Acked-by: Andi Kleen <ak@muc.de>
Acked-by: Zwane Mwaikambo <zwane@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:30 -07:00
Li Shaohua e1367daf3e [PATCH] cpu state clean after hot remove
Clean CPU states in order to reuse smp boot code for CPU hotplug.

Signed-off-by: Li Shaohua<shaohua.li@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:30 -07:00
Li Shaohua 0bb3184df5 [PATCH] init call cleanup
Trival patch for CPU hotplug.  In CPU identify part, only did cleaup for intel
CPUs.  Need do for other CPUs if they support S3 SMP.

Signed-off-by: Li Shaohua<shaohua.li@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:30 -07:00
Li Shaohua d720803a93 [PATCH] sibling map initializing rework
Make sibling map init per-cpu.  Hotplug CPU may change the map at runtime.

Signed-off-by: Li Shaohua<shaohua.li@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:29 -07:00
Li Shaohua 6fe940d6c3 [PATCH] sep initializing rework
Make SEP init per-cpu, so it is hotplug safe.

Signed-off-by: Li Shaohua<shaohua.li@intel.com>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:29 -07:00
Ashok Raj 67664c8f7e [PATCH] i386: Dont use IPI broadcast when using cpu hotplug.
This patch introduces a startup parameter no_broadcast.  When we enable
CONFIG_HOTPLUG_CPU, we dont want to use broadcast shortcut as it has ill
effects on a offline cpu.  If we issue broadcast, the IPI is also delivered
to offline cpus, or partially up cpu causing stale IPI's to be handled,
which is a problem and can cause undesirable effects.

Introduces a new startup cmdline option no_ipi_broadcast, that can be
switched at cmdline if necessary.

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:29 -07:00
Zwane Mwaikambo f370513640 [PATCH] i386 CPU hotplug
(The i386 CPU hotplug patch provides infrastructure for some work which Pavel
is doing as well as for ACPI S3 (suspend-to-RAM) work which Li Shaohua
<shaohua.li@intel.com> is doing)

The following provides i386 architecture support for safely unregistering and
registering processors during runtime, updated for the current -mm tree.  In
order to avoid dumping cpu hotplug code into kernel/irq/* i dropped the
cpu_online check in do_IRQ() by modifying fixup_irqs().  The difference being
that on cpu offline, fixup_irqs() is called before we clear the cpu from
cpu_online_map and a long delay in order to ensure that we never have any
queued external interrupts on the APICs.  There are additional changes to s390
and ppc64 to account for this change.

1) Add CONFIG_HOTPLUG_CPU
2) disable local APIC timer on dead cpus.
3) Disable preempt around irq balancing to prevent CPUs going down.
4) Print irq stats for all possible cpus.
5) Debugging check for interrupts on offline cpus.
6) Hacky fixup_irqs() to redirect irqs when cpus go off/online.
7) play_dead() for offline cpus to spin inside.
8) Handle offline cpus set in flush_tlb_others().
9) Grab lock earlier in smp_call_function() to prevent CPUs going down.
10) Implement __cpu_disable() and __cpu_die().
11) Enable local interrupts in cpu_enable() after fixup_irqs()
12) Don't fiddle with NMI on dead cpu, but leave intact on other cpus.
13) Program IRQ affinity whilst cpu is still in cpu_online_map on offline.

Signed-off-by: Zwane Mwaikambo <zwane@linuxpower.ca>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:29 -07:00
Shaohua Li d92de65cab [PATCH] variable overflow after hundreds round of hotplug CPU
I'm doing the cpu hotplug stress test and found a variable ('ready') is
overflow after several hundreds rounds of cpu hotplug.  Here is a fix.

Signed-off-by: Shaohua Li<shaohua.li@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:28 -07:00
Shaohua Li a13db56624 [PATCH] CPU hotplug: fix hpet sectioning
With hpet enabled, cpu hotplug uses some routines marked with __init.

Signed-off-by: Shaohua Li<shaohua.li@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:28 -07:00
Andrey Panin 1249c5138f [PATCH] dmi: spring cleanup
Whitespace and CodingStyle cleanup. No functionality changes.

Signed-off-by: Andrey Panin <pazke@donpac.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:28 -07:00
Andrey Panin b625883f24 [PATCH] dmi: remove central blacklist
Since last dmi quirk looks useless (it just prints 404 compliant url) we can
finally remove central dmi blacklist.

Signed-off-by: Andrey Panin <pazke@donpac.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:28 -07:00
Andrey Panin 0f8133a8db [PATCH] dmi: move ACPI sleep quirk
This patch moves ACPI sleep quirk out of dmi_scan.c

Signed-off-by: Andrey Panin <pazke@donpac.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:28 -07:00
Andrey Panin aea00143a8 [PATCH] dmi: move ACPI boot quirk
This patch moves ACPI boot quirks out of dmi_scan.c

Signed-off-by: Andrey Panin <pazke@donpac.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:27 -07:00
Michael Ellerman 856509d5da [PATCH] ppc64: Fix compile warnings in arch/ppc64/kernel/lparcfg.c
Stephen's patch to remove LparData.h missed an include in lparcfg.c This
fixes a few compile warnings.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:27 -07:00
Kumar Gala 2ec19faf61 [PATCH] ppc32: remove some unnecessary includes of bootmem.h
Continue the Good Fight:  Limit bootmem.h include creep.

Signed-off-by: Jon Loeliger <jdl@freescale.com>
Signed-off-by: Kumar Gala <kumar.gala@freescale.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:27 -07:00
Kumar Gala 3d9077afea [PATCH] ppc32: Remove FSL OCP support
Support for the OCP device model on Freescale (FSL) PPC's is no longer used.
All FSL PPC's that were using OCP have be converted to using the platform
device model.

Signed-off-by: Kumar Gala <kumar.gala@freescale.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:27 -07:00
Kumar Gala 33d9e9b56d [PATCH] ppc32: Add support for Freescale e200 (Book-E) core
The e200 core is a Book-E core (similar to e500) that has a unified L1 cache
and is not cache coherent on the bus.  The e200 core also adds a separate
exception level for debug exceptions.  Part of this patch helps to cleanup a
few cases that are true for all Freescale Book-E parts, not just e500.

Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Kumar Gala <kumar.gala@freescale.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:26 -07:00
Kumar Gala 62aa751d16 [PATCH] ppc32: Check return of ppc_sys_get_pdata before accessing pointer
Ensure that the returned pointer from ppc_sys_get_pdata is not NULL before we
start using it.  This handles any cases where we have variants of processors
on the same board with different functionality.

Signed-off-by: Kumar Gala <kumar.gala@freescale.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:26 -07:00
Yoichi Yuasa b4819b5937 [PATCH] mips: add MIPS-specific support for flatmem/discontigmem
2.6.12-git6 doesn't boot on some MIPS machines.  They need the support of flat
memory and discontig memory.

Signed-off-by: Yoichi Yuasa <yuasa@hh.iij4u.or.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:25 -07:00
Russell King 7919a693bd [PATCH] Serial: remove unnecessary register_serial/unregister_serial
A couple of drivers declare register_serial/unregister_serial prototypes
but don't use them.  FRV contains a commented out call to register_serial.
Since these are deprecated, remove these unnecessary references.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:25 -07:00
Dave Jones 821fe94727 [PATCH] gcc4 compile fix for recent ia64 xpc changes
Gcc4 doesn't like volatile casts as lvalues.  Make the structure members
volatile instead.

Signed-off-by: Dave Jones <davej@redhat.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:25 -07:00
Dmitry Torokhov e70c9d5e61 [PATCH] I8K: use standard DMI interface
I8K: Change to use stock dmi infrastructure instead of homegrown
     parsing code. The driver now requires box's DMI data to match
     list of supported models so driver can be safely compiled-in
     by default without fear of it poking into random SMM BIOS
     code. DMI checks can be ignored with i8k.ignore_dmi option.

Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:24 -07:00
Russell King 8749af6821 [PATCH] ARM: Generic Dynamic Tick Timer support for ARM, take 4
This patch adds support for Dynamic Tick Timer for ARM. Dynamic Tick is
also known as VST (Variable Scheduling Timeouts).

Dynamic Tick has been in use in the OMAP tree since last October.  The
patch is not intrusive, and does not do anything unless CONFIG_NO_IDLE_HZ
is defined.  This patch has the following fixed based on comments from
RMK:
- Time is updated before calling interrupt handlers.
- Added new interrupt flag SA_TIMER to avoid duplicate timer interrupts
- Moved struct dyn_tick_timer to time.h until we at some point probably
  have an arch independent dyn-tick.h
- Cleaned up testing for DYN_TICK_ENABLED in irq.c

 I've cleaned up this patch to fix some remaining issues:
 - Call the timer tick handler with irqs disabled, as it would be from
   a normal interrupt
 - if we have a dyn_tick, we better implement all methods.
 - generic timer_dyn_reprogram() call, to be called before sleeping
 - added command line option - "dyntick=" to allow boot-time control
   of this feature
    -- rmk

Signed-off-by: Tony Lindgren
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-06-25 19:39:45 +01:00
Lennert Buytenhek 321ab6a5fa [PATCH] ARM: 2752/1: disable ixp2000 PCI I/O software workaround on chips that don't need it
Patch from Lennert Buytenhek

The later ixp2000 models don't need the PCI I/O workaround that we
currently perform.  Add a config option to disable the workaround,
and panic on boot if a kernel without the workaround is booted on a
buggy chip.  As only pre-production ixp2000s need the workaround,
the default is for it not to be configured in.

Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: Deepak Saxena
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-06-25 19:30:04 +01:00
Russell King 3cd9e19ebc [PATCH] ARM: Fix discontigmem
The merge of sparsemem broke ARM discontigmem.  Fix it.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-06-25 19:29:34 +01:00
Lennert Buytenhek 8144f56bd1 [PATCH] ARM: 2751/1: ixp2000 gpio cleanup broke ixdp2800 build
Patch from Lennert Buytenhek

The ixp2000 gpio cleanup broke the ixdp2800 build as it moved some
gpio-related functions from arch/platform.h to arch/gpio.h and the
ixdp2x00 support code used those functions but didn't include the
latter header file.

Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-06-25 16:58:22 +01:00
Lennert Buytenhek ea23d1ac7e [PATCH] ARM: 2750/1: add i2c platform device for enp2611 on-board i2c bus
Patch from Lennert Buytenhek

On the enp2611, GPIO 7 and 6 are connected to an on-board i2c bus that
attaches to the SODIMM module slot (for SPD) and an LM84 temperature
sensor.  Add a platform device for this i2c bus.

Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-06-25 16:58:21 +01:00
Lennert Buytenhek 456b6b863a [PATCH] ARM: 2749/1: update ixp2000 defconfigs to 2.6.12-git6
Patch from Lennert Buytenhek

Update the defconfigs for the ixp2000 platforms to 2.6.12-git6.

Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-06-25 16:58:21 +01:00
Catalin Marinas 79042f087b [PATCH] ARM: 2698/1: Enable kernel r/w access to user pages on ARMv6
Patch from Catalin Marinas

cpu_v6_set_pte() sets the kernel access rights to r/o for user
pages (L_PTE_USER) when neither L_PTE_WRITE nor L_PTE_DIRTY are
set. This causes a kernel data abort when writing the TLS value
in the 0xffff0000 page. This patch enables the kernel r/w access.

Signed-off-by: Catalin Marinas
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-06-24 21:27:39 +01:00
Deepak Saxena 5932ae3f5d [PATCH] ARM: 2745/1: Fix IXP4xx debug macros
Patch from Deepak Saxena

Current IXP4xx debug macros do not work in the small window between
the MMU being enabled and the call to map_io() b/c the standard
peripheral mapping is not properly setup for use with the low-level
debug code. This patch creates a new section-aligned mapping for the
UART specifically for use with the debug macros.

Signed-off-by: Deepak Saxena
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-06-24 20:54:35 +01:00
Lennert Buytenhek c4982887ca [PATCH] ARM: 2744/1: ixp2000 gpio irq support
Patch from Lennert Buytenhek

This patch cleans up the ixp2000 gpio irq code and implements the
set_irq_type method for gpio irqs so that users can select for which
events (falling edge/rising edge/level low/level high) on the gpio
pin they want the corresponding gpio irq to be triggered.

Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: Deepak Saxena
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-06-24 20:54:35 +01:00
Chris Zankel 7282bee787 [PATCH] xtensa: Architecture support for Tensilica Xtensa Part 8
The attached patches provides part 8 of an architecture implementation
for the Tensilica Xtensa CPU series.

Signed-off-by: Chris Zankel <chris@zankel.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:05:22 -07:00
Chris Zankel 3f65ce4d14 [PATCH] xtensa: Architecture support for Tensilica Xtensa Part 5
The attached patches provides part 5 of an architecture implementation for the
Tensilica Xtensa CPU series.

Signed-off-by: Chris Zankel <chris@zankel.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:05:22 -07:00
Chris Zankel 249ac17e96 [PATCH] xtensa: Architecture support for Tensilica Xtensa Part 4
The attached patches provides part 4 of an architecture implementation for the
Tensilica Xtensa CPU series.

Signed-off-by: Chris Zankel <chris@zankel.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:05:21 -07:00
Chris Zankel 5a0015d626 [PATCH] xtensa: Architecture support for Tensilica Xtensa Part 3
The attached patches provides part 3 of an architecture implementation for the
Tensilica Xtensa CPU series.

Signed-off-by: Chris Zankel <chris@zankel.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:05:21 -07:00
Chris Zankel 4bedea9454 [PATCH] xtensa: Architecture support for Tensilica Xtensa Part 2
The attached patches provides part 2 of an architecture implementation for the
Tensilica Xtensa CPU series.

Signed-off-by: Chris Zankel <chris@zankel.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:05:21 -07:00
Chris Zankel 8e1a6dd2fd [PATCH] xtensa: Architecture support for Tensilica Xtensa Part 1
The attached patches provides part 1 of an architecture implementation for the
Tensilica Xtensa CPU series.

Signed-off-by: Chris Zankel <chris@zankel.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:05:21 -07:00
Andrea Arcangeli 7286aa9b9a [PATCH] ppc64: fix seccomp with 32-bit userland
The seccomp check has to happen when entering the syscall and not when
exiting it or regs->gpr[0] contains garabge during signal handling in
ppc64_rt_sigreturn (this actually might be a bug too, but an orthogonal
one, since we really have to run the check before invoking the syscall and
not after it).

Signed-off-by: Andrea Arcangeli <andrea@cpushare.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:05:18 -07:00
Linus Torvalds adb7ee3746 Merge master.kernel.org:/home/rmk/linux-2.6-arm 2005-06-23 17:19:56 -07:00
Ben Dooks 691027b91b [PATCH] ARM: 2730/1: S3C2410 default configuration update
Patch from Ben Dooks

Add support for the DM9000 and bring default configuration
up-to-date with the latest 2.6.12 kernel release

Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-06-23 21:56:48 +01:00
Ben Dooks d97a666f36 [PATCH] ARM: 2729/1: DM9000 platform support for S3C2410 machines (BAST, VR1000)
Patch from Ben Dooks

Add platform_device information for DM9000 chip(s) on the
Simtec BAST and the VR1000 board.

Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-06-23 21:56:47 +01:00
Nicolas Pitre c1241c4c3a [PATCH] ARM: 2722/1: remove reliance on udivdi3 for nwfpe
Patch from Nicolas Pitre

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-06-23 21:56:46 +01:00
Linus Torvalds 24665cd00d Merge rsync://rsync.kernel.org/pub/scm/linux/kernel/git/paulus/ppc64-2.6 2005-06-23 09:49:55 -07:00
Christoph Hellwig 9a59f452ab [PATCH] remove <linux/xattr_acl.h>
This file duplicates <linux/posix_acl_xattr.h>, using slightly different
names.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:33 -07:00
William Lee Irwin III 30aaa80885 [PATCH] use drivers/Kconfig for sparc32
Kconfig is spitting out massive numbers of errors and so on.  This patch
switches arch/sparc/Kconfig to use drivers/Kconfig so those stop.

Signed-off-by: William Irwin <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:33 -07:00
Stephen Rothwell 0d77e5a2c2 [PATCH] compat: introduce compat_time_t
This patch is based on work by Carlos O'Donell and Matthew Wilcox.  It
introduces/updates the compat_time_t type and uses it for compat siginfo
structures.  I have built this on ppc64 and x86_64.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:32 -07:00
Anil S Keshavamurthy 852caccc89 [PATCH] Kprobes/ia64: temporary disarming of reentrant probe
This patch includes IA64 architecture specific changes(ported form i386) to
support temporary disarming on reentrancy of probes.

In case of reentrancy we single step without calling user handler.

Signed-of-by: Anil S Keshavamurth <anil.s.keshavamurthy@intel.com>

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:25 -07:00
Prasanna S Panchamukhi e539c23314 [PATCH] kprobes: Temporary disarming of reentrant probe for sparc64
This patch includes sparc64 architecture specific changes to support temporary
disarming on reentrancy of probes.

Signed-of-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:25 -07:00
Prasanna S Panchamukhi 42cc20600a [PATCH] kprobes: Temporary disarming of reentrant probe for ppc64
This patch includes ppc64 architecture specific changes to support temporary
disarming on reentrancy of probes.

Signed-of-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:25 -07:00
Prasanna S Panchamukhi aa3d7e3d78 [PATCH] kprobes: Temporary disarming of reentrant probe for x86_64
This patch includes x86_64 architecture specific changes to support temporary
disarming on reentrancy of probes.

Signed-of-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:24 -07:00
Prasanna S Panchamukhi 417c8da651 [PATCH] kprobes: Temporary disarming of reentrant probe for i386
This patch includes i386 architecture specific changes to support temporary
disarming on reentrancy of probes.

Signed-of-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:24 -07:00
Keshavamurthy Anil S 89cb14c0dd [PATCH] Kprobes/IA64: check jprobe break before handling
Once the jprobe instrumented function returns, it executes a jprobe_break
which is a break instruction with __IA64_JPROBE_BREAK value.  The current
patch checks for this break value, before assuming that jprobe instrumented
function just completed.

The previous code was not checking for this value and that was a bug.

Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:24 -07:00
Anil S Keshavamurthy 708de8f11c [PATCH] Kprobes IA64: safe register kprobe
The current kprobes does not yet handle register kprobes on some of the
following kind of instruction which needs to be emulated in a special way.

1) mov r1=ip
2) chk -- Speculation check instruction

This patch attempts to fail register_kprobes() when user tries to insert
kprobes on the above kind of instruction.

Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:24 -07:00
Anil S Keshavamurthy 1674eafcbd [PATCH] Kprobes IA64: cmp ctype unc support
The current Kprobes when patching the original instruction with the break
instruction tries to retain the original qualifying predicate(qp), however
for cmp.crel.ctype where ctype == unc, which is a special instruction
always needs to be executed irrespective of qp.  Hence, if the instruction
we are patching is of this type, then we should not copy the original qp to
the break instruction, this is because we always want the break fault to
happen so that we can emulate the instruction.

This patch is based on the feedback given by David Mosberger

Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:23 -07:00
Anil S Keshavamurthy a5403183d8 [PATCH] Kprobes IA64: arch_prepare_kprobes() cleanup
arch_prepare_kprobes() was doing lots of functionality
in just one single function. This patch
attempts to clean up arch_prepare_kprobes() by moving
specific sub task to the following (new)functions
1)valid_kprobe_addr() -->> validate the given kprobe address
2)get_kprobe_inst(slot..)->> Retrives the instruction for a given slot from the bundle
3)prepare_break_inst() -->> Prepares break instruction within the bundle
	3a)update_kprobe_inst_flag()-->>Updates the internal flags, required
			for proper emulation of the instruction at later
			point in time.

Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:23 -07:00
Rusty Lynch 13608d6433 [PATCH] Kprobes ia64 qp fix
Fix a bug where a kprobe still fires when the instruction is predicated
off.  So given the p6=0, and we have an instruction like:

(p6) move loc1=0

we should not be triggering the kprobe.  This is handled by carrying over
the qp section of the original instruction into the break instruction.

Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Rusty Lynch <Rusty.lynch@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:23 -07:00
Rusty Lynch 8bc76772ad [PATCH] Kprobes ia64 cleanup
A cleanup of the ia64 kprobes implementation such that all of the bundle
manipulation logic is concentrated in arch_prepare_kprobe().

With the current design for kprobes, the arch specific code only has a
chance to return failure inside the arch_prepare_kprobe() function.

This patch moves all of the work that was happening in arch_copy_kprobe()
and most of the work that was happening in arch_arm_kprobe() into
arch_prepare_kprobe().  By doing this we can add further robustness checks
in arch_arm_kprobe() and refuse to insert kprobes that will cause problems.

Signed-off-by: Rusty Lynch <Rusty.lynch@intel.com>
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:23 -07:00
Anil S Keshavamurthy cd2675bf65 [PATCH] Kprobes/IA64: support kprobe on branch/call instructions
This patch is required to support kprobe on branch/call instructions.

Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:23 -07:00
Anil S Keshavamurthy b2761dc262 [PATCH] Kprobes/IA64: architecture specific JProbes support
This patch adds IA64 architecture specific JProbes support on top of Kprobes

Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Rusty Lynch <Rusty.lynch@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:22 -07:00
Anil S Keshavamurthy fd7b231ff9 [PATCH] Kprobes/IA64: arch specific handling
This is an IA64 arch specific handling of Kprobes

Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Rusty Lynch <Rusty.lynch@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:22 -07:00
Anil S Keshavamurthy 7213b25218 [PATCH] Kprobes/IA64: kdebug die notification mechanism
As many of you know that kprobes exist in the main line kernel for various
architecture including i386, x86_64, ppc64 and sparc64.  Attached patches
following this mail are a port of Kprobes and Jprobes for IA64.

I have tesed this patches for kprobes and Jprobes and this seems to work fine.
 I have tested this patch by inserting kprobes on various slots and various
templates including various types of branch instructions.

I have also tested this patch using the tool
http://marc.theaimsgroup.com/?l=linux-kernel&m=111657358022586&w=2 and the
kprobes for IA64 works great.

Here is list of TODO things and pathes for the same will appear soon.

1) Support kprobes on "mov r1=ip" type of instruction
2) Support Kprobes and Jprobes to exist on the same address
3) Support Return probes
3) Architecture independent cleanup of kprobes

This patch adds the kdebug die notification mechanism needed by Kprobes.

For break instruction on Branch type slot, imm21 is ignored and value
zero is placed in IIM register, hence we need to handle kprobes
for switch case zero.

Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Rusty Lynch <Rusty.lynch@intel.com>

From: Rusty Lynch <rusty.lynch@intel.com>

At the point in traps.c where we recieve a break with a zero value, we can
not say if the break was a result of a kprobe or some other debug facility.

This simple patch changes the informational string to a more correct "break
0" value, and applies to the 2.6.12-rc2-mm2 tree with all the kprobes
patches that were just recently included for the next mm cut.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:22 -07:00
Hien Nguyen 0aa55e4d7d [PATCH] kprobes: moves lock-unlock to non-arch kprobe_flush_task
This patch moves the lock/unlock of the arch specific kprobe_flush_task()
to the non-arch specific kprobe_flusk_task().

Signed-off-by: Hien Nguyen <hien@us.ibm.com>
Acked-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:21 -07:00
Rusty Lynch 7e1048b11c [PATCH] Move kprobe [dis]arming into arch specific code
The architecture independent code of the current kprobes implementation is
arming and disarming kprobes at registration time.  The problem is that the
code is assuming that arming and disarming is a just done by a simple write
of some magic value to an address.  This is problematic for ia64 where our
instructions look more like structures, and we can not insert break points
by just doing something like:

*p->addr = BREAKPOINT_INSTRUCTION;

The following patch to 2.6.12-rc4-mm2 adds two new architecture dependent
functions:

     * void arch_arm_kprobe(struct kprobe *p)
     * void arch_disarm_kprobe(struct kprobe *p)

and then adds the new functions for each of the architectures that already
implement kprobes (spar64/ppc64/i386/x86_64).

I thought arch_[dis]arm_kprobe was the most descriptive of what was really
happening, but each of the architectures already had a disarm_kprobe()
function that was really a "disarm and do some other clean-up items as
needed when you stumble across a recursive kprobe." So...  I took the
liberty of changing the code that was calling disarm_kprobe() to call
arch_disarm_kprobe(), and then do the cleanup in the block of code dealing
with the recursive kprobe case.

So far this patch as been tested on i386, x86_64, and ppc64, but still
needs to be tested in sparc64.

Signed-off-by: Rusty Lynch <rusty.lynch@intel.com>
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:21 -07:00
Rusty Lynch 73649dab0f [PATCH] x86_64 specific function return probes
The following patch adds the x86_64 architecture specific implementation
for function return probes.

Function return probes is a mechanism built on top of kprobes that allows
a caller to register a handler to be called when a given function exits.
For example, to instrument the return path of sys_mkdir:

static int sys_mkdir_exit(struct kretprobe_instance *i, struct pt_regs *regs)
{
	printk("sys_mkdir exited\n");
	return 0;
}
static struct kretprobe return_probe = {
	.handler = sys_mkdir_exit,
};

<inside setup function>

return_probe.kp.addr = (kprobe_opcode_t *) kallsyms_lookup_name("sys_mkdir");
if (register_kretprobe(&return_probe)) {
	printk(KERN_DEBUG "Unable to register return probe!\n");
	/* do error path */
}

<inside cleanup function>
unregister_kretprobe(&return_probe);

The way this works is that:

* At system initialization time, kernel/kprobes.c installs a kprobe
  on a function called kretprobe_trampoline() that is implemented in
  the arch/x86_64/kernel/kprobes.c  (More on this later)

* When a return probe is registered using register_kretprobe(),
  kernel/kprobes.c will install a kprobe on the first instruction of the
  targeted function with the pre handler set to arch_prepare_kretprobe()
  which is implemented in arch/x86_64/kernel/kprobes.c.

* arch_prepare_kretprobe() will prepare a kretprobe instance that stores:
  - nodes for hanging this instance in an empty or free list
  - a pointer to the return probe
  - the original return address
  - a pointer to the stack address

  With all this stowed away, arch_prepare_kretprobe() then sets the return
  address for the targeted function to a special trampoline function called
  kretprobe_trampoline() implemented in arch/x86_64/kernel/kprobes.c

* The kprobe completes as normal, with control passing back to the target
  function that executes as normal, and eventually returns to our trampoline
  function.

* Since a kprobe was installed on kretprobe_trampoline() during system
  initialization, control passes back to kprobes via the architecture
  specific function trampoline_probe_handler() which will lookup the
  instance in an hlist maintained by kernel/kprobes.c, and then call
  the handler function.

* When trampoline_probe_handler() is done, the kprobes infrastructure
  single steps the original instruction (in this case just a top), and
  then calls trampoline_post_handler().  trampoline_post_handler() then
  looks up the instance again, puts the instance back on the free list,
  and then makes a long jump back to the original return instruction.

So to recap, to instrument the exit path of a function this implementation
will cause four interruptions:

  - A breakpoint at the very beginning of the function allowing us to
    switch out the return address
  - A single step interruption to execute the original instruction that
    we replaced with the break instruction (normal kprobe flow)
  - A breakpoint in the trampoline function where our instrumented function
    returned to
  - A single step interruption to execute the original instruction that
    we replaced with the break instruction (normal kprobe flow)

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:21 -07:00
Hien Nguyen b94cce926b [PATCH] kprobes: function-return probes
This patch adds function-return probes to kprobes for the i386
architecture.  This enables you to establish a handler to be run when a
function returns.

1. API

Two new functions are added to kprobes:

	int register_kretprobe(struct kretprobe *rp);
	void unregister_kretprobe(struct kretprobe *rp);

2. Registration and unregistration

2.1 Register

  To register a function-return probe, the user populates the following
  fields in a kretprobe object and calls register_kretprobe() with the
  kretprobe address as an argument:

  kp.addr - the function's address

  handler - this function is run after the ret instruction executes, but
  before control returns to the return address in the caller.

  maxactive - The maximum number of instances of the probed function that
  can be active concurrently.  For example, if the function is non-
  recursive and is called with a spinlock or mutex held, maxactive = 1
  should be enough.  If the function is non-recursive and can never
  relinquish the CPU (e.g., via a semaphore or preemption), NR_CPUS should
  be enough.  maxactive is used to determine how many kretprobe_instance
  objects to allocate for this particular probed function.  If maxactive <=
  0, it is set to a default value (if CONFIG_PREEMPT maxactive=max(10, 2 *
  NR_CPUS) else maxactive=NR_CPUS)

  For example:

    struct kretprobe rp;
    rp.kp.addr = /* entrypoint address */
    rp.handler = /*return probe handler */
    rp.maxactive = /* e.g., 1 or NR_CPUS or 0, see the above explanation */
    register_kretprobe(&rp);

  The following field may also be of interest:

  nmissed - Initialized to zero when the function-return probe is
  registered, and incremented every time the probed function is entered but
  there is no kretprobe_instance object available for establishing the
  function-return probe (i.e., because maxactive was set too low).

2.2 Unregister

  To unregiter a function-return probe, the user calls
  unregister_kretprobe() with the same kretprobe object as registered
  previously.  If a probed function is running when the return probe is
  unregistered, the function will return as expected, but the handler won't
  be run.

3. Limitations

3.1 This patch supports only the i386 architecture, but patches for
    x86_64 and ppc64 are anticipated soon.

3.2 Return probes operates by replacing the return address in the stack
    (or in a known register, such as the lr register for ppc).  This may
    cause __builtin_return_address(0), when invoked from the return-probed
    function, to return the address of the return-probes trampoline.

3.3 This implementation uses the "Multiprobes at an address" feature in
    2.6.12-rc3-mm3.

3.4 Due to a limitation in multi-probes, you cannot currently establish
    a return probe and a jprobe on the same function.  A patch to remove
    this limitation is being tested.

This feature is required by SystemTap (http://sourceware.org/systemtap),
and reflects ideas contributed by several SystemTap developers, including
Will Cohen and Ananth Mavinakayanahalli.

Signed-off-by: Hien Nguyen <hien@us.ibm.com>
Signed-off-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Signed-off-by: Frederik Deweerdt <frederik.deweerdt@laposte.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:21 -07:00
Robert Love dfe52244e0 [PATCH] kstrdup: convert a few existing implementations
Convert a bunch of strdup() implementations and their callers to the new
kstrdup().  A few remain, for example see sound/core, and there are tons of
open coded strdup()'s around.  Sigh.  But this is a start.

Signed-off-by: Robert Love <rml@novell.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:18 -07:00
Domen Puncer 15d20bfd60 [PATCH] ptrace_h8300: condition bugfix
Assignment doesn't make much sense here as condition would always be true.

Signed-off-by: Domen Puncer <domen@coderock.org>
Signed-off-by: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:15 -07:00
Vincent Hanquez 76381fee7e [PATCH] xen: x86_64: use more usermode macro
Make use of the user_mode macro where it's possible.  This is useful for Xen
because it will need only to redefine only the macro to a hypervisor call.

Signed-off-by: Vincent Hanquez <vincent.hanquez@cl.cam.ac.uk>
Cc: Ian Pratt <m+Ian.Pratt@cl.cam.ac.uk>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:14 -07:00
Vincent Hanquez e9129e56e9 [PATCH] xen: x86_64: Add macro for debugreg
Add 2 macros to set and get debugreg on x86_64.  This is useful for Xen
because it will need only to redefine each macro to a hypervisor call.

Signed-off-by: Vincent Hanquez <vincent.hanquez@cl.cam.ac.uk>
Cc: Ian Pratt <m+Ian.Pratt@cl.cam.ac.uk>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:14 -07:00
Vincent Hanquez 717b594a41 [PATCH] xen: x86: Use more usermode macro
Use the user_mode macro where it's possible.

Signed-off-by: Vincent Hanquez <vincent.hanquez@cl.cam.ac.uk>
Cc: Ian Pratt <m+Ian.Pratt@cl.cam.ac.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:14 -07:00
Vincent Hanquez fa1e1bdf78 [PATCH] xen: x86: Rename usermode macro
Rename user_mode to user_mode_vm and add a user_mode macro similar to the
x86-64 one.

This is useful for Xen because the linux xen kernel does not runs on the same
priviledge that a vanilla linux kernel, and with this we just need to redefine
user_mode().

Signed-off-by: Vincent Hanquez <vincent.hanquez@cl.cam.ac.uk>
Cc: Ian Pratt <m+Ian.Pratt@cl.cam.ac.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:14 -07:00
Vincent Hanquez 1cc6f12e03 [PATCH] xen: x86: Use new macro for debugreg
Make use of the 2 new macro set_debugreg and get_debugreg.

Signed-off-by: Vincent Hanquez <vincent.hanquez@cl.cam.ac.uk>
Cc: Ian Pratt <m+Ian.Pratt@cl.cam.ac.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:13 -07:00
Natalie Protasevich 701067c466 [PATCH] x86_64: avoid wasting IRQs
I suggest to change the way IRQs are handed out to PCI devices.

Currently, each I/O APIC pin gets associated with an IRQ, no matter if the
pin is used or not.  It is expected that each pin can potentually be
engaged by a device inserted into the corresponding PCI slot.  However,
this imposes severe limitation on systems that have designs that employ
many I/O APICs, only utilizing couple lines of each, such as P64H2 chipset.

It is used in ES7000, and currently, there is no way to boot the system
with more that 9 I/O APICs.

The simple change below allows to boot a system with say 64 (or more) I/O
APICs, each providing 1 slot, which otherwise impossible because of the IRQ
gaps created for unused lines on each I/O APIC.  It does not resolve the
problem with number of devices that exceeds number of possible IRQs, but
eases up a tension for IRQs on any large system with potentually large
number of devices.

I only implemented this for the ACPI boot, since if the system is this big
and using newer chipsets it is probably (better be!) an ACPI based system
:).  The change is completely "mechanical" and does not alter any internal
structures or interrupt model/implementation.  The patch works for both
i386 and x86_64 archs.  It works with MSIs just fine, and should not
intervene with implementations like shared vectors, when they get worked
out and incorporated.

To illustrate, below is the interrupt distribution for 2-cell ES7000 with
20 I/O APICs, and an Ethernet card in the last slot, which should be eth1
and which was not configured because its IRQ exceeded allowable number (it
actially turned out huge - 480!):

zorro-tb2:~ # cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7
  0:      65716      30012      30007      30002      30009      30010      30010      30010    IO-APIC-edge  timer
  4:        373          0        725        280          0          0          0          0    IO-APIC-edge  serial
  8:          0          0          0          0          0          0          0          0    IO-APIC-edge  rtc
  9:          0          0          0          0          0          0          0          0   IO-APIC-level  acpi
 14:         39          3          0          0          0          0          0          0    IO-APIC-edge  ide0
 16:        108         13          0          0          0          0          0          0   IO-APIC-level  uhci_hcd:usb1
 18:          0          0          0          0          0          0          0          0   IO-APIC-level  uhci_hcd:usb3
 19:         15          0          0          0          0          0          0          0   IO-APIC-level  uhci_hcd:usb2
 23:          3          0          0          0          0          0          0          0   IO-APIC-level  ehci_hcd:usb4
 96:       4240        397         18          0          0          0          0          0   IO-APIC-level  aic7xxx
 97:         15          0          0          0          0          0          0          0   IO-APIC-level  aic7xxx
192:        847          0          0          0          0          0          0          0   IO-APIC-level  eth0
NMI:          0          0          0          0          0          0          0          0
LOC:     273423     274528     272829     274228     274092     273761     273827     273694
ERR:          7
MIS:          0

Even though the system doesn't have that many devices, some don't get
enabled only because of IRQ numbering model.

This is the IRQ picture after the patch was applied:

zorro-tb2:~ # cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7
  0:      44169      10004      10004      10001      10004      10003      10004       6135    IO-APIC-edge  timer
  4:        345          0          0          0          0        244          0          0    IO-APIC-edge  serial
  8:          0          0          0          0          0          0          0          0    IO-APIC-edge  rtc
  9:          0          0          0          0          0          0          0          0   IO-APIC-level  acpi
 14:         39          0          3          0          0          0          0          0    IO-APIC-edge  ide0
 17:       4425          0          9          0          0          0          0          0   IO-APIC-level  aic7xxx
 18:         15          0          0          0          0          0          0          0   IO-APIC-level  aic7xxx, uhci_hcd:usb3
 21:        231          0          0          0          0          0          0          0   IO-APIC-level  uhci_hcd:usb1
 22:         26          0          0          0          0          0          0          0   IO-APIC-level  uhci_hcd:usb2
 23:          3          0          0          0          0          0          0          0   IO-APIC-level  ehci_hcd:usb4
 24:        348          0          0          0          0          0          0          0   IO-APIC-level  eth0
 25:          6        192          0          0          0          0          0          0   IO-APIC-level  eth1
NMI:          0          0          0          0          0          0          0          0
LOC:     107981     107636     108899     108698     108489     108326     108331     108254
ERR:          7
MIS:          0

Not only we see the card in the last I/O APIC, but we are not even close to
using up available IRQs, since we didn't waste any.

Signed-off-by: Natalie Protasevich <Natalie.Protasevich@unisys.com>
Acked-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:13 -07:00
Roland McGrath 0928d6ef7f [PATCH] x86_64: never block forced SIGSEGV
This is the x86_64 version of the signal fix I just posted for i386.

This problem was first noticed on PPC and has already been fixed there.
But the exact same issue applies to other platforms in the same way.  The
signal blocking for sa_mask and the handled signal takes place after the
handler setup.  When the stack is bogus, the handler setup forces a
SIGSEGV.  But then this will be blocked, and returning to user mode will
fault again and iterate.  This patch fixes the problem by checking whether
signal handler setup failed, and not doing the signal-blocking if so.  This
copies what was done in the ppc code.  I think all architectures' signal
handler setup code follows this pattern and needs the change.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:13 -07:00
john stultz a3a00751ad [PATCH] x86_64: fix hpet for systems that don't support legacy replacement
Currently the x86-64 HPET code assumes the entire HPET implementation from
the spec is present.  This breaks on boxes that do not implement the
optional legacy timer replacement functionality portion of the spec.

This patch fixes this issue, allowing x86-64 systems that cannot use the
HPET for the timer interrupt and RTC to still use the HPET as a time
source.  I've tested this patch on a system systems without HPET, with HPET
but without legacy timer replacement, as well as HPET with legacy timer
replacement.

This version adds a minor check to cap the HPET counter value in
gettimeoffset_hpet to avoid possible time inconsistencies.  Please ignore
the A2 version I sent to you earlier.

Acked-by: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:12 -07:00
Alexander Nyberg c0a88c9878 [PATCH] x86_64: i8259.c iso99 structure initialization
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:12 -07:00
Andrew Morton c92c6ffdb1 [PATCH] mtrr size-and-base debugging
Consolidate the mtrr sanity checking, add a dump_stack().

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:12 -07:00
Andrew Morton a3a255e744 [PATCH] x86: cpu_khz type fix
x86_64's cpu_khz is unsigned int and there is no reason why x86 needs to use
unsigned long.

So make cpu_khz unsigned int on x86 as well.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:11 -07:00
Alexey Dobriyan 129f69465b [PATCH] Remove i386_ksyms.c, almost.
* EXPORT_SYMBOL's moved to other files
* #include <linux/config.h>, <linux/module.h> where needed
* #include's in i386_ksyms.c cleaned up
* After copy-paste, redundant due to Makefiles rules preprocessor directives
  removed:

	#ifdef CONFIG_FOO
	EXPORT_SYMBOL(foo);
	#endif

	obj-$(CONFIG_FOO) += foo.o

* Tiny reformat to fit in 80 columns

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:11 -07:00
Aleksey Gorelov 80bb82afea [PATCH] VIA 82C586B IRQ routing fix
According to the VIA 82C586B datasheet (still available from
http://gkernel.sourceforge.net/specs/via/586b.pdf.bz2) this chip need a
special PIRQ mapping.

Signed-off-by: Karsten Keil <kkeil@suse.de>
Signed-off-by: Aleksey Gorelov <aleksey_gorelov@phoenix.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:11 -07:00
Natalie Protasevich c434b7a6ae [PATCH] x86: avoid wasting IRQs for PCI devices
I have submitted the patch for x86_64, this is submission for i386.

The patch changes the way IRQs are handed out to PCI devices.  Currently,
each I/O APIC pin gets associated with an IRQ, no matter if the pin is used
or not.  This imposes severe limitation on systems that have designs that
employ many I/O APICs, only utilizing couple lines of each, such as P64H2
chipset.  It is used in ES7000, and currently, there is no way to boot the
system with more that 9 I/O APICs.

The simple change below allows to boot a system with say 64 (or more) I/O
APICs, each providing 1 slot, which otherwise impossible because of the IRQ
gaps created for unused lines on each I/O APIC.  It does not resolve the
problem with number of devices that exceeds number of possible IRQs, but
eases up a tension for IRQs on any large system with potentually large
number of devices.

Signed-off-by: Natalie Protasevich <Natalie.Protasevich@unisys.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:10 -07:00
Christoph Lameter b5d23e5b8c [PATCH] ia64: Selectable Timer Interrupt Frequency
It allows a selectable timer interrupt frequency of 100, 250 and 1000 HZ.
Reducing the timer frequency may have important performance benefits on
large systems.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:10 -07:00
Christoph Lameter 5912100372 [PATCH] i386: Selectable Frequency of the Timer Interrupt
Make the timer frequency selectable. The timer interrupt may cause bus
and memory contention in large NUMA systems since the interrupt occurs
on each processor HZ times per second.

Signed-off-by: Christoph Lameter <christoph@lameter.com>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:10 -07:00
Jan Beulich 799d19f6ec [PATCH] allow early printk to use more than 25 lines
Allow early printk code to take advantage of the full size of the screen, not
just the first 25 lines.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Acked-by: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:10 -07:00
Jan Beulich 7fbb4f6e68 [PATCH] adjust i386 watchdog tick calculation
Get the i386 watchdog tick calculation into a state where it can also be used
on CPUs with frequencies beyond 4GHz, and it consolidates the calculation into
a single place (for potential furture adjustments).

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:09 -07:00
Natalie Protasevich ca05fea6db [PATCH] Do not enforce unique IO_APIC_ID check for xAPIC systems (i386)
This patch is per Andi's request to remove NO_IOAPIC_CHECK from genapic and
use heuristics to prevent unique I/O APIC ID check for systems that don't
need it.  The patch disables unique I/O APIC ID check for Xeon-based and
other platforms that don't use serial APIC bus for interrupt delivery.
Andi stated that AMD systems don't need unique IO_APIC_IDs either.

Signed-off-by: Natalie Protasevich <Natalie.Protasevich@unisys.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:09 -07:00
Roland McGrath 7c1def1652 [PATCH] i386: never block forced SIGSEGV
This problem was first noticed on PPC and has already been fixed there.
But the exact same issue applies to other platforms in the same way.  The
signal blocking for sa_mask and the handled signal takes place after the
handler setup.  When the stack is bogus, the handler setup forces a
SIGSEGV.  But then this will be blocked, and returning to user mode will
fault again and iterate.  This patch fixes the problem by checking whether
signal handler setup failed, and not doing the signal-blocking if so.  This
copies what was done in the ppc code.  I think all architectures' signal
handler setup code follows this pattern and needs the change.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:09 -07:00
Christoph Lameter 8c5a09082f [PATCH] x86/x86_64: pcibus_to_node
Define pcibus_to_node to be able to figure out which NUMA node contains a
given PCI device.  This defines pcibus_to_node(bus) in
include/linux/topology.h and adjusts the macros for i386 and x86_64 that
already provided a way to determine the cpumask of a pci device.

x86_64 was changed to not build an array of cpumasks anymore.  Instead an
array of nodes is build which can be used to generate the cpumask via
node_to_cpumask.

Signed-off-by: Christoph Lameter <christoph@lameter.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:08 -07:00
Venkatesh Pallipadi 8a9e1b0f56 [PATCH] Platform SMIs and their interferance with tsc based delay calibration
Issue:
Current tsc based delay_calibration can result in significant errors in
loops_per_jiffy count when the platform events like SMIs
(System Management Interrupts that are non-maskable) are present. This could
lead to potential kernel panic(). This issue is becoming more visible with 2.6
kernel (as default HZ is 1000) and on platforms with higher SMI handling
latencies. During the boot time, SMIs are mostly used by BIOS (for things
like legacy keyboard emulation).

Description:
The psuedocode for current delay calibration with tsc based delay looks like
(0) Estimate a value for loops_per_jiffy
(1) While (loops_per_jiffy estimate is accurate enough)
(2)   wait for jiffy transition (jiffy1)
(3)   Note down current tsc (tsc1)
(4)   loop until tsc becomes tsc1 + loops_per_jiffy
(5)   check whether jiffy changed since jiffy1 or not and refine
loops_per_jiffy estimate

Consider the following cases
Case 1:
If SMIs happen between (2) and (3) above, we can end up with a
loops_per_jiffy value that is too low. This results in shorted delays and
kernel can panic () during boot (Mostly at IOAPIC timer initialization
timer_irq_works() as we don't have enough timer interrupts in a specified
interval).

Case 2:
If SMIs happen between (3) and (4) above, then we can end up with a
loops_per_jiffy value that is too high. And with current i386 code, too
high lpj value (greater than 17M) can result in a overflow in
delay.c:__const_udelay() again resulting in shorter delay and panic().

Solution:
The patch below makes the calibration routine aware of asynchronous events
like SMIs. We increase the delay calibration time and also identify any
significant errors (greater than 12.5%) in the calibration and notify it to
user.

Patch below changes both i386 and x86-64 architectures to use this
new and improved calibrate_delay_direct() routine.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:08 -07:00
Ian Campbell 0f8e2d62fa [PATCH] use ${CROSS_COMPILE}installkernel in arch/*/boot/install.sh
The attached patch causes the various arch specific install.sh scripts to
look for ${CROSS_COMPILE}installkernel rather than just installkernel (in
both /sbin/ and ~/bin/ where the script already did this).  This allows you
to have e.g.  arm-linux-installkernel as a handy way to install on your
cross target.  It also prevents the script picking up on the host
/sbin/installkernel which causes the script to fall through and do the
install itself (which is what I actually use myself, with $INSTALL_PATH
set).

I don't believe it causes back-compatibility problems since calling the
host installkernel was never likely to work or be what you wanted when
cross compiling anyway.  If $CROSS_COMPILE isn't set then nothing changes.

I only use ARM and i386 myself but I figured it couldn't hurt to do the
whole lot.  I've cc'd those who I hope are the arch maintainers for files
that I've touched.

Signed-off-by: Ian Campbell <icampbell@arcom.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:07 -07:00
H. Peter Anvin d0e7feb03d [PATCH] biarch compiler support for i386
This allows the i386 architecture to be built on a system with a biarch
compiler that defaults to x86-64, merely by specifying ARCH=i386.

As previously discussed, this uses the equivalent logic to the ppc port.

Signed-Off-By: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:07 -07:00
Martin J. Bligh 6f4e1e5061 [PATCH] add page_state info to show_mem
This helps a lot when debugging out of memory stuff - useful especially to
see if all the memory is sucked into slab, etc.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:07 -07:00
Matt Tolentino bbfceef47f [PATCH] add x86-64 specific support for sparsemem
This patch adds in the necessary support for sparsemem such that x86-64
kernels may use sparsemem as an alternative to discontigmem for NUMA
kernels.  Note that this does no preclude one from continuing to build NUMA
kernels using discontigmem, but merely allows the option to build NUMA
kernels with sparsemem.

Interestingly, the use of sparsemem in lieu of discontigmem in NUMA kernels
results in reduced text size for otherwise equivalent kernels as shown in
the example builds below:

   text	   data	    bss	    dec	    hex	filename
2371036	 765884	1237108	4374028	 42be0c	vmlinux.discontig
2366549	 776484	1302772	4445805	 43d66d	vmlinux.sparse

Signed-off-by: Matt Tolentino <matthew.e.tolentino@intel.com>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:07 -07:00
Matt Tolentino 2b97690f4c [PATCH] reorganize x86-64 NUMA and DISCONTIGMEM config options
In order to use the alternative sparsemem implmentation for NUMA kernels,
we need to reorganize the config options.  This patch effectively abstracts
out the CONFIG_DISCONTIGMEM options to CONFIG_NUMA in most cases.  Thus,
the discontigmem implementation may be employed as always, but the
sparsemem implementation may be used alternatively.

Signed-off-by: Matt Tolentino <matthew.e.tolentino@intel.com>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:06 -07:00
Matt Tolentino 1035faf1b1 [PATCH] add x86-64 Kconfig options for sparsemem
Add the requisite arch specific Kconfig options to enable the use of the
sparsemem implementation for NUMA kernels on x86-64.

Signed-off-by: Matt Tolentino <matthew.e.tolentino@intel.com>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:06 -07:00
Matt Tolentino 073326634b [PATCH] remove direct ref to contig_page_data for x86-64
This patch pulls out all remaining direct references to contig_page_data
from arch/x86-64, thus saving an ifdef in one case.

Signed-off-by: Matt Tolentino <matthew.e.tolentino@intel.com>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:06 -07:00
Andy Whitcroft 145e664231 [PATCH] ppc64: sparsemem memory model
Provide the architecture specific implementation for SPARSEMEM for PPC64
systems.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Mike Kravetz <kravetz@us.ibm.com> (in part)
Signed-off-by: Martin Bligh <mbligh@aracnet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:06 -07:00
Andy Whitcroft 74b30be2e1 [PATCH] ppc64: add memory present
Provide hooks for PPC64 to allow memory models to be informed of installed
memory areas.  This allows SPARSEMEM to instantiate mem_map for the populated
areas.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Martin Bligh <mbligh@aracnet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:05 -07:00
Andy Whitcroft 510f8fa7ba [PATCH] ppc64: add early_pfn_to_nid
Provide an implementation of early_pfn_to_nid for PPC64.  This is used by
memory models to determine the node from which to take allocations before the
memory allocators are fully initialised.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Martin Bligh <mbligh@aracnet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:05 -07:00
Andy Whitcroft 641c767389 [PATCH] sparsemem swiss cheese numa layouts
The part of the sparsemem patch which modifies memmap_init_zone() has recently
become a problem.  It changes behavior so that there is a call to
pfn_to_page() for each individual page inside of a node's range:
node_start_pfn through node_end_pfn.  It used to simply do this once, at the
beginning of the node, but having sparsemem's non-contiguous mem_map[]s inside
of a node made it necessary to change.

Mike Kravetz recently wrote a patch which made the NUMA code accept some new
kinds of layouts.  The system's memory was laid out like this, with node 0's
memory in two pieces: one before and one after node 1's memory:

	Node 0: +++++     +++++
	Node 1:      +++++

Previous behavior before Mike's patch was to assign nodes like this:

	Node 0: 00000     XXXXX
	Node 1:      11111

Where the 'X' areas were simply thrown away.  The new behavior was to make the
pg_data_t span node 0 across all of its areas, including areas that are really
node 1's: Node 0: 000000000000000 Node 1: 11111

This wastes a little bit of mem_map space, but ends up being OK, and more
fully utilizes the system's memory.  memmap_init_zone() initializes all of the
"struct page"s for node 0, even for the "hole", but those never get used,
because there is no pfn_to_page() that resolves to those pages.  However, only
calling pfn_to_page() once, memmap_init_zone() always uses the pages that were
allocated for node0->node_mem_map because:

	struct page *start = pfn_to_page(start_pfn);
	// effectively start = &node->node_mem_map[0]
	for (page = start; page < (start + size); page++) {
		init_page_here();...
		page++;
	}

Slow, and wasteful, but generally harmless.

But, modify that to call pfn_to_page() for each loop iteration (like sparsemem
does):

	for (pfn = start_pfn; pfn < < (start_pfn + size); pfn++++) {
		page = pfn_to_page(pfn);
	}

And you end up trying to initialize node 1's pages too early, along with bogus
data from node 0.  This patch checks for those weird layouts and declines to
touch the pages, making the more frequent pfn_to_page() calls OK to do.

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:05 -07:00
Andy Whitcroft 05b79bdcb4 [PATCH] sparsemem memory model for i386
Provide the architecture specific implementation for SPARSEMEM for i386 SMP
and NUMA systems.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Martin Bligh <mbligh@aracnet.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:05 -07:00
Andy Whitcroft d41dee369b [PATCH] sparsemem memory model
Sparsemem abstracts the use of discontiguous mem_maps[].  This kind of
mem_map[] is needed by discontiguous memory machines (like in the old
CONFIG_DISCONTIGMEM case) as well as memory hotplug systems.  Sparsemem
replaces DISCONTIGMEM when enabled, and it is hoped that it can eventually
become a complete replacement.

A significant advantage over DISCONTIGMEM is that it's completely separated
from CONFIG_NUMA.  When producing this patch, it became apparent in that NUMA
and DISCONTIG are often confused.

Another advantage is that sparse doesn't require each NUMA node's ranges to be
contiguous.  It can handle overlapping ranges between nodes with no problems,
where DISCONTIGMEM currently throws away that memory.

Sparsemem uses an array to provide different pfn_to_page() translations for
each SECTION_SIZE area of physical memory.  This is what allows the mem_map[]
to be chopped up.

In order to do quick pfn_to_page() operations, the section number of the page
is encoded in page->flags.  Part of the sparsemem infrastructure enables
sharing of these bits more dynamically (at compile-time) between the
page_zone() and sparsemem operations.  However, on 32-bit architectures, the
number of bits is quite limited, and may require growing the size of the
page->flags type in certain conditions.  Several things might force this to
occur: a decrease in the SECTION_SIZE (if you want to hotplug smaller areas of
memory), an increase in the physical address space, or an increase in the
number of used page->flags.

One thing to note is that, once sparsemem is present, the NUMA node
information no longer needs to be stored in the page->flags.  It might provide
speed increases on certain platforms and will be stored there if there is
room.  But, if out of room, an alternate (theoretically slower) mechanism is
used.

This patch introduces CONFIG_FLATMEM.  It is used in almost all cases where
there used to be an #ifndef DISCONTIG, because SPARSEMEM and DISCONTIGMEM
often have to compile out the same areas of code.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Martin Bligh <mbligh@aracnet.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: Bob Picco <bob.picco@hp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:04 -07:00
Andy Whitcroft af705362ab [PATCH] generify memory present
Allow architectures to indicate that they will be providing hooks to indice
installed memory areas, memory_present().  Provide prototypes for the i386
implementation.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Martin Bligh <mbligh@aracnet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:04 -07:00
Andy Whitcroft b159d43fbf [PATCH] generify early_pfn_to_nid
Provide a default implementation for early_pfn_to_nid returning node 0.  Allow
architectures to override this with their own implementation out of
asm/mmzone.h.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Martin Bligh <mbligh@aracnet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:04 -07:00
Mike Kravetz 368a0a3afa [PATCH] ppc64: Kconfig memory models
This patch changes some of the default behavior in the ppc64 Kconfig file
that was recently changed/added to 2.6.12-rc2-mm1 by Dave Hansen in
preparation for SPARSEMEM.  Patch allows the display of both FLAT and
DISCONTIG models on pseries.  As before, default is DISCONTIG for SMP and
PSERIES and FLAT for others.

Signed-off-by: Mike Kravetz <kravetz@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:04 -07:00
Dave Hansen 074ccf8016 [PATCH] mm/Kconfig: kill unused ARCH_FLATMEM_DISABLE
This used to be used to disable FLATMEM selection, but I decided to change it
to be done generically when DISCONTIG is enabled.  The option is unused, so
this kills it.

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:03 -07:00