Commit graph

63981 commits

Author SHA1 Message Date
Andreas Schwab 6cba986298 [IA64] Use atomic64_read to read an atomic64_t.
The routines ia64_atomic64_{add,sub} mistakenly use
atomic_read() to grab the old value instead of using
atomic64_read().

Signed-off-by: Andreas Schwab <schwab@suse.de>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-08-13 10:21:04 -07:00
Dimitri Sivanich 71416bea5a [IA64] disable irq's and check need_resched before safe_halt
While sending interrupts to a cpu to repeatedly wake a thread, on occasion
that thread will take a full timer tick cycle (4002 usec in my case)
to wakeup.

The problem concerns a race condition in the code around the safe_halt()
call in the default_idle() routine.  Setting 'nohalt' on the kernel
command line causes the long wakeups to disappear.

void
default_idle (void)
{
        local_irq_enable();
        while (!need_resched()) {
-->             if (can_do_pal_halt)
-->                     safe_halt();
                else

A timer tick could arrive between the check for !need_resched and the
actual call to safe_halt() (which does a pal call to PAL_HALT_LIGHT).
By the time the timer tick completes, a thread that might now need to run
could get held up for as long as a timer tick waiting for the halted cpu.

I'm proposing that we disable irq's and check need_resched again before
calling safe_halt().  Does anyone see any problem with this approach?

Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-08-13 10:17:23 -07:00
Linus Torvalds 39d3520c92 Linux 2.6.23-rc3 2007-08-12 21:25:24 -07:00
Linus Torvalds 738ddd3039 Merge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched
* git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched:
  sched: run_rebalance_domains: s/SCHED_IDLE/CPU_IDLE/
  sched: fix sleeper bonus
  sched: make global code static
2007-08-12 11:06:45 -07:00
Thomas Gleixner cc75b92d11 genirq: mark io_apic level interrupts to avoid resend
Level type interrupts do not need to be resent.  It was also found that
some chipsets get confused in case of the resend.

Mark the ioapic level type interrupts as such to avoid the resend
functionality in the generic irq code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-12 11:05:45 -07:00
Thomas Gleixner 2464286ace genirq: suppress resend of level interrupts
Level type interrupts are resent by the interrupt hardware when they are
still active at irq_enable().

Suppress the resend mechanism for interrupts marked as level.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-12 11:05:45 -07:00
Thomas Gleixner 496634217e genirq: cleanup mismerge artifact
Commit 5a43a066b1: "genirq: Allow fasteoi
handler to retrigger disabled interrupts" was erroneously applied to
handle_level_irq().  This added the irq retrigger / resend functionality
to the level irq handler.

Revert the offending bits.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-12 11:05:45 -07:00
Oleg Nesterov de0cf899bb sched: run_rebalance_domains: s/SCHED_IDLE/CPU_IDLE/
rebalance_domains(SCHED_IDLE) looks strange (typo), change it to CPU_IDLE.

the effect of this bug was slightly more agressive idle-balancing on
SMP than intended.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-12 18:08:19 +02:00
Ingo Molnar 5d2b3d3695 sched: fix sleeper bonus
Peter Ziljstra noticed that the sleeper bonus deduction code
was not properly rate-limited: a task that scheduled more
frequently would get a disproportionately large deduction.
So limit the deduction to delta_exec.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-12 18:08:19 +02:00
Adrian Bunk 6707de00fd sched: make global code static
This patch makes the following needlessly global code static:

- arch_reinit_sched_domains()
- struct attr_sched_mc_power_savings
- struct attr_sched_smt_power_savings

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-08-12 18:08:19 +02:00
Linus Torvalds 963c6527e0 Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (28 commits)
  ACPI: thermal: add DMI hooks to handle AOpen's broken Award BIOS
  ACPI: thermal: create "thermal.act=" to disable or override active trip point
  ACPI: thermal: create "thermal.nocrt" to disable critical actions
  ACPI: thermal: create "thermal.psv=" to override passive trip points
  ACPI: thermal: expose "thermal.tzp=" to set global polling frequency
  ACPI: thermal: create "thermal.off=1" to disable ACPI thermal support
  ACPI: thinkpad-acpi: fix sysfs paths in documentation
  ACPI: static
  ACPI EC: remove potential deadlock from EC
  ACPI: dock: Send key=value pair instead of plain value
  ACPI: bay: send envp with uevent - fix
  acpi-cpufreq: Fix some x86/x86-64 acpi-cpufreq driver issues
  ACPI: fix "Time Problems with 2.6.23-rc1-gf695baf2"
  ACPI: thinkpad-acpi: change thinkpad-acpi input default and kconfig help
  ACPI: EC: fix run-together printk lines
  ACPI: sbs: remove dead code
  ACPI: EC: acpi_ec_remove(): fix use-after-free
  ACPI: EC: Switch from boot_ec as soon as we find its desc in DSDT.
  ACPI: EC: fix build warning
  ACPI: EC: If ECDT is not found, look up EC in DSDT.
  ...
2007-08-12 02:58:23 -07:00
Linus Torvalds c1502e2834 i386: Fix broken mmiocfg accesses
Commit 3320ad994a broke mmio config space
accesses totally on i386 - it dropped the "reg" offset to the address.

Cc: dean gaudet <dean@arctic.org>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-12 02:23:16 -07:00
Petr Vandrovec b8d3f2448b Do not replace whole memcpy in apply alternatives
apply_alternatives uses memcpy() to apply alternatives.  Which has the
unfortunate effect that while applying memcpy alternative to memcpy
itself it tries to overwrite itself with nops - which causes #UD fault
as it overwrites half of an instruction in copy loop, and from this
point on only possible outcome is triplefault and reboot.

So let's overwrite only first two instructions of memcpy - as long as
the main memcpy loop is not in first two bytes it will work fine.

Signed-off-by: Petr Vandrovec <petr@vandrovec.name>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-12 01:42:37 -07:00
Len Brown 4e54e9f442 Pull sbs into release branch 2007-08-12 00:21:22 -04:00
Len Brown 27196c30db Pull processor into release branch 2007-08-12 00:21:08 -04:00
Len Brown ad17b209dc Pull fluff into release branch 2007-08-12 00:20:59 -04:00
Len Brown d88da66f93 Pull ec into release branch 2007-08-12 00:20:41 -04:00
Len Brown 6712a4fbb1 Pull dock-bay into release branch 2007-08-12 00:20:33 -04:00
Len Brown d8dd3cbcf1 Pull bugzilla-8842 into release branch 2007-08-12 00:19:23 -04:00
Len Brown fc0dc4d3aa Pull bugzilla-8768 into release branch 2007-08-12 00:18:11 -04:00
Len Brown 53fdc5185c Pull bugzilla-3774 into release branch 2007-08-12 00:17:59 -04:00
Len Brown 3b6919e536 pull asus sony thinkpad into release branch 2007-08-12 00:17:12 -04:00
Len Brown 0b5bfa1cbe ACPI: thermal: add DMI hooks to handle AOpen's broken Award BIOS
Use DMI to:
1. enable polling (BIOS thermal events are broken)
2. disable active trip points (BIOS fan control is broken)
3. disable passive trip point (BIOS hard-codes it too low)

The actual temperature reading does work,
and with the aid of polling, the critical
trip point should work too.

http://bugzilla.kernel.org/show_bug.cgi?id=8842

Signed-off-by: Len Brown <len.brown@intel.com>
2007-08-12 00:13:02 -04:00
Len Brown f8707ec964 ACPI: thermal: create "thermal.act=" to disable or override active trip point
thermal.act=-1 disables all active trip points
in all ACPI thermal zones.

thermal.act=C, where C > 0, overrides all lowest temperature
active trip points in all thermal zones to C degrees Celsius.
Raising this trip-point may allow you to keep your system silent
up to a higher temperature.  However, it will not allow you to
raise the lowest temperature trip point above the next higher
trip point (if there is one).  Lowering this trip point may
kick in the fan sooner.

Note that overriding this trip-point will disable any BIOS attempts
to implement hysteresis around the lowest temperature trip point.
This may result in the fan starting and stopping frequently
if temperature frequently crosses C.

WARNING: raising trip points above the manufacturer's defaults
may cause the system to run at higher temperature and shorten
its life.

Signed-off-by: Len Brown <len.brown@intel.com>
2007-08-12 00:12:54 -04:00
Len Brown f548714561 ACPI: thermal: create "thermal.nocrt" to disable critical actions
thermal.nocrt=1 disables actions on _CRT and _HOT
ACPI thermal zone trip-points.  They will be marked
as <disabled> in /proc/acpi/thermal_zone/*/trip_points.

There are two cases where this option is used:

1. Debugging a hot system crossing valid trip point.

   If your system fan is spinning at full speed,
   be sure that the vent is not clogged with dust.
   Many laptops have very fine thermal fins that are easily blocked.

   Check that the processor fan-sink is properly seated,
   has the proper thermal grease, and is really spinning.

   Check for fan related options in BIOS SETUP.
   Sometimes there is a performance vs quiet option.
   Defaults are generally the most conservative.

   If your fan is not spinning, yet /proc/acpi/fan/
   has files in it, please file a Linux/ACPI bug.

   WARNING: you risk shortening the lifetime of your
   hardware if you use this parameter on a hot system.
   Note that this refers to all system components,
   including the disk drive.

2. Working around a cool system crossing critical
   trip point due to erroneous temperature reading.

   Try again with CONFIG_HWMON=n
   There is known potential for conflict between the
   the hwmon sub-system and the ACPI BIOS.
   If this fixes it, notify lm-sensors@lm-sensors.org
   and linux-acpi@vger.kernel.org

   Otherwise, file a Linux/ACPI bug, or notify
   just linux-acpi@vger.kernel.org.

Signed-off-by: Len Brown <len.brown@intel.com>
2007-08-12 00:12:44 -04:00
Len Brown a70cdc5200 ACPI: thermal: create "thermal.psv=" to override passive trip points
"thermal.psv=-1" disables passive trip points
for all ACPI thermal zones.

"thermal.psv=C", where 'C' is degrees Celsius,
overrides all existing passive trip points
for all ACPI thermal zones.

thermal.psv is checked at module load time,
and in response to trip-point change events.

Note that if the system does not deliver thermal zone
temperature change events near the new trip-point,
then it will not be noticed.  To force your custom
trip point to be noticed, you may need to enable polling:
eg. thermal.tzp=3000 invokes polling every 5 minutes.

Note that once passive thermal throttling is invoked,
it has its own internal Thermal Sampling Period (_TSP),
that is unrelated to _TZP.

WARNING: disabling or raising a thermal trip point
may result in increased running temperature and
shorter hardware lifetime on some systems.

Signed-off-by: Len Brown <len.brown@intel.com>
2007-08-12 00:12:35 -04:00
Len Brown 730ff34de7 ACPI: thermal: expose "thermal.tzp=" to set global polling frequency
Thermal Zone Polling frequency (_TZP) is an optional ACPI object
recommending the rate that the OS should poll the associated thermal zone.

If _TZP is 0, no polling should be used.
If _TZP is non-zero, then the platform recommends that
the OS poll the thermal zone at the specified rate.
The minimum period is 30 seconds.
The maximum period is 5 minutes.

(note _TZP and thermal.tzp units are in deci-seconds,
 so _TZP = 300 corresponds to 30 seconds)

If _TZP is not present, ACPI 3.0b recommends that the
thermal zone be polled at an "OS provided default frequency".

However, common industry practice is:
1. The BIOS never specifies any _TZP
2. High volume OS's from this century never poll any thermal zones

Ie. The OS depends on the platform's ability to
provoke thermal events when necessary, and
the "OS provided default frequency" is "never":-)

There is a proposal that ACPI 4.0 be updated to reflect
common industry practice -- ie. no _TZP, no polling.

The Linux kernel already follows this practice --
thermal zones are not polled unless _TZP is present and non-zero.

But thermal zone polling is useful as a workaround for systems
which have ACPI thermal control, but have an issue preventing
thermal events.  Indeed, some Linux distributions still
set a non-zero thermal polling frequency for this reason.

But rather than ask the user to write a polling frequency
into all the /proc/acpi/thermal_zone/*/polling_frequency
files, here we simply document and expose the already
existing module parameter to do the same at system level,
to simplify debugging those broken platforms.

Note that thermal.tzp is a module-load time parameter only.

Signed-off-by: Len Brown <len.brown@intel.com>
2007-08-12 00:12:26 -04:00
Len Brown 72b33ef8bb ACPI: thermal: create "thermal.off=1" to disable ACPI thermal support
"thermal.off=1" disables all ACPI thermal support at boot time.

CONFIG_ACPI_THERMAL=n can do this at build time.
"# rmmod thermal" can do this at run time,
as long as thermal is built as a module.

WARNING: On some systems, disabling ACPI thermal support
will cause the system to run hotter and reduce the
lifetime of the hardware.

Signed-off-by: Len Brown <len.brown@intel.com>
2007-08-12 00:12:17 -04:00
Henrique de Moraes Holschuh 9de1cc4a17 ACPI: thinkpad-acpi: fix sysfs paths in documentation
The documentation used "thinkpad-acpi" to refer to the directories in
sysfs, while it should have been using "thinkpad_acpi".  Thanks to Hugh
Dickins for the error report.

I wish I could just call the module and everything else by the proper
name with the "-", instead of using these ugly translations to "_".

Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2007-08-11 23:54:35 -04:00
Adrian Bunk e13d874732 ACPI: static
Make the needlessly global "acpi_event_seqnum" static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
2007-08-11 22:28:34 -04:00
Alexey Starikovskiy 199e9e7d11 ACPI EC: remove potential deadlock from EC
Signed-off-by: Alexey Starikovskiy <astarikovskiy@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
2007-08-11 22:26:24 -04:00
Holger Macht 66b568218a ACPI: dock: Send key=value pair instead of plain value
Send key=value pair along with the uevent instead of a plain value so that
userspace (udev) can handle it like common environment variables.

Signed-off-by: Holger Macht <hmacht@suse.de>
Acked-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Cc: Stephan Berberig <s.berberig@arcor.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Len Brown <len.brown@intel.com>
2007-08-11 22:12:10 -04:00
Stephan Berberig 7aa763cb56 ACPI: bay: send envp with uevent - fix
There must not be a new-line character in the uevent.  Otherwise, udev gets
confused.  Thanks to Kay Sievers for pointing it out.

Signed-off-by: Stephan Berberig <s.berberig@arcor.de>
Cc: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Len Brown <len.brown@intel.com>
2007-08-11 22:10:04 -04:00
Linus Torvalds 3864e8ccbb Merge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6
* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6:
  [S390] monwriter: Serialization bug for multithreaded applications.
  [S390] vmur: diag14 only works with buffers below 2GB
  [S390] vmur: add "top of queue" sanity check for reader open
  [S390] vmur: reject open on z/VM reader files with status HOLD
  [S390] vmur: use DECLARE_COMPLETION_ONSTACK to keep lockdep happy
  [S390] vmur: allocate single record buffers instead of one big data buffer
  [S390] remove DEFAULT_MIGRATION_COST
  [S390] qdio: make sure data structures are correctly aligned.
  [S390] hypfs: implement show_options
  [S390] cio: avoid memory leak on error in css_alloc_subchannel().
2007-08-11 16:18:58 -07:00
Linus Torvalds 75ecb1a4d1 Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
  [POWERPC] Fix size check for hugetlbfs
  [POWERPC] Fix initialization and usage of dma_mask
  [POWERPC] Fix more section mismatches in head_64.S
  [POWERPC] Revert "[POWERPC] Add 'mdio' to bus scan id list for platforms with QE UEC"
  [POWERPC] PS3: Update ps3_defconfig
  [POWERPC] PS3: Remove text saying PS3 support is incomplete
  [POWERPC] PS3: Fix storage probe logic
  [POWERPC] cell: Move SPU affinity init to spu_management_of_ops
  [POWERPC] Fix potential duplicate entry in SLB shadow buffer
2007-08-11 16:09:49 -07:00
Linus Torvalds 73819b2d26 Merge branch 'async-tx-fixes-for-linus' of git://lost.foo-projects.org/~dwillia2/git/iop
* 'async-tx-fixes-for-linus' of git://lost.foo-projects.org/~dwillia2/git/iop:
  async_tx: update MAINTAINERS for async_tx and iop-adma
2007-08-11 16:03:27 -07:00
Linus Torvalds 886c818348 Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2:
  ocfs2: set non-default s_time_gran during mount
  ocfs2: Retry sendpage() if it returns EAGAIN
  ocfs2: Fix rename/extend race
  [2.6 patch] ocfs2_insert_extent(): remove dead code
  ocfs2: Fix max offset calculations
  ocfs2: check ia_size limits in setattr
  ocfs2: Fix some casting errors related to file writes
  ocfs2: use s_maxbytes directly in ocfs2_change_file_space()
  ocfs2: Restrict inode changes in ocfs2_update_inode_atime()
2007-08-11 16:01:34 -07:00
Linus Torvalds dc8a7b11aa Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  BLOCK: Hide the contents of linux/bio.h if CONFIG_BLOCK=n
  sysace: HDIO_GETGEO has it's own method for ages
  drivers/block/cpqarray.c: better error handling and kmalloc + memset conversion to k[cz]alloc
  drivers/block/cciss.c: kmalloc + memset conversion to kzalloc
  Clean up duplicate includes in drivers/block/
  Fix remap handling by blktrace
  [PATCH] remove mm/filemap.c:file_send_actor()
2007-08-11 16:01:06 -07:00
Linus Torvalds d291676ce8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched
* git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched:
  sched debug: dont print kernel address in /proc/sched_debug
  sched: fix typo in the FAIR_GROUP_SCHED branch
  sched: improve rq-clock overflow logic
2007-08-11 15:58:37 -07:00
Chuck Ebbert 3dab307e52 i386: Fix double fault handler
The new percpu code has apparently broken the doublefault handler
when CONFIG_DEBUG_SPINLOCK is set. Doublefault is handled by
a hardware task, making the check

        SPIN_BUG_ON(lock->owner == current, lock, "recursion");

fault because it uses the FS register to access the percpu data
for current, and that register is zero in the new TSS. (The trace
I saw was on 2.6.20 where it was GS, but it looks like this will
still happen with FS on 2.6.22.)

Initializing FS in the doublefault_tss should fix it.

AK: Also fix broken ptr_ok() and turn printks into KERN_EMERG
AK: And add a PANIC prefix to make clear the system will hang
AK: (e.g. x86-64 will recover)

Signed-off-by: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-11 15:58:14 -07:00
Andi Kleen 5fe4486c79 i386: Fix start_kernel warning
Fix

WARNING: vmlinux.o(.text+0x183): Section mismatch: reference to .init.text:start_kernel (between 'is386' and 'check_x87')

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-11 15:58:14 -07:00
Pete Zaitcev 1f1014896d x86_64: vdso.lds in arch/x86_64/vdso/.gitignore
Create arch/x86_64/vdso/.gitignore and put vdso.lds into it.

Signed-off-by: Pete Zaitcev <zaitcev@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-11 15:58:14 -07:00
Andi Kleen 43fb2387d0 i386: Add warning in Documentation that zero-page is not a stable ABI
Some people writing boot loaders seem to falsely belief the 32bit zero page is a
stable interface for out of tree code like the real mode boot protocol. Add a comment
clarifying that is not true.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-11 15:58:14 -07:00
Andi Kleen d3f7eae182 i386: Use global flag to disable broken local apic timer on AMD CPUs.
The Averatec 2370 and some other Turion laptop BIOS seems to program the
ENABLE_C1E MSR inconsistently between cores. This confuses the lapic
use heuristics because when C1E is enabled anywhere it seems to affect
the complete chip.

Use a global flag instead of a per cpu flag to handle this.
If any CPU has C1E enabled disabled lapic use.

Thanks to Cal Peake for debugging.

Cc: tglx@linutronix.de
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-11 15:58:13 -07:00
Adrian Bunk d2d0251f6f i386: really stop MCEs during code patching
It's CONFIG_X86_MCE, not CONFIG_MCE.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-11 15:58:13 -07:00
Zachary Amsden 08da5a2ca4 x86_64: Early segment setup for VT
VT is very picky about when it can enter execution.
Get all segments setup and get LDT and TR into valid state to allow
VT execution under VMware and KVM (untested).

This makes the boot decompression run under VT, which makes it several
orders of magnitude faster on 64-bit Intel hardware.

Before, I was seeing times up to a minute or more to decompress a 1.3MB kernel
on a very fast box.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-11 15:58:13 -07:00
Andi Kleen ab144f5ec6 i386: Make patching more robust, fix paravirt issue
Commit 19d36ccdc3 "x86: Fix alternatives
and kprobes to remap write-protected kernel text" uses code which is
being patched for patching.

In particular, paravirt_ops does patching in two stages: first it
calls paravirt_ops.patch, then it fills any remaining instructions
with nop_out().  nop_out calls text_poke() which calls
lookup_address() which calls pgd_val() (aka paravirt_ops.pgd_val):
that call site is one of the places we patch.

If we always do patching as one single call to text_poke(), we only
need make sure we're not patching the memcpy in text_poke itself.
This means the prototype to paravirt_ops.patch needs to change, to
marshal the new code into a buffer rather than patching in place as it
does now.  It also means all patching goes through text_poke(), which
is known to be safe (apply_alternatives is also changed to make a
single patch).

AK: fix compilation on x86-64 (bad rusty!)
AK: fix boot on x86-64 (sigh)
AK: merged with other patches

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-11 15:58:13 -07:00
Andi Kleen d3f3c93469 x86: Disable CLFLUSH support again
It turns out CLFLUSH support is still not complete; we
flush the wrong pages.  Again disable it for the release.
Noticed by Jan Beulich who then also noticed a stupid typo later.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-11 15:58:13 -07:00
Andi Kleen 3f3f7b74a7 x86_64: Don't mark __exitcall as __cold
gcc currently doesn't support attributes on types, so we can't use it
function pointers.  This avoids some warnings on a gcc 4.3 build.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-11 15:58:13 -07:00
Murillo Fernandes Bernardes f055a0619a x86_64: Calgary - Fix mis-handled PCI topology
Current code assumed that devices were directly connected to a Calgary
bridge, as it tried to get the iommu table directly from the parent bus
controller.

When we have another bridge between the Calgary/CalIOC2 bridge and the
device we should look upwards until we get to the top (Calgary/CalIOC2
bridge), where the iommu table resides.

Signed-off-by: Murillo Fernandes Bernardes <mfb@br.ibm.com>
Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-11 15:58:12 -07:00