alistair23-linux

redonkable

Author	SHA1	Message	Date
Paul Mackerras	31a40e2b05	powerpc/64: Include KVM guest test in all interrupt vectors Currently, if HV KVM is configured but PR KVM isn't, we don't include a test to see whether we were interrupted in KVM guest context for the set of interrupts which get delivered directly to the guest by hardware if they occur in the guest. This includes things like program interrupts. However, the recent bug where userspace could set the MSR for a VCPU to have an illegal value in the TS field, and thus cause a TM Bad Thing type of program interrupt on the hrfid that enters the guest, showed that we can never be completely sure that these interrupts can never occur in the guest entry/exit code. If one of these interrupts does happen and we have HV KVM configured but not PR KVM, then we end up trying to run the handler in the host with the MMU set to the guest MMU context, which generally ends badly. Thus, for robustness it is better to have the test in every interrupt vector, so that if some way is found to trigger some interrupt in the guest entry/exit path, we can handle it without immediately crashing the host. This means that the distinction between KVMTEST and KVMTEST_PR goes away. Thus we delete KVMTEST_PR and associated macros and use KVMTEST everywhere that we previously used either KVMTEST_PR or KVMTEST. It also means that SOFTEN_TEST_HV_201 becomes the same as SOFTEN_TEST_PR, so we deleted SOFTEN_TEST_HV_201 and use SOFTEN_TEST_PR instead. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-01 13:52:23 +11:00
Anton Blanchard	d20be433e6	powerpc: Non relocatable system call doesn't need a trampoline We need to use a trampoline when using LOAD_HANDLER(), because the destination needs to be in the first 64kB. An absolute branch has no such limitations, so just jump there. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-06-02 13:26:47 +10:00
Anton Blanchard	05b05f28fb	powerpc: Relocatable system call no longer uses the LR We had some code to restore the LR in the relocatable system call path back when we used the LR to do an indirect branch. Commit `6a404806df` ("powerpc: Avoid link stack corruption in MMU on syscall entry path") changed this to use the CTR which is volatile across system calls so does not need restoring. Remove the stale comment and the restore of the LR. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-06-02 13:26:47 +10:00
Mahesh Salgaonkar	44d5f6f590	powerpc/book3s: Fix the MCE code to use CONFIG_KVM_BOOK3S_64_HANDLER commit id `2ba9f0d` has changed CONFIG_KVM_BOOK3S_64_HV to tristate to allow HV/PR bits to be built as modules. But the MCE code still depends on CONFIG_KVM_BOOK3S_64_HV which is wrong. When user selects CONFIG_KVM_BOOK3S_64_HV=m to build HV/PR bits as a separate module the relevant MCE code gets excluded. This patch fixes the MCE code to use CONFIG_KVM_BOOK3S_64_HANDLER. This makes sure that the relevant MCE code is included when HV/PR bits are built as a separate modules. Fixes: `2ba9f0d887` ("kvm: powerpc: book3s: Support building HV and PR KVM as module") Cc: stable@vger.kernel.org # v3.14+ Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-03-23 17:10:47 +11:00
Shreyas B. Prabhu	77b54e9f21	powernv/powerpc: Add winkle support for offline cpus Winkle is a deep idle state supported in power8 chips. A core enters winkle when all the threads of the core enter winkle. In this state power supply to the entire chiplet i.e core, private L2 and private L3 is turned off. As a result it gives higher powersavings compared to sleep. But entering winkle results in a total hypervisor state loss. Hence the hypervisor context has to be preserved before entering winkle and restored upon wake up. Power-on Reset Engine (PORE) is a dedicated engine which is responsible for powering on the chiplet during wake up. It can be programmed to restore the register contests of a few specific registers. This patch uses PORE to restore register state wherever possible and uses stack to save and restore rest of the necessary registers. With hypervisor state restore things fall under three categories- per-core state, per-subcore state and per-thread state. To manage this, extend the infrastructure introduced for sleep. Mainly we add a paca variable subcore_sibling_mask. Using this and the core_idle_state we can distingush first thread in core and subcore. Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2014-12-15 10:46:41 +11:00
Shreyas B. Prabhu	7cba160ad7	powernv/cpuidle: Redesign idle states management Deep idle states like sleep and winkle are per core idle states. A core enters these states only when all the threads enter either the particular idle state or a deeper one. There are tasks like fastsleep hardware bug workaround and hypervisor core state save which have to be done only by the last thread of the core entering deep idle state and similarly tasks like timebase resync, hypervisor core register restore that have to be done only by the first thread waking up from these state. The current idle state management does not have a way to distinguish the first/last thread of the core waking/entering idle states. Tasks like timebase resync are done for all the threads. This is not only is suboptimal, but can cause functionality issues when subcores and kvm is involved. This patch adds the necessary infrastructure to track idle states of threads in a per-core structure. It uses this info to perform tasks like fastsleep workaround and timebase resync only once per core. Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com> Originally-by: Preeti U. Murthy <preeti@linux.vnet.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: linux-pm@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2014-12-15 10:46:40 +11:00
Paul Mackerras	56548fc0e8	powerpc/powernv: Return to cpu offline loop when finished in KVM guest When a secondary hardware thread has finished running a KVM guest, we currently put that thread into nap mode using a nap instruction in the KVM code. This changes the code so that instead of doing a nap instruction directly, we instead cause the call to power7_nap() that put the thread into nap mode to return. The reason for doing this is to avoid having the KVM code having to know what low-power mode to put the thread into. In the case of a secondary thread used to run a KVM guest, the thread will be offline from the point of view of the host kernel, and the relevant power7_nap() call is the one in pnv_smp_cpu_disable(). In this case we don't want to clear pending IPIs in the offline loop in that function, since that might cause us to miss the wakeup for the next time the thread needs to run a guest. To tell whether or not to clear the interrupt, we use the SRR1 value returned from power7_nap(), and check if it indicates an external interrupt. We arrange that the return from power7_nap() when we have finished running a guest returns 0, so pending interrupts don't get flushed in that case. Note that it is important a secondary thread that has finished executing in the guest, or that didn't have a guest to run, should not return to power7_nap's caller while the kvm_hstate.hwthread_req flag in the PACA is non-zero, because the return from power7_nap will reenable the MMU, and the MMU might still be in guest context. In this situation we spin at low priority in real mode waiting for hwthread_req to become zero. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2014-12-08 13:16:31 +11:00
Aneesh Kumar K.V	aefa5688c0	powerpc/mm: don't do tlbie for updatepp request with NO HPTE fault upatepp can get called for a nohpte fault when we find from the linux page table that the translation was hashed before. In that case we are sure that there is no existing translation, hence we could avoid doing tlbie. We could possibly race with a parallel fault filling the TLB. But that should be ok because updatepp is only ever relaxing permissions. We also look at linux pte permission bits when filling hash pte permission bits. We also hold the linux pte busy bits while inserting/updating a hashpte entry, hence a paralle update of linux pte is not possible. On the other hand mprotect involves ptep_modify_prot_start which cause a hpte invalidate and not updatepp. Performance number: We use randbox_access_bench written by Anton. Kernel with THP disabled and smaller hash page table size. 86.60% random_access_b [kernel.kallsyms] [k] .native_hpte_updatepp 2.10% random_access_b random_access_bench [.] doit 1.99% random_access_b [kernel.kallsyms] [k] .do_raw_spin_lock 1.85% random_access_b [kernel.kallsyms] [k] .native_hpte_insert 1.26% random_access_b [kernel.kallsyms] [k] .native_flush_hash_range 1.18% random_access_b [kernel.kallsyms] [k] .__delay 0.69% random_access_b [kernel.kallsyms] [k] .native_hpte_remove 0.37% random_access_b [kernel.kallsyms] [k] .clear_user_page 0.34% random_access_b [kernel.kallsyms] [k] .__hash_page_64K 0.32% random_access_b [kernel.kallsyms] [k] fast_exception_return 0.30% random_access_b [kernel.kallsyms] [k] .hash_page_mm With Fix: 27.54% random_access_b random_access_bench [.] doit 22.90% random_access_b [kernel.kallsyms] [k] .native_hpte_insert 5.76% random_access_b [kernel.kallsyms] [k] .native_hpte_remove 5.20% random_access_b [kernel.kallsyms] [k] fast_exception_return 5.12% random_access_b [kernel.kallsyms] [k] .__hash_page_64K 4.80% random_access_b [kernel.kallsyms] [k] .hash_page_mm 3.31% random_access_b [kernel.kallsyms] [k] data_access_common 1.84% random_access_b [kernel.kallsyms] [k] .trace_hardirqs_on_caller Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2014-12-05 16:26:15 +11:00
Mahesh Salgaonkar	6d626c5ea3	powerpc/powernv: Cleanup unused MCE definitions/declarations. Cleanup OpalMCE_* definitions/declarations and other related code which is not used anymore. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Acked-by: Benjamin Herrrenschmidt <benh@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-12-02 11:03:45 +11:00
Suresh E. Warrier	8b91a25546	powerpc: Save/restore PPR for KVM hypercalls The system call FLIH (first-level interrupt handler) at 0xc00 unconditionally sets hardware priority to medium. For hypercalls, this means we lose guest OS priority. The front end (do_kvm_0x**) to the KVM interrupt handler always assumes that PPR priority is saved in PACA exception save area, so it copies this to the kvm_hstate structure. For hypercalls, this would be the saved priority from any previous exception. Eventually, the guest gets resumed with an incorrect priority. The fix is to save the PPR priority in PACA exception save area before switching HMT priorities in the FLIH so that existing code described above in the KVM interrupt handler can copy it from there into the VCPU's saved context. Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org> [mpe: Dropped HMT_MEDIUM_PPR_DISCARD and reworded comment] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2014-11-12 15:53:25 +11:00
Mahesh Salgaonkar	c675c7db62	powerpc/book3s: Don't clear MSR_RI in hmi handler. In HMI interrupt handler we don't touch SRR0/SRR1, instead we touch HSRR0/HSRR1. Hence we don't need to clear MSR_RI bit. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2014-10-10 17:25:25 +11:00
Guenter Roeck	11d549048e	powerpc: Fix "attempt to move .org backwards" error Once again, we see arch/powerpc/kernel/exceptions-64s.S: Assembler messages: arch/powerpc/kernel/exceptions-64s.S:865: Error: attempt to move .org backwards arch/powerpc/kernel/exceptions-64s.S:866: Error: attempt to move .org backwards arch/powerpc/kernel/exceptions-64s.S:890: Error: attempt to move .org backwards when compiling ppc:allmodconfig. This time the problem has been caused by to commit `0869b6fd20` ("powerpc/book3s: Add basic infrastructure to handle HMI in Linux"), which adds functions hmi_exception_early and hmi_exception_after_realmode into a critical (size-limited) code area, even though that does not appear to be necessary. Move those functions to a non-critical area of the file. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-08-13 15:13:25 +10:00
Mahesh Salgaonkar	0869b6fd20	powerpc/book3s: Add basic infrastructure to handle HMI in Linux. Handle Hypervisor Maintenance Interrupt (HMI) in Linux. This patch implements basic infrastructure to handle HMI in Linux host. The design is to invoke opal handle hmi in real mode for recovery and set irq_pending when we hit HMI. During check_irq_replay pull opal hmi event and print hmi info on console. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-08-05 16:33:48 +10:00
Michael Ellerman	9daf112bd4	powerpc: Remove misleading DISABLE_INTS DISABLE_INTS has a long and storied history, but for some time now it has not actually disabled interrupts. For the open-coded exception handlers, just stop using it, instead call RECONCILE_IRQ_STATE directly. This has the benefit of removing a level of indirection, and making it clear that r10 & r11 are used at that point. For the addition case we still need a macro, so rename it to clarify what it actually does. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 14:11:24 +10:00
Michael Ellerman	4e2bf01b21	powerpc: Move bad_stack() below the fwnmi_data_area At the moment the allmodconfig build is failing because we run out of space between altivec_assist() at 0x5700 and the fwnmi_data_area at 0x7000. Fixing it permanently will take some more work, but a quick fix is to move bad_stack() below the fwnmi_data_area. That gives us just enough room with everything enabled. bad_stack() is called from the common exception handlers, but it's a non-conditional branch, so we have plenty of scope to move it further way. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 14:11:22 +10:00
Michael Ellerman	376af5947c	powerpc: Remove STAB code Old cpus didn't have a Segment Lookaside Buffer (SLB), instead they had a Segment Table (STAB). Now that we've dropped support for those cpus, we can remove the STAB support entirely. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 14:10:22 +10:00
Anton Blanchard	ad718622ab	powerpc/book3s: Fix some ABIv2 issues in machine check code Commit `2749a2f26a` (powerpc/book3s: Fix machine check handling for unhandled errors) introduced a few ABIv2 issues. We can maintain ABIv1 and ABIv2 compatibility by branching to the function rather than the dot symbol. Fixes: `2749a2f26a` ("powerpc/book3s: Fix machine check handling for unhandled errors") Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-12 09:41:33 +10:00
Mahesh Salgaonkar	e75ad93afc	powerpc/book3s: Add stack overflow check in machine check handler. Currently machine check handler does not check for stack overflow for nested machine check. If we hit another MCE while inside the machine check handler repeatedly from same address then we get into risk of stack overflow which can cause huge memory corruption. This patch limits the nested MCE level to 4 and panic when we cross level 4. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-11 19:15:13 +10:00
Mahesh Salgaonkar	2749a2f26a	powerpc/book3s: Fix machine check handling for unhandled errors Current code does not check for unhandled/unrecovered errors and return from interrupt if it is recoverable exception which in-turn triggers same machine check exception in a loop causing hypervisor to be unresponsive. This patch fixes this situation and forces hypervisor to panic for unhandled/unrecovered errors. This patch also fixes another issue where unrecoverable_exception routine was called in real mode in case of unrecoverable exception (MSR_RI = 0). This causes another exception vector 0x300 (data access) during system crash leading to confusion while debugging cause of the system crash. Also turn ME bit off while going down, so that when another MCE is hit during panic path, system will checkstop and hypervisor will get restarted cleanly by SP. With the above fixes we now throw correct console messages (see below) while crashing the system in case of unhandled/unrecoverable machine checks. -------------- Severe Machine check interrupt [[Not recovered] Initiator: CPU Error type: UE [Instruction fetch] Effective address: 0000000030002864 Oops: Machine check, sig: 7 [#1] SMP NR_CPUS=2048 NUMA PowerNV Modules linked in: bork(O) bridge stp llc kvm [last unloaded: bork] CPU: 36 PID: 55162 Comm: bash Tainted: G O 3.14.0mce #1 task: c000002d72d022d0 ti: c000000007ec0000 task.ti: c000002d72de4000 NIP: 0000000030002864 LR: 00000000300151a4 CTR: 000000003001518c REGS: c000000007ec3d80 TRAP: 0200 Tainted: G O (3.14.0mce) MSR: 9000000000041002 <SF,HV,ME,RI> CR: 28222848 XER: 20000000 CFAR: 0000000030002838 DAR: d0000000004d0000 DSISR: 00000000 SOFTE: 1 GPR00: 000000003001512c 0000000031f92cb0 0000000030078af0 0000000030002864 GPR04: d0000000004d0000 0000000000000000 0000000030002864 ffffffffffffffc9 GPR08: 0000000000000024 0000000030008af0 000000000000002c c00000000150e728 GPR12: 9000000000041002 0000000031f90000 0000000010142550 0000000040000000 GPR16: 0000000010143cdc 0000000000000000 00000000101306fc 00000000101424dc GPR20: 00000000101424e0 000000001013c6f0 0000000000000000 0000000000000000 GPR24: 0000000010143ce0 00000000100f6440 c000002d72de7e00 c000002d72860250 GPR28: c000002d72860240 c000002d72ac0038 0000000000000008 0000000000040000 NIP [0000000030002864] 0x30002864 LR [00000000300151a4] 0x300151a4 Call Trace: Instruction dump: XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX ---[ end trace 7285f0beac1e29d3 ]--- Sending IPI to other CPUs IPI complete OPAL V3 detected ! -------------- Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-11 19:15:13 +10:00
Anton Blanchard	354255014a	powerpc: Remove dot symbol usage in exception macros STD_EXCEPTION_COMMON, STD_EXCEPTION_COMMON_ASYNC and MASKABLE_EXCEPTION branch to the handler, so we can remove the explicit dot symbol and binutils will do the right thing. Signed-off-by: Anton Blanchard <anton@samba.org>	2014-04-23 10:05:19 +10:00
Anton Blanchard	6a3bab90cf	powerpc: Remove some unnecessary uses of _GLOBAL() and _STATIC() There is no need to create a function descriptor for functions called locally out of assembly. Signed-off-by: Anton Blanchard <anton@samba.org>	2014-04-23 10:05:18 +10:00
Anton Blanchard	ad0289e4ac	powerpc: Remove superflous function descriptors in assembly only code We have a number of places where we load the text address of a local function and indirectly branch to it in assembly. Since it is an indirect branch binutils will not know to use the function text address, so that trick wont work. There is no need for these functions to have a function descriptor so we can replace it with a label and remove the dot symbol. Signed-off-by: Anton Blanchard <anton@samba.org>	2014-04-23 10:05:17 +10:00
Anton Blanchard	b1576fec7f	powerpc: No need to use dot symbols when branching to a function binutils is smart enough to know that a branch to a function descriptor is actually a branch to the functions text address. Alan tells me that binutils has been doing this for 9 years. Signed-off-by: Anton Blanchard <anton@samba.org>	2014-04-23 10:05:16 +10:00
Michael Neuling	fa5c11b790	powerpc: Remove dead code in sycall entry In: commit `742415d6b6` Author: Michael Neuling <mikey@neuling.org> powerpc: Turn syscall handler into macros We converted the syscall entry code onto macros, but in doing this we introduced some cruft that's never run and should never have been added. This removes that code. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-04-09 12:53:11 +10:00
Linus Torvalds	e6d9bfc638	Merge branch 'powernv-cpuidle' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc non-virtualized cpuidle from Ben Herrenschmidt: "This is the branch I mentioned in my other pull request which contains our improved cpuidle support for the "powernv" platform (non-virtualized). It adds support for the "fast sleep" feature of the processor which provides higher power savings than our usual "nap" mode but at the cost of losing the timers while asleep, and thus exploits the new timer broadcast framework to work around that limitation. It's based on a tip timer tree that you seem to have already merged" * 'powernv-cpuidle' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: cpuidle/powernv: Parse device tree to setup idle states cpuidle/powernv: Add "Fast-Sleep" CPU idle state powerpc/powernv: Add OPAL call to resync timebase on wakeup powerpc/powernv: Add context management for Fast Sleep powerpc: Split timer_interrupt() into timer handling and interrupt handling routines powerpc: Implement tick broadcast IPI as a fixed IPI message powerpc: Free up the slot of PPC_MSG_CALL_FUNC_SINGLE IPI message	2014-04-02 13:47:29 -07:00
Mahesh Salgaonkar	d410ae2126	powerpc/book3s: Fix CFAR clobbering issue in machine check handler. While checking powersaving mode in machine check handler at 0x200, we clobber CFAR register. Fix it by saving and restoring it during beq/bgt. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-03-24 10:16:09 +11:00
Vaidyanathan Srinivasan	97eb001f03	powerpc/powernv: Add OPAL call to resync timebase on wakeup During "Fast-sleep" and deeper power savings state, decrementer and timebase could be stopped making it out of sync with rest of the cores in the system. Add a firmware call to request platform to resync timebase using low level platform methods. Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Signed-off-by: Preeti U. Murthy <preeti@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-03-05 15:56:54 +11:00
Vaidyanathan Srinivasan	aca79d2b6e	powerpc/powernv: Add context management for Fast Sleep Before adding Fast-Sleep into the cpuidle framework, some low level support needs to be added to enable it. This includes saving and restoring of certain registers at entry and exit time of this state respectively just like we do in the NAP idle state. Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> [Changelog modified by Preeti U. Murthy <preeti@linux.vnet.ibm.com>] Signed-off-by: Preeti U. Murthy <preeti@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-03-05 15:56:31 +11:00
Mahesh Salgaonkar	4e243b79b0	powerpc: Fix "attempt to move .org backwards" error With recent machine check patch series changes, The exception vectors starting from 0x4300 are now overflowing with allyesconfig. Fix that by moving machine_check_common and machine_check_handle_early code out of that region to make enough room for exception vector area. Fixes this build error reportes by Stephen: arch/powerpc/kernel/exceptions-64s.S: Assembler messages: arch/powerpc/kernel/exceptions-64s.S:958: Error: attempt to move .org backwards arch/powerpc/kernel/exceptions-64s.S:959: Error: attempt to move .org backwards arch/powerpc/kernel/exceptions-64s.S:983: Error: attempt to move .org backwards arch/powerpc/kernel/exceptions-64s.S:984: Error: attempt to move .org backwards arch/powerpc/kernel/exceptions-64s.S:1003: Error: attempt to move .org backwards arch/powerpc/kernel/exceptions-64s.S:1013: Error: attempt to move .org backwards arch/powerpc/kernel/exceptions-64s.S:1014: Error: attempt to move .org backwards arch/powerpc/kernel/exceptions-64s.S:1015: Error: attempt to move .org backwards arch/powerpc/kernel/exceptions-64s.S:1016: Error: attempt to move .org backwards arch/powerpc/kernel/exceptions-64s.S:1017: Error: attempt to move .org backwards arch/powerpc/kernel/exceptions-64s.S:1018: Error: attempt to move .org backwards [Moved the code further down as it introduced link errors due to too long relative branches to the masked interrupts handlers from the exception prologs. Also removed the useless feature section --BenH ] Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Tested-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-12-30 14:16:30 +11:00
Mahesh Salgaonkar	b5ff4211a8	powerpc/book3s: Queue up and process delayed MCE events. When machine check real mode handler can not continue into host kernel in V mode, it returns from the interrupt and we loose MCE event which never gets logged. In such a situation queue up the MCE event so that we can log it later when we get back into host kernel with r1 pointing to kernel stack e.g. during syscall exit. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-12-05 16:05:21 +11:00
Mahesh Salgaonkar	1c51089f77	powerpc/book3s: Return from interrupt if coming from evil context. We can get machine checks from any context. We need to make sure that we handle all of them correctly. If we are coming from hypervisor user-space, we can continue in host kernel in virtual mode to deliver the MC event. If we got woken up from power-saving mode then we may come in with one of the following state: a. No state loss b. Supervisor state loss c. Hypervisor state loss For (a) and (b), we go back to nap again. State (c) is fatal, keep spinning. For all other context which we not sure of queue up the MCE event and return from the interrupt. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-12-05 16:04:36 +11:00
Mahesh Salgaonkar	1e9b4507ed	powerpc/book3s: handle machine check in Linux host. Move machine check entry point into Linux. So far we were dependent on firmware to decode MCE error details and handover the high level info to OS. This patch introduces early machine check routine that saves the MCE information (srr1, srr0, dar and dsisr) to the emergency stack. We allocate stack frame on emergency stack and set the r1 accordingly. This allows us to be prepared to take another exception without loosing context. One thing to note here that, if we get another machine check while ME bit is off then we risk a checkstop. Hence we restrict ourselves to save only MCE information and register saved on PACA_EXMC save are before we turn the ME bit on. We use paca->in_mce flag to differentiate between first entry and nested machine check entry which helps proper use of emergency stack. We increment paca->in_mce every time we enter in early machine check handler and decrement it while leaving. When we enter machine check early handler first time (paca->in_mce == 0), we are sure nobody is using MC emergency stack and allocate a stack frame at the start of the emergency stack. During subsequent entry (paca->in_mce > 0), we know that r1 points inside emergency stack and we allocate separate stack frame accordingly. This prevents us from clobbering MCE information during nested machine checks. The early machine check handler changes are placed under CPU_FTR_HVMODE section. This makes sure that the early machine check handler will get executed only in hypervisor kernel. This is the code flow: Machine Check Interrupt \| V 0x200 vector ME=0, IR=0, DR=0 \| V +-----------------------------------------------+ \|machine_check_pSeries_early: \| ME=0, IR=0, DR=0 \| Alloc frame on emergency stack \| \| Save srr1, srr0, dar and dsisr on stack \| +-----------------------------------------------+ \| (ME=1, IR=0, DR=0, RFID) \| V machine_check_handle_early ME=1, IR=0, DR=0 \| V +-----------------------------------------------+ \| machine_check_early (r3=pt_regs) \| ME=1, IR=0, DR=0 \| Things to do: (in next patches) \| \| Flush SLB for SLB errors \| \| Flush TLB for TLB errors \| \| Decode and save MCE info \| +-----------------------------------------------+ \| (Fall through existing exception handler routine.) \| V machine_check_pSerie ME=1, IR=0, DR=0 \| (ME=1, IR=1, DR=1, RFID) \| V machine_check_common ME=1, IR=1, DR=1 . . . Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-12-05 16:02:06 +11:00
Aneesh Kumar K.V	3a167beac0	kvm: powerpc: Add kvmppc_ops callback This patch add a new callback kvmppc_ops. This will help us in enabling both HV and PR KVM together in the same kernel. The actual change to enable them together is done in the later patch in the series. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> [agraf: squash in booke changes] Signed-off-by: Alexander Graf <agraf@suse.de>	2013-10-17 15:24:26 +02:00
Aneesh Kumar K.V	7aa79938f7	kvm: powerpc: book3s: pr: Rename KVM_BOOK3S_PR to KVM_BOOK3S_PR_POSSIBLE With later patches supporting PR kvm as a kernel module, the changes that has to be built into the main kernel binary to enable PR KVM module is now selected via KVM_BOOK3S_PR_POSSIBLE Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2013-10-17 15:17:49 +02:00
Paul Mackerras	4f6c11db10	KVM: PPC: Book3S: Move skip-interrupt handlers to common code Both PR and HV KVM have separate, identical copies of the kvmppc_skip_interrupt and kvmppc_skip_Hinterrupt handlers that are used for the situation where an interrupt happens when loading the instruction that caused an exit from the guest. To eliminate this duplication and make it easier to compile in both PR and HV KVM, this moves this code to arch/powerpc/kernel/exceptions-64s.S along with other kernel interrupt handler code. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2013-10-17 14:49:35 +02:00
Benjamin Herrenschmidt	3f1f431188	Merge branch 'merge' into next Merge stuff that already went into Linus via "merge" which are pre-reqs for subsequent patches	2013-08-27 15:03:30 +10:00
Michael Ellerman	d671ddd665	powerpc: Add more exception trampolines for hypervisor exceptions This makes back traces and profiles easier to read. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-08-27 14:45:09 +10:00
Michael Ellerman	fa111f1f76	powerpc: Fix location and rename exception trampolines The symbols that name some of our exception trampolines are ahead of the location they name. In most cases this is OK because the code is tightly packed, but in some cases it means the symbol floats ahead of the correct location, eg: c000000000000ea0 <performance_monitor_pSeries_1>: ... c000000000000f00: 7d b2 43 a6 mtsprg 2,r13 Fix them all by moving the symbol after the set of the location. While we're moving them anyway, rename them to loose the camelcase and to make it clear that they are trampolines. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-08-27 14:45:08 +10:00
Paul Mackerras	630573c1da	powerpc: Fix denormalized exception handler The denormalized exception handler (denorm_exception_hv) has a couple of bugs. If the CONFIG_PPC_DENORMALISATION option is not selected, or the HSRR1_DENORM bit is not set in HSRR1, we don't test whether the interrupt occurred within a KVM guest. On the other hand, if the HSRR1_DENORM bit is set and CONFIG_PPC_DENORMALISATION is enabled, we corrupt the CFAR and PPR. To correct these problems, this replaces the open-coded version of EXCEPTION_PROLOG_1 that is there currently, and that is missing the saving of PPR and CFAR values to the PACA, with an instance of EXCEPTION_PROLOG_1. This adds an explicit KVMTEST after testing whether the exception is one we can handle, and adds code to restore the CFAR on exit. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-08-14 15:00:09 +10:00
Michael Neuling	88f094120b	powerpc: Fix hypervisor facility unavaliable vector number Currently if we take hypervisor facility unavaliable (from 0xf80/0x4f80) we mark it as an OS facility unavaliable (0xf60) as the two share the same code path. The becomes a problem in facility_unavailable_exception() as we aren't able to see the hypervisor facility unavailable exceptions. Below fixes this by duplication the required macros. Signed-off-by: Michael Neuling <mikey@neuling.org> Cc: <stable@vger.kernel.org> [v3.10] Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-08-09 18:06:58 +10:00
Benjamin Herrenschmidt	24a72acac1	Linux 3.10 -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (GNU/Linux) iQEcBAABAgAGBQJR0K2gAAoJEHm+PkMAQRiGWsEH+gMZSN1qRm34hZ82q1Tx7HvL Eb/Gsl3Qw/7G2TlTqgjBUs36IdqV9O2cui/aa3/TfXvdvrx+0GlhRkEwQPc+ygcO Mvoyoke4tT4+4jVFdCg1J8avREsa28/6oaHs0ZZxuVmJBBLTJH7aXaNsGn6eU1q9 9+p798MQis6naIiPC63somlZcCIiBhsuWCPWpEfLMn8G1HWAFTM3xXIbNBqe/brS bmIOfhomlIZ5dcdaXGvjtP3+KJhkNDwhkPC4tVYu8JqqgSlrE+a+EGyEuuGqKk10 U+swiqyuD31uBI9ga54u/2FzSqDiAu6YOcMXevjo/m3g9XLdYbYLvN+nvN8alCQ= =Ob6Z -----END PGP SIGNATURE----- Merge tag 'v3.10' into next Merge 3.10 in order to get some of the last minute powerpc changes, resolve conflicts and add additional fixes on top of them.	2013-07-01 17:57:25 +10:00
Michael Ellerman	b14b6260ef	powerpc: Wire up the HV facility unavailable exception Similar to the facility unavailble exception, except the facilities are controlled by HFSCR. Adapt the facility_unavailable_exception() so it can be called for either the regular or Hypervisor facility unavailable exceptions. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> CC: <stable@vger.kernel.org> [v3.10] Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-07-01 11:49:47 +10:00
Michael Ellerman	021424a1fc	powerpc: Rename and flesh out the facility unavailable exception handler The exception at 0xf60 is not the TM (Transactional Memory) unavailable exception, it is the "Facility Unavailable Exception", rename it as such. Flesh out the handler to acknowledge the fact that it can be called for many reasons, one of which is TM being unavailable. Use STD_EXCEPTION_COMMON() for the exception body, for some reason we had it open-coded, I've checked the generated code is identical. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> CC: <stable@vger.kernel.org> [v3.10] Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-07-01 11:49:44 +10:00
Michael Ellerman	c9f69518e5	powerpc: Remove KVMTEST from RELON exception handlers KVMTEST is a macro which checks whether we are taking an exception from guest context, if so we branch out of line and eventually call into the KVM code to handle the switch. When running real guests on bare metal (HV KVM) the hardware ensures that we never take a relocation on exception when transitioning from guest to host. For PR KVM we disable relocation on exceptions ourself in kvmppc_core_init_vm(), as of commit `a413f47` "Disable relocation on exceptions whenever PR KVM is active". So convert all the RELON macros to use NOTEST, and drop the remaining KVM_HANDLER() definitions we have for 0xe40 and 0xe80. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> CC: <stable@vger.kernel.org> [v3.9+] Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-07-01 11:49:40 +10:00
Michael Ellerman	1d567cb4bd	powerpc: Remove unreachable relocation on exception handlers We have relocation on exception handlers defined for h_data_storage and h_instr_storage. However we will never take relocation on exceptions for these because they can only come from a guest, and we never take relocation on exceptions when we transition from guest to host. We also have a handler for hmi_exception (Hypervisor Maintenance) which is defined in the architecture to never be delivered with relocation on, see see v2.07 Book III-S section 6.5. So remove the handlers, leaving a branch to self just to be double extra paranoid. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> CC: <stable@vger.kernel.org> [v3.9+] Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-07-01 11:49:37 +10:00
Paul Mackerras	bf593907f7	powerpc: Fix emulation of illegal instructions on PowerNV platform Normally, the kernel emulates a few instructions that are unimplemented on some processors (e.g. the old dcba instruction), or privileged (e.g. mfpvr). The emulation of unimplemented instructions is currently not working on the PowerNV platform. The reason is that on these machines, unimplemented and illegal instructions cause a hypervisor emulation assist interrupt, rather than a program interrupt as on older CPUs. Our vector for the emulation assist interrupt just calls program_check_exception() directly, without setting the bit in SRR1 that indicates an illegal instruction interrupt. This fixes it by making the emulation assist interrupt set that bit before calling program_check_interrupt(). With this, old programs that use no-longer implemented instructions such as dcba now work again. CC: <stable@vger.kernel.org> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-15 12:24:11 +10:00
Michael Neuling	fb0fce3e55	powerpc/power8: Update denormalization handler POWER8 can take a denormalisation exception on any VSX registers. This does the extra 32 VSX registers we don't currently handle. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-10 08:36:26 +10:00
Michael Neuling	d7c67fb1cf	powerpc/pseries: Simplify denormalization handler The following simplifies the denorm code by using macros to generate the long stream of almost identical instructions. This patch results in no changes to the output binary, but removes a lot of lines of code. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-10 08:36:22 +10:00
Aneesh Kumar K.V	ce54152f42	powerpc: Save DAR and DSISR in pt_regs on MCE We were not saving DAR and DSISR on MCE. Save then and also print the values along with exception details in xmon. Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-04-30 15:59:42 +10:00
Paul Mackerras	a485c70989	powerpc: Fix "attempt to move .org backwards" error Building a 64-bit powerpc kernel with PR KVM enabled currently gives this error: AS arch/powerpc/kernel/head_64.o arch/powerpc/kernel/exceptions-64s.S: Assembler messages: arch/powerpc/kernel/exceptions-64s.S:258: Error: attempt to move .org backwards make[2]: *** [arch/powerpc/kernel/head_64.o] Error 1 This happens because the MASKABLE_EXCEPTION_PSERIES macro turns into 33 instructions, but we only have space for 32 at the decrementer interrupt vector (from 0x900 to 0x980). In the code generated by the MASKABLE_EXCEPTION_PSERIES macro, we currently have two instances of the HMT_MEDIUM macro, which has the effect of setting the SMT thread priority to medium. One is the first instruction, and is overwritten by a no-op on processors where we save the PPR (processor priority register), that is, POWER7 or later. The other is after we have saved the PPR. In order to reduce the code at 0x900 by one instruction, we omit the first HMT_MEDIUM. On processors without SMT this will have no effect since HMT_MEDIUM is a no-op there. On POWER5 and RS64 machines this will mean that the first few instructions take a little longer in the case where a decrementer interrupt occurs when the hardware thread is running at low SMT priority. On POWER6 and later machines, the hardware automatically boosts the thread priority when a decrementer interrupt is taken if the thread priority was below medium, so this change won't make any difference. The alternative would be to branch out of line after saving the CFAR. However, that would incur an extra overhead on all processors, whereas the approach adopted here only adds overhead on older threaded processors. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-04-26 16:08:27 +10:00

1 2 3

131 Commits (91f1da99792a1d133df94c4753510305353064a1)