Commit graph

4556 commits

Author SHA1 Message Date
Sam Bobroff 0aab374709 powerpc/powernv: Restore non-volatile CRs after nap
Patches 7cba160ad "powernv/cpuidle: Redesign idle states management"
and 77b54e9f2 "powernv/powerpc: Add winkle support for offline cpus"
use non-volatile condition registers (cr2, cr3 and cr4) early in the system
reset interrupt handler (system_reset_pSeries()) before it has been determined
if state loss has occurred. If state loss has not occurred, control returns via
the power7_wakeup_noloss() path which does not restore those condition
registers, leaving them corrupted.

Fix this by restoring the condition registers in the power7_wakeup_noloss()
case.

This is apparent when running a KVM guest on hardware that does not
support winkle or sleep and the guest makes use of secondary threads. In
practice this means Power7 machines, though some early unreleased Power8
machines may also be susceptible.

The secondary CPUs are taken off line before the guest is started and
they call pnv_smp_cpu_kill_self(). This checks support for sleep
states (in this case there is no support) and power7_nap() is called.

When the CPU is woken, power7_nap() returns and because the CPU is
still off line, the main while loop executes again. The sleep states
support test is executed again, but because the tested values cannot
have changed, the compiler has optimized the test away and instead we
rely on the result of the first test, which has been left in cr3
and/or cr4. With the result overwritten, the wrong branch is taken and
power7_winkle() is called on a CPU that does not support it, leading
to it stalling.

Fixes: 7cba160ad7 ("powernv/cpuidle: Redesign idle states management")
Fixes: 77b54e9f21 ("powernv/powerpc: Add winkle support for offline cpus")
[mpe: Massage change log a bit more]
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-05-01 16:55:11 +10:00
Gavin Shan d91dafc02f powerpc/eeh: Delay probing EEH device during hotplug
Commit 1c509148b ("powerpc/eeh: Do probe on pci_dn") probes EEH
devices in early stage, which is reasonable to pSeries platform.
However, it's wrong for PowerNV platform because the PE# isn't
determined until the resources (IO and MMIO) are assigned to
PE in hotplug case. So we have to delay probing EEH devices
for PowerNV platform until the PE# is assigned.

Fixes: ff57b454dd ("powerpc/eeh: Do probe on pci_dn")
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-05-01 13:52:32 +10:00
Gavin Shan 1ae79b78bc powerpc/eeh: Fix race condition in pcibios_set_pcie_reset_state()
When asserting reset in pcibios_set_pcie_reset_state(), the PE
is enforced to (hardware) frozen state in order to drop unexpected
PCI transactions (except PCI config read/write) automatically by
hardware during reset, which would cause recursive EEH error.
However, the (software) frozen state EEH_PE_ISOLATED is missed.
When users get 0xFF from PCI config or MMIO read, EEH_PE_ISOLATED
is set in PE state retrival backend. Unfortunately, nobody (the
reset handler or the EEH recovery functinality in host) will clear
EEH_PE_ISOLATED when the PE has been passed through to guest.

The patch sets and clears EEH_PE_ISOLATED properly during reset
in function pcibios_set_pcie_reset_state() to fix the issue.

Fixes: 28158cd ("Enhance pcibios_set_pcie_reset_state()")
Reported-by: Carol L. Soto <clsoto@us.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Tested-by: Carol L. Soto <clsoto@us.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-05-01 13:52:09 +10:00
Michael Ellerman 68fc378ce3 Revert "powerpc/tm: Abort syscalls in active transactions"
This reverts commit feba40362b.

Although the principle of this change is good, the implementation has a
few issues.

Firstly we can sometimes fail to abort a syscall because r12 may have
been clobbered by C code if we went down the virtual CPU accounting
path, or if syscall tracing was enabled.

Secondly we have decided that it is safer to abort the syscall even
earlier in the syscall entry path, so that we avoid the syscall tracing
path when we are transactional.

So that we have time to thoroughly test those changes we have decided to
revert this for this merge window and will merge the fixed version in
the next window.

NB. Rather than reverting the selftest we just drop tm-syscall from
TEST_PROGS so that it's not run by default.

Fixes: feba40362b ("powerpc/tm: Abort syscalls in active transactions")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-04-30 15:24:58 +10:00
Linus Torvalds 63905bba5b powerpc fixes for 4.1
- Fix for mm_dec_nr_pmds() from Scott.
 - Fixes for oopses seen with KVM + THP from Aneesh.
 - Build fixes from Aneesh & Shreyas.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJVOsk5AAoJEFHr6jzI4aWACWMP/3EaNoeA1g8VbZWZEdRaoLvX
 W7D08DI3Dt8HLxyn2JR08jYZF0gr68XrF6OiscVVki7wVXT8fbH4jSNmBkbzNH95
 d9taScJyR1CUavkhsXivnR1qEE1Fi2KA2OW9RaNfoSt1MVtdsvOK6xXklUGksuJQ
 XygzyrRr4Dj82kuMUAMO0YDMvknMlzi3a8dzyrWZBXBZOOTWavGB6bQKtCTaOQ99
 3OFGLQ10uY7lmdHDi0t0tQ99FuYfLiJpg5fTLoUni4J5tFp8JlZ+x0Gwc0apN0cy
 Ym8EO6++qWDv8FXvYEPfVUEjbF1fyPiawUgpkMnyvXgd8K5G85SIrtkGW0Ml+6sX
 GfJH8w9hpDbF5EnWlC9bn/jT7sHBHFdrxZuQUc0L4M2OtM73R2a0Xr3b7ZxFCD1q
 7RpYu8MKKcyvaIXNg7VBJjj8zL+WmUJKF6J5uX5bGU2xH0khmp0vTknyyjbwrlcF
 uHidv5ZhMt3aAI70v14jA5BTEmLyOYRu58Ei6cT/VT/DjdbpEApdK8BMAvKSEeib
 +hzh6oDFT92AM0tbg15bNmqGbGfgqtVKe4GDS2QyGaHGAFOGs1nPuSa9se1xYDcM
 CCtRyABwpzJsrCfwra2fsTU6FxlatK4ONViyWFBXa6mEjBNSZ4XmyZvdWUqlwpSC
 F5jNGppm5Ama6xxcLphA
 =6yQx
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux

Pull powerpc fixes from Michael Ellerman:

 - fix for mm_dec_nr_pmds() from Scott.

 - fixes for oopses seen with KVM + THP from Aneesh.

 - build fixes from Aneesh & Shreyas.

* tag 'powerpc-4.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux:
  powerpc/mm: Fix build error with CONFIG_PPC_TRANSACTIONAL_MEM disabled
  powerpc/kvm: Fix ppc64_defconfig + PPC_POWERNV=n build error
  powerpc/mm/thp: Return pte address if we find trans_splitting.
  powerpc/mm/thp: Make page table walk safe against thp split/collapse
  KVM: PPC: Remove page table walk helpers
  KVM: PPC: Use READ_ONCE when dereferencing pte_t pointer
  powerpc/hugetlb: Call mm_dec_nr_pmds() in hugetlb_free_pmd_range()
2015-04-26 13:23:15 -07:00
Linus Torvalds eadf16a912 This mostly includes the PPC changes for 4.1, which this time cover
Book3S HV only (debugging aids, minor performance improvements and some
 cleanups).  But there are also bug fixes and small cleanups for ARM,
 x86 and s390.
 
 The task_migration_notifier revert and real fix is still pending review,
 but I'll send it as soon as possible after -rc1.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJVOONLAAoJEL/70l94x66DbsMIAIpZPsaqgXOC1sDEiZuYay+6
 rD4n4id7j8hIAzcf3AlZdyf5XgLlr6I1Zyt62s1WcoRq/CCnL7k9EljzSmw31WFX
 P2y7/J0iBdkn0et+PpoNThfL2GsgTqNRCLOOQlKgEQwMP9Dlw5fnUbtC1UchOzTg
 eAMeBIpYwufkWkXhdMw4PAD4lJ9WxUZ1eXHEBRzJb0o0ZxAATJ1tPZGrFJzoUOSM
 WsVNTuBsNd7upT02kQdvA1TUo/OPjseTOEoksHHwfcORt6bc5qvpctL3jYfcr7sk
 /L6sIhYGVNkjkuredjlKGLfT2DDJjSEdJb1k2pWrDRsY76dmottQubAE9J9cDTk=
 =OAi2
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull second batch of KVM changes from Paolo Bonzini:
 "This mostly includes the PPC changes for 4.1, which this time cover
  Book3S HV only (debugging aids, minor performance improvements and
  some cleanups).  But there are also bug fixes and small cleanups for
  ARM, x86 and s390.

  The task_migration_notifier revert and real fix is still pending
  review, but I'll send it as soon as possible after -rc1"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (29 commits)
  KVM: arm/arm64: check IRQ number on userland injection
  KVM: arm: irqfd: fix value returned by kvm_irq_map_gsi
  KVM: VMX: Preserve host CR4.MCE value while in guest mode.
  KVM: PPC: Book3S HV: Use msgsnd for signalling threads on POWER8
  KVM: PPC: Book3S HV: Translate kvmhv_commence_exit to C
  KVM: PPC: Book3S HV: Streamline guest entry and exit
  KVM: PPC: Book3S HV: Use bitmap of active threads rather than count
  KVM: PPC: Book3S HV: Use decrementer to wake napping threads
  KVM: PPC: Book3S HV: Don't wake thread with no vcpu on guest IPI
  KVM: PPC: Book3S HV: Get rid of vcore nap_count and n_woken
  KVM: PPC: Book3S HV: Move vcore preemption point up into kvmppc_run_vcpu
  KVM: PPC: Book3S HV: Minor cleanups
  KVM: PPC: Book3S HV: Simplify handling of VCPUs that need a VPA update
  KVM: PPC: Book3S HV: Accumulate timing information for real-mode code
  KVM: PPC: Book3S HV: Create debugfs file for each guest's HPT
  KVM: PPC: Book3S HV: Add ICP real mode counters
  KVM: PPC: Book3S HV: Move virtual mode ICP functions to real-mode
  KVM: PPC: Book3S HV: Convert ICS mutex lock to spin lock
  KVM: PPC: Book3S HV: Add guest->host real mode completion counters
  KVM: PPC: Book3S HV: Add helpers for lock/unlock hpte
  ...
2015-04-26 13:06:22 -07:00
Paul Mackerras 66feed61cd KVM: PPC: Book3S HV: Use msgsnd for signalling threads on POWER8
This uses msgsnd where possible for signalling other threads within
the same core on POWER8 systems, rather than IPIs through the XICS
interrupt controller.  This includes waking secondary threads to run
the guest, the interrupts generated by the virtual XICS, and the
interrupts to bring the other threads out of the guest when exiting.

Aggregated statistics from debugfs across vcpus for a guest with 32
vcpus, 8 threads/vcore, running on a POWER8, show this before the
change:

 rm_entry:     3387.6ns (228 - 86600, 1008969 samples)
  rm_exit:     4561.5ns (12 - 3477452, 1009402 samples)
  rm_intr:     1660.0ns (12 - 553050, 3600051 samples)

and this after the change:

 rm_entry:     3060.1ns (212 - 65138, 953873 samples)
  rm_exit:     4244.1ns (12 - 9693408, 954331 samples)
  rm_intr:     1342.3ns (12 - 1104718, 3405326 samples)

for a test of booting Fedora 20 big-endian to the login prompt.

The time taken for a H_PROD hcall (which is handled in the host
kernel) went down from about 35 microseconds to about 16 microseconds
with this change.

The noinline added to kvmppc_run_core turned out to be necessary for
good performance, at least with gcc 4.9.2 as packaged with Fedora 21
and a little-endian POWER8 host.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2015-04-21 15:21:34 +02:00
Paul Mackerras 7d6c40da19 KVM: PPC: Book3S HV: Use bitmap of active threads rather than count
Currently, the entry_exit_count field in the kvmppc_vcore struct
contains two 8-bit counts, one of the threads that have started entering
the guest, and one of the threads that have started exiting the guest.
This changes it to an entry_exit_map field which contains two bitmaps
of 8 bits each.  The advantage of doing this is that it gives us a
bitmap of which threads need to be signalled when exiting the guest.
That means that we no longer need to use the trick of setting the
HDEC to 0 to pull the other threads out of the guest, which led in
some cases to a spurious HDEC interrupt on the next guest entry.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2015-04-21 15:21:33 +02:00
Paul Mackerras 5d5b99cd68 KVM: PPC: Book3S HV: Get rid of vcore nap_count and n_woken
We can tell when a secondary thread has finished running a guest by
the fact that it clears its kvm_hstate.kvm_vcpu pointer, so there
is no real need for the nap_count field in the kvmppc_vcore struct.
This changes kvmppc_wait_for_nap to poll the kvm_hstate.kvm_vcpu
pointers of the secondary threads rather than polling vc->nap_count.
Besides reducing the size of the kvmppc_vcore struct by 8 bytes,
this also means that we can tell which secondary threads have got
stuck and thus print a more informative error message.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2015-04-21 15:21:32 +02:00
Paul Mackerras 1f09c3ed86 KVM: PPC: Book3S HV: Minor cleanups
* Remove unused kvmppc_vcore::n_busy field.
* Remove setting of RMOR, since it was only used on PPC970 and the
  PPC970 KVM support has been removed.
* Don't use r1 or r2 in setting the runlatch since they are
  conventionally reserved for other things; use r0 instead.
* Streamline the code a little and remove the ext_interrupt_to_host
  label.
* Add some comments about register usage.
* hcall_try_real_mode doesn't need to be global, and can't be
  called from C code anyway.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2015-04-21 15:21:32 +02:00
Paul Mackerras b6c295df31 KVM: PPC: Book3S HV: Accumulate timing information for real-mode code
This reads the timebase at various points in the real-mode guest
entry/exit code and uses that to accumulate total, minimum and
maximum time spent in those parts of the code.  Currently these
times are accumulated per vcpu in 5 parts of the code:

* rm_entry - time taken from the start of kvmppc_hv_entry() until
  just before entering the guest.
* rm_intr - time from when we take a hypervisor interrupt in the
  guest until we either re-enter the guest or decide to exit to the
  host.  This includes time spent handling hcalls in real mode.
* rm_exit - time from when we decide to exit the guest until the
  return from kvmppc_hv_entry().
* guest - time spend in the guest
* cede - time spent napping in real mode due to an H_CEDE hcall
  while other threads in the same vcore are active.

These times are exposed in debugfs in a directory per vcpu that
contains a file called "timings".  This file contains one line for
each of the 5 timings above, with the name followed by a colon and
4 numbers, which are the count (number of times the code has been
executed), the total time, the minimum time, and the maximum time,
all in nanoseconds.

The overhead of the extra code amounts to about 30ns for an hcall that
is handled in real mode (e.g. H_SET_DABR), which is about 25%.  Since
production environments may not wish to incur this overhead, the new
code is conditional on a new config symbol,
CONFIG_KVM_BOOK3S_HV_EXIT_TIMING.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2015-04-21 15:21:31 +02:00
Aneesh Kumar K.V 691e95fd73 powerpc/mm/thp: Make page table walk safe against thp split/collapse
We can disable a THP split or a hugepage collapse by disabling irq.
We do send IPI to all the cpus in the early part of split/collapse,
and disabling local irq ensure we don't make progress with
split/collapse. If the THP is getting split we return NULL from
find_linux_pte_or_hugepte(). For all the current callers it should be ok.
We need to be careful if we want to use returned pte_t pointer outside
the irq disabled region. W.r.t to THP split, the pfn remains the same,
but then a hugepage collapse will result in a pfn change. There are
few steps we can take to avoid a hugepage collapse.One way is to take page
reference inside the irq disable region. Other option is to take
mmap_sem so that a parallel collapse will not happen. We can also
disable collapse by taking pmd_lock. Another method used by kvm
subsystem is to check whether we had a mmu_notifer update in between
using mmu_notifier_retry().

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-04-17 11:23:39 +10:00
Linus Torvalds d19d5efd8c powerpc updates for 4.1
- Numerous minor fixes, cleanups etc.
 - More EEH work from Gavin to remove its dependency on device_nodes.
 - Memory hotplug implemented entirely in the kernel from Nathan Fontenot.
 - Removal of redundant CONFIG_PPC_OF by Kevin Hao.
 - Rewrite of VPHN parsing logic & tests from Greg Kurz.
 - A fix from Nish Aravamudan to reduce memory usage by clamping
   nodes_possible_map.
 - Support for pstore on powernv from Hari Bathini.
 - Removal of old powerpc specific byte swap routines by David Gibson.
 - Fix from Vasant Hegde to prevent the flash driver telling you it was flashing
   your firmware when it wasn't.
 - Patch from Ben Herrenschmidt to add an OPAL heartbeat driver.
 - Fix for an oops causing get/put_cpu_var() imbalance in perf by Jan Stancek.
 - Some fixes for migration from Tyrel Datwyler.
 - A new syscall to switch the cpu endian by Michael Ellerman.
 - Large series from Wei Yang to implement SRIOV, reviewed and acked by Bjorn.
 - A fix for the OPAL sensor driver from Cédric Le Goater.
 - Fixes to get STRICT_MM_TYPECHECKS building again by Michael Ellerman.
 - Large series from Daniel Axtens to make our PCI hooks per PHB rather than per
   machine.
 - Small patch from Sam Bobroff to explicitly abort non-suspended transactions
   on syscalls, plus a test to exercise it.
 - Numerous reworks and fixes for the 24x7 PMU from Sukadev Bhattiprolu.
 - Small patch to enable the hard lockup detector from Anton Blanchard.
 - Fix from Dave Olson for missing L2 cache information on some CPUs.
 - Some fixes from Michael Ellerman to get Cell machines booting again.
 - Freescale updates from Scott: Highlights include BMan device tree nodes, an
   MSI erratum workaround, a couple minor performance improvements, config
   updates, and misc fixes/cleanup.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJVL2cxAAoJEFHr6jzI4aWAR8cP/19VTo/CzCE4ffPSx7qR464n
 F+WFZcbNjIMXu6+B0YLuJZEsuWtKKrCit/MCg3+mSgE4iqvxmtI+HDD0445Buszj
 UD4E4HMdPrXQ+KUSUDORvRjv/FFUXIa94LSv/0g2UeMsPz/HeZlhMxEu7AkXw9Nf
 rTxsmRTsOWME85Y/c9ss7XHuWKXT3DJV7fOoK9roSaN3dJAuWTtG3WaKS0nUu0ok
 0M81D6ZczoD6ybwh2DUMPD9K6SGxLdQ4OzQwtW6vWzcQIBDfy5Pdeo0iAFhGPvXf
 T4LLPkv4cF4AwHsAC4rKDPHQNa+oZBoLlScrHClaebAlDiv+XYKNdMogawUObvSh
 h7avKmQr0Ygp1OvvZAaXLhuDJI9FJJ8lf6AOIeULgHsDR9SyKMjZWxRzPe11uarO
 Fyi0qj3oJaQu6LjazZraApu8mo+JBtQuD3z3o5GhLxeFtBBF60JXj6zAXJikufnl
 kk1/BUF10nKUhtKcDX767AMUCtMH3fp5hx8K/z9T5v+pobJB26Wup1bbdT68pNBT
 NjdKUppV6QTjZvCsA6U2/ECu6E9KeIaFtFSL2IRRoiI0dWBN5/5eYn3RGkO2ZFoL
 1NdwKA2XJcchwTPkpSRrUG70sYH0uM2AldNYyaLfjzrQqza7Y6lF699ilxWmCN/H
 OplzJAE5cQ8Am078veTW
 =03Yh
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux

Pull powerpc updates from Michael Ellerman:

 - Numerous minor fixes, cleanups etc.

 - More EEH work from Gavin to remove its dependency on device_nodes.

 - Memory hotplug implemented entirely in the kernel from Nathan
   Fontenot.

 - Removal of redundant CONFIG_PPC_OF by Kevin Hao.

 - Rewrite of VPHN parsing logic & tests from Greg Kurz.

 - A fix from Nish Aravamudan to reduce memory usage by clamping
   nodes_possible_map.

 - Support for pstore on powernv from Hari Bathini.

 - Removal of old powerpc specific byte swap routines by David Gibson.

 - Fix from Vasant Hegde to prevent the flash driver telling you it was
   flashing your firmware when it wasn't.

 - Patch from Ben Herrenschmidt to add an OPAL heartbeat driver.

 - Fix for an oops causing get/put_cpu_var() imbalance in perf by Jan
   Stancek.

 - Some fixes for migration from Tyrel Datwyler.

 - A new syscall to switch the cpu endian by Michael Ellerman.

 - Large series from Wei Yang to implement SRIOV, reviewed and acked by
   Bjorn.

 - A fix for the OPAL sensor driver from Cédric Le Goater.

 - Fixes to get STRICT_MM_TYPECHECKS building again by Michael Ellerman.

 - Large series from Daniel Axtens to make our PCI hooks per PHB rather
   than per machine.

 - Small patch from Sam Bobroff to explicitly abort non-suspended
   transactions on syscalls, plus a test to exercise it.

 - Numerous reworks and fixes for the 24x7 PMU from Sukadev Bhattiprolu.

 - Small patch to enable the hard lockup detector from Anton Blanchard.

 - Fix from Dave Olson for missing L2 cache information on some CPUs.

 - Some fixes from Michael Ellerman to get Cell machines booting again.

 - Freescale updates from Scott: Highlights include BMan device tree
   nodes, an MSI erratum workaround, a couple minor performance
   improvements, config updates, and misc fixes/cleanup.

* tag 'powerpc-4.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux: (196 commits)
  powerpc/powermac: Fix build error seen with powermac smp builds
  powerpc/pseries: Fix compile of memory hotplug without CONFIG_MEMORY_HOTREMOVE
  powerpc: Remove PPC32 code from pseries specific find_and_init_phbs()
  powerpc/cell: Fix iommu breakage caused by controller_ops change
  powerpc/eeh: Fix crash in eeh_add_device_early() on Cell
  powerpc/perf: Cap 64bit userspace backtraces to PERF_MAX_STACK_DEPTH
  powerpc/perf/hv-24x7: Fail 24x7 initcall if create_events_from_catalog() fails
  powerpc/pseries: Correct memory hotplug locking
  powerpc: Fix missing L2 cache size in /sys/devices/system/cpu
  powerpc: Add ppc64 hard lockup detector support
  oprofile: Disable oprofile NMI timer on ppc64
  powerpc/perf/hv-24x7: Add missing put_cpu_var()
  powerpc/perf/hv-24x7: Break up single_24x7_request
  powerpc/perf/hv-24x7: Define update_event_count()
  powerpc/perf/hv-24x7: Whitespace cleanup
  powerpc/perf/hv-24x7: Define add_event_to_24x7_request()
  powerpc/perf/hv-24x7: Rename hv_24x7_event_update
  powerpc/perf/hv-24x7: Move debug prints to separate function
  powerpc/perf/hv-24x7: Drop event_24x7_request()
  powerpc/perf/hv-24x7: Use pr_devel() to log message
  ...

Conflicts:
	tools/testing/selftests/powerpc/Makefile
	tools/testing/selftests/powerpc/tm/Makefile
2015-04-16 13:53:32 -05:00
Linus Torvalds d0bbe0dd35 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial tree from Jiri Kosina:
 "Usual trivial tree updates.  Nothing outstanding -- mostly printk()
  and comment fixes and unused identifier removals"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial:
  goldfish: goldfish_tty_probe() is not using 'i' any more
  powerpc: Fix comment in smu.h
  qla2xxx: Fix printks in ql_log message
  lib: correct link to the original source for div64_u64
  si2168, tda10071, m88ds3103: Fix firmware wording
  usb: storage: Fix printk in isd200_log_config()
  qla2xxx: Fix printk in qla25xx_setup_mode
  init/main: fix reset_device comment
  ipwireless: missing assignment
  goldfish: remove unreachable line of code
  coredump: Fix do_coredump() comment
  stacktrace.h: remove duplicate declaration task_struct
  smpboot.h: Remove unused function prototype
  treewide: Fix typo in printk messages
  treewide: Fix typo in printk messages
  mod_devicetable: fix comment for match_flags
2015-04-14 09:50:27 -07:00
Michael Ellerman 89a51df5ab powerpc/eeh: Fix crash in eeh_add_device_early() on Cell
The recent change to the EEH probing causes a crash on Cell because
eeh_ops is NULL.

Check if EEH is enabled and if not bail out.

Fixes: ff57b454dd ("powerpc/eeh: Do probe on pci_dn")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-04-14 17:13:31 +10:00
Michael Ellerman ad30cb9946 Merge branch 'next-sriov' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc into next
Merge Richard's work to support SR-IOV on PowerNV. All generic PCI
patches acked by Bjorn.

Some minor conflicts with Daniel's pci_controller_ops work.

Conflicts:
	arch/powerpc/include/asm/machdep.h
	arch/powerpc/platforms/powernv/pci-ioda.c
2015-04-14 09:29:23 +10:00
Dave Olson f7e9e35836 powerpc: Fix missing L2 cache size in /sys/devices/system/cpu
This problem appears to have been introduced in 2.6.29 by commit
93197a36a9 "Rewrite sysfs processor cache info code".

This caused lscpu to error out on at least e500v2 devices, eg:

  error: cannot open /sys/devices/system/cpu/cpu0/cache/index2/size: No such file or directory

Some embedded powerpc systems use cache-size in DTS for the unified L2
cache size, not d-cache-size, so we need to allow for both DTS names.
Added a new CACHE_TYPE_UNIFIED_D cache_type_info structure to handle
this.

Fixes: 93197a36a9 ("powerpc: Rewrite sysfs processor cache info code")
Signed-off-by: Dave Olson <olson@cumulusnetworks.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-04-11 20:49:28 +10:00
Anton Blanchard c54b2bf1b5 powerpc: Add ppc64 hard lockup detector support
The hard lockup detector uses a PMU event as a periodic NMI to
detect if we are stuck (where stuck means no timer interrupts have
occurred).

Ben's rework of the ppc64 soft disable code has made ppc64 PMU
exceptions a partial NMI. They can get disabled if an external
interrupt comes in, but otherwise PMU interrupts will fire in
interrupt disabled regions.

We disable the hard lockup detector by default for a few reasons:

- It breaks userspace event based branches on POWER8.
- It is likely to produce false positives on KVM guests.
- Since PMCs can only count to 2^31, counting cycles means we might
  take multiple PMU exceptions per second per hardware thread even
  if our hard lockup timeout is 10 seconds.

It can be enabled via a boot option, or via procfs.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-04-11 20:49:27 +10:00
Sam bobroff feba40362b powerpc/tm: Abort syscalls in active transactions
This patch changes the syscall handler to doom (tabort) active
transactions when a syscall is made and return immediately without
performing the syscall.

Currently, the system call instruction automatically suspends an
active transaction which causes side effects to persist when an active
transaction fails.

This does change the kernel's behaviour, but in a way that was
documented as unsupported. It doesn't reduce functionality because
syscalls will still be performed after tsuspend. It also provides a
consistent interface and makes the behaviour of user code
substantially the same across powerpc and platforms that do not
support suspended transactions (e.g. x86 and s390).

Performance measurements using
http://ozlabs.org/~anton/junkcode/null_syscall.c
indicate the cost of a system call increases by about 0.5%.

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Acked-By: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-04-11 20:49:19 +10:00
Daniel Axtens 467efc2e4f powerpc: Remove shims for pci_controller_ops operations
Remove shims, patch callsites to use pci_controller_ops
versions instead.

Also move back the probe mode defines, as explained in the patch
for pci_probe_mode.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-04-11 20:49:18 +10:00
Daniel Axtens 97884e00e2 powerpc: fsl_pci, swiotlb: Move controller ops from ppc_md to controller_ops
Move the installation of DMA operations out of swiotlb's subsys
initcall, and into the generic PCI controller operations struct.

These ops are installed conditionally, based on the ppc_swiotlb_enable
global. The global can be set in two places:
 - swiotlb_detect_4g, which is always called at the arch initcall level
 - setup_pci_atmu, which is called as part of the fsl_add_bridge and
fsl_pci_syscore_do_resume.

fsl_pci_syscore_do_resume is called late enough that any changes as a
result of that call will have no effect.

As such, if we test the global and set the operations as part of
fsl_add_bridge, after the call to setup_pci_atmu, we can be confident
that it will cover all the PCI implementations affected by the changes
to dma-swiotlb.c.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-04-11 20:49:17 +10:00
Daniel Axtens cd16c7ba0c powerpc: Create pci_controller_ops.reset_secondary_bus and shim
Add pci_controller_ops.reset_secondary_bus,
shadowing ppc_md.pcibios_reset_secondary_bus.
Add a shim, and changes the callsites to use the shim.

Use pcibios_reset_secondary_bus_shim, as both
pcibios_reset_secondary_bus and pci_reset_secondary_bus
are already taken.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-04-11 20:49:14 +10:00
Daniel Axtens 542070baf4 powerpc: Create pci_controller_ops.window_alignment and shim
Add pci_controller_ops.window_alignment,
shadowing ppc_md.pcibios_window_alignment.
Add a shim, and changes the callsites to use the shim.

Here, we use pci_window_alignment, as pcibios_window_alignment is
already taken.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-04-11 20:49:13 +10:00
Daniel Axtens b31e79f8d9 powerpc: Create pci_controller_ops.enable_device_hook and shim
Add pci_controller_ops.enable_device_hook,
shadowing ppc_md.pcibios_enable_device_hook.
Add a shim, and changes the callsites to use the shim.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-04-11 20:49:13 +10:00
Daniel Axtens ff9df8c87d powerpc: Create pci_controller_ops.probe_mode and shim
Add pci_controller_ops.probe_mode, shadowing ppc_md.pci_probe_mode.
Add a shim, and changes the callsites to use the shim.

We also need to move the probe mode defines to pci-bridge.h from pci.h.
They are required by the shim in order to return a sensible default.
Previously, the were defined in pci.h, but pci.h includes pci-bridge.h
before the relevant #defines. This means the definitions are absent
if pci.h is included before pci-bridge.h. This occurs in some drivers.
So, move the definitons now, and move them back when we remove the shim.

Anything that wants the defines would have had to include pci.h, and
since pci.h includes pci-bridge.h, nothing will lose access to the
defines.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-04-11 20:49:12 +10:00
Daniel Axtens b122c95494 powerpc: Create pci_controller_ops.dma_bus_setup and shim
Add pci_controller_ops.dma_bus_setup, shadowing ppc_md.pci_dma_bus_setup.
Add a shim, and changes the callsites to use the shim.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-04-11 20:49:11 +10:00
Daniel Axtens e02def5bce powerpc: Create pci_controller_ops.dma_dev_setup and shim
Introduces the pci_controller_ops structure.
Add pci_controller_ops.dma_dev_setup, shadowing ppc_md.pci_dma_dev_setup.
Add a shim, and change the callsites to use the shim.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-04-11 20:49:11 +10:00
Daniel Axtens c88c2a1889 powerpc: pcibios_enable_device_hook: return bool rather than int
pcibios_enable_device_hook returned an int. Every implementation
returned either -EINVAL or 0. The return value wasn't propagated by
the caller: any non-zero return value caused pcibios_enable_device
to return -EINVAL itself. Therefore, make the hook return a bool.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-04-11 20:49:10 +10:00
Daniel Axtens bdc728a849 powerpc: move find_and_init_phbs() to pSeries specific code
Previously, find_and_init_phbs() was used in both PowerNV and pSeries
setup. However, since RTAS support has been dropped from PowerNV, we
can move it into a platform-specific file.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-04-11 20:49:09 +10:00
Michael Ellerman 7e862d7e7d powerpc: Reword the "returning from prom_init" message
We get way too many bug reports that say "the kernel is hung in
prom_init", which stems from the fact that the last piece of output
people see is "returning from prom_init".

The kernel is almost never hung in prom_init(), it's just that it's
crashed somewhere after prom_init() but prior to the console coming up.

The existing message should give a clue to that, ie. "returning from"
indicates that prom_init() has finished, but it doesn't seem to work.
Let's try something different.

This prints:

  Quiescing Open Firmware ...
  Booting Linux via __start() ...

Which hopefully makes it clear that prom_init() is not the problem, and
although __start() probably isn't either, it's at least the right place
to begin looking.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Wistfully-Acked-by: Jeremy Kerr <jk@ozlabs.org>
2015-04-10 20:02:48 +10:00
Michael Ellerman f691fa1080 powerpc: Replace mem_init_done with slab_is_available()
We have a powerpc specific global called mem_init_done which is "set on
boot once kmalloc can be called".

But that's not *quite* true. We set it at the bottom of mem_init(), and
rely on the fact that mm_init() calls kmem_cache_init() immediately
after that, and nothing is running in parallel.

So replace it with the generic and 100% correct slab_is_available().

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-04-10 20:02:48 +10:00
Michael Ellerman bf4981a006 powerpc: Remove the celleb support
The celleb code has seen no actual development for ~7 years.

We (maintainers) have no access to test hardware, and it is highly
likely the code has bit-rotted.

As far as we're aware the hardware was never widely available, and is
certainly no longer available, and no one on the list has shown any
interest in it over the years.

So remove it. If anyone has one and cares please speak up.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Jeremy Kerr <jk@ozlabs.org>
2015-04-07 17:15:13 +10:00
Michael Ellerman 428d4d6520 Merge branch 'next-eeh' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc into next 2015-04-07 13:24:55 +10:00
Benjamin Herrenschmidt d4ed11aa48 Merge branch 'next-eeh' into next-sriov
Merge in Gavin EEH fixes
2015-03-31 13:11:17 +11:00
Gavin Shan 433185d2b4 powerpc/eeh: Fix PE#0 check in eeh_add_to_parent_pe()
The function eeh_add_parent_pe() is used to create a PE or add one
edev to its parent PE. Current code checks if PE#0 is valid for the
later case. Actually, we should validate PE#0 for both cases when
EEH core regards PE#0 as invalid one (without flag EEH_VALID_PE_ZERO).
Otherwise, not all EEH devices can be added to its parent PE#0 for
EEH on P7IOC.

The patch fixes the issue by validating PE#0 for the two cases. So far,
we don't have PE#0 for EEH on P7IOC, but it will show up when we enable
M64 for P7IOC. The patch also makes the error message more meaningful.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-31 13:10:39 +11:00
Wei Yang 781a868f31 powerpc/powernv: Shift VF resource with an offset
On PowerNV platform, resource position in M64 BAR implies the PE# the
resource belongs to. In some cases, adjustment of a resource is necessary
to locate it to a correct position in M64 BAR .

This patch adds pnv_pci_vf_resource_shift() to shift the 'real' PF IOV BAR
address according to an offset.

Note:

    After doing so, there would be a "hole" in the /proc/iomem when offset
    is a positive value. It looks like the device return some mmio back to
    the system, which actually no one could use it.

[bhelgaas: rework loops, rework overlap check, index resource[]
conventionally, remove pci_regs.h include, squashed with next patch]
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-31 13:02:38 +11:00
Wei Yang 5350ab3fd7 powerpc/powernv: Implement pcibios_iov_resource_alignment() on powernv
Implement pcibios_iov_resource_alignment() on powernv platform.

On PowerNV platform, there are 3 cases for the IOV BAR:
1. initial state, the IOV BAR size is multiple times of VF BAR size
2. after expanded, the IOV BAR size is expanded to meet the M64 segment size
3. sizing stage, the IOV BAR is truncated to 0

pnv_pci_iov_resource_alignment() handle these three cases respectively.

[bhelgaas: adjust to drop "align" parameter, return pci_iov_resource_size()
if no ppc_md machdep_call version]
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-31 13:02:37 +11:00
Wei Yang 6e628c7d33 powerpc/powernv: Reserve additional space for IOV BAR according to the number of total_pe
On PHB3, PF IOV BAR will be covered by M64 BAR to have better PE isolation.
M64 BAR is a type of hardware resource in PHB3, which could map a range of
MMIO to PE numbers on powernv platform. And this range is divided equally
by the number of total_pe with each divided range mapping to a PE number.
Also, the M64 BAR must map a MMIO range with power-of-two size.

The total_pe number is usually different from total_VFs, which can lead to
a conflict between MMIO space and the PE number.

For example, if total_VFs is 128 and total_pe is 256, the second half of
M64 BAR will be part of other PCI device, which may already belong to other
PEs.

This patch prevents the conflict by reserving additional space for the PF
IOV BAR, which is total_pe number of VF's BAR size.

[bhelgaas: make dev_printk() output more consistent, index resource[]
conventionally]
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-31 13:02:37 +11:00
Wei Yang c3b80fb0f2 powerpc/pci: Don't unset PCI resources for VFs
Flag PCI_REASSIGN_ALL_RSRC is used to ignore resources information setup by
firmware, so that kernel would re-assign all resources of pci devices.

On powerpc arch, this happens in a header fixup function
pcibios_fixup_resources(), which will clean up the resources if this flag
is set. This works fine for PFs, since after clean up, kernel will
re-assign the resources in pcibios_resource_survey().

Below is a simple call flow on how it works:

    pcibios_init
      pcibios_scan_phb
        pci_scan_child_bus
          ...
            pci_device_add
              pci_fixup_device(pci_fixup_header)
                pcibios_fixup_resources                     # header fixup
                  for (i = 0; i < DEVICE_COUNT_RESOURCE; i++)
                    dev->resource[i].start = 0
      pcibios_resource_survey                               # re-assign
        pcibios_allocate_resources

However, the VF resources won't be re-assigned, since the VF resources are
completely determined by the PF resources, and the PF resources have
already been reassigned. This means we need to leave VF's resources
un-cleared in pcibios_fixup_resources().

In this patch, we skip the resource unset process in
pcibios_fixup_resources(), if the pci_dev is a VF.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-31 13:02:37 +11:00
Gavin Shan a8b2f8288a powerpc/pci: Create pci_dn for VFs
pci_dn is the extension of PCI device node and is created from device node.
Unfortunately, VFs are enabled dynamically by PF's driver and they don't
have corresponding device nodes and pci_dn, which is required to access
VFs' config spaces.

The patch creates pci_dn for VFs in pcibios_sriov_enable() on their PF,
and removes pci_dn for VFs in pcibios_sriov_disable() on their PF. When
VF's pci_dn is created, it's put to the child list of the pci_dn of PF's
upstream bridge. The pci_dn is linked to pci_dev during early fixup time
to setup the fast path.

[bhelgaas: add ifdef around add_one_dev_pci_info(), use dev_printk()]
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-31 13:02:37 +11:00
Michael Ellerman 529d235a0e powerpc: Add a proper syscall for switching endianness
We currently have a "special" syscall for switching endianness. This is
syscall number 0x1ebe, which is handled explicitly in the 64-bit syscall
exception entry.

That has a few problems, firstly the syscall number is outside of the
usual range, which confuses various tools. For example strace doesn't
recognise the syscall at all.

Secondly it's handled explicitly as a special case in the syscall
exception entry, which is complicated enough without it.

As a first step toward removing the special syscall, we need to add a
regular syscall that implements the same functionality.

The logic is simple, it simply toggles the MSR_LE bit in the userspace
MSR. This is the same as the special syscall, with the caveat that the
special syscall clobbers fewer registers.

This version clobbers r9-r12, XER, CTR, and CR0-1,5-7.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-03-28 22:03:40 +11:00
Tyrel Datwyler c03e73740d powerpc/pseries: Simplify check for suspendability during suspend/migration
During suspend/migration operation we must wait for the VASI state reported
by the hypervisor to become Suspending prior to making the ibm,suspend-me
RTAS call. Calling routines to rtas_ibm_supend_me() pass a vasi_state variable
that exposes the VASI state to the caller. This is unnecessary as the caller
only really cares about the following three conditions; if there is an error
we should bailout, success indicating we have suspended and woken back up so
proceed to device tree update, or we are not suspendable yet so try calling
rtas_ibm_suspend_me again shortly.

This patch removes the extraneous vasi_state variable and simply uses the
return code to communicate how to proceed. We either succeed, fail, or get
-EAGAIN in which case we sleep for a second before trying to call
rtas_ibm_suspend_me again. The behaviour of ppc_rtas() remains the same,
but migrate_store() now returns the propogated error code on failure.
Previously -1 was returned from migrate_store() in the  failure case which
equates to -EPERM and was clearly wrong.

Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
Cc: Nathan Fontenont <nfont@linux.vnet.ibm.com>
Cc: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-03-28 12:20:39 +11:00
Gavin Shan c6406d8fbb powerpc/eeh: Remove device_node dependency
The patch removes struct eeh_dev::dn and the corresponding helper
functions: eeh_dev_to_of_node() and of_node_to_eeh_dev(). Instead,
eeh_dev_to_pdn() and pdn_to_eeh_dev() should be used to get the
pdn, which might contain device_node on PowerNV platform.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-24 13:15:53 +11:00
Gavin Shan 0bd785873c powerpc/eeh: Replace device_node with pci_dn in eeh_ops
There are 3 EEH operations whose arguments contain device_node:
read_config(), write_config() and restore_config(). The patch
replaces device_node with pci_dn.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-24 13:15:52 +11:00
Gavin Shan ff57b454dd powerpc/eeh: Do probe on pci_dn
Originally, EEH core probes on device_node or pci_dev to populate
EEH devices and PEs, which conflicts with the fact: SRIOV VFs are
usually enabled and created by PF's driver and they don't have the
corresponding device_nodes. Instead, SRIOV VFs have dynamically
created pci_dn, which can be used for EEH probe.

The patch reworks EEH probe for PowerNV and pSeries platforms to
do probing based on pci_dn, instead of pci_dev or device_node any
more.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-24 13:15:52 +11:00
Gavin Shan e8e9b34cef powerpc/eeh: Create eeh_dev from pci_dn instead of device_node
The patch adds function traverse_pci_dn(), which is similar to
traverse_pci_devices() except it takes pci_dn, not device_node
as parameter. The pci_dev.c has been reworked to create eeh_dev
from pci_dn, instead of device_node.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-24 13:15:51 +11:00
Gavin Shan c035ff1d2e powerpc/pci: Trace more information from pci_dn
Originally, EEH probes on device_node or pci_dev and populates the
corresponding eeh_dev. In the subsequent patches, EEH will probes
on pci_dn and populates the corresponding eeh_dev. So we have to
cache some information in pci_dn, either from device_node or SRIOV
PF's enablement platform hook, to populate the eeh_dev properly.

The motivation to probe pci_dn, instead of device node or pci_dev,
to populate eeh_dev is SRIOV VFs are dynamically created and we
don't have the corresponding device nodes for them.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-24 13:15:50 +11:00
Gavin Shan cca87d303c powerpc/pci: Refactor pci_dn
Currently, the PCI config accessors are implemented based on device node.
Unfortunately, SRIOV VFs won't have the corresponding device nodes. pci_dn
will be used in replacement with device node for SRIOV VFs. So we have to
use pci_dn in PCI config accessors.

The patch refactors pci_dn in following aspects to make it ready to be used
in PCI config accessors as we do in subsequent patch:

   * pci_dn is organized as a hierarchy tree.  PCI device's pci_dn is
     put to the child list of pci_dn of its upstream bridge or PHB. VF's
     pci_dn will be put to the child list of pci_dn of PF's bridge.

   * For one particular PCI device (VF or not), its pci_dn can be
     found from pdev->dev.archdata.pci_data, PCI_DN(devnode), or
     parent's list.  The fast path (fetching pci_dn through PCI device
     instance) is populated during early fixup time.

[bhelgaas: changelog]
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-03-24 13:15:49 +11:00
Mahesh Salgaonkar 44d5f6f590 powerpc/book3s: Fix the MCE code to use CONFIG_KVM_BOOK3S_64_HANDLER
commit id 2ba9f0d has changed CONFIG_KVM_BOOK3S_64_HV to tristate to allow
HV/PR bits to be built as modules. But the MCE code still depends on
CONFIG_KVM_BOOK3S_64_HV which is wrong. When user selects
CONFIG_KVM_BOOK3S_64_HV=m to build HV/PR bits as a separate module the
relevant MCE code gets excluded.

This patch fixes the MCE code to use CONFIG_KVM_BOOK3S_64_HANDLER. This
makes sure that the relevant MCE code is included when HV/PR bits
are built as a separate modules.

Fixes: 2ba9f0d887 ("kvm: powerpc: book3s: Support building HV and PR KVM as module")
Cc: stable@vger.kernel.org  # v3.14+
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-03-23 17:10:47 +11:00
Paul Mackerras f57333a767 powerpc/powernv: Fix return value from power7_nap() et al.
The power7_nap(), power7_sleep() and power7_winkle() functions are
called from pnv_smp_cpu_kill_self(), which expects them to return the
SRR1 value set by the hardware on wakeup, or 0 if no nap/sleep/winkle
occurred.  However, in the case where an interrupt needs to be
replayed, the logic in power7_powersave_common (the common code for
power7_nap et al.) doesn't set r3 to 0 in this case.  Instead what we
get as the return value is the selector for the type of power-saving
mode requested (1, 2 or 3).  In fact this should not affect the
operation of pnv_smp_cpu_kill_self(), but it is better to get this
correct, so this adds an instruction to set r3 to 0 in this case.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-03-23 15:06:50 +11:00