Commit graph

767947 commits

Author SHA1 Message Date
Pavel Tatashin 8dbe438589 x86/tsc: Make use of tsc_calibrate_cpu_early()
During early boot enable tsc_calibrate_cpu_early() and switch to
tsc_calibrate_cpu() only later. Do this unconditionally, because it is
unknown what methods other cpus will use to calibrate once they are
onlined.

If by the time tsc_init() is called tsc frequency is still unknown do only
pit_hpet_ptimer_calibrate_cpu() to calibrate, as this function contains the
only methods wich have not been called and tried earlier.

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: steven.sistare@oracle.com
Cc: daniel.m.jordan@oracle.com
Cc: linux@armlinux.org.uk
Cc: schwidefsky@de.ibm.com
Cc: heiko.carstens@de.ibm.com
Cc: john.stultz@linaro.org
Cc: sboyd@codeaurora.org
Cc: hpa@zytor.com
Cc: douly.fnst@cn.fujitsu.com
Cc: peterz@infradead.org
Cc: prarit@redhat.com
Cc: feng.tang@intel.com
Cc: pmladek@suse.com
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: linux-s390@vger.kernel.org
Cc: boris.ostrovsky@oracle.com
Cc: jgross@suse.com
Cc: pbonzini@redhat.com
Link: https://lkml.kernel.org/r/20180719205545.16512-27-pasha.tatashin@oracle.com
2018-07-20 00:02:44 +02:00
Pavel Tatashin 03821f451d x86/tsc: Split native_calibrate_cpu() into early and late parts
During early boot TSC and CPU frequency can be calibrated using MSR, CPUID,
and quick PIT calibration methods. The other methods PIT/HPET/PMTIMER are
available only after ACPI is initialized.

Split native_calibrate_cpu() into early and late parts so they can be
called separately during early and late tsc calibration.

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: steven.sistare@oracle.com
Cc: daniel.m.jordan@oracle.com
Cc: linux@armlinux.org.uk
Cc: schwidefsky@de.ibm.com
Cc: heiko.carstens@de.ibm.com
Cc: john.stultz@linaro.org
Cc: sboyd@codeaurora.org
Cc: hpa@zytor.com
Cc: douly.fnst@cn.fujitsu.com
Cc: peterz@infradead.org
Cc: prarit@redhat.com
Cc: feng.tang@intel.com
Cc: pmladek@suse.com
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: linux-s390@vger.kernel.org
Cc: boris.ostrovsky@oracle.com
Cc: jgross@suse.com
Cc: pbonzini@redhat.com
Link: https://lkml.kernel.org/r/20180719205545.16512-26-pasha.tatashin@oracle.com
2018-07-20 00:02:44 +02:00
Pavel Tatashin 46457ea464 sched/clock: Use static key for sched_clock_running
sched_clock_running may be read every time sched_clock_cpu() is called.
Yet, this variable is updated only twice during boot, and never changes
again, therefore it is better to make it a static key.

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: steven.sistare@oracle.com
Cc: daniel.m.jordan@oracle.com
Cc: linux@armlinux.org.uk
Cc: schwidefsky@de.ibm.com
Cc: heiko.carstens@de.ibm.com
Cc: john.stultz@linaro.org
Cc: sboyd@codeaurora.org
Cc: hpa@zytor.com
Cc: douly.fnst@cn.fujitsu.com
Cc: prarit@redhat.com
Cc: feng.tang@intel.com
Cc: pmladek@suse.com
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: linux-s390@vger.kernel.org
Cc: boris.ostrovsky@oracle.com
Cc: jgross@suse.com
Cc: pbonzini@redhat.com
Link: https://lkml.kernel.org/r/20180719205545.16512-25-pasha.tatashin@oracle.com
2018-07-20 00:02:43 +02:00
Pavel Tatashin 857baa87b6 sched/clock: Enable sched clock early
Allow sched_clock() to be used before schec_clock_init() is called.  This
provides a way to get early boot timestamps on machines with unstable
clocks.

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: steven.sistare@oracle.com
Cc: daniel.m.jordan@oracle.com
Cc: linux@armlinux.org.uk
Cc: schwidefsky@de.ibm.com
Cc: heiko.carstens@de.ibm.com
Cc: john.stultz@linaro.org
Cc: sboyd@codeaurora.org
Cc: hpa@zytor.com
Cc: douly.fnst@cn.fujitsu.com
Cc: peterz@infradead.org
Cc: prarit@redhat.com
Cc: feng.tang@intel.com
Cc: pmladek@suse.com
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: linux-s390@vger.kernel.org
Cc: boris.ostrovsky@oracle.com
Cc: jgross@suse.com
Cc: pbonzini@redhat.com
Link: https://lkml.kernel.org/r/20180719205545.16512-24-pasha.tatashin@oracle.com
2018-07-20 00:02:43 +02:00
Pavel Tatashin 5d2a4e91a5 sched/clock: Move sched clock initialization and merge with generic clock
sched_clock_postinit() initializes a generic clock on systems where no
other clock is provided. This function may be called only after
timekeeping_init().

Rename sched_clock_postinit to generic_clock_inti() and call it from
sched_clock_init(). Move the call for sched_clock_init() until after
time_init().

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: steven.sistare@oracle.com
Cc: daniel.m.jordan@oracle.com
Cc: linux@armlinux.org.uk
Cc: schwidefsky@de.ibm.com
Cc: heiko.carstens@de.ibm.com
Cc: john.stultz@linaro.org
Cc: sboyd@codeaurora.org
Cc: hpa@zytor.com
Cc: douly.fnst@cn.fujitsu.com
Cc: prarit@redhat.com
Cc: feng.tang@intel.com
Cc: pmladek@suse.com
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: linux-s390@vger.kernel.org
Cc: boris.ostrovsky@oracle.com
Cc: jgross@suse.com
Cc: pbonzini@redhat.com
Link: https://lkml.kernel.org/r/20180719205545.16512-23-pasha.tatashin@oracle.com
2018-07-20 00:02:43 +02:00
Pavel Tatashin 4763f03d3d x86/tsc: Use TSC as sched clock early
All prerequesites for enabling TSC as sched clock early in the boot
process are available now:

 - Early attempt of TSC calibration

 - Early availablity of static branch patching

If TSC frequency can be established in the early calibration, enable the
static key which switches sched clock to use TSC.

[ tglx: Massaged changelog ]

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: steven.sistare@oracle.com
Cc: daniel.m.jordan@oracle.com
Cc: linux@armlinux.org.uk
Cc: schwidefsky@de.ibm.com
Cc: heiko.carstens@de.ibm.com
Cc: john.stultz@linaro.org
Cc: sboyd@codeaurora.org
Cc: hpa@zytor.com
Cc: douly.fnst@cn.fujitsu.com
Cc: peterz@infradead.org
Cc: prarit@redhat.com
Cc: feng.tang@intel.com
Cc: pmladek@suse.com
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: linux-s390@vger.kernel.org
Cc: boris.ostrovsky@oracle.com
Cc: jgross@suse.com
Cc: pbonzini@redhat.com
Link: https://lkml.kernel.org/r/20180719205545.16512-22-pasha.tatashin@oracle.com
2018-07-20 00:02:42 +02:00
Pavel Tatashin e2a9ca29b5 x86/tsc: Initialize cyc2ns when tsc frequency is determined
cyc2ns converts tsc to nanoseconds, and it is handled in a per-cpu data
structure.

Currently, the setup code for c2ns data for every possible CPU goes through
the same sequence of calculations as for the boot CPU, but is based on the
same tsc frequency as the boot CPU, and thus this is not necessary.

Initialize the boot cpu when tsc frequency is determined. Copy the
calculated data from the boot CPU to the other CPUs in tsc_init().

In addition do the following:

 - Remove unnecessary zeroing of c2ns data by removing cyc2ns_data_init()

 - Split set_cyc2ns_scale() into two functions, so set_cyc2ns_scale() can be
   called when system is up, and wraps around __set_cyc2ns_scale() that can
   be called directly when system is booting but avoids saving restoring
   IRQs and going and waking up from idle.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: steven.sistare@oracle.com
Cc: daniel.m.jordan@oracle.com
Cc: linux@armlinux.org.uk
Cc: schwidefsky@de.ibm.com
Cc: heiko.carstens@de.ibm.com
Cc: john.stultz@linaro.org
Cc: sboyd@codeaurora.org
Cc: hpa@zytor.com
Cc: douly.fnst@cn.fujitsu.com
Cc: peterz@infradead.org
Cc: prarit@redhat.com
Cc: feng.tang@intel.com
Cc: pmladek@suse.com
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: linux-s390@vger.kernel.org
Cc: boris.ostrovsky@oracle.com
Cc: jgross@suse.com
Cc: pbonzini@redhat.com
Link: https://lkml.kernel.org/r/20180719205545.16512-21-pasha.tatashin@oracle.com
2018-07-20 00:02:42 +02:00
Pavel Tatashin cf7a63ef4e x86/tsc: Calibrate tsc only once
During boot tsc is calibrated twice: once in tsc_early_delay_calibrate(),
and the second time in tsc_init().

Rename tsc_early_delay_calibrate() to tsc_early_init(), and rework it so
the calibration is done only early, and make tsc_init() to use the values
already determined in tsc_early_init().

Sometimes it is not possible to determine tsc early, as the subsystem that
is required is not yet initialized, in such case try again later in
tsc_init().

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: steven.sistare@oracle.com
Cc: daniel.m.jordan@oracle.com
Cc: linux@armlinux.org.uk
Cc: schwidefsky@de.ibm.com
Cc: heiko.carstens@de.ibm.com
Cc: john.stultz@linaro.org
Cc: sboyd@codeaurora.org
Cc: hpa@zytor.com
Cc: douly.fnst@cn.fujitsu.com
Cc: peterz@infradead.org
Cc: prarit@redhat.com
Cc: feng.tang@intel.com
Cc: pmladek@suse.com
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: linux-s390@vger.kernel.org
Cc: boris.ostrovsky@oracle.com
Cc: jgross@suse.com
Cc: pbonzini@redhat.com
Link: https://lkml.kernel.org/r/20180719205545.16512-20-pasha.tatashin@oracle.com
2018-07-20 00:02:42 +02:00
Pavel Tatashin 227e3958a7 ARM/time: Remove read_boot_clock64()
read_boot_clock64() is deleted, and replaced with
read_persistent_wall_and_boot_offset().

The default implementation of read_persistent_wall_and_boot_offset()
provides a better fallback than the current stubs for read_boot_clock64()
that arm has with no users, so remove the old code.

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: steven.sistare@oracle.com
Cc: daniel.m.jordan@oracle.com
Cc: linux@armlinux.org.uk
Cc: schwidefsky@de.ibm.com
Cc: heiko.carstens@de.ibm.com
Cc: john.stultz@linaro.org
Cc: sboyd@codeaurora.org
Cc: hpa@zytor.com
Cc: douly.fnst@cn.fujitsu.com
Cc: peterz@infradead.org
Cc: prarit@redhat.com
Cc: feng.tang@intel.com
Cc: pmladek@suse.com
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: linux-s390@vger.kernel.org
Cc: boris.ostrovsky@oracle.com
Cc: jgross@suse.com
Cc: pbonzini@redhat.com
Link: https://lkml.kernel.org/r/20180719205545.16512-19-pasha.tatashin@oracle.com
2018-07-20 00:02:41 +02:00
Pavel Tatashin 00067a6db2 s390/time: Remove read_boot_clock64()
read_boot_clock64() was replaced by read_persistent_wall_and_boot_offset()
so remove it.

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: steven.sistare@oracle.com
Cc: daniel.m.jordan@oracle.com
Cc: linux@armlinux.org.uk
Cc: schwidefsky@de.ibm.com
Cc: heiko.carstens@de.ibm.com
Cc: john.stultz@linaro.org
Cc: sboyd@codeaurora.org
Cc: hpa@zytor.com
Cc: douly.fnst@cn.fujitsu.com
Cc: peterz@infradead.org
Cc: prarit@redhat.com
Cc: feng.tang@intel.com
Cc: pmladek@suse.com
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: linux-s390@vger.kernel.org
Cc: boris.ostrovsky@oracle.com
Cc: jgross@suse.com
Cc: pbonzini@redhat.com
Link: https://lkml.kernel.org/r/20180719205545.16512-18-pasha.tatashin@oracle.com
2018-07-20 00:02:41 +02:00
Pavel Tatashin 4b1b7f8054 timekeeping: Default boot time offset to local_clock()
read_persistent_wall_and_boot_offset() is called during boot to read
both the persistent clock and also return the offset between the boot time
and the value of persistent clock.

Change the default boot_offset from zero to local_clock() so architectures,
that do not have a dedicated boot_clock but have early sched_clock(), such
as SPARCv9, x86, and possibly more will benefit from this change by getting
a better and more consistent estimate of the boot time without need for an
arch specific implementation.

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: steven.sistare@oracle.com
Cc: daniel.m.jordan@oracle.com
Cc: linux@armlinux.org.uk
Cc: schwidefsky@de.ibm.com
Cc: heiko.carstens@de.ibm.com
Cc: john.stultz@linaro.org
Cc: sboyd@codeaurora.org
Cc: hpa@zytor.com
Cc: douly.fnst@cn.fujitsu.com
Cc: peterz@infradead.org
Cc: prarit@redhat.com
Cc: feng.tang@intel.com
Cc: pmladek@suse.com
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: linux-s390@vger.kernel.org
Cc: boris.ostrovsky@oracle.com
Cc: jgross@suse.com
Cc: pbonzini@redhat.com
Link: https://lkml.kernel.org/r/20180719205545.16512-17-pasha.tatashin@oracle.com
2018-07-20 00:02:41 +02:00
Pavel Tatashin 3eca993740 timekeeping: Replace read_boot_clock64() with read_persistent_wall_and_boot_offset()
If architecture does not support exact boot time, it is challenging to
estimate boot time without having a reference to the current persistent
clock value. Yet, it cannot read the persistent clock time again, because
this may lead to math discrepancies with the caller of read_boot_clock64()
who have read the persistent clock at a different time.

This is why it is better to provide two values simultaneously: the
persistent clock value, and the boot time.

Replace read_boot_clock64() with:
read_persistent_wall_and_boot_offset(wall_time, boot_offset)

Where wall_time is returned by read_persistent_clock() And boot_offset is
wall_time - boot time, which defaults to 0.

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: steven.sistare@oracle.com
Cc: daniel.m.jordan@oracle.com
Cc: linux@armlinux.org.uk
Cc: schwidefsky@de.ibm.com
Cc: heiko.carstens@de.ibm.com
Cc: john.stultz@linaro.org
Cc: sboyd@codeaurora.org
Cc: hpa@zytor.com
Cc: douly.fnst@cn.fujitsu.com
Cc: peterz@infradead.org
Cc: prarit@redhat.com
Cc: feng.tang@intel.com
Cc: pmladek@suse.com
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: linux-s390@vger.kernel.org
Cc: boris.ostrovsky@oracle.com
Cc: jgross@suse.com
Cc: pbonzini@redhat.com
Link: https://lkml.kernel.org/r/20180719205545.16512-16-pasha.tatashin@oracle.com
2018-07-20 00:02:40 +02:00
Pavel Tatashin be2e0e4257 s390/time: Add read_persistent_wall_and_boot_offset()
read_persistent_wall_and_boot_offset() will replace read_boot_clock64()
because on some architectures it is more convenient to read both sources
as one may depend on the other. For s390, implementation is the same
as read_boot_clock64() but also calling and returning value of
read_persistent_clock64()

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: steven.sistare@oracle.com
Cc: daniel.m.jordan@oracle.com
Cc: linux@armlinux.org.uk
Cc: heiko.carstens@de.ibm.com
Cc: john.stultz@linaro.org
Cc: sboyd@codeaurora.org
Cc: hpa@zytor.com
Cc: douly.fnst@cn.fujitsu.com
Cc: peterz@infradead.org
Cc: prarit@redhat.com
Cc: feng.tang@intel.com
Cc: pmladek@suse.com
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: linux-s390@vger.kernel.org
Cc: boris.ostrovsky@oracle.com
Cc: jgross@suse.com
Cc: pbonzini@redhat.com
Link: https://lkml.kernel.org/r/20180719205545.16512-15-pasha.tatashin@oracle.com
2018-07-20 00:02:40 +02:00
Pavel Tatashin 38669ba205 x86/xen/time: Output xen sched_clock time from 0
It is expected for sched_clock() to output data from 0, when system boots.

Add an offset xen_sched_clock_offset (similarly how it is done in other
hypervisors i.e. kvm_sched_clock_offset) to count sched_clock() from 0,
when time is first initialized.

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: steven.sistare@oracle.com
Cc: daniel.m.jordan@oracle.com
Cc: linux@armlinux.org.uk
Cc: schwidefsky@de.ibm.com
Cc: heiko.carstens@de.ibm.com
Cc: john.stultz@linaro.org
Cc: sboyd@codeaurora.org
Cc: hpa@zytor.com
Cc: douly.fnst@cn.fujitsu.com
Cc: peterz@infradead.org
Cc: prarit@redhat.com
Cc: feng.tang@intel.com
Cc: pmladek@suse.com
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: linux-s390@vger.kernel.org
Cc: boris.ostrovsky@oracle.com
Cc: jgross@suse.com
Cc: pbonzini@redhat.com
Link: https://lkml.kernel.org/r/20180719205545.16512-14-pasha.tatashin@oracle.com
2018-07-20 00:02:40 +02:00
Pavel Tatashin 7b25b9cb0d x86/xen/time: Initialize pv xen time in init_hypervisor_platform()
In every hypervisor except for xen pv time ops are initialized in
init_hypervisor_platform().

Xen PV domains initialize time ops in x86_init.paging.pagetable_init(),
by calling xen_setup_shared_info() which is a poor design, as time is
needed prior to memory allocator.

xen_setup_shared_info() is called from two places: during boot, and
after suspend. Split the content of xen_setup_shared_info() into
three places:

1. add the clock relavent data into new xen pv init_platform vector, and
   set clock ops in there.

2. move xen_setup_vcpu_info_placement() to new xen_pv_guest_late_init()
   call.

3. Re-initializing parts of shared info copy to xen_pv_post_suspend() to
   be symmetric to xen_pv_pre_suspend

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: steven.sistare@oracle.com
Cc: daniel.m.jordan@oracle.com
Cc: linux@armlinux.org.uk
Cc: schwidefsky@de.ibm.com
Cc: heiko.carstens@de.ibm.com
Cc: john.stultz@linaro.org
Cc: sboyd@codeaurora.org
Cc: hpa@zytor.com
Cc: douly.fnst@cn.fujitsu.com
Cc: peterz@infradead.org
Cc: prarit@redhat.com
Cc: feng.tang@intel.com
Cc: pmladek@suse.com
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: linux-s390@vger.kernel.org
Cc: boris.ostrovsky@oracle.com
Cc: jgross@suse.com
Cc: pbonzini@redhat.com
Link: https://lkml.kernel.org/r/20180719205545.16512-13-pasha.tatashin@oracle.com
2018-07-20 00:02:39 +02:00
Pavel Tatashin fe9af81e52 x86/tsc: Redefine notsc to behave as tsc=unstable
Currently, the notsc kernel parameter disables the use of the TSC by
sched_clock(). However, this parameter does not prevent the kernel from
accessing tsc in other places.

The only rationale to boot with notsc is to avoid timing discrepancies on
multi-socket systems where TSC are not properly synchronized, and thus
exclude TSC from being used for time keeping. But that prevents using TSC
as sched_clock() as well, which is not necessary as the core sched_clock()
implementation can handle non synchronized TSC based sched clocks just
fine.

However, there is another method to solve the above problem: booting with
tsc=unstable parameter. This parameter allows sched_clock() to use TSC and
just excludes it from timekeeping.

So there is no real reason to keep notsc, but for compatibility reasons the
parameter has to stay. Make it behave like 'tsc=unstable' instead.

[ tglx: Massaged changelog ]

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: steven.sistare@oracle.com
Cc: daniel.m.jordan@oracle.com
Cc: linux@armlinux.org.uk
Cc: schwidefsky@de.ibm.com
Cc: heiko.carstens@de.ibm.com
Cc: john.stultz@linaro.org
Cc: sboyd@codeaurora.org
Cc: hpa@zytor.com
Cc: peterz@infradead.org
Cc: prarit@redhat.com
Cc: feng.tang@intel.com
Cc: pmladek@suse.com
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: linux-s390@vger.kernel.org
Cc: boris.ostrovsky@oracle.com
Cc: jgross@suse.com
Cc: pbonzini@redhat.com
Link: https://lkml.kernel.org/r/20180719205545.16512-12-pasha.tatashin@oracle.com
2018-07-20 00:02:39 +02:00
Borislav Petkov 9b3661cd7e x86/CPU: Call detect_nopl() only on the BSP
Make it use the setup_* variants and have it be called only on the BSP and
drop the call in generic_identify() - X86_FEATURE_NOPL will be replicated
to the APs through the forced caps. Helps to keep the mess at a manageable
level.

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: steven.sistare@oracle.com
Cc: daniel.m.jordan@oracle.com
Cc: linux@armlinux.org.uk
Cc: schwidefsky@de.ibm.com
Cc: heiko.carstens@de.ibm.com
Cc: john.stultz@linaro.org
Cc: sboyd@codeaurora.org
Cc: hpa@zytor.com
Cc: douly.fnst@cn.fujitsu.com
Cc: peterz@infradead.org
Cc: prarit@redhat.com
Cc: feng.tang@intel.com
Cc: pmladek@suse.com
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: linux-s390@vger.kernel.org
Cc: boris.ostrovsky@oracle.com
Cc: jgross@suse.com
Cc: pbonzini@redhat.com
Link: https://lkml.kernel.org/r/20180719205545.16512-11-pasha.tatashin@oracle.com
2018-07-20 00:02:39 +02:00
Pavel Tatashin 8990cac6e5 x86/jump_label: Initialize static branching early
Static branching is useful to runtime patch branches that are used in hot
path, but are infrequently changed.

The x86 clock framework is one example that uses static branches to setup
the best clock during boot and never changes it again.

It is desired to enable the TSC based sched clock early to allow fine
grained boot time analysis early on. That requires the static branching
functionality to be functional early as well.

Static branching requires patching nop instructions, thus,
arch_init_ideal_nops() must be called prior to jump_label_init().

Do all the necessary steps to call arch_init_ideal_nops() right after
early_cpu_init(), which also allows to insert a call to jump_label_init()
right after that. jump_label_init() will be called again from the generic
init code, but the code is protected against reinitialization already.

[ tglx: Massaged changelog ]

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: steven.sistare@oracle.com
Cc: daniel.m.jordan@oracle.com
Cc: linux@armlinux.org.uk
Cc: schwidefsky@de.ibm.com
Cc: heiko.carstens@de.ibm.com
Cc: john.stultz@linaro.org
Cc: sboyd@codeaurora.org
Cc: hpa@zytor.com
Cc: douly.fnst@cn.fujitsu.com
Cc: prarit@redhat.com
Cc: feng.tang@intel.com
Cc: pmladek@suse.com
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: linux-s390@vger.kernel.org
Cc: boris.ostrovsky@oracle.com
Cc: jgross@suse.com
Cc: pbonzini@redhat.com
Link: https://lkml.kernel.org/r/20180719205545.16512-10-pasha.tatashin@oracle.com
2018-07-20 00:02:38 +02:00
Pavel Tatashin 6fffacb303 x86/alternatives, jumplabel: Use text_poke_early() before mm_init()
It supposed to be safe to modify static branches after jump_label_init().
But, because static key modifying code eventually calls text_poke() it can
end up accessing a struct page which has not been initialized yet.

Here is how to quickly reproduce the problem. Insert code like this
into init/main.c:

| +static DEFINE_STATIC_KEY_FALSE(__test);
| asmlinkage __visible void __init start_kernel(void)
| {
|        char *command_line;
|@@ -587,6 +609,10 @@ asmlinkage __visible void __init start_kernel(void)
|        vfs_caches_init_early();
|        sort_main_extable();
|        trap_init();
|+       {
|+       static_branch_enable(&__test);
|+       WARN_ON(!static_branch_likely(&__test));
|+       }
|        mm_init();

The following warnings show-up:
WARNING: CPU: 0 PID: 0 at arch/x86/kernel/alternative.c:701 text_poke+0x20d/0x230
RIP: 0010:text_poke+0x20d/0x230
Call Trace:
 ? text_poke_bp+0x50/0xda
 ? arch_jump_label_transform+0x89/0xe0
 ? __jump_label_update+0x78/0xb0
 ? static_key_enable_cpuslocked+0x4d/0x80
 ? static_key_enable+0x11/0x20
 ? start_kernel+0x23e/0x4c8
 ? secondary_startup_64+0xa5/0xb0

---[ end trace abdc99c031b8a90a ]---

If the code above is moved after mm_init(), no warning is shown, as struct
pages are initialized during handover from memblock.

Use text_poke_early() in static branching until early boot IRQs are enabled
and from there switch to text_poke. Also, ensure text_poke() is never
invoked when unitialized memory access may happen by using adding a
!after_bootmem assertion.

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: steven.sistare@oracle.com
Cc: daniel.m.jordan@oracle.com
Cc: linux@armlinux.org.uk
Cc: schwidefsky@de.ibm.com
Cc: heiko.carstens@de.ibm.com
Cc: john.stultz@linaro.org
Cc: sboyd@codeaurora.org
Cc: hpa@zytor.com
Cc: douly.fnst@cn.fujitsu.com
Cc: peterz@infradead.org
Cc: prarit@redhat.com
Cc: feng.tang@intel.com
Cc: pmladek@suse.com
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: linux-s390@vger.kernel.org
Cc: boris.ostrovsky@oracle.com
Cc: jgross@suse.com
Cc: pbonzini@redhat.com
Link: https://lkml.kernel.org/r/20180719205545.16512-9-pasha.tatashin@oracle.com
2018-07-20 00:02:38 +02:00
Thomas Gleixner 95a3d4454b x86/kvmclock: Switch kvmclock data to a PER_CPU variable
The previous removal of the memblock dependency from kvmclock introduced a
static data array sized 64bytes * CONFIG_NR_CPUS. That's wasteful on large
systems when kvmclock is not used.

Replace it with:

 - A static page sized array of pvclock data. It's page sized because the
   pvclock data of the boot cpu is mapped into the VDSO so otherwise random
   other data would be exposed to the vDSO

 - A PER_CPU variable of pvclock data pointers. This is used to access the
   pcvlock data storage on each CPU.

The setup is done in two stages:

 - Early boot stores the pointer to the static page for the boot CPU in
   the per cpu data.

 - In the preparatory stage of CPU hotplug assign either an element of
   the static array (when the CPU number is in that range) or allocate
   memory and initialize the per cpu pointer.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: steven.sistare@oracle.com
Cc: daniel.m.jordan@oracle.com
Cc: linux@armlinux.org.uk
Cc: schwidefsky@de.ibm.com
Cc: heiko.carstens@de.ibm.com
Cc: john.stultz@linaro.org
Cc: sboyd@codeaurora.org
Cc: hpa@zytor.com
Cc: douly.fnst@cn.fujitsu.com
Cc: peterz@infradead.org
Cc: prarit@redhat.com
Cc: feng.tang@intel.com
Cc: pmladek@suse.com
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: linux-s390@vger.kernel.org
Cc: boris.ostrovsky@oracle.com
Cc: jgross@suse.com
Link: https://lkml.kernel.org/r/20180719205545.16512-8-pasha.tatashin@oracle.com
2018-07-20 00:02:38 +02:00
Thomas Gleixner e499a9b6dc x86/kvmclock: Move kvmclock vsyscall param and init to kvmclock
There is no point to have this in the kvm code itself and call it from
there. This can be called from an initcall and the parameter is cleared
when the hypervisor is not KVM.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: steven.sistare@oracle.com
Cc: daniel.m.jordan@oracle.com
Cc: linux@armlinux.org.uk
Cc: schwidefsky@de.ibm.com
Cc: heiko.carstens@de.ibm.com
Cc: john.stultz@linaro.org
Cc: sboyd@codeaurora.org
Cc: hpa@zytor.com
Cc: douly.fnst@cn.fujitsu.com
Cc: peterz@infradead.org
Cc: prarit@redhat.com
Cc: feng.tang@intel.com
Cc: pmladek@suse.com
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: linux-s390@vger.kernel.org
Cc: boris.ostrovsky@oracle.com
Cc: jgross@suse.com
Link: https://lkml.kernel.org/r/20180719205545.16512-7-pasha.tatashin@oracle.com
2018-07-20 00:02:37 +02:00
Thomas Gleixner 42f8df935e x86/kvmclock: Mark variables __initdata and __ro_after_init
The kvmclock parameter is init data and the other variables are not
modified after init.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: steven.sistare@oracle.com
Cc: daniel.m.jordan@oracle.com
Cc: linux@armlinux.org.uk
Cc: schwidefsky@de.ibm.com
Cc: heiko.carstens@de.ibm.com
Cc: john.stultz@linaro.org
Cc: sboyd@codeaurora.org
Cc: hpa@zytor.com
Cc: douly.fnst@cn.fujitsu.com
Cc: peterz@infradead.org
Cc: prarit@redhat.com
Cc: feng.tang@intel.com
Cc: pmladek@suse.com
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: linux-s390@vger.kernel.org
Cc: boris.ostrovsky@oracle.com
Cc: jgross@suse.com
Link: https://lkml.kernel.org/r/20180719205545.16512-6-pasha.tatashin@oracle.com
2018-07-20 00:02:37 +02:00
Thomas Gleixner 146c394d0c x86/kvmclock: Cleanup the code
- Cleanup the mrs write for wall clock. The type casts to (int) are sloppy
  because the wrmsr parameters are u32 and aside of that wrmsrl() already
  provides the high/low split for free.

- Remove the pointless get_cpu()/put_cpu() dance from various
  functions. Either they are called during early init where CPU is
  guaranteed to be 0 or they are already called from non preemptible
  context where smp_processor_id() can be used safely

- Simplify the convoluted check for kvmclock in the init function.

- Mark the parameter parsing function __init. No point in keeping it
  around.

- Convert to pr_info()

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: steven.sistare@oracle.com
Cc: daniel.m.jordan@oracle.com
Cc: linux@armlinux.org.uk
Cc: schwidefsky@de.ibm.com
Cc: heiko.carstens@de.ibm.com
Cc: john.stultz@linaro.org
Cc: sboyd@codeaurora.org
Cc: hpa@zytor.com
Cc: douly.fnst@cn.fujitsu.com
Cc: peterz@infradead.org
Cc: prarit@redhat.com
Cc: feng.tang@intel.com
Cc: pmladek@suse.com
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: linux-s390@vger.kernel.org
Cc: boris.ostrovsky@oracle.com
Cc: jgross@suse.com
Link: https://lkml.kernel.org/r/20180719205545.16512-5-pasha.tatashin@oracle.com
2018-07-20 00:02:37 +02:00
Thomas Gleixner 7a5ddc8fe0 x86/kvmclock: Decrapify kvm_register_clock()
The return value is pointless because the wrmsr cannot fail if
KVM_FEATURE_CLOCKSOURCE or KVM_FEATURE_CLOCKSOURCE2 are set.

kvm_register_clock() is only called locally so wants to be static.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: steven.sistare@oracle.com
Cc: daniel.m.jordan@oracle.com
Cc: linux@armlinux.org.uk
Cc: schwidefsky@de.ibm.com
Cc: heiko.carstens@de.ibm.com
Cc: john.stultz@linaro.org
Cc: sboyd@codeaurora.org
Cc: hpa@zytor.com
Cc: douly.fnst@cn.fujitsu.com
Cc: peterz@infradead.org
Cc: prarit@redhat.com
Cc: feng.tang@intel.com
Cc: pmladek@suse.com
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: linux-s390@vger.kernel.org
Cc: boris.ostrovsky@oracle.com
Cc: jgross@suse.com
Link: https://lkml.kernel.org/r/20180719205545.16512-4-pasha.tatashin@oracle.com
2018-07-20 00:02:36 +02:00
Thomas Gleixner 7ef363a395 x86/kvmclock: Remove page size requirement from wall_clock
There is no requirement for wall_clock data to be page aligned or page
sized.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: steven.sistare@oracle.com
Cc: daniel.m.jordan@oracle.com
Cc: linux@armlinux.org.uk
Cc: schwidefsky@de.ibm.com
Cc: heiko.carstens@de.ibm.com
Cc: john.stultz@linaro.org
Cc: sboyd@codeaurora.org
Cc: hpa@zytor.com
Cc: douly.fnst@cn.fujitsu.com
Cc: peterz@infradead.org
Cc: prarit@redhat.com
Cc: feng.tang@intel.com
Cc: pmladek@suse.com
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: linux-s390@vger.kernel.org
Cc: boris.ostrovsky@oracle.com
Cc: jgross@suse.com
Link: https://lkml.kernel.org/r/20180719205545.16512-3-pasha.tatashin@oracle.com
2018-07-20 00:02:36 +02:00
Pavel Tatashin 368a540e02 x86/kvmclock: Remove memblock dependency
KVM clock is initialized later compared to other hypervisor clocks because
it has a dependency on the memblock allocator.

Bring it in line with other hypervisors by using memory from the BSS
instead of allocating it.

The benefits:

  - Remove ifdef from common code
  - Earlier availability of the clock
  - Remove dependency on memblock, and reduce code

The downside:

  - Static allocation of the per cpu data structures sized NR_CPUS * 64byte
    Will be addressed in follow up patches.

[ tglx: Split out from larger series ]

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: steven.sistare@oracle.com
Cc: daniel.m.jordan@oracle.com
Cc: linux@armlinux.org.uk
Cc: schwidefsky@de.ibm.com
Cc: heiko.carstens@de.ibm.com
Cc: john.stultz@linaro.org
Cc: sboyd@codeaurora.org
Cc: hpa@zytor.com
Cc: douly.fnst@cn.fujitsu.com
Cc: peterz@infradead.org
Cc: prarit@redhat.com
Cc: feng.tang@intel.com
Cc: pmladek@suse.com
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: linux-s390@vger.kernel.org
Cc: boris.ostrovsky@oracle.com
Cc: jgross@suse.com
Link: https://lkml.kernel.org/r/20180719205545.16512-2-pasha.tatashin@oracle.com
2018-07-20 00:02:36 +02:00
Thomas Gleixner 73ab603f44 Merge branch 'linus' into x86/timers
Pick up upstream changes to avoid conflicts
2018-07-19 23:11:52 +02:00
Linus Torvalds fb7d1bcf16 pci-v4.18-fixes-3
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAltQ1y4UHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vyzKA//T8+ePVGcIBZhyEDy3gX0V/WXF5Sr
 feOWCy5YWsY3gWkQ1XIU40kPox+6/bsO8Cte74aO5m1cWShpqEJntuFkOInNz9ag
 6gkN1j3G7B8VjpzWlH9rML2d2QVcnm8POBkmtwEgBw8rdAumD25MFsvjENhVkdeL
 LOLzi+8kPRQl8UK33HnawU6spLgHCosSVVInIjPyNpSzw+agbW+s0i4/kmWXnB8C
 9P4VieQ3dzyfzX1W5ty82Ck6Gd6gOXI42hVls9EFrJnuxQIUPe5pX14dqFzVFP3G
 j9SrUTPbnoZ3yVxW4mhibCv3v8+vyTxPhkradj4zt1dk4Cq4+s1Z9+l8vPjChIkQ
 X679bNgI1x3xIdtUc5OcIdNI9LyzmCKZ309iNPO0bTD/gHvXromw4wqvOUCkBEoa
 HkTJeFXf+h7DsFNAcj6ntaQAbUcwjsOgHLhikhZJ8nUcxvjLvnIZ0imzytBuyboJ
 L7GV0VurwVgi2yHhdzsQ9qiSO4iTXnSe05urxzZyIn8mFETOxB6Kai2kmvAKTpqx
 DR0/gZMWHYd40zRtPmcpFbzPCJAaTtks2/C/IzOeUkSRDSLsc5IhmjduG5pNAfqC
 ikf/WrWNbZHXbLAGyD37YElhYOl7StcNz3yZtlvvbx1iNjWikU4NJR/GwgNeHZ3j
 sRI2Trz8bfJW4AE=
 =SV5J
 -----END PGP SIGNATURE-----

Merge tag 'pci-v4.18-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI fixes from Bjorn Helgaas:

 - Fix crashes that happen when PHY drivers are left disabled in the V3
   Semiconductor, MediaTek, Faraday, Aardvark, DesignWare, Versatile,
   and X-Gene host controller drivers (Sergei Shtylyov)

 - Fix a NULL pointer dereference in the endpoint library configfs
   support (Kishon Vijay Abraham I)

 - Fix a race condition in Hyper-V IRQ handling (Dexuan Cui)

* tag 'pci-v4.18-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI: v3-semi: Fix I/O space page leak
  PCI: mediatek: Fix I/O space page leak
  PCI: faraday: Fix I/O space page leak
  PCI: aardvark: Fix I/O space page leak
  PCI: designware: Fix I/O space page leak
  PCI: versatile: Fix I/O space page leak
  PCI: xgene: Fix I/O space page leak
  PCI: OF: Fix I/O space page leak
  PCI: endpoint: Fix NULL pointer dereference error when CONFIGFS is disabled
  PCI: hv: Disable/enable IRQs rather than BH in hv_compose_msi_msg()
2018-07-19 11:54:04 -07:00
Linus Torvalds f39f28ff82 sound fixes for 4.18-rc6
A rawmidi race fix and three trivial HD-audio quirks.
 -----BEGIN PGP SIGNATURE-----
 
 iQJCBAABCAAsFiEEIXTw5fNLNI7mMiVaLtJE4w1nLE8FAltQZuYOHHRpd2FpQHN1
 c2UuZGUACgkQLtJE4w1nLE8P4A//Xaf6cy3XejK9zPaV/Ony+IiJfcpRbbug/Mhc
 eV1l+a/+DJNpeEJQScDtQjQejDnyytF3643sJ2VHVGMX3NEWeasfkJ3943FEijOK
 RkSW5VtXSExXcshwwu8JSllEXWK48rUzjYXT8ORccfQWn9mE1xfRFkMKalj2+f17
 0joP7N2wxc5Xw44Gkq9iFcadxyfUo/dj6/MNdSU+HpcwLFszCsVDY6t3pfwQykm9
 WEsalkAUARSMAo6ZnS67lLwH+9aqb7Zt5tPke44FT/LSAOlu8oJnqWrhwOQSZmRg
 hm5UF2yo9N29CKJwhSLNN1Me+Ko8OGaEVzB4bXYnhmCQi+YcVvL4lR3kTPCZQhVE
 ExI3vjGFi9EjCd8V1Oe5OxqAjdpLQgRzScGYszmry1U6jiZus2pc90D2s55ABqE7
 ZBmG6i+Ee9ltDPBbZMrsN7N2+M+hXq/10yy5pLZukA0YJpe5ec1iyJmvvduEodbh
 jKyzGh/+tFQGlUpgsJoBa0+K7C/UzaKJyPhp7+iSLrLUcaXdv4mWvQYWUZzzI8zW
 3GPPTUdpqm67eFRg1HgOQ0YWslgyuASCcsH/qmAfD+9ilEKe+r7EffYf79dm99JJ
 gF5ocX5zHS5JpRHxD5RZi+Vd3fj7wcj+NKAiIAIOFS3JLDScTG9n/5vP4K7W9vRI
 F+Q5s+c=
 =vsVz
 -----END PGP SIGNATURE-----

Merge tag 'sound-4.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "A rawmidi race fix and three trivial HD-audio quirks"

* tag 'sound-4.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda/realtek - Yet another Clevo P950 quirk entry
  ALSA: rawmidi: Change resized buffers atomically
  ALSA: hda/realtek - Add Panasonic CF-SZ6 headset jack quirk
  ALSA: hda: add mute led support for HP ProBook 455 G5
2018-07-19 07:43:17 -07:00
Linus Torvalds b4394c3435 Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fix from Herbert Xu:
 "This fixes an allocation error-path bug in af_alg discovered by
  syzkaller"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: af_alg - Initialize sg_num_bytes in error code path
2018-07-19 07:32:44 -07:00
Linus Torvalds 024ddc0ce1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:
 "Lots of fixes, here goes:

   1) NULL deref in qtnfmac, from Gustavo A. R. Silva.

   2) Kernel oops when fw download fails in rtlwifi, from Ping-Ke Shih.

   3) Lost completion messages in AF_XDP, from Magnus Karlsson.

   4) Correct bogus self-assignment in rhashtable, from Rishabh
      Bhatnagar.

   5) Fix regression in ipv6 route append handling, from David Ahern.

   6) Fix masking in __set_phy_supported(), from Heiner Kallweit.

   7) Missing module owner set in x_tables icmp, from Florian Westphal.

   8) liquidio's timeouts are HZ dependent, fix from Nicholas Mc Guire.

   9) Link setting fixes for sh_eth and ravb, from Vladimir Zapolskiy.

  10) Fix NULL deref when using chains in act_csum, from Davide Caratti.

  11) XDP_REDIRECT needs to check if the interface is up and whether the
      MTU is sufficient. From Toshiaki Makita.

  12) Net diag can do a double free when killing TCP_NEW_SYN_RECV
      connections, from Lorenzo Colitti.

  13) nf_defrag in ipv6 can unnecessarily hold onto dst entries for a
      full minute, delaying device unregister. From Eric Dumazet.

  14) Update MAC entries in the correct order in ixgbe, from Alexander
      Duyck.

  15) Don't leave partial mangles bpf program in jit_subprogs, from
      Daniel Borkmann.

  16) Fix pfmemalloc SKB state propagation, from Stefano Brivio.

  17) Fix ACK handling in DCTCP congestion control, from Yuchung Cheng.

  18) Use after free in tun XDP_TX, from Toshiaki Makita.

  19) Stale ipv6 header pointer in ipv6 gre code, from Prashant Bhole.

  20) Don't reuse remainder of RX page when XDP is set in mlx4, from
      Saeed Mahameed.

  21) Fix window probe handling of TCP rapair sockets, from Stefan
      Baranoff.

  22) Missing socket locking in smc_ioctl(), from Ursula Braun.

  23) IPV6_ILA needs DST_CACHE, from Arnd Bergmann.

  24) Spectre v1 fix in cxgb3, from Gustavo A. R. Silva.

  25) Two spots in ipv6 do a rol32() on a hash value but ignore the
      result. Fixes from Colin Ian King"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (176 commits)
  tcp: identify cryptic messages as TCP seq # bugs
  ptp: fix missing break in switch
  hv_netvsc: Fix napi reschedule while receive completion is busy
  MAINTAINERS: Drop inactive Vitaly Bordug's email
  net: cavium: Add fine-granular dependencies on PCI
  net: qca_spi: Fix log level if probe fails
  net: qca_spi: Make sure the QCA7000 reset is triggered
  net: qca_spi: Avoid packet drop during initial sync
  ipv6: fix useless rol32 call on hash
  ipv6: sr: fix useless rol32 call on hash
  net: sched: Using NULL instead of plain integer
  net: usb: asix: replace mii_nway_restart in resume path
  net: cxgb3_main: fix potential Spectre v1
  lib/rhashtable: consider param->min_size when setting initial table size
  net/smc: reset recv timeout after clc handshake
  net/smc: add error handling for get_user()
  net/smc: optimize consumer cursor updates
  net/nfc: Avoid stalls when nfc_alloc_send_skb() returned NULL.
  ipv6: ila: select CONFIG_DST_CACHE
  net: usb: rtl8150: demote allmulti message to dev_dbg()
  ...
2018-07-18 19:32:54 -07:00
Linus Torvalds 05df204549 DeviceTree fixes for 4.18:
- Fix phandle cache to work with overlays
 
 - Correct the default clock-frequency for QCom geni-i2c
 
 - Binding doc quote and spelling fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCgAuFiEEktVUI4SxYhzZyEuo+vtdtY28YcMFAltPq8YQHHJvYmhAa2Vy
 bmVsLm9yZwAKCRD6+121jbxhwyjhD/9HQpbSKRt4ofc8pwu/A00UGMhDtNc170Cf
 B2Le+rETohqk5+VTZgif25h42nR//p2SwLORem95Osk2Di9IrLDdeJUVbVz5h/5c
 Shn8h5AwB1W7wlYpk+jBCNqbwF+8oVdg1fxRkqX3z9PRJlw+6l/OiIxwPsWO0WyQ
 JsuTiqzgqlF/NipQDjGbohQpMEL6AhbwV/K8uTUlLHRxPSYI8+xCnmZdQKSxPs2G
 kQHNXCRS+5Z0YZWEDf2AfYvyTXIIO6khUOwwLZDtKYxCNO9RkKdOA9RGS7TmiI59
 qO4m06STNkkC7agX7UW769n1Pb+vgEB5ZT1ISqkfbcolEg83jyYp7vMJhqMrpLkN
 z6c5czHHpU05mWehkiXWNDD4JRUmWBacfs19/TC+gTfoohULD8soLnrHh4lD002Z
 O7jUdQ1O1onpngcRXrcg8DWXDFcHH9VfeI4E1ZDiRtgoQVo1aAeb/oROtyh0+ski
 a8xJmgxYsHh2aEZ+DceeHbqiq9oE8FpkTke9m6CcVIJPQcr21TS6tERDR4mZcXMa
 mKzBQChdI8AWnsfUSDrudc3ayvTLvUwY0ayo7WC+kfYCc+HlfHpb+xoSkFQJ/lbO
 v2L+PiHtjRi0t0nYlrBxZzTyBDTmJb8KHbCWtyoIez/BjPV+DFa75A+rjsx/iw0T
 0AE5h19DSA==
 =4/Lz
 -----END PGP SIGNATURE-----

Merge tag 'devicetree-fixes-for-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux

Pull DeviceTree fixes from Rob Herring:

 - Fix phandle cache to work with overlays

 - Correct the default clock-frequency for QCom geni-i2c

 - Binding doc quote and spelling fixes

* tag 'devicetree-fixes-for-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
  of: overlay: update phandle cache on overlay apply and remove
  dt-bindings: Fix unbalanced quotation marks
  dt-bindings: soc: qcom: Fix default clock-freq for qcom,geni-i2c
  dt-bindings: w1-gpio: Remove unneeded unit address
  Documentation: devicetree: tilcdc: fix spelling mistake "suppors" -> "supports"
2018-07-18 16:19:36 -07:00
Randy Dunlap e56b8ce363 tcp: identify cryptic messages as TCP seq # bugs
Attempt to make cryptic TCP seq number error messages clearer by
(1) identifying the source of the message as "TCP", (2) identifying the
errors as "seq # bug", and (3) grouping the field identifiers and values
by separating them with commas.

E.g., the following message is changed from:

recvmsg bug 2: copied 73BCB6CD seq 70F17CBE rcvnxt 73BCB9AA fl 0
WARNING: CPU: 2 PID: 1501 at /linux/net/ipv4/tcp.c:1881 tcp_recvmsg+0x649/0xb90

to:

TCP recvmsg seq # bug 2: copied 73BCB6CD, seq 70F17CBE, rcvnxt 73BCB9AA, fl 0
WARNING: CPU: 2 PID: 1501 at /linux/net/ipv4/tcp.c:2011 tcp_recvmsg+0x694/0xba0

Suggested-by: 積丹尼 Dan Jacobson <jidanni@jidanni.org>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18 15:26:33 -07:00
Gustavo A. R. Silva 9ba8376ce1 ptp: fix missing break in switch
It seems that a *break* is missing in order to avoid falling through
to the default case. Otherwise, checking *chan* makes no sense.

Fixes: 72df7a7244 ("ptp: Allow reassigning calibration pin function")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18 15:25:25 -07:00
Haiyang Zhang 6b81b193b8 hv_netvsc: Fix napi reschedule while receive completion is busy
If out ring is full temporarily and receive completion cannot go out,
we may still need to reschedule napi if certain conditions are met.
Otherwise the napi poll might be stopped forever, and cause network
disconnect.

Fixes: 7426b1a518 ("netvsc: optimize receive completions")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18 15:23:30 -07:00
Krzysztof Kozlowski a2ec9d14ed MAINTAINERS: Drop inactive Vitaly Bordug's email
The Vitaly Bordug's email bounces ("ru.mvista.com: Name or service not
known") and there was no activity (ack, review, sign) since 2009.

Cc: Vitaly Bordug <vitb@kernel.crashing.org>
Cc: Pantelis Antoniou <pantelis.antoniou@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18 15:22:37 -07:00
Alexander Sverdlin e40562abdf net: cavium: Add fine-granular dependencies on PCI
Add dependencies on PCI where necessary.

Fixes: 7e2bc7fb65 ("net: cavium: Drop dependency of NET_VENDOR_CAVIUM on PCI")
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18 15:21:33 -07:00
David S. Miller 0838a4ef5e Merge branch 'net-qca_spi-Minor-bugfixes'
Stefan Wahren says:

====================
net: qca_spi: Minor bugfixes

This patch series contains some minor bugfixes for
the qca_spi driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18 15:19:08 -07:00
Stefan Wahren 5097399326 net: qca_spi: Fix log level if probe fails
In cases the probing fails the log level of the messages should
be an error.

Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18 15:19:08 -07:00
Stefan Wahren 711c62dfa6 net: qca_spi: Make sure the QCA7000 reset is triggered
In case the SPI thread is not running, a simple reset of sync
state won't fix the transmit timeout. We also need to wake up the kernel
thread.

Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Fixes: ed7d42e24e ("net: qca_spi: fix transmit queue timeout handling")
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18 15:19:08 -07:00
Stefan Wahren b2bab426dc net: qca_spi: Avoid packet drop during initial sync
As long as the synchronization with the QCA7000 isn't finished, we
cannot accept packets from the upper layers. So let the SPI thread
enable the TX queue after sync and avoid unwanted packet drop.

Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Fixes: 291ab06ecf ("net: qualcomm: new Ethernet over SPI driver for QCA7000")
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18 15:19:08 -07:00
Colin Ian King 169dc027fb ipv6: fix useless rol32 call on hash
The rol32 call is currently rotating hash but the rol'd value is
being discarded. I believe the current code is incorrect and hash
should be assigned the rotated value returned from rol32.

Thanks to David Lebrun for spotting this.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18 15:11:09 -07:00
Colin Ian King 3ee593adbb ipv6: sr: fix useless rol32 call on hash
The rol32 call is currently rotating hash but the rol'd value is
being discarded. I believe the current code is incorrect and hash
should be assigned the rotated value returned from rol32.

Detected by CoverityScan, CID#1468411 ("Useless call")

Fixes: b5facfdba1 ("ipv6: sr: Compute flowlabel for outer IPv6 header of seg6 encap mode")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: dlebrun@google.com
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18 15:10:47 -07:00
Sergei Shtylyov 270ed733e6 PCI: v3-semi: Fix I/O space page leak
When testing the R-Car PCIe driver on the Condor board, if the PCIe PHY
driver was left disabled, the kernel crashed with this BUG:

  kernel BUG at lib/ioremap.c:72!
  Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
  Modules linked in:
  CPU: 0 PID: 39 Comm: kworker/0:1 Not tainted 4.17.0-dirty #1092
  Hardware name: Renesas Condor board based on r8a77980 (DT)
  Workqueue: events deferred_probe_work_func
  pstate: 80000005 (Nzcv daif -PAN -UAO)
  pc : ioremap_page_range+0x370/0x3c8
  lr : ioremap_page_range+0x40/0x3c8
  sp : ffff000008da39e0
  x29: ffff000008da39e0 x28: 00e8000000000f07
  x27: ffff7dfffee00000 x26: 0140000000000000
  x25: ffff7dfffef00000 x24: 00000000000fe100
  x23: ffff80007b906000 x22: ffff000008ab8000
  x21: ffff000008bb1d58 x20: ffff7dfffef00000
  x19: ffff800009c30fb8 x18: 0000000000000001
  x17: 00000000000152d0 x16: 00000000014012d0
  x15: 0000000000000000 x14: 0720072007200720
  x13: 0720072007200720 x12: 0720072007200720
  x11: 0720072007300730 x10: 00000000000000ae
  x9 : 0000000000000000 x8 : ffff7dffff000000
  x7 : 0000000000000000 x6 : 0000000000000100
  x5 : 0000000000000000 x4 : 000000007b906000
  x3 : ffff80007c61a880 x2 : ffff7dfffeefffff
  x1 : 0000000040000000 x0 : 00e80000fe100f07
  Process kworker/0:1 (pid: 39, stack limit = 0x        (ptrval))
  Call trace:
   ioremap_page_range+0x370/0x3c8
   pci_remap_iospace+0x7c/0xac
   pci_parse_request_of_pci_ranges+0x13c/0x190
   rcar_pcie_probe+0x4c/0xb04
   platform_drv_probe+0x50/0xbc
   driver_probe_device+0x21c/0x308
   __device_attach_driver+0x98/0xc8
   bus_for_each_drv+0x54/0x94
   __device_attach+0xc4/0x12c
   device_initial_probe+0x10/0x18
   bus_probe_device+0x90/0x98
   deferred_probe_work_func+0xb0/0x150
   process_one_work+0x12c/0x29c
   worker_thread+0x200/0x3fc
   kthread+0x108/0x134
   ret_from_fork+0x10/0x18
  Code: f9004ba2 54000080 aa0003fb 17ffff48 (d4210000)

It turned out that pci_remap_iospace() wasn't undone when the driver's
probe failed, and since devm_phy_optional_get() returned -EPROBE_DEFER,
the probe was retried,  finally causing the BUG due to trying to remap
already remapped pages.

The V3 Semiconductor PCI driver has the same issue.
Replace devm_pci_remap_iospace() with its devm_ managed version to fix
the bug.

Fixes: 68a15eb7bd ("PCI: v3-semi: Add V3 Semiconductor PCI host driver")
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
[lorenzo.pieralisi@arm.com: updated the commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
2018-07-18 17:02:13 -05:00
Sergei Shtylyov 438477b9a0 PCI: mediatek: Fix I/O space page leak
When testing the R-Car PCIe driver on the Condor board, if the PCIe PHY
driver was left disabled, the kernel crashed with this BUG:

  kernel BUG at lib/ioremap.c:72!
  Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
  Modules linked in:
  CPU: 0 PID: 39 Comm: kworker/0:1 Not tainted 4.17.0-dirty #1092
  Hardware name: Renesas Condor board based on r8a77980 (DT)
  Workqueue: events deferred_probe_work_func
  pstate: 80000005 (Nzcv daif -PAN -UAO)
  pc : ioremap_page_range+0x370/0x3c8
  lr : ioremap_page_range+0x40/0x3c8
  sp : ffff000008da39e0
  x29: ffff000008da39e0 x28: 00e8000000000f07
  x27: ffff7dfffee00000 x26: 0140000000000000
  x25: ffff7dfffef00000 x24: 00000000000fe100
  x23: ffff80007b906000 x22: ffff000008ab8000
  x21: ffff000008bb1d58 x20: ffff7dfffef00000
  x19: ffff800009c30fb8 x18: 0000000000000001
  x17: 00000000000152d0 x16: 00000000014012d0
  x15: 0000000000000000 x14: 0720072007200720
  x13: 0720072007200720 x12: 0720072007200720
  x11: 0720072007300730 x10: 00000000000000ae
  x9 : 0000000000000000 x8 : ffff7dffff000000
  x7 : 0000000000000000 x6 : 0000000000000100
  x5 : 0000000000000000 x4 : 000000007b906000
  x3 : ffff80007c61a880 x2 : ffff7dfffeefffff
  x1 : 0000000040000000 x0 : 00e80000fe100f07
  Process kworker/0:1 (pid: 39, stack limit = 0x        (ptrval))
  Call trace:
   ioremap_page_range+0x370/0x3c8
   pci_remap_iospace+0x7c/0xac
   pci_parse_request_of_pci_ranges+0x13c/0x190
   rcar_pcie_probe+0x4c/0xb04
   platform_drv_probe+0x50/0xbc
   driver_probe_device+0x21c/0x308
   __device_attach_driver+0x98/0xc8
   bus_for_each_drv+0x54/0x94
   __device_attach+0xc4/0x12c
   device_initial_probe+0x10/0x18
   bus_probe_device+0x90/0x98
   deferred_probe_work_func+0xb0/0x150
   process_one_work+0x12c/0x29c
   worker_thread+0x200/0x3fc
   kthread+0x108/0x134
   ret_from_fork+0x10/0x18
  Code: f9004ba2 54000080 aa0003fb 17ffff48 (d4210000)

It turned out that pci_remap_iospace() wasn't undone when the driver's
probe failed, and since devm_phy_optional_get() returned -EPROBE_DEFER,
the probe was retried, finally causing the BUG due to trying to remap
already remapped pages.

The MediaTek PCIe driver has the same issue.

Replace devm_pci_remap_iospace() with its devm_ managed counterpart
to fix the bug.

Fixes: 637cfacae9 ("PCI: mediatek: Add MediaTek PCIe host controller support")
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
[lorenzo.pieralisi@arm.com: updated the commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
2018-07-18 17:01:36 -05:00
Sergei Shtylyov e30609454b PCI: faraday: Fix I/O space page leak
When testing the R-Car PCIe driver on the Condor board, if
the PCIe PHY driver was left disabled, the kernel crashed with this BUG:

  kernel BUG at lib/ioremap.c:72!
  Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
  Modules linked in:
  CPU: 0 PID: 39 Comm: kworker/0:1 Not tainted 4.17.0-dirty #1092
  Hardware name: Renesas Condor board based on r8a77980 (DT)
  Workqueue: events deferred_probe_work_func
  pstate: 80000005 (Nzcv daif -PAN -UAO)
  pc : ioremap_page_range+0x370/0x3c8
  lr : ioremap_page_range+0x40/0x3c8
  sp : ffff000008da39e0
  x29: ffff000008da39e0 x28: 00e8000000000f07
  x27: ffff7dfffee00000 x26: 0140000000000000
  x25: ffff7dfffef00000 x24: 00000000000fe100
  x23: ffff80007b906000 x22: ffff000008ab8000
  x21: ffff000008bb1d58 x20: ffff7dfffef00000
  x19: ffff800009c30fb8 x18: 0000000000000001
  x17: 00000000000152d0 x16: 00000000014012d0
  x15: 0000000000000000 x14: 0720072007200720
  x13: 0720072007200720 x12: 0720072007200720
  x11: 0720072007300730 x10: 00000000000000ae
  x9 : 0000000000000000 x8 : ffff7dffff000000
  x7 : 0000000000000000 x6 : 0000000000000100
  x5 : 0000000000000000 x4 : 000000007b906000
  x3 : ffff80007c61a880 x2 : ffff7dfffeefffff
  x1 : 0000000040000000 x0 : 00e80000fe100f07
  Process kworker/0:1 (pid: 39, stack limit = 0x        (ptrval))
  Call trace:
   ioremap_page_range+0x370/0x3c8
   pci_remap_iospace+0x7c/0xac
   pci_parse_request_of_pci_ranges+0x13c/0x190
   rcar_pcie_probe+0x4c/0xb04
   platform_drv_probe+0x50/0xbc
   driver_probe_device+0x21c/0x308
   __device_attach_driver+0x98/0xc8
   bus_for_each_drv+0x54/0x94
   __device_attach+0xc4/0x12c
   device_initial_probe+0x10/0x18
   bus_probe_device+0x90/0x98
   deferred_probe_work_func+0xb0/0x150
   process_one_work+0x12c/0x29c
   worker_thread+0x200/0x3fc
   kthread+0x108/0x134
   ret_from_fork+0x10/0x18
  Code: f9004ba2 54000080 aa0003fb 17ffff48 (d4210000)

It turned out that pci_remap_iospace() wasn't undone when the driver's
probe failed, and since devm_phy_optional_get() returned -EPROBE_DEFER,
the probe was retried, finally causing the BUG due to trying to remap
already remapped pages.

The Faraday PCI driver has the same issue. Replace pci_remap_iospace()
with its devm_ managed version to fix the bug.

Fixes: d3c68e0a7e ("PCI: faraday: Add Faraday Technology FTPCI100 PCI Host Bridge driver")
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
[lorenzo.pieralisi@arm.com: updated the commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
2018-07-18 17:01:14 -05:00
Sergei Shtylyov 1df3e5b3fe PCI: aardvark: Fix I/O space page leak
When testing the R-Car PCIe driver on the Condor board, if the PCIe PHY
driver was left disabled, the kernel crashed with this BUG:

  kernel BUG at lib/ioremap.c:72!
  Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
  Modules linked in:
  CPU: 0 PID: 39 Comm: kworker/0:1 Not tainted 4.17.0-dirty #1092
  Hardware name: Renesas Condor board based on r8a77980 (DT)
  Workqueue: events deferred_probe_work_func
  pstate: 80000005 (Nzcv daif -PAN -UAO)
  pc : ioremap_page_range+0x370/0x3c8
  lr : ioremap_page_range+0x40/0x3c8
  sp : ffff000008da39e0
  x29: ffff000008da39e0 x28: 00e8000000000f07
  x27: ffff7dfffee00000 x26: 0140000000000000
  x25: ffff7dfffef00000 x24: 00000000000fe100
  x23: ffff80007b906000 x22: ffff000008ab8000
  x21: ffff000008bb1d58 x20: ffff7dfffef00000
  x19: ffff800009c30fb8 x18: 0000000000000001
  x17: 00000000000152d0 x16: 00000000014012d0
  x15: 0000000000000000 x14: 0720072007200720
  x13: 0720072007200720 x12: 0720072007200720
  x11: 0720072007300730 x10: 00000000000000ae
  x9 : 0000000000000000 x8 : ffff7dffff000000
  x7 : 0000000000000000 x6 : 0000000000000100
  x5 : 0000000000000000 x4 : 000000007b906000
  x3 : ffff80007c61a880 x2 : ffff7dfffeefffff
  x1 : 0000000040000000 x0 : 00e80000fe100f07
  Process kworker/0:1 (pid: 39, stack limit = 0x        (ptrval))
  Call trace:
   ioremap_page_range+0x370/0x3c8
   pci_remap_iospace+0x7c/0xac
   pci_parse_request_of_pci_ranges+0x13c/0x190
   rcar_pcie_probe+0x4c/0xb04
   platform_drv_probe+0x50/0xbc
   driver_probe_device+0x21c/0x308
   __device_attach_driver+0x98/0xc8
   bus_for_each_drv+0x54/0x94
   __device_attach+0xc4/0x12c
   device_initial_probe+0x10/0x18
   bus_probe_device+0x90/0x98
   deferred_probe_work_func+0xb0/0x150
   process_one_work+0x12c/0x29c
   worker_thread+0x200/0x3fc
   kthread+0x108/0x134
   ret_from_fork+0x10/0x18
  Code: f9004ba2 54000080 aa0003fb 17ffff48 (d4210000)

It turned out that pci_remap_iospace() wasn't undone when the driver's
probe failed, and since devm_phy_optional_get() returned -EPROBE_DEFER,
the probe was retried, finally causing the BUG due to trying to remap
already remapped pages.

The Aardvark PCI controller driver has the same issue.
Replace pci_remap_iospace() with its devm_ managed version to fix the bug.

Fixes: 8c39d71036 ("PCI: aardvark: Add Aardvark PCI host controller driver")
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
[lorenzo.pieralisi@arm.com: updated the commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
2018-07-18 17:00:54 -05:00
Sergei Shtylyov fd07f5e19c PCI: designware: Fix I/O space page leak
When testing the R-Car PCIe driver on the Condor board, if the PCIe PHY
driver is left disabled, the kernel crashed with this BUG:

  kernel BUG at lib/ioremap.c:72!
  Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
  Modules linked in:
  CPU: 0 PID: 39 Comm: kworker/0:1 Not tainted 4.17.0-dirty #1092
  Hardware name: Renesas Condor board based on r8a77980 (DT)
  Workqueue: events deferred_probe_work_func
  pstate: 80000005 (Nzcv daif -PAN -UAO)
  pc : ioremap_page_range+0x370/0x3c8
  lr : ioremap_page_range+0x40/0x3c8
  sp : ffff000008da39e0
  x29: ffff000008da39e0 x28: 00e8000000000f07
  x27: ffff7dfffee00000 x26: 0140000000000000
  x25: ffff7dfffef00000 x24: 00000000000fe100
  x23: ffff80007b906000 x22: ffff000008ab8000
  x21: ffff000008bb1d58 x20: ffff7dfffef00000
  x19: ffff800009c30fb8 x18: 0000000000000001
  x17: 00000000000152d0 x16: 00000000014012d0
  x15: 0000000000000000 x14: 0720072007200720
  x13: 0720072007200720 x12: 0720072007200720
  x11: 0720072007300730 x10: 00000000000000ae
  x9 : 0000000000000000 x8 : ffff7dffff000000
  x7 : 0000000000000000 x6 : 0000000000000100
  x5 : 0000000000000000 x4 : 000000007b906000
  x3 : ffff80007c61a880 x2 : ffff7dfffeefffff
  x1 : 0000000040000000 x0 : 00e80000fe100f07
  Process kworker/0:1 (pid: 39, stack limit = 0x        (ptrval))
  Call trace:
   ioremap_page_range+0x370/0x3c8
   pci_remap_iospace+0x7c/0xac
   pci_parse_request_of_pci_ranges+0x13c/0x190
   rcar_pcie_probe+0x4c/0xb04
   platform_drv_probe+0x50/0xbc
   driver_probe_device+0x21c/0x308
   __device_attach_driver+0x98/0xc8
   bus_for_each_drv+0x54/0x94
   __device_attach+0xc4/0x12c
   device_initial_probe+0x10/0x18
   bus_probe_device+0x90/0x98
   deferred_probe_work_func+0xb0/0x150
   process_one_work+0x12c/0x29c
   worker_thread+0x200/0x3fc
   kthread+0x108/0x134
   ret_from_fork+0x10/0x18
  Code: f9004ba2 54000080 aa0003fb 17ffff48 (d4210000)

It turned out that pci_remap_iospace() wasn't undone when the driver's
probe failed, and since devm_phy_optional_get() returned -EPROBE_DEFER,
the probe was retried, finally causing the BUG due to trying to remap
already remapped pages.

The DesignWare PCIe controller driver has the same issue.

Replace devm_pci_remap_iospace() with a devm_ managed version to fix the
bug.

Fixes: cbce790059 ("PCI: designware: Make driver arch-agnostic")
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
[lorenzo.pieralisi@arm.com: updated the commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Jingoo Han <jingoohan1@gmail.com>
2018-07-18 17:00:29 -05:00
Sergei Shtylyov 0018b265ad PCI: versatile: Fix I/O space page leak
When testing the R-Car PCIe driver on the Condor board, if the PCIe PHY
driver was left disabled, the kernel crashed with this BUG:

  kernel BUG at lib/ioremap.c:72!
  Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
  Modules linked in:
  CPU: 0 PID: 39 Comm: kworker/0:1 Not tainted 4.17.0-dirty #1092
  Hardware name: Renesas Condor board based on r8a77980 (DT)
  Workqueue: events deferred_probe_work_func
  pstate: 80000005 (Nzcv daif -PAN -UAO)
  pc : ioremap_page_range+0x370/0x3c8
  lr : ioremap_page_range+0x40/0x3c8
  sp : ffff000008da39e0
  x29: ffff000008da39e0 x28: 00e8000000000f07
  x27: ffff7dfffee00000 x26: 0140000000000000
  x25: ffff7dfffef00000 x24: 00000000000fe100
  x23: ffff80007b906000 x22: ffff000008ab8000
  x21: ffff000008bb1d58 x20: ffff7dfffef00000
  x19: ffff800009c30fb8 x18: 0000000000000001
  x17: 00000000000152d0 x16: 00000000014012d0
  x15: 0000000000000000 x14: 0720072007200720
  x13: 0720072007200720 x12: 0720072007200720
  x11: 0720072007300730 x10: 00000000000000ae
  x9 : 0000000000000000 x8 : ffff7dffff000000
  x7 : 0000000000000000 x6 : 0000000000000100
  x5 : 0000000000000000 x4 : 000000007b906000
  x3 : ffff80007c61a880 x2 : ffff7dfffeefffff
  x1 : 0000000040000000 x0 : 00e80000fe100f07
  Process kworker/0:1 (pid: 39, stack limit = 0x        (ptrval))
  Call trace:
   ioremap_page_range+0x370/0x3c8
   pci_remap_iospace+0x7c/0xac
   pci_parse_request_of_pci_ranges+0x13c/0x190
   rcar_pcie_probe+0x4c/0xb04
   platform_drv_probe+0x50/0xbc
   driver_probe_device+0x21c/0x308
   __device_attach_driver+0x98/0xc8
   bus_for_each_drv+0x54/0x94
   __device_attach+0xc4/0x12c
   device_initial_probe+0x10/0x18
   bus_probe_device+0x90/0x98
   deferred_probe_work_func+0xb0/0x150
   process_one_work+0x12c/0x29c
   worker_thread+0x200/0x3fc
   kthread+0x108/0x134
   ret_from_fork+0x10/0x18
  Code: f9004ba2 54000080 aa0003fb 17ffff48 (d4210000)

It turned out that pci_remap_iospace() wasn't undone when the driver's
probe failed, and since devm_phy_optional_get() returned -EPROBE_DEFER,
the probe was retried, finally causing the BUG due to trying to remap
already remapped pages.

The Versatile PCI controller driver has the same issue.
Replace pci_remap_iospace() with the devm_ managed version to fix the bug.

Fixes: b7e78170ef ("PCI: versatile: Add DT-based ARM Versatile PB PCIe host driver")
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
[lorenzo.pieralisi@arm.com: updated the commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
2018-07-18 17:00:11 -05:00
Sergei Shtylyov 925652d035 PCI: xgene: Fix I/O space page leak
When testing the R-Car PCIe driver on the Condor board, if the PCIe PHY
driver was left disabled, the kernel crashed with this BUG:

  kernel BUG at lib/ioremap.c:72!
  Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
  Modules linked in:
  CPU: 0 PID: 39 Comm: kworker/0:1 Not tainted 4.17.0-dirty #1092
  Hardware name: Renesas Condor board based on r8a77980 (DT)
  Workqueue: events deferred_probe_work_func
  pstate: 80000005 (Nzcv daif -PAN -UAO)
  pc : ioremap_page_range+0x370/0x3c8
  lr : ioremap_page_range+0x40/0x3c8
  sp : ffff000008da39e0
  x29: ffff000008da39e0 x28: 00e8000000000f07
  x27: ffff7dfffee00000 x26: 0140000000000000
  x25: ffff7dfffef00000 x24: 00000000000fe100
  x23: ffff80007b906000 x22: ffff000008ab8000
  x21: ffff000008bb1d58 x20: ffff7dfffef00000
  x19: ffff800009c30fb8 x18: 0000000000000001
  x17: 00000000000152d0 x16: 00000000014012d0
  x15: 0000000000000000 x14: 0720072007200720
  x13: 0720072007200720 x12: 0720072007200720
  x11: 0720072007300730 x10: 00000000000000ae
  x9 : 0000000000000000 x8 : ffff7dffff000000
  x7 : 0000000000000000 x6 : 0000000000000100
  x5 : 0000000000000000 x4 : 000000007b906000
  x3 : ffff80007c61a880 x2 : ffff7dfffeefffff
  x1 : 0000000040000000 x0 : 00e80000fe100f07
  Process kworker/0:1 (pid: 39, stack limit = 0x        (ptrval))
  Call trace:
   ioremap_page_range+0x370/0x3c8
   pci_remap_iospace+0x7c/0xac
   pci_parse_request_of_pci_ranges+0x13c/0x190
   rcar_pcie_probe+0x4c/0xb04
   platform_drv_probe+0x50/0xbc
   driver_probe_device+0x21c/0x308
   __device_attach_driver+0x98/0xc8
   bus_for_each_drv+0x54/0x94
   __device_attach+0xc4/0x12c
   device_initial_probe+0x10/0x18
   bus_probe_device+0x90/0x98
   deferred_probe_work_func+0xb0/0x150
   process_one_work+0x12c/0x29c
   worker_thread+0x200/0x3fc
   kthread+0x108/0x134
   ret_from_fork+0x10/0x18
  Code: f9004ba2 54000080 aa0003fb 17ffff48 (d4210000)

It turned out that pci_remap_iospace() wasn't undone when the driver's
probe failed, and since devm_phy_optional_get() returned -EPROBE_DEFER,
the probe was retried, finally causing the BUG due to trying to remap
already remapped pages.

The X-Gene PCI controller driver has the same issue.
Replace pci_remap_iospace() with the devm_ managed version so that the
pages get unmapped automagically on any probe failure.

Fixes: 5f6b6ccdbe ("PCI: xgene: Add APM X-Gene PCIe driver")
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
[lorenzo.pieralisi@arm.com: updated the commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
2018-07-18 16:59:40 -05:00