redonkable/remarkable-linux

Author	SHA1	Message	Date
Michael Ellerman	4c5463a5a3	powerpc/pseries: Add new H_GET_CPU_CHARACTERISTICS flags commit `c4bc36628d` upstream. Add some additional values which have been defined for the H_GET_CPU_CHARACTERISTICS hypercall. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:51:51 +02:00
Michael Ellerman	d1cb5ff450	powerpc/rfi-flush: Call setup_rfi_flush() after LPM migration commit `921bc6cf80` upstream. We might have migrated to a machine that uses a different flush type, or doesn't need flushing at all. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:51:51 +02:00
Mauricio Faria de Oliveira	123f6d5cca	powerpc/rfi-flush: Differentiate enabled and patched flush types commit `0063d61ccf` upstream. Currently the rfi-flush messages print 'Using <type> flush' for all enabled_flush_types, but that is not necessarily true -- as now the fallback flush is always enabled on pseries, but the fixup function overwrites its nop/branch slot with other flush types, if available. So, replace the 'Using <type> flush' messages with '<type> flush is available'. Also, print the patched flush types in the fixup function, so users can know what is (not) being used (e.g., the slower, fallback flush, or no flush type at all if flush is disabled via the debugfs switch). Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:51:51 +02:00
Michael Ellerman	6af06dcdea	powerpc/rfi-flush: Always enable fallback flush on pseries commit `84749a58b6` upstream. This ensures the fallback flush area is always allocated on pseries, so in case a LPAR is migrated from a patched to an unpatched system, it is possible to enable the fallback flush in the target system. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:51:51 +02:00
Michael Ellerman	d744f8457f	powerpc/rfi-flush: Make it possible to call setup_rfi_flush() again commit `abf110f3e1` upstream. For PowerVM migration we want to be able to call setup_rfi_flush() again after we've migrated the partition. To support that we need to check that we're not trying to allocate the fallback flush area after memblock has gone away (i.e., boot-time only). Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:51:51 +02:00
Michael Ellerman	5412a9d91d	powerpc/rfi-flush: Move the logic to avoid a redo into the debugfs code commit `1e2a9fc749` upstream. rfi_flush_enable() includes a check to see if we're already enabled (or disabled), and in that case does nothing. But that means calling setup_rfi_flush() a 2nd time doesn't actually work, which is a bit confusing. Move that check into the debugfs code, where it really belongs. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:51:51 +02:00
Michael Ellerman	bf434b31ba	powerpc/powernv: Support firmware disable of RFI flush commit `eb0a2d2620` upstream. Some versions of firmware will have a setting that can be configured to disable the RFI flush, add support for it. Fixes: `6e032b350c` ("powerpc/powernv: Check device-tree for RFI flush settings") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:51:51 +02:00
Michael Ellerman	dff1a7e6c3	powerpc/pseries: Support firmware disable of RFI flush commit `582605a429` upstream. Some versions of firmware will have a setting that can be configured to disable the RFI flush, add support for it. Fixes: `8989d56878` ("powerpc/pseries: Query hypervisor for RFI flush settings") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:51:51 +02:00
Nicholas Piggin	2245d95d9f	powerpc/64s: Improve RFI L1-D cache flush fallback commit `bdcb1aefc5` upstream. The fallback RFI flush is used when firmware does not provide a way to flush the cache. It's a "displacement flush" that evicts useful data by displacing it with an uninteresting buffer. The flush has to take care to work with implementation specific cache replacment policies, so the recipe has been in flux. The initial slow but conservative approach is to touch all lines of a congruence class, with dependencies between each load. It has since been determined that a linear pattern of loads without dependencies is sufficient, and is significantly faster. Measuring the speed of a null syscall with RFI fallback flush enabled gives the relative improvement: P8 - 1.83x P9 - 1.75x The flush also becomes simpler and more adaptable to different cache geometries. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:51:50 +02:00
David Vrabel	421e1fadb0	x86/kvm: fix LAPIC timer drift when guest uses periodic mode commit `d8f2f498d9` upstream. Since 4.10, commit `8003c9ae20` (KVM: LAPIC: add APIC Timer periodic/oneshot mode VMX preemption timer support), guests using periodic LAPIC timers (such as FreeBSD 8.4) would see their timers drift significantly over time. Differences in the underlying clocks and numerical errors means the periods of the two timers (hv and sw) are not the same. This difference will accumulate with every expiry resulting in a large error between the hv and sw timer. This means the sw timer may be running slow when compared to the hv timer. When the timer is switched from hv to sw, the now active sw timer will expire late. The guest VCPU is reentered and it switches to using the hv timer. This timer catches up, injecting multiple IRQs into the guest (of which the guest only sees one as it does not get to run until the hv timer has caught up) and thus the guest's timer rate is low (and becomes increasing slower over time as the sw timer lags further and further behind). I believe a similar problem would occur if the hv timer is the slower one, but I have not observed this. Fix this by synchronizing the deadlines for both timers to the same time source on every tick. This prevents the errors from accumulating. Fixes: `8003c9ae20` Cc: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: David Vrabel <david.vrabel@nutanix.com> Cc: stable@vger.kernel.org Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:51:50 +02:00
Jim Mattson	b3ce16455c	kvm: x86: IA32_ARCH_CAPABILITIES is always supported commit `1eaafe91a0` upstream. If there is a possibility that a VM may migrate to a Skylake host, then the hypervisor should report IA32_ARCH_CAPABILITIES.RSBA[bit 2] as being set (future work, of course). This implies that CPUID.(EAX=7,ECX=0):EDX.ARCH_CAPABILITIES[bit 29] should be set. Therefore, kvm should report this CPUID bit as being supported whether or not the host supports it. Userspace is still free to clear the bit if it chooses. For more information on RSBA, see Intel's white paper, "Retpoline: A Branch Target Injection Mitigation" (Document Number 337131-001), currently available at https://bugzilla.kernel.org/show_bug.cgi?id=199511. Since the IA32_ARCH_CAPABILITIES MSR is emulated in kvm, there is no dependency on hardware support for this feature. Signed-off-by: Jim Mattson <jmattson@google.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Fixes: `28c1c9fabf` ("KVM/VMX: Emulate MSR_IA32_ARCH_CAPABILITIES") Cc: stable@vger.kernel.org Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:51:50 +02:00
Wei Huang	e765fd97e0	KVM: x86: Update cpuid properly when CR4.OSXAVE or CR4.PKE is changed commit `c4d2188206` upstream. The CPUID bits of OSXSAVE (function=0x1) and OSPKE (func=0x7, leaf=0x0) allows user apps to detect if OS has set CR4.OSXSAVE or CR4.PKE. KVM is supposed to update these CPUID bits when CR4 is updated. Current KVM code doesn't handle some special cases when updates come from emulator. Here is one example: Step 1: guest boots Step 2: guest OS enables XSAVE ==> CR4.OSXSAVE=1 and CPUID.OSXSAVE=1 Step 3: guest hot reboot ==> QEMU reset CR4 to 0, but CPUID.OSXAVE==1 Step 4: guest os checks CPUID.OSXAVE, detects 1, then executes xgetbv Step 4 above will cause an #UD and guest crash because guest OS hasn't turned on OSXAVE yet. This patch solves the problem by comparing the the old_cr4 with cr4. If the related bits have been changed, kvm_update_cpuid() needs to be called. Signed-off-by: Wei Huang <wei@redhat.com> Reviewed-by: Bandan Das <bsd@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:51:50 +02:00
David Hildenbrand	16c463a4ec	KVM: s390: vsie: fix < 8k check for the itdba commit `f4a551b723` upstream. By missing an "L", we might detect some addresses to be <8k, although they are not. e.g. for itdba = 100001fff !(gpa & ~0x1fffU) -> 1 !(gpa & ~0x1fffUL) -> 0 So we would report a SIE validity intercept although everything is fine. Fixes: `166ecb3` ("KVM: s390: vsie: support transactional execution") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Cc: stable@vger.kernel.org # v4.8+ Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:51:50 +02:00
Konrad Rzeszutek Wilk	9c5eee6056	KVM/VMX: Expose SSBD properly to guests commit `0aa48468d0` upstream. The X86_FEATURE_SSBD is an synthetic CPU feature - that is it bit location has no relevance to the real CPUID 0x7.EBX[31] bit position. For that we need the new CPU feature name. Fixes: `52817587e7` ("x86/cpufeatures: Disentangle SSBD enumeration") Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: kvm@vger.kernel.org Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: stable@vger.kernel.org Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: https://lkml.kernel.org/r/20180521215449.26423-2-konrad.wilk@oracle.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:51:50 +02:00
Gustavo A. R. Silva	058dfcf9c2	kernel/sys.c: fix potential Spectre v1 issue commit `23d6aef74d` upstream. `resource' can be controlled by user-space, hence leading to a potential exploitation of the Spectre variant 1 vulnerability. This issue was detected with the help of Smatch: kernel/sys.c:1474 __do_compat_sys_old_getrlimit() warn: potential spectre issue 'get_current()->signal->rlim' (local cap) kernel/sys.c:1455 __do_sys_old_getrlimit() warn: potential spectre issue 'get_current()->signal->rlim' (local cap) Fix this by sanitizing resource before using it to index current->signal->rlim Notice that given that speculation windows are large, the policy is to kill the speculation on the first load and not worry if it can be completed with a dependent load/store [1]. [1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2 Link: http://lkml.kernel.org/r/20180515030038.GA11822@embeddedor.com Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:51:50 +02:00
David Hildenbrand	1da530fe15	kasan: fix memory hotplug during boot commit `3f19597215` upstream. Using module_init() is wrong. E.g. ACPI adds and onlines memory before our memory notifier gets registered. This makes sure that ACPI memory detected during boot up will not result in a kernel crash. Easily reproducible with QEMU, just specify a DIMM when starting up. Link: http://lkml.kernel.org/r/20180522100756.18478-3-david@redhat.com Fixes: `786a895991` ("kasan: disable memory hotplug") Signed-off-by: David Hildenbrand <david@redhat.com> Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:51:50 +02:00
David Hildenbrand	b052960484	kasan: free allocated shadow memory on MEM_CANCEL_ONLINE commit `ed1596f9ab` upstream. We have to free memory again when we cancel onlining, otherwise a later onlining attempt will fail. Link: http://lkml.kernel.org/r/20180522100756.18478-2-david@redhat.com Fixes: `fa69b5989b` ("mm/kasan: add support for memory hotplug") Signed-off-by: David Hildenbrand <david@redhat.com> Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:51:49 +02:00
Andrey Ryabinin	9c7821c67a	mm/kasan: don't vfree() nonexistent vm_area commit `0f901dcbc3` upstream. KASAN uses different routines to map shadow for hot added memory and memory obtained in boot process. Attempt to offline memory onlined by normal boot process leads to this: Trying to vfree() nonexistent vm area (000000005d3b34b9) WARNING: CPU: 2 PID: 13215 at mm/vmalloc.c:1525 __vunmap+0x147/0x190 Call Trace: kasan_mem_notifier+0xad/0xb9 notifier_call_chain+0x166/0x260 __blocking_notifier_call_chain+0xdb/0x140 __offline_pages+0x96a/0xb10 memory_subsys_offline+0x76/0xc0 device_offline+0xb8/0x120 store_mem_state+0xfa/0x120 kernfs_fop_write+0x1d5/0x320 __vfs_write+0xd4/0x530 vfs_write+0x105/0x340 SyS_write+0xb0/0x140 Obviously we can't call vfree() to free memory that wasn't allocated via vmalloc(). Use find_vm_area() to see if we can call vfree(). Unfortunately it's a bit tricky to properly unmap and free shadow allocated during boot, so we'll have to keep it. If memory will come online again that shadow will be reused. Matthew asked: how can you call vfree() on something that isn't a vmalloc address? vfree() is able to free any address returned by __vmalloc_node_range(). And __vmalloc_node_range() gives you any address you ask. It doesn't have to be an address in [VMALLOC_START, VMALLOC_END] range. That's also how the module_alloc()/module_memfree() works on architectures that have designated area for modules. [aryabinin@virtuozzo.com: improve comments] Link: http://lkml.kernel.org/r/dabee6ab-3a7a-51cd-3b86-5468718e0390@virtuozzo.com [akpm@linux-foundation.org: fix typos, reflow comment] Link: http://lkml.kernel.org/r/20180201163349.8700-1-aryabinin@virtuozzo.com Fixes: `fa69b5989b` ("mm/kasan: add support for memory hotplug") Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Reported-by: Paul Menzel <pmenzel+linux-kasan-dev@molgen.mpg.de> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:51:49 +02:00
Davidlohr Bueso	afdc490b36	ipc/shm: fix shmat() nil address after round-down when remapping commit `8f89c007b6` upstream. shmat()'s SHM_REMAP option forbids passing a nil address for; this is in fact the very first thing we check for. Andrea reported that for SHM_RND\|SHM_REMAP cases we can end up bypassing the initial addr check, but we need to check again if the address was rounded down to nil. As of this patch, such cases will return -EINVAL. Link: http://lkml.kernel.org/r/20180503204934.kk63josdu6u53fbd@linux-n805 Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Reported-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Joe Lawrence <joe.lawrence@redhat.com> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:51:49 +02:00
Davidlohr Bueso	67dd0bad81	Revert "ipc/shm: Fix shmat mmap nil-page protection" commit `a73ab244f0` upstream. Patch series "ipc/shm: shmat() fixes around nil-page". These patches fix two issues reported[1] a while back by Joe and Andrea around how shmat(2) behaves with nil-page. The first reverts a commit that it was incorrectly thought that mapping nil-page (address=0) was a no no with MAP_FIXED. This is not the case, with the exception of SHM_REMAP; which is address in the second patch. I chose two patches because it is easier to backport and it explicitly reverts bogus behaviour. Both patches ought to be in -stable and ltp testcases need updated (the added testcase around the cve can be modified to just test for SHM_RND\|SHM_REMAP). [1] lkml.kernel.org/r/20180430172152.nfa564pvgpk3ut7p@linux-n805 This patch (of 2): Commit `95e91b831f` ("ipc/shm: Fix shmat mmap nil-page protection") worked on the idea that we should not be mapping as root addr=0 and MAP_FIXED. However, it was reported that this scenario is in fact valid, thus making the patch both bogus and breaks userspace as well. For example X11's libint10.so relies on shmat(1, SHM_RND) for lowmem initialization[1]. [1] https://cgit.freedesktop.org/xorg/xserver/tree/hw/xfree86/os-support/linux/int10/linux.c#n347 Link: http://lkml.kernel.org/r/20180503203243.15045-2-dave@stgolabs.net Fixes: `95e91b831f` ("ipc/shm: Fix shmat mmap nil-page protection") Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Reported-by: Joe Lawrence <joe.lawrence@redhat.com> Reported-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:51:49 +02:00
Matthew Wilcox	0472f94cef	idr: fix invalid ptr dereference on item delete commit `7a4deea1aa` upstream. If the radix tree underlying the IDR happens to be full and we attempt to remove an id which is larger than any id in the IDR, we will call __radix_tree_delete() with an uninitialised 'slot' pointer, at which point anything could happen. This was easiest to hit with a single entry at id 0 and attempting to remove a non-0 id, but it could have happened with 64 entries and attempting to remove an id >= 64. Roman said: The syzcaller test boils down to opening /dev/kvm, creating an eventfd, and calling a couple of KVM ioctls. None of this requires superuser. And the result is dereferencing an uninitialized pointer which is likely a crash. The specific path caught by syzbot is via KVM_HYPERV_EVENTD ioctl which is new in 4.17. But I guess there are other user-triggerable paths, so cc:stable is probably justified. Matthew added: We have around 250 calls to idr_remove() in the kernel today. Many of them pass an ID which is embedded in the object they're removing, so they're safe. Picking a few likely candidates: drivers/firewire/core-cdev.c looks unsafe; the ID comes from an ioctl. drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c is similar drivers/atm/nicstar.c could be taken down by a handcrafted packet Link: http://lkml.kernel.org/r/20180518175025.GD6361@bombadil.infradead.org Fixes: `0a835c4f09` ("Reimplement IDR and IDA using the radix tree") Reported-by: <syzbot+35666cba7f0a337e2e79@syzkaller.appspotmail.com> Debugged-by: Roman Kagan <rkagan@virtuozzo.com> Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:51:49 +02:00
Jens Axboe	2a039b9367	sr: pass down correctly sized SCSI sense buffer commit `f7068114d4` upstream. We're casting the CDROM layer request_sense to the SCSI sense buffer, but the former is 64 bytes and the latter is 96 bytes. As we generally allocate these on the stack, we end up blowing up the stack. Fix this by wrapping the scsi_execute() call with a properly sized sense buffer, and copying back the bits for the CDROM layer. Cc: stable@vger.kernel.org Reported-by: Piotr Gabriel Kosinski <pg.kosinski@gmail.com> Reported-by: Daniel Shapira <daniel@twistlock.com> Tested-by: Kees Cook <keescook@chromium.org> Fixes: `82ed4db499` ("block: split scsi_request out of struct request") Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:51:49 +02:00
Lidong Chen	a59bd81957	IB/umem: Use the correct mm during ib_umem_release commit `8e907ed488` upstream. User-space may invoke ibv_reg_mr and ibv_dereg_mr in different threads. If ibv_dereg_mr is called after the thread which invoked ibv_reg_mr has exited, get_pid_task will return NULL and ib_umem_release will not decrease mm->pinned_vm. Instead of using threads to locate the mm, use the overall tgid from the ib_ucontext struct instead. This matches the behavior of ODP and disassociate in handling the mm of the process that called ibv_reg_mr. Cc: <stable@vger.kernel.org> Fixes: `87773dd56d` ("IB: ib_umem_release() should decrement mm->pinned_vm from ib_umem_get") Signed-off-by: Lidong Chen <lidongchen@tencent.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:51:49 +02:00
Michael J. Ruhl	7a5b3b91f8	IB/hfi1: Use after free race condition in send context error path commit `f9e76ca377` upstream. A pio send egress error can occur when the PSM library attempts to to send a bad packet. That issue is still being investigated. The pio error interrupt handler then attempts to progress the recovery of the errored pio send context. Code inspection reveals that the handling lacks the necessary locking if that recovery interleaves with a PSM close of the "context" object contains the pio send context. The lack of the locking can cause the recovery to access the already freed pio send context object and incorrectly deduce that the pio send context is actually a kernel pio send context as shown by the NULL deref stack below: [<ffffffff8143d78c>] _dev_info+0x6c/0x90 [<ffffffffc0613230>] sc_restart+0x70/0x1f0 [hfi1] [<ffffffff816ab124>] ? __schedule+0x424/0x9b0 [<ffffffffc06133c5>] sc_halted+0x15/0x20 [hfi1] [<ffffffff810aa3ba>] process_one_work+0x17a/0x440 [<ffffffff810ab086>] worker_thread+0x126/0x3c0 [<ffffffff810aaf60>] ? manage_workers.isra.24+0x2a0/0x2a0 [<ffffffff810b252f>] kthread+0xcf/0xe0 [<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40 [<ffffffff816b8798>] ret_from_fork+0x58/0x90 [<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40 This is the best case scenario and other scenarios can corrupt the already freed memory. Fix by adding the necessary locking in the pio send context error handler. Cc: <stable@vger.kernel.org> # 4.9.x Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:51:49 +02:00
Michael Neuling	df07f27184	powerpc/64s: Clear PCR on boot commit `faf37c44a1` upstream. Clear the PCR (Processor Compatibility Register) on boot to ensure we are not running in a compatibility mode. We've seen this cause problems when a crash (and kdump) occurs while running compat mode guests. The kdump kernel then runs with the PCR set and causes problems. The symptom in the kdump kernel (also seen in petitboot after fast-reboot) is early userspace programs taking sigills on newer instructions (seen in libc). Signed-off-by: Michael Neuling <mikey@neuling.org> Cc: stable@vger.kernel.org Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:51:49 +02:00
Will Deacon	92169a015b	arm64: lse: Add early clobbers to some input/output asm operands commit `32c3fa7cdf` upstream. For LSE atomics that read and write a register operand, we need to ensure that these operands are annotated as "early clobber" if the register is written before all of the input operands have been consumed. Failure to do so can result in the compiler allocating the same register to both operands, leading to splats such as: Unable to handle kernel paging request at virtual address 11111122222221 [...] x1 : 1111111122222222 x0 : 1111111122222221 Process swapper/0 (pid: 1, stack limit = 0x000000008209f908) Call trace: test_atomic64+0x1360/0x155c where x0 has been allocated as both the value to be stored and also the atomic_t pointer. This patch adds the missing clobbers. Cc: <stable@vger.kernel.org> Cc: Dave Martin <dave.martin@arm.com> Cc: Robin Murphy <robin.murphy@arm.com> Reported-by: Mark Salter <msalter@redhat.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:51:48 +02:00
Thomas Hellstrom	760e4d7e89	drm/vmwgfx: Fix 32-bit VMW_PORT_HB_[IN\|OUT] macros commit `938ae7259c` upstream. Depending on whether the kernel is compiled with frame-pointer or not, the temporary memory location used for the bp parameter in these macros is referenced relative to the stack pointer or the frame pointer. Hence we can never reference that parameter when we've modified either the stack pointer or the frame pointer, because then the compiler would generate an incorrect stack reference. Fix this by pushing the temporary memory parameter on a known location on the stack before modifying the stack- and frame pointers. Cc: <stable@vger.kernel.org> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:51:48 +02:00
Joe Jin	a0f8cbce7b	xen-swiotlb: fix the check condition for xen_swiotlb_free_coherent commit `4855c92dbb` upstream. When run raidconfig from Dom0 we found that the Xen DMA heap is reduced, but Dom Heap is increased by the same size. Tracing raidconfig we found that the related ioctl() in megaraid_sas will call dma_alloc_coherent() to apply memory. If the memory allocated by Dom0 is not in the DMA area, it will exchange memory with Xen to meet the requiment. Later drivers call dma_free_coherent() to free the memory, on xen_swiotlb_free_coherent() the check condition (dev_addr + size - 1 <= dma_mask) is always false, it prevents calling xen_destroy_contiguous_region() to return the memory to the Xen DMA heap. This issue introduced by commit `6810df88dc` "xen-swiotlb: When doing coherent alloc/dealloc check before swizzling the MFNs.". Signed-off-by: Joe Jin <joe.jin@oracle.com> Tested-by: John Sobecki <john.sobecki@oracle.com> Reviewed-by: Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:51:48 +02:00
Sudip Mukherjee	4182f5a075	libata: blacklist Micron 500IT SSD with MU01 firmware commit `136d769e0b` upstream. While whitelisting Micron M500DC drives, the tweaked blacklist entry enabled queued TRIM from M500IT variants also. But these do not support queued TRIM. And while using those SSDs with the latest kernel we have seen errors and even the partition table getting corrupted. Some part from the dmesg: [ 6.727384] ata1.00: ATA-9: Micron_M500IT_MTFDDAK060MBD, MU01, max UDMA/133 [ 6.727390] ata1.00: 117231408 sectors, multi 16: LBA48 NCQ (depth 31/32), AA [ 6.741026] ata1.00: supports DRM functions and may not be fully accessible [ 6.759887] ata1.00: configured for UDMA/133 [ 6.762256] scsi 0:0:0:0: Direct-Access ATA Micron_M500IT_MT MU01 PQ: 0 ANSI: 5 and then for the error: [ 120.860334] ata1.00: exception Emask 0x1 SAct 0x7ffc0007 SErr 0x0 action 0x6 frozen [ 120.860338] ata1.00: irq_stat 0x40000008 [ 120.860342] ata1.00: failed command: SEND FPDMA QUEUED [ 120.860351] ata1.00: cmd 64/01:00:00:00:00/00:00:00:00:00/a0 tag 0 ncq dma 512 out res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x5 (timeout) [ 120.860353] ata1.00: status: { DRDY } [ 120.860543] ata1: hard resetting link [ 121.166128] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [ 121.166376] ata1.00: supports DRM functions and may not be fully accessible [ 121.186238] ata1.00: supports DRM functions and may not be fully accessible [ 121.204445] ata1.00: configured for UDMA/133 [ 121.204454] ata1.00: device reported invalid CHS sector 0 [ 121.204541] sd 0:0:0:0: [sda] tag#18 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08 [ 121.204546] sd 0:0:0:0: [sda] tag#18 Sense Key : 0x5 [current] [ 121.204550] sd 0:0:0:0: [sda] tag#18 ASC=0x21 ASCQ=0x4 [ 121.204555] sd 0:0:0:0: [sda] tag#18 CDB: opcode=0x93 93 08 00 00 00 00 00 04 28 80 00 00 00 30 00 00 [ 121.204559] print_req_error: I/O error, dev sda, sector 272512 After few reboots with these errors, and the SSD is corrupted. After blacklisting it, the errors are not seen and the SSD does not get corrupted any more. Fixes: `243918be63` ("libata: Do not blacklist Micron M500DC") Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:51:48 +02:00
Tejun Heo	21712abb8b	libata: Blacklist some Sandisk SSDs for NCQ commit `322579dcc8` upstream. Sandisk SSDs SD7SN6S256G and SD8SN8U256G are regularly locking up regularly under sustained moderate load with NCQ enabled. Blacklist for now. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Dave Jones <davej@codemonkey.org.uk> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:51:48 +02:00
Corneliu Doban	f2a3c8bb4d	mmc: sdhci-iproc: add SDHCI_QUIRK2_HOST_OFF_CARD_ON for cygnus commit `3de06d5a1f` upstream. The SDHCI_QUIRK2_HOST_OFF_CARD_ON is needed for the driver to properly reset the host controller (reset all) on initialization after exiting deep sleep. Signed-off-by: Corneliu Doban <corneliu.doban@broadcom.com> Signed-off-by: Scott Branden <scott.branden@broadcom.com> Reviewed-by: Ray Jui <ray.jui@broadcom.com> Reviewed-by: Srinath Mannam <srinath.mannam@broadcom.com> Fixes: `c833e92bbb` ("mmc: sdhci-iproc: support standard byte register accesses") Cc: stable@vger.kernel.org # v4.10+ Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:51:48 +02:00
Corneliu Doban	4da8f20a99	mmc: sdhci-iproc: fix 32bit writes for TRANSFER_MODE register commit `5f651b8704` upstream. When the host controller accepts only 32bit writes, the value of the 16bit TRANSFER_MODE register, that has the same 32bit address as the 16bit COMMAND register, needs to be saved and it will be written in a 32bit write together with the command as this will trigger the host to send the command on the SD interface. When sending the tuning command, TRANSFER_MODE is written and then sdhci_set_transfer_mode reads it back to clear AUTO_CMD12 bit and write it again resulting in wrong value to be written because the initial write value was saved in a shadow and the read-back returned a wrong value, from the register. Fix sdhci_iproc_readw to return the saved value of TRANSFER_MODE when a saved value exist. Same fix for read of BLOCK_SIZE and BLOCK_COUNT registers, that are saved for a different reason, although a scenario that will cause the mentioned problem on this registers is not probable. Fixes: `b580c52d58` ("mmc: sdhci-iproc: add IPROC SDHCI driver") Signed-off-by: Corneliu Doban <corneliu.doban@broadcom.com> Signed-off-by: Scott Branden <scott.branden@broadcom.com> Cc: stable@vger.kernel.org # v4.1+ Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:51:48 +02:00
Srinath Mannam	ebedf0b290	mmc: sdhci-iproc: remove hard coded mmc cap 1.8v commit `4c94238f37` upstream. Remove hard coded mmc cap 1.8v from platform data as it is board specific. The 1.8v DDR mmc caps can be enabled using DTS property for those boards that support it. Fixes: `b17b4ab8ce` ("mmc: sdhci-iproc: define MMC caps in platform data") Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com> Signed-off-by: Scott Branden <scott.branden@broadcom.com> Reviewed-by: Ray Jui <ray.jui@broadcom.com> Cc: stable@vger.kernel.org # v4.8+ Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:51:48 +02:00
Al Viro	f440ea85d4	do d_instantiate/unlock_new_inode combinations safely commit `1e2e547a93` upstream. For anything NFS-exported we do _not_ want to unlock new inode before it has grown an alias; original set of fixes got the ordering right, but missed the nasty complication in case of lockdep being enabled - unlock_new_inode() does lockdep_annotate_inode_mutex_key(inode) which can only be done before anyone gets a chance to touch ->i_mutex. Unfortunately, flipping the order and doing unlock_new_inode() before d_instantiate() opens a window when mkdir can race with open-by-fhandle on a guessed fhandle, leading to multiple aliases for a directory inode and all the breakage that follows from that. Correct solution: a new primitive (d_instantiate_new()) combining these two in the right order - lockdep annotate, then d_instantiate(), then the rest of unlock_new_inode(). All combinations of d_instantiate() with unlock_new_inode() should be converted to that. Cc: stable@kernel.org # 2.6.29 and later Tested-by: Mike Marshall <hubcap@omnibond.com> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:51:47 +02:00
Ben Hutchings	ba3fbb7afd	ALSA: timer: Fix pause event notification commit `3ae1809725` upstream. Commit `f65e0d2998` ("ALSA: timer: Call notifier in the same spinlock") combined the start/continue and stop/pause functions, and in doing so changed the event code for the pause case to SNDRV_TIMER_EVENT_CONTINUE. Change it back to SNDRV_TIMER_EVENT_PAUSE. Fixes: `f65e0d2998` ("ALSA: timer: Call notifier in the same spinlock") Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Cc: stable@vger.kernel.org Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:51:47 +02:00
Al Viro	fbcede36bb	aio: fix io_destroy(2) vs. lookup_ioctx() race commit `baf10564fb` upstream. kill_ioctx() used to have an explicit RCU delay between removing the reference from ->ioctx_table and percpu_ref_kill() dropping the refcount. At some point that delay had been removed, on the theory that percpu_ref_kill() itself contained an RCU delay. Unfortunately, that was the wrong kind of RCU delay and it didn't care about rcu_read_lock() used by lookup_ioctx(). As the result, we could get ctx freed right under lookup_ioctx(). Tejun has fixed that in `a6d7cff472` ("fs/aio: Add explicit RCU grace period when freeing kioctx"); however, that fix is not enough. Suppose io_destroy() from one thread races with e.g. io_setup() from another; CPU1 removes the reference from current->mm->ioctx_table[...] just as CPU2 has picked it (under rcu_read_lock()). Then CPU1 proceeds to drop the refcount, getting it to 0 and triggering a call of free_ioctx_users(), which proceeds to drop the secondary refcount and once that reaches zero calls free_ioctx_reqs(). That does INIT_RCU_WORK(&ctx->free_rwork, free_ioctx); queue_rcu_work(system_wq, &ctx->free_rwork); and schedules freeing the whole thing after RCU delay. In the meanwhile CPU2 has gotten around to percpu_ref_get(), bumping the refcount from 0 to 1 and returned the reference to io_setup(). Tejun's fix (that queue_rcu_work() in there) guarantees that ctx won't get freed until after percpu_ref_get(). Sure, we'd increment the counter before ctx can be freed. Now we are out of rcu_read_lock() and there's nothing to stop freeing of the whole thing. Unfortunately, CPU2 assumes that since it has grabbed the reference, ctx is NOT going away until it gets around to dropping that reference. The fix is obvious - use percpu_ref_tryget_live() and treat failure as miss. It's not costlier than what we currently do in normal case, it's safe to call since freeing is delayed and it closes the race window - either lookup_ioctx() comes before percpu_ref_kill() (in which case ctx->users won't reach 0 until the caller of lookup_ioctx() drops it) or lookup_ioctx() fails, ctx->users is unaffected and caller of lookup_ioctx() doesn't see the object in question at all. Cc: stable@kernel.org Fixes: `a6d7cff472` "fs/aio: Add explicit RCU grace period when freeing kioctx" Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:51:47 +02:00
Dave Chinner	b9659ff375	fs: don't scan the inode cache before SB_BORN is set commit `79f546a696` upstream. We recently had an oops reported on a 4.14 kernel in xfs_reclaim_inodes_count() where sb->s_fs_info pointed to garbage and so the m_perag_tree lookup walked into lala land. It produces an oops down this path during the failed mount: radix_tree_gang_lookup_tag+0xc4/0x130 xfs_perag_get_tag+0x37/0xf0 xfs_reclaim_inodes_count+0x32/0x40 xfs_fs_nr_cached_objects+0x11/0x20 super_cache_count+0x35/0xc0 shrink_slab.part.66+0xb1/0x370 shrink_node+0x7e/0x1a0 try_to_free_pages+0x199/0x470 __alloc_pages_slowpath+0x3a1/0xd20 __alloc_pages_nodemask+0x1c3/0x200 cache_grow_begin+0x20b/0x2e0 fallback_alloc+0x160/0x200 kmem_cache_alloc+0x111/0x4e0 The problem is that the superblock shrinker is running before the filesystem structures it depends on have been fully set up. i.e. the shrinker is registered in sget(), before ->fill_super() has been called, and the shrinker can call into the filesystem before fill_super() does it's setup work. Essentially we are exposed to both use-after-free and use-before-initialisation bugs here. To fix this, add a check for the SB_BORN flag in super_cache_count. In general, this flag is not set until ->fs_mount() completes successfully, so we know that it is set after the filesystem setup has completed. This matches the trylock_super() behaviour which will not let super_cache_scan() run if SB_BORN is not set, and hence will not allow the superblock shrinker from entering the filesystem while it is being set up or after it has failed setup and is being torn down. Cc: stable@kernel.org Signed-Off-By: Dave Chinner <dchinner@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:51:47 +02:00
Al Viro	1e5edf32e4	affs_lookup(): close a race with affs_remove_link() commit `30da870ce4` upstream. we unlock the directory hash too early - if we are looking at secondary link and primary (in another directory) gets removed just as we unlock, we could have the old primary moved in place of the secondary, leaving us to look into freed entry (and leaving our dentry with ->d_fsdata pointing to a freed entry). Cc: stable@vger.kernel.org # 2.4.4+ Acked-by: David Sterba <dsterba@suse.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:51:47 +02:00
Colin Ian King	2871a70132	KVM: Fix spelling mistake: "cop_unsuable" -> "cop_unusable" commit `ba3696e94d` upstream. Trivial fix to spelling mistake in debugfs_entries text. Fixes: `669e846e6c` ("KVM/MIPS32: MIPS arch specific APIs for KVM") Signed-off-by: Colin Ian King <colin.king@canonical.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kernel-janitors@vger.kernel.org Cc: <stable@vger.kernel.org> # 3.10+ Signed-off-by: James Hogan <jhogan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:51:47 +02:00
Maciej W. Rozycki	bba75a0ccd	MIPS: Fix ptrace(2) PTRACE_PEEKUSR and PTRACE_POKEUSR accesses to o32 FGRs commit `9a3a92ccfe` upstream. Check the TIF_32BIT_FPREGS task setting of the tracee rather than the tracer in determining the layout of floating-point general registers in the floating-point context, correcting access to odd-numbered registers for o32 tracees where the setting disagrees between the two processes. Fixes: `597ce1723e` ("MIPS: Support for 64-bit FP with O32 binaries") Signed-off-by: Maciej W. Rozycki <macro@mips.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: <stable@vger.kernel.org> # 3.14+ Signed-off-by: James Hogan <jhogan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:51:47 +02:00
Maciej W. Rozycki	769fc447cc	MIPS: ptrace: Expose FIR register through FP regset commit `71e909c0cd` upstream. Correct commit `7aeb753b53` ("MIPS: Implement task_user_regset_view.") and expose the FIR register using the unused 4 bytes at the end of the NT_PRFPREG regset. Without that register included clients cannot use the PTRACE_GETREGSET request to retrieve the complete FPU register set and have to resort to one of the older interfaces, either PTRACE_PEEKUSR or PTRACE_GETFPREGS, to retrieve the missing piece of data. Also the register is irreversibly missing from core dumps. This register is architecturally hardwired and read-only so the write path does not matter. Ignore data supplied on writes then. Fixes: `7aeb753b53` ("MIPS: Implement task_user_regset_view.") Signed-off-by: James Hogan <jhogan@kernel.org> Signed-off-by: Maciej W. Rozycki <macro@mips.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: <stable@vger.kernel.org> # 3.13+ Patchwork: https://patchwork.linux-mips.org/patch/19273/ Signed-off-by: James Hogan <jhogan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:51:47 +02:00
NeilBrown	368b70857d	MIPS: c-r4k: Fix data corruption related to cache coherence commit `55a2aa08b3` upstream. When DMA will be performed to a MIPS32 1004K CPS, the L1-cache for the range needs to be flushed and invalidated first. The code currently takes one of two approaches. 1/ If the range is less than the size of the dcache, then HIT type requests flush/invalidate cache lines for the particular addresses. HIT-type requests a globalised by the CPS so this is safe on SMP. 2/ If the range is larger than the size of dcache, then INDEX type requests flush/invalidate the whole cache. INDEX type requests affect the local cache only. CPS does not propagate them in any way. So this invalidation is not safe on SMP CPS systems. Data corruption due to '2' can quite easily be demonstrated by repeatedly "echo 3 > /proc/sys/vm/drop_caches" and then sha1sum a file that is several times the size of available memory. Dropping caches means that large contiguous extents (large than dcache) are more likely. This was not a problem before Linux-4.8 because option 2 was never used if CONFIG_MIPS_CPS was defined. The commit which removed that apparently didn't appreciate the full consequence of the change. We could, in theory, globalize the INDEX based flush by sending an IPI to other cores. These cache invalidation routines can be called with interrupts disabled and synchronous IPI require interrupts to be enabled. Asynchronous IPI may not trigger writeback soon enough. So we cannot use IPI in practice. We can already test if IPI would be needed for an INDEX operation with r4k_op_needs_ipi(R4K_INDEX). If this is true then we mustn't try the INDEX approach as we cannot use IPI. If this is false (e.g. when there is only one core and hence one L1 cache) then it is safe to use the INDEX approach without IPI. This patch avoids options 2 if r4k_op_needs_ipi(R4K_INDEX), and so eliminates the corruption. Fixes: `c00ab4896e` ("MIPS: Remove cpu_has_safe_index_cacheops") Signed-off-by: NeilBrown <neil@brown.name> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paul Burton <paul.burton@mips.com> Cc: linux-mips@linux-mips.org Cc: <stable@vger.kernel.org> # 4.8+ Patchwork: https://patchwork.linux-mips.org/patch/19259/ Signed-off-by: James Hogan <jhogan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-30 07:51:46 +02:00
Greg Kroah-Hartman	102b97d624	Linux 4.14.44	2018-05-25 16:18:02 +02:00
James Hogan	6b73dfbd3c	rtc: goldfish: Add missing MODULE_LICENSE [ Upstream commit `82d632b85e` ] Fix the following warning in MIPS allmodconfig by adding a MODULE_LICENSE() at the end of rtc-goldfish.c, based on the file header comment which says GNU General Public License version 2: WARNING: modpost: missing MODULE_LICENSE() in drivers/rtc/rtc-goldfish.o Fixes: `f22d9cdcb5` ("rtc: goldfish: Add RTC driver for Android emulator") Signed-off-by: James Hogan <jhogan@kernel.org> Cc: Miodrag Dinic <miodrag.dinic@mips.com> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: Alexandre Belloni <alexandre.belloni@free-electrons.com> Cc: linux-rtc@vger.kernel.org Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-25 16:18:02 +02:00
Alexandre Belloni	6f34e43650	rtc: rp5c01: fix possible race condition [ Upstream commit `bcdd559268` ] The probe function is not allowed to fail after registering the RTC because the following may happen: CPU0: CPU1: sys_load_module() do_init_module() do_one_initcall() cmos_do_probe() rtc_device_register() __register_chrdev() cdev->owner = struct module* open("/dev/rtc0") rtc_device_unregister() module_put() free_module() module_free(mod->module_core) /* struct module module is now freed / chrdev_open() spin_lock(cdev_lock) cdev_get() try_module_get() module_is_live() /* dereferences already freed struct module* */ Switch to devm_rtc_allocate_device/rtc_register_device to register the rtc as late as possible. Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-25 16:18:01 +02:00
Colin Ian King	78227b671e	rtc: tx4939: avoid unintended sign extension on a 24 bit shift [ Upstream commit `347876ad47` ] The shifting of buf[5] by 24 bits to the left will be promoted to a 32 bit signed int and then sign-extended to an unsigned long. If the top bit of buf[5] is set then all then all the upper bits sec end up as also being set because of the sign-extension. Fix this by casting buf[5] to an unsigned long before the shift. Detected by CoverityScan, CID#1465292 ("Unintended sign extension") Fixes: `0e1492330c` ("rtc: add rtc-tx4939 driver") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-25 16:18:01 +02:00
Alexandre Belloni	459aa4904a	rtc: m41t80: fix race conditions [ Upstream commit `10d0c768cc` ] The IRQ is requested before the struct rtc is allocated and registered, but this struct is used in the IRQ handler, leading to: Unable to handle kernel NULL pointer dereference at virtual address 0000017c pgd = a38a2f9b [0000017c] pgd=00000000 Internal error: Oops: 5 [#1] ARM Modules linked in: CPU: 0 PID: 613 Comm: irq/48-m41t80 Not tainted 4.16.0-rc1+ #42 Hardware name: Atmel SAMA5 PC is at mutex_lock+0x14/0x38 LR is at m41t80_handle_irq+0x1c/0x9c pc : [<c06e864c>] lr : [<c04b70f0>] psr: 20000013 sp : dec73f30 ip : 00000000 fp : dec56d98 r10: df437cf0 r9 : c0a03008 r8 : c0145ffc r7 : df5c4300 r6 : dec568d0 r5 : df593000 r4 : 0000017c r3 : df592800 r2 : 60000013 r1 : df593000 r0 : 0000017c Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none Control: 10c53c7d Table: 20004059 DAC: 00000051 Process irq/48-m41t80 (pid: 613, stack limit = 0xb52d091e) Stack: (0xdec73f30 to 0xdec74000) 3f20: dec56840 df5c4300 00000001 df5c4300 3f40: c0145ffc c0146018 dec56840 ffffe000 00000001 c0146290 dec567c0 00000000 3f60: c0146084 ed7c9a62 c014615c dec56d80 dec567c0 00000000 dec72000 dec56840 3f80: c014615c c012ffc0 dec72000 dec567c0 c012fe80 00000000 00000000 00000000 3fa0: 00000000 00000000 00000000 c01010e8 00000000 00000000 00000000 00000000 3fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 3fe0: 00000000 00000000 00000000 00000000 00000013 00000000 29282726 2d2c2b2a [<c06e864c>] (mutex_lock) from [<c04b70f0>] (m41t80_handle_irq+0x1c/0x9c) [<c04b70f0>] (m41t80_handle_irq) from [<c0146018>] (irq_thread_fn+0x1c/0x54) [<c0146018>] (irq_thread_fn) from [<c0146290>] (irq_thread+0x134/0x1c0) [<c0146290>] (irq_thread) from [<c012ffc0>] (kthread+0x140/0x148) [<c012ffc0>] (kthread) from [<c01010e8>] (ret_from_fork+0x14/0x2c) Exception stack(0xdec73fb0 to 0xdec73ff8) 3fa0: 00000000 00000000 00000000 00000000 3fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 3fe0: 00000000 00000000 00000000 00000000 00000013 00000000 Code: e3c33d7f e3c3303f f5d0f000 e593300c (e1901f9f) ---[ end trace 22b027302eb7c604 ]--- genirq: exiting task "irq/48-m41t80" (613) is an active IRQ thread (irq 48) Also, there is another possible race condition. The probe function is not allowed to fail after the RTC is registered because the following may happen: CPU0: CPU1: sys_load_module() do_init_module() do_one_initcall() cmos_do_probe() rtc_device_register() __register_chrdev() cdev->owner = struct module open("/dev/rtc0") rtc_device_unregister() module_put() free_module() module_free(mod->module_core) /* struct module module is now freed / chrdev_open() spin_lock(cdev_lock) cdev_get() try_module_get() module_is_live() /* dereferences already freed struct module* */ Switch to devm_rtc_allocate_device/rtc_register_device to allocate the rtc before requesting the IRQ and register it as late as possible. Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-25 16:18:01 +02:00
Alexandre Belloni	6266010c38	rtc: rk808: fix possible race condition [ Upstream commit `201fac95e7` ] The probe function is not allowed to fail after registering the RTC because the following may happen: CPU0: CPU1: sys_load_module() do_init_module() do_one_initcall() cmos_do_probe() rtc_device_register() __register_chrdev() cdev->owner = struct module* open("/dev/rtc0") rtc_device_unregister() module_put() free_module() module_free(mod->module_core) /* struct module module is now freed / chrdev_open() spin_lock(cdev_lock) cdev_get() try_module_get() module_is_live() /* dereferences already freed struct module* */ Switch to devm_rtc_allocate_device/rtc_register_device to register the rtc as late as possible. Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-25 16:18:01 +02:00
Alexandre Belloni	6c1c171164	rtc: hctosys: Ensure system time doesn't overflow time_t [ Upstream commit `b3a5ac42ab` ] On 32bit platforms, time_t is still a signed 32bit long. If it is overflowed, userspace and the kernel cant agree on the current system time. This causes multiple issues, in particular with systemd: https://github.com/systemd/systemd/issues/1143 A good workaround is to simply avoid using hctosys which is something I greatly encourage as the time is better set by userspace. However, many distribution enable it and use systemd which is rendering the system unusable in case the RTC holds a date after 2038 (and more so after 2106). Many drivers have workaround for this case and they should be eliminated so there is only one place left to fix when userspace is able to cope with dates after the 31bit overflow. Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-25 16:18:00 +02:00
Bryan O'Donoghue	731d965a58	rtc: snvs: Fix usage of snvs_rtc_enable [ Upstream commit `1485991c02` ] commit `179a502f8c` ("rtc: snvs: add Freescale rtc-snvs driver") introduces the SNVS RTC driver with a function snvs_rtc_enable(). snvs_rtc_enable() can return an error on the enable path however this driver does not currently trap that failure on the probe() path and consequently if enabling the RTC fails we encounter a later error spinning forever in rtc_write_sync_lp(). [ 36.093481] [<c010d630>] (__irq_svc) from [<c0c2e9ec>] (_raw_spin_unlock_irqrestore+0x34/0x44) [ 36.102122] [<c0c2e9ec>] (_raw_spin_unlock_irqrestore) from [<c072e32c>] (regmap_read+0x4c/0x5c) [ 36.110938] [<c072e32c>] (regmap_read) from [<c085d0f4>] (rtc_write_sync_lp+0x6c/0x98) [ 36.118881] [<c085d0f4>] (rtc_write_sync_lp) from [<c085d160>] (snvs_rtc_alarm_irq_enable+0x40/0x4c) [ 36.128041] [<c085d160>] (snvs_rtc_alarm_irq_enable) from [<c08567b4>] (rtc_timer_do_work+0xd8/0x1a8) [ 36.137291] [<c08567b4>] (rtc_timer_do_work) from [<c01441b8>] (process_one_work+0x28c/0x76c) [ 36.145840] [<c01441b8>] (process_one_work) from [<c01446cc>] (worker_thread+0x34/0x58c) [ 36.153961] [<c01446cc>] (worker_thread) from [<c014aee4>] (kthread+0x138/0x150) [ 36.161388] [<c014aee4>] (kthread) from [<c0107e14>] (ret_from_fork+0x14/0x20) [ 36.168635] rcu_sched kthread starved for 2602 jiffies! g496 c495 f0x2 RCU_GP_WAIT_FQS(3) ->state=0x0 ->cpu=0 [ 36.178564] rcu_sched R running task 0 8 2 0x00000000 [ 36.185664] [<c0c288b0>] (__schedule) from [<c0c29134>] (schedule+0x3c/0xa0) [ 36.192739] [<c0c29134>] (schedule) from [<c0c2db80>] (schedule_timeout+0x78/0x4e0) [ 36.200422] [<c0c2db80>] (schedule_timeout) from [<c01a7ab0>] (rcu_gp_kthread+0x648/0x1864) [ 36.208800] [<c01a7ab0>] (rcu_gp_kthread) from [<c014aee4>] (kthread+0x138/0x150) [ 36.216309] [<c014aee4>] (kthread) from [<c0107e14>] (ret_from_fork+0x14/0x20) This patch fixes by parsing the result of rtc_write_sync_lp() and propagating both in the probe and elsewhere. If the RTC doesn't start we don't proceed loading the driver and don't get into this loop mess later on. Fixes: `179a502f8c` ("rtc: snvs: add Freescale rtc-snvs driver") Signed-off-by: Bryan O'Donoghue <pure.logic@nexus-software.ie> Acked-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-25 16:18:00 +02:00

1 2 3 4 5 ...

712621 commits