1
0
Fork 0
Commit Graph

134 Commits (8a7f97b902f4fb0d94b355b6b3f1fbd7154cafb9)

Author SHA1 Message Date
Timothy E Baldwin f18aef742c ARM: 8802/1: Call syscall_trace_exit even when system call skipped
On at least x86 and ARM64, and as documented in the ptrace man page
a skipped system call will still cause a syscall exit ptrace stop.

Previous to this commit 32-bit ARM did not, resulting in strace
being confused when seccomp skips system calls.

This change also impacts programs that use ptrace to skip system calls.

Fixes: ad75b51459 ("ARM: 7579/1: arch/allow a scno of -1 to not cause a SIGILL")
Signed-off-by: Timothy E Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Signed-off-by: Eugene Syromyatnikov <evgsyr@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Kees Cook <keescook@chromium.org>
Tested-by: Eugene Syromyatnikov <evgsyr@gmail.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2018-10-10 13:53:12 +01:00
Vincent Whitchurch afc9f65e01 ARM: 8781/1: Fix Thumb-2 syscall return for binutils 2.29+
When building the kernel as Thumb-2 with binutils 2.29 or newer, if the
assembler has seen the .type directive (via ENDPROC()) for a symbol, it
automatically handles the setting of the lowest bit when the symbol is
used with ADR.  The badr macro on the other hand handles this lowest bit
manually.  This leads to a jump to a wrong address in the wrong state
in the syscall return path:

 Internal error: Oops - undefined instruction: 0 [#2] SMP THUMB2
 Modules linked in:
 CPU: 0 PID: 652 Comm: modprobe Tainted: G      D           4.18.0-rc3+ #8
 PC is at ret_fast_syscall+0x4/0x62
 LR is at sys_brk+0x109/0x128
 pc : [<80101004>]    lr : [<801c8a35>]    psr: 60000013
 Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
 Control: 50c5387d  Table: 9e82006a  DAC: 00000051
 Process modprobe (pid: 652, stack limit = 0x(ptrval))

 80101000 <ret_fast_syscall>:
 80101000:       b672            cpsid   i
 80101002:       f8d9 2008       ldr.w   r2, [r9, #8]
 80101006:       f1b2 4ffe       cmp.w   r2, #2130706432 ; 0x7f000000

 80101184 <local_restart>:
 80101184:       f8d9 a000       ldr.w   sl, [r9]
 80101188:       e92d 0030       stmdb   sp!, {r4, r5}
 8010118c:       f01a 0ff0       tst.w   sl, #240        ; 0xf0
 80101190:       d117            bne.n   801011c2 <__sys_trace>
 80101192:       46ba            mov     sl, r7
 80101194:       f5ba 7fc8       cmp.w   sl, #400        ; 0x190
 80101198:       bf28            it      cs
 8010119a:       f04f 0a00       movcs.w sl, #0
 8010119e:       f3af 8014       nop.w   {20}
 801011a2:       f2af 1ea2       subw    lr, pc, #418    ; 0x1a2

To fix this, add a new symbol name which doesn't have ENDPROC used on it
and use that with badr.  We can't remove the badr usage since that would
would cause breakage with older binutils.

Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2018-07-30 11:45:19 +01:00
Linus Torvalds d82991a868 Merge branch 'core-rseq-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull restartable sequence support from Thomas Gleixner:
 "The restartable sequences syscall (finally):

  After a lot of back and forth discussion and massive delays caused by
  the speculative distraction of maintainers, the core set of
  restartable sequences has finally reached a consensus.

  It comes with the basic non disputed core implementation along with
  support for arm, powerpc and x86 and a full set of selftests

  It was exposed to linux-next earlier this week, so it does not fully
  comply with the merge window requirements, but there is really no
  point to drag it out for yet another cycle"

* 'core-rseq-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  rseq/selftests: Provide Makefile, scripts, gitignore
  rseq/selftests: Provide parametrized tests
  rseq/selftests: Provide basic percpu ops test
  rseq/selftests: Provide basic test
  rseq/selftests: Provide rseq library
  selftests/lib.mk: Introduce OVERRIDE_TARGETS
  powerpc: Wire up restartable sequences system call
  powerpc: Add syscall detection for restartable sequences
  powerpc: Add support for restartable sequences
  x86: Wire up restartable sequence system call
  x86: Add support for restartable sequences
  arm: Wire up restartable sequences system call
  arm: Add syscall detection for restartable sequences
  arm: Add restartable sequences support
  rseq: Introduce restartable sequences system call
  uapi/headers: Provide types_32_64.h
2018-06-10 10:17:09 -07:00
Mathieu Desnoyers b74406f377 arm: Add syscall detection for restartable sequences
Syscalls are not allowed inside restartable sequences, so add a call to
rseq_syscall() at the very beginning of system call exiting path for
CONFIG_DEBUG_RSEQ=y kernel. This could help us to detect whether there
is a syscall issued inside restartable sequences.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Watson <davejwatson@fb.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Chris Lameter <cl@linux.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Andrew Hunter <ahh@google.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Paul Turner <pjt@google.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ben Maurer <bmaurer@fb.com>
Cc: linux-api@vger.kernel.org
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lkml.kernel.org/r/20180602124408.8430-5-mathieu.desnoyers@efficios.com
2018-06-06 11:58:31 +02:00
Russell King 10573ae547 ARM: spectre-v1: fix syscall entry
Prevent speculation at the syscall table decoding by clamping the index
used to zero on invalid system call numbers, and using the csdb
speculative barrier.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Boot-tested-by: Tony Lindgren <tony@atomide.com>
Reviewed-by: Tony Lindgren <tony@atomide.com>
2018-05-31 23:27:26 +01:00
Russell King c608906165 ARM: probes: avoid adding kprobes to sensitive kernel-entry/exit code
Avoid adding kprobes to any of the kernel entry/exit or startup
assembly code, or code in the identity-mapped region.  This code does
not conform to the standard C conventions, which means that the
expectations of the kprobes code is not forfilled.

Placing kprobes at some of these locations results in the kernel trying
to return to userspace addresses while retaining the CPU in kernel mode.

Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2017-12-17 22:14:21 +00:00
Linus Torvalds 441692aafc Merge branch 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm
Pull ARM updates from Russell King:

 - add support for ELF fdpic binaries on both MMU and noMMU platforms

 - linker script cleanups

 - support for compressed .data section for XIP images

 - discard memblock arrays when possible

 - various cleanups

 - atomic DMA pool updates

 - better diagnostics of missing/corrupt device tree

 - export information to allow userspace kexec tool to place images more
   inteligently, so that the device tree isn't overwritten by the
   booting kernel

 - make early_printk more efficient on semihosted systems

 - noMMU cleanups

 - SA1111 PCMCIA update in preparation for further cleanups

* 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: (38 commits)
  ARM: 8719/1: NOMMU: work around maybe-uninitialized warning
  ARM: 8717/2: debug printch/printascii: translate '\n' to "\r\n" not "\n\r"
  ARM: 8713/1: NOMMU: Support MPU in XIP configuration
  ARM: 8712/1: NOMMU: Use more MPU regions to cover memory
  ARM: 8711/1: V7M: Add support for MPU to M-class
  ARM: 8710/1: Kconfig: Kill CONFIG_VECTORS_BASE
  ARM: 8709/1: NOMMU: Disallow MPU for XIP
  ARM: 8708/1: NOMMU: Rework MPU to be mostly done in C
  ARM: 8707/1: NOMMU: Update MPU accessors to use cp15 helpers
  ARM: 8706/1: NOMMU: Move out MPU setup in separate module
  ARM: 8702/1: head-common.S: Clear lr before jumping to start_kernel()
  ARM: 8705/1: early_printk: use printascii() rather than printch()
  ARM: 8703/1: debug.S: move hexbuf to a writable section
  ARM: add additional table to compressed kernel
  ARM: decompressor: fix BSS size calculation
  pcmcia: sa1111: remove special sa1111 mmio accessors
  pcmcia: sa1111: use sa1111_get_irq() to obtain IRQ resources
  ARM: better diagnostics with missing/corrupt dtb
  ARM: 8699/1: dma-mapping: Remove init_dma_coherent_pool_size()
  ARM: 8698/1: dma-mapping: Mark atomic_pool as __ro_after_init
  ..
2017-11-16 12:50:35 -08:00
Vladimir Murzin 6022f80da0 ARM: 8695/1: entry: Remove dead code in sys_mmap2
We support page size of 4K only, remove dead code.

Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2017-09-28 11:13:03 +01:00
Thomas Garnier e33f8d3267 arm/syscalls: Optimize address limit check
Disable the generic address limit check in favor of an architecture
specific optimized implementation. The generic implementation using
pending work flags did not work well with ARM and alignment faults.

The address limit is checked on each syscall return path to user-mode
path as well as the irq user-mode return function. If the address limit
was changed, a function is called to report data corruption (stopping
the kernel or process based on configuration).

The address limit check has to be done before any pending work because
they can reset the address limit and the process is killed using a
SIGKILL signal. For example the lkdtm address limit check does not work
because the signal to kill the process will reset the user-mode address
limit.

Signed-off-by: Thomas Garnier <thgarnie@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Tested-by: Kees Cook <keescook@chromium.org>
Tested-by: Leonard Crestez <leonard.crestez@nxp.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Pratyush Anand <panand@redhat.com>
Cc: Dave Martin <Dave.Martin@arm.com>
Cc: Will Drewry <wad@chromium.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: David Howells <dhowells@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-api@vger.kernel.org
Cc: Yonghong Song <yhs@fb.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1504798247-48833-4-git-send-email-keescook@chromium.org
2017-09-17 19:45:33 +02:00
Thomas Garnier 2404269bc4 Revert "arm/syscalls: Check address limit on user-mode return"
This reverts commit 73ac5d6a2b.

The work pending loop can call set_fs after addr_limit_user_check
removed the _TIF_FSCHECK flag. This may happen at anytime based on how
ARM handles alignment exceptions. It leads to an infinite loop condition.

After discussion, it has been agreed that the generic approach is not
tailored to the ARM architecture and any fix might not be complete. This
patch will be replaced by an architecture specific implementation. The
work flag approach will be kept for other architectures.

Reported-by: Leonard Crestez <leonard.crestez@nxp.com>
Signed-off-by: Thomas Garnier <thgarnie@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Pratyush Anand <panand@redhat.com>
Cc: Dave Martin <Dave.Martin@arm.com>
Cc: Will Drewry <wad@chromium.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: David Howells <dhowells@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-api@vger.kernel.org
Cc: Yonghong Song <yhs@fb.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1504798247-48833-3-git-send-email-keescook@chromium.org
2017-09-17 19:45:33 +02:00
Linus Torvalds 8fac2f96ab Merge branch 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm
Pull ARM updates from Russell King:
 "Low priority fixes and updates for ARM:

   - add some missing includes

   - efficiency improvements in system call entry code when tracing is
     enabled

   - ensure ARMv6+ is always built as EABI

   - export save_stack_trace_tsk()

   - fix fatal signal handling during mm fault

   - build translation table base address register from scratch

   - appropriately align the .data section to a word boundary where we
     rely on that data being word aligned"

* 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: 8691/1: Export save_stack_trace_tsk()
  ARM: 8692/1: mm: abort uaccess retries upon fatal signal
  ARM: 8690/1: lpae: build TTB control register value from scratch in v7_ttb_setup
  ARM: align .data section
  ARM: always enable AEABI for ARMv6+
  ARM: avoid saving and restoring registers unnecessarily
  ARM: move PC value into r9
  ARM: obtain thread info structure later
  ARM: use aliases for registers in entry-common
  ARM: 8689/1: scu: add missing errno include
  ARM: 8688/1: pm: add missing types include
2017-09-12 06:10:44 -07:00
Russell King dca778c5bb ARM: avoid saving and restoring registers unnecessarily
Avoid repeatedly saving and restoring registers around the calls to
trace_hardirqs_on() and context_tracking_user_exit().  With the
previous changes, we no longer need to preserve "lr" across these
calls, and if we re-load r0-r3 later, we can avoid preserving these
regsiters too.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2017-08-02 14:15:04 +01:00
Russell King fcea45236d ARM: move PC value into r9
Move the saved PC value into r9, thereby moving it into a caller-saved
register for functions that we may call during the entry to a syscall.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2017-08-02 14:15:04 +01:00
Russell King da594e3fff ARM: obtain thread info structure later
Obtain the thread info structure later in the syscall processing, so
that we free up a register for earlier code.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2017-08-02 14:15:04 +01:00
Russell King 309ee04257 ARM: use aliases for registers in entry-common
Use aliases for the saved (and preserved) PSR and PC values so that we
can control which registers are used.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2017-08-02 14:15:04 +01:00
Thomas Garnier 73ac5d6a2b arm/syscalls: Check address limit on user-mode return
Ensure the address limit is a user-mode segment before returning to
user-mode. Otherwise a process can corrupt kernel-mode memory and
elevate privileges [1].

The set_fs function sets the TIF_SETFS flag to force a slow path on
return. In the slow path, the address limit is checked to be USER_DS if
needed.

The TIF_SETFS flag is added to _TIF_WORK_MASK shifting _TIF_SYSCALL_WORK
for arm instruction immediate support. The global work mask is too big
to used on a single instruction so adapt ret_fast_syscall.

[1] https://bugs.chromium.org/p/project-zero/issues/detail?id=990

Signed-off-by: Thomas Garnier <thgarnie@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: kernel-hardening@lists.openwall.com
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Pratyush Anand <panand@redhat.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Will Drewry <wad@chromium.org>
Cc: linux-api@vger.kernel.org
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: http://lkml.kernel.org/r/20170615011203.144108-2-thgarnie@google.com
2017-07-08 14:05:33 +02:00
Russell King 96a8fae0fe ARM: convert to generated system call tables
Convert ARM to use a similar mechanism to x86 to generate the unistd.h
system call numbers and the various kernel system call tables.  This
means that rather than having to edit three places (asm/unistd.h for
the total number of system calls, uapi/asm/unistd.h for the system call
numbers, and arch/arm/kernel/calls.S for the call table) we have only
one place to edit, making the process much more simple.

The scripts have knowledge of the table padding requirements, so there's
no need to worry about __NR_syscalls not fitting within the immediate
constant field of ALU instructions anymore.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2016-10-18 21:34:06 +01:00
Russell King 5745eef6b8 ARM: rename S_FRAME_SIZE to PT_REGS_SIZE
S_FRAME_SIZE is no longer the size of the kernel stack frame, so this
name is misleading.  It is the size of the kernel pt_regs structure.
Name it so.

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2016-06-22 19:54:28 +01:00
Russell King 40d3f02851 Merge branches 'cleanup', 'fixes', 'misc', 'omap-barrier' and 'uaccess' into for-linus 2015-09-03 15:28:37 +01:00
Russell King 2190fed67b ARM: entry: provide uaccess assembly macro hooks
Provide hooks into the kernel entry and exit paths to permit control
of userspace visibility to the kernel.  The intended use is:

- on entry to kernel from user, uaccess_disable will be called to
  disable userspace visibility
- on exit from kernel to user, uaccess_enable will be called to
  enable userspace visibility
- on entry from a kernel exception, uaccess_save_and_disable will be
  called to save the current userspace visibility setting, and disable
  access
- on exit from a kernel exception, uaccess_restore will be called to
  restore the userspace visibility as it was before the exception
  occurred.

These hooks allows us to keep userspace visibility disabled for the
vast majority of the kernel, except for localised regions where we
want to explicitly access userspace.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-08-26 20:27:02 +01:00
Russell King e0aa3a6657 ARM: entry: ensure that IRQs are enabled when calling syscall_trace_exit()
The audit code looks like it's been written to cope with being called
with IRQs enabled.  However, it's unclear whether IRQs should be
enabled or disabled when calling the syscall tracing infrastructure.

Right now, sometimes we call this with IRQs enabled, and other times
with IRQs disabled.  Opt for IRQs being enabled for consistency.

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-08-25 10:32:50 +01:00
Russell King 3302caddf1 ARM: entry: efficiency cleanups
Make the "fast" syscall return path fast again.  The addition of IRQ
tracing and context tracking has made this path grossly inefficient.
We can do much better if these options are enabled if we save the
syscall return code on the stack - we then don't need to save a bunch
of registers around every single callout to C code.

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-08-25 10:32:48 +01:00
Drew Richardson e83dd37700 ARM: 8409/1: Mark ret_fast_syscall as a function
ret_fast_syscall runs when user space makes a syscall. However it
needs to be marked as such so the ELF information is correct. Before
it was:

   101: 8000f300     0 NOTYPE  LOCAL  DEFAULT    2 ret_fast_syscall

But with this change it correctly shows as:

   101: 8000f300    96 FUNC    LOCAL  DEFAULT    2 ret_fast_syscall

I see this function when using perf to unwind call stacks from kernel
space to user space. Without this change I would need to add some
special case logic when using the vmlinux ELF information.

Signed-off-by: Drew Richardson <drew.richardson@arm.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-08-07 19:57:02 +01:00
Russell King 05c9ca8843 Merge branch 'bsym' into for-next
Conflicts:
	arch/arm/kernel/head.S
2015-06-12 21:18:38 +01:00
Russell King 1b97937246 ARM: fix missing syscall trace exit
Josh Stone reports:

  I've discovered a case where both arm and arm64 will miss a ptrace
  syscall-exit that they should report.  If the syscall is entered
  without TIF_SYSCALL_TRACE set, then it goes on the fast path.  It's
  then possible to have TIF_SYSCALL_TRACE added in the middle of the
  syscall, but ret_fast_syscall doesn't check this flag again.

Fix this by always checking for a syscall trace in the fast exit path.

Reported-by: Josh Stone <jistone@redhat.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-05-15 11:06:35 +01:00
Russell King 14327c6628 ARM: replace BSYM() with badr assembly macro
BSYM() was invented to allow us to work around a problem with the
assembler, where local symbols resolved by the assembler for the 'adr'
instruction did not take account of their ISA.

Since we don't want BSYM() used elsewhere, replace BSYM() with a new
macro 'badr', which is like the 'adr' pseudo-op, but with the BSYM()
mechanics integrated into it.  This ensures that the BSYM()-ification
is only used in conjunction with 'adr'.

Acked-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-05-08 17:33:50 +01:00
Russell King 82112379b7 ARM: move ftrace assembly code to separate file
The ftrace assembly code doesn't need to live in entry-common.S and
be surrounded with #ifdef CONFIG_FUNCTION_TRACER.  Instead, move it
to its own file and conditionally assemble it.

Tested-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2014-11-21 15:25:01 +00:00
Russell King 195b58add4 ARM: Avoid writing to control register on every exception
If we are not changing the control register value, avoid writing to it.
Writes to the control register can be very expensive, taking around a
hundred cycles or so.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2014-09-26 14:39:54 +01:00
Russell King 6ebbf2ce43 ARM: convert all "mov.* pc, reg" to "bx reg" for ARMv6+
ARMv6 and greater introduced a new instruction ("bx") which can be used
to return from function calls.  Recent CPUs perform better when the
"bx lr" instruction is used rather than the "mov pc, lr" instruction,
and this sequence is strongly recommended to be used by the ARM
architecture manual (section A.4.1.1).

We provide a new macro "ret" with all its variants for the condition
code which will resolve to the appropriate instruction.

Rather than doing this piecemeal, and miss some instances, change all
the "mov pc" instances to use the new macro, with the exception of
the "movs" instruction and the kprobes code.  This allows us to detect
the "mov pc, lr" case and fix it up - and also gives us the possibility
of deploying this for other registers depending on the CPU selection.

Reported-by: Will Deacon <will.deacon@arm.com>
Tested-by: Stephen Warren <swarren@nvidia.com> # Tegra Jetson TK1
Tested-by: Robert Jarzmik <robert.jarzmik@free.fr> # mioa701_bootresume.S
Tested-by: Andrew Lunn <andrew@lunn.ch> # Kirkwood
Tested-by: Shawn Guo <shawn.guo@freescale.com>
Tested-by: Tony Lindgren <tony@atomide.com> # OMAPs
Tested-by: Gregory CLEMENT <gregory.clement@free-electrons.com> # Armada XP, 375, 385
Acked-by: Sekhar Nori <nsekhar@ti.com> # DaVinci
Acked-by: Christoffer Dall <christoffer.dall@linaro.org> # kvm/hyp
Acked-by: Haojian Zhuang <haojian.zhuang@gmail.com> # PXA3xx
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> # Xen
Tested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> # ARMv7M
Tested-by: Simon Horman <horms+renesas@verge.net.au> # Shmobile
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2014-07-18 12:29:04 +01:00
Russell King 8229c54fa1 ARM: consolidate last remaining open-coded alignment trap enable
We can use the alignment_trap assembly macro here too.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2014-06-02 09:20:20 +01:00
Ben Dooks 457c2403c5 ARM: asm: Add ARM_BE8() assembly helper
Add ARM_BE8() helper to wrap any code conditional on being
compile when CONFIG_ARM_ENDIAN_BE8 is selected and convert
existing places where this is to use it.

Acked-by: Nicolas Pitre <nico@linaro.org>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
2013-10-19 20:46:33 +01:00
Will Deacon d95bc2501d ARM: 7839/1: entry: fix tracing of ARM-private syscalls
Commit 377747c406 ("ARM: entry: allow ARM-private syscalls to be
restarted") reworked the low-level syscall dispatcher to allow
restarting of ARM-private syscalls. Unfortunately, this relocated the
label used to dispatch a private syscall from the trace path, so that
the invocation would be bypassed altogether!

This causes applications to fail under strace as soon as they rely on
a private syscall (e.g. set_tls):

  set_tls(0xb6fad4c0, 0xb6fadb98, 0xb6fb1050, 0xb6fad4c0, 0xb6fb1050)
      = -1 ENOSYS (Function not implemented)

This patch fixes the label so that we correctly dispatch private
syscalls from the trace path.

Reported-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Tested-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-09-21 20:41:25 +01:00
Will Deacon 377747c406 ARM: entry: allow ARM-private syscalls to be restarted
System calls will only be restarted after signal handling if they (a)
return an error code indicating that a restart is required and (b) have
`why' set to a non-zero value, to indicate that the signal interrupted
them.

This patch leaves `why' set to a non-zero value for ARM-private syscalls
, and only zeroes it for syscalls that are not implemented.

Signed-off-by: Will Deacon <will.deacon@arm.com>
2013-07-22 10:49:00 +01:00
Russell King 3c0c01ab74 Merge branch 'devel-stable' into for-next
Conflicts:
	arch/arm/Makefile
	arch/arm/include/asm/glue-proc.h
2013-06-29 11:44:43 +01:00
Will Deacon 1aa2b3b7a6 ARM: 7748/1: oabi: handle faults when loading swi instruction from userspace
Running an OABI_COMPAT kernel on an SMP platform can lead to fun and
games with page aging.

If one CPU issues a swi instruction immediately before another CPU
decides to mkold the page containing the swi instruction, then we will
fault attempting to load the instruction during the vector_swi handler
in order to retrieve its immediate field. Since this fault is not
currently dealt with by our exception tables, this results in a panic:

  Unable to handle kernel paging request at virtual address 4020841c
  pgd = c490c000
  [4020841c] *pgd=84451831, *pte=bf05859d, *ppte=00000000
  Internal error: Oops: 17 [#1] PREEMPT SMP ARM
  Modules linked in: hid_sony(O)
  CPU: 1    Tainted: G        W  O  (3.4.0-perf-gf496dca-01162-gcbcc62b #1)
  PC is at vector_swi+0x28/0x88
  LR is at 0x40208420

This patch wraps all of the swi instruction loads with the USER macro
and provides a shared exception table entry which simply rewinds the
saved user PC and returns from the system call (without setting tbl, so
there's no worries with tracing or syscall restarting). Returning to
userspace will re-enter the page fault handler, from where we will
probably send SIGSEGV to the current task.

Reported-by: Wang, Yalin <yalin.wang@sonymobile.com>
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-06-17 09:27:02 +01:00
Russell King f150abe101 Merge branch 'for-next' of git://git.pengutronix.de/git/ukl/linux into devel-stable
Pull ARM-v7M support from Uwe Kleine-König:
"All but the last patch were in next since next-20130418 without issues.
The last patch fixes a problem in combination with

  8164f7a (ARM: 7680/1: Detect support for SDIV/UDIV from ISAR0 register)

which triggers a WARN_ON without an implemented read_cpuid_ext.

The branch merges fine into v3.10-rc1 and I'd be happy if you pulled it
for 3.11-rc1. The only missing piece to be able to run a Cortex-M3 is
the irqchip driver that will go in via Thomas Gleixner and platform
specific stuff."
2013-05-22 10:52:24 +01:00
Russell King 946342d03e Merge branches 'devel-stable', 'entry', 'fixes', 'mach-types', 'misc' and 'smp-hotplug' into for-linus 2013-05-02 21:30:36 +01:00
Uwe Kleine-König 19c4d593f0 ARM: ARMv7-M: Add support for exception handling
This patch implements the exception handling for the ARMv7-M
architecture (pretty different from the A or R profiles).

It bases on work done earlier by Catalin for 2.6.33 but was nearly
completely rewritten to use a pt_regs layout compatible to the A
profile.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Jonathan Austin <jonathan.austin@arm.com>
Tested-by: Jonathan Austin <jonathan.austin@arm.com>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
2013-04-17 21:44:46 +02:00
Kevin Hilman b008848020 ARM: 7688/1: add support for context tracking subsystem
commit 91d1aa43 (context_tracking: New context tracking susbsystem)
generalized parts of the RCU userspace extended quiescent state into
the context tracking subsystem.  Context tracking is then used
to implement adaptive tickless (a.k.a extended nohz)

To support the new context tracking subsystem on ARM, the user/kernel
boundary transtions need to be instrumented.

For exceptions and IRQs in usermode, the existing usr_entry macro is
used to instrument the user->kernel transition.  For the return to
usermode path, the ret_to_user* path is instrumented.  Using the
usr_entry macro, this covers interrupts in userspace, data abort and
prefetch abort exceptions in userspace as well as undefined exceptions
in userspace (which is where FP emulation and VFP are handled.)

For syscalls, the slow return path is covered by instrumenting the
ret_to_user path.  In addition, the syscall entry point is
instrumented which covers the user->kernel transition for both fast
and slow syscalls, and an additional instrumentation point is added
for the fast syscall return path (ret_fast_syscall).

Cc: Mats Liljegren <mats.liljegren@enea.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-04-03 17:00:01 +01:00
Russell King 651e94995e ARM: entry-common: get rid of unnecessary ifdefs
The contents of the asm_trace_hardirqs_on is already conditional on
CONFIG_TRACE_IRQFLAGS.  There's little point also making the use
of the macro conditional as well.  Get rid of these ifdefs to make
the code easier to read.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-04-03 16:50:12 +01:00
Rabin Vincent b21e023ba4 ARM: 7689/1: add unwind annotations to ftrace asm
Add unwind annotations to the ftrace assembly code so that the function
tracer's stacktracing options (func_stack_trace, etc.) work when
CONFIG_ARM_UNWIND is enabled.

Signed-off-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-04-03 16:45:50 +01:00
Al Viro ec93ac8663 arm: switch to generic sigaltstack
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-03 18:15:46 -05:00
Linus Torvalds 9977d9b379 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal
Pull big execve/kernel_thread/fork unification series from Al Viro:
 "All architectures are converted to new model.  Quite a bit of that
  stuff is actually shared with architecture trees; in such cases it's
  literally shared branch pulled by both, not a cherry-pick.

  A lot of ugliness and black magic is gone (-3KLoC total in this one):

   - kernel_thread()/kernel_execve()/sys_execve() redesign.

     We don't do syscalls from kernel anymore for either kernel_thread()
     or kernel_execve():

     kernel_thread() is essentially clone(2) with callback run before we
     return to userland, the callbacks either never return or do
     successful do_execve() before returning.

     kernel_execve() is a wrapper for do_execve() - it doesn't need to
     do transition to user mode anymore.

     As a result kernel_thread() and kernel_execve() are
     arch-independent now - they live in kernel/fork.c and fs/exec.c
     resp.  sys_execve() is also in fs/exec.c and it's completely
     architecture-independent.

   - daemonize() is gone, along with its parts in fs/*.c

   - struct pt_regs * is no longer passed to do_fork/copy_process/
     copy_thread/do_execve/search_binary_handler/->load_binary/do_coredump.

   - sys_fork()/sys_vfork()/sys_clone() unified; some architectures
     still need wrappers (ones with callee-saved registers not saved in
     pt_regs on syscall entry), but the main part of those suckers is in
     kernel/fork.c now."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal: (113 commits)
  do_coredump(): get rid of pt_regs argument
  print_fatal_signal(): get rid of pt_regs argument
  ptrace_signal(): get rid of unused arguments
  get rid of ptrace_signal_deliver() arguments
  new helper: signal_pt_regs()
  unify default ptrace_signal_deliver
  flagday: kill pt_regs argument of do_fork()
  death to idle_regs()
  don't pass regs to copy_process()
  flagday: don't pass regs to copy_thread()
  bfin: switch to generic vfork, get rid of pointless wrappers
  xtensa: switch to generic clone()
  openrisc: switch to use of generic fork and clone
  unicore32: switch to generic clone(2)
  score: switch to generic fork/vfork/clone
  c6x: sanitize copy_thread(), get rid of clone(2) wrapper, switch to generic clone()
  take sys_fork/sys_vfork/sys_clone prototypes to linux/syscalls.h
  mn10300: switch to generic fork/vfork/clone
  h8300: switch to generic fork/vfork/clone
  tile: switch to generic clone()
  ...

Conflicts:
	arch/microblaze/include/asm/Kbuild
2012-12-12 12:22:13 -08:00
Russell King 0b99cb7310 Merge branches 'cache-l2x0', 'fixes', 'hdrs', 'misc', 'mmci', 'vic' and 'warnings' into for-next 2012-12-11 00:20:18 +00:00
Will Deacon b10bca0bc6 ARM: 7595/1: syscall: rework ordering in syscall_trace_exit
syscall_trace_exit is currently doing things back-to-front; invoking
the audit hook *after* signalling the debugger, which presents an
opportunity for the registers to be re-written by userspace in order to
bypass auditing constaints.

This patch fixes the ordering by moving the audit code first and the
tracehook code last. On the face of it, it looks like
current_thread_info()->syscall may be incorrect for the sys_exit
tracepoint, but that's actually not an issue because it will have been
set during syscall entry and cannot have changed since then.

Reported-by: Andrew Gabbasov <Andrew_Gabbasov@mentor.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-12-11 00:18:26 +00:00
Al Viro 38a61b6b4a arm: switch to generic fork/vfork/clone
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-11-28 22:13:54 -05:00
Kees Cook ad75b51459 ARM: 7579/1: arch/allow a scno of -1 to not cause a SIGILL
On tracehook-friendly platforms, a system call number of -1 falls
through without running much code or taking much action.

ARM is different. This adds a short-circuit check in the trace path to
avoid any additional work, as suggested by Russell King, to make sure
that ARM behaves the same way as other platforms.

Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Will Drewry <wad@chromium.org>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-11-19 14:14:18 +00:00
Kees Cook 9b790d71d5 ARM: 7578/1: arch/move secure_computing into trace
There is very little difference in the TIF_SECCOMP and TIF_SYSCALL_WORK
path in entry-common.S, so merge TIF_SECCOMP into TIF_SYSCALL_WORK and
move seccomp into the syscall_trace_enter() handler.

Expanded some of the tracehook logic into the callers to make this code
more readable. Since tracehook needs to do register changing, this portion
is best left in its own function instead of copy/pasting into the callers.

Additionally, the return value for secure_computing() is now checked
and a -1 value will result in the system call being skipped.

Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Will Drewry <wad@chromium.org>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-11-19 14:14:17 +00:00
Russell King 68687c842c ARM: fix oops on initial entry to userspace with Thumb2 kernels
Daniel Mack reports an oops at boot with the latest kernels:

  Internal error: Oops - undefined instruction: 0 [#1] SMP THUMB2
  Modules linked in:
  CPU: 0    Not tainted  (3.6.0-11057-g584df1d #145)
  PC is at cpsw_probe+0x45a/0x9ac
  LR is at trace_hardirqs_on_caller+0x8f/0xfc
  pc : [<c03493de>]    lr : [<c005e81f>]    psr: 60000113
  sp : cf055fb0  ip : 00000000  fp : 00000000
  r10: 00000000  r9 : 00000000  r8 : 00000000
  r7 : 00000000  r6 : 00000000  r5 : c0344555  r4 : 00000000
  r3 : cf057a40  r2 : 00000000  r1 : 00000001  r0 : 00000000
  Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM Segment user
  Control: 50c5387d  Table: 8f3f4019  DAC: 00000015
  Process init (pid: 1, stack limit = 0xcf054240)
  Stack: (0xcf055fb0 to 0xcf056000)
  5fa0:                                     00000001 00000000 00000000 00000000
  5fc0: cf055fb0 c000d1a8 00000000 00000000 00000000 00000000 00000000 00000000
  5fe0: 00000000 be9b3f10 00000000 b6f6add0 00000010 00000000 aaaabfaf a8babbaa

The analysis of this is as follows.  In init/main.c, we issue:

	kernel_thread(kernel_init, NULL, CLONE_FS | CLONE_SIGHAND);

This creates a new thread, which falls through to the ret_from_fork
assembly, with r4 set NULL and r5 set to kernel_init.  You can see
this in your oops dump register set - r5 is 0xc0344555, which is the
address of kernel_init plus 1 which marks the function as Thumb code.

Now, let's look at this code a little closer - this is what the
disassembly looks like:

  c000d180 <ret_from_fork>:
  c000d180:       f03a fe08       bl      c0047d94 <schedule_tail>
  c000d184:       2d00            cmp     r5, #0
  c000d186:       bf1e            ittt    ne
  c000d188:       4620            movne   r0, r4
  c000d18a:       46fe            movne   lr, pc <-- XXXXXXX
  c000d18c:       46af            movne   pc, r5
  c000d18e:       46e9            mov     r9, sp
  c000d190:       ea4f 3959       mov.w   r9, r9, lsr #13
  c000d194:       ea4f 3949       mov.w   r9, r9, lsl #13
  c000d198:       e7c8            b.n     c000d12c <ret_to_user>
  c000d19a:       bf00            nop
  c000d19c:       f3af 8000       nop.w

This code was introduced in 9fff2fa0db (arm: switch to saner
kernel_execve() semantics).  I have marked one instruction, and it's
the significant one - I'll come back to that later.

Eventually, having had a successful call to kernel_execve(), kernel_init()
returns zero.

In returning, it uses the value in 'lr' which was set by the instruction
I marked above.  Unfortunately, this causes lr to contain 0xc000d18e -
an even address.  This switches the ISA to ARM on return but with a non
word aligned PC value.

So, what do we end up executing?  Well, not the instructions above - yes
the opcodes, but they don't mean the same thing in ARM mode.  In ARM mode,
it looks like this instead:

  c000d18c:       46e946af        strbtmi r4, [r9], pc, lsr #13
  c000d190:       3959ea4f        ldmdbcc r9, {r0, r1, r2, r3, r6, r9, fp, sp, lr, pc}^
  c000d194:       3949ea4f        stmdbcc r9, {r0, r1, r2, r3, r6, r9, fp, sp, lr, pc}^
  c000d198:       bf00e7c8        svclt   0x0000e7c8
  c000d19c:       8000f3af        andhi   pc, r0, pc, lsr #7
  c000d1a0:       e88db092        stm     sp, {r1, r4, r7, ip, sp, pc}
  c000d1a4:       46e81fff                        ; <UNDEFINED> instruction: 0x46e81fff
  c000d1a8:       8a00f3ef        bhi     0xc004a16c
  c000d1ac:       0a0cf08a        beq     0xc03493dc

I have included more above, because it's relevant.  The PSR flags which
we can see in the oops dump are nZCv, so Z and C are set.

All the above ARM instructions are not executed, except for two.
c000d1a0, which has no writeback, and writes below the current stack
pointer (and that data is lost when we take the next exception.) The
other instruction which is executed is c000d1ac, which takes us to...
0xc03493dc.  However, remember that bit 1 of the PC got set.  So that
makes the PC value 0xc03493de.

And that value is the value we find in the oops dump for PC.  What is
the instruction here when interpreted in ARM mode?

       0:       f71e150c                ; <UNDEFINED> instruction: 0xf71e150c

and there we have our undefined instruction (remember that the 'never'
condition code, 0xf, has been deprecated and is now always executed as
it is now being used for additional instructions.)

This path also nicely explains the state of the stack we see in the oops
dump too.

The above is a consistent and sane story for how we got to the oops
dump, which all stems from the instruction at 0xc000d18a being wrong.

Reported-by: Daniel Mack <zonque@gmail.com>
Tested-by: Daniel Mack <zonque@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-15 07:57:34 -07:00
Linus Torvalds 4e21fc138b Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal
Pull third pile of kernel_execve() patches from Al Viro:
 "The last bits of infrastructure for kernel_thread() et.al., with
  alpha/arm/x86 use of those.  Plus sanitizing the asm glue and
  do_notify_resume() on alpha, fixing the "disabled irq while running
  task_work stuff" breakage there.

  At that point the rest of kernel_thread/kernel_execve/sys_execve work
  can be done independently for different architectures.  The only
  pending bits that do depend on having all architectures converted are
  restrictred to fs/* and kernel/* - that'll obviously have to wait for
  the next cycle.

  I thought we'd have to wait for all of them done before we start
  eliminating the longjump-style insanity in kernel_execve(), but it
  turned out there's a very simple way to do that without flagday-style
  changes."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
  alpha: switch to saner kernel_execve() semantics
  arm: switch to saner kernel_execve() semantics
  x86, um: convert to saner kernel_execve() semantics
  infrastructure for saner ret_from_kernel_thread semantics
  make sure that kernel_thread() callbacks call do_exit() themselves
  make sure that we always have a return path from kernel_execve()
  ppc: eeh_event should just use kthread_run()
  don't bother with kernel_thread/kernel_execve for launching linuxrc
  alpha: get rid of switch_stack argument of do_work_pending()
  alpha: don't bother passing switch_stack separately from regs
  alpha: take SIGPENDING/NOTIFY_RESUME loop into signal.c
  alpha: simplify TIF_NEED_RESCHED handling
2012-10-13 10:05:52 +09:00