alistair23-linux/init
Borislav Petkov a9a3ed1eff x86: Fix early boot crash on gcc-10, third try
... or the odyssey of trying to disable the stack protector for the
function which generates the stack canary value.

The whole story started with Sergei reporting a boot crash with a kernel
built with gcc-10:

  Kernel panic — not syncing: stack-protector: Kernel stack is corrupted in: start_secondary
  CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.6.0-rc5—00235—gfffb08b37df9 #139
  Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./H77M—D3H, BIOS F12 11/14/2013
  Call Trace:
    dump_stack
    panic
    ? start_secondary
    __stack_chk_fail
    start_secondary
    secondary_startup_64
  -—-[ end Kernel panic — not syncing: stack—protector: Kernel stack is corrupted in: start_secondary

This happens because gcc-10 tail-call optimizes the last function call
in start_secondary() - cpu_startup_entry() - and thus emits a stack
canary check which fails because the canary value changes after the
boot_init_stack_canary() call.

To fix that, the initial attempt was to mark the one function which
generates the stack canary with:

  __attribute__((optimize("-fno-stack-protector"))) ... start_secondary(void *unused)

however, using the optimize attribute doesn't work cumulatively
as the attribute does not add to but rather replaces previously
supplied optimization options - roughly all -fxxx options.

The key one among them being -fno-omit-frame-pointer and thus leading to
not present frame pointer - frame pointer which the kernel needs.

The next attempt to prevent compilers from tail-call optimizing
the last function call cpu_startup_entry(), shy of carving out
start_secondary() into a separate compilation unit and building it with
-fno-stack-protector, was to add an empty asm("").

This current solution was short and sweet, and reportedly, is supported
by both compilers but we didn't get very far this time: future (LTO?)
optimization passes could potentially eliminate this, which leads us
to the third attempt: having an actual memory barrier there which the
compiler cannot ignore or move around etc.

That should hold for a long time, but hey we said that about the other
two solutions too so...

Reported-by: Sergei Trofimovich <slyfox@gentoo.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Kalle Valo <kvalo@codeaurora.org>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20200314164451.346497-1-slyfox@gentoo.org
2020-05-15 11:48:01 +02:00
..
calibrate.c
do_mounts.c block: remove __bdevname 2020-03-24 07:57:07 -06:00
do_mounts.h
do_mounts_initrd.c init: unify opening /dev/console as stdin/stdout/stderr 2019-12-12 18:58:24 +01:00
do_mounts_md.c
do_mounts_rd.c
init_task.c exec: Add exec_update_mutex to replace cred_guard_mutex 2020-03-25 10:03:36 -05:00
initramfs.c gcc-10: mark more functions __init to avoid section mismatch warnings 2020-05-09 17:50:03 -07:00
Kconfig Stop the ad-hoc games with -Wno-maybe-initialized 2020-05-09 13:57:10 -07:00
main.c x86: Fix early boot crash on gcc-10, third try 2020-05-15 11:48:01 +02:00
Makefile kbuild: do not pass $(KBUILD_CFLAGS) to scripts/mkcompile_h 2020-04-09 00:13:45 +09:00
noinitramfs.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 167 2019-05-30 11:26:39 -07:00
version.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00