remarkable-linux/mm
Denys Vlasenko 16e72e9b30 powerpc: do not make the entire heap executable
On 32-bit powerpc the ELF PLT sections of binaries (built with
--bss-plt, or with a toolchain which defaults to it) look like this:

  [17] .sbss             NOBITS          0002aff8 01aff8 000014 00  WA  0   0  4
  [18] .plt              NOBITS          0002b00c 01aff8 000084 00 WAX  0   0  4
  [19] .bss              NOBITS          0002b090 01aff8 0000a4 00  WA  0   0  4

Which results in an ELF load header:

  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x019c70 0x00029c70 0x00029c70 0x01388 0x014c4 RWE 0x10000

This is all correct, the load region containing the PLT is marked as
executable.  Note that the PLT starts at 0002b00c but the file mapping
ends at 0002aff8, so the PLT falls in the 0 fill section described by
the load header, and after a page boundary.

Unfortunately the generic ELF loader ignores the X bit in the load
headers when it creates the 0 filled non-file backed mappings.  It
assumes all of these mappings are RW BSS sections, which is not the case
for PPC.

gcc/ld has an option (--secure-plt) to not do this, this is said to
incur a small performance penalty.

Currently, to support 32-bit binaries with PLT in BSS kernel maps
*entire brk area* with executable rights for all binaries, even
--secure-plt ones.

Stop doing that.

Teach the ELF loader to check the X bit in the relevant load header and
create 0 filled anonymous mappings that are executable if the load
header requests that.

Test program showing the difference in /proc/$PID/maps:

int main() {
	char buf[16*1024];
	char *p = malloc(123); /* make "[heap]" mapping appear */
	int fd = open("/proc/self/maps", O_RDONLY);
	int len = read(fd, buf, sizeof(buf));
	write(1, buf, len);
	printf("%p\n", p);
	return 0;
}

Compiled using: gcc -mbss-plt -m32 -Os test.c -otest

Unpatched ppc64 kernel:
00100000-00120000 r-xp 00000000 00:00 0                                  [vdso]
0fe10000-0ffd0000 r-xp 00000000 fd:00 67898094                           /usr/lib/libc-2.17.so
0ffd0000-0ffe0000 r--p 001b0000 fd:00 67898094                           /usr/lib/libc-2.17.so
0ffe0000-0fff0000 rw-p 001c0000 fd:00 67898094                           /usr/lib/libc-2.17.so
10000000-10010000 r-xp 00000000 fd:00 100674505                          /home/user/test
10010000-10020000 r--p 00000000 fd:00 100674505                          /home/user/test
10020000-10030000 rw-p 00010000 fd:00 100674505                          /home/user/test
10690000-106c0000 rwxp 00000000 00:00 0                                  [heap]
f7f70000-f7fa0000 r-xp 00000000 fd:00 67898089                           /usr/lib/ld-2.17.so
f7fa0000-f7fb0000 r--p 00020000 fd:00 67898089                           /usr/lib/ld-2.17.so
f7fb0000-f7fc0000 rw-p 00030000 fd:00 67898089                           /usr/lib/ld-2.17.so
ffa90000-ffac0000 rw-p 00000000 00:00 0                                  [stack]
0x10690008

Patched ppc64 kernel:
00100000-00120000 r-xp 00000000 00:00 0                                  [vdso]
0fe10000-0ffd0000 r-xp 00000000 fd:00 67898094                           /usr/lib/libc-2.17.so
0ffd0000-0ffe0000 r--p 001b0000 fd:00 67898094                           /usr/lib/libc-2.17.so
0ffe0000-0fff0000 rw-p 001c0000 fd:00 67898094                           /usr/lib/libc-2.17.so
10000000-10010000 r-xp 00000000 fd:00 100674505                          /home/user/test
10010000-10020000 r--p 00000000 fd:00 100674505                          /home/user/test
10020000-10030000 rw-p 00010000 fd:00 100674505                          /home/user/test
10180000-101b0000 rw-p 00000000 00:00 0                                  [heap]
                  ^^^^ this has changed
f7c60000-f7c90000 r-xp 00000000 fd:00 67898089                           /usr/lib/ld-2.17.so
f7c90000-f7ca0000 r--p 00020000 fd:00 67898089                           /usr/lib/ld-2.17.so
f7ca0000-f7cb0000 rw-p 00030000 fd:00 67898089                           /usr/lib/ld-2.17.so
ff860000-ff890000 rw-p 00000000 00:00 0                                  [stack]
0x10180008

The patch was originally posted in 2012 by Jason Gunthorpe
and apparently ignored:

https://lkml.org/lkml/2012/9/30/138

Lightly run-tested.

Link: http://lkml.kernel.org/r/20161215131950.23054-1-dvlasenk@redhat.com
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Tested-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-22 16:41:29 -08:00
..
kasan arm64 updates for 4.11: 2017-02-22 10:46:44 -08:00
backing-dev.c block: fix double-free in the failure path of cgwb_bdi_init() 2017-02-08 13:52:01 -07:00
balloon_compaction.c
bootmem.c mm/bootmem.c: cosmetic improvement of code readability 2017-02-22 16:41:29 -08:00
cleancache.c
cma.c mm/cma: Cleanup highmem check 2017-01-11 13:56:49 +00:00
cma.h
cma_debug.c
compaction.c mm,compaction: serialize waitqueue_active() checks 2017-02-22 16:41:29 -08:00
debug.c mm, debug: print raw struct page data in __dump_page() 2016-12-12 18:55:08 -08:00
debug_page_ref.c
dmapool.c
early_ioremap.c
fadvise.c mm: fadvise: avoid expensive remote LRU cache draining after FADV_DONTNEED 2016-12-20 09:48:46 -08:00
failslab.c
filemap.c mm: fix filemap.c kernel-doc warnings 2017-02-22 16:41:29 -08:00
frame_vector.c mm: replace get_vaddr_frames() write/force parameters with gup_flags 2016-10-19 08:11:24 -07:00
frontswap.c
gup.c userfaultfd: hugetlbfs: gup: support VM_FAULT_RETRY 2017-02-22 16:41:28 -08:00
highmem.c
huge_memory.c mm/huge_memory.c: respect FOLL_FORCE/FOLL_COW for thp 2017-01-24 16:26:14 -08:00
hugetlb.c userfaultfd: hugetlbfs: add UFFDIO_COPY support for shared mappings 2017-02-22 16:41:28 -08:00
hugetlb_cgroup.c
hwpoison-inject.c
init-mm.c mm: Add a user_ns owner to mm_struct and fix ptrace permission checks 2016-11-22 11:49:48 -06:00
internal.h mm, compaction: add vmstats for kcompactd work 2017-02-22 16:41:29 -08:00
interval_tree.c
Kconfig mm: THP page cache support for ppc64 2016-12-12 18:55:08 -08:00
Kconfig.debug PM / Hibernate: allow hibernation with PAGE_POISONING_ZERO 2016-09-13 02:35:27 +02:00
khugepaged.c mm: get rid of __GFP_OTHER_NODE 2017-01-10 18:31:55 -08:00
kmemcheck.c
kmemleak-test.c
kmemleak.c kmemleak: fix reference to Documentation 2016-12-12 18:55:07 -08:00
ksm.c mm,ksm: add __GFP_HIGH to the allocation in alloc_stable_node() 2016-10-07 18:46:29 -07:00
list_lru.c mm/list_lru.c: avoid error-path NULL pointer deref 2016-10-27 18:43:42 -07:00
maccess.c
madvise.c userfaultfd: non-cooperative: avoid MADV_DONTNEED race condition 2017-02-22 16:41:28 -08:00
Makefile Disable the __builtin_return_address() warning globally after all 2016-10-12 10:23:41 -07:00
memblock.c mm/memblock.c: check return value of memblock_reserve() in memblock_virt_alloc_internal() 2017-02-22 16:41:29 -08:00
memcontrol.c slab: use memcg_kmem_cache_wq for slab destruction operations 2017-02-22 16:41:27 -08:00
memory-failure.c mm: Use owner_priv bit for PageSwapCache, valid when PageSwapBacked 2016-12-25 11:54:48 -08:00
memory.c userfaultfd: hugetlbfs: fix __mcopy_atomic_hugetlb retry/error processing 2017-02-22 16:41:28 -08:00
memory_hotplug.c mm/memory_hotplug: set magic number to page->freelist instead of page->lru.next 2017-02-22 16:41:29 -08:00
mempolicy.c mm/mempolicy.c: do not put mempolicy before using its nodemask 2017-01-24 16:26:14 -08:00
mempool.c
memtest.c
migrate.c mm: Use owner_priv bit for PageSwapCache, valid when PageSwapBacked 2016-12-25 11:54:48 -08:00
mincore.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
mlock.c thp: fix corner case of munlock() of PTE-mapped THPs 2016-11-30 16:32:52 -08:00
mm_init.c
mmap.c powerpc: do not make the entire heap executable 2017-02-22 16:41:29 -08:00
mmu_context.c
mmu_notifier.c
mmzone.c mm/mmzone.c: swap likely to unlikely as code logic is different for next_zones_zonelist() 2017-02-22 16:41:29 -08:00
mprotect.c mm: mprotect: use pmd_trans_unstable instead of taking the pmd_lock 2017-02-22 16:41:29 -08:00
mremap.c userfaultfd: non-cooperative: optimize mremap_userfaultfd_complete() 2017-02-22 16:41:28 -08:00
msync.c
nobootmem.c mm: kmemleak: avoid using __va() on addresses that don't have a lowmem mapping 2016-10-11 15:06:33 -07:00
nommu.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
oom_kill.c oom: print nodemask in the oom report 2016-10-07 18:46:29 -07:00
page-writeback.c block: Use pointer to backing_dev_info from request_queue 2017-02-02 08:20:48 -07:00
page_alloc.c mm: page_alloc: skip over regions of invalid pfns where possible 2017-02-22 16:41:29 -08:00
page_counter.c
page_ext.c mm/page_ext: support extra space allocation by page_ext user 2016-10-07 18:46:27 -07:00
page_idle.c
page_io.c writeback: add wbc_to_write_flags() 2016-11-02 10:24:03 -06:00
page_isolation.c mm, page_alloc: avoid page_to_pfn() when merging buddies 2017-02-22 16:41:27 -08:00
page_owner.c mm/page_owner: don't define fields on struct page_ext by hard-coding 2016-10-07 18:46:27 -07:00
page_poison.c
pagewalk.c
percpu-km.c
percpu-vm.c
percpu.c Merge branch 'for-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu 2016-12-13 12:34:47 -08:00
pgtable-generic.c
process_vm_access.c mm: unexport __get_user_pages_unlocked() 2016-12-14 16:04:09 -08:00
quicklist.c
readahead.c mm: don't cap request size based on read-ahead setting 2016-12-12 18:55:08 -08:00
rmap.c mm, rmap: handle anon_vma_prepare() common case inline 2016-12-12 18:55:08 -08:00
shmem.c userfaultfd: shmem: avoid leaking blocks and used blocks in UFFDIO_COPY 2017-02-22 16:41:29 -08:00
slab.c slab: introduce __kmemcg_cache_deactivate() 2017-02-22 16:41:27 -08:00
slab.h slab: remove synchronous synchronize_sched() from memcg cache deactivation path 2017-02-22 16:41:27 -08:00
slab_common.c slab: use memcg_kmem_cache_wq for slab destruction operations 2017-02-22 16:41:27 -08:00
slob.c slab: introduce __kmemcg_cache_deactivate() 2017-02-22 16:41:27 -08:00
slub.c slub: make sysfs directories for memcg sub-caches optional 2017-02-22 16:41:27 -08:00
sparse-vmemmap.c
sparse.c mm/memory_hotplug: set magic number to page->freelist instead of page->lru.next 2017-02-22 16:41:29 -08:00
swap.c mm: add PageWaiters indicating tasks are waiting for a page bit 2016-12-25 11:54:48 -08:00
swap_cgroup.c
swap_state.c mm, swap: use offset of swap entry as key of swap cache 2016-10-07 18:46:28 -07:00
swapfile.c mm: support anonymous stable page 2017-01-10 18:31:55 -08:00
truncate.c mm: Invalidate DAX radix tree entries only if appropriate 2016-12-26 20:29:24 -08:00
usercopy.c mm/usercopy: Switch to using lm_alias 2017-01-11 13:56:50 +00:00
userfaultfd.c userfaultfd: hugetlbfs: add UFFDIO_COPY support for shared mappings 2017-02-22 16:41:28 -08:00
util.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
vmacache.c mm: unrig VMA cache hit ratio 2016-10-07 18:46:27 -07:00
vmalloc.c mm/vmalloc.c: use rb_entry_safe 2017-02-22 16:41:27 -08:00
vmpressure.c
vmscan.c mm, vmscan: add mm_vmscan_inactive_list_is_low tracepoint 2017-02-22 16:41:29 -08:00
vmstat.c mm, compaction: add vmstats for kcompactd work 2017-02-22 16:41:29 -08:00
workingset.c mm: workingset: fix use-after-free in shadow node shrinker 2017-01-07 18:22:40 -08:00
z3fold.c
zbud.c
zpool.c
zsmalloc.c mm: fix some typos in mm/zsmalloc.c 2017-02-22 16:41:29 -08:00
zswap.c zswap: disable changing params if init fails 2017-02-03 14:13:19 -08:00