alistair23-linux/fs/proc
Jérôme Glisse 5042db43cc mm/ZONE_DEVICE: new type of ZONE_DEVICE for unaddressable memory
HMM (heterogeneous memory management) need struct page to support
migration from system main memory to device memory.  Reasons for HMM and
migration to device memory is explained with HMM core patch.

This patch deals with device memory that is un-addressable memory (ie CPU
can not access it).  Hence we do not want those struct page to be manage
like regular memory.  That is why we extend ZONE_DEVICE to support
different types of memory.

A persistent memory type is define for existing user of ZONE_DEVICE and a
new device un-addressable type is added for the un-addressable memory
type.  There is a clear separation between what is expected from each
memory type and existing user of ZONE_DEVICE are un-affected by new
requirement and new use of the un-addressable type.  All specific code
path are protect with test against the memory type.

Because memory is un-addressable we use a new special swap type for when a
page is migrated to device memory (this reduces the number of maximum swap
file).

The main two additions beside memory type to ZONE_DEVICE is two callbacks.
First one, page_free() is call whenever page refcount reach 1 (which
means the page is free as ZONE_DEVICE page never reach a refcount of 0).
This allow device driver to manage its memory and associated struct page.

The second callback page_fault() happens when there is a CPU access to an
address that is back by a device page (which are un-addressable by the
CPU).  This callback is responsible to migrate the page back to system
main memory.  Device driver can not block migration back to system memory,
HMM make sure that such page can not be pin into device memory.

If device is in some error condition and can not migrate memory back then
a CPU page fault to device memory should end with SIGBUS.

[arnd@arndb.de: fix warning]
  Link: http://lkml.kernel.org/r/20170823133213.712917-1-arnd@arndb.de
Link: http://lkml.kernel.org/r/20170817000548.32038-8-jglisse@redhat.com
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Nellans <dnellans@nvidia.com>
Cc: Evgeny Baskakov <ebaskakov@nvidia.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Mark Hairgrove <mhairgrove@nvidia.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Sherry Cheung <SCheung@nvidia.com>
Cc: Subhash Gutti <sgutti@nvidia.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Bob Liu <liubo95@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-08 18:26:46 -07:00
..
array.c sched/headers: Prepare to move cputime functionality from <linux/sched.h> into <linux/sched/cputime.h> 2017-03-02 08:42:39 +01:00
base.c mm: add /proc/pid/smaps_rollup 2017-09-06 17:27:30 -07:00
cmdline.c
consoles.c
cpuinfo.c
devices.c block: order /proc/devices by major number 2017-07-17 15:42:20 +02:00
fd.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/signal.h> 2017-03-02 08:42:29 +01:00
fd.h proc: unsigned file descriptors 2016-09-27 18:47:38 -04:00
generic.c fs/proc/generic.c: switch to ida_simple_get/remove 2017-07-10 16:32:34 -07:00
inode.c fs/proc/inode.c: remove cast from memory allocation 2017-05-08 17:15:10 -07:00
internal.h mm: add /proc/pid/smaps_rollup 2017-09-06 17:27:30 -07:00
interrupts.c
Kconfig
kcore.c fs/proc: kcore: use kcore_list type to check for vmalloc/module address 2017-06-20 12:42:57 +01:00
kmsg.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
loadavg.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/stat.h> 2017-03-02 08:42:34 +01:00
Makefile fs/proc: Add compiler check for -Wno-override-init to support gcc < 4.2 2016-08-03 12:45:23 -04:00
meminfo.c mm: rename global_page_state to global_zone_page_state 2017-09-06 17:27:29 -07:00
namespaces.c pidns: expose task pid_ns_for_children to userspace 2017-05-08 17:15:12 -07:00
nommu.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
page.c mm: fix KPF_SWAPCACHE in /proc/kpageflags 2017-02-07 12:08:32 -08:00
proc_net.c Merge branch 'rebased-statx' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-03-03 11:38:56 -08:00
proc_sysctl.c Merge branch 'akpm' (patches from Andrew) 2017-07-13 12:38:49 -07:00
proc_tty.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
root.c Merge branch 'rebased-statx' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-03-03 11:38:56 -08:00
self.c vfs: remove ".readlink = generic_readlink" assignments 2016-12-09 16:45:04 +01:00
softirqs.c
stat.c sched/headers: Prepare to move cputime functionality from <linux/sched.h> into <linux/sched/cputime.h> 2017-03-02 08:42:39 +01:00
task_mmu.c mm/ZONE_DEVICE: new type of ZONE_DEVICE for unaddressable memory 2017-09-08 18:26:46 -07:00
task_nommu.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/mm.h> 2017-03-02 08:42:28 +01:00
thread_self.c vfs: remove ".readlink = generic_readlink" assignments 2016-12-09 16:45:04 +01:00
uptime.c sched/cputime: Convert kcpustat to nsecs 2017-02-01 09:13:47 +01:00
version.c
vmcore.c userfaultfd: non-cooperative: add event for memory unmaps 2017-02-24 17:46:55 -08:00