1
0
Fork 0
alistair23-linux/include
Andrea Arcangeli e067eba587 userfaultfd: document _IOR/_IOW
Patch series "userfaultfd tmpfs/hugetlbfs/non-cooperative", v2

These userfaultfd features are finished and are ready for larger
exposure in -mm and upstream merging.

1) tmpfs non present userfault
2) hugetlbfs non present userfault
3) non cooperative userfault for fork/madvise/mremap

qemu development code is already exercising 2) and container postcopy
live migration needs 3).

1) is not currently used but there's a self test and we know some qemu
user for various reasons uses tmpfs as backing for KVM so it'll need it
too to use postcopy live migration with tmpfs memory.

All review feedback from the previous submit has been handled and the
fixes are included.  There's no outstanding issue AFIK.

Upstream code just did a s/fe/vmf/ conversion in the page faults and
this has been converted as well incrementally.

In addition to the previous submits, this also wakes up stuck userfaults
during UFFDIO_UNREGISTER.  The non cooperative testcase actually
reproduced this problem by getting stuck instead of quitting clean in
some rare case as it could call UFFDIO_UNREGISTER while some userfault
could be still in flight.  The other option would have been to keep
leaving it up to userland to serialize itself and to patch the testcase
instead but the wakeup during unregister I think is preferable.

David also asked the UFFD_FEATURE_MISSING_HUGETLBFS and
UFFD_FEATURE_MISSING_SHMEM feature flags to be added so QEMU can avoid
to probe if the hugetlbfs/shmem missing support is available by calling
UFFDIO_REGISTER.  QEMU already checks HUGETLBFS_MAGIC with fstatfs so if
UFFD_FEATURE_MISSING_HUGETLBFS is also set, it knows UFFDIO_REGISTER
will succeed (or if it fails, it's for some other more concerning
reason).  There's no reason to worry about adding too many feature
flags.  There are 64 available and worst case we've to bump the API if
someday we're really going to run out of them.

The round-trip network latency of hugetlbfs userfaults during postcopy
live migration is still of the order of dozen milliseconds on 10GBit if
at 2MB hugepage granularity so it's working perfectly and it should
provide for higher bandwidth or lower CPU usage (which makes it
interesting to add an option in the future to support THP granularity
too for anonymous memory, UFFDIO_COPY would then have to create THP if
alignment/len allows for it).  1GB hugetlbfs granularity will require
big changes in hugetlbfs to work so it's deferred for later.

This patch (of 42):

This adds proper documentation (inline) to avoid the risk of further
misunderstandings about the semantics of _IOW/_IOR and it also reminds
whoever will bump the UFFDIO_API in the future, to change the two ioctl
to _IOW.

This was found while implementing strace support for those ioctl,
otherwise we could have never found it by just reviewing kernel code and
testing it.

_IOC_READ or _IOC_WRITE alters nothing but the ioctl number itself, so
it's only worth fixing if the UFFDIO_API is bumped someday.

Link: http://lkml.kernel.org/r/20161216144821.5183-2-aarcange@redhat.com
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reported-by: "Dmitry V. Levin" <ldv@altlinux.org>
Cc: Michael Rapoport <RAPOPORT@il.ibm.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-22 16:41:27 -08:00
..
acpi Merge branches 'acpi-bus', 'acpi-sleep' and 'acpi-processor' 2017-02-20 14:28:03 +01:00
asm-generic Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-02-20 13:23:30 -08:00
clocksource
crypto
drm Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-02-20 13:23:30 -08:00
dt-bindings Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2017-02-22 10:15:09 -08:00
keys
kvm KVM: arm64: Access CNTHCTL_EL2 bit fields correctly on VHE systems 2017-01-13 11:19:25 +00:00
linux slab: use memcg_kmem_cache_wq for slab destruction operations 2017-02-22 16:41:27 -08:00
math-emu
media [media] v4l2-subdev.h: fix v4l2_subdev_pad_config documentation 2017-02-03 12:00:37 -02:00
memory
misc
net Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2017-02-22 10:15:09 -08:00
pcmcia
ras
rdma net-next: treewide use is_vlan_dev() helper function. 2017-02-06 16:33:29 -05:00
rxrpc
scsi SCSI misc on 20170220 2017-02-21 11:51:42 -08:00
soc ARC updates for 4.11 rc1 2017-02-22 10:33:53 -08:00
sound Merge remote-tracking branches 'asoc/fix/arizona', 'asoc/fix/dpcm', 'asoc/fix/dwc', 'asoc/fix/fsl-ssi' and 'asoc/fix/hdmi-codec' into asoc-linus 2017-01-10 10:47:50 +00:00
target target: Fix multi-session dynamic se_node_acl double free OOPs 2017-02-08 08:25:23 -08:00
trace oom, trace: add compaction retry tracepoint 2017-02-22 16:41:27 -08:00
uapi userfaultfd: document _IOR/_IOW 2017-02-22 16:41:27 -08:00
video
xen xen/privcmd: Add IOCTL_PRIVCMD_DM_OP 2017-02-14 15:13:43 -05:00
Kbuild