1
0
Fork 0
Commit Graph

21 Commits (636deed6c0bc137a7c4f4a97ae1fcf0ad75323da)

Author SHA1 Message Date
Ben Gardon b12ce36a43 kvm: Add memcg accounting to KVM allocations
There are many KVM kernel memory allocations which are tied to the life of
the VM process and should be charged to the VM process's cgroup. If the
allocations aren't tied to the process, the OOM killer will not know
that killing the process will free the associated kernel memory.
Add __GFP_ACCOUNT flags to many of the allocations which are not yet being
charged to the VM process's cgroup.

Tested:
	Ran all kvm-unit-tests on a 64 bit Haswell machine, the patch
	introduced no new failures.
	Ran a kernel memory accounting test which creates a VM to touch
	memory and then checks that the kernel memory allocated for the
	process is within certain bounds.
	With this patch we account for much more of the vmalloc and slab memory
	allocated for the VM.

There remain a few allocations which should be charged to the VM's
cgroup but are not. In they include:
        vcpu->run
        kvm->coalesced_mmio_ring
There allocations are unaccounted in this patch because they are mapped
to userspace, and accounting them to a cgroup causes problems. This
should be addressed in a future patch.

Signed-off-by: Ben Gardon <bgardon@google.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-02-20 22:48:29 +01:00
Christian Borntraeger 5535f800b0 KVM: use rcu access function for irq routing
irq routing is rcu protected. Use the proper access functions.
Found by sparse

virt/kvm/irqchip.c:233:13: warning: incorrect type in assignment (different address spaces)
virt/kvm/irqchip.c:233:13:    expected struct kvm_irq_routing_table *old
virt/kvm/irqchip.c:233:13:    got struct kvm_irq_routing_table [noderef] <asn:4>*irq_routing

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-07 15:24:15 +02:00
David Hildenbrand 5c0aea0e8d KVM: x86: don't hold kvm->lock in KVM_SET_GSI_ROUTING
We needed the lock to avoid racing with creation of the irqchip on x86. As
kvm_set_irq_routing() calls srcu_synchronize_expedited(), this lock
might be held for a longer time.

Let's introduce an arch specific callback to check if we can actually
add irq routes. For x86, all we have to do is check if we have an
irqchip in the kernel. We don't need kvm->lock at that point as the
irqchip is marked as inititalized only when actually fully created.

Reported-by: Steve Rutherford <srutherford@google.com>
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Fixes: 1df6ddede1 ("KVM: x86: race between KVM_SET_GSI_ROUTING and KVM_CREATE_IRQCHIP")
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-05-02 14:45:45 +02:00
David Hildenbrand 8c6b7828c2 KVM: x86: cleanup return handling in setup_routing_entry()
Let's drop the goto and return the error value directly.

Suggested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-04-12 20:17:14 +02:00
Paolo Bonzini 6f49b2f341 KVM/ARM Changes for v4.8 - Take 2
Includes GSI routing support to go along with the new VGIC and a small fix that
 has been cooking in -next for a while.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJXoydqAAoJEEtpOizt6ddyM3oH/1A4VeG/J9q4fBPXqY2tVWXs
 c3P7UgNcrEgUNs/F9ykQY/lb31deecUzaBt1OyTf+RlsNbihq3dQdYcBhxtUODw/
 Faok582ya3UFgLW+IRHcID0EbkVOpIzMhOStYsnU/Dz7HG1JL9HdPzwkid7iu9LT
 fI6yrrBnJFjdWAAQ4BkcEKBENRsY8NTs7jX5vnFA92MkUBby7BmariPDD3FtrB+f
 Ob9B7CxM30pNqsN7OA/QvFOHMJHxf3s1TBKwmPHe5TLIfSzV1YxcEGiMc0lWqF4v
 BT8ZeMGCtjDw94tND1DskfQQRPaMqPmGuRTrAW/IuE2n92bFtbqIqs7Cbw0fzLE=
 =Vm6Q
 -----END PGP SIGNATURE-----

Merge tag 'kvm-arm-for-4.8-take2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/ARM Changes for v4.8 - Take 2

Includes GSI routing support to go along with the new VGIC and a small fix that
has been cooking in -next for a while.
2016-08-04 13:59:56 +02:00
Eric Auger 995a0ee980 KVM: arm/arm64: Enable MSI routing
Up to now, only irqchip routing entries could be set. This patch
adds the capability to insert MSI routing entries.

For ARM64, let's also increase KVM_MAX_IRQ_ROUTES to 4096: this
include SPI irqchip routes plus MSI routes. In the future this
might be extended.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-07-22 18:52:03 +01:00
Eric Auger ebe915259c KVM: irqchip: Convey devid to kvm_set_msi
on ARM, a devid field is populated in kvm_msi struct in case the
flag is set to KVM_MSI_VALID_DEVID. Let's propagate both flags and
devid field in kvm_kernel_irq_routing_entry.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-07-22 18:51:58 +01:00
Radim Krčmář c63cf538eb KVM: pass struct kvm to kvm_set_routing_entry
Arch-specific code will use it.

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-07-14 09:03:56 +02:00
Paolo Bonzini c622a3c21e KVM: irqfd: fix NULL pointer dereference in kvm_irq_map_gsi
Found by syzkaller:

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000120
    IP: [<ffffffffa0797202>] kvm_irq_map_gsi+0x12/0x90 [kvm]
    PGD 6f80b067 PUD b6535067 PMD 0
    Oops: 0000 [#1] SMP
    CPU: 3 PID: 4988 Comm: a.out Not tainted 4.4.9-300.fc23.x86_64 #1
    [...]
    Call Trace:
     [<ffffffffa0795f62>] irqfd_update+0x32/0xc0 [kvm]
     [<ffffffffa0796c7c>] kvm_irqfd+0x3dc/0x5b0 [kvm]
     [<ffffffffa07943f4>] kvm_vm_ioctl+0x164/0x6f0 [kvm]
     [<ffffffff81241648>] do_vfs_ioctl+0x298/0x480
     [<ffffffff812418a9>] SyS_ioctl+0x79/0x90
     [<ffffffff817a1062>] tracesys_phase2+0x84/0x89
    Code: b5 71 a7 e0 5b 41 5c 41 5d 5d f3 c3 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 8b 8f 10 2e 00 00 31 c0 48 89 e5 <39> 91 20 01 00 00 76 6a 48 63 d2 48 8b 94 d1 28 01 00 00 48 85
    RIP  [<ffffffffa0797202>] kvm_irq_map_gsi+0x12/0x90 [kvm]
     RSP <ffff8800926cbca8>
    CR2: 0000000000000120

Testcase:

    #include <unistd.h>
    #include <sys/syscall.h>
    #include <string.h>
    #include <stdint.h>
    #include <linux/kvm.h>
    #include <fcntl.h>
    #include <sys/ioctl.h>

    long r[26];

    int main()
    {
        memset(r, -1, sizeof(r));
        r[2] = open("/dev/kvm", 0);
        r[3] = ioctl(r[2], KVM_CREATE_VM, 0);

        struct kvm_irqfd ifd;
        ifd.fd = syscall(SYS_eventfd2, 5, 0);
        ifd.gsi = 3;
        ifd.flags = 2;
        ifd.resamplefd = ifd.fd;
        r[25] = ioctl(r[3], KVM_IRQFD, &ifd);
        return 0;
    }

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-06-02 17:38:50 +02:00
Andrey Smetanin abdb080f7a kvm/irqchip: kvm_arch_irq_routing_update renaming split
Actually kvm_arch_irq_routing_update() should be
kvm_arch_post_irq_routing_update() as it's called at the end
of irq routing update.

This renaming frees kvm_arch_irq_routing_update function name.
kvm_arch_irq_routing_update() weak function which will be used
to update mappings for arch-specific irq routing entries
(in particular, the upcoming Hyper-V synthetic interrupts).

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Gleb Natapov <gleb@kernel.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Denis V. Lunev <den@openvz.org>
CC: qemu-devel@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-11-25 17:24:21 +01:00
Andrey Smetanin f33143d809 kvm/irqchip: allow only multiple irqchip routes per GSI
Any other irq routing types (MSI, S390_ADAPTER, upcoming Hyper-V
SynIC) map one-to-one to GSI.

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Vitaly Kuznetsov <vkuznets@redhat.com>
CC: "K. Y. Srinivasan" <kys@microsoft.com>
CC: Gleb Natapov <gleb@kernel.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-16 10:34:30 +02:00
Steve Rutherford b053b2aef2 KVM: x86: Add EOI exit bitmap inference
In order to support a userspace IOAPIC interacting with an in kernel
APIC, the EOI exit bitmaps need to be configurable.

If the IOAPIC is in userspace (i.e. the irqchip has been split), the
EOI exit bitmaps will be set whenever the GSI Routes are configured.
In particular, for the low MSI routes are reservable for userspace
IOAPICs. For these MSI routes, the EOI Exit bit corresponding to the
destination vector of the route will be set for the destination VCPU.

The intention is for the userspace IOAPICs to use the reservable MSI
routes to inject interrupts into the guest.

This is a slight abuse of the notion of an MSI Route, given that MSIs
classically bypass the IOAPIC. It might be worthwhile to add an
additional route type to improve clarity.

Compile tested for Intel x86.

Signed-off-by: Steve Rutherford <srutherford@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-01 15:06:28 +02:00
Sudip Mukherjee ba60c41ae3 kvm: irqchip: fix memory leak
We were taking the exit path after checking ue->flags and return value
of setup_routing_entry(), but 'e' was not freed incase of a failure.

Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-08 11:16:41 +02:00
Joerg Roedel e73f61e41f kvm: irqchip: Break up high order allocations of kvm_irq_routing_table
The allocation size of the kvm_irq_routing_table depends on
the number of irq routing entries because they are all
allocated with one kzalloc call.

When the irq routing table gets bigger this requires high
order allocations which fail from time to time:

	qemu-kvm: page allocation failure: order:4, mode:0xd0

This patch fixes this issue by breaking up the allocation of
the table and its entries into individual kzalloc calls.
These could all be satisfied with order-0 allocations, which
are less likely to fail.

The downside of this change is the lower performance, because
of more calls to kzalloc. But given how often kvm_set_irq_routing
is called in the lifetime of a guest, it doesn't really
matter much.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
[Avoid sparse warning through rcu_access_pointer. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-19 17:16:25 +02:00
Kevin Mulvey ae548c5c80 KVM: fix checkpatch.pl errors in kvm/irqchip.c
Fix whitespace around while

Signed-off-by: Kevin Mulvey <kmulvey@linux.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2015-03-10 10:37:42 -03:00
Paul Mackerras e4d57e1ee1 KVM: Move irq notifier implementation into eventfd.c
This moves the functions kvm_irq_has_notifier(), kvm_notify_acked_irq(),
kvm_register_irq_ack_notifier() and kvm_unregister_irq_ack_notifier()
from irqchip.c to eventfd.c.  The reason for doing this is that those
functions are used in connection with IRQFDs, which are implemented in
eventfd.c.  In future we will want to use IRQFDs on platforms that
don't implement the GSI routing implemented in irqchip.c, so we won't
be compiling in irqchip.c, but we still need the irq notifiers.  The
implementation is unchanged.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Tested-by: Eric Auger <eric.auger@linaro.org>
Tested-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-05 14:26:24 +02:00
Paul Mackerras 9957c86d65 KVM: Move all accesses to kvm::irq_routing into irqchip.c
Now that struct _irqfd does not keep a reference to storage pointed
to by the irq_routing field of struct kvm, we can move the statement
that updates it out from under the irqfds.lock and put it in
kvm_set_irq_routing() instead.  That means we then have to take a
srcu_read_lock on kvm->irq_srcu around the irqfd_update call in
kvm_irqfd_assign(), since holding the kvm->irqfds.lock no longer
ensures that that the routing can't change.

Combined with changing kvm_irq_map_gsi() and kvm_irq_map_chip_pin()
to take a struct kvm * argument instead of the pointer to the routing
table, this allows us to to move all references to kvm->irq_routing
into irqchip.c.  That in turn allows us to move the definition of the
kvm_irq_routing_table struct into irqchip.c as well.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Tested-by: Eric Auger <eric.auger@linaro.org>
Tested-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-05 14:26:20 +02:00
Paul Mackerras 8ba918d488 KVM: irqchip: Provide and use accessors for irq routing table
This provides accessor functions for the KVM interrupt mappings, in
order to reduce the amount of code that accesses the fields of the
kvm_irq_routing_table struct, and restrict that code to one file,
virt/kvm/irqchip.c.  The new functions are kvm_irq_map_gsi(), which
maps from a global interrupt number to a set of IRQ routing entries,
and kvm_irq_map_chip_pin, which maps from IRQ chip and pin numbers to
a global interrupt number.

This also moves the update of kvm_irq_routing_table::chip[][]
into irqchip.c, out of the various kvm_set_routing_entry
implementations.  That means that none of the kvm_set_routing_entry
implementations need the kvm_irq_routing_table argument anymore,
so this removes it.

This does not change any locking or data lifetime rules.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Tested-by: Eric Auger <eric.auger@linaro.org>
Tested-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-05 14:26:16 +02:00
Christian Borntraeger 719d93cd5f kvm/irqchip: Speed up KVM_SET_GSI_ROUTING
When starting lots of dataplane devices the bootup takes very long on
Christian's s390 with irqfd patches. With larger setups he is even
able to trigger some timeouts in some components.  Turns out that the
KVM_SET_GSI_ROUTING ioctl takes very long (strace claims up to 0.1 sec)
when having multiple CPUs.  This is caused by the  synchronize_rcu and
the HZ=100 of s390.  By changing the code to use a private srcu we can
speed things up.  This patch reduces the boot time till mounting root
from 8 to 2 seconds on my s390 guest with 100 disks.

Uses of hlist_for_each_entry_rcu, hlist_add_head_rcu, hlist_del_init_rcu
are fine because they do not have lockdep checks (hlist_for_each_entry_rcu
uses rcu_dereference_raw rather than rcu_dereference, and write-sides
do not do rcu lockdep at all).

Note that we're hardly relying on the "sleepable" part of srcu.  We just
want SRCU's faster detection of grace periods.

Testing was done by Andrew Theurer using netperf tests STREAM, MAERTS
and RR.  The difference between results "before" and "after" the patch
has mean -0.2% and standard deviation 0.6%.  Using a paired t-test on the
data points says that there is a 2.5% probability that the patch is the
cause of the performance difference (rather than a random fluctuation).

(Restricting the t-test to RR, which is the most likely to be affected,
changes the numbers to respectively -0.3% mean, 0.7% stdev, and 8%
probability that the numbers actually say something about the patch.
The probability increases mostly because there are fewer data points).

Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> # s390
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-05-05 16:29:11 +02:00
Alexander Graf e8cde0939d KVM: Move irq routing setup to irqchip.c
Setting up IRQ routes is nothing IOAPIC specific. Extract everything
that really is generic code into irqchip.c and only leave the ioapic
specific bits to irq_comm.c.

Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
2013-04-26 20:27:18 +02:00
Alexander Graf 1c9f8520bd KVM: Extract generic irqchip logic into irqchip.c
The current irq_comm.c file contains pieces of code that are generic
across different irqchip implementations, as well as code that is
fully IOAPIC specific.

Split the generic bits out into irqchip.c.

Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
2013-04-26 20:27:17 +02:00