1
0
Fork 0
Commit Graph

476 Commits (redonkable)

Author SHA1 Message Date
Andrey Zhizhikin 8c8c2d4715 This is the 5.4.86 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl/sW9MACgkQONu9yGCS
 aT5SwBAAo6dgHqwmPfuf98/8oVeVqTxcmE7GpzpVRH2+yI7Zwk2ez29tAflcM7lT
 LKtR2WFGAxoCL4DUKXeO7Ubwpue5NoBIsJ8/dAYBesojps3WDaFGL55PvJLWwFJ7
 5gPtPzynITaqIC1JCFcrJ7OTp7REiCUZRc1CJXJINWAYL1VbEbH8pH904xfFcivy
 XnNyL9UiWp1lSB8oF3CRJOaK5M5gY1+wdCFaLVqQn306XDEM8PvZK4G3at/jXWgH
 jQjArdtC8M8NwjyTwtqW9JAMV+6CD0/HXk0QboTZg6yiaRrtUsfzMqJ1cvhKcQgO
 kLE3rwdnr3/MxuzSnGWbswflG2WCutoah58g0uN8H0nCiui5mKN6x5K+emgDZIoO
 ndDnh+/5OE247EK+3CGn/0N8i/fOymrLAnLL4wCXVdlQLMCalnL37ibdfGbAptXi
 N3GOGZ2iEglvTsEr5w0r86+AzNskm5EqA7mFGFiAyf9viR2xwYk3RrWf2ZyMRos2
 2S7mKcZmw7voDu2TIDIhqydToBKxmYI/mUn3mFFme1h3lwzM3zYG1aovVLfd5NkY
 Gx5E/CA/ut/3n0u/dXJ8SxEitBWkqImp5UdYcElQNxQoXnVU4yKmjf6dDL9Wqh+1
 ujCiaCUJd3PY0uXXIb6RWWGs2VaL4xiEnk+ZBm0VI9WEUWksSx0=
 =jnmv
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEdQaENiSDAlGTDEbB7G51OISzHs0FAl/yR4IACgkQ7G51OISz
 Hs2KfA//e3jp0Ah8krkhVUDaiKJ6cWwt3PyEyEuASV+KB7bf8dW25E98jEriBTBX
 kEgmy8ZWJqcDvxP7JWshmjSu6y7IfzFhrNMt3Fsd3ZsQz7nUprzokufnUSMcoNeu
 vXYtJHslunsxnUVOzy/iCAjb0KU7zS5lYxxITAGth1vgEM5QySXcBZx8yWrGxNt+
 Hk5Rc4hQlogmh4Mi1t9VoHOafy5smitOwVGtcl8oPiDCkoConXtBvNQgFkncBZWf
 0EOXiulRkWeo/KXMVrdVy8J1IzWjQDDM1/JDY/Xx6scLnBBCJ002Yfv/HpL/toAM
 K5/dYJmRktlsCaKFd14uMTAnEqhjnDyPtxntOa0Z4YExfOGwum/SmOMvQyCGLOJg
 eF5HejriqRfK2bRBpYO92aXdwmBSuu2cS3AXroHw3tl3VQ+9uzTNzp9iIonsKjSJ
 5WQPc+0Ebg5NPtHHkimeUTFcxmYfqOFV2u+wVDi9Lcfm7xzJ8zM7w/IyM6sMpdoZ
 xKQ0jto8KN0eQKWmih2GL/pde2iihjOPa6RYuRom7LCLMgjJvBMCIQJwUZBg1PU5
 0eh7WipmOt1xCfKWi3HX3A+2ibkdT/3CkochK26Md/mrCH4aI4JSacr3lXYTALhi
 yy37o4ogWgCwDGOmxCPfG2BxZzLywNIBJb3qwT3maDPMkMT0efs=
 =IdA4
 -----END PGP SIGNATURE-----

Merge tag 'v5.4.86' into 5.4-2.2.x-imx

This is the 5.4.86 stable release

Signed-off-by: Andrey Zhizhikin <andrey.zhizhikin@leica-geosystems.com>
2021-01-03 22:38:54 +00:00
Dan Carpenter e01dfcc08b virtio_ring: Fix two use after free bugs
[ Upstream commit e152d8af42 ]

The "vq" struct is added to the "vdev->vqs" list prematurely.  If we
encounter an error later in the function then the "vq" is freed, but
since it is still on the list that could lead to a use after free bug.

Fixes: cbeedb72b9 ("virtio_ring: allocate desc state for split ring separately")
Reported-by: Robert Buhren <robert.buhren@sect.tu-berlin.de>
Reported-by: Felicitas Hetzelt <file@sect.tu-berlin.de>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/X8pGaG/zkI3jk8mk@mwanda
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-12-30 11:51:29 +01:00
Dan Carpenter 5f70910832 virtio_ring: Cut and paste bugs in vring_create_virtqueue_packed()
[ Upstream commit ae93d8ea0f ]

There is a copy and paste bug in the error handling of this code and
it uses "ring_dma_addr" three times instead of "device_event_dma_addr"
and "driver_event_dma_addr".

Fixes: 1ce9e6055f (" virtio_ring: introduce packed ring support")
Reported-by: Robert Buhren <robert.buhren@sect.tu-berlin.de>
Reported-by: Felicitas Hetzelt <file@sect.tu-berlin.de>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/X8pGRJlEzyn+04u2@mwanda
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-12-30 11:51:28 +01:00
Andrey Zhizhikin ee7b6ad15b This is the 5.4.67 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl9rJlYACgkQONu9yGCS
 aT6WbRAAga6QVKrO6R4NeKk0fPqKQQoQeTK+phBOFA7jAoX/rIRKyob2Si9BwhBA
 F77vZ6HIZ7+e/o35JJQYQbffbHYs0ANuS1oHGqe0vgbh+72Viaan6g7lFOhpx8qf
 y0YS7q+hw4WLZB0gGlBM7nkPXRiis32IrEVabQW+t8hmT2lWyutY8E2yFAU60tvI
 Tvjm2c2pvHEcHz9MrjEd/jIVxMFnIl42FBTx9bGsbDUCDzBwEvPArS4bNioP7EFJ
 O+rrGCNvwtiv0DuKzX1UIZzQ88IROmU3ZjsIlgOwla7xJWv4QDgmPfyAyRI48QhH
 PAZQmSntz+y+MP6B3z3ZBrxc2Fx0kCDtugn2P9+2RVUEpheANJ293vUgYTKN9Roy
 dHdWHFWNTO9IYpIN0cZjc25db4ULdjerWQrKcCr6ZO8+Ep/0mSzx3lkWjfuUP8Hr
 L2RD6rAm259OpPq8xhAcJpJvoQLwGxaBHyr4QYUmRgmNVURoqe9Q0MTZuiyGsXhm
 rtcNky9WvmyyI1lJgXi4A+vmsIThCHEstEMycgTejfJ4itIVA9e1ctJVVomWULCn
 9oNStBJpmHw0myDCohbKNjeO1UX/erdF9NaoGto5bnfIhcSae1YQEjRB8zKmzbg1
 DpgC1f7IZ7q53vfrDGsAjInOcuEwAn/Y5JMLJOL4mdA9j3XlX2o=
 =Ot99
 -----END PGP SIGNATURE-----

Merge tag 'v5.4.67' into 5.4-2.2.x-imx

This is the 5.4.67 stable release

This updates the kernel present in the NXP release imx_5.4.47_2.2.0 to the
latest patchset available from stable korg.

Base stable kernel version present in the NXP BSP release is v5.4.47.

Following conflicts were recorded and resolved:
- arch/arm/mach-imx/pm-imx6.c
NXP version has a different PM vectoring scheme, where the IRAM bottom
half (8k) is used to store IRAM code and pm_info. Keep this version to
be compatible with NXP PM implementation.

- arch/arm64/boot/dts/freescale/imx8mm-evk.dts
- arch/arm64/boot/dts/freescale/imx8mn-ddr4-evk.dts
NXP patches kept to provide proper LDO setup:
imx8mm-evk.dts: 975d8ab07267ded741c4c5d7500e524c85ab40d3
imx8mn-ddr4-evk.dts: e8e35fd0e759965809f3dca5979a908a09286198

- drivers/crypto/caam/caamalg.c
Keep NXP version, as it already covers the functionality for the
upstream patch [d6bbd4eea2]

- drivers/gpu/drm/imx/dw_hdmi-imx.c
- drivers/gpu/drm/imx/imx-ldb.c
- drivers/gpu/drm/imx/ipuv3/ipuv3-crtc.c
Port changes from upstream commit [1a27987101], which extends
component lifetime by moving drm structures allocation/free from
bind() to probe().

- drivers/gpu/drm/imx/imx-ldb.c
Merge patch [1752ab50e8] from upstream to disable both LVDS channels
when Enoder is disabled

- drivers/mmc/host/sdhci-esdhc-imx.c
Fix merge fuzz produced by [6534c897fd].

- drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
Commit d1a00c9bb1 from upstream solves the issue with improper error
reporting when qdisc type support is absent. Upstream version is merged
into NXP implementation.

- drivers/net/ethernet/freescale/enetc/enetc.c
Commit [ce06fcb6a6] from upstream merged,
base NXP version kept

- drivers/net/ethernet/freescale/enetc/enetc_pf.c
Commit [e8b86b4d87] from upstream solves
the kernel panic in case if probing fails. NXP has a clean-up logic
implemented different, where the MDIO remove would be invoked in any
failure case. Keep the NXP logic in place.

- drivers/thermal/imx_thermal.c
Upstream patch [9025a5589c] adds missing
of_node_put call, NXP version has been adapted to accommodate this patch
into the code.

- drivers/usb/cdns3/ep0.c
Manual merge of commit [be8df02707] from
upstream to protect cdns3_check_new_setup

- drivers/xen/swiotlb-xen.c
Port upstream commit cca58a1669 to NXP tree, manual hunk was
resolved during merge.

- sound/soc/fsl/fsl_esai.c
Commit [53057bd4ac] upstream addresses the problem of endless isr in
case if exception interrupt is enabled and tasklet is scheduled. Since
NXP implementation has tasklet removed with commit [2bbe95fe6c],
upstream fix does not match the main implementation, hence we keep the
NXP version here.

- sound/soc/fsl/fsl_sai.c
Apply patch [b8ae2bf5cc] from upstream, which uses FIFO watermark
mask macro.

Signed-off-by: Andrey Zhizhikin <andrey.zhizhikin@leica-geosystems.com>
2020-09-26 20:54:42 +00:00
Mao Wenan 05724341d9 virtio_ring: Avoid loop when vq is broken in virtqueue_poll
[ Upstream commit 481a0d7422 ]

The loop may exist if vq->broken is true,
virtqueue_get_buf_ctx_packed or virtqueue_get_buf_ctx_split
will return NULL, so virtnet_poll will reschedule napi to
receive packet, it will lead cpu usage(si) to 100%.

call trace as below:
virtnet_poll
	virtnet_receive
		virtqueue_get_buf_ctx
			virtqueue_get_buf_ctx_packed
			virtqueue_get_buf_ctx_split
	virtqueue_napi_complete
		virtqueue_poll           //return true
		virtqueue_napi_schedule //it will reschedule napi

to fix this, return false if vq is broken in virtqueue_poll.

Signed-off-by: Mao Wenan <wenan.mao@linux.alibaba.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://lore.kernel.org/r/1596354249-96204-1-git-send-email-wenan.mao@linux.alibaba.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-08-26 10:40:57 +02:00
Michael S. Tsirkin cea6633d53 virtio_balloon: fix up endian-ness for free cmd id
commit 168c358af2 upstream.

free cmd id is read using virtio endian, spec says all fields
in balloon are LE. Fix it up.

Fixes: 86a559787e ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT")
Cc: stable@vger.kernel.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Wei Wang <wei.w.wang@intel.com>
Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-08-05 09:59:43 +02:00
Jason Liu 5691e22711 Merge tag 'v5.4.47' into imx_5.4.y
* tag 'v5.4.47': (2193 commits)
  Linux 5.4.47
  KVM: arm64: Save the host's PtrAuth keys in non-preemptible context
  KVM: arm64: Synchronize sysreg state on injecting an AArch32 exception
  ...

 Conflicts:
	arch/arm/boot/dts/imx6qdl.dtsi
	arch/arm/mach-imx/Kconfig
	arch/arm/mach-imx/common.h
	arch/arm/mach-imx/suspend-imx6.S
	arch/arm64/boot/dts/freescale/imx8qxp-mek.dts
	arch/powerpc/include/asm/cacheflush.h
	drivers/cpufreq/imx6q-cpufreq.c
	drivers/dma/imx-sdma.c
	drivers/edac/synopsys_edac.c
	drivers/firmware/imx/imx-scu.c
	drivers/net/ethernet/freescale/fec.h
	drivers/net/ethernet/freescale/fec_main.c
	drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
	drivers/net/phy/phy_device.c
	drivers/perf/fsl_imx8_ddr_perf.c
	drivers/usb/cdns3/gadget.c
	drivers/usb/dwc3/gadget.c
	include/uapi/linux/dma-buf.h

Signed-off-by: Jason Liu <jason.hui.liu@nxp.com>
2020-06-19 17:32:49 +08:00
Jan Kiszka 3ea2de8648 virtio: Add virtio-over-ivshmem transport driver
This provides a virtio transport driver over the Inter-VM shared memory
device as found in QEMU and the Jailhouse hypervisor.

...

Note: Specification work for both ivshmem and the virtio transport is
ongoing, so details may still change.

Acked-by: Ye Li <ye.li@nxp.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
2020-04-29 15:03:05 +08:00
Suman Anna a6ea1df949 virtio_ring: Fix mem leak with vring_new_virtqueue()
commit f13f09a12c upstream.

The functions vring_new_virtqueue() and __vring_new_virtqueue() are used
with split rings, and any allocations within these functions are managed
outside of the .we_own_ring flag. The commit cbeedb72b9 ("virtio_ring:
allocate desc state for split ring separately") allocates the desc state
within the __vring_new_virtqueue() but frees it only when the .we_own_ring
flag is set. This leads to a memory leak when freeing such allocated
virtqueues with the vring_del_virtqueue() function.

Fix this by moving the desc_state free code outside the flag and only
for split rings. Issue was discovered during testing with remoteproc
and virtio_rpmsg.

Fixes: cbeedb72b9 ("virtio_ring: allocate desc state for split ring separately")
Signed-off-by: Suman Anna <s-anna@ti.com>
Link: https://lore.kernel.org/r/20200224212643.30672-1-s-anna@ti.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-18 07:17:55 +01:00
Nathan Chancellor 8f310017aa virtio_balloon: Adjust label in virtballoon_probe
commit 6ae4edab2f upstream.

Clang warns when CONFIG_BALLOON_COMPACTION is unset:

../drivers/virtio/virtio_balloon.c:963:1: warning: unused label
'out_del_vqs' [-Wunused-label]
out_del_vqs:
^~~~~~~~~~~~
1 warning generated.

Move the label within the preprocessor block since it is only used when
CONFIG_BALLOON_COMPACTION is set.

Fixes: 1ad6f58ea9 ("virtio_balloon: Fix memory leaks on errors in virtballoon_probe()")
Link: https://github.com/ClangBuiltLinux/linux/issues/886
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Link: https://lore.kernel.org/r/20200216004039.23464-1-natechancellor@gmail.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-18 07:17:37 +01:00
Michael S. Tsirkin 77912b69a9 virtio_balloon: prevent pfn array overflow
[ Upstream commit 6e9826e772 ]

Make sure, at build time, that pfn array is big enough to hold a single
page.  It happens to be true since the PAGE_SHIFT value at the moment is
20, which is 1M - exactly 256 4K balloon pages.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24 08:37:03 +01:00
David Hildenbrand c6d07f6e50 virtio_balloon: Fix memory leaks on errors in virtballoon_probe()
commit 1ad6f58ea9 upstream.

We forget to put the inode and unmount the kernfs used for compaction.

Fixes: 71994620bb ("virtio_balloon: replace oom notifier with shrinker")
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Wei Wang <wei.w.wang@intel.com>
Cc: Liang Li <liang.z.li@intel.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20200205163402.42627-3-david@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-11 04:35:47 -08:00
David Hildenbrand 3ac13462f5 virtio-balloon: Fix memory leak when unloading while hinting is in progress
commit 6c22dc61c7 upstream.

When unloading the driver while hinting is in progress, we will not
release the free page blocks back to MM, resulting in a memory leak.

Fixes: 86a559787e ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT")
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Wei Wang <wei.w.wang@intel.com>
Cc: Liang Li <liang.z.li@intel.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20200205163402.42627-2-david@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-11 04:35:47 -08:00
Daniel Verkamp e0fc65ef8a virtio-pci: check name when counting MSI-X vectors
commit 303090b513 upstream.

VQs without a name specified are not valid; they are skipped in the
later loop that assigns MSI-X vectors to queues, but the per_vq_vectors
loop above that counts the required number of vectors previously still
counted any queue with a non-NULL callback as needing a vector.

Add a check to the per_vq_vectors loop so that vectors with no name are
not counted to make the two loops consistent.  This prevents
over-counting unnecessary vectors (e.g. for features which were not
negotiated with the device).

Cc: stable@vger.kernel.org
Fixes: 86a559787e ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT")
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Daniel Verkamp <dverkamp@chromium.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Wang, Wei W <wei.w.wang@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-11 04:35:43 -08:00
Daniel Verkamp f603b3714e virtio-balloon: initialize all vq callbacks
commit 5790b53390 upstream.

Ensure that elements of the callbacks array that correspond to
unavailable features are set to NULL; previously, they would be left
uninitialized.

Since the corresponding names array elements were explicitly set to
NULL, the uninitialized callback pointers would not actually be
dereferenced; however, the uninitialized callbacks elements would still
be read in vp_find_vqs_msix() and used to calculate the number of MSI-X
vectors required.

Cc: stable@vger.kernel.org
Fixes: 86a559787e ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT")
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Daniel Verkamp <dverkamp@chromium.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-11 04:35:43 -08:00
David Hildenbrand cc3b0930f2 virtio-balloon: fix managed page counts when migrating pages between zones
commit 63341ab037 upstream.

In case we have to migrate a ballon page to a newpage of another zone, the
managed page count of both zones is wrong. Paired with memory offlining
(which will adjust the managed page count), we can trigger kernel crashes
and all kinds of different symptoms.

One way to reproduce:
1. Start a QEMU guest with 4GB, no NUMA
2. Hotplug a 1GB DIMM and online the memory to ZONE_NORMAL
3. Inflate the balloon to 1GB
4. Unplug the DIMM (be quick, otherwise unmovable data ends up on it)
5. Observe /proc/zoneinfo
  Node 0, zone   Normal
    pages free     16810
          min      24848885473806
          low      18471592959183339
          high     36918337032892872
          spanned  262144
          present  262144
          managed  18446744073709533486
6. Do anything that requires some memory (e.g., inflate the balloon some
more). The OOM goes crazy and the system crashes
  [  238.324946] Out of memory: Killed process 537 (login) total-vm:27584kB, anon-rss:860kB, file-rss:0kB, shmem-rss:00
  [  238.338585] systemd invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
  [  238.339420] CPU: 0 PID: 1 Comm: systemd Tainted: G      D W         5.4.0-next-20191204+ #75
  [  238.340139] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu4
  [  238.341121] Call Trace:
  [  238.341337]  dump_stack+0x8f/0xd0
  [  238.341630]  dump_header+0x61/0x5ea
  [  238.341942]  oom_kill_process.cold+0xb/0x10
  [  238.342299]  out_of_memory+0x24d/0x5a0
  [  238.342625]  __alloc_pages_slowpath+0xd12/0x1020
  [  238.343024]  __alloc_pages_nodemask+0x391/0x410
  [  238.343407]  pagecache_get_page+0xc3/0x3a0
  [  238.343757]  filemap_fault+0x804/0xc30
  [  238.344083]  ? ext4_filemap_fault+0x28/0x42
  [  238.344444]  ext4_filemap_fault+0x30/0x42
  [  238.344789]  __do_fault+0x37/0x1a0
  [  238.345087]  __handle_mm_fault+0x104d/0x1ab0
  [  238.345450]  handle_mm_fault+0x169/0x360
  [  238.345790]  do_user_addr_fault+0x20d/0x490
  [  238.346154]  do_page_fault+0x31/0x210
  [  238.346468]  async_page_fault+0x43/0x50
  [  238.346797] RIP: 0033:0x7f47eba4197e
  [  238.347110] Code: Bad RIP value.
  [  238.347387] RSP: 002b:00007ffd7c0c1890 EFLAGS: 00010293
  [  238.347834] RAX: 0000000000000002 RBX: 000055d196a20a20 RCX: 00007f47eba4197e
  [  238.348437] RDX: 0000000000000033 RSI: 00007ffd7c0c18c0 RDI: 0000000000000004
  [  238.349047] RBP: 00007ffd7c0c1c20 R08: 0000000000000000 R09: 0000000000000033
  [  238.349660] R10: 00000000ffffffff R11: 0000000000000293 R12: 0000000000000001
  [  238.350261] R13: ffffffffffffffff R14: 0000000000000000 R15: 00007ffd7c0c18c0
  [  238.350878] Mem-Info:
  [  238.351085] active_anon:3121 inactive_anon:51 isolated_anon:0
  [  238.351085]  active_file:12 inactive_file:7 isolated_file:0
  [  238.351085]  unevictable:0 dirty:0 writeback:0 unstable:0
  [  238.351085]  slab_reclaimable:5565 slab_unreclaimable:10170
  [  238.351085]  mapped:3 shmem:111 pagetables:155 bounce:0
  [  238.351085]  free:720717 free_pcp:2 free_cma:0
  [  238.353757] Node 0 active_anon:12484kB inactive_anon:204kB active_file:48kB inactive_file:28kB unevictable:0kB iss
  [  238.355979] Node 0 DMA free:11556kB min:36kB low:48kB high:60kB reserved_highatomic:0KB active_anon:152kB inactivB
  [  238.358345] lowmem_reserve[]: 0 2955 2884 2884 2884
  [  238.358761] Node 0 DMA32 free:2677864kB min:7004kB low:10028kB high:13052kB reserved_highatomic:0KB active_anon:0B
  [  238.361202] lowmem_reserve[]: 0 0 72057594037927865 72057594037927865 72057594037927865
  [  238.361888] Node 0 Normal free:193448kB min:99395541895224kB low:73886371836733356kB high:147673348131571488kB reB
  [  238.364765] lowmem_reserve[]: 0 0 0 0 0
  [  238.365101] Node 0 DMA: 7*4kB (U) 5*8kB (UE) 6*16kB (UME) 2*32kB (UM) 1*64kB (U) 2*128kB (UE) 3*256kB (UME) 2*512B
  [  238.366379] Node 0 DMA32: 0*4kB 1*8kB (U) 2*16kB (UM) 2*32kB (UM) 2*64kB (UM) 1*128kB (U) 1*256kB (U) 1*512kB (U)B
  [  238.367654] Node 0 Normal: 1985*4kB (UME) 1321*8kB (UME) 844*16kB (UME) 524*32kB (UME) 300*64kB (UME) 138*128kB (B
  [  238.369184] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
  [  238.369915] 130 total pagecache pages
  [  238.370241] 0 pages in swap cache
  [  238.370533] Swap cache stats: add 0, delete 0, find 0/0
  [  238.370981] Free swap  = 0kB
  [  238.371239] Total swap = 0kB
  [  238.371488] 1048445 pages RAM
  [  238.371756] 0 pages HighMem/MovableOnly
  [  238.372090] 306992 pages reserved
  [  238.372376] 0 pages cma reserved
  [  238.372661] 0 pages hwpoisoned

In another instance (older kernel), I was able to observe this
(negative page count :/):
  [  180.896971] Offlined Pages 32768
  [  182.667462] Offlined Pages 32768
  [  184.408117] Offlined Pages 32768
  [  186.026321] Offlined Pages 32768
  [  187.684861] Offlined Pages 32768
  [  189.227013] Offlined Pages 32768
  [  190.830303] Offlined Pages 32768
  [  190.833071] Built 1 zonelists, mobility grouping on.  Total pages: -36920272750453009

In another instance (older kernel), I was no longer able to start any
process:
  [root@vm ~]# [  214.348068] Offlined Pages 32768
  [  215.973009] Offlined Pages 32768
  cat /proc/meminfo
  -bash: fork: Cannot allocate memory
  [root@vm ~]# cat /proc/meminfo
  -bash: fork: Cannot allocate memory

Fix it by properly adjusting the managed page count when migrating if
the zone changed. The managed page count of the zones now looks after
unplug of the DIMM (and after deflating the balloon) just like before
inflating the balloon (and plugging+onlining the DIMM).

We'll temporarily modify the totalram page count. If this ever becomes a
problem, we can fine tune by providing helpers that don't touch
the totalram pages (e.g., adjust_zone_managed_page_count()).

Please note that fixing up the managed page count is only necessary when
we adjusted the managed page count when inflating - only if we
don't have VIRTIO_BALLOON_F_DEFLATE_ON_OOM. With that feature, the
managed page count is not touched when inflating/deflating.

Reported-by: Yumei Huang <yuhuang@redhat.com>
Fixes: 3dcc0571cd ("mm: correctly update zone->managed_pages")
Cc: <stable@vger.kernel.org> # v3.11+
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Jiang Liu <liuj97@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17 19:55:56 +01:00
Wei Wang c9a6820fc0 virtio_balloon: fix shrinker count
Instead of multiplying by page order, virtio balloon divided by page
order. The result is that it can return 0 if there are a bit less
than MAX_ORDER - 1 pages in use, and then shrinker scan won't be called.

Cc: stable@vger.kernel.org
Fixes: 71994620bb ("virtio_balloon: replace oom notifier with shrinker")
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
2019-11-20 02:15:57 -05:00
Michael S. Tsirkin 60bd04f258 virtio_balloon: fix shrinker scan number of pages
virtio_balloon_shrinker_scan should return number of system pages freed,
but because it's calling functions that deal with balloon pages, it gets
confused and sometimes returns the number of balloon pages.

It does not matter practically as the exact number isn't
used, but it seems better to be consistent in case someone
starts using this API.

Further, if we ever tried to iteratively leak pages as
virtio_balloon_shrinker_scan tries to do, we'd run into issues - this is
because freed_pages was accumulating total freed pages, but was also
subtracted on each iteration from pages_to_free, which can result in
either leaking less memory than we were supposed to free, or more if
pages_to_free underruns.

On a system with 4K pages we are lucky that we are never asked to leak
more than 128 pages while we can leak up to 256 at a time,
but it looks like a real issue for systems with page size != 4K.

Fixes: 71994620bb ("virtio_balloon: replace oom notifier with shrinker")
Reported-by: Khazhismel Kumykov <khazhy@google.com>
Reviewed-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-11-20 02:15:57 -05:00
Halil Pasic f7728002c1 virtio_ring: fix return code on DMA mapping fails
Commit 780bc7903a ("virtio_ring: Support DMA APIs")  makes
virtqueue_add() return -EIO when we fail to map our I/O buffers. This is
a very realistic scenario for guests with encrypted memory, as swiotlb
may run out of space, depending on it's size and the I/O load.

The virtio-blk driver interprets -EIO form virtqueue_add() as an IO
error, despite the fact that swiotlb full is in absence of bugs a
recoverable condition.

Let us change the return code to -ENOMEM, and make the block layer
recover form these failures when virtio-blk encounters the condition
described above.

Cc: stable@vger.kernel.org
Fixes: 780bc7903a ("virtio_ring: Support DMA APIs")
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Tested-by: Michael Mueller <mimu@linux.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-11-19 05:13:49 -05:00
Marvin Liu 40ce7919d8 virtio_ring: fix stalls for packed rings
When VIRTIO_F_RING_EVENT_IDX is negotiated, virtio devices can
use virtqueue_enable_cb_delayed_packed to reduce the number of device
interrupts.  At the moment, this is the case for virtio-net when the
napi_tx module parameter is set to false.

In this case, the virtio driver selects an event offset and expects that
the device will send a notification when rolling over the event offset
in the ring.  However, if this roll-over happens before the event
suppression structure update, the notification won't be sent. To address
this race condition the driver needs to check wether the device rolled
over the offset after updating the event suppression structure.

With VIRTIO_F_RING_PACKED, the virtio driver did this by reading the
flags field of the descriptor at the specified offset.

Unfortunately, checking at the event offset isn't reliable: if
descriptors are chained (e.g. when INDIRECT is off) not all descriptors
are overwritten by the device, so it's possible that the device skipped
the specific descriptor driver is checking when writing out used
descriptors. If this happens, the driver won't detect the race condition
and will incorrectly expect the device to send a notification.

For virtio-net, the result will be a TX queue stall, with the
transmission getting blocked forever.

With the packed ring, it isn't easy to find a location which is
guaranteed to change upon the roll-over, except the next device
descriptor, as described in the spec:

        Writes of device and driver descriptors can generally be
        reordered, but each side (driver and device) are only required to
        poll (or test) a single location in memory: the next device descriptor after
        the one they processed previously, in circular order.

while this might be sub-optimal, let's do exactly this for now.

Cc: stable@vger.kernel.org
Cc: Jason Wang <jasowang@redhat.com>
Fixes: f51f982682 ("virtio_ring: leverage event idx in packed ring")
Signed-off-by: Marvin Liu <yong.liu@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-10-28 04:24:46 -04:00
Matthias Lange cf8f169670 virtio_ring: fix unmap of indirect descriptors
The function virtqueue_add_split() DMA-maps the scatterlist buffers. In
case a mapping error occurs the already mapped buffers must be unmapped.
This happens by jumping to the 'unmap_release' label.

In case of indirect descriptors the release is wrong and may leak kernel
memory. Because the implementation assumes that the head descriptor is
already mapped it starts iterating over the descriptor list starting
from the head descriptor. However for indirect descriptors the head
descriptor is never mapped in case of an error.

The fix is to initialize the start index with zero in case of indirect
descriptors and use the 'desc' pointer directly for iterating over the
descriptor chain.

Signed-off-by: Matthias Lange <matthias.lange@kernkonzept.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-09-09 10:43:15 -04:00
Linus Torvalds 933a90bf4f Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs mount updates from Al Viro:
 "The first part of mount updates.

  Convert filesystems to use the new mount API"

* 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
  mnt_init(): call shmem_init() unconditionally
  constify ksys_mount() string arguments
  don't bother with registering rootfs
  init_rootfs(): don't bother with init_ramfs_fs()
  vfs: Convert smackfs to use the new mount API
  vfs: Convert selinuxfs to use the new mount API
  vfs: Convert securityfs to use the new mount API
  vfs: Convert apparmorfs to use the new mount API
  vfs: Convert openpromfs to use the new mount API
  vfs: Convert xenfs to use the new mount API
  vfs: Convert gadgetfs to use the new mount API
  vfs: Convert oprofilefs to use the new mount API
  vfs: Convert ibmasmfs to use the new mount API
  vfs: Convert qib_fs/ipathfs to use the new mount API
  vfs: Convert efivarfs to use the new mount API
  vfs: Convert configfs to use the new mount API
  vfs: Convert binfmt_misc to use the new mount API
  convenience helper: get_tree_single()
  convenience helper get_tree_nodev()
  vfs: Kill sget_userns()
  ...
2019-07-19 10:42:02 -07:00
Linus Torvalds f8c3500cd1 - virtio_pmem: The new virtio_pmem facility introduces a paravirtualized
persistent memory device that allows a guest VM to use DAX mechanisms to
   access a host-file with host-page-cache. It arranges for MAP_SYNC to
   be disabled and instead triggers a host fsync() when a 'write-cache
   flush' command is sent to the virtual disk device.
 
 - Miscellaneous small fixups.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJdMHwpAAoJEB7SkWpmfYgCUYoP/3vcgYBAaXNksyALF0iowPoP
 z4J0KoaOA1CzRFEQtCWUQa84CWj+XoSewwSeyrIkqKQvx/gghXblK+GVjVzBn0BD
 hmmiKr8af4DdxfzYdEXJp65cCpIiVMaJiGr20Aj9ObwvWJb4QZbz9q7hnPt6KgiI
 jVND3BpP3OERb4ZFcibdmJT5foKooMcXVG6+luVe+hc1+ZZQxJBsBaqie4brQIFq
 j59NX3HfHH2fr1vVwnVH0CO4tgbgYg9wZ2EivGu6wBWvORjrr7KiSSbOYP68EBtd
 lUoNps+vQtGnfXGwNzAjp1wuknrQYYh4/KMKjep7hiZD39rgyvBpbHbyynKzQCWV
 REe8cXr/nwphsENvBAUBiqY999EWVIxdT2iaVaSA6K/31JQAC5AFyxVK/P2Ke1SK
 rvePZ++iLQ1o4phTxQPNlVUqF9jOrFVVICGwMDqaqSkOsD9YKQdFClfOF/1ntlDz
 V0bs+Y0Pe8AJCd9ESep4X+vHAWRRIb4EQIuwLaX8RJoY+r1fGye9RPthpYYzvXKp
 DI2iJztFO3anzj2i9htNPUFIaiUmIhzEvG32O2If2yc5FL02hMpHPoFx6vHhe6s3
 f8OJ+olsJK+/IIrV8+DHqYvhzylOYIhmRTvIxIxaNDPHkhR1i2RDQ6KKK1YZmsr8
 MjAZ+Ym0GadDivs+wcM6
 =uAMG
 -----END PGP SIGNATURE-----

Merge tag 'libnvdimm-for-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Pull libnvdimm updates from Dan Williams:
 "Primarily just the virtio_pmem driver:

   - virtio_pmem

     The new virtio_pmem facility introduces a paravirtualized
     persistent memory device that allows a guest VM to use DAX
     mechanisms to access a host-file with host-page-cache. It arranges
     for MAP_SYNC to be disabled and instead triggers a host fsync()
     when a 'write-cache flush' command is sent to the virtual disk
     device.

   - Miscellaneous small fixups"

* tag 'libnvdimm-for-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  virtio_pmem: fix sparse warning
  xfs: disable map_sync for async flush
  ext4: disable map_sync for async flush
  dax: check synchronous mapping is supported
  dm: enable synchronous dax
  libnvdimm: add dax_dev sync flag
  virtio-pmem: Add virtio pmem driver
  libnvdimm: nd_region flush callback support
  libnvdimm, namespace: Drop uuid_t implementation detail
2019-07-18 10:52:08 -07:00
Ihor Matushchak 5e663f0410 virtio-mmio: add error check for platform_get_irq
in vm_find_vqs() irq has a wrong type
so, in case of no IRQ resource defined,
wrong parameter will be passed to request_irq()

Signed-off-by: Ihor Matushchak <ihor.matushchak@foobox.net>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Ivan T. Ivanov <iivanov.xz@gmail.com>
2019-07-11 16:22:29 -04:00
Pankaj Gupta 6e84200c0a virtio-pmem: Add virtio pmem driver
This patch adds virtio-pmem driver for KVM guest.

Guest reads the persistent memory range information from
Qemu over VIRTIO and registers it on nvdimm_bus. It also
creates a nd_region object with the persistent memory
range information so that existing 'nvdimm/pmem' driver
can reserve this into system memory map. This way
'virtio-pmem' driver uses existing functionality of pmem
driver to register persistent memory compatible for DAX
capable filesystems.

This also provides function to perform guest flush over
VIRTIO from 'pmem' driver when userspace performs flush
on DAX memory range.

Signed-off-by: Pankaj Gupta <pagupta@redhat.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jakub Staron <jstaron@google.com>
Tested-by: Jakub Staron <jstaron@google.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2019-07-05 15:19:10 -07:00
Fabrizio Castro 6166e5330c virtio: Fix indentation of VIRTIO_MMIO
VIRTIO_MMIO config option block starts with a space, fix that.

Signed-off-by: Fabrizio Castro <fabrizio.castro@bp.renesas.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-05-27 11:08:22 -04:00
David Howells 99558d203c vfs: Convert virtio_balloon to use the new mount API
Convert the virtio_balloon filesystem to the new internal mount API as the old
one will be obsoleted and removed.  This allows greater flexibility in
communication of mount parameters between userspace, the VFS and the
filesystem.

See Documentation/filesystems/mount_api.txt for more information.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: "Michael S. Tsirkin" <mst@redhat.com>
cc: Jason Wang <jasowang@redhat.com>
cc: virtualization@lists.linux-foundation.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-05-25 18:06:15 -04:00
Al Viro 1f58bb18f6 mount_pseudo(): drop 'name' argument, switch to d_make_root()
Once upon a time we used to set ->d_name of e.g. pipefs root
so that d_path() on pipes would work.  These days it's
completely pointless - dentries of pipes are not even connected
to pipefs root.  However, mount_pseudo() had set the root
dentry name (passed as the second argument) and callers
kept inventing names to pass to it.  Including those that
didn't *have* any non-root dentries to start with...

All of that had been pointless for about 8 years now; it's
time to get rid of that cargo-culting...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-05-25 17:59:24 -04:00
Thomas Gleixner fd534e9b5f treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 102
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license as published by
  the free software foundation either version 2 of the license or at
  your option any later version this program is distributed in the
  hope that it will be useful but without any warranty without even
  the implied warranty of merchantability or fitness for a particular
  purpose see the gnu general public license for more details you
  should have received a copy of the gnu general public license along
  with this program if not write to the free software foundation inc
  51 franklin st fifth floor boston ma 02110 1301 usa

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-or-later

has been chosen to replace the boilerplate/reference in 50 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Richard Fontana <rfontana@redhat.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190523091649.499889647@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-24 17:39:00 +02:00
Thomas Gleixner f33f5fe256 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 78
Based on 1 normalized pattern(s):

  this work is licensed under the terms of the gnu gpl version 2 or
  later see the copying file in the top level directory

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-or-later

has been chosen to replace the boilerplate/reference in 6 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Richard Fontana <rfontana@redhat.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190520075210.858783702@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-24 17:37:51 +02:00
Thomas Gleixner ec8f24b7fa treewide: Add SPDX license identifier - Makefile/Kconfig
Add SPDX license identifiers to all Make/Kconfig files which:

 - Have no license information of any form

These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:

  GPL-2.0-only

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-21 10:50:46 +02:00
Thomas Gleixner 09c434b8a0 treewide: Add SPDX license identifier for more missed files
Add SPDX license identifiers to all files which:

 - Have no license information of any form

 - Have MODULE_LICENCE("GPL*") inside which was used in the initial
   scan/conversion to ignore the file

These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:

  GPL-2.0-only

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-21 10:50:45 +02:00
Al Viro 985f404487 balloon: don't bother with dentry_operations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-05-20 14:09:46 +01:00
Jiang Biao a5581206c5 virtio/virtio_ring: do some comment fixes
There are lots of mismatches between comments and codes, this
patch do these comment fixes.

Signed-off-by: Jiang Biao <benbjiang@tencent.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-05-12 13:11:35 -04:00
YueHaibing df0bfe7501 virtio_ring: Fix potential mem leak in virtqueue_add_indirect_packed
'desc' should be freed before leaving from err handing path.

Fixes: 1ce9e6055f ("virtio_ring: introduce packed ring support")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
stable@vger.kernel.org
2019-05-12 13:11:35 -04:00
Cornelia Huck cf94db2190 virtio: Honour 'may_reduce_num' in vring_create_virtqueue
vring_create_virtqueue() allows the caller to specify via the
may_reduce_num parameter whether the vring code is allowed to
allocate a smaller ring than specified.

However, the split ring allocation code tries to allocate a
smaller ring on allocation failure regardless of what the
caller specified. This may cause trouble for e.g. virtio-pci
in legacy mode, which does not support ring resizing. (The
packed ring code does not resize in any case.)

Let's fix this by bailing out immediately in the split ring code
if the requested size cannot be allocated and may_reduce_num has
not been specified.

While at it, fix a typo in the usage instructions.

Fixes: 2a2d1382fe ("virtio: Add improved queue allocation API")
Cc: stable@vger.kernel.org # v4.6+
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
2019-04-08 17:05:52 -04:00
Longpeng 6a8aae68c8 virtio_pci: fix a NULL pointer reference in vp_del_vqs
If the msix_affinity_masks is alloced failed, then we'll
try to free some resources in vp_free_vectors() that may
access it directly.

We met the following stack in our production:
[   29.296767] BUG: unable to handle kernel NULL pointer dereference at  (null)
[   29.311151] IP: [<ffffffffc04fe35a>] vp_free_vectors+0x6a/0x150 [virtio_pci]
[   29.324787] PGD 0
[   29.333224] Oops: 0000 [#1] SMP
[...]
[   29.425175] RIP: 0010:[<ffffffffc04fe35a>]  [<ffffffffc04fe35a>] vp_free_vectors+0x6a/0x150 [virtio_pci]
[   29.441405] RSP: 0018:ffff9a55c2dcfa10  EFLAGS: 00010206
[   29.453491] RAX: 0000000000000000 RBX: ffff9a55c322c400 RCX: 0000000000000000
[   29.467488] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9a55c322c400
[   29.481461] RBP: ffff9a55c2dcfa20 R08: 0000000000000000 R09: ffffc1b6806ff020
[   29.495427] R10: 0000000000000e95 R11: 0000000000aaaaaa R12: 0000000000000000
[   29.509414] R13: 0000000000010000 R14: ffff9a55bd2d9e98 R15: ffff9a55c322c400
[   29.523407] FS:  00007fdcba69f8c0(0000) GS:ffff9a55c2840000(0000) knlGS:0000000000000000
[   29.538472] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   29.551621] CR2: 0000000000000000 CR3: 000000003ce52000 CR4: 00000000003607a0
[   29.565886] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   29.580055] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   29.594122] Call Trace:
[   29.603446]  [<ffffffffc04fe8a2>] vp_request_msix_vectors+0xe2/0x260 [virtio_pci]
[   29.618017]  [<ffffffffc04fedc5>] vp_try_to_find_vqs+0x95/0x3b0 [virtio_pci]
[   29.632152]  [<ffffffffc04ff117>] vp_find_vqs+0x37/0xb0 [virtio_pci]
[   29.645582]  [<ffffffffc057bf63>] init_vq+0x153/0x260 [virtio_blk]
[   29.658831]  [<ffffffffc057c1e8>] virtblk_probe+0xe8/0x87f [virtio_blk]
[...]

Cc: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Longpeng <longpeng2@huawei.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
2019-04-08 08:40:58 -04:00
Cornelia Huck ab7a2375fb virtio: hint if callbacks surprisingly might sleep
A virtio transport is free to implement some of the callbacks in
virtio_config_ops in a matter that they cannot be called from
atomic context (e.g. virtio-ccw, which maps a lot of the callbacks
to channel I/O, which is an inherently asynchronous mechanism).
This can be very surprising for developers using the much more
common virtio-pci transport, just to find out that things break
when used on s390.

The documentation for virtio_config_ops now contains a comment
explaining this, but it makes sense to add a might_sleep() annotation
to various wrapper functions in the virtio core to avoid surprises
later.

Note that annotations are NOT added to two classes of calls:
- direct calls from device drivers (all current callers should be
  fine, however)
- calls which clearly won't be made from atomic context (such as
  those ultimately coming in via the driver core)

Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-06 11:19:57 -05:00
Wei Wang 59f3397ca7 virtio_balloon: remove the unnecessary 0-initialization
We've changed to kzalloc the vb struct, so no need to 0-initialize
this field one more time.

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
2019-03-06 11:19:33 -05:00
Wei Wang 53e946cb34 virtio-balloon: improve update_balloon_size_func
There is no need to update the balloon actual register when there is no
ballooning request. This patch avoids update_balloon_size when diff is 0.

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-06 11:19:33 -05:00
Joerg Roedel e6d6dd6c87 virtio: Introduce virtio_max_dma_size()
This function returns the maximum segment size for a single
dma transaction of a virtio device. The possible limit comes
from the SWIOTLB implementation in the Linux kernel, that
has an upper limit of (currently) 256kb of contiguous
memory it can map. Other DMA-API implementations might also
have limits.

Use the new dma_max_mapping_size() function to determine the
maximum mapping size when DMA-API is in use for virtio.

Cc: stable@vger.kernel.org
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-06 11:19:15 -05:00
Michael S. Tsirkin 9c0644ee4a virtio: drop internal struct from UAPI
There's no reason to expose struct vring_packed in UAPI - if we do we
won't be able to change or drop it, and it's not part of any interface.

Let's move it to virtio_ring.c

Cc: Tiwei Bie <tiwei.bie@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-02-05 15:29:48 -05:00
Tiwei Bie 45383fb0f4 virtio: support VIRTIO_F_ORDER_PLATFORM
This patch introduces the support for VIRTIO_F_ORDER_PLATFORM.
If this feature is negotiated, the driver must use the barriers
suitable for hardware devices. Otherwise, the device and driver
are assumed to be implemented in software, that is they can be
assumed to run on identical CPUs in an SMP configuration. Thus
a weaker form of memory barriers is sufficient to yield better
performance.

It is recommended that an add-in card based PCI device offers
this feature for portability. The device will fail to operate
further or will operate in a slower emulation mode if this
feature is offered but not accepted.

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-01-24 10:15:42 -05:00
Wei Wang bf4dc0b2be virtio-balloon: tweak config_changed implementation
virtio-ccw has deadlock issues with reading the config space inside the
interrupt context, so we tweak the virtballoon_changed implementation
by moving the config read operations into the related workqueue contexts.
The config_read_bitmap is used as a flag to the workqueue callbacks
about the related config fields that need to be read.

The cmd_id_received is also renamed to cmd_id_received_cache, and
the value should be obtained via virtio_balloon_cmd_id_received.

Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Cc: stable@vger.kernel.org
Fixes: 86a559787e ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT")
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
2019-01-14 20:15:20 -05:00
Wei Wang a229989d97 virtio: don't allocate vqs when names[i] = NULL
Some vqs may not need to be allocated when their related feature bits
are disabled. So callers may pass in such vqs with "names = NULL".
Then we skip such vq allocations.

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Cc: stable@vger.kernel.org
Fixes: 86a559787e ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT")
2019-01-14 20:15:19 -05:00
Wei Wang ddbeac07a3 virtio_pci: use queue idx instead of array idx to set up the vq
When find_vqs, there will be no vq[i] allocation if its corresponding
names[i] is NULL. For example, the caller may pass in names[i] (i=4)
with names[2] being NULL because the related feature bit is turned off,
so technically there are 3 queues on the device, and name[4] should
correspond to the 3rd queue on the device.

So we use queue_idx as the queue index, which is increased only when the
queue exists.

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
2019-01-14 20:15:18 -05:00
Linus Torvalds d548e65904 virtio, vhost: features, fixes, cleanups
discard in virtio blk
 misc fixes and cleanups
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJcLSujAAoJECgfDbjSjVRpguUH/jHFcIR0egb9k0nEx2ETxoPw
 HKjV3zzWW+WKYu7NNXyF4qiIedlQvTLUt1gRNtNa/G0C+AFKKPl+ynBNmBFfM3Lt
 RCpt0ctAaJDpr8xABC4PRoAU2Vga9Glkt9SobZ7kBDCXcCl6PDYk3zLryG87N5Rf
 pQJeTOpYtE8OgQaO7w3+7u5YmfGWaCrsxMWuq43ry9mn0J6QaJ6FYrz5+V90uOcT
 o5NtauCyTzIj+wrsh75qg6KWG8zLFwrskCxX8CmYd+j7ZTDZc5U9eaYJRx3HdqOE
 //aXqXy17trgy5GGTw9IPKE30JOztEhER9HzQASNVkmjYTq7q8DTMlnVIMSLRF0=
 =NI+Y
 -----END PGP SIGNATURE-----

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio/vhost updates from Michael Tsirkin:
"Features, fixes, cleanups:

   - discard in virtio blk

   - misc fixes and cleanups"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  vhost: correct the related warning message
  vhost: split structs into a separate header file
  virtio: remove deprecated VIRTIO_PCI_CONFIG()
  vhost/vsock: switch to a mutex for vhost_vsock_hash
  virtio_blk: add discard and write zeroes support
2019-01-02 18:54:45 -08:00
Dongli Zhang e8d26f29b7 virtio: remove deprecated VIRTIO_PCI_CONFIG()
VIRTIO_PCI_CONFIG() is deprecated. Use VIRTIO_PCI_CONFIG_OFF() instead.

Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-12-19 18:23:49 -05:00
Tiwei Bie f959a128fe virtio_ring: advertize packed ring layout
Advertize the packed ring layout support.

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-26 22:17:40 -08:00
Tiwei Bie f51f982682 virtio_ring: leverage event idx in packed ring
Leverage the EVENT_IDX feature in packed ring to suppress
events when it's available.

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-26 22:17:39 -08:00