From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: "Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org,
	"Satheesh Rajendran" <sathnaga@linux.vnet.ibm.com>,
	"Cédric Le Goater" <clg@kaod.org>, "Greg Kurz" <groug@kaod.org>,
	"Lijun Pan" <ljp@linux.ibm.com>,
	"Paul Mackerras" <paulus@ozlabs.org>
Subject: [PATCH 5.4 59/92] KVM: PPC: Book3S HV: XIVE: Free previous EQ page when setting up a new one
Date: Wed, 11 Dec 2019 16:05:50 +0100
Message-ID: <20191211150248.297201833@linuxfoundation.org>
In-Reply-To: <20191211150221.977775294@linuxfoundation.org>

From: Greg Kurz <groug@kaod.org>

commit 31a88c82b466d2f31a44e21c479f45b4732ccfd0 upstream.

The EQ page is allocated by the guest and then passed to the hypervisor
with the H_INT_SET_QUEUE_CONFIG hcall. A reference is taken on the page
before handing it over to the HW. This reference is dropped either when
the guest issues the H_INT_RESET hcall or when the KVM device is released.
But, the guest can legitimately call H_INT_SET_QUEUE_CONFIG several times,
either to reset the EQ (vCPU hot unplug) or to set a new EQ (guest reboot).
In both cases the existing EQ page reference is leaked because we simply
overwrite it in the XIVE queue structure without calling put_page().

This is especially visible when the guest memory is backed with huge pages:
start a VM up to the guest userspace, either reboot it or unplug a vCPU,
quit QEMU. The leak is observed by comparing the value of HugePages_Free in
/proc/meminfo before and after the VM is run.
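
The before/after comparison described above can be scripted. The following
is a minimal user-space sketch, not part of the patch: the helper names
parse_hugepages_free() and read_hugepages_free() are hypothetical, and the
code assumes the usual "HugePages_Free: <count>" line layout of
/proc/meminfo.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: return the HugePages_Free count found in a
 * /proc/meminfo-style buffer, or -1 if the field is absent or malformed.
 */
long parse_hugepages_free(const char *meminfo)
{
	const char *key = "HugePages_Free:";
	const char *line = meminfo;

	while (line && *line) {
		if (strncmp(line, key, strlen(key)) == 0) {
			long val;

			/* %ld skips the padding spaces after the colon. */
			if (sscanf(line + strlen(key), "%ld", &val) == 1)
				return val;
			return -1;
		}
		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}

/* Hypothetical helper: read /proc/meminfo; returns -1 on any error. */
long read_hugepages_free(void)
{
	char buf[8192];	/* assumed large enough for /proc/meminfo */
	size_t n;
	FILE *f = fopen("/proc/meminfo", "r");

	if (!f)
		return -1;
	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';
	return parse_hugepages_free(buf);
}
```

Calling read_hugepages_free() once before starting the VM and once after
quitting QEMU, then comparing the two values, should show a lower count on
an affected kernel if the scenario above was exercised.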

Ideally we'd want the XIVE code to handle the EQ page de-allocation at the
platform level. This isn't the case right now because the various XIVE
drivers have different allocation needs. It may be worth introducing
hooks for this purpose instead of exposing XIVE internals to the drivers,
but that is a substantial piece of work best left for later.

In the meantime, for easier backport, fix both vCPU unplug and guest reboot
leaks by introducing a wrapper around xive_native_configure_queue() that
does the necessary cleanup.
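
For readers who want to reason about the reference counting outside the
kernel, here is a toy user-space model of the wrapper's behaviour. It is
only a sketch: struct model_page, struct model_queue and all model_*
functions are hypothetical stand-ins, and it assumes that the underlying
configure call updates the queue's page pointer only on success.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for struct page and struct xive_q; not kernel types. */
struct model_page { int refcount; };
struct model_queue { struct model_page *qpage; };

void model_get_page(struct model_page *p)
{
	p->refcount++;
}

void model_put_page(struct model_page *p)
{
	assert(p->refcount > 0);	/* catch a double put in the model */
	p->refcount--;
}

/*
 * Models the underlying configure call: on success it overwrites
 * q->qpage without touching any reference count. A non-zero 'fail'
 * simulates an error, in which case q->qpage is left unchanged
 * (an assumption of this model).
 */
int model_configure_queue(struct model_queue *q, struct model_page *qpage,
			  int fail)
{
	if (fail)
		return -1;
	q->qpage = qpage;
	return 0;
}

/*
 * Models the wrapper: remember the previous page, configure the queue,
 * and drop the previous page's reference only if that succeeded.
 */
int model_configure_queue_and_release(struct model_queue *q,
				      struct model_page *qpage, int fail)
{
	struct model_page *prev = q->qpage;
	int rc = model_configure_queue(q, qpage, fail);

	if (rc)
		return rc;
	if (prev)
		model_put_page(prev);
	return 0;
}
```

In this model, configuring a second page or passing NULL to reset the queue
both release the reference held on the previous page, while a failed call
leaves the previous page and its reference in place.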

Reported-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org # v5.2
Fixes: 13ce3297c576 ("KVM: PPC: Book3S HV: XIVE: Add controls for the EQ configuration")
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
Tested-by: Lijun Pan <ljp@linux.ibm.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/powerpc/kvm/book3s_xive_native.c |   31 ++++++++++++++++++++++---------
 1 file changed, 22 insertions(+), 9 deletions(-)

--- a/arch/powerpc/kvm/book3s_xive_native.c
+++ b/arch/powerpc/kvm/book3s_xive_native.c
@@ -50,6 +50,24 @@ static void kvmppc_xive_native_cleanup_q
 	}
 }
 
+static int kvmppc_xive_native_configure_queue(u32 vp_id, struct xive_q *q,
+					      u8 prio, __be32 *qpage,
+					      u32 order, bool can_escalate)
+{
+	int rc;
+	__be32 *qpage_prev = q->qpage;
+
+	rc = xive_native_configure_queue(vp_id, q, prio, qpage, order,
+					 can_escalate);
+	if (rc)
+		return rc;
+
+	if (qpage_prev)
+		put_page(virt_to_page(qpage_prev));
+
+	return rc;
+}
+
 void kvmppc_xive_native_cleanup_vcpu(struct kvm_vcpu *vcpu)
 {
 	struct kvmppc_xive_vcpu *xc = vcpu->arch.xive_vcpu;
@@ -582,19 +600,14 @@ static int kvmppc_xive_native_set_queue_
 		q->guest_qaddr  = 0;
 		q->guest_qshift = 0;
 
-		rc = xive_native_configure_queue(xc->vp_id, q, priority,
-						 NULL, 0, true);
+		rc = kvmppc_xive_native_configure_queue(xc->vp_id, q, priority,
+							NULL, 0, true);
 		if (rc) {
 			pr_err("Failed to reset queue %d for VCPU %d: %d\n",
 			       priority, xc->server_num, rc);
 			return rc;
 		}
 
-		if (q->qpage) {
-			put_page(virt_to_page(q->qpage));
-			q->qpage = NULL;
-		}
-
 		return 0;
 	}
 
@@ -653,8 +666,8 @@ static int kvmppc_xive_native_set_queue_
 	  * OPAL level because the use of END ESBs is not supported by
 	  * Linux.
 	  */
-	rc = xive_native_configure_queue(xc->vp_id, q, priority,
-					 (__be32 *) qaddr, kvm_eq.qshift, true);
+	rc = kvmppc_xive_native_configure_queue(xc->vp_id, q, priority,
+					(__be32 *) qaddr, kvm_eq.qshift, true);
 	if (rc) {
 		pr_err("Failed to configure queue %d for VCPU %d: %d\n",
 		       priority, xc->server_num, rc);


