From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Zhuang Yanying <ann.zhuangyanying@huawei.com>,
	LinFeng <linfeng23@huawei.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Sasha Levin <sashal@kernel.org>,
	kvm@vger.kernel.org
Subject: [PATCH AUTOSEL 4.9 31/90] KVM: fix overflow of zero page refcount with ksm running
Date: Thu, 17 Sep 2020 22:13:56 -0400
Message-ID: <20200918021455.2067301-31-sashal@kernel.org>
In-Reply-To: <20200918021455.2067301-1-sashal@kernel.org>

From: Zhuang Yanying <ann.zhuangyanying@huawei.com>

[ Upstream commit 7df003c85218b5f5b10a7f6418208f31e813f38f ]

We are testing virtual machines with KSM on a v5.4-rc2 kernel
and found that the zero_page refcount overflows.
The refcount is incremented in try_async_pf() (get_user_pages)
but is not decremented in mmu_set_spte() while handling an
EPT violation.
kvm_release_pfn_clean() calls put_page() only for unreserved
pages, but the zero page is reserved.
So, as VMs are created and destroyed, the refcount of the
zero page keeps increasing until it overflows.

step1:
echo 10000 > /sys/kernel/mm/ksm/pages_to_scan
echo 1 > /sys/kernel/mm/ksm/run
echo 1 > /sys/kernel/mm/ksm/use_zero_pages

step2:
Create several normal QEMU/KVM VMs and destroy them after 10s.
Repeat this continuously.

After a long period of time, all domains hang because
the refcount of the zero page overflows.

QEMU prints the following error log:
 …
 error: kvm run failed Bad address
 EAX=00006cdc EBX=00000008 ECX=80202001 EDX=078bfbfd
 ESI=ffffffff EDI=00000000 EBP=00000008 ESP=00006cc4
 EIP=000efd75 EFL=00010002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
 ES =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
 CS =0008 00000000 ffffffff 00c09b00 DPL=0 CS32 [-RA]
 SS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
 DS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
 FS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
 GS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
 LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
 TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
 GDT=     000f7070 00000037
 IDT=     000f70ae 00000000
 CR0=00000011 CR2=00000000 CR3=00000000 CR4=00000000
 DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
 DR6=00000000ffff0ff0 DR7=0000000000000400
 EFER=0000000000000000
 Code=00 01 00 00 00 e9 e8 00 00 00 c7 05 4c 55 0f 00 01 00 00 00 <8b> 35 00 00 01 00 8b 3d 04 00 01 00 b8 d8 d3 00 00 c1 e0 08 0c ea a3 00 00 01 00 c7 05 04
 …

Meanwhile, a kernel warning is reported.

 [40914.836375] WARNING: CPU: 3 PID: 82067 at ./include/linux/mm.h:987 try_get_page+0x1f/0x30
 [40914.836412] CPU: 3 PID: 82067 Comm: CPU 0/KVM Kdump: loaded Tainted: G           OE     5.2.0-rc2 #5
 [40914.836415] RIP: 0010:try_get_page+0x1f/0x30
 [40914.836417] Code: 40 00 c3 0f 1f 84 00 00 00 00 00 48 8b 47 08 a8 01 75 11 8b 47 34 85 c0 7e 10 f0 ff 47 34 b8 01 00 00 00 c3 48 8d 78 ff eb e9 <0f> 0b 31 c0 c3 66 90 66 2e 0f 1f 84 00 00 00 00 00 48 8b 47 08 a8
 [40914.836418] RSP: 0018:ffffb4144e523988 EFLAGS: 00010286
 [40914.836419] RAX: 0000000080000000 RBX: 0000000000000326 RCX: 0000000000000000
 [40914.836420] RDX: 0000000000000000 RSI: 00004ffdeba10000 RDI: ffffdf07093f6440
 [40914.836421] RBP: ffffdf07093f6440 R08: 800000424fd91225 R09: 0000000000000000
 [40914.836421] R10: ffff9eb41bfeebb8 R11: 0000000000000000 R12: ffffdf06bbd1e8a8
 [40914.836422] R13: 0000000000000080 R14: 800000424fd91225 R15: ffffdf07093f6440
 [40914.836423] FS:  00007fb60ffff700(0000) GS:ffff9eb4802c0000(0000) knlGS:0000000000000000
 [40914.836425] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 [40914.836426] CR2: 0000000000000000 CR3: 0000002f220e6002 CR4: 00000000003626e0
 [40914.836427] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 [40914.836427] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 [40914.836428] Call Trace:
 [40914.836433]  follow_page_pte+0x302/0x47b
 [40914.836437]  __get_user_pages+0xf1/0x7d0
 [40914.836441]  ? irq_work_queue+0x9/0x70
 [40914.836443]  get_user_pages_unlocked+0x13f/0x1e0
 [40914.836469]  __gfn_to_pfn_memslot+0x10e/0x400 [kvm]
 [40914.836486]  try_async_pf+0x87/0x240 [kvm]
 [40914.836503]  tdp_page_fault+0x139/0x270 [kvm]
 [40914.836523]  kvm_mmu_page_fault+0x76/0x5e0 [kvm]
 [40914.836588]  vcpu_enter_guest+0xb45/0x1570 [kvm]
 [40914.836632]  kvm_arch_vcpu_ioctl_run+0x35d/0x580 [kvm]
 [40914.836645]  kvm_vcpu_ioctl+0x26e/0x5d0 [kvm]
 [40914.836650]  do_vfs_ioctl+0xa9/0x620
 [40914.836653]  ksys_ioctl+0x60/0x90
 [40914.836654]  __x64_sys_ioctl+0x16/0x20
 [40914.836658]  do_syscall_64+0x5b/0x180
 [40914.836664]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
 [40914.836666] RIP: 0033:0x7fb61cb6bfc7

Signed-off-by: LinFeng <linfeng23@huawei.com>
Signed-off-by: Zhuang Yanying <ann.zhuangyanying@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 virt/kvm/kvm_main.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 4e4bb5dd2dcd5..266c9a31b1ba9 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -154,6 +154,7 @@ bool kvm_is_reserved_pfn(kvm_pfn_t pfn)
 	 */
 	if (pfn_valid(pfn))
 		return PageReserved(pfn_to_page(pfn)) &&
+		       !is_zero_pfn(pfn) &&
 		       !kvm_is_zone_device_pfn(pfn);
 
 	return true;
-- 
2.25.1
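
The imbalance described in the commit message can be illustrated with a
small userspace model. This is only a sketch, not kernel code: struct
toy_page and every toy_* function are hypothetical stand-ins for
struct page, kvm_is_reserved_pfn(), get_user_pages() and
kvm_release_pfn_clean(), and the !kvm_is_zone_device_pfn() term of the
real check is left out for brevity.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct page: only the fields this sketch needs. */
struct toy_page {
	int  refcount;
	bool reserved;	/* models PageReserved() */
	bool is_zero;	/* models is_zero_pfn() */
};

/* Models kvm_is_reserved_pfn() without and with the one-line change. */
static bool toy_is_reserved(const struct toy_page *p, bool patched)
{
	if (patched)
		return p->reserved && !p->is_zero;
	return p->reserved;
}

/* Models the get_user_pages() side of a fault: always takes a reference. */
static void toy_fault_in(struct toy_page *p)
{
	p->refcount++;
}

/* Models kvm_release_pfn_clean(): drops the reference only when the
 * page is not treated as reserved. */
static void toy_release_clean(struct toy_page *p, bool patched)
{
	if (!toy_is_reserved(p, patched))
		p->refcount--;
}

/* Run n fault/release cycles on a zero page and return its refcount. */
static int toy_run_cycles(int n, bool patched)
{
	struct toy_page zero = { .refcount = 1, .reserved = true, .is_zero = true };

	assert(n >= 0);
	for (int i = 0; i < n; i++) {
		toy_fault_in(&zero);
		toy_release_clean(&zero, patched);
	}
	return zero.refcount;
}
```

In this model, toy_run_cycles(1000, false) returns 1001 because no
reference is ever dropped, while toy_run_cycles(1000, true) returns 1
because each reference taken on the zero page is released again.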

