stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] KVM: fix overflow of zero page refcount with ksm running
@ 2020-04-01 12:50 Zhuang Yanying
  2020-04-01 12:50 ` [PATCH 1/2] " Zhuang Yanying
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Zhuang Yanying @ 2020-04-01 12:50 UTC (permalink / raw)
  To: pbonzini, greg; +Cc: tv, stable, LinFeng

From: LinFeng <linfeng23@huawei.com>

We found that the !is_zero_page() in kvm_is_mmio_pfn() was
submmited in commit:90cff5a8cc("KVM: check for !is_zero_pfn() in
kvm_is_mmio_pfn()"), but reverted in commit:0ef2459983("kvm: fix
kvm_is_mmio_pfn() and rename to kvm_is_reserved_pfn()").

Maybe just adding !is_zero_page() to kvm_is_reserved_pfn() is too
rough. According to commit:e433e83bc3("KVM: MMU: Do not treat
ZONE_DEVICE pages as being reserved"), special handling in some
other flows is also need by zero_page, if we would treat zero_page
as being reserved.

Well, as fixing all functions reference to kvm_is_reserved_pfn() in
this patch, we found that only kvm_release_pfn_clean() and
kvm_get_pfn() don't need special handling.

So, we thought why not only check is_zero_page() in before get and
put page, and revert our last commit:31e813f38f("KVM: fix overflow
of zero page refcount with ksm running") in master.
Instead of adding !is_zero_page() in kvm_is_reserved_pfn(),
new idea is as follow:

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 7f9ee2929cfe..f9a1f9cf188e 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1695,7 +1695,8 @@ EXPORT_SYMBOL_GPL(kvm_release_page_clean);

 void kvm_release_pfn_clean(kvm_pfn_t pfn)
 {
-	if (!is_error_noslot_pfn(pfn) && !kvm_is_reserved_pfn(pfn))
+	if (!is_error_noslot_pfn(pfn) &&
+	    (!kvm_is_reserved_pfn(pfn) || is_zero_pfn(pfn)))
 		put_page(pfn_to_page(pfn));
 }
 EXPORT_SYMBOL_GPL(kvm_release_pfn_clean);
@@ -1734,7 +1735,7 @@ EXPORT_SYMBOL_GPL(kvm_set_pfn_accessed);

 void kvm_get_pfn(kvm_pfn_t pfn)
 {
-	if (!kvm_is_reserved_pfn(pfn))
+	if (!kvm_is_reserved_pfn(pfn) || is_zero_pfn(pfn))
 		get_page(pfn_to_page(pfn));
 }
 EXPORT_SYMBOL_GPL(kvm_get_pfn);

We are confused why ZONE_DEVICE not do this, but treating it as
no reserved. Is it racy if we change only use the patch in cover letter,
but not the series patches.

And we check the code of v4.9.y v4.10.y v4.11.y v4.12.y, this bug exists
in v4.11.y and later, but not in v4.9.y v4.10.y or before.
After commit:e86c59b1b1("mm/ksm: improve deduplication of zero pages
with colouring"), ksm will use zero pages with colouring. This feature
was added in v4.11.y, so I wonder why v4.9.y has this bug.

We use crash tools attaching to /proc/kcore to check the refcount of
zero_page, then create and destroy vm. The refcount stays at 1 on v4.9.y,
well it increases only after v4.11.y. Are you sure it is the same bug
you run into? Is there something we missing?

LinFeng (1):
  KVM: special handling of zero_page in some flows

Zhuang Yanying (1):
  KVM: fix overflow of zero page refcount with ksm running

 arch/x86/kvm/mmu.c  | 2 ++
 virt/kvm/kvm_main.c | 9 +++++----
 2 files changed, 7 insertions(+), 4 deletions(-)

-- 
2.23.0



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 1/2] KVM: fix overflow of zero page refcount with ksm running
  2020-04-01 12:50 [PATCH 0/2] KVM: fix overflow of zero page refcount with ksm running Zhuang Yanying
@ 2020-04-01 12:50 ` Zhuang Yanying
  2020-04-01 13:00   ` Greg KH
  2020-04-01 12:50 ` [PATCH 2/2] KVM: special handling of zero_page in some flows Zhuang Yanying
  2020-04-01 13:00 ` [PATCH 0/2] KVM: fix overflow of zero page refcount with ksm running Greg KH
  2 siblings, 1 reply; 8+ messages in thread
From: Zhuang Yanying @ 2020-04-01 12:50 UTC (permalink / raw)
  To: pbonzini, greg; +Cc: tv, stable, Zhuang Yanying, LinFeng

We are testing Virtual Machine with KSM on v5.4-rc2 kernel,
and found the zero_page refcount overflow.
The cause of refcount overflow is increased in try_async_pf
(get_user_page) without being decreased in mmu_set_spte()
while handling ept violation.
In kvm_release_pfn_clean(), only unreserved page will call
put_page. However, zero page is reserved.
So, as well as creating and destroy vm, the refcount of
zero page will continue to increase until it overflows.

step1:
echo 10000 > /sys/kernel/mm/ksm/pages_to_scan
echo 1 > /sys/kernel/mm/ksm/run
echo 1 > /sys/kernel/mm/ksm/use_zero_pages

step2:
just create several normal qemu kvm vms.
And destroy it after 10s.
Repeat this action all the time.

After a long period of time, all domains hang because
of the refcount of zero page overflow.

Qemu print error log as follow:
error: kvm run failed Bad address
 EAX=00006cdc EBX=00000008 ECX=80202001 EDX=078bfbfd
 ESI=ffffffff EDI=00000000 EBP=00000008 ESP=00006cc4
 EIP=000efd75 EFL=00010002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
 ES =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
 CS =0008 00000000 ffffffff 00c09b00 DPL=0 CS32 [-RA]
 SS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
 DS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
 FS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
 GS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
 LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
 TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
 GDT=     000f7070 00000037
 IDT=     000f70ae 00000000
 CR0=00000011 CR2=00000000 CR3=00000000 CR4=00000000
 DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
 DR6=00000000ffff0ff0 DR7=0000000000000400
 EFER=0000000000000000
 Code=00 01 00 00 00 e9 e8 00 00 00 c7 05 4c 55 0f 00 01 00 00 00 <8b> 35 00 00 01 00 8b 3d 04 00 01 00 b8 d8 d3 00 00 c1 e0 08 0c ea a3 00 00 01 00 c7 05 04

Meanwhile, a kernel warning is departed.

 [40914.836375] WARNING: CPU: 3 PID: 82067 at ./include/linux/mm.h:987 try_get_page+0x1f/0x30
 [40914.836412] CPU: 3 PID: 82067 Comm: CPU 0/KVM Kdump: loaded Tainted: G           OE     5.2.0-rc2 #5
 [40914.836415] RIP: 0010:try_get_page+0x1f/0x30
 [40914.836417] Code: 40 00 c3 0f 1f 84 00 00 00 00 00 48 8b 47 08 a8 01 75 11 8b 47 34 85 c0 7e 10 f0 ff 47 34 b8 01 00 00 00 c3 48 8d 78 ff eb e9 <0f> 0b 31 c0 c3 66 90 66 2e 0f 1f 84 00 0
 0 00 00 00 48 8b 47 08 a8
 [40914.836418] RSP: 0018:ffffb4144e523988 EFLAGS: 00010286
 [40914.836419] RAX: 0000000080000000 RBX: 0000000000000326 RCX: 0000000000000000
 [40914.836420] RDX: 0000000000000000 RSI: 00004ffdeba10000 RDI: ffffdf07093f6440
 [40914.836421] RBP: ffffdf07093f6440 R08: 800000424fd91225 R09: 0000000000000000
 [40914.836421] R10: ffff9eb41bfeebb8 R11: 0000000000000000 R12: ffffdf06bbd1e8a8
 [40914.836422] R13: 0000000000000080 R14: 800000424fd91225 R15: ffffdf07093f6440
 [40914.836423] FS:  00007fb60ffff700(0000) GS:ffff9eb4802c0000(0000) knlGS:0000000000000000
 [40914.836425] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 [40914.836426] CR2: 0000000000000000 CR3: 0000002f220e6002 CR4: 00000000003626e0
 [40914.836427] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 [40914.836427] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 [40914.836428] Call Trace:
 [40914.836433]  follow_page_pte+0x302/0x47b
 [40914.836437]  __get_user_pages+0xf1/0x7d0
 [40914.836441]  ? irq_work_queue+0x9/0x70
 [40914.836443]  get_user_pages_unlocked+0x13f/0x1e0
 [40914.836469]  __gfn_to_pfn_memslot+0x10e/0x400 [kvm]
 [40914.836486]  try_async_pf+0x87/0x240 [kvm]
 [40914.836503]  tdp_page_fault+0x139/0x270 [kvm]
 [40914.836523]  kvm_mmu_page_fault+0x76/0x5e0 [kvm]
 [40914.836588]  vcpu_enter_guest+0xb45/0x1570 [kvm]
 [40914.836632]  kvm_arch_vcpu_ioctl_run+0x35d/0x580 [kvm]
 [40914.836645]  kvm_vcpu_ioctl+0x26e/0x5d0 [kvm]
 [40914.836650]  do_vfs_ioctl+0xa9/0x620
 [40914.836653]  ksys_ioctl+0x60/0x90
 [40914.836654]  __x64_sys_ioctl+0x16/0x20
 [40914.836658]  do_syscall_64+0x5b/0x180
 [40914.836664]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
 [40914.836666] RIP: 0033:0x7fb61cb6bfc7

Signed-off-by: LinFeng <linfeng23@huawei.com>
Signed-off-by: Zhuang Yanying <ann.zhuangyanying@huawei.com>
---
 virt/kvm/kvm_main.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 7e80f62f034c..7f7c22a687c0 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -133,7 +133,8 @@ static bool largepages_enabled = true;
 bool kvm_is_reserved_pfn(kvm_pfn_t pfn)
 {
 	if (pfn_valid(pfn))
-		return PageReserved(pfn_to_page(pfn));
+		return PageReserved(pfn_to_page(pfn)) &&
+		       !is_zero_pfn(pfn);
 
 	return true;
 }
-- 
2.23.0



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 2/2] KVM: special handling of zero_page in some flows
  2020-04-01 12:50 [PATCH 0/2] KVM: fix overflow of zero page refcount with ksm running Zhuang Yanying
  2020-04-01 12:50 ` [PATCH 1/2] " Zhuang Yanying
@ 2020-04-01 12:50 ` Zhuang Yanying
  2020-04-01 13:00   ` Greg KH
  2020-04-01 13:00 ` [PATCH 0/2] KVM: fix overflow of zero page refcount with ksm running Greg KH
  2 siblings, 1 reply; 8+ messages in thread
From: Zhuang Yanying @ 2020-04-01 12:50 UTC (permalink / raw)
  To: pbonzini, greg; +Cc: tv, stable, LinFeng, Zhuang Yanying

From: LinFeng <linfeng23@huawei.com>

Just adding !is_zero_page() to kvm_is_reserved_pfn() is too
rough. According to commit:e433e83bc3("KVM: MMU: Do not treat
ZONE_DEVICE pages as being reserved"), special handling in some
other flows is also need by zero_page, if not treat zero_page as
being reserved.

Signed-off-by: LinFeng <linfeng23@huawei.com>
Signed-off-by: Zhuang Yanying <ann.zhuangyanying@huawei.com>
---
 arch/x86/kvm/mmu.c  | 2 ++
 virt/kvm/kvm_main.c | 6 +++---
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 732c0270a489..e8db17d87c1a 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2961,6 +2961,7 @@ static void transparent_hugepage_adjust(struct kvm_vcpu *vcpu,
 	 * here.
 	 */
 	if (!is_error_noslot_pfn(pfn) && !kvm_is_reserved_pfn(pfn) &&
+	    !is_zero_pfn(pfn) &&
 	    level == PT_PAGE_TABLE_LEVEL &&
 	    PageTransCompoundMap(pfn_to_page(pfn)) &&
 	    !mmu_gfn_lpage_is_disallowed(vcpu, gfn, PT_DIRECTORY_LEVEL)) {
@@ -5010,6 +5011,7 @@ static bool kvm_mmu_zap_collapsible_spte(struct kvm *kvm,
 		 */
 		if (sp->role.direct &&
 			!kvm_is_reserved_pfn(pfn) &&
+			!is_zero_pfn(pfn) &&
 			PageTransCompoundMap(pfn_to_page(pfn))) {
 			drop_spte(kvm, sptep);
 			need_tlb_flush = 1;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 7f7c22a687c0..fc0446bb393b 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1660,7 +1660,7 @@ static struct page *kvm_pfn_to_page(kvm_pfn_t pfn)
 	if (is_error_noslot_pfn(pfn))
 		return KVM_ERR_PTR_BAD_PAGE;
 
-	if (kvm_is_reserved_pfn(pfn)) {
+	if (kvm_is_reserved_pfn(pfn) && is_zero_pfn(pfn)) {
 		WARN_ON(1);
 		return KVM_ERR_PTR_BAD_PAGE;
 	}
@@ -1719,7 +1719,7 @@ static void kvm_release_pfn_dirty(kvm_pfn_t pfn)
 
 void kvm_set_pfn_dirty(kvm_pfn_t pfn)
 {
-	if (!kvm_is_reserved_pfn(pfn)) {
+	if (!kvm_is_reserved_pfn(pfn) && !is_zero_pfn(pfn)) {
 		struct page *page = pfn_to_page(pfn);
 
 		if (!PageReserved(page))
@@ -1730,7 +1730,7 @@ EXPORT_SYMBOL_GPL(kvm_set_pfn_dirty);
 
 void kvm_set_pfn_accessed(kvm_pfn_t pfn)
 {
-	if (!kvm_is_reserved_pfn(pfn))
+	if (!kvm_is_reserved_pfn(pfn) && !is_zero_pfn(pfn))
 		mark_page_accessed(pfn_to_page(pfn));
 }
 EXPORT_SYMBOL_GPL(kvm_set_pfn_accessed);
-- 
2.23.0



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 0/2] KVM: fix overflow of zero page refcount with ksm running
  2020-04-01 12:50 [PATCH 0/2] KVM: fix overflow of zero page refcount with ksm running Zhuang Yanying
  2020-04-01 12:50 ` [PATCH 1/2] " Zhuang Yanying
  2020-04-01 12:50 ` [PATCH 2/2] KVM: special handling of zero_page in some flows Zhuang Yanying
@ 2020-04-01 13:00 ` Greg KH
  2 siblings, 0 replies; 8+ messages in thread
From: Greg KH @ 2020-04-01 13:00 UTC (permalink / raw)
  To: Zhuang Yanying; +Cc: pbonzini, tv, stable, LinFeng

On Wed, Apr 01, 2020 at 08:50:54PM +0800, Zhuang Yanying wrote:
> From: LinFeng <linfeng23@huawei.com>
> 
> We found that the !is_zero_page() in kvm_is_mmio_pfn() was
> submmited in commit:90cff5a8cc("KVM: check for !is_zero_pfn() in
> kvm_is_mmio_pfn()"), but reverted in commit:0ef2459983("kvm: fix
> kvm_is_mmio_pfn() and rename to kvm_is_reserved_pfn()").
> 
> Maybe just adding !is_zero_page() to kvm_is_reserved_pfn() is too
> rough. According to commit:e433e83bc3("KVM: MMU: Do not treat
> ZONE_DEVICE pages as being reserved"), special handling in some
> other flows is also need by zero_page, if we would treat zero_page
> as being reserved.
> 
> Well, as fixing all functions reference to kvm_is_reserved_pfn() in
> this patch, we found that only kvm_release_pfn_clean() and
> kvm_get_pfn() don't need special handling.
> 
> So, we thought why not only check is_zero_page() in before get and
> put page, and revert our last commit:31e813f38f("KVM: fix overflow
> of zero page refcount with ksm running") in master.
> Instead of adding !is_zero_page() in kvm_is_reserved_pfn(),
> new idea is as follow:
> 
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 7f9ee2929cfe..f9a1f9cf188e 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -1695,7 +1695,8 @@ EXPORT_SYMBOL_GPL(kvm_release_page_clean);
> 
>  void kvm_release_pfn_clean(kvm_pfn_t pfn)
>  {
> -	if (!is_error_noslot_pfn(pfn) && !kvm_is_reserved_pfn(pfn))
> +	if (!is_error_noslot_pfn(pfn) &&
> +	    (!kvm_is_reserved_pfn(pfn) || is_zero_pfn(pfn)))
>  		put_page(pfn_to_page(pfn));
>  }
>  EXPORT_SYMBOL_GPL(kvm_release_pfn_clean);
> @@ -1734,7 +1735,7 @@ EXPORT_SYMBOL_GPL(kvm_set_pfn_accessed);
> 
>  void kvm_get_pfn(kvm_pfn_t pfn)
>  {
> -	if (!kvm_is_reserved_pfn(pfn))
> +	if (!kvm_is_reserved_pfn(pfn) || is_zero_pfn(pfn))
>  		get_page(pfn_to_page(pfn));
>  }
>  EXPORT_SYMBOL_GPL(kvm_get_pfn);
> 
> We are confused why ZONE_DEVICE not do this, but treating it as
> no reserved. Is it racy if we change only use the patch in cover letter,
> but not the series patches.
> 
> And we check the code of v4.9.y v4.10.y v4.11.y v4.12.y, this bug exists
> in v4.11.y and later, but not in v4.9.y v4.10.y or before.
> After commit:e86c59b1b1("mm/ksm: improve deduplication of zero pages
> with colouring"), ksm will use zero pages with colouring. This feature
> was added in v4.11.y, so I wonder why v4.9.y has this bug.
> 
> We use crash tools attaching to /proc/kcore to check the refcount of
> zero_page, then create and destroy vm. The refcount stays at 1 on v4.9.y,
> well it increases only after v4.11.y. Are you sure it is the same bug
> you run into? Is there something we missing?
> 
> LinFeng (1):
>   KVM: special handling of zero_page in some flows
> 
> Zhuang Yanying (1):
>   KVM: fix overflow of zero page refcount with ksm running
> 
>  arch/x86/kvm/mmu.c  | 2 ++
>  virt/kvm/kvm_main.c | 9 +++++----
>  2 files changed, 7 insertions(+), 4 deletions(-)
> 
> -- 
> 2.23.0
> 
> 

<formletter>

This is not the correct way to submit patches for inclusion in the
stable kernel tree.  Please read:
    https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to do this properly.

</formletter>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/2] KVM: special handling of zero_page in some flows
  2020-04-01 12:50 ` [PATCH 2/2] KVM: special handling of zero_page in some flows Zhuang Yanying
@ 2020-04-01 13:00   ` Greg KH
  0 siblings, 0 replies; 8+ messages in thread
From: Greg KH @ 2020-04-01 13:00 UTC (permalink / raw)
  To: Zhuang Yanying; +Cc: pbonzini, tv, stable, LinFeng

On Wed, Apr 01, 2020 at 08:50:56PM +0800, Zhuang Yanying wrote:
> From: LinFeng <linfeng23@huawei.com>
> 
> Just adding !is_zero_page() to kvm_is_reserved_pfn() is too
> rough. According to commit:e433e83bc3("KVM: MMU: Do not treat
> ZONE_DEVICE pages as being reserved"), special handling in some
> other flows is also need by zero_page, if not treat zero_page as
> being reserved.
> 
> Signed-off-by: LinFeng <linfeng23@huawei.com>
> Signed-off-by: Zhuang Yanying <ann.zhuangyanying@huawei.com>
> ---
>  arch/x86/kvm/mmu.c  | 2 ++
>  virt/kvm/kvm_main.c | 6 +++---
>  2 files changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 732c0270a489..e8db17d87c1a 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -2961,6 +2961,7 @@ static void transparent_hugepage_adjust(struct kvm_vcpu *vcpu,
>  	 * here.
>  	 */
>  	if (!is_error_noslot_pfn(pfn) && !kvm_is_reserved_pfn(pfn) &&
> +	    !is_zero_pfn(pfn) &&
>  	    level == PT_PAGE_TABLE_LEVEL &&
>  	    PageTransCompoundMap(pfn_to_page(pfn)) &&
>  	    !mmu_gfn_lpage_is_disallowed(vcpu, gfn, PT_DIRECTORY_LEVEL)) {
> @@ -5010,6 +5011,7 @@ static bool kvm_mmu_zap_collapsible_spte(struct kvm *kvm,
>  		 */
>  		if (sp->role.direct &&
>  			!kvm_is_reserved_pfn(pfn) &&
> +			!is_zero_pfn(pfn) &&
>  			PageTransCompoundMap(pfn_to_page(pfn))) {
>  			drop_spte(kvm, sptep);
>  			need_tlb_flush = 1;
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 7f7c22a687c0..fc0446bb393b 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -1660,7 +1660,7 @@ static struct page *kvm_pfn_to_page(kvm_pfn_t pfn)
>  	if (is_error_noslot_pfn(pfn))
>  		return KVM_ERR_PTR_BAD_PAGE;
>  
> -	if (kvm_is_reserved_pfn(pfn)) {
> +	if (kvm_is_reserved_pfn(pfn) && is_zero_pfn(pfn)) {
>  		WARN_ON(1);
>  		return KVM_ERR_PTR_BAD_PAGE;
>  	}
> @@ -1719,7 +1719,7 @@ static void kvm_release_pfn_dirty(kvm_pfn_t pfn)
>  
>  void kvm_set_pfn_dirty(kvm_pfn_t pfn)
>  {
> -	if (!kvm_is_reserved_pfn(pfn)) {
> +	if (!kvm_is_reserved_pfn(pfn) && !is_zero_pfn(pfn)) {
>  		struct page *page = pfn_to_page(pfn);
>  
>  		if (!PageReserved(page))
> @@ -1730,7 +1730,7 @@ EXPORT_SYMBOL_GPL(kvm_set_pfn_dirty);
>  
>  void kvm_set_pfn_accessed(kvm_pfn_t pfn)
>  {
> -	if (!kvm_is_reserved_pfn(pfn))
> +	if (!kvm_is_reserved_pfn(pfn) && !is_zero_pfn(pfn))
>  		mark_page_accessed(pfn_to_page(pfn));
>  }
>  EXPORT_SYMBOL_GPL(kvm_set_pfn_accessed);
> -- 
> 2.23.0
> 
> 

<formletter>

This is not the correct way to submit patches for inclusion in the
stable kernel tree.  Please read:
    https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to do this properly.

</formletter>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/2] KVM: fix overflow of zero page refcount with ksm running
  2020-04-01 12:50 ` [PATCH 1/2] " Zhuang Yanying
@ 2020-04-01 13:00   ` Greg KH
  0 siblings, 0 replies; 8+ messages in thread
From: Greg KH @ 2020-04-01 13:00 UTC (permalink / raw)
  To: Zhuang Yanying; +Cc: pbonzini, tv, stable, LinFeng

On Wed, Apr 01, 2020 at 08:50:55PM +0800, Zhuang Yanying wrote:
> We are testing Virtual Machine with KSM on v5.4-rc2 kernel,
> and found the zero_page refcount overflow.
> The cause of refcount overflow is increased in try_async_pf
> (get_user_page) without being decreased in mmu_set_spte()
> while handling ept violation.
> In kvm_release_pfn_clean(), only unreserved page will call
> put_page. However, zero page is reserved.
> So, as well as creating and destroy vm, the refcount of
> zero page will continue to increase until it overflows.
> 
> step1:
> echo 10000 > /sys/kernel/mm/ksm/pages_to_scan
> echo 1 > /sys/kernel/mm/ksm/run
> echo 1 > /sys/kernel/mm/ksm/use_zero_pages
> 
> step2:
> just create several normal qemu kvm vms.
> And destroy it after 10s.
> Repeat this action all the time.
> 
> After a long period of time, all domains hang because
> of the refcount of zero page overflow.
> 
> Qemu print error log as follow:
> error: kvm run failed Bad address
>  EAX=00006cdc EBX=00000008 ECX=80202001 EDX=078bfbfd
>  ESI=ffffffff EDI=00000000 EBP=00000008 ESP=00006cc4
>  EIP=000efd75 EFL=00010002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
>  ES =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
>  CS =0008 00000000 ffffffff 00c09b00 DPL=0 CS32 [-RA]
>  SS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
>  DS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
>  FS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
>  GS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
>  LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
>  TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
>  GDT=     000f7070 00000037
>  IDT=     000f70ae 00000000
>  CR0=00000011 CR2=00000000 CR3=00000000 CR4=00000000
>  DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
>  DR6=00000000ffff0ff0 DR7=0000000000000400
>  EFER=0000000000000000
>  Code=00 01 00 00 00 e9 e8 00 00 00 c7 05 4c 55 0f 00 01 00 00 00 <8b> 35 00 00 01 00 8b 3d 04 00 01 00 b8 d8 d3 00 00 c1 e0 08 0c ea a3 00 00 01 00 c7 05 04
> 
> Meanwhile, a kernel warning is departed.
> 
>  [40914.836375] WARNING: CPU: 3 PID: 82067 at ./include/linux/mm.h:987 try_get_page+0x1f/0x30
>  [40914.836412] CPU: 3 PID: 82067 Comm: CPU 0/KVM Kdump: loaded Tainted: G           OE     5.2.0-rc2 #5
>  [40914.836415] RIP: 0010:try_get_page+0x1f/0x30
>  [40914.836417] Code: 40 00 c3 0f 1f 84 00 00 00 00 00 48 8b 47 08 a8 01 75 11 8b 47 34 85 c0 7e 10 f0 ff 47 34 b8 01 00 00 00 c3 48 8d 78 ff eb e9 <0f> 0b 31 c0 c3 66 90 66 2e 0f 1f 84 00 0
>  0 00 00 00 48 8b 47 08 a8
>  [40914.836418] RSP: 0018:ffffb4144e523988 EFLAGS: 00010286
>  [40914.836419] RAX: 0000000080000000 RBX: 0000000000000326 RCX: 0000000000000000
>  [40914.836420] RDX: 0000000000000000 RSI: 00004ffdeba10000 RDI: ffffdf07093f6440
>  [40914.836421] RBP: ffffdf07093f6440 R08: 800000424fd91225 R09: 0000000000000000
>  [40914.836421] R10: ffff9eb41bfeebb8 R11: 0000000000000000 R12: ffffdf06bbd1e8a8
>  [40914.836422] R13: 0000000000000080 R14: 800000424fd91225 R15: ffffdf07093f6440
>  [40914.836423] FS:  00007fb60ffff700(0000) GS:ffff9eb4802c0000(0000) knlGS:0000000000000000
>  [40914.836425] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>  [40914.836426] CR2: 0000000000000000 CR3: 0000002f220e6002 CR4: 00000000003626e0
>  [40914.836427] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>  [40914.836427] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>  [40914.836428] Call Trace:
>  [40914.836433]  follow_page_pte+0x302/0x47b
>  [40914.836437]  __get_user_pages+0xf1/0x7d0
>  [40914.836441]  ? irq_work_queue+0x9/0x70
>  [40914.836443]  get_user_pages_unlocked+0x13f/0x1e0
>  [40914.836469]  __gfn_to_pfn_memslot+0x10e/0x400 [kvm]
>  [40914.836486]  try_async_pf+0x87/0x240 [kvm]
>  [40914.836503]  tdp_page_fault+0x139/0x270 [kvm]
>  [40914.836523]  kvm_mmu_page_fault+0x76/0x5e0 [kvm]
>  [40914.836588]  vcpu_enter_guest+0xb45/0x1570 [kvm]
>  [40914.836632]  kvm_arch_vcpu_ioctl_run+0x35d/0x580 [kvm]
>  [40914.836645]  kvm_vcpu_ioctl+0x26e/0x5d0 [kvm]
>  [40914.836650]  do_vfs_ioctl+0xa9/0x620
>  [40914.836653]  ksys_ioctl+0x60/0x90
>  [40914.836654]  __x64_sys_ioctl+0x16/0x20
>  [40914.836658]  do_syscall_64+0x5b/0x180
>  [40914.836664]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
>  [40914.836666] RIP: 0033:0x7fb61cb6bfc7
> 
> Signed-off-by: LinFeng <linfeng23@huawei.com>
> Signed-off-by: Zhuang Yanying <ann.zhuangyanying@huawei.com>
> ---
>  virt/kvm/kvm_main.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 7e80f62f034c..7f7c22a687c0 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -133,7 +133,8 @@ static bool largepages_enabled = true;
>  bool kvm_is_reserved_pfn(kvm_pfn_t pfn)
>  {
>  	if (pfn_valid(pfn))
> -		return PageReserved(pfn_to_page(pfn));
> +		return PageReserved(pfn_to_page(pfn)) &&
> +		       !is_zero_pfn(pfn);
>  
>  	return true;
>  }
> -- 
> 2.23.0
> 
> 
<formletter>

This is not the correct way to submit patches for inclusion in the
stable kernel tree.  Please read:
    https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to do this properly.

</formletter>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/2] KVM: special handling of zero_page in some flows
  2020-03-30 13:32   ` [PATCH 2/2] KVM: special handling of zero_page in some flows Zhuang Yanying
@ 2020-03-30 13:40     ` Greg KH
  0 siblings, 0 replies; 8+ messages in thread
From: Greg KH @ 2020-03-30 13:40 UTC (permalink / raw)
  To: Zhuang Yanying; +Cc: tv, stable, pbonzini, LinFeng

On Mon, Mar 30, 2020 at 09:32:42PM +0800, Zhuang Yanying wrote:
> From: LinFeng <linfeng23@huawei.com>
> 
> Just adding !is_zero_page() to kvm_is_reserved_pfn() is too
> rough. According to commit:e433e83bc3("KVM: MMU: Do not treat
> ZONE_DEVICE pages as being reserved"), special handling in some
> other flows is also need by zero_page, if not treat zero_page as
> being reserved.
> 
> Signed-off-by: LinFeng <linfeng23@huawei.com>
> Signed-off-by: Zhuang Yanying <ann.zhuangyanying@huawei.com>
> ---
>  arch/x86/kvm/mmu.c  | 2 ++
>  virt/kvm/kvm_main.c | 6 +++---
>  2 files changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index d9c7e986b4e4..63155e67abf7 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -2826,6 +2826,7 @@ static void transparent_hugepage_adjust(struct kvm_vcpu *vcpu,
>  	 * here.
>  	 */
>  	if (!is_error_noslot_pfn(pfn) && !kvm_is_reserved_pfn(pfn) &&
> +	    !is_zero_pfn(pfn) &&
>  	    level == PT_PAGE_TABLE_LEVEL &&
>  	    PageTransCompoundMap(pfn_to_page(pfn)) &&
>  	    !mmu_gfn_lpage_is_disallowed(vcpu, gfn, PT_DIRECTORY_LEVEL)) {
> @@ -4788,6 +4789,7 @@ static bool kvm_mmu_zap_collapsible_spte(struct kvm *kvm,
>  		 */
>  		if (sp->role.direct &&
>  			!kvm_is_reserved_pfn(pfn) &&
> +			!is_zero_pfn(pfn) &&
>  			PageTransCompoundMap(pfn_to_page(pfn))) {
>  			drop_spte(kvm, sptep);
>  			need_tlb_flush = 1;
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index d49cdd6ca883..ddc8c2a6e206 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -1658,7 +1658,7 @@ static struct page *kvm_pfn_to_page(kvm_pfn_t pfn)
>  	if (is_error_noslot_pfn(pfn))
>  		return KVM_ERR_PTR_BAD_PAGE;
>  
> -	if (kvm_is_reserved_pfn(pfn)) {
> +	if (kvm_is_reserved_pfn(pfn) && is_zero_pfn(pfn)) {
>  		WARN_ON(1);
>  		return KVM_ERR_PTR_BAD_PAGE;
>  	}
> @@ -1717,7 +1717,7 @@ static void kvm_release_pfn_dirty(kvm_pfn_t pfn)
>  
>  void kvm_set_pfn_dirty(kvm_pfn_t pfn)
>  {
> -	if (!kvm_is_reserved_pfn(pfn)) {
> +	if (!kvm_is_reserved_pfn(pfn) && !is_zero_pfn(pfn)) {
>  		struct page *page = pfn_to_page(pfn);
>  
>  		if (!PageReserved(page))
> @@ -1728,7 +1728,7 @@ EXPORT_SYMBOL_GPL(kvm_set_pfn_dirty);
>  
>  void kvm_set_pfn_accessed(kvm_pfn_t pfn)
>  {
> -	if (!kvm_is_reserved_pfn(pfn))
> +	if (!kvm_is_reserved_pfn(pfn) && !is_zero_pfn(pfn))
>  		mark_page_accessed(pfn_to_page(pfn));
>  }
>  EXPORT_SYMBOL_GPL(kvm_set_pfn_accessed);
> -- 
> 2.23.0
> 
> 

<formletter>

This is not the correct way to submit patches for inclusion in the
stable kernel tree.  Please read:
    https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to do this properly.

</formletter>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 2/2] KVM: special handling of zero_page in some flows
  2020-03-30 13:32 ` [PATCH 0/2] KVM: fix overflow of zero page refcount with ksm running Zhuang Yanying
@ 2020-03-30 13:32   ` Zhuang Yanying
  2020-03-30 13:40     ` Greg KH
  0 siblings, 1 reply; 8+ messages in thread
From: Zhuang Yanying @ 2020-03-30 13:32 UTC (permalink / raw)
  To: tv, stable, pbonzini; +Cc: greg, LinFeng, Zhuang Yanying

From: LinFeng <linfeng23@huawei.com>

Just adding !is_zero_page() to kvm_is_reserved_pfn() is too
rough. According to commit:e433e83bc3("KVM: MMU: Do not treat
ZONE_DEVICE pages as being reserved"), special handling in some
other flows is also need by zero_page, if not treat zero_page as
being reserved.

Signed-off-by: LinFeng <linfeng23@huawei.com>
Signed-off-by: Zhuang Yanying <ann.zhuangyanying@huawei.com>
---
 arch/x86/kvm/mmu.c  | 2 ++
 virt/kvm/kvm_main.c | 6 +++---
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index d9c7e986b4e4..63155e67abf7 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2826,6 +2826,7 @@ static void transparent_hugepage_adjust(struct kvm_vcpu *vcpu,
 	 * here.
 	 */
 	if (!is_error_noslot_pfn(pfn) && !kvm_is_reserved_pfn(pfn) &&
+	    !is_zero_pfn(pfn) &&
 	    level == PT_PAGE_TABLE_LEVEL &&
 	    PageTransCompoundMap(pfn_to_page(pfn)) &&
 	    !mmu_gfn_lpage_is_disallowed(vcpu, gfn, PT_DIRECTORY_LEVEL)) {
@@ -4788,6 +4789,7 @@ static bool kvm_mmu_zap_collapsible_spte(struct kvm *kvm,
 		 */
 		if (sp->role.direct &&
 			!kvm_is_reserved_pfn(pfn) &&
+			!is_zero_pfn(pfn) &&
 			PageTransCompoundMap(pfn_to_page(pfn))) {
 			drop_spte(kvm, sptep);
 			need_tlb_flush = 1;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index d49cdd6ca883..ddc8c2a6e206 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1658,7 +1658,7 @@ static struct page *kvm_pfn_to_page(kvm_pfn_t pfn)
 	if (is_error_noslot_pfn(pfn))
 		return KVM_ERR_PTR_BAD_PAGE;
 
-	if (kvm_is_reserved_pfn(pfn)) {
+	if (kvm_is_reserved_pfn(pfn) && is_zero_pfn(pfn)) {
 		WARN_ON(1);
 		return KVM_ERR_PTR_BAD_PAGE;
 	}
@@ -1717,7 +1717,7 @@ static void kvm_release_pfn_dirty(kvm_pfn_t pfn)
 
 void kvm_set_pfn_dirty(kvm_pfn_t pfn)
 {
-	if (!kvm_is_reserved_pfn(pfn)) {
+	if (!kvm_is_reserved_pfn(pfn) && !is_zero_pfn(pfn)) {
 		struct page *page = pfn_to_page(pfn);
 
 		if (!PageReserved(page))
@@ -1728,7 +1728,7 @@ EXPORT_SYMBOL_GPL(kvm_set_pfn_dirty);
 
 void kvm_set_pfn_accessed(kvm_pfn_t pfn)
 {
-	if (!kvm_is_reserved_pfn(pfn))
+	if (!kvm_is_reserved_pfn(pfn) && !is_zero_pfn(pfn))
 		mark_page_accessed(pfn_to_page(pfn));
 }
 EXPORT_SYMBOL_GPL(kvm_set_pfn_accessed);
-- 
2.23.0



^ permalink raw reply related	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2020-04-01 13:00 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-01 12:50 [PATCH 0/2] KVM: fix overflow of zero page refcount with ksm running Zhuang Yanying
2020-04-01 12:50 ` [PATCH 1/2] " Zhuang Yanying
2020-04-01 13:00   ` Greg KH
2020-04-01 12:50 ` [PATCH 2/2] KVM: special handling of zero_page in some flows Zhuang Yanying
2020-04-01 13:00   ` Greg KH
2020-04-01 13:00 ` [PATCH 0/2] KVM: fix overflow of zero page refcount with ksm running Greg KH
  -- strict thread matches above, loose matches on Subject: below --
2020-03-26 18:04 proposing 7df003c85218b5f for v5.5.y, v5.4.y, 4.19.y, v4.14.y, v4.9.y Paolo Bonzini
2020-03-30 13:32 ` [PATCH 0/2] KVM: fix overflow of zero page refcount with ksm running Zhuang Yanying
2020-03-30 13:32   ` [PATCH 2/2] KVM: special handling of zero_page in some flows Zhuang Yanying
2020-03-30 13:40     ` Greg KH

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).