From: Paul Mackerras <paulus@ozlabs.org>
To: Bharata B Rao <bharata@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org, kvm-ppc@vger.kernel.org,
	linux-mm@kvack.org, paulus@au1.ibm.com,
	aneesh.kumar@linux.vnet.ibm.com, jglisse@redhat.com,
	cclaudio@linux.ibm.com, linuxram@us.ibm.com,
	sukadev@linux.vnet.ibm.com, hch@lst.de
Subject: Re: [PATCH v10 1/8] mm: ksm: Export ksm_madvise()
Date: Thu, 7 Nov 2019 16:45:35 +1100
Message-ID: <20191107054535.GA2882@oak.ozlabs.ibm.com>
In-Reply-To: <20191106064542.GB21634@in.ibm.com>

On Wed, Nov 06, 2019 at 12:15:42PM +0530, Bharata B Rao wrote:
> On Wed, Nov 06, 2019 at 03:33:29PM +1100, Paul Mackerras wrote:
> > On Mon, Nov 04, 2019 at 09:47:53AM +0530, Bharata B Rao wrote:
> > > KVM PPC module needs ksm_madvise() for supporting secure guests.
> > > Guest pages that become secure are represented as device private
> > > pages in the host. Such pages shouldn't participate in KSM merging.
> > 
> > If we don't do the ksm_madvise call, then as far as I can tell, it
> > should all still work correctly, but we might have KSM pulling pages
> > in unnecessarily, causing a reduction in performance.  Is that right?
> 
> I thought so too. When KSM tries to merge a secure page, it should
> cause a fault resulting in page-out of the secure page. However I see
> the below crash when KSM is enabled and KSM scan tries to kmap and
> memcmp the device private page.
> 
> BUG: Unable to handle kernel data access at 0xc007fffe00010000
> Faulting instruction address: 0xc0000000000ab5a0
> Oops: Kernel access of bad area, sig: 11 [#1]
> LE PAGE_SIZE=64K MMU=Radix MMU=Hash SMP NR_CPUS=2048 NUMA PowerNV
> Modules linked in:
> CPU: 0 PID: 22 Comm: ksmd Not tainted 5.4.0-rc2-00026-g2249c0ae4a53-dirty #376
> NIP:  c0000000000ab5a0 LR: c0000000003d7c3c CTR: 0000000000000004
> REGS: c0000001c85d79b0 TRAP: 0300   Not tainted  (5.4.0-rc2-00026-g2249c0ae4a53-dirty)
> MSR:  900000000280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 24002242  XER: 20040000
> CFAR: c0000000000ab3d0 DAR: c007fffe00010000 DSISR: 40000000 IRQMASK: 0 
> GPR00: 0000000000000004 c0000001c85d7c40 c0000000018ce000 c0000001c3880000 
> GPR04: c007fffe00010000 0000000000010000 0000000000000000 ffffffffffffffff 
> GPR08: c000000001992298 0000603820002138 ffffffffffffffff ffffffff00003a69 
> GPR12: 0000000024002242 c000000002550000 c0000001c8700000 c00000000179b728 
> GPR16: c00c01ffff800040 c00000000179b5b8 c00c00000070e200 ffffffffffffffff 
> GPR20: 0000000000000000 0000000000000000 fffffffffffff000 c00000000179b648 
> GPR24: c0000000024464a0 c00000000249f568 c000000001118918 0000000000000000 
> GPR28: c0000001c804c590 c00000000249f518 0000000000000000 c0000001c8700000 
> NIP [c0000000000ab5a0] memcmp+0x320/0x6a0
> LR [c0000000003d7c3c] memcmp_pages+0x8c/0xe0
> Call Trace:
> [c0000001c85d7c40] [c0000001c804c590] 0xc0000001c804c590 (unreliable)
> [c0000001c85d7c70] [c0000000004591d0] ksm_scan_thread+0x960/0x21b0
> [c0000001c85d7db0] [c0000000001bf328] kthread+0x198/0x1a0
> [c0000001c85d7e20] [c00000000000bfbc] ret_from_kernel_thread+0x5c/0x80
> Instruction dump:
> ebc1fff0 eba1ffe8 eb81ffe0 eb61ffd8 4e800020 38600001 4d810020 3860ffff 
> 4e800020 38000004 7c0903a6 7d201c28 <7d402428> 7c295040 38630008 38840008 

Hmmm, that seems like a bug in the ZONE_DEVICE stuff generally.  All
that ksm is doing as far as I can see is follow_page() and
kmap_atomic().  I wonder how many other places in the kernel might
also be prone to crashing if they try to touch device pages?
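
To make the failure mode concrete, here is a minimal userspace sketch of a
scanner that refuses to compare pages it cannot read. It is a toy model, not
kernel code: struct toy_page, toy_page_scannable() and toy_count_mergeable()
are hypothetical names invented for illustration, and the device_private flag
only stands in for the kind of test that kernel helpers such as
is_zone_device_page() provide. It does not claim to show how ksm behaves or
how it should be fixed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE 64

/* Hypothetical stand-in for struct page; not the kernel's definition. */
struct toy_page {
    bool device_private;        /* true: no CPU-addressable backing */
    const unsigned char *data;  /* NULL for device-private pages */
};

/* A page may be mapped and read only if it has CPU-visible contents. */
static bool toy_page_scannable(const struct toy_page *p)
{
    return p != NULL && !p->device_private && p->data != NULL;
}

/*
 * Count pairs of scannable pages with identical contents.
 * Device-private pages are skipped before any memcmp() touches them,
 * which is the step the crashing path in the oops above did not take.
 */
static size_t toy_count_mergeable(const struct toy_page *pages, size_t n)
{
    size_t i, j, pairs = 0;

    assert(pages != NULL || n == 0);

    for (i = 0; i < n; i++) {
        if (!toy_page_scannable(&pages[i]))
            continue;
        for (j = i + 1; j < n; j++) {
            if (!toy_page_scannable(&pages[j]))
                continue;
            if (memcmp(pages[i].data, pages[j].data, TOY_PAGE_SIZE) == 0)
                pairs++;
        }
    }
    return pairs;
}
```

Without the toy_page_scannable() guard, the memcmp() would dereference the
NULL data pointer of a device-private entry, which is the toy equivalent of
the bad data access reported above.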

> In any case, we wouldn't want secure guest pages to be pulled out due
> to KSM, hence disabled merging.

Sure, I don't disagree with that, but I worry that we are papering
over a bug here.

Paul.
