From: "Zhang,Yi" <yi.z.zhang@linux.intel.com>
To: David Hildenbrand <david@redhat.com>,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-nvdimm@lists.01.org, pbonzini@redhat.com,
	dan.j.williams@intel.com, jack@suse.cz, hch@lst.de,
	yu.c.zhang@intel.com
Cc: linux-mm@kvack.org, yi.z.zhang@intel.com, rkrcmar@redhat.com
Subject: Re: [PATCH V3 0/4] Fix kvm misconceives NVDIMM pages as reserved mmio
Date: Tue, 14 Aug 2018 01:25:38 +0800
Message-ID: <0f4f0d15-7949-c576-1981-145e7758ae4a@linux.intel.com>
In-Reply-To: <76cbaf38-1c72-0b45-4075-add904226725@redhat.com>



On 2018-08-10 21:27, David Hildenbrand wrote:
> On 09.08.2018 12:52, Zhang Yi wrote:
>> For device-specific memory space, when we move these pfn ranges into a
>> memory zone, we set the page reserved flag at that time. Some of these
>> pages are reserved for device MMIO, and some are not, such as NVDIMM
>> pmem.
>>
>> Now we map these dev_dax or fs_dax pages to KVM as a DIMM/NVDIMM
>> backend. Since these pages are reserved, the check in
>> kvm_is_reserved_pfn() misidentifies them as MMIO. Therefore, we
>> introduce two page map types, MEMORY_DEVICE_FS_DAX and
>> MEMORY_DEVICE_DEV_DAX, to identify pages that come from NVDIMM pmem
>> and let KVM treat them as normal pages.
>>
>> Without this patch, many operations are skipped because pmem pages are
>> mistreated. For example, a page may never be unpinned for the KVM guest
>> (in kvm_release_pfn_clean), and cannot be marked dirty/accessed (in
>> kvm_set_pfn_dirty/accessed).
>>
> I am right now looking into (and trying to better document) PG_reserved
> - and having a hard time :) .
>
> One of the main points about reserved pages is that the struct pages are
> not to be touched. See [1] (I know that statement is fairly old, but it
> resembles what PG_reserved is actually used for nowadays, with some
> exceptions unfortunately).
>
> Struct pages that are part of user space tables and are PG_reserved can
> indicate (as of now, according to my research):
> - MMIO pages
> - Selected MMAPed pages - e.g. vDSO
> - Zero page
> - PMEM pages as you correctly state
>
> So I wonder, if it is really the right approach to silently go ahead and
> treat reserved pages just like they would not be reserved. Maybe the
> right approach would rather be to do something about pmem pages being
> reserved. Yes, they are never to be given to the page allocator, but I
> wonder if PG_reserved is strictly needed for that.
>
> [1] https://lists.linuxcoding.com/kernel/2005-q3/msg10350.html

Thanks, David, for laying out the long history of the page reserved flag.
As things stand, I think we treat an NVDIMM as a device rather than as
DRAM, and it has its own device driver that manages its own device
memory. From this perspective, it is reasonable to set these pages up as
zone device memory and mark them with the reserved flag.
@Dan @Dave, what do you think about this?
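
For readers following the thread, here is a minimal user-space sketch (not
code from this series) of the behaviour under discussion. It assumes the
current check treats every PG_reserved page as MMIO, and that the proposed
check simply excludes DAX-backed pmem pages. All mock_* names and the
MOCK_* constants are hypothetical stand-ins, not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel types; not the real definitions. */
enum mock_memory_type {
	MOCK_MEMORY_NONE,		/* ordinary RAM or plain MMIO */
	MOCK_MEMORY_DEVICE_FS_DAX,	/* pmem behind a DAX filesystem */
	MOCK_MEMORY_DEVICE_DEV_DAX	/* pmem behind a device-DAX node */
};

struct mock_page {
	bool reserved;			/* models PG_reserved */
	enum mock_memory_type type;	/* models the page map type */
};

/* Models the existing check: any reserved page is treated as MMIO. */
static bool mock_is_reserved_before(const struct mock_page *page)
{
	/* A pfn without a struct page is always treated as reserved. */
	if (page == NULL)
		return true;
	return page->reserved;
}

/* Assumed helper: true when the page comes from NVDIMM pmem via DAX. */
static bool mock_is_dax_page(const struct mock_page *page)
{
	return page->type == MOCK_MEMORY_DEVICE_FS_DAX ||
	       page->type == MOCK_MEMORY_DEVICE_DEV_DAX;
}

/* Models the proposed check: reserved DAX pages count as normal pages. */
static bool mock_is_reserved_after(const struct mock_page *page)
{
	if (page == NULL)
		return true;
	return page->reserved && !mock_is_dax_page(page);
}

/* Usage: a reserved pmem page is MMIO before the change, normal after. */
void mock_demo(void)
{
	struct mock_page mmio = { true, MOCK_MEMORY_NONE };
	struct mock_page pmem = { true, MOCK_MEMORY_DEVICE_DEV_DAX };

	assert(mock_is_reserved_before(&pmem));
	assert(!mock_is_reserved_after(&pmem));
	assert(mock_is_reserved_after(&mmio));
}
```

The sketch also shows David's concern: the second check keeps PG_reserved
set on pmem pages and special-cases them in the caller, rather than
changing whether pmem pages are marked reserved in the first place.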

>
>> V1:
>> https://lkml.org/lkml/2018/7/4/91
>>
>> V2:
>> https://lkml.org/lkml/2018/7/10/135
>>
>> V3:
>> [PATCH V3 1/4] Needs Comments.
>> [PATCH V3 2/4] Update the description of MEMORY_DEVICE_DEV_DAX: Jan
>> [PATCH V3 3/4] Acked-by: Jan in V2
>> [PATCH V3 4/4] Needs Comments.
>>
>> Zhang Yi (4):
>>   kvm: remove redundant reserved page check
>>   mm: introduce memory type MEMORY_DEVICE_DEV_DAX
>>   mm: add a function to differentiate the pages is from DAX device
>>     memory
>>   kvm: add a check if pfn is from NVDIMM pmem.
>>
>>  drivers/dax/pmem.c       |  1 +
>>  include/linux/memremap.h |  8 ++++++++
>>  include/linux/mm.h       | 12 ++++++++++++
>>  virt/kvm/kvm_main.c      | 16 ++++++++--------
>>  4 files changed, 29 insertions(+), 8 deletions(-)
>>
>

_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

Thread overview: 48+ messages
2018-08-09 10:52 [PATCH V3 0/4] Fix kvm misconceives NVDIMM pages as reserved mmio Zhang Yi
2018-08-09 10:52 ` Zhang Yi
2018-08-09 10:52 ` Zhang Yi
2018-08-09  9:02 ` Jan Kara
2018-08-09  9:02   ` Jan Kara
2018-08-09  9:02   ` Jan Kara
2018-08-13 17:33   ` Zhang,Yi
2018-08-13 17:33     ` Zhang,Yi
2018-08-13 17:33     ` Zhang,Yi
2018-08-13 17:33     ` Zhang,Yi
2018-08-09 10:52 ` [PATCH V3 1/4] kvm: remove redundant reserved page check Zhang Yi
2018-08-09 10:52   ` Zhang Yi
2018-08-09 10:52   ` Zhang Yi
2018-08-09  9:13   ` Pankaj Gupta
2018-08-09  9:13     ` Pankaj Gupta
2018-08-09  9:13     ` Pankaj Gupta
2018-08-10 11:23   ` David Hildenbrand
2018-08-09 10:53 ` [PATCH V3 2/4] mm: introduce memory type MEMORY_DEVICE_DEV_DAX Zhang Yi
2018-08-09 10:53   ` Zhang Yi
2018-08-09  8:59   ` Jan Kara
2018-08-09  8:59     ` Jan Kara
2018-08-09  8:59     ` Jan Kara
2018-08-09 10:53 ` [PATCH V3 3/4] mm: add a function to differentiate the pages is from DAX device memory Zhang Yi
2018-08-09 10:53   ` Zhang Yi
2018-08-09 10:53   ` Zhang Yi
2018-08-09  9:23   ` Pankaj Gupta
2018-08-09  9:23     ` Pankaj Gupta
2018-08-13 17:41     ` Zhang,Yi
2018-08-13 17:41       ` Zhang,Yi
2018-08-13 17:41       ` Zhang,Yi
2018-08-13 17:41       ` Zhang,Yi
2018-08-13 14:29       ` Jerome Glisse
2018-08-13 14:29         ` Jerome Glisse
2018-08-13 14:29         ` Jerome Glisse
2018-08-09 10:53 ` [PATCH V3 4/4] kvm: add a check if pfn is from NVDIMM pmem Zhang Yi
2018-08-09 10:53   ` Zhang Yi
2018-08-09 10:53   ` Zhang Yi
2018-08-09  8:32   ` Pankaj Gupta
2018-08-13 17:32     ` Zhang,Yi
2018-08-13 17:32       ` Zhang,Yi
2018-08-13 17:32       ` Zhang,Yi
2018-08-13 17:32       ` Zhang,Yi
2018-08-10 13:27 ` [PATCH V3 0/4] Fix kvm misconceives NVDIMM pages as reserved mmio David Hildenbrand
2018-08-10 13:27   ` David Hildenbrand
2018-08-13 17:25   ` Zhang,Yi [this message]
2018-08-13 17:25     ` Zhang,Yi
2018-08-13 17:25     ` Zhang,Yi
2018-08-13 17:25     ` Zhang,Yi