From: Jia He <justin.he@arm.com>
To: Dan Williams <dan.j.williams@intel.com>,
	Vishal Verma <vishal.l.verma@intel.com>,
	Mike Rapoport <rppt@linux.ibm.com>,
	David Hildenbrand <david@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Steve Capper <steve.capper@arm.com>,
	Mark Rutland <mark.rutland@arm.com>,
	Anshuman Khandual <anshuman.khandual@arm.com>,
	Hsin-Yi Wang <hsinyi@chromium.org>,
	Jason Gunthorpe <jgg@ziepe.ca>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Kees Cook <keescook@chromium.org>,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, linux-nvdimm@lists.01.org,
	linux-mm@kvack.org, Pankaj Gupta <pankaj.gupta.linux@gmail.com>,
	Kaly Xin <Kaly.Xin@arm.com>, Jia He <justin.he@arm.com>
Subject: [RFC PATCH 0/6] decrease unnecessary gap due to pmem kmem alignment
Date: Wed, 29 Jul 2020 11:34:18 +0800
Message-ID: <20200729033424.2629-1-justin.he@arm.com>

When enabling dax pmem as a RAM device on arm64, I noticed that the kmem_start
address in dev_dax_kmem_probe() must be aligned to SECTION_SIZE_BITS (30), i.e.
the 1G memblock size. Even though Dan Williams' sub-section patch series [1] has
been merged upstream, it does not help because of this hard limitation on
kmem_start:
$ ndctl create-namespace -e namespace0.0 --mode=devdax --map=dev -s 2g -f -a 2M
$ echo dax0.0 > /sys/bus/dax/drivers/device_dax/unbind
$ echo dax0.0 > /sys/bus/dax/drivers/kmem/new_id
$ cat /proc/iomem
...
23c000000-23fffffff : System RAM
  23dd40000-23fecffff : reserved
  23fed0000-23fffffff : reserved
240000000-33fdfffff : Persistent Memory
  240000000-2403fffff : namespace0.0
  280000000-2bfffffff : dax0.0          <- aligned with 1G boundary
    280000000-2bfffffff : System RAM
Hence there is a big gap between 0x2403fffff and 0x280000000 due to the 1G
alignment.
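
The size of that gap can be checked with a little arithmetic. The sketch
below is only an illustration and is not code from this series: align_up()
and alignment_gap() are hypothetical helper names, and it assumes the first
usable byte is the one right after namespace0.0, i.e. 0x240400000.

```c
#include <assert.h>
#include <stdint.h>

/* 1G memory block size on arm64 (SECTION_SIZE_BITS = 30), as stated above. */
#define MEMBLOCK_SIZE (UINT64_C(1) << 30)

/* Round addr up to the next multiple of align; align must be a power of two. */
static inline uint64_t align_up(uint64_t addr, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr + align - 1) & ~(align - 1);
}

/*
 * Bytes skipped between the first free byte and the aligned start.
 * Hypothetical helper, for illustration only.
 */
static inline uint64_t alignment_gap(uint64_t first_free, uint64_t align)
{
	return align_up(first_free, align) - first_free;
}
```

With first_free = 0x240400000 and MEMBLOCK_SIZE, align_up() returns
0x280000000 and alignment_gap() returns 0x3fc00000, i.e. 1020M of the 2G
namespace is skipped.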
 
Without this series, if qemu creates a 4G nvdimm device, we can use only
2G for dax pmem (kmem) in the worst case.
e.g.
240000000-33fdfffff : Persistent Memory
We can only use the memblock between [240000000, 2ffffffff] due to the hard
limitation. This wastes too much memory.

Decreasing SECTION_SIZE_BITS on arm64 might be an alternative, but it raises
too many concerns about other constraints, e.g. PAGE_SIZE, hugetlb,
SPARSEMEM_VMEMMAP, page bits in struct page ...

Besides decreasing SECTION_SIZE_BITS, we can also relax the requirement that
kmem be aligned to memory_block_size_bytes().
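
To get a feel for what a relaxed alignment could recover, the sketch below
compares the usable size of the 4G example region under 1G alignment and
under a finer one. It is an illustration under stated assumptions, not the
logic of the patches: usable_bytes() is a hypothetical helper, 2M is an
assumed relaxed alignment (chosen to match the "-a 2M" used above), and the
data is assumed to start at 0x240400000 as in the first listing.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_2M (UINT64_C(1) << 21) /* assumed relaxed alignment, illustration only */
#define SZ_1G (UINT64_C(1) << 30) /* memory block size on arm64, per the text */

/*
 * Bytes of [start, end_inclusive] that stay usable once the start is rounded
 * up and the end is rounded down to align (a power of two).
 */
static inline uint64_t usable_bytes(uint64_t start, uint64_t end_inclusive,
				    uint64_t align)
{
	uint64_t first, limit;

	assert(align != 0 && (align & (align - 1)) == 0);
	first = (start + align - 1) & ~(align - 1);
	limit = (end_inclusive + 1) & ~(align - 1);
	return limit > first ? limit - first : 0;
}
```

For the region ending at 0x33fdfffff, usable_bytes(0x240400000, 0x33fdfffff,
SZ_1G) gives 0x80000000 (2G), while the same call with SZ_2M gives
0xffa00000 (4090M). Starting from 0x240000000 with SZ_1G gives 0xc0000000,
which corresponds to the [240000000, 2ffffffff] range quoted above.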

Tested on arm64 and x86 guests, with qemu creating a 4G pmem device. dax pmem
can be used as RAM with a smaller gap. kmem hotplug add and remove were both
tested on the arm64 and x86 guests as well.

This patch series (mainly patch 6/6) is based on the fix patch, ~v5.8-rc5 [2].

[1] https://lkml.org/lkml/2019/6/19/67
[2] https://lkml.org/lkml/2020/7/8/1546

Jia He (6):
  mm/memory_hotplug: remove redundant memory block size alignment check
  resource: export find_next_iomem_res() helper
  mm/memory_hotplug: allow pmem kmem not to align with memory_block_size
  mm/page_alloc: adjust the start,end in dax pmem kmem case
  device-dax: relax the memblock size alignment for kmem_start
  arm64: fall back to vmemmap_populate_basepages if not aligned with
    PMD_SIZE

 arch/arm64/mm/mmu.c    |  4 ++++
 drivers/base/memory.c  | 24 ++++++++++++++++--------
 drivers/dax/kmem.c     | 22 +++++++++++++---------
 include/linux/ioport.h |  3 +++
 kernel/resource.c      |  3 ++-
 mm/memory_hotplug.c    | 39 ++++++++++++++++++++++++++++++++++++++-
 mm/page_alloc.c        | 14 ++++++++++++++
 7 files changed, 90 insertions(+), 19 deletions(-)

-- 
2.17.1
