linux-mm.kvack.org archive mirror
From: Jia He <justin.he@arm.com>
To: Dan Williams <dan.j.williams@intel.com>,
	Vishal Verma <vishal.l.verma@intel.com>,
	Mike Rapoport <rppt@linux.ibm.com>,
	David Hildenbrand <david@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Dave Jiang <dave.jiang@intel.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Steve Capper <steve.capper@arm.com>,
	Mark Rutland <mark.rutland@arm.com>,
	Logan Gunthorpe <logang@deltatee.com>,
	Anshuman Khandual <anshuman.khandual@arm.com>,
	Hsin-Yi Wang <hsinyi@chromium.org>,
	Jason Gunthorpe <jgg@ziepe.ca>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Kees Cook <keescook@chromium.org>,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, linux-nvdimm@lists.01.org,
	linux-mm@kvack.org, Wei Yang <richardw.yang@linux.intel.com>,
	Pankaj Gupta <pankaj.gupta.linux@gmail.com>,
	Ira Weiny <ira.weiny@intel.com>, Kaly Xin <Kaly.Xin@arm.com>,
	Jia He <justin.he@arm.com>
Subject: [RFC PATCH 3/6] mm/memory_hotplug: allow pmem kmem not to align with memory_block_size
Date: Wed, 29 Jul 2020 11:34:21 +0800
Message-ID: <20200729033424.2629-4-justin.he@arm.com>
In-Reply-To: <20200729033424.2629-1-justin.he@arm.com>

When dax pmem is probed as a RAM device on arm64, kmem_start in
dev_dax_kmem_probe() previously had to be aligned to the 1G memory block
size, because SECTION_SIZE_BITS is 30 on arm64.

There is some metadata at the beginning/end of the iomem space, e.g.
namespace info and the nvdimm label:
240000000-33fdfffff : Persistent Memory
  240000000-2403fffff : namespace0.0
  280000000-2bfffffff : dax0.0
    280000000-2bfffffff : System RAM

As a result, neither the start address nor the end address of the kmem
space is aligned to memory_block_size. Rounding both to block boundaries
leaves a large gap when kmem is added as memory blocks, which wastes a lot
of memory.
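
To make the size of the gap concrete, the following is a minimal user-space
sketch (not kernel code) that does the arithmetic on the ranges from the
listing above. The names range_size(), align_up(), ns_meta and
unused_bytes() are hypothetical helpers introduced only for this
illustration; the addresses are copied from the listing.

```c
#include <assert.h>
#include <stdint.h>

#define GiB (1ULL << 30)	/* 1G memory block size on arm64 in this example */

/* Inclusive address range, as printed in /proc/iomem. */
struct range {
	uint64_t start;
	uint64_t end;
};

/* Ranges copied from the listing above. */
static const struct range pmem    = { 0x240000000ULL, 0x33fdfffffULL };
static const struct range ns_meta = { 0x240000000ULL, 0x2403fffffULL };
static const struct range sysram  = { 0x280000000ULL, 0x2bfffffffULL };

/* Size in bytes of an inclusive range. */
static uint64_t range_size(struct range r)
{
	assert(r.end >= r.start);
	return r.end - r.start + 1;
}

/* Round addr up to a power-of-two alignment. */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
	assert(align && (align & (align - 1)) == 0);
	return (addr + align - 1) & ~(align - 1);
}

/* Bytes of the Persistent Memory range that did not become System RAM. */
static uint64_t unused_bytes(void)
{
	return range_size(pmem) - range_size(sysram);
}
```

With these numbers, the first address after the namespace metadata
(0x240400000) rounds up to 0x280000000, which matches the start of dax0.0 in
the listing, and only 1024 MiB of the 4094 MiB Persistent Memory range ends
up as System RAM.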

This patch relaxes the alignment check for dax pmem kmem in the memory
block online/offline paths.
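
The pfn adjustment used in those paths can be modelled in user space as
below. This is a sketch under stated assumptions, not the patch itself:
PAGE_SHIFT is assumed to be 12 (4 KiB pages), PFN_UP()/PFN_DOWN() are local
re-definitions that mirror the kernel macros of the same name, and
trim_to_resource() is a hypothetical helper. It clamps both ends of a block
to the resource, whereas the patch adjusts only the start in
valid_zones_show() and takes both bounds from the resource in
__offline_pages().

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT	12	/* assumption: 4 KiB pages */
#define PAGE_SIZE	(1ULL << PAGE_SHIFT)
/* Local stand-ins mirroring the kernel's PFN_UP()/PFN_DOWN(). */
#define PFN_UP(x)	(((x) + PAGE_SIZE - 1) >> PAGE_SHIFT)
#define PFN_DOWN(x)	((x) >> PAGE_SHIFT)

/* Page frame range; end_pfn is exclusive. */
struct pfn_range {
	uint64_t start_pfn;
	uint64_t end_pfn;
};

/*
 * Shrink a memory block's pfn range to the part covered by the inclusive
 * byte range [res_start, res_end]. Partial pages at either edge are dropped.
 */
static struct pfn_range trim_to_resource(struct pfn_range blk,
					 uint64_t res_start, uint64_t res_end)
{
	uint64_t lo, hi;

	assert(res_end >= res_start);
	lo = PFN_UP(res_start);
	hi = PFN_DOWN(res_end + 1);

	if (lo > blk.start_pfn)
		blk.start_pfn = lo;
	if (hi < blk.end_pfn)
		blk.end_pfn = hi;
	/* No overlap: return an empty range. */
	if (blk.end_pfn < blk.start_pfn)
		blk.end_pfn = blk.start_pfn;
	return blk;
}
```

For example, a 1G block covering 0x240000000-0x27fffffff with a resource
that starts at 0x240400000 is trimmed to start_pfn 0x240400 while end_pfn
stays at 0x280000.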

Signed-off-by: Jia He <justin.he@arm.com>
---
 drivers/base/memory.c | 16 ++++++++++++++++
 mm/memory_hotplug.c   | 39 ++++++++++++++++++++++++++++++++++++++-
 2 files changed, 54 insertions(+), 1 deletion(-)

diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index 4a1691664c6c..3d2a94f3b1d9 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -334,6 +334,22 @@ static ssize_t valid_zones_show(struct device *dev,
 	 * online nodes otherwise the page_zone is not reliable
 	 */
 	if (mem->state == MEM_ONLINE) {
+#ifdef CONFIG_ZONE_DEVICE
+		struct resource res;
+		int ret;
+
+		/* adjust start_pfn for dax pmem kmem */
+		ret = find_next_iomem_res(start_pfn << PAGE_SHIFT,
+					((start_pfn + nr_pages) << PAGE_SHIFT) - 1,
+					IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY,
+					IORES_DESC_PERSISTENT_MEMORY,
+					false, &res);
+		if (!ret && PFN_UP(res.start) > start_pfn) {
+			nr_pages -= PFN_UP(res.start) - start_pfn;
+			start_pfn = PFN_UP(res.start);
+		}
+#endif
+
 		/*
 		 * The block contains more than one zone can not be offlined.
 		 * This can happen e.g. for ZONE_DMA and ZONE_DMA32
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index a53103dc292b..25745f67b680 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -999,6 +999,20 @@ int try_online_node(int nid)
 
 static int check_hotplug_memory_range(u64 start, u64 size)
 {
+#ifdef CONFIG_ZONE_DEVICE
+	struct resource res;
+	int ret;
+
+	/* Allow pmem kmem not to align with block size */
+	ret = find_next_iomem_res(start, start + size - 1,
+				IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY,
+				IORES_DESC_PERSISTENT_MEMORY,
+				false, &res);
+	if (!ret) {
+		return 0;
+	}
+#endif
+
 	/* memory range must be block size aligned */
 	if (!size || !IS_ALIGNED(start, memory_block_size_bytes()) ||
 	    !IS_ALIGNED(size, memory_block_size_bytes())) {
@@ -1481,19 +1495,42 @@ static int __ref __offline_pages(unsigned long start_pfn,
 	mem_hotplug_begin();
 
 	/*
-	 * Don't allow to offline memory blocks that contain holes.
+	 * Don't allow to offline memory blocks that contain holes except
+	 * for pmem.
 	 * Consequently, memory blocks with holes can never get onlined
 	 * via the hotplug path - online_pages() - as hotplugged memory has
 	 * no holes. This way, we e.g., don't have to worry about marking
 	 * memory holes PG_reserved, don't need pfn_valid() checks, and can
 	 * avoid using walk_system_ram_range() later.
+	 * When dax pmem is used as RAM (kmem), holes at the beginning are
+	 * allowed.
 	 */
 	walk_system_ram_range(start_pfn, end_pfn - start_pfn, &nr_pages,
 			      count_system_ram_pages_cb);
 	if (nr_pages != end_pfn - start_pfn) {
+#ifdef CONFIG_ZONE_DEVICE
+		struct resource res;
+
+		/* Allow pmem kmem not to align with block size */
+		ret = find_next_iomem_res(start_pfn << PAGE_SHIFT,
+					(end_pfn << PAGE_SHIFT) - 1,
+					IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY,
+					IORES_DESC_PERSISTENT_MEMORY,
+					false, &res);
+		if (ret) {
+			ret = -EINVAL;
+			reason = "memory holes";
+			goto failed_removal;
+		}
+
+		/* adjust start_pfn for dax pmem kmem */
+		start_pfn = PFN_UP(res.start);
+		end_pfn = PFN_DOWN(res.end + 1);
+#else
 		ret = -EINVAL;
 		reason = "memory holes";
 		goto failed_removal;
+#endif
 	}
 
 	/* This makes hotplug much easier...and readable.
-- 
2.17.1


