From: Jia He <justin.he@arm.com>
To: Dan Williams <dan.j.williams@intel.com>,
	Vishal Verma <vishal.l.verma@intel.com>,
	Mike Rapoport <rppt@linux.ibm.com>,
	David Hildenbrand <david@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Dave Jiang <dave.jiang@intel.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Steve Capper <steve.capper@arm.com>,
	Mark Rutland <mark.rutland@arm.com>,
	Logan Gunthorpe <logang@deltatee.com>,
	Anshuman Khandual <anshuman.khandual@arm.com>,
	Hsin-Yi Wang <hsinyi@chromium.org>,
	Jason Gunthorpe <jgg@ziepe.ca>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Kees Cook <keescook@chromium.org>,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, linux-nvdimm@lists.01.org,
	linux-mm@kvack.org, Wei Yang <richardw.yang@linux.intel.com>,
	Pankaj Gupta <pankaj.gupta.linux@gmail.com>,
	Ira Weiny <ira.weiny@intel.com>, Kaly Xin <Kaly.Xin@arm.com>,
	Jia He <justin.he@arm.com>
Subject: [RFC PATCH 5/6] device-dax: relax the memblock size alignment for kmem_start
Date: Wed, 29 Jul 2020 11:34:23 +0800
Message-ID: <20200729033424.2629-6-justin.he@arm.com>
In-Reply-To: <20200729033424.2629-1-justin.he@arm.com>

Previously, kmem_start in dev_dax_kmem_probe() had to be aligned to the
memory block size, which is 1G on arm64 (SECTION_SIZE_BITS is 30). Even
with Dan Williams' sub-section patch series, this alignment still applies
when the dax pmem kmem range is added as memory blocks:
$ ndctl create-namespace -e namespace0.0 --mode=devdax --map=dev -s 2g -f -a 2M
$ echo dax0.0 > /sys/bus/dax/drivers/device_dax/unbind
$ echo dax0.0 > /sys/bus/dax/drivers/kmem/new_id
$ cat /proc/iomem
...
23c000000-23fffffff : System RAM
  23dd40000-23fecffff : reserved
  23fed0000-23fffffff : reserved
240000000-33fdfffff : Persistent Memory
  240000000-2403fffff : namespace0.0
  280000000-2bfffffff : dax0.0          <- boundaries are aligned to 1G
    280000000-2bfffffff : System RAM (kmem)
$ lsmem
RANGE                                 SIZE  STATE REMOVABLE BLOCK
0x0000000040000000-0x000000023fffffff   8G online       yes   1-8
0x0000000280000000-0x00000002bfffffff   1G online       yes    10

Memory block size:         1G
Total online memory:       9G
Total offline memory:      0B
...
Hence there is a big gap between 0x2403fffff and 0x280000000 due to the 1G
alignment on arm64. Moreover, only 1G of memory is added although 2G was
requested.

On x86 the gap is relatively small because SECTION_SIZE_BITS is 27 (128M).
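
To make the size of the gap concrete, the block-aligned computation that
this patch replaces can be reproduced in userspace. This is only a sketch:
the struct and function names are hypothetical, and the resource range in
the note below (start 0x242400000, size 0x7dc00000) is an assumption
inferred from the "after" /proc/iomem output further down, not a value
stated in this message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the range that would be hot-added. */
struct blk_range {
	uint64_t start;
	uint64_t size;
};

/*
 * Mirror of the old logic: round the start up to the memory block size,
 * shrink the size by the amount skipped, then round the size down to
 * whole blocks.
 */
static struct blk_range sketch_block_aligned_range(uint64_t res_start,
						   uint64_t res_size,
						   uint64_t block_size)
{
	struct blk_range r;
	uint64_t skipped;

	/* The mask arithmetic below only works for powers of two. */
	assert(block_size && (block_size & (block_size - 1)) == 0);

	r.start = (res_start + block_size - 1) & ~(block_size - 1);
	skipped = r.start - res_start;

	/* Guard added for this sketch: nothing is left if the whole
	 * resource is skipped. */
	if (skipped >= res_size)
		r.size = 0;
	else
		r.size = (res_size - skipped) & ~(block_size - 1);
	return r;
}
```

With the assumed range and a 1G block size, the start moves up to
0x280000000 and only 0x40000000 (1G) remains, which matches the lsmem
output above. With a 128M block size (assuming the x86 memory block size
equals the section size), the start would move to 0x248000000 and
0x78000000 (1920M) would remain.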

Besides decreasing SECTION_SIZE_BITS on arm64, we can relax the alignment
when adding the kmem.
After this patch:
240000000-33fdfffff : Persistent Memory
  240000000-2421fffff : namespace0.0
  242400000-2bfffffff : dax0.0
    242400000-2bfffffff : System RAM (kmem)
$ lsmem
RANGE                                 SIZE  STATE REMOVABLE BLOCK
0x0000000040000000-0x00000002bfffffff  10G online       yes  1-10

Memory block size:         1G
Total online memory:      10G
Total offline memory:      0B

Note that blocks 9-10 are the newly hot-added ones.

This patch removes the strict alignment to memory_block_size_bytes(), but
still keeps the alignment constraint from online_pages_range().
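
The relaxed computation can be checked the same way in userspace. The
sketch below assumes PAGE_SHIFT = 12 and MAX_ORDER = 11 (typical for arm64
with 4K pages, but both are config dependent), uses a simple loop as a
stand-in for get_order(), and all names prefixed with sketch_ or SKETCH_
are hypothetical. The resource range in the note below is again an
assumption inferred from the /proc/iomem output above.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for arm64 with 4K pages; both are config dependent. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_MAX_ORDER  11

/* Hypothetical container for the range that would be hot-added. */
struct kmem_range {
	uint64_t start;
	uint64_t size;
};

/*
 * Stand-in for min(MAX_ORDER - 1, get_order(size)): the smallest order
 * whose span covers size, capped at MAX_ORDER - 1.
 */
static int sketch_clamped_order(uint64_t size)
{
	int order = 0;

	assert(size > 0);
	while (order < SKETCH_MAX_ORDER - 1 &&
	       ((uint64_t)1 << (order + SKETCH_PAGE_SHIFT)) < size)
		order++;
	return order;
}

static struct kmem_range sketch_relaxed_range(uint64_t res_start,
					      uint64_t res_size)
{
	int order = sketch_clamped_order(res_size);
	uint64_t align = (uint64_t)1 << (order + SKETCH_PAGE_SHIFT);
	struct kmem_range r;
	uint64_t skipped;

	/* Round the start up to the order-sized boundary. */
	r.start = (res_start + align - 1) & ~(align - 1);
	skipped = r.start - res_start;

	/* Guard added for this sketch; the patch itself has no such check. */
	if (skipped >= res_size)
		r.size = 0;
	else
		r.size = (res_size - skipped) & ~(align - 1);
	return r;
}
```

For an assumed resource starting at 0x242400000 with size 0x7dc00000, the
order is capped at 10, so the alignment is 4M. The start and size are
already 4M aligned and stay unchanged, giving the range
242400000-2bfffffff shown in the "after" output.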

Signed-off-by: Jia He <justin.he@arm.com>
---
 drivers/dax/kmem.c | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c
index d77786dc0d92..849d0706dfe0 100644
--- a/drivers/dax/kmem.c
+++ b/drivers/dax/kmem.c
@@ -30,9 +30,20 @@ int dev_dax_kmem_probe(struct device *dev)
 	const char *new_res_name;
 	int numa_node;
 	int rc;
+	int order;
 
-	/* Hotplug starting at the beginning of the next block: */
-	kmem_start = ALIGN(res->start, memory_block_size_bytes());
+	/* kmem_start needn't be aligned with memory_block_size_bytes().
+	 * But given the constraint in online_pages_range(), adjust the
+	 * alignment of kmem_start and kmem_size
+	 */
+	kmem_size = resource_size(res);
+	order = min_t(int, MAX_ORDER - 1, get_order(kmem_size));
+	kmem_start = ALIGN(res->start, 1ul << (order + PAGE_SHIFT));
+	/* Adjust the size down to compensate for moving up kmem_start: */
+	kmem_size -= kmem_start - res->start;
+	/* Align the size down to cover only complete blocks: */
+	kmem_size &= ~((1ul << (order + PAGE_SHIFT)) - 1);
+	kmem_end = kmem_start + kmem_size;
 
 	/*
 	 * Ensure good NUMA information for the persistent memory.
@@ -48,13 +59,6 @@ int dev_dax_kmem_probe(struct device *dev)
 			numa_node, res);
 	}
 
-	kmem_size = resource_size(res);
-	/* Adjust the size down to compensate for moving up kmem_start: */
-	kmem_size -= kmem_start - res->start;
-	/* Align the size down to cover only complete blocks: */
-	kmem_size &= ~(memory_block_size_bytes() - 1);
-	kmem_end = kmem_start + kmem_size;
-
 	new_res_name = kstrdup(dev_name(dev), GFP_KERNEL);
 	if (!new_res_name)
 		return -ENOMEM;
-- 
2.17.1
