All of lore.kernel.org
 help / color / mirror / Atom feed
From: Petr Tesarik <ptesarik@suse.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	linux-kernel@vger.kernel.org,
	Mel Gorman <mgorman@techsingularity.net>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Kemi Wang <kemi.wang@intel.com>,
	YASUAKI ISHIMATSU <yasu.isimatu@gmail.com>,
	Andrey Ryabinin <aryabinin@virtuozzo.com>,
	Nikolay Borisov <nborisov@suse.com>
Subject: [PATCH v2] Fix explanation of lower bits in the SPARSEMEM mem_map pointer
Date: Thu, 25 Jan 2018 10:05:16 +0100	[thread overview]
Message-ID: <20180125100516.589ea6af@ezekiel.suse.cz> (raw)
In-Reply-To: <20180124145711.4f3f219d@ezekiel.suse.cz>

The comment is confusing. On the one hand, it refers to 32-bit
alignment (struct page alignment on 32-bit platforms), but this
would only guarantee that the 2 lowest bits must be zero. On the
other hand, it claims that at least 3 bits are available, and 3 bits
are actually used.

This is not broken, because there is a stronger alignment guarantee,
just less obvious. Let's fix the comment to make it clear how many
bits are available and why.

Although memmap arrays are allocated in various places, the
resulting pointer is encoded eventually, so I am adding a BUG_ON()
here to enforce at runtime that all expected bits are indeed
available.

I have also added a BUILD_BUG_ON to check that PFN_SECTION_SHIFT is
sufficient, because this part of the calculation can be easily
checked at build time.

Signed-off-by: Petr Tesarik <ptesarik@suse.com>
---
 include/linux/mmzone.h | 12 ++++++++++--
 mm/sparse.c            |  6 +++++-
 2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 67f2e3c38939..7522a6987595 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1166,8 +1166,16 @@ extern unsigned long usemap_size(void);
 
 /*
  * We use the lower bits of the mem_map pointer to store
- * a little bit of information.  There should be at least
- * 3 bits here due to 32-bit alignment.
+ * a little bit of information.  The pointer is calculated
+ * as mem_map - section_nr_to_pfn(pnum).  The result is
+ * aligned to the minimum alignment of the two values:
+ *   1. All mem_map arrays are page-aligned.
+ *   2. section_nr_to_pfn() always clears PFN_SECTION_SHIFT
+ *      lowest bits.  PFN_SECTION_SHIFT is arch-specific
+ *      (equal SECTION_SIZE_BITS - PAGE_SHIFT), and the
+ *      worst combination is powerpc with 256k pages,
+ *      which results in PFN_SECTION_SHIFT equal 6.
+ * To sum it up, at least 6 bits are available.
  */
 #define	SECTION_MARKED_PRESENT	(1UL<<0)
 #define SECTION_HAS_MEM_MAP	(1UL<<1)
diff --git a/mm/sparse.c b/mm/sparse.c
index 2609aba121e8..6b8b5e91ceef 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -264,7 +264,11 @@ unsigned long __init node_memmap_size_bytes(int nid, unsigned long start_pfn,
  */
 static unsigned long sparse_encode_mem_map(struct page *mem_map, unsigned long pnum)
 {
-	return (unsigned long)(mem_map - (section_nr_to_pfn(pnum)));
+	unsigned long coded_mem_map =
+		(unsigned long)(mem_map - (section_nr_to_pfn(pnum)));
+	BUILD_BUG_ON(SECTION_MAP_LAST_BIT > (1UL<<PFN_SECTION_SHIFT));
+	BUG_ON(coded_mem_map & ~SECTION_MAP_MASK);
+	return coded_mem_map;
 }
 
 /*
-- 
2.13.6

WARNING: multiple messages have this Message-ID
From: Petr Tesarik <ptesarik@suse.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	linux-kernel@vger.kernel.org,
	Mel Gorman <mgorman@techsingularity.net>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Kemi Wang <kemi.wang@intel.com>,
	YASUAKI ISHIMATSU <yasu.isimatu@gmail.com>,
	Andrey Ryabinin <aryabinin@virtuozzo.com>,
	Nikolay Borisov <nborisov@suse.com>
Subject: [PATCH v2] Fix explanation of lower bits in the SPARSEMEM mem_map pointer
Date: Thu, 25 Jan 2018 10:05:16 +0100	[thread overview]
Message-ID: <20180125100516.589ea6af@ezekiel.suse.cz> (raw)
In-Reply-To: <20180124145711.4f3f219d@ezekiel.suse.cz>

The comment is confusing. On the one hand, it refers to 32-bit
alignment (struct page alignment on 32-bit platforms), but this
would only guarantee that the 2 lowest bits must be zero. On the
other hand, it claims that at least 3 bits are available, and 3 bits
are actually used.

This is not broken, because there is a stronger alignment guarantee,
just less obvious. Let's fix the comment to make it clear how many
bits are available and why.

Although memmap arrays are allocated in various places, the
resulting pointer is encoded eventually, so I am adding a BUG_ON()
here to enforce at runtime that all expected bits are indeed
available.

I have also added a BUILD_BUG_ON to check that PFN_SECTION_SHIFT is
sufficient, because this part of the calculation can be easily
checked at build time.

Signed-off-by: Petr Tesarik <ptesarik@suse.com>
---
 include/linux/mmzone.h | 12 ++++++++++--
 mm/sparse.c            |  6 +++++-
 2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 67f2e3c38939..7522a6987595 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1166,8 +1166,16 @@ extern unsigned long usemap_size(void);
 
 /*
  * We use the lower bits of the mem_map pointer to store
- * a little bit of information.  There should be at least
- * 3 bits here due to 32-bit alignment.
+ * a little bit of information.  The pointer is calculated
+ * as mem_map - section_nr_to_pfn(pnum).  The result is
+ * aligned to the minimum alignment of the two values:
+ *   1. All mem_map arrays are page-aligned.
+ *   2. section_nr_to_pfn() always clears PFN_SECTION_SHIFT
+ *      lowest bits.  PFN_SECTION_SHIFT is arch-specific
+ *      (equal SECTION_SIZE_BITS - PAGE_SHIFT), and the
+ *      worst combination is powerpc with 256k pages,
+ *      which results in PFN_SECTION_SHIFT equal 6.
+ * To sum it up, at least 6 bits are available.
  */
 #define	SECTION_MARKED_PRESENT	(1UL<<0)
 #define SECTION_HAS_MEM_MAP	(1UL<<1)
diff --git a/mm/sparse.c b/mm/sparse.c
index 2609aba121e8..6b8b5e91ceef 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -264,7 +264,11 @@ unsigned long __init node_memmap_size_bytes(int nid, unsigned long start_pfn,
  */
 static unsigned long sparse_encode_mem_map(struct page *mem_map, unsigned long pnum)
 {
-	return (unsigned long)(mem_map - (section_nr_to_pfn(pnum)));
+	unsigned long coded_mem_map =
+		(unsigned long)(mem_map - (section_nr_to_pfn(pnum)));
+	BUILD_BUG_ON(SECTION_MAP_LAST_BIT > (1UL<<PFN_SECTION_SHIFT));
+	BUG_ON(coded_mem_map & ~SECTION_MAP_MASK);
+	return coded_mem_map;
 }
 
 /*
-- 
2.13.6

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2018-01-25  9:05 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-01-19  7:09 [PATCH] " Petr Tesarik
2018-01-19  7:09 ` Petr Tesarik
2018-01-19 12:39 ` Michal Hocko
2018-01-19 12:39   ` Michal Hocko
2018-01-19 13:21   ` Petr Tesarik
2018-01-19 13:21     ` Petr Tesarik
2018-01-24 12:43     ` Michal Hocko
2018-01-24 12:43       ` Michal Hocko
2018-01-24 13:57       ` Petr Tesarik
2018-01-24 13:57         ` Petr Tesarik
2018-01-25  9:05         ` Petr Tesarik [this message]
2018-01-25  9:05           ` [PATCH v2] " Petr Tesarik
2018-01-25 11:46           ` Michal Hocko
2018-01-25 11:46             ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180125100516.589ea6af@ezekiel.suse.cz \
    --to=ptesarik@suse.com \
    --cc=akpm@linux-foundation.org \
    --cc=aryabinin@virtuozzo.com \
    --cc=hannes@cmpxchg.org \
    --cc=kemi.wang@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@techsingularity.net \
    --cc=mhocko@kernel.org \
    --cc=nborisov@suse.com \
    --cc=vbabka@suse.cz \
    --cc=yasu.isimatu@gmail.com \
    --subject='Re: [PATCH v2] Fix explanation of lower bits in the SPARSEMEM mem_map pointer' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.