From: Pavel Tatashin <pasha.tatashin@oracle.com>
To: linux-kernel@vger.kernel.org, sparclinux@vger.kernel.org,
	linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org,
	linux-s390@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	x86@kernel.org, kasan-dev@googlegroups.com,
	borntraeger@de.ibm.com, heiko.carstens@de.ibm.com,
	davem@davemloft.net, willy@infradead.org, mhocko@kernel.org,
	ard.biesheuvel@linaro.org, will.deacon@arm.com,
	catalin.marinas@arm.com, sam@ravnborg.org,
	mgorman@techsingularity.net, Steven.Sistare@oracle.com,
	daniel.m.jordan@oracle.com, bob.picco@oracle.com
Subject: [PATCH v7 07/11] sparc64: optimized struct page zeroing
Date: Mon, 28 Aug 2017 22:02:18 -0400
Message-ID: <1503972142-289376-8-git-send-email-pasha.tatashin@oracle.com>
In-Reply-To: <1503972142-289376-1-git-send-email-pasha.tatashin@oracle.com>

Add an optimized mm_zero_struct_page() so that struct pages are zeroed
without calling memset(). It does eight to ten regular 8-byte stores,
depending on the size of struct page. Because sizeof(struct page) is a
compile-time constant, the compiler optimizes out the conditions of the
switch() statement.

SPARC-M6 with 15T of memory, single-thread performance:

                               BASE            FIX  OPTIMIZED_FIX
        bootmem_init   28.440467985s   2.305674818s   2.305161615s
free_area_init_nodes  202.845901673s 225.343084508s 172.556506560s
                      --------------------------------------------
Total                 231.286369658s 227.648759326s 174.861668175s

BASE:  current linux
FIX:   This patch series without "optimized struct page zeroing"
OPTIMIZED_FIX: This patch series including the current patch.

bootmem_init() is where memory for struct pages is zeroed during
allocation. Note that about two seconds of the time spent in this function
is fixed: it does not grow with the amount of memory.

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Reviewed-by: Bob Picco <bob.picco@oracle.com>
---
 arch/sparc/include/asm/pgtable_64.h | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/arch/sparc/include/asm/pgtable_64.h b/arch/sparc/include/asm/pgtable_64.h
index 6fbd931f0570..cee5cc7ccc51 100644
--- a/arch/sparc/include/asm/pgtable_64.h
+++ b/arch/sparc/include/asm/pgtable_64.h
@@ -230,6 +230,36 @@ extern unsigned long _PAGE_ALL_SZ_BITS;
 extern struct page *mem_map_zero;
 #define ZERO_PAGE(vaddr)	(mem_map_zero)
 
+/* This macro must be updated when the size of struct page grows above 80
+ * or shrinks below 64.
+ * The idea is that the compiler optimizes out the switch() statement and
+ * leaves only the clrx instructions.
+ */
+#define	mm_zero_struct_page(pp) do {					\
+	unsigned long *_pp = (void *)(pp);				\
+									\
+	 /* Check that struct page is either 64, 72, or 80 bytes */	\
+	BUILD_BUG_ON(sizeof(struct page) & 7);				\
+	BUILD_BUG_ON(sizeof(struct page) < 64);				\
+	BUILD_BUG_ON(sizeof(struct page) > 80);				\
+									\
+	switch (sizeof(struct page)) {					\
+	case 80:							\
+		_pp[9] = 0;	/* fallthrough */			\
+	case 72:							\
+		_pp[8] = 0;	/* fallthrough */			\
+	default:							\
+		_pp[7] = 0;						\
+		_pp[6] = 0;						\
+		_pp[5] = 0;						\
+		_pp[4] = 0;						\
+		_pp[3] = 0;						\
+		_pp[2] = 0;						\
+		_pp[1] = 0;						\
+		_pp[0] = 0;						\
+	}								\
+} while (0)
+
 /* PFNs are real physical page numbers.  However, mem_map only begins to record
  * per-page information starting at pfn_base.  This is to handle systems where
  * the first physical page in the machine is at some huge physical address,
-- 
2.14.1

Thread overview: 75+ messages
2017-08-29  2:02 [PATCH v7 00/11] complete deferred page initialization Pavel Tatashin
2017-08-29  2:02 ` [PATCH v7 01/11] x86/mm: setting fields in deferred pages Pavel Tatashin
2017-08-29  2:02 ` [PATCH v7 02/11] sparc64/mm: setting fields in deferred pages Pavel Tatashin
2017-08-30  1:09   ` David Miller
2017-08-29  2:02 ` [PATCH v7 03/11] mm: deferred_init_memmap improvements Pavel Tatashin
2017-08-29  2:02 ` [PATCH v7 04/11] sparc64: simplify vmemmap_populate Pavel Tatashin
2017-08-30  1:08   ` David Miller
2017-08-29  2:02 ` [PATCH v7 05/11] mm: defining memblock_virt_alloc_try_nid_raw Pavel Tatashin
2017-08-29  2:02 ` [PATCH v7 06/11] mm: zero struct pages during initialization Pavel Tatashin
2017-08-29  2:02 ` [PATCH v7 07/11] sparc64: optimized struct page zeroing Pavel Tatashin [this message]
2017-08-30  1:12   ` David Miller
2017-08-30 13:19     ` Pasha Tatashin
2017-08-30 17:46       ` David Miller
2017-08-29  2:02 ` [PATCH v7 08/11] mm: zero reserved and unavailable struct pages Pavel Tatashin
2017-08-30 21:22   ` kbuild test robot
2017-08-30 23:12   ` kbuild test robot
2017-08-29  2:02 ` [PATCH v7 09/11] x86/kasan: explicitly zero kasan shadow memory Pavel Tatashin
2017-08-29  2:02 ` [PATCH v7 10/11] arm64/kasan: explicitly zero kasan shadow memory Pavel Tatashin
2017-08-29  2:02 ` [PATCH v7 11/11] mm: stop zeroing memory during allocation in vmemmap Pavel Tatashin
