* [PATCH v3 0/7] Add memsetN functions
@ 2017-03-24 16:13 ` Matthew Wilcox
  0 siblings, 0 replies; 50+ messages in thread
From: Matthew Wilcox @ 2017-03-24 16:13 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-fbdev, linux-arch, linux-alpha, linux-arm-kernel, x86,
	linux-mips, linuxppc-dev, sparclinux, Minchan Kim,
	Matthew Wilcox

From: Matthew Wilcox <mawilcox@microsoft.com>

zram was recently enhanced to support compressing pages with a repeating
pattern up to the size of an unsigned long.  As part of the discussion,
we noted it would be nice if architectures had optimised routines
to fill regions of memory with patterns larger than those contained
in a single byte.  Our suspicions were right; the x86 version offers
approximately a 7% performance improvement over the C implementation.

The generic memfill() function is part of Lars Wirzenius' publib,
but it doesn't offer the most convenient interface.  I chose to add
five more-specific functions as part of this patchset -- memset16(),
memset32(), memset64(), memset_l() (long) and memset_p() (pointer).

It would be nice to have some more architectures implement optimised
memsetN calls.  It would also be nice to find more places in the kernel
which could benefit from calling these functions.  Maybe a coccinelle
script could be written to find such places?  We're looking for loops
over an array where the value being stored into the array does not depend
on the iteration variable.
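
A minimal sketch of the kind of loop meant here, written as userspace C
rather than kernel code.  The helper names (sketch_memset32,
fill_open_coded, fill_with_memset32) are hypothetical and exist only for
this illustration; sketch_memset32() copies the shape of the generic
memset32() from patch 1.  The point is that the stored value is
loop-invariant, so the whole loop collapses into one call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the generic memset32() added in patch 1. */
void *sketch_memset32(uint32_t *s, uint32_t v, size_t count)
{
	uint32_t *xs = s;

	while (count--)
		*xs++ = v;
	return s;
}

/* Before: open-coded fill; 'pattern' does not depend on 'i'. */
void fill_open_coded(uint32_t *table, size_t n, uint32_t pattern)
{
	size_t i;

	for (i = 0; i < n; i++)
		table[i] = pattern;
}

/* After: the same effect as a single call; 'n' counts elements. */
void fill_with_memset32(uint32_t *table, size_t n, uint32_t pattern)
{
	sketch_memset32(table, pattern, n);
}
```

Both helpers leave the array in the same state, so a conversion of this
kind should not change behaviour; whether it is faster depends on the
architecture providing an optimised memset32().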

Since v1 of the patchset, I stumbled on Alpha's memsetw() which
caused me to add memset16() to complete the set.  I removed the
'__HAVE_ARCH_MEMSET_PLUS' preprocessor symbol in favour of separate
MEMSET16 MEMSET32 and MEMSET64 symbols.  I also reviewed the scr_mem*w()
usages across the different architectures and implemented some obvious
missing optimisations.  Alpha is still missing scr_memmovew() as it
would be non-trivial to write.

Russell's review on patch 2 only applies to the memset32/memset64
implementation.  The memset16 is unreviewed (and, indeed, untested)
to date.

Matthew Wilcox (7):
  Add multibyte memset functions
  ARM: Implement memset16, memset32 & memset64
  x86: Implement memset16, memset32 & memset64
  alpha: Add support for memset16
  zram: Convert to using memset_l
  sym53c8xx_2: Convert to use memset32
  vga: Optimise console scrolling

 arch/alpha/include/asm/string.h     | 15 ++++----
 arch/alpha/include/asm/vga.h        |  2 +-
 arch/alpha/lib/memset.S             | 10 +++---
 arch/arm/include/asm/string.h       | 21 ++++++++++++
 arch/arm/kernel/armksyms.c          |  3 ++
 arch/arm/lib/memset.S               | 44 +++++++++++++++++++-----
 arch/mips/include/asm/vga.h         |  6 ++++
 arch/powerpc/include/asm/vga.h      |  8 +++++
 arch/sparc/include/asm/vga.h        | 24 +++++++++++++
 arch/x86/include/asm/string_32.h    | 24 +++++++++++++
 arch/x86/include/asm/string_64.h    | 36 ++++++++++++++++++++
 drivers/block/zram/zram_drv.c       | 15 ++------
 drivers/scsi/sym53c8xx_2/sym_hipd.c | 11 ++----
 include/linux/string.h              | 30 ++++++++++++++++
 include/linux/vt_buffer.h           | 12 +++++++
 lib/string.c                        | 68 +++++++++++++++++++++++++++++++++++++
 16 files changed, 287 insertions(+), 42 deletions(-)

-- 
2.11.0


* [PATCH v3 1/7] Add multibyte memset functions
  2017-03-24 16:13 ` Matthew Wilcox
  (?)
@ 2017-03-24 16:13   ` Matthew Wilcox
  -1 siblings, 0 replies; 50+ messages in thread
From: Matthew Wilcox @ 2017-03-24 16:13 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-fbdev, linux-arch, linux-alpha, linux-arm-kernel, x86,
	linux-mips, linuxppc-dev, sparclinux, Minchan Kim,
	Matthew Wilcox

From: Matthew Wilcox <mawilcox@microsoft.com>

memset16(), memset32() and memset64() are like memset(), but allow the
caller to fill the destination with a multibyte pattern.  memset_l()
and memset_p() allow the caller to use unsigned long and pointer
values respectively.  memset64() is currently only available on 64-bit
architectures.
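
As a rough illustration of the calling convention (a userspace sketch,
not the kernel code itself): the count argument is a number of elements,
not a number of bytes.  sketch_memset16() below is a hypothetical
stand-in with the same shape as the generic memset16() in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the generic memset16() in lib/string.c. */
void *sketch_memset16(uint16_t *s, uint16_t v, size_t count)
{
	uint16_t *xs = s;

	/* 'count' is the number of uint16_t values, so 2 * count bytes. */
	while (count--)
		*xs++ = v;
	return s;
}

/* Fill only the first 'used' slots of a buffer; the rest is untouched. */
void mark_used_slots(uint16_t *slots, size_t used)
{
	sketch_memset16(slots, 0x1234, used);
}
```

Calling mark_used_slots(buf, 3) on a zeroed five-element buffer writes
six bytes: the first three elements become 0x1234 and the last two stay
zero.  A plain memset() cannot produce this pattern because it repeats a
single byte.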

Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
---
 include/linux/string.h | 30 ++++++++++++++++++++++
 lib/string.c           | 68 ++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 98 insertions(+)

diff --git a/include/linux/string.h b/include/linux/string.h
index 26b6f6a66f83..b376875b650c 100644
--- a/include/linux/string.h
+++ b/include/linux/string.h
@@ -99,6 +99,36 @@ extern __kernel_size_t strcspn(const char *,const char *);
 #ifndef __HAVE_ARCH_MEMSET
 extern void * memset(void *,int,__kernel_size_t);
 #endif
+
+#ifndef __HAVE_ARCH_MEMSET16
+extern void *memset16(uint16_t *, uint16_t, __kernel_size_t);
+#endif
+
+#ifndef __HAVE_ARCH_MEMSET32
+extern void *memset32(uint32_t *, uint32_t, __kernel_size_t);
+#endif
+
+#ifndef __HAVE_ARCH_MEMSET64
+extern void *memset64(uint64_t *, uint64_t, __kernel_size_t);
+#endif
+
+static inline void *memset_l(unsigned long *p, unsigned long v,
+		__kernel_size_t n)
+{
+	if (BITS_PER_LONG == 32)
+		return memset32((uint32_t *)p, v, n);
+	else
+		return memset64((uint64_t *)p, v, n);
+}
+
+static inline void *memset_p(void **p, void *v, __kernel_size_t n)
+{
+	if (BITS_PER_LONG == 32)
+		return memset32((uint32_t *)p, (uintptr_t)v, n);
+	else
+		return memset64((uint64_t *)p, (uintptr_t)v, n);
+}
+
 #ifndef __HAVE_ARCH_MEMCPY
 extern void * memcpy(void *,const void *,__kernel_size_t);
 #endif
diff --git a/lib/string.c b/lib/string.c
index ed83562a53ae..f18ba402e503 100644
--- a/lib/string.c
+++ b/lib/string.c
@@ -697,6 +697,74 @@ void memzero_explicit(void *s, size_t count)
 }
 EXPORT_SYMBOL(memzero_explicit);
 
+#ifndef __HAVE_ARCH_MEMSET16
+/**
+ * memset16() - Fill a memory area with a uint16_t
+ * @s: Pointer to the start of the area.
+ * @v: The value to fill the area with
+ * @count: The number of values to store
+ *
+ * Differs from memset() in that it fills with a uint16_t instead
+ * of a byte.  Remember that @count is the number of uint16_ts to
+ * store, not the number of bytes.
+ */
+void *memset16(uint16_t *s, uint16_t v, size_t count)
+{
+	uint16_t *xs = s;
+
+	while (count--)
+		*xs++ = v;
+	return s;
+}
+EXPORT_SYMBOL(memset16);
+#endif
+
+#ifndef __HAVE_ARCH_MEMSET32
+/**
+ * memset32() - Fill a memory area with a uint32_t
+ * @s: Pointer to the start of the area.
+ * @v: The value to fill the area with
+ * @count: The number of values to store
+ *
+ * Differs from memset() in that it fills with a uint32_t instead
+ * of a byte.  Remember that @count is the number of uint32_ts to
+ * store, not the number of bytes.
+ */
+void *memset32(uint32_t *s, uint32_t v, size_t count)
+{
+	uint32_t *xs = s;
+
+	while (count--)
+		*xs++ = v;
+	return s;
+}
+EXPORT_SYMBOL(memset32);
+#endif
+
+#ifndef __HAVE_ARCH_MEMSET64
+#if BITS_PER_LONG > 32
+/**
+ * memset64() - Fill a memory area with a uint64_t
+ * @s: Pointer to the start of the area.
+ * @v: The value to fill the area with
+ * @count: The number of values to store
+ *
+ * Differs from memset() in that it fills with a uint64_t instead
+ * of a byte.  Remember that @count is the number of uint64_ts to
+ * store, not the number of bytes.
+ */
+void *memset64(uint64_t *s, uint64_t v, size_t count)
+{
+	uint64_t *xs = s;
+
+	while (count--)
+		*xs++ = v;
+	return s;
+}
+EXPORT_SYMBOL(memset64);
+#endif
+#endif
+
 #ifndef __HAVE_ARCH_MEMCPY
 /**
  * memcpy - Copy one area of memory to another
-- 
2.11.0


* [PATCH v3 1/7] Add multibyte memset functions
@ 2017-03-24 16:13   ` Matthew Wilcox
  0 siblings, 0 replies; 50+ messages in thread
From: Matthew Wilcox @ 2017-03-24 16:13 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-fbdev, linux-arch, linux-alpha, linux-arm-kernel, x86,
	linux-mips, linuxppc-dev, sparclinux, Minchan Kim,
	Matthew Wilcox

From: Matthew Wilcox <mawilcox@microsoft.com>

memset16(), memset32() and memset64() are like memset(), but allow the
caller to fill the destination with a multibyte pattern.  memset_l()
and memset_p() allow the caller to use unsigned long and pointer
values respectively.  memset64() is currently only available on 64-bit
architectures.

Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
---
 include/linux/string.h | 30 ++++++++++++++++++++++
 lib/string.c           | 68 ++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 98 insertions(+)

diff --git a/include/linux/string.h b/include/linux/string.h
index 26b6f6a66f83..b376875b650c 100644
--- a/include/linux/string.h
+++ b/include/linux/string.h
@@ -99,6 +99,36 @@ extern __kernel_size_t strcspn(const char *,const char *);
 #ifndef __HAVE_ARCH_MEMSET
 extern void * memset(void *,int,__kernel_size_t);
 #endif
+
+#ifndef __HAVE_ARCH_MEMSET16
+extern void *memset16(uint16_t *, uint16_t, __kernel_size_t);
+#endif
+
+#ifndef __HAVE_ARCH_MEMSET32
+extern void *memset32(uint32_t *, uint32_t, __kernel_size_t);
+#endif
+
+#ifndef __HAVE_ARCH_MEMSET64
+extern void *memset64(uint64_t *, uint64_t, __kernel_size_t);
+#endif
+
+static inline void *memset_l(unsigned long *p, unsigned long v,
+		__kernel_size_t n)
+{
+	if (BITS_PER_LONG == 32)
+		return memset32((uint32_t *)p, v, n);
+	else
+		return memset64((uint64_t *)p, v, n);
+}
+
+static inline void *memset_p(void **p, void *v, __kernel_size_t n)
+{
+	if (BITS_PER_LONG == 32)
+		return memset32((uint32_t *)p, (uintptr_t)v, n);
+	else
+		return memset64((uint64_t *)p, (uintptr_t)v, n);
+}
+
 #ifndef __HAVE_ARCH_MEMCPY
 extern void * memcpy(void *,const void *,__kernel_size_t);
 #endif
diff --git a/lib/string.c b/lib/string.c
index ed83562a53ae..f18ba402e503 100644
--- a/lib/string.c
+++ b/lib/string.c
@@ -697,6 +697,74 @@ void memzero_explicit(void *s, size_t count)
 }
 EXPORT_SYMBOL(memzero_explicit);
 
+#ifndef __HAVE_ARCH_MEMSET16
+/**
+ * memset16() - Fill a memory area with a uint16_t
+ * @s: Pointer to the start of the area.
+ * @v: The value to fill the area with
+ * @count: The number of values to store
+ *
+ * Differs from memset() in that it fills with a uint16_t instead
+ * of a byte.  Remember that @count is the number of uint16_ts to
+ * store, not the number of bytes.
+ */
+void *memset16(uint16_t *s, uint16_t v, size_t count)
+{
+	uint16_t *xs = s;
+
+	while (count--)
+		*xs++ = v;
+	return s;
+}
+EXPORT_SYMBOL(memset16);
+#endif
+
+#ifndef __HAVE_ARCH_MEMSET32
+/**
+ * memset32() - Fill a memory area with a uint32_t
+ * @s: Pointer to the start of the area.
+ * @v: The value to fill the area with
+ * @count: The number of values to store
+ *
+ * Differs from memset() in that it fills with a uint32_t instead
+ * of a byte.  Remember that @count is the number of uint32_ts to
+ * store, not the number of bytes.
+ */
+void *memset32(uint32_t *s, uint32_t v, size_t count)
+{
+	uint32_t *xs = s;
+
+	while (count--)
+		*xs++ = v;
+	return s;
+}
+EXPORT_SYMBOL(memset32);
+#endif
+
+#ifndef __HAVE_ARCH_MEMSET64
+#if BITS_PER_LONG > 32
+/**
+ * memset64() - Fill a memory area with a uint64_t
+ * @s: Pointer to the start of the area.
+ * @v: The value to fill the area with
+ * @count: The number of values to store
+ *
+ * Differs from memset() in that it fills with a uint64_t instead
+ * of a byte.  Remember that @count is the number of uint64_ts to
+ * store, not the number of bytes.
+ */
+void *memset64(uint64_t *s, uint64_t v, size_t count)
+{
+	uint64_t *xs = s;
+
+	while (count--)
+		*xs++ = v;
+	return s;
+}
+EXPORT_SYMBOL(memset64);
+#endif
+#endif
+
 #ifndef __HAVE_ARCH_MEMCPY
 /**
  * memcpy - Copy one area of memory to another
-- 
2.11.0



* [PATCH v3 2/7] ARM: Implement memset16, memset32 & memset64
  2017-03-24 16:13 ` Matthew Wilcox
  (?)
  (?)
@ 2017-03-24 16:13   ` Matthew Wilcox
  -1 siblings, 0 replies; 50+ messages in thread
From: Matthew Wilcox @ 2017-03-24 16:13 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-fbdev, linux-arch, linux-alpha, linux-arm-kernel, x86,
	linux-mips, linuxppc-dev, sparclinux, Minchan Kim,
	Matthew Wilcox

From: Matthew Wilcox <mawilcox@microsoft.com>

ARM is only 32-bit, so it doesn't really need a memset64, but it was
essentially free to add it to the existing implementation.
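
A hedged C model of how the inline wrapper hands a 64-bit pattern to the
assembly entry point: the value is split into two 32-bit halves and the
element count is converted to a byte count.  model_memset64() and
model_memset64_wrapper() are hypothetical names for this sketch; the
real __memset64 is assembly, and the model assumes a little-endian
layout (low word stored first).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical C model of the assembly routine: store the low and high
 * words alternately.  'bytes' is a byte count, as in the ARM wrappers.
 */
void *model_memset64(uint64_t *p, uint32_t low, size_t bytes, uint32_t hi)
{
	unsigned char *dst = (unsigned char *)p;
	size_t off;

	for (off = 0; off + 8 <= bytes; off += 8) {
		memcpy(dst + off, &low, sizeof(low));	/* bits 0..31 */
		memcpy(dst + off + 4, &hi, sizeof(hi));	/* bits 32..63 */
	}
	return p;
}

/* Mirrors the inline memset64(): element count in, halves and bytes out. */
void *model_memset64_wrapper(uint64_t *p, uint64_t v, size_t n)
{
	return model_memset64(p, (uint32_t)v, n * 8, (uint32_t)(v >> 32));
}
```

On a little-endian host, filling two elements with 0x1122334455667788
through the wrapper leaves exactly that value in each of them, which is
why passing the halves in separate registers costs the existing memset
code almost nothing.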

Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk>
---
 arch/arm/include/asm/string.h | 21 +++++++++++++++++++++
 arch/arm/kernel/armksyms.c    |  3 +++
 arch/arm/lib/memset.S         | 44 ++++++++++++++++++++++++++++++++++---------
 3 files changed, 59 insertions(+), 9 deletions(-)

diff --git a/arch/arm/include/asm/string.h b/arch/arm/include/asm/string.h
index cf4f3aad0fc1..bc7a1be7a76a 100644
--- a/arch/arm/include/asm/string.h
+++ b/arch/arm/include/asm/string.h
@@ -24,6 +24,27 @@ extern void * memchr(const void *, int, __kernel_size_t);
 #define __HAVE_ARCH_MEMSET
 extern void * memset(void *, int, __kernel_size_t);
 
+#define __HAVE_ARCH_MEMSET16
+extern void *__memset16(uint16_t *, uint16_t v, __kernel_size_t);
+static inline void *memset16(uint16_t *p, uint16_t v, __kernel_size_t n)
+{
+	return __memset16(p, v, n * 2);
+}
+
+#define __HAVE_ARCH_MEMSET32
+extern void *__memset32(uint32_t *, uint32_t v, __kernel_size_t);
+static inline void *memset32(uint32_t *p, uint32_t v, __kernel_size_t n)
+{
+	return __memset32(p, v, n * 4);
+}
+
+#define __HAVE_ARCH_MEMSET64
+extern void *__memset64(uint64_t *, uint32_t low, __kernel_size_t, uint32_t hi);
+static inline void *memset64(uint64_t *p, uint64_t v, __kernel_size_t n)
+{
+	return __memset64(p, v, n * 8, v >> 32);
+}
+
 extern void __memzero(void *ptr, __kernel_size_t n);
 
 #define memset(p,v,n)							\
diff --git a/arch/arm/kernel/armksyms.c b/arch/arm/kernel/armksyms.c
index 8e8d20cdbce7..633341ed0713 100644
--- a/arch/arm/kernel/armksyms.c
+++ b/arch/arm/kernel/armksyms.c
@@ -87,6 +87,9 @@ EXPORT_SYMBOL(__raw_writesl);
 EXPORT_SYMBOL(strchr);
 EXPORT_SYMBOL(strrchr);
 EXPORT_SYMBOL(memset);
+EXPORT_SYMBOL(__memset16);
+EXPORT_SYMBOL(__memset32);
+EXPORT_SYMBOL(__memset64);
 EXPORT_SYMBOL(memcpy);
 EXPORT_SYMBOL(memmove);
 EXPORT_SYMBOL(memchr);
diff --git a/arch/arm/lib/memset.S b/arch/arm/lib/memset.S
index 3c65e3bd790f..9adc9bdf3ffb 100644
--- a/arch/arm/lib/memset.S
+++ b/arch/arm/lib/memset.S
@@ -21,14 +21,14 @@ ENTRY(memset)
 UNWIND( .fnstart         )
 	ands	r3, r0, #3		@ 1 unaligned?
 	mov	ip, r0			@ preserve r0 as return value
+	orr	r1, r1, r1, lsl #8
 	bne	6f			@ 1
 /*
  * we know that the pointer in ip is aligned to a word boundary.
  */
-1:	orr	r1, r1, r1, lsl #8
-	orr	r1, r1, r1, lsl #16
+1:	orr	r1, r1, r1, lsl #16
 	mov	r3, r1
-	cmp	r2, #16
+7:	cmp	r2, #16
 	blt	4f
 
 #if ! CALGN(1)+0
@@ -41,7 +41,7 @@ UNWIND( .fnend              )
 UNWIND( .fnstart            )
 UNWIND( .save {r8, lr}      )
 	mov	r8, r1
-	mov	lr, r1
+	mov	lr, r3
 
 2:	subs	r2, r2, #64
 	stmgeia	ip!, {r1, r3, r8, lr}	@ 64 bytes at a time.
@@ -73,11 +73,11 @@ UNWIND( .fnend                 )
 UNWIND( .fnstart               )
 UNWIND( .save {r4-r8, lr}      )
 	mov	r4, r1
-	mov	r5, r1
+	mov	r5, r3
 	mov	r6, r1
-	mov	r7, r1
+	mov	r7, r3
 	mov	r8, r1
-	mov	lr, r1
+	mov	lr, r3
 
 	cmp	r2, #96
 	tstgt	ip, #31
@@ -114,12 +114,13 @@ UNWIND( .fnstart            )
 	tst	r2, #4
 	strne	r1, [ip], #4
 /*
- * When we get here, we've got less than 4 bytes to zero.  We
+ * When we get here, we've got less than 4 bytes to set.  We
  * may have an unaligned pointer as well.
  */
 5:	tst	r2, #2
+	movne	r3, r1, lsr #8		@ the top half of a 16-bit pattern
 	strneb	r1, [ip], #1
-	strneb	r1, [ip], #1
+	strneb	r3, [ip], #1
 	tst	r2, #1
 	strneb	r1, [ip], #1
 	ret	lr
@@ -135,3 +136,28 @@ UNWIND( .fnstart            )
 UNWIND( .fnend   )
 ENDPROC(memset)
 ENDPROC(mmioset)
+
+ENTRY(__memset16)
+UNWIND( .fnstart         )
+	tst	r0, #2			@ pointer unaligned?
+	mov	ip, r0			@ preserve r0 as return value
+	beq	1b			@ jump into the middle of memset
+	subs	r2, r2, #2		@ cope with n == 0
+	movge	r3, r1, lsr #8		@ r3 = r1 >> 8
+	strgeb	r1, [ip], #1		@ *ip = r1
+	strgeb	r3, [ip], #1		@ *ip = r3
+	bgt	1b			@ back into memset if n > 0
+	ret	lr			@ otherwise return
+UNWIND( .fnend   )
+ENDPROC(__memset16)
+ENTRY(__memset32)
+UNWIND( .fnstart         )
+	mov	r3, r1			@ copy r1 to r3 and fall into memset64
+UNWIND( .fnend   )
+ENDPROC(__memset32)
+ENTRY(__memset64)
+UNWIND( .fnstart         )
+	mov	ip, r0			@ preserve r0 as return value
+	b	7b			@ jump into the middle of memset
+UNWIND( .fnend   )
+ENDPROC(__memset64)
-- 
2.11.0


* [PATCH v3 2/7] ARM: Implement memset16, memset32 & memset64
@ 2017-03-24 16:13   ` Matthew Wilcox
  0 siblings, 0 replies; 50+ messages in thread
From: Matthew Wilcox @ 2017-03-24 16:13 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arch, linux-mips, linux-fbdev, Matthew Wilcox, x86,
	Minchan Kim, linux-alpha, sparclinux, linuxppc-dev,
	linux-arm-kernel

From: Matthew Wilcox <mawilcox@microsoft.com>

ARM is only 32-bit, so it doesn't really need a memset64, but it was
essentially free to add it to the existing implementation.

Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk>
---
 arch/arm/include/asm/string.h | 21 +++++++++++++++++++++
 arch/arm/kernel/armksyms.c    |  3 +++
 arch/arm/lib/memset.S         | 44 ++++++++++++++++++++++++++++++++++---------
 3 files changed, 59 insertions(+), 9 deletions(-)

diff --git a/arch/arm/include/asm/string.h b/arch/arm/include/asm/string.h
index cf4f3aad0fc1..bc7a1be7a76a 100644
--- a/arch/arm/include/asm/string.h
+++ b/arch/arm/include/asm/string.h
@@ -24,6 +24,27 @@ extern void * memchr(const void *, int, __kernel_size_t);
 #define __HAVE_ARCH_MEMSET
 extern void * memset(void *, int, __kernel_size_t);
 
+#define __HAVE_ARCH_MEMSET16
+extern void *__memset16(uint16_t *, uint16_t v, __kernel_size_t);
+static inline void *memset16(uint16_t *p, uint16_t v, __kernel_size_t n)
+{
+	return __memset16(p, v, n * 2);
+}
+
+#define __HAVE_ARCH_MEMSET32
+extern void *__memset32(uint32_t *, uint32_t v, __kernel_size_t);
+static inline void *memset32(uint32_t *p, uint32_t v, __kernel_size_t n)
+{
+	return __memset32(p, v, n * 4);
+}
+
+#define __HAVE_ARCH_MEMSET64
+extern void *__memset64(uint64_t *, uint32_t low, __kernel_size_t, uint32_t hi);
+static inline void *memset64(uint64_t *p, uint64_t v, __kernel_size_t n)
+{
+	return __memset64(p, v, n * 8, v >> 32);
+}
+
 extern void __memzero(void *ptr, __kernel_size_t n);
 
 #define memset(p,v,n)							\
diff --git a/arch/arm/kernel/armksyms.c b/arch/arm/kernel/armksyms.c
index 8e8d20cdbce7..633341ed0713 100644
--- a/arch/arm/kernel/armksyms.c
+++ b/arch/arm/kernel/armksyms.c
@@ -87,6 +87,9 @@ EXPORT_SYMBOL(__raw_writesl);
 EXPORT_SYMBOL(strchr);
 EXPORT_SYMBOL(strrchr);
 EXPORT_SYMBOL(memset);
+EXPORT_SYMBOL(__memset16);
+EXPORT_SYMBOL(__memset32);
+EXPORT_SYMBOL(__memset64);
 EXPORT_SYMBOL(memcpy);
 EXPORT_SYMBOL(memmove);
 EXPORT_SYMBOL(memchr);
diff --git a/arch/arm/lib/memset.S b/arch/arm/lib/memset.S
index 3c65e3bd790f..9adc9bdf3ffb 100644
--- a/arch/arm/lib/memset.S
+++ b/arch/arm/lib/memset.S
@@ -21,14 +21,14 @@ ENTRY(memset)
 UNWIND( .fnstart         )
 	ands	r3, r0, #3		@ 1 unaligned?
 	mov	ip, r0			@ preserve r0 as return value
+	orr	r1, r1, r1, lsl #8
 	bne	6f			@ 1
 /*
  * we know that the pointer in ip is aligned to a word boundary.
  */
-1:	orr	r1, r1, r1, lsl #8
-	orr	r1, r1, r1, lsl #16
+1:	orr	r1, r1, r1, lsl #16
 	mov	r3, r1
-	cmp	r2, #16
+7:	cmp	r2, #16
 	blt	4f
 
 #if ! CALGN(1)+0
@@ -41,7 +41,7 @@ UNWIND( .fnend              )
 UNWIND( .fnstart            )
 UNWIND( .save {r8, lr}      )
 	mov	r8, r1
-	mov	lr, r1
+	mov	lr, r3
 
 2:	subs	r2, r2, #64
 	stmgeia	ip!, {r1, r3, r8, lr}	@ 64 bytes at a time.
@@ -73,11 +73,11 @@ UNWIND( .fnend                 )
 UNWIND( .fnstart               )
 UNWIND( .save {r4-r8, lr}      )
 	mov	r4, r1
-	mov	r5, r1
+	mov	r5, r3
 	mov	r6, r1
-	mov	r7, r1
+	mov	r7, r3
 	mov	r8, r1
-	mov	lr, r1
+	mov	lr, r3
 
 	cmp	r2, #96
 	tstgt	ip, #31
@@ -114,12 +114,13 @@ UNWIND( .fnstart            )
 	tst	r2, #4
 	strne	r1, [ip], #4
 /*
- * When we get here, we've got less than 4 bytes to zero.  We
+ * When we get here, we've got less than 4 bytes to set.  We
  * may have an unaligned pointer as well.
  */
 5:	tst	r2, #2
+	movne	r3, r1, lsr #8		@ the top half of a 16-bit pattern
 	strneb	r1, [ip], #1
-	strneb	r1, [ip], #1
+	strneb	r3, [ip], #1
 	tst	r2, #1
 	strneb	r1, [ip], #1
 	ret	lr
@@ -135,3 +136,28 @@ UNWIND( .fnstart            )
 UNWIND( .fnend   )
 ENDPROC(memset)
 ENDPROC(mmioset)
+
+ENTRY(__memset16)
+UNWIND( .fnstart         )
+	tst	r0, #2			@ pointer unaligned?
+	mov	ip, r0			@ preserve r0 as return value
+	beq	1b			@ jump into the middle of memset
+	subs	r2, r2, #2		@ cope with n == 0
+	movge	r3, r1, lsr #8		@ r3 = r1 >> 8
+	strgeb	r1, [ip], #1		@ *ip = r1
+	strgeb	r3, [ip], #1		@ *ip = r3
+	bgt	1b			@ back into memset if n > 0
+	ret	lr			@ otherwise return
+UNWIND( .fnend   )
+ENDPROC(__memset16)
+ENTRY(__memset32)
+UNWIND( .fnstart         )
+	mov	r3, r1			@ copy r1 to r3 and fall into memset64
+UNWIND( .fnend   )
+ENDPROC(__memset32)
+ENTRY(__memset64)
+UNWIND( .fnstart         )
+	mov	ip, r0			@ preserve r0 as return value
+	b	7b			@ jump into the middle of memset
+UNWIND( .fnend   )
+ENDPROC(__memset64)
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 50+ messages in thread
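
Not part of the patch: a minimal C sketch of the byte order that the __memset16 head and the reworked tail at label 5 appear to write, namely the low byte of the pattern (r1) followed by the high byte (r1, lsr #8). The name model_memset16 is hypothetical; as with the assembly helper, the third argument is a byte count, and the low-byte-first order assumes a little-endian layout.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical byte-wise model of __memset16(); not kernel code.
 * 'bytes' is a byte count, matching the helper's calling convention.
 * Stores the low byte first, then the high byte (little-endian order).
 */
static void *model_memset16(uint16_t *p, uint16_t v, size_t bytes)
{
	unsigned char *dst = (unsigned char *)p;
	size_t i;

	for (i = 0; i + 1 < bytes; i += 2) {
		dst[i] = (unsigned char)(v & 0xff);	/* strb r1 */
		dst[i + 1] = (unsigned char)(v >> 8);	/* strb r3, r3 = r1 >> 8 */
	}
	return p;
}
```

With a byte count of 0 the loop body never runs, which corresponds to the "cope with n == 0" case in the assembly.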

* [PATCH v3 2/7] ARM: Implement memset16, memset32 & memset64
@ 2017-03-24 16:13   ` Matthew Wilcox
  0 siblings, 0 replies; 50+ messages in thread
From: Matthew Wilcox @ 2017-03-24 16:13 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arch, linux-mips, linux-fbdev, Matthew Wilcox, x86,
	Minchan Kim, linux-alpha, sparclinux, linuxppc-dev,
	linux-arm-kernel

From: Matthew Wilcox <mawilcox@microsoft.com>

ARM is only 32-bit, so it doesn't really need a memset64, but it was
essentially free to add it to the existing implementation.

Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk>
---
 arch/arm/include/asm/string.h | 21 +++++++++++++++++++++
 arch/arm/kernel/armksyms.c    |  3 +++
 arch/arm/lib/memset.S         | 44 ++++++++++++++++++++++++++++++++++---------
 3 files changed, 59 insertions(+), 9 deletions(-)

diff --git a/arch/arm/include/asm/string.h b/arch/arm/include/asm/string.h
index cf4f3aad0fc1..bc7a1be7a76a 100644
--- a/arch/arm/include/asm/string.h
+++ b/arch/arm/include/asm/string.h
@@ -24,6 +24,27 @@ extern void * memchr(const void *, int, __kernel_size_t);
 #define __HAVE_ARCH_MEMSET
 extern void * memset(void *, int, __kernel_size_t);
 
+#define __HAVE_ARCH_MEMSET16
+extern void *__memset16(uint16_t *, uint16_t v, __kernel_size_t);
+static inline void *memset16(uint16_t *p, uint16_t v, __kernel_size_t n)
+{
+	return __memset16(p, v, n * 2);
+}
+
+#define __HAVE_ARCH_MEMSET32
+extern void *__memset32(uint32_t *, uint32_t v, __kernel_size_t);
+static inline void *memset32(uint32_t *p, uint32_t v, __kernel_size_t n)
+{
+	return __memset32(p, v, n * 4);
+}
+
+#define __HAVE_ARCH_MEMSET64
+extern void *__memset64(uint64_t *, uint32_t low, __kernel_size_t, uint32_t hi);
+static inline void *memset64(uint64_t *p, uint64_t v, __kernel_size_t n)
+{
+	return __memset64(p, v, n * 8, v >> 32);
+}
+
 extern void __memzero(void *ptr, __kernel_size_t n);
 
 #define memset(p,v,n)							\
diff --git a/arch/arm/kernel/armksyms.c b/arch/arm/kernel/armksyms.c
index 8e8d20cdbce7..633341ed0713 100644
--- a/arch/arm/kernel/armksyms.c
+++ b/arch/arm/kernel/armksyms.c
@@ -87,6 +87,9 @@ EXPORT_SYMBOL(__raw_writesl);
 EXPORT_SYMBOL(strchr);
 EXPORT_SYMBOL(strrchr);
 EXPORT_SYMBOL(memset);
+EXPORT_SYMBOL(__memset16);
+EXPORT_SYMBOL(__memset32);
+EXPORT_SYMBOL(__memset64);
 EXPORT_SYMBOL(memcpy);
 EXPORT_SYMBOL(memmove);
 EXPORT_SYMBOL(memchr);
diff --git a/arch/arm/lib/memset.S b/arch/arm/lib/memset.S
index 3c65e3bd790f..9adc9bdf3ffb 100644
--- a/arch/arm/lib/memset.S
+++ b/arch/arm/lib/memset.S
@@ -21,14 +21,14 @@ ENTRY(memset)
 UNWIND( .fnstart         )
 	ands	r3, r0, #3		@ 1 unaligned?
 	mov	ip, r0			@ preserve r0 as return value
+	orr	r1, r1, r1, lsl #8
 	bne	6f			@ 1
 /*
  * we know that the pointer in ip is aligned to a word boundary.
  */
-1:	orr	r1, r1, r1, lsl #8
-	orr	r1, r1, r1, lsl #16
+1:	orr	r1, r1, r1, lsl #16
 	mov	r3, r1
-	cmp	r2, #16
+7:	cmp	r2, #16
 	blt	4f
 
 #if ! CALGN(1)+0
@@ -41,7 +41,7 @@ UNWIND( .fnend              )
 UNWIND( .fnstart            )
 UNWIND( .save {r8, lr}      )
 	mov	r8, r1
-	mov	lr, r1
+	mov	lr, r3
 
 2:	subs	r2, r2, #64
 	stmgeia	ip!, {r1, r3, r8, lr}	@ 64 bytes at a time.
@@ -73,11 +73,11 @@ UNWIND( .fnend                 )
 UNWIND( .fnstart               )
 UNWIND( .save {r4-r8, lr}      )
 	mov	r4, r1
-	mov	r5, r1
+	mov	r5, r3
 	mov	r6, r1
-	mov	r7, r1
+	mov	r7, r3
 	mov	r8, r1
-	mov	lr, r1
+	mov	lr, r3
 
 	cmp	r2, #96
 	tstgt	ip, #31
@@ -114,12 +114,13 @@ UNWIND( .fnstart            )
 	tst	r2, #4
 	strne	r1, [ip], #4
 /*
- * When we get here, we've got less than 4 bytes to zero.  We
+ * When we get here, we've got less than 4 bytes to set.  We
  * may have an unaligned pointer as well.
  */
 5:	tst	r2, #2
+	movne	r3, r1, lsr #8		@ the top half of a 16-bit pattern
 	strneb	r1, [ip], #1
-	strneb	r1, [ip], #1
+	strneb	r3, [ip], #1
 	tst	r2, #1
 	strneb	r1, [ip], #1
 	ret	lr
@@ -135,3 +136,28 @@ UNWIND( .fnstart            )
 UNWIND( .fnend   )
 ENDPROC(memset)
 ENDPROC(mmioset)
+
+ENTRY(__memset16)
+UNWIND( .fnstart         )
+	tst	r0, #2			@ pointer unaligned?
+	mov	ip, r0			@ preserve r0 as return value
+	beq	1b			@ jump into the middle of memset
+	subs	r2, r2, #2		@ cope with n == 0
+	movge	r3, r1, lsr #8		@ r3 = r1 >> 8
+	strgeb	r1, [ip], #1		@ *ip = r1
+	strgeb	r3, [ip], #1		@ *ip = r3
+	bgt	1b			@ back into memset if n > 0
+	ret	lr			@ otherwise return
+UNWIND( .fnend   )
+ENDPROC(__memset16)
+ENTRY(__memset32)
+UNWIND( .fnstart         )
+	mov	r3, r1			@ copy r1 to r3 and fall into memset64
+UNWIND( .fnend   )
+ENDPROC(__memset32)
+ENTRY(__memset64)
+UNWIND( .fnstart         )
+	mov	ip, r0			@ preserve r0 as return value
+	b	7b			@ jump into the middle of memset
+UNWIND( .fnend   )
+ENDPROC(__memset64)
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 50+ messages in thread
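
Not part of the patch: a minimal C sketch of how the memset64() wrapper above hands a 64-bit pattern to a helper as two 32-bit halves plus a byte count. The names model_memset64 and model_memset64_wrapper are hypothetical. The argument order (pointer, low, byte count, high) copies the __memset64() prototype. Under the usual ARM calling convention that should place the low half in r1 and the high half in r3, the two registers the store-multiple loops alternate between.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical C model of __memset64(); not kernel code.
 * 'bytes' is a byte count, and the pattern arrives as two 32-bit halves.
 */
static void *model_memset64(uint64_t *p, uint32_t low, size_t bytes, uint32_t hi)
{
	unsigned char *dst = (unsigned char *)p;
	uint64_t v = ((uint64_t)hi << 32) | low;	/* rebuild the pattern */
	size_t i;

	for (i = 0; i + sizeof(v) <= bytes; i += sizeof(v))
		memcpy(dst + i, &v, sizeof(v));
	return p;
}

/* Same shape as the patch's memset64(): n counts 64-bit elements. */
static void *model_memset64_wrapper(uint64_t *p, uint64_t v, size_t n)
{
	return model_memset64(p, (uint32_t)v, n * 8, (uint32_t)(v >> 32));
}
```

The wrapper is the only place where an element count is converted to bytes, so callers never see the helper's convention.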

* [PATCH v3 3/7] x86: Implement memset16, memset32 & memset64
  2017-03-24 16:13 ` Matthew Wilcox
  (?)
@ 2017-03-24 16:13   ` Matthew Wilcox
  -1 siblings, 0 replies; 50+ messages in thread
From: Matthew Wilcox @ 2017-03-24 16:13 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-fbdev, linux-arch, linux-alpha, linux-arm-kernel, x86,
	linux-mips, linuxppc-dev, sparclinux, Minchan Kim,
	Matthew Wilcox

From: Matthew Wilcox <mawilcox@microsoft.com>

These are single instructions on x86.  There's no 64-bit instruction
for x86-32, but we don't yet have any user for memset64() on 32-bit
architectures, so don't bother to implement it.

Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
---
 arch/x86/include/asm/string_32.h | 24 ++++++++++++++++++++++++
 arch/x86/include/asm/string_64.h | 36 ++++++++++++++++++++++++++++++++++++
 2 files changed, 60 insertions(+)

diff --git a/arch/x86/include/asm/string_32.h b/arch/x86/include/asm/string_32.h
index 3d3e8353ee5c..84da91fe13ac 100644
--- a/arch/x86/include/asm/string_32.h
+++ b/arch/x86/include/asm/string_32.h
@@ -331,6 +331,30 @@ void *__constant_c_and_count_memset(void *s, unsigned long pattern,
 	 : __memset((s), (c), (count)))
 #endif
 
+#define __HAVE_ARCH_MEMSET16
+static inline void *memset16(uint16_t *s, uint16_t v, size_t n)
+{
+	int d0, d1;
+	asm volatile("rep\n\t"
+		     "stosw"
+		     : "=&c" (d0), "=&D" (d1)
+		     : "a" (v), "1" (s), "0" (n)
+		     : "memory");
+	return s;
+}
+
+#define __HAVE_ARCH_MEMSET32
+static inline void *memset32(uint32_t *s, uint32_t v, size_t n)
+{
+	int d0, d1;
+	asm volatile("rep\n\t"
+		     "stosl"
+		     : "=&c" (d0), "=&D" (d1)
+		     : "a" (v), "1" (s), "0" (n)
+		     : "memory");
+	return s;
+}
+
 /*
  * find the first occurrence of byte 'c', or 1 past the area if none
  */
diff --git a/arch/x86/include/asm/string_64.h b/arch/x86/include/asm/string_64.h
index a164862d77e3..71c5e860c7da 100644
--- a/arch/x86/include/asm/string_64.h
+++ b/arch/x86/include/asm/string_64.h
@@ -56,6 +56,42 @@ extern void *__memcpy(void *to, const void *from, size_t len);
 void *memset(void *s, int c, size_t n);
 void *__memset(void *s, int c, size_t n);
 
+#define __HAVE_ARCH_MEMSET16
+static inline void *memset16(uint16_t *s, uint16_t v, size_t n)
+{
+	long d0, d1;
+	asm volatile("rep\n\t"
+		     "stosw"
+		     : "=&c" (d0), "=&D" (d1)
+		     : "a" (v), "1" (s), "0" (n)
+		     : "memory");
+	return s;
+}
+
+#define __HAVE_ARCH_MEMSET32
+static inline void *memset32(uint32_t *s, uint32_t v, size_t n)
+{
+	long d0, d1;
+	asm volatile("rep\n\t"
+		     "stosl"
+		     : "=&c" (d0), "=&D" (d1)
+		     : "a" (v), "1" (s), "0" (n)
+		     : "memory");
+	return s;
+}
+
+#define __HAVE_ARCH_MEMSET64
+static inline void *memset64(uint64_t *s, uint64_t v, size_t n)
+{
+	long d0, d1;
+	asm volatile("rep\n\t"
+		     "stosq"
+		     : "=&c" (d0), "=&D" (d1)
+		     : "a" (v), "1" (s), "0" (n)
+		     : "memory");
+	return s;
+}
+
 #define __HAVE_ARCH_MEMMOVE
 void *memmove(void *dest, const void *src, size_t count);
 void *__memmove(void *dest, const void *src, size_t count);
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 50+ messages in thread
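
Not part of the patch: a portable C sketch of what the "rep stos" sequences above compute. The count goes in through the "c" constraint, the destination through "D" and the value through "a"; the instruction then stores the value count times while advancing the destination. The sketch_* names are hypothetical, and n is a count of elements, as in the inline functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Portable stand-in for "rep stosw": fill n 16-bit slots with v. */
static void *sketch_memset16(uint16_t *s, uint16_t v, size_t n)
{
	uint16_t *p = s;

	while (n--)
		*p++ = v;
	return s;
}

/* Portable stand-in for "rep stosl": fill n 32-bit slots with v. */
static void *sketch_memset32(uint32_t *s, uint32_t v, size_t n)
{
	uint32_t *p = s;

	while (n--)
		*p++ = v;
	return s;
}

/* Portable stand-in for "rep stosq": fill n 64-bit slots with v. */
static void *sketch_memset64(uint64_t *s, uint64_t v, size_t n)
{
	uint64_t *p = s;

	while (n--)
		*p++ = v;
	return s;
}
```

Each function returns the original pointer, just as the inline versions return s rather than the advanced destination register.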

* [PATCH v3 4/7] alpha: Add support for memset16
  2017-03-24 16:13 ` Matthew Wilcox
  (?)
  (?)
@ 2017-03-24 16:13   ` Matthew Wilcox
  -1 siblings, 0 replies; 50+ messages in thread
From: Matthew Wilcox @ 2017-03-24 16:13 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-fbdev, linux-arch, linux-alpha, linux-arm-kernel, x86,
	linux-mips, linuxppc-dev, sparclinux, Minchan Kim,
	Matthew Wilcox

From: Matthew Wilcox <mawilcox@microsoft.com>

Alpha already had an optimised memset-16-bit-quantity assembler routine
called memsetw().  It has a slightly different calling convention
from memset16() in that it takes a byte count, not a count of words.
That's the same convention used by ARM's __memset16(), so rename Alpha's
routine to match and add a memset16() wrapper around it.  Then convert
Alpha's scr_memsetw() to call memset16() instead of memsetw().

Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
---
 arch/alpha/include/asm/string.h | 15 ++++++++-------
 arch/alpha/include/asm/vga.h    |  2 +-
 arch/alpha/lib/memset.S         | 10 +++++-----
 3 files changed, 14 insertions(+), 13 deletions(-)

diff --git a/arch/alpha/include/asm/string.h b/arch/alpha/include/asm/string.h
index c2911f591704..74c0a693b76b 100644
--- a/arch/alpha/include/asm/string.h
+++ b/arch/alpha/include/asm/string.h
@@ -65,13 +65,14 @@ extern void * memchr(const void *, int, size_t);
    aligned values.  The DEST and COUNT parameters must be even for 
    correct operation.  */
 
-#define __HAVE_ARCH_MEMSETW
-extern void * __memsetw(void *dest, unsigned short, size_t count);
-
-#define memsetw(s, c, n)						 \
-(__builtin_constant_p(c)						 \
- ? __constant_c_memset((s),0x0001000100010001UL*(unsigned short)(c),(n)) \
- : __memsetw((s),(c),(n)))
+#define __HAVE_ARCH_MEMSET16
+extern void * __memset16(void *dest, unsigned short, size_t count);
+static inline void *memset16(uint16_t *p, uint16_t v, size_t n)
+{
+	if (__builtin_constant_p(v))
+		return __constant_c_memset(p, 0x0001000100010001UL * v, n * 2);
+	return __memset16(p, v, n * 2);
+}
 
 #endif /* __KERNEL__ */
 
diff --git a/arch/alpha/include/asm/vga.h b/arch/alpha/include/asm/vga.h
index c00106bac521..3c1c2b6128e7 100644
--- a/arch/alpha/include/asm/vga.h
+++ b/arch/alpha/include/asm/vga.h
@@ -34,7 +34,7 @@ static inline void scr_memsetw(u16 *s, u16 c, unsigned int count)
 	if (__is_ioaddr(s))
 		memsetw_io((u16 __iomem *) s, c, count);
 	else
-		memsetw(s, c, count);
+		memset16(s, c, count / 2);
 }
 
 /* Do not trust that the usage will be correct; analyze the arguments.  */
diff --git a/arch/alpha/lib/memset.S b/arch/alpha/lib/memset.S
index 89a26f5e89de..f824969e9e77 100644
--- a/arch/alpha/lib/memset.S
+++ b/arch/alpha/lib/memset.S
@@ -20,7 +20,7 @@
 	.globl memset
 	.globl __memset
 	.globl ___memset
-	.globl __memsetw
+	.globl __memset16
 	.globl __constant_c_memset
 
 	.ent ___memset
@@ -110,8 +110,8 @@ EXPORT_SYMBOL(___memset)
 EXPORT_SYMBOL(__constant_c_memset)
 
 	.align 5
-	.ent __memsetw
-__memsetw:
+	.ent __memset16
+__memset16:
 	.prologue 0
 
 	inswl $17,0,$1		/* E0 */
@@ -123,8 +123,8 @@ __memsetw:
 	or $1,$4,$17		/* E0 */
 	br __constant_c_memset	/* .. E1 */
 
-	.end __memsetw
-EXPORT_SYMBOL(__memsetw)
+	.end __memset16
+EXPORT_SYMBOL(__memset16)
 
 memset = ___memset
 __memset = ___memset
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 50+ messages in thread
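
Not part of the patch: a small C sketch of the two conversions the Alpha wrapper relies on. Multiplying a 16-bit value by 0x0001000100010001 copies it into all four 16-bit lanes of a 64-bit word, and a byte count is halved to obtain a count of 16-bit elements (the patch does this in scr_memsetw() and multiplies by 2 again inside memset16()). The names replicate16 and bytes_to_elems16 are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Replicate a 16-bit value into every 16-bit lane of a 64-bit word.
 * Each lane product is below 2^16, so no carry crosses a lane boundary.
 */
static uint64_t replicate16(uint16_t v)
{
	return UINT64_C(0x0001000100010001) * v;
}

/* Convert a byte count into a count of 16-bit elements. */
static size_t bytes_to_elems16(size_t bytes)
{
	return bytes / 2;
}
```

For example, replicate16(0xABCD) yields 0xABCDABCDABCDABCD, the kind of 64-bit fill pattern that __constant_c_memset() is passed in the patch.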

* [PATCH v3 4/7] alpha: Add support for memset16
@ 2017-03-24 16:13   ` Matthew Wilcox
  0 siblings, 0 replies; 50+ messages in thread
From: Matthew Wilcox @ 2017-03-24 16:13 UTC (permalink / raw)
  To: linux-arm-kernel

From: Matthew Wilcox <mawilcox@microsoft.com>

Alpha already had an optimised memset-16-bit-quantity assembler routine
called memsetw().  It has a slightly different calling convention
from memset16() in that it takes a byte count, not a count of words.
That's the same convention used by ARM's __memset16(), so rename Alpha's
routine to match and add a memset16() wrapper around it.  Then convert
Alpha's scr_memsetw() to call memset16() instead of memsetw().

Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
---
 arch/alpha/include/asm/string.h | 15 ++++++++-------
 arch/alpha/include/asm/vga.h    |  2 +-
 arch/alpha/lib/memset.S         | 10 +++++-----
 3 files changed, 14 insertions(+), 13 deletions(-)

diff --git a/arch/alpha/include/asm/string.h b/arch/alpha/include/asm/string.h
index c2911f591704..74c0a693b76b 100644
--- a/arch/alpha/include/asm/string.h
+++ b/arch/alpha/include/asm/string.h
@@ -65,13 +65,14 @@ extern void * memchr(const void *, int, size_t);
    aligned values.  The DEST and COUNT parameters must be even for 
    correct operation.  */
 
-#define __HAVE_ARCH_MEMSETW
-extern void * __memsetw(void *dest, unsigned short, size_t count);
-
-#define memsetw(s, c, n)						 \
-(__builtin_constant_p(c)						 \
- ? __constant_c_memset((s),0x0001000100010001UL*(unsigned short)(c),(n)) \
- : __memsetw((s),(c),(n)))
+#define __HAVE_ARCH_MEMSET16
+extern void * __memset16(void *dest, unsigned short, size_t count);
+static inline void *memset16(uint16_t *p, uint16_t v, size_t n)
+{
+	if (__builtin_constant_p(v))
+		return __constant_c_memset(p, 0x0001000100010001UL * v, n * 2)
+	return __memset16(p, v, n * 2);
+}
 
 #endif /* __KERNEL__ */
 
diff --git a/arch/alpha/include/asm/vga.h b/arch/alpha/include/asm/vga.h
index c00106bac521..3c1c2b6128e7 100644
--- a/arch/alpha/include/asm/vga.h
+++ b/arch/alpha/include/asm/vga.h
@@ -34,7 +34,7 @@ static inline void scr_memsetw(u16 *s, u16 c, unsigned int count)
 	if (__is_ioaddr(s))
 		memsetw_io((u16 __iomem *) s, c, count);
 	else
-		memsetw(s, c, count);
+		memset16(s, c, count / 2);
 }
 
 /* Do not trust that the usage will be correct; analyze the arguments.  */
diff --git a/arch/alpha/lib/memset.S b/arch/alpha/lib/memset.S
index 89a26f5e89de..f824969e9e77 100644
--- a/arch/alpha/lib/memset.S
+++ b/arch/alpha/lib/memset.S
@@ -20,7 +20,7 @@
 	.globl memset
 	.globl __memset
 	.globl ___memset
-	.globl __memsetw
+	.globl __memset16
 	.globl __constant_c_memset
 
 	.ent ___memset
@@ -110,8 +110,8 @@ EXPORT_SYMBOL(___memset)
 EXPORT_SYMBOL(__constant_c_memset)
 
 	.align 5
-	.ent __memsetw
-__memsetw:
+	.ent __memset16
+__memset16:
 	.prologue 0
 
 	inswl $17,0,$1		/* E0 */
@@ -123,8 +123,8 @@ __memsetw:
 	or $1,$4,$17		/* E0 */
 	br __constant_c_memset	/* .. E1 */
 
-	.end __memsetw
-EXPORT_SYMBOL(__memsetw)
+	.end __memset16
+EXPORT_SYMBOL(__memset16)
 
 memset = ___memset
 __memset = ___memset
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 50+ messages in thread
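
The constant path in the Alpha memset16() above multiplies the 16-bit value
by 0x0001000100010001 and passes a byte count of n * 2.  A minimal userspace
sketch of that arithmetic follows; replicate16() and fill16() are
hypothetical names used only for illustration, not kernel functions.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: copy a 16-bit value into all four 16-bit lanes of
 * a 64-bit word.  The largest product, 0xFFFF * 0x0001000100010001, is
 * 0xFFFFFFFFFFFFFFFF, so the multiplication cannot overflow.
 */
static uint64_t replicate16(uint16_t v)
{
	return 0x0001000100010001ULL * v;
}

/*
 * Hypothetical portable stand-in for a 16-bit fill.  n counts uint16_t
 * elements, so the region covered is n * 2 bytes.
 */
static uint16_t *fill16(uint16_t *p, uint16_t v, size_t n)
{
	for (size_t i = 0; i < n; i++)
		p[i] = v;
	return p;
}
```

With v = 0x0720 the replicated word is 0x0720072007200720, which is why one
64-bit store can stand in for four 16-bit stores.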

* [PATCH v3 5/7] zram: Convert to using memset_l
  2017-03-24 16:13 ` Matthew Wilcox
  (?)
@ 2017-03-24 16:13   ` Matthew Wilcox
  -1 siblings, 0 replies; 50+ messages in thread
From: Matthew Wilcox @ 2017-03-24 16:13 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-fbdev, linux-arch, linux-alpha, linux-arm-kernel, x86,
	linux-mips, linuxppc-dev, sparclinux, Minchan Kim,
	Matthew Wilcox

From: Matthew Wilcox <mawilcox@microsoft.com>

zram was the motivation for creating memset_l().  Minchan Kim sees a 7%
performance improvement on x86 with 100MB of non-zero deduplicatable
data:

        perf stat -r 10 dd if=/dev/zram0 of=/dev/null

vanilla:        0.232050465 seconds time elapsed ( +-  0.51% )
memset_l:	0.217219387 seconds time elapsed ( +-  0.07% )

Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Tested-by: Minchan Kim <minchan@kernel.org>
---
 drivers/block/zram/zram_drv.c | 15 +++------------
 1 file changed, 3 insertions(+), 12 deletions(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index e27d89a36c34..25dcad309695 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -157,20 +157,11 @@ static inline void update_used_max(struct zram *zram,
 	} while (old_max != cur_max);
 }
 
-static inline void zram_fill_page(char *ptr, unsigned long len,
+static inline void zram_fill_page(void *ptr, unsigned long len,
 					unsigned long value)
 {
-	int i;
-	unsigned long *page = (unsigned long *)ptr;
-
 	WARN_ON_ONCE(!IS_ALIGNED(len, sizeof(unsigned long)));
-
-	if (likely(value == 0)) {
-		memset(ptr, 0, len);
-	} else {
-		for (i = 0; i < len / sizeof(*page); i++)
-			page[i] = value;
-	}
+	memset_l(ptr, value, len / sizeof(unsigned long));
 }
 
 static bool page_same_filled(void *ptr, unsigned long *element)
@@ -193,7 +184,7 @@ static bool page_same_filled(void *ptr, unsigned long *element)
 static void handle_same_page(struct bio_vec *bvec, unsigned long element)
 {
 	struct page *page = bvec->bv_page;
-	void *user_mem;
+	char *user_mem;
 
 	user_mem = kmap_atomic(page);
 	zram_fill_page(user_mem + bvec->bv_offset, bvec->bv_len, element);
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 50+ messages in thread
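
The zram change above replaces an element-by-element loop with one call that
takes a count of unsigned longs rather than bytes.  A rough userspace sketch
of that shape, assuming a plain C loop in place of the kernel's memset_l();
fill_long() and fill_page() are hypothetical names for illustration only.

```c
#include <stddef.h>

/* Hypothetical stand-in for memset_l(): store v into n unsigned longs. */
static void *fill_long(unsigned long *p, unsigned long v, size_t n)
{
	unsigned long *cur = p;

	while (n--)
		*cur++ = v;
	return p;
}

/*
 * Same shape as zram_fill_page() after the patch: len is a byte count and
 * is converted to an element count exactly once.
 */
static void fill_page(void *ptr, unsigned long len, unsigned long value)
{
	fill_long(ptr, value, len / sizeof(unsigned long));
}
```

The zero case no longer needs its own branch here, because filling with
value 0 goes through the same call.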

* [PATCH v3 6/7] sym53c8xx_2: Convert to use memset32
  2017-03-24 16:13 ` Matthew Wilcox
  (?)
  (?)
@ 2017-03-24 16:13   ` Matthew Wilcox
  -1 siblings, 0 replies; 50+ messages in thread
From: Matthew Wilcox @ 2017-03-24 16:13 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-fbdev, linux-arch, linux-alpha, linux-arm-kernel, x86,
	linux-mips, linuxppc-dev, sparclinux, Minchan Kim,
	Matthew Wilcox

From: Matthew Wilcox <mawilcox@microsoft.com>

memset32() can be used to initialise these three arrays.  Minor code
footprint reduction.

Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
---
 drivers/scsi/sym53c8xx_2/sym_hipd.c | 11 +++--------
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/sym53c8xx_2/sym_hipd.c b/drivers/scsi/sym53c8xx_2/sym_hipd.c
index 6b349e301869..b886b10e3499 100644
--- a/drivers/scsi/sym53c8xx_2/sym_hipd.c
+++ b/drivers/scsi/sym53c8xx_2/sym_hipd.c
@@ -4985,13 +4985,10 @@ struct sym_lcb *sym_alloc_lcb (struct sym_hcb *np, u_char tn, u_char ln)
 	 *  Compute the bus address of this table.
 	 */
 	if (ln && !tp->luntbl) {
-		int i;
-
 		tp->luntbl = sym_calloc_dma(256, "LUNTBL");
 		if (!tp->luntbl)
 			goto fail;
-		for (i = 0 ; i < 64 ; i++)
-			tp->luntbl[i] = cpu_to_scr(vtobus(&np->badlun_sa));
+		memset32(tp->luntbl, cpu_to_scr(vtobus(&np->badlun_sa)), 64);
 		tp->head.luntbl_sa = cpu_to_scr(vtobus(tp->luntbl));
 	}
 
@@ -5077,8 +5074,7 @@ static void sym_alloc_lcb_tags (struct sym_hcb *np, u_char tn, u_char ln)
 	/*
 	 *  Initialize the task table with invalid entries.
 	 */
-	for (i = 0 ; i < SYM_CONF_MAX_TASK ; i++)
-		lp->itlq_tbl[i] = cpu_to_scr(np->notask_ba);
+	memset32(lp->itlq_tbl, cpu_to_scr(np->notask_ba), SYM_CONF_MAX_TASK);
 
 	/*
 	 *  Fill up the tag buffer with tag numbers.
@@ -5764,8 +5760,7 @@ int sym_hcb_attach(struct Scsi_Host *shost, struct sym_fw *fw, struct sym_nvram
 		goto attach_failed;
 
 	np->badlun_sa = cpu_to_scr(SCRIPTB_BA(np, resel_bad_lun));
-	for (i = 0 ; i < 64 ; i++)	/* 64 luns/target, no less */
-		np->badluntbl[i] = cpu_to_scr(vtobus(&np->badlun_sa));
+	memset32(np->badluntbl, cpu_to_scr(vtobus(&np->badlun_sa)), 64);
 
 	/*
 	 *  Prepare the bus address array that contains the bus 
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 50+ messages in thread
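
Each loop removed above stores a value that does not depend on the loop
index, which is the pattern a 32-bit fill can replace.  A small userspace
sketch of the before and after forms, assuming a plain C loop in place of
the kernel's memset32(); fill32(), init_table_loop(), init_table_fill() and
LUN_COUNT are hypothetical names for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

#define LUN_COUNT 64	/* 64 entries, matching the table size in the patch */

/* Hypothetical stand-in for memset32(): n counts 32-bit elements. */
static uint32_t *fill32(uint32_t *p, uint32_t v, size_t n)
{
	for (size_t i = 0; i < n; i++)
		p[i] = v;
	return p;
}

/* Before: open-coded loop storing a loop-invariant value. */
static void init_table_loop(uint32_t *tbl, uint32_t bad)
{
	int i;

	for (i = 0; i < LUN_COUNT; i++)
		tbl[i] = bad;
}

/* After: one call, with no index variable left to declare. */
static void init_table_fill(uint32_t *tbl, uint32_t bad)
{
	fill32(tbl, bad, LUN_COUNT);
}
```

The 256 bytes passed to sym_calloc_dma() in the diff hold exactly these 64
four-byte entries.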

* [PATCH v3 7/7] vga: Optimise console scrolling
  2017-03-24 16:13 ` Matthew Wilcox
  (?)
  (?)
@ 2017-03-24 16:13   ` Matthew Wilcox
  -1 siblings, 0 replies; 50+ messages in thread
From: Matthew Wilcox @ 2017-03-24 16:13 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-fbdev, linux-arch, linux-alpha, linux-arm-kernel, x86,
	linux-mips, linuxppc-dev, sparclinux, Minchan Kim,
	Matthew Wilcox

From: Matthew Wilcox <mawilcox@microsoft.com>

Where possible, call memset16(), memmove() or memcpy() instead of using
open-coded loops.  If an architecture doesn't define VT_BUF_HAVE_RW,
we can do that from the generic code.  For the architectures which do
have special RW routines, usually we can do the special thing (pointer
test or byteswap) once (and then use a mem* call) instead of each time
around a loop.  Alpha is the only architecture missing a scr_memmovew()
definition (because it's non-trivial to write).

I don't like the calling convention that uses a byte count instead of
a count of u16s, but it's a little late to change that.  Reduces code
size of fbcon.o by almost 400 bytes on my laptop build.

Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
---
 arch/mips/include/asm/vga.h    |  6 ++++++
 arch/powerpc/include/asm/vga.h |  8 ++++++++
 arch/sparc/include/asm/vga.h   | 24 ++++++++++++++++++++++++
 include/linux/vt_buffer.h      | 12 ++++++++++++
 4 files changed, 50 insertions(+)

diff --git a/arch/mips/include/asm/vga.h b/arch/mips/include/asm/vga.h
index f82c83749a08..7510f406e1e1 100644
--- a/arch/mips/include/asm/vga.h
+++ b/arch/mips/include/asm/vga.h
@@ -40,9 +40,15 @@ static inline u16 scr_readw(volatile const u16 *addr)
 	return le16_to_cpu(*addr);
 }
 
+static inline void scr_memsetw(u16 *s, u16 v, unsigned int count)
+{
+	memset16(s, cpu_to_le16(v), count / 2);
+}
+
 #define scr_memcpyw(d, s, c) memcpy(d, s, c)
 #define scr_memmovew(d, s, c) memmove(d, s, c)
 #define VT_BUF_HAVE_MEMCPYW
 #define VT_BUF_HAVE_MEMMOVEW
+#define VT_BUF_HAVE_MEMSETW
 
 #endif /* _ASM_VGA_H */
diff --git a/arch/powerpc/include/asm/vga.h b/arch/powerpc/include/asm/vga.h
index ab3acd2f2786..7a7b541b7493 100644
--- a/arch/powerpc/include/asm/vga.h
+++ b/arch/powerpc/include/asm/vga.h
@@ -33,8 +33,16 @@ static inline u16 scr_readw(volatile const u16 *addr)
 	return le16_to_cpu(*addr);
 }
 
+#define VT_BUF_HAVE_MEMSETW
+static inline void scr_memsetw(u16 *s, u16 v, unsigned int n)
+{
+	memset16(s, cpu_to_le16(v), n / 2);
+}
+
 #define VT_BUF_HAVE_MEMCPYW
+#define VT_BUF_HAVE_MEMMOVEW
 #define scr_memcpyw	memcpy
+#define scr_memmovew	memmove
 
 #endif /* !CONFIG_VGA_CONSOLE && !CONFIG_MDA_CONSOLE */
 
diff --git a/arch/sparc/include/asm/vga.h b/arch/sparc/include/asm/vga.h
index ec0e9967d93d..1fab92b110d9 100644
--- a/arch/sparc/include/asm/vga.h
+++ b/arch/sparc/include/asm/vga.h
@@ -11,6 +11,9 @@
 #include <asm/types.h>
 
 #define VT_BUF_HAVE_RW
+#define VT_BUF_HAVE_MEMSETW
+#define VT_BUF_HAVE_MEMCPYW
+#define VT_BUF_HAVE_MEMMOVEW
 
 #undef scr_writew
 #undef scr_readw
@@ -29,6 +32,27 @@ static inline u16 scr_readw(const u16 *addr)
 	return *addr;
 }
 
+static inline void scr_memsetw(u16 *p, u16 v, unsigned int n)
+{
+	BUG_ON((long) p >= 0);
+
+	memset16(p, cpu_to_le16(v), n / 2);
+}
+
+static inline void scr_memcpyw(u16 *d, u16 *s, unsigned int n)
+{
+	BUG_ON((long) d >= 0);
+
+	memcpy(d, s, n);
+}
+
+static inline void scr_memmovew(u16 *d, u16 *s, unsigned int n)
+{
+	BUG_ON((long) d >= 0);
+
+	memmove(d, s, n);
+}
+
 #define VGA_MAP_MEM(x,s) (x)
 
 #endif
diff --git a/include/linux/vt_buffer.h b/include/linux/vt_buffer.h
index f38c10ba3ff5..31b92fcd8f03 100644
--- a/include/linux/vt_buffer.h
+++ b/include/linux/vt_buffer.h
@@ -26,24 +26,33 @@
 #ifndef VT_BUF_HAVE_MEMSETW
 static inline void scr_memsetw(u16 *s, u16 c, unsigned int count)
 {
+#ifdef VT_BUF_HAVE_RW
 	count /= 2;
 	while (count--)
 		scr_writew(c, s++);
+#else
+	memset16(s, c, count / 2);
+#endif
 }
 #endif
 
 #ifndef VT_BUF_HAVE_MEMCPYW
 static inline void scr_memcpyw(u16 *d, const u16 *s, unsigned int count)
 {
+#ifdef VT_BUF_HAVE_RW
 	count /= 2;
 	while (count--)
 		scr_writew(scr_readw(s++), d++);
+#else
+	memcpy(d, s, count);
+#endif
 }
 #endif
 
 #ifndef VT_BUF_HAVE_MEMMOVEW
 static inline void scr_memmovew(u16 *d, const u16 *s, unsigned int count)
 {
+#ifdef VT_BUF_HAVE_RW
 	if (d < s)
 		scr_memcpyw(d, s, count);
 	else {
@@ -53,6 +62,9 @@ static inline void scr_memmovew(u16 *d, const u16 *s, unsigned int count)
 		while (count--)
 			scr_writew(scr_readw(--s), --d);
 	}
+#else
+	memmove(d, s, count);
+#endif
 }
 #endif
 
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 50+ messages in thread
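
The commit message above makes two points: the scr_mem*w() helpers take a
byte count, and the byteswap can be done once before the fill instead of on
every iteration.  A userspace sketch of both, assuming plain C stand-ins for
cpu_to_le16() and memset16(); to_le16(), fill16() and screen_fill() are
hypothetical names for illustration only.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for cpu_to_le16(): return a value whose in-memory
 * byte layout is little-endian regardless of host byte order.
 */
static uint16_t to_le16(uint16_t v)
{
	const unsigned char bytes[2] = {
		(unsigned char)(v & 0xff),
		(unsigned char)(v >> 8),
	};
	uint16_t out;

	memcpy(&out, bytes, sizeof(out));
	return out;
}

/* Hypothetical stand-in for memset16(): n counts 16-bit elements. */
static void fill16(uint16_t *p, uint16_t v, size_t n)
{
	while (n--)
		*p++ = v;
}

/*
 * count is in bytes, as in scr_memsetw().  The value is converted once,
 * then count / 2 cells are filled.
 */
static void screen_fill(uint16_t *s, uint16_t v, unsigned int count)
{
	fill16(s, to_le16(v), count / 2);
}
```

A caller clearing three cells passes 6 as the count, not 3, which is the
calling convention the commit message says it is too late to change.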

* Re: [PATCH v3 4/7] alpha: Add support for memset16
  2017-03-24 16:13   ` Matthew Wilcox
  (?)
  (?)
@ 2017-03-26  7:28     ` kbuild test robot
  -1 siblings, 0 replies; 50+ messages in thread
From: kbuild test robot @ 2017-03-26  7:28 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: kbuild-all, linux-kernel, linux-arch, linux-mips, linux-fbdev,
	Matthew Wilcox, x86, Minchan Kim, linux-alpha, sparclinux,
	linuxppc-dev, linux-arm-kernel

[-- Attachment #1: Type: text/plain, Size: 2139 bytes --]

Hi Matthew,

[auto build test ERROR on linus/master]
[also build test ERROR on v4.11-rc3 next-20170324]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Matthew-Wilcox/Add-memsetN-functions/20170326-140108
config: alpha-allyesconfig (attached as .config)
compiler: alpha-linux-gnu-gcc (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
        wget https://raw.githubusercontent.com/01org/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=alpha 

All errors (new ones prefixed by >>):

   In file included from include/linux/string.h:18:0,
                    from include/linux/bitmap.h:8,
                    from include/linux/cpumask.h:11,
                    from include/linux/rcupdate.h:40,
                    from include/linux/rculist.h:10,
                    from include/linux/pid.h:4,
                    from include/linux/sched.h:13,
                    from arch/alpha/kernel/asm-offsets.c:9:
   arch/alpha/include/asm/string.h: In function 'memset16':
>> arch/alpha/include/asm/string.h:74:2: error: expected ';' before 'return'
     return __memset16(p, v, n * 2);
     ^~~~~~
   make[2]: *** [arch/alpha/kernel/asm-offsets.s] Error 1
   make[2]: Target '__build' not remade because of errors.
   make[1]: *** [prepare0] Error 2
   make[1]: Target 'prepare' not remade because of errors.
   make: *** [sub-make] Error 2

vim +74 arch/alpha/include/asm/string.h

    68	#define __HAVE_ARCH_MEMSET16
    69	extern void * __memset16(void *dest, unsigned short, size_t count);
    70	static inline void *memset16(uint16_t *p, uint16_t v, size_t n)
    71	{
    72		if (__builtin_constant_p(v))
    73			return __constant_c_memset(p, 0x0001000100010001UL * v, n * 2)
  > 74		return __memset16(p, v, n * 2);
    75	}
    76	
    77	#endif /* __KERNEL__ */
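
The error points at line 74 because line 73 of the excerpt has no
terminating ';', so the following 'return' is parsed as part of the same
statement.  As a side note on what line 73 computes: multiplying a 16-bit
value by 0x0001000100010001 replicates it into all four 16-bit lanes of a
64-bit word.  A minimal userspace sketch of that arithmetic follows;
replicate16() is a hypothetical name used only for this illustration and
is not part of the patch.

```c
#include <stdint.h>

/*
 * Hypothetical helper: replicate a 16-bit value into each of the four
 * 16-bit lanes of a 64-bit word, as the constant multiply on line 73
 * of the excerpt does.  No lane can carry into the next because the
 * value is below 2^16.
 */
static uint64_t replicate16(uint16_t v)
{
	return 0x0001000100010001ULL * v;
}
```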

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 49576 bytes --]

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 3/7] x86: Implement memset16, memset32 & memset64
  2017-03-24 16:13   ` Matthew Wilcox
                       ` (2 preceding siblings ...)
  (?)
@ 2017-03-26  7:44     ` kbuild test robot
  -1 siblings, 0 replies; 50+ messages in thread
From: kbuild test robot @ 2017-03-26  7:44 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: kbuild-all, linux-kernel, linux-fbdev, linux-arch, linux-alpha,
	linux-arm-kernel, x86, linux-mips, linuxppc-dev, sparclinux,
	Minchan Kim, Matthew Wilcox

[-- Attachment #1: Type: text/plain, Size: 3053 bytes --]

Hi Matthew,

[auto build test ERROR on linus/master]
[also build test ERROR on v4.11-rc3 next-20170324]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Matthew-Wilcox/Add-memsetN-functions/20170326-140108
config: i386-randconfig-x077-201713 (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

All errors (new ones prefixed by >>):

>> lib/string.c:733:7: error: redefinition of 'memset32'
    void *memset32(uint32_t *s, uint32_t v, size_t count)
          ^~~~~~~~
   In file included from arch/x86/include/asm/string.h:2:0,
                    from include/linux/string.h:18,
                    from lib/string.c:23:
   arch/x86/include/asm/string_32.h:347:21: note: previous definition of 'memset32' was here
    static inline void *memset32(uint32_t *s, uint32_t v, size_t n)
                        ^~~~~~~~

vim +/memset32 +733 lib/string.c

9114f9de Matthew Wilcox 2017-03-24  717  	return s;
9114f9de Matthew Wilcox 2017-03-24  718  }
9114f9de Matthew Wilcox 2017-03-24  719  EXPORT_SYMBOL(memset16);
9114f9de Matthew Wilcox 2017-03-24  720  #endif
9114f9de Matthew Wilcox 2017-03-24  721  
9114f9de Matthew Wilcox 2017-03-24  722  #ifndef __HAVE_ARCH_MEMSET32
9114f9de Matthew Wilcox 2017-03-24  723  /**
9114f9de Matthew Wilcox 2017-03-24  724   * memset32() - Fill a memory area with a uint32_t
9114f9de Matthew Wilcox 2017-03-24  725   * @s: Pointer to the start of the area.
9114f9de Matthew Wilcox 2017-03-24  726   * @v: The value to fill the area with
9114f9de Matthew Wilcox 2017-03-24  727   * @count: The number of values to store
9114f9de Matthew Wilcox 2017-03-24  728   *
9114f9de Matthew Wilcox 2017-03-24  729   * Differs from memset() in that it fills with a uint32_t instead
9114f9de Matthew Wilcox 2017-03-24  730   * of a byte.  Remember that @count is the number of uint32_ts to
9114f9de Matthew Wilcox 2017-03-24  731   * store, not the number of bytes.
9114f9de Matthew Wilcox 2017-03-24  732   */
9114f9de Matthew Wilcox 2017-03-24 @733  void *memset32(uint32_t *s, uint32_t v, size_t count)
9114f9de Matthew Wilcox 2017-03-24  734  {
9114f9de Matthew Wilcox 2017-03-24  735  	uint32_t *xs = s;
9114f9de Matthew Wilcox 2017-03-24  736  
9114f9de Matthew Wilcox 2017-03-24  737  	while (count--)
9114f9de Matthew Wilcox 2017-03-24  738  		*xs++ = v;
9114f9de Matthew Wilcox 2017-03-24  739  	return s;
9114f9de Matthew Wilcox 2017-03-24  740  }
9114f9de Matthew Wilcox 2017-03-24  741  EXPORT_SYMBOL(memset32);

:::::: The code at line 733 was first introduced by commit
:::::: 9114f9de5005f9468370ed1cb1b5b841b10d3bad Add multibyte memset functions

:::::: TO: Matthew Wilcox <mawilcox@microsoft.com>
:::::: CC: 0day robot <fengguang.wu@intel.com>
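
The generic definition quoted above is compiled only when
__HAVE_ARCH_MEMSET32 is not defined, so a redefinition error like this
one suggests that the 32-bit x86 header supplies an inline memset32()
without defining that symbol.  The self-contained sketch below shows the
intended guard pattern.  It is an assumption about the shape of the fix,
not the actual contents of string_32.h, and the loop in the "arch"
version is only a placeholder for an optimised implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* What an architecture header would provide (illustrative only). */
#define __HAVE_ARCH_MEMSET32
static inline void *memset32(uint32_t *s, uint32_t v, size_t n)
{
	uint32_t *xs = s;

	/* Placeholder loop; a real arch version would be optimised. */
	while (n--)
		*xs++ = v;
	return s;
}

/* Generic fallback, skipped here because the symbol above is defined. */
#ifndef __HAVE_ARCH_MEMSET32
void *memset32(uint32_t *s, uint32_t v, size_t count)
{
	uint32_t *xs = s;

	while (count--)
		*xs++ = v;
	return s;
}
#endif
```

With the define in place only one definition of memset32() is visible to
the compiler; removing the define reproduces a redefinition error of the
kind reported here.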

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 25827 bytes --]

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 7/7] vga: Optimise console scrolling
  2017-03-24 16:13   ` Matthew Wilcox
                       ` (2 preceding siblings ...)
  (?)
@ 2017-03-26  8:45     ` kbuild test robot
  -1 siblings, 0 replies; 50+ messages in thread
From: kbuild test robot @ 2017-03-26  8:45 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: kbuild-all, linux-kernel, linux-fbdev, linux-arch, linux-alpha,
	linux-arm-kernel, x86, linux-mips, linuxppc-dev, sparclinux,
	Minchan Kim, Matthew Wilcox

[-- Attachment #1: Type: text/plain, Size: 4553 bytes --]

Hi Matthew,

[auto build test ERROR on linus/master]
[also build test ERROR on v4.11-rc3 next-20170324]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Matthew-Wilcox/Add-memsetN-functions/20170326-140108
config: sparc-allyesconfig (attached as .config)
compiler: sparc64-linux-gnu-gcc (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
        wget https://raw.githubusercontent.com/01org/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=sparc 

All error/warnings (new ones prefixed by >>):

   In file included from include/video/vga.h:22:0,
                    from include/linux/vgaarb.h:34,
                    from drivers/gpu//drm/amd/amdgpu/amdgpu_device.c:35:
   arch/sparc/include/asm/vga.h: In function 'scr_memsetw':
>> arch/sparc/include/asm/vga.h:39:11: error: 's' undeclared (first use in this function)
     memset16(s, cpu_to_le16(v), n / 2);
              ^
   arch/sparc/include/asm/vga.h:39:11: note: each undeclared identifier is reported only once for each function it appears in
--
   In file included from include/video/vga.h:22:0,
                    from include/linux/vgaarb.h:34,
                    from drivers/gpu//drm/nouveau/nouveau_vga.c:1:
   arch/sparc/include/asm/vga.h: In function 'scr_memsetw':
>> arch/sparc/include/asm/vga.h:39:2: error: implicit declaration of function 'memset16' [-Werror=implicit-function-declaration]
     memset16(s, cpu_to_le16(v), n / 2);
     ^~~~~~~~
>> arch/sparc/include/asm/vga.h:39:11: error: 's' undeclared (first use in this function)
     memset16(s, cpu_to_le16(v), n / 2);
              ^
   arch/sparc/include/asm/vga.h:39:11: note: each undeclared identifier is reported only once for each function it appears in
   arch/sparc/include/asm/vga.h: In function 'scr_memcpyw':
>> arch/sparc/include/asm/vga.h:46:2: error: implicit declaration of function 'memcpy' [-Werror=implicit-function-declaration]
     memcpy(d, s, n);
     ^~~~~~
>> arch/sparc/include/asm/vga.h:46:2: warning: incompatible implicit declaration of built-in function 'memcpy'
   arch/sparc/include/asm/vga.h:46:2: note: include '<string.h>' or provide a declaration of 'memcpy'
   arch/sparc/include/asm/vga.h: In function 'scr_memmovew':
>> arch/sparc/include/asm/vga.h:53:2: error: implicit declaration of function 'memmove' [-Werror=implicit-function-declaration]
     memmove(d, s, n);
     ^~~~~~~
>> arch/sparc/include/asm/vga.h:53:2: warning: incompatible implicit declaration of built-in function 'memmove'
   arch/sparc/include/asm/vga.h:53:2: note: include '<string.h>' or provide a declaration of 'memmove'
   In file included from include/uapi/linux/uuid.h:21:0,
                    from include/linux/uuid.h:19,
                    from include/linux/mod_devicetable.h:12,
                    from include/linux/i2c.h:29,
                    from include/uapi/linux/fb.h:5,
                    from include/linux/fb.h:5,
                    from include/linux/vga_switcheroo.h:34,
                    from drivers/gpu//drm/nouveau/nouveau_vga.c:2:
   include/linux/string.h: At top level:
>> include/linux/string.h:104:14: error: conflicting types for 'memset16'
    extern void *memset16(uint16_t *, uint16_t, __kernel_size_t);
                 ^~~~~~~~
   In file included from include/video/vga.h:22:0,
                    from include/linux/vgaarb.h:34,
                    from drivers/gpu//drm/nouveau/nouveau_vga.c:1:
   arch/sparc/include/asm/vga.h:39:2: note: previous implicit declaration of 'memset16' was here
     memset16(s, cpu_to_le16(v), n / 2);
     ^~~~~~~~
   cc1: some warnings being treated as errors

vim +/s +39 arch/sparc/include/asm/vga.h

    33	}
    34	
    35	static inline void scr_memsetw(u16 *p, u16 v, unsigned int n)
    36	{
    37		BUG_ON((long) p >= 0);
    38	
  > 39		memset16(s, cpu_to_le16(v), n / 2);
    40	}
    41	
    42	static inline void scr_memcpyw(u16 *d, u16 *s, unsigned int n)
    43	{
    44		BUG_ON((long) d >= 0);
    45	
  > 46		memcpy(d, s, n);
    47	}
    48	
    49	static inline void scr_memmovew(u16 *d, u16 *s, unsigned int n)
    50	{
    51		BUG_ON((long) d >= 0);
    52	
  > 53		memmove(d, s, n);
    54	}
    55	
    56	#define VGA_MAP_MEM(x,s) (x)
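
Reading the log above, two separate problems seem to be involved: scr_memsetw() names its pointer parameter 'p' but passes 's' to memset16(), and no declaration of memset16(), memcpy() or memmove() is visible when the header is parsed. The following is only a user-space sketch of what a corrected helper might look like, not the patch author's actual fix. The memset16() here is a plain-loop stand-in, to_le16() is a hypothetical replacement for cpu_to_le16(), and assert() stands in for the BUG_ON() check; in the kernel tree the declarations would presumably come from including <linux/string.h> ahead of the helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the kernel's memset16(): defined before first use so the
 * compiler never falls back to an implicit declaration. */
static void *memset16(uint16_t *s, uint16_t v, size_t count)
{
	uint16_t *xs = s;

	while (count--)
		*xs++ = v;
	return s;
}

/* Hypothetical stand-in for cpu_to_le16(): produces the little-endian
 * byte layout regardless of host endianness. */
static uint16_t to_le16(uint16_t v)
{
	uint8_t b[2] = { (uint8_t)(v & 0xff), (uint8_t)(v >> 8) };
	uint16_t out;

	memcpy(&out, b, sizeof(out));
	return out;
}

/* n is a byte count, as in the quoted code, hence n / 2 words are written. */
static void scr_memsetw(uint16_t *p, uint16_t v, unsigned int n)
{
	assert(p != NULL);	/* user-space placeholder for the BUG_ON() check */

	/* Pass the declared parameter 'p'; the quoted code passed 's'. */
	memset16(p, to_le16(v), n / 2);
}
```

With this shape, a call such as scr_memsetw(buf, 0x0720, 6) fills three 16-bit cells and leaves the rest of the buffer untouched.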

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 49779 bytes --]

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 7/7] vga: Optimise console scrolling
  2017-03-24 16:13   ` Matthew Wilcox
  (?)
  (?)
@ 2017-03-26  9:53     ` kbuild test robot
  -1 siblings, 0 replies; 50+ messages in thread
From: kbuild test robot @ 2017-03-26  9:53 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: kbuild-all, linux-kernel, linux-fbdev, linux-arch, linux-alpha,
	linux-arm-kernel, x86, linux-mips, linuxppc-dev, sparclinux,
	Minchan Kim, Matthew Wilcox

[-- Attachment #1: Type: text/plain, Size: 6527 bytes --]

Hi Matthew,

[auto build test ERROR on linus/master]
[also build test ERROR on v4.11-rc3 next-20170324]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Matthew-Wilcox/Add-memsetN-functions/20170326-140108
config: mips-defconfig (attached as .config)
compiler: mips-linux-gnu-gcc (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
        wget https://raw.githubusercontent.com/01org/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=mips 

All errors (new ones prefixed by >>):

   In file included from include/linux/selection.h:11:0,
                    from drivers/video/console/newport_con.c:16:
   include/linux/vt_buffer.h: In function 'scr_memsetw':
>> include/linux/vt_buffer.h:34:2: error: implicit declaration of function 'memset16' [-Werror=implicit-function-declaration]
     memset16(s, c, count / 2);
     ^~~~~~~~
   include/linux/vt_buffer.h: In function 'scr_memcpyw':
>> include/linux/vt_buffer.h:47:2: error: implicit declaration of function 'memcpy' [-Werror=implicit-function-declaration]
     memcpy(d, s, count);
     ^~~~~~
   include/linux/vt_buffer.h: In function 'scr_memmovew':
>> include/linux/vt_buffer.h:66:2: error: implicit declaration of function 'memmove' [-Werror=implicit-function-declaration]
     memmove(d, s, count);
     ^~~~~~~
   In file included from include/linux/string.h:18:0,
                    from include/linux/bitmap.h:8,
                    from include/linux/cpumask.h:11,
                    from arch/mips/include/asm/processor.h:15,
                    from arch/mips/include/asm/thread_info.h:15,
                    from include/linux/thread_info.h:25,
                    from include/asm-generic/preempt.h:4,
                    from ./arch/mips/include/generated/asm/preempt.h:1,
                    from include/linux/preempt.h:80,
                    from include/linux/spinlock.h:50,
                    from include/linux/wait.h:8,
                    from include/linux/fs.h:5,
                    from include/linux/tty.h:4,
                    from include/linux/vt_kern.h:11,
                    from drivers/video/console/newport_con.c:18:
   arch/mips/include/asm/string.h: At top level:
>> arch/mips/include/asm/string.h:138:14: error: conflicting types for 'memcpy'
    extern void *memcpy(void *__to, __const__ void *__from, size_t __n);
                 ^~~~~~
   In file included from include/linux/selection.h:11:0,
                    from drivers/video/console/newport_con.c:16:
   include/linux/vt_buffer.h:47:2: note: previous implicit declaration of 'memcpy' was here
     memcpy(d, s, count);
     ^~~~~~
   In file included from include/linux/string.h:18:0,
                    from include/linux/bitmap.h:8,
                    from include/linux/cpumask.h:11,
                    from arch/mips/include/asm/processor.h:15,
                    from arch/mips/include/asm/thread_info.h:15,
                    from include/linux/thread_info.h:25,
                    from include/asm-generic/preempt.h:4,
                    from ./arch/mips/include/generated/asm/preempt.h:1,
                    from include/linux/preempt.h:80,
                    from include/linux/spinlock.h:50,
                    from include/linux/wait.h:8,
                    from include/linux/fs.h:5,
                    from include/linux/tty.h:4,
                    from include/linux/vt_kern.h:11,
                    from drivers/video/console/newport_con.c:18:
>> arch/mips/include/asm/string.h:141:14: error: conflicting types for 'memmove'
    extern void *memmove(void *__dest, __const__ void *__src, size_t __n);
                 ^~~~~~~
   In file included from include/linux/selection.h:11:0,
                    from drivers/video/console/newport_con.c:16:
   include/linux/vt_buffer.h:66:2: note: previous implicit declaration of 'memmove' was here
     memmove(d, s, count);
     ^~~~~~~
   In file included from include/linux/bitmap.h:8:0,
                    from include/linux/cpumask.h:11,
                    from arch/mips/include/asm/processor.h:15,
                    from arch/mips/include/asm/thread_info.h:15,
                    from include/linux/thread_info.h:25,
                    from include/asm-generic/preempt.h:4,
                    from ./arch/mips/include/generated/asm/preempt.h:1,
                    from include/linux/preempt.h:80,
                    from include/linux/spinlock.h:50,
                    from include/linux/wait.h:8,
                    from include/linux/fs.h:5,
                    from include/linux/tty.h:4,
                    from include/linux/vt_kern.h:11,
                    from drivers/video/console/newport_con.c:18:
   include/linux/string.h:104:14: error: conflicting types for 'memset16'
    extern void *memset16(uint16_t *, uint16_t, __kernel_size_t);
                 ^~~~~~~~
   In file included from include/linux/selection.h:11:0,
                    from drivers/video/console/newport_con.c:16:
   include/linux/vt_buffer.h:34:2: note: previous implicit declaration of 'memset16' was here
     memset16(s, c, count / 2);
     ^~~~~~~~
   cc1: some warnings being treated as errors

vim +/memset16 +34 include/linux/vt_buffer.h

    28	{
    29	#ifdef VT_BUF_HAVE_RW
    30		count /= 2;
    31		while (count--)
    32			scr_writew(c, s++);
    33	#else
  > 34		memset16(s, c, count / 2);
    35	#endif
    36	}
    37	#endif
    38	
    39	#ifndef VT_BUF_HAVE_MEMCPYW
    40	static inline void scr_memcpyw(u16 *d, const u16 *s, unsigned int count)
    41	{
    42	#ifdef VT_BUF_HAVE_RW
    43		count /= 2;
    44		while (count--)
    45			scr_writew(scr_readw(s++), d++);
    46	#else
  > 47		memcpy(d, s, count);
    48	#endif
    49	}
    50	#endif
    51	
    52	#ifndef VT_BUF_HAVE_MEMMOVEW
    53	static inline void scr_memmovew(u16 *d, const u16 *s, unsigned int count)
    54	{
    55	#ifdef VT_BUF_HAVE_RW
    56		if (d < s)
    57			scr_memcpyw(d, s, count);
    58		else {
    59			count /= 2;
    60			d += count;
    61			s += count;
    62			while (count--)
    63				scr_writew(scr_readw(--s), --d);
    64		}
    65	#else
  > 66		memmove(d, s, count);
    67	#endif
    68	}
    69	#endif
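
The mips log suggests that include/linux/vt_buffer.h is parsed before anything has pulled in the string declarations, so memcpy() and memmove() are implicitly declared and then clash with the real prototypes once arch/mips/include/asm/string.h is reached. A minimal user-space sketch of the non-VT_BUF_HAVE_RW branch, assuming the remedy is simply for the header to include what it uses, could look like this. It uses <string.h> where the kernel header would presumably use <linux/string.h>, and uint16_t in place of u16.

```c
#include <stdint.h>
#include <string.h>	/* declares memcpy() and memmove() before first use */

/* count is in bytes, matching the quoted helpers. */
static inline void scr_memcpyw(uint16_t *d, const uint16_t *s, unsigned int count)
{
	memcpy(d, s, count);
}

/* memmove() copes with overlapping source and destination. */
static inline void scr_memmovew(uint16_t *d, const uint16_t *s, unsigned int count)
{
	memmove(d, s, count);
}
```

Because the declarations are in scope before the inline helpers, the compiler has no reason to invent an implicit prototype, and the later "conflicting types" errors should not arise.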

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 13581 bytes --]

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 5/7] zram: Convert to using memset_l
  2017-03-24 16:13   ` Matthew Wilcox
  (?)
  (?)
@ 2017-03-27  5:01     ` Minchan Kim
  -1 siblings, 0 replies; 50+ messages in thread
From: Minchan Kim @ 2017-03-27  5:01 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-kernel, linux-fbdev, linux-arch, linux-alpha,
	linux-arm-kernel, x86, linux-mips, linuxppc-dev, sparclinux,
	Matthew Wilcox

On Fri, Mar 24, 2017 at 09:13:16AM -0700, Matthew Wilcox wrote:
> From: Matthew Wilcox <mawilcox@microsoft.com>
> 
> zram was the motivation for creating memset_l().  Minchan Kim sees a 7%
> performance improvement on x86 with 100MB of non-zero deduplicatable
> data:
> 
>         perf stat -r 10 dd if=/dev/zram0 of=/dev/null
> 
> vanilla:        0.232050465 seconds time elapsed ( +-  0.51% )
> memset_l:	0.217219387 seconds time elapsed ( +-  0.07% )
> 
> Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
> Tested-by: Minchan Kim <minchan@kernel.org>
Acked-by: Minchan Kim <minchan@kernel.org>

Thanks!

^ permalink raw reply	[flat|nested] 50+ messages in thread
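
The "7%" figure quoted above can be checked from the two elapsed times in the commit message. The sketch below uses hypothetical helper names and derives both views of the same result: the run time falls by roughly 6.4%, which corresponds to a throughput gain of roughly 6.8% for the same amount of data read.

```c
#include <assert.h>
#include <stdio.h>

/* Elapsed times quoted in the patch description, in seconds. */
static const double vanilla_secs  = 0.232050465;
static const double memset_l_secs = 0.217219387;

/* Percentage reduction in elapsed time relative to the baseline run. */
static double time_reduction_pct(double before, double after)
{
	assert(before > 0.0);
	return (before - after) / before * 100.0;
}

/* Percentage gain in throughput when the same data is read in less time. */
static double throughput_gain_pct(double before, double after)
{
	assert(after > 0.0);
	return (before / after - 1.0) * 100.0;
}

/* Print both figures for the vanilla and memset_l runs. */
static void report(void)
{
	printf("time reduction:  %.2f%%\n",
	       time_reduction_pct(vanilla_secs, memset_l_secs));
	printf("throughput gain: %.2f%%\n",
	       throughput_gain_pct(vanilla_secs, memset_l_secs));
}
```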

end of thread, other threads:[~2017-03-27  5:03 UTC | newest]

Thread overview:
2017-03-24 16:13 [PATCH v3 0/7] Add memsetN functions Matthew Wilcox
2017-03-24 16:13 ` [PATCH v3 1/7] Add multibyte memset functions Matthew Wilcox
2017-03-24 16:13 ` [PATCH v3 2/7] ARM: Implement memset16, memset32 & memset64 Matthew Wilcox
2017-03-24 16:13 ` [PATCH v3 3/7] x86: " Matthew Wilcox
2017-03-26  7:44   ` kbuild test robot
2017-03-24 16:13 ` [PATCH v3 4/7] alpha: Add support for memset16 Matthew Wilcox
2017-03-26  7:28   ` kbuild test robot
2017-03-24 16:13 ` [PATCH v3 5/7] zram: Convert to using memset_l Matthew Wilcox
2017-03-27  5:01   ` Minchan Kim
2017-03-24 16:13 ` [PATCH v3 6/7] sym53c8xx_2: Convert to use memset32 Matthew Wilcox
2017-03-24 16:13 ` [PATCH v3 7/7] vga: Optimise console scrolling Matthew Wilcox
2017-03-26  8:45   ` kbuild test robot
2017-03-26  9:53   ` kbuild test robot
