* [PATCH v2 0/4] usercopy: generic tests + arm64 fixes
@ 2023-03-21 12:25 Mark Rutland
  2023-03-21 12:25 ` [PATCH v2 1/4] lib: test copy_{to,from}_user() Mark Rutland
                   ` (4 more replies)
  0 siblings, 5 replies; 18+ messages in thread
From: Mark Rutland @ 2023-03-21 12:25 UTC (permalink / raw)
  To: linux-kernel, Al Viro, Catalin Marinas, Linus Torvalds, Will Deacon
  Cc: agordeev, aou, bp, dave.hansen, davem, gor, hca, linux-arch,
	linux, mark.rutland, mingo, palmer, paul.walmsley, robin.murphy,
	tglx

This series adds tests for the usercopy functions, which have found a
few issues on most of the architectures I've tested. The last two
patches are fixes for arm64; with these applied, arm64 passes all the
tests, and performance is at least as good as the existing
implementations.

Arch maintainers: if you are Cc'd on patches 1 or 2, I believe that your
architecture has an issue (which should be called out explicitly in the
commit message). I've only been able to test on some architectures, so
other architectures may be affected.

Al, Linus: could you double-check that I have the requirements right?
I'm going by Al's prior message at:

  https://lore.kernel.org/all/YNSyZaZtPTmTa5P8@zeniv-ca.linux.org.uk/

which mentions the commentary in include/linux/uaccess.h:

 * If raw_copy_{to,from}_user(to, from, size) returns N, size - N bytes starting
 * at to must become equal to the bytes fetched from the corresponding area
 * starting at from.  All data past to + size - N must be left unmodified.
 *
 * If copying succeeds, the return value must be 0.  If some data cannot be
 * fetched, it is permitted to copy less than had been fetched; the only
 * hard requirement is that not storing anything at all (i.e. returning size)
 * should happen only when nothing could be copied.  In other words, you don't
 * have to squeeze as much as possible - it is allowed, but not necessary.

I've made two additional assumptions:

1) That a usercopy in a non-atomic context shouldn't have a partial
   failure if it's possible for it to succeed completely, i.e. callers
   don't need to call it in a loop if they only care about it entirely
   succeeding or failing.

   Code all over the place seems to rely upon that (e.g. signal code
   using copy_to_user() to place signal frames), so I assume this must
   be the case or we have bigger issues.

2) That clear_user() has the same requirements, given it has the same
   rough shape as raw_copy_{to,from}_user(), and I believe it's used in
   the same way in the iov_iter code (but I could be mistaken).
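
Taken together, my reading of the above boils down to the check below
(a sketch in plain C with my own names, not code from this series),
where 'avail' is the number of bytes accessible at the start of the
destination (0 if the first byte faults):

  #include <stdbool.h>
  #include <stddef.h>

  /* Is 'ret' (bytes not copied) acceptable for this copy? */
  static bool usercopy_ret_ok(size_t size, size_t avail, size_t ret)
  {
          if (ret > size)
                  return false;        /* impossibly large */
          if (avail == 0)
                  return ret == size;  /* nothing could be copied */
          if (avail >= size)
                  return ret == 0;     /* fully possible copy must succeed */
          /* partial copy: at least 1 byte, at most 'avail' bytes copied */
          return ret != size && (size - ret) <= avail;
  }

This is effectively what assert_size_valid() in patch 1 checks.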

Since v1 [1]:
* Trivial rebase to v6.3-rc3
* Simplify the test harness
* Improve comments
* Improve commit messages
* Expand Cc list

[1] https://lore.kernel.org/lkml/20230314115030.347976-1-mark.rutland@arm.com/

Thanks,
Mark.

Mark Rutland (4):
  lib: test copy_{to,from}_user()
  lib: test clear_user()
  arm64: fix __raw_copy_to_user semantics
  arm64: fix clear_user() semantics

 arch/arm64/lib/clear_user.S   | 195 ++++++++++---
 arch/arm64/lib/copy_to_user.S | 203 +++++++++++---
 lib/Kconfig.debug             |   9 +
 lib/Makefile                  |   1 +
 lib/usercopy_kunit.c          | 506 ++++++++++++++++++++++++++++++++++
 5 files changed, 834 insertions(+), 80 deletions(-)
 create mode 100644 lib/usercopy_kunit.c

-- 
2.30.2


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v2 1/4] lib: test copy_{to,from}_user()
  2023-03-21 12:25 [PATCH v2 0/4] usercopy: generic tests + arm64 fixes Mark Rutland
@ 2023-03-21 12:25 ` Mark Rutland
  2023-03-21 17:09   ` Catalin Marinas
  2023-03-21 18:04   ` Linus Torvalds
  2023-03-21 12:25 ` [PATCH v2 2/4] lib: test clear_user() Mark Rutland
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 18+ messages in thread
From: Mark Rutland @ 2023-03-21 12:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: agordeev, aou, bp, catalin.marinas, dave.hansen, davem, gor, hca,
	linux-arch, linux, mark.rutland, mingo, palmer, paul.walmsley,
	robin.murphy, tglx, torvalds, viro, will

Historically, implementations of raw_copy_{to,from}_user() haven't
correctly handled the requirements laid forth in <linux/uaccess.h>, e.g.
as Al spotted on arm64:

  https://lore.kernel.org/all/YNSyZaZtPTmTa5P8@zeniv-ca.linux.org.uk/

Similarly, we've had other issues where incorrect fault handling logic
lead to memory corruption, e.g.

  https://lore.kernel.org/linux-arm-kernel/Y44gVm7IEMXqilef@FVFF77S0Q05N.cambridge.arm.com/

These issues are easy to introduce and hard to spot as we don't have any
systematic tests for the faulting paths.

This patch adds KUnit tests for raw_copy_{to,from}_user() which
systematically test the faulting paths, allowing us to detect and avoid
problems such as those above. The test checks copy sizes of 0 to 128
bytes at every possible alignment relative to leading/trailing page
boundaries to exhaustively check faulting and success paths.
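
As a concrete example (with PAGE_SIZE=4096): for offset=4081 and
size=16, only the first 15 bytes of the destination are accessible, so
the test requires that between 1 and 15 bytes are copied, i.e. a return
value in the range [1, 15]. A return of 0 would imply that bytes in the
faulting page were written, and a return of 16 would mean nothing was
copied even though a partial copy was possible.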

I've given this a spin on a few architectures using the KUnit QEMU
harness, and it looks like most get *something* wrong. From those
initial runs:

* arm64's copy_to_user() under-reports the number of bytes copied in
  some cases, e.g.

  | post-destination bytes modified (dst_page[4082]=0x1, offset=4081, size=16, ret=15)

* arm's copy_to_user() under-reports the number of bytes copied in some
  cases, and both copy_to_user() and copy_from_user() don't guarantee
  that at least a single byte is copied when a partial copy is possible,
  e.g.

  | post-destination bytes modified (dst_page[4068]=0x1, offset=4067, size=33, ret=32)
  | too few bytes consumed (offset=4068, size=32, ret=32)

* i386's copy_from_user() does not guarantee that at least a single byte
  is copied when a partial copy is possible, e.g.

  | too few bytes consumed (offset=4093, size=8, ret=8)

* riscv's copy_to_user() and copy_from_user() don't guarantee that at
  least a single byte is copied when a partial copy is possible, e.g.

  | too few bytes consumed (offset=4095, size=2, ret=2)

* s390 passes all tests

* sparc's copy_from_user() over-reports the number of bytes copied in
  some cases, e.g.

  | copied bytes incorrect (dst_page[0+0]=0x0, src_page[4095+0]=0xff, offset=4095, size=2, ret=1

* x86_64 passes all tests

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arch@vger.kernel.org
---
 lib/Kconfig.debug    |   9 +
 lib/Makefile         |   1 +
 lib/usercopy_kunit.c | 431 +++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 441 insertions(+)
 create mode 100644 lib/usercopy_kunit.c

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index c8b379e2e9adc..bd737db4ef06a 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -2647,6 +2647,15 @@ config SIPHASH_KUNIT_TEST
 	  This is intended to help people writing architecture-specific
 	  optimized versions.  If unsure, say N.
 
+config USERCOPY_KUNIT_TEST
+	bool "KUnit tests for usercopy functions" if !KUNIT_ALL_TESTS
+	depends on KUNIT=y
+	default KUNIT_ALL_TESTS
+	help
+	  Tests for faulting behaviour of copy_{to,from}_user().
+
+	  If unsure, say N.
+
 config TEST_UDELAY
 	tristate "udelay test driver"
 	help
diff --git a/lib/Makefile b/lib/Makefile
index baf2821f7a00f..37fb438e6c433 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -391,6 +391,7 @@ CFLAGS_fortify_kunit.o += $(DISABLE_STRUCTLEAK_PLUGIN)
 obj-$(CONFIG_FORTIFY_KUNIT_TEST) += fortify_kunit.o
 obj-$(CONFIG_STRSCPY_KUNIT_TEST) += strscpy_kunit.o
 obj-$(CONFIG_SIPHASH_KUNIT_TEST) += siphash_kunit.o
+obj-$(CONFIG_USERCOPY_KUNIT_TEST) += usercopy_kunit.o
 
 obj-$(CONFIG_GENERIC_LIB_DEVMEM_IS_ALLOWED) += devmem_is_allowed.o
 
diff --git a/lib/usercopy_kunit.c b/lib/usercopy_kunit.c
new file mode 100644
index 0000000000000..45983952cc079
--- /dev/null
+++ b/lib/usercopy_kunit.c
@@ -0,0 +1,431 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Test cases for usercopy faulting behaviour
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <kunit/test.h>
+
+#include <linux/highmem.h>
+#include <linux/kthread.h>
+#include <linux/mm.h>
+#include <linux/sched/mm.h>
+#include <linux/uaccess.h>
+#include <linux/vmalloc.h>
+
+/*
+ * Arbitrarily chosen user address for the test page.
+ */
+#define UBUF_ADDR_BASE	SZ_2M
+
+struct usercopy_env {
+	struct mm_struct		*mm;
+	void				*kbuf;
+	struct page			*ubuf_page;
+	void				*ubuf;
+};
+
+struct usercopy_params {
+	long		offset;
+	unsigned long	size;
+};
+
+static void *usercopy_env_alloc(void)
+{
+	struct usercopy_env *env = kzalloc(sizeof(*env), GFP_KERNEL);
+	if (!env)
+		return NULL;
+
+	env->kbuf = vmalloc(PAGE_SIZE);
+	if (!env->kbuf)
+		goto out_free_env;
+
+	return env;
+
+out_free_env:
+	kfree(env);
+	return NULL;
+}
+
+static void usercopy_env_free(struct usercopy_env *env)
+{
+	vfree(env->kbuf);
+	kfree(env);
+}
+
+static void *usercopy_mm_alloc(struct usercopy_env *env)
+{
+	struct mm_struct *mm;
+	struct vm_area_struct *vma;
+	mm = mm_alloc();
+	if (!mm)
+		return NULL;
+
+	if (mmap_write_lock_killable(mm))
+		goto out_free;
+
+	vma = vm_area_alloc(mm);
+	if (!vma)
+		goto out_unlock;
+
+	vma_set_anonymous(vma);
+	vma->vm_start = UBUF_ADDR_BASE;
+	vma->vm_end = UBUF_ADDR_BASE + PAGE_SIZE;
+	vm_flags_init(vma, VM_READ | VM_MAYREAD | VM_WRITE | VM_MAYWRITE);
+	vma->vm_page_prot = vm_get_page_prot(vma->vm_flags);
+
+	if (insert_vm_struct(mm, vma))
+		goto out_free_vma;
+
+	mmap_write_unlock(mm);
+	return mm;
+
+out_free_vma:
+	vm_area_free(vma);
+out_unlock:
+	mmap_write_unlock(mm);
+out_free:
+	mmput(mm);
+	return NULL;
+}
+
+static void usercopy_mm_free(struct mm_struct *mm)
+{
+	mmput(mm);
+}
+
+static struct page *usercopy_ubuf_pin(struct usercopy_env *env)
+{
+	struct page *p = NULL;
+
+	kthread_use_mm(env->mm);
+	pin_user_pages_unlocked(UBUF_ADDR_BASE, 1, &p, FOLL_LONGTERM);
+	kthread_unuse_mm(env->mm);
+
+	return p;
+}
+
+static void usercopy_ubuf_unpin(struct usercopy_env *env)
+{
+	unpin_user_page(env->ubuf_page);
+}
+
+static int usercopy_test_init(struct kunit *test)
+{
+	struct usercopy_env *env;
+
+	env = usercopy_env_alloc();
+	if (!env)
+		return -ENOMEM;
+
+	env->mm = usercopy_mm_alloc(env);
+	if (!env->mm)
+		goto out_free_env;
+
+	env->ubuf_page = usercopy_ubuf_pin(env);
+	if (!env->ubuf_page)
+		goto out_free_mm;
+
+	env->ubuf = kmap(env->ubuf_page);
+	if (!env->ubuf)
+		goto out_unpin_ubuf;
+
+	test->priv = env;
+
+	return 0;
+
+out_unpin_ubuf:
+	usercopy_ubuf_unpin(env);
+out_free_mm:
+	usercopy_mm_free(env->mm);
+out_free_env:
+	usercopy_env_free(env);
+	return -ENOMEM;
+}
+
+static void usercopy_test_exit(struct kunit *test)
+{
+	struct usercopy_env *env = test->priv;
+
+	kunmap(env->ubuf);
+
+	usercopy_ubuf_unpin(env);
+	usercopy_mm_free(env->mm);
+	usercopy_env_free(env);
+}
+
+static char buf_pattern(int offset)
+{
+	return offset & 0xff;
+}
+
+static void buf_init_pattern(char *buf)
+{
+	for (int i = 0; i < PAGE_SIZE; i++)
+		buf[i] = buf_pattern(i);
+}
+
+static void buf_init_zero(char *buf)
+{
+	memset(buf, 0, PAGE_SIZE);
+}
+
+static void assert_size_valid(struct kunit *test,
+			      const struct usercopy_params *params,
+			      unsigned long ret)
+{
+	const unsigned long size = params->size;
+	const unsigned long offset = params->offset;
+
+	if (ret > size) {
+		KUNIT_ASSERT_FAILURE(test,
+			"return value is impossibly large (offset=%ld, size=%lu, ret=%lu)",
+			offset, size, ret);
+	}
+
+	/*
+	 * When the user buffer starts within a faulting page, no bytes can be
+	 * copied, so ret must equal size.
+	 */
+	if (offset < 0 || offset >= PAGE_SIZE) {
+		if (ret == size)
+			return;
+
+		KUNIT_ASSERT_FAILURE(test,
+			"impossible copy did not fail (offset=%ld, size=%lu, ret=%lu)",
+			offset, size, ret);
+	}
+
+	/*
+	 * When the user buffer fits entirely within a non-faulting page, all
+	 * bytes must be copied, so ret must equal 0.
+	 */
+	if (offset + size <= PAGE_SIZE) {
+		if (ret == 0)
+			return;
+
+		KUNIT_ASSERT_FAILURE(test,
+			"completely possible copy failed (offset=%ld, size=%lu, ret=%lu)",
+			offset, size, ret);
+	}
+
+	/*
+	 * The buffer starts in a non-faulting page, but continues into a
+	 * faulting page. At least one byte must be copied, and at most all the
+	 * non-faulting bytes may be copied.
+	 */
+	if (ret == size) {
+		KUNIT_ASSERT_FAILURE(test,
+			"too few bytes consumed (offset=%ld, size=%lu, ret=%lu)",
+			offset, size, ret);
+	}
+
+	if (ret < (offset + size) - PAGE_SIZE) {
+		KUNIT_ASSERT_FAILURE(test,
+			   "too many bytes consumed (offset=%ld, size=%lu, ret=%lu)",
+			   offset, size, ret);
+	}
+}
+
+static void assert_src_valid(struct kunit *test,
+			     const struct usercopy_params *params,
+			     const char *src, long src_offset,
+			     unsigned long ret)
+{
+	const unsigned long size = params->size;
+	const unsigned long offset = params->offset;
+
+	/*
+	 * A usercopy MUST NOT modify the source buffer.
+	 */
+	for (int i = 0; i < PAGE_SIZE; i++) {
+		char val = src[i];
+
+		if (val == buf_pattern(i))
+			continue;
+
+		KUNIT_ASSERT_FAILURE(test,
+			"source bytes modified (src[%d]=0x%x, offset=%ld, size=%lu, ret=%lu)",
+			i, val, offset, size, ret);
+	}
+}
+
+static void assert_dst_valid(struct kunit *test,
+			     const struct usercopy_params *params,
+			     const char *dst, long dst_offset,
+			     unsigned long ret)
+{
+	const unsigned long size = params->size;
+	const unsigned long offset = params->offset;
+
+	/*
+	 * A usercopy MUST NOT modify any bytes before the destination buffer.
+	 */
+	for (int i = 0; i < dst_offset; i++) {
+		char val = dst[i];
+
+		if (val == 0)
+			continue;
+
+		KUNIT_ASSERT_FAILURE(test,
+			"pre-destination bytes modified (dst_page[%d]=0x%x, offset=%ld, size=%lu, ret=%lu)",
+			i, val, offset, size, ret);
+	}
+
+	/*
+	 * A usercopy MUST NOT modify any bytes after the end of the destination
+	 * buffer.
+	 */
+	for (int i = dst_offset + size - ret; i < PAGE_SIZE; i++) {
+		char val = dst[i];
+
+		if (val == 0)
+			continue;
+
+		KUNIT_ASSERT_FAILURE(test,
+			"post-destination bytes modified (dst_page[%d]=0x%x, offset=%ld, size=%lu, ret=%lu)",
+			i, val, offset, size, ret);
+	}
+}
+
+static void assert_copy_valid(struct kunit *test,
+			      const struct usercopy_params *params,
+			      const char *dst, long dst_offset,
+			      const char *src, long src_offset,
+			      unsigned long ret)
+{
+	const unsigned long size = params->size;
+	const unsigned long offset = params->offset;
+
+	/*
+	 * Have we actually copied the bytes we expected to?
+	 */
+	for (int i = 0; i < params->size - ret; i++) {
+		char dst_val = dst[dst_offset + i];
+		char src_val = src[src_offset + i];
+
+		if (dst_val == src_val)
+			continue;
+
+		KUNIT_ASSERT_FAILURE(test,
+			"copied bytes incorrect (dst_page[%ld+%d]=0x%x, src_page[%ld+%d]=0x%x, offset=%ld, size=%lu, ret=%lu",
+			dst_offset, i, dst_val,
+			src_offset, i, src_val,
+			offset, size, ret);
+	}
+}
+
+static unsigned long do_copy_to_user(const struct usercopy_env *env,
+				     const struct usercopy_params *params)
+{
+	void __user *uptr = (void __user *)UBUF_ADDR_BASE + params->offset;
+	void *kptr = env->kbuf;
+	unsigned long ret;
+
+	kthread_use_mm(env->mm);
+	ret = raw_copy_to_user(uptr, kptr, params->size);
+	kthread_unuse_mm(env->mm);
+
+	return ret;
+}
+
+static unsigned long do_copy_from_user(const struct usercopy_env *env,
+				       const struct usercopy_params *params)
+{
+	void __user *uptr = (void __user *)UBUF_ADDR_BASE + params->offset;
+	void *kptr = env->kbuf;
+	unsigned long ret;
+
+	kthread_use_mm(env->mm);
+	ret = raw_copy_from_user(kptr, uptr, params->size);
+	kthread_unuse_mm(env->mm);
+
+	return ret;
+}
+
+/*
+ * Generate the size and offset combinations to test.
+ *
+ * Usercopies may have different paths for small/medium/large copies, but
+ * typically max out at 64 byte chunks. We test sizes 0 to 128 bytes to check
+ * all combinations of leading/trailing chunks and bulk copies.
+ *
+ * For each size, we test every offset relative to a leading and trailing page
+ * boundary (i.e. [-size, 0] and [PAGE_SIZE - size, PAGE_SIZE]) to check every
+ * possible faulting boundary.
+ */
+#define for_each_size_offset(size, offset)				\
+	for (unsigned long size = 0; size <= 128; size++)		\
+		for (long offset = -size;				\
+		     offset <= (long)PAGE_SIZE;				\
+		     offset = (offset ? offset + 1: (PAGE_SIZE - size)))
+
+static void test_copy_to_user(struct kunit *test)
+{
+	const struct usercopy_env *env = test->priv;
+
+	for_each_size_offset(size, offset) {
+		const struct usercopy_params params = {
+			.size = size,
+			.offset = offset,
+		};
+		unsigned long ret;
+
+		buf_init_zero(env->ubuf);
+		buf_init_pattern(env->kbuf);
+
+		ret = do_copy_to_user(env, &params);
+
+		assert_size_valid(test, &params, ret);
+		assert_src_valid(test, &params, env->kbuf, 0, ret);
+		assert_dst_valid(test, &params, env->ubuf, params.offset, ret);
+		assert_copy_valid(test, &params,
+				  env->ubuf, params.offset,
+				  env->kbuf, 0,
+				  ret);
+	}
+}
+
+static void test_copy_from_user(struct kunit *test)
+{
+	const struct usercopy_env *env = test->priv;
+
+	for_each_size_offset(size, offset) {
+		const struct usercopy_params params = {
+			.size = size,
+			.offset = offset,
+		};
+		unsigned long ret;
+
+		buf_init_zero(env->kbuf);
+		buf_init_pattern(env->ubuf);
+
+		ret = do_copy_from_user(env, &params);
+
+		assert_size_valid(test, &params, ret);
+		assert_src_valid(test, &params, env->ubuf, params.offset, ret);
+		assert_dst_valid(test, &params, env->kbuf, 0, ret);
+		assert_copy_valid(test, &params,
+				  env->kbuf, 0,
+				  env->ubuf, params.offset,
+				  ret);
+	}
+}
+
+static struct kunit_case usercopy_cases[] = {
+	KUNIT_CASE(test_copy_to_user),
+	KUNIT_CASE(test_copy_from_user),
+	{ /* sentinel */ }
+};
+
+static struct kunit_suite usercopy_suite = {
+	.name = "usercopy",
+	.init = usercopy_test_init,
+	.exit = usercopy_test_exit,
+	.test_cases = usercopy_cases,
+};
+kunit_test_suites(&usercopy_suite);
+
+MODULE_AUTHOR("Mark Rutland <mark.rutland@arm.com>");
+MODULE_LICENSE("GPL v2");
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v2 2/4] lib: test clear_user()
  2023-03-21 12:25 [PATCH v2 0/4] usercopy: generic tests + arm64 fixes Mark Rutland
  2023-03-21 12:25 ` [PATCH v2 1/4] lib: test copy_{to,from}_user() Mark Rutland
@ 2023-03-21 12:25 ` Mark Rutland
  2023-03-21 17:13   ` Catalin Marinas
  2023-03-21 12:25 ` [PATCH v2 3/4] arm64: fix __raw_copy_to_user semantics Mark Rutland
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 18+ messages in thread
From: Mark Rutland @ 2023-03-21 12:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: agordeev, aou, bp, catalin.marinas, dave.hansen, davem, gor, hca,
	linux-arch, linux, mark.rutland, mingo, palmer, paul.walmsley,
	robin.murphy, tglx, torvalds, viro, will

The clear_user() function follows the same conventions as
copy_{to,from}_user(), and presumably has identical requirements on the
return value. Test it in the same way.

I've given this a spin on a few architectures using the KUnit QEMU
harness, and it looks like most get *something* wrong, or I've
misunderstood and clear_user() doesn't have the same requirements as
copy_{to,from}_user()). From those initial runs:

* arm, arm64, i386, riscv, and x86_64 don't ensure that at least 1 byte is
  zeroed when a partial zeroing is possible, e.g.

  | too few bytes consumed (offset=4095, size=2, ret=2)
  | too few bytes consumed (offset=4093, size=4, ret=4)
  | too few bytes consumed (offset=4089, size=8, ret=8)

* s390 reports that some bytes have been zeroed even when they haven't,
  e.g.

  | zeroed bytes incorrect (dst_page[4031+64]=0xca, offset=4031, size=66, ret=1

* sparc passes all tests

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arch@vger.kernel.org
---
 lib/usercopy_kunit.c | 89 ++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 82 insertions(+), 7 deletions(-)

diff --git a/lib/usercopy_kunit.c b/lib/usercopy_kunit.c
index 45983952cc079..1ec0d5bbc179a 100644
--- a/lib/usercopy_kunit.c
+++ b/lib/usercopy_kunit.c
@@ -155,6 +155,11 @@ static void usercopy_test_exit(struct kunit *test)
 	usercopy_env_free(env);
 }
 
+static char buf_zero(int offset)
+{
+	return 0;
+}
+
 static char buf_pattern(int offset)
 {
 	return offset & 0xff;
@@ -230,6 +235,7 @@ static void assert_size_valid(struct kunit *test,
 
 static void assert_src_valid(struct kunit *test,
 			     const struct usercopy_params *params,
+			     char (*buf_expected)(int),
 			     const char *src, long src_offset,
 			     unsigned long ret)
 {
@@ -240,9 +246,10 @@ static void assert_src_valid(struct kunit *test,
 	 * A usercopy MUST NOT modify the source buffer.
 	 */
 	for (int i = 0; i < PAGE_SIZE; i++) {
+		char expected = buf_expected(i);
 		char val = src[i];
 
-		if (val == buf_pattern(i))
+		if (val == expected)
 			continue;
 
 		KUNIT_ASSERT_FAILURE(test,
@@ -253,6 +260,7 @@ static void assert_src_valid(struct kunit *test,
 
 static void assert_dst_valid(struct kunit *test,
 			     const struct usercopy_params *params,
+			     char (*buf_expected)(int),
 			     const char *dst, long dst_offset,
 			     unsigned long ret)
 {
@@ -263,9 +271,10 @@ static void assert_dst_valid(struct kunit *test,
 	 * A usercopy MUST NOT modify any bytes before the destination buffer.
 	 */
 	for (int i = 0; i < dst_offset; i++) {
+		char expected = buf_expected(i);
 		char val = dst[i];
 
-		if (val == 0)
+		if (val == expected)
 			continue;
 
 		KUNIT_ASSERT_FAILURE(test,
@@ -278,9 +287,10 @@ static void assert_dst_valid(struct kunit *test,
 	 * buffer.
 	 */
 	for (int i = dst_offset + size - ret; i < PAGE_SIZE; i++) {
+		char expected = buf_expected(i);
 		char val = dst[i];
 
-		if (val == 0)
+		if (val == expected)
 			continue;
 
 		KUNIT_ASSERT_FAILURE(test,
@@ -316,6 +326,29 @@ static void assert_copy_valid(struct kunit *test,
 	}
 }
 
+static void assert_clear_valid(struct kunit *test,
+			       const struct usercopy_params *params,
+			       const char *dst, long dst_offset,
+			       unsigned long ret)
+{
+	const unsigned long size = params->size;
+	const unsigned long offset = params->offset;
+
+	/*
+	 * Have we actually zeroed the bytes we expected to?
+	 */
+	for (int i = 0; i < params->size - ret; i++) {
+		char dst_val = dst[dst_offset + i];
+
+		if (dst_val == 0)
+			continue;
+
+		KUNIT_ASSERT_FAILURE(test,
+			"zeroed bytes incorrect (dst_page[%ld+%d]=0x%x, offset=%ld, size=%lu, ret=%lu",
+			dst_offset, i, dst_val,
+			offset, size, ret);
+	}
+}
 static unsigned long do_copy_to_user(const struct usercopy_env *env,
 				     const struct usercopy_params *params)
 {
@@ -344,6 +377,19 @@ static unsigned long do_copy_from_user(const struct usercopy_env *env,
 	return ret;
 }
 
+static unsigned long do_clear_user(const struct usercopy_env *env,
+				   const struct usercopy_params *params)
+{
+	void __user *uptr = (void __user *)UBUF_ADDR_BASE + params->offset;
+	unsigned long ret;
+
+	kthread_use_mm(env->mm);
+	ret = clear_user(uptr, params->size);
+	kthread_unuse_mm(env->mm);
+
+	return ret;
+}
+
 /*
  * Generate the size and offset combinations to test.
  *
@@ -378,8 +424,10 @@ static void test_copy_to_user(struct kunit *test)
 		ret = do_copy_to_user(env, &params);
 
 		assert_size_valid(test, &params, ret);
-		assert_src_valid(test, &params, env->kbuf, 0, ret);
-		assert_dst_valid(test, &params, env->ubuf, params.offset, ret);
+		assert_src_valid(test, &params, buf_pattern,
+				 env->kbuf, 0, ret);
+		assert_dst_valid(test, &params, buf_zero,
+				 env->ubuf, params.offset, ret);
 		assert_copy_valid(test, &params,
 				  env->ubuf, params.offset,
 				  env->kbuf, 0,
@@ -404,8 +452,10 @@ static void test_copy_from_user(struct kunit *test)
 		ret = do_copy_from_user(env, &params);
 
 		assert_size_valid(test, &params, ret);
-		assert_src_valid(test, &params, env->ubuf, params.offset, ret);
-		assert_dst_valid(test, &params, env->kbuf, 0, ret);
+		assert_src_valid(test, &params, buf_pattern,
+				 env->ubuf, params.offset, ret);
+		assert_dst_valid(test, &params, buf_zero,
+				 env->kbuf, 0, ret);
 		assert_copy_valid(test, &params,
 				  env->kbuf, 0,
 				  env->ubuf, params.offset,
@@ -413,9 +463,34 @@ static void test_copy_from_user(struct kunit *test)
 	}
 }
 
+static void test_clear_user(struct kunit *test)
+{
+	const struct usercopy_env *env = test->priv;
+
+	for_each_size_offset(size, offset) {
+		const struct usercopy_params params = {
+			.size = size,
+			.offset = offset,
+		};
+		unsigned long ret;
+
+		buf_init_pattern(env->ubuf);
+
+		ret = do_clear_user(env, &params);
+
+		assert_size_valid(test, &params, ret);
+		assert_dst_valid(test, &params, buf_pattern,
+				 env->ubuf, params.offset, ret);
+		assert_clear_valid(test, &params,
+				   env->ubuf, params.offset,
+				   ret);
+	}
+}
+
 static struct kunit_case usercopy_cases[] = {
 	KUNIT_CASE(test_copy_to_user),
 	KUNIT_CASE(test_copy_from_user),
+	KUNIT_CASE(test_clear_user),
 	{ /* sentinel */ }
 };
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v2 3/4] arm64: fix __raw_copy_to_user semantics
  2023-03-21 12:25 [PATCH v2 0/4] usercopy: generic tests + arm64 fixes Mark Rutland
  2023-03-21 12:25 ` [PATCH v2 1/4] lib: test copy_{to,from}_user() Mark Rutland
  2023-03-21 12:25 ` [PATCH v2 2/4] lib: test clear_user() Mark Rutland
@ 2023-03-21 12:25 ` Mark Rutland
  2023-03-21 17:50   ` Linus Torvalds
  2023-03-22 14:48   ` Catalin Marinas
  2023-03-21 12:25 ` [PATCH v2 4/4] arm64: fix clear_user() semantics Mark Rutland
  2023-11-15 22:30 ` [PATCH v2 0/4] usercopy: generic tests + arm64 fixes Kees Cook
  4 siblings, 2 replies; 18+ messages in thread
From: Mark Rutland @ 2023-03-21 12:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: agordeev, aou, bp, catalin.marinas, dave.hansen, davem, gor, hca,
	linux-arch, linux, mark.rutland, mingo, palmer, paul.walmsley,
	robin.murphy, tglx, torvalds, viro, will

For some combinations of sizes and alignments, when
__{arch,raw}_copy_to_user() returns N it may have copied some bytes
between (to + size - N) and (to + size), though it will never modify
bytes past (to + size).

This violates the documentation in <linux/uaccess.h>, which states:

> If raw_copy_{to,from}_user(to, from, size) returns N, size - N bytes
> starting at to must become equal to the bytes fetched from the
> corresponding area starting at from.  All data past to + size - N must
> be left unmodified.

This can be demonstrated through testing, e.g.

|     # test_copy_to_user: EXPECTATION FAILED at lib/usercopy_kunit.c:287
| post-destination bytes modified (dst_page[4082]=0x1, offset=4081, size=16, ret=15)
| [FAILED] 16 byte copy

This happens because __arch_copy_to_user() can make unaligned stores
to the userspace buffer, and the ARM architecture permits (but does not
require) that such unaligned stores write some bytes before raising a
fault (per ARM DDI 0487I.a Section B2.2.1 and Section B2.7.1). The
extable fixup handlers in __arch_copy_to_user() assume that any faulting
store has failed entirely, and so under-report the number of bytes
copied when an unaligned store writes some bytes before faulting.

The only architecturally guaranteed way to avoid this is to only use
aligned stores to write to user memory. This patch rewrites
__arch_copy_to_user() to only access the user buffer with aligned
stores, such that the bytes written can always be determined reliably.
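
For reference, the FIXUP_OFFSET() handlers added below amount to the
following calculation (a sketch in C with my own names, not part of the
patch; in the assembly, x0 is the current destination pointer, x3 is
the precomputed end address, and 'off' is the immediate offset of the
faulting STTR):

  /* Bytes not copied when the store at (dst + off) faults. */
  static unsigned long fixup_offset(unsigned long dst, unsigned long end,
                                    unsigned long off)
  {
          /*
           * Aligned single-copy-atomic stores either complete in full or
           * not at all, so everything before (dst + off) is known to have
           * been written.
           */
          return end - (dst + off);
  }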

For correctness, I've tested this exhaustively for sizes 0 to 128
against every possible alignment relative to a leading and trailing page
boundary. I've also boot tested and run a few kernel builds with the new
implementations.

For performance, I've benchmarked this on a variety of CPU
implementations, and across the board this appears at least as good as
(or marginally better than) the current implementation of
copy_to_user(). Timing a kernel build indicates the same, though the
difference is very close to noise.

We do not have a similar bug in __{arch,raw}_copy_from_user(), as faults
taken on loads from user memory do not have side-effects. We do have
similar issues in __arch_clear_user(), which will be addressed in a
subsequent patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/lib/copy_to_user.S | 202 +++++++++++++++++++++++++++-------
 1 file changed, 161 insertions(+), 41 deletions(-)

diff --git a/arch/arm64/lib/copy_to_user.S b/arch/arm64/lib/copy_to_user.S
index 8022317726085..fa603487e8571 100644
--- a/arch/arm64/lib/copy_to_user.S
+++ b/arch/arm64/lib/copy_to_user.S
@@ -9,6 +9,14 @@
 #include <asm/assembler.h>
 #include <asm/cache.h>
 
+#define USER_OFF(off, x...)	USER(fixup_offset_##off, x)
+
+#define FIXUP_OFFSET(n)							\
+fixup_offset_##n:							\
+	sub	x0, x3, x0;						\
+	sub	x0, x0, n;						\
+	ret
+
 /*
  * Copy to user space from a kernel buffer (alignment handled by the hardware)
  *
@@ -18,56 +26,168 @@
  *	x2 - n
  * Returns:
  *	x0 - bytes not copied
+ *
+ * Unlike a memcpy(), we need to handle faults on user addresses, and we need
+ * to precisely report the number of bytes (not) copied. We must only use
+ * aligned single-copy-atomic stores to write to user memory, as stores which
+ * are not single-copy-atomic (e.g. unaligned stores, STP, ASIMD stores) can be
+ * split into separate byte accesses (per ARM DDI 0487I.a, Section B2.2.1) and
+ * some arbitrary set of those byte accesses might occur prior to a fault
+ * being raised (per ARM DDI 0487I.a, Section B2.7.1).
+ *
+ * We use STTR to write to user memory, which has 1/2/4/8 byte forms, and the
+ * user address ('to') might have arbitrary alignment, so we must handle
+ * misalignment up to 8 bytes.
  */
-	.macro ldrb1 reg, ptr, val
-	ldrb  \reg, [\ptr], \val
-	.endm
+SYM_FUNC_START(__arch_copy_to_user)
+		/*
+		 * The end address. Fixup handlers will use this to calculate
+		 * the number of bytes copied.
+		 */
+		add	x3, x0, x2
 
-	.macro strb1 reg, ptr, val
-	user_ldst 9998f, sttrb, \reg, \ptr, \val
-	.endm
+		/*
+		 * Tracing of a kernel build indicates that for the vast
+		 * majority of calls to copy_to_user(), 'to' is aligned to 8
+		 * bytes. When this is the case, we want to skip to the bulk
+		 * copy as soon as possible.
+		 */
+		ands	x4, x0, 0x7
+		b.eq	body
 
-	.macro ldrh1 reg, ptr, val
-	ldrh  \reg, [\ptr], \val
-	.endm
+		/*
+		 * For small unaligned copies, it's not worth trying to be
+		 * optimal.
+		 */
+		cmp	x2, #8
+		b.lo	bytewise_loop
 
-	.macro strh1 reg, ptr, val
-	user_ldst 9997f, sttrh, \reg, \ptr, \val
-	.endm
+		/*
+		 * Calculate the distance to the next 8-byte boundary.
+		 */
+		mov	x5, #8
+		sub	x4, x5, x4
 
-	.macro ldr1 reg, ptr, val
-	ldr \reg, [\ptr], \val
-	.endm
+SYM_INNER_LABEL(head_realign_1b, SYM_L_LOCAL)
+		tbz	x4, #0, head_realign_2b
 
-	.macro str1 reg, ptr, val
-	user_ldst 9997f, sttr, \reg, \ptr, \val
-	.endm
+		ldrb	w8, [x1], #1
+USER_OFF(0,	sttrb	w8, [x0])
+		add	x0, x0, #1
 
-	.macro ldp1 reg1, reg2, ptr, val
-	ldp \reg1, \reg2, [\ptr], \val
-	.endm
+SYM_INNER_LABEL(head_realign_2b, SYM_L_LOCAL)
+		tbz	x4, #1, head_realign_4b
 
-	.macro stp1 reg1, reg2, ptr, val
-	user_stp 9997f, \reg1, \reg2, \ptr, \val
-	.endm
+		ldrh	w8, [x1], #2
+USER_OFF(0,	sttrh	w8, [x0])
+		add	x0, x0, #2
 
-end	.req	x5
-srcin	.req	x15
-SYM_FUNC_START(__arch_copy_to_user)
-	add	end, x0, x2
-	mov	srcin, x1
-#include "copy_template.S"
-	mov	x0, #0
-	ret
+SYM_INNER_LABEL(head_realign_4b, SYM_L_LOCAL)
+		tbz	x4, #2, head_realigned
 
-	// Exception fixups
-9997:	cmp	dst, dstin
-	b.ne	9998f
-	// Before being absolutely sure we couldn't copy anything, try harder
-	ldrb	tmp1w, [srcin]
-USER(9998f, sttrb tmp1w, [dst])
-	add	dst, dst, #1
-9998:	sub	x0, end, dst			// bytes not copied
-	ret
+		ldr	w8, [x1], #4
+USER_OFF(0,	sttr	w8, [x0])
+		add	x0, x0, #4
+
+SYM_INNER_LABEL(head_realigned, SYM_L_LOCAL)
+		/*
+		 * Any 1-7 byte misalignment has now been fixed; subtract this
+		 * misalignment from the remaining size.
+		 */
+		sub	x2, x2, x4
+
+SYM_INNER_LABEL(body, SYM_L_LOCAL)
+		cmp	x2, #64
+		b.lo	tail_32b
+
+SYM_INNER_LABEL(body_64b_loop, SYM_L_LOCAL)
+		ldp	x8,  x9,  [x1, #0]
+		ldp	x10, x11, [x1, #16]
+		ldp	x12, x13, [x1, #32]
+		ldp	x14, x15, [x1, #48]
+USER_OFF(0,	sttr	x8,  [x0, #0])
+USER_OFF(8,	sttr	x9,  [x0, #8])
+USER_OFF(16,	sttr	x10, [x0, #16])
+USER_OFF(24,	sttr	x11, [x0, #24])
+USER_OFF(32,	sttr	x12, [x0, #32])
+USER_OFF(40,	sttr	x13, [x0, #40])
+USER_OFF(48,	sttr	x14, [x0, #48])
+USER_OFF(56,	sttr	x15, [x0, #56])
+		add	x0, x0, #64
+		add	x1, x1, #64
+		sub	x2, x2, #64
+
+		cmp	x2, #64
+		b.hs	body_64b_loop
+
+SYM_INNER_LABEL(tail_32b, SYM_L_LOCAL)
+		tbz	x2, #5, tail_16b
+
+		ldp	x8,  x9,  [x1, #0]
+		ldp	x10, x11, [x1, #16]
+USER_OFF(0,	sttr	x8,  [x0, #0])
+USER_OFF(8,	sttr	x9,  [x0, #8])
+USER_OFF(16,	sttr	x10, [x0, #16])
+USER_OFF(24,	sttr	x11, [x0, #24])
+		add	x0, x0, #32
+		add	x1, x1, #32
+
+SYM_INNER_LABEL(tail_16b, SYM_L_LOCAL)
+		tbz	x2, #4, tail_8b
+
+		ldp	x8,  x9,  [x1], #16
+USER_OFF(0,	sttr	x8, [x0, #0])
+USER_OFF(8,	sttr	x9, [x0, #8])
+		add	x0, x0, #16
+
+SYM_INNER_LABEL(tail_8b, SYM_L_LOCAL)
+		tbz	x2, #3, tail_4b
+
+		ldr	x8, [x1], #8
+USER_OFF(0,	sttr	x8, [x0])
+		add	x0, x0, #8
+
+SYM_INNER_LABEL(tail_4b, SYM_L_LOCAL)
+		tbz	x2, #2, tail_2b
+
+		ldr	w8, [x1], #4
+USER_OFF(0,	sttr	w8, [x0])
+		add	x0, x0, #4
+
+SYM_INNER_LABEL(tail_2b, SYM_L_LOCAL)
+		tbz	x2, #1, tail_1b
+
+		ldrh	w8, [x1], #2
+USER_OFF(0,	sttrh	w8, [x0])
+		add	x0, x0, #2
+
+SYM_INNER_LABEL(tail_1b, SYM_L_LOCAL)
+		tbz	x2, #0, done
+
+		ldrb	w8, [x1]
+USER_OFF(0,	sttrb	w8, [x0])
+
+SYM_INNER_LABEL(done, SYM_L_LOCAL)
+		mov	x0, xzr
+		ret
+
+SYM_INNER_LABEL(bytewise_loop, SYM_L_LOCAL)
+		cbz	x2, done
+
+		ldrb	w8, [x1], #1
+USER_OFF(0,	sttrb	w8, [x0])
+		add	x0, x0, #1
+		sub	x2, x2, #1
+
+		b	bytewise_loop
+
+FIXUP_OFFSET(0)
+FIXUP_OFFSET(8)
+FIXUP_OFFSET(16)
+FIXUP_OFFSET(24)
+FIXUP_OFFSET(32)
+FIXUP_OFFSET(40)
+FIXUP_OFFSET(48)
+FIXUP_OFFSET(56)
 SYM_FUNC_END(__arch_copy_to_user)
 EXPORT_SYMBOL(__arch_copy_to_user)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v2 4/4] arm64: fix clear_user() semantics
  2023-03-21 12:25 [PATCH v2 0/4] usercopy: generic tests + arm64 fixes Mark Rutland
                   ` (2 preceding siblings ...)
  2023-03-21 12:25 ` [PATCH v2 3/4] arm64: fix __raw_copy_to_user semantics Mark Rutland
@ 2023-03-21 12:25 ` Mark Rutland
  2023-11-15 22:30 ` [PATCH v2 0/4] usercopy: generic tests + arm64 fixes Kees Cook
  4 siblings, 0 replies; 18+ messages in thread
From: Mark Rutland @ 2023-03-21 12:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: agordeev, aou, bp, catalin.marinas, dave.hansen, davem, gor, hca,
	linux-arch, linux, mark.rutland, mingo, palmer, paul.walmsley,
	robin.murphy, tglx, torvalds, viro, will

Generally, clear_user() is supposed to behave the same way as
raw_copy_{to,from}_user(), which per <linux/uaccess.h> requires:

> If raw_copy_{to,from}_user(to, from, size) returns N, size - N bytes
> starting at to must become equal to the bytes fetched from the
> corresponding area starting at from.  All data past to + size - N must
> be left unmodified.
>
> If copying succeeds, the return value must be 0.  If some data cannot
> be fetched, it is permitted to copy less than had been fetched; the
> only hard requirement is that not storing anything at all (i.e.
> returning size) should happen only when nothing could be copied.  In
> other words, you don't have to squeeze as much as possible - it is
> allowed, but not necessary.

Unfortunately arm64's implementation of clear_user() may report complete
failure in cases where some bytes could have been written, which can be
demonstrated through
testing, e.g.

|     # test_clear_user: ASSERTION FAILED at lib/usercopy_kunit.c:221
| too few bytes consumed (offset=4095, size=2, ret=2)
| [FAILED] 2 byte(s)

As with __{arch,raw}_copy_to_user(), there are also a number of other
potential problems where the use of misaligned accesses could result in
under-reporting the number of bytes zeroed.
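
In the failing case above, the existing implementation ends up issuing
a single (unaligned) STTRH at offset 4095, which straddles the trailing
page boundary; the store faults and the fixup reports that nothing was
zeroed (ret=2), even though the byte at offset 4095 could have been
written.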

This patch fixes these issues by replacing the current implementation of
__arch_clear_user() with one derived from the new __arch_copy_to_user().
This only uses aligned stores to write to user memory, and reliably
reports the number of bytes zeroed.

For correctness, I've tested this exhaustively for sizes 0 to 128
against every possible alignment relative to a leading and trailing page
boundary. I've also boot tested and run a few kernel builds with the new
implementation.

For performance, I've benchmarked reads from /dev/zero and timed kernel
builds before and after this patch, and this seems to be at least as
good as (or marginally better than) the current implementation, though the
difference is very close to noise.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/lib/clear_user.S   | 195 +++++++++++++++++++++++++++-------
 arch/arm64/lib/copy_to_user.S |   1 -
 2 files changed, 157 insertions(+), 39 deletions(-)

diff --git a/arch/arm64/lib/clear_user.S b/arch/arm64/lib/clear_user.S
index a5a5f5b97b175..f422c05d84351 100644
--- a/arch/arm64/lib/clear_user.S
+++ b/arch/arm64/lib/clear_user.S
@@ -4,52 +4,171 @@
  */
 
 #include <linux/linkage.h>
+
 #include <asm/asm-uaccess.h>
+#include <asm/assembler.h>
+
+#define USER_OFF(off, x...)	USER(fixup_offset_##off, x)
 
-	.text
+#define FIXUP_OFFSET(n)							\
+fixup_offset_##n:							\
+	sub	x0, x3, x0;						\
+	sub	x0, x0, n;						\
+	ret
 
-/* Prototype: int __arch_clear_user(void *addr, size_t sz)
- * Purpose  : clear some user memory
- * Params   : addr - user memory address to clear
- *          : sz   - number of bytes to clear
- * Returns  : number of bytes NOT cleared
+/*
+ * Write zeroes to a user space buffer
+ *
+ * Parameters:
+ *	x0 - to
+ *	x1 - n
+ * Returns:
+ *	x0 - bytes not cleared
  *
- * Alignment fixed up by hardware.
+ * Unlike a memset(to, 0, n), we need to handle faults on user addresses, and
+ * we need to precisely report the number of bytes (not) written. We must only
+ * use aligned single-copy-atomic stores to write to user memory, as stores
+ * which are not single-copy-atomic (e.g. unaligned stores, STP, ASIMD stores)
+ * can be split into separate byte accesses (per ARM DDI 0487I.a, Section
+ * B2.2.1) and some arbitrary set of those byte accesses might occur prior to
+ * a fault being raised (per ARM DDI 0487I.a, Section B2.7.1).
+ *
+ * We use STTR to write to user memory, which has 1/2/4/8 byte forms, and the
+ * user address ('to') might have arbitrary alignment, so we must handle
+ * misalignment up to 8 bytes.
  */
-
-	.p2align 4
-	// Alignment is for the loop, but since the prologue (including BTI)
-	// is also 16 bytes we can keep any padding outside the function
 SYM_FUNC_START(__arch_clear_user)
-	add	x2, x0, x1
-	subs	x1, x1, #8
-	b.mi	2f
-1:
-USER(9f, sttr	xzr, [x0])
-	add	x0, x0, #8
-	subs	x1, x1, #8
-	b.hi	1b
-USER(9f, sttr	xzr, [x2, #-8])
-	mov	x0, #0
-	ret
+		/*
+		 * The end address. Fixup handlers will use this to calculate
+		 * the number of bytes cleared.
+		 */
+		add	x3, x0, x1
 
-2:	tbz	x1, #2, 3f
-USER(9f, sttr	wzr, [x0])
-USER(8f, sttr	wzr, [x2, #-4])
-	mov	x0, #0
-	ret
+		/*
+		 * Tracing of a kernel build indicates that for the vast
+		 * majority of calls to copy_to_user(), 'to' is aligned to 8
+		 * bytes. When this is the case, we want to skip to the bulk
+		 * copy as soon as possible.
+		 */
+		ands	x4, x0, 0x7
+		b.eq	body
 
-3:	tbz	x1, #1, 4f
-USER(9f, sttrh	wzr, [x0])
-4:	tbz	x1, #0, 5f
-USER(7f, sttrb	wzr, [x2, #-1])
-5:	mov	x0, #0
-	ret
+		/*
+		 * For small unaligned copies, it's not worth trying to be
+		 * optimal.
+		 */
+		cmp	x1, #8
+		b.lo	bytewise_loop
 
-	// Exception fixups
-7:	sub	x0, x2, #5	// Adjust for faulting on the final byte...
-8:	add	x0, x0, #4	// ...or the second word of the 4-7 byte case
-9:	sub	x0, x2, x0
-	ret
+		/*
+		 * Calculate the distance to the next 8-byte boundary.
+		 */
+		mov	x5, #8
+		sub	x4, x5, x4
+
+SYM_INNER_LABEL(head_realign_1b, SYM_L_LOCAL)
+		tbz	x4, #0, head_realign_2b
+
+USER_OFF(0,	sttrb	wzr, [x0])
+		add	x0, x0, #1
+
+SYM_INNER_LABEL(head_realign_2b, SYM_L_LOCAL)
+		tbz	x4, #1, head_realign_4b
+
+USER_OFF(0,	sttrh	wzr, [x0])
+		add	x0, x0, #2
+
+SYM_INNER_LABEL(head_realign_4b, SYM_L_LOCAL)
+		tbz	x4, #2, head_realigned
+
+USER_OFF(0,	sttr	wzr, [x0])
+		add	x0, x0, #4
+
+SYM_INNER_LABEL(head_realigned, SYM_L_LOCAL)
+		/*
+		 * Any 1-7 byte misalignment has now been fixed; subtract this
+		 * misalignment from the remaining size.
+		 */
+		sub	x1, x1, x4
+
+SYM_INNER_LABEL(body, SYM_L_LOCAL)
+		cmp	x1, #64
+		b.lo	tail_32b
+
+SYM_INNER_LABEL(body_64b_loop, SYM_L_LOCAL)
+USER_OFF(0,	sttr	xzr, [x0, #0])
+USER_OFF(8,	sttr	xzr, [x0, #8])
+USER_OFF(16,	sttr	xzr, [x0, #16])
+USER_OFF(24,	sttr	xzr, [x0, #24])
+USER_OFF(32,	sttr	xzr, [x0, #32])
+USER_OFF(40,	sttr	xzr, [x0, #40])
+USER_OFF(48,	sttr	xzr, [x0, #48])
+USER_OFF(56,	sttr	xzr, [x0, #56])
+		add	x0, x0, #64
+		sub	x1, x1, #64
+
+		cmp	x1, #64
+		b.hs	body_64b_loop
+
+SYM_INNER_LABEL(tail_32b, SYM_L_LOCAL)
+		tbz	x1, #5, tail_16b
+
+USER_OFF(0,	sttr	xzr, [x0, #0])
+USER_OFF(8,	sttr	xzr, [x0, #8])
+USER_OFF(16,	sttr	xzr, [x0, #16])
+USER_OFF(24,	sttr	xzr, [x0, #24])
+		add	x0, x0, #32
+
+SYM_INNER_LABEL(tail_16b, SYM_L_LOCAL)
+		tbz	x1, #4, tail_8b
+
+USER_OFF(0,	sttr	xzr, [x0, #0])
+USER_OFF(8,	sttr	xzr, [x0, #8])
+		add	x0, x0, #16
+
+SYM_INNER_LABEL(tail_8b, SYM_L_LOCAL)
+		tbz	x1, #3, tail_4b
+
+USER_OFF(0,	sttr	xzr, [x0])
+		add	x0, x0, #8
+
+SYM_INNER_LABEL(tail_4b, SYM_L_LOCAL)
+		tbz	x1, #2, tail_2b
+
+USER_OFF(0,	sttr	wzr, [x0])
+		add	x0, x0, #4
+
+SYM_INNER_LABEL(tail_2b, SYM_L_LOCAL)
+		tbz	x1, #1, tail_1b
+
+USER_OFF(0,	sttrh	wzr, [x0])
+		add	x0, x0, #2
+
+SYM_INNER_LABEL(tail_1b, SYM_L_LOCAL)
+		tbz	x1, #0, done
+
+USER_OFF(0,	sttrb	wzr, [x0])
+
+SYM_INNER_LABEL(done, SYM_L_LOCAL)
+		mov	x0, xzr
+		ret
+
+SYM_INNER_LABEL(bytewise_loop, SYM_L_LOCAL)
+		cbz	x1, done
+
+USER_OFF(0,	sttrb	wzr, [x0])
+		add	x0, x0, #1
+		sub	x1, x1, #1
+
+		b	bytewise_loop
+
+FIXUP_OFFSET(0)
+FIXUP_OFFSET(8)
+FIXUP_OFFSET(16)
+FIXUP_OFFSET(24)
+FIXUP_OFFSET(32)
+FIXUP_OFFSET(40)
+FIXUP_OFFSET(48)
+FIXUP_OFFSET(56)
 SYM_FUNC_END(__arch_clear_user)
 EXPORT_SYMBOL(__arch_clear_user)
diff --git a/arch/arm64/lib/copy_to_user.S b/arch/arm64/lib/copy_to_user.S
index fa603487e8571..9eab9ec6c71e9 100644
--- a/arch/arm64/lib/copy_to_user.S
+++ b/arch/arm64/lib/copy_to_user.S
@@ -7,7 +7,6 @@
 
 #include <asm/asm-uaccess.h>
 #include <asm/assembler.h>
-#include <asm/cache.h>
 
 #define USER_OFF(off, x...)	USER(fixup_offset_##off, x)
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 1/4] lib: test copy_{to,from}_user()
  2023-03-21 12:25 ` [PATCH v2 1/4] lib: test copy_{to,from}_user() Mark Rutland
@ 2023-03-21 17:09   ` Catalin Marinas
  2023-03-22 14:05     ` Mark Rutland
  2023-03-21 18:04   ` Linus Torvalds
  1 sibling, 1 reply; 18+ messages in thread
From: Catalin Marinas @ 2023-03-21 17:09 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-kernel, agordeev, aou, bp, dave.hansen, davem, gor, hca,
	linux-arch, linux, mingo, palmer, paul.walmsley, robin.murphy,
	tglx, torvalds, viro, will

On Tue, Mar 21, 2023 at 12:25:11PM +0000, Mark Rutland wrote:
> +static void assert_size_valid(struct kunit *test,
> +			      const struct usercopy_params *params,
> +			      unsigned long ret)
> +{
> +	const unsigned long size = params->size;
> +	const unsigned long offset = params->offset;

I think you should drop the 'unsigned' for 'offset', it better matches
the usercopy_params structure and the 'offset < 0' test.

> +
> +	if (ret > size) {
> +		KUNIT_ASSERT_FAILURE(test,
> +			"return value is impossibly large (offset=%ld, size=%lu, ret=%lu)",
> +			offset, size, ret);
> +	}
> +
> +	/*
> +	 * When the user buffer starts within a faulting page, no bytes can be
> +	 * copied, so ret must equal size.
> +	 */
> +	if (offset < 0 || offset >= PAGE_SIZE) {
> +		if (ret == size)
> +			return;
> +
> +		KUNIT_ASSERT_FAILURE(test,
> +			"impossible copy did not fail (offset=%ld, size=%lu, ret=%lu)",
> +			offset, size, ret);
> +	}
> +
> +	/*
> +	 * When the user buffer fits entirely within a non-faulting page, all
> +	 * bytes must be copied, so ret must equal 0.
> +	 */
> +	if (offset + size <= PAGE_SIZE) {
> +		if (ret == 0)
> +			return;
> +
> +		KUNIT_ASSERT_FAILURE(test,
> +			"completely possible copy failed (offset=%ld, size=%lu, ret=%lu)",
> +			offset, size, ret);
> +	}
> +
> +	/*
> +	 * The buffer starts in a non-faulting page, but continues into a
> +	 * faulting page. At least one byte must be copied, and at most all the
> +	 * non-faulting bytes may be copied.
> +	 */
> +	if (ret == size) {
> +		KUNIT_ASSERT_FAILURE(test,
> +			"too few bytes consumed (offset=%ld, size=%lu, ret=%lu)",
> +			offset, size, ret);
> +	}
> +
> +	if (ret < (offset + size) - PAGE_SIZE) {
> +		KUNIT_ASSERT_FAILURE(test,
> +			   "too many bytes consumed (offset=%ld, size=%lu, ret=%lu)",
> +			   offset, size, ret);
> +	}
> +}

Nitpick: slightly less indentation probably if we write the above as:

	KUNIT_ASSERT_TRUE_MSG(test,
		ret < (offset + size) - PAGE_SIZE,
		"too many bytes consumed (offset=%ld, size=%lu, ret=%lu)",
		offset, size, ret);

Not sure this works for the early return cases above.

> +static void assert_src_valid(struct kunit *test,
> +			     const struct usercopy_params *params,
> +			     const char *src, long src_offset,
> +			     unsigned long ret)
> +{
> +	const unsigned long size = params->size;
> +	const unsigned long offset = params->offset;

The unsigned offset here doesn't matter much but I'd drop it as well.

> +	/*
> +	 * A usercopy MUST NOT modify the source buffer.
> +	 */
> +	for (int i = 0; i < PAGE_SIZE; i++) {
> +		char val = src[i];
> +
> +		if (val == buf_pattern(i))
> +			continue;
> +
> +		KUNIT_ASSERT_FAILURE(test,
> +			"source bytes modified (src[%d]=0x%x, offset=%ld, size=%lu, ret=%lu)",
> +			i, val, offset, size, ret);
> +	}
> +}
> +
> +static void assert_dst_valid(struct kunit *test,
> +			     const struct usercopy_params *params,
> +			     const char *dst, long dst_offset,
> +			     unsigned long ret)
> +{
> +	const unsigned long size = params->size;
> +	const unsigned long offset = params->offset;
> +
> +	/*
> +	 * A usercopy MUST NOT modify any bytes before the destination buffer.
> +	 */
> +	for (int i = 0; i < dst_offset; i++) {
> +		char val = dst[i];
> +
> +		if (val == 0)
> +			continue;
> +
> +		KUNIT_ASSERT_FAILURE(test,
> +			"pre-destination bytes modified (dst_page[%d]=0x%x, offset=%ld, size=%lu, ret=%lu)",
> +			i, val, offset, size, ret);
> +	}
> +
> +	/*
> +	 * A usercopy MUST NOT modify any bytes after the end of the destination
> +	 * buffer.
> +	 */

Without looking at the arm64 fixes in this series, I think such a test
can fail for us currently given the right offset.

> +	for (int i = dst_offset + size - ret; i < PAGE_SIZE; i++) {
> +		char val = dst[i];
> +
> +		if (val == 0)
> +			continue;
> +
> +		KUNIT_ASSERT_FAILURE(test,
> +			"post-destination bytes modified (dst_page[%d]=0x%x, offset=%ld, size=%lu, ret=%lu)",
> +			i, val, offset, size, ret);
> +	}
> +}

[...]

> +/*
> + * Generate the size and offset combinations to test.
> + *
> + * Usercopies may have different paths for small/medium/large copies, but
> + * typically max out at 64 byte chunks. We test sizes 0 to 128 bytes to check
> + * all combinations of leading/trailing chunks and bulk copies.
> + *
> + * For each size, we test every offset relative to a leading and trailing page
> + * boundary (i.e. [-size, 0] and [PAGE_SIZE - size, PAGE_SIZE]) to check every
> + * possible faulting boundary.
> + */
> +#define for_each_size_offset(size, offset)				\
> +	for (unsigned long size = 0; size <= 128; size++)		\
> +		for (long offset = -size;				\
> +		     offset <= (long)PAGE_SIZE;				\
> +		     offset = (offset ? offset + 1: (PAGE_SIZE - size)))
> +
> +static void test_copy_to_user(struct kunit *test)
> +{
> +	const struct usercopy_env *env = test->priv;
> +
> +	for_each_size_offset(size, offset) {
> +		const struct usercopy_params params = {
> +			.size = size,
> +			.offset = offset,
> +		};
> +		unsigned long ret;
> +
> +		buf_init_zero(env->ubuf);
> +		buf_init_pattern(env->kbuf);
> +
> +		ret = do_copy_to_user(env, &params);
> +
> +		assert_size_valid(test, &params, ret);
> +		assert_src_valid(test, &params, env->kbuf, 0, ret);
> +		assert_dst_valid(test, &params, env->ubuf, params.offset, ret);
> +		assert_copy_valid(test, &params,
> +				  env->ubuf, params.offset,
> +				  env->kbuf, 0,
> +				  ret);
> +	}
> +}

IIUC, in such tests you only vary the destination offset. Our copy
routines in general try to align the source and leave the destination
unaligned for performance. It would be interesting to add some variation
on the source offset as well to spot potential issues with that part of
the memcpy routines.

-- 
Catalin

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 2/4] lib: test clear_user()
  2023-03-21 12:25 ` [PATCH v2 2/4] lib: test clear_user() Mark Rutland
@ 2023-03-21 17:13   ` Catalin Marinas
  0 siblings, 0 replies; 18+ messages in thread
From: Catalin Marinas @ 2023-03-21 17:13 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-kernel, agordeev, aou, bp, dave.hansen, davem, gor, hca,
	linux-arch, linux, mingo, palmer, paul.walmsley, robin.murphy,
	tglx, torvalds, viro, will

On Tue, Mar 21, 2023 at 12:25:12PM +0000, Mark Rutland wrote:
> +static void assert_clear_valid(struct kunit *test,
> +			       const struct usercopy_params *params,
> +			       const char *dst, long dst_offset,
> +			       unsigned long ret)
> +{
> +	const unsigned long size = params->size;
> +	const unsigned long offset = params->offset;

Nit: long offset.

Otherwise,

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] arm64: fix __raw_copy_to_user semantics
  2023-03-21 12:25 ` [PATCH v2 3/4] arm64: fix __raw_copy_to_user semantics Mark Rutland
@ 2023-03-21 17:50   ` Linus Torvalds
  2023-03-22 14:16     ` Mark Rutland
  2023-03-22 14:48   ` Catalin Marinas
  1 sibling, 1 reply; 18+ messages in thread
From: Linus Torvalds @ 2023-03-21 17:50 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-kernel, agordeev, aou, bp, catalin.marinas, dave.hansen,
	davem, gor, hca, linux-arch, linux, mingo, palmer, paul.walmsley,
	robin.murphy, tglx, viro, will

On Tue, Mar 21, 2023 at 5:25 AM Mark Rutland <mark.rutland@arm.com> wrote:
>
> For some combinations of sizes and alignments __{arch,raw}_copy_to_user
> will copy some bytes between (to + size - N) and (to + size), but will
> never modify bytes past (to + size).
>
> This violates the documentation in <linux/uaccess.h>, which states:
>
> > If raw_copy_{to,from}_user(to, from, size) returns N, size - N bytes
> > starting at to must become equal to the bytes fetched from the
> > corresponding area starting at from.  All data past to + size - N must
> > be left unmodified.

Hmm.

I'm not 100% sure we couldn't just relax the documentation.

After all, the "exception happens in the middle of a copy" is a
special case, and generally results in -EFAULT rather than any
indication of "this is how much data we filled in for user space".

Now, some operations do *try* to generally give partial results
(notably "read()") even in the presence of page faults in the middle,
but I'm not entirely convinced we need to bend over backwards over
this.

Put another way: we have lots of situations where we fill in partial
user buffers and then return EFAULT, so "copy_to_user()" has at no
point been "atomic" wrt the return value.

So I in no way hate this patch, and I do think it's a good "QoI fix",
but if this ends up being painful for some architecture, I get the
feeling that we could easily just relax the implementation instead.

We have accepted the whole "return value is not byte-exact" thing
forever, simply because we have never required user copies to be done
as byte-at-a-time copies.

Now, it is undoubtedly true that the buffer we fill in with a user copy must

 (a) be filled AT LEAST as much as we reported it to be filled in (ie
user space can expect that there's no uninitialized data in any range
we claimed to fill)

 (b) obviously never filled past the buffer size that was given

But if we take an exception in the middle, and write a partial aligned
word, and don't report that as written (which is what you are fixing),
I really feel that is a "QoI" thing, not a correctness thing.

I don't think this arm implementation thing has ever hurt anything, in
other words.

That said, at some point that quality-of-implementation thing makes
validation so much easier that maybe it's worth doing just for that
reason, which is why I think "if it's not too painful, go right ahead"
is fine.

           Linus

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 1/4] lib: test copy_{to,from}_user()
  2023-03-21 12:25 ` [PATCH v2 1/4] lib: test copy_{to,from}_user() Mark Rutland
  2023-03-21 17:09   ` Catalin Marinas
@ 2023-03-21 18:04   ` Linus Torvalds
  2023-03-22 13:55     ` Mark Rutland
  1 sibling, 1 reply; 18+ messages in thread
From: Linus Torvalds @ 2023-03-21 18:04 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-kernel, agordeev, aou, bp, catalin.marinas, dave.hansen,
	davem, gor, hca, linux-arch, linux, mingo, palmer, paul.walmsley,
	robin.murphy, tglx, viro, will

On Tue, Mar 21, 2023 at 5:25 AM Mark Rutland <mark.rutland@arm.com> wrote:
>
> * arm64's copy_to_user() under-reports the number of bytes copied in
>   some cases, e.g.

So I think this is the ok case.

> * arm's copy_to_user() under-reports the number of bytes copied in some
>   cases, and both copy_to_user() and copy_from_user() don't guarantee
>   that at least a single byte is copied when a partial copy is possible,

Again, this is ok historically.

> * i386's copy_from_user() does not guarantee that at least a single byte
>   is copied when a partial copy is possible, e.g.
>
>   | too few bytes consumed (offset=4093, size=8, ret=8)

And here's the real example of "we've always done this optimization".
The exact details have differed, but the i386 case is the really
really traditional one: it does word-at-a-time copies, and does *not*
try to fall back to byte-wise copies. Never has.

> * riscv's copy_to_user() and copy_from_user() don't guarantee that at
>   least a single byte is copied when a partial copy is possible, e.g.
>
>   | too few bytes consumed (offset=4095, size=2, ret=2)

Yup. This is all the same "we've never forced byte-at-a-time copies"

> * s390 passes all tests
>
> * sparc's copy_from_user() over-reports the number of bytes copied in
>   some cases, e.g.

So this case I think is wrong, and an outright bug. That can
cause people to think that uninitialized data is initialized, and leak
sensitive information.

> * x86_64 passes all tests

I suspect your testing is flawed due to being too limited, and x86-64
having multiple different copying routines.

Yes, at some point we made everything be quite careful with
"handle_tail" etc, but we end up still having things that fail early,
and fail hard.

At a minimum, at least unsafe_copy_to_user() will fault and not do the
"fill to the very last byte" case. Of course, that doesn't return a
partial length (it only has a "fail" case), but it's an example of
this whole thing where we haven't really been byte-exact when doing
copies.
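
For reference, the usual unsafe_copy_to_user() pattern is roughly this
(a sketch with the error handling trimmed, not any particular call site;
variable names are illustrative):

	if (!user_access_begin(uptr, size))
		return -EFAULT;
	unsafe_copy_to_user(uptr, src, size, Efault);
	user_access_end();
	return 0;
Efault:
	user_access_end();
	return -EFAULT;

i.e. a fault anywhere in the range just jumps to the error label, with no
attempt to report how much was written before the fault.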

So again, I get the feeling that these rules may make sense from a
validation standpoint, but I'm not 100% sure we should generally have
to be this careful.

                 Linus

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 1/4] lib: test copy_{to,from}_user()
  2023-03-21 18:04   ` Linus Torvalds
@ 2023-03-22 13:55     ` Mark Rutland
  0 siblings, 0 replies; 18+ messages in thread
From: Mark Rutland @ 2023-03-22 13:55 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: linux-kernel, agordeev, aou, bp, catalin.marinas, dave.hansen,
	davem, gor, hca, linux-arch, linux, mingo, palmer, paul.walmsley,
	robin.murphy, tglx, viro, will

On Tue, Mar 21, 2023 at 11:04:26AM -0700, Linus Torvalds wrote:
> On Tue, Mar 21, 2023 at 5:25 AM Mark Rutland <mark.rutland@arm.com> wrote:
> >
> > * arm64's copy_to_user() under-reports the number of bytes copied in
> >   some cases, e.g.
> 
> So I think this is the ok case.
> 
> > * arm's copy_to_user() under-reports the number of bytes copied in some
> >   cases, and both copy_to_user() and copy_from_user() don't guarantee
> >   that at least a single byte is copied when a partial copy is possible,
> 
> Again, this is ok historically.
> 
> > * i386's copy_from_user does not guarantee that at least a single byte
> >   is copied when a partial copy is possible, e.g.
> >
> >   | too few bytes consumed (offset=4093, size=8, ret=8)
> 
> And here's the real example of "we've always done this optimization".
> The exact details have differed, but the i386 case is the really
> really traditional one: it does word-at-a-time copies, and does *not*
> try to fall back to byte-wise copies. Never has.

Sure; I understand that. The reason for pointing this out is that Al was very
specific that implementations *must* guarantee this back in:

  https://lore.kernel.org/all/YNSyZaZtPTmTa5P8@zeniv-ca.linux.org.uk/

... and that this could be done by having the fixup handler try to copy a byte.

I had assumed that *something* depended upon that, but I don't know what that
something actually is.
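
The sort of caller that would depend on it is one that loops on partial
progress and relies on "no progress" meaning "nothing was possible", e.g.
(illustrative shape only, not a specific call site):

	while (len) {
		unsigned long left = copy_to_user(ubuf, kbuf, len);
		unsigned long done = len - left;

		if (!done)
			return -EFAULT;	/* assumes at least one byte when possible */
		ubuf += done;
		kbuf += done;
		len -= done;
	}

If an implementation can return 'len' while some leading bytes were still
copyable, a loop like that bails out early rather than making progress.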

I'm not wedded to the semantic either way; if that's not required I can drop it
from the tests.

> > * s390 passes all tests
> >
> > * sparc's copy_from_user() over-reports the number of bytes copied in
> >   some cases, e.g.
> 
> So this case I think this is wrong, and an outright bug. That can
> cause people to think that uninitialized data is initialized, and leak
> sensitive information.

Agreed.

> > * x86_64 passes all tests
> 
> I suspect your testing is flawed due to being too limited, and x86-64
> having multiple different copying routines.

Sorry; I should've called that out explicitly. I'm aware I'm not testing all
the variants (I'd be happy to); I just wanted to check that I wasn't going off
into the weeds with the semantics first.

I probably should've sent this as an RFC...

> Yes, at some point we made everything be quite careful with
> "handle_tail" etc, but we end up still having things that fail early,
> and fail hard.
> 
> At a minimum, at least unsafe_copy_to_user() will fault and not do the
> "fill to the very last byte" case. Of course, that doesn't return a
> partial length (it only has a "fail" case), but it's an example of
> this whole thing where we haven't really been byte-exact when doing
> copies.

Sure; that does seem to be different structurally too, so it'd need to be
plumbed into the harness differently.

I'll note that's more like {get,put}_user() which similarly just have a fail
case (and a put_user() could do a partial write then fault).

> So again, I get the feeling that these rules may make sense from a
> validation standpoint, but I'm not 100% sure we should generally have
> to be this careful.

I'm more than happy to relax the tests (and the docs); I just need to know
where the boundary is between what we must guarantee and what's a nice-to-have.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 1/4] lib: test copy_{to,from}_user()
  2023-03-21 17:09   ` Catalin Marinas
@ 2023-03-22 14:05     ` Mark Rutland
  2023-03-23 22:16       ` David Laight
  0 siblings, 1 reply; 18+ messages in thread
From: Mark Rutland @ 2023-03-22 14:05 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: linux-kernel, agordeev, aou, bp, dave.hansen, davem, gor, hca,
	linux-arch, linux, mingo, palmer, paul.walmsley, robin.murphy,
	tglx, torvalds, viro, will

On Tue, Mar 21, 2023 at 05:09:48PM +0000, Catalin Marinas wrote:
> On Tue, Mar 21, 2023 at 12:25:11PM +0000, Mark Rutland wrote:
> > +static void assert_size_valid(struct kunit *test,
> > +			      const struct usercopy_params *params,
> > +			      unsigned long ret)
> > +{
> > +	const unsigned long size = params->size;
> > +	const unsigned long offset = params->offset;
> 
> I think you should drop the 'unsigned' for 'offset', it better matches
> the usercopy_params structure and the 'offset < 0' test.

Agreed. I'll go fix all offset values to use long.

[...]

> > +	if (ret < (offset + size) - PAGE_SIZE) {
> > +		KUNIT_ASSERT_FAILURE(test,
> > +			   "too many bytes consumed (offset=%ld, size=%lu, ret=%lu)",
> > +			   offset, size, ret);
> > +	}
> > +}
> 
> Nitpick: slightly less indentation probably if we write the above as:
> 
> 	KUNIT_ASSERT_TRUE_MSG(test,
> 		ret >= (offset + size) - PAGE_SIZE,
> 		"too many bytes consumed (offset=%ld, size=%lu, ret=%lu)",
> 		offset, size, ret);
> 
> Not sure this works for the early return cases above.

I had originally used KUNIT_ASSERT_*_MSG(), but found those produced a lot of
unhelpful output; lemme go check what KUNIT_ASSERT_TRUE_MSG() produces.

[...]

> > +	/*
> > +	 * A usercopy MUST NOT modify any bytes before the destination buffer.
> > +	 */
> > +	for (int i = 0; i < dst_offset; i++) {
> > +		char val = dst[i];
> > +
> > +		if (val == 0)
> > +			continue;
> > +
> > +		KUNIT_ASSERT_FAILURE(test,
> > +			"pre-destination bytes modified (dst_page[%d]=0x%x, offset=%ld, size=%lu, ret=%lu)",
> > +			i, val, offset, size, ret);
> > +	}
> > +
> > +	/*
> > +	 * A usercopy MUST NOT modify any bytes after the end of the destination
> > +	 * buffer.
> > +	 */
> 
> Without looking at the arm64 fixes in this series, I think such a test can
> fail for us currently given the right offset.

Yes, it can.

The test matches the documented semantic, so in that sense it's correctly
detecting that arm64 doesn't match the documentation.

Whether the documentation is right is clearly the key issue here. :)

> > +/*
> > + * Generate the size and offset combinations to test.
> > + *
> > + * Usercopies may have different paths for small/medium/large copies, but
> > + * typically max out at 64 byte chunks. We test sizes 0 to 128 bytes to check
> > + * all combinations of leading/trailing chunks and bulk copies.
> > + *
> > + * For each size, we test every offset relative to a leading and trailing page
> > + * boundary (i.e. [size, 0] and [PAGE_SIZE - size, PAGE_SIZE]) to check every
> > + * possible faulting boundary.
> > + */
> > +#define for_each_size_offset(size, offset)				\
> > +	for (unsigned long size = 0; size <= 128; size++)		\
> > +		for (long offset = -size;				\
> > +		     offset <= (long)PAGE_SIZE;				\
> > +		     offset = (offset ? offset + 1: (PAGE_SIZE - size)))
> > +
> > +static void test_copy_to_user(struct kunit *test)
> > +{
> > +	const struct usercopy_env *env = test->priv;
> > +
> > +	for_each_size_offset(size, offset) {
> > +		const struct usercopy_params params = {
> > +			.size = size,
> > +			.offset = offset,
> > +		};
> > +		unsigned long ret;
> > +
> > +		buf_init_zero(env->ubuf);
> > +		buf_init_pattern(env->kbuf);
> > +
> > +		ret = do_copy_to_user(env, &params);
> > +
> > +		assert_size_valid(test, &params, ret);
> > +		assert_src_valid(test, &params, env->kbuf, 0, ret);
> > +		assert_dst_valid(test, &params, env->ubuf, params.offset, ret);
> > +		assert_copy_valid(test, &params,
> > +				  env->ubuf, params.offset,
> > +				  env->kbuf, 0,
> > +				  ret);
> > +	}
> > +}
> 
> IIUC, in such tests you only vary the destination offset. Our copy
> routines in general try to align the source and leave the destination
> unaligned for performance. It would be interesting to add some variation
> on the source offset as well to spot potential issues with that part of
> the memcpy routines.

I have that on my TODO list; I had intended to drop that into the
usercopy_params. The only problem is that the cross product of size,
src_offset, and dst_offset gets quite large.
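
One way to keep that manageable might be to only sweep the source
*alignment* rather than every source offset, e.g. (illustrative only, not
what's in the series):

	/*
	 * Illustrative only: vary the source alignment within a 16-byte
	 * window rather than across the whole page, so the size x
	 * src_align x dst_offset product stays reasonable.
	 */
	#define for_each_src_align(src_align)				\
		for (unsigned long src_align = 0; src_align < 16; src_align++)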

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] arm64: fix __raw_copy_to_user semantics
  2023-03-21 17:50   ` Linus Torvalds
@ 2023-03-22 14:16     ` Mark Rutland
  0 siblings, 0 replies; 18+ messages in thread
From: Mark Rutland @ 2023-03-22 14:16 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: linux-kernel, agordeev, aou, bp, catalin.marinas, dave.hansen,
	davem, gor, hca, linux-arch, linux, mingo, palmer, paul.walmsley,
	robin.murphy, tglx, viro, will

On Tue, Mar 21, 2023 at 10:50:33AM -0700, Linus Torvalds wrote:
> On Tue, Mar 21, 2023 at 5:25 AM Mark Rutland <mark.rutland@arm.com> wrote:
> >
> > For some combinations of sizes and alignments __{arch,raw}_copy_to_user
> > will copy some bytes between (to + size - N) and (to + size), but will
> > never modify bytes past (to + size).
> >
> > This violates the documentation in <linux/uaccess.h>, which states:
> >
> > > If raw_copy_{to,from}_user(to, from, size) returns N, size - N bytes
> > > starting at to must become equal to the bytes fetched from the
> > > corresponding area starting at from.  All data past to + size - N must
> > > be left unmodified.
> 
> Hmm.
> 
> I'm not 100% sure we couldn't just relax the documentation.

Ok.

> After all, the "exception happens in the middle of a copy" is a
> special case, and generally results in -EFAULT rather than any
> indication of "this is how much data we filled in for user space".
> 
> Now, some operations do *try* to generally give partial results
> (notably "read()") even in the presence of page faults in the middle,
> but I'm not entirely convinced we need to bend over backwards over
> this.

If you think we should relax the documented semantic, I can go do that. If we
actually need the documented semantic in some cases, then something will need
to change.

All I'm really after is "what should the semantic be?" since there's clearly a
disconnect between the documentation and the code. I'm happy to go update
either.

> Put another way: we have lots of situations where we fill in partial
> user buffers and then return EFAULT, so "copy_to_user()" has at no
> point been "atomic" wrt the return value.
> 
> So I in no way hate this patch, and I do think it's a good "QoI fix",
> but if this ends up being painful for some architecture, I get the
> feeling that we could easily just relax the implementation instead.
> 
> We have accepted the whole "return value is not byte-exact" thing
> forever, simply because we have never required user copies to be done
> as byte-at-a-time copies.
> 
> Now, it is undoubtedly true that the buffer we fill in with a user copy must
> 
>  (a) be filled AT LEAST as much as we reported it to be filled in (ie
> user space can expect that there's no uninitialized data in any range
> we claimed to fill)
> 
>  (b) obviously never filled past the buffer size that was given

I agree those are both necessary.

> But if we take an exception in the middle, and write a partial aligned
> word, and don't report that as written (which is what you are fixing),
> I really feel that is a "QoI" thing, not a correctness thing.
>
> I don't think this arm implementation thing has ever hurt anything, in
> other words.
> 
> That said, at some point that quality-of-implementation thing makes
> validation so much easier that maybe it's worth doing just for that
> reason, which is why I think "if it's not too painful, go right ahead"
> is fine.

Fair enough.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] arm64: fix __raw_copy_to_user semantics
  2023-03-21 12:25 ` [PATCH v2 3/4] arm64: fix __raw_copy_to_user semantics Mark Rutland
  2023-03-21 17:50   ` Linus Torvalds
@ 2023-03-22 14:48   ` Catalin Marinas
  2023-03-22 15:30     ` Mark Rutland
  1 sibling, 1 reply; 18+ messages in thread
From: Catalin Marinas @ 2023-03-22 14:48 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-kernel, agordeev, aou, bp, dave.hansen, davem, gor, hca,
	linux-arch, linux, mingo, palmer, paul.walmsley, robin.murphy,
	tglx, torvalds, viro, will

On Tue, Mar 21, 2023 at 12:25:13PM +0000, Mark Rutland wrote:
> For some combinations of sizes and alignments __{arch,raw}_copy_to_user
> will copy some bytes between (to + size - N) and (to + size), but will
> never modify bytes past (to + size).
> 
> This violates the documentation in <linux/uaccess.h>, which states:
> 
> > If raw_copy_{to,from}_user(to, from, size) returns N, size - N bytes
> > starting at to must become equal to the bytes fetched from the
> > corresponding area starting at from.  All data past to + size - N must
> > be left unmodified.
> 
> This can be demonstrated through testing, e.g.
> 
> |     # test_copy_to_user: EXPECTATION FAILED at lib/usercopy_kunit.c:287
> | post-destination bytes modified (dst_page[4082]=0x1, offset=4081, size=16, ret=15)
> | [FAILED] 16 byte copy
> 
> This happens because the __arch_copy_to_user() can make unaligned stores
> to the userspace buffer, and the ARM architecture permits (but does not
> require) that such unaligned stores write some bytes before raising a
> fault (per ARM DDI 0487I.a Section B2.2.1 and Section B2.7.1). The
> extable fixup handlers in __arch_copy_to_user() assume that any faulting
> store has failed entirely, and so under-report the number of bytes
> copied when an unaligned store writes some bytes before faulting.

I find the Arm ARM hard to parse (no surprise here). Do you happen to
know what the behavior is for the new CPY instructions? I'd very much
like to use those for uaccess as well eventually but if they have the
same imp def behaviour, I'd rather relax the documentation and continue
to live with the current behaviour.

> The only architecturally guaranteed way to avoid this is to only use
> aligned stores to write to user memory.	This patch rewrites
> __arch_copy_to_user() to only access the user buffer with aligned
> stores, such that the bytes written can always be determined reliably.

Can we not fall back to byte-at-a-time? There's still a potential race
if the page becomes read-only for example. Well, probably not worth it
if we decide to go this route.
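
(Roughly the shape I had in mind, purely as a sketch and assuming the usual
<linux/uaccess.h> helpers; the function name is made up:

	/*
	 * Sketch only: after a faulting store, retry the remaining tail one
	 * byte at a time so the reported remainder is exact.
	 */
	static unsigned long copy_tail_bytewise(void __user *to, const void *from,
						unsigned long n)
	{
		unsigned long i;

		for (i = 0; i < n; i++) {
			if (__put_user(((const u8 *)from)[i], (u8 __user *)to + i))
				break;
		}
		return n - i;	/* bytes not copied */
	}

i.e. the exception fixup would hand the not-yet-copied tail to something
like the above.)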

Where we may notice some small performance degradation is copy_to_user()
where the reads from the source end up unaligned due to the destination
buffer alignment. I doubt that's a common case though and most CPUs can
probably cope just well with this.

-- 
Catalin

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] arm64: fix __raw_copy_to_user semantics
  2023-03-22 14:48   ` Catalin Marinas
@ 2023-03-22 15:30     ` Mark Rutland
  2023-03-22 16:39       ` Linus Torvalds
  0 siblings, 1 reply; 18+ messages in thread
From: Mark Rutland @ 2023-03-22 15:30 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: linux-kernel, agordeev, aou, bp, dave.hansen, davem, gor, hca,
	linux-arch, linux, mingo, palmer, paul.walmsley, robin.murphy,
	tglx, torvalds, viro, will

On Wed, Mar 22, 2023 at 02:48:26PM +0000, Catalin Marinas wrote:
> On Tue, Mar 21, 2023 at 12:25:13PM +0000, Mark Rutland wrote:
> > For some combinations of sizes and alignments __{arch,raw}_copy_to_user
> > will copy some bytes between (to + size - N) and (to + size), but will
> > never modify bytes past (to + size).
> > 
> > This violates the documentation in <linux/uaccess.h>, which states:
> > 
> > > If raw_copy_{to,from}_user(to, from, size) returns N, size - N bytes
> > > starting at to must become equal to the bytes fetched from the
> > > corresponding area starting at from.  All data past to + size - N must
> > > be left unmodified.
> > 
> > This can be demonstrated through testing, e.g.
> > 
> > |     # test_copy_to_user: EXPECTATION FAILED at lib/usercopy_kunit.c:287
> > | post-destination bytes modified (dst_page[4082]=0x1, offset=4081, size=16, ret=15)
> > | [FAILED] 16 byte copy
> > 
> > This happens because the __arch_copy_to_user() can make unaligned stores
> > to the userspace buffer, and the ARM architecture permits (but does not
> > require) that such unaligned stores write some bytes before raising a
> > fault (per ARM DDI 0487I.a Section B2.2.1 and Section B2.7.1). The
> > extable fixup handlers in __arch_copy_to_user() assume that any faulting
> > store has failed entirely, and so under-report the number of bytes
> > copied when an unaligned store writes some bytes before faulting.
> 
> I find the Arm ARM hard to parse (no surprise here). Do you happen to
> know what the behavior is for the new CPY instructions? I'd very much
> like to use those for uaccess as well eventually but if they have the
> same imp def behaviour, I'd rather relax the documentation and continue
> to live with the current behaviour.

My understanding is that those have to be broken up into a set of smaller
accesses, and so the same applies, with the architectural window for clobbering
being the *entire* range from 'to' to 'to + size'.

That said, the description of the forward-only instructions (CPYF*) suggests
those might have to be exact.

In ARM DDI 0487I.a, section C6.2.80 "CPYFPWT, CPYFMWT, CPYFEWT", it says:

> The memory copy performed by these instructions is in the forward direction
> only, so the instructions are suitable for a memory copy only where there is
> no overlap between the source and destination locations, or where the source
> address is greater than the destination address.

That *might* mean in practice that CPYF* instructions can't clobber later bytes
in dst in case they'll later be consumed as src bytes, but it doesn't strictly
say that (and maybe the core would figure out how much potential overlap there
is and tune the copy chunk size).

We could chase our architects on that.

> > The only architecturally guaranteed way to avoid this is to only use
> > aligned stores to write to user memory.	This patch rewrites
> > __arch_copy_to_user() to only access the user buffer with aligned
> > stores, such that the bytes written can always be determined reliably.
> 
> Can we not fall back to byte-at-a-time? There's still a potential race
> if the page becomes read-only for example. Well, probably not worth it
> if we decide to go this route.

I was trying to avoid that shape of race.

I also believe that if we have a misaligned store straddling two pages, and the
first page is faulting, the store can do a partial write to the 2nd page, which
I suspect is not what we want (though maybe that's benign, if we're going to
say that clobbering anywhere within the dst buffer is fine).

> Where we may notice some small performance degradation is copy_to_user()
> where the reads from the source end up unaligned due to the destination
> buffer alignment. I doubt that's a common case though and most CPUs can
> probably cope just well with this.

FWIW, I instrumented a kernel build workload, and the significant majority of
the time (IIRC by ~100x), both src and dst happened to be aligned to at least 8
bytes, so even if that were slow it's the rare case. I guess that's because any
structures we copy are likely to contain a long or a pointer, and a bunch of
other copies are going to be bulk page copies into buffers which the allocator
has aligned to at least 8 bytes.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] arm64: fix __raw_copy_to_user semantics
  2023-03-22 15:30     ` Mark Rutland
@ 2023-03-22 16:39       ` Linus Torvalds
  0 siblings, 0 replies; 18+ messages in thread
From: Linus Torvalds @ 2023-03-22 16:39 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Catalin Marinas, linux-kernel, agordeev, aou, bp, dave.hansen,
	davem, gor, hca, linux-arch, linux, mingo, palmer, paul.walmsley,
	robin.murphy, tglx, viro, will

On Wed, Mar 22, 2023 at 8:30 AM Mark Rutland <mark.rutland@arm.com> wrote:
>
> I also believe that if we have a misaligned store straddling two pages, and the
> first page is faulting, it the store can do a partial write to the 2nd page,
> which I suspected is not what we want (though maybe that's beningn, if we're
> going to say that clobbering anywhere within the dst buffer is fine).

So I don't think that clobbering anywhere in the write buffer is fine
in general, since user space may well depend on the whole "kernel
wrote exactly this range" (ie people who implement things like
circular buffers in user space that are filled by the kernel with a
read() system call).

But that's about partial read() results in general, and things like
interruptible reads (or just partial results from a pipe) in
particular.

At the same time EFAULT really is pretty special. We've *tried* to
make it restartable by user space, but honestly, I'm not entirely sure
that was ever a good idea, and I'm not even sure anybody really
depends on it.

For that case, I don't think we necessarily should care too deeply.

HOWEVER.

There is one very very special case: we may not care about
"restartable system calls" from user space, and say "we might as well
just always return EFAULT for any partial result".

That is, after all, what we already do for many cases (ie structure
copies, "put_user()" and friends). And I suspect it's what a lot of
other systems do.

No, the one special case is for the _kernel_ use of restartable user
copies, and the "__copy_from_user_inatomic()" case in particular.

They aren't hugely common, but they are required for some cases
(notably filesystems that hold locks and cannot take user page faults
due to deadlock scenarios), and *those* need very special care.

They still don't need to be byte-exact, but they do need to be able to
restart and most notably they need to be able to make forward progress
(together with a separate "fault_in_user_readable/writable()").

We've had situations where that didn't happen, and then you get actual
kernel hangs when some filesystem just does an endless loop with page
faults.

And by "make forward progress", it's actually fine to write too much
to user space, and return a short return value, and when restarting,
do some of the writes to user space *again*. It just has to be
restartable with the same data, and it can't keep taking a fault
thanks to some bad interaction with the "fault-in" case.

Of course, that does end up using the exact same user copy code, just
under "pagefault_disable()" (which is often implicitly called through
something like "kmap_atomic()" or similar).
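
The usual shape is something like this (a sketch only; real callers drop
their locks before the fault-in, and the helper names are approximate):

	retry:
		pagefault_disable();
		copied = len - __copy_from_user_inatomic(dst, src, len);
		pagefault_enable();

		if (unlikely(!copied)) {
			/* no forward progress: fault the page in, then retry */
			if (fault_in_readable(src, len))
				return -EFAULT;
			goto retry;
		}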

So I don't think we need to be byte-exact, but we do need to keep that
restart case in mind. But again, the restart can re-do parts of the
copy.

                    Linus

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [PATCH v2 1/4] lib: test copy_{to,from}_user()
  2023-03-22 14:05     ` Mark Rutland
@ 2023-03-23 22:16       ` David Laight
  2023-03-27 10:11         ` Catalin Marinas
  0 siblings, 1 reply; 18+ messages in thread
From: David Laight @ 2023-03-23 22:16 UTC (permalink / raw)
  To: 'Mark Rutland', Catalin Marinas
  Cc: linux-kernel, agordeev, aou, bp, dave.hansen, davem, gor, hca,
	linux-arch, linux, mingo, palmer, paul.walmsley, robin.murphy,
	tglx, torvalds, viro, will

From: Mark Rutland
> Sent: 22 March 2023 14:05
....
> > IIUC, in such tests you only vary the destination offset. Our copy
> > routines in general try to align the source and leave the destination
> > unaligned for performance. It would be interesting to add some variation
> > on the source offset as well to spot potential issues with that part of
> > the memcpy routines.
> 
> I have that on my TODO list; I had intended to drop that into the
> usercopy_params. The only problem is that the cross product of size,
> src_offset, and dst_offset gets quite large.

I thought that it was better to align the writes and do misaligned reads.
Although maybe copy_to/from_user() would be best aligning the user address
(to avoid page faults part way through a misaligned access).

OTOH, on x86, is it even worth bothering at all.
I have measured a performance drop for misaligned reads, but it
was less than 1 clock per cache line in a test that was doing
2 misaligned reads in at least some of the clock cycles.
I think the memory read path can do two AVX reads each clock.
So doing two misaligned 64bit reads isn't stressing it.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 1/4] lib: test copy_{to,from}_user()
  2023-03-23 22:16       ` David Laight
@ 2023-03-27 10:11         ` Catalin Marinas
  0 siblings, 0 replies; 18+ messages in thread
From: Catalin Marinas @ 2023-03-27 10:11 UTC (permalink / raw)
  To: David Laight
  Cc: 'Mark Rutland',
	linux-kernel, agordeev, aou, bp, dave.hansen, davem, gor, hca,
	linux-arch, linux, mingo, palmer, paul.walmsley, robin.murphy,
	tglx, torvalds, viro, will

On Thu, Mar 23, 2023 at 10:16:12PM +0000, David Laight wrote:
> From: Mark Rutland
> > Sent: 22 March 2023 14:05
> ....
> > > IIUC, in such tests you only vary the destination offset. Our copy
> > > routines in general try to align the source and leave the destination
> > > unaligned for performance. It would be interesting to add some variation
> > > on the source offset as well to spot potential issues with that part of
> > > the memcpy routines.
> > 
> > I have that on my TODO list; I had intended to drop that into the
> > usercopy_params. The only problem is that the cross product of size,
> > src_offset, and dst_offset gets quite large.
> 
> I thought that it was better to align the writes and do misaligned reads.

We inherited the memcpy/memset routines from the optimised cortex
strings library (fine-tuned by the toolchain people for various Arm
microarchitectures). For some CPUs with less aggressive prefetching it's
probably marginally faster to align the reads instead of writes (as
multiple unaligned writes are usually combined in the write buffer
somewhere).

Also, IIRC for some small copies (less than 16 bytes), our routines
don't bother with any alignment at all.

> Although maybe copy_to/from_user() would be best aligning the user address
> (to avoid page faults part way through a misaligned access).

In theory only copy_to_user() needs the write aligned if we want strict
guarantees of what was written. For copy_from_user() we can work around
by falling back to a byte read.

> OTOH, on x86, is it even worth bothering at all.
> I have measured a performance drop for misaligned reads, but it
> was less than 1 clock per cache line in a test that was doing
> 2 misaligned reads in at least some of the clock cycles.
> I think the memory read path can do two AVX reads each clock.
> So doing two misaligned 64bit reads isn't stressing it.

I think that's what Mark found as well in his testing, though I'm sure
one can build a very specific benchmark that shows a small degradation.

-- 
Catalin

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 0/4] usercopy: generic tests + arm64 fixes
  2023-03-21 12:25 [PATCH v2 0/4] usercopy: generic tests + arm64 fixes Mark Rutland
                   ` (3 preceding siblings ...)
  2023-03-21 12:25 ` [PATCH v2 4/4] arm64: fix clear_user() semantics Mark Rutland
@ 2023-11-15 22:30 ` Kees Cook
  4 siblings, 0 replies; 18+ messages in thread
From: Kees Cook @ 2023-11-15 22:30 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-kernel, Al Viro, Catalin Marinas, Linus Torvalds,
	Will Deacon, agordeev, aou, bp, dave.hansen, davem, gor, hca,
	linux-arch, linux, mingo, palmer, paul.walmsley, robin.murphy,
	tglx

On Tue, Mar 21, 2023 at 12:25:10PM +0000, Mark Rutland wrote:
> This series adds tests for the usercopy functions, which have found a
> few issues on most architectures I've tested on. The last two patches
> are fixes for arm64 (which pass all the tests, and are at least as good
> performance-wise as the existing implementation).
> [...]
>  lib/usercopy_kunit.c          | 506 ++++++++++++++++++++++++++++++++++

You didn't like lib/test_user_copy.c ? :)

There was a prior attempt to move it to KUnit:

https://lore.kernel.org/lkml/20200721174654.72132-1-vitor@massaru.org/

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2023-11-15 22:30 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-21 12:25 [PATCH v2 0/4] usercopy: generic tests + arm64 fixes Mark Rutland
2023-03-21 12:25 ` [PATCH v2 1/4] lib: test copy_{to,from}_user() Mark Rutland
2023-03-21 17:09   ` Catalin Marinas
2023-03-22 14:05     ` Mark Rutland
2023-03-23 22:16       ` David Laight
2023-03-27 10:11         ` Catalin Marinas
2023-03-21 18:04   ` Linus Torvalds
2023-03-22 13:55     ` Mark Rutland
2023-03-21 12:25 ` [PATCH v2 2/4] lib: test clear_user() Mark Rutland
2023-03-21 17:13   ` Catalin Marinas
2023-03-21 12:25 ` [PATCH v2 3/4] arm64: fix __raw_copy_to_user semantics Mark Rutland
2023-03-21 17:50   ` Linus Torvalds
2023-03-22 14:16     ` Mark Rutland
2023-03-22 14:48   ` Catalin Marinas
2023-03-22 15:30     ` Mark Rutland
2023-03-22 16:39       ` Linus Torvalds
2023-03-21 12:25 ` [PATCH v2 4/4] arm64: fix clear_user() semantics Mark Rutland
2023-11-15 22:30 ` [PATCH v2 0/4] usercopy: generic tests + arm64 fixes Kees Cook

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).