x86: faster strncpy_from_user()

* x86: faster strncpy_from_user()
@ 2012-04-06 21:32 Linus Torvalds
  2012-04-06 21:49 ` Linus Torvalds
  2012-04-10 22:35 ` Benjamin Herrenschmidt
  0 siblings, 2 replies; 16+ messages in thread
From: Linus Torvalds @ 2012-04-06 21:32 UTC (permalink / raw)
  To: Ingo Molnar, H. Peter Anvin, the arch/x86 maintainers, linux-arch

[-- Attachment #1: Type: text/plain, Size: 4983 bytes --]

Ok, as some of you are aware, one of the things that got merged very
early in the 3.4 merge window was the "word-at-a-time" filename lookup
patches I had been working on. They only get enabled on x86, but when
they do, they do speed things up by quite a noticeable bit (mainly on
x86-64, which ends up doing things 8 bytes at a time - it's much less
noticeable on x86-32).

Now that the name lookup itself is quite lean, that ended up showing
some serious problems in selinux and the security wrappers, some of
which were largely solved a few days ago (the recent "Merge branch
'selinux'" thing got rid of the most annoying cases).

I have a few more selinux fixes to really get the worst stupidity out
of the way, but I suspect they are for next merge window.  But with
those, the path lookup is fast, and selinux isn't eating up *too* much
time either.

And the next thing that shows up is "strncpy_from_user()".

It's not all that surprising - the function really isn't particularly
smart, and using lodsb/stosb is just legacy cruft. But realistically,
while it hasn't been a piece of wonderful engineering, it also never
really showed up very high on profiles. Now with all the other
improvements I have a few loads where it really does show up pretty
well - the only real users of strncpy_from_user() are the pathname
copy code (sure, there's stuff elsewhere, but it's just not that
important).

Anyway, that function uses asms and is duplicated across x86-32 and
x86-64 for no real good reason except it was never a priority.

Here's a patch that actually removes more lines than it adds, because
it removes that duplication, and replaces the asm with some fairly
optimal C code. So we have

 6 files changed, 105 insertions(+), 145 deletions(-)

but what is perhaps more interesting (and the reason linux-arch is
cc'd) is that not only is it noticeably faster, it should also
actually get all the error cases right.

What do I mean by that?

I mean that in my tree (but not committed), I have actually removed
this nasty piece of code from fs/namei.c:

  diff --git a/fs/namei.c b/fs/namei.c
  index 0062dd17eb55..2b7d691523bb 100644
  --- a/fs/namei.c
  +++ b/fs/namei.c
  @@ -119,14 +119,7 @@
   static int do_getname(const char __user *filename, char *page)
   {
          int retval;
  -       unsigned long len = PATH_MAX;
  -
  -       if (!segment_eq(get_fs(), KERNEL_DS)) {
  -               if ((unsigned long) filename >= TASK_SIZE)
  -                       return -EFAULT;
  -               if (TASK_SIZE - (unsigned long) filename < PATH_MAX)
  -                       len = TASK_SIZE - (unsigned long) filename;
  -       }
  +       const unsigned long len = PATH_MAX;

          retval = strncpy_from_user(page, filename, len);
          if (retval > 0) {

which worked around the fact that strncpy_from_user() didn't really
get the end conditions quite right, and it really just did an
access_ok() on the first byte. The new x86 implementation should get
all of that right without any need for any ugly hacks.

Now, I haven't committed that change to do_getname(), but it *will*
get committed at some point. Probably not until the next merge window,
but I did want to let architecture maintainers know: you need to look
at your strncpy_from_userspace(), and make sure they get the "end of
address space" case right, because the pathname code won't hack around
it for you any more if you don't.

The x86-"specific" strncpy_from_user() attached in the patch is
actually fairly generic. It does do the word-at-a-time tricks, but if
you don't want to bother with those, you can just remove that loop
entirely, and it should all work in general (the x86 code has a
fall-back to the byte-at-a-time case, so if you remove the "while (max
>= sizeof(unsigned long))" loop, it should all "just work".

And the word-at-a-time tricks it does do are actually simpler than the
ones the dcache does - just a subset of them - so you may decide that
you'll do that part too, even if the dcache ones are more complicated.

(For example, if you just implement the "has_zero()" function, you can
make it do

    if (has_zero(a))
        break;

to fall back to the byte-at-a-time model - that's what I did
originally in this patch too, but quite frankly, it does mean that the
word-at-a-time logic isn't as efficient).

Anyway, this works for me on x86 (you need current tip-of-git for the
<asm/word-at-a-time.h> thing) and is quite noticeable in profiles
(well, if you have loads that do a lot of pathname lookups: I use a
microbenchmarks that does ten million lookups of a filename with
multiple components, but I also do "make -j" profiles of a fully build
tree).

And considering that the old x86 strncpy_to_user() code was so ugly,
it's pretty much a no-brainer there. But on other architectures, I'd
still like to point out the problem with the end-of-address-space,
which you may or may not actually have on other architectures.

Comments?

                         Linus

[-- Attachment #2: x86-strncpy_from_user.patch --]
[-- Type: application/octet-stream, Size: 9998 bytes --]

 arch/x86/include/asm/uaccess.h    |    2 +
 arch/x86/include/asm/uaccess_32.h |    5 --
 arch/x86/include/asm/uaccess_64.h |    4 --
 arch/x86/lib/usercopy.c           |  103 +++++++++++++++++++++++++++++++++++++
 arch/x86/lib/usercopy_32.c        |   87 -------------------------------
 arch/x86/lib/usercopy_64.c        |   49 ------------------
 6 files changed, 105 insertions(+), 145 deletions(-)

diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index 8be5f54d9360..e0544597cfe7 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -557,6 +557,8 @@ struct __large_struct { unsigned long buf[100]; };
 
 extern unsigned long
 copy_from_user_nmi(void *to, const void __user *from, unsigned long n);
+extern __must_check long
+strncpy_from_user(char *dst, const char __user *src, long count);
 
 /*
  * movsl can be slow when source and dest are not both 8-byte aligned
diff --git a/arch/x86/include/asm/uaccess_32.h b/arch/x86/include/asm/uaccess_32.h
index 566e803cc602..8084bc73b18c 100644
--- a/arch/x86/include/asm/uaccess_32.h
+++ b/arch/x86/include/asm/uaccess_32.h
@@ -213,11 +213,6 @@ static inline unsigned long __must_check copy_from_user(void *to,
 	return n;
 }
 
-long __must_check strncpy_from_user(char *dst, const char __user *src,
-				    long count);
-long __must_check __strncpy_from_user(char *dst,
-				      const char __user *src, long count);
-
 /**
  * strlen_user: - Get the size of a string in user space.
  * @str: The string to measure.
diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h
index 1c66d30971ad..fcd4b6f3ef02 100644
--- a/arch/x86/include/asm/uaccess_64.h
+++ b/arch/x86/include/asm/uaccess_64.h
@@ -208,10 +208,6 @@ int __copy_in_user(void __user *dst, const void __user *src, unsigned size)
 	}
 }
 
-__must_check long
-strncpy_from_user(char *dst, const char __user *src, long count);
-__must_check long
-__strncpy_from_user(char *dst, const char __user *src, long count);
 __must_check long strnlen_user(const char __user *str, long n);
 __must_check long __strnlen_user(const char __user *str, long n);
 __must_check long strlen_user(const char __user *str);
diff --git a/arch/x86/lib/usercopy.c b/arch/x86/lib/usercopy.c
index 97be9cb54483..57252c928f56 100644
--- a/arch/x86/lib/usercopy.c
+++ b/arch/x86/lib/usercopy.c
@@ -7,6 +7,8 @@
 #include <linux/highmem.h>
 #include <linux/module.h>
 
+#include <asm/word-at-a-time.h>
+
 /*
  * best effort, GUP based copy_from_user() that is NMI-safe
  */
@@ -41,3 +43,104 @@ copy_from_user_nmi(void *to, const void __user *from, unsigned long n)
 	return len;
 }
 EXPORT_SYMBOL_GPL(copy_from_user_nmi);
+
+static inline unsigned long count_bytes(unsigned long mask)
+{
+	mask = (mask - 1) & ~mask;
+	mask >>= 7;
+	return count_masked_bytes(mask);
+}
+
+/*
+ * Do a strncpy, return length of string without final '\0'.
+ * 'count' is the user-supplied count (return 'count' if we
+ * hit it), 'max' is the address space maximum (and we return
+ * -EFAULT if we hit it).
+ */
+static inline long do_strncpy_from_user(char *dst, const char __user *src, long count, long max)
+{
+	long res = 0;
+
+	/*
+	 * Truncate 'max' to the user-specified limit, so that
+	 * we only have one limit we need to check in the loop
+	 */
+	if (max > count)
+		max = count;
+
+	while (max >= sizeof(unsigned long)) {
+		unsigned long c;
+
+		/* Fall back to byte-at-a-time if we get a page fault */
+		if (unlikely(__get_user(c,(unsigned long __user *)(src+res))))
+			break;
+		/* This can write a few bytes past the NUL character, but that's ok */
+		*(unsigned long *)(dst+res) = c;
+		c = has_zero(c);
+		if (c)
+			return res + count_bytes(c);
+		res += sizeof(unsigned long);
+		max -= sizeof(unsigned long);
+	}
+
+	while (max) {
+		char c;
+
+		if (unlikely(__get_user(c,src+res)))
+			return -EFAULT;
+		dst[res] = c;
+		if (!c)
+			return res;
+		res++;
+		max--;
+	}
+
+	/*
+	 * Uhhuh. We hit 'max'. But was that the user-specified maximum
+	 * too? If so, that's ok - we got as much as the user asked for.
+	 */
+	if (res >= count)
+		return count;
+
+	/*
+	 * Nope: we hit the address space limit, and we still had more
+	 * characters the caller would have wanted. That's an EFAULT.
+	 */
+	return -EFAULT;
+}
+
+/**
+ * strncpy_from_user: - Copy a NUL terminated string from userspace.
+ * @dst:   Destination address, in kernel space.  This buffer must be at
+ *         least @count bytes long.
+ * @src:   Source address, in user space.
+ * @count: Maximum number of bytes to copy, including the trailing NUL.
+ *
+ * Copies a NUL-terminated string from userspace to kernel space.
+ *
+ * On success, returns the length of the string (not including the trailing
+ * NUL).
+ *
+ * If access to userspace fails, returns -EFAULT (some data may have been
+ * copied).
+ *
+ * If @count is smaller than the length of the string, copies @count bytes
+ * and returns @count.
+ */
+long
+strncpy_from_user(char *dst, const char __user *src, long count)
+{
+	unsigned long max_addr, src_addr;
+
+	if (unlikely(count <= 0))
+		return 0;
+
+	max_addr = current_thread_info()->addr_limit.seg;
+	src_addr = (unsigned long)src;
+	if (likely(src_addr < max_addr)) {
+		unsigned long max = max_addr - src_addr;
+		return do_strncpy_from_user(dst, src, count, max);
+	}
+	return -EFAULT;
+}
+EXPORT_SYMBOL(strncpy_from_user);
diff --git a/arch/x86/lib/usercopy_32.c b/arch/x86/lib/usercopy_32.c
index d9b094ca7aaa..ef2a6a5d78e3 100644
--- a/arch/x86/lib/usercopy_32.c
+++ b/arch/x86/lib/usercopy_32.c
@@ -33,93 +33,6 @@ static inline int __movsl_is_ok(unsigned long a1, unsigned long a2, unsigned lon
 	__movsl_is_ok((unsigned long)(a1), (unsigned long)(a2), (n))
 
 /*
- * Copy a null terminated string from userspace.
- */
-
-#define __do_strncpy_from_user(dst, src, count, res)			   \
-do {									   \
-	int __d0, __d1, __d2;						   \
-	might_fault();							   \
-	__asm__ __volatile__(						   \
-		"	testl %1,%1\n"					   \
-		"	jz 2f\n"					   \
-		"0:	lodsb\n"					   \
-		"	stosb\n"					   \
-		"	testb %%al,%%al\n"				   \
-		"	jz 1f\n"					   \
-		"	decl %1\n"					   \
-		"	jnz 0b\n"					   \
-		"1:	subl %1,%0\n"					   \
-		"2:\n"							   \
-		".section .fixup,\"ax\"\n"				   \
-		"3:	movl %5,%0\n"					   \
-		"	jmp 2b\n"					   \
-		".previous\n"						   \
-		_ASM_EXTABLE(0b,3b)					   \
-		: "=&d"(res), "=&c"(count), "=&a" (__d0), "=&S" (__d1),	   \
-		  "=&D" (__d2)						   \
-		: "i"(-EFAULT), "0"(count), "1"(count), "3"(src), "4"(dst) \
-		: "memory");						   \
-} while (0)
-
-/**
- * __strncpy_from_user: - Copy a NUL terminated string from userspace, with less checking.
- * @dst:   Destination address, in kernel space.  This buffer must be at
- *         least @count bytes long.
- * @src:   Source address, in user space.
- * @count: Maximum number of bytes to copy, including the trailing NUL.
- *
- * Copies a NUL-terminated string from userspace to kernel space.
- * Caller must check the specified block with access_ok() before calling
- * this function.
- *
- * On success, returns the length of the string (not including the trailing
- * NUL).
- *
- * If access to userspace fails, returns -EFAULT (some data may have been
- * copied).
- *
- * If @count is smaller than the length of the string, copies @count bytes
- * and returns @count.
- */
-long
-__strncpy_from_user(char *dst, const char __user *src, long count)
-{
-	long res;
-	__do_strncpy_from_user(dst, src, count, res);
-	return res;
-}
-EXPORT_SYMBOL(__strncpy_from_user);
-
-/**
- * strncpy_from_user: - Copy a NUL terminated string from userspace.
- * @dst:   Destination address, in kernel space.  This buffer must be at
- *         least @count bytes long.
- * @src:   Source address, in user space.
- * @count: Maximum number of bytes to copy, including the trailing NUL.
- *
- * Copies a NUL-terminated string from userspace to kernel space.
- *
- * On success, returns the length of the string (not including the trailing
- * NUL).
- *
- * If access to userspace fails, returns -EFAULT (some data may have been
- * copied).
- *
- * If @count is smaller than the length of the string, copies @count bytes
- * and returns @count.
- */
-long
-strncpy_from_user(char *dst, const char __user *src, long count)
-{
-	long res = -EFAULT;
-	if (access_ok(VERIFY_READ, src, 1))
-		__do_strncpy_from_user(dst, src, count, res);
-	return res;
-}
-EXPORT_SYMBOL(strncpy_from_user);
-
-/*
  * Zero Userspace
  */
 
diff --git a/arch/x86/lib/usercopy_64.c b/arch/x86/lib/usercopy_64.c
index b7c2849ffb66..0d0326f388c0 100644
--- a/arch/x86/lib/usercopy_64.c
+++ b/arch/x86/lib/usercopy_64.c
@@ -9,55 +9,6 @@
 #include <asm/uaccess.h>
 
 /*
- * Copy a null terminated string from userspace.
- */
-
-#define __do_strncpy_from_user(dst,src,count,res)			   \
-do {									   \
-	long __d0, __d1, __d2;						   \
-	might_fault();							   \
-	__asm__ __volatile__(						   \
-		"	testq %1,%1\n"					   \
-		"	jz 2f\n"					   \
-		"0:	lodsb\n"					   \
-		"	stosb\n"					   \
-		"	testb %%al,%%al\n"				   \
-		"	jz 1f\n"					   \
-		"	decq %1\n"					   \
-		"	jnz 0b\n"					   \
-		"1:	subq %1,%0\n"					   \
-		"2:\n"							   \
-		".section .fixup,\"ax\"\n"				   \
-		"3:	movq %5,%0\n"					   \
-		"	jmp 2b\n"					   \
-		".previous\n"						   \
-		_ASM_EXTABLE(0b,3b)					   \
-		: "=&r"(res), "=&c"(count), "=&a" (__d0), "=&S" (__d1),	   \
-		  "=&D" (__d2)						   \
-		: "i"(-EFAULT), "0"(count), "1"(count), "3"(src), "4"(dst) \
-		: "memory");						   \
-} while (0)
-
-long
-__strncpy_from_user(char *dst, const char __user *src, long count)
-{
-	long res;
-	__do_strncpy_from_user(dst, src, count, res);
-	return res;
-}
-EXPORT_SYMBOL(__strncpy_from_user);
-
-long
-strncpy_from_user(char *dst, const char __user *src, long count)
-{
-	long res = -EFAULT;
-	if (access_ok(VERIFY_READ, src, 1))
-		return __strncpy_from_user(dst, src, count);
-	return res;
-}
-EXPORT_SYMBOL(strncpy_from_user);
-
-/*
  * Zero Userspace
  */
 

^ permalink raw reply related	[flat|nested] 16+ messages in thread