linux-kernel.vger.kernel.org archive mirror
* [PATCH] x86/uaccess: use unrolled string copy for short strings
@ 2017-06-21 11:09 Paolo Abeni
  2017-06-21 17:38 ` Kees Cook
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Paolo Abeni @ 2017-06-21 11:09 UTC (permalink / raw)
  To: x86
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Al Viro, Kees Cook,
	Hannes Frederic Sowa, linux-kernel

The 'rep' prefix incurs a significant "setup cost"; as a result,
for short strings, copies with unrolled loops are faster than even
the optimized string copy using the 'rep' variant.

This change updates copy_user_generic() to use the unrolled
version for short strings. The threshold length for a short
string - 64 bytes - was selected empirically as the largest
value that still ensures a measurable gain.
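
As a rough user-space illustration of the dispatch described above (not the
kernel code itself), the sketch below routes copies of up to 64 bytes to an
unrolled loop and everything else to a bulk copy. The names
SHORT_COPY_THRESHOLD, copy_small_unrolled(), copy_large() and copy_dispatch()
are hypothetical, and memcpy() merely stands in for the 'rep'-based variants.

```c
#include <stddef.h>
#include <string.h>

/* Threshold taken from the patch description; the name is hypothetical. */
#define SHORT_COPY_THRESHOLD 64

/*
 * Hypothetical stand-in for the unrolled path: copy 8 bytes per
 * iteration, then finish the tail one byte at a time.
 */
static size_t copy_small_unrolled(void *to, const void *from, size_t len)
{
	unsigned char *d = to;
	const unsigned char *s = from;

	while (len >= 8) {
		memcpy(d, s, 8);	/* fixed-size copy of one 8-byte chunk */
		d += 8;
		s += 8;
		len -= 8;
	}
	while (len--)
		*d++ = *s++;

	return 0;	/* bytes left uncopied; always 0 in this sketch */
}

/* Hypothetical stand-in for the 'rep'-based bulk copy. */
static size_t copy_large(void *to, const void *from, size_t len)
{
	memcpy(to, from, len);
	return 0;
}

/* Pick the copy routine by length, mirroring the check the patch adds. */
static size_t copy_dispatch(void *to, const void *from, size_t len)
{
	if (len <= SHORT_COPY_THRESHOLD)
		return copy_small_unrolled(to, from, len);

	return copy_large(to, from, len);
}
```

Unlike the kernel routines, this sketch cannot fault, so the "bytes not
copied" return value is always 0.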

A micro-benchmark of __copy_from_user() with different lengths shows
the following:

string len	vanilla		patched		delta
bytes		ticks		ticks		ticks(%)

0		58		26		32(55%)
1		49		29		20(40%)
2		49		31		18(36%)
3		49		32		17(34%)
4		50		34		16(32%)
5		49		35		14(28%)
6		49		36		13(26%)
7		49		38		11(22%)
8		50		31		19(38%)
9		51		33		18(35%)
10		52		36		16(30%)
11		52		37		15(28%)
12		52		38		14(26%)
13		52		40		12(23%)
14		52		41		11(21%)
15		52		42		10(19%)
16		51		34		17(33%)
17		51		35		16(31%)
18		52		37		15(28%)
19		51		38		13(25%)
20		52		39		13(25%)
21		52		40		12(23%)
22		51		42		9(17%)
23		51		46		5(9%)
24		52		35		17(32%)
25		52		37		15(28%)
26		52		38		14(26%)
27		52		39		13(25%)
28		52		40		12(23%)
29		53		42		11(20%)
30		52		43		9(17%)
31		52		44		8(15%)
32		51		36		15(29%)
33		51		38		13(25%)
34		51		39		12(23%)
35		51		41		10(19%)
36		52		41		11(21%)
37		52		43		9(17%)
38		51		44		7(13%)
39		52		46		6(11%)
40		51		37		14(27%)
41		50		38		12(24%)
42		50		39		11(22%)
43		50		40		10(20%)
44		50		42		8(16%)
45		50		43		7(14%)
46		50		43		7(14%)
47		50		45		5(10%)
48		50		37		13(26%)
49		49		38		11(22%)
50		50		40		10(20%)
51		50		42		8(16%)
52		50		42		8(16%)
53		49		46		3(6%)
54		50		46		4(8%)
55		49		48		1(2%)
56		50		39		11(22%)
57		50		40		10(20%)
58		49		42		7(14%)
59		50		42		8(16%)
60		50		46		4(8%)
61		50		47		3(6%)
62		50		48		2(4%)
63		50		48		2(4%)
64		51		38		13(25%)

Above 64 bytes the gain fades away.
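
For reference, the delta column above appears to be the tick difference
followed by the gain relative to vanilla, truncated to an integer. A minimal
sketch of that arithmetic follows; the helper names are hypothetical and the
truncation rule is inferred from the table values, not stated in the patch.

```c
/* Tick difference between the vanilla and patched kernels.
 * Assumes vanilla >= patched, as in every row of the table. */
static unsigned delta_ticks(unsigned vanilla, unsigned patched)
{
	return vanilla - patched;
}

/* Gain relative to vanilla, in percent, truncated by integer division.
 * Assumes vanilla != 0. */
static unsigned delta_percent(unsigned vanilla, unsigned patched)
{
	return delta_ticks(vanilla, patched) * 100 / vanilla;
}
```

For the 0-byte row, delta_ticks(58, 26) gives 32 and delta_percent(58, 26)
gives 55, matching the "32(55%)" entry.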

Very similar values are collected for __copy_to_user().
UDP receive performance under flood with small packets using recvfrom()
increases by ~5%.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 arch/x86/include/asm/uaccess_64.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h
index c5504b9..16a8871 100644
--- a/arch/x86/include/asm/uaccess_64.h
+++ b/arch/x86/include/asm/uaccess_64.h
@@ -28,6 +28,9 @@ copy_user_generic(void *to, const void *from, unsigned len)
 {
 	unsigned ret;
 
+	if (len <= 64)
+		return copy_user_generic_unrolled(to, from, len);
+
 	/*
 	 * If CPU has ERMS feature, use copy_user_enhanced_fast_string.
 	 * Otherwise, if CPU has rep_good feature, use copy_user_generic_string.
-- 
2.9.4

Thread overview: 10+ messages
2017-06-21 11:09 [PATCH] x86/uaccess: use unrolled string copy for short strings Paolo Abeni
2017-06-21 17:38 ` Kees Cook
2017-06-22 14:55   ` Alan Cox
2017-06-22  8:47 ` Ingo Molnar
2017-06-22 17:02   ` Paolo Abeni
2017-06-22 17:30 ` Linus Torvalds
2017-06-22 17:54   ` Paolo Abeni
2017-06-29 13:55   ` [PATCH] x86/uaccess: optimize copy_user_enhanced_fast_string for short string Paolo Abeni
2017-06-29 21:40     ` Linus Torvalds
2017-06-30 13:10     ` [tip:x86/asm] x86/uaccess: Optimize copy_user_enhanced_fast_string() for short strings tip-bot for Paolo Abeni
