* [PATCH] x86: bring back rep movsq for user access on CPUs without ERMS
@ 2023-08-28 17:07 Mateusz Guzik
  2023-08-28 18:00 ` Linus Torvalds
  2023-08-29 19:30 ` Linus Torvalds
  0 siblings, 2 replies; 9+ messages in thread
From: Mateusz Guzik @ 2023-08-28 17:07 UTC (permalink / raw)
  To: torvalds; +Cc: linux-kernel, linux-arch, bp, Mateusz Guzik

I spotted rep_movs_alternative on a flamegraph while building the kernel
on an AMD CPU, and I was surprised to find the CPU does not have ERMS.

The 64-byte loop there can be trivially replaced with rep movsq and I
don't think there is any reason NOT to do it. If anything, I'm surprised
the non-ERMS case was left hanging around after the 32KB regression
reported by Eric Dumazet.

Anyhow, the patch below remedies it.

The patch initially had:
 SYM_FUNC_START(rep_movs_alternative)
        cmpq $64,%rcx
-       jae .Llarge
+       ALTERNATIVE "jae .Lrep_movsq", "jae .Lrep_movsb", X86_FEATURE_ERMS


But then I found the weird nops, and after reading the thread which
resulted in bringing back ERMS I see why:
https://lore.kernel.org/lkml/CANn89iKUbyrJ=r2+_kK+sb2ZSSHifFZ7QkPLDpAtkJ8v4WUumA@mail.gmail.com/

That said, whacking the 64-byte loop in favor of rep movsq is definitely
the right thing to do, and the patch below is one way to do it.

What I don't get is what's up with ERMS availability on AMD CPUs. I was
told newer uarchs support it, but the bit may be disabled in the BIOS(?).

Anyhow, I temporarily got an instance on Amazon EC2 with an EPYC 7R13 and
the bit is not there -- whether that is a configuration problem or not, I
can't tell.

I also note there is quite a mess concerning other string ops, which I'm
going to tackle in another thread later(tm).

================ cut here ================

Intel CPUs have shipped with ERMS for over a decade, but this is not true
for AMD.

Hand-rolled mov loops executing in this case are quite pessimal compared
to rep movsq for bigger sizes. While the upper limit depends on uarch,
everyone is well south of 1KB AFAICS and sizes bigger than that are
common. The problem can be easily remedied so do it.

Sample result from read1_processes from will-it-scale on EPYC 7R13
(4KB reads/s):
before:	1507021
after:	1721828 (+14%)

Note that the cutoff point for rep usage is set to 64 bytes, which is
way too conservative but I'm sticking to what was done in 47ee3f1dd93b
("x86: re-introduce support for ERMS copies for user space accesses").
That is to say *some* copies will now go slower, which is fixable but
beyond the scope of this patch.

While here make sure labels are unique.

Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
---
 arch/x86/lib/copy_user_64.S | 61 +++++++++++--------------------------
 1 file changed, 17 insertions(+), 44 deletions(-)

diff --git a/arch/x86/lib/copy_user_64.S b/arch/x86/lib/copy_user_64.S
index 01c5de4c279b..2fe61de81a22 100644
--- a/arch/x86/lib/copy_user_64.S
+++ b/arch/x86/lib/copy_user_64.S
@@ -68,55 +68,28 @@ SYM_FUNC_START(rep_movs_alternative)
 	_ASM_EXTABLE_UA( 3b, .Lcopy_user_tail)
 
 .Llarge:
-0:	ALTERNATIVE "jmp .Lunrolled", "rep movsb", X86_FEATURE_ERMS
-1:	RET
+4:	ALTERNATIVE "jmp .Llarge_movsq", "rep movsb", X86_FEATURE_ERMS
+5:	RET
 
-        _ASM_EXTABLE_UA( 0b, 1b)
+	_ASM_EXTABLE_UA( 4b, 5b)
 
-	.p2align 4
-.Lunrolled:
-10:	movq (%rsi),%r8
-11:	movq 8(%rsi),%r9
-12:	movq 16(%rsi),%r10
-13:	movq 24(%rsi),%r11
-14:	movq %r8,(%rdi)
-15:	movq %r9,8(%rdi)
-16:	movq %r10,16(%rdi)
-17:	movq %r11,24(%rdi)
-20:	movq 32(%rsi),%r8
-21:	movq 40(%rsi),%r9
-22:	movq 48(%rsi),%r10
-23:	movq 56(%rsi),%r11
-24:	movq %r8,32(%rdi)
-25:	movq %r9,40(%rdi)
-26:	movq %r10,48(%rdi)
-27:	movq %r11,56(%rdi)
-	addq $64,%rsi
-	addq $64,%rdi
-	subq $64,%rcx
-	cmpq $64,%rcx
-	jae .Lunrolled
-	cmpl $8,%ecx
-	jae .Lword
+.Llarge_movsq:
+	movq %rcx,%r8
+	movq %rcx,%rax
+	shrq $3,%rcx
+	andl $7,%eax
+6:	rep movsq
+	movl %eax,%ecx
 	testl %ecx,%ecx
 	jne .Lcopy_user_tail
 	RET
 
-	_ASM_EXTABLE_UA(10b, .Lcopy_user_tail)
-	_ASM_EXTABLE_UA(11b, .Lcopy_user_tail)
-	_ASM_EXTABLE_UA(12b, .Lcopy_user_tail)
-	_ASM_EXTABLE_UA(13b, .Lcopy_user_tail)
-	_ASM_EXTABLE_UA(14b, .Lcopy_user_tail)
-	_ASM_EXTABLE_UA(15b, .Lcopy_user_tail)
-	_ASM_EXTABLE_UA(16b, .Lcopy_user_tail)
-	_ASM_EXTABLE_UA(17b, .Lcopy_user_tail)
-	_ASM_EXTABLE_UA(20b, .Lcopy_user_tail)
-	_ASM_EXTABLE_UA(21b, .Lcopy_user_tail)
-	_ASM_EXTABLE_UA(22b, .Lcopy_user_tail)
-	_ASM_EXTABLE_UA(23b, .Lcopy_user_tail)
-	_ASM_EXTABLE_UA(24b, .Lcopy_user_tail)
-	_ASM_EXTABLE_UA(25b, .Lcopy_user_tail)
-	_ASM_EXTABLE_UA(26b, .Lcopy_user_tail)
-	_ASM_EXTABLE_UA(27b, .Lcopy_user_tail)
+/*
+ * Recovery after failed rep movsq
+ */
+7:	movq %r8,%rcx
+	jmp .Lcopy_user_tail
+
+	_ASM_EXTABLE_UA( 6b, 7b)
 SYM_FUNC_END(rep_movs_alternative)
 EXPORT_SYMBOL(rep_movs_alternative)
-- 
2.39.2



* Re: [PATCH] x86: bring back rep movsq for user access on CPUs without ERMS
  2023-08-28 17:07 [PATCH] x86: bring back rep movsq for user access on CPUs without ERMS Mateusz Guzik
@ 2023-08-28 18:00 ` Linus Torvalds
  2023-08-28 18:04   ` Mateusz Guzik
  2023-08-29 19:30 ` Linus Torvalds
  1 sibling, 1 reply; 9+ messages in thread
From: Linus Torvalds @ 2023-08-28 18:00 UTC (permalink / raw)
  To: Mateusz Guzik; +Cc: linux-kernel, linux-arch, bp

On Mon, 28 Aug 2023 at 10:07, Mateusz Guzik <mjguzik@gmail.com> wrote:
>
> While here make sure labels are unique.

I'll take a look at the other changes later, but this one I reacted
to: please don't do this.

It's a disaster. It makes people make up random numbers, and then
pointlessly change them if the code moves around etc.

Numeric labels should make sense *locally*.  The way to disambiguate
them is to have each use just have "f" and "b" to distinguish whether
it refers forward or backwards.

And then just keep the numbering sensible in a *local* scope, because
the "file global" scope is just such a complete pain when you
pointlessly change the numbering just because some entirely unrelated
non-local thing changed.

And yes, you can see some confusion in the existing code where it uses
0/1/2/3, and that was because I just didn't consistently put the
exception table entries closer, and then moved things around.  So the
current code isn't entirely consistent either, but let's once and for
all make it clear that the sequential numbering is wrong, wrong,
wrong.

The numeric labels should not use sequential numbers, they should use
purely "locally sensible" numbers, and the exception handling should
similarly be as locally sensible as possible.
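
IOW, something like this entirely made-up fragment is fine - "1b" simply
resolves to the nearest "1:" above it, so the same number can show up as
many times as you want:

1:	movq (%rsi),%rax
	_ASM_EXTABLE_UA(1b, .Lfixup)	/* the "1:" right above */

	...

1:	movq %rax,(%rdi)		/* a completely independent "1" */
	_ASM_EXTABLE_UA(1b, .Lfixup)	/* again the nearest "1:" above */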

And if you use complicated numbering, please make the numbering be
some sane visually sensible grouping with commentary (ie that unrolled
'movq' loop case, or for an even nastier case see commit 034ff37d3407:
"x86: rewrite '__copy_user_nocache' function").

So if anything, the existing 2/3 labels in that file should be made
into 0/1. Because I've seen way too many of those "pointlessly renumber
lines just to sort them and make them unique" patches.

I used to do that when I was twelve years old because of nasty BASIC
line numbering.

I'm a big boy now, all grown up, and I don't want to still live in a
world where we renumber lines because we added some code in the
middle.

The alternative, of course, is to use actual proper named labels. And
for "real assembly code" that is obviously always the right solution.

But for exception table entries or for random assembler macros, that's
actually horrendous.

The numeric labels literally *exist* to avoid the uniqueness
requirements, exactly because for things like assembler macros, you
want to be able to re-use the same (local) label in a macro expansion
multiple times.

So trying to make numeric labels unique is literally missing the
entire *point* of them.

                 Linus


* Re: [PATCH] x86: bring back rep movsq for user access on CPUs without ERMS
  2023-08-28 18:00 ` Linus Torvalds
@ 2023-08-28 18:04   ` Mateusz Guzik
  2023-08-28 18:24     ` Linus Torvalds
  0 siblings, 1 reply; 9+ messages in thread
From: Mateusz Guzik @ 2023-08-28 18:04 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-arch, bp

On 8/28/23, Linus Torvalds <torvalds@linux-foundation.org> wrote:
> On Mon, 28 Aug 2023 at 10:07, Mateusz Guzik <mjguzik@gmail.com> wrote:
>>
>> While here make sure labels are unique.
>
> I'll take a look at the other changes later, but this one I reacted
> to: please don't do this.
>
> It's a disaster. It makes people make up random numbers, and then
> pointlessly change them if the code moves around etc.
>
> Numeric labels should make sense *locally*.  The way to disambiguate
> them is to have each use just have "f" and "b" to distinguish whether
> it refers forward or backwards.
>
[snip]

Other files do it (e.g., see __copy_user_nocache), but I have no
strong opinion myself.

That said I'll wait for the rest of the review before sending a v2.

-- 
Mateusz Guzik <mjguzik gmail.com>


* Re: [PATCH] x86: bring back rep movsq for user access on CPUs without ERMS
  2023-08-28 18:04   ` Mateusz Guzik
@ 2023-08-28 18:24     ` Linus Torvalds
  2023-08-28 18:32       ` Mateusz Guzik
  0 siblings, 1 reply; 9+ messages in thread
From: Linus Torvalds @ 2023-08-28 18:24 UTC (permalink / raw)
  To: Mateusz Guzik; +Cc: linux-kernel, linux-arch, bp

On Mon, 28 Aug 2023 at 11:04, Mateusz Guzik <mjguzik@gmail.com> wrote:
>
> Other files do it (e.g., see __copy_user_nocache), but I have no
> strong opinion myself.

So the __copy_user_nocache() thing is a case of that second issue -
see my comment about "some sane visually sensible grouping" of the
numbers.

Look closer, and you'll notice that they aren't actually sequential.
They are of the form XY where the X is the grouping, and Y is the
local number within that grouping.

That case also comes with a fair amount of comments about each group
for the extable entries.

But yes, we also do have a number of those "sequential labels". See
for example arch/x86/lib/getuser.S, where we then end up having all
the exception handling at the end because it is mostly shared across
cases. It's ugly.

We also have a lot of ugly cases that probably shouldn't use numbers
at all, eg csum_partial(). I think that goes back to some darker age
when things like "assembly is so trivial that it doesn't need any
fancy explanatory labels for code" was ok.

See also arch/x86/lib/memmove_64.S for similar horrors. I wonder if it
is a case of "use compiler to get almost the right code, then massage
things manually". Nasty, nasty. That should use legible names, not
random numbers.

I also suspect some people really thought that the numbers need to be
unique, and just didn't know to use local numbering.

             Linus


* Re: [PATCH] x86: bring back rep movsq for user access on CPUs without ERMS
  2023-08-28 18:24     ` Linus Torvalds
@ 2023-08-28 18:32       ` Mateusz Guzik
  0 siblings, 0 replies; 9+ messages in thread
From: Mateusz Guzik @ 2023-08-28 18:32 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-arch, bp

On 8/28/23, Linus Torvalds <torvalds@linux-foundation.org> wrote:
> On Mon, 28 Aug 2023 at 11:04, Mateusz Guzik <mjguzik@gmail.com> wrote:
>>
>> Other files do it (e.g., see __copy_user_nocache), but I have no
>> strong opinion myself.
>
> So the __copy_user_nocache() thing is a case of that second issue -
> see my comment about "some sane visually sensible grouping" of the
> numbers.
>
> Look closer, and you'll notice that they aren't actually sequential.
> They are of the form XY where the X is the grouping, and Y is the
> local number within that grouping.
>
> That case also comes with a fair amount of comments about each group
> for the extable entries.
>
> But yes, we also do have a number of those "sequential labels". See
> for example arch/x86/lib/getuser.S, where we then end up having all
> the exception handling at the end because it is mostly shared across
> cases. It's ugly.
>
> We also have a lot of ugly cases that probably shouldn't use numbers
> at all, eg csum_partial(). I think that goes back to some darker age
> when things like "assembly is so trivial that it doesn't need any
> fancy explanatory labels for code" was ok.
>
> See also arch/x86/lib/memmove_64.S for similar horrors. I wonder if it
> is a case of "use compiler to get almost the right code, then massage
> things manually". Nasty, nasty. That should use legible names, not
> random numbers.
>
> I also suspect some people really thought that the numbers need to be
> unique, and just didn't know to use local numbering.
>

That was a bad example; I meant that the labels were already *unique* in
other files and sequential in some of them. In the very function I'm
modifying here there is a 0/1 pair followed by a 2/3 pair already, so it
looked like the convention to follow.

Anyhow, this is a bullshit detail I'm not going to argue about; you made
your position clear and I see no problem adhering to it -- consider
this bit patched in v2.

Can we drop this aspect please ;)

-- 
Mateusz Guzik <mjguzik gmail.com>


* Re: [PATCH] x86: bring back rep movsq for user access on CPUs without ERMS
  2023-08-28 17:07 [PATCH] x86: bring back rep movsq for user access on CPUs without ERMS Mateusz Guzik
  2023-08-28 18:00 ` Linus Torvalds
@ 2023-08-29 19:30 ` Linus Torvalds
  2023-08-29 19:45   ` Mateusz Guzik
  1 sibling, 1 reply; 9+ messages in thread
From: Linus Torvalds @ 2023-08-29 19:30 UTC (permalink / raw)
  To: Mateusz Guzik; +Cc: linux-kernel, linux-arch, bp

On Mon, 28 Aug 2023 at 10:07, Mateusz Guzik <mjguzik@gmail.com> wrote:
>
> Hand-rolled mov loops executing in this case are quite pessimal compared
> to rep movsq for bigger sizes. While the upper limit depends on uarch,
> everyone is well south of 1KB AFAICS and sizes bigger than that are
> common. The problem can be easily remedied so do it.

Ok, looking at the actual code now, and your patch is buggy.

> +.Llarge_movsq:
> +       movq %rcx,%r8
> +       movq %rcx,%rax
> +       shrq $3,%rcx
> +       andl $7,%eax
> +6:     rep movsq
> +       movl %eax,%ecx
>         testl %ecx,%ecx
>         jne .Lcopy_user_tail
>         RET

The fixup code is very very broken:

> +/*
> + * Recovery after failed rep movsq
> + */
> +7:     movq %r8,%rcx
> +       jmp .Lcopy_user_tail
> +
> +       _ASM_EXTABLE_UA( 6b, 7b)

That just copies the original value back into %rcx. That's not at all
ok. The "rep movsq" may have succeeded partially, and updated %rcx
(and %rsi/rdi) accordingly. You will now do the "tail" for entirely
too much, and return the wrong value.

In fact, if this then races with a mmap() in another thread, the user
copy might end up then succeeding for the part that used to fail, and
in that case it will possibly end up copying much more than asked for
and overrunning the buffers provided.

So all those games with %r8 are entirely bogus. There is no way that
"save the original length" can ever be relevant or correct.

              Linus


* Re: [PATCH] x86: bring back rep movsq for user access on CPUs without ERMS
  2023-08-29 19:30 ` Linus Torvalds
@ 2023-08-29 19:45   ` Mateusz Guzik
  2023-08-29 20:03     ` Linus Torvalds
  0 siblings, 1 reply; 9+ messages in thread
From: Mateusz Guzik @ 2023-08-29 19:45 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-arch, bp

On 8/29/23, Linus Torvalds <torvalds@linux-foundation.org> wrote:
> On Mon, 28 Aug 2023 at 10:07, Mateusz Guzik <mjguzik@gmail.com> wrote:
>>
>> Hand-rolled mov loops executing in this case are quite pessimal compared
>> to rep movsq for bigger sizes. While the upper limit depends on uarch,
>> everyone is well south of 1KB AFAICS and sizes bigger than that are
>> common. The problem can be easily remedied so do it.
>
> Ok, looking at the actual code now, and your patch is buggy.
>
>> +.Llarge_movsq:
>> +       movq %rcx,%r8
>> +       movq %rcx,%rax
>> +       shrq $3,%rcx
>> +       andl $7,%eax
>> +6:     rep movsq
>> +       movl %eax,%ecx
>>         testl %ecx,%ecx
>>         jne .Lcopy_user_tail
>>         RET
>
> The fixup code is very very broken:
>
>> +/*
>> + * Recovery after failed rep movsq
>> + */
>> +7:     movq %r8,%rcx
>> +       jmp .Lcopy_user_tail
>> +
>> +       _ASM_EXTABLE_UA( 6b, 7b)
>
> That just copies the original value back into %rcx. That's not at all
> ok. The "rep movsq" may have succeeded partially, and updated %rcx
> (and %rsi/rdi) accordingly. You will now do the "tail" for entirely
> too much, and return the wrong value.
>
> In fact, if this then races with a mmap() in another thread, the user
> copy might end up then succeeding for the part that used to fail, and
> in that case it will possibly end up copying much more than asked for
> and overrunning the buffers provided.
>
> So all those games with %r8 are entirely bogus. There is no way that
> "save the original length" can ever be relevant or correct.
>

Huh, pretty obvious now that you mention it; I don't know why I thought
the regs would go back to their original values. But more importantly, I
should have checked the handling in the now-removed movsq routine
(copy_user_generic_string):

[snip]
        movl %edx,%ecx
        shrl $3,%ecx
        andl $7,%edx
1:      rep movsq
2:      movl %edx,%ecx
3:      rep movsb
        xorl %eax,%eax
        ASM_CLAC
        RET

11:     leal (%rdx,%rcx,8),%ecx
12:     movl %ecx,%edx          /* ecx is zerorest also */
        jmp .Lcopy_user_handle_tail

        _ASM_EXTABLE_CPY(1b, 11b)
        _ASM_EXTABLE_CPY(3b, 12b)
[/snip]

So I think I know how to fix it, but I'm going to sleep on it.

-- 
Mateusz Guzik <mjguzik gmail.com>


* Re: [PATCH] x86: bring back rep movsq for user access on CPUs without ERMS
  2023-08-29 19:45   ` Mateusz Guzik
@ 2023-08-29 20:03     ` Linus Torvalds
  2023-08-29 20:20       ` Mateusz Guzik
  0 siblings, 1 reply; 9+ messages in thread
From: Linus Torvalds @ 2023-08-29 20:03 UTC (permalink / raw)
  To: Mateusz Guzik; +Cc: linux-kernel, linux-arch, bp

On Tue, 29 Aug 2023 at 12:45, Mateusz Guzik <mjguzik@gmail.com> wrote:
>
> So I think I know how to fix it, but I'm going to sleep on it.

I think you can just skip the %r8 games, and do that

        leal (%rax,%rcx,8),%rcx

in the exception fixup code, since %rax will have the low bits of the
byte count, and %rcx will have the remaining qword count.
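
IOW, the non-ERMS side of rep_movs_alternative would end up looking
something like this (entirely untested sketch, local label numbers just
picked for illustration):

.Llarge_movsq:
	movq %rcx,%rax
	shrq $3,%rcx
	andl $7,%eax
0:	rep movsq
	movl %eax,%ecx
	testl %ecx,%ecx
	jne .Lcopy_user_tail
	RET

	/*
	 * On a fault %rcx holds the qwords not yet copied and %rax the
	 * leftover byte count, so %rax + 8*%rcx is what is still left
	 * to hand to the byte-at-a-time tail.
	 */
1:	leaq (%rax,%rcx,8),%rcx
	jmp .Lcopy_user_tail

	_ASM_EXTABLE_UA(0b, 1b)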

We should also have some test-case for partial reads somewhere, but I
have to admit that when I did the cleanup patches I just wrote some
silly test myself (ie just doing a 'mmap()' and then reading/writing
into the end of that mmap at different offsets).

I didn't save that hacky thing, I'm afraid.

I also tried to figure out if there is any CPU we should care about
that doesn't like 'rep movsq', but I think you are right that there
really isn't. The "good enough" rep things were introduced in the PPro
if I recall correctly, and while you could disable them in the BIOS,
by the time Intel did 64-bit in Northwood (?) it was pretty much
standard.

So yeah, no reason to have the unrolled loop at all, and I think your
patch is fine conceptually, just needs fixing and testing for the
partial success case.

Oh, and you should also remove the clobbers of r8-r11 in the
copy_user_generic() inline asm in <asm/uaccess_64.h> when you've fixed
the exception handling. The only reason for those clobbers was for
that unrolled register use.

So only %rax ends up being a clobber for the rep_movs_alternative
case, as far as I can tell.

            Linus


* Re: [PATCH] x86: bring back rep movsq for user access on CPUs without ERMS
  2023-08-29 20:03     ` Linus Torvalds
@ 2023-08-29 20:20       ` Mateusz Guzik
  0 siblings, 0 replies; 9+ messages in thread
From: Mateusz Guzik @ 2023-08-29 20:20 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-arch, bp

On 8/29/23, Linus Torvalds <torvalds@linux-foundation.org> wrote:
> On Tue, 29 Aug 2023 at 12:45, Mateusz Guzik <mjguzik@gmail.com> wrote:
>>
>> So I think I know how to fix it, but I'm going to sleep on it.
>
> I think you can just skip the %r8 games, and do that
>
>         leal (%rax,%rcx,8),%rcx
>
> in the exception fixup code, since %rax will have the low bits of the
> byte count, and %rcx will have the remaining qword count.
>
> We should also have some test-case for partial reads somewhere, but I
> have to admit that when I did the cleanup patches I just wrote some
> silly test myself (ie just doing a 'mmap()' and then reading/writing
> into the end of that mmap at different offsets.
>
> I didn't save that hacky thing, I'm afraid.
>

Yeah, I was planning on writing some tests to illustrate that this works
as intended and that v1 does not. That's part of why I'm going to take
more time; there is no rush to patch this.
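
Roughly the kind of thing I have in mind (userspace, untested sketch,
error handling omitted, the temp file name is just a placeholder): put an
inaccessible page right behind a writable one, pread() into the tail of
the writable page so copy_to_user() has to stop partway, and check that
the syscall returns exactly the number of bytes that fit:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
	long pg = sysconf(_SC_PAGESIZE);
	char tmpl[] = "/tmp/copytest.XXXXXX";
	char buf[8192];
	int fd = mkstemp(tmpl);

	/* a file with more data than any single read below will ask for */
	memset(buf, 0x5a, sizeof(buf));
	write(fd, buf, sizeof(buf));

	/* two anonymous pages, the second one made inaccessible */
	char *map = mmap(NULL, 2 * pg, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	mprotect(map + pg, pg, PROT_NONE);

	for (long off = 1; off <= 1024; off *= 2) {
		/* only 'off' bytes of the destination are writable */
		ssize_t ret = pread(fd, map + pg - off, off + pg, 0);
		printf("off %5ld: pread returned %zd (want %ld)\n",
		       off, ret, off);
	}
	unlink(tmpl);
	return 0;
}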

> I also tried to figure out if there is any CPU we should care about
> that doesn't like 'rep movsq', but I think you are right that there
> really isn't. The "good enough" rep things were introduced in the PPro
> if I recall correctly, and while you could disable them in the BIOS,
> by the time Intel did 64-bit in Northwood (?) it was pretty much
> standard.
>

gcc already inlines rep movsq for copies which fit, so....

On that note I'm going to submit a patch to whack non-rep clear_page as well.

> So yeah, no reason to have the unrolled loop at all, and I think your
> patch is fine conceptually, just needs fixing and testing for the
> partial success case.
>
> Oh, and you should also remove the clobbers of r8-r11 in the
> copy_user_generic() inline asm in <asm/uaccess_64.h> when you've fixed
> the exception handling. The only reason for those clobbers was for
> that unrolled register use.
>
> So only %rax ends up being a clobber for the rep_movs_alternative
> case, as far as I can tell.
>

Ok, I'll patch it up.
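
Presumably that just means dropping r8-r11 from the clobber list, i.e.
something like the below -- quoting copy_user_generic() from memory here,
so I'll double-check against the actual <asm/uaccess_64.h> before
sending:

	asm volatile(
		"1:\n\t"
		ALTERNATIVE("rep movsb",
			    "call rep_movs_alternative", ALT_NOT(X86_FEATURE_FSRM))
		"2:\n"
		_ASM_EXTABLE_UA(1b, 2b)
		:"+c" (len), "+D" (to), "+S" (from), ASM_CALL_CONSTRAINT
		: : "memory", "rax");	/* r8-r11 no longer clobbered */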

That label reorg was cosmetic and did not matter. But that bad fixup
was quite avoidable by checking what the original movsq routine was
doing, which I should have done before sending v1. Sorry for the lame
patch on that front. ;) (fwiw I did multiple kernel builds and whatnot
with it, and nothing blew up.)

Thanks for the review.

-- 
Mateusz Guzik <mjguzik gmail.com>


