* [PATCH] Introduce load_TLS to the "for" loop.
From: Rusty Russell @ 2007-03-13 6:39 UTC
To: Andi Kleen; +Cc: lkml - Kernel Mailing List
GCC (4.1 at least) unrolls it anyway, but I can't believe this code
was ever justifiable. (I've also submitted a patch which cleans up
i386, which is even uglier).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
diff -r de5618b5e562 include/asm-x86_64/desc.h
--- a/include/asm-x86_64/desc.h Tue Mar 13 11:41:55 2007 +1100
+++ b/include/asm-x86_64/desc.h Tue Mar 13 16:09:56 2007 +1100
@@ -135,16 +135,13 @@ static inline void set_ldt_desc(unsigned
 	 (info)->useable == 0 && \
 	 (info)->lm == 0)
 
-#if TLS_SIZE != 24
-# error update this code.
-#endif
-
 static inline void load_TLS(struct thread_struct *t, unsigned int cpu)
 {
+	unsigned int i;
 	u64 *gdt = (u64 *)(cpu_gdt(cpu) + GDT_ENTRY_TLS_MIN);
-	gdt[0] = t->tls_array[0];
-	gdt[1] = t->tls_array[1];
-	gdt[2] = t->tls_array[2];
+
+	for (i = 0; i < GDT_ENTRY_TLS_ENTRIES; i++)
+		gdt[i] = t->tls_array[i];
 }
 
 /*
* Re: [PATCH] Introduce load_TLS to the "for" loop.
From: Andi Kleen @ 2007-03-13 13:50 UTC
To: Rusty Russell; +Cc: lkml - Kernel Mailing List
On Tue, Mar 13, 2007 at 05:39:36PM +1100, Rusty Russell wrote:
> GCC (4.1 at least) unrolls it anyway, but I can't believe this code
Are you sure? Normally it doesn't unroll without -funroll-loops which
the kernel does normally not set. Especially not with -Os builds.
-Andi
* Re: [PATCH] Introduce load_TLS to the "for" loop.
From: Jeremy Fitzhardinge @ 2007-03-13 17:31 UTC
To: Andi Kleen; +Cc: Rusty Russell, lkml - Kernel Mailing List
Andi Kleen wrote:
> On Tue, Mar 13, 2007 at 05:39:36PM +1100, Rusty Russell wrote:
>
>> GCC (4.1 at least) unrolls it anyway, but I can't believe this code
>>
>
> Are you sure? Normally it doesn't unroll without -funroll-loops which
> the kernel does normally not set. Especially not with -Os builds.
>
Does it matter either way in this case?
J
* Re: [PATCH] Introduce load_TLS to the "for" loop.
From: Andi Kleen @ 2007-03-13 20:55 UTC
To: Jeremy Fitzhardinge; +Cc: Rusty Russell, lkml - Kernel Mailing List
On Tue, Mar 13, 2007 at 10:31:27AM -0700, Jeremy Fitzhardinge wrote:
> Andi Kleen wrote:
> > On Tue, Mar 13, 2007 at 05:39:36PM +1100, Rusty Russell wrote:
> >
> >> GCC (4.1 at least) unrolls it anyway, but I can't believe this code
> >>
> >
> > Are you sure? Normally it doesn't unroll without -funroll-loops which
> > the kernel does normally not set. Especially not with -Os builds.
> >
>
> Does it matter either way in this case?
It's in the middle of the context switch.
-Andi
* Re: [PATCH] Introduce load_TLS to the "for" loop.
From: Rusty Russell @ 2007-03-14 6:31 UTC
To: Andi Kleen; +Cc: lkml - Kernel Mailing List
On Tue, 2007-03-13 at 14:50 +0100, Andi Kleen wrote:
> On Tue, Mar 13, 2007 at 05:39:36PM +1100, Rusty Russell wrote:
> > GCC (4.1 at least) unrolls it anyway, but I can't believe this code
>
> Are you sure? Normally it doesn't unroll without -funroll-loops which
> the kernel does normally not set. Especially not with -Os builds.
Yep, checked again:
$ gcc --version
gcc (GCC) 4.1.2 20060928 (prerelease) (Ubuntu 4.1.1-13ubuntu5)
...
...
gcc -Wp,-MD,arch/x86_64/kernel/.process.o.d -nostdinc
-isystem /usr/lib/gcc/i486-linux-gnu/4.1.2/include -D__KERNEL__
-Iinclude -include include/linux/autoconf.h -Wall -Wundef
-Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -O2
-mtune=generic -m64 -mno-red-zone -mcmodel=kernel -pipe
-fno-reorder-blocks -Wno-sign-compare -fno-asynchronous-unwind-tables
-funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow
-maccumulate-outgoing-args -fno-omit-frame-pointer
-fno-optimize-sibling-calls -g -fno-stack-protector
-Wdeclaration-after-statement -Wno-pointer-sign -D"KBUILD_STR(s)=#s"
-D"KBUILD_BASENAME=KBUILD_STR(process)"
-D"KBUILD_MODNAME=KBUILD_STR(process)" -c -o
arch/x86_64/kernel/process.o arch/x86_64/kernel/process.c
...
$ objdump -Dr arch/x86_64/kernel/process.o | less
...
 6be:   48 8b 94 00 00 00 00    mov    0x0(%rax,%rax,1),%rdx
 6c5:   00
                        6c2: R_X86_64_32S       cpu_gdt_descr+0x2
 6c6:   48 8b 83 98 02 00 00    mov    0x298(%rbx),%rax
 6cd:   48 83 c2 60             add    $0x60,%rdx
 6d1:   48 89 02                mov    %rax,(%rdx)
 6d4:   48 8b 83 a0 02 00 00    mov    0x2a0(%rbx),%rax
 6db:   48 89 42 08             mov    %rax,0x8(%rdx)
 6df:   48 8b 83 a8 02 00 00    mov    0x2a8(%rbx),%rax
 6e6:   48 89 42 10             mov    %rax,0x10(%rdx)
If I turn on CONFIG_OPTIMIZE_FOR_SIZE, it's still unrolled,
interestingly.
Cheers,
Rusty.
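
For anyone who wants to repeat the check outside a kernel tree, a minimal user-space sketch follows. It is not the kernel code: `TLS_SLOTS`, `struct tls_src`, `load_tls_loop()` and `load_tls_unrolled()` are hypothetical stand-ins for `GDT_ENTRY_TLS_ENTRIES`, `struct thread_struct` and the two versions of `load_TLS()`, and the file-scope `static_assert` only mimics the `#if TLS_SIZE != 24` guard that the patch removes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for GDT_ENTRY_TLS_ENTRIES. */
#define TLS_SLOTS 3

/* Hypothetical stand-in for the tls_array member of struct thread_struct. */
struct tls_src {
	uint64_t tls_array[TLS_SLOTS];
};

/* Mimics the removed "#if TLS_SIZE != 24" guard for the hand-unrolled copy. */
static_assert(sizeof(((struct tls_src *)0)->tls_array) == 24,
	      "update load_tls_unrolled()");

/* Same shape as the patched load_TLS(): a loop with a constant trip count. */
void load_tls_loop(uint64_t *gdt, const struct tls_src *t)
{
	unsigned int i;

	for (i = 0; i < TLS_SLOTS; i++)
		gdt[i] = t->tls_array[i];
}

/* Same shape as the code the patch removes: three explicit stores. */
void load_tls_unrolled(uint64_t *gdt, const struct tls_src *t)
{
	gdt[0] = t->tls_array[0];
	gdt[1] = t->tls_array[1];
	gdt[2] = t->tls_array[2];
}
```

Building this with `gcc -O2 -c` (or `-Os`) and comparing the two functions in `objdump -d` output shows whether a given compiler emits straight-line stores for the loop. The result may differ between GCC versions and flag sets, so it should be checked rather than assumed.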
* Re: [PATCH] Introduce load_TLS to the "for" loop.
From: Rusty Russell @ 2007-03-14 6:43 UTC
To: Andi Kleen; +Cc: Jeremy Fitzhardinge, lkml - Kernel Mailing List
On Tue, 2007-03-13 at 21:55 +0100, Andi Kleen wrote:
> On Tue, Mar 13, 2007 at 10:31:27AM -0700, Jeremy Fitzhardinge wrote:
> > Andi Kleen wrote:
> > > On Tue, Mar 13, 2007 at 05:39:36PM +1100, Rusty Russell wrote:
> > >
> > >> GCC (4.1 at least) unrolls it anyway, but I can't believe this code
> > >>
> > >
> > > Are you sure? Normally it doesn't unroll without -funroll-loops which
> > > the kernel does normally not set. Especially not with -Os builds.
> > >
> >
> > Does it matter either way in this case?
>
> It's in the middle of the context switch.
Well, the rest of __switch_to isn't "0PTIM1Z3D!!!" like this.
But even so, that's no excuse for crap code. If it had used memcpy, we
wouldn't be wasting cycles on this discussion.
Rusty.
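
The memcpy alternative mentioned above could look roughly like the sketch below. This is an illustration only, not code from the thread or from the kernel: `TLS_COUNT`, `struct tls_state` and `load_tls_memcpy()` are hypothetical names, and the sketch assumes the destination GDT slots are contiguous 64-bit entries, as the `u64 *gdt` cast in the patch suggests.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for GDT_ENTRY_TLS_ENTRIES. */
#define TLS_COUNT 3

/* Hypothetical stand-in for the tls_array member of struct thread_struct. */
struct tls_state {
	uint64_t tls_array[TLS_COUNT];
};

/* The copy length is derived from the array itself, so no size guard is needed. */
static_assert(sizeof(((struct tls_state *)0)->tls_array) ==
	      TLS_COUNT * sizeof(uint64_t), "unexpected padding in tls_array");

/* Copy every TLS descriptor into the destination slots in one call. */
void load_tls_memcpy(uint64_t *gdt, const struct tls_state *t)
{
	memcpy(gdt, t->tls_array, sizeof(t->tls_array));
}
```

Because the length comes from `sizeof(t->tls_array)`, changing the number of entries needs no edit to the copy itself. Whether the compiler expands a small fixed-size memcpy inline depends on the compiler and its flags.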