* [PATCH] Introduce load_TLS to the "for" loop.
From: Rusty Russell @ 2007-03-13 6:39 UTC
To: Andi Kleen; +Cc: lkml - Kernel Mailing List

GCC (4.1 at least) unrolls it anyway, but I can't believe this code was
ever justifiable. (I've also submitted a patch which cleans up i386,
which is even uglier).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

diff -r de5618b5e562 include/asm-x86_64/desc.h
--- a/include/asm-x86_64/desc.h	Tue Mar 13 11:41:55 2007 +1100
+++ b/include/asm-x86_64/desc.h	Tue Mar 13 16:09:56 2007 +1100
@@ -135,16 +135,13 @@ static inline void set_ldt_desc(unsigned
 	 (info)->useable == 0 && \
 	 (info)->lm == 0)
 
-#if TLS_SIZE != 24
-# error update this code.
-#endif
-
 static inline void load_TLS(struct thread_struct *t, unsigned int cpu)
 {
+	unsigned int i;
 	u64 *gdt = (u64 *)(cpu_gdt(cpu) + GDT_ENTRY_TLS_MIN);
-	gdt[0] = t->tls_array[0];
-	gdt[1] = t->tls_array[1];
-	gdt[2] = t->tls_array[2];
+
+	for (i = 0; i < GDT_ENTRY_TLS_ENTRIES; i++)
+		gdt[i] = t->tls_array[i];
 }
 
 /*
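For anyone who wants to poke at the before and after shapes of load_TLS
outside a kernel tree, here is a minimal userspace sketch. It is not the
kernel code: struct thread_model, load_tls_unrolled, load_tls_loop and
check_tls_variants_agree are hypothetical names invented for illustration,
and the entry count of 3 is an assumption taken from the removed
"TLS_SIZE != 24" check (24 bytes of 8-byte descriptors).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: three 8-byte TLS descriptors, per the old TLS_SIZE == 24 check. */
#define GDT_ENTRY_TLS_ENTRIES 3

/* Hypothetical userspace stand-in for the kernel's struct thread_struct. */
struct thread_model {
	uint64_t tls_array[GDT_ENTRY_TLS_ENTRIES];
};

/* Pre-patch shape: one explicit store per TLS slot. */
static void load_tls_unrolled(const struct thread_model *t, uint64_t *gdt)
{
	gdt[0] = t->tls_array[0];
	gdt[1] = t->tls_array[1];
	gdt[2] = t->tls_array[2];
}

/* Post-patch shape: the same stores, driven by the entry count. */
static void load_tls_loop(const struct thread_model *t, uint64_t *gdt)
{
	unsigned int i;

	for (i = 0; i < GDT_ENTRY_TLS_ENTRIES; i++)
		gdt[i] = t->tls_array[i];
}

/* Aborts via assert() if the two variants ever write different contents. */
static void check_tls_variants_agree(const struct thread_model *t)
{
	uint64_t a[GDT_ENTRY_TLS_ENTRIES] = { 0 };
	uint64_t b[GDT_ENTRY_TLS_ENTRIES] = { 0 };

	load_tls_unrolled(t, a);
	load_tls_loop(t, b);
	assert(memcmp(a, b, sizeof(a)) == 0);
}
```

Compiling a file like this with "gcc -O2 -S" and reading the assembly for
load_tls_loop is one way to see what a given compiler does with the loop;
the result depends on the compiler version and flags.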
* Re: [PATCH] Introduce load_TLS to the "for" loop.
From: Andi Kleen @ 2007-03-13 13:50 UTC
To: Rusty Russell; +Cc: lkml - Kernel Mailing List

On Tue, Mar 13, 2007 at 05:39:36PM +1100, Rusty Russell wrote:
> GCC (4.1 at least) unrolls it anyway, but I can't believe this code

Are you sure? Normally it doesn't unroll without -funroll-loops which
the kernel does normally not set. Especially not with -Os builds.

-Andi
* Re: [PATCH] Introduce load_TLS to the "for" loop.
From: Jeremy Fitzhardinge @ 2007-03-13 17:31 UTC
To: Andi Kleen; +Cc: Rusty Russell, lkml - Kernel Mailing List

Andi Kleen wrote:
> On Tue, Mar 13, 2007 at 05:39:36PM +1100, Rusty Russell wrote:
>
>> GCC (4.1 at least) unrolls it anyway, but I can't believe this code
>>
>
> Are you sure? Normally it doesn't unroll without -funroll-loops which
> the kernel does normally not set. Especially not with -Os builds.
>

Does it matter either way in this case?

    J
* Re: [PATCH] Introduce load_TLS to the "for" loop.
From: Andi Kleen @ 2007-03-13 20:55 UTC
To: Jeremy Fitzhardinge; +Cc: Rusty Russell, lkml - Kernel Mailing List

On Tue, Mar 13, 2007 at 10:31:27AM -0700, Jeremy Fitzhardinge wrote:
> Andi Kleen wrote:
> > On Tue, Mar 13, 2007 at 05:39:36PM +1100, Rusty Russell wrote:
> >
> >> GCC (4.1 at least) unrolls it anyway, but I can't believe this code
> >>
> >
> > Are you sure? Normally it doesn't unroll without -funroll-loops which
> > the kernel does normally not set. Especially not with -Os builds.
> >
>
> Does it matter either way in this case?

It's in the middle of the context switch.

-Andi
* Re: [PATCH] Introduce load_TLS to the "for" loop.
From: Rusty Russell @ 2007-03-14 6:43 UTC
To: Andi Kleen; +Cc: Jeremy Fitzhardinge, lkml - Kernel Mailing List

On Tue, 2007-03-13 at 21:55 +0100, Andi Kleen wrote:
> On Tue, Mar 13, 2007 at 10:31:27AM -0700, Jeremy Fitzhardinge wrote:
> > Andi Kleen wrote:
> > > On Tue, Mar 13, 2007 at 05:39:36PM +1100, Rusty Russell wrote:
> > >
> > >> GCC (4.1 at least) unrolls it anyway, but I can't believe this code
> > >>
> > >
> > > Are you sure? Normally it doesn't unroll without -funroll-loops which
> > > the kernel does normally not set. Especially not with -Os builds.
> > >
> >
> > Does it matter either way in this case?
>
> It's in the middle of the context switch.

Well, the rest of __switch_to isn't "0PTIM1Z3D!!!" like this. But even
so, that's no excuse for crap code. If it had used memcpy, we wouldn't
be wasting cycles on this discussion.

Rusty.
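The memcpy alternative mentioned above is not what the posted patch does,
but it might look roughly like the following userspace sketch. The names
struct tls_state, TLS_SLOTS and load_tls_memcpy are hypothetical, and the
slot count of 3 is again an assumption derived from the old 24-byte check.
Sizing the copy from the array itself means there is no hard-coded length
left to fall out of date.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TLS_SLOTS 3	/* assumption: 24 bytes of TLS state, 8 bytes per slot */

/* Hypothetical stand-in for the part of struct thread_struct used here. */
struct tls_state {
	uint64_t tls_array[TLS_SLOTS];
};

/*
 * Copy every TLS descriptor in one call. The length comes from
 * sizeof(t->tls_array), so it follows the array if its size changes.
 */
static void load_tls_memcpy(const struct tls_state *t, uint64_t *gdt_tls)
{
	assert(t != NULL && gdt_tls != NULL);
	memcpy(gdt_tls, t->tls_array, sizeof(t->tls_array));
}
```

Whether a compiler expands a small fixed-size memcpy like this into plain
moves depends on the compiler and its flags, so that would need checking
in the generated code as well.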
* Re: [PATCH] Introduce load_TLS to the "for" loop.
From: Rusty Russell @ 2007-03-14 6:31 UTC
To: Andi Kleen; +Cc: lkml - Kernel Mailing List

On Tue, 2007-03-13 at 14:50 +0100, Andi Kleen wrote:
> On Tue, Mar 13, 2007 at 05:39:36PM +1100, Rusty Russell wrote:
> > GCC (4.1 at least) unrolls it anyway, but I can't believe this code
>
> Are you sure? Normally it doesn't unroll without -funroll-loops which
> the kernel does normally not set. Especially not with -Os builds.

Yep, checked again:

$ gcc --version
gcc (GCC) 4.1.2 20060928 (prerelease) (Ubuntu 4.1.1-13ubuntu5)
...
...
gcc -Wp,-MD,arch/x86_64/kernel/.process.o.d -nostdinc -isystem /usr/lib/gcc/i486-linux-gnu/4.1.2/include -D__KERNEL__ -Iinclude -include include/linux/autoconf.h -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -O2 -mtune=generic -m64 -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare -fno-asynchronous-unwind-tables -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -maccumulate-outgoing-args -fno-omit-frame-pointer -fno-optimize-sibling-calls -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(process)" -D"KBUILD_MODNAME=KBUILD_STR(process)" -c -o arch/x86_64/kernel/process.o arch/x86_64/kernel/process.c
...
$ objdump -Dr arch/x86_64/kernel/process.o | less
...
 6be:	48 8b 94 00 00 00 00 	mov    0x0(%rax,%rax,1),%rdx
 6c5:	00
			6c2: R_X86_64_32S	cpu_gdt_descr+0x2
 6c6:	48 8b 83 98 02 00 00 	mov    0x298(%rbx),%rax
 6cd:	48 83 c2 60          	add    $0x60,%rdx
 6d1:	48 89 02             	mov    %rax,(%rdx)
 6d4:	48 8b 83 a0 02 00 00 	mov    0x2a0(%rbx),%rax
 6db:	48 89 42 08          	mov    %rax,0x8(%rdx)
 6df:	48 8b 83 a8 02 00 00 	mov    0x2a8(%rbx),%rax
 6e6:	48 89 42 10          	mov    %rax,0x10(%rdx)

If I turn on CONFIG_OPTIMIZE_FOR_SIZE, it's still unrolled,
interestingly.

Cheers,
Rusty.