* Re: Syscall changes registers beyond %eax, on linux-i386
From: Petr Vandrovec @ 2002-09-19 17:44 UTC
To: Richard B. Johnson; +Cc: dvorak, linux-kernel
On 19 Sep 02 at 13:22, Richard B. Johnson wrote:
> > >>A short snippet of sys_poll, with irrelevant data removed.
> > >>
> > >>sys_poll(struct pollfd *ufds, .. , ..) {
> > >> ...
> > >> ufds++;
> > >> ...
>
> Well which one? Here is an ioctl(). It certainly modifies one
> of its parameter values.
poll(), as was already noted. Program below should
print same value for B= and F=, but it reports f + 8*c instead
(where c = number of file descriptors passed to poll).
And you must call it from assembly, as your calls to getpid() or
ioctl() (or poll()) are wrapped in libc - and glibc's code begins with
push %ebx because %ebx is used by -fPIC code.
It is questionable whether we should try to not modify parameters
passed into functions. It is definitely nice behavior, but I think
that we should only guarantee that syscalls do not modify unused
registers.
Petr Vandrovec
vandrove@vc.cvut.cz
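#include <unistd.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/poll.h>
struct pollfd f[5];
int main(int argc, char* argv[]) {
    unsigned int i;
    void * reg;
    long ret;
    for (i = 0; i < 5; i++) {
        f[i].fd = 0;
        f[i].events = POLLIN;
    }
    /* Raw i386 poll (syscall 168): %ebx = f, %ecx = 5 entries, %edx = 1 ms timeout. */
    /* %eax carries the return value, so it is declared as an output; %ebx is read back into reg. */
    __asm__ __volatile__("int $0x80\n"
        : "=a"(ret), "=b"(reg)
        : "0"(168), "1"(f), "c"(5), "d"(1)
        : "memory");
    (void)ret;
    printf("B=%p F=%p\n", reg, f);
    return 0;
}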
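[Archive note] A brief sketch of where the f + 8*c figure comes from, assuming Linux's struct pollfd (an int plus two shorts, 8 bytes): a loop that steps the pointer once per entry ends up 8*c bytes past the start. The helper name pollfd_walk_bytes and the 16-entry scratch table are made up for illustration; this is not the kernel's code.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>

#define SCRATCH_ENTRIES 16   /* hypothetical table size, illustration only */

/* Byte distance covered by stepping a struct pollfd pointer `count` times,
 * the way a loop doing ufds++ once per entry would. */
static size_t pollfd_walk_bytes(size_t count)
{
    static struct pollfd table[SCRATCH_ENTRIES];
    struct pollfd *p = table;
    size_t i;

    assert(count <= SCRATCH_ENTRIES);   /* stay within (or one past) the table */
    for (i = 0; i < count; i++)
        p++;                            /* advances by sizeof(struct pollfd) bytes */
    return (size_t)((char *)p - (char *)table);
}
```

With c = 5, as in the test program, the walk covers 5 * sizeof(struct pollfd) bytes, which is the offset between F= and B= in the report above.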
* Re: Syscall changes registers beyond %eax, on linux-i386
From: Brian Gerst @ 2002-09-19 18:04 UTC
To: Petr Vandrovec; +Cc: Richard B. Johnson, dvorak, linux-kernel
Petr Vandrovec wrote:
> On 19 Sep 02 at 13:22, Richard B. Johnson wrote:
>
>>>>>A short snippet of sys_poll, with irrelevant data removed.
>>>>>
>>>>>sys_poll(struct pollfd *ufds, .. , ..) {
>>>>> ...
>>>>> ufds++;
>>>>> ...
>>>>
>>Well which one? Here is an ioctl(). It certainly modifies one
>>of its parameter values.
>
> poll(), as was already noted. Program below should
> print same value for B= and F=, but it reports f + 8*c instead
> (where c = number of file descriptors passed to poll).
>
> And you must call it from assembly, as your calls to getpid() or
> ioctl() (or poll()) are wrapped in libc - and glibc's code begins with
> push %ebx because %ebx is used by -fPIC code.
>
> It is questionable whether we should try to not modify parameters
> passed into functions. It is definitely nice behavior, but I think
> that we should only guarantee that syscalls do not modify unused
> registers.
> Petr Vandrovec
> vandrove@vc.cvut.cz
Now that I've thought about it more, I think the best solution is to go
through all the syscalls (a big job, I know), and declare the parameters
as const, so that gcc knows it can't modify them, and will throw a
warning if we try.
--
Brian Gerst
* Re: Syscall changes registers beyond %eax, on linux-i386
From: Richard Henderson @ 2002-09-19 18:30 UTC
To: Brian Gerst; +Cc: Petr Vandrovec, Richard B. Johnson, dvorak, linux-kernel
On Thu, Sep 19, 2002 at 02:04:43PM -0400, Brian Gerst wrote:
> Now that I've thought about it more, I think the best solution is to go
> through all the syscalls (a big job, I know), and declare the parameters
> as const, so that gcc knows it can't modify them, and will throw a
> warning if we try.
The parameter area belongs to the callee, and it may *always* be modified.
r~
* Re: Syscall changes registers beyond %eax, on linux-i386
From: Brian Gerst @ 2002-09-19 18:51 UTC
To: Richard Henderson
Cc: Petr Vandrovec, Richard B. Johnson, dvorak, linux-kernel
Richard Henderson wrote:
> On Thu, Sep 19, 2002 at 02:04:43PM -0400, Brian Gerst wrote:
>
>>Now that I've thought about it more, I think the best solution is to go
>>through all the syscalls (a big job, I know), and declare the parameters
>>as const, so that gcc knows it can't modify them, and will throw a
>>warning if we try.
>
> The parameter area belongs to the callee, and it may *always* be modified.
>
> r~
>
The parameters can not be modified if they are declared const though,
that's my point.
--
Brian Gerst
* Re: Syscall changes registers beyond %eax, on linux-i386
From: Richard Henderson @ 2002-09-19 18:57 UTC
To: Brian Gerst; +Cc: Petr Vandrovec, Richard B. Johnson, dvorak, linux-kernel
On Thu, Sep 19, 2002 at 02:51:44PM -0400, Brian Gerst wrote:
> > The parameter area belongs to the callee, and it may *always* be modified.
>
> The parameters can not be modified if they are declared const though,
> that's my point.
Yes they can.
    extern void bar(int x, int y, int z);
    void foo(const int a, const int b, const int c)
    {
        bar(a+1, b+1, c+1);
    }

    subl $12, %esp
    movl 20(%esp), %eax
    incl %eax
    movl %eax, 20(%esp)
    movl 16(%esp), %eax
    incl %eax
    incl 24(%esp)
    movl %eax, 16(%esp)
    addl $12, %esp
    jmp bar
(Not sure why gcc doesn't use incl on all three memories, nor
should it allocate that stack frame...)
r~
* Re: Syscall changes registers beyond %eax, on linux-i386
From: Richard B. Johnson @ 2002-09-19 19:40 UTC
To: Richard Henderson; +Cc: Brian Gerst, Petr Vandrovec, dvorak, linux-kernel
On Thu, 19 Sep 2002, Richard Henderson wrote:
> On Thu, Sep 19, 2002 at 02:51:44PM -0400, Brian Gerst wrote:
> > > The parameter area belongs to the callee, and it may *always* be modified.
> >
> > The parameters can not be modified if they are declared const though,
> > that's my point.
>
> Yes they can.
>
> extern void bar(int x, int y, int z);
> void foo(const int a, const int b, const int c)
> {
>     bar(a+1, b+1, c+1);
> }
>
> subl $12, %esp
> movl 20(%esp), %eax
> incl %eax
> movl %eax, 20(%esp)
> movl 16(%esp), %eax
> incl %eax
> incl 24(%esp)
> movl %eax, 16(%esp)
> addl $12, %esp
> jmp bar
>
> (Not sure why gcc doesn't use incl on all three memories, nor
> should it allocate that stack frame...)
>
> r~
>
Well it's not modifying those values. It's putting the constant value
into a register and modifying the value in the register before calling
a function that takes int. Note that the parameters passed to the
function, a, b, and c, are local copies. gcc can whack those any way it
wants. In fact, it does strange things above which may not be valid.
It subtracts an offset from esp for local variables ($12). There aren't
any local variables! Therefore, it has to access the passed parameters
at their pushed offset + 12. Then, after it's through mucking with them,
it collapses the local stack area (levels the stack), then jumps to the
called function. It will use the early 'call' return-value to return to
the caller.
It's really bad code because it could have done:
    incl 0x04(%esp)
    incl 0x08(%esp)
    incl 0x0c(%esp)
    jmp bar
Note that, in every case, the constant value was pushed onto the stack
and this function called. That copy of the constant value can be trashed
any way the callee wants. It's his copy.
I thought you were going to do something like:
Script started on Thu Sep 19 15:22:05 2002
# cat zzz.c
int foo(const int a, const int b, const int c)
{
    a += b;
    a += c;
    return a;
}
# gcc -c -o zzz zzz.c
zzz.c: In function `foo':
zzz.c:6: warning: assignment of read-only location
zzz.c:7: warning: assignment of read-only location
# exit
exit
Script done on Thu Sep 19 15:22:23 2002
Which makes gcc barf when you attempt to modify the const value.
This allows you to check if the code is doing the wrong thing.
Cheers,
Dick Johnson
Penguin : Linux version 2.4.18 on an i686 machine (797.90 BogoMips).
The US military has given us many words, FUBAR, SNAFU, now ENRON.
Yes, top management were graduates of West Point and Annapolis.
* Re: Syscall changes registers beyond %eax, on linux-i386
From: Richard Henderson @ 2002-09-19 19:41 UTC
To: Richard B. Johnson; +Cc: Brian Gerst, Petr Vandrovec, dvorak, linux-kernel
On Thu, Sep 19, 2002 at 03:40:52PM -0400, Richard B. Johnson wrote:
> Well it's not modifying those values.
It's not modifying "a", true, but it _is_ modifying the parameter
area. Which is exactly the kernel bug in question.
> It's really bad code because it could have done:
>
> incl 0x04(%esp)
> incl 0x08(%esp)
> incl 0x0c(%esp)
> jmp bar
Yes, I know.
r~
* Re: Syscall changes registers beyond %eax, on linux-i386
From: Richard B. Johnson @ 2002-09-19 19:53 UTC
To: Richard Henderson; +Cc: Brian Gerst, Petr Vandrovec, dvorak, linux-kernel
On Thu, 19 Sep 2002, Richard Henderson wrote:
> On Thu, Sep 19, 2002 at 03:40:52PM -0400, Richard B. Johnson wrote:
> > Well it's not modifying those values.
>
> It's not modifying "a", true, but it _is_ modifying the parameter
> area. Which is exactly the kernel bug in question.
>
Yep. This can't be found by the compiler. The parameter area is writable
so it looks like somebody needs to do some 'code inspection' and some
additional testing.
> > It's really bad code because it could have done:
> >
> > incl 0x04(%esp)
> > incl 0x08(%esp)
> > incl 0x0c(%esp)
> > jmp bar
>
> Yes, I know.
>
It's a problem with a 'general purpose' compiler that wants to
be "all things" to all people. If somebody made a gcc-compatible
compiler, tuned to the ix86 characteristics, I think we could
cut the extra instructions by at least 1/2, maybe more.
Cheers,
Dick Johnson
* Re: Syscall changes registers beyond %eax, on linux-i386
From: J.A. Magallon @ 2002-09-19 22:46 UTC
To: root; +Cc: Richard Henderson, Brian Gerst, Petr Vandrovec, dvorak, linux-kernel
On 2002.09.19 Richard B. Johnson wrote:
>On Thu, 19 Sep 2002, Richard Henderson wrote:
>
[...]
>> > It's really bad code because it could have done:
>> >
>> > incl 0x04(%esp)
>> > incl 0x08(%esp)
>> > incl 0x0c(%esp)
>> > jmp bar
>>
[...]
>
>It's a problem with a 'general purpose' compiler that wants to
>be "all things" to all people. If somebody made a gcc-compatible
>compiler, tuned to the ix86 characteristics, I think we could
>cut the extra instructions by at least 1/2, maybe more.
>
Curiosity killed the cat....
Just tried it with gcc-3.2.
C code:
    extern void bar(int x, int y, int z);
    void foo(const int a, const int b, const int c)
    {
        bar(a+1, b+1, c+1);
    }
- gcc -S -O0:
    pushl %ebp
    movl %esp, %ebp
    subl $8, %esp
    subl $4, %esp
    movl 16(%ebp), %eax
    incl %eax
    pushl %eax
    movl 12(%ebp), %eax
    incl %eax
    pushl %eax
    movl 8(%ebp), %eax
    incl %eax
    pushl %eax
    call bar
    addl $16, %esp
    leave
    ret
- gcc -S -O1:
    pushl %ebp
    movl %esp, %ebp
    subl $12, %esp
    movl 16(%ebp), %eax
    incl %eax
    pushl %eax
    movl 12(%ebp), %eax
    incl %eax
    pushl %eax
    movl 8(%ebp), %eax
    incl %eax
    pushl %eax
    call bar
    addl $16, %esp
    movl %ebp, %esp
    popl %ebp
    ret
- gcc -S -O2:
    movl 12(%esp), %eax
    incl %eax
    movl %eax, 12(%esp)
    movl 8(%esp), %eax
    incl %eax
    movl %eax, 8(%esp)
    movl 4(%esp), %eax
    incl %eax
    movl %eax, 4(%esp)
    jmp bar
- gcc -S -O2 -march=[i686,pentium2,pentium3]:
    incl 4(%esp)
    movl 8(%esp), %eax
    incl %eax
    movl %eax, 8(%esp)
    movl 12(%esp), %eax
    incl %eax
    movl %eax, 12(%esp)
    jmp bar
- gcc -S -O2 -march=pentium4:
    movl 8(%esp), %eax
    addl $1, 4(%esp)
    addl $1, %eax
    movl %eax, 8(%esp)
    movl 12(%esp), %eax
    addl $1, %eax
    movl %eax, 12(%esp)
    jmp bar
--
J.A. Magallon <jamagallon@able.es>      \ Software is like sex:
werewolf.able.es                         \ It's better when it's free
Mandrake Linux release 9.0 (Cooker) for i586
Linux 2.4.20-pre7-jam0 (gcc 3.2 (Mandrake Linux 9.0 3.2-1mdk))
* Re: Syscall changes registers beyond %eax, on linux-i386
From: Richard B. Johnson @ 2002-09-20 12:27 UTC
To: J.A. Magallon
Cc: Richard Henderson, Brian Gerst, Petr Vandrovec, dvorak, linux-kernel
On Fri, 20 Sep 2002, J.A. Magallon wrote:
>
> On 2002.09.19 Richard B. Johnson wrote:
> >On Thu, 19 Sep 2002, Richard Henderson wrote:
> >
> [...]
> >> > It's really bad code because it could have done:
> >> >
> >> > incl 0x04(%esp)
> >> > incl 0x08(%esp)
> >> > incl 0x0c(%esp)
> >> > jmp bar
> >>
> [...]
> >
> >It's a problem with a 'general purpose' compiler that wants to
> >be "all things" to all people. If somebody made a gcc-compatible
> >compiler, tuned to the ix86 characteristics, I think we could
> >cut the extra instructions by at least 1/2, maybe more.
> >
>
> Curiosity killed the cat....
> Just tried it with gcc-3.2.
> C code:
>     extern void bar(int x, int y, int z);
>     void foo(const int a, const int b, const int c)
>     {
>         bar(a+1, b+1, c+1);
>     }
>
> - gcc -S -O0:
>     pushl %ebp
>     movl %esp, %ebp
>     subl $8, %esp
>     subl $4, %esp
>     movl 16(%ebp), %eax
>     incl %eax
>     pushl %eax
>     movl 12(%ebp), %eax
>     incl %eax
>     pushl %eax
>     movl 8(%ebp), %eax
>     incl %eax
>     pushl %eax
>     call bar
>     addl $16, %esp
>     leave
>     ret
>
> - gcc -S -O1:
>     pushl %ebp
>     movl %esp, %ebp
>     subl $12, %esp
>     movl 16(%ebp), %eax
>     incl %eax
>     pushl %eax
>     movl 12(%ebp), %eax
>     incl %eax
>     pushl %eax
>     movl 8(%ebp), %eax
>     incl %eax
>     pushl %eax
>     call bar
>     addl $16, %esp
>     movl %ebp, %esp
>     popl %ebp
>     ret
>
> - gcc -S -O2:
>     movl 12(%esp), %eax
>     incl %eax
>     movl %eax, 12(%esp)
>     movl 8(%esp), %eax
>     incl %eax
>     movl %eax, 8(%esp)
>     movl 4(%esp), %eax
>     incl %eax
>     movl %eax, 4(%esp)
>     jmp bar
>
> - gcc -S -O2 -march=[i686,pentium2,pentium3]:
>     incl 4(%esp)
>     movl 8(%esp), %eax
>     incl %eax
>     movl %eax, 8(%esp)
>     movl 12(%esp), %eax
>     incl %eax
>     movl %eax, 12(%esp)
>     jmp bar
>
> - gcc -S -O2 -march=pentium4:
>     movl 8(%esp), %eax
>     addl $1, 4(%esp)
>     addl $1, %eax
>     movl %eax, 8(%esp)
>     movl 12(%esp), %eax
>     addl $1, %eax
>     movl %eax, 12(%esp)
>     jmp bar
>
> --
> J.A. Magallon <jamagallon@able.es>      \ Software is like sex:
> werewolf.able.es                         \ It's better when it's free
> Mandrake Linux release 9.0 (Cooker) for i586
> Linux 2.4.20-pre7-jam0 (gcc 3.2 (Mandrake Linux 9.0 3.2-1mdk))
>
Notice that it always gets some value from memory, modifies it,
then writes it back. Adding 1 to %eax is plain dumb. Those instructions
have to be fetched! Any instruction that's longer than the constant
long-word in that instruction should be reviewed.
Also that 1 is 4 bytes long. It has a single-byte operand. That means
the next instruction fetch will be at an odd address if it started
on even because that sequence is 5 bytes in length.
.if 0
You can assemble this directly .....
You know there are continuous complaints about ix86 processors being
"register starved", but somehow the 'C' compilers often don't use
the capabilities that are available with the processors. The following
is some 'code' that will assemble. It doesn't do anything useful, but
shows some addressing capability that is often ignored.
.endif
foo: .long 0
bar: incl (foo)                 # Bump the value of foo directly
     addl %eax,(foo)            # Add eax to value in foo
     addl $0x10,(foo)           # Add constant to value in foo
     addl (foo),%eax            # Add value in foo to eax
     pushl (foo)                # Put value in foo onto stack
     popl (foo)                 # Pop value on stack into foo
     movl %eax, foo(%ebx)       # Put eax value into memory at foo + ebx
     incb (foo)                 # This is atomic, no lock required
     movl 14(%esp, %ebx), %eax  # Get value from stack at offset
                                # ESP + EBX (good for local arrays)
.if 0
Most of the gcc code that deals with memory operands gets a value
from memory, modifies it, then writes it back. This is a "throw-back"
from processors that only have load and store operations. The ix86
processors can directly modify a single bit, anywhere in memory
without having to put it into a register.
Of course, what the hardware physically does may be quite another
thing altogether. But I suggest that the CPU/Hardware combination
is more capable of doing the right thing in executing the binary than
any compiler that forces a load into a register, modification of
register contents, then a write back to memory.
Timing tests with rdtsc show many cycles are often wasted with these
forced load and store operations.
.endif
Cheers,
Dick Johnson
* Re: Syscall changes registers beyond %eax, on linux-i386
From: Richard Henderson @ 2002-09-20 17:16 UTC
To: Richard B. Johnson
Cc: J.A. Magallon, Brian Gerst, Petr Vandrovec, dvorak, linux-kernel
On Fri, Sep 20, 2002 at 08:27:32AM -0400, Richard B. Johnson wrote:
> Adding 1 to %eax is plain dumb.
No it isn't. P4 has a partial register stall on the flags
register when using incl. You'll notice that we *do* use
incl except when optimizing for P4.
> Also that 1 is 4 bytes long.
No it isn't. There is an 8-bit signed immediate form.
As for the rest of the memory operand rant, the problem is
not that gcc won't try to use memory operands, it's that
the bit of code that's supposed to put these memory operands
back together is like 10 years old and hasn't been taught
about the memory aliasing subsystem. So any time it sees a
memory load cross a memory store, it gives up.
Perhaps I'll have this fixed for gcc 3.4.
r~
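[Archive note] A small table of the instruction encodings behind the "8-bit signed immediate form" remark, as a sketch: the byte values are the standard 32-bit-mode IA-32 encodings, while the struct and array names (encoding, add_one_encodings) are made up for illustration. The 5-byte sequence is the full 32-bit immediate form; the 3-byte sequence is the sign-extended 8-bit immediate form an assembler would normally pick for addl $1, %eax.

```c
#include <stddef.h>
#include <stdio.h>

/* Machine-code bytes for three ways of adding 1 to %eax in 32-bit mode. */
static const unsigned char enc_incl[]       = { 0x40 };                          /* incl %eax                      */
static const unsigned char enc_addl_imm8[]  = { 0x83, 0xC0, 0x01 };              /* addl $1, %eax (imm8, sign-ext) */
static const unsigned char enc_addl_imm32[] = { 0x05, 0x01, 0x00, 0x00, 0x00 };  /* addl $1, %eax (imm32)          */

/* Hypothetical record type pairing a description with its bytes. */
struct encoding {
    const char *text;
    const unsigned char *bytes;
    size_t len;
};

static const struct encoding add_one_encodings[] = {
    { "incl %eax",             enc_incl,       sizeof enc_incl },
    { "addl $1, %eax (imm8)",  enc_addl_imm8,  sizeof enc_addl_imm8 },
    { "addl $1, %eax (imm32)", enc_addl_imm32, sizeof enc_addl_imm32 },
};

/* Print each form with its length and bytes in hex. */
static void print_add_one_encodings(FILE *out)
{
    size_t i, j;

    for (i = 0; i < sizeof add_one_encodings / sizeof add_one_encodings[0]; i++) {
        fprintf(out, "%-24s %zu byte(s):", add_one_encodings[i].text,
                add_one_encodings[i].len);
        for (j = 0; j < add_one_encodings[i].len; j++)
            fprintf(out, " %02x", add_one_encodings[i].bytes[j]);
        fputc('\n', out);
    }
}
```

Calling print_add_one_encodings(stdout) lists the three forms side by side; which one a given compiler or assembler emits is for the listing above to show, not this table.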
* Re: Syscall changes registers beyond %eax, on linux-i386
From: Pavel Machek @ 2002-09-22 1:33 UTC
To: Richard B. Johnson
Cc: Richard Henderson, Brian Gerst, Petr Vandrovec, dvorak, linux-kernel
Hi!
> It's a problem with a 'general purpose' compiler that wants to
> be "all things" to all people. If somebody made a gcc-compatible
> compiler, tuned to the ix86 characteristics, I think we could
> cut the extra instructions by at least 1/2, maybe more.
Remember pgcc?
And btw cutting instructions by 1/2 might look nice but unless you can
keep it as fast as it was, it's useless.
Pavel
--
Philips Velo 1: 1"x4"x8", 300gram, 60, 12MB, 40bogomips, linux, mutt,
details at http://atrey.karlin.mff.cuni.cz/~pavel/velo/index.html.
* Re: Syscall changes registers beyond %eax, on linux-i386
From: Richard B. Johnson @ 2002-09-23 13:11 UTC
To: Pavel Machek
Cc: Richard Henderson, Brian Gerst, Petr Vandrovec, dvorak, linux-kernel
On Sun, 22 Sep 2002, Pavel Machek wrote:
> Hi!
>
> > It's a problem with a 'general purpose' compiler that wants to
> > be "all things" to all people. If somebody made a gcc-compatible
> > compiler, tuned to the ix86 characteristics, I think we could
> > cut the extra instructions by at least 1/2, maybe more.
>
> Remember pgcc?
>
> And btw cutting instructions by 1/2 might look nice but unless you can
> keep it as fast as it was, it's useless.
> Pavel
> --
Yes, but to see the effect of cutting down the instruction length, you
need to make benchmarks that emulate running 'forever'. Many benchmarks
access some memory over-and-over again in a loop. This does not
exercise the need to refill prefetch so the benchmarks ignore the
advantages obtained by reducing the amount of instructions needed
to be fetched from memory.
Cheers,
Dick Johnson
* Re: Syscall changes registers beyond %eax, on linux-i386
From: Pavel Machek @ 2002-09-23 18:31 UTC
To: Richard B. Johnson
Cc: Pavel Machek, Richard Henderson, Brian Gerst, Petr Vandrovec, dvorak, linux-kernel
Hi!
> > > It's a problem with a 'general purpose' compiler that wants to
> > > be "all things" to all people. If somebody made a gcc-compatible
> > > compiler, tuned to the ix86 characteristics, I think we could
> > > cut the extra instructions by at least 1/2, maybe more.
> >
> > Remember pgcc?
> >
> > And btw cutting instructions by 1/2 might look nice but unless you can
> > keep it as fast as it was, it's useless.
> > Pavel
> > --
> Yes, but to see the effect of cutting down the instruction length, you
> need to make benchmarks that emulate running 'forever'. Many bench-
Specs contain things like perl and gcc, those are I believe far too
big to be put entirely into cache and emulate "Real Life" quite well...
Pavel
--
Casualties in World Trade Center: ~3k dead inside the building,
cryptography in U.S.A. and free speech in Czech Republic.
* Re: Syscall changes registers beyond %eax, on linux-i386
From: Richard B. Johnson @ 2002-09-19 19:18 UTC
To: Brian Gerst; +Cc: Richard Henderson, Petr Vandrovec, dvorak, linux-kernel
On Thu, 19 Sep 2002, Brian Gerst wrote:
> Richard Henderson wrote:
> > On Thu, Sep 19, 2002 at 02:04:43PM -0400, Brian Gerst wrote:
> >
> >>Now that I've thought about it more, I think the best solution is to go
> >>through all the syscalls (a big job, I know), and declare the parameters
> >>as const, so that gcc knows it can't modify them, and will throw a
> >>warning if we try.
> >
> > The parameter area belongs to the callee, and it may *always* be modified.
> >
> > r~
> >
>
> The parameters can not be modified if they are declared const though,
> that's my point.
Yes. A temporary declaration change to compile the kernel and see
where it complains.
Cheers,
Dick Johnson
* Re: Syscall changes registers beyond %eax, on linux-i386
From: Daniel Jacobowitz @ 2002-09-19 19:24 UTC
To: Brian Gerst; +Cc: Petr Vandrovec, Richard B. Johnson, dvorak, linux-kernel
On Thu, Sep 19, 2002 at 02:04:43PM -0400, Brian Gerst wrote:
> Petr Vandrovec wrote:
> >On 19 Sep 02 at 13:22, Richard B. Johnson wrote:
> >
> >>>>>A short snippet of sys_poll, with irrelevant data removed.
> >>>>>
> >>>>>sys_poll(struct pollfd *ufds, .. , ..) {
> >>>>> ...
> >>>>> ufds++;
> >>>>> ...
> >>>>
> >>Well which one? Here is an ioctl(). It certainly modifies one
> >>of its parameter values.
> >
> >poll(), as was already noted. Program below should
> >print same value for B= and F=, but it reports f + 8*c instead
> >(where c = number of file descriptors passed to poll).
> >
> >And you must call it from assembly, as your calls to getpid() or
> >ioctl() (or poll()) are wrapped in libc - and glibc's code begins with
> >push %ebx because %ebx is used by -fPIC code.
> >
> >It is questionable whether we should try to not modify parameters
> >passed into functions. It is definitely nice behavior, but I think
> >that we should only guarantee that syscalls do not modify unused
> >registers.
> > Petr Vandrovec
> > vandrove@vc.cvut.cz
>
> Now that I've thought about it more, I think the best solution is to go
> through all the syscalls (a big job, I know), and declare the parameters
> as const, so that gcc knows it can't modify them, and will throw a
> warning if we try.
That's not going to help. As Richard said, the memory in question
belongs to the called function. GCC knows this. It can freely modify
it. The fact that the value of the parameter is const is a
language-level, semantic thing. It doesn't say anything about the
const-ness of that memory. Only the ABI does.
--
Daniel Jacobowitz
MontaVista Software                         Debian GNU/Linux Developer
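[Archive note] One way to see that const on a by-value parameter is a source-level property only: in standard C, top-level qualifiers on parameters are ignored when function types are compared, so the same function can be declared without const and defined with it. The names sum3 and sum3_ptr below are made up for illustration; nothing here is taken from the kernel.

```c
/* Prototype as a caller sees it: no const anywhere. */
int sum3(int a, int b, int c);

/* Definition with const parameters. This is the same function type as the
 * prototype above; const only restricts what this body may write. */
int sum3(const int a, const int b, const int c)
{
    /* a += b;   would be diagnosed here: a is read-only inside this body */
    return a + b + c;
}

/* A pointer to the unqualified function type accepts it as well. */
int (*sum3_ptr)(int, int, int) = sum3;
```

Since callers cannot tell the two declarations apart, the qualifier cannot carry any promise about what happens to the argument slots themselves, which is the point made above about the ABI.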
* Re: Syscall changes registers beyond %eax, on linux-i386
From: Mikael Pettersson @ 2002-09-19 20:25 UTC
To: Daniel Jacobowitz
Cc: Brian Gerst, Petr Vandrovec, Richard B. Johnson, dvorak, linux-kernel
Daniel Jacobowitz writes:
 > That's not going to help. As Richard said, the memory in question
 > belongs to the called function. GCC knows this. It can freely modify
 > it. The fact that the value of the parameter is const is a
 > language-level, semantic thing. It doesn't say anything about the
 > const-ness of that memory. Only the ABI does.
Does Linux/x86 even have a proper ABI document? I've never seen one.
The closest I've seen would be the SVR4 i386 psABI, but it
deliberately doesn't define the raw syscall interface, only the
each-syscall-is-a-C-function one implemented by the C library,
and that interface doesn't suffer from the current issue.
IOW, the kernel may not be at fault if user-space code invokes int
$0x80 directly and then sees clobbered registers.
/Mikael
* Re: Syscall changes registers beyond %eax, on linux-i386
From: george anzinger @ 2002-09-20 8:32 UTC
To: Mikael Pettersson
Cc: Daniel Jacobowitz, Brian Gerst, Petr Vandrovec, Richard B. Johnson, dvorak, linux-kernel
Mikael Pettersson wrote:
>
> Daniel Jacobowitz writes:
> > That's not going to help. As Richard said, the memory in question
> > belongs to the called function. GCC knows this. It can freely modify
> > it. The fact that the value of the parameter is const is a
> > language-level, semantic thing. It doesn't say anything about the
> > const-ness of that memory. Only the ABI does.
>
> Does Linux/x86 even have a proper ABI document? I've never seen one.
> The closest I've seen would be the SVR4 i386 psABI, but it
> deliberately doesn't define the raw syscall interface, only the
> each-syscall-is-a-C-function one implemented by the C library,
> and that interface doesn't suffer from the current issue.
>
> IOW, the kernel may not be at fault if user-space code invokes int
> $0x80 directly and then sees clobbered registers.
Ah, that, indeed is the issue. As far as C is concerned,
the call is NOT a call, but a bit of asm. If the asm is
correctly written the problem goes away, not because the
register is not modified, but because C is on notice that it
MIGHT be modified and thus not to count on it.
As a practical matter, ebx is used to pass arg1 to the
kernel so it must be changed by the asm code, the further
listing of it beyond the third ":" in the asm inline, will
cause the compiler to not rely on it being further
modified. The same is true of all the registers used to
pass parameters. (These are: arg1 ebx, arg2 ecx, arg3 edx,
arg4 esi, arg5 edi, and arg6 ebp.)
So, is there a problem? Yes, neither the call stub macros
in asm/unistd.h nor those in glibc bother to list the used
registers beyond the third ":". And, if I understand this
right, the glibc code to save ebx in another register
suffers from the false assumption that THAT register can be
clobbered, but this is only true if C sees the code as a
function, not an inline asm, but most system calls in glibc
are coded as inline asm, not separate functions (not to be
confused with the C inline, which is a separate function).
At least that is how I see it. Comments?
-g
>
> /Mikael
--
George Anzinger george@mvista.com
High-res-timers: http://sourceforge.net/projects/high-res-timers/
Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml
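[Archive note] A hedged sketch of a stub that puts the compiler "on notice", under these assumptions: GCC-style extended asm, and a compiler recent enough to accept %ebx as an asm operand when building position-independent code. GCC does not accept a clobber that overlaps an input or output operand, so instead of listing the argument registers after the third ":", the sketch marks them as read-write operands with "+". The function name raw_poll is made up for illustration; the non-i386 branch simply calls libc poll() so the file still builds elsewhere.

```c
#include <poll.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical raw poll stub. On i386 the raw call returns a negative errno
 * value on failure instead of setting errno. */
static long raw_poll(struct pollfd *fds, unsigned long nfds, long timeout_ms)
{
#if defined(__i386__)
    long ret;

    /* "+b", "+c", "+d": the kernel may hand these registers back changed,
     * so they are read-write operands rather than plain inputs. */
    __asm__ __volatile__("int $0x80"
        : "=a"(ret), "+b"(fds), "+c"(nfds), "+d"(timeout_ms)
        : "0"((long)SYS_poll)
        : "memory");
    return ret;
#else
    /* Portable fallback through the C library wrapper. */
    return poll(fds, nfds, (int)timeout_ms);
#endif
}
```

After the asm statement the compiler treats fds, nfds and timeout_ms as having unknown values, so it reloads anything it still needs rather than trusting the registers.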
* Re: Syscall changes registers beyond %eax, on linux-i386
From: Richard Henderson @ 2002-09-21 6:19 UTC
To: george anzinger
Cc: Mikael Pettersson, Daniel Jacobowitz, Brian Gerst, Petr Vandrovec, Richard B. Johnson, dvorak, linux-kernel
On Fri, Sep 20, 2002 at 01:32:05AM -0700, george anzinger wrote:
> So, is there a problem? Yes, neither the call stub macros
> in asm/unistd.h nor those in glibc bother to list the used
> registers beyond the third ":".
No, this is not the real problem. The real problem is that if
the program receives a signal during a system call, the kernel
will return all the way up to entry.S, deliver the signal and
then restart the syscall.
Except the syscall will restart with the corrupted registers.
Hilarity ensues.
r~
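[Archive note] A toy user-space model of the restart hazard described above. It is not kernel code: struct saved_regs, the two scan functions and end_address_after_restart are all made up for illustration, and the 8-byte entry size is the struct pollfd size from earlier in the thread. The "in place" handler advances the saved argument itself, so a second run from the same save area starts from the wrong address; the "copy" handler works on a local and restarts cleanly.

```c
/* Hypothetical stand-in for the saved user registers of one system call. */
struct saved_regs {
    unsigned long ebx;   /* arg1: address of the first entry */
    unsigned long ecx;   /* arg2: number of entries          */
};

#define ENTRY_SIZE 8UL   /* assumed size of one entry in bytes */

typedef unsigned long (*scan_fn)(struct saved_regs *);

/* Buggy pattern: uses the saved argument slot as a loop cursor. */
static unsigned long scan_in_place(struct saved_regs *regs)
{
    unsigned long n;

    for (n = 0; n < regs->ecx; n++)
        regs->ebx += ENTRY_SIZE;
    return regs->ebx;            /* address just past the last entry visited */
}

/* Safe pattern: copies the argument and leaves the saved slot alone. */
static unsigned long scan_with_copy(struct saved_regs *regs)
{
    unsigned long cursor = regs->ebx;
    unsigned long n;

    for (n = 0; n < regs->ecx; n++)
        cursor += ENTRY_SIZE;
    return cursor;
}

/* Models "interrupted by a signal, then restarted": the handler runs twice
 * against the same save area and the second result is reported. */
static unsigned long end_address_after_restart(scan_fn scan,
                                               unsigned long base,
                                               unsigned long count)
{
    struct saved_regs regs = { base, count };

    (void)scan(&regs);           /* first attempt, interrupted */
    return scan(&regs);          /* restart from the saved registers */
}
```

With base 0x1000 and 5 entries, the copy version ends at 0x1000 + 40 after the restart, while the in-place version ends at 0x1000 + 80, having walked off the end of the caller's table.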
* Re: Syscall changes registers beyond %eax, on linux-i386
From: george anzinger @ 2002-09-21 8:09 UTC
To: Richard Henderson
Cc: Mikael Pettersson, Daniel Jacobowitz, Brian Gerst, Petr Vandrovec, Richard B. Johnson, dvorak, linux-kernel
Richard Henderson wrote:
>
> On Fri, Sep 20, 2002 at 01:32:05AM -0700, george anzinger wrote:
> > So, is there a problem? Yes, neither the call stub macros
> > in asm/unistd.h nor those in glibc bother to list the used
> > registers beyond the third ":".
>
> No, this is not the real problem. The real problem is that if
> the program receives a signal during a system call, the kernel
> will return all the way up to entry.S, deliver the signal and
> then restart the syscall.
>
> Except the syscall will restart with the corrupted registers.
>
> Hilarity ensues.
>
I submit that BOTH of these are problems. And only the
kernel can fix the latter.
-g
--
George Anzinger george@mvista.com
* Re: Syscall changes registers beyond %eax, on linux-i386
2002-09-21 8:09 ` george anzinger
@ 2002-09-21 15:08 ` Richard Henderson
2002-09-24 18:02 ` CHECKER bate: " george anzinger
1 sibling, 0 replies; 30+ messages in thread
From: Richard Henderson @ 2002-09-21 15:08 UTC (permalink / raw)
To: george anzinger
Cc: Mikael Pettersson, Daniel Jacobowitz, Brian Gerst, Petr Vandrovec, Richard B. Johnson, dvorak, linux-kernel

On Sat, Sep 21, 2002 at 01:09:12AM -0700, george anzinger wrote:
> > Except the syscall will restart with the corrupted registers.
> >
> > Hilarity ensues.
> >
> I submit that BOTH of these are problems. And only the
> kernel can fix the latter.

If the latter is fixed, so is the former.

r~

^ permalink raw reply [flat|nested] 30+ messages in thread
* CHECKER bate: Syscall changes registers beyond %eax, on linux-i386
2002-09-21 8:09 ` george anzinger
2002-09-21 15:08 ` Richard Henderson
@ 2002-09-24 18:02 ` george anzinger
1 sibling, 0 replies; 30+ messages in thread
From: george anzinger @ 2002-09-24 18:02 UTC (permalink / raw)
To: Richard Henderson, Mikael Pettersson, Daniel Jacobowitz, Brian Gerst, Petr Vandrovec, Richard B. Johnson, dvorak, linux-kernel

george anzinger wrote:
>
> Richard Henderson wrote:
> >
> > On Fri, Sep 20, 2002 at 01:32:05AM -0700, george anzinger wrote:
> > > So, is there a problem? Yes, neither the call stub macros
> > > in asm/unistd.h nor those in glibc bother to list the used
> > > registers beyond the third ":".
> >
> > No, this is not the real problem. The real problem is that if
> > the program receives a signal during a system call, the kernel
> > will return all the way up to entry.S, deliver the signal and
> > then restart the syscall.
> >
> > Except the syscall will restart with the corrupted registers.
> >
> > Hilarity ensues.
> >
> I submit that BOTH of these are problems. And only the
> kernel can fix the latter.
>
Sounds like a job for the CHECKER. Should be easy to verify
that a system call does not modify its call parameters.

--
George Anzinger george@mvista.com
High-res-timers: http://sourceforge.net/projects/high-res-timers/
Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml

^ permalink raw reply [flat|nested] 30+ messages in thread
* Syscall changes registers beyond %eax, on linux-i386
@ 2002-09-19 14:45 dvorak
2002-09-19 16:11 ` Richard B. Johnson
0 siblings, 1 reply; 30+ messages in thread
From: dvorak @ 2002-09-19 14:45 UTC (permalink / raw)
To: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2095 bytes --]

Hi,

recently I came across a situation where on linux-i386 not only %eax was
altered after a syscall but also %ebx. I tracked this problem down to
gcc re-using a variable passed to a function.

This was found on a debian system with a 2.4.17 kernel compiled with gcc
2.95.2 and verified on another system, kernel 2.4.18 compiled with 2.95.4.
Attached is a small program to test for this 'bug'.

A syscall gets its data off the stack; the stack looks like:

saved(edx)
saved(ecx)
saved(ebx)
return_address (somewhere in entry.S)

when the syscall is called.

The registers got there through use of 'SAVE_ALL'.

After the syscall returns these registers are restored using RESTORE_ALL
and execution is transferred to userland again.

A short snippet of sys_poll, with irrelevant data removed.

sys_poll(struct pollfd *ufds, .. , ..) {
    ...
    ufds++;
    ...
}

It seems that gcc in certain cases optimizes in such a way that it changes
the variable ufds as placed on the stack directly. This results in saved(ebx)
being overwritten and thus in a changed %ebx on return from the system call.

I don't know if this is considered a bug, and if it is, whose.
If it's not a bug it means low-level userland programs need to be rewritten
to store all registers on a syscall and restore them on return.

It shouldn't be a bug in gcc, since the C standard doesn't talk about how to
pass variables and stuff. So it seems like a kernel(-gcc-interaction) bug.

To solve this issue 2 solutions spring to mind:
1) Add a flag to gcc to tell it that it shouldn't do this optimization; this
   won't work with the gcc's already out there.
2) When calling a syscall, explicitly push all variables an extra time. Since
   the code in entry.S doesn't know the number of arguments to a syscall, it
   needs to push all theoretical 6 parameters every time, a not so nice
   overhead.

I hope someone can shed some light on this issue. I am not myself reading the
linux-kernel mailing list, and would like to be cc'd if possible (I'll also
check the archives so it's not 100% needed).

Thanks in advance,

Dvorak

[-- Attachment #2: reg-bug.c --]
[-- Type: text/plain, Size: 1241 bytes --]

/*
 * usage is easy, though not very friendly:
 *   gcc reg-bug.c
 *   ./a.out | od -tx4
 *   <ENTER>
 * if the values outputted by hexdump are different the 'bug' is present
 * else the bug is not present
 *
 * on a system without the bug:
 *   dvorak$ dmesg | head -1
 *   Linux version 2.2.21 (kernel@debian) (gcc version 2.95.4 20011002 (Debian prerelease)) #6 Sat Sep 7 22:48:42 CEST 2002
 *   dvorak$ gcc reg-bug.c
 *   dvorak$ ./a.out | od -tx4
 *   0000000 bff7de6c bff7de6c
 *
 * on a 'buggy' system:
 *   (m4xx) dmesg | head -1
 *   (m4xx) Linux version 2.4.18 (maxx@meuuh) (gcc version 2.95.4 20011002 (Debian prerelease)) #2 Mon Jul 29 17:01:30 CEST 2002
 *   (m4xx) $ gcc reg-bug.c
 *   (m4xx) $ ./a.out | od -tx4
 *   (m4xx) 0000000 bffffdcc bffffdbc
 */
int main(void)
{
    /* One string literal per instruction, so that compilers which reject
     * multi-line string literals still accept the asm block. */
    __asm__(
        "pushl $0x00010001\n\t"
        "pushl $0x0\n\t"
        "pushl $0x00010001\n\t"
        "pushl $0x1\n\t"
        "movl %esp, %ebx\n\t"
        "pushl %ebx\n\t"
        "movl $0x2, %ecx\n\t"
        "movl $0xa8, %eax\n\t"
        "movl $(-1), %edx\n\t"
        "int $0x80\n\t"
        "pushl %ebx\n\t"
        "movl %esp, %ecx\n\t"
        "movl $0x08, %edx\n\t"
        "movl $0x04, %eax\n\t"
        "movl $0x01, %ebx\n\t"
        "int $0x80\n\t"
        "movl $0x01, %eax\n\t"
        "xorl %ebx, %ebx\n\t"
        "int $0x80\n\t"
    );
}

^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: Syscall changes registers beyond %eax, on linux-i386
2002-09-19 14:45 dvorak
@ 2002-09-19 16:11 ` Richard B. Johnson
2002-09-19 17:09 ` Brian Gerst
0 siblings, 1 reply; 30+ messages in thread
From: Richard B. Johnson @ 2002-09-19 16:11 UTC (permalink / raw)
To: dvorak; +Cc: linux-kernel

On Thu, 19 Sep 2002, dvorak wrote:
> Hi,
>
> recently i came across a situation were on linux-i386 not only %eax was
> altered after a syscall but also %ebx. I tracked this problem down, to
> gcc re-using a variable passed to a function.
>
> This was found on a debian system with a 2.4.17 kernel compiled with gcc
> 2.95.2 and verified on another system, kernel 2.4.18 compiled with 2.95.4
> Attached is small program to test for this 'bug'
>
> a syscall gets his data off the stack, the stack looks like:
>
> saved(edx)
> saved(ecx)
> saved(ebx)
> return_addres (somewhere in entry.S)
>
> When the syscall is called.
>
> the register came there through use of 'SAVE_ALL'.
>
> After the syscall returns these registers are restored using RESTORE_ALL
> and execution is transferred to userland again.
>
> A short snippet of sys_poll, with irrelavant data removed.
>
> sys_poll(struct pollfd *ufds, .. , ..) {
>     ...
>     ufds++;
>     ...
> }
>
> It seems that gcc in certain cases optimizes in such a way that it changes
> the variable ufds as placed on the stack directly. Which results in saved(ebx)
> being overwritten and thus in a changed %ebx on return from the system call.
>

The 'C' compiler must make room on the stack for any local
variables except register types. If it was doing as you state, you
couldn't even execute a "hello world" program. Further, the local
variables are after the return address. It would screw up the return
address and you'd go off into hyper-space upon return.

> I don't know if this is considered a bug, and if it is, from whom.
> If it's not a bug it means low-level userland programs need to be rewritten
> to store all registers on a syscall and restore them on return.
>

No. Various 'C' implementers have standardized calling methods even
though it's not part of the 'C' standard. gcc and others assume that
a called procedure is not going to change any segments or index registers.
There are various optimization things, like "-fcaller-saves" where the
called procedure can destroy anything. You may be using something that
was wrongly compiled using that switch.

> It shouldn't be a bug in gcc, since the C-standard doesn't talk about how to
> pass variables and stuff. So it seems like a kernel(-gcc-interaction) bug.
>
> To solve this issue 2 solutions spring to mind
> 1) add a flag to gcc to tell it that it shouldn't do this optimization, this
>    won't work with the gcc's already out there.
> 2) When calling a syscall explicitly push all variable an extra time, since
>    the code in entry.S doesn't know the amount of variables to a syscall it
>    needs to push all theoretical 6 parameters every time, a not so nice
>    overhead.
>
>

There is a bug in some other code. Try this. It will show
that ebx is not being killed in a syscall. You can prove
that this code works by changing ebx to eax, which will
get destroyed and print "Broken" before exit.

#include <stdio.h>
#include <stdlib.h>   /* exit() */
#include <unistd.h>

void barf(void);
void barf()
{
    puts("Broken\n");
    exit(0);
}
int main()
{
    __asm__ __volatile__("movl $0xdeadface, %ebx\n");
    (void)getpid();
    __asm__ __volatile__("cmpl $0xdeadface, %ebx\n"
                         "jnz barf\n");
    return 0;
}

Cheers,
Dick Johnson
Penguin : Linux version 2.4.18 on an i686 machine (797.90 BogoMips).
The US military has given us many words, FUBAR, SNAFU, now ENRON.
Yes, top management were graduates of West Point and Annapolis.

^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: Syscall changes registers beyond %eax, on linux-i386
2002-09-19 16:11 ` Richard B. Johnson
@ 2002-09-19 17:09 ` Brian Gerst
2002-09-19 17:22 ` Richard B. Johnson
0 siblings, 1 reply; 30+ messages in thread
From: Brian Gerst @ 2002-09-19 17:09 UTC (permalink / raw)
To: root; +Cc: dvorak, linux-kernel

Richard B. Johnson wrote:
> On Thu, 19 Sep 2002, dvorak wrote:
>
>
>>Hi,
>>
>>recently i came across a situation were on linux-i386 not only %eax was
>>altered after a syscall but also %ebx. I tracked this problem down, to
>>gcc re-using a variable passed to a function.
>>
>>This was found on a debian system with a 2.4.17 kernel compiled with gcc
>>2.95.2 and verified on another system, kernel 2.4.18 compiled with 2.95.4
>>Attached is small program to test for this 'bug'
>>
>>a syscall gets his data off the stack, the stack looks like:
>>
>>saved(edx)
>>saved(ecx)
>>saved(ebx)
>>return_addres (somewhere in entry.S)
>>
>>When the syscall is called.
>>
>>the register came there through use of 'SAVE_ALL'.
>>
>>After the syscall returns these registers are restored using RESTORE_ALL
>>and execution is transferred to userland again.
>>
>>A short snippet of sys_poll, with irrelavant data removed.
>>
>>sys_poll(struct pollfd *ufds, .. , ..) {
>>    ...
>>    ufds++;
>>    ...
>>}
>>
>>It seems that gcc in certain cases optimizes in such a way that it changes
>>the variable ufds as placed on the stack directly. Which results in saved(ebx)
>>being overwritten and thus in a changed %ebx on return from the system call.
>>
>
>
> The 'C' compiler must make room on the stack for any local
> variables except register types. If it was doing as you state, you
> couldn't even execute a "hello world" program. Further, the local
> variables are after the return address. It would screw up the return
> address and you'd go off into hyper-space upon return.
>
>
>
>>I don't know if this is considered a bug, and if it is, from whom.
>>If it's not a bug it means low-level userland programs need to be rewritten
>>to store all registers on a syscall and restore them on return.
>>
>
>
> No. Various 'C' implementers have standardized calling methods even
> though it's not part of the 'C' standard. gcc and others assume that
> a called procedure is not going to change any segments or index registers.
> There are various optimization things, like "-fcaller-saves" where the
> called procedure can destroy anything. You may be using something that
> was wrongly compiled using that switch.
>
>
>
>>It shouldn't be a bug in gcc, since the C-standard doesn't talk about how to
>>pass variables and stuff. So it seems like a kernel(-gcc-interaction) bug.
>>
>>To solve this issue 2 solutions spring to mind
>>1) add a flag to gcc to tell it that it shouldn't do this optimization, this
>>   won't work with the gcc's already out there.
>>2) When calling a syscall explicitly push all variable an extra time, since
>>   the code in entry.S doesn't know the amount of variables to a syscall it
>>   needs to push all theoretical 6 parameters every time, a not so nice
>>   overhead.
>>
>>
>
>
> There is a bug in some other code. Try this. It will show
> that ebx is not being killed in a syscall. You can prove
> that this code works by changing ebx to eax, which will
> get destroyed and print "Broken" before exit.

The bug is only with _some_ syscalls, and getpid() is not one of them,
so your example is flawed. It happens when a syscall modifies one of
its parameter values. The solution is to assign the parameter to a
local variable before modifying it.

--
Brian Gerst

^ permalink raw reply [flat|nested] 30+ messages in thread
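The fix suggested above (copy the parameter to a local before modifying it) might look like the following at the source level. This is a minimal sketch, not kernel code: `pollfd_like` and `count_ready` are hypothetical names. Marking the parameters themselves `const`, an idea also raised elsewhere in this thread, turns an accidental `ufds++` into a compile error; whether that alone constrains what the compiler does with the argument slot is a separate question this sketch does not answer.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct pollfd. */
struct pollfd_like {
    int   fd;
    short events;
    short revents;
};

/* The parameters are const-qualified, so writing `ufds++` or `nfds--`
 * in the body would not compile. All iteration goes through `cur`,
 * a local copy that lives in this function's own frame. */
static unsigned int count_ready(const struct pollfd_like *const ufds,
                                const unsigned int nfds)
{
    const struct pollfd_like *cur = ufds;  /* local cursor, not the parameter */
    unsigned int ready = 0;

    for (unsigned int i = 0; i < nfds; i++, cur++) {
        if (cur->revents != 0)
            ready++;
    }
    return ready;
}
```

The loop has the same shape as the `ufds++` walk quoted from sys_poll, except that the increment is applied to `cur` and the original argument value is left alone.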
* Re: Syscall changes registers beyond %eax, on linux-i386
2002-09-19 17:09 ` Brian Gerst
@ 2002-09-19 17:22 ` Richard B. Johnson
2002-09-19 17:51 ` Brian Gerst
2002-09-19 17:59 ` dvorak
0 siblings, 2 replies; 30+ messages in thread
From: Richard B. Johnson @ 2002-09-19 17:22 UTC (permalink / raw)
To: Brian Gerst; +Cc: dvorak, linux-kernel

On Thu, 19 Sep 2002, Brian Gerst wrote:
> Richard B. Johnson wrote:
> > On Thu, 19 Sep 2002, dvorak wrote:
> >
> >
> >>Hi,
> >>
> >>recently i came across a situation were on linux-i386 not only %eax was
> >>altered after a syscall but also %ebx. I tracked this problem down, to
> >>gcc re-using a variable passed to a function.
> >>
> >>This was found on a debian system with a 2.4.17 kernel compiled with gcc
> >>2.95.2 and verified on another system, kernel 2.4.18 compiled with 2.95.4
> >>Attached is small program to test for this 'bug'
> >>
> >>a syscall gets his data off the stack, the stack looks like:
> >>
> >>saved(edx)
> >>saved(ecx)
> >>saved(ebx)
> >>return_addres (somewhere in entry.S)
> >>
> >>When the syscall is called.
> >>
> >>the register came there through use of 'SAVE_ALL'.
> >>
> >>After the syscall returns these registers are restored using RESTORE_ALL
> >>and execution is transferred to userland again.
> >>
> >>A short snippet of sys_poll, with irrelavant data removed.
> >>
> >>sys_poll(struct pollfd *ufds, .. , ..) {
> >>    ...
> >>    ufds++;
> >>    ...
> >>}
> >>
> >>It seems that gcc in certain cases optimizes in such a way that it changes
> >>the variable ufds as placed on the stack directly. Which results in saved(ebx)
> >>being overwritten and thus in a changed %ebx on return from the system call.
> >>
> >
> >
> > The 'C' compiler must make room on the stack for any local
> > variables except register types. If it was doing as you state, you
> > couldn't even execute a "hello world" program. Further, the local
> > variables are after the return address. It would screw up the return
> > address and you'd go off into hyper-space upon return.
> >
> >
> >
> >>I don't know if this is considered a bug, and if it is, from whom.
> >>If it's not a bug it means low-level userland programs need to be rewritten
> >>to store all registers on a syscall and restore them on return.
> >>
> >
> >
> > No. Various 'C' implementers have standardized calling methods even
> > though it's not part of the 'C' standard. gcc and others assume that
> > a called procedure is not going to change any segments or index registers.
> > There are various optimization things, like "-fcaller-saves" where the
> > called procedure can destroy anything. You may be using something that
> > was wrongly compiled using that switch.
> >
> >
> >
> >>It shouldn't be a bug in gcc, since the C-standard doesn't talk about how to
> >>pass variables and stuff. So it seems like a kernel(-gcc-interaction) bug.
> >>
> >>To solve this issue 2 solutions spring to mind
> >>1) add a flag to gcc to tell it that it shouldn't do this optimization, this
> >>   won't work with the gcc's already out there.
> >>2) When calling a syscall explicitly push all variable an extra time, since
> >>   the code in entry.S doesn't know the amount of variables to a syscall it
> >>   needs to push all theoretical 6 parameters every time, a not so nice
> >>   overhead.
> >>
> >>
> >
> >
> > There is a bug in some other code. Try this. It will show
> > that ebx is not being killed in a syscall. You can prove
> > that this code works by changing ebx to eax, which will
> > get destroyed and print "Broken" before exit.
>
> The bug is only with _some_ syscalls, and getpid() is not one of them,
> so your example is flawed. It happens when a syscall modifies one of
> it's parameter values. The solution is to assign the parameter to a
> local variable before modifying it.
>

Well which one? Here is an ioctl(). It certainly modifies one
of its parameter values.

#include <stdio.h>
#include <stdlib.h>   /* exit() */
#include <unistd.h>
#include <sys/ioctl.h>
#include <termios.h>

void barf(void);
void barf()
{
    puts("Broken\n");
    exit(0);
}
int main()
{
    struct termios t;

    __asm__ __volatile__("movl $0xdeadface, %ebx\n");
    (void)ioctl(0, TCGETS, &t);
    (void)getpid();
    __asm__ __volatile__("cmpl $0xdeadface, %ebx\n"
                         "jnz barf\n");

    return 0;
}

Until you can show the syscall that doesn't follow the correct
rules, then my example is not flawed. In fact a modified example can
be used to find any broken calls.

Cheers,
Dick Johnson
Penguin : Linux version 2.4.18 on an i686 machine (797.90 BogoMips).
The US military has given us many words, FUBAR, SNAFU, now ENRON.
Yes, top management were graduates of West Point and Annapolis.

^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: Syscall changes registers beyond %eax, on linux-i386
2002-09-19 17:22 ` Richard B. Johnson
@ 2002-09-19 17:51 ` Brian Gerst
2002-09-19 18:30 ` Richard B. Johnson
2002-09-19 17:59 ` dvorak
1 sibling, 1 reply; 30+ messages in thread
From: Brian Gerst @ 2002-09-19 17:51 UTC (permalink / raw)
To: root; +Cc: dvorak, linux-kernel

Richard B. Johnson wrote:
> On Thu, 19 Sep 2002, Brian Gerst wrote:
>>Richard B. Johnson wrote:
>>>There is a bug in some other code. Try this. It will show
>>>that ebx is not being killed in a syscall. You can prove
>>>that this code works by changing ebx to eax, which will
>>>get destroyed and print "Broken" before exit.
>>
>>The bug is only with _some_ syscalls, and getpid() is not one of them,
>>so your example is flawed. It happens when a syscall modifies one of
>>it's parameter values. The solution is to assign the parameter to a
>>local variable before modifying it.
>>
>
>
> Well which one? Here is an ioctl(). It certainly modifies one
> of its parameter values.
>
> #include <stdio.h>
> #include <unistd.h>
> #include <sys/ioctl.h>
> #include <termios.h>
>
> void barf(void);
> void barf()
> {
>     puts("Broken\n");
>     exit(0);
> }
> int main()
> {
>     struct termios t;
>
>     __asm__ __volatile__("movl $0xdeadface, %ebx\n");
>     (void)ioctl(0, TCGETS, &t);
>     (void)getpid();
>     __asm__ __volatile__("cmpl $0xdeadface, %ebx\n"
>                          "jnz barf\n");
>
>     return 0;
> }
>
>
> Until you can show the syscall that doesn't follow the correct
> rules, then my example is not flawed. In fact a modified example can
> be used to find any broken calls.

Well the original poster gave one valid example: sys_poll(). We're not
talking about it modifying userspace through a pointer. We're talking
about it taking its parameter on the kernel stack (which is really the
pt_regs structure saved from user space) and modifying it. That modified
value then gets restored to the user registers upon syscall exit.

This is what the kernel stack looks like inside a syscall (x86):
OLDSS
OLDESP
EFLAGS
CS
EIP
ORIG_EAX
ES
DS
EAX <- syscall number
EBP <- syscall arg6
EDI <- syscall arg5
ESI <- syscall arg4
EDX <- syscall arg3
ECX <- syscall arg2
EBX <- syscall arg1
(return address)
(local variables)

Everything above the return address is the pt_regs struct that gets
restored to user space. If the syscall modifies any of its args (*not
memory pointed to by the args*), they get written back to the stack in
the pt_regs area, and then get restored to userspace modified.
Understand now?

--
Brian Gerst

^ permalink raw reply [flat|nested] 30+ messages in thread
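The write-back described above can be modelled in plain user-space C. This is a sketch under stated assumptions: `saved_args`, `poll_like_handler` and `ebx_after_syscall` are hypothetical names, only the three lowest argument slots of the frame are modelled, and the 8-byte step assumes `sizeof(struct pollfd)` on i386. It is not how entry.S or sys_poll is implemented; it only shows why an in-place change to an argument slot becomes visible to user space.

```c
#include <stdint.h>

/* Hypothetical model of the argument slots at the bottom of the saved
 * register frame (the real layout is the pt_regs structure). */
struct saved_args {
    uint32_t ebx;  /* syscall arg1 */
    uint32_t ecx;  /* syscall arg2 */
    uint32_t edx;  /* syscall arg3 */
};

enum { POLLFD_SIZE = 8 };  /* assumed sizeof(struct pollfd) on i386 */

/* Models a handler whose first parameter occupies the saved ebx slot and
 * is advanced in place, as `ufds++` may be compiled. */
static void poll_like_handler(struct saved_args *frame)
{
    for (uint32_t i = 0; i < frame->ecx; i++)
        frame->ebx += POLLFD_SIZE;  /* written back to saved(ebx) */
}

/* Models the entry path: save the registers, run the handler, restore.
 * Returns the value user space would find in %ebx afterwards. */
static uint32_t ebx_after_syscall(uint32_t ebx, uint32_t ecx, uint32_t edx)
{
    struct saved_args frame = { ebx, ecx, edx };  /* SAVE_ALL */
    poll_like_handler(&frame);
    return frame.ebx;                             /* RESTORE_ALL */
}
```

With the numbers from the original test program, `ebx_after_syscall(0xbffffdbc, 2, 0xffffffff)` yields 0xbffffdcc, the same 16-byte difference (8 bytes times 2 descriptors) that the 'buggy' system printed.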
* Re: Syscall changes registers beyond %eax, on linux-i386
2002-09-19 17:51 ` Brian Gerst
@ 2002-09-19 18:30 ` Richard B. Johnson
0 siblings, 0 replies; 30+ messages in thread
From: Richard B. Johnson @ 2002-09-19 18:30 UTC (permalink / raw)
To: Brian Gerst; +Cc: dvorak, linux-kernel

On Thu, 19 Sep 2002, Brian Gerst wrote:
> Richard B. Johnson wrote:
> > On Thu, 19 Sep 2002, Brian Gerst wrote:
> >>Richard B. Johnson wrote:
> >>>There is a bug in some other code. Try this. It will show
> >>>that ebx is not being killed in a syscall. You can prove
> >>>that this code works by changing ebx to eax, which will
> >>>get destroyed and print "Broken" before exit.
> >>
> >>The bug is only with _some_ syscalls, and getpid() is not one of them,
> >>so your example is flawed. It happens when a syscall modifies one of
> >>it's parameter values. The solution is to assign the parameter to a
> >>local variable before modifying it.
> >>
> >
> >
> > Well which one? Here is an ioctl(). It certainly modifies one
> > of its parameter values.
> >
> > #include <stdio.h>
> > #include <unistd.h>
> > #include <sys/ioctl.h>
> > #include <termios.h>
> >
> > void barf(void);
> > void barf()
> > {
> >     puts("Broken\n");
> >     exit(0);
> > }
> > int main()
> > {
> >     struct termios t;
> >
> >     __asm__ __volatile__("movl $0xdeadface, %ebx\n");
> >     (void)ioctl(0, TCGETS, &t);
> >     (void)getpid();
> >     __asm__ __volatile__("cmpl $0xdeadface, %ebx\n"
> >                          "jnz barf\n");
> >
> >     return 0;
> > }
> >
> >
> > Until you can show the syscall that doesn't follow the correct
> > rules, then my example is not flawed. In fact a modified example can
> > be used to find any broken calls.
>
> Well the original poster gave one valid example: sys_poll(). We're not
> talking about it modifying userspace though a pointer. We're talking
> about it taking it's parameter on the kernel stack (which is really the
> pt_regs structure saved from user space) and modifying it. Which then
> gets restored to the user registers upon syscall exit.
>
> This is how the kernel stack looks like inside a syscall (x86):
> OLDSS
> OLDESP
> EFLAGS
> CS
> EIP
> ORIG_EAX
> ES
> DS
> EAX <- syscall number
> EBP <- syscall arg6
> EDI <- syscall arg5
> ESI <- syscall arg4
> EDX <- syscall arg3
> ECX <- syscall arg2
> EBX <- syscall arg1
> (return address)
> (local variables)
>
> Everything above the return address is the pt_regs struct that gets
> restored to user space. If the syscall modifies any of its args (*not
> memory pointed to by the args*), they get written back to the stack in
> the pt_regs area, and then get restored to userspace modified.
> Understand now?
>

Maybe. So, if the 'C' runtime library puts 0xdeadfeed into the
ebx register and executes a syscall, upon return from the syscall,
this value is no longer 0xdeadfeed?

If this is true, then is the kernel supposed to save the values
of registers modified by user-code, before calling the function?
I expect that the 'C' runtime library expects the index registers
to be preserved and EBX is an index register.

Cheers,
Dick Johnson
Penguin : Linux version 2.4.18 on an i686 machine (797.90 BogoMips).
The US military has given us many words, FUBAR, SNAFU, now ENRON.
Yes, top management were graduates of West Point and Annapolis.

^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: Syscall changes registers beyond %eax, on linux-i386
2002-09-19 17:22 ` Richard B. Johnson
2002-09-19 17:51 ` Brian Gerst
@ 2002-09-19 17:59 ` dvorak
2002-09-19 18:32 ` Richard B. Johnson
1 sibling, 1 reply; 30+ messages in thread
From: dvorak @ 2002-09-19 17:59 UTC (permalink / raw)
To: Richard B. Johnson; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 3873 bytes --]

On Thu, Sep 19, 2002 at 01:22:35PM -0400, Richard B. Johnson wrote:
> On Thu, 19 Sep 2002, Brian Gerst wrote:
>
> > Richard B. Johnson wrote:
> > > On Thu, 19 Sep 2002, dvorak wrote:
> > >
> > >
> > >>Hi,
> > >>
> > >>recently i came across a situation were on linux-i386 not only %eax was
> > >>altered after a syscall but also %ebx. I tracked this problem down, to
> > >>gcc re-using a variable passed to a function.
> > >>
> > >>This was found on a debian system with a 2.4.17 kernel compiled with gcc
> > >>2.95.2 and verified on another system, kernel 2.4.18 compiled with 2.95.4
> > >>Attached is small program to test for this 'bug'
> > >>
<SNIP part of the explanation>
> > >>It seems that gcc in certain cases optimizes in such a way that it changes
> > >>the variable ufds as placed on the stack directly. Which results in saved(ebx)
> > >>being overwritten and thus in a changed %ebx on return from the system call.
> > >>
> > >
> > >
> > > The 'C' compiler must make room on the stack for any local
> > > variables except register types. If it was doing as you state, you
> > > couldn't even execute a "hello world" program. Further, the local
> > > variables are after the return address. It would screw up the return
> > > address and you'd go off into hyper-space upon return.

The problem is that it uses one of the _arguments_ passed to the function.
That argument gets modified; normally this happens on a copy, but there
is no guarantee that it doesn't modify the original argument as
put on the stack by the calling function.

> > > No. Various 'C' implementers have standardized calling methods even
> > > though it's not part of the 'C' standard. gcc and others assume that
> > > a called procedure is not going to change any segments or index registers.
> > > There are various optimization things, like "-fcaller-saves" where the
> > > called procedure can destroy anything. You may be using something that
> > > was wrongly compiled using that switch.

This is not what happens here. What happens is that one of the _arguments_
placed on the stack is being modified. Normally a calling function discards
these values after use (addl $0x10, %esp or similar), but in this case they
are reused (in the RESTORE_ALL call).

> >
> > The bug is only with _some_ syscalls, and getpid() is not one of them,
> > so your example is flawed. It happens when a syscall modifies one of
> > it's parameter values. The solution is to assign the parameter to a
> > local variable before modifying it.
> >

and only with _some_ compiler + kernel combinations.

> int main()
> {
>     struct termios t;
>
>     __asm__ __volatile__("movl $0xdeadface, %ebx\n");
>     (void)ioctl(0, TCGETS, &t);
>     (void)getpid();
>     __asm__ __volatile__("cmpl $0xdeadface, %ebx\n"
>                          "jnz barf\n");
>
>     return 0;
> }
> Until you can show the syscall that doesn't follow the correct
> rules, then my example is not flawed. In fact a modified example can
> be used to find any broken calls.

I put some assembler code in my original post that uses the sys_poll
syscall, of which I _know_ it modifies one of its arguments; to be more
specific, its first argument, which is %ebx on passing in:

asmlinkage int sys_poll(struct pollfd * ufds, unsigned int nfds, long timeout)
....
for(i=0; i < (int)nfds; i++, ufds++, fds1++) {
....

and in fact we saw that the change in %ebx is proportional to the nfds as
passed to sys_poll.

Now, however, for sys_ioctl: its first argument, fd (%ebx on passing), is
never modified in the code; nowhere is there an fd++ or similar, so again
this 'example' of yours is flawed.

gtx.
dvorak

P.S. I think my original was quite clear and INCLUDED example code that can
easily be checked by someone who reads asm. I attach an extra copy which
explains all the asm in there for easier reference.

[-- Attachment #2: reg-bug.c --]
[-- Type: text/plain, Size: 2476 bytes --]

/*
 * usage is easy, though not very friendly:
 *   gcc reg-bug.c
 *   ./a.out | od -tx4
 *   <ENTER>
 * if the values outputted by hexdump are different the 'bug' is present
 * else the bug is not present
 *
 * on a system without the bug:
 *   dvorak$ dmesg | head -1
 *   Linux version 2.2.21 (kernel@debian) (gcc version 2.95.4 20011002 (Debian prerelease)) #6 Sat Sep 7 22:48:42 CEST 2002
 *   dvorak$ gcc reg-bug.c
 *   dvorak$ ./a.out | od -tx4
 *   0000000 bff7de6c bff7de6c
 *
 * on a 'buggy' system:
 *   (m4xx) dmesg | head -1
 *   (m4xx) Linux version 2.4.18 (maxx@meuuh) (gcc version 2.95.4 20011002 (Debian prerelease)) #2 Mon Jul 29 17:01:30 CEST 2002
 *   (m4xx) $ gcc reg-bug.c
 *   (m4xx) $ ./a.out | od -tx4
 *   (m4xx) 0000000 bffffdcc bffffdbc
 */
int main(void)
{
    /* One string literal per instruction, with the original asm comments
     * moved into C comments. */
    __asm__(
        "pushl $0x00010001\n\t"  /* this is events and revents */
        "pushl $0x0\n\t"         /* fd 0 */
        "pushl $0x00010001\n\t"  /* again events and revents */
        "pushl $0x1\n\t"         /* fd 1 */
        /* %ebx now contains a pointer to the pollfd structures set up
         * above; only pushes occur below, so they stay on the stack */
        "movl %esp, %ebx\n\t"
        "pushl %ebx\n\t"         /* save %ebx to compare it later */
        "movl $0x2, %ecx\n\t"    /* 2 = number of fd's */
        "movl $0xa8, %eax\n\t"   /* 0xa8 = 168 = __NR_poll */
        "movl $(-1), %edx\n\t"   /* no timeout */
        /* call kernel: sys_poll(%ebx, %ecx, %edx)
         *   %ebx == ufds (address of the pollfd structs)
         *   %ecx == num fd's (2)
         *   %edx == timeout (-1) */
        "int $0x80\n\t"
        "pushl %ebx\n\t"         /* push the returned %ebx on the stack as well */
        "movl %esp, %ecx\n\t"    /* %ecx is now a pointer to the 2 saved %ebx's */
        "movl $0x08, %edx\n\t"   /* %edx = 8 */
        "movl $0x04, %eax\n\t"   /* %eax = 4 == __NR_write */
        "movl $0x01, %ebx\n\t"   /* %ebx = 1 == fd */
        /* call kernel: sys_write(%ebx, %ecx, %edx)
         *   %ebx = fd (1)
         *   %ecx = buf (pointer to the 2 saved %ebx's)
         *   %edx = len (8) */
        "int $0x80\n\t"
        "movl $0x01, %eax\n\t"   /* and a sys_exit(0); */
        "xorl %ebx, %ebx\n\t"
        "int $0x80\n\t"
    );
}

^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: Syscall changes registers beyond %eax, on linux-i386
2002-09-19 17:59 ` dvorak
@ 2002-09-19 18:32 ` Richard B. Johnson
0 siblings, 0 replies; 30+ messages in thread
From: Richard B. Johnson @ 2002-09-19 18:32 UTC (permalink / raw)
To: dvorak; +Cc: linux-kernel

On Thu, 19 Sep 2002, dvorak wrote:
> On Thu, Sep 19, 2002 at 01:22:35PM -0400, Richard B. Johnson wrote:
> > On Thu, 19 Sep 2002, Brian Gerst wrote:
> >
> > > Richard B. Johnson wrote:
> > > > On Thu, 19 Sep 2002, dvorak wrote:
> > > >
> > > >
> > > >>Hi,
> > > >>
> > > >>recently i came across a situation were on linux-i386 not only %eax was
> > > >>altered after a syscall but also %ebx. I tracked this problem down, to
> > > >>gcc re-using a variable passed to a function.
> > > >>
> > > >>This was found on a debian system with a 2.4.17 kernel compiled with gcc
> > > >>2.95.2 and verified on another system, kernel 2.4.18 compiled with 2.95.4
> > > >>Attached is small program to test for this 'bug'
> > > >>
> <SNIP part of the explanation>
> > > >>It seems that gcc in certain cases optimizes in such a way that it changes
> > > >>the variable ufds as placed on the stack directly. Which results in saved(ebx)
> > > >>being overwritten and thus in a changed %ebx on return from the system call.
> > > >>
> > > >
> > > >
> > > > The 'C' compiler must make room on the stack for any local
> > > > variables except register types. If it was doing as you state, you
> > > > couldn't even execute a "hello world" program. Further, the local
> > > > variables are after the return address. It would screw up the return
> > > > address and you'd go off into hyper-space upon return.
>
> The problem is it uses one of the _arguments_ passed to the function,
> that argument gets modified, normally this happens on a copy, but there
> is no 'garantue' that is doesn't modify the original argument as
> putted on the stack by the calling function.
>
> > > > No. Various 'C' implementers have standardized calling methods even
> > > > though it's not part of the 'C' standard. gcc and others assume that
> > > > a called procedure is not going to change any segments or index registers.
> > > > There are various optimization things, like "-fcaller-saves" where the
> > > > called procedure can destroy anything. You may be using something that
> > > > was wrongly compiled using that switch.
> This is not what happens here, what happens is that one of the _arguments_
> placed on the stack is being modified, normally a calling function discards
> these values after use (addl $0x10, %esp or similar) but in this case they
> are reused. (in the RESTORE_ALL call)
>
> > >
> > > The bug is only with _some_ syscalls, and getpid() is not one of them,
> > > so your example is flawed. It happens when a syscall modifies one of
> > > it's parameter values. The solution is to assign the parameter to a
> > > local variable before modifying it.
> > >
> and only with _some_ compiler + kernel combinations.

[SNIPPED...]

Okay. Thanks for the explanation.

Cheers,
Dick Johnson
Penguin : Linux version 2.4.18 on an i686 machine (797.90 BogoMips).
The US military has given us many words, FUBAR, SNAFU, now ENRON.
Yes, top management were graduates of West Point and Annapolis.

^ permalink raw reply [flat|nested] 30+ messages in thread