* gcc 2.95 vs 3.21 performance
@ 2003-02-03 23:05 Martin J. Bligh
  2003-02-03 23:22 ` [Lse-tech] " Andi Kleen
                   ` (3 more replies)
  0 siblings, 4 replies; 84+ messages in thread
From: Martin J. Bligh @ 2003-02-03 23:05 UTC (permalink / raw)
  To: linux-kernel; +Cc: lse-tech

People keep extolling the virtues of gcc 3.2 to me, which I'm
reluctant to switch to, since it compiles so much slower. But
it supposedly generates better code, so I thought I'd compile
the kernel with both and compare the results. This is gcc 2.95
and 3.2.1 from debian unstable on a 16-way NUMA-Q. The kernbench
tests still use 2.95 for the compile-time stuff.

The results below leaves me distinctly unconvinced by the supposed
merits of modern gcc's. Not really better or worse, within experimental
error. But much slower to compile things with.

Kernbench-2: (make -j N vmlinux, where N = 2 x num_cpus)
                     Elapsed        User      System         CPU
          2.5.59       46.08      563.88      118.38     1480.00
   2.5.59-gcc3.2       45.86      563.63      119.58     1489.33

Kernbench-16: (make -j N vmlinux, where N = 16 x num_cpus)
                     Elapsed        User      System         CPU
          2.5.59       47.45      568.02      143.17     1498.17
   2.5.59-gcc3.2       47.15      567.41      143.72     1507.50

DISCLAIMER: SPEC(tm) and the benchmark name SDET(tm) are registered
trademarks of the Standard Performance Evaluation Corporation. This
benchmarking was performed for research purposes only, and the run results
are non-compliant and not-comparable with any published results.

Results are shown as percentages of the first set displayed

SDET 1  (see disclaimer)
                  Throughput    Std. Dev
          2.5.59      100.0%        0.8%
   2.5.59-gcc3.2       95.3%        5.2%

SDET 2  (see disclaimer)
                  Throughput    Std. Dev
          2.5.59      100.0%        0.6%
   2.5.59-gcc3.2       91.9%        7.1%

SDET 4  (see disclaimer)
                  Throughput    Std. Dev
          2.5.59      100.0%        5.7%
   2.5.59-gcc3.2       98.8%        5.3%

SDET 8  (see disclaimer)
                  Throughput    Std. Dev
          2.5.59      100.0%        1.4%
   2.5.59-gcc3.2      105.3%        4.7%

SDET 16  (see disclaimer)
                  Throughput    Std. Dev
          2.5.59      100.0%        1.7%
   2.5.59-gcc3.2      103.1%        1.8%

SDET 32  (see disclaimer)
                  Throughput    Std. Dev
          2.5.59      100.0%        1.5%
   2.5.59-gcc3.2      101.0%        1.6%

SDET 64  (see disclaimer)
                  Throughput    Std. Dev
          2.5.59      100.0%        0.7%
   2.5.59-gcc3.2      103.1%        1.1%

SDET 128  (see disclaimer)
                  Throughput    Std. Dev

NUMA schedbench 4:
                     AvgUser     Elapsed   TotalUser    TotalSys
          2.5.59        0.00       38.88       82.78        0.65
   2.5.59-gcc3.2        0.00       41.80      107.76        0.73

NUMA schedbench 8:
                     AvgUser     Elapsed   TotalUser    TotalSys
          2.5.59        0.00       49.30      247.80        1.93
   2.5.59-gcc3.2        0.00       38.00      229.83        2.11

NUMA schedbench 16:
                     AvgUser     Elapsed   TotalUser    TotalSys
          2.5.59        0.00       57.37      843.12        3.77
   2.5.59-gcc3.2        0.00       57.28      839.21        2.85

NUMA schedbench 32:
                     AvgUser     Elapsed   TotalUser    TotalSys
          2.5.59        0.00      116.99     1805.79        6.05
   2.5.59-gcc3.2        0.00      118.44     1788.09        6.25

NUMA schedbench 64:
                     AvgUser     Elapsed   TotalUser    TotalSys
          2.5.59        0.00      235.18     3632.73       15.45
   2.5.59-gcc3.2        0.00      234.55     3633.76       15.02

------------------------------------------------------------------------------

And with the same kernel, comparing the compile times for gcc 2.95 to 3.2

Kernbench-2: (make -j N vmlinux, where N = 2 x num_cpus)
                     Elapsed        User      System         CPU
         gcc2.95       46.08      563.88      118.38     1480.00
         gcc3.21       69.93      923.17      114.36     1483.17

Kernbench-16: (make -j N vmlinux, where N = 16 x num_cpus)
                     Elapsed        User      System         CPU
         gcc2.95       47.45      568.02      143.17     1498.17
         gcc3.21       71.44      926.45      134.89     1485.33

pft.

^ permalink raw reply	[flat|nested] 84+ messages in thread
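To put the Kernbench figures in the message above in proportion, here is a small C sketch that turns two timings into a relative change. The helper names (percent_change, report, print_kernbench_summary) are invented for illustration; the numbers are the Kernbench-2 Elapsed and User values quoted in the message.

```c
#include <assert.h>
#include <stdio.h>

/* Percentage by which `candidate` exceeds `baseline` (negative if lower). */
static double percent_change(double baseline, double candidate)
{
    assert(baseline > 0.0);   /* timings are positive; avoids dividing by zero */
    return (candidate / baseline - 1.0) * 100.0;
}

/* Print one labelled comparison line. */
static void report(const char *label, double baseline, double candidate)
{
    printf("%-34s %+7.2f%%\n", label, percent_change(baseline, candidate));
}

/* Kernbench-2 figures copied from the tables in the message. */
static void print_kernbench_summary(void)
{
    /* Same compiler for the benchmark, kernel built with 2.95 vs 3.2 */
    report("run time, Elapsed (kernel built)", 46.08, 45.86);
    /* Same kernel, benchmark compiled by gcc 2.95 vs gcc 3.2.1 */
    report("compile time, Elapsed", 46.08, 69.93);
    report("compile time, User", 563.88, 923.17);
}
```

Calling print_kernbench_summary() from a main() shows the shape of the argument: the kernel built with gcc 3.2 runs the benchmark within about half a percent of the 2.95 build, while compiling with gcc 3.2.1 takes roughly 52% longer in elapsed time and roughly 64% longer in user time.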
* Re: [Lse-tech] gcc 2.95 vs 3.21 performance 2003-02-03 23:05 gcc 2.95 vs 3.21 performance Martin J. Bligh @ 2003-02-03 23:22 ` Andi Kleen 2003-02-03 23:31 ` Richard B. Johnson ` (2 subsequent siblings) 3 siblings, 0 replies; 84+ messages in thread From: Andi Kleen @ 2003-02-03 23:22 UTC (permalink / raw) To: Martin J. Bligh; +Cc: linux-kernel, lse-tech On Mon, Feb 03, 2003 at 03:05:06PM -0800, Martin J. Bligh wrote: > The results below leaves me distinctly unconvinced by the supposed > merits of modern gcc's. Not really better or worse, within experimental > error. But much slower to compile things with. Curious - could you compare it with a gcc 3.3 snapshot too? It should be even slower at compiling, but generate better code. -Andi ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: gcc 2.95 vs 3.21 performance 2003-02-03 23:05 gcc 2.95 vs 3.21 performance Martin J. Bligh 2003-02-03 23:22 ` [Lse-tech] " Andi Kleen @ 2003-02-03 23:31 ` Richard B. Johnson 2003-02-04 0:43 ` J.A. Magallon ` (2 more replies) 2003-02-04 12:20 ` [Lse-tech] " Dave Jones 2003-02-06 15:42 ` gcc -O2 vs gcc -Os performance Martin J. Bligh 3 siblings, 3 replies; 84+ messages in thread From: Richard B. Johnson @ 2003-02-03 23:31 UTC (permalink / raw) To: Martin J. Bligh; +Cc: linux-kernel, lse-tech On Mon, 3 Feb 2003, Martin J. Bligh wrote: > People keep extolling the virtues of gcc 3.2 to me, which I'm > reluctant to switch to, since it compiles so much slower. But > it supposedly generates better code, so I thought I'd compile > the kernel with both and compare the results. This is gcc 2.95 > and 3.2.1 from debian unstable on a 16-way NUMA-Q. The kernbench > tests still use 2.95 for the compile-time stuff. > [SNIPPED tests...] Don't let this get out, but egcs-2.91.66 compiled FFT code works about 50 percent of the speed of whatever M$ uses for Visual C++ Version 6.0 I was awfully disheartened when I found that identical code executed twice as fast on M$ than it does on Linux. I tried to isolate what was causing the difference. So I replaced 'hypot()' with some 'C' code that does sqrt(x^2 + y^2) just to see if it was the 'C' library. It didn't help. When I find out what type (section) of code is running slower, I'll report. In the meantime, it's fast enough, but I don't like being beat by M$. Cheers, Dick Johnson Penguin : Linux version 2.4.18 on an i686 machine (797.90 BogoMips). Why is the government concerned about the lunatic fringe? Think about it. ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: gcc 2.95 vs 3.21 performance
  2003-02-03 23:31 ` Richard B. Johnson
@ 2003-02-04  0:43   ` J.A. Magallon
  2003-02-04 13:42     ` Richard B. Johnson
  2003-02-04  6:54   ` Denis Vlasenko
  2003-02-04 10:57   ` Padraig
  2 siblings, 1 reply; 84+ messages in thread
From: J.A. Magallon @ 2003-02-04  0:43 UTC (permalink / raw)
  To: root; +Cc: Martin J. Bligh, linux-kernel, lse-tech

On 2003.02.04 Richard B. Johnson wrote:
> On Mon, 3 Feb 2003, Martin J. Bligh wrote:
>
> > People keep extolling the virtues of gcc 3.2 to me, which I'm
> > reluctant to switch to, since it compiles so much slower. But
> > it supposedly generates better code, so I thought I'd compile
> > the kernel with both and compare the results. This is gcc 2.95
> > and 3.2.1 from debian unstable on a 16-way NUMA-Q. The kernbench
> > tests still use 2.95 for the compile-time stuff.
> >
> [SNIPPED tests...]
>
> Don't let this get out, but egcs-2.91.66 compiled FFT code
> works about 50 percent of the speed of whatever M$ uses for
> Visual C++ Version 6.0 I was awfully disheartened when I
> found that identical code executed twice as fast on M$ than
> it does on Linux. I tried to isolate what was causing the
> difference. So I replaced 'hypot()' with some 'C' code that
> does sqrt(x^2 + y^2) just to see if it was the 'C' library.
> It didn't help. When I find out what type (section) of code
> is running slower, I'll report. In the meantime, it's fast
> enough, but I don't like being beat by M$.
>

I face a similar problem. As everybody says that SSE is so marvelous,
we are trying to put some SSE code in our render engine, to speed up this.
But look at the results of the code below (box is a P4@1.8, Xeon with ht):

annwn:~/sse> ss-g
Proc std:
      5020 kticks
Proc std inline:
      4320 kticks
Proc sse:
      4290 kticks
Proc sse inline:
      3890 kticks

So what? Just around 500 ticks for updating to SSE?
As the Computer Architecture people at the school say, it is something
called 'spill code' (did I write it OK?). In short, too much SSE but too
few registers, so Intel ia32 turns into crap when you need some indexes:
you run out of registers and copy to and from the stack.

#include <stdlib.h>
#include <time.h>
#include <stdio.h>
#if defined(__INTEL_COMPILER)
#include <xmmintrin.h>
#endif

#define LOOPS 1000
#define SZ 100000

#if defined(__GNUC__) && defined(__SSE__)
typedef void __ve_reg __attribute__((__mode__(V4SF)));
#endif

typedef struct point point;
struct point {
    float v[4];
};

/* Scalar multiply, called out of line. */
void mulp_std(const point* a,const point* b,point* r)
{
    int i;

    for (i=0; i<4; i++)
        r->v[i] = a->v[i] * b->v[i];
}

/* Scalar multiply, inlined. 'static' so the definition also links
 * when the compiler follows C99 inline rules. */
static inline void mulpi_std(const point* a,const point* b,point* r)
{
    int i;

    for (i=0; i<4; i++)
        r->v[i] = a->v[i] * b->v[i];
}

/* SSE multiply, called out of line. */
void mulp_sse(const point* a,const point* b,point* r)
{
#if defined(__GNUC__) && defined(__SSE__)
    __ve_reg xmm0,xmm1,xmm2;

    xmm0 = __builtin_ia32_loadups((float*)a->v);
    xmm1 = __builtin_ia32_loadups((float*)b->v);
    xmm2 = __builtin_ia32_mulps(xmm0,xmm1);
    __builtin_ia32_storeups(r->v,xmm2);
#endif
#if defined(__INTEL_COMPILER)
    __m128 xmm0,xmm1,xmm2;

    xmm0 = _mm_loadu_ps((float*)a->v);
    xmm1 = _mm_loadu_ps((float*)b->v);
    xmm2 = _mm_mul_ps(xmm0,xmm1);
    _mm_storeu_ps(r->v,xmm2);
#endif
}

/* SSE multiply, inlined. */
static inline void mulpi_sse(const point* a,const point* b,point* r)
{
#if defined(__GNUC__) && defined(__SSE__)
    __ve_reg xmm0,xmm1,xmm2;

    xmm0 = __builtin_ia32_loadups((float*)a->v);
    xmm1 = __builtin_ia32_loadups((float*)b->v);
    xmm2 = __builtin_ia32_mulps(xmm0,xmm1);
    __builtin_ia32_storeups(r->v,xmm2);
#endif
#if defined(__INTEL_COMPILER)
    __m128 xmm0,xmm1,xmm2;

    xmm0 = _mm_loadu_ps((float*)a->v);
    xmm1 = _mm_loadu_ps((float*)b->v);
    xmm2 = _mm_mul_ps(xmm0,xmm1);
    _mm_storeu_ps(r->v,xmm2);
#endif
}

int main(int argc, char** argv)
{
    point *a;
    point *b;
    point *c;
    int i,j;
    unsigned long t0,t1;

    a = malloc(SZ*sizeof(point));
    b = malloc(SZ*sizeof(point));
    c = malloc(SZ*sizeof(point));
    if (!a || !b || !c)
        return 1;

    /* malloc() memory is uninitialized: give the inputs defined values. */
    for (j=0; j<SZ; j++)
        for (i=0; i<4; i++)
            a[j].v[i] = b[j].v[i] = 1.0f;

    printf("Proc std:\n");
    t0 = clock();
    for (i=0; i<LOOPS; i++)
    {
        for (j=0; j<SZ; j++)
            mulp_std(&a[j],&b[j],&c[j]);
        for (j=0; j<SZ; j++)
            mulp_std(&b[j],&b[j],&a[j]);
    }
    t1 = clock();
    printf("%10lu kticks\n",(t1-t0)/1000);

    printf("Proc std inline:\n");
    t0 = clock();
    for (i=0; i<LOOPS; i++)
    {
        for (j=0; j<SZ; j++)
            mulpi_std(&a[j],&b[j],&c[j]);
        for (j=0; j<SZ; j++)
            mulpi_std(&b[j],&b[j],&a[j]);
    }
    t1 = clock();
    printf("%10lu kticks\n",(t1-t0)/1000);

    printf("Proc sse:\n");
    t0 = clock();
    for (i=0; i<LOOPS; i++)
    {
        for (j=0; j<SZ; j++)
            mulp_sse(&a[j],&b[j],&c[j]);
        for (j=0; j<SZ; j++)
            mulp_sse(&b[j],&b[j],&a[j]);
    }
    t1 = clock();
    printf("%10lu kticks\n",(t1-t0)/1000);

    printf("Proc sse inline:\n");
    t0 = clock();
    for (i=0; i<LOOPS; i++)
    {
        for (j=0; j<SZ; j++)
            mulpi_sse(&a[j],&b[j],&c[j]);
        for (j=0; j<SZ; j++)
            mulpi_sse(&b[j],&b[j],&a[j]);
    }
    t1 = clock();
    printf("%10lu kticks\n",(t1-t0)/1000);

    free(c);
    free(b);
    free(a);

    return 0;
}

-- 
J.A. Magallon <jamagallon@able.es>      \                 Software is like sex:
werewolf.able.es                         \           It's better when it's free
Mandrake Linux release 9.1 (Cooker) for i586
Linux 2.4.21-pre4-jam1 (gcc 3.2.1 (Mandrake Linux 9.1 3.2.1-5mdk))

^ permalink raw reply	[flat|nested] 84+ messages in thread
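The test program above keeps one code path per compiler: raw GCC builtins on one side and the Intel intrinsics on the other. As a hedged sketch, assuming a compiler that ships <xmmintrin.h> (which, as far as I know, includes current GCC and Clang as well as the Intel compiler), the two branches can collapse into the intrinsic form, with a scalar fallback where SSE is not enabled. The names point4, mul4_scalar, mul4_sse and mul4_agree are hypothetical and only mirror the original point and mulp_* functions.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

/* Hypothetical stand-in for the original `point` type. */
typedef struct { float v[4]; } point4;

/* Scalar reference: four independent multiplies. */
static void mul4_scalar(const point4 *a, const point4 *b, point4 *r)
{
    for (int i = 0; i < 4; i++)
        r->v[i] = a->v[i] * b->v[i];
}

/* Same operation through SSE intrinsics: unaligned loads, one packed
 * multiply, unaligned store. Uses the scalar loop when SSE is off. */
static void mul4_sse(const point4 *a, const point4 *b, point4 *r)
{
#if defined(__SSE__)
    __m128 va = _mm_loadu_ps(a->v);
    __m128 vb = _mm_loadu_ps(b->v);
    _mm_storeu_ps(r->v, _mm_mul_ps(va, vb));
#else
    mul4_scalar(a, b, r);
#endif
}

/* Returns 1 when both versions produce identical lanes, else 0. */
static int mul4_agree(const point4 *a, const point4 *b)
{
    point4 s, v;

    assert(a != NULL && b != NULL);
    mul4_scalar(a, b, &s);
    mul4_sse(a, b, &v);
    for (int i = 0; i < 4; i++)
        if (s.v[i] != v.v[i])
            return 0;
    return 1;
}
```

Checking that the two versions agree before timing them is cheap insurance: in the original program the SSE functions compile to empty bodies if neither preprocessor condition holds, which would make the "sse" timings meaningless without any warning.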
* Re: gcc 2.95 vs 3.21 performance
  2003-02-04  0:43   ` J.A. Magallon
@ 2003-02-04 13:42     ` Richard B. Johnson
  2003-02-04 14:20       ` John Bradford
  0 siblings, 1 reply; 84+ messages in thread
From: Richard B. Johnson @ 2003-02-04 13:42 UTC (permalink / raw)
  To: J.A. Magallon; +Cc: Martin J. Bligh, linux-kernel, lse-tech

On Tue, 4 Feb 2003, J.A. Magallon wrote:

>
> On 2003.02.04 Richard B. Johnson wrote:
> > On Mon, 3 Feb 2003, Martin J. Bligh wrote:
> >
> > > People keep extolling the virtues of gcc 3.2 to me, which I'm
> > > reluctant to switch to, since it compiles so much slower. But
> > > it supposedly generates better code, so I thought I'd compile
> > > the kernel with both and compare the results. This is gcc 2.95
> > > and 3.2.1 from debian unstable on a 16-way NUMA-Q. The kernbench
> > > tests still use 2.95 for the compile-time stuff.
> > >
> > [SNIPPED tests...]
> >
> > Don't let this get out, but egcs-2.91.66 compiled FFT code
> > works about 50 percent of the speed of whatever M$ uses for
> > Visual C++ Version 6.0 I was awfully disheartened when I
> > found that identical code executed twice as fast on M$ than
> > it does on Linux. I tried to isolate what was causing the
> > difference. So I replaced 'hypot()' with some 'C' code that
> > does sqrt(x^2 + y^2) just to see if it was the 'C' library.
> > It didn't help. When I find out what type (section) of code
> > is running slower, I'll report. In the meantime, it's fast
> > enough, but I don't like being beat by M$.
> >
>
> I face a similar problem. As everybody says that SSE is so marvelous,
> we are trying to put some SSE code in our render engine, to speed up this.
> But look at the results of the code below (box is a P4@1.8, Xeon with ht):

[SNIPPED good demo code]

I'm going to answer all the comments on this topic with just
one observation. Sorry that I don't have the time to answer all
who responded personally, but I have to take a "work break" today
and tomorrow (design review).
gcc is a marvelous compiler because it was designed to be readily
ported to different architectures. However, it is not an optimum
compiler for ix86 machines and probably is not optimum for any
one kind of machine.

I often hear complaints about the ix86 processors as being
"register starved", etc. This could not be further from fact.
There are enough registers. However, various registers were
designed to do various things. Once you decide that you know
more than the processor developers, and start using registers
for things they were not designed for, you start to have
excellent test benchmarks, but awful overall performance.

For example, the ECX register was designed to be used as a
counter. It can be told to decrement and perform a conditional
jump with the 'loop' instruction. The loop instruction comes in
various flavors, also, like loopz, loopnz. Somebody decided that
'dec ecx; jnz' was faster. They measured this to "prove" that
it's faster. In the meantime, other code suffers (stumbles)
because there was really no spare time to be grabbed.

Data needs to be fetched to and from memory. The instruction
unit ends up being starved while data are acquired. This would
not normally hurt anything because the RAM bandwidth ends up
being the dominant pole in the transfer function, but you end
up with something I call the "accordion problem".

I will first demonstrate the accordion problem and then explain
where it comes from.

Note a smooth flow of traffic on a highway. All the cars are
traveling at the same speed. Their speed increases until they
don't dare go any faster. They are now "bandwidth limited".
Somebody sees a traffic cop. Somebody slows down, it takes a
few hundred milliseconds for the next car to slow down, this
transient moves backwards through the line of cars until cars
several miles back actually have to perform emergency braking
to stay off the bumper ahead. Then, the cars start accelerating
again.
This acceleration, deceleration ripple moves through the line
of cars like the bellows of an accordion. The average speed of
the line of traffic is now reduced even though there are
oscillatory accelerations above the speed-limit.

Now, visualize a CPU and RAM combination running in lock-step.
The speed of the execution unit is matched to the speed of the
processor I/O so the instructions are fetched and executed in
a more-or-less synchronized manner. This is like the high-speed
line of cars before somebody sees the traffic cop. Now, perturb
this execution by throwing in some faster-than-normal program
sequences. You may start the accordion effect. The problem is
that both instructions and data come through the same
hole-in-the-wall, regardless of caching. When the prefetch unit
needs more data (instructions) it must contend with the data
I/O. This may cause an oscillatory condition, actually reducing
throughput.

Anybody who uses CPUs in laboratories with sensitive receiving
equipment knows that, regardless of the FCC rules, these machines
generate great gobs of radio frequency interference. That's why
they need to be in shielded boxes. If you want to "hear" the
stumble I'm talking about, just listen to the AM audio output
using a field-intensity meter. When you have a fast
smoothly-running machine, the interference sounds like noise.
When you have the accordion effect, the interference has a
repetitive pattern to it, a tone, usually low-frequency. If you
capture enough data in a logic analyzer, you will see the pattern
and can see actual pauses in bus I/O where the CPU just isn't
doing a damn thing at all!

FYI, the power supply current required to write 0xffffffff to
RAM differs from that required to write 0x00000000 (honest!).
If you are doing a memory-test, writing such a pattern that the
load on the power supply changes at a rate that will disturb the
power supply servo-loop, you can make the voltage bounce!
This has nothing to do with slow CPU execution speed, but just
demonstrates that there are a lot of interactions that should
be considered when designing or proving-out a system. It's not
just a local bench-mark that counts.

The Intel Compiler(s) I have used generate code that uses the
registers just like Intel specified. It uses EBX, ESI, EDI as
index registers just like the 16-bit BX, SI, DI. I have never
seen code output from an Intel 'C' compiler that uses EAX as
an index register, even though it's available and "faster".
They seem to stick with the "un-optimized" string instructions
like rep movsb, repnz cmpsb, etc., and they use 'loop'. Maybe,
just maybe, Intel knows something about their processor that
shouldn't be second-guessed by clever programmers.

Cheers,
Dick Johnson
Penguin : Linux version 2.4.18 on an i686 machine (797.90 BogoMips).
Why is the government concerned about the lunatic fringe? Think about it.

^ permalink raw reply	[flat|nested] 84+ messages in thread
* Re: gcc 2.95 vs 3.21 performance 2003-02-04 13:42 ` Richard B. Johnson @ 2003-02-04 14:20 ` John Bradford 0 siblings, 0 replies; 84+ messages in thread From: John Bradford @ 2003-02-04 14:20 UTC (permalink / raw) To: root; +Cc: jamagallon, mbligh, linux-kernel, lse-tech There is some discussion about compiler optimisations in this Linux Journal article: http://www.linuxjournal.com/article.php?sid=4885 John. ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: gcc 2.95 vs 3.21 performance 2003-02-03 23:31 ` Richard B. Johnson 2003-02-04 0:43 ` J.A. Magallon @ 2003-02-04 6:54 ` Denis Vlasenko 2003-02-04 7:13 ` Martin J. Bligh ` (2 more replies) 2003-02-04 10:57 ` Padraig 2 siblings, 3 replies; 84+ messages in thread From: Denis Vlasenko @ 2003-02-04 6:54 UTC (permalink / raw) To: root, Martin J. Bligh; +Cc: linux-kernel, lse-tech On 4 February 2003 01:31, Richard B. Johnson wrote: > On Mon, 3 Feb 2003, Martin J. Bligh wrote: > > People keep extolling the virtues of gcc 3.2 to me, which I'm > > reluctant to switch to, since it compiles so much slower. But > > it supposedly generates better code, so I thought I'd compile > > the kernel with both and compare the results. This is gcc 2.95 > > and 3.2.1 from debian unstable on a 16-way NUMA-Q. The kernbench > > tests still use 2.95 for the compile-time stuff. > > [SNIPPED tests...] What was the size of uncompressed kernel binaries? This is a simple (and somewhat inaccurate) measure of compiler improvement ;) > Don't let this get out, but egcs-2.91.66 compiled FFT code > works about 50 percent of the speed of whatever M$ uses for > Visual C++ Version 6.0 I was awfully disheartened when I Yes. M$ (and some other compilers) beat GCC badly. > found that identical code executed twice as fast on M$ than > it does on Linux. I tried to isolate what was causing the > difference. So I replaced 'hypot()' with some 'C' code that > does sqrt(x^2 + y^2) just to see if it was the 'C' library. > It didn't help. When I find out what type (section) of code > is running slower, I'll report. In the meantime, it's fast > enough, but I don't like being beat by M$. I'm afraid it's code generation engine. It is just worse than M$ or Intel's one. It is not easily fixable, GCC folks have tremendous task at hand. I wonder whether some big companies supposedly supporting Linux (e.g. Intel) can help GCC team (for example by giving away some code and/or developer time). 
-- vda ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: gcc 2.95 vs 3.21 performance 2003-02-04 6:54 ` Denis Vlasenko @ 2003-02-04 7:13 ` Martin J. Bligh 2003-02-04 12:25 ` Adrian Bunk 2003-02-04 9:54 ` Bryan Andersen 2003-02-04 19:09 ` Timothy D. Witham 2 siblings, 1 reply; 84+ messages in thread From: Martin J. Bligh @ 2003-02-04 7:13 UTC (permalink / raw) To: vda; +Cc: linux-kernel, lse-tech > I'm afraid it's code generation engine. It is just worse than > M$ or Intel's one. It is not easily fixable, > GCC folks have tremendous task at hand. > > I wonder whether some big companies supposedly supporting > Linux (e.g. Intel) can help GCC team (for example by giving > away some code and/or developer time). Comparing Intel's compiler vs GCC on Linux would be more interesting. Anyone got a copy and some time to burn? M. ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: gcc 2.95 vs 3.21 performance 2003-02-04 7:13 ` Martin J. Bligh @ 2003-02-04 12:25 ` Adrian Bunk 2003-02-04 15:51 ` Martin J. Bligh 0 siblings, 1 reply; 84+ messages in thread From: Adrian Bunk @ 2003-02-04 12:25 UTC (permalink / raw) To: Martin J. Bligh; +Cc: vda, linux-kernel, lse-tech On Mon, Feb 03, 2003 at 11:13:31PM -0800, Martin J. Bligh wrote: > > I'm afraid it's code generation engine. It is just worse than > > M$ or Intel's one. It is not easily fixable, > > GCC folks have tremendous task at hand. > > > > I wonder whether some big companies supposedly supporting > > Linux (e.g. Intel) can help GCC team (for example by giving > > away some code and/or developer time). > > Comparing Intel's compiler vs GCC on Linux would be more interesting. > Anyone got a copy and some time to burn? There are already people who have done this, e.g. http://www.coyotegulch.com/reviews/intel_comp/intel_gcc_bench2.html compares g++ and Intel's C++ compiler with C++ code. > M. cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: gcc 2.95 vs 3.21 performance 2003-02-04 12:25 ` Adrian Bunk @ 2003-02-04 15:51 ` Martin J. Bligh 2003-02-04 16:27 ` [Lse-tech] " Martin J. Bligh 0 siblings, 1 reply; 84+ messages in thread From: Martin J. Bligh @ 2003-02-04 15:51 UTC (permalink / raw) To: Adrian Bunk; +Cc: vda, linux-kernel, lse-tech >> Comparing Intel's compiler vs GCC on Linux would be more interesting. >> Anyone got a copy and some time to burn? > > There are already people who have done this, e.g. > > http://www.coyotegulch.com/reviews/intel_comp/intel_gcc_bench2.html > > compares g++ and Intel's C++ compiler with C++ code. C would be infinitely more interesting ;-) M. ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: [Lse-tech] Re: gcc 2.95 vs 3.21 performance 2003-02-04 15:51 ` Martin J. Bligh @ 2003-02-04 16:27 ` Martin J. Bligh 2003-02-04 17:40 ` Patrick Mansfield 0 siblings, 1 reply; 84+ messages in thread From: Martin J. Bligh @ 2003-02-04 16:27 UTC (permalink / raw) To: Adrian Bunk; +Cc: linux-kernel, lse-tech >>> Comparing Intel's compiler vs GCC on Linux would be more interesting. >>> Anyone got a copy and some time to burn? >> >> There are already people who have done this, e.g. >> >> http://www.coyotegulch.com/reviews/intel_comp/intel_gcc_bench2.html >> >> compares g++ and Intel's C++ compiler with C++ code. > > C would be infinitely more interesting ;-) Speaking of which, has anyone ever compiled the ia32 Linux kernel with the Intel compiler? I thought I saw some patches floating around to make it compile the ia64 kernel .... that'd be an interesting test case ... might give us some ideas about what could be tweaked in GCC (or code rejiggled in the kernel). M. ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: [Lse-tech] Re: gcc 2.95 vs 3.21 performance 2003-02-04 16:27 ` [Lse-tech] " Martin J. Bligh @ 2003-02-04 17:40 ` Patrick Mansfield 2003-02-04 17:55 ` Martin J. Bligh 0 siblings, 1 reply; 84+ messages in thread From: Patrick Mansfield @ 2003-02-04 17:40 UTC (permalink / raw) To: Martin J. Bligh; +Cc: Adrian Bunk, linux-kernel, lse-tech On Tue, Feb 04, 2003 at 08:27:28AM -0800, Martin J. Bligh wrote: > >>> Comparing Intel's compiler vs GCC on Linux would be more interesting. > >>> Anyone got a copy and some time to burn? > >> > >> There are already people who have done this, e.g. > >> > >> http://www.coyotegulch.com/reviews/intel_comp/intel_gcc_bench2.html > >> > >> compares g++ and Intel's C++ compiler with C++ code. > > > > C would be infinitely more interesting ;-) > > Speaking of which, has anyone ever compiled the ia32 Linux kernel with the > Intel compiler? I thought I saw some patches floating around to make it > compile the ia64 kernel .... that'd be an interesting test case ... might > give us some ideas about what could be tweaked in GCC (or code rejiggled in > the kernel). > > M. Martin - Like this? http://marc.theaimsgroup.com/?l=linux-kernel&m=103559880923586&w=2 -- Patrick Mansfield ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: [Lse-tech] Re: gcc 2.95 vs 3.21 performance 2003-02-04 17:40 ` Patrick Mansfield @ 2003-02-04 17:55 ` Martin J. Bligh 0 siblings, 0 replies; 84+ messages in thread From: Martin J. Bligh @ 2003-02-04 17:55 UTC (permalink / raw) To: Patrick Mansfield; +Cc: Adrian Bunk, linux-kernel, lse-tech >> Speaking of which, has anyone ever compiled the ia32 Linux kernel with >> the Intel compiler? I thought I saw some patches floating around to make >> it compile the ia64 kernel .... that'd be an interesting test case ... >> might give us some ideas about what could be tweaked in GCC (or code >> rejiggled in the kernel). >> >> M. > > Martin - > > Like this? > > http://marc.theaimsgroup.com/?l=linux-kernel&m=103559880923586&w=2 Yeah, something very like that ;-) Thanks. Preferably less micro-benchmarky though .... M. ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: gcc 2.95 vs 3.21 performance
  2003-02-04  6:54   ` Denis Vlasenko
  2003-02-04  7:13     ` Martin J. Bligh
@ 2003-02-04  9:54     ` Bryan Andersen
  2003-02-04 15:46       ` Martin J. Bligh
  2003-02-04 19:09     ` Timothy D. Witham
  2 siblings, 1 reply; 84+ messages in thread
From: Bryan Andersen @ 2003-02-04  9:54 UTC (permalink / raw)
  To: linux-kernel; +Cc: vda, root, Martin J. Bligh, lse-tech

Personal opinion here but I know it is also held by many developers I
know and work with. I'd rather have a compiler that produces correct and
fast code but runs slow than one that produces slow or bad code and runs
fast. Remember compilation is done far less often than run time
execution. Yes, I too noticed a difference when I switched over to 3.2,
but I also noticed some of my code speed up.

>>>People keep extolling the virtues of gcc 3.2 to me, which I'm
>>>reluctant to switch to, since it compiles so much slower. But
>>>it supposedly generates better code, so I thought I'd compile
>>>the kernel with both and compare the results. This is gcc 2.95
>>>and 3.2.1 from debian unstable on a 16-way NUMA-Q. The kernbench
>>>tests still use 2.95 for the compile-time stuff.
>>
>>[SNIPPED tests...]
>
>
> What was the size of uncompressed kernel binaries?
> This is a simple (and somewhat inaccurate) measure of compiler
> improvement ;)

While I too like smaller, tighter output code, I'd trade it for code that
runs faster in real world situations. As an example, identifying the
most likely execution path through a routine and keeping it contiguous
in memory will do more for average execution speed than optimizing to
use the smallest number of bytes. If the compiler could tell which
blocks of code are for handling exceptions, it could then place them
outside of the main execution path. This makes the normal code execution
path smaller and more compact. In doing so it also reduces the number
of memory fetch operations and cache space needed to run the code.
With cache misses being 100+ clock cycles and page faults well into the
millions, keeping that normal execution path short means a lot.

>>Don't let this get out, but egcs-2.91.66 compiled FFT code
>>works about 50 percent of the speed of whatever M$ uses for
>>Visual C++ Version 6.0 I was awfully disheartened when I
>
> Yes. M$ (and some other compilers) beat GCC badly.

But can M$'s compiler produce code for many radically different CPU
architectures? Most people only work with gcc on one type of CPU so
they never think about just how flexible and good GCC really is. I see
it often compared against compilers that are dedicated to a single CPU,
where the development team only has to worry about one CPU type. GCC's
development team needs to worry about many different architectures.
Some are radically different in their fundamental structure. This
really complicates the job of producing a compiler that works correctly.

- Bryan

^ permalink raw reply	[flat|nested] 84+ messages in thread
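The idea in the message above (keep the likely path contiguous and push error handling out of line) can also be hinted to GCC by hand. The sketch below is only an illustration under stated assumptions: it relies on the GCC/Clang extensions __builtin_expect and the cold and noinline function attributes, and the cold attribute is newer than the compilers discussed in this thread. The likely/unlikely macro names mirror the ones used in the Linux kernel; sum_positive and report_bad_input are made-up names.

```c
#include <assert.h>
#include <stddef.h>

/* Branch hints: tell the compiler which outcome to lay out as fall-through. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Rare error path. cold + noinline ask the compiler to keep this code
 * away from the hot loop instead of merging it into the caller. */
__attribute__((cold, noinline))
static long report_bad_input(void)
{
    return -1;
}

/* Sums non-negative values; returns -1 at the first negative one.
 * The check is marked unlikely so the common case stays straight-line. */
static long sum_positive(const int *values, size_t count)
{
    long total = 0;

    assert(values != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (unlikely(values[i] < 0))
            return report_bad_input();
        total += values[i];
    }
    return total;
}
```

The hints do not change what the function computes, only how the compiler is encouraged to arrange the generated code. Whether that yields a measurable gain depends on the workload and has to be benchmarked case by case.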
* Re: gcc 2.95 vs 3.21 performance 2003-02-04 9:54 ` Bryan Andersen @ 2003-02-04 15:46 ` Martin J. Bligh 0 siblings, 0 replies; 84+ messages in thread From: Martin J. Bligh @ 2003-02-04 15:46 UTC (permalink / raw) To: Bryan Andersen, linux-kernel; +Cc: lse-tech > Personal opinion here but I know it is also held by many developers I > know and work with. I'd rather have a compiler that produces correct and > fast code but ran slow than one that produces slow or bad code and runs > fast. Remember compilation is done far less often than run time > execution. Yeah, I'd make that tradeoff too, but gcc 3.2 doesn't give me that. People keep saying it does, but I see no real evidence of it. Show me the money. M. ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: gcc 2.95 vs 3.21 performance
  2003-02-04  6:54   ` Denis Vlasenko
  2003-02-04  7:13     ` Martin J. Bligh
  2003-02-04  9:54     ` Bryan Andersen
@ 2003-02-04 19:09     ` Timothy D. Witham
  2003-02-04 19:35       ` John Bradford
  2 siblings, 1 reply; 84+ messages in thread
From: Timothy D. Witham @ 2003-02-04 19:09 UTC (permalink / raw)
  To: vda; +Cc: root, Martin J. Bligh, linux-kernel, lse-tech

On Mon, 2003-02-03 at 22:54, Denis Vlasenko wrote:

snip
>
> I'm afraid it's code generation engine. It is just worse than
> M$ or Intel's one. It is not easily fixable,
> GCC folks have tremendous task at hand.
>
> I wonder whether some big companies supposedly supporting
> Linux (e.g. Intel) can help GCC team (for example by giving
> away some code and/or developer time).
> --

  I'm hesitant to enter into this. But from my own experience
the issue with big companies supporting this sort of change
in gcc has more to do with the acceptance process of changes
into gcc than a lack of desire on the large companies' part.

Tim

> vda

-- 
Timothy D. Witham - Lab Director - wookie@osdlab.org
Open Source Development Lab Inc - A non-profit corporation
15275 SW Koll Parkway - Suite H - Beaverton OR, 97006
(503)-626-2455 x11 (office)    (503)-702-2871     (cell)
(503)-626-2436     (fax)

^ permalink raw reply	[flat|nested] 84+ messages in thread
* Re: gcc 2.95 vs 3.21 performance
  2003-02-04 19:09     ` Timothy D. Witham
@ 2003-02-04 19:35       ` John Bradford
  2003-02-04 19:44         ` Dave Jones
  2003-02-04 21:38         ` Linus Torvalds
  0 siblings, 2 replies; 84+ messages in thread
From: John Bradford @ 2003-02-04 19:35 UTC (permalink / raw)
  To: Timothy D. Witham; +Cc: vda, root, mbligh, linux-kernel, lse-tech

>   I'm hesitant to enter into this. But from my own experience
> the issue with big companies supporting these sort of changes
> in gcc have more to do with the acceptance process of changes
> into gcc than a lack of desire on the large companies part.

Maybe we should create a KGCC fork, optimise it for kernel
compilations, then try to get our changes merged back in to GCC
mainline at a later date.

John.

^ permalink raw reply	[flat|nested] 84+ messages in thread
* Re: gcc 2.95 vs 3.21 performance 2003-02-04 19:35 ` John Bradford @ 2003-02-04 19:44 ` Dave Jones 2003-02-04 20:11 ` John Bradford 2003-02-04 21:38 ` Linus Torvalds 1 sibling, 1 reply; 84+ messages in thread From: Dave Jones @ 2003-02-04 19:44 UTC (permalink / raw) To: John Bradford Cc: Timothy D. Witham, vda, root, mbligh, linux-kernel, lse-tech On Tue, Feb 04, 2003 at 07:35:06PM +0000, John Bradford wrote: > Maybe we should create a KGCC fork, optimise it for kernel > compilations, then try to get our changes merged back in to GCC > mainline at a later date. What exactly do you mean by "optimise for kernel compilations" ? Dave -- | Dave Jones. http://www.codemonkey.org.uk | SuSE Labs ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: gcc 2.95 vs 3.21 performance 2003-02-04 19:44 ` Dave Jones @ 2003-02-04 20:11 ` John Bradford 2003-02-04 20:20 ` John Bradford 2003-02-04 20:45 ` Herman Oosthuysen 0 siblings, 2 replies; 84+ messages in thread From: John Bradford @ 2003-02-04 20:11 UTC (permalink / raw) To: Dave Jones; +Cc: john, wookie, vda, root, mbligh, linux-kernel, lse-tech > > Maybe we should create a KGCC fork, optimise it for kernel > > compilations, then try to get our changes merged back in to GCC > > mainline at a later date. > > What exactly do you mean by "optimise for kernel compilations" ? I don't, that was a bad way of phrasing it - I didn't mean fork GCC just to create one which compiles the kernel so it runs faster, as the expense of other code. What I was thinking was that if we forked GCC, we could try out all of these ideas that have been floating around in this thread, and if, as was hinted at earlier in this thread, $bigcompanies[] have not offered contributions because of reluctance to accept them by the GCC team, we would be more in a position to try them out, because we only need to concern ourselves with breaking the compilation of the kernel, not every single program that currently compiles with GCC. The way I see it, the development series would be optimised for KGCC, and when we start to think about stabilising that development series, we try to get our KGCC changes merged back in to GCC mainline. 
If they are not accepted, either KGCC becomes the recommended kernel compiler, which should cause no great difficulties, (having one compiler for kernels, and one for userland applications), or we start making sure that we haven't broken compilation with GCC, (and since there would probably always be people compiling with GCC anyway, even if there was a KGCC, we would effectively always know if we broke compilation with GCC), and then the recommended compiler is just not the optimal one, and it would be up to the various distributions to decide which one they are going to use. John. ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: gcc 2.95 vs 3.21 performance 2003-02-04 20:11 ` John Bradford @ 2003-02-04 20:20 ` John Bradford 2003-02-04 20:45 ` Herman Oosthuysen 1 sibling, 0 replies; 84+ messages in thread From: John Bradford @ 2003-02-04 20:20 UTC (permalink / raw) To: John Bradford; +Cc: davej, wookie, vda, root, mbligh, linux-kernel, lse-tech Sorry, that last post didn't make sense, please apply this diff: - just to create one which compiles the kernel so it runs faster, as the + just to create one which compiles the kernel so it runs faster, at the expense of other code. John. ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: gcc 2.95 vs 3.21 performance 2003-02-04 20:11 ` John Bradford 2003-02-04 20:20 ` John Bradford @ 2003-02-04 20:45 ` Herman Oosthuysen 2003-02-04 21:44 ` Timothy D. Witham 2003-02-05 7:15 ` Denis Vlasenko 1 sibling, 2 replies; 84+ messages in thread From: Herman Oosthuysen @ 2003-02-04 20:45 UTC (permalink / raw) To: John Bradford Cc: Dave Jones, wookie, vda, root, mbligh, linux-kernel, lse-tech Hi there, From my experience, the speed issue is caused by misaligned memory accesses, causing inefficient SDRAM to Cache movement of data and instructions. I don't think that you necessarily need a modification to the compiler. What you can do is carefully place the ALIGN switch in a few critical places in the kernel code, to ensure that the code and data will be properly aligned for whatever processor it is compiled for, be that a Pentium, an ARM, a MIPS or whatever. It would be nice if GCC can be suitably improved to do this correctly for all architectures, but a little bit of human help can do wonders, without having to fork the GCC project. Cheers, -- ------------------------------------------------------------------------ Herman Oosthuysen B.Eng.(E), Member of IEEE Wireless Networks Inc. http://www.WirelessNetworksInc.com E-mail: Herman@WirelessNetworksInc.com Phone: 1.403.569-5687, Fax: 1.403.235-3965 ------------------------------------------------------------------------ John Bradford wrote: >> > Maybe we should create a KGCC fork, optimise it for kernel >> > compilations, then try to get our changes merged back in to GCC >> > mainline at a later date. >> >>What exactly do you mean by "optimise for kernel compilations" ? > > > I don't, that was a bad way of phrasing it - I didn't mean fork GCC > just to create one which compiles the kernel so it runs faster, as the > expense of other code. 
> > What I was thinking was that if we forked GCC, we could try out all of > these ideas that have been floating around in this thread, and if, as > was hinted at earlier in this thread, $bigcompanies[] have not offered > contributions because of reluctance to accept them by the GCC team, we > would be more in a position to try them out, because we only need to > concern ourselves with breaking the compilation of the kernel, not > every single program that currently compiles with GCC. > > The way I see it, the development series would be optimised for KGCC, > and when we start to think about stabilising that development series, > we try to get our KGCC changes merged back in to GCC mainline. If > they are not accepted, either KGCC becomes the recommended kernel > compiler, which should cause no great difficulties, (having one > compiler for kernels, and one for userland applications), or we start > making sure that we haven't broken compilation with GCC, (and since > there would probably always be people compiling with GCC anyway, even > if there was a KGCC, we would effectively always know if we broke > compilation with GCC), and then the recommended compiler is just not > the optimal one, and it would be up to the various distributions to > decide which one they are going to use. > > John. ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: gcc 2.95 vs 3.21 performance 2003-02-04 20:45 ` Herman Oosthuysen @ 2003-02-04 21:44 ` Timothy D. Witham 2003-02-05 7:15 ` Denis Vlasenko 1 sibling, 0 replies; 84+ messages in thread From: Timothy D. Witham @ 2003-02-04 21:44 UTC (permalink / raw) To: Herman Oosthuysen Cc: John Bradford, Dave Jones, vda, root, mbligh, linux-kernel, lse-tech On Tue, 2003-02-04 at 12:45, Herman Oosthuysen wrote: > Hi there, > > From my experience, the speed issue is caused by misaligned memory > accesses, causing inefficient SDRAM to Cache movement of data and > instructions. > > I don't think that you necessarily need a modification to the compiler. > What you can do is carefully place the ALIGN switch in a few critical > places in the kernel code, to ensure that the code and data will be > properly aligned for whatever processor it is compiled for, be that a > Pentium, an ARM, a MIPS or whatever. > I guess I would like the compiler to do that without having to go in and futz the code. > It would be nice if GCC can be suitably improved to do this correctly for > all architectures, but a little bit of human help can do wonders, > without having to fork the GCC project. > > Cheers, -- Timothy D. Witham - Lab Director - wookie@osdlab.org Open Source Development Lab Inc - A non-profit corporation 15275 SW Koll Parkway - Suite H - Beaverton OR, 97006 (503)-626-2455 x11 (office) (503)-702-2871 (cell) (503)-626-2436 (fax) ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: gcc 2.95 vs 3.21 performance 2003-02-04 20:45 ` Herman Oosthuysen 2003-02-04 21:44 ` Timothy D. Witham @ 2003-02-05 7:15 ` Denis Vlasenko 2003-02-05 10:36 ` Andreas Schwab 2003-02-05 15:30 ` Martin J. Bligh 1 sibling, 2 replies; 84+ messages in thread From: Denis Vlasenko @ 2003-02-05 7:15 UTC (permalink / raw) To: Herman Oosthuysen, John Bradford Cc: Dave Jones, wookie, root, mbligh, linux-kernel, lse-tech On 4 February 2003 22:45, Herman Oosthuysen wrote: > Hi there, > > From my experience, the speed issue is caused by misaligned memory > accesses, causing inefficient SDRAM to Cache movement of data and > instructions. > > I don't think that you necessarily need a modification to the > compiler. What you can do is carefully place the ALIGN switch in a > few critical places in the kernel code, to ensure that the code and > data will be properly aligned for whatever processor it is compiled > for, be that a Pentium, an ARM, a MIPS or whatever. > > It would be nice if GCC can be suitably improved to do this correctly > for all architectures, but a little bit of human help can do wonders, > without having to fork the GCC project. NO. GCC already went this way, i.e. it aligns functions and loops by ridiculous (IMHO) amounts like 16 bytes. That's 7.5 bytes per alignment on average. Now count lk functions and loops and mourn for lost icache. Or just disassemble any .o module and read the damn code. This is the primary reason why people report larger kernels for GCC 3.x I am damn sure that if you compile with less sadistic alignment you will get smaller *and* faster kernel. -- vda ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: gcc 2.95 vs 3.21 performance 2003-02-05 7:15 ` Denis Vlasenko @ 2003-02-05 10:36 ` Andreas Schwab 2003-02-05 11:41 ` Denis Vlasenko 2003-02-05 15:30 ` Martin J. Bligh 1 sibling, 1 reply; 84+ messages in thread From: Andreas Schwab @ 2003-02-05 10:36 UTC (permalink / raw) To: vda; +Cc: linux-kernel, lse-tech Denis Vlasenko <vda@port.imtp.ilyichevsk.odessa.ua> writes: |> I am damn sure that if you compile with less sadistic alignment |> you will get smaller *and* faster kernel. So why don't you try it out? GCC offers everything you need for this experiment. Andreas. -- Andreas Schwab, SuSE Labs, schwab@suse.de SuSE Linux AG, Deutschherrnstr. 15-19, D-90429 Nürnberg Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: gcc 2.95 vs 3.21 performance 2003-02-05 10:36 ` Andreas Schwab @ 2003-02-05 11:41 ` Denis Vlasenko 2003-02-05 12:20 ` Dave Jones 2003-02-05 13:10 ` [Lse-tech] " Dipankar Sarma 0 siblings, 2 replies; 84+ messages in thread From: Denis Vlasenko @ 2003-02-05 11:41 UTC (permalink / raw) To: Andreas Schwab; +Cc: linux-kernel, lse-tech On 5 February 2003 12:36, Andreas Schwab wrote: > Denis Vlasenko <vda@port.imtp.ilyichevsk.odessa.ua> writes: > |> I am damn sure that if you compile with less sadistic alignment > |> you will get smaller *and* faster kernel. > > So why don't you try it out? GCC offers everything you need for this > experiment. I did. Others did it too on occasion. My argument was against overusing optimization techniques. You cannot speed up kernel by aligning *everything* to 32 bytes, or by unrolling all loops, or by aggressive inlining. That's too easy to work. You get kernel which is bigger *and* slower. -- vda ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: gcc 2.95 vs 3.21 performance 2003-02-05 11:41 ` Denis Vlasenko @ 2003-02-05 12:20 ` Dave Jones 2003-02-05 13:10 ` [Lse-tech] " Dipankar Sarma 1 sibling, 0 replies; 84+ messages in thread From: Dave Jones @ 2003-02-05 12:20 UTC (permalink / raw) To: Denis Vlasenko; +Cc: Andreas Schwab, linux-kernel, lse-tech On Wed, Feb 05, 2003 at 01:41:34PM +0200, Denis Vlasenko wrote: > > So why don't you try it out? GCC offers everything you need for this > > experiment. > > I did. Others did it too on occasion. You seem to have forgotten to attach the numbers to your mail. Dave -- | Dave Jones. http://www.codemonkey.org.uk | SuSE Labs ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: [Lse-tech] Re: gcc 2.95 vs 3.21 performance 2003-02-05 11:41 ` Denis Vlasenko 2003-02-05 12:20 ` Dave Jones @ 2003-02-05 13:10 ` Dipankar Sarma 1 sibling, 0 replies; 84+ messages in thread From: Dipankar Sarma @ 2003-02-05 13:10 UTC (permalink / raw) To: Denis Vlasenko; +Cc: Andreas Schwab, linux-kernel, lse-tech On Wed, Feb 05, 2003 at 01:41:34PM +0200, Denis Vlasenko wrote: > My argument was against overusing optimization techniques. > You cannot speed up kernel by aligning *everything* to 32 bytes, > or by unrolling all loops, or by aggressive inlining. > That's too easy to work. You get kernel which is bigger > *and* slower. I am not getting into this debate, just wanted to point out that the effect of compiler optimization on UNIX kernels has been studied before. One paper I recall is - http://www.usenix.org/publications/library/proceedings/sf94/full_papers/partridge.ps They used profile-guided optimization, so that is a whole other angle altogether. Thanks Dipankar ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: gcc 2.95 vs 3.21 performance 2003-02-05 7:15 ` Denis Vlasenko 2003-02-05 10:36 ` Andreas Schwab @ 2003-02-05 15:30 ` Martin J. Bligh 1 sibling, 0 replies; 84+ messages in thread From: Martin J. Bligh @ 2003-02-05 15:30 UTC (permalink / raw) To: vda, Herman Oosthuysen; +Cc: linux-kernel, lse-tech > GCC already went this way, i.e. it aligns functions and loops by > ridiculous (IMHO) amounts like 16 bytes. That's 7.5 bytes per alignment > on average. Now count lk functions and loops and mourn for lost icache. > Or just disassemble any .o module and read the damn code. > > This is the primary reason why people report larger kernels for GCC 3.x > > I am damn sure that if you compile with less sadistic alignment > you will get smaller *and* faster kernel. There's only one real way to know that. Do it, test it. M. ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: gcc 2.95 vs 3.21 performance 2003-02-04 19:35 ` John Bradford 2003-02-04 19:44 ` Dave Jones @ 2003-02-04 21:38 ` Linus Torvalds 2003-02-04 21:54 ` John Bradford ` (2 more replies) 1 sibling, 3 replies; 84+ messages in thread From: Linus Torvalds @ 2003-02-04 21:38 UTC (permalink / raw) To: linux-kernel In article <200302041935.h14JZ69G002675@darkstar.example.net>, John Bradford <john@grabjohn.com> wrote: >> I'm hesitant to enter into this. But from my own experience >> the issue with big companies supporting this sort of change >> in gcc has more to do with the acceptance process of changes >> into gcc than a lack of desire on the large companies' part. > >Maybe we should create a KGCC fork, optimise it for kernel >compilations, then try to get our changes merged back in to GCC >mainline at a later date. That's not really the problem. I think the problem with gcc is that many of the developers are actually much more interested in Ada or C++ (or even Fortran!), than in plain old-fashioned C. So it's not a kernel issue per se, gcc is slow to compile _any_ C project. And a lot of the optimizations gcc does aren't even interesting to most C projects. Most "old-fashioned" C projects tend to be written in ways that mean that the most important optimizations are the truly trivial ones, and then doing good register allocation. I'd love to see a small - and fast - C compiler, and I'd be willing to make kernel changes to make it work with it. Let's see. There's been some noises on the gcc lists about splitting up the languages for easier maintenance, we'll see what happens. Linus ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: gcc 2.95 vs 3.21 performance 2003-02-04 21:38 ` Linus Torvalds @ 2003-02-04 21:54 ` John Bradford 2003-02-04 22:11 ` Linus Torvalds 2003-02-04 23:21 ` Larry McVoy 2003-02-07 16:09 ` Pavel Machek 2 siblings, 1 reply; 84+ messages in thread From: John Bradford @ 2003-02-04 21:54 UTC (permalink / raw) To: Linus Torvalds; +Cc: linux-kernel > I'd love to see a small - and fast - C compiler, and I'd be willing to > make kernel changes to make it work with it. How IA-32 centric would your preferred compiler choice be? In other words, if a small and fast C compiler turns up, which lacks support for some architectures the kernel is currently ported to, are you likely to encourage kernel changes which will make it difficult for the other architectures that have to stay with GCC to keep up? John. ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: gcc 2.95 vs 3.21 performance 2003-02-04 21:54 ` John Bradford @ 2003-02-04 22:11 ` Linus Torvalds 2003-02-04 23:27 ` Timothy D. Witham 0 siblings, 1 reply; 84+ messages in thread From: Linus Torvalds @ 2003-02-04 22:11 UTC (permalink / raw) To: John Bradford; +Cc: linux-kernel On Tue, 4 Feb 2003, John Bradford wrote: > > I'd love to see a small - and fast - C compiler, and I'd be willing to > > make kernel changes to make it work with it. > > How IA-32 centric would your preferred compiler choice be? In other > words, if a small and fast C compiler turns up, which lacks support > for some architectures the kernel is currently ported to, are you likely to > encourage kernel changes which will make it difficult for the other > architectures that have to stay with GCC to keep up? I don't think being architecture-specific is necessarily a bad thing in compilers, although most compiler writers obviously try to avoid it. The kernel shouldn't really care: it does want to have a compiler with support for inline functions, but other than that it's fairly close to ANSI C. Yes, I know we use a _lot_ of gcc extensions (inline asms, variadic macros etc), but that's at least partly because there simply aren't any really viable alternatives to gcc, so we've had no incentives to abstract any of that out. So the gcc'isms aren't really fundamental per se. Although, quite frankly, even inline asms are pretty much a "standard" thing for any reasonable C compiler (since C is often used for things that really want it), and the main issue tends to be the exact syntax rather than anything else. So I don't think I'd like to use a compiler that is _so_ limited that it doesn't have some support for something like that. I certainly would refuse to use a C compiler that didn't support inline functions. Linus ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: gcc 2.95 vs 3.21 performance 2003-02-04 22:11 ` Linus Torvalds @ 2003-02-04 23:27 ` Timothy D. Witham 0 siblings, 0 replies; 84+ messages in thread From: Timothy D. Witham @ 2003-02-04 23:27 UTC (permalink / raw) To: Linus Torvalds; +Cc: John Bradford, linux-kernel If needed, we could build this compiler's tree into our testing process (PLM/STP), so that patches or changes could be automatically tested against a matrix of kernels and hardware configurations on different regression and stress tests. Tim On Tue, 2003-02-04 at 14:11, Linus Torvalds wrote: > On Tue, 4 Feb 2003, John Bradford wrote: > > > I'd love to see a small - and fast - C compiler, and I'd be willing to > > > make kernel changes to make it work with it. > > > > How IA-32 centric would your preferred compiler choice be? In other > > words, if a small and fast C compiler turns up, which lacks support > > for some architectures the kernel is currently ported to, are you likely to > > encourage kernel changes which will make it difficult for the other > > architectures that have to stay with GCC to keep up? > > I don't think being architecture-specific is necessarily a bad thing in > compilers, although most compiler writers obviously try to avoid it. > > The kernel shouldn't really care: it does want to have a compiler with > support for inline functions, but other than that it's fairly close to > ANSI C. > > Yes, I know we use a _lot_ of gcc extensions (inline asms, variadic macros > etc), but that's at least partly because there simply aren't any really > viable alternatives to gcc, so we've had no incentives to abstract any of > that out. > > So the gcc'isms aren't really fundamental per se. Although, quite frankly, > even inline asms are pretty much a "standard" thing for any reasonable C > compiler (since C is often used for things that really want it), and the > main issue tends to be the exact syntax rather than anything else. 
So I > don't think I'd like to use a compiler that is _so_ limited that it > doesn't have some support for something like that. I certainly would > refuse to use a C compiler that didn't support inline functions. > > Linus -- Timothy D. Witham - Lab Director - wookie@osdlab.org Open Source Development Lab Inc - A non-profit corporation 15275 SW Koll Parkway - Suite H - Beaverton OR, 97006 (503)-626-2455 x11 (office) (503)-702-2871 (cell) (503)-626-2436 (fax) ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: gcc 2.95 vs 3.21 performance 2003-02-04 21:38 ` Linus Torvalds 2003-02-04 21:54 ` John Bradford @ 2003-02-04 23:21 ` Larry McVoy 2003-02-04 23:42 ` b_adlakha ` (4 more replies) 2003-02-07 16:09 ` Pavel Machek 2 siblings, 5 replies; 84+ messages in thread
From: Larry McVoy @ 2003-02-04 23:21 UTC (permalink / raw)
To: Linus Torvalds; +Cc: linux-kernel

> I'd love to see a small - and fast - C compiler, and I'd be willing to
> make kernel changes to make it work with it.

I can't offer any immediate help with this but I want the same thing. At
some point, we're planning on funding some extensions into GCC or whatever
reasonable C compiler is around:

    - associative arrays as a builtin type

        {
                assoc   bar = {};       // anonymous, no file backing

                bar{"some key"} = "some value";
                if (defined(bar{"some other value"})) ...
        }

    - regular expressions

        {
                char    *foo = "blech";

                if (foo =~ /regex are nice/) {
                        printf("Well isn't that special?\n");
                }
        }

    - tk bindings built in

and then we'll port BK to that compiler. It's likely to be GCC because we
want to support all the different architectures but if a kernel sponsored
cc shows up we'll happily throw money at that.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: gcc 2.95 vs 3.21 performance 2003-02-04 23:21 ` Larry McVoy @ 2003-02-04 23:42 ` b_adlakha 2003-02-05 0:19 ` Andy Pfiffer 2003-02-04 23:51 ` Jakob Oestergaard ` (3 subsequent siblings) 4 siblings, 1 reply; 84+ messages in thread From: b_adlakha @ 2003-02-04 23:42 UTC (permalink / raw) To: linux-kernel >> I'd love to see a small - and fast - C compiler, and I'd be willing to >> make kernel changes to make it work with it. tcc looks like a cool project to me... It's small enough to be distributed through this mailing list! and the "C scripts" look like a cool feature... ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: gcc 2.95 vs 3.21 performance 2003-02-04 23:42 ` b_adlakha @ 2003-02-05 0:19 ` Andy Pfiffer 0 siblings, 0 replies; 84+ messages in thread From: Andy Pfiffer @ 2003-02-05 0:19 UTC (permalink / raw) To: b_adlakha; +Cc: linux-kernel On Tue, 2003-02-04 at 15:42, b_adlakha@softhome.net wrote: > >> I'd love to see a small - and fast - C compiler, and I'd be willing to > >> make kernel changes to make it work with it. > > tcc looks like a cool project to me... > It's small enough to be distributed through this mailing list! Don't overlook lcc -- last I knew most users were using GNU's cpp, but other than that, it is available for the curious: http://www.cs.princeton.edu/software/lcc/ ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: gcc 2.95 vs 3.21 performance 2003-02-04 23:21 ` Larry McVoy 2003-02-04 23:42 ` b_adlakha @ 2003-02-04 23:51 ` Jakob Oestergaard 2003-02-05 1:03 ` Hugo Mills 2003-02-10 22:26 ` Andrea Arcangeli 2003-02-04 23:51 ` Eli Carter ` (2 subsequent siblings) 4 siblings, 2 replies; 84+ messages in thread From: Jakob Oestergaard @ 2003-02-04 23:51 UTC (permalink / raw) To: Larry McVoy, linux-kernel On Tue, Feb 04, 2003 at 03:21:01PM -0800, Larry McVoy wrote: > > I'd love to see a small - and fast - C compiler, and I'd be willing to > > make kernel changes to make it work with it. > > I can't offer any immediate help with this but I want the same thing. At > some point, we're planning on funding some extensions into GCC or whatever > reasonable C compiler is around: [snipping Linus from To:] Cool. > > - associative arrays as a builtin type > > { > assoc bar = {}; // anonymous, no file backing > > bar{"some key"} = "some value"; > if (defined(bar{"some other value"})) ... > } Allow me: { std::map<std::string,std::string> bar; bar["some key"] = "some value"; if (bar.find("some other value") != bar.end()) ... } Works beautifully, all you need is to pick the existing language which allows for the existing standard library which already provide that functionality. I doubt there's much need for a C+ or C 2+/3 language variant ;) > > - regular expressions > > { > char *foo = "blech"; > > if (foo =~ /regex are nice/) { > printf("Well isn't that special?\n"); > } > } Ok, I can't help you with that. You have probably seen a Perl program before... Now imagine a two million line Perl program... That is why the above is not a good idea ;) It's still your right to want it of course... > > - tk bindings built in Built into the language (not a library)? <sarcasm> Then I'd want the compiler in a kernel module ;) </> > and then we'll port BK to that compiler. 
It's likely to be GCC because we > want to support all the different architectures but if a kernel sponsored > cc shows up we'll happily throw money at that. If you look at http://www.codesourcery.com, you can see that there really are some people who do GCC extensions or optimizations for money - various institutions have funded additions to GCC this way. It's a cool idea - I have a few things I'd like my company to fund as well... Some time in the future... Unless someone beats us to it. -- ................................................................ : jakob@unthought.net : And I see the elder races, : :.........................: putrid forms of man : : Jakob Østergaard : See him rise and claim the earth, : : OZ9ABN : his downfall is at hand. : :.........................:............{Konkhra}...............: ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: gcc 2.95 vs 3.21 performance 2003-02-04 23:51 ` Jakob Oestergaard @ 2003-02-05 1:03 ` Hugo Mills 2003-02-10 22:26 ` Andrea Arcangeli 1 sibling, 0 replies; 84+ messages in thread
From: Hugo Mills @ 2003-02-05 1:03 UTC (permalink / raw)
To: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2166 bytes --]

On Wed, Feb 05, 2003 at 12:51:12AM +0100, Jakob Oestergaard wrote:
> On Tue, Feb 04, 2003 at 03:21:01PM -0800, Larry McVoy wrote:
> > I can't offer any immediate help with this but I want the same thing. At
> > some point, we're planning on funding some extensions into GCC or whatever
> > reasonable C compiler is around:
> >
> > - regular expressions
> >
> > {
> > char *foo = "blech";
> >
> > if (foo =~ /regex are nice/) {
> > printf("Well isn't that special?\n");
> > }
> > }
>
> Ok, I can't help you with that.

I wanted something like that a while ago, so I wrote a couple of
classes in C++ to handle regexps. Some of the test code looks like this:

    string str = "fum foo";
    rejex exp("f(o*)");

    // Search for a regex
    if( s/exp )
        cout << "Found it!" << endl;

    // Count matches
    cout << s/exp << " matches" << endl;

    replace rep("g$0");

    // Search & replace
    str/exp/rep;
    cout << s << endl;

    // All in one
    "foo bar"/rejex("ba")/replace();

It's not perfect by any stretch of the imagination, but it works.
I've not released it, because I haven't had a chance to get it into a
releasable form yet. Actually, looking at it, I should probably play a
couple of tricks with overloading operators to give you instead

    str =~ search/replace;

or even

    "str" =~ "search"/"replace";

> You have probably seen a Perl program before... Now imagine a two
> million line Perl program... That is why the above is not a good idea ;)
>
> It's still your right to want it of course...

That's a good point, but I've always felt that the main problem
with perl isn't the regexes, but the rest of the language(*).

Hugo.
(*) Some may feel that, coming from a C++ programmer, this is a case of the pot calling the kettle black. :) -- === Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk === PGP key: 1C335860 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk --- Our so-called leaders speak/with words they try to jail ya/ --- They subjugate the meek/but it's the rhetoric of failure. [-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: gcc 2.95 vs 3.21 performance 2003-02-04 23:51 ` Jakob Oestergaard 2003-02-05 1:03 ` Hugo Mills @ 2003-02-10 22:26 ` Andrea Arcangeli 2003-02-10 23:28 ` J.A. Magallon 1 sibling, 1 reply; 84+ messages in thread From: Andrea Arcangeli @ 2003-02-10 22:26 UTC (permalink / raw) To: Jakob Oestergaard, Larry McVoy, linux-kernel On Wed, Feb 05, 2003 at 12:51:12AM +0100, Jakob Oestergaard wrote: > On Tue, Feb 04, 2003 at 03:21:01PM -0800, Larry McVoy wrote: > > > I'd love to see a small - and fast - C compiler, and I'd be willing to > > > make kernel changes to make it work with it. > > > > I can't offer any immediate help with this but I want the same thing. At > > some point, we're planning on funding some extensions into GCC or whatever > > reasonable C compiler is around: > > [snipping Linus from To:] > > Cool. > > > > > - associative arrays as a builtin type > > > > { > > assoc bar = {}; // anonymous, no file backing > > > > bar{"some key"} = "some value"; > > if (defined(bar{"some other value"})) ... > > } > > Allow me: > > { > std::map<std::string,std::string> bar; > > bar["some key"] = "some value"; > if (bar.find("some other value") != bar.end()) ... > } Indeed. Hardcoding map and multimap templates with string,string parameter in the language sounds like a very worthless effort. If he wants a high-level syntax on top of the abstractions he should use a more high-level language. C can do everything but it's going to be a syntax like what we do in the kernel, with lists, rbtrees, structures of pointers to functions etc.. > Works beautifully, all you need is to pick the existing language which > allows for the existing standard library which already provide that > functionality. > > I doubt there's much need for a C+ or C 2+/3 language variant ;) > > > > > - regular expressions > > > > { > > char *foo = "blech"; > > > > if (foo =~ /regex are nice/) { > > printf("Well isn't that special?\n"); > > } > > } > > Ok, I can't help you with that. 
> > You have probably seen a Perl program before... Now imagine a two > million line Perl program... That is why the above is not a good idea ;) actually the python syntax for re is quite nice, and would be pretty compatible with C, no magic perl =~ operator etc.. again a library like STL in a high-level language would do the trick just fine. > > It's still your right to want it of course... > > > > > - tk bindings built in > > Built into the language (not a library)? Oh my. > > <sarcasm> > Then I'd want the compiler in a kernel module ;) > </> then I want insmod kde.o too ;) Andrea ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: gcc 2.95 vs 3.21 performance
  2003-02-10 22:26 ` Andrea Arcangeli
@ 2003-02-10 23:28 ` J.A. Magallon
  0 siblings, 0 replies; 84+ messages in thread
From: J.A. Magallon @ 2003-02-10 23:28 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: Jakob Oestergaard, Larry McVoy, linux-kernel

On 2003.02.10 Andrea Arcangeli wrote:
> On Wed, Feb 05, 2003 at 12:51:12AM +0100, Jakob Oestergaard wrote:
> > On Tue, Feb 04, 2003 at 03:21:01PM -0800, Larry McVoy wrote:
> > > > I'd love to see a small - and fast - C compiler, and I'd be willing to
> > > > make kernel changes to make it work with it.
> > >
> > > I can't offer any immediate help with this but I want the same thing. At
> > > some point, we're planning on funding some extensions into GCC or whatever
> > > reasonable C compiler is around:
> >
> > [snipping Linus from To:]
> >
> > Cool.
> >
> > >
> > > - associative arrays as a builtin type
> > >
> > > {
> > >     assoc bar = {}; // anonymous, no file backing
> > >
> > >     bar{"some key"} = "some value";
> > >     if (defined(bar{"some other value"})) ...
> > > }
> >
> > Allow me:
> >
> > {
> >     std::map<std::string,std::string> bar;
> >
> >     bar["some key"] = "some value";
> >     if (bar.find("some other value") != bar.end()) ...
> > }
>

And don't forget smart pointers with reference counting so you can get
rid of all those stupid kfree's... ;)

--
J.A. Magallon <jamagallon@able.es>      \  Software is like sex:
werewolf.able.es                         \  It's better when it's free
Mandrake Linux release 9.1 (Cooker) for i586
Linux 2.4.21-pre4-jam1 (gcc 3.2.1 (Mandrake Linux 9.1 3.2.1-5mdk))

^ permalink raw reply [flat|nested] 84+ messages in thread
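Reference counting does not strictly need smart pointers; plain C can do it by hand, at the cost of explicit get/put calls. A minimal single-threaded sketch follows. The names struct buf, buf_new, buf_get and buf_put are invented for illustration and are not kernel APIs; a real kernel version would need atomic counters.

```c
#include <stdlib.h>

/* Hypothetical reference-counted buffer (not thread-safe). */
struct buf {
    unsigned refs;   /* number of live references */
    size_t   len;    /* payload size in bytes */
    char     data[]; /* payload, allocated together with the header */
};

/* Allocate a zeroed buffer holding one reference; NULL on failure. */
struct buf *buf_new(size_t len)
{
    struct buf *b = calloc(1, sizeof *b + len);

    if (b) {
        b->refs = 1;
        b->len = len;
    }
    return b;
}

/* Take an additional reference. */
struct buf *buf_get(struct buf *b)
{
    b->refs++;
    return b;
}

/* Drop a reference; returns 1 if this call freed the buffer, else 0. */
int buf_put(struct buf *b)
{
    if (--b->refs > 0)
        return 0;
    free(b);
    return 1;
}
```

The explicit free() is still there, but it lives in exactly one place instead of at every call site.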
* Re: gcc 2.95 vs 3.21 performance 2003-02-04 23:21 ` Larry McVoy 2003-02-04 23:42 ` b_adlakha 2003-02-04 23:51 ` Jakob Oestergaard @ 2003-02-04 23:51 ` Eli Carter 2003-02-05 0:27 ` Larry McVoy 2003-02-05 3:03 ` Tomas Szepe 2003-02-05 6:03 ` Mark Mielke 4 siblings, 1 reply; 84+ messages in thread From: Eli Carter @ 2003-02-04 23:51 UTC (permalink / raw) To: Larry McVoy; +Cc: Linus Torvalds, linux-kernel Larry McVoy wrote: >>I'd love to see a small - and fast - C compiler, and I'd be willing to >>make kernel changes to make it work with it. > > > I can't offer any immediate help with this but I want the same thing. At > some point, we're planning on funding some extensions into GCC or whatever > reasonable C compiler is around: > > - associative arrays as a builtin type [snip] > - regular expressions [snip] > - tk bindings built in > > and then we'll port BK to that compiler. It's likely to be GCC because we > want to support all the different architectures but if a kernel sponsered > cc shows up we'll happily throw money at that. Ok, dumb, (and probably flamebait) question time: I read your list and thought "In C? Why not Python?" I'm guessing speed issues? Eli --------------------. "If it ain't broke now, Eli Carter \ it will be soon." -- crypto-gram eli.carter(a)inet.com `------------------------------------------------- ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: gcc 2.95 vs 3.21 performance
  2003-02-04 23:51 ` Eli Carter
@ 2003-02-05  0:27 ` Larry McVoy
  2003-02-06 20:42   ` Paul Jakma
  0 siblings, 1 reply; 84+ messages in thread
From: Larry McVoy @ 2003-02-05 0:27 UTC (permalink / raw)
  To: Eli Carter; +Cc: Larry McVoy, Linus Torvalds, linux-kernel

> Ok, dumb, (and probably flamebait) question time: I read your list and
> thought "In C? Why not Python?" I'm guessing speed issues?

Scripting languages are unacceptable for products. Flat out unacceptable.
I spoke to Chip when he was running the perl effort; his answer was "if
you are worried about new releases of perl breaking your scripts, ship
your own version of perl". I spoke with Guido or some other Python
luminary and he said the same thing.

For something which a company has to support, it needs to be a compiled
language with fairly minimal dependencies. Otherwise the customer upgrades
and the tool breaks.

Don't get me wrong, I love perl (well, perl 4, perl 5 got a bit weird for
my tastes but some people seem to like it) and python looks cool as well.
They are great for prototyping but they are just useless as an application
platform. Our support costs would be through the roof.

Before the inevitable flameage, please consider that we have to support
people who insist on using all sorts of weird things. Richard Gooch
maintains his own a.out based linux distribution, for example. Do we get
to tell him to upgrade? Nope. And it just gets worse from there.
--
---
Larry McVoy              lm at bitmover.com          http://www.bitmover.com/lm

^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: gcc 2.95 vs 3.21 performance
  2003-02-05  0:27 ` Larry McVoy
@ 2003-02-06 20:42 ` Paul Jakma
  0 siblings, 0 replies; 84+ messages in thread
From: Paul Jakma @ 2003-02-06 20:42 UTC (permalink / raw)
  To: Larry McVoy; +Cc: Eli Carter, Linux Kernel

On Tue, 4 Feb 2003, Larry McVoy wrote:

> Scripting languages are unacceptable for products. Flat out unacceptable.
> I spoke to Chip when he was running the perl effort, his answer was "if
> you are worried about new releases of perl breaking your scripts, ship
> your own version of perl".

There is a perl compiler, perlcc, but it's not perfect. Why not fund it
to have it made perfect? Then you get the best of all worlds: perl and
interpretation at run time for developers, and the ability to ship binary
files to customers.

regards,
--
Paul Jakma	Sys Admin	Alphyra
	paulj@alphyra.ie
Warning: /never/ send email to spam@dishone.st or trap@dishone.st

^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: gcc 2.95 vs 3.21 performance 2003-02-04 23:21 ` Larry McVoy ` (2 preceding siblings ...) 2003-02-04 23:51 ` Eli Carter @ 2003-02-05 3:03 ` Tomas Szepe 2003-02-05 6:03 ` Mark Mielke 4 siblings, 0 replies; 84+ messages in thread From: Tomas Szepe @ 2003-02-05 3:03 UTC (permalink / raw) To: Larry McVoy, Linus Torvalds, linux-kernel > [lm@bitmover.com] > > I can't offer any immediate help with this but I want the same thing. At > some point, we're planning on funding some extensions into GCC or whatever > reasonable C compiler is around: > > - associative arrays as a builtin type > - regular expressions > - tk bindings built in Is it April 1st already? I can't see why this should be a language extension other than you want to make a real mess out of it. > and then we'll port BK to that compiler. It's likely to be GCC because we > want to support all the different architectures but if a kernel sponsered > cc shows up we'll happily throw money at that. Ever heard of glib? #include <glib.h> and be done with it. -- Tomas Szepe <szepe@pinerecords.com> ^ permalink raw reply [flat|nested] 84+ messages in thread
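For the associative-array item, a library really is enough. glib's hash table is the off-the-shelf route suggested above; to stay dependency-free, here is a tiny hand-rolled sketch in plain C. Everything in it (struct assoc, assoc_set, assoc_get, the fixed 64-slot capacity) is hypothetical and chosen for brevity: keys and values are not copied, and the table never grows.

```c
#include <stddef.h>
#include <string.h>

#define ASSOC_SLOTS 64              /* fixed capacity, an assumption */

struct assoc {
    const char *key[ASSOC_SLOTS];   /* NULL marks an empty slot */
    const char *val[ASSOC_SLOTS];
};

/* Simple multiplicative string hash reduced to a slot index. */
static size_t assoc_hash(const char *s)
{
    size_t h = 5381;

    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h % ASSOC_SLOTS;
}

/*
 * Linear probing: return the slot that holds `key`, or the first empty
 * slot on its probe path, or ASSOC_SLOTS if the table is full.
 */
static size_t assoc_find(const struct assoc *a, const char *key)
{
    size_t i = assoc_hash(key);
    size_t n;

    for (n = 0; n < ASSOC_SLOTS; n++, i = (i + 1) % ASSOC_SLOTS) {
        if (a->key[i] == NULL || strcmp(a->key[i], key) == 0)
            return i;
    }
    return ASSOC_SLOTS;
}

/* Insert or overwrite; returns 0 on success, -1 if the table is full. */
int assoc_set(struct assoc *a, const char *key, const char *val)
{
    size_t i = assoc_find(a, key);

    if (i == ASSOC_SLOTS)
        return -1;
    a->key[i] = key;
    a->val[i] = val;
    return 0;
}

/* Returns the stored value, or NULL if `key` is not present. */
const char *assoc_get(const struct assoc *a, const char *key)
{
    size_t i = assoc_find(a, key);

    if (i == ASSOC_SLOTS || a->key[i] == NULL)
        return NULL;
    return a->val[i];
}
```

With this, `bar{"some key"} = "some value"` becomes `assoc_set(&bar, "some key", "some value")`, and `defined(bar{...})` becomes a NULL check on `assoc_get()`.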
* Re: gcc 2.95 vs 3.21 performance 2003-02-04 23:21 ` Larry McVoy ` (3 preceding siblings ...) 2003-02-05 3:03 ` Tomas Szepe @ 2003-02-05 6:03 ` Mark Mielke 4 siblings, 0 replies; 84+ messages in thread From: Mark Mielke @ 2003-02-05 6:03 UTC (permalink / raw) To: Larry McVoy, Linus Torvalds, linux-kernel On Tue, Feb 04, 2003 at 03:21:01PM -0800, Larry McVoy wrote: > > I'd love to see a small - and fast - C compiler, and I'd be willing to > > make kernel changes to make it work with it. > I can't offer any immediate help with this but I want the same thing. At > some point, we're planning on funding some extensions into GCC or whatever > reasonable C compiler is around: > - associative arrays as a builtin type > - regular expressions > - tk bindings built in What is the problem with C++ or objective C? I doubt that the GCC people would accept these sort of additions, even if complete. mark -- mark@mielke.cc/markm@ncf.ca/markm@nortelnetworks.com __________________________ . . _ ._ . . .__ . . ._. .__ . . . .__ | Neighbourhood Coder |\/| |_| |_| |/ |_ |\/| | |_ | |/ |_ | | | | | | \ | \ |__ . | | .|. |__ |__ | \ |__ | Ottawa, Ontario, Canada One ring to rule them all, one ring to find them, one ring to bring them all and in the darkness bind them... http://mark.mielke.cc/ ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: gcc 2.95 vs 3.21 performance 2003-02-04 21:38 ` Linus Torvalds 2003-02-04 21:54 ` John Bradford 2003-02-04 23:21 ` Larry McVoy @ 2003-02-07 16:09 ` Pavel Machek 2 siblings, 0 replies; 84+ messages in thread From: Pavel Machek @ 2003-02-07 16:09 UTC (permalink / raw) To: Linus Torvalds; +Cc: linux-kernel Hi! > >> I'm hesitant to enter into this. But from my own experience > >> the issue with big companies supporting these sort of changes > >> in gcc have more to do with the acceptance process of changes > >> into gcc than a lack of desire on the large companies part. > > > >Maybe we should create a KGCC fork, optimise it for kernel > >complilations, then try to get our changes merged back in to GCC > >mainline at a later date. > > That's not really the problem. > > I think the problem with gcc is that many of the developers are actually > much more interested in Ada or C++ (or even Fortran!), than in plain > old-fashioned C. So it's not a kernel issue per se, gcc is slow to > compile _any_ C project. > > And a lot of the optimizations gcc does aren't even interesting to most > C projects. Most "old-fashioned" C projects tend to be written in ways > that mean that the most important optimizations are the truly trivial > ones, and then doing good register allocation. > > I'd love to see a small - and fast - C compiler, and I'd be willing to > make kernel changes to make it work with it. What about gcc-1.4 or something like that? If you go back in time, you'll find gcc is getting smaller and faster ;-). Actually making kernel compile with gcc-2.7.2 should make it few times faster than gcc-3.2... Pavel -- Worst form of spam? Adding advertisment signatures ala sourceforge.net. What goes next? Inserting advertisment *into* email? ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: gcc 2.95 vs 3.21 performance 2003-02-03 23:31 ` Richard B. Johnson 2003-02-04 0:43 ` J.A. Magallon 2003-02-04 6:54 ` Denis Vlasenko @ 2003-02-04 10:57 ` Padraig 2003-02-04 13:11 ` Helge Hafting 2 siblings, 1 reply; 84+ messages in thread From: Padraig @ 2003-02-04 10:57 UTC (permalink / raw) To: root; +Cc: linux-kernel [-- Attachment #1: Type: text/plain, Size: 1156 bytes --] Richard B. Johnson wrote: > On Mon, 3 Feb 2003, Martin J. Bligh wrote: > >>People keep extolling the virtues of gcc 3.2 to me, which I'm >>reluctant to switch to, since it compiles so much slower. But >>it supposedly generates better code, so I thought I'd compile >>the kernel with both and compare the results. This is gcc 2.95 >>and 3.2.1 from debian unstable on a 16-way NUMA-Q. The kernbench >>tests still use 2.95 for the compile-time stuff. >> > > [SNIPPED tests...] > > Don't let this get out, but egcs-2.91.66 compiled FFT code > works about 50 percent of the speed of whatever M$ uses for > Visual C++ Version 6.0 Interesting. I just noticed that I get 50% decrease in the speed of my program if I just insert a printf(). I.E. my program is like: printf() for(;;) { do_sorting_loop_test(); } If I remove the initial printf it doubles in speed? I assume this is some weird caching thing? gcc is 3.2.1 (same happens for 2.95..) <boggle> Note this is with -O3. If I don't specify -O then leaving the printf in speeds things up by about 15% </boggle> attached is the assembly for the slow and fast in case anyone's interested. Pádraig. 
[-- Attachment #2: slow.s --] [-- Type: text/plain, Size: 4466 bytes --] .file "testfunc.c" .globl TEST_NUMBER .data .align 2 .type TEST_NUMBER,@object .size TEST_NUMBER,2 TEST_NUMBER: .value 256 .globl count .align 4 .type count,@object .size count,4 count: .long 0 .globl exit_flag .align 4 .type exit_flag,@object .size exit_flag,4 exit_flag: .long 0 .align 4 .type throttle_print.0,@object .size throttle_print.0,4 throttle_print.0: .long 0 .section .rodata.str1.1,"aMS",@progbits,1 .LC0: .string "\033[H\033[2J" .section .rodata.str1.32,"aMS",@progbits,1 .align 32 .LC3: .string "\nAdding & dropping random array elements,(from a set of 000..%03u)\n" .section .rodata.str1.1 .LC4: .string "Ctrl C to exit" .section .rodata.str1.32 .align 32 .LC1: .string "\n%lu array elements randomly dropped and added in %lus" .align 32 .LC2: .string " (%lu/s)\n \n" .text .p2align 2,,3 .globl main .type main,@function main: pushl %ebp movl %esp, %ebp pushl %edi pushl %esi pushl %ebx subl $12, %esp andl $-16, %esp cmpl $1, 8(%ebp) movl $1, %edi jle .L2 pushl $0 pushl $10 pushl $0 movl 12(%ebp), %eax pushl 4(%eax) call __strtol_internal addl $16, %esp testl %eax, %eax jle .L2 movw %ax, TEST_NUMBER .L2: subl $12, %esp pushl $.LC0 call printf popl %eax pushl stdout call fflush movzwl TEST_NUMBER, %edx sall $1, %edx movl %edx, (%esp) call malloc movl %eax, %esi movl $0, (%esp) call time popl %ebx movl %eax, start popl %eax pushl $exit_info_sig pushl $2 call signal xorl %edx, %edx movw TEST_NUMBER, %cx addl $16, %esp cmpw %cx, %dx jae .L24 .L10: movzwl %dx, %ebx movw %dx, (%esi,%ebx,2) incl %edx cmpw %cx, %dx jb .L10 .p2align 2,,3 .L24: incl count call rand movw TEST_NUMBER, %bx movzwl %bx, %edx movl %edx, %ecx cltd idivl %ecx cmpw %bx, %dx movl %edx, %ecx jae .L27 .p2align 2,,3 .L18: movzwl %cx, %edx incl %ecx movw (%esi,%edx,2), %ax cmpw %bx, %cx movw %ax, -2(%esi,%edx,2) jb .L18 .L27: leal -1(%ebx), %ecx subl $8, %esp movzwl %cx, %edx pushl %edx pushl %esi call GetLowestValueAvailable 
movzwl TEST_NUMBER, %edx movw %ax, -2(%esi,%edx,2) movl exit_flag, %eax addl $16, %esp testl %eax, %eax jne .L28 testl %edi, %edi je .L24 subl $8, %esp leal -1(%edx), %ebx pushl %ebx pushl $.LC3 call printf xorl %edi, %edi movl $.LC4, (%esp) call puts addl $16, %esp jmp .L24 .L28: subl $12, %esp pushl $0 call time movl %eax, %esi addl $12, %esp subl start, %esi pushl %esi pushl count pushl $.LC1 call printf popl %eax popl %edx movl count, %eax xorl %edx, %edx divl %esi pushl %eax pushl $.LC2 call printf movl $1, (%esp) call exit .Lfe1: .size main,.Lfe1-main .p2align 2,,3 .globl RemoveNumber .type RemoveNumber,@function RemoveNumber: pushl %ebp movl %esp, %ebp movl 12(%ebp), %ecx cmpw TEST_NUMBER, %cx pushl %ebx movl 8(%ebp), %ebx jae .L69 .p2align 2,,3 .L67: movzwl %cx, %edx movw (%ebx,%edx,2), %ax movw %ax, -2(%ebx,%edx,2) incl %ecx cmpw TEST_NUMBER, %cx jb .L67 .L69: popl %ebx leave ret .Lfe2: .size RemoveNumber,.Lfe2-RemoveNumber .section .rodata.str1.1 .LC5: .string "\033[H" .LC6: .string "%03d " .text .p2align 2,,3 .globl printArray .type printArray,@function printArray: pushl %ebp movl %esp, %ebp pushl %esi pushl %ebx subl $12, %esp pushl $.LC5 movl 8(%ebp), %esi call printf popl %eax pushl stdout xorl %ebx, %ebx call fflush addl $16, %esp cmpw TEST_NUMBER, %bx jb .L75 .L77: leal -8(%ebp), %esp popl %ebx popl %esi leave ret .p2align 2,,3 .L75: movzwl %bx, %ecx subl $8, %esp movzwl (%esi,%ecx,2), %edx pushl %edx pushl $.LC6 incl %ebx call printf addl $16, %esp cmpw TEST_NUMBER, %bx jb .L75 jmp .L77 .Lfe3: .size printArray,.Lfe3-printArray .p2align 2,,3 .globl exit_info .type exit_info,@function exit_info: pushl %ebp movl %esp, %ebp pushl %ebx subl $16, %esp pushl $0 call time movl %eax, %ebx addl $12, %esp subl start, %ebx pushl %ebx pushl count pushl $.LC1 call printf popl %eax popl %edx movl count, %eax xorl %edx, %edx divl %ebx pushl %eax pushl $.LC2 call printf movl $1, (%esp) call exit .Lfe4: .size exit_info,.Lfe4-exit_info .p2align 2,,3 .globl 
exit_info_sig .type exit_info_sig,@function exit_info_sig: pushl %ebp movl %esp, %ebp movl $1, exit_flag leave ret .Lfe5: .size exit_info_sig,.Lfe5-exit_info_sig .comm start,4,4 .ident "GCC: (GNU) 3.2.1 20021207 (Red Hat Linux 8.0 3.2.1-2)" [-- Attachment #3: fast.s --] [-- Type: text/plain, Size: 4339 bytes --] .file "testfunc.c" .globl TEST_NUMBER .data .align 2 .type TEST_NUMBER,@object .size TEST_NUMBER,2 TEST_NUMBER: .value 256 .globl count .align 4 .type count,@object .size count,4 count: .long 0 .globl exit_flag .align 4 .type exit_flag,@object .size exit_flag,4 exit_flag: .long 0 .align 4 .type throttle_print.0,@object .size throttle_print.0,4 throttle_print.0: .long 0 .section .rodata.str1.32,"aMS",@progbits,1 .align 32 .LC2: .string "\nAdding & dropping random array elements,(from a set of 000..%03u)\n" .section .rodata.str1.1,"aMS",@progbits,1 .LC3: .string "Ctrl C to exit" .section .rodata.str1.32 .align 32 .LC0: .string "\n%lu array elements randomly dropped and added in %lus" .align 32 .LC1: .string " (%lu/s)\n \n" .text .p2align 2,,3 .globl main .type main,@function main: pushl %ebp movl %esp, %ebp pushl %edi pushl %esi pushl %ebx subl $12, %esp andl $-16, %esp cmpl $1, 8(%ebp) movl $1, %edi jle .L2 pushl $0 pushl $10 pushl $0 movl 12(%ebp), %eax pushl 4(%eax) call __strtol_internal addl $16, %esp testl %eax, %eax jle .L2 movw %ax, TEST_NUMBER .L2: movzwl TEST_NUMBER, %edx subl $12, %esp sall $1, %edx pushl %edx call malloc movl %eax, %esi movl $0, (%esp) call time popl %ebx movl %eax, start popl %eax pushl $exit_info_sig pushl $2 call signal xorl %edx, %edx movw TEST_NUMBER, %cx addl $16, %esp cmpw %cx, %dx jae .L24 .L10: movzwl %dx, %ebx movw %dx, (%esi,%ebx,2) incl %edx cmpw %cx, %dx jb .L10 .p2align 2,,3 .L24: incl count call rand movw TEST_NUMBER, %bx movzwl %bx, %edx movl %edx, %ecx cltd idivl %ecx cmpw %bx, %dx movl %edx, %ecx jae .L27 .p2align 2,,3 .L18: movzwl %cx, %edx incl %ecx movw (%esi,%edx,2), %ax cmpw %bx, %cx movw %ax, 
-2(%esi,%edx,2) jb .L18 .L27: leal -1(%ebx), %ecx subl $8, %esp movzwl %cx, %edx pushl %edx pushl %esi call GetLowestValueAvailable movzwl TEST_NUMBER, %edx movw %ax, -2(%esi,%edx,2) movl exit_flag, %eax addl $16, %esp testl %eax, %eax jne .L28 testl %edi, %edi je .L24 subl $8, %esp leal -1(%edx), %ebx pushl %ebx pushl $.LC2 call printf xorl %edi, %edi movl $.LC3, (%esp) call puts addl $16, %esp jmp .L24 .L28: subl $12, %esp pushl $0 call time movl %eax, %esi addl $12, %esp subl start, %esi pushl %esi pushl count pushl $.LC0 call printf popl %eax popl %edx movl count, %eax xorl %edx, %edx divl %esi pushl %eax pushl $.LC1 call printf movl $1, (%esp) call exit .Lfe1: .size main,.Lfe1-main .p2align 2,,3 .globl RemoveNumber .type RemoveNumber,@function RemoveNumber: pushl %ebp movl %esp, %ebp movl 12(%ebp), %ecx cmpw TEST_NUMBER, %cx pushl %ebx movl 8(%ebp), %ebx jae .L69 .p2align 2,,3 .L67: movzwl %cx, %edx movw (%ebx,%edx,2), %ax movw %ax, -2(%ebx,%edx,2) incl %ecx cmpw TEST_NUMBER, %cx jb .L67 .L69: popl %ebx leave ret .Lfe2: .size RemoveNumber,.Lfe2-RemoveNumber .section .rodata.str1.1 .LC4: .string "\033[H" .LC5: .string "%03d " .text .p2align 2,,3 .globl printArray .type printArray,@function printArray: pushl %ebp movl %esp, %ebp pushl %esi pushl %ebx subl $12, %esp pushl $.LC4 movl 8(%ebp), %esi call printf popl %eax pushl stdout xorl %ebx, %ebx call fflush addl $16, %esp cmpw TEST_NUMBER, %bx jb .L75 .L77: leal -8(%ebp), %esp popl %ebx popl %esi leave ret .p2align 2,,3 .L75: movzwl %bx, %ecx subl $8, %esp movzwl (%esi,%ecx,2), %edx pushl %edx pushl $.LC5 incl %ebx call printf addl $16, %esp cmpw TEST_NUMBER, %bx jb .L75 jmp .L77 .Lfe3: .size printArray,.Lfe3-printArray .p2align 2,,3 .globl exit_info .type exit_info,@function exit_info: pushl %ebp movl %esp, %ebp pushl %ebx subl $16, %esp pushl $0 call time movl %eax, %ebx addl $12, %esp subl start, %ebx pushl %ebx pushl count pushl $.LC0 call printf popl %eax popl %edx movl count, %eax xorl %edx, %edx divl %ebx 
pushl %eax pushl $.LC1 call printf movl $1, (%esp) call exit .Lfe4: .size exit_info,.Lfe4-exit_info .p2align 2,,3 .globl exit_info_sig .type exit_info_sig,@function exit_info_sig: pushl %ebp movl %esp, %ebp movl $1, exit_flag leave ret .Lfe5: .size exit_info_sig,.Lfe5-exit_info_sig .comm start,4,4 .ident "GCC: (GNU) 3.2.1 20021207 (Red Hat Linux 8.0 3.2.1-2)" ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: gcc 2.95 vs 3.21 performance
  2003-02-04 10:57 ` Padraig
@ 2003-02-04 13:11 ` Helge Hafting
  2003-02-04 13:29   ` Jörn Engel
  2003-02-04 14:05   ` P
  0 siblings, 2 replies; 84+ messages in thread
From: Helge Hafting @ 2003-02-04 13:11 UTC (permalink / raw)
  To: Padraig; +Cc: linux-kernel

Padraig@Linux.ie wrote:
[...]
> Interesting. I just noticed that I get 50% decrease in
> the speed of my program if I just insert a printf(). I.E.
> my program is like:
>
> printf()
> for(;;) {
>     do_sorting_loop_test();
> }
>
> If I remove the initial printf it doubles in speed?
> I assume this is some weird caching thing?

Looks like a cacheline alignment issue to me.
This loop of yours occupies x cachelines on your cpu;
moving it in memory by adding the printf
might cause it to occupy x+1 cachelines.
That might be noticeable if x is a really small number,
such as 1.

> gcc is 3.2.1 (same happens for 2.95..)
>
> <boggle>
> Note this is with -O3. If I don't specify -O then
> leaving the printf in speeds things up by about 15%
> </boggle>

Sure - going from -O3 to -O changes code generation so
your loop code hits the cachelines differently. In this case
the printf moved the loop into better alignment.

My advice is to put your test loop in a function of its own,
and do the printing in the function that calls it.
Functions are always aligned the same (good) way so
that calling them will be fast.

You can tune the speed of your inner loop by experimenting
with the insertion of one or more NOP asms in front
of the loop. Just be aware that all such tuning is wasted once
you change anything at all in that function - you'll have to
re-do the tuning each time.

The compiler should ideally align the loops for maximum performance.
That can be hard though, considering all the different processors
that might run your program. And aligning everything optimally
could waste a _lot_ of code space - so do this only for
small loops with lots of iterations.

Helge Hafting

^ permalink raw reply [flat|nested] 84+ messages in thread
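The advice above (move the hot loop into its own function so its placement no longer depends on what precedes it in the caller) can be sketched with GCC function attributes. This is an illustration only: hot_loop is a made-up stand-in for the poster's test loop, the 64-byte figure is an assumed cacheline size, and `noinline` and `aligned` are GCC extensions rather than standard C.

```c
#include <stddef.h>

/*
 * noinline: keep the loop out of the caller, so adding or removing a
 *           printf() in the caller cannot shift the loop's address.
 * aligned:  request that the function start on a 64-byte boundary
 *           (assumed cacheline size; adjust for the target CPU).
 */
__attribute__((noinline, aligned(64)))
unsigned long hot_loop(unsigned long iterations)
{
    unsigned long acc = 0;
    unsigned long i;

    /* Placeholder work; the real test body would go here. */
    for (i = 0; i < iterations; i++)
        acc += i;
    return acc;
}
```

The caller does the printing and then calls `hot_loop(n)`. The build-wide alternative is GCC's `-falign-functions=N` and `-falign-loops=N` options, which trade some code size for the same effect everywhere.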
* Re: gcc 2.95 vs 3.21 performance 2003-02-04 13:11 ` Helge Hafting @ 2003-02-04 13:29 ` Jörn Engel 2003-02-04 14:05 ` P 1 sibling, 0 replies; 84+ messages in thread From: Jörn Engel @ 2003-02-04 13:29 UTC (permalink / raw) To: Helge Hafting; +Cc: Padraig, linux-kernel On Tue, 4 February 2003 14:11:56 +0100, Helge Hafting wrote: > > Looks like a cacheline alignment issue to me. > This loop of yours occupy x cachelines on your cpu, > moving it in memory by adding the printf > might cause it to ocupy x+1 cachelines. > That might be noticeable if x is a really small number, > such as 1. Makes a lot of sense. > My advice is to put your test loop in a function of its own, > and do the printing in the function that calls it. > functions are always aligned the same (good) way so > that calling them will be fast. > > You can tune the speed of your inner loop by experimenting > with the insertion of one or more NOP asms in front > of the loop. Just be aware that all such tuning is wasted once > you change anything at all in that function - you'll have to > re-do the tuning each time. > > The compiler should ideally align the loops for maximum performance. > That can be hard though, considering all the different processors > that might run your program. And aligning everything optimally > could waste a _lot_ of code space - so do this only for > small loops with lots of iterations. The compiler has a hard time to identify those loops that affect performance as opposed to those that are run 2-3 times. But the developer can usually profile and figure out, where those loops are. I wonder if the following would be possible. printf(); __cacheline_aligned_code; for(;;) do_sorting_loop_test(); include/linux/cache.h appears to define such for data structures, but not for code. Jörn -- ticks = jiffies; while (ticks == jiffies); ticks = jiffies; -- /usr/src/linux/init/main.c ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: gcc 2.95 vs 3.21 performance
  2003-02-04 13:11 ` Helge Hafting
  2003-02-04 13:29   ` Jörn Engel
@ 2003-02-04 14:05   ` P
  2003-02-04 20:36     ` Herman Oosthuysen
  1 sibling, 1 reply; 84+ messages in thread
From: P @ 2003-02-04 14:05 UTC (permalink / raw)
  To: Helge Hafting; +Cc: linux-kernel

Helge Hafting wrote:
> Padraig@Linux.ie wrote:
> [...]
>
>>Interesting. I just noticed that I get 50% decrease in
>>the speed of my program if I just insert a printf(). I.E.
>>my program is like:
>>
>>printf()
>>for(;;) {
>>    do_sorting_loop_test();
>>}
>>
>>If I remove the initial printf it doubles in speed?
>>I assume this is some weird caching thing?
>
>
> Looks like a cacheline alignment issue to me.
> This loop of yours occupies x cachelines on your cpu;
> moving it in memory by adding the printf
> might cause it to occupy x+1 cachelines.
> That might be noticeable if x is a really small number,
> such as 1.

OK it is (as I suspected and as you explained nicely)
related to the cachelines on my CPU (866 celery).

===============================
GCC options             loops/s
===============================
gcc                        2283
gcc -O3 -falign-loops=2    3451
gcc -O3 -falign-loops=4    3443
gcc -O3 -falign-loops=8    7045
gcc -march=i686 -O3        9101
===============================

cheers,
Pádraig.

^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: gcc 2.95 vs 3.21 performance
  2003-02-04 14:05 ` P
@ 2003-02-04 20:36 ` Herman Oosthuysen
  0 siblings, 0 replies; 84+ messages in thread
From: Herman Oosthuysen @ 2003-02-04 20:36 UTC (permalink / raw)
  To: P; +Cc: Helge Hafting, linux-kernel

Hi there,

More than anything else, execution speed on modern processors seems
to be a function of code and data alignment. Some processors are OK
with 16-bit word alignment, others require 32-bit word alignment, and
the new crop of processors will probably require 64-bit word alignment.
If data accesses are not aligned for your type of processor, then
SDRAM accesses go to hell as the bursting gets upset.

Unfortunately, this is a factor of processor architecture, and the MS
and Intel compilers support a small number of processors and can
therefore be more easily optimized than GCC, which supports every
processor in the whole world.

If some application of yours is very speed sensitive, then you'll have
to insert specific alignment control switches/pragmas to force GCC to do
things the right way for speed, but that will typically increase the
code and data size a little.

Cheers,
--
------------------------------------------------------------------------
Herman Oosthuysen
B.Eng.(E), Member of IEEE
Wireless Networks Inc.
http://www.WirelessNetworksInc.com
E-mail: Herman@WirelessNetworksInc.com
Phone: 1.403.569-5687, Fax: 1.403.235-3965
------------------------------------------------------------------------

P@draigBrady.com wrote:
> Helge Hafting wrote:
>
>>Padraig@Linux.ie wrote:
>>[...]
>>
>>
>>>Interesting. I just noticed that I get 50% decrease in
>>>the speed of my program if I just insert a printf(). I.E.
>>>my program is like:
>>>
>>>printf()
>>>for(;;) {
>>>    do_sorting_loop_test();
>>>}
>>>
>>>If I remove the initial printf it doubles in speed?
>>>I assume this is some weird caching thing?
>>
>>
>>Looks like a cacheline alignment issue to me.
>>This loop of yours occupies x cachelines on your cpu;
>>moving it in memory by adding the printf
>>might cause it to occupy x+1 cachelines.
>>That might be noticeable if x is a really small number,
>>such as 1.
>
>
> OK it is (as I suspected and as you explained nicely)
> related to the cachelines on my CPU (866 celery).
>
> ===============================
> GCC options             loops/s
> ===============================
> gcc                        2283
> gcc -O3 -falign-loops=2    3451
> gcc -O3 -falign-loops=4    3443
> gcc -O3 -falign-loops=8    7045
> gcc -march=i686 -O3        9101
> ===============================
>
> cheers,
> Pádraig.

^ permalink raw reply [flat|nested] 84+ messages in thread
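The data side of the alignment advice above might look like the following with GCC's `aligned` attribute. It is a sketch under stated assumptions: the 64-byte CACHELINE value is a guess that must be matched to the target CPU, and the struct and helper names are invented for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHELINE 64   /* assumed line size; check the target CPU */

/* Default layout: as small as the member allows. */
struct counter_plain {
    unsigned long hits;
};

/*
 * Forced alignment: every instance starts on a CACHELINE boundary and
 * sizeof() is padded up to a multiple of CACHELINE, so neighbouring
 * array elements never share a line. This is the size cost mentioned
 * above.
 */
struct counter_padded {
    unsigned long hits;
} __attribute__((aligned(CACHELINE)));

struct counter_padded per_cpu_hits[4];

/* Returns nonzero if `p` is a multiple of `alignment` bytes. */
int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p % alignment) == 0;
}
```

Comparing `sizeof(struct counter_plain)` with `sizeof(struct counter_padded)` shows the trade directly: one word versus a full line per counter.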
* Re: [Lse-tech] gcc 2.95 vs 3.21 performance 2003-02-03 23:05 gcc 2.95 vs 3.21 performance Martin J. Bligh 2003-02-03 23:22 ` [Lse-tech] " Andi Kleen 2003-02-03 23:31 ` Richard B. Johnson @ 2003-02-04 12:20 ` Dave Jones 2003-02-04 15:50 ` Martin J. Bligh 2003-02-06 15:42 ` gcc -O2 vs gcc -Os performance Martin J. Bligh 3 siblings, 1 reply; 84+ messages in thread From: Dave Jones @ 2003-02-04 12:20 UTC (permalink / raw) To: Martin J. Bligh; +Cc: linux-kernel, lse-tech On Mon, Feb 03, 2003 at 03:05:06PM -0800, Martin J. Bligh wrote: > People keep extolling the virtues of gcc 3.2 to me, which I'm > reluctant to switch to, since it compiles so much slower. But > it supposedly generates better code, so I thought I'd compile > the kernel with both and compare the results. This is gcc 2.95 > and 3.2.1 from debian unstable on a 16-way NUMA-Q. The kernbench > tests still use 2.95 for the compile-time stuff. > > The results below leaves me distinctly unconvinced by the supposed > merits of modern gcc's. Not really better or worse, within experimental > error. But much slower to compile things with. What kernel was kernbench compiling ? The reason I'm asking is that 2.5s (and more recent 2.4.21pre's) will use -march flags for more aggressive optimisation on newer gcc's. If you want to compare apples to apples, make sure you choose something like i386 in the processor menu, and then it'll always use -march=i386 instead of getting fancy with things like -march=pentium4 Dave -- | Dave Jones. http://www.codemonkey.org.uk | SuSE Labs ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: [Lse-tech] gcc 2.95 vs 3.21 performance
  2003-02-04 12:20 ` [Lse-tech] " Dave Jones
@ 2003-02-04 15:50 ` Martin J. Bligh
  2003-02-10 12:13   ` Momchil Velikov
  0 siblings, 1 reply; 84+ messages in thread
From: Martin J. Bligh @ 2003-02-04 15:50 UTC (permalink / raw)
  To: Dave Jones; +Cc: linux-kernel, lse-tech

> > People keep extolling the virtues of gcc 3.2 to me, which I'm
> > reluctant to switch to, since it compiles so much slower. But
> > it supposedly generates better code, so I thought I'd compile
> > the kernel with both and compare the results. This is gcc 2.95
> > and 3.2.1 from debian unstable on a 16-way NUMA-Q. The kernbench
> > tests still use 2.95 for the compile-time stuff.
> >
> > The results below leaves me distinctly unconvinced by the supposed
> > merits of modern gcc's. Not really better or worse, within experimental
> > error. But much slower to compile things with.
>
> What kernel was kernbench compiling ? The reason I'm asking is that
> 2.5s (and more recent 2.4.21pre's) will use -march flags for more
> aggressive optimisation on newer gcc's.
> If you want to compare apples to apples, make sure you choose
> something like i386 in the processor menu, and then it'll always
> use -march=i386 instead of getting fancy with things like -march=pentium4

Kernbench compiles 2.4.17, because I'm old, slow and lazy, and that was
what was around when I started doing this test ;-) But the point is still
the same ... even if it is doing more aggressive optimisation, it's not
actually buying us anything (at least for the kernel).

M.

^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: [Lse-tech] gcc 2.95 vs 3.21 performance
  2003-02-04 15:50 ` Martin J. Bligh
@ 2003-02-10 12:13 ` Momchil Velikov
  0 siblings, 0 replies; 84+ messages in thread
From: Momchil Velikov @ 2003-02-10 12:13 UTC (permalink / raw)
  To: Martin J. Bligh; +Cc: Dave Jones, linux-kernel, lse-tech

>>>>> "Martin" == Martin J Bligh <mbligh@aracnet.com> writes:

Martin> But the point is still the same ... even if it is doing
Martin> more aggressive optimisation, it's not actually buying us
Martin> anything (at least for the kernel)

which might be due in part to ``-fno-strict-aliasing'' used to compile
the Linux kernel.

~velco

^ permalink raw reply [flat|nested] 84+ messages in thread
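For readers unfamiliar with the flag, here is a small illustration of what strict aliasing lets the compiler assume, and what `-fno-strict-aliasing` takes away. The function names are invented, and float_bits assumes a 32-bit IEEE 754 float.

```c
#include <stdint.h>
#include <string.h>

/*
 * Under -fstrict-aliasing (enabled by GCC at -O2) the compiler may
 * assume that an int * and a float * never point at the same object.
 * It is therefore allowed to return the constant 1 here without
 * re-reading *i after the store through f. With -fno-strict-aliasing
 * it must assume the store to *f could have changed *i and reload it.
 */
int store_both(int *i, float *f)
{
    *i = 1;
    *f = 0.0f;
    return *i;
}

/*
 * Reinterpreting bytes through memcpy() is well defined with either
 * setting, unlike casting a float * to a uint32_t * and dereferencing.
 */
uint32_t float_bits(float f)
{
    uint32_t u;

    memcpy(&u, &f, sizeof u);
    return u;
}
```

The lost optimisation is the skipped reload in functions like store_both; whether that adds up to anything measurable in a kernel benchmark is exactly the open question in this subthread.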
* gcc -O2 vs gcc -Os performance
From: Martin J. Bligh @ 2003-02-06 15:42 UTC
To: linux-kernel; +Cc: lse-tech

Compiled the kernel with gcc -O2 (default) vs -Os (which people sometimes predict will be faster due to better cache usage). Didn't bother to measure how much time the compile itself took like that, but the resultant kernels were compared.

Summary ... -Os is a little slower (note system times on kernbench, SDET and NUMAschedbench I consider within experimental error), but not drastically. I wouldn't switch to it though ;-)

All done with gcc-2.95.4 (Debian Woody). These machines (16x NUMA-Q) have 700MHz P3 Xeons with 2Mb L2 cache ... -Os might fare better on celeron with a puny cache if someone wants to try that out.

M.

sizes:

894822 Feb 5 23:50 /boot/vmlinuz-2.5.59-mjb3-Os
906203 Feb 5 22:46 /boot/vmlinuz-2.5.59-mjb3.old

Kernbench-2: (make -j N vmlinux, where N = 2 x num_cpus)
                    Elapsed     User   System      CPU
2.5.59-mjb3           45.66   565.33   110.18  1479.00
2.5.59-mjb3-Os        45.58   565.38   111.42  1484.33

Kernbench-16: (make -j N vmlinux, where N = 16 x num_cpus)
                    Elapsed     User   System      CPU
2.5.59-mjb3           46.87   569.77   133.32  1499.67
2.5.59-mjb3-Os        46.86   569.30   134.63  1501.50

DISCLAIMER: SPEC(tm) and the benchmark name SDET(tm) are registered trademarks of the Standard Performance Evaluation Corporation. This benchmarking was performed for research purposes only, and the run results are non-compliant and not-comparable with any published results.

Results are shown as percentages of the first set displayed

SDET 1 (see disclaimer)
                    Throughput    Std. Dev
2.5.59-mjb3             100.0%        4.1%
2.5.59-mjb3-Os           95.1%        6.7%

SDET 2 (see disclaimer)
                    Throughput    Std. Dev
2.5.59-mjb3             100.0%        8.0%
2.5.59-mjb3-Os          101.2%        5.8%

SDET 4 (see disclaimer)
                    Throughput    Std. Dev
2.5.59-mjb3             100.0%        6.2%
2.5.59-mjb3-Os           99.4%       14.1%

SDET 8 (see disclaimer)
                    Throughput    Std. Dev
2.5.59-mjb3             100.0%        3.3%
2.5.59-mjb3-Os          100.5%        2.2%

SDET 16 (see disclaimer)
                    Throughput    Std. Dev
2.5.59-mjb3             100.0%        3.2%
2.5.59-mjb3-Os           98.9%        2.4%

SDET 32 (see disclaimer)
                    Throughput    Std. Dev
2.5.59-mjb3             100.0%        2.2%
2.5.59-mjb3-Os           97.2%        1.6%

SDET 64 (see disclaimer)
                    Throughput    Std. Dev
2.5.59-mjb3             100.0%        0.4%
2.5.59-mjb3-Os           99.9%        0.3%

SDET 128 (see disclaimer)
                    Throughput    Std. Dev

NUMA schedbench 4:
                    AvgUser   Elapsed  TotalUser  TotalSys
2.5.59-mjb3            0.00     34.62      90.63      0.91
2.5.59-mjb3-Os         0.00     40.35      81.94      0.69

NUMA schedbench 8:
                    AvgUser   Elapsed  TotalUser  TotalSys
2.5.59-mjb3            0.00     52.16     266.45      1.51
2.5.59-mjb3-Os         0.00     46.61     248.47      1.49

NUMA schedbench 16:
                    AvgUser   Elapsed  TotalUser  TotalSys
2.5.59-mjb3            0.00     57.38     845.30      3.58
2.5.59-mjb3-Os         0.00     58.34     851.12      2.94

NUMA schedbench 32:
                    AvgUser   Elapsed  TotalUser  TotalSys
2.5.59-mjb3            0.00    118.05    1806.79      6.24
2.5.59-mjb3-Os         0.00    115.85    1803.72      6.29

NUMA schedbench 64:
                    AvgUser   Elapsed  TotalUser  TotalSys
2.5.59-mjb3            0.00    236.59    3627.47     15.24
2.5.59-mjb3-Os         0.00    236.90    3631.11     15.35
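The "within experimental error" reading of the SDET rows in the message above can be checked mechanically. What follows is only a rough sketch: the rule used (a gap counts only if it exceeds the sum of the two reported standard deviations) is an assumption made here for illustration, not something stated in the thread, and it treats both standard deviations as being on the same percentage scale as the throughput figures.

```python
# SDET rows from the -O2 vs -Os message above (gcc 2.95.4):
# (load, -O2 std dev %, -Os throughput %, -Os std dev %).
# The -O2 kernel is the 100.0% baseline in every row.
SDET = [
    (1, 4.1, 95.1, 6.7),
    (2, 8.0, 101.2, 5.8),
    (4, 6.2, 99.4, 14.1),
    (8, 3.3, 100.5, 2.2),
    (16, 3.2, 98.9, 2.4),
    (32, 2.2, 97.2, 1.6),
    (64, 0.4, 99.9, 0.3),
]


def outside_noise(base_sd, other, other_sd):
    """Assumed rule of thumb (not from the thread): a gap only counts
    if it is larger than the two standard deviations added together."""
    return abs(other - 100.0) > base_sd + other_sd


# Loads at which the -Os result differs from -O2 by more than the noise.
flagged = [load for load, base_sd, other, other_sd in SDET
           if outside_noise(base_sd, other, other_sd)]
print(flagged)
```

Under that assumed rule no SDET load is flagged, which is consistent with the summary that -Os is not drastically different on this machine. A stricter statistical test would need the raw per-run numbers, which the message does not include.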
* Re: [Lse-tech] gcc -O2 vs gcc -Os performance
From: Andi Kleen @ 2003-02-06 15:51 UTC
To: Martin J. Bligh; +Cc: linux-kernel, lse-tech

> All done with gcc-2.95.4 (Debian Woody). These machines (16x NUMA-Q) have
> 700MHz P3 Xeons with 2Mb L2 cache ... -Os might fare better on celeron
> with a puny cache if someone wants to try that out.

-Os on 2.95 is not too useful. It only started becoming useful on 3.1+, even more so on the upcoming 3.3.

E.g. there was one report of ACPI shrinking by >60k by recompiling it with -Os on 3.1. ACPI is only slow path code so that is completely reasonable.

Best would be of course to use profile feedback to let the compiler decide where to generate small and where to generate fast&big code. But that has problems with maintainability (it will be hard to generate the same vmlinux as users for debugging/ksymoops reading purposes).

-Andi
* Re: gcc -O2 vs gcc -Os performance
From: Alan Cox @ 2003-02-06 17:48 UTC
To: Martin J. Bligh; +Cc: Linux Kernel Mailing List, lse-tech

On Thu, 2003-02-06 at 15:42, Martin J. Bligh wrote:
> All done with gcc-2.95.4 (Debian Woody). These machines (16x NUMA-Q) have
> 700MHz P3 Xeons with 2Mb L2 cache ... -Os might fare better on celeron
> with a puny cache if someone wants to try that out

gcc 3.2 is a lot smarter about -Os and it makes a very big size difference according to the numbers from the ACPI guys.

I'm not sure testing with a gcc from the last millennium is useful 8)
* Re: gcc -O2 vs gcc -Os performance
From: Martin J. Bligh @ 2003-02-06 17:06 UTC
To: Alan Cox; +Cc: Linux Kernel Mailing List, lse-tech

>> All done with gcc-2.95.4 (Debian Woody). These machines (16x NUMA-Q) have
>> 700MHz P3 Xeons with 2Mb L2 cache ... -Os might fare better on celeron
>> with a puny cache if someone wants to try that out
>
> gcc 3.2 is a lot smarter about -Os and it makes a very big size
> difference according to the numbers from the ACPI guys.
>
> I'm not sure testing with a gcc from the last millennium is useful 8)

I'll retest with gcc-3.2 ... maybe it'll finally show a case where it's better than 2.95 this way? <ducks> <runs>

M.
* Re: gcc -O2 vs gcc -Os performance
From: Martin J. Bligh @ 2003-02-06 20:38 UTC
To: Alan Cox; +Cc: Linux Kernel Mailing List, lse-tech

>> All done with gcc-2.95.4 (Debian Woody). These machines (16x NUMA-Q) have
>> 700MHz P3 Xeons with 2Mb L2 cache ... -Os might fare better on celeron
>> with a puny cache if someone wants to try that out
>
> gcc 3.2 is a lot smarter about -Os and it makes a very big size
> difference according to the numbers from the ACPI guys.
>
> I'm not sure testing with a gcc from the last millennium is useful 8)

Still no use.
/me throws gcc-3.2 in the trash can.

2901299 vmlinux.O2
2667827 vmlinux.Os

Kernbench-2: (make -j N vmlinux, where N = 2 x num_cpus)
                        Elapsed     User   System      CPU
2.5.59-mjb3-gcc32-O2      45.86   564.75   110.91  1472.67
2.5.59-mjb3-gcc32-Os      45.74   563.96   111.06  1475.17

Kernbench-16: (make -j N vmlinux, where N = 16 x num_cpus)
                        Elapsed     User   System      CPU
2.5.59-mjb3-gcc32-O2      46.83   569.15   133.88  1500.50
2.5.59-mjb3-gcc32-Os      46.90   568.17   134.58  1497.83

DISCLAIMER: SPEC(tm) and the benchmark name SDET(tm) are registered trademarks of the Standard Performance Evaluation Corporation. This benchmarking was performed for research purposes only, and the run results are non-compliant and not-comparable with any published results.

Results are shown as percentages of the first set displayed

SDET 1 (see disclaimer)
                        Throughput    Std. Dev
2.5.59-mjb3-gcc32-O2        100.0%        3.4%
2.5.59-mjb3-gcc32-Os         99.8%        2.8%

SDET 2 (see disclaimer)
                        Throughput    Std. Dev
2.5.59-mjb3-gcc32-O2        100.0%        6.7%
2.5.59-mjb3-gcc32-Os        101.2%        4.9%

SDET 4 (see disclaimer)
                        Throughput    Std. Dev
2.5.59-mjb3-gcc32-O2        100.0%        3.8%
2.5.59-mjb3-gcc32-Os         95.1%        3.0%

SDET 8 (see disclaimer)
                        Throughput    Std. Dev
2.5.59-mjb3-gcc32-O2        100.0%        1.1%
2.5.59-mjb3-gcc32-Os         98.1%        1.4%

SDET 16 (see disclaimer)
                        Throughput    Std. Dev
2.5.59-mjb3-gcc32-O2        100.0%        1.6%
2.5.59-mjb3-gcc32-Os         97.7%        1.7%

SDET 32 (see disclaimer)
                        Throughput    Std. Dev
2.5.59-mjb3-gcc32-O2        100.0%        1.1%
2.5.59-mjb3-gcc32-Os        103.7%        1.9%

SDET 64 (see disclaimer)
                        Throughput    Std. Dev
2.5.59-mjb3-gcc32-O2        100.0%        1.4%
2.5.59-mjb3-gcc32-Os         96.6%        9.7%

NUMA schedbench 4:
                        AvgUser   Elapsed  TotalUser  TotalSys
2.5.59-mjb3-gcc32-O2       0.00     36.93      88.84      0.62
2.5.59-mjb3-gcc32-Os       0.00     44.28      96.95      0.67

NUMA schedbench 8:
                        AvgUser   Elapsed  TotalUser  TotalSys
2.5.59-mjb3-gcc32-O2       0.00     54.16     327.57      1.58
2.5.59-mjb3-gcc32-Os       0.00     50.66     248.42      1.89

NUMA schedbench 16:
                        AvgUser   Elapsed  TotalUser  TotalSys
2.5.59-mjb3-gcc32-O2       0.00     57.17     851.44      3.09
2.5.59-mjb3-gcc32-Os       0.00     57.25     849.20      3.14

NUMA schedbench 32:
                        AvgUser   Elapsed  TotalUser  TotalSys
2.5.59-mjb3-gcc32-O2       0.00    117.82    1808.42      6.34
2.5.59-mjb3-gcc32-Os       0.00    130.02    1814.74      6.52

NUMA schedbench 64:
                        AvgUser   Elapsed  TotalUser  TotalSys
2.5.59-mjb3-gcc32-O2       0.00    236.82    3616.04     15.17
2.5.59-mjb3-gcc32-Os       0.00    241.34    3624.50     16.39
* Re: gcc -O2 vs gcc -Os performance
From: John Bradford @ 2003-02-06 21:32 UTC
To: Martin J. Bligh; +Cc: alan, linux-kernel, lse-tech

> >> All done with gcc-2.95.4 (Debian Woody). These machines (16x NUMA-Q) have
> >> 700MHz P3 Xeons with 2Mb L2 cache ... -Os might fare better on celeron
> >> with a puny cache if someone wants to try that out
> >
> > gcc 3.2 is a lot smarter about -Os and it makes a very big size
> > difference according to the numbers from the ACPI guys.
> >
> > I'm not sure testing with a gcc from the last millennium is useful 8)
>
> Still no use.
> /me throws gcc-3.2 in the trash can.

What submodel options are you using? If you're compiling with -march=i386, I wouldn't expect -Os to have much effect.

Note that, of all architectures, GCC is almost certainly most efficient on IA-32. Although I haven't done any benchmarks against other compilers on $arch!=IA32, the ones I've seen claim that the native compiler generates much better code.

John.
* Re: gcc -O2 vs gcc -Os performance
From: Linus Torvalds @ 2003-02-06 22:12 UTC
To: linux-kernel

In article <263740000.1044563891@[10.10.2.4]>,
Martin J. Bligh <mbligh@aracnet.com> wrote:
>>> All done with gcc-2.95.4 (Debian Woody). These machines (16x NUMA-Q) have
>>> 700MHz P3 Xeons with 2Mb L2 cache ... -Os might fare better on celeron
>>> with a puny cache if someone wants to try that out
>>
>> gcc 3.2 is a lot smarter about -Os and it makes a very big size
>> difference according to the numbers from the ACPI guys.
>>
>> I'm not sure testing with a gcc from the last millennium is useful 8)
>
>Still no use.
>/me throws gcc-3.2 in the trash can.
>
>2901299 vmlinux.O2
>2667827 vmlinux.Os

Well, Os is certainly smaller.

One thing to look out for is that microbenchmarks for kernels are usually the _worst_ things to test with Os. That's since a large part of the premise of the -Os speed advantage is that it is better for icache (usually not an issue for microbenchmarks) and that it is better for load/startup times (generally not a huge issue for kernels, since the real startup costs of kernels tend to be entirely elsewhere).

So I suspect -Os tends to be more appropriate for user-mode code, and especially code with low repeat rates. Possibly the "low repeat rate" thing ends up being true of certain kernel subsystems too.

Think of it this way: if you win 10% in size, you're likely to map and load 10% fewer code pages at run-time. Which is not a big issue for traditional data-centric loads, but can be a _huge_ deal for things like GUI programs etc where there is often more code than data.

Linus
* Re: gcc -O2 vs gcc -Os performance
From: Martin J. Bligh @ 2003-02-06 22:58 UTC
To: Linus Torvalds, linux-kernel

>> 2901299 vmlinux.O2
>> 2667827 vmlinux.Os
>
> Well, Os is certainly smaller.

Yup. I have lots of RAM though, so unless I can see the perf increase from cache effects, it's not desperately interesting to me personally. If someone could do similar measurements with a puny-cache celeron chip, it would be interesting ...

> So I suspect -Os tends to be more appropriate for user-mode code, and
> especially code with low repeat rates. Possibly the "low repeat rate"
> thing ends up being true of certain kernel subsystems too.

Fair enough. I'm not desperately interested in user-land code at the moment, personally, but gcc is admittedly more general. Maybe we should compile gcc itself with -Os ;-) Andi (I think) also made the observation that the garbage-collect size for gcc3.2 may be rather small.

The observation re low repeat rate is interesting ... might be amusing to do some really basic profile-guided optimisation on these grounds, take readprofile / oprofile output, and compile the files that don't get hammered at all with -Os rather than -O2. Given their low frequency (by definition), I'm not sure that improving their icache footprint will have a measurable effect though.

M.
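The "really basic profile-guided optimisation" idea in the message above could be sketched roughly as follows. Everything here is hypothetical: the function name `choose_flags`, the 1% threshold, the file list and the tick counts are invented for illustration, and the real readprofile / oprofile output would first have to be aggregated into per-file counts by some other means.

```python
def choose_flags(ticks_by_file, all_files, hot_fraction=0.01):
    """Pick -O2 for files that take a noticeable share of profile ticks
    and -Os for everything else (hypothetical helper and threshold)."""
    total = sum(ticks_by_file.values())
    flags = {}
    for path in all_files:
        # Files that never show up in the profile get a share of zero.
        share = ticks_by_file.get(path, 0) / total if total else 0.0
        flags[path] = "-O2" if share >= hot_fraction else "-Os"
    return flags


# Made-up per-file tick counts; not measurements from the thread.
ticks = {"kernel/sched.c": 9000, "mm/page_alloc.c": 950, "fs/namei.c": 50}
files = list(ticks) + ["drivers/acpi/tables.c"]

flags = choose_flags(ticks, files)
for path in files:
    print(flags[path], path)
```

With these invented counts the two hot files keep -O2 and the rarely or never sampled ones get -Os. Whether that split would show up in any benchmark is exactly the open question raised in the message.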
* Re: gcc -O2 vs gcc -Os performance
From: Linus Torvalds @ 2003-02-06 23:16 UTC
To: Martin J. Bligh; +Cc: linux-kernel

On Thu, 6 Feb 2003, Martin J. Bligh wrote:
>
> The observation re low repeat rate is interesting ... might be amusing
> to do some really basic profile-guided optimisation on these grounds,
> take readprofile / oprofile output, and compile the files that don't
> get hammered at all with -Os rather than -O2. Given their low frequency
> (by definition), I'm not sure that improving their icache footprint will
> have a measurable effect though.

Icache footprint has nothing to do with repeat rates, which is exactly why repeat rates are interesting for -Os.

Icache footprint is directly proportional to the _static_ size of the code (ie exactly the thing that -Os is supposed to optimize for), while instruction-level performance measurement is only valid on the _dynamic_ code.

And with modern CPUs with big caches, a _lot_ of cache misses are the forced kind - the startup costs, not the actual runtime cost. That's not always true (if you touch big data sets, you'll have replacement misses too, of course), but it's not really false either.

So think of the I$ (and TLB, and page load/map - all the same) cost as a fixed cost that will always be there, but that -Os tries to minimize. That's _one_ dimension in the total cost.

The "traditional" -O2 kind of "try to make the code run fast" optimizations tend to try to minimize a totally different dimension, namely the dynamic code speed.

And the time required for running the program is the sum of the static and dynamic factors. In other words, a _good_ optimization should try to minimize not one or the other, but the sum. And low repeat rates means that the dynamic component is smaller, which clearly makes the static component more important.

For example, if you are doing mp3 encoding, the repeat rates for the core loop are huge, and the code is small, so clearly the static component is largely insignificant. Use -O2.

But if you're running a GUI program then just the loading time is often quite noticeable, and if you can improve that by, say, 10%, then that can _more_ than make up for almost any amount of stupidity in your code. Especially since a lot of the code isn't even all that loopy and tends to have low repeat rates. You're almost guaranteed to be better off using -Os than -O2.

If you've got performance counter data, check the I$ and ITLB miss ratios, and if they are at all noticeable, think about the fact that an I$ miss tends to cost a lot more than a few more dynamic instructions.

I suspect the kernel I$ behaviour is generally pretty good, and the ITLB behaviour is improved even further thanks to large pages etc. That said, a user app that blows the I$ will blow the kernel out of the I$ too, so small is always beautiful, even in the kernel.

Linus
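The "sum of the static and dynamic factors" argument in the message above can be written down as a toy cost model. This is a sketch only: the cost units and all four numbers are hypothetical, picked so that the smaller build has a 10% lower fixed cost and a 5% higher per-repeat cost. They are not measurements of any real compiler or program.

```python
def total_cost(static_cost, per_repeat_cost, repeats):
    """Run time as a fixed (static) part plus a per-repeat (dynamic) part."""
    return static_cost + per_repeat_cost * repeats


def break_even(a, b):
    """Repeat count at which builds a and b cost the same."""
    return (a["static"] - b["static"]) / (b["per_repeat"] - a["per_repeat"])


# Hypothetical builds, in arbitrary cost units (illustration only).
O2 = {"static": 100.0, "per_repeat": 1.00}  # bigger code, faster inner work
OS = {"static": 90.0, "per_repeat": 1.05}   # smaller code, slower inner work

for repeats in (10, 100000):
    o2 = total_cost(O2["static"], O2["per_repeat"], repeats)
    os_ = total_cost(OS["static"], OS["per_repeat"], repeats)
    print(repeats, "-Os wins" if os_ < o2 else "-O2 wins")

print("break-even near", round(break_even(O2, OS)), "repeats")
```

With these made-up numbers the smaller build wins at 10 repeats and loses at 100000, with the crossover near 200 repeats. That is the shape of the mp3-encoder versus GUI-program contrast; where the crossover falls for real code depends entirely on measured costs.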
* Re: gcc -O2 vs gcc -Os performance
From: Martin J. Bligh @ 2003-02-06 23:59 UTC
To: Linus Torvalds; +Cc: linux-kernel

>> The observation re low repeat rate is interesting ... might be amusing
>> to do some really basic profile-guided optimisation on these grounds,
>> take readprofile / oprofile output, and compile the files that don't
>> get hammered at all with -Os rather than -O2. Given their low frequency
>> (by definition), I'm not sure that improving their icache footprint will
>> have a measurable effect though.
>
> Icache footprint has nothing to do with repeat rates, which is exactly why
> repeat rates are interesting for -Os.

Reading the below, I think I just misinterpreted what you meant by "repeat rate". My point was that if you hardly ever run that section of code, -Os might be better.

If we call how often you call that code section its "frequency" (nothing to do with how tightly it loops inside it), then if the frequency of the code is low, the icache footprint might be better off smaller, as it'll just blow the icache when we do run it and those cachelines are fetched. On the other hand, that won't happen often, so it may well be unobservable for real loads.

M.
* Re: gcc -O2 vs gcc -Os performance
From: Roger Larsson @ 2003-02-06 23:17 UTC
To: Martin J. Bligh; +Cc: linux-kernel

On Thursday 06 February 2003 21:38, Martin J. Bligh wrote:
> gcc-3.2
>
> 2901299 vmlinux.O2
> 2667827 vmlinux.Os
>

In an earlier message, Martin J. Bligh wrote:
>
> 894822 Feb 5 23:50 /boot/vmlinuz-2.5.59-mjb3-Os
> 906203 Feb 5 22:46 /boot/vmlinuz-2.5.59-mjb3.old

And if you compare both with same/no compression?

/RogerL

--
Roger Larsson
Skellefteå
Sweden
* Re: gcc -O2 vs gcc -Os performance
From: Martin J. Bligh @ 2003-02-06 23:33 UTC
To: Roger Larsson; +Cc: linux-kernel

>> gcc-3.2
>>
>> 2901299 vmlinux.O2
>> 2667827 vmlinux.Os
>>
>
> In an earlier message, Martin J. Bligh wrote:
>>
>> 894822 Feb 5 23:50 /boot/vmlinuz-2.5.59-mjb3-Os
>> 906203 Feb 5 22:46 /boot/vmlinuz-2.5.59-mjb3.old
>
> And if you compare both with same/no compression?

980233 Feb 6 11:15 /boot/vmlinuz-2.5.59-mjb3
914965 Feb 6 09:34 /boot/vmlinuz-2.5.59-mjb3.old

Those were probably the right files. (O2 and Os respectively) I didn't look too closely at the time.

Looks like 2.95 produces smaller files with O2 than 3.2 does with -Os. Bah.

/me cheers for gcc 2.95.4

M.
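The size figures scattered through the last few messages are easier to compare as percentages. The sketch below only does arithmetic on the byte counts quoted in the thread. The mapping of the two gcc 3.2 vmlinuz files to -O2 and -Os follows the "(O2 and Os respectively)" remark, which its author himself hedged with "probably", so treat that pairing as an assumption.

```python
# (size at -O2, size at -Os) in bytes, as quoted in the thread.
SIZES = {
    "gcc 2.95 vmlinuz": (906203, 894822),
    "gcc 3.2 vmlinuz": (980233, 914965),   # pairing assumed, see lead-in
    "gcc 3.2 vmlinux": (2901299, 2667827),
}


def percent_smaller(o2_bytes, os_bytes):
    """How much smaller the -Os image is, as a percentage of the -O2 size."""
    return 100.0 * (o2_bytes - os_bytes) / o2_bytes


for name, (o2, os_) in SIZES.items():
    print(f"{name}: -Os is {percent_smaller(o2, os_):.2f}% smaller")

# The closing comparison: gcc 3.2 at -Os against gcc 2.95 at -O2.
gap = SIZES["gcc 3.2 vmlinuz"][1] - SIZES["gcc 2.95 vmlinuz"][0]
print("gcc 3.2 -Os minus gcc 2.95 -O2:", gap, "bytes")
```

On these figures -Os saves a little over 1% of the compressed image with gcc 2.95 and roughly 7 to 8% with gcc 3.2, yet the gcc 3.2 -Os vmlinuz is still a few kilobytes larger than the gcc 2.95 -O2 one, which is the point of the "Bah".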
* Re: gcc 2.95 vs 3.21 performance
From: Andi Kleen @ 2003-02-04 22:05 UTC
To: Linus Torvalds; +Cc: linux-kernel

torvalds@transmeta.com (Linus Torvalds) writes:
>
> I'd love to see a small - and fast - C compiler, and I'd be willing to
> make kernel changes to make it work with it.

If you want small and fast use lcc.

Unfortunately it's not completely free (some weird license), doesn't really support real inline assembly and generates rather bad code compared to gcc.

I'm still looking forward to Open Watcom (http://www.openwatcom.org) - they are near self hosting on Linux. The inline assembly is very VC++ style though; very different from gcc and, worse, you have to write it in Intel syntax.

Another alternative would be TenDRA, but it also has no inline assembly and its C understanding can be only described as "fascist".

If you don't care about free software you could also use the Intel compiler, which seems to be often faster in compile time than gcc now and can already compile kernels.

-Andi
* Re: gcc 2.95 vs 3.21 performance
From: Linus Torvalds @ 2003-02-04 22:14 UTC
To: Andi Kleen; +Cc: linux-kernel

On 4 Feb 2003, Andi Kleen wrote:
>
> If you want small and fast use lcc.

lcc isn't really something I want to use, since the license is so strange, and thus can't be improved upon if there are issues with it.

Some people have used the Intel compiler - which obviously also cannot be improved upon, but which is likely to start off pretty good. I don't really want to use it myself - what I'd really like to see is gcc splitting up just the C compiler as a separate project with more attention to size and speed.

Linus
* Re: gcc 2.95 vs 3.21 performance
From: Pavel Janík @ 2003-02-05 10:04 UTC
To: Linus Torvalds; +Cc: linux-kernel

From: Linus Torvalds <torvalds@transmeta.com>
Date: Tue, 4 Feb 2003 14:14:06 -0800 (PST)

Hi Linus,

> lcc isn't really something I want to use, since the license is so
> strange, and thus can't be improved upon if there are issues with it.

what is the difference between compiler and source management system regarding licenses and improvements?

--
Pavel Janík

I think I started with hitting C-h a lot. Really a LOT.
-- Kai Grossjohann in gnu.emacs.help about Emacs knowledge
* Re: gcc 2.95 vs 3.21 performance
From: Linus Torvalds @ 2003-02-05 20:07 UTC
To: Pavel Janík; +Cc: linux-kernel

On Wed, 5 Feb 2003, Pavel Janík wrote:
>
> Hi Linus,
>
> > lcc isn't really something I want to use, since the license is so
> > strange, and thus can't be improved upon if there are issues with it.
>
> what is the difference between compiler and source management system
> regarding licenses and improvements?

You snipped the part where I said that the Intel compiler is likely to be more interesting to a number of people, since it's at a higher level.

So no, I'm not religious about licenses. But the real issue is "does it do what we want it to do?" and "do we have a choice?".

There are no open-source SCM's that work for me. But there _is_ an open-source compiler that does work for me. At which point the license matters - simply because there is choice in the matter.

Gcc mostly works. But it's slower than I'd like. And it prioritizes things I don't care about. And competition is always good. So I would definitely love to see some alternatives.

And if you have issues with BK, maybe you can try to encourage the SCM people to see why I consider BK to not even have alternatives right now.

Linus
* Re: gcc 2.95 vs 3.21 performance
From: Horst von Brand @ 2003-02-06 15:00 UTC
To: Pavel Janík; +Cc: Linux Kernel Mailing List

Pavel@Janik.cz (Pavel Janík) said:
> Linus Torvalds <torvalds@transmeta.com> said:
> > lcc isn't really something I want to use, since the license is so
> > strange, and thus can't be improved upon if there are issues with it.

> what is the difference between compiler and source management system
> regarding licenses and improvements?

That bk was designed around Linus' and other head kernel hackers' ideas of how it should work, and they are still bending over backwards to keep this biggest _*non*_customer of theirs happy.

OTOH, lcc as a project seems to be dead for all practical purposes (it looks like 4.2 will be the end of the line, no patches or updates have shown up for quite some time). Its licence <http://www.cs.princeton.edu/software/lcc/pkg/CPYRIGHT> is vaguely BSDish, but with a "you can't make money off this or any modified versions/software based on it" clause.

I've been inside lcc 4.1 (current version is 4.2, somewhat different, so YMMV...) myself a bit, and while it is a marvelous showpiece for classroom use, it is sorely lacking in what makes a _real_ C compiler (for kernel use). For one, it only knows about i486-ish ia32 CPUs; to get others supported in its current incarnation would be a massive exercise in duplication or macro-massaging the backend source. Other than the (very good) optimal instruction selection there is very little optimization (what there is is a bit of strength reduction). The organization of the compiler makes adding additional higher-level optimization almost impossible; a separate SSA or such intermediate form would have to be retrofitted. The register selection is very simplistic and doesn't work correctly (some experimental patches we had for generating PIC code on ia32 kept it crashing by running out of registers; the code for fixing this case up just doesn't work). No hint at scheduling instructions or such.

--
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                     Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria              +56 32 654239
Casilla 110-V, Valparaiso, Chile                Fax:  +56 32 797513
* Re: gcc 2.95 vs 3.21 performance
From: Jeff Muizelaar @ 2003-02-04 22:59 UTC
To: Andi Kleen; +Cc: Linus Torvalds, linux-kernel

Andi Kleen wrote:

>If you want small and fast use lcc.
>
>Unfortunately it's not completely free (some weird license), doesn't
>really support real inline assembly and generates rather bad code compared
>to gcc.
>
>I'm still looking forward to Open Watcom (http://www.openwatcom.org) -
>they are near self hosting on Linux. The inline assembly is very VC++ style
>though; very different from gcc and worse you have to write it in
>Intel syntax.
>
>Another alternative would be TenDRA, but it also has no inline assembly
>and its C understanding can be only described as "fascist".
>
>If you don't care about free software you could also use the Intel
>compiler, which seems to be often faster in compile time than gcc now
>and can already compile kernels.
>

There is also tcc (http://fabrice.bellard.free.fr/tcc/)
It claims to support gcc-like inline assembler, appears to be much smaller and faster than gcc. Plus it is GPL so the license isn't a problem either.
Though, I am not really sure of the quality of code generated or of how mature it is.

-Jeff
* Re: gcc 2.95 vs 3.21 performance
From: b_adlakha @ 2003-02-04 23:12 UTC
To: Jeff Muizelaar; +Cc: linux-kernel

Jeff Muizelaar writes:

> Andi Kleen wrote:
>
>> If you want small and fast use lcc.
>>
>> Unfortunately it's not completely free (some weird license), doesn't
>> really support real inline assembly and generates rather bad code
>> compared to gcc.
>>
>> I'm still looking forward to Open Watcom (http://www.openwatcom.org) -
>> they are near self hosting on Linux. The inline assembly is very VC++
>> style though; very different from gcc and worse you have to write it in
>> Intel syntax.
>>
>> Another alternative would be TenDRA, but it also has no inline assembly
>> and its C understanding can be only described as "fascist".
>>
>> If you don't care about free software you could also use the Intel
>> compiler, which seems to be often faster in compile time than gcc now
>> and can already compile kernels.
>>
> There is also tcc (http://fabrice.bellard.free.fr/tcc/)
> It claims to support gcc-like inline assembler, appears to be much smaller
> and faster than gcc. Plus it is GPL so the license isn't a problem
> either.
> Though, I am not really sure of the quality of code generated or of how
> mature it is.
>
> -Jeff

wow, looks like some teenage kid like me made it... it's a 170 kb gzipped tar! nice for a C compiler... But I'm not sure if it could compile half of the linux kernel successfully...
* Re: gcc 2.95 vs 3.21 performance
From: Horst von Brand @ 2003-02-05 8:41 UTC
To: Jeff Muizelaar; +Cc: linux-kernel

[Massive Cc: snippage]

Jeff Muizelaar <muizelaar@rogers.com> said:

[...]

> There is also tcc (http://fabrice.bellard.free.fr/tcc/)
> It claims to support gcc-like inline assembler, appears to be much
> smaller and faster than gcc. Plus it is GPL so the license isn't a
> problem either.
> Though, I am not really sure of the quality of code generated

Horrible.

> or of how
> mature it is.

Nice for one-file throwaway C proggies. But then again, Perl is so much better at what you'd want to do most of the time...

Look, people, the gcc folks have recently redone the guts of the compiler to make more advanced optimizations possible/easier (look at the news for 2000-2002 at <http://gcc.gnu.org>). It still needs a lot of porting over of optimizations and developing new ones, plus tuning, AFAIU.

The other open(ish) C compilers I know about are mere toys.

--
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                     Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria              +56 32 654239
Casilla 110-V, Valparaiso, Chile                Fax:  +56 32 797513
* Re: gcc 2.95 vs 3.21 performance
From: Linus Torvalds @ 2003-02-05 19:09 UTC
To: linux-kernel

In article <3E4045D1.4010704@rogers.com>,
Jeff Muizelaar <muizelaar@rogers.com> wrote:
>
>There is also tcc (http://fabrice.bellard.free.fr/tcc/)
>It claims to support gcc-like inline assembler, appears to be much
>smaller and faster than gcc. Plus it is GPL so the license isn't a
>problem either.
>Though, I am not really sure of the quality of code generated or of how
>mature it is.

tcc is interesting. The code generation is pretty simplistic (read: trivially horrible for most things), but it sure is fast and small. And judging by the changelog, Fabrice is trying to compile the kernel with it.

For a lot of problems, small-and-fast is good. Hell, some of the things I'd personally find interesting don't have any code generation part at all (static analysis of annotated source-code - stanford checker on the cheap). And development doesn't always need good code generation (right now some people use "gcc -O0" for that, because anything else hurts too much. Now, the code from tcc will probably look more like "-O-1", but at least you can test out things _quickly_).

Linus
* Re: gcc 2.95 vs 3.21 performance
From: Randy.Dunlap @ 2003-02-05 19:22 UTC
To: Linus Torvalds; +Cc: linux-kernel

On Wed, 5 Feb 2003, Linus Torvalds wrote:

| In article <3E4045D1.4010704@rogers.com>,
| Jeff Muizelaar <muizelaar@rogers.com> wrote:
| >
| >There is also tcc (http://fabrice.bellard.free.fr/tcc/)
| >It claims to support gcc-like inline assembler, appears to be much
| >smaller and faster than gcc. Plus it is GPL so the license isn't a
| >problem either.
| >Though, I am not really sure of the quality of code generated or of how
| >mature it is.
|
| tcc is interesting. The code generation is pretty simplistic (read:
| trivially horrible for most things), but it sure is fast and small. And
| judging by the changelog, Fabrice is trying to compile the kernel with
| it.
|
| For a lot of problems, small-and-fast is good. Hell, some of the things
| I'd personally find interesting don't have any code generation part at
| all (static analysis of annotated source-code - stanford checker on the
| cheap).

Yep, that's exactly why I'm interested...

| And development doesn't always need good code generation (right
| now some people use "gcc -O0" for that, because anything else hurts too
| much. Now, the code from tcc will probably look more like "-O-1", but
| at least you can test out things _quickly_).

--
~Randy
* Re: gcc 2.95 vs 3.21 performance 2003-02-05 19:09 ` Linus Torvalds 2003-02-05 19:22 ` Randy.Dunlap @ 2003-02-05 19:24 ` John Bradford 1 sibling, 0 replies; 84+ messages in thread From: John Bradford @ 2003-02-05 19:24 UTC (permalink / raw) To: Linus Torvalds; +Cc: linux-kernel > >There is also tcc (http://fabrice.bellard.free.fr/tcc/) > >It claims to support gcc-like inline assembler, appears to be much > >smaller and faster than gcc. Plus it is GPL so the liscense isn't a > >problem either. > >Though, I am not really sure of the quality of code generated or of how > >mature it is. > > tcc is interesting. The code generation is pretty simplistic (read: > trivially horrible for most things), but it sure is fast and small. And > judging by the changelog, Fabrice is trying to compile the kernel with > it. > > For a lot of problems, small-and-fast is good. Maybe otcc is a better choice, then? http://fabrice.bellard.free.fr/otcc/ :-) John. ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: gcc 2.95 vs 3.21 performance 2003-02-04 22:59 ` Jeff Muizelaar ` (2 preceding siblings ...) 2003-02-05 19:09 ` Linus Torvalds @ 2003-02-06 7:02 ` Neil Booth [not found] ` <courier.3E423112.00007219@softhome.net> 2003-02-10 2:14 ` Jeff Garzik 3 siblings, 2 replies; 84+ messages in thread From: Neil Booth @ 2003-02-06 7:02 UTC (permalink / raw) To: Jeff Muizelaar; +Cc: Andi Kleen, Linus Torvalds, linux-kernel Jeff Muizelaar wrote:- > There is also tcc (http://fabrice.bellard.free.fr/tcc/) > It claims to support gcc-like inline assembler, appears to be much > smaller and faster than gcc. Plus it is GPL so the liscense isn't a > problem either. It doesn't expand macros correctly, however, and accepts an enormous range of invalid code without a single diagnostic. I'm pretty sure its arithmetic rules are incorrect, too. It's certainly nowhere near C89 compliance. Neil. ^ permalink raw reply [flat|nested] 84+ messages in thread
[parent not found: <courier.3E423112.00007219@softhome.net>]
[parent not found: <20030206212218.GA4891@daikokuya.co.uk>]
* Re: gcc 2.95 vs 3.21 performance [not found] ` <20030206212218.GA4891@daikokuya.co.uk> @ 2003-02-07 10:31 ` b_adlakha 2003-02-07 18:46 ` Horst von Brand 2003-02-07 21:49 ` Neil Booth 0 siblings, 2 replies; 84+ messages in thread From: b_adlakha @ 2003-02-07 10:31 UTC (permalink / raw) To: Neil Booth; +Cc: linux-kernel Neil Booth writes: > b_adlakha@softhome.net wrote:- > >> Maybe thats why its a 0.9* version, and the auther has stated on his site >> that not all C98 features are implimented...but then even GCC doesn't >> impliment them... > > No, I said C89. He's got a *long* way to go for that. Forget C99. > > However, he does claim C89 compliance, which is quite disingenuous. > >> I checked tcc out, and its damn fast, much much much much faster than gcc. >> gcc is bloated and its slow even on my pentium 4 machine, let alone my 1.2 >> celeron. It takes 20 minutes to compile a new kernel on that, now if you're >> gonna test kernels/patches, you can wait 20 minutes for every compile! > > I agree. I'm trying to fix it. > > GCC is larger for a reason: it does things properly. It's easy to be > fast if you're willing to be wrong, and not emit warnings or errors, and > not implement half the standard. And not optimize. > >> Even icc is much better than gcc, but its very perticular about code (and >> its not gcc compatible as the intel site says) >> And its non-free also... > > Only better in terms of compile speed. Cool (you're trying to fix it), maybe you can modify tcc so it is optimized for compiling linux (optimized for compiling speed and runtime speed for linux). I think it'll be easier and quicker to just make it compile linux properly first, then do the testing/fixing for other things, as they are so many compilers for other things anyway...And maybe it can be called "Linux C Compiler"? lol. ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: gcc 2.95 vs 3.21 performance 2003-02-07 10:31 ` b_adlakha @ 2003-02-07 18:46 ` Horst von Brand 2003-02-07 21:49 ` Neil Booth 1 sibling, 0 replies; 84+ messages in thread From: Horst von Brand @ 2003-02-07 18:46 UTC (permalink / raw) To: b_adlakha; +Cc: linux-kernel b_adlakha@softhome.net said: > Neil Booth writes: > > b_adlakha@softhome.net wrote:- > >> Maybe thats why its a 0.9* version, and the auther has stated on his site > >> that not all C98 features are implimented...but then even GCC doesn't > >> impliment them... > > No, I said C89. He's got a *long* way to go for that. Forget C99. > > However, he does claim C89 compliance, which is quite disingenuous. > >> I checked tcc out, and its damn fast, much much much much faster than > >> gcc. gcc is bloated and its slow even on my pentium 4 machine, let > >> alone my 1.2 celeron. It takes 20 minutes to compile a new kernel on > >> that, now if you're gonna test kernels/patches, you can wait 20 > >> minutes for every compile! Come on, quit whining already. When I started out fooling around with egcs and the kernel, it took 45 to 60 minutes to build a kernel for me. And the kernel was a lot smaller, and the compiler much faster. > > I agree. I'm trying to fix it. > > > > GCC is larger for a reason: it does things properly. It's easy to be > > fast if you're willing to be wrong, and not emit warnings or errors, and > > not implement half the standard. And not optimize. > >> Even icc is much better than gcc, but its very perticular about code (and > >> its not gcc compatible as the intel site says) > >> And its non-free also... Pour manpower and people who _know_ that _one_ CPU you are targeting in and out into the project, it sure will get further along... > > Only better in terms of compile speed. > > Cool (you're trying to fix it), maybe you can modify tcc so it is optimized > for compiling linux (optimized for compiling speed and runtime speed for > linux). Sorry, can pick just one. 
Either you compile very fast (because you don't analyze the code you are compiling very much, i.e., generate lousy code) or generate excellent code (that requires complex analysis, large data structures to build and use, and takes time). > I think it'll be easier and quicker to just make it compile linux > properly first, then do the testing/fixing for other things, as they are so > many compilers for other things anyway...And maybe it can be called "Linux C > Compiler"? lol. "Easier and quicker" as in 5 or 6 years of hard work. Sure enough, come back when you're done. -- Dr. Horst H. von Brand User #22616 counter.li.org Departamento de Informatica Fono: +56 32 654431 Universidad Tecnica Federico Santa Maria +56 32 654239 Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513 ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: gcc 2.95 vs 3.21 performance 2003-02-07 10:31 ` b_adlakha 2003-02-07 18:46 ` Horst von Brand @ 2003-02-07 21:49 ` Neil Booth 1 sibling, 0 replies; 84+ messages in thread From: Neil Booth @ 2003-02-07 21:49 UTC (permalink / raw) To: b_adlakha; +Cc: linux-kernel b_adlakha@softhome.net wrote:- > Cool (you're trying to fix it), maybe you can modify tcc so it is optimized > for compiling linux (optimized for compiling speed and runtime speed for > linux). I think it'll be easier and quicker to just make it compile linux > properly first, then do the testing/fixing for other things, as they are so > many compilers for other things anyway...And maybe it can be called "Linux > C Compiler"? lol. Sorry, I only care about GCC. Neil. ^ permalink raw reply [flat|nested] 84+ messages in thread
* Re: gcc 2.95 vs 3.21 performance 2003-02-06 7:02 ` Neil Booth [not found] ` <courier.3E423112.00007219@softhome.net> @ 2003-02-10 2:14 ` Jeff Garzik 2003-02-10 9:19 ` Tomas Szepe 1 sibling, 1 reply; 84+ messages in thread From: Jeff Garzik @ 2003-02-10 2:14 UTC (permalink / raw) To: Neil Booth; +Cc: Jeff Muizelaar, Andi Kleen, Linus Torvalds, linux-kernel Neil Booth wrote: > Jeff Muizelaar wrote:- > > >>There is also tcc (http://fabrice.bellard.free.fr/tcc/) >>It claims to support gcc-like inline assembler, appears to be much >>smaller and faster than gcc. Plus it is GPL so the liscense isn't a >>problem either. > > > It doesn't expand macros correctly, however, and accepts an enormous > range of invalid code without a single diagnostic. I'm pretty sure > it's arithmetic rules are incorrect, too. It's certainly nowhere > near C89 compliance. 100% agreed. However, for our purposes, TinyCC is only missing two pieces needed for successfully building a bootable kernel: * __builtin_constant_p * function inlining Given the existing TinyCC source base, function inlining is a big step (since tcc doesn't do AST-like things currently), so don't expect that very soon. TinyCC is a fun little project to watch and play around with, though, and can compile most major open source projects, as well as itself. Jeff ^ permalink raw reply [flat|nested] 84+ messages in thread
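For readers unfamiliar with the two features named above, here is a minimal sketch of how they are typically combined. `__builtin_constant_p` is a GCC extension that evaluates to 1 when the compiler can prove its argument is a compile-time constant. The macro and function names below (`my_ilog2`, `CONST_ILOG2`, `runtime_ilog2`, `pages_order`) are hypothetical, invented for illustration only, and are not taken from the kernel or from tcc.

```c
#include <assert.h>

/* Run-time fallback: floor(log2(n)) computed with a loop.
 * Hypothetical helper, not kernel code. */
static unsigned int runtime_ilog2(unsigned long n)
{
    unsigned int log = 0;

    assert(n != 0);          /* log2(0) is undefined */
    while (n >>= 1)
        log++;
    return log;
}

/* Pure constant expression, valid for 1 <= n <= 255 only. */
#define CONST_ILOG2(n)                          \
    ((n) >= 128 ? 7u : (n) >= 64 ? 6u :         \
     (n) >= 32  ? 5u : (n) >= 16 ? 4u :         \
     (n) >= 8   ? 3u : (n) >= 4  ? 2u :         \
     (n) >= 2   ? 1u : 0u)

/* If the compiler can see that n is a small constant, the whole
 * expression can fold to a literal; otherwise the loop is called.
 * Both branches give the same answer, so the result does not depend
 * on which one the compiler picks. */
#define my_ilog2(n)                                              \
    (__builtin_constant_p(n) && (n) > 0 && (n) < 256             \
         ? CONST_ILOG2(n)                                        \
         : runtime_ilog2(n))

/* Function inlining matters here: only after this function is inlined
 * into a caller that passes a constant can __builtin_constant_p()
 * become true for the shifted value. */
static inline unsigned int pages_order(unsigned long bytes)
{
    return my_ilog2(bytes >> 12);
}
```

A compiler that lacks the builtin cannot build code written this way at all, and one that lacks inlining would always take the run-time path inside `pages_order`, which is one reason both features come up together.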
* Re: gcc 2.95 vs 3.21 performance 2003-02-10 2:14 ` Jeff Garzik @ 2003-02-10 9:19 ` Tomas Szepe 0 siblings, 0 replies; 84+ messages in thread From: Tomas Szepe @ 2003-02-10 9:19 UTC (permalink / raw) To: Jeff Garzik Cc: Neil Booth, Jeff Muizelaar, Andi Kleen, Linus Torvalds, linux-kernel

> [jgarzik@pobox.com]
>
> Given the existing TinyCC source base, function inlining is a big step
> (since tcc doesn't do AST-like things currently), so don't expect that
> very soon. TinyCC is a fun little project to watch and play around
> with, though, and can compile most major open source projects, as well
> as itself.

I wonder how that can be, though, because I've failed getting it to compile code as trivial as

	walk_de = (dirent_t *) debug_malloc(sizeof(dirent_t));

where dirent_t is a simple structure and debug_malloc is prototyped to

	void *debug_malloc(size_t size);

-- 
Tomas Szepe <szepe@pinerecords.com>

^ permalink raw reply [flat|nested] 84+ messages in thread
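To try the statement above against a given compiler, it has to be placed in a complete translation unit. The following is one possible self-contained test case, offered only as a sketch: the message does not show the definition of `dirent_t` or the body of `debug_malloc`, so the structure layout, the `malloc()` wrapper and the `alloc_dirent` function are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical layout: the message only says dirent_t is
 * "a simple structure". */
typedef struct dirent_s {
    char name[256];
    struct dirent_s *next;
} dirent_t;

/* Prototype as quoted in the message. */
void *debug_malloc(size_t size);

/* Assumed body: a thin wrapper around malloc(). The real
 * implementation is not shown in the thread. */
void *debug_malloc(size_t size)
{
    assert(size > 0);
    return malloc(size);
}

/* The statement from the message, wrapped in a function so that it
 * can be compiled on its own. */
dirent_t *alloc_dirent(void)
{
    dirent_t *walk_de;

    walk_de = (dirent_t *) debug_malloc(sizeof(dirent_t));
    return walk_de;
}
```

If a compiler rejects this file, the failure can be narrowed down by removing the cast or by replacing `sizeof(dirent_t)` with `sizeof(*walk_de)`, which would show whether the typedef name inside the cast or inside `sizeof` is what it fails to parse.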
[parent not found: <120432836@toto.iv>]
* Re: gcc 2.95 vs 3.21 performance [not found] <120432836@toto.iv> @ 2003-02-05 2:45 ` Peter Chubb 0 siblings, 0 replies; 84+ messages in thread From: Peter Chubb @ 2003-02-05 2:45 UTC (permalink / raw) To: Bryan Andersen; +Cc: linux-kernel, vda, root, Martin J. Bligh, lse-tech >>>>> "Bryan" == Bryan Andersen <bryan@bogonomicon.net> writes: Bryan> Personal opinion here but I know it is also held by many Bryan> developers I know and work with. I'd rather have a compiler Bryan> that produces correct and fast code but ran slow than one that Bryan> produces slow or bad code and runs fast. Remember compilation Bryan> is done far less often than run time execution. Yes I too Bryan> noticed a difference when I switched over to 3.2 but I also Bryan> noticed some of my code speed up. A different personal opinion: I'd prefer a compiler that can be told either to run fast and produce correct but suboptimal code or to produce the fastest correct code it can. While developing, the compile/test/think/edit cycle is dominated by compile time for me. So fast compilation is important while developing algorithms. -- Dr Peter Chubb peterc@gelato.unsw.edu.au You are lost in a maze of BitKeeper repositories, all almost the same. ^ permalink raw reply [flat|nested] 84+ messages in thread
[parent not found: <200302052021.h15KLrXv000881@darkstar.example.net>]
* Re: gcc 2.95 vs 3.21 performance [not found] <200302052021.h15KLrXv000881@darkstar.example.net> @ 2003-02-05 20:28 ` b_adlakha 0 siblings, 0 replies; 84+ messages in thread From: b_adlakha @ 2003-02-05 20:28 UTC (permalink / raw) To: John Bradford; +Cc: linux-kernel John Bradford writes: >> No really, I downloaded tcc yesterday, compiled a few things with it and it >> is REALLY fast...and as I wrote yesterday, its small enough so people might >> say: >> >> A: "I can't compile linux, what is wrong?" >> B: "Here, compile it with the compiler attached to this message" >> >> Sounds like fun doesn't it? I mean, tcc is a working C compiler (thats >> supposed to be a great thing), and its only 170 kb gzipped tar! > > I haven't actually had chance to test tcc yet, but I'll try to > tomorrow. How close is it to being able to compile the kernel? > > John. Far away: it doesn't even compile the ncurses-based menuconfig... I think we need to hack (seriously) either tcc or linux... Since tcc is so small, it would be easier to make it behave a bit more like gcc than to modify the whole kernel... ^ permalink raw reply [flat|nested] 84+ messages in thread