linux-kernel.vger.kernel.org archive mirror
* Kernel Development & Objective-C
@ 2007-11-29 12:14 Ben Crowhurst
  2007-11-30 10:02 ` Xavier Bestel
                   ` (4 more replies)
  0 siblings, 5 replies; 57+ messages in thread
From: Ben Crowhurst @ 2007-11-29 12:14 UTC (permalink / raw)
  To: linux-kernel

Has Objective-C ever been considered for kernel development?

regards,
BPC


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-11-29 12:14 Kernel Development & Objective-C Ben Crowhurst
@ 2007-11-30 10:02 ` Xavier Bestel
  2007-11-30 10:09   ` KOSAKI Motohiro
  2007-11-30 10:29 ` Loïc Grenié
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 57+ messages in thread
From: Xavier Bestel @ 2007-11-30 10:02 UTC (permalink / raw)
  To: Ben.Crowhurst; +Cc: linux-kernel

On Thu, 2007-11-29 at 12:14 +0000, Ben Crowhurst wrote:
> Has Objective-C ever been considered for kernel development?

Why not C# instead ?



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-11-30 10:02 ` Xavier Bestel
@ 2007-11-30 10:09   ` KOSAKI Motohiro
  2007-11-30 10:20     ` Xavier Bestel
  2007-11-30 22:52     ` J.A. Magallón
  0 siblings, 2 replies; 57+ messages in thread
From: KOSAKI Motohiro @ 2007-11-30 10:09 UTC (permalink / raw)
  To: Xavier Bestel; +Cc: kosaki.motohiro, Ben.Crowhurst, linux-kernel

> > Has Objective-C ever been considered for kernel development?
> 
> Why not C# instead ?

Why not Haskell nor Erlang instead ? :-D




^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-11-30 10:09   ` KOSAKI Motohiro
@ 2007-11-30 10:20     ` Xavier Bestel
  2007-11-30 10:54       ` Jan Engelhardt
  2007-11-30 22:52     ` J.A. Magallón
  1 sibling, 1 reply; 57+ messages in thread
From: Xavier Bestel @ 2007-11-30 10:20 UTC (permalink / raw)
  To: KOSAKI Motohiro; +Cc: Ben.Crowhurst, linux-kernel

On Fri, 2007-11-30 at 19:09 +0900, KOSAKI Motohiro wrote:
> > > Has Objective-C ever been considered for kernel development?
> > 
> > Why not C# instead ?
> 
> Why not Haskell nor Erlang instead ? :-D

I heard of a bash compiler. That would enable development time
rationalization and maximize the collaborative convergence of a
community-oriented synergy.




^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-11-29 12:14 Kernel Development & Objective-C Ben Crowhurst
  2007-11-30 10:02 ` Xavier Bestel
@ 2007-11-30 10:29 ` Loïc Grenié
  2007-11-30 11:16   ` Ben Crowhurst
  2007-11-30 23:19   ` J.A. Magallón
  2007-11-30 11:37 ` Matti Aarnio
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 57+ messages in thread
From: Loïc Grenié @ 2007-11-30 10:29 UTC (permalink / raw)
  To: Ben.Crowhurst; +Cc: linux-kernel

2007/11/29, Ben Crowhurst <Ben.Crowhurst@stellatravel.co.uk>:
> Has Objective-C ever been considered for kernel development?
>
> regards,
> BPC

   No, it has not. Any language that looks remotely like an OO language
  has never been considered for (Linux) kernel development, nor for
  most, if not all, other operating systems' kernels.

    Various problems occur in an object oriented language. One of them
  is garbage collection: it provokes asynchronous delays and, during
  an interrupt or a system call for a real time task, the kernel cannot
  wait. Another is memory overhead: all the magic that OO languages
  provide takes space in memory, and the Linux kernel is used in embedded
  systems with very tight memory requirements.

    Lots of people will think of better reasons why ObjC is not used...

        Loïc Grenié

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-11-30 10:20     ` Xavier Bestel
@ 2007-11-30 10:54       ` Jan Engelhardt
  2007-11-30 14:21         ` David Newall
  0 siblings, 1 reply; 57+ messages in thread
From: Jan Engelhardt @ 2007-11-30 10:54 UTC (permalink / raw)
  To: Xavier Bestel; +Cc: KOSAKI Motohiro, Ben.Crowhurst, linux-kernel


On Nov 30 2007 11:20, Xavier Bestel wrote:
>On Fri, 2007-11-30 at 19:09 +0900, KOSAKI Motohiro wrote:
>> > > Has Objective-C ever been considered for kernel development?
>> > 
>> > Why not C# instead ?
>> 
>> Why not Haskell nor Erlang instead ? :-D
>
>I heard of a bash compiler. That would enable development time
>rationalization and maximize the collaborative convergence of a
>community-oriented synergy.
>
Fortran90 it has to be.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-11-30 10:29 ` Loïc Grenié
@ 2007-11-30 11:16   ` Ben Crowhurst
  2007-11-30 11:36     ` Karol Swietlicki
                       ` (2 more replies)
  2007-11-30 23:19   ` J.A. Magallón
  1 sibling, 3 replies; 57+ messages in thread
From: Ben Crowhurst @ 2007-11-30 11:16 UTC (permalink / raw)
  To: loic.grenie; +Cc: linux-kernel

Loïc Grenié wrote:
> 2007/11/29, Ben Crowhurst <Ben.Crowhurst@stellatravel.co.uk>:
>   
>> Has Objective-C ever been considered for kernel development?
>>
>> regards,
>> BPC
>>     
>
>    No, it has not. Any language that looks remotely like an OO language
>   has not ever been considered for (Linux) kernel development and for
>   most, if not all, other operating systems kernels.
>
>     Various problems occur in an object oriented language. One of them
>   is garbage collection: it provokes asynchronous delays and, during
>   an interrupt or a system call for a real time task, the kernel cannot
>   wait. 
Objective C 1.0 does not force nor have garbage collection.

> Another is memory overhead: all the magic that OO languages
>   provide take space in memory and Linux kernel is used in embedded
>   systems with very tight memory requirements.
>   
But are embedded systems not rapidly moving on? Turning to stare at the 
ADSL X6 modem with MBs of RAM.
>     Lots of people will think of better reasons why ObjC is not used...
>
>         Loïc Grenié
>
>
>   
Which I'm looking forward to hearing :)

Thank you for your appropriate response.

--

Regards
BPC



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-11-30 11:16   ` Ben Crowhurst
@ 2007-11-30 11:36     ` Karol Swietlicki
  2007-11-30 14:37     ` Lennart Sorensen
  2007-12-08  8:54     ` Rogelio M. Serrano Jr.
  2 siblings, 0 replies; 57+ messages in thread
From: Karol Swietlicki @ 2007-11-30 11:36 UTC (permalink / raw)
  To: Ben.Crowhurst; +Cc: loic.grenie, linux-kernel

On 30/11/2007, Ben Crowhurst <Ben.Crowhurst@stellatravel.co.uk> wrote:
> Loïc Grenié wrote:
> > 2007/11/29, Ben Crowhurst <Ben.Crowhurst@stellatravel.co.uk>:
> >
> >> Has Objective-C ever been considered for kernel development?
> >>

<snip>

> >     Lots of people will think of better reasons why ObjC is not used...
> >
> >         Loïc Grenié
> >
> Which I'm looking forward to hear :)
>
> Thank you for your appropriate response.

Here are a few reasons off the top of my head:
1. Adding extra unneeded complexity. Debugging would be harder.
2. Not many people can code ObjC when compared to the number of C coders.
3. If it ain't broken... Why fix it. The kernel works, right? Good.

You can find a great explanation somewhere out there; I'm not sure who
wrote it, but it explains why C++ is not a great choice for the Linux
kernel. Some of the things going against C++ will also go against ObjC.
I cannot find it, but it is out there somewhere.

I'm a newbie and I might be wrong, but the above is what I believe to be true.

Karol Swietlicki

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-11-29 12:14 Kernel Development & Objective-C Ben Crowhurst
  2007-11-30 10:02 ` Xavier Bestel
  2007-11-30 10:29 ` Loïc Grenié
@ 2007-11-30 11:37 ` Matti Aarnio
  2007-11-30 14:34 ` Lennart Sorensen
  2007-11-30 15:00 ` Chris Snook
  4 siblings, 0 replies; 57+ messages in thread
From: Matti Aarnio @ 2007-11-30 11:37 UTC (permalink / raw)
  To: Ben Crowhurst; +Cc: linux-kernel

On Thu, Nov 29, 2007 at 12:14:16PM +0000, Ben Crowhurst wrote:
> Has Objective-C ever been considered for kernel development?
>
> regards,
> BPC

As far as I recall:  Never.

Some limited subset of C++ was tried, but was soon abandoned.

Overall the kernel data structures are done in an objectish manner,
although there are no strong type mechanisms being used.

Could the kernel be written in a limited subset[*] of ObjC?  Very likely.
Would it be worth the effort?  A radical decrease in the number of available
programmers...

*) A subset enforcing the rule of never using dynamic memory allocation,
   even indirectly, when operating in interrupt context.

      /Matti Aarnio

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-11-30 10:54       ` Jan Engelhardt
@ 2007-11-30 14:21         ` David Newall
  2007-11-30 23:31           ` Bill Davidsen
  0 siblings, 1 reply; 57+ messages in thread
From: David Newall @ 2007-11-30 14:21 UTC (permalink / raw)
  To: Jan Engelhardt
  Cc: Xavier Bestel, KOSAKI Motohiro, Ben.Crowhurst, linux-kernel

Jan Engelhardt wrote:
> On Nov 30 2007 11:20, Xavier Bestel wrote:
>   
>> On Fri, 2007-11-30 at 19:09 +0900, KOSAKI Motohiro wrote:
>>     
>>>>> Has Objective-C ever been considered for kernel development?
>>>>>           
>>>> Why not C# instead ?
>>>>         
>>> Why not Haskell nor Erlang instead ? :-D
>>>       
>> I heard of a bash compiler. That would enable development time
>> rationalization and maximize the collaborative convergence of a
>> community-oriented synergy.
>>
>>     
> Fortran90 it has to be.

It used to be written in BCPL; or was that Multics?

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-11-29 12:14 Kernel Development & Objective-C Ben Crowhurst
                   ` (2 preceding siblings ...)
  2007-11-30 11:37 ` Matti Aarnio
@ 2007-11-30 14:34 ` Lennart Sorensen
  2007-11-30 15:26   ` Kyle Moffett
  2007-12-01 19:59   ` Avi Kivity
  2007-11-30 15:00 ` Chris Snook
  4 siblings, 2 replies; 57+ messages in thread
From: Lennart Sorensen @ 2007-11-30 14:34 UTC (permalink / raw)
  To: Ben Crowhurst; +Cc: linux-kernel

On Thu, Nov 29, 2007 at 12:14:16PM +0000, Ben Crowhurst wrote:
> Has Objective-C ever been considered for kernel development?

Doesn't Objective-C essentially require a runtime to provide a lot of
the features of the language?  If it does (as I suspect) then it is
totally unsuitable for kernel development.

That, and object-oriented languages in general are badly designed and a
bad idea.  Having not used Objective-C I have no idea if it qualifies as
badly designed or not.  Certainly C++ and Java are both very badly
designed.

Besides, the kernel does a wonderful job doing object oriented design
where appropriate using C without any of the stupidities added by the
common OO languages.
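
As a concrete illustration of that style -- a minimal sketch with
hypothetical names, not real kernel structures -- a driver fills in a
table of function pointers and the core calls through it:

        /* Sketch only: hypothetical names, not actual kernel interfaces. */
        struct blockdev;

        struct blockdev_ops {
                int  (*open)(struct blockdev *dev);
                int  (*read)(struct blockdev *dev, void *buf, unsigned long len);
                void (*release)(struct blockdev *dev);
        };

        struct blockdev {
                const struct blockdev_ops *ops; /* shared "vtable" */
                void *private_data;             /* per-driver state */
        };

        /* A driver supplies its own implementations... */
        static int  mydrv_open(struct blockdev *dev) { return 0; }
        static int  mydrv_read(struct blockdev *dev, void *buf,
                               unsigned long len) { return 0; }
        static void mydrv_release(struct blockdev *dev) { }

        static const struct blockdev_ops mydrv_ops = {
                .open    = mydrv_open,
                .read    = mydrv_read,
                .release = mydrv_release,
        };

        /* ...and core code calls through the table without knowing the driver. */
        static int core_read(struct blockdev *dev, void *buf, unsigned long len)
        {
                return dev->ops->read(dev, buf, len);
        }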

--
Len Sorensen

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-11-30 11:16   ` Ben Crowhurst
  2007-11-30 11:36     ` Karol Swietlicki
@ 2007-11-30 14:37     ` Lennart Sorensen
  2007-12-08  8:54     ` Rogelio M. Serrano Jr.
  2 siblings, 0 replies; 57+ messages in thread
From: Lennart Sorensen @ 2007-11-30 14:37 UTC (permalink / raw)
  To: Ben Crowhurst; +Cc: loic.grenie, linux-kernel

On Fri, Nov 30, 2007 at 11:16:14AM +0000, Ben Crowhurst wrote:
> But are embedded systems not rapidly moving on? Turning to stare at the 
> ADSL X6 modem with MBs of RAM.

Some embedded systems run on batteries, so the less RAM they have to
power the better, and the fewer CPU cycles they have to spend executing
code, the less power they consume.  An ADSL modem on your desk doesn't
have any of those worries; it just has to work, and if doubling the RAM
cuts the development problems by a lot, then that might have been a
worthwhile trade-off.

--
Len Sorensen

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-11-29 12:14 Kernel Development & Objective-C Ben Crowhurst
                   ` (3 preceding siblings ...)
  2007-11-30 14:34 ` Lennart Sorensen
@ 2007-11-30 15:00 ` Chris Snook
  2007-12-01  9:50   ` David Newall
  4 siblings, 1 reply; 57+ messages in thread
From: Chris Snook @ 2007-11-30 15:00 UTC (permalink / raw)
  To: Ben.Crowhurst; +Cc: linux-kernel

Ben Crowhurst wrote:
> Has Objective-C ever been considered for kernel development?

No.  Kernel programming requires what is essentially assembly language with a 
lot of syntactic sugar, which C provides.  Higher-level languages abstract away 
too much detail to be suitable for the sort of bit-perfect control you need when 
you're directly controlling bare metal.  You can still use object-oriented 
programming techniques in C, and we do this all the time in the kernel, but we 
do so with more fine-grained explicit control than a language like Objective-C 
would give us.  More to the point, if we tried to use Objective-C, we'd find 
ourselves needing to fall back to C-style explicitness so often that it wouldn't 
be worth the trouble.

In other news, I hear Hurd boots again!

	-- Chris

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-11-30 14:34 ` Lennart Sorensen
@ 2007-11-30 15:26   ` Kyle Moffett
  2007-11-30 18:40     ` H. Peter Anvin
  2007-12-01 20:03     ` Avi Kivity
  2007-12-01 19:59   ` Avi Kivity
  1 sibling, 2 replies; 57+ messages in thread
From: Kyle Moffett @ 2007-11-30 15:26 UTC (permalink / raw)
  To: Lennart Sorensen; +Cc: Ben Crowhurst, linux-kernel

On Nov 30, 2007, at 09:34:45, Lennart Sorensen wrote:
> On Thu, Nov 29, 2007 at 12:14:16PM +0000, Ben Crowhurst wrote:
>> Has Objective-C ever been considered for kernel development?
>
> Doesn't objective C essentially require a runtime to provide a lot  
> of the features of the language?  If it does (as I suspect) then it  
> is totally unsuitable for kernel development.
>
> That and object oriented languages in general are badly designed  
> and a bad idea.  Having not used objective C I have no idea if it  
> qualifies as badly designed or not.  Certainly C++ and java are  
> both very badly designed.

Objective-C is actually a pretty minimal wrapper around C; it was  
originally implemented as a C preprocessor.  It generally does not  
have any kind of memory management, garbage collection, or anything  
else (although typically a "runtime" will provide those features).   
There are no first-class exceptions, so there would be nothing to  
worry about there (the exceptions used in GUI programs are built  
around the setjmp/longjmp primitives).  Objective-C is also almost  
completely backwards-compatible with C, much more so than C++ ever  
was.  As far as the runtime goes the kernel would be expected to  
write its own, the same way that it implements "kmalloc()" as part of  
a "C runtime".  Since the runtime itself never does any implicit  
memory allocation, I think it would conceivably even be relatively  
safe for kernel usage.

With that said, there is a significant performance penalty as all  
Objective-C method calls are looked up symbolically at runtime for  
every single call.  For GUI programs where large chunks of the code  
are event-loops and not performance-sensitive that provides a huge  
amount of extra flexibility.  In the kernel though, there are many  
codepaths where *every* *single* instruction counts; that could be a  
serious performance hit.
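
As a rough illustration in plain C -- made-up names, not the actual
Objective-C runtime, which hashes and caches its lookups -- the cost
difference is roughly that between an indirect call through a known
slot and a search of the method list by name on every send:

        #include <string.h>

        /* Sketch only: hypothetical structures, not the real runtime. */
        struct object;

        struct method {
                const char *name;                       /* selector name */
                void (*imp)(struct object *self);
        };

        struct class {
                struct method *methods;                 /* searched per send */
                int nmethods;
                void (*read)(struct object *self);      /* vtable-style slot */
        };

        struct object {
                struct class *isa;
        };

        /* C/vtable-style call: a fixed number of loads, one indirect call. */
        static void call_slot(struct object *o)
        {
                o->isa->read(o);
        }

        /* Message-send-style call: find the method by name first, every time. */
        static void call_named(struct object *o, const char *sel)
        {
                int i;

                for (i = 0; i < o->isa->nmethods; i++) {
                        if (strcmp(o->isa->methods[i].name, sel) == 0) {
                                o->isa->methods[i].imp(o);
                                return;
                        }
                }
        }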

Cheers,
Kyle Moffett


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-11-30 15:26   ` Kyle Moffett
@ 2007-11-30 18:40     ` H. Peter Anvin
  2007-11-30 19:35       ` Kyle Moffett
  2007-12-01 20:03     ` Avi Kivity
  1 sibling, 1 reply; 57+ messages in thread
From: H. Peter Anvin @ 2007-11-30 18:40 UTC (permalink / raw)
  To: Kyle Moffett; +Cc: Lennart Sorensen, Ben Crowhurst, linux-kernel

Kyle Moffett wrote:
> With that said, there is a significant performance penalty as all 
> Objective-C method calls are looked up symbolically at runtime for every 
> single call.

GACK!

At least C++ has vtables.

	-hpa


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-11-30 18:40     ` H. Peter Anvin
@ 2007-11-30 19:35       ` Kyle Moffett
  0 siblings, 0 replies; 57+ messages in thread
From: Kyle Moffett @ 2007-11-30 19:35 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Lennart Sorensen, Ben Crowhurst, linux-kernel

On Nov 30, 2007, at 13:40:07, H. Peter Anvin wrote:
> Kyle Moffett wrote:
>> With that said, there is a significant performance penalty as all  
>> Objective-C method calls are looked up symbolically at runtime for  
>> every single call.
>
> GACK!
>
> At least C++ has vtables.

In a tight loop there is a way to do a single symbolic lookup and  
just call directly through a function pointer, but typically it isn't  
necessary for GUI programs and the like.  The flexibility of being  
able to dynamically add new methods to an existing class (at least  
for desktop user interfaces) significantly outweighs the performance  
cost.  Any performance-sensitive code is typically written in  
straight C anyways.
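
In C terms that is simply hoisting the lookup out of the loop -- a
sketch with hypothetical names (lookup_method() stands in for whatever
symbolic lookup the runtime provides):

        typedef void (*append_fn)(void *self, unsigned char byte);

        extern append_fn lookup_method(void *obj, const char *selector);

        static void append_all(void *obj, const unsigned char *buf, int len)
        {
                append_fn imp = lookup_method(obj, "appendByte:"); /* once */
                int i;

                for (i = 0; i < len; i++)
                        imp(obj, buf[i]);   /* plain indirect call per byte */
        }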

Cheers,
Kyle Moffett


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-11-30 10:09   ` KOSAKI Motohiro
  2007-11-30 10:20     ` Xavier Bestel
@ 2007-11-30 22:52     ` J.A. Magallón
  1 sibling, 0 replies; 57+ messages in thread
From: J.A. Magallón @ 2007-11-30 22:52 UTC (permalink / raw)
  To: Linux-Kernel, 

On Fri, 30 Nov 2007 19:09:45 +0900, KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:

> > > Has Objective-C ever been considered for kernel development?
> > 
> > Why not C# instead ?
> 
> Why not Haskell nor Erlang instead ? :-D
> 

Flash

http://www.lagmonster.info/humor/windowsrg.html

--
J.A. Magallon <jamagallon()ono!com>     \               Software is like sex:
                                         \         It's better when it's free
Mandriva Linux release 2008.1 (Cooker) for i586
Linux 2.6.23-jam03 (gcc 4.2.2 (4.2.2-1mdv2008.1)) SMP Sat Nov
09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-11-30 10:29 ` Loïc Grenié
  2007-11-30 11:16   ` Ben Crowhurst
@ 2007-11-30 23:19   ` J.A. Magallón
  2007-11-30 23:53     ` Nicholas Miell
                       ` (2 more replies)
  1 sibling, 3 replies; 57+ messages in thread
From: J.A. Magallón @ 2007-11-30 23:19 UTC (permalink / raw)
  To: Loïc Grenié; +Cc: Ben.Crowhurst, linux-kernel

On Fri, 30 Nov 2007 11:29:55 +0100, "Loïc Grenié" <loic.grenie@gmail.com> wrote:

> 2007/11/29, Ben Crowhurst <Ben.Crowhurst@stellatravel.co.uk>:
> > Has Objective-C ever been considered for kernel development?
> >
> > regards,
> > BPC
> 

Well, I really would like to learn some things here, could we
keep this off-topic thread alive just a bit, please?
(I know, I'm going to gain a troll's fame because I can't avoid these
discussions, it's one of my secret vices...)

>    No, it has not. Any language that looks remotely like an OO language
>   has not ever been considered for (Linux) kernel development and for
>   most, if not all, other operating systems kernels.
> 

I think BeOS was C++ and OS X is C + Objective-C (and runs on an iPhone).
The original MacOS (from 6 to 9) was Pascal (and a Mac SE was very near
to embedded hardware :) ).

I do not advocate rewriting Linux in C++, but don't say a kernel written
in C++ cannot be efficient.

>     Various problems occur in an object oriented language. One of them
>   is garbage collection: it provokes asynchronous delays and, during
>   an interrupt or a system call for a real time task, the kernel cannot
>   wait. 

C++ (and, from what I read in another answer, neither does Objective-C) has
no garbage collection. It does not do anything you did not ask it to do. It
just allows you to change this

        struct buffer *x;
        x = kmalloc(...);
        x->sz = 128;
        x->buff = kmalloc(...);
        ...
        kfree(x->buff);
        kfree(x);

to
        struct buffer *x;
        x = new buffer(128); /* the constructor itself allocates x->buff,
                                because _you_ programmed it, so you poor
                                programmer don't forget */
        ...
        delete x;            /* the destructor was likewise programmed to
                                deallocate x->buff itself, so you have one
                                less memory leak to worry about */

>   Another is memory overhead: all the magic that OO languages
>   provide take space in memory and Linux kernel is used in embedded
>   systems with very tight memory requirements.
> 

A vtable in C++ takes exactly the same space as the function
table pointer present in every driver nowadays... and probably
the virtual method call that C++ itself generates for

	thing->do_something(with,this)

like
	push thing
	push with
	push this
	call THING_vtable+indexof(do_something) // constants at compile time

is much more efficient than what gcc can manage to do with

	thing->do_something(with,this,thing)

	push with
	push this
	push thing
	get thing+offsetof(do_something) // not constant at compile time
	dereference it
	call it

(that is, get a generic field on a structure and use it as jump address)

In short, the kernel is object oriented and implements OO programming by
hand, but the compiler lacks the knowledge that it is object-oriented
programming, so it misses optimizations it could otherwise do.

>     Lots of people will think of better reasons why ObjC is not used...

People usually complain about RTTI or exceptions, but the benefits versus
memory cost should be seriously considered (surely there is something
in current drivers to ask 'are you a SATA or an IDE disk?').


--
J.A. Magallon <jamagallon()ono!com>     \               Software is like sex:
                                         \         It's better when it's free
Mandriva Linux release 2008.1 (Cooker) for i586
Linux 2.6.23-jam03 (gcc 4.2.2 (4.2.2-1mdv2008.1)) SMP Sat Nov
09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-11-30 14:21         ` David Newall
@ 2007-11-30 23:31           ` Bill Davidsen
  2007-11-30 23:40             ` Alan Cox
  0 siblings, 1 reply; 57+ messages in thread
From: Bill Davidsen @ 2007-11-30 23:31 UTC (permalink / raw)
  To: David Newall
  Cc: Jan Engelhardt, Xavier Bestel, KOSAKI Motohiro, Ben.Crowhurst,
	linux-kernel

David Newall wrote:
> Jan Engelhardt wrote:
>> On Nov 30 2007 11:20, Xavier Bestel wrote:
>>  
>>> On Fri, 2007-11-30 at 19:09 +0900, KOSAKI Motohiro wrote:
>>>    
>>>>>> Has Objective-C ever been considered for kernel development?
>>>>>>           
>>>>> Why not C# instead ?
>>>>>         
>>>> Why not Haskell nor Erlang instead ? :-D
>>>>       
>>> I heard of a bash compiler. That would enable development time
>>> rationalization and maximize the collaborative convergence of a
>>> community-oriented synergy.
>>>
>>>     
>> Fortran90 it has to be.
> 
> It used to be written in BCPL; or was that Multics?

BCPL was typeless, as was the successor B (between Bell Labs and GE we 
wrote thousands of lines of B, ported to 8080, GE600, etc). C introduced 
types, and the rest is history. Multics is written in PL/1, and I wrote 
a lot of PL/1 subset G back when as well. You don't know a slow compile 
until you get a seven-pass compiler with each pass on floppy.


-- 
Bill Davidsen <davidsen@tmr.com>
   "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-11-30 23:31           ` Bill Davidsen
@ 2007-11-30 23:40             ` Alan Cox
  2007-12-01  0:05               ` Arnaldo Carvalho de Melo
  2007-12-01 18:27               ` Bill Davidsen
  0 siblings, 2 replies; 57+ messages in thread
From: Alan Cox @ 2007-11-30 23:40 UTC (permalink / raw)
  To: Bill Davidsen
  Cc: David Newall, Jan Engelhardt, Xavier Bestel, KOSAKI Motohiro,
	Ben.Crowhurst, linux-kernel

> BCPL was typeless, as was the successor B (between Bell Labs and GE we 

B isn't quite typeless. It has minimal inbuilt support for concepts like
strings (although you can of course multiply a string by an array
pointer ;))

It also had some elegances that C lost, notably 

	case 1..5:

the ability to do non-zero-based arrays

	x[40];
	x-=10;

and the ability to reassign function names.

	printk = wombat;

as well as stuff like free(function);

Alan (who learned B before C, and is still waiting for P)

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-11-30 23:19   ` J.A. Magallón
@ 2007-11-30 23:53     ` Nicholas Miell
  2007-12-01  0:31     ` Al Viro
  2007-12-04 17:54     ` Lennart Sorensen
  2 siblings, 0 replies; 57+ messages in thread
From: Nicholas Miell @ 2007-11-30 23:53 UTC (permalink / raw)
  To: J.A. Magallón; +Cc: Loïc Grenié, Ben.Crowhurst, linux-kernel


On Sat, 2007-12-01 at 00:19 +0100, J.A. Magallón wrote:

> An vtable in C++ takes exactly the same space that the function
> table pointer present in every driver nowadays... and probably
> the virtual method call that C++ does itself with
> 
> 	thing->do_something(with,this)
> 
> like
> 	push thing
> 	push with
> 	push this
> 	call THING_vtable+indexof(do_something) // constants at compile time
> 
> is much more efficient that what gcc can mangle to do with
> 
> 	thing->do_something(with,this,thing)
> 
> 	push with
> 	push this
> 	push thing
> 	get thing+offsetof(do_something) // not constant at compile time
> 	dereference it
> 	call it
> 
> (that is, get a generic field on a structure and use it as jump address)
> 
> In short, the kernel is object oriented, implements OO programming by
> hand, but the compiler lacks the knowledge that it is object oriented
> programming so it could do some optimizations.

        struct test;
        struct testVtbl
        {
        	int (*fn1)(struct test *t, int x, int y);
        	int (*fn2)(struct test *t, int x, int y);
        };
        struct test
        {
        	struct testVtbl *vtbl;
        	int x, y;
        };
        void testCall(struct test *t, int x, int y)
        {
        	t->vtbl->fn1(t, x, y);
        	t->vtbl->fn2(t, x, y);
        }

and

        struct test
        {
        	virtual int fn1(int x, int y);
        	virtual int fn2(int x, int y);
        
        	int x, y;
        };
        
        void testCall(struct test *t, int x, int y)
        {
        	t->fn1(x, y);
        	t->fn2(x, y);
        }
        
generate instruction-for-instruction identical code.

-- 
Nicholas Miell <nmiell@comcast.net>


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-11-30 23:40             ` Alan Cox
@ 2007-12-01  0:05               ` Arnaldo Carvalho de Melo
  2007-12-01 18:27               ` Bill Davidsen
  1 sibling, 0 replies; 57+ messages in thread
From: Arnaldo Carvalho de Melo @ 2007-12-01  0:05 UTC (permalink / raw)
  To: Alan Cox
  Cc: Bill Davidsen, David Newall, Jan Engelhardt, Xavier Bestel,
	KOSAKI Motohiro, Ben.Crowhurst, linux-kernel

Em Fri, Nov 30, 2007 at 11:40:13PM +0000, Alan Cox escreveu:
> > BCPL was typeless, as was the successor B (between Bell Labs and GE we 
> 
> B isn't quite typeless. It has minimal inbuilt support for concepts like
> strings (although you can of course multiply a string by an array
> pointer ;))
> 
> It also had some elegances that C lost, notably 
> 
> 	case 1..5:

Hey, the language we use, gcc C, has this too 8-)

[acme@doppio net-2.6.25]$ find . -name "*.c" | xargs grep 'case.\+\.\.' | wc -l
400
[acme@doppio net-2.6.25]$ find . -name "*.c" | xargs grep 'case.\+\.\.' | head
./kernel/signal.c:      default: /* this is just in case for now ... */
./kernel/audit.c:       case AUDIT_FIRST_USER_MSG ...  AUDIT_LAST_USER_MSG:
./kernel/audit.c:       case AUDIT_FIRST_USER_MSG2 ...  AUDIT_LAST_USER_MSG2:
./kernel/audit.c:       case AUDIT_FIRST_USER_MSG ...  AUDIT_LAST_USER_MSG:
./kernel/audit.c:       case AUDIT_FIRST_USER_MSG2 ...  AUDIT_LAST_USER_MSG2:
./kernel/timer.c:        * well, in that case 2.2.x was broken anyways...
./arch/frv/kernel/traps.c:      case TBR_TT_TRAP2 ... TBR_TT_TRAP126:
./arch/frv/kernel/ptrace.c:             case 0 ... PT__END - 1:
./arch/frv/kernel/ptrace.c:             case 0 ... PT__END-1:
./arch/frv/kernel/gdb-stub.c:                   case GDB_REG_GR(1) ...  GDB_REG_GR(63):
[acme@doppio net-2.6.25]$

- Arnaldo

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-11-30 23:19   ` J.A. Magallón
  2007-11-30 23:53     ` Nicholas Miell
@ 2007-12-01  0:31     ` Al Viro
  2007-12-01  0:34       ` Al Viro
                         ` (2 more replies)
  2007-12-04 17:54     ` Lennart Sorensen
  2 siblings, 3 replies; 57+ messages in thread
From: Al Viro @ 2007-12-01  0:31 UTC (permalink / raw)
  To: J.A. Magallón; +Cc: Loïc Grenié, Ben.Crowhurst, linux-kernel

On Sat, Dec 01, 2007 at 12:19:50AM +0100, J.A. Magallón wrote:
> An vtable in C++ takes exactly the same space that the function
> table pointer present in every driver nowadays... and probably
> the virtual method call that C++ does itself with
> 
> 	thing->do_something(with,this)
> 
> like
> 	push thing
> 	push with
> 	push this
> 	call THING_vtable+indexof(do_something) // constants at compile time

This is not what vtables are.  Think for a minute - all codepaths arriving
to that point in your code will pick the address to call from the same
location.  Either the contents of that location is constant (in which case
you could bloody well call it directly in the first place) *or* it has to
somehow be reassigned back and forth, according to the value of this.  The
former is dumb, the latter - outright insane.

The contents of vtables is constant.  The whole point of that thing is
to deal with the situations where we _can't_ tell which derived class
this ->do_something() is from; if we could tell which vtable it is at
compile time, we wouldn't need to bother at all.

It's a tradeoff - we pay the extra memory access (fetch vtable pointer, then 
fetch method from vtable) for not having to store a slew of method pointers
in each instance of base class.  But the extra memory access is very much
there.  It can be further optimized away if you have several method calls
for the same object next to each other (then vtable can be picked once),
but it's still done at runtime.
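
Sketched in C with hypothetical structs, the tradeoff is between these
two layouts: per-instance method pointers (one load per call, fatter
objects) and a shared constant table (thin objects, two loads per call):

        /* Option 1: every instance carries its own method pointers. */
        struct obj_fat {
                int (*read)(struct obj_fat *o);
                int (*write)(struct obj_fat *o);
                int data;
        };

        /* Option 2: instances share one constant vtable. */
        struct obj_thin;
        struct obj_vtbl {
                int (*read)(struct obj_thin *o);
                int (*write)(struct obj_thin *o);
        };
        struct obj_thin {
                const struct obj_vtbl *vtbl;
                int data;
        };

        static int read_fat(struct obj_fat *o)   { return o->read(o); }       /* 1 fetch */
        static int read_thin(struct obj_thin *o) { return o->vtbl->read(o); } /* 2 fetches */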

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-12-01  0:31     ` Al Viro
@ 2007-12-01  0:34       ` Al Viro
  2007-12-01  1:09       ` J.A. Magallón
  2007-12-01 19:55       ` Avi Kivity
  2 siblings, 0 replies; 57+ messages in thread
From: Al Viro @ 2007-12-01  0:34 UTC (permalink / raw)
  To: J.A. Magallón; +Cc: Loïc Grenié, Ben.Crowhurst, linux-kernel

On Sat, Dec 01, 2007 at 12:31:19AM +0000, Al Viro wrote:
> somehow be reassigned back and forth, according to the value of this.  The
s/this/thing/, of course

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-12-01  0:31     ` Al Viro
  2007-12-01  0:34       ` Al Viro
@ 2007-12-01  1:09       ` J.A. Magallón
  2007-12-01 19:55       ` Avi Kivity
  2 siblings, 0 replies; 57+ messages in thread
From: J.A. Magallón @ 2007-12-01  1:09 UTC (permalink / raw)
  To: Linux-Kernel, 

On Sat, 1 Dec 2007 00:31:19 +0000, Al Viro <viro@ftp.linux.org.uk> wrote:

> On Sat, Dec 01, 2007 at 12:19:50AM +0100, J.A. Magallón wrote:
> > An vtable in C++ takes exactly the same space that the function
> > table pointer present in every driver nowadays... and probably
> > the virtual method call that C++ does itself with
> > 
> > 	thing->do_something(with,this)
> > 
> > like
> > 	push thing
> > 	push with
> > 	push this
> > 	call THING_vtable+indexof(do_something) // constants at compile time
> 
> This is not what vtables are.  Think for a minute - all codepaths arriving
> to that point in your code will pick the address to call from the same
> location.  Either the contents of that location is constant (in which case
> you could bloody well call it directly in the first place) *or* it has to
> somehow be reassigned back and forth, according to the value of this.  The
> former is dumb, the latter - outright insane.
> 
> The contents of vtables is constant.  The whole point of that thing is
> to deal with the situations where we _can't_ tell which derived class
> this ->do_something() is from; if we could tell which vtable it is at
> compile time, we wouldn't need to bother at all.
> 

Yup, my mistake (that's why I said I would learn something). I was thinking
of non-virtual methods. For virtual ones you have to fetch the vtable
start address and index from it.

> It's a tradeoff - we pay the extra memory access (fetch vtable pointer, then 
> fetch method from vtable) for not having to store a slew of method pointers
> in each instance of base class.  But the extra memory access is very much
> there.  It can be further optimized away if you have several method calls
> for the same object next to each other (then vtable can be picked once),
> but it's still done at runtime.

--
J.A. Magallon <jamagallon()ono!com>     \               Software is like sex:
                                         \         It's better when it's free
Mandriva Linux release 2008.1 (Cooker) for i586
Linux 2.6.23-jam03 (gcc 4.2.2 (4.2.2-1mdv2008.1)) SMP Sat Nov
09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-11-30 15:00 ` Chris Snook
@ 2007-12-01  9:50   ` David Newall
  0 siblings, 0 replies; 57+ messages in thread
From: David Newall @ 2007-12-01  9:50 UTC (permalink / raw)
  To: Chris Snook; +Cc: Ben.Crowhurst, linux-kernel

Chris Snook wrote:
> Ben Crowhurst wrote:
>> Has Objective-C ever been considered for kernel development?
>
> No.  Kernel programming requires what is essentially assembly language 
> with a lot of syntactic sugar, which C provides.

I somewhat disagree.  Kernel programming requires and deserves the same 
care, rigor and eye to details as all other serious systems.  Whilst 
performance is always a consideration, high-level languages give a 
reward in ease of expression and improved reliability, such that a 
notional performance cost is easily justified.  Occasionally, precise 
bit-diddling or tight timing requirements might necessitate use of 
assembly; even so, a lot of bit-diddling can be expressed in high-level 
languages.

Kernel programming might require a scintilla of assembly language, but 
the very vast majority of it should be written in a high-level language.

There's an old joke that claims, "real programmers can write FORTRAN in 
any language."  It's true.  Object orientation is a style of 
programming, not a language, and while certain languages have intrinsic 
support for this style, objects, methods, properties and inheritance can 
probably be written in any language.  It's an issue of putting in 
care and eye to detail.

Linux could be written in Objective-C, it could be written in Pascal, 
but it is written in plain C, with a smattering of assembler.  Does it 
need to be more complicated than that?

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-12-01 18:27               ` Bill Davidsen
@ 2007-12-01 18:18                 ` Alan Cox
  2007-12-03  1:23                   ` Bill Davidsen
  0 siblings, 1 reply; 57+ messages in thread
From: Alan Cox @ 2007-12-01 18:18 UTC (permalink / raw)
  To: Bill Davidsen
  Cc: David Newall, Jan Engelhardt, Xavier Bestel, KOSAKI Motohiro,
	Ben.Crowhurst, linux-kernel

> Well, original C allowed you to do what you wanted with pointers (I used 
> to teach that back when K&R was "the" C manual). Now people complain about 
> having pointers outside the array, which is a crock in practice, as long 
> as you don't actually /use/ an out of range value.

Actually the standards had good reasons to bar this use, because many
runtime environments used segmentation and unsigned segment offsets. On a
286 you could get into quite a mess with out of array reference tricks.

> variable with the address of the start. I was more familiar with the B 
> stuff, I wrote both the interpreter and the code generator+library for 
> the 8080 and GE600 machines. B on MULTICS, those were the days... :-D

B on Honeywell L66, so that may well have been a relative of your code
generator ?


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-11-30 23:40             ` Alan Cox
  2007-12-01  0:05               ` Arnaldo Carvalho de Melo
@ 2007-12-01 18:27               ` Bill Davidsen
  2007-12-01 18:18                 ` Alan Cox
  1 sibling, 1 reply; 57+ messages in thread
From: Bill Davidsen @ 2007-12-01 18:27 UTC (permalink / raw)
  To: Alan Cox
  Cc: David Newall, Jan Engelhardt, Xavier Bestel, KOSAKI Motohiro,
	Ben.Crowhurst, linux-kernel

Alan Cox wrote:
>> BCPL was typeless, as was the successor B (between Bell Labs and GE we 
> 
> B isn't quite typeless. It has minimal inbuilt support for concepts like
> strings (although you can of course multiply a string by an array
> pointer ;))
> 
> It also had some elegances that C lost, notably 
> 
> 	case 1..5:
> 
> the ability to do no zero biased arrays
> 
> 	x[40];
> 	x-=10;

Well, original C allowed you to do what you wanted with pointers (I used 
to teach that back when K&R was "the" C manual). Now people complain about 
having pointers outside the array, which is a crock in practice, as long 
as you don't actually /use/ an out of range value.
> 
> and the ability to reassign function names.
> 
> 	printk = wombat;

I had forgotten that, the function name was actually a variable with the 
entry point, say so in section 3.11. And as I recall the code, arrays 
were the same thing, a length ten vector was actually the vector and 
variable with the address of the start. I was more familiar with the B 
stuff, I wrote both the interpreter and the code generator+library for 
the 8080 and GE600 machines. B on MULTICS, those were the days... :-D
> 
> as well as stuff like free(function);
> 
> Alan (who learned B before C, and is still waiting for P)

I had the BCPL book still on the reference shelf in the office, along 
with goodies like the four candidates to be Ada, and a TRAC manual. I 
too expected the next language to be "P".

-- 
Bill Davidsen <davidsen@tmr.com>
   "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-12-01  0:31     ` Al Viro
  2007-12-01  0:34       ` Al Viro
  2007-12-01  1:09       ` J.A. Magallón
@ 2007-12-01 19:55       ` Avi Kivity
  2 siblings, 0 replies; 57+ messages in thread
From: Avi Kivity @ 2007-12-01 19:55 UTC (permalink / raw)
  To: Al Viro; +Cc: J.A. Magallón, Loïc Grenié, Ben.Crowhurst, linux-kernel

Al Viro wrote:
> On Sat, Dec 01, 2007 at 12:19:50AM +0100, J.A. Magallón wrote:
>   
>> An vtable in C++ takes exactly the same space that the function
>> table pointer present in every driver nowadays... and probably
>> the virtual method call that C++ does itself with
>>
>> 	thing->do_something(with,this)
>>
>> like
>> 	push thing
>> 	push with
>> 	push this
>> 	call THING_vtable+indexof(do_something) // constants at compile time
>>     
>
> This is not what vtables are.  Think for a minute - all codepaths arriving
> to that point in your code will pick the address to call from the same
> location.  Either the contents of that location is constant (in which case
> you could bloody well call it directly in the first place) *or* it has to
> somehow be reassigned back and forth, according to the value of this.  The
> former is dumb, the latter - outright insane.
>
> The contents of vtables is constant.  The whole point of that thing is
> to deal with the situations where we _can't_ tell which derived class
> this ->do_something() is from; if we could tell which vtable it is at
> compile time, we wouldn't need to bother at all.
>
> It's a tradeoff - we pay the extra memory access (fetch vtable pointer, then 
> fetch method from vtable) for not having to store a slew of method pointers
> in each instance of base class.  But the extra memory access is very much
> there.  It can be further optimized away if you have several method calls
> for the same object next to each other (then vtable can be picked once),
> but it's still done at runtime.
>   

True. C++ vtables have no performance advantage over C ->ops->function() 
calls. But they have no disadvantage either and they do offer many 
syntactic advantages (such as automatically casting the object type to 
the *correct* derived class).


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-11-30 14:34 ` Lennart Sorensen
  2007-11-30 15:26   ` Kyle Moffett
@ 2007-12-01 19:59   ` Avi Kivity
  2007-12-02 19:44     ` Jörn Engel
  2007-12-03 16:53     ` Lennart Sorensen
  1 sibling, 2 replies; 57+ messages in thread
From: Avi Kivity @ 2007-12-01 19:59 UTC (permalink / raw)
  To: Lennart Sorensen; +Cc: Ben Crowhurst, linux-kernel

Lennart Sorensen wrote:
> On Thu, Nov 29, 2007 at 12:14:16PM +0000, Ben Crowhurst wrote:
>   
>> Has Objective-C ever been considered for kernel development?
>>     
>
> Doesn't objective C essentially require a runtime to provide a lot of
> the features of the language?  If it does (as I suspect) then it is
> totally unsuitable for kernel development.
>
>   

C also requires a (very minimal) runtime. And I don't see how having a 
runtime disqualifies a language from being usable in a kernel; the 
runtime is just one more library, either supplied by the compiler or by 
the kernel.

>
> Besides the kernel does a wonderful job doing object oriented design
> where appropriate using C without any of the stupidities added by the
> common OO languages

Object orientation in C leaves much to be desired; see the huge number 
of void pointers and container_of()s in the kernel.
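
For reference, a stripped-down sketch of the container_of() idiom being
referred to (not the kernel's exact definition, which adds type
checking, and the structures here are hypothetical): a generic "base"
struct is embedded in a larger one, and the containing structure is
recovered by subtracting the member's offset.

        #include <stddef.h>     /* offsetof */

        #define container_of(ptr, type, member) \
                ((type *)((char *)(ptr) - offsetof(type, member)))

        struct base_dev {                       /* embedded "base class" */
                int id;
        };

        struct net_widget {                     /* hypothetical "derived class" */
                char name[16];
                struct base_dev dev;
        };

        /* Generic code passes struct base_dev * around; the owner gets its
         * own structure back with container_of(). */
        static void handle(struct base_dev *d)
        {
                struct net_widget *w = container_of(d, struct net_widget, dev);
                (void)w;                        /* use w->name, etc. */
        }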

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-11-30 15:26   ` Kyle Moffett
  2007-11-30 18:40     ` H. Peter Anvin
@ 2007-12-01 20:03     ` Avi Kivity
  2007-12-02 19:01       ` Andi Kleen
  1 sibling, 1 reply; 57+ messages in thread
From: Avi Kivity @ 2007-12-01 20:03 UTC (permalink / raw)
  To: Kyle Moffett; +Cc: Lennart Sorensen, Ben Crowhurst, linux-kernel

Kyle Moffett wrote:
> In the kernel though, there are many codepaths where *every* *single* 
> instruction counts; that could be a serious performance hit.

Write *those* *codepaths* in *C* or *assembly*. But only after you 
manage to measure a difference compared to the object-oriented systems 
language.

[I really doubt there are that many of these; syscall 
entry/dispatch/exit, interrupt dispatch, context switch, what else?]

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-12-01 20:03     ` Avi Kivity
@ 2007-12-02 19:01       ` Andi Kleen
  2007-12-03  5:12         ` Avi Kivity
  0 siblings, 1 reply; 57+ messages in thread
From: Andi Kleen @ 2007-12-02 19:01 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Kyle Moffett, Lennart Sorensen, Ben Crowhurst, linux-kernel

Avi Kivity <avi@argo.co.il> writes:
>
> [I really doubt there are that many of these; syscall
> entry/dispatch/exit, interrupt dispatch, context switch, what else?]

Networking, block IO, page fault, ... But only the fast paths in these 
cases. A lot of the kernel is slow path code and could probably
be written even in an interpreted language without much trouble.

-Andi

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-12-01 19:59   ` Avi Kivity
@ 2007-12-02 19:44     ` Jörn Engel
  2007-12-03 16:53     ` Lennart Sorensen
  1 sibling, 0 replies; 57+ messages in thread
From: Jörn Engel @ 2007-12-02 19:44 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Lennart Sorensen, Ben Crowhurst, linux-kernel

On Sat, 1 December 2007 21:59:31 +0200, Avi Kivity wrote:
> 
> Object orientation in C leaves much to be desired; see the huge number 
> of void pointers and container_of()s in the kernel.

While true, this isn't such a bad problem.  A language really sucks when
it tries to disallow something useful.  Back in university I was forced
to write system software in pascal.  Simple pointer arithmetic became a
5-line piece of code.

Imo the main advantage of C is simply that it doesn't get in the way.

Jörn

-- 
But this is not to say that the main benefit of Linux and other GPL
software is lower-cost. Control is the main benefit--cost is secondary.
-- Bruce Perens

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-12-01 18:18                 ` Alan Cox
@ 2007-12-03  1:23                   ` Bill Davidsen
  0 siblings, 0 replies; 57+ messages in thread
From: Bill Davidsen @ 2007-12-03  1:23 UTC (permalink / raw)
  To: Alan Cox
  Cc: David Newall, Jan Engelhardt, Xavier Bestel, KOSAKI Motohiro,
	Ben.Crowhurst, linux-kernel

Alan Cox wrote:
>> Well, original C allowed you to do what you wanted with pointers (I used 
>> to teach that back when K&R was "the" C manual). Now people which about 
>> having pointers outside the array, which is a crock in practice, as long 
>> as you don't actually /use/ an out of range value.
>>     
>
> Actually the standards had good reasons to bar this use, because many
> runtime environments used segmentation and unsigned segment offsets. On a
> 286 you could get into quite a mess with out of array reference tricks.
>
>   
>> variable with the address of the start. I was more familiar with the B 
>> stuff, I wrote both the interpreter and the code generator+library for 
>> the 8080 and GE600 machines. B on MULTICS, those were the days... :-D
>>     
>
> B on Honeywell L66, so that may well have been a relative of your code
> generator ?
>
>   
Probably the Bell Labs one. I did an optimizer on the Pcode which caught 
jumps to jumps, then had separate 8080 and L66 code generators into GMAP 
on the GE and the CP/M assembler or the Intel (ISIS) assembler for 8080. 
There was also an 8085 code generator using the "ten undocumented 
instructions" from the Dr Dobbs article. GE actually had a contract with 
Intel to provide CPUs with those instructions, and we used them in the 
Terminet(r) printers.

Those were the days ;-)

-- 
Bill Davidsen <davidsen@tmr.com>
  "Woe unto the statesman who makes war without a reason that will still
  be valid when the war is over..." Otto von Bismark 



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-12-02 19:01       ` Andi Kleen
@ 2007-12-03  5:12         ` Avi Kivity
  2007-12-03  9:50           ` Andi Kleen
  2007-12-03 12:35           ` Kernel Development & Objective-C Gilboa Davara
  0 siblings, 2 replies; 57+ messages in thread
From: Avi Kivity @ 2007-12-03  5:12 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Kyle Moffett, Lennart Sorensen, Ben Crowhurst, linux-kernel

Andi Kleen wrote:
> Avi Kivity <avi@argo.co.il> writes:
>   
>> [I really doubt there are that many of these; syscall
>> entry/dispatch/exit, interrupt dispatch, context switch, what else?]
>>     
>
> Networking, block IO, page fault, ... But only the fast paths in these 
> cases. A lot of the kernel is slow path code and could probably
> be written even in an interpreted language without much trouble.
>
>   

Even these (with the exception of the page fault path) are hardly the "we 
care about a single instruction" material suggested above.  Even with a 
million packets per second per core (does such a setup actually exist?) 
you have a few thousand cycles per packet.  For block you'd need around 
5,000 disks per core to reach such rates.
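
(At, say, 3 GHz, a million packets per second per core works out to
3,000,000,000 / 1,000,000 = 3,000 cycles per packet, assuming the core
does nothing else.)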

The real benefits aren't in keeping close to the metal, but in high 
level optimizations.  Ironically, these are easier when the code is a 
little more abstracted.  You can add quite a lot of instructions if it 
allows you not to do some of the I/O at all.



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-12-03  5:12         ` Avi Kivity
@ 2007-12-03  9:50           ` Andi Kleen
  2007-12-03 11:46             ` Avi Kivity
  2007-12-03 12:35           ` Kernel Development & Objective-C Gilboa Davara
  1 sibling, 1 reply; 57+ messages in thread
From: Andi Kleen @ 2007-12-03  9:50 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Andi Kleen, Kyle Moffett, Lennart Sorensen, Ben Crowhurst, linux-kernel

> Even these (with the exception of the page fault path) are hardly "we 
> care about a single instruction" material suggested above.  Even with a 

With 10Gbit/s ethernet working you start to care about every cycle.
Similar with highend routing or in some latency sensitive network
applications (e.g. in HPC). Another simple noticeable case is Unix
sockets and your X server communication. 

And there are some special cases where block IO is also pretty critical.
A popular one is TPC-* benchmarking, but there are also others and it 
looks likely in the future that this will become more critical
as block devices become faster (e.g. highend SSDs) 

> The real benefits aren't in keeping close to the metal, but in high 
> level optimizations.  Ironically, these are easier when the code is a 
> little more abstracted.  You can add quite a lot of instructions if it 
> allows you not to do some of the I/O at all.

While that's partly true -- cache misses are good for a lot of cycles --
it is not the whole truth and at some point raw code efficiency matters
too.

For example there are some CPUs who are relatively slow at indirect
function calls and there are actually cases where this can be measured.

-Andi


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-12-03  9:50           ` Andi Kleen
@ 2007-12-03 11:46             ` Avi Kivity
  2007-12-03 11:50               ` Andi Kleen
  2007-12-03 21:13               ` Willy Tarreau
  0 siblings, 2 replies; 57+ messages in thread
From: Avi Kivity @ 2007-12-03 11:46 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Kyle Moffett, Lennart Sorensen, Ben Crowhurst, linux-kernel

Andi Kleen wrote:
>> Even these (with the exception of the page fault path) are hardly "we 
>> care about a single instruction" material suggested above.  Even with a 
>>     
>
> With 10Gbit/s ethernet working you start to care about every cycle.
>   

If you have 10M packets/sec no amount of cycle-saving will help you.  
You need high level optimizations like TSO.  I'm not saying we should 
sacrifice cycles like there's no tomorrow, but the big wins are elsewhere.

> Similar with highend routing or in some latency sensitive network
> applications (e.g. in HPC). 

True.  And here, the hardware can cut hundreds of cycles by avoiding the 
kernel completely for the fast path.

> Another simple noticeable case is Unix
> sockets and your X server communication.

Your reflexes are *much* better than mine if you can measure half a 
nanosecond on X.

Here, it's scheduling that matters, avoiding large transfers, and 
avoiding ping-pongs, not some cycles on the unix domain socket.  You 
already paid 150 cycles or so by issuing the syscall and thousands for 
copying the data, 50 more won't be noticeable except in nanobenchmarks.

>  
>
> And there are some special cases where block IO is also pretty critical.
> A popular one is TPC-* benchmarking, but there are also others and it 
> looks likely in the future that this will become more critical
> as block devices become faster (e.g. highend SSDs) 
>   

And again the key is batching, improving cpu affinity, and caching, not 
looking for a faster instruction sequence.

>   
>> The real benefits aren't in keeping close to the metal, but in high 
>> level optimizations.  Ironically, these are easier when the code is a 
>> little more abstracted.  You can add quite a lot of instructions if it 
>> allows you not to do some of the I/O at all.
>>     
>
> While that's partly true -- cache misses are good for a lot of cycles --
> it is not the whole truth and at some point raw code efficiency matters
> too.
>
> For example there are some CPUs who are relatively slow at indirect
> function calls and there are actually cases where this can be measured.
>
>   

That is true.  But any self-respecting systems language will let you 
choose between direct and indirect calls.

If adding an indirect call allows you to avoid even 1% of I/O, you save 
much more than you lose, so again the high level optimizations win.

Nanooptimizations are fun (I do them myself, I admit) but that's not 
where performance as measured by the end user lies.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-12-03 11:46             ` Avi Kivity
@ 2007-12-03 11:50               ` Andi Kleen
  2007-12-03 21:13               ` Willy Tarreau
  1 sibling, 0 replies; 57+ messages in thread
From: Andi Kleen @ 2007-12-03 11:50 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Andi Kleen, Kyle Moffett, Lennart Sorensen, Ben Crowhurst, linux-kernel

On Mon, Dec 03, 2007 at 01:46:45PM +0200, Avi Kivity wrote:
> If you have 10M packets/sec no amount of cycle-saving will help you.  
> You need high level optimizations like TSO.  I'm not saying we should 
> sacrifice cycles like there's no tomorrow, but the big wins are elsewhere.

Both high and low level optimizations are needed for good performance.

> >Similar with highend routing or in some latency sensitive network
> >applications (e.g. in HPC). 
> 
> True.  And here, the hardware can cut hundreds of cycles by avoiding the 
> kernel completely for the fast path.

A lot of applications don't and the user space networking schemes
tend to have their own drawbacks anyways.

> >Another simple noticeable case is Unix
> >sockets and your X server communication.
> 
> Your reflexes are *much* better than mine if you can measure half a 
> nanosecond on X.

That's not about mouse/keyboard input, but about all X protocol communication
between X clients and X server. The key is not large copies here 
anyways (large data is put into shm) but latency.

> And again the key is batching, improving cpu affinity, and caching, not 
> looking for a faster instruction sequence.

That's not the whole story no. Batching etc are needed, but the
faster instruction sequences are needed too. 

> Nanooptimizations are fun (I do them myself, I admit) but that's not 
> where performance as measured by the end user lies.

It depends. Often high level (and then caching) optimizations are better 
bang for the buck, but completely disregarding the fast path work is a bad 
thing too. As an example see Christoph's recent work on the slub fastpath
which makes a quite measurable difference on benchmarks.


-Andi


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-12-03  5:12         ` Avi Kivity
  2007-12-03  9:50           ` Andi Kleen
@ 2007-12-03 12:35           ` Gilboa Davara
  2007-12-03 12:44             ` Gilboa Davara
                               ` (2 more replies)
  1 sibling, 3 replies; 57+ messages in thread
From: Gilboa Davara @ 2007-12-03 12:35 UTC (permalink / raw)
  To: LKML Linux Kernel; +Cc: Avi Kivity


On Mon, 2007-12-03 at 07:12 +0200, Avi Kivity wrote:
> Andi Kleen wrote:
> > Avi Kivity <avi@argo.co.il> writes:
> >   
> >> [I really doubt there are that many of these; syscall
> >> entry/dispatch/exit, interrupt dispatch, context switch, what else?]
> >>     
> >
> > Networking, block IO, page fault, ... But only the fast paths in these 
> > cases. A lot of the kernel is slow path code and could probably
> > be written even in an interpreted language without much trouble.
> >
> >   
> 
> Even these (with the exception of the page fault path) are hardly "we 
> care about a single instruction" material suggested above.  Even with a 
> million packets per second per core (does such a setup actually exist?)  
> You have a few thousand cycles per packet.  For block you'd need around 
> 5,000 disks per core to reach such rate

Intel's newest dual 10GbE NIC can easily (?) throw ~14M packets per
second. (theoretical peak at 1514bytes/frame)
Granted, installing such a device on a single CPU/single core machine is
absurd - but even on an 8 core machine (2 x Xeon 53xx/54xx / AMD
Barcelona) it can still generate ~1M packets/s per core.

Now, assuming you're doing low-level (passive) filtering of some sort
(frame/packet routing, traffic interception and/or packet analysis),
hardware assistance (TSO, complete TCP offloading, etc.) is off the
table and each and every cycle within netif_receive_skb (and friends)
-counts-.

I don't suggest that the kernel should be (re)designed for such (niche)
applications, but on the other hand, if it works...

- Gilboa


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-12-03 12:35           ` Kernel Development & Objective-C Gilboa Davara
@ 2007-12-03 12:44             ` Gilboa Davara
  2007-12-03 16:28             ` Casey Schaufler
  2007-12-04 17:50             ` Lennart Sorensen
  2 siblings, 0 replies; 57+ messages in thread
From: Gilboa Davara @ 2007-12-03 12:44 UTC (permalink / raw)
  To: LKML Linux Kernel; +Cc: Avi Kivity


On Mon, 2007-12-03 at 14:35 +0200, Gilboa Davara wrote:
> Intel's newest dual 10GbE NIC can easily (?) throw ~14M packets per
> second. (theoretical peak at 1514bytes/frame)
> Granted, installing such a device on a single CPU/single core machine is
> absurd - but even on an 8 core machine (2 x Xeon 53xx/54xx / AMD
> Barcelona) it can still generate ~1M packets/s per core.

Sigh... Sorry. Please ignore the broken math on my part.
Make that 1.8M frames/second per card and ~100K packets/second per core.

- Gilboa



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-12-03 12:35           ` Kernel Development & Objective-C Gilboa Davara
  2007-12-03 12:44             ` Gilboa Davara
@ 2007-12-03 16:28             ` Casey Schaufler
  2007-12-04 17:50             ` Lennart Sorensen
  2 siblings, 0 replies; 57+ messages in thread
From: Casey Schaufler @ 2007-12-03 16:28 UTC (permalink / raw)
  To: Gilboa Davara, LKML Linux Kernel; +Cc: Avi Kivity


--- Gilboa Davara <gilboad@gmail.com> wrote:

> 
> On Mon, 2007-12-03 at 07:12 +0200, Avi Kivity wrote:
> > Andi Kleen wrote:
> > > Avi Kivity <avi@argo.co.il> writes:
> > >   
> > >> [I really doubt there are that many of these; syscall
> > >> entry/dispatch/exit, interrupt dispatch, context switch, what else?]
> > >>     
> > >
> > > Networking, block IO, page fault, ... But only the fast paths in these 
> > > cases. A lot of the kernel is slow path code and could probably
> > > be written even in an interpreted language without much trouble.
> > >
> > >   
> > 
> > Even these (with the exception of the page fault path) are hardly "we 
> > care about a single instruction" material suggested above.  Even with a 
> > million packets per second per core (does such a setup actually exist?)  
> > You have a few thousand cycles per packet.  For block you'd need around 
> > 5,000 disks per core to reach such rate
> 
> Intel's newest dual 10GbE NIC can easily (?) throw ~14M packets per
> second. (theoretical peak at 1514bytes/frame)
> Granted, installing such a device on a single CPU/single core machine is
> absurd - but even on an 8 core machine (2 x Xeon 53xx/54xx / AMD
> Barcelona) it can still generate ~1M packets/s per core.
> 
> Now assuming you're doing low-level (passive) filtering of some sort
> (frame/packet routing, traffic interception and/or packet analysis)
> using hardware assistance (TSO, complete TCP offloading, etc) is off the
> table and each and every cycle within netif_receive_skb (and friends)
> -counts-.
> 
> I don't suggest that the kernel should be (re)designed for such (niche)
> applications but on other hand, if it works...

I was involved in a 10GbE project like you're describing not too
long ago. Only the driver, and only a tight, lean, special-purpose
driver at that, was able to deal with line-rate volumes. This was
in a real appliance, where faster CPUs were not an option. In fact,
no hardware changes were possible due to the issues with squeezing
in the 10GbE NICs. This project would have been impossible without
the speed and deterministic behavior of the kernel C environment.


Casey Schaufler
casey@schaufler-ca.com

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-12-01 19:59   ` Avi Kivity
  2007-12-02 19:44     ` Jörn Engel
@ 2007-12-03 16:53     ` Lennart Sorensen
  1 sibling, 0 replies; 57+ messages in thread
From: Lennart Sorensen @ 2007-12-03 16:53 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Ben Crowhurst, linux-kernel

On Sat, Dec 01, 2007 at 09:59:31PM +0200, Avi Kivity wrote:
> C also requires a (very minimal) runtime. And I don't see how having a 
> runtime disqualifies a language from being usable in a kernel; the 
> runtime is just one more library, either supplied by the compiler or by 
> the kernel.

Well, the majority of C syntax requires no runtime library.  There are
some system-call-like things that you often want that need a library
(like malloc and such), but those aren't really part of C itself.  Of
course without malloc and printf and file I/O calls the program would
probably be a bit boring.  I have written some small C programs without
a runtime, where the few things I needed were implemented in assembly,
poked the hardware directly and were called from the C program.
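
To make that concrete, here is a minimal sketch (my illustration, not the
actual program described above) of a C program that needs no runtime at
all: it is built with -nostdlib (an assumption about the build), so
_start replaces main and the only "library" is a hand-rolled wrapper
around the x86-64 syscall instruction (write is syscall 1, exit is 60):

static long sys_call3(long nr, long a, long b, long c)
{
	long ret;

	/* x86-64 Linux ABI: nr in rax, args in rdi/rsi/rdx,
	 * return value in rax, rcx and r11 clobbered */
	asm volatile ("syscall"
		      : "=a" (ret)
		      : "a" (nr), "D" (a), "S" (b), "d" (c)
		      : "rcx", "r11", "memory");
	return ret;
}

void _start(void)
{
	static const char msg[] = "hello, no runtime\n";

	sys_call3(1, 1, (long) msg, sizeof(msg) - 1);	/* write(1, ...) */
	sys_call3(60, 0, 0, 0);				/* exit(0) */
}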

> Object orientation in C leaves much to be desired; see the huge number 
> of void pointers and container_of()s in the kernel.

As a programming language, C leaves much to be desired.

--
Len Sorensen

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-12-03 11:46             ` Avi Kivity
  2007-12-03 11:50               ` Andi Kleen
@ 2007-12-03 21:13               ` Willy Tarreau
  2007-12-03 21:39                 ` J.A. Magallón
  2007-12-04 21:07                 ` Avi Kivity
  1 sibling, 2 replies; 57+ messages in thread
From: Willy Tarreau @ 2007-12-03 21:13 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Andi Kleen, Kyle Moffett, Lennart Sorensen, Ben Crowhurst, linux-kernel

On Mon, Dec 03, 2007 at 01:46:45PM +0200, Avi Kivity wrote:
> Andi Kleen wrote:
> >>Even these (with the exception of the page fault path) are hardly "we 
> >>care about a single instruction" material suggested above.  Even with a 
> >>    
> >
> >With 10Gbit/s ethernet working you start to care about every cycle.
> >  
> 
> If you have 10M packets/sec no amount of cycle-saving will help you.  
> You need high level optimizations like TSO.  I'm not saying we should 
> sacrifice cycles like there's no tomorrow, but the big wins are elsewhere.

Huh? At 4 GHz, you have 400 cycles to process each packet. If you need to
route those packets, those cycles may just be what you need to lookup a
forwarding table and perform a few MMIO on an accelerated chip which will
take care of the transfer. But you need those cycles. If you start to waste
them 30 by 30, the performance can drop by a critical factor.

> >Similar with highend routing or in some latency sensitive network
> >applications (e.g. in HPC). 
> 
> True.  And here, the hardware can cut hundreds of cycles by avoiding the 
> kernel completely for the fast path.
> 
> >Another simple noticeable case is Unix
> >sockets and your X server communication.
> 
> Your reflexes are *much* better than mine if you can measure half a 
> nanosecond on X.

It just depends on how many times a second it happens. For instance, consider
this trivial loop (fct is a two-element function array whose entries just return 1 or 2):

        i = 0;
        for (j = 0; j < (1 << 28); j++) {
                k = (j >> 8) & 1;
                i += fct[k]();
        }

It takes 1.6 seconds to execute on my athlon-xp 1.5 GHz. If, instead of
changing the function once every 256 calls, you change it to every call :

        i = 0;
        for (j = 0; j < (1 << 28); j++) {
                k = (j >> 0) & 1;
                i += fct[k]();
        }

Then it takes 4.3 seconds, which is about 3 times slower. The number
of calls per function remains the same (128M calls each); it's just the
branch prediction which is wrong every time. The very few nanoseconds added
at each call are enough to slow a program down from 1.6 to 4.3 seconds while
it executes the exact same code (it may even save one shift). If you have
such stupid code, say, to compute the color or alpha of each pixel in an
image, you will certainly notice the difference.

And such inefficient code may happen very often when you blindly rely
on function pointers instead of explicit calls.
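
(For reference, here is a self-contained version of that experiment; this
is my reconstruction, not the actual test program, so treat the exact
numbers as machine-dependent. Toggle the shift between 8 and 0 and time
the two runs:)

#include <stdio.h>

static int f1(void) { return 1; }
static int f2(void) { return 2; }

int main(void)
{
	int (*fct[2])(void) = { f1, f2 };
	unsigned long i = 0, j;
	const int shift = 8;	/* 8: switch targets every 256 calls, 0: every call */

	for (j = 0; j < (1UL << 28); j++) {
		unsigned int k = (j >> shift) & 1;
		i += fct[k]();
	}
	printf("%lu\n", i);	/* keep the loop from being optimized away */
	return 0;
}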

> Here, it's scheduling that matters, avoiding large transfers, and 
> avoiding ping-pongs, not some cycles on the unix domain socket.  You 
> already paid 150 cycles or so by issuing the syscall and thousands for 
> copying the data, 50 more won't be noticeable except in nanobenchmarks.

You are forgetting something very important: once you start stacking
functions to perform the dirty work for you, you end up with so much
abstraction that even new, simple code cannot be written at all without
relying on them. That is where the problem takes root: when you need to
write a fast function and you notice that you cannot touch a variable
without passing through a slow pinhole, your fast function will remain
slow whatever you do, and worst of all, you will think that it is
normally fast and that it cannot be written faster.

> >And there are some special cases where block IO is also pretty critical.
> >A popular one is TPC-* benchmarking, but there are also others and it 
> >looks likely in the future that this will become more critical
> >as block devices become faster (e.g. highend SSDs) 
> >  
> 
> And again the key is batching, improving cpu affinity, and caching, not 
> looking for a faster instruction sequence.

Every cycle burned is definitely lost. Time cannot go backwards. So
for each cycle that you lose to laziness, you have to become more and more
clever to find out how to write an alternative. Lazy people simply put
caches everywhere and after that they find it normal that "hello world" requires
2 Gigs of RAM to be displayed. The only true solution is to create better
algorithms, but you will find even fewer people capable of creating efficient
algorithms than you will find capable of coding correctly.

> >For example there are some CPUs who are relatively slow at indirect
> >function calls and there are actually cases where this can be measured.
> 
> That is true.  But any self-respecting systems language will let you 
> choose between direct and indirect calls.
> 
> If adding an indirect call allows you to avoid even 1% of I/O, you save 
> much more than you lose, so again the high level optimizations win.

It depends on which type of I/O. If the I/O is non-blocking, you end up doing
something else instead of actively burning cycles.

> Nanooptimizations are fun (I do them myself, I admit) but that's not 
> where performance as measured by the end user lies.

I do not agree. It's not uncommon to find 2- or 3-fold performance factors
between equivalent components when one is carefully optimized and the other
one is not. Granted, it takes an awful lot of time doing all those nano-opts
at the beginning, but the more you learn about how the hardware reacts to
your code, the more efficiently you write future code, with the least bloat.
End users notice bloat a lot (especially when CPU and RAM are excessively
wasted).

Best regards,
Willy


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-12-03 21:13               ` Willy Tarreau
@ 2007-12-03 21:39                 ` J.A. Magallón
  2007-12-03 21:57                   ` Alan Cox
  2007-12-04 21:07                 ` Avi Kivity
  1 sibling, 1 reply; 57+ messages in thread
From: J.A. Magallón @ 2007-12-03 21:39 UTC (permalink / raw)
  To: linux-kernel

On Mon, 3 Dec 2007 22:13:53 +0100, Willy Tarreau <w@1wt.eu> wrote:

...
> 
> It just depends how many times a second it happens. For instance, consider
> this trivial loop (fct is a two-function array which just return 1 or 2) :
> 
>         i = 0;
>         for (j = 0; j < (1 << 28); j++) {
>                 k = (j >> 8) & 1;
>                 i += fct[k]();
>         }
> 
> It takes 1.6 seconds to execute on my athlon-xp 1.5 GHz. If, instead of
> changing the function once every 256 calls, you change it to every call :
> 
>         i = 0;
>         for (j = 0; j < (1 << 28); j++) {
>                 k = (j >> 0) & 1;
>                 i += fct[k]();
>         }
> 
> Then it only takes 4.3 seconds, which is about 3 times slower. The number
> of calls per function remains the same (128M calls each), it's just the
> branch prediction which is wrong every time. The very few nanoseconds added
> at each call are enough to slow down a program from 1.6 to 4.3 seconds while
> it executes the exact same code (it may even save one shift). If you have
> such stupid code, say, to compute the color or alpha of each pixel in an
> image, you will certainly notice the difference.
> 
> And such poorly efficient code may happen very often when you blindly rely
> on function pointers instead of explicit calls.
> 
...
> 
> You are forgetting something very important : once you start stacking
> functions to perform the dirty work for you, you end up with so much
> abstraction that even new stupid code cannot be written at all without
> relying on them, and it's where the problem takes its roots, because
> when you need to write a fast function and you notice that you cannot
> touch a variable without passing through a slow pinhole, your fast
> function will remain slow whatever you do, and the worst of all is that
> you will think that it is normally fast and that it cannot be written
> faster.
> 

But don't forget that OOP is just another way to organize your code,
and lets the language/compiler do some things you shouldn't be doing
by hand, like filling in a vtable pointer, which is error-prone.

And of course everything depends on what language you choose and how
you use it.
You could write an equally efficient kernel in languages like C++,
using C++ abstractions as a high-level organization, where
the fast paths could be coded the right way; we are not talking about
C# or Java, where even a sum is a call to an overloaded method.
It's the difference between doing school-book pushes and pops on lists,
and suddenly inventing the splice operator...

--
J.A. Magallon <jamagallon()ono!com>     \               Software is like sex:
                                         \         It's better when it's free
Mandriva Linux release 2008.1 (Cooker) for i586
Linux 2.6.23-jam03 (gcc 4.2.2 (4.2.2-1mdv2008.1)) SMP Sat Nov

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-12-03 21:39                 ` J.A. Magallón
@ 2007-12-03 21:57                   ` Alan Cox
  2007-12-04 21:47                     ` J.A. Magallón
  0 siblings, 1 reply; 57+ messages in thread
From: Alan Cox @ 2007-12-03 21:57 UTC (permalink / raw)
  To: J.A. Magallón; +Cc: linux-kernel

> You could write an equally effcient kernel in languages like C++,
> using C++ abstractions as a high level organization, where

It's very, very hard to generate good code from C++ because of the numerous
ways objects get temporarily created, and the weak aliasing rules (as with C).

There are reasons that Fortran lives on (and no, I'm not suggesting one
should rewrite the kernel in Fortran ;)) and the fact it's not really got
pointer aliasing or "address of" operators, and all the resulting
optimisation problems, is one of the big ones.

Alan

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-12-03 12:35           ` Kernel Development & Objective-C Gilboa Davara
  2007-12-03 12:44             ` Gilboa Davara
  2007-12-03 16:28             ` Casey Schaufler
@ 2007-12-04 17:50             ` Lennart Sorensen
  2007-12-05 10:31               ` Gilboa Davara
  2 siblings, 1 reply; 57+ messages in thread
From: Lennart Sorensen @ 2007-12-04 17:50 UTC (permalink / raw)
  To: Gilboa Davara; +Cc: LKML Linux Kernel, Avi Kivity

On Mon, Dec 03, 2007 at 02:35:31PM +0200, Gilboa Davara wrote:
> Intel's newest dual 10GbE NIC can easily (?) throw ~14M packets per
> second. (theoretical peak at 1514bytes/frame)
> Granted, installing such a device on a single CPU/single core machine is
> absurd - but even on an 8 core machine (2 x Xeon 53xx/54xx / AMD
> Barcelona) it can still generate ~1M packets/s per core.

10GbE can't do 14M packets per second if the packets are 1514 bytes.  At
10M packets per second you have less than 1000 bits per packet, which is
far from 1514 bytes.

10Gbps gives you at most 1.25GBps, which at 1514 bytes per packet works
out to 825627 packets per second.  You could reach ~14M packets per
second with only the smallest packet size, which is rather unusual for
high throughput traffic, since you waste almost all the bytes on
overhead in that case.  But you do want to be able to handle at least a
million or two packets per second to do 10GbE.
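
(As a quick sanity check of those numbers, here is a trivial sketch of
mine, not part of the original mail: dividing the 10 Gbit/s line rate by
the frame size gives the ~825k figure for 1514-byte frames, and dividing
by the 84-byte minimum slot, a 64-byte frame plus preamble, SFD and
inter-frame gap, gives the familiar ~14.88M packets per second:)

#include <stdio.h>

int main(void)
{
	const double bps = 10e9;	/* 10 Gbit/s */

	printf("1514-byte frames: %8.0f pkts/s\n", bps / 8 / 1514);
	printf("  64-byte frames: %8.0f pkts/s\n", bps / 8 / 84);
	return 0;
}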

--
Len Sorensen

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-11-30 23:19   ` J.A. Magallón
  2007-11-30 23:53     ` Nicholas Miell
  2007-12-01  0:31     ` Al Viro
@ 2007-12-04 17:54     ` Lennart Sorensen
  2007-12-04 21:10       ` Avi Kivity
  2007-12-04 21:24       ` J.A. Magallón
  2 siblings, 2 replies; 57+ messages in thread
From: Lennart Sorensen @ 2007-12-04 17:54 UTC (permalink / raw)
  To: J.A. Magallón; +Cc: Loïc Grenié, Ben.Crowhurst, linux-kernel

On Sat, Dec 01, 2007 at 12:19:50AM +0100, J.A. Magallón wrote:
> I think BeOS was C++ and OSX is C+ObjectiveC (and runs on an iPhone).
> Original MacOS (fron 6 to 9) was Pascal (and a mac SE was very near
> to embedded hardware :) ).
> 
> I do not advocate to rewrite Linux in C++, but don't say a kernel written
> in C++ can not be efficient.

Well I am pretty sure the micro kernel of OS X is in C, and certainly
the BSD layer is as well.  So the only ObjC part would be the nextstep
framework and other parts of the Mac GUI and other Mac APIs they
provide, which all at some point probably end up calling down into the C
stuff below.

> C++ (and, from what I read in another answer, Objective-C too) has no garbage
> collection. It does not do anything you did not tell it to do. It just allows
> you to change this
> 
> 	struct buffer *x;
> 	x = kmalloc(...)
> 	x->sz = 128
> 	x->buff = kmalloc(...)
> 	...
> 	kfree(x->buff)
> 	kfree(x)
> 	
> to
> 	struct buffer *x;
> 	x = new buffer(128); (that does itself allocates x->buff,
>                               because _you_ programmed it,
>                               so you poor programmer don't forget)
>         ...
> 	delete x;            (that also was programmed to deallocate
>                               x->buff itself, sou you have one less
>                               memory leak to worry about)

But kmalloc is implemented by the kernel.  Who implements 'new'?

--
Len Sorensen

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-12-03 21:13               ` Willy Tarreau
  2007-12-03 21:39                 ` J.A. Magallón
@ 2007-12-04 21:07                 ` Avi Kivity
  2007-12-04 22:43                   ` Willy Tarreau
  1 sibling, 1 reply; 57+ messages in thread
From: Avi Kivity @ 2007-12-04 21:07 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Andi Kleen, Kyle Moffett, Lennart Sorensen, Ben Crowhurst, linux-kernel

Willy Tarreau wrote:
>>>>    
>>>>         
>>> With 10Gbit/s ethernet working you start to care about every cycle.
>>>  
>>>       
>> If you have 10M packets/sec no amount of cycle-saving will help you.  
>> You need high level optimizations like TSO.  I'm not saying we should 
>> sacrifice cycles like there's no tomorrow, but the big wins are elsewhere.
>>     
>
> Huh? At 4 GHz, you have 400 cycles to process each packet. If you need to
> route those packets, those cycles may just be what you need to lookup a
> forwarding table and perform a few MMIO on an accelerated chip which will
> take care of the transfer. But you need those cycles. If you start to waste
> them 30 by 30, the performance can drop by a critical factor.
>
>   

I really doubt Linux spends 400 cycles routing a packet.  Look what an 
skbuff looks like.

A flood ping to localhost on a 2GHz system takes 8 microseconds, that's 
16,000 cycles.  Sure it involves userspace, but you're about two orders 
of magnitude off.  And the localhost interface is nicely cached in L1 
without mmio at all, unlike real devices.

>>> Another simple noticeable case is Unix
>>> sockets and your X server communication.
>>>       
>> Your reflexes are *much* better than mine if you can measure half a 
>> nanosecond on X.
>>     
>
> It just depends how many times a second it happens. For instance, consider
> this trivial loop (fct is a two-function array which just return 1 or 2) :
>
>         i = 0;
>         for (j = 0; j < (1 << 28); j++) {
>                 k = (j >> 8) & 1;
>                 i += fct[k]();
>         }
>
> It takes 1.6 seconds to execute on my athlon-xp 1.5 GHz. If, instead of
> changing the function once every 256 calls, you change it to every call :
>
>         i = 0;
>         for (j = 0; j < (1 << 28); j++) {
>                 k = (j >> 0) & 1;
>                 i += fct[k]();
>         }
>
> Then it only takes 4.3 seconds, which is about 3 times slower. The number
> of calls per function remains the same (128M calls each), it's just the
> branch prediction which is wrong every time. The very few nanoseconds added
> at each call are enough to slow down a program from 1.6 to 4.3 seconds while
> it executes the exact same code (it may even save one shift). If you have
> such stupid code, say, to compute the color or alpha of each pixel in an
> image, you will certainly notice the difference.
>
>   

This happens very often in HPC, and when it does, it is often worthwhile 
to invest in manual optimizations or even assembly coding.  
Unfortunately it is very rare in the kernel (memcmp, raid xor, what 
else?).  Loops with high iteration counts are very rare, so any 
attention you give to the loop body is not amortized over a large number 
of executions.

> And such poorly efficient code may happen very often when you blindly rely
> on function pointers instead of explicit calls.
>   

Using an indirect call where a direct call is sufficient will also 
reduce the compiler's optimization opportunities.  However, I don't see 
anyone recommending it in the context of systems programming.

It is not true that the number of indirect calls necessarily increases 
if you use a language other than C.

(Actually, with templates you can reduce the number of indirect calls)

>> Here, it's scheduling that matters, avoiding large transfers, and 
>> avoiding ping-pongs, not some cycles on the unix domain socket.  You 
>> already paid 150 cycles or so by issuing the syscall and thousands for 
>> copying the data, 50 more won't be noticeable except in nanobenchmarks.
>>     
>
> You are forgetting something very important : once you start stacking
> functions to perform the dirty work for you, you end up with so much
> abstraction that even new stupid code cannot be written at all without
> relying on them, and it's where the problem takes its roots, because
> when you need to write a fast function and you notice that you cannot
> touch a variable without passing through a slow pinhole, your fast
> function will remain slow whatever you do, and the worst of all is that
> you will think that it is normally fast and that it cannot be written
> faster.
>
>   

I don't understand.  Can you give an example?

There are two cases where abstraction hurts performance: the first is 
where the mechanisms used to achieve the abstraction (functions instead 
of direct access to variables, function pointers instead of duplicating 
the caller) introduce performance overhead.  I don't think C has any 
advantage here -- actually a disadvantage as it lacks templates and is 
forced to use function pointers for nontrivial cases.  Usually the 
abstraction penalty is nil with modern compilers.

The second case is where too much abstraction clouds the programmer's 
mind.  But this is independent of the programming language.


>>> And there are some special cases where block IO is also pretty critical.
>>> A popular one is TPC-* benchmarking, but there are also others and it 
>>> looks likely in the future that this will become more critical
>>> as block devices become faster (e.g. highend SSDs) 
>>>  
>>>       
>> And again the key is batching, improving cpu affinity, and caching, not 
>> looking for a faster instruction sequence.
>>     
>
> Every cycle burned is definitely lost. The time cannot go backwards. So
> for each cycle that you lose to laziness, you have to become more and more
> clever to find out how to write an alternative. Lazy people simply put
> caches everywhere and after that they find normal that "hello world" requires
> 2 Gigs of RAM to be displayed. 

A 100 byte program will print "hello world" on a UART and stop.  A 
modern program will load a vector description of a font, scale it to the 
desired size, render it using anti aliasing and sub-pixel positioning, 
lay it out according to the language rules of wherever you live, and 
place it on a multi-megabyte frame buffer.  Yes it needs hundreds of 
megabytes and lots of nasty algorithms to do that.

> The only true solution is to create better
> algorithms, but you will find even less people capable of creating efficient
> algorithms than you will find capable of coding correctly.
>
>   

That is true, that is why we see a lot more microoptimizations than 
algorithmic progress.

But if you want a fast streaming filesystem you choose XFS over ext3, 
even though the latter is much smaller and easier to optimize.  If you 
write a network server you choose epoll() instead of trying to optimize 
select() somehow.  True algorithmic improvements are rare but they are 
the ones that are actually measurable.


>>> For example there are some CPUs who are relatively slow at indirect
>>> function calls and there are actually cases where this can be measured.
>>>       
>> That is true.  But any self-respecting systems language will let you 
>> choose between direct and indirect calls.
>>
>> If adding an indirect call allows you to avoid even 1% of I/O, you save 
>> much more than you lose, so again the high level optimizations win.
>>     
>
> It depends which type of I/O. If the I/O is non-blocking, you end up doing
> something else instead of actively burning cycles.
>
>   

Unless you are I/O bound, which is usually the case when you have 2GHz 
cpus driving 200Hz disks.

>> Nanooptimizations are fun (I do them myself, I admit) but that's not 
>> where performance as measured by the end user lies.
>>     
>
> I do not agree. It's not uncommon to find 2- or 3-fold performance factors
> between equivalent components when one is carefully optimized and the other
> one is not. Granted it takes an awful lot of time doing all those nano-opts
> at the beginning, but the more you learn about how the hardware reacts to
> your code, the more efficiently you write future code, with the fewest bloat.
> End users notice bloat a lot (especially when CPU and RAM are excessively
> wasted).
>   

Can you give an example of a 2- or 3- fold factor on an end-user 
workload achieved by microopts?

I agree about bloat.


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-12-04 17:54     ` Lennart Sorensen
@ 2007-12-04 21:10       ` Avi Kivity
  2007-12-04 21:24       ` J.A. Magallón
  1 sibling, 0 replies; 57+ messages in thread
From: Avi Kivity @ 2007-12-04 21:10 UTC (permalink / raw)
  To: Lennart Sorensen
  Cc: J.A. Magallón, Loïc Grenié, Ben.Crowhurst, linux-kernel

Lennart Sorensen wrote:
> But kmalloc is implemented by the kernel.  Who implements 'new'?
>   

The kernel.


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-12-04 17:54     ` Lennart Sorensen
  2007-12-04 21:10       ` Avi Kivity
@ 2007-12-04 21:24       ` J.A. Magallón
  1 sibling, 0 replies; 57+ messages in thread
From: J.A. Magallón @ 2007-12-04 21:24 UTC (permalink / raw)
  To: Linux-Kernel, 

On Tue, 4 Dec 2007 12:54:13 -0500, lsorense@csclub.uwaterloo.ca (Lennart Sorensen) wrote:

> On Sat, Dec 01, 2007 at 12:19:50AM +0100, J.A. Magallón wrote:
> > I think BeOS was C++ and OSX is C+ObjectiveC (and runs on an iPhone).
> > Original MacOS (fron 6 to 9) was Pascal (and a mac SE was very near
> > to embedded hardware :) ).
> > 
> > I do not advocate to rewrite Linux in C++, but don't say a kernel written
> > in C++ can not be efficient.
> 
> Well I am pretty sure the micro kernel of OS X is in C, and certainly
> the BSD layer is as well.  So the only ObjC part would be the nextstep
> framework and other parts of the Mac GUI and other Mac APIs they
> provide, which all at some point probably end up calling down into the C
> stuff below.
> 

Yup, thanks.

> > C++ (and for what I read on other answer, nor ObjectiveC) has no garbage
> > collection. It does not anything you did not it to do. It just allows
> > you to change this
> > 
> > 	struct buffer *x;
> > 	x = kmalloc(...)
> > 	x->sz = 128
> > 	x->buff = kmalloc(...)
> > 	...
> > 	kfree(x->buff)
> > 	kfree(x)
> > 	
> > to
> > 	struct buffer *x;
> > 	x = new buffer(128); (that does itself allocates x->buff,
> >                               because _you_ programmed it,
> >                               so you poor programmer don't forget)
> >         ...
> > 	delete x;            (that also was programmed to deallocate
> >                               x->buff itself, sou you have one less
> >                               memory leak to worry about)
> 
> But kmalloc is implemented by the kernel.  Who implements 'new'?
> 

Help yourself... just as kmalloc() is a replacement for userspace glibc's
malloc, you can write your own replacements for the functions/operators in
libstdc++ (operators are just cosmetic, as are many other features in C++).
In fact, for someone who dared to write a kernel C++ framework, the
very first function he would have to write could be something like:

void *operator new(size_t sz)
{
	return kmalloc(sz, GFP_KERNEL);
}

And one could write alternatives like

operator new(size_t sz, int flags)   -> x = new(GFP_ATOMIC) X;

operator new(size_t sz, MemPool& pl) -> x = new(pool) X;

If you are curious, this page http://www.osdev.org/wiki/C_PlusPlus
has some clues about what you should implement to get rid of
libstdc++.

--
J.A. Magallon <jamagallon()ono!com>     \               Software is like sex:
                                         \         It's better when it's free
Mandriva Linux release 2008.1 (Cooker) for i586
Linux 2.6.23-jam03 (gcc 4.2.2 (4.2.2-1mdv2008.1)) SMP Sat Nov

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-12-03 21:57                   ` Alan Cox
@ 2007-12-04 21:47                     ` J.A. Magallón
  2007-12-04 22:20                       ` Diego Calleja
  0 siblings, 1 reply; 57+ messages in thread
From: J.A. Magallón @ 2007-12-04 21:47 UTC (permalink / raw)
  To: linux-kernel

On Mon, 3 Dec 2007 21:57:27 +0000, Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:

> > You could write an equally effcient kernel in languages like C++,
> > using C++ abstractions as a high level organization, where
> 
> It's very very hard to generate good C code because of the numerous ways
> objects get temporarily created, and the week aliasing rules (as with C).
> 

That is what I like about C++: with good placement of high-level features
like const and & (references), one can gain fine control over what
gets copied or not.
Try to write a Vector class that does ops with SSE without storing
temporaries on the stack. It's a good example of how one can get low-level
control, and gcc is pretty good at simplifying things like u = v + 2*w
without putting anything on the stack, all in xmm registers.

The advantage is you only have to be careful once, when you write
the class.

> There are reasons that Fortran lives on (and no I'm not suggesting one
> should rewrite the kernel in Fortran ;)) and the fact its not really got
> pointer aliasing or "address of" operators and all the resulting
> optimsation problems is one of the big ones.
> 


--
J.A. Magallon <jamagallon()ono!com>     \               Software is like sex:
                                         \         It's better when it's free
Mandriva Linux release 2008.1 (Cooker) for i586
Linux 2.6.23-jam03 (gcc 4.2.2 (4.2.2-1mdv2008.1)) SMP Sat Nov

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-12-04 21:47                     ` J.A. Magallón
@ 2007-12-04 22:20                       ` Diego Calleja
  2007-12-05 10:59                         ` Giacomo A. Catenazzi
  0 siblings, 1 reply; 57+ messages in thread
From: Diego Calleja @ 2007-12-04 22:20 UTC (permalink / raw)
  To: "J.A. Magallón"; +Cc: linux-kernel

On Tue, 4 Dec 2007 22:47:45 +0100, "J.A. Magallón" <jamagallon@ono.com> wrote:

> That is what I like of C++, with good placement of high level features
> like const's and & (references) one can gain fine control over what
> gets copied or not.

But... if there's some way Linux can get "language improvements", it is with
new C standards/gcc extensions/etc. It'd be nice if people tried to add
(useful) C extensions to gcc, instead of proposing some random language :)

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-12-04 21:07                 ` Avi Kivity
@ 2007-12-04 22:43                   ` Willy Tarreau
  2007-12-05 17:05                     ` Micro vs macro optimizations (was: Re: Kernel Development & Objective-C) Avi Kivity
  0 siblings, 1 reply; 57+ messages in thread
From: Willy Tarreau @ 2007-12-04 22:43 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Andi Kleen, Kyle Moffett, Lennart Sorensen, Ben Crowhurst, linux-kernel

Hi Avi,

On Tue, Dec 04, 2007 at 11:07:05PM +0200, Avi Kivity wrote:
> Willy Tarreau wrote:
> >>>>   
> >>>>        
> >>>With 10Gbit/s ethernet working you start to care about every cycle.
> >>> 
> >>>      
> >>If you have 10M packets/sec no amount of cycle-saving will help you.  
> >>You need high level optimizations like TSO.  I'm not saying we should 
> >>sacrifice cycles like there's no tomorrow, but the big wins are elsewhere.
> >>    
> >
> >Huh? At 4 GHz, you have 400 cycles to process each packet. If you need to
> >route those packets, those cycles may just be what you need to lookup a
> >forwarding table and perform a few MMIO on an accelerated chip which will
> >take care of the transfer. But you need those cycles. If you start to waste
> >them 30 by 30, the performance can drop by a critical factor.
> >
> >  
> 
> I really doubt Linux spends 400 cycles routing a packet.  Look what an 
> skbuff looks like.

That's not what I wrote. I just wrote about doing a forwarding-table lookup
and MMIO so that dedicated hardware NICs can process the recv/send to the
correct ends. If you just need to scan a list of DMAed packets, look at
their destination IP address, look that IP up in a table to find the output
NIC and destination MAC address, link them into an output list and wake
the output NIC up, there's nothing here which requires more than 400 cycles.
I never said that it was a requirement to pass through the existing
network stack.

> A flood ping to localhost on a 2GHz system takes 8 microseconds, that's 
> 16,000 cycles.  Sure it involves userspace, but you're about two orders 
> of magnitude off.

I don't see where you see a userspace (or I don't understand your test).
For the traffic generation I often do from user space, I can send 630k raw
Ethernet packets per second on a 1.8 GHz Opteron with PCI-e
NICs. That's 2857 cycles per packet, including the (small amount of)
userspace work. That's quite cheap.

> And the localhost interface is nicely cached in L1 without mmio at all,
> unlike real devices.

(...)
> This happens very often in HPC, and when it does, it is often worthwhile 
> to invest in manual optimizations or even assembly coding.  
> Unfortunately it is very rare in the kernel (memcmp, raid xor, what 
> else?).  Loops with high iteration counts are very rare, so any 
> attention you give to the loop body is not amortized over a large number 
> of executions.

Well, in my example above, everything in the path of the send() syscall down
to the bare-metal NIC is under high pressure in a fast loop. 30 cycles
already represent 1% of the performance! In fact, to modulate speed, I
use a busy loop with a volatile int and small values.

> >And such poorly efficient code may happen very often when you blindly rely
> >on function pointers instead of explicit calls.
> >  
> 
> Using an indirect call where a direct call is sufficient will also 
> reduce the compiler's optimization opportunities.

That's true.

> However, I don't see 
> anyone recommending it in the context of systems programming.
> 
> It is not true that the number of indirect calls necessarily increases 
> if you use a language other than C.
> 
> (Actually, with templates you can reduce the number of indirect calls)
> 
> >>Here, it's scheduling that matters, avoiding large transfers, and 
> >>avoiding ping-pongs, not some cycles on the unix domain socket.  You 
> >>already paid 150 cycles or so by issuing the syscall and thousands for 
> >>copying the data, 50 more won't be noticeable except in nanobenchmarks.
> >>    
> >
> >You are forgetting something very important : once you start stacking
> >functions to perform the dirty work for you, you end up with so much
> >abstraction that even new stupid code cannot be written at all without
> >relying on them, and it's where the problem takes its roots, because
> >when you need to write a fast function and you notice that you cannot
> >touch a variable without passing through a slow pinhole, your fast
> >function will remain slow whatever you do, and the worst of all is that
> >you will think that it is normally fast and that it cannot be written
> >faster.
> >
> >  
> 
> I don't understand.  Can you give an example?

Yes, the most common examples found today involve applications reading
data from databases. For instance, let's say that one function in your
program must count the number of unique people with a name starting
with an "A". It is very common to see "low-level" primitives to abstract
the database for portability purposes. One such primitive will
generally consist in retrieving a list of people with their names,
age and sex in one well-formatted 3-column array. Many lazy people will
not see any problem in calling this one from the function described
above. Basically, what they would do is:

 count_people_with_name_starting_with_a()
    -> array[name,age,sex] = get_list_of_people()
         -> while read_one_people_entry() {
               alloc(one_line_of_3_columns)
               read then parse the 3 fields
               format_them_appropriately
            }
    -> create a new array "name2" by duplicating the "name" column
    -> name3 = sort_unique(name2)
    -> name4 = name3.grep("^A")
    -> return name4.count

Don't laugh, I've recently read such a horrible thing. It was done
that way just because it was easier. Without the abstraction layer,
the coder would have been forced to access the database anyway and would
have seen the added value in just counting from the inner while
loop, saving lots of copies, greps, sorts, etc.:

 count_people_with_name_starting_with_a() {
      count = 0;
      while read_one_people_entry() {
         read the 3 fields into a statically-allocated buffer
         if (name[0] == 'A') count++;
      }
      return count;
 }

I'm not saying that the above was not possible, just that it's
1000% easier to do the former without even having to think about the
fact that the final code uses such horrible things. And yes, I can confirm
that when you see this, you want to shoot the author!

> There are two cases where abstraction hurts performance: the first is 
> where the mechanisms used to achieve the abstraction (functions instead 
> of direct access to variables, function pointers instead of duplicating 
> the caller) introduce performance overhead.  I don't think C has any 
> advantage here -- actually a disadvantage as it lacks templates and is 
> forced to use function pointers for nontrivial cases.  Usually the 
> abstraction penalty is nil with modern compilers.
> 
> The second case is where too much abstraction clouds the programmer's 
> mind.  But this is independent of the programming language.

Agreed. But most often, the abstraction prevents the user from accessing
some information directly and that becomes nasty. I remember when I was
a teen, I wrote a program designed to inventory what you had in your PC,
and run a few performance tests. It ran in one of those semi-graphical DOS
modes where you use graphics characters to draw boxes. I initially wrote all
the windowing code myself and it ran perfectly. I once decided to rewrite
it using TurboVision, the windowing framework from Borland (it was written
in TurboPascal). I made intensive use of the equivalent of a putchar()
function to write text in a window. You cannot imagine my pain when I
ran it on my old 8088: it wrote at the speed of a 1200 bps terminal. I
then tried to find out how to write faster, even by accessing the window
buffer directly. I couldn't. I had to reverse-engineer the internal
structures by debugging memory contents in order to find the pointers
to the window buffer to write to them directly. After this disastrous
experience with abstraction, I thought "never that crap again".

> >Every cycle burned is definitely lost. The time cannot go backwards. So
> >for each cycle that you lose to laziness, you have to become more and more
> >clever to find out how to write an alternative. Lazy people simply put
> >caches everywhere and after that they find normal that "hello world" 
> >requires
> >2 Gigs of RAM to be displayed. 
> 
> A 100 byte program will print "hello world" on a UART and stop.  A 
> modern program will load a vector description of a font, scale it to the 
> desired size, render it using anti aliasing and sub-pixel positioning, 
> lay it out according to the language rules of whereever you live, and 
> place it on a multi-megabyte frame buffer.  Yes it needs hundreds of 
> megabytes and lots of nasty algorithms to do that.

What I'm complaining about is that when you don't want those fancy things,
you still have them to justify the hundreds of megs. And even if you manage
to print to stdout, you still have a huge runtime just in case you'd like
to use the fancy features.

> >The only true solution is to create better
> >algorithms, but you will find even less people capable of creating 
> >efficient
> >algorithms than you will find capable of coding correctly.
> 
> That is true, that is why we see a lot more microoptimizations than 
> algorithmic progress.

Also, algorithmic research is not very rewarding. You can work for
months or years thinking you have found the right algo for the job, then
finally discover a limitation you did not expect and throw that amount
of work in the bin in a few minutes.

> But if you want a fast streaming filesystem you choose XFS over ext3, 
> even though the latter is much smaller and easier to optimize.  If you 
> write a network server you choose epoll() instead of trying to optimize 
> select() somehow.

It's interesting that you cite epoll() vs select(). I measured the
break-even point at around 1000 FDs. Below that, select() is faster; above,
epoll() is faster. On a small number of entries (fewer than 100), a
select()-based proxy can be 20-30% faster than the same one running on
epoll(), because select(), while dumber, is cheaper to set up.
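
(To illustrate the setup-cost difference, here is a rough sketch of mine,
not the proxy code itself: the select() side rebuilds its fd_set with
plain user-space bit operations and pays one syscall per wait, while the
epoll side pays one epoll_ctl() syscall per registered fd up front, a
cost that only pays off once the fd count gets large:)

#include <stddef.h>
#include <sys/epoll.h>
#include <sys/select.h>

/* select(): per-call setup is cheap user-space work, one syscall per wait */
static void wait_select(const int *fds, int n)
{
	fd_set rd;
	int i, max = 0;

	FD_ZERO(&rd);
	for (i = 0; i < n; i++) {
		FD_SET(fds[i], &rd);
		if (fds[i] > max)
			max = fds[i];
	}
	select(max + 1, &rd, NULL, NULL, NULL);
}

/* epoll: registration costs one syscall per fd, but the wait itself
 * no longer depends on n */
static int setup_epoll(const int *fds, int n)
{
	int ep = epoll_create(n ? n : 1);
	int i;

	for (i = 0; i < n; i++) {
		struct epoll_event ev;

		ev.events = EPOLLIN;
		ev.data.fd = fds[i];
		epoll_ctl(ep, EPOLL_CTL_ADD, fds[i], &ev);
	}
	return ep;
}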

> True algorithmic improvements are rare but they are the ones that are
> actually measurable.

I generally agree with this.

> >>>For example there are some CPUs who are relatively slow at indirect
> >>>function calls and there are actually cases where this can be measured.
> >>>      
> >>That is true.  But any self-respecting systems language will let you 
> >>choose between direct and indirect calls.
> >>
> >>If adding an indirect call allows you to avoid even 1% of I/O, you save 
> >>much more than you lose, so again the high level optimizations win.
> >>    
> >
> >It depends which type of I/O. If the I/O is non-blocking, you end up doing
> >something else instead of actively burning cycles.
> >
> >  
> 
> Unless you are I/O bound, which is usually the case when you have 2GHz 
> cpus driving 200Hz disks.

That's true when you seek a lot. When you manage to mostly perform sequential
reads (such as what you do when processing large files such as logs), you can
easily achieve 80 MB/s, which is 20000 pages/s, or 100 times faster.

> >>Nanooptimizations are fun (I do them myself, I admit) but that's not 
> >>where performance as measured by the end user lies.
> >>    
> >
> >I do not agree. It's not uncommon to find 2- or 3-fold performance factors
> >between equivalent components when one is carefully optimized and the other
> >one is not. Granted it takes an awful lot of time doing all those nano-opts
> >at the beginning, but the more you learn about how the hardware reacts to
> >your code, the more efficiently you write future code, with the fewest 
> >bloat.
> >End users notice bloat a lot (especially when CPU and RAM are excessively
> >wasted).
> >  
> 
> Can you give an example of a 2- or 3- fold factor on an end-user 
> workload achieved by microopts?

Oh there are many primitives which are generally optimized in assembly for
this reason. What randomly comes to my mind :
  - graphics libraries. Saving 1 cycle per pixel in a rectangle drawing
    primitive can have an important impact in animated graphics for
    instance.

  - video/audio and generally multimedia code. I remember a specially
    written version of mpg123 about 10 years ago, which was optimized
    for i486 and which was the only one able to run on a 486 without
    skipping.

  - crypto code. It's common to find CPU-specific DES or AES functions.
    Take a look at John The Ripper. I don't know if it still exists,
    but there was an Alpha-optimized DES function which was something
    like 5 times faster than the generic C one. It changes a lot of
    things when you have 1 day to check your users' passwords.

I also wrote a netfilter log analyzer which parses 300000 lines per
second on my 1.7 GHz notebook. That's 5600 cycles to read a full
line, look up the field names, extract the values, parse them (atoi,
aton), save them in a structure, apply a filter, insert the result
in a tree containing up to 12 million of them, and dump a report
of the counts by any criteria. That saved me a lot of time working
on log analysis. But to achieve such a speed, I had to optimize at
every level, including rewriting a faster atoi() equivalent, a
faster aton() equivalent (with no multiplies), and playing with
likely/unlikely a lot. The code slowly improved from about 75k
lines/s to 300k lines/s with no algorithmic change, just by way
of careful code placement and ordering.
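
(For the curious, the multiply-free trick is the classic x*10 =
(x << 3) + (x << 1); here is a tiny sketch of mine, not the actual
parser from that analyzer, of an atoi()-style helper built that way:)

/* parse an unsigned decimal number and advance the cursor past it */
static unsigned int fast_atou(const char **sp)
{
	const char *s = *sp;
	unsigned int v = 0;

	while (*s >= '0' && *s <= '9') {
		v = (v << 3) + (v << 1) + (unsigned int)(*s - '0');
		s++;
	}
	*sp = s;
	return v;
}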

In fact, you could say that micro-optimizations are not important
if you are doing them in a crappy environment where the fast path
is already wasted by a big dirty function. But when you have the
ability to master the whole environment, every single cycle counts
because there's almost no waste.

I find it essential not to be the first one bringing crap somewhere
and serving as an excuse for others not to care about their code.
If everyone cares, you can still produce very good software, and
that's what I care about.

Cheers,
Willy


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-12-04 17:50             ` Lennart Sorensen
@ 2007-12-05 10:31               ` Gilboa Davara
  0 siblings, 0 replies; 57+ messages in thread
From: Gilboa Davara @ 2007-12-05 10:31 UTC (permalink / raw)
  To: Lennart Sorensen; +Cc: LKML Linux Kernel


On Tue, 2007-12-04 at 12:50 -0500, Lennart Sorensen wrote:
> On Mon, Dec 03, 2007 at 02:35:31PM +0200, Gilboa Davara wrote:
> > Intel's newest dual 10GbE NIC can easily (?) throw ~14M packets per
> > second. (theoretical peak at 1514bytes/frame)
> > Granted, installing such a device on a single CPU/single core machine is
> > absurd - but even on an 8 core machine (2 x Xeon 53xx/54xx / AMD
> > Barcelona) it can still generate ~1M packets/s per core.
> 
> 10GbE can't do 14M packets per second if the packets are 1514 bytes.  At
> 10M packets per second you have less than 1000 bits per packet, which is
> far from 1514bytes.
> 
> 10Gbps gives you at most 1.25GBps, which at 1514 bytes per packet works
> out to 825627 packets per second.  You could reach ~14M packets per
> second with only the smallest packet size, which is rather unusual for
> high throughput traffic, since you waste almost all the bytes on
> overhead in that case.  But you do want to be able to handle at least a
> million or two packets per second to do 10GbE.

... I corrected my math in the second email. [1] 

Nevertheless, a VoIP network (e.g. G.729 and friends) can generate the
maximum number of frames allowed on 10GbE Ethernet, which is, AFAIR, just
below 15M -per- port (~29M on a dual-port card).

While I doubt that any non-NPU-based NIC can handle such a load, on
mixed networks we're already seeing well above 1M frames per port.

- Gilboa
[1] http://lkml.org/lkml/2007/12/3/69



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-12-04 22:20                       ` Diego Calleja
@ 2007-12-05 10:59                         ` Giacomo A. Catenazzi
  0 siblings, 0 replies; 57+ messages in thread
From: Giacomo A. Catenazzi @ 2007-12-05 10:59 UTC (permalink / raw)
  To: Diego Calleja; +Cc: jamagallon, linux-kernel

Diego Calleja wrote:
> On Tue, 4 Dec 2007 22:47:45 +0100, "J.A. Magallón" <jamagallon@ono.com> wrote:
> 
>> That is what I like of C++, with good placement of high level features
>> like const's and & (references) one can gain fine control over what
>> gets copied or not.
> 
> But...if there's some way Linux can get "language improvements", is with
> new C standards/gccextensions/etc. It'd be nice if people tried to add
> (useful) C extensions to gcc, instead of proposing some random language :)

But nobody knows such extensions.
I think that the core kernel will remain in C, because
there are no problems there and no improvement is possible
(with another language).

But the driver side has more problems. There is a lot
of copy-paste, quality is often not high, not all developers
know the Linux kernel well, and drivers are not always maintained
against new or better internal APIs. So if we found a good template
or a good language to help *some* drivers without
causing a lot of problems for the rest of the community, it would
be nice.

I don't think that we have it written in stone that kernel
drivers should be written only in C, but at present there is
no good alternative.

But I think it is a huge task to find a language, prototype
an API and convert some test drivers,
and there is no guarantee of a good result.

ciao
	cate

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Micro vs macro optimizations (was: Re: Kernel Development & Objective-C)
  2007-12-04 22:43                   ` Willy Tarreau
@ 2007-12-05 17:05                     ` Avi Kivity
  0 siblings, 0 replies; 57+ messages in thread
From: Avi Kivity @ 2007-12-05 17:05 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Andi Kleen, Kyle Moffett, Lennart Sorensen, Ben Crowhurst, linux-kernel

Willy Tarreau wrote:
> Hi Avi,
>
> On Tue, Dec 04, 2007 at 11:07:05PM +0200, Avi Kivity wrote:
>   
>> Willy Tarreau wrote:
>>     
>>>>>>   
>>>>>>        
>>>>>>             
>>>>> With 10Gbit/s ethernet working you start to care about every cycle.
>>>>>
>>>>>      
>>>>>           
>>>> If you have 10M packets/sec no amount of cycle-saving will help you.  
>>>> You need high level optimizations like TSO.  I'm not saying we should 
>>>> sacrifice cycles like there's no tomorrow, but the big wins are elsewhere.
>>>>    
>>>>         
>>> Huh? At 4 GHz, you have 400 cycles to process each packet. If you need to
>>> route those packets, those cycles may just be what you need to lookup a
>>> forwarding table and perform a few MMIO on an accelerated chip which will
>>> take care of the transfer. But you need those cycles. If you start to waste
>>> them 30 by 30, the performance can drop by a critical factor.
>>>
>>>  
>>>       
>> I really doubt Linux spends 400 cycles routing a packet.  Look what an 
>> skbuff looks like.
>>     
>
> That's not what I wrote. I just wrote about doing forwarding table lookup
> and MMIO so that dedicated hardware NICs can process the recv/send to the
> correct ends. If you just need to scan a list of DMAed packets, look at
> their destination IP address, lookup that IP in a table to find the output
> NIC and destination MAC address, link them into an output list and waking
> the output NIC up, there's nothing which requires more than 400 cycles
> here. I never said that it was a requirement to pass through the existing
> network stack.
>   

If you're writing a single-purpose program then there is justification 
for micro-optimizing it to death.  Write it in VHDL, even.  But that 
description doesn't fit the kernel.

>   
>> A flood ping to localhost on a 2GHz system takes 8 microseconds, that's 
>> 16,000 cycles.  Sure it involves userspace, but you're about two orders 
>> of magnitude off.
>>     
>
> I don't see where you see a userspace (or I don't understand your test).
>   

ping -f -q localhost; the ping client is in userspace.

> On traffic generation I often do from user space, I can send 630 k raw
> ethernet packets per second from userspace on a 1.8 GHz opteron and PCI-e
> NICs. That's 2857 cycles per packet, including the (small amount of)
> userspace work. That's quite cheap.
>
>   

Yes, it is.

>> This happens very often in HPC, and when it does, it is often worthwhile 
>> to invest in manual optimizations or even assembly coding.  
>> Unfortunately it is very rare in the kernel (memcmp, raid xor, what 
>> else?).  Loops with high iteration counts are very rare, so any 
>> attention you give to the loop body is not amortized over a large number 
>> of executions.
>>     
>
> Well, in my example above, everythin in the path of the send() syscall down
> to the bare metal NIC is under high pressure in a fast loop. 30 cycles
> already represent 1% of the performance! In fact, to modulate speed, I
> use a busy loop with a volatile int and small values.
>
>   

Having an interface to send multiple packets in one syscall would cut 
way more than 30 cycles.

>>>>    
>>>>         
>>> You are forgetting something very important : once you start stacking
>>> functions to perform the dirty work for you, you end up with so much
>>> abstraction that even new stupid code cannot be written at all without
>>> relying on them, and it's where the problem takes its roots, because
>>> when you need to write a fast function and you notice that you cannot
>>> touch a variable without passing through a slow pinhole, your fast
>>> function will remain slow whatever you do, and the worst of all is that
>>> you will think that it is normally fast and that it cannot be written
>>> faster.
>>>
>>>  
>>>       
>> I don't understand.  Can you give an example?
>>     
>
> Yes, the most common examples found today involve applications reading
> data from databases. For instance, let's say that one function in your
> program must count the number of unique people with the name starting
> with an "A". It is very common to see "low-level" primitives to abstract
> the database for portability purposes. One of such primitives will
> generally be consist in retrieving a list of people with their names,
> age and sex in one well-formated 3-column array. Many lazy people will
> not see any problem in calling this one from the function described
> above. Basically, what they would do is :
>
>  count_people_with_name_starting_with_a()
>     -> array[name,age,sex] = get_list_of_people()
>          -> while read_one_people_entry() {
>                alloc(one_line_of_3_columns)
>                read then parse the 3 fields
>                format_them_appropriately
>             }
>     -> create a new array "name2" by duplicating the "name" column
>     -> name3 = sort_unique(name2)
>     -> name4 = name3.grep("^A")
>     -> return name4.count
>
> Don't laugh, I've recently read such a horrible thing. It was done
> that way just because it was easier. Without the abstraction layer,
> the coder would have been forced to access the base anyway and would
> have seen the added value in just counting from the inner while
> loop, saving lots of copies, greps, sort, etc... :
>
>  count_people_with_name_starting_with_a() {
>       count = 0;
>       while read_one_people_entry() {
>          read the 3 fields into a statically-allocated buffer
>          if (name[0] == 'A') count++;
>       }
>       return count;
>  }
>
> I'm not saying that the above was not possible, just that it's
> 1000% easier to do the former without even having to think that
> the final code uses such horrible things. 

Your optimized version is wrong.  It counts duplicated names, while you 
stated you needed unique names.  Otherwise the sort_unique step is 
completely redundant.
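
A flat version that actually honours the uniqueness requirement still
has to remember which names it has already seen, for instance by
collecting the matches, sorting them and counting distinct entries. A
self-contained sketch, with the database replaced by a static array of
invented rows:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the database rows; the data is made up. */
static const char *names[] = { "Alice", "Anna", "Bob", "Alice", "Albert" };
#define NROWS (sizeof(names) / sizeof(names[0]))

static int cmpstr(const void *a, const void *b)
{
    return strcmp(*(const char * const *)a, *(const char * const *)b);
}

/* Count distinct names starting with 'A': collect the matches,
 * sort them, then count entries that differ from their predecessor. */
static int count_unique_a(void)
{
    const char *match[NROWS];
    int i, n = 0, count = 0;

    for (i = 0; i < (int)NROWS; i++)
        if (names[i][0] == 'A')
            match[n++] = names[i];

    qsort(match, n, sizeof(match[0]), cmpstr);

    for (i = 0; i < n; i++)
        if (i == 0 || strcmp(match[i], match[i - 1]) != 0)
            count++;
    return count;
}

int main(void)
{
    printf("%d\n", count_unique_a());   /* prints 3: Albert, Alice, Anna */
    return 0;
}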

Databases are good examples of where the abstraction helps.  If you had 
hundreds of millions of records in your example, you'd connect to a 
database, present it with an ASCII string describing what you want, upon 
which it would parse it, compile it into an internal language against 
the schema, optimize that and then execute it.  Despite all that 
abstraction it would win against your example because it would implement 
the inner loop as

    open index (by name)
    seek to 'A'
        while (current starts with 'A')
                ++count (taking care of the uniqueness requirement if 
needed)
    close index

Thus it would never see people whose name begins with 'W'.  If the 
database had a materialized view feature, and this particular query was 
deemed important enough, it would optimize it to

    open materialized view
    read count
    close materialized view

The database does all this while allowing concurrent reads and writes 
and keeping your data in case someone trips on the power cord.  You 
can't do that without a zillion layers of abstraction.


>> There are two cases where abstraction hurts performance: the first is 
>> where the mechanisms used to achieve the abstraction (functions instead 
>> of direct access to variables, function pointers instead of duplicating 
>> the caller) introduce performance overhead.  I don't think C has any 
>> advantage here -- actually a disadvantage as it lacks templates and is 
>> forced to use function pointers for nontrivial cases.  Usually the 
>> abstraction penalty is nil with modern compilers.
>>
>> The second case is where too much abstraction clouds the programmer's 
>> mind.  But this is independent of the programming language.
>>     
>
> Agreed. But most often, the abstraction prevents the user from accessing
> some information directly and that becomes nasty. I remember when I was
> a teen, I wrote a program designed to inventory what you had in your PC,
> and run a few performance tests. It ran in one of those semi-graphical DOS modes
> where you use graphics characters to draw boxes. I initially wrote all
> the windowing code myself and it ran perfectly. I once decided to rewrite
> it using TurboVision, the windowing framework from Borland (it was written
> in TurboPascal). I made intensive use of the equivalent of a putchar()
> function to write text in a window. You cannot imagine my pain when I
> ran it on my old 8088, it wrote at the speed of a 1200 bps terminal. I
> then tried to find how to write faster, even by accessing the window
> buffer directly. I couldn't. I had to reverse-engineer the internal
> structures by debugging memory contents in order to find the pointers
> to the window buffer to write to them directly. After this disastrous
> experience with abstraction, I thought "never that crap again".
>
>   

If the abstraction is badly written, and further you cannot change it, 
then of course it hurts.  But if the abstraction is well written, or if 
it can be fixed, then all is well.  The problem here is not that 
abstractions exist, but that you persist in using a broken API instead 
of fixing it.


>>> Every cycle burned is definitely lost. The time cannot go backwards. So
>>> for each cycle that you lose to laziness, you have to become more and more
>>> clever to find out how to write an alternative. Lazy people simply put
>>> caches everywhere and after that they find normal that "hello world" 
>>> requires
>>> 2 Gigs of RAM to be displayed. 
>>>       
>> A 100 byte program will print "hello world" on a UART and stop.  A 
>> modern program will load a vector description of a font, scale it to the 
>> desired size, render it using anti aliasing and sub-pixel positioning, 
>> lay it out according to the language rules of wherever you live, and 
>> place it on a multi-megabyte frame buffer.  Yes it needs hundreds of 
>> megabytes and lots of nasty algorithms to do that.
>>     
>
> What I'm complaining about is that when you don't want those fancy things,
> you still have them to justify the hundreds of megs. And even if you manage
> to print to stdout, you still have a huge runtime just in case you'd like
> to use the fancy features.
>
>   

That's life.  The fact is that users demand features, and programmers 
cater to them.  If you can find a way to provide all those features 
without the bloat, more power to you.  The abstractions here are not the 
cause of the bloat, they are the tool used to provide the features while 
keeping a reasonable level of maintainability and reliability.


>>> The only true solution is to create better
>>> algorithms, but you will find even less people capable of creating 
>>> efficient
>>> algorithms than you will find capable of coding correctly.
>>>       
>> That is true, that is why we see a lot more microoptimizations than 
>> algorithmic progress.
>>     
>
> Also, algorithmic research is not very rewarding. You can work for
> months or years thinking you have found the right algo for the job, then
> finally discover a limitation you did not expect and throw all that
> work in the bin in a few minutes.
>   

You don't need to prove that P == NP to improve things.  Most 
improvements are in adding new APIs and data structures to keep the 
inner loops working on more data.  And of course scalability work to 
keep data local to a processing core.

It won't win you a Nobel prize, but you'll be able to measure a few 
percent improvement on a real-life workload instead of 10 cycles on a 
microbenchmark.

>   
>> But if you want a fast streaming filesystem you choose XFS over ext3, 
>> even though the latter is much smaller and easier to optimize.  If you 
>> write a network server you choose epoll() instead of trying to optimize 
>> select() somehow.
>>     
>
> That's interesting that you cite epoll() vs select(). I measured the
> break-even point around 1000 FDs. Below, select() is faster. Above,
> epoll() is faster. On a small number of entries (fewer than 100), a select-
> based proxy can be 20-30% faster than the same one running on epoll(),
> because select(), while dumber, is cheaper to set up.
>   

[IIRC epoll() setup is done outside the loop, just once]

The small proxy probably doesn't have a performance problem, while 10K 
connection servers do.
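
Roughly the difference in question, as a sketch with all error handling
and actual I/O left out: the select() side rebuilds and rescans its
fd_set on every iteration, while the epoll side pays epoll_ctl() once
per fd up front and is then only handed the ready ones.

#include <sys/select.h>
#include <sys/epoll.h>

void select_loop(int *fds, int nfds)        /* O(nfds) work per wakeup */
{
    for (;;) {
        fd_set rset;
        int i, maxfd = -1;

        FD_ZERO(&rset);
        for (i = 0; i < nfds; i++) {        /* rebuild the set every time */
            FD_SET(fds[i], &rset);
            if (fds[i] > maxfd)
                maxfd = fds[i];
        }
        select(maxfd + 1, &rset, NULL, NULL, NULL);
        for (i = 0; i < nfds; i++)          /* rescan to find the ready ones */
            if (FD_ISSET(fds[i], &rset)) {
                /* handle fds[i] */
            }
    }
}

void epoll_loop(int *fds, int nfds)         /* setup paid once, up front */
{
    struct epoll_event ev, events[64];
    int epfd = epoll_create(nfds);
    int i, n;

    for (i = 0; i < nfds; i++) {            /* outside the main loop */
        ev.events = EPOLLIN;
        ev.data.fd = fds[i];
        epoll_ctl(epfd, EPOLL_CTL_ADD, fds[i], &ev);
    }
    for (;;) {
        n = epoll_wait(epfd, events, 64, -1);
        for (i = 0; i < n; i++) {
            /* handle events[i].data.fd */
        }
    }
}
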
>> Unless you are I/O bound, which is usually the case when you have 2GHz 
>> cpus driving 200Hz disks.
>>     
>
> That's true when you seek a lot. When you manage to mostly perform sequential
> reads (such as what you do when processing large files such as logs), you can
> easily achieve 80 MB/s, which is 20000 pages/s, or 100 times faster.
>
>   

Right, and this was achieved by having very good batching in the bio layer.

>> Can you give an example of a 2- or 3- fold factor on an end-user 
>> workload achieved by microopts?
>>     
>
> Oh there are many primitives which are generally optimized in assembly for
> this reason. What randomly comes to my mind :
>   - graphics libraries. Saving 1 cycle per pixel in a rectangle drawing
>     primitive can have a significant impact on animated graphics, for
>     instance.
>
>   - video/audio and generally multimedia code. I remember a specially
>     written version of mpg123 about 10 years ago, which was optimized
>     for i486 and which was the only one able to run on a 486 without
>     skipping.
>
>   - crypto code. It's common to find CPU-specific DES or AES functions.
>     Take a look at John The Ripper. I don't know if it still exists,
>     but there was an Alpha-optimized DES function which was something
>     like 5 times faster than the generic C one. It changes a lot of
>     things when you have 1 day to check your users' passwords.
>   

These are indeed cases where the inner loop is executed millions times 
per second.  Of course it is perfectly reasonable to assembly code these.

I'm talking about regular C code.  Most C code is decision taking and 
pointer chasing, which is why traditional microopts don't help much.


> I also wrote a netfilter log analyzer which parses 300000 lines per
> second on my 1.7 GHz notebook. That's 5600 cycles to read a full
> line, lookup the field names, extract the values, parse them (atoi,
> aton), save them in a structure, apply a filter, insert the result
> in a tree containing up to 12 million of them, and dump a report
> of the counts by any criterion. That saved me a lot of time working
> on log analysis. But to achieve such a speed, I had to optimize at
> every level, including rewriting a faster atoi() equivalent, a
> faster aton() equivalent (with no multiplies), and playing with
> likely/unlikely a lot. The code slowly improved from about 75k
> lines/s to 300k lines/s with no algorithmic change. Just by the
> way of careful code placement and ordering.
>   

Curious:  wasn't the time dominated by the tree code? 12M nodes is 24 
levels, and probably unpredictable to the processor unless the data is 
very regular.
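
As an aside, the usual multiply-free trick for an atoi()-style parser is
to replace the *10 with shifts and adds. A sketch, not necessarily what
your code does:

/* Parse an unsigned decimal field without any multiply:
 * v * 10 == (v << 3) + (v << 1).  Stops at the first non-digit
 * and returns a pointer just past the parsed number. */
const char *parse_uint(const char *s, unsigned int *out)
{
    unsigned int v = 0;

    while (*s >= '0' && *s <= '9') {
        v = (v << 3) + (v << 1) + (unsigned int)(*s - '0');
        s++;
    }
    *out = v;
    return s;
}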

> In fact, you could say that micro-optimizations are not important
> if you are doing them in a crappy environment where the fast path
> is already wasted by a big dirty function. But when you have the
> ability to control the whole environment, every single cycle counts
> because there's almost no waste.
>
>   

That only works if the environment is very small.  A large scale project 
needs abstractions, otherwise you spend all your time re-learning all 
the details.

> I find it essential not to be the first one bringing crap somewhere
> and serving as an excuse for others not to care about their code.
> If everyone cares, you can still produce very good software, and
> that's what I care about.

We just disagree about the methods.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Kernel Development & Objective-C
  2007-11-30 11:16   ` Ben Crowhurst
  2007-11-30 11:36     ` Karol Swietlicki
  2007-11-30 14:37     ` Lennart Sorensen
@ 2007-12-08  8:54     ` Rogelio M. Serrano Jr.
  2 siblings, 0 replies; 57+ messages in thread
From: Rogelio M. Serrano Jr. @ 2007-12-08  8:54 UTC (permalink / raw)
  To: Linux Kernel Mailing List


[-- Attachment #1.1: Type: text/plain, Size: 1864 bytes --]

Ben Crowhurst wrote:
> Loïc Grenié wrote:
>> 2007/11/29, Ben Crowhurst <Ben.Crowhurst@stellatravel.co.uk>:
>>  
>>> Has Objective-C ever been considered for kernel development?
>>>
>>> regards,
>>> BPC
>>>     
>>
I have tried it in a toy kernel, OSKit style. The code reuse is very
high, especially with string ops and driver interfaces. It's also very easy
to do unit testing with. My main problem was the quality of the compiler
optimization. It's just not good enough. I think if the compiler can do
the right kind of optimizations correctly, then a low-overhead OO
language like Objective-C can be used in a kernel.

On the other hand, it's the automated testing part that really matters to
me. Imagine adding features to Linux week after week without ever
getting a serious panic or two, and then getting a big performance boost
whenever the compiler does more and more optimizations correctly.

>>    No, it has not. Any language that looks remotely like an OO language
>>   has not ever been considered for (Linux) kernel development and for
>>   most, if not all, other operating systems kernels.
>>
>>     Various problems occur in an object oriented language. One of them
>>   is garbage collection: it provokes asynchronous delays and, during
>>   an interrupt or a system call for a real time task, the kernel cannot
>>   wait. 
> Objective C 1.0 does not force nor have garbage collection.
>
True.

>> Another is memory overhead: all the magic that OO languages
>>   provide take space in memory and Linux kernel is used in embedded
>>   systems with very tight memory requirements.
>>   
> But are embedded systems not rapidly moving on? Turning to stare at
> the ADSL X6 modem with MBs of RAM.

Its all about optimizations.

-- 
Democracy is about two wolves and a sheep deciding what to eat for dinner.


[-- Attachment #1.2: rogelio.vcf --]
[-- Type: text/x-vcard, Size: 333 bytes --]

begin:vcard
fn:Rogelio M. Serrano Jr
n:M. Serrano Jr;Rogelio
org:SMSG Communications Philippines;Technical Department
adr:;;;;;;Republic of the Philippines
email;internet:rogelio@smsglobal.net
title:Programmer
tel;work:+6327534145
tel;home:+6329527026
tel;cell:+639209202267
x-mozilla-html:FALSE
version:2.1
end:vcard


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 252 bytes --]

^ permalink raw reply	[flat|nested] 57+ messages in thread

end of thread, other threads:[~2007-12-08  8:54 UTC | newest]

Thread overview: 57+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-11-29 12:14 Kernel Development & Objective-C Ben Crowhurst
2007-11-30 10:02 ` Xavier Bestel
2007-11-30 10:09   ` KOSAKI Motohiro
2007-11-30 10:20     ` Xavier Bestel
2007-11-30 10:54       ` Jan Engelhardt
2007-11-30 14:21         ` David Newall
2007-11-30 23:31           ` Bill Davidsen
2007-11-30 23:40             ` Alan Cox
2007-12-01  0:05               ` Arnaldo Carvalho de Melo
2007-12-01 18:27               ` Bill Davidsen
2007-12-01 18:18                 ` Alan Cox
2007-12-03  1:23                   ` Bill Davidsen
2007-11-30 22:52     ` J.A. Magallón
2007-11-30 10:29 ` Loïc Grenié
2007-11-30 11:16   ` Ben Crowhurst
2007-11-30 11:36     ` Karol Swietlicki
2007-11-30 14:37     ` Lennart Sorensen
2007-12-08  8:54     ` Rogelio M. Serrano Jr.
2007-11-30 23:19   ` J.A. Magallón
2007-11-30 23:53     ` Nicholas Miell
2007-12-01  0:31     ` Al Viro
2007-12-01  0:34       ` Al Viro
2007-12-01  1:09       ` J.A. Magallón
2007-12-01 19:55       ` Avi Kivity
2007-12-04 17:54     ` Lennart Sorensen
2007-12-04 21:10       ` Avi Kivity
2007-12-04 21:24       ` J.A. Magallón
2007-11-30 11:37 ` Matti Aarnio
2007-11-30 14:34 ` Lennart Sorensen
2007-11-30 15:26   ` Kyle Moffett
2007-11-30 18:40     ` H. Peter Anvin
2007-11-30 19:35       ` Kyle Moffett
2007-12-01 20:03     ` Avi Kivity
2007-12-02 19:01       ` Andi Kleen
2007-12-03  5:12         ` Avi Kivity
2007-12-03  9:50           ` Andi Kleen
2007-12-03 11:46             ` Avi Kivity
2007-12-03 11:50               ` Andi Kleen
2007-12-03 21:13               ` Willy Tarreau
2007-12-03 21:39                 ` J.A. Magallón
2007-12-03 21:57                   ` Alan Cox
2007-12-04 21:47                     ` J.A. Magallón
2007-12-04 22:20                       ` Diego Calleja
2007-12-05 10:59                         ` Giacomo A. Catenazzi
2007-12-04 21:07                 ` Avi Kivity
2007-12-04 22:43                   ` Willy Tarreau
2007-12-05 17:05                     ` Micro vs macro optimizations (was: Re: Kernel Development & Objective-C) Avi Kivity
2007-12-03 12:35           ` Kernel Development & Objective-C Gilboa Davara
2007-12-03 12:44             ` Gilboa Davara
2007-12-03 16:28             ` Casey Schaufler
2007-12-04 17:50             ` Lennart Sorensen
2007-12-05 10:31               ` Gilboa Davara
2007-12-01 19:59   ` Avi Kivity
2007-12-02 19:44     ` Jörn Engel
2007-12-03 16:53     ` Lennart Sorensen
2007-11-30 15:00 ` Chris Snook
2007-12-01  9:50   ` David Newall

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).