* Developing multi-threading applications
From: Roberto Fichera @ 2002-06-13 8:13 UTC
To: linux-kernel

Hi All,

I'm designing a multithreaded application with many threads, from ~100 to 300/400. I need to decide which threading library to use, and which kernel patches I need to improve scheduler performance. The machines will be SMP Xeons with 4/8 processors and 4 GB of RAM. Almost all threads are computationally intensive, and the library needs fast interprocess communication and synchronization, because many of the sync and async threads are time-dependent and/or time-critical. In the future I plan to distribute all the threads across a pool of SMP boxes.

Thanks in advance.

Roberto Fichera.

^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Developing multi-threading applications
From: David Schwartz @ 2002-06-13 8:26 UTC
To: kernel, linux-kernel

On Thu, 13 Jun 2002 10:13:35 +0200, Roberto Fichera wrote:

>I'm designing a multithreaded application with many threads,
>from ~100 to 300/400. I need to decide which threading library
>to use, and which kernel patches I need to improve scheduler
>performance. The machines will be SMP Xeons with 4/8 processors
>and 4 GB of RAM. Almost all threads are computationally intensive,
>and the library needs fast interprocess communication and
>synchronization, because many of the sync and async threads are
>time-dependent and/or time-critical. In the future I plan to
>distribute all the threads across a pool of SMP boxes.

With 4/8 processors, you don't want to create 100-400 threads doing computation-intensive tasks. So redesign things so that the number of threads you create is more in line with the number of CPUs you have available. That is, use a 'thread per CPU' approach (or slightly more threads than there are CPUs per node) and you'll perform a lot better. Distribute the available work over the available threads.

DS
* Re: Developing multi-threading applications
From: Roberto Fichera @ 2002-06-13 9:08 UTC
To: David Schwartz
Cc: linux-kernel

At 01.26 13/06/02 -0700, you wrote:

>On Thu, 13 Jun 2002 10:13:35 +0200, Roberto Fichera wrote:
>
> >I'm designing a multithreaded application with many threads,
> >from ~100 to 300/400. I need to decide which threading library
> >to use, and which kernel patches I need to improve scheduler
> >performance. The machines will be SMP Xeons with 4/8 processors
> >and 4 GB of RAM. Almost all threads are computationally intensive,
> >and the library needs fast interprocess communication and
> >synchronization, because many of the sync and async threads are
> >time-dependent and/or time-critical. In the future I plan to
> >distribute all the threads across a pool of SMP boxes.
>
>With 4/8 processors, you don't want to create 100-400 threads doing
>computation-intensive tasks. So redesign things so that the number of threads
>you create is more in line with the number of CPUs you have available. That
>is, use a 'thread per CPU' approach (or slightly more threads than there are
>CPUs per node) and you'll perform a lot better. Distribute the available work
>over the available threads.

You are right! But "computationally intensive" is not quite right as I put it ;-), because most of the threads are waiting for I/O. After the I/O completes, the computationally intensive tasks run; when a thread finishes its work, all its results are sent to its parent thread. The parent collects the results of all its children, performs some computation of its own, and sends its result to its own parent, and so on, with many parent threads controlling other children. So I think the main problem/overhead is thread creation and the number of threads.

>DS

Roberto Fichera.
* Re: Developing multi-threading applications 2002-06-13 9:08 ` Roberto Fichera @ 2002-06-13 9:44 ` Peter Wächtler 2002-06-13 9:52 ` Roberto Fichera 2002-06-13 10:13 ` David Schwartz 1 sibling, 1 reply; 19+ messages in thread From: Peter Wächtler @ 2002-06-13 9:44 UTC (permalink / raw) To: Roberto Fichera; +Cc: David Schwartz, linux-kernel Roberto Fichera wrote: > At 01.26 13/06/02 -0700, you wrote: > >> On Thu, 13 Jun 2002 10:13:35 +0200, Roberto Fichera wrote: >> >> >I'm designing a multithreding application with many threads, >> >from ~100 to 300/400. I need to take some decisions about >> >which threading library use, and which patch I need for the >> >kernel to improve the scheduler performances. The machines >> >will be a SMP Xeon with 4/8 processors with 4Gb RAM. >> >All threads are almost computational intensive and the library >> >need a fast interprocess comunication and syncronization >> >because there are many sync & async threads time >> >dependent and/or critical. I'm planning, in the future, to distribuite >> >all the threads in a pool of SMP box. >> >> With 4/8 processors, you don't want to create 100-400 threads >> doing >> computation intensive tasks. So redesign things so that the number of >> threads >> you create is more in line with the number of CPUs you have available. >> That >> is, use a 'thread per CPU' (or slightly more threads than their are >> CPUs per >> node) approach and you'll perform a lot better. Distribute the >> available work >> over the available threads. > > > You are right! But "computational intensive" is not totaly right as I > say ;-), > because most of thread are waiting for I/O, after I/O are performed the > computational intensive tasks, finished its work all the result are sent > to thread-father, the father collect all the child's result and perform > some > computational work and send its result to its father and so on with many > thread-father controlling other child. 
So I think the main problem/overhead > is thread creation and the thread's numbers. > Have a look at http://www-124.ibm.com/developerworks/opensource/pthreads/ they provide M:N threading model where threads can live in userspace. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Developing multi-threading applications 2002-06-13 9:44 ` Peter Wächtler @ 2002-06-13 9:52 ` Roberto Fichera 2002-06-13 10:16 ` Peter Wächtler 0 siblings, 1 reply; 19+ messages in thread From: Roberto Fichera @ 2002-06-13 9:52 UTC (permalink / raw) To: Peter Wächtler; +Cc: linux-kernel At 11.44 13/06/02 +0200, Peter Wächtler wrote: >>You are right! But "computational intensive" is not totaly right as I say >>;-), >>because most of thread are waiting for I/O, after I/O are performed the >>computational intensive tasks, finished its work all the result are sent >>to thread-father, the father collect all the child's result and perform some >>computational work and send its result to its father and so on with many >>thread-father controlling other child. So I think the main problem/overhead >>is thread creation and the thread's numbers. > >Have a look at http://www-124.ibm.com/developerworks/opensource/pthreads/ > >they provide M:N threading model where threads can live in userspace. Yes! I'm looking for it. But I want evaluate some other before. Roberto Fichera. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Developing multi-threading applications 2002-06-13 9:52 ` Roberto Fichera @ 2002-06-13 10:16 ` Peter Wächtler 2002-06-13 10:42 ` Roberto Fichera 0 siblings, 1 reply; 19+ messages in thread From: Peter Wächtler @ 2002-06-13 10:16 UTC (permalink / raw) To: Roberto Fichera; +Cc: linux-kernel Roberto Fichera wrote: > At 11.44 13/06/02 +0200, Peter Wächtler wrote: > >>> You are right! But "computational intensive" is not totaly right as I >>> say ;-), >>> because most of thread are waiting for I/O, after I/O are performed the >>> computational intensive tasks, finished its work all the result are sent >>> to thread-father, the father collect all the child's result and >>> perform some >>> computational work and send its result to its father and so on with many >>> thread-father controlling other child. So I think the main >>> problem/overhead >>> is thread creation and the thread's numbers. >> >> >> Have a look at http://www-124.ibm.com/developerworks/opensource/pthreads/ >> >> they provide M:N threading model where threads can live in userspace. > > > Yes! I'm looking for it. But I want evaluate some other before. > There is a paper rse-pmt.ps included in the tar archives from Ralf Engelschall (author of GNU portable threads). There you will find lots of interesting pointers to other thread packages. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Developing multi-threading applications
From: Roberto Fichera @ 2002-06-13 10:42 UTC
To: Peter Wächtler
Cc: linux-kernel

At 12.16 13/06/02 +0200, Peter Wächtler wrote:

>Roberto Fichera wrote:
>>At 11.44 13/06/02 +0200, Peter Wächtler wrote:
>>
>>>>You are right! But "computationally intensive" is not quite right as I
>>>>put it ;-), because most of the threads are waiting for I/O. After the
>>>>I/O completes, the computationally intensive tasks run; when a thread
>>>>finishes its work, all its results are sent to its parent thread. The
>>>>parent collects the results of all its children, performs some
>>>>computation of its own, and sends its result to its own parent, and so
>>>>on, with many parent threads controlling other children. So I think the
>>>>main problem/overhead is thread creation and the number of threads.
>>>
>>>Have a look at http://www-124.ibm.com/developerworks/opensource/pthreads/
>>>
>>>they provide M:N threading model where threads can live in userspace.
>>
>>Yes! I'm looking at it. But I want to evaluate some others first.

And I don't want to use a library that's entirely in userspace.

>There is a paper rse-pmt.ps included in the tar archives from Ralf Engelschall
>(author of GNU portable threads).
>
>There you will find lots of interesting pointers to other thread packages.

I'll take a look. Thanks!

Roberto Fichera.
* Re: Developing multi-threading applications
From: David Schwartz @ 2002-06-13 10:13 UTC
To: kernel
Cc: linux-kernel

On Thu, 13 Jun 2002 11:08:27 +0200, Roberto Fichera wrote:

>You are right! But "computationally intensive" is not quite right as I put
>it ;-),

It's really not fair to change the premises in the middle of an argument.

>because most of the threads are waiting for I/O.

Still wrong. You don't tie up threads waiting for I/O. You can wait without having a thread doing the waiting.

>After the I/O completes, the computationally intensive tasks run; when a
>thread finishes its work, all its results are sent to its parent thread.

Okay, so you need a new abstraction -- separate the waiting from the working. Create as many threads to do the work as you have processors to do the work on. As for the waiting, minimize the number of waiting threads; they're pure overhead. If it's sockets, use 'poll' so one thread can do lots of waiting.

>The parent collects the results of all its children, performs some
>computation of its own, and sends its result to its own parent, and so on,
>with many parent threads controlling other children. So I think the main
>problem/overhead is thread creation and the number of threads.

So get rid of the problem! Don't create so many threads. Create only as many threads as can do useful work, and reuse them rather than destroying and recreating them. Solve the actual problem/overhead, since it's totally artificial and due to your model rather than your problem!

DS
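A minimal sketch of the "one thread can do lots of waiting" idea from David's reply, assuming the I/O sources are ordinary file descriptors (sockets, pipes) that poll() can watch. The names wait_and_dispatch, ready_fn, drain_one and MAX_SOURCES are hypothetical and do not come from the thread.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

#define MAX_SOURCES 64          /* arbitrary limit chosen for this sketch */

/* Hypothetical callback type: called once per descriptor that has data. */
typedef void (*ready_fn)(int fd, void *arg);

/*
 * Wait up to timeout_ms for any of the nfds descriptors to become readable,
 * then call on_ready for each readable one. Returns the number of
 * descriptors dispatched, 0 on timeout, or -1 if poll() failed.
 */
int wait_and_dispatch(const int *fds, size_t nfds, int timeout_ms,
                      ready_fn on_ready, void *arg)
{
    struct pollfd pfds[MAX_SOURCES];
    int dispatched = 0;

    assert(nfds <= MAX_SOURCES);
    for (size_t i = 0; i < nfds; i++) {
        pfds[i].fd = fds[i];
        pfds[i].events = POLLIN;    /* only readability matters here */
        pfds[i].revents = 0;
    }

    int n = poll(pfds, (nfds_t)nfds, timeout_ms);
    if (n <= 0)
        return n;                   /* timeout (0) or error (-1) */

    for (size_t i = 0; i < nfds; i++) {
        if (pfds[i].revents & POLLIN) {
            on_ready(pfds[i].fd, arg);
            dispatched++;
        }
    }
    return dispatched;
}

/* Example callback: consume one byte and count the event in *(int *)arg. */
void drain_one(int fd, void *arg)
{
    char byte;
    if (read(fd, &byte, 1) == 1)
        ++*(int *)arg;
}
```

In a real design the callback would presumably hand the ready descriptor to a worker thread instead of reading it inline; the point is only that a single blocked thread covers every source.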
* Re: Developing multi-threading applications 2002-06-13 10:13 ` David Schwartz @ 2002-06-13 11:21 ` Roberto Fichera 2002-06-13 11:58 ` David Schwartz 0 siblings, 1 reply; 19+ messages in thread From: Roberto Fichera @ 2002-06-13 11:21 UTC (permalink / raw) To: David Schwartz; +Cc: linux-kernel At 03.13 13/06/02 -0700, you wrote: >On Thu, 13 Jun 2002 11:08:27 +0200, Roberto Fichera wrote: > >You are right! But "computational intensive" is not totaly right as I say ;- > >), > > It's really not fair to change the premises in the middle of an > argument. Sorry ;-)! > >because most of thread are waiting for I/O, > > Still wrong. You don't tie up threads waiting for I/O. You can > wait without >having a thread doing the waiting. > > >after I/O are performed the > >computational intensive tasks, finished its work all the result are sent > >to thread-father, > > Okay, so you need a new abstraction -- separate the waiting from the >working. Create as many threads to do the work as you have processors to do >the work on. As for the waiting, minimize threads waiting, they're pure >overhead. If it's sockets, use 'poll' so one thread can do lots of waiting. This's a possible solution. > >the father collect all the child's result and perform some > >computational work and send its result to its father and so on with many > >thread-father controlling other child. So I think the main problem/overhead > >is thread creation and the thread's numbers. > > So get rid of the problem! Don't create so many threads, create > only as many >threads as can do useful work and reuse them rather than destroying and >recreating them. Solve the actual problem/overhead since it's totally >artificial and due to your model rather than your problem! Depending by the applications. With my simulation/emulation program I need to create many thread because each thread resolve/manage/compute a specific problem and it's live depend by some factors. 
Each thread is create only if needed to avoid the overhead. The simulation/emulation is a "merge" of many and many object, each object work to resolve/manage/compute a specific problem. All the low objects are grouped to resolve a specific problem and are managed by a thread controller that should take some decision or doing some work. Some thread controller are grouped and managed by another thread controller and so on. Do not think that I need always 400 threads active they are create only if need by the controller. You must thinks this simulation/emulation as collection of many and many object that should interoperate, and the model is designed to scale easily on a distribuite environment. > DS Roberto Fichera. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Developing multi-threading applications
From: David Schwartz @ 2002-06-13 11:58 UTC
To: kernel
Cc: linux-kernel

>That depends on the application. In my simulation/emulation program I need
>to create many threads, because each thread resolves/manages/computes a
>specific problem, and its lifetime depends on several factors. Each thread
>is created only when needed, to avoid the overhead. The simulation/emulation
>is a "merge" of a great many objects, each of which works to
>resolve/manage/compute a specific problem. All the low-level objects are
>grouped to solve a specific problem and are managed by a thread controller,
>which has to make some decisions or do some work. Some thread controllers
>are in turn grouped and managed by another thread controller, and so on.
>Don't assume that I always need 400 active threads; they are created only
>when the controller needs them. You should think of this
>simulation/emulation as a collection of a great many objects that have to
>interoperate, and the model is designed to scale easily to a distributed
>environment.

If it's a simulation, you don't *really* need the threads, you just need to be able to act as if you had them. After all, what are you simulating if what work gets done when is up to the random vagaries of the OS scheduler?

If it's a real application wanting real performance, the suggestions I made stand -- you don't want many more threads working than you have CPUs and you don't want a lot of threads sitting around waiting for work (and thus forcing bazillions of extra context switches).

It sounds to me like your design is broken, needlessly mapping threads one-to-one to I/Os that are being waited for. This is a common error among programmers who consciously or subconsciously have accepted the 'more threads can do more work' philosophy.

What you need to do is take whatever it is you're thinking of as a 'thread' right now, which I'd roughly define as 'one logical task, from start to completion', and realize that there is absolutely no reason to map this one-to-one to actual pthreads threads and every reason in the world not to.

This will:

- conserve resources (12 thread stacks instead of 300, 12 KSEs instead of 300);
- reduce context switches (context switches will only occur when there's no work to do at all or a thread uses up its entire timeslice, rather than every time we change which client/task we're doing work for/on);
- improve scheduler efficiency (because the number of ready threads will not exceed the number of CPUs by much);
- and, more often than not, clean up a lot of ugliness in your architecture (because threads are probably being used instead of a sane abstraction for 'work to be done' or 'a client I'm doing work for').

DS
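One way to picture 'work to be done' as its own abstraction is a small fixed pool of pthreads pulling tasks from a shared queue. This is only a sketch under assumptions: the queue capacity, the worker limit and every name (struct pool, pool_init, pool_submit, pool_shutdown, count_task) are hypothetical and were not proposed in the thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define QUEUE_CAP   64   /* arbitrary bounds chosen for this sketch */
#define MAX_WORKERS 16

typedef void (*task_fn)(void *arg);
struct task { task_fn fn; void *arg; };

struct pool {
    pthread_mutex_t lock;
    pthread_cond_t  not_empty, not_full;
    struct task     queue[QUEUE_CAP];   /* ring buffer of pending tasks */
    size_t          head, count;
    int             closing;
    pthread_t       workers[MAX_WORKERS];
    int             nworkers;
};

/* Each worker loops: take one task, run it outside the lock, repeat. */
static void *worker_main(void *ptr)
{
    struct pool *p = ptr;
    for (;;) {
        pthread_mutex_lock(&p->lock);
        while (p->count == 0 && !p->closing)
            pthread_cond_wait(&p->not_empty, &p->lock);
        if (p->count == 0) {            /* closing and queue drained */
            pthread_mutex_unlock(&p->lock);
            return NULL;
        }
        struct task t = p->queue[p->head];
        p->head = (p->head + 1) % QUEUE_CAP;
        p->count--;
        pthread_cond_signal(&p->not_full);
        pthread_mutex_unlock(&p->lock);
        t.fn(t.arg);
    }
}

/* Start nworkers threads; returns 0 if at least one started, else -1. */
int pool_init(struct pool *p, int nworkers)
{
    assert(nworkers >= 1 && nworkers <= MAX_WORKERS);
    pthread_mutex_init(&p->lock, NULL);
    pthread_cond_init(&p->not_empty, NULL);
    pthread_cond_init(&p->not_full, NULL);
    p->head = p->count = 0;
    p->closing = 0;
    p->nworkers = 0;
    for (int i = 0; i < nworkers; i++) {
        if (pthread_create(&p->workers[i], NULL, worker_main, p) != 0)
            break;
        p->nworkers++;
    }
    return p->nworkers > 0 ? 0 : -1;
}

/* Queue one task; blocks the caller while the queue is full. */
void pool_submit(struct pool *p, task_fn fn, void *arg)
{
    pthread_mutex_lock(&p->lock);
    while (p->count == QUEUE_CAP)
        pthread_cond_wait(&p->not_full, &p->lock);
    p->queue[(p->head + p->count) % QUEUE_CAP] = (struct task){ fn, arg };
    p->count++;
    pthread_cond_signal(&p->not_empty);
    pthread_mutex_unlock(&p->lock);
}

/* Let the workers finish all queued tasks, then join them. */
void pool_shutdown(struct pool *p)
{
    pthread_mutex_lock(&p->lock);
    p->closing = 1;
    pthread_cond_broadcast(&p->not_empty);
    pthread_mutex_unlock(&p->lock);
    for (int i = 0; i < p->nworkers; i++)
        pthread_join(p->workers[i], NULL);
}

/* Example task: bump a shared atomic counter passed through arg. */
void count_task(void *arg) { atomic_fetch_add((atomic_int *)arg, 1); }
```

With this shape, 300 logical tasks can be submitted to, say, 4 workers; the number of kernel-visible threads stays at 4 however many tasks exist.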
* Re: Developing multi-threading applications 2002-06-13 11:58 ` David Schwartz @ 2002-06-13 16:26 ` Roberto Fichera 2002-06-14 20:56 ` David Schwartz 0 siblings, 1 reply; 19+ messages in thread From: Roberto Fichera @ 2002-06-13 16:26 UTC (permalink / raw) To: David Schwartz; +Cc: linux-kernel At 04.58 13/06/02 -0700, David Schwartz wrote: > If it's a simulation, you don't *really* need the threads, you > just need to >be able to act as if you had them. After all, what are you simulating if what >work gets done when is up to the random vagaries of the OS scheduler? > > If it's a real application wanting real performance, the > suggestions I made >stand -- you don't want many more threads working than you have CPUs and you >don't want a lot of threads sitting around waiting for work (and thus forcing >bazillions of extra context switches). This is a scheduler problem! All threads waiting for I/O are blocked by the scheduler, and this doesn't have any impact for the context switches it increase only the waitqueue, using the Ingo's O(1) scheduler, a big piece of code, it should make a big difference for example. > It sounds to me like your design is broken, needlessly mapping > threads to >I/Os that are being waited for one-to-one. This is a common error among >programmers who consciously or subconsciously have accepted the 'more threads >can do more work' philosophy. I don't think "more threads == more work done"! With the thread's approch it's possible to split a big sequential program in a variety of concurrent logical programs with a big win for code revisions and new implementation. > What you need to do is take whatever it is you're thinking of as > a 'thread' >right now, which I'd roughly define as 'one logical task, from start to >completion' and realize that there is absolutely no reason to map this >one-to-one to actual pthreads threads and every reason in the world not to. 
> > This will conserve resources (12 thread stacks instead of 300, 12 > KSEs >instead of 300), reduce context switches (context switches will only occur >when there's no work to do at all or a thread uses up its entire timeslice >rather than every time we change which client/task we're doing work for/on), >improve scheduler efficiency (because the number of ready threads will not >exceed the number of CPUs by much) and more often than not, clean up a lot of >ugliness in your architecture (because threads are probably being used >instead of a sane abstraction for 'work to be done' or 'a client I'm doing >work for'). You are right! But depend by the application! If you have todo I/O like signal acquisition, sensors acquisitions and so on, you must have a one thread for each type of data acquisition, you must have a thread that perform some data computation with a subset, for examples, of this data, and generate the output that could be a new input for an other thread. This make the environment more realistic. I agree with you that if we increase the thread's numbers the system could collapse (= context switches become expensive = we must increase the CPU numbers or new box is required or new approch should be make). Roberto Fichera. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Developing multi-threading applications
From: David Schwartz @ 2002-06-14 20:56 UTC
To: kernel
Cc: linux-kernel

On Thu, 13 Jun 2002 18:26:54 +0200, Roberto Fichera wrote:

>At 04.58 13/06/02 -0700, David Schwartz wrote:

>This is a scheduler problem! All threads waiting for I/O are blocked by
>the scheduler, and this has no impact on context switches; it only
>lengthens the wait queue. Using Ingo's O(1) scheduler, a big piece
>of code, should make a big difference, for example.

You are incorrect. If you have ten threads each waiting for an I/O and all ten I/Os are ready, then ten context switches are needed. If you have one thread waiting for ten I/Os, and ten I/Os become ready, one context switch is needed.

[snip]

>I don't think "more threads == more work done"! With the threaded approach
>it's possible to split a big sequential program into a variety of concurrent
>logical programs, which is a big win for code revisions and new
>implementations.

I'm not advising eliminating the threads approach. I'm only advising not using threads as your abstraction for clients or work to be done. Use threads as the execution vehicles that pick up work when there's work to be done. (Think thread pools, think separating I/O from computation.)

[snip]

>You are right! But it depends on the application! If you have to do I/O like
>signal acquisition, sensor acquisition and so on, you must have one thread
>for each type of data acquisition,

Even if that's true, and it's often not, how many different types of data acquisition can you have? Ten? Twenty? That's a far cry from 300.

DS
* Re: Developing multi-threading applications 2002-06-14 20:56 ` David Schwartz @ 2002-06-15 9:01 ` Roberto Fichera 2002-06-15 10:30 ` Ingo Oeser 0 siblings, 1 reply; 19+ messages in thread From: Roberto Fichera @ 2002-06-15 9:01 UTC (permalink / raw) To: David Schwartz; +Cc: linux-kernel At 13.56 14/06/02 -0700, David Schwartz wrote: >On Thu, 13 Jun 2002 18:26:54 +0200, Roberto Fichera wrote: > >At 04.58 13/06/02 -0700, David Schwartz wrote: > > >This is a scheduler problem! All threads waiting for I/O are blocked by > >the scheduler, and this doesn't have any impact for the context switches > >it increase only the waitqueue, using the Ingo's O(1) scheduler, a big piece > >of code, it should make a big difference for example. > > You are incorrect. If you have ten threads each waiting for an > I/O and all >ten I/Os are ready, then ten context switches are needed. If you have one >thread waiting for ten I/Os, and then I/Os come ready, one context switch is >needed. You are right with this specific case, but always depending what kind of I/O you must be done. Not all the case could be reduce to your logic, only a specific case. It's a only "local" optimization. >[snip] > > >I don't think "more threads == more work done"! With the thread's approch > >it's > >possible to split a big sequential program in a variety of concurrent > >logical > >programs with a big win for code revisions and new implementation. > > I'm not advising eliminating the threads approach. I'm only > advising not >using threads as your abstraction for clients or work to be done. Use threads >as the execution vehicles that pick up work when there's work to be done. >(Think thread pools, think separating I/O from computation.) Yes! This is what I want! >[snip] > >You are right! But depend by the application! 
If you have todo I/O like > >signal acquisition, > >sensors acquisitions and so on, you must have a one thread for each type of > >data acquisition, > > Even if that's true, and it's often not, how many different types > of data >acquisition can you have? Ten? Twenty? That's a far cry from 300. Currently are 190! Always active are ~110! So thinking by separating I/O from the computation we double the threads. Roberto Fichera. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Developing multi-threading applications
From: Ingo Oeser @ 2002-06-15 10:30 UTC
To: Roberto Fichera
Cc: David Schwartz, linux-kernel

On Sat, Jun 15, 2002 at 11:01:44AM +0200, Roberto Fichera wrote:
> > Even if that's true, and it's often not, how many different types
> > of data acquisition can you have? Ten? Twenty? That's a far cry from 300.
>
> Currently there are 190! About 110 are always active! So if we separate
> the I/O from the computation, we double the number of threads.

So basically you are just traversing your data dependency graph wrongly. Do a level-order traversal if it is a dependency forest, or a breadth-first traversal if not.

If this node requires I/O -> schedule the I/O and return to the upper level, notifying it that you want to be woken when the I/O has finished.

If this node requires computation -> do it if this CPU is the one with the lowest load, else schedule it for the CPU with the lowest load.

Continue with the next node.

("Load" here means "number of computations with the same metric scheduled on this thread".)

Use only one thread per CPU. Try to do the I/O waiting in a single place (poll would be perfect).

So this is all doable, once you analyze your data dependency graph properly and make the simulation data driven (which it usually is).

Regards

Ingo Oeser
--
Science is what we can tell a computer. Art is everything else. --- D.E.Knuth
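The traversal Ingo describes can be approximated by a breadth-first pass that releases a node once all of its inputs are done (Kahn-style topological levelling, named here as a substitute since the message does not spell out an algorithm). The sketch below assumes the dependency graph fits in a small adjacency matrix; assign_levels and MAX_NODES are hypothetical names.

```c
#include <assert.h>

#define MAX_NODES 32    /* arbitrary bound chosen for this sketch */

/*
 * dep[i][j] != 0 means node i needs the result of node j.
 * On success, level[i] is the "wave" in which node i can run: wave 0 has
 * no inputs, wave k needs only results from earlier waves, so all nodes
 * of one wave may be handed to the per-CPU workers together.
 * Returns the number of waves, or -1 if the graph contains a cycle.
 */
int assign_levels(int n, unsigned char dep[][MAX_NODES], int level[])
{
    int pending[MAX_NODES];     /* inputs of each node not yet finished */
    int queue[MAX_NODES];       /* FIFO of nodes whose inputs are all done */
    int head = 0, tail = 0, waves = 0;

    assert(n >= 0 && n <= MAX_NODES);

    for (int i = 0; i < n; i++) {
        pending[i] = 0;
        for (int j = 0; j < n; j++)
            if (dep[i][j])
                pending[i]++;
        if (pending[i] == 0) {
            level[i] = 0;
            queue[tail++] = i;
        }
    }

    while (head < tail) {
        int j = queue[head++];
        if (level[j] + 1 > waves)
            waves = level[j] + 1;
        /* Finishing j may release nodes that were waiting on it. Nodes
         * leave the FIFO in non-decreasing level order, so the last input
         * to finish is also the deepest one. */
        for (int i = 0; i < n; i++) {
            if (dep[i][j] && --pending[i] == 0) {
                level[i] = level[j] + 1;
                queue[tail++] = i;
            }
        }
    }
    return tail == n ? waves : -1;  /* unreleased nodes imply a cycle */
}
```

For example, with two acquisition nodes feeding one computation node whose result feeds a controller, the acquisitions land in wave 0, the computation in wave 1 and the controller in wave 2.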
* Re: Developing multi-threading applications 2002-06-15 10:30 ` Ingo Oeser @ 2002-06-17 8:17 ` Roberto Fichera 2002-06-17 16:07 ` Marco Colombo 0 siblings, 1 reply; 19+ messages in thread From: Roberto Fichera @ 2002-06-17 8:17 UTC (permalink / raw) To: Ingo Oeser; +Cc: David Schwartz, linux-kernel At 12.30 15/06/02 +0200, Ingo Oeser wrote: >On Sat, Jun 15, 2002 at 11:01:44AM +0200, Roberto Fichera wrote: > > > Even if that's true, and it's often not, how many different > types > > > of data > > >acquisition can you have? Ten? Twenty? That's a far cry from 300. > > > > Currently are 190! Always active are ~110! So thinking by separating > I/O from > > the computation we double the threads. > >So basically you are just traversing your data depedency graph >wrongly. Do a level order traversion if it is a dependency forest >or an breadth first traversion if not. Ok! I've semplified too much ;-)! >If this node require IO -> schedule the IO and return back to the upper >level noticing it, that you like to be woken, if the IO is >finished. > >If this node require Computation -> do it, if this CPU is the one with >lowest load, else schedule it for the CPU with lowest load. How can I do it ? Shouldn't be a kernel problem ? I could collect a various patch around that implement a CPU process bind/affinity and CPU load balance but how can I determine which CPU have the lowest load in a given time ? >Continue with next node. > >(load is meant "number of compuations with same metric scheduled >on this thread") > >Use only one thread per CPU. Try to make the IO-Waiting as unique >as possible (poll would be perfect). This could be implemented by the process affinity to bind the process to a CPU. But I continue to not hunderstand why I must have only one thread per CPU. There is some URL where can I see some kernel/sched/vm/I-O/other-think graph about this point ? 
>So this is all doable, once you analyze your data dependency >graph properly and make the simulation data driven (which it >usally is). > >Regards > >Ingo Oeser >-- >Science is what we can tell a computer. Art is everything else. --- D.E.Knuth Roberto Fichera. ^ permalink raw reply [flat|nested] 19+ messages in thread
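On the "which CPU has the lowest load" question: in Ingo's description the load is an application-level count of computations queued on each worker, not something read from the kernel, so a plain array of counters may be enough. The sketch below assumes one worker per online CPU; worker_count, least_loaded and assign_computation are hypothetical names, and _SC_NPROCESSORS_ONLN is a widely available sysconf extension rather than strict POSIX.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Number of online CPUs as reported by sysconf(); falls back to 1. */
long worker_count(void)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return n > 0 ? n : 1;
}

/*
 * load[i] is the number of computations currently queued on worker i.
 * Returns the index of the least loaded worker; the lowest index wins ties.
 */
size_t least_loaded(const unsigned load[], size_t nworkers)
{
    size_t best = 0;

    assert(nworkers > 0);
    for (size_t i = 1; i < nworkers; i++)
        if (load[i] < load[best])
            best = i;
    return best;
}

/* Pick the least loaded worker and account for the new computation. */
size_t assign_computation(unsigned load[], size_t nworkers)
{
    size_t w = least_loaded(load, nworkers);
    load[w]++;
    return w;
}
```

A worker would decrement its own counter when a computation finishes; with several threads updating the array, the counters would need a lock or atomics, which this sketch leaves out.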
* Re: Developing multi-threading applications
From: Marco Colombo @ 2002-06-17 16:07 UTC
To: Roberto Fichera
Cc: Ingo Oeser, David Schwartz, linux-kernel

On Mon, 17 Jun 2002, Roberto Fichera wrote:

[...]
> process to a CPU. But I still don't understand why
> I must have only one thread per CPU. Is there a URL
> where I can see some kernel/sched/vm/I-O/other-thing graph about
> this point?

To put it simply, because you have only one PC (program counter) per CPU. It's not really an OS thing.

Every time you're saving the PC (and SP, and all the "thread context") you're "emulating" more CPUs on just one. And what you get is just... an emulation. A thread is an execution abstraction, and a CPU is an execution actor. Sounds sensible to match the two. Use functions instead to group instructions by their (functional) meaning.

It makes much more sense, on a 4-way system, to have 4 rather complex threads that are able to execute different functions, like in a data-driven or event-driven model, than to run 400 simpler threads which implement one function each, IMHO.

.TM.
* Re: Developing multi-threading applications 2002-06-17 16:07 ` Marco Colombo @ 2002-06-17 18:00 ` Roberto Fichera 2002-06-17 18:55 ` Jakob Oestergaard 1 sibling, 0 replies; 19+ messages in thread From: Roberto Fichera @ 2002-06-17 18:00 UTC (permalink / raw) To: Marco Colombo; +Cc: Ingo Oeser, David Schwartz, linux-kernel At 18.07 17/06/02 +0200, Marco Colombo wrote: >On Mon, 17 Jun 2002, Roberto Fichera wrote: > >[...] > > process to a CPU. But I continue to not hunderstand why > > I must have only one thread per CPU. There is some URL > > where can I see some kernel/sched/vm/I-O/other-think graph about > > this point ? > >To put it simply, because you have only one PC per CPU. It's not >really an OS thing. > >Every time you're saving the PC (and SP, and all the "thread context") >you're "emulating" more CPUs on just one. And what you got is just... >an emulation. A Thread is an execution abstraction, and a CPU is an >execution actor. Sounds sensible to match the two. Use functions instead >to group instructions by their (functional) meaning. Yes! I know ;-)! >It makes much more sense, on 4-ways system, to have 4 rather complex >threads that are able to execute different functions, like in >a data-driven or event-driven model, than to run 400 simpler threads >which implement one function each, IMHO. To make it simple, I'll try the 2 solutions! >.TM. Roberto Fichera. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Developing multi-threading applications
From: Jakob Oestergaard @ 2002-06-17 18:55 UTC
To: Marco Colombo; +Cc: Roberto Fichera, Ingo Oeser, David Schwartz, linux-kernel

On Mon, Jun 17, 2002 at 06:07:51PM +0200, Marco Colombo wrote:
> On Mon, 17 Jun 2002, Roberto Fichera wrote:
>
> [...]
> > process to a CPU. But I still don't understand why
> > I must have only one thread per CPU. Is there some URL
> > where I can see some kernel/sched/vm/I-O/other-thing graph about
> > this point?
>
> To put it simply, because you have only one PC per CPU. It's not
> really an OS thing.
>
> Every time you're saving the PC (and SP, and all the "thread context")
> you're "emulating" more CPUs on just one. And what you got is just...
> an emulation. A Thread is an execution abstraction, and a CPU is an
> execution actor. Sounds sensible to match the two. Use functions instead
> to group instructions by their (functional) meaning.

It is common to use many threads per processor on some operating systems. But this is (in my experience) because of the lack of proper non-blocking APIs on said OS.

You can emulate non-blocking APIs with threads and a blocking API. And on some systems you simply have to.

On GNU/Linux this is not generally a problem. And as Marco said, you really shouldn't have to do that.

--
................................................................
:   jakob@unthought.net   : And I see the elder races,         :
:.........................: putrid forms of man                :
:   Jakob Østergaard      : See him rise and claim the earth,  :
:        OZ9ABN           : his downfall is at hand.           :
:.........................:............{Konkhra}...............:
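To illustrate the point about non-blocking APIs, here is a small sketch using the POSIX calls `fcntl()`, `poll()` and `read()`: one thread waits on several descriptors at once instead of parking one blocked thread on each. The helper names `set_nonblocking` and `read_ready` and the `MAX_FDS` limit are assumptions made for this example, not something from the thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

enum { MAX_FDS = 16 };  /* arbitrary limit for this sketch */

/* Put a descriptor into non-blocking mode; returns 0 on success. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Wait up to timeout_ms for any descriptor to become readable, then read
 * whatever is available from each ready one. The descriptors are expected
 * to be in non-blocking mode already. Returns the total number of bytes
 * read, or -1 if poll() fails. */
static long read_ready(const int *fds, size_t nfds, int timeout_ms)
{
    struct pollfd pfd[MAX_FDS];
    char buf[4096];
    long total = 0;

    assert(nfds <= MAX_FDS);
    for (size_t i = 0; i < nfds; i++) {
        pfd[i].fd = fds[i];
        pfd[i].events = POLLIN;
        pfd[i].revents = 0;
    }

    if (poll(pfd, (nfds_t)nfds, timeout_ms) < 0)
        return -1;

    for (size_t i = 0; i < nfds; i++) {
        if (!(pfd[i].revents & POLLIN))
            continue;
        for (;;) {
            ssize_t n = read(pfd[i].fd, buf, sizeof buf);
            if (n > 0) {
                total += n;  /* a real program would process buf here */
                continue;
            }
            if (n == -1 && errno == EINTR)
                continue;    /* interrupted by a signal: retry */
            break;           /* 0 = EOF, or EAGAIN = drained for now */
        }
    }
    return total;
}
```

A single worker can call `read_ready()` in a loop over all of its descriptors, so the number of threads no longer has to follow the number of I/O sources.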
[parent not found: <20020613113158.I22429@nightmaster.csn.tu-chemnitz.de>]
* Re: Developing multi-threading applications
[not found] <20020613113158.I22429@nightmaster.csn.tu-chemnitz.de>
From: Roberto Fichera @ 2002-06-13 10:25 UTC
To: Ingo Oeser; +Cc: linux-kernel

At 11.31 13/06/02 +0200, Ingo Oeser wrote:

>On Thu, Jun 13, 2002 at 11:08:27AM +0200, Roberto Fichera wrote:
> > You are right! But "computationally intensive" is not totally right, as I
> > said ;-), because most of the threads are waiting for I/O. After the I/O
> > is performed come the computationally intensive tasks. When a thread has
> > finished its work, all its results are sent to the father thread; the
> > father collects all its children's results, performs some computational
> > work, and sends its result to its own father, and so on, with many
> > father threads controlling other children. So I think the main
> > problem/overhead is thread creation and the number of threads.
>
>So you are creating a simulation/emulation application/engine, right?
>Or a measured data analysis engine? (which is basically the same
>task)

Yes! It's a simulation/emulation application.

>For this kind of task, creating your own kind of "threads" is
>probably better.
>
>Split it into the following data structure:
>
>struct my_thread {
>   actor_function_t actor;
>   input_t inbuf;
>   output_t outbuf;
>   state_t statebuf;
>};
>
>And provide rules and primitives for accessing inbuf/outbuf, if
>they might be shared (which is probable).

This can be a solution.

>Now you can build a dependency tree/graph for the whole stuff
>easily and schedule works of the same level to some real worker
>threads (which might be on different machines), which are one per CPU.
>
>The problem is to build the actor as a REAL primitive, that
>scales only by the size of inbuf and not by the contents of it.

Yes!

>Everything else is going to be bloated and not really scalable,
>but can be implemented by every "Joe Programmer" after finishing
>high school ;-)

That depends on the threading library, and on whether it is entirely in userspace or not! With so many threads that aren't entirely in userspace, the scheduler's performance and characteristics matter a lot. I would prefer a mixed solution, for example, because some problems can easily be solved with userspace threads and others cannot.

>Regards
>
>Ingo Oeser
>--
>Science is what we can tell a computer. Art is everything else. --- D.E.Knuth

Roberto Fichera.
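A rough sketch of how the quoted `struct my_thread` and the level-by-level scheduling might look in practice. The message leaves `input_t`, `output_t` and `state_t` undefined, so the definitions below, the extra `level` field, and the helpers `sum_actor` and `run_levels` are all assumptions made for illustration. Children write their results straight into their father's input buffer, which mirrors the father/child result collection described in the message.

```c
#include <assert.h>
#include <stddef.h>

struct my_thread;
typedef void (*actor_function_t)(struct my_thread *self);

/* Illustrative stand-ins for the types the message leaves undefined. */
typedef struct { const double *data; size_t len; } input_t;
typedef struct { double *slot; } output_t;  /* where the result is written */
typedef struct { int done; } state_t;

struct my_thread {
    actor_function_t actor;
    input_t  inbuf;
    output_t outbuf;
    state_t  statebuf;
    int      level;  /* assumption: depth in the dependency graph, 0 = leaf */
};

/* Example actor: its cost depends only on inbuf.len, not on the contents. */
static void sum_actor(struct my_thread *self)
{
    double s = 0.0;

    for (size_t i = 0; i < self->inbuf.len; i++)
        s += self->inbuf.data[i];
    *self->outbuf.slot = s;
    self->statebuf.done = 1;
}

/* Run every actor, one dependency level at a time. Actors on the same
 * level do not depend on each other, so a real engine could hand each
 * level to its per-CPU worker threads; here they simply run in a loop. */
static void run_levels(struct my_thread *t, size_t n, int max_level)
{
    for (int level = 0; level <= max_level; level++) {
        for (size_t i = 0; i < n; i++) {
            if (t[i].level != level)
                continue;
            assert(t[i].actor != NULL);
            t[i].actor(&t[i]);
        }
    }
}
```

For example, two leaf actors at level 0 can each sum their own array into one slot of a shared `father_in[2]` buffer, and a level 1 actor then sums `father_in` into the final result. Because each output slot has exactly one writer and is only read at the next level, no locking is needed between levels in this simplified form.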