* Re: [patch 00/21] hrtimer - High-resolution timer subsystem @ 2005-12-13 12:45 Nicolas Mailhot 2005-12-13 23:38 ` George Anzinger 0 siblings, 1 reply; 52+ messages in thread From: Nicolas Mailhot @ 2005-12-13 12:45 UTC (permalink / raw) To: Thomas Gleixner; +Cc: Roman Zippel, linux-kernel "This is your interpretation and I disagree. If I set up a timer with a 24 hour interval, which should go off everyday at 6:00 AM, then I expect that this timer does this even when the clock is set e.g. by daylight saving. I think, that this is a completely valid interpretation and makes a lot of sense from a practical point of view. The existing implementation does it that way already, so why do we want to change this ?" Please do not hardcode anywhere 1 day = 24h or something like this. Relative timers should stay relative not depend on DST. If someone needs a timer that sets of everyday at the same (legal) time, make him ask for everyday at that time not one time + n x 24h. Some processes need an exact legal hour Other processes need an exact duration In a DST world that's not the same thing at all - don't assume one or the other, have coders request exactly what they need and everyone will be happy. I can tell from experience trying to fix code which assumed one day = 24h is not fun at all. And yes sometimes the difference between legal and UTC time matters a lot. -- Nicolas Mailhot ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-13 12:45 [patch 00/21] hrtimer - High-resolution timer subsystem Nicolas Mailhot @ 2005-12-13 23:38 ` George Anzinger 2005-12-14 8:58 ` Kyle Moffett 2005-12-14 10:03 ` Nicolas Mailhot 0 siblings, 2 replies; 52+ messages in thread From: George Anzinger @ 2005-12-13 23:38 UTC (permalink / raw) To: Nicolas Mailhot; +Cc: Thomas Gleixner, Roman Zippel, linux-kernel Nicolas Mailhot wrote: > "This is your interpretation and I disagree. > > If I set up a timer with a 24 hour interval, which should go off > everyday at 6:00 AM, then I expect that this timer does this even when > the clock is set e.g. by daylight saving. I think, that this is a > completely valid interpretation and makes a lot of sense from a > practical point of view. The existing implementation does it that way > already, so why do we want to change this ?" I think that there is a miss understanding here. The kernel timers, at this time, do not know or care about daylight savings time. This is not really a clock set but a time zone change which does not intrude on the kernels notion of time (that being, more or less UTC). > > Please do not hardcode anywhere 1 day = 24h or something like this. > Relative timers should stay relative not depend on DST. As far as timers go, it is only the user who understands any abstraction above the second. I.e. hour, day, min. all are abstractions done in user land. There is, however, one exception, the leap second. The kernel inserts this at midnight UTC and does use a fixed constant (86400) to find midnight. > > If someone needs a timer that sets of everyday at the same (legal) time, > make him ask for everyday at that time not one time + n x 24h. > > Some processes need an exact legal hour > Other processes need an exact duration I think what we are saying is that ABS time flag says that the timer is supposed to expire at the given time "by the specified clock", however that time is arrived at, be it the initial time or the initial time plus one or more intervals. We are NOT saying that these intervals are the same size, but only that the given clock says that they are the same size, thus any clock setting done during an interval can cause that interval to be of a different size. Without the ABS time flag, we are talking about intervals (the initial and subsequent) that are NOT affected by clock setting and are thus as close to the requested duration as possible. > > In a DST world that's not the same thing at all - don't assume one or the > other, have coders request exactly what they need and everyone will be > happy. This is why the standard introduced the ABS time flag. It does NOT, however, IMHO touch on the issue of time zone changes introduced by shifting into and out of day light savings time. > > I can tell from experience trying to fix code which assumed one day = 24h > is not fun at all. And yes sometimes the difference between legal and UTC > time matters a lot. > -- George Anzinger george@mvista.com HRT (High-res-timers): http://sourceforge.net/projects/high-res-timers/ ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-13 23:38 ` George Anzinger @ 2005-12-14 8:58 ` Kyle Moffett 2005-12-14 10:03 ` Nicolas Mailhot 1 sibling, 0 replies; 52+ messages in thread From: Kyle Moffett @ 2005-12-14 8:58 UTC (permalink / raw) To: george; +Cc: Nicolas Mailhot, Thomas Gleixner, Roman Zippel, linux-kernel On Dec 13, 2005, at 18:38, George Anzinger wrote: > I think that there is a miss understanding here. The kernel > timers, at this time, do not know or care about daylight savings > time. This is not really a clock set but a time zone change which > does not intrude on the kernels notion of time (that being, more or > less UTC). One question I have right now is: How does the kernel treat time slewing? Sometimes I might want to say: "The clock has continuous error and measures 24hours and 2 seconds for every 24 hours of real time", in which case the monotonic time should be slewed -2sec/ 24hours. On the other hand, I might also want to say: "The clock has fixed error and is 2 hours ahead cause some dummy messed up the time", so I'm going to fix this over the next 2 weeks by slewing backwards 1 hour per 7 days, in which case I do _not_ want the monotonic time to be affected (I'm passing 2 days, not 1 day and 22 hours). How does the kernel handle this? I've never seen any good description of the NTP and time-control APIs; if there is one out there (that's not 42 pages of dry standard), I would love a link. Cheers, Kyle Moffett -- If you don't believe that a case based on [nothing] could potentially drag on in court for _years_, then you have no business playing with the legal system at all. -- Rob Landley ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-13 23:38 ` George Anzinger 2005-12-14 8:58 ` Kyle Moffett @ 2005-12-14 10:03 ` Nicolas Mailhot 2005-12-15 1:11 ` George Anzinger 1 sibling, 1 reply; 52+ messages in thread From: Nicolas Mailhot @ 2005-12-14 10:03 UTC (permalink / raw) To: george; +Cc: Thomas Gleixner, Roman Zippel, linux-kernel On Mer 14 décembre 2005 00:38, George Anzinger wrote: > Nicolas Mailhot wrote: >> "This is your interpretation and I disagree. >> >> If I set up a timer with a 24 hour interval, which should go off >> everyday at 6:00 AM, then I expect that this timer does this even when >> the clock is set e.g. by daylight saving. I think, that this is a >> completely valid interpretation and makes a lot of sense from a >> practical point of view. The existing implementation does it that way >> already, so why do we want to change this ?" > > I think that there is a miss understanding here. The kernel timers, > at this time, do not know or care about daylight savings time. This > is not really a clock set but a time zone change which does not > intrude on the kernels notion of time (that being, more or less UTC). Probably. I freely admit I didn't follow the whole discussion. But the example quoted strongly hinted at fudging timers in case of DST, which would be very bad if done systematically and not on explicit user request. What I meant to write is "do not assume any random clock adjustement should change timer duration". Some people want it, others definitely don't. I case of kernel code legal time should be pretty much irrelevant, so if 24h timers are adjusted so they still go of at the same legal hour, that would be a bug IMHO. -- Nicolas Mailhot ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-14 10:03 ` Nicolas Mailhot @ 2005-12-15 1:11 ` George Anzinger 0 siblings, 0 replies; 52+ messages in thread From: George Anzinger @ 2005-12-15 1:11 UTC (permalink / raw) To: Nicolas Mailhot; +Cc: Thomas Gleixner, Roman Zippel, linux-kernel Nicolas Mailhot wrote: > On Mer 14 décembre 2005 00:38, George Anzinger wrote: > >>Nicolas Mailhot wrote: >> >>>"This is your interpretation and I disagree. >>> >>>If I set up a timer with a 24 hour interval, which should go off >>>everyday at 6:00 AM, then I expect that this timer does this even when >>>the clock is set e.g. by daylight saving. I think, that this is a >>>completely valid interpretation and makes a lot of sense from a >>>practical point of view. The existing implementation does it that way >>>already, so why do we want to change this ?" >> >>I think that there is a miss understanding here. The kernel timers, >>at this time, do not know or care about daylight savings time. This >>is not really a clock set but a time zone change which does not >>intrude on the kernels notion of time (that being, more or less UTC). > > > Probably. I freely admit I didn't follow the whole discussion. But the > example quoted strongly hinted at fudging timers in case of DST, which > would be very bad if done systematically and not on explicit user request. > > What I meant to write is "do not assume any random clock adjustement > should change timer duration". Some people want it, others definitely > don't. > > I case of kernel code legal time should be pretty much irrelevant, so if > 24h timers are adjusted so they still go of at the same legal hour, that > would be a bug IMHO. I am not quite sure what you are asking for here, but, as things set today, the kernels notion of time starts with a set time somewhere around boot up, be it from RT clock hardware or, possibly some script that quires some other system to find the time (NTP or otherwise). This time is then kept up to date by timer ticks which are assumed to have some fixed duration with, possibly, small drift corrections via NTP code. And then there is the random settimeofday, which the kernel has to assume is "needed" and correct. On top of this the POSIX clocks and timers code implements clocks which read the current time and a system relative time called monotonic time. We, by convention, roll the monotonic time and uptime together, and, assuming that the NTP corrections are telling us something about our "rock", we correct both the monotonic time and the time of day as per the NTP requests. Timers are then built on top of these clocks in two ways, again, as per the POSIX standard: 1) the relative timer, and 2) the absolute timer. For the relative timer, the specified expiry time is defined to be _now_ plus the given interval. For the absolute timer the expiry time is defined as that time when the given clock reaches the requested time. The only thing in here that might relate to your "legal" hour is that we adjust (via NTP) the clocks so that they, supposedly, run at the NBS (or is it the Naval Observatory) rate, give or take a small, hopefully, well defined error. > -- George Anzinger george@mvista.com HRT (High-res-timers): http://sourceforge.net/projects/high-res-timers/ ^ permalink raw reply [flat|nested] 52+ messages in thread
* [patch 00/21] hrtimer - High-resolution timer subsystem @ 2005-12-06 0:01 tglx 2005-12-06 17:32 ` Roman Zippel 0 siblings, 1 reply; 52+ messages in thread From: tglx @ 2005-12-06 0:01 UTC (permalink / raw) To: linux-kernel; +Cc: akpm, rostedt, johnstul, zippel, mingo This is a major rework of the former ktimer subsystem. It replaces the ktimer patch series and drops the ktimeout series completely. A broken out series is available from http://www.tglx.de/projects/ktimers/patches-2.6.15-rc5-hrtimer.tar.bz2 1. Naming After the extensive discussions on LKML, Andrew Morton suggested "hrtimer" and we picked it up. While the hrtimer subsystem does not offer high-resolution clock sources just yet, the subsystem can be easily extended with high-resolution clock capabilities. The rework of the ktimer-hrt patches is the next step. 2. More simplifications We worked through the subsystem and its users and further reduced the implementation to the basic required infrastructure and generally streamlined it. (We did this with easy extensibility for the high resolution clock support still in mind, so we kept some small extras around.) The new .text overhead (on x86) we believe speaks for itself: text data bss dec hex filename 2468380 547212 155164 3170756 3061c4 vmlinux-2.6.15-rc2 2469996 548016 155164 3173176 306b38 vmlinux-ktimer-rc5-mm1 2468164 547508 155100 3170772 3061d4 vmlinux-hrtimer While it was +1616 bytes before, it's -216 bytes now. This also gives a new justification for hrtimers: it reduces .text overhead ;-) [ There's still some .data overhead, but it's acceptable at 0.1%.] On 64-bit platforms such as x64 there are even more .text savings: text data bss dec hex filename 3853431 914316 403880 5171627 4ee9ab vmlinux-x64-2.6.15-rc5 3852407 914548 403752 5170707 4ee613 vmlinux-x64-hrtimer (due to the compactness of 64-bit ktime_t ops) Other 32-bit platforms (arm, ppc) have a much smaller .text hrtimers footprint now too. 3. Fixes The last splitup of ktimers resulted in a bug in the overrun accounting. This bug is now fixed and the code verified for correctness. 4. Rounding We looked at the runtime behaviour of vanilla, ktimers and ptimers to figure out the consequences for applications in a more detailed way. The rounding of time values and intervals leads to rather unpredictible results which deviates of the current mainline implementation significantly and introduces unpredictible behaviour vs. the timeline. After reading the Posix specification again, we came to the conclusion that it is possible to do no rounding at all for the ktime_t values, and to still ensure that the timer is not delivered early. ".. and that timers must wait for the next clock tick after the theoretical expiration time, to ensure that a timer never returns too soon. Note also that the granularity of the clock may be significantly coarser than the resolution of the data format used to set and get time and interval values. Also note that some implementations may choose to adjust time and/or interval values to exactly match the ticks of the underlying clock." Which allows the already discussed part of the spec to be interpreted differently: "Time values that are between two consecutive non-negative integer multiples of the resolution of the specified timer shall be rounded up to the larger multiple of the resolution. Quantization error shall not cause the timer to expire earlier than the rounded time value." The rounding of the time value i.e. the expiry time itself must be rounded to the next clock tick, to ensure that a timer never expires early. Thomas, Ingo -- ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-06 0:01 tglx @ 2005-12-06 17:32 ` Roman Zippel 2005-12-06 19:07 ` Ingo Molnar ` (2 more replies) 0 siblings, 3 replies; 52+ messages in thread From: Roman Zippel @ 2005-12-06 17:32 UTC (permalink / raw) To: tglx; +Cc: linux-kernel, Andrew Morton, rostedt, johnstul, mingo Hi Thomas, On Tue, 6 Dec 2005 tglx@linutronix.de wrote: Before I get into a detailed review, I have to asked a question I already asked earlier: are even interested in a discussion about this? Since I posted the ptimer patches, I haven't gotten a single direct response from you, except some generic description in your last patch. I would prefer if we could work together on this, but this requires some communication. I know I'm sometimes a little hard to understand, but you don't even try to ask if something is unclear or to explain the details from your perspective. Slowly I'm asking myself why I should bother, the alternative would be to just continue my own patch set. I don't really want that and Andrew certainly doesn't want to choose between two versions either. So Thomas, please get over yourself and start talking. > We worked through the subsystem and its users and further reduced the > implementation to the basic required infrastructure and generally > streamlined it. (We did this with easy extensibility for the high > resolution clock support still in mind, so we kept some small extras > around.) It looks better, but could you please explain, what these extras are good for? > After reading the Posix specification again, we came to the conclusion > that it is possible to do no rounding at all for the ktime_t values, and > to still ensure that the timer is not delivered early. Nice, that you finally also come to that conclusion, after I said that already for ages. (It's also interesting how you do that without giving me any credit for it.) Nevertheless, if you read my explanation of the rounding carefully and look at my implementation, you may notice that I still disagree with the actual implementation. BTW there is one thing I'm currently curious about. Why did you rename the timer from high-precision timer to high-resolution timer? hrtimer was just a suggestion from Andrew and ptimer would have been fine as well. bye, Roman ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-06 17:32 ` Roman Zippel @ 2005-12-06 19:07 ` Ingo Molnar 2005-12-07 3:05 ` Roman Zippel 2005-12-06 22:10 ` Thomas Gleixner 2005-12-06 22:28 ` Thomas Gleixner 2 siblings, 1 reply; 52+ messages in thread From: Ingo Molnar @ 2005-12-06 19:07 UTC (permalink / raw) To: Roman Zippel; +Cc: tglx, linux-kernel, Andrew Morton, rostedt, johnstul * Roman Zippel <zippel@linux-m68k.org> wrote: > Hi Thomas, > > On Tue, 6 Dec 2005 tglx@linutronix.de wrote: > > Before I get into a detailed review, I have to asked a question I > already asked earlier: are even interested in a discussion about this? we are certainly interested in a technical discussion! > I would prefer if we could work together on this, but this requires > some communication. I know I'm sometimes a little hard to understand, > but you don't even try to ask if something is unclear or to explain > the details from your perspective. you think the reason is that you are "sometimes a little hard to understand". Which, as i guess it implies, comes from your superior intellectual state of mind, and i am really thankful for your efforts trying to educate us mere mortals. but do you honestly believe that this is the only possible reason? How about "your message often gets lost because you often offend people and thus do not respect their work" as a possibility? How about "hence it has not been much fun to work with you" as a further explanation? to be able to comprehend what kind of mood we might be in when reading your emails these days, how about this little snippet from you, from the second email you wrote in the ktimers threads: "First off, I can understand that you're rather upset with what I wrote, unfortunately you got overly defensive, so could you please next time not reply immediately and first sleep over it, an overly emotional reply is funny to read but not exactly useful." http://marc.theaimsgroup.com/?l=linux-kernel&m=112743074308613&w=2 and to tell you my personal perspective, the insults coming from you in our direction have not appeared to have stopped since. I am being dead serious here, and i'd love nothing else if you stopped doing what you are doing and if i didnt have to write such mails and if things got more constructive in the end. Insults like the following sentence in this very email: > [...] So Thomas, please get over yourself and start talking. let me be frank, and show you my initial reply that came to my mind when reading the above sentence: "who the f*ck do you think you are to talk to _anyone_ like that?". Now i'm usually polite and wont reply like that, but one thing is sure: no technical thought was triggered by your sentence and no eternal joy filled my mind aching to reply to your questions. Suggestion: if you want communication and cooperation, then be cooperative to begin with. We are doing Linux kernel programming for the fun of it, and the style of discussions matters just as much as the style of code. i'm not sure what eternal sin we've committed to have deserved the sometimes hidden, sometimes less hidden trash-talk you've been practicing ever since we announced ktimers. in any case, from me you'll definitely get a reply to every positive or constructive question you ask in this thread, but you wont get many replies to mails that also include high-horse insults, question or statements. Frankly, i dont have that much time to burn, we've been through one ktimer flamewar already and it wasnt overly productive :) Ingo ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-06 19:07 ` Ingo Molnar @ 2005-12-07 3:05 ` Roman Zippel 2005-12-08 5:18 ` Paul Jackson 2005-12-08 9:26 ` Ingo Molnar 0 siblings, 2 replies; 52+ messages in thread From: Roman Zippel @ 2005-12-07 3:05 UTC (permalink / raw) To: Ingo Molnar; +Cc: tglx, linux-kernel, Andrew Morton, rostedt, johnstul Hi, On Tue, 6 Dec 2005, Ingo Molnar wrote: > you think the reason is that you are "sometimes a little hard to > understand". Which, as i guess it implies, comes from your superior > intellectual state of mind, and i am really thankful for your efforts > trying to educate us mere mortals. I can assure you my "superior intellectual state of mind" is not much different from many other kernel hackers. I have at times strong opinions, but who here hasn't? > to be able to comprehend what kind of mood we might be in when reading > your emails these days, how about this little snippet from you, from the > second email you wrote in the ktimers threads: > > "First off, I can understand that you're rather upset with what I wrote, > unfortunately you got overly defensive, so could you please next time > not reply immediately and first sleep over it, an overly emotional > reply is funny to read but not exactly useful." Here we probably get to the root of the problem: we got off on the wrong foot. In my first email I hadn't much good to say about the initial announcement, but at any time it was meant technical. Anyone who compares the first and the following announcement will notice the big improvement. Unfortunately Thomas seemed to have to taken it rather personal (although it never was meant that way) and I never got past this first impression and ever since I can't get him back to a normal conversation. > Insults like the following sentence in this very email: > > > [...] So Thomas, please get over yourself and start talking. I must say it's completely beyond me how this could be "insulting". This is my desperate attempt at getting any conversation started. If Thomas isn't talking to me at all, I can't resolve any issue he might have with me. Instead he's just moping around, pissed at me and simply ignores me, which makes a conversation over this channel nearly impossible. > let me be frank, and show you my initial reply that came to my mind when > reading the above sentence: "who the f*ck do you think you are to talk > to _anyone_ like that?". Now i'm usually polite and wont reply like > that,... You may haven't said it openly like that, but this hostility was still noticable. You disagreed with me on minor issues and used the smallest mistake to simply lecture me. From my point the attitude you showed towards me is not much different from what you're accusing me of here. I'm not saying that I'm innocent about this, but any "insult" was never intentional and I tried my best to correct any issues after we got off on the wrong foot, but I obviously failed at that, I simply never got past the initial impression. > in any case, from me you'll definitely get a reply to every positive or > constructive question you ask in this thread, but you wont get many > replies to mails that also include high-horse insults, question or > statements. Let's take the ptimer patches, I got _zero_ direct responses to it and it's difficult for me to understand how this could be taken as "high-horse insult". As I obviously failed to make my criticism understandable before, I produced these patches to provide a technical base for a discussion of how this functionality could be merged in the hopes of "Patches wont be ignored, i can assure you", unfortunately they were. Ingo, you might now start to understand my frustration. One positive effect at least is that finally some movement got into this mess and you managed to produce a simplified version of the timer. OTOH since I never got a reply to these patches does that mean they were neither positive nor constructive? bye, Roman ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-07 3:05 ` Roman Zippel @ 2005-12-08 5:18 ` Paul Jackson 2005-12-08 8:12 ` Ingo Molnar 2005-12-08 9:26 ` Ingo Molnar 1 sibling, 1 reply; 52+ messages in thread From: Paul Jackson @ 2005-12-08 5:18 UTC (permalink / raw) To: Roman Zippel; +Cc: mingo, tglx, linux-kernel, akpm, rostedt, johnstul > > > [...] So Thomas, please get over yourself and start talking. > > I must say it's completely beyond me how this could be "insulting". Well ... fools rush in where angels fear to tread ... As you note, people had better not take comments on their code as insulting, if they are going to survive for long on lkml. However comments on the person that don't match that person's current self image often cause distress, even if (sometimes -especially- if) they are accurate. This 'get over yourself' implies a comment on the person, not the code. It suggests you think they are on a high horse. Since Thomas (apparently) didn't think he was afflicted at the moment with something he needed to 'get over', he probably found that instruction annoying on first reading. That he called it an 'insult' is an irrelevant detail. He's just saying he found it annoying to read, but like most of us, instead of saying "I hurt", he's saying "dog bites." Nevermind that it was actually a cat. There is an easy way around this however. When I feel like telling someone they are an idiot (or any other ad hominem comment less than puppy dog happiness), I have better luck calling myself that, as in "sorry for being such a stupid git, but ...". Few people object to the -other- person confessing to being a stupid rude bastard. They just don't want themselves to be thought of that way, or anything remotely resembling that. As I recall, Linus has called himself a bastard more than once, with just such good affect. So just take all descriptions of other persons, and flip them around, pretending to describe yourself. It will be a bold faced lie, and totally illogical ... but that's typical in the realm of human emotions. The human species is definitely one dorked up bunch. Imagine that this subthread had begun "Sorry, Thomas, let me get off my high horse and start talking ...". Thanks, by the way, for your help back then on cpuset locking. It was invaluable. -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson <pj@sgi.com> 1.925.600.0401 ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-08 5:18 ` Paul Jackson @ 2005-12-08 8:12 ` Ingo Molnar 0 siblings, 0 replies; 52+ messages in thread From: Ingo Molnar @ 2005-12-08 8:12 UTC (permalink / raw) To: Paul Jackson; +Cc: Roman Zippel, tglx, linux-kernel, akpm, rostedt, johnstul * Paul Jackson <pj@sgi.com> wrote: > So just take all descriptions of other persons, and flip them around, > pretending to describe yourself. It will be a bold faced lie, and > totally illogical ... but that's typical in the realm of human > emotions. The human species is definitely one dorked up bunch. > "sorry for being such a stupid git, but ...". it does not matter that it's a bold faced lie, people technically 'lie' about little things all the time and for a good reason - the human species (especially males) are way to agressive by default, so a certain conscious buffer zone is needed to even that out. (It is in fact a medical condition if that buffer-zone does not exist.) But there's a crutial difference. A statement from Linus (or you) like the one above also shows three more things that are important: 1) that you have a sense of humor, and that you dont take things too seriously :) Humor goes a long way defusing differences. 2) that you actually entertain the possibility of being wrong and that you do not just want to steamroll the other side with your opinion. 3) that you actually care about the other person. This matters alot. There's a reason why hundreds of people who never met each other join on a mailing list and write code, and the reason is definitely not to have others piss on their code. > Thanks, by the way, for your help back then on cpuset locking. It was > invaluable. i took some time and re-read your thread with Roman about cpusets back in September. It was like fresh air! Roman was totally reasonable and positive in that thread - so i dont think it's me or Thomas misreading his style in this case or something. anyway, the reason i confronted Roman with the situation was in the renewed hope to improve direct communication. I'm still hopeful :) Ingo ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-07 3:05 ` Roman Zippel 2005-12-08 5:18 ` Paul Jackson @ 2005-12-08 9:26 ` Ingo Molnar 2005-12-08 13:08 ` Roman Zippel 1 sibling, 1 reply; 52+ messages in thread From: Ingo Molnar @ 2005-12-08 9:26 UTC (permalink / raw) To: Roman Zippel; +Cc: tglx, linux-kernel, Andrew Morton, rostedt, johnstul * Roman Zippel <zippel@linux-m68k.org> wrote: > > to be able to comprehend what kind of mood we might be in when reading > > your emails these days, how about this little snippet from you, from the > > second email you wrote in the ktimers threads: > > > > "First off, I can understand that you're rather upset with what I wrote, > > unfortunately you got overly defensive, so could you please next time > > not reply immediately and first sleep over it, an overly emotional > > reply is funny to read but not exactly useful." > > Here we probably get to the root of the problem: we got off on the > wrong foot. yes. > In my first email I hadn't much good to say about the initial > announcement, but at any time it was meant technical. Anyone who > compares the first and the following announcement will notice the big > improvement. Unfortunately Thomas seemed to have to taken it rather > personal (although it never was meant that way) [...] what you did was in essence to piss on his code, concepts and description. [oh, i wrote roughly half of ktimers, so you pissed on my code too ;-) ] Here are a few snippets from you that show that most of the negative messaging from you was directed against the text Thomas wrote (or against Thomas), not against the code: " How you get to these conclusions is still rather unclear, I don't even really know what the problem is from just reading the pretext. " " What is seriously missing here is the big picture. " " This no answer at all, you only repeat what you already said above. " " The main problem with your text is that you jump from one topic to the next, making it impossible to create a coherent picture from it. " " You don't think that having that much timers in first place is little insane (especially if these are kernel timers)? " " Later it becomes clear that you want high resolution timers, what doesn't become clear is why it's such an essential feature that everyone has to pay the price for it (this does not only include changes in runtime behaviour, but also the proposed API changes). " " It's nice that you're sure of it, but as long don't provide the means to verify them, they are just assertions. " note that these were pretty much out of the blue sky, and they pretty much set the stage. Given that Thomas is a volunteer too, he does not have to bear with what he senses as arbitrary abuse. > [...] and I never got past this first impression and ever since I > can't get him back to a normal conversation. maybe because your 'get him back to a normal conversation' attempt used precisely the same somewhat dismissive and somewhat derogatory tone and type of language that your initial mails were using? Past experience is extrapolated to the future, so even small negative messages get easily blown up, and the "cycle of violence" never stops, because nobody breaks the loop. > > Insults like the following sentence in this very email: > > > > > [...] So Thomas, please get over yourself and start talking. > > I must say it's completely beyond me how this could be "insulting". maybe it is insulting because the "get over yourself" implicitly suggests that the fault is with Thomas? Let me give you a few alternatives, that might have had a completely different effect and which do not contain any implicit insults: "So Thomas, I know we got on to the wrong footing, but lets start talking again." or: "So Thomas, I know we had a couple of nasty exchanges in the past, but lets bury the hatchet and try again. I apologize if I offended you in any way in the past." just try it, really. Even if it's a bold faced lie ;) > This is my desperate attempt at getting any conversation started. If > Thomas isn't talking to me at all, I can't resolve any issue he might > have with me. [...] Thomas wrote you 11 replies in 2.5 months, and some of those were extremely detailed. That's a far cry from not talking at all. He did try hard, he did get involved in a flamewar with you, which wasnt overly productive. But he is a volunteer, he has no obligation to waste time on flamewars. > > let me be frank, and show you my initial reply that came to my mind when > > reading the above sentence: "who the f*ck do you think you are to talk > > to _anyone_ like that?". Now i'm usually polite and wont reply like > > that,... > > You may haven't said it openly like that, but this hostility was still > noticable. You disagreed with me on minor issues and used the smallest > mistake to simply lecture me. From my point the attitude you showed > towards me is not much different from what you're accusing me of here. yes, i definitely was not nice in a couple of mails, and i'd like to apologize for it. > I'm not saying that I'm innocent about this, but any "insult" was > never intentional and I tried my best to correct any issues after we > got off on the wrong foot, but I obviously failed at that, I simply > never got past the initial impression. ok, apology taken :) > > in any case, from me you'll definitely get a reply to every positive or > > constructive question you ask in this thread, but you wont get many > > replies to mails that also include high-horse insults, question or > > statements. > > Let's take the ptimer patches, I got _zero_ direct responses to it > [...] well, direct communication with you has proven to be very unproductive a number of times, so what would have been the point? But we did mention lots of technical points in our subsequent mails, referring to your ptimers queue a number of times. We even adopted the ktime.h simplification idea and credited you for it. also, what did you expect? Basically you came out with a patch-queue based on ktimers, but you did not send any changes against the ktimers patch itself, which made it very hard to map the real finegrained changes you did to ktimers. You provided a writeup of differences, but they did not fully cover the full scope of changes, relative to ktimers. You based your queue on a weeks old version of ktimers, which also raises the possibility that you were working on this for some time, in private, for whatever reason. (Again, this is not a complaint - we are happy you are communicating via code - whatever form that code is in.) > [...] and it's difficult for me to understand how this could be taken > as "high-horse insult". As I obviously failed to make my criticism > understandable before, I produced these patches to provide a technical > base for a discussion of how this functionality could be merged in the > hopes of "Patches wont be ignored, i can assure you", unfortunately > they were. they were not ignored - we mentioned the ptimer changes in our subsequent announcements, and we Cc:-ed you to those annoucements so that you get it first-hand. Ingo ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-08 9:26 ` Ingo Molnar @ 2005-12-08 13:08 ` Roman Zippel 2005-12-08 15:36 ` Steven Rostedt 0 siblings, 1 reply; 52+ messages in thread From: Roman Zippel @ 2005-12-08 13:08 UTC (permalink / raw) To: Ingo Molnar; +Cc: tglx, linux-kernel, Andrew Morton, rostedt, johnstul Hi, On Thu, 8 Dec 2005, Ingo Molnar wrote: > Here are a few snippets from you that show that most of the negative > messaging from you was directed against the text Thomas wrote (or > against Thomas), not against the code: Do you really think quoting me out of context is helping? From my perspective you're trying to show me now as the bad guy and I'm not accepting that. I don't know what you're trying to do, if you're trying to mediate, then you're really suck at it, if you just want to piss me off, it's working great. :-( Technically I still stand behind everything I said in that context, in the meantime I learned a few new things and I understand them better, so some things have become nonissues and I even changed my mind about some other things. OTOH I'm the first to admit that I could have said things nicer, but mail is a rather bad channel to transport emotions and whatever I say can be taken badly. I really try my best to avoid this, but sometimes it's really hard, especially if I can't get past the initial resentment. I gladly apologize for any mistake I did and I'll do my best to learn from it, but I'm not going to make amends for it forever. At some point it would be really nice if you stopped to rub it in what a insensitive clod I am, I know that already. Ingo, if you want to help me, why don't you go with a good example ahead and I'll try to follow you. How about this? > > > > [...] So Thomas, please get over yourself and start talking. > > > > I must say it's completely beyond me how this could be "insulting". > > maybe it is insulting because the "get over yourself" implicitly > suggests that the fault is with Thomas? This is a nice example, that _whatever_ I'm saying can be misunderstood. Why don't you even try to give me a little credit that above was not meant as insult? You make an assumption about what I said and you don't even give me a chance to correct myself. Thomas obviously has some kind of problem with me and unless he starts to talk to me, I can't help him to get over whatever problem that is. I'm not going away, so we have to get along somehow and this means we have to _talk_. Ingo, you only want to see the "get over yourself" part, whereas my emphasis was and is on "talking". > just try it, really. Even if it's a bold faced lie ;) I'm a bad liar and as long as I don't know what the problem is, I'll make the same mistake over and over. I have no intention of becoming a notorious liar. > Thomas wrote you 11 replies in 2.5 months, and some of those were > extremely detailed. That's a far cry from not talking at all. Some of it was indeed more verbose, but I never got very far with my followup questions. Thomas used very often a phrase like "we analyzed the problem and we came to the conclusion...". It's great that you and Thomas get so well along with each other, but I'm in the disadvantage that I lack the information context that you have. What is "extremely detailed" for you is lacking context to create a coherent picture for me, so it's sometimes really frustrating to pull some information out of you both. > also, what did you expect? Basically you came out with a patch-queue > based on ktimers, but you did not send any changes against the ktimers > patch itself, which made it very hard to map the real finegrained > changes you did to ktimers. At the time I only had the huge ktimers patch from -mm to work with. One primary target was to split out the core (without all the extra complexity and extra cleanups) into mergable pieces, which makes it a bit pointless to do it relative to this huge patch. The other main target was the resolution handling, I tried very hard to explain the details of it and why I did them this way. A discussion about this would have required a _direct_ response, where you point out with what you disagree. Random comments in other mails are not helping at all. The rest are some smaller patches, which are completely independent of hrtimer, but even for this I got no response except from Oleg. > You provided a writeup of differences, but > they did not fully cover the full scope of changes, relative to ktimers. I've seen this claim now a few times, but why the hell don't you just ask about the things that you think were missing? bye, Roman ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-08 13:08 ` Roman Zippel @ 2005-12-08 15:36 ` Steven Rostedt 0 siblings, 0 replies; 52+ messages in thread From: Steven Rostedt @ 2005-12-08 15:36 UTC (permalink / raw) To: Roman Zippel; +Cc: Ingo Molnar, tglx, linux-kernel, Andrew Morton, johnstul On Thu, 2005-12-08 at 14:08 +0100, Roman Zippel wrote: > Hi, > > On Thu, 8 Dec 2005, Ingo Molnar wrote: > > > Here are a few snippets from you that show that most of the negative > > messaging from you was directed against the text Thomas wrote (or > > against Thomas), not against the code: > > Do you really think quoting me out of context is helping? From my > perspective you're trying to show me now as the bad guy and I'm not > accepting that. I don't know what you're trying to do, if you're trying to > mediate, then you're really suck at it, if you just want to piss me off, > it's working great. :-( > Ingo, Thomas, Roman, Please!!!! Lets all say sorry to each other and start over. This thread is starting to become really annoying. You three are very intelligent, and I really want to know what you have to say technically to each other, and that's why I'm not ignoring this thread. Yes, all of you were not nice to each other. Suck it up and forget about it. How about this, if you want to flame each other, do it privately. But when posting to the LKML, Talk technically only. Don't even try to be nice, since I'm not sure now any of you can do that to each other. But also leave out _any_ personal comments. So talk _only_ about the code. If you don't like it, give a technical reason why. Remember, the rest of the world is watching here. Thank you, -- Steve ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-06 17:32 ` Roman Zippel 2005-12-06 19:07 ` Ingo Molnar @ 2005-12-06 22:10 ` Thomas Gleixner 2005-12-07 3:11 ` Roman Zippel 2005-12-06 22:28 ` Thomas Gleixner 2 siblings, 1 reply; 52+ messages in thread From: Thomas Gleixner @ 2005-12-06 22:10 UTC (permalink / raw) To: Roman Zippel; +Cc: linux-kernel, Andrew Morton, rostedt, johnstul, mingo Hi Roman, On Tue, 2005-12-06 at 18:32 +0100, Roman Zippel wrote: > Before I get into a detailed review, I have to asked a question I already > asked earlier: are even interested in a discussion about this? Yes, I am and always was, as long it is on a technical level. > Slowly I'm asking myself why I should bother, the alternative would be > to just continue my own patch set. I don't really want that and Andrew > certainly doesn't want to choose between two versions either. So > Thomas, please get over yourself and start talking. I'm interested in working with others and I do that a lot. It depends a bit on the attitude of the person who wants to do that. I did not have the feeling that you are interested in working together. Usually people who want to participate in a project send patches, suggestions or testing feedback. Your reaction throughout the whole mail threads was neither cooperative nor appealing to me. I have no problem at all to accept critizism and help from others, but your attitude of teaching me how to do my work was just annoying. When others have done the hard chores of analysing the underlying problems and trying to solve them in various ways it is a simple task to jump in and tell them the big truth of the right solution. Acknowledging the work of others which led to a maybe imperfect solution in the first pass and helping in a constructive way to bring it to a better shape is a different thing. Sure you can fork off your own project and do what you want if you feel the urge to do so. We'd prefer to see patches against our queue, but it's up to you. I'm replying to the technical points in a different mail. tglx ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-06 22:10 ` Thomas Gleixner @ 2005-12-07 3:11 ` Roman Zippel 0 siblings, 0 replies; 52+ messages in thread From: Roman Zippel @ 2005-12-07 3:11 UTC (permalink / raw) To: Thomas Gleixner; +Cc: linux-kernel, Andrew Morton, rostedt, johnstul, mingo Hi, On Tue, 6 Dec 2005, Thomas Gleixner wrote: > I'm interested in working with others and I do that a lot. It depends a > bit on the attitude of the person who wants to do that. I did not have > the feeling that you are interested in working together. Usually people > who want to participate in a project send patches, suggestions or > testing feedback. Your reaction throughout the whole mail threads was > neither cooperative nor appealing to me. I have no problem at all to > accept critizism and help from others, but your attitude of teaching me > how to do my work was just annoying. See my mail to Ingo about most of this. The basic point is you should have told me about this earlier, simply ignoring the problem won't make it go away. Your annoyance was quite noticable, but this seemed to include my complete contribution. You never said what annoyed you, which makes it rather hard for me to you to change it into a more acceptable form. bye, Roman ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-06 17:32 ` Roman Zippel 2005-12-06 19:07 ` Ingo Molnar 2005-12-06 22:10 ` Thomas Gleixner @ 2005-12-06 22:28 ` Thomas Gleixner 2005-12-07 9:31 ` Andrew Morton 2005-12-07 12:18 ` Roman Zippel 2 siblings, 2 replies; 52+ messages in thread From: Thomas Gleixner @ 2005-12-06 22:28 UTC (permalink / raw) To: Roman Zippel; +Cc: linux-kernel, Andrew Morton, rostedt, johnstul, mingo Hi Roman, On Tue, 2005-12-06 at 18:32 +0100, Roman Zippel wrote: > > We worked through the subsystem and its users and further reduced the > > implementation to the basic required infrastructure and generally > > streamlined it. (We did this with easy extensibility for the high > > resolution clock support still in mind, so we kept some small extras > > around.) > > It looks better, but could you please explain, what these extras are > good for? One extra we kept is the list and the state field, which are useful for the high resolution implementation. We wanted to keep as much as possible common code [shared between the current low-resolution clock based hrtimer code and future high-resolution clock based hrtimercode] to avoid the big #ifdeffery all over the place. It might turn out in the rework of the hrt bits that the list can go away, but this is a nobrainer to do. (The very first unpublished version of ktimers had no list, it got introduced during the hrt addon and stayed there to make the hrt patch less intrusive.) > > After reading the Posix specification again, we came to the conclusion > > that it is possible to do no rounding at all for the ktime_t values, and > > to still ensure that the timer is not delivered early. > > Nice, that you finally also come to that conclusion, after I said that > already for ages. (It's also interesting how you do that without > giving me any credit for it.) Sorry if it was previously your idea and if we didnt credit you for it. I did not keep track of each word said in these endless mail threads. We credited every suggestion and idea which we picked up from you, see our previous emails. If we missed one, it was definitely not intentional. The decision to change the rounding implementation was not made based on reading old mail threads. It was made by doing tests and analysis and it differs a lot from your implementation. Setting up a periodic timer leads to a summing-up error versus the expected timeline. This applies both to vanilla, ktimers and ptimers. The difference is how this error is showing up: The vanilla 2.6.15-rc5 kernel has an rather constant error which is in the range of roughly 1ms per interval mostly independent of the given interval and the system load. This is roughly 1 sec per 1000 intervals. Ktimers had a contant summing error which is exactly the delta of the rounded interval to the real interval. The error is number of intervals * delta. The error could be deduced by the application, but thats not a really good idea. For a 50ms interval it is 2sec / 1000 intervals, which is exactly the 2ms delta between the 50ms requested and the 52ms real interval on a system with HZ=250 Ptimers have a rounding error which depends on the delta to the jiffy resolution and the system load. So it gets rather unpredicitble what happens. The basic error is roughly the same as with ktimers, but the addon component due to system load is not. For a 50ms interval a summing error between 2sec and 7sec per 1000 intervals was measured. So while vanilla and ktimers have a systematic error, ptimer introduces random Brownean motion! We analysed the problem again and went through the spec and came to the conclusion that rounding can be completely omitted. We changed the code accordingly and did the same tests. The result is systematic deviation of the timeline which wanders between 0 and resolution - 1 [i.e. 0-4msec with HZ=250], but does not introduce a summing error. This behaviour will be the same when high resolution bits are put on top. Of course the error then will be significantly smaller. To sum up the effects of various implementations (and non-implementations in our hrtimers case) of rounding, a 50 msec interval timer accumulates the following timeline error (precision error) over 1000 periods (50 seconds): vanilla: 1000 msecs ktimers: 2000 msecs ptimers: 2000-7000 msecs hrtimers: 4 msecs In the interim low-res ktimers version we were concentrating on the 'multiples of exposed resolution case. E.g. with 40 msec intervals (which is 10x 4msec jiffy) you'd only get 0-4msecs longterm error: vanilla: 1000 msecs ktimers: 4 msecs ptimers: 8-2000 msecs hrtimers: 4 msecs > Nevertheless, if you read my explanation of the rounding carefully and > look at my implementation, you may notice that I still disagree with > the actual implementation. I started to read it, but your explanation seems to be completely detached from the testing results and the code. I can imagine that you dont agree, but you might also elaborate why. I definitely disagree with your implementation for the following reasons: You define that absolute interval timers on clock realtime change their behaviour when the initial time value is expired into relative timers which are not affected by time settings. I have no idea how you came to that interpretation of the spec. I completely disagree [but if you would like I can go into more detail why I think it's wrong.] Beside of that, the implementation is also completely broken. (You rebase the timer from the realtime base to the monotonic base inside of the timer callback function. On return you lock the realtime base and enqueue the timer into the realtime queue, but the base field of the timer points to the monotonic queue. It needs not much phantasy to get this exploited into a crash.) Furthermore, your implementation is calculating the next expiry value based on the current time of the expiration rather than on the previous expected expiry time, which would be the natural thing to do. This detail also explains the system-load dependent random drifting of ptimers quite well. The changes you did to the timer locking code (also in timer.c) are racy and simply exposable. Oleg's locking implementation is there for a good reason. Neither do I understand the jiffie boundness you re-introduced all over the place. The softirq code is called once per jiffy and the expiry is checked versus the current time. Basing a new design on jiffies, where the design intends to be easily extensible to high resolution clocks, is wrong IMNSHO. Doing a high resolution extension on top of it is just introducing a lot of #ifdef mess in places where none has to be. We had that before, and dont want to go back there. One of your main, often repeated arguments was the complexity of ktimers. While ktimers held a lot of complex functionality, the "simple" ptimers .text size is larger than the ktimers one! I know that you claim that .text size is not a criteria, but Andrew seriously asked what he gets for the increase of .text. > BTW there is one thing I'm currently curious about. Why did you rename > the timer from high-precision timer to high-resolution timer? hrtimer > was just a suggestion from Andrew and ptimer would have been fine as > well. We decided to rename 'ktimer' because Andrew pretty much vetoed the 'ktimeout' queue, and "timer_list" plus "ktimer" looked and sounded confusing (as we've explained it before). Of the possible target names, we decided against "ptimer" because it could be confused with "process timers" and "posix timers". hrtimers is a clear term that indicates what those timers do, so we picked up Andrew's suggestion as a way out the endless naming discussion. Does this satisfy your curiosity? tglx ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-06 22:28 ` Thomas Gleixner @ 2005-12-07 9:31 ` Andrew Morton 2005-12-07 10:11 ` Ingo Molnar 2005-12-07 12:18 ` Roman Zippel 1 sibling, 1 reply; 52+ messages in thread From: Andrew Morton @ 2005-12-07 9:31 UTC (permalink / raw) To: tglx; +Cc: zippel, linux-kernel, rostedt, johnstul, mingo Thomas Gleixner <tglx@linutronix.de> wrote: > > We decided to rename 'ktimer' because Andrew pretty much vetoed the > 'ktimeout' queue Well I whined about the rename of timer_list to ktimeout and asked why it happened. I don't think anyone replied. I assume from your above statement that there wasn't really a strong reason for the rename, and that a new patch series is in the offing, which adds ktimers and leaves timer_list alone? Is ktimer a better name than ptimer or hrtimer? ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-07 9:31 ` Andrew Morton @ 2005-12-07 10:11 ` Ingo Molnar 2005-12-07 10:20 ` Ingo Molnar 2005-12-07 10:23 ` Nick Piggin 0 siblings, 2 replies; 52+ messages in thread From: Ingo Molnar @ 2005-12-07 10:11 UTC (permalink / raw) To: Andrew Morton; +Cc: tglx, zippel, linux-kernel, rostedt, johnstul * Andrew Morton <akpm@osdl.org> wrote: > Thomas Gleixner <tglx@linutronix.de> wrote: > > > > We decided to rename 'ktimer' because Andrew pretty much vetoed the > > 'ktimeout' queue let me defuse things a bit here: the above sentence might sound a bit bitter, but we really, truly are not. You were more like the sane voice slapping us back into reality: for the next 2 years we do not want to be buried in tons of timer_list->ktimeout patches, causing disruption all across the kernel (and external trees). You definitely did not 'veto' it in any way, and in fact you are carrying it in -mm currently. I did the ktimeout queue in an hour or so, and i dont have strong feelings about it. I very much agree with you that a mass rename could easily cause more problems than the added clarity adds - still i had to try the ktimeout queue, because i'm hopelessly purist at heart :) Maybe in a few years all substantial kernel code will be managed by a network of GIT repositories, and GIT will be extended with automatic 'mass namespace change' capabilities, making an overnight switchover much more practical. > Well I whined about the rename of timer_list to ktimeout and asked why > it happened. I don't think anyone replied. we thought we had this issue covered way too many times :) but find below my original justification for the ktimeout patch-queue. This is just for historical interest now i think. [ insert the text below here. Time passes as everyone reads it :-) ] once we take 'mass change of timer_list to ktimeout' out of the possible things to do, we've only got these secondary possibilities: 'struct timer_list, struct ktimer' 'struct timer_list, struct ptimer' 'struct timer_list, struct hrtimer' and having eliminated the first option due to being impractical to pull off, we had the choice between 'ptimer' and 'hrtimer', and went for the last one, for the following reason [snipped from a mail to Roman]: | we decided against "ptimer" because it could be confused with "process | timers" and "posix timers". hrtimers is a clear term that indicates | what those timers do, so we picked up Andrew's suggestion as a way out | the endless naming discussion. but really ... facing an imperfect naming situation (i do not think timer_list is the correct name - just as much as struct inode_list would not be correct - but it is the historic name and i think you are right that we've got to live with it) i'm alot less passionate about which one to pick. If we had the chance to have perfect naming, i'd definitely spend the effort to get it right, but now lets just go with the most descriptive one: 'struct hrtimer'. Ingo ----- regarding naming. This is something best left to native speakers, but i'm not giving up the issue just yet :-) i always sensed and agreed that 'struct ktimer' and 'struct timer_list' together is confusing. Same for kernel/ktimers.c and kernel/timers.c. So no argument about that, this situation cannot continue. but the reason i am still holding on to 'struct ktimer' is that i think the end result should be: - 'struct ktimer' (for precise timers where the event of expiry is the main function) - 'struct ktimeout' (for the wheel-based timeouts, where expiry is an exception) Similarly, kernel/ktimer.c for ktimers, and kernel/ktimeout.c for timeouts. see the attached finegrained patchqueue that does all the changes to rename 'timers' to 'timeouts' [and to clean up the resulting subsystem], to see what i'm trying to achieve. For now i'm ignoring the feasibility of a 'mass API change' issues - those are better up to lkml. The queue does build and compile fine on a rather big .config so the only question is - do we want it. Note that the patch does not have to touch even a single non-subsystem user of the timer.c APIs, so the renaming is robust. IMO it looks a lot less confusing and dualistic that way. The rename is technically feasible and robust mainly because we can do this: #define timer_list timeout for the transition period (see the patch-queue). Fortunately timer_list is not a generic name. (it's also an incorrect name because it implies implementation) Here's the full list of mappings that occur: #define timer_list ktimeout #define TIMER_INITIALIZER KTIMEOUT_INITIALIZER #define DEFINE_TIMER DEFINE_KTIMEOUT #define init_timer ktimeout_init #define setup_timer ktimeout_setup #define timer_pending ktimeout_pending #define add_timer_on ktimeout_add_on #define del_timer ktimeout_del #define __mod_timer __ktimeout_mod #define mod_timer ktimeout_mod #define next_timer_interrupt ktimeout_next_interrupt #define add_timer ktimeout_add #define try_to_del_timer_sync ktimeout_try_to_del_sync #define del_timer_sync ktimeout_del_sync #define del_singleshot_timer_sync ktimeout_del_singleshot_sync #define init_timers init_ktimeouts #define run_local_timers run_local_ktimeouts but maybe 'struct ptimer' and 'struct ktimeout' is the better choice? I dont think so, but it's a possibility. so i believe that: - 'struct ktimer', 'struct ktimeout' is in theory superior naming, compared to: - 'struct ptimer', 'struct timer_list' again, ignoring all the 'do we want to have a massive namespace change' issues. Ingo ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-07 10:11 ` Ingo Molnar @ 2005-12-07 10:20 ` Ingo Molnar 2005-12-07 10:23 ` Nick Piggin 1 sibling, 0 replies; 52+ messages in thread From: Ingo Molnar @ 2005-12-07 10:20 UTC (permalink / raw) To: Andrew Morton; +Cc: tglx, zippel, linux-kernel, rostedt, johnstul * Ingo Molnar <mingo@elte.hu> wrote: > once we take 'mass change of timer_list to ktimeout' out of the possible > things to do, we've only got these secondary possibilities: > > 'struct timer_list, struct ktimer' > 'struct timer_list, struct ptimer' > 'struct timer_list, struct hrtimer' > > and having eliminated the first option due to being impractical to pull > off, [...] (correction: due to being confusing.) Ingo ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-07 10:11 ` Ingo Molnar 2005-12-07 10:20 ` Ingo Molnar @ 2005-12-07 10:23 ` Nick Piggin 2005-12-07 10:49 ` Ingo Molnar 1 sibling, 1 reply; 52+ messages in thread From: Nick Piggin @ 2005-12-07 10:23 UTC (permalink / raw) To: Ingo Molnar; +Cc: Andrew Morton, tglx, zippel, linux-kernel, rostedt, johnstul Ingo Molnar wrote: > so i believe that: > > - 'struct ktimer', 'struct ktimeout' > > is in theory superior naming, compared to: > > - 'struct ptimer', 'struct timer_list' > Just curious -- why the "k" thing? Send instant messages to your online friends http://au.messenger.yahoo.com ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-07 10:23 ` Nick Piggin @ 2005-12-07 10:49 ` Ingo Molnar 2005-12-07 11:09 ` Nick Piggin 0 siblings, 1 reply; 52+ messages in thread From: Ingo Molnar @ 2005-12-07 10:49 UTC (permalink / raw) To: Nick Piggin; +Cc: Andrew Morton, tglx, zippel, linux-kernel, rostedt, johnstul * Nick Piggin <nickpiggin@yahoo.com.au> wrote: > Ingo Molnar wrote: > > >so i believe that: > > > > - 'struct ktimer', 'struct ktimeout' > > > >is in theory superior naming, compared to: > > > > - 'struct ptimer', 'struct timer_list' > > > > Just curious -- why the "k" thing? yeah. 'struct timer' and 'struct timeout' is even better. I tried it on real code and sometimes it looked a bit funny: often we have a 'timeout' parameter somewhere that is a scalar or a timeval/timespec. So at least for variable names it was useful to have it in this form: struct timeout *ktimeout; struct timer *ktimer; otherwise it looked OK. This is also in line with most other 'object names' we have in the kernel: struct inode, struct dentry. Ingo ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-07 10:49 ` Ingo Molnar @ 2005-12-07 11:09 ` Nick Piggin 2005-12-07 11:33 ` Ingo Molnar 2005-12-07 12:40 ` Roman Zippel 0 siblings, 2 replies; 52+ messages in thread From: Nick Piggin @ 2005-12-07 11:09 UTC (permalink / raw) To: Ingo Molnar; +Cc: Andrew Morton, tglx, zippel, linux-kernel, rostedt, johnstul Ingo Molnar wrote: > * Nick Piggin <nickpiggin@yahoo.com.au> wrote: > > >>Ingo Molnar wrote: >> >> >>>so i believe that: >>> >>> - 'struct ktimer', 'struct ktimeout' >>> >>>is in theory superior naming, compared to: >>> >>> - 'struct ptimer', 'struct timer_list' >>> >> >>Just curious -- why the "k" thing? > > > yeah. 'struct timer' and 'struct timeout' is even better. I tried it on Oh good, glad you think so :) > real code and sometimes it looked a bit funny: often we have a 'timeout' > parameter somewhere that is a scalar or a timeval/timespec. So at least Sure... hmm, the names timeout and timer themselves have something vagely wrong about them, but I can't quite place my finger on it, not a real worry though... Maybe it is that timeout is an end result, but timer is a mechanism. So maybe it should be 'struct interval', 'struct timeout'; or 'struct timer', 'struct timeout_timer'. But I don't know really, it isn't a big deal. Nick -- SUSE Labs, Novell Inc. Send instant messages to your online friends http://au.messenger.yahoo.com ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-07 11:09 ` Nick Piggin @ 2005-12-07 11:33 ` Ingo Molnar 2005-12-07 11:40 ` Nick Piggin 2005-12-07 13:06 ` Roman Zippel 2005-12-07 12:40 ` Roman Zippel 1 sibling, 2 replies; 52+ messages in thread From: Ingo Molnar @ 2005-12-07 11:33 UTC (permalink / raw) To: Nick Piggin; +Cc: Andrew Morton, tglx, zippel, linux-kernel, rostedt, johnstul * Nick Piggin <nickpiggin@yahoo.com.au> wrote: > >>Just curious -- why the "k" thing? > > > > > >yeah. 'struct timer' and 'struct timeout' is even better. I tried it on > > Oh good, glad you think so :) > > >real code and sometimes it looked a bit funny: often we have a 'timeout' > >parameter somewhere that is a scalar or a timeval/timespec. So at least > > Sure... hmm, the names timeout and timer themselves have something > vagely wrong about them, but I can't quite place my finger on it, not > a real worry though... > > Maybe it is that timeout is an end result, but timer is a mechanism. hm, i think you are right. > So maybe it should be 'struct interval', 'struct timeout'; or 'struct > timer', 'struct timeout_timer'. maybe 'struct timer' and 'struct hrtimer' is the right solution after all, and our latest queue doing 'struct timer_list' + 'struct hrtimer' is actually quite close to it. 'struct ptimer' does have a bit of vagueness in it at first sight, do you agree with that? (does it mean 'process'? 'posix'? 'precision'?) also, hrtimers on low-res clocks do have high internal resolution, but they are not precise timing mechanisms in the end, due to the low-res clock. So the more generic name would be 'high-resolution timers', not 'precision timers'. (also, the name 'precision timers' sounds a bit funny too, but i dont really know why.) Ingo ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-07 11:33 ` Ingo Molnar @ 2005-12-07 11:40 ` Nick Piggin 2005-12-07 13:06 ` Roman Zippel 1 sibling, 0 replies; 52+ messages in thread From: Nick Piggin @ 2005-12-07 11:40 UTC (permalink / raw) To: Ingo Molnar; +Cc: Andrew Morton, tglx, zippel, linux-kernel, rostedt, johnstul Ingo Molnar wrote: > * Nick Piggin <nickpiggin@yahoo.com.au> wrote: > >>Maybe it is that timeout is an end result, but timer is a mechanism. > > > hm, i think you are right. > > >>So maybe it should be 'struct interval', 'struct timeout'; or 'struct >>timer', 'struct timeout_timer'. > > > maybe 'struct timer' and 'struct hrtimer' is the right solution after > all, and our latest queue doing 'struct timer_list' + 'struct hrtimer' > is actually quite close to it. > > 'struct ptimer' does have a bit of vagueness in it at first sight, do > you agree with that? (does it mean 'process'? 'posix'? 'precision'?) > Yes I would agree that the p doesn't add much, wheras hrtimer at least *rules out* the obvious process and posix. I can't see a problem with timer and hrtimer myself. Thanks, Nick -- SUSE Labs, Novell Inc. Send instant messages to your online friends http://au.messenger.yahoo.com ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-07 11:33 ` Ingo Molnar 2005-12-07 11:40 ` Nick Piggin @ 2005-12-07 13:06 ` Roman Zippel 1 sibling, 0 replies; 52+ messages in thread From: Roman Zippel @ 2005-12-07 13:06 UTC (permalink / raw) To: Ingo Molnar Cc: Nick Piggin, Andrew Morton, tglx, linux-kernel, rostedt, johnstul Hi, On Wed, 7 Dec 2005, Ingo Molnar wrote: > maybe 'struct timer' and 'struct hrtimer' is the right solution after > all, and our latest queue doing 'struct timer_list' + 'struct hrtimer' > is actually quite close to it. > > 'struct ptimer' does have a bit of vagueness in it at first sight, do > you agree with that? (does it mean 'process'? 'posix'? 'precision'?) > > also, hrtimers on low-res clocks do have high internal resolution, but > they are not precise timing mechanisms in the end, due to the low-res > clock. So the more generic name would be 'high-resolution timers', not > 'precision timers'. (also, the name 'precision timers' sounds a bit > funny too, but i dont really know why.) My ptimer suggestion was based on your "funny" name "high-precision timer". Sorry Ingo, that joke is on you. :-) Anyway, anything else than ktimer is fine with me. bye, Roman ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-07 11:09 ` Nick Piggin 2005-12-07 11:33 ` Ingo Molnar @ 2005-12-07 12:40 ` Roman Zippel 2005-12-07 23:12 ` Nick Piggin 1 sibling, 1 reply; 52+ messages in thread From: Roman Zippel @ 2005-12-07 12:40 UTC (permalink / raw) To: Nick Piggin Cc: Ingo Molnar, Andrew Morton, tglx, linux-kernel, rostedt, johnstul Hi, On Wed, 7 Dec 2005, Nick Piggin wrote: > Sure... hmm, the names timeout and timer themselves have something > vagely wrong about them, but I can't quite place my finger on it, > not a real worry though... > > Maybe it is that timeout is an end result, but timer is a mechanism. > So maybe it should be 'struct interval', 'struct timeout'; > or 'struct timer', 'struct timeout_timer'. > > But I don't know really, it isn't a big deal. Nick, thanks for speaking up about this. My mistake was to make a big deal out of it, because I knew it would confuse more people. After I got the heat for this, it seems nobody else want to get flamed for it. bye, Roman ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-07 12:40 ` Roman Zippel @ 2005-12-07 23:12 ` Nick Piggin 0 siblings, 0 replies; 52+ messages in thread From: Nick Piggin @ 2005-12-07 23:12 UTC (permalink / raw) To: Roman Zippel Cc: Ingo Molnar, Andrew Morton, tglx, linux-kernel, rostedt, johnstul Hi Roman, Roman Zippel wrote: > > Nick, thanks for speaking up about this. > My mistake was to make a big deal out of it, because I knew it would > confuse more people. After I got the heat for this, it seems nobody else > want to get flamed for it. > I didn't mean to trivialise the issue. I think good naming is important, however I added the disclaimer because of course I didn't write any code, so my opinion didn't carry much weight in that particular situation compared to you guys. Thanks, Nick -- SUSE Labs, Novell Inc. Send instant messages to your online friends http://au.messenger.yahoo.com ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-06 22:28 ` Thomas Gleixner 2005-12-07 9:31 ` Andrew Morton @ 2005-12-07 12:18 ` Roman Zippel 2005-12-07 16:55 ` Ingo Molnar 2005-12-09 17:23 ` Thomas Gleixner 1 sibling, 2 replies; 52+ messages in thread From: Roman Zippel @ 2005-12-07 12:18 UTC (permalink / raw) To: Thomas Gleixner; +Cc: linux-kernel, Andrew Morton, rostedt, johnstul, mingo Hi On Tue, 6 Dec 2005, Thomas Gleixner wrote: > > It looks better, but could you please explain, what these extras are > > good for? > > One extra we kept is the list and the state field, which are useful for > the high resolution implementation. We wanted to keep as much as > possible common code [shared between the current low-resolution clock > based hrtimer code and future high-resolution clock based hrtimercode] > to avoid the big #ifdeffery all over the place. > > It might turn out in the rework of the hrt bits that the list can go > away, but this is a nobrainer to do. (The very first unpublished version > of ktimers had no list, it got introduced during the hrt addon and > stayed there to make the hrt patch less intrusive.) If it's such a nobrainer to remove it, why don't you add it later? As it is right now, it's not needed and nobody but you knows what it's good for. This way yoy make it only harder to review the code, if one stumbles over these pieces all the time. > > Nice, that you finally also come to that conclusion, after I said that > > already for ages. (It's also interesting how you do that without > > giving me any credit for it.) > > Sorry if it was previously your idea and if we didnt credit you for it. > I did not keep track of each word said in these endless mail threads. We > credited every suggestion and idea which we picked up from you, see our > previous emails. If we missed one, it was definitely not intentional. http://marc.theaimsgroup.com/?l=linux-kernel&m=112755827327101 http://marc.theaimsgroup.com/?l=linux-kernel&m=112760582427537 A bit later ktime_t looked pretty much like the 64bit part of my ktimespec. I don't won't to imply any intention, but please try to see this from my perspective, after this happens a number of times. > Setting up a periodic timer leads to a summing-up error versus the > expected timeline. This applies both to vanilla, ktimers and ptimers. > The difference is how this error is showing up: > > The vanilla 2.6.15-rc5 kernel has an rather constant error which is in > the range of roughly 1ms per interval mostly independent of the given > interval and the system load. This is roughly 1 sec per 1000 intervals. > > Ktimers had a contant summing error which is exactly the delta of the > rounded interval to the real interval. The error is number of intervals > * delta. The error could be deduced by the application, but thats not a > really good idea. For a 50ms interval it is 2sec / 1000 intervals, which > is exactly the 2ms delta between the 50ms requested and the 52ms real > interval on a system with HZ=250 > > Ptimers have a rounding error which depends on the delta to the jiffy > resolution and the system load. So it gets rather unpredicitble what > happens. The basic error is roughly the same as with ktimers, but the > addon component due to system load is not. For a 50ms interval a summing > error between 2sec and 7sec per 1000 intervals was measured. > > So while vanilla and ktimers have a systematic error, ptimer introduces > random Brownean motion! Thomas, you unfortunately don't provide enough context for these numbers for me to reproduce them. I don't want to guess, so please provide an example to demonstrate this. > > Nevertheless, if you read my explanation of the rounding carefully and > > look at my implementation, you may notice that I still disagree with > > the actual implementation. > > I started to read it, but your explanation seems to be completely > detached from the testing results and the code. Thomas, if you don't ask me, I can't help you to understand any issues, there might have been. > You define that absolute interval timers on clock realtime change their > behaviour when the initial time value is expired into relative timers > which are not affected by time settings. I have no idea how you came to > that interpretation of the spec. I completely disagree [but if you would > like I can go into more detail why I think it's wrong.] Please do. I explained it in one of my patches: [PATCH 5/9] remove relative timer from abs_list When an absolute timer expires, it becomes a relative timer, so remove it from the abs_list. The TIMER_ABSTIME flag for timer_settime() changes the interpretation of the it_value member, but it_interval is always a relative value and clock_settime() only affects absolute time services. > Beside of that, the implementation is also completely broken. (You > rebase the timer from the realtime base to the monotonic base inside of > the timer callback function. On return you lock the realtime base and > enqueue the timer into the realtime queue, but the base field of the > timer points to the monotonic queue. It needs not much phantasy to get > this exploited into a crash.) If you provide the wrong parameters, you can crash a lot of stuff in the kernel. "exploited" usually implies it can be abused from userspace, which is not the case here. > Furthermore, your implementation is calculating the next expiry value > based on the current time of the expiration rather than on the previous > expected expiry time, which would be the natural thing to do. This > detail also explains the system-load dependent random drifting of > ptimers quite well. Is this conclusion based on actual testing? The behaviour of ptimer should be quite close to the old jiffie based timer, so I'm a bit at a loss here, how you get to this conclusion. Please provide more details. > The changes you did to the timer locking code (also in timer.c) are racy > and simply exposable. Oleg's locking implementation is there for a good > reason. Thomas, bringing up this issue is really weak. With Oleg's help it's already solved, you don't have to warm it up. :( > Neither do I understand the jiffie boundness you re-introduced all over > the place. The softirq code is called once per jiffy and the expiry is > checked versus the current time. Basing a new design on jiffies, where > the design intends to be easily extensible to high resolution clocks, is > wrong IMNSHO. Doing a high resolution extension on top of it is just > introducing a lot of #ifdef mess in places where none has to be. We had > that before, and dont want to go back there. I don't understand where you get this from, I explicitely said that higher resolution requires a better clock abstraction, bascially any place which mentions TICK_NSEC has to be cleaned up like this. I'm at loss why you think this requires "a lot of #ifdef mess". > One of your main, often repeated arguments was the complexity of > ktimers. While ktimers held a lot of complex functionality, the "simple" > ptimers .text size is larger than the ktimers one! I know that you claim > that .text size is not a criteria, but Andrew seriously asked what he > gets for the increase of .text. Sorry, but I didn't had as much time as you to finetune my implementation. I did some quick tests by also deinlining some stuff, which quickly brought it down to the ktimer size, integrating some more cleanups should do the rest. Thomas, this is hardly an argument against my implementation, I never claimed it to be complete. It was meant as sanely mergable version compared to your large ktimers patch. Anyway, thanks for finally responding, there seem have to piled up a number of misconceptions, please give it some time to clear up. bye, Roman ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-07 12:18 ` Roman Zippel @ 2005-12-07 16:55 ` Ingo Molnar 2005-12-07 17:17 ` Roman Zippel 2005-12-09 17:23 ` Thomas Gleixner 1 sibling, 1 reply; 52+ messages in thread From: Ingo Molnar @ 2005-12-07 16:55 UTC (permalink / raw) To: Roman Zippel Cc: Thomas Gleixner, linux-kernel, Andrew Morton, rostedt, johnstul * Roman Zippel <zippel@linux-m68k.org> wrote: > > Sorry if it was previously your idea and if we didnt credit you for it. > > I did not keep track of each word said in these endless mail threads. We > > credited every suggestion and idea which we picked up from you, see our > > previous emails. If we missed one, it was definitely not intentional. > > http://marc.theaimsgroup.com/?l=linux-kernel&m=112755827327101 > http://marc.theaimsgroup.com/?l=linux-kernel&m=112760582427537 > > A bit later ktime_t looked pretty much like the 64bit part of my > ktimespec. and Thomas credited you for that point in his announcement: " Roman pointed out that the penalty for some architectures would be quite big when using the nsec_t (64bit) scalar time storage format. " http://marc.theaimsgroup.com/?l=linux-kernel&m=112794069605537&w=2 also, once you came up with actual modifications to the ktimers concept (the ptimer queue) we noticed a further refinement of ktime_t in that code: the elimination of the plain scalar type. We gave you credit again: " - eliminate the plain s64 scalar type, and always use the union. This simplifies the arithmetics. Idea from Roman Zippel. " see: http://marc.theaimsgroup.com/?l=linux-kernel&m=113339663027117&w=2 http://marc.theaimsgroup.com/?l=linux-kernel&m=113382965626004&w=2 we couldnt take your actual patch/code though, due to the way you created the ptimer queue: you took our ktimer queue, added a dozen changes to it (intermixed, without keeping/disclosing the changes), then you split up the resulting queue. This structure made it largely incompatible with our queue, the diff between ktimers and ptimers was larger than the two patches themselves, due to the stacked changed! This is not a complaint - we are happy you are writing ktimer based code - this is just an explanation of why we couldnt take the code/patch as-is but had to redo that portion from scratch, based on your idea. > I don't won't to imply any intention, but please try to see this from > my perspective, after this happens a number of times. What the hell are you talking about? I bloody well know how it all happened, because i did those simplifications to ktime.h myself, and i added the changelog too, crediting you for the idea. Ingo ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-07 16:55 ` Ingo Molnar @ 2005-12-07 17:17 ` Roman Zippel 2005-12-07 17:57 ` Ingo Molnar 2005-12-07 18:02 ` Paul Baxter 0 siblings, 2 replies; 52+ messages in thread From: Roman Zippel @ 2005-12-07 17:17 UTC (permalink / raw) To: Ingo Molnar Cc: Thomas Gleixner, linux-kernel, Andrew Morton, rostedt, johnstul Hi, On Wed, 7 Dec 2005, Ingo Molnar wrote: > > A bit later ktime_t looked pretty much like the 64bit part of my > > ktimespec. > > and Thomas credited you for that point in his announcement: > > " Roman pointed out that the penalty for some architectures > would be quite big when using the nsec_t (64bit) scalar time > storage format. " "pointed out that the penalty" is a bit different from "provided the basic idea of the ktime_t union and half the implementation"... bye, Roman ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-07 17:17 ` Roman Zippel @ 2005-12-07 17:57 ` Ingo Molnar 2005-12-07 18:18 ` Roman Zippel 2005-12-07 18:02 ` Paul Baxter 1 sibling, 1 reply; 52+ messages in thread From: Ingo Molnar @ 2005-12-07 17:57 UTC (permalink / raw) To: Roman Zippel Cc: Thomas Gleixner, linux-kernel, Andrew Morton, rostedt, johnstul * Roman Zippel <zippel@linux-m68k.org> wrote: > > > (It's also interesting how you do that without giving me any > > > credit for it.) > > > > Sorry if it was previously your idea and if we didnt credit you for > > it. > > [...] > > > > > A bit later ktime_t looked pretty much like the 64bit part of my > > > ktimespec. > > > > and Thomas credited you for that point in his announcement: > > > > " Roman pointed out that the penalty for some architectures > > would be quite big when using the nsec_t (64bit) scalar time > > storage format. " > > "pointed out that the penalty" is a bit different from "provided the > basic idea of the ktime_t union and half the implementation"... so ... did you change your position from accusing us of not giving you _any_ credit: "It's also interesting how you do that without giving me any credit for it." to accusing us of not giving you _enough_ credit? Did i get that right? And ontop of that, you now want the credit for providing the basic idea for half of the ktimer/hrtimer implementation? Sorry that i did not find out in advance that you wanted _that_ ;-) Ingo ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-07 17:57 ` Ingo Molnar @ 2005-12-07 18:18 ` Roman Zippel 0 siblings, 0 replies; 52+ messages in thread From: Roman Zippel @ 2005-12-07 18:18 UTC (permalink / raw) To: Ingo Molnar Cc: Thomas Gleixner, linux-kernel, Andrew Morton, rostedt, johnstul Hi, On Wed, 7 Dec 2005, Ingo Molnar wrote: > so ... did you change your position from accusing us of not giving you > _any_ credit: > > "It's also interesting how you do that without giving me > any credit for it." > > to accusing us of not giving you _enough_ credit? Did i get that right? > > And ontop of that, you now want the credit for providing the basic idea > for half of the ktimer/hrtimer implementation? Sorry that i did not find > out in advance that you wanted _that_ ;-) Ingo, please stay serious and don't take things out of context. I was just asking for some more/better credit for some of the ktime_t core ideas, I'll leave it now to you what you make with this and I won't further pursue this issue. Can we please now get back to the more important issues? bye, Roman ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-07 17:17 ` Roman Zippel 2005-12-07 17:57 ` Ingo Molnar @ 2005-12-07 18:02 ` Paul Baxter 1 sibling, 0 replies; 52+ messages in thread From: Paul Baxter @ 2005-12-07 18:02 UTC (permalink / raw) To: Roman Zippel, linux-kernel > "pointed out that the penalty" is a bit different from "provided the > basic idea of the ktime_t union and half the implementation"... > > bye, Roman This is getting bloody ridiculous. Roman, you won't get credited for every nuance of what you've said and done, neither will Ingo and Thomas and the many others that work hard to make Linux a better Operating System. I admire the fact that you did pick up the gauntlett and produce code which has helped further the whole work. I keep reading this thread because, against all odds, there is a lot of technical progress but the constant bickering really does your credibility no favours. Please stop trying to portray yourself as a victim, those that care will form an opinion from your words. Please, please have the grace to rise above any perceived 'insults' both actual, and, in my view, mostly insubstantial. A frustrated lurker, who can't wait for high resolution timers and maybe even high precision timers one day. Thank you, Roman, for all your technical efforts. ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-07 12:18 ` Roman Zippel 2005-12-07 16:55 ` Ingo Molnar @ 2005-12-09 17:23 ` Thomas Gleixner 2005-12-12 13:39 ` Roman Zippel 1 sibling, 1 reply; 52+ messages in thread From: Thomas Gleixner @ 2005-12-09 17:23 UTC (permalink / raw) To: Roman Zippel; +Cc: linux-kernel, Andrew Morton, rostedt, johnstul, mingo Hi, sorry for the late reply. I was travelling and cut off the net for a while. On Wed, 2005-12-07 at 13:18 +0100, Roman Zippel wrote: > > It might turn out in the rework of the hrt bits that the list can go > > away, but this is a nobrainer to do. (The very first unpublished version > > of ktimers had no list, it got introduced during the hrt addon and > > stayed there to make the hrt patch less intrusive.) > > If it's such a nobrainer to remove it, why don't you add it later? As it > is right now, it's not needed and nobody but you knows what it's good for. > This way yoy make it only harder to review the code, if one stumbles over > these pieces all the time. Well, if it makes it simpler for you to review the code. But can you please explain to a code review unaware kernel developer newbie like me, why this makes a lot of difference and why it makes it so much easier to review ? Actually the change adds more code lines and removes one field of the hrtimer structure, but it has exactly the same functionality: Fast access to the first expiring timer without walking the rb_tree. include/linux/hrtimer.h | 7 ++----- kernel/hrtimer.c | 26 ++++++++++++++------------ 2 files changed, 16 insertions(+), 17 deletions(-) Index: linux-2.6.15-rc5/include/linux/hrtimer.h =================================================================== --- linux-2.6.15-rc5.orig/include/linux/hrtimer.h +++ linux-2.6.15-rc5/include/linux/hrtimer.h @@ -49,8 +49,6 @@ struct hrtimer_base; * struct hrtimer - the basic hrtimer structure * * @node: red black tree node for time ordered insertion - * @list: list head for easier access to the time ordered list, - * without walking the red black tree. * @expires: the absolute expiry time in the hrtimers internal * representation. The time is related to the clock on * which the timer is based. @@ -63,7 +61,6 @@ struct hrtimer_base; */ struct hrtimer { struct rb_node node; - struct list_head list; ktime_t expires; enum hrtimer_state state; int (*function)(void *); @@ -78,7 +75,7 @@ struct hrtimer { * to a base on another cpu. * @lock: lock protecting the base and associated timers * @active: red black tree root node for the active timers - * @pending: list of pending timers for simple time ordered access + * @first: pointer to the timer node which expires first * @resolution: the resolution of the clock, in nanoseconds * @get_time: function to retrieve the current time of the clock * @curr_timer: the timer which is executing a callback right now @@ -87,7 +84,7 @@ struct hrtimer_base { clockid_t index; spinlock_t lock; struct rb_root active; - struct list_head pending; + struct rb_node *first; unsigned long resolution; ktime_t (*get_time)(void); struct hrtimer *curr_timer; Index: linux-2.6.15-rc5/kernel/hrtimer.c =================================================================== --- linux-2.6.15-rc5.orig/kernel/hrtimer.c +++ linux-2.6.15-rc5/kernel/hrtimer.c @@ -313,7 +313,6 @@ hrtimer_forward(struct hrtimer *timer, c static void enqueue_hrtimer(struct hrtimer *timer, struct hrtimer_base *base) { struct rb_node **link = &base->active.rb_node; - struct list_head *prev = &base->pending; struct rb_node *parent = NULL; struct hrtimer *entry; @@ -329,22 +328,23 @@ static void enqueue_hrtimer(struct hrtim */ if (timer->expires.tv64 < entry->expires.tv64) link = &(*link)->rb_left; - else { + else link = &(*link)->rb_right; - prev = &entry->list; - } } /* - * Insert the timer to the rbtree and to the sorted list: + * Insert the timer to the rbtree and check whether it + * replaces the first pending timer */ rb_link_node(&timer->node, parent, link); rb_insert_color(&timer->node, &base->active); - list_add(&timer->list, prev); timer->state = HRTIMER_PENDING; -} + if (!base->first || timer->expires.tv64 < + rb_entry(base->first, struct hrtimer, node)->expires.tv64) + base->first = &timer->node; +} /* * __remove_hrtimer - internal function to remove a timer @@ -354,9 +354,11 @@ static void enqueue_hrtimer(struct hrtim static void __remove_hrtimer(struct hrtimer *timer, struct hrtimer_base *base) { /* - * Remove the timer from the sorted list and from the rbtree: + * Remove the timer from the rbtree and replace the + * first entry pointer if necessary. */ - list_del(&timer->list); + if (base->first == &timer->node) + base->first = rb_next(&timer->node); rb_erase(&timer->node, &base->active); } @@ -528,16 +530,17 @@ int hrtimer_get_res(const clockid_t whic static inline void run_hrtimer_queue(struct hrtimer_base *base) { ktime_t now = base->get_time(); + struct rb_node *node; spin_lock_irq(&base->lock); - while (!list_empty(&base->pending)) { + while ((node = base->first)) { struct hrtimer *timer; int (*fn)(void *); int restart; void *data; - timer = list_entry(base->pending.next, struct hrtimer, list); + timer = rb_entry(node, struct hrtimer, node); if (now.tv64 <= timer->expires.tv64) break; @@ -590,7 +593,6 @@ static void __devinit init_hrtimers_cpu( for (i = 0; i < MAX_HRTIMER_BASES; i++) { spin_lock_init(&base->lock); - INIT_LIST_HEAD(&base->pending); base++; } } > > > Nice, that you finally also come to that conclusion, after I said that > > > already for ages. (It's also interesting how you do that without > > > giving me any credit for it.) > > > > Sorry if it was previously your idea and if we didnt credit you for it. > > I did not keep track of each word said in these endless mail threads. We > > credited every suggestion and idea which we picked up from you, see our > > previous emails. If we missed one, it was definitely not intentional. > > http://marc.theaimsgroup.com/?l=linux-kernel&m=112755827327101 > http://marc.theaimsgroup.com/?l=linux-kernel&m=112760582427537 > > A bit later ktime_t looked pretty much like the 64bit part of my ktimespec. > I don't won't to imply any intention, but please try to see this from my > perspective, after this happens a number of times. I have seen your and Ingos conversation on that and I dont want to add more flames into this. > > You define that absolute interval timers on clock realtime change their > > behaviour when the initial time value is expired into relative timers > > which are not affected by time settings. I have no idea how you came to > > that interpretation of the spec. I completely disagree [but if you would > > like I can go into more detail why I think it's wrong.] > > Please do. I explained it in one of my patches: > > [PATCH 5/9] remove relative timer from abs_list > > When an absolute timer expires, it becomes a relative timer, so remove > it from the abs_list. The TIMER_ABSTIME flag for timer_settime() > changes the interpretation of the it_value member, but it_interval is > always a relative value and clock_settime() only affects absolute time > services. This is your interpretation and I disagree. If I set up a timer with a 24 hour interval, which should go off everyday at 6:00 AM, then I expect that this timer does this even when the clock is set e.g. by daylight saving. I think, that this is a completely valid interpretation and makes a lot of sense from a practical point of view. The existing implementation does it that way already, so why do we want to change this ? Also you treat the interval relative to the current time of the callback function: timer->expires = ktime_add(timer->base->last_expired, timr->it.real.incr); This leads to a summing up error and even if the result is similar to the summing up error of the current vanilla implementation I prefer a solution which adds the interval to the previous set expiry time timer->expires = ktime_add(timer->expires, timr->it.real.incr); The spec says: "Also note that some implementations may choose to adjust time and/or interval values to exactly match the ticks of the underlying clock." So there is no requirement to do so. Of course you may, but this takes simply the name "precision" ad absurdum. > > Beside of that, the implementation is also completely broken. (You > > rebase the timer from the realtime base to the monotonic base inside of > > the timer callback function. On return you lock the realtime base and > > enqueue the timer into the realtime queue, but the base field of the > > timer points to the monotonic queue. It needs not much phantasy to get > > this exploited into a crash.) > > If you provide the wrong parameters, you can crash a lot of stuff in the > kernel. "exploited" usually implies it can be abused from userspace, which > is not the case here. Thanks for teaching me what an "exploit" usally means! I intentionally wrote "exploited into a crash". How do you think I got this to crash ? By hacking up some complex kernel code ? No, simply by running my unmodified test scripts from user space with completely valid and correct parameters. Of course its also possible to write a program which actually exploits this. The implementation is simply broken. Can you just accept this ? > > Furthermore, your implementation is calculating the next expiry value > > based on the current time of the expiration rather than on the previous > > expected expiry time, which would be the natural thing to do. This > > detail also explains the system-load dependent random drifting of > > ptimers quite well. > > Is this conclusion based on actual testing? The behaviour of ptimer should > be quite close to the old jiffie based timer, so I'm a bit at a loss here, > how you get to this conclusion. Please provide more details. It is based on testing. Do you think I pulled numbers out of my nose ? But I have to admit that I did not look close enough into your code and so I missed the ptimer_run_queue call inside of the lost jiffie loop. Sorry, my conclusion was wrong. The problem seems to be related to the rebase code, which leads to a wrong expiry value for clock realtime interval timers with the ABSTIME flag set. > > The changes you did to the timer locking code (also in timer.c) are racy > > and simply exposable. Oleg's locking implementation is there for a good > > reason. > > Thomas, bringing up this issue is really weak. With Oleg's help it's > already solved, you don't have to warm it up. :( I did not warm anything up. I was not aware that Oleg jumped already in on this - I was not cc'ed and I really did not pay much attention on LKML during this time. I'm familiar enough with locking, that I can recognize such a problem on my own. > > Neither do I understand the jiffie boundness you re-introduced all over > > the place. The softirq code is called once per jiffy and the expiry is > > checked versus the current time. Basing a new design on jiffies, where > > the design intends to be easily extensible to high resolution clocks, is > > wrong IMNSHO. Doing a high resolution extension on top of it is just > > introducing a lot of #ifdef mess in places where none has to be. We had > > that before, and dont want to go back there. > > I don't understand where you get this from, I explicitely said that higher > resolution requires a better clock abstraction, bascially any place which > mentions TICK_NSEC has to be cleaned up like this. I'm at loss why you > think this requires "a lot of #ifdef mess". Why do you need all this jiffie stuff in the first place? It is not necessary at all. The hrtimer code does not contain a single reference of jiffies and therefor it does not need anything to clean up. I consider even a single high resolution timer related #ifdef outside of hrtimer.c and the clock event abstraction as an unnecessary mess. Sure you can replace the TICK_NSEC and ktime_to_jiffie stuff completely, but I still do not see the point why it is necessary to put it there first. It just makes it overly complex to review and understand :) I'm happy that we at least agree that we need better clock abstraction layers. How do you think does our existing high resolution timer implementation work ? While you explicitely said that it is required, we explicitely used exactly such a mechanism from the very first day. Please stop your absurd schoolmasterly attempts to teach me stuff which I'm well aware off. Can you please accept, that I exactly know what I'm talking about? > Anyway, thanks for finally responding, there seem have to piled up a > number of misconceptions, please give it some time to clear up. Roman, I have no interest and no intention to spend any more of my private time on a discussion like this. I always was and I'm still up for a technical discussion and cooperation. I'm not vengeful at all and if I ever meet you in person, the first beers at the bar are on my bill. But I seriously will ignore you completely, if you keep this tone and attitude with me. I'm well aware that LKML is not a nun convent, but the basic rules of human behaviour and respect still apply. tglx ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-09 17:23 ` Thomas Gleixner @ 2005-12-12 13:39 ` Roman Zippel 2005-12-12 16:42 ` Thomas Gleixner 0 siblings, 1 reply; 52+ messages in thread From: Roman Zippel @ 2005-12-12 13:39 UTC (permalink / raw) To: Thomas Gleixner; +Cc: linux-kernel, Andrew Morton, rostedt, johnstul, mingo Hi, On Fri, 9 Dec 2005, Thomas Gleixner wrote: > Actually the change adds more code lines and removes one field of the > hrtimer structure, but it has exactly the same functionality: Fast > access to the first expiring timer without walking the rb_tree. Together with the state field this would save 12 bytes, which is IMO very well worth considering. You seem to have some plans for it, the best hint I've found for it is: + (This seperate list is also useful for high-resolution timers where we + need seperate pending and expired queues while keeping the time-order + intact.)" Could you please elaborate on this? > > [PATCH 5/9] remove relative timer from abs_list > > > > When an absolute timer expires, it becomes a relative timer, so remove > > it from the abs_list. The TIMER_ABSTIME flag for timer_settime() > > changes the interpretation of the it_value member, but it_interval is > > always a relative value and clock_settime() only affects absolute time > > services. > > This is your interpretation and I disagree. > > If I set up a timer with a 24 hour interval, which should go off > everyday at 6:00 AM, then I expect that this timer does this even when > the clock is set e.g. by daylight saving. I think, that this is a > completely valid interpretation and makes a lot of sense from a > practical point of view. The existing implementation does it that way > already, so why do we want to change this ? I don't know whether this behaviour was intentional and why it was done this way, so I did this patch to initiate a discussion about this. I wouldn't say a 1 day interval timer is a very realistic example and the old timer wouldn't be very precise for this. The rationale for example talks about "a periodic timer with an absolute _initial_ expiration time", so I could also construct a valid example with this expectation. The more I read the spec the more I think the current behaviour is not correct, e.g. that ABS_TIME is only relevant for it_value. So I'm interested in specific interpretations of the spec which support the current behaviour. > Also you treat the interval relative to the current time of the callback > function: > > timer->expires = ktime_add(timer->base->last_expired, > timr->it.real.incr); > > This leads to a summing up error and even if the result is similar to > the summing up error of the current vanilla implementation I prefer a > solution which adds the interval to the previous set expiry time > > timer->expires = ktime_add(timer->expires, > timr->it.real.incr); > > The spec says: > "Also note that some implementations may choose to adjust time and/or > interval values to exactly match the ticks of the underlying clock." > > So there is no requirement to do so. Of course you may, but this takes > simply the name "precision" ad absurdum. Your current implementation contradicts the requirement that values should be rounded up to the resolution of the timer, that's exactly what my implementation does. The resolution of the timer is currently TICK_NSEC (+- ntp correction) and one expiry of it should only cause at most one expiry of all pending timer. If I set a 1msec timer in your implementation (with HZ=250), I automatically get 3 overruns, even though the timer really did only expire once. Since you don't do any rounding at all anymore, your timer may now expire early with low resolution clocks (the old jiffies + 1 problem I explained in my ktime_t patch). Also in the ktimer patch you wrote: +- also, there is an application surprise factor, the 'do not round + intervals' technique can lead to the following sample sequence of + events: + + Interval: 1.7ms + Resolution: 1ms + + Event timeline: + + 2ms - 4ms - 6ms - 7ms - 9ms - 11ms - 12ms - 14ms - 16ms - 17ms ... + + this 2,2,1,2,2,1...msec 'unpredictable and uneven' relative distance + of events could surprise applications. But this is now exactly the bevhaviour your timer has, why is not "surprising" anymore? > > > Beside of that, the implementation is also completely broken. (You > > > rebase the timer from the realtime base to the monotonic base inside of > > > the timer callback function. On return you lock the realtime base and > > > enqueue the timer into the realtime queue, but the base field of the > > > timer points to the monotonic queue. It needs not much phantasy to get > > > this exploited into a crash.) > > > > If you provide the wrong parameters, you can crash a lot of stuff in the > > kernel. "exploited" usually implies it can be abused from userspace, which > > is not the case here. > > Thanks for teaching me what an "exploit" usally means! > > I intentionally wrote "exploited into a crash". > > How do you think I got this to crash ? By hacking up some complex kernel > code ? No, simply by running my unmodified test scripts from user space > with completely valid and correct parameters. Of course its also > possible to write a program which actually exploits this. > > The implementation is simply broken. Can you just accept this ? I can accept that you found bug, but for "simply broken" I'm not convinced yet. Sorry, I have not been specific enough, I disagree with your analysis above. On return the timer isn't requeued into the realtime queue at all, so this can't be the reason for the crash. I guess it's more likely you managed to trigger the locking bug. > > > Furthermore, your implementation is calculating the next expiry value > > > based on the current time of the expiration rather than on the previous > > > expected expiry time, which would be the natural thing to do. This > > > detail also explains the system-load dependent random drifting of > > > ptimers quite well. > > > > Is this conclusion based on actual testing? The behaviour of ptimer should > > be quite close to the old jiffie based timer, so I'm a bit at a loss here, > > how you get to this conclusion. Please provide more details. > > It is based on testing. Do you think I pulled numbers out of my nose ? Jeez, sorry for asking. :( You didn't specify anywhere how you got to this conclusion, so I could reproduce it myself. Could you please elaborate on this "system-load dependent random drifting"? > > I don't understand where you get this from, I explicitely said that higher > > resolution requires a better clock abstraction, bascially any place which > > mentions TICK_NSEC has to be cleaned up like this. I'm at loss why you > > think this requires "a lot of #ifdef mess". > > Why do you need all this jiffie stuff in the first place? It is not > necessary at all. The hrtimer code does not contain a single reference > of jiffies and therefor it does not need anything to clean up. I > consider even a single high resolution timer related #ifdef outside of > hrtimer.c and the clock event abstraction as an unnecessary mess. Sure > you can replace the TICK_NSEC and ktime_to_jiffie stuff completely, but > I still do not see the point why it is necessary to put it there first. > It just makes it overly complex to review and understand :) In this regard I had two major goals: a) keep it as simple as possible, b) preserve the current behaviour and I still think I found the best compromise so far. This would allow to first merge the basic infrastructure, while reducing the risk of breaking anything. I don't mind changing the behaviour, but I would prefer to do this in a separate step and with an analysis of the possible consequences. This is not just about posix-timers, but it also affects itimers, nanosleep and possibly other systems in the future. Actually my main focus is not on HR posix timer, my main interest is that everythings else keeps working and doesn't has to pay the price for it. It's rather likely that if there is a subtle change in behaviour, which causes something to break, it's not noticed until it hits a release kernel, so I think it's very well worth it to understand and document the differences between the implementations. > Please stop your absurd schoolmasterly attempts to teach me stuff which > I'm well aware off. Can you please accept, that I exactly know what I'm > talking about? Sure, I can. I'm sorry I tried to explain things you already know, but if you know these things already, then please show it. At this point I'm mostly still trying to understand, why you did certain things and sometimes I explain things from my perspective in the hopes you would fill in the holes from your perspective. You mostly just post your patches and only explain the conclusion, you're make it rather short on how you get to these conclusions, e.g. what other alternatives you've already considered. This makes it hard for me to figure out what you know exactly from what you're talking about. > But I seriously will ignore you completely, if you keep this tone and > attitude with me. Jeez, cut me some slack, would you? Especially in the last mail I mostly just asked for more information. You read something into my mails, that is simply not there. bye, Roman ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-12 13:39 ` Roman Zippel @ 2005-12-12 16:42 ` Thomas Gleixner 2005-12-12 18:37 ` Thomas Gleixner ` (2 more replies) 0 siblings, 3 replies; 52+ messages in thread From: Thomas Gleixner @ 2005-12-12 16:42 UTC (permalink / raw) To: Roman Zippel; +Cc: linux-kernel, Andrew Morton, rostedt, johnstul, mingo Hi, On Mon, 2005-12-12 at 14:39 +0100, Roman Zippel wrote: > > Actually the change adds more code lines and removes one field of the > > hrtimer structure, but it has exactly the same functionality: Fast > > access to the first expiring timer without walking the rb_tree. > > Together with the state field this would save 12 bytes, which is IMO very > well worth considering. > You seem to have some plans for it, the best hint I've found for it is: > > + (This seperate list is also useful for high-resolution timers where we > + need seperate pending and expired queues while keeping the time-order > + intact.)" > > Could you please elaborate on this? Sure. I have already removed the list_head for the non high resolution case as it turned out that it does not hurt the high resolution implementation. For the high resolution implementation we have to move the expired timers to a seperate list, as we do not want to do complex callback functions from the event interrupt itself. But we have to reprogramm the next event interrupt, so we need simple access to the timer which expires first. The initial implementation did simply move the timer from the pending list to the expired list without doing the rb_tree removal inside of the event interrupt handler. That way the next event for reprogramming was the first event in the pending list. The new rebased version with the pending list removed does the rb_tree removal inside the event interrupt and enqueues the timer, for which the callback function has to be executed in the softirq, to the expired list. One exception are simple wakeup callback functions, as they are reasonably fast and we save two context switches. The next event for reprogramming the event interrupt is retrieved by the pointer in the base structure. This way the list head is only necessary for the high resolution case. The state field is not removed > > > [PATCH 5/9] remove relative timer from abs_list > > > > > > When an absolute timer expires, it becomes a relative timer, so remove > > > it from the abs_list. The TIMER_ABSTIME flag for timer_settime() > > > changes the interpretation of the it_value member, but it_interval is > > > always a relative value and clock_settime() only affects absolute time > > > services. > > > > This is your interpretation and I disagree. > > > > If I set up a timer with a 24 hour interval, which should go off > > everyday at 6:00 AM, then I expect that this timer does this even when > > the clock is set e.g. by daylight saving. I think, that this is a > > completely valid interpretation and makes a lot of sense from a > > practical point of view. The existing implementation does it that way > > already, so why do we want to change this ? > > I don't know whether this behaviour was intentional and why it was done > this way, so I did this patch to initiate a discussion about this. Ok. > I wouldn't say a 1 day interval timer is a very realistic example and the > old timer wouldn't be very precise for this. Sure, as all comparisons are flawed. I just used a simple example to illustrate my POV. > The rationale for example talks about "a periodic timer with an absolute > _initial_ expiration time", so I could also construct a valid example with > this expectation. The more I read the spec the more I think the current > behaviour is not correct, e.g. that ABS_TIME is only relevant for > it_value. > So I'm interested in specific interpretations of the spec which support > the current behaviour. Unfortunately you find just the spec all over the place. I fear we have to find and agree on an interpretation ourself. I agree, that the restriction to the initial it_value is definitely something you can read out of the spec. But it does not make a lot of sense for me. Also the restriction to TIMER_ABSTIME is somehow strange as it converts an CLOCK_REALTIME timer to a CLOCK_MONOTONIC timer. I never understood the rationale behind that. > > The spec says: > > "Also note that some implementations may choose to adjust time and/or > > interval values to exactly match the ticks of the underlying clock." > > > > So there is no requirement to do so. Of course you may, but this takes > > simply the name "precision" ad absurdum. > > Your current implementation contradicts the requirement that values should > be rounded up to the resolution of the timer, that's exactly what my > implementation does. The resolution of the timer is currently TICK_NSEC > (+- ntp correction) and one expiry of it should only cause at most one > expiry of all pending timer. If I set a 1msec timer in your implementation > (with HZ=250), I automatically get 3 overruns, even though the timer > really did only expire once. Damn, you are right. We did not take this into account. > Since you don't do any rounding at all anymore, your timer may now expire > early with low resolution clocks (the old jiffies + 1 problem I explained > in my ktime_t patch). It does not expire early. The timer->expires field is still compared against now. > Also in the ktimer patch you wrote: > > +- also, there is an application surprise factor, the 'do not round > + intervals' technique can lead to the following sample sequence of > + events: > + > + Interval: 1.7ms > + Resolution: 1ms > + > + Event timeline: > + > + 2ms - 4ms - 6ms - 7ms - 9ms - 11ms - 12ms - 14ms - 16ms - 17ms ... > + > + this 2,2,1,2,2,1...msec 'unpredictable and uneven' relative distance > + of events could surprise applications. > > But this is now exactly the bevhaviour your timer has, why is not > "surprising" anymore? Yes, we wrote that before. After reconsidering the results we came to the conclusion, that we actually dont need the rounding at all because the uneven distance is equally surprising as the summing up errors introduced by rounding. > I can accept that you found bug, but for "simply broken" I'm not convinced > yet. Sorry, I have not been specific enough, I disagree with your analysis > above. On return the timer isn't requeued into the realtime queue at all, > so this can't be the reason for the crash. I guess it's more likely you > managed to trigger the locking bug. Ok. Maybe I did not understand the code at this point. > You didn't specify anywhere how you got to this conclusion, so I could > reproduce it myself. Could you please elaborate on this "system-load > dependent random drifting"? As I said already, my conclusion was wrong. This showed up on a SMP machine not on UP, when the system load was high. (The timeline was randomly off) > > > I don't understand where you get this from, I explicitely said that higher > > > resolution requires a better clock abstraction, bascially any place which > > > mentions TICK_NSEC has to be cleaned up like this. I'm at loss why you > > > think this requires "a lot of #ifdef mess". > > > > Why do you need all this jiffie stuff in the first place? It is not > > necessary at all. The hrtimer code does not contain a single reference > > of jiffies and therefor it does not need anything to clean up. I > > consider even a single high resolution timer related #ifdef outside of > > hrtimer.c and the clock event abstraction as an unnecessary mess. Sure > > you can replace the TICK_NSEC and ktime_to_jiffie stuff completely, but > > I still do not see the point why it is necessary to put it there first. > > It just makes it overly complex to review and understand :) > > In this regard I had two major goals: a) keep it as simple as possible, b) > preserve the current behaviour and I still think I found the best > compromise so far. This would allow to first merge the basic > infrastructure, while reducing the risk of breaking anything. > > I don't mind changing the behaviour, but I would prefer to do this in a > separate step and with an analysis of the possible consequences. This is > not just about posix-timers, but it also affects itimers, nanosleep and > possibly other systems in the future. Actually my main focus is not on HR > posix timer, my main interest is that everythings else keeps working and > doesn't has to pay the price for it. While my focus is a clean merging of high resolution timers without breaking the non hrt case, I still believe that we can find a solution, where we can have the base implementation without any reference to jiffies. > It's rather likely that if there is a subtle change in behaviour, which > causes something to break, it's not noticed until it hits a release > kernel, so I think it's very well worth it to understand and document the > differences between the implementations. Sure. Our goal was to keep the code almost identical independend of the driving clock source. I try to compare and contrast the two possible solutions: Rounding the initial expiry time and the interval results in a summing up error, which depends on the delta of the interval and the resolution. The non rounding solution results in a summing up error for intervals which are less than the resolution. For intervals >= resolution no summing up error is happening, but for intervals, which are not a multiple of the resolution, an uneven interval as close as possible to the timeline is delivered. In both cases the timers never expire early and I think both variants are compliant with the specification. > Sure, I can. I'm sorry I tried to explain things you already know, but if > you know these things already, then please show it. At this point I'm > mostly still trying to understand, why you did certain things and > sometimes I explain things from my perspective in the hopes you would fill > in the holes from your perspective. > > You mostly just post your patches and only explain the conclusion, you're > make it rather short on how you get to these conclusions, e.g. what other > alternatives you've already considered. This makes it hard for me to > figure out what you know exactly from what you're talking about. Ok. Will try to get this better. tglx ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-12 16:42 ` Thomas Gleixner @ 2005-12-12 18:37 ` Thomas Gleixner 2005-12-13 1:25 ` George Anzinger 2005-12-14 20:48 ` Roman Zippel 2 siblings, 0 replies; 52+ messages in thread From: Thomas Gleixner @ 2005-12-12 18:37 UTC (permalink / raw) To: Roman Zippel; +Cc: linux-kernel, Andrew Morton, rostedt, johnstul, mingo > On Mon, 2005-12-12 at 14:39 +0100, Roman Zippel wrote: > > > Actually the change adds more code lines and removes one field of the > > > hrtimer structure, but it has exactly the same functionality: Fast > > > access to the first expiring timer without walking the rb_tree. > > > > Together with the state field this would save 12 bytes, which is IMO very > > well worth considering. > > You seem to have some plans for it, the best hint I've found for it is: > > > > + (This seperate list is also useful for high-resolution timers where we > > + need seperate pending and expired queues while keeping the time-order > > + intact.)" > > > > Could you please elaborate on this? > > Sure. I have already removed the list_head for the non high resolution > case as it turned out that it does not hurt the high resolution > implementation. > > For the high resolution implementation we have to move the expired > timers to a seperate list, as we do not want to do complex callback > functions from the event interrupt itself. But we have to reprogramm the > next event interrupt, so we need simple access to the timer which > expires first. > > The initial implementation did simply move the timer from the pending > list to the expired list without doing the rb_tree removal inside of the > event interrupt handler. That way the next event for reprogramming was > the first event in the pending list. > > The new rebased version with the pending list removed does the rb_tree > removal inside the event interrupt and enqueues the timer, for which the > callback function has to be executed in the softirq, to the expired > list. One exception are simple wakeup callback functions, as they are > reasonably fast and we save two context switches. The next event for > reprogramming the event interrupt is retrieved by the pointer in the > base structure. > > This way the list head is only necessary for the high resolution case. > > The state field is not removed Oops, I somehow managed to remove the rest of this paragraph :( The state field is not removed because I'm not a big fan of those overloaded fields and I prefer to pay the 4 byte penalty for the seperation. Of course if there is the absolute requirement to reduce the size, I'm not insisting on keeping it. tglx ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-12 16:42 ` Thomas Gleixner 2005-12-12 18:37 ` Thomas Gleixner @ 2005-12-13 1:25 ` George Anzinger 2005-12-13 9:18 ` Thomas Gleixner 2005-12-15 1:35 ` Roman Zippel 2005-12-14 20:48 ` Roman Zippel 2 siblings, 2 replies; 52+ messages in thread From: George Anzinger @ 2005-12-13 1:25 UTC (permalink / raw) To: tglx; +Cc: Roman Zippel, linux-kernel, Andrew Morton, rostedt, johnstul, mingo Thomas Gleixner wrote: ~ > > >>I wouldn't say a 1 day interval timer is a very realistic example and the >>old timer wouldn't be very precise for this. > > > Sure, as all comparisons are flawed. I just used a simple example to > illustrate my POV. > > >>The rationale for example talks about "a periodic timer with an absolute >>_initial_ expiration time", so I could also construct a valid example with >>this expectation. The more I read the spec the more I think the current >>behaviour is not correct, e.g. that ABS_TIME is only relevant for >>it_value. >>So I'm interested in specific interpretations of the spec which support >>the current behaviour. My $0.02 worth: It is clear (from the standard) that the initial time is to be ABS_TIME. It is also clear that the interval is to be added to that time. IMHO then, the result should have the same property, i.e. ABS_TIME. Sort of like adding an offset to a relative address. The result is still relative. > > > Unfortunately you find just the spec all over the place. I fear we have > to find and agree on an interpretation ourself. > > I agree, that the restriction to the initial it_value is definitely > something you can read out of the spec. But it does not make a lot of > sense for me. Also the restriction to TIMER_ABSTIME is somehow strange > as it converts an CLOCK_REALTIME timer to a CLOCK_MONOTONIC timer. I > never understood the rationale behind that. I don't think it really does that. The TIMER_ABSTIME flag just says that the time requested is to be taken as "clock" time (which ever clock) AND that this is to be the expire time regardless of clock setting. We, in an attempt to simplify the lists, convert the expire time into some common time notation (in most cases we convert relative times to absolute times) but this introduces problems because the caller has _asked_ for a relative or absolute time and not the other. If the clock can not be set this is not a problem. If it can, well, we need to keep track of what the caller wanted, absolute or relative. It might help others to understand this if you were to remove the clock names from your queues and instead call them "absolute_real" and "up_time". Then it would be more clear, I think, that we are mapping user requests onto these queues based on the desired functionality without a predilection to put a timer on a given queue just because a particular clock was requested. At this point it becomes clear, for example, that a TIMER_ABSTIME request on the real clock is the _only_ request that should be mapped to the "absolute_real" list. > ~ -- George Anzinger george@mvista.com HRT (High-res-timers): http://sourceforge.net/projects/high-res-timers/ ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-13 1:25 ` George Anzinger @ 2005-12-13 9:18 ` Thomas Gleixner 2005-12-15 1:35 ` Roman Zippel 1 sibling, 0 replies; 52+ messages in thread From: Thomas Gleixner @ 2005-12-13 9:18 UTC (permalink / raw) To: george Cc: Roman Zippel, linux-kernel, Andrew Morton, rostedt, johnstul, mingo On Mon, 2005-12-12 at 17:25 -0800, George Anzinger wrote: > >>The rationale for example talks about "a periodic timer with an absolute > >>_initial_ expiration time", so I could also construct a valid example with > >>this expectation. The more I read the spec the more I think the current > >>behaviour is not correct, e.g. that ABS_TIME is only relevant for > >>it_value. > >>So I'm interested in specific interpretations of the spec which support > >>the current behaviour. > > My $0.02 worth: It is clear (from the standard) that the initial time > is to be ABS_TIME. It is also clear that the interval is to be added > to that time. IMHO then, the result should have the same property, > i.e. ABS_TIME. Sort of like adding an offset to a relative address. > The result is still relative. So the only difference between a timer with ABSTIME set and one without is the notion of the initial expiry value, aside the clock_settime(CLOCK_REALTIME) speciality. ABSTIME: firstexp = it_value firstexp, firstexp + it_interval, ... firstexp + n * it_interval non ABSTIME: firstexp = now + it_value firstexp, firstexp + it_interval, ... firstexp + n * it_interval The only limitation of this is that the interval value can not be less than the resolution of the clock in order to avoid the wrong accounting of the overflow. > > Unfortunately you find just the spec all over the place. I fear we have > > to find and agree on an interpretation ourself. > > > > I agree, that the restriction to the initial it_value is definitely > > something you can read out of the spec. But it does not make a lot of > > sense for me. Also the restriction to TIMER_ABSTIME is somehow strange > > as it converts an CLOCK_REALTIME timer to a CLOCK_MONOTONIC timer. I > > never understood the rationale behind that. > > I don't think it really does that. The TIMER_ABSTIME flag just says > that the time requested is to be taken as "clock" time (which ever > clock) AND that this is to be the expire time regardless of clock > setting. We, in an attempt to simplify the lists, convert the expire > time into some common time notation (in most cases we convert relative > times to absolute times) but this introduces problems because the > caller has _asked_ for a relative or absolute time and not the other. > If the clock can not be set this is not a problem. If it can, well, > we need to keep track of what the caller wanted, absolute or relative. > > It might help others to understand this if you were to remove the > clock names from your queues and instead call them "absolute_real" and > "up_time". Then it would be more clear, I think, that we are mapping > user requests onto these queues based on the desired functionality > without a predilection to put a timer on a given queue just because a > particular clock was requested. At this point it becomes clear, for > example, that a TIMER_ABSTIME request on the real clock is the _only_ > request that should be mapped to the "absolute_real" list. In other words. If there is only CLOCK_REALTIME, then the implementation has to keep track of absolute and relative timers. The existance of CLOCK_MONOTONIC and the fact that CLOCK_MONOTONIC is using the same clock source as CLOCK_REALTIME allows us to optimize the implementation by putting the relative timers on the monotonic list. tglx ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-13 1:25 ` George Anzinger 2005-12-13 9:18 ` Thomas Gleixner @ 2005-12-15 1:35 ` Roman Zippel 2005-12-15 2:29 ` George Anzinger 1 sibling, 1 reply; 52+ messages in thread From: Roman Zippel @ 2005-12-15 1:35 UTC (permalink / raw) To: George Anzinger Cc: tglx, linux-kernel, Andrew Morton, rostedt, johnstul, mingo Hi, On Mon, 12 Dec 2005, George Anzinger wrote: > My $0.02 worth: It is clear (from the standard) that the initial time is to be > ABS_TIME. Yes. > It is also clear that the interval is to be added to that time. Not necessarily. It says it_interval is a "reload value", it's used to reload the timer to count down to the next expiration. It's up to the implementation, whether it really counts down this time or whether it converts it first into an absolute value. > IMHO then, the result should have the same property, i.e. ABS_TIME. Sort of > like adding an offset to a relative address. The result is still relative. If the result is relative, why should have a clock set any effect? IMO the spec makes it quite clear that initial timer and the periodic timer are two different types of the timer. The initial timer only specifies how the periodic timer is started and the periodic timer itself is a "relative time service". bye, Roman ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-15 1:35 ` Roman Zippel @ 2005-12-15 2:29 ` George Anzinger 2005-12-19 14:56 ` Roman Zippel 0 siblings, 1 reply; 52+ messages in thread From: George Anzinger @ 2005-12-15 2:29 UTC (permalink / raw) To: Roman Zippel; +Cc: tglx, linux-kernel, Andrew Morton, rostedt, johnstul, mingo Roman Zippel wrote: > Hi, > > On Mon, 12 Dec 2005, George Anzinger wrote: > > >>My $0.02 worth: It is clear (from the standard) that the initial time is to be >>ABS_TIME. > > > Yes. > > >> It is also clear that the interval is to be added to that time. > > > Not necessarily. It says it_interval is a "reload value", it's used to > reload the timer to count down to the next expiration. > It's up to the implementation, whether it really counts down this time or > whether it converts it first into an absolute value. > > >>IMHO then, the result should have the same property, i.e. ABS_TIME. Sort of >>like adding an offset to a relative address. The result is still relative. > > > If the result is relative, why should have a clock set any effect? > IMO the spec makes it quite clear that initial timer and the periodic > timer are two different types of the timer. The initial timer only > specifies how the periodic timer is started and the periodic timer itself > is a "relative time service". > Well, I guess we will have to agree to disagree. That which the interval is added to is an absolute time, so I, and others, take the result as absolute. At this point there really is no "conversion" to an absolute timer. Once the timer initial time is absolute, everything derived from it, i.e. all intervals added to it, must be absolute. For what its worth, I do think that the standards folks could have done a bit better here. I, for example, would have liked to have seen a discussion about what to do with overrun in the face of clock setting. -- George Anzinger george@mvista.com HRT (High-res-timers): http://sourceforge.net/projects/high-res-timers/ ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-15 2:29 ` George Anzinger @ 2005-12-19 14:56 ` Roman Zippel 2005-12-19 20:54 ` George Anzinger 0 siblings, 1 reply; 52+ messages in thread From: Roman Zippel @ 2005-12-19 14:56 UTC (permalink / raw) To: George Anzinger Cc: tglx, linux-kernel, Andrew Morton, rostedt, johnstul, mingo Hi, On Wed, 14 Dec 2005, George Anzinger wrote: > > > IMHO then, the result should have the same property, i.e. ABS_TIME. Sort > > > of > > > like adding an offset to a relative address. The result is still relative. > > > > > > If the result is relative, why should have a clock set any effect? > > IMO the spec makes it quite clear that initial timer and the periodic timer > > are two different types of the timer. The initial timer only specifies how > > the periodic timer is started and the periodic timer itself is a "relative > > time service". > > > Well, I guess we will have to agree to disagree. That's easy for you to say. :) You don't think the current behaviour is wrong. > That which the interval is > added to is an absolute time, so I, and others, take the result as absolute. > At this point there really is no "conversion" to an absolute timer. Once the > timer initial time is absolute, everything derived from it, i.e. all intervals > added to it, must be absolute. With this argumentation, any relative timer could be treated this way, you have to base a relative timer on something. While searching for more information I found the NetBSD code and they do exactly this, they just convert everything to absolute values and clock set affects all timers equally. Is this now more correct? > For what its worth, I do think that the standards folks could have done a bit > better here. I, for example, would have liked to have seen a discussion about > what to do with overrun in the face of clock setting. Maybe they thought it wouldn't be necessary :), because a periodic is a relative timer and thus not affected... bye, Roman ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-19 14:56 ` Roman Zippel @ 2005-12-19 20:54 ` George Anzinger 2005-12-21 23:03 ` Roman Zippel 0 siblings, 1 reply; 52+ messages in thread From: George Anzinger @ 2005-12-19 20:54 UTC (permalink / raw) To: Roman Zippel; +Cc: tglx, linux-kernel, Andrew Morton, rostedt, johnstul, mingo Roman Zippel wrote: > Hi, > > On Wed, 14 Dec 2005, George Anzinger wrote: > > >>>>IMHO then, the result should have the same property, i.e. ABS_TIME. Sort >>>>of >>>>like adding an offset to a relative address. The result is still relative. >>> >>> >>>If the result is relative, why should have a clock set any effect? >>>IMO the spec makes it quite clear that initial timer and the periodic timer >>>are two different types of the timer. The initial timer only specifies how >>>the periodic timer is started and the periodic timer itself is a "relative >>>time service". >>> >> >>Well, I guess we will have to agree to disagree. > > > That's easy for you to say. :) > You don't think the current behaviour is wrong. > > On of the issues I see with using your assumption is that moving the timer to an absolute clock after the initial expiry _may_ lead to additional qauntization errors, depending on how aligned the two clocks are. >> That which the interval is >>added to is an absolute time, so I, and others, take the result as absolute. >>At this point there really is no "conversion" to an absolute timer. Once the >>timer initial time is absolute, everything derived from it, i.e. all intervals >>added to it, must be absolute. > > > With this argumentation, any relative timer could be treated this way, you > have to base a relative timer on something. > While searching for more information I found the NetBSD code and they > do exactly this, they just convert everything to absolute values and clock > set affects all timers equally. Is this now more correct? > I would guess, then, that either the non-absolute or the absolute timer behaves badly in the face of clock setting. Could you provide a pointer to the NetBSD code so I can have a look too? > >>For what its worth, I do think that the standards folks could have done a bit >>better here. I, for example, would have liked to have seen a discussion about >>what to do with overrun in the face of clock setting. > > > Maybe they thought it wouldn't be necessary :), because a periodic is a > relative timer and thus not affected... Well, then they could have said that :) Might have prevented a lot of lkml bandwidth usage as well as several days of my time trying to do something other than what they might say is the right thing. -- George Anzinger george@mvista.com HRT (High-res-timers): http://sourceforge.net/projects/high-res-timers/ ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-19 20:54 ` George Anzinger @ 2005-12-21 23:03 ` Roman Zippel 2005-12-22 4:30 ` George Anzinger 0 siblings, 1 reply; 52+ messages in thread From: Roman Zippel @ 2005-12-21 23:03 UTC (permalink / raw) To: George Anzinger Cc: tglx, linux-kernel, Andrew Morton, rostedt, johnstul, mingo Hi, On Mon, 19 Dec 2005, George Anzinger wrote: > > You don't think the current behaviour is wrong. > > > > > On of the issues I see with using your assumption is that moving the timer to > an absolute clock after the initial expiry _may_ lead to additional > qauntization errors, depending on how aligned the two clocks are. What do you mean by "moving the timer to an an absolute clock"? > I would guess, then, that either the non-absolute or the absolute timer > behaves badly in the face of clock setting. Could you provide a pointer to > the NetBSD code so I can have a look too? http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/kern/kern_time.c?rev=1.98&content-type=text/x-cvsweb-markup AFAICT TIMER_ABSTIME is only used to convert the relative value to an absolute value. bye, Roman ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-21 23:03 ` Roman Zippel @ 2005-12-22 4:30 ` George Anzinger 0 siblings, 0 replies; 52+ messages in thread From: George Anzinger @ 2005-12-22 4:30 UTC (permalink / raw) To: Roman Zippel; +Cc: tglx, linux-kernel, Andrew Morton, rostedt, johnstul, mingo Roman Zippel wrote: > Hi, > > On Mon, 19 Dec 2005, George Anzinger wrote: > > >>>You don't think the current behaviour is wrong. >>> >>> >> >>One of the issues I see with using your assumption is that moving the timer to >>an absolute clock after the initial expiry _may_ lead to additional >>qauntization errors, depending on how aligned the two clocks are. > > > What do you mean by "moving the timer to an an absolute clock"? The assumption I am making is that the timer is connected to a clock (CLOCK_MONOTONIC or CLOCK_REALTIME). Timers on CLOCK_REALTIME with the absolute flag set should expire at the requested time as read from that clock, where as relative timers are not affected by time setting and thus should be on CLOCK_MONOTONIC. It is unclear, in general, how these two clocks relate to each other at the nanosecond level, or so one might think. Of course, we can define this problem away by a particular definition of one of these clocks as being derived from the other (which we, infact, do in Linux). > > >>I would guess, then, that either the non-absolute or the absolute timer >>behaves badly in the face of clock setting. Could you provide a pointer to >>the NetBSD code so I can have a look too? > > > http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/kern/kern_time.c?rev=1.98&content-type=text/x-cvsweb-markup > AFAICT TIMER_ABSTIME is only used to convert the relative value to an > absolute value. Yes, there is also this interesting comment in settime: /* WHAT DO WE DO ABOUT PENDING REAL-TIME TIMEOUTS??? */ I strongly suspect that this system does NOT expire absolute timers and clock_nanosleep calls at the requested time in the face of clock setting. I see NO hooks in the referenced code that would allow them to find such timers at clock set time, nor are they entered into any different list to make them findable. It would appear that the absolute attribute is lost as soon as the time is convereted to a relative time. In fairness, the POSIX folks added the clock setting requirement a few years after the absolute flag was defined... but, still, there is that comment. -- George Anzinger george@mvista.com HRT (High-res-timers): http://sourceforge.net/projects/high-res-timers/ ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-12 16:42 ` Thomas Gleixner 2005-12-12 18:37 ` Thomas Gleixner 2005-12-13 1:25 ` George Anzinger @ 2005-12-14 20:48 ` Roman Zippel 2005-12-14 22:30 ` Thomas Gleixner 2 siblings, 1 reply; 52+ messages in thread From: Roman Zippel @ 2005-12-14 20:48 UTC (permalink / raw) To: Thomas Gleixner; +Cc: linux-kernel, Andrew Morton, rostedt, johnstul, mingo Hi, On Mon, 12 Dec 2005, Thomas Gleixner wrote: > For the high resolution implementation we have to move the expired > timers to a seperate list, as we do not want to do complex callback > functions from the event interrupt itself. But we have to reprogramm the > next event interrupt, so we need simple access to the timer which > expires first. > > The initial implementation did simply move the timer from the pending > list to the expired list without doing the rb_tree removal inside of the > event interrupt handler. That way the next event for reprogramming was > the first event in the pending list. > > The new rebased version with the pending list removed does the rb_tree > removal inside the event interrupt and enqueues the timer, for which the > callback function has to be executed in the softirq, to the expired > list. One exception are simple wakeup callback functions, as they are > reasonably fast and we save two context switches. The next event for > reprogramming the event interrupt is retrieved by the pointer in the > base structure. > > This way the list head is only necessary for the high resolution case. Thanks for the explanation. If it's just for reprogramming the interrupt, it should be cheaper to just check the rbtree than walk the list to find the next expiration time (at least theoretically). This leaves only optimizations for rt kernel and from the base kernel point of view I prefer the immediate space savings. > The state field is not removed because I'm not a big fan of those > overloaded fields and I prefer to pay the 4 byte penalty for the > seperation. > Of course if there is the absolute requirement to reduce the size, I'm > not insisting on keeping it. Well, I'm not a big fan of redundant state information, e.g. the pending information can be included in the rb_node (it's not as quite simple as with the timer_list, but it's the same thing). The expired information (together with the data field) is an optimization for simple sleeps that AFAICT only makes a difference in the rt kernel (the saved context switch you mentioned above). What makes me more uncomfortable is that this is a special case optimization and other callbacks are probably fast as well (e.g. wakeup + timer restart). I can understand you want to keep the difference to the rt kernel small, but for me it's more about immediate benefits against uncertain long term benefits. > > The rationale for example talks about "a periodic timer with an absolute > > _initial_ expiration time", so I could also construct a valid example with > > this expectation. The more I read the spec the more I think the current > > behaviour is not correct, e.g. that ABS_TIME is only relevant for > > it_value. > > So I'm interested in specific interpretations of the spec which support > > the current behaviour. > > Unfortunately you find just the spec all over the place. I fear we have > to find and agree on an interpretation ourself. > > I agree, that the restriction to the initial it_value is definitely > something you can read out of the spec. But it does not make a lot of > sense for me. Also the restriction to TIMER_ABSTIME is somehow strange > as it converts an CLOCK_REALTIME timer to a CLOCK_MONOTONIC timer. I > never understood the rationale behind that. As George already said, it's easier to keep these clocks separate. I think the spec rationale is also more clear about the intended usage. About timers it says: "Two timer types are required for a system to support realtime applications: 1. One-shot ... 2. Periodic ..." Basically you have two independent timer types. It's quite explicit about that only the "initial timer expiration" can be relative or absolute. It doesn't say anywhere that there are relative and absolute periodic timer, all references to "absolute" or "relative" are only in connection with the initial expiration time and after the initial expiration, it becomes a periodic timer. At every timer expiration the timer is reloaded with a relative time interval. I can understand that you find this behaviour useful (although other people may disagree) and the spec doesn't explicitely say that you must not do this, but I really can't convince myself that this is the _intendend_ behaviour. > > Since you don't do any rounding at all anymore, your timer may now expire > > early with low resolution clocks (the old jiffies + 1 problem I explained > > in my ktime_t patch). > > It does not expire early. The timer->expires field is still compared > against now. I don't think that's enough (unless I missed something). Steven maybe explained it better than I did in http://marc.theaimsgroup.com/?l=linux-kernel&m=113047529313935 Even if you set the timer resolution to 1 nsec, there is still the resolution of the actual hardware clock and it has to be taken into account somewhere when you start a relative timer. Even if the clock resolution is usually higher than the normal latency, so the problem won't be visible for most people, the general timer code should take this into account. If someone doesn't care about high resolution timer, he can still use it with a low resolution clock (e.g. jiffies) and then this becomes a problem. > > But this is now exactly the bevhaviour your timer has, why is not > > "surprising" anymore? > > Yes, we wrote that before. After reconsidering the results we came to > the conclusion, that we actually dont need the rounding at all because > the uneven distance is equally surprising as the summing up errors > introduced by rounding. I don't think that the summing up error is surprising at all, the spec is quite clear that the time values have to be rounded up to the resolution of the timer and it's also the behaviour of the current timer. This error is actually the expected behaviour for any timer with a resolution different from 1 nsec. I don't want to say that we can't have such a timer, but I'm not so sure whether this should be the default behaviour. I actually prefer George's earlier suggestion of CLOCK_REALTIME and CLOCK_REALTIME_HR, where one is possibly faster and the other is more precise. Even within the kernel I would prefer to map itimer and nanosleep to the first clock (maybe also based on arch/kconfig defaults). OTOH if the hardware allows it, both clocks can do the same thing, but I really would like to have the possibility to give higher (and thus possibly more expensive) resolution only to those asking for it. > > I don't mind changing the behaviour, but I would prefer to do this in a > > separate step and with an analysis of the possible consequences. This is > > not just about posix-timers, but it also affects itimers, nanosleep and > > possibly other systems in the future. Actually my main focus is not on HR > > posix timer, my main interest is that everythings else keeps working and > > doesn't has to pay the price for it. > > While my focus is a clean merging of high resolution timers without > breaking the non hrt case, I still believe that we can find a solution, > where we can have the base implementation without any reference to > jiffies. This is what I think requires the better clock abstraction, most of it is related to the clock resolution, the generic timer code currently has no idea of the real resolution of the underlying clock, so I assumed a worst case of TICK_NSEC everywhere. > I try to compare and contrast the two possible solutions: > > Rounding the initial expiry time and the interval results in a summing > up error, which depends on the delta of the interval and the > resolution. > > The non rounding solution results in a summing up error for intervals > which are less than the resolution. For intervals >= resolution no > summing up error is happening, but for intervals, which are not a > multiple of the resolution, an uneven interval as close as possible to > the timeline is delivered. > > In both cases the timers never expire early and I think both variants > are compliant with the specification. What I'd like to avoid is that we have to commit ourselves to only one solution right now. I think the first solution is better suited to the low resolution timer, that we have right now. The second solution requires a better clock framework - this includes better time keeping and reprogrammable timer interrupts. At this point I wouldn't like to settle on just one solution, I had to see more of the infrastructure integrated before doing this. At this point I see more advantages in having a choice (may it be as Kconfig or even a runtime option). bye, Roman ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-14 20:48 ` Roman Zippel @ 2005-12-14 22:30 ` Thomas Gleixner 2005-12-15 0:55 ` George Anzinger ` (2 more replies) 0 siblings, 3 replies; 52+ messages in thread From: Thomas Gleixner @ 2005-12-14 22:30 UTC (permalink / raw) To: Roman Zippel; +Cc: linux-kernel, Andrew Morton, rostedt, johnstul, mingo Hi, On Wed, 2005-12-14 at 21:48 +0100, Roman Zippel wrote: > > > > This way the list head is only necessary for the high resolution case. > > Thanks for the explanation. If it's just for reprogramming the interrupt, > it should be cheaper to just check the rbtree than walk the list to find > the next expiration time (at least theoretically). This leaves only > optimizations for rt kernel and from the base kernel point of view I > prefer the immediate space savings. The current -hrt queue contains the removal patch of the list_head already and you interrupted my attempt to send out the patch for -mm :) > > The state field is not removed because I'm not a big fan of those > > overloaded fields and I prefer to pay the 4 byte penalty for the > > seperation. > > Of course if there is the absolute requirement to reduce the size, I'm > > not insisting on keeping it. > > Well, I'm not a big fan of redundant state information, e.g. the pending > information can be included in the rb_node (it's not as quite simple as > with the timer_list, but it's the same thing). I do not consider this as redundant information. It's a design decision whether to use a state variable for state information and the rbnode for rbtree handling or to overload the meaning of the rbnode with information which is not the natural associated content. I'm well aware of those optimization and space saving tricks. I did microcontroller programming long enough, but - and out of the experience - I want to avoid it for new designs where ever it is possible for clarity and extensibility reasons. > The expired information > (together with the data field) is an optimization for simple sleeps that > AFAICT only makes a difference in the rt kernel (the saved context switch > you mentioned above). What makes me more uncomfortable is that this is a > special case optimization and other callbacks are probably fast as well > (e.g. wakeup + timer restart). Not only in a -rt kernel, it also saves the context switch for a high resolution timer enabled kernel, where you actually can execute the callback in hard interrupt context. We can solve it differently, but we should carefully think about the extensiblity issues. The wakeup + restart szenario is a good reason to reconsider this. > I can understand you want to keep the difference to the rt kernel small, > but for me it's more about immediate benefits against uncertain long term > benefits. If you have a clear target and the experience of having implemented the extensions, you have to carefully weigh up the consequences of such decisions. I'm not talking about "might be implemented by somebody sometimes features", I'm talking about existing proof of concept implementations. There is no real justification to ignore well known consequences. Of course if you consider the possibility of including high resolution timers (I'm not talking about -rt) as zero, your requests make sense. I disagree because I'm convinced that the problems "high res timers", "dynamic ticks", "timekeeping", "clock event abstraction" are closely related and we have high demands to get those solved in a clean way. So providing some jiffies bound minimal solution in the first place is more than shortsighted IMO. > > > The rationale for example talks about "a periodic timer with an absolute > > > _initial_ expiration time", so I could also construct a valid example with > > > this expectation. The more I read the spec the more I think the current > > > behaviour is not correct, e.g. that ABS_TIME is only relevant for > > > it_value. > > > So I'm interested in specific interpretations of the spec which support > > > the current behaviour. > > > > Unfortunately you find just the spec all over the place. I fear we have > > to find and agree on an interpretation ourself. > > > > I agree, that the restriction to the initial it_value is definitely > > something you can read out of the spec. But it does not make a lot of > > sense for me. Also the restriction to TIMER_ABSTIME is somehow strange > > as it converts an CLOCK_REALTIME timer to a CLOCK_MONOTONIC timer. I > > never understood the rationale behind that. > > As George already said, it's easier to keep these clocks separate. I think > the spec rationale is also more clear about the intended usage. About > timers it says: > > "Two timer types are required for a system to support realtime > applications: > > 1. One-shot > ... > > 2. Periodic > ..." > > Basically you have two independent timer types. It's quite explicit about > that only the "initial timer expiration" can be relative or absolute. It > doesn't say anywhere that there are relative and absolute periodic timer, > all references to "absolute" or "relative" are only in connection with the > initial expiration time and after the initial expiration, it becomes a > periodic timer. At every timer expiration the timer is reloaded with a > relative time interval. > I can understand that you find this behaviour useful (although other > people may disagree) and the spec doesn't explicitely say that you must > not do this, but I really can't convince myself that this is the > _intendend_ behaviour. Goerge said explicitely: > My $0.02 worth: It is clear (from the standard) that the initial time > is to be ABS_TIME. It is also clear that the interval is to be added > to that time. IMHO then, the result should have the same property, > i.e. ABS_TIME. I dont find a way to read out that the interval should not have the ABSTIME property. > > > Since you don't do any rounding at all anymore, your timer may now expire > > > early with low resolution clocks (the old jiffies + 1 problem I explained > > > in my ktime_t patch). > > > > It does not expire early. The timer->expires field is still compared > > against now. > > I don't think that's enough (unless I missed something). Steven maybe > explained it better than I did in > http://marc.theaimsgroup.com/?l=linux-kernel&m=113047529313935 Steven said: > Interesting though, I tried to force this scenario, by changing the > base->get_time to return jiffies. I have a jitter test and ran this > several times, and I could never get it to expire early. I even changed > HZ back to 100. > > Then I looked at run_ktimer_queue. And here we have the compare: > > timer = list_entry(base->pending.next, struct ktimer, list); > if (ktime_cmp(now, <=, timer->expires)) > break; > > So, the timer does _not_ get processed if it is after or _equal_ to the > current time. So although the timer may go off early, the expired queue > does not get executed. So the above example would not go off at 3.2, > but some time in the 4 category Again, I'm not able to find the problem. while(timers_pending()) { timer = getnext_timer(); if (timer->expires > now) break; execute_callback(); } Please elaborate how the timer can expire early. > Even if you set the timer resolution to 1 nsec, there is still the > resolution of the actual hardware clock and it has to be taken into > account somewhere when you start a relative timer. Even if the clock > resolution is usually higher than the normal latency, so the problem won't > be visible for most people, the general timer code should take this into > account. If someone doesn't care about high resolution timer, he can still > use it with a low resolution clock (e.g. jiffies) and then this becomes a > problem. I'm completely lost on this. Can you please make up a simple example with numbers? If you disable high resolution timers then the resolution is jiffies. The simple comparision which determines whether the timer is expired or not is still valid. > > > But this is now exactly the bevhaviour your timer has, why is not > > > "surprising" anymore? > > > > Yes, we wrote that before. After reconsidering the results we came to > > the conclusion, that we actually dont need the rounding at all because > > the uneven distance is equally surprising as the summing up errors > > introduced by rounding. > > I don't think that the summing up error is surprising at all, the spec is > quite clear that the time values have to be rounded up to the resolution > of the timer and it's also the behaviour of the current timer. No, the spec is not clear at all about this. I pointed this out before and I still think that the part of the RATIONALE section is the key to this decision. "Also note that some implementations may choose to adjust time and/or interval values to exactly match the ticks of the underlying clock" You decide to do the adjustment. I prefer not to do so and I dont buy any argument which says, that the current behaviour is the yardstick for everything. It can't be. Otherwise we would not be able to introduce high resolution timers at all. Every application which relies on some behaviour of the kernel which is not explicitely required by the specification is broken by definition. The compliance requirement is the yardstick, not some random implementation detail which happens to be compliant. > This error is actually the expected behaviour for any timer with a > resolution different from 1 nsec. I don't want to say that we can't have > such a timer, but I'm not so sure whether this should be the default > behaviour. I actually prefer George's earlier suggestion of CLOCK_REALTIME > and CLOCK_REALTIME_HR, where one is possibly faster and the other is more > precise. Even within the kernel I would prefer to map itimer and nanosleep > to the first clock (maybe also based on arch/kconfig defaults). > OTOH if the hardware allows it, both clocks can do the same thing, but I > really would like to have the possibility to give higher (and thus > possibly more expensive) resolution only to those asking for it. Thats an rather odd approach for me. If we drag this further then we might consider that only some users (i.e. applications) of -rt patches are using the enhanced functionalities, which introduces interesting computational problems (e.g when to treat a mutex as a concurrency control which is capable of priority inversion or not). I vote strongly against introducing private, special purpose APIs and I consider CLOCK_XXX_HR as such. The proposed hrtimer solution does not introduce any penalties for people who do not enable a future high resolution extension. It gives us the benefit of a clean code base which is capable to be switched simply and non intrusive to the high resolution mode. We have done extensive tests on the impact of converting all users unconditionally to high resolution mode once it is switched on and the penalty is within the noise range. You are explicitely asking for increased complexity with your approach. > > > I don't mind changing the behaviour, but I would prefer to do this in a > > > separate step and with an analysis of the possible consequences. This is > > > not just about posix-timers, but it also affects itimers, nanosleep and > > > possibly other systems in the future. Actually my main focus is not on HR > > > posix timer, my main interest is that everythings else keeps working and > > > doesn't has to pay the price for it. > > > > While my focus is a clean merging of high resolution timers without > > breaking the non hrt case, I still believe that we can find a solution, > > where we can have the base implementation without any reference to > > jiffies. > > This is what I think requires the better clock abstraction, most of it is > related to the clock resolution, the generic timer code currently has no > idea of the real resolution of the underlying clock, so I assumed a worst > case of TICK_NSEC everywhere. Well, can you please show where the current hrtimer implementation is doing something which requires a better clock abstraction ? The clock abstraction layer is relevant if you actuallly want to switch to high resolution mode and until then the coarse interface is sufficient. > > I try to compare and contrast the two possible solutions: > > > > Rounding the initial expiry time and the interval results in a summing > > up error, which depends on the delta of the interval and the > > resolution. > > > > The non rounding solution results in a summing up error for intervals > > which are less than the resolution. For intervals >= resolution no > > summing up error is happening, but for intervals, which are not a > > multiple of the resolution, an uneven interval as close as possible to > > the timeline is delivered. > > > > In both cases the timers never expire early and I think both variants > > are compliant with the specification. > > What I'd like to avoid is that we have to commit ourselves to only one > solution right now. I think the first solution is better suited to the low > resolution timer, that we have right now. The second solution requires a > better clock framework - this includes better time keeping and > reprogrammable timer interrupts. We have to choose one. Everything else is a bad compromise. There is nothing worse than making no decision when you are at a point where a decision is required. Please provide one reproducable scenario why a better time keeping and a reprogrammable timer interrupt is required. The current hrtimer code does not need this at all. Only if you want to have higher resolution you need this and we use it in the high resolution timer queue - both the timekeeping and the reprogamming abstraction layer. > At this point I wouldn't like to settle on just one solution, I had to > see more of the infrastructure integrated before doing this. At this point > I see more advantages in having a choice (may it be as Kconfig or even a > runtime option). Well, I do not see a point why we want to have Kconfig, runtime or whatever choices for a non existant problem at all. tglx ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-14 22:30 ` Thomas Gleixner @ 2005-12-15 0:55 ` George Anzinger 2005-12-15 14:18 ` Steven Rostedt 2005-12-19 14:50 ` Roman Zippel 2 siblings, 0 replies; 52+ messages in thread From: George Anzinger @ 2005-12-15 0:55 UTC (permalink / raw) To: tglx; +Cc: Roman Zippel, linux-kernel, Andrew Morton, rostedt, johnstul, mingo Thomas Gleixner wrote: > Hi, ~ > > >>This error is actually the expected behaviour for any timer with a >>resolution different from 1 nsec. I don't want to say that we can't have >>such a timer, but I'm not so sure whether this should be the default >>behaviour. I actually prefer George's earlier suggestion of CLOCK_REALTIME >>and CLOCK_REALTIME_HR, where one is possibly faster and the other is more >>precise. Even within the kernel I would prefer to map itimer and nanosleep >>to the first clock (maybe also based on arch/kconfig defaults). >>OTOH if the hardware allows it, both clocks can do the same thing, but I >>really would like to have the possibility to give higher (and thus >>possibly more expensive) resolution only to those asking for it. > > > Thats an rather odd approach for me. If we drag this further then we > might consider that only some users (i.e. applications) of -rt patches > are using the enhanced functionalities, which introduces interesting > computational problems (e.g when to treat a mutex as a concurrency > control which is capable of priority inversion or not). Er... what? This is a non-compute. > > I vote strongly against introducing private, special purpose APIs and I > consider CLOCK_XXX_HR as such. The proposed hrtimer solution does not > introduce any penalties for people who do not enable a future high > resolution extension. It gives us the benefit of a clean code base which > is capable to be switched simply and non intrusive to the high > resolution mode. We have done extensive tests on the impact of > converting all users unconditionally to high resolution mode once it is > switched on and the penalty is within the noise range. > > You are explicitely asking for increased complexity with your approach. I beg to differ here. The fact that high res timers, in general, require an interrupt per expiry, and that, by definition, we are changing the resolution by, I would guess, a couple of orders of magnitude implies a rather much larger over head. If we sum this over all user timers it can IMHO get out of control. Given that only a very small number of applications really need the extra resolution, I think it makes a lot of sense that those applications incur the overhead and others, which don't need nor want the higher resolution, just use the old low resolution timers. The notion of switching this at configure time implies that a given kernel is going to be used ONLY one way or another for all applications, which, AFAICT is just not the way most users do things. As to CLOCK_XXX_HR being a special purpose API, this is only half true. It is a POSIX conforming extension and I do think you can find it used elsewhere as well. On the other hand, it if you want to limit the higher overhead timers to only those who ask, well, I guess you could call that "special purpose". On the complexity thing, your new organization makes the added "complexity" rather non-complex, in fact, you might say it is down right simple, for which, thank you. > > ~ -- George Anzinger george@mvista.com HRT (High-res-timers): http://sourceforge.net/projects/high-res-timers/ ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-14 22:30 ` Thomas Gleixner 2005-12-15 0:55 ` George Anzinger @ 2005-12-15 14:18 ` Steven Rostedt 2005-12-19 14:50 ` Roman Zippel 2 siblings, 0 replies; 52+ messages in thread From: Steven Rostedt @ 2005-12-15 14:18 UTC (permalink / raw) To: tglx; +Cc: Roman Zippel, linux-kernel, Andrew Morton, johnstul, mingo On Wed, 2005-12-14 at 23:30 +0100, Thomas Gleixner wrote: > > > > I don't think that's enough (unless I missed something). Steven maybe > > explained it better than I did in > > http://marc.theaimsgroup.com/?l=linux-kernel&m=113047529313935 > > Steven said: > > > Interesting though, I tried to force this scenario, by changing the > > base->get_time to return jiffies. I have a jitter test and ran this > > several times, and I could never get it to expire early. I even changed > > HZ back to 100. > > > > Then I looked at run_ktimer_queue. And here we have the compare: > > > > timer = list_entry(base->pending.next, struct ktimer, list); > > if (ktime_cmp(now, <=, timer->expires)) > > break; > > > > So, the timer does _not_ get processed if it is after or _equal_ to the > > current time. So although the timer may go off early, the expired queue > > does not get executed. So the above example would not go off at 3.2, > > but some time in the 4 category > > Again, I'm not able to find the problem. > > while(timers_pending()) { > timer = getnext_timer(); > if (timer->expires > now) > break; > execute_callback(); > } > > Please elaborate how the timer can expire early. Actually Thomas, the above code doesn't handle it correctly, although, the code you have in hrtimer.c does. Here you say ">" where it should be ">=", otherwise you can have the affect that I explained in the reference that Roman stated. Although run_hrtimer_queue is still correct, I think you might want to change the hrtimer_forward. It has a ">" where it should be ">=". -- Steve ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-14 22:30 ` Thomas Gleixner 2005-12-15 0:55 ` George Anzinger 2005-12-15 14:18 ` Steven Rostedt @ 2005-12-19 14:50 ` Roman Zippel 2005-12-19 22:05 ` Thomas Gleixner 2 siblings, 1 reply; 52+ messages in thread From: Roman Zippel @ 2005-12-19 14:50 UTC (permalink / raw) To: Thomas Gleixner; +Cc: linux-kernel, Andrew Morton, rostedt, johnstul, mingo Hi, On Wed, 14 Dec 2005, Thomas Gleixner wrote: > The current -hrt queue contains the removal patch of the list_head > already and you interrupted my attempt to send out the patch for -mm :) Ok. > > > The state field is not removed because I'm not a big fan of those > > > overloaded fields and I prefer to pay the 4 byte penalty for the > > > seperation. > > > Of course if there is the absolute requirement to reduce the size, I'm > > > not insisting on keeping it. > > > > Well, I'm not a big fan of redundant state information, e.g. the pending > > information can be included in the rb_node (it's not as quite simple as > > with the timer_list, but it's the same thing). > > I do not consider this as redundant information. It's a design decision > whether to use a state variable for state information and the rbnode for > rbtree handling or to overload the meaning of the rbnode with > information which is not the natural associated content. > > I'm well aware of those optimization and space saving tricks. I did > microcontroller programming long enough, but - and out of the experience > - I want to avoid it for new designs where ever it is possible for > clarity and extensibility reasons. It's not just about optimization tricks, it's about redundant information. Right now the primary function of the state is to tell whether the timer node is in the tree or now, so I prefer to add this information directly to the rbnode, similiar to what we do with normal lists. The other problem is that such "simple" state variables quickly become inadequate as the state machine becomes more complex, especially if multiple processes run at the same time (e.g. a timer can be "running" and/or "active"). So for me it's also for clarity and extensibility reasons, that I want to avoid overloaded state machines and rather keep it as simple as possible. > > The expired information > > (together with the data field) is an optimization for simple sleeps that > > AFAICT only makes a difference in the rt kernel (the saved context switch > > you mentioned above). What makes me more uncomfortable is that this is a > > special case optimization and other callbacks are probably fast as well > > (e.g. wakeup + timer restart). > > Not only in a -rt kernel, it also saves the context switch for a high > resolution timer enabled kernel, where you actually can execute the > callback in hard interrupt context. We can solve it differently, but we > should carefully think about the extensiblity issues. The wakeup + > restart szenario is a good reason to reconsider this. I don't think executing something in the soft or hard interrupt context makes a big difference on a normal kernel (at least I wouldn't call it a context switch). > > I can understand you want to keep the difference to the rt kernel small, > > but for me it's more about immediate benefits against uncertain long term > > benefits. > > If you have a clear target and the experience of having implemented the > extensions, you have to carefully weigh up the consequences of such > decisions. I'm not talking about "might be implemented by somebody > sometimes features", I'm talking about existing proof of concept > implementations. There is no real justification to ignore well known > consequences. > > Of course if you consider the possibility of including high resolution > timers (I'm not talking about -rt) as zero, your requests make sense. Thomas, please don't treat me like an idiot, you may have more experiance with hrtimer, but after working on it for a while I also know what I'm talking about. Please accept that I have different focus on this, I want to keep things as simple as possible. New features should stand on his own and this includes the complexity they add to the kernel. The new hrtimer especially cleans up greatly the posix timer stuff, this and keeping all other users working is my primary focus now. New features add new complexity and I want to see and evaluate it at the time it's added to the kernel, primarily to find solutions to avoid the (runtime) complexity for clocks which don't want to or can't support such high resolutions, so they don't have to pay the price for these new features. I want to keep things flexible and keeping things simple is IMO a much better starting point. > I disagree because I'm convinced that the problems "high res timers", > "dynamic ticks", "timekeeping", "clock event abstraction" are closely > related and we have high demands to get those solved in a clean way. So > providing some jiffies bound minimal solution in the first place is more > than shortsighted IMO. You're misunderstanding me, I don't want "some jiffies bound minimal solution", I want to solve one problem at a time and fixing the jiffies problem requires solving problems in the clock abstraction first, otherwise you produce a crutch which works in most cases, but leaves a few problem cases behind. > Goerge said explicitely: > > > My $0.02 worth: It is clear (from the standard) that the initial time > > is to be ABS_TIME. It is also clear that the interval is to be added > > to that time. IMHO then, the result should have the same property, > > i.e. ABS_TIME. > > I dont find a way to read out that the interval should not have the > ABSTIME property. That's not what you wrote earlier: "I agree, that the restriction to the initial it_value is definitely something you can read out of the spec." clock_settime says: "..., these time services shall expire when the requested relative interval elapses, independently of the new or old value of the clock." it_interval is a relative interval and otherwise the spec only talks about "an initial expiration time, again either relative or absolute," I can't really find a direct connection that TIMER_ABSTIME should apply to the interval as well. > > > > Since you don't do any rounding at all anymore, your timer may now expire > > > > early with low resolution clocks (the old jiffies + 1 problem I explained > > > > in my ktime_t patch). > > > > > > It does not expire early. The timer->expires field is still compared > > > against now. > > > > I don't think that's enough (unless I missed something). Steven maybe > > explained it better than I did in > > http://marc.theaimsgroup.com/?l=linux-kernel&m=113047529313935 > > Steven said: > > > Interesting though, I tried to force this scenario, by changing the > > base->get_time to return jiffies. I have a jitter test and ran this > > several times, and I could never get it to expire early. > [..] > Please elaborate how the timer can expire early. At this time you still did the rounding of the values, so it actually worked. When reading a time t from a clock with resolution r, the real time can be anything from t to t+r-1. Assuming it's currently t+r-1 and you try to set a relative timer to r-1, you will read t from the clock and arm the timer for t+r-1, which will cause the timer to expire at t+r, where it must not expire before t+r-1+r-1. It currently only works because latencies are usually larger than the clock resolution, but if I want to configure hrtimer with a low resolution clock, the problem can become visible. > > > > But this is now exactly the bevhaviour your timer has, why is not > > > > "surprising" anymore? > > > > > > Yes, we wrote that before. After reconsidering the results we came to > > > the conclusion, that we actually dont need the rounding at all because > > > the uneven distance is equally surprising as the summing up errors > > > introduced by rounding. > > > > I don't think that the summing up error is surprising at all, the spec is > > quite clear that the time values have to be rounded up to the resolution > > of the timer and it's also the behaviour of the current timer. > > No, the spec is not clear at all about this. > > I pointed this out before and I still think that the part of the > RATIONALE section is the key to this decision. > > "Also note that some implementations may choose to adjust time and/or > interval values to exactly match the ticks of the underlying clock" You basically use this sentence as loophole to take the whole resolution rounding rule ad absurdum and I disagree that this sentence means "ignore everything above and do whatever you want". > You decide to do the adjustment. I prefer not to do so and I dont buy > any argument which says, that the current behaviour is the yardstick for > everything. It can't be. Otherwise we would not be able to introduce > high resolution timers at all. Every application which relies on some > behaviour of the kernel which is not explicitely required by the > specification is broken by definition. The problem is that you force this behaviour also on other hrtimer users (itimers and nanosleep) and we should be very careful with such behaviour changes. My proposal keeps the current behaviour and is less likely to break anything. As I said before I'm not against changing the behaviour, but it should be done carefully. > I vote strongly against introducing private, special purpose APIs and I > consider CLOCK_XXX_HR as such. The proposed hrtimer solution does not > introduce any penalties for people who do not enable a future high > resolution extension. It gives us the benefit of a clean code base which > is capable to be switched simply and non intrusive to the high > resolution mode. We have done extensive tests on the impact of > converting all users unconditionally to high resolution mode once it is > switched on and the penalty is within the noise range. > > You are explicitely asking for increased complexity with your approach. Which in this case it would be good thing. Right now we don't have much choice in clock source, but that will change soon and I think it would be a good to be able to map a timer to specific clock source. The gained flexibility outweighs the required complexity greatly. > > > While my focus is a clean merging of high resolution timers without > > > breaking the non hrt case, I still believe that we can find a solution, > > > where we can have the base implementation without any reference to > > > jiffies. > > > > This is what I think requires the better clock abstraction, most of it is > > related to the clock resolution, the generic timer code currently has no > > idea of the real resolution of the underlying clock, so I assumed a worst > > case of TICK_NSEC everywhere. > > Well, can you please show where the current hrtimer implementation is > doing something which requires a better clock abstraction ? 1. clock resolution is unknown (see above). 2. reprogrammable timer interrupts. > The clock abstraction layer is relevant if you actuallly want to switch > to high resolution mode and until then the coarse interface is > sufficient. Right and until then it's also not really avoidable, that you find a few references to jiffies. I'm not saying that we have to keep them, but please only one step at a time. > > What I'd like to avoid is that we have to commit ourselves to only one > > solution right now. I think the first solution is better suited to the low > > resolution timer, that we have right now. The second solution requires a > > better clock framework - this includes better time keeping and > > reprogrammable timer interrupts. > > We have to choose one. Everything else is a bad compromise. There is > nothing worse than making no decision when you are at a point where a > decision is required. Do you really have so little trust in your own code, that we can't afford the flexibility and have to hardcode the timer resolution now? bye, Roman ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [patch 00/21] hrtimer - High-resolution timer subsystem 2005-12-19 14:50 ` Roman Zippel @ 2005-12-19 22:05 ` Thomas Gleixner 0 siblings, 0 replies; 52+ messages in thread From: Thomas Gleixner @ 2005-12-19 22:05 UTC (permalink / raw) To: Roman Zippel; +Cc: linux-kernel, Andrew Morton, rostedt, johnstul, Ingo Molnar Hi, On Mon, 2005-12-19 at 15:50 +0100, Roman Zippel wrote: > It's not just about optimization tricks, it's about redundant information. > Right now the primary function of the state is to tell whether the timer > node is in the tree or now, so I prefer to add this information directly > to the rbnode, similiar to what we do with normal lists. > The other problem is that such "simple" state variables quickly become > inadequate as the state machine becomes more complex, especially if > multiple processes run at the same time (e.g. a timer can be "running" > and/or "active"). So for me it's also for clarity and extensibility > reasons, that I want to avoid overloaded state machines and rather keep > it as simple as possible. You want to avoid overloaded state machines and therefor overload a rbnode struct with state information ? Sorry, -ENOPARSE. > > Not only in a -rt kernel, it also saves the context switch for a high > > resolution timer enabled kernel, where you actually can execute the > > callback in hard interrupt context. We can solve it differently, but we > > should carefully think about the extensiblity issues. The wakeup + > > restart szenario is a good reason to reconsider this. > > I don't think executing something in the soft or hard interrupt context > makes a big difference on a normal kernel (at least I wouldn't call it a > context switch). Well. How would you call it then? Thread A runs hrtimer interrupt timer X is expired, softirq is woken up context switch to softirq softirq runs timer X callback is executed, thread B is woken up context switch to thread B versus Thread A runs hrtimer interrupt timer X is expired, callback is executed thread B is woken up context switch to thread B I still call it a context switch, because it is one, except for the case where the softirq is called in the interrupt return path, but also then we gain the advantage that we do not have to execute it. > > > I can understand you want to keep the difference to the rt kernel small, > > > but for me it's more about immediate benefits against uncertain long term > > > benefits. > > > > If you have a clear target and the experience of having implemented the > > extensions, you have to carefully weigh up the consequences of such > > decisions. I'm not talking about "might be implemented by somebody > > sometimes features", I'm talking about existing proof of concept > > implementations. There is no real justification to ignore well known > > consequences. > > > > Of course if you consider the possibility of including high resolution > > timers (I'm not talking about -rt) as zero, your requests make sense. > > Thomas, please don't treat me like an idiot, you may have more experiance > with hrtimer, but after working on it for a while I also know what I'm > talking about. Please accept that I have different focus on this, I want > to keep things as simple as possible. New features should stand on his > own and this includes the complexity they add to the kernel. The new > hrtimer especially cleans up greatly the posix timer stuff, this and > keeping all other users working is my primary focus now. The basic hrtimer patch without any addons does not introduce complexities and is simple and keeps everything working. Can you please elaborate the new features and complexities instead of repeating this over and over without pointing out exactly what and where it is? > New features add new complexity and I want to see and evaluate it at the > time it's added to the kernel, primarily to find solutions to avoid the > (runtime) complexity for clocks which don't want to or can't support such > high resolutions, so they don't have to pay the price for these new > features. I want to keep things flexible and keeping things simple is IMO > a much better starting point. The code is flexible to handle low resolution as well as a later high resolution extension. It does not introduce additional complexity to anything. Stop this prayer wheel argumentation and show exactly which complexitiy it introduces. > > I disagree because I'm convinced that the problems "high res timers", > > "dynamic ticks", "timekeeping", "clock event abstraction" are closely > > related and we have high demands to get those solved in a clean way. So > > providing some jiffies bound minimal solution in the first place is more > > than shortsighted IMO. > > You're misunderstanding me, I don't want "some jiffies bound minimal > solution", I want to solve one problem at a time and fixing the jiffies > problem requires solving problems in the clock abstraction first, > otherwise you produce a crutch which works in most cases, but leaves a few > problem cases behind. Which problem cases please ? > > Goerge said explicitely: > > > > > My $0.02 worth: It is clear (from the standard) that the initial time > > > is to be ABS_TIME. It is also clear that the interval is to be added > > > to that time. IMHO then, the result should have the same property, > > > i.e. ABS_TIME. > > > > I dont find a way to read out that the interval should not have the > > ABSTIME property. > > That's not what you wrote earlier: "I agree, that the restriction to the > initial it_value is definitely something you can read out of the spec." I was talking about Georges citiation and not about some random pieces of text cut out of the original context. I still dont find a way to interprete Georges writing in the way you did. > clock_settime says: "..., these time services shall expire when the > requested relative interval elapses, independently of the new or old value > of the clock." it_interval is a relative interval and otherwise the spec > only talks about "an initial expiration time, again either relative or > absolute," I can't really find a direct connection that TIMER_ABSTIME > should apply to the interval as well. timer_set says: "... The reload value of the timer is set to the value specified by the it_interval member of value. When a timer is armed with a non-zero it_interval, a periodic (or repetitive) timer is specified." I dont see a notion that an interval does remove the ABSTIME property from a timer which was set up with the ABSTIME flag set. And you are claiming not to change anything in order not to break anything. The current upstream code is keeping ABSTIME interval timers on the abslist, so why are you changing this at will for no real good reason. > > Please elaborate how the timer can expire early. > > At this time you still did the rounding of the values, so it actually > worked. > When reading a time t from a clock with resolution r, the real time can be > anything from t to t+r-1. Assuming it's currently t+r-1 and you try to set > a relative timer to r-1, you will read t from the clock and arm the timer > for t+r-1, which will cause the timer to expire at t+r, where it must not > expire before t+r-1+r-1. I dont see where you pull this from. At any given point the clock reads a value between two ticks. t(tickn) <= now < t(tickn+1), where t(tickn+1) - t(tickn) = resolution In any given case the interval is added to now. expiry = now + interval The expiry check is still if (expiry <= now) expire_timer() The softirq which handles the expiry is called every tick, which guarantees that the timer is always expired at or past the tick boundary, but never ever it can be expired early. > It currently only works because latencies are usually larger than the > clock resolution, but if I want to configure hrtimer with a low resolution > clock, the problem can become visible. Where is this configuration switch in the posted code ? The posted hrtimers code is low resolution. Show me a simple example code which makes this become visible. > > > > > But this is now exactly the bevhaviour your timer has, why is not > > > > > "surprising" anymore? > > > > > > > > Yes, we wrote that before. After reconsidering the results we came to > > > > the conclusion, that we actually dont need the rounding at all because > > > > the uneven distance is equally surprising as the summing up errors > > > > introduced by rounding. > > > > > > I don't think that the summing up error is surprising at all, the spec is > > > quite clear that the time values have to be rounded up to the resolution > > > of the timer and it's also the behaviour of the current timer. > > > > No, the spec is not clear at all about this. > > > > I pointed this out before and I still think that the part of the > > RATIONALE section is the key to this decision. > > > > "Also note that some implementations may choose to adjust time and/or > > interval values to exactly match the ticks of the underlying clock" > > You basically use this sentence as loophole to take the whole resolution > rounding rule ad absurdum and I disagree that this sentence means > "ignore everything above and do whatever you want". I do not whatever I want and you well know that. The rounding is still done on expiry time, which means the rounding happens on the tick boundary. "Time values that are between two consecutive non-negative integer multiples of the resolution of the specified timer will be rounded up to the larger multiple of the resolution. Quantization error will not cause the timer to expire earlier than the rounded time value. .... Also note that some implementations may choose to adjust time and/or interval values to exactly match the ticks of the underlying clock." Please tell me where my interpretation is violating the spec. It is different to your interpretation, thats all. > > You decide to do the adjustment. I prefer not to do so and I dont buy > > any argument which says, that the current behaviour is the yardstick for > > everything. It can't be. Otherwise we would not be able to introduce > > high resolution timers at all. Every application which relies on some > > behaviour of the kernel which is not explicitely required by the > > specification is broken by definition. > > The problem is that you force this behaviour also on other hrtimer users > (itimers and nanosleep) and we should be very careful with such behaviour > changes. My proposal keeps the current behaviour and is less likely to > break anything. As I said before I'm not against changing the behaviour, > but it should be done carefully. > > > I vote strongly against introducing private, special purpose APIs and I > > consider CLOCK_XXX_HR as such. The proposed hrtimer solution does not > > introduce any penalties for people who do not enable a future high > > resolution extension. It gives us the benefit of a clean code base which > > is capable to be switched simply and non intrusive to the high > > resolution mode. We have done extensive tests on the impact of > > converting all users unconditionally to high resolution mode once it is > > switched on and the penalty is within the noise range. > > > > You are explicitely asking for increased complexity with your approach. > > Which in this case it would be good thing. Right now we don't have much > choice in clock source, but that will change soon and I think it would be > a good to be able to map a timer to specific clock source. The gained > flexibility outweighs the required complexity greatly. I really want to know what you are talking about. I have proven, that without clock abstraction improvements the current code is working without one single reference to jiffies. When clock abstractions are available the code will make use of them just by changing the functions which read the crurrent time. > > > > While my focus is a clean merging of high resolution timers without > > > > breaking the non hrt case, I still believe that we can find a solution, > > > > where we can have the base implementation without any reference to > > > > jiffies. > > > > > > This is what I think requires the better clock abstraction, most of it is > > > related to the clock resolution, the generic timer code currently has no > > > idea of the real resolution of the underlying clock, so I assumed a worst > > > case of TICK_NSEC everywhere. > > > > Well, can you please show where the current hrtimer implementation is > > doing something which requires a better clock abstraction ? > > 1. clock resolution is unknown (see above). > 2. reprogrammable timer interrupts. There is no single reference to a reprogrammable timer interrupt in the current hrtimer code which was posted and replaced the initial ktimer code. Please stop mixing up the high resolution timer implementation on top of hrtimers with the hrtimers base patch for a cheap argument. We have been there and I dont see a point to get back to this kind of discussion. > > The clock abstraction layer is relevant if you actuallly want to switch > > to high resolution mode and until then the coarse interface is > > sufficient. > > Right and until then it's also not really avoidable, that you find a few > references to jiffies. I'm not saying that we have to keep them, but > please only one step at a time. No, it is not necessary at all. Thats proven and it is one step at a time. If we can avoid jiffies in the first place why should we put them there ? What would we gain ? > > > What I'd like to avoid is that we have to commit ourselves to only one > > > solution right now. I think the first solution is better suited to the low > > > resolution timer, that we have right now. The second solution requires a > > > better clock framework - this includes better time keeping and > > > reprogrammable timer interrupts. > > > > We have to choose one. Everything else is a bad compromise. There is > > nothing worse than making no decision when you are at a point where a > > decision is required. > > Do you really have so little trust in your own code, that we can't afford > the flexibility and have to hardcode the timer resolution now? I trust my code, but you seem to trust jiffies more than simple math. What flexibility do you gain with your jiffies implementation ? The flexibility to add some replacement code which will look basically the same way as hrtimers are looking now ? tglx ^ permalink raw reply [flat|nested] 52+ messages in thread
end of thread, other threads:[~2005-12-22 4:31 UTC | newest] Thread overview: 52+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2005-12-13 12:45 [patch 00/21] hrtimer - High-resolution timer subsystem Nicolas Mailhot 2005-12-13 23:38 ` George Anzinger 2005-12-14 8:58 ` Kyle Moffett 2005-12-14 10:03 ` Nicolas Mailhot 2005-12-15 1:11 ` George Anzinger -- strict thread matches above, loose matches on Subject: below -- 2005-12-06 0:01 tglx 2005-12-06 17:32 ` Roman Zippel 2005-12-06 19:07 ` Ingo Molnar 2005-12-07 3:05 ` Roman Zippel 2005-12-08 5:18 ` Paul Jackson 2005-12-08 8:12 ` Ingo Molnar 2005-12-08 9:26 ` Ingo Molnar 2005-12-08 13:08 ` Roman Zippel 2005-12-08 15:36 ` Steven Rostedt 2005-12-06 22:10 ` Thomas Gleixner 2005-12-07 3:11 ` Roman Zippel 2005-12-06 22:28 ` Thomas Gleixner 2005-12-07 9:31 ` Andrew Morton 2005-12-07 10:11 ` Ingo Molnar 2005-12-07 10:20 ` Ingo Molnar 2005-12-07 10:23 ` Nick Piggin 2005-12-07 10:49 ` Ingo Molnar 2005-12-07 11:09 ` Nick Piggin 2005-12-07 11:33 ` Ingo Molnar 2005-12-07 11:40 ` Nick Piggin 2005-12-07 13:06 ` Roman Zippel 2005-12-07 12:40 ` Roman Zippel 2005-12-07 23:12 ` Nick Piggin 2005-12-07 12:18 ` Roman Zippel 2005-12-07 16:55 ` Ingo Molnar 2005-12-07 17:17 ` Roman Zippel 2005-12-07 17:57 ` Ingo Molnar 2005-12-07 18:18 ` Roman Zippel 2005-12-07 18:02 ` Paul Baxter 2005-12-09 17:23 ` Thomas Gleixner 2005-12-12 13:39 ` Roman Zippel 2005-12-12 16:42 ` Thomas Gleixner 2005-12-12 18:37 ` Thomas Gleixner 2005-12-13 1:25 ` George Anzinger 2005-12-13 9:18 ` Thomas Gleixner 2005-12-15 1:35 ` Roman Zippel 2005-12-15 2:29 ` George Anzinger 2005-12-19 14:56 ` Roman Zippel 2005-12-19 20:54 ` George Anzinger 2005-12-21 23:03 ` Roman Zippel 2005-12-22 4:30 ` George Anzinger 2005-12-14 20:48 ` Roman Zippel 2005-12-14 22:30 ` Thomas Gleixner 2005-12-15 0:55 ` George Anzinger 2005-12-15 14:18 ` Steven Rostedt 2005-12-19 14:50 ` Roman Zippel 2005-12-19 22:05 ` Thomas Gleixner
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).