* Re: [PATCH 9/11] Panic delay fix
2007-02-07 20:36 ` Rusty Russell
@ 2007-02-07 22:23 ` Zachary Amsden
2007-02-08 14:43 ` Dmitry Torokhov
` (2 subsequent siblings)
3 siblings, 0 replies; 31+ messages in thread
From: Zachary Amsden @ 2007-02-07 22:23 UTC (permalink / raw)
To: Rusty Russell
Cc: Pavel Machek, Andi Kleen, Linux Kernel Mailing List,
Andrew Morton, Jeremy Fitzhardinge, Chris Wright
Rusty Russell wrote:
> On Wed, 2007-02-07 at 12:35 +0000, Pavel Machek wrote:
>
>> Ugh, it sounds like paravirt is more b0rken then I thought. It should
>> always to the proper delay, then replace those udelays that are not
>> needed on virtualized hardware with something else.
>>
>> Just magically defining udelay into nop is broken.
>>
>
> We'd have to audit and figure out what udelays are for hardware and
> which are not, but the evidence is that the vast majority of them are
> for hardware and not needed for virtualization.
>
> Changing udelay to "hardware_udelay" or something all over the kernel
> would have delayed the paravirt_ops merge by an infinite amount 8)
>
Yes, so I chose the same approach used for b0rken hardware - #define
REALLY_SLOW_IO before including headers. It's ugly, but it works
without rewriting the entire source base.
Zach
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 9/11] Panic delay fix
2007-02-07 20:36 ` Rusty Russell
2007-02-07 22:23 ` Zachary Amsden
@ 2007-02-08 14:43 ` Dmitry Torokhov
2007-02-08 21:26 ` Zachary Amsden
2007-02-14 12:26 ` Pavel Machek
2007-02-14 12:52 ` Alan
3 siblings, 1 reply; 31+ messages in thread
From: Dmitry Torokhov @ 2007-02-08 14:43 UTC (permalink / raw)
To: Rusty Russell
Cc: Pavel Machek, Zachary Amsden, Andi Kleen,
Linux Kernel Mailing List, Andrew Morton, Jeremy Fitzhardinge,
Chris Wright
On 2/7/07, Rusty Russell <rusty@rustcorp.com.au> wrote:
> On Wed, 2007-02-07 at 12:35 +0000, Pavel Machek wrote:
> > Ugh, it sounds like paravirt is more b0rken then I thought. It should
> > always to the proper delay, then replace those udelays that are not
> > needed on virtualized hardware with something else.
> >
> > Just magically defining udelay into nop is broken.
>
> We'd have to audit and figure out what udelays are for hardware and
> which are not, but the evidence is that the vast majority of them are
> for hardware and not needed for virtualization.
>
> Changing udelay to "hardware_udelay" or something all over the kernel
> would have delayed the paravirt_ops merge by an infinite amount 8)
>
However I am not really fond of idea of adding constructs like this
all over the code:
#define USE_REAL_TIME_DELAY_I_REALLY_MEAN_IT_THIS_TIME_I_SWEAR
as the time passes... Drivers should be blissfully ignorant of being
run on virtual hardware.
--
Dmitry
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 9/11] Panic delay fix
2007-02-08 14:43 ` Dmitry Torokhov
@ 2007-02-08 21:26 ` Zachary Amsden
2007-02-08 21:37 ` Dmitry Torokhov
0 siblings, 1 reply; 31+ messages in thread
From: Zachary Amsden @ 2007-02-08 21:26 UTC (permalink / raw)
To: Dmitry Torokhov
Cc: Rusty Russell, Pavel Machek, Andi Kleen,
Linux Kernel Mailing List, Andrew Morton, Jeremy Fitzhardinge,
Chris Wright
Dmitry Torokhov wrote:
>
> However I am not really fond of idea of adding constructs like this
> all over the code:
>
> #define USE_REAL_TIME_DELAY_I_REALLY_MEAN_IT_THIS_TIME_I_SWEAR
>
> as the time passes... Drivers should be blissfully ignorant of being
> run on virtual hardware.
I agree in general, but there are two uses for this ugly construct. One
is for drivers and isolated sections of code which actually interact
with the real world. They need real time delays. The only two examples
so far are SMP coprocessor bootstrapping and blinking LEDs with a
recognizable frequency. I don't expect many more to show up.
The other use for this construct is compiling a hardware privileged
virtual domain that actually drives real hardware - you can just define
this globally to override any delay masking.
Zach
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 9/11] Panic delay fix
2007-02-08 21:26 ` Zachary Amsden
@ 2007-02-08 21:37 ` Dmitry Torokhov
0 siblings, 0 replies; 31+ messages in thread
From: Dmitry Torokhov @ 2007-02-08 21:37 UTC (permalink / raw)
To: Zachary Amsden
Cc: Rusty Russell, Pavel Machek, Andi Kleen,
Linux Kernel Mailing List, Andrew Morton, Jeremy Fitzhardinge,
Chris Wright
On 2/8/07, Zachary Amsden <zach@vmware.com> wrote:
> Dmitry Torokhov wrote:
> >
> > However I am not really fond of idea of adding constructs like this
> > all over the code:
> >
> > #define USE_REAL_TIME_DELAY_I_REALLY_MEAN_IT_THIS_TIME_I_SWEAR
> >
> > as the time passes... Drivers should be blissfully ignorant of being
> > run on virtual hardware.
>
> I agree in general, but there are two uses for this ugly construct. One
> is for drivers and isolated sections of code which actually interact
> with the real world. They need real time delays. The only two examples
> so far are SMP coprocessor bootstrapping and blinking LEDs with a
> recognizable frequency. I don't expect many more to show up.
>
I am pretty sure that using that construct inside of i8042 is wrong -
the host OS should handle hardware access/delay issues. Have you
verified that fixing kerne/panic.c to call i8042_panic_blink once per
1 ms (as it happens on native running kernel) does not produce the
desired blinking?
--
Dmitry
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 9/11] Panic delay fix
2007-02-07 20:36 ` Rusty Russell
2007-02-07 22:23 ` Zachary Amsden
2007-02-08 14:43 ` Dmitry Torokhov
@ 2007-02-14 12:26 ` Pavel Machek
2007-02-14 19:47 ` Zachary Amsden
2007-02-14 12:52 ` Alan
3 siblings, 1 reply; 31+ messages in thread
From: Pavel Machek @ 2007-02-14 12:26 UTC (permalink / raw)
To: Rusty Russell
Cc: Zachary Amsden, Andi Kleen, Linux Kernel Mailing List,
Andrew Morton, Jeremy Fitzhardinge, Chris Wright
On Thu 2007-02-08 07:36:12, Rusty Russell wrote:
> On Wed, 2007-02-07 at 12:35 +0000, Pavel Machek wrote:
> > Ugh, it sounds like paravirt is more b0rken then I thought. It should
> > always to the proper delay, then replace those udelays that are not
> > needed on virtualized hardware with something else.
> >
> > Just magically defining udelay into nop is broken.
>
> We'd have to audit and figure out what udelays are for hardware and
> which are not, but the evidence is that the vast majority of them are
> for hardware and not needed for virtualization.
You did not time to do the full audit, so you just did #define.
> Changing udelay to "hardware_udelay" or something all over the kernel
> would have delayed the paravirt_ops merge by an infinite amount 8)
And here you claim you could not do the right thing, because people
would notice you are doing huge search/replace without audit, and
would stop you. So you simply hidden it from them :-(.
Plus... udelay() should just work under virtualization, right? You get
slightly slower kernel, but still working, so the "full audit" is not
as hard as you are telling me.
Just replace udelay() with hardware_udelay() on places that matter in
your workload...
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 9/11] Panic delay fix
2007-02-14 12:26 ` Pavel Machek
@ 2007-02-14 19:47 ` Zachary Amsden
0 siblings, 0 replies; 31+ messages in thread
From: Zachary Amsden @ 2007-02-14 19:47 UTC (permalink / raw)
To: Pavel Machek
Cc: Rusty Russell, Andi Kleen, Linux Kernel Mailing List,
Andrew Morton, Jeremy Fitzhardinge, Chris Wright
Pavel Machek wrote:
> On Thu 2007-02-08 07:36:12, Rusty Russell wrote:
>
>> On Wed, 2007-02-07 at 12:35 +0000, Pavel Machek wrote:
>>
>>> Ugh, it sounds like paravirt is more b0rken then I thought. It should
>>> always to the proper delay, then replace those udelays that are not
>>> needed on virtualized hardware with something else.
>>>
>>> Just magically defining udelay into nop is broken.
>>>
>> We'd have to audit and figure out what udelays are for hardware and
>> which are not, but the evidence is that the vast majority of them are
>> for hardware and not needed for virtualization.
>>
>
> You did not time to do the full audit, so you just did #define.
>
Yes, of course. Since 99% of the drivers are completely irrelevant for
paravirt, and 99% of the udelays are in drivers, there isn't much point
to auditing a bunch of code we're not even going to be affected by. The
default case for udelay is it is not needed.
>> Changing udelay to "hardware_udelay" or something all over the kernel
>> would have delayed the paravirt_ops merge by an infinite amount 8)
>>
>
> And here you claim you could not do the right thing, because people
> would notice you are doing huge search/replace without audit, and
> would stop you. So you simply hidden it from them :-(.
>
What ludicrousness is this? Hidden what? That the default case for
udelay is that it is not needed?
> Plus... udelay() should just work under virtualization, right? You get
> slightly slower kernel, but still working, so the "full audit" is not
> as hard as you are telling me.
>
Save the time of doing a useless full audit and making sure we didn't
accidentally redefine or misspell some symbol on a bunch of
architectures we aren't even set up to compile for.
> Just replace udelay() with hardware_udelay() on places that matter in
> your workload...
>
That's inconsistent. We would be doing 2 SCSI drivers, part of the IDE
code, some i386 arch code, some random places in the kernel... and now
nobody else knows whether to use udelay or hardware_udelay and the code
gets jumbled to the point that it is useless because there is no clear
distinction between the two. It is non-trivial to come up with a list
of source files that we have to actually do this to. One C-file calls a
shared routine in a library, and now you've got a hidden udelay that you
have absolutely no way of detecting. The right thing to do if you want
to do it on a line by line basis is exactly the opposite. Remove udelay
and find out what breaks. Bugs are easier to find and fix than hidden
code. If I were to do it on a line by line basis, I would chose to
replace udelay() with real_time_udelay() for those places that actually
need it.
Zach
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 9/11] Panic delay fix
2007-02-07 20:36 ` Rusty Russell
` (2 preceding siblings ...)
2007-02-14 12:26 ` Pavel Machek
@ 2007-02-14 12:52 ` Alan
2007-02-14 20:04 ` Zachary Amsden
3 siblings, 1 reply; 31+ messages in thread
From: Alan @ 2007-02-14 12:52 UTC (permalink / raw)
To: Rusty Russell
Cc: Pavel Machek, Zachary Amsden, Andi Kleen,
Linux Kernel Mailing List, Andrew Morton, Jeremy Fitzhardinge,
Chris Wright
> We'd have to audit and figure out what udelays are for hardware and
> which are not, but the evidence is that the vast majority of them are
> for hardware and not needed for virtualization.
Which is irrelevant since the hardware drivers won't be used in a
virtualised environment with any kind of performance optimisation.
> Changing udelay to "hardware_udelay" or something all over the kernel
> would have delayed the paravirt_ops merge by an infinite amount 8)
paravirt_ops has no business fiddling with udelay. Not only does it
create more code bloat and stalls in relatively fast paths but its
optimising the wrong thing anyway.
My performance sucks -> optimise out udelay is the wrong approach. My
performance sucks, switch to the virtual block driver is the right
approach, and a virtual block driver won't be using udelay anyway
Alan
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 9/11] Panic delay fix
2007-02-14 12:52 ` Alan
@ 2007-02-14 20:04 ` Zachary Amsden
2007-02-14 21:34 ` Alan
2007-02-15 10:17 ` Pavel Machek
0 siblings, 2 replies; 31+ messages in thread
From: Zachary Amsden @ 2007-02-14 20:04 UTC (permalink / raw)
To: Alan
Cc: Rusty Russell, Pavel Machek, Andi Kleen,
Linux Kernel Mailing List, Andrew Morton, Jeremy Fitzhardinge,
Chris Wright
Alan wrote:
>> We'd have to audit and figure out what udelays are for hardware and
>> which are not, but the evidence is that the vast majority of them are
>> for hardware and not needed for virtualization.
>>
>
> Which is irrelevant since the hardware drivers won't be used in a
> virtualised environment with any kind of performance optimisation.
>
Which is why an audit is irrelevant for the most part. Note on the
performance below.
>> Changing udelay to "hardware_udelay" or something all over the kernel
>> would have delayed the paravirt_ops merge by an infinite amount 8)
>>
>
> paravirt_ops has no business fiddling with udelay. Not only does it
> create more code bloat and stalls in relatively fast paths but its
> optimising the wrong thing anyway.
>
??? I fail to see the code bloat and also the fast paths. Which fast
paths use udelay?
> My performance sucks -> optimise out udelay is the wrong approach. My
> performance sucks, switch to the virtual block driver is the right
> approach, and a virtual block driver won't be using udelay anyway
>
This is not to stop performance from sucking. It doesn't. This is not
an "approach". Sure, a virtual block driver won't be using udelay.
Everyone else who writes hypervisors writes virtual block drivers
because they don't have optimized I/O emulation for real hardware.
Their performance sucks without it because they have to go switch to
some other context and run a device emulator. Our doesn't. We have
optimized almost every I/O device we emulate. But sitting around
spinning in udelay is wasting everybody's time. There is an overhead
cost to trapping out on I/O instructions. Removing the udelays that
typically happen around I/O instructions causes the emulation to break even.
And that is a good thing. It's certainly not required, nor is it a
significant win while the kernel is running. It does cut the boot time
by a lot, and you will notice an obvious difference with a much faster
kernel boot simply because a lot of the hardware setup has very
conservative udelays which take a lot of time during device
initialization. Since boot time * number of reboots has a direct impact
on the number of 9's you can claim for uptime, this is actually a large
win for reliability.
Zach
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 9/11] Panic delay fix
2007-02-14 20:04 ` Zachary Amsden
@ 2007-02-14 21:34 ` Alan
2007-02-14 21:53 ` Zachary Amsden
2007-02-15 10:17 ` Pavel Machek
1 sibling, 1 reply; 31+ messages in thread
From: Alan @ 2007-02-14 21:34 UTC (permalink / raw)
To: Zachary Amsden
Cc: Rusty Russell, Pavel Machek, Andi Kleen,
Linux Kernel Mailing List, Andrew Morton, Jeremy Fitzhardinge,
Chris Wright
> ??? I fail to see the code bloat and also the fast paths. Which fast
> paths use udelay?
IDE on several platforms has performance critical paths that use
ndelay(400) or failing that udelay(1)
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 9/11] Panic delay fix
2007-02-14 21:34 ` Alan
@ 2007-02-14 21:53 ` Zachary Amsden
2007-02-15 0:33 ` Alan
0 siblings, 1 reply; 31+ messages in thread
From: Zachary Amsden @ 2007-02-14 21:53 UTC (permalink / raw)
To: Alan
Cc: Rusty Russell, Pavel Machek, Andi Kleen,
Linux Kernel Mailing List, Andrew Morton, Jeremy Fitzhardinge,
Chris Wright
Alan wrote:
>> ??? I fail to see the code bloat and also the fast paths. Which fast
>> paths use udelay?
>>
>
> IDE on several platforms has performance critical paths that use
> ndelay(400) or failing that udelay(1)
Ok, I buy that. A 486DX / 33 Mhz processor takes 10 cycles to issue a
CALL / RET pair. This is about 300ns. Is there an issue with being too
early to issue I/O operations or too late?
If it is too early, then we have lost 10 cycles of processor time per
udelay, which considering the antiquity of this piece of hardware, seems
acceptable.
If it is too late, then there could be a real issue on older machines.
But I fail to see how such careful timing can be done at this
granularity on such hardware without well tweaked assembly code. A
suboptimal compile output could easily throw you off by just as much,
sticking a little bit of arithmetic before and after the delay.
Zach
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 9/11] Panic delay fix
2007-02-14 21:53 ` Zachary Amsden
@ 2007-02-15 0:33 ` Alan
0 siblings, 0 replies; 31+ messages in thread
From: Alan @ 2007-02-15 0:33 UTC (permalink / raw)
To: Zachary Amsden
Cc: Rusty Russell, Pavel Machek, Andi Kleen,
Linux Kernel Mailing List, Andrew Morton, Jeremy Fitzhardinge,
Chris Wright
On Wed, 14 Feb 2007 13:53:08 -0800
Zachary Amsden <zach@vmware.com> wrote:
> > IDE on several platforms has performance critical paths that use
> > ndelay(400) or failing that udelay(1)
>
> Ok, I buy that. A 486DX / 33 Mhz processor takes 10 cycles to issue a
> CALL / RET pair. This is about 300ns. Is there an issue with being too
> early to issue I/O operations or too late?
Too early you lose, too late you just waste clock time.
> But I fail to see how such careful timing can be done at this
> granularity on such hardware without well tweaked assembly code.
Thats what is used most platforms use udelay(1) in fact however
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 9/11] Panic delay fix
2007-02-14 20:04 ` Zachary Amsden
2007-02-14 21:34 ` Alan
@ 2007-02-15 10:17 ` Pavel Machek
2007-02-15 23:42 ` Zachary Amsden
1 sibling, 1 reply; 31+ messages in thread
From: Pavel Machek @ 2007-02-15 10:17 UTC (permalink / raw)
To: Zachary Amsden
Cc: Alan, Rusty Russell, Andi Kleen, Linux Kernel Mailing List,
Andrew Morton, Jeremy Fitzhardinge, Chris Wright
Hi!
> >>We'd have to audit and figure out what udelays are for hardware and
> >>which are not, but the evidence is that the vast majority of them are
> >>for hardware and not needed for virtualization.
> >>
> >
> >Which is irrelevant since the hardware drivers won't be used in a
> >virtualised environment with any kind of performance optimisation.
> >
>
> Which is why an audit is irrelevant for the most part. Note on the
> performance below.
You know it is ugly. Alan demonstrated it even hurts performance, but
being ugly is the main problem.
If you _need_ to avoid udelay() in some cases, introduce
udelay_unless_virtualized(), and switch few users to it. Just globaly
defining udelay to nop is _ugly_.
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 9/11] Panic delay fix
2007-02-15 10:17 ` Pavel Machek
@ 2007-02-15 23:42 ` Zachary Amsden
2007-02-15 23:49 ` Pavel Machek
` (2 more replies)
0 siblings, 3 replies; 31+ messages in thread
From: Zachary Amsden @ 2007-02-15 23:42 UTC (permalink / raw)
To: Pavel Machek
Cc: Alan, Rusty Russell, Andi Kleen, Linux Kernel Mailing List,
Andrew Morton, Jeremy Fitzhardinge, Chris Wright
Pavel Machek wrote:
>
> You know it is ugly. Alan demonstrated it even hurts performance, but
> being ugly is the main problem.
>
No argument with that. Well, we're ok with dropping it. Actually,
reverting the entire set of udelay changes now seems wise. The same bug
that happened with i8042 can happen with any hardware device driven by
the VM in passthrough mode - potentially USB or IDE CDROM or direct SCSI.
Since that is per-device and not global, a global udelay disable really
is broken in that case, and recompiling individual C files or modules
for passthrough vs. non-passthrough is not the answer.
So Rusty, Chris, Jeremy, any objections to killing udelay() and friends
in paravirt-ops? It would simplify things a bit. The only thing we
lose is a slightly faster boot time in the 100% emulated device case.
I'm ok with losing that. Even the PIT fast paths don't use udelay, so I
don't think we care for runtime performance at all.
Zach
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 9/11] Panic delay fix
2007-02-15 23:42 ` Zachary Amsden
@ 2007-02-15 23:49 ` Pavel Machek
2007-02-15 23:50 ` Jeremy Fitzhardinge
2007-02-16 3:22 ` Rusty Russell
2 siblings, 0 replies; 31+ messages in thread
From: Pavel Machek @ 2007-02-15 23:49 UTC (permalink / raw)
To: Zachary Amsden
Cc: Alan, Rusty Russell, Andi Kleen, Linux Kernel Mailing List,
Andrew Morton, Jeremy Fitzhardinge, Chris Wright
Hi!
> >You know it is ugly. Alan demonstrated it even hurts performance, but
> >being ugly is the main problem.
> >
>
> No argument with that. Well, we're ok with dropping it. Actually,
> reverting the entire set of udelay changes now seems wise. The same
>bug
Good, thanks.
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 9/11] Panic delay fix
2007-02-15 23:42 ` Zachary Amsden
2007-02-15 23:49 ` Pavel Machek
@ 2007-02-15 23:50 ` Jeremy Fitzhardinge
2007-02-16 3:22 ` Rusty Russell
2 siblings, 0 replies; 31+ messages in thread
From: Jeremy Fitzhardinge @ 2007-02-15 23:50 UTC (permalink / raw)
To: Zachary Amsden
Cc: Pavel Machek, Alan, Rusty Russell, Andi Kleen,
Linux Kernel Mailing List, Andrew Morton, Chris Wright
Zachary Amsden wrote:
> So Rusty, Chris, Jeremy, any objections to killing udelay() and
> friends in paravirt-ops? It would simplify things a bit. The only
> thing we lose is a slightly faster boot time in the 100% emulated
> device case. I'm ok with losing that. Even the PIT fast paths don't
> use udelay, so I don't think we care for runtime performance at all.
Fine with me. That always seemed a bit warty to me.
J
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 9/11] Panic delay fix
2007-02-15 23:42 ` Zachary Amsden
2007-02-15 23:49 ` Pavel Machek
2007-02-15 23:50 ` Jeremy Fitzhardinge
@ 2007-02-16 3:22 ` Rusty Russell
2 siblings, 0 replies; 31+ messages in thread
From: Rusty Russell @ 2007-02-16 3:22 UTC (permalink / raw)
To: Zachary Amsden
Cc: Pavel Machek, Alan, Andi Kleen, Linux Kernel Mailing List,
Andrew Morton, Jeremy Fitzhardinge, Chris Wright
On Thu, 2007-02-15 at 15:42 -0800, Zachary Amsden wrote:
> So Rusty, Chris, Jeremy, any objections to killing udelay() and friends
> in paravirt-ops? It would simplify things a bit.
I agree.
Thanks!
Rusty.
^ permalink raw reply [flat|nested] 31+ messages in thread