* Re: [PATCH RFC] x86: hpet: Avoid the readback penalty
2010-09-15 13:11 ` [PATCH RFC] x86: hpet: Avoid the readback penalty Thomas Gleixner
@ 2010-09-15 13:43 ` Anders Larsen
2010-09-15 13:54 ` John Drescher
` (4 subsequent siblings)
5 siblings, 0 replies; 13+ messages in thread
From: Anders Larsen @ 2010-09-15 13:43 UTC (permalink / raw)
To: Thomas Gleixner
Cc: LKML, Ingo Molnar, H. Peter Anvin, Nix, Artur Skawina,
Damien Wyart, John Drescher, Venkatesh Pallipadi,
Arjan van de Ven, Andreas Herrmann, Borislav Petkov,
Suresh Siddha
On 2010-09-15 15:11:57, Thomas Gleixner wrote:
> - * We need to read back the CMP register on certain HPET
> - * implementations (ATI chipsets) which seem to delay the
> - * transfer of the compare register into the internal compare
> - * logic. With small deltas this might actually be too late as
> - * the counter could already be higher than the compare value
> - * at that point and we would wait for the next hpet interrupt
> - * forever. We found out that reading the CMP register back
> - * forces the transfer so we can rely on the comparison with
> - * the counter register below. If the read back from the
> - * compare register does not match the value we programmed
> - * then we might have a real hardware problem. We can not do
> - * much about it here, but at least alert the user/admin with
> - * a prominent warning.
> - *
> - * An erratum on some chipsets (ICH9,..), results in
> - * comparator read immediately following a write returning old
> - * value. Workaround for this is to read this value second
> - * time, when first read returns old value.
> - *
> - * In fact the write to the comparator register is delayed up
> - * to two HPET cycles so the workaround we tried to restrict
> - * the readback to those known to be borked ATI chipsets
> - * failed miserably. So we give up on optimizations forever
> - * and penalize all HPET incarnations unconditionally.
> + * HPETs are a complete disaster. The compare register is
> + * based on a equal comparison and does provide a less than or
s/does provide/does not provide/
> + * equal functionality (which would require to take the
> + * wraparound into account) and it does not provide a simple
> + * count down event mode. Further the write to the comparator
> + * register is delayed internaly up to two HPET clock cycles
s/internaly/internally/
> + * in certain chipsets (ATI, ICH9,10). We worked around that
> + * by reading back the compare register, but that required
> + * another workaround for ICH9,10 chips where the first
> + * readout after write can return the old stale value. We
> + * already have a minimum delta of 5us enforced, but a NMI or
> + * SMI hitting between the counter readout and the comparator
> + * write can move us behind that point easily. Now instead of
> + * reading the compare register back several times, we make
> + * the ETIME decision based on the following: Return ETIME if
> + * the counter value after the write is less than 8 HPET
> + * cycles away from the event or if the counter is already
> + * ahead of the event.
Cheers
Anders
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH RFC] x86: hpet: Avoid the readback penalty
2010-09-15 13:11 ` [PATCH RFC] x86: hpet: Avoid the readback penalty Thomas Gleixner
2010-09-15 13:43 ` Anders Larsen
@ 2010-09-15 13:54 ` John Drescher
[not found] ` <AANLkTikHfDUh3LdmyA5eMNACFPfi6pqGLFJiyzVm79X_@mail.gmail.com>
2010-09-15 14:40 ` Borislav Petkov
` (3 subsequent siblings)
5 siblings, 1 reply; 13+ messages in thread
From: John Drescher @ 2010-09-15 13:54 UTC (permalink / raw)
To: Thomas Gleixner
Cc: LKML, Ingo Molnar, H. Peter Anvin, Nix, Artur Skawina,
Damien Wyart, Venkatesh Pallipadi, Arjan van de Ven,
Andreas Herrmann, Borislav Petkov, Suresh Siddha
On Wed, Sep 15, 2010 at 9:11 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> On Tue, 14 Sep 2010, tip-bot for Thomas Gleixner wrote:
>> x86: hpet: Work around hardware stupidity
>
> After my brain recovered from yesterdays exposure with the x86 timer
> horror, I came up with a different solution for this problem, which
> avoids the readback of the compare register completely. It works
> nicely on my affected ATI system, but needs some exposure to the other
> machines.
>
> Comments ?
>
I can test this on my i7 machine in tonight (around 12 hours from now).
John
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH RFC] x86: hpet: Avoid the readback penalty
2010-09-15 13:11 ` [PATCH RFC] x86: hpet: Avoid the readback penalty Thomas Gleixner
2010-09-15 13:43 ` Anders Larsen
2010-09-15 13:54 ` John Drescher
@ 2010-09-15 14:40 ` Borislav Petkov
2010-09-15 14:42 ` Thomas Gleixner
2010-09-16 12:28 ` Borislav Petkov
2010-09-16 20:04 ` Nix
` (2 subsequent siblings)
5 siblings, 2 replies; 13+ messages in thread
From: Borislav Petkov @ 2010-09-15 14:40 UTC (permalink / raw)
To: Thomas Gleixner
Cc: LKML, Ingo Molnar, H. Peter Anvin, Nix, Artur Skawina,
Damien Wyart, John Drescher, Venkatesh Pallipadi,
Arjan van de Ven, Herrmann3, Andreas, Suresh Siddha
From: Thomas Gleixner <tglx@linutronix.de>
Date: Wed, Sep 15, 2010 at 09:11:57AM -0400
> On Tue, 14 Sep 2010, tip-bot for Thomas Gleixner wrote:
> > x86: hpet: Work around hardware stupidity
>
> After my brain recovered from yesterdays exposure with the x86 timer
> horror, I came up with a different solution for this problem, which
> avoids the readback of the compare register completely. It works
> nicely on my affected ATI system, but needs some exposure to the other
> machines.
Will run in on a couple of SBx00 machines I got here.
...
> If cmp is less than 8 HPET clock cycles, then we decide that the event
> has happened already and return -ETIME. That covers the above #1 and
> #2 problems which would cause a wait for HPET wraparound (~306
> seconds).
Make sense. I guess you're choosing a value of 8 just to be on the safe
side wrt to HPET clock cycles it takes to write the cmp register?
--
Regards/Gruss,
Boris.
Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Gemeinde Aschheim, Landkreis Muenchen
Registergericht Muenchen, HRB Nr. 43632
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH RFC] x86: hpet: Avoid the readback penalty
2010-09-15 14:40 ` Borislav Petkov
@ 2010-09-15 14:42 ` Thomas Gleixner
2010-09-16 12:28 ` Borislav Petkov
1 sibling, 0 replies; 13+ messages in thread
From: Thomas Gleixner @ 2010-09-15 14:42 UTC (permalink / raw)
To: Borislav Petkov
Cc: LKML, Ingo Molnar, H. Peter Anvin, Nix, Artur Skawina,
Damien Wyart, John Drescher, Venkatesh Pallipadi,
Arjan van de Ven, Herrmann3, Andreas, Suresh Siddha
On Wed, 15 Sep 2010, Borislav Petkov wrote:
> From: Thomas Gleixner <tglx@linutronix.de>
> Date: Wed, Sep 15, 2010 at 09:11:57AM -0400
>
> > On Tue, 14 Sep 2010, tip-bot for Thomas Gleixner wrote:
> > > x86: hpet: Work around hardware stupidity
> >
> > After my brain recovered from yesterdays exposure with the x86 timer
> > horror, I came up with a different solution for this problem, which
> > avoids the readback of the compare register completely. It works
> > nicely on my affected ATI system, but needs some exposure to the other
> > machines.
>
> Will run in on a couple of SBx00 machines I got here.
>
> ...
>
> > If cmp is less than 8 HPET clock cycles, then we decide that the event
> > has happened already and return -ETIME. That covers the above #1 and
> > #2 problems which would cause a wait for HPET wraparound (~306
> > seconds).
>
> Make sense. I guess you're choosing a value of 8 just to be on the safe
> side wrt to HPET clock cycles it takes to write the cmp register?
Yes, I do _NOT_ trust those hardware dudes at all. A factor 4 seems to
be an appropriate choice:)
Thanks,
tglx
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH RFC] x86: hpet: Avoid the readback penalty
2010-09-15 14:40 ` Borislav Petkov
2010-09-15 14:42 ` Thomas Gleixner
@ 2010-09-16 12:28 ` Borislav Petkov
1 sibling, 0 replies; 13+ messages in thread
From: Borislav Petkov @ 2010-09-16 12:28 UTC (permalink / raw)
To: Thomas Gleixner
Cc: LKML, Ingo Molnar, H. Peter Anvin, Nix, Artur Skawina,
Damien Wyart, John Drescher, Venkatesh Pallipadi,
Arjan van de Ven, Herrmann3, Andreas, Suresh Siddha
> > On Tue, 14 Sep 2010, tip-bot for Thomas Gleixner wrote:
> > > x86: hpet: Work around hardware stupidity
> >
> > After my brain recovered from yesterdays exposure with the x86 timer
> > horror, I came up with a different solution for this problem, which
> > avoids the readback of the compare register completely. It works
> > nicely on my affected ATI system, but needs some exposure to the other
> > machines.
>
> Will run in on a couple of SBx00 machines I got here.
FWIW, the patch doesn't break my SBx00 mini-farm here. Let me know if
you need more testing done, otherwise:
Tested-by: Borislav Petkov <borislav.petkov@amd.com>
--
Regards/Gruss,
Boris.
Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Gemeinde Aschheim, Landkreis Muenchen
Registergericht Muenchen, HRB Nr. 43632
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH RFC] x86: hpet: Avoid the readback penalty
2010-09-15 13:11 ` [PATCH RFC] x86: hpet: Avoid the readback penalty Thomas Gleixner
` (2 preceding siblings ...)
2010-09-15 14:40 ` Borislav Petkov
@ 2010-09-16 20:04 ` Nix
2010-09-18 7:49 ` Damien Wyart
2010-09-18 10:13 ` [tip:x86/timers] x86: Hpet: Avoid the comparator " tip-bot for Thomas Gleixner
5 siblings, 0 replies; 13+ messages in thread
From: Nix @ 2010-09-16 20:04 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: LKML, Neil Brown, linux-nfs
On 15 Sep 2010, Thomas Gleixner said:
> On Tue, 14 Sep 2010, tip-bot for Thomas Gleixner wrote:
>> x86: hpet: Work around hardware stupidity
>
> After my brain recovered from yesterdays exposure with the x86 timer
> horror, I came up with a different solution for this problem, which
> avoids the readback of the compare register completely. It works
> nicely on my affected ATI system, but needs some exposure to the other
> machines.
>
> Comments ?
Works for me. (I still have boot problems, but these seem to be
NFS-related: -ESTALE from an apparently random subset of my mount points
at initial mount time. Restarting rpc.mountd on the server fixes
it. This is with nfs-utils 1.2.2, but I see it right up to 60abb98, that
being the most recent version I tested. This doesn't happen if the
client is running 2.6.34.x, as far as I can tell. It also doesn't happen
with all my 2.6.35.x clients. Sigh. I hate bugs like this.)
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH RFC] x86: hpet: Avoid the readback penalty
2010-09-15 13:11 ` [PATCH RFC] x86: hpet: Avoid the readback penalty Thomas Gleixner
` (3 preceding siblings ...)
2010-09-16 20:04 ` Nix
@ 2010-09-18 7:49 ` Damien Wyart
2010-09-18 9:41 ` Thomas Gleixner
2010-09-18 13:43 ` Nix
2010-09-18 10:13 ` [tip:x86/timers] x86: Hpet: Avoid the comparator " tip-bot for Thomas Gleixner
5 siblings, 2 replies; 13+ messages in thread
From: Damien Wyart @ 2010-09-18 7:49 UTC (permalink / raw)
To: Thomas Gleixner
Cc: LKML, Ingo Molnar, H. Peter Anvin, Nix, Artur Skawina,
John Drescher, Venkatesh Pallipadi, Arjan van de Ven,
Andreas Herrmann, Borislav Petkov, Suresh Siddha
* Thomas Gleixner <tglx@linutronix.de> [2010-09-15 15:11]:
> > x86: hpet: Work around hardware stupidity
> After my brain recovered from yesterdays exposure with the x86 timer
> horror, I came up with a different solution for this problem, which
> avoids the readback of the compare register completely. It works
> nicely on my affected ATI system, but needs some exposure to the other
> machines.
Comments for this different solution seemed fine, but it seems only the
first one was commited into mainline and then stable. Is this intended?
Thanks,
Damien
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH RFC] x86: hpet: Avoid the readback penalty
2010-09-18 7:49 ` Damien Wyart
@ 2010-09-18 9:41 ` Thomas Gleixner
2010-09-18 13:43 ` Nix
1 sibling, 0 replies; 13+ messages in thread
From: Thomas Gleixner @ 2010-09-18 9:41 UTC (permalink / raw)
To: Damien Wyart
Cc: LKML, Ingo Molnar, H. Peter Anvin, Nix, Artur Skawina,
John Drescher, Venkatesh Pallipadi, Arjan van de Ven,
Andreas Herrmann, Borislav Petkov, Suresh Siddha
On Sat, 18 Sep 2010, Damien Wyart wrote:
> * Thomas Gleixner <tglx@linutronix.de> [2010-09-15 15:11]:
> > > x86: hpet: Work around hardware stupidity
>
> > After my brain recovered from yesterdays exposure with the x86 timer
> > horror, I came up with a different solution for this problem, which
> > avoids the readback of the compare register completely. It works
> > nicely on my affected ATI system, but needs some exposure to the other
> > machines.
>
> Comments for this different solution seemed fine, but it seems only the
> first one was commited into mainline and then stable. Is this intended?
Yes. I wanted to have confirmation for the second solution (also from
the HW folks). I'm queueing it for .37
Thanks,
tglx
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH RFC] x86: hpet: Avoid the readback penalty
2010-09-18 7:49 ` Damien Wyart
2010-09-18 9:41 ` Thomas Gleixner
@ 2010-09-18 13:43 ` Nix
1 sibling, 0 replies; 13+ messages in thread
From: Nix @ 2010-09-18 13:43 UTC (permalink / raw)
To: Damien Wyart
Cc: Thomas Gleixner, LKML, Ingo Molnar, H. Peter Anvin,
Artur Skawina, John Drescher, Venkatesh Pallipadi,
Arjan van de Ven, Andreas Herrmann, Borislav Petkov,
Suresh Siddha
On 18 Sep 2010, Damien Wyart uttered the following:
> * Thomas Gleixner <tglx@linutronix.de> [2010-09-15 15:11]:
>> > x86: hpet: Work around hardware stupidity
>
>> After my brain recovered from yesterdays exposure with the x86 timer
>> horror, I came up with a different solution for this problem, which
>> avoids the readback of the compare register completely. It works
>> nicely on my affected ATI system, but needs some exposure to the other
>> machines.
>
> Comments for this different solution seemed fine, but it seems only the
> first one was commited into mainline and then stable. Is this intended?
I assumed he was letting people shake the bugs out rather than inflict
what is basically just a performance improvement on -stable. For the
record: no bugs here. I also know why one of my machines was unaffected,
and it provides even more confirmation that this bug is correctly
identified, as if we needed more. It's not using the HPET at all:
Clock Event Device: cs5535-clockevt
Every single machine I owned with an HPET was affected by this bug: they
all work perfectly well now.
^ permalink raw reply [flat|nested] 13+ messages in thread
* [tip:x86/timers] x86: Hpet: Avoid the comparator readback penalty
2010-09-15 13:11 ` [PATCH RFC] x86: hpet: Avoid the readback penalty Thomas Gleixner
` (4 preceding siblings ...)
2010-09-18 7:49 ` Damien Wyart
@ 2010-09-18 10:13 ` tip-bot for Thomas Gleixner
5 siblings, 0 replies; 13+ messages in thread
From: tip-bot for Thomas Gleixner @ 2010-09-18 10:13 UTC (permalink / raw)
To: linux-tip-commits
Cc: linux-kernel, hpa, mingo, andreas.herrmann3, arjan, drescherjm,
art.08.09, damien.wyart, suresh.b.siddha, tglx, nix,
borislav.petkov, venki
Commit-ID: 995bd3bb5c78f3ff71339803c0b8337ed36d64fb
Gitweb: http://git.kernel.org/tip/995bd3bb5c78f3ff71339803c0b8337ed36d64fb
Author: Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Wed, 15 Sep 2010 15:11:57 +0200
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitDate: Sat, 18 Sep 2010 12:09:13 +0200
x86: Hpet: Avoid the comparator readback penalty
Due to the overly intelligent design of HPETs, we need to workaround
the problem that the compare value which we write is already behind
the actual counter value at the point where the value hits the real
compare register. This happens for two reasons:
1) We read out the counter, add the delta and write the result to the
compare register. When a NMI or SMI hits between the read out and
the write then the counter can be ahead of the event already
2) The write to the compare register is delayed by up to two HPET
cycles in certain chipsets.
We worked around this by reading back the compare register to make
sure that the written value has hit the hardware. For certain ICH9+
chipsets this can require two readouts, as the first one can return
the previous compare register value. That's bad performance wise for
the normal case where the event is far enough in the future.
As we already know that the write can be delayed by up to two cycles
we can avoid the read back of the compare register completely if we
make the decision whether the delta has elapsed already or not based
on the following calculation:
cmp = event - actual_count;
If cmp is less than 8 HPET clock cycles, then we decide that the event
has happened already and return -ETIME. That covers the above #1 and
#2 problems which would cause a wait for HPET wraparound (~306
seconds).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Nix <nix@esperi.org.uk>
Tested-by: Artur Skawina <art.08.09@gmail.com>
Cc: Damien Wyart <damien.wyart@free.fr>
Tested-by: John Drescher <drescherjm@gmail.com>
Cc: Venkatesh Pallipadi <venki@google.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
Tested-by: Borislav Petkov <borislav.petkov@amd.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <alpine.LFD.2.00.1009151500060.2416@localhost6.localdomain6>
---
arch/x86/kernel/hpet.c | 51 +++++++++++++++++++----------------------------
1 files changed, 21 insertions(+), 30 deletions(-)
diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
index 410fdb3..0b568b3 100644
--- a/arch/x86/kernel/hpet.c
+++ b/arch/x86/kernel/hpet.c
@@ -380,44 +380,35 @@ static int hpet_next_event(unsigned long delta,
struct clock_event_device *evt, int timer)
{
u32 cnt;
+ s32 res;
cnt = hpet_readl(HPET_COUNTER);
cnt += (u32) delta;
hpet_writel(cnt, HPET_Tn_CMP(timer));
/*
- * We need to read back the CMP register on certain HPET
- * implementations (ATI chipsets) which seem to delay the
- * transfer of the compare register into the internal compare
- * logic. With small deltas this might actually be too late as
- * the counter could already be higher than the compare value
- * at that point and we would wait for the next hpet interrupt
- * forever. We found out that reading the CMP register back
- * forces the transfer so we can rely on the comparison with
- * the counter register below. If the read back from the
- * compare register does not match the value we programmed
- * then we might have a real hardware problem. We can not do
- * much about it here, but at least alert the user/admin with
- * a prominent warning.
- *
- * An erratum on some chipsets (ICH9,..), results in
- * comparator read immediately following a write returning old
- * value. Workaround for this is to read this value second
- * time, when first read returns old value.
- *
- * In fact the write to the comparator register is delayed up
- * to two HPET cycles so the workaround we tried to restrict
- * the readback to those known to be borked ATI chipsets
- * failed miserably. So we give up on optimizations forever
- * and penalize all HPET incarnations unconditionally.
+ * HPETs are a complete disaster. The compare register is
+ * based on a equal comparison and neither provides a less
+ * than or equal functionality (which would require to take
+ * the wraparound into account) nor a simple count down event
+ * mode. Further the write to the comparator register is
+ * delayed internally up to two HPET clock cycles in certain
+ * chipsets (ATI, ICH9,10). We worked around that by reading
+ * back the compare register, but that required another
+ * workaround for ICH9,10 chips where the first readout after
+ * write can return the old stale value. We already have a
+ * minimum delta of 5us enforced, but a NMI or SMI hitting
+ * between the counter readout and the comparator write can
+ * move us behind that point easily. Now instead of reading
+ * the compare register back several times, we make the ETIME
+ * decision based on the following: Return ETIME if the
+ * counter value after the write is less than 8 HPET cycles
+ * away from the event or if the counter is already ahead of
+ * the event.
*/
- if (unlikely((u32)hpet_readl(HPET_Tn_CMP(timer)) != cnt)) {
- if (hpet_readl(HPET_Tn_CMP(timer)) != cnt)
- printk_once(KERN_WARNING
- "hpet: compare register read back failed.\n");
- }
+ res = (s32)(cnt - hpet_readl(HPET_COUNTER));
- return (s32)(hpet_readl(HPET_COUNTER) - cnt) >= 0 ? -ETIME : 0;
+ return res < 8 ? -ETIME : 0;
}
static void hpet_legacy_set_mode(enum clock_event_mode mode,
^ permalink raw reply related [flat|nested] 13+ messages in thread