linux-kernel.vger.kernel.org archive mirror
* AMD64 X2 lost ticks on PM timer
@ 2006-02-27 21:22 bubshait
  2006-02-27 22:21 ` Bill Rugolsky Jr.
  0 siblings, 1 reply; 60+ messages in thread
From: bubshait @ 2006-02-27 21:22 UTC (permalink / raw)
  To: linux-kernel

I have been suffering from lost ticks for several months, ever since I 
switched to an X86_64 kernel with SMP. I have read previous posts about lost 
ticks due to the TSC timer, but my problem is different: I have only been 
using 2.6.14 and then 2.6.15 kernels, and only PM timers. Also, this 
problem does not manifest itself immediately. I can use the system for 
anywhere from a few hours to a couple of days without any problems before I 
start losing ticks, but once it starts the system loses them constantly and my 
desktop becomes unstable (some apps crash, while others take 5 minutes to 
start up; this is what led me to discover the lost ticks), forcing me to 
reboot. The following error appears in dmesg at the time the system starts to 
act strangely:

	Losing some ticks... checking if CPU frequency changed.
	warning: many lost ticks.
	Your time source seems to be instable or some driver is hogging interupts
	rip __do_softirq+0x47/0xd1

Adding report_lost_ticks only prints repeated messages like

	Lost 3 timer tick(s)! rip __do_softirq+0x47/0xd1

I have tried using acipmaintimer and acippmtimer; the system boots fine, but I 
notice the following in dmesg:

	..MP-BIOS bug: 8254 timer not connected to IO-APIC
	timer doesn't work through the IO-APIC - disabling NMI Watchdog!
	Uhhuh. NMI received for unknown reason 3d.

And I still end up with lost ticks eventually. Using acpi=off causes the 
entire system to slow to a crawl (I am guessing this is due to the PM timer). 
For the life of me I can't figure out what causes these lost ticks to 
start, but when they do, /proc/interrupts shows a drop from roughly 1000 
interrupts/sec to around 700, and this persists until I reboot.
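
For anyone wanting to measure that rate directly, here is a minimal Python sketch. It is not from the original post. It assumes the /proc/interrupts layout shown later in this thread (a header row of CPU names, then one row per IRQ ending in the device name), and the function names are made up for illustration.

```python
def timer_interrupts(text):
    """Return the timer interrupt count summed over all CPUs.

    `text` is the full contents of /proc/interrupts.
    """
    lines = text.splitlines()
    ncpus = len(lines[0].split())  # header row lists CPU0, CPU1, ...
    for line in lines[1:]:
        fields = line.split()
        # The timer row ends with the word "timer"; columns 1..ncpus are counts.
        if fields and fields[-1] == "timer":
            return sum(int(f) for f in fields[1:1 + ncpus])
    raise ValueError("no timer line in input")


def timer_rate(before, after, seconds):
    """Interrupts per second between two snapshots taken `seconds` apart."""
    return (timer_interrupts(after) - timer_interrupts(before)) / seconds
```

To use it, read /proc/interrupts twice a few seconds apart, pass both snapshots and the interval to timer_rate(), and compare the result against the roughly 1000/s and 700/s figures reported above.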

My hardware is an AMD64 X2 4800+ on an ASUS A8N-SLI.

Also, could I please be CC'ed on any replies? I don't mean to be rude by not 
subscribing, but I couldn't handle the volume.

Thanks,
Abdulla Bubshait

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: AMD64 X2 lost ticks on PM timer
  2006-02-27 21:22 AMD64 X2 lost ticks on PM timer bubshait
@ 2006-02-27 22:21 ` Bill Rugolsky Jr.
  2006-02-27 22:47   ` Jason Baron
  0 siblings, 1 reply; 60+ messages in thread
From: Bill Rugolsky Jr. @ 2006-02-27 22:21 UTC (permalink / raw)
  To: bubshait; +Cc: linux-kernel

On Tue, Feb 28, 2006 at 12:22:40AM +0300, bubshait wrote:
> 	Losing some ticks... checking if CPU frequency changed.
> 	warning: many lost ticks.
> 	Your time source seems to be instable or some driver is hogging interupts
> 	rip __do_softirq+0x47/0xd1
> 
> adding report_lost_ticks only prints repeating messages like
> 
> 	Lost 3 timer tick(s)! rip __do_softirq+0x47/0xd1

I'm seeing tons of these on a Tyan 2895 (Nvidia CK804) running FC4 with
kernel-2.6.15-1.1830 (2.6.15.2) SMP: 

time.c: Lost 1 timer tick(s)! rip default_idle+0x37/0x7a)
time.c: Lost 2 timer tick(s)! rip __do_softirq+0x55/0xd4)

[I've seen the same thing with earlier FC 2.6.14 kernels.]

On our systems the __do_softirq messages are strongly correlated with
sata_nv interrupts, especially during our nightly tripwire-like fs
checksum job.  Unfortunately, the log messages are not very informative.
I'm not sure what ever happened to the following patch,

http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.5/2.5.64/2.5.64-mm3/broken-out/report-lost-ticks.patch

but it was dropped.

Unfortunately, I need to spend tomorrow patching kernels in search of a
fix or workaround, as I have to start using these boxes in production,
and they need to keep time.

Regards,

	Bill Rugolsky

* Re: AMD64 X2 lost ticks on PM timer
  2006-02-27 22:21 ` Bill Rugolsky Jr.
@ 2006-02-27 22:47   ` Jason Baron
  2006-02-28  7:41     ` Abdulla Bubshait
  2006-02-28 21:17     ` Abdulla Bubshait
  0 siblings, 2 replies; 60+ messages in thread
From: Jason Baron @ 2006-02-27 22:47 UTC (permalink / raw)
  To: Bill Rugolsky Jr.; +Cc: bubshait, linux-kernel


On Mon, 27 Feb 2006, Bill Rugolsky Jr. wrote:

> On Tue, Feb 28, 2006 at 12:22:40AM +0300, bubshait wrote:
> > 	Losing some ticks... checking if CPU frequency changed.
> > 	warning: many lost ticks.
> > 	Your time source seems to be instable or some driver is hogging interupts
> > 	rip __do_softirq+0x47/0xd1
> > 
> > adding report_lost_ticks only prints repeating messages like
> > 
> > 	Lost 3 timer tick(s)! rip __do_softirq+0x47/0xd1
> 
> I'm seeing tons of these on a Tyan 2895 (Nvidia CKO4) running FC4 with
> kernel-2.6.15-1.1830 (2.6.15.2) SMP: 
> 
> time.c: Lost 1 timer tick(s)! rip default_idle+0x37/0x7a)
> time.c: Lost 2 timer tick(s)! rip __do_softirq+0x55/0xd4)
> 
> [I've seen the same thing with earlier FC 2.6.14 kernels.]
> 
> On our systems the __do_softirq messages are strongly correlated with
> sata_nv interrupts, especially during our nightly tripwire-like fs
> checksum job.  Unfortunately, the log messages are not very informative.
> I'm not sure what ever happened to the following patch,
> 
> http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.5/2.5.64/2.5.64-mm3/broken-out/report-lost-ticks.patch
> 
> but it was dropped.
> 
> Unfortunately, I need to spend tomorrow patching kernels in search of a
> fix or workaround, as I have to start using these boxes in production,
> and they need to keep time.
> 

passing 'nohpet' and/or 'nopmtimer' will force the use of a different 
timer...but this is certainly a workaround, if it helps...



* Re: AMD64 X2 lost ticks on PM timer
  2006-02-27 22:47   ` Jason Baron
@ 2006-02-28  7:41     ` Abdulla Bubshait
  2006-02-28 22:00       ` Bill Rugolsky Jr.
  2006-02-28 21:17     ` Abdulla Bubshait
  1 sibling, 1 reply; 60+ messages in thread
From: Abdulla Bubshait @ 2006-02-28  7:41 UTC (permalink / raw)
  To: Jason Baron; +Cc: Bill Rugolsky Jr., linux-kernel

On Tuesday 28 February 2006 01:47, Jason Baron wrote:
> On Mon, 27 Feb 2006, Bill Rugolsky Jr. wrote:
> > On Tue, Feb 28, 2006 at 12:22:40AM +0300, bubshait wrote:
> > > 	Losing some ticks... checking if CPU frequency changed.
> > > 	warning: many lost ticks.
> > > 	Your time source seems to be instable or some driver is hogging interupts
> > > 	rip __do_softirq+0x47/0xd1
> > >
> > > adding report_lost_ticks only prints repeating messages like
> > >
> > > 	Lost 3 timer tick(s)! rip __do_softirq+0x47/0xd1
> >
> > I'm seeing tons of these on a Tyan 2895 (Nvidia CKO4) running FC4 with
> > kernel-2.6.15-1.1830 (2.6.15.2) SMP:
> >
> > time.c: Lost 1 timer tick(s)! rip default_idle+0x37/0x7a)
> > time.c: Lost 2 timer tick(s)! rip __do_softirq+0x55/0xd4)
> >
> > [I've seen the same thing with earlier FC 2.6.14 kernels.]
> >
> > On our systems the __do_softirq messages are strongly correlated with
> > sata_nv interrupts, especially during our nightly tripwire-like fs
> > checksum job.  Unfortunately, the log messages are not very informative.
> > I'm not sure what ever happened to the following patch,
> >
> > > http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.5/2.5.64/2.5.64-mm3/broken-out/report-lost-ticks.patch
> >
> > but it was dropped.
> >
> > Unfortunately, I need to spend tomorrow patching kernels in search of a
> > fix or workaround, as I have to start using these boxes in production,
> > and they need to keep time.
>
> passing 'nohpet' and/or 'nopmtimer' will force the use of a different
> timer...but this is certainly a workaround, if it helps...

Unfortunately, I can't seem to find a way to force it to use hpet. Passing 
'notsc' and 'nopmtimer', I end up using PIT/TSC based timekeeping. TSC is 
already known to have problems with dual core. But I will sit with it for a 
while to see if it fares better than the pm timer.

Bill, what timer do you use, and do these lost ticks persist after sata_nv 
interrupts stop?

* Re: AMD64 X2 lost ticks on PM timer
  2006-02-27 22:47   ` Jason Baron
  2006-02-28  7:41     ` Abdulla Bubshait
@ 2006-02-28 21:17     ` Abdulla Bubshait
  1 sibling, 0 replies; 60+ messages in thread
From: Abdulla Bubshait @ 2006-02-28 21:17 UTC (permalink / raw)
  To: linux-kernel

On Tuesday 28 February 2006 01:47, you wrote:
> On Mon, 27 Feb 2006, Bill Rugolsky Jr. wrote:
> > On Tue, Feb 28, 2006 at 12:22:40AM +0300, bubshait wrote:
> > > 	Losing some ticks... checking if CPU frequency changed.
> > > 	warning: many lost ticks.
> > > 	Your time source seems to be instable or some driver is hogging interupts
> > > 	rip __do_softirq+0x47/0xd1
> > >
> > > adding report_lost_ticks only prints repeating messages like
> > >
> > > 	Lost 3 timer tick(s)! rip __do_softirq+0x47/0xd1
> >
> > I'm seeing tons of these on a Tyan 2895 (Nvidia CKO4) running FC4 with
> > kernel-2.6.15-1.1830 (2.6.15.2) SMP:
> >
> > time.c: Lost 1 timer tick(s)! rip default_idle+0x37/0x7a)
> > time.c: Lost 2 timer tick(s)! rip __do_softirq+0x55/0xd4)
> >
> > [I've seen the same thing with earlier FC 2.6.14 kernels.]
> >
> > On our systems the __do_softirq messages are strongly correlated with
> > sata_nv interrupts, especially during our nightly tripwire-like fs
> > checksum job.  Unfortunately, the log messages are not very informative.
> > I'm not sure what ever happened to the following patch,
> >
> > > http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.5/2.5.64/2.5.64-mm3/broken-out/report-lost-ticks.patch
> >
> > but it was dropped.
> >
> > Unfortunately, I need to spend tomorrow patching kernels in search of a
> > fix or workaround, as I have to start using these boxes in production,
> > and they need to keep time.
>
> passing 'nohpet' and/or 'nopmtimer' will force the use of a different
> timer...but this is certainly a workaround, if it helps...

The TSC timer is useless and the kernel can't seem to find the hpet. Unfortunately, 
nosmp doesn't even boot, so that workaround is blocked too.

Having gone through this problem some more, I have to agree that it 
could be the sata_nv interrupts that are throwing off the time, but what I 
don't understand is how the pmtimer would persist at this new 
interrupt rate of 700/s even after sata_nv interrupts drop off.

* Re: AMD64 X2 lost ticks on PM timer
  2006-02-28  7:41     ` Abdulla Bubshait
@ 2006-02-28 22:00       ` Bill Rugolsky Jr.
  2006-02-28 23:53         ` Andi Kleen
  0 siblings, 1 reply; 60+ messages in thread
From: Bill Rugolsky Jr. @ 2006-02-28 22:00 UTC (permalink / raw)
  To: Abdulla Bubshait; +Cc: Jason Baron, linux-kernel

On Tue, Feb 28, 2006 at 10:41:27AM +0300, Abdulla Bubshait wrote:
> Unfortunately, I can't seem to find a way to force it to use hpet. Passing 
> 'notsc' and 'nopmtimer' I end up using PIT/TSC based timekeeping. TSC is 
> already known to have problems with dual core. But I will sit with it for a 
> while to see if it fares better than the pm timer.
> 
> Bill, What timer do you use, and do these lost ticks persist after sata_nv 
> interrupts stop?

Sorry for the late reply.  I'm using pmtimer (the default).  I
get lost ticks reported mostly in default_idle and __do_softirq.

The machine is running PostgreSQL, so the Lost tick messages occur
throughout the day, but they come frequently during our nightly cron
jobs that do rsyncs, checksums, etc. So far today:

rugolsky@ti88: awk '/Feb 28.*Lost.*timer/{n++;sum+=$8};END{printf "%d messages, %d lost ticks\n",n,sum}' /var/log/messages
487 messages, 588 lost ticks

And this month:

rugolsky@ti88: awk '/Feb .*Lost.*timer/{n++;sum+=$8};END{printf "%d messages, %d lost ticks\n",n,sum}' /var/log/messages*
19051 messages, 23794 lost ticks
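
The awk one-liners above depend on the tick count being the eighth whitespace-separated field of each syslog line. As a hedged alternative, here is a small Python sketch of the same tally that matches the message text itself. The function name and the optional date prefix argument are illustrative, not part of any existing tool.

```python
import re

# Matches the kernel's "Lost N timer tick(s)!" message anywhere in a line.
LOST = re.compile(r"Lost (\d+) timer tick\(s\)!")


def summarize(lines, prefix=""):
    """Return (message_count, total_lost_ticks) for matching log lines.

    If `prefix` is given (for example "Feb 28"), only lines starting with
    it are counted, mirroring the date filter in the awk pattern.
    """
    messages = ticks = 0
    for line in lines:
        if prefix and not line.startswith(prefix):
            continue
        m = LOST.search(line)
        if m:
            messages += 1
            ticks += int(m.group(1))
    return messages, ticks
```

Called with the lines of /var/log/messages, it returns the two numbers the awk version prints.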

I got another test machine up and running today, so I can start patching and
testing tomorrow.

	Bill Rugolsky

* Re: AMD64 X2 lost ticks on PM timer
  2006-02-28 22:00       ` Bill Rugolsky Jr.
@ 2006-02-28 23:53         ` Andi Kleen
  2006-03-01 14:46           ` Bill Rugolsky Jr.
  0 siblings, 1 reply; 60+ messages in thread
From: Andi Kleen @ 2006-02-28 23:53 UTC (permalink / raw)
  To: Bill Rugolsky Jr.; +Cc: Jason Baron, linux-kernel

"Bill Rugolsky Jr." <brugolsky@telemetry-investments.com> writes:
> 
> The machine is running PostgreSQL, so the Lost tick messages occur
> throughout the day, but they come frequently during our nightly cron
> jobs that do rsyncs, checksums, etc. So far today:

What chipset?

> I got another test machine up and running today, so I can start patching and
> testing tomorrow.

What output do you get when you run ftp.suse.com:/pub/people/ak/tools/trtc.c ?
(and what is the _HZ value you configured in Kconfig?)

Does it go away when you run with idle=poll?

-Andi

* Re: AMD64 X2 lost ticks on PM timer
  2006-02-28 23:53         ` Andi Kleen
@ 2006-03-01 14:46           ` Bill Rugolsky Jr.
  2006-03-01 14:56             ` Andi Kleen
  0 siblings, 1 reply; 60+ messages in thread
From: Bill Rugolsky Jr. @ 2006-03-01 14:46 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Jason Baron, linux-kernel

On Wed, Mar 01, 2006 at 12:53:58AM +0100, Andi Kleen wrote:
> What chipset?

Thanks for the interest, Andi.
 
The chipset is NVIDIA nForce Pro 2200 (CK804).  The mobo is Tyan 2895:

  http://www.tyan.com/products/html/thunderk8we.html

It's running the current 1.02 version of the BIOS.

Current kernel is the FC4 errata:

  Linux ti94 2.6.15-1.1831_FC4smp #1 SMP Tue Feb 7 13:51:52 EST 2006 x86_64 x86_64 x86_64 GNU/Linux

  rugolsky@ti94: rpm -q --changelog kernel-smp-2.6.15-1.1831_FC4 | head -3
  * Tue Feb 07 2006 Dave Jones <davej@redhat.com>
  - 2.6.15.3
    Fixes remotely exploitable bug in ICMP (CVE-2006-0454)

Powernow-k8 is built into the kernel (see log messages below),
but I have it turned off in the BIOS config.

Some time today I'll build vanilla 2.6.15.4 and 2.6.16-rc5.

> What output do you get when you run ftp.suse.com:/pub/people/ak/tools/trtc.c ?
> (and what is the _HZ value you configured in Kconfig?)

  rugolsky@ti94: grep CONFIG_HZ /usr/src/kernels/2.6.15-1.1831_FC4smp-x86_64/.config 
  # CONFIG_HZ_100 is not set
  CONFIG_HZ_250=y
  # CONFIG_HZ_1000 is not set
  CONFIG_HZ=250

Below is the trtc output while running "find /usr -type f |  cpio -o > /dev/null"
without and with idle=poll.

> Does it go away when you run with idle=poll?

No.

Here's some output of trtc without idle=poll:

1141220165:151240: rtc 464 int 0 125 (=125)
1141220165:651241: rtc 448 int 0 125 (=125)
1141220166:151242: rtc 464 int 0 125 (=125)
1141220166:651244: rtc 448 int 0 125 (=125)
1141220167:151245: rtc 464 int 0 125 (=125)
1141220167:651245: rtc 448 int 0 125 (=125)
1141220168:155246: rtc 464 int 0 125 (=125)
1141220168:655250: rtc 448 int 22 103 (=125)
1141220169:155280: rtc 464 int 125 0 (=125)
1141220169:655251: rtc 448 int 125 0 (=125)
1141220170:155251: rtc 464 int 125 0 (=125)
1141220170:655252: rtc 448 int 125 0 (=125)
1141220171:155253: rtc 464 int 125 0 (=125)
1141220171:655253: rtc 448 int 125 0 (=125)
1141220172:155288: rtc 464 int 125 0 (=125)
1141220172:655256: rtc 448 int 125 0 (=125)
1141220173:155258: rtc 464 int 125 0 (=125)
1141220173:655258: rtc 448 int 125 0 (=125)
1141220174:155259: rtc 464 int 125 0 (=125)
1141220174:655260: rtc 448 int 125 0 (=125)
1141220175:155262: rtc 464 int 125 0 (=125)
1141220175:655262: rtc 448 int 125 0 (=125)
1141220176:155263: rtc 464 int 125 0 (=125)
1141220176:655265: rtc 448 int 125 0 (=125)
1141220177:159266: rtc 464 int 125 0 (=125)
1141220177:659268: rtc 448 int 125 0 (=125)
1141220178:159268: rtc 464 int 125 0 (=125)
1141220178:659274: rtc 448 int 104 21 (=125)
1141220179:159272: rtc 464 int 0 125 (=125)
1141220179:659270: rtc 448 int 0 125 (=125)
1141220180:159272: rtc 464 int 0 125 (=125)
1141220180:659273: rtc 448 int 0 125 (=125)
1141220181:159274: rtc 464 int 0 125 (=125)
1141220181:659275: rtc 448 int 0 125 (=125)
1141220182:159276: rtc 464 int 0 125 (=125)
1141220182:659277: rtc 448 int 0 125 (=125)
1141220183:159283: rtc 464 int 0 125 (=125)
1141220183:659279: rtc 448 int 0 125 (=125)
1141220184:163288: rtc 464 int 0 123 (=123)  <-----
1141220184:663281: rtc 448 int 0 125 (=125)
1141220185:163283: rtc 464 int 0 125 (=125)
1141220185:667283: rtc 448 int 0 125 (=125)
1141220186:167285: rtc 464 int 0 125 (=125)
1141220186:667285: rtc 448 int 0 125 (=125)
1141220187:167289: rtc 464 int 0 125 (=125)
1141220187:667288: rtc 448 int 0 125 (=125)
1141220188:167289: rtc 464 int 0 125 (=125)
1141220188:667292: rtc 448 int 22 103 (=125)
1141220189:167291: rtc 464 int 125 0 (=125)
1141220189:667291: rtc 448 int 125 0 (=125)
1141220190:167292: rtc 464 int 125 0 (=125)
1141220190:667293: rtc 448 int 125 0 (=125)
1141220191:167292: rtc 464 int 125 0 (=125)
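
To pick out the marked intervals automatically, here is a rough Python sketch. It rests on an interpretation of the trtc output that is an assumption, not something taken from trtc.c: each line is read as "seconds:microseconds: rtc <count> int <per-CPU timer counts> (=<total>)", sampled every half second, so with CONFIG_HZ=250 a healthy interval should total 125. The function name is hypothetical.

```python
import re

# Assumed line shape: "<sec>:<usec>: rtc <n> int <cpu0> <cpu1> ... (=<total>)"
LINE = re.compile(r"^(\d+):(\d+): rtc (\d+) int ((?:\d+ )+)\(=(\d+)\)")


def short_intervals(lines, hz=250, period=0.5):
    """Return (seconds, per_cpu_counts, missing) for intervals that fall short.

    `hz` and `period` are assumptions about the kernel tick rate and the
    sampling interval; together they give the expected total per line.
    """
    expected = round(hz * period)
    flagged = []
    for line in lines:
        m = LINE.match(line.strip())
        if not m:
            continue  # skip anything that does not look like a trtc line
        per_cpu = [int(x) for x in m.group(4).split()]
        total = int(m.group(5))
        if total != expected:
            flagged.append((int(m.group(1)), per_cpu, expected - total))
    return flagged
```

On the output above it should flag only the interval marked with the arrow, reporting 2 missing timer interrupts.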

Kernel log highlights:

Kernel command line: ro root=/dev/md2 report_lost_ticks
...
time.c: Using 3.579545 MHz PM timer.
time.c: Detected 2009.284 MHz processor.
...
Using local APIC timer interrupts.
Detected 12.558 MHz APIC timer.
time.c: Lost 11 timer tick(s)! rip setup_boot_APIC_clock+0x117/0x11a)
Booting processor 1/2 APIC 0x1
Initializing CPU#1
Calibrating delay using timer specific routine.. 4018.82 BogoMIPS (lpj=8037654)
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 1024K (64 bytes/line)
CPU 1(1) -> Node 0 -> Core 0
AMD Opteron(tm) Processor 246 stepping 0a
CPU 1: Syncing TSC to CPU 0.
CPU 1: synchronized TSC with CPU 0 (last diff -124 cycles, maxerr 1093 cycles)
Brought up 2 CPUs
Disabling vsyscall due to use of PM timer
time.c: Using PM based timekeeping.
testing NMI watchdog ... <4>time.c: Lost 17 timer tick(s)! rip __delay+0xa/0x10)
OK.
powernow-k8: Found 2 AMD Athlon 64 / Opteron processors (version 1.50.4)
powernow-k8: MP systems not supported by PSB BIOS structure
powernow-k8: MP systems not supported by PSB BIOS structure
...
time.c: Lost 1 timer tick(s)! rip __do_softirq+0x55/0xd4)
time.c: Lost 1 timer tick(s)! rip __do_softirq+0x55/0xd4)
time.c: Lost 1 timer tick(s)! rip __do_softirq+0x55/0xd4)
time.c: Lost 1 timer tick(s)! rip default_idle+0x37/0x7a)
time.c: Lost 1 timer tick(s)! rip default_idle+0x37/0x7a)
time.c: Lost 2 timer tick(s)! rip default_idle+0x37/0x7a)
time.c: Lost 1 timer tick(s)! rip default_idle+0x37/0x7a)
time.c: Lost 2 timer tick(s)! rip default_idle+0x37/0x7a)
time.c: Lost 1 timer tick(s)! rip default_idle+0x37/0x7a)
time.c: Lost 1 timer tick(s)! rip default_idle+0x37/0x7a)
time.c: Lost 1 timer tick(s)! rip default_idle+0x37/0x7a)


Here's the output of trtc with idle=poll:

1141221151:371869: rtc 464 int 125 0 (=125)
1141221151:871857: rtc 448 int 125 0 (=125)
1141221152:371845: rtc 464 int 125 0 (=125)
1141221152:871833: rtc 448 int 125 0 (=125)
1141221153:371820: rtc 464 int 125 0 (=125)
1141221153:871808: rtc 448 int 125 0 (=125)
1141221154:371796: rtc 464 int 125 0 (=125)
1141221154:871784: rtc 448 int 125 0 (=125)
1141221155:371771: rtc 464 int 71 54 (=125)
1141221155:871759: rtc 448 int 0 125 (=125)
1141221156:371745: rtc 464 int 0 125 (=125)
1141221156:871734: rtc 448 int 0 125 (=125)
1141221157:371721: rtc 464 int 0 125 (=125)
1141221157:871709: rtc 448 int 0 125 (=125)
1141221158:371696: rtc 464 int 0 125 (=125)
1141221158:871685: rtc 448 int 0 125 (=125)
1141221159:371672: rtc 464 int 0 125 (=125)
1141221159:871660: rtc 448 int 0 125 (=125)
1141221160:371648: rtc 464 int 0 125 (=125)
1141221160:871635: rtc 448 int 0 125 (=125)
1141221161:371622: rtc 464 int 0 125 (=125)
1141221161:871610: rtc 448 int 0 125 (=125)
1141221162:371599: rtc 464 int 0 125 (=125)
1141221162:871586: rtc 448 int 0 125 (=125)
1141221163:371573: rtc 464 int 0 125 (=125)
1141221163:871561: rtc 448 int 0 125 (=125)
1141221164:371549: rtc 464 int 0 125 (=125)
1141221164:871537: rtc 448 int 0 125 (=125)
1141221165:371526: rtc 464 int 53 72 (=125)
1141221165:871510: rtc 448 int 125 0 (=125)
1141221166:371502: rtc 464 int 125 0 (=125)
1141221166:871488: rtc 448 int 125 0 (=125)
1141221167:371476: rtc 464 int 125 0 (=125)
1141221167:871471: rtc 448 int 125 0 (=125)
1141221168:371451: rtc 464 int 125 0 (=125)
1141221168:871439: rtc 448 int 125 0 (=125)
1141221169:371427: rtc 464 int 125 0 (=125)
1141221169:871415: rtc 448 int 125 0 (=125)
1141221170:371402: rtc 464 int 125 0 (=125)
1141221170:871390: rtc 448 int 125 0 (=125)
1141221171:371377: rtc 464 int 125 0 (=125)
1141221171:871365: rtc 448 int 125 0 (=125)
1141221172:371352: rtc 464 int 125 0 (=125)
1141221172:875382: rtc 448 int 123 0 (=123)  <-----
1141221173:375328: rtc 464 int 125 0 (=125)
1141221173:875340: rtc 448 int 125 0 (=125)

Kernel log highlights (idle=poll):

Kernel command line: ro root=/dev/md2 report_lost_ticks idle=poll
using polling idle threads.
...
time.c: Using 3.579545 MHz PM timer.
time.c: Detected 2009.264 MHz processor.
...
Using local APIC timer interrupts.
Detected 12.557 MHz APIC timer.
time.c: Lost 11 timer tick(s)! rip setup_boot_APIC_clock+0x117/0x11a)
Booting processor 1/2 APIC 0x1
Initializing CPU#1
Calibrating delay using timer specific routine.. 4018.82 BogoMIPS (lpj=8037642)
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 1024K (64 bytes/line)
CPU 1(1) -> Node 0 -> Core 0
AMD Opteron(tm) Processor 246 stepping 0a
CPU 1: Syncing TSC to CPU 0.
CPU 1: synchronized TSC with CPU 0 (last diff -129 cycles, maxerr 1112 cycles)
Brought up 2 CPUs
Disabling vsyscall due to use of PM timer
time.c: Using PM based timekeeping.
testing NMI watchdog ... <4>time.c: Lost 29 timer tick(s)! rip __delay+0x8/0x10)
OK.
...
powernow-k8: Found 2 AMD Athlon 64 / Opteron processors (version 1.50.4)
powernow-k8: MP systems not supported by PSB BIOS structure
powernow-k8: MP systems not supported by PSB BIOS structure
...
time.c: Lost 3 timer tick(s)! rip poll_idle+0x14/0x19)
time.c: Lost 3 timer tick(s)! rip poll_idle+0x14/0x19)
time.c: Lost 1 timer tick(s)! rip poll_idle+0x14/0x19)
time.c: Lost 2 timer tick(s)! rip poll_idle+0xa/0x19)
time.c: Lost 1 timer tick(s)! rip poll_idle+0x14/0x19)
time.c: Lost 1 timer tick(s)! rip poll_idle+0x14/0x19)
time.c: Lost 1 timer tick(s)! rip poll_idle+0x14/0x19)
time.c: Lost 1 timer tick(s)! rip poll_idle+0x14/0x19)
time.c: Lost 1 timer tick(s)! rip poll_idle+0x14/0x19)
time.c: Lost 1 timer tick(s)! rip poll_idle+0x14/0x19)
time.c: Lost 1 timer tick(s)! rip poll_idle+0xa/0x19)
time.c: Lost 1 timer tick(s)! rip poll_idle+0x14/0x19)
time.c: Lost 1 timer tick(s)! rip poll_idle+0xa/0x19)
time.c: Lost 1 timer tick(s)! rip poll_idle+0x14/0x19)
time.c: Lost 1 timer tick(s)! rip poll_idle+0xa/0x19)
time.c: Lost 1 timer tick(s)! rip poll_idle+0x14/0x19)
time.c: Lost 2 timer tick(s)! rip poll_idle+0x14/0x19)
time.c: Lost 1 timer tick(s)! rip poll_idle+0x14/0x19)
time.c: Lost 1 timer tick(s)! rip poll_idle+0xa/0x19)
time.c: Lost 1 timer tick(s)! rip poll_idle+0x14/0x19)
time.c: Lost 3 timer tick(s)! rip poll_idle+0xa/0x19)
Losing some ticks... checking if CPU frequency changed.
time.c: Lost 1 timer tick(s)! rip poll_idle+0x14/0x19)
time.c: Lost 1 timer tick(s)! rip poll_idle+0xa/0x19)
time.c: Lost 1 timer tick(s)! rip poll_idle+0x14/0x19)
time.c: Lost 1 timer tick(s)! rip poll_idle+0xa/0x19)
time.c: Lost 1 timer tick(s)! rip poll_idle+0x14/0x19)
time.c: Lost 1 timer tick(s)! rip poll_idle+0xa/0x19)
time.c: Lost 1 timer tick(s)! rip poll_idle+0x14/0x19)
time.c: Lost 2 timer tick(s)! rip poll_idle+0x14/0x19)
time.c: Lost 3 timer tick(s)! rip poll_idle+0x14/0x19)
time.c: Lost 1 timer tick(s)! rip poll_idle+0x14/0x19)
time.c: Lost 2 timer tick(s)! rip poll_idle+0xa/0x19)
time.c: Lost 1 timer tick(s)! rip poll_idle+0x14/0x19)
time.c: Lost 3 timer tick(s)! rip poll_idle+0xa/0x19)
time.c: Lost 1 timer tick(s)! rip poll_idle+0xa/0x19)
time.c: Lost 2 timer tick(s)! rip poll_idle+0x14/0x19)
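
Since the interesting part of these messages is where the instruction pointer was when the loss was detected, a per-symbol tally can make the two runs easier to compare. The following Python sketch is one way to do that; the function name is made up, and it assumes every message has the "Lost N timer tick(s)! rip symbol+offset" shape seen in the logs above.

```python
import re
from collections import Counter

# Captures the tick count and the symbol name that follows "rip".
RIP = re.compile(r"Lost (\d+) timer tick\(s\)! rip ([A-Za-z_]\w*)\+")


def ticks_by_symbol(lines):
    """Return a Counter mapping kernel symbol -> total lost ticks."""
    totals = Counter()
    for line in lines:
        m = RIP.search(line)
        if m:
            totals[m.group(2)] += int(m.group(1))
    return totals
```

Feeding it each kernel log and calling most_common() on the result gives a quick view of which symbols dominate with and without idle=poll.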


Thanks.

	Bill

* Re: AMD64 X2 lost ticks on PM timer
  2006-03-01 14:46           ` Bill Rugolsky Jr.
@ 2006-03-01 14:56             ` Andi Kleen
  2006-03-01 15:43               ` Bill Rugolsky Jr.
  0 siblings, 1 reply; 60+ messages in thread
From: Andi Kleen @ 2006-03-01 14:56 UTC (permalink / raw)
  To: Bill Rugolsky Jr.; +Cc: Jason Baron, linux-kernel

On Wednesday 01 March 2006 15:46, Bill Rugolsky Jr. wrote:
> On Wed, Mar 01, 2006 at 12:53:58AM +0100, Andi Kleen wrote:
> > What chipset?
> 
> Thanks for the interest, Andi.
>  
> The chipset is NVIDIA nForce Pro 2200 (CK804).  The mobo is Tyan 2895:

I have such a system sitting next to me and it doesn't show any such symptoms.
I normally don't let it run unrebooted over days though.

I would suspect some driver.
Do you use any special addin cards? What modules are you using?

>   http://www.tyan.com/products/html/thunderk8we.html
> 
> It's running the current 1.02 version of the BIOS.

My BIOS is

 Version: 2004Q3
 Release Date: 06/07/2005

(which is self-contradicting, but oh well) 


> Current kernel is the FC4 errata:

I don't run these kernels though - only mainline.

> 1141220165:151240: rtc 464 int 0 125 (=125)
...

Looks all ok. Your timer interrupts are ticking correctly.

> time.c: Lost 3 timer tick(s)! rip poll_idle+0x14/0x19)

Ok then it's not C1.

-Andi

* Re: AMD64 X2 lost ticks on PM timer
  2006-03-01 14:56             ` Andi Kleen
@ 2006-03-01 15:43               ` Bill Rugolsky Jr.
  2006-03-01 15:47                 ` Andi Kleen
  2006-03-02 15:47                 ` Gabor Gombas
  0 siblings, 2 replies; 60+ messages in thread
From: Bill Rugolsky Jr. @ 2006-03-01 15:43 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Jason Baron, linux-kernel

On Wed, Mar 01, 2006 at 03:56:09PM +0100, Andi Kleen wrote:
> > Thanks for the interest, Andi.
> >  
> > The chipset is NVIDIA nForce Pro 2200 (CK804).  The mobo is Tyan 2895:
> 
> I have such a system sitting next to me and it doesn't show any such symptoms.
> I normally don't let it run unrebooted over days though.

These lost ticks are reproducible in a few minutes with cpio.

> I would suspect some driver.
> Do you use any special addin cards? What modules are you using?

My guess is the sata_nv driver, as it happens during heavy local file reads.
The machines all have 2-4 SATA WD Raptors connected to the mobo.

> I don't run these kernels though - only mainline.
 
I wouldn't expect you to be running a Fedora kernel. :-p
I usually roll my own, but I've been really backed up with other tasks.

As I said, I'll build some mainline kernels.  I want to apply
some of Ingo's debugging patches and give John Stultz's new timekeeping
code a try anyway.

rugolsky@ti94: cat /proc/interrupts 
           CPU0       CPU1       
  0:      28474     577049    IO-APIC-edge  timer
  1:          8          0    IO-APIC-edge  i8042
  7:          2          0    IO-APIC-edge  parport0
  8:         40          0    IO-APIC-edge  rtc
  9:          0          0   IO-APIC-level  acpi
 14:         36      21506    IO-APIC-edge  ide0
 50:          3          0   IO-APIC-level  ohci1394
201:          0          0   IO-APIC-level  libata, ehci_hcd:usb1
209:       1876      15495   IO-APIC-level  libata, ohci_hcd:usb2
217:    2483567          0   IO-APIC-level  eth0
233:          0          0   IO-APIC-level  NVidia CK804
NMI:         87         41 
LOC:     605510     605486 
ERR:          0
MIS:          0

Lots-o-modules; I'll have to whittle these down.

rugolsky@ti94: lsmod
Module                  Size  Used by
parport_pc             65581  1 
lp                     49025  0 
parport                77261  2 parport_pc,lp
nfs                   275745  4 
lockd                 107601  2 nfs
nfs_acl                37185  1 nfs
sunrpc                210041  4 nfs,lockd,nfs_acl
8021q                  57041  0 
video                  52553  0 
button                 41185  0 
battery                44233  0 
ac                     38985  0 
ohci1394               71457  0 
ieee1394              407641  1 ohci1394
ohci_hcd               57565  0 
ehci_hcd               70477  0 
i2c_nforce2            41409  0 
i2c_core               59457  1 i2c_nforce2
snd_intel8x0           70889  0 
snd_ac97_codec        146045  1 snd_intel8x0
snd_ac97_bus           36033  1 snd_ac97_codec
snd_seq_dummy          37445  0 
snd_seq_oss            71973  0 
snd_seq_midi_event     42177  1 snd_seq_oss
snd_seq                99225  5 snd_seq_dummy,snd_seq_oss,snd_seq_midi_event
snd_seq_device         43857  3 snd_seq_dummy,snd_seq_oss,snd_seq
snd_pcm_oss            93297  0 
snd_mixer_oss          52673  1 snd_pcm_oss
snd_pcm               139593  3 snd_intel8x0,snd_ac97_codec,snd_pcm_oss
snd_timer              62025  2 snd_seq,snd_pcm
snd                   103073  9 snd_intel8x0,snd_ac97_codec,snd_seq_oss,snd_seq,snd_seq_device,snd_pcm_oss,snd_mixer_oss,snd_pcm,snd_timer
soundcore              45025  1 snd
snd_page_alloc         46289  2 snd_intel8x0,snd_pcm
forcedeth              60869  0 
floppy                107993  0 
ext3                  179665  5 
jbd                   100073  1 ext3
raid1                  56385  4 
dm_mod                 98697  4 
sata_nv                44101  8 
libata                 98265  1 sata_nv
sd_mod                 53697  10 
scsi_mod              195321  2 libata,sd_mod

Thanks.

	-Bill

* Re: AMD64 X2 lost ticks on PM timer
  2006-03-01 15:43               ` Bill Rugolsky Jr.
@ 2006-03-01 15:47                 ` Andi Kleen
  2006-03-01 18:07                   ` Bill Rugolsky Jr.
  2006-03-02 15:47                 ` Gabor Gombas
  1 sibling, 1 reply; 60+ messages in thread
From: Andi Kleen @ 2006-03-01 15:47 UTC (permalink / raw)
  To: Bill Rugolsky Jr.; +Cc: Jason Baron, linux-kernel

On Wednesday 01 March 2006 16:43, Bill Rugolsky Jr. wrote:

> > I would suspect some driver.
> > Do you use any special addin cards? What modules are you using?
> 
> My guess is the sata_nv driver, as it happens during heavy local file read.
> The machines all have 2-4 SATA WD Raptors connected to the mobo.

Are you accessing all these disks in parallel with that cpio? If 
yes could you try it with only a single disk? 

My box only has a single SATA disk. Maybe there is some 
corner case in that SATA driver that leaks interrupt state
and it's only turned on later by idle or softirq then.

-Andi

* Re: AMD64 X2 lost ticks on PM timer
  2006-03-01 15:47                 ` Andi Kleen
@ 2006-03-01 18:07                   ` Bill Rugolsky Jr.
  2006-03-01 18:29                     ` Andi Kleen
  0 siblings, 1 reply; 60+ messages in thread
From: Bill Rugolsky Jr. @ 2006-03-01 18:07 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Jason Baron, linux-kernel

On Wed, Mar 01, 2006 at 04:47:33PM +0100, Andi Kleen wrote:
> > My guess is the sata_nv driver, as it happens during heavy local file read.
> > The machines all have 2-4 SATA WD Raptors connected to the mobo.
> 
> Are you accessing all these disks in parallel with that cpio? If 
> yes could you try it with only a single disk? 
 
Yes, all of the hosts are LVM2 over MD RAID1.  The PostgreSQL 
LV has striping over the two MD RAID1 PVs.

> My box only has a single SATA disk. Maybe there is some 
> corner case in that SATA driver that leaks interrupt state
> and it's only turned on later by idle or softirq then.

Good call!  Stressing one disk results in no lost ticks.

I stuck a spare disk in one of the workstations that has its system
partitions on Ext3/LVM2/MD-RAID1, and then copied the 9GB /usr to
a raw Ext3 partition on the new disk:

    find usr | cpio -pdum /opt

That resulted in:

Mar  1 11:39:27 ti94 kernel: time.c: Lost 1 timer tick(s)! rip __do_softirq+0x55/0xd4)
Mar  1 11:39:41 ti94 kernel: time.c: Lost 1 timer tick(s)! rip __do_softirq+0x55/0xd4)
Mar  1 11:40:37 ti94 kernel: time.c: Lost 3 timer tick(s)! rip __do_softirq+0x55/0xd4)
Mar  1 11:40:40 ti94 kernel: time.c: Lost 6 timer tick(s)! rip __do_softirq+0x55/0xd4)
Mar  1 11:40:41 ti94 kernel: time.c: Lost 1 timer tick(s)! rip __do_softirq+0x55/0xd4)
Mar  1 11:40:42 ti94 kernel: time.c: Lost 1 timer tick(s)! rip __do_softirq+0x55/0xd4)
Mar  1 11:40:50 ti94 kernel: time.c: Lost 1 timer tick(s)! rip __do_softirq+0x55/0xd4)
Mar  1 11:40:54 ti94 kernel: time.c: Lost 1 timer tick(s)! rip __do_softirq+0x55/0xd4)
Mar  1 11:40:57 ti94 kernel: time.c: Lost 1 timer tick(s)! rip __do_softirq+0x55/0xd4)
Mar  1 11:41:01 ti94 kernel: time.c: Lost 2 timer tick(s)! rip __do_softirq+0x55/0xd4)
Mar  1 11:41:06 ti94 kernel: time.c: Lost 1 timer tick(s)! rip __do_softirq+0x55/0xd4)
Mar  1 11:41:12 ti94 kernel: time.c: Lost 1 timer tick(s)! rip __do_softirq+0x55/0xd4)
Mar  1 11:41:17 ti94 kernel: time.c: Lost 2 timer tick(s)! rip __do_softirq+0x55/0xd4)
Mar  1 11:41:21 ti94 kernel: time.c: Lost 3 timer tick(s)! rip __do_softirq+0x55/0xd4)
Mar  1 11:41:27 ti94 kernel: time.c: Lost 4 timer tick(s)! rip __do_softirq+0x55/0xd4)
Mar  1 11:41:27 ti94 kernel: time.c: Lost 1 timer tick(s)! rip __do_softirq+0x55/0xd4)
Mar  1 11:41:29 ti94 kernel: time.c: Lost 3 timer tick(s)! rip _spin_unlock_irqrestore+0xb/0xd)
Mar  1 11:41:42 ti94 kernel: time.c: Lost 1 timer tick(s)! rip __do_softirq+0x55/0xd4)
Mar  1 11:41:43 ti94 kernel: time.c: Lost 2 timer tick(s)! rip __do_softirq+0x55/0xd4)
Mar  1 11:41:43 ti94 kernel: time.c: Lost 1 timer tick(s)! rip __do_softirq+0x55/0xd4)
Mar  1 11:41:46 ti94 kernel: time.c: Lost 2 timer tick(s)! rip __do_softirq+0x55/0xd4)
Mar  1 11:41:55 ti94 kernel: time.c: Lost 1 timer tick(s)! rip __do_softirq+0x55/0xd4)
Mar  1 11:41:55 ti94 kernel: time.c: Lost 2 timer tick(s)! rip default_idle+0x37/0x7a)
Mar  1 11:41:57 ti94 kernel: time.c: Lost 1 timer tick(s)! rip default_idle+0x37/0x7a)
Mar  1 11:41:57 ti94 kernel: time.c: Lost 1 timer tick(s)! rip default_idle+0x37/0x7a)
Mar  1 11:42:00 ti94 kernel: time.c: Lost 2 timer tick(s)! rip default_idle+0x37/0x7a)
Mar  1 11:42:00 ti94 kernel: time.c: Lost 1 timer tick(s)! rip default_idle+0x37/0x7a)
Mar  1 11:42:00 ti94 kernel: time.c: Lost 2 timer tick(s)! rip default_idle+0x37/0x7a)
...
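
A log excerpt like the one above is easier to read once the lost ticks are totalled per reported rip symbol. The following is a minimal sketch, not part of the original message. It assumes the "Lost N timer tick(s)! rip symbol+0x.../0x..." line format shown above; the function name `tally_lost_ticks` and the four sample lines (copied from the log above) are chosen only for illustration.

```python
import re
from collections import Counter

# Matches kernel lines such as:
#   time.c: Lost 3 timer tick(s)! rip __do_softirq+0x55/0xd4)
LOST_TICK_RE = re.compile(
    r"Lost (\d+) timer tick\(s\)! rip (\w+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)

def tally_lost_ticks(lines):
    """Return a Counter mapping each reported rip symbol to total ticks lost."""
    totals = Counter()
    for line in lines:
        match = LOST_TICK_RE.search(line)
        if match:
            totals[match.group(2)] += int(match.group(1))
    return totals

# Four lines taken from the log excerpt above.
sample = [
    "Mar  1 11:40:37 ti94 kernel: time.c: Lost 3 timer tick(s)! rip __do_softirq+0x55/0xd4)",
    "Mar  1 11:40:40 ti94 kernel: time.c: Lost 6 timer tick(s)! rip __do_softirq+0x55/0xd4)",
    "Mar  1 11:41:29 ti94 kernel: time.c: Lost 3 timer tick(s)! rip _spin_unlock_irqrestore+0xb/0xd)",
    "Mar  1 11:41:55 ti94 kernel: time.c: Lost 2 timer tick(s)! rip default_idle+0x37/0x7a)",
]

totals = tally_lost_ticks(sample)
for symbol, ticks in totals.most_common():
    print(f"{symbol}: {ticks}")
```

The reported symbol is only where the timer interrupt finally landed, so a tally like this shows who gets blamed, not necessarily who kept interrupts off.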

After a umount/mount on /opt, I did

   find /opt | cpio -o > /dev/null

and got no lost ticks in the log.  My "nice --10 ./trtc" gave me:

rugolsky@ti94: tail +10 one-disk | grep -v '=125'
1141232738:578610: rtc 448 int 124 0 (=124)
1141232807:67036: rtc 464 int 0 124 (=124)
1141232875:557669: rtc 448 int 0 124 (=124)

I converted the raw Ext3 partition to a degraded MD RAID1, and again
got no lost ticks.  Then I created a PV/VG/LV/Ext3 on top of the degraded MD RAID1,
populated it, and re-read it; once again, there were no lost ticks on the
single-disk read.

Time to instrument sata_nv, I suppose.  Many thanks for helping to narrow this
down.

	-Bill

* Re: AMD64 X2 lost ticks on PM timer
  2006-03-01 18:07                   ` Bill Rugolsky Jr.
@ 2006-03-01 18:29                     ` Andi Kleen
  2006-03-01 19:16                       ` Lee Revell
  0 siblings, 1 reply; 60+ messages in thread
From: Andi Kleen @ 2006-03-01 18:29 UTC (permalink / raw)
  To: Bill Rugolsky Jr.; +Cc: Jason Baron, linux-kernel

On Wednesday 01 March 2006 19:07, Bill Rugolsky Jr. wrote:

> Mar  1 11:39:27 ti94 kernel: time.c: Lost 1 timer tick(s)! rip __do_softirq+0x55/0xd4)

Yes, I bet something forgets to turn on interrupts again and it's picked up by
(and blamed on) the next guy who does an unconditional sti, which happens to be __do_softirq
or idle.
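
The blame shift described above can be illustrated with a toy model. This is a sketch under stated assumptions, not kernel code: a 1 ms tick (HZ=1000, as in the configs discussed in this thread), at most one timer interrupt latched while interrupts are off, and a handler that reports the function running when interrupts come back on. The name `hypothetical_driver_path` is invented for the example and does not refer to any real sata_nv function.

```python
TICK_MS = 1  # one timer tick per millisecond (HZ=1000)

def simulate(timeline):
    """Toy model of lost-tick reporting.

    timeline: list of (function_name, duration_ms, irqs_enabled).
    Returns a list of (lost_ticks, blamed_function) reports.
    """
    reports = []
    now = 0           # current time in ms
    last_handled = 0  # time of the last tick the handler accounted for
    for func, duration_ms, irqs_enabled in timeline:
        if irqs_enabled:
            # The pending timer interrupt is delivered as soon as interrupts
            # are enabled, so the handler sees `func` as the interrupted code.
            elapsed_ticks = (now - last_handled) // TICK_MS
            if elapsed_ticks > 1:
                # One tick is the interrupt being handled; the rest were lost.
                reports.append((elapsed_ticks - 1, func))
            last_handled += elapsed_ticks * TICK_MS
            # With interrupts on, later ticks in this segment arrive on time.
            now += duration_ms
            last_handled = now - ((now - last_handled) % TICK_MS)
        else:
            # Interrupts off: time passes, ticks pile up unseen.
            now += duration_ms
    return reports

timeline = [
    ("user_code", 5, True),
    ("hypothetical_driver_path", 4, False),  # forgets to re-enable interrupts
    ("__do_softirq", 2, True),               # first unconditional enable
    ("default_idle", 3, True),
]

for lost, blamed in simulate(timeline):
    print(f"Lost {lost} timer tick(s)! blamed on {blamed}")
```

In this model the 4 ms spent with interrupts off produces a single report of 3 lost ticks against __do_softirq, while the segment that actually disabled interrupts never appears in the output.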


> Time to instrument sata_nv, I suppose.  Many thanks for helping to narrow this
> down.

Sprinkle WARN_ON(in_interrupt()) all over the parts that shouldn't have interrupts 
off.

-Andi

* Re: AMD64 X2 lost ticks on PM timer
  2006-03-01 18:29                     ` Andi Kleen
@ 2006-03-01 19:16                       ` Lee Revell
  2006-03-03 19:18                         ` Bill Rugolsky Jr.
  0 siblings, 1 reply; 60+ messages in thread
From: Lee Revell @ 2006-03-01 19:16 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Bill Rugolsky Jr., Jason Baron, linux-kernel

On Wed, 2006-03-01 at 19:29 +0100, Andi Kleen wrote:
> Sprinkle WARN_ON(in_interrupt()) all over the parts that shouldn't
> have interrupts 
> off. 

Might be faster to just try the -rt kernel, it has tons of debugging
checks for stuff like this.

Lee


* Re: AMD64 X2 lost ticks on PM timer
  2006-03-01 15:43               ` Bill Rugolsky Jr.
  2006-03-01 15:47                 ` Andi Kleen
@ 2006-03-02 15:47                 ` Gabor Gombas
  1 sibling, 0 replies; 60+ messages in thread
From: Gabor Gombas @ 2006-03-02 15:47 UTC (permalink / raw)
  To: Bill Rugolsky Jr., Andi Kleen, Jason Baron, linux-kernel

On Wed, Mar 01, 2006 at 10:43:13AM -0500, Bill Rugolsky Jr. wrote:

> My guess is the sata_nv driver, as it happens during heavy local file read.
> The machines all have 2-4 SATA WD Raptors connected to the mobo.

I have 4 SATA disks connected to an nForce4, being part of an md/raid5
array. If I start bonnie on the raid5 array, I get:

warning: many lost ticks.
Your time source seems to be instable or some driver is hogging interupts
rip __do_softirq+0x3b/0xa1

So sata_nv definitely looks fishy.

Gabor

-- 
     ---------------------------------------------------------
     MTA SZTAKI Computer and Automation Research Institute
                Hungarian Academy of Sciences
     ---------------------------------------------------------

* Re: AMD64 X2 lost ticks on PM timer
  2006-03-01 19:16                       ` Lee Revell
@ 2006-03-03 19:18                         ` Bill Rugolsky Jr.
  2006-03-03 21:26                           ` Lee Revell
  0 siblings, 1 reply; 60+ messages in thread
From: Bill Rugolsky Jr. @ 2006-03-03 19:18 UTC (permalink / raw)
  To: Lee Revell; +Cc: Andi Kleen, Jason Baron, linux-kernel

On Wed, Mar 01, 2006 at 02:16:50PM -0500, Lee Revell wrote:
> On Wed, 2006-03-01 at 19:29 +0100, Andi Kleen wrote:
> > Sprinkle WARN_ON(in_interrupt()) all over the parts that shouldn't
> > have interrupts 
> > off. 
> 
> Might be faster to just try the -rt kernel, it has tons of debugging
> checks for stuff like this.

After several attempts where 2.6.15-rt18 reset on startup, I whittled
my config down to something minimal (turned off NUMA, CPUSETS, PRINTK_TIME, ...)
and got it up and running PREEMPT_RT:

rugolsky@ti94: uname -a
Linux ti94 2.6.15-rt18-realtime #4 SMP PREEMPT Fri Mar 3 11:39:20 EST 2006 x86_64 x86_64 x86_64 GNU/Linux

rugolsky@ti94: egrep 'PREEMPT|LATENCY|HZ' .config
# CONFIG_PREEMPT_NONE is not set
# CONFIG_PREEMPT_VOLUNTARY is not set
# CONFIG_PREEMPT_DESKTOP is not set
CONFIG_PREEMPT_RT=y
CONFIG_PREEMPT=y
CONFIG_PREEMPT_SOFTIRQS=y
CONFIG_PREEMPT_HARDIRQS=y
CONFIG_PREEMPT_BKL=y
CONFIG_PREEMPT_RCU=y
# CONFIG_HZ_100 is not set
# CONFIG_HZ_250 is not set
CONFIG_HZ_1000=y
CONFIG_HZ=1000
CONFIG_DEBUG_PREEMPT=y
# CONFIG_WAKEUP_LATENCY_HIST is not set
CONFIG_PREEMPT_TRACE=y
CONFIG_CRITICAL_PREEMPT_TIMING=y
# CONFIG_PREEMPT_OFF_HIST is not set
CONFIG_LATENCY_TIMING=y
CONFIG_LATENCY_TRACE=y

% sudo sysctl -a | egrep 'kernel\.*(preempt|latency|trace|wakeup)'
kernel.preempt_thresh = 0
kernel.preempt_max_latency = 2483
kernel.trace_all_cpus = 1
kernel.trace_verbose = 1
kernel.trace_print_at_crash = 1
kernel.trace_freerunning = 0
kernel.trace_user_trigger_irq = -1
kernel.trace_user_trigger_irq = -1
kernel.trace_user_triggered = 0
kernel.trace_enabled = 1
kernel.wakeup_timing = 1

% cat /sys/devices/system/clocksource/clocksource0/current_clocksource 
acpi_pm 

Should I be compiling for a different preempt mode?  I've only cursorily
followed the realtime-preempt patch discussion threads, so I am unclear as
to what debugging facilities are available with each preemption level.
[Is there a howto/tutorial floating around on using the debugging
features?]

I got the following trace.  Since you have a great deal of experience
interpreting these traces, perhaps you can help me interpret this one?

Thanks.

	Bill Rugolsky


preemption latency trace v1.1.5 on 2.6.15-rt18-realtime
--------------------------------------------------------------------
 latency: 2483 us, #238/238, CPU#1 | (M:rt VP:0, KP:0, SP:1 HP:1 #P:2)
    -----------------
    | task: softirq-timer/1-14 (uid:0 nice:0 policy:1 rt_prio:1)
    -----------------

(T1/#0)            <...>   697 0 16 00000000 00000000 [00000424e86874c5] 0.000ms (+0.000ms): smp_apic_timer_interrupt+0xc/0x48 <ffffffff8011902a> (apic_timer_interrupt+0x84/0x8c <ffffffff8010ead0>)
(T1/#1)            <...>   697 0 20 00000000 00000001 [00000424e86876a8] 0.000ms (+0.000ms): smp_local_timer_interrupt+0xc/0x32 <ffffffff80118ad5> (smp_apic_timer_interrupt+0x3e/0x48 <ffffffff8011905c>)
(T1/#2)            <...>   697 0 20 00000000 00000002 [00000424e868778c] 0.000ms (+0.000ms): profile_tick+0xc/0x77 <ffffffff801334d4> (smp_local_timer_interrupt+0x1c/0x32 <ffffffff80118ae5>)
(T1/#3)            <...>   697 0 20 00000000 00000003 [00000424e8687885] 0.000ms (+0.000ms): profile_pc+0xc/0x71 <ffffffff8011173c> (profile_tick+0x67/0x77 <ffffffff8013352f>)
(T1/#4)            <...>   697 0 20 00000000 00000004 [00000424e86879a4] 0.000ms (+0.000ms): profile_hit+0x14/0x19f <ffffffff8013333d> (profile_tick+0x72/0x77 <ffffffff8013353a>)
(T1/#5)            <...>   697 0 20 00000000 00000005 [00000424e8687aae] 0.000ms (+0.000ms): update_process_times+0xc/0x68 <ffffffff8013b457> (smp_local_timer_interrupt+0x2e/0x32 <ffffffff80118af7>)
(T1/#6)            <...>   697 0 20 00000000 00000006 [00000424e8687baf] 0.000ms (+0.000ms): account_system_time+0x9/0x9e <ffffffff8012a19c> (update_process_times+0x3f/0x68 <ffffffff8013b48a>)
(T1/#7)            <...>   697 0 20 00000000 00000007 [00000424e8687cb0] 0.001ms (+0.000ms): acct_update_integrals+0x9/0x59 <ffffffff801571a1> (account_system_time+0x9c/0x9e <ffffffff8012a22f>)
(T1/#8)            <...>   697 0 20 00000000 00000008 [00000424e8687dca] 0.001ms (+0.000ms): run_local_timers+0x9/0x15 <ffffffff8013b0cd> (update_process_times+0x44/0x68 <ffffffff8013b48f>)
(T1/#9)            <...>   697 0 20 00000000 00000009 [00000424e8687ea7] 0.001ms (+0.000ms): raise_softirq+0xc/0x91 <ffffffff80137cf8> (run_local_timers+0x13/0x15 <ffffffff8013b0d7>)
(T1/#10)            <...>   697 0 20 00000000 0000000a [00000424e8687fce] 0.001ms (+0.000ms): wakeup_softirqd+0x9/0x38 <ffffffff801373e5> (raise_softirq+0x6f/0x91 <ffffffff80137d5b>)
(T1/#11)            <...>   697 0 20 00000000 0000000b [00000424e86880c5] 0.001ms (+0.000ms): wake_up_process+0xb/0x31 <ffffffff8012cd66> (wakeup_softirqd+0x36/0x38 <ffffffff80137412>)
(T1/#12)            <...>   697 0 20 00000000 0000000c [00000424e86881ab] 0.001ms (+0.000ms): check_preempt_wakeup+0xc/0xac <ffffffff8014a814> (wake_up_process+0x13/0x31 <ffffffff8012cd6e>)
(T1/#13)            <...>   697 0 20 00000000 0000000d [00000424e86882b4] 0.001ms (+0.000ms): try_to_wake_up+0x16/0x560 <ffffffff8012c747> (wake_up_process+0x24/0x31 <ffffffff8012cd7f>)
(T1/#14)            <...>   697 0 20 00000000 0000000e [00000424e86883ad] 0.001ms (+0.000ms): _raw_spin_lock+0xb/0x25 <ffffffff802fa1a7> (try_to_wake_up+0x5d/0x560 <ffffffff8012c78e>)
(T1/#15)            <...>   697 0 20 00000001 0000000f [00000424e86884fd] 0.002ms (+0.000ms): idle_cpu+0x9/0x30 <ffffffff80129d37> (try_to_wake_up+0x292/0x560 <ffffffff8012c9c3>)
(T1/#16)           <idle>     0 1 23 00000003 00000010 [00000424e86885b5] 0.002ms (+0.000ms): _raw_spin_unlock_irqrestore+0xb/0x62 <ffffffff802fa61d> (unmask_IO_APIC_irq+0x35/0x3a <ffffffff8011a854>)
(T1/#17)            <...>   697 0 20 00000001 00000011 [00000424e868868b] 0.002ms (+0.000ms): smp_send_reschedule_allbutself+0x9/0x1a <ffffffff8011809b> (try_to_wake_up+0x3e9/0x560 <ffffffff8012cb1a>)
(T1/#18)           <idle>     0 1 23 00000002 00000012 [00000424e86886b0] 0.002ms (+0.000ms): check_raw_flags+0x9/0x5b <ffffffff8014a6f5> (_raw_spin_unlock_irqrestore+0x26/0x62 <ffffffff802fa638>)
(T1/#19)            <...>   697 0 20 00000001 00000013 [00000424e8688779] 0.002ms (+0.000ms): flat_send_IPI_allbutself+0xb/0x5b <ffffffff8011b644> (smp_send_reschedule_allbutself+0x18/0x1a <ffffffff801180aa>)
(T1/#20)           <idle>     0 1 23 00000002 00000014 [00000424e86887ba] 0.002ms (+0.000ms): constant_test_bit+0x9/0x25 <ffffffff80152845> (_raw_spin_unlock_irqrestore+0x55/0x62 <ffffffff802fa667>)
(T1/#21)            <...>   697 0 20 00000001 00000015 [00000424e8688855] 0.002ms (+0.000ms): __bitmap_weight+0xa/0x18b <ffffffff801fbce6> (flat_send_IPI_allbutself+0x1e/0x5b <ffffffff8011b657>)
(T1/#22)           <idle>     0 1 23 00000002 00000016 [00000424e86888ab] 0.002ms (+0.000ms): preempt_schedule+0xc/0x9d <ffffffff802f82a1> (_raw_spin_unlock_irqrestore+0x5e/0x62 <ffffffff802fa670>)
(T1/#23)           <idle>     0 1 23 00000002 00000017 [00000424e86889c7] 0.002ms (+0.000ms): _raw_spin_unlock+0x9/0x3e <ffffffff802fa6f7> (__do_IRQ+0x12a/0x141 <ffffffff8015deb9>)
(T1/#24)            <...>   697 0 20 00000001 00000018 [00000424e8688a6c] 0.002ms (+0.000ms): activate_task+0x10/0xe0 <ffffffff8012bc38> (try_to_wake_up+0x491/0x560 <ffffffff8012cbc2>)
(T1/#25)           <idle>     0 1 23 00000001 00000019 [00000424e8688abd] 0.002ms (+0.000ms): constant_test_bit+0x9/0x25 <ffffffff80152845> (_raw_spin_unlock+0x33/0x3e <ffffffff802fa721>)
(T1/#26)            <...>   697 0 20 00000001 0000001a [00000424e8688b4b] 0.002ms (+0.000ms): sched_clock+0x9/0x26 <ffffffff8011180c> (activate_task+0x1d/0xe0 <ffffffff8012bc45>)
(T1/#27)           <idle>     0 1 23 00000001 0000001b [00000424e8688bbc] 0.002ms (+0.000ms): preempt_schedule+0xc/0x9d <ffffffff802f82a1> (_raw_spin_unlock+0x3c/0x3e <ffffffff802fa72a>)
(T3/#28)    <...>-697   0D.h1    2us : activate_task+0x9b/0xe0 <ffffffff8012bcc3> <<...>-4> (62 1)
(T1/#29)           <idle>     0 1 23 00000001 0000001d [00000424e8688cd8] 0.003ms (+0.000ms): irq_exit+0x9/0x28 <ffffffff80137aaa> (do_IRQ+0x3a/0x44 <ffffffff80110767>)
(T1/#30)            <...>   697 0 20 00000001 0000001e [00000424e8688d26] 0.003ms (+0.000ms): enqueue_task+0xc/0x95 <ffffffff80129be4> (activate_task+0xa7/0xe0 <ffffffff8012bccf>)
(T1/#31)            <...>   697 0 20 00000001 0000001f [00000424e8688f26] 0.003ms (+0.000ms): _raw_spin_unlock_irqrestore+0xb/0x62 <ffffffff802fa61d> (try_to_wake_up+0x54f/0x560 <ffffffff8012cc80>)
(T1/#32)           <idle>     0 1 19 00000001 00000020 [00000424e8688f9e] 0.003ms (+0.000ms): smp_reschedule_interrupt+0x9/0x16 <ffffffff801186ee> (reschedule_interrupt+0x84/0x8c <ffffffff8010e558>)
(T1/#33)            <...>   697 0 20 00000000 00000021 [00000424e868901e] 0.003ms (+0.000ms): check_raw_flags+0x9/0x5b <ffffffff8014a6f5> (_raw_spin_unlock_irqrestore+0x26/0x62 <ffffffff802fa638>)
(T1/#34)            <...>   697 0 20 00000000 00000022 [00000424e8689137] 0.003ms (+0.000ms): constant_test_bit+0x9/0x25 <ffffffff80152845> (_raw_spin_unlock_irqrestore+0x55/0x62 <ffffffff802fa667>)
(T1/#35)           <idle>     0 1 19 00000001 00000023 [00000424e8689211] 0.003ms (+0.000ms): smp_apic_timer_interrupt+0xc/0x48 <ffffffff8011902a> (apic_timer_interrupt+0x84/0x8c <ffffffff8010ead0>)
(T1/#36)            <...>   697 0 20 00000000 00000024 [00000424e8689254] 0.003ms (+0.000ms): wake_up_process+0x2b/0x31 <ffffffff8012cd86> (wakeup_softirqd+0x36/0x38 <ffffffff80137412>)
(T1/#37)            <...>   697 0 20 00000000 00000025 [00000424e8689354] 0.003ms (+0.000ms): check_raw_flags+0x9/0x5b <ffffffff8014a6f5> (raise_softirq+0x77/0x91 <ffffffff80137d63>)
(T1/#38)           <idle>     0 1 23 00000001 00000026 [00000424e8689354] 0.003ms (+0.000ms): smp_local_timer_interrupt+0xc/0x32 <ffffffff80118ad5> (smp_apic_timer_interrupt+0x3e/0x48 <ffffffff8011905c>)
(T1/#39)           <idle>     0 1 23 00000001 00000027 [00000424e8689426] 0.003ms (+0.000ms): profile_tick+0xc/0x77 <ffffffff801334d4> (smp_local_timer_interrupt+0x1c/0x32 <ffffffff80118ae5>)
(T1/#40)            <...>   697 0 20 00000000 00000028 [00000424e8689492] 0.004ms (+0.000ms): rcu_pending+0x9/0x30 <ffffffff801445e8> (update_process_times+0x4b/0x68 <ffffffff8013b496>)
(T1/#41)           <idle>     0 1 23 00000001 00000029 [00000424e8689520] 0.004ms (+0.000ms): profile_pc+0xc/0x71 <ffffffff8011173c> (profile_tick+0x67/0x77 <ffffffff8013352f>)
(T1/#42)            <...>   697 0 20 00000000 0000002a [00000424e86895c8] 0.004ms (+0.000ms): scheduler_tick+0x13/0x34c <ffffffff8012d1b9> (update_process_times+0x5e/0x68 <ffffffff8013b4a9>)
(T1/#43)           <idle>     0 1 23 00000001 0000002b [00000424e8689636] 0.004ms (+0.000ms): profile_hit+0x14/0x19f <ffffffff8013333d> (profile_tick+0x72/0x77 <ffffffff8013353a>)
(T1/#44)            <...>   697 0 20 00000000 0000002c [00000424e86896af] 0.004ms (+0.000ms): sched_clock+0x9/0x26 <ffffffff8011180c> (scheduler_tick+0x3d/0x34c <ffffffff8012d1e3>)
(T1/#45)           <idle>     0 1 23 00000001 0000002d [00000424e8689754] 0.004ms (+0.000ms): update_process_times+0xc/0x68 <ffffffff8013b457> (smp_local_timer_interrupt+0x2e/0x32 <ffffffff80118af7>)
(T1/#46)            <...>   697 0 20 00000000 0000002e [00000424e86897c5] 0.004ms (+0.000ms): _raw_spin_lock+0xb/0x25 <ffffffff802fa1a7> (scheduler_tick+0xce/0x34c <ffffffff8012d274>)
(T1/#47)           <idle>     0 1 23 00000001 0000002f [00000424e868984a] 0.004ms (+0.000ms): account_system_time+0x9/0x9e <ffffffff8012a19c> (update_process_times+0x3f/0x68 <ffffffff8013b48a>)
(T1/#48)            <...>   697 0 20 00000001 00000030 [00000424e86898f8] 0.004ms (+0.000ms): _raw_spin_unlock+0x9/0x3e <ffffffff802fa6f7> (scheduler_tick+0x330/0x34c <ffffffff8012d4d6>)
(T1/#49)           <idle>     0 1 23 00000001 00000031 [00000424e868994c] 0.004ms (+0.000ms): acct_update_integrals+0x9/0x59 <ffffffff801571a1> (account_system_time+0x9c/0x9e <ffffffff8012a22f>)
(T1/#50)            <...>   697 0 20 00000000 00000032 [00000424e8689a04] 0.004ms (+0.000ms): constant_test_bit+0x9/0x25 <ffffffff80152845> (_raw_spin_unlock+0x33/0x3e <ffffffff802fa721>)
(T1/#51)           <idle>     0 1 23 00000001 00000033 [00000424e8689a5a] 0.004ms (+0.000ms): run_local_timers+0x9/0x15 <ffffffff8013b0cd> (update_process_times+0x44/0x68 <ffffffff8013b48f>)
(T1/#52)            <...>   697 0 20 00000000 00000034 [00000424e8689b0b] 0.004ms (+0.000ms): rebalance_tick+0x16/0x2e8 <ffffffff8012ced4> (scheduler_tick+0x340/0x34c <ffffffff8012d4e6>)
(T1/#53)           <idle>     0 1 23 00000001 00000035 [00000424e8689b36] 0.004ms (+0.000ms): raise_softirq+0xc/0x91 <ffffffff80137cf8> (run_local_timers+0x13/0x15 <ffffffff8013b0d7>)
(T1/#54)           <idle>     0 1 23 00000001 00000036 [00000424e8689c41] 0.005ms (+0.000ms): wakeup_softirqd+0x9/0x38 <ffffffff801373e5> (raise_softirq+0x6f/0x91 <ffffffff80137d5b>)
(T1/#55)           <idle>     0 1 23 00000001 00000037 [00000424e8689d50] 0.005ms (+0.000ms): check_raw_flags+0x9/0x5b <ffffffff8014a6f5> (raise_softirq+0x77/0x91 <ffffffff80137d63>)
(T1/#56)            <...>   697 0 20 00000000 00000038 [00000424e8689dc9] 0.005ms (+0.000ms): softlockup_tick+0xf/0x11d <ffffffff8015d672> (update_process_times+0x63/0x68 <ffffffff8013b4ae>)
(T1/#57)           <idle>     0 1 23 00000001 00000039 [00000424e8689e81] 0.005ms (+0.000ms): rcu_pending+0x9/0x30 <ffffffff801445e8> (update_process_times+0x4b/0x68 <ffffffff8013b496>)
(T1/#58)            <...>   697 0 20 00000000 0000003a [00000424e8689ef2] 0.005ms (+0.000ms): irq_exit+0x9/0x28 <ffffffff80137aaa> (smp_apic_timer_interrupt+0x43/0x48 <ffffffff80119061>)
(T1/#59)           <idle>     0 1 23 00000001 0000003b [00000424e8689f91] 0.005ms (+0.000ms): scheduler_tick+0x13/0x34c <ffffffff8012d1b9> (update_process_times+0x5e/0x68 <ffffffff8013b4a9>)
(T1/#60)           <idle>     0 1 23 00000001 0000003c [00000424e868a078] 0.005ms (+0.000ms): sched_clock+0x9/0x26 <ffffffff8011180c> (scheduler_tick+0x3d/0x34c <ffffffff8012d1e3>)
(T1/#61)           <idle>     0 1 23 00000001 0000003d [00000424e868a17d] 0.005ms (+0.000ms): _raw_spin_lock+0xb/0x25 <ffffffff802fa1a7> (scheduler_tick+0x81/0x34c <ffffffff8012d227>)
(T1/#62)            <...>   697 0 16 00000000 0000003e [00000424e868a1e5] 0.005ms (+0.000ms): do_IRQ+0xc/0x44 <ffffffff80110739> (ret_from_intr+0x0/0x12 <ffffffff8010e276>)
(T1/#63)           <idle>     0 1 23 00000002 0000003f [00000424e868a2cf] 0.005ms (+0.000ms): resched_task+0xc/0x79 <ffffffff8012a958> (scheduler_tick+0x95/0x34c <ffffffff8012d23b>)
(T1/#64)            <...>   697 0 20 00000000 00000040 [00000424e868a332] 0.005ms (+0.000ms): _raw_spin_lock+0xb/0x25 <ffffffff802fa1a7> (__do_IRQ+0x5d/0x141 <ffffffff8015ddec>)
(T1/#65)           <idle>     0 1 23 00000002 00000041 [00000424e868a3e6] 0.006ms (+0.000ms): _raw_spin_unlock+0x9/0x3e <ffffffff802fa6f7> (scheduler_tick+0x9d/0x34c <ffffffff8012d243>)
(T1/#66)            <...>   697 0 20 00000001 00000042 [00000424e868a469] 0.006ms (+0.000ms): mask_and_ack_level_ioapic_irq+0x10/0xa2 <ffffffff8011aa2d> (__do_IRQ+0x72/0x141 <ffffffff8015de01>)
(T1/#67)           <idle>     0 1 23 00000001 00000043 [00000424e868a4e0] 0.006ms (+0.000ms): constant_test_bit+0x9/0x25 <ffffffff80152845> (_raw_spin_unlock+0x33/0x3e <ffffffff802fa721>)
(T1/#68)            <...>   697 0 20 00000001 00000044 [00000424e868a548] 0.006ms (+0.000ms): mask_IO_APIC_irq+0xb/0x11e <ffffffff8011a90a> (mask_and_ack_level_ioapic_irq+0x8e/0xa2 <ffffffff8011aaab>)
(T1/#69)           <idle>     0 1 23 00000001 00000045 [00000424e868a5e8] 0.006ms (+0.000ms): preempt_schedule+0xc/0x9d <ffffffff802f82a1> (_raw_spin_unlock+0x3c/0x3e <ffffffff802fa72a>)
(T1/#70)            <...>   697 0 20 00000001 00000046 [00000424e868a6bd] 0.006ms (+0.000ms): _raw_spin_lock_irqsave+0xc/0x33 <ffffffff802fa2ab> (mask_IO_APIC_irq+0x19/0x11e <ffffffff8011a918>)
(T1/#71)           <idle>     0 1 23 00000001 00000047 [00000424e868a70c] 0.006ms (+0.000ms): rebalance_tick+0x16/0x2e8 <ffffffff8012ced4> (scheduler_tick+0x340/0x34c <ffffffff8012d4e6>)
(T1/#72)           <idle>     0 1 23 00000001 00000048 [00000424e868a936] 0.006ms (+0.000ms): softlockup_tick+0xf/0x11d <ffffffff8015d672> (update_process_times+0x63/0x68 <ffffffff8013b4ae>)
(T1/#73)           <idle>     0 1 23 00000001 00000049 [00000424e868aa5e] 0.006ms (+0.000ms): irq_exit+0x9/0x28 <ffffffff80137aaa> (smp_apic_timer_interrupt+0x43/0x48 <ffffffff80119061>)
(T1/#74)           <idle>     0 1 19 00000001 0000004a [00000424e868ad2f] 0.007ms (+0.000ms): do_IRQ+0xc/0x44 <ffffffff80110739> (ret_from_intr+0x0/0x12 <ffffffff8010e276>)
(T1/#75)           <idle>     0 1 23 00000001 0000004b [00000424e868ae68] 0.007ms (+0.000ms): _raw_spin_lock+0xb/0x25 <ffffffff802fa1a7> (__do_IRQ+0x5d/0x141 <ffffffff8015ddec>)
(T1/#76)           <idle>     0 1 23 00000002 0000004c [00000424e868b067] 0.007ms (+0.000ms): ack_edge_ioapic_irq+0x10/0xba <ffffffff8011ab21> (__do_IRQ+0x72/0x141 <ffffffff8015de01>)
(T1/#77)           <idle>     0 1 23 00000002 0000004d [00000424e868b16b] 0.007ms (+0.000ms): redirect_hardirq+0x9/0x53 <ffffffff8015d909> (__do_IRQ+0xb5/0x141 <ffffffff8015de44>)
(T1/#78)           <idle>     0 1 23 00000002 0000004e [00000424e868b283] 0.007ms (+0.000ms): _raw_spin_unlock+0x9/0x3e <ffffffff802fa6f7> (__do_IRQ+0xc1/0x141 <ffffffff8015de50>)
(T1/#79)           <idle>     0 1 23 00000001 0000004f [00000424e868b38d] 0.007ms (+0.000ms): constant_test_bit+0x9/0x25 <ffffffff80152845> (_raw_spin_unlock+0x33/0x3e <ffffffff802fa721>)
(T1/#80)           <idle>     0 1 23 00000001 00000050 [00000424e868b4a0] 0.008ms (+0.000ms): preempt_schedule+0xc/0x9d <ffffffff802f82a1> (_raw_spin_unlock+0x3c/0x3e <ffffffff802fa72a>)
(T1/#81)           <idle>     0 1 23 00000001 00000051 [00000424e868b5af] 0.008ms (+0.000ms): handle_IRQ_event+0x16/0x114 <ffffffff8015d96f> (__do_IRQ+0xd1/0x141 <ffffffff8015de60>)
(T1/#82)           <idle>     0 1 23 00000001 00000052 [00000424e868b699] 0.008ms (+0.000ms): timer_interrupt+0xb/0x62 <ffffffff801117ac> (handle_IRQ_event+0x6a/0x114 <ffffffff8015d9c3>)
(T1/#83)           <idle>     0 1 23 00000001 00000053 [00000424e868b789] 0.008ms (+0.000ms): _raw_spin_lock+0xb/0x25 <ffffffff802fa1a7> (timer_interrupt+0x1a/0x62 <ffffffff801117bb>)
(T1/#84)           <idle>     0 1 23 00000002 00000054 [00000424e868b8c5] 0.008ms (+0.000ms): do_timer+0x9/0x12 <ffffffff8013b0e2> (timer_interrupt+0x28/0x62 <ffffffff801117c9>)
(T1/#85)           <idle>     0 1 23 00000002 00000055 [00000424e868ba17] 0.008ms (+0.000ms): _raw_spin_unlock+0x9/0x3e <ffffffff802fa6f7> (timer_interrupt+0x4b/0x62 <ffffffff801117ec>)
(T1/#86)            <...>   697 0 20 00000002 00000056 [00000424e868ba4b] 0.008ms (+0.000ms): _raw_spin_unlock_irqrestore+0xb/0x62 <ffffffff802fa61d> (mask_IO_APIC_irq+0x11a/0x11e <ffffffff8011aa19>)
(T1/#87)           <idle>     0 1 23 00000001 00000057 [00000424e868bb18] 0.008ms (+0.000ms): constant_test_bit+0x9/0x25 <ffffffff80152845> (_raw_spin_unlock+0x33/0x3e <ffffffff802fa721>)
(T1/#88)            <...>   697 0 20 00000001 00000058 [00000424e868bb4d] 0.008ms (+0.000ms): check_raw_flags+0x9/0x5b <ffffffff8014a6f5> (_raw_spin_unlock_irqrestore+0x26/0x62 <ffffffff802fa638>)
(T1/#89)           <idle>     0 1 23 00000001 00000059 [00000424e868bc15] 0.009ms (+0.000ms): preempt_schedule+0xc/0x9d <ffffffff802f82a1> (_raw_spin_unlock+0x3c/0x3e <ffffffff802fa72a>)
(T1/#90)            <...>   697 0 20 00000001 0000005a [00000424e868bc62] 0.009ms (+0.000ms): constant_test_bit+0x9/0x25 <ffffffff80152845> (_raw_spin_unlock_irqrestore+0x55/0x62 <ffffffff802fa667>)
(T1/#91)           <idle>     0 1 23 00000001 0000005b [00000424e868bd30] 0.009ms (+0.000ms): smp_send_timer_broadcast_ipi+0x9/0x31 <ffffffff80118a91> (timer_interrupt+0x59/0x62 <ffffffff801117fa>)
(T1/#92)            <...>   697 0 20 00000001 0000005c [00000424e868bd98] 0.009ms (+0.000ms): redirect_hardirq+0x9/0x53 <ffffffff8015d909> (__do_IRQ+0xb5/0x141 <ffffffff8015de44>)
(T1/#93)           <idle>     0 1 23 00000001 0000005d [00000424e868be66] 0.009ms (+0.000ms): _raw_spin_lock+0xb/0x25 <ffffffff802fa1a7> (__do_IRQ+0xdb/0x141 <ffffffff8015de6a>)
(T1/#94)            <...>   697 0 20 00000001 0000005e [00000424e868bed0] 0.009ms (+0.000ms): wake_up_process+0xb/0x31 <ffffffff8012cd66> (redirect_hardirq+0x46/0x53 <ffffffff8015d946>)
(T1/#95)           <idle>     0 1 23 00000002 0000005f [00000424e868bfa8] 0.009ms (+0.000ms): note_interrupt+0x16/0x227 <ffffffff8015ec77> (__do_IRQ+0xf5/0x141 <ffffffff8015de84>)
(T1/#96)            <...>   697 0 20 00000001 00000060 [00000424e868bfe2] 0.009ms (+0.000ms): check_preempt_wakeup+0xc/0xac <ffffffff8014a814> (wake_up_process+0x13/0x31 <ffffffff8012cd6e>)
(T1/#97)           <idle>     0 1 23 00000002 00000061 [00000424e868c0a8] 0.009ms (+0.000ms): unmask_IO_APIC_irq+0xc/0x3a <ffffffff8011a82b> (__do_IRQ+0x122/0x141 <ffffffff8015deb1>)
(T1/#98)            <...>   697 0 20 00000001 00000062 [00000424e868c0d6] 0.009ms (+0.000ms): try_to_wake_up+0x16/0x560 <ffffffff8012c747> (wake_up_process+0x24/0x31 <ffffffff8012cd7f>)
(T1/#99)           <idle>     0 1 23 00000002 00000063 [00000424e868c19a] 0.009ms (+0.000ms): _raw_spin_lock_irqsave+0xc/0x33 <ffffffff802fa2ab> (unmask_IO_APIC_irq+0x1b/0x3a <ffffffff8011a83a>)
(T1/#100)            <...>   697 0 20 00000001 00000064 [00000424e868c1f8] 0.009ms (+0.000ms): _raw_spin_lock+0xb/0x25 <ffffffff802fa1a7> (try_to_wake_up+0x5d/0x560 <ffffffff8012c78e>)
(T1/#101)            <...>   697 0 20 00000002 00000065 [00000424e868c3d9] 0.010ms (+0.000ms): idle_cpu+0x9/0x30 <ffffffff80129d37> (try_to_wake_up+0x292/0x560 <ffffffff8012c9c3>)
(T1/#102)           <idle>     0 1 23 00000003 00000066 [00000424e868c439] 0.010ms (+0.000ms): __unmask_IO_APIC_irq+0x9/0xff <ffffffff8011a729> (unmask_IO_APIC_irq+0x26/0x3a <ffffffff8011a845>)
(T1/#103)            <...>   697 0 20 00000002 00000067 [00000424e868c538] 0.010ms (+0.000ms): smp_send_reschedule_allbutself+0x9/0x1a <ffffffff8011809b> (try_to_wake_up+0x3e9/0x560 <ffffffff8012cb1a>)
(T1/#104)            <...>   697 0 20 00000002 00000068 [00000424e868c674] 0.010ms (+0.000ms): flat_send_IPI_allbutself+0xb/0x5b <ffffffff8011b644> (smp_send_reschedule_allbutself+0x18/0x1a <ffffffff801180aa>)
(T1/#105)            <...>   697 0 20 00000002 00000069 [00000424e868c74e] 0.010ms (+0.000ms): __bitmap_weight+0xa/0x18b <ffffffff801fbce6> (flat_send_IPI_allbutself+0x1e/0x5b <ffffffff8011b657>)
(T1/#106)            <...>   697 0 20 00000002 0000006a [00000424e868c95a] 0.010ms (+0.000ms): activate_task+0x10/0xe0 <ffffffff8012bc38> (try_to_wake_up+0x491/0x560 <ffffffff8012cbc2>)
(T1/#107)            <...>   697 0 20 00000002 0000006b [00000424e868ca36] 0.010ms (+0.000ms): sched_clock+0x9/0x26 <ffffffff8011180c> (activate_task+0x1d/0xe0 <ffffffff8012bc45>)
(T3/#108)    <...>-697   0D.h2   11us : activate_task+0x9b/0xe0 <ffffffff8012bcc3> <<...>-1432> (3a 2)
(T1/#109)            <...>   697 0 20 00000002 0000006d [00000424e868cc15] 0.011ms (+0.000ms): enqueue_task+0xc/0x95 <ffffffff80129be4> (activate_task+0xa7/0xe0 <ffffffff8012bccf>)
(T1/#110)            <...>   697 0 20 00000002 0000006e [00000424e868cdf1] 0.011ms (+0.000ms): _raw_spin_unlock_irqrestore+0xb/0x62 <ffffffff802fa61d> (try_to_wake_up+0x54f/0x560 <ffffffff8012cc80>)
(T1/#111)            <...>   697 0 20 00000001 0000006f [00000424e868cf1a] 0.011ms (+0.000ms): check_raw_flags+0x9/0x5b <ffffffff8014a6f5> (_raw_spin_unlock_irqrestore+0x26/0x62 <ffffffff802fa638>)
(T1/#112)            <...>   697 0 20 00000001 00000070 [00000424e868d02f] 0.011ms (+0.000ms): constant_test_bit+0x9/0x25 <ffffffff80152845> (_raw_spin_unlock_irqrestore+0x55/0x62 <ffffffff802fa667>)
(T1/#113)            <...>   697 0 20 00000001 00000071 [00000424e868d147] 0.011ms (+0.000ms): wake_up_process+0x2b/0x31 <ffffffff8012cd86> (redirect_hardirq+0x46/0x53 <ffffffff8015d946>)
(T1/#114)            <...>   697 0 20 00000001 00000072 [00000424e868d261] 0.011ms (+0.000ms): _raw_spin_unlock+0x9/0x3e <ffffffff802fa6f7> (__do_IRQ+0x12a/0x141 <ffffffff8015deb9>)
(T1/#115)            <...>   697 0 20 00000000 00000073 [00000424e868d356] 0.012ms (+0.000ms): constant_test_bit+0x9/0x25 <ffffffff80152845> (_raw_spin_unlock+0x33/0x3e <ffffffff802fa721>)
(T1/#116)            <...>   697 0 20 00000000 00000074 [00000424e868d470] 0.012ms (+0.000ms): irq_exit+0x9/0x28 <ffffffff80137aaa> (do_IRQ+0x3a/0x44 <ffffffff80110767>)
(T1/#117)            <...>   697 0 0 00000000 00000075 [00000424e868d72e] 0.012ms (+0.000ms): ata_host_intr+0xf/0xb6 <ffffffff88033d99> (nv_interrupt+0x65/0xa6 <ffffffff88042065>)
(T1/#118)            <...>   697 0 0 00000000 00000076 [00000424e868d81d] 0.012ms (+0.000ms): ata_bmdma_status+0x9/0x27 <ffffffff88031e66> (ata_host_intr+0x3e/0xb6 <ffffffff88033dc8>)
(T1/#119)           <idle>     0 1 23 00000003 00000077 [00000424e868d846] 0.012ms (+0.000ms): _raw_spin_unlock_irqrestore+0xb/0x62 <ffffffff802fa61d> (unmask_IO_APIC_irq+0x35/0x3a <ffffffff8011a854>)
(T1/#120)           <idle>     0 1 23 00000002 00000078 [00000424e868d958] 0.012ms (+0.000ms): check_raw_flags+0x9/0x5b <ffffffff8014a6f5> (_raw_spin_unlock_irqrestore+0x26/0x62 <ffffffff802fa638>)
(T1/#121)           <idle>     0 1 23 00000002 00000079 [00000424e868da62] 0.012ms (+0.000ms): constant_test_bit+0x9/0x25 <ffffffff80152845> (_raw_spin_unlock_irqrestore+0x55/0x62 <ffffffff802fa667>)
(T1/#122)            <...>   697 0 0 00000000 0000007a [00000424e868dacb] 0.012ms (+0.000ms): ata_bmdma_stop+0x9/0x32 <ffffffff88031e34> (ata_host_intr+0x52/0xb6 <ffffffff88033ddc>)
(T1/#123)           <idle>     0 1 23 00000002 0000007b [00000424e868db53] 0.013ms (+0.000ms): preempt_schedule+0xc/0x9d <ffffffff802f82a1> (_raw_spin_unlock_irqrestore+0x5e/0x62 <ffffffff802fa670>)
(T1/#124)           <idle>     0 1 23 00000002 0000007c [00000424e868dc6f] 0.013ms (+0.000ms): _raw_spin_unlock+0x9/0x3e <ffffffff802fa6f7> (__do_IRQ+0x12a/0x141 <ffffffff8015deb9>)
(T1/#125)           <idle>     0 1 23 00000001 0000007d [00000424e868dd65] 0.013ms (+0.000ms): constant_test_bit+0x9/0x25 <ffffffff80152845> (_raw_spin_unlock+0x33/0x3e <ffffffff802fa721>)
(T1/#126)           <idle>     0 1 23 00000001 0000007e [00000424e868de78] 0.013ms (+0.000ms): preempt_schedule+0xc/0x9d <ffffffff802f82a1> (_raw_spin_unlock+0x3c/0x3e <ffffffff802fa72a>)
(T1/#127)            <...>   697 0 0 00000000 0000007f [00000424e868dec3] 0.013ms (+0.000ms): ata_altstatus+0x9/0x34 <ffffffff88031e00> (ata_bmdma_stop+0x30/0x32 <ffffffff88031e5b>)
(T1/#128)           <idle>     0 1 23 00000001 00000080 [00000424e868df94] 0.013ms (+0.000ms): irq_exit+0x9/0x28 <ffffffff80137aaa> (do_IRQ+0x3a/0x44 <ffffffff80110767>)
(T1/#129)            <...>   697 0 0 00000000 00000081 [00000424e868e150] 0.013ms (+0.000ms): ata_altstatus+0x9/0x34 <ffffffff88031e00> (ata_host_intr+0x5a/0xb6 <ffffffff88033de4>)
(T1/#130)           <idle>     0 1 19 00000001 00000082 [00000424e868e25a] 0.013ms (+0.000ms): smp_reschedule_interrupt+0x9/0x16 <ffffffff801186ee> (reschedule_interrupt+0x84/0x8c <ffffffff8010e558>)
(T1/#131)            <...>   697 0 0 00000000 00000083 [00000424e868e3d5] 0.014ms (+0.000ms): ata_check_status+0x9/0x23 <ffffffff88031ddd> (ata_host_intr+0x68/0xb6 <ffffffff88033df2>)
(T1/#132)           <idle>     0 1 19 00000000 00000084 [00000424e868e465] 0.014ms (+0.000ms): __schedule+0x16/0xad6 <ffffffff802f7609> (cpu_idle+0xbe/0xd3 <ffffffff8010c6dd>)
(T1/#133)           <idle>     0 1 19 00000000 00000085 [00000424e868e560] 0.014ms (+0.000ms): profile_hit+0x14/0x19f <ffffffff8013333d> (__schedule+0xb0/0xad6 <ffffffff802f76a3>)
(T1/#134)           <idle>     0 1 19 00000001 00000086 [00000424e868e6a4] 0.014ms (+0.000ms): sched_clock+0x9/0x26 <ffffffff8011180c> (__schedule+0x110/0xad6 <ffffffff802f7703>)
(T1/#135)            <...>   697 0 0 00000000 00000087 [00000424e868e737] 0.014ms (+0.000ms): ata_bmdma_irq_clear+0x9/0x2b <ffffffff88032170> (ata_host_intr+0x7c/0xb6 <ffffffff88033e06>)
(T1/#136)           <idle>     0 1 19 00000001 00000088 [00000424e868e88e] 0.014ms (+0.000ms): _raw_spin_lock_irq+0xb/0x2c <ffffffff802fa2dd> (__schedule+0x18c/0xad6 <ffffffff802f777f>)
(T1/#137)            <...>   697 0 0 00000000 00000089 [00000424e868eb5d] 0.015ms (+0.000ms): ata_qc_complete+0x13/0x20b <ffffffff88032870> (ata_host_intr+0x9e/0xb6 <ffffffff88033e28>)
(T1/#138)           <idle>     0 1 19 00000002 0000008a [00000424e868ec85] 0.015ms (+0.000ms): double_lock_balance+0xc/0x43 <ffffffff80129cf7> (__schedule+0x2f8/0xad6 <ffffffff802f78eb>)
(T1/#139)            <...>   697 0 0 00000000 0000008b [00000424e868ec87] 0.015ms (+0.000ms): dma_unmap_sg+0x14/0x6d <ffffffff8011c396> (ata_qc_complete+0x132/0x20b <ffffffff8803298f>)
(T1/#140)           <idle>     0 1 19 00000002 0000008c [00000424e868ed68] 0.015ms (+0.000ms): _raw_spin_trylock+0xb/0x5d <ffffffff802fa8f7> (double_lock_balance+0x1a/0x43 <ffffffff80129d05>)
(T1/#141)            <...>   697 0 0 00000000 0000008d [00000424e868ed93] 0.015ms (+0.000ms): dma_unmap_single+0xc/0xe6 <ffffffff8011c2a8> (dma_unmap_sg+0x5b/0x6d <ffffffff8011c3dd>)
(T1/#142)            <...>   697 0 0 00000000 0000008e [00000424e868ef1d] 0.015ms (+0.000ms): ata_scsi_qc_complete+0x10/0x89 <ffffffff88036aa1> (ata_qc_complete+0x1f2/0x20b <ffffffff88032a4f>)
(T1/#143)            <...>   697 0 0 00000000 0000008f [00000424e868f076] 0.015ms (+0.000ms): scsi_done+0xc/0x24 <ffffffff880028d4> (ata_scsi_qc_complete+0x7f/0x89 <ffffffff88036b10>)
(T1/#144)            <...>   697 0 0 00000000 00000090 [00000424e868f16f] 0.015ms (+0.000ms): scsi_delete_timer+0xc/0x65 <ffffffff8800576d> (scsi_done+0x14/0x24 <ffffffff880028dc>)
(T1/#145)           <idle>     0 1 19 00000003 00000091 [00000424e868f1ec] 0.015ms (+0.000ms): find_next_bit+0xc/0x74 <ffffffff80200598> (__schedule+0x3a2/0xad6 <ffffffff802f7995>)
(T1/#146)            <...>   697 0 0 00000000 00000092 [00000424e868f268] 0.016ms (+0.000ms): del_timer+0xe/0x5e <ffffffff8013b7a8> (scsi_delete_timer+0x1b/0x65 <ffffffff8800577c>)
(T1/#147)            <...>   697 0 0 00000000 00000093 [00000424e868f359] 0.016ms (+0.000ms): lock_timer_base+0xf/0x4c <ffffffff8013b5a6> (del_timer+0x23/0x5e <ffffffff8013b7bd>)
(T1/#148)           <idle>     0 1 19 00000003 00000094 [00000424e868f46a] 0.016ms (+0.000ms): find_next_bit+0xc/0x74 <ffffffff80200598> (__schedule+0x3a2/0xad6 <ffffffff802f7995>)
(T1/#149)           <idle>     0 1 19 00000003 00000095 [00000424e868f589] 0.016ms (+0.000ms): __find_first_bit+0x9/0x34 <ffffffff80200551> (find_next_bit+0x65/0x74 <ffffffff802005f1>)
(T1/#150)            <...>   697 0 0 00000000 00000096 [00000424e868f5fc] 0.016ms (+0.000ms): _spin_lock_irqsave+0x9/0x4b <ffffffff802f9db4> (lock_timer_base+0x27/0x4c <ffffffff8013b5be>)
(T1/#151)            <...>   697 0 0 00000000 00000097 [00000424e868f735] 0.016ms (+0.000ms): account_mutex_owner_down+0x9/0x4c <ffffffff8014a3fd> (_spin_lock_irqsave+0x3a/0x4b <ffffffff802f9de5>)
(T1/#152)           <idle>     0 1 19 00000003 00000098 [00000424e868f850] 0.016ms (+0.000ms): find_next_bit+0xc/0x74 <ffffffff80200598> (__schedule+0x3a2/0xad6 <ffffffff802f7995>)
(T1/#153)            <...>   697 0 0 00000000 00000099 [00000424e868f8df] 0.016ms (+0.000ms): _spin_unlock_irqrestore+0xc/0x6e <ffffffff802fa13a> (del_timer+0x54/0x5e <ffffffff8013b7ee>)
(T1/#154)           <idle>     0 1 19 00000003 0000009a [00000424e868f945] 0.016ms (+0.000ms): __find_first_bit+0x9/0x34 <ffffffff80200551> (find_next_bit+0x65/0x74 <ffffffff802005f1>)
(T1/#155)            <...>   697 0 0 00000000 0000009b [00000424e868f9c7] 0.016ms (+0.000ms): up_mutex+0x9/0x3f <ffffffff8014b0f4> (_spin_unlock_irqrestore+0x6a/0x6e <ffffffff802fa198>)
(T1/#156)           <idle>     0 1 19 00000003 0000009c [00000424e868fa58] 0.017ms (+0.000ms): _raw_spin_unlock+0x9/0x3e <ffffffff802fa6f7> (__schedule+0x437/0xad6 <ffffffff802f7a2a>)
(T1/#157)            <...>   697 0 0 00000000 0000009d [00000424e868fab1] 0.017ms (+0.000ms): account_mutex_owner_up+0x9/0x4b <ffffffff8014a449> (up_mutex+0x36/0x3f <ffffffff8014b121>)
(T1/#158)           <idle>     0 1 19 00000002 0000009e [00000424e868fbed] 0.017ms (+0.000ms): constant_test_bit+0x9/0x25 <ffffffff80152845> (_raw_spin_unlock+0x33/0x3e <ffffffff802fa721>)
(T1/#159)            <...>   697 0 0 00000000 0000009f [00000424e868fc11] 0.017ms (+0.000ms): __scsi_done+0xc/0x9b <ffffffff88002839> (scsi_done+0x20/0x24 <ffffffff880028e8>)
(T1/#160)           <idle>     0 1 19 00000002 000000a0 [00000424e868fcec] 0.017ms (+0.000ms): preempt_schedule+0xc/0x9d <ffffffff802f82a1> (_raw_spin_unlock+0x3c/0x3e <ffffffff802fa72a>)
(T1/#161)            <...>   697 0 16 00000000 000000a1 [00000424e868fdb2] 0.017ms (+0.000ms): raise_softirq_irqoff+0x9/0x5f <ffffffff80137ad2> (__scsi_done+0x73/0x9b <ffffffff880028a0>)
(T1/#162)           <idle>     0 1 19 00000002 000000a2 [00000424e868fe1b] 0.017ms (+0.000ms): dependent_sleeper+0x16/0x3f1 <ffffffff8012d508> (__schedule+0x74c/0xad6 <ffffffff802f7d3f>)
(T1/#163)            <...>   697 0 16 00000000 000000a3 [00000424e868fefd] 0.017ms (+0.000ms): wakeup_softirqd+0x9/0x38 <ffffffff801373e5> (raise_softirq_irqoff+0x5d/0x5f <ffffffff80137b26>)
(T5/#164) [ =>          swapper ] 0.017ms (+0.000ms)
(T1/#165)            <...>   697 0 16 00000000 000000a5 [00000424e8690058] 0.017ms (+0.000ms): wake_up_process+0xb/0x31 <ffffffff8012cd66> (wakeup_softirqd+0x36/0x38 <ffffffff80137412>)
(T1/#166)           <idle>     0 1 17 00000002 000000a6 [00000424e8690071] 0.017ms (+0.000ms): __switch_to+0x13/0x212 <ffffffff8010c7d1> (thread_return+0x0/0x13f <ffffffff802f80c9>)
(T1/#167)            <...>   697 0 16 00000000 000000a7 [00000424e8690136] 0.017ms (+0.000ms): check_preempt_wakeup+0xc/0xac <ffffffff8014a814> (wake_up_process+0x13/0x31 <ffffffff8012cd6e>)
(T3/#168)    <...>-14    1D..2   17us : thread_return+0x4a/0x13f <ffffffff802f8113> <<idle>-0> (8c 62)
(T1/#169)            <...>   697 0 16 00000000 000000a9 [00000424e8690245] 0.018ms (+0.000ms): try_to_wake_up+0x16/0x560 <ffffffff8012c747> (wake_up_process+0x24/0x31 <ffffffff8012cd7f>)
(T1/#170)            <...>    14 1 16 00000002 000000aa [00000424e86902fc] 0.018ms (+0.000ms): _raw_spin_unlock_irq+0x9/0x44 <ffffffff802fa5d7> (thread_return+0xa3/0x13f <ffffffff802f816c>)
(T1/#171)            <...>   697 0 16 00000000 000000ab [00000424e8690363] 0.018ms (+0.000ms): _raw_spin_lock+0xb/0x25 <ffffffff802fa1a7> (try_to_wake_up+0x5d/0x560 <ffffffff8012c78e>)
(T1/#172)            <...>    14 1 0 00000001 000000ac [00000424e869042d] 0.018ms (+0.000ms): constant_test_bit+0x9/0x25 <ffffffff80152845> (_raw_spin_unlock_irq+0x39/0x44 <ffffffff802fa607>)
(T1/#173)            <...>    14 1 0 00000001 000000ad [00000424e8690556] 0.018ms (+0.000ms): trace_stop_sched_switched+0x16/0x332 <ffffffff801504d0> (thread_return+0xb1/0x13f <ffffffff802f817a>)
(T1/#174)            <...>   697 0 16 00000001 000000ae [00000424e86905a9] 0.018ms (+0.000ms): idle_cpu+0x9/0x30 <ffffffff80129d37> (try_to_wake_up+0x292/0x560 <ffffffff8012c9c3>)
(T1/#175)            <...>    14 1 16 00000001 000000af [00000424e869065c] 0.018ms (+0.000ms): _raw_spin_lock+0xb/0x25 <ffffffff802fa1a7> (trace_stop_sched_switched+0x49/0x332 <ffffffff80150503>)
(T1/#176)            <...>   697 0 16 00000001 000000b0 [00000424e86906b0] 0.018ms (+0.000ms): smp_send_reschedule_allbutself+0x9/0x1a <ffffffff8011809b> (try_to_wake_up+0x3e9/0x560 <ffffffff8012cb1a>)
(T1/#177)            <...>   697 0 16 00000001 000000b1 [00000424e869078d] 0.018ms (+0.000ms): flat_send_IPI_allbutself+0xb/0x5b <ffffffff8011b644> (smp_send_reschedule_allbutself+0x18/0x1a <ffffffff801180aa>)
(T3/#178)    <...>-14    1D..2   18us : trace_stop_sched_switched+0x6f/0x332 <ffffffff80150529> <<...>-14> (62 1)
(T1/#179)            <...>   697 0 16 00000001 000000b3 [00000424e8690871] 0.018ms (+0.000ms): __bitmap_weight+0xa/0x18b <ffffffff801fbce6> (flat_send_IPI_allbutself+0x1e/0x5b <ffffffff8011b657>)
(T1/#180)            <...>    14 1 16 00000003 000000b4 [00000424e8690898] 0.018ms (+0.000ms): _raw_spin_unlock+0x9/0x3e <ffffffff802fa6f7> (trace_stop_sched_switched+0xbf/0x332 <ffffffff80150579>)
(T1/#181)            <...>    14 1 16 00000002 000000b5 [00000424e86909ac] 0.018ms (+0.000ms): constant_test_bit+0x9/0x25 <ffffffff80152845> (_raw_spin_unlock+0x33/0x3e <ffffffff802fa721>)
(T1/#182)            <...>   697 0 16 00000001 000000b6 [00000424e8690a78] 0.019ms (+0.000ms): activate_task+0x10/0xe0 <ffffffff8012bc38> (try_to_wake_up+0x491/0x560 <ffffffff8012cbc2>)
(T1/#183)            <...>   697 0 16 00000001 000000b7 [00000424e8690b50] 0.019ms (+0.000ms): sched_clock+0x9/0x26 <ffffffff8011180c> (activate_task+0x1d/0xe0 <ffffffff8012bc45>)
(T1/#184)            <...>    14 1 16 00000002 000000b8 [00000424e8690b62] 0.019ms (+9177482346838.318ms): thread_return+0xb1/0x13f <ffffffff802f817a> (thread_return+0xb1/0x13f <ffffffff802f817a>)


vim:ft=help
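
The last entry above carries a delta of +9177482346838.318ms while every other entry shows +0.000ms, which is easy to miss in a dump this long. Below is a minimal sketch for picking such entries out of a saved trace. The column layout (index, comm, pid, cpu, three unnamed fields, bracketed raw timestamp, relative time, delta) is only inferred from the dump itself, not from the tracer's documentation. The names parse_t1 and suspicious and the 1000 ms threshold are made up for illustration.

```python
import re

# Assumed layout of a verbose (T1) line, inferred from the dump above:
# (T1/#N) comm pid cpu <3 unnamed fields> [raw stamp, hex] rel_ms (+delta_ms): func ...
T1_LINE = re.compile(
    r"\(T1/#(?P<idx>\d+)\)\s+(?P<comm>\S+)\s+(?P<pid>\d+)\s+(?P<cpu>\d+)\s+"
    r"\S+\s+\S+\s+\S+\s+\[(?P<stamp>[0-9a-f]+)\]\s+"
    r"(?P<rel>[\d.]+)ms\s+\(\+(?P<delta>[\d.]+)ms\):\s+(?P<func>\S+)"
)


def parse_t1(line):
    """Return a dict for a T1 entry, or None for any other line (T3, T5, blank)."""
    # search() rather than match() so that "> "-quoted lines parse as well.
    m = T1_LINE.search(line)
    if m is None:
        return None
    return {
        "idx": int(m["idx"]),
        "comm": m["comm"],
        "pid": int(m["pid"]),
        "cpu": int(m["cpu"]),
        "stamp": int(m["stamp"], 16),   # raw bracketed value, units unknown
        "rel_ms": float(m["rel"]),
        "delta_ms": float(m["delta"]),
        "func": m["func"],
    }


def suspicious(entries, limit_ms=1000.0):
    """Entries whose printed delta exceeds limit_ms (arbitrary threshold)."""
    return [e for e in entries if e["delta_ms"] > limit_ms]


if __name__ == "__main__":
    import sys
    parsed = [e for e in map(parse_t1, sys.stdin) if e is not None]
    for e in suspicious(parsed):
        print(f"#{e['idx']} cpu{e['cpu']} pid {e['pid']}: "
              f"+{e['delta_ms']}ms at {e['func']}")
```

Run over this trace it would flag only entry #184. For comparison, the raw bracketed values of #183 and #184 differ by just 0x12, so the printed delta may not reflect time actually elapsed between those two entries.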


* Re: AMD64 X2 lost ticks on PM timer
  2006-03-03 19:18                         ` Bill Rugolsky Jr.
@ 2006-03-03 21:26                           ` Lee Revell
  2006-03-03 22:09                             ` Jeff Garzik
  0 siblings, 1 reply; 60+ messages in thread
From: Lee Revell @ 2006-03-03 21:26 UTC
  To: Bill Rugolsky Jr.
  Cc: Andi Kleen, Jason Baron, linux-kernel, john stultz, Ingo Molnar

All this tells me is that your system's timer is screwed up (not news).

John, Ingo, any ideas?

Lee

On Fri, 2006-03-03 at 14:18 -0500, Bill Rugolsky Jr. wrote:
> On Wed, Mar 01, 2006 at 02:16:50PM -0500, Lee Revell wrote:
> > On Wed, 2006-03-01 at 19:29 +0100, Andi Kleen wrote:
> > > Sprinkle WARN_ON(in_interrupt()) all over the parts that shouldn't
> > > have interrupts 
> > > off. 
> > 
> > Might be faster to just try the -rt kernel, it has tons of debugging
> > checks for stuff like this.
> 
> After several attempts where 2.6.15-rt18 reset on startup, I whittled
> my config down to something minimal (turned off NUMA, CPUSETS, PRINTK_TIME, ...)
> and got it up and running PREEMPT_RT:
> 
> rugolsky@ti94: uname -a
> Linux ti94 2.6.15-rt18-realtime #4 SMP PREEMPT Fri Mar 3 11:39:20 EST 2006 x86_64 x86_64 x86_64 GNU/Linux
> 
> rugolsky@ti94: egrep 'PREEMPT|LATENCY|HZ' .config
> # CONFIG_PREEMPT_NONE is not set
> # CONFIG_PREEMPT_VOLUNTARY is not set
> # CONFIG_PREEMPT_DESKTOP is not set
> CONFIG_PREEMPT_RT=y
> CONFIG_PREEMPT=y
> CONFIG_PREEMPT_SOFTIRQS=y
> CONFIG_PREEMPT_HARDIRQS=y
> CONFIG_PREEMPT_BKL=y
> CONFIG_PREEMPT_RCU=y
> # CONFIG_HZ_100 is not set
> # CONFIG_HZ_250 is not set
> CONFIG_HZ_1000=y
> CONFIG_HZ=1000
> CONFIG_DEBUG_PREEMPT=y
> # CONFIG_WAKEUP_LATENCY_HIST is not set
> CONFIG_PREEMPT_TRACE=y
> CONFIG_CRITICAL_PREEMPT_TIMING=y
> # CONFIG_PREEMPT_OFF_HIST is not set
> CONFIG_LATENCY_TIMING=y
> CONFIG_LATENCY_TRACE=y
> 
> % sudo sysctl -a | egrep 'kernel\.*(preempt|latency|trace|wakeup)'
> kernel.preempt_thresh = 0
> kernel.preempt_max_latency = 2483
> kernel.trace_all_cpus = 1
> kernel.trace_verbose = 1
> kernel.trace_print_at_crash = 1
> kernel.trace_freerunning = 0
> kernel.trace_user_trigger_irq = -1
> kernel.trace_user_trigger_irq = -1
> kernel.trace_user_triggered = 0
> kernel.trace_enabled = 1
> kernel.wakeup_timing = 1
> 
> % cat /sys/devices/system/clocksource/clocksource0/current_clocksource 
> acpi_pm 
> 
> Should I be compiling for a different preempt mode?  I've only cursorily
> followed the realtime-preempt patch discussion threads, so I am unclear as
> to what debugging facilities are available with each preemption level.
> [Is there a howto/tutorial floating around on using the debugging
> features?]
> 
> I got the following trace.  Since you have a great deal of experience
> interpreting these traces, perhaps you can help me interpret this one?
> 
> Thanks.
> 
> 	Bill Rugolsky
> 
> 
> preemption latency trace v1.1.5 on 2.6.15-rt18-realtime
> --------------------------------------------------------------------
>  latency: 2483 us, #238/238, CPU#1 | (M:rt VP:0, KP:0, SP:1 HP:1 #P:2)
>     -----------------
>     | task: softirq-timer/1-14 (uid:0 nice:0 policy:1 rt_prio:1)
>     -----------------
> 
> (T1/#0)            <...>   697 0 16 00000000 00000000 [00000424e86874c5] 0.000ms (+0.000ms): smp_apic_timer_interrupt+0xc/0x48 <ffffffff8011902a> (apic_timer_interrupt+0x84/0x8c <ffffffff8010ead0>)
> (T1/#1)            <...>   697 0 20 00000000 00000001 [00000424e86876a8] 0.000ms (+0.000ms): smp_local_timer_interrupt+0xc/0x32 <ffffffff80118ad5> (smp_apic_timer_interrupt+0x3e/0x48 <ffffffff8011905c>)
> (T1/#2)            <...>   697 0 20 00000000 00000002 [00000424e868778c] 0.000ms (+0.000ms): profile_tick+0xc/0x77 <ffffffff801334d4> (smp_local_timer_interrupt+0x1c/0x32 <ffffffff80118ae5>)
> (T1/#3)            <...>   697 0 20 00000000 00000003 [00000424e8687885] 0.000ms (+0.000ms): profile_pc+0xc/0x71 <ffffffff8011173c> (profile_tick+0x67/0x77 <ffffffff8013352f>)
> (T1/#4)            <...>   697 0 20 00000000 00000004 [00000424e86879a4] 0.000ms (+0.000ms): profile_hit+0x14/0x19f <ffffffff8013333d> (profile_tick+0x72/0x77 <ffffffff8013353a>)
> (T1/#5)            <...>   697 0 20 00000000 00000005 [00000424e8687aae] 0.000ms (+0.000ms): update_process_times+0xc/0x68 <ffffffff8013b457> (smp_local_timer_interrupt+0x2e/0x32 <ffffffff80118af7>)
> (T1/#6)            <...>   697 0 20 00000000 00000006 [00000424e8687baf] 0.000ms (+0.000ms): account_system_time+0x9/0x9e <ffffffff8012a19c> (update_process_times+0x3f/0x68 <ffffffff8013b48a>)
> (T1/#7)            <...>   697 0 20 00000000 00000007 [00000424e8687cb0] 0.001ms (+0.000ms): acct_update_integrals+0x9/0x59 <ffffffff801571a1> (account_system_time+0x9c/0x9e <ffffffff8012a22f>)
> (T1/#8)            <...>   697 0 20 00000000 00000008 [00000424e8687dca] 0.001ms (+0.000ms): run_local_timers+0x9/0x15 <ffffffff8013b0cd> (update_process_times+0x44/0x68 <ffffffff8013b48f>)
> (T1/#9)            <...>   697 0 20 00000000 00000009 [00000424e8687ea7] 0.001ms (+0.000ms): raise_softirq+0xc/0x91 <ffffffff80137cf8> (run_local_timers+0x13/0x15 <ffffffff8013b0d7>)
> (T1/#10)            <...>   697 0 20 00000000 0000000a [00000424e8687fce] 0.001ms (+0.000ms): wakeup_softirqd+0x9/0x38 <ffffffff801373e5> (raise_softirq+0x6f/0x91 <ffffffff80137d5b>)
> (T1/#11)            <...>   697 0 20 00000000 0000000b [00000424e86880c5] 0.001ms (+0.000ms): wake_up_process+0xb/0x31 <ffffffff8012cd66> (wakeup_softirqd+0x36/0x38 <ffffffff80137412>)
> (T1/#12)            <...>   697 0 20 00000000 0000000c [00000424e86881ab] 0.001ms (+0.000ms): check_preempt_wakeup+0xc/0xac <ffffffff8014a814> (wake_up_process+0x13/0x31 <ffffffff8012cd6e>)
> (T1/#13)            <...>   697 0 20 00000000 0000000d [00000424e86882b4] 0.001ms (+0.000ms): try_to_wake_up+0x16/0x560 <ffffffff8012c747> (wake_up_process+0x24/0x31 <ffffffff8012cd7f>)
> (T1/#14)            <...>   697 0 20 00000000 0000000e [00000424e86883ad] 0.001ms (+0.000ms): _raw_spin_lock+0xb/0x25 <ffffffff802fa1a7> (try_to_wake_up+0x5d/0x560 <ffffffff8012c78e>)
> (T1/#15)            <...>   697 0 20 00000001 0000000f [00000424e86884fd] 0.002ms (+0.000ms): idle_cpu+0x9/0x30 <ffffffff80129d37> (try_to_wake_up+0x292/0x560 <ffffffff8012c9c3>)
> (T1/#16)           <idle>     0 1 23 00000003 00000010 [00000424e86885b5] 0.002ms (+0.000ms): _raw_spin_unlock_irqrestore+0xb/0x62 <ffffffff802fa61d> (unmask_IO_APIC_irq+0x35/0x3a <ffffffff8011a854>)
> (T1/#17)            <...>   697 0 20 00000001 00000011 [00000424e868868b] 0.002ms (+0.000ms): smp_send_reschedule_allbutself+0x9/0x1a <ffffffff8011809b> (try_to_wake_up+0x3e9/0x560 <ffffffff8012cb1a>)
> (T1/#18)           <idle>     0 1 23 00000002 00000012 [00000424e86886b0] 0.002ms (+0.000ms): check_raw_flags+0x9/0x5b <ffffffff8014a6f5> (_raw_spin_unlock_irqrestore+0x26/0x62 <ffffffff802fa638>)
> (T1/#19)            <...>   697 0 20 00000001 00000013 [00000424e8688779] 0.002ms (+0.000ms): flat_send_IPI_allbutself+0xb/0x5b <ffffffff8011b644> (smp_send_reschedule_allbutself+0x18/0x1a <ffffffff801180aa>)
> (T1/#20)           <idle>     0 1 23 00000002 00000014 [00000424e86887ba] 0.002ms (+0.000ms): constant_test_bit+0x9/0x25 <ffffffff80152845> (_raw_spin_unlock_irqrestore+0x55/0x62 <ffffffff802fa667>)
> (T1/#21)            <...>   697 0 20 00000001 00000015 [00000424e8688855] 0.002ms (+0.000ms): __bitmap_weight+0xa/0x18b <ffffffff801fbce6> (flat_send_IPI_allbutself+0x1e/0x5b <ffffffff8011b657>)
> (T1/#22)           <idle>     0 1 23 00000002 00000016 [00000424e86888ab] 0.002ms (+0.000ms): preempt_schedule+0xc/0x9d <ffffffff802f82a1> (_raw_spin_unlock_irqrestore+0x5e/0x62 <ffffffff802fa670>)
> (T1/#23)           <idle>     0 1 23 00000002 00000017 [00000424e86889c7] 0.002ms (+0.000ms): _raw_spin_unlock+0x9/0x3e <ffffffff802fa6f7> (__do_IRQ+0x12a/0x141 <ffffffff8015deb9>)
> (T1/#24)            <...>   697 0 20 00000001 00000018 [00000424e8688a6c] 0.002ms (+0.000ms): activate_task+0x10/0xe0 <ffffffff8012bc38> (try_to_wake_up+0x491/0x560 <ffffffff8012cbc2>)
> (T1/#25)           <idle>     0 1 23 00000001 00000019 [00000424e8688abd] 0.002ms (+0.000ms): constant_test_bit+0x9/0x25 <ffffffff80152845> (_raw_spin_unlock+0x33/0x3e <ffffffff802fa721>)
> (T1/#26)            <...>   697 0 20 00000001 0000001a [00000424e8688b4b] 0.002ms (+0.000ms): sched_clock+0x9/0x26 <ffffffff8011180c> (activate_task+0x1d/0xe0 <ffffffff8012bc45>)
> (T1/#27)           <idle>     0 1 23 00000001 0000001b [00000424e8688bbc] 0.002ms (+0.000ms): preempt_schedule+0xc/0x9d <ffffffff802f82a1> (_raw_spin_unlock+0x3c/0x3e <ffffffff802fa72a>)
> (T3/#28)    <...>-697   0D.h1    2us : activate_task+0x9b/0xe0 <ffffffff8012bcc3> <<...>-4> (62 1)
> (T1/#29)           <idle>     0 1 23 00000001 0000001d [00000424e8688cd8] 0.003ms (+0.000ms): irq_exit+0x9/0x28 <ffffffff80137aaa> (do_IRQ+0x3a/0x44 <ffffffff80110767>)
> (T1/#30)            <...>   697 0 20 00000001 0000001e [00000424e8688d26] 0.003ms (+0.000ms): enqueue_task+0xc/0x95 <ffffffff80129be4> (activate_task+0xa7/0xe0 <ffffffff8012bccf>)
> (T1/#31)            <...>   697 0 20 00000001 0000001f [00000424e8688f26] 0.003ms (+0.000ms): _raw_spin_unlock_irqrestore+0xb/0x62 <ffffffff802fa61d> (try_to_wake_up+0x54f/0x560 <ffffffff8012cc80>)
> (T1/#32)           <idle>     0 1 19 00000001 00000020 [00000424e8688f9e] 0.003ms (+0.000ms): smp_reschedule_interrupt+0x9/0x16 <ffffffff801186ee> (reschedule_interrupt+0x84/0x8c <ffffffff8010e558>)
> (T1/#33)            <...>   697 0 20 00000000 00000021 [00000424e868901e] 0.003ms (+0.000ms): check_raw_flags+0x9/0x5b <ffffffff8014a6f5> (_raw_spin_unlock_irqrestore+0x26/0x62 <ffffffff802fa638>)
> (T1/#34)            <...>   697 0 20 00000000 00000022 [00000424e8689137] 0.003ms (+0.000ms): constant_test_bit+0x9/0x25 <ffffffff80152845> (_raw_spin_unlock_irqrestore+0x55/0x62 <ffffffff802fa667>)
> (T1/#35)           <idle>     0 1 19 00000001 00000023 [00000424e8689211] 0.003ms (+0.000ms): smp_apic_timer_interrupt+0xc/0x48 <ffffffff8011902a> (apic_timer_interrupt+0x84/0x8c <ffffffff8010ead0>)
> (T1/#36)            <...>   697 0 20 00000000 00000024 [00000424e8689254] 0.003ms (+0.000ms): wake_up_process+0x2b/0x31 <ffffffff8012cd86> (wakeup_softirqd+0x36/0x38 <ffffffff80137412>)
> (T1/#37)            <...>   697 0 20 00000000 00000025 [00000424e8689354] 0.003ms (+0.000ms): check_raw_flags+0x9/0x5b <ffffffff8014a6f5> (raise_softirq+0x77/0x91 <ffffffff80137d63>)
> (T1/#38)           <idle>     0 1 23 00000001 00000026 [00000424e8689354] 0.003ms (+0.000ms): smp_local_timer_interrupt+0xc/0x32 <ffffffff80118ad5> (smp_apic_timer_interrupt+0x3e/0x48 <ffffffff8011905c>)
> (T1/#39)           <idle>     0 1 23 00000001 00000027 [00000424e8689426] 0.003ms (+0.000ms): profile_tick+0xc/0x77 <ffffffff801334d4> (smp_local_timer_interrupt+0x1c/0x32 <ffffffff80118ae5>)
> (T1/#40)            <...>   697 0 20 00000000 00000028 [00000424e8689492] 0.004ms (+0.000ms): rcu_pending+0x9/0x30 <ffffffff801445e8> (update_process_times+0x4b/0x68 <ffffffff8013b496>)
> (T1/#41)           <idle>     0 1 23 00000001 00000029 [00000424e8689520] 0.004ms (+0.000ms): profile_pc+0xc/0x71 <ffffffff8011173c> (profile_tick+0x67/0x77 <ffffffff8013352f>)
> (T1/#42)            <...>   697 0 20 00000000 0000002a [00000424e86895c8] 0.004ms (+0.000ms): scheduler_tick+0x13/0x34c <ffffffff8012d1b9> (update_process_times+0x5e/0x68 <ffffffff8013b4a9>)
> (T1/#43)           <idle>     0 1 23 00000001 0000002b [00000424e8689636] 0.004ms (+0.000ms): profile_hit+0x14/0x19f <ffffffff8013333d> (profile_tick+0x72/0x77 <ffffffff8013353a>)
> (T1/#44)            <...>   697 0 20 00000000 0000002c [00000424e86896af] 0.004ms (+0.000ms): sched_clock+0x9/0x26 <ffffffff8011180c> (scheduler_tick+0x3d/0x34c <ffffffff8012d1e3>)
> (T1/#45)           <idle>     0 1 23 00000001 0000002d [00000424e8689754] 0.004ms (+0.000ms): update_process_times+0xc/0x68 <ffffffff8013b457> (smp_local_timer_interrupt+0x2e/0x32 <ffffffff80118af7>)
> (T1/#46)            <...>   697 0 20 00000000 0000002e [00000424e86897c5] 0.004ms (+0.000ms): _raw_spin_lock+0xb/0x25 <ffffffff802fa1a7> (scheduler_tick+0xce/0x34c <ffffffff8012d274>)
> (T1/#47)           <idle>     0 1 23 00000001 0000002f [00000424e868984a] 0.004ms (+0.000ms): account_system_time+0x9/0x9e <ffffffff8012a19c> (update_process_times+0x3f/0x68 <ffffffff8013b48a>)
> (T1/#48)            <...>   697 0 20 00000001 00000030 [00000424e86898f8] 0.004ms (+0.000ms): _raw_spin_unlock+0x9/0x3e <ffffffff802fa6f7> (scheduler_tick+0x330/0x34c <ffffffff8012d4d6>)
> (T1/#49)           <idle>     0 1 23 00000001 00000031 [00000424e868994c] 0.004ms (+0.000ms): acct_update_integrals+0x9/0x59 <ffffffff801571a1> (account_system_time+0x9c/0x9e <ffffffff8012a22f>)
> (T1/#50)            <...>   697 0 20 00000000 00000032 [00000424e8689a04] 0.004ms (+0.000ms): constant_test_bit+0x9/0x25 <ffffffff80152845> (_raw_spin_unlock+0x33/0x3e <ffffffff802fa721>)
> (T1/#51)           <idle>     0 1 23 00000001 00000033 [00000424e8689a5a] 0.004ms (+0.000ms): run_local_timers+0x9/0x15 <ffffffff8013b0cd> (update_process_times+0x44/0x68 <ffffffff8013b48f>)
> (T1/#52)            <...>   697 0 20 00000000 00000034 [00000424e8689b0b] 0.004ms (+0.000ms): rebalance_tick+0x16/0x2e8 <ffffffff8012ced4> (scheduler_tick+0x340/0x34c <ffffffff8012d4e6>)
> (T1/#53)           <idle>     0 1 23 00000001 00000035 [00000424e8689b36] 0.004ms (+0.000ms): raise_softirq+0xc/0x91 <ffffffff80137cf8> (run_local_timers+0x13/0x15 <ffffffff8013b0d7>)
> (T1/#54)           <idle>     0 1 23 00000001 00000036 [00000424e8689c41] 0.005ms (+0.000ms): wakeup_softirqd+0x9/0x38 <ffffffff801373e5> (raise_softirq+0x6f/0x91 <ffffffff80137d5b>)
> (T1/#55)           <idle>     0 1 23 00000001 00000037 [00000424e8689d50] 0.005ms (+0.000ms): check_raw_flags+0x9/0x5b <ffffffff8014a6f5> (raise_softirq+0x77/0x91 <ffffffff80137d63>)
> (T1/#56)            <...>   697 0 20 00000000 00000038 [00000424e8689dc9] 0.005ms (+0.000ms): softlockup_tick+0xf/0x11d <ffffffff8015d672> (update_process_times+0x63/0x68 <ffffffff8013b4ae>)
> (T1/#57)           <idle>     0 1 23 00000001 00000039 [00000424e8689e81] 0.005ms (+0.000ms): rcu_pending+0x9/0x30 <ffffffff801445e8> (update_process_times+0x4b/0x68 <ffffffff8013b496>)
> (T1/#58)            <...>   697 0 20 00000000 0000003a [00000424e8689ef2] 0.005ms (+0.000ms): irq_exit+0x9/0x28 <ffffffff80137aaa> (smp_apic_timer_interrupt+0x43/0x48 <ffffffff80119061>)
> (T1/#59)           <idle>     0 1 23 00000001 0000003b [00000424e8689f91] 0.005ms (+0.000ms): scheduler_tick+0x13/0x34c <ffffffff8012d1b9> (update_process_times+0x5e/0x68 <ffffffff8013b4a9>)
> (T1/#60)           <idle>     0 1 23 00000001 0000003c [00000424e868a078] 0.005ms (+0.000ms): sched_clock+0x9/0x26 <ffffffff8011180c> (scheduler_tick+0x3d/0x34c <ffffffff8012d1e3>)
> (T1/#61)           <idle>     0 1 23 00000001 0000003d [00000424e868a17d] 0.005ms (+0.000ms): _raw_spin_lock+0xb/0x25 <ffffffff802fa1a7> (scheduler_tick+0x81/0x34c <ffffffff8012d227>)
> (T1/#62)            <...>   697 0 16 00000000 0000003e [00000424e868a1e5] 0.005ms (+0.000ms): do_IRQ+0xc/0x44 <ffffffff80110739> (ret_from_intr+0x0/0x12 <ffffffff8010e276>)
> (T1/#63)           <idle>     0 1 23 00000002 0000003f [00000424e868a2cf] 0.005ms (+0.000ms): resched_task+0xc/0x79 <ffffffff8012a958> (scheduler_tick+0x95/0x34c <ffffffff8012d23b>)
> (T1/#64)            <...>   697 0 20 00000000 00000040 [00000424e868a332] 0.005ms (+0.000ms): _raw_spin_lock+0xb/0x25 <ffffffff802fa1a7> (__do_IRQ+0x5d/0x141 <ffffffff8015ddec>)
> (T1/#65)           <idle>     0 1 23 00000002 00000041 [00000424e868a3e6] 0.006ms (+0.000ms): _raw_spin_unlock+0x9/0x3e <ffffffff802fa6f7> (scheduler_tick+0x9d/0x34c <ffffffff8012d243>)
> (T1/#66)            <...>   697 0 20 00000001 00000042 [00000424e868a469] 0.006ms (+0.000ms): mask_and_ack_level_ioapic_irq+0x10/0xa2 <ffffffff8011aa2d> (__do_IRQ+0x72/0x141 <ffffffff8015de01>)
> (T1/#67)           <idle>     0 1 23 00000001 00000043 [00000424e868a4e0] 0.006ms (+0.000ms): constant_test_bit+0x9/0x25 <ffffffff80152845> (_raw_spin_unlock+0x33/0x3e <ffffffff802fa721>)
> (T1/#68)            <...>   697 0 20 00000001 00000044 [00000424e868a548] 0.006ms (+0.000ms): mask_IO_APIC_irq+0xb/0x11e <ffffffff8011a90a> (mask_and_ack_level_ioapic_irq+0x8e/0xa2 <ffffffff8011aaab>)
> (T1/#69)           <idle>     0 1 23 00000001 00000045 [00000424e868a5e8] 0.006ms (+0.000ms): preempt_schedule+0xc/0x9d <ffffffff802f82a1> (_raw_spin_unlock+0x3c/0x3e <ffffffff802fa72a>)
> (T1/#70)            <...>   697 0 20 00000001 00000046 [00000424e868a6bd] 0.006ms (+0.000ms): _raw_spin_lock_irqsave+0xc/0x33 <ffffffff802fa2ab> (mask_IO_APIC_irq+0x19/0x11e <ffffffff8011a918>)
> (T1/#71)           <idle>     0 1 23 00000001 00000047 [00000424e868a70c] 0.006ms (+0.000ms): rebalance_tick+0x16/0x2e8 <ffffffff8012ced4> (scheduler_tick+0x340/0x34c <ffffffff8012d4e6>)
> (T1/#72)           <idle>     0 1 23 00000001 00000048 [00000424e868a936] 0.006ms (+0.000ms): softlockup_tick+0xf/0x11d <ffffffff8015d672> (update_process_times+0x63/0x68 <ffffffff8013b4ae>)
> (T1/#73)           <idle>     0 1 23 00000001 00000049 [00000424e868aa5e] 0.006ms (+0.000ms): irq_exit+0x9/0x28 <ffffffff80137aaa> (smp_apic_timer_interrupt+0x43/0x48 <ffffffff80119061>)
> (T1/#74)           <idle>     0 1 19 00000001 0000004a [00000424e868ad2f] 0.007ms (+0.000ms): do_IRQ+0xc/0x44 <ffffffff80110739> (ret_from_intr+0x0/0x12 <ffffffff8010e276>)
> (T1/#75)           <idle>     0 1 23 00000001 0000004b [00000424e868ae68] 0.007ms (+0.000ms): _raw_spin_lock+0xb/0x25 <ffffffff802fa1a7> (__do_IRQ+0x5d/0x141 <ffffffff8015ddec>)
> (T1/#76)           <idle>     0 1 23 00000002 0000004c [00000424e868b067] 0.007ms (+0.000ms): ack_edge_ioapic_irq+0x10/0xba <ffffffff8011ab21> (__do_IRQ+0x72/0x141 <ffffffff8015de01>)
> (T1/#77)           <idle>     0 1 23 00000002 0000004d [00000424e868b16b] 0.007ms (+0.000ms): redirect_hardirq+0x9/0x53 <ffffffff8015d909> (__do_IRQ+0xb5/0x141 <ffffffff8015de44>)
> (T1/#78)           <idle>     0 1 23 00000002 0000004e [00000424e868b283] 0.007ms (+0.000ms): _raw_spin_unlock+0x9/0x3e <ffffffff802fa6f7> (__do_IRQ+0xc1/0x141 <ffffffff8015de50>)
> (T1/#79)           <idle>     0 1 23 00000001 0000004f [00000424e868b38d] 0.007ms (+0.000ms): constant_test_bit+0x9/0x25 <ffffffff80152845> (_raw_spin_unlock+0x33/0x3e <ffffffff802fa721>)
> (T1/#80)           <idle>     0 1 23 00000001 00000050 [00000424e868b4a0] 0.008ms (+0.000ms): preempt_schedule+0xc/0x9d <ffffffff802f82a1> (_raw_spin_unlock+0x3c/0x3e <ffffffff802fa72a>)
> (T1/#81)           <idle>     0 1 23 00000001 00000051 [00000424e868b5af] 0.008ms (+0.000ms): handle_IRQ_event+0x16/0x114 <ffffffff8015d96f> (__do_IRQ+0xd1/0x141 <ffffffff8015de60>)
> (T1/#82)           <idle>     0 1 23 00000001 00000052 [00000424e868b699] 0.008ms (+0.000ms): timer_interrupt+0xb/0x62 <ffffffff801117ac> (handle_IRQ_event+0x6a/0x114 <ffffffff8015d9c3>)
> (T1/#83)           <idle>     0 1 23 00000001 00000053 [00000424e868b789] 0.008ms (+0.000ms): _raw_spin_lock+0xb/0x25 <ffffffff802fa1a7> (timer_interrupt+0x1a/0x62 <ffffffff801117bb>)
> (T1/#84)           <idle>     0 1 23 00000002 00000054 [00000424e868b8c5] 0.008ms (+0.000ms): do_timer+0x9/0x12 <ffffffff8013b0e2> (timer_interrupt+0x28/0x62 <ffffffff801117c9>)
> (T1/#85)           <idle>     0 1 23 00000002 00000055 [00000424e868ba17] 0.008ms (+0.000ms): _raw_spin_unlock+0x9/0x3e <ffffffff802fa6f7> (timer_interrupt+0x4b/0x62 <ffffffff801117ec>)
> (T1/#86)            <...>   697 0 20 00000002 00000056 [00000424e868ba4b] 0.008ms (+0.000ms): _raw_spin_unlock_irqrestore+0xb/0x62 <ffffffff802fa61d> (mask_IO_APIC_irq+0x11a/0x11e <ffffffff8011aa19>)
> (T1/#87)           <idle>     0 1 23 00000001 00000057 [00000424e868bb18] 0.008ms (+0.000ms): constant_test_bit+0x9/0x25 <ffffffff80152845> (_raw_spin_unlock+0x33/0x3e <ffffffff802fa721>)
> (T1/#88)            <...>   697 0 20 00000001 00000058 [00000424e868bb4d] 0.008ms (+0.000ms): check_raw_flags+0x9/0x5b <ffffffff8014a6f5> (_raw_spin_unlock_irqrestore+0x26/0x62 <ffffffff802fa638>)
> (T1/#89)           <idle>     0 1 23 00000001 00000059 [00000424e868bc15] 0.009ms (+0.000ms): preempt_schedule+0xc/0x9d <ffffffff802f82a1> (_raw_spin_unlock+0x3c/0x3e <ffffffff802fa72a>)
> (T1/#90)            <...>   697 0 20 00000001 0000005a [00000424e868bc62] 0.009ms (+0.000ms): constant_test_bit+0x9/0x25 <ffffffff80152845> (_raw_spin_unlock_irqrestore+0x55/0x62 <ffffffff802fa667>)
> (T1/#91)           <idle>     0 1 23 00000001 0000005b [00000424e868bd30] 0.009ms (+0.000ms): smp_send_timer_broadcast_ipi+0x9/0x31 <ffffffff80118a91> (timer_interrupt+0x59/0x62 <ffffffff801117fa>)
> (T1/#92)            <...>   697 0 20 00000001 0000005c [00000424e868bd98] 0.009ms (+0.000ms): redirect_hardirq+0x9/0x53 <ffffffff8015d909> (__do_IRQ+0xb5/0x141 <ffffffff8015de44>)
> (T1/#93)           <idle>     0 1 23 00000001 0000005d [00000424e868be66] 0.009ms (+0.000ms): _raw_spin_lock+0xb/0x25 <ffffffff802fa1a7> (__do_IRQ+0xdb/0x141 <ffffffff8015de6a>)
> (T1/#94)            <...>   697 0 20 00000001 0000005e [00000424e868bed0] 0.009ms (+0.000ms): wake_up_process+0xb/0x31 <ffffffff8012cd66> (redirect_hardirq+0x46/0x53 <ffffffff8015d946>)
> (T1/#95)           <idle>     0 1 23 00000002 0000005f [00000424e868bfa8] 0.009ms (+0.000ms): note_interrupt+0x16/0x227 <ffffffff8015ec77> (__do_IRQ+0xf5/0x141 <ffffffff8015de84>)
> (T1/#96)            <...>   697 0 20 00000001 00000060 [00000424e868bfe2] 0.009ms (+0.000ms): check_preempt_wakeup+0xc/0xac <ffffffff8014a814> (wake_up_process+0x13/0x31 <ffffffff8012cd6e>)
> (T1/#97)           <idle>     0 1 23 00000002 00000061 [00000424e868c0a8] 0.009ms (+0.000ms): unmask_IO_APIC_irq+0xc/0x3a <ffffffff8011a82b> (__do_IRQ+0x122/0x141 <ffffffff8015deb1>)
> (T1/#98)            <...>   697 0 20 00000001 00000062 [00000424e868c0d6] 0.009ms (+0.000ms): try_to_wake_up+0x16/0x560 <ffffffff8012c747> (wake_up_process+0x24/0x31 <ffffffff8012cd7f>)
> (T1/#99)           <idle>     0 1 23 00000002 00000063 [00000424e868c19a] 0.009ms (+0.000ms): _raw_spin_lock_irqsave+0xc/0x33 <ffffffff802fa2ab> (unmask_IO_APIC_irq+0x1b/0x3a <ffffffff8011a83a>)
> (T1/#100)            <...>   697 0 20 00000001 00000064 [00000424e868c1f8] 0.009ms (+0.000ms): _raw_spin_lock+0xb/0x25 <ffffffff802fa1a7> (try_to_wake_up+0x5d/0x560 <ffffffff8012c78e>)
> (T1/#101)            <...>   697 0 20 00000002 00000065 [00000424e868c3d9] 0.010ms (+0.000ms): idle_cpu+0x9/0x30 <ffffffff80129d37> (try_to_wake_up+0x292/0x560 <ffffffff8012c9c3>)
> (T1/#102)           <idle>     0 1 23 00000003 00000066 [00000424e868c439] 0.010ms (+0.000ms): __unmask_IO_APIC_irq+0x9/0xff <ffffffff8011a729> (unmask_IO_APIC_irq+0x26/0x3a <ffffffff8011a845>)
> (T1/#103)            <...>   697 0 20 00000002 00000067 [00000424e868c538] 0.010ms (+0.000ms): smp_send_reschedule_allbutself+0x9/0x1a <ffffffff8011809b> (try_to_wake_up+0x3e9/0x560 <ffffffff8012cb1a>)
> (T1/#104)            <...>   697 0 20 00000002 00000068 [00000424e868c674] 0.010ms (+0.000ms): flat_send_IPI_allbutself+0xb/0x5b <ffffffff8011b644> (smp_send_reschedule_allbutself+0x18/0x1a <ffffffff801180aa>)
> (T1/#105)            <...>   697 0 20 00000002 00000069 [00000424e868c74e] 0.010ms (+0.000ms): __bitmap_weight+0xa/0x18b <ffffffff801fbce6> (flat_send_IPI_allbutself+0x1e/0x5b <ffffffff8011b657>)
> (T1/#106)            <...>   697 0 20 00000002 0000006a [00000424e868c95a] 0.010ms (+0.000ms): activate_task+0x10/0xe0 <ffffffff8012bc38> (try_to_wake_up+0x491/0x560 <ffffffff8012cbc2>)
> (T1/#107)            <...>   697 0 20 00000002 0000006b [00000424e868ca36] 0.010ms (+0.000ms): sched_clock+0x9/0x26 <ffffffff8011180c> (activate_task+0x1d/0xe0 <ffffffff8012bc45>)
> (T3/#108)    <...>-697   0D.h2   11us : activate_task+0x9b/0xe0 <ffffffff8012bcc3> <<...>-1432> (3a 2)
> (T1/#109)            <...>   697 0 20 00000002 0000006d [00000424e868cc15] 0.011ms (+0.000ms): enqueue_task+0xc/0x95 <ffffffff80129be4> (activate_task+0xa7/0xe0 <ffffffff8012bccf>)
> (T1/#110)            <...>   697 0 20 00000002 0000006e [00000424e868cdf1] 0.011ms (+0.000ms): _raw_spin_unlock_irqrestore+0xb/0x62 <ffffffff802fa61d> (try_to_wake_up+0x54f/0x560 <ffffffff8012cc80>)
> (T1/#111)            <...>   697 0 20 00000001 0000006f [00000424e868cf1a] 0.011ms (+0.000ms): check_raw_flags+0x9/0x5b <ffffffff8014a6f5> (_raw_spin_unlock_irqrestore+0x26/0x62 <ffffffff802fa638>)
> (T1/#112)            <...>   697 0 20 00000001 00000070 [00000424e868d02f] 0.011ms (+0.000ms): constant_test_bit+0x9/0x25 <ffffffff80152845> (_raw_spin_unlock_irqrestore+0x55/0x62 <ffffffff802fa667>)
> (T1/#113)            <...>   697 0 20 00000001 00000071 [00000424e868d147] 0.011ms (+0.000ms): wake_up_process+0x2b/0x31 <ffffffff8012cd86> (redirect_hardirq+0x46/0x53 <ffffffff8015d946>)
> (T1/#114)            <...>   697 0 20 00000001 00000072 [00000424e868d261] 0.011ms (+0.000ms): _raw_spin_unlock+0x9/0x3e <ffffffff802fa6f7> (__do_IRQ+0x12a/0x141 <ffffffff8015deb9>)
> (T1/#115)            <...>   697 0 20 00000000 00000073 [00000424e868d356] 0.012ms (+0.000ms): constant_test_bit+0x9/0x25 <ffffffff80152845> (_raw_spin_unlock+0x33/0x3e <ffffffff802fa721>)
> (T1/#116)            <...>   697 0 20 00000000 00000074 [00000424e868d470] 0.012ms (+0.000ms): irq_exit+0x9/0x28 <ffffffff80137aaa> (do_IRQ+0x3a/0x44 <ffffffff80110767>)
> (T1/#117)            <...>   697 0 0 00000000 00000075 [00000424e868d72e] 0.012ms (+0.000ms): ata_host_intr+0xf/0xb6 <ffffffff88033d99> (nv_interrupt+0x65/0xa6 <ffffffff88042065>)
> (T1/#118)            <...>   697 0 0 00000000 00000076 [00000424e868d81d] 0.012ms (+0.000ms): ata_bmdma_status+0x9/0x27 <ffffffff88031e66> (ata_host_intr+0x3e/0xb6 <ffffffff88033dc8>)
> (T1/#119)           <idle>     0 1 23 00000003 00000077 [00000424e868d846] 0.012ms (+0.000ms): _raw_spin_unlock_irqrestore+0xb/0x62 <ffffffff802fa61d> (unmask_IO_APIC_irq+0x35/0x3a <ffffffff8011a854>)
> (T1/#120)           <idle>     0 1 23 00000002 00000078 [00000424e868d958] 0.012ms (+0.000ms): check_raw_flags+0x9/0x5b <ffffffff8014a6f5> (_raw_spin_unlock_irqrestore+0x26/0x62 <ffffffff802fa638>)
> (T1/#121)           <idle>     0 1 23 00000002 00000079 [00000424e868da62] 0.012ms (+0.000ms): constant_test_bit+0x9/0x25 <ffffffff80152845> (_raw_spin_unlock_irqrestore+0x55/0x62 <ffffffff802fa667>)
> (T1/#122)            <...>   697 0 0 00000000 0000007a [00000424e868dacb] 0.012ms (+0.000ms): ata_bmdma_stop+0x9/0x32 <ffffffff88031e34> (ata_host_intr+0x52/0xb6 <ffffffff88033ddc>)
> (T1/#123)           <idle>     0 1 23 00000002 0000007b [00000424e868db53] 0.013ms (+0.000ms): preempt_schedule+0xc/0x9d <ffffffff802f82a1> (_raw_spin_unlock_irqrestore+0x5e/0x62 <ffffffff802fa670>)
> (T1/#124)           <idle>     0 1 23 00000002 0000007c [00000424e868dc6f] 0.013ms (+0.000ms): _raw_spin_unlock+0x9/0x3e <ffffffff802fa6f7> (__do_IRQ+0x12a/0x141 <ffffffff8015deb9>)
> (T1/#125)           <idle>     0 1 23 00000001 0000007d [00000424e868dd65] 0.013ms (+0.000ms): constant_test_bit+0x9/0x25 <ffffffff80152845> (_raw_spin_unlock+0x33/0x3e <ffffffff802fa721>)
> (T1/#126)           <idle>     0 1 23 00000001 0000007e [00000424e868de78] 0.013ms (+0.000ms): preempt_schedule+0xc/0x9d <ffffffff802f82a1> (_raw_spin_unlock+0x3c/0x3e <ffffffff802fa72a>)
> (T1/#127)            <...>   697 0 0 00000000 0000007f [00000424e868dec3] 0.013ms (+0.000ms): ata_altstatus+0x9/0x34 <ffffffff88031e00> (ata_bmdma_stop+0x30/0x32 <ffffffff88031e5b>)
> (T1/#128)           <idle>     0 1 23 00000001 00000080 [00000424e868df94] 0.013ms (+0.000ms): irq_exit+0x9/0x28 <ffffffff80137aaa> (do_IRQ+0x3a/0x44 <ffffffff80110767>)
> (T1/#129)            <...>   697 0 0 00000000 00000081 [00000424e868e150] 0.013ms (+0.000ms): ata_altstatus+0x9/0x34 <ffffffff88031e00> (ata_host_intr+0x5a/0xb6 <ffffffff88033de4>)
> (T1/#130)           <idle>     0 1 19 00000001 00000082 [00000424e868e25a] 0.013ms (+0.000ms): smp_reschedule_interrupt+0x9/0x16 <ffffffff801186ee> (reschedule_interrupt+0x84/0x8c <ffffffff8010e558>)
> (T1/#131)            <...>   697 0 0 00000000 00000083 [00000424e868e3d5] 0.014ms (+0.000ms): ata_check_status+0x9/0x23 <ffffffff88031ddd> (ata_host_intr+0x68/0xb6 <ffffffff88033df2>)
> (T1/#132)           <idle>     0 1 19 00000000 00000084 [00000424e868e465] 0.014ms (+0.000ms): __schedule+0x16/0xad6 <ffffffff802f7609> (cpu_idle+0xbe/0xd3 <ffffffff8010c6dd>)
> (T1/#133)           <idle>     0 1 19 00000000 00000085 [00000424e868e560] 0.014ms (+0.000ms): profile_hit+0x14/0x19f <ffffffff8013333d> (__schedule+0xb0/0xad6 <ffffffff802f76a3>)
> (T1/#134)           <idle>     0 1 19 00000001 00000086 [00000424e868e6a4] 0.014ms (+0.000ms): sched_clock+0x9/0x26 <ffffffff8011180c> (__schedule+0x110/0xad6 <ffffffff802f7703>)
> (T1/#135)            <...>   697 0 0 00000000 00000087 [00000424e868e737] 0.014ms (+0.000ms): ata_bmdma_irq_clear+0x9/0x2b <ffffffff88032170> (ata_host_intr+0x7c/0xb6 <ffffffff88033e06>)
> (T1/#136)           <idle>     0 1 19 00000001 00000088 [00000424e868e88e] 0.014ms (+0.000ms): _raw_spin_lock_irq+0xb/0x2c <ffffffff802fa2dd> (__schedule+0x18c/0xad6 <ffffffff802f777f>)
> (T1/#137)            <...>   697 0 0 00000000 00000089 [00000424e868eb5d] 0.015ms (+0.000ms): ata_qc_complete+0x13/0x20b <ffffffff88032870> (ata_host_intr+0x9e/0xb6 <ffffffff88033e28>)
> (T1/#138)           <idle>     0 1 19 00000002 0000008a [00000424e868ec85] 0.015ms (+0.000ms): double_lock_balance+0xc/0x43 <ffffffff80129cf7> (__schedule+0x2f8/0xad6 <ffffffff802f78eb>)
> (T1/#139)            <...>   697 0 0 00000000 0000008b [00000424e868ec87] 0.015ms (+0.000ms): dma_unmap_sg+0x14/0x6d <ffffffff8011c396> (ata_qc_complete+0x132/0x20b <ffffffff8803298f>)
> (T1/#140)           <idle>     0 1 19 00000002 0000008c [00000424e868ed68] 0.015ms (+0.000ms): _raw_spin_trylock+0xb/0x5d <ffffffff802fa8f7> (double_lock_balance+0x1a/0x43 <ffffffff80129d05>)
> (T1/#141)            <...>   697 0 0 00000000 0000008d [00000424e868ed93] 0.015ms (+0.000ms): dma_unmap_single+0xc/0xe6 <ffffffff8011c2a8> (dma_unmap_sg+0x5b/0x6d <ffffffff8011c3dd>)
> (T1/#142)            <...>   697 0 0 00000000 0000008e [00000424e868ef1d] 0.015ms (+0.000ms): ata_scsi_qc_complete+0x10/0x89 <ffffffff88036aa1> (ata_qc_complete+0x1f2/0x20b <ffffffff88032a4f>)
> (T1/#143)            <...>   697 0 0 00000000 0000008f [00000424e868f076] 0.015ms (+0.000ms): scsi_done+0xc/0x24 <ffffffff880028d4> (ata_scsi_qc_complete+0x7f/0x89 <ffffffff88036b10>)
> (T1/#144)            <...>   697 0 0 00000000 00000090 [00000424e868f16f] 0.015ms (+0.000ms): scsi_delete_timer+0xc/0x65 <ffffffff8800576d> (scsi_done+0x14/0x24 <ffffffff880028dc>)
> (T1/#145)           <idle>     0 1 19 00000003 00000091 [00000424e868f1ec] 0.015ms (+0.000ms): find_next_bit+0xc/0x74 <ffffffff80200598> (__schedule+0x3a2/0xad6 <ffffffff802f7995>)
> (T1/#146)            <...>   697 0 0 00000000 00000092 [00000424e868f268] 0.016ms (+0.000ms): del_timer+0xe/0x5e <ffffffff8013b7a8> (scsi_delete_timer+0x1b/0x65 <ffffffff8800577c>)
> (T1/#147)            <...>   697 0 0 00000000 00000093 [00000424e868f359] 0.016ms (+0.000ms): lock_timer_base+0xf/0x4c <ffffffff8013b5a6> (del_timer+0x23/0x5e <ffffffff8013b7bd>)
> (T1/#148)           <idle>     0 1 19 00000003 00000094 [00000424e868f46a] 0.016ms (+0.000ms): find_next_bit+0xc/0x74 <ffffffff80200598> (__schedule+0x3a2/0xad6 <ffffffff802f7995>)
> (T1/#149)           <idle>     0 1 19 00000003 00000095 [00000424e868f589] 0.016ms (+0.000ms): __find_first_bit+0x9/0x34 <ffffffff80200551> (find_next_bit+0x65/0x74 <ffffffff802005f1>)
> (T1/#150)            <...>   697 0 0 00000000 00000096 [00000424e868f5fc] 0.016ms (+0.000ms): _spin_lock_irqsave+0x9/0x4b <ffffffff802f9db4> (lock_timer_base+0x27/0x4c <ffffffff8013b5be>)
> (T1/#151)            <...>   697 0 0 00000000 00000097 [00000424e868f735] 0.016ms (+0.000ms): account_mutex_owner_down+0x9/0x4c <ffffffff8014a3fd> (_spin_lock_irqsave+0x3a/0x4b <ffffffff802f9de5>)
> (T1/#152)           <idle>     0 1 19 00000003 00000098 [00000424e868f850] 0.016ms (+0.000ms): find_next_bit+0xc/0x74 <ffffffff80200598> (__schedule+0x3a2/0xad6 <ffffffff802f7995>)
> (T1/#153)            <...>   697 0 0 00000000 00000099 [00000424e868f8df] 0.016ms (+0.000ms): _spin_unlock_irqrestore+0xc/0x6e <ffffffff802fa13a> (del_timer+0x54/0x5e <ffffffff8013b7ee>)
> (T1/#154)           <idle>     0 1 19 00000003 0000009a [00000424e868f945] 0.016ms (+0.000ms): __find_first_bit+0x9/0x34 <ffffffff80200551> (find_next_bit+0x65/0x74 <ffffffff802005f1>)
> (T1/#155)            <...>   697 0 0 00000000 0000009b [00000424e868f9c7] 0.016ms (+0.000ms): up_mutex+0x9/0x3f <ffffffff8014b0f4> (_spin_unlock_irqrestore+0x6a/0x6e <ffffffff802fa198>)
> (T1/#156)           <idle>     0 1 19 00000003 0000009c [00000424e868fa58] 0.017ms (+0.000ms): _raw_spin_unlock+0x9/0x3e <ffffffff802fa6f7> (__schedule+0x437/0xad6 <ffffffff802f7a2a>)
> (T1/#157)            <...>   697 0 0 00000000 0000009d [00000424e868fab1] 0.017ms (+0.000ms): account_mutex_owner_up+0x9/0x4b <ffffffff8014a449> (up_mutex+0x36/0x3f <ffffffff8014b121>)
> (T1/#158)           <idle>     0 1 19 00000002 0000009e [00000424e868fbed] 0.017ms (+0.000ms): constant_test_bit+0x9/0x25 <ffffffff80152845> (_raw_spin_unlock+0x33/0x3e <ffffffff802fa721>)
> (T1/#159)            <...>   697 0 0 00000000 0000009f [00000424e868fc11] 0.017ms (+0.000ms): __scsi_done+0xc/0x9b <ffffffff88002839> (scsi_done+0x20/0x24 <ffffffff880028e8>)
> (T1/#160)           <idle>     0 1 19 00000002 000000a0 [00000424e868fcec] 0.017ms (+0.000ms): preempt_schedule+0xc/0x9d <ffffffff802f82a1> (_raw_spin_unlock+0x3c/0x3e <ffffffff802fa72a>)
> (T1/#161)            <...>   697 0 16 00000000 000000a1 [00000424e868fdb2] 0.017ms (+0.000ms): raise_softirq_irqoff+0x9/0x5f <ffffffff80137ad2> (__scsi_done+0x73/0x9b <ffffffff880028a0>)
> (T1/#162)           <idle>     0 1 19 00000002 000000a2 [00000424e868fe1b] 0.017ms (+0.000ms): dependent_sleeper+0x16/0x3f1 <ffffffff8012d508> (__schedule+0x74c/0xad6 <ffffffff802f7d3f>)
> (T1/#163)            <...>   697 0 16 00000000 000000a3 [00000424e868fefd] 0.017ms (+0.000ms): wakeup_softirqd+0x9/0x38 <ffffffff801373e5> (raise_softirq_irqoff+0x5d/0x5f <ffffffff80137b26>)
> (T5/#164) [ =>          swapper ] 0.017ms (+0.000ms)
> (T1/#165)            <...>   697 0 16 00000000 000000a5 [00000424e8690058] 0.017ms (+0.000ms): wake_up_process+0xb/0x31 <ffffffff8012cd66> (wakeup_softirqd+0x36/0x38 <ffffffff80137412>)
> (T1/#166)           <idle>     0 1 17 00000002 000000a6 [00000424e8690071] 0.017ms (+0.000ms): __switch_to+0x13/0x212 <ffffffff8010c7d1> (thread_return+0x0/0x13f <ffffffff802f80c9>)
> (T1/#167)            <...>   697 0 16 00000000 000000a7 [00000424e8690136] 0.017ms (+0.000ms): check_preempt_wakeup+0xc/0xac <ffffffff8014a814> (wake_up_process+0x13/0x31 <ffffffff8012cd6e>)
> (T3/#168)    <...>-14    1D..2   17us : thread_return+0x4a/0x13f <ffffffff802f8113> <<idle>-0> (8c 62)
> (T1/#169)            <...>   697 0 16 00000000 000000a9 [00000424e8690245] 0.018ms (+0.000ms): try_to_wake_up+0x16/0x560 <ffffffff8012c747> (wake_up_process+0x24/0x31 <ffffffff8012cd7f>)
> (T1/#170)            <...>    14 1 16 00000002 000000aa [00000424e86902fc] 0.018ms (+0.000ms): _raw_spin_unlock_irq+0x9/0x44 <ffffffff802fa5d7> (thread_return+0xa3/0x13f <ffffffff802f816c>)
> (T1/#171)            <...>   697 0 16 00000000 000000ab [00000424e8690363] 0.018ms (+0.000ms): _raw_spin_lock+0xb/0x25 <ffffffff802fa1a7> (try_to_wake_up+0x5d/0x560 <ffffffff8012c78e>)
> (T1/#172)            <...>    14 1 0 00000001 000000ac [00000424e869042d] 0.018ms (+0.000ms): constant_test_bit+0x9/0x25 <ffffffff80152845> (_raw_spin_unlock_irq+0x39/0x44 <ffffffff802fa607>)
> (T1/#173)            <...>    14 1 0 00000001 000000ad [00000424e8690556] 0.018ms (+0.000ms): trace_stop_sched_switched+0x16/0x332 <ffffffff801504d0> (thread_return+0xb1/0x13f <ffffffff802f817a>)
> (T1/#174)            <...>   697 0 16 00000001 000000ae [00000424e86905a9] 0.018ms (+0.000ms): idle_cpu+0x9/0x30 <ffffffff80129d37> (try_to_wake_up+0x292/0x560 <ffffffff8012c9c3>)
> (T1/#175)            <...>    14 1 16 00000001 000000af [00000424e869065c] 0.018ms (+0.000ms): _raw_spin_lock+0xb/0x25 <ffffffff802fa1a7> (trace_stop_sched_switched+0x49/0x332 <ffffffff80150503>)
> (T1/#176)            <...>   697 0 16 00000001 000000b0 [00000424e86906b0] 0.018ms (+0.000ms): smp_send_reschedule_allbutself+0x9/0x1a <ffffffff8011809b> (try_to_wake_up+0x3e9/0x560 <ffffffff8012cb1a>)
> (T1/#177)            <...>   697 0 16 00000001 000000b1 [00000424e869078d] 0.018ms (+0.000ms): flat_send_IPI_allbutself+0xb/0x5b <ffffffff8011b644> (smp_send_reschedule_allbutself+0x18/0x1a <ffffffff801180aa>)
> (T3/#178)    <...>-14    1D..2   18us : trace_stop_sched_switched+0x6f/0x332 <ffffffff80150529> <<...>-14> (62 1)
> (T1/#179)            <...>   697 0 16 00000001 000000b3 [00000424e8690871] 0.018ms (+0.000ms): __bitmap_weight+0xa/0x18b <ffffffff801fbce6> (flat_send_IPI_allbutself+0x1e/0x5b <ffffffff8011b657>)
> (T1/#180)            <...>    14 1 16 00000003 000000b4 [00000424e8690898] 0.018ms (+0.000ms): _raw_spin_unlock+0x9/0x3e <ffffffff802fa6f7> (trace_stop_sched_switched+0xbf/0x332 <ffffffff80150579>)
> (T1/#181)            <...>    14 1 16 00000002 000000b5 [00000424e86909ac] 0.018ms (+0.000ms): constant_test_bit+0x9/0x25 <ffffffff80152845> (_raw_spin_unlock+0x33/0x3e <ffffffff802fa721>)
> (T1/#182)            <...>   697 0 16 00000001 000000b6 [00000424e8690a78] 0.019ms (+0.000ms): activate_task+0x10/0xe0 <ffffffff8012bc38> (try_to_wake_up+0x491/0x560 <ffffffff8012cbc2>)
> (T1/#183)            <...>   697 0 16 00000001 000000b7 [00000424e8690b50] 0.019ms (+0.000ms): sched_clock+0x9/0x26 <ffffffff8011180c> (activate_task+0x1d/0xe0 <ffffffff8012bc45>)
> (T1/#184)            <...>    14 1 16 00000002 000000b8 [00000424e8690b62] 0.019ms (+9177482346838.318ms): thread_return+0xb1/0x13f <ffffffff802f817a> (thread_return+0xb1/0x13f <ffffffff802f817a>)
> 
> 
> vim:ft=help


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: AMD64 X2 lost ticks on PM timer
  2006-03-03 21:26                           ` Lee Revell
@ 2006-03-03 22:09                             ` Jeff Garzik
  2006-03-03 23:43                               ` Bill Rugolsky Jr.
  0 siblings, 1 reply; 60+ messages in thread
From: Jeff Garzik @ 2006-03-03 22:09 UTC (permalink / raw)
  To: Lee Revell
  Cc: Bill Rugolsky Jr.,
	Andi Kleen, Jason Baron, linux-kernel, john stultz, Ingo Molnar

Lee Revell wrote:
> All this tells me is that your system's timer is screwed up (not news).

Or sata_nv/libata is to blame.

	Jeff




^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: AMD64 X2 lost ticks on PM timer
  2006-03-03 22:09                             ` Jeff Garzik
@ 2006-03-03 23:43                               ` Bill Rugolsky Jr.
  2006-03-03 23:46                                 ` Jeff Garzik
                                                   ` (2 more replies)
  0 siblings, 3 replies; 60+ messages in thread
From: Bill Rugolsky Jr. @ 2006-03-03 23:43 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Lee Revell, Andi Kleen, Jason Baron, linux-kernel, john stultz,
	Ingo Molnar

On Fri, Mar 03, 2006 at 05:09:57PM -0500, Jeff Garzik wrote:
> Or sata_nv/libata is to blame.
 
In case you are coming late to the thread:

The lost ticks are closely correlated with sata_nv disk activity on
multiple disks, and the problem is easily reproducible with "find /usr |
cpio -o >/dev/null" on an MD RAID1 -- but not on a single disk.

Andi suggested:

   Yes, I bet something forgets to turn on interrupts again and it's
   picked up by (and blamed on) the next guy who does an unconditional
   sti, which happens to be __do_softirq or idle.

That sounds right to me.

I built 2.6.16-rc5-git6 yesterday, and it still suffers from the same
issue.

	-Bill

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: AMD64 X2 lost ticks on PM timer
  2006-03-03 23:43                               ` Bill Rugolsky Jr.
@ 2006-03-03 23:46                                 ` Jeff Garzik
  2006-03-03 23:49                                   ` Lee Revell
  2006-03-04  0:08                                   ` Andi Kleen
  2006-03-04  0:07                                 ` Andi Kleen
  2006-03-04 12:06                                 ` AMD64 X2 lost ticks on PM timer Martin Schlemmer
  2 siblings, 2 replies; 60+ messages in thread
From: Jeff Garzik @ 2006-03-03 23:46 UTC (permalink / raw)
  To: Bill Rugolsky Jr.
  Cc: Lee Revell, Andi Kleen, Jason Baron, linux-kernel, john stultz,
	Ingo Molnar

Bill Rugolsky Jr. wrote:
> On Fri, Mar 03, 2006 at 05:09:57PM -0500, Jeff Garzik wrote:
> 
>>Or sata_nv/libata is to blame.
> 
>  
> In case you are coming late to the thread:

I'm not.  Thus my comments refuting Lee's silly speculation.


> Andi suggested:
> 
>    Yes, I bet something forgets to turn on interrupts again and it's
>    picked up by (and blamed on) the next guy who does an unconditional
>    sti, which happens to be __do_softirq or idle.
> 
> That sounds right to me.

Unlikely.  More likely, a period with interrupts disabled is longer than a
tick period, or similar.

	Jeff



^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: AMD64 X2 lost ticks on PM timer
  2006-03-03 23:46                                 ` Jeff Garzik
@ 2006-03-03 23:49                                   ` Lee Revell
  2006-03-04  0:08                                   ` Andi Kleen
  1 sibling, 0 replies; 60+ messages in thread
From: Lee Revell @ 2006-03-03 23:49 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Bill Rugolsky Jr.,
	Andi Kleen, Jason Baron, linux-kernel, john stultz, Ingo Molnar

On Fri, 2006-03-03 at 18:46 -0500, Jeff Garzik wrote:
> > In case you are coming late to the thread:
> 
> I'm not.  Thus my comments refuting Lee's silly speculation.
> 

I did not engage in any speculation at all; I only said the timing in
that latency trace was screwed up, which is obvious.

Lee


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: AMD64 X2 lost ticks on PM timer
  2006-03-03 23:43                               ` Bill Rugolsky Jr.
  2006-03-03 23:46                                 ` Jeff Garzik
@ 2006-03-04  0:07                                 ` Andi Kleen
       [not found]                                   ` <20060315213638.GA17817@ti64.telemetry-investments.com>
  2006-03-04 12:06                                 ` AMD64 X2 lost ticks on PM timer Martin Schlemmer
  2 siblings, 1 reply; 60+ messages in thread
From: Andi Kleen @ 2006-03-04  0:07 UTC (permalink / raw)
  To: Bill Rugolsky Jr.
  Cc: Jeff Garzik, Lee Revell, Jason Baron, linux-kernel, john stultz,
	Ingo Molnar

On Saturday 04 March 2006 00:43, Bill Rugolsky Jr. wrote:

> I built 2.6.16-rc5-git6 yesterday, and it still suffers from the same
> issue.

FWIW I looked over sata_nv and libata-{core,scsi} and I couldn't find
any obviously unmatched irqsave/irqrestore. So it would need instrumentation.

In theory it could also be hardware, I suppose.

-Andi

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: AMD64 X2 lost ticks on PM timer
  2006-03-03 23:46                                 ` Jeff Garzik
  2006-03-03 23:49                                   ` Lee Revell
@ 2006-03-04  0:08                                   ` Andi Kleen
  1 sibling, 0 replies; 60+ messages in thread
From: Andi Kleen @ 2006-03-04  0:08 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Bill Rugolsky Jr.,
	Lee Revell, Jason Baron, linux-kernel, john stultz, Ingo Molnar

On Saturday 04 March 2006 00:46, Jeff Garzik wrote:
>  More likely, a period with interrupts disabled is longer than a
> tick period, or similar.

Then the ticks wouldn't be attributed to idle and softirq.
They are both special in that they do an unconditional local_irq_enable()
instead of the usual save/restore.
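
A minimal sketch of the two competing explanations, as a toy model only.
Everything here (the Cpu class, the site names "driver" and "other_path",
the tick counts) is hypothetical and is not kernel code. It assumes the
lost-tick report blames whichever site re-enables interrupts. A leaked
disable is blamed on the next unconditional enable, while a long but
balanced save/restore section is blamed on its own restore:

```python
class Cpu:
    """Toy single-CPU model; time advances in whole timer-tick periods."""

    def __init__(self):
        self.irqs_on = True
        self.now = 0         # current time, in tick periods
        self.last_tick = 0   # time of the last handled timer interrupt
        self.reports = []    # (blamed_site, lost_ticks)

    def run(self, ticks):
        # Time passes; timer interrupts are handled only while IRQs are on.
        self.now += ticks
        if self.irqs_on:
            self.last_tick = self.now

    def irq_disable(self):
        self.irqs_on = False

    def irq_save(self):
        # Remember the previous state, then disable.
        flags, self.irqs_on = self.irqs_on, False
        return flags

    def irq_restore(self, flags, site):
        # Restoring a "disabled" state leaves interrupts off.
        self.irqs_on = flags
        if flags:
            self._timer_irq(site)

    def irq_enable(self, site):
        # Unconditional enable, regardless of the earlier state.
        self.irqs_on = True
        self._timer_irq(site)

    def _timer_irq(self, site):
        # The pending timer interrupt fires at the re-enabling site.
        elapsed = self.now - self.last_tick
        if elapsed > 1:
            self.reports.append((site, elapsed - 1))
        self.last_tick = self.now


def leaked_disable():
    """Some path disables IRQs and never re-enables them."""
    cpu = Cpu()
    cpu.irq_disable()                     # hypothetical buggy path
    cpu.run(2)
    flags = cpu.irq_save()                # later, balanced code: flags is False
    cpu.run(1)
    cpu.irq_restore(flags, "other_path")  # stays disabled, no blame here
    cpu.irq_enable("__do_softirq")        # unconditional enable takes the blame
    return cpu.reports


def long_balanced_section():
    """A balanced save/restore section that simply runs too long."""
    cpu = Cpu()
    flags = cpu.irq_save()
    cpu.run(3)
    cpu.irq_restore(flags, "driver")      # blame lands on the section itself
    return cpu.reports


print(leaked_disable())
print(long_balanced_section())
```

In this model the first scenario attributes the lost ticks to __do_softirq
and the second to the long section itself, which is the distinction being
drawn above.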

-Andi

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: AMD64 X2 lost ticks on PM timer
  2006-03-03 23:43                               ` Bill Rugolsky Jr.
  2006-03-03 23:46                                 ` Jeff Garzik
  2006-03-04  0:07                                 ` Andi Kleen
@ 2006-03-04 12:06                                 ` Martin Schlemmer
  2006-03-05  7:07                                   ` Alexander Samad
  2 siblings, 1 reply; 60+ messages in thread
From: Martin Schlemmer @ 2006-03-04 12:06 UTC (permalink / raw)
  To: Bill Rugolsky Jr.
  Cc: Jeff Garzik, Lee Revell, Andi Kleen, Jason Baron, linux-kernel,
	john stultz, Ingo Molnar

[-- Attachment #1: Type: text/plain, Size: 2719 bytes --]

On Fri, 2006-03-03 at 18:43 -0500, Bill Rugolsky Jr. wrote:
> On Fri, Mar 03, 2006 at 05:09:57PM -0500, Jeff Garzik wrote:
> > Or sata_nv/libata is to blame.
>  
> In case you are coming late to the thread:
> 
> The lost ticks are closely correlated with sata_nv disk activity on
> multiple disks, and the problem is easily reproducible with "find /usr |
> cpio -o >/dev/null" on an MD RAID1 -- but not on a single disk.
> 
> Andi suggested:
> 
>    Yes, I bet something forgets to turn on interrupts again and it's
>    picked up by (and blamed on) the next guy who does an unconditional
>    sti, which happens to be __do_softirq or idle.
> 
> That sounds right to me.
> 
> I built 2.6.16-rc5-git6 yesterday, and it still suffers from the same
> issue.
> 

Not sure this will help in any way, but anyhow.

I have had this system for about 6-8 months (maybe 10) now.  It was
originally an Asus A8N-SLI Deluxe with a 3200+ Athlon64.  In November I
changed to an Asus A8N-SLI Premium, and added another 1GB memory (now
have 2GB memory).  In all that time I have not had any issues with lost
ticks, but I was hesitant to get a X2 processor due to the issue that
some people had.

At the beginning of February I got an Athlon64 X2 3800+ processor, and since
that time I have also had no issues with lost ticks.  I usually run the latest
git kernel +/- a week.  No extra patches, except I started using Alan's
libata PATA stuff a week or so back as well.

The only differences I can see that might be of consequence are:

1) Board type?  Not sure if anybody with an A8N-SLI had any lost tick
issues ?

2) I do not use MD RAID1, but have 2 ST380013AS striped with
device-mapper's stripe module.

3) If I remember correctly, I have the 2 HDDs on sata_nv ports 3
and 4, with ports 1 and 2 disabled, else they show up as sdc and sdd.
Not sure if it's an A8N-SLI-only peculiarity, and not 100% sure of the
port order - I can check if it might be an issue.

4) I do not use the NV lan adapter, as I had issues back when I got the
system, but rather the extra Marvell controller (skge module).  I think
it picked up fine, etc, but it lost connection after a minute or two.

5) I run the processor at 240FSB (or HT or whatever) with the memory at
333 (166 on some boards) multiplier (ending up running at 200MHz
anyhow).  Not sure if this might make any difference, but just listing
it in case.

6) Haven't checked if this makes any difference, but the board has an
option for ACPI 2.0 support, which I have enabled.

If anything might be of relevance, or you want me to try something, just
say it.  Same with extra info that might be needed.


Regards,

-- 
Martin Schlemmer


[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 191 bytes --]

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: AMD64 X2 lost ticks on PM timer
  2006-03-04 12:06                                 ` AMD64 X2 lost ticks on PM timer Martin Schlemmer
@ 2006-03-05  7:07                                   ` Alexander Samad
  0 siblings, 0 replies; 60+ messages in thread
From: Alexander Samad @ 2006-03-05  7:07 UTC (permalink / raw)
  To: Martin Schlemmer
  Cc: Bill Rugolsky Jr.,
	Jeff Garzik, Lee Revell, Andi Kleen, Jason Baron, linux-kernel,
	john stultz, Ingo Molnar

[-- Attachment #1: Type: text/plain, Size: 1824 bytes --]

On Sat, Mar 04, 2006 at 02:06:45PM +0200, Martin Schlemmer wrote:
> On Fri, 2006-03-03 at 18:43 -0500, Bill Rugolsky Jr. wrote:
> > On Fri, Mar 03, 2006 at 05:09:57PM -0500, Jeff Garzik wrote:
> > > Or sata_nv/libata is to blame.
> >  
> > In case you are coming late to the thread:
> > 
> > The lost ticks are closely correlated with sata_nv disk activity on
> > multiple disks, and the problem is easily reproducible with "find /usr |
> > cpio -o >/dev/null" on an MD RAID1 -- but not on a single disk.
> > 
> > Andi suggested:
> > 
> >    Yes, I bet something forgets to turn on interrupts again and it's
> >    picked up by (and blamed on) the next guy who does an unconditional
> >    sti, which happens to be __do_softirq or idle.
> > 
> > That sounds right to me.
> > 
> > I built 2.6.16-rc5-git6 yesterday, and it still suffers from the same
> > issue.
> > 
> 
> Not sure this will help in any way, but anyhow.

Hi

Just to throw in my 2c: I have a Shuttle SN25P with an AMD X2 4400+. Under
normal conditions I don't see any missing ticks, but when I hammer the
network and the RAID5 LVM I start to see missing ticks and the same
error message mentioned before. I am using Debian 2.6.15 amd64.


> 
> I have had this system for about 6-8 months (maybe 10) now.  It was
> originally an Asus A8N-SLI Deluxe with a 3200+ Athlon64.  In November I
> changed to an Asus A8N-SLI Premium, and added another 1GB memory (now
> have 2GB memory).  In all that time I have not had any issues with lost
> ticks, but I was hesitant to get a X2 processor due to the issue that
> some people had.
 snip ..

> 
> If anything might be of relevance, or you want me to try something, just
> say it.  Same with extra info that might be needed.
> 
> 
> Regards,
> 
> -- 
> Martin Schlemmer
> 



[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: libata/sata_nv latency on NVIDIA CK804 [was Re: AMD64 X2 lost ticks on PM timer]
       [not found]                                   ` <20060315213638.GA17817@ti64.telemetry-investments.com>
@ 2006-03-15 21:45                                     ` Lee Revell
  2006-03-15 21:58                                       ` Ingo Molnar
  2006-03-15 21:50                                     ` Ingo Molnar
  2006-03-15 22:04                                     ` [patch] latency-tracing-v2.6.16.patch Ingo Molnar
  2 siblings, 1 reply; 60+ messages in thread
From: Lee Revell @ 2006-03-15 21:45 UTC (permalink / raw)
  To: Bill Rugolsky Jr.
  Cc: Andi Kleen, Jeff Garzik, Jason Baron, linux-kernel, john stultz,
	Ingo Molnar

On Wed, 2006-03-15 at 16:36 -0500, Bill Rugolsky Jr. wrote:
>    <...>-2913  0d.h.    9us : ata_host_intr (nv_interrupt)
>    <...>-2913  0d.h.    9us!: ata_bmdma_status (ata_host_intr)
>    <...>-2913  0d.h. 16641us : nv_check_hotplug_ck804 (nv_interrupt)
>    <...>-2913  0d.h. 16642us : _spin_unlock_irqrestore (nv_interrupt) 

There's your problem - it looks like ata_bmdma_status() stalled the
machine for almost 17ms.
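
A small sketch of how that gap can be read off mechanically, assuming each
trace line carries the timestamp at which the named function was entered.
The regular expression and the helper names (parse, largest_gap) are
illustrative assumptions, not part of any tracer tooling; the four trace
lines are the ones quoted above:

```python
import re

# Four lines copied from the quoted latency trace.
TRACE = """\
   <...>-2913  0d.h.    9us : ata_host_intr (nv_interrupt)
   <...>-2913  0d.h.    9us!: ata_bmdma_status (ata_host_intr)
   <...>-2913  0d.h. 16641us : nv_check_hotplug_ck804 (nv_interrupt)
   <...>-2913  0d.h. 16642us : _spin_unlock_irqrestore (nv_interrupt)
"""

# timestamp in us, optional '!' or '+' marker, function, (caller)
LINE_RE = re.compile(r"\s(\d+)us\s*[!+]?\s*:\s*(\S+)\s+\((\S+)\)")


def parse(trace):
    """Return a list of (timestamp_us, function, caller) tuples."""
    entries = []
    for line in trace.splitlines():
        m = LINE_RE.search(line)
        if m:
            entries.append((int(m.group(1)), m.group(2), m.group(3)))
    return entries


def largest_gap(entries):
    """Return (gap_us, function) for the longest gap before the next entry."""
    best = None
    for (t0, fn, _), (t1, _, _) in zip(entries, entries[1:]):
        gap = t1 - t0
        if best is None or gap > best[0]:
            best = (gap, fn)
    return best


print(largest_gap(parse(TRACE)))
```

The gap is charged to the function whose entry starts it, which only shows
that the time elapsed somewhere between that entry and the next recorded
one.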

Lee


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: libata/sata_nv latency on NVIDIA CK804 [was Re: AMD64 X2 lost ticks on PM timer]
       [not found]                                   ` <20060315213638.GA17817@ti64.telemetry-investments.com>
  2006-03-15 21:45                                     ` libata/sata_nv latency on NVIDIA CK804 [was Re: AMD64 X2 lost ticks on PM timer] Lee Revell
@ 2006-03-15 21:50                                     ` Ingo Molnar
  2006-03-15 22:11                                       ` Ingo Molnar
  2006-03-15 22:30                                       ` Jeff Garzik
  2006-03-15 22:04                                     ` [patch] latency-tracing-v2.6.16.patch Ingo Molnar
  2 siblings, 2 replies; 60+ messages in thread
From: Ingo Molnar @ 2006-03-15 21:50 UTC (permalink / raw)
  To: Bill Rugolsky Jr.,
	Andi Kleen, Jeff Garzik, Lee Revell, Jason Baron, linux-kernel,
	john stultz


* Bill Rugolsky Jr. <brugolsky@telemetry-investments.com> wrote:

>    <...>-2913  0d.h.    8us : raise_softirq_irqoff (blk_complete_request)
>    <...>-2913  0d.h.    8us : __ata_qc_complete (ata_qc_complete)
>    <...>-2913  0d.h.    9us : ata_host_intr (nv_interrupt)
>    <...>-2913  0d.h.    9us!: ata_bmdma_status (ata_host_intr)
>    <...>-2913  0d.h. 16641us : nv_check_hotplug_ck804 (nv_interrupt)
>    <...>-2913  0d.h. 16642us : _spin_unlock_irqrestore (nv_interrupt)
>    <...>-2913  0d.h. 16642us : smp_apic_timer_interrupt (apic_timer_interrupt)
>    <...>-2913  0d.h. 16642us : exit_idle (smp_apic_timer_interrupt)

ouch. The codepath in question (ata_host_intr()) doesn't seem to have any 
loop that could take 16.6 msecs (!). This very much looks like some 
hardware-triggered delay - some really screwed up DMA prioritization 
perhaps, starving the host CPU for 16.6 msecs? But what DMA takes 16.6 
msecs? That's enough time to transfer dozens of megabytes of data on a 
midrange system.
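
For a rough sense of scale, the arithmetic behind "dozens of megabytes"
can be sketched as below. The 3 GB/s figure used in the usage note is only
an assumed sustained bandwidth for a midrange system, not something
measured in this thread, and bytes_in_window() is a made-up helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: how many bytes a bus could move in `usecs`
 * microseconds at a sustained rate of `bytes_per_sec`. */
static uint64_t bytes_in_window(uint64_t bytes_per_sec, uint64_t usecs)
{
	/* Guard against overflow in the multiplication below. */
	assert(usecs == 0 || bytes_per_sec <= UINT64_MAX / usecs);

	return bytes_per_sec * usecs / 1000000ULL;
}
```

With the assumed 3 GB/s, bytes_in_window(3000000000ULL, 16600) gives
49800000, i.e. roughly 50 MB for the 16.6 msec window seen in the trace.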

	Ingo

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: libata/sata_nv latency on NVIDIA CK804 [was Re: AMD64 X2 lost ticks on PM timer]
  2006-03-15 21:45                                     ` libata/sata_nv latency on NVIDIA CK804 [was Re: AMD64 X2 lost ticks on PM timer] Lee Revell
@ 2006-03-15 21:58                                       ` Ingo Molnar
  2006-03-15 22:00                                         ` Ingo Molnar
  2006-03-15 22:22                                         ` Jeff Garzik
  0 siblings, 2 replies; 60+ messages in thread
From: Ingo Molnar @ 2006-03-15 21:58 UTC (permalink / raw)
  To: Lee Revell
  Cc: Bill Rugolsky Jr.,
	Andi Kleen, Jeff Garzik, Jason Baron, linux-kernel, john stultz


* Lee Revell <rlrevell@joe-job.com> wrote:

> On Wed, 2006-03-15 at 16:36 -0500, Bill Rugolsky Jr. wrote:
> >    <...>-2913  0d.h.    9us : ata_host_intr (nv_interrupt)
> >    <...>-2913  0d.h.    9us!: ata_bmdma_status (ata_host_intr)
> >    <...>-2913  0d.h. 16641us : nv_check_hotplug_ck804 (nv_interrupt)
> >    <...>-2913  0d.h. 16642us : _spin_unlock_irqrestore (nv_interrupt) 
> 
> There's your problem - it looks like ata_bmdma_status() stalled the 
> machine for almost 17ms.

i agree. Here's a bit more detailed analysis: the tracer timestamps 
function entry points. So what we know is that from the call to 
ata_bmdma_status(), up to the call to nv_check_hotplug_ck804(), 16.6 
msecs passed. The codepath includes:

 - the whole of the ata_bmdma_status() function

 - a small portion of ata_host_intr() [from the point where it returns 
   from the ops->bmdma_status() call up to the return]

 - and a small portion of nv_interrupt(), from the ata_host_intr() 
   return to the ->check_hotplug() call.

in this particular case there's only very simple (and non-IO) 
instructions in that codepath (no loops either), except for 
ata_bmdma_status() which does IO ops: so i agree with you that the most 
likely candidate for the delay is the readb() or the inb() in 
ata_bmdma_status().
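
Since the tracer timestamps function entries, the stall can be located
mechanically by looking for the largest gap between consecutive entries.
The following is only a user-space sketch of that reading of the trace,
not part of the latency tracer itself; trace_timestamp_us() and
largest_gap() are made-up names, and the parser assumes the
"<digits>us" field format seen in the traces quoted in this thread.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Extract the microsecond timestamp from one latency-trace line, e.g.
 * "  <...>-2913  0d.h.    9us!: ata_bmdma_status (ata_host_intr)".
 * Returns 0 on success, -1 if no "<digits>us" field is found. */
static int trace_timestamp_us(const char *line, unsigned long *us)
{
	const char *p = line;

	assert(line != NULL && us != NULL);
	while (*p) {
		if (isdigit((unsigned char)*p)) {
			char *end;
			unsigned long v = strtoul(p, &end, 10);

			if (strncmp(end, "us", 2) == 0) {
				*us = v;
				return 0;
			}
			p = end;	/* digits not followed by "us": skip */
		} else {
			p++;
		}
	}
	return -1;
}

/* Return the index of the entry followed by the largest gap, i.e. the
 * last function entered before the stall, or -1 if there is none.
 * The gap itself is stored in *gap_us. */
static int largest_gap(const char *const lines[], int n,
		       unsigned long *gap_us)
{
	unsigned long prev_us = 0, cur_us, best = 0;
	int i, prev_idx = -1, best_idx = -1;

	for (i = 0; i < n; i++) {
		if (trace_timestamp_us(lines[i], &cur_us) != 0)
			continue;	/* not a trace entry */
		if (prev_idx >= 0 && cur_us >= prev_us &&
		    cur_us - prev_us > best) {
			best = cur_us - prev_us;
			best_idx = prev_idx;
		}
		prev_us = cur_us;
		prev_idx = i;
	}
	*gap_us = best;
	return best_idx;
}
```

Fed the four trace lines quoted above, this points at the
ata_bmdma_status entry with a gap of 16632 usecs before the next entry.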

I'm wondering which one of the two. inb()s are known to be horrible on 
some systems - but i've never seen them take 16 milliseconds. If it's 
the inb(), then that could also involve SMM mode and IO 
emulation/bug-workaround BIOS hackery - which could indeed cause such 
delays. [but i haven't seen such a thing either.]

the other option is that this is a random delay [e.g. DMA starvation] 
hitting ata_bmdma_status() only by accident. (That looks a bit unlikely 
though, given how related this codepath seems to the whole problem 
area.)

	Ingo

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: libata/sata_nv latency on NVIDIA CK804 [was Re: AMD64 X2 lost ticks on PM timer]
  2006-03-15 21:58                                       ` Ingo Molnar
@ 2006-03-15 22:00                                         ` Ingo Molnar
  2006-03-15 22:25                                           ` Jeff Garzik
  2006-03-16 15:13                                           ` Alan Cox
  2006-03-15 22:22                                         ` Jeff Garzik
  1 sibling, 2 replies; 60+ messages in thread
From: Ingo Molnar @ 2006-03-15 22:00 UTC (permalink / raw)
  To: Lee Revell
  Cc: Bill Rugolsky Jr.,
	Andi Kleen, Jeff Garzik, Jason Baron, linux-kernel, john stultz


* Ingo Molnar <mingo@elte.hu> wrote:

> in this particular case there's only very simple (and non-IO) 
> instructions in that codepath (no loops either), except for 
> ata_bmdma_status() which does IO ops: so i agree with you that the 
> most likely candidate for the delay is the readb() or the inb() in 
> ata_bmdma_status().
> 
> I'm wondering which one of the two. inb()s are known to be horrible on 
> some systems - but i've never seen them take 16 milliseconds. If it's 
> the inb(), then that could also involve SMM mode and IO 
> emulation/bug-workaround BIOS hackery - which could indeed cause such 
> delays. [but i haven't seen such a thing either.]
> 
> the other option is that this is a random delay [e.g. DMA starvation] 
> hitting ata_bmdma_status() only by accident. (That looks a bit 
> unlikely though, given how related this codepath seems to the whole 
> problem area.)

i'd exclude this option based on the second latency trace: that too 
shows a delay in ata_bmdma_status().

so my guess would be that this device doesn't do MMIO, and the PIO inb() 
causes some bad BIOS-based SMM handler/emulator to trigger, which takes 
16.6 msecs. If indeed the device is not in MMIO mode, is there a way to 
force it into MMIO mode, to test this theory?

	Ingo

^ permalink raw reply	[flat|nested] 60+ messages in thread

* [patch] latency-tracing-v2.6.16.patch
       [not found]                                   ` <20060315213638.GA17817@ti64.telemetry-investments.com>
  2006-03-15 21:45                                     ` libata/sata_nv latency on NVIDIA CK804 [was Re: AMD64 X2 lost ticks on PM timer] Lee Revell
  2006-03-15 21:50                                     ` Ingo Molnar
@ 2006-03-15 22:04                                     ` Ingo Molnar
  2006-03-15 22:32                                       ` Bill Rugolsky Jr.
  2 siblings, 1 reply; 60+ messages in thread
From: Ingo Molnar @ 2006-03-15 22:04 UTC (permalink / raw)
  To: Bill Rugolsky Jr.,
	Andi Kleen, Jeff Garzik, Lee Revell, Jason Baron, linux-kernel,
	john stultz


* Bill Rugolsky Jr. <brugolsky@telemetry-investments.com> wrote:

> Here are a pair of traces from Ingo's latency tracer running on 
> 2.6.16-rc6-git4 and 2.6.15 x86_64 SMP kernel with maxcpus=1 and 
> report_lost_ticks. [...]

just for the record, the latency tracer can be found at:

   http://redhat.com/~mingo/latency-tracing-patches/

latency-tracing-v2.6.16.patch would be the one for current upstream 
kernels. The codebase is the same as in the -rt tree.

	Ingo

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: libata/sata_nv latency on NVIDIA CK804 [was Re: AMD64 X2 lost ticks on PM timer]
  2006-03-15 21:50                                     ` Ingo Molnar
@ 2006-03-15 22:11                                       ` Ingo Molnar
  2006-03-15 22:33                                         ` Jeff Garzik
  2006-03-15 22:30                                       ` Jeff Garzik
  1 sibling, 1 reply; 60+ messages in thread
From: Ingo Molnar @ 2006-03-15 22:11 UTC (permalink / raw)
  To: Bill Rugolsky Jr.,
	Andi Kleen, Jeff Garzik, Lee Revell, Jason Baron, linux-kernel,
	john stultz


the patch below is a blind shot into the dark: it turns on MMIO for the 
sata_nv driver. But be careful with it - this turns on a probably 
totally untested mode in the driver and thus may damage your data. (It 
might not even work at all because the driver might not be ready for it 
- Jeff?).  I'd suggest to first boot into single-user mode with all 
filesystems readonly mounted.

on the low chance of this patch actually working, the interesting thing 
would be to check whether the latencies occur in MMIO mode too? (if they 
do then please send us the new latency traces too.)

	Ingo

---------

WARNING: this may damage your data. Be careful ...

 drivers/scsi/sata_nv.c |    1 +
 1 files changed, 1 insertion(+)

Index: linux/drivers/scsi/sata_nv.c
===================================================================
--- linux.orig/drivers/scsi/sata_nv.c
+++ linux/drivers/scsi/sata_nv.c
@@ -280,6 +280,7 @@ static struct ata_port_info nv_port_info
 	.host_flags	= ATA_FLAG_SATA |
 			  /* ATA_FLAG_SATA_RESET | */
 			  ATA_FLAG_SRST |
+			  ATA_FLAG_MMIO |
 			  ATA_FLAG_NO_LEGACY,
 	.pio_mask	= NV_PIO_MASK,
 	.mwdma_mask	= NV_MWDMA_MASK,

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: libata/sata_nv latency on NVIDIA CK804 [was Re: AMD64 X2 lost ticks on PM timer]
  2006-03-15 21:58                                       ` Ingo Molnar
  2006-03-15 22:00                                         ` Ingo Molnar
@ 2006-03-15 22:22                                         ` Jeff Garzik
  2006-03-15 22:24                                           ` Ingo Molnar
  1 sibling, 1 reply; 60+ messages in thread
From: Jeff Garzik @ 2006-03-15 22:22 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Lee Revell, Bill Rugolsky Jr.,
	Andi Kleen, Jason Baron, linux-kernel, john stultz

Ingo Molnar wrote:
> in this particular case there's only very simple (and non-IO) 
> instructions in that codepath (no loops either), except for 
> ata_bmdma_status() which does IO ops: so i agree with you that the most 
> likely candidate for the delay is the readb() or the inb() in 
> ata_bmdma_status().
> 
> I'm wondering which one of the two. inb()s are known to be horrible on 
> some systems - but i've never seen them take 16 milliseconds. If it's 
> the inb(), then that could also involve SMM mode and IO 


ata_bmdma_status() is just a single IO read, and even 1ms is highly 
improbable.

I'd look elsewhere.  There are a ton of udelay() calls in the legacy PCI 
IDE BMDMA code paths (sata_nv uses these), so I'm not surprised there is 
latency in general, in a libata+sata_nv configuration.  Status checks 
for example (ata_busy_wait in libata.h) are basically

	while (ioreadX() != condition)
		udelay(10)

That delay is mainly a "don't pound too hard on the hardware" delay.  If 
the hardware is really slow completing a command after signalling 
completion, you'll potentially wait up to 1000*10 us in some cases.  And 
there are other delays, such as the per-command ndelay() plus ioread().

Welcome to the wonderful world of IDE.

	Jeff



^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: libata/sata_nv latency on NVIDIA CK804 [was Re: AMD64 X2 lost ticks on PM timer]
  2006-03-15 22:22                                         ` Jeff Garzik
@ 2006-03-15 22:24                                           ` Ingo Molnar
  2006-03-15 22:36                                             ` Bill Rugolsky Jr.
  0 siblings, 1 reply; 60+ messages in thread
From: Ingo Molnar @ 2006-03-15 22:24 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Lee Revell, Bill Rugolsky Jr.,
	Andi Kleen, Jason Baron, linux-kernel, john stultz


* Jeff Garzik <jeff@garzik.org> wrote:

> Ingo Molnar wrote:
> >in this particular case there's only very simple (and non-IO) 
> >instructions in that codepath (no loops either), except for 
> >ata_bmdma_status() which does IO ops: so i agree with you that the most 
> >likely candidate for the delay is the readb() or the inb() in 
> >ata_bmdma_status().
> >
> >I'm wondering which one of the two. inb()s are known to be horrible on 
> >some systems - but i've never seen them take 16 milliseconds. If it's 
> >the inb(), then that could also involve SMM mode and IO 
> 
> 
> ata_bmdma_status() is just a single IO read, and even 1ms is highly 
> improbable.

well, it's a PIO inb() op i think, and could thus in theory trigger SMM 
BIOS code.

> I'd look elsewhere.  There are a ton of udelay() calls in the legacy 
> PCI IDE BMDMA code paths (sata_nv uses these), so I'm not surprised 
> there is latency in general, in a libata+sata_nv configuration. [...]

they would show up in the latency trace ... the latency trace is very 
clear, and in the previous mail i described the precise codepath where 
we observed the latency. Only that single PIO read is there AFAICS.
  
	Ingo

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: libata/sata_nv latency on NVIDIA CK804 [was Re: AMD64 X2 lost ticks on PM timer]
  2006-03-15 22:00                                         ` Ingo Molnar
@ 2006-03-15 22:25                                           ` Jeff Garzik
  2006-03-16 15:13                                           ` Alan Cox
  1 sibling, 0 replies; 60+ messages in thread
From: Jeff Garzik @ 2006-03-15 22:25 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Lee Revell, Bill Rugolsky Jr.,
	Andi Kleen, Jason Baron, linux-kernel, john stultz

Ingo Molnar wrote:
> so my guess would be that this device doesn't do MMIO, and the PIO inb() 
> causes some bad BIOS-based SMM handler/emulator to trigger, which takes 
> 16.6 msecs. If indeed the device is not in MMIO mode, is there a way to 
> force it into MMIO mode, to test this theory?

Yes, but that carries with it an entirely new programming set, to use 
MMIO: http://marc.theaimsgroup.com/?l=linux-ide&m=114124501116060&w=2

Not like NICs, where you can just point it at another PCI BAR.

	Jeff



^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: libata/sata_nv latency on NVIDIA CK804 [was Re: AMD64 X2 lost ticks on PM timer]
  2006-03-15 21:50                                     ` Ingo Molnar
  2006-03-15 22:11                                       ` Ingo Molnar
@ 2006-03-15 22:30                                       ` Jeff Garzik
  2006-03-15 22:36                                         ` Ingo Molnar
  1 sibling, 1 reply; 60+ messages in thread
From: Jeff Garzik @ 2006-03-15 22:30 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Bill Rugolsky Jr.,
	Andi Kleen, Lee Revell, Jason Baron, linux-kernel, john stultz

Ingo Molnar wrote:
> * Bill Rugolsky Jr. <brugolsky@telemetry-investments.com> wrote:
> 
> 
>>   <...>-2913  0d.h.    8us : raise_softirq_irqoff (blk_complete_request)
>>   <...>-2913  0d.h.    8us : __ata_qc_complete (ata_qc_complete)
>>   <...>-2913  0d.h.    9us : ata_host_intr (nv_interrupt)
>>   <...>-2913  0d.h.    9us!: ata_bmdma_status (ata_host_intr)
>>   <...>-2913  0d.h. 16641us : nv_check_hotplug_ck804 (nv_interrupt)
>>   <...>-2913  0d.h. 16642us : _spin_unlock_irqrestore (nv_interrupt)
>>   <...>-2913  0d.h. 16642us : smp_apic_timer_interrupt (apic_timer_interrupt)
>>   <...>-2913  0d.h. 16642us : exit_idle (smp_apic_timer_interrupt)
> 
> 
> ouch. The codepath in question (ata_host_intr()) doesn't seem to have any 
> loop that could take 16.6 msecs (!). This very much looks like some 
> hardware-triggered delay - some really screwed up DMA prioritization 
> perhaps, starving the host CPU for 16.6 msecs? But what DMA takes 16.6 
> msecs? That's enough time to transfer dozens of megabytes of data on a 
> midrange system.

Yeah, I don't see anything offhand either.

sata_nv's nv_interrupt() should be using spin_lock() rather than 
spin_lock_irqsave(), but I doubt that's a latency cause.

I would be surprised if the legacy PCI IDE registers, all PIO-based, 
would be implemented via BIOS SMM or some other similarly slow method. 
But it's not impossible...

	Jeff




^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [patch] latency-tracing-v2.6.16.patch
  2006-03-15 22:04                                     ` [patch] latency-tracing-v2.6.16.patch Ingo Molnar
@ 2006-03-15 22:32                                       ` Bill Rugolsky Jr.
  2006-03-16  9:18                                         ` Ingo Molnar
  0 siblings, 1 reply; 60+ messages in thread
From: Bill Rugolsky Jr. @ 2006-03-15 22:32 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Andi Kleen, Jeff Garzik, Lee Revell, Jason Baron, linux-kernel,
	john stultz

On Wed, Mar 15, 2006 at 11:04:33PM +0100, Ingo Molnar wrote:
> 
> * Bill Rugolsky Jr. <brugolsky@telemetry-investments.com> wrote:
> 
> > Here are a pair of traces from Ingo's latency tracer running on 
> > 2.6.16-rc6-git4 and 2.6.15 x86_64 SMP kernel with maxcpus=1 and 
> > report_lost_ticks. [...]
> 
> just for the record, the latency tracer can be found at:
> 
>    http://redhat.com/~mingo/latency-tracing-patches/
> 
> latency-tracing-v2.6.16.patch would be the one for current upstream 
> kernels. The codebase is the same as in the -rt tree.

Ingo, I had to add this incremental patch against 2.6.16-rc6-git4 in order to
get the 2.6.15-rc7 latency tracer working on x86_64.  Looks like the
problem is still there in latency-tracing-v2.6.16.patch.
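
For readers skimming the patch further down: it splits irqs_disabled()
into a pure predicate on a saved flags word, irqs_disabled_flags(), plus
a wrapper that reads the live flags first. The user-space sketch below
shows only the predicate part; the _sketch function names and the
self-check are made up for illustration, and the bit positions (9 and 18)
are taken from the patch itself.

```c
#include <assert.h>

#define X86_EFLAGS_IF	(1UL << 9)	/* interrupt enable flag */

/* Default variant: interrupts are off when IF (bit 9) is clear. */
static int irqs_disabled_flags_sketch(unsigned long flags)
{
	return !(flags & X86_EFLAGS_IF);
}

/* Variant from the first hunk, which additionally treats a set
 * bit 18 as "interrupts disabled". */
static int irqs_disabled_flags_bit18_sketch(unsigned long flags)
{
	return (flags & (1UL << 18)) || !(flags & X86_EFLAGS_IF);
}

/* Small self-check on hand-picked flags words. */
static void irqs_flags_selfcheck(void)
{
	assert(irqs_disabled_flags_sketch(0x200UL) == 0);
	assert(irqs_disabled_flags_sketch(0x0UL) == 1);
	assert(irqs_disabled_flags_bit18_sketch(0x200UL | (1UL << 18)) == 1);
}
```

Because the predicate takes the flags as an argument, a caller that has
already saved them can test the saved value without reading the live
flags register again.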

Regards,

	Bill


--- linux-2.6.16-rc6-git4-latency/include/asm-x86_64/system.h	2006-03-15 17:19:20.000000000 -0500
+++ linux-2.6.16-rc6-git4-latency/include/asm-x86_64/system.h	2006-03-15 13:45:14.000000000 -0500
@@ -341,10 +341,8 @@
 #define local_irq_disable()	do { unsigned long flags; local_save_flags(flags); local_irq_restore((flags & ~(1 << 9)) | (1 << 18)); } while (0)
 #define local_irq_enable()	do { unsigned long flags; local_save_flags(flags); local_irq_restore((flags | (1 << 9)) & ~(1 << 18)); } while (0)
 
-#define irqs_disabled()					\
+#define irqs_disabled_flags(flags)			\
 ({							\
-	unsigned long flags;				\
-	local_save_flags(flags);			\
 	(flags & (1<<18)) || !(flags & (1<<9));		\
 })
 
@@ -354,10 +352,8 @@
 #define local_irq_disable() 	__asm__ __volatile__("cli": : :"memory")
 #define local_irq_enable()	__asm__ __volatile__("sti": : :"memory")
 
-#define irqs_disabled()			\
+#define irqs_disabled_flags(flags)	\
 ({					\
-	unsigned long flags;		\
-	local_save_flags(flags);	\
 	!(flags & (1<<9));		\
 })
 
@@ -365,6 +361,13 @@
 #define local_irq_save(x) 	do { warn_if_not_ulong(x); __asm__ __volatile__("# local_irq_save \n\t pushfq ; popq %0 ; cli":"=g" (x): /* no input */ :"memory"); } while (0)
 #endif
 
+#define irqs_disabled()			\
+({					\
+	unsigned long flags;		\
+	local_save_flags(flags);	\
+	irqs_disabled_flags(flags);	\
+})
+
 /* used in the idle loop; sti takes one instruction cycle to complete */
 #define safe_halt()		__asm__ __volatile__("sti; hlt": : :"memory")
 /* used when interrupts are already enabled or to shutdown the processor */
--- linux-2.6.16-rc6-git4-latency/include/asm-x86_64/unistd.h	2006-03-15 17:19:20.000000000 -0500
+++ linux-2.6.16-rc6-git4-latency/include/asm-x86_64/unistd.h	2006-03-15 13:47:32.000000000 -0500
@@ -607,6 +607,7 @@
 __SYSCALL(__NR_unshare,	sys_unshare)
 
 #define __NR_syscall_max __NR_unshare
+#define NR_syscalls (__NR_syscall_max+1)
 
 #ifndef __NO_STUBS
 

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: libata/sata_nv latency on NVIDIA CK804 [was Re: AMD64 X2 lost ticks on PM timer]
  2006-03-15 22:11                                       ` Ingo Molnar
@ 2006-03-15 22:33                                         ` Jeff Garzik
  2006-03-15 22:44                                           ` Ingo Molnar
  0 siblings, 1 reply; 60+ messages in thread
From: Jeff Garzik @ 2006-03-15 22:33 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Bill Rugolsky Jr.,
	Andi Kleen, Lee Revell, Jason Baron, linux-kernel, john stultz

Ingo Molnar wrote:
> the patch below is a blind shot into the dark: it turns on MMIO for the 
> sata_nv driver. But be careful with it - this turns on a probably 
> totally untested mode in the driver and thus may damage your data. (It 
> might not even work at all because the driver might not be ready for it 
> - Jeff?).  I'd suggest to first boot into single-user mode with all 
> filesystems readonly mounted.
> 
> on the low chance of this patch actually working, the interesting thing 
> would be to check whether the latencies occur in MMIO mode too? (if they 
> do then please send us the new latency traces too.)
> 
> 	Ingo
> 
> ---------
> 
> WARNING: this may damage your data. Be careful ...
> 
>  drivers/scsi/sata_nv.c |    1 +
>  1 files changed, 1 insertion(+)
> 
> Index: linux/drivers/scsi/sata_nv.c
> ===================================================================
> --- linux.orig/drivers/scsi/sata_nv.c
> +++ linux/drivers/scsi/sata_nv.c
> @@ -280,6 +280,7 @@ static struct ata_port_info nv_port_info
>  	.host_flags	= ATA_FLAG_SATA |
>  			  /* ATA_FLAG_SATA_RESET | */
>  			  ATA_FLAG_SRST |
> +			  ATA_FLAG_MMIO |
>  			  ATA_FLAG_NO_LEGACY,

It won't work at all...

You have to stop talking to PCI IDE registers completely (consumes 5 PCI 
BARs), and talk exclusively to the MMIO 6th PCI BAR, at non-standard 
offsets and using a proprietary DMA descriptor format [all public now 
in that link I just sent].

My main workstation has one:
> 00:08.0 IDE interface: nVidia Corporation CK804 Serial ATA Controller (rev f3) (prog-if 85 [Master SecO PriO])
>         Subsystem: Hewlett-Packard Company: Unknown device 1500
>         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
>         Status: Cap+ 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
>         Latency: 0 (750ns min, 250ns max)
>         Interrupt: pin A routed to IRQ 201
>         Region 0: I/O ports at 28e0 [size=8]
>         Region 1: I/O ports at 2c00 [size=4]
>         Region 2: I/O ports at 28e8 [size=8]
>         Region 3: I/O ports at 2c04 [size=4]
>         Region 4: I/O ports at 28c0 [size=16]
>         Region 5: Memory at f2103000 (32-bit, non-prefetchable) [size=4K]

Regards,

	Jeff



^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: libata/sata_nv latency on NVIDIA CK804 [was Re: AMD64 X2 lost ticks on PM timer]
  2006-03-15 22:30                                       ` Jeff Garzik
@ 2006-03-15 22:36                                         ` Ingo Molnar
  0 siblings, 0 replies; 60+ messages in thread
From: Ingo Molnar @ 2006-03-15 22:36 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Bill Rugolsky Jr.,
	Andi Kleen, Lee Revell, Jason Baron, linux-kernel, john stultz


* Jeff Garzik <jeff@garzik.org> wrote:

> Ingo Molnar wrote:
> >* Bill Rugolsky Jr. <brugolsky@telemetry-investments.com> wrote:
> >
> >
> >>  <...>-2913  0d.h.    8us : raise_softirq_irqoff (blk_complete_request)
> >>  <...>-2913  0d.h.    8us : __ata_qc_complete (ata_qc_complete)
> >>  <...>-2913  0d.h.    9us : ata_host_intr (nv_interrupt)
> >>  <...>-2913  0d.h.    9us!: ata_bmdma_status (ata_host_intr)
> >>  <...>-2913  0d.h. 16641us : nv_check_hotplug_ck804 (nv_interrupt)
> >>  <...>-2913  0d.h. 16642us : _spin_unlock_irqrestore (nv_interrupt)
> >>  <...>-2913  0d.h. 16642us : smp_apic_timer_interrupt (apic_timer_interrupt)
> >>  <...>-2913  0d.h. 16642us : exit_idle (smp_apic_timer_interrupt)
> >
> >
> >ouch. The codepath in question (ata_host_intr()) doesn't seem to have any 
> >loop that could take 16.6 msecs (!). This very much looks like some 
> >hardware-triggered delay - some really screwed up DMA prioritization 
> >perhaps, starving the host CPU for 16.6 msecs? But what DMA takes 16.6 
> >msecs? That's enough time to transfer dozens of megabytes of data on a 
> >midrange system.
> 
> Yeah, I don't see anything offhand either.
> 
> sata_nv's nv_interrupt() should be using spin_lock() rather than 
> spin_lock_irqsave(), but I doubt that's a latency cause.

please see my previous codepath description. No spin_lock() calls in the 
measured codepath. They would show up in the latency trace anyway.

> I would be surprised if the legacy PCI IDE registers, all PIO-based, 
> would be implemented via BIOS SMM or some other similarly slow method.  
> But it's not impossible...

yeah. But then again - PIO ops easily take 16 usecs on almost arbitrary 
PC hardware, so an SMM trap might not be all that noticeable - 
especially if this register is considered legacy. (maybe Windows does 
the MMIO variant)

	Ingo

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: libata/sata_nv latency on NVIDIA CK804 [was Re: AMD64 X2 lost ticks on PM timer]
  2006-03-15 22:24                                           ` Ingo Molnar
@ 2006-03-15 22:36                                             ` Bill Rugolsky Jr.
  2006-03-15 22:46                                               ` Ingo Molnar
  2006-03-15 22:48                                               ` Jeff Garzik
  0 siblings, 2 replies; 60+ messages in thread
From: Bill Rugolsky Jr. @ 2006-03-15 22:36 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Jeff Garzik, Lee Revell, Andi Kleen, Jason Baron, linux-kernel,
	john stultz

On Wed, Mar 15, 2006 at 11:24:41PM +0100, Ingo Molnar wrote:
> well, it's a PIO inb() op i think, and could thus in theory trigger SMM 
> BIOS code.
 
Is there any easy way to disable more SMM stuff than "noacpi"?

If push comes to shove, I'll go all the way and just install LinuxBIOS
on the damn thing.  Though I'm sure that will be a chore.

	-Bill

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: libata/sata_nv latency on NVIDIA CK804 [was Re: AMD64 X2 lost ticks on PM timer]
  2006-03-15 22:33                                         ` Jeff Garzik
@ 2006-03-15 22:44                                           ` Ingo Molnar
  2006-03-15 22:50                                             ` Jeff Garzik
  0 siblings, 1 reply; 60+ messages in thread
From: Ingo Molnar @ 2006-03-15 22:44 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Bill Rugolsky Jr.,
	Andi Kleen, Lee Revell, Jason Baron, linux-kernel, john stultz

[-- Attachment #1: Type: text/plain, Size: 497 bytes --]


* Jeff Garzik <jeff@garzik.org> wrote:

> It won't work at all...

ok.

> You have to stop talking to PCI IDE registers completely (consumes 5 
> PCI BARs), and talk exclusively to the MMIO 6th PCI BAR, at 
> non-standard offsets and using a proprietary DMA descriptor format 
> [all public now in that link I just sent].

just to make it easier to test: i've attached the new sata_nv.c file, 
which, to test it, should be copied over the existing 
drivers/scsi/sata_nv.c file, correct?

	Ingo

[-- Attachment #2: sata_nv.c --]
[-- Type: text/plain, Size: 42799 bytes --]

/*
 *  sata_nv.c - NVIDIA nForce SATA
 *
 *  Copyright 2004 NVIDIA Corp.  All rights reserved.
 *  Copyright 2004 Andrew Chew
 *
 *
 *  This program is free software; you can redistribute it and/or modify
 *  it under the terms of the GNU General Public License as published by
 *  the Free Software Foundation; either version 2, or (at your option)
 *  any later version.
 *
 *  This program is distributed in the hope that it will be useful,
 *  but WITHOUT ANY WARRANTY; without even the implied warranty of
 *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 *  GNU General Public License for more details.
 *
 *  You should have received a copy of the GNU General Public License
 *  along with this program; see the file COPYING.  If not, write to
 *  the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA.
 *
 *
 *  libata documentation is available via 'make {ps|pdf}docs',
 *  as Documentation/DocBook/libata.*
 *
 *  No hardware documentation available outside of NVIDIA.
 *  This driver programs the NVIDIA SATA controller in a similar
 *  fashion as with other PCI IDE BMDMA controllers, with a few
 *  NV-specific details such as register offsets, SATA phy location,
 *  hotplug info, etc.
 *
 *
 *  0.08
 *     - Added support for MCP51 and MCP55.
 *
 *  0.07
 *     - Added support for RAID class code.
 *
 *  0.06
 *     - Added generic SATA support by using a pci_device_id that filters on
 *       the IDE storage class code.
 *
 *  0.03
 *     - Fixed a bug where the hotplug handlers for non-CK804/MCP04 were using
 *       mmio_base, which is only set for the CK804/MCP04 case.
 *
 *  0.02
 *     - Added support for CK804 SATA controller.
 *
 *  0.01
 *     - Initial revision.
 */

#include <linux/config.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/pci.h>
#include <linux/init.h>
#include <linux/blkdev.h>
#include <linux/delay.h>
#include <linux/interrupt.h>
#include "scsi.h"
#include <scsi/scsi_host.h>
#include <linux/libata.h>

//#define DEBUG

#define DRV_NAME			"sata_nv"
#define DRV_VERSION			"0.8"

#define NV_PORTS			2
#define NV_PIO_MASK			0x1f
#define NV_MWDMA_MASK			0x07
#define NV_UDMA_MASK			0x7f
#define NV_PORT0_SCR_REG_OFFSET		0x00
#define NV_PORT1_SCR_REG_OFFSET		0x40

#define NV_INT_STATUS			0x10
#define NV_INT_STATUS_CK804		0x440
#define NV_INT_STATUS_PDEV_INT		0x01
#define NV_INT_STATUS_PDEV_PM		0x02
#define NV_INT_STATUS_PDEV_ADDED	0x04
#define NV_INT_STATUS_PDEV_REMOVED	0x08
#define NV_INT_STATUS_SDEV_INT		0x10
#define NV_INT_STATUS_SDEV_PM		0x20
#define NV_INT_STATUS_SDEV_ADDED	0x40
#define NV_INT_STATUS_SDEV_REMOVED	0x80
#define NV_INT_STATUS_PDEV_HOTPLUG	(NV_INT_STATUS_PDEV_ADDED | \
					NV_INT_STATUS_PDEV_REMOVED)
#define NV_INT_STATUS_SDEV_HOTPLUG	(NV_INT_STATUS_SDEV_ADDED | \
					NV_INT_STATUS_SDEV_REMOVED)
#define NV_INT_STATUS_HOTPLUG		(NV_INT_STATUS_PDEV_HOTPLUG | \
					NV_INT_STATUS_SDEV_HOTPLUG)

#define NV_INT_ENABLE			0x11
#define NV_INT_ENABLE_CK804		0x441
#define NV_INT_ENABLE_PDEV_MASK		0x01
#define NV_INT_ENABLE_PDEV_PM		0x02
#define NV_INT_ENABLE_PDEV_ADDED	0x04
#define NV_INT_ENABLE_PDEV_REMOVED	0x08
#define NV_INT_ENABLE_SDEV_MASK		0x10
#define NV_INT_ENABLE_SDEV_PM		0x20
#define NV_INT_ENABLE_SDEV_ADDED	0x40
#define NV_INT_ENABLE_SDEV_REMOVED	0x80
#define NV_INT_ENABLE_PDEV_HOTPLUG	(NV_INT_ENABLE_PDEV_ADDED | \
					NV_INT_ENABLE_PDEV_REMOVED)
#define NV_INT_ENABLE_SDEV_HOTPLUG	(NV_INT_ENABLE_SDEV_ADDED | \
					NV_INT_ENABLE_SDEV_REMOVED)
#define NV_INT_ENABLE_HOTPLUG		(NV_INT_ENABLE_PDEV_HOTPLUG | \
					NV_INT_ENABLE_SDEV_HOTPLUG)

#define NV_INT_CONFIG			0x12
#define NV_INT_CONFIG_METHD		0x01 // 0 = INT, 1 = SMI

// For PCI config register 20
#define NV_MCP_SATA_CFG_20		0x50
#define NV_MCP_SATA_CFG_20_SATA_SPACE_EN	0x04
#define NV_MCP_SATA_CFG_20_PORT0_EN	(1 << 17)
#define NV_MCP_SATA_CFG_20_PORT1_EN	(1 << 16)
#define NV_MCP_SATA_CFG_20_PORT0_PWB_EN	(1 << 14)
#define NV_MCP_SATA_CFG_20_PORT1_PWB_EN	(1 << 12)

//#define NV_ADMA_NCQ

#ifdef NV_ADMA_NCQ
#define NV_ADMA_CAN_QUEUE		ATA_MAX_QUEUE
#else
#define NV_ADMA_CAN_QUEUE		ATA_DEF_QUEUE
#endif

#define NV_ADMA_CPB_SZ			128
#define NV_ADMA_APRD_SZ			16
#define NV_ADMA_SGTBL_LEN		((1024 - NV_ADMA_CPB_SZ) / NV_ADMA_APRD_SZ)
#define NV_ADMA_SGTBL_SZ                (NV_ADMA_SGTBL_LEN * NV_ADMA_APRD_SZ)
#define NV_ADMA_PORT_PRIV_DMA_SZ        (NV_ADMA_CAN_QUEUE * (NV_ADMA_CPB_SZ + NV_ADMA_SGTBL_SZ))
//#define NV_ADMA_MAX_CPBS		32

// BAR5 offset to ADMA general registers
#define NV_ADMA_GEN			0x400
#define NV_ADMA_GEN_CTL			0x00
#define NV_ADMA_NOTIFIER_CLEAR		0x30

#define NV_ADMA_CHECK_INTR(GCTL, PORT) ((GCTL) & ( 1 << (19 + (12 * (PORT)))))

// BAR5 offset to ADMA ports
#define NV_ADMA_PORT			0x480

// size of ADMA port register space 
#define NV_ADMA_PORT_SIZE		0x100

// ADMA port registers
#define NV_ADMA_CTL			0x40
#define NV_ADMA_CPB_COUNT		0x42
#define NV_ADMA_NEXT_CPB_IDX		0x43
#define NV_ADMA_STAT			0x44
#define NV_ADMA_CPB_BASE_LOW		0x48
#define NV_ADMA_CPB_BASE_HIGH		0x4C
#define NV_ADMA_APPEND			0x50
#define NV_ADMA_NOTIFIER		0x68
#define NV_ADMA_NOTIFIER_ERROR		0x6C

// NV_ADMA_CTL register bits
#define NV_ADMA_CTL_HOTPLUG_IEN		(1 << 0)
#define NV_ADMA_CTL_CHANNEL_RESET	(1 << 5)
#define NV_ADMA_CTL_GO			(1 << 7)
#define NV_ADMA_CTL_AIEN		(1 << 8)
#define NV_ADMA_CTL_READ_NON_COHERENT	(1 << 11)
#define NV_ADMA_CTL_WRITE_NON_COHERENT	(1 << 12)

// CPB response flag bits
#define NV_CPB_RESP_DONE		(1 << 0)
#define NV_CPB_RESP_ATA_ERR		(1 << 3)
#define NV_CPB_RESP_CMD_ERR		(1 << 4)
#define NV_CPB_RESP_CPB_ERR		(1 << 7)

// CPB control flag bits
#define NV_CPB_CTL_CPB_VALID		(1 << 0)
#define NV_CPB_CTL_QUEUE		(1 << 1)
#define NV_CPB_CTL_APRD_VALID		(1 << 2)
#define NV_CPB_CTL_IEN			(1 << 3)
#define NV_CPB_CTL_FPDMA		(1 << 4)

// APRD flags
#define NV_APRD_WRITE			(1 << 1)
#define NV_APRD_END			(1 << 2)
#define NV_APRD_CONT			(1 << 3)

// NV_ADMA_STAT flags
#define NV_ADMA_STAT_TIMEOUT		(1 << 0)
#define NV_ADMA_STAT_HOTUNPLUG		(1 << 1)
#define NV_ADMA_STAT_HOTPLUG		(1 << 2)
#define NV_ADMA_STAT_CPBERR		(1 << 4)
#define NV_ADMA_STAT_SERROR		(1 << 5)
#define NV_ADMA_STAT_CMD_COMPLETE	(1 << 6)
#define NV_ADMA_STAT_IDLE		(1 << 8)
#define NV_ADMA_STAT_LEGACY		(1 << 9)
#define NV_ADMA_STAT_STOPPED		(1 << 10)
#define NV_ADMA_STAT_DONE		(1 << 12)
#define NV_ADMA_STAT_ERR		(NV_ADMA_STAT_CPBERR | NV_ADMA_STAT_TIMEOUT)

// port flags
#define NV_ADMA_PORT_REGISTER_MODE	(1 << 0)

#ifndef min
#define min(x,y) ((x) < (y) ? (x) : (y))
#endif

struct nv_adma_prd {
	u64			addr;
	u32			len;
	u8			flags;
	u8			packet_len;
	u16			reserved;
};

enum nv_adma_regbits {
	CMDEND	= (1 << 15),		/* end of command list */
	WNB	= (1 << 14),		/* wait-not-BSY */
	IGN	= (1 << 13),		/* ignore this entry */
	CS1n	= (1 << (4 + 8)),	/* std. PATA signals follow... */
	DA2	= (1 << (2 + 8)),
	DA1	= (1 << (1 + 8)),
	DA0	= (1 << (0 + 8)),
};

struct nv_adma_cpb {
	u8			resp_flags;    //0
	u8			reserved1;     //1
	u8			ctl_flags;     //2
	// len is the length of the taskfile in 64-bit words
	u8			len;           //3
	u8			tag;           //4
	u8			next_cpb_idx;  //5
	u16			reserved2;     //6-7
	u16			tf[12];        //8-31
	struct nv_adma_prd	aprd[5];       //32-111
	u64                     next_aprd;     //112-119
	u64                     reserved3;     //120-127
};


struct nv_adma_port_priv {
	struct nv_adma_cpb	*cpb;
  //	u8			cpb_idx;
	u8			flags;
	u32			notifier;
	u32			notifier_error;
	dma_addr_t		cpb_dma;
	struct nv_adma_prd	*aprd;
	dma_addr_t		aprd_dma;
};

static int nv_init_one (struct pci_dev *pdev, const struct pci_device_id *ent);
static irqreturn_t nv_interrupt (int irq, void *dev_instance,
				 struct pt_regs *regs);
static u32 nv_scr_read (struct ata_port *ap, unsigned int sc_reg);
static void nv_scr_write (struct ata_port *ap, unsigned int sc_reg, u32 val);
static void nv_host_stop (struct ata_host_set *host_set);
static int nv_port_start(struct ata_port *ap);
static void nv_port_stop(struct ata_port *ap);
static int nv_adma_port_start(struct ata_port *ap);
static void nv_adma_port_stop(struct ata_port *ap);
static void nv_irq_clear(struct ata_port *ap);
static void nv_adma_irq_clear(struct ata_port *ap);
static void nv_enable_hotplug(struct ata_probe_ent *probe_ent);
static void nv_disable_hotplug(struct ata_host_set *host_set);
static void nv_check_hotplug(struct ata_host_set *host_set);
static void nv_enable_hotplug_ck804(struct ata_probe_ent *probe_ent);
static void nv_disable_hotplug_ck804(struct ata_host_set *host_set);
static void nv_check_hotplug_ck804(struct ata_host_set *host_set);
static void nv_enable_hotplug_adma(struct ata_probe_ent *probe_ent);
static void nv_disable_hotplug_adma(struct ata_host_set *host_set);
static void nv_check_hotplug_adma(struct ata_host_set *host_set);
static void nv_qc_prep(struct ata_queued_cmd *qc);
static int nv_qc_issue(struct ata_queued_cmd *qc);
static int nv_adma_qc_issue(struct ata_queued_cmd *qc);
static void nv_adma_qc_prep(struct ata_queued_cmd *qc);
static unsigned int nv_adma_tf_to_cpb(struct ata_taskfile *tf, u16 *cpb);
static void nv_adma_fill_sg(struct ata_queued_cmd *qc, struct nv_adma_cpb *cpb);
static void nv_adma_fill_aprd(struct ata_queued_cmd *qc, int idx, struct nv_adma_prd *aprd);
static void nv_adma_register_mode(struct ata_port *ap);
static void nv_adma_mode(struct ata_port *ap);
static u8 nv_bmdma_status(struct ata_port *ap);
static u8 nv_adma_bmdma_status(struct ata_port *ap);
static void nv_bmdma_stop(struct ata_queued_cmd *qc);
static void nv_adma_bmdma_stop(struct ata_queued_cmd *qc);
static void nv_eng_timeout(struct ata_port *ap);
static void nv_adma_eng_timeout(struct ata_port *ap);
#ifdef DEBUG
static void nv_adma_dump_cpb(struct nv_adma_cpb *cpb);
static void nv_adma_dump_aprd(struct nv_adma_prd *aprd);
static void nv_adma_dump_cpb_tf(u16 tf);
static void nv_adma_dump_port(struct ata_port *ap);
static void nv_adma_dump_iomem(void __iomem *m, int len);
#endif

enum nv_host_type
{
	GENERIC,
	NFORCE2,
	NFORCE3,
	CK804,
	MCP51,
	MCP55,
	ADMA
};

static struct pci_device_id nv_pci_tbl[] = {
	{ PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NFORCE2S_SATA,
		PCI_ANY_ID, PCI_ANY_ID, 0, 0, NFORCE2 },
	{ PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NFORCE3S_SATA,
		PCI_ANY_ID, PCI_ANY_ID, 0, 0, NFORCE3 },
	{ PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NFORCE3S_SATA2,
		PCI_ANY_ID, PCI_ANY_ID, 0, 0, NFORCE3 },
	{ PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NFORCE_CK804_SATA,
		PCI_ANY_ID, PCI_ANY_ID, 0, 0, ADMA },
	{ PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NFORCE_CK804_SATA2,
		PCI_ANY_ID, PCI_ANY_ID, 0, 0, ADMA },
	{ PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NFORCE_MCP04_SATA,
		PCI_ANY_ID, PCI_ANY_ID, 0, 0, ADMA },
	{ PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NFORCE_MCP04_SATA2,
		PCI_ANY_ID, PCI_ANY_ID, 0, 0, ADMA },
	{ PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NFORCE_MCP51_SATA,
		PCI_ANY_ID, PCI_ANY_ID, 0, 0, MCP51 },
	{ PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NFORCE_MCP51_SATA2,
		PCI_ANY_ID, PCI_ANY_ID, 0, 0, MCP51 },
	{ PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NFORCE_MCP55_SATA,
		PCI_ANY_ID, PCI_ANY_ID, 0, 0, MCP55 },
	{ PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NFORCE_MCP55_SATA2,
		PCI_ANY_ID, PCI_ANY_ID, 0, 0, MCP55 },
	{ PCI_VENDOR_ID_NVIDIA, PCI_ANY_ID,
		PCI_ANY_ID, PCI_ANY_ID,
		PCI_CLASS_STORAGE_IDE<<8, 0xffff00, GENERIC },
	{ PCI_VENDOR_ID_NVIDIA, PCI_ANY_ID,
		PCI_ANY_ID, PCI_ANY_ID,
		PCI_CLASS_STORAGE_RAID<<8, 0xffff00, GENERIC },
	{ 0, } /* terminate list */
};

#define NV_HOST_FLAGS_SCR_MMIO	0x00000001

struct nv_host_desc
{
	enum nv_host_type	host_type;
	void			(*enable_hotplug)(struct ata_probe_ent *probe_ent);
	void			(*disable_hotplug)(struct ata_host_set *host_set);
	void			(*check_hotplug)(struct ata_host_set *host_set);

};
static struct nv_host_desc nv_device_tbl[] = {
	{
		.host_type	= GENERIC,
		.enable_hotplug	= NULL,
		.disable_hotplug= NULL,
		.check_hotplug	= NULL,
	},
	{
		.host_type	= NFORCE2,
		.enable_hotplug	= nv_enable_hotplug,
		.disable_hotplug= nv_disable_hotplug,
		.check_hotplug	= nv_check_hotplug,
	},
	{
		.host_type	= NFORCE3,
		.enable_hotplug	= nv_enable_hotplug,
		.disable_hotplug= nv_disable_hotplug,
		.check_hotplug	= nv_check_hotplug,
	},
	{	.host_type	= CK804,
		.enable_hotplug	= nv_enable_hotplug_ck804,
		.disable_hotplug= nv_disable_hotplug_ck804,
		.check_hotplug	= nv_check_hotplug_ck804,
	},
	{	.host_type	= MCP51,
		.enable_hotplug	= nv_enable_hotplug,
		.disable_hotplug= nv_disable_hotplug,
		.check_hotplug	= nv_check_hotplug,
	},
	{	.host_type	= MCP55,
		.enable_hotplug	= nv_enable_hotplug,
		.disable_hotplug= nv_disable_hotplug,
		.check_hotplug	= nv_check_hotplug,
	},
	{	.host_type	= ADMA,
		.enable_hotplug	= nv_enable_hotplug_adma,
		.disable_hotplug= nv_disable_hotplug_adma,
		.check_hotplug	= nv_check_hotplug_adma,
	},
};

struct nv_host
{
	struct nv_host_desc	*host_desc;
	unsigned long		host_flags;
};

static struct pci_driver nv_pci_driver = {
	.name			= DRV_NAME,
	.id_table		= nv_pci_tbl,
	.probe			= nv_init_one,
	.remove			= ata_pci_remove_one,
};

static struct scsi_host_template nv_sht = {
	.module			= THIS_MODULE,
	.name			= DRV_NAME,
	.ioctl			= ata_scsi_ioctl,
	.queuecommand		= ata_scsi_queuecmd,
	.eh_strategy_handler	= ata_scsi_error,
	.can_queue		= ATA_DEF_QUEUE,
	.this_id		= ATA_SHT_THIS_ID,
	.sg_tablesize		= LIBATA_MAX_PRD,
	.max_sectors		= ATA_MAX_SECTORS,
	.cmd_per_lun		= ATA_SHT_CMD_PER_LUN,
	.emulated		= ATA_SHT_EMULATED,
	.use_clustering		= ATA_SHT_USE_CLUSTERING,
	.proc_name		= DRV_NAME,
	.dma_boundary		= ATA_DMA_BOUNDARY,
	.slave_configure	= ata_scsi_slave_config,
	.bios_param		= ata_std_bios_param,
	.ordered_flush		= 1,
};

static struct ata_port_operations nv_ops = {
	.port_disable		= ata_port_disable,
	.tf_load		= ata_tf_load,
	.tf_read		= ata_tf_read,
	.exec_command		= ata_exec_command,
	.check_status		= ata_check_status,
	.dev_select		= ata_std_dev_select,
	.phy_reset		= sata_phy_reset,
	.bmdma_setup		= ata_bmdma_setup,
	.bmdma_start		= ata_bmdma_start,
	.bmdma_stop		= nv_bmdma_stop,
	.bmdma_status		= nv_bmdma_status,
	.qc_prep		= nv_qc_prep,
	.qc_issue		= nv_qc_issue,
	.eng_timeout		= nv_eng_timeout,
	.irq_handler		= nv_interrupt,
	.irq_clear		= nv_irq_clear,
	.scr_read		= nv_scr_read,
	.scr_write		= nv_scr_write,
	.port_start		= nv_port_start,
	.port_stop		= nv_port_stop,
	.host_stop		= nv_host_stop,
};

static unsigned int nv_adma_tf_to_cpb(struct ata_taskfile *tf, u16 *cpb)
{
	unsigned int idx = 0;

	cpb[idx++] = cpu_to_le16((ATA_REG_DEVICE << 8) | tf->device | WNB);

	if ((tf->flags & ATA_TFLAG_LBA48) == 0) {
		cpb[idx++] = cpu_to_le16(IGN);
		cpb[idx++] = cpu_to_le16(IGN);
		cpb[idx++] = cpu_to_le16(IGN);
		cpb[idx++] = cpu_to_le16(IGN);
		cpb[idx++] = cpu_to_le16(IGN);
	}
	else {
		cpb[idx++] = cpu_to_le16((ATA_REG_ERR   << 8) | tf->hob_feature);
		cpb[idx++] = cpu_to_le16((ATA_REG_NSECT << 8) | tf->hob_nsect);
		cpb[idx++] = cpu_to_le16((ATA_REG_LBAL  << 8) | tf->hob_lbal);
		cpb[idx++] = cpu_to_le16((ATA_REG_LBAM  << 8) | tf->hob_lbam);
		cpb[idx++] = cpu_to_le16((ATA_REG_LBAH  << 8) | tf->hob_lbah);
	}
	cpb[idx++] = cpu_to_le16((ATA_REG_ERR    << 8) | tf->feature);
	cpb[idx++] = cpu_to_le16((ATA_REG_NSECT  << 8) | tf->nsect);
	cpb[idx++] = cpu_to_le16((ATA_REG_LBAL   << 8) | tf->lbal);
	cpb[idx++] = cpu_to_le16((ATA_REG_LBAM   << 8) | tf->lbam);
	cpb[idx++] = cpu_to_le16((ATA_REG_LBAH   << 8) | tf->lbah);

	cpb[idx++] = cpu_to_le16((ATA_REG_CMD    << 8) | tf->command | CMDEND);

	return idx;
}

static inline void __iomem *__nv_adma_ctl_block(void __iomem *mmio,
					     unsigned int port_no)
{
	mmio += NV_ADMA_PORT + port_no * NV_ADMA_PORT_SIZE;
	return mmio;
}

static inline void __iomem *nv_adma_ctl_block(struct ata_port *ap)
{
	return __nv_adma_ctl_block(ap->host_set->mmio_base, ap->port_no);
}

static inline void __iomem *nv_adma_gen_block(struct ata_port *ap)
{
	return (ap->host_set->mmio_base + NV_ADMA_GEN);
}

static inline void __iomem *nv_adma_notifier_clear_block(struct ata_port *ap)
{
	return (nv_adma_gen_block(ap) + NV_ADMA_NOTIFIER_CLEAR + (4 * ap->port_no));
}

static inline void nv_adma_reset_channel(struct ata_port *ap)
{
	void __iomem *mmio = nv_adma_ctl_block(ap);
	u16 tmp;

	// clear CPB fetch count
	writew(0, mmio + NV_ADMA_CPB_COUNT);

	// clear GO
	tmp = readw(mmio + NV_ADMA_CTL);
	writew(tmp & ~NV_ADMA_CTL_GO, mmio + NV_ADMA_CTL);

	tmp = readw(mmio + NV_ADMA_CTL);
	writew(tmp | NV_ADMA_CTL_CHANNEL_RESET, mmio + NV_ADMA_CTL);
	udelay(1);
	writew(tmp & ~NV_ADMA_CTL_CHANNEL_RESET, mmio + NV_ADMA_CTL);
}

static inline int nv_adma_host_intr(struct ata_port *ap, struct ata_queued_cmd *qc)
{
	void __iomem *mmio = nv_adma_ctl_block(ap);
	struct nv_adma_port_priv *pp = ap->private_data;
	struct nv_adma_cpb *cpb = &pp->cpb[qc->tag];
	u16 status;
	u32 gen_ctl;
	u16 flags;
	int have_err = 0;
	int handled = 0;

	status = readw(mmio + NV_ADMA_STAT);

	// if in ATA register mode, use standard ata interrupt handler
	if (pp->flags & NV_ADMA_PORT_REGISTER_MODE) {
		VPRINTK("in ATA register mode\n");
		return ata_host_intr(ap, qc);
	}

	gen_ctl = readl(nv_adma_gen_block(ap) + NV_ADMA_GEN_CTL);
	if (!NV_ADMA_CHECK_INTR(gen_ctl, ap->port_no)) {
		return 0;
	}

	if (!pp->notifier && !pp->notifier_error) {
		if (status) {
			VPRINTK("XXX no notifier, but status 0x%x\n", status);
#ifdef DEBUG
			nv_adma_dump_port(ap);
			nv_adma_dump_cpb(cpb);
#endif
		} else {
			return 0;
		}
	}
	if (pp->notifier_error) {
		have_err = 1;
		handled = 1;
	}

	if (status & NV_ADMA_STAT_TIMEOUT) {
		VPRINTK("timeout, stat = 0x%x\n", status);
		have_err = 1;
		handled = 1;
	}
	if (status & NV_ADMA_STAT_CPBERR) {
		VPRINTK("CPB error, stat = 0x%x\n", status);
		have_err = 1;
		handled = 1;
	}
	if (status & NV_ADMA_STAT_STOPPED) {
		VPRINTK("ADMA stopped, stat = 0x%x, resp_flags = 0x%x\n", status, cpb->resp_flags);
		if (!(status & NV_ADMA_STAT_DONE)) {
			have_err = 1;
			handled = 1;
		}
	}
	if (status & NV_ADMA_STAT_CMD_COMPLETE) {
		VPRINTK("ADMA command complete, stat = 0x%x\n", status);
	}
	if (status & NV_ADMA_STAT_DONE) {
		flags = cpb->resp_flags;
		VPRINTK("CPB done, stat = 0x%x, flags = 0x%x\n", status, flags);
		handled = 1;
		if (!(status & NV_ADMA_STAT_IDLE)) {
			VPRINTK("XXX CPB done, but not idle\n");
		}
		if (flags & NV_CPB_RESP_DONE) {
			VPRINTK("CPB flags done, flags = 0x%x\n", flags);
		}
		if (flags & NV_CPB_RESP_ATA_ERR) {
			VPRINTK("CPB flags ATA err, flags = 0x%x\n", flags);
			have_err = 1;
		}
		if (flags & NV_CPB_RESP_CMD_ERR) {
			VPRINTK("CPB flags CMD err, flags = 0x%x\n", flags);
			have_err = 1;
		}
		if (flags & NV_CPB_RESP_CPB_ERR) {
			VPRINTK("CPB flags CPB err, flags = 0x%x\n", flags);
			have_err = 1;
		}
	}

	// clear status
	writew(status, mmio + NV_ADMA_STAT);

	if (handled) {
		u8 ata_status = readb(mmio + (ATA_REG_STATUS * 4));
		ata_qc_complete(qc, have_err ? (ata_status | ATA_ERR) : ata_status);
	}

	return handled; /* irq handled */
}

/* FIXME: The hardware provides the necessary SATA PHY controls
 * to support ATA_FLAG_SATA_RESET.  However, it is currently
 * necessary to disable that flag, to solve misdetection problems.
 * See http://bugme.osdl.org/show_bug.cgi?id=3352 for more info.
 *
 * This problem really needs to be investigated further.  But in the
 * meantime, we avoid ATA_FLAG_SATA_RESET to get people working.
 */

static struct ata_port_info nv_port_info = {
	.sht		= &nv_sht,
	.host_flags	= ATA_FLAG_SATA |
			  /* ATA_FLAG_SATA_RESET | */
			  ATA_FLAG_SRST |
			  ATA_FLAG_NO_LEGACY,
	.pio_mask	= NV_PIO_MASK,
	.mwdma_mask	= NV_MWDMA_MASK,
	.udma_mask	= NV_UDMA_MASK,
	.port_ops	= &nv_ops,
};

MODULE_AUTHOR("NVIDIA");
MODULE_DESCRIPTION("low-level driver for NVIDIA nForce SATA controller");
MODULE_LICENSE("GPL");
MODULE_DEVICE_TABLE(pci, nv_pci_tbl);
MODULE_VERSION(DRV_VERSION);

static inline void nv_enable_adma_space (struct pci_dev *pdev)
{
	u8 regval;

	VPRINTK("ENTER\n");

	pci_read_config_byte(pdev, NV_MCP_SATA_CFG_20, &regval);
	regval |= NV_MCP_SATA_CFG_20_SATA_SPACE_EN;
	pci_write_config_byte(pdev, NV_MCP_SATA_CFG_20, regval);
}

static inline void nv_disable_adma_space (struct pci_dev *pdev)
{
	u8 regval;

	VPRINTK("ENTER\n");

	pci_read_config_byte(pdev, NV_MCP_SATA_CFG_20, &regval);
	regval &= ~NV_MCP_SATA_CFG_20_SATA_SPACE_EN;
	pci_write_config_byte(pdev, NV_MCP_SATA_CFG_20, regval);
}

static void nv_irq_clear(struct ata_port *ap)
{
	struct ata_host_set *host_set = ap->host_set;
	struct nv_host *host = host_set->private_data;

	if (host->host_desc->host_type == ADMA) {
		nv_adma_irq_clear(ap);
	} else {
		ata_bmdma_irq_clear(ap);
	}
}

static void nv_adma_irq_clear(struct ata_port *ap)
{
	/* TODO */
}

static u8 nv_bmdma_status(struct ata_port *ap)
{
	struct ata_host_set *host_set = ap->host_set;
	struct nv_host *host = host_set->private_data;

	if (host->host_desc->host_type == ADMA) {
		return nv_adma_bmdma_status(ap);
	} else {
		return ata_bmdma_status(ap);
	}
}

static u8 nv_adma_bmdma_status(struct ata_port *ap)
{
	return inb(ap->ioaddr.bmdma_addr + ATA_DMA_STATUS);
}

static void nv_bmdma_stop(struct ata_queued_cmd *qc)
{
	struct ata_host_set *host_set = qc->ap->host_set;
	struct nv_host *host = host_set->private_data;

	if (host->host_desc->host_type == ADMA) {
		nv_adma_bmdma_stop(qc);
	} else {
		ata_bmdma_stop(qc);
	}
}

static void nv_adma_bmdma_stop(struct ata_queued_cmd *qc)
{
	/* TODO */
}

static irqreturn_t nv_interrupt (int irq, void *dev_instance,
				 struct pt_regs *regs)
{
	struct ata_host_set *host_set = dev_instance;
	struct nv_host *host = host_set->private_data;
	unsigned int i;
	unsigned int handled = 0;
	unsigned long flags;

	spin_lock_irqsave(&host_set->lock, flags);

	for (i = 0; i < host_set->n_ports; i++) {
		struct ata_port *ap = host_set->ports[i];

		if (ap &&
		    !(ap->flags & (ATA_FLAG_PORT_DISABLED | ATA_FLAG_NOINTR))) {
			struct ata_queued_cmd *qc;

			// read notifiers; only ADMA hosts have the ADMA
			// register block and an nv_adma_port_priv, so do not
			// touch either before ap and the host type are checked
			if (host->host_desc->host_type == ADMA) {
				struct nv_adma_port_priv *pp = ap->private_data;
				void __iomem *mmio = nv_adma_ctl_block(ap);

				pp->notifier = readl(mmio + NV_ADMA_NOTIFIER);
				pp->notifier_error = readl(mmio + NV_ADMA_NOTIFIER_ERROR);
			}

			qc = ata_qc_from_tag(ap, ap->active_tag);
			if (qc && (!(qc->tf.ctl & ATA_NIEN))) {
				if (host->host_desc->host_type == ADMA) {
					handled += nv_adma_host_intr(ap, qc);
				} else {
					handled += ata_host_intr(ap, qc);
				}
			}
		}
	}

	if (host->host_desc->check_hotplug)
		host->host_desc->check_hotplug(host_set);

	// clear notifier (ADMA hosts only)
	if (handled && host->host_desc->host_type == ADMA) {
		for (i = 0; i < host_set->n_ports; i++) {
			struct ata_port *ap = host_set->ports[i];
			struct nv_adma_port_priv *pp;

			if (!ap || !ap->private_data)
				continue;
			pp = ap->private_data;
			writel(pp->notifier | pp->notifier_error,
			       nv_adma_notifier_clear_block(ap));
		}
	}

	spin_unlock_irqrestore(&host_set->lock, flags);

	return IRQ_RETVAL(handled);
}

static u32 nv_scr_read (struct ata_port *ap, unsigned int sc_reg)
{
	struct ata_host_set *host_set = ap->host_set;
	struct nv_host *host = host_set->private_data;
	u32 val = 0;

	VPRINTK("ENTER\n");

	if (sc_reg > SCR_CONTROL)
		return 0xffffffffU;

	if (host->host_flags & NV_HOST_FLAGS_SCR_MMIO)
		val = readl((void __iomem *)ap->ioaddr.scr_addr + (sc_reg * 4));
	else
		val = inl(ap->ioaddr.scr_addr + (sc_reg * 4));

	VPRINTK("reading SCR reg %d, got 0x%08x\n", sc_reg, val);
	return val;
}

static void nv_scr_write (struct ata_port *ap, unsigned int sc_reg, u32 val)
{
	struct ata_host_set *host_set = ap->host_set;
	struct nv_host *host = host_set->private_data;

	VPRINTK("ENTER\n");

	VPRINTK("writing SCR reg %d with 0x%08x\n", sc_reg, val);
	if (sc_reg > SCR_CONTROL)
		return;

	if (host->host_flags & NV_HOST_FLAGS_SCR_MMIO)
		writel(val, (void __iomem *)ap->ioaddr.scr_addr + (sc_reg * 4));
	else
		outl(val, ap->ioaddr.scr_addr + (sc_reg * 4));
}

static void nv_host_stop (struct ata_host_set *host_set)
{
	struct nv_host *host = host_set->private_data;
	struct pci_dev *pdev = to_pci_dev(host_set->dev);

	VPRINTK("ENTER\n");

	// Disable hotplug event interrupts.
	if (host->host_desc->disable_hotplug)
		host->host_desc->disable_hotplug(host_set);

	kfree(host);

	if (host_set->mmio_base)
		pci_iounmap(pdev, host_set->mmio_base);
}

static int nv_port_start(struct ata_port *ap)
{
	struct ata_host_set *host_set = ap->host_set;
	struct nv_host *host = host_set->private_data;

	if (host->host_desc->host_type == ADMA) {
		return nv_adma_port_start(ap);
	} else {
		return ata_port_start(ap);
	}
}

static void nv_port_stop(struct ata_port *ap)
{
	struct ata_host_set *host_set = ap->host_set;
	struct nv_host *host = host_set->private_data;

	if (host->host_desc->host_type == ADMA) {
		nv_adma_port_stop(ap);
	} else {
		ata_port_stop(ap);
	}
}

static int nv_adma_port_start(struct ata_port *ap)
{
	struct device *dev = ap->host_set->dev;
	struct nv_adma_port_priv *pp;
	int rc;
	void *mem;
	dma_addr_t mem_dma;
	void __iomem *mmio = nv_adma_ctl_block(ap);

	VPRINTK("ENTER\n");

	nv_adma_reset_channel(ap);

#ifdef DEBUG
	VPRINTK("after reset:\n");
	nv_adma_dump_port(ap);
#endif

	rc = ata_port_start(ap);
	if (rc)
		return rc;

	pp = kmalloc(sizeof(*pp), GFP_KERNEL);
	if (!pp) {
		rc = -ENOMEM;
		goto err_out;
	}
	memset(pp, 0, sizeof(*pp));

	mem = dma_alloc_coherent(dev, NV_ADMA_PORT_PRIV_DMA_SZ,
				 &mem_dma, GFP_KERNEL);
	
	VPRINTK("dma memory: vaddr = %p, paddr = 0x%llx\n", mem, (unsigned long long)mem_dma);
	
	if (!mem) {
		rc = -ENOMEM;
		goto err_out_kfree;
	}
	memset(mem, 0, NV_ADMA_PORT_PRIV_DMA_SZ);

	/*
	 * First item in chunk of DMA memory:
	 * 128-byte command parameter block (CPB)
	 * one for each command tag
	 */
	pp->cpb     = mem;
	pp->cpb_dma = mem_dma;

	VPRINTK("cpb = %p, cpb_dma = 0x%llx\n", pp->cpb, (unsigned long long)pp->cpb_dma);

	writel(mem_dma, mmio + NV_ADMA_CPB_BASE_LOW);
	writel(0,       mmio + NV_ADMA_CPB_BASE_HIGH);

	mem     += NV_ADMA_CAN_QUEUE * NV_ADMA_CPB_SZ;
	mem_dma += NV_ADMA_CAN_QUEUE * NV_ADMA_CPB_SZ;

	/*
	 * Second item: block of NV_ADMA_SGTBL_LEN s/g entries
	 */
	pp->aprd = mem;
	pp->aprd_dma = mem_dma;

	VPRINTK("aprd = %p, aprd_dma = 0x%llx\n", pp->aprd, (unsigned long long)pp->aprd_dma);

	ap->private_data = pp;

	// clear any outstanding interrupt conditions
	writew(0xffff, mmio + NV_ADMA_STAT);

	// initialize port variables
	//	pp->cpb_idx = 0;
	pp->flags = NV_ADMA_PORT_REGISTER_MODE;

	// make sure controller is in ATA register mode
	nv_adma_register_mode(ap);

	return 0;

err_out_kfree:
	kfree(pp);
err_out:
	ata_port_stop(ap);
	return rc;
}

static void nv_adma_port_stop(struct ata_port *ap)
{
	struct device *dev = ap->host_set->dev;
	struct nv_adma_port_priv *pp = ap->private_data;
	void __iomem *mmio = nv_adma_ctl_block(ap);

	VPRINTK("ENTER\n");

	writew(0, mmio + NV_ADMA_CTL);

	ap->private_data = NULL;
	dma_free_coherent(dev, NV_ADMA_PORT_PRIV_DMA_SZ, pp->cpb, pp->cpb_dma);
	kfree(pp);
	ata_port_stop(ap);
}


static void nv_adma_setup_port(struct ata_probe_ent *probe_ent, unsigned int port)
{
	void __iomem *mmio = probe_ent->mmio_base;
	struct ata_ioports *ioport = &probe_ent->port[port];

	VPRINTK("ENTER\n");

	mmio += NV_ADMA_PORT + port * NV_ADMA_PORT_SIZE;

	ioport->cmd_addr	= (unsigned long) mmio;
	ioport->data_addr	= (unsigned long) mmio + (ATA_REG_DATA * 4);
	ioport->error_addr	=
	ioport->feature_addr	= (unsigned long) mmio + (ATA_REG_ERR * 4);
	ioport->nsect_addr	= (unsigned long) mmio + (ATA_REG_NSECT * 4);
	ioport->lbal_addr	= (unsigned long) mmio + (ATA_REG_LBAL * 4);
	ioport->lbam_addr	= (unsigned long) mmio + (ATA_REG_LBAM * 4);
	ioport->lbah_addr	= (unsigned long) mmio + (ATA_REG_LBAH * 4);
	ioport->device_addr	= (unsigned long) mmio + (ATA_REG_DEVICE * 4);
	ioport->status_addr	=
	ioport->command_addr	= (unsigned long) mmio + (ATA_REG_STATUS * 4);
	ioport->altstatus_addr	=
	ioport->ctl_addr	= (unsigned long) mmio + 0x20;
}

static int nv_adma_host_init(struct ata_probe_ent *probe_ent)
{
	struct pci_dev *pdev = to_pci_dev(probe_ent->dev);
	unsigned int i;
	u32 tmp32;

	VPRINTK("ENTER\n");

	probe_ent->n_ports = NV_PORTS;

	nv_enable_adma_space(pdev);
	
	// enable ADMA on the ports
	pci_read_config_dword(pdev, NV_MCP_SATA_CFG_20, &tmp32);
	tmp32 |= NV_MCP_SATA_CFG_20_PORT0_EN |
		 NV_MCP_SATA_CFG_20_PORT0_PWB_EN |
		 NV_MCP_SATA_CFG_20_PORT1_EN |
		 NV_MCP_SATA_CFG_20_PORT1_PWB_EN;

	pci_write_config_dword(pdev, NV_MCP_SATA_CFG_20, tmp32);
	
	for (i = 0; i < probe_ent->n_ports; i++)
		nv_adma_setup_port(probe_ent, i);

	for (i = 0; i < probe_ent->n_ports; i++) {
		void __iomem *mmio = __nv_adma_ctl_block(probe_ent->mmio_base, i);
		u16 tmp;

		/* enable the ADMA interrupt (the reset bit is left untouched here) */
		tmp = readw(mmio + NV_ADMA_CTL);
		writew(tmp | NV_ADMA_CTL_AIEN, mmio + NV_ADMA_CTL);
	}

	pci_set_master(pdev);

	return 0;
}

static int nv_init_one (struct pci_dev *pdev, const struct pci_device_id *ent)
{
	static int printed_version = 0;
	struct nv_host *host;
	struct ata_port_info *ppi;
	struct ata_probe_ent *probe_ent;
	struct nv_host_desc *host_desc;
	int pci_dev_busy = 0;
	int rc;
	u32 bar;

	VPRINTK("ENTER\n");

	// Make sure this is a SATA controller by counting the number of BARs
	// (NVIDIA SATA controllers will always have six BARs).  Otherwise,
	// it's an IDE controller and we ignore it.
	for (bar=0; bar<6; bar++)
		if (pci_resource_start(pdev, bar) == 0)
			return -ENODEV;

	if (!printed_version++)
		printk(KERN_DEBUG DRV_NAME " version " DRV_VERSION "\n");

	rc = pci_enable_device(pdev);
	if (rc)
		goto err_out;

	rc = pci_request_regions(pdev, DRV_NAME);
	if (rc) {
		pci_dev_busy = 1;
		goto err_out_disable;
	}

	rc = pci_set_dma_mask(pdev, ATA_DMA_MASK);
	if (rc)
		goto err_out_regions;
	rc = pci_set_consistent_dma_mask(pdev, ATA_DMA_MASK);
	if (rc)
		goto err_out_regions;

	rc = -ENOMEM;

	ppi = &nv_port_info;
	
	host_desc = &nv_device_tbl[ent->driver_data];
	if (host_desc->host_type == ADMA) {
		// ADMA overrides
		ppi->host_flags                |= ATA_FLAG_MMIO | ATA_FLAG_SATA_RESET;
#ifdef NV_ADMA_NCQ
		ppi->host_flags		       |= ATA_FLAG_NCQ;
#endif
		ppi->sht->can_queue		= NV_ADMA_CAN_QUEUE;
		ppi->sht->sg_tablesize		= NV_ADMA_SGTBL_LEN;
//		ppi->port_ops->irq_handler	= nv_adma_interrupt;
	}
	
	probe_ent = ata_pci_init_native_mode(pdev, &ppi);
	if (!probe_ent)
		goto err_out_regions;

	host = kmalloc(sizeof(struct nv_host), GFP_KERNEL);
	if (!host)
		goto err_out_free_ent;

	memset(host, 0, sizeof(struct nv_host));
	host->host_desc = host_desc;

	probe_ent->private_data = host;

	if (pci_resource_flags(pdev, 5) & IORESOURCE_MEM)
		host->host_flags |= NV_HOST_FLAGS_SCR_MMIO;

	if (host->host_flags & NV_HOST_FLAGS_SCR_MMIO) {
		unsigned long base;

		probe_ent->mmio_base = pci_iomap(pdev, 5, 0);
		if (probe_ent->mmio_base == NULL) {
			rc = -EIO;
			goto err_out_free_host;
		}

		base = (unsigned long)probe_ent->mmio_base;
		VPRINTK("BAR5 base is at 0x%x\n", (u32)base);

		probe_ent->port[0].scr_addr =
			base + NV_PORT0_SCR_REG_OFFSET;
		probe_ent->port[1].scr_addr =
			base + NV_PORT1_SCR_REG_OFFSET;
	} else {

		probe_ent->port[0].scr_addr =
			pci_resource_start(pdev, 5) | NV_PORT0_SCR_REG_OFFSET;
		probe_ent->port[1].scr_addr =
			pci_resource_start(pdev, 5) | NV_PORT1_SCR_REG_OFFSET;
	}

	pci_set_master(pdev);

	if (ent->driver_data == ADMA) {
		rc = nv_adma_host_init(probe_ent);
		if (rc)
			goto err_out_iounmap;
	}

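	rc = ata_device_add(probe_ent);
	if (rc != NV_PORTS) {
		// ata_device_add() returns the number of ports registered,
		// not an errno; do not hand a short count back to the PCI core
		rc = -ENODEV;
		goto err_out_iounmap;
	}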

	// Enable hotplug event interrupts.
	if (host->host_desc->enable_hotplug)
		host->host_desc->enable_hotplug(probe_ent);

	kfree(probe_ent);

	return 0;

err_out_iounmap:
	if (host->host_flags & NV_HOST_FLAGS_SCR_MMIO)
		pci_iounmap(pdev, probe_ent->mmio_base);
err_out_free_host:
	kfree(host);
err_out_free_ent:
	kfree(probe_ent);
err_out_regions:
	pci_release_regions(pdev);
err_out_disable:
	if (!pci_dev_busy)
		pci_disable_device(pdev);
err_out:
	return rc;
}

static void nv_eng_timeout(struct ata_port *ap)
{
	struct ata_host_set *host_set = ap->host_set;
	struct nv_host *host = host_set->private_data;

	if (host->host_desc->host_type == ADMA) {
		nv_adma_eng_timeout(ap);
	} else {
		ata_eng_timeout(ap);
	}
}

static void nv_adma_eng_timeout(struct ata_port *ap)
{
	struct ata_queued_cmd *qc = ata_qc_from_tag(ap, ap->active_tag);
	struct nv_adma_port_priv *pp = ap->private_data;
	u8 drv_stat;

	VPRINTK("ENTER\n");
	
	if (pp->flags & NV_ADMA_PORT_REGISTER_MODE) {
		ata_eng_timeout(ap);
		goto out;
	}


	if (!qc) {
		printk(KERN_ERR "ata%u: BUG: timeout without command\n",
		       ap->id);
		goto out;
	}
	

//	spin_lock_irqsave(&host_set->lock, flags);

	qc->scsidone = scsi_finish_command;

	drv_stat = ata_chk_status(ap);

	printk(KERN_ERR "ata%u: command 0x%x timeout, stat 0x%x\n",
	       ap->id, qc->tf.command, drv_stat);

	// reset channel
	nv_adma_reset_channel(ap);

	/* complete taskfile transaction */
	ata_qc_complete(qc, drv_stat);

//	spin_unlock_irqrestore(&host_set->lock, flags);

out:
	DPRINTK("EXIT\n");
}

static void nv_qc_prep(struct ata_queued_cmd *qc)
{
	struct ata_host_set *host_set = qc->ap->host_set;
	struct nv_host *host = host_set->private_data;

	if (host->host_desc->host_type == ADMA) {
		nv_adma_qc_prep(qc);
	} else {
		ata_qc_prep(qc);
	}
}

static void nv_adma_qc_prep(struct ata_queued_cmd *qc)
{
	struct nv_adma_port_priv *pp = qc->ap->private_data;
	struct nv_adma_cpb *cpb = &pp->cpb[qc->tag];

	VPRINTK("ENTER\n");

	VPRINTK("qc->flags = 0x%x\n", (u32)qc->flags);

	if (!(qc->flags & ATA_QCFLAG_DMAMAP)) {
		ata_qc_prep(qc);
		return;
	}

	memset(cpb, 0, sizeof(struct nv_adma_cpb));
	       
	cpb->ctl_flags		= NV_CPB_CTL_CPB_VALID |
				  NV_CPB_CTL_APRD_VALID |
				  NV_CPB_CTL_IEN;
	cpb->len		= 3;
	cpb->tag		= qc->tag;
	cpb->next_cpb_idx	= 0;

#ifdef NV_ADMA_NCQ
	// turn on NCQ flags for NCQ commands
	if (qc->flags & ATA_QCFLAG_NCQ)
		cpb->ctl_flags |= NV_CPB_CTL_QUEUE | NV_CPB_CTL_FPDMA;
#endif

	nv_adma_tf_to_cpb(&qc->tf, cpb->tf);

	nv_adma_fill_sg(qc, cpb);
}

static void nv_adma_fill_sg(struct ata_queued_cmd *qc, struct nv_adma_cpb *cpb)
{
	struct nv_adma_port_priv *pp = qc->ap->private_data;
	unsigned int idx;
	struct nv_adma_prd *aprd;

	VPRINTK("ENTER\n");

	idx = 0;

	for (idx = 0; idx < qc->n_elem; idx++) {
		if (idx < 5) {
			aprd = &cpb->aprd[idx];
		} else {
			aprd = &pp->aprd[idx-5];
		}
		nv_adma_fill_aprd(qc, idx, aprd);
	}
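	if (idx > 5) {
		// FIXME: the overflow entries above are written starting at
		// pp->aprd[0] with no per-tag offset, while this pointer is
		// offset by qc->tag; the two agree only for tag 0.
		cpb->next_aprd = cpu_to_le64((u64)(pp->aprd_dma + NV_ADMA_APRD_SZ * qc->tag));
	}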
}

static void nv_adma_fill_aprd(struct ata_queued_cmd *qc,
			      int idx,
			      struct nv_adma_prd *aprd)
{
	u32 sg_len, addr, flags;

	memset(aprd, 0, sizeof(struct nv_adma_prd));

	addr   = sg_dma_address(&qc->sg[idx]);
	sg_len = sg_dma_len(&qc->sg[idx]);

	flags = 0;
	if (qc->tf.flags & ATA_TFLAG_WRITE)
		flags |= NV_APRD_WRITE;
	if (idx == qc->n_elem - 1) {
		flags |= NV_APRD_END;
	} else if (idx != 4) {
		flags |= NV_APRD_CONT;
	}

	aprd->addr  = cpu_to_le64((u64)addr);	/* addr is a 64-bit field */
	aprd->len   = cpu_to_le32(sg_len); /* len in bytes */
	aprd->flags = (u8)flags;		/* single byte, no byte swap */
}

static void nv_adma_register_mode(struct ata_port *ap)
{
	void __iomem *mmio = nv_adma_ctl_block(ap);
	struct nv_adma_port_priv *pp = ap->private_data;
	u16 tmp;

	tmp = readw(mmio + NV_ADMA_CTL);
	writew(tmp & ~NV_ADMA_CTL_GO, mmio + NV_ADMA_CTL);

	pp->flags |= NV_ADMA_PORT_REGISTER_MODE;
}

static void nv_adma_mode(struct ata_port *ap)
{
	void __iomem *mmio = nv_adma_ctl_block(ap);
	struct nv_adma_port_priv *pp = ap->private_data;
	u16 tmp;

	if (!(pp->flags & NV_ADMA_PORT_REGISTER_MODE)) {
		return;
	}

#if 0
	nv_adma_reset_channel(ap);
#endif

	tmp = readw(mmio + NV_ADMA_CTL);
	writew(tmp | NV_ADMA_CTL_GO, mmio + NV_ADMA_CTL);

	pp->flags &= ~NV_ADMA_PORT_REGISTER_MODE;
}

static int nv_qc_issue(struct ata_queued_cmd *qc)
{
	struct ata_host_set *host_set = qc->ap->host_set;
	struct nv_host *host = host_set->private_data;

	if (host->host_desc->host_type == ADMA) {
		return nv_adma_qc_issue(qc);
	} else {
		return ata_qc_issue_prot(qc);
	}
}

static int nv_adma_qc_issue(struct ata_queued_cmd *qc)
{
#if 0
	struct nv_adma_port_priv *pp = qc->ap->private_data;
#endif
	void __iomem *mmio = nv_adma_ctl_block(qc->ap);

	VPRINTK("ENTER\n");

	if (!(qc->flags & ATA_QCFLAG_DMAMAP)) {
		VPRINTK("no dmamap, using ATA register mode: 0x%x\n", (u32)qc->flags);
		// use ATA register mode
		nv_adma_register_mode(qc->ap);
		return ata_qc_issue_prot(qc);
	} else {
		nv_adma_mode(qc->ap);
	}

#if 0
	nv_adma_dump_port(qc->ap);
	nv_adma_dump_cpb(&pp->cpb[qc->tag]);
	if (qc->n_elem > 5) {
		int i;
		for (i = 0; i < qc->n_elem - 5; i++) {
			nv_adma_dump_aprd(&pp->aprd[i]);
		}
	}
#endif

	//
	// write append register, command tag in lower 8 bits
	// and (number of cpbs to append -1) in top 8 bits
	//
	mb();
	writew(qc->tag, mmio + NV_ADMA_APPEND);
	
	VPRINTK("EXIT\n");

	return 0;
}

static void nv_enable_hotplug(struct ata_probe_ent *probe_ent)
{
	u8 intr_mask;

	outb(NV_INT_STATUS_HOTPLUG,
		probe_ent->port[0].scr_addr + NV_INT_STATUS);

	intr_mask = inb(probe_ent->port[0].scr_addr + NV_INT_ENABLE);
	intr_mask |= NV_INT_ENABLE_HOTPLUG;

	outb(intr_mask, probe_ent->port[0].scr_addr + NV_INT_ENABLE);
}

static void nv_disable_hotplug(struct ata_host_set *host_set)
{
	u8 intr_mask;

	intr_mask = inb(host_set->ports[0]->ioaddr.scr_addr + NV_INT_ENABLE);

	intr_mask &= ~(NV_INT_ENABLE_HOTPLUG);

	outb(intr_mask, host_set->ports[0]->ioaddr.scr_addr + NV_INT_ENABLE);
}

static void nv_check_hotplug(struct ata_host_set *host_set)
{
	u8 intr_status;

	intr_status = inb(host_set->ports[0]->ioaddr.scr_addr + NV_INT_STATUS);

	// Clear interrupt status.
	outb(0xff, host_set->ports[0]->ioaddr.scr_addr + NV_INT_STATUS);

	if (intr_status & NV_INT_STATUS_HOTPLUG) {
		if (intr_status & NV_INT_STATUS_PDEV_ADDED)
			printk(KERN_WARNING "nv_sata: "
				"Primary device added\n");

		if (intr_status & NV_INT_STATUS_PDEV_REMOVED)
			printk(KERN_WARNING "nv_sata: "
				"Primary device removed\n");

		if (intr_status & NV_INT_STATUS_SDEV_ADDED)
			printk(KERN_WARNING "nv_sata: "
				"Secondary device added\n");

		if (intr_status & NV_INT_STATUS_SDEV_REMOVED)
			printk(KERN_WARNING "nv_sata: "
				"Secondary device removed\n");
	}
}

static void nv_enable_hotplug_ck804(struct ata_probe_ent *probe_ent)
{
	struct pci_dev *pdev = to_pci_dev(probe_ent->dev);
	u8 intr_mask;

	nv_enable_adma_space(pdev);

	writeb(NV_INT_STATUS_HOTPLUG, probe_ent->mmio_base + NV_INT_STATUS_CK804);

	intr_mask = readb(probe_ent->mmio_base + NV_INT_ENABLE_CK804);
	intr_mask |= NV_INT_ENABLE_HOTPLUG;

	writeb(intr_mask, probe_ent->mmio_base + NV_INT_ENABLE_CK804);
}

static void nv_disable_hotplug_ck804(struct ata_host_set *host_set)
{
	struct pci_dev *pdev = to_pci_dev(host_set->dev);
	u8 intr_mask;

	intr_mask = readb(host_set->mmio_base + NV_INT_ENABLE_CK804);

	intr_mask &= ~(NV_INT_ENABLE_HOTPLUG);

	writeb(intr_mask, host_set->mmio_base + NV_INT_ENABLE_CK804);

	nv_disable_adma_space(pdev);
}

static void nv_check_hotplug_ck804(struct ata_host_set *host_set)
{
	u8 intr_status;

	intr_status = readb(host_set->mmio_base + NV_INT_STATUS_CK804);

	// Clear interrupt status.
	writeb(0xff, host_set->mmio_base + NV_INT_STATUS_CK804);

	if (intr_status & NV_INT_STATUS_HOTPLUG) {
		if (intr_status & NV_INT_STATUS_PDEV_ADDED)
			printk(KERN_WARNING "nv_sata: "
				"Primary device added\n");

		if (intr_status & NV_INT_STATUS_PDEV_REMOVED)
			printk(KERN_WARNING "nv_sata: "
				"Primary device removed\n");

		if (intr_status & NV_INT_STATUS_SDEV_ADDED)
			printk(KERN_WARNING "nv_sata: "
				"Secondary device added\n");

		if (intr_status & NV_INT_STATUS_SDEV_REMOVED)
			printk(KERN_WARNING "nv_sata: "
				"Secondary device removed\n");
	}
}

static void nv_enable_hotplug_adma(struct ata_probe_ent *probe_ent)
{
	struct pci_dev *pdev = to_pci_dev(probe_ent->dev);
	unsigned int i;
	u16 tmp;

	nv_enable_adma_space(pdev);

	for (i = 0; i < probe_ent->n_ports; i++) {
		void __iomem *mmio = __nv_adma_ctl_block(probe_ent->mmio_base, i);
		writew(NV_ADMA_STAT_HOTPLUG | NV_ADMA_STAT_HOTUNPLUG,
		       mmio + NV_ADMA_STAT);

		tmp = readw(mmio + NV_ADMA_CTL);
		writew(tmp | NV_ADMA_CTL_HOTPLUG_IEN, mmio + NV_ADMA_CTL);
		
	}
}

static void nv_disable_hotplug_adma(struct ata_host_set *host_set)
{
	unsigned int i;
	u16 tmp;

	for (i = 0; i < host_set->n_ports; i++) {
		void __iomem *mmio = __nv_adma_ctl_block(host_set->mmio_base, i);

		tmp = readw(mmio + NV_ADMA_CTL);
		writew(tmp & ~NV_ADMA_CTL_HOTPLUG_IEN, mmio + NV_ADMA_CTL);
		
	}
}

static void nv_check_hotplug_adma(struct ata_host_set *host_set)
{
	unsigned int i;
	u16 adma_status;

	for (i = 0; i < host_set->n_ports; i++) {
		void __iomem *mmio = __nv_adma_ctl_block(host_set->mmio_base, i);
		adma_status = readw(mmio + NV_ADMA_STAT);
		if (adma_status & NV_ADMA_STAT_HOTPLUG) {
			printk(KERN_WARNING "nv_sata: "
			       "port %d device added\n", i);
			writew(NV_ADMA_STAT_HOTPLUG, mmio + NV_ADMA_STAT);
		}
		if (adma_status & NV_ADMA_STAT_HOTUNPLUG) {
			printk(KERN_WARNING "nv_sata: "
			       "port %d device removed\n", i);
			writew(NV_ADMA_STAT_HOTUNPLUG, mmio + NV_ADMA_STAT);
		}
	}
}

static int __init nv_init(void)
{
	return pci_module_init(&nv_pci_driver);
}

static void __exit nv_exit(void)
{
	pci_unregister_driver(&nv_pci_driver);
}

module_init(nv_init);
module_exit(nv_exit);

#ifdef DEBUG
static void nv_adma_dump_aprd(struct nv_adma_prd *aprd)
{
	printk("%016llx %08x %02x %s %s %s\n",
	       aprd->addr,
	       aprd->len,
	       aprd->flags,
	       (aprd->flags & NV_APRD_WRITE) ? "WRITE" : "     ",
	       (aprd->flags & NV_APRD_END)   ? "END"   : "   ",
	       (aprd->flags & NV_APRD_CONT)  ? "CONT"  : "    ");
}

static void nv_adma_dump_iomem(void __iomem *m, int len)
{
	int i, j;

	for (i = 0; i < len/16; i++) {
		printk(KERN_WARNING "%02x: ", 16*i);
		for (j = 0; j < 16; j++) {
			printk("%02x%s", (u32)readb(m + 16*i + j),
			       (j == 7) ? "-" : " ");
		}
		printk("\n");
	}
}

static void nv_adma_dump_cpb_tf(u16 tf)
{
	printk("0x%04x %s %s %s 0x%02x 0x%02x\n",
	       tf,
	       (tf & CMDEND) ? "END" : "   ",
	       (tf & WNB) ? "WNB" : "   ",
	       (tf & IGN) ? "IGN" : "   ",
	       ((tf >> 8) & 0x1f),
	       (tf & 0xff));
}
	
static void nv_adma_dump_port(struct ata_port *ap)
{
	void __iomem *mmio = nv_adma_ctl_block(ap);
	nv_adma_dump_iomem(mmio, NV_ADMA_PORT_SIZE);
}
			
static void nv_adma_dump_cpb(struct nv_adma_cpb *cpb)
{
	int i;

	printk("resp_flags:   0x%02x\n", cpb->resp_flags);
	printk("ctl_flags:    0x%02x\n", cpb->ctl_flags);
	printk("len:          0x%02x\n", cpb->len);
	printk("tag:          0x%02x\n", cpb->tag);
	printk("next_cpb_idx: 0x%02x\n", cpb->next_cpb_idx);
	printk("tf:\n");
	for (i = 0; i < 12; i++)
		nv_adma_dump_cpb_tf(cpb->tf[i]);
	printk("aprd:\n");
	for (i = 0; i < 5; i++)
		nv_adma_dump_aprd(&cpb->aprd[i]);
	printk("next_aprd:    0x%016llx\n", cpb->next_aprd);
}

#endif	


* Re: libata/sata_nv latency on NVIDIA CK804 [was Re: AMD64 X2 lost ticks on PM timer]
  2006-03-15 22:36                                             ` Bill Rugolsky Jr.
@ 2006-03-15 22:46                                               ` Ingo Molnar
  2006-03-15 22:48                                               ` Jeff Garzik
  1 sibling, 0 replies; 60+ messages in thread
From: Ingo Molnar @ 2006-03-15 22:46 UTC (permalink / raw)
  To: Bill Rugolsky Jr.,
	Jeff Garzik, Lee Revell, Andi Kleen, Jason Baron, linux-kernel,
	john stultz


* Bill Rugolsky Jr. <brugolsky@telemetry-investments.com> wrote:

> On Wed, Mar 15, 2006 at 11:24:41PM +0100, Ingo Molnar wrote:
> > well, it's a PIO inb() op i think, and could thus in theory trigger SMM 
> > BIOS code.
>  
> Is there any easy way to disable more SMM stuff than "noacpi"?

not that i know of. Point of SMM is to make it hard to disable by OSs.
They are a kind of hypervisor mode for the purposes of seamless hardware 
soft-extensions, from the days when it wasn't yet fashionable to call 
them virtualization ;)

> If push comes to shove, I'll go all the way and just install LinuxBIOS 
> on the damn thing.  Though I'm sure that will be a chore.

i think it might be easier to try the new sata_nv.c driver first.

	Ingo

* Re: libata/sata_nv latency on NVIDIA CK804 [was Re: AMD64 X2 lost ticks on PM timer]
  2006-03-15 22:36                                             ` Bill Rugolsky Jr.
  2006-03-15 22:46                                               ` Ingo Molnar
@ 2006-03-15 22:48                                               ` Jeff Garzik
  2006-03-15 23:31                                                 ` Lee Revell
  1 sibling, 1 reply; 60+ messages in thread
From: Jeff Garzik @ 2006-03-15 22:48 UTC (permalink / raw)
  To: Bill Rugolsky Jr.
  Cc: Ingo Molnar, Lee Revell, Andi Kleen, Jason Baron, linux-kernel,
	john stultz

Bill Rugolsky Jr. wrote:
> On Wed, Mar 15, 2006 at 11:24:41PM +0100, Ingo Molnar wrote:
> 
>>well, it's a PIO inb() op i think, and could thus in theory trigger SMM 
>>BIOS code.
> 
>  
> Is there any easy way to disable more SMM stuff than "noacpi"?

It's unlikely you can disable SMM stuff even with noacpi...

	Jeff




* Re: libata/sata_nv latency on NVIDIA CK804 [was Re: AMD64 X2 lost ticks on PM timer]
  2006-03-15 22:44                                           ` Ingo Molnar
@ 2006-03-15 22:50                                             ` Jeff Garzik
  2006-03-15 23:14                                               ` Bill Rugolsky Jr.
  2006-03-16  0:01                                               ` Lee Revell
  0 siblings, 2 replies; 60+ messages in thread
From: Jeff Garzik @ 2006-03-15 22:50 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Bill Rugolsky Jr.,
	Andi Kleen, Lee Revell, Jason Baron, linux-kernel, john stultz

Ingo Molnar wrote:
> * Jeff Garzik <jeff@garzik.org> wrote:
> 
> 
>>It won't work at all...
> 
> 
> ok.
> 
> 
>>You have to stop talking to PCI IDE registers completely (consumes 5 
>>PCI BARs), and talk exclusively to the MMIO 6th PCI BAR, at 
>>non-standard offsets and a using a proprietary DMA descriptor format 
>>[all public now in that link I just sent].
> 
> 
> just to make it easier to test: i've attached the new sata_nv.c file, 
> which, to test it, should be copied over the existing 
> drivers/scsi/sata_nv.c file, correct?

Alas, it is far from that simple :(

The code I linked to isn't in a working state.  NV contributed it 
largely as "it worked at one time" documentation of a 
previously-undocumented register interface.

Someone needs to debug it.

	Jeff




* Re: libata/sata_nv latency on NVIDIA CK804 [was Re: AMD64 X2 lost ticks on PM timer]
  2006-03-15 22:50                                             ` Jeff Garzik
@ 2006-03-15 23:14                                               ` Bill Rugolsky Jr.
  2006-03-15 23:44                                                 ` Lee Revell
  2006-03-16  3:15                                                 ` libata/sata_nv latency on NVIDIA CK804 [was Re: AMD64 X2 lost ticks on PM timer] Bill Rugolsky Jr.
  2006-03-16  0:01                                               ` Lee Revell
  1 sibling, 2 replies; 60+ messages in thread
From: Bill Rugolsky Jr. @ 2006-03-15 23:14 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Ingo Molnar, Andi Kleen, Lee Revell, Jason Baron, linux-kernel,
	john stultz

On Wed, Mar 15, 2006 at 05:50:37PM -0500, Jeff Garzik wrote:
> Alas, it is far from that simple :(
> 
> The code I linked to isn't in a working state.  NV contributed it 
> largely as "it worked at one time" documentation of a 
> previously-undocumented register interface.
> 
> Someone needs to debug it.

Errrr, guess that would be me.  Looks like a few interfaces have changed.
I'll put some time in to see whether I can get it to compile and boot.
If it's just a sata_nv issue, the easier solution is to buy a 3ware or
Areca card ... but I'll take a shot at it anyway.

[Meanwhile, I still have to switch contexts and look at the long softirq
latencies that at first glance appear to be due to the use of mempool
by the RAID1 bio code.]

	-Bill



* Re: libata/sata_nv latency on NVIDIA CK804 [was Re: AMD64 X2 lost ticks on PM timer]
  2006-03-15 22:48                                               ` Jeff Garzik
@ 2006-03-15 23:31                                                 ` Lee Revell
  0 siblings, 0 replies; 60+ messages in thread
From: Lee Revell @ 2006-03-15 23:31 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Bill Rugolsky Jr.,
	Ingo Molnar, Andi Kleen, Jason Baron, linux-kernel, john stultz

On Wed, 2006-03-15 at 17:48 -0500, Jeff Garzik wrote:
> Bill Rugolsky Jr. wrote:
> > On Wed, Mar 15, 2006 at 11:24:41PM +0100, Ingo Molnar wrote:
> > 
> >>well, it's a PIO inb() op i think, and could thus in theory trigger SMM 
> >>BIOS code.
> > 
> >  
> > Is there any easy way to disable more SMM stuff than "noacpi"?
> 
> It's unlikely you can disable SMM stuff even with noacpi...

A while back someone posted some code from RTAI to disable a bunch of
SMM stuff, check the archives.  I sometimes wonder where the heck the
realtime people were when the SMM abomination was being developed...

Lee


* Re: libata/sata_nv latency on NVIDIA CK804 [was Re: AMD64 X2 lost ticks on PM timer]
  2006-03-15 23:14                                               ` Bill Rugolsky Jr.
@ 2006-03-15 23:44                                                 ` Lee Revell
       [not found]                                                   ` <20060316002133.GE17817@ti64.telemetry-investments.com>
  2006-03-16  3:15                                                 ` libata/sata_nv latency on NVIDIA CK804 [was Re: AMD64 X2 lost ticks on PM timer] Bill Rugolsky Jr.
  1 sibling, 1 reply; 60+ messages in thread
From: Lee Revell @ 2006-03-15 23:44 UTC (permalink / raw)
  To: Bill Rugolsky Jr.
  Cc: Jeff Garzik, Ingo Molnar, Andi Kleen, Jason Baron, linux-kernel,
	john stultz

On Wed, 2006-03-15 at 18:14 -0500, Bill Rugolsky Jr. wrote:
> [Meanwhile, I still have to switch contexts and look at the long
> softirq latencies that at first glance appear to be due to the use of
> mempool by the RAID1 bio code.] 

Can you post traces of them somewhere?  There are no long running
softirqs in the two you posted (the worst is only 200 usecs or so).

Lee


* Re: libata/sata_nv latency on NVIDIA CK804 [was Re: AMD64 X2 lost ticks on PM timer]
  2006-03-15 22:50                                             ` Jeff Garzik
  2006-03-15 23:14                                               ` Bill Rugolsky Jr.
@ 2006-03-16  0:01                                               ` Lee Revell
  2006-03-16  0:14                                                 ` Jeff Garzik
  1 sibling, 1 reply; 60+ messages in thread
From: Lee Revell @ 2006-03-16  0:01 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Ingo Molnar, Bill Rugolsky Jr.,
	Andi Kleen, Jason Baron, linux-kernel, john stultz

On Wed, 2006-03-15 at 17:50 -0500, Jeff Garzik wrote:
> Alas, it is far from that simple :(
> 
> The code I linked to isn't in a working state.  NV contributed it 
> largely as "it worked at one time" documentation of a 
> previously-undocumented register interface.
> 
> Someone needs to debug it.

Would you expect every device supported by sata_nv to have this bug?

Lee


* Re: libata/sata_nv latency on NVIDIA CK804 [was Re: AMD64 X2 lost ticks on PM timer]
  2006-03-16  0:01                                               ` Lee Revell
@ 2006-03-16  0:14                                                 ` Jeff Garzik
  0 siblings, 0 replies; 60+ messages in thread
From: Jeff Garzik @ 2006-03-16  0:14 UTC (permalink / raw)
  To: Lee Revell
  Cc: Ingo Molnar, Bill Rugolsky Jr.,
	Andi Kleen, Jason Baron, linux-kernel, john stultz

Lee Revell wrote:
> 
>>Alas, it is far from that simple :(
>>
>>The code I linked to isn't in a working state.  NV contributed it 
>>largely as "it worked at one time" documentation of a 
>>previously-undocumented register interface.
>>
>>Someone needs to debug it.

> Would you expect every device supported by sata_nv to have this bug?

We don't know yet what "this bug" is.

	Jeff




* Re: Long latencies with MD RAID 1 [was Re: libata/sata_nv latency on NVIDIA CK804 ]
       [not found]                                                   ` <20060316002133.GE17817@ti64.telemetry-investments.com>
@ 2006-03-16  0:48                                                     ` Lee Revell
  0 siblings, 0 replies; 60+ messages in thread
From: Lee Revell @ 2006-03-16  0:48 UTC (permalink / raw)
  To: Bill Rugolsky Jr.
  Cc: Jeff Garzik, Ingo Molnar, Andi Kleen, Jason Baron, linux-kernel,
	john stultz

On Wed, 2006-03-15 at 19:21 -0500, Bill Rugolsky Jr. wrote:
> On Wed, Mar 15, 2006 at 06:44:02PM -0500, Lee Revell wrote:
> > On Wed, 2006-03-15 at 18:14 -0500, Bill Rugolsky Jr. wrote:
> > > [Meanwhile, I still have to switch contexts and look at the long
> > > softirq latencies that at first glance appear to be due to the use of
> > > mempool by the RAID1 bio code.] 
> > 
> > Can you post traces of them somewhere?  There are no long running
> > softirqs in the two you posted (the worst is only 200 usecs or so).
> 
> This is typical of what I'm seeing. It seems to be looping over lots
> of io request completions?
> 
> 	-Bill
> 
> preemption latency trace v1.1.5 on 2.6.16-rc6-git4-latency
> --------------------------------------------------------------------
>  latency: 1950 us, #8586/8586, CPU#0 | (M:desktop VP:0, KP:0, SP:0 HP:0 #P:1)

This looks very similar to what I see with my regular ATA drive (except
that the completions are handled in hardirq context).

You can cause less work to be done in each softirq by
lowering /sys/block/$DEV/queue/max_sectors_kb.

I would not consider ~2ms "long", there are some other softirqs that
induce 10-15ms latencies...

Lee


* Re: libata/sata_nv latency on NVIDIA CK804 [was Re: AMD64 X2 lost ticks on PM timer]
  2006-03-15 23:14                                               ` Bill Rugolsky Jr.
  2006-03-15 23:44                                                 ` Lee Revell
@ 2006-03-16  3:15                                                 ` Bill Rugolsky Jr.
  2006-03-16  4:20                                                   ` Lee Revell
  1 sibling, 1 reply; 60+ messages in thread
From: Bill Rugolsky Jr. @ 2006-03-16  3:15 UTC (permalink / raw)
  To: Jeff Garzik, Ingo Molnar, Andi Kleen, Lee Revell, Jason Baron,
	linux-kernel, john stultz

[-- Attachment #1: Type: text/plain, Size: 1947 bytes --]

On Wed, Mar 15, 2006 at 06:14:26PM -0500, Bill Rugolsky Jr. wrote:
> On Wed, Mar 15, 2006 at 05:50:37PM -0500, Jeff Garzik wrote:
> > Alas, it is far from that simple :(
> > 
> > The code I linked to isn't in a working state.  NV contributed it 
> > largely as "it worked at one time" documentation of a 
> > previously-undocumented register interface.
> > 
> > Someone needs to debug it.
> 
> Errrr, guess that would be me.  Looks like a few interfaces have changed.
> I'll put some time in to see whether I can get it to compile and boot.
> If it's just a sata_nv issue, the easier solution is to buy a 3ware or
> Areca card ... but I'll take a shot at it anyway.

Jeff,

I took a stab at it and got it to boot, but the boot hung around rc.local.
So I rebooted it with init=/bin/sh and manipulated it a bit by hand.

I have three drives in the machine:

ata1: sda: spare disk that I use to test Andi's RAID1 theory.
ata3: sdb: first drive of the system RAID1 pair.
ata4: sdc: second drive of the system RAID1 pair.

I can write to ata1 no problem; I did the following:

	cd /
	mount /dev/sda1 /mnt
	cp -axv . /mnt/root
	sync

and that all worked.

Can't seem to write to ata4:

ata4: command 0x35 timeout, stat 0x50
ata4: command 0x35 timeout, stat 0x50
ata4: command 0x35 timeout, stat 0x50

/proc/mdstat shows:

Personalities : [raid1] 
md2 : active raid1 sdc2[1] sdb2[0]
      1052160 blocks [2/2] [UU]
        resync=DELAYED
      
md3 : active raid1 sdc3[1] sdb3[0]
      586304 blocks [2/2] [UU]
      
md5 : active raid1 sdc5[1] sdb5[0]
      70838464 blocks [2/2] [UU]
      [>....................]  resync =  0.0% (64768/70838464) finish=127224.1min speed=9K/sec
      
md1 : active raid1 sdc1[1] sdb1[0]
      128384 blocks [2/2] [UU]
      
unused devices: <none>


I'm heading home now (it's 22:00, and I've been here 16 hours already), but
I figured that I'd post what I have thus far, and perhaps you can tell me
what the problem is.

	-Bill

[-- Attachment #2: sata_nv-adma-incr.patch --]
[-- Type: text/plain, Size: 5634 bytes --]

--- sata_nv.c.adma	2006-03-15 17:46:56.000000000 -0500
+++ sata_nv.c	2006-03-15 20:48:44.000000000 -0500
@@ -29,6 +29,14 @@
  *  NV-specific details such as register offsets, SATA phy location,
  *  hotplug info, etc.
  *
+ *  0.10
+ *     - Fixed spurious interrupts issue seen with the Maxtor 6H500F0 500GB
+ *       drive.  Also made the check_hotplug() callbacks return whether there
+ *       was a hotplug interrupt or not.  This was not the source of the
+ *       spurious interrupts, but is the right thing to do anyway.
+ *
+ *  0.09
+ *     - Fixed bug introduced by 0.08's MCP51 and MCP55 support.
  *
  *  0.08
  *     - Added support for MCP51 and MCP55.
@@ -59,6 +67,7 @@
 #include <linux/blkdev.h>
 #include <linux/delay.h>
 #include <linux/interrupt.h>
+#include <linux/device.h>
 #include "scsi.h"
 #include <scsi/scsi_host.h>
 #include <linux/libata.h>
@@ -277,7 +286,7 @@ static int nv_adma_qc_issue(struct ata_q
 static void nv_adma_qc_prep(struct ata_queued_cmd *qc);
 static unsigned int nv_adma_tf_to_cpb(struct ata_taskfile *tf, u16 *cpb);
 static void nv_adma_fill_sg(struct ata_queued_cmd *qc, struct nv_adma_cpb *cpb);
-static void nv_adma_fill_aprd(struct ata_queued_cmd *qc, int idx, struct nv_adma_prd *aprd);
+static void nv_adma_fill_aprd(struct ata_queued_cmd *qc, struct scatterlist *sg, int idx, struct nv_adma_prd *aprd);
 static void nv_adma_register_mode(struct ata_port *ap);
 static void nv_adma_mode(struct ata_port *ap);
 static u8 nv_bmdma_status(struct ata_port *ap);
@@ -305,7 +314,7 @@ enum nv_host_type
 	ADMA
 };
 
-static struct pci_device_id nv_pci_tbl[] = {
+static const struct pci_device_id nv_pci_tbl[] = {
 	{ PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NFORCE2S_SATA,
 		PCI_ANY_ID, PCI_ANY_ID, 0, 0, NFORCE2 },
 	{ PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NFORCE3S_SATA,
@@ -418,10 +427,9 @@ static struct scsi_host_template nv_sht 
 	.dma_boundary		= ATA_DMA_BOUNDARY,
 	.slave_configure	= ata_scsi_slave_config,
 	.bios_param		= ata_std_bios_param,
-	.ordered_flush		= 1,
 };
 
-static struct ata_port_operations nv_ops = {
+static const struct ata_port_operations nv_ops = {
 	.port_disable		= ata_port_disable,
 	.tf_load		= ata_tf_load,
 	.tf_read		= ata_tf_read,
@@ -605,7 +613,8 @@ static inline int nv_adma_host_intr(stru
 
 	if (handled) {
 		u8 ata_status = readb(mmio + (ATA_REG_STATUS * 4));
-		ata_qc_complete(qc, have_err ? (ata_status | ATA_ERR) : ata_status);
+		qc->err_mask |= ac_err_mask(have_err ? (ata_status | ATA_ERR) : ata_status);
+		ata_qc_complete(qc);
 	}
 
 	return handled; /* irq handled */
@@ -737,13 +746,11 @@ static irqreturn_t nv_interrupt (int irq
 				
 			qc = ata_qc_from_tag(ap, ap->active_tag);
 			if (qc && (!(qc->tf.ctl & ATA_NIEN))) {
-				if (host->host_desc->host_type == ADMA) {
+				if (host->host_desc->host_type == ADMA)
 					handled += nv_adma_host_intr(ap, qc);
-				} else {
+				else
 					handled += ata_host_intr(ap, qc);
-				}
 			}
-				
 		}
 
 	}
@@ -780,7 +787,7 @@ static u32 nv_scr_read (struct ata_port 
 		return 0xffffffffU;
 
 	if (host->host_flags & NV_HOST_FLAGS_SCR_MMIO)
-		val = readl((void*)ap->ioaddr.scr_addr + (sc_reg * 4));
+		val = readl((void __iomem *)ap->ioaddr.scr_addr + (sc_reg * 4));
 	else
 		val = inl(ap->ioaddr.scr_addr + (sc_reg * 4));
 
@@ -800,7 +807,7 @@ static void nv_scr_write (struct ata_por
 		return;
 
 	if (host->host_flags & NV_HOST_FLAGS_SCR_MMIO)
-		writel(val, (void*)ap->ioaddr.scr_addr + (sc_reg * 4));
+		writel(val, (void __iomem *)ap->ioaddr.scr_addr + (sc_reg * 4));
 	else
 		outl(val, ap->ioaddr.scr_addr + (sc_reg * 4));
 }
@@ -1031,7 +1038,7 @@ static int nv_init_one (struct pci_dev *
 			return -ENODEV;
 
 	if (!printed_version++)
-		printk(KERN_DEBUG DRV_NAME " version " DRV_VERSION "\n");
+		dev_printk(KERN_DEBUG, &pdev->dev, "version " DRV_VERSION "\n");
 
 	rc = pci_enable_device(pdev);
 	if (rc)
@@ -1066,7 +1073,7 @@ static int nv_init_one (struct pci_dev *
 //		ppi->port_ops->irq_handler	= nv_adma_interrupt;
 	}
 	
-	probe_ent = ata_pci_init_native_mode(pdev, &ppi);
+	probe_ent = ata_pci_init_native_mode(pdev, &ppi, ATA_PORT_PRIMARY | ATA_PORT_SECONDARY);
 	if (!probe_ent)
 		goto err_out_regions;
 
@@ -1188,7 +1195,8 @@ static void nv_adma_eng_timeout(struct a
 	nv_adma_reset_channel(ap);
 
 	/* complete taskfile transaction */
-	ata_qc_complete(qc, drv_stat);
+	qc->err_mask |= ac_err_mask(drv_stat);
+	ata_qc_complete(qc);
 
 //	spin_unlock_irqrestore(&host_set->lock, flags);
 
@@ -1247,18 +1255,16 @@ static void nv_adma_fill_sg(struct ata_q
 	struct nv_adma_port_priv *pp = qc->ap->private_data;
 	unsigned int idx;
 	struct nv_adma_prd *aprd;
+	struct scatterlist *sg;
 
 	VPRINTK("ENTER\n");
 
 	idx = 0;
 
-	for (idx = 0; idx < qc->n_elem; idx++) {
-		if (idx < 5) {
-			aprd = &cpb->aprd[idx];
-		} else {
-			aprd = &pp->aprd[idx-5];
-		}
-		nv_adma_fill_aprd(qc, idx, aprd);
+	ata_for_each_sg(sg, qc) {
+		aprd = (idx < 5) ? &cpb->aprd[idx] : &pp->aprd[idx-5];
+		nv_adma_fill_aprd(qc, sg, idx, aprd);
+		idx++;
 	}
 	if (idx > 5) {
 		cpb->next_aprd = (u64)(pp->aprd_dma + NV_ADMA_APRD_SZ * qc->tag);
@@ -1266,6 +1272,7 @@ static void nv_adma_fill_sg(struct ata_q
 }
 
 static void nv_adma_fill_aprd(struct ata_queued_cmd *qc,
+			      struct scatterlist *sg,
 			      int idx,
 			      struct nv_adma_prd *aprd)
 {
@@ -1273,8 +1280,8 @@ static void nv_adma_fill_aprd(struct ata
 
 	memset(aprd, 0, sizeof(struct nv_adma_prd));
 
-	addr   = sg_dma_address(&qc->sg[idx]);
-	sg_len = sg_dma_len(&qc->sg[idx]);
+	addr   = sg_dma_address(sg);
+	sg_len = sg_dma_len(sg);
 
 	flags = 0;
 	if (qc->tf.flags & ATA_TFLAG_WRITE)

[-- Attachment #3: linux-2.6.16-rc6-git4-sata_nv-adma.patch --]
[-- Type: text/plain, Size: 37757 bytes --]

--- linux-2.6.16-rc6-git4/drivers/scsi/sata_nv.c	2006-03-15 17:19:18.000000000 -0500
+++ linux-2.6.16-rc6-git4/drivers/scsi/sata_nv.c	2006-03-15 20:48:44.000000000 -0500
@@ -68,9 +68,12 @@
 #include <linux/delay.h>
 #include <linux/interrupt.h>
 #include <linux/device.h>
+#include "scsi.h"
 #include <scsi/scsi_host.h>
 #include <linux/libata.h>
 
+//#define DEBUG
+
 #define DRV_NAME			"sata_nv"
 #define DRV_VERSION			"0.8"
 
@@ -121,6 +124,140 @@
 // For PCI config register 20
 #define NV_MCP_SATA_CFG_20		0x50
 #define NV_MCP_SATA_CFG_20_SATA_SPACE_EN	0x04
+#define NV_MCP_SATA_CFG_20_PORT0_EN	(1 << 17)
+#define NV_MCP_SATA_CFG_20_PORT1_EN	(1 << 16)
+#define NV_MCP_SATA_CFG_20_PORT0_PWB_EN	(1 << 14)
+#define NV_MCP_SATA_CFG_20_PORT1_PWB_EN	(1 << 12)
+
+//#define NV_ADMA_NCQ
+
+#ifdef NV_ADMA_NCQ
+#define NV_ADMA_CAN_QUEUE		ATA_MAX_QUEUE
+#else
+#define NV_ADMA_CAN_QUEUE		ATA_DEF_QUEUE
+#endif
+
+#define NV_ADMA_CPB_SZ			128
+#define NV_ADMA_APRD_SZ			16
+#define NV_ADMA_SGTBL_LEN		((1024 - NV_ADMA_CPB_SZ) / NV_ADMA_APRD_SZ)
+#define NV_ADMA_SGTBL_SZ                (NV_ADMA_SGTBL_LEN * NV_ADMA_APRD_SZ)
+#define NV_ADMA_PORT_PRIV_DMA_SZ        (NV_ADMA_CAN_QUEUE * (NV_ADMA_CPB_SZ + NV_ADMA_SGTBL_SZ))
+//#define NV_ADMA_MAX_CPBS		32
+
+// BAR5 offset to ADMA general registers
+#define NV_ADMA_GEN			0x400
+#define NV_ADMA_GEN_CTL			0x00
+#define NV_ADMA_NOTIFIER_CLEAR		0x30
+
+#define NV_ADMA_CHECK_INTR(GCTL, PORT) ((GCTL) & ( 1 << (19 + (12 * (PORT)))))
+
+// BAR5 offset to ADMA ports
+#define NV_ADMA_PORT			0x480
+
+// size of ADMA port register space 
+#define NV_ADMA_PORT_SIZE		0x100
+
+// ADMA port registers
+#define NV_ADMA_CTL			0x40
+#define NV_ADMA_CPB_COUNT		0x42
+#define NV_ADMA_NEXT_CPB_IDX		0x43
+#define NV_ADMA_STAT			0x44
+#define NV_ADMA_CPB_BASE_LOW		0x48
+#define NV_ADMA_CPB_BASE_HIGH		0x4C
+#define NV_ADMA_APPEND			0x50
+#define NV_ADMA_NOTIFIER		0x68
+#define NV_ADMA_NOTIFIER_ERROR		0x6C
+
+// NV_ADMA_CTL register bits
+#define NV_ADMA_CTL_HOTPLUG_IEN		(1 << 0)
+#define NV_ADMA_CTL_CHANNEL_RESET	(1 << 5)
+#define NV_ADMA_CTL_GO			(1 << 7)
+#define NV_ADMA_CTL_AIEN		(1 << 8)
+#define NV_ADMA_CTL_READ_NON_COHERENT	(1 << 11)
+#define NV_ADMA_CTL_WRITE_NON_COHERENT	(1 << 12)
+
+// CPB response flag bits
+#define NV_CPB_RESP_DONE		(1 << 0)
+#define NV_CPB_RESP_ATA_ERR		(1 << 3)
+#define NV_CPB_RESP_CMD_ERR		(1 << 4)
+#define NV_CPB_RESP_CPB_ERR		(1 << 7)
+
+// CPB control flag bits
+#define NV_CPB_CTL_CPB_VALID		(1 << 0)
+#define NV_CPB_CTL_QUEUE		(1 << 1)
+#define NV_CPB_CTL_APRD_VALID		(1 << 2)
+#define NV_CPB_CTL_IEN			(1 << 3)
+#define NV_CPB_CTL_FPDMA		(1 << 4)
+
+// APRD flags
+#define NV_APRD_WRITE			(1 << 1)
+#define NV_APRD_END			(1 << 2)
+#define NV_APRD_CONT			(1 << 3)
+
+// NV_ADMA_STAT flags
+#define NV_ADMA_STAT_TIMEOUT		(1 << 0)
+#define NV_ADMA_STAT_HOTUNPLUG		(1 << 1)
+#define NV_ADMA_STAT_HOTPLUG		(1 << 2)
+#define NV_ADMA_STAT_CPBERR		(1 << 4)
+#define NV_ADMA_STAT_SERROR		(1 << 5)
+#define NV_ADMA_STAT_CMD_COMPLETE	(1 << 6)
+#define NV_ADMA_STAT_IDLE		(1 << 8)
+#define NV_ADMA_STAT_LEGACY		(1 << 9)
+#define NV_ADMA_STAT_STOPPED		(1 << 10)
+#define NV_ADMA_STAT_DONE		(1 << 12)
+#define NV_ADMA_STAT_ERR		(NV_ADMA_STAT_CPBERR | NV_ADMA_STAT_TIMEOUT)
+
+// port flags
+#define NV_ADMA_PORT_REGISTER_MODE	(1 << 0)
+
+#ifndef min
+#define min(x,y) ((x) < (y) ? (x) : (y))
+#endif
+
+struct nv_adma_prd {
+	u64			addr;
+	u32			len;
+	u8			flags;
+	u8			packet_len;
+	u16			reserved;
+};
+
+enum nv_adma_regbits {
+	CMDEND	= (1 << 15),		/* end of command list */
+	WNB	= (1 << 14),		/* wait-not-BSY */
+	IGN	= (1 << 13),		/* ignore this entry */
+	CS1n	= (1 << (4 + 8)),	/* std. PATA signals follow... */
+	DA2	= (1 << (2 + 8)),
+	DA1	= (1 << (1 + 8)),
+	DA0	= (1 << (0 + 8)),
+};
+
+struct nv_adma_cpb {
+	u8			resp_flags;    //0
+	u8			reserved1;     //1
+	u8			ctl_flags;     //2
+	// len is length of taskfile in 64 bit words
+ 	u8			len;           //3 
+	u8			tag;           //4
+	u8			next_cpb_idx;  //5
+	u16			reserved2;     //6-7
+	u16			tf[12];        //8-31
+	struct nv_adma_prd	aprd[5];       //32-111
+	u64                     next_aprd;     //112-119
+	u64                     reserved3;     //120-127
+};
+
+
+struct nv_adma_port_priv {
+	struct nv_adma_cpb	*cpb;
+  //	u8			cpb_idx;
+	u8			flags;
+	u32			notifier;
+	u32			notifier_error;
+	dma_addr_t		cpb_dma;
+	struct nv_adma_prd	*aprd;
+	dma_addr_t		aprd_dma;
+};
 
 static int nv_init_one (struct pci_dev *pdev, const struct pci_device_id *ent);
 static irqreturn_t nv_interrupt (int irq, void *dev_instance,
@@ -128,19 +265,53 @@ static irqreturn_t nv_interrupt (int irq
 static u32 nv_scr_read (struct ata_port *ap, unsigned int sc_reg);
 static void nv_scr_write (struct ata_port *ap, unsigned int sc_reg, u32 val);
 static void nv_host_stop (struct ata_host_set *host_set);
+static int nv_port_start(struct ata_port *ap);
+static void nv_port_stop(struct ata_port *ap);
+static int nv_adma_port_start(struct ata_port *ap);
+static void nv_adma_port_stop(struct ata_port *ap);
+static void nv_irq_clear(struct ata_port *ap);
+static void nv_adma_irq_clear(struct ata_port *ap);
 static void nv_enable_hotplug(struct ata_probe_ent *probe_ent);
 static void nv_disable_hotplug(struct ata_host_set *host_set);
-static int nv_check_hotplug(struct ata_host_set *host_set);
+static void nv_check_hotplug(struct ata_host_set *host_set);
 static void nv_enable_hotplug_ck804(struct ata_probe_ent *probe_ent);
 static void nv_disable_hotplug_ck804(struct ata_host_set *host_set);
-static int nv_check_hotplug_ck804(struct ata_host_set *host_set);
+static void nv_check_hotplug_ck804(struct ata_host_set *host_set);
+static void nv_enable_hotplug_adma(struct ata_probe_ent *probe_ent);
+static void nv_disable_hotplug_adma(struct ata_host_set *host_set);
+static void nv_check_hotplug_adma(struct ata_host_set *host_set);
+static void nv_qc_prep(struct ata_queued_cmd *qc);
+static int nv_qc_issue(struct ata_queued_cmd *qc);
+static int nv_adma_qc_issue(struct ata_queued_cmd *qc);
+static void nv_adma_qc_prep(struct ata_queued_cmd *qc);
+static unsigned int nv_adma_tf_to_cpb(struct ata_taskfile *tf, u16 *cpb);
+static void nv_adma_fill_sg(struct ata_queued_cmd *qc, struct nv_adma_cpb *cpb);
+static void nv_adma_fill_aprd(struct ata_queued_cmd *qc, struct scatterlist *sg, int idx, struct nv_adma_prd *aprd);
+static void nv_adma_register_mode(struct ata_port *ap);
+static void nv_adma_mode(struct ata_port *ap);
+static u8 nv_bmdma_status(struct ata_port *ap);
+static u8 nv_adma_bmdma_status(struct ata_port *ap);
+static void nv_bmdma_stop(struct ata_queued_cmd *qc);
+static void nv_adma_bmdma_stop(struct ata_queued_cmd *qc);
+static void nv_eng_timeout(struct ata_port *ap);
+static void nv_adma_eng_timeout(struct ata_port *ap);
+#ifdef DEBUG
+static void nv_adma_dump_cpb(struct nv_adma_cpb *cpb);
+static void nv_adma_dump_aprd(struct nv_adma_prd *aprd);
+static void nv_adma_dump_cpb_tf(u16 tf);
+static void nv_adma_dump_port(struct ata_port *ap);
+static void nv_adma_dump_iomem(void __iomem *m, int len);
+#endif
 
 enum nv_host_type
 {
 	GENERIC,
 	NFORCE2,
 	NFORCE3,
-	CK804
+	CK804,
+	MCP51,
+	MCP55,
+	ADMA
 };
 
 static const struct pci_device_id nv_pci_tbl[] = {
@@ -151,21 +322,21 @@ static const struct pci_device_id nv_pci
 	{ PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NFORCE3S_SATA2,
 		PCI_ANY_ID, PCI_ANY_ID, 0, 0, NFORCE3 },
 	{ PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NFORCE_CK804_SATA,
-		PCI_ANY_ID, PCI_ANY_ID, 0, 0, CK804 },
+		PCI_ANY_ID, PCI_ANY_ID, 0, 0, ADMA },
 	{ PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NFORCE_CK804_SATA2,
-		PCI_ANY_ID, PCI_ANY_ID, 0, 0, CK804 },
+		PCI_ANY_ID, PCI_ANY_ID, 0, 0, ADMA },
 	{ PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NFORCE_MCP04_SATA,
-		PCI_ANY_ID, PCI_ANY_ID, 0, 0, CK804 },
+		PCI_ANY_ID, PCI_ANY_ID, 0, 0, ADMA },
 	{ PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NFORCE_MCP04_SATA2,
-		PCI_ANY_ID, PCI_ANY_ID, 0, 0, CK804 },
+		PCI_ANY_ID, PCI_ANY_ID, 0, 0, ADMA },
 	{ PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NFORCE_MCP51_SATA,
-		PCI_ANY_ID, PCI_ANY_ID, 0, 0, GENERIC },
+		PCI_ANY_ID, PCI_ANY_ID, 0, 0, MCP51 },
 	{ PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NFORCE_MCP51_SATA2,
-		PCI_ANY_ID, PCI_ANY_ID, 0, 0, GENERIC },
+		PCI_ANY_ID, PCI_ANY_ID, 0, 0, MCP51 },
 	{ PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NFORCE_MCP55_SATA,
-		PCI_ANY_ID, PCI_ANY_ID, 0, 0, GENERIC },
+		PCI_ANY_ID, PCI_ANY_ID, 0, 0, MCP55 },
 	{ PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NFORCE_MCP55_SATA2,
-		PCI_ANY_ID, PCI_ANY_ID, 0, 0, GENERIC },
+		PCI_ANY_ID, PCI_ANY_ID, 0, 0, MCP55 },
 	{ PCI_VENDOR_ID_NVIDIA, PCI_ANY_ID,
 		PCI_ANY_ID, PCI_ANY_ID,
 		PCI_CLASS_STORAGE_IDE<<8, 0xffff00, GENERIC },
@@ -182,7 +353,7 @@ struct nv_host_desc
 	enum nv_host_type	host_type;
 	void			(*enable_hotplug)(struct ata_probe_ent *probe_ent);
 	void			(*disable_hotplug)(struct ata_host_set *host_set);
-	int			(*check_hotplug)(struct ata_host_set *host_set);
+	void			(*check_hotplug)(struct ata_host_set *host_set);
 
 };
 static struct nv_host_desc nv_device_tbl[] = {
@@ -209,6 +380,21 @@ static struct nv_host_desc nv_device_tbl
 		.disable_hotplug= nv_disable_hotplug_ck804,
 		.check_hotplug	= nv_check_hotplug_ck804,
 	},
+	{	.host_type	= MCP51,
+		.enable_hotplug	= nv_enable_hotplug,
+		.disable_hotplug= nv_disable_hotplug,
+		.check_hotplug	= nv_check_hotplug,
+	},
+	{	.host_type	= MCP55,
+		.enable_hotplug	= nv_enable_hotplug,
+		.disable_hotplug= nv_disable_hotplug,
+		.check_hotplug	= nv_check_hotplug,
+	},
+	{	.host_type	= ADMA,
+		.enable_hotplug	= nv_enable_hotplug_adma,
+		.disable_hotplug= nv_disable_hotplug_adma,
+		.check_hotplug	= nv_check_hotplug_adma,
+	},
 };
 
 struct nv_host
@@ -253,20 +439,187 @@ static const struct ata_port_operations 
 	.phy_reset		= sata_phy_reset,
 	.bmdma_setup		= ata_bmdma_setup,
 	.bmdma_start		= ata_bmdma_start,
-	.bmdma_stop		= ata_bmdma_stop,
-	.bmdma_status		= ata_bmdma_status,
-	.qc_prep		= ata_qc_prep,
-	.qc_issue		= ata_qc_issue_prot,
-	.eng_timeout		= ata_eng_timeout,
+	.bmdma_stop		= nv_bmdma_stop,
+	.bmdma_status		= nv_bmdma_status,
+	.qc_prep		= nv_qc_prep,
+	.qc_issue		= nv_qc_issue,
+	.eng_timeout		= nv_eng_timeout,
 	.irq_handler		= nv_interrupt,
-	.irq_clear		= ata_bmdma_irq_clear,
+	.irq_clear		= nv_irq_clear,
 	.scr_read		= nv_scr_read,
 	.scr_write		= nv_scr_write,
-	.port_start		= ata_port_start,
-	.port_stop		= ata_port_stop,
+	.port_start		= nv_port_start,
+	.port_stop		= nv_port_stop,
 	.host_stop		= nv_host_stop,
 };
 
+static unsigned int nv_adma_tf_to_cpb(struct ata_taskfile *tf, u16 *cpb)
+{
+	unsigned int idx = 0;
+
+	cpb[idx++] = cpu_to_le16((ATA_REG_DEVICE << 8) | tf->device | WNB);
+
+	if ((tf->flags & ATA_TFLAG_LBA48) == 0) {
+		cpb[idx++] = cpu_to_le16(IGN);
+		cpb[idx++] = cpu_to_le16(IGN);
+		cpb[idx++] = cpu_to_le16(IGN);
+		cpb[idx++] = cpu_to_le16(IGN);
+		cpb[idx++] = cpu_to_le16(IGN);
+	}
+	else {
+		cpb[idx++] = cpu_to_le16((ATA_REG_ERR   << 8) | tf->hob_feature);
+		cpb[idx++] = cpu_to_le16((ATA_REG_NSECT << 8) | tf->hob_nsect);
+		cpb[idx++] = cpu_to_le16((ATA_REG_LBAL  << 8) | tf->hob_lbal);
+		cpb[idx++] = cpu_to_le16((ATA_REG_LBAM  << 8) | tf->hob_lbam);
+		cpb[idx++] = cpu_to_le16((ATA_REG_LBAH  << 8) | tf->hob_lbah);
+	}
+	cpb[idx++] = cpu_to_le16((ATA_REG_ERR    << 8) | tf->feature);
+	cpb[idx++] = cpu_to_le16((ATA_REG_NSECT  << 8) | tf->nsect);
+	cpb[idx++] = cpu_to_le16((ATA_REG_LBAL   << 8) | tf->lbal);
+	cpb[idx++] = cpu_to_le16((ATA_REG_LBAM   << 8) | tf->lbam);
+	cpb[idx++] = cpu_to_le16((ATA_REG_LBAH   << 8) | tf->lbah);
+
+	cpb[idx++] = cpu_to_le16((ATA_REG_CMD    << 8) | tf->command | CMDEND);
+
+	return idx;
+}
+
+static inline void __iomem *__nv_adma_ctl_block(void __iomem *mmio,
+					     unsigned int port_no)
+{
+	mmio += NV_ADMA_PORT + port_no * NV_ADMA_PORT_SIZE;
+	return mmio;
+}
+
+static inline void __iomem *nv_adma_ctl_block(struct ata_port *ap)
+{
+	return __nv_adma_ctl_block(ap->host_set->mmio_base, ap->port_no);
+}
+
+static inline void __iomem *nv_adma_gen_block(struct ata_port *ap)
+{
+	return (ap->host_set->mmio_base + NV_ADMA_GEN);
+}
+
+static inline void __iomem *nv_adma_notifier_clear_block(struct ata_port *ap)
+{
+	return (nv_adma_gen_block(ap) + NV_ADMA_NOTIFIER_CLEAR + (4 * ap->port_no));
+}
+
+static inline void nv_adma_reset_channel(struct ata_port *ap)
+{
+	void __iomem *mmio = nv_adma_ctl_block(ap);
+	u16 tmp;
+
+	// clear CPB fetch count
+	writew(0, mmio + NV_ADMA_CPB_COUNT);
+
+	// clear GO
+	tmp = readw(mmio + NV_ADMA_CTL);
+	writew(tmp & ~NV_ADMA_CTL_GO, mmio + NV_ADMA_CTL);
+
+	tmp = readw(mmio + NV_ADMA_CTL);
+	writew(tmp | NV_ADMA_CTL_CHANNEL_RESET, mmio + NV_ADMA_CTL);
+	udelay(1);
+	writew(tmp & ~NV_ADMA_CTL_CHANNEL_RESET, mmio + NV_ADMA_CTL);
+}
+
+static inline int nv_adma_host_intr(struct ata_port *ap, struct ata_queued_cmd *qc)
+{
+	void __iomem *mmio = nv_adma_ctl_block(ap);
+	struct nv_adma_port_priv *pp = ap->private_data;
+	struct nv_adma_cpb *cpb = &pp->cpb[qc->tag];
+	u16 status;
+	u32 gen_ctl;
+	u16 flags;
+	int have_err = 0;
+	int handled = 0;
+
+	status = readw(mmio + NV_ADMA_STAT);
+
+	// if in ATA register mode, use standard ata interrupt handler
+	if (pp->flags & NV_ADMA_PORT_REGISTER_MODE) {
+		VPRINTK("in ATA register mode\n");
+		return ata_host_intr(ap, qc);
+	}
+
+	gen_ctl = readl(nv_adma_gen_block(ap) + NV_ADMA_GEN_CTL);
+	if (!NV_ADMA_CHECK_INTR(gen_ctl, ap->port_no)) {
+		return 0;
+	}
+
+	if (!pp->notifier && !pp->notifier_error) {
+		if (status) {
+			VPRINTK("XXX no notifier, but status 0x%x\n", status);
+#ifdef DEBUG
+			nv_adma_dump_port(ap);
+			nv_adma_dump_cpb(cpb);
+#endif
+		} else {
+			return 0;
+		}
+	}
+	if (pp->notifier_error) {
+		have_err = 1;
+		handled = 1;
+	}
+
+	if (status & NV_ADMA_STAT_TIMEOUT) {
+		VPRINTK("timeout, stat = 0x%x\n", status);
+		have_err = 1;
+		handled = 1;
+	}
+	if (status & NV_ADMA_STAT_CPBERR) {
+		VPRINTK("CPB error, stat = 0x%x\n", status);
+		have_err = 1;
+		handled = 1;
+	}
+	if (status & NV_ADMA_STAT_STOPPED) {
+		VPRINTK("ADMA stopped, stat = 0x%x, resp_flags = 0x%x\n", status, cpb->resp_flags);
+		if (!(status & NV_ADMA_STAT_DONE)) {
+			have_err = 1;
+			handled = 1;
+		}
+	}
+	if (status & NV_ADMA_STAT_CMD_COMPLETE) {
+		VPRINTK("ADMA command complete, stat = 0x%x\n", status);
+	}
+	if (status & NV_ADMA_STAT_DONE) {
+		flags = cpb->resp_flags;
+		VPRINTK("CPB done, stat = 0x%x, flags = 0x%x\n", status, flags);
+		handled = 1;
+		if (!(status & NV_ADMA_STAT_IDLE)) {
+			VPRINTK("XXX CPB done, but not idle\n");
+		}
+		if (flags & NV_CPB_RESP_DONE) {
+			VPRINTK("CPB flags done, flags = 0x%x\n", flags);
+		}
+		if (flags & NV_CPB_RESP_ATA_ERR) {
+			VPRINTK("CPB flags ATA err, flags = 0x%x\n", flags);
+			have_err = 1;
+		}
+		if (flags & NV_CPB_RESP_CMD_ERR) {
+			VPRINTK("CPB flags CMD err, flags = 0x%x\n", flags);
+			have_err = 1;
+		}
+		if (flags & NV_CPB_RESP_CPB_ERR) {
+			VPRINTK("CPB flags CPB err, flags = 0x%x\n", flags);
+			have_err = 1;
+		}
+	}
+
+	// clear status
+	writew(status, mmio + NV_ADMA_STAT);
+
+	if (handled) {
+		u8 ata_status = readb(mmio + (ATA_REG_STATUS * 4));
+		qc->err_mask |= ac_err_mask(have_err ? (ata_status | ATA_ERR) : ata_status);
+		ata_qc_complete(qc);
+	}
+
+	return handled; /* irq handled */
+}
+
 /* FIXME: The hardware provides the necessary SATA PHY controls
  * to support ATA_FLAG_SATA_RESET.  However, it is currently
  * necessary to disable that flag, to solve misdetection problems.
@@ -275,6 +628,7 @@ static const struct ata_port_operations 
  * This problem really needs to be investigated further.  But in the
  * meantime, we avoid ATA_FLAG_SATA_RESET to get people working.
  */
+
 static struct ata_port_info nv_port_info = {
 	.sht		= &nv_sht,
 	.host_flags	= ATA_FLAG_SATA |
@@ -293,6 +647,79 @@ MODULE_LICENSE("GPL");
 MODULE_DEVICE_TABLE(pci, nv_pci_tbl);
 MODULE_VERSION(DRV_VERSION);
 
+static inline void nv_enable_adma_space (struct pci_dev *pdev)
+{
+	u8 regval;
+
+	VPRINTK("ENTER\n");
+
+	pci_read_config_byte(pdev, NV_MCP_SATA_CFG_20, &regval);
+	regval |= NV_MCP_SATA_CFG_20_SATA_SPACE_EN;
+	pci_write_config_byte(pdev, NV_MCP_SATA_CFG_20, regval);
+}
+
+static inline void nv_disable_adma_space (struct pci_dev *pdev)
+{
+	u8 regval;
+
+	VPRINTK("ENTER\n");
+
+	pci_read_config_byte(pdev, NV_MCP_SATA_CFG_20, &regval);
+	regval &= ~NV_MCP_SATA_CFG_20_SATA_SPACE_EN;
+	pci_write_config_byte(pdev, NV_MCP_SATA_CFG_20, regval);
+}
+
+static void nv_irq_clear(struct ata_port *ap)
+{
+	struct ata_host_set *host_set = ap->host_set;
+	struct nv_host *host = host_set->private_data;
+
+	if (host->host_desc->host_type == ADMA) {
+		nv_adma_irq_clear(ap);
+	} else {
+		ata_bmdma_irq_clear(ap);
+	}
+}
+
+static void nv_adma_irq_clear(struct ata_port *ap)
+{
+	/* TODO */
+}
+
+static u8 nv_bmdma_status(struct ata_port *ap)
+{
+	struct ata_host_set *host_set = ap->host_set;
+	struct nv_host *host = host_set->private_data;
+
+	if (host->host_desc->host_type == ADMA) {
+		return nv_adma_bmdma_status(ap);
+	} else {
+		return ata_bmdma_status(ap);
+	}
+}
+
+static u8 nv_adma_bmdma_status(struct ata_port *ap)
+{
+	return inb(ap->ioaddr.bmdma_addr + ATA_DMA_STATUS);
+}
+
+static void nv_bmdma_stop(struct ata_queued_cmd *qc)
+{
+	struct ata_host_set *host_set = qc->ap->host_set;
+	struct nv_host *host = host_set->private_data;
+
+	if (host->host_desc->host_type == ADMA) {
+		nv_adma_bmdma_stop(qc);
+	} else {
+		ata_bmdma_stop(qc);
+	}
+}
+
+static void nv_adma_bmdma_stop(struct ata_queued_cmd *qc)
+{
+	/* TODO */
+}
+
 static irqreturn_t nv_interrupt (int irq, void *dev_instance,
 				 struct pt_regs *regs)
 {
@@ -305,26 +732,41 @@ static irqreturn_t nv_interrupt (int irq
 	spin_lock_irqsave(&host_set->lock, flags);
 
 	for (i = 0; i < host_set->n_ports; i++) {
-		struct ata_port *ap;
+		struct ata_port *ap = host_set->ports[i];
+		struct nv_adma_port_priv *pp = ap ? ap->private_data : NULL;
 
-		ap = host_set->ports[i];
 		if (ap &&
 		    !(ap->flags & (ATA_FLAG_PORT_DISABLED | ATA_FLAG_NOINTR))) {
+			void __iomem *mmio = nv_adma_ctl_block(ap);
 			struct ata_queued_cmd *qc;
 
+			// read notifiers
+			pp->notifier = readl(mmio + NV_ADMA_NOTIFIER);
+			pp->notifier_error = readl(mmio + NV_ADMA_NOTIFIER_ERROR);
+
 			qc = ata_qc_from_tag(ap, ap->active_tag);
-			if (qc && (!(qc->tf.ctl & ATA_NIEN)))
-				handled += ata_host_intr(ap, qc);
-			else
-				// No request pending?  Clear interrupt status
-				// anyway, in case there's one pending.
-				ap->ops->check_status(ap);
+			if (qc && (!(qc->tf.ctl & ATA_NIEN))) {
+				if (host->host_desc->host_type == ADMA)
+					handled += nv_adma_host_intr(ap, qc);
+				else
+					handled += ata_host_intr(ap, qc);
+			}
 		}
 
 	}
 
 	if (host->host_desc->check_hotplug)
-		handled += host->host_desc->check_hotplug(host_set);
+		host->host_desc->check_hotplug(host_set);
+
+	// clear notifier
+	if (handled) {
+		for (i = 0; i < host_set->n_ports; i++) {
+			struct ata_port *ap = host_set->ports[i];
+			struct nv_adma_port_priv *pp = ap->private_data;
+			writel(pp->notifier | pp->notifier_error,
+			       nv_adma_notifier_clear_block(ap));
+		}
+	}
 
 	spin_unlock_irqrestore(&host_set->lock, flags);
 
@@ -335,14 +777,22 @@ static u32 nv_scr_read (struct ata_port 
 {
 	struct ata_host_set *host_set = ap->host_set;
 	struct nv_host *host = host_set->private_data;
+	u32 val = 0;
+
+	VPRINTK("ENTER\n");
+
+	VPRINTK("reading SCR reg %d\n", sc_reg);
 
 	if (sc_reg > SCR_CONTROL)
 		return 0xffffffffU;
 
 	if (host->host_flags & NV_HOST_FLAGS_SCR_MMIO)
-		return readl((void __iomem *)ap->ioaddr.scr_addr + (sc_reg * 4));
+		val = readl((void __iomem *)ap->ioaddr.scr_addr + (sc_reg * 4));
 	else
-		return inl(ap->ioaddr.scr_addr + (sc_reg * 4));
+		val = inl(ap->ioaddr.scr_addr + (sc_reg * 4));
+
+	VPRINTK("reading SCR reg %d, got 0x%08x\n", sc_reg, val);
+	return val;
 }
 
 static void nv_scr_write (struct ata_port *ap, unsigned int sc_reg, u32 val)
@@ -350,6 +800,9 @@ static void nv_scr_write (struct ata_por
 	struct ata_host_set *host_set = ap->host_set;
 	struct nv_host *host = host_set->private_data;
 
+	VPRINTK("ENTER\n");
+
+	VPRINTK("writing SCR reg %d with 0x%08x\n", sc_reg, val);
 	if (sc_reg > SCR_CONTROL)
 		return;
 
@@ -364,6 +817,8 @@ static void nv_host_stop (struct ata_hos
 	struct nv_host *host = host_set->private_data;
 	struct pci_dev *pdev = to_pci_dev(host_set->dev);
 
+	VPRINTK("ENTER\n");
+
 	// Disable hotplug event interrupts.
 	if (host->host_desc->disable_hotplug)
 		host->host_desc->disable_hotplug(host_set);
@@ -374,16 +829,207 @@ static void nv_host_stop (struct ata_hos
 		pci_iounmap(pdev, host_set->mmio_base);
 }
 
+static int nv_port_start(struct ata_port *ap)
+{
+	struct ata_host_set *host_set = ap->host_set;
+	struct nv_host *host = host_set->private_data;
+
+	if (host->host_desc->host_type == ADMA) {
+		return nv_adma_port_start(ap);
+	} else {
+		return ata_port_start(ap);
+	}
+}
+
+static void nv_port_stop(struct ata_port *ap)
+{
+	struct ata_host_set *host_set = ap->host_set;
+	struct nv_host *host = host_set->private_data;
+
+	if (host->host_desc->host_type == ADMA) {
+		nv_adma_port_stop(ap);
+	} else {
+		ata_port_stop(ap);
+	}
+}
+
+static int nv_adma_port_start(struct ata_port *ap)
+{
+	struct device *dev = ap->host_set->dev;
+	struct nv_adma_port_priv *pp;
+	int rc;
+	void *mem;
+	dma_addr_t mem_dma;
+	void __iomem *mmio = nv_adma_ctl_block(ap);
+
+	VPRINTK("ENTER\n");
+
+	nv_adma_reset_channel(ap);
+
+#ifdef DEBUG
+	VPRINTK("after reset:\n");
+	nv_adma_dump_port(ap);
+#endif
+
+	rc = ata_port_start(ap);
+	if (rc)
+		return rc;
+
+	pp = kmalloc(sizeof(*pp), GFP_KERNEL);
+	if (!pp) {
+		rc = -ENOMEM;
+		goto err_out;
+	}
+	memset(pp, 0, sizeof(*pp));
+
+	mem = dma_alloc_coherent(dev, NV_ADMA_PORT_PRIV_DMA_SZ,
+				 &mem_dma, GFP_KERNEL);
+
+	VPRINTK("dma memory: vaddr = %p, paddr = 0x%llx\n", mem, (unsigned long long)mem_dma);
+
+	if (!mem) {
+		rc = -ENOMEM;
+		goto err_out_kfree;
+	}
+	memset(mem, 0, NV_ADMA_PORT_PRIV_DMA_SZ);
+
+	/*
+	 * First item in chunk of DMA memory:
+	 * 128-byte command parameter block (CPB)
+	 * one for each command tag
+	 */
+	pp->cpb     = mem;
+	pp->cpb_dma = mem_dma;
+
+	VPRINTK("cpb = %p, cpb_dma = 0x%llx\n", pp->cpb, (unsigned long long)pp->cpb_dma);
+
+	writel(mem_dma, mmio + NV_ADMA_CPB_BASE_LOW);
+	writel(0,       mmio + NV_ADMA_CPB_BASE_HIGH);
+
+	mem     += NV_ADMA_CAN_QUEUE * NV_ADMA_CPB_SZ;
+	mem_dma += NV_ADMA_CAN_QUEUE * NV_ADMA_CPB_SZ;
+
+	/*
+	 * Second item: block of ADMA_SGTBL_LEN s/g entries
+	 */
+	pp->aprd = mem;
+	pp->aprd_dma = mem_dma;
+
+	VPRINTK("aprd = %p, aprd_dma = 0x%llx\n", pp->aprd, (unsigned long long)pp->aprd_dma);
+
+	ap->private_data = pp;
+
+	// clear any outstanding interrupt conditions
+	writew(0xffff, mmio + NV_ADMA_STAT);
+
+	// initialize port variables
+	//	pp->cpb_idx = 0;
+	pp->flags = NV_ADMA_PORT_REGISTER_MODE;
+
+	// make sure controller is in ATA register mode
+	nv_adma_register_mode(ap);
+
+	return 0;
+
+err_out_kfree:
+	kfree(pp);
+err_out:
+	ata_port_stop(ap);
+	return rc;
+}
+
+static void nv_adma_port_stop(struct ata_port *ap)
+{
+	struct device *dev = ap->host_set->dev;
+	struct nv_adma_port_priv *pp = ap->private_data;
+	void __iomem *mmio = nv_adma_ctl_block(ap);
+
+	VPRINTK("ENTER\n");
+
+	writew(0, mmio + NV_ADMA_CTL);
+
+	ap->private_data = NULL;
+	dma_free_coherent(dev, NV_ADMA_PORT_PRIV_DMA_SZ, pp->cpb, pp->cpb_dma);
+	kfree(pp);
+	ata_port_stop(ap);
+}
+
+
+static void nv_adma_setup_port(struct ata_probe_ent *probe_ent, unsigned int port)
+{
+	void __iomem *mmio = probe_ent->mmio_base;
+	struct ata_ioports *ioport = &probe_ent->port[port];
+
+	VPRINTK("ENTER\n");
+
+	mmio += NV_ADMA_PORT + port * NV_ADMA_PORT_SIZE;
+
+	ioport->cmd_addr	= (unsigned long) mmio;
+	ioport->data_addr	= (unsigned long) mmio + (ATA_REG_DATA * 4);
+	ioport->error_addr	=
+	ioport->feature_addr	= (unsigned long) mmio + (ATA_REG_ERR * 4);
+	ioport->nsect_addr	= (unsigned long) mmio + (ATA_REG_NSECT * 4);
+	ioport->lbal_addr	= (unsigned long) mmio + (ATA_REG_LBAL * 4);
+	ioport->lbam_addr	= (unsigned long) mmio + (ATA_REG_LBAM * 4);
+	ioport->lbah_addr	= (unsigned long) mmio + (ATA_REG_LBAH * 4);
+	ioport->device_addr	= (unsigned long) mmio + (ATA_REG_DEVICE * 4);
+	ioport->status_addr	=
+	ioport->command_addr	= (unsigned long) mmio + (ATA_REG_STATUS * 4);
+	ioport->altstatus_addr	=
+	ioport->ctl_addr	= (unsigned long) mmio + 0x20;
+}
+
+static int nv_adma_host_init(struct ata_probe_ent *probe_ent)
+{
+	struct pci_dev *pdev = to_pci_dev(probe_ent->dev);
+	unsigned int i;
+	u32 tmp32;
+
+	VPRINTK("ENTER\n");
+
+	probe_ent->n_ports = NV_PORTS;
+
+	nv_enable_adma_space(pdev);
+	
+	// enable ADMA on the ports
+	pci_read_config_dword(pdev, NV_MCP_SATA_CFG_20, &tmp32);
+	tmp32 |= NV_MCP_SATA_CFG_20_PORT0_EN |
+		 NV_MCP_SATA_CFG_20_PORT0_PWB_EN |
+		 NV_MCP_SATA_CFG_20_PORT1_EN |
+		 NV_MCP_SATA_CFG_20_PORT1_PWB_EN;
+
+	pci_write_config_dword(pdev, NV_MCP_SATA_CFG_20, tmp32);
+	
+	for (i = 0; i < probe_ent->n_ports; i++)
+		nv_adma_setup_port(probe_ent, i);
+
+	for (i = 0; i < probe_ent->n_ports; i++) {
+		void __iomem *mmio = __nv_adma_ctl_block(probe_ent->mmio_base, i);
+		u16 tmp;
+
+		/* enable interrupt, clear reset if not already clear */
+		tmp = readw(mmio + NV_ADMA_CTL);
+		writew(tmp | NV_ADMA_CTL_AIEN, mmio + NV_ADMA_CTL);
+	}
+
+	pci_set_master(pdev);
+
+	return 0;
+}
+
 static int nv_init_one (struct pci_dev *pdev, const struct pci_device_id *ent)
 {
 	static int printed_version = 0;
 	struct nv_host *host;
 	struct ata_port_info *ppi;
 	struct ata_probe_ent *probe_ent;
+	struct nv_host_desc *host_desc;
 	int pci_dev_busy = 0;
 	int rc;
 	u32 bar;
 
+	VPRINTK("ENTER\n");
+
         // Make sure this is a SATA controller by counting the number of bars
         // (NVIDIA SATA controllers will always have six bars).  Otherwise,
         // it's an IDE controller and we ignore it.
@@ -414,6 +1060,19 @@ static int nv_init_one (struct pci_dev *
 	rc = -ENOMEM;
 
 	ppi = &nv_port_info;
+	
+	host_desc = &nv_device_tbl[ent->driver_data];
+	if (host_desc->host_type == ADMA) {
+		// ADMA overrides
+		ppi->host_flags                |= ATA_FLAG_MMIO | ATA_FLAG_SATA_RESET;
+#ifdef NV_ADMA_NCQ
+		ppi->host_flags		       |= ATA_FLAG_NCQ;
+#endif
+		ppi->sht->can_queue		= NV_ADMA_CAN_QUEUE;
+		ppi->sht->sg_tablesize		= NV_ADMA_SGTBL_LEN;
+//		ppi->port_ops->irq_handler	= nv_adma_interrupt;
+	}
+	
 	probe_ent = ata_pci_init_native_mode(pdev, &ppi, ATA_PORT_PRIMARY | ATA_PORT_SECONDARY);
 	if (!probe_ent)
 		goto err_out_regions;
@@ -423,7 +1082,7 @@ static int nv_init_one (struct pci_dev *
 		goto err_out_free_ent;
 
 	memset(host, 0, sizeof(struct nv_host));
-	host->host_desc = &nv_device_tbl[ent->driver_data];
+	host->host_desc = host_desc;
 
 	probe_ent->private_data = host;
 
@@ -440,6 +1099,7 @@ static int nv_init_one (struct pci_dev *
 		}
 
 		base = (unsigned long)probe_ent->mmio_base;
+		VPRINTK("BAR5 base is at 0x%x\n", (u32)base);
 
 		probe_ent->port[0].scr_addr =
 			base + NV_PORT0_SCR_REG_OFFSET;
@@ -455,6 +1115,12 @@ static int nv_init_one (struct pci_dev *
 
 	pci_set_master(pdev);
 
+	if (ent->driver_data == ADMA) {
+		rc = nv_adma_host_init(probe_ent);
+		if (rc)
+			goto err_out_iounmap;
+	}
+
 	rc = ata_device_add(probe_ent);
 	if (rc != NV_PORTS)
 		goto err_out_iounmap;
@@ -483,6 +1149,239 @@ err_out:
 	return rc;
 }
 
+static void nv_eng_timeout(struct ata_port *ap)
+{
+	struct ata_host_set *host_set = ap->host_set;
+	struct nv_host *host = host_set->private_data;
+
+	if (host->host_desc->host_type == ADMA) {
+		nv_adma_eng_timeout(ap);
+	} else {
+		ata_eng_timeout(ap);
+	}
+}
+
+static void nv_adma_eng_timeout(struct ata_port *ap)
+{
+	struct ata_queued_cmd *qc = ata_qc_from_tag(ap, ap->active_tag);
+	struct nv_adma_port_priv *pp = ap->private_data;
+	u8 drv_stat;
+
+	VPRINTK("ENTER\n");
+	
+	if (pp->flags & NV_ADMA_PORT_REGISTER_MODE) {
+		ata_eng_timeout(ap);
+		goto out;
+	}
+
+
+	if (!qc) {
+		printk(KERN_ERR "ata%u: BUG: timeout without command\n",
+		       ap->id);
+		goto out;
+	}
+	
+
+//	spin_lock_irqsave(&host_set->lock, flags);
+
+	qc->scsidone = scsi_finish_command;
+
+	drv_stat = ata_chk_status(ap);
+
+	printk(KERN_ERR "ata%u: command 0x%x timeout, stat 0x%x\n",
+	       ap->id, qc->tf.command, drv_stat);
+
+	// reset channel
+	nv_adma_reset_channel(ap);
+
+	/* complete taskfile transaction */
+	qc->err_mask |= ac_err_mask(drv_stat);
+	ata_qc_complete(qc);
+
+//	spin_unlock_irqrestore(&host_set->lock, flags);
+
+out:
+	DPRINTK("EXIT\n");
+}
+
+static void nv_qc_prep(struct ata_queued_cmd *qc)
+{
+	struct ata_host_set *host_set = qc->ap->host_set;
+	struct nv_host *host = host_set->private_data;
+
+	if (host->host_desc->host_type == ADMA) {
+		nv_adma_qc_prep(qc);
+	} else {
+		ata_qc_prep(qc);
+	}
+}
+
+static void nv_adma_qc_prep(struct ata_queued_cmd *qc)
+{
+	struct nv_adma_port_priv *pp = qc->ap->private_data;
+	struct nv_adma_cpb *cpb = &pp->cpb[qc->tag];
+
+	VPRINTK("ENTER\n");
+
+	VPRINTK("qc->flags = 0x%x\n", (u32)qc->flags);
+
+	if (!(qc->flags & ATA_QCFLAG_DMAMAP)) {
+		ata_qc_prep(qc);
+		return;
+	}
+
+	memset(cpb, 0, sizeof(struct nv_adma_cpb));
+	       
+	cpb->ctl_flags		= NV_CPB_CTL_CPB_VALID |
+				  NV_CPB_CTL_APRD_VALID |
+				  NV_CPB_CTL_IEN;
+	cpb->len		= 3;
+	cpb->tag		= qc->tag;
+	cpb->next_cpb_idx	= 0;
+
+#ifdef NV_ADMA_NCQ
+	// turn on NCQ flags for NCQ commands
+	if (qc->flags & ATA_QCFLAG_NCQ)
+		cpb->ctl_flags |= NV_CPB_CTL_QUEUE | NV_CPB_CTL_FPDMA;
+#endif
+
+	nv_adma_tf_to_cpb(&qc->tf, cpb->tf);
+
+	nv_adma_fill_sg(qc, cpb);
+}
+
+static void nv_adma_fill_sg(struct ata_queued_cmd *qc, struct nv_adma_cpb *cpb)
+{
+	struct nv_adma_port_priv *pp = qc->ap->private_data;
+	unsigned int idx;
+	struct nv_adma_prd *aprd;
+	struct scatterlist *sg;
+
+	VPRINTK("ENTER\n");
+
+	idx = 0;
+
+	ata_for_each_sg(sg, qc) {
+		aprd = (idx < 5) ? &cpb->aprd[idx] : &pp->aprd[idx-5];
+		nv_adma_fill_aprd(qc, sg, idx, aprd);
+		idx++;
+	}
+	if (idx > 5) {
+		cpb->next_aprd = cpu_to_le64((u64)(pp->aprd_dma + NV_ADMA_APRD_SZ * qc->tag));
+	}
+}
+
+static void nv_adma_fill_aprd(struct ata_queued_cmd *qc,
+			      struct scatterlist *sg,
+			      int idx,
+			      struct nv_adma_prd *aprd)
+{
+	u32 sg_len, addr, flags;
+
+	memset(aprd, 0, sizeof(struct nv_adma_prd));
+
+	addr   = sg_dma_address(sg);
+	sg_len = sg_dma_len(sg);
+
+	flags = 0;
+	if (qc->tf.flags & ATA_TFLAG_WRITE)
+		flags |= NV_APRD_WRITE;
+	if (idx == qc->n_elem - 1) {
+		flags |= NV_APRD_END;
+	} else if (idx != 4) {
+		flags |= NV_APRD_CONT;
+	}
+
+	aprd->addr  = cpu_to_le32(addr);
+	aprd->len   = cpu_to_le32(sg_len); /* len in bytes */
+	aprd->flags = cpu_to_le32(flags);
+}
+
+static void nv_adma_register_mode(struct ata_port *ap)
+{
+	void __iomem *mmio = nv_adma_ctl_block(ap);
+	struct nv_adma_port_priv *pp = ap->private_data;
+	u16 tmp;
+
+	tmp = readw(mmio + NV_ADMA_CTL);
+	writew(tmp & ~NV_ADMA_CTL_GO, mmio + NV_ADMA_CTL);
+
+	pp->flags |= NV_ADMA_PORT_REGISTER_MODE;
+}
+
+static void nv_adma_mode(struct ata_port *ap)
+{
+	void __iomem *mmio = nv_adma_ctl_block(ap);
+	struct nv_adma_port_priv *pp = ap->private_data;
+	u16 tmp;
+
+	if (!(pp->flags & NV_ADMA_PORT_REGISTER_MODE)) {
+		return;
+	}
+
+#if 0
+	nv_adma_reset_channel(ap);
+#endif
+
+	tmp = readw(mmio + NV_ADMA_CTL);
+	writew(tmp | NV_ADMA_CTL_GO, mmio + NV_ADMA_CTL);
+
+	pp->flags &= ~NV_ADMA_PORT_REGISTER_MODE;
+}
+
+static int nv_qc_issue(struct ata_queued_cmd *qc)
+{
+	struct ata_host_set *host_set = qc->ap->host_set;
+	struct nv_host *host = host_set->private_data;
+
+	if (host->host_desc->host_type == ADMA) {
+		return nv_adma_qc_issue(qc);
+	} else {
+		return ata_qc_issue_prot(qc);
+	}
+}
+
+static int nv_adma_qc_issue(struct ata_queued_cmd *qc)
+{
+#if 0
+	struct nv_adma_port_priv *pp = qc->ap->private_data;
+#endif
+	void __iomem *mmio = nv_adma_ctl_block(qc->ap);
+
+	VPRINTK("ENTER\n");
+
+	if (!(qc->flags & ATA_QCFLAG_DMAMAP)) {
+		VPRINTK("no dmamap, using ATA register mode: 0x%x\n", (u32)qc->flags);
+		// use ATA register mode
+		nv_adma_register_mode(qc->ap);
+		return ata_qc_issue_prot(qc);
+	} else {
+		nv_adma_mode(qc->ap);
+	}
+
+#if 0
+	nv_adma_dump_port(qc->ap);
+	nv_adma_dump_cpb(&pp->cpb[qc->tag]);
+	if (qc->n_elem > 5) {
+		int i;
+		for (i = 0; i < qc->n_elem - 5; i++) {
+			nv_adma_dump_aprd(&pp->aprd[i]);
+		}
+	}
+#endif
+
+	//
+	// write append register, command tag in lower 8 bits
+	// and (number of cpbs to append -1) in top 8 bits
+	//
+	mb();
+	writew(qc->tag, mmio + NV_ADMA_APPEND);
+	
+	VPRINTK("EXIT\n");
+
+	return 0;
+}
+
 static void nv_enable_hotplug(struct ata_probe_ent *probe_ent)
 {
 	u8 intr_mask;
@@ -507,7 +1406,7 @@ static void nv_disable_hotplug(struct at
 	outb(intr_mask, host_set->ports[0]->ioaddr.scr_addr + NV_INT_ENABLE);
 }
 
-static int nv_check_hotplug(struct ata_host_set *host_set)
+static void nv_check_hotplug(struct ata_host_set *host_set)
 {
 	u8 intr_status;
 
@@ -532,22 +1431,15 @@ static int nv_check_hotplug(struct ata_h
 		if (intr_status & NV_INT_STATUS_SDEV_REMOVED)
 			printk(KERN_WARNING "nv_sata: "
 				"Secondary device removed\n");
-
-		return 1;
 	}
-
-	return 0;
 }
 
 static void nv_enable_hotplug_ck804(struct ata_probe_ent *probe_ent)
 {
 	struct pci_dev *pdev = to_pci_dev(probe_ent->dev);
 	u8 intr_mask;
-	u8 regval;
 
-	pci_read_config_byte(pdev, NV_MCP_SATA_CFG_20, &regval);
-	regval |= NV_MCP_SATA_CFG_20_SATA_SPACE_EN;
-	pci_write_config_byte(pdev, NV_MCP_SATA_CFG_20, regval);
+	nv_enable_adma_space(pdev);
 
 	writeb(NV_INT_STATUS_HOTPLUG, probe_ent->mmio_base + NV_INT_STATUS_CK804);
 
@@ -561,7 +1453,6 @@ static void nv_disable_hotplug_ck804(str
 {
 	struct pci_dev *pdev = to_pci_dev(host_set->dev);
 	u8 intr_mask;
-	u8 regval;
 
 	intr_mask = readb(host_set->mmio_base + NV_INT_ENABLE_CK804);
 
@@ -569,12 +1460,10 @@ static void nv_disable_hotplug_ck804(str
 
 	writeb(intr_mask, host_set->mmio_base + NV_INT_ENABLE_CK804);
 
-	pci_read_config_byte(pdev, NV_MCP_SATA_CFG_20, &regval);
-	regval &= ~NV_MCP_SATA_CFG_20_SATA_SPACE_EN;
-	pci_write_config_byte(pdev, NV_MCP_SATA_CFG_20, regval);
+	nv_disable_adma_space(pdev);
 }
 
-static int nv_check_hotplug_ck804(struct ata_host_set *host_set)
+static void nv_check_hotplug_ck804(struct ata_host_set *host_set)
 {
 	u8 intr_status;
 
@@ -599,11 +1488,61 @@ static int nv_check_hotplug_ck804(struct
 		if (intr_status & NV_INT_STATUS_SDEV_REMOVED)
 			printk(KERN_WARNING "nv_sata: "
 				"Secondary device removed\n");
+	}
+}
+
+static void nv_enable_hotplug_adma(struct ata_probe_ent *probe_ent)
+{
+	struct pci_dev *pdev = to_pci_dev(probe_ent->dev);
+	unsigned int i;
+	u16 tmp;
 
-		return 1;
+	nv_enable_adma_space(pdev);
+
+	for (i = 0; i < probe_ent->n_ports; i++) {
+		void __iomem *mmio = __nv_adma_ctl_block(probe_ent->mmio_base, i);
+		writew(NV_ADMA_STAT_HOTPLUG | NV_ADMA_STAT_HOTUNPLUG,
+		       mmio + NV_ADMA_STAT);
+
+		tmp = readw(mmio + NV_ADMA_CTL);
+		writew(tmp | NV_ADMA_CTL_HOTPLUG_IEN, mmio + NV_ADMA_CTL);
+
 	}
+}
 
-	return 0;
+static void nv_disable_hotplug_adma(struct ata_host_set *host_set)
+{
+	unsigned int i;
+	u16 tmp;
+
+	for (i = 0; i < host_set->n_ports; i++) {
+		void __iomem *mmio = __nv_adma_ctl_block(host_set->mmio_base, i);
+
+		tmp = readw(mmio + NV_ADMA_CTL);
+		writew(tmp & ~NV_ADMA_CTL_HOTPLUG_IEN, mmio + NV_ADMA_CTL);
+
+	}
+}
+
+static void nv_check_hotplug_adma(struct ata_host_set *host_set)
+{
+	unsigned int i;
+	u16 adma_status;
+
+	for (i = 0; i < host_set->n_ports; i++) {
+		void __iomem *mmio = __nv_adma_ctl_block(host_set->mmio_base, i);
+		adma_status = readw(mmio + NV_ADMA_STAT);
+		if (adma_status & NV_ADMA_STAT_HOTPLUG) {
+			printk(KERN_WARNING "nv_sata: "
+			       "port %d device added\n", i);
+			writew(NV_ADMA_STAT_HOTPLUG, mmio + NV_ADMA_STAT);
+		}
+		if (adma_status & NV_ADMA_STAT_HOTUNPLUG) {
+			printk(KERN_WARNING "nv_sata: "
+			       "port %d device removed\n", i);
+			writew(NV_ADMA_STAT_HOTUNPLUG, mmio + NV_ADMA_STAT);
+		}
+	}
 }
 
 static int __init nv_init(void)
@@ -618,3 +1557,68 @@ static void __exit nv_exit(void)
 
 module_init(nv_init);
 module_exit(nv_exit);
+
+#ifdef DEBUG
+static void nv_adma_dump_aprd(struct nv_adma_prd *aprd)
+{
+	printk("%016llx %08x %02x %s %s %s\n",
+	       aprd->addr,
+	       aprd->len,
+	       aprd->flags,
+	       (aprd->flags & NV_APRD_WRITE) ? "WRITE" : "     ",
+	       (aprd->flags & NV_APRD_END)   ? "END"   : "   ",
+	       (aprd->flags & NV_APRD_CONT)  ? "CONT"  : "    ");
+}
+static void nv_adma_dump_iomem(void __iomem *m, int len)
+{
+	int i, j;
+
+	for (i = 0; i < len/16; i++) {
+		printk(KERN_WARNING "%02x: ", 16*i);
+		for (j = 0; j < 16; j++) {
+			printk("%02x%s", (u32)readb(m + 16*i + j),
+			       (j == 7) ? "-" : " ");
+		}
+		printk("\n");
+	}
+}
+
+static void nv_adma_dump_cpb_tf(u16 tf)
+{
+	printk("0x%04x %s %s %s 0x%02x 0x%02x\n",
+	       tf,
+	       (tf & CMDEND) ? "END" : "   ",
+	       (tf & WNB) ? "WNB" : "   ",
+	       (tf & IGN) ? "IGN" : "   ",
+	       ((tf >> 8) & 0x1f),
+	       (tf & 0xff));
+}
+	
+static void nv_adma_dump_port(struct ata_port *ap)
+{
+	void __iomem *mmio = nv_adma_ctl_block(ap);
+	nv_adma_dump_iomem(mmio, NV_ADMA_PORT_SIZE);
+}
+			
+static void nv_adma_dump_cpb(struct nv_adma_cpb *cpb)
+{
+	int i;
+
+	printk("resp_flags:   0x%02x\n", cpb->resp_flags);
+	printk("ctl_flags:    0x%02x\n", cpb->ctl_flags);
+	printk("len:          0x%02x\n", cpb->len);
+	printk("tag:          0x%02x\n", cpb->tag);
+	printk("next_cpb_idx: 0x%02x\n", cpb->next_cpb_idx);
+	printk("tf:\n");
+	for (i=0; i<12; i++) {
+		nv_adma_dump_cpb_tf(cpb->tf[i]);
+	}
+	printk("aprd:\n");
+	for (i=0; i<5; i++) {
+		nv_adma_dump_aprd(&cpb->aprd[i]);
+	}
+	printk("next_aprd:    0x%016llx\n", cpb->next_aprd);
+}
+
+#endif
+
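
For anyone reading the patch without the hardware at hand, here is a small
userspace sketch of the 16-bit taskfile entries that nv_adma_tf_to_cpb()
builds: register index in bits 8-12, data byte in bits 0-7, flags above
that. The bit positions for CMDEND, WNB and IGN below are assumptions (the
real definitions are in an earlier part of the patch, not shown here), and
the CPB_* / REG_* / cpb_word*() names are made up for illustration. The
sketch works in host byte order; the driver additionally applies
cpu_to_le16().

```c
#include <stdint.h>

/* Assumed flag bits; the patch defines CMDEND, WNB and IGN elsewhere. */
#define CPB_CMDEND (1u << 15)	/* set on the final (command) entry */
#define CPB_WNB    (1u << 14)	/* set on the first (device) entry */
#define CPB_IGN    (1u << 13)	/* placeholder for unused HOB slots */

/* ATA shadow register indices, matching libata's ATA_REG_* values. */
enum {
	REG_ERR = 1, REG_NSECT, REG_LBAL, REG_LBAM,
	REG_LBAH, REG_DEVICE, REG_CMD
};

/* Pack one taskfile entry: register index, data byte, flag bits. */
static uint16_t cpb_word(unsigned int reg, uint8_t val, uint16_t flags)
{
	return (uint16_t)((reg << 8) | val | flags);
}

/* Unpack the register index, using the same mask as nv_adma_dump_cpb_tf(). */
static unsigned int cpb_word_reg(uint16_t w)
{
	return (w >> 8) & 0x1f;
}

/* Unpack the data byte. */
static uint8_t cpb_word_val(uint16_t w)
{
	return (uint8_t)(w & 0xff);
}
```

Under these assumptions, cpb_word(REG_CMD, 0x25, CPB_CMDEND) gives 0x8725,
and cpb_word_reg()/cpb_word_val() recover 7 and 0x25 from it.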

[-- Attachment #4: dmesg-sata_nv-adma.txt --]
[-- Type: text/plain, Size: 18119 bytes --]

Bootdata ok (command line is ro root=/dev/md2 report_lost_ticks maxcpus=1 init=/bin/sh)
Linux version 2.6.16-rc6-git4-mmio (rugolsky@ti94) (gcc version 4.0.2 20051125 (Red Hat 4.0.2-8)) #4 SMP Wed Mar 15 20:48:51 EST 2006
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009b800 (usable)
 BIOS-e820: 000000000009b800 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000d8000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 000000007ff10000 (usable)
 BIOS-e820: 000000007ff10000 - 000000007ff17000 (ACPI data)
 BIOS-e820: 000000007ff17000 - 000000007ff80000 (ACPI NVS)
 BIOS-e820: 000000007ff80000 - 0000000080000000 (reserved)
 BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved)
 BIOS-e820: 00000000fec00000 - 00000000fec00400 (reserved)
 BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
 BIOS-e820: 00000000fff80000 - 0000000100000000 (reserved)
ACPI: RSDP (v000 PTLTD                                 ) @ 0x00000000000f7920
ACPI: RSDT (v001 PTLTD    RSDT   0x06040000  LTP 0x00000000) @ 0x000000007ff127f1
ACPI: FADT (v001 NVIDIA CK8S     0x06040000 PTL_ 0x000f4240) @ 0x000000007ff16dd8
ACPI: SPCR (v001 PTLTD  $UCRTBL$ 0x06040000 PTL  0x00000001) @ 0x000000007ff16e4c
ACPI: MADT (v001 PTLTD  	 APIC   0x06040000  LTP 0x00000000) @ 0x000000007ff16e9c
ACPI: MCFG (v001 PTLTD    MCFG   0x06040000  LTP 0x00000000) @ 0x000000007ff16f1c
ACPI: MADT (v001 PTLTD  	 APIC   0x06040000  LTP 0x00000000) @ 0x000000007ff16f58
ACPI: BOOT (v001 PTLTD  $SBFTBL$ 0x06040000  LTP 0x00000001) @ 0x000000007ff16fd8
ACPI: DSDT (v001 NVIDIA      CK8 0x06040000 MSFT 0x0100000e) @ 0x0000000000000000
On node 0 totalpages: 514672
  DMA zone: 1828 pages, LIFO batch:0
  DMA32 zone: 512844 pages, LIFO batch:31
  Normal zone: 0 pages, LIFO batch:0
  HighMem zone: 0 pages, LIFO batch:0
Nvidia board detected. Ignoring ACPI timer override.
ACPI: PM-Timer IO Port: 0x8008
ACPI: Local APIC address 0xfee00000
ACPI: 2 duplicate APIC table ignored.
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Processor #0 15:5 APIC version 16
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
Processor #1 15:5 APIC version 16
ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 2, version 17, address 0xfec00000, GSI 0-23
ACPI: IOAPIC (id[0x03] address[0xc0400000] gsi_base[24])
IOAPIC[1]: apic_id 3, version 17, address 0xc0400000, GSI 24-27
ACPI: IOAPIC (id[0x04] address[0xc0401000] gsi_base[28])
IOAPIC[2]: apic_id 4, version 17, address 0xc0401000, GSI 28-31
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 high edge)
ACPI: BIOS IRQ0 pin2 override ignored.
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 low level)
ACPI: IRQ9 used by override.
Setting APIC routing to flat
Using ACPI (MADT) for SMP configuration information
Allocating PCI resources starting at 88000000 (gap: 80000000:60000000)
Checking aperture...
CPU 0: aperture @ bc000000 size 64 MB
CPU 1: aperture @ bc000000 size 64 MB
Built 1 zonelists
Kernel command line: ro root=/dev/md2 report_lost_ticks maxcpus=1 init=/bin/sh
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 131072 bytes)
Disabling vsyscall due to use of PM timer
time.c: Using 3.579545 MHz WALL PM GTOD PM timer.
time.c: Detected 2009.273 MHz processor.
Console: colour VGA+ 80x25
time.c: Lost 495 timer tick(s)! rip start_kernel+0x102/0x1cc)
Dentry cache hash table entries: 262144 (order: 9, 2097152 bytes)
Inode-cache hash table entries: 131072 (order: 8, 1048576 bytes)
Memory: 2053752k/2096192k available (1985k kernel code, 41716k reserved, 1053k data, 180k init)
Calibrating delay using timer specific routine.. 4023.40 BogoMIPS (lpj=2011702)
Security Framework v1.0.0 initialized
SELinux:  Initializing.
SELinux:  Starting in permissive mode
selinux_register_security:  Registering secondary module capability
Capability LSM initialized as secondary
Mount-cache hash table entries: 256
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 1024K (64 bytes/line)
time.c: Lost 2 timer tick(s)! rip acpi_os_write_port+0x22/0x42)
Using local APIC timer interrupts.
result 12557972
Detected 12.557 MHz APIC timer.
time.c: Lost 49 timer tick(s)! rip setup_boot_APIC_clock+0x121/0x127)
Brought up 1 CPUs
testing NMI watchdog ... OK.
migration_cost=0
checking if image is initramfs... it is
Freeing initrd memory: 1289k freed
DMI present.
NET: Registered protocol family 16
ACPI: bus type pci registered
PCI: Using configuration type 1
PCI: Using MMCONFIG at e0000000
ACPI: Subsystem revision 20060127
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (0000:00)
PCI: Probing PCI hardware (bus 00)
ACPI: Assume root bridge [\_SB_.PCI0] bus is 0
PCI: Transparent bridge - 0000:00:09.0
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P2P0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.XVR0._PRT]
ACPI: PCI Interrupt Link [LNK1] (IRQs 16 17 18 19) *0, disabled.
ACPI: PCI Interrupt Link [LNK2] (IRQs 16 17 18 19) *0, disabled.
ACPI: PCI Interrupt Link [LNK3] (IRQs 16 17 18 19) *0
ACPI: PCI Interrupt Link [LNK4] (IRQs 16 17 18 19) *0
ACPI: PCI Interrupt Link [LNK5] (IRQs 16 17 18 19) *0, disabled.
ACPI: PCI Interrupt Link [LSMB] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Interrupt Link [LUS0] (IRQs 20 21 22 23) *0
ACPI: PCI Interrupt Link [LUS2] (IRQs 20 21 22 23) *0
ACPI: PCI Interrupt Link [LMAC] (IRQs 20 21 22 23) *0
ACPI: PCI Interrupt Link [LACI] (IRQs 20 21 22 23) *0
ACPI: PCI Interrupt Link [LMCI] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Interrupt Link [LPID] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Interrupt Link [LTID] (IRQs 20 21 22 23) *0
ACPI: PCI Interrupt Link [LSI1] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Interrupt Link [APCP] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Root Bridge [PCI2] (0000:08)
PCI: Probing PCI hardware (bus 08)
ACPI: Assume root bridge [\_SB_.PCI0] bus is 0
ACPI: PCI Root Bridge [PCI1] (0000:80)
PCI: Probing PCI hardware (bus 80)
ACPI: Assume root bridge [\_SB_.PCI0] bus is 0
Boot video device is 0000:81:00.0
ACPI: PCI Interrupt Routing Table [\_SB_.PCI1._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI1.XVR0._PRT]
Linux Plug and Play Support v0.97 (c) Adam Belay
pnp: PnP ACPI init
pnp: PnP ACPI: found 15 devices
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq".  If it helps, post a report
PCI-DMA: Disabling IOMMU.
pnp: 00:02: ioport range 0x8000-0x807f could not be reserved
pnp: 00:02: ioport range 0x8080-0x80ff has been reserved
pnp: 00:02: ioport range 0x8400-0x847f has been reserved
pnp: 00:02: ioport range 0x8480-0x84ff has been reserved
pnp: 00:02: ioport range 0x8800-0x887f has been reserved
pnp: 00:02: ioport range 0x8880-0x88ff has been reserved
pnp: 00:02: ioport range 0xa000-0xa03f has been reserved
pnp: 00:02: ioport range 0xa040-0xa07f has been reserved
PCI: Bridge: 0000:00:09.0
  IO window: disabled.
  MEM window: c0100000-c01fffff
  PREFETCH window: disabled.
PCI: Bridge: 0000:00:0e.0
  IO window: disabled.
  MEM window: disabled.
  PREFETCH window: disabled.
PCI: Setting latency timer of device 0000:00:09.0 to 64
PCI: Setting latency timer of device 0000:00:0e.0 to 64
PCI: Bridge: 0000:80:0e.0
  IO window: 3000-3fff
  MEM window: c0900000-c09fffff
  PREFETCH window: d0000000-dfffffff
PCI: Setting latency timer of device 0000:80:0e.0 to 64
Simple Boot Flag at 0x36 set to 0x1
IA32 emulation $Id: sys_ia32.c,v 1.32 2002/03/24 13:02:28 ak Exp $
audit: initializing netlink socket (disabled)
audit(1142476264.041:1): initialized
Total HugeTLB memory allocated, 0
VFS: Disk quotas dquot_6.5.1
Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
Initializing Cryptographic API
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered (default)
time.c: Lost 202 timer tick(s)! rip __do_softirq+0x45/0xc8)
PCI: Setting latency timer of device 0000:00:0e.0 to 64
pcie_portdrv_probe->Dev[005d:10de] has invalid IRQ. Check vendor BIOS
assign_interrupt_mode Found MSI capability
Allocate Port Service[0000:00:0e.0:pcie00]
Allocate Port Service[0000:00:0e.0:pcie03]
PCI: Setting latency timer of device 0000:80:0e.0 to 64
pcie_portdrv_probe->Dev[005d:10de] has invalid IRQ. Check vendor BIOS
assign_interrupt_mode Found MSI capability
Allocate Port Service[0000:80:0e.0:pcie00]
Allocate Port Service[0000:80:0e.0:pcie03]
Real Time Clock Driver v1.12ac
Linux agpgart interface v0.101 (c) Dave Jones
PNP: PS/2 Controller [PNP0303:PS2K,PNP0f13:PS2M] at 0x60,0x64 irq 1,12
serio: i8042 AUX port at 0x60,0x64 irq 12
serio: i8042 KBD port at 0x60,0x64 irq 1
Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing enabled
serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
00:08: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
isa bounce pool size: 16 pages
RAMDISK driver initialized: 16 RAM disks of 16384K size 1024 blocksize
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
NFORCE-CK804: IDE controller at PCI slot 0000:00:06.0
NFORCE-CK804: chipset revision 242
NFORCE-CK804: not 100% native mode: will probe irqs later
NFORCE-CK804: 0000:00:06.0 (rev f2) UDMA133 controller
    ide0: BM-DMA at 0x1c00-0x1c07, BIOS settings: hda:DMA, hdb:pio
    ide1: BM-DMA at 0x1c08-0x1c0f, BIOS settings: hdc:pio, hdd:pio
Probing IDE interface ide0...
hda: _NEC DVD_RW ND-3550A, ATAPI CD/DVD-ROM drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Probing IDE interface ide1...
Probing IDE interface ide1...
hda: ATAPI 48X DVD-ROM DVD-R CD-R/RW drive, 2048kB Cache, UDMA(33)
Uniform CD-ROM driver Revision: 3.20
ide-floppy driver 0.99.newide
mice: PS/2 mouse device common for all mice
md: md driver 0.90.3 MAX_MD_DEVS=256, MD_SB_DISKS=27
md: bitmap version 4.39
NET: Registered protocol family 2
IP route cache hash table entries: 65536 (order: 7, 524288 bytes)
TCP established hash table entries: 262144 (order: 10, 4194304 bytes)
TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
TCP: Hash tables configured (established 262144 bind 65536)
TCP reno registered
TCP bic registered
Initializing IPsec netlink socket
NET: Registered protocol family 1
NET: Registered protocol family 17
Freeing unused kernel memory: 180k freed
SCSI subsystem initialized
libata version 1.20 loaded.
sata_nv 0000:00:07.0: version 0.8
ACPI: PCI Interrupt Link [LTID] enabled at IRQ 23
GSI 16 sharing vector 0xB1 and IRQ 16
ACPI: PCI Interrupt 0000:00:07.0[A] -> Link [LTID] -> GSI 23 (level, high) -> IRQ 16
PCI: Setting latency timer of device 0000:00:07.0 to 64
PCI: Setting latency timer of device 0000:00:07.0 to 64
ata1: SATA max UDMA/133 cmd 0xFFFFC20000004480 ctl 0xFFFFC200000044A0 bmdma 0x1C10 irq 16
ata2: SATA max UDMA/133 cmd 0xFFFFC20000004580 ctl 0xFFFFC200000045A0 bmdma 0x1C18 irq 16
input: AT Translated Set 2 keyboard as /class/input/input0
ata1: SATA link up 3.0 Gbps (SStatus 123)
ata1: dev 0 cfg 49:2f00 82:346b 83:7fe9 84:4773 85:3469 86:3d01 87:4763 88:407f
ata1: dev 0 ATA-7, max UDMA/133, 160836480 sectors: LBA48
ata1: dev 0 configured for UDMA/133
scsi0 : sata_nv
ata2: SATA link down (SStatus 0)
scsi1 : sata_nv
  Vendor: ATA       Model: HDS728080PLA380   Rev: PF2O
  Type:   Direct-Access                      ANSI SCSI revision: 05
SCSI device sda: 160836480 512-byte hdwr sectors (82348 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
SCSI device sda: 160836480 512-byte hdwr sectors (82348 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
 sda: sda1
sd 0:0:0:0: Attached scsi disk sda
ACPI: PCI Interrupt Link [LSI1] enabled at IRQ 22
GSI 17 sharing vector 0xB9 and IRQ 17
ACPI: PCI Interrupt 0000:00:08.0[A] -> Link [LSI1] -> GSI 22 (level, high) -> IRQ 17
PCI: Setting latency timer of device 0000:00:08.0 to 64
PCI: Setting latency timer of device 0000:00:08.0 to 64
ata3: SATA max UDMA/133 cmd 0xFFFFC20000006480 ctl 0xFFFFC200000064A0 bmdma 0x1C20 irq 17
ata4: SATA max UDMA/133 cmd 0xFFFFC20000006580 ctl 0xFFFFC200000065A0 bmdma 0x1C28 irq 17
ata3: SATA link up 1.5 Gbps (SStatus 113)
ata3: dev 0 cfg 49:2f00 82:74eb 83:7f63 84:4003 85:74e9 86:3d43 87:4003 88:007f
ata3: dev 0 ATA-6, max UDMA/133, 145226112 sectors: LBA48
ata3: dev 0 configured for UDMA/133
scsi2 : sata_nv
ata4: SATA link up 1.5 Gbps (SStatus 113)
ata4: dev 0 cfg 49:2f00 82:74eb 83:7f63 84:4003 85:74e9 86:3d43 87:4003 88:007f
ata4: dev 0 ATA-6, max UDMA/133, 145226112 sectors: LBA48
ata4: dev 0 configured for UDMA/133
scsi3 : sata_nv
  Vendor: ATA       Model: WDC WD740GD-00FL  Rev: 33.0
  Type:   Direct-Access                      ANSI SCSI revision: 05
SCSI device sdb: 145226112 512-byte hdwr sectors (74356 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 3a 00 00
SCSI device sdb: drive cache: write back
SCSI device sdb: 145226112 512-byte hdwr sectors (74356 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 3a 00 00
SCSI device sdb: drive cache: write back
 sdb: sdb1 sdb2 sdb3 sdb4 < sdb5 >
sd 2:0:0:0: Attached scsi disk sdb
  Vendor: ATA       Model: WDC WD740GD-00FL  Rev: 33.0
  Type:   Direct-Access                      ANSI SCSI revision: 05
SCSI device sdc: 145226112 512-byte hdwr sectors (74356 MB)
sdc: Write Protect is off
sdc: Mode Sense: 00 3a 00 00
SCSI device sdc: drive cache: write back
SCSI device sdc: 145226112 512-byte hdwr sectors (74356 MB)
sdc: Write Protect is off
sdc: Mode Sense: 00 3a 00 00
SCSI device sdc: drive cache: write back
 sdc: sdc1 sdc2 sdc3 sdc4 < sdc5 >
sd 3:0:0:0: Attached scsi disk sdc
device-mapper: 4.5.0-ioctl (2005-10-04) initialised: dm-devel@redhat.com
md: raid1 personality registered for level 1
md: Autodetecting RAID arrays.
md: autorun ...
md: considering sdc5 ...
md:  adding sdc5 ...
md: sdc3 has different UUID to sdc5
md: sdc2 has different UUID to sdc5
md: sdc1 has different UUID to sdc5
md:  adding sdb5 ...
md: sdb3 has different UUID to sdc5
md: sdb2 has different UUID to sdc5
md: sdb1 has different UUID to sdc5
md: created md5
md: bind<sdb5>
md: bind<sdc5>
md: running: <sdc5><sdb5>
md: md5: raid array is not clean -- starting background reconstruction
raid1: raid set md5 active with 2 out of 2 mirrors
md: considering sdc3 ...
md: syncing RAID array md5
md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reconstruction.
md: using 128k window, over a total of 70838464 blocks.
md:  adding sdc3 ...
md: sdc2 has different UUID to sdc3
md: sdc1 has different UUID to sdc3
md:  adding sdb3 ...
md: sdb2 has different UUID to sdc3
md: sdb1 has different UUID to sdc3
md: created md3
md: bind<sdb3>
md: bind<sdc3>
md: running: <sdc3><sdb3>
raid1: raid set md3 active with 2 out of 2 mirrors
md: considering sdc2 ...
md:  adding sdc2 ...
md: sdc1 has different UUID to sdc2
md:  adding sdb2 ...
md: sdb1 has different UUID to sdc2
md: created md2
md: bind<sdb2>
md: bind<sdc2>
md: running: <sdc2><sdb2>
md: md2: raid array is not clean -- starting background reconstruction
raid1: raid set md2 active with 2 out of 2 mirrors
md: considering sdc1 ...
md: delaying resync of md2 until md5 has finished resync (they share one or more physical units)
md:  adding sdc1 ...
md:  adding sdb1 ...
md: created md1
md: bind<sdb1>
md: bind<sdc1>
md: running: <sdc1><sdb1>
raid1: raid set md1 active with 2 out of 2 mirrors
md: ... autorun DONE.
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
EXT3-fs: INFO: recovery required on readonly filesystem.
EXT3-fs: write access will be enabled during recovery.
kjournald starting.  Commit interval 5 seconds
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.
ata4: command 0x35 timeout, stat 0x50
ata4: command 0x35 timeout, stat 0x50
ata4: command 0x35 timeout, stat 0x50
ata4: command 0x35 timeout, stat 0x50
ata4: command 0x35 timeout, stat 0x50
kjournald starting.  Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode.
ata4: command 0x35 timeout, stat 0x50
ata4: command 0x35 timeout, stat 0x50
(        events/0-5    |#0): new 5 us maximum-latency wakeup.
(        events/0-5    |#0): new 7 us maximum-latency wakeup.
(        events/0-5    |#0): new 8 us maximum-latency wakeup.
(        events/0-5    |#0): new 15 us maximum-latency wakeup.
(        events/0-5    |#0): new 16 us maximum-latency wakeup.
ata4: command 0x35 timeout, stat 0x50
(      md5_resync-677  |#0): new 34 us maximum-latency wakeup.
ata4: command 0x35 timeout, stat 0x50
ata4: command 0x35 timeout, stat 0x50
ata4: command 0x35 timeout, stat 0x50
ata4: command 0x35 timeout, stat 0x50
ata4: command 0x35 timeout, stat 0x50
ata4: command 0x35 timeout, stat 0x50
ata4: command 0x35 timeout, stat 0x50
ata4: command 0x35 timeout, stat 0x50
ata4: command 0x35 timeout, stat 0x50
ata4: command 0x35 timeout, stat 0x50
ata4: command 0x35 timeout, stat 0x50
ata4: command 0x35 timeout, stat 0x50
(       kblockd/0-14   |#0): new 54 us maximum-latency wakeup.
ata4: command 0x35 timeout, stat 0x50
EXT3 FS on sda1, internal journal
ata4: command 0x35 timeout, stat 0x50
ata4: command 0x35 timeout, stat 0x50
ata4: command 0x35 timeout, stat 0x50
kjournald starting.  Commit interval 5 seconds
EXT3 FS on sda1, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
ata4: command 0x35 timeout, stat 0x50
ata4: command 0x35 timeout, stat 0x50
ata4: command 0x35 timeout, stat 0x50
ata4: command 0x35 timeout, stat 0x50
ata4: command 0x35 timeout, stat 0x50
ata4: command 0x35 timeout, stat 0x50
ata4: command 0x35 timeout, stat 0x50
ata4: command 0x35 timeout, stat 0x50
ata4: command 0x35 timeout, stat 0x50
ata4: command 0x35 timeout, stat 0x50
ata4: command 0x35 timeout, stat 0x50
ata4: command 0x35 timeout, stat 0x50

* Re: libata/sata_nv latency on NVIDIA CK804 [was Re: AMD64 X2 lost ticks on PM timer]
  2006-03-16  3:15                                                 ` libata/sata_nv latency on NVIDIA CK804 [was Re: AMD64 X2 lost ticks on PM timer] Bill Rugolsky Jr.
@ 2006-03-16  4:20                                                   ` Lee Revell
  2006-03-16  9:18                                                     ` Ingo Molnar
  2006-03-16 14:42                                                     ` Gabor Gombas
  0 siblings, 2 replies; 60+ messages in thread
From: Lee Revell @ 2006-03-16  4:20 UTC (permalink / raw)
  To: Bill Rugolsky Jr.
  Cc: Jeff Garzik, Ingo Molnar, Andi Kleen, Jason Baron, linux-kernel,
	john stultz, Allen Martin

On Wed, 2006-03-15 at 22:15 -0500, Bill Rugolsky Jr. wrote:
> 
> I'm heading home now (it's 22:00, and I've been here 16 hours
> already), but I figured that I'd post what I have thus far, and
> perhaps you can tell me what the problem is.
> 

I think it would be better to try to identify the exact circumstances
that trigger the large PIO delay, than to start over debugging a new and
untested driver, especially if the SMM hypothesis has been ruled out.

You mentioned before the bug only hits with writes to multiple drives -
can you try to identify a pattern here - stress the drives one at a
time, try RAID vs. stressing both drives independently, remove one from
the bus.  See if anything affects the duration of the latencies, etc.

Lots of people have these boards and it seems like if the problem was
widespread, I would have seen it on the Linux audio lists, as many of
those users run Ingo's instrumented kernel and they all know to report
latency traces when they get them.

Lee


* Re: libata/sata_nv latency on NVIDIA CK804 [was Re: AMD64 X2 lost ticks on PM timer]
  2006-03-16  4:20                                                   ` Lee Revell
@ 2006-03-16  9:18                                                     ` Ingo Molnar
  2006-03-16 14:42                                                     ` Gabor Gombas
  1 sibling, 0 replies; 60+ messages in thread
From: Ingo Molnar @ 2006-03-16  9:18 UTC (permalink / raw)
  To: Lee Revell
  Cc: Bill Rugolsky Jr.,
	Jeff Garzik, Andi Kleen, Jason Baron, linux-kernel, john stultz,
	Allen Martin


* Lee Revell <rlrevell@joe-job.com> wrote:

> On Wed, 2006-03-15 at 22:15 -0500, Bill Rugolsky Jr. wrote:
> > 
> > I'm heading home now (it's 22:00, and I've been here 16 hours
> > already), but I figured that I'd post what I have thus far, and
> > perhaps you can tell me what the problem is.
> > 
> 
> I think it would be better to try to identify the exact circumstances 
> that trigger the large PIO delay, than to start over debugging a new 
> and untested driver, especially if the SMM hypothesis has been ruled 
> out.

well, but it's nevertheless a nice thing that the driver got enhanced - 
and Bill's patch seems to be quite close to usable. If that driver 
enhancement also ends up solving the latency then why not?

one more thing to try (in the old driver) would be to surround the inb() 
line in the offending function with mcount(); calls, and redo the 
latency trace. That will tell us for sure whether it's the PIO 
instruction that causes the delay.

	Ingo

* Re: [patch] latency-tracing-v2.6.16.patch
  2006-03-15 22:32                                       ` Bill Rugolsky Jr.
@ 2006-03-16  9:18                                         ` Ingo Molnar
  0 siblings, 0 replies; 60+ messages in thread
From: Ingo Molnar @ 2006-03-16  9:18 UTC (permalink / raw)
  To: Bill Rugolsky Jr.,
	Andi Kleen, Jeff Garzik, Lee Revell, Jason Baron, linux-kernel,
	john stultz


* Bill Rugolsky Jr. <brugolsky@telemetry-investments.com> wrote:

> > latency-tracing-v2.6.16.patch would be the one for current upstream 
> > kernels. The codebase is the same as in the -rt tree.
> 
> Ingo, I had to add this incremental patch against 2.6.16-rc6-git4 in 
> order to get the 2.6.15-rc7 latency tracer working on x86_64.  Looks 
> like the problem is still there in latency-tracing-v2.6.16.patch.

thanks, applied.

	Ingo

* Re: libata/sata_nv latency on NVIDIA CK804 [was Re: AMD64 X2 lost ticks on PM timer]
  2006-03-16  4:20                                                   ` Lee Revell
  2006-03-16  9:18                                                     ` Ingo Molnar
@ 2006-03-16 14:42                                                     ` Gabor Gombas
  1 sibling, 0 replies; 60+ messages in thread
From: Gabor Gombas @ 2006-03-16 14:42 UTC (permalink / raw)
  To: Lee Revell
  Cc: Bill Rugolsky Jr.,
	Jeff Garzik, Ingo Molnar, Andi Kleen, Jason Baron, linux-kernel,
	john stultz, Allen Martin

On Wed, Mar 15, 2006 at 11:20:24PM -0500, Lee Revell wrote:

> Lots of people have these boards and it seems like if the problem was
> widespread, I would have seen it on the Linux audio lists, as many of
> those users run Ingo's instrumented kernel and they all know to report
> latency traces when they get them.

I did not experience any sata_nv problems with 2 disks/RAID1 (at least I
never noticed). I immediately got the "warning: many lost ticks. Your
time source seems to be instable or some driver is hogging interupts"
message when I started using 4 disks/RAID5. It's possible that most
people do not have enough disks connected to the nForce4 to trigger the
bug.

Gabor

-- 
     ---------------------------------------------------------
     MTA SZTAKI Computer and Automation Research Institute
                Hungarian Academy of Sciences
     ---------------------------------------------------------

* Re: libata/sata_nv latency on NVIDIA CK804 [was Re: AMD64 X2 lost ticks on PM timer]
  2006-03-15 22:00                                         ` Ingo Molnar
  2006-03-15 22:25                                           ` Jeff Garzik
@ 2006-03-16 15:13                                           ` Alan Cox
  2006-03-16 16:57                                             ` Bill Rugolsky Jr.
  1 sibling, 1 reply; 60+ messages in thread
From: Alan Cox @ 2006-03-16 15:13 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Lee Revell, Bill Rugolsky Jr.,
	Andi Kleen, Jeff Garzik, Jason Baron, linux-kernel, john stultz

On Wed, 2006-03-15 at 23:00 +0100, Ingo Molnar wrote:
> so my guess would be that this device doesn't do MMIO, and the PIO inb() 
> causes some bad BIOS-based SMM handler/emulator to trigger, which takes 
> 16.6 msecs. If indeed the device is not in MMIO mode, is there a way to 
> force it into MMIO mode, to test this theory?

There is a much more reliable way to check this. Use the profiling
registers to check the instruction issue count before/after the I/O and
you'll know if it's something like SMM or just a bus stall.

I can believe the bus stall because some devices will queue a large FIFO
of data for the disk and the status read may require flushing it all
out.


* Re: libata/sata_nv latency on NVIDIA CK804 [was Re: AMD64 X2 lost ticks on PM timer]
  2006-03-16 15:13                                           ` Alan Cox
@ 2006-03-16 16:57                                             ` Bill Rugolsky Jr.
  2006-03-22 16:09                                               ` Andi Kleen
  0 siblings, 1 reply; 60+ messages in thread
From: Bill Rugolsky Jr. @ 2006-03-16 16:57 UTC (permalink / raw)
  To: Alan Cox
  Cc: Ingo Molnar, Lee Revell, Andi Kleen, Jeff Garzik, Jason Baron,
	linux-kernel, john stultz

On Thu, Mar 16, 2006 at 03:13:39PM +0000, Alan Cox wrote:
> On Wed, 2006-03-15 at 23:00 +0100, Ingo Molnar wrote:
> > so my guess would be that this device doesn't do MMIO, and the PIO inb() 
> > causes some bad BIOS-based SMM handler/emulator to trigger, which takes 
> > 16.6 msecs. If indeed the device is not in MMIO mode, is there a way to 
> > force it into MMIO mode, to test this theory?
> 
> There is a much more reliable way to check this. Use the profiling
> registers to check the instruction issue count before/after the I/O and
> you'll know if it's something like SMM or just a bus stall.


Brilliant [as usual 8-)].


So I imagine that the thing to do is just insert before/after
rdmsr(MSR_K7_PERFCTR[0123]) into the code, with a suitable printk(),
and then program the counters with oprofile to use large event
counts (lasting seconds)?  The -rt patch could make good use of some
infrastructure for doing this.
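
To make the before/after idea concrete, here is a minimal userspace
sketch of the decision Alan describes, written in Python rather than as
kernel code. It assumes the retired-instruction counter keeps counting
while any SMM handler runs, that the counter is 48 bits wide, and that
the elapsed time comes from some separate clock; the names Sample,
retired() and classify() and both thresholds are hypothetical and only
illustrate the reasoning.

```python
from dataclasses import dataclass

COUNTER_BITS = 48  # assumed counter width; adjust for the actual CPU


@dataclass
class Sample:
    """Readings taken immediately before and after one inb()."""
    instr_before: int   # retired-instruction count before the read
    instr_after: int    # retired-instruction count after the read
    elapsed_us: float   # time spent in the read, from a separate clock


def retired(sample, bits=COUNTER_BITS):
    """Instructions retired across the read, tolerating one counter wrap."""
    return (sample.instr_after - sample.instr_before) % (1 << bits)


def classify(sample, slow_us=100.0, max_own_instr=50):
    """Guess why a port read was slow.

    slow_us and max_own_instr are illustrative thresholds, not measured
    values: a slow read that retired almost nothing points at the CPU
    waiting on the bus, while a slow read that retired many instructions
    points at other code (for example an SMM handler) having run.
    """
    if sample.elapsed_us < slow_us:
        return "fast"
    if retired(sample) <= max_own_instr:
        return "bus stall"
    return "other code ran"
```

For example, a read that took 1721 us but retired only a handful of
instructions would come out as "bus stall" under these assumptions.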

> I can believe the bus stall because some devices will queue a large FIFO
> of data for the disk and the status read may require flushing it all
> out.

It may involve synchronous writes. 

I did as Ingo suggested, and added the before/after mcount()s:

diff -up drivers/scsi/libata-core.c{.orig,}
--- drivers/scsi/libata-core.c.orig     2006-03-15 17:19:42.000000000 -0500
+++ drivers/scsi/libata-core.c  2006-03-16 10:08:32.000000000 -0500
@@ -3984,8 +3984,11 @@ u8 ata_bmdma_status(struct ata_port *ap)
        if (ap->flags & ATA_FLAG_MMIO) {
                void __iomem *mmio = (void __iomem *) ap->ioaddr.bmdma_addr;
                host_stat = readb(mmio + ATA_DMA_STATUS);
-       } else
+       } else {
+               mcount();
                host_stat = inb(ap->ioaddr.bmdma_addr + ATA_DMA_STATUS);
+               mcount();
+       }
        return host_stat;
 }

This produced the trace below in < 30 seconds, which clearly indicates
that the inb() is a problem.  This occurs when running

	 tar cf /dev/zero /usr & tar cf /dev/zero /extra_disk &

with both of them mounted readonly.  HOWEVER - I'm an idiot - I just
realized that syslog is sitting there synchronously writing lost tick
messages into the log on the same disks.  So I turned off syslog and the
trace became much harder to reproduce.  After several minutes I got one.

+1 for Alan's theory about flushes stalling status reads.
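
As a reading aid for the trace below: with the patch above, each pass
through the PIO branch of ata_bmdma_status() appears to leave three
entries (the function entry plus the two explicit mcount() calls), so
the jump between the last two of them brackets the inb(). A minimal
Python sketch that finds the largest jump between consecutive
timestamps follows. The regular expression is an assumption about the
line layout, based only on the lines shown in this message, and
parse_entries() and largest_gap() are hypothetical helper names.

```python
import re

# Matches "<time>us", an optional marker, a colon, then the function name.
# "!" is the marker seen in the trace; "+" is allowed as an assumption.
ENTRY = re.compile(r"(\d+)us\s*[!+]?\s*:\s*(\S+)")


def parse_entries(lines):
    """Return (timestamp_us, function) for every line that looks like an entry."""
    entries = []
    for line in lines:
        match = ENTRY.search(line)
        if match:
            entries.append((int(match.group(1)), match.group(2)))
    return entries


def largest_gap(lines):
    """Return (gap_us, function_before, function_after) for the biggest jump.

    Returns None when fewer than two entries could be parsed.
    """
    entries = parse_entries(lines)
    best = None
    for (t0, f0), (t1, f1) in zip(entries, entries[1:]):
        gap = t1 - t0
        if best is None or gap > best[0]:
            best = (gap, f0, f1)
    return best
```

Fed the entries around the stall, this reports a 1721 us jump (5us to
1726us) between two consecutive ata_bmdma_status entries.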

	-Bill


preemption latency trace v1.1.5 on 2.6.16-rc6-git4-profile
--------------------------------------------------------------------
 latency: 1861 us, #730/730, CPU#0 | (M:desktop VP:0, KP:0, SP:0 HP:0 #P:1)
    -----------------
    | task: rtpserver-3032 (uid:316 nice:-10 policy:0 rt_prio:0)
    -----------------

                 _------=> CPU#            
                / _-----=> irqs-off        
               | / _----=> need-resched    
               || / _---=> hardirq/softirq 
               ||| / _--=> preempt-depth   
               |||| /                      
               |||||     delay             
   cmd     pid ||||| time  |   caller      
      \   /    |||||   \   |   /           
  <idle>-0     0dns.    0us : __trace_start_sched_wakeup (try_to_wake_up)
  <idle>-0     0dns.    0us : __trace_start_sched_wakeup <<...>-3032> (69 0)
  <idle>-0     0dns.    0us : _spin_unlock_irqrestore (try_to_wake_up)
  <idle>-0     0dns.    0us : _spin_unlock_irqrestore (__wake_up)
  <idle>-0     0dns.    0us : do_IRQ (ret_from_intr)
  <idle>-0     0dns.    0us : exit_idle (do_IRQ)
  <idle>-0     0dns.    0us : in_lock_functions (add_preempt_count)
  <idle>-0     0dnH.    0us : __do_IRQ (do_IRQ)
  <idle>-0     0dnH.    0us : __lock_text_start (__do_IRQ)
  <idle>-0     0dnH.    1us : mask_and_ack_level_ioapic_irq (__do_IRQ)
  <idle>-0     0dnH.    1us : handle_IRQ_event (__do_IRQ)
  <idle>-0     0dnH.    1us : nv_interrupt (handle_IRQ_event)
  <idle>-0     0dnH.    1us : _spin_lock_irqsave (nv_interrupt)
  <idle>-0     0dnH.    1us : ata_host_intr (nv_interrupt)
  <idle>-0     0dnH.    1us : ata_bmdma_status (ata_host_intr)
  <idle>-0     0dnH.    1us : ata_bmdma_status (ata_host_intr)
  <idle>-0     0dnH.    2us : ata_bmdma_status (ata_host_intr)
  <idle>-0     0dnH.    2us : ata_bmdma_stop (ata_host_intr)
  <idle>-0     0dnH.    2us : ata_altstatus (ata_bmdma_stop)
  <idle>-0     0dnH.    2us : ata_altstatus (ata_host_intr)
  <idle>-0     0dnH.    3us : ata_check_status (ata_host_intr)
  <idle>-0     0dnH.    3us : ata_bmdma_irq_clear (ata_host_intr)
  <idle>-0     0dnH.    3us : ata_qc_complete (ata_host_intr)
  <idle>-0     0dnH.    4us : nommu_unmap_sg (ata_qc_complete)
  <idle>-0     0dnH.    4us : ata_scsi_qc_complete (ata_qc_complete)
  <idle>-0     0dnH.    4us : scsi_done (ata_scsi_qc_complete)
  <idle>-0     0dnH.    4us : scsi_delete_timer (scsi_done)
  <idle>-0     0dnH.    4us : del_timer (scsi_delete_timer)
  <idle>-0     0dnH.    4us : lock_timer_base (del_timer)
  <idle>-0     0dnH.    4us : _spin_lock_irqsave (lock_timer_base)
  <idle>-0     0dnH.    4us : _spin_unlock_irqrestore (del_timer)
  <idle>-0     0dnH.    4us : __scsi_done (scsi_done)
  <idle>-0     0dnH.    5us : blk_complete_request (__scsi_done)
  <idle>-0     0dnH.    5us : raise_softirq_irqoff (blk_complete_request)
  <idle>-0     0dnH.    5us : __ata_qc_complete (ata_qc_complete)
  <idle>-0     0dnH.    5us : ata_host_intr (nv_interrupt)
  <idle>-0     0dnH.    5us : ata_bmdma_status (ata_host_intr)
  <idle>-0     0dnH.    5us!: ata_bmdma_status (ata_host_intr)
  <idle>-0     0dnH. 1726us : ata_bmdma_status (ata_host_intr)
  <idle>-0     0dnH. 1726us : nv_check_hotplug_ck804 (nv_interrupt)
  <idle>-0     0dnH. 1726us : _spin_unlock_irqrestore (nv_interrupt)
  <idle>-0     0dnH. 1727us : smp_apic_timer_interrupt (apic_timer_interrupt)
  <idle>-0     0dnH. 1727us : exit_idle (smp_apic_timer_interrupt)
  <idle>-0     0dnH. 1727us : in_lock_functions (add_preempt_count)
  <idle>-0     0dnH. 1727us : smp_local_timer_interrupt (smp_apic_timer_interrupt)
  <idle>-0     0dnH. 1727us : profile_tick (smp_local_timer_interrupt)
  <idle>-0     0dnH. 1727us : profile_pc (profile_tick)
  <idle>-0     0dnH. 1727us : in_lock_functions (profile_pc)
  <idle>-0     0dnH. 1727us : profile_hit (profile_tick)
  <idle>-0     0dnH. 1727us : update_process_times (smp_local_timer_interrupt)
  <idle>-0     0dnH. 1727us : account_system_time (update_process_times)
  <idle>-0     0dnH. 1728us : acct_update_integrals (account_system_time)
  <idle>-0     0dnH. 1728us : run_local_timers (update_process_times)
  <idle>-0     0dnH. 1728us : raise_softirq (run_local_timers)
  <idle>-0     0dnH. 1728us : rcu_pending (update_process_times)
  <idle>-0     0dnH. 1728us : __rcu_pending (rcu_pending)
  <idle>-0     0dnH. 1728us : rcu_check_callbacks (update_process_times)
  <idle>-0     0dnH. 1728us : idle_cpu (rcu_check_callbacks)
  <idle>-0     0dnH. 1728us : __tasklet_schedule (rcu_check_callbacks)
  <idle>-0     0dnH. 1728us : scheduler_tick (update_process_times)
  <idle>-0     0dnH. 1728us : sched_clock (scheduler_tick)
  <idle>-0     0dnH. 1729us : __lock_text_start (scheduler_tick)
  <idle>-0     0dnH. 1729us : resched_task (scheduler_tick)
  <idle>-0     0dnH. 1729us : rebalance_tick (scheduler_tick)
  <idle>-0     0dnH. 1729us : run_posix_cpu_timers (update_process_times)
  <idle>-0     0dnH. 1729us : irq_exit (smp_apic_timer_interrupt)
  <idle>-0     0dnH. 1730us : do_IRQ (ret_from_intr)
  <idle>-0     0dnH. 1730us : exit_idle (do_IRQ)
  <idle>-0     0dnH. 1730us : in_lock_functions (add_preempt_count)
  <idle>-0     0dnH. 1730us : __do_IRQ (do_IRQ)
  <idle>-0     0dnH. 1730us : __lock_text_start (__do_IRQ)
  <idle>-0     0dnH. 1730us : mask_and_ack_level_ioapic_irq (__do_IRQ)
  <idle>-0     0dnH. 1730us : handle_IRQ_event (__do_IRQ)
  <idle>-0     0dnH. 1730us : nv_nic_irq (handle_IRQ_event)
  <idle>-0     0dnH. 1731us : __lock_text_start (nv_nic_irq)
  <idle>-0     0dnH. 1731us : nv_tx_done (nv_nic_irq)
  <idle>-0     0dnH. 1731us : netpoll_trap (nv_tx_done)
  <idle>-0     0dnH. 1731us : nommu_unmap_single (nv_nic_irq)
  <idle>-0     0dnH. 1732us : eth_type_trans (nv_nic_irq)
  <idle>-0     0dnH. 1732us : netif_rx (nv_nic_irq)
  <idle>-0     0dnH. 1732us : nommu_unmap_single (nv_nic_irq)
  <idle>-0     0dnH. 1733us : eth_type_trans (nv_nic_irq)
  <idle>-0     0dnH. 1733us : netif_rx (nv_nic_irq)
  <idle>-0     0dnH. 1733us : nommu_unmap_single (nv_nic_irq)
  <idle>-0     0dnH. 1733us : eth_type_trans (nv_nic_irq)
  <idle>-0     0dnH. 1734us : netif_rx (nv_nic_irq)
  <idle>-0     0dnH. 1734us : nv_alloc_rx (nv_nic_irq)
  <idle>-0     0dnH. 1734us : __alloc_skb (nv_alloc_rx)
  <idle>-0     0dnH. 1734us : kmem_cache_alloc (__alloc_skb)
  <idle>-0     0dnH. 1734us : __kmalloc (__alloc_skb)
  <idle>-0     0dnH. 1735us : nommu_map_single (nv_alloc_rx)
  <idle>-0     0dnH. 1735us : check_addr (nommu_map_single)
  <idle>-0     0dnH. 1735us : __alloc_skb (nv_alloc_rx)
  <idle>-0     0dnH. 1735us : kmem_cache_alloc (__alloc_skb)
  <idle>-0     0dnH. 1735us : __kmalloc (__alloc_skb)
  <idle>-0     0dnH. 1735us : nommu_map_single (nv_alloc_rx)
  <idle>-0     0dnH. 1735us : check_addr (nommu_map_single)
  <idle>-0     0dnH. 1735us : __alloc_skb (nv_alloc_rx)
  <idle>-0     0dnH. 1736us : kmem_cache_alloc (__alloc_skb)
  <idle>-0     0dnH. 1736us : __kmalloc (__alloc_skb)
  <idle>-0     0dnH. 1736us : nommu_map_single (nv_alloc_rx)
  <idle>-0     0dnH. 1736us : check_addr (nommu_map_single)
  <idle>-0     0dnH. 1737us : __lock_text_start (__do_IRQ)
  <idle>-0     0dnH. 1737us : note_interrupt (__do_IRQ)
  <idle>-0     0dnH. 1737us : end_level_ioapic_irq (__do_IRQ)
  <idle>-0     0dnH. 1737us : irq_exit (do_IRQ)
  <idle>-0     0dnH. 1737us : __lock_text_start (__do_IRQ)
  <idle>-0     0dnH. 1738us : note_interrupt (__do_IRQ)
  <idle>-0     0dnH. 1738us : end_level_ioapic_irq (__do_IRQ)
  <idle>-0     0dnH. 1738us : irq_exit (do_IRQ)
  <idle>-0     0dns. 1738us : do_IRQ (ret_from_intr)
  <idle>-0     0dns. 1738us : exit_idle (do_IRQ)
  <idle>-0     0dns. 1738us : in_lock_functions (add_preempt_count)
  <idle>-0     0dnH. 1738us : __do_IRQ (do_IRQ)
  <idle>-0     0dnH. 1738us : __lock_text_start (__do_IRQ)
  <idle>-0     0dnH. 1738us : ack_edge_ioapic_irq (__do_IRQ)
  <idle>-0     0dnH. 1739us : handle_IRQ_event (__do_IRQ)
  <idle>-0     0dnH. 1739us : timer_interrupt (handle_IRQ_event)
  <idle>-0     0dnH. 1739us : main_timer_handler (timer_interrupt)
  <idle>-0     0dnH. 1739us : __lock_text_start (main_timer_handler)
  <idle>-0     0dnH. 1739us+: __lock_text_start (main_timer_handler)
  <idle>-0     0dnH. 1742us : pmtimer_mark_offset (main_timer_handler)
  <idle>-0     0dnH. 1743us : handle_lost_ticks (main_timer_handler)
  <idle>-0     0dnH. 1744us : printk (handle_lost_ticks)
  <idle>-0     0dnH. 1744us : vprintk (printk)
  <idle>-0     0dnH. 1744us : _spin_lock_irqsave (vprintk)
  <idle>-0     0dnH. 1745us : vscnprintf (vprintk)
  <idle>-0     0dnH. 1745us+: vsnprintf (vscnprintf)
  <idle>-0     0dnH. 1747us : number (vsnprintf)
  <idle>-0     0dnH. 1748us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1749us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1749us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1749us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1749us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1749us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1749us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1749us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1749us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1749us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1749us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1750us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1750us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1750us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1750us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1750us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1750us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1750us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1750us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1750us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1750us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1751us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1751us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1751us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1751us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1751us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1751us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1751us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1751us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1751us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1751us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1751us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1752us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1752us : _spin_unlock_irqrestore (vprintk)
  <idle>-0     0dnH. 1752us : release_console_sem (vprintk)
  <idle>-0     0dnH. 1752us : _spin_lock_irqsave (release_console_sem)
  <idle>-0     0dnH. 1753us : _call_console_drivers (release_console_sem)
  <idle>-0     0dnH. 1753us : _spin_lock_irqsave (release_console_sem)
  <idle>-0     0dnH. 1753us : _spin_unlock_irqrestore (release_console_sem)
  <idle>-0     0dnH. 1753us : __wake_up (release_console_sem)
  <idle>-0     0dnH. 1753us : _spin_lock_irqsave (__wake_up)
  <idle>-0     0dnH. 1753us : __wake_up_common (__wake_up)
  <idle>-0     0dnH. 1754us : default_wake_function (__wake_up_common)
  <idle>-0     0dnH. 1754us : try_to_wake_up (default_wake_function)
  <idle>-0     0dnH. 1754us : __lock_text_start (try_to_wake_up)
  <idle>-0     0dnH. 1754us : idle_cpu (try_to_wake_up)
  <idle>-0     0dnH. 1754us : activate_task (try_to_wake_up)
  <idle>-0     0dnH. 1754us : sched_clock (activate_task)
  <idle>-0     0dnH. 1755us : recalc_task_prio (activate_task)
  <idle>-0     0dnH. 1755us : effective_prio (recalc_task_prio)
  <idle>-0     0dnH. 1755us : activate_task <<...>-1774> (73 1)
  <idle>-0     0dnH. 1755us : enqueue_task (activate_task)
  <idle>-0     0dnH. 1755us : resched_task (try_to_wake_up)
  <idle>-0     0dnH. 1756us : __trace_start_sched_wakeup (try_to_wake_up)
  <idle>-0     0dnH. 1756us : __lock_text_start (__trace_start_sched_wakeup)
  <idle>-0     0dnH. 1756us : _spin_unlock_irqrestore (try_to_wake_up)
  <idle>-0     0dnH. 1756us : _spin_unlock_irqrestore (__wake_up)
  <idle>-0     0dnH. 1756us : __print_symbol (handle_lost_ticks)
  <idle>-0     0dnH. 1757us : kallsyms_lookup (__print_symbol)
  <idle>-0     0dnH. 1758us+: get_symbol_offset (kallsyms_lookup)
  <idle>-0     0dnH. 1761us : kallsyms_expand_symbol (kallsyms_lookup)
  <idle>-0     0dnH. 1762us : sprintf (__print_symbol)
  <idle>-0     0dnH. 1763us : vsnprintf (sprintf)
  <idle>-0     0dnH. 1763us : strnlen (vsnprintf)
  <idle>-0     0dnH. 1764us : number (vsnprintf)
  <idle>-0     0dnH. 1765us : number (vsnprintf)
  <idle>-0     0dnH. 1765us : printk (__print_symbol)
  <idle>-0     0dnH. 1765us : vprintk (printk)
  <idle>-0     0dnH. 1765us : _spin_lock_irqsave (vprintk)
  <idle>-0     0dnH. 1765us : vscnprintf (vprintk)
  <idle>-0     0dnH. 1765us : vsnprintf (vscnprintf)
  <idle>-0     0dnH. 1766us : strnlen (vsnprintf)
  <idle>-0     0dnH. 1766us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1766us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1766us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1766us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1766us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1766us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1766us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1766us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1766us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1767us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1767us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1767us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1767us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1767us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1767us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1767us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1767us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1767us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1767us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1767us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1768us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1768us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1768us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1768us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1768us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1768us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1768us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1768us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1768us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1768us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1768us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1769us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1769us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1769us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1769us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1769us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1769us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1769us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1769us : emit_log_char (vprintk)
  <idle>-0     0dnH. 1769us : _spin_unlock_irqrestore (vprintk)
  <idle>-0     0dnH. 1769us : release_console_sem (vprintk)
  <idle>-0     0dnH. 1769us : _spin_lock_irqsave (release_console_sem)
  <idle>-0     0dnH. 1770us : _call_console_drivers (release_console_sem)
  <idle>-0     0dnH. 1770us : _call_console_drivers (release_console_sem)
  <idle>-0     0dnH. 1770us : _spin_lock_irqsave (release_console_sem)
  <idle>-0     0dnH. 1770us : _spin_unlock_irqrestore (release_console_sem)
  <idle>-0     0dnH. 1770us : __wake_up (release_console_sem)
  <idle>-0     0dnH. 1770us : _spin_lock_irqsave (__wake_up)
  <idle>-0     0dnH. 1770us : __wake_up_common (__wake_up)
  <idle>-0     0dnH. 1770us : default_wake_function (__wake_up_common)
  <idle>-0     0dnH. 1771us : try_to_wake_up (default_wake_function)
  <idle>-0     0dnH. 1771us : __lock_text_start (try_to_wake_up)
  <idle>-0     0dnH. 1771us : _spin_unlock_irqrestore (try_to_wake_up)
  <idle>-0     0dnH. 1771us : _spin_unlock_irqrestore (__wake_up)
  <idle>-0     0dnH. 1771us : do_timer (main_timer_handler)
  <idle>-0     0dnH. 1771us : adjtime_adjustment (do_timer)
  <idle>-0     0dnH. 1771us : adjtime_adjustment (do_timer)
  <idle>-0     0dnH. 1772us : softlockup_tick (do_timer)
  <idle>-0     0dnH. 1772us : smp_send_timer_broadcast_ipi (timer_interrupt)
  <idle>-0     0dnH. 1772us : __lock_text_start (__do_IRQ)
  <idle>-0     0dnH. 1772us : note_interrupt (__do_IRQ)
  <idle>-0     0dnH. 1772us : end_edge_ioapic_irq (__do_IRQ)
  <idle>-0     0dnH. 1772us : irq_exit (do_IRQ)
  <idle>-0     0dns. 1772us : do_IRQ (ret_from_intr)
  <idle>-0     0dns. 1773us : exit_idle (do_IRQ)
  <idle>-0     0dns. 1773us : in_lock_functions (add_preempt_count)
  <idle>-0     0dnH. 1773us : __do_IRQ (do_IRQ)
  <idle>-0     0dnH. 1773us : __lock_text_start (__do_IRQ)
  <idle>-0     0dnH. 1773us : mask_and_ack_level_ioapic_irq (__do_IRQ)
  <idle>-0     0dnH. 1773us : handle_IRQ_event (__do_IRQ)
  <idle>-0     0dnH. 1773us : nv_interrupt (handle_IRQ_event)
  <idle>-0     0dnH. 1773us : _spin_lock_irqsave (nv_interrupt)
  <idle>-0     0dnH. 1773us : ata_check_status (nv_interrupt)
  <idle>-0     0dnH. 1774us : ata_host_intr (nv_interrupt)
  <idle>-0     0dnH. 1774us : ata_bmdma_status (ata_host_intr)
  <idle>-0     0dnH. 1774us : ata_bmdma_status (ata_host_intr)
  <idle>-0     0dnH. 1774us : ata_bmdma_status (ata_host_intr)
  <idle>-0     0dnH. 1774us : ata_bmdma_stop (ata_host_intr)
  <idle>-0     0dnH. 1775us : ata_altstatus (ata_bmdma_stop)
  <idle>-0     0dnH. 1775us : ata_altstatus (ata_host_intr)
  <idle>-0     0dnH. 1775us : ata_check_status (ata_host_intr)
  <idle>-0     0dnH. 1775us : ata_bmdma_irq_clear (ata_host_intr)
  <idle>-0     0dnH. 1776us : ata_qc_complete (ata_host_intr)
  <idle>-0     0dnH. 1776us : nommu_unmap_sg (ata_qc_complete)
  <idle>-0     0dnH. 1776us : ata_scsi_qc_complete (ata_qc_complete)
  <idle>-0     0dnH. 1776us : scsi_done (ata_scsi_qc_complete)
  <idle>-0     0dnH. 1776us : scsi_delete_timer (scsi_done)
  <idle>-0     0dnH. 1776us : del_timer (scsi_delete_timer)
  <idle>-0     0dnH. 1777us : lock_timer_base (del_timer)
  <idle>-0     0dnH. 1777us : _spin_lock_irqsave (lock_timer_base)
  <idle>-0     0dnH. 1777us : _spin_unlock_irqrestore (del_timer)
  <idle>-0     0dnH. 1777us : __scsi_done (scsi_done)
  <idle>-0     0dnH. 1777us : blk_complete_request (__scsi_done)
  <idle>-0     0dnH. 1777us : raise_softirq_irqoff (blk_complete_request)
  <idle>-0     0dnH. 1777us : __ata_qc_complete (ata_qc_complete)
  <idle>-0     0dnH. 1777us : nv_check_hotplug_ck804 (nv_interrupt)
  <idle>-0     0dnH. 1778us : _spin_unlock_irqrestore (nv_interrupt)
  <idle>-0     0dnH. 1778us : __lock_text_start (__do_IRQ)
  <idle>-0     0dnH. 1778us : note_interrupt (__do_IRQ)
  <idle>-0     0dnH. 1778us : end_level_ioapic_irq (__do_IRQ)
  <idle>-0     0dnH. 1778us : irq_exit (do_IRQ)
  <idle>-0     0dns. 1779us : run_timer_softirq (__do_softirq)
  <idle>-0     0dns. 1779us : hrtimer_run_queues (run_timer_softirq)
  <idle>-0     0dns. 1779us : ktime_get_real (hrtimer_run_queues)
  <idle>-0     0dns. 1779us : getnstimeofday (ktime_get_real)
  <idle>-0     0dns. 1779us : do_gettimeofday (getnstimeofday)
  <idle>-0     0dns. 1780us : do_gettimeoffset_pm (do_gettimeofday)
  <idle>-0     0dns. 1781us : _spin_lock_irq (hrtimer_run_queues)
  <idle>-0     0dns. 1782us : ktime_get (hrtimer_run_queues)
  <idle>-0     0dns. 1782us : ktime_get_ts (ktime_get)
  <idle>-0     0dns. 1782us : getnstimeofday (ktime_get_ts)
  <idle>-0     0dns. 1782us : do_gettimeofday (getnstimeofday)
  <idle>-0     0dns. 1782us : do_gettimeoffset_pm (do_gettimeofday)
  <idle>-0     0dns. 1784us : set_normalized_timespec (ktime_get_ts)
  <idle>-0     0dns. 1784us : _spin_lock_irq (hrtimer_run_queues)
  <idle>-0     0dns. 1784us : _spin_lock_irq (run_timer_softirq)
  <idle>-0     0dns. 1785us : i8042_timer_func (run_timer_softirq)
  <idle>-0     0dns. 1785us : i8042_interrupt (i8042_timer_func)
  <idle>-0     0dns. 1785us : mod_timer (i8042_interrupt)
  <idle>-0     0dns. 1785us : __mod_timer (mod_timer)
  <idle>-0     0dns. 1785us : lock_timer_base (__mod_timer)
  <idle>-0     0dns. 1786us : _spin_lock_irqsave (lock_timer_base)
  <idle>-0     0dns. 1786us : internal_add_timer (__mod_timer)
  <idle>-0     0dns. 1786us : _spin_unlock_irqrestore (__mod_timer)
  <idle>-0     0dns. 1786us : _spin_lock_irqsave (i8042_interrupt)
  <idle>-0     0dns. 1788us : _spin_unlock_irqrestore (i8042_interrupt)
  <idle>-0     0dns. 1788us : _spin_lock_irq (run_timer_softirq)
  <idle>-0     0dns. 1788us : net_rx_action (__do_softirq)
  <idle>-0     0dns. 1788us : process_backlog (net_rx_action)
  <idle>-0     0dns. 1789us : netif_receive_skb (process_backlog)
  <idle>-0     0dns. 1789us : vlan_skb_recv (netif_receive_skb)
  <idle>-0     0dns. 1789us : __find_vlan_dev (vlan_skb_recv)
  <idle>-0     0dns. 1789us : __vlan_find_group (__find_vlan_dev)
  <idle>-0     0dns. 1790us : memmove (vlan_skb_recv)
  <idle>-0     0dns. 1790us : netif_rx (vlan_skb_recv)
  <idle>-0     0dns. 1790us : netif_receive_skb (process_backlog)
  <idle>-0     0dns. 1791us : vlan_skb_recv (netif_receive_skb)
  <idle>-0     0dns. 1791us : __find_vlan_dev (vlan_skb_recv)
  <idle>-0     0dns. 1791us : __vlan_find_group (__find_vlan_dev)
  <idle>-0     0dns. 1791us : memmove (vlan_skb_recv)
  <idle>-0     0dns. 1792us : netif_rx (vlan_skb_recv)
  <idle>-0     0dns. 1792us : netif_receive_skb (process_backlog)
  <idle>-0     0dns. 1792us : vlan_skb_recv (netif_receive_skb)
  <idle>-0     0dns. 1792us : __find_vlan_dev (vlan_skb_recv)
  <idle>-0     0dns. 1793us : __vlan_find_group (__find_vlan_dev)
  <idle>-0     0dns. 1793us : memmove (vlan_skb_recv)
  <idle>-0     0dns. 1793us : netif_rx (vlan_skb_recv)
  <idle>-0     0dns. 1793us : netif_receive_skb (process_backlog)
  <idle>-0     0dns. 1794us : ip_rcv (netif_receive_skb)
  <idle>-0     0dns. 1794us : ip_route_input (ip_rcv)
  <idle>-0     0dns. 1794us : rt_hash_code (ip_route_input)
  <idle>-0     0dns. 1794us : ip_local_deliver (ip_rcv)
  <idle>-0     0dns. 1795us : udp_rcv (ip_local_deliver)
  <idle>-0     0dns. 1795us : _read_lock (udp_rcv)
  <idle>-0     0dns. 1795us : ip_mc_sf_allow (udp_rcv)
  <idle>-0     0dns. 1795us : udp_queue_rcv_skb (udp_rcv)
  <idle>-0     0dns. 1796us : dummy_socket_sock_rcv_skb (udp_queue_rcv_skb)
  <idle>-0     0dns. 1796us : skb_queue_tail (udp_queue_rcv_skb)
  <idle>-0     0dns. 1796us : _spin_lock_irqsave (skb_queue_tail)
  <idle>-0     0dns. 1796us : _spin_unlock_irqrestore (skb_queue_tail)
  <idle>-0     0dns. 1796us : sock_def_readable (udp_queue_rcv_skb)
  <idle>-0     0dns. 1796us : _read_lock (sock_def_readable)
  <idle>-0     0dns. 1797us : __wake_up (sock_def_readable)
  <idle>-0     0dns. 1797us : _spin_lock_irqsave (__wake_up)
  <idle>-0     0dns. 1797us : __wake_up_common (__wake_up)
  <idle>-0     0dns. 1797us : default_wake_function (__wake_up_common)
  <idle>-0     0dns. 1797us : try_to_wake_up (default_wake_function)
  <idle>-0     0dns. 1797us : __lock_text_start (try_to_wake_up)
  <idle>-0     0dns. 1798us : _spin_unlock_irqrestore (try_to_wake_up)
  <idle>-0     0dns. 1798us : _spin_unlock_irqrestore (__wake_up)
  <idle>-0     0dns. 1798us : netif_receive_skb (process_backlog)
  <idle>-0     0dns. 1799us : ip_rcv (netif_receive_skb)
  <idle>-0     0dns. 1799us : ip_route_input (ip_rcv)
  <idle>-0     0dns. 1799us : rt_hash_code (ip_route_input)
  <idle>-0     0dns. 1799us : ip_local_deliver (ip_rcv)
  <idle>-0     0dns. 1799us : udp_rcv (ip_local_deliver)
  <idle>-0     0dns. 1800us : _read_lock (udp_rcv)
  <idle>-0     0dns. 1800us : ip_mc_sf_allow (udp_rcv)
  <idle>-0     0dns. 1800us : udp_queue_rcv_skb (udp_rcv)
  <idle>-0     0dns. 1800us : dummy_socket_sock_rcv_skb (udp_queue_rcv_skb)
  <idle>-0     0dns. 1800us : skb_queue_tail (udp_queue_rcv_skb)
  <idle>-0     0dns. 1800us : _spin_lock_irqsave (skb_queue_tail)
  <idle>-0     0dns. 1801us : _spin_unlock_irqrestore (skb_queue_tail)
  <idle>-0     0dns. 1801us : sock_def_readable (udp_queue_rcv_skb)
  <idle>-0     0dns. 1801us : _read_lock (sock_def_readable)
  <idle>-0     0dns. 1801us : __wake_up (sock_def_readable)
  <idle>-0     0dns. 1801us : _spin_lock_irqsave (__wake_up)
  <idle>-0     0dns. 1801us : __wake_up_common (__wake_up)
  <idle>-0     0dns. 1801us : default_wake_function (__wake_up_common)
  <idle>-0     0dns. 1802us : try_to_wake_up (default_wake_function)
  <idle>-0     0dns. 1802us : __lock_text_start (try_to_wake_up)
  <idle>-0     0dns. 1802us : _spin_unlock_irqrestore (try_to_wake_up)
  <idle>-0     0dns. 1802us : _spin_unlock_irqrestore (__wake_up)
  <idle>-0     0dns. 1802us : netif_receive_skb (process_backlog)
  <idle>-0     0dns. 1803us : ip_rcv (netif_receive_skb)
  <idle>-0     0dns. 1803us : ip_route_input (ip_rcv)
  <idle>-0     0dns. 1803us : rt_hash_code (ip_route_input)
  <idle>-0     0dns. 1803us : ip_local_deliver (ip_rcv)
  <idle>-0     0dns. 1803us : udp_rcv (ip_local_deliver)
  <idle>-0     0dns. 1803us : _read_lock (udp_rcv)
  <idle>-0     0dns. 1804us : ip_mc_sf_allow (udp_rcv)
  <idle>-0     0dns. 1804us : udp_queue_rcv_skb (udp_rcv)
  <idle>-0     0dns. 1804us : dummy_socket_sock_rcv_skb (udp_queue_rcv_skb)
  <idle>-0     0dns. 1804us : skb_queue_tail (udp_queue_rcv_skb)
  <idle>-0     0dns. 1804us : _spin_lock_irqsave (skb_queue_tail)
  <idle>-0     0dns. 1804us : _spin_unlock_irqrestore (skb_queue_tail)
  <idle>-0     0dns. 1805us : sock_def_readable (udp_queue_rcv_skb)
  <idle>-0     0dns. 1805us : _read_lock (sock_def_readable)
  <idle>-0     0dns. 1805us : __wake_up (sock_def_readable)
  <idle>-0     0dns. 1805us : _spin_lock_irqsave (__wake_up)
  <idle>-0     0dns. 1805us : __wake_up_common (__wake_up)
  <idle>-0     0dns. 1805us : default_wake_function (__wake_up_common)
  <idle>-0     0dns. 1805us : try_to_wake_up (default_wake_function)
  <idle>-0     0dns. 1805us : __lock_text_start (try_to_wake_up)
  <idle>-0     0dns. 1806us : _spin_unlock_irqrestore (try_to_wake_up)
  <idle>-0     0dns. 1806us : _spin_unlock_irqrestore (__wake_up)
  <idle>-0     0dns. 1806us : blk_done_softirq (__do_softirq)
  <idle>-0     0dns. 1806us : scsi_softirq_done (blk_done_softirq)
  <idle>-0     0dns. 1807us : scsi_decide_disposition (scsi_softirq_done)
  <idle>-0     0dns. 1807us : scsi_log_completion (scsi_softirq_done)
  <idle>-0     0dns. 1807us : scsi_finish_command (scsi_softirq_done)
  <idle>-0     0dns. 1807us : scsi_device_unbusy (scsi_finish_command)
  <idle>-0     0dns. 1807us : _spin_lock_irqsave (scsi_device_unbusy)
  <idle>-0     0dns. 1807us : __lock_text_start (scsi_device_unbusy)
  <idle>-0     0dns. 1807us : _spin_unlock_irqrestore (scsi_device_unbusy)
  <idle>-0     0dns. 1808us : sd_rw_intr (scsi_finish_command)
  <idle>-0     0dns. 1808us : scsi_io_completion (sd_rw_intr)
  <idle>-0     0dns. 1808us : scsi_free_sgtable (scsi_io_completion)
  <idle>-0     0dns. 1808us : mempool_free (scsi_free_sgtable)
  <idle>-0     0dns. 1808us : mempool_free_slab (mempool_free)
  <idle>-0     0dns. 1808us : kmem_cache_free (mempool_free_slab)
  <idle>-0     0dns. 1809us : scsi_end_request (scsi_io_completion)
  <idle>-0     0dns. 1809us : end_that_request_chunk (scsi_end_request)
  <idle>-0     0dns. 1809us : __end_that_request_first (end_that_request_chunk)
  <idle>-0     0dns. 1809us : bio_endio (__end_that_request_first)
  <idle>-0     0dns. 1809us : raid1_end_read_request (bio_endio)
  <idle>-0     0dns. 1810us : raid_end_bio_io (raid1_end_read_request)
  <idle>-0     0dns. 1810us : bio_endio (raid_end_bio_io)
  <idle>-0     0dns. 1810us : clone_endio (bio_endio)
  <idle>-0     0dns. 1810us : mempool_free (clone_endio)
  <idle>-0     0dns. 1810us : mempool_free_slab (mempool_free)
  <idle>-0     0dns. 1810us : kmem_cache_free (mempool_free_slab)
  <idle>-0     0dns. 1810us : dec_pending (clone_endio)
  <idle>-0     0dns. 1811us : disk_round_stats (dec_pending)
  <idle>-0     0dns. 1811us : __wake_up (dec_pending)
  <idle>-0     0dns. 1811us : _spin_lock_irqsave (__wake_up)
  <idle>-0     0dns. 1811us : __wake_up_common (__wake_up)
  <idle>-0     0dns. 1811us : _spin_unlock_irqrestore (__wake_up)
  <idle>-0     0dns. 1811us : bio_endio (dec_pending)
  <idle>-0     0dns. 1812us : mpage_end_io_read (bio_endio)
  <idle>-0     0dns. 1812us : unlock_page (mpage_end_io_read)
  <idle>-0     0dns. 1812us : page_waitqueue (unlock_page)
  <idle>-0     0dns. 1812us : __wake_up_bit (unlock_page)
  <idle>-0     0dns. 1812us : unlock_page (mpage_end_io_read)
  <idle>-0     0dns. 1812us : page_waitqueue (unlock_page)
  <idle>-0     0dns. 1813us : __wake_up_bit (unlock_page)
  <idle>-0     0dns. 1813us : __wake_up (__wake_up_bit)
  <idle>-0     0dns. 1813us : _spin_lock_irqsave (__wake_up)
  <idle>-0     0dns. 1813us : __wake_up_common (__wake_up)
  <idle>-0     0dns. 1813us : wake_bit_function (__wake_up_common)
  <idle>-0     0dns. 1813us : autoremove_wake_function (wake_bit_function)
  <idle>-0     0dns. 1813us : default_wake_function (autoremove_wake_function)
  <idle>-0     0dns. 1814us : try_to_wake_up (default_wake_function)
  <idle>-0     0dns. 1814us : __lock_text_start (try_to_wake_up)
  <idle>-0     0dns. 1814us : idle_cpu (try_to_wake_up)
  <idle>-0     0dns. 1814us : activate_task (try_to_wake_up)
  <idle>-0     0dns. 1814us : sched_clock (activate_task)
  <idle>-0     0dns. 1814us : recalc_task_prio (activate_task)
  <idle>-0     0dns. 1815us : effective_prio (recalc_task_prio)
  <idle>-0     0dns. 1815us : activate_task <<...>-3524> (76 2)
  <idle>-0     0dns. 1815us : enqueue_task (activate_task)
  <idle>-0     0dns. 1815us : resched_task (try_to_wake_up)
  <idle>-0     0dns. 1815us : __trace_start_sched_wakeup (try_to_wake_up)
  <idle>-0     0dns. 1816us : __lock_text_start (__trace_start_sched_wakeup)
  <idle>-0     0dns. 1816us : _spin_unlock_irqrestore (try_to_wake_up)
  <idle>-0     0dns. 1816us : _spin_unlock_irqrestore (__wake_up)
  <idle>-0     0dns. 1816us : bio_put (mpage_end_io_read)
  <idle>-0     0dns. 1816us : bio_fs_destructor (bio_put)
  <idle>-0     0dns. 1816us : bio_free (bio_fs_destructor)
  <idle>-0     0dns. 1817us : mempool_free (bio_free)
  <idle>-0     0dns. 1817us : mempool_free_slab (mempool_free)
  <idle>-0     0dns. 1817us : kmem_cache_free (mempool_free_slab)
  <idle>-0     0dns. 1817us : mempool_free (bio_free)
  <idle>-0     0dns. 1817us : mempool_free_slab (mempool_free)
  <idle>-0     0dns. 1818us : kmem_cache_free (mempool_free_slab)
  <idle>-0     0dns. 1818us : mempool_free (dec_pending)
  <idle>-0     0dns. 1818us : mempool_free_slab (mempool_free)
  <idle>-0     0dns. 1818us : kmem_cache_free (mempool_free_slab)
  <idle>-0     0dns. 1818us : bio_put (clone_endio)
  <idle>-0     0dns. 1819us : bio_fs_destructor (bio_put)
  <idle>-0     0dns. 1819us : bio_free (bio_fs_destructor)
  <idle>-0     0dns. 1819us : mempool_free (bio_free)
  <idle>-0     0dns. 1819us : mempool_free_slab (mempool_free)
  <idle>-0     0dns. 1819us : kmem_cache_free (mempool_free_slab)
  <idle>-0     0dns. 1819us : mempool_free (bio_free)
  <idle>-0     0dns. 1820us : mempool_free_slab (mempool_free)
  <idle>-0     0dns. 1820us : kmem_cache_free (mempool_free_slab)
  <idle>-0     0dns. 1820us : allow_barrier (raid_end_bio_io)
  <idle>-0     0dns. 1820us : _spin_lock_irqsave (allow_barrier)
  <idle>-0     0dns. 1820us : _spin_unlock_irqrestore (allow_barrier)
  <idle>-0     0dns. 1821us : __wake_up (allow_barrier)
  <idle>-0     0dns. 1821us : _spin_lock_irqsave (__wake_up)
  <idle>-0     0dns. 1821us : __wake_up_common (__wake_up)
  <idle>-0     0dns. 1821us : _spin_unlock_irqrestore (__wake_up)
  <idle>-0     0dns. 1821us : bio_put (raid_end_bio_io)
  <idle>-0     0dns. 1821us : bio_fs_destructor (bio_put)
  <idle>-0     0dns. 1822us : bio_free (bio_fs_destructor)
  <idle>-0     0dns. 1822us : mempool_free (bio_free)
  <idle>-0     0dns. 1822us : mempool_free_slab (mempool_free)
  <idle>-0     0dns. 1822us : kmem_cache_free (mempool_free_slab)
  <idle>-0     0dns. 1822us : mempool_free (bio_free)
  <idle>-0     0dns. 1822us : mempool_free_slab (mempool_free)
  <idle>-0     0dns. 1823us : kmem_cache_free (mempool_free_slab)
  <idle>-0     0dns. 1823us : mempool_free (raid_end_bio_io)
  <idle>-0     0dns. 1823us : r1bio_pool_free (mempool_free)
  <idle>-0     0dns. 1823us : kfree (r1bio_pool_free)
  <idle>-0     0dns. 1824us : add_disk_randomness (scsi_end_request)
  <idle>-0     0dns. 1824us : add_timer_randomness (add_disk_randomness)
  <idle>-0     0dns. 1824us : _spin_lock_irqsave (scsi_end_request)
  <idle>-0     0dns. 1824us : end_that_request_last (scsi_end_request)
  <idle>-0     0dns. 1824us : disk_round_stats (end_that_request_last)
  <idle>-0     0dns. 1824us : __blk_put_request (end_that_request_last)
  <idle>-0     0dns. 1825us : elv_completed_request (__blk_put_request)
  <idle>-0     0dns. 1825us : cfq_completed_request (elv_completed_request)
  <idle>-0     0dns. 1825us : elv_put_request (__blk_put_request)
  <idle>-0     0dns. 1825us : cfq_put_request (elv_put_request)
  <idle>-0     0dns. 1826us : put_io_context (cfq_put_request)
  <idle>-0     0dns. 1826us : mempool_free (cfq_put_request)
  <idle>-0     0dns. 1826us : mempool_free_slab (mempool_free)
  <idle>-0     0dns. 1826us : kmem_cache_free (mempool_free_slab)
  <idle>-0     0dns. 1826us : cfq_put_queue (cfq_put_request)
  <idle>-0     0dns. 1827us : mempool_free (__blk_put_request)
  <idle>-0     0dns. 1827us : mempool_free_slab (mempool_free)
  <idle>-0     0dns. 1827us : kmem_cache_free (mempool_free_slab)
  <idle>-0     0dns. 1827us : freed_request (__blk_put_request)
  <idle>-0     0dns. 1827us : __freed_request (freed_request)
  <idle>-0     0dns. 1827us : clear_queue_congested (__freed_request)
  <idle>-0     0dns. 1828us : _spin_unlock_irqrestore (scsi_end_request)
  <idle>-0     0dns. 1828us : scsi_next_command (scsi_end_request)
  <idle>-0     0dns. 1828us : get_device (scsi_next_command)
  <idle>-0     0dns. 1828us : kobject_get (get_device)
  <idle>-0     0dns. 1828us : kref_get (kobject_get)
  <idle>-0     0dns. 1829us : scsi_put_command (scsi_next_command)
  <idle>-0     0dns. 1829us : _spin_lock_irqsave (scsi_put_command)
  <idle>-0     0dns. 1829us : __lock_text_start (scsi_put_command)
  <idle>-0     0dns. 1829us : _spin_unlock_irqrestore (scsi_put_command)
  <idle>-0     0dns. 1829us : kmem_cache_free (scsi_put_command)
  <idle>-0     0dns. 1829us : put_device (scsi_put_command)
  <idle>-0     0dns. 1830us : kobject_put (put_device)
  <idle>-0     0dns. 1830us : kref_put (kobject_put)
  <idle>-0     0dns. 1830us : scsi_run_queue (scsi_next_command)
  <idle>-0     0dns. 1830us : _spin_lock_irqsave (scsi_run_queue)
  <idle>-0     0dns. 1830us : _spin_unlock_irqrestore (scsi_run_queue)
  <idle>-0     0dns. 1830us : blk_run_queue (scsi_run_queue)
  <idle>-0     0dns. 1831us : _spin_lock_irqsave (blk_run_queue)
  <idle>-0     0dns. 1831us : blk_remove_plug (blk_run_queue)
  <idle>-0     0dns. 1831us : elv_queue_empty (blk_run_queue)
  <idle>-0     0dns. 1831us : cfq_queue_empty (elv_queue_empty)
  <idle>-0     0dns. 1831us : _spin_unlock_irqrestore (blk_run_queue)
  <idle>-0     0dns. 1831us : put_device (scsi_next_command)
  <idle>-0     0dns. 1831us : kobject_put (put_device)
  <idle>-0     0dns. 1831us : kref_put (kobject_put)
  <idle>-0     0dns. 1832us : scsi_softirq_done (blk_done_softirq)
  <idle>-0     0dns. 1832us : scsi_decide_disposition (scsi_softirq_done)
  <idle>-0     0dns. 1832us : scsi_log_completion (scsi_softirq_done)
  <idle>-0     0dns. 1832us : scsi_finish_command (scsi_softirq_done)
  <idle>-0     0dns. 1832us : scsi_device_unbusy (scsi_finish_command)
  <idle>-0     0dns. 1833us : _spin_lock_irqsave (scsi_device_unbusy)
  <idle>-0     0dns. 1833us : __lock_text_start (scsi_device_unbusy)
  <idle>-0     0dns. 1833us : _spin_unlock_irqrestore (scsi_device_unbusy)
  <idle>-0     0dns. 1833us : sd_rw_intr (scsi_finish_command)
  <idle>-0     0dns. 1833us : scsi_io_completion (sd_rw_intr)
  <idle>-0     0dns. 1833us : scsi_free_sgtable (scsi_io_completion)
  <idle>-0     0dns. 1833us : mempool_free (scsi_free_sgtable)
  <idle>-0     0dns. 1833us : mempool_free_slab (mempool_free)
  <idle>-0     0dns. 1834us : kmem_cache_free (mempool_free_slab)
  <idle>-0     0dns. 1834us : scsi_end_request (scsi_io_completion)
  <idle>-0     0dns. 1834us : end_that_request_chunk (scsi_end_request)
  <idle>-0     0dns. 1834us : __end_that_request_first (end_that_request_chunk)
  <idle>-0     0dns. 1834us : bio_endio (__end_that_request_first)
  <idle>-0     0dns. 1834us : raid1_end_read_request (bio_endio)
  <idle>-0     0dns. 1835us : raid_end_bio_io (raid1_end_read_request)
  <idle>-0     0dns. 1835us : bio_endio (raid_end_bio_io)
  <idle>-0     0dns. 1835us : clone_endio (bio_endio)
  <idle>-0     0dns. 1835us : mempool_free (clone_endio)
  <idle>-0     0dns. 1835us : mempool_free_slab (mempool_free)
  <idle>-0     0dns. 1835us : kmem_cache_free (mempool_free_slab)
  <idle>-0     0dns. 1836us : dec_pending (clone_endio)
  <idle>-0     0dns. 1836us : disk_round_stats (dec_pending)
  <idle>-0     0dns. 1836us : __wake_up (dec_pending)
  <idle>-0     0dns. 1836us : _spin_lock_irqsave (__wake_up)
  <idle>-0     0dns. 1836us : __wake_up_common (__wake_up)
  <idle>-0     0dns. 1837us : _spin_unlock_irqrestore (__wake_up)
  <idle>-0     0dns. 1837us : bio_endio (dec_pending)
  <idle>-0     0dns. 1837us : end_bio_bh_io_sync (bio_endio)
  <idle>-0     0dns. 1837us : end_buffer_read_sync (end_bio_bh_io_sync)
  <idle>-0     0dns. 1837us : unlock_buffer (end_buffer_read_sync)
  <idle>-0     0dns. 1837us : wake_up_bit (unlock_buffer)
  <idle>-0     0dns. 1837us : bit_waitqueue (wake_up_bit)
  <idle>-0     0dns. 1837us : __wake_up_bit (wake_up_bit)
  <idle>-0     0dns. 1838us : __wake_up (__wake_up_bit)
  <idle>-0     0dns. 1838us : _spin_lock_irqsave (__wake_up)
  <idle>-0     0dns. 1838us : __wake_up_common (__wake_up)
  <idle>-0     0dns. 1838us : wake_bit_function (__wake_up_common)
  <idle>-0     0dns. 1838us : autoremove_wake_function (wake_bit_function)
  <idle>-0     0dns. 1838us : default_wake_function (autoremove_wake_function)
  <idle>-0     0dns. 1838us : try_to_wake_up (default_wake_function)
  <idle>-0     0dns. 1839us : __lock_text_start (try_to_wake_up)
  <idle>-0     0dns. 1839us : idle_cpu (try_to_wake_up)
  <idle>-0     0dns. 1839us : activate_task (try_to_wake_up)
  <idle>-0     0dns. 1839us : sched_clock (activate_task)
  <idle>-0     0dns. 1839us : recalc_task_prio (activate_task)
  <idle>-0     0dns. 1840us : effective_prio (recalc_task_prio)
  <idle>-0     0dns. 1840us : activate_task <<...>-3525> (76 3)
  <idle>-0     0dns. 1840us : enqueue_task (activate_task)
  <idle>-0     0dns. 1840us : resched_task (try_to_wake_up)
  <idle>-0     0dns. 1840us : __trace_start_sched_wakeup (try_to_wake_up)
  <idle>-0     0dns. 1840us : __lock_text_start (__trace_start_sched_wakeup)
  <idle>-0     0dns. 1840us : _spin_unlock_irqrestore (try_to_wake_up)
  <idle>-0     0dns. 1841us : _spin_unlock_irqrestore (__wake_up)
  <idle>-0     0dns. 1841us : bio_put (end_bio_bh_io_sync)
  <idle>-0     0dns. 1841us : bio_fs_destructor (bio_put)
  <idle>-0     0dns. 1841us : bio_free (bio_fs_destructor)
  <idle>-0     0dns. 1841us : mempool_free (bio_free)
  <idle>-0     0dns. 1842us : mempool_free_slab (mempool_free)
  <idle>-0     0dns. 1842us : kmem_cache_free (mempool_free_slab)
  <idle>-0     0dns. 1842us : mempool_free (bio_free)
  <idle>-0     0dns. 1842us : mempool_free_slab (mempool_free)
  <idle>-0     0dns. 1842us : kmem_cache_free (mempool_free_slab)
  <idle>-0     0dns. 1842us : mempool_free (dec_pending)
  <idle>-0     0dns. 1842us : mempool_free_slab (mempool_free)
  <idle>-0     0dns. 1843us : kmem_cache_free (mempool_free_slab)
  <idle>-0     0dns. 1843us : bio_put (clone_endio)
  <idle>-0     0dns. 1843us : bio_fs_destructor (bio_put)
  <idle>-0     0dns. 1843us : bio_free (bio_fs_destructor)
  <idle>-0     0dns. 1843us : mempool_free (bio_free)
  <idle>-0     0dns. 1843us : mempool_free_slab (mempool_free)
  <idle>-0     0dns. 1843us : kmem_cache_free (mempool_free_slab)
  <idle>-0     0dns. 1844us : mempool_free (bio_free)
  <idle>-0     0dns. 1844us : mempool_free_slab (mempool_free)
  <idle>-0     0dns. 1844us : kmem_cache_free (mempool_free_slab)
  <idle>-0     0dns. 1844us : allow_barrier (raid_end_bio_io)
  <idle>-0     0dns. 1844us : _spin_lock_irqsave (allow_barrier)
  <idle>-0     0dns. 1844us : _spin_unlock_irqrestore (allow_barrier)
  <idle>-0     0dns. 1844us : __wake_up (allow_barrier)
  <idle>-0     0dns. 1844us : _spin_lock_irqsave (__wake_up)
  <idle>-0     0dns. 1845us : __wake_up_common (__wake_up)
  <idle>-0     0dns. 1845us : _spin_unlock_irqrestore (__wake_up)
  <idle>-0     0dns. 1845us : bio_put (raid_end_bio_io)
  <idle>-0     0dns. 1845us : bio_fs_destructor (bio_put)
  <idle>-0     0dns. 1845us : bio_free (bio_fs_destructor)
  <idle>-0     0dns. 1846us : mempool_free (bio_free)
  <idle>-0     0dns. 1846us : mempool_free_slab (mempool_free)
  <idle>-0     0dns. 1846us : kmem_cache_free (mempool_free_slab)
  <idle>-0     0dns. 1846us : mempool_free (bio_free)
  <idle>-0     0dns. 1846us : mempool_free_slab (mempool_free)
  <idle>-0     0dns. 1846us : kmem_cache_free (mempool_free_slab)
  <idle>-0     0dns. 1847us : mempool_free (raid_end_bio_io)
  <idle>-0     0dns. 1847us : r1bio_pool_free (mempool_free)
  <idle>-0     0dns. 1847us : kfree (r1bio_pool_free)
  <idle>-0     0dns. 1847us : add_disk_randomness (scsi_end_request)
  <idle>-0     0dns. 1847us : add_timer_randomness (add_disk_randomness)
  <idle>-0     0dns. 1848us : _spin_lock_irqsave (scsi_end_request)
  <idle>-0     0dns. 1848us : end_that_request_last (scsi_end_request)
  <idle>-0     0dns. 1848us : disk_round_stats (end_that_request_last)
  <idle>-0     0dns. 1848us : __blk_put_request (end_that_request_last)
  <idle>-0     0dns. 1848us : elv_completed_request (__blk_put_request)
  <idle>-0     0dns. 1849us : cfq_completed_request (elv_completed_request)
  <idle>-0     0dns. 1849us : elv_put_request (__blk_put_request)
  <idle>-0     0dns. 1849us : cfq_put_request (elv_put_request)
  <idle>-0     0dns. 1849us : put_io_context (cfq_put_request)
  <idle>-0     0dns. 1850us : mempool_free (cfq_put_request)
  <idle>-0     0dns. 1850us : mempool_free_slab (mempool_free)
  <idle>-0     0dns. 1850us : kmem_cache_free (mempool_free_slab)
  <idle>-0     0dns. 1850us : cfq_put_queue (cfq_put_request)
  <idle>-0     0dns. 1850us : mempool_free (__blk_put_request)
  <idle>-0     0dns. 1850us : mempool_free_slab (mempool_free)
  <idle>-0     0dns. 1851us : kmem_cache_free (mempool_free_slab)
  <idle>-0     0dns. 1851us : freed_request (__blk_put_request)
  <idle>-0     0dns. 1851us : __freed_request (freed_request)
  <idle>-0     0dns. 1851us : clear_queue_congested (__freed_request)
  <idle>-0     0dns. 1851us : _spin_unlock_irqrestore (scsi_end_request)
  <idle>-0     0dns. 1852us : scsi_next_command (scsi_end_request)
  <idle>-0     0dns. 1852us : get_device (scsi_next_command)
  <idle>-0     0dns. 1852us : kobject_get (get_device)
  <idle>-0     0dns. 1852us : kref_get (kobject_get)
  <idle>-0     0dns. 1852us : scsi_put_command (scsi_next_command)
  <idle>-0     0dns. 1852us : _spin_lock_irqsave (scsi_put_command)
  <idle>-0     0dns. 1853us : __lock_text_start (scsi_put_command)
  <idle>-0     0dns. 1853us : _spin_unlock_irqrestore (scsi_put_command)
  <idle>-0     0dns. 1853us : kmem_cache_free (scsi_put_command)
  <idle>-0     0dns. 1853us : put_device (scsi_put_command)
  <idle>-0     0dns. 1853us : kobject_put (put_device)
  <idle>-0     0dns. 1853us : kref_put (kobject_put)
  <idle>-0     0dns. 1854us : scsi_run_queue (scsi_next_command)
  <idle>-0     0dns. 1854us : _spin_lock_irqsave (scsi_run_queue)
  <idle>-0     0dns. 1854us : _spin_unlock_irqrestore (scsi_run_queue)
  <idle>-0     0dns. 1854us : blk_run_queue (scsi_run_queue)
  <idle>-0     0dns. 1854us : _spin_lock_irqsave (blk_run_queue)
  <idle>-0     0dns. 1854us : blk_remove_plug (blk_run_queue)
  <idle>-0     0dns. 1855us : elv_queue_empty (blk_run_queue)
  <idle>-0     0dns. 1855us : cfq_queue_empty (elv_queue_empty)
  <idle>-0     0dns. 1855us : _spin_unlock_irqrestore (blk_run_queue)
  <idle>-0     0dns. 1855us : put_device (scsi_next_command)
  <idle>-0     0dns. 1855us : kobject_put (put_device)
  <idle>-0     0dns. 1855us : kref_put (kobject_put)
  <idle>-0     0dns. 1856us : tasklet_action (__do_softirq)
  <idle>-0     0dns. 1856us : rcu_process_callbacks (tasklet_action)
  <idle>-0     0dns. 1856us : __rcu_process_callbacks (rcu_process_callbacks)
  <idle>-0     0dns. 1856us : __lock_text_start (__rcu_process_callbacks)
  <idle>-0     0dns. 1856us : rcu_start_batch (__rcu_process_callbacks)
  <idle>-0     0dns. 1857us : __rcu_process_callbacks (rcu_process_callbacks)
  <idle>-0     0dn.. 1857us : __exit_idle (cpu_idle)
  <idle>-0     0dn.. 1857us : notifier_call_chain (__exit_idle)
  <idle>-0     0dn.. 1857us : schedule (cpu_idle)
  <idle>-0     0dn.. 1858us : profile_hit (schedule)
  <idle>-0     0dn.. 1858us : sched_clock (schedule)
  <idle>-0     0dn.. 1858us : _spin_lock_irq (schedule)
  <idle>-0     0dn.. 1858us : recalc_task_prio (schedule)
  <idle>-0     0dn.. 1859us : effective_prio (recalc_task_prio)
  <idle>-0     0dn.. 1859us : requeue_task (schedule)
  <idle>-0     0d... 1859us : __kprobes_text_start (thread_return)
   <...>-3032  0d... 1860us : thread_return <<idle>-0> (8c 69)
   <...>-3032  0d... 1860us : trace_stop_sched_switched (thread_return)
   <...>-3032  0d... 1860us : __lock_text_start (trace_stop_sched_switched)
   <...>-3032  0d... 1860us : trace_stop_sched_switched <<...>-3032> (69 0)
   <...>-3032  0d... 1861us : thread_return (thread_return)


vim:ft=help


* Re: libata/sata_nv latency on NVIDIA CK804 [was Re: AMD64 X2 lost ticks on PM timer]
  2006-03-16 16:57                                             ` Bill Rugolsky Jr.
@ 2006-03-22 16:09                                               ` Andi Kleen
  2006-03-22 18:39                                                 ` Bill Rugolsky Jr.
  2006-03-22 23:07                                                 ` Bill Rugolsky Jr.
  0 siblings, 2 replies; 60+ messages in thread
From: Andi Kleen @ 2006-03-22 16:09 UTC (permalink / raw)
  To: Bill Rugolsky Jr.
  Cc: Alan Cox, Ingo Molnar, Lee Revell, Jeff Garzik, Jason Baron,
	linux-kernel, john stultz

On Thursday 16 March 2006 17:57, Bill Rugolsky Jr. wrote:
> On Thu, Mar 16, 2006 at 03:13:39PM +0000, Alan Cox wrote:
> > On Mer, 2006-03-15 at 23:00 +0100, Ingo Molnar wrote:
> > > so my guess would be that this device doesn't do MMIO, and the PIO inb() 
> > > causes some bad BIOS-based SMM handler/emulator to trigger, which takes 
> > > 16.6 msecs. If indeed the device is not in MMIO mode, is there a way to 
> > > force it into MMIO mode, to test this theory?
> > 
> > There is a much more reliable way to check this. Use the profiling
> > registers to check the instruction issue count before/after the I/O and
> > you'll know if it's something like SMM or just a bus stall.
> 
> 
> Brilliant [as usual 8-)].
> 
> 
> So I imagine that the thing to do is just insert before/after
> rdmsr(MSR_K7_PERFCTR[0123]) into the code, with a suitable printk(),
> and then program the counters with oprofile to use large event
> counts (lasting seconds)? 


perfctr0 is already programmed. You can just use rdpmc on it
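
As a rough sketch of the before/after comparison discussed above: once two
counter samples are in hand (for example from rdpmc and rdtsc), the
arithmetic is a wrap-safe difference and a ratio. The helper names are
hypothetical, and the 48-bit counter width is an assumption about
K7/K8-class performance counters. Reading a near-zero instruction count as
a bus stall, and a large one as code running in the gap (such as an SMM
handler), is an interpretation that further assumes the counter keeps
running in SMM, which this thread does not confirm.

```c
#include <stdint.h>

/* Assumed counter width for K7/K8-class PMCs; adjust for other CPUs. */
#define PMC_BITS 48
#define PMC_MASK ((UINT64_C(1) << PMC_BITS) - 1)

/* Difference between two raw counter samples, tolerating one wraparound. */
static uint64_t pmc_delta(uint64_t before, uint64_t after)
{
	return (after - before) & PMC_MASK;
}

/* Instructions counted per 1000 cycles over the measured window. */
static uint64_t instr_per_kcycle(uint64_t instr_delta, uint64_t cycle_delta)
{
	if (cycle_delta == 0)
		return 0;	/* empty window: nothing to report */
	return instr_delta * 1000 / cycle_delta;
}
```

Sampling both counters immediately before and after the inb() and passing
the deltas to instr_per_kcycle() gives one number per I/O to log.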

Also my latest patchkit has a debugging patch for lost ticks

ftp://ftp.firstfloor.org/pub/ak/x86_64/quilt/patches/lost-cli-debug

Can you test it with this patch? 

I'm still not quite convinced you're barking up the right tree
with these latency traces because it doesn't match the symptoms.

If some particular critical section took too long, the
interrupt should occur at its STI or POPF or one instruction after it.
But they happen on STIs that are not related to any critical section.
This looks more like a lost CLI to me.

>   <idle>-0     0dns. 1855us : _spin_unlock_irqrestore (blk_run_queue)

This is very long, but still less than a tick (assuming HZ=250).
I guess it would be a good idea to add some code to split this up
and enable interrupts more often, but that probably won't fix
your problem.
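
The tick comparison above is simple arithmetic and can be sketched as
follows. The function names are hypothetical; HZ=250 is the value assumed
above, and HZ=1000 is included only for comparison. With a 1855us
interrupts-off window, HZ=250 gives a 4000us tick, so no whole tick fits
inside the window; HZ=1000 gives a 1000us tick, so one whole tick does.

```c
/* Length of one timer tick in microseconds; hz must be 1..1000000. */
static unsigned long tick_period_us(unsigned long hz)
{
	return 1000000UL / hz;
}

/* Number of whole tick periods that fit inside a window of latency_us. */
static unsigned long whole_ticks_covered(unsigned long latency_us,
					 unsigned long hz)
{
	return latency_us / tick_period_us(hz);
}
```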

-Andi


* Re: libata/sata_nv latency on NVIDIA CK804 [was Re: AMD64 X2 lost ticks on PM timer]
  2006-03-22 16:09                                               ` Andi Kleen
@ 2006-03-22 18:39                                                 ` Bill Rugolsky Jr.
  2006-03-22 23:07                                                 ` Bill Rugolsky Jr.
  1 sibling, 0 replies; 60+ messages in thread
From: Bill Rugolsky Jr. @ 2006-03-22 18:39 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Alan Cox, Ingo Molnar, Lee Revell, Jeff Garzik, Jason Baron,
	linux-kernel, john stultz

On Wed, Mar 22, 2006 at 05:09:08PM +0100, Andi Kleen wrote:
> perfctr0 is already programmed. You can just use rdpmc on it
> 
> Also my latest patchkit has a debugging patch for lost ticks
> 
> ftp://ftp.firstfloor.org/pub/ak/x86_64/quilt/patches/lost-cli-debug

Excellent.

> Can you test it with this patch? 
 
I'll give it a try this evening; the test box is an
FC5 upgrade guinea pig at the moment, and only half-migrated.

> I'm still not quite convinced you're barking up the right tree
> with these latency traces because it doesn't match the symptoms.

Yes, I agree that the latency issue is not due to the inb(),
as I get lost ticks almost continuously at 1 kHz, and the inb()-related
latency occurs with much lower frequency.

However, a 17ms inb() latency is rather dismal, and I need to
address that hardware issue, perhaps by purchasing a 4-port card.

I have not been chasing the other issue because the test box appears to
keep perfect time with the -rt patch (on SMP) and NTP doesn't complain at
all; I haven't yet had the chance to determine whether that is entirely
due to John Stultz's new code, or if the preemption is responsible.

	-Bill



* Re: libata/sata_nv latency on NVIDIA CK804 [was Re: AMD64 X2 lost ticks on PM timer]
  2006-03-22 16:09                                               ` Andi Kleen
  2006-03-22 18:39                                                 ` Bill Rugolsky Jr.
@ 2006-03-22 23:07                                                 ` Bill Rugolsky Jr.
  1 sibling, 0 replies; 60+ messages in thread
From: Bill Rugolsky Jr. @ 2006-03-22 23:07 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Alan Cox, Ingo Molnar, Lee Revell, Jeff Garzik, Jason Baron,
	linux-kernel, john stultz

On Wed, Mar 22, 2006 at 05:09:08PM +0100, Andi Kleen wrote:
> Also my latest patchkit has a debugging patch for lost ticks
> 
> ftp://ftp.firstfloor.org/pub/ak/x86_64/quilt/patches/lost-cli-debug
> 
> Can you test it with this patch? 

It didn't apply cleanly against vanilla 2.6.16; rediffed patch below.

Typical output:

time.c: Lost 12 timer tick(s)! rip 10:__do_softirq+0x4b/0xdf
last cli handle_IRQ_event+0x62/0x71
last cli caller __do_IRQ+0xa6/0x104
time.c: Lost 5 timer tick(s)! rip 10:__do_softirq+0x4b/0xdf
last cli handle_IRQ_event+0x62/0x71
last cli caller __do_IRQ+0xa6/0x104
time.c: Lost 4 timer tick(s)! rip 10:__do_softirq+0x4b/0xdf
last cli handle_IRQ_event+0x62/0x71
last cli caller __do_IRQ+0xa6/0x104
time.c: Lost 8 timer tick(s)! rip 10:__do_softirq+0x4b/0xdf
last cli handle_IRQ_event+0x62/0x71
last cli caller __do_IRQ+0xa6/0x104
time.c: Lost 3 timer tick(s)! rip 10:__do_softirq+0x4b/0xdf
last cli handle_IRQ_event+0x62/0x71
last cli caller __do_IRQ+0xa6/0x104
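
For anyone who wants to total the damage from output like the above, here
is a minimal sketch of a parser for the "Lost N timer tick(s)!" lines. The
function names are hypothetical, and the only assumption about the log
format is the message text shown above, optionally preceded by a syslog
prefix.

```c
#include <stdio.h>
#include <string.h>

/* Returns N from a "Lost N timer tick(s)!" line, or -1 for other lines. */
static int parse_lost_ticks(const char *line)
{
	const char *p = strstr(line, "Lost ");
	int n = 0;
	int end = 0;	/* set by %n only if the whole pattern matched */

	if (p != NULL &&
	    sscanf(p, "Lost %d timer tick%n", &n, &end) == 1 && end > 0)
		return n;
	return -1;
}

/* Sums the tick counts over an array of log lines, skipping other lines. */
static long total_lost_ticks(const char *const *lines, size_t count)
{
	long total = 0;
	size_t i;

	for (i = 0; i < count; i++) {
		int n = parse_lost_ticks(lines[i]);

		if (n > 0)
			total += n;
	}
	return total;
}
```

Fed the five "Lost" lines above, total_lost_ticks() returns 32; the
"last cli" lines are skipped.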

Some statistics:

rugolsky@ti94: grep 'last cli' /var/log/messages | sed -n -e '1p;$p'
Mar 22 17:37:06 ti94 kernel: last cli 0x0
Mar 22 17:43:48 ti94 kernel: last cli caller __do_IRQ+0xa6/0x104

rugolsky@ti94: grep 'last cli' /var/log/messages | cut -d' ' -f6- | sort | uniq -c | sort -nr
    782 last cli handle_IRQ_event+0x62/0x71
    782 last cli caller __do_IRQ+0xa6/0x104
      2 last cli _spin_lock_irqsave+0x16/0x27
      1 last cli setup_boot_APIC_clock+0x48/0x12e
      1 last cli pci_direct_init+0x47/0x190
      1 last cli kmem_cache_free+0x1d/0x62
      1 last cli caller smp_prepare_cpus+0x36a/0x394
      1 last cli caller release_console_sem+0x1a/0x1c9
      1 last cli caller init+0x1c3/0x338
      1 last cli caller acpi_os_release_object+0x9/0xd
      1 last cli caller __up_read+0x19/0x9e
      1 last cli caller 0x0
      1 last cli 0x0

Thanks.

	-Bill


--- linux-2.6.16/arch/x86_64/kernel/time.c.lost-cli-debug	2006-03-20 00:53:29.000000000 -0500
+++ linux-2.6.16/arch/x86_64/kernel/time.c	2006-03-22 17:09:14.000000000 -0500
@@ -43,7 +43,7 @@
 #endif
 
 #ifdef CONFIG_CPU_FREQ
-static void cpufreq_delayed_get(void);
+static void cpufreq_delayed_get(struct pt_regs *);
 #endif
 extern void i8254_timer_resume(void);
 extern int using_apic_timer;
@@ -63,7 +63,7 @@ static unsigned long hpet_period;			/* f
 unsigned long hpet_tick;				/* HPET clocks / interrupt */
 int hpet_use_timer;				/* Use counter of hpet for time keeping, otherwise PIT */
 unsigned long vxtime_hz = PIT_TICK_RATE;
-int report_lost_ticks;				/* command line option */
+int report_lost_ticks = 1;			/* command line option */
 unsigned long long monotonic_base;
 
 struct vxtime_data __vxtime __section_vxtime;	/* for vsyscalls */
@@ -309,15 +309,20 @@ unsigned long long monotonic_clock(void)
 }
 EXPORT_SYMBOL(monotonic_clock);
 
+extern unsigned long last_clier, last_clier_caller;
+
 static noinline void handle_lost_ticks(int lost, struct pt_regs *regs)
 {
     static long lost_count;
     static int warned;
 
     if (report_lost_ticks) {
-	    printk(KERN_WARNING "time.c: Lost %d timer "
-		   "tick(s)! ", lost);
-	    print_symbol("rip %s)\n", regs->rip);
+		printk(KERN_WARNING 
+			"time.c: Lost %d timer tick(s)! rip %02lx", 
+			lost, regs->cs);
+		print_symbol(":%s\n", regs->rip);
+		print_symbol("last cli %s\n", last_clier);
+		print_symbol("last cli caller %s\n", last_clier_caller);
     }
 
     if (lost_count == 1000 && !warned) {
@@ -345,7 +350,7 @@ static noinline void handle_lost_ticks(i
        (like going into thermal throttle)
        Give cpufreq a change to catch up. */
     if ((lost_count+1) % 25 == 0) {
-	    cpufreq_delayed_get();
+	    cpufreq_delayed_get(regs);
     }
 #endif
 }
@@ -599,14 +604,15 @@ static void handle_cpufreq_delayed_get(v
  * to verify the CPU frequency the timing core thinks the CPU is running
  * at is still correct.
  */
-static void cpufreq_delayed_get(void)
+static void cpufreq_delayed_get(struct pt_regs *regs)
 {
 	static int warned;
 	if (cpufreq_init && !cpufreq_delayed_issched) {
 		cpufreq_delayed_issched = 1;
 		if (!warned) {
 			warned = 1;
-			printk(KERN_DEBUG "Losing some ticks... checking if CPU frequency changed.\n");
+			print_symbol(KERN_DEBUG 
+	"Losing some ticks... checking if CPU frequency changed (rip=%s)\n", regs->rip);
 		}
 		schedule_work(&cpufreq_delayed_get_work);
 	}
--- linux-2.6.16/arch/x86_64/kernel/process.c.lost-cli-debug	2006-03-20 00:53:29.000000000 -0500
+++ linux-2.6.16/arch/x86_64/kernel/process.c	2006-03-22 17:12:01.000000000 -0500
@@ -841,3 +841,15 @@ unsigned long arch_align_stack(unsigned 
 		sp -= get_random_int() % 8192;
 	return sp & ~0xf;
 }
+
+unsigned long last_clier, last_clier_caller;
+ 
+void __local_irq_disable(void *caller)
+{
+	if (!irqs_disabled()) {
+		last_clier = (unsigned long)__builtin_return_address(0);
+		last_clier_caller = (unsigned long)caller;
+ 		asm volatile("cli":::"memory"); 
+	}
+}
+EXPORT_SYMBOL(__local_irq_disable);
--- linux-2.6.16/include/asm-x86_64/system.h.lost-cli-debug	2006-03-20 00:53:29.000000000 -0500
+++ linux-2.6.16/include/asm-x86_64/system.h	2006-03-22 17:01:50.000000000 -0500
@@ -351,7 +351,10 @@ static inline unsigned long __cmpxchg(vo
 /* For spinlocks etc */
 #define local_irq_save(x)	do { local_save_flags(x); local_irq_restore((x & ~(1 << 9)) | (1 << 18)); } while (0)
 #else  /* CONFIG_X86_VSMP */
-#define local_irq_disable() 	__asm__ __volatile__("cli": : :"memory")
+
+extern void __local_irq_disable(void *caller);
+#define local_irq_disable() __local_irq_disable(__builtin_return_address(0))
+
 #define local_irq_enable()	__asm__ __volatile__("sti": : :"memory")
 
 #define irqs_disabled()			\
@@ -362,7 +365,7 @@ static inline unsigned long __cmpxchg(vo
 })
 
 /* For spinlocks etc */
-#define local_irq_save(x) 	do { warn_if_not_ulong(x); __asm__ __volatile__("# local_irq_save \n\t pushfq ; popq %0 ; cli":"=g" (x): /* no input */ :"memory"); } while (0)
+#define local_irq_save(x) 	do { warn_if_not_ulong(x); __asm__ __volatile__("# local_irq_save \n\t pushfq ; popq %0":"=g" (x): /* no input */ :"memory"); local_irq_disable(); } while (0)
 #endif
 
 /* used in the idle loop; sti takes one instruction cycle to complete */
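
The bookkeeping done by the patch above can be tried out in user space.
This is a minimal sketch, not the kernel code: irqs_off is a plain
variable standing in for the CPU interrupt flag, and the fake_* names are
hypothetical. Like the patch, it relies on the GCC/Clang builtin
__builtin_return_address(0) and only records a site when interrupts were
enabled beforehand, so nested disables do not overwrite the outermost one.

```c
#include <stdbool.h>
#include <stdint.h>

static bool irqs_off;			/* stand-in for the interrupt flag */
static uintptr_t last_clier;		/* site that called the disable helper */
static uintptr_t last_clier_caller;	/* address supplied by that site */

/* Record who turned "interrupts" off, but only on an on -> off transition. */
__attribute__((noinline))
void fake_local_irq_disable(void *caller)
{
	if (!irqs_off) {
		last_clier = (uintptr_t)__builtin_return_address(0);
		last_clier_caller = (uintptr_t)caller;
		irqs_off = true;
	}
}

/* Re-enable "interrupts" so the next disable is recorded again. */
void fake_local_irq_enable(void)
{
	irqs_off = false;
}
```

A call site would pass its own __builtin_return_address(0) as the
argument, mirroring the local_irq_disable() macro in the patch.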


* Re: AMD64 X2 lost ticks on PM timer
  2006-03-06 17:37 [RFC] Encrypting file system Michael Halcrow
@ 2006-03-06 21:36 ` Timo Schroeter
  0 siblings, 0 replies; 60+ messages in thread
From: Timo Schroeter @ 2006-03-06 21:36 UTC (permalink / raw)
  To: linux-kernel

Hi,

I have the same problem with my Tyan K8E board. I've connected two WD Raptor
drives to the onboard SATA ports (nForce 4, RAID0 md). My server hung during
the night; after rebooting and checking the log files I found the same
messages:

time.c: Lost 141 timer tick(s)! rip poll_idle+0x14/0x19)
time.c: Lost 127 timer tick(s)! rip poll_idle+0xa/0x19)
time.c: Lost 92 timer tick(s)! rip poll_idle+0x14/0x19)
time.c: Lost 64 timer tick(s)! rip poll_idle+0xa/0x19)

I don't use the onboard nForce4 NIC but the Broadcom one.

I ran ./trtc with the following results:

1141680789:835638: rtc 256 int 0 0 (=0)
1141680790:104663: rtc 464 int 269 0 (=269)
1141680790:604729: rtc 448 int 501 0 (=501)
1141680791:104795: rtc 464 int 500 0 (=500)
1141680791:604862: rtc 448 int 500 0 (=500)
1141680792:104927: rtc 464 int 500 0 (=500)
1141680792:604994: rtc 448 int 500 0 (=500)
1141680793:105060: rtc 464 int 500 0 (=500)
1141680793:605126: rtc 448 int 500 0 (=500)
1141680794:105192: rtc 464 int 501 0 (=501)
1141680794:605259: rtc 448 int 500 0 (=500)
1141680795:105326: rtc 464 int 500 0 (=500)
1141680795:605392: rtc 448 int 500 0 (=500)
1141680796:105458: rtc 464 int 500 0 (=500)
1141680796:605525: rtc 448 int 500 0 (=500)
1141680797:105592: rtc 464 int 500 0 (=500)
1141680797:605658: rtc 448 int 501 0 (=501)
1141680798:105725: rtc 464 int 500 0 (=500)

I wonder if my server froze because of this error. If there is no
fix soon, I think it's better to get a PCI Express SATA2 card :(

Regards,

Timo Schroeter



end of thread, other threads:[~2006-03-22 23:07 UTC | newest]

Thread overview: 60+ messages
2006-02-27 21:22 AMD64 X2 lost ticks on PM timer bubshait
2006-02-27 22:21 ` Bill Rugolsky Jr.
2006-02-27 22:47   ` Jason Baron
2006-02-28  7:41     ` Abdulla Bubshait
2006-02-28 22:00       ` Bill Rugolsky Jr.
2006-02-28 23:53         ` Andi Kleen
2006-03-01 14:46           ` Bill Rugolsky Jr.
2006-03-01 14:56             ` Andi Kleen
2006-03-01 15:43               ` Bill Rugolsky Jr.
2006-03-01 15:47                 ` Andi Kleen
2006-03-01 18:07                   ` Bill Rugolsky Jr.
2006-03-01 18:29                     ` Andi Kleen
2006-03-01 19:16                       ` Lee Revell
2006-03-03 19:18                         ` Bill Rugolsky Jr.
2006-03-03 21:26                           ` Lee Revell
2006-03-03 22:09                             ` Jeff Garzik
2006-03-03 23:43                               ` Bill Rugolsky Jr.
2006-03-03 23:46                                 ` Jeff Garzik
2006-03-03 23:49                                   ` Lee Revell
2006-03-04  0:08                                   ` Andi Kleen
2006-03-04  0:07                                 ` Andi Kleen
     [not found]                                   ` <20060315213638.GA17817@ti64.telemetry-investments.com>
2006-03-15 21:45                                     ` libata/sata_nv latency on NVIDIA CK804 [was Re: AMD64 X2 lost ticks on PM timer] Lee Revell
2006-03-15 21:58                                       ` Ingo Molnar
2006-03-15 22:00                                         ` Ingo Molnar
2006-03-15 22:25                                           ` Jeff Garzik
2006-03-16 15:13                                           ` Alan Cox
2006-03-16 16:57                                             ` Bill Rugolsky Jr.
2006-03-22 16:09                                               ` Andi Kleen
2006-03-22 18:39                                                 ` Bill Rugolsky Jr.
2006-03-22 23:07                                                 ` Bill Rugolsky Jr.
2006-03-15 22:22                                         ` Jeff Garzik
2006-03-15 22:24                                           ` Ingo Molnar
2006-03-15 22:36                                             ` Bill Rugolsky Jr.
2006-03-15 22:46                                               ` Ingo Molnar
2006-03-15 22:48                                               ` Jeff Garzik
2006-03-15 23:31                                                 ` Lee Revell
2006-03-15 21:50                                     ` Ingo Molnar
2006-03-15 22:11                                       ` Ingo Molnar
2006-03-15 22:33                                         ` Jeff Garzik
2006-03-15 22:44                                           ` Ingo Molnar
2006-03-15 22:50                                             ` Jeff Garzik
2006-03-15 23:14                                               ` Bill Rugolsky Jr.
2006-03-15 23:44                                                 ` Lee Revell
     [not found]                                                   ` <20060316002133.GE17817@ti64.telemetry-investments.com>
2006-03-16  0:48                                                     ` Long latencies with MD RAID 1 [was Re: libata/sata_nv latency on NVIDIA CK804 ] Lee Revell
2006-03-16  3:15                                                 ` libata/sata_nv latency on NVIDIA CK804 [was Re: AMD64 X2 lost ticks on PM timer] Bill Rugolsky Jr.
2006-03-16  4:20                                                   ` Lee Revell
2006-03-16  9:18                                                     ` Ingo Molnar
2006-03-16 14:42                                                     ` Gabor Gombas
2006-03-16  0:01                                               ` Lee Revell
2006-03-16  0:14                                                 ` Jeff Garzik
2006-03-15 22:30                                       ` Jeff Garzik
2006-03-15 22:36                                         ` Ingo Molnar
2006-03-15 22:04                                     ` [patch] latency-tracing-v2.6.16.patch Ingo Molnar
2006-03-15 22:32                                       ` Bill Rugolsky Jr.
2006-03-16  9:18                                         ` Ingo Molnar
2006-03-04 12:06                                 ` AMD64 X2 lost ticks on PM timer Martin Schlemmer
2006-03-05  7:07                                   ` Alexander Samad
2006-03-02 15:47                 ` Gabor Gombas
2006-02-28 21:17     ` Abdulla Bubshait
2006-03-06 17:37 [RFC] Encrypting file system Michael Halcrow
2006-03-06 21:36 ` AMD64 X2 lost ticks on PM timer Timo Schroeter
