linux-kernel.vger.kernel.org archive mirror
* [PATCH] Preemption Latency Measurement Tool
@ 2001-09-19 22:44 ` Robert Love
  2001-09-20  1:40   ` Ignacio Vazquez-Abrams
                     ` (8 more replies)
  0 siblings, 9 replies; 55+ messages in thread
From: Robert Love @ 2001-09-19 22:44 UTC (permalink / raw)
  To: linux-kernel; +Cc: ozone, safemode, Dieter.Nuetzel, iafilius, ilsensine, george

Available at:
http://tech9.net/rml/linux/patch-rml-2.4.9-ac12-preempt-stats-1 and
http://tech9.net/rml/linux/patch-rml-2.4.10-pre12-preempt-stats-1
for 2.4.9-ac12 and 2.4.10-pre12, respectively.

This patch is provided thanks to MontaVista (http://mvista.com).

This patch enables a new kernel configure option, CONFIG_PREEMPT_TIMES,
which, once enabled, instructs the kernel preemption code to monitor
in-kernel latencies due to the various locking primitives and report the
20 worst recorded cases to /proc/latencytimes.

The patch obviously requires the preemption patch, available at
http://tech9.net/rml/linux

The patch won't start recording (I will change this...) until you read
from /proc/latencytimes once.  From then on, each read will return the
20 worst cases and reset the log.  Nonetheless, you don't want this in
the kernel if you don't plan to use it, and I do _not_ want it
enabled during benchmarks.

The proc interface is fairly verbose.  It gives you a measurement of
the latency, what caused it, and the line number and filename where the
lock (or whatever) began and where it ended.
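
For example, a minimal reader could look like this (a hypothetical
user-space sketch, not part of the patch): the first read starts the
recording, and each later read dumps the 20 worst cases and resets the
log.

	#include <stdio.h>
	#include <unistd.h>

	int main(void)
	{
		char buf[4096];
		size_t n;
		FILE *f;

		for (;;) {
			f = fopen("/proc/latencytimes", "r");
			if (!f) {
				perror("/proc/latencytimes");
				return 1;
			}
			/* each read returns the 20 worst cases and
			 * resets the log */
			while ((n = fread(buf, 1, sizeof buf, f)) > 0)
				fwrite(buf, 1, n, stdout);
			fclose(f);
			sleep(60);	/* sample once a minute */
		}
	}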

The point is to track down long held locks and other problems that are
causing poor response.

Thus, most of you on the CC have noticed certain situations wherein even
with preemption you see high latencies (for most of you, with mp3
playback).  I ask that, if you get the chance, you play with this patch
and measure some latencies.  See where the bad values lie.  Get a feeling
for your system... most of your latencies should be very small.  Right
now, I just read a sampling of my 8000 worst cases and the worst was
600us -- this is not bad.  Things over 5000us are interesting, especially
if consistent.

I appreciate any comments.  CC me and the list.  Enjoy.

-- 
Robert M. Love
rml at ufl.edu
rml at tech9.net



* Re: [PATCH] Preemption Latency Measurement Tool
  2001-09-20  2:23     ` safemode
@ 2001-09-20  1:13       ` David Lang
  2001-09-20  2:57         ` Robert Love
  0 siblings, 1 reply; 55+ messages in thread
From: David Lang @ 2001-09-20  1:13 UTC (permalink / raw)
  To: safemode; +Cc: Ignacio Vazquez-Abrams, linux-kernel

Also, how useful is the latency info once swapping starts? If you have
something that gets swapped out, it will have horrible latency and skew
the results (or at the very least you won't know whether the problem is
disk I/O or something fixable).

David Lang

 On Wed, 19 Sep 2001, safemode wrote:

> Date: Wed, 19 Sep 2001 22:23:21 -0400
> From: safemode <safemode@speakeasy.net>
> To: Ignacio Vazquez-Abrams <ignacio@openservices.net>,
>      linux-kernel@vger.kernel.org
> Subject: Re: [PATCH] Preemption Latency Measurement Tool
>
> Is it possible to run the 20 worst latency thing without having the preempt
> patch?  I'd like to see how things run without the preempt patch and which
> ones are the slowest.  The last time I used the preempt patch I had worse
> audio performance than not.  Perhaps that's because I use ALSA.
>
>
>
> On Wednesday 19 September 2001 21:40, Ignacio Vazquez-Abrams wrote:
> > I'm posting this in the hope that someone will find it useful.
> >
> > Worst 20 latency times of 4647 measured in this period.
> >   usec      cause     mask   start line/file      address   end line/file
> >  35735  spin_lock        1   358/memory.c        c012c74a   379/memory.c
> >  14003  spin_lock        1   341/vmscan.c        c013977b   362/vmscan.c
> >  13701  spin_lock        1  1376/sched.c         c0118ef9   697/sched.c
> >  12533        BKL        7  2689/buffer.c        c0149acc   697/sched.c
> >  12459  spin_lock        7   547/sched.c         c0116ca3   697/sched.c
> >  10479  spin_lock        1   703/vmscan.c        c013a213   685/vmscan.c
> >  10443  spin_lock        1   669/inode.c         c015e8dd   696/inode.c
> >   9163   reacqBKL        1  1375/sched.c         c0118edb  1381/sched.c
> >   8727        BKL        9   164/inode.c         c0171885   188/inode.c
> >   8066   usb-uhci        d    76/softirq.c       c0120cf2   119/softirq.c
> >   8041        BKL        9   130/attr.c          c015fbd4   143/attr.c
> >   7982  spin_lock        8   547/sched.c         c0116ca3   204/namei.c
> >   6920   reacqBKL        5  1375/sched.c         c0118edb   697/sched.c
> >   6720  spin_lock        0  1376/sched.c         c0118ef9  1380/sched.c
> >   6567  spin_lock        0   661/vmscan.c        c013a0f2   764/vmscan.c
> >   6365  spin_lock        0   703/vmscan.c        c013a213   764/vmscan.c
> >   5680        BKL        1   452/exit.c          c011f6c6   697/sched.c
> >   5646  spin_lock        0   227/inode.c         c015df35   696/inode.c
> >   5428  spin_lock        1   710/inode.c         c015ea08   696/inode.c
> >   4933  spin_lock        0   547/sched.c         c0116ca3  1380/sched.c
> >
> > This was taken after compiling several packages and running dbench a few
> > times.
> >
> > I also ran it previously while running 'while true ; do python -c
> > range\(3e7\) ; done', and the first line had minimum latencies of 10000,
> > going as high as about 96000 when swapping.
> >
> > This is on an Athlon 1050/100 Asus A7V (KT133) with PC133 CL2 SDRAM running
> > 2.4.9-ac12-preempt1.


* Re: [PATCH] Preemption Latency Measurement Tool
  2001-09-19 22:44 ` [PATCH] Preemption Latency Measurement Tool Robert Love
@ 2001-09-20  1:40   ` Ignacio Vazquez-Abrams
  2001-09-20  2:23     ` safemode
  2001-09-20  2:38     ` Robert Love
  2001-09-20  6:31   ` Dieter Nützel
                     ` (7 subsequent siblings)
  8 siblings, 2 replies; 55+ messages in thread
From: Ignacio Vazquez-Abrams @ 2001-09-20  1:40 UTC (permalink / raw)
  To: linux-kernel

I'm posting this in the hope that someone will find it useful.

Worst 20 latency times of 4647 measured in this period.
  usec      cause     mask   start line/file      address   end line/file
 35735  spin_lock        1   358/memory.c        c012c74a   379/memory.c
 14003  spin_lock        1   341/vmscan.c        c013977b   362/vmscan.c
 13701  spin_lock        1  1376/sched.c         c0118ef9   697/sched.c
 12533        BKL        7  2689/buffer.c        c0149acc   697/sched.c
 12459  spin_lock        7   547/sched.c         c0116ca3   697/sched.c
 10479  spin_lock        1   703/vmscan.c        c013a213   685/vmscan.c
 10443  spin_lock        1   669/inode.c         c015e8dd   696/inode.c
  9163   reacqBKL        1  1375/sched.c         c0118edb  1381/sched.c
  8727        BKL        9   164/inode.c         c0171885   188/inode.c
  8066   usb-uhci        d    76/softirq.c       c0120cf2   119/softirq.c
  8041        BKL        9   130/attr.c          c015fbd4   143/attr.c
  7982  spin_lock        8   547/sched.c         c0116ca3   204/namei.c
  6920   reacqBKL        5  1375/sched.c         c0118edb   697/sched.c
  6720  spin_lock        0  1376/sched.c         c0118ef9  1380/sched.c
  6567  spin_lock        0   661/vmscan.c        c013a0f2   764/vmscan.c
  6365  spin_lock        0   703/vmscan.c        c013a213   764/vmscan.c
  5680        BKL        1   452/exit.c          c011f6c6   697/sched.c
  5646  spin_lock        0   227/inode.c         c015df35   696/inode.c
  5428  spin_lock        1   710/inode.c         c015ea08   696/inode.c
  4933  spin_lock        0   547/sched.c         c0116ca3  1380/sched.c

This was taken after compiling several packages and running dbench a few
times.

I also ran it previously while running 'while true ; do python -c range\(3e7\)
; done', and the first line had minimum latencies of 10000, going as high as
about 96000 when swapping.

This is on an Athlon 1050/100 Asus A7V (KT133) with PC133 CL2 SDRAM running
2.4.9-ac12-preempt1.

-- 
Ignacio Vazquez-Abrams  <ignacio@openservices.net>



* Re: [PATCH] Preemption Latency Measurement Tool
  2001-09-20  1:40   ` Ignacio Vazquez-Abrams
@ 2001-09-20  2:23     ` safemode
  2001-09-20  1:13       ` David Lang
  2001-09-20  2:38     ` Robert Love
  1 sibling, 1 reply; 55+ messages in thread
From: safemode @ 2001-09-20  2:23 UTC (permalink / raw)
  To: Ignacio Vazquez-Abrams, linux-kernel

Is it possible to run the 20 worst latency thing without having the preempt
patch?  I'd like to see how things run without the preempt patch and which
ones are the slowest.  The last time I used the preempt patch I had worse
audio performance than not.  Perhaps that's because I use ALSA.



On Wednesday 19 September 2001 21:40, Ignacio Vazquez-Abrams wrote:
> I'm posting this in the hope that someone will find it useful.
>
> Worst 20 latency times of 4647 measured in this period.
>   usec      cause     mask   start line/file      address   end line/file
>  35735  spin_lock        1   358/memory.c        c012c74a   379/memory.c
>  14003  spin_lock        1   341/vmscan.c        c013977b   362/vmscan.c
>  13701  spin_lock        1  1376/sched.c         c0118ef9   697/sched.c
>  12533        BKL        7  2689/buffer.c        c0149acc   697/sched.c
>  12459  spin_lock        7   547/sched.c         c0116ca3   697/sched.c
>  10479  spin_lock        1   703/vmscan.c        c013a213   685/vmscan.c
>  10443  spin_lock        1   669/inode.c         c015e8dd   696/inode.c
>   9163   reacqBKL        1  1375/sched.c         c0118edb  1381/sched.c
>   8727        BKL        9   164/inode.c         c0171885   188/inode.c
>   8066   usb-uhci        d    76/softirq.c       c0120cf2   119/softirq.c
>   8041        BKL        9   130/attr.c          c015fbd4   143/attr.c
>   7982  spin_lock        8   547/sched.c         c0116ca3   204/namei.c
>   6920   reacqBKL        5  1375/sched.c         c0118edb   697/sched.c
>   6720  spin_lock        0  1376/sched.c         c0118ef9  1380/sched.c
>   6567  spin_lock        0   661/vmscan.c        c013a0f2   764/vmscan.c
>   6365  spin_lock        0   703/vmscan.c        c013a213   764/vmscan.c
>   5680        BKL        1   452/exit.c          c011f6c6   697/sched.c
>   5646  spin_lock        0   227/inode.c         c015df35   696/inode.c
>   5428  spin_lock        1   710/inode.c         c015ea08   696/inode.c
>   4933  spin_lock        0   547/sched.c         c0116ca3  1380/sched.c
>
> This was taken after compiling several packages and running dbench a few
> times.
>
> I also ran it previously while running 'while true ; do python -c
> range\(3e7\) ; done', and the first line had minimum latencies of 10000,
> going as high as about 96000 when swapping.
>
> This is on an Athlon 1050/100 Asus A7V (KT133) with PC133 CL2 SDRAM running
> 2.4.9-ac12-preempt1.


* Re: [PATCH] Preemption Latency Measurement Tool
  2001-09-20  1:40   ` Ignacio Vazquez-Abrams
  2001-09-20  2:23     ` safemode
@ 2001-09-20  2:38     ` Robert Love
  1 sibling, 0 replies; 55+ messages in thread
From: Robert Love @ 2001-09-20  2:38 UTC (permalink / raw)
  To: safemode; +Cc: linux-kernel

On Wed, 2001-09-19 at 22:23, safemode wrote:
> Is it possible to run the 20 worst latency thing without having the preempt
> patch?  I'd like to see how things run without the preempt patch and which
> ones are the slowest.  The last time I used the preempt patch I had worse
> audio performance than not.  Perhaps that's because I use ALSA.

No, it requires the preemption patch.  It uses the preemption count code
to figure out when code returns to a preemptible state.  Besides, if you
are running UP without preemption, you don't have many locking primitives
anyhow (no concurrency concerns there).
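
Conceptually (a sketch of the mechanism, not the patch's actual code),
the stats piggyback on the preempt count: when the count goes from zero
to non-zero a timestamp is taken, and when it drops back to zero the
elapsed time is the length of the non-preemptible window.  A user-space
model of the idea:

	#include <stdio.h>
	#include <time.h>

	static int preempt_count;
	static struct timespec window_start;

	static void model_preempt_disable(void)	/* cf. spin_lock() */
	{
		if (preempt_count++ == 0)	/* 0 -> 1: window opens */
			clock_gettime(CLOCK_MONOTONIC, &window_start);
	}

	static void model_preempt_enable(void)	/* cf. spin_unlock() */
	{
		if (--preempt_count == 0) {	/* back to 0: closes */
			struct timespec now;
			long usec;

			clock_gettime(CLOCK_MONOTONIC, &now);
			usec = (now.tv_sec - window_start.tv_sec) * 1000000L
			     + (now.tv_nsec - window_start.tv_nsec) / 1000;
			printf("non-preemptible for %ld usec\n", usec);
		}
	}

	int main(void)
	{
		model_preempt_disable();
		/* ... critical section ... */
		model_preempt_enable();
		return 0;
	}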

You honestly should not have _worse_ latency with the patch.  Even if
something was wrong, it should still be about the same.  However, most
people are reporting a many-fold reduction in latency.

Your point about using ALSA is one of the reasons I am interested in
this patch.  Most of the people reporting high latencies remaining even
after the patch used ALSA.

If you try preemption again, along with this patch, we can find out what
is causing your problems.

If you just want to gauge how long it takes to execute code from A to B
in the kernel, there are various benchmarking tools as well as Andrew
Morton's excellent timepegs work.

-- 
Robert M. Love
rml at ufl.edu
rml at tech9.net



* Re: [PATCH] Preemption Latency Measurement Tool
  2001-09-20  1:13       ` David Lang
@ 2001-09-20  2:57         ` Robert Love
  0 siblings, 0 replies; 55+ messages in thread
From: Robert Love @ 2001-09-20  2:57 UTC (permalink / raw)
  To: David Lang; +Cc: safemode, linux-kernel

On Wed, 2001-09-19 at 21:13, David Lang wrote:
> Also, how useful is the latency info once swapping starts? If you have
> something that gets swapped out, it will have horrible latency and skew
> the results (or at the very least you won't know whether the problem is
> disk I/O or something fixable)

It's not that bad, because the patch measures latency across points where
preemption was disabled and re-enabled -- the various kernel locking
mechanisms.

Thus either the page fault holds a lock, in which case it is a legitimate
latency recorded over that specific lock, or it doesn't, in which case
it's not an issue.

-- 
Robert M. Love
rml at ufl.edu
rml at tech9.net



* Re: [PATCH] Preemption Latency Measurement Tool
  2001-09-19 22:44 ` [PATCH] Preemption Latency Measurement Tool Robert Love
  2001-09-20  1:40   ` Ignacio Vazquez-Abrams
@ 2001-09-20  6:31   ` Dieter Nützel
       [not found]   ` <20010920063143.424BD1E41A@Cantor.suse.de>
                     ` (5 subsequent siblings)
  8 siblings, 0 replies; 55+ messages in thread
From: Dieter Nützel @ 2001-09-20  6:31 UTC (permalink / raw)
  To: Robert Love, Roger Larsson, linux-kernel
  Cc: Andrea Arcangeli, Linux Kernel List, ReiserFS List

On Thursday, 20 September 2001 00:44, Robert Love wrote:
> Available at:
> http://tech9.net/rml/linux/patch-rml-2.4.9-ac12-preempt-stats-1 and
> http://tech9.net/rml/linux/patch-rml-2.4.10-pre12-preempt-stats-1
> for 2.4.9-ac12 and 2.4.10-pre12, respectively.
>
> This patch is provided thanks to MontaVista (http://mvista.com).

Great, thanks.

Here are some results for 2.4.10-pre12 (Andrea's VM :-)

Athlon II 1 GHz (0.18 µm)
MSI MS-6167 Rev 1.0B (Irongate C4)
640 MB PC100-2-2-2 SDRAM
IBM DDYS 18 GB U160 (on AHA-2940UW)
ReiserFS 3.6 on all partitions

Sound driver is the new kernel one for SB Live! (not ALSA).
No swap used during whole test.

2.4.10-pre12 + patch-rml-2.4.10-pre12-preempt-kernel-1 + 
patch-rml-2.4.10-pre12-preempt-stats-1

Hope my numbers help to find the right reason for the hiccups.
ReiserFS seems _NOT_ to be the culprit here.
Maybe the scheduler itself?

-Dieter

KDE-2.2.1 noatun running MP3/Ogg-Vorbis
+
time ./dbench 16
Throughput 29.3012 MB/sec (NB=36.6265 MB/sec  293.012 MBit/sec)
7.450u 28.830s 1:13.10 49.6%    0+0k 0+0io 511pf+0w
load: 1140

Worst 20 latency times of 5583 measured in this period.
  usec      cause     mask   start line/file      address   end line/file
  5664  spin_lock        1  1376/sched.c         c0114db3   697/sched.c
  2586        BKL        1  1302/inode.c         c016f2f9  1381/sched.c
  2329        BKL        0  1302/inode.c         c016f2f9   842/inode.c
  2231        BKL        1  1302/inode.c         c016f2f9   697/sched.c
  2088        BKL        1  1437/namei.c         c014c42f   697/sched.c
  1992        BKL        0    30/inode.c         c016cdf1  1381/sched.c
  1953        BKL        1  1302/inode.c         c016f2f9    52/inode.c
  1952        BKL        0  1302/inode.c         c016f2f9  1380/sched.c
  1947        BKL        1    30/inode.c         c016cdf1   697/sched.c
  1923   reacqBKL        0  1375/sched.c         c0114d94  1381/sched.c
  1908        BKL        0  1302/inode.c         c016f2f9  1306/inode.c
  1904        BKL        1   452/exit.c          c011af61   697/sched.c
  1892        BKL        1   129/attr.c          c015765d   697/sched.c
  1875        BKL        0  1437/namei.c         c014c42f  1381/sched.c
  1823  spin_lock        0   547/sched.c         c0112fe4  1381/sched.c
  1790   reacqBKL        0  1375/sched.c         c0114d94  1439/namei.c
  1779        BKL        1  1366/namei.c         c014c12c  1381/sched.c
  1762  spin_lock        0   547/sched.c         c0112fe4  1380/sched.c
  1756   reacqBKL        1  1375/sched.c         c0114d94    52/inode.c
  1754        BKL        0   927/namei.c         c014b2bf  1381/sched.c


time ./dbench 32
Throughput 28.0138 MB/sec (NB=35.0172 MB/sec  280.138 MBit/sec)
14.570u 61.520s 2:31.80 50.1%   0+0k 0+0io 911pf+0w
load: 2690

Worst 20 latency times of 8091 measured in this period.
  usec      cause     mask   start line/file      address   end line/file
  8347        BKL        1   533/inode.c         c016d96d   842/inode.c
  7808  spin_lock        1  1376/sched.c         c0114db3   697/sched.c
  6181        BKL        1  1302/inode.c         c016f2f9  1381/sched.c
  3325        BKL        1  1302/inode.c         c016f2f9   842/inode.c
  3314  spin_lock        1   547/sched.c         c0112fe4   697/sched.c
  3014        BKL        1   129/attr.c          c015765d   697/sched.c
  2956        BKL        1  1302/inode.c         c016f2f9  1380/sched.c
  2954   reacqBKL        1  1375/sched.c         c0114d94   697/sched.c
  2795        BKL        1   927/namei.c         c014b2bf   697/sched.c
  2750        BKL        1  1302/inode.c         c016f2f9   697/sched.c
  2667        BKL        1    30/inode.c         c016cdf1   697/sched.c
  2623  spin_lock        0  1376/sched.c         c0114db3  1380/sched.c
  2613  spin_lock        0  1376/sched.c         c0114db3  1306/inode.c
  2566   reacqBKL        0  1375/sched.c         c0114d94  1381/sched.c
  2527        BKL        0  1437/namei.c         c014c42f   697/sched.c
  2526        BKL        0  1437/namei.c         c014c42f  1380/sched.c
  2525  spin_lock        1   547/sched.c         c0112fe4  1380/sched.c
  2513        BKL        0   927/namei.c         c014b2bf  1380/sched.c
  2442        BKL        0    30/inode.c         c016cdf1  1381/sched.c
  2353        BKL        0   452/exit.c          c011af61  1380/sched.c


time ./dbench 40
Throughput 24.664 MB/sec (NB=30.83 MB/sec  246.64 MBit/sec)
18.690u 77.980s 3:35.09 44.9%   0+0k 0+0io 1111pf+0w
load: 3734

Worst 20 latency times of 7340 measured in this period.
  usec      cause     mask   start line/file      address   end line/file
  9313  spin_lock        1  1376/sched.c         c0114db3   697/sched.c
  9040        BKL        1  1302/inode.c         c016f2f9  1306/inode.c
  8530        BKL        1  1302/inode.c         c016f2f9   842/inode.c
  8268   reacqBKL        1  1375/sched.c         c0114d94  1381/sched.c
  5432  spin_lock        1   547/sched.c         c0112fe4   697/sched.c
  3630   reacqBKL        1  1375/sched.c         c0114d94   697/sched.c
  3519        BKL        1  1437/namei.c         c014c42f  1380/sched.c
  3387        BKL        0  1302/inode.c         c016f2f9   697/sched.c
  3194  spin_lock        0   547/sched.c         c0112fe4  1380/sched.c
  3180  spin_lock        0   547/sched.c         c0112fe4  1381/sched.c
  3015  spin_lock        1   547/sched.c         c0112fe4   842/inode.c
  2894        BKL        0  1302/inode.c         c016f2f9  1381/sched.c
  2778  spin_lock        1  1376/sched.c         c0114db3  1380/sched.c
  2755        BKL        1  1302/inode.c         c016f2f9  1380/sched.c
  2676        BKL        1   452/exit.c          c011af61   697/sched.c
  2354        BKL        1   533/inode.c         c016d96d   697/sched.c
  2237        BKL        0    30/inode.c         c016cdf1   697/sched.c
  2104        BKL        0  1437/namei.c         c014c42f  1381/sched.c
  2097  spin_lock        0  1376/sched.c         c0114db3  1381/sched.c
  2082        BKL        0    30/inode.c         c016cdf1  1380/sched.c


time ./dbench 48
Throughput 24.5409 MB/sec (NB=30.6761 MB/sec  245.409 MBit/sec)
22.080u 97.560s 4:19.19 46.1%   0+0k 0+0io 1311pf+0w
load: 4622

Worst 20 latency times of 10544 measured in this period.
  usec      cause     mask   start line/file      address   end line/file
 12831        BKL        1    30/inode.c         c016cdf1    52/inode.c
 10869   reacqBKL        1  1375/sched.c         c0114d94  1381/sched.c
  9303        BKL        1  1302/inode.c         c016f2f9   697/sched.c
  9145        BKL        1  1302/inode.c         c016f2f9  1306/inode.c
  7984  spin_lock        1   547/sched.c         c0112fe4  1380/sched.c
  5983        BKL        1  1302/inode.c         c016f2f9   842/inode.c
  4947   reacqBKL        1  1375/sched.c         c0114d94   697/sched.c
  4497  spin_lock        1   547/sched.c         c0112fe4   697/sched.c
  4310  spin_lock        1  1376/sched.c         c0114db3  1380/sched.c
  4290   reacqBKL        0  1375/sched.c         c0114d94  1306/inode.c
  4002  spin_lock        1   547/sched.c         c0112fe4  1381/sched.c
  3822  spin_lock        0  1376/sched.c         c0114db3   697/sched.c
  3683        BKL        0  1302/inode.c         c016f2f9  1380/sched.c
  3422  spin_lock        1   547/sched.c         c0112fe4  1306/inode.c
  3193        BKL        0  1437/namei.c         c014c42f  1380/sched.c
  3006        BKL        1    30/inode.c         c016cdf1  1380/sched.c
  2990        BKL        1   129/attr.c          c015765d   697/sched.c
  2971        BKL        1  1437/namei.c         c014c42f  1381/sched.c
  2939        BKL        0   927/namei.c         c014b2bf  1381/sched.c
  2862        BKL        0  1302/inode.c         c016f2f9  1381/sched.c


KDE-2.2.1 noatun running MP3/Ogg-Vorbis

Worst 20 latency times of 2252 measured in this period.
  usec      cause     mask   start line/file      address   end line/file
   237        BKL        0  2763/buffer.c        c01410aa   697/sched.c
   218  spin_lock        0   547/sched.c         c0112fe4  1381/sched.c
   215   reacqBKL        0  1375/sched.c         c0114d94  1381/sched.c
   215        BKL        0  2763/buffer.c        c01410aa  1381/sched.c
   205        BKL        0   359/buffer.c        c013d1bc  1381/sched.c
   205       eth1        0   585/irq.c           c010886f   647/irq.c
   149  spin_lock        0   547/sched.c         c0112fe4   697/sched.c
   147        BKL        0   359/buffer.c        c013d1bc   697/sched.c
   135  spin_lock        0  1376/sched.c         c0114db3  1380/sched.c
   130   reacqBKL        0  1375/sched.c         c0114d94   697/sched.c
   111        BKL        0   452/exit.c          c011af61   697/sched.c
   105    unknown        1    76/softirq.c       c011c634   119/softirq.c
    86        BKL        4   712/tty_io.c        c018cf6b   714/tty_io.c
    61  spin_lock        0  1715/dev.c           c01dc4b3  1728/dev.c
    58        BKL        0   301/namei.c         c0149db1   303/namei.c
    56  spin_lock        2   468/netfilter.c     c01e4303   119/softirq.c
    44        BKL        1    59/ioctl.c         c014ea3a   111/ioctl.c
    43        BKL        0  1302/inode.c         c016f2f9  1306/inode.c
    43  spin_lock        1    69/i387.c          c010ca23   227/mmx.c
    42        BKL        1    26/readdir.c       c014ed07    28/readdir.c


Renicing both artsd processes (the KDE-2.2.1 noatun sound daemon) to -20
helps a little bit, but there are still some hiccups (1~3 sec)
remaining.

But the system is very responsive (mouse, keyboard).

time ./dbench 16
Throughput 30.8602 MB/sec (NB=38.5752 MB/sec  308.602 MBit/sec)
7.490u 29.350s 1:09.44 53.0%    0+0k 0+0io 511pf+0w

Worst 20 latency times of 5851 measured in this period.
  usec      cause     mask   start line/file      address   end line/file
  5518  spin_lock        1  1376/sched.c         c0114db3  1380/sched.c
  2553        BKL        1  1302/inode.c         c016f2f9   697/sched.c
  2284        BKL        0  1302/inode.c         c016f2f9   842/inode.c
  2235   reacqBKL        1  1375/sched.c         c0114d94  1381/sched.c
  2227        BKL        0  1302/inode.c         c016f2f9  1380/sched.c
  2183        BKL        0   927/namei.c         c014b2bf  1306/inode.c
  2055        BKL        1   927/namei.c         c014b2bf   697/sched.c
  2037        BKL        1   452/exit.c          c011af61   697/sched.c
  2032        BKL        0    30/inode.c         c016cdf1   697/sched.c
  2008        BKL        1  1302/inode.c         c016f2f9  1381/sched.c
  1943        BKL        0   927/namei.c         c014b2bf  1439/namei.c
  1926        BKL        1   452/exit.c          c011af61    52/inode.c
  1919  spin_lock        1  1376/sched.c         c0114db3   697/sched.c
  1880        BKL        0  1437/namei.c         c014c42f  1381/sched.c
  1865        BKL        0    30/inode.c         c016cdf1  1380/sched.c
  1858  spin_lock        0   547/sched.c         c0112fe4  1380/sched.c
  1849   reacqBKL        0  1375/sched.c         c0114d94  1439/namei.c
  1821   reacqBKL        0  1375/sched.c         c0114d94   697/sched.c
  1774        BKL        1   129/attr.c          c015765d   697/sched.c
  1720        BKL        1  1437/namei.c         c014c42f  1380/sched.c


time ./dbench 48
Throughput 22.85 MB/sec (NB=28.5626 MB/sec  228.5 MBit/sec)
21.840u 98.560s 4:38.30 43.2%   0+0k 0+0io 1311pf+0w

Worst 20 latency times of 8664 measured in this period.
  usec      cause     mask   start line/file      address   end line/file
 11179  spin_lock        1   547/sched.c         c0112fe4   697/sched.c
 10943        BKL        1  1302/inode.c         c016f2f9   697/sched.c
 10798  spin_lock        1  1376/sched.c         c0114db3   697/sched.c
  9367  spin_lock        1  1376/sched.c         c0114db3  1306/inode.c
  7230  spin_lock        1  1376/sched.c         c0114db3  1381/sched.c
  6870  spin_lock        1   547/sched.c         c0112fe4  1381/sched.c
  5610   reacqBKL        1  1375/sched.c         c0114d94  1381/sched.c
  5356  spin_lock        0   547/sched.c         c0112fe4  1380/sched.c
  4162   reacqBKL        0  1375/sched.c         c0114d94  1306/inode.c
  4068   reacqBKL        1  1375/sched.c         c0114d94   697/sched.c
  3335        BKL        0  1302/inode.c         c016f2f9  1380/sched.c
  3161  spin_lock        1  1376/sched.c         c0114db3  1380/sched.c
  3030        BKL        1  1437/namei.c         c014c42f   697/sched.c
  2974        BKL        0  1437/namei.c         c014c42f  1380/sched.c
  2832        BKL        0  1302/inode.c         c016f2f9  1381/sched.c
  2817        BKL        0   927/namei.c         c014b2bf  1380/sched.c
  2713  spin_lock        1   483/dcache.c        c0153efa   520/dcache.c
  2522  spin_lock        1   547/sched.c         c0112fe4  1306/inode.c
  2522        BKL        0   533/inode.c         c016d96d  1381/sched.c
  2508        BKL        0   533/inode.c         c016d96d   929/namei.c


* Re: [PATCH] Preemption Latency Measurement Tool
       [not found]   ` <20010920063143.424BD1E41A@Cantor.suse.de>
@ 2001-09-20  6:41     ` Andrea Arcangeli
  2001-09-20  7:57       ` Dieter Nützel
                         ` (3 more replies)
  0 siblings, 4 replies; 55+ messages in thread
From: Andrea Arcangeli @ 2001-09-20  6:41 UTC (permalink / raw)
  To: Dieter Nützel
  Cc: Robert Love, Roger Larsson, linux-kernel, ReiserFS List

Those inode lines reminded me of one thing; you may want to give it a try:

--- 2.4.10pre12aa1/fs/inode.c.~1~	Thu Sep 20 01:44:07 2001
+++ 2.4.10pre12aa1/fs/inode.c	Thu Sep 20 08:37:33 2001
@@ -295,6 +295,12 @@
 			 * so we have to start looking from the list head.
 			 */
 			tmp = head;
+
+			if (unlikely(current->need_resched)) {
+				spin_unlock(&inode_lock);
+				schedule();
+				spin_lock(&inode_lock);
+			}
 		}
 	}
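
In context, the idiom looks roughly like this (a schematic sketch of the
surrounding 2.4 fs/inode.c list walk, from memory -- needs_work() is a
stand-in for the real per-inode test, not an actual kernel function).
The walk already restarts from the list head whenever inode_lock has
been dropped, so it is equally safe to drop the lock just to schedule:

	spin_lock(&inode_lock);
	for (tmp = head->next; tmp != head; tmp = tmp->next) {
		inode = list_entry(tmp, struct inode, i_list);
		if (needs_work(inode)) {
			/* the existing code drops inode_lock here to
			 * do the work, so we have to start looking
			 * from the list head again */
			tmp = head;

			/* new: give up the CPU if someone is waiting */
			if (unlikely(current->need_resched)) {
				spin_unlock(&inode_lock);
				schedule();
				spin_lock(&inode_lock);
			}
		}
	}
	spin_unlock(&inode_lock);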
 
Andrea


* Re: [PATCH] Preemption Latency Measurement Tool
  2001-09-20  6:41     ` Andrea Arcangeli
@ 2001-09-20  7:57       ` Dieter Nützel
  2001-09-20 21:09       ` Robert Love
                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 55+ messages in thread
From: Dieter Nützel @ 2001-09-20  7:57 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: Robert Love, Roger Larsson, linux-kernel, ReiserFS List

On Thursday, 20 September 2001 08:41, Andrea Arcangeli wrote:
> Those inode lines reminded me of one thing; you may want to give it a try:
>
> --- 2.4.10pre12aa1/fs/inode.c.~1~	Thu Sep 20 01:44:07 2001
> +++ 2.4.10pre12aa1/fs/inode.c	Thu Sep 20 08:37:33 2001
> @@ -295,6 +295,12 @@
>  			 * so we have to start looking from the list head.
>  			 */
>  			tmp = head;
> +
> +			if (unlikely(current->need_resched)) {
> +				spin_unlock(&inode_lock);
> +				schedule();
> +				spin_lock(&inode_lock);
> +			}
>  		}
>  	}
>

You've forgotten a one-liner.

  #include <linux/locks.h>
+#include <linux/compiler.h>
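
(unlikely() lives in <linux/compiler.h>; assuming the 2.4-era definition,
it is roughly just a branch-prediction hint to gcc:

	/* <linux/compiler.h>, roughly */
	#define likely(x)	__builtin_expect((x), 1)
	#define unlikely(x)	__builtin_expect((x), 0)

so without the include the identifier is simply undefined.)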

But this is not enough, even with artsd reniced to -20; some shorter
hiccups (0.5~1 sec) remain.

-Dieter

dbench 16 (without renice)
Throughput 31.0483 MB/sec (NB=38.8104 MB/sec  310.483 MBit/sec)
7.270u 29.290s 1:09.03 52.9%    0+0k 0+0io 511pf+0w
load: 1034

Worst 20 latency times of 4261 measured in this period.
  usec      cause     mask   start line/file      address   end line/file
  4617   reacqBKL        0  1375/sched.c         c0114d94  1381/sched.c
  2890  spin_lock        1  1376/sched.c         c0114db3   697/sched.c
  2072        BKL        1   533/inode.c         c016d9cd   697/sched.c
  2062        BKL        1  1302/inode.c         c016f359    52/inode.c
  2039        BKL        1  1302/inode.c         c016f359   697/sched.c
  1908        BKL        0  1302/inode.c         c016f359   929/namei.c
  1870        BKL        0  1302/inode.c         c016f359  1381/sched.c
  1859  spin_lock        0   547/sched.c         c0112fe4   697/sched.c
  1834        BKL        0    30/inode.c         c016ce51  1381/sched.c
  1834        BKL        1  1302/inode.c         c016f359  1380/sched.c
  1833        BKL        1  1437/namei.c         c014c42f   697/sched.c
  1831        BKL        1   452/exit.c          c011af61   697/sched.c
  1820        BKL        1  1437/namei.c         c014c42f   842/inode.c
  1797        BKL        0  1302/inode.c         c016f359   842/inode.c
  1741  spin_lock        0   547/sched.c         c0112fe4  1380/sched.c
  1696   reacqBKL        1  1375/sched.c         c0114d94   697/sched.c
  1690        BKL        1  1437/namei.c         c014c42f  1380/sched.c
  1652        BKL        1   533/inode.c         c016d9cd  1380/sched.c
  1648        BKL        1  1870/namei.c         c014d420  1381/sched.c
  1643        BKL        0   927/namei.c         c014b2bf  1381/sched.c


dbench 16 (artsd renice -20)
Throughput 32.8102 MB/sec (NB=41.0127 MB/sec  328.102 MBit/sec)
7.490u 28.570s 1:05.38 55.1%    0+0k 0+0io 511pf+0w
load: 963

Worst 20 latency times of 4649 measured in this period.
  usec      cause     mask   start line/file      address   end line/file
  4828  spin_lock        1  1376/sched.c         c0114db3   697/sched.c
  2446  spin_lock        1  1376/sched.c         c0114db3  1380/sched.c
  2237   reacqBKL        1  1375/sched.c         c0114d94   697/sched.c
  2205   reacqBKL        1  1375/sched.c         c0114d94  1381/sched.c
  2205        BKL        1  1302/inode.c         c016f359  1380/sched.c
  2076        BKL        0  1302/inode.c         c016f359  1381/sched.c
  2065  spin_lock        0   547/sched.c         c0112fe4   697/sched.c
  2048        BKL        0    30/inode.c         c016ce51   697/sched.c
  2040        BKL        1   452/exit.c          c011af61   697/sched.c
  1982        BKL        0  1437/namei.c         c014c42f  1381/sched.c
  1953        BKL        0  1437/namei.c         c014c42f  1380/sched.c
  1942        BKL        1  1302/inode.c         c016f359   697/sched.c
  1847        BKL        0  1302/inode.c         c016f359   842/inode.c
  1761        BKL        0  1437/namei.c         c014c42f   697/sched.c
  1734   reacqBKL        1  1375/sched.c         c0114d94    52/inode.c
  1708        BKL        0    30/inode.c         c016ce51  1306/inode.c
  1705        BKL        0   927/namei.c         c014b2bf  1381/sched.c
  1688        BKL        0  1302/inode.c         c016f359   929/namei.c
  1679        BKL        1  1437/namei.c         c014c42f   842/inode.c
  1648        BKL        1  1302/inode.c         c016f359  1439/namei.c


dbench 48 looks like this (without renice)
Throughput 24.5736 MB/sec (NB=30.717 MB/sec  245.736 MBit/sec)
23.250u 89.760s 4:18.85 43.6%   0+0k 0+0io 1311pf+0w
load: 4429

Worst 20 latency times of 8033 measured in this period.
  usec      cause     mask   start line/file      address   end line/file
 10856  spin_lock        1  1376/sched.c         c0114db3   697/sched.c
 10705        BKL        1  1302/inode.c         c016f359   697/sched.c
 10577  spin_lock        1  1376/sched.c         c0114db3   303/namei.c
  9427  spin_lock        1   547/sched.c         c0112fe4   697/sched.c
  8526   reacqBKL        1  1375/sched.c         c0114d94   697/sched.c
  4492   reacqBKL        1  1375/sched.c         c0114d94  1381/sched.c
  4171        BKL        1  1302/inode.c         c016f359  1381/sched.c
  3902   reacqBKL        0  1375/sched.c         c0114d94  1306/inode.c
  3376  spin_lock        0  1376/sched.c         c0114db3  1380/sched.c
  3132        BKL        0  1302/inode.c         c016f359  1380/sched.c
  3096  spin_lock        1   547/sched.c         c0112fe4  1380/sched.c
  2808        BKL        0    30/inode.c         c016ce51  1381/sched.c
  2807  spin_lock        0   547/sched.c         c0112fe4  1381/sched.c
  2782        BKL        0   452/exit.c          c011af61  1380/sched.c
  2631  spin_lock        0   483/dcache.c        c0153efa   520/dcache.c
  2533        BKL        0   533/inode.c         c016d9cd  1380/sched.c
  2489        BKL        0   927/namei.c         c014b2bf  1380/sched.c
  2389        BKL        1   452/exit.c          c011af61    52/inode.c
  2369        BKL        1  1302/inode.c         c016f359   842/inode.c
  2327        BKL        1    30/inode.c         c016ce51  1380/sched.c


* Re: [PATCH] Preemption Latency Measurement Tool
       [not found]       ` <20010920075751.6CA791E6B2@Cantor.suse.de>
@ 2001-09-20  8:21         ` Andrea Arcangeli
  2001-09-20 20:13           ` george anzinger
  2001-09-20 21:10         ` Robert Love
  1 sibling, 1 reply; 55+ messages in thread
From: Andrea Arcangeli @ 2001-09-20  8:21 UTC (permalink / raw)
  To: Dieter Nützel
  Cc: Robert Love, Roger Larsson, linux-kernel, ReiserFS List

On Thu, Sep 20, 2001 at 09:57:50AM +0200, Dieter Nützel wrote:
> On Thursday, 20 September 2001 08:41, Andrea Arcangeli wrote:
> > Those inode lines reminded me of one thing; you may want to give it a try:
> >
> > --- 2.4.10pre12aa1/fs/inode.c.~1~	Thu Sep 20 01:44:07 2001
> > +++ 2.4.10pre12aa1/fs/inode.c	Thu Sep 20 08:37:33 2001
> > @@ -295,6 +295,12 @@
> >  			 * so we have to start looking from the list head.
> >  			 */
> >  			tmp = head;
> > +
> > +			if (unlikely(current->need_resched)) {
> > +				spin_unlock(&inode_lock);
> > +				schedule();
> > +				spin_lock(&inode_lock);
> > +			}
> >  		}
> >  	}
> >
> 
> You've forgotten a one-liner.
> 
>   #include <linux/locks.h>
> +#include <linux/compiler.h>

Whoops, I didn't catch it because of gcc 3.0.2. Thanks.

> But this is not enough, even with artsd reniced to -20; some shorter
> hiccups (0.5~1 sec) remain.

I'm not familiar with the output of the latency bench, but I read "4617"
usec as the worst latency; that means roughly 4.6 msec, not 500/1000
msec.

> Worst 20 latency times of 4261 measured in this period.
>   usec      cause     mask   start line/file      address   end line/file
    ^^^^
>   4617   reacqBKL        0  1375/sched.c         c0114d94  1381/sched.c
    ^^^^
>   2890  spin_lock        1  1376/sched.c         c0114db3   697/sched.c
>   2072        BKL        1   533/inode.c         c016d9cd   697/sched.c
>   2062        BKL        1  1302/inode.c         c016f359    52/inode.c
>   2039        BKL        1  1302/inode.c         c016f359   697/sched.c
>   1908        BKL        0  1302/inode.c         c016f359   929/namei.c
>   1870        BKL        0  1302/inode.c         c016f359  1381/sched.c
>   1859  spin_lock        0   547/sched.c         c0112fe4   697/sched.c
>   1834        BKL        0    30/inode.c         c016ce51  1381/sched.c
>   1834        BKL        1  1302/inode.c         c016f359  1380/sched.c
>   1833        BKL        1  1437/namei.c         c014c42f   697/sched.c
>   1831        BKL        1   452/exit.c          c011af61   697/sched.c
>   1820        BKL        1  1437/namei.c         c014c42f   842/inode.c
>   1797        BKL        0  1302/inode.c         c016f359   842/inode.c
>   1741  spin_lock        0   547/sched.c         c0112fe4  1380/sched.c
>   1696   reacqBKL        1  1375/sched.c         c0114d94   697/sched.c
>   1690        BKL        1  1437/namei.c         c014c42f  1380/sched.c
>   1652        BKL        1   533/inode.c         c016d9cd  1380/sched.c
>   1648        BKL        1  1870/namei.c         c014d420  1381/sched.c
>   1643        BKL        0   927/namei.c         c014b2bf  1381/sched.c


Those are kernel addresses; can you resolve them via System.map rather
than trying to find their start/end line numbers?

> Worst 20 latency times of 8033 measured in this period.
>   usec      cause     mask   start line/file      address   end line/file
>  10856  spin_lock        1  1376/sched.c         c0114db3   697/sched.c

With dbench 48 we have gone to 10 msec latency as far as I can see (still
far from 0.5~1 sec).  dbench 48 runs longer, so there is a higher
probability of hitting the worse latencies, and it does more I/O and
probably also seeks more, so there are many variables (slower insertion
into the I/O queues first of all, etc.).  However, 10 msec isn't that
bad: it corresponds to 100 Hz, something the human eye cannot see.
0.5~1 sec would have been horribly bad latency instead.. :)

>  10705        BKL        1  1302/inode.c         c016f359   697/sched.c
>  10577  spin_lock        1  1376/sched.c         c0114db3   303/namei.c
>   9427  spin_lock        1   547/sched.c         c0112fe4   697/sched.c
>   8526   reacqBKL        1  1375/sched.c         c0114d94   697/sched.c
>   4492   reacqBKL        1  1375/sched.c         c0114d94  1381/sched.c
>   4171        BKL        1  1302/inode.c         c016f359  1381/sched.c
>   3902   reacqBKL        0  1375/sched.c         c0114d94  1306/inode.c
>   3376  spin_lock        0  1376/sched.c         c0114db3  1380/sched.c
>   3132        BKL        0  1302/inode.c         c016f359  1380/sched.c
>   3096  spin_lock        1   547/sched.c         c0112fe4  1380/sched.c
>   2808        BKL        0    30/inode.c         c016ce51  1381/sched.c
>   2807  spin_lock        0   547/sched.c         c0112fe4  1381/sched.c
>   2782        BKL        0   452/exit.c          c011af61  1380/sched.c
>   2631  spin_lock        0   483/dcache.c        c0153efa   520/dcache.c
>   2533        BKL        0   533/inode.c         c016d9cd  1380/sched.c
>   2489        BKL        0   927/namei.c         c014b2bf  1380/sched.c
>   2389        BKL        1   452/exit.c          c011af61    52/inode.c
>   2369        BKL        1  1302/inode.c         c016f359   842/inode.c
>   2327        BKL        1    30/inode.c         c016ce51  1380/sched.c


Andrea


* Re: [PATCH] Preemption Latency Measurement Tool
       [not found]       ` <200109200757.JAA60995@blipp.internet5.net>
@ 2001-09-20 17:37         ` Roger Larsson
  2001-09-20 21:29         ` Robert Love
  1 sibling, 0 replies; 55+ messages in thread
From: Roger Larsson @ 2001-09-20 17:37 UTC (permalink / raw)
  To: Dieter Nützel, Andrea Arcangeli
  Cc: Robert Love, linux-kernel, ReiserFS List

On Thursday, 20 September 2001 09:57, Dieter Nützel wrote:
> On Thursday, 20 September 2001 08:41, Andrea Arcangeli wrote:

>
> But this is not enough, even with artsd reniced to -20; some shorter
> hiccups (0.5~1 sec) remain.
>
> -Dieter
>
> dbench 16 (without renice)
> Throughput 31.0483 MB/sec (NB=38.8104 MB/sec  310.483 MBit/sec)
> 7.270u 29.290s 1:09.03 52.9%    0+0k 0+0io 511pf+0w
> load: 1034
>
> Worst 20 latency times of 4261 measured in this period.
>   usec      cause     mask   start line/file      address   end line/file
>   4617   reacqBKL        0  1375/sched.c         c0114d94  1381/sched.c
    ^^^^

This is fantastic! It REALLY is!
When we started with the low latency work we aimed at 10 ms.
(in all situations, not only when running dbench... but still)

Let's see - no swap used? - so nothing swapped out.
But with the priority altered, it is unlikely that it would not be
scheduled in for such a long time.

Might it be that the disk is busy handling dbench requests?  16 threads ->
16 read and several (async) write requests at different disk locations in
the queue - make it 20.  Seek time 10 ms => queue length 200 ms...
probable???  Do you have more than one disk?  Try running dbench on one
and the player from the other.

/RogerL

-- 
Roger Larsson
Skellefteå
Sweden


* Re: [PATCH] Preemption Latency Measurement Tool
  2001-09-19 22:44 ` [PATCH] Preemption Latency Measurement Tool Robert Love
                     ` (3 preceding siblings ...)
       [not found]   ` <20010920063143.424BD1E41A@Cantor.suse.de>
@ 2001-09-20 20:01   ` Tobias Diedrich
  2001-09-20 20:27   ` Robert Love
                     ` (3 subsequent siblings)
  8 siblings, 0 replies; 55+ messages in thread
From: Tobias Diedrich @ 2001-09-20 20:01 UTC (permalink / raw)
  To: Robert Love; +Cc: linux-kernel

Should the patch work with SMP systems?
Here is what I get:

Worst 20 latency times of 1322 measured in this period.
  usec      cause     mask   start line/file      address   end line/file
988940   reacqBKL        0  1375/sched.c         c0117ba1   772/sched.c
988940  spin_lock        0   129/file_table.c    c013f500   135/page_alloc.c
988940  spin_lock        0   103/page_alloc.c    c0136fd9   251/timer.c
988940  spin_lock        0   129/file_table.c    c013f500   135/page_alloc.c
988940  spin_lock        0    63/fork.c          c0117efc   135/page_alloc.c
988940  spin_lock        0   129/file_table.c    c013f500   135/page_alloc.c
988940  spin_lock        0   129/file_table.c    c013f500   135/page_alloc.c
988940  spin_lock        0   592/af_unix.c       c02ddd88   594/af_unix.c
988940  spin_lock        0    43/fork.c          c0117ddc   829/inode.c
988940  spin_lock        0   547/sched.c         c0115bf5   606/signal.c
988940  spin_lock        0   770/sched.c         c011641c    80/dcache.c
988940  spin_lock        0    36/dec_and_lock.c  c0319573   280/time.c
988940  spin_lock        0    36/dec_and_lock.c  c0319573   280/time.c
988940        BKL        0   714/ll_rw_blk.c     c01eaec1    80/dcache.c
988939  spin_lock        0   592/af_unix.c       c02ddd88   594/af_unix.c
988939  spin_lock        0   714/ll_rw_blk.c     c01eaec1    80/dcache.c
988939    unknown        0    76/softirq.c       c011f410   782/sched.c
988939  spin_lock        0   592/af_unix.c       c02ddd88  1380/sched.c
988939  spin_lock        0   176/fork.c          c0118608   696/namei.c
988939  spin_lock        0  1684/af_unix.c       c02e0de6   734/dcache.c

And another one about 2 seconds after the last flush:

Worst 20 latency times of 579 measured in this period.
  usec      cause     mask   start line/file      address   end line/file
988939  spin_lock        0   592/af_unix.c       c02ddd88   133/file_table.c
988930  spin_lock        0   770/sched.c         c011641c   697/sched.c
988930  spin_lock        0   547/sched.c         c0115bf5   133/file_table.c
988930  spin_lock        0  1512/es1371.c        c02408e0   133/file_table.c
  6613  spin_lock        1   547/sched.c         c0115bf5   697/sched.c
  6401  spin_lock        0   547/sched.c         c0115bf5   647/irq.c
  2557  spin_lock        0   547/sched.c         c0115bf5   647/signal.c
   833  spin_lock        0   547/sched.c         c0115bf5   133/file_table.c
   364        BKL        6   710/tty_io.c        c01d1eed   712/tty_io.c
   283        BKL        4   710/tty_io.c        c01d1eed   280/time.c
   243        BKL        0    59/ioctl.c         c014fbe5   685/socket.c
   231        BKL        4   710/tty_io.c        c01d1eed   697/sched.c
   218  spin_lock        0   798/vmscan.c        c01366a5   857/vmscan.c
   210  spin_lock        0  3169/tcp_input.c     c02b5331   697/sched.c
   191  spin_lock        2   468/netfilter.c     c0285351   119/softirq.c
   183  spin_lock        0   468/netfilter.c     c0285351   280/time.c
   176  spin_lock        0   468/netfilter.c     c0285351   697/sched.c
   168        BKL        0    26/readdir.c       c014ff8c    28/readdir.c
   134       ide0        0   585/irq.c           c0108eb1   647/irq.c
   134        BKL        0   980/inode.c         c01691a3   997/inode.c

-- 
Tobias								PGP: 0x9AC7E0BC
Hannover Fantreffen ML: mailto:fantreffen-request@mantrha.de?subject=subscribe
Manga & Anime Treff Hannover: http://www.mantrha.de/

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] Preemption Latency Measurement Tool
  2001-09-20  8:21         ` Andrea Arcangeli
@ 2001-09-20 20:13           ` george anzinger
  2001-09-20 20:38             ` Randy.Dunlap
  0 siblings, 1 reply; 55+ messages in thread
From: george anzinger @ 2001-09-20 20:13 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Dieter Nützel, Robert Love, Roger Larsson, linux-kernel,
	ReiserFS List

Andrea Arcangeli wrote:
> 
> On Thu, Sep 20, 2001 at 09:57:50AM +0200, Dieter Nützel wrote:
> > On Thursday, 20 September 2001 08:41, Andrea Arcangeli wrote:
> > > Those inodes lines reminded me one thing, you may want to give it a try:
> > >
> > > --- 2.4.10pre12aa1/fs/inode.c.~1~   Thu Sep 20 01:44:07 2001
> > > +++ 2.4.10pre12aa1/fs/inode.c       Thu Sep 20 08:37:33 2001
> > > @@ -295,6 +295,12 @@
> > >                      * so we have to start looking from the list head.
> > >                      */
> > >                     tmp = head;
> > > +
> > > +                   if (unlikely(current->need_resched)) {
> > > +                           spin_unlock(&inode_lock);
> > > +                           schedule();
> > > +                           spin_lock(&inode_lock);
> > > +                   }
> > >             }
> > >     }
> > >
Be warned: while the above will improve latency, it will not improve the
numbers reported by the tool.  The reason is that the tool looks at
actual preemption-disable times, so it does not care whether "need_resched"
is set or not.  This means that, in the case where need_resched is not
set, the tool will report a long latency as if the conditional lock-break
code were not present.  To bring the above code into the tool's
purview, you could either code it without the conditional or, better,
put the whole thing into a macro (I think Ingo called it
"conditional_schedule()") and put the instrumentation stuff in the
macro.  Also, on a system with the preemption patch, the explicit
schedule is not needed unless the BKL is also held.
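
For illustration, such a macro might look something like this (my sketch
only, with an assumed name -- not code from either patch):

        /* Break a spinlock so a pending reschedule can run.  The
         * instrumentation start/stop hooks would live inside this
         * macro, so the tool records the lock break even when
         * need_resched happens to be unset.
         */
        #define conditional_schedule_lock(lock)         \
                do {                                    \
                        if (current->need_resched) {    \
                                spin_unlock(lock);      \
                                schedule();             \
                                spin_lock(lock);        \
                        }                               \
                } while (0)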

> >
> > You've forgotten a one liner.
> >
> >   #include <linux/locks.h>
> > +#include <linux/compiler.h>
> 
> woops, didn't trap it because of gcc 3.0.2. thanks.
> 
> > But this is not enough. Even with reniced artsd (-20).
> > Some shorter hiccups (0.5~1 sec).
> 
> I'm not familiar with the output of the latency bench, but I actually
> read "4617" usec as the worst latency, that means 4msec, not 500/1000
> msec.
> 
> > Worst 20 latency times of 4261 measured in this period.
> >   usec      cause     mask   start line/file      address   end line/file
>     ^^^^
> >   4617   reacqBKL        0  1375/sched.c         c0114d94  1381/sched.c
>     ^^^^
~snip
> 
> those are kernel addresses, can you resolve them via System.map rather
> than trying to find their start/end line number?

Uh, trying to find???  These are the names and line numbers provided by
the PPC macros.  The only time the address is helpful is when the bad
guy is hiding in an "inline" in a header file.  Still, is there a simple
script to get the function/offset from the System.map?  We could then
post process with a simple bash/sed script.
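
Purely as a sketch, a minimal C lookup -- assuming the usual
"address type symbol" format of System.map lines -- might be:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char *argv[])
{
        unsigned long target, addr, best = 0;
        char type, name[128], best_name[128] = "?";
        FILE *f;

        if (argc != 3) {
                fprintf(stderr, "usage: %s System.map address\n", argv[0]);
                return 1;
        }
        target = strtoul(argv[2], NULL, 16);
        f = fopen(argv[1], "r");
        if (!f) {
                perror(argv[1]);
                return 1;
        }
        /* System.map lines look like: "c0114d60 T preempt_schedule" */
        while (fscanf(f, "%lx %c %127s", &addr, &type, name) == 3)
                if (addr <= target && addr >= best) {
                        best = addr;
                        strcpy(best_name, name);
                }
        fclose(f);
        printf("%s+0x%lx\n", best_name, target - best);
        return 0;
}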

> 
> > Worst 20 latency times of 8033 measured in this period.
> >   usec      cause     mask   start line/file      address   end line/file
> >  10856  spin_lock        1  1376/sched.c         c0114db3   697/sched.c
> 
> with dbench 48 we have gone to 10msec latency as far as I can see (still
> far from 0.5~1 sec). dbench 48 runs longer, so there is more probability
> of hitting a higher latency, and it does more I/O, probably also seeks
> more, so there are many variables (slower insertion into I/O queues
> first of all, etc.).  However 10msec isn't that bad, it means 100 Hz,
> something that the human eye cannot see. 0.5~1 sec would have been
> horribly bad latency instead.. :)

Ah, but we would REALLY like to see it below 1msec.
~snip
 
George

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] Preemption Latency Measurement Tool
  2001-09-19 22:44 ` [PATCH] Preemption Latency Measurement Tool Robert Love
                     ` (4 preceding siblings ...)
  2001-09-20 20:01   ` Tobias Diedrich
@ 2001-09-20 20:27   ` Robert Love
  2001-09-20 22:09     ` [PATCH] Preemption patch 2.4.9-ac12 Robert Love
  2001-09-20 22:01   ` [PATCH] Preemption Latency Measurement Tool Robert Love
                     ` (2 subsequent siblings)
  8 siblings, 1 reply; 55+ messages in thread
From: Robert Love @ 2001-09-20 20:27 UTC (permalink / raw)
  To: Dieter Nützel
  Cc: Roger Larsson, linux-kernel, Andrea Arcangeli, ReiserFS List

On Thu, 2001-09-20 at 02:31, Dieter Nützel wrote:
> Here are some results for 2.4.10-pre12 (Andrea's VM :-)
> 
> Athlon II 1 GHz (0.18 µm)
> MSI MS-6167 Rev 1.0B (Irongate C4)
> 640 MB PC100-2-2-2 SDRAM
> IBM DDYS 18 GB U160 (on AHA-2940UW)
> ReiserFS 3.6 on all partitions
> 
> Sound driver is the new kernel one for SB Live! (not ALSA).
> No swap used during whole test.

Good. I use that sound driver too (I was the one who updated it and
caused the commotion back in 2.4.9 :)

I am interested if you see any different times if you switch to ALSA,
though.

> 2.4.10-pre12 + patch-rml-2.4.10-pre12-preempt-kernel-1 + 
> patch-rml-2.4.10-pre12-preempt-stats-1
> 
> Hope my numbers help to find the right reason for the hiccups.
> ReiserFS seems _NOT_ to be the culprit for this.
> Maybe the scheduler itself?

The scheduler does hold some spinlocks.  So does VM.  I am working on
some ideas to tackle the highest of the worst-case spinlocks. Stay
tuned.

> KDE-2.2.1 noatun running MP3/Ogg-Vorbis
>
> time ./dbench 16
> Throughput 29.3012 MB/sec (NB=36.6265 MB/sec  293.012 MBit/sec)
> 7.450u 28.830s 1:13.10 49.6%    0+0k 0+0io 511pf+0w
> load: 1140
> 
> Worst 20 latency times of 5583 measured in this period.
>   usec      cause     mask   start line/file      address   end line/file
>   5664  spin_lock        1  1376/sched.c         c0114db3   697/sched.c
> <snip>

5ms is not an issue at all, especially for a worst-case spinlock... no
problem here.

> time ./dbench 40
> Throughput 24.664 MB/sec (NB=30.83 MB/sec  246.64 MBit/sec)
> 18.690u 77.980s 3:35.09 44.9%   0+0k 0+0io 1111pf+0w
> load: 3734
> 
> Worst 20 latency times of 7340 measured in this period.
>   usec      cause     mask   start line/file      address   end line/file
>   9313  spin_lock        1  1376/sched.c         c0114db3   697/sched.c
> <snip>

We are getting higher, but still not an issue.  I'll theorize below why
I think it is increasing...

> time ./dbench 48
> Throughput 24.5409 MB/sec (NB=30.6761 MB/sec  245.409 MBit/sec)
> 22.080u 97.560s 4:19.19 46.1%   0+0k 0+0io 1311pf+0w
> load: 4622
> 
> Worst 20 latency times of 10544 measured in this period.
>   usec      cause     mask   start line/file      address   end line/file
>  12831        BKL        1    30/inode.c         c016cdf1    52/inode.c
>  10869   reacqBKL        1  1375/sched.c         c0114d94  1381/sched.c
> <snip>

Now we are above what I consider perfect (10ms), but I am still not
concerned until 15-20ms times.  The list I am compiling is of much
worse offenders, so you are fine here too.

The reason I think it jumps during the higher-threaded dbench runs is
the obvious one - more I/O.  This causes more seeks and longer
traversals of the I/O queue.

> KDE-2.2.1 noatun running MP3/Ogg-Vorbis
> 
> Worst 20 latency times of 2252 measured in this period.
>   usec      cause     mask   start line/file      address   end line/file
>    237        BKL        0  2763/buffer.c        c01410aa   697/sched.c
> <snip>

0.2ms is ideal for any system :)


> Renice -20 both artsd prozesses (the KDE-2.2.1 noatun sound daemon)
> help a little bit but there are still some hiccups (1~3 sec)
> remaining.

You see 1-3 second skips in the audio? With the 0.2ms latencies? Grr,
odd.

> But the system is very responsive (mouse, keyboard).
> 
> time ./dbench 16
> Throughput 30.8602 MB/sec (NB=38.5752 MB/sec  308.602 MBit/sec)
> 7.490u 29.350s 1:09.44 53.0%    0+0k 0+0io 511pf+0w
> 
> Worst 20 latency times of 5851 measured in this period.
>   usec      cause     mask   start line/file      address   end line/file
>   5518  spin_lock        1  1376/sched.c         c0114db3  1380/sched.c
> <snip>
>
> time ./dbench 48
> Throughput 22.85 MB/sec (NB=28.5626 MB/sec  228.5 MBit/sec)
> 21.840u 98.560s 4:38.30 43.2%   0+0k 0+0io 1311pf+0w
> 
> Worst 20 latency times of 8664 measured in this period.
>   usec      cause     mask   start line/file      address   end line/file
>  11179  spin_lock        1   547/sched.c         c0112fe4   697/sched.c
> <snip>

Still OK.

Thanks for the feedback.

-- 
Robert M. Love
rml at ufl.edu
rml at tech9.net


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] Preemption Latency Measurement Tool
  2001-09-20 20:13           ` george anzinger
@ 2001-09-20 20:38             ` Randy.Dunlap
  0 siblings, 0 replies; 55+ messages in thread
From: Randy.Dunlap @ 2001-09-20 20:38 UTC (permalink / raw)
  To: george anzinger
  Cc: Andrea Arcangeli, Dieter Nützel, Robert Love, Roger Larsson,
	linux-kernel, ReiserFS List

george anzinger wrote:
> 
> Andrea Arcangeli wrote:
> >
> > those are kernel addresses, can you resolve them via System.map rather
> > than trying to find their start/end line number?
> 
> Uh, trying to find???  These are the names and line numbers provided by
> the PPC macros.  The only time the address is helpful is when the bad
> guy is hiding in an "inline" in a header file.  Still, is there a simple
> script to get the function/offset from the System.map?  We could then
> post process with a simple bash/sed script.

I posted one earlier today (Perl).  It's at
http://www.osdlab.org/sw_resources/scripts/ksysmap .

~Randy

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] Preemption Latency Measurement Tool
  2001-09-20  6:41     ` Andrea Arcangeli
  2001-09-20  7:57       ` Dieter Nützel
@ 2001-09-20 21:09       ` Robert Love
       [not found]       ` <20010920075751.6CA791E6B2@Cantor.suse.de>
       [not found]       ` <200109200757.JAA60995@blipp.internet5.net>
  3 siblings, 0 replies; 55+ messages in thread
From: Robert Love @ 2001-09-20 21:09 UTC (permalink / raw)
  To: Dieter Nützel
  Cc: Andrea Arcangeli, Roger Larsson, linux-kernel, ReiserFS List

On Thu, 2001-09-20 at 03:57, Dieter Nützel wrote:
> You've forgotten a one liner.
> 
>   #include <linux/locks.h>
> +#include <linux/compiler.h>
> 
> But this is not enough. Even with reniced artsd (-20).
> Some shorter hiccups (0.5~1 sec).


Note (I am repeating myself from an email I just sent) that the
conditional schedule won't show better results if
current->need_resched is unset, since preemption won't be enabled.  I
need to add explicit support to the preemption-test patch for this.

So you may see some better results, but the one time the conditional
schedule does not occur, you will see the worst result in
/proc/latencytimes -- remember, it's the 20 worst (perhaps we need
average or total latency, too?)

Now, with all that said, you should _see_ an improvement with this
patch.  You say short hiccups.  Some? All? How much better is it?

-- 
Robert M. Love
rml at ufl.edu
rml at tech9.net


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] Preemption Latency Measurement Tool
       [not found]       ` <20010920075751.6CA791E6B2@Cantor.suse.de>
  2001-09-20  8:21         ` Andrea Arcangeli
@ 2001-09-20 21:10         ` Robert Love
  2001-09-20 21:35           ` Dieter Nützel
  1 sibling, 1 reply; 55+ messages in thread
From: Robert Love @ 2001-09-20 21:10 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Dieter Nützel, Roger Larsson, linux-kernel, ReiserFS List

On Thu, 2001-09-20 at 04:21, Andrea Arcangeli wrote:
> > You've forgotten a one liner.
> > 
> >   #include <linux/locks.h>
> > +#include <linux/compiler.h>
> 
> woops, didn't trap it because of gcc 3.0.2. thanks.
> 
> > But this is not enough. Even with reniced artsd (-20).
> > Some shorter hiccups (0.5~1 sec).
> 
> I'm not familiar with the output of the latency bench, but I actually
> read "4617" usec as the worst latency, that means 4msec, not 500/1000
> msec.

Right, the patch is returning the length of time preemption was
unavailable (which is whenever a lock is held), in usec. So it is indeed
4ms.

But, I think Dieter is saying he _sees_ 0.5~1s latencies (in the form of
audio skips).  This is despite the 4ms locks being held.
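
To be concrete about what that number is (a conceptual sketch only, with
hypothetical helper names -- not the patch's actual code):

        /* Stamp the clock when preemption goes away and log the
         * interval when it comes back; per-CPU details omitted.
         * The 20 worst intervals are what /proc/latencytimes shows.
         */
        unsigned long preempt_off_stamp;

        static inline void latency_preempt_disable(void)
        {
                preempt_off_stamp = get_cycles();  /* preemption now off */
        }

        static inline void latency_preempt_enable(void)
        {
                unsigned long held = get_cycles() - preempt_off_stamp;
                record_if_worst_20(held);          /* hypothetical logger */
        }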

-- 
Robert M. Love
rml at ufl.edu
rml at tech9.net


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] Preemption Latency Measurement Tool
       [not found]       ` <200109200757.JAA60995@blipp.internet5.net>
  2001-09-20 17:37         ` Roger Larsson
@ 2001-09-20 21:29         ` Robert Love
  2001-09-20 21:53           ` Dieter Nützel
  1 sibling, 1 reply; 55+ messages in thread
From: Robert Love @ 2001-09-20 21:29 UTC (permalink / raw)
  To: Roger Larsson
  Cc: Dieter Nützel, Andrea Arcangeli, linux-kernel, ReiserFS List

On Thu, 2001-09-20 at 13:37, Roger Larsson wrote:

> > Worst 20 latency times of 4261 measured in this period.
> >   usec      cause     mask   start line/file      address   end line/file
> >   4617   reacqBKL        0  1375/sched.c         c0114d94  1381/sched.c
>
> This is fantastic! It REALLY is!
> When we started with the low latency work we aimed at 10 ms.
> (in all situations, not only when running dbench... but still)

Yes it really is, especially noting that that 4.6ms lock is the
_longest_ held lock on the system.

I am seeing 90% of the reported locks under 15ms, and this means that
almost all the locks on the system are even less.

However, I am also seeing some stray 20-50ms and even 60-70ms latencies
and those bother me.  I am looking into another solution, perhaps
conditional scheduling for now.

> Let's see - no swap used? - so not swapped out.
> And with its priority raised, it is unlikely that it would go
> unscheduled for such a long time.
> 
> Might it be that the disk is busy handling dbench requests? 16 threads ->
> 16 read and several (async) write requests queued at different disk
> locations - call it 20. Seek time 10 ms => 200 ms to drain the queue...
> probable???
> Do you have more than one disk? Try running dbench on one and the player
> from the other.

Good idea.

-- 
Robert M. Love
rml at ufl.edu
rml at tech9.net


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] Preemption Latency Measurement Tool
  2001-09-20 21:10         ` Robert Love
@ 2001-09-20 21:35           ` Dieter Nützel
  2001-09-20 22:03             ` Oliver Xymoron
  0 siblings, 1 reply; 55+ messages in thread
From: Dieter Nützel @ 2001-09-20 21:35 UTC (permalink / raw)
  To: Robert Love, Andrea Arcangeli; +Cc: Roger Larsson, linux-kernel, ReiserFS List

On Thursday, 20 September 2001 23:10, Robert Love wrote:
> On Thu, 2001-09-20 at 04:21, Andrea Arcangeli wrote:
> > > You've forgotten a one liner.
> > >
> > >   #include <linux/locks.h>
> > > +#include <linux/compiler.h>
> >
> > woops, didn't trap it because of gcc 3.0.2. thanks.
> >
> > > But this is not enough. Even with reniced artsd (-20).
> > > Some shorter hiccups (0.5~1 sec).
> >
> > I'm not familiar with the output of the latency bench, but I actually
> > read "4617" usec as the worst latency, that means 4msec, not 500/1000
> > msec.
>
> Right, the patch is returning the length of time preemption was
> unavailable (which is whenever a lock is held), in usec. So it is indeed 4ms.
>
> But, I think Dieter is saying he _sees_ 0.5~1s latencies (in the form of
> audio skips).  This is despite the 4ms locks being held.

Yes, that's the case. During dbench 16,32,40,48, etc...

-Dieter

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] Preemption Latency Measurement Tool
  2001-09-20 21:29         ` Robert Love
@ 2001-09-20 21:53           ` Dieter Nützel
  0 siblings, 0 replies; 55+ messages in thread
From: Dieter Nützel @ 2001-09-20 21:53 UTC (permalink / raw)
  To: Robert Love, Roger Larsson; +Cc: Andrea Arcangeli, linux-kernel, ReiserFS List

On Thursday, 20 September 2001 23:29, Robert Love wrote:
> On Thu, 2001-09-20 at 13:37, Roger Larsson wrote:
> > > Worst 20 latency times of 4261 measured in this period.
> > >   usec      cause     mask   start line/file      address   end line/file
> > >   4617   reacqBKL        0  1375/sched.c         c0114d94  1381/sched.c
> >
> > This is fantastic! It REALLY is!
> > When we started with the low latency work we aimed at 10 ms.
> > (in all situations, not only when running dbench... but still)
>
> Yes it really is, especially noting that that 4.6ms lock is the
> _longest_ held lock on the system.
>
> I am seeing 90% of the reported locks under 15ms, and this means that
> almost all the locks on the system are even less.
>
> However, I am also seeing some stray 20-50ms and even 60-70ms latencies
> and those bother me.  I am looking into another solution, perhaps
> conditional scheduling for now.
>
> > Let's see - no swap used? - so not swapped out.

swap is enabled but not used by artsd (only 8 MB for X and some kdeinit stuff)

  810 nuetzel   -1 -20  6820    0 6820  4380 S <   0.0  1.0   0:31 artsd
 2724 nuetzel    8   0  6820    0 6820  4380 S     0.0  1.0   0:00 artsd

SunWave1>cat /proc/meminfo
        total:    used:    free:  shared: buffers:  cached:
Mem:  658395136 437657600 220737536        0  6656000 287236096
Swap: 1052794880  8990720 1043804160
MemTotal:       642964 kB
MemFree:        215564 kB
MemShared:           0 kB
Buffers:          6500 kB
Cached:         278396 kB
SwapCached:       2108 kB
Active:         265108 kB
Inactive:        21896 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:       642964 kB
LowFree:        215564 kB
SwapTotal:     1028120 kB
SwapFree:      1019340 kB

> > But with its priority raised, it is unlikely that it would go
> > unscheduled for such a long time.
> >
> > Might it be that the disk is busy handling dbench requests? 16 threads
> > -> 16 read and several (async) write requests queued at different disk
> > locations - call it 20. Seek time 10 ms => 200 ms to drain the queue...
> > probable??? Do you have more than one disk? Try running dbench on one
> > and the player from the other.

OK, I moved my video and sound files to one of my "older" IBM DDRS-9GB UW 
disks (read max ~12.8 MB/s).

Did NOT help. With and without renice -20.
Same hiccup (0.5~3 sec) during the first few seconds of dbench. The hiccup 
ONLY occurs at the beginning of every dbench run.

-Dieter

PS: At the bottom you can find the latency numbers and timing for this copy job.

Throughput 23.7919 MB/sec (NB=29.7399 MB/sec  237.919 MBit/sec)
7.470u 29.740s 1:29.79 41.4%    0+0k 0+0io 511pf+0w
load: 1300

SunWave1>cat /proc/latencytimes
Worst 20 latency times of 5890 measured in this period.
  usec      cause     mask   start line/file      address   end line/file
 13990  spin_lock        9  2043/tcp_ipv6.c      e9072837   119/softirq.c
 11596  spin_lock        1  2111/tcp_ipv4.c      c0207287   119/softirq.c
  4616   reacqBKL        1  1375/sched.c         c0114d94  1381/sched.c
  4478   reacqBKL        1  1375/sched.c         c0114d94   697/sched.c
  2187        BKL        1  1302/inode.c         c016f359   697/sched.c
  2172   reacqBKL        1  1375/sched.c         c0114d94   929/namei.c
  1991        BKL        0  1302/inode.c         c016f359  1381/sched.c
  1966        BKL        1  1302/inode.c         c016f359   842/inode.c
  1934        BKL        0  1437/namei.c         c014c42f  1380/sched.c
  1891        BKL        0    30/inode.c         c016ce51   697/sched.c
  1879  spin_lock        0   547/sched.c         c0112fe4  1381/sched.c
  1865  spin_lock        0  1376/sched.c         c0114db3  1380/sched.c
  1833  spin_lock        1  1376/sched.c         c0114db3   697/sched.c
  1828        BKL        0    30/inode.c         c016ce51  1380/sched.c
  1812        BKL        0  1302/inode.c         c016f359  1380/sched.c
  1792        BKL        0  1437/namei.c         c014c42f  1381/sched.c
  1782  spin_lock        1   547/sched.c         c0112fe4   697/sched.c
  1776        BKL        0  1437/namei.c         c014c42f   929/namei.c
  1772        BKL        1  1437/namei.c         c014c42f   697/sched.c
  1767        BKL        0   129/attr.c          c01576bd  1380/sched.c

SunWave1>cat /proc/latencytimes
Worst 20 latency times of 2260 measured in this period.
  usec      cause     mask   start line/file      address   end line/file
   583        BKL        6   712/tty_io.c        c018cfcb   714/tty_io.c
   284        BKL        0    83/file.c          c0171024  1381/sched.c
   245        BKL        0  2763/buffer.c        c01410aa  1381/sched.c
   209        BKL        0  2763/buffer.c        c01410aa   697/sched.c
   204       eth1        0   585/irq.c           c010886f   647/irq.c
   193   reacqBKL        0  1375/sched.c         c0114d94  1381/sched.c
   142   reacqBKL        1  1375/sched.c         c0114d94   697/sched.c
   111   reacqBKL        0  1375/sched.c         c0114d94    98/file.c
   109        BKL        0   452/exit.c          c011af61  1380/sched.c
   109        BKL        0   927/namei.c         c014b2bf   929/namei.c
    95        BKL        0   533/inode.c         c016d9cd   842/inode.c
    91        BKL        0  1870/namei.c         c014d420  1873/namei.c
    85    unknown        3    76/softirq.c       c011c634   119/softirq.c
    76        BKL        0    30/inode.c         c016ce51    52/inode.c
    64        BKL        1    26/readdir.c       c014ed07    28/readdir.c
    62  spin_lock        0  1715/dev.c           c01dc513  1728/dev.c
    54  spin_lock        2   468/netfilter.c     c01e4363   119/softirq.c
    49  spin_lock        0   991/dev.c           c01db583   998/dev.c
    47  spin_lock        0   547/sched.c         c0112fe4  1381/sched.c
    45        BKL        0  1302/inode.c         c016f359  1306/inode.c

renice -20 artsd
Throughput 27.6322 MB/sec (NB=34.5402 MB/sec  276.322 MBit/sec)
7.180u 30.180s 1:17.44 48.2%    0+0k 0+0io 511pf+0w
load: 1222

Worst 20 latency times of 6170 measured in this period.
  usec      cause     mask   start line/file      address   end line/file
  4883  spin_lock        1   547/sched.c         c0112fe4   697/sched.c
  3590        BKL        0  2763/buffer.c        c01410aa  1381/sched.c
  3192   reacqBKL        1  1375/sched.c         c0114d94   697/sched.c
  2647        BKL        0  1302/inode.c         c016f359   697/sched.c
  2286        BKL        0  1302/inode.c         c016f359  1381/sched.c
  2011        BKL        1  1437/namei.c         c014c42f   697/sched.c
  1795        BKL        1   452/exit.c          c011af61   697/sched.c
  1790        BKL        1  1302/inode.c         c016f359  1380/sched.c
  1725   reacqBKL        0  1375/sched.c         c0114d94  1381/sched.c
  1700  spin_lock        0   547/sched.c         c0112fe4  1381/sched.c
  1685        BKL        1  1437/namei.c         c014c42f  1380/sched.c
  1673        BKL        0  1437/namei.c         c014c42f  1381/sched.c
  1662        BKL        0    30/inode.c         c016ce51  1381/sched.c
  1575  spin_lock        0  1376/sched.c         c0114db3  1380/sched.c
  1565        BKL        0   533/inode.c         c016d9cd  1381/sched.c
  1452        BKL        0    30/inode.c         c016ce51   697/sched.c
  1452        BKL        0   533/inode.c         c016d9cd  1380/sched.c
  1445        BKL        0   927/namei.c         c014b2bf  1380/sched.c
  1423        BKL        0   129/attr.c          c01576bd  1381/sched.c
  1413        BKL        1   927/namei.c         c014b2bf   697/sched.c

SunWave1#du -s /usr/data/sounds/
263404  /usr/data/sounds

SunWave1#time cp -a /usr/data/sounds/ /database/db1/data/
0.080u 5.800s 1:03.06 9.3%      0+0k 0+0io 127pf+0w

SunWave1#du -s /database/db1/data/sounds/
263404  /database/db1/data/sounds

Worst 20 latency times of 3398 measured in this period.
  usec      cause     mask   start line/file      address   end line/file
 10155  spin_lock        1   291/buffer.c        c014151c   285/buffer.c
  9028        BKL        8   152/devices.c       c013c282   154/devices.c
  7007   keyboard        9    76/softirq.c       c011c634   119/softirq.c
  5999  spin_lock        8    86/softirq.c       c011c66c   112/softirq.c
  4201  spin_lock        1   257/vmscan.c        c01331e6   286/vmscan.c
  4009  spin_lock        0   978/pc_keyb.c       c01a0787   983/pc_keyb.c
  3000  spin_lock        1   375/memory.c        c0127abf   389/memory.c
  1715  spin_lock        0   468/vmscan.c        c0133c35   431/vmscan.c
  1387  spin_lock        0   678/inode.c         c01566d7   704/inode.c
  1234   reacqBKL        1  1375/sched.c         c0114d94  1381/sched.c
  1071        BKL        1  1302/inode.c         c016f359  1380/sched.c
  1049  spin_lock        1   305/dcache.c        c0153acd    86/dcache.c
  1049  spin_lock        1   468/vmscan.c        c0133c35   344/vmscan.c
   968  spin_lock        1   547/sched.c         c0112fe4   697/sched.c
   950   reacqBKL        0  1375/sched.c         c0114d94   697/sched.c
   858       eth1        0   585/irq.c           c010886f   647/irq.c
   803        BKL        4   927/namei.c         c014b2bf   929/namei.c
   736        BKL        0   452/exit.c          c011af61   697/sched.c
   696        BKL        4  1870/namei.c         c014d420  1873/namei.c
   694        BKL        0  2763/buffer.c        c01410aa  1381/sched.c

c0141400 T kupdate
c014151c ..... <<<<<
c0141610 T set_buffer_async_io

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] Preemption Latency Measurement Tool
  2001-09-19 22:44 ` [PATCH] Preemption Latency Measurement Tool Robert Love
                     ` (5 preceding siblings ...)
  2001-09-20 20:27   ` Robert Love
@ 2001-09-20 22:01   ` Robert Love
  2001-09-22  3:57   ` Andre Pang
  2001-09-22  6:10   ` Robert Love
  8 siblings, 0 replies; 55+ messages in thread
From: Robert Love @ 2001-09-20 22:01 UTC (permalink / raw)
  To: Tobias Diedrich; +Cc: linux-kernel

On Thu, 2001-09-20 at 16:01, Tobias Diedrich wrote:
> Should the patch work with SMP systems ?
> Here is what I get:

I never thought that it shouldn't, but upon thinking about it maybe it
won't :)

You are the second person reporting those all-the-same latency values. 
I think the SMP locks are being reported wrongly by the patch.  I will
take a look at it.

Thanks

-- 
Robert M. Love
rml at ufl.edu
rml at tech9.net


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] Preemption Latency Measurement Tool
  2001-09-20 21:35           ` Dieter Nützel
@ 2001-09-20 22:03             ` Oliver Xymoron
  2001-09-20 22:51               ` Dieter Nützel
  2001-09-21  3:17               ` Robert Love
  0 siblings, 2 replies; 55+ messages in thread
From: Oliver Xymoron @ 2001-09-20 22:03 UTC (permalink / raw)
  To: Dieter Nützel
  Cc: Robert Love, Andrea Arcangeli, Roger Larsson, linux-kernel,
	ReiserFS List

On Thu, 20 Sep 2001, Dieter Nützel wrote:

> On Thursday, 20 September 2001 23:10, Robert Love wrote:
> > On Thu, 2001-09-20 at 04:21, Andrea Arcangeli wrote:
> > > > You've forgotten a one liner.
> > > >
> > > >   #include <linux/locks.h>
> > > > +#include <linux/compiler.h>
> > >
> > > woops, didn't trap it because of gcc 3.0.2. thanks.
> > >
> > > > But this is not enough. Even with reniced artsd (-20).
> > > > Some shorter hiccups (0.5~1 sec).
> > >
> > > I'm not familiar with the output of the latency bench, but I actually
> > > read "4617" usec as the worst latency, that means 4msec, not 500/1000
> > > msec.
> >
> > Right, the patch is returning the length of time preemption was
> > unavailable (which is whenever a lock is held), in usec. So it is indeed 4ms.
> >
> > But, I think Dieter is saying he _sees_ 0.5~1s latencies (in the form of
> > audio skips).  This is despite the 4ms locks being held.
>
> Yes, that's the case. During dbench 16,32,40,48, etc...

You might actually be waiting on disk I/O rather than being blocked.

Does your audio source depend on any files (eg mp3s) and if so, could they
be moved to a ramfs? Do the skips go away then?

--
 "Love the dolphins," she advised him. "Write by W.A.S.T.E.."


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] Preemption patch 2.4.9-ac12
  2001-09-20 20:27   ` Robert Love
@ 2001-09-20 22:09     ` Robert Love
  0 siblings, 0 replies; 55+ messages in thread
From: Robert Love @ 2001-09-20 22:09 UTC (permalink / raw)
  To: safemode; +Cc: linux-kernel

On Thu, 2001-09-20 at 17:11, safemode wrote:
> ps.  Rob, i sent a few emails about this and my latencies.  

I just sent you some replies...

I know, and I am working on it.  You are one of the few people reporting
any problems, and the only one with problems like this.  This is not to
say `it's you', just that it makes this hard to figure out, but I am
trying.

I have a boatload of emails to respond to, and most of them were sent
just to me and not CC'd to the list, so people just have to wait for me.

I don't get paid to do this, and there are only so many hours until my
girlfriend gets mad :)

-- 
Robert M. Love
rml at ufl.edu
rml at tech9.net


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] Preemption Latency Measurement Tool
  2001-09-20 22:03             ` Oliver Xymoron
@ 2001-09-20 22:51               ` Dieter Nützel
  2001-09-21  3:17               ` Robert Love
  1 sibling, 0 replies; 55+ messages in thread
From: Dieter Nützel @ 2001-09-20 22:51 UTC (permalink / raw)
  To: Oliver Xymoron
  Cc: Robert Love, Andrea Arcangeli, Roger Larsson, linux-kernel,
	ReiserFS List

On Friday, 21 September 2001 00:03, Oliver Xymoron wrote:
> On Thu, 20 Sep 2001, Dieter Nützel wrote:
> > On Thursday, 20 September 2001 23:10, Robert Love wrote:
> > > On Thu, 2001-09-20 at 04:21, Andrea Arcangeli wrote:
> > > > > You've forgotten a one liner.
> > > > >
> > > > >   #include <linux/locks.h>
> > > > > +#include <linux/compiler.h>
> > > >
> > > > woops, didn't trap it because of gcc 3.0.2. thanks.
> > > >
> > > > > But this is not enough. Even with reniced artsd (-20).
> > > > > Some shorter hiccups (0.5~1 sec).
> > > >
> > > > I'm not familiar with the output of the latency bench, but I actually
> > > > read "4617" usec as the worst latency, that means 4msec, not 500/1000
> > > > msec.
> > >
> > > Right, the patch is returning the length of time preemption was
> > > unavailable (which is whenever a lock is held), in usec. So it is indeed 4ms.
> > >
> > > But, I think Dieter is saying he _sees_ 0.5~1s latencies (in the form
> > > of audio skips).  This is despite the 4ms locks being held.
> >
> > Yes, that's the case. During dbench 16,32,40,48, etc...
>
> You might actually be waiting on disk I/O rather than being blocked.
>
> Does your audio source depend on any files (eg mp3s) and if so, could they
> be moved to a ramfs? Do the skips go away then?

Good point.

I've copied one video (MP2) and one Ogg-Vorbis file into /dev/shm.
A little bit better, but the hiccup is still there :-(

dbench 16
Throughput 25.7613 MB/sec (NB=32.2016 MB/sec  257.613 MBit/sec)
7.500u 29.870s 1:22.99 45.0%    0+0k 0+0io 511pf+0w

Worst 20 latency times of 3298 measured in this period.
  usec      cause     mask   start line/file      address   end line/file
 11549  spin_lock        1   678/inode.c         c01566d7   704/inode.c

c01566a0 T prune_icache
c01566d7 ..... <<<<<
c0156800 T shrink_icache_memory

  7395  spin_lock        1   291/buffer.c        c014151c   285/buffer.c

c0141400 T kupdate
c014151c ..... <<<<<
c0141610 T set_buffer_async_io

  7372  spin_lock        1   291/buffer.c        c01413e3   280/buffer.c

c0141290 T bdflush
c01413e3 ..... <<<<<
c0141400 T kupdate

  5702   reacqBKL        1  1375/sched.c         c0114d94   697/sched.c

c0114d60 T preempt_schedule
c0114d94 ..... <<<<<
c0114e10 T wake_up_process

  4744        BKL        0  2763/buffer.c        c01410aa   697/sched.c

c0141080 t sync_old_buffers
c01410aa ..... <<<<<
c01411b0 T block_sync_page

  4695  spin_lock        1   291/buffer.c        c014151c   280/buffer.c
  4551  spin_lock        1  1376/sched.c         c0114db3  1380/sched.c
  4466  spin_lock        1   547/sched.c         c0112fe4  1306/inode.c
  4464  spin_lock        1  1376/sched.c         c0114db3   697/sched.c
  4146   reacqBKL        1  1375/sched.c         c0114d94   842/inode.c
  4131  spin_lock        0   547/sched.c         c0112fe4   697/sched.c
  3900   reacqBKL        1  1375/sched.c         c0114d94   929/namei.c
  3390  spin_lock        1   547/sched.c         c0112fe4  1439/namei.c
  3191        BKL        0  1302/inode.c         c016f359   842/inode.c
  2866        BKL        0  1302/inode.c         c016f359  1381/sched.c
  2803   reacqBKL        0  1375/sched.c         c0114d94  1381/sched.c
  2762        BKL        0    30/inode.c         c016ce51    52/inode.c
  2633        BKL        0  2763/buffer.c        c01410aa  1380/sched.c
  2629        BKL        0  2763/buffer.c        c01410aa  1381/sched.c
  2466  spin_lock        1   468/vmscan.c        c0133c35   415/vmscan.c

*******************************************************

dbench 16 + renice artsd -20 works
GREAT!

*******************************************************

dbench 32 and above + renice artsd -20 fail

Writing this during dbench 32 ...:-)))

dbench 32 + renice artsd -20
Throughput 18.5102 MB/sec (NB=23.1378 MB/sec  185.102 MBit/sec)
15.240u 63.070s 3:49.21 34.1%   0+0k 0+0io 911pf+0w

Worst 20 latency times of 3679 measured in this period.
  usec      cause     mask   start line/file      address   end line/file
 17625  spin_lock        1   678/inode.c         c01566d7   704/inode.c

c01566a0 T prune_icache
c01566d7 ..... <<<<<
c0156800 T shrink_icache_memory

  9829  spin_lock        1   547/sched.c         c0112fe4   697/sched.c
  9186  spin_lock        1   547/sched.c         c0112fe4  1306/inode.c
  7447   reacqBKL        1  1375/sched.c         c0114d94   697/sched.c
  7097        BKL        1  1302/inode.c         c016f359   697/sched.c
  5974  spin_lock        1  1376/sched.c         c0114db3   697/sched.c
  5231        BKL        1  1437/namei.c         c014c42f   697/sched.c
  5192  spin_lock        0  1376/sched.c         c0114db3  1380/sched.c
  4992   reacqBKL        1  1375/sched.c         c0114d94  1381/sched.c
  4875  spin_lock        1   305/dcache.c        c0153acd    80/dcache.c
  4390        BKL        1   927/namei.c         c014b2bf   929/namei.c
  3616   reacqBKL        0  1375/sched.c         c0114d94  1306/inode.c
  3498  spin_lock        1   547/sched.c         c0112fe4   929/namei.c
  3427  spin_lock        1   547/sched.c         c0112fe4   842/inode.c
  3323        BKL        1  1302/inode.c         c016f359  1306/inode.c
  3165        BKL        1   452/exit.c          c011af61   697/sched.c
  3059  spin_lock        1   305/dcache.c        c0153acd    86/dcache.c
  3016        BKL        1   533/inode.c         c016d9cd  1306/inode.c
  2943        BKL        1  1302/inode.c         c016f359    52/inode.c
  2904  spin_lock        1   547/sched.c         c0112fe4  1439/namei.c

Thanks,
	Dieter

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] Preemption Latency Measurement Tool
  2001-09-20 22:03             ` Oliver Xymoron
  2001-09-20 22:51               ` Dieter Nützel
@ 2001-09-21  3:17               ` Robert Love
  2001-09-21 15:48                 ` george anzinger
  1 sibling, 1 reply; 55+ messages in thread
From: Robert Love @ 2001-09-21  3:17 UTC (permalink / raw)
  To: Dieter Nützel
  Cc: Oliver Xymoron, Andrea Arcangeli, Roger Larsson, linux-kernel,
	ReiserFS List

On Thu, 2001-09-20 at 18:51, Dieter Nützel wrote:
> > Does your audio source depend on any files (eg mp3s) and if so, could they
> > be moved to a ramfs? Do the skips go away then?
> 
> Good point.
> 
> I've copied one video (MP2) and one Ogg-Vorbis file into /dev/shm.
> A little bit better, but the hiccup is still there :-(

As I've been saying, the problem really shouldn't be disk I/O.  I would
think (and really hope) the readahead code can fit a little mp3 in
memory.  Even if not, it's a quick read to load it.  The continued blips
you see are caused by something, well, continual :)
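
If you want to rule out file reads entirely, another thing to try (my
sketch, assuming enough RAM and root privileges for mlock; nothing to do
with the patch itself) is to pin the file in memory before playing it:

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h>

int main(int argc, char *argv[])
{
        struct stat st;
        void *p;
        int fd;

        if (argc != 2) {
                fprintf(stderr, "usage: %s file\n", argv[0]);
                return 1;
        }
        fd = open(argv[1], O_RDONLY);
        if (fd < 0 || fstat(fd, &st) < 0) {
                perror(argv[1]);
                return 1;
        }
        p = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
        if (p == MAP_FAILED) {
                perror("mmap");
                return 1;
        }
        if (mlock(p, st.st_size)) {     /* fault in and pin the pages */
                perror("mlock");
                return 1;
        }
        pause();        /* hold the mapping while the player runs */
        return 0;
}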

> dbench 16
> Throughput 25.7613 MB/sec (NB=32.2016 MB/sec  257.613 MBit/sec)
> 7.500u 29.870s 1:22.99 45.0%    0+0k 0+0io 511pf+0w
> 
> Worst 20 latency times of 3298 measured in this period.
>   usec      cause     mask   start line/file      address   end line/file
>  11549  spin_lock        1   678/inode.c         c01566d7   704/inode.c

A single 11ms latency is not bad.  Again, this looks OK.

> *******************************************************
> 
> dbench 16 + renice artsd -20 works
> GREAT!
> 
> *******************************************************

Great :)

> dbench 32 and above + renice artsd -20 fail
> 
> Writing this during dbench 32 ...:-)))
> 
> dbench 32 + renice artsd -20
> Throughput 18.5102 MB/sec (NB=23.1378 MB/sec  185.102 MBit/sec)
> 15.240u 63.070s 3:49.21 34.1%   0+0k 0+0io 911pf+0w
> 
> Worst 20 latency times of 3679 measured in this period.
>   usec      cause     mask   start line/file      address   end line/file
>  17625  spin_lock        1   678/inode.c         c01566d7   704/inode.c

What do you mean failed?

-- 
Robert M. Love
rml at ufl.edu
rml at tech9.net


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] Preemption Latency Measurement Tool
  2001-09-21  3:17               ` Robert Love
@ 2001-09-21 15:48                 ` george anzinger
  2001-09-22 21:09                   ` Dieter Nützel
  2001-09-23  2:44                   ` Robert Love
  0 siblings, 2 replies; 55+ messages in thread
From: george anzinger @ 2001-09-21 15:48 UTC (permalink / raw)
  To: Robert Love
  Cc: Dieter Nützel, Oliver Xymoron, Andrea Arcangeli,
	Roger Larsson, linux-kernel, ReiserFS List

[-- Attachment #1: Type: text/plain, Size: 2786 bytes --]

Robert Love wrote:
> 
> On Thu, 2001-09-20 at 18:51, Dieter Nützel wrote:
> > > Does your audio source depend on any files (eg mp3s) and if so, could they
> > > be moved to a ramfs? Do the skips go away then?
> >
> > Good point.
> >
> > I've copied one video (MP2) and one Ogg-Vorbis file into /dev/shm.
> > A little bit better, but the hiccup is still there :-(
> 
> As I've been saying, the problem really shouldn't be disk I/O.  I would
> think (and really hope) the readahead code can fit a little mp3 in
> memory.  Even if not, it's a quick read to load it.  The continued blips
> you see are caused by something, well, continual :)
> 
Are you running your application at some real-time priority?  I suspect
that, when dbench starts, it floods the system with a lot of new tasks
and the system must visit each one until it gets back to your app.  Nice
will only do so much here.  Real-time priority is the way to go.
Attached are a couple of small programs to help here.  "rt" runs a given
app at a given priority, i.e. "rt 10 foo" runs foo at priority 10.
(There are more options, try rt -h.)  getrt reports the priority of a
task.  If you do something like "rt 10 bash", everything you run from
the new bash prompt will be at priority 10.  You must be root to run rt
:(
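
For example (a hypothetical session; the "Defaulting" line is what the
attached rt.c prints in verbose mode):

        # rt -v 10 artsd
        Defaulting sched policy to SCHED_RR.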

George

> > dbench 16
> > Throughput 25.7613 MB/sec (NB=32.2016 MB/sec  257.613 MBit/sec)
> > 7.500u 29.870s 1:22.99 45.0%    0+0k 0+0io 511pf+0w
> >
> > Worst 20 latency times of 3298 measured in this period.
> >   usec      cause     mask   start line/file      address   end line/file
> >  11549  spin_lock        1   678/inode.c         c01566d7   704/inode.c
> 
> A single 11ms latency is not bad.  Again, this looks OK.
> 
> > *******************************************************
> >
> > dbench 16 + renice artsd -20 works
> > GREAT!
> >
> > *******************************************************
> 
> Great :)
> 
> > dbench 32 and above + renice artsd -20 fail
> >
> > Writing this during dbench 32 ...:-)))
> >
> > dbench 32 + renice artsd -20
> > Throughput 18.5102 MB/sec (NB=23.1378 MB/sec  185.102 MBit/sec)
> > 15.240u 63.070s 3:49.21 34.1%   0+0k 0+0io 911pf+0w
> >
> > Worst 20 latency times of 3679 measured in this period.
> >   usec      cause     mask   start line/file      address   end line/file
> >  17625  spin_lock        1   678/inode.c         c01566d7   704/inode.c
> 
> What do you mean failed?
> 
> --
> Robert M. Love
> rml at ufl.edu
> rml at tech9.net
> 
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

[-- Attachment #2: rt.c --]
[-- Type: text/plain, Size: 3205 bytes --]



/*
   rt - a utility to set the realtime priority and scheduling policy
*/

/* includes */
#include <stdio.h>
#include <stdlib.h>
#include <sched.h>
#include <unistd.h>
#define _GNU_LIBRARY__
#include <getopt.h>

/* defines */

#define RUNSTRING "usage:  rt [-f] [-v] prio [--] runstring  \n \
 or:   rt [-f] [-v] -p PID prio \n\n \
where: prio specifies the realtime priority  \n \
       -f set scheduling policy to SCHED_FIFO \n \
       -v turns on verbose mode. \n \
       -p PID specifies an existing process to modify \n \
       runstring is a process and parameters \n \
       (use '--' if runstring contains options). \n"

#define POLICY(x)  x ? x-1 ? "SCHED_RR" : "SCHED_FIFO" : "SCHED_OTHER"

/* prototypes */
void print_usage(char *[]);

/* globals */
int verbose=0;  /* 0=none, !0=verbose */


main(int argc, char *argv[])
{
        struct sched_param prio_struct;
        int policy = -1;
        int pid = 0;
        int pidopt = 0;
        int optprobs = 0; /* problems parsing? */
        
        int  c;         /* generic single character */

        /* "standard" option parsing... */
        while ( (c=getopt(argc, argv, "+fp:v?")) != EOF)
        {
        switch (c) {
                case 'f':       /* set FIFO mode */
                        policy = SCHED_FIFO;
                        break;
                case 'p':       /* read PID */  
                        sscanf(optarg,"%d",&pid); 
                        pidopt=1;
                        break;
                case 'v':
                        verbose=1;      /* verbosity */
                        break;
                case '?':       /* help? */
                        printf("%s",RUNSTRING);
                        exit(0);
                default:        /* something went wrong */
                        optprobs=1;     /* we'll deal with this problem later */
                        break;
                }
        }

        if (optprobs) {
                fprintf(stderr,RUNSTRING);
                exit(1);
        }


        if((argc - optind) < 2-pidopt) {
                print_usage(argv);
        }

        sscanf(argv[optind], "%d", &(prio_struct.sched_priority));

        /* sanity checking... */
        if ( (prio_struct.sched_priority != 0) && (policy < 0 ) ) {
                policy=SCHED_RR;
                if (verbose)
                  printf("Defaulting sched policy to %s.\n", POLICY(policy));
        }

        if ( (prio_struct.sched_priority == 0 ) && (policy != SCHED_OTHER) ) {
                policy=SCHED_OTHER;
                fprintf(stderr,"Priority of %d implies sched policy of %s.\n",
                        prio_struct.sched_priority,  POLICY(policy)); 
        }


        policy = (prio_struct.sched_priority)? policy : SCHED_OTHER;
        if( sched_setscheduler(pid,policy,&prio_struct)){
                perror("Priority out of range");
                print_usage(argv);
        }
        if ( pid ) exit(0);
        argv+=optind;   /* adjust argv to point to the runstring */
        argv++;
        execvp(argv[0],argv);
        perror("exec failed..");
}

void print_usage(char * who[])
{
        printf("%s",RUNSTRING);
        exit (1);
}


[-- Attachment #3: getrt.c --]
[-- Type: text/plain, Size: 1336 bytes --]

#include <stdio.h>
#include <sched.h>
#include <stdlib.h>

main(int args,char* argc[])
{
        struct sched_param p;
        pid_t pid;

        int policy;
        int prio,max_prio,max_rt_prio;
        char * cpolicy;

        if (args < 2) {
                pid = 0;
        }else{
                pid = atoi(argc[1]);
        }
        policy = sched_getscheduler(pid);
        sched_getparam(pid,&p);
        prio = p.sched_priority;
        max_prio = sched_get_priority_max(policy);
        max_rt_prio = sched_get_priority_max(SCHED_FIFO);

        switch(policy){
        case SCHED_OTHER:
                cpolicy = "SCHED_OTHER";
                break;
        case SCHED_RR:
                cpolicy = "SCHED_RR";
                break;
        case SCHED_FIFO:
                cpolicy = "SCHED_FIFO";
                break;
        default:
                perror("sched_getscheduler");
                exit(1);
        }
                
        if (policy == SCHED_OTHER){
                printf("%s at priority %d (MAX_PRIO(%s) = %d, MAX_PRIO(SCHED_FIFO) = %d)\n",
                       cpolicy,      prio,       cpolicy,max_prio,               max_rt_prio);
        }else{
                printf("%s at priority %d (MAX_PRIO(%s) = %d)\n",
                       cpolicy,       prio,      cpolicy, max_prio);
        }
        exit(0);
}

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] Preemption Latency Measurement Tool
  2001-09-19 22:44 ` [PATCH] Preemption Latency Measurement Tool Robert Love
                     ` (6 preceding siblings ...)
  2001-09-20 22:01   ` [PATCH] Preemption Latency Measurement Tool Robert Love
@ 2001-09-22  3:57   ` Andre Pang
  2001-09-22  6:10   ` Robert Love
  8 siblings, 0 replies; 55+ messages in thread
From: Andre Pang @ 2001-09-22  3:57 UTC (permalink / raw)
  To: Robert Love
  Cc: linux-kernel, safemode, Dieter.Nuetzel, iafilius, ilsensine, george

On Wed, Sep 19, 2001 at 06:44:09PM -0400, Robert Love wrote:

> Available at:
> http://tech9.net/rml/linux/patch-rml-2.4.9-ac12-preempt-stats-1 and
> http://tech9.net/rml/linux/patch-rml-2.4.10-pre12-preempt-stats-1
> for 2.4.9-ac12 and 2.4.10-pre12, respectively.

hi Robert, thanks for producing the stats patches!

i did a test of it on linux-2.4.10-pre13 with Benno Senoner's
lowlatency program, which i hacked up a bit to output
/proc/latencytimes after each of the graphs.  test results are at

    http://www.algorithm.com.au/hacking/linux-lowlatency/2.4.10-pre13-pes/

and since i stared at the results in disbelief, i won't even try
to guess what's going on :).  maybe you can make some sense of
it?

i'm pretty sure his latencytest program runs at real-time
priority, but i'll run another test with the getrt and rt
programs just posted to the list to make sure.  that's the only
reason i can think of why the graph is so bizarre compared to the
/proc/latencytimes results.


-- 
#ozone/algorithm <ozone@algorithm.com.au>          - trust.in.love.to.save

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] Preemption Latency Measurement Tool
  2001-09-19 22:44 ` [PATCH] Preemption Latency Measurement Tool Robert Love
                     ` (7 preceding siblings ...)
  2001-09-22  3:57   ` Andre Pang
@ 2001-09-22  6:10   ` Robert Love
  2001-09-22  7:22     ` Andre Pang
                       ` (2 more replies)
  8 siblings, 3 replies; 55+ messages in thread
From: Robert Love @ 2001-09-22  6:10 UTC (permalink / raw)
  To: Andre Pang
  Cc: linux-kernel, safemode, Dieter.Nuetzel, iafilius, ilsensine, george

On Fri, 2001-09-21 at 23:57, Andre Pang wrote:
> hi Robert, thanks for producing the stats patches!

no problem, but note that the original work was by MontaVista.  They
have contributed a lot to lowering latency in the Linux kernel.

I just spruced the patch up and updated it for the current kernels.

> i did a test of it on linux-2.4.10-pre13 with Benno Senoner's
> lowlatency program, which i hacked up a bit to output
> /proc/latencytimes after each of the graphs.  test results are at
> 
>     http://www.algorithm.com.au/hacking/linux-lowlatency/2.4.10-pre13-pes/
> 
> and since i stared at the results in disbelief, i won't even try
> to guess what's going on :).  maybe you can make some sense of
> it?

Well, it's not hard to decipher... and really, it's actually fairly good.
the latency test program is giving you a max latency of around 12ms in
each test, which is OK.

the preemption-test patch is showing _MAX_ latencies of 0.8ms through
12ms.  this is fine, too.

> i'm prety sure his latencytest program runs at real-time
> priority, but i'll run another test just with the getrt and rt
> programs just posted to the list to make sure.  that's the only
> reason i can think why the graph is so bizarre compared to the
> /proc/latencytimes results.

well, latencytest is showing you the end result of a certain set of
operations, with a final latency of 12ms.  the /proc interface is a bit
more specific, in that it shows you latencies over individual locks.

the latencies over the specific locks held during latencytest, plus the
time to execute the instructions outside of locks, add up to the 12ms.
the correlation between the max lock time and latencytest is not 1-to-1.

in an ideal world, you want latencytest to show you 10ms or less, which
we are pretty close to by your results, and you want most of your locks
under 1ms, which we also roughly achieve, since even among your top 20
locks out of 20k+ measured, many are still only around 1ms.

thank you for these data points...

-- 
Robert M. Love
rml at ufl.edu
rml at tech9.net


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] Preemption Latency Measurement Tool
  2001-09-22  6:10   ` Robert Love
@ 2001-09-22  7:22     ` Andre Pang
  2001-09-23  3:18       ` george anzinger
  2001-09-23  3:21       ` Robert Love
  2001-09-22 12:56     ` ksoftirqd? (Was: Re: [PATCH] Preemption Latency Measurement Tool) Roger Larsson
  2001-09-23  7:05     ` [PATCH] Preemption Latency Measurement Tool Robert Love
  2 siblings, 2 replies; 55+ messages in thread
From: Andre Pang @ 2001-09-22  7:22 UTC (permalink / raw)
  To: Robert Love
  Cc: linux-kernel, safemode, Dieter.Nuetzel, iafilius, ilsensine, george

On Sat, Sep 22, 2001 at 02:10:18AM -0400, Robert Love wrote:

> > i did a test of it on linux-2.4.10-pre13 with Benno Senoner's
> > lowlatency program, which i hacked up a bit to output
> > /proc/latencytimes after each of the graphs.  test results are at
> > 
> >     http://www.algorithm.com.au/hacking/linux-lowlatency/2.4.10-pre13-pes/
> > 
> > and since i stared at the results in disbelief, i won't even try
> > to guess what's going on :).  maybe you can make some sense of
> > it?
> 
> Well, it's not hard to decipher... and really, it's actually fairly good.
> the latency test program is giving you a max latency of around 12ms in
> each test, which is OK.

arrgh!  i just realised my script buggered up and was producing the same
graph for all the results.  please have a look at the page again, sorry.

apart from that, i'm still confused.  compared to other graphs produced
by the latencytest program, my system seems to have huge latencies.
unless i'm reading it wrongly, the graph is saying that i'm getting
latencies of up to 30ms, and a lot of overruns.  compare this to

    http://www.gardena.net/benno/linux/audio/2.4.0-test2/3x256.html

which shows latencytest on 2.4.0-test2, and

    http://www.gardena.net/benno/linux/audio/2.2.10-p133-3x128/3x128.html

which are the results for latencytest on 2.2.10.  admittedly these
kernels are much older, but i'm consistently getting far more latency
than those kernels.  that's the bit i'm confused about :)  i've tried
Andrew Morton's low-latency patches as well, to no avail.  i've made
sure i've tuned my hard disks correctly, and i don't have any other
realtime processes running.

am i concerned with a different issue than the one you're addressing?

> the preemption-test patch is showing _MAX_ latencies of 0.8ms through
> 12ms.  this is fine, too.

yep, i agree with that ... so why is latencytest showing scheduling
latencies of > 30ms?  i get the feeling i'm confusing two different
issues here.  from what i understand, /proc/latencytimes shows how long
it takes for various functions in the kernel to finish, and the
latencytest result shows how long it takes for my process to be
re-scheduled (represented by the white line on the graph).
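
to make that second number concrete, the core of such a scheduling-latency
test is roughly this (my own from-scratch sketch, not Benno's actual code):

#include <stdio.h>
#include <unistd.h>
#include <sys/time.h>

static long elapsed_usec(struct timeval *a, struct timeval *b)
{
        return (b->tv_sec - a->tv_sec) * 1000000L +
               (b->tv_usec - a->tv_usec);
}

int main(void)
{
        struct timeval before, after;
        long over, worst = 0;
        int i;

        for (i = 0; i < 1000; i++) {
                gettimeofday(&before, NULL);
                usleep(10000);                  /* ask for a 10 ms nap */
                gettimeofday(&after, NULL);
                over = elapsed_usec(&before, &after) - 10000;
                if (over > worst)
                        worst = over;
        }
        printf("worst wakeup overshoot: %ld usec\n", worst);
        return 0;
}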


-- 
#ozone/algorithm <ozone@algorithm.com.au>          - trust.in.love.to.save

^ permalink raw reply	[flat|nested] 55+ messages in thread

* ksoftirqd? (Was: Re: [PATCH] Preemption Latency Measurement Tool)
  2001-09-22  6:10   ` Robert Love
  2001-09-22  7:22     ` Andre Pang
@ 2001-09-22 12:56     ` Roger Larsson
  2001-09-22 13:14       ` Andrea Arcangeli
  2001-09-23  7:05     ` [PATCH] Preemption Latency Measurement Tool Robert Love
  2 siblings, 1 reply; 55+ messages in thread
From: Roger Larsson @ 2001-09-22 12:56 UTC (permalink / raw)
  To: Robert Love, Andre Pang, Andrea Arcangeli
  Cc: linux-kernel, safemode, Dieter.Nuetzel, iafilius, ilsensine, george

Hi,

We have a new kid on the block since we started thinking of a preemptive 
kernel.

ksoftirqd...

It runs with nice 19 (shouldn't it really be -19?).
Or have an RT setting? (maybe not, since one of the reasons for
ksoftirqd would be lost - it would be scheduled in immediately)
Can't a high-prio or RT process be starved due to missing
service (bh) after an interrupt?

This will not show up in the latency profiling patches, since
the kernel is doing exactly what was requested...

Previously this work was done directly after the interrupt,
before returning to the interrupted process...

See:
  /usr/src/develop/linux/kernel/softirq.c

/RogerL

-- 
Roger Larsson
Skellefteå
Sweden

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: ksoftirqd? (Was: Re: [PATCH] Preemption Latency Measurement Tool)
  2001-09-22 12:56     ` ksoftirqd? (Was: Re: [PATCH] Preemption Latency Measurement Tool) Roger Larsson
@ 2001-09-22 13:14       ` Andrea Arcangeli
  2001-09-22 20:51         ` Roger Larsson
  0 siblings, 1 reply; 55+ messages in thread
From: Andrea Arcangeli @ 2001-09-22 13:14 UTC (permalink / raw)
  To: Roger Larsson
  Cc: Robert Love, Andre Pang, linux-kernel, safemode, Dieter.Nuetzel,
	iafilius, ilsensine, george

On Sat, Sep 22, 2001 at 02:56:58PM +0200, Roger Larsson wrote:
> Hi,
> 
> We have a new kid on the block since we started thinking of a preemptive 
> kernel.
> 
> ksoftirqd...
> 
> Running with nice 19 (shouldn't it really be -19?)
> Or have a RT setting? (maybe not since one of the reasons for
> softirqd would be lost - would be scheduled in immediately)
> Can't a high prio or RT process be starved due to missing
> service (bh) after an interrupt?

It cannot be starved: if ksoftirqd is never scheduled, do_softirq()
will be run from the next timer irq or apic_timer irq anyway.
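
Roughly, the path looks like this (a simplified sketch of the 2.4 flow,
not the literal source; run_softirq_handlers() and too_much_work() are
only illustrative names):

/* tail of every hardware interrupt, including the timer tick */
asmlinkage unsigned int do_IRQ(struct pt_regs regs)
{
	int cpu = smp_processor_id();

	/* ... acknowledge and handle the hardware interrupt ... */

	if (softirq_pending(cpu))
		do_softirq();	/* pending softirqs are retried on every irq */
	return 1;
}

/* run the pending handlers, but never from interrupt context itself */
asmlinkage void do_softirq(void)
{
	int cpu = smp_processor_id();

	if (in_interrupt())
		return;

	while (softirq_pending(cpu)) {
		run_softirq_handlers(cpu);	/* net rx/tx, timers, tasklets */
		if (too_much_work(cpu)) {
			wakeup_softirqd(cpu);	/* defer the rest to the daemon */
			break;
		}
	}
}

So even if ksoftirqd never wins the cpu, the next tick re-enters
do_softirq() and drains the queue.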

> This will not show up in latency profiling patches since
> the kernel does what is requested...
> 
> Previously it was run directly after interrupt,
> before returning to the interrupted process...

That is still the case; in fact it's the common case.

Andrea

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: ksoftirqd? (Was: Re: [PATCH] Preemption Latency Measurement Tool)
  2001-09-22 13:14       ` Andrea Arcangeli
@ 2001-09-22 20:51         ` Roger Larsson
  2001-09-22 21:33           ` Andrea Arcangeli
  0 siblings, 1 reply; 55+ messages in thread
From: Roger Larsson @ 2001-09-22 20:51 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Robert Love, Andre Pang, linux-kernel, safemode, Dieter.Nuetzel,
	iafilius, ilsensine, george

On Saturday 22 September 2001 15.14, Andrea Arcangeli wrote:
> On Sat, Sep 22, 2001 at 02:56:58PM +0200, Roger Larsson wrote:
> > Hi,
> >
> > We have a new kid on the block since we started thinking of a preemptive
> > kernel.
> >
> > ksoftirqd...
> >
> > Running with nice 19 (shouldn't it really be -19?)

I repeat this question: should it really be nice 19, and not nice -19 (i.e. not nice at all)?

> > Or have a RT setting? (maybe not since one of the reasons for
> > softirqd would be lost - would be scheduled in immediately)
> > Can't a high prio or RT process be starved due to missing
> > service (bh) after an interrupt?
>
> It cannot be starved, if ksoftirqd is never scheduled the do_softirq()
> will be run by the next timer irq or apic_timer irq.
>

Then could you please explain the output after adding a printk to
ksoftirqd like this:

--- kernel/softirq.c~   Mon Sep 17 22:37:34 2001
+++ kernel/softirq.c    Sat Sep 22 21:46:49 2001
@@ -387,6 +387,7 @@
                __set_current_state(TASK_RUNNING);

                while (softirq_pending(cpu)) {
+                       printk("ksoftirqd: do_softirq\n");
                        do_softirq();
                        if (current->need_resched)
                                schedule();


Output like this:

Sep 22 22:37:38 jeloin logger: *** ./stress_dbench begin ***
Sep 22 22:37:42 jeloin kernel: ksoftirqd: do_softirq
Sep 22 22:37:44 jeloin last message repeated 2 times
Sep 22 22:37:48 jeloin kernel: Latency   8ms PID     7 %% kupdated
Sep 22 22:37:49 jeloin kernel: ksoftirqd: do_softirq
Sep 22 22:38:19 jeloin last message repeated 18 times
Sep 22 22:38:20 jeloin su: (to root) roger on /dev/pts/2
Sep 22 22:38:20 jeloin PAM-unix2[988]: session started for user root, service 
su
Sep 22 22:38:23 jeloin kernel: ksoftirqd: do_softirq
Sep 22 22:38:24 jeloin modprobe: modprobe: Can't locate module net-pf-10
Sep 22 22:38:24 jeloin kernel: ksoftirqd: do_softirq
Sep 22 22:38:58 jeloin last message repeated 20 times
Sep 22 22:39:05 jeloin last message repeated 4 times
Sep 22 22:42:03 jeloin kernel: ksoftirqd: do_softirq
Sep 22 22:43:58 jeloin logger: *** ./stress_dbench end ***

> > This will not show up in latency profiling patches since
> > the kernel does what is requested...
> >
> > Previously it was run directly after interrupt,
> > before returning to the interrupted process...
>
> It is still the case, that's also the common case actually.
>

Possibly, but the other case is quite common too...

/RogerL

-- 
Roger Larsson
Skellefteå
Sweden

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] Preemption Latency Measurement Tool
  2001-09-21 15:48                 ` george anzinger
@ 2001-09-22 21:09                   ` Dieter Nützel
  2001-09-22 23:40                     ` safemode
                                       ` (7 more replies)
  2001-09-23  2:44                   ` Robert Love
  1 sibling, 8 replies; 55+ messages in thread
From: Dieter Nützel @ 2001-09-22 21:09 UTC (permalink / raw)
  To: george anzinger, Robert Love
  Cc: Oliver Xymoron, Andrea Arcangeli, Roger Larsson, linux-kernel,
	ReiserFS List

Am Freitag, 21. September 2001 17:48 schrieb george anzinger:
> Robert Love wrote:
> > On Thu, 2001-09-20 at 18:51, Dieter Nützel wrote:
> > > > Does your audio source depend on any files (eg mp3s) and if so, could
> > > > they be moved to a ramfs? Do the skips go away then?
> > >
> > > Good point.
> > >
> > > I've copied one video (MP2) and one Ogg-Vorbis file into /dev/shm.
> > > Little bit better but hiccup still there :-(
> >
> > As I've been saying, the problem really shouldn't be disk I/O.  I would
> > think (and really hope) the readahead code can fit a little mp3 in
> > memory.  Even if not, its a quick read to load it.  The continued blips
> > you see are caused by something, well, continual :)
>
> Are you running your application at some real time priority?  I suspect
> that, when dbench starts, it floods the system with a lot of new tasks
> and the system must visit each one until it gets back to your ap.  Nice
> will only do so much here.  Real time priority is the way to go.
> Attached are a couple of small programs to help here.  "rt" runs a given
> ap at a given priority.  I.e. > rt 10 foo, runs foo at priority 10.
> (There are more options, try rt -h.)  getrt reports the priority of a
> task.  If you do something like > rt 10 bash   everything you run from
> the new bash prompt will be at priority 10.  You must be root to run rt

Yes, I do, as I've posted in all the related threads.

Here is some more input for you; I hope it helps nail this down.

Regards,
	Dieter

BTW I'm now switching to 2.4.10-pre14.

SunWave1#time tar xIf /Pakete/Linux/linux-2.4.9.tar.bz2
34.570u 5.350s 0:47.84 83.4%    0+0k 0+0io 295pf+0w

I "hear" some heavy disk activity (disk thrashing).

SunWave1#sync

Runs for ages!!!

User	 CPU  0%
System	 CPU  0%
Idle	 CPU 99%

So where did it wait???

Here is what latencytimes shows:

Worst 20 latency times of 5061 measured in this period.
  usec      cause     mask   start line/file      address   end line/file
322263   reacqBKL        1  1375/sched.c         c01138b4   697/sched.c
216174        BKL        1    30/inode.c         c016b971    52/inode.c
158328        BKL        9   742/block_dev.c     c0144d51   697/sched.c
 66975        BKL        1  1101/super.c         c014250d   697/sched.c
 53560        BKL        1  1101/super.c         c014250d  1130/super.c
 37742  spin_lock        1   547/sched.c         c0111b04   795/block_dev.c
 24822  spin_lock        1   678/inode.c         c01551f7   704/inode.c
 24739  spin_lock        1   291/buffer.c        c014003c   285/buffer.c
 19330  spin_lock        1   547/sched.c         c0111b04   697/sched.c
 16566        BKL        1    30/inode.c         c016b971   697/sched.c
 14049  spin_lock        1  2043/tcp_ipv6.c      e9078837   119/softirq.c
 11993  spin_lock        1  2111/tcp_ipv4.c      c0220517   119/softirq.c
 11704        BKL        1   706/block_dev.c     c0144bd7  1381/sched.c
 11405  spin_lock        1   291/buffer.c        c014003c   280/buffer.c
 11288        BKL        1   706/block_dev.c     c0144bd7   697/sched.c
 11213   reacqBKL        9  1375/sched.c         c01138b4   696/block_dev.c
 11212  spin_lock        1   547/sched.c         c0111b04   696/block_dev.c
  7552  spin_lock        1   298/buffer.c        c013bb1c   285/buffer.c
  5013        BKL        0  2763/buffer.c        c013fbca   697/sched.c
  4641        BKL        1  1302/inode.c         c016de79   697/sched.c

SunWave1#ksymoops -m /boot/System.map -A " c01138b4 c016b971 c0144d51 
c014250d c014250d c0111b04 c01551f7 c014003c c0111b04 c016b971 e9078837 
c0220517 c0144bd7 c014003c c0144bd7 c01138b4 c0111b04 c013bb1c c013fbca 
c016de79"
ksymoops 2.4.3 on i686 2.4.10-pre12-preempt.  Options used
     -V (default)
     -k /proc/ksyms (default)
     -l /proc/modules (default)
     -o /lib/modules/2.4.10-pre12-preempt/ (default)
     -m /boot/System.map (specified)


Adhoc c01138b4 <preempt_schedule+34/b0>
Adhoc c016b970 <reiserfs_delete_inode+30/110>
Adhoc c0144d50 <blkdev_close+60/1b0>
Adhoc c014250c <kill_super+ac/280>
Adhoc c014250c <kill_super+ac/280>
Adhoc c0111b04 <schedule+34/550>
Adhoc c01551f6 <prune_icache+36/160>
Adhoc c014003c <kupdate+11c/210>
Adhoc c0111b04 <schedule+34/550>
Adhoc c016b970 <reiserfs_delete_inode+30/110>
Adhoc e9078836 <[ipv6]tcp6_get_info+4a6/770>
Adhoc c0220516 <tcp_get_info+4b6/7a0>
Adhoc c0144bd6 <blkdev_put+46/160>
Adhoc c014003c <kupdate+11c/210>
Adhoc c0144bd6 <blkdev_put+46/160>
Adhoc c01138b4 <preempt_schedule+34/b0>
Adhoc c0111b04 <schedule+34/550>
Adhoc c013bb1c <wait_for_locked_buffers+3c/60>
Adhoc c013fbca <sync_old_buffers+2a/130>
Adhoc c016de78 <reiserfs_dirty_inode+58/f0>


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: ksoftirqd? (Was: Re: [PATCH] Preemption Latency Measurement Tool)
  2001-09-22 20:51         ` Roger Larsson
@ 2001-09-22 21:33           ` Andrea Arcangeli
  0 siblings, 0 replies; 55+ messages in thread
From: Andrea Arcangeli @ 2001-09-22 21:33 UTC (permalink / raw)
  To: Roger Larsson
  Cc: Robert Love, Andre Pang, linux-kernel, safemode, Dieter.Nuetzel,
	iafilius, ilsensine, george

On Sat, Sep 22, 2001 at 10:51:16PM +0200, Roger Larsson wrote:
> On Saturday 22 September 2001 15.14, Andrea Arcangeli wrote:
> > On Sat, Sep 22, 2001 at 02:56:58PM +0200, Roger Larsson wrote:
> > > Hi,
> > >
> > > We have a new kid on the block since we started thinking of a preemptive
> > > kernel.
> > >
> > > ksoftirqd...
> > >
> > > Running with nice 19 (shouldn't it really be -19?)
> 
> I repeat this question - should it be nice 19 and not nice -19 (not nice)?

At nice +19 it carries little weight relative to the other tasks in the
system, especially considering there's one ksoftirqd per cpu.
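
(In the 2.4 sources the daemon simply lowers its own priority at
start-up -- an abridged sketch, not the full function:)

/* kernel/softirq.c: ksoftirqd start-up (abridged) */
static int ksoftirqd(void * __bind_cpu)
{
	daemonize();
	current->nice = 19;	/* deliberately the least favoured runnable task */

	/* ... bind to its cpu, then loop: sleep until softirq_pending(),
	   call do_softirq(), and schedule() whenever need_resched is set ... */
}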

> Sep 22 22:37:38 jeloin logger: *** ./stress_dbench begin ***
> Sep 22 22:37:42 jeloin kernel: ksoftirqd: do_softirq
> Sep 22 22:37:44 jeloin last message repeated 2 times
> Sep 22 22:37:48 jeloin kernel: Latency   8ms PID     7 %% kupdated
> Sep 22 22:37:49 jeloin kernel: ksoftirqd: do_softirq
> Sep 22 22:38:19 jeloin last message repeated 18 times
> Sep 22 22:38:20 jeloin su: (to root) roger on /dev/pts/2
> Sep 22 22:38:20 jeloin PAM-unix2[988]: session started for user root, service 
> su
> Sep 22 22:38:23 jeloin kernel: ksoftirqd: do_softirq
> Sep 22 22:38:24 jeloin modprobe: modprobe: Can't locate module net-pf-10
> Sep 22 22:38:24 jeloin kernel: ksoftirqd: do_softirq
> Sep 22 22:38:58 jeloin last message repeated 20 times
> Sep 22 22:39:05 jeloin last message repeated 4 times
> Sep 22 22:42:03 jeloin kernel: ksoftirqd: do_softirq
> Sep 22 22:43:58 jeloin logger: *** ./stress_dbench end ***

one reschedule every few seconds sounds ok.

> Possibly but the other case is quite common too...

if you also put a printk in do_softirq you'll see why it is
not the common case :)

also, you may want to put the printk in the schedule() path: ksoftirqd
may be rescheduled just once and then run do_softirq() many times.

Andrea

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] Preemption Latency Measurement Tool
  2001-09-22 21:09                   ` Dieter Nützel
@ 2001-09-22 23:40                     ` safemode
  2001-09-22 23:46                     ` Dieter Nützel
                                       ` (6 subsequent siblings)
  7 siblings, 0 replies; 55+ messages in thread
From: safemode @ 2001-09-22 23:40 UTC (permalink / raw)
  To: Dieter Nützel, george anzinger, Robert Love
  Cc: Oliver Xymoron, Andrea Arcangeli, Roger Larsson, linux-kernel,
	ReiserFS List

ok. The preemption patch helps realtime applications in linux get a little
closer to realtime.  I understand that.  But your mp3 player shouldn't
need root permission or renicing or realtime priority flags to play mp3s.  To
test how well the latency patches are working, you should be running things
all at the same priority.  The main issue people are having with skipping
mp3s is not in the decoding of the mp3 or in the retrieving of the file, it's
in the playback on the soundcard.  That's being affected by dbench flooding
the system with irq requests.  I'm inclined to believe it's irq requests
because the _only_ time i have problems with mp3s (and i dont change priority
levels) is when i do a cdparanoia -Z -B "1-" or a dbench 32.  I bet if
someone did these tests on scsi hardware with the latency patch, they'd find
much better results than we users of ide devices do.

even i dont get any skips when i run the player at nice -n -20.  That
doesn't tell you much about the preemption patch though.  And it doesn't tell
you about performance when you dont give linux the chance to do what it does
best: multitask.  That's what the latency patch is directed at improving, i
think.




On Saturday 22 September 2001 17:09, Dieter Nützel wrote:
> Am Freitag, 21. September 2001 17:48 schrieb george anzinger:
> > Robert Love wrote:
> > > On Thu, 2001-09-20 at 18:51, Dieter Nützel wrote:
> > > > > Does your audio source depend on any files (eg mp3s) and if so,
> > > > > could they be moved to a ramfs? Do the skips go away then?
> > > >
> > > > Good point.
> > > >
> > > > I've copied one video (MP2) and one Ogg-Vorbis file into /dev/shm.
> > > > Little bit better but hiccup still there :-(
> > >
> > > As I've been saying, the problem really shouldn't be disk I/O.  I would
> > > think (and really hope) the readahead code can fit a little mp3 in
> > > memory.  Even if not, its a quick read to load it.  The continued blips
> > > you see are caused by something, well, continual :)
> >
> > Are you running your application at some real time priority?  I suspect
> > that, when dbench starts, it floods the system with a lot of new tasks
> > and the system must visit each one until it gets back to your ap.  Nice
> > will only do so much here.  Real time priority is the way to go.
> > Attached are a couple of small programs to help here.  "rt" runs a given
> > ap at a given priority.  I.e. > rt 10 foo, runs foo at priority 10.
> > (There are more options, try rt -h.)  getrt reports the priority of a
> > task.  If you do something like > rt 10 bash   everything you run from
> > the new bash prompt will be at priority 10.  You must be root to run rt
>
> Yes, I do like I've posted in all the related threads.
>
> Here is some more input for you. I hope it could help nail it down.
>
> Regards,
> 	Dieter
>
> BTW I'll now switching to 2.4.10-pre14
>
> SunWave1#time tar xIf /Pakete/Linux/linux-2.4.9.tar.bz2
> 34.570u 5.350s 0:47.84 83.4%    0+0k 0+0io 295pf+0w
>
> I "hear" some have disk activities (disk trashing).
>
> SunWave1#sync
>
> Runs for ages!!!
>
> User	 CPU  0%
> System	 CPU  0%
> Idle	 CPU 99%
>
> So where did it wait???
>
> Here comes what latencytimes show:
>
> Worst 20 latency times of 5061 measured in this period.
>   usec      cause     mask   start line/file      address   end line/file
> 322263   reacqBKL        1  1375/sched.c         c01138b4   697/sched.c
> 216174        BKL        1    30/inode.c         c016b971    52/inode.c
> 158328        BKL        9   742/block_dev.c     c0144d51   697/sched.c
>  66975        BKL        1  1101/super.c         c014250d   697/sched.c
>  53560        BKL        1  1101/super.c         c014250d  1130/super.c
>  37742  spin_lock        1   547/sched.c         c0111b04   795/block_dev.c
>  24822  spin_lock        1   678/inode.c         c01551f7   704/inode.c
>  24739  spin_lock        1   291/buffer.c        c014003c   285/buffer.c
>  19330  spin_lock        1   547/sched.c         c0111b04   697/sched.c
>  16566        BKL        1    30/inode.c         c016b971   697/sched.c
>  14049  spin_lock        1  2043/tcp_ipv6.c      e9078837   119/softirq.c
>  11993  spin_lock        1  2111/tcp_ipv4.c      c0220517   119/softirq.c
>  11704        BKL        1   706/block_dev.c     c0144bd7  1381/sched.c
>  11405  spin_lock        1   291/buffer.c        c014003c   280/buffer.c
>  11288        BKL        1   706/block_dev.c     c0144bd7   697/sched.c
>  11213   reacqBKL        9  1375/sched.c         c01138b4   696/block_dev.c
>  11212  spin_lock        1   547/sched.c         c0111b04   696/block_dev.c
>   7552  spin_lock        1   298/buffer.c        c013bb1c   285/buffer.c
>   5013        BKL        0  2763/buffer.c        c013fbca   697/sched.c
>   4641        BKL        1  1302/inode.c         c016de79   697/sched.c
>
> SunWave1#ksymoops -m /boot/System.map -A " c01138b4 c016b971 c0144d51
> c014250d c014250d c0111b04 c01551f7 c014003c c0111b04 c016b971 e9078837
> c0220517 c0144bd7 c014003c c0144bd7 c01138b4 c0111b04 c013bb1c c013fbca
> c016de79"
> ksymoops 2.4.3 on i686 2.4.10-pre12-preempt.  Options used
>      -V (default)
>      -k /proc/ksyms (default)
>      -l /proc/modules (default)
>      -o /lib/modules/2.4.10-pre12-preempt/ (default)
>      -m /boot/System.map (specified)
>
>
> Adhoc c01138b4 <preempt_schedule+34/b0>
> Adhoc c016b970 <reiserfs_delete_inode+30/110>
> Adhoc c0144d50 <blkdev_close+60/1b0>
> Adhoc c014250c <kill_super+ac/280>
> Adhoc c014250c <kill_super+ac/280>
> Adhoc c0111b04 <schedule+34/550>
> Adhoc c01551f6 <prune_icache+36/160>
> Adhoc c014003c <kupdate+11c/210>
> Adhoc c0111b04 <schedule+34/550>
> Adhoc c016b970 <reiserfs_delete_inode+30/110>
> Adhoc e9078836 <[ipv6]tcp6_get_info+4a6/770>
> Adhoc c0220516 <tcp_get_info+4b6/7a0>
> Adhoc c0144bd6 <blkdev_put+46/160>
> Adhoc c014003c <kupdate+11c/210>
> Adhoc c0144bd6 <blkdev_put+46/160>
> Adhoc c01138b4 <preempt_schedule+34/b0>
> Adhoc c0111b04 <schedule+34/550>
> Adhoc c013bb1c <wait_for_locked_buffers+3c/60>
> Adhoc c013fbca <sync_old_buffers+2a/130>
> Adhoc c016de78 <reiserfs_dirty_inode+58/f0>
>

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] Preemption Latency Measurement Tool
  2001-09-22 21:09                   ` Dieter Nützel
  2001-09-22 23:40                     ` safemode
@ 2001-09-22 23:46                     ` Dieter Nützel
  2001-09-23  0:15                     ` safemode
                                       ` (5 subsequent siblings)
  7 siblings, 0 replies; 55+ messages in thread
From: Dieter Nützel @ 2001-09-22 23:46 UTC (permalink / raw)
  To: safemode, george anzinger, Robert Love
  Cc: Oliver Xymoron, Andrea Arcangeli, Roger Larsson, linux-kernel,
	ReiserFS List

Am Sonntag, 23. September 2001 01:40 schrieb safemode:
> ok. The preemption patch helps realtime applications in linux be a little
> more close to realtime.  I understand that.  But your mp3 player shouldn't
> need root permission or renicing or realtime priority flags to play mp3s. 
> To test how well the latency patches are working you should be running
> things all at the same priority.  The main issue people are having with
> skipping mp3s is not in the decoding of the mp3 or in the retrieving of the
> file, it's in the playing in the soundcard.  That's being affected by
> dbench flooding the system with irq requests.  I'm inclined to believe it's
> irq requests because the _only_ time i have problems with mp3s (and i dont
> change priority levels) is when A. i do a cdparanoia -Z -B "1-"    or
> dbench 32.   I bet if someone did these tests on scsi hardware with the
> latency patch, they'd find much better results than us users of ide
> devices.

Nope.
If you had read (all) the posts in this and the related threads, you would
have noticed that I and others are running SCSI systems...

>
> even i dont get any skips when i run the player at nice -n -20. 

During dbench 16/32 and higher? Are you sure?

-Dieter

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] Preemption Latency Measurement Tool
  2001-09-22 21:09                   ` Dieter Nützel
  2001-09-22 23:40                     ` safemode
  2001-09-22 23:46                     ` Dieter Nützel
@ 2001-09-23  0:15                     ` safemode
       [not found]                     ` <200109222340.BAA37547@blipp.internet5.net>
                                       ` (4 subsequent siblings)
  7 siblings, 0 replies; 55+ messages in thread
From: safemode @ 2001-09-23  0:15 UTC (permalink / raw)
  To: Dieter Nützel, Robert Love; +Cc: linux-kernel

On Saturday 22 September 2001 19:46, Dieter Nützel wrote:
> Am Sonntag, 23. September 2001 01:40 schrieb safemode:
> > ok. The preemption patch helps realtime applications in linux be a little
> > more close to realtime.  I understand that.  But your mp3 player
> > shouldn't need root permission or renicing or realtime priority flags to
> > play mp3s. To test how well the latency patches are working you should be
> > running things all at the same priority.  The main issue people are
> > having with skipping mp3s is not in the decoding of the mp3 or in the
> > retrieving of the file, it's in the playing in the soundcard.  That's
> > being affected by dbench flooding the system with irq requests.  I'm
> > inclined to believe it's irq requests because the _only_ time i have
> > problems with mp3s (and i dont change priority levels) is when A. i do a
> > cdparanoia -Z -B "1-"    or dbench 32.   I bet if someone did these tests
> > on scsi hardware with the latency patch, they'd find much better results
> > than us users of ide devices.
>
> Nope.
> If you would have read (all) posts about this and related threads you
> should have noticed that I am and others running SCSI systems...

hrmm, strange, because as far as i know the only thing that could be causing
the soundcard to skip would be irq requests still stuck in the cpu.  I only
get that with massive ide access, and that's it.  Also, that is when linux is
juggling them all equally.

> > even i dont get any skips when i run the player at nice -n -20.
>
> During dbench 16/32 and higher? Are you sure?

I ran it myself, and i dont drink alcohol or take drugs, so yea, i'm sure :)
If i went high enough i suppose the same problem would occur.  it's probably
in an area of the kernel where the preempt patch doesn't work (yet).  It
does happen on cdparanoia -Z -B "1".  I dont think anything ide is safe from
that.

> -Dieter

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] Preemption Latency Measurement Tool
       [not found]                     ` <200109222340.BAA37547@blipp.internet5.net>
@ 2001-09-23  0:38                       ` Roger Larsson
  2001-09-23  1:42                         ` safemode
  2001-09-23  3:02                       ` Robert Love
  1 sibling, 1 reply; 55+ messages in thread
From: Roger Larsson @ 2001-09-23  0:38 UTC (permalink / raw)
  To: safemode, Dieter Nützel, george anzinger, Robert Love
  Cc: Andrea Arcangeli, linux-kernel

On Sunday 23 September 2001 01.40, safemode wrote:
> ok. The preemption patch helps realtime applications in linux be a little
> more close to realtime.  I understand that.  But your mp3 player shouldn't
> need root permission or renicing or realtime priority flags to play mp3s. 
> To test how well the latency patches are working you should be running
> things all at the same priority.  The main issue people are having with
> skipping mp3s is not in the decoding of the mp3 or in the retrieving of the
> file, it's in the playing in the soundcard.  That's being affected by
> dbench flooding the system with irq requests.  I'm inclined to believe it's
> irq requests because the _only_ time i have problems with mp3s (and i dont
> change priority levels) is when A. i do a cdparanoia -Z -B "1-"    or
> dbench 32.   I bet if someone did these tests on scsi hardware with the
> latency patch, they'd find much better results than us users of ide
> devices.

No, irqs might have something to do with it, but it is unlikely, since the
stops are too long.

Much more likely is:
a) when running without altered priority:
MP3 playing requires processing; this will lower the process priority
relative to processes that do little processing - like dbench processes.
Running with a negative nice will help, but is no guarantee. Running
with RT priority is.
[this is where the various low latency patches help]

b) several processes are doing disk operations simultaneously.
This will put pressure on the VM to free pages fast enough. It will
also put pressure on the disk IO to read from disk fast enough -
these requests are not prioritized with the process priority in mind.
If there are no pages free, we have to wait...
[if you run the latencytest program with no audio file, so it uses a sine,
 you will not hear dropouts... it runs at RT prio with resident memory]

This second problem is MUCH harder to solve.
Multimedia applications could reserve memory - for their critical code,
including buffers... (but this will require suid helpers...)
SGI's filesystem XFS is said to be able to guarantee bandwidth to an
application.
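
From the application side, "reserving memory" plus RT priority amounts
to something like this (a minimal sketch using the standard POSIX calls;
both calls need root, hence the suid helper):

#include <sched.h>
#include <stdio.h>
#include <sys/mman.h>

/* pin all current and future pages, then switch to RT scheduling */
int go_realtime(int rt_prio)
{
	struct sched_param sp;

	if (mlockall(MCL_CURRENT | MCL_FUTURE) < 0) {
		perror("mlockall");		/* needs root */
		return -1;
	}
	sp.sched_priority = rt_prio;
	if (sched_setscheduler(0, SCHED_FIFO, &sp) < 0) {
		perror("sched_setscheduler");	/* needs root */
		return -1;
	}
	return 0;
}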

Riel's schedule() in __alloc_pages probably helps the case of competing
regular processes a lot, by not allowing memory allocators to run out
their whole time slot. The result should be a way to prioritize memory
allocations relative to your priority. (the yield part might be
possible/good to remove)
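
(If I read it right, the idea is roughly the following -- an
illustrative fragment of __alloc_pages(), not the actual diff:)

	/* mm/page_alloc.c, inside the allocation retry loop,
	 * when free pages run short: */
	if (gfp_mask & __GFP_WAIT) {
		/* yield, so higher priority tasks (the mp3 player) run
		 * before we burn the rest of our time slot reclaiming */
		current->policy |= SCHED_YIELD;
		schedule();
	}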

/RogerL

-- 
Roger Larsson
Skellefteå
Sweden

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] Preemption Latency Measurement Tool
  2001-09-22 21:09                   ` Dieter Nützel
                                       ` (3 preceding siblings ...)
       [not found]                     ` <200109222340.BAA37547@blipp.internet5.net>
@ 2001-09-23  0:42                     ` Dieter Nützel
  2001-09-23  2:50                     ` Robert Love
                                       ` (2 subsequent siblings)
  7 siblings, 0 replies; 55+ messages in thread
From: Dieter Nützel @ 2001-09-23  0:42 UTC (permalink / raw)
  To: safemode, Robert Love; +Cc: linux-kernel

Am Sonntag, 23. September 2001 02:15 schrieb safemode:
> On Saturday 22 September 2001 19:46, Dieter Nützel wrote:
> > Am Sonntag, 23. September 2001 01:40 schrieb safemode:
> > > ok. The preemption patch helps realtime applications in linux be a
> > > little more close to realtime.  I understand that.  But your mp3 player
> > > shouldn't need root permission or renicing or realtime priority flags
> > > to play mp3s. To test how well the latency patches are working you
> > > should be running things all at the same priority.  The main issue
> > > people are having with skipping mp3s is not in the decoding of the mp3
> > > or in the retrieving of the file, it's in the playing in the soundcard.
> > >  That's being affected by dbench flooding the system with irq requests.
> > >  I'm inclined to believe it's irq requests because the _only_ time i
> > > have problems with mp3s (and i dont change priority levels) is when A.
> > > i do a cdparanoia -Z -B "1-"    or dbench 32.   I bet if someone did
> > > these tests on scsi hardware with the latency patch, they'd find much
> > > better results than us users of ide devices.
> >
> > Nope.
> > If you would have read (all) posts about this and related threads you
> > should have noticed that I am and others running SCSI systems...
>
> hrmm  strange because the only thing that could be causing the soundcard to
> skip would be irq requests still stuck in the cpu as far as i know.  I only
> get that with massive ide access and that's it.  Also that is when linux is
> juggling them all equally.
>
> > > even i dont get any skips when i run the player at nice -n -20.
> >
> > During dbench 16/32 and higher? Are you sure?
>
> I ran it myself and i dont drink alcohol or take drugs.  so yea, i'm sure
> :)

:-)))

I just wanted to be sure, 'cause I get this regularly.

> If i went high enough i suppose the same problem would occur.  it's
> probably in an area of the kernel where the preempt patch doesn't work
> (yet).    It does happen on cdparanoia -Z -B "1"

I am under the impression that we haven't found the cause of it yet.

>    I dont think anything ide is safe from that.

No comment on this...:-)

So, are you running ReiserFS or what?

Thanks,
	Dieter

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] Preemption Latency Measurement Tool
  2001-09-23  0:38                       ` Roger Larsson
@ 2001-09-23  1:42                         ` safemode
  0 siblings, 0 replies; 55+ messages in thread
From: safemode @ 2001-09-23  1:42 UTC (permalink / raw)
  To: Roger Larsson, Robert Love; +Cc: linux-kernel

On Saturday 22 September 2001 20:38, Roger Larsson wrote:
> On Sunday 23 September 2001 01.40, safemode wrote:
> > ok. The preemption patch helps realtime applications in linux be a little
> > more close to realtime.  I understand that.  But your mp3 player
> > shouldn't need root permission or renicing or realtime priority flags to
> > play mp3s. To test how well the latency patches are working you should be
> > running things all at the same priority.  The main issue people are
> > having with skipping mp3s is not in the decoding of the mp3 or in the
> > retrieving of the file, it's in the playing in the soundcard.  That's
> > being affected by dbench flooding the system with irq requests.  I'm
> > inclined to believe it's irq requests because the _only_ time i have
> > problems with mp3s (and i dont change priority levels) is when A. i do a
> > cdparanoia -Z -B "1-"    or dbench 32.   I bet if someone did these tests
> > on scsi hardware with the latency patch, they'd find much better results
> > than us users of ide devices.
>
> No, irq might have something to do with it but it is unlikely since the
> stops are too long.
>
> Much more likely is:
> a) when running without priority altered:
> MP3 playing requires processing, this will lower the process priority
> relative processes that does little processing - like dbench processes.
> Running with negative nice will help this, but is no guarantee. Running
> with RT priority is.
> [this is where the various low latency patches helps]

I dont think mp3 playing should require realtime priority.  if the problem is
not getting enough cpu time or attention from the kernel, then it's a kernel
scheduling problem.  You shouldn't have to "cheat" and bypass all that.  in
an ideal situation the kernel should be able to see that the mp3 process is
running and is thus more important than stuff like dbench processes.
This problem we're having with mp3 skippage is not a cpu-bound problem,
though.  We're dealing with another resource here that's not so predictable,
because the kernel's scheduling handles cpu time slices very nicely.


> b) several processes are doing disk operations simultaneously.
> This will put preassure on the VM to free pages fast enough. It will
> also put preassure on the disk IO to read from disk fast enough -
> these requests are not prioritized with the process priority in mind.
> If there are no pages free we have to wait...
> [if you run the latencytest program with no audiofile, to use a sine,
>  you will not hear dropouts... it runs with RT prio with resident memory]

this may be the reason, although i dont think RT prio has anything to do
with it.  If RT prio doesn't affect this VM portion of the kernel, then whether
we use RT or not for something as tiny as mp3 playing would hardly matter.
Our sound drivers may not be able to play nicely with the VM under heavy disk
access like cdparanoia -Z or dbench 32 and above.


> This second problem is MUCH harder to solve.
> Multimedia applications could reserve memory - for their critical code
> including buffers... (but this will require suid helpers...)
> SGIs filesystem XFS is said to be able to guarantee bandwith to an
> application.
>
> Riels schedule in __alloc_pages probably helps the case with competing
> regular processes a lot. Not allowing memory allocators to run their
> whole time slot. The result should be a way to prioritize memory allocs
> relative your priority. (yield part might be possible/good to remove)

Sounds interesting. 
> /RogerL

Setting to RT should guarantee that your process gets executed when it wants
to be and processed by the kernel, which should help when dealing with something
that holds on to a part of the kernel for an unpredictable amount of time, or
perhaps just a length of time that's too long.  But this vm scheduling thing
that can break up those requests should allow the kernel to organize more of
its processes to run smoothly at the same time, without the need
to give one higher priority than the other.  Perhaps make the default alloc
time slot small, so no process gets accidentally starved; and if you
want a memory-hogging program to run at full performance and not be
hampered by multitasking, running at a higher priority would increase its
mem alloc time.  mp3 playing requires a very minimal amount of resources.
it should be made so that these minimal programs can run
smoothly without resource-hogging programs starving them, unless you
specifically want them to (ie. increase the priority).

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] Preemption Latency Measurement Tool
  2001-09-21 15:48                 ` george anzinger
  2001-09-22 21:09                   ` Dieter Nützel
@ 2001-09-23  2:44                   ` Robert Love
  1 sibling, 0 replies; 55+ messages in thread
From: Robert Love @ 2001-09-23  2:44 UTC (permalink / raw)
  To: Dieter Nützel
  Cc: george anzinger, Oliver Xymoron, Andrea Arcangeli, Roger Larsson,
	linux-kernel, ReiserFS List

On Sat, 2001-09-22 at 17:09, Dieter Nützel wrote:
> Yes, I do like I've posted in all the related threads.
> 
> Here is some more input for you. I hope it could help nail it down.
> 
> Regards,
> 	Dieter
> 
> BTW I'll now switching to 2.4.10-pre14
>
> SunWave1#time tar xIf /Pakete/Linux/linux-2.4.9.tar.bz2
> 34.570u 5.350s 0:47.84 83.4%    0+0k 0+0io 295pf+0w
> 
> I "hear" some have disk activities (disk trashing).
> 
> SunWave1#sync
> 
> Runs for ages!!!

It has to flush that whole tar job you just did, which it did in memory
:)

> User	 CPU  0%
> System	 CPU  0%
> Idle	 CPU 99%
> 
> So where did it wait???
> 
> Here comes what latencytimes show:
> 
> Worst 20 latency times of 5061 measured in this period.
>   usec      cause     mask   start line/file      address   end line/file
> 322263   reacqBKL        1  1375/sched.c         c01138b4   697/sched.c
> 216174        BKL        1    30/inode.c         c016b971    52/inode.c
> 158328        BKL        9   742/block_dev.c     c0144d51   697/sched.c

OK, these three are not acceptable...adding them to the "list".

-- 
Robert M. Love
rml at ufl.edu
rml at tech9.net


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] Preemption Latency Measurement Tool
  2001-09-22 21:09                   ` Dieter Nützel
                                       ` (4 preceding siblings ...)
  2001-09-23  0:42                     ` Dieter Nützel
@ 2001-09-23  2:50                     ` Robert Love
  2001-09-23  3:14                       ` george anzinger
  2001-09-23  2:54                     ` Robert Love
  2001-09-23  2:58                     ` Robert Love
  7 siblings, 1 reply; 55+ messages in thread
From: Robert Love @ 2001-09-23  2:50 UTC (permalink / raw)
  To: safemode
  Cc: Dieter Nützel, george anzinger, Oliver Xymoron,
	Andrea Arcangeli, Roger Larsson, linux-kernel, ReiserFS List

On Sat, 2001-09-22 at 19:40, safemode wrote:
> ok. The preemption patch helps realtime applications in linux be a little 
> more close to realtime.  I understand that.  But your mp3 player shouldn't 
> need root permission or renicing or realtime priority flags to play mp3s.

It doesn't; it needs them to play with a dbench 32 running in the
background.  That isn't necessarily acceptable either, but it is a
difference.

Note that one thing the preemption patch really does is make `realtime'
apps excel.  Without it, regardless of the priority of the application, the
app can be starved due to something in kernel mode.  Now it can't, and
since said application is high priority, it will get the CPU when it
wants it.

This is not to say the preemption patch is no good if you don't run
stuff at realtime --  I don't (who uses nice, anyhow? :>), and I notice
a difference.

> To 
> test how well the latency patches are working you should be running things 
> all at the same priority.  The main issue people are having with skipping 
> mp3s is not in the decoding of the mp3 or in the retrieving of the file, it's 
> in the playing in the soundcard.  That's being affected by dbench flooding 
> the system with irq requests.  I'm inclined to believe it's irq requests 
> because the _only_ time i have problems with mp3s (and i dont change priority 
> levels) is when A. i do a cdparanoia -Z -B "1-"    or dbench 32.   I bet if 
> someone did these tests on scsi hardware with the latency patch, they'd find 
> much better results than us users of ide devices.  

The skips are really too big to be irq requests, although perhaps you are
right in that the handling of the irq (we disable preemption during
irq_off, of course, but also during bottom-half execution) is the
problem.

However, I am more inclined to believe it is something else.  All these
long held locks can indeed be the problem.

I am on an all UW2 SCSI system, and I have no major blips playing during
a `dbench 16' (never ran 32).  However, many other users (Dieter, I
believe) are on a SCSI system too.

> even i dont get any skips when i run the player at nice -n -20.   That 
> doesn't tell you much about the preemption patch though.  And it doesn't tell 
> you about performance when you dont give linux the chance to do what it does, 
>  multitask.  That's where the latency patch is directed at improving, i 
> think.  

Agreed.

-- 
Robert M. Love
rml at ufl.edu
rml at tech9.net


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] Preemption Latency Measurement Tool
  2001-09-22 21:09                   ` Dieter Nützel
                                       ` (5 preceding siblings ...)
  2001-09-23  2:50                     ` Robert Love
@ 2001-09-23  2:54                     ` Robert Love
  2001-09-27  0:02                       ` [reiserfs-list] " Dieter Nützel
  2001-09-23  2:58                     ` Robert Love
  7 siblings, 1 reply; 55+ messages in thread
From: Robert Love @ 2001-09-23  2:54 UTC (permalink / raw)
  To: Dieter Nützel
  Cc: safemode, george anzinger, Oliver Xymoron, Andrea Arcangeli,
	Roger Larsson, linux-kernel, ReiserFS List

On Sat, 2001-09-22 at 19:46, Dieter Nützel wrote:
> Nope.
> If you would have read (all) posts about this and related threads you should 
> have noticed that I am and others running SCSI systems...
> 
> >
> > even i dont get any skips when i run the player at nice -n -20. 
> 
> During dbench 16/32 and higher? Are you sure?

I can't speak for safemode, but doing something like:

[22:43:47]rml@phantasy:~$ mpg321 /home/mp3/Get_Up_Kids-Woodson.mp3 &
[22:52:03]rml@phantasy:~$ dbench 16

I don't get any blips (nothing over 0.5s, anyhow, I would wager).

This is a P3-733, i815 board, 384MB PC133, AHA-2940U2W, U2W IBM 9GB system.

Linux 2.4.9-ac14 + preempt + more

-- 
Robert M. Love
rml at ufl.edu
rml at tech9.net


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] Preemption Latency Measurement Tool
  2001-09-22 21:09                   ` Dieter Nützel
                                       ` (6 preceding siblings ...)
  2001-09-23  2:54                     ` Robert Love
@ 2001-09-23  2:58                     ` Robert Love
  7 siblings, 0 replies; 55+ messages in thread
From: Robert Love @ 2001-09-23  2:58 UTC (permalink / raw)
  To: safemode; +Cc: Dieter Nützel, linux-kernel

On Sat, 2001-09-22 at 20:15, safemode wrote:
> hrmm  strange because the only thing that could be causing the soundcard to 
> skip would be irq requests still stuck in the cpu as far as i know.  I only 
> get that with massive ide access and that's it.  Also that is when linux is 
> juggling them all equally. 

The soundcard will skip whenever there is not enough data for it to
play; this would be caused by the mp3 player not getting the needed CPU
time (or the PCI bus being saturated or some such).

This could be caused by more than just irq requests -- anything that
keeps the kernel from scheduling the mp3 player readily enough.  With the
preemption patch, this can occur not just during interrupt handling but
under any concurrency lock.

> > > even i dont get any skips when i run the player at nice -n -20.
> >
> > During dbench 16/32 and higher? Are you sure?
> 
> I ran it myself and i dont drink alcohol or take drugs.  so yea, i'm sure :)

Heh :)
 
> If i went high enough i suppose the same problem would occur.  it's probably 
> in an area of the kernel where the preempt patch doesn't work (yet).    It 
> does happen on cdparanoia -Z -B "1"    I dont think anything ide is safe from 
> that.    

By design, preemption does not occur under any lock (right now we use
SMP locks).  This means all the spinlocks in the kernel disable
preemption, the big kernel lock disables preemption, irq_off obviously
does, etc.

What we need to do now is identify the long-held locks and -- if they
are causing problems -- do something about them.
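
Mechanically it is a per-task nesting counter that every lock primitive
bumps; preemption is legal only while the counter is zero.  A simplified
sketch (not the exact patch):

/* include/linux/spinlock.h with CONFIG_PREEMPT, roughly: */
#define preempt_disable()	do { current->preempt_count++; barrier(); } while (0)

#define preempt_enable()						\
do {									\
	barrier();							\
	if (--current->preempt_count == 0 && current->need_resched)	\
		preempt_schedule();	/* run the deferred reschedule */ \
} while (0)

#define spin_lock(l)	do { preempt_disable(); _raw_spin_lock(l); } while (0)
#define spin_unlock(l)	do { _raw_spin_unlock(l); preempt_enable(); } while (0)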

-- 
Robert M. Love
rml at ufl.edu
rml at tech9.net


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] Preemption Latency Measurement Tool
       [not found]                     ` <200109222340.BAA37547@blipp.internet5.net>
  2001-09-23  0:38                       ` Roger Larsson
@ 2001-09-23  3:02                       ` Robert Love
  2001-09-23 16:43                         ` Roger Larsson
  1 sibling, 1 reply; 55+ messages in thread
From: Robert Love @ 2001-09-23  3:02 UTC (permalink / raw)
  To: Roger Larsson
  Cc: safemode, Dieter Nützel, george anzinger, Andrea Arcangeli,
	linux-kernel

On Sat, 2001-09-22 at 20:38, Roger Larsson wrote:
> Riels schedule in __alloc_pages probably helps the case with competing
> regular processes a lot. Not allowing memory allocators to run their
> whole time slot. The result should be a way to prioritize memory allocs
> relative your priority. (yield part might be possible/good to remove)

When did this go in?  I assume it's in the 2.4.9-ac series and not
2.4.10?

-- 
Robert M. Love
rml at ufl.edu
rml at tech9.net


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] Preemption Latency Measurement Tool
  2001-09-23  2:50                     ` Robert Love
@ 2001-09-23  3:14                       ` george anzinger
  2001-09-23  4:06                         ` Dieter Nützel
  0 siblings, 1 reply; 55+ messages in thread
From: george anzinger @ 2001-09-23  3:14 UTC (permalink / raw)
  To: Robert Love
  Cc: safemode, Dieter Nützel, Oliver Xymoron, Andrea Arcangeli,
	Roger Larsson, linux-kernel, ReiserFS List

Robert Love wrote:
> 
> On Sat, 2001-09-22 at 19:40, safemode wrote:
> > ok. The preemption patch helps realtime applications in linux be a little
> > more close to realtime.  I understand that.  But your mp3 player shouldn't
> > need root permission or renicing or realtime priority flags to play mp3s.
> 
> It doesn't, it needs them to play with a dbench 32 running in the
> background.  This isn't nessecarily acceptable, either, but it is a
> difference.
> 
> Note one thing the preemption patch does is really make `realtime' apps
> accel.  Without it, regardless of the priority of the application, the
> app can be starved due to something in kernel mode.  Now it can't, and
> since said application is high priority, it will get the CPU when it
> wants it.
> 
> This is not to say the preemption patch is no good if you don't run
> stuff at realtime --  I don't (who uses nice, anyhow? :>), and I notice
> a difference.
> 
> > To
> > test how well the latency patches are working you should be running things
> > all at the same priority.  The main issue people are having with skipping
> > mp3s is not in the decoding of the mp3 or in the retrieving of the file, it's
> > in the playing in the soundcard.  That's being affected by dbench flooding
> > the system with irq requests.  I'm inclined to believe it's irq requests
> > because the _only_ time i have problems with mp3s (and i dont change priority
> > levels) is when A. i do a cdparanoia -Z -B "1-"    or dbench 32.   I bet if
> > someone did these tests on scsi hardware with the latency patch, they'd find
> > much better results than us users of ide devices.
> 
> The skips are really big to be irq requests, although perhaps you are
> right in that the handling of the irq (we disable preemption during
> irq_off, of course, but also during bottom half execution) is the
> problem.
> 
> However, I am more inclined to believe it is something else.  All these
> long held locks can indeed be the problem.
> 
> I am on an all UW2 SCSI system, and I have no major blips playing during
> a `dbench 16' (never ran 32).  However, many other users (Dieter, I
> believe) are on a SCSI system too.

Dieter, could you post your .config file?  It might have a clue or two.

George

> 
> > even i dont get any skips when i run the player at nice -n -20.   That
> > doesn't tell you much about the preemption patch though.  And it doesn't tell
> > you about performance when you dont give linux the chance to do what it does,
> >  multitask.  That's where the latency patch is directed at improving, i
> > think.
> 
> Agreed.
> 
> --
> Robert M. Love
> rml at ufl.edu
> rml at tech9.net

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] Preemption Latency Measurement Tool
  2001-09-22  7:22     ` Andre Pang
@ 2001-09-23  3:18       ` george anzinger
  2001-09-23  3:21       ` Robert Love
  1 sibling, 0 replies; 55+ messages in thread
From: george anzinger @ 2001-09-23  3:18 UTC (permalink / raw)
  To: Andre Pang
  Cc: Robert Love, linux-kernel, safemode, Dieter.Nuetzel, iafilius, ilsensine

Andre Pang wrote:
> 
> On Sat, Sep 22, 2001 at 02:10:18AM -0400, Robert Love wrote:
> 
> > > i did a test of it on linux-2.4.10-pre13 with Benno Senoner's
> > > lowlatency program, which i hacked up a bit to output
> > > /proc/latencytimes after each of the graphs.  test results are at
> > >
> > >     http://www.algorithm.com.au/hacking/linux-lowlatency/2.4.10-pre13-pes/
> > >
> > > and since i stared at the results in disbelief, i won't even try
> > > to guess what's going on :).  maybe you can make some sense of
> > > it?
> >
> > Well, its not hard to decipher...and really, its actually fairly good.
> > the latency test program is giving you a max latency of around 12ms in
> > each test, which is OK.
> 
> arrgh!  i just realised my script buggered up and was producing the same
> graph for all the results.  please have a look at the page again, sorry.
> 
> apart from that, i'm still confused.  compared to other graphs produced
> by the latencytest program, my system seems to have huge latencies.
> unless i'm reading it wrongly, the graph is saying that i'm getting
> latencies of up to 30ms, and a lot of overruns.  compare this to
> 
>     http://www.gardena.net/benno/linux/audio/2.4.0-test2/3x256.html
> 
> which shows latencytest on 2.4.0-test2, and
> 
>     http://www.gardena.net/benno/linux/audio/2.2.10-p133-3x128/3x128.html
> 
> which are the results for latencytest on 2.2.10.  admittedly these
> kernels are much older, but i'm consistently getting far more latency
> than those kernels.  that's the bit i'm confused about :)  i've tried
> Andrew Morton's low-latency patches as well, to no avail.  i've made
> sure i've tuned my hard disks correctly, and i don't have any other
> realtime processes running.
> 
> am i concerned with a different issue than the one you're addressing?
> 
> > the preemption-test patch is showing _MAX_ latencies of 0.8ms through
> > 12ms.  this is fine, too.
> 
> yep, i agree with that ... so why is latencytest showing scheduling
> latencies of > 30ms?  i get the feeling i'm confusing two different
> issues here.  from what i understand, /proc/latencytimes shows the
> how long it takes for various functions in the kernel to finish, and
> the latencytest result shows how long it takes for it to be
> re-scheduled (represented by the white line on the graph).

The one thing the latencytimes patch doesn't monitor is interrupt-off
time.  Maybe it should...


George

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] Preemption Latency Measurement Tool
  2001-09-22  7:22     ` Andre Pang
  2001-09-23  3:18       ` george anzinger
@ 2001-09-23  3:21       ` Robert Love
  1 sibling, 0 replies; 55+ messages in thread
From: Robert Love @ 2001-09-23  3:21 UTC (permalink / raw)
  To: george anzinger
  Cc: Andre Pang, linux-kernel, safemode, Dieter.Nuetzel, iafilius, ilsensine

On Sat, 2001-09-22 at 23:18, george anzinger wrote:
> The one thing the latancytimes patch doesn't monitor is interrupt off
> time.  Maybe it should...

Yes, it should.  And it could... I will add this.

-- 
Robert M. Love
rml at ufl.edu
rml at tech9.net


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] Preemption Latency Measurement Tool
  2001-09-23  3:14                       ` george anzinger
@ 2001-09-23  4:06                         ` Dieter Nützel
  0 siblings, 0 replies; 55+ messages in thread
From: Dieter Nützel @ 2001-09-23  4:06 UTC (permalink / raw)
  To: george anzinger, Robert Love
  Cc: safemode, Oliver Xymoron, Andrea Arcangeli, Roger Larsson,
	linux-kernel, ReiserFS List

[-- Attachment #1: Type: text/plain, Size: 2530 bytes --]

Am Sonntag, 23. September 2001 05:14 schrieb george anzinger:
> Robert Love wrote:
> > On Sat, 2001-09-22 at 19:40, safemode wrote:
> > > ok. The preemption patch helps realtime applications in linux be a
> > > little more close to realtime.  I understand that.  But your mp3 player
> > > shouldn't need root permission or renicing or realtime priority flags
> > > to play mp3s.
> >
> > It doesn't, it needs them to play with a dbench 32 running in the
> > background.  This isn't nessecarily acceptable, either, but it is a
> > difference.
> >
> > Note one thing the preemption patch does is really make `realtime' apps
> > accel.  Without it, regardless of the priority of the application, the
> > app can be starved due to something in kernel mode.  Now it can't, and
> > since said application is high priority, it will get the CPU when it
> > wants it.
> >
> > This is not to say the preemption patch is no good if you don't run
> > stuff at realtime --  I don't (who uses nice, anyhow? :>), and I notice
> > a difference.
> >
> > > To
> > > test how well the latency patches are working you should be running
> > > things all at the same priority.  The main issue people are having with
> > > skipping mp3s is not in the decoding of the mp3 or in the retrieving of
> > > the file, it's in the playing in the soundcard.  That's being affected
> > > by dbench flooding the system with irq requests.  I'm inclined to
> > > believe it's irq requests because the _only_ time i have problems with
> > > mp3s (and i dont change priority levels) is when A. i do a cdparanoia
> > > -Z -B "1-"    or dbench 32.   I bet if someone did these tests on scsi
> > > hardware with the latency patch, they'd find much better results than
> > > us users of ide devices.
> >
> > The skips are really big to be irq requests, although perhaps you are
> > right in that the handling of the irq (we disable preemption during
> > irq_off, of course, but also during bottom half execution) is the
> > problem.
> >
> > However, I am more inclined to believe it is something else.  All these
> > long held locks can indeed be the problem.
> >
> > I am on an all UW2 SCSI system, and I have no major blips playing during
> > a `dbench 16' (never ran 32).  However, many other users (Dieter, I
> > believe) are on a SCSI system too.
>
> Dieter, could you post your .config file?  It might have a clue or two.

Here it comes.

Good night ;-)

-Dieter

BTW I have very good results (not the hiccup) with 2.4.10-pre14 + the
ReiserFS journal.c-2 patch from Chris

[-- Attachment #2: config.bz2 --]
[-- Type: application/x-bzip2, Size: 4370 bytes --]

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] Preemption Latency Measurement Tool
  2001-09-22  6:10   ` Robert Love
  2001-09-22  7:22     ` Andre Pang
  2001-09-22 12:56     ` ksoftirqd? (Was: Re: [PATCH] Preemption Latency Measurement Tool) Roger Larsson
@ 2001-09-23  7:05     ` Robert Love
  2001-09-23 12:03       ` Andre Pang
  2001-09-23 18:31       ` Robert Love
  2 siblings, 2 replies; 55+ messages in thread
From: Robert Love @ 2001-09-23  7:05 UTC (permalink / raw)
  To: Andre Pang
  Cc: linux-kernel, safemode, Dieter.Nuetzel, iafilius, ilsensine, george

On Sat, 2001-09-22 at 03:22, Andre Pang wrote:
> arrgh!  i just realised my script buggered up and was producing the same
> graph for all the results.  please have a look at the page again, sorry.

no problem...

> apart from that, i'm still confused.  compared to other graphs produced
> by the latencytest program, my system seems to have huge latencies.
> unless i'm reading it wrongly, the graph is saying that i'm getting
> latencies of up to 30ms, and a lot of overruns.  compare this to
> 
>     http://www.gardena.net/benno/linux/audio/2.4.0-test2/3x256.html
> 
> which shows latencytest on 2.4.0-test2, and
> 
>     http://www.gardena.net/benno/linux/audio/2.2.10-p133-3x128/3x128.html
> 
> which are the results for latencytest on 2.2.10.  admittedly these
> kernels are much older, but i'm consistently getting far more latency
> than those kernels.  that's the bit i'm confused about :)  i've tried
> Andrew Morton's low-latency patches as well, to no avail.  i've made
> sure i've tuned my hard disks correctly, and i don't have any other
> realtime processes running.
> 
> am i concerned with a different issue than the one you're addressing?

No, I understand now.  I honestly don't know what to say about your
results.

You are right, those older kernels are showing much better response
times than your kernel.  One would think your newer kernel, with
preemption or low-latency patch, would be an improvement.

I honestly don't know what to tell you.  It could be a piece of hardware
(or, more accurately, its driver) ...

the /proc/latencytimes output shows us that no single lock is accounting
for your bad times.  In fact, all your locks aren't that bad, so...

maybe the problem is in the "overruns" -- I don't know what that means
exactly.  maybe someone else on the list can shed some light?
otherwise, you can email the author perhaps.

Whatever buffer is overrunning (and you are getting quite a few of
those overruns, compared to those other URLs you posted), something is
stalling or repeating.

> > the preemption-test patch is showing _MAX_ latencies of 0.8ms through
> > 12ms.  this is fine, too.
> 
> yep, i agree with that ... so why is latencytest showing scheduling
> latencies of > 30ms?  i get the feeling i'm confusing two different
> issues here.  from what i understand, /proc/latencytimes shows the
> how long it takes for various functions in the kernel to finish, and
> the latencytest result shows how long it takes for it to be
> re-scheduled (represented by the white line on the graph).

Pretty much.

/proc/latencytimes shows you how long specific locks are held in the
kernel, so this isn't the same as how long certain functions take (which
could be longer or shorter).  The reason for showing lock duration is
that preemption (with the preemptible kernel patch) cannot occur while a
lock is held, for concurrency reasons, so we want to find long-held locks
and try to find a solution to them.
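
Conceptually the measurement is just a timestamp pair wrapped around
each non-preemptible region -- a sketch of the idea (record_start() and
record_end() are illustrative names, not the patch's actual ones):

/* on taking the first lock, preemption becomes impossible: stamp it */
static void latency_enter(int cause, const char *file, int line)
{
	if (current->preempt_count++ == 0)
		record_start(cause, file, line, get_cycles());
}

/* on dropping the last lock, log how long we were unpreemptible */
static void latency_exit(const char *file, int line)
{
	if (--current->preempt_count == 0)
		record_end(file, line, get_cycles());	/* keeps the worst 20 */
}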

latencytest does indeed show scheduling latency.  It tries to show you
how well the system can keep up with the audio buffer.  With a
preemptible kernel, you can reduce this latency to a very small number,
barring any large locks (from above).

latencytest's latency is a function of the sum of the latencies reported
by /proc/latencytimes.  Thus, a bad latency in latencytest could be
caused by what you see in /proc/latencytimes.  You would think we would
see one or two long locks screwing everything up, but we do not.

I would try to see what the overruns mean exactly, and see if any
hardware drivers could be at fault (right now we don't report time with
interrupts disabled in /proc/latencytimes, which could be a factor too).

-- 
Robert M. Love
rml at ufl.edu
rml at tech9.net


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] Preemption Latency Measurement Tool
  2001-09-23  7:05     ` [PATCH] Preemption Latency Measurement Tool Robert Love
@ 2001-09-23 12:03       ` Andre Pang
  2001-09-23 18:31       ` Robert Love
  1 sibling, 0 replies; 55+ messages in thread
From: Andre Pang @ 2001-09-23 12:03 UTC (permalink / raw)
  To: Robert Love
  Cc: linux-kernel, safemode, Dieter.Nuetzel, iafilius, ilsensine, george

On Sun, Sep 23, 2001 at 03:05:41AM -0400, Robert Love wrote:

> You are right, those older kernels are showing much better response
> times than your kernel.  One would think your newer kernel, with the
> preemption or low-latency patches, would be an improvement.
> 
> I honestly don't know what to tell you.  It could be a piece of hardware
> (or, more accurately, its driver) ...

I found out why: it's the driver.  I'm an idiot for not trying
this before ... I was previously using ALSA's snd-card-ymfpci
driver; I switched to the OSS ymfpci driver that comes with the
kernel.  My latencies used to be 15ms on average, with spikes >
30ms; they're now ~3ms with some occasional spikes up to 10ms.

The graphs and numbers are up at
http://www.algorithm.com.au/hacking/linux-lowlatency/

> The /proc/latencytimes output shows us that no single lock is accounting
> for your bad times.  In fact, none of your locks are that bad, so...
> 
> Maybe the problem is in the "overruns" -- I don't know what that means
> exactly.  Maybe someone else on the list can shed some light?
> Otherwise, you can perhaps email the author.

An over-run occurs when the latencytest program, which plays a
continuous sound, doesn't get re-scheduled quickly enough.  This
results in a sound dropout because it can't re-fill its buffer.
This is similar to what you would get in XMMS, except that
latencytest simulates professional audio applications which must
run with _very_ small buffers in order to get low latencies.
(Imagine your computer being sync'ed in realtime with lots of
other music equipment and having it drag behind by 30ms -- it
doesn't sound good :).
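
The core of the technique is just this (a sketch of the general idea,
not latencytest's actual source; the rate, fragment size, and
write_fragment() helper are all assumptions for illustration):

#include <stdio.h>
#include <sys/time.h>

#define RATE    44100				/* Hz */
#define FRAG    128				/* samples per fragment */
#define FRAG_US (FRAG * 1000000LL / RATE)	/* ~2.9ms of audio */

void write_fragment(void);	/* assumed: blocks until the sound
				   card accepts one more fragment */

static long long now_us(void)
{
	struct timeval tv;
	gettimeofday(&tv, NULL);
	return tv.tv_sec * 1000000LL + tv.tv_usec;
}

int main(void)
{
	long long last = now_us();
	int overruns = 0;

	for (;;) {
		write_fragment();
		long long t = now_us();

		/* scheduled away for longer than one fragment of audio
		 * lasts?  then the card ran dry: that is an over-run */
		if (t - last > FRAG_US)
			fprintf(stderr, "over-run #%d\n", ++overruns);
		last = t;
	}
}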

But I guess my problem's solved ... thanks so much to Andrew,
yourself, MontaVista, Dieter and all the other guys who spend
their hours benchmarking so this can be improved!  If you still
want me to run benchmarks, let me know.  The 15 kernels I've
compiled since starting testing have gotta be useful for
something.


-- 
#ozone/algorithm <ozone@algorithm.com.au>          - trust.in.love.to.save

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] Preemption Latency Measurement Tool
  2001-09-23  3:02                       ` Robert Love
@ 2001-09-23 16:43                         ` Roger Larsson
  0 siblings, 0 replies; 55+ messages in thread
From: Roger Larsson @ 2001-09-23 16:43 UTC (permalink / raw)
  To: Robert Love
  Cc: safemode, Dieter Nützel, george anzinger, Andrea Arcangeli,
	linux-kernel

On Sunday 23 September 2001 05.02, Robert Love wrote:
> On Sat, 2001-09-22 at 20:38, Roger Larsson wrote:
> > Riel's schedule() in __alloc_pages probably helps the case of competing
> > regular processes a lot, by not allowing memory allocators to run their
> > whole time slot.  The result should be a way to prioritize memory allocs
> > relative to your priority.  (The yield part might be possible/good to
> > remove.)
>
> When did this go in?  I assume it's in the 2.4.9-ac series and not
> 2.4.10?

2.4.0-test something...
It was removed when Andrea's VM was introduced.
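
For reference, the idea was roughly a voluntary reschedule point at the
top of the allocator.  A sketch (simplified signature and an invented
do_the_allocation() helper -- not the actual 2.4.0-test code):

static struct page *alloc_pages_sketch(unsigned int order)
{
	/*
	 * Yield before doing the expensive work, so a memory-hungry
	 * task cannot hog the CPU for its whole time slot and
	 * higher-priority tasks get in first.
	 */
	if (current->need_resched)
		schedule();

	/* ... the normal allocation/reclaim path follows ... */
	return do_the_allocation(order);
}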

/RogerL

-- 
Roger Larsson
Skellefteå
Sweden

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] Preemption Latency Measurement Tool
  2001-09-23  7:05     ` [PATCH] Preemption Latency Measurement Tool Robert Love
  2001-09-23 12:03       ` Andre Pang
@ 2001-09-23 18:31       ` Robert Love
  1 sibling, 0 replies; 55+ messages in thread
From: Robert Love @ 2001-09-23 18:31 UTC (permalink / raw)
  To: Andre Pang
  Cc: linux-kernel, safemode, Dieter.Nuetzel, iafilius, ilsensine, george

On Sun, 2001-09-23 at 08:03, Andre Pang wrote:
> I found out why: it's the driver.  I'm an idiot for not trying
> this before ... I was previously using ALSA's snd-card-ymfpci
> driver; I switched to the OSS ymfpci driver that comes with the
> kernel.  My latencies used to be 15ms on average, with spikes >
> 30ms; they're now ~3ms with some occasional spikes up to 10ms.

You are the second or third person to report high latencies with ALSA
drivers.  I wonder what braindead locking they do?

Good find.

> The graphs and numbers are up at
> http://www.algorithm.com.au/hacking/linux-lowlatency/

Nice results.

> > Maybe the problem is in the "overruns" -- I don't know what that means
> > exactly.  Maybe someone else on the list can shed some light?
> > Otherwise, you can perhaps email the author.
> 
> An over-run occurs when the latencytest program, which plays a
> continuous sound, doesn't get re-scheduled quickly enough.  This
> results in a sound dropout because it can't re-fill its buffer.
> This is similar to what you would get in XMMS, except that
> latencytest simulates professional audio applications which must
> run with _very_ small buffers in order to get low latencies.
> (Imagine your computer being sync'ed in realtime with lots of
> other music equipment and having it drag behind by 30ms -- it
> doesn't sound good :).

Oh, OK -- it's pretty much the number of times the scheduling latency
was greater than the audio buffer time.
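
(For a sense of scale: if the 3x128 latencytest runs mean 128-sample
fragments at 44.1 kHz, one fragment is 128/44100 = ~2.9ms of audio, so
a single 30ms stall spans roughly ten fragments -- every such stall is
an overrun and an audible dropout.)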

> But I guess my problem's solved ... thanks so much to Andrew,
> yourself, MontaVista, Dieter and all the other guys who spend
> their hours benchmarking so this can be improved!  If you still
> want me to run benchmarks, let me know.  The 15 kernels I've
> compiled since starting testing have gotta be useful for
> something.

You are welcome :)

I should probably take a look at the ALSA code and see if I can find
exactly what the culprit is.

Thank you for the feedback; now we know.

-- 
Robert M. Love
rml at ufl.edu
rml at tech9.net


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [reiserfs-list] Re: [PATCH] Preemption Latency Measurement Tool
  2001-09-23  2:54                     ` Robert Love
@ 2001-09-27  0:02                       ` Dieter Nützel
  0 siblings, 0 replies; 55+ messages in thread
From: Dieter Nützel @ 2001-09-27  0:02 UTC (permalink / raw)
  To: Robert Love
  Cc: george anzinger, Andrea Arcangeli, Ingo Molnar, Roger Larsson,
	linux-kernel, ReiserFS List

Hello Robert, all the others,

here are my latest results.
I've tried 2.4.10+preempt with some additional patches; the latest one I
applied was softirq-2.4.10-A7.  None of it fixed my reported MP3
playback hiccups during dbench 16/32.

See the numbers below.

Regards,
	Dieter

2.4.10+
patch-rml-2.4.10-preempt-kernel-1+
patch-rml-2.4.10-preempt-ptrace-and-jobs-fix+
patch-rml-2.4.10-preempt-stats-1+
inode.c-schedule.patch+
journal.c-1-patch

32 clients started
.....................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................++............................................+.......+++..+....+++..+...+.....+.++...+.++++.+++.++++++.+++********************************
Throughput 38.6878 MB/sec (NB=48.3597 MB/sec  386.878 MBit/sec)
14.200u 54.940s 1:50.21 62.7%   0+0k 0+0io 911pf+0w
max load: 1777

Version 1.92a       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
SunWave1      1248M    79  97 16034  21  5719   6   147  98 22904  16 269.0   4
Latency               138ms    2546ms     201ms   97838us   58940us    3207ms
Version 1.92a       ------Sequential Create------ --------Random Create--------
SunWave1            -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16  6121  75 +++++ +++ 12753  95  8422  80 +++++ +++ 11152  95
Latency             26126us   11425us   11879us    5325us   12082us   13025us
1.92a,1.92a,SunWave1,1001286857,1248M,79,97,16034,21,5719,6,147,98,22904,16,269.0,4,16,,,,,6121,75,+++++,+++,12753,95,8422,80,+++++,+++,11152,95,138ms,2546ms,201ms,97838us,58940us,3207ms,26126us,11425us,11879us,5325us,12082us,13025us

After running VTK (a visualization app) I get this:

Worst 20 latency times of 1648 measured in this period.
  usec      cause     mask   start line/file      address   end line/file
  7239  spin_lock        1   381/memory.c        c012808f   402/memory.c
   321        BKL        0  2754/buffer.c        c01415ba   697/sched.c
   312        BKL        0   359/buffer.c        c013d6dc  1381/sched.c
   280        BKL        0   359/buffer.c        c013d6dc  1380/sched.c
   252   reacqBKL        0  1375/sched.c         c0115334  1381/sched.c
   232  spin_lock        0   547/sched.c         c0113574   697/sched.c
   215       eth1        0   585/irq.c           c01089af   647/irq.c
   164        BKL        0   452/exit.c          c011b4d1   681/tty_io.c
   119        BKL        0  1437/namei.c         c014cabf   697/sched.c
   105        BKL        0   452/exit.c          c011b4d1   697/sched.c
   101        BKL        5   712/tty_io.c        c01a6edb   714/tty_io.c
   100        BKL        0   452/exit.c          c011b4d1  1380/sched.c
    99    unknown        1    76/softirq.c       c011cba4   119/softirq.c
    92  spin_lock        4   468/netfilter.c     c01fe263   119/softirq.c
    79        BKL        0    42/file.c          c01714b0    63/file.c
    72        BKL        0   752/namei.c         c014b73f   697/sched.c
    71        BKL        0   533/inode.c         c016e0ad  1381/sched.c
    71        BKL        0    30/inode.c         c016d531    52/inode.c
    68        BKL        0   452/exit.c          c011b4d1  1381/sched.c
    66        BKL        0   927/namei.c         c014b94f   929/namei.c

Adhoc c012808e <zap_page_range+5e/260>

Do we need Rik's patch?

****************************************************************************

2.4.10+
patch-rml-2.4.10-preempt-kernel-1+
patch-rml-2.4.10-preempt-ptrace-and-jobs-fix+
patch-rml-2.4.10-preempt-stats-1+
inode.c-schedule.patch+
journal.c-1-patch+
kupdated-patch

32 clients started
...............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................+..........................................+.................+...........++.....................++.....++......+......+++++...++++++++.+++++++.++********************************
Throughput 38.9015 MB/sec (NB=48.6269 MB/sec  389.015 MBit/sec)
15.140u 60.640s 1:49.66 69.1%   0+0k 0+0io 911pf+0w
max load: 1654

Version 1.92a       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
SunWave1      1248M    84  99 16348  21  5746   6   142  99 23411  17 265.9   4
Latency               130ms    1868ms     192ms   88459us   54625us    3367ms
Version 1.92a       ------Sequential Create------ --------Random Create--------
SunWave1            -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16  4941  65 +++++ +++ 12916  96  6847  76 +++++ +++ 10785  94
Latency              8468us   11334us   11736us    8520us   12205us   12856us
1.92a,1.92a,SunWave1,1001358471,1248M,84,99,16348,21,5746,6,142,99,23411,17,265.9,4,16,,,,,4941,65,+++++,+++,12916,96,6847,76,+++++,+++,10785,94,130ms,1868ms,192ms,88459us,54625us,3367ms,8468us,11334us,11736us,8520us,12205us,12856us

Dbench run during MP3 playback:
32 clients started
.................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................+.........+.........................................+.+.....+...............+...........................................................................+...................++...........+.+.......+..++++.++.+.++.++++++.+++++********************************
Throughput 34.7025 MB/sec (NB=43.3782 MB/sec  347.025 MBit/sec)
15.290u 63.820s 2:02.74 64.4%   0+0k 0+0io 911pf+0w

Worst 20 latency times of 16578 measured in this period.
  usec      cause     mask   start line/file      address   end line/file
 26795  spin_lock        1   291/buffer.c        c0141a2c   280/buffer.c
 17330  spin_lock        1   341/vmscan.c        c0133f0a   402/vmscan.c
 12925  spin_lock        1   439/vmscan.c        c0133ea5   338/vmscan.c
 11923  spin_lock        1   291/buffer.c        c0141a2c   285/buffer.c
  7253        BKL        0  1302/inode.c         c016faa9  1381/sched.c
  7117        BKL        1  1302/inode.c         c016faa9   697/sched.c
  6097        BKL        0  1302/inode.c         c016faa9  1380/sched.c
  6000        BKL        1   533/inode.c         c016e11d   697/sched.c
  4870   reacqBKL        1  1375/sched.c         c0115334   929/namei.c
  4015  spin_lock        0   439/vmscan.c        c0133ea5   402/vmscan.c
  2075        BKL        1   452/exit.c          c011b4d1   697/sched.c
  2029  spin_lock        1   547/sched.c         c0113574   697/sched.c
  2010        BKL        0  1302/inode.c         c016faa9   842/inode.c
  1730        BKL        0  2754/buffer.c        c01415ba  2757/buffer.c
  1668        BKL        1  2754/buffer.c        c01415ba   697/sched.c
  1574  spin_lock        0   483/dcache.c        c01545da   520/dcache.c
  1416  spin_lock        0  1376/sched.c         c0115353  1380/sched.c
  1396  spin_lock        1  1376/sched.c         c0115353   697/sched.c
  1387    aic7xxx        1    76/softirq.c       c011cba4   119/softirq.c
  1341        BKL        1   533/inode.c         c016e11d   842/inode.c

Adhoc c0141a2c <kupdate+11c/210>
Adhoc c0133f0a <shrink_cache+37a/5b0>
Adhoc c0133ea4 <shrink_cache+314/5b0>
Adhoc c0141a2c <kupdate+11c/210>
Adhoc c016faa8 <reiserfs_dirty_inode+58/f0>
Adhoc c016faa8 <reiserfs_dirty_inode+58/f0>
Adhoc c016faa8 <reiserfs_dirty_inode+58/f0>
Adhoc c016e11c <reiserfs_get_block+9c/f30>
Adhoc c0115334 <preempt_schedule+34/b0>
Adhoc c0133ea4 <shrink_cache+314/5b0>
Adhoc c011b4d0 <do_exit+130/360>
Adhoc c0113574 <schedule+34/550>
Adhoc c016faa8 <reiserfs_dirty_inode+58/f0>
Adhoc c01415ba <sync_old_buffers+2a/130>
Adhoc c01415ba <sync_old_buffers+2a/130>
Adhoc c01545da <select_parent+3a/100>
Adhoc c0115352 <preempt_schedule+52/b0>
Adhoc c0115352 <preempt_schedule+52/b0>
Adhoc c011cba4 <do_softirq+34/150>
Adhoc c016e11c <reiserfs_get_block+9c/f30>

Re-reading /proc/latencytimes a few seconds later:

Worst 20 latency times of 1944 measured in this period.
  usec      cause     mask   start line/file      address   end line/file
  2028        BKL        1  1302/inode.c         c016faa9   697/sched.c
   584        BKL        0  1302/inode.c         c016faa9  1306/inode.c
   572  spin_lock        0  1376/sched.c         c0115353  1380/sched.c
   415        BKL        0  1302/inode.c         c016faa9  1381/sched.c
   356        BKL        0  2754/buffer.c        c01415ba  2757/buffer.c
   353        BKL        0  2754/buffer.c        c01415ba   697/sched.c
   328        BKL        0  2754/buffer.c        c01415ba  1381/sched.c
   278  spin_lock        0   381/memory.c        c012808f   402/memory.c
   274   reacqBKL        0  1375/sched.c         c0115334  1381/sched.c
   245  spin_lock        0   547/sched.c         c0113574   697/sched.c
   208       eth1        0   585/irq.c           c01089af   647/irq.c
   188        BKL        1   301/namei.c         c014a4b1   697/sched.c
   176        BKL        1   927/namei.c         c014b9bf   929/namei.c
   161        BKL        0   301/namei.c         c014a4b1  1380/sched.c
   154        BKL        0   533/inode.c         c016e11d   842/inode.c
   147        BKL        6   712/tty_io.c        c01a6f6b   714/tty_io.c
   141        BKL        0   301/namei.c         c014a4b1  1381/sched.c
   141        BKL        0    30/inode.c         c016d5a1    52/inode.c
   126   reacqBKL        0  1375/sched.c         c0115334  2757/buffer.c
   121   reacqBKL        0  1375/sched.c         c0115334   929/namei.c

Adhoc c016faa8 <reiserfs_dirty_inode+58/f0>
Adhoc c016faa8 <reiserfs_dirty_inode+58/f0>
Adhoc c0115352 <preempt_schedule+52/b0>
Adhoc c016faa8 <reiserfs_dirty_inode+58/f0>
Adhoc c01415ba <sync_old_buffers+2a/130>
Adhoc c01415ba <sync_old_buffers+2a/130>
Adhoc c01415ba <sync_old_buffers+2a/130>
Adhoc c012808e <zap_page_range+5e/260>
Adhoc c0115334 <preempt_schedule+34/b0>
Adhoc c0113574 <schedule+34/550>
Adhoc c01089ae <do_IRQ+3e/1d0>
Adhoc c014a4b0 <real_lookup+70/150>
Adhoc c014b9be <vfs_create+ae/150>
Adhoc c014a4b0 <real_lookup+70/150>
Adhoc c016e11c <reiserfs_get_block+9c/f30>
Adhoc c01a6f6a <tty_write+21a/2f0>
Adhoc c014a4b0 <real_lookup+70/150>
Adhoc c016d5a0 <reiserfs_delete_inode+30/110>
Adhoc c0115334 <preempt_schedule+34/b0>
Adhoc c0115334 <preempt_schedule+34/b0>

****************************************************************************

2.4.10+
patch-rml-2.4.10-preempt-kernel-1+
patch-rml-2.4.10-preempt-ptrace-and-jobs-fix+
patch-rml-2.4.10-preempt-stats-1+
inode.c-schedule.patch+
journal.c-1-patch+
kupdated-patch+
00_vm-tweaks-1

32 clients started
............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................+......................................+..+...+...+...............+...+...................+.....+++...+.++.+.+..+++++.+...++++++++++********************************
Throughput 37.9905 MB/sec (NB=47.4881 MB/sec  379.905 MBit/sec)
14.810u 62.280s 1:52.26 68.6%   0+0k 0+0io 911pf+0w
max load: 1764

Version 1.92a       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
SunWave1      1248M    87  98 16359  21  5839   6   155  99 23709  16 275.7   4
Latency               247ms    2351ms     191ms   84684us   67700us    3791ms
Version 1.92a       ------Sequential Create------ --------Random Create--------
SunWave1            -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16  5545  52 +++++ +++ 13434  95  5138  67 +++++ +++ 11297  95
Latency              4894us   11251us   11868us    4922us   12265us   12863us
1.92a,1.92a,SunWave1,1001467140,1248M,87,98,16359,21,5839,6,155,99,23709,16,275.7,4,16,,,,,5545,52,+++++,+++,13434,95,5138,67,+++++,+++,11297,95,247ms,2351ms,191ms,84684us,67700us,3791ms,4894us,11251us,11868us,4922us,12265us,12863us

Dbench run during MP3 playback:
32 clients started
.................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................++..........+.+..+..............................................+...................................................+.....+..........+.....+.+..++++.+..+....+++++++.+.++++.+++********************************
Throughput 36.1736 MB/sec (NB=45.217 MB/sec  361.736 MBit/sec)
14.920u 61.240s 1:57.82 64.6%   0+0k 0+0io 911pf+0w

Worst 20 latency times of 19595 measured in this period.
  usec      cause     mask   start line/file      address   end line/file
 40920  spin_lock        1   291/buffer.c        c0141a5c   280/buffer.c
 16812  spin_lock        1   443/vmscan.c        c0133eb5   340/vmscan.c
  8114        BKL        1   533/inode.c         c016e14d   697/sched.c
  7932  spin_lock        1   443/vmscan.c        c0133eb5   406/vmscan.c
  6886        BKL        1  1302/inode.c         c016fad9   697/sched.c
  6660        BKL        0  1302/inode.c         c016fad9  1306/inode.c
  6260  spin_lock        1   343/vmscan.c        c0133f1a   406/vmscan.c
  4273        BKL        1  1302/inode.c         c016fad9   842/inode.c
  2410   reacqBKL        1  1375/sched.c         c0115334   697/sched.c
  2153   reacqBKL        0  1375/sched.c         c0115334  1381/sched.c
  1876        BKL        0   452/exit.c          c011b4d1   697/sched.c
  1609  spin_lock        1   483/dcache.c        c015460a   520/dcache.c
  1536        BKL        0  2754/buffer.c        c01415ea  2757/buffer.c
  1533  spin_lock        0   291/buffer.c        c0141a5c   285/buffer.c
  1311        BKL        0  2754/buffer.c        c01415ea  1380/sched.c
  1308  spin_lock        0   547/sched.c         c0113574   697/sched.c
  1286        BKL        1   533/inode.c         c016e14d   842/inode.c
  1222        BKL        0    30/inode.c         c016d5d1    52/inode.c
  1110        BKL        0  2754/buffer.c        c01415ea  1381/sched.c
   850  spin_lock        1  1376/sched.c         c0115353  1380/sched.c

Adhoc c0141a5c <kupdate+11c/210>
Adhoc c0133eb4 <shrink_cache+2f4/590>
Adhoc c016e14c <reiserfs_get_block+9c/f30>
Adhoc c0133eb4 <shrink_cache+2f4/590>
Adhoc c016fad8 <reiserfs_dirty_inode+58/f0>
Adhoc c016fad8 <reiserfs_dirty_inode+58/f0>
Adhoc c0133f1a <shrink_cache+35a/590>
Adhoc c016fad8 <reiserfs_dirty_inode+58/f0>
Adhoc c0115334c0115334 <END_OF_CODE+c0115333d122a04e/????>
Adhoc c011b4d0 <do_exit+130/360>
Adhoc c015460a <select_parent+3a/100>
Adhoc c01415ea <sync_old_buffers+2a/130>
Adhoc c0141a5c <kupdate+11c/210>
Adhoc c01415ea <sync_old_buffers+2a/130>
Adhoc c0113574 <schedule+34/550>
Adhoc c016e14c <reiserfs_get_block+9c/f30>
Adhoc c016d5d0 <reiserfs_delete_inode+30/110>
Adhoc c01415ea <sync_old_buffers+2a/130>
Adhoc c0115352 <preempt_schedule+52/b0>

SunWave1#free -t
             total       used       free     shared    buffers     cached
Mem:        642824     168212     474612          0      11404      47612
-/+ buffers/cache:     109196     533628
Swap:            0          0          0
Total:      642824     168212     474612

SunWave1#cat /proc/meminfo
        total:    used:    free:  shared: buffers:  cached:
Mem:  658251776 172527616 485724160        0 11677696 49016832
Swap:        0        0        0
MemTotal:       642824 kB
MemFree:        474340 kB
MemShared:           0 kB
Buffers:         11404 kB
Cached:          47868 kB
SwapCached:          0 kB
Active:          37356 kB
Inactive:        21916 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:       642824 kB
LowFree:        474340 kB
SwapTotal:           0 kB
SwapFree:            0 kB

****************************************************************************

2.4.10+
patch-rml-2.4.10-preempt-kernel-1+
patch-rml-2.4.10-preempt-ptrace-and-jobs-fix+
patch-rml-2.4.10-preempt-stats-1+
inode.c-schedule.patch+
journal.c-1-patch+
kupdated-patch+
00_vm-tweaks-2+
softirq-2.4.10-A7

32 clients started
....................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................+............+.....................................+...+................................+.....................+........+..+++..+++...+++++++..+++++.++++.+++********************************
Throughput 38.061 MB/sec (NB=47.5762 MB/sec  380.61 MBit/sec)
14.850u 60.210s 1:52.00 67.0%   0+0k 0+0io 911pf+0w
max load: 1872

Version 1.92a       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
SunWave1      1248M    86  99 16238  21  5950   6   160  99 23612  16 270.9   4
Latency               186ms    1905ms     170ms   81429us   58393us    2869ms
Version 1.92a       ------Sequential Create------ --------Random Create--------
SunWave1            -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16  5351  51 +++++ +++ 13298  94  8670  82 +++++ +++ 11260  95
Latency              7799us   11631us   11649us    4801us   12097us   12804us
1.92a,1.92a,SunWave1,1001546561,1248M,86,99,16238,21,5950,6,160,99,23612,16,270.9,4,16,,,,,5351,51,+++++,+++,13298,94,8670,82,+++++,+++,11260,95,186ms,1905ms,170ms,81429us,58393us,2869ms,7799us,11631us,11649us,4801us,12097us,12804us

Dbench run during MP3 playback:
SunWave1>time ./dbench 32
32 clients started
........................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................+.+.+................++.....+........+......................+...................+++.+...........+........+.......++..+++.+..++.+.++...++++.+.+.........+********************************
Throughput 37.9649 MB/sec (NB=47.4562 MB/sec  379.649 MBit/sec)
14.900u 61.430s 1:52.34 67.9%   0+0k 0+0io 911pf+0w

Worst 20 latency times of 13895 measured in this period.
  usec      cause     mask   start line/file      address   end line/file
 32066  spin_lock        1   291/buffer.c        c0141aac   285/buffer.c
 13058  spin_lock        1   443/vmscan.c        c0133f05   340/vmscan.c
 10875  spin_lock        1   343/vmscan.c        c0133f6a   340/vmscan.c
  9627  spin_lock        1   291/buffer.c        c0141aac   280/buffer.c
  7310  spin_lock        0   343/vmscan.c        c0133f6a   406/vmscan.c
  7003        BKL        1  1302/inode.c         c016fb29  1402/sched.c
  6906        BKL        0  1302/inode.c         c016fb29   719/sched.c
  6723        BKL        0  1302/inode.c         c016fb29  1306/inode.c
  4036        BKL        1  1302/inode.c         c016fb29   842/inode.c
  3978        BKL        1   533/inode.c         c016e19d   842/inode.c
  3636        BKL        1    30/inode.c         c016d621    52/inode.c
  3359   reacqBKL        1  1397/sched.c         c01153f4    52/inode.c
  3005        BKL        1   452/exit.c          c011b591   719/sched.c
  2137   reacqBKL        1  1397/sched.c         c01153f4   929/namei.c
  2064  spin_lock        0   443/vmscan.c        c0133f05   406/vmscan.c
  1776   reacqBKL        1  1397/sched.c         c01153f4   719/sched.c
  1695  spin_lock        0   569/sched.c         c0113634   719/sched.c
  1557  spin_lock        0   483/dcache.c        c015465a   520/dcache.c
  1385        BKL        1  2754/buffer.c        c014163a  2757/buffer.c
  1324  spin_lock        0  1398/sched.c         c0115413   719/sched.c

Adhoc c0141aac <kupdate+11c/210>
Adhoc c0133f04 <shrink_cache+2f4/590>
Adhoc c0133f6a <shrink_cache+35a/590>
Adhoc c0141aac <kupdate+11c/210>
Adhoc c0133f6a <shrink_cache+35a/590>
Adhoc c016fb28 <reiserfs_dirty_inode+58/f0>
Adhoc c016fb28 <reiserfs_dirty_inode+58/f0>
Adhoc c016fb28 <reiserfs_dirty_inode+58/f0>
Adhoc c016fb28 <reiserfs_dirty_inode+58/f0>
Adhoc c016e19c <reiserfs_get_block+9c/f30>
Adhoc c016d620 <reiserfs_delete_inode+30/110>
Adhoc c01153f4 <preempt_schedule+34/b0>
Adhoc c011b590 <do_exit+130/360>
Adhoc c01153f4 <preempt_schedule+34/b0>
Adhoc c0133f04 <shrink_cache+2f4/590>
Adhoc c01153f4 <preempt_schedule+34/b0>
Adhoc c0113634 <schedule+34/550>
Adhoc c015465a <select_parent+3a/100>
Adhoc c014163a <sync_old_buffers+2a/130>
Adhoc c0115412 <preempt_schedule+52/b0>

^ permalink raw reply	[flat|nested] 55+ messages in thread

end of thread, other threads:[~2001-09-27  0:02 UTC | newest]

Thread overview: 55+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <200109200758.f8K7wEG13675@zero.tech9.net>
2001-09-19 22:44 ` [PATCH] Preemption Latency Measurement Tool Robert Love
2001-09-20  1:40   ` Ignacio Vazquez-Abrams
2001-09-20  2:23     ` safemode
2001-09-20  1:13       ` David Lang
2001-09-20  2:57         ` Robert Love
2001-09-20  2:38     ` Robert Love
2001-09-20  6:31   ` Dieter Nützel
2001-09-20  6:31   ` Dieter Nützel
     [not found]   ` <20010920063143.424BD1E41A@Cantor.suse.de>
2001-09-20  6:41     ` Andrea Arcangeli
2001-09-20  7:57       ` Dieter Nützel
2001-09-20 21:09       ` Robert Love
     [not found]       ` <20010920075751.6CA791E6B2@Cantor.suse.de>
2001-09-20  8:21         ` Andrea Arcangeli
2001-09-20 20:13           ` george anzinger
2001-09-20 20:38             ` Randy.Dunlap
2001-09-20 21:10         ` Robert Love
2001-09-20 21:35           ` Dieter Nützel
2001-09-20 22:03             ` Oliver Xymoron
2001-09-20 22:51               ` Dieter Nützel
2001-09-21  3:17               ` Robert Love
2001-09-21 15:48                 ` george anzinger
2001-09-22 21:09                   ` Dieter Nützel
2001-09-22 23:40                     ` safemode
2001-09-22 23:46                     ` Dieter Nützel
2001-09-23  0:15                     ` safemode
     [not found]                     ` <200109222340.BAA37547@blipp.internet5.net>
2001-09-23  0:38                       ` Roger Larsson
2001-09-23  1:42                         ` safemode
2001-09-23  3:02                       ` Robert Love
2001-09-23 16:43                         ` Roger Larsson
2001-09-23  0:42                     ` Dieter Nützel
2001-09-23  2:50                     ` Robert Love
2001-09-23  3:14                       ` george anzinger
2001-09-23  4:06                         ` Dieter Nützel
2001-09-23  2:54                     ` Robert Love
2001-09-27  0:02                       ` [reiserfs-list] " Dieter Nützel
2001-09-23  2:58                     ` Robert Love
2001-09-23  2:44                   ` Robert Love
     [not found]       ` <200109200757.JAA60995@blipp.internet5.net>
2001-09-20 17:37         ` Roger Larsson
2001-09-20 21:29         ` Robert Love
2001-09-20 21:53           ` Dieter Nützel
2001-09-20 20:01   ` Tobias Diedrich
2001-09-20 20:27   ` Robert Love
2001-09-20 22:09     ` [PATCH] Preemption patch 2.4.9-ac12 Robert Love
2001-09-20 22:01   ` [PATCH] Preemption Latency Measurement Tool Robert Love
2001-09-22  3:57   ` Andre Pang
2001-09-22  6:10   ` Robert Love
2001-09-22  7:22     ` Andre Pang
2001-09-23  3:18       ` george anzinger
2001-09-23  3:21       ` Robert Love
2001-09-22 12:56     ` ksoftirqd? (Was: Re: [PATCH] Preemption Latency Measurement Tool) Roger Larsson
2001-09-22 13:14       ` Andrea Arcangeli
2001-09-22 20:51         ` Roger Larsson
2001-09-22 21:33           ` Andrea Arcangeli
2001-09-23  7:05     ` [PATCH] Preemption Latency Measurement Tool Robert Love
2001-09-23 12:03       ` Andre Pang
2001-09-23 18:31       ` Robert Love

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).