linux-kernel.vger.kernel.org archive mirror
* [PATCH] livelock in elevator scheduling
@ 2000-11-21  8:38 kumon
  2000-11-21 10:28 ` Jens Axboe
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: kumon @ 2000-11-21  8:38 UTC (permalink / raw)
  To: linux-kernel; +Cc: Dave Jones, Andrea Arcangeli, Jens Axboe, kumon

The current elevator_linus() doesn't obey true elevator
scheduling, and causes I/O livelock under frequent random write
traffic. In such an environment, I/O (read/write) transactions may be
delayed almost infinitely (more than 1 hour).

Problem:
 The current elevator_linus() traverses the I/O request queue from the
tail to the top. When the current request has a smaller sector number
than the request at the top of the queue, it is always placed just after
the top.
 This means that if requests in some sector range are continuously
generated, a request with a larger sector number is always placed
last and has no chance to move to the front, i.e. it is never scheduled.

 This is not hypothetical but actually observed.  Running a random
disk write benchmark can completely suppress other disk I/O for this
reason.


 The following patch fixes this problem. It still doesn't follow
strict elevator scheduling, but it does much better.  Additionally, it
may be better to give reads extra priority over writes to obtain
better response, but this patch doesn't.

diff -ru linux-2.4.0-test11-pre2/drivers/block/elevator.c linux-2.4.0-test11-pre2-test5/drivers/block/elevator.c
--- linux-2.4.0-test11-pre2/drivers/block/elevator.c	Wed Aug 23 14:33:46 2000
+++ linux-2.4.0-test11-pre2-test5/drivers/block/elevator.c	Tue Nov 21 15:32:01 2000
@@ -47,6 +47,11 @@
 			break;
 		tmp->elevator_sequence--;
 	}
+	if (entry == head) {
+		tmp = blkdev_entry_to_request(entry);
+		if (IN_ORDER(req, tmp))
+			entry = real_head->prev;
+	}
 	list_add(&req->queue, entry);
 }
 

To implement complete elevator scheduling, I think it would be better
to prepare an alternate waiting queue.

--
Computer Systems Laboratory, Fujitsu Labs.
kumon@flab.fujitsu.co.jp
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/

* Re: [PATCH] livelock in elevator scheduling
  2000-11-21  8:38 [PATCH] livelock in elevator scheduling kumon
@ 2000-11-21 10:28 ` Jens Axboe
  2000-11-21 11:30 ` kumon
  2000-11-22 10:59 ` kumon
  2 siblings, 0 replies; 16+ messages in thread
From: Jens Axboe @ 2000-11-21 10:28 UTC (permalink / raw)
  To: kumon; +Cc: linux-kernel, Dave Jones, Andrea Arcangeli

On Tue, Nov 21 2000, kumon@flab.fujitsu.co.jp wrote:
> The current elevator_linus() doesn't obey the true elevator
> scheduling, and causes I/O livelock during frequent random write
> traffic. In such an environment, I/O (read/write) transactions may be
> delayed almost infinitely (more than 1 hour).
> 
> Problem:
>  Current elevator_linus() traverses the I/O requesting queue from the
> tail to top. And when the current request has smaller sector number
> than the request on the top of queue, it is always placed just after
> the top.
>  This means, if requests in some sector range are continuously
> generated, a request with larger sector number is always placed
> last and has no chance to go to the front, i.e. it is not scheduled.

Believe it or not, but this is intentional. In that regard, the
function name is a misnomer -- call it i/o scheduler instead :-)
The current settings in test11 cause this behaviour, because the
starting request sequence numbers are a 'bit' too high.

I'd be very interested if you could repeat your test with my
block patch applied. It has, among other things, a more fair (and
faster) insertion.

*.kernel.org/pub/linux/kernel/people/axboe/patches/2.4.0-test11/blk-11.bz2

> [...] Additionally, it may be better to add extra priority to reads
> than writes to obtain better response, but this patch doesn't.

READs do have bigger priority, they start out with lower sequence
numbers than WRITEs do:

	latency = elevator_request_latency(elevator, rw);

With my patch, READ sequence start is now 8192. WRITEs are twice
that.

-- 
* Jens Axboe <axboe@suse.de>
* SuSE Labs

* Re: [PATCH] livelock in elevator scheduling
  2000-11-21  8:38 [PATCH] livelock in elevator scheduling kumon
  2000-11-21 10:28 ` Jens Axboe
@ 2000-11-21 11:30 ` kumon
  2000-11-21 11:36   ` Jens Axboe
  2000-11-21 12:39   ` kumon
  2000-11-22 10:59 ` kumon
  2 siblings, 2 replies; 16+ messages in thread
From: kumon @ 2000-11-21 11:30 UTC (permalink / raw)
  To: Jens Axboe; +Cc: kumon, linux-kernel, Dave Jones, Andrea Arcangeli, kumon

Jens Axboe writes:
 > > Problem:
 > >  Current elevator_linus() traverses the I/O requesting queue from the
 > > tail to top. And when the current request has smaller sector number
 > > than the request on the top of queue, it is always placed just after
 > > the top.
 > >  This means, if requests in some sector range are continuously
 > > generated, a request with larger sector number is always placed
 > > last and has no chance to go to the front, i.e. it is not scheduled.
 > 
 > Believe it or not, but this is intentional. In that regard, the
 > function name is a misnomer -- call it i/o scheduler instead :-)

I can't believe it is intentional.  If it is, the current kernel
suffers from a kind of DoS attack.  Yes, I'm actually a
victim of it.

When running ZD's ServerBench, not only does performance drop, but my
machine blocks execution of all commands, including /bin/ps, /bin/ls, ...,
and they cannot be interrupted with ^C until the benchmark is stopped.
Those commands have to be read from disk, but their requests wait at the
end of the I/O queue and are never executed.

Anyway, I'll try your patch.

--
Computer Systems Laboratory, Fujitsu Labs.
kumon@flab.fujitsu.co.jp

* Re: [PATCH] livelock in elevator scheduling
  2000-11-21 11:30 ` kumon
@ 2000-11-21 11:36   ` Jens Axboe
  2000-12-02  0:22     ` Russell Cattelan
  2000-11-21 12:39   ` kumon
  1 sibling, 1 reply; 16+ messages in thread
From: Jens Axboe @ 2000-11-21 11:36 UTC (permalink / raw)
  To: kumon; +Cc: linux-kernel, Dave Jones, Andrea Arcangeli

On Tue, Nov 21 2000, kumon@flab.fujitsu.co.jp wrote:
>  > Believe it or not, but this is intentional. In that regard, the
>  > function name is a misnomer -- call it i/o scheduler instead :-)
> 
> I never believe it intentional.  If it is true, the current kernel
> will be suffered from a kind of DOS attack.  Yes, actually I'm a
> victim of it.

The problem is caused by the too high sequence numbers in stock
kernel, as I said. Plus, the sequence decrementing doesn't take
request/buffer size into account. So the starvation _is_ limited,
the limit is just too high.

> By Running ZD's ServerBench, not only the performance down, but my
> machine blocks all commands execution including /bin/ps, /bin/ls... ,
> and those are not ^C able unless the benchmark is stopped. Those
> commands are read from disks but the requests are waiting at the end of
> I/O queue, those won't be executed.

If performance is down, then that problem is most likely elsewhere.
I/O limited benchmarking typically thrives on lots of request
latency -- with that comes better throughput for individual threads.

> Anyway, I'll try your patch.

Thanks

-- 
* Jens Axboe <axboe@suse.de>
* SuSE Labs

* Re: [PATCH] livelock in elevator scheduling
  2000-11-21 11:30 ` kumon
  2000-11-21 11:36   ` Jens Axboe
@ 2000-11-21 12:39   ` kumon
  2000-11-21 13:01     ` Jens Axboe
  2000-11-22  6:08     ` kumon
  1 sibling, 2 replies; 16+ messages in thread
From: kumon @ 2000-11-21 12:39 UTC (permalink / raw)
  To: Jens Axboe; +Cc: kumon, linux-kernel, Dave Jones, Andrea Arcangeli, kumon

Jens Axboe writes:
 > On Tue, Nov 21 2000, kumon@flab.fujitsu.co.jp wrote:
 > > I never believe it intentional.  If it is true, the current kernel
 > > will be suffered from a kind of DOS attack.  Yes, actually I'm a
 > > victim of it.
 > 
 > The problem is caused by the too high sequence numbers in stock
 > kernel, as I said. Plus, the sequence decrementing doesn't take
 > request/buffer size into account. So the starvation _is_ limited,
 > the limit is just too high.

Yes, the current limit is 1000000, and if the I/O system can manage
200 req/s, then it will expire after 5000 sec. That is why I said
"infinite (more than 1 hour)".
Why do you give extreme priority to lower sector accesses, which breaks
the idea of elevator scheduling?


 > If performance is down, then that problem is most likely elsewhere.
 > I/O limited benchmarking typically thrives on lots of request
 > latency -- with that comes better throughput for individual threads.

No, the performance drop is caused by this.  The server benchmark has
a standard workload configuration that consists of several kinds of
tasks, such as CPU-intensive, disk seq-read, seq-write, random-read, and
random-write.

The benchmark invokes lots of processes, each corresponding to a client,
and each accessing a different portion of a few large files.  We have
enough memory to hold all dirty data at once (1GB without himem), so
if no I/O blocking occurs, all processes can run simultaneously with
a limited stream of dirty-flush I/O.

If some processes eagerly access relatively low blocks and another
process unfortunately requests a read of a higher block, then that
process is blocked. Eventually this happens to a large portion of the
processes, and performance drops severely.
 During measurements on test10 or test11, the performance fluctuates
heavily and lots of idle time is observed with vmstat. Such instability
is not observed on test1 or test2.

--
Computer Systems Laboratory, Fujitsu Labs.
kumon@flab.fujitsu.co.jp

* Re: [PATCH] livelock in elevator scheduling
  2000-11-21 12:39   ` kumon
@ 2000-11-21 13:01     ` Jens Axboe
  2000-11-22  6:08     ` kumon
  1 sibling, 0 replies; 16+ messages in thread
From: Jens Axboe @ 2000-11-21 13:01 UTC (permalink / raw)
  To: kumon; +Cc: linux-kernel, Dave Jones, Andrea Arcangeli

On Tue, Nov 21 2000, kumon@flab.fujitsu.co.jp wrote:
>  > The problem is caused by the too high sequence numbers in stock
>  > kernel, as I said. Plus, the sequence decrementing doesn't take
>  > request/buffer size into account. So the starvation _is_ limited,
>  > the limit is just too high.
> 
> Yes, current limit is 1000000 and if I/O can manage 200req/s, then it
> will expire 5000 sec after. So, I said "infinite (more than 1hour)".
> Why do you add extreme priority to lower sector accesses, which breaks
> elevator scheduling idea?

Look at how it works in my blk-11 patch. It's not adding extreme
priority to low sector requests, it's always trying to sort sector
wise ascendingly (which of course then tends to put lower sectors
at the front of the queue). blk-11 does it a bit differently though,
the sequence number is in sector size units. And the queue scan
will apply simple aging to requests just sitting there.

>  > If performance is down, then that problem is most likely elsewhere.
>  > I/O limited benchmarking typically thrives on lots of request
>  > latency -- with that comes better throughput for individual threads.
> 
> No, the performance down caused from this point.  Server benchmark has
> a standard configuration workload which consists from several kind of
> task, such as, CPU intensive, disk seq-read, seq-write, random-read,
> random-write.
> 
> The benchmark invokes lots of processes, each corresponds to a client,
> and each accesses different portion of few large files.  We have
> enough memory to hold all dirty data at once (1GB without himem), so
> if no I/O blocking occur, all process can be run simultaneously with
> limited amount of dirty flush I/O stream.

Flushing that much dirty data will always end up blocking waiting
for request slots.

> If some processes eagerly access relatively lower blocks, and other
> process unfortunately requests higher block read, then the process is
> blocked. Eventually this happens to large portion of processes, the
> performance goes extremely down.
>  During the measurement of test10 or test11, the performance is very
> fluctuated and lots of idle time observed by vmstat. Such instability
> is not observed on test1 or test2.

So check why there's much idle time -- the test2 elevator is identical
to the one in test11... Or check where it breaks exactly, what kernel
revision. Comparing test8 and test9 would be interesting.

-- 
* Jens Axboe <axboe@suse.de>
* SuSE Labs

* Re: [PATCH] livelock in elevator scheduling
  2000-11-21 12:39   ` kumon
  2000-11-21 13:01     ` Jens Axboe
@ 2000-11-22  6:08     ` kumon
  1 sibling, 0 replies; 16+ messages in thread
From: kumon @ 2000-11-22  6:08 UTC (permalink / raw)
  To: Jens Axboe; +Cc: kumon, linux-kernel, Dave Jones, Andrea Arcangeli, kumon

Jens Axboe writes:
 > > The benchmark invokes lots of processes, each corresponds to a client,
 > > and each accesses different portion of few large files.  We have
 > > enough memory to hold all dirty data at once (1GB without himem), so
 > > if no I/O blocking occur, all process can be run simultaneously with
 > > limited amount of dirty flush I/O stream.
 > 
 > Flushing that much dirty data will always end up blocking waiting
 > for request slots.

Yes, such benchmarks need a moderately long startup time, like 10 minutes
or more.  During that period, physical read requests are issued and
wait in a queue, resulting in low performance numbers.
 As file data accumulates in memory, CPU usage goes
up. Finally, once all required data is in the buffer, the system can use
100% CPU.

Here is an example of "vmstat 10" output during the test.
It includes the beginning, startup, and full-usage states.

With the current settings, the hard threshold of the dirty cache, beyond
which synchronous flushing is needed, is over 300MB, and this test doesn't
reach that limit.

During the startup period, "cache" steadily increases, but until the READs
disappear, CPU usage stays very low.

If some of the read requests have lower priority than others, those
requests need an (extremely) long time to be served, and the startup
phase never ends in reasonable time.


   procs                   memory    swap          io     system         cpu
 r  b  wswpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
 0  0  0   0 863424   1760  13200   0   0     0     0  101     6   0   0 100

START CLIENTS
 0 16  0   0 843448   2052  25796   0   0   322     0  587   458   1   2  97
 0 16  0   0 827988   2052  40836   0   0   376     0  619   491   1   0  99
 0 16  0   0 813276   2052  55120   0   0   357     0  630   489   1   0  98

DIRTY FLUSH STARTS
 0 16  0   0 802488   2052  65608   0   0   262    34  541   420   1   0  99
	9 more lines
 0 16  0   0 717824   2052 148040   0   0   151    73  572   442   1   0  99
	9 more lines
 0 16  1   0 657216   2052 206980   0   0   119    71  790   602   2   1  98
	9 more lines
 0 16  1   0 652484   2052 211592   0   0   115    72  815   613   2   0  98
	9 more lines
 0 16  1   0 623256   2052 240072   0   0    20   119  432   424   1   0  99
	9 more lines
 0 16  1   0 598600   2052 264088   0   0    16   119  884   719   2   1  97
	9 more lines
 1 15  1   0 592156   2052 270332   0   0    15   119 1609  1185   5   1  94
 0 16  1   0 591540   2052 270916   0   0    15   119 1475  1102   4   1  95
 1 15  1   0 590920   2052 271512   0   0    15   119 1669  1243   5   1  94
 1 15  1   0 590344   2052 272072   0   0    14   122 2091  1500   7   2  92
 0 16  1   0 589752   2052 272644   0   0    14   120 2484  1767   8   2  91
 1 15  1   0 589180   2052 273196   0   0    14   120 2674  1905   8   2  90
 2 14  1   0 588608   2052 273748   0   0    14   122 3059  2142  10   2  88
 0 16  1   0 588044   2052 274288   0   0    14   122 4430  3037  16   3  82
 0 16  1   0 587524   2052 274800   0   0    13   123 5892  3913  21   4  75
 1 14  2   0 587036   2052 275236   0   0    11   121 8525  6062  31   7  62
10  4  2   0 586688   2052 275576   0   0     9   124 12162  8902  47  10  43
14  1  2   0 586528   2052 275708   0   0     3   129 17605  7710  80  18   2

FULL CPU USAGE (No physical read request is issued)
15  0  2   0 586492   2052 275740   0   0     1   131 18843  6652  82  18   0
14  0  2   0 586484   2052 275748   0   0     0   132 19210  6340  82  18   0

CONTINUE TO THE END

--
Computer Systems Laboratory, Fujitsu Labs.
kumon@flab.fujitsu.co.jp

* Re: [PATCH] livelock in elevator scheduling
  2000-11-21  8:38 [PATCH] livelock in elevator scheduling kumon
  2000-11-21 10:28 ` Jens Axboe
  2000-11-21 11:30 ` kumon
@ 2000-11-22 10:59 ` kumon
  2000-11-22 15:50   ` davej
  2 siblings, 1 reply; 16+ messages in thread
From: kumon @ 2000-11-22 10:59 UTC (permalink / raw)
  To: Jens Axboe; +Cc: kumon, linux-kernel, Dave Jones, Andrea Arcangeli, kumon

Jens Axboe writes:
 > I'd be very interested if you could repeat your test with my
 > block patch applied. It has, among other things, a more fair (and
 > faster) insertion.
 > 
 > *.kernel.org/pub/linux/kernel/people/axboe/patches/2.4.0-test11/blk-11.bz2

This patch fixes the "DoS" behavior of the current queuing mechanism.
Even if I set the "pass-over" value to a very large number (1000000), it
runs stably.
Thank you for your patch.

From my understanding, passover is an elevator scheduling option
that prioritizes long-waiting requests to improve interactive
response.  In that test, the passover setting doesn't affect the
benchmark number, which is plausible.

Will the patch be included in the next kernel?


BTW,
The major performance difference between test1 and test2 was caused by
whether the hard_dirty_limit is hit or not.

The current Linux has a lot of difficult-to-set parameters in
/proc/sys.
 Once a system goes beyond some settable limit, the system behavior
changes sharply.  Bdf_prm.nrfract in fs/buffer.c is one of the
difficult parameters.  I hope for a tool to monitor or set these values.

--
Computer Systems Laboratory, Fujitsu Labs.
kumon@flab.fujitsu.co.jp

* Re: [PATCH] livelock in elevator scheduling
  2000-11-22 10:59 ` kumon
@ 2000-11-22 15:50   ` davej
  0 siblings, 0 replies; 16+ messages in thread
From: davej @ 2000-11-22 15:50 UTC (permalink / raw)
  To: kumon; +Cc: Jens Axboe, linux-kernel, Andrea Arcangeli

On Wed, 22 Nov 2000 kumon@flab.fujitsu.co.jp wrote:

> The current Linux has a lot of difficult to set parameters in
> /proc/sys.
>  Once a system goes beyond some settable limits, the system behavior
> changes so sharp.  Bdf_prm.nrfract in fs/buffer.c is one of the
> difficult parameters.  I hope a tool to monitor or set these value.

http://www.powertweak.org
(See CVS version). Helpful(?) advice, profiles, and easy to use UI.
If we missed anything, we take patches, and can always use extra hands :)

regards,

Davej.

-- 
| Dave Jones <davej@suse.de>  http://www.suse.de/~davej
| SuSE Labs


* Re: [PATCH] livelock in elevator scheduling
  2000-11-21 11:36   ` Jens Axboe
@ 2000-12-02  0:22     ` Russell Cattelan
  2000-12-02 15:42       ` Jens Axboe
  0 siblings, 1 reply; 16+ messages in thread
From: Russell Cattelan @ 2000-12-02  0:22 UTC (permalink / raw)
  To: Jens Axboe; +Cc: kumon, linux-kernel, Dave Jones, Andrea Arcangeli

Jens Axboe wrote:

> On Tue, Nov 21 2000, kumon@flab.fujitsu.co.jp wrote:
> >  > Believe it or not, but this is intentional. In that regard, the
> >  > function name is a misnomer -- call it i/o scheduler instead :-)
> >
> > I never believe it intentional.  If it is true, the current kernel
> > will be suffered from a kind of DOS attack.  Yes, actually I'm a
> > victim of it.
>
> The problem is caused by the too high sequence numbers in stock
> kernel, as I said. Plus, the sequence decrementing doesn't take
> request/buffer size into account. So the starvation _is_ limited,
> the limit is just too high.
>
> > By Running ZD's ServerBench, not only the performance down, but my
> > machine blocks all commands execution including /bin/ps, /bin/ls... ,
> > and those are not ^C able unless the benchmark is stopped. Those
> > commands are read from disks but the requests are waiting at the end of
> > I/O queue, those won't be executed.
>
> If performance is down, then that problem is most likely elsewhere.
> I/O limited benchmarking typically thrives on lots of request
> latency -- with that comes better throughput for individual threads.
>
> > Anyway, I'll try your patch.

Well this patch does help with the request starvation problem.
Unfortunately it has introduced another problem.
Running 4 doio programs, on an XFS partition with KIO buf IO turned on.

I did see something about problems with the aic7xxx driver in test11, so
this may not be related to your patch.

I'm going to run without kiobuf to see if the problem still occurs.


XFS (dev: 8/17) mounting with KIOBUFIO
Start mounting filesystem: sd(8,17)
Ending clean XFS mount for filesystem: sd(8,17)
NMI Watchdog detected LOCKUP on CPU1, registers:
CPU:    1
EIP:    0010:[<c0217a9f>]
EFLAGS: 00000082
eax: c01b21ac   ebx: c197b078   ecx: 00000000   edx: 00000012
esi: 00000286   edi: dfff7f70   ebp: dfff7f20   esp: dfff7f14
ds: 0018   es: 0018   ss: 0018
Process swapper (pid: 0, stackpage=dfff7000)
Stack: c190fb20 04000001 00000020 dfff7f40 c010c539 00000012 c197b078
dfff7f70
       00000240 c0331a40 00000012 dfff7f68 c010c73d 00000012 dfff7f70
c190fb20
       c0108960 dfff6000 c0108960 c190fb20 00000001 dfff7fa4 c010a8c8
c0108960
Call Trace: [<c010c539>] [<c010c73d>] [<c0108960>] [<c0108960>] [<c010a8c8>]
[<c0108960>] [<c0108960>]
       [<c0100018>] [<c010898f>] [<c0108a02>] [<c010a9be>]
Code: f3 90 7e f5 e9 1b a7 f9 ff 80 3d e4 e4 2e c0 00 f3 90 7e f5

Entering kdb (current=0xdfff6000, pid 0) on processor 1 due to WatchDog
Interrupt @ 0xc0217a9f
eax = 0xc01b21ac ebx = 0xc197b078 ecx = 0x00000000 edx = 0x00000012
esi = 0x00000286 edi = 0xdfff7f70 esp = 0xdfff7f14 eip = 0xc0217a9f
ebp = 0xdfff7f20 xss = 0x00000018 xcs = 0x00000010 eflags = 0x00000082
xds = 0xc1970018 xes = 0xdfff0018 origeax = 0xc01b21ac &regs = 0xdfff7ee0
[1]kdb> bt
    EBP       EIP         Function(args)
           0x00000000c0217a9f stext_lock+0x43af
                               kernel .text.lock 0xc02136f0 0xc02136f0
0xc02197c0
0xdfff7f20 0x00000000c01b21c3 do_aic7xxx_isr+0x17 (0x12, 0xc197b078,
0xdfff7f70, 0x240, 0xc0331a40)
                               kernel .text 0xc0100000 0xc01b21ac 0xc01b225c
0xdfff7f40 0x00000000c010c539 handle_IRQ_event+0x4d (0x12, 0xdfff7f70,
0xc190fb20, 0xc0108960, 0xdfff6000)
                               kernel .text 0xc0100000 0xc010c4ec 0xc010c568
0xdfff7f68 0x00000000c010c73d do_IRQ+0x99 (0xc0108960, 0x0, 0xdfff6000,
0xdfff6000, 0xc0108960)
                               kernel .text 0xc0100000 0xc010c6a4 0xc010c790
           0x00000000c010a8c8 ret_from_intr
                               kernel .text 0xc0100000 0xc010a8c8 0xc010a8e8
Interrupt registers:
eax = 0x00000000 ebx = 0xc0108960 ecx = 0x00000000 edx = 0xdfff6000
esi = 0xdfff6000 edi = 0xc0108960 esp = 0xdfff7fa4 eip = 0xc010898f
ebp = 0xdfff7fa4 xss = 0x00000018 xcs = 0x00000010 eflags = 0x00000246
xds = 0xc0100018 xes = 0xdfff0018 origeax = 0xffffff12 &regs = 0xdfff7f70
           0x00000000c010898f default_idle+0x2f
                               kernel .text 0xc0100000 0xc0108960 0xc0108998
0xdfff7fb8 0x00000000c0108a02 cpu_idle+0x42
                               kernel .text 0xc0100000 0xc01089c0 0xc0108a18
0xdfff7fc0 0x00000000c02fb5b9 start_secondary+0x25
                               kernel .text.init 0xc02f6000 0xc02fb594
0xc02fb5c0



* Re: [PATCH] livelock in elevator scheduling
  2000-12-02  0:22     ` Russell Cattelan
@ 2000-12-02 15:42       ` Jens Axboe
  2000-12-04 23:25         ` Russell Cattelan
  2000-12-05  1:38         ` Russell Cattelan
  0 siblings, 2 replies; 16+ messages in thread
From: Jens Axboe @ 2000-12-02 15:42 UTC (permalink / raw)
  To: Russell Cattelan; +Cc: kumon, linux-kernel, Dave Jones, Andrea Arcangeli

On Fri, Dec 01 2000, Russell Cattelan wrote:
> > If performance is down, then that problem is most likely elsewhere.
> > I/O limited benchmarking typically thrives on lots of request
> > latency -- with that comes better throughput for individual threads.
> >
> > > Anyway, I'll try your patch.
> 
> Well this patch does help with the request starvation problem.
> Unfortunately it has introduced another problem.
> Running 4 doio programs, on an XFS partition with KIO buf IO turned on.

This looks like a generic aic7xxx problem, and not block related. Since
you are doing such nice traces, what is the other CPU doing? CPU1
seems to be stuck grabbing the io_request_lock (for reasons not entirely
clear from reading the aic7xxx source...)

-- 
* Jens Axboe <axboe@suse.de>
* SuSE Labs

* Re: [PATCH] livelock in elevator scheduling
  2000-12-02 15:42       ` Jens Axboe
@ 2000-12-04 23:25         ` Russell Cattelan
  2000-12-05  1:38         ` Russell Cattelan
  1 sibling, 0 replies; 16+ messages in thread
From: Russell Cattelan @ 2000-12-04 23:25 UTC (permalink / raw)
  To: Jens Axboe; +Cc: kumon, linux-kernel, Dave Jones, Andrea Arcangeli

Jens Axboe wrote:

> On Fri, Dec 01 2000, Russell Cattelan wrote:
> > > If performance is down, then that problem is most likely elsewhere.
> > > I/O limited benchmarking typically thrives on lots of request
> > > latency -- with that comes better throughput for individual threads.
> > >
> > > > Anyway, I'll try your patch.
> >
> > Well this patch does help with the request starvation problem.
> > Unfortunately it has introduced another problem.
> > Running 4 doio programs, on an XFS partition with KIO buf IO turned on.
>
> This looks like a generic aic7xxx problem, and not block related. Since
> you are doing such nice traces, what is the other CPU doing? CPU1
> seems to be stuck grabbing the io_request_lock (for reasons not entirely
> clear from reading the aic7xxx source...)
>

Sorry I haven't been able to get a decent backtrace of the other processor.

According to Keith Owens, the maintainer of kdb, there is a race condition
in kdb and the NMI loop detection stuff that results in not being able to
switch cpus.

I'll keep trying to dig up some more info.

I'm also seeing various other panics in XFS (well pagebuf to be specific)
with this patch.
Nothing seems to be very consistent at this point.

Ok I did manage to switch processors.
Entering kdb (current=0xd7c0a000, pid 645) on processor 1 due to cpu switch

[1]kdb> bt
    EBP       EIP         Function(args)
           0x00000000c0216594 stext_lock+0x2ea4
                               kernel .text.lock 0xc02136f0 0xc02136f0
0xc02197c0
0xd7c0bf98 0x00000000c0155964 ext2_sync_file+0x2c (0xd8257560, 0xd7348220,
0x0, 0xd7c0a000)
                               kernel .text 0xc0100000 0xc0155938
0xc0155a40
0xd7c0bfbc 0x00000000c0136064 sys_fsync+0x54 (0x1, 0xbffff020, 0x0,
0xbffff048, 0x8051738)
                               kernel .text 0xc0100000 0xc0136010
0xc0136088
           0x00000000c010a807 system_call+0x33
                               kernel .text 0xc0100000 0xc010a7d4
0xc010a80c
[1]kdb>


>
> --
> * Jens Axboe <axboe@suse.de>
> * SuSE Labs


* Re: [PATCH] livelock in elevator scheduling
  2000-12-02 15:42       ` Jens Axboe
  2000-12-04 23:25         ` Russell Cattelan
@ 2000-12-05  1:38         ` Russell Cattelan
  2000-12-05 23:01           ` Jens Axboe
  1 sibling, 1 reply; 16+ messages in thread
From: Russell Cattelan @ 2000-12-05  1:38 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel

Jens Axboe wrote:

> On Fri, Dec 01 2000, Russell Cattelan wrote:
> > > If performance is down, then that problem is most likely elsewhere.
> > > I/O limited benchmarking typically thrives on lots of request
> > > latency -- with that comes better throughput for individual threads.
> > >
> > > > Anyway, I'll try your patch.
> >
> > Well this patch does help with the request starvation problem.
> > Unfortunately it has introduced another problem.
> > Running 4 doio programs, on an XFS partition with KIO buf IO turned on.
>
> This looks like a generic aic7xxx problem, and not block related. Since
> you are doing such nice traces, what is the other CPU doing? CPU1
> seems to be stuck grabbing the io_request_lock (for reasons not entirely
> clear from reading the aic7xxx source...)
>

Ok Keith gave me a quick hack to help with the race condition.

Here is the latest set of back traces...
The actual accuracy of these back traces... well?  who knows, but it does
give us something to go on.
It doesn't make much sense to me right now, but I'm guessing the problem is
starting with that do_div error.

I'm going to take a closer look at the scsi_back_merge_fn.
This may have more to do with our/Chait's kiobuf modifications than
anything else.



XFS (dev: 8/20) mounting with KIOBUFIO
Start mounting filesystem: sd(8,20)
Ending clean XFS mount for filesystem: sd(8,20)
kmem_alloc doing a vmalloc 262144 size & PAGE_SIZE 0 rval=0xe0a10000
Unable to handle kernel NULL pointer dereference at virtual address
00000008
 printing eip:
c019f8b5
*pde = 00000000

Entering kdb (current=0xc1910000, pid 5) on processor 1 Panic: Oops
due to panic @ 0xc019f8b5
eax = 0x00000002 ebx = 0x00000001 ecx = 0x00081478 edx = 0x00000000
esi = 0xc1957da0 edi = 0xc1923ac8 esp = 0xc1911e94 eip = 0xc019f8b5
ebp = 0xc1911e9c xss = 0x00000018 xcs = 0x00000010 eflags = 0x00010046
xds = 0x00000018 xes = 0x00000018 origeax = 0xffffffff &regs = 0xc1911e60
[1]kdb> bt
    EBP       EIP         Function(args)
0xc1911e9c 0x00000000c019f8b5 scsi_back_merge_fn_c+0x15 (0xc1923a98,
0xc1957da0, 0xcfb05780, 0x80)
                               kernel .text 0xc0100000 0xc019f8a0
0xc019f98c
0xc1911f2c 0x00000000c016a0df __make_request+0x1af (0xc1923a98, 0x1,
0xcfb05780, 0x0, 0x814)
                               kernel .text 0xc0100000 0xc0169f30
0xc016a8a4
0xc1911f70 0x00000000c016a9c8 generic_make_request+0x124 (0x1, 0xcfb05780,
0x0, 0x0, 0x0)
                               kernel .text 0xc0100000 0xc016a8a4
0xc016aa50
0xc1911fac 0x00000000c016abde ll_rw_block+0x18e (0x1, 0x1, 0xc1911fd0, 0x0)

                               kernel .text 0xc0100000 0xc016aa50
0xc016ac58
0xc1911fd4 0x00000000c0138ed7 flush_dirty_buffers+0x97 (0x0, 0x10f00)
                               kernel .text 0xc0100000 0xc0138e40
0xc0138f24
0xc1911fec 0x00000000c01391ab bdflush+0x8f
                               kernel .text 0xc0100000 0xc013911c
0xc0139260
           0x00000000c0108c9b kernel_thread+0x23
                               kernel .text 0xc0100000 0xc0108c78
0xc0108cb0
[1]kdb> go
Oops: 0000
CPU:    1
EIP:    0010:[<c019f8b5>]
EFLAGS: 00010046
eax: 00000002   ebx: 00000001   ecx: 00081478   edx: 00000000
esi: c1957da0   edi: c1923ac8   ebp: c1911e9c   esp: c1911e94
ds: 0018   es: 0018   ss: 0018
Process kflushd (pid: 5, stackpage=c1911000)
Stack: cfb05780 c1923a98 c1911f2c c016a0df c1923a98 c1957da0 cfb05780
00000080
       00000814 00081478 cfb05780 00000008 00000001 00000200 00000000
c1923ac0
       00000000 0000000e c1910000 c1911efc c010c77e 00000246 00000814
def0e800
Call Trace: [<c016a0df>] [<c010c77e>] [<c010a8c8>] [<c016a9c8>]
[<c016abde>] [<c0138ed7>] [<c01391ab>]
       [<c0108c9b>]
Code: 66 81 7a 08 00 10 0f 47 d8 8b 4a 2c 85 c9 74 19 0f b7 42 08
NMI Watchdog detected LOCKUP on CPU0, registers:
CPU:    0
EIP:    0010:[<c0217a98>]
EFLAGS: 00000086
eax: c01b21ac   ebx: c197b078   ecx: 00000000   edx: 00000012
esi: 00000286   edi: c02f5f94   ebp: c02f5f44   esp: c02f5f38
ds: 0018   es: 0018   ss: 0018
Process swapper (pid: 0, stackpage=c02f5000)
Stack: c190fb20 04000001 00000000 c02f5f64 c010c539 00000012 c197b078
c02f5f94
       00000240 c0331a40 00000012 c02f5f8c c010c73d 00000012 c02f5f94
c190fb20
       c0108960 c02f4000 c0108960 c190fb20 00000000 c02f5fc8 c010a8c8
c0108960
Call Trace: [<c010c539>] [<c010c73d>] [<c0108960>] [<c0108960>]
[<c010a8c8>] [<c0108960>] [<c0108960>]
       [<c0100018>] [<c010898f>] [<c0108a02>] [<c0105000>] [<c01001d0>]
Code: 80 3d 64 47 2e c0 00 f3 90 7e f5 e9 1b a7 f9 ff 80 3d 64 e3

Entering kdb (current=0xc02f4000, pid 0) on processor 0 due to WatchDog
Interrupt @ 0xc0217a98
eax = 0xc01b21ac ebx = 0xc197b078 ecx = 0x00000000 edx = 0x00000012
esi = 0x00000286 edi = 0xc02f5f94 esp = 0xc02f5f38 eip = 0xc0217a98
ebp = 0xc02f5f44 xss = 0x00000018 xcs = 0x00000010 eflags = 0x00000086
xds = 0x00000018 xes = 0xc02f0018 origeax = 0xc01b21ac &regs = 0xc02f5f04
[0]kdb> bt
    EBP       EIP         Function(args)
           0x00000000c0217a98 stext_lock+0x43a8
                               kernel .text.lock 0xc02136f0 0xc02136f0
0xc02197c0
0xc02f5f44 0x00000000c01b21c3 do_aic7xxx_isr+0x17 (0x12, 0xc197b078,
0xc02f5f94, 0x240, 0xc0331a40)
                               kernel .text 0xc0100000 0xc01b21ac
0xc01b225c
0xc02f5f64 0x00000000c010c539 handle_IRQ_event+0x4d (0x12, 0xc02f5f94,
0xc190fb20, 0xc0108960, 0xc02f4000)
                               kernel .text 0xc0100000 0xc010c4ec
0xc010c568
0xc02f5f8c 0x00000000c010c73d do_IRQ+0x99 (0xc0108960, 0x0, 0xc02f4000,
0xc02f4000, 0xc0108960)
                               kernel .text 0xc0100000 0xc010c6a4
0xc010c790
           0x00000000c010a8c8 ret_from_intr
                               kernel .text 0xc0100000 0xc010a8c8
0xc010a8e8
Interrupt registers:
eax = 0x00000000 ebx = 0xc0108960 ecx = 0x00000000 edx = 0xc02f4000
esi = 0xc02f4000 edi = 0xc0108960 esp = 0xc02f5fc8 eip = 0xc010898f
ebp = 0xc02f5fc8 xss = 0x00000018 xcs = 0x00000010 eflags = 0x00000246
xds = 0xc0100018 xes = 0xc02f0018 origeax = 0xffffff12 &regs = 0xc02f5f94
           0x00000000c010898f default_idle+0x2f
                               kernel .text 0xc0100000 0xc0108960
0xc0108998
0xc02f5fdc 0x00000000c0108a02 cpu_idle+0x42
                               kernel .text 0xc0100000 0xc01089c0
0xc0108a18
[0]kdb> cpu
Currently on cpu 0
Available cpus: 0, 1
[0]kdb> cpu 1

Entering kdb (current=0xc1910000, pid 5) on processor 1 due to cpu switch
[1]kdb> bt
    EBP       EIP         Function(args)
0xc1911c88 0x00000000c010c387 __global_cli+0xb7
                               kernel .text 0xc0100000 0xc010c2d0
0xc010c424
0xc1911c9c 0x00000000c01793a7 rs_timer+0x37 (0x0)
                               kernel .text 0xc0100000 0xc0179370
0xc017946c
0xc1911cc4 0x00000000c01231b5 timer_bh+0x269 (0xc034de40, 0x20, 0x0)
                               kernel .text 0xc0100000 0xc0122f4c
0xc0123210
0xc1911cd8 0x00000000c0120248 bh_action+0x50 (0x0, 0x3, 0xc033a660)
                               kernel .text 0xc0100000 0xc01201f8
0xc01202a8
0xc1911cf0 0x00000000c012011b tasklet_hi_action+0x4f (0xc033a660, 0x260,
0xc0331a60)
                               kernel .text 0xc0100000 0xc01200cc
0xc0120154
0xc1911d10 0x00000000c011ffad do_softirq+0x5d (0xc1910000, 0xc02dca60)
                               kernel .text 0xc0100000 0xc011ff50
0xc011ffe0
0xc1911d2c 0x00000000c010c77e do_IRQ+0xda (0xc1910000, 0x0, 0x0,
0xc02dca60, 0xc1910000)
                               kernel .text 0xc0100000 0xc010c6a4
0xc010c790
           0x00000000c010a8c8 ret_from_intr
                               kernel .text 0xc0100000 0xc010a8c8
0xc010a8e8
Interrupt registers:
eax = 0xc1910648 ebx = 0xc1910000 ecx = 0x00000000 edx = 0x00000000
esi = 0xc02dca60 edi = 0xc1910000 esp = 0xc1911d68 eip = 0xc011512e
ebp = 0xc1911d70 xss = 0x00000018 xcs = 0x00000010 eflags = 0x00000246
xds = 0xc0330018 xes = 0xc0330018 origeax = 0xffffff13 &regs = 0xc1911d34
[1]more>
           0x00000000c011512e exit_sighand+0x66 (0xc1910000)
                               kernel .text 0xc0100000 0xc01150c8
0xc0115134
0xc1911d88 0x00000000c011eef5 do_exit+0x219 (0xb, 0x0, 0x0, 0xc0114798,
0xc1911e50)
                               kernel .text 0xc0100000 0xc011ecdc
0xc011ef50
0xc1911da0 0x00000000c010aef0 do_divide_error (0xc1911e60, 0x0, 0x1,
0x81478, 0x0)
                               kernel .text 0xc0100000 0xc010aef0
0xc010af90
           0x00000000c010a938 error_code+0x34
                               kernel .text 0xc0100000 0xc010a904
0xc010a940
Interrupt registers:
eax = 0x00000002 ebx = 0x00000001 ecx = 0x00081478 edx = 0x00000000
esi = 0xc1957da0 edi = 0xc1923ac8 esp = 0xc1911e94 eip = 0xc019f8b5
ebp = 0xc1911e9c xss = 0x00000018 xcs = 0x00000010 eflags = 0x00010046
xds = 0x00000018 xes = 0x00000018 origeax = 0xffffffff &regs = 0xc1911e60
           0x00000000c019f8b5 scsi_back_merge_fn_c+0x15 (0xc1923a98,
0xc1957da0, 0xcfb05780, 0x80)
                               kernel .text 0xc0100000 0xc019f8a0
0xc019f98c
0xc1911f2c 0x00000000c016a0df __make_request+0x1af (0xc1923a98, 0x1,
0xcfb05780, 0x0, 0x814)
                               kernel .text 0xc0100000 0xc0169f30
0xc016a8a4
0xc1911f70 0x00000000c016a9c8 generic_make_request+0x124 (0x1, 0xcfb05780,
0x0, 0x0, 0x0)
                               kernel .text 0xc0100000 0xc016a8a4
0xc016aa50
0xc1911fac 0x00000000c016abde ll_rw_block+0x18e (0x1, 0x1, 0xc1911fd0, 0x0)

                               kernel .text 0xc0100000 0xc016aa50
0xc016ac58
0xc1911fd4 0x00000000c0138ed7 flush_dirty_buffers+0x97 (0x0, 0x10f00)
                               kernel .text 0xc0100000 0xc0138e40
0xc0138f24
[1]more>
0xc1911fec 0x00000000c01391ab bdflush+0x8f
                               kernel .text 0xc0100000 0xc013911c
0xc0139260
           0x00000000c0108c9b kernel_thread+0x23
                               kernel .text 0xc0100000 0xc0108c78
0xc0108cb0
[1]kdb>

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] livelock in elevator scheduling
  2000-12-05  1:38         ` Russell Cattelan
@ 2000-12-05 23:01           ` Jens Axboe
  2000-12-06  0:53             ` Russell Cattelan
  0 siblings, 1 reply; 16+ messages in thread
From: Jens Axboe @ 2000-12-05 23:01 UTC (permalink / raw)
  To: Russell Cattelan; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1360 bytes --]

On Mon, Dec 04 2000, Russell Cattelan wrote:
> I'm going to take a closer look at the scsi_back_merge_fn.
> This may have more to do with our/Chait's kiobuf modifications than
> anything else.
> 
> 
> 
> XFS (dev: 8/20) mounting with KIOBUFIO
> Start mounting filesystem: sd(8,20)
> Ending clean XFS mount for filesystem: sd(8,20)
> kmem_alloc doing a vmalloc 262144 size & PAGE_SIZE 0 rval=0xe0a10000
> Unable to handle kernel NULL pointer dereference at virtual address
> 00000008
>  printing eip:
> c019f8b5
> *pde = 00000000
> 
> Entering kdb (current=0xc1910000, pid 5) on processor 1 Panic: Oops
> due to panic @ 0xc019f8b5
> eax = 0x00000002 ebx = 0x00000001 ecx = 0x00081478 edx = 0x00000000
> esi = 0xc1957da0 edi = 0xc1923ac8 esp = 0xc1911e94 eip = 0xc019f8b5
> ebp = 0xc1911e9c xss = 0x00000018 xcs = 0x00000010 eflags = 0x00010046
> xds = 0x00000018 xes = 0x00000018 origeax = 0xffffffff &regs = 0xc1911e60
> [1]kdb> bt
>     EBP       EIP         Function(args)
> 0xc1911e9c 0x00000000c019f8b5 scsi_back_merge_fn_c+0x15 (0xc1923a98,
> 0xc1957da0, 0xcfb05780, 0x80)
>                                kernel .text 0xc0100000 0xc019f8a0

Ah, I see what it is now. The elevator is attempting to merge a buffer
head into a kio based request, poof. The attached diff should take
care of that in your tree.
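
The failure mode described above can be sketched in isolation. This is
only an illustration under stated assumptions, not the code from either
tree: the struct names (request_stub, bh_stub, kiobuf_stub) and the
function find_back_merge() are hypothetical stand-ins, and the check is
applied to each queued request, which is an assumption about the intent
of the attached diff. A kio-based request carries no buffer-head chain,
so a merge path that reads its tail buffer head would dereference NULL,
which is consistent with the fault address 00000008 in the oops.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct kiobuf_stub { int unused; };
struct bh_stub { unsigned long sector; unsigned long nr_sectors; };

struct request_stub {
    struct request_stub *prev;   /* next entry toward the queue head */
    unsigned long sector;        /* first sector covered by the request */
    unsigned long nr_sectors;    /* length of the request in sectors */
    struct kiobuf_stub *kiobuf;  /* non-NULL: kio-based request */
    struct bh_stub *bhtail;      /* NULL for kio-based requests */
};

/*
 * Walk the queue from the tail toward the head and return the first
 * request that the incoming buffer head can be appended to, or NULL.
 * kio-based requests are skipped before any buffer-head field is read.
 */
struct request_stub *find_back_merge(struct request_stub *tail,
                                     const struct bh_stub *bh)
{
    struct request_stub *rq;

    for (rq = tail; rq != NULL; rq = rq->prev) {
        if (rq->kiobuf)          /* no bh chain to merge into */
            continue;
        if (rq->sector + rq->nr_sectors == bh->sector)
            return rq;           /* bh starts where rq ends */
    }
    return NULL;
}
```

With a kio-based request at the tail that ends exactly where the new
buffer head begins, the scan passes over it and either finds a
bh-based request further up or reports that no merge is possible.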

-- 
* Jens Axboe <axboe@suse.de>
* SuSE Labs

[-- Attachment #2: xfs-elv-1 --]
[-- Type: text/plain, Size: 549 bytes --]

--- drivers/block/elevator.c~	Tue Dec  5 23:59:01 2000
+++ drivers/block/elevator.c	Tue Dec  5 23:59:41 2000
@@ -39,6 +39,9 @@
 	while ((entry = entry->prev) != head) {
 		struct request *__rq = blkdev_entry_to_request(entry);
 
+		if (req->kiobuf)
+			continue;
+
 		/*
 		 * simply "aging" of requests in queue
 		 */
@@ -105,6 +108,8 @@
 	while ((entry = entry->prev) != head) {
 		struct request *__rq = blkdev_entry_to_request(entry);
 
+		if (req->kiobuf)
+			continue;
 		if (__rq->cmd != rw)
 			continue;
 		if (__rq->rq_dev != bh->b_rdev)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] livelock in elevator scheduling
  2000-12-05 23:01           ` Jens Axboe
@ 2000-12-06  0:53             ` Russell Cattelan
  0 siblings, 0 replies; 16+ messages in thread
From: Russell Cattelan @ 2000-12-06  0:53 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel

Jens Axboe wrote:

> On Mon, Dec 04 2000, Russell Cattelan wrote:
> > I'm going to take a closer look at the scsi_back_merge_fn.
> > This may have more to do with our/Chait's kiobuf modifications than
> > anything else.
> >
> >
> >
> > XFS (dev: 8/20) mounting with KIOBUFIO
> > Start mounting filesystem: sd(8,20)
> > Ending clean XFS mount for filesystem: sd(8,20)
> > kmem_alloc doing a vmalloc 262144 size & PAGE_SIZE 0 rval=0xe0a10000
> > Unable to handle kernel NULL pointer dereference at virtual address
> > 00000008
> >  printing eip:
> > c019f8b5
> > *pde = 00000000
> >
> > Entering kdb (current=0xc1910000, pid 5) on processor 1 Panic: Oops
> > due to panic @ 0xc019f8b5
> > eax = 0x00000002 ebx = 0x00000001 ecx = 0x00081478 edx = 0x00000000
> > esi = 0xc1957da0 edi = 0xc1923ac8 esp = 0xc1911e94 eip = 0xc019f8b5
> > ebp = 0xc1911e9c xss = 0x00000018 xcs = 0x00000010 eflags = 0x00010046
> > xds = 0x00000018 xes = 0x00000018 origeax = 0xffffffff &regs = 0xc1911e60
> > [1]kdb> bt
> >     EBP       EIP         Function(args)
> > 0xc1911e9c 0x00000000c019f8b5 scsi_back_merge_fn_c+0x15 (0xc1923a98,
> > 0xc1957da0, 0xcfb05780, 0x80)
> >                                kernel .text 0xc0100000 0xc019f8a0
>
> Ah, I see what it is now. The elevator is attempting to merge a buffer
> head into a kio based request, poof. The attached diff should take
> care of that in your tree.

Hmm.. Yup... those are actually the mods made for kio in our base XFS tree.
I wonder why the patch dropped them?
I should have caught that.

Thanks.
I'll let you know how things go.


>
>
> --
> * Jens Axboe <axboe@suse.de>
> * SuSE Labs


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] livelock in elevator scheduling
       [not found] <200011210828.RAA27311@asami.proc.flab.fujitsu.co.jp>
@ 2000-11-21 15:12 ` Andrea Arcangeli
  0 siblings, 0 replies; 16+ messages in thread
From: Andrea Arcangeli @ 2000-11-21 15:12 UTC (permalink / raw)
  To: kumon; +Cc: linux-kernel, Dave Jones, Jens Axboe

On Tue, Nov 21, 2000 at 05:28:40PM +0900, kumon@flab.fujitsu.co.jp wrote:
> @@ -47,6 +47,11 @@
>  			break;
>  		tmp->elevator_sequence--;
>  	}
> +	if (entry == head) {
> +		tmp = blkdev_entry_to_request(entry);
> +		if (IN_ORDER(req, tmp))
> +			entry = real_head->prev;
> +	}
>  	list_add(&req->queue, entry);
>  }

This patch is buggy. With SCSI, head doesn't point to a request, so it
doesn't make sense to compare against it.

> To implement a complete elevator scheduling, preparing an alternate

Complete elevator scheduling is _already_ implemented, but it's entirely
disabled. You should always enable it before running a 2.4.x kernel. To do
that, use elvtune or apply this patch:

--- 2.4.0-test11-pre6/include/linux/elevator.h.~1~	Wed Jul 19 06:43:10 2000
+++ 2.4.0-test11-pre6/include/linux/elevator.h	Tue Nov 21 15:57:51 2000
@@ -100,8 +100,8 @@
 ((elevator_t) {							\
 	0,				/* not used */		\
 								\
-	1000000,				/* read passovers */	\
-	2000000,				/* write passovers */	\
+	500,				/* read passovers */	\
+	1000,				/* write passovers */	\
 	0,				/* max_bomb_segments */	\
 								\
 	0,				/* not used */		\


The "DoS" attack is the bug that has been fixed by implementing the new
elevator with proper scheduling.
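
As a rough illustration of why the passover limits matter, here is a
simplified model, not the kernel code: an array stands in for the
request list, and the names struct rq, struct queue and enqueue() are
hypothetical. Each queued request carries a budget of how many newer
requests may be inserted ahead of it. Once the budget reaches zero,
nothing more may pass it, so a far-away sector cannot be postponed
forever. With a budget in the millions the limit is effectively never
reached, which matches the "disabled" behaviour described above.

```c
#include <assert.h>
#include <string.h>

#define QMAX 64

struct rq {
    unsigned long sector;   /* target sector of the request */
    int budget;             /* remaining passovers allowed */
};

struct queue {
    struct rq slot[QMAX];   /* slot[0] is served first */
    int len;
};

/*
 * Insert a request in ascending sector order, scanning from the tail.
 * The scan stops at a request that is already in order, or at one
 * whose passover budget is exhausted. Returns the insertion index,
 * or -1 if the queue is full.
 */
int enqueue(struct queue *q, unsigned long sector, int budget)
{
    int pos = q->len;                   /* default: append at the tail */

    if (q->len >= QMAX)
        return -1;

    while (pos > 0) {
        struct rq *tmp = &q->slot[pos - 1];

        if (tmp->sector <= sector)      /* already in order */
            break;
        if (tmp->budget == 0)           /* may not be passed again */
            break;
        tmp->budget--;                  /* one more request passes it */
        pos--;
    }

    /* Shift later entries down by one and store the new request. */
    memmove(&q->slot[pos + 1], &q->slot[pos],
            (size_t)(q->len - pos) * sizeof(struct rq));
    q->slot[pos].sector = sector;
    q->slot[pos].budget = budget;
    q->len++;
    return pos;
}
```

For example, with a budget of 2, a request for sector 1000 can be
passed by requests for sectors 10 and 20, but a later request for
sector 30 is queued behind it.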

Andrea

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2000-12-06  1:24 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
2000-11-21  8:38 [PATCH] livelock in elevator scheduling kumon
2000-11-21 10:28 ` Jens Axboe
2000-11-21 11:30 ` kumon
2000-11-21 11:36   ` Jens Axboe
2000-12-02  0:22     ` Russell Cattelan
2000-12-02 15:42       ` Jens Axboe
2000-12-04 23:25         ` Russell Cattelan
2000-12-05  1:38         ` Russell Cattelan
2000-12-05 23:01           ` Jens Axboe
2000-12-06  0:53             ` Russell Cattelan
2000-11-21 12:39   ` kumon
2000-11-21 13:01     ` Jens Axboe
2000-11-22  6:08     ` kumon
2000-11-22 10:59 ` kumon
2000-11-22 15:50   ` davej
     [not found] <200011210828.RAA27311@asami.proc.flab.fujitsu.co.jp>
2000-11-21 15:12 ` Andrea Arcangeli
