On Fri, Jul 18, 2003 at 09:38:10PM +1000, Nick Piggin wrote:
>
>
> Wiktor Wodecki wrote:
>
> >On Fri, Jul 18, 2003 at 08:43:05PM +1000, Con Kolivas wrote:
> >
> >>On Fri, 18 Jul 2003 20:31, Wiktor Wodecki wrote:
> >>
> >>>On Fri, Jul 18, 2003 at 12:18:33PM +0200, Mike Galbraith wrote:
> >>>
> >>>>That _might_ (add salt) be priorities of kernel threads dropping too
> >>>>low.
> >>>>
> >>>>I'm also seeing occasional total stalls under heavy I/O in the order of
> >>>>10-12 seconds (even the disk stops). I have no idea if that's something
> >>>>in mm or the scheduler changes though, as I've yet to do any isolation
> >>>>and/or tinkering. All I know at this point is that I haven't seen it in
> >>>>stock yet.
> >>>>
> >>>I've seen this too while doing a huge nfs transfer from a 2.6 machine to
> >>>a 2.4 machine (sparc32). Thought it'd be something with the nfs changes
> >>>which were recently, might be the scheduler, tho. Ah, and it is fully
> >>>reproducable.
> >>>
> >>Well I didn't want to post this yet because I'm not sure if it's a good
> >>workaround yet but it looks like a reasonable compromise, and since you
> >>have a testcase it will be interesting to see if it addresses it. It's
> >>possible that a task is being requeued every millisecond, and this is a
> >>little smarter. It allows cpu hogs to run for 100ms before being round
> >>robinned, but shorter for interactive tasks. Can you try this O7 which
> >>applies on top of O6.1 please:
> >>
> >>available here:
> >>http://kernel.kolivas.org/2.5
> >>
> >
> >sorry, the problem still persists. Aborting the cp takes less time, tho
> >(about 10 seconds now, before it was about 30 secs). I'm aborting during
> >a big file, FYI.
> >
>
> OK if the IO actually stops then it shouldn't be an IO scheduler or
> request allocation problem, but could you try to capture a sysrq T
> trace for me during the freeze.

okay, here it is. I only paste the output for cp, if you need the whole
thing, tell me please.

Jul 19 12:54:16 kakerlak kernel: cp D C0140F7B 6164 6160 (NOTLB)
Jul 19 12:54:16 kakerlak kernel: c2c6fec8 00200082 d3de680c c0140f7b d3de6800 c72ef000 c0477cc0 d3de681c
Jul 19 12:54:16 kakerlak kernel: d139ad60 c2c6e000 ce8dc3c0 ce8dc3dc 00000000 c01aba07 00000000 00000001
Jul 19 12:54:16 kakerlak kernel: 00000000 00000001 ce8153e4 00000000 c26706d0 c011cdb0 ce8dc3dc ce8dc3dc
Jul 19 12:54:16 kakerlak kernel: Call Trace:
Jul 19 12:54:16 kakerlak kernel: [free_block+203/256] free_block+0xcb/0x100
Jul 19 12:54:16 kakerlak kernel: [nfs_wait_on_request+151/336] nfs_wait_on_request+0x97/0x150
Jul 19 12:54:16 kakerlak kernel: [default_wake_function+0/48] default_wake_function+0x0/0x30
Jul 19 12:54:16 kakerlak kernel: [nfs_wait_on_requests+169/256] nfs_wait_on_requests+0xa9/0x100
Jul 19 12:54:16 kakerlak kernel: [nfs_sync_file+150/192] nfs_sync_file+0x96/0xc0
Jul 19 12:54:16 kakerlak kernel: [nfs_file_flush+88/208] nfs_file_flush+0x58/0xd0
Jul 19 12:54:16 kakerlak kernel: [filp_close+101/128] filp_close+0x65/0x80
Jul 19 12:54:16 kakerlak kernel: [sys_close+97/160] sys_close+0x61/0xa0
Jul 19 12:54:16 kakerlak kernel: [syscall_call+7/11] syscall_call+0x7/0xb
Jul 19 12:54:16 kakerlak kernel:

--
Regards,

Wiktor Wodecki