linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* INFO: task mount:11202 blocked for more than 120 seconds
@ 2008-03-07 20:32 Christian Kujau
  2008-03-07 22:40 ` David Chinner
  0 siblings, 1 reply; 34+ messages in thread
From: Christian Kujau @ 2008-03-07 20:32 UTC (permalink / raw)
  To: LKML; +Cc: xfs

Hi,

after upgrading from 2.6.24.1 to 2.6.25-rc3, I came across[0]. This 
warning seems to be gone now. With 2.6.25-rc4 (and the fix from [1])
the box was running fine for 20 hours or so (doing its usual jobs plus 
a "make randconfig && make" loop).

After this, I noticed that /bin/sync would not exit anymore and
remains stuck in D state. Looking around I noticed that the rsync
backup jobs (rsync'ing to an xfs partition) from earlier this
morning did not exit either and hung in D state. With sync hung, the 
following messages started to appear:

[75377.756985] INFO: task sync:2697 blocked for more than 120 seconds.
[75377.757579] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[75377.758211] sync          D c013835c     0  2697  16457
[75377.758216]        f59506c0 00000082 f4c34000 c013835c fffeffff f6c1bcb0 f5dd0000 f4c34000 
[75377.758223]        c04405d7 f53f7e98 f6c1bcb4 f6c1bcd0 00000000 f6c1bcb0 00000000 f7ca1090 
[75377.758230]        f4c34000 c044070a f6c1bcd0 f6c1bcd0 f5dd0000 00000001 f6c1bcb0 c044074b 
[75377.758237] Call Trace:
[75377.758253]  [<c013835c>] trace_hardirqs_on+0x9c/0x110
[75377.758269]  [<c04405d7>] rwsem_down_failed_common+0x67/0x150
[75377.758279]  [<c044070a>] rwsem_down_read_failed+0x1a/0x24
[75377.758286]  [<c044074b>] call_rwsem_down_read_failed+0x7/0xc
[75377.758291]  [<c012fd7c>] down_read_nested+0x4c/0x60
[75377.758295]  [<c027a64b>] xfs_ilock+0x5b/0xb0
[75377.758301]  [<c027a64b>] xfs_ilock+0x5b/0xb0
[75377.758306]  [<c029693d>] xfs_sync_inodes+0x3dd/0x6b0
[75377.758314]  [<c0440b14>] _spin_unlock+0x14/0x20
[75377.758325]  [<c0296d9b>] xfs_syncsub+0x18b/0x300
[75377.758330]  [<c0440b14>] _spin_unlock+0x14/0x20
[75377.758335]  [<c02a7c2b>] xfs_fs_sync_super+0x2b/0xd0
[75377.758342]  [<c016a124>] sync_filesystems+0xa4/0x100
[75377.758351]  [<c043fdd8>] down_read+0x38/0x50
[75377.758356]  [<c016a13f>] sync_filesystems+0xbf/0x100
[75377.758361]  [<c01872b3>] do_sync+0x33/0x70
[75377.758366]  [<c0102ed7>] restore_nocheck+0x12/0x15
[75377.758371]  [<c01872fa>] sys_sync+0xa/0x10
[75377.758375]  [<c0102dee>] sysenter_past_esp+0x5f/0xa5
[75377.758402]  =======================
[75377.758405] 3 locks held by sync/2697:
[75377.758407]  #0:  (mutex){--..}, at: [<c016a091>] sync_filesystems+0x11/0x100
[75377.758414]  #1:  (&type->s_umount_key#22){----}, at: [<c016a124>] sync_filesystems+0xa4/0x100
[75377.758422]  #2:  (&(&ip->i_iolock)->mr_lock){----}, at: [<c027a64b>] xfs_ilock+0x5b/0xb0


The box is still up & running, although the load is increasing slightly. 
I've gathered some details here: http://nerdbynature.de/bits/2.6.25-rc4/

I've searched the archives for this error, but the only thing was
   * http://lkml.org/lkml/2008/2/12/44
     [BUG] 2.6.25-rc1-git1 softlockup while bootup on powerpc
     ...however, I don't get "CPU stuck" messages
   * http://lkml.org/lkml/2008/1/29/370
     Re: system hang on latest git
     ...but calltrace looks a lot different.

Since both mailings are not so current, I'd like to got back to -rc3 and 
try to reproduce this one.

Do you have any idea what's going on here?

Thanks,
Christian.

[0] http://lkml.org/lkml/2008/3/2/171
[1] http://lkml.org/lkml/2008/3/4/634
-- 
BOFH excuse #158:

Defunct processes

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: INFO: task mount:11202 blocked for more than 120 seconds
  2008-03-07 20:32 INFO: task mount:11202 blocked for more than 120 seconds Christian Kujau
@ 2008-03-07 22:40 ` David Chinner
  2008-03-07 23:46   ` Christian Kujau
  0 siblings, 1 reply; 34+ messages in thread
From: David Chinner @ 2008-03-07 22:40 UTC (permalink / raw)
  To: Christian Kujau; +Cc: LKML, xfs

On Fri, Mar 07, 2008 at 09:32:57PM +0100, Christian Kujau wrote:
> Hi,
> 
> after upgrading from 2.6.24.1 to 2.6.25-rc3, I came across[0]. This 
> warning seems to be gone now. With 2.6.25-rc4 (and the fix from [1])
> the box was running fine for 20 hours or so (doing its usual jobs plus 
> a "make randconfig && make" loop).
> 
> After this, I noticed that /bin/sync would not exit anymore and
> remains stuck in D state. Looking around I noticed that the rsync
> backup jobs (rsync'ing to an xfs partition) from earlier this
> morning did not exit either and hung in D state. With sync hung, the 
> following messages started to appear:
> 
> [75377.756985] INFO: task sync:2697 blocked for more than 120 seconds.
> [75377.757579] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
> this message.
> [75377.758211] sync          D c013835c     0  2697  16457
> [75377.758216]        f59506c0 00000082 f4c34000 c013835c fffeffff f6c1bcb0 
> f5dd0000 f4c34000 [75377.758223]        c04405d7 f53f7e98 f6c1bcb4 f6c1bcd0 
> 00000000 f6c1bcb0 00000000 f7ca1090 [75377.758230]        f4c34000 c044070a 
> f6c1bcd0 f6c1bcd0 f5dd0000 00000001 f6c1bcb0 c044074b [75377.758237] Call 
> Trace:
> [75377.758253]  [<c013835c>] trace_hardirqs_on+0x9c/0x110
> [75377.758269]  [<c04405d7>] rwsem_down_failed_common+0x67/0x150
> [75377.758279]  [<c044070a>] rwsem_down_read_failed+0x1a/0x24
> [75377.758286]  [<c044074b>] call_rwsem_down_read_failed+0x7/0xc
> [75377.758291]  [<c012fd7c>] down_read_nested+0x4c/0x60
> [75377.758295]  [<c027a64b>] xfs_ilock+0x5b/0xb0
> [75377.758301]  [<c027a64b>] xfs_ilock+0x5b/0xb0
> [75377.758306]  [<c029693d>] xfs_sync_inodes+0x3dd/0x6b0
> [75377.758314]  [<c0440b14>] _spin_unlock+0x14/0x20
> [75377.758325]  [<c0296d9b>] xfs_syncsub+0x18b/0x300
> [75377.758330]  [<c0440b14>] _spin_unlock+0x14/0x20
> [75377.758335]  [<c02a7c2b>] xfs_fs_sync_super+0x2b/0xd0
> [75377.758342]  [<c016a124>] sync_filesystems+0xa4/0x100
> [75377.758351]  [<c043fdd8>] down_read+0x38/0x50
> [75377.758356]  [<c016a13f>] sync_filesystems+0xbf/0x100
> [75377.758361]  [<c01872b3>] do_sync+0x33/0x70
> [75377.758366]  [<c0102ed7>] restore_nocheck+0x12/0x15
> [75377.758371]  [<c01872fa>] sys_sync+0xa/0x10
> [75377.758375]  [<c0102dee>] sysenter_past_esp+0x5f/0xa5
> [75377.758402]  =======================
> [75377.758405] 3 locks held by sync/2697:
> [75377.758407]  #0:  (mutex){--..}, at: [<c016a091>] 
> sync_filesystems+0x11/0x100
> [75377.758414]  #1:  (&type->s_umount_key#22){----}, at: [<c016a124>] 
> sync_filesystems+0xa4/0x100
> [75377.758422]  #2:  (&(&ip->i_iolock)->mr_lock){----}, at: [<c027a64b>] 
> xfs_ilock+0x5b/0xb0

Well, if that is hung there, something else must be holding on to
the iolock it's waiting on. What are the other D state processes in the
machine?

Also, the iolock can be held across I/O so it's possible you've lost an I/O.
Any I/O errors in the syslog?

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: INFO: task mount:11202 blocked for more than 120 seconds
  2008-03-07 22:40 ` David Chinner
@ 2008-03-07 23:46   ` Christian Kujau
  2008-03-08  1:54     ` Christian Kujau
  2008-03-09 21:34     ` INFO: task mount:11202 blocked for more than 120 seconds David Chinner
  0 siblings, 2 replies; 34+ messages in thread
From: Christian Kujau @ 2008-03-07 23:46 UTC (permalink / raw)
  To: David Chinner; +Cc: LKML, xfs

On Sat, 8 Mar 2008, David Chinner wrote:
> Well, if that is hung there, something else must be holding on to
> the iolock it's waiting on. What are the other D state processes in the
> machine?

I have 7 processes in D state so far:

$ ps auxww [....]
root      9844  0.0  0.0      0     0 ?   D    Mar06   0:22 [pdflush]
root      2697  0.0  0.0   4712   460 ?   D    Mar07   0:00 sync
root      8342  0.0  0.0   1780   440 ?   D    Mar07   0:01 /bin/rm -rf /data/md1/stuff
root     12494  0.0  0.0  11124  1228 ?   D    Mar07   0:14 /usr/bin/rsync 
root     15008  0.0  0.0   4712   460 ?   D    Mar07   0:00 sync
root     11202  0.0  0.0   5012   764 ?   D    Mar07   0:00 mount -o remount,ro /data/md1
root     15936  0.0  0.0   4712   460 ?   D    Mar07   0:00 sync

At one point I did a sysrq-D and put the results in:
http://nerdbynature.de/bits/2.6.25-rc4/hung_task/kern.log.gz
(grep for "SysRq : Show Locks Held" and "SysRq : Show Blocked State")

> Also, the iolock can be held across I/O so it's possible you've lost an I/O.
> Any I/O errors in the syslog?

No, no I/O errors at all. See the kern.log above, I could even do dd(1) 
from the md1 (dm-crypt on raid1), no errors either.

thanks,
Christian.
-- 
BOFH excuse #233:

TCP/IP UDP alarm threshold is set too low.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: INFO: task mount:11202 blocked for more than 120 seconds
  2008-03-07 23:46   ` Christian Kujau
@ 2008-03-08  1:54     ` Christian Kujau
  2008-03-09  6:15       ` 2.6.25-rc hangs (was: INFO: task mount:11202 blocked for more than 120 seconds) Christian Kujau
  2008-03-09 21:34     ` INFO: task mount:11202 blocked for more than 120 seconds David Chinner
  1 sibling, 1 reply; 34+ messages in thread
From: Christian Kujau @ 2008-03-08  1:54 UTC (permalink / raw)
  To: David Chinner; +Cc: LKML, xfs

On Sat, 8 Mar 2008, Christian Kujau wrote:
> I have 7 processes in D state so far:

FWIW, it's 100% reproducible with 2.6.25-rc3 too...sigh :-\
So, the last working kernel for me is 2.6.24.1 - that's a lot of bisecting 
and I fear that compile errors will invalidate the bisecting results again 
or make it impossible at all....I'll try anyway....tomorrow...

C.
-- 
BOFH excuse #285:

Telecommunications is upgrading.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* 2.6.25-rc hangs (was: INFO: task mount:11202 blocked for more than 120 seconds)
  2008-03-08  1:54     ` Christian Kujau
@ 2008-03-09  6:15       ` Christian Kujau
  2008-03-09 16:44         ` 2.6.25-rc hangs Eric Sandeen
  0 siblings, 1 reply; 34+ messages in thread
From: Christian Kujau @ 2008-03-09  6:15 UTC (permalink / raw)
  To: David Chinner; +Cc: LKML, xfs

On Sat, 8 Mar 2008, Christian Kujau wrote:
> FWIW, it's 100% reproducible with 2.6.25-rc3 too...sigh :-\
> So, the last working kernel for me is 2.6.24.1 - that's a lot of bisecting 
> and I fear that compile errors will invalidate the bisecting results again or 
> make it impossible at all....I'll try anyway....tomorrow...

Bisecting failed as expected :-(
I tried to follow the git-bisect manpage (and have successfully used 
bisect in the past a few times), but I think ~5700 revisions between 
2.6.24 and 2.6.25 are just too much fuzz. The bisect logs so far, with
my comments inbetween:

git-bisect start
# 2.6.25-rc4, known to be bad:
# bad: [84c6f6046c5a2189160a8f0dca8b90427bf690ea] x86_64: make ptrace always sign-extend orig_ax to 64 bits
git-bisect bad 84c6f6046c5a2189160a8f0dca8b90427bf690ea
# 2.6.24, good:
# good: [49914084e797530d9baaf51df9eda77babc98fa8] Linux 2.6.24
git-bisect good 49914084e797530d9baaf51df9eda77babc98fa8
# 2.6.24, hard lockup during boot, bad:
# bad: [bd45ac0c5daae35e7c71138172e63df5cf644cf6] Merge branch 'linux-2.6'
git-bisect bad bd45ac0c5daae35e7c71138172e63df5cf644cf6

I marked the last one bad, because I could not boot any more. As it's a 
headless box, I could not get more details. It did not even respond so
sysrq-b. After marking this bad, I compiled and booted again - same 
result, hard lockup. So I tried again:

git bisect reset
git-bisect start
# 2.6.25-rc4, known to be bad:
# bad: [84c6f6046c5a2189160a8f0dca8b90427bf690ea] x86_64: make ptrace always sign-extend orig_ax to 64 bits
git-bisect bad 84c6f6046c5a2189160a8f0dca8b90427bf690ea
# 2.6.24, good:
# good: [49914084e797530d9baaf51df9eda77babc98fa8] Linux 2.6.24
git-bisect good 49914084e797530d9baaf51df9eda77babc98fa8
# 2.6.24, hard lockup during boot, marking good anyway:
# good: [bd45ac0c5daae35e7c71138172e63df5cf644cf6] Merge branch 'linux-2.6'
git-bisect good bd45ac0c5daae35e7c71138172e63df5cf644cf6
# lockup
# bad: [f0f1b3364ae7f48084bdf2837fb979ff59622523] Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
git-bisect bad f0f1b3364ae7f48084bdf2837fb979ff59622523

Although I could not boot with bd45ac0c5daae35e7c71138172e63df5cf644cf6, I 
marked it "good", as the lockup is totally unrelated to my problem. 
However, the box locks up as soon as I'm using the device-mapper. This 
time it does respond to sysrq-b.

But still: I'm unable to diagnose the system hang [0] and I fear that
2.6.25 is released and for the first time since ages I'd have to skip a release...

Help!

Christian.

[0] http://lkml.org/lkml/2008/3/7/308
-- 
BOFH excuse #288:

Hard drive sleeping. Let it wake up on it's own...

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: 2.6.25-rc hangs
  2008-03-09  6:15       ` 2.6.25-rc hangs (was: INFO: task mount:11202 blocked for more than 120 seconds) Christian Kujau
@ 2008-03-09 16:44         ` Eric Sandeen
  2008-03-09 18:05           ` Christian Kujau
  0 siblings, 1 reply; 34+ messages in thread
From: Eric Sandeen @ 2008-03-09 16:44 UTC (permalink / raw)
  To: Christian Kujau; +Cc: David Chinner, LKML, xfs

Christian Kujau wrote:
> On Sat, 8 Mar 2008, Christian Kujau wrote:
>> FWIW, it's 100% reproducible with 2.6.25-rc3 too...sigh :-\
>> So, the last working kernel for me is 2.6.24.1 - that's a lot of bisecting 
>> and I fear that compile errors will invalidate the bisecting results again or 
>> make it impossible at all....I'll try anyway....tomorrow...
> 
> Bisecting failed as expected :-(
> I tried to follow the git-bisect manpage (and have successfully used 
> bisect in the past a few times), but I think ~5700 revisions between 
> 2.6.24 and 2.6.25 are just too much fuzz. The bisect logs so far, with
> my comments inbetween:

Christian, what is the test you are using for the bisect?

Thanks,
-Eric

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: 2.6.25-rc hangs
  2008-03-09 16:44         ` 2.6.25-rc hangs Eric Sandeen
@ 2008-03-09 18:05           ` Christian Kujau
       [not found]             ` <47D42AD6.10500@sandeen.net>
  0 siblings, 1 reply; 34+ messages in thread
From: Christian Kujau @ 2008-03-09 18:05 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: David Chinner, LKML, xfs

On Sun, 9 Mar 2008, Eric Sandeen wrote:
>> I tried to follow the git-bisect manpage (and have successfully used
>> bisect in the past a few times), but I think ~5700 revisions between
>> 2.6.24 and 2.6.25 are just too much fuzz. The bisect logs so far, with
>> my comments inbetween:
>
> Christian, what is the test you are using for the bisect?

Sorry, I don't understand: which "test" do you mean?

I did the bisect as per the bisect log and then just rebooted. Which gave 
me no usable results yet. I'm trying to boot 2.6.25-rc1 in a few moments 
and see if the hang is there as well. If it is, I'll start bisecting again 
and hope that "halfway between -rc1 and .24" will be a bootable kernel 
this time...

The "error" I'm trying to chase is a system "hang", but no instant lockup. 
I can reproduce this by increasing disk I/O. I did this primarily with 
rsync from different filesystems to my backup XFS partition. After a few 
minutes, the INFO: messages[0] appeared.

Thanks,
Christian.

[0] http://lkml.org/lkml/2008/3/7/308
-- 
BOFH excuse #2:

solar flares

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: 2.6.25-rc hangs
       [not found]             ` <47D42AD6.10500@sandeen.net>
@ 2008-03-09 18:49               ` Christian Kujau
  2008-03-12 18:06                 ` Samuel Tardieu
  0 siblings, 1 reply; 34+ messages in thread
From: Christian Kujau @ 2008-03-09 18:49 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: LKML

On Sun, 9 Mar 2008, Eric Sandeen wrote:
> I meant the test you were using to determine "good" from "bad" - I guess
> it was "boot up and do IO for a while to see if it hangs"

Maybe I was too confusing, let me try again:

* 2.6.24.1 was running fine for weeks (and is now)
* 2.6.25-rc3, -rc4 comes with system hang. Trying to bisect it failed
   at a really early stage:

  [check out current -git, 2.6.25-rc4)
  $ git bisect start
  $ git bisect bad          # because I know, that current -git is bad
  $ git bisect good v2.6.24 # because I know 2.6.24 is good
  [ compiling, and first reboot]
  -> failed, because hard lockup.

After rebooting to a working kernel I can now do either:

  1) mark the current one as "bad", solely on the fact that "it does not
    boot" and is "bad" per se, ignoring the fact that if the box *had*
    booted, the system hang *perhaps* did not occur. IOW, I'm marking
    it "bad" because of a totally different issue.

  2) although booting failed, I still mark it "good", which means I'm
    literally *guessing* that this current kernel (bd45ac0c5daa...)
    does NOT have the system hang and thus affecting all subsequent
    bisects. Worst case: this guess turns out to be false and I'm bisecting
    all through all ~2800 revisions without ever getting the correct "bad"
    one (because they were actually all "bad").

I tried 1), but did not get any further, as the next kernel did not boot 
either.
I tried 2), but did not get any further, as the next kernel did boot but 
locked up when I tried to use the device mapper.

Hope that's a bit better explained than before...

Thanks,
C.
-- 
BOFH excuse #58:

high pressure system failure

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: INFO: task mount:11202 blocked for more than 120 seconds
  2008-03-07 23:46   ` Christian Kujau
  2008-03-08  1:54     ` Christian Kujau
@ 2008-03-09 21:34     ` David Chinner
  2008-03-10  1:46       ` Christian Kujau
                         ` (2 more replies)
  1 sibling, 3 replies; 34+ messages in thread
From: David Chinner @ 2008-03-09 21:34 UTC (permalink / raw)
  To: Christian Kujau; +Cc: David Chinner, LKML, xfs, dm-devel

[adding dm-devel to cc list]

On Sat, Mar 08, 2008 at 12:46:40AM +0100, Christian Kujau wrote:
> On Sat, 8 Mar 2008, David Chinner wrote:
> >Well, if that is hung there, something else must be holding on to
> >the iolock it's waiting on. What are the other D state processes in the
> >machine?
> 
> I have 7 processes in D state so far:
> 
> $ ps auxww [....]
> root      9844  0.0  0.0      0     0 ?   D    Mar06   0:22 [pdflush]
> root      2697  0.0  0.0   4712   460 ?   D    Mar07   0:00 sync
> root      8342  0.0  0.0   1780   440 ?   D    Mar07   0:01 /bin/rm -rf 
> /data/md1/stuff
> root     12494  0.0  0.0  11124  1228 ?   D    Mar07   0:14 /usr/bin/rsync 
> root     15008  0.0  0.0   4712   460 ?   D    Mar07   0:00 sync
> root     11202  0.0  0.0   5012   764 ?   D    Mar07   0:00 mount -o 
> remount,ro /data/md1
> root     15936  0.0  0.0   4712   460 ?   D    Mar07   0:00 sync
> 
> At one point I did a sysrq-D and put the results in:
> http://nerdbynature.de/bits/2.6.25-rc4/hung_task/kern.log.gz
> (grep for "SysRq : Show Locks Held" and "SysRq : Show Blocked State")

Ok, this looks like a block layer issue. Everything is waiting in ioschedule()
and so in places we are blocked holding locked inodes. Hence sync, pdflush,
etc are hung waiting for the inode locks to be released. e.g:

INFO: task rm:8342 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
rm            D ee08de8c     0  8342   8199
       f5eebd80 00000086 c0380a16 ee08de8c ee08de8c 00000000 ee08de94 c200ebd0 
       c043ef2b c0146237 c043f1d2 c0146210 ee08de8c 00000000 00000000 db122110 
       c01464ca 00000002 c1865b00 0000000c 00000000 ee3acd60 c012c700 c200ebec 
Call Trace:
 [<c0380a16>] dm_table_unplug_all+0x26/0x40
 [<c043ef2b>] io_schedule+0xb/0x20
 [<c0146237>] sync_page+0x27/0x40
 [<c043f1d2>] __wait_on_bit+0x42/0x70
 [<c0146210>] sync_page+0x0/0x40
 [<c01464ca>] wait_on_page_bit+0x5a/0x60
 [<c012c700>] wake_bit_function+0x0/0x60
 [<c014ec53>] truncate_inode_pages_range+0x223/0x360
 [<c014eda7>] truncate_inode_pages+0x17/0x20
 [<c017b44f>] generic_delete_inode+0xef/0x100
 [<c017a8ac>] iput+0x5c/0x70
 [<c0171e47>] do_unlinkat+0xf7/0x160
 [<c0102e29>] sysenter_past_esp+0x9a/0xa5
 [<c013835c>] trace_hardirqs_on+0x9c/0x110
 [<c0102dee>] sysenter_past_esp+0x5f/0xa5
 =======================
no locks held by rm/8342.

And rsync is stuck in congestion_wait()

SysRq : Show Blocked State
  task                PC stack   pid father
rsync         D 00000292     0 12494      1
       f5dc7b40 00000086 00000000 00000292 f697bcdc 00735f5c 00000000 00000600 
       c043efd9 c013820d f532ed60 c05c0f04 f5cc1b58 00735f5c c0122900 f532ed60 
       c05c0c00 c053eb04 0000000a c043ef0b c01515f8 00000000 f532ed60 c012c6c0 
Call Trace:
 [<c043efd9>] schedule_timeout+0x49/0xc0
 [<c013820d>] mark_held_locks+0x3d/0x70
 [<c0122900>] process_timeout+0x0/0x10
 [<c043ef0b>] io_schedule_timeout+0xb/0x20
 [<c01515f8>] congestion_wait+0x58/0x80
 [<c012c6c0>] autoremove_wake_function+0x0/0x40
 [<c014cc32>] balance_dirty_pages_ratelimited_nr+0xb2/0x240
 [<c0147368>] generic_file_buffered_write+0x1a8/0x650
 [<c028540e>] xfs_log_move_tail+0x3e/0x180
 [<c0440be9>] _spin_lock+0x29/0x40
 [<c02a73cc>] xfs_write+0x7ac/0x8a0
 [<c0174c01>] core_sys_select+0x21/0x350
 [<c02a32bc>] xfs_file_aio_write+0x5c/0x70
 [<c0167d25>] do_sync_write+0xd5/0x120
 [<c0102ed7>] restore_nocheck+0x12/0x15
 [<c012c6c0>] autoremove_wake_function+0x0/0x40
 [<c019d105>] dnotify_parent+0x35/0x90
 [<c0167c50>] do_sync_write+0x0/0x120
 [<c016859f>] vfs_write+0x9f/0x140
 [<c0168b51>] sys_write+0x41/0x70
 [<c0102dee>] sysenter_past_esp+0x5f/0xa5

And this rsync procss will definitely be holding the iolock on an XFS
inode here (which is why other processes are hanging on an inode iolock).

> >Also, the iolock can be held across I/O so it's possible you've lost an 
> >I/O. Any I/O errors in the syslog?
> 
> No, no I/O errors at all. See the kern.log above, I could even do dd(1) 
> from the md1 (dm-crypt on raid1), no errors either.

Oh, dm-crypt. Well, I'd definitely start looking there. XFS has a
history of exposing dm-crypt bugs, and these hangs appear to be I/O
congestion/scheduling related and not XFS. Also, we haven't changed
anything related to plug/unplug of block devices in XFS recently, so
that also points to some other change as well...

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: INFO: task mount:11202 blocked for more than 120 seconds
  2008-03-09 21:34     ` INFO: task mount:11202 blocked for more than 120 seconds David Chinner
@ 2008-03-10  1:46       ` Christian Kujau
  2008-03-12 18:03       ` Samuel Tardieu
  2008-03-14  9:27       ` Milan Broz
  2 siblings, 0 replies; 34+ messages in thread
From: Christian Kujau @ 2008-03-10  1:46 UTC (permalink / raw)
  To: David Chinner; +Cc: LKML, xfs, dm-devel

On Mon, 10 Mar 2008, David Chinner wrote:
> Oh, dm-crypt. Well, I'd definitely start looking there. XFS has a
> history of exposing dm-crypt bugs, and these hangs appear to be I/O
> congestion/scheduling related and not XFS.

Yeah, I noticed that too, thanks for verifying this: during the 2nd bisect 
run, the box locked up hard when I accessed the device-mapper. I'm using a 
wrapper script to set up my luks/dm-crypt devices and still have to find 
out which command exactly triggers the lockup. So, maybe the hard lockup 
and the hangs are not so unrelated after all...oh well.

> Also, we haven't changed
> anything related to plug/unplug of block devices in XFS recently, so
> that also points to some other change as well...

Thanks for your assistance, David, I really appreciate it. I'll try to 
find out more about this dm-crypt thingy....

Christian.
-- 
BOFH excuse #396:

Mail server hit by UniSpammer.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: INFO: task mount:11202 blocked for more than 120 seconds
  2008-03-09 21:34     ` INFO: task mount:11202 blocked for more than 120 seconds David Chinner
  2008-03-10  1:46       ` Christian Kujau
@ 2008-03-12 18:03       ` Samuel Tardieu
  2008-03-12 19:53         ` Chr
  2008-03-14  9:27       ` Milan Broz
  2 siblings, 1 reply; 34+ messages in thread
From: Samuel Tardieu @ 2008-03-12 18:03 UTC (permalink / raw)
  To: device-mapper development; +Cc: Christian Kujau, David Chinner, LKML, xfs

>>>>> "David" == David Chinner <dgc@sgi.com> writes:

>> No, no I/O errors at all. See the kern.log above, I could even do
>> dd(1) from the md1 (dm-crypt on raid1), no errors either.

David> Oh, dm-crypt. Well, I'd definitely start looking there. XFS has
David> a history of exposing dm-crypt bugs, and these hangs appear to
David> be I/O congestion/scheduling related and not XFS. Also, we
David> haven't changed anything related to plug/unplug of block
David> devices in XFS recently, so that also points to some other
David> change as well...

For what it is worth, I notice the same error last week (with a kernel
of the day) on my laptop but was too busy at work to investigate. And
I use ext3... on dm-crypt.

  Sam
-- 
Samuel Tardieu -- sam@rfc1149.net -- http://www.rfc1149.net/


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: 2.6.25-rc hangs
  2008-03-09 18:49               ` Christian Kujau
@ 2008-03-12 18:06                 ` Samuel Tardieu
  2008-03-12 21:02                   ` Christian Kujau
  0 siblings, 1 reply; 34+ messages in thread
From: Samuel Tardieu @ 2008-03-12 18:06 UTC (permalink / raw)
  To: Christian Kujau; +Cc: Eric Sandeen, LKML

>>>>> "Christian" == Christian Kujau <lists@nerdbynature.de> writes:

Christian> After rebooting to a working kernel I can now do either:

Christian>  1) mark the current one as "bad" [...]

Christian>  2) although booting failed, I still mark it "good" [...]

Can't you skip it with "git bisect skip" in this case? (I've never
tried "skip" but from "git bisect" documentation it does what you
want, and "git bisect" will select a nearby revision)

  Sam
-- 
Samuel Tardieu -- sam@rfc1149.net -- http://www.rfc1149.net/


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: INFO: task mount:11202 blocked for more than 120 seconds
  2008-03-12 18:03       ` Samuel Tardieu
@ 2008-03-12 19:53         ` Chr
  2008-03-12 23:07           ` Christian Kujau
  0 siblings, 1 reply; 34+ messages in thread
From: Chr @ 2008-03-12 19:53 UTC (permalink / raw)
  To: Samuel Tardieu
  Cc: device-mapper development, Christian Kujau, David Chinner, LKML, xfs

On Wednesday 12 March 2008 19:03:51 Samuel Tardieu wrote:
>
> David> Oh, dm-crypt. Well, I'd definitely start looking there. XFS has
> David> a history of exposing dm-crypt bugs, and these hangs appear to
> David> be I/O congestion/scheduling related and not XFS. Also, we
> David> haven't changed anything related to plug/unplug of block
> David> devices in XFS recently, so that also points to some other
> David> change as well...
>
me too... 

http://lkml.org/lkml/2008/3/11/377
(BTW: I'm still bisecting it... can somebody please confirm that 2.6.24-git17 
http://www.kernel.org/pub/linux/kernel/v2.6/snapshots/patch-2.6.24-git17.bz2
was the _last_ "good" one?)  

BTW:
by pressing "SYSRQ-S (Emerg. Sync)" over and over again can unlock
some but not all "D" tasks... 

Regards,
	Chr.


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: 2.6.25-rc hangs
  2008-03-12 18:06                 ` Samuel Tardieu
@ 2008-03-12 21:02                   ` Christian Kujau
  0 siblings, 0 replies; 34+ messages in thread
From: Christian Kujau @ 2008-03-12 21:02 UTC (permalink / raw)
  To: Samuel Tardieu; +Cc: LKML, dm-devel

On Wed, 12 Mar 2008, Samuel Tardieu wrote:
> Can't you skip it with "git bisect skip" in this case? (I've never
> tried "skip" but from "git bisect" documentation it does what you
> want, and "git bisect" will select a nearby revision)

Yes, I shall try so. In the meantime I tried -rc5 but even more headaches 
with this one:

[ 1219.355352] ------------[ cut here ]------------
[ 1219.355359] WARNING: at drivers/usb/host/ehci-hcd.c:287 ehci_iaa_watchdog+0x7a/0x80()
[ 1219.355362] Modules linked in: act_police sch_ingress cls_u32 sch_sfq sch_cbq xt_tcpudp ipt_MASQUERADE iptable_nat nf_conntrack_ipv4 nf_nat_ftp nf_nat nf_conntrack_ftp xt_conntrack nf_conntrack iptable_filter ip_tables ipt_ULOG x_tables nfsd lockd nfs_acl auth_rpcgss exportfs sunrpc tun fuse sg sr_mod twofish_i586 twofish_common eeprom w83l785ts asb100 hwmon_vid hwmon usb_storage ecb zd1211rw firmware_class snd_intel8x0 snd_ac97_codec mac80211 ac97_bus snd_pcm snd_timer snd soundcore pl2303 usbserial snd_page_alloc cfg80211 i2c_nforce2 i2c_core
[ 1219.355411] Pid: 450, comm: md0_raid1 Not tainted 2.6.25-rc5 #1
[ 1219.355415]  [<c0119a0f>] warn_on_slowpath+0x5f/0x90
[ 1219.355429]  [<c0138d57>] __lock_acquire+0x537/0x10b0
[ 1219.355441]  [<c0117062>] task_tick_fair+0x32/0x60
[ 1219.355446]  [<c01159cf>] scheduler_tick+0xcf/0x200
[ 1219.355453]  [<c011b47d>] profile_tick+0x3d/0x80
[ 1219.355459]  [<c0102ed7>] restore_nocheck+0x12/0x15
[ 1219.355467]  [<c043c9ff>] _spin_lock_irqsave+0x3f/0x50
[ 1219.355475]  [<c03620ba>] ehci_iaa_watchdog+0x7a/0x80
[ 1219.355480]  [<c0122748>] run_timer_softirq+0x128/0x190
[ 1219.355486]  [<c0105688>] do_softirq+0x98/0xd0
[ 1219.355492]  [<c0362040>] ehci_iaa_watchdog+0x0/0x80
[ 1219.355498]  [<c011ecb2>] __do_softirq+0x52/0xa0
[ 1219.355503]  [<c0105688>] do_softirq+0x98/0xd0
[ 1219.355507]  [<c0144960>] handle_level_irq+0x0/0xe0
[ 1219.355515]  [<c011ec3d>] irq_exit+0x4d/0x70
[ 1219.355518]  [<c0105750>] do_IRQ+0x90/0xf0
[ 1219.355522]  [<c013837c>] trace_hardirqs_on+0x9c/0x110
[ 1219.355529]  [<c0103886>] common_interrupt+0x2e/0x34
[ 1219.355536]  [<c043c982>] _spin_unlock_irq+0x22/0x30
[ 1219.355540]  [<c02ba101>] __make_request+0xc1/0x320
[ 1219.355550]  [<c02b91e7>] generic_make_request+0x177/0x230
[ 1219.355556]  [<c043caf5>] _spin_unlock_irqrestore+0x45/0x60
[ 1219.355561]  [<c013837c>] trace_hardirqs_on+0x9c/0x110
[ 1219.355567]  [<c03705d9>] raid1d+0x419/0xcc0
[ 1219.355574]  [<c0138d57>] __lock_acquire+0x537/0x10b0
[ 1219.355581]  [<c01159cf>] scheduler_tick+0xcf/0x200
[ 1219.355590]  [<c0102ed7>] restore_nocheck+0x12/0x15
[ 1219.355593]  [<c037b6e0>] md_thread+0x0/0xe0
[ 1219.355599]  [<c013837c>] trace_hardirqs_on+0x9c/0x110
[ 1219.355608]  [<c043caf5>] _spin_unlock_irqrestore+0x45/0x60
[ 1219.355612]  [<c037b6e0>] md_thread+0x0/0xe0
[ 1219.355616]  [<c013837c>] trace_hardirqs_on+0x9c/0x110
[ 1219.355622]  [<c037b6e0>] md_thread+0x0/0xe0
[ 1219.355626]  [<c037b702>] md_thread+0x22/0xe0
[ 1219.355631]  [<c012c6d0>] autoremove_wake_function+0x0/0x40
[ 1219.355639]  [<c037b6e0>] md_thread+0x0/0xe0
[ 1219.355643]  [<c012c402>] kthread+0x42/0x70
[ 1219.355647]  [<c012c3c0>] kthread+0x0/0x70
[ 1219.355651]  [<c0103a1f>] kernel_thread_helper+0x7/0x18
[ 1219.355657]  =======================
[ 1219.355660] ---[ end trace d0567d7a35270324 ]---


@dm-devel, can you get something useful out of these traces?

Thank you,
Christian.
-- 
BOFH excuse #152:

My pony-tail hit the on/off switch on the power strip.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: INFO: task mount:11202 blocked for more than 120 seconds
  2008-03-12 19:53         ` Chr
@ 2008-03-12 23:07           ` Christian Kujau
  2008-03-13 13:45             ` Christian Kujau
  0 siblings, 1 reply; 34+ messages in thread
From: Christian Kujau @ 2008-03-12 23:07 UTC (permalink / raw)
  To: Chr; +Cc: Samuel Tardieu, device-mapper development, LKML

On Wed, 12 Mar 2008, Chr wrote:
> http://lkml.org/lkml/2008/3/11/377
> (BTW: I'm still bisecting it... can somebody please confirm that 2.6.24-git17
> http://www.kernel.org/pub/linux/kernel/v2.6/snapshots/patch-2.6.24-git17.bz2
> was the _last_ "good" one?)

Hm, -git17 locked up hard when trying to access the device mapper
(just like -rc1), not even sysrq was usable any more. Sadly, 
nothing was logged to the disks/network. Will try earlier snapshots 
then. And yes, this resyncing of the md devices after an unclean shutdown 
is a real pita right now...

Thanks for reporting!
Christian.
-- 
BOFH excuse #197:

I'm sorry a pentium won't do, you need an SGI to connect with us.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: INFO: task mount:11202 blocked for more than 120 seconds
  2008-03-12 23:07           ` Christian Kujau
@ 2008-03-13 13:45             ` Christian Kujau
  2008-03-13 21:33               ` Chr
  0 siblings, 1 reply; 34+ messages in thread
From: Christian Kujau @ 2008-03-13 13:45 UTC (permalink / raw)
  To: Chr; +Cc: Samuel Tardieu, device-mapper development, LKML

On Thu, 13 Mar 2008, Christian Kujau wrote:
> Hm, -git17 locked up hard when trying to access the device mapper

and so does -git8 and -git9...sigh :(

C.
-- 
BOFH excuse #412:

Radial Telemetry Infiltration

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: INFO: task mount:11202 blocked for more than 120 seconds
  2008-03-13 13:45             ` Christian Kujau
@ 2008-03-13 21:33               ` Chr
  2008-03-13 21:54                 ` Christian Kujau
  0 siblings, 1 reply; 34+ messages in thread
From: Chr @ 2008-03-13 21:33 UTC (permalink / raw)
  To: Christian Kujau, Christian Kujau
  Cc: Samuel Tardieu, device-mapper development, LKML

On Thursday 13 March 2008 14:45:39 Christian Kujau wrote:
> and so does -git8 and -git9...sigh :(

hmm, I can boot any of these -rcX-gitY kernels (but not without some stuck 
tasks after about a hour or so...)

but back: I tried lockdep and ran into this: (debug blocker)

pktcdvd: writer pktcdvd0 mapped to sr0

=============================================
[ INFO: possible recursive locking detected ]
2.6.25-rc5-git3 #2
---------------------------------------------
scsi_id/8262 is trying to acquire lock:
 (&bdev->bd_mutex){--..}, at: [<ffffffff802a8e51>] do_open+0x7b/0x2c0

but task is already holding lock:
 (&bdev->bd_mutex){--..}, at: [<ffffffff802a8e51>] do_open+0x7b/0x2c0

other info that might help us debug this:
2 locks held by scsi_id/8262:
 #0:  (&bdev->bd_mutex){--..}, at: [<ffffffff802a8e51>] do_open+0x7b/0x2c0
 #1:  (&ctl_mutex#2){--..}, at: [<ffffffff88231af2>] pkt_open+0x26/0x516 
[pktcdvd]

stack backtrace:
Pid: 8262, comm: scsi_id Not tainted 2.6.25-rc5-git3 #2

Call Trace:
 [<ffffffff8024ffe4>] __lock_acquire+0x8c9/0xc72
 [<ffffffff804bb23e>] ? __mutex_unlock_slowpath+0xf9/0x105
 [<ffffffff8025079f>] lock_acquire+0x5e/0x77
 [<ffffffff802a8e51>] ? do_open+0x7b/0x2c0
 [<ffffffff804bb336>] mutex_lock_nested+0xe1/0x270
 [<ffffffff802a8e51>] ? do_open+0x7b/0x2c0
 [<ffffffff802a8e51>] do_open+0x7b/0x2c0
 [<ffffffff802a9116>] __blkdev_get+0x80/0x92
 [<ffffffff802a9133>] blkdev_get+0xb/0xd
 [<ffffffff88231b6b>] :pktcdvd:pkt_open+0x9f/0x516
 [<ffffffff802a8e84>] do_open+0xae/0x2c0
 [<ffffffff802a92a4>] ? blkdev_open+0x0/0x6a
 [<ffffffff804bcbab>] ? _spin_unlock+0x26/0x2a
 [<ffffffff802a92a4>] ? blkdev_open+0x0/0x6a
 [<ffffffff802a92db>] blkdev_open+0x37/0x6a
 [<ffffffff80282571>] __dentry_open+0xe6/0x1c0
 [<ffffffff802826e4>] nameidata_to_filp+0x2e/0x40
 [<ffffffff8028272f>] do_filp_open+0x39/0x4b
 [<ffffffff804bcbab>] ? _spin_unlock+0x26/0x2a
 [<ffffffff8028246f>] ? get_unused_fd_flags+0x10d/0x11c
 [<ffffffff80282792>] do_sys_open+0x51/0xd9
 [<ffffffff80282843>] sys_open+0x1b/0x1d
 [<ffffffff8020b22b>] system_call_after_swapgs+0x7b/0x80

soooo.....?

scsi_id/8262 is trying to acquire lock:
 (&bdev->bd_mutex){--..}, at: [<ffffffff802a8e51>] do_open+0x7b/0x2c0

but task is already holding lock:
 (&bdev->bd_mutex){--..}, at: [<ffffffff802a8e51>] do_open+0x7b/0x2c0

sounds a bit strange? Is this a SMP related problem after all?
(*disabled pktcddvd for now... to see if still locks)

Regards,
	Chr...

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: INFO: task mount:11202 blocked for more than 120 seconds
  2008-03-13 21:33               ` Chr
@ 2008-03-13 21:54                 ` Christian Kujau
  2008-03-14  0:15                   ` Chr
  0 siblings, 1 reply; 34+ messages in thread
From: Christian Kujau @ 2008-03-13 21:54 UTC (permalink / raw)
  To: Chr; +Cc: Samuel Tardieu, device-mapper development, LKML

On Thu, 13 Mar 2008, Chr wrote:
> On Thursday 13 March 2008 14:45:39 Christian Kujau wrote:
>> and so does -git8 and -git9...sigh :(
>
> hmm, I can boot any of these -rcX-gitY kernels (but not without some stuck
> tasks after about a hour or so...)

Ah, when you said -git17 would be the last "good" kernel, I thought 
everything earlier would not have the stuck tasks.

> but back: I tried lockdep and ran into this: (debug blocker)

I have lockdep enabled as well, but no warnings so far...

> sounds a bit strange? Is this a SMP related problem after all?

I don't think so, as I don't have an SMP system (and CONFIG_SMP is 
disabled as well).

C.
-- 
BOFH excuse #441:

Hash table has woodworm

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: INFO: task mount:11202 blocked for more than 120 seconds
  2008-03-13 21:54                 ` Christian Kujau
@ 2008-03-14  0:15                   ` Chr
  0 siblings, 0 replies; 34+ messages in thread
From: Chr @ 2008-03-14  0:15 UTC (permalink / raw)
  To: Christian Kujau
  Cc: Samuel Tardieu, device-mapper development, LKML, Ingo Molnar

On Thursday 13 March 2008 22:54:25 Christian Kujau wrote:
> On Thu, 13 Mar 2008, Chr wrote:
> > On Thursday 13 March 2008 14:45:39 Christian Kujau wrote:
> >> and so does -git8 and -git9...sigh :(
> >
> > hmm, I can boot any of these -rcX-gitY kernels (but not without some
> > stuck tasks after about a hour or so...)
>
> Ah, when you said -git17 would be the last "good" kernel, I thought
> everything earlier would not have the stuck tasks.
well, no... even 2.6.24.3 has _hangs_, however not forever (approx. 5-10 min)
and everything is back to normal...
 
> > sounds a bit strange? Is this a SMP related problem after all?
>
> I don't think so, as I don't have an SMP system (and CONFIG_SMP is
> disabled as well).
Then maybe scheduler? because these values looks suspecious:
(that's why I added Ingo to CC, because he probably knows what's NOT going on
 ;-) )

(full dmesg there: 
http://www.pastebin.ca/941845

config here:
http://www.pastebin.ca/941856
)

the funny stuff is highlighted with a >:
cpu#1, 2211.332 MHz
  .nr_running                    : 8
  .load                          : 188884
  .nr_switches                   : 9615826
  .nr_load_updates               : 1879402
 > .nr_uninterruptible            : -2053
  .jiffies                       : 4296946718
  .next_balance                  : 4296.946740
  .curr->pid                     : 0
  .clock                         : 9800558.756490
  .idle_clock                    : 6521432.794621
  .prev_clock_raw                : 4476359.432375
  .clock_warps                   : 0
  .clock_overflows               : 8676899
  .clock_underflows              : 688045
  .clock_deep_idle_events        : 0
  .clock_max_delta               : 4.000250
  .cpu_load[0]                   : 0
  .cpu_load[1]                   : 24
  .cpu_load[2]                   : 112
  .cpu_load[3]                   : 155
  .cpu_load[4]                   : 177
 
cfs_rq
  .exec_clock                    : 1353227.466775
  .MIN_vruntime                  : 1808656.942015
  .min_vruntime                  : 1808696.942015
  .max_vruntime                  : 1808657.235424
  .spread                        : 0.293409
>  .spread0                       : -267982.528057
  .nr_running                    : 7
  .load                          : 11362
  .bkl_count                     : 60751
  .nr_spread_over                : 6234
 
runnable tasks:
            task   PID         tree-key  switches  prio     exec-runtime         
sum-exec        sum-sleep
----------------------------------------------------------------------------------------------------------
         kcryptd  2266   1808656.942015     12778   115   1808656.942015      
3689.257692   3125323.031084
 hald-addon-inpu  5504   1808656.942015       622   120   1808656.942015        
10.739211   3044831.003941
         kcryptd  7312   1808656.960820    578891   115   1808656.960820     
71166.387493   2625492.750280
            Xorg  9564   1808656.942016    513338   120   1808656.942016    
245777.204573   2587613.331559
         pdflush 10059   1808656.942015     14575   120   1808656.942015      
4400.355110   2985523.630368
         pdflush 10696   1808656.942015      9433   120   1808656.942015      
3607.302084   3103638.660388
            xine 16773   1808657.235424     13455   120   1808657.235424      
2135.923058    574348.774173
 >       watchdog 16959   -206866.028086         0    98   -206866.028086         
0.000000         0.000000

(it's at the end of the dmesg, the kernel, X.org and probably everything else 
was stuck...)

Thanks,
	Christian.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: INFO: task mount:11202 blocked for more than 120 seconds
  2008-03-09 21:34     ` INFO: task mount:11202 blocked for more than 120 seconds David Chinner
  2008-03-10  1:46       ` Christian Kujau
  2008-03-12 18:03       ` Samuel Tardieu
@ 2008-03-14  9:27       ` Milan Broz
  2008-03-14 23:58         ` Christian Kujau
  2008-03-15  0:08         ` Chr
  2 siblings, 2 replies; 34+ messages in thread
From: Milan Broz @ 2008-03-14  9:27 UTC (permalink / raw)
  To: David Chinner; +Cc: Christian Kujau, LKML, xfs, dm-devel

David Chinner wrote:
>>> Also, the iolock can be held across I/O so it's possible you've lost an 
>>> I/O. Any I/O errors in the syslog?
>> No, no I/O errors at all. See the kern.log above, I could even do dd(1) 
>> from the md1 (dm-crypt on raid1), no errors either.
> 
> Oh, dm-crypt. Well, I'd definitely start looking there. XFS has a
> history of exposing dm-crypt bugs, and these hangs appear to be I/O
> congestion/scheduling related and not XFS. Also, we haven't changed
> anything related to plug/unplug of block devices in XFS recently, so
> that also points to some other change as well...

Yes, there is bug in dm-crypt...
Please try if the patch here helps: http://lkml.org/lkml/2008/3/14/71

Thanks,
Milan
--
mbroz@redhat.com


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: INFO: task mount:11202 blocked for more than 120 seconds
  2008-03-14  9:27       ` Milan Broz
@ 2008-03-14 23:58         ` Christian Kujau
  2008-03-16 20:33           ` David Chinner
  2008-03-15  0:08         ` Chr
  1 sibling, 1 reply; 34+ messages in thread
From: Christian Kujau @ 2008-03-14 23:58 UTC (permalink / raw)
  To: Milan Broz; +Cc: David Chinner, LKML, xfs, dm-devel

On Fri, 14 Mar 2008, Milan Broz wrote:
> Yes, there is bug in dm-crypt...
> Please try if the patch here helps: http://lkml.org/lkml/2008/3/14/71

Hm, it seems to help the hangs, yes. Applied to today's -git a few hours 
ago, the hangs are gone. However, when doing lots of disk I/O, the machine 
locks up after a few (10-20) minutes. Sadly, netconsole got nothing :(

After the first lockup I tried again and shortly after bootup I got:

[  866.681441] [ INFO: possible circular locking dependency detected ]
[  866.681876] 2.6.25-rc5 #1
[  866.682203] -------------------------------------------------------
[  866.682637] kswapd0/132 is trying to acquire lock:
[  866.683028]  (&(&ip->i_iolock)->mr_lock){----}, at: [<c027a686>] xfs_ilock+0x96/0xb0
[  866.683916] 
[  866.683917] but task is already holding lock:
[  866.684582]  (iprune_mutex){--..}, at: [<c017b592>] shrink_icache_memory+0x72/0x220
[  866.685461] 
[  866.685462] which lock already depends on the new lock.
[  866.685463] 
[  866.686440] 
[  866.686441] the existing dependency chain (in reverse order) is:
[  866.687151] 
[  866.687152] -> #1 (iprune_mutex){--..}:
[  866.687339]        [<c0136914>] add_lock_to_list+0x44/0xc0
[  866.687339]        [<c01393a6>] __lock_acquire+0xc26/0x10b0
[  866.687339]        [<c017b592>] shrink_icache_memory+0x72/0x220
[  866.687339]        [<c013890f>] __lock_acquire+0x18f/0x10b0
[  866.687339]        [<c013988e>] lock_acquire+0x5e/0x80
[  866.687339]        [<c017b592>] shrink_icache_memory+0x72/0x220
[  866.687339]        [<c043fc79>] mutex_lock_nested+0x89/0x240
[  866.687339]        [<c017b592>] shrink_icache_memory+0x72/0x220
[  866.687339]        [<c017b592>] shrink_icache_memory+0x72/0x220
[  866.687339]        [<c017b592>] shrink_icache_memory+0x72/0x220
[  866.687339]        [<c0150051>] shrink_slab+0x21/0x160
[  866.687340]        [<c0150131>] shrink_slab+0x101/0x160
[  866.687340]        [<c01502e2>] try_to_free_pages+0x152/0x230
[  866.687340]        [<c014f060>] isolate_pages_global+0x0/0x60
[  866.687340]        [<c014b95b>] __alloc_pages+0x14b/0x370
[  866.687340]        [<c04413a0>] _read_unlock_irq+0x20/0x30
[  866.687340]        [<c0146601>] __grab_cache_page+0x81/0xc0
[  866.687340]        [<c01896f6>] block_write_begin+0x76/0xe0
[  866.687340]        [<c029ec76>] xfs_vm_write_begin+0x46/0x50
[  866.687340]        [<c029f4c0>] xfs_get_blocks+0x0/0x30
[  866.687340]        [<c0147297>] generic_file_buffered_write+0x117/0x650
[  866.687340]        [<c027a65d>] xfs_ilock+0x6d/0xb0
[  866.687340]        [<c02a73cc>] xfs_write+0x7ac/0x8a0
[  866.687340]        [<c0174ac1>] core_sys_select+0x21/0x350
[  866.687340]        [<c02a32bc>] xfs_file_aio_write+0x5c/0x70
[  866.687340]        [<c0167bf5>] do_sync_write+0xd5/0x120
[  866.687340]        [<c012c630>] autoremove_wake_function+0x0/0x40
[  866.687340]        [<c019cfd5>] dnotify_parent+0x35/0x90
[  866.687340]        [<c0167b20>] do_sync_write+0x0/0x120
[  866.687340]        [<c016846f>] vfs_write+0x9f/0x140
[  866.687340]        [<c0168a21>] sys_write+0x41/0x70
[  866.687340]        [<c0102dee>] sysenter_past_esp+0x5f/0xa5
[  866.687340]        [<ffffffff>] 0xffffffff
[  866.687340] 
[  866.687340] -> #0 (&(&ip->i_iolock)->mr_lock){----}:
[  866.687340]        [<c0136ba0>] print_circular_bug_entry+0x40/0x50

The box was running fine then for ~20 minutes, then it locked up again.

Full dmesg and .config: http://nerdbynature.de/bits/2.6.25-rc5/

Right now I'm back to 2.6.24.3...

Thanks,
Christian.
-- 
BOFH excuse #350:

paradigm shift...without a clutch

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: INFO: task mount:11202 blocked for more than 120 seconds
  2008-03-14  9:27       ` Milan Broz
  2008-03-14 23:58         ` Christian Kujau
@ 2008-03-15  0:08         ` Chr
  2008-03-15 13:32           ` [dm-crypt] " Chr
  1 sibling, 1 reply; 34+ messages in thread
From: Chr @ 2008-03-15  0:08 UTC (permalink / raw)
  To: Milan Broz; +Cc: David Chinner, Christian Kujau, LKML, xfs, dm-devel

On Friday 14 March 2008 10:27:07 Milan Broz wrote:
>
> > Oh, dm-crypt. Well, I'd definitely start looking there. XFS has a
> > history of exposing dm-crypt bugs, and these hangs appear to be I/O
> > congestion/scheduling related and not XFS. Also, we haven't changed
> > anything related to plug/unplug of block devices in XFS recently, so
> > that also points to some other change as well...
>
> Yes, there is bug in dm-crypt...
> Please try if the patch here helps: http://lkml.org/lkml/2008/3/14/71
>
hmm, :-/

stuck again... (tried the patch on top of 2.6.25-rc5 and 2.6.25-rc5-git4)

SYS-RQ (several times) T+W+Q 
right here:
http://www.pastebin.ca/943145

Regards,
	Christian

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [dm-crypt] INFO: task mount:11202 blocked for more than 120 seconds
  2008-03-15  0:08         ` Chr
@ 2008-03-15 13:32           ` Chr
  2008-03-15 21:34             ` Chr
  0 siblings, 1 reply; 34+ messages in thread
From: Chr @ 2008-03-15 13:32 UTC (permalink / raw)
  To: Milan Broz
  Cc: David Chinner, Christian Kujau, LKML, xfs, dm-devel,
	Alasdair G Kergon, dm-crypt

>On Sat, Mar 15, 2008 at 01:21:04AM +0100, Ulrich Lukas wrote:
>> With that patch applied, writing seems to work again.
> 
> Current version of the patch is below.  It's still not quite finished.
>
:-/, both patches freezes here... 

I'll try to revert:

commit  3a7f6c990ad04e6f576a159876c602d14d6f7fef
dm crypt: use async crypto

dm-crypt: Use crypto ablkcipher interface
Move encrypt/decrypt core to async crypto call.

maybe it isn't the async stuff after all..

>> Btw, is this a regression because of the changed 2.6.25(-rc) code or is
>> it just a coincidence that it worked before?
>
>Introduced during code changes to support async crypto interface.

Regards,
	Chr
 

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [dm-crypt] INFO: task mount:11202 blocked for more than 120 seconds
  2008-03-15 13:32           ` [dm-crypt] " Chr
@ 2008-03-15 21:34             ` Chr
  2008-03-16 13:08               ` Christian Kujau
  0 siblings, 1 reply; 34+ messages in thread
From: Chr @ 2008-03-15 21:34 UTC (permalink / raw)
  To: Milan Broz
  Cc: David Chinner, Christian Kujau, LKML, xfs, dm-devel,
	Alasdair G Kergon, dm-crypt

On Saturday 15 March 2008 14:32:10 Chr wrote:
> reverted:
>
> commit  3a7f6c990ad04e6f576a159876c602d14d6f7fef
> dm crypt: use async crypto
>
> dm-crypt: Use crypto ablkcipher interface
> Move encrypt/decrypt core to async crypto call.
>

well.... it's much better now, without the async interface. 
Christian Kajau, can you confirm it too? 

Regards,
	Christian
  




^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [dm-crypt] INFO: task mount:11202 blocked for more than 120 seconds
  2008-03-15 21:34             ` Chr
@ 2008-03-16 13:08               ` Christian Kujau
  2008-03-17 17:36                 ` Alasdair G Kergon
  0 siblings, 1 reply; 34+ messages in thread
From: Christian Kujau @ 2008-03-16 13:08 UTC (permalink / raw)
  To: Chr
  Cc: Milan Broz, David Chinner, LKML, xfs, dm-devel,
	Alasdair G Kergon, dm-crypt

On Sat, 15 Mar 2008, Chr wrote:
> On Saturday 15 March 2008 14:32:10 Chr wrote:
>> reverted:
>>
>> commit  3a7f6c990ad04e6f576a159876c602d14d6f7fef
>> dm crypt: use async crypto
>>
>> dm-crypt: Use crypto ablkcipher interface
>> Move encrypt/decrypt core to async crypto call.
>>
>
> well.... it's much better now, without the async interface.
> Christian Kajau, can you confirm it too?

I reverted the commit above from today's -git and booted....I could not 
notice any hangs more. But when I tried to reproduce these hangs by 
generating disk I/O (mostly reads) ~10 minutes later the box panicked, but
still not netconsole messages :-(

thanks,
C.
-- 
BOFH excuse #233:

TCP/IP UDP alarm threshold is set too low.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: INFO: task mount:11202 blocked for more than 120 seconds
  2008-03-14 23:58         ` Christian Kujau
@ 2008-03-16 20:33           ` David Chinner
  0 siblings, 0 replies; 34+ messages in thread
From: David Chinner @ 2008-03-16 20:33 UTC (permalink / raw)
  To: Christian Kujau; +Cc: Milan Broz, David Chinner, LKML, xfs, dm-devel

On Sat, Mar 15, 2008 at 12:58:02AM +0100, Christian Kujau wrote:
> On Fri, 14 Mar 2008, Milan Broz wrote:
> >Yes, there is bug in dm-crypt...
> >Please try if the patch here helps: http://lkml.org/lkml/2008/3/14/71
> 
> Hm, it seems to help the hangs, yes. Applied to today's -git a few hours 
> ago, the hangs are gone. However, when doing lots of disk I/O, the machine 
> locks up after a few (10-20) minutes. Sadly, netconsole got nothing :(
> 
> After the first lockup I tried again and shortly after bootup I got:

False positive. Memory reclaim can inverts the order of iprune_mutex and
the normal inode locking orders. i.e. Both of these are possible:

	do_something()
	enter memory reclaim
	iprune_mutex
	inode lock

or
	do_something()
	inode lock
	do_something_else
	enter memory reclaim
	iprune_mutex
	inode lock on different inode

So depending on what is seen first (in this case the first), the second
will trip lockdep. Neither are a deadlock, because the inode lock
held before memory reclaim is a referenced inode and will *never* be
on the free list for memory reclaim to deadlock on....

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [dm-crypt] INFO: task mount:11202 blocked for more than 120 seconds
  2008-03-16 13:08               ` Christian Kujau
@ 2008-03-17 17:36                 ` Alasdair G Kergon
  2008-03-17 18:36                   ` Chr
                                     ` (3 more replies)
  0 siblings, 4 replies; 34+ messages in thread
From: Alasdair G Kergon @ 2008-03-17 17:36 UTC (permalink / raw)
  To: Christian Kujau, Chr, Milan Broz, David Chinner, LKML, xfs,
	dm-devel, dm-crypt, Herbert Xu, Ritesh Raj Sarraf

Latest version for everyone to try:

From: Milan Broz <mbroz@redhat.com>

Fix regression in dm-crypt introduced in commit
3a7f6c990ad04e6f576a159876c602d14d6f7fef
(dm crypt: use async crypto).

If write requests need to be split into pieces, the code must not
process them in parallel because the crypto context cannot be shared.
So there can be parallel crypto operations on one part of the write,
but only one write bio can be processed at a time.

This is not optimal and the workqueue code need to be optimized for
parallel processing, but for now it solves problem without affecting
the performance of synchronous crypto operation (most of current
dm-crypt users).

Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

---
 drivers/md/dm-crypt.c |   58 +++++++++++++++++++++++++-------------------------
 1 files changed, 30 insertions(+), 28 deletions(-)

Index: linux-2.6.25-rc4/drivers/md/dm-crypt.c
===================================================================
--- linux-2.6.25-rc4.orig/drivers/md/dm-crypt.c	2008-03-17 11:42:16.000000000 +0000
+++ linux-2.6.25-rc4/drivers/md/dm-crypt.c	2008-03-17 11:42:28.000000000 +0000
@@ -1,7 +1,7 @@
 /*
  * Copyright (C) 2003 Christophe Saout <christophe@saout.de>
  * Copyright (C) 2004 Clemens Fruhwirth <clemens@endorphin.org>
- * Copyright (C) 2006-2007 Red Hat, Inc. All rights reserved.
+ * Copyright (C) 2006-2008 Red Hat, Inc. All rights reserved.
  *
  * This file is released under the GPL.
  */
@@ -93,6 +93,8 @@ struct crypt_config {
 
 	struct workqueue_struct *io_queue;
 	struct workqueue_struct *crypt_queue;
+	wait_queue_head_t writeq;
+
 	/*
 	 * crypto related data
 	 */
@@ -331,14 +333,7 @@ static void crypt_convert_init(struct cr
 	ctx->idx_out = bio_out ? bio_out->bi_idx : 0;
 	ctx->sector = sector + cc->iv_offset;
 	init_completion(&ctx->restart);
-	/*
-	 * Crypto operation can be asynchronous,
-	 * ctx->pending is increased after request submission.
-	 * We need to ensure that we don't call the crypt finish
-	 * operation before pending got incremented
-	 * (dependent on crypt submission return code).
-	 */
-	atomic_set(&ctx->pending, 2);
+	atomic_set(&ctx->pending, 1);
 }
 
 static int crypt_convert_block(struct crypt_config *cc,
@@ -411,43 +406,42 @@ static void crypt_alloc_req(struct crypt
 static int crypt_convert(struct crypt_config *cc,
 			 struct convert_context *ctx)
 {
-	int r = 0;
+	int r;
 
 	while(ctx->idx_in < ctx->bio_in->bi_vcnt &&
 	      ctx->idx_out < ctx->bio_out->bi_vcnt) {
 
 		crypt_alloc_req(cc, ctx);
 
+		atomic_inc(&ctx->pending);
+
 		r = crypt_convert_block(cc, ctx, cc->req);
 
 		switch (r) {
+		/* async */
 		case -EBUSY:
 			wait_for_completion(&ctx->restart);
 			INIT_COMPLETION(ctx->restart);
 			/* fall through*/
 		case -EINPROGRESS:
-			atomic_inc(&ctx->pending);
 			cc->req = NULL;
-			r = 0;
-			/* fall through*/
+			ctx->sector++;
+			continue;
+
+		/* sync */
 		case 0:
+			atomic_dec(&ctx->pending);
 			ctx->sector++;
 			continue;
-		}
 
-		break;
+		/* error */
+		default:
+			atomic_dec(&ctx->pending);
+			return r;
+		}
 	}
 
-	/*
-	 * If there are pending crypto operation run async
-	 * code. Otherwise process return code synchronously.
-	 * The step of 2 ensures that async finish doesn't
-	 * call crypto finish too early.
-	 */
-	if (atomic_sub_return(2, &ctx->pending))
-		return -EINPROGRESS;
-
-	return r;
+	return 0;
 }
 
 static void dm_crypt_bio_destructor(struct bio *bio)
@@ -624,8 +618,10 @@ static void kcryptd_io_read(struct dm_cr
 static void kcryptd_io_write(struct dm_crypt_io *io)
 {
 	struct bio *clone = io->ctx.bio_out;
+	struct crypt_config *cc = io->target->private;
 
 	generic_make_request(clone);
+	wake_up(&cc->writeq);
 }
 
 static void kcryptd_io(struct work_struct *work)
@@ -698,7 +694,8 @@ static void kcryptd_crypt_write_convert_
 
 		r = crypt_convert(cc, &io->ctx);
 
-		if (r != -EINPROGRESS) {
+		if (atomic_dec_and_test(&io->ctx.pending)) {
+			/* processed, no running async crypto  */
 			kcryptd_crypt_write_io_submit(io, r, 0);
 			if (unlikely(r < 0))
 				return;
@@ -706,8 +703,12 @@ static void kcryptd_crypt_write_convert_
 			atomic_inc(&io->pending);
 
 		/* out of memory -> run queues */
-		if (unlikely(remaining))
+		if (unlikely(remaining)) {
+			/* wait for async crypto and reinitialize pending counter */
+			wait_event(cc->writeq, !atomic_read(&io->ctx.pending));
+			atomic_set(&io->ctx.pending, 1);
 			congestion_wait(WRITE, HZ/100);
+		}
 	}
 }
 
@@ -746,7 +747,7 @@ static void kcryptd_crypt_read_convert(s
 
 	r = crypt_convert(cc, &io->ctx);
 
-	if (r != -EINPROGRESS)
+	if (atomic_dec_and_test(&io->ctx.pending))
 		kcryptd_crypt_read_done(io, r);
 
 	crypt_dec_pending(io);
@@ -1047,6 +1048,7 @@ static int crypt_ctr(struct dm_target *t
 		goto bad_crypt_queue;
 	}
 
+	init_waitqueue_head(&cc->writeq);
 	ti->private = cc;
 	return 0;
 

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [dm-crypt] INFO: task mount:11202 blocked for more than 120 seconds
  2008-03-17 17:36                 ` Alasdair G Kergon
@ 2008-03-17 18:36                   ` Chr
  2008-03-18 17:46                     ` Christian Kujau
  2008-03-18  0:56                   ` Herbert Xu
                                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 34+ messages in thread
From: Chr @ 2008-03-17 18:36 UTC (permalink / raw)
  To: Alasdair G Kergon
  Cc: Christian Kujau, Milan Broz, David Chinner, LKML, xfs, dm-devel,
	dm-crypt, Herbert Xu, Ritesh Raj Sarraf

On Monday 17 March 2008 18:36:09 Alasdair G Kergon wrote:
> Latest version for everyone to try:
>
> From: Milan Broz <mbroz@redhat.com>
>
> Fix regression in dm-crypt introduced in commit
> 3a7f6c990ad04e6f576a159876c602d14d6f7fef
> (dm crypt: use async crypto).
>
> If write requests need to be split into pieces, the code must not
> process them in parallel because the crypto context cannot be shared.
> So there can be parallel crypto operations on one part of the write,
> but only one write bio can be processed at a time.
>
> This is not optimal and the workqueue code need to be optimized for
> parallel processing, but for now it solves problem without affecting
> the performance of synchronous crypto operation (most of current
> dm-crypt users).
>
> Signed-off-by: Milan Broz <mbroz@redhat.com>
> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Tested-by: Christian Lamparter <chunkeey@web.de>

Well, the dm-crypt regressions seems to be gone. :)
Thanks for your work & time!

Regards, 
	Christian

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [dm-crypt] INFO: task mount:11202 blocked for more than 120 seconds
  2008-03-17 17:36                 ` Alasdair G Kergon
  2008-03-17 18:36                   ` Chr
@ 2008-03-18  0:56                   ` Herbert Xu
  2008-03-18  4:07                     ` Milan Broz
  2008-03-22  2:52                   ` Christian Kujau
  2008-03-26 16:57                   ` Chr
  3 siblings, 1 reply; 34+ messages in thread
From: Herbert Xu @ 2008-03-18  0:56 UTC (permalink / raw)
  To: Christian Kujau, Chr, Milan Broz, David Chinner, LKML, xfs,
	dm-devel, dm-crypt, Ritesh Raj Sarraf

On Mon, Mar 17, 2008 at 05:36:09PM +0000, Alasdair G Kergon wrote:
> Latest version for everyone to try:
> 
> From: Milan Broz <mbroz@redhat.com>
> 
> Fix regression in dm-crypt introduced in commit
> 3a7f6c990ad04e6f576a159876c602d14d6f7fef
> (dm crypt: use async crypto).
> 
> If write requests need to be split into pieces, the code must not
> process them in parallel because the crypto context cannot be shared.
> So there can be parallel crypto operations on one part of the write,
> but only one write bio can be processed at a time.

Could you explain this part please? Crypto tfm objects are meant
to be reentrant, synchronous or not.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [dm-crypt] INFO: task mount:11202 blocked for more than 120 seconds
  2008-03-18  0:56                   ` Herbert Xu
@ 2008-03-18  4:07                     ` Milan Broz
  0 siblings, 0 replies; 34+ messages in thread
From: Milan Broz @ 2008-03-18  4:07 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Christian Kujau, Chr, David Chinner, LKML, xfs, dm-devel,
	dm-crypt, Ritesh Raj Sarraf

Herbert Xu wrote:
> On Mon, Mar 17, 2008 at 05:36:09PM +0000, Alasdair G Kergon wrote:
>> Latest version for everyone to try:
>>
>> From: Milan Broz <mbroz@redhat.com>
>>
>> Fix regression in dm-crypt introduced in commit
>> 3a7f6c990ad04e6f576a159876c602d14d6f7fef
>> (dm crypt: use async crypto).
>>
>> If write requests need to be split into pieces, the code must not
>> process them in parallel because the crypto context cannot be shared.
>> So there can be parallel crypto operations on one part of the write,
>> but only one write bio can be processed at a time.
> 
> Could you explain this part please? Crypto tfm objects are meant
> to be reentrant, synchronous or not.

Ah, sorry - I mean dm-crypt convert context (with crypto context included).

Context is reentrant in the sense of crypto operations. But we need also
sometimes split bio in writes (not only because of low memory,
but also new memory layout of cloned bio can be different and we
must not violate hardware restrictions - spec. XFS generates such
highly optimized bio requests - it's why it discovers so many dm-crypt problems ;-)

see problematic dm-crypt bio write path

	while (remaining) {
		clone = crypt_alloc_buffer(io, remaining);
		...
		io->ctx.bio_out = clone;
		io->ctx.idx_out = 0;
		remaining -= clone->bi_size;
		...
		r = crypt_convert(cc, &io->ctx);
			-> fire sync or (multiple) async crypto operation

		if (atomic_dec_and_test(&io->ctx.pending))
			-> sync mode, submit clone direclty
		...
		if (unlikely(remaining))
			congestion_wait(WRITE, HZ/100);
	}

and in async crypto completion callback (because async callback cannot
call in its context generic_make_request directly) is called:

	struct convert_context *ctx = async_req->data;
	...
	if (!atomic_dec_and_test(&ctx->pending))
		return;
	...
	INIT_WORK(&io->work, kcryptd_io);
	queue_work(cc->io_queue, &io->work);
	...
	(and from io thread later)
	struct bio *clone = io->ctx.bio_out;
	generic_make_request(clone);

problems:
1) we cannot use io->work struct in parallel
2) io->ctx.pending is shared here between multiple sub-bio clones
... 

(there was no problems in sync crypto mode. and dm-crypt io struct is
already allocated from mempool in crypt_map allocation, so changing this
to per cloned sub-bio allocation can cause new problems in low-memory situations,
not good idea change it in this development phase...)

Milan
--
mbroz@redhat.com

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [dm-crypt] INFO: task mount:11202 blocked for more than 120 seconds
  2008-03-17 18:36                   ` Chr
@ 2008-03-18 17:46                     ` Christian Kujau
  0 siblings, 0 replies; 34+ messages in thread
From: Christian Kujau @ 2008-03-18 17:46 UTC (permalink / raw)
  To: Chr
  Cc: Alasdair G Kergon, Milan Broz, David Chinner, LKML, xfs,
	dm-devel, dm-crypt, Herbert Xu, Ritesh Raj Sarraf

Chr wrote:
> Well, the dm-crypt regressions seems to be gone. :)
> Thanks for your work&  time!

Unfortunately I'm unable to test atm as I currently don't have access to 
the machine where the hangs occured. So, if Chr says "it's fixed" I 
believe it is and I don't wanna be a show stopper...IOW: feel free to 
close the bug.

Thanks to all involved,
Christian.





^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [dm-crypt] INFO: task mount:11202 blocked for more than 120 seconds
  2008-03-17 17:36                 ` Alasdair G Kergon
  2008-03-17 18:36                   ` Chr
  2008-03-18  0:56                   ` Herbert Xu
@ 2008-03-22  2:52                   ` Christian Kujau
  2008-03-27  8:21                     ` Christian Kujau
  2008-03-26 16:57                   ` Chr
  3 siblings, 1 reply; 34+ messages in thread
From: Christian Kujau @ 2008-03-22  2:52 UTC (permalink / raw)
  To: Alasdair G Kergon
  Cc: Chr, Milan Broz, David Chinner, LKML, xfs, dm-devel, dm-crypt,
	Herbert Xu, Ritesh Raj Sarraf

On Mon, 17 Mar 2008, Alasdair G Kergon wrote:
> From: Milan Broz <mbroz@redhat.com>
>
> Fix regression in dm-crypt introduced in commit
> 3a7f6c990ad04e6f576a159876c602d14d6f7fef
> (dm crypt: use async crypto).

I finally got around to test this - and yes, 2.6.25-rc6 with this patch 
applied make the hangs go away. There are other problems now[0], but they 
don't seem to be related, AFAICT.

Thank you!
Christian.

[0] http://lkml.org/lkml/2008/3/21/408
-- 
BOFH excuse #197:

I'm sorry a pentium won't do, you need an SGI to connect with us.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [dm-crypt] INFO: task mount:11202 blocked for more than 120 seconds
  2008-03-17 17:36                 ` Alasdair G Kergon
                                     ` (2 preceding siblings ...)
  2008-03-22  2:52                   ` Christian Kujau
@ 2008-03-26 16:57                   ` Chr
  3 siblings, 0 replies; 34+ messages in thread
From: Chr @ 2008-03-26 16:57 UTC (permalink / raw)
  To: Alasdair G Kergon
  Cc: Christian Kujau, Milan Broz, David Chinner, LKML,
	Ritesh Raj Sarraf, dm-devel, dm-crypt, Herbert Xu

On Monday 17 March 2008 18:36:09 Alasdair G Kergon wrote:
> Latest version for everyone to try:
>
> From: Milan Broz <mbroz@redhat.com>
>
> Fix regression in dm-crypt introduced in commit
> 3a7f6c990ad04e6f576a159876c602d14d6f7fef
> (dm crypt: use async crypto).
>
> If write requests need to be split into pieces, the code must not
> process them in parallel because the crypto context cannot be shared.
> So there can be parallel crypto operations on one part of the write,
> but only one write bio can be processed at a time.
>
> This is not optimal and the workqueue code need to be optimized for
> parallel processing, but for now it solves problem without affecting
> the performance of synchronous crypto operation (most of current
> dm-crypt users).
>
> Signed-off-by: Milan Broz <mbroz@redhat.com>
> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
>
> [...]

Is there already a new, optimized version? Or is it postponed and someone will 
_hopefully_ merge  this patch into -rc7-gitX before we have a buggy mainline 
2.6.25?

Thanks,
	Chr  




^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [dm-crypt] INFO: task mount:11202 blocked for more than 120 seconds
  2008-03-22  2:52                   ` Christian Kujau
@ 2008-03-27  8:21                     ` Christian Kujau
  0 siblings, 0 replies; 34+ messages in thread
From: Christian Kujau @ 2008-03-27  8:21 UTC (permalink / raw)
  To: Alasdair G Kergon
  Cc: Chr, Milan Broz, David Chinner, LKML, xfs, dm-devel, dm-crypt,
	Herbert Xu, Ritesh Raj Sarraf

On Sat, 22 Mar 2008, Christian Kujau wrote:
> On Mon, 17 Mar 2008, Alasdair G Kergon wrote:
>> From: Milan Broz <mbroz@redhat.com>
>> 
>> Fix regression in dm-crypt introduced in commit
>> 3a7f6c990ad04e6f576a159876c602d14d6f7fef
>> (dm crypt: use async crypto).

I noticed that this patch[0] hasn't made it into mainline yet,
will this be included or is this fixed by something else?

Thanks,
Christian.

[0] http://lkml.org/lkml/2008/3/17/214
-- 
BOFH excuse #416:

We're out of slots on the server

^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2008-03-27  8:22 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-03-07 20:32 INFO: task mount:11202 blocked for more than 120 seconds Christian Kujau
2008-03-07 22:40 ` David Chinner
2008-03-07 23:46   ` Christian Kujau
2008-03-08  1:54     ` Christian Kujau
2008-03-09  6:15       ` 2.6.25-rc hangs (was: INFO: task mount:11202 blocked for more than 120 seconds) Christian Kujau
2008-03-09 16:44         ` 2.6.25-rc hangs Eric Sandeen
2008-03-09 18:05           ` Christian Kujau
     [not found]             ` <47D42AD6.10500@sandeen.net>
2008-03-09 18:49               ` Christian Kujau
2008-03-12 18:06                 ` Samuel Tardieu
2008-03-12 21:02                   ` Christian Kujau
2008-03-09 21:34     ` INFO: task mount:11202 blocked for more than 120 seconds David Chinner
2008-03-10  1:46       ` Christian Kujau
2008-03-12 18:03       ` Samuel Tardieu
2008-03-12 19:53         ` Chr
2008-03-12 23:07           ` Christian Kujau
2008-03-13 13:45             ` Christian Kujau
2008-03-13 21:33               ` Chr
2008-03-13 21:54                 ` Christian Kujau
2008-03-14  0:15                   ` Chr
2008-03-14  9:27       ` Milan Broz
2008-03-14 23:58         ` Christian Kujau
2008-03-16 20:33           ` David Chinner
2008-03-15  0:08         ` Chr
2008-03-15 13:32           ` [dm-crypt] " Chr
2008-03-15 21:34             ` Chr
2008-03-16 13:08               ` Christian Kujau
2008-03-17 17:36                 ` Alasdair G Kergon
2008-03-17 18:36                   ` Chr
2008-03-18 17:46                     ` Christian Kujau
2008-03-18  0:56                   ` Herbert Xu
2008-03-18  4:07                     ` Milan Broz
2008-03-22  2:52                   ` Christian Kujau
2008-03-27  8:21                     ` Christian Kujau
2008-03-26 16:57                   ` Chr

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).