* XFS: Observed Crash followed by deadlock of khubd/sync/XFS
@ 2011-09-08 11:05 Amit Sahrawat
  2011-09-08 17:28 ` Amit Sahrawat
  2011-09-10 18:30 ` Christoph Hellwig
  0 siblings, 2 replies; 9+ messages in thread
From: Amit Sahrawat @ 2011-09-08 11:05 UTC (permalink / raw)
  To: xfs

Kernel Version: 2.6.39.4
Target: ARM

Observed while doing:
Copy a file (any size; I tried 10MB and 100MB) to the XFS partition.
After the copy, run 'sync'.
Then immediately unplug the device.

usb 2-1.4: USB disconnect, address 4
end_request: I/O error, dev sda, sector 5696908
I/O error in filesystem ("sda3") meta-data dev sda3 block 0x56ed8c
  ("xlog_iodone") error 5 buf count 1024
xfs_force_shutdown(sda3,0x2) called from line 945 of file
fs/xfs/xfs_log.c.  Return address = 0xc018ac20
Filesystem "sda3": Log I/O Error Detected.  Shutting down filesystem: sda3
Please umount the filesystem, and rectify the problem(s)
XFS: Unable to update superblock counters. Freespace may not be
correct on next mount.
Unable to handle kernel NULL pointer dereference at virtual address 00000014
pgd = e42d4000
[00000014] *pgd=8b8d8031, *pte=00000000, *ppte=00000000

Main Backtrace:
[<c0189d88>] (xfs_log_move_tail+0x0/0x1b4)
[<c0198b78>] (xfs_trans_ail_delete+0x0/0x17c)
[<c016eaf8>] (xfs_buf_iodone+0x0/0x48)
[<c016ea98>] (xfs_buf_do_callbacks+0x0/0x3c)
[<c016eb7c>] (xfs_buf_iodone_callbacks+0x0/0x18c)
[<c01a2f98>] (xfs_buf_iodone_work+0x0/0x7c)
[<c01a3014>] (xfs_buf_ioend+0x0/0x9c)
[<c01a36f8>] (xfs_bioerror+0x0/0x54)
[<c01a374c>] (xfs_bdstrat_cb+0x0/0x6c)
[<c01a3158>] (xfs_flush_buftarg+0x0/0x18c)
[<c01a32e4>] (xfs_free_buftarg+0x0/0x78)
[<c01aa8d0>] (xfs_close_devices+0x0/0x68)
[<c01aa938>] (xfs_fs_put_super+0x0/0x88)
[<c00ab2b4>] (generic_shutdown_super+0x0/0x120)
[<c00ab3d4>] (kill_block_super+0x0/0x4c)
[<c00aa3ac>] (deactivate_locked_super+0x0/0x5c)
[<c00aa598>] (deactivate_super+0x0/0x60)
[<c00c1fec>] (mntput_no_expire+0x0/0xe8)
[<c00c2424>] (sys_umount+0x0/0x334) from [<c001ef80>]
(ret_fast_syscall+0x0/0x30)
---[ end trace 6bf95bedb3092162 ]---
Segmentation fault
#>

Plugging the USB device in again does not work, because the 'umount'
process that crashed never returned properly and the lock is still
held.
When I check 'khubd' and 'sync', both are in the 'D'
(TASK_UNINTERRUPTIBLE) state. Their backtraces at that point are as
follows.

For Khubd:
Backtrace:
[<c02f6524>] (schedule+0x0/0x50c)
[<c02f8988>] (__down_read+0x0/0x130)
[<c02f7ee4>] (down_read+0x0/0x14)
[<c00c30a4>] (get_super+0x0/0x104)
[<c00eed70>] (fsync_bdev+0x0/0x44)
[<c01df914>] (invalidate_partition+0x0/0x3c)
[<c010a384>] (del_gendisk+0x0/0xec)
[<c0228bb8>] (sd_remove+0x0/0xc8)

[<c02147f8>] (__device_release_driver+0x0/0xac)
[<c0214994>] (device_release_driver+0x0/0x30)
[<c0213de4>] (bus_remove_device+0x0/0x8c)
[<c0212308>] (device_del+0x0/0x160)
[<c0225fbc>] (__scsi_remove_device+0x0/0x90)
[<c0223328>] (scsi_forget_host+0x0/0xbc)
[<c021cccc>] (scsi_remove_host+0x0/0x18c)
[<bf15fe14>] (quiesce_and_remove_host+0x0/0xe4
[<bf15ff7c>] (usb_stor_disconnect+0x0/0x28
[<bf11e594>] (usb_unbind_interface+0x0/0xdc
[<c02147f8>] (__device_release_driver+0x0/0xac)
[<c0214994>] (device_release_driver+0x0/0x30)
[<c0213de4>] (bus_remove_device+0x0/0x8c)
[<c0212308>] (device_del+0x0/0x160)
[<bf11bc48>] (usb_disable_device+0x0/0x17c
[<bf116488>] (usb_disconnect+0x0/0x158
[<bf1167b8>] (hub_thread+0x0/0x1094
[<c005a7d8>] (kthread+0x0/0x8c)



For Sync:
Backtrace:
[<c02f6524>] (schedule+0x0/0x50c)
[<c02f8988>] (__down_read+0x0/0x130)
[<c02f7ee4>] (down_read+0x0/0x14)
[<c00c31a8>] (iterate_supers+0x0/0xfc)
[<c00e4690>] (sync_filesystems+0x0/0x2c)
[<c00e47c4>] (sys_sync+0x0/0x44)

Both are stuck waiting to acquire the semaphore 'sb->s_umount'.
During umount, which gets called when a device is unplugged, the flow is:
sys_umount() … -> deactivate_super() -> deactivate_locked_super() -> kill_block_super() -> generic_shutdown_super()
The semaphore is taken in deactivate_super() and released in
generic_shutdown_super() via 'up_write(&sb->s_umount)', but because of
the NULL pointer dereference crash that release is never reached.
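
The lock flow described above can be modelled in user space. The
following is only a toy sketch in plain C, not kernel code: toy_rwsem
and the toy_* helpers are hypothetical stand-ins for the kernel
rw-semaphore, and the oops_in_put_super flag simply stands for the
crash inside the put_super path. It illustrates why a later reader can
never get 's_umount' once the writer dies before the release.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for an rw-semaphore; NOT the kernel's struct rw_semaphore. */
struct toy_rwsem {
	int  readers;	/* number of read holders */
	bool writer;	/* true while a writer holds the lock */
};

/* Non-blocking write acquire: succeeds only if nobody holds the lock. */
static bool toy_down_write_trylock(struct toy_rwsem *sem)
{
	if (sem->writer || sem->readers > 0)
		return false;
	sem->writer = true;
	return true;
}

/* Non-blocking read acquire: fails while a writer holds the lock. */
static bool toy_down_read_trylock(struct toy_rwsem *sem)
{
	if (sem->writer)
		return false;
	sem->readers++;
	return true;
}

/* Release the write side; only valid while a writer holds the lock. */
static void toy_up_write(struct toy_rwsem *sem)
{
	assert(sem->writer);
	sem->writer = false;
}

/*
 * Models the umount path: take s_umount for writing, run the
 * filesystem teardown, release at the end. Returns true only if the
 * lock was released. If the teardown "oopses", the task is killed and
 * the release is never reached.
 */
static bool toy_umount(struct toy_rwsem *s_umount, bool oops_in_put_super)
{
	if (!toy_down_write_trylock(s_umount))
		return false;
	if (oops_in_put_super)
		return false;	/* crash: lock stays held */
	toy_up_write(s_umount);
	return true;
}
```

With oops_in_put_super set, a subsequent toy_down_read_trylock() fails.
In the real kernel the corresponding down_read() in get_super() or
iterate_supers() would block instead, which is consistent with the
khubd and sync backtraces.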

For the NULL pointer dereference itself, the PC is in
xfs_log_move_tail(), while accessing 'log':
	if (XLOG_FORCED_SHUTDOWN(log))
		return;

Changing the condition only moves the crash to other places.
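
One way to read the faulting address: if 'log' is NULL, reading a
member at offset 0x14 inside it touches virtual address 0x00000014,
which is what the oops reports. The sketch below illustrates only that
arithmetic. struct demo_log is a hypothetical layout, padded so that a
flags member sits at offset 0x14; it is not the real struct xlog, and
whether the field checked by XLOG_FORCED_SHUTDOWN() really sits at that
offset on this build is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout, NOT the real struct xlog: padded so that the
 * flags member lands at offset 0x14, matching the faulting address.
 */
struct demo_log {
	uint8_t  pad[0x14];
	uint32_t l_flags;
};

/* Address the CPU touches when it reads base->l_flags. */
static uintptr_t demo_fault_address(uintptr_t base)
{
	/* Sanity check on the illustrative layout above. */
	assert(offsetof(struct demo_log, l_flags) == 0x14);
	return base + offsetof(struct demo_log, l_flags);
}
```

demo_fault_address(0) evaluates to 0x14, the address shown in the oops
for a NULL base pointer.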

Has anyone observed this scenario? Please advise.

Thanks & Regards,
Amit Sahrawat

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

* Re: XFS: Observed Crash followed by deadlock of khubd/sync/XFS
  2011-09-08 11:05 XFS: Observed Crash followed by deadlock of khubd/sync/XFS Amit Sahrawat
@ 2011-09-08 17:28 ` Amit Sahrawat
  2011-09-13 15:26   ` Christoph Hellwig
  2011-09-10 18:30 ` Christoph Hellwig
  1 sibling, 1 reply; 9+ messages in thread
From: Amit Sahrawat @ 2011-09-08 17:28 UTC (permalink / raw)
  To: xfs

Since this is very hard to reproduce, here is a way to make it easier
to debug. Add an msleep() in the kernel in xfs_unmountfs(), before the
call to xfs_log_sbcount(), with a print just before the sleep. The
moment the print appears, unplug the USB device; the same scenario is
reproduced.
The crash prints the backtrace and returns to the normal shell, but
when the process state is checked, khubd is in the
TASK_UNINTERRUPTIBLE ('D') state.
If sync is then issued, it also ends up in the 'D' state. The
backtrace for each task is the same as in the previous mail.
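
The 'D' state mentioned above can be checked from user space, for
example through the state field of /proc/<pid>/stat. A minimal sketch,
assuming the usual "pid (comm) state ..." layout of that file;
demo_parse_state is a hypothetical helper name, and the sample line in
the note after the code is made up for illustration.

```c
#include <assert.h>
#include <string.h>

/*
 * Return the one-character task state from a /proc/<pid>/stat line,
 * or '\0' if the line does not look like "pid (comm) state ...".
 * The comm field may itself contain ')' or spaces, so search for the
 * LAST ')' instead of splitting on whitespace.
 */
static char demo_parse_state(const char *stat_line)
{
	const char *p;

	assert(stat_line != NULL);
	p = strrchr(stat_line, ')');
	if (p == NULL)
		return '\0';
	p++;
	while (*p == ' ')
		p++;
	return *p;	/* '\0' if the line ended right after comm */
}
```

For a line such as "123 (khubd) D 2 0 0" the helper returns 'D'.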

Thanks & Regards,
Amit Sahrawat

* Re: XFS: Observed Crash followed by deadlock of khubd/sync/XFS
  2011-09-08 11:05 XFS: Observed Crash followed by deadlock of khubd/sync/XFS Amit Sahrawat
  2011-09-08 17:28 ` Amit Sahrawat
@ 2011-09-10 18:30 ` Christoph Hellwig
  2011-09-11 16:46   ` Amit Sahrawat
  1 sibling, 1 reply; 9+ messages in thread
From: Christoph Hellwig @ 2011-09-10 18:30 UTC (permalink / raw)
  To: Amit Sahrawat; +Cc: xfs

On Thu, Sep 08, 2011 at 04:35:28PM +0530, Amit Sahrawat wrote:
> Kernel Version: 2.6.39.4
> Target: ARM
> 
> Observed while doing:
> Copy some file (any size, I tried with 10MB, 100MB) to XFS partition
> After Copy do 'sync'
> Now immediately, unplug the device.

Does this still happen with the patch below applied?


[-- Attachment #2: xfs-fix-synchronous-writes.diff --]
[-- Type: text/plain, Size: 1356 bytes --]

commit 9e978d8f7db1c5de7cdc6450a8ca208db3b95f84
Author: Ajeet Yadav <ajeet.yadav.77@gmail.com>
Date:   Fri Jul 29 07:42:59 2011 +0000

    "xfs: fix error handling for synchronous writes" revisited
    
    xfs: fix for hang during synchronous buffer write error
    
    If removed storage while synchronous buffer write underway,
    "xfslogd" hangs.
    
    Detailed log http://oss.sgi.com/archives/xfs/2011-07/msg00740.html
    
    Related work bfc60177f8ab509bc225becbb58f7e53a0e33e81
    "xfs: fix error handling for synchronous writes"
    
    Given that xfs_bwrite actually does the shutdown already after
    waiting for the b_iodone completion and given that we actually
    found that calling xfs_force_shutdown from inside
    xfs_buf_iodone_callbacks was a major contributor the problem
    it better to drop this call.
    
    Signed-off-by: Ajeet Yadav <ajeet.yadav.77@gmail.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Alex Elder <aelder@sgi.com>

diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
index 0402173..cac2ecf 100644
--- a/fs/xfs/xfs_buf_item.c
+++ b/fs/xfs/xfs_buf_item.c
@@ -1010,7 +1010,6 @@ xfs_buf_iodone_callbacks(
 	XFS_BUF_UNDELAYWRITE(bp);
 
 	trace_xfs_buf_error_relse(bp, _RET_IP_);
-	xfs_force_shutdown(mp, SHUTDOWN_META_IO_ERROR);
 
 do_callbacks:
 	xfs_buf_do_callbacks(bp);

* Re: XFS: Observed Crash followed by deadlock of khubd/sync/XFS
  2011-09-10 18:30 ` Christoph Hellwig
@ 2011-09-11 16:46   ` Amit Sahrawat
  2011-09-12 11:02     ` Amit Sahrawat
  2011-09-12 11:06     ` Amit Sahrawat
  0 siblings, 2 replies; 9+ messages in thread
From: Amit Sahrawat @ 2011-09-11 16:46 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

Yes, with the patch applied the crash still reproduces easily.

On Sun, Sep 11, 2011 at 12:00 AM, Christoph Hellwig <hch@infradead.org> wrote:
> On Thu, Sep 08, 2011 at 04:35:28PM +0530, Amit Sahrawat wrote:
>> Kernel Version: 2.6.39.4
>> Target: ARM
>>
>> Observed while doing:
>> Copy some file (any size, I tried with 10MB, 100MB) to XFS partition
>> After Copy do 'sync'
>> Now immediately, unplug the device.
>
> Does this still happen with the patch below applied?
>
>

* Re: XFS: Observed Crash followed by deadlock of khubd/sync/XFS
  2011-09-11 16:46   ` Amit Sahrawat
@ 2011-09-12 11:02     ` Amit Sahrawat
  2011-09-12 11:06     ` Amit Sahrawat
  1 sibling, 0 replies; 9+ messages in thread
From: Amit Sahrawat @ 2011-09-12 11:02 UTC (permalink / raw)
  To: Christoph Hellwig, Dave Chinner; +Cc: xfs

Kernel Version:
Linux version 3.0.3 (root@localhost.localdomain) (gcc version 4.4.4
20100503 (Red Hat 4.4.4-2) (GCC) ) #5 SMP Fri Sep 9 11:00:53 IST 2011
Target: x86


Sep 12 16:15:49 localhost kernel: [  281.879802] sd 5:0:0:0: [sdb]
15625216 512-byte logical blocks: (8.00 GB/7.45 GiB)
Sep 12 16:15:49 localhost kernel: [  281.881664] sd 5:0:0:0: [sdb]
Write Protect is off
Sep 12 16:15:49 localhost kernel: [  281.883307] sd 5:0:0:0: [sdb] No
Caching mode page present
Sep 12 16:15:49 localhost kernel: [  281.883311] sd 5:0:0:0: [sdb]
Assuming drive cache: write through
Sep 12 16:15:49 localhost kernel: [  281.887671] sd 5:0:0:0: [sdb] No
Caching mode page present
Sep 12 16:15:49 localhost kernel: [  281.887676] sd 5:0:0:0: [sdb]
Assuming drive cache: write through
Sep 12 16:15:49 localhost kernel: [  281.890712]  sdb: sdb1 sdb2 sdb3
Sep 12 16:15:49 localhost kernel: [  281.895542] sd 5:0:0:0: [sdb] No
Caching mode page present
Sep 12 16:15:49 localhost kernel: [  281.895545] sd 5:0:0:0: [sdb]
Assuming drive cache: write through
Sep 12 16:15:49 localhost kernel: [  281.895548] sd 5:0:0:0: [sdb]
Attached SCSI removable disk
Sep 12 16:15:49 localhost kernel: [  282.171467] XFS (sdb3): Mounting Filesystem
Sep 12 16:15:50 localhost kernel: [  283.264423] XFS (sdb3): Ending clean mount
Sep 12 16:16:37 localhost kernel: [  330.586075] usb 2-6: USB
disconnect, device number 3
Sep 12 16:16:41 localhost kernel: [  334.105070] XFS (sdb3): I/O error
occurred: meta-data dev sdb3 block 0x56f159       ("xlog_iodone")
error 5 buf count 1024
Sep 12 16:16:41 localhost kernel: [  334.105076] XFS (sdb3):
xfs_do_force_shutdown(0x2) called from line 891 of file
fs/xfs/xfs_log.c.  Return address = 0xf7b20ae1
Sep 12 16:16:41 localhost kernel: [  334.105084] XFS (sdb3): Log I/O
Error Detected.  Shutting down filesystem
Sep 12 16:16:41 localhost kernel: [  334.105088] XFS (sdb3): Please
umount the filesystem and rectify the problem(s)
Sep 12 16:16:41 localhost kernel: [  334.105093] XFS (sdb3): Unable to
update superblock counters. Freespace may not be correct on next
mount.
Sep 12 16:16:41 localhost kernel: [  334.105147] XFS (€):
xfs_trans_ail_delete_bulk: attempting to delete a log item that is not
in the AIL
Sep 12 16:16:41 localhost kernel: [  334.105152] XFS (€):
xfs_do_force_shutdown(0x8) called from line 740 of file
fs/xfs/xfs_trans_ail.c.  Return address = 0xf7b2dc7a
Sep 12 16:16:41 localhost kernel: [  334.105168] BUG: unable to handle
kernel NULL pointer dereference at 00000208
Sep 12 16:16:41 localhost kernel: [  334.105243] IP: [<f7b20f73>]
xfs_log_force_umount+0x1d/0x1b5 [xfs]
Sep 12 16:16:41 localhost kernel: [  334.105346] *pde = 00000000
Sep 12 16:16:41 localhost kernel: [  334.105377] Oops: 0000 [#1] SMP
Sep 12 16:16:41 localhost kernel: [  334.105414] Modules linked in:
vfat fat usb_storage xfs exportfs fuse sunrpc cpufreq_ondemand
acpi_cpufreq mperf ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6
ip6table_filter ip6_tables ipv6 uinput r8169 microcode
snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep i2c_i801
snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc
mii ppdev parport_pc parport iTCO_wdt iTCO_vendor_support pcspkr i915
drm_kms_helper drm i2c_algo_bit i2c_core video [last unloaded:
scsi_wait_scan]
Sep 12 16:16:41 localhost kernel: [  334.105926]
Sep 12 16:16:41 localhost kernel: [  334.105944] Pid: 2233, comm:
umount Not tainted 3.0.3 #5 Hewlett-Packard HP dx2480
MT(KL969AV)/0B08h
Sep 12 16:16:41 localhost kernel: [  334.106005] EIP:
0060:[<f7b20f73>] EFLAGS: 00210202 CPU: 1
Sep 12 16:16:41 localhost kernel: [  334.106005] EIP is at
xfs_log_force_umount+0x1d/0x1b5 [xfs]
Sep 12 16:16:41 localhost kernel: [  334.106005] EAX: f2a1b380 EBX:
000001f4 ECX: f3ccdd5c EDX: 00000000
Sep 12 16:16:41 localhost kernel: [  334.106005] ESI: f2a1b380 EDI:
00000000 EBP: f3ccdd98 ESP: f3ccdd80
Sep 12 16:16:41 localhost kernel: [  334.106005]  DS: 007b ES: 007b
FS: 00d8 GS: 00e0 SS: 0068
Sep 12 16:16:41 localhost kernel: [  334.106005] Process umount (pid:
2233, ti=f3ccc000 task=f1f957f0 task.ti=f3ccc000)
Sep 12 16:16:41 localhost kernel: [  334.106005] Stack:
Sep 12 16:16:41 localhost kernel: [  334.106005]  f7b32547 f2a1b380
00000000 f2a1b380 00000008 00000000 f3ccddb0 f7b32564
Sep 12 16:16:41 localhost kernel: [  334.106005]  00000000 f2a1b000
f2a1b380 00000000 f3ccddec f7b2dc7a 000002e4 f2a1b380
Sep 12 16:16:41 localhost kernel: [  334.106005]  00000004 f7b48bbf
f7b42efc f7b2364d 00000001 f3ccddf4 f2a1b004 00000000
Sep 12 16:16:41 localhost kernel: [  334.106005] Call Trace:
Sep 12 16:16:41 localhost kernel: [  334.106005]  [<f7b32547>] ?
xfs_do_force_shutdown+0x39/0xd6 [xfs]
Sep 12 16:16:41 localhost kernel: [  334.106005]  [<f7b32564>]
xfs_do_force_shutdown+0x56/0xd6 [xfs]
Sep 12 16:16:41 localhost kernel: [  334.106005]  [<f7b2dc7a>]
xfs_trans_ail_delete_bulk+0x83/0xfa [xfs]
Sep 12 16:16:41 localhost kernel: [  334.106005]  [<f7b2364d>] ?
xlog_cil_push+0x2d1/0x2f6 [xfs]
Sep 12 16:16:41 localhost kernel: [  334.106005]  [<f7b35623>] ?
xfs_buf_iodone_work+0x14/0x23 [xfs]
Sep 12 16:16:41 localhost kernel: [  334.106005]  [<f7b09d6f>]
xfs_buf_iodone+0x31/0x3d [xfs]
Sep 12 16:16:41 localhost kernel: [  334.106005]  [<f7b09d1b>]
xfs_buf_do_callbacks+0x24/0x31 [xfs]
Sep 12 16:16:41 localhost kernel: [  334.106005]  [<f7b0a5b7>]
xfs_buf_iodone_callbacks+0x16f/0x1a2 [xfs]
Sep 12 16:16:41 localhost kernel: [  334.106005]  [<f7b35623>]
xfs_buf_iodone_work+0x14/0x23 [xfs]
Sep 12 16:16:41 localhost kernel: [  334.106005]  [<f7b357ec>]
xfs_buf_ioend+0x95/0xa5 [xfs]
Sep 12 16:16:41 localhost kernel: [  334.106005]  [<f7b35954>] ?
xfs_bioerror+0x34/0x3c [xfs]
Sep 12 16:16:41 localhost kernel: [  334.106005]  [<f7b36217>] ?
xfs_flush_buftarg+0x9e/0xe9 [xfs]
Sep 12 16:16:41 localhost kernel: [  334.106005]  [<f7b35954>]
xfs_bioerror+0x34/0x3c [xfs]
Sep 12 16:16:41 localhost kernel: [  334.106005]  [<f7b36169>]
xfs_bdstrat_cb+0x5f/0x6f [xfs]
Sep 12 16:16:41 localhost kernel: [  334.106005]  [<f7b36217>]
xfs_flush_buftarg+0x9e/0xe9 [xfs]
Sep 12 16:16:41 localhost kernel: [  334.106005]  [<f7b36284>]
xfs_free_buftarg+0x22/0x45 [xfs]
Sep 12 16:16:41 localhost kernel: [  334.106005]  [<f7b3c390>]
xfs_close_devices+0x55/0x59 [xfs]
Sep 12 16:16:41 localhost kernel: [  334.106005]  [<f7b3c3e4>]
xfs_fs_put_super+0x50/0x61 [xfs]
Sep 12 16:16:41 localhost kernel: [  334.106005]  [<c04e1432>]
generic_shutdown_super+0x52/0xb0
Sep 12 16:16:41 localhost kernel: [  334.106005]  [<c04e14b2>]
kill_block_super+0x22/0x5e
Sep 12 16:16:41 localhost kernel: [  334.106005]  [<c04e1997>]
deactivate_locked_super+0x1f/0x40
Sep 12 16:16:41 localhost kernel: [  334.106005]  [<c04e21bf>]
deactivate_super+0x37/0x3c
Sep 12 16:16:41 localhost kernel: [  334.106005]  [<c04f40c3>]
mntput_no_expire+0x114/0x11a
Sep 12 16:16:41 localhost kernel: [  334.106005]  [<c04f4a2b>]
sys_umount+0x26e/0x295
Sep 12 16:16:41 localhost kernel: [  334.106005]  [<c04f4a64>]
sys_oldumount+0x12/0x14
Sep 12 16:16:41 localhost kernel: [  334.106005]  [<c07c401f>]
sysenter_do_call+0x12/0x28
Sep 12 16:16:41 localhost kernel: [  334.106005] Code: 05 b8 05 00 00
00 83 c4 10 5b 5e 5f 5d c3 55 89 e5 57 56 53 83 ec 0c 3e 8d 74 26 00
89 55 f0 8b 98 18 01 00 00 89 c6 85 db 74 06 <f6> 43 14 02 74 27 8b 86
e0 00 00 00 31 ff 83 8e cc 01 00 00 10
Sep 12 16:16:41 localhost kernel: [  334.106005] EIP: [<f7b20f73>]
xfs_log_force_umount+0x1d/0x1b5 [xfs] SS:ESP 0068:f3ccdd80
Sep 12 16:16:41 localhost kernel: [  334.106005] CR2: 0000000000000208
Sep 12 16:16:41 localhost kernel: [  334.135511] ---[ end trace
f3af361b30e84114 ]---
Sep 12 16:16:41 localhost kernel: [  334.135514] ------------[ cut
here ]------------
Sep 12 16:16:41 localhost kernel: [  334.135519] WARNING: at
kernel/exit.c:909 do_exit+0x37/0x621()
Sep 12 16:16:41 localhost kernel: [  334.135520] Hardware name: HP
dx2480 MT(KL969AV)
Sep 12 16:16:41 localhost kernel: [  334.135522] Modules linked in:
vfat fat usb_storage xfs exportfs fuse sunrpc cpufreq_ondemand
acpi_cpufreq mperf ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6
ip6table_filter ip6_tables ipv6 uinput r8169 microcode
snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep i2c_i801
snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc
mii ppdev parport_pc parport iTCO_wdt iTCO_vendor_support pcspkr i915
drm_kms_helper drm i2c_algo_bit i2c_core video [last unloaded:
scsi_wait_scan]
Sep 12 16:16:41 localhost kernel: [  334.135551] Pid: 2233, comm:
umount Tainted: G      D     3.0.3 #5
Sep 12 16:16:41 localhost kernel: [  334.135553] Call Trace:
Sep 12 16:16:41 localhost kernel: [  334.135557]  [<c0437f03>]
warn_slowpath_common+0x6a/0x7f
Sep 12 16:16:41 localhost kernel: [  334.135559]  [<c043b016>] ?
do_exit+0x37/0x621
Sep 12 16:16:41 localhost kernel: [  334.135562]  [<c0437f2c>]
warn_slowpath_null+0x14/0x18
Sep 12 16:16:41 localhost kernel: [  334.135564]  [<c043b016>]
do_exit+0x37/0x621
Sep 12 16:16:41 localhost kernel: [  334.135566]  [<c043840d>] ?
kmsg_dump+0x3a/0xb3
Sep 12 16:16:41 localhost kernel: [  334.135569]  [<c07bfd15>]
oops_end+0x9d/0xa5
Sep 12 16:16:41 localhost kernel: [  334.135572]  [<c042155b>]
no_context+0x115/0x11f
Sep 12 16:16:41 localhost kernel: [  334.135575]  [<c0421659>]
__bad_area_nosemaphore+0xf4/0xfc
Sep 12 16:16:41 localhost kernel: [  334.135577]  [<c04216b0>]
bad_area+0x3a/0x40
Sep 12 16:16:41 localhost kernel: [  334.135579]  [<c07c15db>]
do_page_fault+0x227/0x376
Sep 12 16:16:41 localhost kernel: [  334.135582]  [<c07c13b4>] ?
spurious_fault+0xba/0xba
Sep 12 16:16:41 localhost kernel: [  334.135585]  [<c07bf3d7>]
error_code+0x67/0x6c
Sep 12 16:16:41 localhost kernel: [  334.135612]  [<f7b20f73>] ?
xfs_log_force_umount+0x1d/0x1b5 [xfs]
Sep 12 16:16:41 localhost kernel: [  334.135636]  [<f7b32547>] ?
xfs_do_force_shutdown+0x39/0xd6 [xfs]
Sep 12 16:16:41 localhost kernel: [  334.135660]  [<f7b32564>]
xfs_do_force_shutdown+0x56/0xd6 [xfs]
Sep 12 16:16:41 localhost kernel: [  334.135683]  [<f7b2dc7a>]
xfs_trans_ail_delete_bulk+0x83/0xfa [xfs]
Sep 12 16:16:41 localhost kernel: [  334.135706]  [<f7b2364d>] ?
xlog_cil_push+0x2d1/0x2f6 [xfs]
Sep 12 16:16:41 localhost kernel: [  334.135730]  [<f7b35623>] ?
xfs_buf_iodone_work+0x14/0x23 [xfs]
Sep 12 16:16:41 localhost kernel: [  334.135750]  [<f7b09d6f>]
xfs_buf_iodone+0x31/0x3d [xfs]
Sep 12 16:16:41 localhost kernel: [  334.135771]  [<f7b09d1b>]
xfs_buf_do_callbacks+0x24/0x31 [xfs]
Sep 12 16:16:41 localhost kernel: [  334.135791]  [<f7b0a5b7>]
xfs_buf_iodone_callbacks+0x16f/0x1a2 [xfs]
Sep 12 16:16:41 localhost kernel: [  334.135815]  [<f7b35623>]
xfs_buf_iodone_work+0x14/0x23 [xfs]
Sep 12 16:16:41 localhost kernel: [  334.135839]  [<f7b357ec>]
xfs_buf_ioend+0x95/0xa5 [xfs]
Sep 12 16:16:41 localhost kernel: [  334.135862]  [<f7b35954>] ?
xfs_bioerror+0x34/0x3c [xfs]
Sep 12 16:16:41 localhost kernel: [  334.135886]  [<f7b36217>] ?
xfs_flush_buftarg+0x9e/0xe9 [xfs]
Sep 12 16:16:41 localhost kernel: [  334.135910]  [<f7b35954>]
xfs_bioerror+0x34/0x3c [xfs]
Sep 12 16:16:41 localhost kernel: [  334.135933]  [<f7b36169>]
xfs_bdstrat_cb+0x5f/0x6f [xfs]
Sep 12 16:16:41 localhost kernel: [  334.135957]  [<f7b36217>]
xfs_flush_buftarg+0x9e/0xe9 [xfs]
Sep 12 16:16:41 localhost kernel: [  334.135981]  [<f7b36284>]
xfs_free_buftarg+0x22/0x45 [xfs]
Sep 12 16:16:41 localhost kernel: [  334.136000]  [<f7b3c390>]
xfs_close_devices+0x55/0x59 [xfs]
Sep 12 16:16:41 localhost kernel: [  334.136032]  [<f7b3c3e4>]
xfs_fs_put_super+0x50/0x61 [xfs]
Sep 12 16:16:41 localhost kernel: [  334.136036]  [<c04e1432>]
generic_shutdown_super+0x52/0xb0
Sep 12 16:16:41 localhost kernel: [  334.136039]  [<c04e14b2>]
kill_block_super+0x22/0x5e
Sep 12 16:16:41 localhost kernel: [  334.136042]  [<c04e1997>]
deactivate_locked_super+0x1f/0x40
Sep 12 16:16:41 localhost kernel: [  334.136045]  [<c04e21bf>]
deactivate_super+0x37/0x3c
Sep 12 16:16:41 localhost kernel: [  334.136049]  [<c04f40c3>]
mntput_no_expire+0x114/0x11a
Sep 12 16:16:41 localhost kernel: [  334.136052]  [<c04f4a2b>]
sys_umount+0x26e/0x295
Sep 12 16:16:41 localhost kernel: [  334.136055]  [<c04f4a64>]
sys_oldumount+0x12/0x14
Sep 12 16:16:41 localhost kernel: [  334.136059]  [<c07c401f>]
sysenter_do_call+0x12/0x28
Sep 12 16:16:41 localhost kernel: [  334.136061] ---[ end trace
f3af361b30e84115 ]---

Thanks & Regards,
Amit Sahrawat

On Sun, Sep 11, 2011 at 10:16 PM, Amit Sahrawat
<amit.sahrawat83@gmail.com> wrote:
> Yes, the patch is applied and the crash is still appearing easily.
>
> On Sun, Sep 11, 2011 at 12:00 AM, Christoph Hellwig <hch@infradead.org> wrote:
>> On Thu, Sep 08, 2011 at 04:35:28PM +0530, Amit Sahrawat wrote:
>>> Kernel Version: 2.6.39.4
>>> Target: ARM
>>>
>>> Observed while doing:
>>> Copy some file (any size, I tried with 10MB, 100MB) to XFS partition
>>> After Copy do 'sync'
>>> Now immediately, unplug the device.
>>
>> Does this still happen with the patch below applied?
>>
>>
>

* Re: XFS: Observed Crash followed by deadlock of khubd/sync/XFS
  2011-09-11 16:46   ` Amit Sahrawat
  2011-09-12 11:02     ` Amit Sahrawat
@ 2011-09-12 11:06     ` Amit Sahrawat
  2011-09-12 11:15       ` Amit Sahrawat
  1 sibling, 1 reply; 9+ messages in thread
From: Amit Sahrawat @ 2011-09-12 11:06 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

Please ignore the previous message; there is a formatting issue with
the backtrace. I will post it again.

On Sun, Sep 11, 2011 at 10:16 PM, Amit Sahrawat
<amit.sahrawat83@gmail.com> wrote:
> Yes, the patch is applied and the crash is still appearing easily.
>
> On Sun, Sep 11, 2011 at 12:00 AM, Christoph Hellwig <hch@infradead.org> wrote:
>> On Thu, Sep 08, 2011 at 04:35:28PM +0530, Amit Sahrawat wrote:
>>> Kernel Version: 2.6.39.4
>>> Target: ARM
>>>
>>> Observed while doing:
>>> Copy some file (any size, I tried with 10MB, 100MB) to XFS partition
>>> After Copy do 'sync'
>>> Now immediately, unplug the device.
>>
>> Does this still happen with the patch below applied?
>>
>>
>

* Re: XFS: Observed Crash followed by deadlock of khubd/sync/XFS
  2011-09-12 11:06     ` Amit Sahrawat
@ 2011-09-12 11:15       ` Amit Sahrawat
  0 siblings, 0 replies; 9+ messages in thread
From: Amit Sahrawat @ 2011-09-12 11:15 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

Kernel Version:
Linux version 3.0.3 (root@localhost.localdomain) (gcc version 4.4.4
20100503 (Red Hat 4.4.4-2) (GCC) ) #5 SMP Fri Sep 9 11:00:53 IST 2011
Target: x86

XFS (sdb3): Mounting Filesystem
XFS (sdb3): Ending clean mount
usb 2-6: USB disconnect, device number 3
XFS (sdb3): I/O error occurred: meta-data dev sdb3 block 0x56f159
 ("xlog_iodone") error 5 buf count 1024
XFS (sdb3): xfs_do_force_shutdown(0x2) called from line 891 of file
fs/xfs/xfs_log.c.  Return address = 0xf7b20ae1
XFS (sdb3): Log I/O Error Detected.  Shutting down filesystem
XFS (sdb3): Please umount the filesystem and rectify the problem(s)
XFS (sdb3): Unable to update superblock counters. Freespace may not be
correct on next mount.
XFS (€): xfs_trans_ail_delete_bulk: attempting to delete a log item
that is not in the AIL
XFS (€): xfs_do_force_shutdown(0x8) called from line 740 of file
fs/xfs/xfs_trans_ail.c.  Return address = 0xf7b2dc7a
BUG: unable to handle kernel NULL pointer dereference at 00000208
IP: [<f7b20f73>] xfs_log_force_umount+0x1d/0x1b5 [xfs]
*pde = 00000000
Oops: 0000 [#1] SMP
Modules linked in: vfat fat usb_storage xfs exportfs fuse sunrpc
cpufreq_ondemand acpi_cpufreq mperf ip6t_REJECT nf_conntrack_ipv6
nf_defrag_ipv6 ip6table_filter ip6_tables ipv6 uinput r8169 microcode
snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep i2c_i801
snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc
mii ppdev parport_pc parport iTCO_wdt iTCO_vendor_support pcspkr i915
drm_kms_helper drm i2c_algo_bit i2c_core video [last unloaded:
scsi_wait_scan]

Pid: 2233, comm: umount Not tainted 3.0.3 #5 Hewlett-Packard HP dx2480
MT(KL969AV)/0B08h
EIP: 0060:[<f7b20f73>] EFLAGS: 00210202 CPU: 1
EIP is at xfs_log_force_umount+0x1d/0x1b5 [xfs]
EAX: f2a1b380 EBX: 000001f4 ECX: f3ccdd5c EDX: 00000000
ESI: f2a1b380 EDI: 00000000 EBP: f3ccdd98 ESP: f3ccdd80
DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Process umount (pid: 2233, ti=f3ccc000 task=f1f957f0 task.ti=f3ccc000)
Stack:
 f7b32547 f2a1b380 00000000 f2a1b380 00000008 00000000 f3ccddb0 f7b32564
 00000000 f2a1b000 f2a1b380 00000000 f3ccddec f7b2dc7a 000002e4 f2a1b380
 00000004 f7b48bbf f7b42efc f7b2364d 00000001 f3ccddf4 f2a1b004 00000000
Call Trace:
 [<f7b32547>] ? xfs_do_force_shutdown+0x39/0xd6 [xfs]
 [<f7b32564>] xfs_do_force_shutdown+0x56/0xd6 [xfs]
 [<f7b2dc7a>] xfs_trans_ail_delete_bulk+0x83/0xfa [xfs]
 [<f7b2364d>] ? xlog_cil_push+0x2d1/0x2f6 [xfs]
 [<f7b35623>] ? xfs_buf_iodone_work+0x14/0x23 [xfs]
 [<f7b09d6f>] xfs_buf_iodone+0x31/0x3d [xfs]
 [<f7b09d1b>] xfs_buf_do_callbacks+0x24/0x31 [xfs]
 [<f7b0a5b7>] xfs_buf_iodone_callbacks+0x16f/0x1a2 [xfs]
 [<f7b35623>] xfs_buf_iodone_work+0x14/0x23 [xfs]
 [<f7b357ec>] xfs_buf_ioend+0x95/0xa5 [xfs]
 [<f7b35954>] ? xfs_bioerror+0x34/0x3c [xfs]
 [<f7b36217>] ? xfs_flush_buftarg+0x9e/0xe9 [xfs]
 [<f7b35954>] xfs_bioerror+0x34/0x3c [xfs]
 [<f7b36169>] xfs_bdstrat_cb+0x5f/0x6f [xfs]
 [<f7b36217>] xfs_flush_buftarg+0x9e/0xe9 [xfs]
 [<f7b36284>] xfs_free_buftarg+0x22/0x45 [xfs]
 [<f7b3c390>] xfs_close_devices+0x55/0x59 [xfs]
 [<f7b3c3e4>] xfs_fs_put_super+0x50/0x61 [xfs]
 [<c04e1432>] generic_shutdown_super+0x52/0xb0
 [<c04e14b2>] kill_block_super+0x22/0x5e
 [<c04e1997>] deactivate_locked_super+0x1f/0x40
 [<c04e21bf>] deactivate_super+0x37/0x3c
 [<c04f40c3>] mntput_no_expire+0x114/0x11a
 [<c04f4a2b>] sys_umount+0x26e/0x295
 [<c04f4a64>] sys_oldumount+0x12/0x14
 [<c07c401f>] sysenter_do_call+0x12/0x28
Code: 05 b8 05 00 00 00 83 c4 10 5b 5e 5f 5d c3 55 89 e5 57 56 53 83
ec 0c 3e 8d 74 26 00 89 55 f0 8b 98 18 01 00 00 89 c6 85 db 74 06 <f6>
43 14 02 74 27 8b 86 e0 00 00 00 31 ff 83 8e cc 01 00 00 10
EIP: [<f7b20f73>] xfs_log_force_umount+0x1d/0x1b5 [xfs] SS:ESP 0068:f3ccdd80
CR2: 0000000000000208
 ---[ end trace f3af361b30e84114 ]---
------------[ cut here ]------------
WARNING: at kernel/exit.c:909 do_exit+0x37/0x621()
Hardware name: HP dx2480 MT(KL969AV)
Modules linked in: vfat fat usb_storage xfs exportfs fuse sunrpc
cpufreq_ondemand acpi_cpufreq mperf ip6t_REJECT nf_conntrack_ipv6
nf_defrag_ipv6 ip6table_filter ip6_tables ipv6 uinput r8169 microcode
snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep i2c_i801
snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc
mii ppdev parport_pc parport iTCO_wdt iTCO_vendor_support pcspkr i915
drm_kms_helper drm i2c_algo_bit i2c_core video [last unloaded:
scsi_wait_scan]
Pid: 2233, comm: umount Tainted: G      D     3.0.3 #5
Call Trace:
 [<c0437f03>] warn_slowpath_common+0x6a/0x7f
 [<c043b016>] ? do_exit+0x37/0x621
 [<c0437f2c>] warn_slowpath_null+0x14/0x18
 [<c043b016>] do_exit+0x37/0x621
 [<c043840d>] ? kmsg_dump+0x3a/0xb3
 [<c07bfd15>] oops_end+0x9d/0xa5
 [<c042155b>] no_context+0x115/0x11f
 [<c0421659>] __bad_area_nosemaphore+0xf4/0xfc
 [<c04216b0>] bad_area+0x3a/0x40
 [<c07c15db>] do_page_fault+0x227/0x376
 [<c07c13b4>] ? spurious_fault+0xba/0xba
 [<c07bf3d7>] error_code+0x67/0x6c
 [<f7b20f73>] ? xfs_log_force_umount+0x1d/0x1b5 [xfs]
 [<f7b32547>] ? xfs_do_force_shutdown+0x39/0xd6 [xfs]
 [<f7b32564>] xfs_do_force_shutdown+0x56/0xd6 [xfs]
 [<f7b2dc7a>] xfs_trans_ail_delete_bulk+0x83/0xfa [xfs]
 [<f7b2364d>] ? xlog_cil_push+0x2d1/0x2f6 [xfs]
 [<f7b35623>] ? xfs_buf_iodone_work+0x14/0x23 [xfs]
 [<f7b09d6f>] xfs_buf_iodone+0x31/0x3d [xfs]
 [<f7b09d1b>] xfs_buf_do_callbacks+0x24/0x31 [xfs]
 [<f7b0a5b7>] xfs_buf_iodone_callbacks+0x16f/0x1a2 [xfs]
 [<f7b35623>] xfs_buf_iodone_work+0x14/0x23 [xfs]
 [<f7b357ec>] xfs_buf_ioend+0x95/0xa5 [xfs]
 [<f7b35954>] ? xfs_bioerror+0x34/0x3c [xfs]
 [<f7b36217>] ? xfs_flush_buftarg+0x9e/0xe9 [xfs]
 [<f7b35954>] xfs_bioerror+0x34/0x3c [xfs]
 [<f7b36169>] xfs_bdstrat_cb+0x5f/0x6f [xfs]
 [<f7b36217>] xfs_flush_buftarg+0x9e/0xe9 [xfs]
 [<f7b36284>] xfs_free_buftarg+0x22/0x45 [xfs]
 [<f7b3c390>] xfs_close_devices+0x55/0x59 [xfs]
 [<f7b3c3e4>] xfs_fs_put_super+0x50/0x61 [xfs]
 [<c04e1432>] generic_shutdown_super+0x52/0xb0
 [<c04e14b2>] kill_block_super+0x22/0x5e
 [<c04e1997>] deactivate_locked_super+0x1f/0x40
 [<c04e21bf>] deactivate_super+0x37/0x3c
 [<c04f40c3>] mntput_no_expire+0x114/0x11a
 [<c04f4a2b>] sys_umount+0x26e/0x295
 [<c04f4a64>] sys_oldumount+0x12/0x14
 [<c07c401f>] sysenter_do_call+0x12/0x28
---[ end trace f3af361b30e84115 ]---


On Mon, Sep 12, 2011 at 4:36 PM, Amit Sahrawat
<amit.sahrawat83@gmail.com> wrote:
> Please ignore previous message, there is formatting issue with the
> back trace. I will post again.
>
> On Sun, Sep 11, 2011 at 10:16 PM, Amit Sahrawat
> <amit.sahrawat83@gmail.com> wrote:
>> Yes, the patch is applied and the crash is still appearing easily.
>>
>> On Sun, Sep 11, 2011 at 12:00 AM, Christoph Hellwig <hch@infradead.org> wrote:
>>> On Thu, Sep 08, 2011 at 04:35:28PM +0530, Amit Sahrawat wrote:
>>>> Kernel Version: 2.6.39.4
>>>> Target: ARM
>>>>
>>>> Observed while doing:
>>>> Copy some file (any size, I tried with 10MB, 100MB) to XFS partition
>>>> After Copy do‘sync’
>>>> Now immediately, unplug the device.
>>>
>>> Does this still happen with the patch below applied?
>>>
>>>
>>
>

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs


* Re: XFS: Observed Crash followed by deadlock of khubd/sync/XFS
  2011-09-08 17:28 ` Amit Sahrawat
@ 2011-09-13 15:26   ` Christoph Hellwig
  2011-09-13 15:43     ` Amit Sahrawat
  0 siblings, 1 reply; 9+ messages in thread
From: Christoph Hellwig @ 2011-09-13 15:26 UTC (permalink / raw)
  To: Amit Sahrawat; +Cc: xfs

On Thu, Sep 08, 2011 at 10:58:35PM +0530, Amit Sahrawat wrote:
> Since this is very hard to reproduce, to make it easy to debug. This
> can be reproduce by introducing msleep in the kernel xfs_umountfs()
> before xfs_log_sbcount(), just add a print before this function and
> sleep and the moment the print appear unplug the USB device, same
> scenario will be reproduced.
> CRASH will show the backtrace and return to normal shell, but when
> process state is checked, khubd will be shown in TASK-UNINTERRUPTIBLE
> state 'D'.
> Further if sync is issued that will also get converted to 'D' state,
> the back-trace for each of the task is same as mentioned in the
> previous mail.

I've not been able to reproduce this using that method so far.


* Re: XFS: Observed Crash followed by deadlock of khubd/sync/XFS
  2011-09-13 15:26   ` Christoph Hellwig
@ 2011-09-13 15:43     ` Amit Sahrawat
  0 siblings, 0 replies; 9+ messages in thread
From: Amit Sahrawat @ 2011-09-13 15:43 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

I am able to reproduce it very frequently on both ARM and x86 with the
same test case and the same logs.
In particular, just adding an msleep in the unmount path before
xfs_log_sbcount will trigger this path; this will definitely help.
I have been able to figure out the issue: the callback path and the
unmount path are not in sync.
The unmount path does its work as required, but there is still a
callback pending.
In the normal case I can see that xfslogd is used to execute the
callbacks, while at the time of the issue the callback is called in the
context of the 'umount' process. This leads to the AIL being destroyed
and the log being deallocated, and then the same being used in the
callback path from
xfs_buf_iodone_callbacks --> xfs_buf_iodone
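
The ordering problem described above can be sketched in user space.
This is only a toy model under stated assumptions, not XFS code and not
the changes mentioned in this mail: toy_log, toy_mount, toy_iodone,
toy_run_callbacks, toy_free_log and toy_unmount are all hypothetical
names, and a NULL check stands in for the place where the kernel would
oops.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are not the real XFS structures. */
struct toy_log {
    int shutdown;               /* set when a completion reports an error */
};

struct toy_mount {
    struct toy_log *log;        /* freed during unmount */
    int pending_callbacks;      /* completions that have not run yet */
};

/* Completion callback: needs the log, like the xfs_buf_iodone path. */
static int toy_iodone(struct toy_mount *mp)
{
    if (mp->log == NULL)
        return -1;              /* the real kernel would oops here */
    mp->log->shutdown = 1;
    return 0;
}

/* Run every pending completion; return -1 if any ran without a log. */
static int toy_run_callbacks(struct toy_mount *mp)
{
    int ret = 0;

    while (mp->pending_callbacks > 0) {
        mp->pending_callbacks--;
        if (toy_iodone(mp) != 0)
            ret = -1;
    }
    return ret;
}

/* Tear down the log and clear the pointer so later use is detectable. */
static void toy_free_log(struct toy_mount *mp)
{
    free(mp->log);
    mp->log = NULL;
}

/*
 * drain_first == 0: free the log, then run callbacks (the failing order).
 * drain_first != 0: run callbacks, then free the log.
 */
int toy_unmount(int drain_first)
{
    struct toy_mount mp = {
        .log = calloc(1, sizeof(struct toy_log)),
        .pending_callbacks = 1,
    };
    int ret;

    assert(mp.log != NULL);
    if (drain_first) {
        ret = toy_run_callbacks(&mp);
        toy_free_log(&mp);
    } else {
        toy_free_log(&mp);
        ret = toy_run_callbacks(&mp);
    }
    return ret;
}
```

In this model toy_unmount(0) returns -1, because the callback runs
after the log is gone, and toy_unmount(1) returns 0. Whether draining
the pending callbacks before teardown is the right fix for XFS itself
is not something this sketch can show.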

I have made some changes to get around this, but I am still looking
into it further.

Thanks & Regards,
Amit Sahrawat

On Tue, Sep 13, 2011 at 8:56 PM, Christoph Hellwig <hch@infradead.org> wrote:
> On Thu, Sep 08, 2011 at 10:58:35PM +0530, Amit Sahrawat wrote:
>> Since this is very hard to reproduce, to make it easy to debug. This
>> can be reproduce by introducing msleep in the kernel xfs_umountfs()
>> before xfs_log_sbcount(), just add a print before this function and
>> sleep and the moment the print appear unplug the USB device, same
>> scenario will be reproduced.
>> CRASH will show the backtrace and return to normal shell, but when
>> process state is checked, khubd will be shown in TASK-UNINTERRUPTIBLE
>> state 'D'.
>> Further if sync is issued that will also get converted to 'D' state,
>> the back-trace for each of the task is same as mentioned in the
>> previous mail.
>
> I've not been able to reproduce this using that method so far.
>
>


end of thread, other threads:[~2011-09-13 15:44 UTC | newest]

Thread overview: 9+ messages
2011-09-08 11:05 XFS: Observed Crash followed by deadlock of khubd/sync/XFS Amit Sahrawat
2011-09-08 17:28 ` Amit Sahrawat
2011-09-13 15:26   ` Christoph Hellwig
2011-09-13 15:43     ` Amit Sahrawat
2011-09-10 18:30 ` Christoph Hellwig
2011-09-11 16:46   ` Amit Sahrawat
2011-09-12 11:02     ` Amit Sahrawat
2011-09-12 11:06     ` Amit Sahrawat
2011-09-12 11:15       ` Amit Sahrawat
