* xfs_iext_realloc_indirect and "XFS: possible memory allocation deadlock"
@ 2015-06-23  8:18 Alex Lyakas
  2015-06-23 20:18 ` Dave Chinner
  0 siblings, 1 reply; 17+ messages in thread
From: Alex Lyakas @ 2015-06-23  8:18 UTC (permalink / raw)
  To: xfs, hch



Greetings,

We are hitting an issue with XFS printing messages like
“XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)”
and a stack trace like in [1]. Eventually, a hung-task panic kicks in with stack traces like [2].

We are running kernel 3.8.13. I see that a similar issue was discussed in http://oss.sgi.com/archives/xfs/2012-01/msg00341.html, but no code changes followed compared to what we have in 3.8.13.

Any suggestion on how to move forward with this problem? For example, does this memory really have to be allocated with kmalloc (i.e., physically contiguous), or can vmalloc be used?
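
If I read the 3.8 code correctly, the message comes from XFS's kmem_alloc() retry loop, which never gives up unless the caller allows failure. A rough sketch (paraphrased from fs/xfs/kmem.c, not a verbatim copy) is:

    void *
    kmem_alloc(size_t size, xfs_km_flags_t flags)
    {
            int     retries = 0;
            gfp_t   lflags = kmem_flags_convert(flags);
            void    *ptr;

            do {
                    ptr = kmalloc(size, lflags);
                    if (ptr || (flags & (KM_MAYFAIL | KM_NOSLEEP)))
                            return ptr;
                    if (!(++retries % 100))
                            xfs_err(NULL,
                "possible memory allocation deadlock in %s (mode:0x%x)",
                                        __func__, lflags);
                    congestion_wait(BLK_RW_ASYNC, HZ / 50);
            } while (1);
    }

So the message itself is a warning that a kmalloc (mode 0x250 appears to be GFP_NOFS | __GFP_NOWARN) keeps failing and being retried; the hung tasks in [2] look like a consequence of that loop spinning while locks are held.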

Thanks,
Alex.

[1]
[109626.075483] nfsd            D 0000000000000002     0 20042      2 0x00000000
[109626.075483]  ffff88026ac3ef58 0000000000000046 ffff88031fffbd80 ffff88026ac40000
[109626.075483]  ffff88026ac3ffd8 ffff88026ac3ffd8 ffff88026ac3ffd8 0000000000013f40
[109626.075483]  ffff88030e542e80 ffff88026ac40000 ffff88030e58c000 ffff88026ac3ef90
[109626.075483] Call Trace:
[109626.075483]  [<ffffffff816ec509>] schedule+0x29/0x70
[109626.075483]  [<ffffffff816eabd0>] schedule_timeout+0x130/0x250
[109626.075483]  [<ffffffff8106a340>] ? cascade+0xa0/0xa0
[109626.075483]  [<ffffffff816ec8a2>] io_schedule_timeout+0xa2/0x100
[109626.075483]  [<ffffffff81185311>] ? __kmalloc+0x181/0x190
[109626.075483]  [<ffffffff81153bc0>] congestion_wait+0x80/0x120
[109626.075483]  [<ffffffff81185311>] ? __kmalloc+0x181/0x190
[109626.075483]  [<ffffffff8107fc10>] ? add_wait_queue+0x60/0x60
[109626.075483]  [<ffffffff81185260>] ? __kmalloc+0xd0/0x190
[109626.075483]  [<ffffffffa07631fc>] ? kmem_alloc+0x5c/0xe0 [xfs]
[109626.075483]  [<ffffffffa0763330>] ? kmem_realloc+0x30/0x70 [xfs]
[109626.075483]  [<ffffffffa0795f0d>] ? xfs_iext_realloc_indirect+0x3d/0x60 [xfs]
[109626.075483]  [<ffffffffa0795f6f>] ? xfs_iext_irec_new+0x3f/0x180 [xfs]
[109626.075483]  [<ffffffffa0796229>] ? xfs_iext_add_indirect_multi+0x179/0x2b0 [xfs]
[109626.075483]  [<ffffffffa079662e>] ? xfs_iext_add+0xce/0x290 [xfs]
[109626.075483]  [<ffffffff81097c33>] ? update_curr+0x143/0x1f0
[109626.075483]  [<ffffffffa0796842>] ? xfs_iext_insert+0x52/0x100 [xfs]
[109626.075483]  [<ffffffffa0771b43>] ? xfs_bmap_add_extent_hole_delay+0xd3/0x6a0 [xfs]
[109626.075483]  [<ffffffffa0771b43>] ? xfs_bmap_add_extent_hole_delay+0xd3/0x6a0 [xfs]
[109626.075483]  [<ffffffffa07950d7>] ? xfs_iext_bno_to_ext+0xf7/0x160 [xfs]
[109626.075483]  [<ffffffffa0772389>] ? xfs_bmapi_reserve_delalloc+0x279/0x2a0 [xfs]
[109626.075483]  [<ffffffffa07793b2>] ? xfs_bmapi_delay+0x122/0x270 [xfs]
[109626.075483]  [<ffffffffa0758703>] ? xfs_iomap_write_delay+0x173/0x320 [xfs]
[109626.075483]  [<ffffffffa077909c>] ? xfs_bmapi_read+0xfc/0x2f0 [xfs]
[109626.075483]  [<ffffffff8135d8f3>] ? call_rwsem_down_write_failed+0x13/0x20
[109626.075483]  [<ffffffffa0745b40>] ? __xfs_get_blocks+0x280/0x550 [xfs]
[109626.075483]  [<ffffffffa0745e41>] ? xfs_get_blocks+0x11/0x20 [xfs]
[109626.075483]  [<ffffffff811cf77e>] ? __block_write_begin+0x1ae/0x4e0
[109626.075483]  [<ffffffffa0745e30>] ? xfs_get_blocks_direct+0x20/0x20 [xfs]
[109626.075483]  [<ffffffff81135fff>] ? grab_cache_page_write_begin+0x8f/0xf0
[109626.075483]  [<ffffffffa074509f>] ? xfs_vm_write_begin+0x5f/0xe0 [xfs]
[109626.075483]  [<ffffffff8113552a>] ? generic_perform_write+0xca/0x210
[109626.075483]  [<ffffffff811356cd>] ? generic_file_buffered_write+0x5d/0x90
[109626.075483]  [<ffffffffa07502d5>] ? xfs_file_buffered_aio_write+0x115/0x1c0 [xfs]
[109626.075483]  [<ffffffff816159f4>] ? ip_finish_output+0x224/0x3b0
[109626.075483]  [<ffffffffa075047c>] ? xfs_file_aio_write+0xfc/0x1b0 [xfs]
[109626.075483]  [<ffffffffa0750380>] ? xfs_file_buffered_aio_write+0x1c0/0x1c0 [xfs]
[109626.075483]  [<ffffffff8119b8c3>] ? do_sync_readv_writev+0xa3/0xe0
[109626.075483]  [<ffffffff8119bb8d>] ? do_readv_writev+0xcd/0x1d0
[109626.075483]  [<ffffffff810877e0>] ? set_groups+0x40/0x60
[109626.075483]  [<ffffffffa01be6b0>] ? nfsd_setuser+0x120/0x2b0 [nfsd]
[109626.075483]  [<ffffffff8119bccc>] ? vfs_writev+0x3c/0x50
[109626.075483]  [<ffffffffa01b7dd2>] ? nfsd_vfs_write.isra.12+0x92/0x350 [nfsd]
[109626.075483]  [<ffffffff8119a6cb>] ? dentry_open+0x6b/0xd0
[109626.075483]  [<ffffffffa01ba679>] ? nfsd_write+0xf9/0x110 [nfsd]
[109626.075483]  [<ffffffffa01c4dd1>] ? nfsd3_proc_write+0xb1/0x140 [nfsd]
[109626.075483]  [<ffffffffa01b3d62>] ? nfsd_dispatch+0x102/0x270 [nfsd]
[109626.075483]  [<ffffffffa012bb48>] ? svc_process_common+0x328/0x5e0 [sunrpc]
[109626.075483]  [<ffffffffa012c153>] ? svc_process+0x103/0x160 [sunrpc]
[109626.075483]  [<ffffffffa01b372f>] ? nfsd+0xbf/0x130 [nfsd]
[109626.075483]  [<ffffffffa01b3670>] ? nfsd_destroy+0x80/0x80 [nfsd]
[109626.075483]  [<ffffffff8107f050>] ? kthread+0xc0/0xd0
[109626.075483]  [<ffffffff8107ef90>] ? flush_kthread_worker+0xb0/0xb0
[109626.075483]  [<ffffffff816f61ec>] ? ret_from_fork+0x7c/0xb0
[109626.075483]  [<ffffffff8107ef90>] ? flush_kthread_worker+0xb0/0xb0

[2]
[87303.976119] INFO: task nfsd:5684 blocked for more than 180 seconds.
[87303.976976] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[87303.978012] nfsd            D 0000000000000003     0  5684      2 0x00000000
[87303.978017]  ffff8802506d37e8 0000000000000046 ffff880200000000 ffff880307ce3c9c
[87303.978020]  ffff8802506d3fd8 ffff8802506d3fd8 ffff8802506d3fd8 0000000000013f40
[87303.978023]  ffff88030e5445c0 ffff8802ca9945c0 ffff8802506d37c8 ffff88009dd8fd60
[87303.978026] Call Trace:
[87303.978036]  [<ffffffff816ec509>] schedule+0x29/0x70
[87303.978039]  [<ffffffff816ec7be>] schedule_preempt_disabled+0xe/0x10
[87303.978042]  [<ffffffff816eb437>] __mutex_lock_slowpath+0xd7/0x150
[87303.978045]  [<ffffffff816eb04a>] mutex_lock+0x2a/0x50
[87303.978076]  [<ffffffffa075a227>] xfs_file_buffered_aio_write+0x67/0x1c0 [xfs]
[87303.978089]  [<ffffffffa075a47c>] xfs_file_aio_write+0xfc/0x1b0 [xfs]
[87303.978101]  [<ffffffffa075a380>] ? xfs_file_buffered_aio_write+0x1c0/0x1c0 [xfs]
[87303.978105]  [<ffffffff8119b8c3>] do_sync_readv_writev+0xa3/0xe0
[87303.978109]  [<ffffffff8119bb8d>] do_readv_writev+0xcd/0x1d0
[87303.978112]  [<ffffffff811afe51>] ? prepend_path+0xf1/0x1e0
[87303.978115]  [<ffffffff811856fc>] ? kmem_cache_alloc_trace+0x11c/0x140
[87303.978119]  [<ffffffff8130c425>] ? aa_alloc_task_context+0x35/0x50
[87303.978122]  [<ffffffff8119bccc>] vfs_writev+0x3c/0x50
[87303.978145]  [<ffffffffa0266dd2>] nfsd_vfs_write.isra.12+0x92/0x350 [nfsd]
[87303.978149]  [<ffffffff816ed43e>] ? _raw_spin_lock+0xe/0x20
[87303.978159]  [<ffffffffa0284ba4>] ? find_confirmed_client.isra.58+0x144/0x1a0 [nfsd]
[87303.978167]  [<ffffffffa0284d48>] ? nfsd4_lookup_stateid+0xc8/0x120 [nfsd]
[87303.978174]  [<ffffffffa0269623>] nfsd_write+0xa3/0x110 [nfsd]
[87303.978182]  [<ffffffffa027794c>] nfsd4_write+0x1cc/0x250 [nfsd]
[87303.978189]  [<ffffffffa027746c>] nfsd4_proc_compound+0x5ac/0x7a0 [nfsd]
[87303.978197]  [<ffffffffa0262d62>] nfsd_dispatch+0x102/0x270 [nfsd]
[87303.978214]  [<ffffffffa01f3b48>] svc_process_common+0x328/0x5e0 [sunrpc]
[87303.978225]  [<ffffffffa01f4153>] svc_process+0x103/0x160 [sunrpc]
[87303.978232]  [<ffffffffa026272f>] nfsd+0xbf/0x130 [nfsd]
[87303.978238]  [<ffffffffa0262670>] ? nfsd_destroy+0x80/0x80 [nfsd]
[87303.978243]  [<ffffffff8107f050>] kthread+0xc0/0xd0
[87303.978246]  [<ffffffff8107ef90>] ? flush_kthread_worker+0xb0/0xb0
[87303.978250]  [<ffffffff816f61ec>] ret_from_fork+0x7c/0xb0
[87303.978253]  [<ffffffff8107ef90>] ? flush_kthread_worker+0xb0/0xb0


* Re: xfs_iext_realloc_indirect and "XFS: possible memory allocation deadlock"
  2015-06-23  8:18 xfs_iext_realloc_indirect and "XFS: possible memory allocation deadlock" Alex Lyakas
@ 2015-06-23 20:18 ` Dave Chinner
  2015-06-27 21:01   ` Alex Lyakas
  0 siblings, 1 reply; 17+ messages in thread
From: Dave Chinner @ 2015-06-23 20:18 UTC (permalink / raw)
  To: Alex Lyakas; +Cc: hch, xfs

On Tue, Jun 23, 2015 at 10:18:21AM +0200, Alex Lyakas wrote:
> Greetings,
> 
> We are hitting an issue with XFS printing messages like “XFS:
> possible memory allocation deadlock in kmem_alloc
> (mode:0x250)” and stack trace like in [1]. Eventually,
> hung-task panic kicks in with stack traces like [2].
> 
> We are running kernel 3.8.13. I see that in
> http://oss.sgi.com/archives/xfs/2012-01/msg00341.html a similar
> issue has been discussed, but no code changes followed comparing
> to what we have in 3.8.13.
> 
> Any suggestion on how to move forward with this problem? For
> example, does this memory has to be really allocated with kmalloc
> (i.e., physically continuous) or vmalloc can be used?

We left it alone because it is relatively rare for people to hit it,
and generally it indicates a severe fragmentation problem when they
do hit it (i.e. a file with millions of extents in it). Can you
track down the file that this is occurring against and see how many
extents it has?
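
For example (the path here is a placeholder), something like:

    $ xfs_bmap -v /path/to/suspect/file | wc -l

gives a rough extent count (the first couple of lines of output are headers).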

i.e. you may be much better off by taking measures to avoid excessive
fragmentation than removing the canary from the mine...

> [109626.075483] nfsd            D 0000000000000002     0 20042      2 0x00000000

Hmmm - it's also a file written by the NFS server - is this on a
dedicated NFS server?

> [109626.075483]  [<ffffffffa01be6b0>] ? nfsd_setuser+0x120/0x2b0 [nfsd]
> [109626.075483]  [<ffffffff8119bccc>] ? vfs_writev+0x3c/0x50
> [109626.075483]  [<ffffffffa01b7dd2>] ? nfsd_vfs_write.isra.12+0x92/0x350 [nfsd]
> [109626.075483]  [<ffffffff8119a6cb>] ? dentry_open+0x6b/0xd0
> [109626.075483]  [<ffffffffa01ba679>] ? nfsd_write+0xf9/0x110 [nfsd]
> [109626.075483]  [<ffffffffa01c4dd1>] ? nfsd3_proc_write+0xb1/0x140 [nfsd]

Interesting that this is an NFSv3 write...

> [87303.976119] INFO: task nfsd:5684 blocked for more than 180 seconds.
> [87303.976976] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [87303.978012] nfsd            D 0000000000000003     0  5684      2 0x00000000
....
> [87303.978174]  [<ffffffffa0269623>] nfsd_write+0xa3/0x110 [nfsd]
> [87303.978182]  [<ffffffffa027794c>] nfsd4_write+0x1cc/0x250 [nfsd]
> [87303.978189]  [<ffffffffa027746c>] nfsd4_proc_compound+0x5ac/0x7a0 [nfsd]

And that is an NFSv4 write. You have multiple clients writing to the
same file using different versions of the NFS protocol?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: xfs_iext_realloc_indirect and "XFS: possible memory allocation deadlock"
  2015-06-23 20:18 ` Dave Chinner
@ 2015-06-27 21:01   ` Alex Lyakas
  2015-06-28 18:19     ` Alex Lyakas
                       ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Alex Lyakas @ 2015-06-27 21:01 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Danny Shavit, Shyam Kaushik, Yair Hershko, hch, xfs

Hi Dave,

First, to answer your questions:
- this filesystem is accessed only through NFS, so I guess we can say this 
is a "dedicated" NFS server
- yes, both NFSv3 and NFSv4 are used (by different servers), and definitely 
they may attempt to access the same file

I did some further debugging by patching XFS to get some more info on how
much memory is requested, and who requests it. I added a kmem_realloc
variant that is called only by xfs_iext_realloc_indirect. This new function
(kmem_realloc_xfs_iext_realloc_indirect) also prints the number of the inode
that needs the memory. This is after some fiddling to check that "ifp" is the
one embedded in xfs_inode and not the one allocated through kmem_cache (attr
fork). After memory allocation fails for the first time, I also call
handle_sysrq('m'). Finally, I print how many retries it took for the
allocation to succeed. I also added some info to the normal kmem_alloc function.
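
The alloc side of the wrapper ends up looking roughly like this (a simplified sketch, not the exact patch; it mirrors the stock kmem_alloc() retry loop with the extra logging described above, and the realloc variant just calls it and copies the old contents, like the stock kmem_realloc):

    void *
    kmem_alloc_xfs_iext_realloc_indirect(size_t size, xfs_km_flags_t flags,
                                         xfs_ino_t inum)
    {
            gfp_t   lflags = kmem_flags_convert(flags);
            int     retries = 0;
            void    *ptr;

            do {
                    ptr = kmalloc(size, lflags);
                    if (ptr)
                            break;
                    if (retries++ == 0) {
                            xfs_err(NULL,
            "pid=%d kmem_alloc failure inum=%llu size=%zu flags=0x%x lflags=0x%x",
                                    current->pid, (unsigned long long)inum,
                                    size, (unsigned int)flags,
                                    (unsigned int)lflags);
                            handle_sysrq('m');      /* dump Mem-Info once */
                    }
                    congestion_wait(BLK_RW_ASYNC, HZ / 50);
            } while (1);

            if (retries)
                    xfs_err(NULL,
            "pid=%d kmem_alloc success after %d retries inum=%llu size=%zu",
                            current->pid, retries,
                            (unsigned long long)inum, size);
            return ptr;
    }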

The results are the following:
- memory allocation failures happened only on the
kmem_realloc_xfs_iext_realloc_indirect path so far
- XFS hits memory re-allocation failures when it needs to allocate about
35KB. Sometimes the allocation succeeds after a few retries, but sometimes it
takes several thousand retries.
- All allocation failures happened on NFSv3 paths
- Three inode numbers were reported as failing memory allocations. After
several hours, "find -inum" is still searching for these inodes... this is a
huge filesystem... Is there any quicker (XFS-specific?) way to find
the file based on an inode number?
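
One thing I am considering, rather than waiting for find: xfs_db can at least dump the inode by number, read-only, directly from the block device (the device name below is a placeholder), even though it does not give the pathname:

    # xfs_db -r /dev/<device>
    xfs_db> inode 35646862875
    xfs_db> p core.nextents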

Please see a typical allocation failure in [1].

Any recommendation on how to move forward with this issue?

An additional observation from my local system: when writing files to XFS
locally vs writing the same files via NFS (both v3 and v4), the number of
extents reported by "xfs_bmap" is much higher in the NFS case. For example,
creating a new file and writing into it as follows:
- write 4KB
- skip 4KB (i.e., lseek to 4KB + 4KB)
- write 4KB
- skip 4KB
...
Create a file of, say, 50MB this way (a minimal reproducer sketch is below).
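
To be concrete, the workload is essentially the following (a minimal sketch; the file name and target size are arbitrary):

    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
            char buf[4096];
            long long written = 0;
            int fd;

            memset(buf, 'a', sizeof(buf));
            fd = open("testfile", O_CREAT | O_WRONLY | O_TRUNC, 0644);
            if (fd < 0) {
                    perror("open");
                    return 1;
            }
            /* write 4KB, then skip 4KB, until ~50MB of data is written */
            while (written < 50LL * 1024 * 1024) {
                    if (write(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf)) {
                            perror("write");
                            return 1;
                    }
                    if (lseek(fd, 4096, SEEK_CUR) < 0) {
                            perror("lseek");
                            return 1;
                    }
                    written += sizeof(buf);
            }
            close(fd);
            return 0;
    }

Running "xfs_bmap testfile" on the resulting file then gives the extent counts I mention below.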

Locally it ends up with very few (1-5) extents. But the same exact workload
through NFS results in several thousand extents. The filesystem is
mounted as "sync" in both cases. This is currently just an observation,
which I will try to debug further. Are you perhaps familiar with such
behavior? Could you perhaps check whether it also happens in your
environment?

Thanks for your help,
Alex.



[1]
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.497033] XFS: pid=12642 
kmem_alloc failure inum=35646862875 size=35584 flags=0x4 lflags=0x250
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498407] Pid: 12642, comm: 
nfsd Tainted: GF          O 3.8.13-030813-generic #201305111843
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498410] Call Trace:
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498450] 
[<ffffffffa07a832e>] kmem_alloc_xfs_iext_realloc_indirect+0x16e/0x1f0 [xfs]
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498465] 
[<ffffffffa07a86b3>] kmem_realloc_xfs_iext_realloc_indirect+0x33/0x80 [xfs]
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498482] 
[<ffffffffa07db2a0>] xfs_iext_realloc_indirect+0x40/0x60 [xfs]
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498498] 
[<ffffffffa07db2ff>] xfs_iext_irec_new+0x3f/0x180 [xfs]
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498516] 
[<ffffffffa07db5b9>] xfs_iext_add_indirect_multi+0x179/0x2b0 [xfs]
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498532] 
[<ffffffffa07db9be>] xfs_iext_add+0xce/0x290 [xfs]
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498539] 
[<ffffffff81076634>] ? wake_up_worker+0x24/0x30
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498553] 
[<ffffffffa07dbbd2>] xfs_iext_insert+0x52/0x100 [xfs]
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498569] 
[<ffffffffa07b6ed3>] ? xfs_bmap_add_extent_hole_delay+0xd3/0x6a0 [xfs]
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498583] 
[<ffffffffa07b6ed3>] xfs_bmap_add_extent_hole_delay+0xd3/0x6a0 [xfs]
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498598] 
[<ffffffffa07da3fc>] ? xfs_iext_bno_to_ext+0x8c/0x160 [xfs]
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498614] 
[<ffffffffa07b7719>] xfs_bmapi_reserve_delalloc+0x279/0x2a0 [xfs]
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498628] 
[<ffffffffa07be742>] xfs_bmapi_delay+0x122/0x270 [xfs]
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498642] 
[<ffffffffa079d703>] xfs_iomap_write_delay+0x173/0x320 [xfs]
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498656] 
[<ffffffffa07be42c>] ? xfs_bmapi_read+0xfc/0x2f0 [xfs]
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498660] 
[<ffffffff8135d8f3>] ? call_rwsem_down_write_failed+0x13/0x20
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498665] 
[<ffffffff81195bbc>] ? lookup_page_cgroup+0x4c/0x50
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498677] 
[<ffffffffa078ab40>] __xfs_get_blocks+0x280/0x550 [xfs]
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498688] 
[<ffffffffa078ae41>] xfs_get_blocks+0x11/0x20 [xfs]
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498695] 
[<ffffffff811cf77e>] __block_write_begin+0x1ae/0x4e0
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498705] 
[<ffffffffa078ae30>] ? xfs_get_blocks_direct+0x20/0x20 [xfs]
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498708] 
[<ffffffff81135fff>] ? grab_cache_page_write_begin+0x8f/0xf0
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498720] 
[<ffffffffa078a09f>] xfs_vm_write_begin+0x5f/0xe0 [xfs]
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498725] 
[<ffffffff8113552a>] generic_perform_write+0xca/0x210
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498729] 
[<ffffffff811356cd>] generic_file_buffered_write+0x5d/0x90
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498740] 
[<ffffffffa07952d5>] xfs_file_buffered_aio_write+0x115/0x1c0 [xfs]
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498744] 
[<ffffffff816159f4>] ? ip_finish_output+0x224/0x3b0
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498758] 
[<ffffffffa079547c>] xfs_file_aio_write+0xfc/0x1b0 [xfs]
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498769] 
[<ffffffffa0795380>] ? xfs_file_buffered_aio_write+0x1c0/0x1c0 [xfs]
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498773] 
[<ffffffff8119b8c3>] do_sync_readv_writev+0xa3/0xe0
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498776] 
[<ffffffff8119bb8d>] do_readv_writev+0xcd/0x1d0
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498780] 
[<ffffffff810877e0>] ? set_groups+0x40/0x60
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498805] 
[<ffffffffa033a6b0>] ? nfsd_setuser+0x120/0x2b0 [nfsd]
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498808] 
[<ffffffff8119bccc>] vfs_writev+0x3c/0x50
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498815] 
[<ffffffffa0333dd2>] nfsd_vfs_write.isra.12+0x92/0x350 [nfsd]
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498818] 
[<ffffffff8119a6cb>] ? dentry_open+0x6b/0xd0
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498825] 
[<ffffffffa0336679>] nfsd_write+0xf9/0x110 [nfsd]
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498834] 
[<ffffffffa0340dd1>] nfsd3_proc_write+0xb1/0x140 [nfsd]
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498841] 
[<ffffffffa032fd62>] nfsd_dispatch+0x102/0x270 [nfsd]
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498865] 
[<ffffffffa0103b48>] svc_process_common+0x328/0x5e0 [sunrpc]
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498876] 
[<ffffffffa0104153>] svc_process+0x103/0x160 [sunrpc]
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498882] 
[<ffffffffa032f72f>] nfsd+0xbf/0x130 [nfsd]
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498888] 
[<ffffffffa032f670>] ? nfsd_destroy+0x80/0x80 [nfsd]
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498892] 
[<ffffffff8107f050>] kthread+0xc0/0xd0
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498895] 
[<ffffffff8107ef90>] ? flush_kthread_worker+0xb0/0xb0
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498900] 
[<ffffffff816f61ec>] ret_from_fork+0x7c/0xb0
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498903] 
[<ffffffff8107ef90>] ? flush_kthread_worker+0xb0/0xb0
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498905] SysRq : Show Memory
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499385] Mem-Info:
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499388] Node 0 DMA per-cpu:
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499391] CPU    0: hi:    0, 
btch:   1 usd:   0
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499392] CPU    1: hi:    0, 
btch:   1 usd:   0
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499394] CPU    2: hi:    0, 
btch:   1 usd:   0
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499395] CPU    3: hi:    0, 
btch:   1 usd:   0
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499396] Node 0 DMA32 
per-cpu:
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499399] CPU    0: hi:  186, 
btch:  31 usd:   0
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499400] CPU    1: hi:  186, 
btch:  31 usd:   0
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499402] CPU    2: hi:  186, 
btch:  31 usd:   0
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499403] CPU    3: hi:  186, 
btch:  31 usd:   0
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499404] Node 0 Normal 
per-cpu:
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499406] CPU    0: hi:  186, 
btch:  31 usd:  23
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499407] CPU    1: hi:  186, 
btch:  31 usd:  25
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499409] CPU    2: hi:  186, 
btch:  31 usd:   0
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499410] CPU    3: hi:  186, 
btch:  31 usd:   7
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499415] active_anon:2143 
inactive_anon:44181 isolated_anon:0
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499415]  active_file:913373 
inactive_file:1464930 isolated_file:0
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499415]  unevictable:8449 
dirty:6742 writeback:115 unstable:0
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499415]  free:159418 
slab_reclaimable:146857 slab_unreclaimable:66681
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499415]  mapped:5561 
shmem:383 pagetables:2195 bounce:0
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499415]  free_cma:0
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499418] Node 0 DMA 
free:15884kB min:84kB low:104kB high:124kB active_anon:0kB inactive_anon:0kB 
active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB 
isolated(file):0kB present:15628kB managed:15884kB mlocked:0kB dirty:0kB 
writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB 
slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB 
bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? 
yes
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499423] lowmem_reserve[]: 0 
3512 12080 12080
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499427] Node 0 DMA32 
free:240040kB min:19624kB low:24528kB high:29436kB active_anon:8kB
inactive_anon:15196kB active_file:1503256kB inactive_file:1624328kB 
unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3596532kB
managed:3546428kB mlocked:0kB dirty:14652kB writeback:0kB mapped:988kB 
shmem:0kB slab_reclaimable:173044kB slab_unreclaimable:15988kB
kernel_stack:248kB pagetables:28kB unstable:0kB bounce:0kB free_cma:0kB 
writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499432] lowmem_reserve[]: 0 
0 8568 8568
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499435] Node 0 Normal 
free:381748kB min:47872kB low:59840kB high:71808kB active_anon:8564kB
inactive_anon:161528kB active_file:2150236kB inactive_file:4235392kB 
unevictable:33796kB isolated(anon):0kB isolated(file):0kB present:8773632kB
managed:8715068kB mlocked:33796kB dirty:12316kB writeback:460kB 
mapped:21256kB shmem:1532kB slab_reclaimable:414384kB
slab_unreclaimable:250736kB kernel_stack:8424kB pagetables:8752kB 
unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0
all_unreclaimable? no
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499443] lowmem_reserve[]: 0 
0 0 0
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499446] Node 0 DMA: 1*4kB 
(U) 1*8kB (U) 0*16kB 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 
1*1024kB (U) 1*2048kB (R) 3*4096kB (M) = 15884kB
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499458] Node 0 DMA32: 
27954*4kB (UEM) 12312*8kB (UEM) 1713*16kB (UEM) 17*32kB (UMR) 2*64kB (R) 
2*128kB (R) 1*256kB (R) 1*512kB (R) 1*1024kB (R) 0*2048kB 0*4096kB = 
240440kB
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499469] Node 0 Normal: 
31971*4kB (UEM) 31254*8kB (UEM) 2*16kB (EM) 5*32kB (R) 4*64kB (R) 2*128kB 
(R) 0*256kB 0*512kB 1*1024kB (R) 1*2048kB (R) 0*4096kB = 381692kB
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499480] 2380174 total 
pagecache pages
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499482] 0 pages in swap 
cache
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499483] Swap cache stats: 
add 0, delete 0, find 0/0
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499485] Free swap  = 
522236kB
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499486] Total swap 
=522236kB
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.528135] 3145712 pages RAM
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.528135] 69093 pages 
reserved
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.528135] 3401898 pages 
shared
Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.528135] 1067069 pages 
non-shared
...
34501    Jun 26 14:20:03 vsa-00000003-vc-1 kernel: [91864.156051] XFS: 
pid=12642 kmem_alloc success after 13443 retries inum=35646862875 size=35584 
flags=0x4 lflags=0x250


* Re: xfs_iext_realloc_indirect and "XFS: possible memory allocation deadlock"
  2015-06-27 21:01   ` Alex Lyakas
@ 2015-06-28 18:19     ` Alex Lyakas
  2015-06-29 11:43     ` Brian Foster
  2015-06-29 22:26     ` Dave Chinner
  2 siblings, 0 replies; 17+ messages in thread
From: Alex Lyakas @ 2015-06-28 18:19 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Danny Shavit, Shyam Kaushik, Yair Hershko, hch, xfs

Hi Dave,
I have some more observations from further debugging.

1) I tracked down the problematic inodes using xfs_db (I still don't know
their pathnames though). One of them is not problematic anymore (it seems
like it got truncated, or perhaps deleted and its inode number got reused). But
the two other inodes are huge files with hundreds of thousands of extents [1][2].

2) I am doing some experiments comparing writing local files vs the same
workload through NFS, using the same 4KB-data/4KB-hole workload on small 2MB
files. When writing the file locally, I see that xfs_file_buffered_aio_write
is always called with a single 4KB buffer:
xfs_file_buffered_aio_write: inum=100663559 nr_segs=1
seg #0: {.iov_base=0x18db8f0, .iov_len=4096}

But when doing the same workload through NFS:
xfs_file_buffered_aio_write: inum=167772423 nr_segs=2
seg #0: {.iov_base=0xffff88006c1100a8, .iov_len=3928}
seg #1: {.iov_base=0xffff88005556e000, .iov_len=168}
There are always two such buffers in the IOV.

Debugging further into nfsd4_decode_write, it appears that some NFS headers
arrive first in the page and the data starts after them, so an additional page
is required to fit the rest of the 4KB of data.

I am still trying to debug why this results in XFS requiring many more
extents to fit such a workload. I tapped into some functions and I am seeing
the following:

Local workload:
6    xfs_iext_add: ifp=0xffff8800096de6b8 idx=0x0 ext_diff=0x1, nextents=0 
new_size=16 if_bytes=0 if_real_bytes=0
25    xfs_iext_add: ifp=0xffff8800096de6b8 idx=0x1 ext_diff=0x1, nextents=1 
new_size=32 if_bytes=16 if_real_bytes=0
44    xfs_iext_add: ifp=0xffff8800096de6b8 idx=0x1 ext_diff=0x1, nextents=1 
new_size=32 if_bytes=16 if_real_bytes=0
63    xfs_iext_add: ifp=0xffff8800096de6b8 idx=0x1 ext_diff=0x1, nextents=1 
new_size=32 if_bytes=16 if_real_bytes=0
82    xfs_iext_add: ifp=0xffff8800096de6b8 idx=0x1 ext_diff=0x1, nextents=1 
new_size=32 if_bytes=16 if_real_bytes=0
101    xfs_iext_add: ifp=0xffff8800096de6b8 idx=0x1 ext_diff=0x1, nextents=1 
new_size=32 if_bytes=16 if_real_bytes=0
120    xfs_iext_add: ifp=0xffff8800096de6b8 idx=0x1 ext_diff=0x1, nextents=1 
new_size=32 if_bytes=16 if_real_bytes=0
139    xfs_iext_add: ifp=0xffff8800096de6b8 idx=0x1 ext_diff=0x1, nextents=1 
new_size=32 if_bytes=16 if_real_bytes=0
158    xfs_iext_add: ifp=0xffff8800096de6b8 idx=0x1 ext_diff=0x1, nextents=1 
new_size=32 if_bytes=16 if_real_bytes=0
177    xfs_iext_add: ifp=0xffff8800096de6b8 idx=0x1 ext_diff=0x1, nextents=1 
new_size=32 if_bytes=16 if_real_bytes=0
196    xfs_iext_add: ifp=0xffff8800096de6b8 idx=0x1 ext_diff=0x1, nextents=1 
new_size=32 if_bytes=16 if_real_bytes=0
215    xfs_iext_add: ifp=0xffff8800096de6b8 idx=0x1 ext_diff=0x1, nextents=1 
new_size=32 if_bytes=16 if_real_bytes=0
234    xfs_iext_add: ifp=0xffff8800096de6b8 idx=0x1 ext_diff=0x1, nextents=1 
new_size=32 if_bytes=16 if_real_bytes=0
253    xfs_iext_add: ifp=0xffff8800096de6b8 idx=0x1 ext_diff=0x1, nextents=1 
new_size=32 if_bytes=16 if_real_bytes=0
272    xfs_iext_add: ifp=0xffff8800096de6b8 idx=0x1 ext_diff=0x1, nextents=1 
new_size=32 if_bytes=16 if_real_bytes=0
291    xfs_iext_add: ifp=0xffff8800096de6b8 idx=0x1 ext_diff=0x1, nextents=1 
new_size=32 if_bytes=16 if_real_bytes=0
So we always stay within inline extents.

NFS workload:
7    xfs_iext_add: ifp=0xffff8800096df4b8 idx=0x0 ext_diff=0x1, nextents=0 
new_size=16 if_bytes=0 if_real_bytes=0
31    xfs_iext_add: ifp=0xffff8800096df4b8 idx=0x1 ext_diff=0x1, nextents=1 
new_size=32 if_bytes=16 if_real_bytes=0
61    xfs_iext_add: ifp=0xffff8800096df4b8 idx=0x1 ext_diff=0x1, nextents=1 
new_size=32 if_bytes=16 if_real_bytes=0
88    xfs_iext_add: ifp=0xffff8800096df4b8 idx=0x1 ext_diff=0x1, nextents=1 
new_size=32 if_bytes=16 if_real_bytes=0
115    xfs_iext_add: ifp=0xffff8800096df4b8 idx=0x1 ext_diff=0x1, nextents=1 
new_size=32 if_bytes=16 if_real_bytes=0
124    xfs_iext_add: ifp=0xffff8800096df4b8 idx=0x1 ext_diff=0x1, nextents=2 
new_size=48 if_bytes=32 if_real_bytes=0
130    xfs_iext_add: ifp=0xffff8800096df4b8 idx=0x1 ext_diff=0x1, nextents=2 
new_size=48 if_bytes=32 if_real_bytes=0
136    xfs_iext_add: ifp=0xffff8800096df4b8 idx=0x2 ext_diff=0x1, nextents=3 
new_size=64 if_bytes=48 if_real_bytes=128
142    xfs_iext_add: ifp=0xffff8800096df4b8 idx=0x3 ext_diff=0x1, nextents=4 
new_size=80 if_bytes=64 if_real_bytes=64
154    xfs_iext_add: ifp=0xffff8800096df4b8 idx=0x5 ext_diff=0x1, nextents=5 
new_size=96 if_bytes=80 if_real_bytes=128
160    xfs_iext_add: ifp=0xffff8800096df4b8 idx=0x5 ext_diff=0x1, nextents=6 
new_size=112 if_bytes=96 if_real_bytes=128
166    xfs_iext_add: ifp=0xffff8800096df4b8 idx=0x6 ext_diff=0x1, nextents=6 
new_size=112 if_bytes=96 if_real_bytes=128
178    xfs_iext_add: ifp=0xffff8800096df4b8 idx=0x5 ext_diff=0x1, nextents=7 
new_size=128 if_bytes=112 if_real_bytes=128
193    xfs_iext_add: ifp=0xffff8800096df4b8 idx=0x6 ext_diff=0x1, nextents=8 
new_size=144 if_bytes=128 if_real_bytes=128
199    xfs_iext_add: ifp=0xffff8800096df4b8 idx=0x7 ext_diff=0x1, nextents=9 
new_size=160 if_bytes=144 if_real_bytes=256
205    xfs_iext_add: ifp=0xffff8800096df4b8 idx=0x8 ext_diff=0x1, 
nextents=10 new_size=176 if_bytes=160 if_real_bytes=256
217    xfs_iext_add: ifp=0xffff8800096df4b8 idx=0xa ext_diff=0x1, 
nextents=11 new_size=192 if_bytes=176 if_real_bytes=256
226    xfs_iext_add: ifp=0xffff8800096df4b8 idx=0xb ext_diff=0x1, 
nextents=11 new_size=192 if_bytes=176 if_real_bytes=256
244    xfs_iext_add: ifp=0xffff8800096df4b8 idx=0xb ext_diff=0x1, 
nextents=11 new_size=192 if_bytes=176 if_real_bytes=256
262    xfs_iext_add: ifp=0xffff8800096df4b8 idx=0xc ext_diff=0x1, 
nextents=12 new_size=208 if_bytes=192 if_real_bytes=256
271    xfs_iext_add: ifp=0xffff8800096df4b8 idx=0xb ext_diff=0x1, 
nextents=13 new_size=224 if_bytes=208 if_real_bytes=256
277    xfs_iext_add: ifp=0xffff8800096df4b8 idx=0xb ext_diff=0x1, 
nextents=13 new_size=224 if_bytes=208 if_real_bytes=256
295    xfs_iext_add: ifp=0xffff8800096df4b8 idx=0xd ext_diff=0x1, 
nextents=14 new_size=240 if_bytes=224 if_real_bytes=256
316    xfs_iext_add: ifp=0xffff8800096df4b8 idx=0xe ext_diff=0x1, 
nextents=14 new_size=240 if_bytes=224 if_real_bytes=256
343    xfs_iext_add: ifp=0xffff8800096df4b8 idx=0xe ext_diff=0x1, 
nextents=14 new_size=240 if_bytes=224 if_real_bytes=256
370    xfs_iext_add: ifp=0xffff8800096df4b8 idx=0xe ext_diff=0x1, 
nextents=14 new_size=240 if_bytes=224 if_real_bytes=256
382    xfs_iext_add: ifp=0xffff8800096df4b8 idx=0xe ext_diff=0x1, 
nextents=14 new_size=240 if_bytes=224 if_real_bytes=256
406    xfs_iext_add: ifp=0xffff8800096df4b8 idx=0xe ext_diff=0x1, 
nextents=15 new_size=256 if_bytes=240 if_real_bytes=256
412    xfs_iext_add: ifp=0xffff8800096df4b8 idx=0xe ext_diff=0x1, 
nextents=15 new_size=256 if_bytes=240 if_real_bytes=256
418    xfs_iext_add: ifp=0xffff8800096df4b8 idx=0xf ext_diff=0x1, 
nextents=16 new_size=272 if_bytes=256 if_real_bytes=256
424    xfs_iext_add: ifp=0xffff8800096df4b8 idx=0x10 ext_diff=0x1, 
nextents=17 new_size=288 if_bytes=272 if_real_bytes=512
430    xfs_iext_add: ifp=0xffff8800096df4b8 idx=0x11 ext_diff=0x1, 
nextents=18 new_size=304 if_bytes=288 if_real_bytes=512
439    xfs_iext_add: ifp=0xffff8800096df4b8 idx=0x13 ext_diff=0x1, 
nextents=19 new_size=320 if_bytes=304 if_real_bytes=512

The number of extents keeps growing, but I still could not see why this is
happening. Can you please give a hint why?

3) I tried to see what is the largest file XFS can maintain with this
4KB-data/4KB-hole workload on a VM with 5GB RAM. I was able to reach 146GB
and almost 9M extents. There were a lot of "memory allocation deadlock"
messages popping up, but eventually the allocation would succeed. Until finally,
an allocation could not succeed for 3 minutes and a hung-task panic occurred.
With a no-hole 32KB workload, XFS successfully created a 400GB file with
92789 extents.

Thanks,
Alex.





[1]
xfs_db> inode 35741520897
xfs_db> p
core.magic = 0x494e
core.mode = 0100600
core.version = 2
core.format = 3 (btree)
core.nlinkv2 = 1
core.onlink = 0
core.projid_lo = 0
core.projid_hi = 0
core.uid = 0
core.gid = 0
core.flushiter = 454
core.atime.sec = Sat Jun 13 00:02:19 2015
core.atime.nsec = 698625000
core.mtime.sec = Tue Jun 16 02:36:42 2015
core.mtime.nsec = 786938999
core.ctime.sec = Sat Jun 27 03:21:14 2015
core.ctime.nsec = 899709747
core.size = 180676487417
core.nblocks = 44114144
core.extsize = 0
core.nextents = 928797
core.naextents = 0
core.forkoff = 0
core.aformat = 2 (extents)
core.dmevmask = 0
core.dmstate = 0
core.newrtbm = 0
core.prealloc = 0
core.realtime = 0
core.immutable = 0
core.append = 0
core.sync = 0
core.noatime = 0
core.nodump = 0
core.rtinherit = 0
core.projinherit = 0
core.nosymlinks = 0
core.extsz = 0
core.extszinherit = 0
core.nodefrag = 0
core.filestream = 0
core.gen = 1092231143
next_unlinked = null
u.bmbt.level = 3
u.bmbt.numrecs = 1
u.bmbt.keys[1] = [startoff] 1:[0]
u.bmbt.ptrs[1] = 1:2541992517

[2]
xfs_db> inode 35646862875
xfs_db> p
core.magic = 0x494e
core.mode = 0100600
core.version = 2
core.format = 3 (btree)
core.nlinkv2 = 1
core.onlink = 0
core.projid_lo = 0
core.projid_hi = 0
core.uid = 0
core.gid = 0
core.flushiter = 1321
core.atime.sec = Fri Jun 26 11:07:27 2015
core.atime.nsec = 642685048
core.mtime.sec = Fri Jun 26 14:20:57 2015
core.mtime.nsec = 840214048
core.ctime.sec = Fri Jun 26 14:21:07 2015
core.ctime.nsec = 616214048
core.size = 149676922744
core.nblocks = 36544482
core.extsize = 0
core.nextents = 572425
core.naextents = 0
core.forkoff = 0
core.aformat = 2 (extents)
core.dmevmask = 0
core.dmstate = 0
core.newrtbm = 0
core.prealloc = 0
core.realtime = 0
core.immutable = 0
core.append = 0
core.sync = 0
core.noatime = 0
core.nodump = 0
core.rtinherit = 0
core.projinherit = 0
core.nosymlinks = 0
core.extsz = 0
core.extszinherit = 0
core.nodefrag = 0
core.filestream = 0
core.gen = 3873914382
next_unlinked = null
u.bmbt.level = 2
u.bmbt.numrecs = 9
u.bmbt.keys[1-9] = [startoff] 1:[0] 2:[3764720] 3:[7291520] 4:[11373840] 
5:[16320112] 6:[20521147] 7:[23938933] 8:[28154576] 9:[32607104]
u.bmbt.ptrs[1-9] = 1:9885338478 2:9892028257 3:5911725342 4:10259977998 
5:9907385089 6:10056188649 7:10058512073 8:9917653582 9:9922453713







* Re: xfs_iext_realloc_indirect and "XFS: possible memory allocation deadlock"
  2015-06-27 21:01   ` Alex Lyakas
  2015-06-28 18:19     ` Alex Lyakas
@ 2015-06-29 11:43     ` Brian Foster
  2015-06-29 17:59       ` Alex Lyakas
  2015-06-29 22:26     ` Dave Chinner
  2 siblings, 1 reply; 17+ messages in thread
From: Brian Foster @ 2015-06-29 11:43 UTC (permalink / raw)
  To: Alex Lyakas; +Cc: Shyam Kaushik, Yair Hershko, xfs, Danny Shavit, hch

On Sat, Jun 27, 2015 at 11:01:30PM +0200, Alex Lyakas wrote:
> Hi Dave,
> 
> First, to answer your questions:
> - this filesystem is accessed only through NFS, so I guess we can say this
> is a "dedicated" NFS server
> - yes, both NFSv3 and NFSv4 are used (by different servers), and definitely
> they may attempt to access the same file
> 
> I did some further debugging by patching XFS to get some more info on how
> much memory is requested, and who requests it. I added a kmem_realloc
> variant that is called only by xfs_iext_realloc_indirect. This new function
> (kmem_realloc_xfs_iext_realloc_indirect) also prints the inode number, which
> needs the memory. This is after some fiddling to check that "ifp" is the one
> embedded in xfs_inode and not the one allocated through kmem_cache (attr
> fork). After memory allocation fails for the first time, I also call
> handle_sysrq('m'). Finally, I print after how many retries the allocation
> succeeded. I also added some info to the normal kmem_alloc function.
> 
> Results are following:
> - memory allocation failures happened only on the
> kmem_realloc_xfs_iext_realloc_indirect path for now
> - XFS hits memory re-allocation failures when it needs to allocate about
> 35KB. Sometimes allocation succeeds after few retries, but sometimes it
> takes several thousands of retries.
> - All allocation failures happened on NFSv3 paths
> - Three inode numbers were reported as failing memory allocations. After
> several hours, "find -inum" is still searching for these inodes...this is a
> huge filesystem... Is there any other quicker (XFS-specific?) way to find
> the file based on inode number?
> 
> Please see a typical allocation failure in [1]
> 
> Any recommendation how to move forward with this issue?
> 
> Additional observation that I saw in my local system: writing files to XFS
> locally vs writing the same files via NFS (both 3 and 4), the amount of
> extents reported by "xfs_bmap" is much higher for the NFS case. For example,
> creating a new file and writing into it as follows:
> - write 4KB
> - skip 4KB (i.e., lseek to 4KB + 4KB)
> - write 4KB
> - skip 4KB
> ...
> Create a file of say 50MB this way.
> 
> Locally it ends up with very few (1-5) extents. But same exact workload
> through NFS results in several thousands of extents. The filesystem is
> mounted as "sync" in both cases. This is currently just an observation,
> which I will try to debug further. Are you perhaps familiar with such
> behavior? Can you perhaps try and check whether it also happens in your
> environment?
> 

I believe this is usually the work of speculative preallocation[0]. It
basically allocates more blocks than are immediately needed for a
file-extending allocation, precisely to avoid this kind of
fragmentation. The xfs_iomap_prealloc_size tracepoint was added to help
observe the behavior, but it looks like that went in 3.10 and you're on
3.8. It was also more recently disabled for files smaller than 64k, but
again that happened in 3.11.
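
On a kernel new enough to have that tracepoint, it can be watched via
ftrace, roughly like this (assuming debugfs is mounted in the usual
place):

echo 1 > /sys/kernel/debug/tracing/events/xfs/xfs_iomap_prealloc_size/enable
cat /sys/kernel/debug/tracing/trace_pipe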

Anyways, it seems like perhaps it is working normally locally, but for
some reason not working appropriately over NFS..? It might be a good
idea to post xfs_info and mount options if you haven't.

Brian

[0] http://xfs.org/index.php/XFS_FAQ#Q:_What_is_speculative_preallocation.3F

> Thanks for your help,
> Alex.
> 
> 
> 
> [1]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.497033] XFS: pid=12642
> kmem_alloc failure inum=35646862875 size=35584 flags=0x4 lflags=0x250
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498407] Pid: 12642, comm:
> nfsd Tainted: GF          O 3.8.13-030813-generic #201305111843
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498410] Call Trace:
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498450]
> [<ffffffffa07a832e>] kmem_alloc_xfs_iext_realloc_indirect+0x16e/0x1f0 [xfs]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498465]
> [<ffffffffa07a86b3>] kmem_realloc_xfs_iext_realloc_indirect+0x33/0x80 [xfs]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498482]
> [<ffffffffa07db2a0>] xfs_iext_realloc_indirect+0x40/0x60 [xfs]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498498]
> [<ffffffffa07db2ff>] xfs_iext_irec_new+0x3f/0x180 [xfs]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498516]
> [<ffffffffa07db5b9>] xfs_iext_add_indirect_multi+0x179/0x2b0 [xfs]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498532]
> [<ffffffffa07db9be>] xfs_iext_add+0xce/0x290 [xfs]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498539]
> [<ffffffff81076634>] ? wake_up_worker+0x24/0x30
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498553]
> [<ffffffffa07dbbd2>] xfs_iext_insert+0x52/0x100 [xfs]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498569]
> [<ffffffffa07b6ed3>] ? xfs_bmap_add_extent_hole_delay+0xd3/0x6a0 [xfs]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498583]
> [<ffffffffa07b6ed3>] xfs_bmap_add_extent_hole_delay+0xd3/0x6a0 [xfs]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498598]
> [<ffffffffa07da3fc>] ? xfs_iext_bno_to_ext+0x8c/0x160 [xfs]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498614]
> [<ffffffffa07b7719>] xfs_bmapi_reserve_delalloc+0x279/0x2a0 [xfs]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498628]
> [<ffffffffa07be742>] xfs_bmapi_delay+0x122/0x270 [xfs]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498642]
> [<ffffffffa079d703>] xfs_iomap_write_delay+0x173/0x320 [xfs]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498656]
> [<ffffffffa07be42c>] ? xfs_bmapi_read+0xfc/0x2f0 [xfs]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498660]
> [<ffffffff8135d8f3>] ? call_rwsem_down_write_failed+0x13/0x20
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498665]
> [<ffffffff81195bbc>] ? lookup_page_cgroup+0x4c/0x50
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498677]
> [<ffffffffa078ab40>] __xfs_get_blocks+0x280/0x550 [xfs]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498688]
> [<ffffffffa078ae41>] xfs_get_blocks+0x11/0x20 [xfs]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498695]
> [<ffffffff811cf77e>] __block_write_begin+0x1ae/0x4e0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498705]
> [<ffffffffa078ae30>] ? xfs_get_blocks_direct+0x20/0x20 [xfs]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498708]
> [<ffffffff81135fff>] ? grab_cache_page_write_begin+0x8f/0xf0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498720]
> [<ffffffffa078a09f>] xfs_vm_write_begin+0x5f/0xe0 [xfs]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498725]
> [<ffffffff8113552a>] generic_perform_write+0xca/0x210
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498729]
> [<ffffffff811356cd>] generic_file_buffered_write+0x5d/0x90
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498740]
> [<ffffffffa07952d5>] xfs_file_buffered_aio_write+0x115/0x1c0 [xfs]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498744]
> [<ffffffff816159f4>] ? ip_finish_output+0x224/0x3b0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498758]
> [<ffffffffa079547c>] xfs_file_aio_write+0xfc/0x1b0 [xfs]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498769]
> [<ffffffffa0795380>] ? xfs_file_buffered_aio_write+0x1c0/0x1c0 [xfs]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498773]
> [<ffffffff8119b8c3>] do_sync_readv_writev+0xa3/0xe0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498776]
> [<ffffffff8119bb8d>] do_readv_writev+0xcd/0x1d0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498780]
> [<ffffffff810877e0>] ? set_groups+0x40/0x60
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498805]
> [<ffffffffa033a6b0>] ? nfsd_setuser+0x120/0x2b0 [nfsd]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498808]
> [<ffffffff8119bccc>] vfs_writev+0x3c/0x50
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498815]
> [<ffffffffa0333dd2>] nfsd_vfs_write.isra.12+0x92/0x350 [nfsd]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498818]
> [<ffffffff8119a6cb>] ? dentry_open+0x6b/0xd0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498825]
> [<ffffffffa0336679>] nfsd_write+0xf9/0x110 [nfsd]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498834]
> [<ffffffffa0340dd1>] nfsd3_proc_write+0xb1/0x140 [nfsd]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498841]
> [<ffffffffa032fd62>] nfsd_dispatch+0x102/0x270 [nfsd]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498865]
> [<ffffffffa0103b48>] svc_process_common+0x328/0x5e0 [sunrpc]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498876]
> [<ffffffffa0104153>] svc_process+0x103/0x160 [sunrpc]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498882]
> [<ffffffffa032f72f>] nfsd+0xbf/0x130 [nfsd]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498888]
> [<ffffffffa032f670>] ? nfsd_destroy+0x80/0x80 [nfsd]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498892]
> [<ffffffff8107f050>] kthread+0xc0/0xd0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498895]
> [<ffffffff8107ef90>] ? flush_kthread_worker+0xb0/0xb0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498900]
> [<ffffffff816f61ec>] ret_from_fork+0x7c/0xb0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498903]
> [<ffffffff8107ef90>] ? flush_kthread_worker+0xb0/0xb0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498905] SysRq : Show Memory
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499385] Mem-Info:
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499388] Node 0 DMA per-cpu:
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499391] CPU    0: hi:    0,
> btch:   1 usd:   0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499392] CPU    1: hi:    0,
> btch:   1 usd:   0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499394] CPU    2: hi:    0,
> btch:   1 usd:   0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499395] CPU    3: hi:    0,
> btch:   1 usd:   0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499396] Node 0 DMA32
> per-cpu:
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499399] CPU    0: hi:  186,
> btch:  31 usd:   0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499400] CPU    1: hi:  186,
> btch:  31 usd:   0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499402] CPU    2: hi:  186,
> btch:  31 usd:   0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499403] CPU    3: hi:  186,
> btch:  31 usd:   0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499404] Node 0 Normal
> per-cpu:
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499406] CPU    0: hi:  186,
> btch:  31 usd:  23
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499407] CPU    1: hi:  186,
> btch:  31 usd:  25
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499409] CPU    2: hi:  186,
> btch:  31 usd:   0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499410] CPU    3: hi:  186,
> btch:  31 usd:   7
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499415] active_anon:2143
> inactive_anon:44181 isolated_anon:0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499415]  active_file:913373
> inactive_file:1464930 isolated_file:0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499415]  unevictable:8449
> dirty:6742 writeback:115 unstable:0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499415]  free:159418
> slab_reclaimable:146857 slab_unreclaimable:66681
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499415]  mapped:5561
> shmem:383 pagetables:2195 bounce:0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499415]  free_cma:0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499418] Node 0 DMA
> free:15884kB min:84kB low:104kB high:124kB active_anon:0kB inactive_anon:0kB
> active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB
> isolated(file):0kB present:15628kB managed:15884kB mlocked:0kB dirty:0kB
> writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB
> slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB
> bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable?
> yes
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499423] lowmem_reserve[]: 0
> 3512 12080 12080
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499427] Node 0 DMA32
> free:240040kB min:19624kB low:24528kB high:29436kB active_anon:8kB
> inactive_anon:15196kB active_file:1503256kB inactive_file:1624328kB
> unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3596532kB
> managed:3546428kB mlocked:0kB dirty:14652kB writeback:0kB mapped:988kB
> shmem:0kB slab_reclaimable:173044kB slab_unreclaimable:15988kB
> kernel_stack:248kB pagetables:28kB unstable:0kB bounce:0kB free_cma:0kB
> writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499432] lowmem_reserve[]: 0
> 0 8568 8568
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499435] Node 0 Normal
> free:381748kB min:47872kB low:59840kB high:71808kB active_anon:8564kB
> inactive_anon:161528kB active_file:2150236kB inactive_file:4235392kB
> unevictable:33796kB isolated(anon):0kB isolated(file):0kB present:8773632kB
> managed:8715068kB mlocked:33796kB dirty:12316kB writeback:460kB
> mapped:21256kB shmem:1532kB slab_reclaimable:414384kB
> slab_unreclaimable:250736kB kernel_stack:8424kB pagetables:8752kB
> unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0
> all_unreclaimable? no
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499443] lowmem_reserve[]: 0
> 0 0 0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499446] Node 0 DMA: 1*4kB
> (U) 1*8kB (U) 0*16kB 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB
> 1*1024kB (U) 1*2048kB (R) 3*4096kB (M) = 15884kB
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499458] Node 0 DMA32:
> 27954*4kB (UEM) 12312*8kB (UEM) 1713*16kB (UEM) 17*32kB (UMR) 2*64kB (R)
> 2*128kB (R) 1*256kB (R) 1*512kB (R) 1*1024kB (R) 0*2048kB 0*4096kB =
> 240440kB
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499469] Node 0 Normal:
> 31971*4kB (UEM) 31254*8kB (UEM) 2*16kB (EM) 5*32kB (R) 4*64kB (R) 2*128kB
> (R) 0*256kB 0*512kB 1*1024kB (R) 1*2048kB (R) 0*4096kB = 381692kB
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499480] 2380174 total
> pagecache pages
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499482] 0 pages in swap
> cache
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499483] Swap cache stats:
> add 0, delete 0, find 0/0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499485] Free swap  =
> 522236kB
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499486] Total swap
> =522236kB
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.528135] 3145712 pages RAM
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.528135] 69093 pages
> reserved
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.528135] 3401898 pages
> shared
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.528135] 1067069 pages
> non-shared
> ...
> 34501    Jun 26 14:20:03 vsa-00000003-vc-1 kernel: [91864.156051] XFS:
> pid=12642 kmem_alloc success after 13443 retries inum=35646862875 size=35584
> flags=0x4 lflags=0x250
> 
> -----Original Message----- From: Dave Chinner
> Sent: 23 June, 2015 10:18 PM
> To: Alex Lyakas
> Cc: xfs@oss.sgi.com ; hch@lst.de
> Subject: Re: xfs_iext_realloc_indirect and "XFS: possible memory allocation
> deadlock"
> 
> On Tue, Jun 23, 2015 at 10:18:21AM +0200, Alex Lyakas wrote:
> >Greetings,
> >
> >We are hitting an issue with XFS printing messages like “XFS:
> >possible memory allocation deadlock in kmem_alloc
> >(mode:0x250)” and stack trace like in [1]. Eventually,
> >hung-task panic kicks in with stack traces like [2].
> >
> >We are running kernel 3.8.13. I see that in
> >http://oss.sgi.com/archives/xfs/2012-01/msg00341.html a similar
> >issue has been discussed, but no code changes followed comparing
> >to what we have in 3.8.13.
> >
> >Any suggestion on how to move forward with this problem? For
> >example, does this memory has to be really allocated with kmalloc
> >(i.e., physically continuous) or vmalloc can be used?
> 
> We left it alone because it is relatively rare for people to hit it,
> and generally it indicates a severe fragmentation problem when they
> do hit it (i.e. a file with millions of extents in it). Can you
> track down the file that this is occurring against and see how many
> extents it has?
> 
> i.e. you may be much better off by taking measures to avoid excessive
> fragmentation than removing the canary from the mine...
> 
> >[109626.075483] nfsd            D 0000000000000002     0 20042      2
> >0x00000000
> 
> Hmmm - it's also a file written by the NFS server - this is is on an
> a dedicated NFS server?
> 
> >[109626.075483]  [<ffffffffa01be6b0>] ? nfsd_setuser+0x120/0x2b0 [nfsd]
> >[109626.075483]  [<ffffffff8119bccc>] ? vfs_writev+0x3c/0x50
> >[109626.075483]  [<ffffffffa01b7dd2>] ? nfsd_vfs_write.isra.12+0x92/0x350
> >[nfsd]
> >[109626.075483]  [<ffffffff8119a6cb>] ? dentry_open+0x6b/0xd0
> >[109626.075483]  [<ffffffffa01ba679>] ? nfsd_write+0xf9/0x110 [nfsd]
> >[109626.075483]  [<ffffffffa01c4dd1>] ? nfsd3_proc_write+0xb1/0x140 [nfsd]
> 
> Interesting that this is an NFSv3 write...
> 
> >[87303.976119] INFO: task nfsd:5684 blocked for more than 180 seconds.
> >[87303.976976] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
> >this message.
> >[87303.978012] nfsd            D 0000000000000003     0  5684      2
> >0x00000000
> ....
> >[87303.978174]  [<ffffffffa0269623>] nfsd_write+0xa3/0x110 [nfsd]
> >[87303.978182]  [<ffffffffa027794c>] nfsd4_write+0x1cc/0x250 [nfsd]
> >[87303.978189]  [<ffffffffa027746c>] nfsd4_proc_compound+0x5ac/0x7a0
> >[nfsd]
> 
> And that is a NFsv4 write. You have multiple clients writing to the
> same file using different versions of the NFS protocol?
> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
> 
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: xfs_iext_realloc_indirect and "XFS: possible memory allocation deadlock"
  2015-06-29 11:43     ` Brian Foster
@ 2015-06-29 17:59       ` Alex Lyakas
  2015-06-29 19:02         ` Brian Foster
  0 siblings, 1 reply; 17+ messages in thread
From: Alex Lyakas @ 2015-06-29 17:59 UTC (permalink / raw)
  To: Brian Foster; +Cc: Shyam Kaushik, Yair Hershko, xfs, Danny Shavit, hch

Hi Brian,
Thanks for your comments.

Here is the information you asked for:

meta-data=/dev/dm-147            isize=256    agcount=67, agsize=268435440 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=17825792000, imaxpct=5
         =                       sunit=16     swidth=160 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=521728, version=2
         =                       sectsz=512   sunit=16 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

Mount options:
/dev/dm-147 /export/nfsvol xfs rw,sync,noatime,wsync,attr2,discard,inode64,allocsize=64k,logbsize=64k,sunit=128,swidth=1280,noquota 0 0

So yes, we are using "allocsize=64k", which influences the speculative
preallocation logic. I did various experiments, and indeed when I remove
"allocsize=64k", fragmentation is much lower. (I also tried other things,
like using a single nfsd thread, mounting without "sync" and patching nfsd
to provide "nicer" IOVs to vfs_write, but none of these helped.) On the
other hand, we started using "allocsize=64k" in the first place to prevent
the aggressive preallocation we saw XFS doing on large QCOW files (VM
images).

Still, when doing local IO to a mounted XFS, even with "allocsize=64k", we
get very few extents. We still don't know why there is this difference
between local IO and NFS. It would be great to get a clue about that
phenomenon.
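
A sketch of the kind of strided-write test described above (paths, sizes
and the use of dd are illustrative, not the exact commands we ran); the
same loop can be pointed at the NFS-mounted path for the comparison:

# ~50MB of data written as 4KB chunks with 4KB holes in between
for i in $(seq 0 2 24999); do
	dd if=/dev/zero of=/mnt/xfs/testfile bs=4k count=1 seek=$i conv=notrunc oflag=sync 2>/dev/null
done
# count the resulting extent records (run on the XFS server for the NFS case)
xfs_bmap -v /mnt/xfs/testfile | tail -n +3 | wc -l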



Thanks,
Alex.




-----Original Message----- 
From: Brian Foster
Sent: 29 June, 2015 1:43 PM
To: Alex Lyakas
Cc: Dave Chinner ; Danny Shavit ; Shyam Kaushik ; Yair Hershko ; hch@lst.de 
; xfs@oss.sgi.com
Subject: Re: xfs_iext_realloc_indirect and "XFS: possible memory allocation 
deadlock"

On Sat, Jun 27, 2015 at 11:01:30PM +0200, Alex Lyakas wrote:
> Hi Dave,
>
> First, to answer your questions:
> - this filesystem is accessed only through NFS, so I guess we can say this
> is a "dedicated" NFS server
> - yes, both NFSv3 and NFSv4 are used (by different servers), and 
> definitely
> they may attempt to access the same file
>
> I did some further debugging by patching XFS to get some more info on how
> much memory is requested, and who requests it. I added a kmem_realloc
> variant that is called only by xfs_iext_realloc_indirect. This new 
> function
> (kmem_realloc_xfs_iext_realloc_indirect) also prints the inode number, 
> which
> needs the memory. This is after some fiddling to check that "ifp" is the 
> one
> embedded in xfs_inode and not the one allocated through kmem_cache (attr
> fork). After memory allocation fails for the first time, I also call
> handle_sysrq('m'). Finally, I print after how many retries the allocation
> succeeded. I also added some info to the normal kmem_alloc function.
>
> Results are following:
> - memory allocation failures happened only on the
> kmem_realloc_xfs_iext_realloc_indirect path for now
> - XFS hits memory re-allocation failures when it needs to allocate about
> 35KB. Sometimes allocation succeeds after few retries, but sometimes it
> takes several thousands of retries.
> - All allocation failures happened on NFSv3 paths
> - Three inode numbers were reported as failing memory allocations. After
> several hours, "find -inum" is still searching for these inodes...this is 
> a
> huge filesystem... Is there any other quicker (XFS-specific?) way to find
> the file based on inode number?
>
> Please see a typical allocation failure in [1]
>
> Any recommendation how to move forward with this issue?
>
> Additional observation that I saw in my local system: writing files to XFS
> locally vs writing the same files via NFS (both 3 and 4), the amount of
> extents reported by "xfs_bmap" is much higher for the NFS case. For 
> example,
> creating a new file and writing into it as follows:
> - write 4KB
> - skip 4KB (i.e., lseek to 4KB + 4KB)
> - write 4KB
> - skip 4KB
> ...
> Create a file of say 50MB this way.
>
> Locally it ends up with very few (1-5) extents. But same exact workload
> through NFS results in several thousands of extents. The filesystem is
> mounted as "sync" in both cases. This is currently just an observation,
> which I will try to debug further. Are you perhaps familiar with such
> behavior? Can you perhaps try and check whether it also happens in your
> environment?
>

I believe this is usually the work of speculative preallocation[0]. It
basically should allocate more blocks than necessary for a file
extending allocation to avoid this kind of fragmentation. The
xfs_iomap_prealloc_size tracepoint was added to help observe behavior,
but it looks like that went in 3.10 and you're on 3.8. It was also more
recently disabled for files smaller than 64k, but again that happened in
3.11.

Anyways, it seems like perhaps it is working normally locally, but for
some reason not working appropriately over NFS..? It might be a good
idea to post xfs_info and mount options if you haven't.

Brian

[0] http://xfs.org/index.php/XFS_FAQ#Q:_What_is_speculative_preallocation.3F

> Thanks for your help,
> Alex.
>
>
>
> [1]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.497033] XFS: pid=12642
> kmem_alloc failure inum=35646862875 size=35584 flags=0x4 lflags=0x250
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498407] Pid: 12642, comm:
> nfsd Tainted: GF          O 3.8.13-030813-generic #201305111843
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498410] Call Trace:
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498450]
> [<ffffffffa07a832e>] kmem_alloc_xfs_iext_realloc_indirect+0x16e/0x1f0 
> [xfs]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498465]
> [<ffffffffa07a86b3>] kmem_realloc_xfs_iext_realloc_indirect+0x33/0x80 
> [xfs]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498482]
> [<ffffffffa07db2a0>] xfs_iext_realloc_indirect+0x40/0x60 [xfs]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498498]
> [<ffffffffa07db2ff>] xfs_iext_irec_new+0x3f/0x180 [xfs]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498516]
> [<ffffffffa07db5b9>] xfs_iext_add_indirect_multi+0x179/0x2b0 [xfs]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498532]
> [<ffffffffa07db9be>] xfs_iext_add+0xce/0x290 [xfs]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498539]
> [<ffffffff81076634>] ? wake_up_worker+0x24/0x30
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498553]
> [<ffffffffa07dbbd2>] xfs_iext_insert+0x52/0x100 [xfs]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498569]
> [<ffffffffa07b6ed3>] ? xfs_bmap_add_extent_hole_delay+0xd3/0x6a0 [xfs]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498583]
> [<ffffffffa07b6ed3>] xfs_bmap_add_extent_hole_delay+0xd3/0x6a0 [xfs]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498598]
> [<ffffffffa07da3fc>] ? xfs_iext_bno_to_ext+0x8c/0x160 [xfs]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498614]
> [<ffffffffa07b7719>] xfs_bmapi_reserve_delalloc+0x279/0x2a0 [xfs]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498628]
> [<ffffffffa07be742>] xfs_bmapi_delay+0x122/0x270 [xfs]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498642]
> [<ffffffffa079d703>] xfs_iomap_write_delay+0x173/0x320 [xfs]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498656]
> [<ffffffffa07be42c>] ? xfs_bmapi_read+0xfc/0x2f0 [xfs]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498660]
> [<ffffffff8135d8f3>] ? call_rwsem_down_write_failed+0x13/0x20
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498665]
> [<ffffffff81195bbc>] ? lookup_page_cgroup+0x4c/0x50
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498677]
> [<ffffffffa078ab40>] __xfs_get_blocks+0x280/0x550 [xfs]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498688]
> [<ffffffffa078ae41>] xfs_get_blocks+0x11/0x20 [xfs]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498695]
> [<ffffffff811cf77e>] __block_write_begin+0x1ae/0x4e0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498705]
> [<ffffffffa078ae30>] ? xfs_get_blocks_direct+0x20/0x20 [xfs]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498708]
> [<ffffffff81135fff>] ? grab_cache_page_write_begin+0x8f/0xf0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498720]
> [<ffffffffa078a09f>] xfs_vm_write_begin+0x5f/0xe0 [xfs]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498725]
> [<ffffffff8113552a>] generic_perform_write+0xca/0x210
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498729]
> [<ffffffff811356cd>] generic_file_buffered_write+0x5d/0x90
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498740]
> [<ffffffffa07952d5>] xfs_file_buffered_aio_write+0x115/0x1c0 [xfs]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498744]
> [<ffffffff816159f4>] ? ip_finish_output+0x224/0x3b0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498758]
> [<ffffffffa079547c>] xfs_file_aio_write+0xfc/0x1b0 [xfs]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498769]
> [<ffffffffa0795380>] ? xfs_file_buffered_aio_write+0x1c0/0x1c0 [xfs]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498773]
> [<ffffffff8119b8c3>] do_sync_readv_writev+0xa3/0xe0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498776]
> [<ffffffff8119bb8d>] do_readv_writev+0xcd/0x1d0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498780]
> [<ffffffff810877e0>] ? set_groups+0x40/0x60
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498805]
> [<ffffffffa033a6b0>] ? nfsd_setuser+0x120/0x2b0 [nfsd]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498808]
> [<ffffffff8119bccc>] vfs_writev+0x3c/0x50
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498815]
> [<ffffffffa0333dd2>] nfsd_vfs_write.isra.12+0x92/0x350 [nfsd]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498818]
> [<ffffffff8119a6cb>] ? dentry_open+0x6b/0xd0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498825]
> [<ffffffffa0336679>] nfsd_write+0xf9/0x110 [nfsd]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498834]
> [<ffffffffa0340dd1>] nfsd3_proc_write+0xb1/0x140 [nfsd]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498841]
> [<ffffffffa032fd62>] nfsd_dispatch+0x102/0x270 [nfsd]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498865]
> [<ffffffffa0103b48>] svc_process_common+0x328/0x5e0 [sunrpc]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498876]
> [<ffffffffa0104153>] svc_process+0x103/0x160 [sunrpc]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498882]
> [<ffffffffa032f72f>] nfsd+0xbf/0x130 [nfsd]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498888]
> [<ffffffffa032f670>] ? nfsd_destroy+0x80/0x80 [nfsd]
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498892]
> [<ffffffff8107f050>] kthread+0xc0/0xd0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498895]
> [<ffffffff8107ef90>] ? flush_kthread_worker+0xb0/0xb0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498900]
> [<ffffffff816f61ec>] ret_from_fork+0x7c/0xb0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498903]
> [<ffffffff8107ef90>] ? flush_kthread_worker+0xb0/0xb0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498905] SysRq : Show 
> Memory
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499385] Mem-Info:
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499388] Node 0 DMA 
> per-cpu:
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499391] CPU    0: hi: 
> 0,
> btch:   1 usd:   0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499392] CPU    1: hi: 
> 0,
> btch:   1 usd:   0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499394] CPU    2: hi: 
> 0,
> btch:   1 usd:   0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499395] CPU    3: hi: 
> 0,
> btch:   1 usd:   0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499396] Node 0 DMA32
> per-cpu:
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499399] CPU    0: hi: 
> 186,
> btch:  31 usd:   0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499400] CPU    1: hi: 
> 186,
> btch:  31 usd:   0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499402] CPU    2: hi: 
> 186,
> btch:  31 usd:   0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499403] CPU    3: hi: 
> 186,
> btch:  31 usd:   0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499404] Node 0 Normal
> per-cpu:
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499406] CPU    0: hi: 
> 186,
> btch:  31 usd:  23
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499407] CPU    1: hi: 
> 186,
> btch:  31 usd:  25
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499409] CPU    2: hi: 
> 186,
> btch:  31 usd:   0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499410] CPU    3: hi: 
> 186,
> btch:  31 usd:   7
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499415] active_anon:2143
> inactive_anon:44181 isolated_anon:0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499415] 
> active_file:913373
> inactive_file:1464930 isolated_file:0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499415]  unevictable:8449
> dirty:6742 writeback:115 unstable:0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499415]  free:159418
> slab_reclaimable:146857 slab_unreclaimable:66681
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499415]  mapped:5561
> shmem:383 pagetables:2195 bounce:0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499415]  free_cma:0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499418] Node 0 DMA
> free:15884kB min:84kB low:104kB high:124kB active_anon:0kB 
> inactive_anon:0kB
> active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB
> isolated(file):0kB present:15628kB managed:15884kB mlocked:0kB dirty:0kB
> writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB
> slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB
> bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 
> all_unreclaimable?
> yes
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499423] lowmem_reserve[]: 
> 0
> 3512 12080 12080
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499427] Node 0 DMA32
> free:240040kB min:19624kB low:24528kB high:29436kB active_anon:8kB
> inactive_anon:15196kB active_file:1503256kB inactive_file:1624328kB
> unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3596532kB
> managed:3546428kB mlocked:0kB dirty:14652kB writeback:0kB mapped:988kB
> shmem:0kB slab_reclaimable:173044kB slab_unreclaimable:15988kB
> kernel_stack:248kB pagetables:28kB unstable:0kB bounce:0kB free_cma:0kB
> writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499432] lowmem_reserve[]: 
> 0
> 0 8568 8568
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499435] Node 0 Normal
> free:381748kB min:47872kB low:59840kB high:71808kB active_anon:8564kB
> inactive_anon:161528kB active_file:2150236kB inactive_file:4235392kB
> unevictable:33796kB isolated(anon):0kB isolated(file):0kB 
> present:8773632kB
> managed:8715068kB mlocked:33796kB dirty:12316kB writeback:460kB
> mapped:21256kB shmem:1532kB slab_reclaimable:414384kB
> slab_unreclaimable:250736kB kernel_stack:8424kB pagetables:8752kB
> unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0
> all_unreclaimable? no
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499443] lowmem_reserve[]: 
> 0
> 0 0 0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499446] Node 0 DMA: 1*4kB
> (U) 1*8kB (U) 0*16kB 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB
> 1*1024kB (U) 1*2048kB (R) 3*4096kB (M) = 15884kB
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499458] Node 0 DMA32:
> 27954*4kB (UEM) 12312*8kB (UEM) 1713*16kB (UEM) 17*32kB (UMR) 2*64kB (R)
> 2*128kB (R) 1*256kB (R) 1*512kB (R) 1*1024kB (R) 0*2048kB 0*4096kB =
> 240440kB
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499469] Node 0 Normal:
> 31971*4kB (UEM) 31254*8kB (UEM) 2*16kB (EM) 5*32kB (R) 4*64kB (R) 2*128kB
> (R) 0*256kB 0*512kB 1*1024kB (R) 1*2048kB (R) 0*4096kB = 381692kB
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499480] 2380174 total
> pagecache pages
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499482] 0 pages in swap
> cache
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499483] Swap cache stats:
> add 0, delete 0, find 0/0
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499485] Free swap  =
> 522236kB
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499486] Total swap
> =522236kB
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.528135] 3145712 pages RAM
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.528135] 69093 pages
> reserved
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.528135] 3401898 pages
> shared
> Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.528135] 1067069 pages
> non-shared
> ...
> 34501    Jun 26 14:20:03 vsa-00000003-vc-1 kernel: [91864.156051] XFS:
> pid=12642 kmem_alloc success after 13443 retries inum=35646862875 
> size=35584
> flags=0x4 lflags=0x250
>
> -----Original Message----- From: Dave Chinner
> Sent: 23 June, 2015 10:18 PM
> To: Alex Lyakas
> Cc: xfs@oss.sgi.com ; hch@lst.de
> Subject: Re: xfs_iext_realloc_indirect and "XFS: possible memory 
> allocation
> deadlock"
>
> On Tue, Jun 23, 2015 at 10:18:21AM +0200, Alex Lyakas wrote:
> >Greetings,
> >
> >We are hitting an issue with XFS printing messages like “XFS:
> >possible memory allocation deadlock in kmem_alloc
> >(mode:0x250)” and stack trace like in [1]. Eventually,
> >hung-task panic kicks in with stack traces like [2].
> >
> >We are running kernel 3.8.13. I see that in
> >http://oss.sgi.com/archives/xfs/2012-01/msg00341.html a similar
> >issue has been discussed, but no code changes followed comparing
> >to what we have in 3.8.13.
> >
> >Any suggestion on how to move forward with this problem? For
> >example, does this memory has to be really allocated with kmalloc
> >(i.e., physically continuous) or vmalloc can be used?
>
> We left it alone because it is relatively rare for people to hit it,
> and generally it indicates a severe fragmentation problem when they
> do hit it (i.e. a file with millions of extents in it). Can you
> track down the file that this is occurring against and see how many
> extents it has?
>
> i.e. you may be much better off by taking measures to avoid excessive
> fragmentation than removing the canary from the mine...
>
> >[109626.075483] nfsd            D 0000000000000002     0 20042      2
> >0x00000000
>
> Hmmm - it's also a file written by the NFS server - this is is on an
> a dedicated NFS server?
>
> >[109626.075483]  [<ffffffffa01be6b0>] ? nfsd_setuser+0x120/0x2b0 [nfsd]
> >[109626.075483]  [<ffffffff8119bccc>] ? vfs_writev+0x3c/0x50
> >[109626.075483]  [<ffffffffa01b7dd2>] ? nfsd_vfs_write.isra.12+0x92/0x350
> >[nfsd]
> >[109626.075483]  [<ffffffff8119a6cb>] ? dentry_open+0x6b/0xd0
> >[109626.075483]  [<ffffffffa01ba679>] ? nfsd_write+0xf9/0x110 [nfsd]
> >[109626.075483]  [<ffffffffa01c4dd1>] ? nfsd3_proc_write+0xb1/0x140 
> >[nfsd]
>
> Interesting that this is an NFSv3 write...
>
> >[87303.976119] INFO: task nfsd:5684 blocked for more than 180 seconds.
> >[87303.976976] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
> >disables
> >this message.
> >[87303.978012] nfsd            D 0000000000000003     0  5684      2
> >0x00000000
> ....
> >[87303.978174]  [<ffffffffa0269623>] nfsd_write+0xa3/0x110 [nfsd]
> >[87303.978182]  [<ffffffffa027794c>] nfsd4_write+0x1cc/0x250 [nfsd]
> >[87303.978189]  [<ffffffffa027746c>] nfsd4_proc_compound+0x5ac/0x7a0
> >[nfsd]
>
> And that is a NFsv4 write. You have multiple clients writing to the
> same file using different versions of the NFS protocol?
>
> Cheers,
>
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
>
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs 

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: xfs_iext_realloc_indirect and "XFS: possible memory allocation deadlock"
  2015-06-29 17:59       ` Alex Lyakas
@ 2015-06-29 19:02         ` Brian Foster
  0 siblings, 0 replies; 17+ messages in thread
From: Brian Foster @ 2015-06-29 19:02 UTC (permalink / raw)
  To: Alex Lyakas; +Cc: Danny Shavit, Shyam Kaushik, Yair Hershko, hch, xfs

On Mon, Jun 29, 2015 at 07:59:00PM +0200, Alex Lyakas wrote:
> Hi Brian,
> Thanks for your comments.
> 
> Here is the information you asked for:
> 
> meta-data=/dev/dm-147            isize=256    agcount=67, agsize=268435440
> blks
>         =                       sectsz=512   attr=2
> data     =                       bsize=4096   blocks=17825792000, imaxpct=5
>         =                       sunit=16     swidth=160 blks
> naming   =version 2              bsize=4096   ascii-ci=0
> log      =internal               bsize=4096   blocks=521728, version=2
>         =                       sectsz=512   sunit=16 blks, lazy-count=1
> realtime =none                   extsz=4096   blocks=0, rtextents=0
> 
> Mount options:
> /dev/dm-147 /export/nfsvol xfs rw,sync,noatime,wsync,attr2,discard,inode64,allocsize=64k,logbsize=64k,sunit=128,swidth=1280,noquota
> 0 0
> 
> So yes, we are using "allocsize=64k", which influences the speculative
> allocation logic. I did various experiments, and indeed when I remove this
> "allocsize=64k", fragmentation is much lesser. (Tried also other things,
> like using a single nfsd thread, mounting without "sync" and patching nfsd
> to provide "nicer" IOV to vfs_write, but none of these helped). On the other
> hand, we started using this option "allocsize=64k" to prevent aggressive
> preallocation that we saw XFS doing on large QCOW files (VM images).
> 

What was the problem with regard to preallocation and large VM images?
The preallocation is not permanent and should be cleaned up if the file
is inactive for a period of time (see the other prealloc FAQ entries).
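
If your kernel exposes it (I haven't checked whether 3.8 does), the knob
that controls how long speculative preallocation is kept around before
the background scanner trims it is the fs.xfs.speculative_prealloc_lifetime
sysctl, e.g.:

cat /proc/sys/fs/xfs/speculative_prealloc_lifetime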

> Still, when doing local IO to a mounted XFS, even with "allocsize=64k", we
> still get very few extents. Still don't know why is this difference between
> local IO and NFS. Would be great to receive a clue for that phenomena.
> 

What exactly is your test in this case? I assume you're also testing
with the same mount options and whatnot. One difference could be that
NFS might involve more open-write-close cycles than a local write test,
which could impact reclaim of preallocation. For example, what happens
if you run something like the following locally?

# write 4k at every other 4k offset; each xfs_io invocation opens and
# closes the file, so the close-time prealloc trim gets a chance to run
for i in $(seq 0 2 100); do
	xfs_io -fc "pwrite $((i * 4096)) 4k" /mnt/file
done

This will do the strided writes while opening and closing the file each
time and thus probably more closely matches what might be happening over
NFS. Prealloc is typically trimmed on close, but there is an NFS
specific heuristic that should detect this and let it hang around for
longer in this case. Taking a quick look at that code shows that it is
tied to the existence of delayed allocation blocks at close time,
however. I suppose that might never trigger due to the sync mount
option. What's the reason for using that one?
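
For comparison, here's an untested sketch (bash syntax; /mnt/file2 is
just an example name) of the same strided writes issued through a single
open/close of the file, to see whether the per-close trimming is really
what makes the difference:

# build one xfs_io command line with all the pwrites, so the file is
# opened and closed only once
args=()
for i in $(seq 0 2 100); do args+=(-c "pwrite $((i * 4096)) 4k"); done
xfs_io -f "${args[@]}" /mnt/file2

# rough relative comparison of the resulting extent counts
xfs_bmap /mnt/file | wc -l
xfs_bmap /mnt/file2 | wc -l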

Brian

> 
> 
> Thanks,
> Alex.
> 
> 
> 
> 
> -----Original Message----- From: Brian Foster
> Sent: 29 June, 2015 1:43 PM
> To: Alex Lyakas
> Cc: Dave Chinner ; Danny Shavit ; Shyam Kaushik ; Yair Hershko ; hch@lst.de
> ; xfs@oss.sgi.com
> Subject: Re: xfs_iext_realloc_indirect and "XFS: possible memory allocation
> deadlock"
> 
> On Sat, Jun 27, 2015 at 11:01:30PM +0200, Alex Lyakas wrote:
> >Hi Dave,
> >
> >First, to answer your questions:
> >- this filesystem is accessed only through NFS, so I guess we can say this
> >is a "dedicated" NFS server
> >- yes, both NFSv3 and NFSv4 are used (by different servers), and
> >definitely
> >they may attempt to access the same file
> >
> >I did some further debugging by patching XFS to get some more info on how
> >much memory is requested, and who requests it. I added a kmem_realloc
> >variant that is called only by xfs_iext_realloc_indirect. This new
> >function
> >(kmem_realloc_xfs_iext_realloc_indirect) also prints the inode number,
> >which
> >needs the memory. This is after some fiddling to check that "ifp" is the
> >one
> >embedded in xfs_inode and not the one allocated through kmem_cache (attr
> >fork). After memory allocation fails for the first time, I also call
> >handle_sysrq('m'). Finally, I print after how many retries the allocation
> >succeeded. I also added some info to the normal kmem_alloc function.
> >
> >Results are following:
> >- memory allocation failures happened only on the
> >kmem_realloc_xfs_iext_realloc_indirect path for now
> >- XFS hits memory re-allocation failures when it needs to allocate about
> >35KB. Sometimes allocation succeeds after few retries, but sometimes it
> >takes several thousands of retries.
> >- All allocation failures happened on NFSv3 paths
> >- Three inode numbers were reported as failing memory allocations. After
> >several hours, "find -inum" is still searching for these inodes...this is
> >a
> >huge filesystem... Is there any other quicker (XFS-specific?) way to find
> >the file based on inode number?
> >
> >Please see a typical allocation failure in [1]
> >
> >Any recommendation how to move forward with this issue?
> >
> >Additional observation that I saw in my local system: writing files to XFS
> >locally vs writing the same files via NFS (both 3 and 4), the amount of
> >extents reported by "xfs_bmap" is much higher for the NFS case. For
> >example,
> >creating a new file and writing into it as follows:
> >- write 4KB
> >- skip 4KB (i.e., lseek to 4KB + 4KB)
> >- write 4KB
> >- skip 4KB
> >...
> >Create a file of say 50MB this way.
> >
> >Locally it ends up with very few (1-5) extents. But same exact workload
> >through NFS results in several thousands of extents. The filesystem is
> >mounted as "sync" in both cases. This is currently just an observation,
> >which I will try to debug further. Are you perhaps familiar with such
> >behavior? Can you perhaps try and check whether it also happens in your
> >environment?
> >
> 
> I believe this is usually the work of speculative preallocation[0]. It
> basically should allocate more blocks than necessary for a file
> extending allocation to avoid this kind of fragmentation. The
> xfs_iomap_prealloc_size tracepoint was added to help observe behavior,
> but it looks like that went in 3.10 and you're on 3.8. It was also more
> recently disabled for files smaller than 64k, but again that happened in
> 3.11.
> 
> Anyways, it seems like perhaps it is working normally locally, but for
> some reason not working appropriately over NFS..? It might be a good
> idea to post xfs_info and mount options if you haven't.
> 
> Brian
> 
> [0] http://xfs.org/index.php/XFS_FAQ#Q:_What_is_speculative_preallocation.3F
> 
> >Thanks for your help,
> >Alex.
> >
> >
> >
> >[1]
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.497033] XFS: pid=12642
> >kmem_alloc failure inum=35646862875 size=35584 flags=0x4 lflags=0x250
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498407] Pid: 12642, comm:
> >nfsd Tainted: GF          O 3.8.13-030813-generic #201305111843
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498410] Call Trace:
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498450]
> >[<ffffffffa07a832e>] kmem_alloc_xfs_iext_realloc_indirect+0x16e/0x1f0
> >[xfs]
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498465]
> >[<ffffffffa07a86b3>] kmem_realloc_xfs_iext_realloc_indirect+0x33/0x80
> >[xfs]
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498482]
> >[<ffffffffa07db2a0>] xfs_iext_realloc_indirect+0x40/0x60 [xfs]
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498498]
> >[<ffffffffa07db2ff>] xfs_iext_irec_new+0x3f/0x180 [xfs]
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498516]
> >[<ffffffffa07db5b9>] xfs_iext_add_indirect_multi+0x179/0x2b0 [xfs]
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498532]
> >[<ffffffffa07db9be>] xfs_iext_add+0xce/0x290 [xfs]
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498539]
> >[<ffffffff81076634>] ? wake_up_worker+0x24/0x30
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498553]
> >[<ffffffffa07dbbd2>] xfs_iext_insert+0x52/0x100 [xfs]
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498569]
> >[<ffffffffa07b6ed3>] ? xfs_bmap_add_extent_hole_delay+0xd3/0x6a0 [xfs]
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498583]
> >[<ffffffffa07b6ed3>] xfs_bmap_add_extent_hole_delay+0xd3/0x6a0 [xfs]
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498598]
> >[<ffffffffa07da3fc>] ? xfs_iext_bno_to_ext+0x8c/0x160 [xfs]
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498614]
> >[<ffffffffa07b7719>] xfs_bmapi_reserve_delalloc+0x279/0x2a0 [xfs]
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498628]
> >[<ffffffffa07be742>] xfs_bmapi_delay+0x122/0x270 [xfs]
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498642]
> >[<ffffffffa079d703>] xfs_iomap_write_delay+0x173/0x320 [xfs]
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498656]
> >[<ffffffffa07be42c>] ? xfs_bmapi_read+0xfc/0x2f0 [xfs]
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498660]
> >[<ffffffff8135d8f3>] ? call_rwsem_down_write_failed+0x13/0x20
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498665]
> >[<ffffffff81195bbc>] ? lookup_page_cgroup+0x4c/0x50
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498677]
> >[<ffffffffa078ab40>] __xfs_get_blocks+0x280/0x550 [xfs]
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498688]
> >[<ffffffffa078ae41>] xfs_get_blocks+0x11/0x20 [xfs]
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498695]
> >[<ffffffff811cf77e>] __block_write_begin+0x1ae/0x4e0
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498705]
> >[<ffffffffa078ae30>] ? xfs_get_blocks_direct+0x20/0x20 [xfs]
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498708]
> >[<ffffffff81135fff>] ? grab_cache_page_write_begin+0x8f/0xf0
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498720]
> >[<ffffffffa078a09f>] xfs_vm_write_begin+0x5f/0xe0 [xfs]
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498725]
> >[<ffffffff8113552a>] generic_perform_write+0xca/0x210
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498729]
> >[<ffffffff811356cd>] generic_file_buffered_write+0x5d/0x90
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498740]
> >[<ffffffffa07952d5>] xfs_file_buffered_aio_write+0x115/0x1c0 [xfs]
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498744]
> >[<ffffffff816159f4>] ? ip_finish_output+0x224/0x3b0
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498758]
> >[<ffffffffa079547c>] xfs_file_aio_write+0xfc/0x1b0 [xfs]
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498769]
> >[<ffffffffa0795380>] ? xfs_file_buffered_aio_write+0x1c0/0x1c0 [xfs]
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498773]
> >[<ffffffff8119b8c3>] do_sync_readv_writev+0xa3/0xe0
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498776]
> >[<ffffffff8119bb8d>] do_readv_writev+0xcd/0x1d0
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498780]
> >[<ffffffff810877e0>] ? set_groups+0x40/0x60
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498805]
> >[<ffffffffa033a6b0>] ? nfsd_setuser+0x120/0x2b0 [nfsd]
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498808]
> >[<ffffffff8119bccc>] vfs_writev+0x3c/0x50
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498815]
> >[<ffffffffa0333dd2>] nfsd_vfs_write.isra.12+0x92/0x350 [nfsd]
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498818]
> >[<ffffffff8119a6cb>] ? dentry_open+0x6b/0xd0
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498825]
> >[<ffffffffa0336679>] nfsd_write+0xf9/0x110 [nfsd]
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498834]
> >[<ffffffffa0340dd1>] nfsd3_proc_write+0xb1/0x140 [nfsd]
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498841]
> >[<ffffffffa032fd62>] nfsd_dispatch+0x102/0x270 [nfsd]
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498865]
> >[<ffffffffa0103b48>] svc_process_common+0x328/0x5e0 [sunrpc]
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498876]
> >[<ffffffffa0104153>] svc_process+0x103/0x160 [sunrpc]
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498882]
> >[<ffffffffa032f72f>] nfsd+0xbf/0x130 [nfsd]
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498888]
> >[<ffffffffa032f670>] ? nfsd_destroy+0x80/0x80 [nfsd]
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498892]
> >[<ffffffff8107f050>] kthread+0xc0/0xd0
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498895]
> >[<ffffffff8107ef90>] ? flush_kthread_worker+0xb0/0xb0
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498900]
> >[<ffffffff816f61ec>] ret_from_fork+0x7c/0xb0
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498903]
> >[<ffffffff8107ef90>] ? flush_kthread_worker+0xb0/0xb0
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.498905] SysRq : Show
> >Memory
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499385] Mem-Info:
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499388] Node 0 DMA
> >per-cpu:
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499391] CPU    0: hi: 0,
> >btch:   1 usd:   0
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499392] CPU    1: hi: 0,
> >btch:   1 usd:   0
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499394] CPU    2: hi: 0,
> >btch:   1 usd:   0
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499395] CPU    3: hi: 0,
> >btch:   1 usd:   0
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499396] Node 0 DMA32
> >per-cpu:
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499399] CPU    0: hi:
> >186,
> >btch:  31 usd:   0
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499400] CPU    1: hi:
> >186,
> >btch:  31 usd:   0
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499402] CPU    2: hi:
> >186,
> >btch:  31 usd:   0
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499403] CPU    3: hi:
> >186,
> >btch:  31 usd:   0
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499404] Node 0 Normal
> >per-cpu:
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499406] CPU    0: hi:
> >186,
> >btch:  31 usd:  23
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499407] CPU    1: hi:
> >186,
> >btch:  31 usd:  25
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499409] CPU    2: hi:
> >186,
> >btch:  31 usd:   0
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499410] CPU    3: hi:
> >186,
> >btch:  31 usd:   7
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499415] active_anon:2143
> >inactive_anon:44181 isolated_anon:0
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499415]
> >active_file:913373
> >inactive_file:1464930 isolated_file:0
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499415]  unevictable:8449
> >dirty:6742 writeback:115 unstable:0
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499415]  free:159418
> >slab_reclaimable:146857 slab_unreclaimable:66681
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499415]  mapped:5561
> >shmem:383 pagetables:2195 bounce:0
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499415]  free_cma:0
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499418] Node 0 DMA
> >free:15884kB min:84kB low:104kB high:124kB active_anon:0kB
> >inactive_anon:0kB
> >active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB
> >isolated(file):0kB present:15628kB managed:15884kB mlocked:0kB dirty:0kB
> >writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB
> >slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB
> >bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0
> >all_unreclaimable?
> >yes
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499423] lowmem_reserve[]:
> >0
> >3512 12080 12080
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499427] Node 0 DMA32
> >free:240040kB min:19624kB low:24528kB high:29436kB active_anon:8kB
> >inactive_anon:15196kB active_file:1503256kB inactive_file:1624328kB
> >unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3596532kB
> >managed:3546428kB mlocked:0kB dirty:14652kB writeback:0kB mapped:988kB
> >shmem:0kB slab_reclaimable:173044kB slab_unreclaimable:15988kB
> >kernel_stack:248kB pagetables:28kB unstable:0kB bounce:0kB free_cma:0kB
> >writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499432] lowmem_reserve[]:
> >0
> >0 8568 8568
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499435] Node 0 Normal
> >free:381748kB min:47872kB low:59840kB high:71808kB active_anon:8564kB
> >inactive_anon:161528kB active_file:2150236kB inactive_file:4235392kB
> >unevictable:33796kB isolated(anon):0kB isolated(file):0kB
> >present:8773632kB
> >managed:8715068kB mlocked:33796kB dirty:12316kB writeback:460kB
> >mapped:21256kB shmem:1532kB slab_reclaimable:414384kB
> >slab_unreclaimable:250736kB kernel_stack:8424kB pagetables:8752kB
> >unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0
> >all_unreclaimable? no
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499443] lowmem_reserve[]:
> >0
> >0 0 0
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499446] Node 0 DMA: 1*4kB
> >(U) 1*8kB (U) 0*16kB 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB
> >1*1024kB (U) 1*2048kB (R) 3*4096kB (M) = 15884kB
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499458] Node 0 DMA32:
> >27954*4kB (UEM) 12312*8kB (UEM) 1713*16kB (UEM) 17*32kB (UMR) 2*64kB (R)
> >2*128kB (R) 1*256kB (R) 1*512kB (R) 1*1024kB (R) 0*2048kB 0*4096kB =
> >240440kB
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499469] Node 0 Normal:
> >31971*4kB (UEM) 31254*8kB (UEM) 2*16kB (EM) 5*32kB (R) 4*64kB (R) 2*128kB
> >(R) 0*256kB 0*512kB 1*1024kB (R) 1*2048kB (R) 0*4096kB = 381692kB
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499480] 2380174 total
> >pagecache pages
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499482] 0 pages in swap
> >cache
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499483] Swap cache stats:
> >add 0, delete 0, find 0/0
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499485] Free swap  =
> >522236kB
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.499486] Total swap
> >=522236kB
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.528135] 3145712 pages RAM
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.528135] 69093 pages
> >reserved
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.528135] 3401898 pages
> >shared
> >Jun 26 14:19:08 vsa-00000003-vc-1 kernel: [91809.528135] 1067069 pages
> >non-shared
> >...
> >34501    Jun 26 14:20:03 vsa-00000003-vc-1 kernel: [91864.156051] XFS:
> >pid=12642 kmem_alloc success after 13443 retries inum=35646862875
> >size=35584
> >flags=0x4 lflags=0x250
> >
> >-----Original Message----- From: Dave Chinner
> >Sent: 23 June, 2015 10:18 PM
> >To: Alex Lyakas
> >Cc: xfs@oss.sgi.com ; hch@lst.de
> >Subject: Re: xfs_iext_realloc_indirect and "XFS: possible memory
> >allocation
> >deadlock"
> >
> >On Tue, Jun 23, 2015 at 10:18:21AM +0200, Alex Lyakas wrote:
> >>Greetings,
> >>
> >>We are hitting an issue with XFS printing messages like “XFS:
> >>possible memory allocation deadlock in kmem_alloc
> >>(mode:0x250)” and stack trace like in [1]. Eventually,
> >>hung-task panic kicks in with stack traces like [2].
> >>
> >>We are running kernel 3.8.13. I see that in
> >>http://oss.sgi.com/archives/xfs/2012-01/msg00341.html a similar
> >>issue has been discussed, but no code changes followed comparing
> >>to what we have in 3.8.13.
> >>
> >>Any suggestion on how to move forward with this problem? For
> >>example, does this memory has to be really allocated with kmalloc
> >>(i.e., physically continuous) or vmalloc can be used?
> >
> >We left it alone because it is relatively rare for people to hit it,
> >and generally it indicates a severe fragmentation problem when they
> >do hit it (i.e. a file with millions of extents in it). Can you
> >track down the file that this is occurring against and see how many
> >extents it has?
> >
> >i.e. you may be much better off by taking measures to avoid excessive
> >fragmentation than removing the canary from the mine...
> >
> >>[109626.075483] nfsd            D 0000000000000002     0 20042      2
> >>0x00000000
> >
> >Hmmm - it's also a file written by the NFS server - is this on a
> >dedicated NFS server?
> >
> >>[109626.075483]  [<ffffffffa01be6b0>] ? nfsd_setuser+0x120/0x2b0 [nfsd]
> >>[109626.075483]  [<ffffffff8119bccc>] ? vfs_writev+0x3c/0x50
> >>[109626.075483]  [<ffffffffa01b7dd2>] ? nfsd_vfs_write.isra.12+0x92/0x350
> >>[nfsd]
> >>[109626.075483]  [<ffffffff8119a6cb>] ? dentry_open+0x6b/0xd0
> >>[109626.075483]  [<ffffffffa01ba679>] ? nfsd_write+0xf9/0x110 [nfsd]
> >>[109626.075483]  [<ffffffffa01c4dd1>] ? nfsd3_proc_write+0xb1/0x140
> >>[nfsd]
> >
> >Interesting that this is an NFSv3 write...
> >
> >>[87303.976119] INFO: task nfsd:5684 blocked for more than 180 seconds.
> >>[87303.976976] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> >>disables
> >>this message.
> >>[87303.978012] nfsd            D 0000000000000003     0  5684      2
> >>0x00000000
> >....
> >>[87303.978174]  [<ffffffffa0269623>] nfsd_write+0xa3/0x110 [nfsd]
> >>[87303.978182]  [<ffffffffa027794c>] nfsd4_write+0x1cc/0x250 [nfsd]
> >>[87303.978189]  [<ffffffffa027746c>] nfsd4_proc_compound+0x5ac/0x7a0
> >>[nfsd]
> >
> >And that is an NFSv4 write. You have multiple clients writing to the
> >same file using different versions of the NFS protocol?
> >
> >Cheers,
> >
> >Dave.
> >-- 
> >Dave Chinner
> >david@fromorbit.com
> >

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: xfs_iext_realloc_indirect and "XFS: possible memory allocation deadlock"
  2015-06-27 21:01   ` Alex Lyakas
  2015-06-28 18:19     ` Alex Lyakas
  2015-06-29 11:43     ` Brian Foster
@ 2015-06-29 22:26     ` Dave Chinner
  2015-07-06 18:47       ` Alex Lyakas
  2 siblings, 1 reply; 17+ messages in thread
From: Dave Chinner @ 2015-06-29 22:26 UTC (permalink / raw)
  To: Alex Lyakas; +Cc: Danny Shavit, Shyam Kaushik, Yair Hershko, hch, xfs

[Compendium reply, top-posting removed, trimmed and re-ordered]

On Sat, Jun 27, 2015 at 11:01:30PM +0200, Alex Lyakas wrote:
> Results are following:
> - memory allocation failures happened only on the
> kmem_realloc_xfs_iext_realloc_indirect path for now
> - XFS hits memory re-allocation failures when it needs to allocate
> about 35KB. Sometimes allocation succeeds after a few retries, but
> sometimes it takes several thousand retries.

Allocations of 35kB are failing? Sounds like you have a serious
memory fragmentation problem if allocations that small are having
trouble.

> - All allocation failures happened on NFSv3 paths
> - Three inode numbers were reported as failing memory allocations.
> After several hours, "find -inum" is still searching for these
> inodes...this is a huge filesystem... Is there any other quicker
> (XFS-specific?) way to find the file based on inode number?

Not yet. You can use the bulkstat ioctl to find the inode by inode
number, then open-by-handle to get a fd for the inode to allow you
to read/write/stat/bmap/etc, but the only way to find the path right
now is to brute force it. That reverse mapping and parent pointer
stuff I'm working on at the moment will make lookups like this easy.
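
For reference, here is a rough userspace sketch of that bulkstat lookup. It
is untested and assumes the xfs_fs.h structures installed by xfsprogs; note
that the returned xfs_bstat also carries bs_extents, i.e. the extent count
you were after:

/*
 * Sketch only: stat a single inode by number with XFS_IOC_FSBULKSTAT_SINGLE.
 * Assumes the uapi structures shipped with xfsprogs; error handling is
 * minimal.
 */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/types.h>
#include <xfs/xfs.h>

int main(int argc, char **argv)
{
	struct xfs_fsop_bulkreq	req;
	struct xfs_bstat	bs;
	__u64			ino;
	__s32			count = 0;
	int			fd;

	if (argc != 3) {
		fprintf(stderr, "usage: %s <mountpoint> <inode>\n", argv[0]);
		return 1;
	}
	ino = strtoull(argv[2], NULL, 0);

	/* any fd on the filesystem will do, e.g. the mount point */
	fd = open(argv[1], O_RDONLY);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	req.lastip = &ino;	/* inode number to stat */
	req.icount = 1;
	req.ubuffer = &bs;
	req.ocount = &count;

	if (ioctl(fd, XFS_IOC_FSBULKSTAT_SINGLE, &req) < 0) {
		perror("XFS_IOC_FSBULKSTAT_SINGLE");
		return 1;
	}

	/* bs_extents reports the number of extents in the data fork */
	printf("ino=%llu size=%lld extents=%d\n",
		(unsigned long long)bs.bs_ino,
		(long long)bs.bs_size, bs.bs_extents);
	close(fd);
	return 0;
}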

> Any recommendation how to move forward with this issue?
> 
> Additional observation that I saw in my local system: writing files
> to XFS locally vs writing the same files via NFS (both 3 and 4), the
> amount of extents reported by "xfs_bmap" is much higher for the NFS
> case. For example, creating a new file and writing into it as
> follows:
> - write 4KB
> - skip 4KB (i.e., lseek to 4KB + 4KB)
> - write 4KB
> - skip 4KB
> ...
> Create a file of say 50MB this way.
> 
> Locally it ends up with very few (1-5) extents. But same exact
> workload through NFS results in several thousands of extents.

NFS is likely resulting in out of order writes....

> The
> filesystem is mounted as "sync" in both cases.

I'm afraid to ask why, but that is likely your problem - synchronous
out of order writes from the NFS client will fragment the file
badly because they defeat both delayed allocation and speculative
preallocation: there is nothing to trigger the "don't remove
speculative prealloc on file close" heuristic used to avoid
fragmentation caused by out of order NFS writes....

On Sun, Jun 28, 2015 at 08:19:35PM +0200, Alex Lyakas wrote:
> through NFS. Trying the same 4KB-data/4KB-hole workload on small
> files of 2MB. When writing the file locally, I see that
> xfs_file_buffered_aio_write is always called with a single 4KB
> buffer:
> xfs_file_buffered_aio_write: inum=100663559 nr_segs=1
> seg #0: {.iov_base=0x18db8f0, .iov_len=4096}
> 
> But when doing the same workload through NFS:
> xfs_file_buffered_aio_write: inum=167772423 nr_segs=2
> seg #0: {.iov_base=0xffff88006c1100a8, .iov_len=3928}
> seg #1: {.iov_base=0xffff88005556e000, .iov_len=168}
> There are always two such buffers in the IOV.

IOV format is irrelevant to the buffered write behaviour of XFS.

> I am still trying to debug why this results in XFS requiring much
> more extents to fit such workload. I tapped into some functions and
> seeing:
> 
> Local workload:
> 6    xfs_iext_add: ifp=0xffff8800096de6b8 idx=0x0 ext_diff=0x1,
> nextents=0 new_size=16 if_bytes=0 if_real_bytes=0
> 25    xfs_iext_add: ifp=0xffff8800096de6b8 idx=0x1 ext_diff=0x1,
.....

Sequential allocation, all nice and contiguous.
Preallocation is clearly not being removed between writes.

> NFS workload:
....
> nextents=1 new_size=32 if_bytes=16 if_real_bytes=0
> 124    xfs_iext_add: ifp=0xffff8800096df4b8 idx=0x1 ext_diff=0x1,
> nextents=2 new_size=48 if_bytes=32 if_real_bytes=0
> 130    xfs_iext_add: ifp=0xffff8800096df4b8 idx=0x1 ext_diff=0x1,

You're not getting sequential allocation, which further points to
problems with preallocation being removed on close.

> Number of extents is growing. But still I could not see why this is
> happening. Can you please give a hint why?

The sync mount option.

> 3) I tried to see what is the largest file XFS can maintain with
> this 4KB-data/4KB-hole workload on a VM with 5GB RAM. I was able to
> reach 146GB and almost 9M extents. There were a lot of "memory
> allocation deadlock" messages popping, but eventually allocation
> would succeed. Until finally, allocation could not succeed for 3
> minutes and hung-task panic occurred.

Well, yes. Each extent requires 32 bytes, plus an index page every
256 leaf pages (i.e. every 256*128=32k extents). So that extent list
requires roughly 300MB of memory, and a contiguous 270 page
allocation. vmalloc is not the answer here - it just papers over the
underlying problem: excessive fragmentation.
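
(Rough arithmetic, using the sizes above: 9M extents * 32 bytes is ~290MB
of extent buffers; at 128 extents per 4k leaf that is ~70,000 leaves, and
one 16-byte indirection record per leaf gives a single ~1.1MB - roughly
270 pages - physically contiguous kmalloc for the indirection array.)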

On Mon, Jun 29, 2015 at 03:02:23PM -0400, Brian Foster wrote:
> On Mon, Jun 29, 2015 at 07:59:00PM +0200, Alex Lyakas wrote:
> > Hi Brian,
> > Thanks for your comments.
> > 
> > Here is the information you asked for:
> > 
> > meta-data=/dev/dm-147            isize=256    agcount=67, agsize=268435440
> > blks
> >         =                       sectsz=512   attr=2
> > data     =                       bsize=4096   blocks=17825792000, imaxpct=5
> >         =                       sunit=16     swidth=160 blks
> > naming   =version 2              bsize=4096   ascii-ci=0
> > log      =internal               bsize=4096   blocks=521728, version=2
> >         =                       sectsz=512   sunit=16 blks, lazy-count=1
> > realtime =none                   extsz=4096   blocks=0, rtextents=0
> > 
> > Mount options:
> > /dev/dm-147 /export/nfsvol xfs rw,sync,noatime,wsync,attr2,discard,inode64,allocsize=64k,logbsize=64k,sunit=128,swidth=1280,noquota
> > 0 0
> > 
> > So yes, we are using "allocsize=64k", which influences the speculative
> > allocation logic. I did various experiments, and indeed when I remove this
> > "allocsize=64k", fragmentation is much lesser. (Tried also other things,
> > like using a single nfsd thread, mounting without "sync" and patching nfsd
> > to provide "nicer" IOV to vfs_write, but none of these helped). On the other
> > hand, we started using this option "allocsize=64k" to prevent aggressive
> > preallocation that we saw XFS doing on large QCOW files (VM images).
> > 
> 
> What was the problem with regard to preallocation and large VM images?
> The preallocation is not permanent and should be cleaned up if the file
> is inactive for a period of time (see the other prealloc FAQ entries).

A lot of change went into the speculative preallocation in the
kernels after 3.8, so I suspect we've already fixed whatever problem
was seen. Alex, it would be a good idea to try to reproduce those
problems on a current kernel to see if they still are present....

> > Still, when doing local IO to a mounted XFS, even with "allocsize=64k", we
> > still get very few extents. Still don't know why there is this difference between
> > local IO and NFS. Would be great to receive a clue for that phenomenon.
> > 
> 
> What exactly is your test in this case? I assume you're also testing
> with the same mount options and whatnot. One difference could be that
> NFS might involve more open-write-close cycles than a local write test,
> which could impact reclaim of preallocation. For example, what happens
> if you run something like the following locally?
> 
> for i in $(seq 0 2 100); do
> 	xfs_io -fc "pwrite $((i * 4096)) 4k" /mnt/file
> done

That should produce similar results to running the NFS client. Years ago
back at SGI we used a tool written by Greg Banks called "ddnfs" for
testing this sort of thing. It did open_by_handle()/close() around
each read/write syscall to emulate the NFS server IO pattern.

http://oss.sgi.com/projects/nfs/testtools/ddnfs-oss-20090302.tar.bz2

> 
> This will do the strided writes while opening and closing the file each
> time and thus probably more closely matches what might be happening over
> NFS. Prealloc is typically trimmed on close, but there is an NFS
> specific heuristic that should detect this and let it hang around for
> longer in this case. Taking a quick look at that code shows that it is
> tied to the existence of delayed allocation blocks at close time,
> however. I suppose that might never trigger due to the sync mount
> option. What's the reason for using that one?

Right - it won't trigger because writeback occurs in the write()
context, so we have a clean inode when the fd is closed and
->release is called...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: xfs_iext_realloc_indirect and "XFS: possible memory allocation deadlock"
  2015-06-29 22:26     ` Dave Chinner
@ 2015-07-06 18:47       ` Alex Lyakas
  2015-07-07  0:09         ` Dave Chinner
  0 siblings, 1 reply; 17+ messages in thread
From: Alex Lyakas @ 2015-07-06 18:47 UTC (permalink / raw)
  To: Dave Chinner, bfoster; +Cc: Danny Shavit, Shyam Kaushik, Yair Hershko, xfs

Hi Dave, Brian,

[Compendium reply, trimmed and re-ordered]


> What was the problem with regard to preallocation and large VM images?
> The preallocation is not permanent and should be cleaned up if the file
> is inactive for a period of time (see the other prealloc FAQ entries).
The problem was that in 3.8 speculative preallocation was based on the inode 
size. So when creating large sparse files (for example with qemu-img), XFS 
was writing huge amounts of data through xfs_iozero, which choked the 
drives. As Dave pointed out, this was fixed in later kernels.

> For example, what happens
> if you run something like the following locally?
> for i in $(seq 0 2 100); do
> xfs_io -fc "pwrite $((i * 4096)) 4k" /mnt/file
> done
When running this locally, speculative preallocation is trimmed through 
xfs_free_eofblocks (verified with systemtap), and indeed we get a highly 
fragmented file.
However, debugging our NFS workload, we see that this is not happening, 
i.e., the NFS server does not issue ->release until the end of the workload.

> I suppose that might never trigger due to the sync mount
> option. What's the reason for using that one?

> I'm afraid to ask why, but that is likely your problem - synchronous
> out of order writes from the NFS client will fragment the file
> badly because they defeat both delayed allocation and speculative
> preallocation: there is nothing to trigger the "don't remove
> speculative prealloc on file close" heuristic used to avoid
> fragmentation caused by out of order NFS writes....
The main reason for using "sync" mount option is to avoid data loss in the 
case of crash.
I did some experiments without this mount option, and indeed I see that the same 
NFS workload results in lower fragmentation, especially for large files. 
However, since we are not considering removing the "sync" mount option at the 
moment, I did not debug further why it happens.

> NFS is likely resulting in out of order writes....
Yes, Dave, this appeared to be our issue. This was in addition to a badly 
configured NFS client, which had:
rsize=32768,wsize=32768
instead of what we usually see:
rsize=1048576,wsize=1048576

An out of order write was triggering a small speculative preallocation 
(allocsize=64k), and all subsequent writes into the "hole" were not able to 
benefit from it, and had to allocate separate extents (which most of the 
time were not physically contiguous). And the NFS server receiving 32k writes 
contributed even more to the fragmentation. With 1MB writes this problem 
doesn't really happen even with allocsize=64k.

So currently, we are pulling the following XFS patches:
xfs: don't use speculative prealloc for small files
xfs: fix xfs_iomap_eof_prealloc_initial_size type
xfs: increase prealloc size to double that of the previous extent
xfs: fix potential infinite loop in xfs_iomap_prealloc_size()
xfs: limit speculative prealloc size on sparse files

(Final code will be as usual in 
https://github.com/zadarastorage/zadara-xfs-pushback)

However, Dave, I am still not comfortable with XFS insisting on contiguous 
memory for the data fork in kmem_alloc. Consider, for example, Brian's 
script. Nothing stops the user from doing that. Another example could be 
strided 4k NFS writes coming out of order. For these cases, speculative 
preallocation will not help, as we will receive a highly fragmented file 
with holes.

As another example, Dave, can you please look at the stack trace in [1]? (It 
doesn't make much sense, but this is what we got.) Could something like this 
happen:
- VFS tells XFS to unlink an inode
- XFS tries to reallocate the extents fork via xfs_inactive path
- there is no contiguous memory, so the kernel (somehow) wants to evict the same 
inode, but cannot lock it due to XFS already holding the lock???
I know that this is very far-fetched, and probably wrong, but insisting on 
contiguous memory is also problematic here.

Thanks for your help Brian & Dave,
Alex.


[1]
454509.864025] nfsd            D 0000000000000001     0   797      2 
0x00000000
[454509.864025]  ffff88036e41d438 0000000000000046 ffff88037b351c00 
ffff88017fb22a20
[454509.864025]  ffff88036e41dfd8 0000000000000000 0000000000000008 
ffff8803aca2dd58
[454509.864025]  ffff88036e41d448 ffffffffa074905d 000000012e32b040 
ffff8803aca2dcc0
[454509.864025] Call Trace:
[454509.864025]  [<ffffffffa0748e94>] ? xfs_buf_lock+0x44/0x110 [xfs]
[454509.864025]  [<ffffffffa074905d>] ? _xfs_buf_find+0xfd/0x2a0 [xfs]
[454509.864025]  [<ffffffffa07492d4>] ? xfs_buf_get_map+0x34/0x1b0 [xfs]
[454509.864025]  [<ffffffffa074a261>] ? xfs_buf_read_map+0x31/0x130 [xfs]
[454509.864025]  [<ffffffffa07acc39>] ? xfs_trans_read_buf_map+0x2d9/0x490 
[xfs]
[454509.864025]  [<ffffffffa077e572>] ? 
xfs_btree_read_buf_block.isra.20.constprop.25+0x72/0xb0 [xfs]
[454509.864025]  [<ffffffffa0780a3c>] ? xfs_btree_rshift+0xcc/0x540 [xfs]
[454509.864025]  [<ffffffffa0749a84>] ? _xfs_buf_ioapply+0x294/0x300 [xfs]
[454509.864025]  [<ffffffffa0782bf8>] ? 
xfs_btree_make_block_unfull+0x58/0x190 [xfs]
[454509.864025]  [<ffffffffa074a210>] ? _xfs_buf_read+0x30/0x50 [xfs]
[454509.864025]  [<ffffffffa0749be9>] ? xfs_buf_iorequest+0x69/0xd0 [xfs]
[454509.864025]  [<ffffffffa07830b7>] ? xfs_btree_insrec+0x387/0x580 [xfs]
[454509.864025]  [<ffffffffa074a333>] ? xfs_buf_read_map+0x103/0x130 [xfs]
[454509.864025]  [<ffffffffa074a3bb>] ? xfs_buf_readahead_map+0x5b/0x80 
[xfs]
[454509.864025]  [<ffffffffa077e62b>] ? xfs_btree_lookup_get_block+0x7b/0xe0 
[xfs]
[454509.864025]  [<ffffffffa077d88f>] ? xfs_btree_ptr_offset+0x4f/0x70 [xfs]
[454509.864025]  [<ffffffffa077d8e2>] ? xfs_btree_key_addr+0x12/0x20 [xfs]
[454509.864025]  [<ffffffffa07822d7>] ? xfs_btree_lookup+0xb7/0x470 [xfs]
[454509.864025]  [<ffffffffa0764deb>] ? xfs_alloc_lookup_eq+0x1b/0x20 [xfs]
[454509.864025]  [<ffffffffa0765dd1>] ? xfs_free_ag_extent+0x421/0x940 [xfs]
[454509.864025]  [<ffffffffa07689fa>] ? xfs_free_extent+0x10a/0x170 [xfs]
[454509.864025]  [<ffffffffa07795c9>] ? xfs_bmap_finish+0x169/0x1b0 [xfs]
[454509.864025]  [<ffffffffa07956a3>] ? xfs_itruncate_extents+0xf3/0x2d0 
[xfs]
[454509.864025]  [<ffffffffa0764767>] ? kmem_zone_alloc+0x67/0xe0 [xfs]
[454509.864025]  [<ffffffffa0762180>] ? xfs_inactive+0x340/0x450 [xfs]
[454509.864025]  [<ffffffff816ed725>] ? _raw_spin_lock_irq+0x15/0x20
[454509.864025]  [<ffffffffa075e303>] ? xfs_fs_evict_inode+0x93/0x100 [xfs]
[454509.864025]  [<ffffffff811b5530>] ? evict+0xc0/0x1d0
[454509.864025]  [<ffffffff811b5e62>] ? iput_final+0xe2/0x170
[454509.864025]  [<ffffffff811b5f2e>] ? iput+0x3e/0x50
[454509.864025]  [<ffffffff811b0e88>] ? dentry_unlink_inode+0xd8/0x110
[454509.864025]  [<ffffffff811b0f7e>] ? d_delete+0xbe/0xd0
[454509.864025]  [<ffffffff811a663e>] ? vfs_unlink.part.27+0xde/0xf0
[454509.864025]  [<ffffffff811a847c>] ? vfs_unlink+0x3c/0x60
[454509.864025]  [<ffffffffa01e90c3>] ? nfsd_unlink+0x183/0x230 [nfsd]
[454509.864025]  [<ffffffffa01f871d>] ? nfsd4_remove+0x6d/0x130 [nfsd]
[454509.864025]  [<ffffffffa01f746c>] ? nfsd4_proc_compound+0x5ac/0x7a0 
[nfsd]
[454509.864025]  [<ffffffffa01e2d62>] ? nfsd_dispatch+0x102/0x270 [nfsd]
[454509.864025]  [<ffffffffa013cb48>] ? svc_process_common+0x328/0x5e0 
[sunrpc]
[454509.864025]  [<ffffffffa013d153>] ? svc_process+0x103/0x160 [sunrpc]
[454509.864025]  [<ffffffffa01e272f>] ? nfsd+0xbf/0x130 [nfsd]
[454509.864025]  [<ffffffffa01e2670>] ? nfsd_destroy+0x80/0x80 [nfsd]
[454509.864025]  [<ffffffff8107f050>] ? kthread+0xc0/0xd0
[454509.864025]  [<ffffffff8107ef90>] ? flush_kthread_worker+0xb0/0xb0
[454509.864025]  [<ffffffff816f61ec>] ? ret_from_fork+0x7c/0xb0
[454509.864025]  [<ffffffff8107ef90>] ? flush_kthread_worker+0xb0/0xb0







_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: xfs_iext_realloc_indirect and "XFS: possible memory allocation deadlock"
  2015-07-06 18:47       ` Alex Lyakas
@ 2015-07-07  0:09         ` Dave Chinner
  2015-07-07  9:05           ` Christoph Hellwig
  0 siblings, 1 reply; 17+ messages in thread
From: Dave Chinner @ 2015-07-07  0:09 UTC (permalink / raw)
  To: Alex Lyakas; +Cc: Danny Shavit, bfoster, Yair Hershko, Shyam Kaushik, xfs

On Mon, Jul 06, 2015 at 08:47:56PM +0200, Alex Lyakas wrote:
> Hi Dave, Brian,
> 
> [Compendium reply, trimmed and re-ordered]
> >I suppose that might never trigger due to the sync mount
> >option. What's the reason for using that one?
> 
> >I'm afraid to ask why, but that is likely your problem - synchronous
> >out of order writes from the NFS client will fragment the file
> >badly because they defeat both delayed allocation and speculative
> >preallocation: there is nothing to trigger the "don't remove
> >speculative prealloc on file close" heuristic used to avoid
> >fragmentation caused by out of order NFS writes....
> The main reason for using "sync" mount option is to avoid data loss
> in the case of crash.
> I did some experiments without this mount option, and indeed I see
> that same NFS workload results in lower fragmentation, especially
> for large files. However, since we do not consider at the moment
> removing the "sync" mount option, I did not debug further why it
> happens.

The NFS protocol handles server side data loss in the event of a
server crash. i.e. the client side commit is an "fsync" to the
server, and until the server responds with a success to the client
commit RPC the client side will continue to retry sending the data
to the server.

From the perspective of metadata (i.e. directory entries), the use of
the "dirsync" mount option is sufficient for HA failover servers to
work correctly as it ensures that directory structure changes are always
committed to disk before the RPC response is sent back to the
client.

i.e. the "sync" mount option doesn't actually improve data integrity
of an NFS server when you look at the end-to-end NFS protocol
handling of async write data....

> >NFS is likely resulting in out of order writes....
> Yes, Dave, this appeared to be our issue.

Ok, no big surprise that fragmentation is happening...

> However, Dave, I am still not comfortable with XFS insisting on
> contiguous memory for the data fork in kmem_alloc. Consider, for
> example,  Brian's script. Nothing stops the user from doing that.
> Another example could be strided 4k NFS writes coming out of order.
> For these cases, speculative preallocation will not help, as we will
> receive a highly fragmented file with holes.

Except that users and applications don't tend to do this because
other filesystems barf on such fragmented files long before XFS does.
Hence, in general, application and users take steps to avoid this
sort of braindead allocation.  And then, of course, there is
xfs_fsr...

We had this discussion ~10 years ago when this code was originally
written and it was decided that the complexity of implementing a
fully generic, scalable solution was not worth the effort as files
with massive numbers of extents cause other performance problems
long before memory allocation should be an issue.

That said, there is now generic infrastructure that makes the
complexity less of a problem, and I do have some patches that I've
been working on in the background to move to a generic structure.
Other filesystems like ext4, btrfs and f2fs have moved to extremely
fine-grained extent trees, but they have been causing all sorts of
interesting scalability and memory reclaim problems so I don't think
we want to go that way.

However, the problem we actually need to solve is not having a
fine-grained extent tree, but that of demand paging of in-memory
extent lists so that we don't need to keep millions of extents in
memory at a time.  That's where we need to go with this, and I have
some early, incomplete patches that move towards a btree based
structure for doing this....

> Another example, Dave, can you please look at the stack trace in
> [1]. (It doesn't make much sense, but this is what we got). Could
> something like this happen:
> - VFS tells XFS to unlink an inode
> - XFS tries to reallocate the extents fork via xfs_inactive path
> - there is no contiguous memory, so the kernel (somehow) wants to evict
> the same inode, but cannot lock it due to XFS already holding the
> lock???

Simply not possible. Memory reclaim can't evict an inode that has an
active reference count.  Also, unlinked inodes never get evicted
by memory reclaim - the final inode reference release will do the
reclaim, and that always occurs in a process context of some
kind....

> [454509.864025]  [<ffffffffa075e303>] ? xfs_fs_evict_inode+0x93/0x100 [xfs]
> [454509.864025]  [<ffffffff811b5530>] ? evict+0xc0/0x1d0
> [454509.864025]  [<ffffffff811b5e62>] ? iput_final+0xe2/0x170
> [454509.864025]  [<ffffffff811b5f2e>] ? iput+0x3e/0x50
> [454509.864025]  [<ffffffff811b0e88>] ? dentry_unlink_inode+0xd8/0x110
> [454509.864025]  [<ffffffff811b0f7e>] ? d_delete+0xbe/0xd0
> [454509.864025]  [<ffffffff811a663e>] ? vfs_unlink.part.27+0xde/0xf0
> [454509.864025]  [<ffffffff811a847c>] ? vfs_unlink+0x3c/0x60
> [454509.864025]  [<ffffffffa01e90c3>] ? nfsd_unlink+0x183/0x230 [nfsd]
> [454509.864025]  [<ffffffffa01f871d>] ? nfsd4_remove+0x6d/0x130 [nfsd]

As you can see here.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: xfs_iext_realloc_indirect and "XFS: possible memory allocation deadlock"
  2015-07-07  0:09         ` Dave Chinner
@ 2015-07-07  9:05           ` Christoph Hellwig
  2015-07-23 15:39             ` Alex Lyakas
  0 siblings, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2015-07-07  9:05 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Shyam Kaushik, bfoster, Yair Hershko, xfs, Danny Shavit, Alex Lyakas

On Tue, Jul 07, 2015 at 10:09:11AM +1000, Dave Chinner wrote:
> server crash. i.e. the client side commit is an "fsync" to the
> server, and until the server responds with a success to the client
> commit RPC the client side will continue to retry sending the data
> to the server.
> 
> For the persepctive of metadata (i.e. directory entries) the use of
> the "dirsync" mount option is sufficient for HA failover servers to
> work correctly as it ensures that directory structure changes are always
> committed to disk before the RPC response is sent back to the
> client.
> 
> i.e. the "sync" mount option doesn't actually improve data integrity
> of an NFS server when you look at the end-to-end NFS protocol
> handling of async write data....


You don't need dirsync either.  NFS does the right sync using the
commit_metadata export operation without using that big hammer.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: xfs_iext_realloc_indirect and "XFS: possible memory allocation deadlock"
  2015-07-07  9:05           ` Christoph Hellwig
@ 2015-07-23 15:39             ` Alex Lyakas
  2015-07-23 23:09               ` Dave Chinner
  0 siblings, 1 reply; 17+ messages in thread
From: Alex Lyakas @ 2015-07-23 15:39 UTC (permalink / raw)
  To: Christoph Hellwig, Dave Chinner
  Cc: Danny Shavit, bfoster, Yair Hershko, Shyam Kaushik, xfs

Hi Dave,
Just for completeness, XFS speculative preallocation (which is now based 
upon the size of the last extent) can still grow up to 4GB-8GB (depending on 
which patches we are pulling). As a result, xfs_iozero can still sometimes 
trigger 1-2GB writes of zeros in one shot. This turns out to be a bit 
unfriendly to the drives in some configurations. So we have applied a custom 
patch to limit the speculative preallocation to 32MB.

Final code will be in the same place.

Thanks,
Alex.


_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: xfs_iext_realloc_indirect and "XFS: possible memory allocation deadlock"
  2015-07-23 15:39             ` Alex Lyakas
@ 2015-07-23 23:09               ` Dave Chinner
  2016-04-05 18:10                 ` Alex Lyakas
  0 siblings, 1 reply; 17+ messages in thread
From: Dave Chinner @ 2015-07-23 23:09 UTC (permalink / raw)
  To: Alex Lyakas
  Cc: Danny Shavit, Shyam Kaushik, bfoster, Yair Hershko, xfs,
	Christoph Hellwig

On Thu, Jul 23, 2015 at 05:39:28PM +0200, Alex Lyakas wrote:
> Hi Dave,
> Just for completeness,  XFS speculative preallocation (that is based
> now upon the size of the last extent) can still grow up to 4Gb-8Gb
> (depending on which patches we are pulling). As a result, xfs_iozero
> can still sometimes trigger 1-2GB writes of zeros in one shot. This
> turns out to be a bit unfriendly to the drives in some
> configurations. So we have applied a custom patch to limit the
> speculative preallocation to 32Mb.

It would be much better to change xfs_zero_eof() to convert extents
beyond EOF to unwritten extents rather than zero them. That way you
still get the benefits of the large speculative prealloc but without
the zeroing overhead that weird sparse write patterns can trigger.

I just haven't got around to doing this because it hasn't been
reported as a significant problem for anyone until now.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: xfs_iext_realloc_indirect and "XFS: possible memory allocation deadlock"
  2015-07-23 23:09               ` Dave Chinner
@ 2016-04-05 18:10                 ` Alex Lyakas
  2016-04-05 20:41                   ` Dave Chinner
  0 siblings, 1 reply; 17+ messages in thread
From: Alex Lyakas @ 2016-04-05 18:10 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Danny Shavit, Shyam Kaushik, bfoster, Yair Hershko, xfs,
	Christoph Hellwig

Hello Dave, Brian, Christoph,

We are still encountering cases in which different IO patterns beat XFS 
preallocation schemes, resulting in highly fragmented files having hundreds of 
thousands and sometimes millions of extents. In these cases XFS tries to 
allocate large arrays of xfs_ext_irec_t structures with kmalloc, and this 
often gets into numerous retries, and sometimes triggers a hung-task panic 
(due to some other thread wanting to access the same file).

We made a change to call kmem_zalloc_large, which resorts to __vmalloc in 
case kmalloc fails. kmem_free is already handling vmalloc addresses 
correctly. The change is only for the allocation done in 
xfs_iext_realloc_indirect, as this is the only place in which we have seen 
the issue.
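
To illustrate, here is only a sketch of the idea, not the exact patch we
ship (final code is in the git tree below); it assumes the 3.18-era
kmem_zalloc_large()/kmem_free() helpers from fs/xfs/kmem.c, which fall back
to __vmalloc()/vfree() for buffers kmalloc cannot satisfy:

/* Sketch: grow the indirection array via allocate-copy-free so the new
 * buffer may come from vmalloc space when a physically contiguous
 * kmalloc is not available. Handling of a failed allocation is left out
 * here; the real change has to decide whether to retry or propagate it. */
void
xfs_iext_realloc_indirect(
	xfs_ifork_t	*ifp,		/* inode fork pointer */
	int		new_size)	/* new indirection array size */
{
	int		nlists;		/* number of irec's (ex lists) */
	int		size;		/* current indirection array size */
	xfs_ext_irec_t	*new_irec;

	ASSERT(ifp->if_flags & XFS_IFEXTIREC);
	nlists = ifp->if_real_bytes / XFS_IEXT_BUFSZ;
	size = nlists * sizeof(xfs_ext_irec_t);
	ASSERT(ifp->if_real_bytes);
	ASSERT((new_size >= 0) && (new_size != size));
	if (new_size == 0) {
		xfs_iext_destroy(ifp);
		return;
	}

	/* kmem_zalloc_large() falls back to __vmalloc(); kmem_free()
	 * already copes with vmalloc'ed addresses */
	new_irec = kmem_zalloc_large(new_size, KM_NOFS);
	if (new_irec) {
		memcpy(new_irec, ifp->if_u1.if_ext_irec,
		       min(size, new_size));
		kmem_free(ifp->if_u1.if_ext_irec);
		ifp->if_u1.if_ext_irec = new_irec;
	}
}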

Final code, as usual, will be in 
https://github.com/zadarastorage/zadara-xfs-pushback

Thanks,
Alex.



_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: xfs_iext_realloc_indirect and "XFS: possible memory allocation deadlock"
  2016-04-05 18:10                 ` Alex Lyakas
@ 2016-04-05 20:41                   ` Dave Chinner
  2016-04-06 16:39                     ` Alex Lyakas
  0 siblings, 1 reply; 17+ messages in thread
From: Dave Chinner @ 2016-04-05 20:41 UTC (permalink / raw)
  To: Alex Lyakas
  Cc: Danny Shavit, Shyam Kaushik, bfoster, Yair Hershko, xfs,
	Christoph Hellwig

On Tue, Apr 05, 2016 at 09:10:06PM +0300, Alex Lyakas wrote:
> Hello Dave, Brian, Christoph,
> 
> We are still encountering cases, in which different IO patterns beat
> XFS preallocation schemes, resulting in highly fragmented files,
> having 100s of thousands and sometimes millions of extents. In these
> cases XFS tries to allocate large arrays of xfs_ext_irec_t structure
> with kmalloc, and this often gets into numerous retries, and
> sometimes triggers hung task panic (due to some other thread wanting
> to access the same file).
> 
> We made a change to call kmem_zalloc_large, which resorts to
> __vmalloc in case kmalloc fails. kmem_free is already handling
> vmalloc addresses correctly. The change is only for the allocation
> done in xfs_iext_realloc_indirect, as this is the only place, in
> which we have seen the issue.

As I've said before, vmalloc is not a solution we can use in
general.  32 bit systems have less vmalloc area than normal kernel
memory (e.g. ia32 has 128MB of vmalloc space vs 896MB of kernel
address space by default) and hence if we get large vmap allocation
requests for non-temporary, not directly reclaimable memory then
we'll end up with worse problems than we already have due to vmalloc
area exhaustion.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: xfs_iext_realloc_indirect and "XFS: possible memory allocation deadlock"
  2016-04-05 20:41                   ` Dave Chinner
@ 2016-04-06 16:39                     ` Alex Lyakas
  2016-04-06 20:57                       ` Dave Chinner
  0 siblings, 1 reply; 17+ messages in thread
From: Alex Lyakas @ 2016-04-06 16:39 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Danny Shavit, Shyam Kaushik, bfoster, Yair Hershko, xfs,
	Christoph Hellwig

Hello Dave,

Thank you for your response. We understand your concerns and will do further 
testing with the vmalloc alternative.

Meanwhile, I found another issue happening when files have many extents. 
When unmounting XFS, we call
xfs_inode_free => xfs_idestroy_fork => xfs_iext_destroy
This goes over the whole indirection array and calls xfs_iext_irec_remove 
for each one of the erps (from the last one to the first one). As a result, 
we keep shrinking (reallocating actually) the indirection array until we 
shrink out all of its elements. When we have files with huge numbers of 
extents, umount takes 30-80 sec, depending on the amount of files that XFS 
loaded and the amount of indirection entries of each file. The unmount stack 
looks like [1].

The patch in [2] seems to address the issue. Do you think it is reasonable? 
It was tested only on kernel 3.18.19.

Thanks,
Alex.

[1]
[<ffffffffc0b6d200>] xfs_iext_realloc_indirect+0x40/0x60 [xfs]
[<ffffffffc0b6cd8e>] xfs_iext_irec_remove+0xee/0xf0 [xfs]
[<ffffffffc0b6cdcd>] xfs_iext_destroy+0x3d/0xb0 [xfs]
[<ffffffffc0b6cef6>] xfs_idestroy_fork+0xb6/0xf0 [xfs]
[<ffffffffc0b87002>] xfs_inode_free+0xb2/0xc0 [xfs]
[<ffffffffc0b87260>] xfs_reclaim_inode+0x250/0x340 [xfs]
[<ffffffffc0b87583>] xfs_reclaim_inodes_ag+0x233/0x370 [xfs]
[<ffffffffc0b8823d>] xfs_reclaim_inodes+0x1d/0x20 [xfs]
[<ffffffffc0b96feb>] xfs_unmountfs+0x7b/0x1a0 [xfs]
[<ffffffffc0b98e4d>] xfs_fs_put_super+0x2d/0x70 [xfs]
[<ffffffff811e9e36>] generic_shutdown_super+0x76/0x100
[<ffffffff811ea207>] kill_block_super+0x27/0x70
[<ffffffff811ea519>] deactivate_locked_super+0x49/0x60
[<ffffffff811eaaee>] deactivate_super+0x4e/0x70
[<ffffffff81207593>] cleanup_mnt+0x43/0x90
[<ffffffff81207632>] __cleanup_mnt+0x12/0x20
[<ffffffff8108f8e7>] task_work_run+0xa7/0xe0
[<ffffffff81014ff7>] do_notify_resume+0x97/0xb0
[<ffffffff81717c6f>] int_signal+0x12/0x17

[2]
--- /mnt/work/alex/tmp/code/prev_xfs2/fs/xfs/libxfs/xfs_inode_fork.c	2016-04-06 16:35:51.172255372 +0300
+++ fs/xfs/libxfs/xfs_inode_fork.c      2016-04-06 19:25:55.349593353 +0300
@@ -1499,34 +1499,48 @@
        kmem_free(ifp->if_u1.if_ext_irec);
        ifp->if_flags &= ~XFS_IFEXTIREC;
        ifp->if_u1.if_extents = ep;
        ifp->if_bytes = size;
        if (nextents < XFS_LINEAR_EXTS) {
                xfs_iext_realloc_direct(ifp, size);
        }
}

/*
+ * Remove all records from the indirection array.
+ */
+STATIC void
+xfs_iext_irec_remove_all(xfs_ifork_t *ifp)             /* inode fork pointer */
+{
+       int             nlists;         /* number of irec's (ex lists) */
+       int             i;              /* loop counter */
+
+       ASSERT(ifp->if_flags & XFS_IFEXTIREC);
+       nlists = ifp->if_real_bytes / XFS_IEXT_BUFSZ;
+       for (i = 0; i < nlists; i++) {
+               xfs_ext_irec_t *erp = &ifp->if_u1.if_ext_irec[i];
+               if (erp->er_extbuf)
+                       kmem_free(erp->er_extbuf);
+       }
+       kmem_free(ifp->if_u1.if_ext_irec);
+       ifp->if_real_bytes = 0;
+}
+
+/*
  * Free incore file extents.
  */
void
xfs_iext_destroy(
        xfs_ifork_t     *ifp)           /* inode fork pointer */
{
        if (ifp->if_flags & XFS_IFEXTIREC) {
-               int     erp_idx;
-               int     nlists;
-
-               nlists = ifp->if_real_bytes / XFS_IEXT_BUFSZ;
-               for (erp_idx = nlists - 1; erp_idx >= 0 ; erp_idx--) {
-                       xfs_iext_irec_remove(ifp, erp_idx);
-               }
+               xfs_iext_irec_remove_all(ifp);
                ifp->if_flags &= ~XFS_IFEXTIREC;
        } else if (ifp->if_real_bytes) {
                kmem_free(ifp->if_u1.if_extents);
        } else if (ifp->if_bytes) {
                memset(ifp->if_u2.if_inline_ext, 0, XFS_INLINE_EXTS *
                        sizeof(xfs_bmbt_rec_t));
        }
        ifp->if_u1.if_extents = NULL;
        ifp->if_real_bytes = 0;
        ifp->if_bytes = 0;



_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: xfs_iext_realloc_indirect and "XFS: possible memory allocation deadlock"
  2016-04-06 16:39                     ` Alex Lyakas
@ 2016-04-06 20:57                       ` Dave Chinner
  0 siblings, 0 replies; 17+ messages in thread
From: Dave Chinner @ 2016-04-06 20:57 UTC (permalink / raw)
  To: Alex Lyakas
  Cc: Danny Shavit, Shyam Kaushik, bfoster, Yair Hershko, xfs,
	Christoph Hellwig

On Wed, Apr 06, 2016 at 07:39:21PM +0300, Alex Lyakas wrote:
> Hello Dave,
> 
> Thank you for your response. We understand your concerns and will do
> further testing with the vmalloc alternative.
> 
> Meanwhile, I found another issue happening when files have many
> extents. When unmounting XFS, we call
> xfs_inode_free => xfs_idestroy_fork => xfs_iext_destroy
> This goes over the whole indirection array and calls
> xfs_iext_irec_remove for each one of the erps (from the last one to
> the first one). As a result, we keep shrinking (reallocating
> actually) the indirection array until we shrink out all of its
> elements. When we have files with huge numbers of extents, umount
> takes 30-80 sec, depending on the amount of files that XFS loaded
> and the amount of indirection entries of each file. The unmount
> stack looks like [1].
> 
> That patch in [2] seems to address the issue. Do you think it is
> reasonable? It was tested only on kernel 3.18.19.

Looks like a good optimisation to make. Can you send it as a proper
patch (separate thread) with this description and your signed-off-by
on it?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2016-04-06 20:58 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-06-23  8:18 xfs_iext_realloc_indirect and "XFS: possible memory allocation deadlock" Alex Lyakas
2015-06-23 20:18 ` Dave Chinner
2015-06-27 21:01   ` Alex Lyakas
2015-06-28 18:19     ` Alex Lyakas
2015-06-29 11:43     ` Brian Foster
2015-06-29 17:59       ` Alex Lyakas
2015-06-29 19:02         ` Brian Foster
2015-06-29 22:26     ` Dave Chinner
2015-07-06 18:47       ` Alex Lyakas
2015-07-07  0:09         ` Dave Chinner
2015-07-07  9:05           ` Christoph Hellwig
2015-07-23 15:39             ` Alex Lyakas
2015-07-23 23:09               ` Dave Chinner
2016-04-05 18:10                 ` Alex Lyakas
2016-04-05 20:41                   ` Dave Chinner
2016-04-06 16:39                     ` Alex Lyakas
2016-04-06 20:57                       ` Dave Chinner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.