* Re: [Bugme-new] [Bug 15426] New: Running many copies of bonnie++ on different filesystems seems to deadlock in sync [not found] <bug-15426-10286@http.bugzilla.kernel.org/> @ 2010-03-03 0:16 ` Andrew Morton 2010-03-03 12:09 ` John Hughes 0 siblings, 1 reply; 13+ messages in thread From: Andrew Morton @ 2010-03-03 0:16 UTC (permalink / raw) To: john; +Cc: bugzilla-daemon, bugme-daemon, linux-kernel, Jens Axboe (switched to email. Please respond via emailed reply-to-all, not via the bugzilla web interface). On Tue, 2 Mar 2010 16:31:03 GMT bugzilla-daemon@bugzilla.kernel.org wrote: > http://bugzilla.kernel.org/show_bug.cgi?id=15426 > > Summary: Running many copies of bonnie++ on different > filesystems seems to deadlock in sync > Product: File System > Version: 2.5 > Kernel Version: 2.6.32 > Platform: All > OS/Version: Linux > Tree: Mainline > Status: NEW > Severity: normal > Priority: P1 > Component: VFS > AssignedTo: fs_vfs@kernel-bugs.osdl.org > ReportedBy: john@calva.com > Regression: No > > > I create 8 ext3 filesystems on 8 disks and run a copy of bonnie++ on each > filesystem. All but two or three of the Bonnie++ processes seem to hang in > sync and I get log messages like: > > [ 1557.377626] INFO: task bonnie:2503 blocked for more than 120 seconds. > [ 1557.377693] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this > message. > [ 1557.377775] bonnie D 0000000000000000 0 2503 2489 0x00000000 > [ 1557.377778] ffffffff8144b1f0 0000000000000082 0000000000000000 > ffff88034ac155c0 > [ 1557.377780] ffff88033c079880 ffffffff81047f43 000000000000f8a0 > ffff88063b1e9fd8 > [ 1557.377783] 00000000000155c0 00000000000155c0 ffff88063bbe3f90 > ffff88063bbe4288 > [ 1557.377785] Call Trace: > [ 1557.377787] [<ffffffff81047f43>] ? finish_task_switch+0x3a/0xa7 > [ 1557.377790] [<ffffffff812ee653>] ? __mutex_lock_common+0x10b/0x17b > [ 1557.377792] [<ffffffff812ee77b>] ? mutex_lock+0x1a/0x31 > [ 1557.377794] [<ffffffff8110a3cd>] ? sync_filesystems+0x13/0xe3 > [ 1557.377796] [<ffffffff8110a4d8>] ? sys_sync+0x12/0x2e > [ 1557.377799] [<ffffffff81010b42>] ? system_call_fastpath+0x16/0x1b > When the system is stuck, please run echo w > /proc/sysrq-trigger or, equivalently, hit `alt-sysrq-w'. This will emit huge amounts of stack traces into the system logs and onto the console. There's no need to send all of them - please identify typical-looking ones and then send them in reply to this email. Email clients like to wordwrap these traces - please try to prevent that. Thanks. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Bugme-new] [Bug 15426] New: Running many copies of bonnie++ on different filesystems seems to deadlock in sync 2010-03-03 0:16 ` [Bugme-new] [Bug 15426] New: Running many copies of bonnie++ on different filesystems seems to deadlock in sync Andrew Morton @ 2010-03-03 12:09 ` John Hughes 2010-03-03 11:50 ` Jens Axboe 0 siblings, 1 reply; 13+ messages in thread From: John Hughes @ 2010-03-03 12:09 UTC (permalink / raw) To: Andrew Morton Cc: bugzilla-daemon, bugme-daemon, linux-kernel, Jens Axboe, John Hughes Andrew Morton wrote: > When the system is stuck, please run > > echo w > /proc/sysrq-trigger > Ok, we have: [ 1128.964403] kjournald D 0000000000000000 0 2196 2 0x00000000 [ 1128.964407] ffffffff8144b1f0 0000000000000046 0000000000000000 ffffffff811837bb [ 1128.964409] ffff88063d1762e0 ffff880340002c80 000000000000f8a0 ffff88063c6e9fd8 [ 1128.964412] 00000000000155c0 00000000000155c0 ffff88063d1762e0 ffff88063d1765d8 [ 1128.964415] Call Trace: [ 1128.964423] [<ffffffff811837bb>] ? cfq_set_request+0x2af/0x319 [ 1128.964427] [<ffffffff811780d7>] ? get_request+0x7e/0x2ba [ 1128.964431] [<ffffffff812edf36>] ? io_schedule+0x73/0xb7 [ 1128.964433] [<ffffffff81178403>] ? get_request_wait+0xf0/0x188 [ 1128.964437] [<ffffffff81064a56>] ? autoremove_wake_function+0x0/0x2e [ 1128.964441] [<ffffffff8117b28d>] ? blk_recount_segments+0x17/0x27 [ 1128.964443] [<ffffffff81178792>] ? __make_request+0x2f7/0x428 [ 1128.964447] [<ffffffff81176f6f>] ? generic_make_request+0x299/0x2f9 [ 1128.964450] [<ffffffff811770a5>] ? submit_bio+0xd6/0xf2 [ 1128.964455] [<ffffffff8110b50b>] ? submit_bh+0xf5/0x115 [ 1128.964463] [<ffffffffa017f5a2>] ? journal_commit_transaction+0x8d9/0xe2b [jbd] [ 1128.964467] [<ffffffffa01823fb>] ? kjournald+0xdf/0x226 [jbd] [ 1128.964469] [<ffffffff81064a56>] ? autoremove_wake_function+0x0/0x2e [ 1128.964472] [<ffffffffa018231c>] ? kjournald+0x0/0x226 [jbd] [ 1128.964474] [<ffffffff81064789>] ? kthread+0x79/0x81 [ 1128.964479] [<ffffffff81011baa>] ? child_rip+0xa/0x20 [ 1128.964481] [<ffffffff81064710>] ? kthread+0x0/0x81 [ 1128.964483] [<ffffffff81011ba0>] ? child_rip+0x0/0x20 And three of (flush-8:32, flush-8:48 and flush-8:64): [ 1128.964485] flush-8:32 D 0000000000000000 0 2221 2 0x00000000 [ 1128.964488] ffff88033e44b880 0000000000000046 0000000000000000 ffffffff811837bb [ 1128.964490] ffff88063d177100 0000001000022c80 000000000000f8a0 ffff88063b66dfd8 [ 1128.964493] 00000000000155c0 00000000000155c0 ffff88063d177100 ffff88063d1773f8 [ 1128.964495] Call Trace: [ 1128.964497] [<ffffffff811837bb>] ? cfq_set_request+0x2af/0x319 [ 1128.964500] [<ffffffff811780d7>] ? get_request+0x7e/0x2ba [ 1128.964502] [<ffffffff812edf36>] ? io_schedule+0x73/0xb7 [ 1128.964504] [<ffffffff81178403>] ? get_request_wait+0xf0/0x188 [ 1128.964506] [<ffffffff81064a56>] ? autoremove_wake_function+0x0/0x2e [ 1128.964508] [<ffffffff81178792>] ? __make_request+0x2f7/0x428 [ 1128.964511] [<ffffffff81176f6f>] ? generic_make_request+0x299/0x2f9 [ 1128.964514] [<ffffffff810114ce>] ? common_interrupt+0xe/0x13 [ 1128.964516] [<ffffffff811770a5>] ? submit_bio+0xd6/0xf2 [ 1128.964518] [<ffffffff8110b50b>] ? submit_bh+0xf5/0x115 [ 1128.964522] [<ffffffff8110da7c>] ? __block_write_full_page+0x1d6/0x2ac [ 1128.964525] [<ffffffff8110c801>] ? end_buffer_async_write+0x0/0x13b [ 1128.964528] [<ffffffff8110fabc>] ? blkdev_get_block+0x0/0x57 [ 1128.964532] [<ffffffff810b9572>] ? __writepage+0xa/0x25 [ 1128.964534] [<ffffffff810b9bf9>] ? write_cache_pages+0x20b/0x327 [ 1128.964536] [<ffffffff810b9568>] ? __writepage+0x0/0x25 [ 1128.964541] [<ffffffff8110642e>] ? writeback_single_inode+0xe7/0x2da [ 1128.964543] [<ffffffff81107107>] ? writeback_inodes_wb+0x423/0x4fe [ 1128.964546] [<ffffffff8110730e>] ? wb_writeback+0x12c/0x1ab [ 1128.964549] [<ffffffff81107574>] ? wb_do_writeback+0x145/0x15b [ 1128.964551] [<ffffffff811075bb>] ? bdi_writeback_task+0x31/0x9d [ 1128.964556] [<ffffffff810c7cc6>] ? bdi_start_fn+0x0/0xca [ 1128.964558] [<ffffffff810c7d36>] ? bdi_start_fn+0x70/0xca [ 1128.964560] [<ffffffff810c7cc6>] ? bdi_start_fn+0x0/0xca [ 1128.964562] [<ffffffff81064789>] ? kthread+0x79/0x81 [ 1128.964564] [<ffffffff81011baa>] ? child_rip+0xa/0x20 [ 1128.964566] [<ffffffff81064710>] ? kthread+0x0/0x81 [ 1128.964568] [<ffffffff81011ba0>] ? child_rip+0x0/0x20 And a few copies of bonnie (14 in total): Two like this: [ 1128.964735] bonnie D 0000000000000000 0 2240 2236 0x00000000 [ 1128.964737] ffff88033e44b880 0000000000000086 0000000000000000 00000000000009ef [ 1128.964740] ffff8802bac52000 0000000000430bf5 000000000000f8a0 ffff88033d031fd8 [ 1128.964743] 00000000000155c0 00000000000155c0 ffff88033d811c40 ffff88033d811f38 [ 1128.964745] Call Trace: [ 1128.964747] [<ffffffff8110c637>] ? sync_buffer+0x0/0x40 [ 1128.964749] [<ffffffff812edf36>] ? io_schedule+0x73/0xb7 [ 1128.964752] [<ffffffff8110c672>] ? sync_buffer+0x3b/0x40 [ 1128.964754] [<ffffffff812ee339>] ? __wait_on_bit_lock+0x3f/0x84 [ 1128.964756] [<ffffffff8110c637>] ? sync_buffer+0x0/0x40 [ 1128.964758] [<ffffffff812ee3e9>] ? out_of_line_wait_on_bit_lock+0x6b/0x77 [ 1128.964760] [<ffffffff81064a84>] ? wake_bit_function+0x0/0x23 [ 1128.964764] [<ffffffffa017e068>] ? do_get_write_access+0x82/0x441 [jbd] [ 1128.964766] [<ffffffff8110bbed>] ? __getblk+0x26/0x29a [ 1128.964769] [<ffffffffa017e449>] ? journal_get_write_access+0x22/0x33 [jbd] [ 1128.964774] [<ffffffffa01a040f>] ? __ext3_journal_get_write_access+0x1e/0x47 [ext3] [ 1128.964778] [<ffffffffa019418e>] ? ext3_reserve_inode_write+0x3e/0x75 [ext3] [ 1128.964782] [<ffffffffa0199d86>] ? ext3_orphan_add+0x94/0x15b [ext3] [ 1128.964786] [<ffffffffa01941f8>] ? ext3_mark_inode_dirty+0x33/0x3c [ext3] [ 1128.964790] [<ffffffffa019a6cb>] ? ext3_unlink+0x166/0x1b4 [ext3] [ 1128.964793] [<ffffffff810f645d>] ? vfs_unlink+0x72/0xa6 [ 1128.964795] [<ffffffff810f7cb5>] ? do_unlinkat+0xad/0x134 [ 1128.964797] [<ffffffff810f9b11>] ? vfs_readdir+0x92/0xa7 [ 1128.964800] [<ffffffff810f9c9d>] ? sys_getdents+0xb9/0xc7 [ 1128.964803] [<ffffffff81010b42>] ? system_call_fastpath+0x16/0x1b One like this: [ 1128.964804] bonnie D 0000000000000000 0 2241 2236 0x00000000 [ 1128.964807] ffff88033e44b880 0000000000000082 0000000000000000 ffffffff810b90a9 [ 1128.964809] ffff880000022c80 0000000000000000 000000000000f8a0 ffff88033c3abfd8 [ 1128.964812] 00000000000155c0 00000000000155c0 ffff88033d813f90 ffff88033d814288 [ 1128.964814] Call Trace: [ 1128.964816] [<ffffffff810b90a9>] ? __alloc_pages_nodemask+0x10f/0x5ce [ 1128.964819] [<ffffffff81064977>] ? bit_waitqueue+0x10/0xa0 [ 1128.964822] [<ffffffffa017e212>] ? do_get_write_access+0x22c/0x441 [jbd] [ 1128.964824] [<ffffffff81064a84>] ? wake_bit_function+0x0/0x23 [ 1128.964826] [<ffffffff8110bbed>] ? __getblk+0x26/0x29a [ 1128.964829] [<ffffffff810170cd>] ? read_tsc+0xa/0x20 [ 1128.964832] [<ffffffffa017e449>] ? journal_get_write_access+0x22/0x33 [jbd] [ 1128.964836] [<ffffffffa01a040f>] ? __ext3_journal_get_write_access+0x1e/0x47 [ext3] [ 1128.964840] [<ffffffffa019418e>] ? ext3_reserve_inode_write+0x3e/0x75 [ext3] [ 1128.964844] [<ffffffffa01941e6>] ? ext3_mark_inode_dirty+0x21/0x3c [ext3] [ 1128.964847] [<ffffffffa019a6b4>] ? ext3_unlink+0x14f/0x1b4 [ext3] [ 1128.964850] [<ffffffff810f645d>] ? vfs_unlink+0x72/0xa6 [ 1128.964852] [<ffffffff810f7cb5>] ? do_unlinkat+0xad/0x134 [ 1128.964854] [<ffffffff810f9b11>] ? vfs_readdir+0x92/0xa7 [ 1128.964856] [<ffffffff810f9c9d>] ? sys_getdents+0xb9/0xc7 [ 1128.964859] [<ffffffff81010b42>] ? system_call_fastpath+0x16/0x1b One like this: [ 1128.964955] bonnie D 0000000000000000 0 2244 2236 0x00000000 [ 1128.964958] ffffffff8144b1f0 0000000000000082 0000000000000000 ffffffff811837bb [ 1128.964960] ffff88033d813170 00000010a00fc7ab 000000000000f8a0 ffff88033ceadfd8 [ 1128.964963] 00000000000155c0 00000000000155c0 ffff88033d813170 ffff88033d813468 [ 1128.964965] Call Trace: [ 1128.964967] [<ffffffff811837bb>] ? cfq_set_request+0x2af/0x319 [ 1128.964969] [<ffffffff811780d7>] ? get_request+0x7e/0x2ba [ 1128.964971] [<ffffffff812edf36>] ? io_schedule+0x73/0xb7 [ 1128.964973] [<ffffffff81178403>] ? get_request_wait+0xf0/0x188 [ 1128.964976] [<ffffffff81064a56>] ? autoremove_wake_function+0x0/0x2e [ 1128.964978] [<ffffffff81178792>] ? __make_request+0x2f7/0x428 [ 1128.964980] [<ffffffff81176f6f>] ? generic_make_request+0x299/0x2f9 [ 1128.964983] [<ffffffff811770a5>] ? submit_bio+0xd6/0xf2 [ 1128.964985] [<ffffffff8110b50b>] ? submit_bh+0xf5/0x115 [ 1128.964987] [<ffffffff8110da7c>] ? __block_write_full_page+0x1d6/0x2ac [ 1128.964990] [<ffffffff8110c801>] ? end_buffer_async_write+0x0/0x13b [ 1128.964992] [<ffffffff8110fabc>] ? blkdev_get_block+0x0/0x57 [ 1128.964994] [<ffffffff810b9572>] ? __writepage+0xa/0x25 [ 1128.964997] [<ffffffff810b9bf9>] ? write_cache_pages+0x20b/0x327 [ 1128.964999] [<ffffffff810b9568>] ? __writepage+0x0/0x25 [ 1128.965003] [<ffffffff810b3cda>] ? __filemap_fdatawrite_range+0x4b/0x54 [ 1128.965007] [<ffffffffa019b5bb>] ? ext3_sync_fs+0x42/0x4b [ext3] [ 1128.965009] [<ffffffff810b4998>] ? filemap_write_and_wait+0x17/0x32 [ 1128.965013] [<ffffffff8112d015>] ? sync_quota_sb+0x5d/0xf6 [ 1128.965015] [<ffffffff8110a38d>] ? __sync_filesystem+0x43/0x70 [ 1128.965017] [<ffffffff8110a454>] ? sync_filesystems+0x9a/0xe3 [ 1128.965019] [<ffffffff8110a4e2>] ? sys_sync+0x1c/0x2e [ 1128.965021] [<ffffffff81010b42>] ? system_call_fastpath+0x16/0x1b Seven bonnies like this: [ 1128.965231] bonnie D 0000000000000000 0 2253 2236 0x00000000 [ 1128.965233] ffffffff8144b1f0 0000000000000082 0000000000000000 ffff88034ac155c0 [ 1128.965236] ffff88033ce7f700 ffffffff81047f43 000000000000f8a0 ffff88033c249fd8 [ 1128.965239] 00000000000155c0 00000000000155c0 ffff88033d814db0 ffff88033d8150a8 [ 1128.965241] Call Trace: [ 1128.965243] [<ffffffff81047f43>] ? finish_task_switch+0x3a/0xa7 [ 1128.965246] [<ffffffff812ee653>] ? __mutex_lock_common+0x10b/0x17b [ 1128.965248] [<ffffffff812ee77b>] ? mutex_lock+0x1a/0x31 [ 1128.965250] [<ffffffff8110a3cd>] ? sync_filesystems+0x13/0xe3 [ 1128.965252] [<ffffffff8110a4d8>] ? sys_sync+0x12/0x2e [ 1128.965255] [<ffffffff81010b42>] ? system_call_fastpath+0x16/0x1b And three bonnies like this: [ 1128.965023] bonnie D 0000000000000000 0 2245 2236 0x00000000 [ 1128.965026] ffff88033e44b880 0000000000000082 0000000000000000 00000000000155c0 [ 1128.965028] 0000000000000005 00000000000155c0 000000000000f8a0 ffff88033cfc3fd8 [ 1128.965031] 00000000000155c0 00000000000155c0 ffff88033d817100 ffff88033d8173f8 [ 1128.965033] Call Trace: [ 1128.965035] [<ffffffff812ee653>] ? __mutex_lock_common+0x10b/0x17b [ 1128.965038] [<ffffffff8104a196>] ? try_to_wake_up+0x249/0x259 [ 1128.965040] [<ffffffff812ee77b>] ? mutex_lock+0x1a/0x31 [ 1128.965042] [<ffffffff8110a3cd>] ? sync_filesystems+0x13/0xe3 [ 1128.965044] [<ffffffff8110a4d8>] ? sys_sync+0x12/0x2e [ 1128.965047] [<ffffffff81010b42>] ? system_call_fastpath+0x16/0x1b ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Bugme-new] [Bug 15426] New: Running many copies of bonnie++ on different filesystems seems to deadlock in sync 2010-03-03 12:09 ` John Hughes @ 2010-03-03 11:50 ` Jens Axboe 2010-03-03 12:37 ` John Hughes 0 siblings, 1 reply; 13+ messages in thread From: Jens Axboe @ 2010-03-03 11:50 UTC (permalink / raw) To: John Hughes; +Cc: Andrew Morton, bugzilla-daemon, bugme-daemon, linux-kernel On Wed, Mar 03 2010, John Hughes wrote: > Andrew Morton wrote: >> When the system is stuck, please run >> >> echo w > /proc/sysrq-trigger >> > Ok, we have: Is IO still going on, or does it appear to be stuck? From the traces below, we have various procs caught in waiting for a request. So if things are totally stuck, it could be some race in there. > > [ 1128.964403] kjournald D 0000000000000000 0 2196 2 0x00000000 > [ 1128.964407] ffffffff8144b1f0 0000000000000046 0000000000000000 ffffffff811837bb > [ 1128.964409] ffff88063d1762e0 ffff880340002c80 000000000000f8a0 ffff88063c6e9fd8 > [ 1128.964412] 00000000000155c0 00000000000155c0 ffff88063d1762e0 ffff88063d1765d8 > [ 1128.964415] Call Trace: > [ 1128.964423] [<ffffffff811837bb>] ? cfq_set_request+0x2af/0x319 > [ 1128.964427] [<ffffffff811780d7>] ? get_request+0x7e/0x2ba > [ 1128.964431] [<ffffffff812edf36>] ? io_schedule+0x73/0xb7 > [ 1128.964433] [<ffffffff81178403>] ? get_request_wait+0xf0/0x188 > [ 1128.964437] [<ffffffff81064a56>] ? autoremove_wake_function+0x0/0x2e > [ 1128.964441] [<ffffffff8117b28d>] ? blk_recount_segments+0x17/0x27 > [ 1128.964443] [<ffffffff81178792>] ? __make_request+0x2f7/0x428 > [ 1128.964447] [<ffffffff81176f6f>] ? generic_make_request+0x299/0x2f9 > [ 1128.964450] [<ffffffff811770a5>] ? submit_bio+0xd6/0xf2 > [ 1128.964455] [<ffffffff8110b50b>] ? submit_bh+0xf5/0x115 > [ 1128.964463] [<ffffffffa017f5a2>] ? journal_commit_transaction+0x8d9/0xe2b [jbd] > [ 1128.964467] [<ffffffffa01823fb>] ? kjournald+0xdf/0x226 [jbd] > [ 1128.964469] [<ffffffff81064a56>] ? autoremove_wake_function+0x0/0x2e > [ 1128.964472] [<ffffffffa018231c>] ? kjournald+0x0/0x226 [jbd] > [ 1128.964474] [<ffffffff81064789>] ? kthread+0x79/0x81 > [ 1128.964479] [<ffffffff81011baa>] ? child_rip+0xa/0x20 > [ 1128.964481] [<ffffffff81064710>] ? kthread+0x0/0x81 > [ 1128.964483] [<ffffffff81011ba0>] ? child_rip+0x0/0x20 > > And three of (flush-8:32, flush-8:48 and flush-8:64): > > [ 1128.964485] flush-8:32 D 0000000000000000 0 2221 2 0x00000000 > [ 1128.964488] ffff88033e44b880 0000000000000046 0000000000000000 ffffffff811837bb > [ 1128.964490] ffff88063d177100 0000001000022c80 000000000000f8a0 ffff88063b66dfd8 > [ 1128.964493] 00000000000155c0 00000000000155c0 ffff88063d177100 ffff88063d1773f8 > [ 1128.964495] Call Trace: > [ 1128.964497] [<ffffffff811837bb>] ? cfq_set_request+0x2af/0x319 > [ 1128.964500] [<ffffffff811780d7>] ? get_request+0x7e/0x2ba > [ 1128.964502] [<ffffffff812edf36>] ? io_schedule+0x73/0xb7 > [ 1128.964504] [<ffffffff81178403>] ? get_request_wait+0xf0/0x188 > [ 1128.964506] [<ffffffff81064a56>] ? autoremove_wake_function+0x0/0x2e > [ 1128.964508] [<ffffffff81178792>] ? __make_request+0x2f7/0x428 > [ 1128.964511] [<ffffffff81176f6f>] ? generic_make_request+0x299/0x2f9 > [ 1128.964514] [<ffffffff810114ce>] ? common_interrupt+0xe/0x13 > [ 1128.964516] [<ffffffff811770a5>] ? submit_bio+0xd6/0xf2 > [ 1128.964518] [<ffffffff8110b50b>] ? submit_bh+0xf5/0x115 > [ 1128.964522] [<ffffffff8110da7c>] ? __block_write_full_page+0x1d6/0x2ac > [ 1128.964525] [<ffffffff8110c801>] ? end_buffer_async_write+0x0/0x13b > [ 1128.964528] [<ffffffff8110fabc>] ? blkdev_get_block+0x0/0x57 > [ 1128.964532] [<ffffffff810b9572>] ? __writepage+0xa/0x25 > [ 1128.964534] [<ffffffff810b9bf9>] ? write_cache_pages+0x20b/0x327 > [ 1128.964536] [<ffffffff810b9568>] ? __writepage+0x0/0x25 > [ 1128.964541] [<ffffffff8110642e>] ? writeback_single_inode+0xe7/0x2da > [ 1128.964543] [<ffffffff81107107>] ? writeback_inodes_wb+0x423/0x4fe > [ 1128.964546] [<ffffffff8110730e>] ? wb_writeback+0x12c/0x1ab > [ 1128.964549] [<ffffffff81107574>] ? wb_do_writeback+0x145/0x15b > [ 1128.964551] [<ffffffff811075bb>] ? bdi_writeback_task+0x31/0x9d > [ 1128.964556] [<ffffffff810c7cc6>] ? bdi_start_fn+0x0/0xca > [ 1128.964558] [<ffffffff810c7d36>] ? bdi_start_fn+0x70/0xca > [ 1128.964560] [<ffffffff810c7cc6>] ? bdi_start_fn+0x0/0xca > [ 1128.964562] [<ffffffff81064789>] ? kthread+0x79/0x81 > [ 1128.964564] [<ffffffff81011baa>] ? child_rip+0xa/0x20 > [ 1128.964566] [<ffffffff81064710>] ? kthread+0x0/0x81 > [ 1128.964568] [<ffffffff81011ba0>] ? child_rip+0x0/0x20 > > > And a few copies of bonnie (14 in total): > > Two like this: > > [ 1128.964735] bonnie D 0000000000000000 0 2240 2236 0x00000000 > [ 1128.964737] ffff88033e44b880 0000000000000086 0000000000000000 00000000000009ef > [ 1128.964740] ffff8802bac52000 0000000000430bf5 000000000000f8a0 ffff88033d031fd8 > [ 1128.964743] 00000000000155c0 00000000000155c0 ffff88033d811c40 ffff88033d811f38 > [ 1128.964745] Call Trace: > [ 1128.964747] [<ffffffff8110c637>] ? sync_buffer+0x0/0x40 > [ 1128.964749] [<ffffffff812edf36>] ? io_schedule+0x73/0xb7 > [ 1128.964752] [<ffffffff8110c672>] ? sync_buffer+0x3b/0x40 > [ 1128.964754] [<ffffffff812ee339>] ? __wait_on_bit_lock+0x3f/0x84 > [ 1128.964756] [<ffffffff8110c637>] ? sync_buffer+0x0/0x40 > [ 1128.964758] [<ffffffff812ee3e9>] ? out_of_line_wait_on_bit_lock+0x6b/0x77 > [ 1128.964760] [<ffffffff81064a84>] ? wake_bit_function+0x0/0x23 > [ 1128.964764] [<ffffffffa017e068>] ? do_get_write_access+0x82/0x441 [jbd] > [ 1128.964766] [<ffffffff8110bbed>] ? __getblk+0x26/0x29a > [ 1128.964769] [<ffffffffa017e449>] ? journal_get_write_access+0x22/0x33 [jbd] > [ 1128.964774] [<ffffffffa01a040f>] ? __ext3_journal_get_write_access+0x1e/0x47 [ext3] > [ 1128.964778] [<ffffffffa019418e>] ? ext3_reserve_inode_write+0x3e/0x75 [ext3] > [ 1128.964782] [<ffffffffa0199d86>] ? ext3_orphan_add+0x94/0x15b [ext3] > [ 1128.964786] [<ffffffffa01941f8>] ? ext3_mark_inode_dirty+0x33/0x3c [ext3] > [ 1128.964790] [<ffffffffa019a6cb>] ? ext3_unlink+0x166/0x1b4 [ext3] > [ 1128.964793] [<ffffffff810f645d>] ? vfs_unlink+0x72/0xa6 > [ 1128.964795] [<ffffffff810f7cb5>] ? do_unlinkat+0xad/0x134 > [ 1128.964797] [<ffffffff810f9b11>] ? vfs_readdir+0x92/0xa7 > [ 1128.964800] [<ffffffff810f9c9d>] ? sys_getdents+0xb9/0xc7 > [ 1128.964803] [<ffffffff81010b42>] ? system_call_fastpath+0x16/0x1b > > One like this: > > [ 1128.964804] bonnie D 0000000000000000 0 2241 2236 0x00000000 > [ 1128.964807] ffff88033e44b880 0000000000000082 0000000000000000 ffffffff810b90a9 > [ 1128.964809] ffff880000022c80 0000000000000000 000000000000f8a0 ffff88033c3abfd8 > [ 1128.964812] 00000000000155c0 00000000000155c0 ffff88033d813f90 ffff88033d814288 > [ 1128.964814] Call Trace: > [ 1128.964816] [<ffffffff810b90a9>] ? __alloc_pages_nodemask+0x10f/0x5ce > [ 1128.964819] [<ffffffff81064977>] ? bit_waitqueue+0x10/0xa0 > [ 1128.964822] [<ffffffffa017e212>] ? do_get_write_access+0x22c/0x441 [jbd] > [ 1128.964824] [<ffffffff81064a84>] ? wake_bit_function+0x0/0x23 > [ 1128.964826] [<ffffffff8110bbed>] ? __getblk+0x26/0x29a > [ 1128.964829] [<ffffffff810170cd>] ? read_tsc+0xa/0x20 > [ 1128.964832] [<ffffffffa017e449>] ? journal_get_write_access+0x22/0x33 [jbd] > [ 1128.964836] [<ffffffffa01a040f>] ? __ext3_journal_get_write_access+0x1e/0x47 [ext3] > [ 1128.964840] [<ffffffffa019418e>] ? ext3_reserve_inode_write+0x3e/0x75 [ext3] > [ 1128.964844] [<ffffffffa01941e6>] ? ext3_mark_inode_dirty+0x21/0x3c [ext3] > [ 1128.964847] [<ffffffffa019a6b4>] ? ext3_unlink+0x14f/0x1b4 [ext3] > [ 1128.964850] [<ffffffff810f645d>] ? vfs_unlink+0x72/0xa6 > [ 1128.964852] [<ffffffff810f7cb5>] ? do_unlinkat+0xad/0x134 > [ 1128.964854] [<ffffffff810f9b11>] ? vfs_readdir+0x92/0xa7 > [ 1128.964856] [<ffffffff810f9c9d>] ? sys_getdents+0xb9/0xc7 > [ 1128.964859] [<ffffffff81010b42>] ? system_call_fastpath+0x16/0x1b > > > One like this: > > [ 1128.964955] bonnie D 0000000000000000 0 2244 2236 0x00000000 > [ 1128.964958] ffffffff8144b1f0 0000000000000082 0000000000000000 ffffffff811837bb > [ 1128.964960] ffff88033d813170 00000010a00fc7ab 000000000000f8a0 ffff88033ceadfd8 > [ 1128.964963] 00000000000155c0 00000000000155c0 ffff88033d813170 ffff88033d813468 > [ 1128.964965] Call Trace: > [ 1128.964967] [<ffffffff811837bb>] ? cfq_set_request+0x2af/0x319 > [ 1128.964969] [<ffffffff811780d7>] ? get_request+0x7e/0x2ba > [ 1128.964971] [<ffffffff812edf36>] ? io_schedule+0x73/0xb7 > [ 1128.964973] [<ffffffff81178403>] ? get_request_wait+0xf0/0x188 > [ 1128.964976] [<ffffffff81064a56>] ? autoremove_wake_function+0x0/0x2e > [ 1128.964978] [<ffffffff81178792>] ? __make_request+0x2f7/0x428 > [ 1128.964980] [<ffffffff81176f6f>] ? generic_make_request+0x299/0x2f9 > [ 1128.964983] [<ffffffff811770a5>] ? submit_bio+0xd6/0xf2 > [ 1128.964985] [<ffffffff8110b50b>] ? submit_bh+0xf5/0x115 > [ 1128.964987] [<ffffffff8110da7c>] ? __block_write_full_page+0x1d6/0x2ac > [ 1128.964990] [<ffffffff8110c801>] ? end_buffer_async_write+0x0/0x13b > [ 1128.964992] [<ffffffff8110fabc>] ? blkdev_get_block+0x0/0x57 > [ 1128.964994] [<ffffffff810b9572>] ? __writepage+0xa/0x25 > [ 1128.964997] [<ffffffff810b9bf9>] ? write_cache_pages+0x20b/0x327 > [ 1128.964999] [<ffffffff810b9568>] ? __writepage+0x0/0x25 > [ 1128.965003] [<ffffffff810b3cda>] ? __filemap_fdatawrite_range+0x4b/0x54 > [ 1128.965007] [<ffffffffa019b5bb>] ? ext3_sync_fs+0x42/0x4b [ext3] > [ 1128.965009] [<ffffffff810b4998>] ? filemap_write_and_wait+0x17/0x32 > [ 1128.965013] [<ffffffff8112d015>] ? sync_quota_sb+0x5d/0xf6 > [ 1128.965015] [<ffffffff8110a38d>] ? __sync_filesystem+0x43/0x70 > [ 1128.965017] [<ffffffff8110a454>] ? sync_filesystems+0x9a/0xe3 > [ 1128.965019] [<ffffffff8110a4e2>] ? sys_sync+0x1c/0x2e > [ 1128.965021] [<ffffffff81010b42>] ? system_call_fastpath+0x16/0x1b > > Seven bonnies like this: > > [ 1128.965231] bonnie D 0000000000000000 0 2253 2236 0x00000000 > [ 1128.965233] ffffffff8144b1f0 0000000000000082 0000000000000000 ffff88034ac155c0 > [ 1128.965236] ffff88033ce7f700 ffffffff81047f43 000000000000f8a0 ffff88033c249fd8 > [ 1128.965239] 00000000000155c0 00000000000155c0 ffff88033d814db0 ffff88033d8150a8 > [ 1128.965241] Call Trace: > [ 1128.965243] [<ffffffff81047f43>] ? finish_task_switch+0x3a/0xa7 > [ 1128.965246] [<ffffffff812ee653>] ? __mutex_lock_common+0x10b/0x17b > [ 1128.965248] [<ffffffff812ee77b>] ? mutex_lock+0x1a/0x31 > [ 1128.965250] [<ffffffff8110a3cd>] ? sync_filesystems+0x13/0xe3 > [ 1128.965252] [<ffffffff8110a4d8>] ? sys_sync+0x12/0x2e > [ 1128.965255] [<ffffffff81010b42>] ? system_call_fastpath+0x16/0x1b > > And three bonnies like this: > > [ 1128.965023] bonnie D 0000000000000000 0 2245 2236 0x00000000 > [ 1128.965026] ffff88033e44b880 0000000000000082 0000000000000000 00000000000155c0 > [ 1128.965028] 0000000000000005 00000000000155c0 000000000000f8a0 ffff88033cfc3fd8 > [ 1128.965031] 00000000000155c0 00000000000155c0 ffff88033d817100 ffff88033d8173f8 > [ 1128.965033] Call Trace: > [ 1128.965035] [<ffffffff812ee653>] ? __mutex_lock_common+0x10b/0x17b > [ 1128.965038] [<ffffffff8104a196>] ? try_to_wake_up+0x249/0x259 > [ 1128.965040] [<ffffffff812ee77b>] ? mutex_lock+0x1a/0x31 > [ 1128.965042] [<ffffffff8110a3cd>] ? sync_filesystems+0x13/0xe3 > [ 1128.965044] [<ffffffff8110a4d8>] ? sys_sync+0x12/0x2e > [ 1128.965047] [<ffffffff81010b42>] ? system_call_fastpath+0x16/0x1b > > > -- Jens Axboe ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Bugme-new] [Bug 15426] New: Running many copies of bonnie++ on different filesystems seems to deadlock in sync 2010-03-03 11:50 ` Jens Axboe @ 2010-03-03 12:37 ` John Hughes 2010-03-03 12:03 ` Jens Axboe 0 siblings, 1 reply; 13+ messages in thread From: John Hughes @ 2010-03-03 12:37 UTC (permalink / raw) To: Jens Axboe; +Cc: Andrew Morton, bugzilla-daemon, bugme-daemon, linux-kernel Jens Axboe wrote: > Is IO still going on, or does it appear to be stuck? From the traces > below, we have various procs caught in waiting for a request. So if > things are totally stuck, it could be some race in there. > I see I/O happening on three or four of the disks. Just a thought. What exactly is sync(2) supposed to do - block until there are no more dirty pages, or block until all pages that were dirty when the sync was done are clean? In other words is the problem simply that pages are being dirtied faster than the sync is writing them out? ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Bugme-new] [Bug 15426] New: Running many copies of bonnie++ on different filesystems seems to deadlock in sync 2010-03-03 12:37 ` John Hughes @ 2010-03-03 12:03 ` Jens Axboe 2010-03-03 12:45 ` John Hughes ` (3 more replies) 0 siblings, 4 replies; 13+ messages in thread From: Jens Axboe @ 2010-03-03 12:03 UTC (permalink / raw) To: John Hughes; +Cc: Andrew Morton, bugzilla-daemon, bugme-daemon, linux-kernel [-- Attachment #1: Type: text/plain, Size: 908 bytes --] On Wed, Mar 03 2010, John Hughes wrote: > Jens Axboe wrote: >> Is IO still going on, or does it appear to be stuck? From the traces >> below, we have various procs caught in waiting for a request. So if >> things are totally stuck, it could be some race in there. >> > I see I/O happening on three or four of the disks. > > Just a thought. What exactly is sync(2) supposed to do - block until > there are no more dirty pages, or block until all pages that were dirty > when the sync was done are clean? In other words is the problem simply > that pages are being dirtied faster than the sync is writing them out? Our sync is currently broken in that regard, since it'll wait for too long. We have a debated patch going, I have included it below. Any chance you could give it a whirl? The semantics of sync are supposed to be 'wait for dirty IO generated BEFORE this sync call'. -- Jens Axboe [-- Attachment #2: writeback-fix-broken-sync-2.6.32.patch --] [-- Type: text/x-diff, Size: 2409 bytes --] commit 057226ca7447880e4e766a82cf32197e492ba963 Author: Jens Axboe <jens.axboe@oracle.com> Date: Fri Feb 12 10:14:34 2010 +0100 writeback: Fix broken sync writeback There's currently a writeback bug in the 2.6.32 and 2.6.33-rc kernels that prevent proper writeback when sync(1) is being run. Instead of flushing everything older than the sync run, it will do chunks of normal MAX_WRITEBACK_PAGES writeback and restart over and over. This results in very suboptimal writeback for many files, see the initial report from Jan Engelhardt: http://lkml.org/lkml/2010/1/22/382 This fixes it by using the passed in page writeback count, instead of doing MAX_WRITEBACK_PAGES batches, which gets us much better performance (Jan reports it's up from ~400KB/sec to 10MB/sec) and makes sync(1) finish properly even when new pages are being dirted. Thanks to Jan Kara <jack@suse.cz> for spotting this problem! Cc: stable@kernel.org Reported-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c index 9d5360c..8a46c67 100644 --- a/fs/fs-writeback.c +++ b/fs/fs-writeback.c @@ -773,6 +773,8 @@ static long wb_writeback(struct bdi_writeback *wb, } for (;;) { + long to_write = 0; + /* * Stop writeback when nr_pages has been consumed */ @@ -786,13 +788,18 @@ static long wb_writeback(struct bdi_writeback *wb, if (args->for_background && !over_bground_thresh()) break; + if (args->sync_mode == WB_SYNC_ALL) + to_write = args->nr_pages; + if (!to_write) + to_write = MAX_WRITEBACK_PAGES; + wbc.more_io = 0; wbc.encountered_congestion = 0; - wbc.nr_to_write = MAX_WRITEBACK_PAGES; + wbc.nr_to_write = to_write; wbc.pages_skipped = 0; writeback_inodes_wb(wb, &wbc); - args->nr_pages -= MAX_WRITEBACK_PAGES - wbc.nr_to_write; - wrote += MAX_WRITEBACK_PAGES - wbc.nr_to_write; + args->nr_pages -= to_write - wbc.nr_to_write; + wrote += to_write - wbc.nr_to_write; /* * If we consumed everything, see if we have more @@ -807,7 +814,7 @@ static long wb_writeback(struct bdi_writeback *wb, /* * Did we write something? Try for more */ - if (wbc.nr_to_write < MAX_WRITEBACK_PAGES) + if (wbc.nr_to_write < to_write) continue; /* * Nothing written. Wait for some inode to ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [Bugme-new] [Bug 15426] New: Running many copies of bonnie++ on different filesystems seems to deadlock in sync 2010-03-03 12:03 ` Jens Axboe @ 2010-03-03 12:45 ` John Hughes 2010-03-03 12:09 ` Jens Axboe 2010-03-03 14:42 ` Andre Noll ` (2 subsequent siblings) 3 siblings, 1 reply; 13+ messages in thread From: John Hughes @ 2010-03-03 12:45 UTC (permalink / raw) To: Jens Axboe; +Cc: Andrew Morton, bugzilla-daemon, bugme-daemon, linux-kernel Jens Axboe wrote: > On Wed, Mar 03 2010, John Hughes wrote: > >> Just a thought. What exactly is sync(2) supposed to do - block until >> there are no more dirty pages, or block until all pages that were dirty >> when the sync was done are clean? In other words is the problem simply >> that pages are being dirtied faster than the sync is writing them out? >> > > Our sync is currently broken in that regard, since it'll wait for too > long. We have a debated patch going, I have included it below. Any > chance you could give it a whirl? > > The semantics of sync are supposed to be 'wait for dirty IO generated > BEFORE this sync call'. > Ok, I will try this patch and get back to you. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Bugme-new] [Bug 15426] New: Running many copies of bonnie++ on different filesystems seems to deadlock in sync 2010-03-03 12:45 ` John Hughes @ 2010-03-03 12:09 ` Jens Axboe 0 siblings, 0 replies; 13+ messages in thread From: Jens Axboe @ 2010-03-03 12:09 UTC (permalink / raw) To: John Hughes; +Cc: Andrew Morton, bugzilla-daemon, bugme-daemon, linux-kernel On Wed, Mar 03 2010, John Hughes wrote: > Jens Axboe wrote: >> On Wed, Mar 03 2010, John Hughes wrote: >> >>> Just a thought. What exactly is sync(2) supposed to do - block until >>> there are no more dirty pages, or block until all pages that were >>> dirty when the sync was done are clean? In other words is the >>> problem simply that pages are being dirtied faster than the sync is >>> writing them out? >>> >> >> Our sync is currently broken in that regard, since it'll wait for too >> long. We have a debated patch going, I have included it below. Any >> chance you could give it a whirl? >> >> The semantics of sync are supposed to be 'wait for dirty IO generated >> BEFORE this sync call'. >> > Ok, I will try this patch and get back to you. Looking at your bug report, I saw that you were on 2.6.32. If you are testing 2.6.33, let me know, the patch wont apply (I have another variant for that). -- Jens Axboe ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Bugme-new] [Bug 15426] New: Running many copies of bonnie++ on different filesystems seems to deadlock in sync 2010-03-03 12:03 ` Jens Axboe 2010-03-03 12:45 ` John Hughes @ 2010-03-03 14:42 ` Andre Noll 2010-03-04 14:55 ` John Hughes 2010-03-03 18:33 ` John Hughes 2010-03-04 11:16 ` John Hughes 3 siblings, 1 reply; 13+ messages in thread From: Andre Noll @ 2010-03-03 14:42 UTC (permalink / raw) To: Jens Axboe Cc: John Hughes, Andrew Morton, bugzilla-daemon, bugme-daemon, linux-kernel [-- Attachment #1: Type: text/plain, Size: 1577 bytes --] On 13:03, Jens Axboe wrote: > Our sync is currently broken in that regard, since it'll wait for too > long. We have a debated patch going, I have included it below. Any > chance you could give it a whirl? That's the same patch you posted two weeks ago and on which Linus commented he thinks it is broken. He also proposed another, simpler patch (included below) which I'm running for a few days and which fixes the sync problem for me. However, it is not clear at all whether that simpler patch is correct. Any comments on that? Thanks Andre --- fs/fs-writeback.c | 15 ++++++++++----- 1 files changed, 10 insertions(+), 5 deletions(-) diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c index 1a7c42c..a0a8424 100644 --- a/fs/fs-writeback.c +++ b/fs/fs-writeback.c @@ -738,11 +738,16 @@ static long wb_writeback(struct bdi_writeback *wb, long wrote = 0; struct inode *inode; - if (wbc.for_kupdate) { - wbc.older_than_this = &oldest_jif; - oldest_jif = jiffies - - msecs_to_jiffies(dirty_expire_interval * 10); - } + /* + * We never write back data that was dirtied _after_ we + * started writeback. But kupdate doesn't even want to + * write back recently dirtied stuff, only older data. + */ + oldest_jif = jiffies-1; + wbc.older_than_this = &oldest_jif; + if (wbc.for_kupdate) + oldest_jif -= msecs_to_jiffies(dirty_expire_interval * 10); + if (!wbc.range_cyclic) { wbc.range_start = 0; wbc.range_end = LLONG_MAX; -- The only person who always got his work done by Friday was Robinson Crusoe [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [Bugme-new] [Bug 15426] New: Running many copies of bonnie++ on different filesystems seems to deadlock in sync 2010-03-03 14:42 ` Andre Noll @ 2010-03-04 14:55 ` John Hughes 2010-03-04 17:42 ` Andre Noll 0 siblings, 1 reply; 13+ messages in thread From: John Hughes @ 2010-03-04 14:55 UTC (permalink / raw) To: Andre Noll Cc: Jens Axboe, Andrew Morton, bugzilla-daemon, bugme-daemon, linux-kernel Andre Noll wrote: > On 13:03, Jens Axboe wrote: > > >> Our sync is currently broken in that regard, since it'll wait for too >> long. We have a debated patch going, I have included it below. Any >> chance you could give it a whirl? >> > > That's the same patch you posted two weeks ago and on which Linus > commented he thinks it is broken. He also proposed another, simpler > patch (included below) which I'm running for a few days and which > fixes the sync problem for me. > I've tried the "simpler patch" on my system and it makes no difference for me. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Bugme-new] [Bug 15426] New: Running many copies of bonnie++ on different filesystems seems to deadlock in sync 2010-03-04 14:55 ` John Hughes @ 2010-03-04 17:42 ` Andre Noll 2010-03-05 10:44 ` John Hughes 0 siblings, 1 reply; 13+ messages in thread From: Andre Noll @ 2010-03-04 17:42 UTC (permalink / raw) To: John Hughes Cc: Jens Axboe, Andrew Morton, bugzilla-daemon, bugme-daemon, linux-kernel [-- Attachment #1: Type: text/plain, Size: 639 bytes --] On 15:55, John Hughes wrote: > >That's the same patch you posted two weeks ago and on which Linus > >commented he thinks it is broken. He also proposed another, simpler > >patch (included below) which I'm running for a few days and which > >fixes the sync problem for me. > > > I've tried the "simpler patch" on my system and it makes no difference > for me. Does it make no difference to Jens' patch, or no difference to an unpatched kernel? BTW, switching from cfq to noop or deadline also improved things a bit for me. Thanks Andre -- The only person who always got his work done by Friday was Robinson Crusoe [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Bugme-new] [Bug 15426] New: Running many copies of bonnie++ on different filesystems seems to deadlock in sync 2010-03-04 17:42 ` Andre Noll @ 2010-03-05 10:44 ` John Hughes 0 siblings, 0 replies; 13+ messages in thread From: John Hughes @ 2010-03-05 10:44 UTC (permalink / raw) To: Andre Noll Cc: Jens Axboe, Andrew Morton, bugzilla-daemon, bugme-daemon, linux-kernel Andre Noll wrote: > On 15:55, John Hughes wrote: > >>> That's the same patch you posted two weeks ago and on which Linus >>> commented he thinks it is broken. He also proposed another, simpler >>> patch (included below) which I'm running for a few days and which >>> fixes the sync problem for me. >>> >>> >> I've tried the "simpler patch" on my system and it makes no difference >> for me. >> > > Does it make no difference to Jens' patch, or no difference to an > unpatched kernel? > No difference to an unpatched kernel. I didn't think of stacking the patches. > BTW, switching from cfq to noop or deadline also improved things a > bit for me. > Will try that. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Bugme-new] [Bug 15426] New: Running many copies of bonnie++ on different filesystems seems to deadlock in sync 2010-03-03 12:03 ` Jens Axboe 2010-03-03 12:45 ` John Hughes 2010-03-03 14:42 ` Andre Noll @ 2010-03-03 18:33 ` John Hughes 2010-03-04 11:16 ` John Hughes 3 siblings, 0 replies; 13+ messages in thread From: John Hughes @ 2010-03-03 18:33 UTC (permalink / raw) To: Jens Axboe; +Cc: Andrew Morton, bugzilla-daemon, bugme-daemon, linux-kernel Jens Axboe wrote: > Our sync is currently broken in that regard, since it'll wait for too > long. We have a debated patch going, I have included it below. Any > chance you could give it a whirl? > > The semantics of sync are supposed to be 'wait for dirty IO generated > BEFORE this sync call'. > > This patch doesn't seem to change things. It's a bit late here, I'll check my results tomorrow. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Bugme-new] [Bug 15426] New: Running many copies of bonnie++ on different filesystems seems to deadlock in sync 2010-03-03 12:03 ` Jens Axboe ` (2 preceding siblings ...) 2010-03-03 18:33 ` John Hughes @ 2010-03-04 11:16 ` John Hughes 3 siblings, 0 replies; 13+ messages in thread From: John Hughes @ 2010-03-04 11:16 UTC (permalink / raw) To: Jens Axboe; +Cc: Andrew Morton, bugzilla-daemon, bugme-daemon, linux-kernel Jens Axboe wrote: > Our sync is currently broken in that regard, since it'll wait for too > long. We have a debated patch going, I have included it below. Any > chance you could give it a whirl? > Ok, I confirm that this patch does not fix my problem. I'll try the "simpler" patch. ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2010-03-05 10:44 UTC | newest] Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- [not found] <bug-15426-10286@http.bugzilla.kernel.org/> 2010-03-03 0:16 ` [Bugme-new] [Bug 15426] New: Running many copies of bonnie++ on different filesystems seems to deadlock in sync Andrew Morton 2010-03-03 12:09 ` John Hughes 2010-03-03 11:50 ` Jens Axboe 2010-03-03 12:37 ` John Hughes 2010-03-03 12:03 ` Jens Axboe 2010-03-03 12:45 ` John Hughes 2010-03-03 12:09 ` Jens Axboe 2010-03-03 14:42 ` Andre Noll 2010-03-04 14:55 ` John Hughes 2010-03-04 17:42 ` Andre Noll 2010-03-05 10:44 ` John Hughes 2010-03-03 18:33 ` John Hughes 2010-03-04 11:16 ` John Hughes
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.