From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1757502AbZFBCHd (ORCPT ); Mon, 1 Jun 2009 22:07:33 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1755655AbZFBCHX (ORCPT ); Mon, 1 Jun 2009 22:07:23 -0400 Received: from mga09.intel.com ([134.134.136.24]:44153 "EHLO mga09.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755124AbZFBCHX (ORCPT ); Mon, 1 Jun 2009 22:07:23 -0400 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="4.41,287,1241420400"; d="scan'208";a="520835376" Subject: Re: [PATCH 0/12] Per-bdi writeback flusher threads v7 From: "Zhang, Yanmin" To: Jens Axboe Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, chris.mason@oracle.com, david@fromorbit.com, hch@infradead.org, akpm@linux-foundation.org, jack@suse.cz, richard@rsk.demon.co.uk In-Reply-To: <20090527061705.GQ11363@kernel.dk> References: <1243330430-9964-1-git-send-email-jens.axboe@oracle.com> <1243402035.2560.127.camel@ymzhang> <20090527061705.GQ11363@kernel.dk> Content-Type: text/plain Date: Tue, 02 Jun 2009 10:07:33 +0800 Message-Id: <1243908453.2560.139.camel@ymzhang> Mime-Version: 1.0 X-Mailer: Evolution 2.22.1 (2.22.1-2.fc9) Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, 2009-05-27 at 08:17 +0200, Jens Axboe wrote: > On Wed, May 27 2009, Zhang, Yanmin wrote: > > On Tue, 2009-05-26 at 11:33 +0200, Jens Axboe wrote: > > > Hi, > > > > > > Here's the 7th version of the writeback patches. Changes since > > > v5/v6: > > > > > > - Move the sync_supers() to the global bdi_forker_task() thread, so we > > > don't writeback the supers from all the bdi kupdated() tasks. > > > - Make bdi_start_writeback() and bdi_writeback_all() be sync when called > > > with WB_SYNC_ALL only. > > > - Shuffle some more things around to make a cleaner series. The sync vs > > > async nature of bdi_writeback_all() and bdi_start_writeback() isn't > > > consistent through the series, but otherwise things should be sane. > > > > > > I'd appreciate if Richard and Yanmin could re-run testing with this, > > > just to make sure that things are sane. For ease of patching, I've > > > put the full diff here: > > > > > > http://kernel.dk/writeback-v7.patch > > I ported it to 2.6.30-rc6 with some change in file mm/page-write.c, so I > > could compare with old data. > > > > See the attachment. > > > > The new testing hits the hang issue again. It seems there is still a race. > > It's actually not a race, it's a deadlock on the bdi_lock. If you find > the bdi-default task, it should be stuck in the mutex slow path. I > posted this quickie [1] yesterday but didn't test it, I'll test it today and > post a v8. > > [1] http://lkml.org/lkml/2009/5/26/401 Jens, I checked V9 and found it includes the fix. However, the issue appears again on another machine when doing 'umount -f /dev/sda3'. INFO: task umount:17110 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. umount D ffffc20000011300 6128 17110 16225 ffffffff8093e350 0000000000000086 0000000000000000 0000000000000000 ffffe200006482e8 0000000000004000 0000000000011300 000000000000c868 ffff8801be07dd48 0000000000000000 ffff8801be5f8000 ffff8801be5f8388 Call Trace: [] ? bdi_sched_wait+0x0/0xd [] ? schedule+0x9/0x1e [] ? bdi_sched_wait+0x9/0xd [] ? __wait_on_bit+0x41/0x71 [] ? bdi_sched_wait+0x0/0xd [] ? out_of_line_wait_on_bit+0x6b/0x77 [] ? wake_bit_function+0x0/0x23 [] ? bdi_writeback_all+0x117/0x14e [] ? generic_sync_sb_inodes+0x31/0xdc [] ? sync_inodes_sb+0x83/0x88 [] ? fsync_super+0x9/0x16 [] ? generic_shutdown_super+0x25/0xde [] ? kill_block_super+0x22/0x3a [] ? deactivate_super+0x5f/0x76 [] ? sys_umount+0x2cd/0x2fc [] ? system_call_fastpath+0x16/0x1b Then, I run command sync and hit a panic: [root@lkp-ne02 ~]# BUG: unable to handle kernel paging request at 0000000561887e50 IP: [] task_rq_lock+0x28/0x6b PGD bc551067 PUD 0 Oops: 0000 [#1] SMP last sysfs file: /sys/class/net/eth1/ifindex CPU 9 Modules linked in: Pid: 27470, comm: sync Not tainted 2.6.30-rc7-bdiflusherv9 #1 X8DTN RIP: 0010:[] [] task_rq_lock+0x28/0x6b RSP: 0018:ffff8800bccd9e48 EFLAGS: 00010096 RAX: 00000000bc1d0420 RBX: 0000000000011300 RCX: 0000000000000000 RDX: 00000000000110fd RSI: ffff8800bccd9e88 RDI: ffff8800bd703090 RBP: 0000000000011300 R08: fefefefefefefeff R09: 0000000000000000 R10: 00000034cbd6da70 R11: 00000034cbacab70 R12: ffff8800bccd9e88 R13: ffff8800bd703090 R14: 000000000000000f R15: 0000000000000000 FS: 00007f04696dd6f0(0000) GS:ffffc20001200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000561887e50 CR3: 00000000bc0d6000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process sync (pid: 27470, threadinfo ffff8800bccd8000, task ffff8800bc4ee810) Stack: 00000034cbaca000 000000000000000f ffff8800bd703090 ffff8800bccd9f18 ffff8800bd780138 ffffffff80233889 0000000000000000 ffffffff802c2be6 0000000000000296 ffff8800bcd6f900 ffff8800bd5a9d10 ffff8800bccd9f18 Call Trace: [] ? try_to_wake_up+0x8f/0x2d0 [] ? bdi_alloc_work+0x28/0xa6 [] ? bdi_writeback_all+0x13d/0x14e [] ? wakeup_flusher_threads+0x4c/0x51 [] ? do_sync+0xb/0x5a [] ? sys_sync+0xe/0x14 [] ? system_call_fastpath+0x16/0x1b Code: 41 59 c3 41 55 49 89 fd 41 54 49 89 f4 55 48 c7 c5 00 13 01 00 53 48 83 ec 08 9c 58 fa 49 89 04 24 49 8b 45 08 48 89 e RIP [] task_rq_lock+0x28/0x6b RSP CR2: 0000000561887e50 Another machine hangs twice with 2.6.30-rc7+bdiflusherV9 after it's idle for hours without testing. Serial console didn't capture any useful info. As for perforamnce testing result, there is no big variation from previous vesions. > > > INFO: task sync:30013 blocked for more than 120 seconds. > > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > > sync D ffffc20000011300 4736 30013 28019 > > ffffffff8093e350 0000000000000086 0000000000000000 0000000000000000 > > 0000000000021220 0000000000004000 0000000000011300 000000000000c868 > > ffff880000016c48 ffffe20002934b30 ffff8800720b3780 ffff8800720b3b08 > > Call Trace: > > [] ? bdi_sched_wait+0x0/0xd > > [] ? schedule+0x9/0x1e > > [] ? bdi_sched_wait+0x9/0xd > > [] ? __wait_on_bit+0x41/0x71 > > [] ? bdi_sched_wait+0x0/0xd > > [] ? out_of_line_wait_on_bit+0x6b/0x77 > > [] ? wake_bit_function+0x0/0x23 > > [] ? __wake_up+0x30/0x44 > > [] ? bdi_writeback_all+0x20b/0x24c > > [] ? pagevec_lookup_tag+0x1a/0x21 > > [] ? wait_on_page_writeback_range+0xce/0x11b > > [] ? generic_sync_sb_inodes+0x36/0xe1 > > [] ? sync_inodes_sb+0x83/0x88 > > [] ? __sync_inodes+0x46/0x8f > > [] ? do_sync+0x36/0x5a > > [] ? sys_sync+0xe/0x14 > > [] ? system_call_fastpath+0x16/0x1b > > > >