From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S932584AbcINM6j (ORCPT ); Wed, 14 Sep 2016 08:58:39 -0400
Received: from mx1.redhat.com ([209.132.183.28]:53082 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1756090AbcINM6i (ORCPT ); Wed, 14 Sep 2016 08:58:38 -0400
Date: Wed, 14 Sep 2016 14:58:36 +0200
From: Oleg Nesterov
To: Nikolay Borisov
Cc: "Paul E. McKenney" , linux-kernel@vger.kernel.org
Subject: Re: BUG_ON in rcu_sync_func triggered
Message-ID: <20160914125835.GA6673@redhat.com>
References: <57D69CEC.5010103@kyup.com> <20160912130124.GA7984@redhat.com>
	<57D7B6F5.4040106@kyup.com> <20160913131852.GA4112@redhat.com>
	<20160913134304.GA26160@redhat.com> <57D80EB8.9080405@kyup.com>
	<57D80F52.6090804@kyup.com> <20160913152042.GA30160@redhat.com>
	<57D8EE82.3090502@kyup.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <57D8EE82.3090502@kyup.com>
User-Agent: Mutt/1.5.24 (2015-08-30)
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16
	(mx1.redhat.com [10.5.110.32]); Wed, 14 Sep 2016 12:58:37 +0000 (UTC)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 09/14, Nikolay Borisov wrote:
>
> [ 557.006656] [] dump_stack+0x6b/0xa0
> [ 557.012737] [] warn_slowpath_common+0x95/0xe0
> [ 557.019781] [] warn_slowpath_null+0x1a/0x20
> [ 557.026645] [] rcu_sync_enter+0x148/0x1a0
> [ 557.033309] [] percpu_down_write+0x1e/0xf0
> [ 557.040074] [] ? call_rwsem_down_write_failed+0x13/0x20
> [ 557.048092] [] freeze_super+0xab/0x1b0
> [ 557.054456] [] do_vfs_ioctl+0x29d/0x560
> [ 557.060920] [] ? SYSC_newfstat+0x2e/0x40
> [ 557.067480] [] SyS_ioctl+0x92/0xa0
> [ 557.073465] [] entry_SYSCALL_64_fastpath+0x12/0x6a
> [ 557.081015] ---[ end trace fc087420ac1d8f16 ]---
> [ 557.086507] XXX: ffff880473326b08 gp=2 cnt=-1 cb=1
> [ 557.092326] rbd: rbd19: added with size 0x500000000
>
> This is: if (WARN_ON(rsp->gp_count < 0)) xxx(rsp);

Thanks a lot. This is what I wanted to see. However, I can't understand
why you did not hit the similar WARN_ON(rsp->gp_count <= 0) in
rcu_sync_exit() before that.

OK, in any case this doesn't look like a bug in rcu/sync.c, could you
please try the fix below? Not sure it will help, perhaps there is
something else...

No need to revert the previous debugging patch.

Thanks,

Oleg.

diff --git a/fs/super.c b/fs/super.c
index d78b984..a90bdff 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -1344,7 +1344,9 @@ int thaw_super(struct super_block *sb)
 	int error;
 
 	down_write(&sb->s_umount);
-	if (sb->s_writers.frozen == SB_UNFROZEN) {
+	if (sb->s_writers.frozen != SB_FREEZE_COMPLETE) {
+		if (sb->s_writers.frozen != SB_UNFROZEN)
+			pr_crit("THAW: hit the race: %d\n", sb->s_writers.frozen);
 		up_write(&sb->s_umount);
 		return -EINVAL;
 	}
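
For anyone following along, here is a small userspace sketch of what the changed condition in thaw_super() accepts and rejects. It is not kernel code. The helper names thaw_check_old(), thaw_check_new(), thaw_checks_differ() and thaw_model_selfcheck() are made up for illustration, and the enum values are assumed to match the SB_* freeze levels in include/linux/fs.h. It only models the early-return test from the hunk above, nothing else in thaw_super().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Userspace model of the freeze levels named in the patch.
 * Assumption: values mirror the SB_* constants in include/linux/fs.h.
 */
enum sb_freeze_state {
	SB_UNFROZEN = 0,
	SB_FREEZE_WRITE = 1,
	SB_FREEZE_PAGEFAULT = 2,
	SB_FREEZE_FS = 3,
	SB_FREEZE_COMPLETE = 4,
};

/* Test before the patch: only a fully unfrozen sb is rejected. */
static int thaw_check_old(enum sb_freeze_state frozen)
{
	return frozen == SB_UNFROZEN ? -EINVAL : 0;
}

/* Test after the patch: anything short of SB_FREEZE_COMPLETE is rejected. */
static int thaw_check_new(enum sb_freeze_state frozen)
{
	return frozen != SB_FREEZE_COMPLETE ? -EINVAL : 0;
}

/* Hypothetical helper: true for states where the two tests disagree. */
static bool thaw_checks_differ(enum sb_freeze_state frozen)
{
	return thaw_check_old(frozen) != thaw_check_new(frozen);
}

/* Hypothetical self-check: walks the states and asserts the expected results. */
static void thaw_model_selfcheck(void)
{
	/* Both tests agree at the two end states. */
	assert(!thaw_checks_differ(SB_UNFROZEN));
	assert(!thaw_checks_differ(SB_FREEZE_COMPLETE));

	/* They disagree only while a freeze is still in progress. */
	assert(thaw_checks_differ(SB_FREEZE_WRITE));
	assert(thaw_checks_differ(SB_FREEZE_PAGEFAULT));
	assert(thaw_checks_differ(SB_FREEZE_FS));
}
```

Calling thaw_model_selfcheck() from a main() exercises all five states. The three intermediate states are exactly the ones for which the patched kernel would print "THAW: hit the race" and return -EINVAL, whereas the old test let the thaw proceed.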