* [Ocfs2-devel] [Ocfs2-dev] BUG: deadlock with umount and ocfs2 workqueue triggered by ocfs2rec thread
@ 2018-01-12  3:43 Shichangkuo
  2018-01-12  5:50 ` Joseph Qi
  2018-01-12  8:25 ` Eric Ren
  0 siblings, 2 replies; 13+ messages in thread
From: Shichangkuo @ 2018-01-12  3:43 UTC (permalink / raw)
  To: ocfs2-devel

Hi all,
Now we are testing ocfs2 with the 4.14 kernel, and we found a deadlock between umount and the ocfs2 workqueue, triggered by the ocfs2rec thread. The stacks are as follows:
journal recovery work:
[<ffffffff8a8c0694>] call_rwsem_down_read_failed+0x14/0x30
[<ffffffffc0d5d652>] ocfs2_finish_quota_recovery+0x62/0x450 [ocfs2]
[<ffffffffc0d21221>] ocfs2_complete_recovery+0xc1/0x440 [ocfs2]
[<ffffffff8a09a1f0>] process_one_work+0x130/0x350
[<ffffffff8a09a946>] worker_thread+0x46/0x3b0
[<ffffffff8a0a0e51>] kthread+0x101/0x140
[<ffffffff8aa002ff>] ret_from_fork+0x1f/0x30
[<ffffffffffffffff>] 0xffffffffffffffff

/bin/umount:
[<ffffffff8a099b24>] flush_workqueue+0x104/0x3e0
[<ffffffffc0cf18db>] ocfs2_truncate_log_shutdown+0x3b/0xc0 [ocfs2]
[<ffffffffc0d4fd6c>] ocfs2_dismount_volume+0x8c/0x3d0 [ocfs2]
[<ffffffffc0d500e1>] ocfs2_put_super+0x31/0xa0 [ocfs2]
[<ffffffff8a2445bd>] generic_shutdown_super+0x6d/0x120
[<ffffffff8a24469d>] kill_block_super+0x2d/0x60
[<ffffffff8a244e71>] deactivate_locked_super+0x51/0x90
[<ffffffff8a263a1b>] cleanup_mnt+0x3b/0x70
[<ffffffff8a09e9c6>] task_work_run+0x86/0xa0
[<ffffffff8a003d70>] exit_to_usermode_loop+0x6d/0xa9
[<ffffffff8a003a2d>] do_syscall_64+0x11d/0x130
[<ffffffff8aa00113>] entry_SYSCALL64_slow_path+0x25/0x25
[<ffffffffffffffff>] 0xffffffffffffffff
Function ocfs2_finish_quota_recovery() tries to take sb->s_umount, which is already held by the umount thread, so the two end up deadlocked.
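To make the inversion explicit, here is a simplified sketch of the two paths
(function names taken from the stacks above, bodies heavily abbreviated, not
the exact code):

        /* journal recovery work item, running on the ocfs2 workqueue */
        ocfs2_finish_quota_recovery()
        {
                down_read(&sb->s_umount);  /* blocks: umount already holds s_umount */
                /* ... replay the local quota files ... */
                up_read(&sb->s_umount);
        }

        /* umount path, with sb->s_umount held for write by the VFS */
        ocfs2_dismount_volume()
        {
                /* ... */
                ocfs2_truncate_log_shutdown(osb);
                        /* -> flush_workqueue(): waits for the recovery work above */
                /* ... */
        }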
This issue was introduced by c3b004460d77bf3f980d877be539016f2df4df12 and 5f530de63cfc6ca8571cbdf58af63fb166cc6517.
I think we cannot use ::s_umount here, but the mutex ::dqonoff_mutex has already been removed.
Shall we add a new mutex?

Thanks
Changkuo
-------------------------------------------------------------------------------------------------------------------------------------
This e-mail and its attachments contain confidential information from New H3C, which is
intended only for the person or entity whose address is listed above. Any use of the
information contained herein in any way (including, but not limited to, total or partial
disclosure, reproduction, or dissemination) by persons other than the intended
recipient(s) is prohibited. If you receive this e-mail in error, please notify the sender
by phone or email immediately and delete it!

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Ocfs2-devel] [Ocfs2-dev] BUG: deadlock with umount and ocfs2 workqueue triggered by ocfs2rec thread
  2018-01-12  3:43 [Ocfs2-devel] [Ocfs2-dev] BUG: deadlock with umount and ocfs2 workqueue triggered by ocfs2rec thread Shichangkuo
@ 2018-01-12  5:50 ` Joseph Qi
  2018-01-12  6:15   ` Shichangkuo
  2018-01-12  8:25 ` Eric Ren
  1 sibling, 1 reply; 13+ messages in thread
From: Joseph Qi @ 2018-01-12  5:50 UTC (permalink / raw)
  To: ocfs2-devel

Hi Changkuo,

You said s_umount was acquired by umount and ocfs2rec was blocked when
acquiring it. But you didn't describe why umount was blocked.

Thanks,
Joseph

On 18/1/12 11:43, Shichangkuo wrote:
> Hi all,
> ??Now we are testing ocfs2 with 4.14 kernel, and we finding a deadlock with umount and ocfs2 workqueue triggered by ocfs2rec thread. The stack as follows:
> journal recovery work:
> [<ffffffff8a8c0694>] call_rwsem_down_read_failed+0x14/0x30
> [<ffffffffc0d5d652>] ocfs2_finish_quota_recovery+0x62/0x450 [ocfs2]
> [<ffffffffc0d21221>] ocfs2_complete_recovery+0xc1/0x440 [ocfs2]
> [<ffffffff8a09a1f0>] process_one_work+0x130/0x350
> [<ffffffff8a09a946>] worker_thread+0x46/0x3b0
> [<ffffffff8a0a0e51>] kthread+0x101/0x140
> [<ffffffff8aa002ff>] ret_from_fork+0x1f/0x30
> [<ffffffffffffffff>] 0xffffffffffffffff
> 
> /bin/umount:
> [<ffffffff8a099b24>] flush_workqueue+0x104/0x3e0
> [<ffffffffc0cf18db>] ocfs2_truncate_log_shutdown+0x3b/0xc0 [ocfs2]
> [<ffffffffc0d4fd6c>] ocfs2_dismount_volume+0x8c/0x3d0 [ocfs2]
> [<ffffffffc0d500e1>] ocfs2_put_super+0x31/0xa0 [ocfs2]
> [<ffffffff8a2445bd>] generic_shutdown_super+0x6d/0x120
> [<ffffffff8a24469d>] kill_block_super+0x2d/0x60
> [<ffffffff8a244e71>] deactivate_locked_super+0x51/0x90
> [<ffffffff8a263a1b>] cleanup_mnt+0x3b/0x70
> [<ffffffff8a09e9c6>] task_work_run+0x86/0xa0
> [<ffffffff8a003d70>] exit_to_usermode_loop+0x6d/0xa9
> [<ffffffff8a003a2d>] do_syscall_64+0x11d/0x130
> [<ffffffff8aa00113>] entry_SYSCALL64_slow_path+0x25/0x25
> [<ffffffffffffffff>] 0xffffffffffffffff
> ??
> Function ocfs2_finish_quota_recovery try to get sb->s_umount, which was already locked by umount thread, then get a deadlock.
> This issue was introduced by c3b004460d77bf3f980d877be539016f2df4df12 and 5f530de63cfc6ca8571cbdf58af63fb166cc6517.
> I think we cannot use :: s_umount, but the mutex ::dqonoff_mutex was already removed.
> Shall we add a new mutex?
> 
> Thanks
> Changkuo
> -------------------------------------------------------------------------------------------------------------------------------------
> This e-mail and its attachments contain confidential information from New H3C, which is
> intended only for the person or entity whose address is listed above. Any use of the
> information contained herein in any way (including, but not limited to, total or partial
> disclosure, reproduction, or dissemination) by persons other than the intended
> recipient(s) is prohibited. If you receive this e-mail in error, please notify the sender
> by phone or email immediately and delete it!
> _______________________________________________
> Ocfs2-devel mailing list
> Ocfs2-devel at oss.oracle.com
> https://oss.oracle.com/mailman/listinfo/ocfs2-devel
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Ocfs2-devel] [Ocfs2-dev] BUG: deadlock with umount and ocfs2 workqueue triggered by ocfs2rec thread
  2018-01-12  5:50 ` Joseph Qi
@ 2018-01-12  6:15   ` Shichangkuo
  0 siblings, 0 replies; 13+ messages in thread
From: Shichangkuo @ 2018-01-12  6:15 UTC (permalink / raw)
  To: ocfs2-devel

Hi Joseph,
    Thanks for replying.
    Umount flushes the ocfs2 workqueue in ocfs2_truncate_log_shutdown(), and the journal recovery is one of the work items on that ocfs2 workqueue.
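So the wait cycle is, roughly:

        umount:         holds sb->s_umount, flush_workqueue() waits for the recovery work
        recovery work:  down_read(&sb->s_umount) waits for umount to drop s_umount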

Thanks
Changkuo

> -----Original Message-----
> From: Joseph Qi [mailto:jiangqi903 at gmail.com]
> Sent: Friday, January 12, 2018 1:51 PM
> To: shichangkuo (Cloud); zren at suse.com; jack at suse.cz
> Cc: ocfs2-devel at oss.oracle.com
> Subject: Re: [Ocfs2-devel] [Ocfs2-dev] BUG: deadlock with umount and ocfs2
> workqueue triggered by ocfs2rec thread
> 
> Hi Changkuo,
> 
> You said s_umount was acquired by umount and ocfs2rec was blocked when
> acquiring it. But you didn't describe why umount was blocked.
> 
> Thanks,
> Joseph
> 
> On 18/1/12 11:43, Shichangkuo wrote:
> > Hi all,
> > ??Now we are testing ocfs2 with 4.14 kernel, and we finding a deadlock
> with umount and ocfs2 workqueue triggered by ocfs2rec thread. The stack as
> follows:
> > journal recovery work:
> > [<ffffffff8a8c0694>] call_rwsem_down_read_failed+0x14/0x30
> > [<ffffffffc0d5d652>] ocfs2_finish_quota_recovery+0x62/0x450 [ocfs2]
> > [<ffffffffc0d21221>] ocfs2_complete_recovery+0xc1/0x440 [ocfs2]
> > [<ffffffff8a09a1f0>] process_one_work+0x130/0x350 [<ffffffff8a09a946>]
> > worker_thread+0x46/0x3b0 [<ffffffff8a0a0e51>] kthread+0x101/0x140
> > [<ffffffff8aa002ff>] ret_from_fork+0x1f/0x30 [<ffffffffffffffff>]
> > 0xffffffffffffffff
> >
> > /bin/umount:
> > [<ffffffff8a099b24>] flush_workqueue+0x104/0x3e0 [<ffffffffc0cf18db>]
> > ocfs2_truncate_log_shutdown+0x3b/0xc0 [ocfs2] [<ffffffffc0d4fd6c>]
> > ocfs2_dismount_volume+0x8c/0x3d0 [ocfs2] [<ffffffffc0d500e1>]
> > ocfs2_put_super+0x31/0xa0 [ocfs2] [<ffffffff8a2445bd>]
> > generic_shutdown_super+0x6d/0x120 [<ffffffff8a24469d>]
> > kill_block_super+0x2d/0x60 [<ffffffff8a244e71>]
> > deactivate_locked_super+0x51/0x90 [<ffffffff8a263a1b>]
> > cleanup_mnt+0x3b/0x70 [<ffffffff8a09e9c6>] task_work_run+0x86/0xa0
> > [<ffffffff8a003d70>] exit_to_usermode_loop+0x6d/0xa9
> > [<ffffffff8a003a2d>] do_syscall_64+0x11d/0x130 [<ffffffff8aa00113>]
> > entry_SYSCALL64_slow_path+0x25/0x25
> > [<ffffffffffffffff>] 0xffffffffffffffff
> >
> > Function ocfs2_finish_quota_recovery try to get sb->s_umount, which was
> already locked by umount thread, then get a deadlock.
> > This issue was introduced by
> c3b004460d77bf3f980d877be539016f2df4df12 and
> 5f530de63cfc6ca8571cbdf58af63fb166cc6517.
> > I think we cannot use :: s_umount, but the mutex ::dqonoff_mutex was
> already removed.
> > Shall we add a new mutex?
> >
> > Thanks
> > Changkuo
> > ----------------------------------------------------------------------
> > ---------------------------------------------------------------
> > This e-mail and its attachments contain confidential information from
> > New H3C, which is intended only for the person or entity whose address
> > is listed above. Any use of the information contained herein in any
> > way (including, but not limited to, total or partial disclosure,
> > reproduction, or dissemination) by persons other than the intended
> > recipient(s) is prohibited. If you receive this e-mail in error,
> > please notify the sender by phone or email immediately and delete it!
> > _______________________________________________
> > Ocfs2-devel mailing list
> > Ocfs2-devel at oss.oracle.com
> > https://oss.oracle.com/mailman/listinfo/ocfs2-devel
> >

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Ocfs2-devel] [Ocfs2-dev] BUG: deadlock with umount and ocfs2 workqueue triggered by ocfs2rec thread
  2018-01-12  3:43 [Ocfs2-devel] [Ocfs2-dev] BUG: deadlock with umount and ocfs2 workqueue triggered by ocfs2rec thread Shichangkuo
  2018-01-12  5:50 ` Joseph Qi
@ 2018-01-12  8:25 ` Eric Ren
  2018-01-17 15:21   ` Jan Kara
  1 sibling, 1 reply; 13+ messages in thread
From: Eric Ren @ 2018-01-12  8:25 UTC (permalink / raw)
  To: ocfs2-devel

Hi,

On 01/12/2018 11:43 AM, Shichangkuo wrote:
> Hi all,
> ??Now we are testing ocfs2 with 4.14 kernel, and we finding a deadlock with umount and ocfs2 workqueue triggered by ocfs2rec thread. The stack as follows:
> journal recovery work:
> [<ffffffff8a8c0694>] call_rwsem_down_read_failed+0x14/0x30
> [<ffffffffc0d5d652>] ocfs2_finish_quota_recovery+0x62/0x450 [ocfs2]
> [<ffffffffc0d21221>] ocfs2_complete_recovery+0xc1/0x440 [ocfs2]
> [<ffffffff8a09a1f0>] process_one_work+0x130/0x350
> [<ffffffff8a09a946>] worker_thread+0x46/0x3b0
> [<ffffffff8a0a0e51>] kthread+0x101/0x140
> [<ffffffff8aa002ff>] ret_from_fork+0x1f/0x30
> [<ffffffffffffffff>] 0xffffffffffffffff
>
> /bin/umount:
> [<ffffffff8a099b24>] flush_workqueue+0x104/0x3e0
> [<ffffffffc0cf18db>] ocfs2_truncate_log_shutdown+0x3b/0xc0 [ocfs2]
> [<ffffffffc0d4fd6c>] ocfs2_dismount_volume+0x8c/0x3d0 [ocfs2]
> [<ffffffffc0d500e1>] ocfs2_put_super+0x31/0xa0 [ocfs2]
> [<ffffffff8a2445bd>] generic_shutdown_super+0x6d/0x120
> [<ffffffff8a24469d>] kill_block_super+0x2d/0x60
> [<ffffffff8a244e71>] deactivate_locked_super+0x51/0x90
> [<ffffffff8a263a1b>] cleanup_mnt+0x3b/0x70
> [<ffffffff8a09e9c6>] task_work_run+0x86/0xa0
> [<ffffffff8a003d70>] exit_to_usermode_loop+0x6d/0xa9
> [<ffffffff8a003a2d>] do_syscall_64+0x11d/0x130
> [<ffffffff8aa00113>] entry_SYSCALL64_slow_path+0x25/0x25
> [<ffffffffffffffff>] 0xffffffffffffffff
> ??
> Function ocfs2_finish_quota_recovery try to get sb->s_umount, which was already locked by umount thread, then get a deadlock.

Good catch, thanks for reporting. Is it reproducible? Can you please
share the steps for reproducing this issue?
> This issue was introduced by c3b004460d77bf3f980d877be539016f2df4df12 and 5f530de63cfc6ca8571cbdf58af63fb166cc6517.
> I think we cannot use :: s_umount, but the mutex ::dqonoff_mutex was already removed.
> Shall we add a new mutex?

@Jan, I haven't looked into the code yet; could you help me understand why
we need to take sb->s_umount in ocfs2_finish_quota_recovery()?
Is it because the quota recovery process can start while umounting, or
somewhere else?

Thanks,
Eric

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Ocfs2-devel] [Ocfs2-dev] BUG: deadlock with umount and ocfs2 workqueue triggered by ocfs2rec thread
  2018-01-12  8:25 ` Eric Ren
@ 2018-01-17 15:21   ` Jan Kara
  2018-01-17 15:57     ` Jan Kara
  0 siblings, 1 reply; 13+ messages in thread
From: Jan Kara @ 2018-01-17 15:21 UTC (permalink / raw)
  To: ocfs2-devel

Hello,

On Fri 12-01-18 16:25:56, Eric Ren wrote:
> On 01/12/2018 11:43 AM, Shichangkuo wrote:
> > Hi all,
> > ??Now we are testing ocfs2 with 4.14 kernel, and we finding a deadlock with umount and ocfs2 workqueue triggered by ocfs2rec thread. The stack as follows:
> > journal recovery work:
> > [<ffffffff8a8c0694>] call_rwsem_down_read_failed+0x14/0x30
> > [<ffffffffc0d5d652>] ocfs2_finish_quota_recovery+0x62/0x450 [ocfs2]
> > [<ffffffffc0d21221>] ocfs2_complete_recovery+0xc1/0x440 [ocfs2]
> > [<ffffffff8a09a1f0>] process_one_work+0x130/0x350
> > [<ffffffff8a09a946>] worker_thread+0x46/0x3b0
> > [<ffffffff8a0a0e51>] kthread+0x101/0x140
> > [<ffffffff8aa002ff>] ret_from_fork+0x1f/0x30
> > [<ffffffffffffffff>] 0xffffffffffffffff
> > 
> > /bin/umount:
> > [<ffffffff8a099b24>] flush_workqueue+0x104/0x3e0
> > [<ffffffffc0cf18db>] ocfs2_truncate_log_shutdown+0x3b/0xc0 [ocfs2]
> > [<ffffffffc0d4fd6c>] ocfs2_dismount_volume+0x8c/0x3d0 [ocfs2]
> > [<ffffffffc0d500e1>] ocfs2_put_super+0x31/0xa0 [ocfs2]
> > [<ffffffff8a2445bd>] generic_shutdown_super+0x6d/0x120
> > [<ffffffff8a24469d>] kill_block_super+0x2d/0x60
> > [<ffffffff8a244e71>] deactivate_locked_super+0x51/0x90
> > [<ffffffff8a263a1b>] cleanup_mnt+0x3b/0x70
> > [<ffffffff8a09e9c6>] task_work_run+0x86/0xa0
> > [<ffffffff8a003d70>] exit_to_usermode_loop+0x6d/0xa9
> > [<ffffffff8a003a2d>] do_syscall_64+0x11d/0x130
> > [<ffffffff8aa00113>] entry_SYSCALL64_slow_path+0x25/0x25
> > [<ffffffffffffffff>] 0xffffffffffffffff
> > ??
> > Function ocfs2_finish_quota_recovery try to get sb->s_umount, which was already locked by umount thread, then get a deadlock.
> 
> Good catch, thanks for reporting.? Is it reproducible? Can you please share
> the steps for reproducing this issue?
> > This issue was introduced by c3b004460d77bf3f980d877be539016f2df4df12 and 5f530de63cfc6ca8571cbdf58af63fb166cc6517.
> > I think we cannot use :: s_umount, but the mutex ::dqonoff_mutex was already removed.
> > Shall we add a new mutex?
> 
> @Jan, I don't look into the code yet, could you help me understand why we
> need to get sb->s_umount in ocfs2_finish_quota_recovery?
> Is it because that the quota recovery process will start at umounting? or
> some where else?

I was refreshing my memory wrt how ocfs2 quota recovery works. The problem
is the following: we load information about all the quota files that need
recovering (possibly on behalf of other nodes) in
ocfs2_begin_quota_recovery(), which gets called during mount. The real quota
recovery happens from the recovery thread in ocfs2_finish_quota_recovery().
We need to protect the code running there from dquot_disable() calls, as
those will free the structures we use for updating quota information etc.
Currently we use sb->s_umount for that protection.
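Roughly, the intended pattern is (simplified, not the exact code):

        /* recovery work (ocfs2_finish_quota_recovery), per quota type */
        down_read(&sb->s_umount);
        /* quota structures are guaranteed to stay alive here */
        ocfs2_recover_local_quota_file(...);
        up_read(&sb->s_umount);

        /* umount: dquot_disable() is reached with s_umount held for write
         * (deactivate_locked_super), so it cannot race with the block above */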

The problem above apparently happens when someone calls umount before the
recovery thread can finish quota recovery. I will think more about how to
fix the locking so that this lock inversion does not happen...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Ocfs2-devel] [Ocfs2-dev] BUG: deadlock with umount and ocfs2 workqueue triggered by ocfs2rec thread
  2018-01-17 15:21   ` Jan Kara
@ 2018-01-17 15:57     ` Jan Kara
  2018-01-19  1:48       ` Changwei Ge
  0 siblings, 1 reply; 13+ messages in thread
From: Jan Kara @ 2018-01-17 15:57 UTC (permalink / raw)
  To: ocfs2-devel

On Wed 17-01-18 16:21:35, Jan Kara wrote:
> Hello,
> 
> On Fri 12-01-18 16:25:56, Eric Ren wrote:
> > On 01/12/2018 11:43 AM, Shichangkuo wrote:
> > > Hi all,
> > > ??Now we are testing ocfs2 with 4.14 kernel, and we finding a deadlock with umount and ocfs2 workqueue triggered by ocfs2rec thread. The stack as follows:
> > > journal recovery work:
> > > [<ffffffff8a8c0694>] call_rwsem_down_read_failed+0x14/0x30
> > > [<ffffffffc0d5d652>] ocfs2_finish_quota_recovery+0x62/0x450 [ocfs2]
> > > [<ffffffffc0d21221>] ocfs2_complete_recovery+0xc1/0x440 [ocfs2]
> > > [<ffffffff8a09a1f0>] process_one_work+0x130/0x350
> > > [<ffffffff8a09a946>] worker_thread+0x46/0x3b0
> > > [<ffffffff8a0a0e51>] kthread+0x101/0x140
> > > [<ffffffff8aa002ff>] ret_from_fork+0x1f/0x30
> > > [<ffffffffffffffff>] 0xffffffffffffffff
> > > 
> > > /bin/umount:
> > > [<ffffffff8a099b24>] flush_workqueue+0x104/0x3e0
> > > [<ffffffffc0cf18db>] ocfs2_truncate_log_shutdown+0x3b/0xc0 [ocfs2]
> > > [<ffffffffc0d4fd6c>] ocfs2_dismount_volume+0x8c/0x3d0 [ocfs2]
> > > [<ffffffffc0d500e1>] ocfs2_put_super+0x31/0xa0 [ocfs2]
> > > [<ffffffff8a2445bd>] generic_shutdown_super+0x6d/0x120
> > > [<ffffffff8a24469d>] kill_block_super+0x2d/0x60
> > > [<ffffffff8a244e71>] deactivate_locked_super+0x51/0x90
> > > [<ffffffff8a263a1b>] cleanup_mnt+0x3b/0x70
> > > [<ffffffff8a09e9c6>] task_work_run+0x86/0xa0
> > > [<ffffffff8a003d70>] exit_to_usermode_loop+0x6d/0xa9
> > > [<ffffffff8a003a2d>] do_syscall_64+0x11d/0x130
> > > [<ffffffff8aa00113>] entry_SYSCALL64_slow_path+0x25/0x25
> > > [<ffffffffffffffff>] 0xffffffffffffffff
> > > ??
> > > Function ocfs2_finish_quota_recovery try to get sb->s_umount, which was already locked by umount thread, then get a deadlock.
> > 
> > Good catch, thanks for reporting.? Is it reproducible? Can you please share
> > the steps for reproducing this issue?
> > > This issue was introduced by c3b004460d77bf3f980d877be539016f2df4df12 and 5f530de63cfc6ca8571cbdf58af63fb166cc6517.
> > > I think we cannot use :: s_umount, but the mutex ::dqonoff_mutex was already removed.
> > > Shall we add a new mutex?
> > 
> > @Jan, I don't look into the code yet, could you help me understand why we
> > need to get sb->s_umount in ocfs2_finish_quota_recovery?
> > Is it because that the quota recovery process will start at umounting? or
> > some where else?
> 
> I was refreshing my memory wrt how ocfs2 quota recovery works. The problem
> is the following: We load information about all quota information that
> needs recovering (this is possibly for other nodes) in
> ocfs2_begin_quota_recovery() that gets called during mount. Real quota
> recovery happens from the recovery thread in ocfs2_finish_quota_recovery().
> We need to protect code running there from dquot_disable() calls as that
> will free structures we use for updating quota information etc. Currently
> we use sb->s_umount for that protection.
> 
> The problem above apparently happens when someone calls umount before the
> recovery thread can finish quota recovery. I will think more about how to
> fix the locking so that this lock inversion does not happen...

So could we move the ocfs2_recovery_exit() call in ocfs2_dismount_volume()
up before ocfs2_disable_quotas()? It seems possible to me, I'm just not sure
whether there are some hidden dependencies on recovery being shut down only
after the truncate log / local alloc shutdown. If we can do that, we could
remove the s_umount protection from ocfs2_finish_quota_recovery() and thus
resolve the race.
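I.e. something like this in ocfs2_dismount_volume() (only the relative order
matters here, everything else simplified):

        /* current order (simplified) */
        ocfs2_disable_quotas(osb);
        /* ... local alloc / truncate log shutdown ... */
        ocfs2_recovery_exit(osb);       /* stops recovery, flushes the work */

        /* idea: shut recovery down first, then the recovery work can never
         * run concurrently with dquot_disable() */
        ocfs2_recovery_exit(osb);
        ocfs2_disable_quotas(osb);
        /* ... local alloc / truncate log shutdown ... */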

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Ocfs2-devel] [Ocfs2-dev] BUG: deadlock with umount and ocfs2 workqueue triggered by ocfs2rec thread
  2018-01-17 15:57     ` Jan Kara
@ 2018-01-19  1:48       ` Changwei Ge
  2018-01-19  3:16         ` piaojun
  0 siblings, 1 reply; 13+ messages in thread
From: Changwei Ge @ 2018-01-19  1:48 UTC (permalink / raw)
  To: ocfs2-devel

Hi Jan,

On 2018/1/18 0:03, Jan Kara wrote:
> On Wed 17-01-18 16:21:35, Jan Kara wrote:
>> Hello,
>>
>> On Fri 12-01-18 16:25:56, Eric Ren wrote:
>>> On 01/12/2018 11:43 AM, Shichangkuo wrote:
>>>> Hi all,
>>>> ??Now we are testing ocfs2 with 4.14 kernel, and we finding a deadlock with umount and ocfs2 workqueue triggered by ocfs2rec thread. The stack as follows:
>>>> journal recovery work:
>>>> [<ffffffff8a8c0694>] call_rwsem_down_read_failed+0x14/0x30
>>>> [<ffffffffc0d5d652>] ocfs2_finish_quota_recovery+0x62/0x450 [ocfs2]
>>>> [<ffffffffc0d21221>] ocfs2_complete_recovery+0xc1/0x440 [ocfs2]
>>>> [<ffffffff8a09a1f0>] process_one_work+0x130/0x350
>>>> [<ffffffff8a09a946>] worker_thread+0x46/0x3b0
>>>> [<ffffffff8a0a0e51>] kthread+0x101/0x140
>>>> [<ffffffff8aa002ff>] ret_from_fork+0x1f/0x30
>>>> [<ffffffffffffffff>] 0xffffffffffffffff
>>>>
>>>> /bin/umount:
>>>> [<ffffffff8a099b24>] flush_workqueue+0x104/0x3e0
>>>> [<ffffffffc0cf18db>] ocfs2_truncate_log_shutdown+0x3b/0xc0 [ocfs2]
>>>> [<ffffffffc0d4fd6c>] ocfs2_dismount_volume+0x8c/0x3d0 [ocfs2]
>>>> [<ffffffffc0d500e1>] ocfs2_put_super+0x31/0xa0 [ocfs2]
>>>> [<ffffffff8a2445bd>] generic_shutdown_super+0x6d/0x120
>>>> [<ffffffff8a24469d>] kill_block_super+0x2d/0x60
>>>> [<ffffffff8a244e71>] deactivate_locked_super+0x51/0x90
>>>> [<ffffffff8a263a1b>] cleanup_mnt+0x3b/0x70
>>>> [<ffffffff8a09e9c6>] task_work_run+0x86/0xa0
>>>> [<ffffffff8a003d70>] exit_to_usermode_loop+0x6d/0xa9
>>>> [<ffffffff8a003a2d>] do_syscall_64+0x11d/0x130
>>>> [<ffffffff8aa00113>] entry_SYSCALL64_slow_path+0x25/0x25
>>>> [<ffffffffffffffff>] 0xffffffffffffffff
>>>> ??
>>>> Function ocfs2_finish_quota_recovery try to get sb->s_umount, which was already locked by umount thread, then get a deadlock.
>>>
>>> Good catch, thanks for reporting.? Is it reproducible? Can you please share
>>> the steps for reproducing this issue?
>>>> This issue was introduced by c3b004460d77bf3f980d877be539016f2df4df12 and 5f530de63cfc6ca8571cbdf58af63fb166cc6517.
>>>> I think we cannot use :: s_umount, but the mutex ::dqonoff_mutex was already removed.
>>>> Shall we add a new mutex?
>>>
>>> @Jan, I don't look into the code yet, could you help me understand why we
>>> need to get sb->s_umount in ocfs2_finish_quota_recovery?
>>> Is it because that the quota recovery process will start at umounting? or
>>> some where else?
>>
>> I was refreshing my memory wrt how ocfs2 quota recovery works. The problem
>> is the following: We load information about all quota information that
>> needs recovering (this is possibly for other nodes) in
>> ocfs2_begin_quota_recovery() that gets called during mount. Real quota
>> recovery happens from the recovery thread in ocfs2_finish_quota_recovery().
>> We need to protect code running there from dquot_disable() calls as that
>> will free structures we use for updating quota information etc. Currently
>> we use sb->s_umount for that protection.
>>
>> The problem above apparently happens when someone calls umount before the
>> recovery thread can finish quota recovery. I will think more about how to
>> fix the locking so that this lock inversion does not happen...
> 
> So could we move ocfs2_recovery_exit() call in ocfs2_dismount_volume() up
> before ocfs2_disable_quotas()? It seems possible to me, I'm just not sure
> if there are not some hidden dependencies on recovery being shut down only
> after truncate log / local alloc. If we can do that, we could remove
> s_umount protection from ocfs2_finish_quota_recovery() and thus resolve the
> race.
> 
> 								Honza

Thanks for looking into this.
I am not quite familiar with the quota part. :)

Or can we move ocfs2_disable_quotas() in ocfs2_dismount_volume() down after
ocfs2_recovery_exit(), with the down_read(&sb->s_umount) call eliminated?

Another way I can think of:
we might take inspiration from qsync_work_fn(). In that function, if the
current work runs in the context of an umount that holds ::s_umount, it
simply defers the work to the next run.

So could we also _try lock_ in ocfs2_finish_quota_recovery(), and let the
quota be recovered by another ocfs2 cluster member node, or by the local
node on its next mount?
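Something along these lines (illustrative only, not tested):

        /* in ocfs2_finish_quota_recovery() */
        if (!down_read_trylock(&sb->s_umount)) {
                /* umount in progress: skip quota recovery here and leave it
                 * to another cluster node, or to this node's next mount */
                goto out;
        }
        /* ... recover the local quota files ... */
        up_read(&sb->s_umount);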

Thanks,
Changwei


> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Ocfs2-devel] [Ocfs2-dev] BUG: deadlock with umount and ocfs2 workqueue triggered by ocfs2rec thread
  2018-01-19  1:48       ` Changwei Ge
@ 2018-01-19  3:16         ` piaojun
  2018-01-19  3:38           ` Changwei Ge
  0 siblings, 1 reply; 13+ messages in thread
From: piaojun @ 2018-01-19  3:16 UTC (permalink / raw)
  To: ocfs2-devel

Hi Jan, Eric and Changwei,

Could we use another mutex to protect quota recovery? Sharing the lock
with the VFS layer seems a little weird.

On 2018/1/19 9:48, Changwei Ge wrote:
> Hi Jan,
> 
> On 2018/1/18 0:03, Jan Kara wrote:
>> On Wed 17-01-18 16:21:35, Jan Kara wrote:
>>> Hello,
>>>
>>> On Fri 12-01-18 16:25:56, Eric Ren wrote:
>>>> On 01/12/2018 11:43 AM, Shichangkuo wrote:
>>>>> Hi all,
>>>>> ??Now we are testing ocfs2 with 4.14 kernel, and we finding a deadlock with umount and ocfs2 workqueue triggered by ocfs2rec thread. The stack as follows:
>>>>> journal recovery work:
>>>>> [<ffffffff8a8c0694>] call_rwsem_down_read_failed+0x14/0x30
>>>>> [<ffffffffc0d5d652>] ocfs2_finish_quota_recovery+0x62/0x450 [ocfs2]
>>>>> [<ffffffffc0d21221>] ocfs2_complete_recovery+0xc1/0x440 [ocfs2]
>>>>> [<ffffffff8a09a1f0>] process_one_work+0x130/0x350
>>>>> [<ffffffff8a09a946>] worker_thread+0x46/0x3b0
>>>>> [<ffffffff8a0a0e51>] kthread+0x101/0x140
>>>>> [<ffffffff8aa002ff>] ret_from_fork+0x1f/0x30
>>>>> [<ffffffffffffffff>] 0xffffffffffffffff
>>>>>
>>>>> /bin/umount:
>>>>> [<ffffffff8a099b24>] flush_workqueue+0x104/0x3e0
>>>>> [<ffffffffc0cf18db>] ocfs2_truncate_log_shutdown+0x3b/0xc0 [ocfs2]
>>>>> [<ffffffffc0d4fd6c>] ocfs2_dismount_volume+0x8c/0x3d0 [ocfs2]
>>>>> [<ffffffffc0d500e1>] ocfs2_put_super+0x31/0xa0 [ocfs2]
>>>>> [<ffffffff8a2445bd>] generic_shutdown_super+0x6d/0x120
>>>>> [<ffffffff8a24469d>] kill_block_super+0x2d/0x60
>>>>> [<ffffffff8a244e71>] deactivate_locked_super+0x51/0x90
>>>>> [<ffffffff8a263a1b>] cleanup_mnt+0x3b/0x70
>>>>> [<ffffffff8a09e9c6>] task_work_run+0x86/0xa0
>>>>> [<ffffffff8a003d70>] exit_to_usermode_loop+0x6d/0xa9
>>>>> [<ffffffff8a003a2d>] do_syscall_64+0x11d/0x130
>>>>> [<ffffffff8aa00113>] entry_SYSCALL64_slow_path+0x25/0x25
>>>>> [<ffffffffffffffff>] 0xffffffffffffffff
>>>>> ??
>>>>> Function ocfs2_finish_quota_recovery try to get sb->s_umount, which was already locked by umount thread, then get a deadlock.
>>>>
>>>> Good catch, thanks for reporting.  Is it reproducible? Can you please share
>>>> the steps for reproducing this issue?
>>>>> This issue was introduced by c3b004460d77bf3f980d877be539016f2df4df12 and 5f530de63cfc6ca8571cbdf58af63fb166cc6517.
>>>>> I think we cannot use :: s_umount, but the mutex ::dqonoff_mutex was already removed.
>>>>> Shall we add a new mutex?
>>>>
>>>> @Jan, I don't look into the code yet, could you help me understand why we
>>>> need to get sb->s_umount in ocfs2_finish_quota_recovery?
>>>> Is it because that the quota recovery process will start at umounting? or
>>>> some where else?
>>>
>>> I was refreshing my memory wrt how ocfs2 quota recovery works. The problem
>>> is the following: We load information about all quota information that
>>> needs recovering (this is possibly for other nodes) in
>>> ocfs2_begin_quota_recovery() that gets called during mount. Real quota
>>> recovery happens from the recovery thread in ocfs2_finish_quota_recovery().
>>> We need to protect code running there from dquot_disable() calls as that
>>> will free structures we use for updating quota information etc. Currently
>>> we use sb->s_umount for that protection.
>>>
>>> The problem above apparently happens when someone calls umount before the
>>> recovery thread can finish quota recovery. I will think more about how to
>>> fix the locking so that this lock inversion does not happen...
>>
>> So could we move ocfs2_recovery_exit() call in ocfs2_dismount_volume() up
>> before ocfs2_disable_quotas()? It seems possible to me, I'm just not sure
>> if there are not some hidden dependencies on recovery being shut down only
>> after truncate log / local alloc. If we can do that, we could remove
>> s_umount protection from ocfs2_finish_quota_recovery() and thus resolve the
>> race.
>>
>> 								Honza
> 
> Thanks for looking into this.
> I am not quite familiar with quota part.:)
> 
> Or can we move ocfs2_disable_quotas() in ocfs2_dismount_volume down 
> after ocfs2_recovery_exit() with ::invoking down_read(&sb->s_umount) 
> eliminated?
> 
> Another way I can figure out is:
> I think we might get inspired from qsync_work_fn().
> In that function if current work is under running context of umount with 
> ::s_umount held, it just delays current work to next time.
> 
> So can we also _try lock_ in ocfs2_finish_quota_recovery() and recover 
> quota by other ocfs2 cluster member nodes or local node's next time of 
> mount?
> 
I guess we need to analyse the impact of the _try lock_ approach. For
example, no other node will help recover the quota when mine is the only
node in the cluster.

thanks,
Jun

> Thanks,
> Changwei
> 
> 
>>
> _______________________________________________
> Ocfs2-devel mailing list
> Ocfs2-devel at oss.oracle.com
> https://oss.oracle.com/mailman/listinfo/ocfs2-devel
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Ocfs2-devel] [Ocfs2-dev] BUG: deadlock with umount and ocfs2 workqueue triggered by ocfs2rec thread
  2018-01-19  3:16         ` piaojun
@ 2018-01-19  3:38           ` Changwei Ge
  2018-01-19  3:59             ` piaojun
  0 siblings, 1 reply; 13+ messages in thread
From: Changwei Ge @ 2018-01-19  3:38 UTC (permalink / raw)
  To: ocfs2-devel

Hi Jun,

On 2018/1/19 11:17, piaojun wrote:
> Hi Jan, Eric and Changwei,
> 
> Could we use another mutex lock to protect quota recovery? Sharing the
> lock with VFS-layer probably seems a little weird.

I am afraid we can't, since the quota code needs ::s_umount, and we indeed
need ::s_umount to avoid the race where quota has already freed structs
that quota recovery in ocfs2 still uses.

> 
> On 2018/1/19 9:48, Changwei Ge wrote:
>> Hi Jan,
>>
>> On 2018/1/18 0:03, Jan Kara wrote:
>>> On Wed 17-01-18 16:21:35, Jan Kara wrote:
>>>> Hello,
>>>>
>>>> On Fri 12-01-18 16:25:56, Eric Ren wrote:
>>>>> On 01/12/2018 11:43 AM, Shichangkuo wrote:
>>>>>> Hi all,
>>>>>> ??Now we are testing ocfs2 with 4.14 kernel, and we finding a deadlock with umount and ocfs2 workqueue triggered by ocfs2rec thread. The stack as follows:
>>>>>> journal recovery work:
>>>>>> [<ffffffff8a8c0694>] call_rwsem_down_read_failed+0x14/0x30
>>>>>> [<ffffffffc0d5d652>] ocfs2_finish_quota_recovery+0x62/0x450 [ocfs2]
>>>>>> [<ffffffffc0d21221>] ocfs2_complete_recovery+0xc1/0x440 [ocfs2]
>>>>>> [<ffffffff8a09a1f0>] process_one_work+0x130/0x350
>>>>>> [<ffffffff8a09a946>] worker_thread+0x46/0x3b0
>>>>>> [<ffffffff8a0a0e51>] kthread+0x101/0x140
>>>>>> [<ffffffff8aa002ff>] ret_from_fork+0x1f/0x30
>>>>>> [<ffffffffffffffff>] 0xffffffffffffffff
>>>>>>
>>>>>> /bin/umount:
>>>>>> [<ffffffff8a099b24>] flush_workqueue+0x104/0x3e0
>>>>>> [<ffffffffc0cf18db>] ocfs2_truncate_log_shutdown+0x3b/0xc0 [ocfs2]
>>>>>> [<ffffffffc0d4fd6c>] ocfs2_dismount_volume+0x8c/0x3d0 [ocfs2]
>>>>>> [<ffffffffc0d500e1>] ocfs2_put_super+0x31/0xa0 [ocfs2]
>>>>>> [<ffffffff8a2445bd>] generic_shutdown_super+0x6d/0x120
>>>>>> [<ffffffff8a24469d>] kill_block_super+0x2d/0x60
>>>>>> [<ffffffff8a244e71>] deactivate_locked_super+0x51/0x90
>>>>>> [<ffffffff8a263a1b>] cleanup_mnt+0x3b/0x70
>>>>>> [<ffffffff8a09e9c6>] task_work_run+0x86/0xa0
>>>>>> [<ffffffff8a003d70>] exit_to_usermode_loop+0x6d/0xa9
>>>>>> [<ffffffff8a003a2d>] do_syscall_64+0x11d/0x130
>>>>>> [<ffffffff8aa00113>] entry_SYSCALL64_slow_path+0x25/0x25
>>>>>> [<ffffffffffffffff>] 0xffffffffffffffff
>>>>>> ??
>>>>>> Function ocfs2_finish_quota_recovery try to get sb->s_umount, which was already locked by umount thread, then get a deadlock.
>>>>>
>>>>> Good catch, thanks for reporting.  Is it reproducible? Can you please share
>>>>> the steps for reproducing this issue?
>>>>>> This issue was introduced by c3b004460d77bf3f980d877be539016f2df4df12 and 5f530de63cfc6ca8571cbdf58af63fb166cc6517.
>>>>>> I think we cannot use :: s_umount, but the mutex ::dqonoff_mutex was already removed.
>>>>>> Shall we add a new mutex?
>>>>>
>>>>> @Jan, I don't look into the code yet, could you help me understand why we
>>>>> need to get sb->s_umount in ocfs2_finish_quota_recovery?
>>>>> Is it because that the quota recovery process will start at umounting? or
>>>>> some where else?
>>>>
>>>> I was refreshing my memory wrt how ocfs2 quota recovery works. The problem
>>>> is the following: We load information about all quota information that
>>>> needs recovering (this is possibly for other nodes) in
>>>> ocfs2_begin_quota_recovery() that gets called during mount. Real quota
>>>> recovery happens from the recovery thread in ocfs2_finish_quota_recovery().
>>>> We need to protect code running there from dquot_disable() calls as that
>>>> will free structures we use for updating quota information etc. Currently
>>>> we use sb->s_umount for that protection.
>>>>
>>>> The problem above apparently happens when someone calls umount before the
>>>> recovery thread can finish quota recovery. I will think more about how to
>>>> fix the locking so that this lock inversion does not happen...
>>>
>>> So could we move ocfs2_recovery_exit() call in ocfs2_dismount_volume() up
>>> before ocfs2_disable_quotas()? It seems possible to me, I'm just not sure
>>> if there are not some hidden dependencies on recovery being shut down only
>>> after truncate log / local alloc. If we can do that, we could remove
>>> s_umount protection from ocfs2_finish_quota_recovery() and thus resolve the
>>> race.
>>>
>>> 								Honza
>>
>> Thanks for looking into this.
>> I am not quite familiar with quota part.:)
>>
>> Or can we move ocfs2_disable_quotas() in ocfs2_dismount_volume down
>> after ocfs2_recovery_exit() with ::invoking down_read(&sb->s_umount)
>> eliminated?
>>
>> Another way I can figure out is:
>> I think we might get inspired from qsync_work_fn().
>> In that function if current work is under running context of umount with
>> ::s_umount held, it just delays current work to next time.
>>
>> So can we also _try lock_ in ocfs2_finish_quota_recovery() and recover
>> quota by other ocfs2 cluster member nodes or local node's next time of
>> mount?
>>
> I guess we need analyse the impact of _try lock_. Such as no other node
> will help recovering quota when I'm the only node in cluster.

I don't see any risk for now. I will think about it more, later.

Thanks,
Changwei
> 
> thanks,
> Jun
> 
>> Thanks,
>> Changwei
>>
>>
>>>
>> _______________________________________________
>> Ocfs2-devel mailing list
>> Ocfs2-devel at oss.oracle.com
>> https://oss.oracle.com/mailman/listinfo/ocfs2-devel
>>
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Ocfs2-devel] [Ocfs2-dev] BUG: deadlock with umount and ocfs2 workqueue triggered by ocfs2rec thread
  2018-01-19  3:38           ` Changwei Ge
@ 2018-01-19  3:59             ` piaojun
  2018-01-19  5:42               ` Changwei Ge
  0 siblings, 1 reply; 13+ messages in thread
From: piaojun @ 2018-01-19  3:59 UTC (permalink / raw)
  To: ocfs2-devel

Hi Changwei,

On 2018/1/19 11:38, Changwei Ge wrote:
> Hi Jun,
> 
> On 2018/1/19 11:17, piaojun wrote:
>> Hi Jan, Eric and Changwei,
>>
>> Could we use another mutex lock to protect quota recovery? Sharing the
>> lock with VFS-layer probably seems a little weird.
> 
> I am afraid that we can't since quota need ::s_umount and we indeed need 
> ::s_umount to get rid of race that quota has freed structs that will be 
> used by quota recovery in ocfs2.
> 
Could you explain which 'structs' are used by quota recovery? Do you mean
'struct super_block'?

thanks,
Jun

>>
>> On 2018/1/19 9:48, Changwei Ge wrote:
>>> Hi Jan,
>>>
>>> On 2018/1/18 0:03, Jan Kara wrote:
>>>> On Wed 17-01-18 16:21:35, Jan Kara wrote:
>>>>> Hello,
>>>>>
>>>>> On Fri 12-01-18 16:25:56, Eric Ren wrote:
>>>>>> On 01/12/2018 11:43 AM, Shichangkuo wrote:
>>>>>>> Hi all,
>>>>>>> ??Now we are testing ocfs2 with 4.14 kernel, and we finding a deadlock with umount and ocfs2 workqueue triggered by ocfs2rec thread. The stack as follows:
>>>>>>> journal recovery work:
>>>>>>> [<ffffffff8a8c0694>] call_rwsem_down_read_failed+0x14/0x30
>>>>>>> [<ffffffffc0d5d652>] ocfs2_finish_quota_recovery+0x62/0x450 [ocfs2]
>>>>>>> [<ffffffffc0d21221>] ocfs2_complete_recovery+0xc1/0x440 [ocfs2]
>>>>>>> [<ffffffff8a09a1f0>] process_one_work+0x130/0x350
>>>>>>> [<ffffffff8a09a946>] worker_thread+0x46/0x3b0
>>>>>>> [<ffffffff8a0a0e51>] kthread+0x101/0x140
>>>>>>> [<ffffffff8aa002ff>] ret_from_fork+0x1f/0x30
>>>>>>> [<ffffffffffffffff>] 0xffffffffffffffff
>>>>>>>
>>>>>>> /bin/umount:
>>>>>>> [<ffffffff8a099b24>] flush_workqueue+0x104/0x3e0
>>>>>>> [<ffffffffc0cf18db>] ocfs2_truncate_log_shutdown+0x3b/0xc0 [ocfs2]
>>>>>>> [<ffffffffc0d4fd6c>] ocfs2_dismount_volume+0x8c/0x3d0 [ocfs2]
>>>>>>> [<ffffffffc0d500e1>] ocfs2_put_super+0x31/0xa0 [ocfs2]
>>>>>>> [<ffffffff8a2445bd>] generic_shutdown_super+0x6d/0x120
>>>>>>> [<ffffffff8a24469d>] kill_block_super+0x2d/0x60
>>>>>>> [<ffffffff8a244e71>] deactivate_locked_super+0x51/0x90
>>>>>>> [<ffffffff8a263a1b>] cleanup_mnt+0x3b/0x70
>>>>>>> [<ffffffff8a09e9c6>] task_work_run+0x86/0xa0
>>>>>>> [<ffffffff8a003d70>] exit_to_usermode_loop+0x6d/0xa9
>>>>>>> [<ffffffff8a003a2d>] do_syscall_64+0x11d/0x130
>>>>>>> [<ffffffff8aa00113>] entry_SYSCALL64_slow_path+0x25/0x25
>>>>>>> [<ffffffffffffffff>] 0xffffffffffffffff
>>>>>>> ??
>>>>>>> Function ocfs2_finish_quota_recovery try to get sb->s_umount, which was already locked by umount thread, then get a deadlock.
>>>>>>
>>>>>> Good catch, thanks for reporting.  Is it reproducible? Can you please share
>>>>>> the steps for reproducing this issue?
>>>>>>> This issue was introduced by c3b004460d77bf3f980d877be539016f2df4df12 and 5f530de63cfc6ca8571cbdf58af63fb166cc6517.
>>>>>>> I think we cannot use :: s_umount, but the mutex ::dqonoff_mutex was already removed.
>>>>>>> Shall we add a new mutex?
>>>>>>
>>>>>> @Jan, I don't look into the code yet, could you help me understand why we
>>>>>> need to get sb->s_umount in ocfs2_finish_quota_recovery?
>>>>>> Is it because that the quota recovery process will start at umounting? or
>>>>>> some where else?
>>>>>
>>>>> I was refreshing my memory wrt how ocfs2 quota recovery works. The problem
>>>>> is the following: We load information about all quota information that
>>>>> needs recovering (this is possibly for other nodes) in
>>>>> ocfs2_begin_quota_recovery() that gets called during mount. Real quota
>>>>> recovery happens from the recovery thread in ocfs2_finish_quota_recovery().
>>>>> We need to protect code running there from dquot_disable() calls as that
>>>>> will free structures we use for updating quota information etc. Currently
>>>>> we use sb->s_umount for that protection.
>>>>>
>>>>> The problem above apparently happens when someone calls umount before the
>>>>> recovery thread can finish quota recovery. I will think more about how to
>>>>> fix the locking so that this lock inversion does not happen...
>>>>
>>>> So could we move ocfs2_recovery_exit() call in ocfs2_dismount_volume() up
>>>> before ocfs2_disable_quotas()? It seems possible to me, I'm just not sure
>>>> if there are not some hidden dependencies on recovery being shut down only
>>>> after truncate log / local alloc. If we can do that, we could remove
>>>> s_umount protection from ocfs2_finish_quota_recovery() and thus resolve the
>>>> race.
>>>>
>>>> 								Honza
>>>
>>> Thanks for looking into this.
>>> I am not quite familiar with quota part.:)
>>>
>>> Or can we move ocfs2_disable_quotas() in ocfs2_dismount_volume down
>>> after ocfs2_recovery_exit() with ::invoking down_read(&sb->s_umount)
>>> eliminated?
>>>
>>> Another way I can figure out is:
>>> I think we might get inspired from qsync_work_fn().
>>> In that function if current work is under running context of umount with
>>> ::s_umount held, it just delays current work to next time.
>>>
>>> So can we also _try lock_ in ocfs2_finish_quota_recovery() and recover
>>> quota by other ocfs2 cluster member nodes or local node's next time of
>>> mount?
>>>
>> I guess we need analyse the impact of _try lock_. Such as no other node
>> will help recovering quota when I'm the only node in cluster.
> 
> I don't see any risk for now. I will think about it more, later.
> 
> Thanks,
> Changwei
>>
>> thanks,
>> Jun
>>
>>> Thanks,
>>> Changwei
>>>
>>>
>>>>
>>> _______________________________________________
>>> Ocfs2-devel mailing list
>>> Ocfs2-devel at oss.oracle.com
>>> https://oss.oracle.com/mailman/listinfo/ocfs2-devel
>>>
>>
> .
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Ocfs2-devel] [Ocfs2-dev] BUG: deadlock with umount and ocfs2 workqueue triggered by ocfs2rec thread
  2018-01-19  3:59             ` piaojun
@ 2018-01-19  5:42               ` Changwei Ge
  2018-01-19  7:03                 ` piaojun
  0 siblings, 1 reply; 13+ messages in thread
From: Changwei Ge @ 2018-01-19  5:42 UTC (permalink / raw)
  To: ocfs2-devel

Hi Jun,

On 2018/1/19 11:59, piaojun wrote:
> Hi Changwei,
> 
> On 2018/1/19 11:38, Changwei Ge wrote:
>> Hi Jun,
>>
>> On 2018/1/19 11:17, piaojun wrote:
>>> Hi Jan, Eric and Changwei,
>>>
>>> Could we use another mutex lock to protect quota recovery? Sharing the
>>> lock with VFS-layer probably seems a little weird.
>>
>> I am afraid that we can't since quota need ::s_umount and we indeed need
>> ::s_umount to get rid of race that quota has freed structs that will be
>> used by quota recovery in ocfs2.
>>
> Could you explain which 'structs' used by quota recovery? Do you mean
> 'struct super_block'?
I am not pointing to super_block.

Sure. You can refer to:
ocfs2_finish_quota_recovery
  ocfs2_recover_local_quota_file -> here, the operations on the quota structures happen

Thanks,
Changwei

> 
> thanks,
> Jun
> 
>>>
>>> On 2018/1/19 9:48, Changwei Ge wrote:
>>>> Hi Jan,
>>>>
>>>> On 2018/1/18 0:03, Jan Kara wrote:
>>>>> On Wed 17-01-18 16:21:35, Jan Kara wrote:
>>>>>> Hello,
>>>>>>
>>>>>> On Fri 12-01-18 16:25:56, Eric Ren wrote:
>>>>>>> On 01/12/2018 11:43 AM, Shichangkuo wrote:
>>>>>>>> Hi all,
>>>>>>>> ??Now we are testing ocfs2 with 4.14 kernel, and we finding a deadlock with umount and ocfs2 workqueue triggered by ocfs2rec thread. The stack as follows:
>>>>>>>> journal recovery work:
>>>>>>>> [<ffffffff8a8c0694>] call_rwsem_down_read_failed+0x14/0x30
>>>>>>>> [<ffffffffc0d5d652>] ocfs2_finish_quota_recovery+0x62/0x450 [ocfs2]
>>>>>>>> [<ffffffffc0d21221>] ocfs2_complete_recovery+0xc1/0x440 [ocfs2]
>>>>>>>> [<ffffffff8a09a1f0>] process_one_work+0x130/0x350
>>>>>>>> [<ffffffff8a09a946>] worker_thread+0x46/0x3b0
>>>>>>>> [<ffffffff8a0a0e51>] kthread+0x101/0x140
>>>>>>>> [<ffffffff8aa002ff>] ret_from_fork+0x1f/0x30
>>>>>>>> [<ffffffffffffffff>] 0xffffffffffffffff
>>>>>>>>
>>>>>>>> /bin/umount:
>>>>>>>> [<ffffffff8a099b24>] flush_workqueue+0x104/0x3e0
>>>>>>>> [<ffffffffc0cf18db>] ocfs2_truncate_log_shutdown+0x3b/0xc0 [ocfs2]
>>>>>>>> [<ffffffffc0d4fd6c>] ocfs2_dismount_volume+0x8c/0x3d0 [ocfs2]
>>>>>>>> [<ffffffffc0d500e1>] ocfs2_put_super+0x31/0xa0 [ocfs2]
>>>>>>>> [<ffffffff8a2445bd>] generic_shutdown_super+0x6d/0x120
>>>>>>>> [<ffffffff8a24469d>] kill_block_super+0x2d/0x60
>>>>>>>> [<ffffffff8a244e71>] deactivate_locked_super+0x51/0x90
>>>>>>>> [<ffffffff8a263a1b>] cleanup_mnt+0x3b/0x70
>>>>>>>> [<ffffffff8a09e9c6>] task_work_run+0x86/0xa0
>>>>>>>> [<ffffffff8a003d70>] exit_to_usermode_loop+0x6d/0xa9
>>>>>>>> [<ffffffff8a003a2d>] do_syscall_64+0x11d/0x130
>>>>>>>> [<ffffffff8aa00113>] entry_SYSCALL64_slow_path+0x25/0x25
>>>>>>>> [<ffffffffffffffff>] 0xffffffffffffffff
>>>>>>>> ??
>>>>>>>> Function ocfs2_finish_quota_recovery try to get sb->s_umount, which was already locked by umount thread, then get a deadlock.
>>>>>>>
>>>>>>> Good catch, thanks for reporting.  Is it reproducible? Can you please share
>>>>>>> the steps for reproducing this issue?
>>>>>>>> This issue was introduced by c3b004460d77bf3f980d877be539016f2df4df12 and 5f530de63cfc6ca8571cbdf58af63fb166cc6517.
>>>>>>>> I think we cannot use :: s_umount, but the mutex ::dqonoff_mutex was already removed.
>>>>>>>> Shall we add a new mutex?
>>>>>>>
>>>>>>> @Jan, I don't look into the code yet, could you help me understand why we
>>>>>>> need to get sb->s_umount in ocfs2_finish_quota_recovery?
>>>>>>> Is it because that the quota recovery process will start at umounting? or
>>>>>>> some where else?
>>>>>>
>>>>>> I was refreshing my memory wrt how ocfs2 quota recovery works. The problem
>>>>>> is the following: We load information about all quota information that
>>>>>> needs recovering (this is possibly for other nodes) in
>>>>>> ocfs2_begin_quota_recovery() that gets called during mount. Real quota
>>>>>> recovery happens from the recovery thread in ocfs2_finish_quota_recovery().
>>>>>> We need to protect code running there from dquot_disable() calls as that
>>>>>> will free structures we use for updating quota information etc. Currently
>>>>>> we use sb->s_umount for that protection.
>>>>>>
>>>>>> The problem above apparently happens when someone calls umount before the
>>>>>> recovery thread can finish quota recovery. I will think more about how to
>>>>>> fix the locking so that this lock inversion does not happen...
>>>>>
>>>>> So could we move ocfs2_recovery_exit() call in ocfs2_dismount_volume() up
>>>>> before ocfs2_disable_quotas()? It seems possible to me, I'm just not sure
>>>>> if there are not some hidden dependencies on recovery being shut down only
>>>>> after truncate log / local alloc. If we can do that, we could remove
>>>>> s_umount protection from ocfs2_finish_quota_recovery() and thus resolve the
>>>>> race.
>>>>>
>>>>> 								Honza
>>>>
>>>> Thanks for looking into this.
>>>> I am not quite familiar with quota part.:)
>>>>
>>>> Or can we move ocfs2_disable_quotas() in ocfs2_dismount_volume down
>>>> after ocfs2_recovery_exit() with ::invoking down_read(&sb->s_umount)
>>>> eliminated?
>>>>
>>>> Another way I can figure out is:
>>>> I think we might get inspired from qsync_work_fn().
>>>> In that function if current work is under running context of umount with
>>>> ::s_umount held, it just delays current work to next time.
>>>>
>>>> So can we also _try lock_ in ocfs2_finish_quota_recovery() and recover
>>>> quota by other ocfs2 cluster member nodes or local node's next time of
>>>> mount?
>>>>
>>> I guess we need analyse the impact of _try lock_. Such as no other node
>>> will help recovering quota when I'm the only node in cluster.
>>
>> I don't see any risk for now. I will think about it more, later.
>>
>> Thanks,
>> Changwei
>>>
>>> thanks,
>>> Jun
>>>
>>>> Thanks,
>>>> Changwei
>>>>
>>>>
>>>>>
>>>> _______________________________________________
>>>> Ocfs2-devel mailing list
>>>> Ocfs2-devel at oss.oracle.com
>>>> https://oss.oracle.com/mailman/listinfo/ocfs2-devel
>>>>
>>>
>> .
>>
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Ocfs2-devel] [Ocfs2-dev] BUG: deadlock with umount and ocfs2 workqueue triggered by ocfs2rec thread
  2018-01-19  5:42               ` Changwei Ge
@ 2018-01-19  7:03                 ` piaojun
  2018-01-19 12:18                   ` Jan Kara
  0 siblings, 1 reply; 13+ messages in thread
From: piaojun @ 2018-01-19  7:03 UTC (permalink / raw)
  To: ocfs2-devel

Hi Changwei,

On 2018/1/19 13:42, Changwei Ge wrote:
> Hi Jun,
> 
> On 2018/1/19 11:59, piaojun wrote:
>> Hi Changwei,
>>
>> On 2018/1/19 11:38, Changwei Ge wrote:
>>> Hi Jun,
>>>
>>> On 2018/1/19 11:17, piaojun wrote:
>>>> Hi Jan, Eric and Changwei,
>>>>
>>>> Could we use another mutex lock to protect quota recovery? Sharing the
>>>> lock with VFS-layer probably seems a little weird.
>>>
>>> I am afraid that we can't since quota need ::s_umount and we indeed need
>>> ::s_umount to get rid of race that quota has freed structs that will be
>>> used by quota recovery in ocfs2.
>>>
>> Could you explain which 'structs' used by quota recovery? Do you mean
>> 'struct super_block'?
> I am not pointing to super_block.
> 
> Sure.
> You can refer to
> ocfs2_finish_quota_recovery
>    ocfs2_recover_local_quota_file -> here, operations on quota happens
> 
> Thanks,
> Changwei
> 
I looked through the code in deactivate_locked_super(), and did not find
any *structs* that need to be protected except 'sb'. In addition, I still
could not figure out why 'dqonoff_mutex' was replaced by 's_umount'. At
least, the patch 5f530de63cfc does not mention that. I would appreciate a
detailed explanation.

""
ocfs2: Use s_umount for quota recovery protection

Currently we use dqonoff_mutex to serialize quota recovery protection
and turning of quotas on / off. Use s_umount semaphore instead.
""

thanks,
Jun

>>
>> thanks,
>> Jun
>>
>>>>
>>>> On 2018/1/19 9:48, Changwei Ge wrote:
>>>>> Hi Jan,
>>>>>
>>>>> On 2018/1/18 0:03, Jan Kara wrote:
>>>>>> On Wed 17-01-18 16:21:35, Jan Kara wrote:
>>>>>>> Hello,
>>>>>>>
>>>>>>> On Fri 12-01-18 16:25:56, Eric Ren wrote:
>>>>>>>> On 01/12/2018 11:43 AM, Shichangkuo wrote:
>>>>>>>>> Hi all,
>>>>>>>>> ??Now we are testing ocfs2 with 4.14 kernel, and we finding a deadlock with umount and ocfs2 workqueue triggered by ocfs2rec thread. The stack as follows:
>>>>>>>>> journal recovery work:
>>>>>>>>> [<ffffffff8a8c0694>] call_rwsem_down_read_failed+0x14/0x30
>>>>>>>>> [<ffffffffc0d5d652>] ocfs2_finish_quota_recovery+0x62/0x450 [ocfs2]
>>>>>>>>> [<ffffffffc0d21221>] ocfs2_complete_recovery+0xc1/0x440 [ocfs2]
>>>>>>>>> [<ffffffff8a09a1f0>] process_one_work+0x130/0x350
>>>>>>>>> [<ffffffff8a09a946>] worker_thread+0x46/0x3b0
>>>>>>>>> [<ffffffff8a0a0e51>] kthread+0x101/0x140
>>>>>>>>> [<ffffffff8aa002ff>] ret_from_fork+0x1f/0x30
>>>>>>>>> [<ffffffffffffffff>] 0xffffffffffffffff
>>>>>>>>>
>>>>>>>>> /bin/umount:
>>>>>>>>> [<ffffffff8a099b24>] flush_workqueue+0x104/0x3e0
>>>>>>>>> [<ffffffffc0cf18db>] ocfs2_truncate_log_shutdown+0x3b/0xc0 [ocfs2]
>>>>>>>>> [<ffffffffc0d4fd6c>] ocfs2_dismount_volume+0x8c/0x3d0 [ocfs2]
>>>>>>>>> [<ffffffffc0d500e1>] ocfs2_put_super+0x31/0xa0 [ocfs2]
>>>>>>>>> [<ffffffff8a2445bd>] generic_shutdown_super+0x6d/0x120
>>>>>>>>> [<ffffffff8a24469d>] kill_block_super+0x2d/0x60
>>>>>>>>> [<ffffffff8a244e71>] deactivate_locked_super+0x51/0x90
>>>>>>>>> [<ffffffff8a263a1b>] cleanup_mnt+0x3b/0x70
>>>>>>>>> [<ffffffff8a09e9c6>] task_work_run+0x86/0xa0
>>>>>>>>> [<ffffffff8a003d70>] exit_to_usermode_loop+0x6d/0xa9
>>>>>>>>> [<ffffffff8a003a2d>] do_syscall_64+0x11d/0x130
>>>>>>>>> [<ffffffff8aa00113>] entry_SYSCALL64_slow_path+0x25/0x25
>>>>>>>>> [<ffffffffffffffff>] 0xffffffffffffffff
>>>>>>>>> ??
>>>>>>>>> Function ocfs2_finish_quota_recovery try to get sb->s_umount, which was already locked by umount thread, then get a deadlock.
>>>>>>>>
>>>>>>>> Good catch, thanks for reporting.  Is it reproducible? Can you please share
>>>>>>>> the steps for reproducing this issue?
>>>>>>>>> This issue was introduced by c3b004460d77bf3f980d877be539016f2df4df12 and 5f530de63cfc6ca8571cbdf58af63fb166cc6517.
>>>>>>>>> I think we cannot use :: s_umount, but the mutex ::dqonoff_mutex was already removed.
>>>>>>>>> Shall we add a new mutex?
>>>>>>>>
>>>>>>>> @Jan, I don't look into the code yet, could you help me understand why we
>>>>>>>> need to get sb->s_umount in ocfs2_finish_quota_recovery?
>>>>>>>> Is it because that the quota recovery process will start at umounting? or
>>>>>>>> some where else?
>>>>>>>
>>>>>>> I was refreshing my memory wrt how ocfs2 quota recovery works. The problem
>>>>>>> is the following: We load information about all quota information that
>>>>>>> needs recovering (this is possibly for other nodes) in
>>>>>>> ocfs2_begin_quota_recovery() that gets called during mount. Real quota
>>>>>>> recovery happens from the recovery thread in ocfs2_finish_quota_recovery().
>>>>>>> We need to protect code running there from dquot_disable() calls as that
>>>>>>> will free structures we use for updating quota information etc. Currently
>>>>>>> we use sb->s_umount for that protection.
>>>>>>>
>>>>>>> The problem above apparently happens when someone calls umount before the
>>>>>>> recovery thread can finish quota recovery. I will think more about how to
>>>>>>> fix the locking so that this lock inversion does not happen...
>>>>>>
>>>>>> So could we move ocfs2_recovery_exit() call in ocfs2_dismount_volume() up
>>>>>> before ocfs2_disable_quotas()? It seems possible to me, I'm just not sure
>>>>>> if there are not some hidden dependencies on recovery being shut down only
>>>>>> after truncate log / local alloc. If we can do that, we could remove
>>>>>> s_umount protection from ocfs2_finish_quota_recovery() and thus resolve the
>>>>>> race.
>>>>>>
>>>>>> 								Honza
>>>>>
>>>>> Thanks for looking into this.
>>>>> I am not quite familiar with quota part.:)
>>>>>
>>>>> Or can we move ocfs2_disable_quotas() in ocfs2_dismount_volume down
>>>>> after ocfs2_recovery_exit() with ::invoking down_read(&sb->s_umount)
>>>>> eliminated?
>>>>>
>>>>> Another way I can figure out is:
>>>>> I think we might get inspired from qsync_work_fn().
>>>>> In that function if current work is under running context of umount with
>>>>> ::s_umount held, it just delays current work to next time.
>>>>>
>>>>> So can we also _try lock_ in ocfs2_finish_quota_recovery() and recover
>>>>> quota by other ocfs2 cluster member nodes or local node's next time of
>>>>> mount?
>>>>>
>>>> I guess we need analyse the impact of _try lock_. Such as no other node
>>>> will help recovering quota when I'm the only node in cluster.
>>>
>>> I don't see any risk for now. I will think about it more, later.
>>>
>>> Thanks,
>>> Changwei
>>>>
>>>> thanks,
>>>> Jun
>>>>
>>>>> Thanks,
>>>>> Changwei
>>>>>
>>>>>
>>>>>>
>>>>> _______________________________________________
>>>>> Ocfs2-devel mailing list
>>>>> Ocfs2-devel at oss.oracle.com
>>>>> https://oss.oracle.com/mailman/listinfo/ocfs2-devel
>>>>>
>>>>
>>> .
>>>
>>
> .
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Ocfs2-devel] [Ocfs2-dev] BUG: deadlock with umount and ocfs2 workqueue triggered by ocfs2rec thread
  2018-01-19  7:03                 ` piaojun
@ 2018-01-19 12:18                   ` Jan Kara
  0 siblings, 0 replies; 13+ messages in thread
From: Jan Kara @ 2018-01-19 12:18 UTC (permalink / raw)
  To: ocfs2-devel

Hi!

On Fri 19-01-18 15:03:30, piaojun wrote:
> On 2018/1/19 13:42, Changwei Ge wrote:
> > Hi Jun,
> > 
> > On 2018/1/19 11:59, piaojun wrote:
> >> Hi Changwei,
> >>
> >> On 2018/1/19 11:38, Changwei Ge wrote:
> >>> Hi Jun,
> >>>
> >>> On 2018/1/19 11:17, piaojun wrote:
> >>>> Hi Jan, Eric and Changwei,
> >>>>
> >>>> Could we use another mutex lock to protect quota recovery? Sharing the
> >>>> lock with VFS-layer probably seems a little weird.
> >>>
> >>> I am afraid that we can't since quota need ::s_umount and we indeed need
> >>> ::s_umount to get rid of race that quota has freed structs that will be
> >>> used by quota recovery in ocfs2.
> >>>
> >> Could you explain which 'structs' used by quota recovery? Do you mean
> >> 'struct super_block'?
> > I am not pointing to super_block.
> > 
> > Sure.
> > You can refer to
> > ocfs2_finish_quota_recovery
> >    ocfs2_recover_local_quota_file -> here, operations on quota happens
> > 
> > Thanks,
> > Changwei
> > 
> I looked through the code in deactivate_locked_super(), and did not
> find any *structs* needed to be protected except 'sb'. In addition I

There are quota structures (generally attached to the superblock in some
way) that are used by quota recovery and that get freed in
ocfs2_local_free_info(). ocfs2_local_free_info() gets called when we call
dquot_disable() from ocfs2_disable_quotas() from ocfs2_dismount_volume().
That's why we need to serialize quota recovery and umount.
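The call chain on the umount side is roughly:

        ocfs2_dismount_volume()
          ocfs2_disable_quotas()
            dquot_disable()
              ... ocfs2_local_free_info()   /* frees what quota recovery uses */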

> still could not figure out why 'dqonoff_mutex' was relaced by
> 's_umount'. At least, the patch 5f530de63cfc did not mention that. I
> will appreciate for detailed explaination.

It was replaced mostly to fix a lock inversion issue in generic code -
details are described in commit 9d1ccbe70e0b145 "quota: Use s_umount
protection for quota operations".

Looking into the ocfs2 quota code, I think it will be easiest to introduce
an ocfs2 variant of the dqonoff mutex and use it to serialize umount and
recovery. While reading the recovery / umount / remount code, I've also
found several other problems / races that need fixing (e.g. if some node
mounts a filesystem read-only and ends up doing recovery for some crashed
node, quota recovery will not work). So I plan to do some more fixing in
this area, I just didn't get to it yet.
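Just to sketch the direction (names made up, not a real patch):

        /* ocfs2-private replacement for the old dqonoff serialization */
        struct ocfs2_super {
                ...
                struct mutex osb_quota_recovery_lock;   /* hypothetical */
        };

        /* ocfs2_finish_quota_recovery(): take this instead of s_umount */
        mutex_lock(&osb->osb_quota_recovery_lock);
        /* ... replay quota files if quotas are still enabled ... */
        mutex_unlock(&osb->osb_quota_recovery_lock);

        /* ocfs2_dismount_volume(): hold it around disabling quotas */
        mutex_lock(&osb->osb_quota_recovery_lock);
        ocfs2_disable_quotas(osb);
        mutex_unlock(&osb->osb_quota_recovery_lock);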

								Honza

> >>>> On 2018/1/19 9:48, Changwei Ge wrote:
> >>>>> Hi Jan,
> >>>>>
> >>>>> On 2018/1/18 0:03, Jan Kara wrote:
> >>>>>> On Wed 17-01-18 16:21:35, Jan Kara wrote:
> >>>>>>> Hello,
> >>>>>>>
> >>>>>>> On Fri 12-01-18 16:25:56, Eric Ren wrote:
> >>>>>>>> On 01/12/2018 11:43 AM, Shichangkuo wrote:
> >>>>>>>>> Hi all,
> >>>>>>>>> ??Now we are testing ocfs2 with 4.14 kernel, and we finding a deadlock with umount and ocfs2 workqueue triggered by ocfs2rec thread. The stack as follows:
> >>>>>>>>> journal recovery work:
> >>>>>>>>> [<ffffffff8a8c0694>] call_rwsem_down_read_failed+0x14/0x30
> >>>>>>>>> [<ffffffffc0d5d652>] ocfs2_finish_quota_recovery+0x62/0x450 [ocfs2]
> >>>>>>>>> [<ffffffffc0d21221>] ocfs2_complete_recovery+0xc1/0x440 [ocfs2]
> >>>>>>>>> [<ffffffff8a09a1f0>] process_one_work+0x130/0x350
> >>>>>>>>> [<ffffffff8a09a946>] worker_thread+0x46/0x3b0
> >>>>>>>>> [<ffffffff8a0a0e51>] kthread+0x101/0x140
> >>>>>>>>> [<ffffffff8aa002ff>] ret_from_fork+0x1f/0x30
> >>>>>>>>> [<ffffffffffffffff>] 0xffffffffffffffff
> >>>>>>>>>
> >>>>>>>>> /bin/umount:
> >>>>>>>>> [<ffffffff8a099b24>] flush_workqueue+0x104/0x3e0
> >>>>>>>>> [<ffffffffc0cf18db>] ocfs2_truncate_log_shutdown+0x3b/0xc0 [ocfs2]
> >>>>>>>>> [<ffffffffc0d4fd6c>] ocfs2_dismount_volume+0x8c/0x3d0 [ocfs2]
> >>>>>>>>> [<ffffffffc0d500e1>] ocfs2_put_super+0x31/0xa0 [ocfs2]
> >>>>>>>>> [<ffffffff8a2445bd>] generic_shutdown_super+0x6d/0x120
> >>>>>>>>> [<ffffffff8a24469d>] kill_block_super+0x2d/0x60
> >>>>>>>>> [<ffffffff8a244e71>] deactivate_locked_super+0x51/0x90
> >>>>>>>>> [<ffffffff8a263a1b>] cleanup_mnt+0x3b/0x70
> >>>>>>>>> [<ffffffff8a09e9c6>] task_work_run+0x86/0xa0
> >>>>>>>>> [<ffffffff8a003d70>] exit_to_usermode_loop+0x6d/0xa9
> >>>>>>>>> [<ffffffff8a003a2d>] do_syscall_64+0x11d/0x130
> >>>>>>>>> [<ffffffff8aa00113>] entry_SYSCALL64_slow_path+0x25/0x25
> >>>>>>>>> [<ffffffffffffffff>] 0xffffffffffffffff
> >>>>>>>>> ??
> >>>>>>>>> Function ocfs2_finish_quota_recovery try to get sb->s_umount, which was already locked by umount thread, then get a deadlock.
> >>>>>>>>
> >>>>>>>> Good catch, thanks for reporting.  Is it reproducible? Can you please share
> >>>>>>>> the steps for reproducing this issue?
> >>>>>>>>> This issue was introduced by c3b004460d77bf3f980d877be539016f2df4df12 and 5f530de63cfc6ca8571cbdf58af63fb166cc6517.
> >>>>>>>>> I think we cannot use :: s_umount, but the mutex ::dqonoff_mutex was already removed.
> >>>>>>>>> Shall we add a new mutex?
> >>>>>>>>
> >>>>>>>> @Jan, I don't look into the code yet, could you help me understand why we
> >>>>>>>> need to get sb->s_umount in ocfs2_finish_quota_recovery?
> >>>>>>>> Is it because that the quota recovery process will start at umounting? or
> >>>>>>>> some where else?
> >>>>>>>
> >>>>>>> I was refreshing my memory wrt how ocfs2 quota recovery works. The problem
> >>>>>>> is the following: We load information about all quota information that
> >>>>>>> needs recovering (this is possibly for other nodes) in
> >>>>>>> ocfs2_begin_quota_recovery() that gets called during mount. Real quota
> >>>>>>> recovery happens from the recovery thread in ocfs2_finish_quota_recovery().
> >>>>>>> We need to protect code running there from dquot_disable() calls as that
> >>>>>>> will free structures we use for updating quota information etc. Currently
> >>>>>>> we use sb->s_umount for that protection.
> >>>>>>>
> >>>>>>> The problem above apparently happens when someone calls umount before the
> >>>>>>> recovery thread can finish quota recovery. I will think more about how to
> >>>>>>> fix the locking so that this lock inversion does not happen...
> >>>>>>
> >>>>>> So could we move ocfs2_recovery_exit() call in ocfs2_dismount_volume() up
> >>>>>> before ocfs2_disable_quotas()? It seems possible to me, I'm just not sure
> >>>>>> if there are not some hidden dependencies on recovery being shut down only
> >>>>>> after truncate log / local alloc. If we can do that, we could remove
> >>>>>> s_umount protection from ocfs2_finish_quota_recovery() and thus resolve the
> >>>>>> race.
> >>>>>>
> >>>>>> 								Honza
> >>>>>
> >>>>> Thanks for looking into this.
> >>>>> I am not quite familiar with quota part.:)
> >>>>>
> >>>>> Or can we move ocfs2_disable_quotas() in ocfs2_dismount_volume down
> >>>>> after ocfs2_recovery_exit() with ::invoking down_read(&sb->s_umount)
> >>>>> eliminated?
> >>>>>
> >>>>> Another way I can figure out is:
> >>>>> I think we might get inspired from qsync_work_fn().
> >>>>> In that function if current work is under running context of umount with
> >>>>> ::s_umount held, it just delays current work to next time.
> >>>>>
> >>>>> So can we also _try lock_ in ocfs2_finish_quota_recovery() and recover
> >>>>> quota by other ocfs2 cluster member nodes or local node's next time of
> >>>>> mount?
> >>>>>
> >>>> I guess we need analyse the impact of _try lock_. Such as no other node
> >>>> will help recovering quota when I'm the only node in cluster.
> >>>
> >>> I don't see any risk for now. I will think about it more, later.
> >>>
> >>> Thanks,
> >>> Changwei
> >>>>
> >>>> thanks,
> >>>> Jun
> >>>>
> >>>>> Thanks,
> >>>>> Changwei
> >>>>>
> >>>>>
> >>>>>>
> >>>>> _______________________________________________
> >>>>> Ocfs2-devel mailing list
> >>>>> Ocfs2-devel at oss.oracle.com
> >>>>> https://oss.oracle.com/mailman/listinfo/ocfs2-devel
> >>>>>
> >>>>
> >>> .
> >>>
> >>
> > .
> > 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2018-01-19 12:18 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-01-12  3:43 [Ocfs2-devel] [Ocfs2-dev] BUG: deadlock with umount and ocfs2 workqueue triggered by ocfs2rec thread Shichangkuo
2018-01-12  5:50 ` Joseph Qi
2018-01-12  6:15   ` Shichangkuo
2018-01-12  8:25 ` Eric Ren
2018-01-17 15:21   ` Jan Kara
2018-01-17 15:57     ` Jan Kara
2018-01-19  1:48       ` Changwei Ge
2018-01-19  3:16         ` piaojun
2018-01-19  3:38           ` Changwei Ge
2018-01-19  3:59             ` piaojun
2018-01-19  5:42               ` Changwei Ge
2018-01-19  7:03                 ` piaojun
2018-01-19 12:18                   ` Jan Kara
