linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* 5.3.0 deadlock: btrfs_sync_file / btrfs_async_reclaim_metadata_space / btrfs_page_mkwrite
@ 2019-10-12 23:29 James Harvey
  2019-10-12 23:32 ` James Harvey
  2019-10-14  1:45 ` Chris Murphy
  0 siblings, 2 replies; 6+ messages in thread
From: James Harvey @ 2019-10-12 23:29 UTC (permalink / raw)
  To: Btrfs BTRFS

Was using a temporary BTRFS volume to compile mongodb, which is quite
intensive and takes quite a bit of time.  The volume has been
deadlocked for about 12 hours.

Being a temporary volume, I just used mount without options, so it
used the defaults:  rw,relatime,ssd,space_cache,subvolid=5,subvol=/

Apologies if upgrading to 5.3.5+ will fix this.  I didn't see
discussions of a deadlock looking like this.



See the full sysrq-trigger output here: http://ix.io/1Ysp



But, for searchability, here is some of it:

systemd-journal D    0   813      1 0x00004320
Call Trace:
 ? __schedule+0x27f/0x6d0
 schedule+0x43/0xd0
 schedule_timeout+0x299/0x3d0
 io_schedule_timeout+0x19/0x40
 wait_for_common_io.constprop.0+0xcf/0x150
 ? wake_up_q+0x60/0x60
 write_all_supers+0x87f/0x940 [btrfs]
 btrfs_sync_log+0x71b/0x9f0 [btrfs]
 ? dput+0xc9/0x2d0
 btrfs_sync_file+0x364/0x460 [btrfs]
 do_fsync+0x38/0x70
 __x64_sys_fsync+0x10/0x20
 do_syscall_64+0x5f/0x1c0
 entry_SYSCALL_64_after_hwframe+0x44/0xa9


Call Trace:
 ? __schedule+0x27f/0x6d0
 schedule+0x43/0xd0
 schedule_timeout+0x1cf/0x3d0
 ? collect_expired_timers+0xb0/0xb0
 flush_space+0x55c/0x710 [btrfs]
 btrfs_async_reclaim_metadata_space+0xc4/0x4a0 [btrfs]
 ? __schedule+0x287/0x6d0
 process_one_work+0x1d1/0x3a0
 worker_thread+0x4a/0x3d0
 kthread+0xfb/0x130
 ? process_one_work+0x3a0/0x3a0
 ? kthread_park+0x80/0x80
 ret_from_fork+0x35/0x40


Then 28 of these:

Call Trace:
 ? __schedule+0x27f/0x6d0
 schedule+0x43/0xd0
 wait_reserve_ticket+0x95/0x150 [btrfs]
 ? wait_woken+0x70/0x70
 btrfs_reserve_metadata_bytes+0x782/0x920 [btrfs]
 btrfs_block_rsv_add+0x1f/0x50 [btrfs]
 start_transaction+0x2c2/0x490 [btrfs]
 btrfs_dirty_inode+0x9d/0xd0 [btrfs]
 file_update_time+0xfd/0x150
 btrfs_page_mkwrite+0xfe/0x4e0 [btrfs]
 ? lock_page_memcg+0x11/0x90
 do_page_mkwrite+0x31/0x90
 do_wp_page+0x2c6/0x660
 __handle_mm_fault+0xc24/0x15d0
 ? call_function_single_interrupt+0xa/0x20
 handle_mm_fault+0xce/0x1f0
 __do_page_fault+0x216/0x4f0
 do_page_fault+0x31/0x130
 page_fault+0x3e/0x50

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: 5.3.0 deadlock: btrfs_sync_file / btrfs_async_reclaim_metadata_space / btrfs_page_mkwrite
  2019-10-12 23:29 5.3.0 deadlock: btrfs_sync_file / btrfs_async_reclaim_metadata_space / btrfs_page_mkwrite James Harvey
@ 2019-10-12 23:32 ` James Harvey
  2019-10-13  2:12   ` James Harvey
  2019-10-14  1:45 ` Chris Murphy
  1 sibling, 1 reply; 6+ messages in thread
From: James Harvey @ 2019-10-12 23:32 UTC (permalink / raw)
  To: Btrfs BTRFS

On Sat, Oct 12, 2019 at 7:29 PM James Harvey <jamespharvey20@gmail.com> wrote:
>
> Was using a temporary BTRFS volume to compile mongodb, which is quite
> intensive and takes quite a bit of time.  The volume has been
> deadlocked for about 12 hours.
>
> ...

Surprisingly, I was able to touch an unrelated file on the volume
while the deadlock was ongoing.  I was able to CTRL+C the process, and
all of the uninterruptible sleep processes quit.  I was able to delete
the partial compilation and restart.  All without having to remount
the volume or reboot.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: 5.3.0 deadlock: btrfs_sync_file / btrfs_async_reclaim_metadata_space / btrfs_page_mkwrite
  2019-10-12 23:32 ` James Harvey
@ 2019-10-13  2:12   ` James Harvey
  0 siblings, 0 replies; 6+ messages in thread
From: James Harvey @ 2019-10-13  2:12 UTC (permalink / raw)
  To: Btrfs BTRFS

On Sat, Oct 12, 2019 at 7:32 PM James Harvey <jamespharvey20@gmail.com> wrote:
>
> On Sat, Oct 12, 2019 at 7:29 PM James Harvey <jamespharvey20@gmail.com> wrote:
> >
> > Was using a temporary BTRFS volume to compile mongodb, which is quite
> > intensive and takes quite a bit of time.  The volume has been
> > deadlocked for about 12 hours.
> >
> > ...

Not a one time fluke.  Attempted recompilation of mongodb, and BTRFS
deadlocked again, on a different spot but almost as far into it as
before.  Similar looking sysrq-trigger, included just because I'm
already replying saying this wasn't a fluke.  http://ix.io/1YsR

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: 5.3.0 deadlock: btrfs_sync_file / btrfs_async_reclaim_metadata_space / btrfs_page_mkwrite
  2019-10-12 23:29 5.3.0 deadlock: btrfs_sync_file / btrfs_async_reclaim_metadata_space / btrfs_page_mkwrite James Harvey
  2019-10-12 23:32 ` James Harvey
@ 2019-10-14  1:45 ` Chris Murphy
  2019-10-15  1:04   ` James Harvey
  1 sibling, 1 reply; 6+ messages in thread
From: Chris Murphy @ 2019-10-14  1:45 UTC (permalink / raw)
  To: James Harvey; +Cc: Btrfs BTRFS

On Sat, Oct 12, 2019 at 5:29 PM James Harvey <jamespharvey20@gmail.com> wrote:
>
> Was using a temporary BTRFS volume to compile mongodb, which is quite
> intensive and takes quite a bit of time.  The volume has been
> deadlocked for about 12 hours.
>
> Being a temporary volume, I just used mount without options, so it
> used the defaults:  rw,relatime,ssd,space_cache,subvolid=5,subvol=/
>
> Apologies if upgrading to 5.3.5+ will fix this.  I didn't see
> discussions of a deadlock looking like this.

I think it's a bug in any case, in particular because its all default
mount options, but it'd be interesting if any of the following make a
difference:

- space_cache=v2
- noatime



-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: 5.3.0 deadlock: btrfs_sync_file / btrfs_async_reclaim_metadata_space / btrfs_page_mkwrite
  2019-10-14  1:45 ` Chris Murphy
@ 2019-10-15  1:04   ` James Harvey
  2019-10-15  1:27     ` Chris Murphy
  0 siblings, 1 reply; 6+ messages in thread
From: James Harvey @ 2019-10-15  1:04 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Btrfs BTRFS

On Sun, Oct 13, 2019 at 9:46 PM Chris Murphy <lists@colorremedies.com> wrote:
>
> On Sat, Oct 12, 2019 at 5:29 PM James Harvey <jamespharvey20@gmail.com> wrote:
> >
> > Was using a temporary BTRFS volume to compile mongodb, which is quite
> > intensive and takes quite a bit of time.  The volume has been
> > deadlocked for about 12 hours.
> >
> > Being a temporary volume, I just used mount without options, so it
> > used the defaults:  rw,relatime,ssd,space_cache,subvolid=5,subvol=/
> >
> > Apologies if upgrading to 5.3.5+ will fix this.  I didn't see
> > discussions of a deadlock looking like this.
>
> I think it's a bug in any case, in particular because its all default
> mount options, but it'd be interesting if any of the following make a
> difference:
>
> - space_cache=v2
> - noatime

Interesting.

This isn't 100% reproducible.  Before my original post, after my
initial deadlock, I tried again and immediately hit another deadlock.
But, yesterday, in response to your email, I tried again still without
"space_cache=v2,noatime" to re-confirm the deadlock.  I had to
re-compile mongodb about 6 times to hit another deadlock.  I was
almost at the point of thinking I wouldn't see it again.

After re-confirming it, I re-created the BTRFS volume to use
"space_cache=v2,noatime" mount options.  It deadlocked during the
first mongodb compilation.  w > sysrq_trigger is a little bit
different.  No trace including "btrfs_sync_log" or
"btrfs_async_reclaim_metadata_space".  Only traces including the
"btrfs_btrfs_async_reclaim_metadata_space".  Viewable here:
http://ix.io/1YGe

Who knows, maybe as a particular volume has more use, it becomes less
likely to deadlock.  IF it is space cache related, maybe as the tree
gets filled out, it becomes less likely?  Or, maybe I'm looking too
much into variance, and just the way the dice rolled was that it
happened on the first retry on the new volume.  My initial deadlock
was right after volume creation, as well.

I'll also mention this is on 32 cores and a Samsung 970 EVO NVMe, and
a multithreaded compilation, so perhaps it requires a pretty high load
to run into this.

Also, as I'm testing some issues with the mongodb compilation process
(upstream always forces debug symbols...), as a workaround to be able
to test its issues, I've used a temporary ext4 volume for it, which I
haven't had a single issue with.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: 5.3.0 deadlock: btrfs_sync_file / btrfs_async_reclaim_metadata_space / btrfs_page_mkwrite
  2019-10-15  1:04   ` James Harvey
@ 2019-10-15  1:27     ` Chris Murphy
  0 siblings, 0 replies; 6+ messages in thread
From: Chris Murphy @ 2019-10-15  1:27 UTC (permalink / raw)
  To: James Harvey; +Cc: Chris Murphy, Btrfs BTRFS

On Mon, Oct 14, 2019 at 7:05 PM James Harvey <jamespharvey20@gmail.com> wrote:
>
> On Sun, Oct 13, 2019 at 9:46 PM Chris Murphy <lists@colorremedies.com> wrote:
> >
> > On Sat, Oct 12, 2019 at 5:29 PM James Harvey <jamespharvey20@gmail.com> wrote:
> > >
> > > Was using a temporary BTRFS volume to compile mongodb, which is quite
> > > intensive and takes quite a bit of time.  The volume has been
> > > deadlocked for about 12 hours.
> > >
> > > Being a temporary volume, I just used mount without options, so it
> > > used the defaults:  rw,relatime,ssd,space_cache,subvolid=5,subvol=/
> > >
> > > Apologies if upgrading to 5.3.5+ will fix this.  I didn't see
> > > discussions of a deadlock looking like this.
> >
> > I think it's a bug in any case, in particular because its all default
> > mount options, but it'd be interesting if any of the following make a
> > difference:
> >
> > - space_cache=v2
> > - noatime
>
> Interesting.
>
> This isn't 100% reproducible.  Before my original post, after my
> initial deadlock, I tried again and immediately hit another deadlock.
> But, yesterday, in response to your email, I tried again still without
> "space_cache=v2,noatime" to re-confirm the deadlock.  I had to
> re-compile mongodb about 6 times to hit another deadlock.  I was
> almost at the point of thinking I wouldn't see it again.
>
> After re-confirming it, I re-created the BTRFS volume to use
> "space_cache=v2,noatime" mount options.  It deadlocked during the
> first mongodb compilation.  w > sysrq_trigger is a little bit
> different.  No trace including "btrfs_sync_log" or
> "btrfs_async_reclaim_metadata_space".  Only traces including the
> "btrfs_btrfs_async_reclaim_metadata_space".  Viewable here:
> http://ix.io/1YGe

I think it's some kind of disk or lock contention, but I don't really
know much about it. The v1 space_cache is basically data extents, so
they use data chunks and I guess can conflict with heavy data writes.
Whereas v2 space_cache is a dedicated metadata btree. So yeah - and
I'm not sure if mongo builds use atime at all so the noatime could be
a goose chase, but figured it might help reduce unnecessary metadata
updates.


> Also, as I'm testing some issues with the mongodb compilation process
> (upstream always forces debug symbols...), as a workaround to be able
> to test its issues, I've used a temporary ext4 volume for it, which I
> haven't had a single issue with.

Adds to the notion this is some kind of bug.

-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2019-10-15  1:30 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-12 23:29 5.3.0 deadlock: btrfs_sync_file / btrfs_async_reclaim_metadata_space / btrfs_page_mkwrite James Harvey
2019-10-12 23:32 ` James Harvey
2019-10-13  2:12   ` James Harvey
2019-10-14  1:45 ` Chris Murphy
2019-10-15  1:04   ` James Harvey
2019-10-15  1:27     ` Chris Murphy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).