* WARN_ON in record_root_in_trans() when deleting freshly renamed subvolume
@ 2016-04-07 16:44 Holger Hoffstätte
  2016-04-07 18:07 ` Filipe Manana
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Holger Hoffstätte @ 2016-04-07 16:44 UTC (permalink / raw)
  To: linux-btrfs

Hi,

Looks like I just found an exciting new corner case.
kernel 4.4.6 with btrfs ~4.6, so 4.6 should reproduce.

Try on a fresh volume:

$btrfs subvolume create foo
Create subvolume './foo'
$sync
$btrfs subvolume snapshot foo foo-1
Create a snapshot of 'foo' in './foo-1'
$sync
$mv foo-1 foo.new
$btrfs subvolume delete foo.new 
Delete subvolume (no-commit): '/mnt/test/foo.new'
$dmesg 
[  226.923316] ------------[ cut here ]------------
[  226.923339] WARNING: CPU: 1 PID: 5863 at fs/btrfs/transaction.c:319 record_root_in_trans+0xd6/0x100 [btrfs]()
[  226.923340] Modules linked in: auth_rpcgss oid_registry nfsv4 btrfs xor raid6_pq loop nfs lockd grace sunrpc autofs4 sch_fq_codel radeon snd_hda_codec_realtek x86_pkg_temp_thermal snd_hda_codec_generic coretemp crc32_pclmul crc32c_intel aesni_intel i2c_algo_bit uvcvideo snd_hda_codec_hdmi aes_x86_64 drm_kms_helper videobuf2_vmalloc glue_helper videobuf2_memops syscopyarea lrw sysfillrect gf128mul videobuf2_v4l2 sysimgblt snd_usb_audio fb_sys_fops ablk_helper snd_hda_intel videobuf2_core ttm cryptd snd_hwdep v4l2_common usbhid snd_hda_codec snd_usbmidi_lib videodev snd_rawmidi drm snd_hda_core snd_seq_device i2c_i801 snd_pcm i2c_core snd_timer snd r8169 soundcore mii parport_pc parport
[  226.923365] CPU: 1 PID: 5863 Comm: ls Not tainted 4.4.6 #1
[  226.923366] Hardware name: Gigabyte Technology Co., Ltd. P67-DS3-B3/P67-DS3-B3, BIOS F1 05/06/2011
[  226.923367]  0000000000000000 ffff8800da677d20 ffffffff813181a8 0000000000000000
[  226.923368]  ffffffffa0aacdbf ffff8800da677d58 ffffffff810507b2 ffff880601e90800
[  226.923369]  ffff8800dacf10a0 ffff880601e90800 ffff880601e909f0 0000000000000001
[  226.923371] Call Trace:
[  226.923374]  [<ffffffff813181a8>] dump_stack+0x4d/0x65
[  226.923376]  [<ffffffff810507b2>] warn_slowpath_common+0x82/0xc0
[  226.923378]  [<ffffffff810508aa>] warn_slowpath_null+0x1a/0x20
[  226.923387]  [<ffffffffa0a2cf46>] record_root_in_trans+0xd6/0x100 [btrfs]
[  226.923395]  [<ffffffffa0a2db24>] btrfs_record_root_in_trans+0x44/0x70 [btrfs]
[  226.923404]  [<ffffffffa0a2fb5e>] start_transaction+0x9e/0x4c0 [btrfs]
[  226.923412]  [<ffffffffa0a2ffd7>] btrfs_join_transaction+0x17/0x20 [btrfs]
[  226.923421]  [<ffffffffa0a359b5>] btrfs_dirty_inode+0x35/0xd0 [btrfs]
[  226.923430]  [<ffffffffa0a35acd>] btrfs_update_time+0x7d/0xb0 [btrfs]
[  226.923432]  [<ffffffff81187028>] touch_atime+0x88/0xa0
[  226.923434]  [<ffffffff8117ec9b>] iterate_dir+0xdb/0x120
[  226.923435]  [<ffffffff8117f0c8>] SyS_getdents+0x88/0xf0
[  226.923437]  [<ffffffff8117edb0>] ? fillonedir+0xd0/0xd0
[  226.923439]  [<ffffffff815b8257>] entry_SYSCALL_64_fastpath+0x12/0x6a
[  226.923440] ---[ end trace 9c78caf253e284fe ]---

Code looks like:

..
static int record_root_in_trans(struct btrfs_trans_handle *trans,
			       struct btrfs_root *root)
{
	if (test_bit(BTRFS_ROOT_REF_COWS, &root->state) &&
	    root->last_trans < trans->transid) {
		WARN_ON(root == root->fs_info->extent_root);
		WARN_ON(root->commit_root != root->node);
..

There's been a few journal/recovery/directory consistency patches recently,
so maybe it's a corner case or an older problem. I'll try to bisect, but
meanwhile wanted to report it for discussion.

Holger



* Re: WARN_ON in record_root_in_trans() when deleting freshly renamed subvolume
  2016-04-07 16:44 WARN_ON in record_root_in_trans() when deleting freshly renamed subvolume Holger Hoffstätte
@ 2016-04-07 18:07 ` Filipe Manana
  2016-04-08  2:01 ` Liu Bo
  2016-04-08 11:14 ` Filipe Manana
  2 siblings, 0 replies; 10+ messages in thread
From: Filipe Manana @ 2016-04-07 18:07 UTC (permalink / raw)
  To: Holger Hoffstätte; +Cc: linux-btrfs

On Thu, Apr 7, 2016 at 5:44 PM, Holger Hoffstätte
<holger.hoffstaette@googlemail.com> wrote:
> Hi,
>
> Looks like I just found an exciting new corner case.
> kernel 4.4.6 with btrfs ~4.6, so 4.6 should reproduce.
>
> Try on a fresh volume:
>
> $btrfs subvolume create foo
> Create subvolume './foo'
> $sync
> $btrfs subvolume snapshot foo foo-1
> Create a snapshot of 'foo' in './foo-1'
> $sync
> $mv foo-1 foo.new

Haven't tried, but by taking a glance at btrfs_rename(), if you call
sync (or force a transaction commit through some other means) after
the rename and before deleting the snapshot, the warning should not
happen.
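
As a minimal C sketch of that ordering (untested against this warning;
rename_then_sync() is a hypothetical helper name, not part of btrfs-progs),
the rename can be followed by sync(2) before anything tries to delete the
renamed snapshot:

```c
#include <stdio.h>    /* rename() */
#include <unistd.h>   /* sync() */

/*
 * Hypothetical helper: rename a path, then force a filesystem-wide sync.
 * Returns 0 on success, -1 if rename() failed (errno is set by rename()).
 */
static int rename_then_sync(const char *old_path, const char *new_path)
{
	if (rename(old_path, new_path) != 0)
		return -1;

	/* sync(2) returns void; it is the programmatic form of the "sync" command. */
	sync();
	return 0;
}
```

For the reproducer above this would be called as
rename_then_sync("foo-1", "foo.new") before running
"btrfs subvolume delete foo.new".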

> $btrfs subvolume delete foo.new
> Delete subvolume (no-commit): '/mnt/test/foo.new'
> $dmesg
> [  226.923316] ------------[ cut here ]------------
> [  226.923339] WARNING: CPU: 1 PID: 5863 at fs/btrfs/transaction.c:319 record_root_in_trans+0xd6/0x100 [btrfs]()
> [  226.923340] Modules linked in: auth_rpcgss oid_registry nfsv4 btrfs xor raid6_pq loop nfs lockd grace sunrpc autofs4 sch_fq_codel radeon snd_hda_codec_realtek x86_pkg_temp_thermal snd_hda_codec_generic coretemp crc32_pclmul crc32c_intel aesni_intel i2c_algo_bit uvcvideo snd_hda_codec_hdmi aes_x86_64 drm_kms_helper videobuf2_vmalloc glue_helper videobuf2_memops syscopyarea lrw sysfillrect gf128mul videobuf2_v4l2 sysimgblt snd_usb_audio fb_sys_fops ablk_helper snd_hda_intel videobuf2_core ttm cryptd snd_hwdep v4l2_common usbhid snd_hda_codec snd_usbmidi_lib videodev snd_rawmidi drm snd_hda_core snd_seq_device i2c_i801 snd_pcm i2c_core snd_timer snd r8169 soundcore mii parport_pc parport
> [  226.923365] CPU: 1 PID: 5863 Comm: ls Not tainted 4.4.6 #1
> [  226.923366] Hardware name: Gigabyte Technology Co., Ltd. P67-DS3-B3/P67-DS3-B3, BIOS F1 05/06/2011
> [  226.923367]  0000000000000000 ffff8800da677d20 ffffffff813181a8 0000000000000000
> [  226.923368]  ffffffffa0aacdbf ffff8800da677d58 ffffffff810507b2 ffff880601e90800
> [  226.923369]  ffff8800dacf10a0 ffff880601e90800 ffff880601e909f0 0000000000000001
> [  226.923371] Call Trace:
> [  226.923374]  [<ffffffff813181a8>] dump_stack+0x4d/0x65
> [  226.923376]  [<ffffffff810507b2>] warn_slowpath_common+0x82/0xc0
> [  226.923378]  [<ffffffff810508aa>] warn_slowpath_null+0x1a/0x20
> [  226.923387]  [<ffffffffa0a2cf46>] record_root_in_trans+0xd6/0x100 [btrfs]
> [  226.923395]  [<ffffffffa0a2db24>] btrfs_record_root_in_trans+0x44/0x70 [btrfs]
> [  226.923404]  [<ffffffffa0a2fb5e>] start_transaction+0x9e/0x4c0 [btrfs]
> [  226.923412]  [<ffffffffa0a2ffd7>] btrfs_join_transaction+0x17/0x20 [btrfs]
> [  226.923421]  [<ffffffffa0a359b5>] btrfs_dirty_inode+0x35/0xd0 [btrfs]
> [  226.923430]  [<ffffffffa0a35acd>] btrfs_update_time+0x7d/0xb0 [btrfs]
> [  226.923432]  [<ffffffff81187028>] touch_atime+0x88/0xa0
> [  226.923434]  [<ffffffff8117ec9b>] iterate_dir+0xdb/0x120
> [  226.923435]  [<ffffffff8117f0c8>] SyS_getdents+0x88/0xf0
> [  226.923437]  [<ffffffff8117edb0>] ? fillonedir+0xd0/0xd0
> [  226.923439]  [<ffffffff815b8257>] entry_SYSCALL_64_fastpath+0x12/0x6a
> [  226.923440] ---[ end trace 9c78caf253e284fe ]---
>
> Code looks like:
>
> ..
> static int record_root_in_trans(struct btrfs_trans_handle *trans,
>                                struct btrfs_root *root)
> {
>         if (test_bit(BTRFS_ROOT_REF_COWS, &root->state) &&
>             root->last_trans < trans->transid) {
>                 WARN_ON(root == root->fs_info->extent_root);
>                 WARN_ON(root->commit_root != root->node);
> ..
>
> There's been a few journal/recovery/directory consistency patches recently,

Those can neither fix this warning nor cause it, since you didn't fsync
and trigger the replay of the log trees.

> so maybe it's a corner case or an older problem. I'll try to bisect, but
> meanwhile wanted to report it for discussion.

Seems like an old problem to me.

Thanks for reporting it!

>
> Holger
>



-- 
Filipe David Manana,

"Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men."


* Re: WARN_ON in record_root_in_trans() when deleting freshly renamed subvolume
  2016-04-07 16:44 WARN_ON in record_root_in_trans() when deleting freshly renamed subvolume Holger Hoffstätte
  2016-04-07 18:07 ` Filipe Manana
@ 2016-04-08  2:01 ` Liu Bo
  2016-04-08 11:14 ` Filipe Manana
  2 siblings, 0 replies; 10+ messages in thread
From: Liu Bo @ 2016-04-08  2:01 UTC (permalink / raw)
  To: Holger Hoffstätte; +Cc: linux-btrfs

On Thu, Apr 07, 2016 at 06:44:20PM +0200, Holger Hoffstätte wrote:
> Hi,
> 
> Looks like I just found an exciting new corner case.
> kernel 4.4.6 with btrfs ~4.6, so 4.6 should reproduce.
> 
> Try on a fresh volume:
> 
> $btrfs subvolume create foo
> Create subvolume './foo'
> $sync
> $btrfs subvolume snapshot foo foo-1
> Create a snapshot of 'foo' in './foo-1'
> $sync
> $mv foo-1 foo.new
> $btrfs subvolume delete foo.new 
> Delete subvolume (no-commit): '/mnt/test/foo.new'
> $dmesg 
> [  226.923316] ------------[ cut here ]------------
> [  226.923339] WARNING: CPU: 1 PID: 5863 at fs/btrfs/transaction.c:319 record_root_in_trans+0xd6/0x100 [btrfs]()
> [  226.923340] Modules linked in: auth_rpcgss oid_registry nfsv4 btrfs xor raid6_pq loop nfs lockd grace sunrpc autofs4 sch_fq_codel radeon snd_hda_codec_realtek x86_pkg_temp_thermal snd_hda_codec_generic coretemp crc32_pclmul crc32c_intel aesni_intel i2c_algo_bit uvcvideo snd_hda_codec_hdmi aes_x86_64 drm_kms_helper videobuf2_vmalloc glue_helper videobuf2_memops syscopyarea lrw sysfillrect gf128mul videobuf2_v4l2 sysimgblt snd_usb_audio fb_sys_fops ablk_helper snd_hda_intel videobuf2_core ttm cryptd snd_hwdep v4l2_common usbhid snd_hda_codec snd_usbmidi_lib videodev snd_rawmidi drm snd_hda_core snd_seq_device i2c_i801 snd_pcm i2c_core snd_timer snd r8169 soundcore mii parport_pc parport
> [  226.923365] CPU: 1 PID: 5863 Comm: ls Not tainted 4.4.6 #1
> [  226.923366] Hardware name: Gigabyte Technology Co., Ltd. P67-DS3-B3/P67-DS3-B3, BIOS F1 05/06/2011
> [  226.923367]  0000000000000000 ffff8800da677d20 ffffffff813181a8 0000000000000000
> [  226.923368]  ffffffffa0aacdbf ffff8800da677d58 ffffffff810507b2 ffff880601e90800
> [  226.923369]  ffff8800dacf10a0 ffff880601e90800 ffff880601e909f0 0000000000000001
> [  226.923371] Call Trace:
> [  226.923374]  [<ffffffff813181a8>] dump_stack+0x4d/0x65
> [  226.923376]  [<ffffffff810507b2>] warn_slowpath_common+0x82/0xc0
> [  226.923378]  [<ffffffff810508aa>] warn_slowpath_null+0x1a/0x20
> [  226.923387]  [<ffffffffa0a2cf46>] record_root_in_trans+0xd6/0x100 [btrfs]
> [  226.923395]  [<ffffffffa0a2db24>] btrfs_record_root_in_trans+0x44/0x70 [btrfs]
> [  226.923404]  [<ffffffffa0a2fb5e>] start_transaction+0x9e/0x4c0 [btrfs]
> [  226.923412]  [<ffffffffa0a2ffd7>] btrfs_join_transaction+0x17/0x20 [btrfs]
> [  226.923421]  [<ffffffffa0a359b5>] btrfs_dirty_inode+0x35/0xd0 [btrfs]
> [  226.923430]  [<ffffffffa0a35acd>] btrfs_update_time+0x7d/0xb0 [btrfs]
> [  226.923432]  [<ffffffff81187028>] touch_atime+0x88/0xa0
> [  226.923434]  [<ffffffff8117ec9b>] iterate_dir+0xdb/0x120
> [  226.923435]  [<ffffffff8117f0c8>] SyS_getdents+0x88/0xf0
> [  226.923437]  [<ffffffff8117edb0>] ? fillonedir+0xd0/0xd0
> [  226.923439]  [<ffffffff815b8257>] entry_SYSCALL_64_fastpath+0x12/0x6a
> [  226.923440] ---[ end trace 9c78caf253e284fe ]---
> 
> Code looks like:
> 
> ..
> static int record_root_in_trans(struct btrfs_trans_handle *trans,
> 			       struct btrfs_root *root)
> {
> 	if (test_bit(BTRFS_ROOT_REF_COWS, &root->state) &&
> 	    root->last_trans < trans->transid) {
> 		WARN_ON(root == root->fs_info->extent_root);
> 		WARN_ON(root->commit_root != root->node);
> ..
> 
> There's been a few journal/recovery/directory consistency patches recently,
> so maybe it's a corner case or an older problem. I'll try to bisect, but
> meanwhile wanted to report it for discussion.

Vanilla 4.5.0 is fine with the above test; you may want to bisect between 4.5 and upstream.

Thanks,

-liubo

> 
> Holger
> 


* Re: WARN_ON in record_root_in_trans() when deleting freshly renamed subvolume
  2016-04-07 16:44 WARN_ON in record_root_in_trans() when deleting freshly renamed subvolume Holger Hoffstätte
  2016-04-07 18:07 ` Filipe Manana
  2016-04-08  2:01 ` Liu Bo
@ 2016-04-08 11:14 ` Filipe Manana
  2016-04-08 11:51   ` Holger Hoffstätte
  2 siblings, 1 reply; 10+ messages in thread
From: Filipe Manana @ 2016-04-08 11:14 UTC (permalink / raw)
  To: Holger Hoffstätte; +Cc: linux-btrfs

On Thu, Apr 7, 2016 at 5:44 PM, Holger Hoffstätte
<holger.hoffstaette@googlemail.com> wrote:
> Hi,
>
> Looks like I just found an exciting new corner case.
> kernel 4.4.6 with btrfs ~4.6, so 4.6 should reproduce.

Using Chris' for-linus-4.6 branch, which is 4.5-rc6 + all 4.6 btrfs
patches, it didn't reproduce here:

#!/bin/bash

dmesg -C
mkfs.btrfs -f /dev/sdi
mount /dev/sdi /mnt/sdi
cd /mnt/sdi
btrfs subvolume create foo
sync
btrfs subvolume snapshot foo foo-1
sync
mv foo-1 foo.new
btrfs subvolume delete foo.new
cd -
umount /dev/sdi
dmesg

gives:

btrfs-progs v4.5.1-dirty
See http://btrfs.wiki.kernel.org for more information.

Performing full device TRIM (100.00GiB) ...
Label:              (null)
UUID:               76cebc54-0ae1-4f53-91fd-3f9438bdfb50
Node size:          16384
Sector size:        4096
Filesystem size:    100.00GiB
Block group profiles:
  Data:             single            8.00MiB
  Metadata:         DUP               1.01GiB
  System:           DUP              12.00MiB
SSD detected:       no
Incompat features:  extref, skinny-metadata
Number of devices:  1
Devices:
   ID        SIZE  PATH
    1   100.00GiB  /dev/sdi

Create subvolume './foo'
Create a snapshot of 'foo' in './foo-1'
Delete subvolume (no-commit): '/mnt/sdi/foo.new'
/mnt
[75015.529626] systemd-journald[578]: Sent WATCHDOG=1 notification.
[75015.756407] BTRFS: device fsid 76cebc54-0ae1-4f53-91fd-3f9438bdfb50 devid 1 transid 3 /dev/sdi
[75015.932527] BTRFS info (device sdi): disk space caching is enabled
[75015.937674] BTRFS: has skinny extents
[75015.938470] BTRFS: flagging fs with big metadata feature
[75015.962601] BTRFS: creating UUID tree

Are you sure that you are not using some patches not in 4.6?
Also tried my own integration branch, and no issue either.




-- 
Filipe David Manana,

"Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men."


* Re: WARN_ON in record_root_in_trans() when deleting freshly renamed subvolume
  2016-04-08 11:14 ` Filipe Manana
@ 2016-04-08 11:51   ` Holger Hoffstätte
  2016-04-08 13:10     ` Holger Hoffstätte
  0 siblings, 1 reply; 10+ messages in thread
From: Holger Hoffstätte @ 2016-04-08 11:51 UTC (permalink / raw)
  To: fdmanana; +Cc: linux-btrfs

On 04/08/16 13:14, Filipe Manana wrote:
> Using Chris' for-linus-4.6 branch, which is 4.5-rc6 + all 4.6 btrfs
> patches, it didn't reproduce here:

Great, that's good to know (sort of :). Thanks also to Liu Bo.

> Are you sure that you are not using some patches not in 4.6?

Quite a few, but to offset that I also left out some that have diverged
too much or were not that important (block/sectorsize, device handling).
But those should not have anything to do with this particular bug.

Except for this everything works rock-solid, I use it daily.
Should be easy to track down..

-h



* Re: WARN_ON in record_root_in_trans() when deleting freshly renamed subvolume
  2016-04-08 11:51   ` Holger Hoffstätte
@ 2016-04-08 13:10     ` Holger Hoffstätte
  2016-04-08 19:18       ` Mark Fasheh
  0 siblings, 1 reply; 10+ messages in thread
From: Holger Hoffstätte @ 2016-04-08 13:10 UTC (permalink / raw)
  To: Filipe David Manana; +Cc: linux-btrfs, Mark Fasheh, Qu Wenruo

[cc: Mark and Qu]

On 04/08/16 13:51, Holger Hoffstätte wrote:
> On 04/08/16 13:14, Filipe Manana wrote:
>> Using Chris' for-linus-4.6 branch, which is 4.5-rc6 + all 4.6 btrfs
>> patches, it didn't reproduce here:
> 
> Great, that's good to know (sort of :). Thanks also to Liu Bo.
> 
>> Are you sure that you are not using some patches not in 4.6?

We have a bingo!

Reverting "qgroup: Fix qgroup accounting when creating snapshot"
from last Wednesday immediately fixes the problem.

Was quite easy to find - the triggered WARN_ON was the second one that
complained about a mismatch between roots. The only patch that even
remotely did something in that area was said qgroup fix.
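
For reference, the condition can be modelled in userspace. This is only a
sketch: toy_root and toy_second_warn_fires() are made-up names, and the
fields mirror just what the snippet of record_root_in_trans() from the
original report reads, not the real struct btrfs_root.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-in; not the kernel's struct btrfs_root. */
struct toy_root {
	bool ref_cows;            /* models BTRFS_ROOT_REF_COWS being set */
	uint64_t last_trans;      /* last transaction that recorded this root */
	const void *commit_root;  /* tree root as of the last commit */
	const void *node;         /* current in-memory tree root */
};

/* Returns true when the second WARN_ON would fire for transaction transid. */
static bool toy_second_warn_fires(const struct toy_root *root, uint64_t transid)
{
	/* Same guard as the quoted if (): otherwise the checks are not reached. */
	if (!root->ref_cows || root->last_trans >= transid)
		return false;

	/* Models WARN_ON(root->commit_root != root->node). */
	return root->commit_root != root->node;
}
```

In this model the warning needs both an older last_trans and a commit root
that differs from the current node; a root already recorded in the running
transaction never reaches the check.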

Looks like something is missing there. Suggestions welcome. :)

Holger



* Re: WARN_ON in record_root_in_trans() when deleting freshly renamed subvolume
  2016-04-08 13:10     ` Holger Hoffstätte
@ 2016-04-08 19:18       ` Mark Fasheh
  2016-04-11  1:05         ` Qu Wenruo
  0 siblings, 1 reply; 10+ messages in thread
From: Mark Fasheh @ 2016-04-08 19:18 UTC (permalink / raw)
  To: Holger Hoffstätte; +Cc: Filipe David Manana, linux-btrfs, Qu Wenruo

On Fri, Apr 08, 2016 at 03:10:35PM +0200, Holger Hoffstätte wrote:
> [cc: Mark and Qu]
> 
> On 04/08/16 13:51, Holger Hoffstätte wrote:
> > On 04/08/16 13:14, Filipe Manana wrote:
> >> Using Chris' for-linus-4.6 branch, which is 4.5-rc6 + all 4.6 btrfs
> >> patches, it didn't reproduce here:
> > 
> > Great, that's good to know (sort of :). Thanks also to Liu Bo.
> > 
> >> Are you sure that you are not using some patches not in 4.6?
> 
> We have a bingo!
> 
> Reverting "qgroup: Fix qgroup accounting when creating snapshot"
> from last Wednesday immediately fixes the problem.

Not surprising, I had some issues testing it out too. I'm pretty sure this
patch is corrupting memory, I just haven't found where yet though my
educated guess is that the transaction is being reused improperly.
	--Mark

--
Mark Fasheh


* Re: WARN_ON in record_root_in_trans() when deleting freshly renamed subvolume
  2016-04-08 19:18       ` Mark Fasheh
@ 2016-04-11  1:05         ` Qu Wenruo
  2016-04-11 18:09           ` Mark Fasheh
  0 siblings, 1 reply; 10+ messages in thread
From: Qu Wenruo @ 2016-04-11  1:05 UTC (permalink / raw)
  To: Mark Fasheh, Holger Hoffstätte; +Cc: Filipe David Manana, linux-btrfs



Mark Fasheh wrote on 2016/04/08 12:18 -0700:
> On Fri, Apr 08, 2016 at 03:10:35PM +0200, Holger Hoffstätte wrote:
>> [cc: Mark and Qu]
>>
>> On 04/08/16 13:51, Holger Hoffstätte wrote:
>>> On 04/08/16 13:14, Filipe Manana wrote:
>>>> Using Chris' for-linus-4.6 branch, which is 4.5-rc6 + all 4.6 btrfs
>>>> patches, it didn't reproduce here:
>>>
>>> Great, that's good to know (sort of :). Thanks also to Liu Bo.
>>>
>>>> Are you sure that you are not using some patches not in 4.6?
>>
>> We have a bingo!
>>
>> Reverting "qgroup: Fix qgroup accounting when creating snapshot"
>> from last Wednesday immediately fixes the problem.
>
> Not surprising, I had some issues testing it out too. I'm pretty sure this
> patch is corrupting memory, I just haven't found where yet though my
> educated guess is that the transaction is being reused improperly.
> 	--Mark
>
> --
> Mark Fasheh
>
>
Still digging into the bug Mark reported about the patch.

Good to have another report, as I can't always reproduce the soft lockup
from Mark.

It seems that the WARN_ON will provide another clue for fixing it.

BTW, the memory corruption assumption seems to be quite helpful.
I hadn't considered it that way, but it seems to be the only explanation
for a dead spinlock while no other thread is spinning and there is no
lockdep warning.

Thanks,
Qu




* Re: WARN_ON in record_root_in_trans() when deleting freshly renamed subvolume
  2016-04-11  1:05         ` Qu Wenruo
@ 2016-04-11 18:09           ` Mark Fasheh
  2016-04-12  0:30             ` Qu Wenruo
  0 siblings, 1 reply; 10+ messages in thread
From: Mark Fasheh @ 2016-04-11 18:09 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Holger Hoffstätte, Filipe David Manana, linux-btrfs

On Mon, Apr 11, 2016 at 09:05:47AM +0800, Qu Wenruo wrote:
> 
> 
> Mark Fasheh wrote on 2016/04/08 12:18 -0700:
> >On Fri, Apr 08, 2016 at 03:10:35PM +0200, Holger Hoffstätte wrote:
> >>[cc: Mark and Qu]
> >>
> >>On 04/08/16 13:51, Holger Hoffstätte wrote:
> >>>On 04/08/16 13:14, Filipe Manana wrote:
> >>>>Using Chris' for-linus-4.6 branch, which is 4.5-rc6 + all 4.6 btrfs
> >>>>patches, it didn't reproduce here:
> >>>
> >>>Great, that's good to know (sort of :). Thanks also to Liu Bo.
> >>>
> >>>>Are you sure that you are not using some patches not in 4.6?
> >>
> >>We have a bingo!
> >>
> >>Reverting "qgroup: Fix qgroup accounting when creating snapshot"
> >>from last Wednesday immediately fixes the problem.
> >
> >Not surprising, I had some issues testing it out too. I'm pretty sure this
> >patch is corrupting memory, I just haven't found where yet though my
> >educated guess is that the transaction is being reused improperly.
> >	--Mark
> >
> >--
> >Mark Fasheh
> >
> >
> Still digging the bug Mark has reported about the patch.
> 
> Good to have another report, as I can't always reproduce the soft
> lockup from Mark.
> 
> It seems that the WARN_ON will bring another clue to fix it.
> 
> BTW, the memory corruption assumption seems to be quite helpful.
> I didn't consider in that way, but it seems to be the only reason
> causing dead spinlock while no other thread spinning and no lockdep
> warning.

It seems to be the call to commit_cowonly_roots() in your patch which sets
everything off. If I remove that call I can run all day without a crash.

Btw, I'm not convinced this fixes the qgroup numbers anyway - we are still
inconsistent even if I don't get a crash.

Have you tested that the actual numbers on your end are coming out ok?
	--Mark

--
Mark Fasheh


* Re: WARN_ON in record_root_in_trans() when deleting freshly renamed subvolume
  2016-04-11 18:09           ` Mark Fasheh
@ 2016-04-12  0:30             ` Qu Wenruo
  0 siblings, 0 replies; 10+ messages in thread
From: Qu Wenruo @ 2016-04-12  0:30 UTC (permalink / raw)
  To: Mark Fasheh; +Cc: Holger Hoffstätte, Filipe David Manana, linux-btrfs



Mark Fasheh wrote on 2016/04/11 11:09 -0700:
> On Mon, Apr 11, 2016 at 09:05:47AM +0800, Qu Wenruo wrote:
>>
>>
>> Mark Fasheh wrote on 2016/04/08 12:18 -0700:
>>> On Fri, Apr 08, 2016 at 03:10:35PM +0200, Holger Hoffstätte wrote:
>>>> [cc: Mark and Qu]
>>>>
>>>> On 04/08/16 13:51, Holger Hoffstätte wrote:
>>>>> On 04/08/16 13:14, Filipe Manana wrote:
>>>>>> Using Chris' for-linus-4.6 branch, which is 4.5-rc6 + all 4.6 btrfs
>>>>>> patches, it didn't reproduce here:
>>>>>
>>>>> Great, that's good to know (sort of :). Thanks also to Liu Bo.
>>>>>
>>>>>> Are you sure that you are not using some patches not in 4.6?
>>>>
>>>> We have a bingo!
>>>>
>>>> Reverting "qgroup: Fix qgroup accounting when creating snapshot"
>>>> from last Wednesday immediately fixes the problem.
>>>
>>> Not surprising, I had some issues testing it out too. I'm pretty sure this
>>> patch is corrupting memory, I just haven't found where yet though my
>>> educated guess is that the transaction is being reused improperly.
>>> 	--Mark
>>>
>>> --
>>> Mark Fasheh
>>>
>>>
>> Still digging the bug Mark has reported about the patch.
>>
>> Good to have another report, as I can't always reproduce the soft
>> lockup from Mark.
>>
>> It seems that the WARN_ON will bring another clue to fix it.
>>
>> BTW, the memory corruption assumption seems to be quite helpful.
>> I didn't consider in that way, but it seems to be the only reason
>> causing dead spinlock while no other thread spinning and no lockdep
>> warning.
>
> It seems to be the call to commit_cowonly_roots() in your patch which sets
> everything off. If I remove that call I can run all day without a crash.
>
> Btw, I'm not convinced this fixes the qgroup numbers anyway - we are still
> inconsistent even if I don't get a crash.
>
> Have you tested that the actual numbers on your end are coming out ok?
> 	--Mark

Yes, my initial test shows that a snapshot of the fs tree doesn't break
the numbers anymore.

And commit_cowonly_roots() is the core of the fix; without it the bug
won't be fixed.

I'm still digging, but it seems to be related to a missing
switch_commit_roots() call after commit_cowonly_roots(). I'm still
uncertain, though, as I'm not familiar with the commit code.
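
To illustrate the suspicion (not a confirmed diagnosis), here is a toy C
model of why a missing commit-root switch would leave commit_root and node
out of step, which is what the second WARN_ON checks. All names (toy_tree,
toy_cow, toy_switch_commit_root, toy_roots_match) are hypothetical, and the
integer block ids merely stand in for extent buffers.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two root pointers the WARN_ON compares. */
struct toy_tree {
	int commit_root;  /* id of the root block as of the last commit */
	int node;         /* id of the current root block */
};

/* Models a copy-on-write update: the tree root moves to a new block. */
static void toy_cow(struct toy_tree *tree, int new_block)
{
	tree->node = new_block;
}

/* Models the assumed effect of switching commit roots: commit_root catches up. */
static void toy_switch_commit_root(struct toy_tree *tree)
{
	tree->commit_root = tree->node;
}

/* True when the WARN_ON(commit_root != node) condition would stay quiet. */
static bool toy_roots_match(const struct toy_tree *tree)
{
	return tree->commit_root == tree->node;
}
```

In this model a COW that is not followed by the switch leaves
toy_roots_match() false until the next switch, which would match the
symptom reported in this thread.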

Thanks,
Qu

>
> --
> Mark Fasheh
>
>


