* btrfs check --repair crash, and btrfs-cleaner crash
@ 2015-07-06 21:21 Marc MERLIN
  2015-07-10 13:43 ` Btrfs progs release 4.1.1 David Sterba
  0 siblings, 1 reply; 25+ messages in thread
From: Marc MERLIN @ 2015-07-06 21:21 UTC
  To: linux-btrfs


myth:~# btrfs check --repair /dev/mapper/crypt_sdd1 
enabling repair mode
Checking filesystem on /dev/mapper/crypt_sdd1
UUID: 024ba4d0-dacb-438d-9f1b-eeb34083fe49
checking extents
cmds-check.c:4486: add_data_backref: Assertion `back->bytes != max_size` failed.
btrfs[0x8066a73]
btrfs[0x8066aa4]
btrfs[0x8067991]
btrfs[0x806b4ab]
btrfs[0x806b9a3]
btrfs[0x806c5b2]
btrfs(cmd_check+0x1088)[0x806eddf]
btrfs(main+0x153)[0x80557c6]
/lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0xb75064d3]
btrfs[0x80557ec]

myth:~# btrfs --version
btrfs-progs v4.0
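
For anyone who wants to turn the raw addresses in the backtrace above into
function names, here is a minimal Python sketch. It assumes an unstripped
btrfs binary with debug info that is identical to the one that crashed, and
that the bracketed addresses can be passed to addr2line unchanged (the
0x80... values look like a non-PIE i386 binary, but that is an assumption).
The helper names and the ./btrfs path are hypothetical; only addr2line's
-f and -e options are relied on.

```python
import re

# Matches "btrfs[0x8066a73]" and "btrfs(cmd_check+0x1088)[0x806eddf]".
# Frames from other objects (e.g. libc.so.6) are skipped on purpose.
ADDR_RE = re.compile(r"^btrfs(?:\(\S*\))?\[(0x[0-9a-f]+)\]$")

def backtrace_addresses(trace):
    """Return the addresses of all frames that belong to the btrfs binary."""
    addrs = []
    for line in trace.splitlines():
        m = ADDR_RE.match(line.strip())
        if m:
            addrs.append(m.group(1))
    return addrs

def addr2line_command(binary, addrs):
    """Build an addr2line argument list (-f prints function names too)."""
    return ["addr2line", "-f", "-e", binary] + addrs

# Three frames copied from the report above.
sample = """btrfs[0x8066a73]
btrfs(cmd_check+0x1088)[0x806eddf]
/lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0xb75064d3]"""

addrs = backtrace_addresses(sample)
# "./btrfs" is a placeholder path for the matching binary.
print(" ".join(addr2line_command("./btrfs", addrs)))
```

The output is only meaningful if the binary matches the crashed one exactly;
a rebuild with different compiler options will place the code elsewhere.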

Is anyone interested in getting data off this filesystem or having me
try newer code/a patch?

The filesystem is 10TB-ish, so sending an image isn't going to be easy.

I can mount with -o ro without it crashing, but if I drop ro, it then
tries to do something and crashes, and unfortunately the error doesn't
make it to syslog

Screenshot: http://marc.merlins.org/tmp/btrfs_crash.jpg

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/                         | PGP 1024R/763BE901


* Btrfs progs release 4.1.1
@ 2015-07-10 13:43 ` David Sterba
  2015-07-12  1:02   ` Marc MERLIN
  0 siblings, 1 reply; 25+ messages in thread
From: David Sterba @ 2015-07-10 13:43 UTC
  To: linux-btrfs; +Cc: clm

Hi,

btrfs-progs 4.1.1 has been released, with a few bugfixes and enhancements.

* bugfixes
  - defrag: threshold overflow fix
  - fsck:
    - check if items fit into the leaf space
    - fix wrong nbytes
  - mkfs:
    - create only desired block groups for single device
    - preparatory work for fix on multiple devices

* enhancements
  - new alias for 'device delete': 'device remove'

* other
  - fix compilation on old gcc (4.3)
  - documentation updates
  - debug-tree: print nbytes
  - test: image for corrupted nbytes
  - corrupt-block: let it kill nbytes

Tarballs: https://www.kernel.org/pub/linux/kernel/people/kdave/btrfs-progs/
Git: git://git.kernel.org/pub/scm/linux/kernel/git/kdave/btrfs-progs.git

Shortlog:

Adam Borowski (1):
      btrfs-progs: doc: mkfs.btrfs: document -O^

David Sterba (15):
      btrfs-progs: defrag, check target extent earlier
      btrfs-progs: doc: update defrag page
      btrfs-progs: no extra newline between aliased commands in help text
      btrfs-progs: drop argument from attribute deprecated
      btrfs-progs: check for item end outside of leaf
      btrfs-progs: add 'device remove' alias to completion
      btrfs-progs: move make_btrfs arguments to a struct
      btrfs-progs: drop unused parameter from make_btrfs
      btrfs-progs: split metadata group creation out of make_root_dir
      btrfs-progs: move transaction start/commit out of make_root_dir
      btrfs-progs: split data block group creation out of make_root_dir
      btrfs-progs: drop unused argument from create_raid_groups
      btrfs-progs: mkfs: create only desired block groups for single device
      btrfs-progs: let corrupt-block kill nbytes
      Btrfs progs v4.1.1

Omar Sandoval (2):
      btrfs-progs: replace struct cmd_group->hidden with flags
      btrfs-progs: alias btrfs device delete to btrfs device remove

Patrik Lundquist (2):
      btrfs-progs: fix defrag threshold overflow
      btrfs-progs: inspect: Fix out of bounds string termination.

Qu Wenruo (9):
      btrfs-progs: Add nbytes output for print-tree and reformat inode output
      btrfs-progs: fsck: Add repair function for I_ERR_FILE_WRONG_NBYTES
      btrfs-progs: tests: Add test case for I_ERR_FILE_WRONG_NBYTES repair
      btrfs-progs: disk-io: Support commit transaction on chunk tree
      btrfs-progs: extent-tree: Introduce free_block_group_item function
      btrfs-progs: extent-tree: Introduce functions to free dev extents in a chunk
      btrfs-progs: extent-tree: Introduce functions to free chunk items
      btrfs-progs: extent-tree: Introduce functions to free in-memory block group cache
      btrfs-progs: extent-tree: Introduce btrfs_free_block_group function

Tsutomu Itoh (1):
      btrfs-progs: doc: fix short explanation of restore in btrfs


* Re: Btrfs progs release 4.1.1
  2015-07-10 13:43 ` Btrfs progs release 4.1.1 David Sterba
@ 2015-07-12  1:02   ` Marc MERLIN
  2015-07-23 11:55     ` David Sterba
  0 siblings, 1 reply; 25+ messages in thread
From: Marc MERLIN @ 2015-07-12  1:02 UTC
  To: dsterba, linux-btrfs, clm

On Fri, Jul 10, 2015 at 03:43:48PM +0200, David Sterba wrote:
> Hi,
> 
> btrfs-progs 4.1.1 has been released, with a few bugfixes and enhancements.
> 
> * bugfixes
>   - defrag: threshold overflow fix
>   - fsck:
>     - check if items fit into the leaf space
>     - fix wrong nbytes

Are you interested in crash reports for fsck?

If so, see my recent message:

On Mon, Jul 06, 2015 at 02:21:56PM -0700, Marc MERLIN wrote:
> 
> myth:~# btrfs check --repair /dev/mapper/crypt_sdd1 
> enabling repair mode
> Checking filesystem on /dev/mapper/crypt_sdd1
> UUID: 024ba4d0-dacb-438d-9f1b-eeb34083fe49
> checking extents
> cmds-check.c:4486: add_data_backref: Assertion `back->bytes != max_size` failed.
> btrfs[0x8066a73]
> btrfs[0x8066aa4]
> btrfs[0x8067991]
> btrfs[0x806b4ab]
> btrfs[0x806b9a3]
> btrfs[0x806c5b2]
> btrfs(cmd_check+0x1088)[0x806eddf]
> btrfs(main+0x153)[0x80557c6]
> /lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0xb75064d3]
> btrfs[0x80557ec]
> 
> myth:~# btrfs --version
> btrfs-progs v4.0
> 
> Is anyone interested in getting data off this filesystem or having me
> try newer code/a patch?
> 
> filesystem is 10TB-ish, so sending an image isn't going to be easy though.
> 
> I can mount with -o ro without it crashing, but if I drop ro, it then
> tries to do something and crashes, and unfortunately the error doesn't
> make it to syslog
> 
> Screenshot: http://marc.merlins.org/tmp/btrfs_crash.jpg


* Re: Btrfs progs release 4.1.1
  2015-07-12  1:02   ` Marc MERLIN
@ 2015-07-23 11:55     ` David Sterba
  2015-07-24 16:24       ` Marc MERLIN
  0 siblings, 1 reply; 25+ messages in thread
From: David Sterba @ 2015-07-23 11:55 UTC
  To: Marc MERLIN; +Cc: clm, linux-btrfs, jbacik

On Sat, Jul 11, 2015 at 06:02:29PM -0700, Marc MERLIN wrote:
> Are you interested in crash reports for fsck?
> 
> If so, see my recent message:
> 
> On Mon, Jul 06, 2015 at 02:21:56PM -0700, Marc MERLIN wrote:
> > 
> > myth:~# btrfs check --repair /dev/mapper/crypt_sdd1 
> > enabling repair mode
> > Checking filesystem on /dev/mapper/crypt_sdd1
> > UUID: 024ba4d0-dacb-438d-9f1b-eeb34083fe49
> > checking extents
> > cmds-check.c:4486: add_data_backref: Assertion `back->bytes != max_size` failed.

The bugon was added by Josef in commit 650e656a8b9c1fbe4e
(https://git.kernel.org/kdave/btrfs-progs/c/650e656a8b9c1fbe4ec)

but I don't think that your filesystem is affected by the described bug,
rather that it tripped over some other inconsistency in backrefs.

> > I can mount with -o ro without it crashing, but if I drop ro, it then
> > tries to do something and crashes, and unfortunately the error doesn't
> > make it to syslog
> > 
> > Screenshot: http://marc.merlins.org/tmp/btrfs_crash.jpg

So it's a 32bit system, 3.19.8, crashing during snapshot deletion and
backref walking. EIP is in do_walk_down+0x142. I've tried to match it to
the sources on a local 32bit build, but it does not point to the
expected crash site:

(gdb) l *(do_walk_down+0x142)
0x1cdc2 is in do_walk_down (fs/btrfs/extent-tree.c:7875).
7870
7871                            wc->stage = UPDATE_BACKREF;
7872                            wc->shared_level = level - 1;
7873                    }
7874            } else {
7875                    if (level == 1 &&
7876                        (wc->flags[0] & BTRFS_BLOCK_FLAG_FULL_BACKREF))
7877                            goto skip;
7878            }
7879

There are other places where it could hit a bug-on. It would need more
debugging to find out what's happening.
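
As an illustration of the lookup above, here is a minimal Python sketch that
pulls the symbol and offset out of an oops line and builds the matching gdb
"l *(symbol+offset)" command. The function names are hypothetical, and the
result only points at the right source line if gdb is run against a build
from the same sources and config as the crashed kernel, which is the
limitation described above.

```python
import re

# Matches "symbol+0xOFF" with an optional "/0xSIZE" suffix, as printed in
# kernel oops output (e.g. "do_walk_down+0x142/0x65f").
LOC_RE = re.compile(
    r"(?P<sym>[A-Za-z_][\w.]*)\+0x(?P<off>[0-9a-f]+)(?:/0x(?P<size>[0-9a-f]+))?"
)

def parse_location(text):
    """Return (symbol, offset, size-or-None), or None if nothing matches."""
    m = LOC_RE.search(text)
    if m is None:
        return None
    size = int(m.group("size"), 16) if m.group("size") else None
    return m.group("sym"), int(m.group("off"), 16), size

def gdb_list_command(symbol, offset):
    """Build the gdb command that lists the source around symbol+offset."""
    return f"l *({symbol}+{offset:#x})"

sym, off, size = parse_location("EIP is in do_walk_down+0x142")
print(gdb_list_command(sym, off))
```

The command is meant to be typed at a gdb prompt opened on a vmlinux (or
btrfs.o) that carries debug info.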


* Re: Btrfs progs release 4.1.1
  2015-07-23 11:55     ` David Sterba
@ 2015-07-24 16:24       ` Marc MERLIN
  2015-08-03  3:51         ` kernel BUG at fs/btrfs/extent-tree.c:8113! (4.1.3 kernel) Marc MERLIN
  0 siblings, 1 reply; 25+ messages in thread
From: Marc MERLIN @ 2015-07-24 16:24 UTC
  To: dsterba, clm, linux-btrfs, jbacik

On Thu, Jul 23, 2015 at 01:55:59PM +0200, David Sterba wrote:
> On Sat, Jul 11, 2015 at 06:02:29PM -0700, Marc MERLIN wrote:
> > Are you interested in crash reports for fsck?
> > 
> > If so, see my recent message:
> > 
> > On Mon, Jul 06, 2015 at 02:21:56PM -0700, Marc MERLIN wrote:
> > > 
> > > myth:~# btrfs check --repair /dev/mapper/crypt_sdd1 
> > > enabling repair mode
> > > Checking filesystem on /dev/mapper/crypt_sdd1
> > > UUID: 024ba4d0-dacb-438d-9f1b-eeb34083fe49
> > > checking extents
> > > cmds-check.c:4486: add_data_backref: Assertion `back->bytes != max_size` failed.
> 
> The bugon was added by Josef in commit 650e656a8b9c1fbe4e
> (https://git.kernel.org/kdave/btrfs-progs/c/650e656a8b9c1fbe4ec)
> 
> but I don't think that your filesystem is affected by the described bug,
> rather that it tripped over some other inconsistency in backrefs.
> 
> > > I can mount with -o ro without it crashing, but if I drop ro, it then
> > > tries to do something and crashes, and unfortunately the error doesn't
> > > make it to syslog
> > > 
> > > Screenshot: http://marc.merlins.org/tmp/btrfs_crash.jpg
> 
> So it's a 32bit system, 3.19.8, crashing during snapshot deletion and
> backref walking. EIP is in do_walk_down+0x142. I've tried to match it to
> the sources on a local 32bit build, but it does not point to the
> expected crash site:

Thanks for looking.
Unfortunately it's a mythtv where if I put a 64bit kernel, other things
go wrong with the 32bit userland/64bit kernel split.
But I'll put a newer 64bit kernel on it to see what happens and report
back.

Thanks,
Marc


* Re: kernel BUG at fs/btrfs/extent-tree.c:8113! (4.1.3 kernel)
  2015-07-24 16:24       ` Marc MERLIN
@ 2015-08-03  3:51         ` Marc MERLIN
  2015-08-11  5:07           ` Marc MERLIN
  0 siblings, 1 reply; 25+ messages in thread
From: Marc MERLIN @ 2015-08-03  3:51 UTC
  To: dsterba, clm, linux-btrfs, jbacik

On Fri, Jul 24, 2015 at 09:24:46AM -0700, Marc MERLIN wrote:
> > > > Screenshot: http://marc.merlins.org/tmp/btrfs_crash.jpg
> > 
> > So it's a 32bit system, 3.19.8, crashing during snapshot deletion and
> > backref walking. EIP is in do_walk_down+0x142. I've tried to match it to
> > the sources on a local 32bit build, but it does not point to the
> > expected crash site:
> 
> Thanks for looking.
> Unfortunately it's a mythtv where if I put a 64bit kernel, other things
> go wrong with the 32bit userland/64bit kernel split.
> But I'll put a newer 64bit kernel on it to see what happens and report
> back.

I got home, built the latest kernel and got netconsole working.
4.1.3/64bit and 32bit crash the same way. 

Here's the 64bit crash:
[  209.647162] BTRFS: device label bigbackup devid 1 transid 39536 /dev/mapper/crypt_sdd1
[  209.647449] BTRFS: device label bigbackup devid 5 transid 39536 /dev/mapper/crypt_sdh1
[  209.648069] BTRFS: device label bigbackup devid 4 transid 39536 /dev/mapper/crypt_sdg1
[  209.648469] BTRFS: device label bigbackup devid 2 transid 39536 /dev/mapper/crypt_sde1
[  209.648871] BTRFS: device label bigbackup devid 3 transid 39536 /dev/mapper/crypt_sdf1
[  209.675030] BTRFS info (device dm-0): disk space caching is enabled  
[  249.865515] ------------[ cut here ]------------  
[  249.865530] WARNING: CPU: 1 PID: 3556 at fs/btrfs/extent-tree.c:863 btrfs_lookup_extent_info+0x292/0x2e0()
[  249.865534] Modules linked in: xts gf128mul configs rc_hauppauge ir_kbd_i2c cpufreq_userspace cpufreq_powersave cpufreq_conservative cpufreq_stats autofs4 joydev hid_generic usbhid tuner_simple tuner_types tda9887 tda8290 hid tuner msp3400 firewire_sbp2 snd_hda_codec_hdmi rc_imon_mce saa7127 snd_hda_codec_realtek snd_hda_codec_generic hwmon_vid dm_crypt snd_hda_intel dm_mod snd_hda_controller imon saa7115 snd_hda_codec snd_hda_core coretemp snd_pcm_oss snd_mixer_oss ivtv bttv cx2341x tea575x v4l2_common snd_pcm videodev snd_hwdep snd_seq_midi videobuf_dma_sg videobuf_core rc_core snd_rawmidi snd_seq_midi_event snd_seq gpio_ich kvm_intel ehci_pci sr_mod media firewire_ohci cdrom floppy snd_timer ehci_hcd uhci_hcd psmouse snd_seq_device tveeprom acpi_cpufreq lpc_ich usbcore sg lp firewire_core asus_atk0110 evdev usb_common kvm atl1 serio_raw mii crc_itu_t parport snd soundcore processor microcode 8250_fintek
[  249.865672] CPU: 1 PID: 3556 Comm: btrfs-cleaner Not tainted 4.1.3-amd64-i915-volpreempt-20150421 #1
[  249.865674] Hardware name: System manufacturer P5E-VM HDMI/P5E-VM HDMI, BIOS 0604    07/16/2008
[  249.865676]  0000000000000009 ffff880041e03b68 ffffffff8169d72d 0000000000004fb9
[  249.865683]  0000000000000000 ffff880041e03ba8 ffffffff8105861f ffffffff00000094
[  249.865691]  ffffffff81239ac5 ffff8800772920b0 0000000000000000 0000000000000000
[  249.865700] Call Trace:
[  249.865706]  [<ffffffff8169d72d>] dump_stack+0x45/0x57
[  249.865711]  [<ffffffff8105861f>] warn_slowpath_common+0xa1/0xbb
[  249.865714]  [<ffffffff81239ac5>] ? btrfs_lookup_extent_info+0x292/0x2e0
[  249.865717]  [<ffffffff810586dc>] warn_slowpath_null+0x1a/0x1c
[  249.865720]  [<ffffffff81239ac5>] btrfs_lookup_extent_info+0x292/0x2e0
[  249.865724]  [<ffffffff816a3752>] ? _raw_write_lock+0xe/0x10
[  249.865727]  [<ffffffff8123c4af>] do_walk_down+0x14d/0x735
[  249.865731]  [<ffffffff8123cb1e>] walk_down_tree+0x87/0xb0
[  249.865734]  [<ffffffff8123f617>] btrfs_drop_snapshot+0x2e7/0x696
[  249.865739]  [<ffffffff8124ee09>] btrfs_clean_one_deleted_snapshot+0xce/0xdb
[  249.865742]  [<ffffffff81248452>] cleaner_kthread+0x112/0x146
[  249.865745]  [<ffffffff81248340>] ? atomic_add_unless.constprop.56+0x24/0x24
[  249.865748]  [<ffffffff81248340>] ? atomic_add_unless.constprop.56+0x24/0x24
[  249.865751]  [<ffffffff810706f6>] kthread+0xae/0xb6
[  249.865755]  [<ffffffff81070648>] ? __kthread_parkme+0x61/0x61
[  249.865758]  [<ffffffff816a3e62>] ret_from_fork+0x42/0x70
[  249.865761]  [<ffffffff81070648>] ? __kthread_parkme+0x61/0x61
[  249.865764] ---[ end trace 826326bc6da53e4f ]---
[  249.865767] BTRFS error (device dm-0): Missing references.
[  249.865781] ------------[ cut here ]------------
[  249.867106] kernel BUG at fs/btrfs/extent-tree.c:8113!
[  249.868435] invalid opcode: 0000 [#1] SMP
[  249.869508] Modules linked in: xts gf128mul configs rc_hauppauge ir_kbd_i2c cpufreq_userspace cpufreq_powersave cpufreq_conservative cpufreq_stats autofs4 joydev hid_generic usbhid tuner_simple tuner_types tda9887 tda8290 hid tuner msp3400 firewire_sbp2 snd_hda_codec_hdmi rc_imon_mce saa7127 snd_hda_codec_realtek snd_hda_codec_generic hwmon_vid dm_crypt snd_hda_intel dm_mod snd_hda_controller imon saa7115 snd_hda_codec snd_hda_core coretemp snd_pcm_oss snd_mixer_oss ivtv bttv cx2341x tea575x v4l2_common snd_pcm videodev snd_hwdep snd_seq_midi videobuf_dma_sg videobuf_core rc_core snd_rawmidi snd_seq_midi_event snd_seq gpio_ich kvm_intel ehci_pci sr_mod media firewire_ohci cdrom floppy snd_timer ehci_hcd uhci_hcd psmouse snd_seq_device tveeprom acpi_cpufreq lpc_ich usbcore sg lp firewire_core asus_atk0110 evdev usb_common kvm atl1 serio_raw mii crc_itu_t parport snd soundcore processor microcode 8250_fintek
[  249.869508] CPU: 1 PID: 3556 Comm: btrfs-cleaner Tainted: G        W       4.1.3-amd64-i915-volpreempt-20150421 #1
[  249.869508] Hardware name: System manufacturer P5E-VM HDMI/P5E-VM HDMI, BIOS 0604    07/16/2008
[  249.869508] task: ffff880041dfc110 ti: ffff880041e00000 task.ti: ffff880041e00000
[  249.869508] RIP: 0010:[<ffffffff8123c4e9>]  [<ffffffff8123c4e9>] do_walk_down+0x187/0x735
[  249.869508] RSP: 0018:ffff880041e03c68  EFLAGS: 00010296
[  249.869508] RAX: 000000000000002e RBX: ffff880046655010 RCX: 0000000000000007
[  249.869508] RDX: 0000000000005063 RSI: 0000000000000246 RDI: ffff88007f48e5d0
[  249.869508] RBP: ffff880041e03d38 R08: 000000000000004f R09: 0000000000000001
[  249.869508] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800778a1800
[  249.869508] R13: ffff8800477f4100 R14: ffff880077292140 R15: ffff880077292150
[  249.869508] FS:  0000000000000000(0000) GS:ffff88007f480000(0000) knlGS:0000000000000000
[  249.869508] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  249.869508] CR2: 00000000084764f8 CR3: 0000000071dda000 CR4: 00000000000006e0
[  249.869508] Stack:
[  249.869508]  ffff8800477f4148 0000000000000000 ffff880000000000 0000000000200282
[  249.869508]  ffff88007b5b1880 ffff8800477f4108 0000000000000008 ffff880077292148
[  249.869508]  ffff880000000001 0000000000000001 ffff880041e03d64 0000008636598000
[  249.869508] Call Trace:
[  249.869508]  [<ffffffff8123cb1e>] walk_down_tree+0x87/0xb0
[  249.869508]  [<ffffffff8123f617>] btrfs_drop_snapshot+0x2e7/0x696
[  249.869508]  [<ffffffff8124ee09>] btrfs_clean_one_deleted_snapshot+0xce/0xdb
[  249.869508]  [<ffffffff81248452>] cleaner_kthread+0x112/0x146
[  249.869508]  [<ffffffff81248340>] ? atomic_add_unless.constprop.56+0x24/0x24
[  249.869508]  [<ffffffff81248340>] ? atomic_add_unless.constprop.56+0x24/0x24
[  249.869508]  [<ffffffff810706f6>] kthread+0xae/0xb6
[  249.869508]  [<ffffffff81070648>] ? __kthread_parkme+0x61/0x61
[  249.869508]  [<ffffffff816a3e62>] ret_from_fork+0x42/0x70
[  249.869508]  [<ffffffff81070648>] ? __kthread_parkme+0x61/0x61
[  249.869508] Code: 45 ac e8 39 19 04 00 8b 45 ac e9 b8 05 00 00 49 83 3a 00 75 18 49 8b bc 24 f0 01 00 00 48 c7 c6 99 5e ad 81 31 c0 e8 92 e3 fe ff <0f> 0b 48 8b 45 80 c7 00 00 00 00 00 41 83 bd 94 00 00 00 01 0f
[  249.869508] RIP  [<ffffffff8123c4e9>] do_walk_down+0x187/0x735
[  249.869508]  RSP <ffff880041e03c68>
[  249.936818] ---[ end trace 826326bc6da53e50 ]---
[  249.936823] Kernel panic - not syncing: Fatal exception
[  249.938550] Kernel Offset: disabled

And here is 4.1.3/32bit:
[ 1346.737490] ------------[ cut here ]------------
[ 1346.739026] WARNING: CPU: 1 PID: 12919 at fs/btrfs/extent-tree.c:863 btrfs_lookup_extent_info+0x2b8/0x2f7()
[ 1346.740592] Modules linked in: xts gf128mul rc_hauppauge ir_kbd_i2c cpufreq_userspace cpufreq_powersave cpufreq_conservative cpufreq_stats autofs4 tuner_simple tuner_types tda9887 tda8290 tuner msp3400 snd_hda_codec_hdmi saa7127 snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_controller firewire_sbp2 saa7115 snd_hda_codec ivtv snd_hda_core snd_hwdep hwmon_vid snd_pcm_oss snd_mixer_oss snd_pcm dm_crypt snd_seq_midi snd_seq_midi_event dm_mod snd_rawmidi snd_seq bttv tea575x videobuf_dma_sg coretemp videobuf_core snd_seq_device snd_timer tveeprom cx2341x v4l2_common videodev kvm_intel gpio_ich snd soundcore ehci_pci ehci_hcd lpc_ich media kvm psmouse rc_imon_mce acpi_cpufreq imon rc_core joydev sr_mod asus_atk0110 lp cdrom microcode processor hid_generic serio_raw evdev parport sg usbhid hid raid456 async_raid6_recov async_pq async_xor async_memcpy async_tx multipath firewire_ohci firewire_core crc_itu_t atl1 uhci_hcd mii floppy usbcore usb_common
[ 1346.750673] CPU: 1 PID: 12919 Comm: btrfs-cleaner Not tainted 4.1.3-ia32-i915-volpreempt-20150421 #2
[ 1346.752780] Hardware name: System manufacturer P5E-VM HDMI/P5E-VM HDMI, BIOS 0604    07/16/2008
[ 1346.754912]  00000000 00000000 d6413db0 c1544620 00000000 d6413dc8 c10388a5 c11abf7d
[ 1346.757115]  00000000 d87f65d8 00000000 d6413dd8 c1038920 00000009 00000000 d6413e24
[ 1346.759320]  c11abf7d 00000000 00000086 36598000 d87f7dec d87f7d38 00000000 d87f6648
[ 1346.761543] Call Trace:
[ 1346.763709]  [<c1544620>] dump_stack+0x49/0x73
[ 1346.765895]  [<c10388a5>] warn_slowpath_common+0x7e/0x95
[ 1346.768076]  [<c11abf7d>] ? btrfs_lookup_extent_info+0x2b8/0x2f7
[ 1346.770280]  [<c1038920>] warn_slowpath_null+0xf/0x13
[ 1346.772481]  [<c11abf7d>] btrfs_lookup_extent_info+0x2b8/0x2f7
[ 1346.774682]  [<c11aea5b>] do_walk_down+0x10c/0x65f
[ 1346.776890]  [<c11a7a2b>] ? btrfs_tree_unlock_rw+0x10/0x2e
[ 1346.779088]  [<c11ac90d>] ? walk_down_proc+0x110/0x1cb
[ 1346.781265]  [<c11af020>] walk_down_tree+0x72/0x93
[ 1346.783434]  [<c11b190a>] btrfs_drop_snapshot+0x278/0x591
[ 1346.785610]  [<c11bfbf2>] btrfs_clean_one_deleted_snapshot+0x79/0x87
[ 1346.787801]  [<c11b9985>] cleaner_kthread+0x74/0xdd
[ 1346.790007]  [<c11b9911>] ? btrfs_need_cleaner_sleep.isra.20+0x2a/0x2a
[ 1346.792248]  [<c104bccd>] kthread+0x88/0x8d
[ 1346.794503]  [<c105013e>] ? mmdrop+0xe/0x1c
[ 1346.796745]  [<c1050000>] ? check_same_owner+0x2c/0x43
[ 1346.798981]  [<c15498c1>] ret_from_kernel_thread+0x21/0x30
[ 1346.801236]  [<c104bc45>] ? __kthread_parkme+0x50/0x50
[ 1346.803480] ---[ end trace 482e6619a3037689 ]---
[ 1346.805757] BTRFS error (device dm-0): Missing references.
[ 1346.808066] ------------[ cut here ]------------
[ 1346.810375] kernel BUG at fs/btrfs/extent-tree.c:8113!
[ 1346.812053] invalid opcode: 0000 [#1] PREEMPT SMP 
[ 1346.812053] Modules linked in: xts gf128mul rc_hauppauge ir_kbd_i2c cpufreq_userspace cpufreq_powersave cpufreq_conservative cpufreq_stats autofs4 tuner_simple tuner_types tda9887 tda8290 tuner msp3400 snd_hda_codec_hdmi saa7127 snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_controller firewire_sbp2 saa7115 snd_hda_codec ivtv snd_hda_core snd_hwdep hwmon_vid snd_pcm_oss snd_mixer_oss snd_pcm dm_crypt snd_seq_midi snd_seq_midi_event dm_mod snd_rawmidi snd_seq bttv tea575x videobuf_dma_sg coretemp videobuf_core snd_seq_device snd_timer tveeprom cx2341x v4l2_common videodev kvm_intel gpio_ich snd soundcore ehci_pci ehci_hcd lpc_ich media kvm psmouse rc_imon_mce acpi_cpufreq imon rc_core joydev sr_mod asus_atk0110 lp cdrom microcode processor hid_generic serio_raw evdev parport sg usbhid hid raid456 async_raid6_recov async_pq async_xor async_memcpy async_tx multipath firewire_ohci firewire_core crc_itu_t atl1 uhci_hcd mii floppy usbcore usb_common
[ 1346.812053] CPU: 1 PID: 12919 Comm: btrfs-cleaner Tainted: G        W       4.1.3-ia32-i915-volpreempt-20150421 #2
[ 1346.812053] Hardware name: System manufacturer P5E-VM HDMI/P5E-VM HDMI, BIOS 0604    07/16/2008
[ 1346.812053] task: d70faea0 ti: d6412000 task.ti: d6412000
[ 1346.812053] EIP: 0060:[<c11aea91>] EFLAGS: 00010282 CPU: 1
[ 1346.812053] EIP is at do_walk_down+0x142/0x65f
[ 1346.812053] EAX: 0000002e EBX: f4a2a4c0 ECX: f598b310 EDX: 80000000
[ 1346.812053] ESI: d87f66c0 EDI: c21a0c00 EBP: d6413ebc ESP: d6413e40
[ 1346.812053]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[ 1346.812053] CR0: 8005003b CR2: 09947000 CR3: 34d01000 CR4: 000006d0
[ 1346.812053] Stack:
[ 1346.812053]  f4c50000 c174d1ac f4a435c0 00000001 00000000 00004000 d87f66b8 00000001
[ 1346.812053]  00000286 f4a2a4c8 00000000 36598000 00000086 d87f6648 00000002 00008f48
[ 1346.812053]  00000000 00000001 d87f3bc8 00000000 db2f1438 d6413e9c c11a7a2b d6413ebc
[ 1346.812053] Call Trace:
[ 1346.812053]  [<c11a7a2b>] ? btrfs_tree_unlock_rw+0x10/0x2e
[ 1346.812053]  [<c11ac90d>] ? walk_down_proc+0x110/0x1cb
[ 1346.812053]  [<c11af020>] walk_down_tree+0x72/0x93
[ 1346.812053]  [<c11b190a>] btrfs_drop_snapshot+0x278/0x591
[ 1346.812053]  [<c11bfbf2>] btrfs_clean_one_deleted_snapshot+0x79/0x87
[ 1346.812053]  [<c11b9985>] cleaner_kthread+0x74/0xdd
[ 1346.812053]  [<c11b9911>] ? btrfs_need_cleaner_sleep.isra.20+0x2a/0x2a
[ 1346.812053]  [<c104bccd>] kthread+0x88/0x8d
[ 1346.812053]  [<c105013e>] ? mmdrop+0xe/0x1c
[ 1346.812053]  [<c1050000>] ? check_same_owner+0x2c/0x43
[ 1346.812053]  [<c15498c1>] ret_from_kernel_thread+0x21/0x30
[ 1346.812053]  [<c104bc45>] ? __kthread_parkme+0x50/0x50
[ 1346.812053] Code: 45 cc e8 10 ff 03 00 8b 55 c8 89 d0 e9 2f 05 00 00 8b 4d a8 8b 41 04 0b 01 75 12 68 ac d1 74 c1 ff b7 dc 01 00 00 e8 02 f3 fe ff <0f> 0b 8b 45 0c c7 00 00 00 00 00 83 bb 94 00 00 00 01 0f 85 a5
[ 1346.812053] EIP: [<c11aea91>] do_walk_down+0x142/0x65f SS:ESP 0068:d6413e40
[ 1346.913822] ---[ end trace 482e6619a303768a ]---
[ 1346.916847] Kernel panic - not syncing: Fatal exception
[ 1346.919763] Kernel Offset: disabled
[ 1346.920844] drm_kms_helper: panic occurred, switching back to text console
[ 1346.920844] Rebooting in 20 seconds..
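
To check that the two traces really follow the same path, here is a minimal
Python sketch that extracts the function names from call-trace lines and
skips the entries marked with "?", which the kernel prints for stack entries
it considers unreliable. The function name is hypothetical and the sample
lines are copied from the two logs above.

```python
import re

# One call-trace entry: "[<address>] [? ]symbol+0xOFF/0xSIZE"
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)

def reliable_frames(log_text):
    """Return the function names of all frames not marked with '?'."""
    frames = []
    for line in log_text.splitlines():
        m = FRAME_RE.search(line)
        if m and m.group(1) is None:
            frames.append(m.group(2))
    return frames

trace_64 = """[  249.869508]  [<ffffffff8123cb1e>] walk_down_tree+0x87/0xb0
[  249.869508]  [<ffffffff8123f617>] btrfs_drop_snapshot+0x2e7/0x696
[  249.869508]  [<ffffffff81248340>] ? atomic_add_unless.constprop.56+0x24/0x24"""

trace_32 = """[ 1346.812053]  [<c11a7a2b>] ? btrfs_tree_unlock_rw+0x10/0x2e
[ 1346.812053]  [<c11af020>] walk_down_tree+0x72/0x93
[ 1346.812053]  [<c11b190a>] btrfs_drop_snapshot+0x278/0x591"""

print(reliable_frames(trace_64))
print(reliable_frames(trace_32))
print(reliable_frames(trace_64) == reliable_frames(trace_32))
```

Offsets and addresses differ between the two builds, so only the function
names are compared.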




* Re: kernel BUG at fs/btrfs/extent-tree.c:8113! (4.1.3 kernel)
  2015-08-03  3:51         ` kernel BUG at fs/btrfs/extent-tree.c:8113! (4.1.3 kernel) Marc MERLIN
@ 2015-08-11  5:07           ` Marc MERLIN
  2015-08-11 15:40             ` Josef Bacik
  0 siblings, 1 reply; 25+ messages in thread
From: Marc MERLIN @ 2015-08-11  5:07 UTC
  To: dsterba, clm, linux-btrfs, jbacik

On Sun, Aug 02, 2015 at 08:51:30PM -0700, Marc MERLIN wrote:
> On Fri, Jul 24, 2015 at 09:24:46AM -0700, Marc MERLIN wrote:
> > > > > Screenshot: http://marc.merlins.org/tmp/btrfs_crash.jpg
> > > 
> > > So it's a 32bit system, 3.19.8, crashing during snapshot deletion and
> > > backref walking. EIP is in do_walk_down+0x142. I've tried to match it to
> > > the sources on a local 32bit build, but it does not point to the
> > > expected crash site:
> > 
> > Thanks for looking.
> > Unfortunately it's a mythtv where if I put a 64bit kernel, other things
> > go wrong with the 32bit userland/64bit kernel split.
> > But I'll put a newer 64bit kernel on it to see what happens and report
> > back.
> 
> I got home, built the latest kernel and got netconsole working.
> 4.1.3/64bit and 32bit crash the same way. 
 
So, I haven't been able to use this filesystem for several weeks.
Is anyone interested in fixing the kernel bug before I wipe it?
(as in, even if the FS is corrupted, it should not crash the kernel)

Thanks,
Marc

> Here's the 64bit crash:
> [  209.647162] BTRFS: device label bigbackup devid 1 transid 39536 /dev/mapper/crypt_sdd1
> [  209.647449] BTRFS: device label bigbackup devid 5 transid 39536 /dev/mapper/crypt_sdh1
> [  209.648069] BTRFS: device label bigbackup devid 4 transid 39536 /dev/mapper/crypt_sdg1
> [  209.648469] BTRFS: device label bigbackup devid 2 transid 39536 /dev/mapper/crypt_sde1
> [  209.648871] BTRFS: device label bigbackup devid 3 transid 39536 /dev/mapper/crypt_sdf1
> [  209.675030] BTRFS info (device dm-0): disk space caching is enabled  
> [  249.865515] ------------[ cut here ]------------  
> [  249.865530] WARNING: CPU: 1 PID: 3556 at fs/btrfs/extent-tree.c:863 btrfs_lookup_extent_info+0x292/0x2e0()
> [  249.865534] Modules linked in: xts gf128mul configs rc_hauppauge ir_kbd_i2c cpufreq_userspace cpufreq_powersave cpufreq_conservative cpufreq_stats autofs4 joydev hid_generic usbhid tuner_simple tuner_types tda9887 tda8290 hid tuner msp3400 firewire_sbp2 snd_hda_codec_hdmi rc_imon_mce saa7127 snd_hda_codec_realtek snd_hda_codec_generic hwmon_vid dm_crypt snd_hda_intel dm_mod snd_hda_controller imon saa7115 snd_hda_codec snd_hda_core coretemp snd_pcm_oss snd_mixer_oss ivtv bttv cx2341x tea575x v4l2_common snd_pcm videodev snd_hwdep snd_seq_midi videobuf_dma_sg videobuf_core rc_core snd_rawmidi snd_seq_midi_event snd_seq gpio_ich kvm_intel ehci_pci sr_mod media firewire_ohci cdrom floppy snd_timer ehci_hcd uhci_hcd psmouse snd_seq_device tveeprom acpi_cpufreq lpc_ich usbcore sg lp firewire_core asus_atk0110 evdev usb_common kvm atl1 serio_raw mii crc_itu_t parport snd soundcore processor microcode 8250_fintek
> [  249.865672] CPU: 1 PID: 3556 Comm: btrfs-cleaner Not tainted 4.1.3-amd64-i915-volpreempt-20150421 #1
> [  249.865674] Hardware name: System manufacturer P5E-VM HDMI/P5E-VM HDMI, BIOS 0604    07/16/2008
> [  249.865676]  0000000000000009 ffff880041e03b68 ffffffff8169d72d 0000000000004fb9
> [  249.865683]  0000000000000000 ffff880041e03ba8 ffffffff8105861f ffffffff00000094
> [  249.865691]  ffffffff81239ac5 ffff8800772920b0 0000000000000000 0000000000000000
> [  249.865700] Call Trace:
> [  249.865706]  [<ffffffff8169d72d>] dump_stack+0x45/0x57
> [  249.865711]  [<ffffffff8105861f>] warn_slowpath_common+0xa1/0xbb
> [  249.865714]  [<ffffffff81239ac5>] ? btrfs_lookup_extent_info+0x292/0x2e0
> [  249.865717]  [<ffffffff810586dc>] warn_slowpath_null+0x1a/0x1c
> [  249.865720]  [<ffffffff81239ac5>] btrfs_lookup_extent_info+0x292/0x2e0
> [  249.865724]  [<ffffffff816a3752>] ? _raw_write_lock+0xe/0x10
> [  249.865727]  [<ffffffff8123c4af>] do_walk_down+0x14d/0x735
> [  249.865731]  [<ffffffff8123cb1e>] walk_down_tree+0x87/0xb0
> [  249.865734]  [<ffffffff8123f617>] btrfs_drop_snapshot+0x2e7/0x696
> [  249.865739]  [<ffffffff8124ee09>] btrfs_clean_one_deleted_snapshot+0xce/0xdb
> [  249.865742]  [<ffffffff81248452>] cleaner_kthread+0x112/0x146
> [  249.865745]  [<ffffffff81248340>] ? atomic_add_unless.constprop.56+0x24/0x24
> [  249.865748]  [<ffffffff81248340>] ? atomic_add_unless.constprop.56+0x24/0x24
> [  249.865751]  [<ffffffff810706f6>] kthread+0xae/0xb6
> [  249.865755]  [<ffffffff81070648>] ? __kthread_parkme+0x61/0x61
> [  249.865758]  [<ffffffff816a3e62>] ret_from_fork+0x42/0x70
> [  249.865761]  [<ffffffff81070648>] ? __kthread_parkme+0x61/0x61
> [  249.865764] ---[ end trace 826326bc6da53e4f ]---
> [  249.865767] BTRFS error (device dm-0): Missing references.
> [  249.865781] ------------[ cut here ]------------
> [  249.867106] kernel BUG at fs/btrfs/extent-tree.c:8113!
> [  249.868435] invalid opcode: 0000 [#1] SMP
> [  249.869508] Modules linked in: xts gf128mul configs rc_hauppauge ir_kbd_i2c cpufreq_userspace cpufreq_powersave cpufreq_conservative cpufreq_stats autofs4 joydev hid_generic usbhid tuner_simple tuner_types tda9887 tda8290 hid tuner msp3400 firewire_sbp2 snd_hda_codec_hdmi rc_imon_mce saa7127 snd_hda_codec_realtek snd_hda_codec_generic hwmon_vid dm_crypt snd_hda_intel dm_mod snd_hda_controller imon saa7115 snd_hda_codec snd_hda_core coretemp snd_pcm_oss snd_mixer_oss ivtv bttv cx2341x tea575x v4l2_common snd_pcm videodev snd_hwdep snd_seq_midi videobuf_dma_sg videobuf_core rc_core snd_rawmidi snd_seq_midi_event snd_seq gpio_ich kvm_intel ehci_pci sr_mod media firewire_ohci cdrom floppy snd_timer ehci_hcd uhci_hcd psmouse snd_seq_device tveeprom acpi_cpufreq lpc_ich usbcore sg lp firewire_core asus_atk0110 evdev usb_common kvm atl1 serio_raw mii crc_itu_t parport snd soundcore processor microcode 8250_fintek
> [  249.869508] CPU: 1 PID: 3556 Comm: btrfs-cleaner Tainted: G        W       4.1.3-amd64-i915-volpreempt-20150421 #1
> [  249.869508] Hardware name: System manufacturer P5E-VM HDMI/P5E-VM HDMI, BIOS 0604    07/16/2008
> [  249.869508] task: ffff880041dfc110 ti: ffff880041e00000 task.ti: ffff880041e00000
> [  249.869508] RIP: 0010:[<ffffffff8123c4e9>]  [<ffffffff8123c4e9>] do_walk_down+0x187/0x735
> [  249.869508] RSP: 0018:ffff880041e03c68  EFLAGS: 00010296
> [  249.869508] RAX: 000000000000002e RBX: ffff880046655010 RCX: 0000000000000007
> [  249.869508] RDX: 0000000000005063 RSI: 0000000000000246 RDI: ffff88007f48e5d0
> [  249.869508] RBP: ffff880041e03d38 R08: 000000000000004f R09: 0000000000000001
> [  249.869508] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800778a1800
> [  249.869508] R13: ffff8800477f4100 R14: ffff880077292140 R15: ffff880077292150
> [  249.869508] FS:  0000000000000000(0000) GS:ffff88007f480000(0000) knlGS:0000000000000000
> [  249.869508] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [  249.869508] CR2: 00000000084764f8 CR3: 0000000071dda000 CR4: 00000000000006e0
> [  249.869508] Stack:
> [  249.869508]  ffff8800477f4148 0000000000000000 ffff880000000000 0000000000200282
> [  249.869508]  ffff88007b5b1880 ffff8800477f4108 0000000000000008 ffff880077292148
> [  249.869508]  ffff880000000001 0000000000000001 ffff880041e03d64 0000008636598000
> [  249.869508] Call Trace:
> [  249.869508]  [<ffffffff8123cb1e>] walk_down_tree+0x87/0xb0
> [  249.869508]  [<ffffffff8123f617>] btrfs_drop_snapshot+0x2e7/0x696
> [  249.869508]  [<ffffffff8124ee09>] btrfs_clean_one_deleted_snapshot+0xce/0xdb
> [  249.869508]  [<ffffffff81248452>] cleaner_kthread+0x112/0x146
> [  249.869508]  [<ffffffff81248340>] ? atomic_add_unless.constprop.56+0x24/0x24
> [  249.869508]  [<ffffffff81248340>] ? atomic_add_unless.constprop.56+0x24/0x24
> [  249.869508]  [<ffffffff810706f6>] kthread+0xae/0xb6
> [  249.869508]  [<ffffffff81070648>] ? __kthread_parkme+0x61/0x61
> [  249.869508]  [<ffffffff816a3e62>] ret_from_fork+0x42/0x70
> [  249.869508]  [<ffffffff81070648>] ? __kthread_parkme+0x61/0x61
> [  249.869508] Code: 45 ac e8 39 19 04 00 8b 45 ac e9 b8 05 00 00 49 83 3a 00 75 18 49 8b bc 24 f0 01 00 00 48 c7 c6 99 5e ad 81 31 c0 e8 92 e3 fe ff <0f> 0b 48 8b 45 80 c7 00 00 00 00 00 41 83 bd 94 00 00 00 01 0f
> [  249.869508] RIP  [<ffffffff8123c4e9>] do_walk_down+0x187/0x735
> [  249.869508]  RSP <ffff880041e03c68>
> [  249.936818] ---[ end trace 826326bc6da53e50 ]---
> [  249.936823] Kernel panic - not syncing: Fatal exception
> [  249.938550] Kernel Offset: disabled
> 
> And here is 4.1.3/32bit:
> [ 1346.737490] ------------[ cut here ]------------
> [ 1346.739026] WARNING: CPU: 1 PID: 12919 at fs/btrfs/extent-tree.c:863 btrfs_lookup_extent_info+0x2b8/0x2f7()
> [ 1346.740592] Modules linked in: xts gf128mul rc_hauppauge ir_kbd_i2c cpufreq_userspace cpufreq_powersave cpufreq_conservative cpufreq_stats autofs4 tuner_simple tuner_types tda9887 tda8290 tuner msp3400 snd_hda_codec_hdmi saa7127 snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_controller firewire_sbp2 saa7115 snd_hda_codec ivtv snd_hda_core snd_hwdep hwmon_vid snd_pcm_oss snd_mixer_oss snd_pcm dm_crypt snd_seq_midi snd_seq_midi_event dm_mod snd_rawmidi snd_seq bttv tea575x videobuf_dma_sg coretemp videobuf_core snd_seq_device snd_timer tveeprom cx2341x v4l2_common videodev kvm_intel gpio_ich snd soundcore ehci_pci ehci_hcd lpc_ich media kvm psmouse rc_imon_mce acpi_cpufreq imon rc_core joydev sr_mod asus_atk0110 lp cdrom microcode processor hid_generic serio_raw evdev parport sg usbhid hid raid456 async_raid6_recov async_pq async_xor async_memcpy async_tx multipath firewire_ohci firewire_core crc_itu_t atl1 uhci_hcd mii floppy usbcore usb_common
> [ 1346.750673] CPU: 1 PID: 12919 Comm: btrfs-cleaner Not tainted 4.1.3-ia32-i915-volpreempt-20150421 #2
> [ 1346.752780] Hardware name: System manufacturer P5E-VM HDMI/P5E-VM HDMI, BIOS 0604    07/16/2008
> [ 1346.754912]  00000000 00000000 d6413db0 c1544620 00000000 d6413dc8 c10388a5 c11abf7d
> [ 1346.757115]  00000000 d87f65d8 00000000 d6413dd8 c1038920 00000009 00000000 d6413e24
> [ 1346.759320]  c11abf7d 00000000 00000086 36598000 d87f7dec d87f7d38 00000000 d87f6648
> [ 1346.761543] Call Trace:
> [ 1346.763709]  [<c1544620>] dump_stack+0x49/0x73
> [ 1346.765895]  [<c10388a5>] warn_slowpath_common+0x7e/0x95
> [ 1346.768076]  [<c11abf7d>] ? btrfs_lookup_extent_info+0x2b8/0x2f7
> [ 1346.770280]  [<c1038920>] warn_slowpath_null+0xf/0x13
> [ 1346.772481]  [<c11abf7d>] btrfs_lookup_extent_info+0x2b8/0x2f7
> [ 1346.774682]  [<c11aea5b>] do_walk_down+0x10c/0x65f
> [ 1346.776890]  [<c11a7a2b>] ? btrfs_tree_unlock_rw+0x10/0x2e
> [ 1346.779088]  [<c11ac90d>] ? walk_down_proc+0x110/0x1cb
> [ 1346.781265]  [<c11af020>] walk_down_tree+0x72/0x93
> [ 1346.783434]  [<c11b190a>] btrfs_drop_snapshot+0x278/0x591
> [ 1346.785610]  [<c11bfbf2>] btrfs_clean_one_deleted_snapshot+0x79/0x87
> [ 1346.787801]  [<c11b9985>] cleaner_kthread+0x74/0xdd
> [ 1346.790007]  [<c11b9911>] ? btrfs_need_cleaner_sleep.isra.20+0x2a/0x2a
> [ 1346.792248]  [<c104bccd>] kthread+0x88/0x8d
> [ 1346.794503]  [<c105013e>] ? mmdrop+0xe/0x1c
> [ 1346.796745]  [<c1050000>] ? check_same_owner+0x2c/0x43
> [ 1346.798981]  [<c15498c1>] ret_from_kernel_thread+0x21/0x30
> [ 1346.801236]  [<c104bc45>] ? __kthread_parkme+0x50/0x50
> [ 1346.803480] ---[ end trace 482e6619a3037689 ]---
> [ 1346.805757] BTRFS error (device dm-0): Missing references.
> [ 1346.808066] ------------[ cut here ]------------
> [ 1346.810375] kernel BUG at fs/btrfs/extent-tree.c:8113!
> [ 1346.812053] invalid opcode: 0000 [#1] PREEMPT SMP 
> [ 1346.812053] Modules linked in: xts gf128mul rc_hauppauge ir_kbd_i2c cpufreq_userspace cpufreq_powersave cpufreq_conservative cpufreq_stats autofs4 tuner_simple tuner_types tda9887 tda8290 tuner msp3400 snd_hda_codec_hdmi saa7127 snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_controller firewire_sbp2 saa7115 snd_hda_codec ivtv snd_hda_core snd_hwdep hwmon_vid snd_pcm_oss snd_mixer_oss snd_pcm dm_crypt snd_seq_midi snd_seq_midi_event dm_mod snd_rawmidi snd_seq bttv tea575x videobuf_dma_sg coretemp videobuf_core snd_seq_device snd_timer tveeprom cx2341x v4l2_common videodev kvm_intel gpio_ich snd soundcore ehci_pci ehci_hcd lpc_ich media kvm psmouse rc_imon_mce acpi_cpufreq imon rc_core joydev sr_mod asus_atk0110 lp cdrom microcode processor hid_generic serio_raw evdev parport sg usbhid hid raid456 async_raid6_recov async_pq async_xor async_memcpy async_tx multipath firewire_ohci firewire_core crc_itu_t atl1 uhci_hcd mii floppy usbcore usb_common
> [ 1346.812053] CPU: 1 PID: 12919 Comm: btrfs-cleaner Tainted: G        W       4.1.3-ia32-i915-volpreempt-20150421 #2
> [ 1346.812053] Hardware name: System manufacturer P5E-VM HDMI/P5E-VM HDMI, BIOS 0604    07/16/2008
> [ 1346.812053] task: d70faea0 ti: d6412000 task.ti: d6412000
> [ 1346.812053] EIP: 0060:[<c11aea91>] EFLAGS: 00010282 CPU: 1
> [ 1346.812053] EIP is at do_walk_down+0x142/0x65f
> [ 1346.812053] EAX: 0000002e EBX: f4a2a4c0 ECX: f598b310 EDX: 80000000
> [ 1346.812053] ESI: d87f66c0 EDI: c21a0c00 EBP: d6413ebc ESP: d6413e40
> [ 1346.812053]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
> [ 1346.812053] CR0: 8005003b CR2: 09947000 CR3: 34d01000 CR4: 000006d0
> [ 1346.812053] Stack:
> [ 1346.812053]  f4c50000 c174d1ac f4a435c0 00000001 00000000 00004000 d87f66b8 00000001
> [ 1346.812053]  00000286 f4a2a4c8 00000000 36598000 00000086 d87f6648 00000002 00008f48
> [ 1346.812053]  00000000 00000001 d87f3bc8 00000000 db2f1438 d6413e9c c11a7a2b d6413ebc
> [ 1346.812053] Call Trace:
> [ 1346.812053]  [<c11a7a2b>] ? btrfs_tree_unlock_rw+0x10/0x2e
> [ 1346.812053]  [<c11ac90d>] ? walk_down_proc+0x110/0x1cb
> [ 1346.812053]  [<c11af020>] walk_down_tree+0x72/0x93
> [ 1346.812053]  [<c11b190a>] btrfs_drop_snapshot+0x278/0x591
> [ 1346.812053]  [<c11bfbf2>] btrfs_clean_one_deleted_snapshot+0x79/0x87
> [ 1346.812053]  [<c11b9985>] cleaner_kthread+0x74/0xdd
> [ 1346.812053]  [<c11b9911>] ? btrfs_need_cleaner_sleep.isra.20+0x2a/0x2a
> [ 1346.812053]  [<c104bccd>] kthread+0x88/0x8d
> [ 1346.812053]  [<c105013e>] ? mmdrop+0xe/0x1c
> [ 1346.812053]  [<c1050000>] ? check_same_owner+0x2c/0x43
> [ 1346.812053]  [<c15498c1>] ret_from_kernel_thread+0x21/0x30
> [ 1346.812053]  [<c104bc45>] ? __kthread_parkme+0x50/0x50
> [ 1346.812053] Code: 45 cc e8 10 ff 03 00 8b 55 c8 89 d0 e9 2f 05 00 00 8b 4d a8 8b 41 04 0b 01 75 12 68 ac d1 74 c1 ff b7 dc 01 00 00 e8 02 f3 fe ff <0f> 0b 8b 45 0c c7 00 00 00 00 00 83 bb 94 00 00 00 01 0f 85 a5
> [ 1346.812053] EIP: [<c11aea91>] do_walk_down+0x142/0x65f SS:ESP 0068:d6413e40
> [ 1346.913822] ---[ end trace 482e6619a303768a ]---
> [ 1346.916847] Kernel panic - not syncing: Fatal exception
> [ 1346.919763] Kernel Offset: disabled
> [ 1346.920844] drm_kms_helper: panic occurred, switching back to text console
> [ 1346.920844] Rebooting in 20 seconds..

-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/                         | PGP 1024R/763BE901

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: kernel BUG at fs/btrfs/extent-tree.c:8113! (4.1.3 kernel)
  2015-08-11  5:07           ` Marc MERLIN
@ 2015-08-11 15:40             ` Josef Bacik
  2015-08-12 14:47               ` Marc MERLIN
  0 siblings, 1 reply; 25+ messages in thread
From: Josef Bacik @ 2015-08-11 15:40 UTC (permalink / raw)
  To: Marc MERLIN, dsterba, clm, linux-btrfs

On 08/11/2015 01:07 AM, Marc MERLIN wrote:
> On Sun, Aug 02, 2015 at 08:51:30PM -0700, Marc MERLIN wrote:
>> On Fri, Jul 24, 2015 at 09:24:46AM -0700, Marc MERLIN wrote:
>>>>>> Screenshot: http://marc.merlins.org/tmp/btrfs_crash.jpg
>>>>
>>>> So it's 32bit system, 3.19.8, crashing during snapshot deletion and
>>>> backref walking. EIP is in do_walk_down+0x142. I've tried to match it to
>>>> the sources on a local 32bit build, but it does not point to the
>>>> expected crash site:
>>>
>>> Thanks for looking.
>>> Unfortunately it's a mythtv where if I put a 64bit kernel, other things
>>> go wrong with the 32bit userland/64bit kernel split.
>>> But I'll put a newer 64bit kernel on it to see what happens and report
>>> back.
>>
>> I got home, built the last kernel and got netconsole working.
>> 4.1.3/64bit and 32bit crash the same way.
>
> So, it's been several weeks that I can't use this filesystem.
> Is anyone interested in fixing the kernel bug before I wipe it?
> (as in, even if the FS is corrupted, it should not crash the kernel)
>


 From a48cf7a9ae44a17d927df5542c8b0be287aee9ed Mon Sep 17 00:00:00 2001
From: Josef Bacik <jbacik@fb.com>
Date: Tue, 11 Aug 2015 11:39:37 -0400
Subject: [PATCH] Btrfs: kill BUG_ON() in btrfs_lookup_extent_info()

Replace it with an ASSERT(0) for the developers and an error for not the
developers.

Signed-off-by: Josef Bacik <jbacik@fb.com>
---
  fs/btrfs/extent-tree.c | 7 +++++--
  1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 5411f0a..f7fb120 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -818,7 +818,11 @@ search_again:
  			BUG();
  #endif
  		}
-		BUG_ON(num_refs == 0);
+		if (num_refs == 0) {
+			ASSERT(0);
+			ret = -EIO;
+			goto out_free;
+		}
  	} else {
  		num_refs = 0;
  		extent_flags = 0;
@@ -859,7 +863,6 @@ search_again:
  	}
  	spin_unlock(&delayed_refs->lock);
  out:
-	WARN_ON(num_refs == 0);
  	if (refs)
  		*refs = num_refs;
  	if (flags)
-- 
2.1.0
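
The shape of the change above, reduced to a user-space sketch: a zero
reference count is reported to the caller as -EIO instead of halting the
machine, while a debug build can still trap on it. Everything here is
hypothetical (struct extent_rec, lookup_refs(), the DEBUG_BUILD macro);
in the kernel, btrfs ASSERT() is only compiled in with CONFIG_BTRFS_ASSERT.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for an extent item read from disk. */
struct extent_rec {
	unsigned long long refs;
};

/*
 * Illustrative only: treat a zero reference count as corruption and
 * return -EIO instead of aborting. With DEBUG_BUILD defined, the
 * assert() plays the role of the patch's ASSERT(0).
 */
static int lookup_refs(const struct extent_rec *rec, unsigned long long *refs)
{
	if (rec->refs == 0) {
#ifdef DEBUG_BUILD
		assert(0);
#endif
		return -EIO;
	}
	if (refs)
		*refs = rec->refs;
	return 0;
}
```

A caller then checks the return value and stops the operation, which is
what lets the filesystem go read-only rather than panic.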


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: kernel BUG at fs/btrfs/extent-tree.c:8113! (4.1.3 kernel)
  2015-08-11 15:40             ` Josef Bacik
@ 2015-08-12 14:47               ` Marc MERLIN
  2015-08-12 15:15                 ` Josef Bacik
  0 siblings, 1 reply; 25+ messages in thread
From: Marc MERLIN @ 2015-08-12 14:47 UTC (permalink / raw)
  To: Josef Bacik; +Cc: dsterba, clm, linux-btrfs

On Tue, Aug 11, 2015 at 11:40:45AM -0400, Josef Bacik wrote:
> From a48cf7a9ae44a17d927df5542c8b0be287aee9ed Mon Sep 17 00:00:00 2001
> From: Josef Bacik <jbacik@fb.com>
> Date: Tue, 11 Aug 2015 11:39:37 -0400
> Subject: [PATCH] Btrfs: kill BUG_ON() in btrfs_lookup_extent_info()
> 
> Replace it with an ASSERT(0) for the developers and an error for not the
> developers.
 
Thanks. We knocked one down and now another BUG has been triggered :)

	if (unlikely(wc->refs[level - 1] == 0)) {
		btrfs_err(root->fs_info, "Missing references.");
		BUG();
	}

> Signed-off-by: Josef Bacik <jbacik@fb.com>
> ---
>  fs/btrfs/extent-tree.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index 5411f0a..f7fb120 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -818,7 +818,11 @@ search_again:
>  			BUG();
>  #endif
>  		}
> -		BUG_ON(num_refs == 0);
> +		if (num_refs == 0) {
> +			ASSERT(0);
> +			ret = -EIO;
> +			goto out_free;
> +		}
>  	} else {
>  		num_refs = 0;
>  		extent_flags = 0;
> @@ -859,7 +863,6 @@ search_again:
>  	}
>  	spin_unlock(&delayed_refs->lock);
>  out:
> -	WARN_ON(num_refs == 0);
>  	if (refs)
>  		*refs = num_refs;
>  	if (flags)
> -- 
> 

[  408.641308] BTRFS info (device dm-0): disk space caching is enabled
[  448.528218] BTRFS error (device dm-0): Missing references.
[  448.528247] ------------[ cut here ]------------
[  448.529994] kernel BUG at fs/btrfs/extent-tree.c:8116!
[  448.531747] invalid opcode: 0000 [#1] PREEMPT SMP 
[  448.532002] Modules linked in: xts gf128mul configs rc_hauppauge ir_kbd_i2c cpufreq_userspace cpufreq_powersave cpufreq_conservative cpufreq_stats autofs4 tuner_simple tuner_types tda9887 tda8290 snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic tuner firewire_sbp2 snd_hda_intel snd_hda_controller msp3400 hwmon_vid joydev snd_hda_codec dm_crypt snd_hda_core snd_hwdep snd_pcm_oss snd_mixer_oss saa7127 snd_pcm dm_mod snd_seq_midi snd_seq_midi_event hid_generic snd_rawmidi saa7115 bttv ivtv tea575x snd_seq tveeprom snd_seq_device videobuf_dma_sg cx2341x videobuf_core v4l2_common snd_timer snd videodev soundcore coretemp gpio_ich rc_imon_mce usbhid imon rc_core kvm_intel hid media lpc_ich ehci_pci ehci_hcd psmouse evdev asus_atk0110 acpi_cpufreq kvm sr_mod serio_raw cdrom microcode processor sg lp parport raid456 async_raid6_recov async_pq async_xor async_memcpy async_tx multipath floppy firewire_ohci firewire_core crc_itu_t uhci_hcd atl1 mii usbcore usb_common
[  448.532002] CPU: 1 PID: 3756 Comm: btrfs-cleaner Not tainted 4.1.3-ia32-i915-volpreempt-20150421jb1 #3
[  448.532002] Hardware name: System manufacturer P5E-VM HDMI/P5E-VM HDMI, BIOS 0604    07/16/2008
[  448.532002] task: f499f1e0 ti: e1d44000 task.ti: e1d44000
[  448.532002] EIP: 0060:[<c11aea88>] EFLAGS: 00010282 CPU: 1
[  448.532002] EIP is at do_walk_down+0x142/0x65f
[  448.532002] EAX: 0000002e EBX: f2d62a00 ECX: f5987310 EDX: 80000000
[  448.532002] ESI: f44c5030 EDI: e3c5c000 EBP: e1d45ebc ESP: e1d45e40
[  448.532002]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[  448.532002] CR0: 8005003b CR2: b0057e4c CR3: 31216000 CR4: 000006d0
[  448.532002] Stack:
[  448.532002]  f1bef000 c174d1ac f4a2e7c0 00000001 00000000 00004000 f44c5028 00000001
[  448.532002]  00000286 f2d62a08 00000000 36598000 00000086 f44c5098 00000002 00008f48
[  448.532002]  00000000 00000001 e7f80178 00000000 e7f80228 e1d45e9c c11a7a2b e1d45ebc
[  448.532002] Call Trace:
[  448.532002]  [<c11a7a2b>] ? btrfs_tree_unlock_rw+0x10/0x2e
[  448.532002]  [<c11ac904>] ? walk_down_proc+0x110/0x1cb
[  448.532002]  [<c11af017>] walk_down_tree+0x72/0x93
[  448.532002]  [<c11b1901>] btrfs_drop_snapshot+0x278/0x591
[  448.532002]  [<c11bfbe9>] btrfs_clean_one_deleted_snapshot+0x79/0x87
[  448.532002]  [<c11b997c>] cleaner_kthread+0x74/0xdd
[  448.532002]  [<c11b9908>] ? btrfs_need_cleaner_sleep.isra.20+0x2a/0x2a
[  448.532002]  [<c104bccd>] kthread+0x88/0x8d
[  448.532002]  [<c105013e>] ? mmdrop+0xe/0x1c
[  448.532002]  [<c1050000>] ? check_same_owner+0x2c/0x43
[  448.532002]  [<c1549841>] ret_from_kernel_thread+0x21/0x30
[  448.532002]  [<c104bc45>] ? __kthread_parkme+0x50/0x50
[  448.532002] Code: 45 cc e8 10 ff 03 00 8b 55 c8 89 d0 e9 2f 05 00 00 8b 4d a8 8b 41 04 0b 01 75 12 68 ac d1 74 c1 ff b7 dc 01 00 00 e8 0b f3 fe ff <0f> 0b 8b 45 0c c7 00 00 00 00 00 83 bb 94 00 00 00 01 0f 85 a5
[  448.532002] EIP: [<c11aea88>] do_walk_down+0x142/0x65f SS:ESP 0068:e1d45e40
[  448.640313] ---[ end trace 9ddb31ca62f7248d ]---
[  448.640319] Kernel panic - not syncing: Fatal exception
[  448.642259] Kernel Offset: disabled

-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/  

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: kernel BUG at fs/btrfs/extent-tree.c:8113! (4.1.3 kernel)
  2015-08-12 14:47               ` Marc MERLIN
@ 2015-08-12 15:15                 ` Josef Bacik
  2015-08-12 16:09                   ` Marc MERLIN
  0 siblings, 1 reply; 25+ messages in thread
From: Josef Bacik @ 2015-08-12 15:15 UTC (permalink / raw)
  To: Marc MERLIN; +Cc: dsterba, clm, linux-btrfs

On 08/12/2015 10:47 AM, Marc MERLIN wrote:
> On Tue, Aug 11, 2015 at 11:40:45AM -0400, Josef Bacik wrote:
>>  From a48cf7a9ae44a17d927df5542c8b0be287aee9ed Mon Sep 17 00:00:00 2001
>> From: Josef Bacik <jbacik@fb.com>
>> Date: Tue, 11 Aug 2015 11:39:37 -0400
>> Subject: [PATCH] Btrfs: kill BUG_ON() in btrfs_lookup_extent_info()
>>
>> Replace it with an ASSERT(0) for the developers and an error for not the
>> developers.
>
> Thanks. We knocked one down and now another BUG has been triggered :)
>
> 	if (unlikely(wc->refs[level - 1] == 0)) {
> 		btrfs_err(root->fs_info, "Missing references.");
> 		BUG();
> 	}
>

This is why you got your own branch, it's never just one.  Here's the 
next bit


 From 07214b5294d2772682aba893de15ef8020994598 Mon Sep 17 00:00:00 2001
From: Josef Bacik <jbacik@fb.com>
Date: Wed, 12 Aug 2015 11:06:42 -0400
Subject: [PATCH] Btrfs: don't BUG() during drop snapshot

Really there's lots of things that can go wrong here, kill all the BUG_ON()'s
and replace the logic ones with ASSERT()'s and return EIO instead.  Also fix the
leak of next in one of the error conditions while we're at it.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
---
  fs/btrfs/extent-tree.c | 27 +++++++++++++++++++++++----
  1 file changed, 23 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index f7fb120..6671faf 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -8196,12 +8196,15 @@ static noinline int do_walk_down(struct btrfs_trans_handle *trans,
  				       &wc->flags[level - 1]);
  	if (ret < 0) {
  		btrfs_tree_unlock(next);
+		free_extent_buffer(next);
  		return ret;
  	}

  	if (unlikely(wc->refs[level - 1] == 0)) {
  		btrfs_err(root->fs_info, "Missing references.");
-		BUG();
+		btrfs_tree_unlock(next);
+		free_extent_buffer(next);
+		return -EIO;
  	}
  	*lookup_info = 0;

@@ -8253,7 +8256,13 @@ static noinline int do_walk_down(struct btrfs_trans_handle *trans,
  	}

  	level--;
-	BUG_ON(level != btrfs_header_level(next));
+	ASSERT(level == btrfs_header_level(next));
+	if (level != btrfs_header_level(next)) {
+		printk(KERN_ERR "Mismatched level\n");
+		btrfs_tree_unlock(next);
+		free_extent_buffer(next);
+		return -EIO;
+	}
  	path->nodes[level] = next;
  	path->slots[level] = 0;
  	path->locks[level] = BTRFS_WRITE_LOCK_BLOCKING;
@@ -8268,8 +8277,14 @@ skip:
  		if (wc->flags[level] & BTRFS_BLOCK_FLAG_FULL_BACKREF) {
  			parent = path->nodes[level]->start;
  		} else {
-			BUG_ON(root->root_key.objectid !=
+			ASSERT(root->root_key.objectid ==
  			       btrfs_header_owner(path->nodes[level]));
+			if (root->root_key.objectid !=
+			    btrfs_header_owner(path->nodes[level])) {
+				printk(KERN_ERR "Mismatched block owner\n");
+				btrfs_tree_unlock(next);
+				free_extent_buffer(next);
+			}
  			parent = 0;
  		}

@@ -8285,7 +8300,11 @@ skip:
  		}
  		ret = btrfs_free_extent(trans, root, bytenr, blocksize, parent,
  				root->root_key.objectid, level - 1, 0, 0);
-		BUG_ON(ret); /* -ENOMEM */
+		if (ret) {
+			btrfs_tree_unlock(next);
+			free_extent_buffer(next);
+			return ret;
+		}
  	}
  	btrfs_tree_unlock(next);
  	free_extent_buffer(next);
-- 
2.1.0
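
Judging only from the diff context, the "Mismatched block owner" branch
in the patch above unlocks and frees next but does not return, so
execution appears to continue to the btrfs_tree_unlock()/
free_extent_buffer() pair at the end of the function and release next a
second time. A common way to rule that out is a single exit label, as in
this user-space sketch. All names here (struct buf, buf_unlock(),
buf_free(), walk_one()) are hypothetical stand-ins, not btrfs functions.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical buffer that counts how often it is unlocked and freed. */
struct buf {
	int level;
	int owner;
	int unlocks;
	int frees;
};

static void buf_unlock(struct buf *b) { b->unlocks++; }
static void buf_free(struct buf *b) { b->frees++; }

/*
 * Illustrative only: every failed check jumps to one label, so the
 * buffer is unlocked and freed exactly once on success and on every
 * error path.
 */
static int walk_one(struct buf *next, int want_level, int want_owner)
{
	int ret = 0;

	if (next->level != want_level) {
		ret = -EIO;	/* mismatched level */
		goto out_unlock;
	}
	if (next->owner != want_owner) {
		ret = -EIO;	/* mismatched block owner */
		goto out_unlock;
	}
	/* ... the real work would go here ... */
out_unlock:
	buf_unlock(next);
	buf_free(next);
	return ret;
}
```

With this layout a new check cannot forget its cleanup or its return,
because it only sets ret and jumps.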


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: kernel BUG at fs/btrfs/extent-tree.c:8113! (4.1.3 kernel)
  2015-08-12 15:15                 ` Josef Bacik
@ 2015-08-12 16:09                   ` Marc MERLIN
  2015-08-12 16:18                     ` Josef Bacik
  0 siblings, 1 reply; 25+ messages in thread
From: Marc MERLIN @ 2015-08-12 16:09 UTC (permalink / raw)
  To: Josef Bacik; +Cc: dsterba, clm, linux-btrfs

On Wed, Aug 12, 2015 at 11:15:39AM -0400, Josef Bacik wrote:
> On 08/12/2015 10:47 AM, Marc MERLIN wrote:
> >On Tue, Aug 11, 2015 at 11:40:45AM -0400, Josef Bacik wrote:
> >> From a48cf7a9ae44a17d927df5542c8b0be287aee9ed Mon Sep 17 00:00:00 2001
> >>From: Josef Bacik <jbacik@fb.com>
> >>Date: Tue, 11 Aug 2015 11:39:37 -0400
> >>Subject: [PATCH] Btrfs: kill BUG_ON() in btrfs_lookup_extent_info()
> >>
> >>Replace it with an ASSERT(0) for the developers and an error for not the
> >>developers.
> >
> >Thanks. We knocked one down and now another BUG has been triggered :)
> >
> >	if (unlikely(wc->refs[level - 1] == 0)) {
> >		btrfs_err(root->fs_info, "Missing references.");
> >		BUG();
> >	}
> >
> 
> This is why you got your own branch, it's never just one.  Here's
> the next bit

Yes, I figured there might be a few more :)
Thanks for this patch, it definitely made things better:

[  165.656408] BTRFS info (device dm-0): disk space caching is enabled
[  205.528199] BTRFS error (device dm-0): Missing references.
[  205.528216] BTRFS: error (device dm-0) in btrfs_drop_snapshot:8652: errno=-5 IO failure
[  205.528225] BTRFS info (device dm-0): forced readonly

That's perfect, thanks much for that.

Now, back to check --repair, does it make sense to fix it too so that it doesn't crash either?

myth:~#  btrfs check --repair /dev/mapper/crypt_sdd1
enabling repair mode
Checking filesystem on /dev/mapper/crypt_sdd1
UUID: 024ba4d0-dacb-438d-9f1b-eeb34083fe49
checking extents
cmds-check.c:4486: add_data_backref: Assertion `back->bytes != max_size` failed.
btrfs[0x8066a73]
btrfs[0x8066aa4]
btrfs[0x8067991]
btrfs[0x806b4ab]
btrfs[0x806b9a3]
btrfs[0x806c5b2]
btrfs(cmd_check+0x1088)[0x806eddf]
btrfs(main+0x153)[0x80557c6]
/lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0xb75784d3]
btrfs[0x80557ec]

Marc
 
> From 07214b5294d2772682aba893de15ef8020994598 Mon Sep 17 00:00:00 2001
> From: Josef Bacik <jbacik@fb.com>
> Date: Wed, 12 Aug 2015 11:06:42 -0400
> Subject: [PATCH] Btrfs: don't BUG() during drop snapshot
> 
> Really there's lots of things that can go wrong here, kill all the BUG_ON()'s
> and replace the logic ones with ASSERT()'s and return EIO instead.  Also fix the
> leak of next in one of the error conditions while we're at it.  Thanks,
> 
> Signed-off-by: Josef Bacik <jbacik@fb.com>
> ---
>  fs/btrfs/extent-tree.c | 27 +++++++++++++++++++++++----
>  1 file changed, 23 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index f7fb120..6671faf 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -8196,12 +8196,15 @@ static noinline int do_walk_down(struct btrfs_trans_handle *trans,
>  				       &wc->flags[level - 1]);
>  	if (ret < 0) {
>  		btrfs_tree_unlock(next);
> +		free_extent_buffer(next);
>  		return ret;
>  	}
> 
>  	if (unlikely(wc->refs[level - 1] == 0)) {
>  		btrfs_err(root->fs_info, "Missing references.");
> -		BUG();
> +		btrfs_tree_unlock(next);
> +		free_extent_buffer(next);
> +		return -EIO;
>  	}
>  	*lookup_info = 0;
> 
> @@ -8253,7 +8256,13 @@ static noinline int do_walk_down(struct btrfs_trans_handle *trans,
>  	}
> 
>  	level--;
> -	BUG_ON(level != btrfs_header_level(next));
> +	ASSERT(level == btrfs_header_level(next));
> +	if (level != btrfs_header_level(next)) {
> +		printk(KERN_ERR "Mismatched level\n");
> +		btrfs_tree_unlock(next);
> +		free_extent_buffer(next);
> +		return -EIO;
> +	}
>  	path->nodes[level] = next;
>  	path->slots[level] = 0;
>  	path->locks[level] = BTRFS_WRITE_LOCK_BLOCKING;
> @@ -8268,8 +8277,14 @@ skip:
>  		if (wc->flags[level] & BTRFS_BLOCK_FLAG_FULL_BACKREF) {
>  			parent = path->nodes[level]->start;
>  		} else {
> -			BUG_ON(root->root_key.objectid !=
> +			ASSERT(root->root_key.objectid ==
>  			       btrfs_header_owner(path->nodes[level]));
> +			if (root->root_key.objectid !=
> +			    btrfs_header_owner(path->nodes[level])) {
> +				printk(KERN_ERR "Mismatched block owner\n");
> +				btrfs_tree_unlock(next);
> +				free_extent_buffer(next);
> +			}
>  			parent = 0;
>  		}
> 
> @@ -8285,7 +8300,11 @@ skip:
>  		}
>  		ret = btrfs_free_extent(trans, root, bytenr, blocksize, parent,
>  				root->root_key.objectid, level - 1, 0, 0);
> -		BUG_ON(ret); /* -ENOMEM */
> +		if (ret) {
> +			btrfs_tree_unlock(next);
> +			free_extent_buffer(next);
> +			return ret;
> +		}
>  	}
>  	btrfs_tree_unlock(next);
>  	free_extent_buffer(next);
> -- 
> 2.1.0
> 

-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/                         | PGP 1024R/763BE901

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: kernel BUG at fs/btrfs/extent-tree.c:8113! (4.1.3 kernel)
  2015-08-12 16:09                   ` Marc MERLIN
@ 2015-08-12 16:18                     ` Josef Bacik
  2015-08-12 17:19                       ` Marc MERLIN
  0 siblings, 1 reply; 25+ messages in thread
From: Josef Bacik @ 2015-08-12 16:18 UTC (permalink / raw)
  To: Marc MERLIN; +Cc: dsterba, clm, linux-btrfs

On 08/12/2015 12:09 PM, Marc MERLIN wrote:
> On Wed, Aug 12, 2015 at 11:15:39AM -0400, Josef Bacik wrote:
>> On 08/12/2015 10:47 AM, Marc MERLIN wrote:
>>> On Tue, Aug 11, 2015 at 11:40:45AM -0400, Josef Bacik wrote:
>>>>  From a48cf7a9ae44a17d927df5542c8b0be287aee9ed Mon Sep 17 00:00:00 2001
>>>> From: Josef Bacik <jbacik@fb.com>
>>>> Date: Tue, 11 Aug 2015 11:39:37 -0400
>>>> Subject: [PATCH] Btrfs: kill BUG_ON() in btrfs_lookup_extent_info()
>>>>
>>>> Replace it with an ASSERT(0) for the developers and an error for not the
>>>> developers.
>>>
>>> Thanks. We knocked one down and now another BUG has been triggered :)
>>>
>>> 	if (unlikely(wc->refs[level - 1] == 0)) {
>>> 		btrfs_err(root->fs_info, "Missing references.");
>>> 		BUG();
>>> 	}
>>>
>>
>> This is why you got your own branch, it's never just one.  Here's
>> the next bit
>
> Yes, I figured there might be a few more :)
> Thanks for this patch, it definitely made things better:
>
> [  165.656408] BTRFS info (device dm-0): disk space caching is enabled
> [  205.528199] BTRFS error (device dm-0): Missing references.
> [  205.528216] BTRFS: error (device dm-0) in btrfs_drop_snapshot:8652: errno=-5 IO failure
> [  205.528225] BTRFS info (device dm-0): forced readonly
>
> That's perfect, thanks much for that.
>
> Now, back to check --repair, does it make sense to fix it too so that it doesn't crash either?
>
> myth:~#  btrfs check --repair /dev/mapper/crypt_sdd1
> enabling repair mode
> Checking filesystem on /dev/mapper/crypt_sdd1
> UUID: 024ba4d0-dacb-438d-9f1b-eeb34083fe49
> checking extents
> cmds-check.c:4486: add_data_backref: Assertion `back->bytes != max_size` failed.
> btrfs[0x8066a73]
> btrfs[0x8066aa4]
> btrfs[0x8067991]
> btrfs[0x806b4ab]
> btrfs[0x806b9a3]
> btrfs[0x806c5b2]
> btrfs(cmd_check+0x1088)[0x806eddf]
> btrfs(main+0x153)[0x80557c6]
> /lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0xb75784d3]
> btrfs[0x80557ec]
>

Going to need more info to figure this one out


 From d77cd13f94fae6d995f753f3de3728c4ef4f8e75 Mon Sep 17 00:00:00 2001
From: Josef Bacik <jbacik@fb.com>
Date: Wed, 12 Aug 2015 12:18:01 -0400
Subject: [PATCH] some debugging

---
  cmds-check.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/cmds-check.c b/cmds-check.c
index dd2fce3..8f668d7 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -4524,6 +4524,8 @@ static int add_data_backref(struct cache_tree *extent_cache, u64 bytenr,
  	if (found_ref) {
  		BUG_ON(num_refs != 1);
  		if (back->node.found_ref)
+			if (back->bytes != max_size)
+				fprintf(stderr, "wtf, parent %llu\n", (unsigned long long)parent);
  			BUG_ON(back->bytes != max_size);
  		back->node.found_ref = 1;
  		back->found_ref += 1;
-- 
2.1.0
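
One side effect of the debugging hunk above: an unbraced if governs only
the next statement, so the added if/fprintf pair becomes the body of
"if (back->node.found_ref)" and the existing BUG_ON() now runs whether or
not found_ref is set, despite its indentation. A braced form that keeps
the message and the check under the same condition could look like this
user-space sketch. size_mismatch() and its parameters are hypothetical,
and it returns 1 where the original would hit BUG_ON().

```c
#include <assert.h>
#include <stdio.h>

/*
 * Illustrative only: returns 1 where the original code would BUG_ON().
 * The braces keep both the diagnostic and the check under found_ref.
 */
static int size_mismatch(int found_ref, unsigned long long bytes,
			 unsigned long long max_size,
			 unsigned long long parent)
{
	if (found_ref) {
		if (bytes != max_size) {
			fprintf(stderr, "size mismatch, parent %llu\n", parent);
			return 1;
		}
	}
	return 0;
}
```

For the purpose of collecting the parent value, the unbraced version
still prints the line when found_ref is set and the sizes differ.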


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: kernel BUG at fs/btrfs/extent-tree.c:8113! (4.1.3 kernel)
  2015-08-12 16:18                     ` Josef Bacik
@ 2015-08-12 17:19                       ` Marc MERLIN
  2015-08-17  2:01                         ` Qu Wenruo
  0 siblings, 1 reply; 25+ messages in thread
From: Marc MERLIN @ 2015-08-12 17:19 UTC (permalink / raw)
  To: Josef Bacik; +Cc: dsterba, clm, linux-btrfs

On Wed, Aug 12, 2015 at 12:18:45PM -0400, Josef Bacik wrote:
> Going to need more info to figure this one out
 
Thanks for the patch, here's the output:
enabling repair mode
Checking filesystem on /dev/mapper/crypt_sdd1
UUID: 024ba4d0-dacb-438d-9f1b-eeb34083fe49
checking extents
wtf, parent 575708413952 <<<<<<
cmds-check.c:4488: add_data_backref: Assertion `back->bytes != max_size` failed.
/tmp/btrfs[0x8066a83]
/tmp/btrfs[0x8066ab4]
/tmp/btrfs[0x80679d8]
/tmp/btrfs[0x806b4f2]
/tmp/btrfs[0x806b9ea]
/tmp/btrfs[0x806c5f9]
/tmp/btrfs(cmd_check+0x1088)[0x806ee26]
/tmp/btrfs(main+0x153)[0x80557d6]
/lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0xb75a54d3]
/tmp/btrfs[0x80557fc]

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/                         | PGP 1024R/763BE901

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: kernel BUG at fs/btrfs/extent-tree.c:8113! (4.1.3 kernel)
  2015-08-12 17:19                       ` Marc MERLIN
@ 2015-08-17  2:01                         ` Qu Wenruo
  2015-08-17 14:49                           ` Marc MERLIN
  0 siblings, 1 reply; 25+ messages in thread
From: Qu Wenruo @ 2015-08-17  2:01 UTC (permalink / raw)
  To: Marc MERLIN, Josef Bacik; +Cc: dsterba, clm, linux-btrfs

Hi Marc,

Does btrfs-debug-tree also crash?

If not, would you please attach the output, provided it doesn't contain
confidential data?

Thanks,
Qu


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: kernel BUG at fs/btrfs/extent-tree.c:8113! (4.1.3 kernel)
  2015-08-17  2:01                         ` Qu Wenruo
@ 2015-08-17 14:49                           ` Marc MERLIN
  2015-08-22 14:37                             ` Marc MERLIN
  0 siblings, 1 reply; 25+ messages in thread
From: Marc MERLIN @ 2015-08-17 14:49 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Josef Bacik, dsterba, clm, linux-btrfs

On Mon, Aug 17, 2015 at 10:01:16AM +0800, Qu Wenruo wrote:
> Hi Marc,
 
Hi Qu, thanks for your answer and looking at this.

> Did btrfs-debug-tree also crash?
> 
> If not, would you please attach the output, as long as it doesn't contain
> classified data?
 
Sure thing:
btrfs-debug-tree /dev/mapper/crypt_sdd1 > /tmp/tree.out
parent transid verify failed on 2968115101696 wanted 34855 found 39533
parent transid verify failed on 2968115101696 wanted 34855 found 39533
parent transid verify failed on 2968115101696 wanted 34855 found 39533
parent transid verify failed on 2968115101696 wanted 34855 found 39533
Ignoring transid failure
parent transid verify failed on 2968115134464 wanted 34855 found 39533
parent transid verify failed on 2968115134464 wanted 34855 found 39533
parent transid verify failed on 2968115134464 wanted 34855 found 39533
parent transid verify failed on 2968115134464 wanted 34855 found 39533
Ignoring transid failure
parent transid verify failed on 2968115150848 wanted 34855 found 39533
parent transid verify failed on 2968115150848 wanted 34855 found 39533
parent transid verify failed on 2968115150848 wanted 34855 found 39533
parent transid verify failed on 2968115150848 wanted 34855 found 39533
Ignoring transid failure
parent transid verify failed on 2968115691520 wanted 34855 found 39533
parent transid verify failed on 2968115691520 wanted 34855 found 39533
parent transid verify failed on 2968115691520 wanted 34855 found 39533
parent transid verify failed on 2968115691520 wanted 34855 found 39533
Ignoring transid failure
parent transid verify failed on 1291597152256 wanted 35830 found 39530
parent transid verify failed on 1291597152256 wanted 35830 found 39530
parent transid verify failed on 1291597152256 wanted 35830 found 39530
parent transid verify failed on 1291597152256 wanted 35830 found 39530
Ignoring transid failure
parent transid verify failed on 2968116592640 wanted 34855 found 39533
parent transid verify failed on 2968116592640 wanted 34855 found 39533
parent transid verify failed on 2968116592640 wanted 34855 found 39533
parent transid verify failed on 2968116592640 wanted 34855 found 39533
Ignoring transid failure
parent transid verify failed on 2968116609024 wanted 34855 found 39533
parent transid verify failed on 2968116609024 wanted 34855 found 39533
parent transid verify failed on 2968116609024 wanted 34855 found 39533
parent transid verify failed on 2968116609024 wanted 34855 found 39533
Ignoring transid failure
print-tree.c:1094: btrfs_print_tree: Assertion failed.
btrfs-debug-tree[0x805ce93]
btrfs-debug-tree(btrfs_print_tree+0x26d)[0x805eb51]
btrfs-debug-tree(btrfs_print_tree+0x279)[0x805eb5d]
btrfs-debug-tree(main+0x8b5)[0x804dfb7]
/lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0xb757c4d3]
btrfs-debug-tree[0x804e221]

Do you want the actual output?
(it's 1.1GB uncompressed)

Marc




^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: kernel BUG at fs/btrfs/extent-tree.c:8113! (4.1.3 kernel)
  2015-08-17 14:49                           ` Marc MERLIN
@ 2015-08-22 14:37                             ` Marc MERLIN
  2015-08-24  1:10                               ` Qu Wenruo
  0 siblings, 1 reply; 25+ messages in thread
From: Marc MERLIN @ 2015-08-22 14:37 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Josef Bacik, dsterba, clm, linux-btrfs

On Mon, Aug 17, 2015 at 07:49:04AM -0700, Marc MERLIN wrote:
> On Mon, Aug 17, 2015 at 10:01:16AM +0800, Qu Wenruo wrote:
> > Hi Marc,
>  
> Hi Qu, thanks for your answer and looking at this.
> 
> > Did btrfs-debug-tree also crash?
> > 
> > If not, would you please attach the output, as long as it doesn't contain
> > classified data?
  
Do you need anything else before I wipe the filesystem and start over?

1) kernel is fixed not to crash
2) btrfs check --repair segfaults
3) btrfs-debug-tree ends with assert

Thanks,
Marc



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: kernel BUG at fs/btrfs/extent-tree.c:8113! (4.1.3 kernel)
  2015-08-22 14:37                             ` Marc MERLIN
@ 2015-08-24  1:10                               ` Qu Wenruo
  2015-08-24  4:28                                 ` Marc MERLIN
  0 siblings, 1 reply; 25+ messages in thread
From: Qu Wenruo @ 2015-08-24  1:10 UTC (permalink / raw)
  To: Marc MERLIN; +Cc: Josef Bacik, dsterba, clm, linux-btrfs



Marc MERLIN wrote on 2015/08/22 07:37 -0700:
> On Mon, Aug 17, 2015 at 07:49:04AM -0700, Marc MERLIN wrote:
>> On Mon, Aug 17, 2015 at 10:01:16AM +0800, Qu Wenruo wrote:
>>> Hi Marc,
>>
>> Hi Qu, thanks for your answer and looking at this.
>>
>>> Did btrfs-debug-tree also crash?
>>>
>>> If not, would you please attach the output, as long as it doesn't contain
>>> classified data?
>
> Do you need anything else before I wipe the filesystem and start over?
>
> 1) kernel is fixed not to crash
> 2) btrfs check --repair segfaults
> 3) btrfs-debug-tree ends with assert
>
> Thanks,
> Marc

Would you please capture the following output?

1) btrfs check output
With error message if it happens.

2) btrfs check --repair output
Full output until segfault.

3) btrfs-debug-tree output
With assert output.

At least this should help us figure out what's wrong with the on-disk data.
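
In case it helps with collecting these, here is a minimal sketch (not part of btrfs-progs) of a Python helper that runs a command and saves stdout and stderr together in one file. In the earlier btrfs-debug-tree run, the transid errors and the assert showed up on the terminal even though stdout was redirected to /tmp/tree.out, which suggests they are written to stderr and would be lost with a plain "> file" redirect. The helper name capture() and the log paths are made up for illustration.

```python
import subprocess


def capture(cmd, log_path):
    """Run cmd and write its stdout and stderr, combined, to log_path.

    Returns the exit status. On POSIX a negative value -N means the
    process was killed by signal N (an abort() after a failed assert
    would show up that way).
    """
    with open(log_path, "wb") as log:
        # stderr=STDOUT sends both streams to the same file. The relative
        # order of stdout and stderr lines may differ from what a terminal
        # shows, because the child buffers stdout when it is not a tty.
        proc = subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT)
    return proc.returncode
```

For example, capture(["btrfs", "check", "/dev/mapper/crypt_sdd1"], "/tmp/check.log") would leave both the normal output and any assert backtrace in /tmp/check.log (a hypothetical path).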

Thanks,
Qu


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: kernel BUG at fs/btrfs/extent-tree.c:8113! (4.1.3 kernel)
  2015-08-24  1:10                               ` Qu Wenruo
@ 2015-08-24  4:28                                 ` Marc MERLIN
  2015-08-24  5:11                                   ` Qu Wenruo
  0 siblings, 1 reply; 25+ messages in thread
From: Marc MERLIN @ 2015-08-24  4:28 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Josef Bacik, dsterba, clm, linux-btrfs

On Mon, Aug 24, 2015 at 09:10:30AM +0800, Qu Wenruo wrote:
> Would you please capture the following output?
> 
> 1) btrfs check output
> With error message if it happens.
 
myth:~# btrfs check /dev/mapper/crypt_sdd1
Checking filesystem on /dev/mapper/crypt_sdd1
UUID: 024ba4d0-dacb-438d-9f1b-eeb34083fe49
checking extents
cmds-check.c:4486: add_data_backref: Assertion `back->bytes != max_size` failed.
btrfs[0x8066a73]
btrfs[0x8066aa4]
btrfs[0x8067991]
btrfs[0x806b4ab]
btrfs[0x806b9a3]
btrfs[0x806c5b2]
btrfs(cmd_check+0x1088)[0x806eddf]
btrfs(main+0x153)[0x80557c6]
/lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0xb753a4d3]
btrfs[0x80557ec]

> 2) btrfs check --repair output
> Full output until segfault.
 
myth:~# btrfs check --repair /dev/mapper/crypt_sdd1
enabling repair mode
Checking filesystem on /dev/mapper/crypt_sdd1
UUID: 024ba4d0-dacb-438d-9f1b-eeb34083fe49
checking extents
cmds-check.c:4486: add_data_backref: Assertion `back->bytes != max_size` failed.
btrfs[0x8066a73]
btrfs[0x8066aa4]
btrfs[0x8067991]
btrfs[0x806b4ab]
btrfs[0x806b9a3]
btrfs[0x806c5b2]
btrfs(cmd_check+0x1088)[0x806eddf]
btrfs(main+0x153)[0x80557c6]
/lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0xb75114d3]
btrfs[0x80557ec]

Strangely I'm not getting a segfault anymore.

> 3) btrfs-debug-tree output
> With assert output.
 
The full output is multi gigabyte. Do you need this and if so, do I need to
upload it somewhere and will you download the multi gigabyte file?

The errors and assert, I already posted here:

> >>Sure thing:
> >>btrfs-debug-tree /dev/mapper/crypt_sdd1 > /tmp/tree.out
> >>parent transid verify failed on 2968115101696 wanted 34855 found 39533
> >>parent transid verify failed on 2968115101696 wanted 34855 found 39533
> >>parent transid verify failed on 2968115101696 wanted 34855 found 39533
> >>parent transid verify failed on 2968115101696 wanted 34855 found 39533
> >>Ignoring transid failure
> >>parent transid verify failed on 2968115134464 wanted 34855 found 39533
> >>parent transid verify failed on 2968115134464 wanted 34855 found 39533
> >>parent transid verify failed on 2968115134464 wanted 34855 found 39533
> >>parent transid verify failed on 2968115134464 wanted 34855 found 39533
> >>Ignoring transid failure
> >>parent transid verify failed on 2968115150848 wanted 34855 found 39533
> >>parent transid verify failed on 2968115150848 wanted 34855 found 39533
> >>parent transid verify failed on 2968115150848 wanted 34855 found 39533
> >>parent transid verify failed on 2968115150848 wanted 34855 found 39533
> >>Ignoring transid failure
> >>parent transid verify failed on 2968115691520 wanted 34855 found 39533
> >>parent transid verify failed on 2968115691520 wanted 34855 found 39533
> >>parent transid verify failed on 2968115691520 wanted 34855 found 39533
> >>parent transid verify failed on 2968115691520 wanted 34855 found 39533
> >>Ignoring transid failure
> >>parent transid verify failed on 1291597152256 wanted 35830 found 39530
> >>parent transid verify failed on 1291597152256 wanted 35830 found 39530
> >>parent transid verify failed on 1291597152256 wanted 35830 found 39530
> >>parent transid verify failed on 1291597152256 wanted 35830 found 39530
> >>Ignoring transid failure
> >>parent transid verify failed on 2968116592640 wanted 34855 found 39533
> >>parent transid verify failed on 2968116592640 wanted 34855 found 39533
> >>parent transid verify failed on 2968116592640 wanted 34855 found 39533
> >>parent transid verify failed on 2968116592640 wanted 34855 found 39533
> >>Ignoring transid failure
> >>parent transid verify failed on 2968116609024 wanted 34855 found 39533
> >>parent transid verify failed on 2968116609024 wanted 34855 found 39533
> >>parent transid verify failed on 2968116609024 wanted 34855 found 39533
> >>parent transid verify failed on 2968116609024 wanted 34855 found 39533
> >>Ignoring transid failure
> >>print-tree.c:1094: btrfs_print_tree: Assertion failed.
> >>btrfs-debug-tree[0x805ce93]
> >>btrfs-debug-tree(btrfs_print_tree+0x26d)[0x805eb51]
> >>btrfs-debug-tree(btrfs_print_tree+0x279)[0x805eb5d]
> >>btrfs-debug-tree(main+0x8b5)[0x804dfb7]
> >>/lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0xb757c4d3]
> >>btrfs-debug-tree[0x804e221]

Thanks,
Marc

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: kernel BUG at fs/btrfs/extent-tree.c:8113! (4.1.3 kernel)
  2015-08-24  4:28                                 ` Marc MERLIN
@ 2015-08-24  5:11                                   ` Qu Wenruo
  2015-08-24 14:10                                     ` Marc MERLIN
  0 siblings, 1 reply; 25+ messages in thread
From: Qu Wenruo @ 2015-08-24  5:11 UTC (permalink / raw)
  To: Marc MERLIN; +Cc: Josef Bacik, dsterba, clm, linux-btrfs



Marc MERLIN wrote on 2015/08/23 21:28 -0700:
> On Mon, Aug 24, 2015 at 09:10:30AM +0800, Qu Wenruo wrote:
>> Would you please capture the following output?
>>
>> 1) btrfs check output
>> With error message if it happens.
>
> myth:~# btrfs check /dev/mapper/crypt_sdd1
> Checking filesystem on /dev/mapper/crypt_sdd1
> UUID: 024ba4d0-dacb-438d-9f1b-eeb34083fe49
> checking extents
> cmds-check.c:4486: add_data_backref: Assertion `back->bytes != max_size` failed.
> btrfs[0x8066a73]
> btrfs[0x8066aa4]
> btrfs[0x8067991]
> btrfs[0x806b4ab]
> btrfs[0x806b9a3]
> btrfs[0x806c5b2]
> btrfs(cmd_check+0x1088)[0x806eddf]
> btrfs(main+0x153)[0x80557c6]
> /lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0xb753a4d3]
> btrfs[0x80557ec]
>
>> 2) btrfs check --repair output
>> Full output until segfault.
>
> myth:~# btrfs check --repair /dev/mapper/crypt_sdd1
> enabling repair mode
> Checking filesystem on /dev/mapper/crypt_sdd1
> UUID: 024ba4d0-dacb-438d-9f1b-eeb34083fe49
> checking extents
> cmds-check.c:4486: add_data_backref: Assertion `back->bytes != max_size` failed.
> btrfs[0x8066a73]
> btrfs[0x8066aa4]
> btrfs[0x8067991]
> btrfs[0x806b4ab]
> btrfs[0x806b9a3]
> btrfs[0x806c5b2]
> btrfs(cmd_check+0x1088)[0x806eddf]
> btrfs(main+0x153)[0x80557c6]
> /lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0xb75114d3]
> btrfs[0x80557ec]
>
> Strangely I'm not getting a segfault anymore.

It seems that something is wrong with the tree block's backref.

>
>> 3) btrfs-debug-tree output
>> With assert output.
>
> The full output is multi gigabyte. Do you need this and if so, do I need to
> upload it somewhere and will you download the multi gigabyte file?
>
> The errors and assert, I already posted here:
>
>>>> Sure thing:
>>>> btrfs-debug-tree /dev/mapper/crypt_sdd1 > /tmp/tree.out
>>>> parent transid verify failed on 2968115101696 wanted 34855 found 39533
>>>> parent transid verify failed on 2968115101696 wanted 34855 found 39533
>>>> parent transid verify failed on 2968115101696 wanted 34855 found 39533
>>>> parent transid verify failed on 2968115101696 wanted 34855 found 39533
>>>> Ignoring transid failure
>>>> parent transid verify failed on 2968115134464 wanted 34855 found 39533
>>>> parent transid verify failed on 2968115134464 wanted 34855 found 39533
>>>> parent transid verify failed on 2968115134464 wanted 34855 found 39533
>>>> parent transid verify failed on 2968115134464 wanted 34855 found 39533
>>>> Ignoring transid failure
>>>> parent transid verify failed on 2968115150848 wanted 34855 found 39533
>>>> parent transid verify failed on 2968115150848 wanted 34855 found 39533
>>>> parent transid verify failed on 2968115150848 wanted 34855 found 39533
>>>> parent transid verify failed on 2968115150848 wanted 34855 found 39533
>>>> Ignoring transid failure
>>>> parent transid verify failed on 2968115691520 wanted 34855 found 39533
>>>> parent transid verify failed on 2968115691520 wanted 34855 found 39533
>>>> parent transid verify failed on 2968115691520 wanted 34855 found 39533
>>>> parent transid verify failed on 2968115691520 wanted 34855 found 39533
>>>> Ignoring transid failure
>>>> parent transid verify failed on 1291597152256 wanted 35830 found 39530
>>>> parent transid verify failed on 1291597152256 wanted 35830 found 39530
>>>> parent transid verify failed on 1291597152256 wanted 35830 found 39530
>>>> parent transid verify failed on 1291597152256 wanted 35830 found 39530
>>>> Ignoring transid failure
>>>> parent transid verify failed on 2968116592640 wanted 34855 found 39533
>>>> parent transid verify failed on 2968116592640 wanted 34855 found 39533
>>>> parent transid verify failed on 2968116592640 wanted 34855 found 39533
>>>> parent transid verify failed on 2968116592640 wanted 34855 found 39533
>>>> Ignoring transid failure
>>>> parent transid verify failed on 2968116609024 wanted 34855 found 39533
>>>> parent transid verify failed on 2968116609024 wanted 34855 found 39533
>>>> parent transid verify failed on 2968116609024 wanted 34855 found 39533
>>>> parent transid verify failed on 2968116609024 wanted 34855 found 39533
>>>> Ignoring transid failure
>>>> print-tree.c:1094: btrfs_print_tree: Assertion failed.
>>>> btrfs-debug-tree[0x805ce93]
>>>> btrfs-debug-tree(btrfs_print_tree+0x26d)[0x805eb51]
>>>> btrfs-debug-tree(btrfs_print_tree+0x279)[0x805eb5d]
>>>> btrfs-debug-tree(main+0x8b5)[0x804dfb7]
>>>> /lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0xb757c4d3]
>>>> btrfs-debug-tree[0x804e221]
Oh, sorry for overlooking the existing output.

The last assert info should be enough, so there is no need to upload it.

The b-tree seems to be badly damaged, or at least one leaf tree block 
is referenced by a higher-level node.
Something may have gone wrong when the level of a b-tree was reduced.

Normally, I would have no idea how to fix such a large problem with btrfsck.
But there is still a clue.

In your debug-tree output, the difference between the wanted and found 
transids is quite large (for example, wanted 34855 but found 39533). I 
suspect there is a much newer root tree that is not recorded in the 
superblock.
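
To make the size of that gap concrete, here is a minimal sketch in Python that pulls the numbers out of the "parent transid verify failed" lines. It assumes the message format shown in the debug-tree output in this thread; the function name summarize_transid_failures() is made up for illustration and is not part of btrfs-progs.

```python
import re

# Matches lines such as:
#   parent transid verify failed on 2968115101696 wanted 34855 found 39533
TRANSID_RE = re.compile(
    r"parent transid verify failed on (\d+) wanted (\d+) found (\d+)"
)


def summarize_transid_failures(lines):
    """Map each failing block bytenr to (wanted, found, found - wanted).

    Repeated messages for the same block collapse into one entry.
    """
    summary = {}
    for line in lines:
        match = TRANSID_RE.search(line)
        if match:
            bytenr, wanted, found = (int(g) for g in match.groups())
            summary[bytenr] = (wanted, found, found - wanted)
    return summary
```

Fed the stderr lines quoted above, this gives one entry per block, so the seven distinct block addresses and their generation gaps can be read off directly.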

So my last bet is to use "btrfs-find-root -a" to find the root with the 
highest generation, and then run "btrfsck -b <bytenr of highest gen root>" 
on that root.

The latest btrfs-find-root outputs the possible tree roots in descending 
order of generation, so you'll find the proper bytenr quite easily.
But be prepared: "btrfs-find-root -a" iterates over all metadata space, 
so it will take a long time to finish.
It also won't output anything until it has scanned all of that space.
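
If the output turns out to be long, picking the highest generation can be scripted. Below is a minimal sketch in Python, assuming the "Well block N(gen: G level: L)" line format that btrfs-find-root prints in this thread; other versions may print something different. The function name candidate_roots() is made up for illustration.

```python
import re

# Matches lines such as:
#   Well block 4243456(gen: 3 level: 0) seems good, but ...
WELL_BLOCK_RE = re.compile(r"Well block (\d+)\(gen: (\d+) level: (\d+)\)")


def candidate_roots(lines):
    """Return (generation, level, bytenr) tuples, highest generation first."""
    roots = []
    for line in lines:
        match = WELL_BLOCK_RE.search(line)
        if match:
            bytenr, gen, level = (int(g) for g in match.groups())
            roots.append((gen, level, bytenr))
    # Sort on generation only; the sort is stable, so blocks with the same
    # generation keep the order in which they were printed.
    roots.sort(key=lambda root: root[0], reverse=True)
    return roots
```

The first tuple returned is the candidate with the highest generation, and its third field is the bytenr to try.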

Thanks,
Qu
>
> Thanks,
> Marc
>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: kernel BUG at fs/btrfs/extent-tree.c:8113! (4.1.3 kernel)
  2015-08-24  5:11                                   ` Qu Wenruo
@ 2015-08-24 14:10                                     ` Marc MERLIN
  2015-08-25  0:26                                       ` Qu Wenruo
  2015-08-25  2:51                                       ` Qu Wenruo
  0 siblings, 2 replies; 25+ messages in thread
From: Marc MERLIN @ 2015-08-24 14:10 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Josef Bacik, dsterba, clm, linux-btrfs

On Mon, Aug 24, 2015 at 01:11:26PM +0800, Qu Wenruo wrote:
> So my last bet is to use "btrfs-find-root -a" to find the root with the
> highest generation, and then run "btrfsck -b <bytenr of highest gen root>"
> on that root.
 
> The latest btrfs-find-root outputs the possible tree roots in descending
> order of generation, so you'll find the proper bytenr quite easily.
> But be prepared: "btrfs-find-root -a" iterates over all metadata space,
> so it will take a long time to finish.
> It also won't output anything until it has scanned all of that space.

This is what I got:

myth:~# btrfs-find-root -a /dev/mapper/crypt_sdd1
Superblock thinks the generation is 39538
Superblock thinks the level is 1
Well block 4243456(gen: 3 level: 0) seems good, but generation/level doesn't match, want gen: 39538 level: 1
Well block 4194304(gen: 2 level: 0) seems good, but generation/level doesn't match, want gen: 39538 level: 1
myth:~# 

Does it mean there is no other block I can/should use?

Thanks,
Marc

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: kernel BUG at fs/btrfs/extent-tree.c:8113! (4.1.3 kernel)
  2015-08-24 14:10                                     ` Marc MERLIN
@ 2015-08-25  0:26                                       ` Qu Wenruo
  2015-08-25  2:51                                       ` Qu Wenruo
  1 sibling, 0 replies; 25+ messages in thread
From: Qu Wenruo @ 2015-08-25  0:26 UTC (permalink / raw)
  To: Marc MERLIN; +Cc: Josef Bacik, dsterba, clm, linux-btrfs



Oh, it seems to be a new bug in btrfs-find-root.

I'll fix it first then.

Thanks,
Qu

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: kernel BUG at fs/btrfs/extent-tree.c:8113! (4.1.3 kernel)
  2015-08-24 14:10                                     ` Marc MERLIN
  2015-08-25  0:26                                       ` Qu Wenruo
@ 2015-08-25  2:51                                       ` Qu Wenruo
  2015-08-25  5:28                                         ` Marc MERLIN
  1 sibling, 1 reply; 25+ messages in thread
From: Qu Wenruo @ 2015-08-25  2:51 UTC (permalink / raw)
  To: Marc MERLIN; +Cc: Josef Bacik, dsterba, clm, linux-btrfs



Patches sent and CCed to you.

Please try the two patches and see what's new.
This time, I think the output will be much larger.

Thanks,
Qu


* Re: kernel BUG at fs/btrfs/extent-tree.c:8113! (4.1.3 kernel)
  2015-08-25  2:51                                       ` Qu Wenruo
@ 2015-08-25  5:28                                         ` Marc MERLIN
  2015-08-25  6:00                                           ` Qu Wenruo
  0 siblings, 1 reply; 25+ messages in thread
From: Marc MERLIN @ 2015-08-25  5:28 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Josef Bacik, dsterba, clm, linux-btrfs

On Tue, Aug 25, 2015 at 10:51:00AM +0800, Qu Wenruo wrote:
> Patches sent and CCed to you.
> 
> Please try the two patches and see what's new.
> This time, I think the output will be much larger.

Indeed.

However, the bad news is that gen 39538 is the highest.
Should I force btrfsck to work with an older generation, or do we throw in the towel and stop
trying to rescue this FS? It's a backup FS, so there is no data on it I need to recover; I'm just curious
how it managed to corrupt itself when all I did was a weekly backup to it via btrfs send/receive.

myth:~# sort  -rn -k +4  /var/spool/out  |head -20
Well block 29523968(gen: 39538 level: 1) seems good, and it matches superblock
Well block 29687808(gen: 39537 level: 1) seems good, but generation/level doesn't match, want gen: 39538 level: 1
Well block 93749248(gen: 39536 level: 0) seems good, but generation/level doesn't match, want gen: 39538 level: 1
Well block 60669952(gen: 39536 level: 0) seems good, but generation/level doesn't match, want gen: 39538 level: 1
Well block 30474240(gen: 39536 level: 0) seems good, but generation/level doesn't match, want gen: 39538 level: 1
Well block 29540352(gen: 39536 level: 0) seems good, but generation/level doesn't match, want gen: 39538 level: 1
Well block 150880256(gen: 39536 level: 0) seems good, but generation/level doesn't match, want gen: 39538 level: 1
Well block 150568960(gen: 39536 level: 0) seems good, but generation/level doesn't match, want gen: 39538 level: 1
Well block 150552576(gen: 39536 level: 0) seems good, but generation/level doesn't match, want gen: 39538 level: 1
Well block 150519808(gen: 39536 level: 0) seems good, but generation/level doesn't match, want gen: 39538 level: 1
Well block 150503424(gen: 39536 level: 0) seems good, but generation/level doesn't match, want gen: 39538 level: 1
Well block 141410304(gen: 39536 level: 0) seems good, but generation/level doesn't match, want gen: 39538 level: 1
Well block 136347648(gen: 39536 level: 0) seems good, but generation/level doesn't match, want gen: 39538 level: 1
Well block 7312195813376(gen: 39535 level: 1) seems good, but generation/level doesn't match, want gen: 39538 level: 1
Well block 7312194404352(gen: 39535 level: 1) seems good, but generation/level doesn't match, want gen: 39538 level: 1
Well block 7312086122496(gen: 39535 level: 1) seems good, but generation/level doesn't match, want gen: 39538 level: 1
Well block 7312079536128(gen: 39535 level: 1) seems good, but generation/level doesn't match, want gen: 39538 level: 1
Well block 7311940960256(gen: 39535 level: 1) seems good, but generation/level doesn't match, want gen: 39538 level: 1
Well block 7311866003456(gen: 39535 level: 1) seems good, but generation/level doesn't match, want gen: 39538 level: 1
Well block 7311839477760(gen: 39535 level: 1) seems good, but generation/level doesn't match, want gen: 39538 level: 1
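
For reference, the ranking done by the sort command above can also be
sketched in Python. This is only an illustration, assuming the
btrfs-find-root output keeps the "Well block N(gen: G level: L)" shape
shown in this thread; parse_candidates and newest_first are names made
up for the example and are not part of btrfs-progs.

```python
import re

# Matches lines such as:
#   Well block 29687808(gen: 39537 level: 1) seems good, but ...
# The pattern is an assumption based on the output pasted in this thread.
LINE_RE = re.compile(r"^Well block (\d+)\(gen: (\d+) level: (\d+)\)")


def parse_candidates(text):
    """Return (generation, level, bytenr) tuples found in the output."""
    found = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            bytenr, gen, level = (int(g) for g in m.groups())
            found.append((gen, level, bytenr))
    return found


def newest_first(candidates):
    """Sort by generation, then level, both descending."""
    return sorted(candidates, key=lambda c: (c[0], c[1]), reverse=True)


sample = """\
Well block 4243456(gen: 3 level: 0) seems good, but generation/level doesn't match, want gen: 39538 level: 1
Well block 29687808(gen: 39537 level: 1) seems good, but generation/level doesn't match, want gen: 39538 level: 1
Well block 29523968(gen: 39538 level: 1) seems good, and it matches superblock
"""

ranked = newest_first(parse_candidates(sample))
for gen, level, bytenr in ranked:
    print(f"gen {gen} level {level} bytenr {bytenr}")
```

The bytenr of the second entry is the kind of value that would be passed
to --tree-root when trying the next older root.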

Thanks,
Marc

* Re: kernel BUG at fs/btrfs/extent-tree.c:8113! (4.1.3 kernel)
  2015-08-25  5:28                                         ` Marc MERLIN
@ 2015-08-25  6:00                                           ` Qu Wenruo
  2015-08-25  6:50                                             ` Marc MERLIN
  0 siblings, 1 reply; 25+ messages in thread
From: Qu Wenruo @ 2015-08-25  6:00 UTC (permalink / raw)
  To: Marc MERLIN; +Cc: Josef Bacik, dsterba, clm, linux-btrfs



Marc MERLIN wrote on 2015/08/24 22:28 -0700:
> On Tue, Aug 25, 2015 at 10:51:00AM +0800, Qu Wenruo wrote:
>> Patches sent and CCed to you.
>>
>> Please try the two patches and see what's new.
>> This time, I think the output will be much larger.
>
> Indeed.
>
> However, the bad news is that gen 39538 is the highest.
> Should I force btrfsck to work with an older generation, or do we throw in the towel and stop
> trying to rescue this FS? It's a backup FS, so there is no data on it I need to recover; I'm just curious
> how it managed to corrupt itself when all I did was a weekly backup to it via btrfs send/receive.
>
> myth:~# sort  -rn -k +4  /var/spool/out  |head -20
> Well block 29523968(gen: 39538 level: 1) seems good, and it matches superblock
> Well block 29687808(gen: 39537 level: 1) seems good, but generation/level doesn't match, want gen: 39538 level: 1
> Well block 93749248(gen: 39536 level: 0) seems good, but generation/level doesn't match, want gen: 39538 level: 1
> Well block 60669952(gen: 39536 level: 0) seems good, but generation/level doesn't match, want gen: 39538 level: 1
> Well block 30474240(gen: 39536 level: 0) seems good, but generation/level doesn't match, want gen: 39538 level: 1
> Well block 29540352(gen: 39536 level: 0) seems good, but generation/level doesn't match, want gen: 39538 level: 1
> Well block 150880256(gen: 39536 level: 0) seems good, but generation/level doesn't match, want gen: 39538 level: 1
> Well block 150568960(gen: 39536 level: 0) seems good, but generation/level doesn't match, want gen: 39538 level: 1
> Well block 150552576(gen: 39536 level: 0) seems good, but generation/level doesn't match, want gen: 39538 level: 1
> Well block 150519808(gen: 39536 level: 0) seems good, but generation/level doesn't match, want gen: 39538 level: 1
> Well block 150503424(gen: 39536 level: 0) seems good, but generation/level doesn't match, want gen: 39538 level: 1
> Well block 141410304(gen: 39536 level: 0) seems good, but generation/level doesn't match, want gen: 39538 level: 1
> Well block 136347648(gen: 39536 level: 0) seems good, but generation/level doesn't match, want gen: 39538 level: 1
> Well block 7312195813376(gen: 39535 level: 1) seems good, but generation/level doesn't match, want gen: 39538 level: 1
> Well block 7312194404352(gen: 39535 level: 1) seems good, but generation/level doesn't match, want gen: 39538 level: 1
> Well block 7312086122496(gen: 39535 level: 1) seems good, but generation/level doesn't match, want gen: 39538 level: 1
> Well block 7312079536128(gen: 39535 level: 1) seems good, but generation/level doesn't match, want gen: 39538 level: 1
> Well block 7311940960256(gen: 39535 level: 1) seems good, but generation/level doesn't match, want gen: 39538 level: 1
> Well block 7311866003456(gen: 39535 level: 1) seems good, but generation/level doesn't match, want gen: 39538 level: 1
> Well block 7311839477760(gen: 39535 level: 1) seems good, but generation/level doesn't match, want gen: 39538 level: 1
>
> Thanks,
> Marc
>
Thanks for all your work and patience, Marc.

Good to know there is a backup.
But since there is no root with a higher generation, I'd assume this is
not a normal transaction ID failure case.

Personally, I'd like to try btrfsck with gen 39537 (--tree-root 29687808),
but that's just my personal curiosity.
That curiosity has shifted, though, from finding a clue to how it was
damaged to trying to recover it.

If you think it's OK, then just wipe it, nobody has the right to disturb 
your sleep.

At least we have a clue here: some parent nodes were corrupted with a
much higher, nonexistent generation.

Thanks,
Qu


* Re: kernel BUG at fs/btrfs/extent-tree.c:8113! (4.1.3 kernel)
  2015-08-25  6:00                                           ` Qu Wenruo
@ 2015-08-25  6:50                                             ` Marc MERLIN
  0 siblings, 0 replies; 25+ messages in thread
From: Marc MERLIN @ 2015-08-25  6:50 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Josef Bacik, dsterba, clm, linux-btrfs

On Tue, Aug 25, 2015 at 02:00:32PM +0800, Qu Wenruo wrote:
> Thanks for all your work and patience, Marc.
 
Haha, no problem, you're doing a lot more work than I am :)

> Good to know there is a backup.
> But since there is no root with a higher generation, I'd assume this is
> not a normal transaction ID failure case.

Right.

> Personally, I'd like to try btrfsck with gen 39537 (--tree-root 29687808),
> but that's just my personal curiosity.
> That curiosity has shifted, though, from finding a clue to how it was
> damaged to trying to recover it.

I gave that a shot, same thing:
myth:~# btrfs check --repair --tree-root 29687808 /dev/mapper/crypt_sdd1
enabling repair mode
parent transid verify failed on 29687808 wanted 39538 found 39537
parent transid verify failed on 29687808 wanted 39538 found 39537
parent transid verify failed on 29687808 wanted 39538 found 39537
parent transid verify failed on 29687808 wanted 39538 found 39537
Ignoring transid failure
Checking filesystem on /dev/mapper/crypt_sdd1
UUID: 024ba4d0-dacb-438d-9f1b-eeb34083fe49
checking extents
wtf, parent 575708413952
cmds-check.c:4488: add_data_backref: Assertion `back->bytes != max_size` failed.
btrfs[0x8066a83]
btrfs[0x8066ab4]
btrfs[0x80679d8]
btrfs[0x806b4f2]
btrfs[0x806b9ea]
btrfs[0x806c5f9]
btrfs(cmd_check+0x1088)[0x806ee26]
btrfs(main+0x153)[0x80557d6]
/lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0xb75154d3]
btrfs[0x80557fc]

> If you think it's OK, then just wipe it, nobody has the right to
> disturb your sleep.
 
That's not a problem :) 
I signed up for bugs when using btrfs, and am happy to help with reports
and getting the tools improved with your help and others' where
possible.

> At least we have a clue here: some parent nodes were corrupted with a
> much higher, nonexistent generation.

Right. I could keep going back in time until it works, but the
next older root (gen 39535) doesn't work either:

enabling repair mode
parent transid verify failed on 7312195813376 wanted 39538 found 39535
parent transid verify failed on 7312195813376 wanted 39538 found 39535
parent transid verify failed on 7312195813376 wanted 39538 found 39535
parent transid verify failed on 7312195813376 wanted 39538 found 39535
Ignoring transid failure
parent transid verify failed on 29687808 wanted 39524 found 39537
parent transid verify failed on 29687808 wanted 39524 found 39537
parent transid verify failed on 29687808 wanted 39524 found 39537
parent transid verify failed on 29687808 wanted 39524 found 39537
Ignoring transid failure
Checking filesystem on /dev/mapper/crypt_sdd1
UUID: 024ba4d0-dacb-438d-9f1b-eeb34083fe49
checking extents
cmds-check.c:3730: check_owner_ref: Assertion `rec->is_root` failed.
btrfs[0x8066a83]
btrfs[0x8066ab4]
btrfs[0x806acc4]
btrfs[0x806b9ea]
btrfs[0x806c5f9]
btrfs(cmd_check+0x1088)[0x806ee26]
btrfs(main+0x153)[0x80557d6]
/lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0xb75104d3]
btrfs[0x80557fc]
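
The transid messages in the two runs above can be tallied with a small
script. A minimal sketch, assuming the "parent transid verify failed on
N wanted W found F" format printed by this version of btrfs check;
summarize_transid_failures is a hypothetical helper, not a btrfs-progs
function.

```python
import re
from collections import Counter

# Message format assumed from the btrfs check output pasted in this thread.
TRANSID_RE = re.compile(
    r"parent transid verify failed on (\d+) wanted (\d+) found (\d+)"
)


def summarize_transid_failures(log):
    """Count each distinct (bytenr, wanted, found) triple in the log."""
    counts = Counter()
    for m in TRANSID_RE.finditer(log):
        bytenr, wanted, found = (int(g) for g in m.groups())
        counts[(bytenr, wanted, found)] += 1
    return counts


log = """\
parent transid verify failed on 7312195813376 wanted 39538 found 39535
parent transid verify failed on 7312195813376 wanted 39538 found 39535
Ignoring transid failure
parent transid verify failed on 29687808 wanted 39524 found 39537
parent transid verify failed on 29687808 wanted 39524 found 39537
Ignoring transid failure
"""

summary = summarize_transid_failures(log)
for (bytenr, wanted, found), n in summary.items():
    # "older" means the block on disk is behind what its parent expects.
    direction = "older" if found < wanted else "newer"
    print(f"block {bytenr}: wanted {wanted}, found {found} "
          f"({direction} than wanted), seen {n} times")
```

In the second run, block 29687808 is found at a generation newer than
the one wanted, the reverse of what the first run reported for the same
block.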

Marc

Thread overview: 25+ messages
2015-07-06 21:21 btrfs check --repair crash, and btrfs-cleaner crash Marc MERLIN
2015-07-10 13:43 ` Btrfs progs release 4.1.1 David Sterba
2015-07-12  1:02   ` Marc MERLIN
2015-07-23 11:55     ` David Sterba
2015-07-24 16:24       ` Marc MERLIN
2015-08-03  3:51         ` kernel BUG at fs/btrfs/extent-tree.c:8113! (4.1.3 kernel) Marc MERLIN
2015-08-11  5:07           ` Marc MERLIN
2015-08-11 15:40             ` Josef Bacik
2015-08-12 14:47               ` Marc MERLIN
2015-08-12 15:15                 ` Josef Bacik
2015-08-12 16:09                   ` Marc MERLIN
2015-08-12 16:18                     ` Josef Bacik
2015-08-12 17:19                       ` Marc MERLIN
2015-08-17  2:01                         ` Qu Wenruo
2015-08-17 14:49                           ` Marc MERLIN
2015-08-22 14:37                             ` Marc MERLIN
2015-08-24  1:10                               ` Qu Wenruo
2015-08-24  4:28                                 ` Marc MERLIN
2015-08-24  5:11                                   ` Qu Wenruo
2015-08-24 14:10                                     ` Marc MERLIN
2015-08-25  0:26                                       ` Qu Wenruo
2015-08-25  2:51                                       ` Qu Wenruo
2015-08-25  5:28                                         ` Marc MERLIN
2015-08-25  6:00                                           ` Qu Wenruo
2015-08-25  6:50                                             ` Marc MERLIN
