* Can't mount, power failure - recoverable?
@ 2012-03-17  4:24 Skylar Burtenshaw
  2012-03-17  7:51 ` cwillu
                   ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Skylar Burtenshaw @ 2012-03-17  4:24 UTC (permalink / raw)
  To: linux-btrfs

Hey all. First and foremost, great work on the filesystem. Love it. That is,
until this...

AGES ago, I had a power failure. I had 22 drives in one BTRFS filesystem. I
know, dumb idea given that it's an experimental FS, but it's not important data,
just.....LOTS of it. A dozen terabytes or so.

Now when I try to mount it with all present kernels (up to 3.2.0) I get several
minutes of disk churning, and a kernel stack trace. Every tool I throw at it
fails. find-root only shows one tree (at the very end) after complaining about
blocks seeming great, but generations don't match for ages. The btrfsck from the
stable tree lists twenty "item # key" messages, then stops with "failed to find
block number 20975616" and aborts every time.

I've been sitting on this filesystem for half a year now, using my backup array,
but it's getting full. I realize I'm being very sparse on information, but I'm
not sure what you need from me.

As such, my questions are these:
1) What information do you require in order to ascertain the degree of my
problem?

2) (Once more information is obtained, of course.) Is there any hope of this
filesystem being reliably recovered? I realize that's a loaded question and that
you obviously can't give me a definite 100% answer, but I would like to know if
it's time to wipe it and start anew, or if the odds are good enough that I
should wait.

Thank you very much for your attention.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Can't mount, power failure - recoverable?
  2012-03-17  4:24 Can't mount, power failure - recoverable? Skylar Burtenshaw
@ 2012-03-17  7:51 ` cwillu
  2012-03-17 19:06   ` Skylar Burtenshaw
  2012-03-17 10:31 ` Hugo Mills
  2012-03-17 12:18 ` Chris Mason
  2 siblings, 1 reply; 24+ messages in thread
From: cwillu @ 2012-03-17  7:51 UTC (permalink / raw)
  To: Skylar Burtenshaw; +Cc: linux-btrfs

> Now when I try to mount it with all present kernels (up to 3.2.0) I get several
> minutes of disk churning, and a kernel stack trace.

[snip]

> As such, my questions are these:
> 1) What information do you require in order to ascertain the degree of my
> problem?

The stack trace would be a start :p

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Can't mount, power failure - recoverable?
  2012-03-17  4:24 Can't mount, power failure - recoverable? Skylar Burtenshaw
  2012-03-17  7:51 ` cwillu
@ 2012-03-17 10:31 ` Hugo Mills
  2012-03-17 19:06   ` Skylar Burtenshaw
  2012-03-17 12:18 ` Chris Mason
  2 siblings, 1 reply; 24+ messages in thread
From: Hugo Mills @ 2012-03-17 10:31 UTC (permalink / raw)
  To: Skylar Burtenshaw; +Cc: linux-btrfs

On Sat, Mar 17, 2012 at 04:24:02AM +0000, Skylar Burtenshaw wrote:
> Hey all. First and foremost, great work on the filesystem. Love it. That is,
> until this...
> 
> AGES ago, I had a power failure. I had 22 drives in one BTRFS filesystem. I
> know, dumb idea given that it's an experimental FS, but it's not important data,
> just.....LOTS of it. A dozen terabytes or so.

> Now when I try to mount it with all present kernels (up to 3.2.0) I
> get several minutes of disk churning, and a kernel stack trace.
> Every tool I throw at it fails. find-root only shows one tree (at
> the very end) after complaining about blocks seeming great, but
> generations don't match for ages.

   Can you give us the last, say, 200 lines of find-root's output?
Does it give you any listings for "root objectid"s? With find-root and
recover, it's not necessarily fatal that the transids/generations
don't match.

> The btrfsck from the stable tree lists twenty "item # key" messages,
> then stops with "failed to find block number 20975616" and aborts
> every time.

> I've been sitting on this filesystem for half a year now, using my
> backup array, but it's getting full. I realize I'm being very sparse
> on information, but I'm not sure what you need from me.

   Only half a year? :) I sat on my broken 6TB array for a year before
I gave up and recovered the data I didn't have in backups... (And for
the same reason you have -- I ran out of space on the secondary
storage)

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
         --- 2 + 2 = 5,  for sufficiently large values of 2. ---         

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Can't mount, power failure - recoverable?
  2012-03-17  4:24 Can't mount, power failure - recoverable? Skylar Burtenshaw
  2012-03-17  7:51 ` cwillu
  2012-03-17 10:31 ` Hugo Mills
@ 2012-03-17 12:18 ` Chris Mason
  2012-03-17 19:06   ` Skylar Burtenshaw
  2 siblings, 1 reply; 24+ messages in thread
From: Chris Mason @ 2012-03-17 12:18 UTC (permalink / raw)
  To: Skylar Burtenshaw; +Cc: linux-btrfs

On Sat, Mar 17, 2012 at 04:24:02AM +0000, Skylar Burtenshaw wrote:
> Hey all. First and foremost, great work on the filesystem. Love it. That is,
> until this...
> 
> AGES ago, I had a power failure. I had 22 drives in one BTRFS filesystem. I
> know, dumb idea given that it's an experimental FS, but it's not important data,
> just.....LOTS of it. A dozen terabytes or so.

Which kernel was used during the power outage?  If 3.2 or higher you may
be able to mount -o recovery

> 
> Now when I try to mount it with all present kernels (up to 3.2.0) I get several
> minutes of disk churning, and a kernel stack trace. Every tool I throw at it
> fails. find-root only shows one tree (at the very end) after complaining about
> blocks seeming great, but generations don't match for ages. The btrfsck from the
> stable tree lists twenty "item # key" messages, then stops with "failed to find
> block number 20975616" and aborts every time.

We'll definitely need the stack trace, and the tool output.  From there
I'll ask for more.

-chris

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Can't mount, power failure - recoverable?
  2012-03-17  7:51 ` cwillu
@ 2012-03-17 19:06   ` Skylar Burtenshaw
  2012-03-18 15:16     ` Chris Mason
  0 siblings, 1 reply; 24+ messages in thread
From: Skylar Burtenshaw @ 2012-03-17 19:06 UTC (permalink / raw)
  To: linux-btrfs

> The stack trace would be a start :p


Here's the trace. Mount attempt using 3.2.0;

[ 39.873019] device label SolaceNetArray devid 20 transid 267118 /dev/sda3
[ 39.894154] btrfs: disk space caching is enabled
[ 62.383043] BUG: unable to handle kernel NULL pointer dereference at
00000000000000d0
[ 62.383114] IP: [<ffffffffa0021161>] btrfs_put_block_group+0x11/0x70 [btrfs]
[ 62.383190] PGD 78e80067 PUD 79588067 PMD 0
[ 62.383231] Oops: 0002 [#1] SMP
[ 62.383261] CPU 0
[ 62.383277] Modules linked in: aoe radeon ttm nfsd drm_kms_helper drm nfs lockd
k8temp fscache auth_rpcgss nfs_acl lp i2c_algo_bit edac_core edac_mce_amd
i2c_nforce2 psmouse asus_atk0110 serio_raw parport sunrpc usbhid hid 3w_9xxx
floppy sata_promise pata_sil680 sata_nv forcedeth pata_amd sky2 sata_sil24 nbd
btrfs zlib_deflate libcrc32c
[ 62.383583]
[ 62.383597] Pid: 1764, comm: mount Not tainted 3.2.0-030200-generic
#201201042035 System manufacturer
System Product Name/A8N32-SLI-Deluxe
[ 62.383692] RIP: 0010:[<ffffffffa0021161>] [<ffffffffa0021161>]
btrfs_put_block_group+0x11/0x70
[btrfs]
[ 62.383772] RSP: 0018:ffff880078e79568 EFLAGS: 00010292
[ 62.383810] RAX: ffff880072e35000 RBX: 0000000000000000 RCX: 0000160000000000
[ 62.383861] RDX: 000000000004136e RSI: 0000000040000000 RDI: 0000000000000000
[ 62.383912] RBP: ffff880078e79578 R08: 0000000000000000 R09: 0000000000000002
[ 62.383962] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88007b3ca000
[ 62.384013] R13: 0000000000000000 R14: ffff880077daa048 R15: ffff880000000000
[ 62.384064] FS: 00007f95889927e0(0000) GS:ffff88007fc00000(0000)
knlGS:0000000000000000
[ 62.384121] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 62.384162] CR2: 00000000000000d0 CR3: 0000000078c90000 CR4: 00000000000006f0
[ 62.384213] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 62.384263] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 62.384314] Process mount (pid: 1764, threadinfo ffff880078e78000, task
ffff88007ab90000)
[ 62.384371] Stack:
[ 62.386171] ffff880078e79578 ffff8800729e9320 ffff880078e795d8 ffffffffa0024bdf
[ 62.386928] ffff880000000002 0000000000000000 ffff880077ff8190 000000000004136f
[ 62.386928] ffff880078e795d8 ffff88007b3ca000 ffff880072ba6460 ffff8800729e9320
[ 62.386928] Call Trace:
[ 62.386928] [<ffffffffa0024bdf>] btrfs_free_tree_block+0xef/0x1a0 [btrfs]
[ 62.386928] [<ffffffffa001b99a>] __btrfs_cow_block+0x2ca/0x4b0 [btrfs]
[ 62.386928] [<ffffffffa001c14e>] btrfs_cow_block+0xee/0x200 [btrfs]
[ 62.386928] [<ffffffffa001ee68>] btrfs_search_slot+0x328/0x730 [btrfs]
[ 62.386928] [<ffffffffa002919c>] lookup_inline_extent_backref+0xbc/0x400 [btrfs]
[ 62.386928] [<ffffffff811674ad>] ? kmem_cache_alloc+0xcd/0x120
[ 62.386928] [<ffffffffa002ac56>] __btrfs_free_extent+0xd6/0x700 [btrfs]
[ 62.386928] [<ffffffff8107720c>] ? lock_timer_base+0x3c/0x70
[ 62.386928] [<ffffffff81138c58>] ? bdi_wakeup_thread_delayed+0x38/0x40
[ 62.386928] [<ffffffffa002b3da>] run_delayed_tree_ref+0x15a/0x160 [btrfs]




^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Can't mount, power failure - recoverable?
  2012-03-17 12:18 ` Chris Mason
@ 2012-03-17 19:06   ` Skylar Burtenshaw
  2012-07-12  0:47     ` Skylar Burtenshaw
  0 siblings, 1 reply; 24+ messages in thread
From: Skylar Burtenshaw @ 2012-03-17 19:06 UTC (permalink / raw)
  To: linux-btrfs

Chris Mason <chris.mason <at> oracle.com> writes:

> Which kernel was used during the power outage?
The kernel in use was 2.6.38 or so. Didn't write that down, but I'm fairly
certain it was .38 or .37 - sorry I can't be more precise.

Here's the trace. Mount attempt using 3.2.0;

[ 39.873019] device label SolaceNetArray devid 20 transid 267118 /dev/sda3
[ 39.894154] btrfs: disk space caching is enabled
[ 62.383043] BUG: unable to handle kernel NULL pointer dereference at
00000000000000d0
[ 62.383114] IP: [<ffffffffa0021161>] btrfs_put_block_group+0x11/0x70 [btrfs]
[ 62.383190] PGD 78e80067 PUD 79588067 PMD 0
[ 62.383231] Oops: 0002 [#1] SMP
[ 62.383261] CPU 0
[ 62.383277] Modules linked in: aoe radeon ttm nfsd drm_kms_helper drm nfs lockd
k8temp fscache auth_rpcgss nfs_acl lp i2c_algo_bit edac_core edac_mce_amd
i2c_nforce2 psmouse asus_atk0110 serio_raw parport sunrpc usbhid hid 3w_9xxx
floppy sata_promise pata_sil680 sata_nv forcedeth pata_amd sky2 sata_sil24 nbd
btrfs zlib_deflate libcrc32c
[ 62.383583]
[ 62.383597] Pid: 1764, comm: mount Not tainted 3.2.0-030200-generic
#201201042035 System manufacturer
System Product Name/A8N32-SLI-Deluxe
[ 62.383692] RIP: 0010:[<ffffffffa0021161>] [<ffffffffa0021161>]
btrfs_put_block_group+0x11/0x70
[btrfs]
[ 62.383772] RSP: 0018:ffff880078e79568 EFLAGS: 00010292
[ 62.383810] RAX: ffff880072e35000 RBX: 0000000000000000 RCX: 0000160000000000
[ 62.383861] RDX: 000000000004136e RSI: 0000000040000000 RDI: 0000000000000000
[ 62.383912] RBP: ffff880078e79578 R08: 0000000000000000 R09: 0000000000000002
[ 62.383962] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88007b3ca000
[ 62.384013] R13: 0000000000000000 R14: ffff880077daa048 R15: ffff880000000000
[ 62.384064] FS: 00007f95889927e0(0000) GS:ffff88007fc00000(0000)
knlGS:0000000000000000
[ 62.384121] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 62.384162] CR2: 00000000000000d0 CR3: 0000000078c90000 CR4: 00000000000006f0
[ 62.384213] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 62.384263] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 62.384314] Process mount (pid: 1764, threadinfo ffff880078e78000, task
ffff88007ab90000)
[ 62.384371] Stack:
[ 62.386171] ffff880078e79578 ffff8800729e9320 ffff880078e795d8 ffffffffa0024bdf
[ 62.386928] ffff880000000002 0000000000000000 ffff880077ff8190 000000000004136f
[ 62.386928] ffff880078e795d8 ffff88007b3ca000 ffff880072ba6460 ffff8800729e9320
[ 62.386928] Call Trace:
[ 62.386928] [<ffffffffa0024bdf>] btrfs_free_tree_block+0xef/0x1a0 [btrfs]
[ 62.386928] [<ffffffffa001b99a>] __btrfs_cow_block+0x2ca/0x4b0 [btrfs]
[ 62.386928] [<ffffffffa001c14e>] btrfs_cow_block+0xee/0x200 [btrfs]
[ 62.386928] [<ffffffffa001ee68>] btrfs_search_slot+0x328/0x730 [btrfs]
[ 62.386928] [<ffffffffa002919c>] lookup_inline_extent_backref+0xbc/0x400 [btrfs]
[ 62.386928] [<ffffffff811674ad>] ? kmem_cache_alloc+0xcd/0x120
[ 62.386928] [<ffffffffa002ac56>] __btrfs_free_extent+0xd6/0x700 [btrfs]
[ 62.386928] [<ffffffff8107720c>] ? lock_timer_base+0x3c/0x70
[ 62.386928] [<ffffffff81138c58>] ? bdi_wakeup_thread_delayed+0x38/0x40
[ 62.386928] [<ffffffffa002b3da>] run_delayed_tree_ref+0x15a/0x160 [btrfs]






^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Can't mount, power failure - recoverable?
  2012-03-17 10:31 ` Hugo Mills
@ 2012-03-17 19:06   ` Skylar Burtenshaw
  0 siblings, 0 replies; 24+ messages in thread
From: Skylar Burtenshaw @ 2012-03-17 19:06 UTC (permalink / raw)
  To: linux-btrfs

>    Can you give us the last, say, 200 lines of find-root's output?
First line:
Super think's the tree root is at 12653942837248, chunk root 20975616

Last bunch of lines:
Well block 12154942255104 seems great, but generation doesn't match,
have=267067, want=267118
Well block 12154943320064 seems great, but generation doesn't match,
have=267070, want=267118
Well block 12154952667136 seems great, but generation doesn't match,
have=267082, want=267118
Well block 12154988584960 seems great, but generation doesn't match,
have=267090, want=267118
Well block 12377338957824 seems great, but generation doesn't match,
have=267074, want=267118
Well block 12377342033920 seems great, but generation doesn't match,
have=267098, want=267118
Well block 12377342611456 seems great, but generation doesn't match,
have=267077, want=267118
Well block 12377458331648 seems great, but generation doesn't match,
have=267080, want=267118
Well block 12377624317952 seems great, but generation doesn't match,
have=267095, want=267118
Well block 12514222108672 seems great, but generation doesn't match,
have=267100, want=267118
Well block 12514227576832 seems great, but generation doesn't match,
have=267101, want=267118
Well block 12514233110528 seems great, but generation doesn't match,
have=267106, want=267118
Well block 12514240315392 seems great, but generation doesn't match,
have=267114, want=267118
Well block 12514243575808 seems great, but generation doesn't match,
have=267112, want=267118
Well block 12514243645440 seems great, but generation doesn't match,
have=267113, want=267118
Well block 12514243678208 seems great, but generation doesn't match,
have=267116, want=267118
Well block 12514252976128 seems great, but generation doesn't match,
have=267117, want=267118
Found tree root at 12653942837248

There are roughly 2000 lines of:
"Well block X seems great, but generation doesn't match, have=X want=267118"

They ALL want 267118, and only the first line and the last line (both shown) are
different.
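
With roughly 2000 of these lines, picking out the newest candidate roots by hand is tedious. A minimal sketch of how the output could be sorted, assuming the exact line format pasted above; the function name `newest_candidates` and the regular expression are illustrative assumptions, not part of any btrfs tool:

```python
import re

# Matches one find-root candidate line. The \s+ after the comma tolerates
# the line break that the mail client inserted before "have=".
CANDIDATE = re.compile(
    r"Well block (\d+) seems great, but generation doesn't match,\s+"
    r"have=(\d+), want=(\d+)"
)

def newest_candidates(output, limit=5):
    """Return (generation, block) pairs, newest generation first."""
    found = [
        (int(have), int(block))
        for block, have, _want in CANDIDATE.findall(output)
    ]
    return sorted(found, reverse=True)[:limit]

# Three lines copied from the output above, in arbitrary order.
sample = """\
Well block 12514243678208 seems great, but generation doesn't match,
have=267116, want=267118
Well block 12514252976128 seems great, but generation doesn't match,
have=267117, want=267118
Well block 12154942255104 seems great, but generation doesn't match,
have=267067, want=267118
"""

for generation, block in newest_candidates(sample):
    print(generation, block)
```

This only ranks the blocks by how close their generation is to the wanted one; whether any of them is actually usable as a tree root is a separate question.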


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Can't mount, power failure - recoverable?
  2012-03-17 19:06   ` Skylar Burtenshaw
@ 2012-03-18 15:16     ` Chris Mason
  2012-03-18 18:49       ` Skylar Burtenshaw
  0 siblings, 1 reply; 24+ messages in thread
From: Chris Mason @ 2012-03-18 15:16 UTC (permalink / raw)
  To: Skylar Burtenshaw; +Cc: linux-btrfs

On Sat, Mar 17, 2012 at 07:06:14PM +0000, Skylar Burtenshaw wrote:
> > The stack trace would be a start :p
> 
> 
> Here's the trace. Mount attempt using 3.2.0;

Looks like some of the bottom of this trace is missing.  Can you please
try to grab the whole thing?

-chris

> 
> [ 39.873019] device label SolaceNetArray devid 20 transid 267118 /dev/sda3
> [ 39.894154] btrfs: disk space caching is enabled
> [ 62.383043] BUG: unable to handle kernel NULL pointer dereference at
> 00000000000000d0
> [ 62.383114] IP: [<ffffffffa0021161>] btrfs_put_block_group+0x11/0x70 [btrfs]
> [ 62.383190] PGD 78e80067 PUD 79588067 PMD 0
> [ 62.383231] Oops: 0002 [#1] SMP
> [ 62.383261] CPU 0
> [ 62.383277] Modules linked in: aoe radeon ttm nfsd drm_kms_helper drm nfs lockd
> k8temp fscache auth_rpcgss nfs_acl lp i2c_algo_bit edac_core edac_mce_amd
> i2c_nforce2 psmouse asus_atk0110 serio_raw parport sunrpc usbhid hid 3w_9xxx
> floppy sata_promise pata_sil680 sata_nv forcedeth pata_amd sky2 sata_sil24 nbd
> btrfs zlib_deflate libcrc32c
> [ 62.383583]
> [ 62.383597] Pid: 1764, comm: mount Not tainted 3.2.0-030200-generic
> #201201042035 System manufacturer
> System Product Name/A8N32-SLI-Deluxe
> [ 62.383692] RIP: 0010:[<ffffffffa0021161>] [<ffffffffa0021161>]
> btrfs_put_block_group+0x11/0x70
> [btrfs]
> [ 62.383772] RSP: 0018:ffff880078e79568 EFLAGS: 00010292
> [ 62.383810] RAX: ffff880072e35000 RBX: 0000000000000000 RCX: 0000160000000000
> [ 62.383861] RDX: 000000000004136e RSI: 0000000040000000 RDI: 0000000000000000
> [ 62.383912] RBP: ffff880078e79578 R08: 0000000000000000 R09: 0000000000000002
> [ 62.383962] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88007b3ca000
> [ 62.384013] R13: 0000000000000000 R14: ffff880077daa048 R15: ffff880000000000
> [ 62.384064] FS: 00007f95889927e0(0000) GS:ffff88007fc00000(0000)
> knlGS:0000000000000000
> [ 62.384121] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [ 62.384162] CR2: 00000000000000d0 CR3: 0000000078c90000 CR4: 00000000000006f0
> [ 62.384213] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 62.384263] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [ 62.384314] Process mount (pid: 1764, threadinfo ffff880078e78000, task
> ffff88007ab90000)
> [ 62.384371] Stack:
> [ 62.386171] ffff880078e79578 ffff8800729e9320 ffff880078e795d8 ffffffffa0024bdf
> [ 62.386928] ffff880000000002 0000000000000000 ffff880077ff8190 000000000004136f
> [ 62.386928] ffff880078e795d8 ffff88007b3ca000 ffff880072ba6460 ffff8800729e9320
> [ 62.386928] Call Trace:
> [ 62.386928] [<ffffffffa0024bdf>] btrfs_free_tree_block+0xef/0x1a0 [btrfs]
> [ 62.386928] [<ffffffffa001b99a>] __btrfs_cow_block+0x2ca/0x4b0 [btrfs]
> [ 62.386928] [<ffffffffa001c14e>] btrfs_cow_block+0xee/0x200 [btrfs]
> [ 62.386928] [<ffffffffa001ee68>] btrfs_search_slot+0x328/0x730 [btrfs]
> [ 62.386928] [<ffffffffa002919c>] lookup_inline_extent_backref+0xbc/0x400 [btrfs]
> [ 62.386928] [<ffffffff811674ad>] ? kmem_cache_alloc+0xcd/0x120
> [ 62.386928] [<ffffffffa002ac56>] __btrfs_free_extent+0xd6/0x700 [btrfs]
> [ 62.386928] [<ffffffff8107720c>] ? lock_timer_base+0x3c/0x70
> [ 62.386928] [<ffffffff81138c58>] ? bdi_wakeup_thread_delayed+0x38/0x40
> [ 62.386928] [<ffffffffa002b3da>] run_delayed_tree_ref+0x15a/0x160 [btrfs]
> 
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Can't mount, power failure - recoverable?
  2012-03-18 15:16     ` Chris Mason
@ 2012-03-18 18:49       ` Skylar Burtenshaw
  2012-03-19 18:02         ` Chris Mason
  0 siblings, 1 reply; 24+ messages in thread
From: Skylar Burtenshaw @ 2012-03-18 18:49 UTC (permalink / raw)
  To: linux-btrfs

> Looks like some of the bottom of this trace is missing.

That's because I'm amazing, Chris. I don't even know how that happened.. Trying
again:

BUG: unable to handle kernel NULL pointer dereference at 00000000000$
IP: [<ffffffffa0021161>] btrfs_put_block_group+0x11/0x70 [btrfs]
PGD 78e80067 PUD 79588067 PMD 0
Oops: 0002 [#1] SMP
CPU 0
Modules linked in: aoe radeon ttm nfsd drm_kms_helper drm nfs lockd k8temp
fscache auth_rpcgss nfs_acl lp i2c_algo_bit edac_core edac_mce_amd i2c_nforce2
psmouse asus_atk0110 serio_raw parport sunrpc usbhid hid 3w_9xxx floppy
sata_promise pata_sil680 sata_nv forcedeth pata_amd sky2 sata_sil24 nbd btrfs
zlib_deflate libcrc32c
Pid: 1764, comm: mount Not tainted 3.2.0-030200-generic #201201042035 System
manufacturer
        System Product Name/A8N32-SLI-Deluxe
RIP: 0010:[<ffffffffa0021161>]  [<ffffffffa0021161>]
btrfs_put_block_group+0x11/0x70 [btrfs]
RSP: 0018:ffff880078e79568  EFLAGS: 00010292
RAX: ffff880072e35000 RBX: 0000000000000000 RCX: 0000160000000000
RDX: 000000000004136e RSI: 0000000040000000 RDI: 0000000000000000
RBP: ffff880078e79578 R08: 0000000000000000 R09: 0000000000000002
R10: 0000000000000000 R11: 0000000000000000 R12: ffff88007b3ca000
R13: 0000000000000000 R14: ffff880077daa048 R15: ffff880000000000
FS:  00007f95889927e0(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000000000d0 CR3: 0000000078c90000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process mount (pid: 1764, threadinfo ffff880078e78000, task ffff88007ab90000)
Stack:
 ffff880078e79578 ffff8800729e9320 ffff880078e795d8 ffffffffa0024bdf
 ffff880000000002 0000000000000000 ffff880077ff8190 000000000004136f
 ffff880078e795d8 ffff88007b3ca000 ffff880072ba6460 ffff8800729e9320
Call Trace:
 [<ffffffffa0024bdf>] btrfs_free_tree_block+0xef/0x1a0 [btrfs]
 [<ffffffffa001b99a>] __btrfs_cow_block+0x2ca/0x4b0 [btrfs]
 [<ffffffffa001c14e>] btrfs_cow_block+0xee/0x200 [btrfs]
 [<ffffffffa001ee68>] btrfs_search_slot+0x328/0x730 [btrfs]
 [<ffffffffa002919c>] lookup_inline_extent_backref+0xbc/0x400 [btrfs]
 [<ffffffff811674ad>] ? kmem_cache_alloc+0xcd/0x120
 [<ffffffffa002ac56>] __btrfs_free_extent+0xd6/0x700 [btrfs]
 [<ffffffff8107720c>] ? lock_timer_base+0x3c/0x70
 [<ffffffff81138c58>] ? bdi_wakeup_thread_delayed+0x38/0x40
 [<ffffffffa002b3da>] run_delayed_tree_ref+0x15a/0x160 [btrfs]
 [<ffffffffa00586be>] ? memcmp_extent_buffer+0x1de/0x230 [btrfs]
 [<ffffffffa002b5db>] run_one_delayed_ref+0x9b/0xc0 [btrfs]
 [<ffffffffa002b6c0>] run_clustered_refs+0xc0/0x220 [btrfs]
 [<ffffffffa002b8ea>] btrfs_run_delayed_refs+0xca/0x220 [btrfs]
 [<ffffffff81053033>] ? __wake_up+0x53/0x70
 [<ffffffffa003a1c6>] commit_cowonly_roots+0x86/0x1e0 [btrfs]
 [<ffffffffa003afff>] btrfs_commit_transaction+0x42f/0x900 [btrfs]
 [<ffffffffa003ab9d>] ? join_transaction+0x24d/0x280 [btrfs]
 [<ffffffff8108afd0>] ? wake_up_bit+0x40/0x40
 [<ffffffffa0035ae8>] btrfs_commit_super+0x88/0xd0 [btrfs]
 [<ffffffffa0036be0>] close_ctree+0x340/0x3b0 [btrfs]
 [<ffffffff8119648e>] ? iput+0x3e/0x50
 [<ffffffffa00437e4>] ? btrfs_iget+0xf4/0x110 [btrfs]
 [<ffffffffa0016226>] btrfs_fill_super+0x136/0x150 [btrfs]
 [<ffffffff81317c1a>] ? strlcpy+0x4a/0x60
 [<ffffffffa0018185>] btrfs_mount+0x335/0x370 [btrfs]
 [<ffffffff8117f1b3>] mount_fs+0x43/0x1a0
 [<ffffffff8119b633>] vfs_kern_mount+0x63/0xd0
 [<ffffffff8119b722>] do_kern_mount+0x52/0x110
 [<ffffffff812a2c3a>] ? security_capable+0x2a/0x30
 [<ffffffff8119d15d>] do_mount+0x1ed/0x240
 [<ffffffff8119d240>] sys_mount+0x90/0xe0
 [<ffffffff8164f242>] system_call_fastpath+0x16/0x1b
Code: 62 04 e1 eb cd be 95 00 00 00 48 c7 c7 48 fb 08 a0 e8 14 62 04 e1 eb cb 66
90 55 48 89 e5 53 48 83 ec 08 66 66 66 66 90 48 89 fb <f0> ff 8f d0 00 00 00 0f
94 c0 84 c0 74 22 48 83 7f 48 00 75 22
RIP  [<ffffffffa0021161>] btrfs_put_block_group+0x11/0x70 [btrfs]
 RSP <ffff880078e79568>
CR2: 00000000000000d0
---[ end trace dbfc4032ba5e601f ]---
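
For readers skimming the trace: frames prefixed with "?" are addresses the kernel found on the stack but could not confirm as part of the call chain. A minimal sketch that filters them out so the actual path through the mount code is easier to follow; the helper name `reliable_frames` and the regular expression are assumptions made for illustration, written against the frame format shown above:

```python
import re

# One frame of the call trace, for example:
#  [<ffffffffa0036be0>] close_ctree+0x340/0x3b0 [btrfs]
# Group 1 captures the optional "? " marker, group 2 the symbol name.
FRAME = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\?\s+)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)

def reliable_frames(trace):
    """Return symbol names of frames not marked with '?', innermost first."""
    return [m.group(2) for m in FRAME.finditer(trace) if not m.group(1)]

# A slice of the trace above, from the delayed-ref run down to btrfs_mount.
sample = """\
 [<ffffffffa002b8ea>] btrfs_run_delayed_refs+0xca/0x220 [btrfs]
 [<ffffffff81053033>] ? __wake_up+0x53/0x70
 [<ffffffffa003a1c6>] commit_cowonly_roots+0x86/0x1e0 [btrfs]
 [<ffffffffa003afff>] btrfs_commit_transaction+0x42f/0x900 [btrfs]
 [<ffffffffa003ab9d>] ? join_transaction+0x24d/0x280 [btrfs]
 [<ffffffffa0035ae8>] btrfs_commit_super+0x88/0xd0 [btrfs]
 [<ffffffffa0036be0>] close_ctree+0x340/0x3b0 [btrfs]
 [<ffffffffa0016226>] btrfs_fill_super+0x136/0x150 [btrfs]
 [<ffffffffa0018185>] btrfs_mount+0x335/0x370 [btrfs]
"""

print(" <- ".join(reliable_frames(sample)))
```

Read from right to left, the filtered chain shows the oops happening inside a transaction commit that was started from close_ctree during btrfs_fill_super.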


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Can't mount, power failure - recoverable?
  2012-03-18 18:49       ` Skylar Burtenshaw
@ 2012-03-19 18:02         ` Chris Mason
  2012-03-20  3:06           ` Skylar Burtenshaw
  0 siblings, 1 reply; 24+ messages in thread
From: Chris Mason @ 2012-03-19 18:02 UTC (permalink / raw)
  To: Skylar Burtenshaw; +Cc: linux-btrfs

On Sun, Mar 18, 2012 at 06:49:13PM +0000, Skylar Burtenshaw wrote:
> > Looks like some of the bottom of this trace is missing.
> 
> That's because I'm amazing, Chris. I don't even know how that happened.. Trying
> again:

;) Ok, so the good news is that you're crashing when you try to write to
the FS.  The kernel you were running had bugs in btrfs that make power
failures very dangerous.  Starting with 3.2, these are fixed.

The safest way forward from here is to just copy your data off and run a
newer kernel.  Do you have the spare capacity for this?

-chris

> 
> BUG: unable to handle kernel NULL pointer dereference at 00000000000$
> IP: [<ffffffffa0021161>] btrfs_put_block_group+0x11/0x70 [btrfs]
> PGD 78e80067 PUD 79588067 PMD 0
> Oops: 0002 [#1] SMP
> CPU 0
> Modules linked in: aoe radeon ttm nfsd drm_kms_helper drm nfs lockd k8temp
> fscache auth_rpcgss nfs_acl lp i2c_algo_bit edac_core edac_mce_amd i2c_nforce2
> psmouse asus_atk0110 serio_raw parport sunrpc usbhid hid 3w_9xxx floppy
> sata_promise pata_sil680 sata_nv forcedeth pata_amd sky2 sata_sil24 nbd btrfs
> zlib_deflate libcrc32c
> Pid: 1764, comm: mount Not tainted 3.2.0-030200-generic #201201042035 System
> manufacturer
>         System Product Name/A8N32-SLI-Deluxe
> RIP: 0010:[<ffffffffa0021161>]  [<ffffffffa0021161>]
> btrfs_put_block_group+0x11/0x70 [btrfs]
> RSP: 0018:ffff880078e79568  EFLAGS: 00010292
> RAX: ffff880072e35000 RBX: 0000000000000000 RCX: 0000160000000000
> RDX: 000000000004136e RSI: 0000000040000000 RDI: 0000000000000000
> RBP: ffff880078e79578 R08: 0000000000000000 R09: 0000000000000002
> R10: 0000000000000000 R11: 0000000000000000 R12: ffff88007b3ca000
> R13: 0000000000000000 R14: ffff880077daa048 R15: ffff880000000000
> FS:  00007f95889927e0(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 00000000000000d0 CR3: 0000000078c90000 CR4: 00000000000006f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process mount (pid: 1764, threadinfo ffff880078e78000, task ffff88007ab90000)
> Stack:
>  ffff880078e79578 ffff8800729e9320 ffff880078e795d8 ffffffffa0024bdf
>  ffff880000000002 0000000000000000 ffff880077ff8190 000000000004136f
>  ffff880078e795d8 ffff88007b3ca000 ffff880072ba6460 ffff8800729e9320
> Call Trace:
>  [<ffffffffa0024bdf>] btrfs_free_tree_block+0xef/0x1a0 [btrfs]
>  [<ffffffffa001b99a>] __btrfs_cow_block+0x2ca/0x4b0 [btrfs]
>  [<ffffffffa001c14e>] btrfs_cow_block+0xee/0x200 [btrfs]
>  [<ffffffffa001ee68>] btrfs_search_slot+0x328/0x730 [btrfs]
>  [<ffffffffa002919c>] lookup_inline_extent_backref+0xbc/0x400 [btrfs]
>  [<ffffffff811674ad>] ? kmem_cache_alloc+0xcd/0x120
>  [<ffffffffa002ac56>] __btrfs_free_extent+0xd6/0x700 [btrfs]
>  [<ffffffff8107720c>] ? lock_timer_base+0x3c/0x70
>  [<ffffffff81138c58>] ? bdi_wakeup_thread_delayed+0x38/0x40
>  [<ffffffffa002b3da>] run_delayed_tree_ref+0x15a/0x160 [btrfs]
>  [<ffffffffa00586be>] ? memcmp_extent_buffer+0x1de/0x230 [btrfs]
>  [<ffffffffa002b5db>] run_one_delayed_ref+0x9b/0xc0 [btrfs]
>  [<ffffffffa002b6c0>] run_clustered_refs+0xc0/0x220 [btrfs]
>  [<ffffffffa002b8ea>] btrfs_run_delayed_refs+0xca/0x220 [btrfs]
>  [<ffffffff81053033>] ? __wake_up+0x53/0x70
>  [<ffffffffa003a1c6>] commit_cowonly_roots+0x86/0x1e0 [btrfs]
>  [<ffffffffa003afff>] btrfs_commit_transaction+0x42f/0x900 [btrfs]
>  [<ffffffffa003ab9d>] ? join_transaction+0x24d/0x280 [btrfs]
>  [<ffffffff8108afd0>] ? wake_up_bit+0x40/0x40
>  [<ffffffffa0035ae8>] btrfs_commit_super+0x88/0xd0 [btrfs]
>  [<ffffffffa0036be0>] close_ctree+0x340/0x3b0 [btrfs]
>  [<ffffffff8119648e>] ? iput+0x3e/0x50
>  [<ffffffffa00437e4>] ? btrfs_iget+0xf4/0x110 [btrfs]
>  [<ffffffffa0016226>] btrfs_fill_super+0x136/0x150 [btrfs]
>  [<ffffffff81317c1a>] ? strlcpy+0x4a/0x60
>  [<ffffffffa0018185>] btrfs_mount+0x335/0x370 [btrfs]
>  [<ffffffff8117f1b3>] mount_fs+0x43/0x1a0
>  [<ffffffff8119b633>] vfs_kern_mount+0x63/0xd0
>  [<ffffffff8119b722>] do_kern_mount+0x52/0x110
>  [<ffffffff812a2c3a>] ? security_capable+0x2a/0x30
>  [<ffffffff8119d15d>] do_mount+0x1ed/0x240
>  [<ffffffff8119d240>] sys_mount+0x90/0xe0
>  [<ffffffff8164f242>] system_call_fastpath+0x16/0x1b
> Code: 62 04 e1 eb cd be 95 00 00 00 48 c7 c7 48 fb 08 a0 e8 14 62 04 e1 eb cb 66
> 90 55 48 89 e5 53 48 83 ec 08 66 66 66 66 90 48 89 fb <f0> ff 8f d0 00 00 00 0f
> 94 c0 84 c0 74 22 48 83 7f 48 00 75 22
> RIP  [<ffffffffa0021161>] btrfs_put_block_group+0x11/0x70 [btrfs]
>  RSP <ffff880078e79568>
> CR2: 00000000000000d0
> ---[ end trace dbfc4032ba5e601f ]---
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Can't mount, power failure - recoverable?
  2012-03-19 18:02         ` Chris Mason
@ 2012-03-20  3:06           ` Skylar Burtenshaw
  2012-03-26  8:34             ` Skylar Burtenshaw
  0 siblings, 1 reply; 24+ messages in thread
From: Skylar Burtenshaw @ 2012-03-20  3:06 UTC (permalink / raw)
  To: linux-btrfs

> ;) 
I like you. Hehe.. ;)


> Ok, so the good news is that you're crashing when you try to write to
> the FS.

Well, I'll go ahead and assume you're right (you know, since it's your baby and
all, and you know what you're talking about better than anyone I can think of)
but the problem here is that I'm not getting past "mount". "mount /dev/sda3
/mnt/tmp" ....churn churn churn, about a minute or two later, bam, kernel stack
trace. If it's crashing because it's being written to, it's not because I'm
asking it to write. Though on that note, I tried 'mount -o ro,recovery' and got
"mount: Stale NFS file handle" which..made me lol. Same with just '-o ro'. '-o
recovery' produces a stack trace. And, just for the heck of it (because I have
too much time on my hands) I tried chmod'ing -r to all of the dev nodes of the
drives in the array - surprise surprise, same stack trace.


> Starting with 3.2, these are fixed.

This makes me happy, because I feel that I can now trust it more. Especially
since I kicked out the $200 for batteries for that damn UPS.. Rack systems are
expensive.


> The safest way forward from here is to just copy your data off and run a
> newer kernel.

One step ahead of you there, thankfully.


>  Do you have the spare capacity for this?

I'm not sure how cursing is looked upon on this group, so I'll simply say "no"
and let your imagination fill in the blanks..hah. I could probably limp it, if I
was able to transfer approx. 1tb then remove a drive from the filesystem and add
it to my backup system, but that'd get hairy fast. Especially since the FS is
'damaged' and likely would not survive removal. If you say the only way is to
copy data off, I'll find a way - be it credit or borrowing drives - since I know
that's likely what it'll come to...but that's last resort, for me. At this
point, I almost want to hear "you're screwed" just so I can be done with it, but
it would not make me happy. Thanks so much for your support.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Can't mount, power failure - recoverable?
  2012-03-20  3:06           ` Skylar Burtenshaw
@ 2012-03-26  8:34             ` Skylar Burtenshaw
  2012-03-26  8:43               ` Hugo Mills
  2012-03-26  8:44               ` Fajar A. Nugraha
  0 siblings, 2 replies; 24+ messages in thread
From: Skylar Burtenshaw @ 2012-03-26  8:34 UTC (permalink / raw)
  To: linux-btrfs

Hey - been a few days, not meaning to pester but I wanted to make sure my
previous message didn't slip through the cracks. If I offended, I apologize - I
certainly didn't mean to, and my attempts at joviality can come across as
abrasive. If you simply haven't had time to look into this yet, or it's bizarre
enough that it's taking time to isolate, take all the time you need. Thank you.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Can't mount, power failure - recoverable?
  2012-03-26  8:34             ` Skylar Burtenshaw
@ 2012-03-26  8:43               ` Hugo Mills
  2012-03-26  8:51                 ` Skylar Burtenshaw
  2012-03-26  8:44               ` Fajar A. Nugraha
  1 sibling, 1 reply; 24+ messages in thread
From: Hugo Mills @ 2012-03-26  8:43 UTC (permalink / raw)
  To: Skylar Burtenshaw; +Cc: linux-btrfs

On Mon, Mar 26, 2012 at 08:34:46AM +0000, Skylar Burtenshaw wrote:
> Hey - been a few days, not meaning to pester but I wanted to make sure my
> previous message didn't slip through the cracks. If I offended, I apologize - I
> certainly didn't mean to, and my attempts at joviality can come across as
> abrasive. If you simply haven't had time to look into this yet, or it's bizarre
> enough that it's taking time to isolate, take all the time you need. Thank you.

   I suspect that Chris is working hard on getting queued-up patches
ready to go into the 3.4 kernel. He's usually quite quiet while he's
doing that.

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
   --- Anyone who claims their cryptographic protocol is secure is ---   
         either a genius or a fool.  Given the genius/fool ratio         
                 for our species,  the odds aren't good.                 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Can't mount, power failure - recoverable?
  2012-03-26  8:34             ` Skylar Burtenshaw
  2012-03-26  8:43               ` Hugo Mills
@ 2012-03-26  8:44               ` Fajar A. Nugraha
  2012-03-26  8:49                 ` Skylar Burtenshaw
  1 sibling, 1 reply; 24+ messages in thread
From: Fajar A. Nugraha @ 2012-03-26  8:44 UTC (permalink / raw)
  To: Skylar Burtenshaw; +Cc: linux-btrfs

On Mon, Mar 26, 2012 at 3:34 PM, Skylar Burtenshaw <daninfuchs@gmail.com> wrote:
> Hey - been a few days, not meaning to pester but I wanted to make sure my
> previous message didn't slip through the cracks. If I offended, I apologize - I
> certainly didn't mean to, and my attempts at joviality can come across as
> abrasive. If you simply haven't had time to look into this yet, or it's bizarre
> enough that it's taking time to isolate, take all the time you need. Thank you.

Didn't Chris' last response basically say "use kernel 3.2 or newer,
mount the fs (possibly with -o ro), and copy the data elsewhere"? Have
you done that?
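
For reference, the sequence summarised above (kernel 3.2 or newer, read-only
mount, copy the data elsewhere) might be sketched roughly as follows. This is
only a sketch under assumptions: /dev/sdX, /mnt/recover and /mnt/backup are
placeholder names that do not come from this thread, and the helper function
names are made up for illustration. The commands are printed for review
rather than executed, so nothing touches the disks.

```shell
#!/bin/sh
# Sketch only: device and directory names are placeholders, not from this thread.

# Succeeds if the kernel version string $1 (e.g. "3.2.0") is 3.2 or newer.
kernel_at_least_3_2() {
    major=${1%%.*}              # text before the first dot
    rest=${1#*.}                # text after the first dot
    minor=${rest%%.*}           # second version component
    minor=${minor%%[!0-9]*}     # drop suffixes such as "-rc1"
    [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 2 ]; }
}

# Print (do not run) the read-only mount and the copy step.
print_copy_plan() {
    dev=$1; mnt=$2; dest=$3
    echo "mount -t btrfs -o ro $dev $mnt"
    echo "cp -a $mnt/. $dest/"
}

if kernel_at_least_3_2 "$(uname -r)"; then
    print_copy_plan /dev/sdX /mnt/recover /mnt/backup
else
    echo "running kernel is older than 3.2"
fi
```

Printing the plan first makes it easy to double-check the device name before
anything is run as root.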

-- 
Fajar

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Can't mount, power failure - recoverable?
  2012-03-26  8:44               ` Fajar A. Nugraha
@ 2012-03-26  8:49                 ` Skylar Burtenshaw
  2012-03-26  8:56                   ` Fajar A. Nugraha
  2012-07-13 12:23                   ` Martin Steigerwald
  0 siblings, 2 replies; 24+ messages in thread
From: Skylar Burtenshaw @ 2012-03-26  8:49 UTC (permalink / raw)
  To: linux-btrfs

Fajar A. Nugraha <list <at> fajar.net> writes:

> Didn't Chris' last response basically say "use kernel 3.2 or newer,
> mount the fs (possibly with -o ro), and copy the data elsewhere"?

Why yes, yes it did actually. I appreciate your spotlighting it, just in case I
somehow managed to miss it, though.

> Have you done that?

I have. In fact, in my first message, I stated that in all kernels up to present
3.2 kernels, I get several minutes of disk churning, then a stack trace. Also
present in my messages is the fact that the filesystem will not mount, as well
as data output from the recovery program etc which fail to recognize things in
the filesystem that they require in order to fix it. Did you have something you
wished to suggest, in order to help me? If so, I'd gladly listen to any proposed
ideas.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Can't mount, power failure - recoverable?
  2012-03-26  8:43               ` Hugo Mills
@ 2012-03-26  8:51                 ` Skylar Burtenshaw
  0 siblings, 0 replies; 24+ messages in thread
From: Skylar Burtenshaw @ 2012-03-26  8:51 UTC (permalink / raw)
  To: linux-btrfs

Hugo Mills <hugo <at> carfax.org.uk> writes:

>    I suspect that Chris is working hard on getting queued-up patches
> ready to go into the 3.4 kernel. He's usually quite quiet while he's
> doing that.
> 
>    Hugo.
> 


Thanks Hugo - I assumed he was busy, I just wanted to make sure it wasn't a case
of "missed message" or something similar. I'm more than happy to wait patiently
- the whole thing has been sitting for over a year now, it's not as if I'm in a
massive rush. Good, free help comes at its own pace, and I'm thrilled to even be
-receiving- help to be honest. Thank you for the information. I shall now
continue to wait very patiently. :)


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Can't mount, power failure - recoverable?
  2012-03-26  8:49                 ` Skylar Burtenshaw
@ 2012-03-26  8:56                   ` Fajar A. Nugraha
  2012-07-13 12:23                   ` Martin Steigerwald
  1 sibling, 0 replies; 24+ messages in thread
From: Fajar A. Nugraha @ 2012-03-26  8:56 UTC (permalink / raw)
  To: Skylar Burtenshaw; +Cc: linux-btrfs

On Mon, Mar 26, 2012 at 3:49 PM, Skylar Burtenshaw <daninfuchs@gmail.com> wrote:
> Fajar A. Nugraha <list <at> fajar.net> writes:
>
>> Didn't Chris' last response basically say "use kernel 3.2 or newer,
>> mount the fs (possibly with -o ro), and copy the data elsewhere"?
>
> Why yes, yes it did actually. I appreciate your spotlighting it, just in case I
> somehow managed to miss it, though.
>
>> Have you done that?
>
> I have. In fact, in my first message, I stated that in all kernels up to present
> 3.2 kernels, I get several minutes of disk churning, then a stack trace. Also
> present in my messages is the fact that the filesystem will not mount, as well
> as data output from the recovery program etc which fail to recognize things in
> the filesystem that they require in order to fix it. Did you have something you
> wished to suggest, in order to help me? If so, I'd gladly listen to any proposed
> ideas.

Since you apparently tried "-o ro" (which I missed), then my last
suggestion is probably kernel 3.3 with "-o ro". Just in case :)

-- 
Fajar

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Can't mount, power failure - recoverable?
  2012-03-17 19:06   ` Skylar Burtenshaw
@ 2012-07-12  0:47     ` Skylar Burtenshaw
  0 siblings, 0 replies; 24+ messages in thread
From: Skylar Burtenshaw @ 2012-07-12  0:47 UTC (permalink / raw)
  To: linux-btrfs

Skylar Burtenshaw <daninfuchs <at> gmail.com> writes:

> 
> Chris Mason <chris.mason <at> oracle.com> writes:
> 
> > Which kernel was used during the power outage?
> The kernel in use was 2.6.38 or so. Didn't write that down, but I'm fairly
> certain it was .38 or .37 - sorry I can't be more precise.
> 
> Here's the trace. Mount attempt using 3.2.0;
> 


I suppose at this point the filesystem is probably beyond repair, correct? I'm
building a new chassis for this server, because my other array is beginning to
fill, and coming dangerously near full. Any chance of recovery at this point?



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Can't mount, power failure - recoverable?
  2012-03-26  8:49                 ` Skylar Burtenshaw
  2012-03-26  8:56                   ` Fajar A. Nugraha
@ 2012-07-13 12:23                   ` Martin Steigerwald
  2012-07-13 12:28                     ` Hugo Mills
  1 sibling, 1 reply; 24+ messages in thread
From: Martin Steigerwald @ 2012-07-13 12:23 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Skylar Burtenshaw

On Monday, 26 March 2012, Skylar Burtenshaw wrote:
> Fajar A. Nugraha <list <at> fajar.net> writes:
> > Didn't Chris' last response basically say "use kernel 3.2 or newer,
> > mount the fs (possibly with -o ro), and copy the data elsewhere"?
> 
> Why yes, yes it did actually. I appreciate your spotlighting it, just
> in case I somehow managed to miss it, though.
> 
> > Have you done that?
> 
> I have. In fact, in my first message, I stated that in all kernels up
> to present 3.2 kernels, I get several minutes of disk churning, then a
> stack trace. Also present in my messages is the fact that the
> filesystem will not mount, as well as data output from the recovery
> program etc which fail to recognize things in the filesystem that they
> require in order to fix it. Did you have something you wished to
> suggest, in order to help me? If so, I'd gladly listen to any proposed
> ideas.

Since I didn't find any explicit mention of it:

Did you try btrfs-zero-log on the partition prior to mounting it?

All of my "BTRFS will not mount after a sudden write interruption" cases have
been solved by it, except one: a BTRFS RAID 0 with lots of 2 TB drives, at a
time when I didn't know about btrfs-zero-log. Maybe it would have helped
there, too.

Of course I could be completely off track and this could be a completely 
different issue.
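
As a rough sketch of the suggestion above: btrfs-zero-log takes the device as
its argument and clears the log tree, after which a mount can be attempted
again. The device and mount point below are placeholders, not names from this
thread, and the function name is made up for illustration. Since the tool
discards the log, the steps are printed for review instead of being executed.

```shell
#!/bin/sh
# Sketch only: /dev/sdX and /mnt/recover are placeholder names.

# Print (do not run) the zero-log step followed by a read-only mount attempt.
print_zero_log_plan() {
    dev=$1; mnt=$2
    echo "btrfs-zero-log $dev"
    echo "mount -t btrfs -o ro $dev $mnt"
}

print_zero_log_plan /dev/sdX /mnt/recover
```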

-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Can't mount, power failure - recoverable?
  2012-07-13 12:23                   ` Martin Steigerwald
@ 2012-07-13 12:28                     ` Hugo Mills
  2012-07-13 14:38                       ` Martin Steigerwald
  0 siblings, 1 reply; 24+ messages in thread
From: Hugo Mills @ 2012-07-13 12:28 UTC (permalink / raw)
  To: Martin Steigerwald; +Cc: linux-btrfs, Skylar Burtenshaw

On Fri, Jul 13, 2012 at 02:23:53PM +0200, Martin Steigerwald wrote:
> On Monday, 26 March 2012, Skylar Burtenshaw wrote:
> > Fajar A. Nugraha <list <at> fajar.net> writes:
> > > Didn't Chris' last response basically say "use kernel 3.2 or newer,
> > > mount the fs (possibly with -o ro), and copy the data elsewhere"?
> > 
> > Why yes, yes it did actually. I appreciate your spotlighting it, just
> > in case I somehow managed to miss it, though.
> > 
> > > Have you done that?
> > 
> > I have. In fact, in my first message, I stated that in all kernels up
> > to present 3.2 kernels, I get several minutes of disk churning, then a
> > stack trace. Also present in my messages is the fact that the
> > filesystem will not mount, as well as data output from the recovery
> > program etc which fail to recognize things in the filesystem that they
> > require in order to fix it. Did you have something you wished to
> > suggest, in order to help me? If so, I'd gladly listen to any proposed
> > ideas.
> 
> Since I didn't find any explicit mention of it:
> 
> Did you try btrfs-zero-log on the partition prior to mounting it?
> 
> All of my "BTRFS will not mount after a sudden write interruption" cases have
> been solved by it, except one: a BTRFS RAID 0 with lots of 2 TB drives, at a
> time when I didn't know about btrfs-zero-log. Maybe it would have helped
> there, too.
> 
> Of course I could be completely off track and this could be a completely 
> different issue.

   I'm afraid you probably are -- there's nothing I can see in the
stack trace that would indicate that it's falling over in the log tree
replay, which is the only thing that btrfs-zero-log would help with.

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
   --- A diverse working environment:  Di longer you vork here, di ---   
                             verse it gets.                              

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Can't mount, power failure - recoverable?
  2012-07-13 12:28                     ` Hugo Mills
@ 2012-07-13 14:38                       ` Martin Steigerwald
  2012-07-14  1:01                         ` Skylar Burtenshaw
  0 siblings, 1 reply; 24+ messages in thread
From: Martin Steigerwald @ 2012-07-13 14:38 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Hugo Mills, Skylar Burtenshaw

On Friday, 13 July 2012, Hugo Mills wrote:
> On Fri, Jul 13, 2012 at 02:23:53PM +0200, Martin Steigerwald wrote:
> > On Monday, 26 March 2012, Skylar Burtenshaw wrote:
> > > Fajar A. Nugraha <list <at> fajar.net> writes:
> > > > Didn't Chris' last response basically say "use kernel 3.2 or
> > > > newer, mount the fs (possibly with -o ro), and copy the data
> > > > elsewhere"?
> > > 
> > > Why yes, yes it did actually. I appreciate your spotlighting it,
> > > just in case I somehow managed to miss it, though.
> > > 
> > > > Have you done that?
> > > 
> > > I have. In fact, in my first message, I stated that in all kernels
> > > up to present 3.2 kernels, I get several minutes of disk churning,
> > > then a stack trace. Also present in my messages is the fact that
> > > the filesystem will not mount, as well as data output from the
> > > recovery program etc which fail to recognize things in the
> > > filesystem that they require in order to fix it. Did you have
> > > something you wished to suggest, in order to help me? If so, I'd
> > > gladly listen to any proposed ideas.
> > 
> > > Since I didn't find any explicit mention of it:
> > 
> > Did you try btrfs-zero-log on the partition prior to mounting it?
> > 
> > > All of my "BTRFS will not mount after a sudden write interruption"
> > > cases have been solved by it, except one: a BTRFS RAID 0 with lots of
> > > 2 TB drives, at a time when I didn't know about btrfs-zero-log.
> > > Maybe it would have helped there, too.
> > 
> > Of course I could be completely off track and this could be a
> > completely different issue.
> 
>    I'm afraid you probably are -- there's nothing I can see in the
> stack trace that would indicate that it's falling over in the log tree
> replay, which is the only thing that btrfs-zero-log would help with.

Yes.

But why would it write some stuff then on mounting?

Could it be that it tries to update some of its caches (inode or space)? 
But then that also does not seem to be in the trace.

Well, I guess I'll leave that to you BTRFS developers then. Just wanted to
throw in some ideas.

Thanks,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Can't mount, power failure - recoverable?
  2012-07-13 14:38                       ` Martin Steigerwald
@ 2012-07-14  1:01                         ` Skylar Burtenshaw
  2012-07-15 11:20                           ` Martin Steigerwald
  2012-07-15 11:30                           ` Hugo Mills
  0 siblings, 2 replies; 24+ messages in thread
From: Skylar Burtenshaw @ 2012-07-14  1:01 UTC (permalink / raw)
  To: linux-btrfs

Martin Steigerwald <Martin <at> lichtvoll.de> writes:

> > > Since I didn't find any explicit mention of it:
> > > Did you try btrfs-zero-log on the partition prior to mounting it?


I had tried that previously, yes. Approximately the date of my first post.
Unless something significant has changed in that tool, it seems to not be
the answer in this case.


> > > All of my "BTRFS will not mount after a sudden write interruption"
> > > cases have been solved by it, except one: a BTRFS RAID 0 with lots of
> > > 2 TB drives, at a time when I didn't know about btrfs-zero-log.
> > > Maybe it would have helped there, too.

Actually, I have about two dozen drives ranging from 250gb to 2tb, but I
don't think size plays much in this one - I'm obviously just guessing here,
though.

> Yes.
> 
> But why would it write some stuff then on mounting?
> 
> Could it be that it tries to update some of its caches (inode or space)? 
> But then that also does not seem to be in the trace.

I agree with you. I noticed it was trying to write, myself. I have no idea
what it was doing when the power dropped, I wasn't even present, so I can't
even say if it was doing some massive database culling or just idling.

I noticed there've been some recent (since I last looked at least) updates
including fsck and such, however I haven't run anything git-based since the
last time I pulled the btrfs tools, and I had to dig for ages to find info
on how to get the RECENT stuff from the CORRECT source. I can find a dozen
Google results that seem relevant, but can someone give me a definitive 
answer on which tree to pull down (and how) to test the new tools on my mess?

On a related note, if the new tools don't work I'm thinking it's time to bag
it, UNLESS one of the BTRFS devs is curious about my problem and wants to
inspect something to try to figure out the cause. My fear is that with 23
disks and the obscure nature of the issue, it wouldn't be worth the time
it would take to learn what happened. Basically; unless a dev says "let us
look at it first" the chances are it'll be wiped before Monday.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Can't mount, power failure - recoverable?
  2012-07-14  1:01                         ` Skylar Burtenshaw
@ 2012-07-15 11:20                           ` Martin Steigerwald
  2012-07-15 11:30                           ` Hugo Mills
  1 sibling, 0 replies; 24+ messages in thread
From: Martin Steigerwald @ 2012-07-15 11:20 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Skylar Burtenshaw

On Saturday, 14 July 2012, Skylar Burtenshaw wrote:
> Martin Steigerwald <Martin <at> lichtvoll.de> writes:
> > > > Since I didn't find any explicit mention of it:
> > > > Did you try btrfs-zero-log on the partition prior to mounting it?
> 
> I had tried that previously, yes. Approximately the date of my first
> post. Unless something significant has changed in that tool, it seems
> to not be the answer in this case.
> 
> > > > All of my "BTRFS will not mount after a sudden write interruption"
> > > > cases have been solved by it, except one: a BTRFS RAID 0 with lots
> > > > of 2 TB drives, at a time when I didn't know about btrfs-zero-log.
> > > > Maybe it would have helped there, too.
> 
> Actually, I have about two dozen drives ranging from 250gb to 2tb, but
> I don't think size plays much in this one - I'm obviously just
> guessing here, though.
> 
> > Yes.
> > 
> > But why would it write some stuff then on mounting?
> > 
> > Could it be that it tries to update some of its caches (inode or
> > space)? But then that also does not seem to be in the trace.
> 
> I agree with you. I noticed it was trying to write, myself. I have no
> idea what it was doing when the power dropped, I wasn't even present,
> so I can't even say if it was doing some massive database culling or
> just idling.
> 
> I noticed there've been some recent (since I last looked at least)
> updates including fsck and such, however I haven't run anything
> git-based since the last time I pulled the btrfs tools, and I had to
> dig for ages to find info on how to get the RECENT stuff from the
> CORRECT source. I can find a dozen Google results that seem relevant,
> but can someone give me a definitive answer on which tree to pull down
> (and how) to test the new tools on my mess?

Hehe, I looked for it myself for quite some time.

Last time I used:

martin@merkaba:~/Linux/Kernel/BTRFS/btrfs-progs> git remote -v
origin  git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs.git (fetch)
origin  git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs.git (push)
martin@merkaba:~/Linux/Kernel/BTRFS/btrfs-progs> git branch -a
* dangerdonteveruse
  master
  remotes/origin/HEAD -> origin/master
  remotes/origin/dangerdonteveruse
  remotes/origin/integration-scrub
  remotes/origin/master
  remotes/origin/parser
  remotes/origin/recovery-beta

But the last change on that branch (dangerdonteveruse) is:

commit 1957076ab4fefa47b6efed3da541bc974c83eed7
Author: Chris Mason <chris.mason@oracle.com>
Date:   Wed Mar 28 12:05:27 2012 -0400

    Add incompat flag for big metadata blocks
    
    Signed-off-by: Chris Mason <chris.mason@oracle.com>


Hmmm, master branch seems to be quite current:

commit 8935d8436147f86dfbda3d8b8175a77b654b8abc
Author: David Sterba <dsterba@suse.cz>
Date:   Fri Jul 6 10:11:10 2012 -0400

    btrfs-progs: mkfs: add option to skip trim
    
    Signed-off-by: Chris Mason <chris.mason@fusionio.com>
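
For anyone wanting to repeat this, a minimal sketch of fetching the tools from
the repository shown in the "git remote -v" output above and checking the most
recent commit on master could look like the following. The target directory
name is a placeholder and the function name is made up for illustration. The
commands are only printed here, so they can be reviewed (and the URL checked
against the current list of official repositories) before running them.

```shell
#!/bin/sh
# Sketch only: the URL is the one from the "git remote -v" output in this
# message; the checkout directory is a placeholder.
REPO=git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs.git

# Print (do not run) the steps to clone, inspect and build btrfs-progs.
print_fetch_plan() {
    dir=$1
    echo "git clone $REPO $dir"
    echo "cd $dir"
    echo "git checkout master"
    echo "git log -1"      # shows the most recent commit on the branch
    echo "make"
}

print_fetch_plan btrfs-progs
```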

Ciao,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Can't mount, power failure - recoverable?
  2012-07-14  1:01                         ` Skylar Burtenshaw
  2012-07-15 11:20                           ` Martin Steigerwald
@ 2012-07-15 11:30                           ` Hugo Mills
  1 sibling, 0 replies; 24+ messages in thread
From: Hugo Mills @ 2012-07-15 11:30 UTC (permalink / raw)
  To: Skylar Burtenshaw; +Cc: linux-btrfs

On Sat, Jul 14, 2012 at 01:01:04AM +0000, Skylar Burtenshaw wrote:
> I noticed there've been some recent (since I last looked at least) updates
> including fsck and such, however I haven't run anything git-based since the
> last time I pulled the btrfs tools, and I had to dig for ages to find info
> on how to get the RECENT stuff from the CORRECT source. I can find a dozen
> Google results that seem relevant, but can someone give me a definitive 
> answer on which tree to pull down (and how) to test the new tools on my mess?

   This is the definitive source on where to get things:

   https://btrfs.wiki.kernel.org/index.php/Btrfs_source_repositories

   You will need the official -progs repository, as that's most up to
date right now.

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
       --- Great oxymorons of the world, no. 4: Future Perfect ---       

^ permalink raw reply	[flat|nested] 24+ messages in thread

