* Connection lost during BTRFS move + resize
@ 2021-11-29  8:48 Borden
  2021-11-29 15:26 ` Phillip Susi
  0 siblings, 1 reply; 9+ messages in thread
From: Borden @ 2021-11-29  8:48 UTC (permalink / raw)
  To: linux-btrfs

Good morning,

I couldn't find any definitive guidance on the list archives or Internet, so I want to double-check before giving up.

I tried to left-move and resize a btrfs partition on a USB-attached hard drive. My intention was to expand the partition from 2 TB to 3 TB on a 4TB drive. During the move, the USB cable came loose and the process failed.

From what I can tell, the partition was "moved" to its new location and correctly shows the usage, but the partition is not expanded to its new 3 TB limit. As one would expect, sudo mount -o ro /dev/sdb3 /mnt yields:
mount: /mnt: wrong fs type, bad option, bad superblock on /dev/sdb3, missing codepage or helper program, or other error.

Although I know it's extremely dangerous, I nevertheless ran btrfs check --repair /dev/sdb3 and received:
Starting repair.
Opening filesystem to check...
checksum verify failed on 2160397959168 wanted 0x2d75ada8 found 0x55dc86b3
checksum verify failed on 2160397959168 wanted 0x5c57dcfd found 0xe722d853
checksum verify failed on 2160397959168 wanted 0x5c57dcfd found 0xe722d853
bad tree block 2160397959168, bytenr mismatch, want=2160397959168, have=8937084726424501725
Couldn't read tree root
ERROR: cannot open file system

As requested:
uname -a: Linux debian 5.15.0-1-amd64 #1 SMP Debian 5.15.3-1 (2021-11-18) x86_64 GNU/Linux
btrfs --version: btrfs-progs v5.15
sudo btrfs fi show: Label: 'Backup'  uuid: <drive UUID, still good>
  Total devices 1 FS bytes used 1.71TiB
  devid    1 size 1.82TiB used 1.73TiB path /dev/sdb3
dmesg: output at http://paste.debian.net/1221191/, valid until 2021-12-02

From other discussions, it seems like the partition's contents are gone for good, which is OK because it's a backup and I can start over. However, if I can recover the data, that would be nice, too, since I might have to dig up something from a while back.

With many thanks,


* Re: Connection lost during BTRFS move + resize
  2021-11-29  8:48 Connection lost during BTRFS move + resize Borden
@ 2021-11-29 15:26 ` Phillip Susi
  2021-11-29 15:50   ` Borden
  2021-12-03 18:31   ` Chris Murphy
  0 siblings, 2 replies; 9+ messages in thread
From: Phillip Susi @ 2021-11-29 15:26 UTC (permalink / raw)
  To: Borden; +Cc: linux-btrfs


Borden <borden_c@tutanota.com> writes:

> Good morning,
>
> I couldn't find any definitive guidance on the list archives or Internet, so I want to double-check before giving up.
>
> I tried to left-move and resize a btrfs partition on a USB-attached
> hard drive. My intention was to expand the partition from 2 TB to 3 TB
> on a 4TB drive. During the move, the USB cable came loose and the
> process failed.

The only tool I know of that can do this is gparted, so I assume you are
using that.  In this case, it has to umount the filesystem and manually
copy data from the old start of the partition to the new start.  Being
interrupted in the middle leaves part of the filesystem in the wrong
place ( and which parts is unknowable ), and so it is toast.  This is
one area where LVM has a significant advantage as its moves are
interruption safe and automatically resumed on the next activation of
the volume.



* Re: Connection lost during BTRFS move + resize
  2021-11-29 15:26 ` Phillip Susi
@ 2021-11-29 15:50   ` Borden
  2021-11-29 16:12     ` Graham Cobb
                       ` (2 more replies)
  2021-12-03 18:31   ` Chris Murphy
  1 sibling, 3 replies; 9+ messages in thread
From: Borden @ 2021-11-29 15:50 UTC (permalink / raw)
  To: Phillip Susi; +Cc: linux-btrfs

29 Nov 2021, 10:26 by phill@thesusis.net:
> The only tool I know of that can do this is gparted, so I assume you are
> using that.  In this case, it has to umount the filesystem and manually
> copy data from the old start of the partition to the new start.  Being
> interrupted in the middle leaves part of the filesystem in the wrong
> place ( and which parts is unknowable ), and so it is toast.  This is
> one area where LVM has a significant advantage as its moves are
> interruption safe and automatically resumed on the next activation of
> the volume.
>
This is the answer that I anticipated, and it's good to know now so I don't destroy data that I _cannot_ afford to lose later. So thank you.

For my own education/curiosity/intellectual banter: ddrescue, badblocks, rsync and other utilities have log files that track progress and allow them to resume if they are interrupted. Since move operations work in the linear process you described, how hard would it be, theoretically, to implement a "needle position" in a move operation to allow a move to pick up where it left off?

Obviously, it wouldn't be 100% perfect, but if a recovery utility could look at the disk and say "partition starts here, skip a bit somewhere in the middle, continue here, stop there," surely that would be more efficient than trying to recover files with a low-level utility?


* Re: Connection lost during BTRFS move + resize
  2021-11-29 15:50   ` Borden
@ 2021-11-29 16:12     ` Graham Cobb
  2021-12-01 18:42       ` Matthew Warren
  2021-11-29 16:20     ` Phillip Susi
  2021-12-03 18:47     ` Chris Murphy
  2 siblings, 1 reply; 9+ messages in thread
From: Graham Cobb @ 2021-11-29 16:12 UTC (permalink / raw)
  To: Borden, Phillip Susi; +Cc: linux-btrfs

On 29/11/2021 15:50, Borden wrote:
> 29 Nov 2021, 10:26 by phill@thesusis.net:
>> The only tool I know of that can do this is gparted, so I assume you are
>> using that.  In this case, it has to umount the filesystem and manually
>> copy data from the old start of the partition to the new start.  Being
>> interrupted in the middle leaves part of the filesystem in the wrong
>> place ( and which parts is unknowable ), and so it is toast.  This is
>> one area where LVM has a significant advantage as its moves are
>> interruption safe and automatically resumed on the next activation of
>> the volume.
>>
> This is the answer that I anticipated, and it's good to know now so I don't destroy data that I _cannot_ afford to lose later. So thank you.
> 
> For my own education/curiosity/intellectual banter: ddrescue, badblocks, rsync and other utilities have log files that track progress and allow it to resume if it's interrupted. Since resize operations work in the linear process you described, how hard would it be, theoretically, to implement a "needle position" in a move operation to allow a move to pick up where it left off?
> 
> Obviously, it wouldn't be 100% perfect, but if a recovery utility could look at the disk and say "partition starts here, skip a bit somewhere in the middle, continue here, stop there," surely that would be more efficient than trying to recover files with a low-level utility?
> 

I can't comment on that, and I don't know how the utility you were using
works, but if it was copying blocks from higher disk addresses to lower
ones, starting at the bottom, it is *possible* that it hadn't got beyond
the first 1TB before it failed and the original filesystem is still
untouched.

Did you try just resetting the original partition parameters manually
(forcing them using something like GNU parted's mkpart - not a resize
operation) to see whether the original filesystem could be mounted? It's
just a long shot, of course.

Graham


* Re: Connection lost during BTRFS move + resize
  2021-11-29 15:50   ` Borden
  2021-11-29 16:12     ` Graham Cobb
@ 2021-11-29 16:20     ` Phillip Susi
  2021-11-29 17:20       ` Borden
  2021-12-03 18:47     ` Chris Murphy
  2 siblings, 1 reply; 9+ messages in thread
From: Phillip Susi @ 2021-11-29 16:20 UTC (permalink / raw)
  To: Borden; +Cc: linux-btrfs


Borden <borden_c@tutanota.com> writes:

> For my own education/curiosity/intellectual banter: ddrescue,
> badblocks, rsync and other utilities have log files that track
> progress and allow it to resume if it's interrupted. Since resize
> operations work in the linear process you described, how hard would it
> be, theoretically, to implement a "needle position" in a move
> operation to allow a move to pick up where it left off?

Theoretically it shouldn't be too hard.  It's just a matter of deciding
on a location where you can safely record the checkpoint information and
then update the checkpoint between blocks.  That's how LVM handles
moves safely.  In the worst case, you restart the move at the last
checkpoint and just waste some time copying data that was already copied
but not checkpointed.
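
As a rough illustration of the checkpointing idea described above, here is
a minimal Python sketch. It is not how LVM or gparted actually implement
moves. The function names, the checkpoint file format (a single decimal
byte count) and the fail_after parameter (a demonstration hook that
simulates the cable coming loose) are all assumptions made for this
example. The one rule that matters is that the copy chunk must be no
larger than the distance moved, so a chunk that gets redone after a crash
never reads source data that was already overwritten.

```python
import os


def load_checkpoint(path):
    """Return the number of bytes known to be safely moved (0 if no file)."""
    try:
        with open(path) as f:
            return int(f.read().strip() or 0)
    except FileNotFoundError:
        return 0


def save_checkpoint(path, done):
    """Atomically record progress: write a temp file, fsync, then rename."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        f.write(str(done))
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)


def move_left(dev_path, src, dst, length, ckpt_path,
              chunk=1 << 20, fail_after=None):
    """Move `length` bytes from offset `src` down to offset `dst` (dst < src).

    Progress is recorded in `ckpt_path` after every chunk, so calling the
    function again after an interruption resumes at the last checkpoint.
    `fail_after` is a hypothetical test hook, not part of any real tool.
    """
    if not dst < src:
        raise ValueError("this sketch only handles moves to a lower offset")
    # A chunk no larger than the shift guarantees that writing chunk i only
    # overwrites source bytes belonging to chunks that are already recorded.
    chunk = min(chunk, src - dst)
    done = load_checkpoint(ckpt_path)
    chunks_this_run = 0
    with open(dev_path, "r+b") as dev:
        while done < length:
            if fail_after is not None and chunks_this_run >= fail_after:
                raise OSError("simulated interruption")
            n = min(chunk, length - done)
            dev.seek(src + done)
            buf = dev.read(n)
            dev.seek(dst + done)
            dev.write(buf)
            dev.flush()
            os.fsync(dev.fileno())      # data must be durable first ...
            done += n
            save_checkpoint(ckpt_path, done)  # ... then the checkpoint
            chunks_this_run += 1
    return done
```

In the worst case an interrupted run repeats one chunk that was written
but not yet recorded, which matches the "waste some time copying data that
was already copied" case above.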




* Re: Connection lost during BTRFS move + resize
  2021-11-29 16:20     ` Phillip Susi
@ 2021-11-29 17:20       ` Borden
  0 siblings, 0 replies; 9+ messages in thread
From: Borden @ 2021-11-29 17:20 UTC (permalink / raw)
  To: linux-btrfs

29 Nov 2021, 11:20 by phill@thesusis.net:
> Theoretically it shouldn't be too hard.  It's just a matter of deciding
> on a location where you can safely record the checkpoint information and
> then update the checkpoint between blocks.  That's how LVM handles
> moves safely.  In the worst case, you restart the move at the last
> checkpoint and just waste some time copying data that was already copied
> but not checkpointed.
>
Thanks again. In those other utilities, since the logs get written to a plain text file of the user's choosing, and partition moving should be offline, anyhow, it would be reasonable to expect the user to provide a safe location to stash a checkpoint file.

And, of course, if the user chooses to manipulate the file and/or forgo it altogether, their funeral.
Not sure if it's a feature worth requesting, but it would keep people like me from bothering the list with these Hail Mary support requests, so it would probably be net-cost negative.


* Re: Connection lost during BTRFS move + resize
  2021-11-29 16:12     ` Graham Cobb
@ 2021-12-01 18:42       ` Matthew Warren
  0 siblings, 0 replies; 9+ messages in thread
From: Matthew Warren @ 2021-12-01 18:42 UTC (permalink / raw)
  To: Graham Cobb; +Cc: Borden, Phillip Susi, linux-btrfs

Even if it wrote more than 1 TB and started overwriting the file
system, it SHOULD be possible to find what data still needs to be
written by searching backwards from the old partition's end point and
the new partition's end point until you start finding identical data.

Matthew Warren

On Mon, Nov 29, 2021 at 5:07 PM Graham Cobb <g.btrfs@cobb.uk.net> wrote:
>
> On 29/11/2021 15:50, Borden wrote:
> > 29 Nov 2021, 10:26 by phill@thesusis.net:
> >> The only tool I know of that can do this is gparted, so I assume you are
> >> using that.  In this case, it has to umount the filesystem and manually
> >> copy data from the old start of the partition to the new start.  Being
> >> interrupted in the middle leaves part of the filesystem in the wrong
> >> place ( and which parts is unknowable ), and so it is toast.  This is
> >> one area where LVM has a significant advantage as its moves are
> >> interruption safe and automatically resumed on the next activation of
> >> the volume.
> >>
> > This is the answer that I anticipated, and it's good to know now so I don't destroy data that I _cannot_ afford to lose later. So thank you.
> >
> > For my own education/curiosity/intellectual banter: ddrescue, badblocks, rsync and other utilities have log files that track progress and allow it to resume if it's interrupted. Since resize operations work in the linear process you described, how hard would it be, theoretically, to implement a "needle position" in a move operation to allow a move to pick up where it left off?
> >
> > Obviously, it wouldn't be 100% perfect, but if a recovery utility could look at the disk and say "partition starts here, skip a bit somewhere in the middle, continue here, stop there," surely that would be more efficient than trying to recover files with a low-level utility?
> >
>
> I can't comment on that, and I don't know how the utility you were using
> works, but if it was copying blocks from higher disk addresses to lower
> ones, starting at the bottom, it is *possible* that it hadn't got beyond
> the first 1TB before it failed and the original filesystem is still
> untouched.
>
> Did you try just resetting the original partition parameters manually
> (forcing them using something like GNU parted's mkpart - not a resize
> operation) to see whether the original filesystem could be mounted? It's
> just a long shot, of course.
>
> Graham


* Re: Connection lost during BTRFS move + resize
  2021-11-29 15:26 ` Phillip Susi
  2021-11-29 15:50   ` Borden
@ 2021-12-03 18:31   ` Chris Murphy
  1 sibling, 0 replies; 9+ messages in thread
From: Chris Murphy @ 2021-12-03 18:31 UTC (permalink / raw)
  To: Phillip Susi; +Cc: Borden, Btrfs BTRFS

On Mon, Nov 29, 2021 at 10:33 AM Phillip Susi <phill@thesusis.net> wrote:
>
>
> Borden <borden_c@tutanota.com> writes:
>
> > Good morning,
> >
> > I couldn't find any definitive guidance on the list archives or Internet, so I want to double-check before giving up.
> >
> > I tried to left-move and resize a btrfs partition on a USB-attached
> > hard drive. My intention was to expand the partition from 2 TB to 3 TB
> > on a 4TB drive. During the move, the USB cable came loose and the
> > process failed.
>
> The only tool I know of that can do this is gparted, so I assume you are
> using that.  In this case, it has to umount the filesystem and manually
> copy data from the old start of the partition to the new start.  Being
> interrupted in the middle leaves part of the filesystem in the wrong
> place ( and which parts is unknowable ), and so it is toast.  This is
> one area where LVM has a significant advantage as its moves are
> interruption safe and automatically resumed on the next activation of
> the volume.

Whether LVM or Btrfs, you can just add the earlier partition to the
storage pool. No need to move extents around, and near as I can tell
no advantage of doing so.

i.e. if the btrfs is on vda2 and it's now desired to expand the file
system "forward" into the space defined by vda1, just add it

btrfs device add /dev/vda1 /mnt

The command implies an abbreviated mkfs on vda1, and resizes the btrfs
file system to encompass both devices. And it's also an interrupt safe
operation unlike gparted move/resize which should come with dire
warnings about the consequence for any interruption.


-- 
Chris Murphy


* Re: Connection lost during BTRFS move + resize
  2021-11-29 15:50   ` Borden
  2021-11-29 16:12     ` Graham Cobb
  2021-11-29 16:20     ` Phillip Susi
@ 2021-12-03 18:47     ` Chris Murphy
  2 siblings, 0 replies; 9+ messages in thread
From: Chris Murphy @ 2021-12-03 18:47 UTC (permalink / raw)
  To: Borden; +Cc: Phillip Susi, Btrfs BTRFS

On Mon, Nov 29, 2021 at 10:52 AM Borden <borden_c@tutanota.com> wrote:
>
> 29 Nov 2021, 10:26 by phill@thesusis.net:
> > The only tool I know of that can do this is gparted, so I assume you are
> > using that.  In this case, it has to umount the filesystem and manually
> > copy data from the old start of the partition to the new start.  Being
> > interrupted in the middle leaves part of the filesystem in the wrong
> > place ( and which parts is unknowable ), and so it is toast.  This is
> > one area where LVM has a significant advantage as its moves are
> > interruption safe and automatically resumed on the next activation of
> > the volume.
> >
> This is the answer that I anticipated, and it's good to know now so I don't destroy data that I _cannot_ afford to lose later. So thank you.
>
> For my own education/curiosity/intellectual banter: ddrescue, badblocks, rsync and other utilities have log files that track progress and allow it to resume if it's interrupted. Since resize operations work in the linear process you described, how hard would it be, theoretically, to implement a "needle position" in a move operation to allow a move to pick up where it left off?

Trivial as a thought exercise, and not difficult for someone who can
do some scripting or coding. Do a block by block comparison to find
out where the "forward" (new location) writes ended. You can make a
bunch of assumptions that at the moment of failure, only a specific
contiguous set of sectors were in the process of receiving writes. So
it's just a matter of finding where it left off. Optimization:
probably some minimal sampling you could do to get in the ballpark
then tighten it up to a block by block to find out exactly where it
stopped.
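
A minimal Python sketch of that search, under some loud assumptions: the
move went to a lower offset and copied forward from the start, blocks that
were never copied do not happen to be identical at the two locations (long
runs of zeros would break this), and the sampling stride times the block
size is smaller than both the distance moved and the amount already
copied. The function names and parameters are made up for this example and
do not come from any existing tool.

```python
def blocks_match(f, src, dst, index, bs):
    """Compare block `index` at the old location with the new location."""
    f.seek(src + index * bs)
    old = f.read(bs)
    f.seek(dst + index * bs)
    new = f.read(bs)
    return old == new


def find_copy_end(path, src, dst, length, bs=4096, stride=256):
    """Estimate how many blocks were copied before the interruption.

    `src` is the old start offset and `dst` the new start offset, in bytes,
    with dst < src. Returns the index of the first block that was NOT
    copied, or None if no matching block was sampled at all.
    """
    nblocks = length // bs
    with open(path, "rb") as f:
        # Coarse pass: sample backwards from the end until a block is
        # identical at both locations. Just behind the stop point the old
        # copy has not been overwritten yet, so both copies still exist.
        hit = None
        for i in range(nblocks - 1, -1, -stride):
            if blocks_match(f, src, dst, i, bs):
                hit = i
                break
        if hit is None:
            return None
        # Fine pass: block `lo` matches; block `hi` is either a sample that
        # did not match or one past the end. Bisect to the exact boundary.
        lo, hi = hit, min(hit + stride, nblocks)
        while hi - lo > 1:
            mid = (lo + hi) // 2
            if blocks_match(f, src, dst, mid, bs):
                lo = mid
            else:
                hi = mid
        return hi
```

The result is only an estimate at block granularity and should be
verified, for example by comparing a run of blocks on either side of the
reported boundary, before anything is written to the disk.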

Instead of resuming the move, you can just use some device mapper
commands to create a range (offset+length) for the two segments and
make the two appear in a single virtual device (i.e. you delete the
gap between the new range and old range). And now you can mount it and
do whatever you want, back it up somewhere.
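
To make the device mapper idea concrete, here is a small Python sketch
that only builds the table text for two "linear" targets. It assumes that
all values are in 512-byte sectors, that the first `copied` sectors are
valid at the new start, and that the rest is still intact at its old
location. The function name, the placeholder device /dev/sdX and the
numbers in the usage note are hypothetical. Nothing here touches a disk.

```python
def linear_table(device, new_start, old_start, copied, total):
    """Build a two-segment device-mapper table for a half-finished move.

    All numeric arguments are in 512-byte sectors, relative to the whole
    disk `device`. Each line has the form:
        <logical start> <length> linear <device> <offset on device>
    """
    if not 0 <= copied <= total:
        raise ValueError("copied must be between 0 and total")
    lines = []
    if copied:
        # Part that already made it to the new location.
        lines.append(f"0 {copied} linear {device} {new_start}")
    if copied < total:
        # Part that is still sitting at its old location.
        lines.append(
            f"{copied} {total - copied} linear {device} {old_start + copied}"
        )
    return "\n".join(lines)
```

For example, linear_table("/dev/sdX", 2048, 10240, 1000, 5000) yields the
two lines "0 1000 linear /dev/sdX 2048" and "1000 4000 linear /dev/sdX
11240". A table like that could then be fed to something along the lines
of "dmsetup create recovered --readonly" on standard input, and the
resulting /dev/mapper/recovered mounted read-only. Treat that invocation
as an untested suggestion and check dmsetup(8) before running it.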

> Obviously, it wouldn't be 100% perfect, but if a recovery utility could look at the disk and say "partition starts here, skip a bit somewhere in the middle, continue here, stop there," surely that would be more efficient than trying to recover files with a low-level utility?

Yes.


-- 
Chris Murphy

