* Re: Run mkfs.ext4 on an already existing ext4 filesystem. [Solved]
@ 2019-04-18 10:54 Andrea Lo Pumo
  2019-04-18 18:55 ` Andreas Dilger
  0 siblings, 1 reply; 2+ messages in thread
From: Andrea Lo Pumo @ 2019-04-18 10:54 UTC
  Cc: linux-ext4

I have been able to recover the files. It seems that the filesystem
was unmounted shortly after it had been mounted, so ext4lazyinit did
not have time to overwrite all the inode tables.

Below is a more detailed report, in the hope that it will be useful to
others. A review would be appreciated; I will then write it up as a
tutorial somewhere.

Also, could the modified do_dump() command be usefully integrated into debugfs?

---- Report ---

On /dev/sda1 there was an ext4 filesystem with a lot of large files.
By mistake, mkfs.ext4 was run on /dev/sda1. As a result, /dev/sda1
now appears "empty": mounting it shows no files.
Luckily, /dev/sda1 was unmounted immediately, and the majority of the
files were recovered.
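
Why an early unmount helps: with lazy inode table initialization, the
inode tables are zeroed in the background after mounting, one block
group after another, so inodes in groups that were not yet reached can
survive. The following is a minimal sketch (not part of the modified
debugfs) of how an inode number maps to a block group and to a slot in
that group's inode table. The struct and function names are
hypothetical; the two parameters correspond to the superblock fields
s_inodes_per_group and s_inode_size.

```c
#include <stdint.h>

/* Where an inode lives inside the per-group inode tables. */
struct inode_location {
    uint32_t group;        /* block group that holds the inode */
    uint32_t index;        /* slot within that group's inode table */
    uint64_t byte_offset;  /* offset from the start of that table */
};

/*
 * Inode numbers are 1-based, so inode 1 is slot 0 of group 0.
 * inodes_per_group and inode_size are taken from the superblock.
 */
struct inode_location locate_inode(uint32_t ino,
                                   uint32_t inodes_per_group,
                                   uint16_t inode_size)
{
    struct inode_location loc;

    loc.group = (ino - 1) / inodes_per_group;
    loc.index = (ino - 1) % inodes_per_group;
    loc.byte_offset = (uint64_t)loc.index * inode_size;
    return loc;
}
```

The values 8192 inodes per group and 256-byte inodes used to try this
out are only example values; the real ones are printed by the stats
command of debugfs.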

I modified the debugfs / libext2fs code to ignore the checksums of
files and directories (see Note 1), and ran debugfs -c (catastrophic
mode).
The recovery itself was then done with the ls -l and rdump commands of
debugfs.

Since we were not interested in recovering all files, but only certain
directories, I proceeded as follows:

   - I modified do_dump() in debugfs/dump.c to dump all directories
(here "directory" means the special file that contains the list of
sub-files and sub-directories). The core of the change is this loop:

        for (inode = 1; inode < (ext2_ino_t)0xffffffff; inode++) {
           ....
           sprintf(outFilename, "%u", inode);
           out_fn = outFilename;

           if (inode % 1000000 == 0) {
              /* print the current inode as a progress report.
               * There are a total of 2**32-1 inodes. */
              printf("%u\n", inode);
           }

           unsigned int blocks = inode_blocks(inode);
           if (blocks == 0)
              continue;

           ....

           dump_file(argv[0], inode, fd, preserve, out_fn);

           ....

         }

   inode_blocks() is a custom function, attached at the end of this
email. It selects only inodes of type directory that have a block
count > 0, a link count > 0 and a deletion time of 0.
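
The loop above walks all 2**32-1 possible inode numbers. Valid inode
numbers only run from 1 to the superblock's s_inodes_count, so
stopping there should shorten the scan. Below is a minimal,
self-contained sketch of the same filter over an in-memory table. It
is an illustration under assumptions, not the debugfs code: struct
mini_inode is a hypothetical stand-in for the few struct ext2_inode
fields the filter reads, and find_live_directories() is a made-up
name.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the struct ext2_inode fields the filter reads. */
struct mini_inode {
    uint16_t i_mode;
    uint16_t i_links_count;
    uint32_t i_dtime;
    uint32_t i_blocks;
};

#define MODE_TYPE_MASK 0170000u  /* file type bits of i_mode */
#define MODE_DIRECTORY 0040000u  /* type value for a directory */

/* Same criteria as inode_blocks(): blocks > 0, links > 0, not deleted, directory. */
static int is_live_directory(const struct mini_inode *ino)
{
    return ino->i_blocks != 0 && ino->i_links_count != 0 &&
           ino->i_dtime == 0 &&
           (ino->i_mode & MODE_TYPE_MASK) == MODE_DIRECTORY;
}

/*
 * Store the 1-based inode numbers of all live directories in out[].
 * table[n - 1] holds inode n. Returns how many numbers were stored.
 */
size_t find_live_directories(const struct mini_inode *table,
                             uint32_t inodes_count,
                             uint32_t *out, size_t out_cap)
{
    size_t found = 0;

    /* A 64-bit counter avoids wrap-around if inodes_count is 2**32-1. */
    for (uint64_t ino = 1; ino <= inodes_count && found < out_cap; ino++) {
        if (is_live_directory(&table[ino - 1]))
            out[found++] = (uint32_t)ino;
    }
    return found;
}
```

In the modified do_dump(), the equivalent change would be to use
current_fs->super->s_inodes_count as the upper bound of the loop.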

After compiling the modified debugfs, issue the dump <1> 111 command
(the arguments are irrelevant) to write out all the directory-files of
the filesystem, one output file per inode number:
  mkdir recover
  cd recover
  sudo path/to/debugfs/debugfs -c /dev/loop12
  debugfs>   dump <1> 111

Then run "grep -r NAME recover/", where NAME is the name of a
sub-directory or a sub-file of the directory you are looking for.
If the inode you are looking for was not destroyed, grep will print
something like:
   Binary file recover/7452143 matches
If you get more than one match, use the "strings" command, e.g.
strings recover/7452143, and you will get a (somewhat garbled) list of
the files and directories inside the directory that was dumped as
recover/7452143, so that you can choose the appropriate match.
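
The output of strings is garbled because a dumped directory is a
sequence of binary records. As a cleaner alternative, here is a
minimal sketch that walks the linear directory entries of one block
(4-byte inode, 2-byte record length, 1-byte name length, 1-byte file
type, then the name, all little-endian). It assumes a block size below
64 KiB and the file type field in use; struct dirent_info and
list_dirents() are hypothetical names, and htree index blocks are not
decoded.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One decoded directory entry. */
struct dirent_info {
    uint32_t inode;
    uint8_t  file_type;
    char     name[256];   /* name_len is at most 255, plus the NUL */
};

static uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

static uint16_t le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Decode the entries of one directory block into out[].
 * Stops at the first record whose length is implausible.
 * Returns the number of entries stored.
 */
size_t list_dirents(const uint8_t *block, size_t block_size,
                    struct dirent_info *out, size_t cap)
{
    size_t off = 0, n = 0;

    while (off + 8 <= block_size && n < cap) {
        uint32_t inode = le32(block + off);
        uint16_t rec_len = le16(block + off + 4);
        uint8_t name_len = block[off + 6];

        /* A record is at least the 8-byte header and is 4-byte aligned. */
        if (rec_len < 8 || rec_len % 4 != 0 || off + rec_len > block_size)
            break;

        /* inode 0 marks an unused record (or the checksum tail). */
        if (inode != 0 && name_len != 0 && (size_t)name_len + 8 <= rec_len) {
            out[n].inode = inode;
            out[n].file_type = block[off + 7];
            memcpy(out[n].name, block + off + 8, name_len);
            out[n].name[name_len] = '\0';
            n++;
        }
        off += rec_len;
    }
    return n;
}
```

To use it on a dumped file such as recover/7452143, read the file one
filesystem block at a time and call list_dirents() on each block; the
inode numbers it returns can then be passed to ls -l or rdump.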

Now, assuming that you settled on a match, that number, e.g. 7452143,
is the inode number of the directory.

Open debugfs -c, and do

  debugfs > ls -l <7452143>

    ... list of files of <7452143> ...

  debugfs > rdump <7452143> .

rdump will recursively dump the content of the directory.

Note 1: I see that debugfs has a -n option to disable checksum
verification. Perhaps it does the same as what I have done? In detail,
I disabled the following errors: EXT2_ET_DIR_CSUM_INVALID and
EXT2_ET_EXTENT_CSUM_INVALID, and I disabled the "Verify the inode
checksum." step in ext2fs_read_inode_full().

A final note: debugfs is a powerful tool, and by hacking its code you
can do a lot. For example, by modifying do_rdump() to check
inode->i_mtime, you can dump only the files modified after a given date.
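
The modification-time filter could look roughly like the sketch below.
It is self-contained and does not touch do_rdump(); the function names
are hypothetical. It assumes the cutoff is given as YYYY-MM-DD at
midnight UTC and that only the 32-bit i_mtime seconds field is
compared (the extra epoch bits of large ext4 inodes are ignored).

```c
#include <stdint.h>
#include <stdio.h>

/* Days since 1970-01-01 for a proleptic Gregorian calendar date. */
static int64_t days_from_civil(int64_t y, int m, int d)
{
    y -= (m <= 2);                       /* count the year from March */
    int64_t era = (y >= 0 ? y : y - 399) / 400;
    int64_t yoe = y - era * 400;         /* year of era, 0..399 */
    int mp = (m > 2) ? m - 3 : m + 9;    /* March = 0 ... February = 11 */
    int64_t doy = (153 * mp + 2) / 5 + d - 1;
    int64_t doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;

    return era * 146097 + doe - 719468;
}

/* Parse "YYYY-MM-DD" as midnight UTC. Returns 0 on success, -1 on bad input. */
int parse_cutoff_utc(const char *date, int64_t *epoch_seconds)
{
    int y, m, d;

    if (sscanf(date, "%d-%d-%d", &y, &m, &d) != 3 ||
        m < 1 || m > 12 || d < 1 || d > 31)
        return -1;
    *epoch_seconds = days_from_civil(y, m, d) * 86400;
    return 0;
}

/* Non-zero if the inode was modified strictly after the cutoff. */
int modified_after(uint32_t i_mtime, int64_t cutoff)
{
    return (int64_t)i_mtime > cutoff;
}
```

Inside a modified do_rdump(), the check would be applied to each
inode before it is dumped, skipping those for which modified_after()
returns 0.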

------

/*
 * Return the block count of the given inode if it is a live directory
 * (blocks > 0, links > 0, not deleted), or 0 otherwise.
 */
unsigned int inode_blocks(ext2_ino_t inode)
{
    struct ext2_inode *inode_buf =
        (struct ext2_inode *) malloc(EXT2_INODE_SIZE(current_fs->super));

    if (!inode_buf) {
        fprintf(stderr, "%u inode_blocks: can't allocate buffer\n", inode);
        return 0;
    }

    /* A non-zero return means the inode could not be read: skip it. */
    if (debugfs_read_inode_full(inode, inode_buf, "inode_blocks",
                    EXT2_INODE_SIZE(current_fs->super))) {
        free(inode_buf);
        return 0;
    }

    int ok = inode_buf->i_blocks != 0 && inode_buf->i_links_count != 0 &&
            inode_buf->i_dtime == 0 &&
            LINUX_S_ISDIR(inode_buf->i_mode);

    unsigned int ret = ok ? inode_buf->i_blocks : 0;

    free(inode_buf);

    return ret;
}

--------

On Thu, 11 Apr 2019 at 23:23, Theodore Ts'o <tytso@mit.edu> wrote:
>
> On Thu, Apr 11, 2019 at 11:37:55AM +0200, Andrea Lo Pumo wrote:
> > - The inode map has been overwritten too.
> > - However, the data is still there in the disk, and also the related
> > inode structures. (Just the inode map is missing right?). So, if one
> > is able to locate these inode structures, the relative files could be
> > recovered. We know the name of important directories and files to be
> > recovered. Could this help?
>
> Unfortunately, what gets overwritten is the inode table, which
> contains the inode structures.  So all of the information which says,
> "logical block N of inode M is located on physical block P" is gone.
>
> So your only hope is going to be to use a program which looks at
> individual data blocks, and assumes that (for the most part) files
> tend to be allocated contiguously on the storage device.  Fortunately,
> such a tool has already been written, and it is an open source tool
> called PhotoRec[1].  I see however, you've already tried PhotoRec.
>
> > I could also invest some programming efforts to solve this issue, by
> > hacking some available tools, if this could help and is not too
> > complex. In this regard, I have this question: given that I know the
> > name of some important directories and files to be recovered,
> > theoretically I could write a program that "greps" the name of the
> > file in /dev/sda1 and, around that point, I should locate the inode
> > structure, and with the inode recover the whole file? Any hint toward
> > this direction? I don't have experience with ext programming, but I am
> > willing to hack.
>
> Yeah, unfortunately, no.  You'll be able to find the directory data
> block, sure.  And that will contain the inode number.  But mke2fs
> overwrites the entire inode table, so there's nothing that you can
> find.
>
> > Final question: are there tools to handle this situation? testdisk and
> > ext4magic do not seem to give good results. Photorec is useless to
> > recover large .tar.gz and .ogg files, and more importantly the name of
> > the file, which we also need.
>
> You've listed the primary tools that are available already.  It is
> possible to configure mke2fs to create an undo file, and then when
> something screws up they can use the e2undo file to unwind the
> modifications made by mke2fs (or e2fsck, if the undo file generation
> is enabled by e2fsck).
>
> This feature is not enabled by default, mainly because (a) it slows
> down the mke2fs and e2fsck operations, and that tends to make system
> administrators cranky, and (b) you have to put the e2undo file
> somewhere, and you need to have some kind of scheme to delete old
> e2undo files.  So there are a lot of distribution integration changes
> that have never been done.
>
> Telling you this now isn't particularly helpful, since it's basically
> suggesting that you close the barn door after the horse has escaped.
> However, along with other changes you might want to make to your
> procedures (such as doing regular backups) to avoid future mistakes of
> this ilk, it might be something to consider.
>
> Good luck, and sorry there's not much else help we can offer you,
>
>                                         - Ted


* Re: Run mkfs.ext4 on an already existing ext4 filesystem. [Solved]
  2019-04-18 10:54 Run mkfs.ext4 on an already existing ext4 filesystem. [Solved] Andrea Lo Pumo
@ 2019-04-18 18:55 ` Andreas Dilger
  0 siblings, 0 replies; 2+ messages in thread
From: Andreas Dilger @ 2019-04-18 18:55 UTC
  To: Andrea Lo Pumo; +Cc: linux-ext4

On Apr 18, 2019, at 4:54 AM, Andrea Lo Pumo <alopumo@movia.biz> wrote:
> 
> I have been able to recover the files. It seems that the filesystem
> was unmounted shortly after it had been mounted, so ext4lazyinit did
> not have time to overwrite all the inode tables.
> 
> Below is a more detailed report, in the hope that it will be useful to
> others. A review would be appreciated; I will then write it up as a
> tutorial somewhere.
> 
> Also, could the modified do_dump() command be usefully integrated into debugfs?

Andrea,
what you have done sounds interesting, and I would encourage you to
finish off the work you have started so that it is available to others:
- make the inode/directory dumping flexible (e.g. optionally allow
  dumping a range of inodes so it doesn't interfere with normal use)
- allow the checksum verification to be turned on/off
- add some documentation to the usage and man page so that users
  will know that this option is available
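
The first point could be approached with a small argument parser like
the sketch below. This is only an illustration under assumptions: the
"FIRST-LAST" syntax, the function names and the clamping behaviour are
hypothetical and not an existing debugfs interface. inodes_count
stands for the superblock's s_inodes_count.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse an unsigned decimal number that fits in 32 bits. */
static int parse_u32(const char *s, char **end, uint32_t *out)
{
    unsigned long v;

    /* strtoul() would accept signs and leading spaces; reject them. */
    if (!isdigit((unsigned char)*s))
        return -1;
    errno = 0;
    v = strtoul(s, end, 10);
    if (errno != 0 || v > UINT32_MAX)
        return -1;
    *out = (uint32_t)v;
    return 0;
}

/*
 * Parse "N" or "FIRST-LAST" into an inclusive inode range.
 * LAST is clamped to inodes_count. Returns 0 on success, -1 if the
 * argument is malformed or the range is empty or out of bounds.
 */
int parse_inode_range(const char *arg, uint32_t inodes_count,
                      uint32_t *first, uint32_t *last)
{
    char *end;

    if (parse_u32(arg, &end, first) != 0)
        return -1;
    if (*end == '\0')
        *last = *first;                  /* a single inode number */
    else if (*end != '-' || parse_u32(end + 1, &end, last) != 0 ||
             *end != '\0')
        return -1;

    /* Inode numbers start at 1. */
    if (*first == 0 || *first > *last || *first > inodes_count)
        return -1;
    if (*last > inodes_count)
        *last = inodes_count;
    return 0;
}
```

With such a helper the dump loop would run from first to last instead
of over every possible inode number, and the normal behaviour of dump
would stay unchanged when no range is given.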

Cheers, Andreas
