linux-btrfs.vger.kernel.org archive mirror
* shared extents, but no snapshots or reflinks
@ 2019-08-23  2:38 Chris Murphy
  2019-08-23  3:19 ` Chris Murphy
  0 siblings, 1 reply; 5+ messages in thread
From: Chris Murphy @ 2019-08-23  2:38 UTC (permalink / raw)
  To: Btrfs BTRFS

kernel 5.2.9
btrfs-progs 5.1.1
The file system is fairly new (a few months old) and has only seen kernels 5.1.16+

This is a bit curious. Three subvolumes, no snapshots, no reflinks.

There have previously been snapshots, typically taken prior to doing system
updates. Is this an example of extents being pinned by snapshots, then
partially updated, and now "stuck"? I'm kinda surprised, in that I'd
expect most programs, especially RPM, to write out new files entirely,
then delete obsolete files, then rename. But... this suggests something
is doing partial overwrites of file extents rather than replacements.

Any ideas?


$ ls -l
total 0
drwxr-xr-x. 1 root root  10 Jun 23 13:36 home
drwxr-xr-x. 1 root root  26 Jun  9 22:27 images
dr-xr-xr-x. 1 root root 144 Aug 22 13:49 root
$ sudo btrfs fi du -s *
     Total   Exclusive  Set shared  Filename
   4.41GiB     4.39GiB    14.51MiB  home
 581.40MiB   552.04MiB    29.36MiB  images
   7.78GiB     7.77GiB     6.71MiB  root
$

-- 
Chris Murphy


* Re: shared extents, but no snapshots or reflinks
  2019-08-23  2:38 shared extents, but no snapshots or reflinks Chris Murphy
@ 2019-08-23  3:19 ` Chris Murphy
  2019-08-25  8:14   ` Andrei Borzenkov
  0 siblings, 1 reply; 5+ messages in thread
From: Chris Murphy @ 2019-08-23  3:19 UTC (permalink / raw)
  To: Btrfs BTRFS

On Thu, Aug 22, 2019 at 8:38 PM Chris Murphy <lists@colorremedies.com> wrote:
>
> There have previously been snapshots, typically taken prior to doing system
> updates. Is this an example of extents being pinned by snapshots, then
> partially updated, and now "stuck"? I'm kinda surprised, in that I'd
> expect most programs, especially RPM, to write out new files entirely,
> then delete obsolete files, then rename. But... this suggests something
> is doing partial overwrites of file extents rather than replacements.

It's databases. Databases update their files with block overwrites,
and btrfs COWs them. If a snapshot exists while the COW happens,
partial extents get pinned. This affects the Firefox database files,
and also RPM's. It's a small effect on my system, but it's a curious
issue, particularly if the files were much larger.


-- 
Chris Murphy


* Re: shared extents, but no snapshots or reflinks
  2019-08-23  3:19 ` Chris Murphy
@ 2019-08-25  8:14   ` Andrei Borzenkov
  2019-08-26 23:35     ` Chris Murphy
  0 siblings, 1 reply; 5+ messages in thread
From: Andrei Borzenkov @ 2019-08-25  8:14 UTC (permalink / raw)
  To: Chris Murphy, Btrfs BTRFS

23.08.2019 6:19, Chris Murphy wrote:
> It's databases. Databases update their files with block overwrites,
> and btrfs COWs them. If a snapshot exists while the COW happens,
> partial extents get pinned. This affects the Firefox database files,
> and also RPM's. It's a small effect on my system, but it's a curious
> issue, particularly if the files were much larger.
> 
> 

What exactly does "pinned" mean, why does it happen, and when does it go away?

Comparing the situation with and without shared extents: when you simply
delete a snapshot, it disappears:


-       item 12 key (257 ROOT_ITEM 7) itemoff 13188 itemsize 439
-               generation 7 root_dirid 256 bytenr 30670848 level 0 refs 1
-               lastsnap 7 byte_limit 0 bytes_used 16384 flags 0x1(RDONLY)
-               uuid 5357e159-c577-d34b-8e0e-815767568a89
-               parent_uuid 1dfec531-ef6e-4d2e-a93b-2a4e4c0e4682
-               ctransid 6 otransid 7 stransid 0 rtransid 0
-               ctime 1566719522.371361184 (2019-08-25 10:52:02)
-               otime 1566719541.289249684 (2019-08-25 10:52:21)
-               drop key (0 UNKNOWN.0 0) level 0
-       item 13 key (257 ROOT_BACKREF 5) itemoff 13166 itemsize 22
-               root backref key dirid 258 sequence 2 name snap


but when there was a shared extent (caused by a partial overwrite), it is stuck:

-       item 12 key (257 ROOT_ITEM 7) itemoff 13188 itemsize 439
-               generation 7 root_dirid 256 bytenr 30670848 level 0 refs 1
-               lastsnap 7 byte_limit 0 bytes_used 16384 flags 0x1(RDONLY)
+       item 11 key (257 ROOT_ITEM 7) itemoff 13210 itemsize 439
+               generation 7 root_dirid 256 bytenr 30670848 level 0 refs 0
+               lastsnap 7 byte_limit 0 bytes_used 16384 flags 0x1000000000001(RDONLY)


Now the undecoded flag is

/*
 * Internal in-memory flag that a subvolume has been marked for deletion but
 * still visible as a directory
 */
#define BTRFS_ROOT_SUBVOL_DEAD          (1ULL << 48)

but this does not agree with the comment: the flag is not "in-memory", it is
persistent (the output above is from inspect-internal after the filesystem
was unmounted).
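
A minimal sketch of how the raw value above splits into individual bits.
The DEAD bit position comes from the #define quoted above; the RDONLY
position (bit 0) is taken from the decoded "0x1(RDONLY)" output. The helper
name decode_root_flags is made up for illustration and is not part of
btrfs-progs.

```python
# Bit positions: DEAD is from the #define quoted in this message,
# RDONLY is inferred from the "0x1(RDONLY)" output shown above.
BTRFS_ROOT_SUBVOL_RDONLY = 1 << 0
BTRFS_ROOT_SUBVOL_DEAD = 1 << 48

KNOWN_FLAGS = {
    BTRFS_ROOT_SUBVOL_RDONLY: "RDONLY",
    BTRFS_ROOT_SUBVOL_DEAD: "DEAD",
}


def decode_root_flags(flags):
    """Return the names of all known bits set in a ROOT_ITEM flags value.

    Hypothetical helper: any bits not listed in KNOWN_FLAGS are appended
    as a single hex string so nothing is silently dropped.
    """
    names = [name for bit, name in KNOWN_FLAGS.items() if flags & bit]
    unknown = flags & ~sum(KNOWN_FLAGS)
    if unknown:
        names.append(hex(unknown))
    return names


print(decode_root_flags(0x1000000000001))  # ['RDONLY', 'DEAD']
```

So 0x1000000000001 is simply RDONLY plus bit 48, which the dump tool of
this version leaves undecoded.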

So when is this dead subvolume going to be removed? This can cause a quite
real memory leak if it stays stuck for as long as the original extent
reference remains.


* Re: shared extents, but no snapshots or reflinks
  2019-08-25  8:14   ` Andrei Borzenkov
@ 2019-08-26 23:35     ` Chris Murphy
  2019-08-27  4:26       ` Andrei Borzenkov
  0 siblings, 1 reply; 5+ messages in thread
From: Chris Murphy @ 2019-08-26 23:35 UTC (permalink / raw)
  To: Andrei Borzenkov; +Cc: Chris Murphy, Btrfs BTRFS

On Sun, Aug 25, 2019 at 2:14 AM Andrei Borzenkov <arvidjaar@gmail.com> wrote:
> What exactly does "pinned" mean, why does it happen, and when does it go away?

Pinned by a snapshot: the snapshot seems to have prevented stale extents
in a file from being deleted (expected), but upon subsequent deletion of
the snapshot, those pinned extents were not released. Several
database-like files still have "shared" extents reported by
filefrag -v.

To release them, the file must be completely replaced (cp, then rm, then mv).
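
A rough sketch of that cp/rm/mv sequence, written as a Python helper so the
copy is guaranteed to write new data rather than share extents. The name
rewrite_file and its parameters are made up for illustration. It assumes
nothing else is writing to the file during the copy, and it only carries
over the permission bits (not ownership or xattrs).

```python
import os
import shutil
import tempfile


def rewrite_file(path, chunk_size=1 << 20):
    """Replace `path` with a freshly written copy of itself.

    Hypothetical helper: the data is copied with plain read()/write()
    calls so every byte lands in newly allocated extents, then the copy
    is renamed over the original (the rename does the "rm" and "mv").
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".rewrite-")
    try:
        with os.fdopen(fd, "wb") as dst, open(path, "rb") as src:
            while True:
                chunk = src.read(chunk_size)
                if not chunk:
                    break
                dst.write(chunk)
            dst.flush()
            os.fsync(dst.fileno())  # make sure the new copy is on disk
        shutil.copymode(path, tmp_path)  # keep the permission bits only
        os.replace(tmp_path, path)  # atomic rename over the original
    except BaseException:
        # Clean up the temporary copy if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

When doing the same thing by hand, note that newer coreutils versions
default to cp --reflink=auto, which would share the old extents again;
cp --reflink=never avoids that.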


> So when is this dead subvolume going to be removed? This can cause a quite
> real memory leak if it stays stuck for as long as the original extent
> reference remains.

I didn't have any stale subvolumes only marked for deletion, they were
long gone (hours).

-- 
Chris Murphy


* Re: shared extents, but no snapshots or reflinks
  2019-08-26 23:35     ` Chris Murphy
@ 2019-08-27  4:26       ` Andrei Borzenkov
  0 siblings, 0 replies; 5+ messages in thread
From: Andrei Borzenkov @ 2019-08-27  4:26 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Btrfs BTRFS

27.08.2019 2:35, Chris Murphy wrote:
> 
> I didn't have any stale subvolumes only marked for deletion, they were
> long gone (hours).
> 

Yes, it disappeared after some time here too, but in my case file
extents were no longer shown as "shared" after that. This is with kernel
5.2.8.

Of course the space was still wasted. So I am actually not sure there is
any real difference apart from a display issue.

I wonder if btrfs can reuse those unused parts of extents that are left
after overwriting them.
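
To illustrate the wasted space, here is a toy model, not btrfs code. It
assumes that a data extent stays fully allocated for as long as any block
of the file still points into it, and that every overwrite is written to a
new extent. The function simulate and the 4 KiB block size are assumptions
for the sketch.

```python
BLOCK = 4096  # assumed block size for this toy model


def simulate(file_blocks, overwrites):
    """Return (logical_bytes, allocated_bytes) for one file.

    The file starts as a single extent of `file_blocks` blocks. Each
    (start, length) in `overwrites` is COWed into a new extent. An extent
    counts as allocated, at its full size, while at least one file block
    still reads from it.
    """
    extents = {0: file_blocks}   # extent id -> size in blocks
    owner = [0] * file_blocks    # file block -> extent id it reads from
    for start, length in overwrites:
        new_id = len(extents)
        extents[new_id] = length
        for block in range(start, start + length):
            owner[block] = new_id
    live = set(owner)
    allocated = sum(size for eid, size in extents.items() if eid in live)
    return file_blocks * BLOCK, allocated * BLOCK


# One 4 KiB overwrite in a 128 KiB extent: the old extent stays whole.
print(simulate(32, [(8, 1)]))    # (131072, 135168)
# Overwriting every block drops the last reference to the old extent.
print(simulate(32, [(0, 32)]))   # (131072, 131072)
```

In this model the overwritten 4 KiB of the old extent is unreachable but
still counted, which is the kind of waste described above; only rewriting
the whole file gets it back.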
