linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Filipe Manana <fdmanana@kernel.org>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>
Cc: Qu Wenruo <wqu@suse.com>,
	linux-btrfs@vger.kernel.org,  David Sterba <dsterba@suse.com>
Subject: Re: [PATCH v2 03/11] btrfs: introduce new members for extent_map
Date: Fri, 10 May 2024 12:26:46 +0100	[thread overview]
Message-ID: <CAL3q7H6rz9Z8xCbsjvqaEQC34m7uiM_FH1ecW+PQC5kARWKxrA@mail.gmail.com> (raw)
In-Reply-To: <1fe28d75-a4a3-4304-9998-a88a5fee4919@gmx.com>

On Thu, May 9, 2024 at 11:12 PM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>
>
>
> 在 2024/5/10 02:35, Filipe Manana 写道:
> > On Fri, May 3, 2024 at 7:02 AM Qu Wenruo <wqu@suse.com> wrote:
> >>
> >> Introduce two new members for extent_map:
> >>
> >> - disk_bytenr
> >> - offset
> >>
> >> Both are matching the members with the same name inside
> >> btrfs_file_extent_items.
> >>
> >> For now this patch only touches those members when:
> >>
> >> - Reading btrfs_file_extent_items from disk
> >> - Inserting new holes
> >> - Merging two extent maps
> >>    With the new disk_bytenr and disk_num_bytes, doing merging would be a
> >>    little complex, as we have 3 different cases:
> >>
> >>    * Both extent maps are referring to the same data extent
> >>    * Both extent maps are referring to different data extents, but
> >>      those data extents are adjacent, and extent maps are at head/tail
> >>      of each data extents
> >
> > I have no idea what this last part of the sentence means:
> >
> > "and extent maps are at head/tail of each data extents"
>
> For this case:
>
>          |<- data extent 1 ->|<- data extent 2 ->|
>         |          |////////|////////|          |
>                       FE A      FE B
>
> In above case, FE A is only referring the tailing part of data extent 1,
> and FE B is referring to the heading part of data extent 2.
>
> In that case, FE A and FE B can be merged into something like this:
>
>          |<------ merged extent 1 + 2 --------->|
>         |          |/////////////////|         |
>                         FE A + B
>
> In that case, merged file extent would have:
>
> - disk_bytenr = fe_a->disk_bytenr
> - disk_num_bytes = fe_a->disk_num_bytes + fe_b->disk_num_bytes
> - ram_bytes = fe_a->ram_bytes + fe_b->ram_bytes
> - offset = fe_a->offset
> - num_bytes = fe_a->num_bytes + fe_b->num_bytes
>
> >
> >>    * One of the extent map is referring to an merged and larger data
> >
> > map -> maps
> > an -> a
> >
> >>      extent that covers both extent maps
> >
> > Not sure if I can understand this. How can one of the extent maps
> > already cover the range of the other extent map?
> > This suggests some overlapping, or most likely I misunderstood what
> > this is trying to say.
>
> That's the for impossible test case 3, but as you mentioned, it's more
> sane just to remove it.
>
> [...]
> >
> >> +        *
> >> +        * The calculation here always merge the data extents first, then update
> >> +        * @offset using the new data extents.
> >> +        *
> >> +        * For case 1), the merged data extent would be the same.
> >> +        * For case 2), we just merge the two data extents into one.
> >> +        * For case 3), we just got the larger data extent.
> >> +        */
> >> +       new_disk_bytenr = min(prev->disk_bytenr, next->disk_bytenr);
> >> +       new_disk_num_bytes = max(prev->disk_bytenr + prev->disk_num_bytes,
> >> +                                next->disk_bytenr + next->disk_num_bytes) -
> >> +                            new_disk_bytenr;
> >
> > So this is confusing, disk_num_bytes being a max between the two
> > extent maps and not their sum.
>
> Check case 1).
>
> Both file extents are referring to the same data extent.
>
> Like this:
>
> FE 1, file pos 0, num_bytes 4K, disk_bytenr X, disk_num_bytes 16K,
> offset 0, ram_bytes 16K, compression none
>
> FE 2, file pos 4k, num_bytes 4K, disk_bytenr X, disk_num_bytes 16K,
> offset 4k, ram_bytes 16K, compression none.
>
> In that case we should not just sum up the disk_num_bytes at all.
> Remember disk_num_bytes are for the data extent.

Yes, but for cases where they refer to different file extent items
that are contiguous on disk, the max is confusing, and a sum is what
makes sense. Example:

FE 1, file pos 0, num_bytes 4K, disk_bytenr X, disk_num_bytes 4K,
offset 0, ram_bytes 4K, no compression

FE 2, file pos 4K, num_bytes 4K, disk_bytenr X + 4K, disk_num_bytes
4K, offset 0, ram_bytes 4K, no compression

When merging the corresponding extent maps it's natural to think
disk_num_bytes is 8K and not 4K.

But as I mentioned before, after merging we really don't use
disk_num_bytes anywhere.
It's only used during extent logging, which only happens for extents
maps that were not merged and can not be before they are logged.

Before this patchset, when disk_num_bytes was named orig_block_len,
that was never updated during merging, exactly because we don't use
cases for it.


>
> > I gather this is modelled after what we already do at
> > btrfs_drop_extent_map_range() when splitting.
> >
> > But the truth is we never used the disk_num_bytes of an extent map
> > that was merged - we also didn't do it before this patch, for that
> > reason.
> > It's only used for logging new extents - which can't be merged - they
> > can be merged only after being logged.
> >
> > We can set this to the sum, or leave with some value to signal it's invalid.
> > Just leave a comment saying disk_num_bytes it's not used anywhere for
> > merged extent maps.
> >
> > In the splitting case at btrfs_drop_extent_map_range() it's what we
> > need since in the case the extent is new and not logged (in the
> > modified list), disk_num_bytes always represents the size of the
> > original, before split, extent.
>
> [...]
>
> >> +       file_extent.disk_bytenr = ins.objectid;
> >> +       file_extent.disk_num_bytes = ins.offset;
> >> +       file_extent.num_bytes = ins.offset;
> >> +       file_extent.ram_bytes = ins.offset;
> >> +       file_extent.offset = 0;
> >> +       file_extent.compression = BTRFS_COMPRESS_NONE;
> >
> > Same as before:
> >
> > "If we always have to initialize all the members of the structure,
> > it's pointless to have it initialized to zeros when we declared it."
> >
> [...]
> >> +       file_extent.disk_bytenr = ins.objectid;
> >> +       file_extent.disk_num_bytes = ins.offset;
> >> +       file_extent.num_bytes = num_bytes;
> >> +       file_extent.ram_bytes = ram_bytes;
> >> +       file_extent.offset = encoded->unencoded_offset;
> >> +       file_extent.compression = compression;
> >
> > Same as before:
> >
> > "If we always have to initialize all the members of the structure,
> > it's pointless to have it initialized to zeros when we declared it."
>
> Fair enough, and an uninitilized structure member can also make compiler
> to warn us.
>
> Thanks,
> Qu
> >
> > The rest I suppose seems fine, but I have to look at the rest of the patchset.
> >
> > Thanks.
> >
> >>          em = create_io_em(inode, start, num_bytes,
> >>                            start - encoded->unencoded_offset, ins.objectid,
> >>                            ins.offset, ins.offset, ram_bytes, compression,
> >> -                         BTRFS_ORDERED_COMPRESSED);
> >> +                         &file_extent, BTRFS_ORDERED_COMPRESSED);
> >>          if (IS_ERR(em)) {
> >>                  ret = PTR_ERR(em);
> >>                  goto out_free_reserved;
> >> --
> >> 2.45.0
> >>
> >>
> >

  reply	other threads:[~2024-05-10 11:27 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-03  6:01 [PATCH v2 00/11] btrfs: extent-map: unify the members with btrfs_ordered_extent Qu Wenruo
2024-05-03  6:01 ` [PATCH v2 01/11] btrfs: rename extent_map::orig_block_len to disk_num_bytes Qu Wenruo
2024-05-09 16:15   ` Filipe Manana
2024-05-03  6:01 ` [PATCH v2 02/11] btrfs: export the expected file extent through can_nocow_extent() Qu Wenruo
2024-05-09 16:22   ` Filipe Manana
2024-05-09 21:55     ` Qu Wenruo
2024-05-03  6:01 ` [PATCH v2 03/11] btrfs: introduce new members for extent_map Qu Wenruo
2024-05-09 17:05   ` Filipe Manana
2024-05-09 22:11     ` Qu Wenruo
2024-05-10 11:26       ` Filipe Manana [this message]
2024-05-10 22:26         ` Qu Wenruo
2024-05-13 12:48   ` Filipe Manana
2024-05-13 12:54     ` Filipe Manana
2024-05-13 17:31   ` Filipe Manana
2024-05-03  6:01 ` [PATCH v2 04/11] btrfs: introduce extra sanity checks for extent maps Qu Wenruo
2024-05-13 12:21   ` Filipe Manana
2024-05-13 22:34     ` Qu Wenruo
2024-05-03  6:01 ` [PATCH v2 05/11] btrfs: remove extent_map::orig_start member Qu Wenruo
2024-05-13 13:09   ` Filipe Manana
2024-05-13 22:14     ` Qu Wenruo
2024-05-03  6:01 ` [PATCH v2 06/11] btrfs: remove extent_map::block_len member Qu Wenruo
2024-05-13 17:44   ` Filipe Manana
2024-05-14  7:09     ` Qu Wenruo
2024-05-03  6:01 ` [PATCH v2 07/11] btrfs: remove extent_map::block_start member Qu Wenruo
2024-05-16 17:28   ` Filipe Manana
2024-05-16 22:45     ` Qu Wenruo
2024-05-03  6:01 ` [PATCH v2 08/11] btrfs: cleanup duplicated parameters related to can_nocow_file_extent_args Qu Wenruo
2024-05-20 15:55   ` Filipe Manana
2024-05-20 22:13     ` Qu Wenruo
2024-05-03  6:01 ` [PATCH v2 09/11] btrfs: cleanup duplicated parameters related to btrfs_alloc_ordered_extent Qu Wenruo
2024-05-20 16:31   ` Filipe Manana
2024-05-03  6:01 ` [PATCH v2 10/11] btrfs: cleanup duplicated parameters related to create_io_em() Qu Wenruo
2024-05-20 16:46   ` Filipe Manana
2024-05-03  6:01 ` [PATCH v2 11/11] btrfs: cleanup duplicated parameters related to btrfs_create_dio_extent() Qu Wenruo
2024-05-20 16:48   ` Filipe Manana
2024-05-23  4:03     ` Qu Wenruo
2024-05-03 11:53 ` [PATCH v2 00/11] btrfs: extent-map: unify the members with btrfs_ordered_extent David Sterba
2024-05-20 16:55 ` Filipe Manana

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAL3q7H6rz9Z8xCbsjvqaEQC34m7uiM_FH1ecW+PQC5kARWKxrA@mail.gmail.com \
    --to=fdmanana@kernel.org \
    --cc=dsterba@suse.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=quwenruo.btrfs@gmx.com \
    --cc=wqu@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).