From: Gao Xiang <gaoxiang25@huawei.com>
To: Pratik Shinde <pratikshinde320@gmail.com>
Cc: miaoxie@huawei.com, linux-erofs@lists.ozlabs.org
Subject: Re: [RFCv2] erofs-utils:code for detecting and tracking holes in uncompressed sparse files.
Date: Thu, 26 Dec 2019 14:00:28 +0800
Message-ID: <20191226060028.GA89409@architecture4>
In-Reply-To: <CAGu0czQzwunpV2pV6EujWknWKt+uALmSHKqCBPjxhxJ=BZY5gQ@mail.gmail.com>

On Thu, Dec 26, 2019 at 11:12:09AM +0530, Pratik Shinde wrote:
> Thanks Gao.
> 
> Now I understand the purpose.
> So with i_format we will be able to recognize which path to take, i.e. the
> fast path (flat mode) or the slow path (searching through the extent list).
> I am working on it.

Thanks. Yes, that is the original consideration behind the i_format design.
Sorry for some misstatements in the previous email; for example, what I
meant was "old kernel forward compatibility".
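
As a rough illustration of the compatibility point, here is a minimal
sketch of how a kernel that only knows the flat modes could dispatch on
the layout mode. All names (layout_mode, inode_info, map_block) and the
mode numbers are hypothetical and are not the real EROFS definitions; the
mode is assumed to be already decoded from i_format, and inline tail
data is ignored.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical layout modes; the numbers are illustrative only and are
 * not the real EROFS on-disk values. */
enum layout_mode {
	MODE_FLAT_PLAIN  = 0,
	MODE_FLAT_INLINE = 1,
	MODE_SPARSE      = 2,	/* the proposed new mode */
};

/* Hypothetical in-memory inode summary. */
struct inode_info {
	unsigned int mode;	/* assumed already decoded from i_format */
	uint64_t start_blk;	/* first physical block of a flat file */
	uint64_t nblocks;	/* file length in blocks */
};

/*
 * Map a logical block to a physical block as a kernel that only
 * understands the flat modes would. Returns a negative errno on failure.
 */
static int64_t map_block(const struct inode_info *vi, uint64_t lblk)
{
	switch (vi->mode) {
	case MODE_FLAT_PLAIN:
	case MODE_FLAT_INLINE:
		/* fast path: O(1), no extra metadata needs to be read */
		if (lblk >= vi->nblocks)
			return -EINVAL;
		return (int64_t)(vi->start_blk + lblk);
	default:
		/* unknown mode (e.g. sparse): refuse instead of misreading */
		return -EOPNOTSUPP;
	}
}
```

A newer kernel would add a MODE_SPARSE case that walks the extent
metadata (the slow path), while files without holes keep using the flat
cases unchanged.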

But I think the final number of the new i_format mode is a minor point for
now (I can also assign a new number based on your patch), because I'd like
to rearrange this field for better scalability in an extra preparatory
patch that goes in together with your patch.

We can get the main new extent approach into shape first. :-)
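
For reference, a minimal sketch of what an extent-based lookup could look
like, assuming a sorted, non-overlapping in-memory table of data extents.
The struct layout, the HOLE_BLK sentinel and the extent_map name are
assumptions for illustration only, not the on-disk format under
discussion.

```c
#include <stddef.h>
#include <stdint.h>

/* Sentinel meaning "this logical block lies in a hole (reads as zeros)". */
#define HOLE_BLK UINT64_MAX

/* Hypothetical data extent record; the table is sorted by lblk and the
 * extents do not overlap. */
struct extent {
	uint64_t lblk;	/* first logical block */
	uint64_t pblk;	/* first physical block */
	uint64_t len;	/* length in blocks */
};

/*
 * Slow path: O(log n) binary search over the data extents. A logical
 * block that is not covered by any data extent is treated as a hole.
 */
static uint64_t extent_map(const struct extent *tbl, size_t n, uint64_t lblk)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (lblk < tbl[mid].lblk)
			hi = mid;		/* target is before this extent */
		else if (lblk - tbl[mid].lblk >= tbl[mid].len)
			lo = mid + 1;		/* target is after this extent */
		else
			return tbl[mid].pblk + (lblk - tbl[mid].lblk);
	}
	return HOLE_BLK;
}
```

Because each data extent carries its own physical start, two extents may
point at the same physical blocks (e.g. { 0, 500, 2 } and { 10, 500, 2 }),
which is what makes deduplication or data redirection possible; a
hole-only list cannot express that.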

Thanks,
Gao Xiang

> 
> --Pratik.
> 
> On Tue, Dec 24, 2019 at 4:46 PM Gao Xiang <gaoxiang25@huawei.com> wrote:
> 
> > On Tue, Dec 24, 2019 at 04:15:47PM +0530, Pratik Shinde wrote:
> > > Hi Gao,
> > >
> > > No no. What I am saying is - in the current code (excluding all my
> > > changes) the block lookup happens in constant time. With only a hole
> > > list it
> >
> > Not only lookup but also other interfaces such as fiemap; that is why
> > it is called flat mode and the fast path.
> >
> > > won't be O(1) time; rather, we have to traverse the hole list (say,
> > > with a binary search).
> > > What I don't understand is - what is the purpose of tracking data
> > > extents?
> > > Hope you get it.
> >
> > The plain and inline modes are called flat modes, which are the most
> > common case for regular files and directories. You can see that's the
> > fastest path for most file accesses (minimum metadata).
> >
> > We don't extend the flat modes but introduce a new sparse mode
> > instead, for 3 main reasons:
> >  1) it lets us introduce a complete, enhanced new extent table (or
> >     later a B+-tree);
> >  2) we don't even know how many holes are in the file if we only read
> >     the inode base metadata; some extra header (whether an extent or a
> >     hole header) needs to be read in advance;
> >  3) old kernel backward compatibility needs to be considered: not all
> >     files are sparse, and the non-sparse ones need to keep working
> >     properly, while the remaining sparse files must be blocked from
> >     being accessed by old kernels.
> >
> > Note that i_format is for such use, so we can introduce sparse mode
> > with some enhanced on-disk representation (but with more metadata
> > read amplification than flat modes).
> >
> > So files without holes should be treated as flat modes (fast path),
> > and only then do we consider the slow path, the upcoming sparse mode.
> >
> > The purpose of tracking data extents is that we could then use them
> > for deduplication, repeated data or data redirection. A hole can only
> > be zeros, though.
> >
> > Thanks,
> > Gao Xiang
> >
> > >
> > > --Pratik.
> > >
> >
