From: Gao Xiang <hsiangkao@linux.alibaba.com>
To: Yue Hu <zbestahu@gmail.com>
Cc: xiang@kernel.org, linux-erofs@lists.ozlabs.org,
huyue2@yulong.com, zbestahu@163.com
Subject: Re: [PATCH] erofs-utils: add clusterofs zero check to write_uncompressed_extent()
Date: Wed, 18 Aug 2021 22:06:12 +0800 [thread overview]
Message-ID: <YR0T1Bonxx3rVTJl@B-P7TQMD6M-0146.local> (raw)
In-Reply-To: <20210818102121.00004d36.zbestahu@gmail.com>
Hi Yue,
On Wed, Aug 18, 2021 at 10:21:21AM +0800, Yue Hu wrote:
> Hi Xiang,
>
> On Tue, 17 Aug 2021 22:38:04 +0800
> Gao Xiang <hsiangkao@linux.alibaba.com> wrote:
>
> > On Tue, Aug 17, 2021 at 10:10:46PM +0800, Yue Hu wrote:
> > > Hi Xiang,
> > >
> > > On Tue, 17 Aug 2021 21:13:14 +0800
> > > Gao Xiang <hsiangkao@linux.alibaba.com> wrote:
> > >
> > > > Hi Yue,
> > > >
> > > > On Tue, Aug 17, 2021 at 12:06:04PM +0800, Yue Hu wrote:
> > > > > From: Yue Hu <huyue2@yulong.com>
> > > > >
> > > > > No need to reset clusterofs to 0 if it's already 0. Acturally, we also
> > > > > observed that case frequently. Let's avoid it.
> > > > >
> > > > > Signed-off-by: Yue Hu <huyue2@yulong.com>
> > > > > ---
> > > > > lib/compress.c | 2 +-
> > > > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/lib/compress.c b/lib/compress.c
> > > > > index 40723a1..a8ebbc1 100644
> > > > > --- a/lib/compress.c
> > > > > +++ b/lib/compress.c
> > > > > @@ -130,7 +130,7 @@ static int write_uncompressed_extent(struct z_erofs_vle_compress_ctx *ctx,
> > > > > unsigned int count;
> > > > >
> > > > > /* reset clusterofs to 0 if permitted */
> > > > > - if (!erofs_sb_has_lz4_0padding() &&
> > > > > + if (!erofs_sb_has_lz4_0padding() && ctx->clusterofs &&
> > > >
> > > > Also out of curiosity, which means erofs is used without lz4 0padding?
>
> Moreover, seems 'ctx->head' will always >= 'ctx->clusterofs' here since there
> are only 2 sites to increase it. One is 'ctx->head += count' after compression,
> another is 'ctx->head = qh_after' ( which should be same as ->clusterofs, rt?).
Sorry for some delay (since I shouldn't take working time about this.)
The design of this part is that
Due to most compression algorithms can only accept a one-shot consecutive
buffer for each lz4 block and we limit the maximal input (uncompressed)
data for each pcluster is EROFS_CONFIG_COMPR_MAX_SZ, so EROFS allocates
a buffer with the size of EROFS_CONFIG_COMPR_MAX_SZ * 2, as below:
2 * EROFS_CONFIG_COMPR_MAX_SZ
_____________________________________________________________
|_____________________________________________________________|
^ ^
head tail
head is the pending data for compression, and tail is for the next
read. As you can see, if head < EROFS_CONFIG_COMPR_MAX_SZ, we can
always get `EROFS_CONFIG_COMPR_MAX_SZ consecutive data' in this buffer.
So after head >= EROFS_CONFIG_COMPR_MAX_SZ, we needs to move head ~ tail
to the start of this buffer to keep it work.
>
> I understand there are part of duplication data when writing uncompressed data
> to disk due to 'ctx->head' decreased?
Yeah, qh_aligned is mainly used for !0padding alignment of uncompressed
data. It's not quite useful for 0padding, since head pointer won't be
decreased.
But I don't think I like omiting ">= 'ctx->clusterofs'" (and I suspect
if it's correct or just because the special EROFS_CONFIG_COMPR_MAX_SZ)
since it involves many assumption and deep condition, and could cause
maintain burden if we change logic like this.
Thanks,
Gao Xiang
>
> Thanks.
>
> > >
> > > Yes. We are using legacy compression layout now.
> >
> > Ok, I think sometime I will seperate it into (non)compact compress
> > indexes, and (non)0padding. It was once named as legacy just before
> > Linux < 5.3 didn't support them.
> >
> > By default, the preferred option now I think is compact and 0padding.
> >
> > 0padding will enable in-place decompression (and LZMA will always
> > enable it.)
> > compact will reduce metadata even further as long as the compressed
> > data is consecutive (e.g. 2 bytes each lcluster for compact indexes
> > rather than 8 bytes each lcluster for noncompact indexes, but anyway,
> > either (non)compact metadata can support effective data random access.)
> >
> > Ok, if it's somewhat kernel 4.19 based code, I'd suggest to upgrade
> > the codebase at least to 5.4, not because 4.19 doesn't work, but
> > really for some performance concern....
> >
> > Thanks,
> > Gao Xiang
prev parent reply other threads:[~2021-08-18 14:06 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-08-17 4:06 [PATCH] erofs-utils: add clusterofs zero check to write_uncompressed_extent() Yue Hu
2021-08-17 12:18 ` Gao Xiang
2021-08-17 13:13 ` Gao Xiang
2021-08-17 14:10 ` Yue Hu
2021-08-17 14:38 ` Gao Xiang
2021-08-18 2:21 ` Yue Hu
2021-08-18 14:06 ` Gao Xiang [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=YR0T1Bonxx3rVTJl@B-P7TQMD6M-0146.local \
--to=hsiangkao@linux.alibaba.com \
--cc=huyue2@yulong.com \
--cc=linux-erofs@lists.ozlabs.org \
--cc=xiang@kernel.org \
--cc=zbestahu@163.com \
--cc=zbestahu@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).