All of lore.kernel.org
 help / color / mirror / Atom feed
From: Yue Hu <zbestahu@163.com>
To: Gao Xiang <hsiangkao@linux.alibaba.com>
Cc: xiang@kernel.org, yuchao0@huawei.com,
	linux-erofs@lists.ozlabs.org, huyue2@yulong.com
Subject: Re: [PATCH] erofs-utils: do not check ->idata_size for compressed files in erofs_prepare_inode_buffer()
Date: Wed, 1 Sep 2021 10:20:38 +0800	[thread overview]
Message-ID: <20210901102038.00004934.zbestahu@163.com> (raw)
In-Reply-To: <YS4ee67530HlDjPp@B-P7TQMD6M-0146.local>

On Tue, 31 Aug 2021 20:20:11 +0800
Gao Xiang <hsiangkao@linux.alibaba.com> wrote:

> On Tue, Aug 31, 2021 at 07:56:14PM +0800, Yue Hu wrote:
> > On Tue, 31 Aug 2021 19:15:50 +0800
> > Gao Xiang <hsiangkao@linux.alibaba.com> wrote:  
> 
> ...
> 
> > > > > > > >         
> > > > > > > > > 
> > > > > > > > > BTW, if you have some interest, would you like to implement it? :)        
> > > > > > > > 
> > > > > > > > I don't know if i can finish it. But i would like to have a try :)        
> > > > > > > 
> > > > > > > My rough thought is to try to inline the last tail compresseed
> > > > > > > extent after the on-disk compressed extents, maybe we could let
> > > > > > > it work for non-compact (legacy) compress index format cases...        
> > > > > > 
> > > > > > I mean try to implement non-compact (legacy) compress index format cases
> > > > > > first.    
> > > > 
> > > > I'm trying to do it under 4.19 code (since i have no 5.x environment temporarily).
> > > > 
> > > > Now, i think mkfs should be done. But, kernel side seems not working fine(no crash,
> > > > no decompression warning/bug). Only some files are working, others not. I'm sure i
> > > > can catch the inline data correctly via file dump. And I'm trying debug the issue.
> > > > Maybe i need more time to read/understand more decompression code related.
> > > > 
> > > > BTW, now i understand no need to go z_erofs_vle_work_xxx for inline part(cur-end)
> > > > , just go next_part after mapping as below, am i right? 
> > > >     
> > > 
> > > You are right. For the common cases (except for fiemap or cases to get the exact
> > > decompressed length), we only need to calculate the start of the compression extent,
> > > so it's transversal in the reverse order.
> > > 
> > > But really... Again, I don't suggest using 4.19 staging code for real production
> > > or further development. The uncompressed part is considered as stable, but
> > > compression side may not (also it was disabled by default). Please also see,
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/staging/erofs/Kconfig?h=v4.19#n86
> > > 
> > > " config EROFS_FS_ZIP
> > >   bool "EROFS Data Compresssion Support"
> > >   depends on EROFS_FS
> > >   help
> > >     Currently we support VLE Compression only.
> > >     Play at your own risk.
> > > 
> > >     If you don't want to use compression feature, say N. "
> > > 
> > > Our original first real production codebase was between 5.2~5.3. Therefore,
> > > I suggest using >= 5.4 LTS codebase for production. You could also find
> > > some backport codebase on github, e.g.:
> > > https://github.com/nolange/erofs_kernel_4_19
> > > , which backports 5.6 erofs codebase to 4.19.
> > > 
> > > As for tail-packing inline extent feature, how about focusing on on-disk
> > > design and mkfs/erofsfuse implementation first as PoC?
> > > 
> > > I'm afraid that if you only focus on 4.19 codebase, the format of compact
> > > indexes will be ignored, but "compact indexes" is the default option for
> > > erofs now since it has less metadata overhead than non-compact indexes,
> > > so both the sequential / random read are better.  
> > 
> > OK, let me develop it under 5.4. I need taking time to find it:)  
> 
> As the first step of kernel development, I think using x86 qemu should
> be better since it's easier to debug than on the embedded device.

Agree.

> 
> For this feature, I'm very glad to discuss some on-disk format first.
> Since it's not trivial for compact indexes since it's impossible to mark
> tailing-packing extent with some special blkaddr like non-compact
> indexes.

Yes, blkaddr should be an issue for inline case. I can feel that faintly.

> 
> My rough thought about this is "to add some new feature flag to "struct
> z_erofs_map_header" and trigger z_erofs_map_blocks(i_size - 1); at a
> proper time to get all information about the last tail-packing
> compression extent", and when submitting io, we erofs_get_meta_page()
> instead and fill the compressed pages.

Firstly, I need to add code about inline part to verify my understanding. I
think i did it almost about what i want to know including z_erofs_map_blocks()
since i can catch the inline data which is key point for me although kernel side
does not work fine totally.

Then i can re-factor/re-write it based on that. Yes, i will switch it on >=5.4
to continue developing later.

I also think we need a new flag for inline case. I'm just not focus on the flag
due to my working step above.

Now, i think i can check it about where to add the new flag more proper. Let me
check it also for your thought mentioned above.

> 
> But anyway, I still think focusing on mkfs.erofs and erofsfuse is a good
> start for this.

Yes, we should check the compression firstly.

One more question:

There's a piece of code (as below) to handle small output size(< PAGE_SIZE) which looks
like for inline part in z_erofs_decompress_generic()? If so, we also need to go vle 
decompression flow for inline data just like other data case?

```code
	if (rq->outputsize <= PAGE_SIZE * 7 / 8) {
		dst = erofs_get_pcpubuf(0);
		if (IS_ERR(dst))
			return PTR_ERR(dst);

		rq->inplace_io = false;
		ret = alg->decompress(rq, dst);
		if (!ret)
			copy_from_pcpubuf(rq->out, dst, rq->pageofs_out,
					  rq->outputsize);

		erofs_put_pcpubuf(dst);
		return ret;
	}
```

Thanks.

> 
> Thanks,
> Gao Xiang
> 
> > 
> > Thanks.
> >   
> > > 
> > > Thanks,
> > > Gao Xiang
> > >   
> > > > Thanks.
> > > >       
> > > > > 
> > > > > Ok, let me try to implement it.
> > > > > 
> > > > > Thanks.    



  reply	other threads:[~2021-09-01  2:21 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-17  8:29 [PATCH] erofs-utils: do not check ->idata_size for compressed files in erofs_prepare_inode_buffer() Yue Hu
2021-06-17  9:04 ` Gao Xiang
2021-06-17  9:15   ` Yue Hu
2021-06-17  9:27     ` Gao Xiang
2021-06-17  9:30       ` Gao Xiang
2021-06-17 10:14         ` Yue Hu
2021-08-31  9:00           ` Yue Hu
2021-08-31 11:15             ` Gao Xiang
2021-08-31 11:56               ` Yue Hu
2021-08-31 12:20                 ` Gao Xiang
2021-09-01  2:20                   ` Yue Hu [this message]
2021-09-01  3:05                     ` Gao Xiang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210901102038.00004934.zbestahu@163.com \
    --to=zbestahu@163.com \
    --cc=hsiangkao@linux.alibaba.com \
    --cc=huyue2@yulong.com \
    --cc=linux-erofs@lists.ozlabs.org \
    --cc=xiang@kernel.org \
    --cc=yuchao0@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.