linux-kernel.vger.kernel.org archive mirror
From: Chao Yu <yuchao0@huawei.com>
To: Gao Xiang <hsiangkao@redhat.com>
Cc: Eric Biggers <ebiggers@kernel.org>, <jaegeuk@kernel.org>,
	<linux-kernel@vger.kernel.org>,
	<linux-f2fs-devel@lists.sourceforge.net>
Subject: Re: [f2fs-dev] [PATCH v6] f2fs: compress: support compress level
Date: Fri, 4 Dec 2020 09:56:27 +0800
Message-ID: <7b975d1a-a06c-4e14-067e-064afc200934@huawei.com>
In-Reply-To: <20201204003119.GA1957051@xiangao.remote.csb>

Hi Xiang,

On 2020/12/4 8:31, Gao Xiang wrote:
> Hi Chao,
> 
> On Thu, Dec 03, 2020 at 11:32:34AM -0800, Eric Biggers wrote:
> 
> ...
> 
>>
>> What is the use case for storing the compression level on-disk?
>>
>> Keep in mind that compression levels are an implementation detail; the exact
>> compressed data that is produced by a particular algorithm at a particular
>> compression level is *not* a stable interface.  It can change when the
>> compressor is updated, as long as the output continues to be compatible with the
>> decompressor.
>>
>> So does compression level really belong in the on-disk format?
>>
> 
> Curious about this. Since f2fs compression uses a 16k compress cluster
> by default (and doesn't do sub-block compression by design, as btrfs did),
> is there a significant CR difference between lz4 and lz4hc with the 16k
> configuration? (I guess using zstd or lz4hc with a 128k cluster like btrfs
> could make more sense.) Could you share some CR numbers for these
> algorithms on typical datasets (enwik9, silesia.tar, or others) with a 16k
> cluster size?

Yup, I can figure out some numbers later. :)
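
In the meantime, a rough way to get a first impression of the cluster-size
effect is sketched below. It is only an illustration, not the f2fs code
path: it uses Python's zlib as a stand-in for lz4/lz4hc/zstd (which are not
in the standard library), compresses each cluster independently, and assumes
that a compressed cluster is stored in whole 4k blocks and kept raw when no
block is saved. The helper name cluster_ratio() and the block-rounding model
are assumptions made for this sketch.

```python
import zlib

BLOCK_SIZE = 4096  # assumed filesystem block size


def cluster_ratio(data: bytes, cluster_size: int, level: int = 6) -> float:
    """Return raw blocks / stored blocks when each cluster is compressed alone.

    Simplified model (an assumption, not the f2fs implementation): a
    compressed cluster occupies whole blocks, and a cluster that saves
    no block is stored uncompressed.
    """
    raw_blocks = 0
    stored_blocks = 0
    for off in range(0, len(data), cluster_size):
        chunk = data[off:off + cluster_size]
        chunk_blocks = -(-len(chunk) // BLOCK_SIZE)  # ceiling division
        comp_len = len(zlib.compress(chunk, level))
        comp_blocks = -(-comp_len // BLOCK_SIZE)
        raw_blocks += chunk_blocks
        # Fall back to the raw size when compression saves no block.
        stored_blocks += min(comp_blocks, chunk_blocks)
    return raw_blocks / stored_blocks


if __name__ == "__main__":
    # Synthetic, highly compressible sample; substitute enwik9 or
    # silesia.tar to get numbers that mean something.
    sample = b"".join(
        b"%d the quick brown fox jumps over the lazy dog\n" % i
        for i in range(50000)
    )
    for size in (16 * 1024, 128 * 1024):
        for level in (1, 9):
            ratio = cluster_ratio(sample, size, level)
            print(f"{size // 1024}k cluster, level {level}: {ratio:.2f}")
```

With a 16k cluster and 4k blocks the ratio in this model can never exceed
4.0, so differences between compression levels tend to be hidden by the
rounding unless they cross a block boundary.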

> 
> As you may have noticed, lz4hc is much slower than lz4, so if it's used
> online, it's a good way for unprivileged users to keep all CPUs busy
> (under writeback). I'm not sure if that matters. (OK, it'll give users
> more options at least, yet I'm not sure end users quite understand what
> these algorithms really mean; I guess it spends more CPU time without much
> more storage saving with the default 16k configuration.)
> 
> from https://github.com/lz4/lz4    Core i7-9700K CPU @ 4.9GHz
> Silesia Corpus
> 
> Compressor              Ratio   Compression     Decompression
> memcpy                  1.000   13700 MB/s      13700 MB/s
> Zstandard 1.4.0 -1      2.883   515 MB/s        1380 MB/s
> LZ4 HC -9 (v1.9.0)      2.721   41 MB/s         4900 MB/s

There is one solution now: Daeho has submitted two patches:

f2fs: add compress_mode mount option
f2fs: add F2FS_IOC_DECOMPRESS_FILE and F2FS_IOC_COMPRESS_FILE

These allow all files in the data partition to be marked as compressible
while still being written uncompressed by default; then, when the system is
idle, we can use the ioctls to reload and compress the data of specific files.

> 
> Also a minor thing is lzo-rle, initially it was only used for in-memory
> anonymous pages and it won't be kept on-disk so that's fine. I'm not sure
> if lzo original author want to support it or not. It'd be better to get


Hmm.. that's a problem, as there may already be users who are using
lzo-rle; removing lzo-rle support would cause a compatibility issue...

IMO, the fact that f2fs may already have persisted data in the lzo-rle
compressed format may affect the author's decision on whether to support
that algorithm.

> some opinion before keeping it on-disk.

Yes, I can try to ask... :)

Thanks,

> 
> Thanks,
> Gao Xiang
> 
>> - Eric
