From: Gao Xiang <hsiangkao@redhat.com>
To: Chao Yu <yuchao0@huawei.com>
Cc: Eric Biggers <ebiggers@kernel.org>,
	jaegeuk@kernel.org, linux-kernel@vger.kernel.org,
	linux-f2fs-devel@lists.sourceforge.net
Subject: Re: [f2fs-dev] [PATCH v6] f2fs: compress: support compress level
Date: Fri, 4 Dec 2020 10:06:59 +0800
Message-ID: <20201204020659.GB1963435@xiangao.remote.csb>
In-Reply-To: <7b975d1a-a06c-4e14-067e-064afc200934@huawei.com>

On Fri, Dec 04, 2020 at 09:56:27AM +0800, Chao Yu wrote:
> Hi Xiang,
> 
> On 2020/12/4 8:31, Gao Xiang wrote:
> > Hi Chao,
> > 
> > On Thu, Dec 03, 2020 at 11:32:34AM -0800, Eric Biggers wrote:
> > 
> > ...
> > 
> > > 
> > > What is the use case for storing the compression level on-disk?
> > > 
> > > Keep in mind that compression levels are an implementation detail; the exact
> > > compressed data that is produced by a particular algorithm at a particular
> > > compression level is *not* a stable interface.  It can change when the
> > > compressor is updated, as long as the output continues to be compatible with the
> > > decompressor.
> > > 
> > > So does compression level really belong in the on-disk format?
> > > 
> > 
> > Curious about this, since f2fs compression uses a 16k f2fs compress cluster
> > by default (doesn't do sub-block compression by design as what btrfs did),
> > so is there a significant CR difference between lz4 and lz4hc in the 16k
> > configuration? (I guess using zstd or lz4hc for a 128k cluster like btrfs
> > could make more sense.) Could you share some CR numbers for these
> > algorithms on typical datasets (enwik9, silesia.tar or others) with a 16k
> > cluster size?
> 
> Yup, I can figure out some numbers later. :)
> 
> > 
> > As you may have noticed, lz4hc is much slower than lz4, so if it's used
> > online, it's a good way for unprivileged users to keep all CPUs busy (under
> > writeback). I'm not sure if it matters. (OK, it'll give users more options
> > at least, yet I'm not sure end users quite understand what these
> > algorithms really mean; I guess it spends more CPU time without much
> > more storage saving with the default 16k configuration.)
> > 
> > from https://github.com/lz4/lz4    Core i7-9700K CPU @ 4.9GHz
> > Silesia Corpus
> > 
> > Compressor              Ratio   Compression     Decompression
> > memcpy                  1.000   13700 MB/s      13700 MB/s
> > Zstandard 1.4.0 -1      2.883   515 MB/s        1380 MB/s
> > LZ4 HC -9 (v1.9.0)      2.721   41 MB/s         4900 MB/s
> 
> There is one solution now: Daeho has submitted two patches:
> 
> f2fs: add compress_mode mount option
> f2fs: add F2FS_IOC_DECOMPRESS_FILE and F2FS_IOC_COMPRESS_FILE
> 
> These allow all files in the data partition to be marked as compressible. By
> default, all files are written uncompressed; when the system is idle, we can
> use the ioctl to reload and compress the data of specific files.
> 

Yeah, my own tentative suggestion is that f2fs compression already has many
options, but end users are not compression experts. So it'd be better to
offer users only the options with a clear advantage (otherwise users might
get confused, select the wrong algorithm, or complain...)

Keeping lz4hc dirty data under writeback could block writeback, keep kswapd
busy, and stall the direct memory reclaim path; I guess that's why online
compression rarely chooses it. My own tentative suggestion is that it'd be
better to show the CR or performance benefits in advance, and to prevent
unprivileged users from selecting high-level lz4hc (to avoid a potential
attack on the system), either via mount options or ioctl.
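
For what it's worth, here is a rough userspace sketch of how such CR numbers
could be collected per cluster size. It is only an illustration: it uses
Python's zlib as a stand-in (lz4/lz4hc are not in the standard library), the
function name cluster_ratio is made up, and it ignores f2fs's 4k block
rounding and cluster header. It assumes a cluster that does not shrink is
kept uncompressed.

```python
import zlib


def cluster_ratio(data: bytes, cluster_size: int, level: int) -> float:
    """Compress `data` in independent clusters; return original/compressed size.

    zlib is only a stand-in compressor here, and sizes are counted in bytes
    (no 4k block rounding, no cluster header), so treat the result as a
    rough indication rather than what f2fs would report.
    """
    if not data:
        raise ValueError("empty input")
    compressed = 0
    for off in range(0, len(data), cluster_size):
        chunk = data[off:off + cluster_size]
        # Assumption: a cluster that does not shrink is stored as-is.
        compressed += min(len(chunk), len(zlib.compress(chunk, level)))
    return len(data) / compressed


if __name__ == "__main__":
    # Synthetic, highly repetitive sample; real tests should use enwik9 etc.
    sample = b"".join(b"inode %d: block address %d\n" % (i, i * 4096)
                      for i in range(40000))
    for size in (16 * 1024, 128 * 1024):
        for level in (1, 9):
            ratio = cluster_ratio(sample, size, level)
            print(f"cluster={size // 1024}k level={level} CR={ratio:.3f}")
```

Running it on a real dataset with 16k and 128k clusters would show whether
the higher level buys much at the default cluster size.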

> > 
> > Also a minor thing is lzo-rle, initially it was only used for in-memory
> > anonymous pages and it won't be kept on-disk so that's fine. I'm not sure
> > if lzo original author want to support it or not. It'd be better to get
> 
> 
> Hmm.. that's a problem, as there may already be users of lzo-rle;
> removing lzo-rle support would cause a compatibility issue...
> 
> IMO, the fact that "f2fs may have already persisted data in the lzo-rle
> compressed format" may affect the author's decision on whether to support
> that algorithm.
> 
> > some opinion before keeping it on-disk.
> 
> Yes, I can try to ask... :)

Yeah, it'd be better to ask the author first, or it may be necessary to
maintain a private lzo-rle fork...

Thanks,
Gao Xiang

> 
> Thanks,
> 
> > 
> > Thanks,
> > Gao Xiang
> > 
> > > - Eric
> > > 
> > > 
> > > _______________________________________________
> > > Linux-f2fs-devel mailing list
> > > Linux-f2fs-devel@lists.sourceforge.net
> > > https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> > 
> > .
> > 
> 

