linux-f2fs-devel.lists.sourceforge.net archive mirror
From: Chao Yu <yuchao0@huawei.com>
To: Daeho Jeong <daeho43@gmail.com>
Cc: jaegeuk@kernel.org, linux-f2fs-devel@lists.sourceforge.net,
	Daeho Jeong <daehojeong@google.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [f2fs-dev] [PATCH] f2fs: compress: fix zstd data corruption
Date: Fri, 8 May 2020 14:42:41 +0800
Message-ID: <2a241a80-2597-ef9e-62b5-cf2b8bdb33c4@huawei.com>
In-Reply-To: <CACOAw_z39D=2GONkMaQX6pSi2z26nqCvBZwZK-M=n3_yc84+yg@mail.gmail.com>

On 2020/5/8 11:30, Daeho Jeong wrote:
> I am a little bit confused.
> 
> In compress_log=2 (4 pages),
> 
> Every compression algorithm will set the cc->nr_cpages to 5 pages like below.
> 
>         max_len = COMPRESS_HEADER_SIZE + cc->clen;
>         cc->nr_cpages = DIV_ROUND_UP(max_len, PAGE_SIZE);
> 
>         cc->cpages = f2fs_kzalloc(sbi, sizeof(struct page *) *
>                                         cc->nr_cpages, GFP_NOFS);
> 
> And call cops->compress_pages(cc) and the returned length of the compressed data will be set to cc->clen for every case.
> And if the cc->clen is larger than max_len, we will give up compression.
> 
>         ret = cops->compress_pages(cc);
>         if (ret)
>                 goto out_vunmap_cbuf;
> 
>         max_len = PAGE_SIZE * (cc->cluster_size - 1) - COMPRESS_HEADER_SIZE;
> 
>         if (cc->clen > max_len) {
>                 ret = -EAGAIN;
>                 goto out_vunmap_cbuf;
>         }
> 
> So, with your patch, we will just use 3 pages for ZSTD and 5 pages for LZO and LZ4 now.
> My question was whether it is also possible to decrease the compression buffer size for LZO and LZ4 to 3 pages like ZSTD case.
> I was just curious about that. :)
I guess we can change LZ4 the same way we did for the ZSTD case, since it
supports partial compression:

- lz4_compress_pages
 - LZ4_compress_default
  - LZ4_compress_fast
   - LZ4_compress_fast_extState
    if (maxOutputSize < LZ4_COMPRESSBOUND(inputSize))
     - LZ4_compress_generic(..., limitedOutput, ...)
      - if (outputLimited && boundary_check_condition) return 0;
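
The "limited output" contract in the call chain above can be sketched in
userspace C. This is only an illustration under stated assumptions:
toy_compress_limited() is a hypothetical run-length encoder standing in for
LZ4_compress_default(); it is not LZ4 and shares only the convention of
returning 0 when the result does not fit in the caller's buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Toy run-length encoder (hypothetical, NOT LZ4). It emits (count, byte)
 * pairs and returns the compressed length, or 0 as soon as the next pair
 * would not fit in max_out, so the destination is never overrun.
 */
static int toy_compress_limited(const unsigned char *src, int src_len,
                                unsigned char *dst, int max_out)
{
        int in = 0, out = 0;

        assert(src_len >= 0 && max_out >= 0);

        while (in < src_len) {
                unsigned char byte = src[in];
                int run = 1;

                /* Extend the run while the byte repeats, capped at 255. */
                while (in + run < src_len && src[in + run] == byte && run < 255)
                        run++;

                /* Boundary check: give up instead of writing past max_out. */
                if (out + 2 > max_out)
                        return 0;

                dst[out++] = (unsigned char)run;
                dst[out++] = byte;
                in += run;
        }
        return out;
}
```

With a compressor that behaves like this, the caller could pass the smaller
buffer size as the output limit and treat a return value of 0 as "does not
fit, fall back to writing raw data".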

For the LZO case, it looks like we have to keep allocating 5 pages for the
worst compression case, since as far as I checked it doesn't support partial
compression.
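
The page counts discussed in this thread can be checked with a small
userspace calculation. This is a sketch under assumptions: EX_PAGE_SIZE
(4096) and EX_HEADER_SIZE (24) are illustrative stand-ins for PAGE_SIZE and
COMPRESS_HEADER_SIZE, and the helper names are made up for this example. The
two bound macros follow the definitions quoted further down, with the LZ4
input-size check left out.

```c
#include <assert.h>
#include <stddef.h>

#define EX_PAGE_SIZE            4096u   /* assumption: 4 KiB pages */
#define EX_HEADER_SIZE          24u     /* assumption: compress header size */
#define DIV_ROUND_UP(n, d)      (((n) + (d) - 1) / (d))

/* Worst-case output bounds as quoted in the thread (size check omitted). */
#define LZ4_COMPRESSBOUND(isize)        ((isize) + ((isize) / 255) + 16)
#define lzo1x_worst_compress(x)         ((x) + ((x) / 16) + 64 + 3 + 2)

/* Pages needed for a compress buffer holding header + clen bytes. */
static unsigned int cbuf_pages(unsigned int clen)
{
        return DIV_ROUND_UP(EX_HEADER_SIZE + clen, EX_PAGE_SIZE);
}

/* Largest compressed length still accepted: at least one page must be saved. */
static unsigned int accept_limit(unsigned int cluster_pages)
{
        assert(cluster_pages > 1);
        return EX_PAGE_SIZE * (cluster_pages - 1) - EX_HEADER_SIZE;
}
```

For a 4-page cluster (16384 bytes of input), both worst-case bounds exceed
16384, so cbuf_pages() rounds up to 5 pages, while anything longer than
accept_limit(4) is rejected with -EAGAIN and therefore never needs more than
3 pages.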

Thanks,

> 
> 
> On Fri, May 8, 2020 at 11:48 AM, Chao Yu <yuchao0@huawei.com> wrote:
> 
>     Hi Daeho,
> 
>     On 2020/5/8 9:28, Daeho Jeong wrote:
>     > Hi Chao,
>     >
>     > IIUC, you are trying not to use ZSTD_compressBound() to save the memory
>     > space. Am I right?
>     >
>     > Then, how about LZ4_compressBound() for LZ4 and lzo1x_worst_compress() for
>     > LZO?
> 
>     Oops, it looks like those limits were used wrongly...
> 
>     #define LZ4_COMPRESSBOUND(isize)        (\
>             (unsigned int)(isize) > (unsigned int)LZ4_MAX_INPUT_SIZE \
>             ? 0 \
>             : (isize) + ((isize)/255) + 16)
> 
>     #define lzo1x_worst_compress(x) ((x) + ((x) / 16) + 64 + 3 + 2)
> 
>     The newly calculated boundary size is larger than the target buffer size.
> 
>     However, the comments on LZ4_compress_default() say:
> 
>     ...
>      * @maxOutputSize: full or partial size of buffer 'dest'
>      *      which must be already allocated
>     ...
>     int LZ4_compress_default(const char *source, char *dest, int inputSize,
>             int maxOutputSize, void *wrkmem);
> 
>     And @out_len in lzo1x_1_compress() is an output parameter that returns the
>     length of the data the compressor wrote into the @out buffer.
> 
>     Let me know if I missed something.
> 
>     Thanks,
> 
>     > Could we save more memory space for these two cases like ZSTD?
>     > As you know, we are using 5 pages compression buffer for LZ4 and LZO in
>     > compress_log_size=2,
>     > and if the compressed data doesn't fit in 3 pages, it returns -EAGAIN to
>     > give up compressing that one.
>     >
>     > Thanks,
>     >
>     > On Fri, May 8, 2020 at 10:17 AM, Chao Yu <yuchao0@huawei.com> wrote:
>     >
>     >> During zstd compression, ZSTD_endStream() may return a non-zero value
>     >> because the destination buffer is full while compressed data still
>     >> remains in the intermediate buffer. This means the zstd algorithm cannot
>     >> save at least one block of space, so let's just write back the raw data
>     >> instead of the compressed data. This fixes data corruption when
>     >> decompressing incompletely stored compressed data.
>     >>
>     >> Signed-off-by: Daeho Jeong <daehojeong@google.com>
>     >> Signed-off-by: Chao Yu <yuchao0@huawei.com>
>     >> ---
>     >>  fs/f2fs/compress.c | 7 +++++++
>     >>  1 file changed, 7 insertions(+)
>     >>
>     >> diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
>     >> index c22cc0d37369..5e4947250262 100644
>     >> --- a/fs/f2fs/compress.c
>     >> +++ b/fs/f2fs/compress.c
>     >> @@ -358,6 +358,13 @@ static int zstd_compress_pages(struct compress_ctx *cc)
>     >>                 return -EIO;
>     >>         }
>     >>
>     >> +       /*
>     >> +        * there is compressed data remained in intermediate buffer due to
>     >> +        * no more space in cbuf.cdata
>     >> +        */
>     >> +       if (ret)
>     >> +               return -EAGAIN;
>     >> +
>     >>         cc->clen = outbuf.pos;
>     >>         return 0;
>     >>  }
>     >> --
>     >> 2.18.0.rc1
>     >>
>     >>
>     >>
>     >> _______________________________________________
>     >> Linux-f2fs-devel mailing list
>     >> Linux-f2fs-devel@lists.sourceforge.net
>     >> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
>     >>
>     >
> 
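
The check added by the quoted patch relies on the end-of-stream call
reporting how much compressed data is still held back. A minimal model of
that contract, assuming hypothetical toy_* names that are not the zstd API,
could look like this:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a streaming compressor's internal state. */
struct toy_stream {
        const unsigned char *pending;   /* compressed bytes not yet written */
        size_t pending_len;
};

struct toy_outbuf {
        unsigned char *dst;
        size_t size;
        size_t pos;
};

/*
 * Models an end-of-stream flush: copy as much pending data as fits and
 * return the number of bytes still left inside the stream (0 when done).
 */
static size_t toy_end_stream(struct toy_stream *s, struct toy_outbuf *out)
{
        size_t room = out->size - out->pos;
        size_t n = s->pending_len < room ? s->pending_len : room;

        memcpy(out->dst + out->pos, s->pending, n);
        out->pos += n;
        s->pending += n;
        s->pending_len -= n;
        return s->pending_len;
}

/* Caller side, mirroring the check the patch adds. */
static int toy_finish(struct toy_stream *s, struct toy_outbuf *out,
                      size_t *clen)
{
        assert(out->pos <= out->size);

        if (toy_end_stream(s, out))
                return -EAGAIN; /* output incomplete: store raw data instead */

        *clen = out->pos;
        return 0;
}
```

Without the -EAGAIN branch, the caller would record out->pos as the
compressed length even though part of the stream never reached the buffer,
which is the incomplete data the commit message describes.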


