linux-f2fs-devel.lists.sourceforge.net archive mirror
* [f2fs-dev] [PATCH] f2fs: compress: fix zstd data corruption
@ 2020-05-08  1:16 Chao Yu
       [not found] ` <CACOAw_xxS_Wf==KnD31f9AOMu+QUs3WacowsfcD6w4A9n2AkTg@mail.gmail.com>
  2020-05-12  1:44 ` Chao Yu
  0 siblings, 2 replies; 5+ messages in thread
From: Chao Yu @ 2020-05-08  1:16 UTC
  To: jaegeuk; +Cc: Daeho Jeong, linux-kernel, linux-f2fs-devel

During zstd compression, ZSTD_endStream() may return a non-zero value
because the destination buffer is full while there is still compressed
data remaining in the intermediate buffer. This means the zstd algorithm
cannot save at least one block of space, so let's just write back the
raw data instead of the compressed data. This fixes data corruption
caused by decompressing incompletely stored compressed data.
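
To illustrate, ZSTD_endStream()'s return value reports how much data is
still waiting in zstd's internal buffer, which is what the new check
relies on (a minimal sketch with illustrative variable names, not the
exact f2fs code):

	size_t remaining = ZSTD_endStream(stream, &outbuf);

	if (ZSTD_isError(remaining))
		return -EIO;	/* compression error */
	/*
	 * zero means the whole frame fits in outbuf; non-zero means
	 * outbuf ran out of space and the stored frame is truncated
	 */
	if (remaining)
		return -EAGAIN;	/* fall back to writing raw data */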

Signed-off-by: Daeho Jeong <daehojeong@google.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
---
 fs/f2fs/compress.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
index c22cc0d37369..5e4947250262 100644
--- a/fs/f2fs/compress.c
+++ b/fs/f2fs/compress.c
@@ -358,6 +358,13 @@ static int zstd_compress_pages(struct compress_ctx *cc)
 		return -EIO;
 	}
 
+	/*
+	 * there is compressed data remaining in the intermediate buffer
+	 * due to no more space in cbuf.cdata
+	 */
+	if (ret)
+		return -EAGAIN;
+
 	cc->clen = outbuf.pos;
 	return 0;
 }
-- 
2.18.0.rc1





* Re: [f2fs-dev] [PATCH] f2fs: compress: fix zstd data corruption
       [not found] ` <CACOAw_xxS_Wf==KnD31f9AOMu+QUs3WacowsfcD6w4A9n2AkTg@mail.gmail.com>
@ 2020-05-08  2:48   ` Chao Yu
       [not found]     ` <CACOAw_z39D=2GONkMaQX6pSi2z26nqCvBZwZK-M=n3_yc84+yg@mail.gmail.com>
  0 siblings, 1 reply; 5+ messages in thread
From: Chao Yu @ 2020-05-08  2:48 UTC
  To: Daeho Jeong; +Cc: jaegeuk, linux-f2fs-devel, Daeho Jeong, linux-kernel

Hi Daeho,

On 2020/5/8 9:28, Daeho Jeong wrote:
> Hi Chao,
> 
> IIUC, you are trying not to use ZSTD_compressBound() in order to save
> memory space. Am I right?
> 
> Then, how about LZ4_compressBound() for LZ4 and lzo1x_worst_compress() for
> LZO?

Oops, it looks like those limits were used wrongly...

#define LZ4_COMPRESSBOUND(isize)	(\
	(unsigned int)(isize) > (unsigned int)LZ4_MAX_INPUT_SIZE \
	? 0 \
	: (isize) + ((isize)/255) + 16)

#define lzo1x_worst_compress(x) ((x) + ((x) / 16) + 64 + 3 + 2)

The newly calculated boundary size is larger than the target buffer size.
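
As a quick check, for a 4-page cluster (16384 bytes of input, assuming
PAGE_SIZE == 4096):

	LZ4_COMPRESSBOUND(16384)    = 16384 + 16384/255 + 16
	                            = 16464 bytes
	lzo1x_worst_compress(16384) = 16384 + 16384/16 + 64 + 3 + 2
	                            = 17477 bytes

Both exceed the 16384-byte cluster, so after adding COMPRESS_HEADER_SIZE
and rounding up we allocate 5 pages.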

However, the comments on LZ4_compress_default() say:

...
 * @maxOutputSize: full or partial size of buffer 'dest'
 *	which must be already allocated
...
int LZ4_compress_default(const char *source, char *dest, int inputSize,
	int maxOutputSize, void *wrkmem);

And @out_len in lzo1x_1_compress() is an output parameter that returns
the length of the data the compressor wrote into the @out buffer.
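
If I read include/linux/lzo.h correctly, its prototype is:

	int lzo1x_1_compress(const unsigned char *src, size_t src_len,
			     unsigned char *dst, size_t *dst_len,
			     void *wrkmem);

so there is no caller-supplied output limit it could stop at.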

Let me know if I missed something.

Thanks,

> Could we save more memory space for these two cases like ZSTD?
> As you know, we are using a 5-page compression buffer for LZ4 and LZO with
> compress_log_size=2,
> and if the compressed data doesn't fit in 3 pages, it returns -EAGAIN to
> give up compressing that one.
> 
> Thanks,
> 
> On Fri, May 8, 2020 at 10:17 AM, Chao Yu <yuchao0@huawei.com> wrote:
> 
> [original patch quoted in full; snipped]
> 




* Re: [f2fs-dev] [PATCH] f2fs: compress: fix zstd data corruption
       [not found]     ` <CACOAw_z39D=2GONkMaQX6pSi2z26nqCvBZwZK-M=n3_yc84+yg@mail.gmail.com>
@ 2020-05-08  6:42       ` Chao Yu
  2020-05-08  6:51         ` Daeho Jeong
  0 siblings, 1 reply; 5+ messages in thread
From: Chao Yu @ 2020-05-08  6:42 UTC
  To: Daeho Jeong; +Cc: jaegeuk, linux-f2fs-devel, Daeho Jeong, linux-kernel

On 2020/5/8 11:30, Daeho Jeong wrote:
> I am a little bit confused.
> 
> In compress_log_size=2 (4 pages),
> 
> Every compression algorithm will set cc->nr_cpages to 5 pages, as below.
> 
>         max_len = COMPRESS_HEADER_SIZE + cc->clen;
>         cc->nr_cpages = DIV_ROUND_UP(max_len, PAGE_SIZE);
> 
>         cc->cpages = f2fs_kzalloc(sbi, sizeof(struct page *) *
>                                         cc->nr_cpages, GFP_NOFS);
> 
> Then we call cops->compress_pages(cc), and in every case the returned
> length of the compressed data is set in cc->clen. If cc->clen is larger
> than max_len, we give up compression.
> 
>         ret = cops->compress_pages(cc);
>         if (ret)
>                 goto out_vunmap_cbuf;
> 
>         max_len = PAGE_SIZE * (cc->cluster_size - 1) - COMPRESS_HEADER_SIZE;
> 
>         if (cc->clen > max_len) {
>                 ret = -EAGAIN;
>                 goto out_vunmap_cbuf;
>         }
> 
> So, with your patch, we will just use 3 pages for ZSTD but still 5 pages
> for LZO and LZ4 now.
> My question was whether it is also possible to decrease the compression
> buffer size for LZO and LZ4 to 3 pages, like the ZSTD case.
> I was just curious about that. :)

I guess we can change LZ4 as we did for the ZSTD case, since it supports
partial compression:

- lz4_compress_pages
 - LZ4_compress_default
  - LZ4_compress_fast
   - LZ4_compress_fast_extState
    if (maxOutputSize < LZ4_COMPRESSBOUND(inputSize))
     - LZ4_compress_generic(..., limitedOutput, ...)
      - if (outputLimited && boundary_check_condition) return 0;
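
A rough, untested sketch of what that could look like (capping the output
buffer the same way as the ZSTD path, and treating LZ4_compress_default()
returning 0 as "output didn't fit"; field names follow compress.c):

	static int lz4_compress_pages(struct compress_ctx *cc)
	{
		int len;

		/*
		 * with cc->clen initialized to
		 * cc->rlen - PAGE_SIZE - COMPRESS_HEADER_SIZE, compression
		 * succeeds only if it saves at least one block
		 */
		len = LZ4_compress_default(cc->rbuf, cc->cbuf->cdata,
					cc->rlen, cc->clen, cc->private);
		if (!len)
			return -EAGAIN;

		cc->clen = len;
		return 0;
	}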

And for the LZO case, it looks like we have to keep allocating 5 pages for
the worst compression case, as it doesn't support partial compression as
far as I checked.

Thanks,

> 
> 
> On Fri, May 8, 2020 at 11:48 AM, Chao Yu <yuchao0@huawei.com> wrote:
> 
> [earlier messages quoted in full; snipped]




* Re: [f2fs-dev] [PATCH] f2fs: compress: fix zstd data corruption
  2020-05-08  6:42       ` Chao Yu
@ 2020-05-08  6:51         ` Daeho Jeong
  0 siblings, 0 replies; 5+ messages in thread
From: Daeho Jeong @ 2020-05-08  6:51 UTC
  To: Chao Yu; +Cc: jaegeuk, linux-f2fs-devel, Daeho Jeong, linux-kernel

Great, thanks for checking~! :)


On Fri, May 8, 2020 at 3:42 PM, Chao Yu <yuchao0@huawei.com> wrote:
>
> [previous message quoted in full; snipped]




* Re: [f2fs-dev] [PATCH] f2fs: compress: fix zstd data corruption
  2020-05-08  1:16 [f2fs-dev] [PATCH] f2fs: compress: fix zstd data corruption Chao Yu
       [not found] ` <CACOAw_xxS_Wf==KnD31f9AOMu+QUs3WacowsfcD6w4A9n2AkTg@mail.gmail.com>
@ 2020-05-12  1:44 ` Chao Yu
  1 sibling, 0 replies; 5+ messages in thread
From: Chao Yu @ 2020-05-12  1:44 UTC
  To: jaegeuk; +Cc: Daeho Jeong, linux-kernel, linux-f2fs-devel

Hi Jaegeuk,

I think we need this patch in order to avoid writing back incomplete data
compressed by zstd.

On 2020/5/8 9:16, Chao Yu wrote:
> During zstd compression, ZSTD_endStream() may return a non-zero value
> because the destination buffer is full while there is still compressed
> data remaining in the intermediate buffer. This means the zstd algorithm
> cannot save at least one block of space, so let's just write back the
> raw data instead of the compressed data. This fixes data corruption
> caused by decompressing incompletely stored compressed data.
> 

Fixes: 50cfa66f0de0 ("f2fs: compress: support zstd compress algorithm")

Thanks,


> [rest of patch snipped]



