* [PATCH 0/4] btrfs: lzo: Harden decompression callers to avoid
@ 2018-05-17  6:27 Qu Wenruo
  2018-05-17  6:27 ` [PATCH 1/4] btrfs: compression: Add linux/sizes.h for compression.h Qu Wenruo
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Qu Wenruo @ 2018-05-17  6:27 UTC (permalink / raw)
  To: linux-btrfs; +Cc: jamespharvey20

James Harvey reported a pretty strange kernel misbehavior: after
reading certain btrfs compressed data, the kernel crashes with an
unrelated call trace.
(https://bugzilla.kernel.org/show_bug.cgi?id=199707 and
 https://www.spinics.net/lists/linux-btrfs/msg77971.html)

Thanks to his comprehensive debugging help, we narrowed the problem
down to:

1) Bad lzo decompression verification check
   The direct cause.
   The lzo decompression code copies compressed data into its workspace
   when a segment crosses a page boundary.
   If a corrupted segment header contains an invalid length, btrfs will
   do that memory copy anyway and corrupt kernel memory.

   Fix it by adding a header length check for each segment and, just in
   case, also an uncompressed data length check. (Patch 3; a minimal
   sketch of the check follows after this list.)

2) Compressed extent created for a NODATASUM inode
   The root cause.
   This will be addressed in another patchset.
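
To make the direct cause concrete, below is a minimal standalone sketch
of the kind of bound check patch 3 introduces before any memory copy is
done. The function name, types and constants here are illustrative
only, not the exact kernel code:

#include <stdint.h>
#include <errno.h>      /* EUCLEAN is Linux-specific */

/* Worst-case lzo1x output for one 4K page, mirroring lzo1x_worst_compress(). */
#define MAX_SEGMENT_LEN (4096u + 4096u / 16 + 64 + 3)

/*
 * Reject a segment before copying it into the decompression workspace.
 * @seg_len:  length found in the 4-byte segment header
 * @consumed: bytes of the compressed extent already consumed
 * @tot_len:  total compressed size recorded in the extent header
 */
int check_segment_len(uint32_t seg_len, uint32_t consumed, uint32_t tot_len)
{
        if (seg_len > MAX_SEGMENT_LEN)          /* no page compresses to more than this */
                return -EUCLEAN;
        if (consumed + seg_len > tot_len)       /* segment would run past the recorded total */
                return -EUCLEAN;
        return 0;
}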

The other patches are mostly cosmetic: adding an extra include to
compression.h to avoid a later compile error (Patch 1), adding a
comment on how btrfs organizes its lzo compressed data (Patch 2), and
adding an extra check for inlined compressed data even though such
corruption is not really expected there (Patch 4).

Qu Wenruo (4):
  btrfs: compression: Add linux/sizes.h for compression.h
  btrfs: lzo: Add comment about how btrfs records its lzo compressed
    data
  btrfs: lzo: Add header length check to avoid slab out of bounds access
  btrfs: lzo: Harden inline lzo compressed extent decompression

 fs/btrfs/compression.h |  1 +
 fs/btrfs/lzo.c         | 63 +++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 63 insertions(+), 1 deletion(-)

-- 
2.17.0


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/4] btrfs: compression: Add linux/sizes.h for compression.h
  2018-05-17  6:27 [PATCH 0/4] btrfs: lzo: Harden decompression callers to avoid Qu Wenruo
@ 2018-05-17  6:27 ` Qu Wenruo
  2018-05-17  7:49   ` Nikolay Borisov
  2018-05-17  6:27 ` [PATCH 2/4] btrfs: lzo: Add comment about how btrfs records its lzo compressed data Qu Wenruo
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 12+ messages in thread
From: Qu Wenruo @ 2018-05-17  6:27 UTC (permalink / raw)
  To: linux-btrfs; +Cc: jamespharvey20

compression.h uses SZ_* macros, so if a user includes compression.h
without also including linux/sizes.h, it will hit a compile error.

One example is lzo.c: if it used BTRFS_MAX_COMPRESSED, it would fail to
compile.

Fix it by including linux/sizes.h from compression.h.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/compression.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/btrfs/compression.h b/fs/btrfs/compression.h
index cc605f7b23fb..317703d9b073 100644
--- a/fs/btrfs/compression.h
+++ b/fs/btrfs/compression.h
@@ -6,6 +6,7 @@
 #ifndef BTRFS_COMPRESSION_H
 #define BTRFS_COMPRESSION_H
 
+#include <linux/sizes.h>
 /*
  * We want to make sure that amount of RAM required to uncompress an extent is
  * reasonable, so we limit the total size in ram of a compressed extent to
-- 
2.17.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread
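
For context, the dependency being fixed looks roughly like this (an
illustrative excerpt, not the complete header):

/* fs/btrfs/compression.h (excerpt) */
#include <linux/sizes.h>        /* provides the SZ_* macros used below */

/* cap the total in-memory size of a compressed extent */
#define BTRFS_MAX_COMPRESSED    (SZ_128K)

Without the include, any file that pulls in compression.h but not
linux/sizes.h (as lzo.c will once it starts using BTRFS_MAX_COMPRESSED)
fails to build because SZ_128K is undefined.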

* [PATCH 2/4] btrfs: lzo: Add comment about how btrfs records its lzo compressed data
  2018-05-17  6:27 [PATCH 0/4] btrfs: lzo: Harden decompression callers to avoid Qu Wenruo
  2018-05-17  6:27 ` [PATCH 1/4] btrfs: compression: Add linux/sizes.h for compression.h Qu Wenruo
@ 2018-05-17  6:27 ` Qu Wenruo
  2018-05-17  7:48   ` Nikolay Borisov
  2018-05-17  6:27 ` [PATCH 3/4] btrfs: lzo: Add header length check to avoid slab out of bounds access Qu Wenruo
  2018-05-17  6:27 ` [PATCH 4/4] btrfs: lzo: Harden inline lzo compressed extent decompression Qu Wenruo
  3 siblings, 1 reply; 12+ messages in thread
From: Qu Wenruo @ 2018-05-17  6:27 UTC (permalink / raw)
  To: linux-btrfs; +Cc: jamespharvey20

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/lzo.c | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/fs/btrfs/lzo.c b/fs/btrfs/lzo.c
index 0667ea07f766..3d2ae4c08876 100644
--- a/fs/btrfs/lzo.c
+++ b/fs/btrfs/lzo.c
@@ -17,6 +17,29 @@
 
 #define LZO_LEN	4
 
+/*
+ * Btrfs LZO compression format
+ *
+ * Regular LZO compressed data extent is consist of:
+ * 1.  Header
+ *     Fixed size. LZO_LEN (4) bytes long, LE32.
+ *     Records the total size (*includes* the header) of real compressed data.
+ *
+ * 2.  Segment(s)
+ *     Variable size. Includes one segment header, and then data payload.
+ *     One btrfs compressed data can have one or more segments.
+ *
+ * 2.1 Segment header
+ *     Fixed size. LZO_LEN (4) bytes long, LE32.
+ *     Records the total size of the segment (*excludes* the header).
+ *
+ * 2.2 Data Payload
+ *     Variable size. Size up limit should be lzo1x_worst_compress(PAGE_SIZE).
+ *
+ * While for inlined LZO compressed data extent, it doesn't have Header, just
+ * one Segment.
+ */
+
 struct workspace {
 	void *mem;
 	void *buf;	/* where decompressed data goes */
-- 
2.17.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread
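
To make the layout described in the new comment concrete, here is a
rough userspace sketch that walks one regular (non-inline) compressed
extent laid out this way. The page-boundary handling of the real lzo.c
is intentionally omitted and all names are illustrative:

#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define LZO_LEN 4

/* read one little-endian 32-bit length field, as read_compress_length() does */
uint32_t read_len(const uint8_t *p)
{
        uint32_t v;

        memcpy(&v, p, LZO_LEN);
        return v;       /* assumes a little-endian host for brevity */
}

/*
 * Walk "header + segments" of one compressed extent buffer.
 * Returns the number of segments, or -1 if the recorded lengths are
 * inconsistent.
 */
int walk_extent(const uint8_t *data, size_t extent_len)
{
        uint32_t tot_len;               /* 1. header: total size, header included */
        size_t off = LZO_LEN;
        int segments = 0;

        if (extent_len < LZO_LEN)
                return -1;
        tot_len = read_len(data);
        if (tot_len < LZO_LEN || tot_len > extent_len)
                return -1;

        while (off + LZO_LEN <= tot_len) {
                uint32_t seg_len = read_len(data + off);        /* 2.1 segment header */

                off += LZO_LEN;
                if (seg_len == 0 || seg_len > tot_len - off)    /* 2.2 payload must fit */
                        return -1;
                off += seg_len;         /* skip the compressed payload */
                segments++;
        }
        return segments;
}

An inline extent would skip the leading header and start directly with
its single segment.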

* [PATCH 3/4] btrfs: lzo: Add header length check to avoid slab out of bounds access
  2018-05-17  6:27 [PATCH 0/4] btrfs: lzo: Harden decompression callers to avoid Qu Wenruo
  2018-05-17  6:27 ` [PATCH 1/4] btrfs: compression: Add linux/sizes.h for compression.h Qu Wenruo
  2018-05-17  6:27 ` [PATCH 2/4] btrfs: lzo: Add comment about how btrfs records its lzo compressed data Qu Wenruo
@ 2018-05-17  6:27 ` Qu Wenruo
  2018-05-17  8:14   ` Nikolay Borisov
  2018-05-17  6:27 ` [PATCH 4/4] btrfs: lzo: Harden inline lzo compressed extent decompression Qu Wenruo
  3 siblings, 1 reply; 12+ messages in thread
From: Qu Wenruo @ 2018-05-17  6:27 UTC (permalink / raw)
  To: linux-btrfs; +Cc: jamespharvey20

James Harvey reported that some corrupted compressed extent data can
lead to various kinds of kernel memory corruption.

Such corrupted extent data belongs to an inode with the NODATASUM flag,
thus data csum won't help us detect such corruption.

If we are lucky enough, KASAN can catch it like:
==================================================================
BUG: KASAN: slab-out-of-bounds in lzo_decompress_bio+0x384/0x7a0 [btrfs]
Write of size 4096 at addr ffff8800606cb0f8 by task kworker/u16:0/2338

CPU: 3 PID: 2338 Comm: kworker/u16:0 Tainted: G           O      4.17.0-rc5-custom+ #50
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
Workqueue: btrfs-endio btrfs_endio_helper [btrfs]
Call Trace:
 dump_stack+0xc2/0x16b
 print_address_description+0x6a/0x270
 kasan_report+0x260/0x380
 memcpy+0x34/0x50
 lzo_decompress_bio+0x384/0x7a0 [btrfs]
 end_compressed_bio_read+0x99f/0x10b0 [btrfs]
 bio_endio+0x32e/0x640
 normal_work_helper+0x15a/0xea0 [btrfs]
 process_one_work+0x7e3/0x1470
 worker_thread+0x1b0/0x1170
 kthread+0x2db/0x390
 ret_from_fork+0x22/0x40
...
==================================================================

The offending compressed data has the following info:

Header:                 length 32768            (Looks completely valid)
Segment 0 Header:       length 3472882419       (Obviously out of bounds)

Then when handling segment 0, since it crosses the current page
boundary, we need to copy the compressed data into the workspace, and
such a huge length triggers an out-of-bounds memory access, screwing up
the whole kernel.

Fix it by adding extra checks on the header and segment headers to
ensure we won't access memory out of bounds, and also check that the
decompressed data won't go out of bounds.

Reported-by: James Harvey <jamespharvey20@gmail.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/lzo.c | 35 ++++++++++++++++++++++++++++++++++-
 1 file changed, 34 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/lzo.c b/fs/btrfs/lzo.c
index 3d2ae4c08876..78ebc809072f 100644
--- a/fs/btrfs/lzo.c
+++ b/fs/btrfs/lzo.c
@@ -281,6 +281,7 @@ static int lzo_decompress_bio(struct list_head *ws, struct compressed_bio *cb)
 	unsigned long working_bytes;
 	size_t in_len;
 	size_t out_len;
+	size_t max_segment_len = lzo1x_worst_compress(PAGE_SIZE);
 	unsigned long in_offset;
 	unsigned long in_page_bytes_left;
 	unsigned long tot_in;
@@ -294,6 +295,18 @@ static int lzo_decompress_bio(struct list_head *ws, struct compressed_bio *cb)
 
 	data_in = kmap(pages_in[0]);
 	tot_len = read_compress_length(data_in);
+	/*
+	 * Compressed data header check.
+	 *
+	 * The real compressed size can't exceed extent length, and all pages
+	 * should be used (a full pending page is not possible).
+	 * If this happens it means the compressed extent is corrupted.
+	 */
+	if (tot_len > min_t(size_t, BTRFS_MAX_COMPRESSED, srclen) ||
+	    tot_len < srclen - PAGE_SIZE) {
+		ret = -EUCLEAN;
+		goto done;
+	}
 
 	tot_in = LZO_LEN;
 	in_offset = LZO_LEN;
@@ -308,6 +321,17 @@ static int lzo_decompress_bio(struct list_head *ws, struct compressed_bio *cb)
 		in_offset += LZO_LEN;
 		tot_in += LZO_LEN;
 
+		/*
+		 * Segment header check.
+		 *
+		 * The segment length must not exceed max lzo compression
+		 * size, nor the total compressed size
+		 */
+		if (in_len > max_segment_len || tot_in + in_len > tot_len) {
+			ret = -EUCLEAN;
+			goto done;
+		}
+
 		tot_in += in_len;
 		working_bytes = in_len;
 		may_late_unmap = need_unmap = false;
@@ -358,7 +382,7 @@ static int lzo_decompress_bio(struct list_head *ws, struct compressed_bio *cb)
 			}
 		}
 
-		out_len = lzo1x_worst_compress(PAGE_SIZE);
+		out_len = max_segment_len;
 		ret = lzo1x_decompress_safe(buf, in_len, workspace->buf,
 					    &out_len);
 		if (need_unmap)
@@ -368,6 +392,15 @@ static int lzo_decompress_bio(struct list_head *ws, struct compressed_bio *cb)
 			ret = -EIO;
 			break;
 		}
+		/*
+		 * Decompressed data length check.
+		 * The uncompressed data should not exceed uncompressed extent
+		 * size.
+		 */
+		if (tot_out + out_len > cb->len) {
+			ret = -EUCLEAN;
+			break;
+		}
 
 		buf_start = tot_out;
 		tot_out += out_len;
-- 
2.17.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread
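
As an aside, plugging the values from the commit message above into the
new checks shows why the corrupted segment is now rejected before any
memcpy. A small standalone sketch, assuming 4K pages; everything here is
illustrative rather than the in-kernel code:

#include <stdio.h>
#include <stdint.h>

#define PAGE_SIZE       4096u
#define LZO_LEN         4u
/* worst-case lzo1x output for one page, per lzo1x_worst_compress(PAGE_SIZE) */
#define MAX_SEGMENT_LEN (PAGE_SIZE + PAGE_SIZE / 16 + 64 + 3)

int main(void)
{
        uint32_t tot_len  = 32768;              /* extent header: looks valid */
        uint32_t seg0_len = 3472882419u;        /* segment 0 header: corrupted */
        uint32_t tot_in   = 2 * LZO_LEN;        /* extent header + segment 0 header consumed */

        if (seg0_len > MAX_SEGMENT_LEN || tot_in + seg0_len > tot_len)
                printf("segment 0 rejected with -EUCLEAN, no out-of-bounds copy\n");

        return 0;
}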

* [PATCH 4/4] btrfs: lzo: Harden inline lzo compressed extent decompression
  2018-05-17  6:27 [PATCH 0/4] btrfs: lzo: Harden decompression callers to avoid Qu Wenruo
                   ` (2 preceding siblings ...)
  2018-05-17  6:27 ` [PATCH 3/4] btrfs: lzo: Add header length check to avoid slab out of bounds access Qu Wenruo
@ 2018-05-17  6:27 ` Qu Wenruo
  3 siblings, 0 replies; 12+ messages in thread
From: Qu Wenruo @ 2018-05-17  6:27 UTC (permalink / raw)
  To: linux-btrfs; +Cc: jamespharvey20

Unlike a regular lzo compressed extent, an inline extent doesn't have a
header, only a single segment.
Furthermore, an inline extent lives in a tree leaf, which always carries
a csum in its header, so corrupted data is less likely there.

Anyway, still add the extra segment header length check.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/lzo.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/fs/btrfs/lzo.c b/fs/btrfs/lzo.c
index 78ebc809072f..077851a9f498 100644
--- a/fs/btrfs/lzo.c
+++ b/fs/btrfs/lzo.c
@@ -425,6 +425,7 @@ static int lzo_decompress(struct list_head *ws, unsigned char *data_in,
 	struct workspace *workspace = list_entry(ws, struct workspace, list);
 	size_t in_len;
 	size_t out_len;
+	size_t max_segment_len = lzo1x_worst_compress(PAGE_SIZE);
 	int ret = 0;
 	char *kaddr;
 	unsigned long bytes;
@@ -434,6 +435,10 @@ static int lzo_decompress(struct list_head *ws, unsigned char *data_in,
 	data_in += LZO_LEN;
 
 	in_len = read_compress_length(data_in);
+	if (in_len > max_segment_len) {
+		ret = -EUCLEAN;
+		goto out;
+	}
 	data_in += LZO_LEN;
 
 	out_len = PAGE_SIZE;
-- 
2.17.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/4] btrfs: lzo: Add comment about how btrfs records its lzo compressed data
  2018-05-17  6:27 ` [PATCH 2/4] btrfs: lzo: Add comment about how btrfs records its lzo compressed data Qu Wenruo
@ 2018-05-17  7:48   ` Nikolay Borisov
  2018-05-17  8:04     ` Qu Wenruo
  0 siblings, 1 reply; 12+ messages in thread
From: Nikolay Borisov @ 2018-05-17  7:48 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs; +Cc: jamespharvey20



On 17.05.2018 09:27, Qu Wenruo wrote:
> Signed-off-by: Qu Wenruo <wqu@suse.com>

Overall it looks good and useful, just a couple of nits below.
> ---
>  fs/btrfs/lzo.c | 23 +++++++++++++++++++++++
>  1 file changed, 23 insertions(+)
> 
> diff --git a/fs/btrfs/lzo.c b/fs/btrfs/lzo.c
> index 0667ea07f766..3d2ae4c08876 100644
> --- a/fs/btrfs/lzo.c
> +++ b/fs/btrfs/lzo.c
> @@ -17,6 +17,29 @@
>  
>  #define LZO_LEN	4
>  
> +/*
> + * Btrfs LZO compression format
> + *
> + * Regular LZO compressed data extent is consist of:
nit: s/is consist/consists
> + * 1.  Header
> + *     Fixed size. LZO_LEN (4) bytes long, LE32.
> + *     Records the total size (*includes* the header) of real compressed data.
> + *
> + * 2.  Segment(s)
> + *     Variable size. Includes one segment header, and then data payload.
> + *     One btrfs compressed data can have one or more segments.

So "one btrfs compressed data" should really mean "one btrfs compressed
extent" I guess?

> + *
> + * 2.1 Segment header
> + *     Fixed size. LZO_LEN (4) bytes long, LE32.
> + *     Records the total size of the segment (*excludes* the header).
> + *
> + * 2.2 Data Payload
> + *     Variable size. Size up limit should be lzo1x_worst_compress(PAGE_SIZE).
> + *
> + * While for inlined LZO compressed data extent, it doesn't have Header, just
> + * one Segment.



> + */
> +
>  struct workspace {
>  	void *mem;
>  	void *buf;	/* where decompressed data goes */
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/4] btrfs: compression: Add linux/sizes.h for compression.h
  2018-05-17  6:27 ` [PATCH 1/4] btrfs: compression: Add linux/sizes.h for compression.h Qu Wenruo
@ 2018-05-17  7:49   ` Nikolay Borisov
  0 siblings, 0 replies; 12+ messages in thread
From: Nikolay Borisov @ 2018-05-17  7:49 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs; +Cc: jamespharvey20



On 17.05.2018 09:27, Qu Wenruo wrote:
> compression.h uses SZ_* macros, so if a user includes compression.h
> without also including linux/sizes.h, it will hit a compile error.
> 
> One example is lzo.c: if it used BTRFS_MAX_COMPRESSED, it would fail to
> compile.
> 
> Fix it by including linux/sizes.h from compression.h.
> 
> Signed-off-by: Qu Wenruo <wqu@suse.com>

Reviewed-by: Nikolay Borisov <nborisov@suse.com>

> ---
>  fs/btrfs/compression.h | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/fs/btrfs/compression.h b/fs/btrfs/compression.h
> index cc605f7b23fb..317703d9b073 100644
> --- a/fs/btrfs/compression.h
> +++ b/fs/btrfs/compression.h
> @@ -6,6 +6,7 @@
>  #ifndef BTRFS_COMPRESSION_H
>  #define BTRFS_COMPRESSION_H
>  
> +#include <linux/sizes.h>
>  /*
>   * We want to make sure that amount of RAM required to uncompress an extent is
>   * reasonable, so we limit the total size in ram of a compressed extent to
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/4] btrfs: lzo: Add comment about how btrfs records its lzo compressed data
  2018-05-17  7:48   ` Nikolay Borisov
@ 2018-05-17  8:04     ` Qu Wenruo
  0 siblings, 0 replies; 12+ messages in thread
From: Qu Wenruo @ 2018-05-17  8:04 UTC (permalink / raw)
  To: Nikolay Borisov, Qu Wenruo, linux-btrfs; +Cc: jamespharvey20



On 2018年05月17日 15:48, Nikolay Borisov wrote:
> 
> 
> On 17.05.2018 09:27, Qu Wenruo wrote:
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
> 
> Overall it looks good and useful, just a couple of nits below.
>> ---
>>  fs/btrfs/lzo.c | 23 +++++++++++++++++++++++
>>  1 file changed, 23 insertions(+)
>>
>> diff --git a/fs/btrfs/lzo.c b/fs/btrfs/lzo.c
>> index 0667ea07f766..3d2ae4c08876 100644
>> --- a/fs/btrfs/lzo.c
>> +++ b/fs/btrfs/lzo.c
>> @@ -17,6 +17,29 @@
>>  
>>  #define LZO_LEN	4
>>  
>> +/*
>> + * Btrfs LZO compression format
>> + *
>> + * Regular LZO compressed data extent is consist of:
> nit: s/is consist/consists
>> + * 1.  Header
>> + *     Fixed size. LZO_LEN (4) bytes long, LE32.
>> + *     Records the total size (*includes* the header) of real compressed data.
>> + *
>> + * 2.  Segment(s)
>> + *     Variable size. Includes one segment header, and then data payload.
>> + *     One btrfs compressed data can have one or more segments.
> 
> So "one btrfs compressed data" should really mean "one btrfs compressed
> extent" I guess?

Oh, yes, one btrfs compressed extent.

Thanks,
Qu

> 
>> + *
>> + * 2.1 Segment header
>> + *     Fixed size. LZO_LEN (4) bytes long, LE32.
>> + *     Records the total size of the segment (*excludes* the header).
>> + *
>> + * 2.2 Data Payload
>> + *     Variable size. Size up limit should be lzo1x_worst_compress(PAGE_SIZE).
>> + *
>> + * While for inlined LZO compressed data extent, it doesn't have Header, just
>> + * one Segment.
> 
> 
> 
>> + */
>> +
>>  struct workspace {
>>  	void *mem;
>>  	void *buf;	/* where decompressed data goes */
>>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/4] btrfs: lzo: Add header length check to avoid slab out of bounds access
  2018-05-17  6:27 ` [PATCH 3/4] btrfs: lzo: Add header length check to avoid slab out of bounds access Qu Wenruo
@ 2018-05-17  8:14   ` Nikolay Borisov
  2018-05-17  8:19     ` Qu Wenruo
  2018-05-22 14:14     ` David Sterba
  0 siblings, 2 replies; 12+ messages in thread
From: Nikolay Borisov @ 2018-05-17  8:14 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs; +Cc: jamespharvey20



On 17.05.2018 09:27, Qu Wenruo wrote:
> James Harvey reported that some corrupted compressed extent data can
> lead to various kinds of kernel memory corruption.
> 
> Such corrupted extent data belongs to an inode with the NODATASUM flag,
> thus data csum won't help us detect such corruption.
> 
> If we are lucky enough, KASAN can catch it like:
> ==================================================================
> BUG: KASAN: slab-out-of-bounds in lzo_decompress_bio+0x384/0x7a0 [btrfs]
> Write of size 4096 at addr ffff8800606cb0f8 by task kworker/u16:0/2338
> 
> CPU: 3 PID: 2338 Comm: kworker/u16:0 Tainted: G           O      4.17.0-rc5-custom+ #50
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
> Workqueue: btrfs-endio btrfs_endio_helper [btrfs]
> Call Trace:
>  dump_stack+0xc2/0x16b
>  print_address_description+0x6a/0x270
>  kasan_report+0x260/0x380
>  memcpy+0x34/0x50
>  lzo_decompress_bio+0x384/0x7a0 [btrfs]
>  end_compressed_bio_read+0x99f/0x10b0 [btrfs]
>  bio_endio+0x32e/0x640
>  normal_work_helper+0x15a/0xea0 [btrfs]
>  process_one_work+0x7e3/0x1470
>  worker_thread+0x1b0/0x1170
>  kthread+0x2db/0x390
>  ret_from_fork+0x22/0x40
> ...
> ==================================================================
> 
> The offending compressed data has the following info:
> 
> Header:                 length 32768            (Looks completely valid)
> Segment 0 Header:       length 3472882419       (Obviously out of bounds)
> 
> Then when handling segment 0, since it crosses the current page
> boundary, we need to copy the compressed data into the workspace, and
> such a huge length triggers an out-of-bounds memory access, screwing up
> the whole kernel.
> 
> Fix it by adding extra checks on the header and segment headers to
> ensure we won't access memory out of bounds, and also check that the
> decompressed data won't go out of bounds.
> 
> Reported-by: James Harvey <jamespharvey20@gmail.com>
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
>  fs/btrfs/lzo.c | 35 ++++++++++++++++++++++++++++++++++-
>  1 file changed, 34 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/lzo.c b/fs/btrfs/lzo.c
> index 3d2ae4c08876..78ebc809072f 100644
> --- a/fs/btrfs/lzo.c
> +++ b/fs/btrfs/lzo.c
> @@ -281,6 +281,7 @@ static int lzo_decompress_bio(struct list_head *ws, struct compressed_bio *cb)
>  	unsigned long working_bytes;
>  	size_t in_len;
>  	size_t out_len;
> +	size_t max_segment_len = lzo1x_worst_compress(PAGE_SIZE);
>  	unsigned long in_offset;
>  	unsigned long in_page_bytes_left;
>  	unsigned long tot_in;
> @@ -294,6 +295,18 @@ static int lzo_decompress_bio(struct list_head *ws, struct compressed_bio *cb)
>  
>  	data_in = kmap(pages_in[0]);
>  	tot_len = read_compress_length(data_in);
> +	/*
> +	 * Compressed data header check.
> +	 *
> +	 * The real compressed size can't exceed extent length, and all pages
> +	 * should be used (a full pending page is not possible).
> +	 * If this happens it means the compressed extent is corrupted.
> +	 */
> +	if (tot_len > min_t(size_t, BTRFS_MAX_COMPRESSED, srclen) ||
> +	    tot_len < srclen - PAGE_SIZE) {
> +		ret = -EUCLEAN;
> +		goto done;
> +	}

So tot_len is the compressed size as written in the compressed stream,
whereas srclen is the number of bytes on disk this compressed extent
takes up (as derived from submit_compressed_extents). Shouldn't those
two always be equal, i.e. perhaps an assert is in order?

srclen comes from the async_extent struct, which in turn is initialized
in compress_file_range with the value of "total_compressed", and the
value there is actually initialized by
btrfs_compress_pages->lzo_compress_pages (that code makes me wanna sing
"You spin me right round, baby Right round like a record, baby").


>  
>  	tot_in = LZO_LEN;
>  	in_offset = LZO_LEN;
> @@ -308,6 +321,17 @@ static int lzo_decompress_bio(struct list_head *ws, struct compressed_bio *cb)
>  		in_offset += LZO_LEN;
>  		tot_in += LZO_LEN;
>  
> +		/*
> +		 * Segment header check.
> +		 *
> +		 * The segment length must not exceed max lzo compression
> +		 * size, nor the total compressed size
> +		 */
> +		if (in_len > max_segment_len || tot_in + in_len > tot_len) {
> +			ret = -EUCLEAN;
> +			goto done;
> +		}
> +
>  		tot_in += in_len;
>  		working_bytes = in_len;
>  		may_late_unmap = need_unmap = false;
> @@ -358,7 +382,7 @@ static int lzo_decompress_bio(struct list_head *ws, struct compressed_bio *cb)
>  			}
>  		}
>  
> -		out_len = lzo1x_worst_compress(PAGE_SIZE);
> +		out_len = max_segment_len;
>  		ret = lzo1x_decompress_safe(buf, in_len, workspace->buf,
>  					    &out_len);
>  		if (need_unmap)
> @@ -368,6 +392,15 @@ static int lzo_decompress_bio(struct list_head *ws, struct compressed_bio *cb)
>  			ret = -EIO;
>  			break;
>  		}
> +		/*
> +		 * Decompressed data length check.
> +		 * The uncompressed data should not exceed uncompressed extent
> +		 * size.
> +		 */
> +		if (tot_out + out_len > cb->len) {
> +			ret = -EUCLEAN;
> +			break;
> +		}
>  
>  		buf_start = tot_out;
>  		tot_out += out_len;
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/4] btrfs: lzo: Add header length check to avoid slab out of bounds access
  2018-05-17  8:14   ` Nikolay Borisov
@ 2018-05-17  8:19     ` Qu Wenruo
  2018-05-17  9:05       ` Qu Wenruo
  2018-05-22 14:14     ` David Sterba
  1 sibling, 1 reply; 12+ messages in thread
From: Qu Wenruo @ 2018-05-17  8:19 UTC (permalink / raw)
  To: Nikolay Borisov, linux-btrfs; +Cc: jamespharvey20



On 2018年05月17日 16:14, Nikolay Borisov wrote:
> 
> 
> On 17.05.2018 09:27, Qu Wenruo wrote:
>> James Harvey reported that some corrupted compressed extent data can
>> lead to various kinds of kernel memory corruption.
>>
>> Such corrupted extent data belongs to an inode with the NODATASUM flag,
>> thus data csum won't help us detect such corruption.
>>
>> If we are lucky enough, KASAN can catch it like:
>> ==================================================================
>> BUG: KASAN: slab-out-of-bounds in lzo_decompress_bio+0x384/0x7a0 [btrfs]
>> Write of size 4096 at addr ffff8800606cb0f8 by task kworker/u16:0/2338
>>
>> CPU: 3 PID: 2338 Comm: kworker/u16:0 Tainted: G           O      4.17.0-rc5-custom+ #50
>> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
>> Workqueue: btrfs-endio btrfs_endio_helper [btrfs]
>> Call Trace:
>>  dump_stack+0xc2/0x16b
>>  print_address_description+0x6a/0x270
>>  kasan_report+0x260/0x380
>>  memcpy+0x34/0x50
>>  lzo_decompress_bio+0x384/0x7a0 [btrfs]
>>  end_compressed_bio_read+0x99f/0x10b0 [btrfs]
>>  bio_endio+0x32e/0x640
>>  normal_work_helper+0x15a/0xea0 [btrfs]
>>  process_one_work+0x7e3/0x1470
>>  worker_thread+0x1b0/0x1170
>>  kthread+0x2db/0x390
>>  ret_from_fork+0x22/0x40
>> ...
>> ==================================================================
>>
>> The offending compressed data has the following info:
>>
>> Header:                 length 32768            (Looks completely valid)
>> Segment 0 Header:       length 3472882419       (Obviously out of bounds)
>>
>> Then when handling segment 0, since it crosses the current page
>> boundary, we need to copy the compressed data into the workspace, and
>> such a huge length triggers an out-of-bounds memory access, screwing up
>> the whole kernel.
>>
>> Fix it by adding extra checks on the header and segment headers to
>> ensure we won't access memory out of bounds, and also check that the
>> decompressed data won't go out of bounds.
>>
>> Reported-by: James Harvey <jamespharvey20@gmail.com>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>>  fs/btrfs/lzo.c | 35 ++++++++++++++++++++++++++++++++++-
>>  1 file changed, 34 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/btrfs/lzo.c b/fs/btrfs/lzo.c
>> index 3d2ae4c08876..78ebc809072f 100644
>> --- a/fs/btrfs/lzo.c
>> +++ b/fs/btrfs/lzo.c
>> @@ -281,6 +281,7 @@ static int lzo_decompress_bio(struct list_head *ws, struct compressed_bio *cb)
>>  	unsigned long working_bytes;
>>  	size_t in_len;
>>  	size_t out_len;
>> +	size_t max_segment_len = lzo1x_worst_compress(PAGE_SIZE);
>>  	unsigned long in_offset;
>>  	unsigned long in_page_bytes_left;
>>  	unsigned long tot_in;
>> @@ -294,6 +295,18 @@ static int lzo_decompress_bio(struct list_head *ws, struct compressed_bio *cb)
>>  
>>  	data_in = kmap(pages_in[0]);
>>  	tot_len = read_compress_length(data_in);
>> +	/*
>> +	 * Compressed data header check.
>> +	 *
>> +	 * The real compressed size can't exceed extent length, and all pages
>> +	 * should be used (a full pending page is not possible).
>> +	 * If this happens it means the compressed extent is corrupted.
>> +	 */
>> +	if (tot_len > min_t(size_t, BTRFS_MAX_COMPRESSED, srclen) ||
>> +	    tot_len < srclen - PAGE_SIZE) {
>> +		ret = -EUCLEAN;
>> +		goto done;
>> +	}
> 
> So tot_len is the compressed size as written in the compressed stream,
> whereas srclen is the number of bytes on disk this compressed extent
> takes up (as derived from submit_compressed_extents). Shouldn't those
> two always be equal, i.e. perhaps an assert is in order?

Nope, and in most cases tot_len is smaller than srclen (the extent size).

The compressed data can have padding zeros at the end; in fact, for 8K
of all-"0xcd" data, the lzo compressed tot_len is 100 bytes, and the
remaining bytes are all padding zeros.
The compressed extent size, on the other hand, is always rounded up to
the sector size, so in that 8K all-"0xcd" case @srclen will be 4K.

Thanks,
Qu

> 
> srclen comes from the async_extent struct, which in turn is initialized
> in compress_file_range with the value of "total_compressed", and the
> value there is actually initialized by
> btrfs_compress_pages->lzo_compress_pages (that code makes me wanna sing
> "You spin me right round, baby Right round like a record, baby").
> 
> 
>>  
>>  	tot_in = LZO_LEN;
>>  	in_offset = LZO_LEN;
>> @@ -308,6 +321,17 @@ static int lzo_decompress_bio(struct list_head *ws, struct compressed_bio *cb)
>>  		in_offset += LZO_LEN;
>>  		tot_in += LZO_LEN;
>>  
>> +		/*
>> +		 * Segment header check.
>> +		 *
>> +		 * The segment length must not exceed max lzo compression
>> +		 * size, nor the total compressed size
>> +		 */
>> +		if (in_len > max_segment_len || tot_in + in_len > tot_len) {
>> +			ret = -EUCLEAN;
>> +			goto done;
>> +		}
>> +
>>  		tot_in += in_len;
>>  		working_bytes = in_len;
>>  		may_late_unmap = need_unmap = false;
>> @@ -358,7 +382,7 @@ static int lzo_decompress_bio(struct list_head *ws, struct compressed_bio *cb)
>>  			}
>>  		}
>>  
>> -		out_len = lzo1x_worst_compress(PAGE_SIZE);
>> +		out_len = max_segment_len;
>>  		ret = lzo1x_decompress_safe(buf, in_len, workspace->buf,
>>  					    &out_len);
>>  		if (need_unmap)
>> @@ -368,6 +392,15 @@ static int lzo_decompress_bio(struct list_head *ws, struct compressed_bio *cb)
>>  			ret = -EIO;
>>  			break;
>>  		}
>> +		/*
>> +		 * Decompressed data length check.
>> +		 * The uncompressed data should not exceed uncompressed extent
>> +		 * size.
>> +		 */
>> +		if (tot_out + out_len > cb->len) {
>> +			ret = -EUCLEAN;
>> +			break;
>> +		}
>>  
>>  		buf_start = tot_out;
>>  		tot_out += out_len;
>>

^ permalink raw reply	[flat|nested] 12+ messages in thread
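
As a back-of-the-envelope illustration of the tot_len vs. srclen point
above, assuming 4K sectors and taking the 100-byte figure for 8K of
"0xcd" at face value (all names here are illustrative):

#include <stdio.h>
#include <stdint.h>

#define SECTORSIZE      4096u
#define PAGE_SIZE       4096u

/* round v up to the next multiple of align */
uint32_t round_up_to(uint32_t v, uint32_t align)
{
        return (v + align - 1) / align * align;
}

int main(void)
{
        uint32_t tot_len = 100; /* header + compressed bytes for 8K of 0xcd */
        uint32_t srclen  = round_up_to(tot_len, SECTORSIZE);   /* on-disk extent size */

        /*
         * tot_len (100) != srclen (4096); the rest of the sector is just zero
         * padding, so an ASSERT(tot_len == srclen) would trip on perfectly
         * valid data, while the patch 3 range check (tot_len no larger than
         * min(BTRFS_MAX_COMPRESSED, srclen) and no smaller than
         * srclen - PAGE_SIZE) still accepts it.
         */
        printf("tot_len=%u srclen=%u padding=%u\n", tot_len, srclen, srclen - tot_len);
        return 0;
}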

* Re: [PATCH 3/4] btrfs: lzo: Add header length check to avoid slab out of bounds access
  2018-05-17  8:19     ` Qu Wenruo
@ 2018-05-17  9:05       ` Qu Wenruo
  0 siblings, 0 replies; 12+ messages in thread
From: Qu Wenruo @ 2018-05-17  9:05 UTC (permalink / raw)
  To: Qu Wenruo, Nikolay Borisov, linux-btrfs; +Cc: jamespharvey20



On 2018年05月17日 16:19, Qu Wenruo wrote:
> 
> 
> On 2018年05月17日 16:14, Nikolay Borisov wrote:
>>
>>
>> On 17.05.2018 09:27, Qu Wenruo wrote:
>>> James Harvey reported that some corrupted compressed extent data can
>>> lead to various kinds of kernel memory corruption.
>>>
>>> Such corrupted extent data belongs to an inode with the NODATASUM flag,
>>> thus data csum won't help us detect such corruption.
>>>
>>> If we are lucky enough, KASAN can catch it like:
>>> ==================================================================
>>> BUG: KASAN: slab-out-of-bounds in lzo_decompress_bio+0x384/0x7a0 [btrfs]
>>> Write of size 4096 at addr ffff8800606cb0f8 by task kworker/u16:0/2338
>>>
>>> CPU: 3 PID: 2338 Comm: kworker/u16:0 Tainted: G           O      4.17.0-rc5-custom+ #50
>>> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
>>> Workqueue: btrfs-endio btrfs_endio_helper [btrfs]
>>> Call Trace:
>>>  dump_stack+0xc2/0x16b
>>>  print_address_description+0x6a/0x270
>>>  kasan_report+0x260/0x380
>>>  memcpy+0x34/0x50
>>>  lzo_decompress_bio+0x384/0x7a0 [btrfs]
>>>  end_compressed_bio_read+0x99f/0x10b0 [btrfs]
>>>  bio_endio+0x32e/0x640
>>>  normal_work_helper+0x15a/0xea0 [btrfs]
>>>  process_one_work+0x7e3/0x1470
>>>  worker_thread+0x1b0/0x1170
>>>  kthread+0x2db/0x390
>>>  ret_from_fork+0x22/0x40
>>> ...
>>> ==================================================================
>>>
>>> The offending compressed data has the following info:
>>>
>>> Header:                 length 32768            (Looks completely valid)
>>> Segment 0 Header:       length 3472882419       (Obviously out of bounds)
>>>
>>> Then when handling segment 0, since it crosses the current page
>>> boundary, we need to copy the compressed data into the workspace, and
>>> such a huge length triggers an out-of-bounds memory access, screwing up
>>> the whole kernel.
>>>
>>> Fix it by adding extra checks on the header and segment headers to
>>> ensure we won't access memory out of bounds, and also check that the
>>> decompressed data won't go out of bounds.
>>>
>>> Reported-by: James Harvey <jamespharvey20@gmail.com>
>>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>>> ---
>>>  fs/btrfs/lzo.c | 35 ++++++++++++++++++++++++++++++++++-
>>>  1 file changed, 34 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/fs/btrfs/lzo.c b/fs/btrfs/lzo.c
>>> index 3d2ae4c08876..78ebc809072f 100644
>>> --- a/fs/btrfs/lzo.c
>>> +++ b/fs/btrfs/lzo.c
>>> @@ -281,6 +281,7 @@ static int lzo_decompress_bio(struct list_head *ws, struct compressed_bio *cb)
>>>  	unsigned long working_bytes;
>>>  	size_t in_len;
>>>  	size_t out_len;
>>> +	size_t max_segment_len = lzo1x_worst_compress(PAGE_SIZE);
>>>  	unsigned long in_offset;
>>>  	unsigned long in_page_bytes_left;
>>>  	unsigned long tot_in;
>>> @@ -294,6 +295,18 @@ static int lzo_decompress_bio(struct list_head *ws, struct compressed_bio *cb)
>>>  
>>>  	data_in = kmap(pages_in[0]);
>>>  	tot_len = read_compress_length(data_in);
>>> +	/*
>>> +	 * Compressed data header check.
>>> +	 *
>>> +	 * The real compressed size can't exceed extent length, and all pages
>>> +	 * should be used (a full pending page is not possible).
>>> +	 * If this happens it means the compressed extent is corrupted.
>>> +	 */
>>> +	if (tot_len > min_t(size_t, BTRFS_MAX_COMPRESSED, srclen) ||
>>> +	    tot_len < srclen - PAGE_SIZE) {
>>> +		ret = -EUCLEAN;
>>> +		goto done;
>>> +	}
>>
>> So tot_len is the compressed size as written in the compressed stream,
>> whereas srclen is the number of bytes on disk this compressed extent
>> takes up (as derived from submit_compressed_extents). Shouldn't those
>> two always be equal, i.e. perhaps an assert is in order?
> 
> Nope, and in most cases tot_len is smaller than srclen (the extent size).
> 
> The compressed data can have padding zeros at the end; in fact, for 8K
> of all-"0xcd" data, the lzo compressed tot_len is 100 bytes, and the
> remaining bytes are all padding zeros.
> The compressed extent size, on the other hand, is always rounded up to
> the sector size, so in that 8K all-"0xcd" case @srclen will be 4K.
> 
> Thanks,
> Qu
> 
>>
>> srclen comes from the async_extent struct, which in turn is initialized
>> in compress_file_range with the value of "total_compressed", and the
>> value there is actually initialized by
>> btrfs_compress_pages->lzo_compress_pages (that code makes me wanna sing
>> "You spin me right round, baby Right round like a record, baby").

Forgot to mention that those call sites are all *write* sites, not *read* sites.

For read, cb->compressed_len will just be the extent size.

Anyway, the check works fine for both sites.

Thanks,
Qu

>>
>>
>>>  
>>>  	tot_in = LZO_LEN;
>>>  	in_offset = LZO_LEN;
>>> @@ -308,6 +321,17 @@ static int lzo_decompress_bio(struct list_head *ws, struct compressed_bio *cb)
>>>  		in_offset += LZO_LEN;
>>>  		tot_in += LZO_LEN;
>>>  
>>> +		/*
>>> +		 * Segment header check.
>>> +		 *
>>> +		 * The segment length must not exceed max lzo compression
>>> +		 * size, nor the total compressed size
>>> +		 */
>>> +		if (in_len > max_segment_len || tot_in + in_len > tot_len) {
>>> +			ret = -EUCLEAN;
>>> +			goto done;
>>> +		}
>>> +
>>>  		tot_in += in_len;
>>>  		working_bytes = in_len;
>>>  		may_late_unmap = need_unmap = false;
>>> @@ -358,7 +382,7 @@ static int lzo_decompress_bio(struct list_head *ws, struct compressed_bio *cb)
>>>  			}
>>>  		}
>>>  
>>> -		out_len = lzo1x_worst_compress(PAGE_SIZE);
>>> +		out_len = max_segment_len;
>>>  		ret = lzo1x_decompress_safe(buf, in_len, workspace->buf,
>>>  					    &out_len);
>>>  		if (need_unmap)
>>> @@ -368,6 +392,15 @@ static int lzo_decompress_bio(struct list_head *ws, struct compressed_bio *cb)
>>>  			ret = -EIO;
>>>  			break;
>>>  		}
>>> +		/*
>>> +		 * Decompressed data length check.
>>> +		 * The uncompressed data should not exceed uncompressed extent
>>> +		 * size.
>>> +		 */
>>> +		if (tot_out + out_len > cb->len) {
>>> +			ret = -EUCLEAN;
>>> +			break;
>>> +		}
>>>  
>>>  		buf_start = tot_out;
>>>  		tot_out += out_len;
>>>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/4] btrfs: lzo: Add header length check to avoid slab out of bounds access
  2018-05-17  8:14   ` Nikolay Borisov
  2018-05-17  8:19     ` Qu Wenruo
@ 2018-05-22 14:14     ` David Sterba
  1 sibling, 0 replies; 12+ messages in thread
From: David Sterba @ 2018-05-22 14:14 UTC (permalink / raw)
  To: Nikolay Borisov; +Cc: Qu Wenruo, linux-btrfs, jamespharvey20

On Thu, May 17, 2018 at 11:14:46AM +0300, Nikolay Borisov wrote:
> srclen comes from the async_extent struct, which in turn is initialized
> in compress_file_range with the value of "total_compressed", and the
> value there is actually initialized by
> btrfs_compress_pages->lzo_compress_pages (that code makes me wanna sing
> "You spin me right round, baby Right round like a record, baby").

I have a dusted-off patch that reworks the whole loop into something
more understandable; I can post it with some other pending compression
updates.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread

Thread overview: 12+ messages
2018-05-17  6:27 [PATCH 0/4] btrfs: lzo: Harden decompression callers to avoid Qu Wenruo
2018-05-17  6:27 ` [PATCH 1/4] btrfs: compression: Add linux/sizes.h for compression.h Qu Wenruo
2018-05-17  7:49   ` Nikolay Borisov
2018-05-17  6:27 ` [PATCH 2/4] btrfs: lzo: Add comment about how btrfs records its lzo compressed data Qu Wenruo
2018-05-17  7:48   ` Nikolay Borisov
2018-05-17  8:04     ` Qu Wenruo
2018-05-17  6:27 ` [PATCH 3/4] btrfs: lzo: Add header length check to avoid slab out of bounds access Qu Wenruo
2018-05-17  8:14   ` Nikolay Borisov
2018-05-17  8:19     ` Qu Wenruo
2018-05-17  9:05       ` Qu Wenruo
2018-05-22 14:14     ` David Sterba
2018-05-17  6:27 ` [PATCH 4/4] btrfs: lzo: Harden inline lzo compressed extent decompression Qu Wenruo
