linux-btrfs.vger.kernel.org archive mirror
* [PATCH 0/8] Add support for LZ4 compression
@ 2014-05-31 23:48 Philip Worrall
  2014-05-31 23:48 ` [PATCH 1/8] Btrfs: Add kernel config options for LZ4 Philip Worrall
                   ` (9 more replies)
  0 siblings, 10 replies; 13+ messages in thread
From: Philip Worrall @ 2014-05-31 23:48 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Philip Worrall

LZ4 is a lossless data compression algorithm that is focused on 
compression and decompression speed. LZ4 gives a slightly worse
compression ratio compared with LZO (and much worse than Zlib)
but compression speeds are *generally* similar to LZO. 
Decompression tends to be much faster under LZ4 compared 
with LZO hence it makes more sense to use LZ4 compression
when your workload involves a higher proportion of reads.

The following patch set adds LZ4 compression support to BTRFS
using the existing kernel implementation. It is based on the 
changeset for LZO support in 2011. Once a filesystem has been 
mounted with LZ4 compression enabled older versions of BTRFS 
will be unable to read it. This implementation is however 
backwards compatible with filesystems that currently use 
LZO or Zlib compression. Existing data will remain unchanged 
but any new files that you create will be compressed with LZ4.

Usage:
Apply the following 8 patches to the current git tree 
(as of 20140531) and compile/load the btrfs module.

# mount -t btrfs -o compress=lz4 device mountpoint

or

# mount -t btrfs -o compress-force=lz4 device mountpoint

Philip Worrall (8):
  Btrfs: Add kernel config options for LZ4
  Btrfs: Add lz4.c to the Makefile
  Btrfs: Add lz4 compression to available compression ops
  Btrfs: Add definition for external lz4 compression struct
  Btrfs: Add feature flags for LZ4 support
  Btrfs: Ensure LZ4 feature flags are set when mounting with LZ4
  Btrfs: Add lz4 compression/decompression struct ops
  Btrfs: Check for compress=lz4 in mount options

 fs/btrfs/Kconfig       |   2 +
 fs/btrfs/Makefile      |   2 +-
 fs/btrfs/compression.c |   1 +
 fs/btrfs/compression.h |   2 +-
 fs/btrfs/ctree.h       |  10 +-
 fs/btrfs/disk-io.c     |   3 +-
 fs/btrfs/lz4.c         | 436 +++++++++++++++++++++++++++++++++++++++++++++++++
 fs/btrfs/super.c       |  12 +-
 8 files changed, 461 insertions(+), 7 deletions(-)
 create mode 100644 fs/btrfs/lz4.c

-- 
1.9.1



* [PATCH 1/8] Btrfs: Add kernel config options for LZ4
  2014-05-31 23:48 [PATCH 0/8] Add support for LZ4 compression Philip Worrall
@ 2014-05-31 23:48 ` Philip Worrall
  2014-05-31 23:48 ` [PATCH 2/8] Btrfs: Add lz4.c to the Makefile Philip Worrall
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Philip Worrall @ 2014-05-31 23:48 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Philip Worrall

Select the kernel's existing LZ4 compression and decompression
configuration options whenever BTRFS_FS is enabled.

Signed-off-by: Philip Worrall <philip.worrall@googlemail.com>
---
 fs/btrfs/Kconfig | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/btrfs/Kconfig b/fs/btrfs/Kconfig
index a66768e..fb93d9e 100644
--- a/fs/btrfs/Kconfig
+++ b/fs/btrfs/Kconfig
@@ -6,6 +6,8 @@ config BTRFS_FS
 	select ZLIB_DEFLATE
 	select LZO_COMPRESS
 	select LZO_DECOMPRESS
+	select LZ4_COMPRESS
+	select LZ4_DECOMPRESS
 	select RAID6_PQ
 	select XOR_BLOCKS
 
-- 
1.9.1



* [PATCH 2/8] Btrfs: Add lz4.c to the Makefile
  2014-05-31 23:48 [PATCH 0/8] Add support for LZ4 compression Philip Worrall
  2014-05-31 23:48 ` [PATCH 1/8] Btrfs: Add kernel config options for LZ4 Philip Worrall
@ 2014-05-31 23:48 ` Philip Worrall
  2014-05-31 23:48 ` [PATCH 3/8] Btrfs: Add lz4 compression to available compression ops Philip Worrall
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Philip Worrall @ 2014-05-31 23:48 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Philip Worrall

Add lz4.o to the btrfs Makefile so that lz4.c is built into the module.

Signed-off-by: Philip Worrall <philip.worrall@googlemail.com>
---
 fs/btrfs/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/Makefile b/fs/btrfs/Makefile
index f341a98..af8d12d 100644
--- a/fs/btrfs/Makefile
+++ b/fs/btrfs/Makefile
@@ -6,7 +6,7 @@ btrfs-y += super.o ctree.o extent-tree.o print-tree.o root-tree.o dir-item.o \
 	   transaction.o inode.o file.o tree-defrag.o \
 	   extent_map.o sysfs.o struct-funcs.o xattr.o ordered-data.o \
 	   extent_io.o volumes.o async-thread.o ioctl.o locking.o orphan.o \
-	   export.o tree-log.o free-space-cache.o zlib.o lzo.o \
+	   export.o tree-log.o free-space-cache.o zlib.o lzo.o lz4.o \
 	   compression.o delayed-ref.o relocation.o delayed-inode.o scrub.o \
 	   reada.o backref.o ulist.o qgroup.o send.o dev-replace.o raid56.o \
 	   uuid-tree.o props.o hash.o
-- 
1.9.1



* [PATCH 3/8] Btrfs: Add lz4 compression to available compression ops
  2014-05-31 23:48 [PATCH 0/8] Add support for LZ4 compression Philip Worrall
  2014-05-31 23:48 ` [PATCH 1/8] Btrfs: Add kernel config options for LZ4 Philip Worrall
  2014-05-31 23:48 ` [PATCH 2/8] Btrfs: Add lz4.c to the Makefile Philip Worrall
@ 2014-05-31 23:48 ` Philip Worrall
  2014-05-31 23:48 ` [PATCH 4/8] Btrfs: Add definition for external lz4 compression struct Philip Worrall
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Philip Worrall @ 2014-05-31 23:48 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Philip Worrall

Add lz4 compression structure to available btrfs compression
operations.

Signed-off-by: Philip Worrall <philip.worrall@googlemail.com>
---
 fs/btrfs/compression.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index ed1ff1c..35b3268 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -747,6 +747,7 @@ static wait_queue_head_t comp_workspace_wait[BTRFS_COMPRESS_TYPES];
 static struct btrfs_compress_op *btrfs_compress_op[] = {
 	&btrfs_zlib_compress,
 	&btrfs_lzo_compress,
+	&btrfs_lz4_compress,
 };
 
 void __init btrfs_init_compress(void)
-- 
1.9.1
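
For readers following the series: a minimal userspace sketch of how an ops table like the one patched above can be indexed by compression type. It assumes that the table is indexed by `type - 1`, so that the "none" type has no entry. All names here (`compress_ops`, `lookup_op`, the `COMPRESS_*` constants) are hypothetical stand-ins, not the kernel's symbols.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mirror of the enum extended in patch 5/8. */
enum compress_type {
	COMPRESS_NONE  = 0,
	COMPRESS_ZLIB  = 1,
	COMPRESS_LZO   = 2,
	COMPRESS_LZ4   = 3,
	COMPRESS_TYPES = 3,	/* number of real algorithms */
};

/* Stand-in for a compression ops struct; only a name for illustration. */
struct compress_op {
	const char *name;
};

static const struct compress_op zlib_op = { "zlib" };
static const struct compress_op lzo_op  = { "lzo" };
static const struct compress_op lz4_op  = { "lz4" };

/* One slot per algorithm; COMPRESS_NONE has no slot. */
static const struct compress_op *const compress_ops[COMPRESS_TYPES] = {
	&zlib_op,
	&lzo_op,
	&lz4_op,
};

/* Return the ops for a type, or NULL when the type has no table entry. */
static const struct compress_op *lookup_op(int type)
{
	if (type <= COMPRESS_NONE || type > COMPRESS_TYPES)
		return NULL;
	return compress_ops[type - 1];
}
```

With this layout, adding an algorithm means bumping the type count and appending one table entry, which is the shape of patches 3/8 and 5/8 taken together.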



* [PATCH 4/8] Btrfs: Add definition for external lz4 compression struct
  2014-05-31 23:48 [PATCH 0/8] Add support for LZ4 compression Philip Worrall
                   ` (2 preceding siblings ...)
  2014-05-31 23:48 ` [PATCH 3/8] Btrfs: Add lz4 compression to available compression ops Philip Worrall
@ 2014-05-31 23:48 ` Philip Worrall
  2014-05-31 23:48 ` [PATCH 5/8] Btrfs: Add feature flags for LZ4 support Philip Worrall
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Philip Worrall @ 2014-05-31 23:48 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Philip Worrall

Signed-off-by: Philip Worrall <philip.worrall@googlemail.com>
---
 fs/btrfs/compression.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/compression.h b/fs/btrfs/compression.h
index 0c803b4..c39b86a 100644
--- a/fs/btrfs/compression.h
+++ b/fs/btrfs/compression.h
@@ -77,5 +77,5 @@ struct btrfs_compress_op {
 
 extern struct btrfs_compress_op btrfs_zlib_compress;
 extern struct btrfs_compress_op btrfs_lzo_compress;
-
+extern struct btrfs_compress_op btrfs_lz4_compress;
 #endif
-- 
1.9.1



* [PATCH 5/8] Btrfs: Add feature flags for LZ4 support
  2014-05-31 23:48 [PATCH 0/8] Add support for LZ4 compression Philip Worrall
                   ` (3 preceding siblings ...)
  2014-05-31 23:48 ` [PATCH 4/8] Btrfs: Add definition for external lz4 compression struct Philip Worrall
@ 2014-05-31 23:48 ` Philip Worrall
  2014-05-31 23:48 ` [PATCH 6/8] Btrfs: Ensure LZ4 feature flags are set when mounting with LZ4 Philip Worrall
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Philip Worrall @ 2014-05-31 23:48 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Philip Worrall

Add an incompat feature flag and a compression type for LZ4 so that
older kernels refuse to mount btrfs filesystems that have been used
with LZ4.

Signed-off-by: Philip Worrall <philip.worrall@googlemail.com>
---
 fs/btrfs/ctree.h | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index d8a669e..b5118cfa 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -523,6 +523,7 @@ struct btrfs_super_block {
 #define BTRFS_FEATURE_INCOMPAT_RAID56		(1ULL << 7)
 #define BTRFS_FEATURE_INCOMPAT_SKINNY_METADATA	(1ULL << 8)
 #define BTRFS_FEATURE_INCOMPAT_NO_HOLES		(1ULL << 9)
+#define BTRFS_FEATURE_INCOMPAT_COMPRESS_LZ4     (1ULL << 10)
 
 #define BTRFS_FEATURE_COMPAT_SUPP		0ULL
 #define BTRFS_FEATURE_COMPAT_SAFE_SET		0ULL
@@ -540,7 +541,8 @@ struct btrfs_super_block {
 	 BTRFS_FEATURE_INCOMPAT_RAID56 |		\
 	 BTRFS_FEATURE_INCOMPAT_EXTENDED_IREF |		\
 	 BTRFS_FEATURE_INCOMPAT_SKINNY_METADATA |	\
-	 BTRFS_FEATURE_INCOMPAT_NO_HOLES)
+	 BTRFS_FEATURE_INCOMPAT_NO_HOLES	|	\
+	 BTRFS_FEATURE_INCOMPAT_COMPRESS_LZ4)
 
 #define BTRFS_FEATURE_INCOMPAT_SAFE_SET			\
 	(BTRFS_FEATURE_INCOMPAT_EXTENDED_IREF)
@@ -709,8 +711,10 @@ enum btrfs_compression_type {
 	BTRFS_COMPRESS_NONE  = 0,
 	BTRFS_COMPRESS_ZLIB  = 1,
 	BTRFS_COMPRESS_LZO   = 2,
-	BTRFS_COMPRESS_TYPES = 2,
-	BTRFS_COMPRESS_LAST  = 3,
+	BTRFS_COMPRESS_LZ4   = 3,
+	BTRFS_COMPRESS_TYPES = 3,
+	BTRFS_COMPRESS_LAST  = 4,
+
 };
 
 struct btrfs_inode_item {
-- 
1.9.1
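
To illustrate why the new bit makes older kernels refuse the mount, here is a minimal userspace sketch of an incompat-mask check. It assumes that a mount is rejected when the on-disk flags contain any bit outside the supported mask. The bit positions are the ones visible in the hunk above; the macro and function names (`FEATURE_*`, `can_mount`) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions taken from the ctree.h hunk above; names shortened. */
#define FEATURE_RAID56          (1ULL << 7)
#define FEATURE_SKINNY_METADATA (1ULL << 8)
#define FEATURE_NO_HOLES        (1ULL << 9)
#define FEATURE_COMPRESS_LZ4    (1ULL << 10)

/* What a kernel without this series understands (subset, for illustration). */
#define SUPP_OLD (FEATURE_RAID56 | FEATURE_SKINNY_METADATA | FEATURE_NO_HOLES)
/* What a kernel with this series understands. */
#define SUPP_NEW (SUPP_OLD | FEATURE_COMPRESS_LZ4)

/* A filesystem is mountable only if every on-disk incompat bit is supported. */
static bool can_mount(uint64_t on_disk_flags, uint64_t supported)
{
	return (on_disk_flags & ~supported) == 0;
}
```

Because the bit is persisted once set, a filesystem that has ever been mounted with LZ4 stays unmountable on older kernels, which is the disk-format cost raised later in the thread.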



* [PATCH 6/8] Btrfs: Ensure LZ4 feature flags are set when mounting with LZ4
  2014-05-31 23:48 [PATCH 0/8] Add support for LZ4 compression Philip Worrall
                   ` (4 preceding siblings ...)
  2014-05-31 23:48 ` [PATCH 5/8] Btrfs: Add feature flags for LZ4 support Philip Worrall
@ 2014-05-31 23:48 ` Philip Worrall
  2014-05-31 23:48 ` [PATCH 7/8] Btrfs: Add lz4 compression/decompression struct ops Philip Worrall
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Philip Worrall @ 2014-05-31 23:48 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Philip Worrall

Signed-off-by: Philip Worrall <philip.worrall@googlemail.com>
---
 fs/btrfs/disk-io.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 6d1ac7d..56bab54 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2479,7 +2479,8 @@ int open_ctree(struct super_block *sb,
 	features |= BTRFS_FEATURE_INCOMPAT_MIXED_BACKREF;
 	if (tree_root->fs_info->compress_type == BTRFS_COMPRESS_LZO)
 		features |= BTRFS_FEATURE_INCOMPAT_COMPRESS_LZO;
-
+	if (tree_root->fs_info->compress_type == BTRFS_COMPRESS_LZ4)
+		features |= BTRFS_FEATURE_INCOMPAT_COMPRESS_LZ4;
 	if (features & BTRFS_FEATURE_INCOMPAT_SKINNY_METADATA)
 		printk(KERN_ERR "BTRFS: has skinny extents\n");
 
-- 
1.9.1



* [PATCH 7/8] Btrfs: Add lz4 compression/decompression struct ops
  2014-05-31 23:48 [PATCH 0/8] Add support for LZ4 compression Philip Worrall
                   ` (5 preceding siblings ...)
  2014-05-31 23:48 ` [PATCH 6/8] Btrfs: Ensure LZ4 feature flags are set when mounting with LZ4 Philip Worrall
@ 2014-05-31 23:48 ` Philip Worrall
  2014-05-31 23:48 ` [PATCH 8/8] Btrfs: Check for compress=lz4 in mount options Philip Worrall
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Philip Worrall @ 2014-05-31 23:48 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Philip Worrall

Add functions to handle lz4 compression and decompression and
include them within a new struct btrfs_compress_op instance,
btrfs_lz4_compress.

Signed-off-by: Philip Worrall <philip.worrall@googlemail.com>
---
 fs/btrfs/lz4.c | 436 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 436 insertions(+)
 create mode 100644 fs/btrfs/lz4.c

diff --git a/fs/btrfs/lz4.c b/fs/btrfs/lz4.c
new file mode 100644
index 0000000..cd7616e
--- /dev/null
+++ b/fs/btrfs/lz4.c
@@ -0,0 +1,436 @@
+/*
+ * Copyright (C) 2014 Oracle.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public
+ * License v2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public
+ * License along with this program; if not, write to the
+ * Free Software Foundation, Inc., 59 Temple Place - Suite 330,
+ * Boston, MA 02111-1307, USA.
+ *
+ * Based on fs/btrfs/lzo.c
+ * Created by Philip Worrall <philip.worrall@googlemail.com>
+ */
+
+#include <linux/kernel.h>
+#include <linux/slab.h>
+#include <linux/vmalloc.h>
+#include <linux/init.h>
+#include <linux/err.h>
+#include <linux/sched.h>
+#include <linux/pagemap.h>
+#include <linux/bio.h>
+#include <linux/lz4.h>
+#include "compression.h"
+
+#define LZ4_LEN	4
+#define LZ4_E_OK 0
+
+struct workspace {
+	void *mem;
+	void *buf;	/* where decompressed data goes */
+	void *cbuf;	/* where compressed data goes */
+	struct list_head list;
+};
+
+static void lz4_free_workspace(struct list_head *ws)
+{
+	struct workspace *workspace = list_entry(ws, struct workspace, list);
+
+	vfree(workspace->buf);
+	vfree(workspace->cbuf);
+	vfree(workspace->mem);
+	kfree(workspace);
+}
+
+static struct list_head *lz4_alloc_workspace(void)
+{
+	struct workspace *workspace;
+
+	workspace = kzalloc(sizeof(*workspace), GFP_NOFS);
+	if (!workspace)
+		return ERR_PTR(-ENOMEM);
+
+	workspace->mem = vmalloc(LZ4_MEM_COMPRESS);
+	workspace->buf = vmalloc(lz4_compressbound(PAGE_CACHE_SIZE));
+	workspace->cbuf = vmalloc(lz4_compressbound(PAGE_CACHE_SIZE));
+	if (!workspace->mem || !workspace->buf || !workspace->cbuf)
+		goto fail;
+
+	INIT_LIST_HEAD(&workspace->list);
+
+	return &workspace->list;
+fail:
+	lz4_free_workspace(&workspace->list);
+	return ERR_PTR(-ENOMEM);
+}
+
+static inline void write_compress_length(char *buf, size_t len)
+{
+	__le32 dlen;
+
+	dlen = cpu_to_le32(len);
+	memcpy(buf, &dlen, LZ4_LEN);
+}
+
+static inline size_t read_compress_length(char *buf)
+{
+	__le32 dlen;
+
+	memcpy(&dlen, buf, LZ4_LEN);
+	return le32_to_cpu(dlen);
+}
+
+static int lz4_compress_pages(struct list_head *ws,
+			      struct address_space *mapping,
+			      u64 start, unsigned long len,
+			      struct page **pages,
+			      unsigned long nr_dest_pages,
+			      unsigned long *out_pages,
+			      unsigned long *total_in,
+			      unsigned long *total_out,
+			      unsigned long max_out)
+{
+	struct workspace *workspace = list_entry(ws, struct workspace, list);
+	int ret = 0;
+	char *data_in;
+	char *cpage_out;
+	int nr_pages = 0;
+	struct page *in_page = NULL;
+	struct page *out_page = NULL;
+	unsigned long bytes_left;
+
+	size_t in_len;
+	size_t out_len;
+	char *buf;
+	unsigned long tot_in = 0;
+	unsigned long tot_out = 0;
+	unsigned long pg_bytes_left;
+	unsigned long out_offset;
+	unsigned long bytes;
+
+	*out_pages = 0;
+	*total_out = 0;
+	*total_in = 0;
+
+	in_page = find_get_page(mapping, start >> PAGE_CACHE_SHIFT);
+	data_in = kmap(in_page);
+
+	/*
+	 * store the size of all chunks of compressed data in
+	 * the first 4 bytes
+	 */
+	out_page = alloc_page(GFP_NOFS | __GFP_HIGHMEM);
+	if (out_page == NULL) {
+		ret = -ENOMEM;
+		goto out;
+	}
+	cpage_out = kmap(out_page);
+	out_offset = LZ4_LEN;
+	tot_out = LZ4_LEN;
+	pages[0] = out_page;
+	nr_pages = 1;
+	pg_bytes_left = PAGE_CACHE_SIZE - LZ4_LEN;
+
+	/* compress at most one page of data each time */
+	in_len = min(len, PAGE_CACHE_SIZE);
+	while (tot_in < len) {
+		ret = lz4_compress(data_in, in_len, workspace->cbuf,
+				       &out_len, workspace->mem);
+		if (ret != LZ4_E_OK) {
+			printk(KERN_DEBUG "BTRFS: lz4 compress in loop returned %d\n",
+			       ret);
+			ret = -1;
+			goto out;
+		}
+
+		/* store the size of this chunk of compressed data */
+		write_compress_length(cpage_out + out_offset, out_len);
+		tot_out += LZ4_LEN;
+		out_offset += LZ4_LEN;
+		pg_bytes_left -= LZ4_LEN;
+
+		tot_in += in_len;
+		tot_out += out_len;
+
+		/* copy bytes from the working buffer into the pages */
+		buf = workspace->cbuf;
+		while (out_len) {
+			bytes = min_t(unsigned long, pg_bytes_left, out_len);
+
+			memcpy(cpage_out + out_offset, buf, bytes);
+
+			out_len -= bytes;
+			pg_bytes_left -= bytes;
+			buf += bytes;
+			out_offset += bytes;
+
+			/*
+			 * we need another page for writing out.
+			 *
+			 * Note if there's less than 4 bytes left, we just
+			 * skip to a new page.
+			 */
+			if ((out_len == 0 && pg_bytes_left < LZ4_LEN) ||
+			    pg_bytes_left == 0) {
+				if (pg_bytes_left) {
+					memset(cpage_out + out_offset, 0,
+					       pg_bytes_left);
+					tot_out += pg_bytes_left;
+				}
+
+				/* we're done, don't allocate new page */
+				if (out_len == 0 && tot_in >= len)
+					break;
+
+				kunmap(out_page);
+				if (nr_pages == nr_dest_pages) {
+					out_page = NULL;
+					ret = -1;
+					goto out;
+				}
+
+				out_page = alloc_page(GFP_NOFS | __GFP_HIGHMEM);
+				if (out_page == NULL) {
+					ret = -ENOMEM;
+					goto out;
+				}
+				cpage_out = kmap(out_page);
+				pages[nr_pages++] = out_page;
+
+				pg_bytes_left = PAGE_CACHE_SIZE;
+				out_offset = 0;
+			}
+		}
+
+		/* we're making it bigger, give up */
+		if (tot_in > 8192 && tot_in < tot_out) {
+			ret = -1;
+			goto out;
+		}
+
+		/* we're all done */
+		if (tot_in >= len)
+			break;
+
+		if (tot_out > max_out)
+			break;
+
+		bytes_left = len - tot_in;
+		kunmap(in_page);
+		page_cache_release(in_page);
+
+		start += PAGE_CACHE_SIZE;
+		in_page = find_get_page(mapping, start >> PAGE_CACHE_SHIFT);
+		data_in = kmap(in_page);
+		in_len = min(bytes_left, PAGE_CACHE_SIZE);
+	}
+
+	if (tot_out > tot_in)
+		goto out;
+
+	/* store the size of all chunks of compressed data */
+	cpage_out = kmap(pages[0]);
+	write_compress_length(cpage_out, tot_out);
+
+	kunmap(pages[0]);
+
+	ret = 0;
+	*total_out = tot_out;
+	*total_in = tot_in;
+out:
+	*out_pages = nr_pages;
+	if (out_page)
+		kunmap(out_page);
+
+	if (in_page) {
+		kunmap(in_page);
+		page_cache_release(in_page);
+	}
+
+	return ret;
+}
+
+static int lz4_decompress_biovec(struct list_head *ws,
+				 struct page **pages_in,
+				 u64 disk_start,
+				 struct bio_vec *bvec,
+				 int vcnt,
+				 size_t srclen)
+{
+	struct workspace *workspace = list_entry(ws, struct workspace, list);
+	int ret = 0, ret2;
+	char *data_in;
+	unsigned long page_in_index = 0;
+	unsigned long page_out_index = 0;
+	unsigned long total_pages_in = (srclen + PAGE_CACHE_SIZE - 1) /
+					PAGE_CACHE_SIZE;
+	unsigned long buf_start;
+	unsigned long buf_offset = 0;
+	unsigned long bytes;
+	unsigned long working_bytes;
+	unsigned long pg_offset;
+
+	size_t in_len;
+	size_t out_len;
+	unsigned long in_offset;
+	unsigned long in_page_bytes_left;
+	unsigned long tot_in;
+	unsigned long tot_out;
+	unsigned long tot_len;
+	char *buf;
+	bool may_late_unmap, need_unmap;
+
+	data_in = kmap(pages_in[0]);
+	tot_len = read_compress_length(data_in);
+
+	tot_in = LZ4_LEN;
+	in_offset = LZ4_LEN;
+	tot_len = min_t(size_t, srclen, tot_len);
+	in_page_bytes_left = PAGE_CACHE_SIZE - LZ4_LEN;
+
+	tot_out = 0;
+	pg_offset = 0;
+
+	while (tot_in < tot_len) {
+		in_len = read_compress_length(data_in + in_offset);
+		in_page_bytes_left -= LZ4_LEN;
+		in_offset += LZ4_LEN;
+		tot_in += LZ4_LEN;
+
+		tot_in += in_len;
+		working_bytes = in_len;
+		may_late_unmap = need_unmap = false;
+
+		/* fast path: avoid using the working buffer */
+		if (in_page_bytes_left >= in_len) {
+			buf = data_in + in_offset;
+			bytes = in_len;
+			may_late_unmap = true;
+			goto cont;
+		}
+
+		/* copy bytes from the pages into the working buffer */
+		buf = workspace->cbuf;
+		buf_offset = 0;
+		while (working_bytes) {
+			bytes = min(working_bytes, in_page_bytes_left);
+
+			memcpy(buf + buf_offset, data_in + in_offset, bytes);
+			buf_offset += bytes;
+cont:
+			working_bytes -= bytes;
+			in_page_bytes_left -= bytes;
+			in_offset += bytes;
+
+			/* check if we need to pick another page */
+			if ((working_bytes == 0 && in_page_bytes_left < LZ4_LEN)
+			    || in_page_bytes_left == 0) {
+				tot_in += in_page_bytes_left;
+
+				if (working_bytes == 0 && tot_in >= tot_len)
+					break;
+
+				if (page_in_index + 1 >= total_pages_in) {
+					ret = -1;
+					goto done;
+				}
+
+				if (may_late_unmap)
+					need_unmap = true;
+				else
+					kunmap(pages_in[page_in_index]);
+
+				data_in = kmap(pages_in[++page_in_index]);
+
+				in_page_bytes_left = PAGE_CACHE_SIZE;
+				in_offset = 0;
+			}
+		}
+
+		out_len = lz4_compressbound(PAGE_CACHE_SIZE);
+		ret = lz4_decompress_unknownoutputsize(buf, in_len,
+						workspace->buf, &out_len);
+		if (need_unmap)
+			kunmap(pages_in[page_in_index - 1]);
+		if (ret != LZ4_E_OK) {
+			printk(KERN_WARNING "BTRFS: lz4 decompress failed\n");
+			ret = -1;
+			break;
+		}
+
+		buf_start = tot_out;
+		tot_out += out_len;
+
+		ret2 = btrfs_decompress_buf2page(workspace->buf, buf_start,
+						tot_out, disk_start,
+						bvec, vcnt, &page_out_index,
+						&pg_offset);
+		if (ret2 == 0)
+			break;
+	}
+done:
+	kunmap(pages_in[page_in_index]);
+	return ret;
+}
+
+static int lz4_decompress_page(struct list_head *ws,
+				unsigned char *data_in,
+				struct page *dest_page,
+				unsigned long start_byte,
+				size_t srclen, size_t destlen)
+{
+	struct workspace *workspace = list_entry(ws,
+					struct workspace, list);
+	size_t in_len;
+	size_t out_len;
+	size_t tot_len;
+	int ret = 0;
+	char *kaddr;
+	unsigned long bytes;
+
+	BUG_ON(srclen < LZ4_LEN);
+
+	tot_len = read_compress_length(data_in);
+	data_in += LZ4_LEN;
+
+	in_len = read_compress_length(data_in);
+	data_in += LZ4_LEN;
+
+	out_len = PAGE_CACHE_SIZE;
+	ret = lz4_decompress_unknownoutputsize(data_in, in_len,
+				workspace->buf, &out_len);
+	if (ret != LZ4_E_OK) {
+		printk(KERN_WARNING "BTRFS: lz4 decompress failed!\n");
+		ret = -1;
+		goto out;
+	}
+
+	if (out_len < start_byte) {
+		ret = -1;
+		goto out;
+	}
+
+	bytes = min_t(unsigned long, destlen, out_len - start_byte);
+
+	kaddr = kmap_atomic(dest_page);
+	memcpy(kaddr, workspace->buf + start_byte, bytes);
+	kunmap_atomic(kaddr);
+out:
+	return ret;
+}
+
+struct btrfs_compress_op btrfs_lz4_compress = {
+	.alloc_workspace	= lz4_alloc_workspace,
+	.free_workspace		= lz4_free_workspace,
+	.compress_pages		= lz4_compress_pages,
+	.decompress_biovec	= lz4_decompress_biovec,
+	.decompress		= lz4_decompress_page,
+};
-- 
1.9.1
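
The framing that lz4_compress_pages() writes can be summarised as a 4-byte little-endian total length followed by chunks, each prefixed by its own 4-byte little-endian length. Below is a minimal userspace sketch of that layout. It is a simplification under stated assumptions: payloads are copied uncompressed (a placeholder for a real compressor), and the page-boundary padding rule from the patch is left out. The helper names (`put_le32`, `get_le32`, `frame_chunks`, `count_chunks`) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LEN_BYTES 4	/* size of each length header, like LZ4_LEN above */

static void put_le32(unsigned char *p, uint32_t v)
{
	p[0] = v & 0xff;
	p[1] = (v >> 8) & 0xff;
	p[2] = (v >> 16) & 0xff;
	p[3] = (v >> 24) & 0xff;
}

static uint32_t get_le32(const unsigned char *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Frame src into dst as [total][len0][chunk0][len1][chunk1]...
 * Chunks hold at most `chunk` input bytes and are stored as-is.
 * Returns the frame size (headers included), or 0 if dst is too small.
 */
static size_t frame_chunks(const unsigned char *src, size_t len, size_t chunk,
			   unsigned char *dst, size_t cap)
{
	size_t in = 0;
	size_t out = LEN_BYTES;	/* reserve the total-length header */

	if (cap < LEN_BYTES || chunk == 0)
		return 0;
	while (in < len) {
		size_t n = len - in < chunk ? len - in : chunk;

		if (cap - out < LEN_BYTES + n)
			return 0;
		put_le32(dst + out, (uint32_t)n);
		out += LEN_BYTES;
		memcpy(dst + out, src + in, n);
		out += n;
		in += n;
	}
	put_le32(dst, (uint32_t)out);	/* total counts every header too */
	return out;
}

/* Walk a frame and count its chunks; returns -1 if it is malformed. */
static int count_chunks(const unsigned char *frame, size_t cap)
{
	size_t total, pos = LEN_BYTES;
	int chunks = 0;

	if (cap < LEN_BYTES)
		return -1;
	total = get_le32(frame);
	if (total < LEN_BYTES || total > cap)
		return -1;
	while (pos < total) {
		uint32_t n;

		if (total - pos < LEN_BYTES)
			return -1;
		n = get_le32(frame + pos);
		pos += LEN_BYTES;
		if (n > total - pos)
			return -1;
		pos += n;
		chunks++;
	}
	return chunks;
}
```

The reader validates every length against the remaining frame before advancing, a check worth having in any parser of on-disk data.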



* [PATCH 8/8] Btrfs: Check for compress=lz4 in mount options
  2014-05-31 23:48 [PATCH 0/8] Add support for LZ4 compression Philip Worrall
                   ` (6 preceding siblings ...)
  2014-05-31 23:48 ` [PATCH 7/8] Btrfs: Add lz4 compression/decompression struct ops Philip Worrall
@ 2014-05-31 23:48 ` Philip Worrall
  2014-06-02 22:04 ` [PATCH 0/8] Add support for LZ4 compression Mitch Harder
  2014-06-03 15:53 ` David Sterba
  9 siblings, 0 replies; 13+ messages in thread
From: Philip Worrall @ 2014-05-31 23:48 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Philip Worrall

Check whether the user has set compress=lz4 in the mount options
and if so set the compress method to lz4.

Signed-off-by: Philip Worrall <philip.worrall@googlemail.com>
---
 fs/btrfs/super.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index d4878dd..a348734 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -506,6 +506,14 @@ int btrfs_parse_options(struct btrfs_root *root, char *options)
 				btrfs_clear_opt(info->mount_opt, NODATACOW);
 				btrfs_clear_opt(info->mount_opt, NODATASUM);
 				btrfs_set_fs_incompat(info, COMPRESS_LZO);
+			} else if (strcmp(args[0].from, "lz4") == 0) {
+				printk(KERN_WARNING "BTRFS: Using LZ4 compression\n");
+				compress_type = "lz4";
+				info->compress_type = BTRFS_COMPRESS_LZ4;
+				btrfs_set_opt(info->mount_opt, COMPRESS);
+				btrfs_clear_opt(info->mount_opt, NODATACOW);
+				btrfs_clear_opt(info->mount_opt, NODATASUM);
+				btrfs_set_fs_incompat(info, COMPRESS_LZ4);
 			} else if (strncmp(args[0].from, "no", 2) == 0) {
 				compress_type = "no";
 				btrfs_clear_opt(info->mount_opt, COMPRESS);
@@ -1035,8 +1043,10 @@ static int btrfs_show_options(struct seq_file *seq, struct dentry *dentry)
 	if (btrfs_test_opt(root, COMPRESS)) {
 		if (info->compress_type == BTRFS_COMPRESS_ZLIB)
 			compress_type = "zlib";
-		else
+		else if (info->compress_type == BTRFS_COMPRESS_LZO)
 			compress_type = "lzo";
+		else
+			compress_type = "lz4";
 		if (btrfs_test_opt(root, FORCE_COMPRESS))
 			seq_printf(seq, ",compress-force=%s", compress_type);
 		else
-- 
1.9.1
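
A minimal userspace sketch of the option handling touched above, assuming the argument after "compress=" has already been split out. The comparisons mirror the strcmp/strncmp calls in the hunk; the function names and `COMPRESS_*` constants are hypothetical. Unlike the patched btrfs_show_options(), which falls through to "lz4" for any type that is neither zlib nor lzo, this sketch maps each type explicitly so an unknown value is not reported as lz4.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mirror of the compression type enum from patch 5/8. */
enum compress_type {
	COMPRESS_NONE = 0,
	COMPRESS_ZLIB = 1,
	COMPRESS_LZO  = 2,
	COMPRESS_LZ4  = 3,
};

/* Map the text after "compress=" to a type, or -1 if unrecognised. */
static int parse_compress_arg(const char *arg)
{
	if (strcmp(arg, "zlib") == 0)
		return COMPRESS_ZLIB;
	if (strcmp(arg, "lzo") == 0)
		return COMPRESS_LZO;
	if (strcmp(arg, "lz4") == 0)
		return COMPRESS_LZ4;
	if (strncmp(arg, "no", 2) == 0)
		return COMPRESS_NONE;
	return -1;
}

/* Map a type back to its mount-option name, or NULL if unknown. */
static const char *compress_name(int type)
{
	switch (type) {
	case COMPRESS_ZLIB:
		return "zlib";
	case COMPRESS_LZO:
		return "lzo";
	case COMPRESS_LZ4:
		return "lz4";
	default:
		return NULL;
	}
}
```

Keeping the two mappings symmetric means the string printed for a mounted filesystem can be fed back in as a mount option unchanged.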



* Re: [PATCH 0/8] Add support for LZ4 compression
  2014-05-31 23:48 [PATCH 0/8] Add support for LZ4 compression Philip Worrall
                   ` (7 preceding siblings ...)
  2014-05-31 23:48 ` [PATCH 8/8] Btrfs: Check for compress=lz4 in mount options Philip Worrall
@ 2014-06-02 22:04 ` Mitch Harder
  2014-06-03 15:53 ` David Sterba
  9 siblings, 0 replies; 13+ messages in thread
From: Mitch Harder @ 2014-06-02 22:04 UTC (permalink / raw)
  To: Philip Worrall; +Cc: linux-btrfs

On Sat, May 31, 2014 at 6:48 PM, Philip Worrall
<philip.worrall@googlemail.com> wrote:
> LZ4 is a lossless data compression algorithm that is focused on
> compression and decompression speed. LZ4 gives a slightly worse
> compression ratio compared with LZO (and much worse than Zlib)
> but compression speeds are *generally* similar to LZO.
> Decompression tends to be much faster under LZ4 compared
> with LZO hence it makes more sense to use LZ4 compression
> when your workload involves a higher proportion of reads.
>
> The following patch set adds LZ4 compression support to BTRFS
> using the existing kernel implementation. It is based on the
> changeset for LZO support in 2011. Once a filesystem has been
> mounted with LZ4 compression enabled older versions of BTRFS
> will be unable to read it. This implementation is however
> backwards compatible with filesystems that currently use
> LZO or Zlib compression. Existing data will remain unchanged
> but any new files that you create will be compressed with LZ4.
>
> Usage:
> Apply the following 8 patches to the current git tree
> (as of 20140531) and compile/load the btrfs module.
>
> # mount -t btrfs -o compress=lz4 device mountpoint
>
> or
>
> # mount -t btrfs -o compress-force=lz4 device mountpoint
>

I gave this patch-set a preliminary test, and there were no obvious
signs of corruption.

I was unable to run btrfsck or xfstests since btrfs-progs is not yet
aware of lz4, so my testing should be considered superficial.

One comment:  IMHO, it would make sense to implement lz4 and lz4hc at
the same time.  Since both are now in the kernel, I assume we would go
that direction eventually anyways, unless there are some wrinkles
around lz4hc I don't fully appreciate.  So we might as well conserve
the kernel INCOMPAT flag.


* Re: [PATCH 0/8] Add support for LZ4 compression
  2014-05-31 23:48 [PATCH 0/8] Add support for LZ4 compression Philip Worrall
                   ` (8 preceding siblings ...)
  2014-06-02 22:04 ` [PATCH 0/8] Add support for LZ4 compression Mitch Harder
@ 2014-06-03 15:53 ` David Sterba
  2014-06-04 14:00   ` Chris Mason
  9 siblings, 1 reply; 13+ messages in thread
From: David Sterba @ 2014-06-03 15:53 UTC (permalink / raw)
  To: Philip Worrall; +Cc: linux-btrfs

On Sat, May 31, 2014 at 11:48:28PM +0000, Philip Worrall wrote:
> LZ4 is a lossless data compression algorithm that is focused on 
> compression and decompression speed. LZ4 gives a slightly worse
> compression ratio compared with LZO (and much worse than Zlib)
> but compression speeds are *generally* similar to LZO. 
> Decompression tends to be much faster under LZ4 compared 
> with LZO hence it makes more sense to use LZ4 compression
> when your workload involves a higher proportion of reads.
> 
> The following patch set adds LZ4 compression support to BTRFS
> using the existing kernel implementation. It is based on the 
> changeset for LZO support in 2011. Once a filesystem has been 
> mounted with LZ4 compression enabled older versions of BTRFS 
> will be unable to read it. This implementation is however 
> backwards compatible with filesystems that currently use 
> LZO or Zlib compression. Existing data will remain unchanged 
> but any new files that you create will be compressed with LZ4.

tl;dr simply copying what btrfs+LZO does will not buy us anything in
terms of speedup or space savings.

I've been working on adding LZ4 to btrfs for some time and still do in
my spare time. The project idea roughly outlines my goals:

https://btrfs.wiki.kernel.org/index.php/Project_ideas#Compression_enhancements

The initial compression support introduced a very simple format of the
compressed data. The simplicity was probably a good choice for a first
approach and allowed early adoption.

The main drawback of the format is that the data to be compressed are
fed to the compressor in 4k (page size) blocks and LZO (same for LZ4)
does not keep and reuse the state from previous blocks. This is
different from ZLIB, which does, but is slower yet more space efficient.

The small blocks do not give much space for data reuse, so the results
for LZO and LZ4 are very close; the difference was not measurable in my
tests. The raw speed of compression/decompression of the algorithms is
different, but we have to measure it under real loads, where e.g. the
decompression speedup does not weigh much in the overall performance.

The natural step forward is to compress in larger blocks, but this also
means designing a new storage format for the compressed data and changing
the kernel implementation accordingly. Also, this is not something that
can be done incrementally. One incompat bit should completely cover the
new stuff.

At the moment, there is no strong need for LZ4, though there are
numerous remarks in the online media about when btrfs will support it.

The situation was different for ZFS. The original compressor was LZJB,
which was derived from LZRW1 and tweaked for speed. The ratio suffered
from that. LZO is better in this regard and the licensing issues do not
prevent adding it to btrfs, unlike ZFS (though there were other
concerns). LZ4 is released under BSD license, so it was a natural choice
IMO.

The use case for LZ4 in btrfs builds on the high compression mode that
maintains the same binary format but is able to achieve a higher ratio.
The high compression would be triggered through defrag if there are
resources available, otherwise the real-time version would be used for
new writes. Applying a defrag on the system files (binaries, libs)
should improve performance in the read-mostly load you've mentioned
above.

I don't have much time to continue on that. I don't mind sharing the
code (some draft is lying around in my git repos) and letting somebody
continue, but this needs experience with kernel internals regarding
memory management and performance tuning.

So, this is a NAK for your patchset.


* Re: [PATCH 0/8] Add support for LZ4 compression
  2014-06-03 15:53 ` David Sterba
@ 2014-06-04 14:00   ` Chris Mason
  2014-06-12  8:47     ` David Sterba
  0 siblings, 1 reply; 13+ messages in thread
From: Chris Mason @ 2014-06-04 14:00 UTC (permalink / raw)
  To: dsterba, Philip Worrall, linux-btrfs

On 06/03/2014 11:53 AM, David Sterba wrote:
> On Sat, May 31, 2014 at 11:48:28PM +0000, Philip Worrall wrote:
>> LZ4 is a lossless data compression algorithm that is focused on 
>> compression and decompression speed. LZ4 gives a slightly worse
>> compression ratio compared with LZO (and much worse than Zlib)
>> but compression speeds are *generally* similar to LZO. 
>> Decompression tends to be much faster under LZ4 compared 
>> with LZO hence it makes more sense to use LZ4 compression
>> when your workload involves a higher proportion of reads.
>>
>> The following patch set adds LZ4 compression support to BTRFS
>> using the existing kernel implementation. It is based on the 
>> changeset for LZO support in 2011. Once a filesystem has been 
>> mounted with LZ4 compression enabled older versions of BTRFS 
>> will be unable to read it. This implementation is however 
>> backwards compatible with filesystems that currently use 
>> LZO or Zlib compression. Existing data will remain unchanged 
>> but any new files that you create will be compressed with LZ4.
> 
> tl;dr simply copying what btrfs+LZO does will not buy us anything in
> terms of speedup or space savings.

I have a slightly different reason for holding off on these.  Disk
format changes are forever, and we need a really strong use case for
pulling them in.

With that said, thanks for spending all of the time on this.  Pulling in
Dave's idea to stream larger compression blocks through lzo (or any new
alg) might be enough to push performance much higher, and better
showcase the differences between new algorithms.

The whole reason I chose zlib originally was because its streaming
interface was a better fit for how FS IO worked.

-chris

* Re: [PATCH 0/8] Add support for LZ4 compression
  2014-06-04 14:00   ` Chris Mason
@ 2014-06-12  8:47     ` David Sterba
  0 siblings, 0 replies; 13+ messages in thread
From: David Sterba @ 2014-06-12  8:47 UTC (permalink / raw)
  To: Chris Mason; +Cc: dsterba, Philip Worrall, linux-btrfs

On Wed, Jun 04, 2014 at 10:00:06AM -0400, Chris Mason wrote:
> I have a slightly different reason for holding off on these.  Disk
> format changes are forever, and we need a really strong use case for
> pulling them in.

The format upgrade is inevitable for full bidirectional interoperability
of filesystems with non-pagesized sectorsize and compression. At the
moment this is not possible even without compression, but patches are on
the way.

> With that said, thanks for spending all of the time on this.  Pulling in
> Dave's idea to stream larger compression blocks through lzo (or any new
> alg) might be enough to push performance much higher, and better
> showcase the differences between new algorithms.

The space savings and speed gains can be measured outside of btrfs.
From the past numbers I see that going from 4k to 64k chunks brings
another 5-10% of ratio and the de/compression speed is not worse.

Bigger chunks do not improve that much, but the overhead for assembling
the linear mappings would be decreased.
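
A minimal sketch of how such a measurement can be done in userspace,
assuming Python and its bundled zlib module as a stand-in compressor (LZO
and LZ4 are not in the standard library, so the absolute numbers will
differ from the ones quoted above). The helper names compressed_size and
make_sample are made up for this example, and the sample data is synthetic.

```python
import zlib


def compressed_size(data: bytes, chunk_size: int) -> int:
    """Total output size when each chunk is compressed independently."""
    total = 0
    for off in range(0, len(data), chunk_size):
        total += len(zlib.compress(data[off:off + chunk_size]))
    return total


def make_sample(n_lines: int = 20000) -> bytes:
    # Synthetic, mildly repetitive text; real file data will behave differently.
    lines = [f"inode {i} offset {i * 4096} flags {i % 7}\n" for i in range(n_lines)]
    return "".join(lines).encode()


if __name__ == "__main__":
    data = make_sample()
    for size in (4096, 65536):
        out = compressed_size(data, size)
        print(f"{size:>6} byte chunks: ratio {len(data) / out:.2f}")
```

Each chunk is compressed on its own, so the compressor cannot reuse matches
from earlier chunks. That is where the ratio lost with small chunks goes.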

> The whole reason I chose zlib originally was because its streaming
> interface was a better fit for how FS IO worked.

Right, zlib has the streaming interface and accepts randomly scattered
blocks, but the others do not. LZ4 has a streaming extension proposed,
but I haven't looked closely at whether it satisfies our constraints.
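
For illustration, this is roughly what the streaming interface looks like
from userspace, sketched with Python's zlib module rather than the kernel
API. The separate buffers stand in for pages that are not contiguous in
memory. The name compress_scattered is made up for this example.

```python
import zlib


def compress_scattered(blocks):
    """Feed separate buffers into one deflate stream and return the result."""
    comp = zlib.compressobj()
    out = [comp.compress(block) for block in blocks]
    out.append(comp.flush())  # emit whatever the compressor still buffers
    return b"".join(out)


# Four independent 4k buffers, never joined into one linear mapping.
pages = [bytes([i]) * 4096 for i in range(4)]
stream = compress_scattered(pages)

# The result is one ordinary zlib stream that decompresses to the joined input.
assert zlib.decompress(stream) == b"".join(pages)
```

A one-shot interface would need the four buffers copied or mapped into a
single contiguous region first.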

end of thread, other threads:[~2014-06-12  8:47 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-05-31 23:48 [PATCH 0/8] Add support for LZ4 compression Philip Worrall
2014-05-31 23:48 ` [PATCH 1/8] Btrfs: Add kernel config options for LZ4 Philip Worrall
2014-05-31 23:48 ` [PATCH 2/8] Btrfs: Add lz4.c to the Makefile Philip Worrall
2014-05-31 23:48 ` [PATCH 3/8] Btrfs: Add lz4 compression to avaialble compression ops Philip Worrall
2014-05-31 23:48 ` [PATCH 4/8] Btrfs: Add definition for external lz4 compression struct Philip Worrall
2014-05-31 23:48 ` [PATCH 5/8] Btrfs: Add feature flags for LZ4 support Philip Worrall
2014-05-31 23:48 ` [PATCH 6/8] Btrfs: Ensure LZ4 feature flags are set when mounting with LZ4 Philip Worrall
2014-05-31 23:48 ` [PATCH 7/8] Btrfs: Add lz4 compression/decompression struct ops Philip Worrall
2014-05-31 23:48 ` [PATCH 8/8] Btrfs: Check for compress=lz4 in mount options Philip Worrall
2014-06-02 22:04 ` [PATCH 0/8] Add support for LZ4 compression Mitch Harder
2014-06-03 15:53 ` David Sterba
2014-06-04 14:00   ` Chris Mason
2014-06-12  8:47     ` David Sterba
