* [RFC PATCH 0/5] Btrfs: add interface for writing compressed extent directly
@ 2019-08-15 21:04 Omar Sandoval
  2019-08-15 21:04 ` [PATCH 1/5] Btrfs: use correct count in btrfs_file_write_iter() Omar Sandoval
                   ` (6 more replies)
  0 siblings, 7 replies; 23+ messages in thread
From: Omar Sandoval @ 2019-08-15 21:04 UTC (permalink / raw)
  To: linux-btrfs; +Cc: kernel-team

From: Omar Sandoval <osandov@fb.com>

Hello,

This series adds a way to write compressed data directly to Btrfs. The
intended use case is making send/receive on compressed file systems more
efficient; however, the interface is general enough that it could be
used in other scenarios. Patch 5 is the main change; see that for more
details.

Patches 1-3 are small fixes/cleanups that I ran into while implementing
this; they should go in regardless of the remainder of the series. Patch
4 exports a required VFS interface.

An example program and test case are available at [1].

To preemptively address a few concerns:

- Writing arbitrary, untrusted data which we feed to the decompression
  algorithm can be a security risk. For that reason, the ioctl is
  restricted to CAP_SYS_ADMIN. The Btrfs code is properly hardened
  against invalid compressed data/incorrect lengths, and the compression
  libraries are mature, but better safe than sorry for now.
- If the user is writing their own compressed data rather than just
  blindly feeding in something from btrfs send, they need to know some
  implementation details about the compression format. For zlib, there
  are no special requirements. For zstd, a non-default compression
  parameter must be used. For lzo, we have our own wrapper format since
  lzo doesn't have a standard wrapper format. It feels a little wrong to
  expose these details, but they are part of the on-disk format, so they
  must be stable regardless.
- The permissions checks duplicated from the VFS code are fairly
  minimal.

This series is based on misc-next.

This is an RFC, so please, comment away.

Thanks!

1: https://github.com/osandov/xfstests/tree/btrfs-compressed-write

Omar Sandoval (5):
  Btrfs: use correct count in btrfs_file_write_iter()
  Btrfs: treat RWF_{,D}SYNC writes as sync for CRCs
  Btrfs: stop clearing EXTENT_DIRTY in inode I/O tree
  fs: export rw_verify_area()
  Btrfs: add ioctl for directly writing compressed data

 fs/btrfs/compression.c       |   6 +-
 fs/btrfs/compression.h       |  14 +--
 fs/btrfs/ctree.h             |  12 ++
 fs/btrfs/extent_io.c         |   6 +-
 fs/btrfs/file.c              |  22 ++--
 fs/btrfs/free-space-cache.c  |   9 +-
 fs/btrfs/inode.c             | 232 +++++++++++++++++++++++++++++++----
 fs/btrfs/ioctl.c             | 101 ++++++++++++++-
 fs/btrfs/tests/inode-tests.c |  12 +-
 fs/internal.h                |   5 -
 fs/read_write.c              |   1 +
 include/linux/fs.h           |   1 +
 include/uapi/linux/btrfs.h   |  63 ++++++++++
 13 files changed, 415 insertions(+), 69 deletions(-)

-- 
2.17.1



* [PATCH 1/5] Btrfs: use correct count in btrfs_file_write_iter()
  2019-08-15 21:04 [RFC PATCH 0/5] Btrfs: add interface for writing compressed extent directly Omar Sandoval
@ 2019-08-15 21:04 ` Omar Sandoval
  2019-08-16 16:56   ` Josef Bacik
  2019-08-15 21:04 ` [PATCH 2/5] Btrfs: treat RWF_{,D}SYNC writes as sync for CRCs Omar Sandoval
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 23+ messages in thread
From: Omar Sandoval @ 2019-08-15 21:04 UTC (permalink / raw)
  To: linux-btrfs; +Cc: kernel-team

From: Omar Sandoval <osandov@fb.com>

generic_write_checks() may modify iov_iter_count(), so we must get the
count after the call, not before. Using the wrong one has a couple of
consequences:

1. We check a longer range in check_can_nocow() for nowait than we're
   actually writing.
2. We create extra hole extent maps in btrfs_cont_expand(). As far as I
   can tell, this is harmless, but I might be missing something.

These issues are pretty minor, but let's fix them before something more
important trips on it.
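
The ordering problem described above can be sketched in user space. This is
only an illustration: struct fake_iter and fake_write_checks() are
hypothetical stand-ins for struct iov_iter and generic_write_checks(), and
the clamping rule is an assumption chosen to make the effect visible.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct iov_iter: only tracks bytes remaining. */
struct fake_iter {
	size_t count;
};

/*
 * Hypothetical stand-in for generic_write_checks(): clamps the write so it
 * does not extend past `limit`, shrinking the iterator in place.
 */
static long fake_write_checks(struct fake_iter *it, size_t pos, size_t limit)
{
	if (pos >= limit)
		return -1;
	if (it->count > limit - pos)
		it->count = limit - pos;
	return (long)it->count;
}

/* Old ordering: the count is sampled before the checks run, so it is stale. */
static size_t count_sampled_before(struct fake_iter *it, size_t pos,
				   size_t limit)
{
	size_t count = it->count;

	fake_write_checks(it, pos, limit);
	return count;
}

/* New ordering: the count is sampled after the checks may have shrunk it. */
static size_t count_sampled_after(struct fake_iter *it, size_t pos,
				  size_t limit)
{
	fake_write_checks(it, pos, limit);
	return it->count;
}
```

With an 8192-byte request at offset 4096 and a limit of 8192, the first
helper still reports 8192 bytes while the iterator only holds 4096; the
second reports the truncated length.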

Fixes: edf064e7c6fe ("btrfs: nowait aio support")
Signed-off-by: Omar Sandoval <osandov@fb.com>
---
 fs/btrfs/file.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index b31991f0f440..4393b6b24e02 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1885,7 +1885,7 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
 	bool sync = (file->f_flags & O_DSYNC) || IS_SYNC(file->f_mapping->host);
 	ssize_t err;
 	loff_t pos;
-	size_t count = iov_iter_count(from);
+	size_t count;
 	loff_t oldsize;
 	int clean_page = 0;
 
@@ -1906,6 +1906,7 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
 	}
 
 	pos = iocb->ki_pos;
+	count = iov_iter_count(from);
 	if (iocb->ki_flags & IOCB_NOWAIT) {
 		/*
 		 * We will allocate space in case nodatacow is not set,
-- 
2.17.1



* [PATCH 2/5] Btrfs: treat RWF_{,D}SYNC writes as sync for CRCs
  2019-08-15 21:04 [RFC PATCH 0/5] Btrfs: add interface for writing compressed extent directly Omar Sandoval
  2019-08-15 21:04 ` [PATCH 1/5] Btrfs: use correct count in btrfs_file_write_iter() Omar Sandoval
@ 2019-08-15 21:04 ` Omar Sandoval
  2019-08-16 16:59   ` Josef Bacik
  2019-08-27 12:35   ` David Sterba
  2019-08-15 21:04 ` [PATCH 3/5] Btrfs: stop clearing EXTENT_DIRTY in inode I/O tree Omar Sandoval
                   ` (4 subsequent siblings)
  6 siblings, 2 replies; 23+ messages in thread
From: Omar Sandoval @ 2019-08-15 21:04 UTC (permalink / raw)
  To: linux-btrfs; +Cc: kernel-team

From: Omar Sandoval <osandov@fb.com>

In btrfs_file_write_iter(), we treat a write as synchronous if the
file is marked as synchronous. However, with pwritev2(), a write with
RWF_SYNC or RWF_DSYNC is also synchronous even if the file isn't by
default. Make sure we bump the sync_writers counter in that case, too,
so that we'll do the CRCs synchronously.
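
A rough user-space model of the difference, assuming the VFS has already
folded both the per-file state and the per-call RWF_* flags into IOCB_DSYNC
by the time the filesystem sees the kiocb. The MODEL_* names and bit values
are made up for illustration and are not the kernel's constants.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only; the kernel's real constants differ. */
enum {
	MODEL_O_DSYNC    = 1 << 0, /* file was opened with O_DSYNC */
	MODEL_INODE_SYNC = 1 << 1, /* inode is marked synchronous (IS_SYNC) */
	MODEL_RWF_DSYNC  = 1 << 2, /* per-call flag passed to pwritev2() */
	MODEL_RWF_SYNC   = 1 << 3, /* per-call flag passed to pwritev2() */
};

/* Old check: only the per-file state is consulted. */
static bool sync_old(unsigned int file_state, unsigned int rwf)
{
	(void)rwf; /* per-call flags are ignored, which is the bug */
	return (file_state & (MODEL_O_DSYNC | MODEL_INODE_SYNC)) != 0;
}

/* New check: models testing a single bit that covers both sources. */
static bool sync_new(unsigned int file_state, unsigned int rwf)
{
	bool iocb_dsync =
		(file_state & (MODEL_O_DSYNC | MODEL_INODE_SYNC)) != 0 ||
		(rwf & (MODEL_RWF_DSYNC | MODEL_RWF_SYNC)) != 0;

	return iocb_dsync;
}
```

The two checks agree for files that are synchronous by default and differ
only when the caller requests synchronous behavior per call.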

Signed-off-by: Omar Sandoval <osandov@fb.com>
---
 fs/btrfs/file.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 4393b6b24e02..27223753da7b 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1882,7 +1882,7 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
 	u64 start_pos;
 	u64 end_pos;
 	ssize_t num_written = 0;
-	bool sync = (file->f_flags & O_DSYNC) || IS_SYNC(file->f_mapping->host);
+	bool sync = iocb->ki_flags & IOCB_DSYNC;
 	ssize_t err;
 	loff_t pos;
 	size_t count;
-- 
2.17.1



* [PATCH 3/5] Btrfs: stop clearing EXTENT_DIRTY in inode I/O tree
  2019-08-15 21:04 [RFC PATCH 0/5] Btrfs: add interface for writing compressed extent directly Omar Sandoval
  2019-08-15 21:04 ` [PATCH 1/5] Btrfs: use correct count in btrfs_file_write_iter() Omar Sandoval
  2019-08-15 21:04 ` [PATCH 2/5] Btrfs: treat RWF_{,D}SYNC writes as sync for CRCs Omar Sandoval
@ 2019-08-15 21:04 ` Omar Sandoval
  2019-08-16 16:59   ` Josef Bacik
  2019-08-15 21:04 ` [RFC PATCH 4/5] fs: export rw_verify_area() Omar Sandoval
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 23+ messages in thread
From: Omar Sandoval @ 2019-08-15 21:04 UTC (permalink / raw)
  To: linux-btrfs; +Cc: kernel-team

From: Omar Sandoval <osandov@fb.com>

Since commit fee187d9d9dd ("Btrfs: do not set EXTENT_DIRTY along with
EXTENT_DELALLOC"), we never set EXTENT_DIRTY in inode->io_tree, so we
can simplify and stop trying to clear it.
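
The reasoning can be checked with a trivial bitmask model: clearing a bit
that is never set is a no-op, so dropping it from the mask leaves the
resulting state unchanged. The MODEL_EXTENT_* values below are illustrative
and are not the kernel's EXTENT_* constants.

```c
#include <assert.h>

/* Illustrative bit values only; not the kernel's EXTENT_* constants. */
#define MODEL_EXTENT_DIRTY    (1u << 0)
#define MODEL_EXTENT_DELALLOC (1u << 1)
#define MODEL_EXTENT_LOCKED   (1u << 2)

/* Return `state` with every bit in `mask` cleared. */
static unsigned int clear_bits(unsigned int state, unsigned int mask)
{
	return state & ~mask;
}
```

For any state without MODEL_EXTENT_DIRTY set, clearing
MODEL_EXTENT_DIRTY | MODEL_EXTENT_DELALLOC gives the same result as clearing
MODEL_EXTENT_DELALLOC alone.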

Signed-off-by: Omar Sandoval <osandov@fb.com>
---
 fs/btrfs/extent_io.c         |  6 ++----
 fs/btrfs/file.c              |  4 ++--
 fs/btrfs/free-space-cache.c  |  9 ++++----
 fs/btrfs/inode.c             | 41 ++++++++++++++----------------------
 fs/btrfs/ioctl.c             |  5 ++---
 fs/btrfs/tests/inode-tests.c | 12 ++++-------
 6 files changed, 30 insertions(+), 47 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index bac59d721b54..4dc5e6939856 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -4322,10 +4322,8 @@ int extent_invalidatepage(struct extent_io_tree *tree,
 
 	lock_extent_bits(tree, start, end, &cached_state);
 	wait_on_page_writeback(page);
-	clear_extent_bit(tree, start, end,
-			 EXTENT_LOCKED | EXTENT_DIRTY | EXTENT_DELALLOC |
-			 EXTENT_DO_ACCOUNTING,
-			 1, 1, &cached_state);
+	clear_extent_bit(tree, start, end, EXTENT_LOCKED | EXTENT_DELALLOC |
+			 EXTENT_DO_ACCOUNTING, 1, 1, &cached_state);
 	return 0;
 }
 
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 27223753da7b..c080fbcbda11 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -537,8 +537,8 @@ int btrfs_dirty_pages(struct inode *inode, struct page **pages,
 	 * we can set things up properly
 	 */
 	clear_extent_bit(&BTRFS_I(inode)->io_tree, start_pos, end_of_last_block,
-			 EXTENT_DIRTY | EXTENT_DELALLOC |
-			 EXTENT_DO_ACCOUNTING | EXTENT_DEFRAG, 0, 0, cached);
+			 EXTENT_DELALLOC | EXTENT_DO_ACCOUNTING | EXTENT_DEFRAG,
+			 0, 0, cached);
 
 	if (!btrfs_is_free_space_inode(BTRFS_I(inode))) {
 		if (start_pos >= isize &&
diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
index faaf57a7c289..96cf1e2dc388 100644
--- a/fs/btrfs/free-space-cache.c
+++ b/fs/btrfs/free-space-cache.c
@@ -1005,7 +1005,7 @@ update_cache_item(struct btrfs_trans_handle *trans,
 	ret = btrfs_search_slot(trans, root, &key, path, 0, 1);
 	if (ret < 0) {
 		clear_extent_bit(&BTRFS_I(inode)->io_tree, 0, inode->i_size - 1,
-				 EXTENT_DIRTY | EXTENT_DELALLOC, 0, 0, NULL);
+				 EXTENT_DELALLOC, 0, 0, NULL);
 		goto fail;
 	}
 	leaf = path->nodes[0];
@@ -1017,9 +1017,8 @@ update_cache_item(struct btrfs_trans_handle *trans,
 		if (found_key.objectid != BTRFS_FREE_SPACE_OBJECTID ||
 		    found_key.offset != offset) {
 			clear_extent_bit(&BTRFS_I(inode)->io_tree, 0,
-					 inode->i_size - 1,
-					 EXTENT_DIRTY | EXTENT_DELALLOC, 0, 0,
-					 NULL);
+					 inode->i_size - 1, EXTENT_DELALLOC, 0,
+					 0, NULL);
 			btrfs_release_path(path);
 			goto fail;
 		}
@@ -1115,7 +1114,7 @@ static int flush_dirty_cache(struct inode *inode)
 	ret = btrfs_wait_ordered_range(inode, 0, (u64)-1);
 	if (ret)
 		clear_extent_bit(&BTRFS_I(inode)->io_tree, 0, inode->i_size - 1,
-				 EXTENT_DIRTY | EXTENT_DELALLOC, 0, 0, NULL);
+				 EXTENT_DELALLOC, 0, 0, NULL);
 
 	return ret;
 }
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 612c25aac15c..491755921c4b 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -4935,9 +4935,8 @@ int btrfs_truncate_block(struct inode *inode, loff_t from, loff_t len,
 	}
 
 	clear_extent_bit(&BTRFS_I(inode)->io_tree, block_start, block_end,
-			  EXTENT_DIRTY | EXTENT_DELALLOC |
-			  EXTENT_DO_ACCOUNTING | EXTENT_DEFRAG,
-			  0, 0, &cached_state);
+			 EXTENT_DELALLOC | EXTENT_DO_ACCOUNTING | EXTENT_DEFRAG,
+			 0, 0, &cached_state);
 
 	ret = btrfs_set_extent_delalloc(inode, block_start, block_end, 0,
 					&cached_state);
@@ -5321,9 +5320,9 @@ static void evict_inode_truncate_pages(struct inode *inode)
 			btrfs_qgroup_free_data(inode, NULL, start, end - start + 1);
 
 		clear_extent_bit(io_tree, start, end,
-				 EXTENT_LOCKED | EXTENT_DIRTY |
-				 EXTENT_DELALLOC | EXTENT_DO_ACCOUNTING |
-				 EXTENT_DEFRAG, 1, 1, &cached_state);
+				 EXTENT_LOCKED | EXTENT_DELALLOC |
+				 EXTENT_DO_ACCOUNTING | EXTENT_DEFRAG, 1, 1,
+				 &cached_state);
 
 		cond_resched();
 		spin_lock(&io_tree->lock);
@@ -7690,12 +7689,9 @@ static int btrfs_get_blocks_direct(struct inode *inode, sector_t iblock,
 	u64 start = iblock << inode->i_blkbits;
 	u64 lockstart, lockend;
 	u64 len = bh_result->b_size;
-	int unlock_bits = EXTENT_LOCKED;
 	int ret = 0;
 
-	if (create)
-		unlock_bits |= EXTENT_DIRTY;
-	else
+	if (!create)
 		len = min_t(u64, len, fs_info->sectorsize);
 
 	lockstart = start;
@@ -7754,9 +7750,8 @@ static int btrfs_get_blocks_direct(struct inode *inode, sector_t iblock,
 		if (ret < 0)
 			goto unlock_err;
 
-		/* clear and unlock the entire range */
-		clear_extent_bit(&BTRFS_I(inode)->io_tree, lockstart, lockend,
-				 unlock_bits, 1, 0, &cached_state);
+		unlock_extent_cached(&BTRFS_I(inode)->io_tree, lockstart,
+				     lockend, &cached_state);
 	} else {
 		ret = btrfs_get_blocks_direct_read(em, bh_result, inode,
 						   start, len);
@@ -7772,9 +7767,8 @@ static int btrfs_get_blocks_direct(struct inode *inode, sector_t iblock,
 		 */
 		lockstart = start + bh_result->b_size;
 		if (lockstart < lockend) {
-			clear_extent_bit(&BTRFS_I(inode)->io_tree, lockstart,
-					 lockend, unlock_bits, 1, 0,
-					 &cached_state);
+			unlock_extent_cached(&BTRFS_I(inode)->io_tree,
+					     lockstart, lockend, &cached_state);
 		} else {
 			free_extent_state(cached_state);
 		}
@@ -7785,8 +7779,8 @@ static int btrfs_get_blocks_direct(struct inode *inode, sector_t iblock,
 	return 0;
 
 unlock_err:
-	clear_extent_bit(&BTRFS_I(inode)->io_tree, lockstart, lockend,
-			 unlock_bits, 1, 0, &cached_state);
+	unlock_extent_cached(&BTRFS_I(inode)->io_tree, lockstart, lockend,
+			     &cached_state);
 err:
 	if (dio_data)
 		current->journal_info = dio_data;
@@ -8801,8 +8795,7 @@ static void btrfs_invalidatepage(struct page *page, unsigned int offset,
 		 */
 		if (!inode_evicting)
 			clear_extent_bit(tree, start, end,
-					 EXTENT_DIRTY | EXTENT_DELALLOC |
-					 EXTENT_DELALLOC_NEW |
+					 EXTENT_DELALLOC | EXTENT_DELALLOC_NEW |
 					 EXTENT_LOCKED | EXTENT_DO_ACCOUNTING |
 					 EXTENT_DEFRAG, 1, 0, &cached_state);
 		/*
@@ -8857,8 +8850,7 @@ static void btrfs_invalidatepage(struct page *page, unsigned int offset,
 	if (PageDirty(page))
 		btrfs_qgroup_free_data(inode, NULL, page_start, PAGE_SIZE);
 	if (!inode_evicting) {
-		clear_extent_bit(tree, page_start, page_end,
-				 EXTENT_LOCKED | EXTENT_DIRTY |
+		clear_extent_bit(tree, page_start, page_end, EXTENT_LOCKED |
 				 EXTENT_DELALLOC | EXTENT_DELALLOC_NEW |
 				 EXTENT_DO_ACCOUNTING | EXTENT_DEFRAG, 1, 1,
 				 &cached_state);
@@ -8986,9 +8978,8 @@ vm_fault_t btrfs_page_mkwrite(struct vm_fault *vmf)
 	 * reserve data&meta space before lock_page() (see above comments).
 	 */
 	clear_extent_bit(&BTRFS_I(inode)->io_tree, page_start, end,
-			  EXTENT_DIRTY | EXTENT_DELALLOC |
-			  EXTENT_DO_ACCOUNTING | EXTENT_DEFRAG,
-			  0, 0, &cached_state);
+			  EXTENT_DELALLOC | EXTENT_DO_ACCOUNTING |
+			  EXTENT_DEFRAG, 0, 0, &cached_state);
 
 	ret2 = btrfs_set_extent_delalloc(inode, page_start, end, 0,
 					&cached_state);
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 4eabd419aaca..4b383811a7d2 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -1333,9 +1333,8 @@ static int cluster_pages_for_defrag(struct inode *inode,
 	lock_extent_bits(&BTRFS_I(inode)->io_tree,
 			 page_start, page_end - 1, &cached_state);
 	clear_extent_bit(&BTRFS_I(inode)->io_tree, page_start,
-			  page_end - 1, EXTENT_DIRTY | EXTENT_DELALLOC |
-			  EXTENT_DO_ACCOUNTING | EXTENT_DEFRAG, 0, 0,
-			  &cached_state);
+			  page_end - 1, EXTENT_DELALLOC | EXTENT_DO_ACCOUNTING |
+			  EXTENT_DEFRAG, 0, 0, &cached_state);
 
 	if (i_done != page_cnt) {
 		spin_lock(&BTRFS_I(inode)->lock);
diff --git a/fs/btrfs/tests/inode-tests.c b/fs/btrfs/tests/inode-tests.c
index b363fb990cec..09ecf7dc7b08 100644
--- a/fs/btrfs/tests/inode-tests.c
+++ b/fs/btrfs/tests/inode-tests.c
@@ -988,8 +988,7 @@ static int test_extent_accounting(u32 sectorsize, u32 nodesize)
 	ret = clear_extent_bit(&BTRFS_I(inode)->io_tree,
 			       BTRFS_MAX_EXTENT_SIZE >> 1,
 			       (BTRFS_MAX_EXTENT_SIZE >> 1) + sectorsize - 1,
-			       EXTENT_DELALLOC | EXTENT_DIRTY |
-			       EXTENT_UPTODATE, 0, 0, NULL);
+			       EXTENT_DELALLOC | EXTENT_UPTODATE, 0, 0, NULL);
 	if (ret) {
 		test_err("clear_extent_bit returned %d", ret);
 		goto out;
@@ -1056,8 +1055,7 @@ static int test_extent_accounting(u32 sectorsize, u32 nodesize)
 	ret = clear_extent_bit(&BTRFS_I(inode)->io_tree,
 			       BTRFS_MAX_EXTENT_SIZE + sectorsize,
 			       BTRFS_MAX_EXTENT_SIZE + 2 * sectorsize - 1,
-			       EXTENT_DIRTY | EXTENT_DELALLOC |
-			       EXTENT_UPTODATE, 0, 0, NULL);
+			       EXTENT_DELALLOC | EXTENT_UPTODATE, 0, 0, NULL);
 	if (ret) {
 		test_err("clear_extent_bit returned %d", ret);
 		goto out;
@@ -1089,8 +1087,7 @@ static int test_extent_accounting(u32 sectorsize, u32 nodesize)
 
 	/* Empty */
 	ret = clear_extent_bit(&BTRFS_I(inode)->io_tree, 0, (u64)-1,
-			       EXTENT_DIRTY | EXTENT_DELALLOC |
-			       EXTENT_UPTODATE, 0, 0, NULL);
+			       EXTENT_DELALLOC | EXTENT_UPTODATE, 0, 0, NULL);
 	if (ret) {
 		test_err("clear_extent_bit returned %d", ret);
 		goto out;
@@ -1105,8 +1102,7 @@ static int test_extent_accounting(u32 sectorsize, u32 nodesize)
 out:
 	if (ret)
 		clear_extent_bit(&BTRFS_I(inode)->io_tree, 0, (u64)-1,
-				 EXTENT_DIRTY | EXTENT_DELALLOC |
-				 EXTENT_UPTODATE, 0, 0, NULL);
+				 EXTENT_DELALLOC | EXTENT_UPTODATE, 0, 0, NULL);
 	iput(inode);
 	btrfs_free_dummy_root(root);
 	btrfs_free_dummy_fs_info(fs_info);
-- 
2.17.1



* [RFC PATCH 4/5] fs: export rw_verify_area()
  2019-08-15 21:04 [RFC PATCH 0/5] Btrfs: add interface for writing compressed extent directly Omar Sandoval
                   ` (2 preceding siblings ...)
  2019-08-15 21:04 ` [PATCH 3/5] Btrfs: stop clearing EXTENT_DIRTY in inode I/O tree Omar Sandoval
@ 2019-08-15 21:04 ` Omar Sandoval
  2019-08-16 17:02   ` Josef Bacik
  2019-08-15 21:04 ` [RFC PATCH 5/5] Btrfs: add ioctl for directly writing compressed data Omar Sandoval
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 23+ messages in thread
From: Omar Sandoval @ 2019-08-15 21:04 UTC (permalink / raw)
  To: linux-btrfs; +Cc: kernel-team

From: Omar Sandoval <osandov@fb.com>

I'm adding a Btrfs ioctl to write compressed data, and rather than
duplicating the checks in rw_verify_area(), let's just export it.

Signed-off-by: Omar Sandoval <osandov@fb.com>
---
 fs/internal.h      | 5 -----
 fs/read_write.c    | 1 +
 include/linux/fs.h | 1 +
 3 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/fs/internal.h b/fs/internal.h
index 315fcd8d237c..94e1831d4c95 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -160,11 +160,6 @@ extern char *simple_dname(struct dentry *, char *, int);
 extern void dput_to_list(struct dentry *, struct list_head *);
 extern void shrink_dentry_list(struct list_head *);
 
-/*
- * read_write.c
- */
-extern int rw_verify_area(int, struct file *, const loff_t *, size_t);
-
 /*
  * pipe.c
  */
diff --git a/fs/read_write.c b/fs/read_write.c
index 1f5088dec566..9d95491ce9ab 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -399,6 +399,7 @@ int rw_verify_area(int read_write, struct file *file, const loff_t *ppos, size_t
 	return security_file_permission(file,
 				read_write == READ ? MAY_READ : MAY_WRITE);
 }
+EXPORT_SYMBOL(rw_verify_area);
 
 static ssize_t new_sync_read(struct file *filp, char __user *buf, size_t len, loff_t *ppos)
 {
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 997a530ff4e9..a9a1884768e4 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3082,6 +3082,7 @@ extern loff_t fixed_size_llseek(struct file *file, loff_t offset,
 		int whence, loff_t size);
 extern loff_t no_seek_end_llseek_size(struct file *, loff_t, int, loff_t);
 extern loff_t no_seek_end_llseek(struct file *, loff_t, int);
+extern int rw_verify_area(int, struct file *, const loff_t *, size_t);
 extern int generic_file_open(struct inode * inode, struct file * filp);
 extern int nonseekable_open(struct inode * inode, struct file * filp);
 extern int stream_open(struct inode * inode, struct file * filp);
-- 
2.17.1



* [RFC PATCH 5/5] Btrfs: add ioctl for directly writing compressed data
  2019-08-15 21:04 [RFC PATCH 0/5] Btrfs: add interface for writing compressed extent directly Omar Sandoval
                   ` (3 preceding siblings ...)
  2019-08-15 21:04 ` [RFC PATCH 4/5] fs: export rw_verify_area() Omar Sandoval
@ 2019-08-15 21:04 ` Omar Sandoval
  2019-08-26 21:36   ` Josef Bacik
  2019-08-28 12:06   ` David Sterba
  2019-08-15 21:14 ` [RFC PATCH 0/5] Btrfs: add interface for writing compressed extent directly Omar Sandoval
  2019-08-27 18:31 ` David Sterba
  6 siblings, 2 replies; 23+ messages in thread
From: Omar Sandoval @ 2019-08-15 21:04 UTC (permalink / raw)
  To: linux-btrfs; +Cc: kernel-team

From: Omar Sandoval <osandov@fb.com>

This adds an API for writing compressed data directly to the filesystem.
The use case that I have in mind is send/receive: currently, when
sending data from one compressed filesystem to another, the sending side
decompresses the data and the receiving side recompresses it before
writing it out. This is wasteful and can be avoided if we can just send
and write compressed extents. The send part will be implemented in a
separate series, as this ioctl can stand alone.

The interface is essentially pwrite(2) with some extra information:

- The input buffer contains the compressed data.
- Both the compressed and decompressed sizes of the data are given.
- The compression type (zlib, lzo, or zstd) is given.

A more detailed description of the interface, including restrictions and
edge cases, is included in include/uapi/linux/btrfs.h.

The implementation is similar to direct I/O: we have to flush any
ordered extents, invalidate the page cache, and do the io
tree/delalloc/extent map/ordered extent dance. From there, we can reuse
the compression code with a minor modification to distinguish the new
ioctl from writeback.
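
The size and alignment rules that btrfs_compressed_write() enforces in this
patch can be modeled in user space roughly as follows. This is a sketch, not
the kernel code: check_compressed_write() and align_up() are hypothetical
names, and the 128 KiB limits are assumed values for BTRFS_MAX_COMPRESSED
and BTRFS_MAX_UNCOMPRESSED.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed limits; the patch uses BTRFS_MAX_COMPRESSED/_UNCOMPRESSED. */
#define MODEL_MAX_COMPRESSED   (128u * 1024)
#define MODEL_MAX_UNCOMPRESSED (128u * 1024)

/* Round x up to a multiple of a, where a is a power of two. */
static uint64_t align_up(uint64_t x, uint64_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/* Return 0 if the request passes the modeled checks, -EINVAL otherwise. */
static int check_compressed_write(uint64_t start, uint64_t compressed_len,
				  uint64_t orig_len, uint64_t i_size,
				  uint64_t sectorsize)
{
	uint64_t disk_num_bytes, ram_bytes;

	/* The extent size must be sane. */
	if (compressed_len == 0 || compressed_len > MODEL_MAX_COMPRESSED ||
	    orig_len > MODEL_MAX_UNCOMPRESSED)
		return -EINVAL;

	/* On-disk data is padded with zeroes up to a sector boundary. */
	disk_num_bytes = align_up(compressed_len, sectorsize);

	/* The file offset must be sector-aligned. */
	if (start & (sectorsize - 1))
		return -EINVAL;

	/* Only a write ending at or past i_size may have an unaligned length. */
	if (start + orig_len >= i_size) {
		ram_bytes = align_up(orig_len, sectorsize);
	} else {
		ram_bytes = orig_len;
		if (ram_bytes & (sectorsize - 1))
			return -EINVAL;
	}

	/* Reject extents that do not shrink after sector rounding. */
	if (disk_num_bytes >= ram_bytes)
		return -EINVAL;
	return 0;
}
```

For example, with a 4096-byte sector size, 1000 compressed bytes expanding
to 8192 bytes at offset 0 of an empty file pass the checks, while 5000
compressed bytes for the same 8192 bytes are rejected because both sizes
round to 8192.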

Signed-off-by: Omar Sandoval <osandov@fb.com>
---
 fs/btrfs/compression.c     |   6 +-
 fs/btrfs/compression.h     |  14 +--
 fs/btrfs/ctree.h           |  12 +++
 fs/btrfs/file.c            |  13 ++-
 fs/btrfs/inode.c           | 191 ++++++++++++++++++++++++++++++++++++-
 fs/btrfs/ioctl.c           |  96 +++++++++++++++++++
 include/uapi/linux/btrfs.h |  63 ++++++++++++
 7 files changed, 380 insertions(+), 15 deletions(-)

diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index 60c47b417a4b..50e3a9a7e829 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -275,7 +275,8 @@ static void end_compressed_bio_write(struct bio *bio)
 			bio->bi_status == BLK_STS_OK);
 	cb->compressed_pages[0]->mapping = NULL;
 
-	end_compressed_writeback(inode, cb);
+	if (cb->writeback)
+		end_compressed_writeback(inode, cb);
 	/* note, our inode could be gone now */
 
 	/*
@@ -310,7 +311,7 @@ blk_status_t btrfs_submit_compressed_write(struct inode *inode, u64 start,
 				 unsigned long compressed_len,
 				 struct page **compressed_pages,
 				 unsigned long nr_pages,
-				 unsigned int write_flags)
+				 unsigned int write_flags, bool writeback)
 {
 	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
 	struct bio *bio = NULL;
@@ -335,6 +336,7 @@ blk_status_t btrfs_submit_compressed_write(struct inode *inode, u64 start,
 	cb->mirror_num = 0;
 	cb->compressed_pages = compressed_pages;
 	cb->compressed_len = compressed_len;
+	cb->writeback = writeback;
 	cb->orig_bio = NULL;
 	cb->nr_pages = nr_pages;
 
diff --git a/fs/btrfs/compression.h b/fs/btrfs/compression.h
index 2035b8eb1290..f39b69e8fbd7 100644
--- a/fs/btrfs/compression.h
+++ b/fs/btrfs/compression.h
@@ -6,6 +6,7 @@
 #ifndef BTRFS_COMPRESSION_H
 #define BTRFS_COMPRESSION_H
 
+#include <linux/btrfs.h>
 #include <linux/sizes.h>
 
 /*
@@ -47,6 +48,9 @@ struct compressed_bio {
 	/* the compression algorithm for this bio */
 	int compress_type;
 
+	/* Whether this is a write for writeback. */
+	bool writeback;
+
 	/* number of compressed pages in the array */
 	unsigned long nr_pages;
 
@@ -93,20 +97,12 @@ blk_status_t btrfs_submit_compressed_write(struct inode *inode, u64 start,
 				  unsigned long compressed_len,
 				  struct page **compressed_pages,
 				  unsigned long nr_pages,
-				  unsigned int write_flags);
+				  unsigned int write_flags, bool writeback);
 blk_status_t btrfs_submit_compressed_read(struct inode *inode, struct bio *bio,
 				 int mirror_num, unsigned long bio_flags);
 
 unsigned int btrfs_compress_str2level(unsigned int type, const char *str);
 
-enum btrfs_compression_type {
-	BTRFS_COMPRESS_NONE  = 0,
-	BTRFS_COMPRESS_ZLIB  = 1,
-	BTRFS_COMPRESS_LZO   = 2,
-	BTRFS_COMPRESS_ZSTD  = 3,
-	BTRFS_COMPRESS_TYPES = 3,
-};
-
 struct workspace_manager {
 	const struct btrfs_compress_op *ops;
 	struct list_head idle_ws;
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 85b808e3ea42..e2854345a3a6 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -2985,6 +2985,16 @@ int btrfs_run_delalloc_range(struct inode *inode, struct page *locked_page,
 int btrfs_writepage_cow_fixup(struct page *page, u64 start, u64 end);
 void btrfs_writepage_endio_finish_ordered(struct page *page, u64 start,
 					  u64 end, int uptodate);
+
+struct btrfs_compressed_write {
+	void __user *buf;
+	unsigned long compressed_len;
+	unsigned long orig_len;
+	int compress_type;
+};
+ssize_t btrfs_compressed_write(struct kiocb *iocb, struct iov_iter *from,
+			       struct btrfs_compressed_write *compressed);
+
 extern const struct dentry_operations btrfs_dentry_operations;
 
 /* ioctl.c */
@@ -3008,6 +3018,8 @@ int btrfs_add_inode_defrag(struct btrfs_trans_handle *trans,
 			   struct btrfs_inode *inode);
 int btrfs_run_defrag_inodes(struct btrfs_fs_info *fs_info);
 void btrfs_cleanup_defrag_inodes(struct btrfs_fs_info *fs_info);
+ssize_t btrfs_do_write_iter(struct kiocb *iocb, struct iov_iter *from,
+			    struct btrfs_compressed_write *compressed);
 int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync);
 void btrfs_drop_extent_cache(struct btrfs_inode *inode, u64 start, u64 end,
 			     int skip_pinned);
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index c080fbcbda11..1fcaa338baf5 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1872,8 +1872,8 @@ static void update_time_for_write(struct inode *inode)
 		inode_inc_iversion(inode);
 }
 
-static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
-				    struct iov_iter *from)
+ssize_t btrfs_do_write_iter(struct kiocb *iocb, struct iov_iter *from,
+			    struct btrfs_compressed_write *compressed)
 {
 	struct file *file = iocb->ki_filp;
 	struct inode *inode = file_inode(file);
@@ -1965,7 +1965,9 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
 	if (sync)
 		atomic_inc(&BTRFS_I(inode)->sync_writers);
 
-	if (iocb->ki_flags & IOCB_DIRECT) {
+	if (compressed) {
+		num_written = btrfs_compressed_write(iocb, from, compressed);
+	} else if (iocb->ki_flags & IOCB_DIRECT) {
 		num_written = __btrfs_direct_write(iocb, from);
 	} else {
 		num_written = btrfs_buffered_write(iocb, from);
@@ -1996,6 +1998,11 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
 	return num_written ? num_written : err;
 }
 
+static ssize_t btrfs_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
+{
+	return btrfs_do_write_iter(iocb, from, NULL);
+}
+
 int btrfs_release_file(struct inode *inode, struct file *filp)
 {
 	struct btrfs_file_private *private = filp->private_data;
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 491755921c4b..4ed8ba97b7d4 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -863,7 +863,7 @@ static noinline void submit_compressed_extents(struct async_chunk *async_chunk)
 				    ins.objectid,
 				    ins.offset, async_extent->pages,
 				    async_extent->nr_pages,
-				    async_chunk->write_flags)) {
+				    async_chunk->write_flags, true)) {
 			struct page *p = async_extent->pages[0];
 			const u64 start = async_extent->start;
 			const u64 end = start + async_extent->ram_size - 1;
@@ -10541,6 +10541,195 @@ void btrfs_set_range_writeback(struct extent_io_tree *tree, u64 start, u64 end)
 	}
 }
 
+ssize_t btrfs_compressed_write(struct kiocb *iocb, struct iov_iter *from,
+			       struct btrfs_compressed_write *compressed)
+{
+	struct file *file = iocb->ki_filp;
+	struct inode *inode = file_inode(file);
+	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
+	struct btrfs_root *root = BTRFS_I(inode)->root;
+	struct extent_io_tree *io_tree = &BTRFS_I(inode)->io_tree;
+	struct extent_changeset *data_reserved = NULL;
+	struct extent_state *cached_state = NULL;
+	unsigned long nr_pages, i;
+	struct page **pages;
+	unsigned long disk_num_bytes, ram_bytes;
+	u64 start, end;
+	struct btrfs_key ins;
+	struct extent_map *em;
+	ssize_t ret;
+
+	if (iov_iter_count(from) != compressed->orig_len) {
+		/*
+		 * The write got truncated by generic_write_checks(). We can't
+		 * do a partial compressed write.
+		 */
+		return -EFBIG;
+	}
+
+	/* This should be handled higher up. */
+	ASSERT(compressed->orig_len != 0);
+
+	/* The extent size must be sane. */
+	if (compressed->compressed_len > BTRFS_MAX_COMPRESSED ||
+	    compressed->orig_len > BTRFS_MAX_UNCOMPRESSED ||
+	    compressed->compressed_len == 0)
+		return -EINVAL;
+
+	/*
+	 * The compressed data on disk must be sector-aligned. For convenience,
+	 * we extend the compressed data with zeroes if it isn't.
+	 */
+	disk_num_bytes = ALIGN(compressed->compressed_len, fs_info->sectorsize);
+	/*
+	 * The extent in the file must also be sector-aligned. However, we allow
+	 * a write which ends at or extends i_size to have an unaligned length;
+	 * we round up the extent size and set i_size to the given length.
+	 */
+	start = iocb->ki_pos;
+	if ((start & (fs_info->sectorsize - 1)))
+		return -EINVAL;
+	if (start + compressed->orig_len >= inode->i_size) {
+		ram_bytes = ALIGN(compressed->orig_len, fs_info->sectorsize);
+	} else {
+		ram_bytes = compressed->orig_len;
+		if ((ram_bytes & (fs_info->sectorsize - 1)))
+			return -EINVAL;
+	}
+	end = start + ram_bytes - 1;
+
+	/*
+	 * It's valid for compressed data to be larger than or the same size as
+	 * the decompressed data. However, for buffered I/O, we never write out
+	 * a compressed extent unless it's smaller than the decompressed data,
+	 * so for now, let's not allow creating such extents with the ioctl,
+	 * either.
+	 */
+	if (disk_num_bytes >= ram_bytes)
+		return -EINVAL;
+
+	nr_pages = DIV_ROUND_UP(disk_num_bytes, PAGE_SIZE);
+	pages = kcalloc(nr_pages, sizeof(struct page *),
+			GFP_USER | __GFP_NOWARN);
+	if (!pages)
+		return -ENOMEM;
+	for (i = 0; i < nr_pages; i++) {
+		unsigned long offset = i << PAGE_SHIFT, n;
+		char *kaddr;
+
+		pages[i] = alloc_page(GFP_USER | __GFP_NOWARN);
+		if (!pages[i]) {
+			ret = -ENOMEM;
+			goto out_pages;
+		}
+		kaddr = kmap(pages[i]);
+		if (offset < compressed->compressed_len) {
+			n = min(PAGE_SIZE, compressed->compressed_len - offset);
+			if (copy_from_user(kaddr, compressed->buf + offset,
+					   n)) {
+				kunmap(pages[i]);
+				ret = -EFAULT;
+				goto out_pages;
+			}
+		} else {
+			n = 0;
+		}
+		if (n < PAGE_SIZE)
+			memset(kaddr + n, 0, PAGE_SIZE - n);
+		kunmap(pages[i]);
+	}
+
+	for (;;) {
+		struct btrfs_ordered_extent *ordered;
+
+		lock_extent_bits(io_tree, start, end, &cached_state);
+		ordered = btrfs_lookup_ordered_range(BTRFS_I(inode), start,
+						     end - start + 1);
+		if (!ordered &&
+		    !filemap_range_has_page(inode->i_mapping, start, end))
+			break;
+		if (ordered)
+			btrfs_put_ordered_extent(ordered);
+		unlock_extent_cached(&BTRFS_I(inode)->io_tree, start, end,
+				     &cached_state);
+		cond_resched();
+		ret = btrfs_wait_ordered_range(inode, start, end);
+		if (ret)
+			goto out_pages;
+		ret = invalidate_inode_pages2_range(inode->i_mapping,
+						    start >> PAGE_SHIFT,
+						    end >> PAGE_SHIFT);
+		if (ret)
+			goto out_pages;
+	}
+
+	ret = btrfs_delalloc_reserve_space(inode, &data_reserved, start,
+					   ram_bytes);
+	if (ret)
+		goto out_unlock;
+
+	ret = btrfs_reserve_extent(root, ram_bytes, disk_num_bytes,
+				   disk_num_bytes, 0, 0, &ins, 1, 1);
+	if (ret)
+		goto out_delalloc_release;
+
+	em = create_io_em(inode, start, ram_bytes, start, ins.objectid,
+			  ins.offset, ins.offset, ram_bytes,
+			  compressed->compress_type, BTRFS_ORDERED_COMPRESSED);
+	if (IS_ERR(em)) {
+		ret = PTR_ERR(em);
+		goto out_free_reserve;
+	}
+	free_extent_map(em);
+
+	ret = btrfs_add_ordered_extent_compress(inode, start, ins.objectid,
+						ram_bytes, ins.offset,
+						BTRFS_ORDERED_COMPRESSED,
+						compressed->compress_type);
+	if (ret) {
+		btrfs_drop_extent_cache(BTRFS_I(inode), start, end, 0);
+		goto out_free_reserve;
+	}
+	btrfs_dec_block_group_reservations(fs_info, ins.objectid);
+
+	if (start + compressed->orig_len > inode->i_size)
+		i_size_write(inode, start + compressed->orig_len);
+
+	unlock_extent_cached(io_tree, start, end, &cached_state);
+
+	btrfs_delalloc_release_extents(BTRFS_I(inode), ram_bytes, false);
+
+	if (btrfs_submit_compressed_write(inode, start, ram_bytes, ins.objectid,
+					  ins.offset, pages, nr_pages, 0,
+					  false)) {
+		struct page *page = pages[0];
+
+		page->mapping = inode->i_mapping;
+		btrfs_writepage_endio_finish_ordered(page, start, end, 0);
+		page->mapping = NULL;
+		ret = -EIO;
+		goto out_pages;
+	}
+	iocb->ki_pos += compressed->orig_len;
+	return compressed->orig_len;
+
+out_free_reserve:
+	btrfs_dec_block_group_reservations(fs_info, ins.objectid);
+	btrfs_free_reserved_extent(fs_info, ins.objectid, ins.offset, 1);
+out_delalloc_release:
+	btrfs_delalloc_release_space(inode, data_reserved, start, ram_bytes,
+				     true);
+out_unlock:
+	unlock_extent_cached(io_tree, start, end, &cached_state);
+out_pages:
+	for (i = 0; i < nr_pages; i++) {
+		if (pages[i])
+			put_page(pages[i]);
+	}
+	kfree(pages);
+	return ret;
+}
+
 #ifdef CONFIG_SWAP
 /*
  * Add an entry indicating a block group or device which is pinned by a
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 4b383811a7d2..7c829cd21d8e 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -26,6 +26,7 @@
 #include <linux/btrfs.h>
 #include <linux/uaccess.h>
 #include <linux/iversion.h>
+#include <linux/sched/xacct.h>
 #include "ctree.h"
 #include "disk-io.h"
 #include "transaction.h"
@@ -84,6 +85,18 @@ struct btrfs_ioctl_send_args_32 {
 
 #define BTRFS_IOC_SEND_32 _IOW(BTRFS_IOCTL_MAGIC, 38, \
 			       struct btrfs_ioctl_send_args_32)
+
+struct btrfs_ioctl_compressed_pwrite_args_32 {
+	__u64 offset;		/* in */
+	__u32 orig_len;		/* in */
+	__u32 compressed_len;	/* in */
+	__u32 compress_type;	/* in */
+	__u32 reserved[9];
+	compat_uptr_t buf;	/* in */
+} __attribute__ ((__packed__));
+
+#define BTRFS_IOC_COMPRESSED_PWRITE_32 _IOW(BTRFS_IOCTL_MAGIC, 63, \
+				 struct btrfs_ioctl_compressed_pwrite_args_32)
 #endif
 
 static int btrfs_clone(struct inode *src, struct inode *inode,
@@ -5424,6 +5437,83 @@ static int _btrfs_ioctl_send(struct file *file, void __user *argp, bool compat)
 	return ret;
 }
 
+static int btrfs_ioctl_compressed_pwrite(struct file *file, void __user *argp,
+					 bool compat)
+{
+	struct btrfs_ioctl_compressed_pwrite_args args;
+	struct btrfs_compressed_write compressed;
+	struct iov_iter iter;
+	loff_t pos;
+	struct kiocb kiocb;
+	ssize_t ret;
+
+	if (!capable(CAP_SYS_ADMIN))
+		return -EPERM;
+
+	if (!(file->f_mode & FMODE_WRITE))
+		return -EBADF;
+
+	if (compat) {
+#if defined(CONFIG_64BIT) && defined(CONFIG_COMPAT)
+		struct btrfs_ioctl_compressed_pwrite_args_32 args32;
+
+		if (copy_from_user(&args32, argp, sizeof(args32)))
+			return -EFAULT;
+		args.offset = args32.offset;
+		args.buf = compat_ptr(args32.buf);
+		args.compressed_len = args32.compressed_len;
+		args.orig_len = args32.orig_len;
+		args.compress_type = args32.compress_type;
+		memcpy(args.reserved, args32.reserved, sizeof(args.reserved));
+#else
+		return -ENOTTY;
+#endif
+	} else {
+		if (copy_from_user(&args, argp, sizeof(args)))
+			return -EFAULT;
+	}
+
+	/* The compression type must be valid. */
+	if (args.compress_type == BTRFS_COMPRESS_NONE ||
+	    args.compress_type > BTRFS_COMPRESS_TYPES)
+		return -EINVAL;
+	/* Reserved fields must be zero. */
+	if (memchr_inv(args.reserved, 0, sizeof(args.reserved)))
+		return -EINVAL;
+
+	if (unlikely(!access_ok(args.buf, args.compressed_len)))
+		return -EFAULT;
+
+	pos = args.offset;
+	ret = rw_verify_area(WRITE, file, &pos, args.orig_len);
+	if (ret)
+		return ret;
+
+	init_sync_kiocb(&kiocb, file);
+	kiocb.ki_pos = pos;
+	/*
+	 * This iov_iter is a lie; we only construct it so that we can use
+	 * write_iter.
+	 */
+	iov_iter_init(&iter, WRITE, NULL, 0, args.orig_len);
+
+	compressed.buf = args.buf;
+	compressed.compressed_len = args.compressed_len;
+	compressed.orig_len = args.orig_len;
+	compressed.compress_type = args.compress_type;
+
+	file_start_write(file);
+	ret = btrfs_do_write_iter(&kiocb, &iter, &compressed);
+	if (ret > 0) {
+		ASSERT(ret == compressed.orig_len);
+		fsnotify_modify(file);
+		add_wchar(current, ret);
+	}
+	inc_syscw(current);
+	file_end_write(file);
+	return ret < 0 ? ret : 0;
+}
+
 long btrfs_ioctl(struct file *file, unsigned int
 		cmd, unsigned long arg)
 {
@@ -5570,6 +5660,12 @@ long btrfs_ioctl(struct file *file, unsigned int
 		return btrfs_ioctl_get_subvol_rootref(file, argp);
 	case BTRFS_IOC_INO_LOOKUP_USER:
 		return btrfs_ioctl_ino_lookup_user(file, argp);
+	case BTRFS_IOC_COMPRESSED_PWRITE:
+		return btrfs_ioctl_compressed_pwrite(file, argp, false);
+#if defined(CONFIG_64BIT) && defined(CONFIG_COMPAT)
+	case BTRFS_IOC_COMPRESSED_PWRITE_32:
+		return btrfs_ioctl_compressed_pwrite(file, argp, true);
+#endif
 	}
 
 	return -ENOTTY;
diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h
index 3ee0678c0a83..d0c803e3edae 100644
--- a/include/uapi/linux/btrfs.h
+++ b/include/uapi/linux/btrfs.h
@@ -822,6 +822,67 @@ struct btrfs_ioctl_get_subvol_rootref_args {
 		__u8 align[7];
 };
 
+enum btrfs_compression_type {
+	BTRFS_COMPRESS_NONE  = 0,
+	BTRFS_COMPRESS_ZLIB  = 1,
+	BTRFS_COMPRESS_LZO   = 2,
+	BTRFS_COMPRESS_ZSTD  = 3,
+	BTRFS_COMPRESS_TYPES = 3,
+};
+
+/*
+ * Write compressed data directly to the filesystem. CAP_SYS_ADMIN is required
+ * and the file descriptor must be open for writing.
+ */
+struct btrfs_ioctl_compressed_pwrite_args {
+	/*
+	 * Offset in file where to write. This must be aligned to the sector
+	 * size of the filesystem.
+	 */
+	__u64 offset;		/* in */
+	/*
+	 * Length of the decompressed data in the file, in bytes. This must be
+	 * aligned to the sector size of the filesystem unless the data ends at
+	 * or beyond the current end of file; this special case is to support
+	 * creating compressed files whose length is not aligned to the sector
+	 * size.
+	 *
+	 * If this length does not match the actual length of the decompressed
+	 * data, then reading may return an error.
+	 *
+	 * This must be less than 128k (BTRFS_MAX_UNCOMPRESSED), although that
+	 * limit may increase in the future.
+	 */
+	__u32 orig_len;		/* in */
+	/*
+	 * Length of compressed data (see buf below) in bytes. This does not
+	 * need to be aligned to a sector.
+	 *
+	 * This must be less than 128k (BTRFS_MAX_COMPRESSED), although that
+	 * limit may increase in the future.
+	 */
+	__u32 compressed_len;	/* in */
+	/*
+	 * The compression type (enum btrfs_compression_type). This must not be
+	 * BTRFS_COMPRESS_NONE.
+	 */
+	__u32 compress_type;	/* in */
+	/* Reserved for future extensions. Must be zero. */
+	__u32 reserved[9];
+	/*
+	 * The compressed data. The format is as follows:
+	 *
+	 * - zlib: The extent is a single zlib stream.
+	 * - lzo: The extent is compressed page by page with LZO1X and wrapped
+	 *   according to the format documented in fs/btrfs/lzo.c.
+	 * - zstd: The extent is a single zstd stream. The windowLog compression
+	 *   parameter must be no more than 17 (ZSTD_BTRFS_MAX_WINDOWLOG).
+	 *
+	 * If the compressed data is invalid, reading will return an error.
+	 */
+	void __user *buf;	/* in */
+} __attribute__ ((__packed__));
+
 /* Error codes as returned by the kernel */
 enum btrfs_err_code {
 	BTRFS_ERROR_DEV_RAID1_MIN_NOT_MET = 1,
@@ -946,5 +1007,7 @@ enum btrfs_err_code {
 				struct btrfs_ioctl_get_subvol_rootref_args)
 #define BTRFS_IOC_INO_LOOKUP_USER _IOWR(BTRFS_IOCTL_MAGIC, 62, \
 				struct btrfs_ioctl_ino_lookup_user_args)
+#define BTRFS_IOC_COMPRESSED_PWRITE _IOW(BTRFS_IOCTL_MAGIC, 63, \
+				 struct btrfs_ioctl_compressed_pwrite_args)
 
 #endif /* _UAPI_LINUX_BTRFS_H */
-- 
2.17.1
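
Aside for readers (not part of the patch): _IOW() encodes the size of the
argument struct in the ioctl number, and the packed argument struct ends in a
pointer, so 32-bit and 64-bit callers end up with different numbers for the
same command. That appears to be why BTRFS_IOC_COMPRESSED_PWRITE_32 exists.
A minimal userspace sketch of the two layouts follows. The model_* names are
made up for illustration, and the pointers are modeled as fixed-width
integers so the result does not depend on the host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the argument struct as a 64-bit caller sees it. */
struct model_args_64 {
	uint64_t offset;
	uint32_t orig_len;
	uint32_t compressed_len;
	uint32_t compress_type;
	uint32_t reserved[9];
	uint64_t buf;		/* stands in for a 64-bit void __user * */
} __attribute__ ((__packed__));

/* Hypothetical model of the same struct as a 32-bit caller sees it. */
struct model_args_32 {
	uint64_t offset;
	uint32_t orig_len;
	uint32_t compressed_len;
	uint32_t compress_type;
	uint32_t reserved[9];
	uint32_t buf;		/* stands in for compat_uptr_t */
} __attribute__ ((__packed__));

/* Everything before the pointer sits at the same offset in both layouts. */
_Static_assert(offsetof(struct model_args_64, buf) ==
	       offsetof(struct model_args_32, buf),
	       "only the trailing pointer differs");

/* Size of the argument struct that would be encoded in the ioctl number. */
size_t model_args_size(int compat)
{
	return compat ? sizeof(struct model_args_32)
		      : sizeof(struct model_args_64);
}
```

With these definitions, model_args_size(0) is 64 and model_args_size(1) is
60, so the two commands cannot share one number.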


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH 0/5] Btrfs: add interface for writing compressed extent directly
  2019-08-15 21:04 [RFC PATCH 0/5] Btrfs: add interface for writing compressed extent directly Omar Sandoval
                   ` (4 preceding siblings ...)
  2019-08-15 21:04 ` [RFC PATCH 5/5] Btrfs: add ioctl for directly writing compressed data Omar Sandoval
@ 2019-08-15 21:14 ` Omar Sandoval
  2019-08-27 18:31 ` David Sterba
  6 siblings, 0 replies; 23+ messages in thread
From: Omar Sandoval @ 2019-08-15 21:14 UTC (permalink / raw)
  To: linux-btrfs; +Cc: kernel-team

On Thu, Aug 15, 2019 at 02:04:01PM -0700, Omar Sandoval wrote:
> From: Omar Sandoval <osandov@fb.com>
> 
> Hello,
> 
> This series adds a way to write compressed data directly to Btrfs. The
> intended use case is making send/receive on compressed file systems more
> efficient; however, the interface is general enough that it could be
> used in other scenarios. Patch 5 is the main change; see that for more
> details.
> 
> Patches 1-3 are small fixes/cleanups that I ran into while implementing
> this; they should go in regardless of the remainder of the series. Patch
> 4 exports a required VFS interface.
> 
> An example program and test case are available at [1].
> 
> To preemptively address a few concerns:
> 
> - Writing arbitrary, untrusted data which we feed to the decompression
>   algorithm can be a security risk. For that reason, the ioctl is
>   restricted to CAP_SYS_ADMIN. The Btrfs code is properly hardened
>   against invalid compressed data/incorrect lengths, and the compression
>   libraries are mature, but better safe than sorry for now.
> - If the user is writing their own compressed data rather than just
>   blindly feeding in something from btrfs send, they need to know some
>   implementation details about the compression format. For zlib, there
>   are no special requirements. For zstd, a non-default compression
>   parameter must be used. For lzo, we have our own wrapper format since
>   lzo doesn't have a standard wrapper format. It feels a little wrong to
>   expose these details, but they are part of the on-disk format, so they
>   must be stable regardless.
> - The permissions checks duplicated from the VFS code are fairly
>   minimal.
> 
> This series is based on misc-next.
> 
> This is an RFC, so please, comment away.
> 
> Thanks!
> 
> 1: https://github.com/osandov/xfstests/tree/btrfs-compressed-write
> 
> Omar Sandoval (5):
>   Btrfs: use correct count in btrfs_file_write_iter()
>   Btrfs: treat RWF_{,D}SYNC writes as sync for CRCs
>   Btrfs: stop clearing EXTENT_DIRTY in inode I/O tree
>   fs: export rw_verify_area()
>   Btrfs: add ioctl for directly writing compressed data
> 
>  fs/btrfs/compression.c       |   6 +-
>  fs/btrfs/compression.h       |  14 +--
>  fs/btrfs/ctree.h             |  12 ++
>  fs/btrfs/extent_io.c         |   6 +-
>  fs/btrfs/file.c              |  22 ++--
>  fs/btrfs/free-space-cache.c  |   9 +-
>  fs/btrfs/inode.c             | 232 +++++++++++++++++++++++++++++++----
>  fs/btrfs/ioctl.c             | 101 ++++++++++++++-
>  fs/btrfs/tests/inode-tests.c |  12 +-
>  fs/internal.h                |   5 -
>  fs/read_write.c              |   1 +
>  include/linux/fs.h           |   1 +
>  include/uapi/linux/btrfs.h   |  63 ++++++++++
>  13 files changed, 415 insertions(+), 69 deletions(-)

I forgot to CC fsdevel. I'll do that for v2.

* Re: [PATCH 1/5] Btrfs: use correct count in btrfs_file_write_iter()
  2019-08-15 21:04 ` [PATCH 1/5] Btrfs: use correct count in btrfs_file_write_iter() Omar Sandoval
@ 2019-08-16 16:56   ` Josef Bacik
  0 siblings, 0 replies; 23+ messages in thread
From: Josef Bacik @ 2019-08-16 16:56 UTC (permalink / raw)
  To: Omar Sandoval; +Cc: linux-btrfs, kernel-team

On Thu, Aug 15, 2019 at 02:04:02PM -0700, Omar Sandoval wrote:
> From: Omar Sandoval <osandov@fb.com>
> 
> generic_write_checks() may modify iov_iter_count(), so we must get the
> count after the call, not before. Using the wrong one has a couple of
> consequences:
> 
> 1. We check a longer range in check_can_nocow() for nowait than we're
>    actually writing.
> 2. We create extra hole extent maps in btrfs_cont_expand(). As far as I
>    can tell, this is harmless, but I might be missing something.
> 
> These issues are pretty minor, but let's fix it before something more
> important trips on it.
> 
> Fixes: edf064e7c6fe ("btrfs: nowait aio support")
> Signed-off-by: Omar Sandoval <osandov@fb.com>

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Thanks,

Josef
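
Aside for readers: the ordering problem described in the changelog can be
modeled in a few lines of userspace C. This is only a sketch under stated
assumptions: model_write_checks() is a made-up stand-in for
generic_write_checks() that clamps the write against a limit, and it ignores
the error returns the real function has.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct iov_iter; only the count matters here. */
struct model_iter {
	size_t count;
};

/*
 * Stand-in for generic_write_checks(): shrink the write so that it does not
 * extend past limit, and return the number of bytes that will be written.
 */
size_t model_write_checks(struct model_iter *iter, size_t pos, size_t limit)
{
	assert(iter != NULL);

	if (pos >= limit)
		iter->count = 0;
	else if (iter->count > limit - pos)
		iter->count = limit - pos;
	return iter->count;
}
```

A caller that saves iter.count before calling model_write_checks() keeps the
unclamped value, which is the pattern the patch removes.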

* Re: [PATCH 2/5] Btrfs: treat RWF_{,D}SYNC writes as sync for CRCs
  2019-08-15 21:04 ` [PATCH 2/5] Btrfs: treat RWF_{,D}SYNC writes as sync for CRCs Omar Sandoval
@ 2019-08-16 16:59   ` Josef Bacik
  2019-08-27 12:35   ` David Sterba
  1 sibling, 0 replies; 23+ messages in thread
From: Josef Bacik @ 2019-08-16 16:59 UTC (permalink / raw)
  To: Omar Sandoval; +Cc: linux-btrfs, kernel-team

On Thu, Aug 15, 2019 at 02:04:03PM -0700, Omar Sandoval wrote:
> From: Omar Sandoval <osandov@fb.com>
> 
> In btrfs_file_write_iter(), we treat a write as synchronous if the
> file is marked as synchronous. However, with pwritev2(), a write with
> RWF_SYNC or RWF_DSYNC is also synchronous even if the file isn't by
> default. Make sure we bump the sync_writers counter in that case, too,
> so that we'll do the CRCs synchronously.
> 
> Signed-off-by: Omar Sandoval <osandov@fb.com>

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Thanks,

Josef

* Re: [PATCH 3/5] Btrfs: stop clearing EXTENT_DIRTY in inode I/O tree
  2019-08-15 21:04 ` [PATCH 3/5] Btrfs: stop clearing EXTENT_DIRTY in inode I/O tree Omar Sandoval
@ 2019-08-16 16:59   ` Josef Bacik
  0 siblings, 0 replies; 23+ messages in thread
From: Josef Bacik @ 2019-08-16 16:59 UTC (permalink / raw)
  To: Omar Sandoval; +Cc: linux-btrfs, kernel-team

On Thu, Aug 15, 2019 at 02:04:04PM -0700, Omar Sandoval wrote:
> From: Omar Sandoval <osandov@fb.com>
> 
> Since commit fee187d9d9dd ("Btrfs: do not set EXTENT_DIRTY along with
> EXTENT_DELALLOC"), we never set EXTENT_DIRTY in inode->io_tree, so we
> can simplify and stop trying to clear it.
> 
> Signed-off-by: Omar Sandoval <osandov@fb.com>

Ship this, dear lord,

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Thanks,

Josef

* Re: [RFC PATCH 4/5] fs: export rw_verify_area()
  2019-08-15 21:04 ` [RFC PATCH 4/5] fs: export rw_verify_area() Omar Sandoval
@ 2019-08-16 17:02   ` Josef Bacik
  0 siblings, 0 replies; 23+ messages in thread
From: Josef Bacik @ 2019-08-16 17:02 UTC (permalink / raw)
  To: Omar Sandoval; +Cc: linux-btrfs, kernel-team

On Thu, Aug 15, 2019 at 02:04:05PM -0700, Omar Sandoval wrote:
> From: Omar Sandoval <osandov@fb.com>
> 
> I'm adding a Btrfs ioctl to write compressed data, and rather than
> duplicating the checks in rw_verify_area(), let's just export it.
> 
> Signed-off-by: Omar Sandoval <osandov@fb.com>

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Thanks,

Josef

* Re: [RFC PATCH 5/5] Btrfs: add ioctl for directly writing compressed data
  2019-08-15 21:04 ` [RFC PATCH 5/5] Btrfs: add ioctl for directly writing compressed data Omar Sandoval
@ 2019-08-26 21:36   ` Josef Bacik
  2019-08-27  6:26     ` Nikolay Borisov
  2019-08-28 12:06   ` David Sterba
  1 sibling, 1 reply; 23+ messages in thread
From: Josef Bacik @ 2019-08-26 21:36 UTC (permalink / raw)
  To: Omar Sandoval; +Cc: linux-btrfs, kernel-team

On Thu, Aug 15, 2019 at 02:04:06PM -0700, Omar Sandoval wrote:
> From: Omar Sandoval <osandov@fb.com>
> 
> This adds an API for writing compressed data directly to the filesystem.
> The use case that I have in mind is send/receive: currently, when
> sending data from one compressed filesystem to another, the sending side
> decompresses the data and the receiving side recompresses it before
> writing it out. This is wasteful and can be avoided if we can just send
> and write compressed extents. The send part will be implemented in a
> separate series, as this ioctl can stand alone.
> 
> The interface is essentially pwrite(2) with some extra information:
> 
> - The input buffer contains the compressed data.
> - Both the compressed and decompressed sizes of the data are given.
> - The compression type (zlib, lzo, or zstd) is given.
> 
> A more detailed description of the interface, including restrictions and
> edge cases, is included in include/uapi/linux/btrfs.h.
> 
> The implementation is similar to direct I/O: we have to flush any
> ordered extents, invalidate the page cache, and do the io
> tree/delalloc/extent map/ordered extent dance. From there, we can reuse
> the compression code with a minor modification to distinguish the new
> ioctl from writeback.
>

I've looked at this a few times, the locking and space reservation stuff look
right.  What about encrypted send/receive?  Are we going to want to use this to
just blind copy encrypted data without having to decrypt/re-encrypt?  Should
this be taken into consideration for this interface?  I'll think more about it,
but I can't really see any better option than this.  Thanks,

Josef 
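
Aside for readers: the page-filling loop in the patch copies the user buffer
one page at a time and zero-fills whatever is left of each page. A userspace
model of that arithmetic is sketched below. The model_* names and the 4 KiB
page size are assumptions for illustration, and memcpy() stands in for
copy_from_user(), so the -EFAULT path is not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096UL	/* assumption: 4 KiB pages */

/* Number of pages needed to hold len bytes, like DIV_ROUND_UP(). */
size_t model_nr_pages(size_t len)
{
	return (len + MODEL_PAGE_SIZE - 1) / MODEL_PAGE_SIZE;
}

/*
 * Fill page number i (a buffer of MODEL_PAGE_SIZE bytes) from src, which
 * holds src_len bytes, and zero the rest of the page. Returns how many
 * bytes were taken from src.
 */
size_t model_fill_page(unsigned char *page, size_t i,
		       const unsigned char *src, size_t src_len)
{
	size_t offset = i * MODEL_PAGE_SIZE;
	size_t n = 0;

	assert(page != NULL);

	if (offset < src_len) {
		n = src_len - offset;
		if (n > MODEL_PAGE_SIZE)
			n = MODEL_PAGE_SIZE;
		memcpy(page, src + offset, n);
	}
	if (n < MODEL_PAGE_SIZE)
		memset(page + n, 0, MODEL_PAGE_SIZE - n);
	return n;
}
```

For a 5000-byte buffer this gives two pages, with the second page holding 904
bytes of data followed by zeroes. A page index past the end of the buffer
yields a fully zeroed page, which matches the else branch in the patch.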

* Re: [RFC PATCH 5/5] Btrfs: add ioctl for directly writing compressed data
  2019-08-26 21:36   ` Josef Bacik
@ 2019-08-27  6:26     ` Nikolay Borisov
  2019-08-27 11:57       ` Josef Bacik
  0 siblings, 1 reply; 23+ messages in thread
From: Nikolay Borisov @ 2019-08-27  6:26 UTC (permalink / raw)
  To: Josef Bacik, Omar Sandoval; +Cc: kernel-team, linux-btrfs



On 27.08.19 at 0:36, Josef Bacik wrote:
> On Thu, Aug 15, 2019 at 02:04:06PM -0700, Omar Sandoval wrote:
>> From: Omar Sandoval <osandov@fb.com>
>>
>> This adds an API for writing compressed data directly to the filesystem.
>> The use case that I have in mind is send/receive: currently, when
>> sending data from one compressed filesystem to another, the sending side
>> decompresses the data and the receiving side recompresses it before
>> writing it out. This is wasteful and can be avoided if we can just send
>> and write compressed extents. The send part will be implemented in a
>> separate series, as this ioctl can stand alone.
>>
>> The interface is essentially pwrite(2) with some extra information:
>>
>> - The input buffer contains the compressed data.
>> - Both the compressed and decompressed sizes of the data are given.
>> - The compression type (zlib, lzo, or zstd) is given.
>>
>> A more detailed description of the interface, including restrictions and
>> edge cases, is included in include/uapi/linux/btrfs.h.
>>
>> The implementation is similar to direct I/O: we have to flush any
>> ordered extents, invalidate the page cache, and do the io
>> tree/delalloc/extent map/ordered extent dance. From there, we can reuse
>> the compression code with a minor modification to distinguish the new
>> ioctl from writeback.
>>
> 
> I've looked at this a few times, the locking and space reservation stuff look
> right.  What about encrypted send/receive?  Are we going to want to use this to
> just blind copy encrypted data without having to decrypt/re-encrypt?  Should
> this be taken into consideration for this interface?  I'll think more about it,
> but I can't really see any better option than this.  Thanks,

The main problem is we don't have encryption implemented. And one of the
larger aspects of the encryption support is going to be how we are
storing the encryption keys. E.g. should they be part of the send
format? Or are we going to limit send/receive based on whether the
source/dest have transferred encryption keys out of line?

> 
> Josef 
> 

* Re: [RFC PATCH 5/5] Btrfs: add ioctl for directly writing compressed data
  2019-08-27  6:26     ` Nikolay Borisov
@ 2019-08-27 11:57       ` Josef Bacik
  2019-08-27 18:06         ` Omar Sandoval
  0 siblings, 1 reply; 23+ messages in thread
From: Josef Bacik @ 2019-08-27 11:57 UTC (permalink / raw)
  To: Nikolay Borisov; +Cc: Josef Bacik, Omar Sandoval, kernel-team, linux-btrfs

On Tue, Aug 27, 2019 at 09:26:21AM +0300, Nikolay Borisov wrote:
> 
> 
> On 27.08.19 at 0:36, Josef Bacik wrote:
> > On Thu, Aug 15, 2019 at 02:04:06PM -0700, Omar Sandoval wrote:
> >> From: Omar Sandoval <osandov@fb.com>
> >>
> >> This adds an API for writing compressed data directly to the filesystem.
> >> The use case that I have in mind is send/receive: currently, when
> >> sending data from one compressed filesystem to another, the sending side
> >> decompresses the data and the receiving side recompresses it before
> >> writing it out. This is wasteful and can be avoided if we can just send
> >> and write compressed extents. The send part will be implemented in a
> >> separate series, as this ioctl can stand alone.
> >>
> >> The interface is essentially pwrite(2) with some extra information:
> >>
> >> - The input buffer contains the compressed data.
> >> - Both the compressed and decompressed sizes of the data are given.
> >> - The compression type (zlib, lzo, or zstd) is given.
> >>
> >> A more detailed description of the interface, including restrictions and
> >> edge cases, is included in include/uapi/linux/btrfs.h.
> >>
> >> The implementation is similar to direct I/O: we have to flush any
> >> ordered extents, invalidate the page cache, and do the io
> >> tree/delalloc/extent map/ordered extent dance. From there, we can reuse
> >> the compression code with a minor modification to distinguish the new
> >> ioctl from writeback.
> >>
> > 
> > I've looked at this a few times, the locking and space reservation stuff look
> > > right.  What about encrypted send/receive?  Are we going to want to use this to
> > just blind copy encrypted data without having to decrypt/re-encrypt?  Should
> > this be taken into consideration for this interface?  I'll think more about it,
> > but I can't really see any better option than this.  Thanks,
> 
> The main problem is we don't have encryption implemented. And one of the
> larger aspects of the encryption support is going to be how we are
> storing the encryption keys. E.g. should they be part of the send
> format? Or are we going to limit send/receive based on whether the
> source/dest have transferred encryption keys out of line?
> 

Subvolume encryption will be coming soon, but I'm less worried about the
mechanics of how that will be used and more worried about making this interface
work for that eventual future.  I assume we'll want to be able to just blind
copy the encrypted data instead of decrypting into the send stream and then
re-encrypting on the other side.  Which means we'll have two uses for this
interface, and I want to make sure we're happy with it before it gets merged.
Thanks,

Josef

* Re: [PATCH 2/5] Btrfs: treat RWF_{,D}SYNC writes as sync for CRCs
  2019-08-15 21:04 ` [PATCH 2/5] Btrfs: treat RWF_{,D}SYNC writes as sync for CRCs Omar Sandoval
  2019-08-16 16:59   ` Josef Bacik
@ 2019-08-27 12:35   ` David Sterba
  2019-08-27 17:44     ` Omar Sandoval
  1 sibling, 1 reply; 23+ messages in thread
From: David Sterba @ 2019-08-27 12:35 UTC (permalink / raw)
  To: Omar Sandoval; +Cc: linux-btrfs, kernel-team

On Thu, Aug 15, 2019 at 02:04:03PM -0700, Omar Sandoval wrote:
> From: Omar Sandoval <osandov@fb.com>
> 
> In btrfs_file_write_iter(), we treat a write as synchronous if the
> file is marked as synchronous. However, with pwritev2(), a write with
> RWF_SYNC or RWF_DSYNC is also synchronous even if the file isn't by
> default. Make sure we bump the sync_writers counter in that case, too,
> so that we'll do the CRCs synchronously.
> 
> Signed-off-by: Omar Sandoval <osandov@fb.com>
> ---
>  fs/btrfs/file.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> index 4393b6b24e02..27223753da7b 100644
> --- a/fs/btrfs/file.c
> +++ b/fs/btrfs/file.c
> @@ -1882,7 +1882,7 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
>  	u64 start_pos;
>  	u64 end_pos;
>  	ssize_t num_written = 0;
> -	bool sync = (file->f_flags & O_DSYNC) || IS_SYNC(file->f_mapping->host);
> +	bool sync = iocb->ki_flags & IOCB_DSYNC;

I'd like to merge patches 1-3, but I have a hard time matching the
changelog to the change here. It's from one set of sync flags to
another, mentioning pwritev2 but that's a syscall and the function
itself does not use the sync flags at all. That's probably somewhere
deep in the vfs calls but that's what I'd appreciate stated explicitly
in the changelog as I was not able to find it out in a reasonable time.

* Re: [PATCH 2/5] Btrfs: treat RWF_{,D}SYNC writes as sync for CRCs
  2019-08-27 12:35   ` David Sterba
@ 2019-08-27 17:44     ` Omar Sandoval
  2019-08-27 18:16       ` David Sterba
  0 siblings, 1 reply; 23+ messages in thread
From: Omar Sandoval @ 2019-08-27 17:44 UTC (permalink / raw)
  To: David Sterba; +Cc: linux-btrfs, kernel-team

On Tue, Aug 27, 2019 at 02:35:13PM +0200, David Sterba wrote:
> On Thu, Aug 15, 2019 at 02:04:03PM -0700, Omar Sandoval wrote:
> > From: Omar Sandoval <osandov@fb.com>
> > 
> > In btrfs_file_write_iter(), we treat a write as synchronous if the
> > file is marked as synchronous. However, with pwritev2(), a write with
> > RWF_SYNC or RWF_DSYNC is also synchronous even if the file isn't by
> > default. Make sure we bump the sync_writers counter in that case, too,
> > so that we'll do the CRCs synchronously.
> > 
> > Signed-off-by: Omar Sandoval <osandov@fb.com>
> > ---
> >  fs/btrfs/file.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> > index 4393b6b24e02..27223753da7b 100644
> > --- a/fs/btrfs/file.c
> > +++ b/fs/btrfs/file.c
> > @@ -1882,7 +1882,7 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
> >  	u64 start_pos;
> >  	u64 end_pos;
> >  	ssize_t num_written = 0;
> > -	bool sync = (file->f_flags & O_DSYNC) || IS_SYNC(file->f_mapping->host);
> > +	bool sync = iocb->ki_flags & IOCB_DSYNC;
> 
> I'd like to merge patches 1-3, but I have a hard time matching the
> changelog to the change here. It's from one set of sync flags to
> another, mentioning pwritev2 but that's a syscall and the function
> itself does not use the sync flags at all. That's probably somewhere
> deep in the vfs calls but that's what I'd appreciate stated explicitly
> in the changelog as I was not able to find it out in a reasonable time.

You're right, there are a few layers here. How about this for the
changelog:


The VFS indicates a synchronous write to ->write_iter() via
iocb->ki_flags. The IOCB_{,D}SYNC flags may be set based on the file
(see iocb_flags()) or the RWF_* flags passed to a syscall like
pwritev2() (see kiocb_set_rw_flags()). However, in
btrfs_file_write_iter(), we're checking if a write is synchronous based
only on the file; we use this to decide when to bump the sync_writers
counter and thus do CRCs synchronously. Make sure we do this for all
synchronous writes as determined by the VFS.


Let me know if you want me to resend with the new changelog.
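
To make the layering concrete, here is a small userspace model of the two
checks. It is only a sketch: the MODEL_* flag values and the model_* helpers
are invented for illustration and merely mimic what iocb_flags() and
kiocb_set_rw_flags() are described as doing above. The IS_SYNC() half of the
old check is left out for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Made-up flag values for illustration; the kernel's real values differ. */
#define MODEL_O_DSYNC		0x1	/* file was opened with O_DSYNC */
#define MODEL_RWF_DSYNC		0x1	/* per-call flag, e.g. from pwritev2() */
#define MODEL_IOCB_DSYNC	0x1	/* what ->write_iter() sees in ki_flags */

struct model_file {
	unsigned int f_flags;
};

struct model_kiocb {
	const struct model_file *file;
	unsigned int ki_flags;
};

/* Models iocb_flags(): derive the initial ki_flags from the open file. */
void model_init_kiocb(struct model_kiocb *iocb, const struct model_file *file)
{
	assert(iocb != NULL && file != NULL);
	iocb->file = file;
	iocb->ki_flags = (file->f_flags & MODEL_O_DSYNC) ? MODEL_IOCB_DSYNC : 0;
}

/* Models kiocb_set_rw_flags(): fold the per-call RWF_* flags in. */
void model_set_rw_flags(struct model_kiocb *iocb, unsigned int rwf)
{
	if (rwf & MODEL_RWF_DSYNC)
		iocb->ki_flags |= MODEL_IOCB_DSYNC;
}

/* Old check: looks only at the file. */
bool model_sync_old(const struct model_kiocb *iocb)
{
	return (iocb->file->f_flags & MODEL_O_DSYNC) != 0;
}

/* New check: trusts what the VFS put in ki_flags. */
bool model_sync_new(const struct model_kiocb *iocb)
{
	return (iocb->ki_flags & MODEL_IOCB_DSYNC) != 0;
}
```

For a file opened without O_DSYNC and a write issued with the per-call flag,
model_sync_old() is false while model_sync_new() is true, which is the case
the patch is meant to cover.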

* Re: [RFC PATCH 5/5] Btrfs: add ioctl for directly writing compressed data
  2019-08-27 11:57       ` Josef Bacik
@ 2019-08-27 18:06         ` Omar Sandoval
  2019-08-27 18:22           ` Omar Sandoval
  0 siblings, 1 reply; 23+ messages in thread
From: Omar Sandoval @ 2019-08-27 18:06 UTC (permalink / raw)
  To: Josef Bacik; +Cc: Nikolay Borisov, kernel-team, linux-btrfs

On Tue, Aug 27, 2019 at 07:57:41AM -0400, Josef Bacik wrote:
> On Tue, Aug 27, 2019 at 09:26:21AM +0300, Nikolay Borisov wrote:
> > 
> > 
> > On 27.08.19 at 0:36, Josef Bacik wrote:
> > > On Thu, Aug 15, 2019 at 02:04:06PM -0700, Omar Sandoval wrote:
> > >> From: Omar Sandoval <osandov@fb.com>
> > >>
> > >> This adds an API for writing compressed data directly to the filesystem.
> > >> The use case that I have in mind is send/receive: currently, when
> > >> sending data from one compressed filesystem to another, the sending side
> > >> decompresses the data and the receiving side recompresses it before
> > >> writing it out. This is wasteful and can be avoided if we can just send
> > >> and write compressed extents. The send part will be implemented in a
> > >> separate series, as this ioctl can stand alone.
> > >>
> > >> The interface is essentially pwrite(2) with some extra information:
> > >>
> > >> - The input buffer contains the compressed data.
> > >> - Both the compressed and decompressed sizes of the data are given.
> > >> - The compression type (zlib, lzo, or zstd) is given.
> > >>
> > >> A more detailed description of the interface, including restrictions and
> > >> edge cases, is included in include/uapi/linux/btrfs.h.
> > >>
> > >> The implementation is similar to direct I/O: we have to flush any
> > >> ordered extents, invalidate the page cache, and do the io
> > >> tree/delalloc/extent map/ordered extent dance. From there, we can reuse
> > >> the compression code with a minor modification to distinguish the new
> > >> ioctl from writeback.
> > >>
> > > 
> > > I've looked at this a few times, the locking and space reservation stuff look
> > > > right.  What about encrypted send/receive?  Are we going to want to use this to
> > > just blind copy encrypted data without having to decrypt/re-encrypt?  Should
> > > this be taken into consideration for this interface?  I'll think more about it,
> > > but I can't really see any better option than this.  Thanks,
> > 
> > The main problem is we don't have encryption implemented. And one of the
> > larger aspects of the encryption support is going to be how we are
> > storing the encryption keys. E.g. should they be part of the send
> > format? Or are we going to limit send/receive based on whether the
> > source/dest have transferred encryption keys out of line?
> > 
> 
> Subvolume encryption will be coming soon, but I'm less worried about the
> mechanics of how that will be used and more worried about making this interface
> work for that eventual future.  I assume we'll want to be able to just blind
> copy the encrypted data instead of decrypting into the send stream and then
> re-encrypting on the other side.  Which means we'll have two uses for this
> interface, and I want to make sure we're happy with it before it gets merged.
> Thanks,
> 
> Josef

Right, I think the only way to do this would be to blindly send
encrypted data, and leave the key management to a higher layer.

Looking at the ioctl definition:

struct btrfs_ioctl_compressed_pwrite_args {
        __u64 offset;           /* in */
        __u32 orig_len;         /* in */
        __u32 compressed_len;   /* in */
        __u32 compress_type;    /* in */
        __u32 reserved[9];
        void __user *buf;       /* in */
} __attribute__ ((__packed__));

I think there are enough reserved fields in there for, e.g., encryption
type, any key management-related things we might need to stuff in, etc.
But the naming would be pretty bad if we extended it this way. Maybe
compressed write -> raw write, orig_len -> num_bytes, compressed_len ->
disk_num_bytes?

struct btrfs_ioctl_raw_pwrite_args {
        __u64 offset;           /* in */
        __u32 num_bytes;        /* in */
        __u32 disk_num_bytes;   /* in */
        __u32 compress_type;    /* in */
        __u32 reserved[9];
        void __user *buf;       /* in */
} __attribute__ ((__packed__));

Besides the naming, I don't think anything else would need to change for
now. And if we decide that we don't want encrypted send/receive, then
fine, this naming is still okay.
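
As a side note on the reserved fields: they can only be given a meaning
later (encryption type, key management, and so on) if current callers
leave them zeroed, and the usual way to guarantee that is to reject
non-zero values. The sketch below is a userspace mock-up of that
convention, not code from this patch. The struct mirrors the proposed
layout using stdint types in place of __u64/__u32 and a plain pointer in
place of the __user pointer; check_reserved() is a hypothetical helper
name.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace mirror of the proposed argument layout (assumption: same
 * field order and packing as in the mail above). */
struct raw_pwrite_args {
	uint64_t offset;
	uint32_t num_bytes;
	uint32_t disk_num_bytes;
	uint32_t compress_type;
	uint32_t reserved[9];
	void *buf;
} __attribute__((__packed__));

/* Hypothetical check: refuse any request that sets a reserved word, so
 * that those words can be assigned a meaning in the future. */
static int check_reserved(const struct raw_pwrite_args *args)
{
	size_t n = sizeof(args->reserved) / sizeof(args->reserved[0]);

	for (size_t i = 0; i < n; i++) {
		if (args->reserved[i] != 0)
			return -EINVAL;
	}
	return 0;
}
```

A caller that zero-initializes the whole struct passes the check; one
that leaves garbage in any reserved word gets -EINVAL.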

* Re: [PATCH 2/5] Btrfs: treat RWF_{,D}SYNC writes as sync for CRCs
  2019-08-27 17:44     ` Omar Sandoval
@ 2019-08-27 18:16       ` David Sterba
  0 siblings, 0 replies; 23+ messages in thread
From: David Sterba @ 2019-08-27 18:16 UTC
  To: Omar Sandoval; +Cc: linux-btrfs, kernel-team

On Tue, Aug 27, 2019 at 10:44:39AM -0700, Omar Sandoval wrote:
> On Tue, Aug 27, 2019 at 02:35:13PM +0200, David Sterba wrote:
> > On Thu, Aug 15, 2019 at 02:04:03PM -0700, Omar Sandoval wrote:
> > > From: Omar Sandoval <osandov@fb.com>
> > > 
> > > > In btrfs_file_write_iter(), we treat a write as synchronous if the
> > > file is marked as synchronous. However, with pwritev2(), a write with
> > > RWF_SYNC or RWF_DSYNC is also synchronous even if the file isn't by
> > > default. Make sure we bump the sync_writers counter in that case, too,
> > > so that we'll do the CRCs synchronously.
> > > 
> > > Signed-off-by: Omar Sandoval <osandov@fb.com>
> > > ---
> > >  fs/btrfs/file.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> > > index 4393b6b24e02..27223753da7b 100644
> > > --- a/fs/btrfs/file.c
> > > +++ b/fs/btrfs/file.c
> > > @@ -1882,7 +1882,7 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
> > >  	u64 start_pos;
> > >  	u64 end_pos;
> > >  	ssize_t num_written = 0;
> > > -	bool sync = (file->f_flags & O_DSYNC) || IS_SYNC(file->f_mapping->host);
> > > +	bool sync = iocb->ki_flags & IOCB_DSYNC;
> > 
> > I'd like to merge the patches 1-3, but have hard time matching the
> > changelog to the change here. It's from one set of sync flags to
> > another, mentioning pwritev2 but that's a syscall and the function
> > itself does not use the sync flags at all. That's probably somewhere
> > deep in the vfs calls but that's what I'd appreciate stated explicitly
> > in the changelog as I was not able to find it out in a reasonable time.
> 
> You're right, there are a few layers here. How about this for the
> changelog:
> 
> 
> The VFS indicates a synchronous write to ->write_iter() via
> iocb->ki_flags. The IOCB_{,D}SYNC flags may be set based on the file
> (see iocb_flags()) or the RWF_* flags passed to a syscall like
> pwritev2() (see kiocb_set_rw_flags()). However, in
> btrfs_file_write_iter(), we're checking if a write is synchronous based
> only on the file; we use this to decide when to bump the sync_writers
> counter and thus do CRCs synchronously. Make sure we do this for all
> synchronous writes as determined by the VFS.

That's great, thanks, no need to resend.
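
For reference, the per-write case from the reworded changelog can be
triggered from userspace roughly as follows. This is a minimal sketch,
assuming a glibc new enough to provide pwritev2() (2.26 or later);
write_dsync_once() is a made-up helper name. The file is opened without
O_DSYNC, so only the RWF_DSYNC flag makes this particular write
synchronous.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Write len bytes at offset and request data-sync semantics for this
 * one write only. Returns the number of bytes written, or -errno. */
static ssize_t write_dsync_once(int fd, const void *data, size_t len,
				off_t offset)
{
	struct iovec iov = {
		.iov_base = (void *)data,
		.iov_len = len,
	};
	ssize_t ret = pwritev2(fd, &iov, 1, offset, RWF_DSYNC);

	return ret < 0 ? -errno : ret;
}
```

On kernels or filesystems that do not accept the flag the helper returns
a negative errno such as -EOPNOTSUPP instead of the byte count.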

* Re: [RFC PATCH 5/5] Btrfs: add ioctl for directly writing compressed data
  2019-08-27 18:06         ` Omar Sandoval
@ 2019-08-27 18:22           ` Omar Sandoval
  2019-08-27 18:28             ` Josef Bacik
  0 siblings, 1 reply; 23+ messages in thread
From: Omar Sandoval @ 2019-08-27 18:22 UTC
  To: Josef Bacik; +Cc: Nikolay Borisov, kernel-team, linux-btrfs

On Tue, Aug 27, 2019 at 11:06:23AM -0700, Omar Sandoval wrote:
> struct btrfs_ioctl_raw_pwrite_args {
>         __u64 offset;           /* in */
>         __u32 num_bytes;        /* in */
>         __u32 disk_num_bytes;   /* in */
>         __u32 compress_type;    /* in */
>         __u32 reserved[9];
>         void __user *buf;       /* in */
> } __attribute__ ((__packed__));
> 
> Besides the naming, I don't think anything else would need to change for
> now. And if we decide that we don't want encrypted send/receive, then
> fine, this naming is still okay.

Oh, and looking at this again, compression and encryption are only u8 in the
extent item, and we have an extra u16 for "other_encoding", so it'd
probably be safe to make it:

struct btrfs_ioctl_raw_pwrite_args {
        __u64 offset;           /* in */
        __u32 num_bytes;        /* in */
        __u32 disk_num_bytes;   /* in */
        __u8 compression;       /* in */
        __u8 encryption;        /* in */
        __u16 other_encoding;   /* in */
        __u32 reserved[9];
        void __user *buf;       /* in */
} __attribute__ ((__packed__));
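
A quick way to convince oneself that this split is layout-neutral: the
three new fields occupy exactly the four bytes that compress_type used
to, so reserved[] and buf stay at the same offsets. The following is a
userspace sketch with stdint types standing in for the __u* types; the
struct names raw_args_u32 and raw_args_split and the helper
encoding_width() are invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout with a single 32-bit compress_type field. */
struct raw_args_u32 {
	uint64_t offset;
	uint32_t num_bytes;
	uint32_t disk_num_bytes;
	uint32_t compress_type;
	uint32_t reserved[9];
	void *buf;
} __attribute__((__packed__));

/* Layout with the field split into u8 + u8 + u16. */
struct raw_args_split {
	uint64_t offset;
	uint32_t num_bytes;
	uint32_t disk_num_bytes;
	uint8_t compression;
	uint8_t encryption;
	uint16_t other_encoding;
	uint32_t reserved[9];
	void *buf;
} __attribute__((__packed__));

/* The split must not move anything that follows it. */
_Static_assert(offsetof(struct raw_args_u32, reserved) ==
	       offsetof(struct raw_args_split, reserved),
	       "reserved[] moved");
_Static_assert(offsetof(struct raw_args_u32, buf) ==
	       offsetof(struct raw_args_split, buf), "buf moved");
_Static_assert(sizeof(struct raw_args_u32) ==
	       sizeof(struct raw_args_split), "size changed");

/* Bytes taken by the three encoding fields together. */
static size_t encoding_width(void)
{
	return offsetof(struct raw_args_split, reserved) -
	       offsetof(struct raw_args_split, compression);
}
```

If any of the static assertions failed, the file would not compile, so a
successful build is itself the check.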

* Re: [RFC PATCH 5/5] Btrfs: add ioctl for directly writing compressed data
  2019-08-27 18:22           ` Omar Sandoval
@ 2019-08-27 18:28             ` Josef Bacik
  0 siblings, 0 replies; 23+ messages in thread
From: Josef Bacik @ 2019-08-27 18:28 UTC
  To: Omar Sandoval; +Cc: Josef Bacik, Nikolay Borisov, kernel-team, linux-btrfs

On Tue, Aug 27, 2019 at 11:22:42AM -0700, Omar Sandoval wrote:
> Oh, and looking at this again, compression and encryption are only u8 in the
> extent item, and we have an extra u16 for "other_encoding", so it'd
> probably be safe to make it:
> 
> struct btrfs_ioctl_raw_pwrite_args {
>         __u64 offset;           /* in */
>         __u32 num_bytes;        /* in */
>         __u32 disk_num_bytes;   /* in */
>         __u8 compression;       /* in */
>         __u8 encryption;        /* in */
>         __u16 other_encoding;   /* in */
>         __u32 reserved[9];
>         void __user *buf;       /* in */
> } __attribute__ ((__packed__));

I like this, then just adjust the patches to utilize the generic naming
convention instead of "compression" and I think it's good to go.  Thanks,

Josef

* Re: [RFC PATCH 0/5] Btrfs: add interface for writing compressed extent directly
  2019-08-15 21:04 [RFC PATCH 0/5] Btrfs: add interface for writing compressed extent directly Omar Sandoval
                   ` (5 preceding siblings ...)
  2019-08-15 21:14 ` [RFC PATCH 0/5] Btrfs: add interface for writing compressed extent directly Omar Sandoval
@ 2019-08-27 18:31 ` David Sterba
  6 siblings, 0 replies; 23+ messages in thread
From: David Sterba @ 2019-08-27 18:31 UTC
  To: Omar Sandoval; +Cc: linux-btrfs, kernel-team

On Thu, Aug 15, 2019 at 02:04:01PM -0700, Omar Sandoval wrote:
> From: Omar Sandoval <osandov@fb.com>
> 
> Hello,
> 
> This series adds a way to write compressed data directly to Btrfs. The
> intended use case is making send/receive on compressed file systems more
> efficient; however, the interface is general enough that it could be
> used in other scenarios. Patch 5 is the main change; see that for more
> details.
> 
> Patches 1-3 are small fixes/cleanups that I ran into while implementing
> this; they should go in regardless of the remainder of the series.

1-3 added to misc-next, thanks. I haven't looked at the rest yet.

* Re: [RFC PATCH 5/5] Btrfs: add ioctl for directly writing compressed data
  2019-08-15 21:04 ` [RFC PATCH 5/5] Btrfs: add ioctl for directly writing compressed data Omar Sandoval
  2019-08-26 21:36   ` Josef Bacik
@ 2019-08-28 12:06   ` David Sterba
  2019-09-03 17:14     ` Omar Sandoval
  1 sibling, 1 reply; 23+ messages in thread
From: David Sterba @ 2019-08-28 12:06 UTC
  To: Omar Sandoval; +Cc: linux-btrfs, kernel-team

On Thu, Aug 15, 2019 at 02:04:06PM -0700, Omar Sandoval wrote:
>  #define BTRFS_IOC_SEND_32 _IOW(BTRFS_IOCTL_MAGIC, 38, \
>  			       struct btrfs_ioctl_send_args_32)
> +
> +struct btrfs_ioctl_compressed_pwrite_args_32 {
> +	__u64 offset;		/* in */
> +	__u32 compressed_len;	/* in */
> +	__u32 orig_len;		/* in */
> +	__u32 compress_type;	/* in */
> +	__u32 reserved[9];
> +	compat_uptr_t buf;	/* in */
> +} __attribute__ ((__packed__));
> +
> +#define BTRFS_IOC_COMPRESSED_PWRITE_32 _IOW(BTRFS_IOCTL_MAGIC, 63, \
> +				 struct btrfs_ioctl_compressed_pwrite_args_32)

Note that the _32 is a workaround for a mistake in the send ioctl
definitions that slipped through. Any pointer in the structure changes
the structure size, and with it the ioctl number, between 32-bit and
64-bit.

But as the raw data ioctl is new, there's no point in copying the
mistake. The alignment and width can be forced, e.g. like

> +	void __user *buf;	/* in */

	union {
		void __user *buf;
		__u64 __buf_alignment;
	};

This allows using buf as a buffer without casts to an intermediate
type.
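
To illustrate only the size and ioctl-number part of this: the number
built by _IOW() encodes sizeof() of the argument struct, so a bare
pointer yields two different numbers for 32-bit and 64-bit callers,
while the union pads both to the same size. The sketch below models the
two pointer widths with fixed-size integers so it gives the same answer
on any host. DEMO_MAGIC is the value of BTRFS_IOCTL_MAGIC (0x94) and 63
is the number used in the patch; the struct names and nr_for_size() are
invented for the example. It does not show how the kernel interprets
the pointer bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ioctl.h>

#define DEMO_MAGIC 0x94	/* value of BTRFS_IOCTL_MAGIC */
#define DEMO_NR    63	/* ioctl number used in the patch */

#define DEMO_COMMON_FIELDS	\
	uint64_t offset;	\
	uint32_t compressed_len;\
	uint32_t orig_len;	\
	uint32_t compress_type;	\
	uint32_t reserved[9];

/* Bare pointer as seen by a 32-bit caller (4-byte pointer). */
struct args_ptr32 {
	DEMO_COMMON_FIELDS
	uint32_t buf;
} __attribute__((__packed__));

/* Bare pointer as seen by a 64-bit caller (8-byte pointer). */
struct args_ptr64 {
	DEMO_COMMON_FIELDS
	uint64_t buf;
} __attribute__((__packed__));

/* Pointer wrapped in a union with a 64-bit member: always 8 bytes. */
struct args_union {
	DEMO_COMMON_FIELDS
	union {
		void *buf;
		uint64_t __buf_alignment;
	};
} __attribute__((__packed__));

/* Same encoding _IOW() would produce for a struct of the given size. */
static unsigned long nr_for_size(size_t size)
{
	return _IOC(_IOC_WRITE, DEMO_MAGIC, DEMO_NR, size);
}
```

Comparing nr_for_size(sizeof(struct args_ptr32)) with
nr_for_size(sizeof(struct args_ptr64)) shows the mismatch that makes the
_32 variant necessary, whereas struct args_union has one size for both
ABIs.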

* Re: [RFC PATCH 5/5] Btrfs: add ioctl for directly writing compressed data
  2019-08-28 12:06   ` David Sterba
@ 2019-09-03 17:14     ` Omar Sandoval
  0 siblings, 0 replies; 23+ messages in thread
From: Omar Sandoval @ 2019-09-03 17:14 UTC
  To: dsterba, linux-btrfs, kernel-team

On Wed, Aug 28, 2019 at 02:06:50PM +0200, David Sterba wrote:
> On Thu, Aug 15, 2019 at 02:04:06PM -0700, Omar Sandoval wrote:
> >  #define BTRFS_IOC_SEND_32 _IOW(BTRFS_IOCTL_MAGIC, 38, \
> >  			       struct btrfs_ioctl_send_args_32)
> > +
> > +struct btrfs_ioctl_compressed_pwrite_args_32 {
> > +	__u64 offset;		/* in */
> > +	__u32 compressed_len;	/* in */
> > +	__u32 orig_len;		/* in */
> > +	__u32 compress_type;	/* in */
> > +	__u32 reserved[9];
> > +	compat_uptr_t buf;	/* in */
> > +} __attribute__ ((__packed__));
> > +
> > +#define BTRFS_IOC_COMPRESSED_PWRITE_32 _IOW(BTRFS_IOCTL_MAGIC, 63, \
> > +				 struct btrfs_ioctl_compressed_pwrite_args_32)
> 
> Note that the _32 is a workaround for a mistake in the send ioctl
> definitions that slipped through. Any pointer in the structure changes
> the structure size, and with it the ioctl number, between 32-bit and
> 64-bit.
> 
> But as the raw data ioctl is new, there's no point in copying the
> mistake. The alignment and width can be forced, e.g. like
> 
> > +	void __user *buf;	/* in */
> 
> 	union {
> 		void __user *buf;
> 		__u64 __buf_alignment;
> 	};
> 
> This allows using buf as a buffer without casts to an intermediate
> type.

I don't think this works on big-endian architectures. Let's say a 32-bit
application does:

struct btrfs_ioctl_compressed_pwrite_args args = {
	.buf = (void *)0x12345678,
};

The pointer will be in the first 4 bytes of the 8-byte union:

0    1    2    3    4    5    6    7
0x12 0x34 0x56 0x78 0x00 0x00 0x00 0x00

But, the 64-bit kernel will read buf as 0x1234567800000000. Let me know
if I messed up my analysis, but I think we need the compat stuff.
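
The byte-level argument above can be replayed on any machine by doing
the stores and loads with an explicit byte order instead of relying on
the host's. This is a sketch under the assumption that the application
zeroes the struct before setting buf; store_ptr32() and load_ptr64() are
illustrative names, not kernel helpers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* What a 32-bit process does: write its 4-byte pointer into the first
 * 4 bytes of the zeroed 8-byte union, in the given byte order. */
static void store_ptr32(uint8_t slot[8], uint32_t ptr, int big_endian)
{
	memset(slot, 0, 8);
	for (int i = 0; i < 4; i++) {
		int shift = big_endian ? 8 * (3 - i) : 8 * i;

		slot[i] = (uint8_t)(ptr >> shift);
	}
}

/* What a 64-bit kernel does: read all 8 bytes as one 64-bit pointer,
 * in the same byte order. */
static uint64_t load_ptr64(const uint8_t slot[8], int big_endian)
{
	uint64_t value = 0;

	for (int i = 0; i < 8; i++) {
		int shift = big_endian ? 8 * (7 - i) : 8 * i;

		value |= (uint64_t)slot[i] << shift;
	}
	return value;
}
```

With little-endian order the round trip gives back 0x12345678; with
big-endian order the same bytes read as 0x1234567800000000, which is the
failure described above.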

Thread overview: 23+ messages
2019-08-15 21:04 [RFC PATCH 0/5] Btrfs: add interface for writing compressed extent directly Omar Sandoval
2019-08-15 21:04 ` [PATCH 1/5] Btrfs: use correct count in btrfs_file_write_iter() Omar Sandoval
2019-08-16 16:56   ` Josef Bacik
2019-08-15 21:04 ` [PATCH 2/5] Btrfs: treat RWF_{,D}SYNC writes as sync for CRCs Omar Sandoval
2019-08-16 16:59   ` Josef Bacik
2019-08-27 12:35   ` David Sterba
2019-08-27 17:44     ` Omar Sandoval
2019-08-27 18:16       ` David Sterba
2019-08-15 21:04 ` [PATCH 3/5] Btrfs: stop clearing EXTENT_DIRTY in inode I/O tree Omar Sandoval
2019-08-16 16:59   ` Josef Bacik
2019-08-15 21:04 ` [RFC PATCH 4/5] fs: export rw_verify_area() Omar Sandoval
2019-08-16 17:02   ` Josef Bacik
2019-08-15 21:04 ` [RFC PATCH 5/5] Btrfs: add ioctl for directly writing compressed data Omar Sandoval
2019-08-26 21:36   ` Josef Bacik
2019-08-27  6:26     ` Nikolay Borisov
2019-08-27 11:57       ` Josef Bacik
2019-08-27 18:06         ` Omar Sandoval
2019-08-27 18:22           ` Omar Sandoval
2019-08-27 18:28             ` Josef Bacik
2019-08-28 12:06   ` David Sterba
2019-09-03 17:14     ` Omar Sandoval
2019-08-15 21:14 ` [RFC PATCH 0/5] Btrfs: add interface for writing compressed extent directly Omar Sandoval
2019-08-27 18:31 ` David Sterba
