* [PATCH v14 0/7] btrfs: add send/receive support for reading/writing compressed data
@ 2022-03-17 17:25 Omar Sandoval
  2022-03-17 17:25 ` [PATCH v14 1/7] btrfs: send: remove unused send_ctx::{total,cmd}_send_size Omar Sandoval
                   ` (17 more replies)
  0 siblings, 18 replies; 29+ messages in thread
From: Omar Sandoval @ 2022-03-17 17:25 UTC
  To: linux-btrfs; +Cc: kernel-team

From: Omar Sandoval <osandov@fb.com>

This series adds support for sending compressed data via Btrfs send and
btrfs-progs support for sending/receiving compressed data and writing it
with BTRFS_IOC_ENCODED_WRITE, which was previously merged into
misc-next. See the previous posting for more details and benchmarks [1].

Patches 1 and 2 are cleanups for Btrfs send. Patches 3-5 prepare some
protocol changes for send stream v2. Patch 6 implements compressed send.
Patch 7 enables send stream v2 and compressed send in the send ioctl
when requested.

Changes since v13 [2]:

- Rebased on latest misc-next branch.
- Dropped ioctl patches which are already in misc-next.

1: https://lore.kernel.org/linux-btrfs/cover.1615922753.git.osandov@fb.com/
2: https://lore.kernel.org/linux-btrfs/cover.1644519257.git.osandov@fb.com/

Omar Sandoval (7):
  btrfs: send: remove unused send_ctx::{total,cmd}_send_size
  btrfs: send: explicitly number commands and attributes
  btrfs: add send stream v2 definitions
  btrfs: send: write larger chunks when using stream v2
  btrfs: send: allocate send buffer with alloc_page() and vmap() for v2
  btrfs: send: send compressed extents with encoded writes
  btrfs: send: enable support for stream v2 and compressed writes

 fs/btrfs/ctree.h           |   6 +
 fs/btrfs/inode.c           |  13 +-
 fs/btrfs/send.c            | 324 +++++++++++++++++++++++++++++++++----
 fs/btrfs/send.h            | 142 +++++++++-------
 include/uapi/linux/btrfs.h |  10 +-
 5 files changed, 395 insertions(+), 100 deletions(-)

The btrfs-progs patches were written by Boris Burkov with some updates
from me. Patches 1-4 are preparation. Patch 5 implements encoded writes.
Patch 6 implements the fallback to decompressing. Patches 7 and 8
implement the other commands. Patch 9 adds the new `btrfs send` options.
Patch 10 adds a test case.

Changes since v13:

- Rebased on latest devel branch.
- Updated the btrfs_ioctl_encoded_io_args definition to the version that
  was merged into misc-next.

Boris Burkov (8):
  btrfs-progs: receive: support v2 send stream larger tlv_len
  btrfs-progs: receive: dynamically allocate sctx->read_buf
  btrfs-progs: receive: support v2 send stream DATA tlv format
  btrfs-progs: receive: process encoded_write commands
  btrfs-progs: receive: encoded_write fallback to explicit decode and
    write
  btrfs-progs: receive: process fallocate commands
  btrfs-progs: receive: process setflags ioctl commands
  btrfs-progs: receive: add tests for basic encoded_write send/receive

Omar Sandoval (2):
  btrfs-progs: receive: add send stream v2 cmds and attrs to send.h
  btrfs-progs: send: stream v2 ioctl flags

 Documentation/btrfs-receive.rst               |   5 +
 Documentation/btrfs-send.rst                  |  22 ++
 cmds/receive-dump.c                           |  31 +-
 cmds/receive.c                                | 347 +++++++++++++++++-
 cmds/send.c                                   | 100 ++++-
 common/send-stream.c                          | 165 +++++++--
 common/send-stream.h                          |   7 +
 ioctl.h                                       | 151 +++++++-
 kernel-shared/send.h                          | 146 +++++---
 libbtrfs/send-stream.c                        |   2 +-
 .../052-receive-write-encoded/test.sh         | 114 ++++++
 11 files changed, 993 insertions(+), 97 deletions(-)
 create mode 100755 tests/misc-tests/052-receive-write-encoded/test.sh

-- 
2.35.1



* [PATCH v14 1/7] btrfs: send: remove unused send_ctx::{total,cmd}_send_size
  2022-03-17 17:25 [PATCH v14 0/7] btrfs: add send/receive support for reading/writing compressed data Omar Sandoval
@ 2022-03-17 17:25 ` Omar Sandoval
  2022-03-17 17:25 ` [PATCH v14 2/7] btrfs: send: explicitly number commands and attributes Omar Sandoval
                   ` (16 subsequent siblings)
  17 siblings, 0 replies; 29+ messages in thread
From: Omar Sandoval @ 2022-03-17 17:25 UTC
  To: linux-btrfs; +Cc: kernel-team

From: Omar Sandoval <osandov@fb.com>

We collect these statistics but have never exposed them in any way. I
also didn't find any patches that ever attempted to make use of them.

Signed-off-by: Omar Sandoval <osandov@fb.com>
---
 fs/btrfs/send.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index cf86f1eafcb7..6d36dee1505f 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -82,8 +82,6 @@ struct send_ctx {
 	char *send_buf;
 	u32 send_size;
 	u32 send_max_size;
-	u64 total_send_size;
-	u64 cmd_send_size[BTRFS_SEND_C_MAX + 1];
 	u64 flags;	/* 'flags' member of btrfs_ioctl_send_args is u64 */
 	/* Protocol version compatibility requested */
 	u32 proto;
@@ -727,8 +725,6 @@ static int send_cmd(struct send_ctx *sctx)
 	ret = write_buf(sctx->send_filp, sctx->send_buf, sctx->send_size,
 					&sctx->send_off);
 
-	sctx->total_send_size += sctx->send_size;
-	sctx->cmd_send_size[get_unaligned_le16(&hdr->cmd)] += sctx->send_size;
 	sctx->send_size = 0;
 
 	return ret;
-- 
2.35.1



* [PATCH v14 2/7] btrfs: send: explicitly number commands and attributes
  2022-03-17 17:25 [PATCH v14 0/7] btrfs: add send/receive support for reading/writing compressed data Omar Sandoval
  2022-03-17 17:25 ` [PATCH v14 1/7] btrfs: send: remove unused send_ctx::{total,cmd}_send_size Omar Sandoval
@ 2022-03-17 17:25 ` Omar Sandoval
  2022-03-24 17:52   ` Sweet Tea Dorminy
  2022-03-17 17:25 ` [PATCH v14 3/7] btrfs: add send stream v2 definitions Omar Sandoval
                   ` (15 subsequent siblings)
  17 siblings, 1 reply; 29+ messages in thread
From: Omar Sandoval @ 2022-03-17 17:25 UTC
  To: linux-btrfs; +Cc: kernel-team

From: Omar Sandoval <osandov@fb.com>

Commit e77fbf990316 ("btrfs: send: prepare for v2 protocol") added
__BTRFS_SEND_C_MAX_V* enumerators equal to the maximum command number for
the version plus 1, but as written this creates gaps in the number space.
The maximum command number is currently 22, and __BTRFS_SEND_C_MAX_V1 is
accordingly 23. But then __BTRFS_SEND_C_MAX_V2 is 24, suggesting that v2
has a command numbered 23, and __BTRFS_SEND_C_MAX is 25, suggesting that
23 and 24 are valid commands.
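
The gap can be reproduced in miniature. The sketch below is not the
kernel header: the OLD_* and NEW_* names are hypothetical, and only three
commands are modeled instead of the real 22. It shows how implicitly
numbered sentinels make a v2 check accept a command number that no
version defines, and how explicit numbering with a <= comparison avoids
that.

```c
#include <assert.h>

/* Old style: sentinels take the next implicit value. */
enum old_cmd {
	OLD_C_UNSPEC,		/* 0 */
	OLD_C_END,		/* 1 */
	OLD_C_UPDATE_EXTENT,	/* 2: last real v1 command */
	OLD_C_SENTINEL_V1,	/* 3: max v1 command + 1 */
	OLD_C_SENTINEL_V2,	/* 4: implies that v2 defines command 3 */
	OLD_C_SENTINEL,		/* 5 */
};

/* New style: every value is written out, MAX equals the last command. */
enum new_cmd {
	NEW_C_UNSPEC = 0,
	NEW_C_END = 1,
	NEW_C_UPDATE_EXTENT = 2,
	NEW_C_MAX_V1 = 2,
	NEW_C_MAX_V2 = 2,	/* no v2 commands yet, so same as v1 */
	NEW_C_MAX = 2,
};

/* Sanity check of the implicit values described in the comments above. */
static void check_sentinels(void)
{
	assert(OLD_C_SENTINEL_V1 == 3);
	assert(OLD_C_SENTINEL_V2 == 4);
	assert(OLD_C_SENTINEL == 5);
}

/* Version check against the old sentinels (exclusive upper bound). */
static int old_cmd_ok(int proto, int cmd)
{
	switch (proto) {
	case 1:	 return cmd < OLD_C_SENTINEL_V1;
	case 2:	 return cmd < OLD_C_SENTINEL_V2;
	default: return 0;
	}
}

/* Version check against the explicit maxima (inclusive upper bound). */
static int new_cmd_ok(int proto, int cmd)
{
	switch (proto) {
	case 1:	 return cmd <= NEW_C_MAX_V1;
	case 2:	 return cmd <= NEW_C_MAX_V2;
	default: return 0;
	}
}
```

With these definitions, old_cmd_ok(2, 3) accepts command 3 even though
no such command exists, while new_cmd_ok(2, 3) rejects it.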

Instead, let's explicitly number all of the commands, attributes, and
sentinel MAX constants.

Signed-off-by: Omar Sandoval <osandov@fb.com>
---
 fs/btrfs/send.c |   4 +-
 fs/btrfs/send.h | 106 ++++++++++++++++++++++++------------------------
 2 files changed, 54 insertions(+), 56 deletions(-)

diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index 6d36dee1505f..9363f625fa17 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -326,8 +326,8 @@ __maybe_unused
 static bool proto_cmd_ok(const struct send_ctx *sctx, int cmd)
 {
 	switch (sctx->proto) {
-	case 1:	 return cmd < __BTRFS_SEND_C_MAX_V1;
-	case 2:	 return cmd < __BTRFS_SEND_C_MAX_V2;
+	case 1:	 return cmd <= BTRFS_SEND_C_MAX_V1;
+	case 2:	 return cmd <= BTRFS_SEND_C_MAX_V2;
 	default: return false;
 	}
 }
diff --git a/fs/btrfs/send.h b/fs/btrfs/send.h
index 08602fdd600a..67721e0281ba 100644
--- a/fs/btrfs/send.h
+++ b/fs/btrfs/send.h
@@ -46,84 +46,82 @@ struct btrfs_tlv_header {
 
 /* commands */
 enum btrfs_send_cmd {
-	BTRFS_SEND_C_UNSPEC,
+	BTRFS_SEND_C_UNSPEC = 0,
 
 	/* Version 1 */
-	BTRFS_SEND_C_SUBVOL,
-	BTRFS_SEND_C_SNAPSHOT,
+	BTRFS_SEND_C_SUBVOL = 1,
+	BTRFS_SEND_C_SNAPSHOT = 2,
 
-	BTRFS_SEND_C_MKFILE,
-	BTRFS_SEND_C_MKDIR,
-	BTRFS_SEND_C_MKNOD,
-	BTRFS_SEND_C_MKFIFO,
-	BTRFS_SEND_C_MKSOCK,
-	BTRFS_SEND_C_SYMLINK,
+	BTRFS_SEND_C_MKFILE = 3,
+	BTRFS_SEND_C_MKDIR = 4,
+	BTRFS_SEND_C_MKNOD = 5,
+	BTRFS_SEND_C_MKFIFO = 6,
+	BTRFS_SEND_C_MKSOCK = 7,
+	BTRFS_SEND_C_SYMLINK = 8,
 
-	BTRFS_SEND_C_RENAME,
-	BTRFS_SEND_C_LINK,
-	BTRFS_SEND_C_UNLINK,
-	BTRFS_SEND_C_RMDIR,
+	BTRFS_SEND_C_RENAME = 9,
+	BTRFS_SEND_C_LINK = 10,
+	BTRFS_SEND_C_UNLINK = 11,
+	BTRFS_SEND_C_RMDIR = 12,
 
-	BTRFS_SEND_C_SET_XATTR,
-	BTRFS_SEND_C_REMOVE_XATTR,
+	BTRFS_SEND_C_SET_XATTR = 13,
+	BTRFS_SEND_C_REMOVE_XATTR = 14,
 
-	BTRFS_SEND_C_WRITE,
-	BTRFS_SEND_C_CLONE,
+	BTRFS_SEND_C_WRITE = 15,
+	BTRFS_SEND_C_CLONE = 16,
 
-	BTRFS_SEND_C_TRUNCATE,
-	BTRFS_SEND_C_CHMOD,
-	BTRFS_SEND_C_CHOWN,
-	BTRFS_SEND_C_UTIMES,
+	BTRFS_SEND_C_TRUNCATE = 17,
+	BTRFS_SEND_C_CHMOD = 18,
+	BTRFS_SEND_C_CHOWN = 19,
+	BTRFS_SEND_C_UTIMES = 20,
 
-	BTRFS_SEND_C_END,
-	BTRFS_SEND_C_UPDATE_EXTENT,
-	__BTRFS_SEND_C_MAX_V1,
+	BTRFS_SEND_C_END = 21,
+	BTRFS_SEND_C_UPDATE_EXTENT = 22,
+	BTRFS_SEND_C_MAX_V1 = 22,
 
 	/* Version 2 */
-	__BTRFS_SEND_C_MAX_V2,
+	BTRFS_SEND_C_MAX_V2 = 22,
 
 	/* End */
-	__BTRFS_SEND_C_MAX,
+	BTRFS_SEND_C_MAX = 22,
 };
-#define BTRFS_SEND_C_MAX (__BTRFS_SEND_C_MAX - 1)
 
 /* attributes in send stream */
 enum {
-	BTRFS_SEND_A_UNSPEC,
+	BTRFS_SEND_A_UNSPEC = 0,
 
-	BTRFS_SEND_A_UUID,
-	BTRFS_SEND_A_CTRANSID,
+	BTRFS_SEND_A_UUID = 1,
+	BTRFS_SEND_A_CTRANSID = 2,
 
-	BTRFS_SEND_A_INO,
-	BTRFS_SEND_A_SIZE,
-	BTRFS_SEND_A_MODE,
-	BTRFS_SEND_A_UID,
-	BTRFS_SEND_A_GID,
-	BTRFS_SEND_A_RDEV,
-	BTRFS_SEND_A_CTIME,
-	BTRFS_SEND_A_MTIME,
-	BTRFS_SEND_A_ATIME,
-	BTRFS_SEND_A_OTIME,
+	BTRFS_SEND_A_INO = 3,
+	BTRFS_SEND_A_SIZE = 4,
+	BTRFS_SEND_A_MODE = 5,
+	BTRFS_SEND_A_UID = 6,
+	BTRFS_SEND_A_GID = 7,
+	BTRFS_SEND_A_RDEV = 8,
+	BTRFS_SEND_A_CTIME = 9,
+	BTRFS_SEND_A_MTIME = 10,
+	BTRFS_SEND_A_ATIME = 11,
+	BTRFS_SEND_A_OTIME = 12,
 
-	BTRFS_SEND_A_XATTR_NAME,
-	BTRFS_SEND_A_XATTR_DATA,
+	BTRFS_SEND_A_XATTR_NAME = 13,
+	BTRFS_SEND_A_XATTR_DATA = 14,
 
-	BTRFS_SEND_A_PATH,
-	BTRFS_SEND_A_PATH_TO,
-	BTRFS_SEND_A_PATH_LINK,
+	BTRFS_SEND_A_PATH = 15,
+	BTRFS_SEND_A_PATH_TO = 16,
+	BTRFS_SEND_A_PATH_LINK = 17,
 
-	BTRFS_SEND_A_FILE_OFFSET,
-	BTRFS_SEND_A_DATA,
+	BTRFS_SEND_A_FILE_OFFSET = 18,
+	BTRFS_SEND_A_DATA = 19,
 
-	BTRFS_SEND_A_CLONE_UUID,
-	BTRFS_SEND_A_CLONE_CTRANSID,
-	BTRFS_SEND_A_CLONE_PATH,
-	BTRFS_SEND_A_CLONE_OFFSET,
-	BTRFS_SEND_A_CLONE_LEN,
+	BTRFS_SEND_A_CLONE_UUID = 20,
+	BTRFS_SEND_A_CLONE_CTRANSID = 21,
+	BTRFS_SEND_A_CLONE_PATH = 22,
+	BTRFS_SEND_A_CLONE_OFFSET = 23,
+	BTRFS_SEND_A_CLONE_LEN = 24,
 
-	__BTRFS_SEND_A_MAX,
+	BTRFS_SEND_A_MAX = 24,
 };
-#define BTRFS_SEND_A_MAX (__BTRFS_SEND_A_MAX - 1)
 
 #ifdef __KERNEL__
 long btrfs_ioctl_send(struct inode *inode, struct btrfs_ioctl_send_args *arg);
-- 
2.35.1



* [PATCH v14 3/7] btrfs: add send stream v2 definitions
  2022-03-17 17:25 [PATCH v14 0/7] btrfs: add send/receive support for reading/writing compressed data Omar Sandoval
  2022-03-17 17:25 ` [PATCH v14 1/7] btrfs: send: remove unused send_ctx::{total,cmd}_send_size Omar Sandoval
  2022-03-17 17:25 ` [PATCH v14 2/7] btrfs: send: explicitly number commands and attributes Omar Sandoval
@ 2022-03-17 17:25 ` Omar Sandoval
  2022-03-17 17:25 ` [PATCH v14 4/7] btrfs: send: write larger chunks when using stream v2 Omar Sandoval
                   ` (14 subsequent siblings)
  17 siblings, 0 replies; 29+ messages in thread
From: Omar Sandoval @ 2022-03-17 17:25 UTC
  To: linux-btrfs; +Cc: kernel-team

From: Omar Sandoval <osandov@fb.com>

This adds the definitions of the new commands for send stream version 2
and their respective attributes: fallocate, FS_IOC_SETFLAGS (a.k.a.
chattr), and encoded writes. It also documents two changes to the send
stream format in v2: the receiver shouldn't assume a maximum command
size, and the DATA attribute is encoded differently to allow for writes
larger than 64k. These will be implemented in subsequent changes, and
then the ioctl will accept the new version and flag.
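
As an illustration of what the DATA change means for a receiver, here is
a minimal userspace sketch. It is not the btrfs-progs implementation:
parse_data_attr() and the SKETCH_ prefix are hypothetical, and it only
assumes the layout documented in this patch (a 16-bit type followed by a
16-bit length in v1, a 16-bit type alone in v2, all little-endian).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_SEND_A_DATA 19	/* value of BTRFS_SEND_A_DATA in this patch */

/* Read a little-endian u16 without assuming alignment. */
static uint16_t get_le16(const unsigned char *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Locate the payload of a DATA attribute starting at attr[0]. 'left' is
 * the number of bytes remaining in the command, counted from attr.
 * Returns 0 and fills *data and *data_len, or -1 if the input is
 * malformed.
 */
static int parse_data_attr(int proto, const unsigned char *attr, size_t left,
			   const unsigned char **data, size_t *data_len)
{
	size_t hdr_len = proto >= 2 ? 2 : 4;

	assert(attr && data && data_len);
	if (left < hdr_len || get_le16(attr) != SKETCH_SEND_A_DATA)
		return -1;

	if (proto >= 2) {
		/* v2: no length field, the payload runs to the command end. */
		*data_len = left - hdr_len;
	} else {
		/* v1: an explicit 16-bit length follows the type. */
		*data_len = get_le16(attr + 2);
		if (*data_len > left - hdr_len)
			return -1;
	}
	*data = attr + hdr_len;
	return 0;
}
```

Because the v2 length comes from the command header rather than the
attribute, the receiver has to know how many bytes of the command remain
when it reaches DATA, which is why DATA must be the last attribute.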

Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Omar Sandoval <osandov@fb.com>
---
 fs/btrfs/send.c            |  2 +-
 fs/btrfs/send.h            | 40 ++++++++++++++++++++++++++++++++++----
 include/uapi/linux/btrfs.h |  7 +++++++
 3 files changed, 44 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index 9363f625fa17..1f141de3a7d6 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -7459,7 +7459,7 @@ long btrfs_ioctl_send(struct inode *inode, struct btrfs_ioctl_send_args *arg)
 
 	sctx->clone_roots_cnt = arg->clone_sources_count;
 
-	sctx->send_max_size = BTRFS_SEND_BUF_SIZE;
+	sctx->send_max_size = BTRFS_SEND_BUF_SIZE_V1;
 	sctx->send_buf = kvmalloc(sctx->send_max_size, GFP_KERNEL);
 	if (!sctx->send_buf) {
 		ret = -ENOMEM;
diff --git a/fs/btrfs/send.h b/fs/btrfs/send.h
index 67721e0281ba..805d8095209a 100644
--- a/fs/btrfs/send.h
+++ b/fs/btrfs/send.h
@@ -12,7 +12,11 @@
 #define BTRFS_SEND_STREAM_MAGIC "btrfs-stream"
 #define BTRFS_SEND_STREAM_VERSION 1
 
-#define BTRFS_SEND_BUF_SIZE SZ_64K
+/*
+ * In send stream v1, no command is larger than 64k. In send stream v2, no limit
+ * should be assumed.
+ */
+#define BTRFS_SEND_BUF_SIZE_V1 SZ_64K
 
 enum btrfs_tlv_type {
 	BTRFS_TLV_U8,
@@ -80,16 +84,20 @@ enum btrfs_send_cmd {
 	BTRFS_SEND_C_MAX_V1 = 22,
 
 	/* Version 2 */
-	BTRFS_SEND_C_MAX_V2 = 22,
+	BTRFS_SEND_C_FALLOCATE = 23,
+	BTRFS_SEND_C_SETFLAGS = 24,
+	BTRFS_SEND_C_ENCODED_WRITE = 25,
+	BTRFS_SEND_C_MAX_V2 = 25,
 
 	/* End */
-	BTRFS_SEND_C_MAX = 22,
+	BTRFS_SEND_C_MAX = 25,
 };
 
 /* attributes in send stream */
 enum {
 	BTRFS_SEND_A_UNSPEC = 0,
 
+	/* Version 1 */
 	BTRFS_SEND_A_UUID = 1,
 	BTRFS_SEND_A_CTRANSID = 2,
 
@@ -112,6 +120,11 @@ enum {
 	BTRFS_SEND_A_PATH_LINK = 17,
 
 	BTRFS_SEND_A_FILE_OFFSET = 18,
+	/*
+	 * As of send stream v2, this attribute is special: it must be the last
+	 * attribute in a command, its header contains only the type, and its
+	 * length is implicitly the remaining length of the command.
+	 */
 	BTRFS_SEND_A_DATA = 19,
 
 	BTRFS_SEND_A_CLONE_UUID = 20,
@@ -120,7 +133,26 @@ enum {
 	BTRFS_SEND_A_CLONE_OFFSET = 23,
 	BTRFS_SEND_A_CLONE_LEN = 24,
 
-	BTRFS_SEND_A_MAX = 24,
+	BTRFS_SEND_A_MAX_V1 = 24,
+
+	/* Version 2 */
+	BTRFS_SEND_A_FALLOCATE_MODE = 25,
+
+	BTRFS_SEND_A_SETFLAGS_FLAGS = 26,
+
+	BTRFS_SEND_A_UNENCODED_FILE_LEN = 27,
+	BTRFS_SEND_A_UNENCODED_LEN = 28,
+	BTRFS_SEND_A_UNENCODED_OFFSET = 29,
+	/*
+	 * COMPRESSION and ENCRYPTION default to NONE (0) if omitted from
+	 * BTRFS_SEND_C_ENCODED_WRITE.
+	 */
+	BTRFS_SEND_A_COMPRESSION = 30,
+	BTRFS_SEND_A_ENCRYPTION = 31,
+	BTRFS_SEND_A_MAX_V2 = 31,
+
+	/* End */
+	BTRFS_SEND_A_MAX = 31,
 };
 
 #ifdef __KERNEL__
diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h
index d956b2993970..b6f26a434b10 100644
--- a/include/uapi/linux/btrfs.h
+++ b/include/uapi/linux/btrfs.h
@@ -777,6 +777,13 @@ struct btrfs_ioctl_received_subvol_args {
  */
 #define BTRFS_SEND_FLAG_VERSION			0x8
 
+/*
+ * Send compressed data using the ENCODED_WRITE command instead of decompressing
+ * the data and sending it with the WRITE command. This requires protocol
+ * version >= 2.
+ */
+#define BTRFS_SEND_FLAG_COMPRESSED		0x10
+
 #define BTRFS_SEND_FLAG_MASK \
 	(BTRFS_SEND_FLAG_NO_FILE_DATA | \
 	 BTRFS_SEND_FLAG_OMIT_STREAM_HEADER | \
-- 
2.35.1



* [PATCH v14 4/7] btrfs: send: write larger chunks when using stream v2
  2022-03-17 17:25 [PATCH v14 0/7] btrfs: add send/receive support for reading/writing compressed data Omar Sandoval
                   ` (2 preceding siblings ...)
  2022-03-17 17:25 ` [PATCH v14 3/7] btrfs: add send stream v2 definitions Omar Sandoval
@ 2022-03-17 17:25 ` Omar Sandoval
  2022-03-24 17:52   ` Sweet Tea Dorminy
  2022-03-17 17:25 ` [PATCH v14 5/7] btrfs: send: allocate send buffer with alloc_page() and vmap() for v2 Omar Sandoval
                   ` (13 subsequent siblings)
  17 siblings, 1 reply; 29+ messages in thread
From: Omar Sandoval @ 2022-03-17 17:25 UTC
  To: linux-btrfs; +Cc: kernel-team

From: Omar Sandoval <osandov@fb.com>

The length field of the send stream TLV header is 16 bits. This means
that the maximum amount of data that can be sent for one write is 64k
minus one. However, encoded writes must be able to send the maximum
compressed extent (128k) in one command. To support this, send stream
version 2 encodes the DATA attribute differently: it has no length
field, and the length implicitly runs to the end of the containing
command (which has a 32-bit length field). Although this is necessary
for encoded writes, normal writes can benefit from it, too.
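
The two header layouts can be sketched in userspace as follows. This is
a hedged analogue of put_data_header() from the diff below, not the
kernel code itself: put_data_header_sketch() and put_le16() are
hypothetical names, and the explicit rejection of v1 lengths above 65535
is an addition for the sketch (the kernel relies on the caller limiting
the read size instead).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_SEND_A_DATA 19	/* value of BTRFS_SEND_A_DATA */

/* Store a little-endian u16 without assuming alignment. */
static void put_le16(unsigned char *p, uint16_t v)
{
	p[0] = (unsigned char)(v & 0xff);
	p[1] = (unsigned char)(v >> 8);
}

/*
 * Append a DATA attribute header at buf[*size] and advance *size past it.
 * Returns 0 on success, or -1 if the header plus 'len' payload bytes
 * would not fit in max_size, or if len does not fit a v1 length field.
 */
static int put_data_header_sketch(int proto, unsigned char *buf,
				  size_t max_size, size_t *size, size_t len)
{
	size_t hdr_len = proto >= 2 ? 2 : 4;

	assert(buf && size);
	if (*size > max_size || max_size - *size < hdr_len ||
	    max_size - *size - hdr_len < len)
		return -1;
	if (proto < 2 && len > UINT16_MAX)
		return -1;	/* sketch-only check, see lead-in */

	put_le16(buf + *size, SKETCH_SEND_A_DATA);
	if (proto < 2)
		put_le16(buf + *size + 2, (uint16_t)len);
	*size += hdr_len;
	return 0;
}
```

In v1 the header costs 4 bytes and caps the payload at 65535 bytes; in
v2 it costs 2 bytes and the payload is bounded only by the buffer and
the 32-bit command length.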

Also add a check to enforce that the DATA attribute is last. It is only
strictly necessary for v2, but we might as well make v1 consistent with
it.

For v2, let's bump up the send buffer to the maximum compressed extent
size plus 16k for the other metadata (144k total). This will most likely
be vmalloc'd (and always will be after the next commit), so we round it
up to the next page boundary: on systems with >16k pages, we might as
well use the rest of the page.
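
To make the arithmetic concrete, here is a small userspace sketch of the
size calculation. The SKETCH_ constants and v2_send_buf_size() are
hypothetical names; the 128k value is the maximum compressed extent size
stated above, and page sizes are assumed to be powers of two.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_SZ_16K		(16u * 1024)
#define SKETCH_MAX_COMPRESSED	(128u * 1024)	/* max compressed extent */

/* Round x up to a multiple of align; align must be a power of two. */
static size_t align_up(size_t x, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (x + align - 1) & ~(align - 1);
}

/* Size of the v2 send buffer for a given page size. */
static size_t v2_send_buf_size(size_t page_size)
{
	return align_up((size_t)SKETCH_SZ_16K + SKETCH_MAX_COMPRESSED,
			page_size);
}
```

With 4k or 16k pages the result stays at 147456 bytes (144k), since 144k
is already a multiple of both. With 64k pages it rounds up to 196608
bytes (192k).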

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Omar Sandoval <osandov@fb.com>
---
 fs/btrfs/send.c | 42 ++++++++++++++++++++++++++++++++++--------
 1 file changed, 34 insertions(+), 8 deletions(-)

diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index 1f141de3a7d6..02053fff80ca 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -82,6 +82,7 @@ struct send_ctx {
 	char *send_buf;
 	u32 send_size;
 	u32 send_max_size;
+	bool put_data;
 	u64 flags;	/* 'flags' member of btrfs_ioctl_send_args is u64 */
 	/* Protocol version compatibility requested */
 	u32 proto;
@@ -589,6 +590,9 @@ static int tlv_put(struct send_ctx *sctx, u16 attr, const void *data, int len)
 	int total_len = sizeof(*hdr) + len;
 	int left = sctx->send_max_size - sctx->send_size;
 
+	if (WARN_ON_ONCE(sctx->put_data))
+		return -EINVAL;
+
 	if (unlikely(left < total_len))
 		return -EOVERFLOW;
 
@@ -726,6 +730,7 @@ static int send_cmd(struct send_ctx *sctx)
 					&sctx->send_off);
 
 	sctx->send_size = 0;
+	sctx->put_data = false;
 
 	return ret;
 }
@@ -4853,14 +4858,30 @@ static inline u64 max_send_read_size(const struct send_ctx *sctx)
 
 static int put_data_header(struct send_ctx *sctx, u32 len)
 {
-	struct btrfs_tlv_header *hdr;
+	if (WARN_ON_ONCE(sctx->put_data))
+		return -EINVAL;
+	sctx->put_data = true;
+	if (sctx->proto >= 2) {
+		/*
+		 * In v2, the data attribute header doesn't include a length; it
+		 * is implicitly to the end of the command.
+		 */
+		if (sctx->send_max_size - sctx->send_size < 2 + len)
+			return -EOVERFLOW;
+		put_unaligned_le16(BTRFS_SEND_A_DATA,
+				   sctx->send_buf + sctx->send_size);
+		sctx->send_size += 2;
+	} else {
+		struct btrfs_tlv_header *hdr;
 
-	if (sctx->send_max_size - sctx->send_size < sizeof(*hdr) + len)
-		return -EOVERFLOW;
-	hdr = (struct btrfs_tlv_header *)(sctx->send_buf + sctx->send_size);
-	put_unaligned_le16(BTRFS_SEND_A_DATA, &hdr->tlv_type);
-	put_unaligned_le16(len, &hdr->tlv_len);
-	sctx->send_size += sizeof(*hdr);
+		if (sctx->send_max_size - sctx->send_size < sizeof(*hdr) + len)
+			return -EOVERFLOW;
+		hdr = (struct btrfs_tlv_header *)(sctx->send_buf +
+						  sctx->send_size);
+		put_unaligned_le16(BTRFS_SEND_A_DATA, &hdr->tlv_type);
+		put_unaligned_le16(len, &hdr->tlv_len);
+		sctx->send_size += sizeof(*hdr);
+	}
 	return 0;
 }
 
@@ -7459,7 +7480,12 @@ long btrfs_ioctl_send(struct inode *inode, struct btrfs_ioctl_send_args *arg)
 
 	sctx->clone_roots_cnt = arg->clone_sources_count;
 
-	sctx->send_max_size = BTRFS_SEND_BUF_SIZE_V1;
+	if (sctx->proto >= 2) {
+		sctx->send_max_size = ALIGN(SZ_16K + BTRFS_MAX_COMPRESSED,
+					    PAGE_SIZE);
+	} else {
+		sctx->send_max_size = BTRFS_SEND_BUF_SIZE_V1;
+	}
 	sctx->send_buf = kvmalloc(sctx->send_max_size, GFP_KERNEL);
 	if (!sctx->send_buf) {
 		ret = -ENOMEM;
-- 
2.35.1



* [PATCH v14 5/7] btrfs: send: allocate send buffer with alloc_page() and vmap() for v2
  2022-03-17 17:25 [PATCH v14 0/7] btrfs: add send/receive support for reading/writing compressed data Omar Sandoval
                   ` (3 preceding siblings ...)
  2022-03-17 17:25 ` [PATCH v14 4/7] btrfs: send: write larger chunks when using stream v2 Omar Sandoval
@ 2022-03-17 17:25 ` Omar Sandoval
  2022-03-24 17:53   ` Sweet Tea Dorminy
  2022-03-17 17:25 ` [PATCH v14 6/7] btrfs: send: send compressed extents with encoded writes Omar Sandoval
                   ` (12 subsequent siblings)
  17 siblings, 1 reply; 29+ messages in thread
From: Omar Sandoval @ 2022-03-17 17:25 UTC
  To: linux-btrfs; +Cc: kernel-team

From: Omar Sandoval <osandov@fb.com>

For encoded writes, we need the raw pages for reading compressed data
directly via a bio. So, replace kvmalloc() with vmap() so we have access
to the raw pages. 144k is large enough that it usually gets allocated
with vmalloc() anyway.
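
The point of keeping the page array is that a page-aligned offset into
the mapped buffer corresponds directly to an index into that array, so
I/O can target the pages while the rest of send still uses the linear
mapping. The following userspace sketch only models that index
arithmetic: payload_placement() is a hypothetical helper, the page size
is passed in as a parameter assumed to be a power of two, and no real
pages are allocated or mapped.

```c
#include <assert.h>
#include <stddef.h>

/* Round off up to the next multiple of page_size (a power of two). */
static size_t page_align(size_t off, size_t page_size)
{
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	return (off + page_size - 1) & ~(page_size - 1);
}

/*
 * Given send_size bytes of command header and attributes already in the
 * buffer, compute the page-aligned offset where payload_len bytes of
 * payload would start, and the index of the first backing page.
 * Returns 0 on success or -1 if the payload does not fit in max_size.
 */
static int payload_placement(size_t send_size, size_t payload_len,
			     size_t max_size, size_t page_size,
			     size_t *data_offset, size_t *first_page)
{
	size_t off = page_align(send_size, page_size);

	assert(data_offset && first_page);
	if (off > max_size || max_size - off < payload_len)
		return -1;

	*data_offset = off;			/* offset in the mapping */
	*first_page = off / page_size;		/* index in the page array */
	return 0;
}
```

For example, with 4k pages and 100 bytes of header, the payload would
start at offset 4096 in the mapping, which is page 1 of the array, and
the bytes between 100 and 4096 are left as a gap.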

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Omar Sandoval <osandov@fb.com>
---
 fs/btrfs/send.c | 33 +++++++++++++++++++++++++++++++--
 1 file changed, 31 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index 02053fff80ca..ac2a1297027a 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -83,6 +83,7 @@ struct send_ctx {
 	u32 send_size;
 	u32 send_max_size;
 	bool put_data;
+	struct page **send_buf_pages;
 	u64 flags;	/* 'flags' member of btrfs_ioctl_send_args is u64 */
 	/* Protocol version compatibility requested */
 	u32 proto;
@@ -7392,6 +7393,7 @@ long btrfs_ioctl_send(struct inode *inode, struct btrfs_ioctl_send_args *arg)
 	struct btrfs_root *clone_root;
 	struct send_ctx *sctx = NULL;
 	u32 i;
+	u32 send_buf_num_pages = 0;
 	u64 *clone_sources_tmp = NULL;
 	int clone_sources_to_rollback = 0;
 	size_t alloc_size;
@@ -7483,10 +7485,28 @@ long btrfs_ioctl_send(struct inode *inode, struct btrfs_ioctl_send_args *arg)
 	if (sctx->proto >= 2) {
 		sctx->send_max_size = ALIGN(SZ_16K + BTRFS_MAX_COMPRESSED,
 					    PAGE_SIZE);
+		send_buf_num_pages = sctx->send_max_size >> PAGE_SHIFT;
+		sctx->send_buf_pages = kcalloc(send_buf_num_pages,
+					       sizeof(*sctx->send_buf_pages),
+					       GFP_KERNEL);
+		if (!sctx->send_buf_pages) {
+			send_buf_num_pages = 0;
+			ret = -ENOMEM;
+			goto out;
+		}
+		for (i = 0; i < send_buf_num_pages; i++) {
+			sctx->send_buf_pages[i] = alloc_page(GFP_KERNEL);
+			if (!sctx->send_buf_pages[i]) {
+				ret = -ENOMEM;
+				goto out;
+			}
+		}
+		sctx->send_buf = vmap(sctx->send_buf_pages, send_buf_num_pages,
+				      VM_MAP, PAGE_KERNEL);
 	} else {
 		sctx->send_max_size = BTRFS_SEND_BUF_SIZE_V1;
+		sctx->send_buf = kvmalloc(sctx->send_max_size, GFP_KERNEL);
 	}
-	sctx->send_buf = kvmalloc(sctx->send_max_size, GFP_KERNEL);
 	if (!sctx->send_buf) {
 		ret = -ENOMEM;
 		goto out;
@@ -7679,7 +7699,16 @@ long btrfs_ioctl_send(struct inode *inode, struct btrfs_ioctl_send_args *arg)
 			fput(sctx->send_filp);
 
 		kvfree(sctx->clone_roots);
-		kvfree(sctx->send_buf);
+		if (sctx->proto >= 2) {
+			vunmap(sctx->send_buf);
+			for (i = 0; i < send_buf_num_pages; i++) {
+				if (sctx->send_buf_pages[i])
+					__free_page(sctx->send_buf_pages[i]);
+			}
+			kfree(sctx->send_buf_pages);
+		} else {
+			kvfree(sctx->send_buf);
+		}
 
 		name_cache_free(sctx);
 
-- 
2.35.1



* [PATCH v14 6/7] btrfs: send: send compressed extents with encoded writes
  2022-03-17 17:25 [PATCH v14 0/7] btrfs: add send/receive support for reading/writing compressed data Omar Sandoval
                   ` (4 preceding siblings ...)
  2022-03-17 17:25 ` [PATCH v14 5/7] btrfs: send: allocate send buffer with alloc_page() and vmap() for v2 Omar Sandoval
@ 2022-03-17 17:25 ` Omar Sandoval
  2022-03-17 17:25 ` [PATCH v14 7/7] btrfs: send: enable support for stream v2 and compressed writes Omar Sandoval
                   ` (11 subsequent siblings)
  17 siblings, 0 replies; 29+ messages in thread
From: Omar Sandoval @ 2022-03-17 17:25 UTC
  To: linux-btrfs; +Cc: kernel-team

From: Omar Sandoval <osandov@fb.com>

Now that all of the pieces are in place, we can use the ENCODED_WRITE
command to send compressed extents when appropriate.

Signed-off-by: Omar Sandoval <osandov@fb.com>
---
 fs/btrfs/ctree.h |   6 ++
 fs/btrfs/inode.c |  13 +--
 fs/btrfs/send.c  | 234 +++++++++++++++++++++++++++++++++++++++++++----
 3 files changed, 228 insertions(+), 25 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 09b7b0b2d016..92bfee5b3ded 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -3355,6 +3355,12 @@ int btrfs_writepage_cow_fixup(struct page *page);
 void btrfs_writepage_endio_finish_ordered(struct btrfs_inode *inode,
 					  struct page *page, u64 start,
 					  u64 end, bool uptodate);
+int btrfs_encoded_io_compression_from_extent(struct btrfs_fs_info *fs_info,
+					     int compress_type);
+int btrfs_encoded_read_regular_fill_pages(struct btrfs_inode *inode,
+					  u64 file_offset, u64 disk_bytenr,
+					  u64 disk_io_size,
+					  struct page **pages);
 ssize_t btrfs_encoded_read(struct kiocb *iocb, struct iov_iter *iter,
 			   struct btrfs_ioctl_encoded_io_args *encoded);
 ssize_t btrfs_do_encoded_write(struct kiocb *iocb, struct iov_iter *from,
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 78a5145353e1..6b8c0f026545 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -10114,9 +10114,8 @@ void btrfs_set_range_writeback(struct btrfs_inode *inode, u64 start, u64 end)
 	}
 }
 
-static int btrfs_encoded_io_compression_from_extent(
-				struct btrfs_fs_info *fs_info,
-				int compress_type)
+int btrfs_encoded_io_compression_from_extent(struct btrfs_fs_info *fs_info,
+					     int compress_type)
 {
 	switch (compress_type) {
 	case BTRFS_COMPRESS_NONE:
@@ -10321,11 +10320,9 @@ static void btrfs_encoded_read_endio(struct bio *bio)
 	bio_put(bio);
 }
 
-static int btrfs_encoded_read_regular_fill_pages(struct btrfs_inode *inode,
-						 u64 file_offset,
-						 u64 disk_bytenr,
-						 u64 disk_io_size,
-						 struct page **pages)
+int btrfs_encoded_read_regular_fill_pages(struct btrfs_inode *inode,
+					  u64 file_offset, u64 disk_bytenr,
+					  u64 disk_io_size, struct page **pages)
 {
 	struct btrfs_fs_info *fs_info = inode->root->fs_info;
 	struct btrfs_encoded_read_private priv = {
diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index ac2a1297027a..b0560be3053b 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -614,6 +614,7 @@ static int tlv_put(struct send_ctx *sctx, u16 attr, const void *data, int len)
 		return tlv_put(sctx, attr, &__tmp, sizeof(__tmp));	\
 	}
 
+TLV_PUT_DEFINE_INT(32)
 TLV_PUT_DEFINE_INT(64)
 
 static int tlv_put_string(struct send_ctx *sctx, u16 attr,
@@ -5160,16 +5161,215 @@ static int send_hole(struct send_ctx *sctx, u64 end)
 	return ret;
 }
 
-static int send_extent_data(struct send_ctx *sctx,
-			    const u64 offset,
-			    const u64 len)
+static int send_encoded_inline_extent(struct send_ctx *sctx,
+				      struct btrfs_path *path, u64 offset,
+				      u64 len)
 {
+	struct btrfs_root *root = sctx->send_root;
+	struct btrfs_fs_info *fs_info = root->fs_info;
+	struct inode *inode;
+	struct fs_path *p;
+	struct extent_buffer *leaf = path->nodes[0];
+	struct btrfs_key key;
+	struct btrfs_file_extent_item *ei;
+	u64 ram_bytes;
+	size_t inline_size;
+	int ret;
+
+	inode = btrfs_iget(fs_info->sb, sctx->cur_ino, root);
+	if (IS_ERR(inode))
+		return PTR_ERR(inode);
+
+	p = fs_path_alloc();
+	if (!p) {
+		ret = -ENOMEM;
+		goto out;
+	}
+
+	ret = begin_cmd(sctx, BTRFS_SEND_C_ENCODED_WRITE);
+	if (ret < 0)
+		goto out;
+
+	ret = get_cur_path(sctx, sctx->cur_ino, sctx->cur_inode_gen, p);
+	if (ret < 0)
+		goto out;
+
+	btrfs_item_key_to_cpu(leaf, &key, path->slots[0]);
+	ei = btrfs_item_ptr(leaf, path->slots[0],
+			    struct btrfs_file_extent_item);
+	ram_bytes = btrfs_file_extent_ram_bytes(leaf, ei);
+	inline_size = btrfs_file_extent_inline_item_len(leaf, path->slots[0]);
+
+	TLV_PUT_PATH(sctx, BTRFS_SEND_A_PATH, p);
+	TLV_PUT_U64(sctx, BTRFS_SEND_A_FILE_OFFSET, offset);
+	TLV_PUT_U64(sctx, BTRFS_SEND_A_UNENCODED_FILE_LEN,
+		    min(key.offset + ram_bytes - offset, len));
+	TLV_PUT_U64(sctx, BTRFS_SEND_A_UNENCODED_LEN, ram_bytes);
+	TLV_PUT_U64(sctx, BTRFS_SEND_A_UNENCODED_OFFSET, offset - key.offset);
+	ret = btrfs_encoded_io_compression_from_extent(fs_info,
+				btrfs_file_extent_compression(leaf, ei));
+	if (ret < 0)
+		goto out;
+	TLV_PUT_U32(sctx, BTRFS_SEND_A_COMPRESSION, ret);
+
+	ret = put_data_header(sctx, inline_size);
+	if (ret < 0)
+		goto out;
+	read_extent_buffer(leaf, sctx->send_buf + sctx->send_size,
+			   btrfs_file_extent_inline_start(ei), inline_size);
+	sctx->send_size += inline_size;
+
+	ret = send_cmd(sctx);
+
+tlv_put_failure:
+out:
+	fs_path_free(p);
+	iput(inode);
+	return ret;
+}
+
+static int send_encoded_extent(struct send_ctx *sctx, struct btrfs_path *path,
+			       u64 offset, u64 len)
+{
+	struct btrfs_root *root = sctx->send_root;
+	struct btrfs_fs_info *fs_info = root->fs_info;
+	struct inode *inode;
+	struct fs_path *p;
+	struct extent_buffer *leaf = path->nodes[0];
+	struct btrfs_key key;
+	struct btrfs_file_extent_item *ei;
+	u64 disk_bytenr, disk_num_bytes;
+	u32 data_offset;
+	struct btrfs_cmd_header *hdr;
+	u32 crc;
+	int ret;
+
+	inode = btrfs_iget(fs_info->sb, sctx->cur_ino, root);
+	if (IS_ERR(inode))
+		return PTR_ERR(inode);
+
+	p = fs_path_alloc();
+	if (!p) {
+		ret = -ENOMEM;
+		goto out;
+	}
+
+	ret = begin_cmd(sctx, BTRFS_SEND_C_ENCODED_WRITE);
+	if (ret < 0)
+		goto out;
+
+	ret = get_cur_path(sctx, sctx->cur_ino, sctx->cur_inode_gen, p);
+	if (ret < 0)
+		goto out;
+
+	btrfs_item_key_to_cpu(leaf, &key, path->slots[0]);
+	ei = btrfs_item_ptr(leaf, path->slots[0],
+			    struct btrfs_file_extent_item);
+	disk_bytenr = btrfs_file_extent_disk_bytenr(leaf, ei);
+	disk_num_bytes = btrfs_file_extent_disk_num_bytes(leaf, ei);
+
+	TLV_PUT_PATH(sctx, BTRFS_SEND_A_PATH, p);
+	TLV_PUT_U64(sctx, BTRFS_SEND_A_FILE_OFFSET, offset);
+	TLV_PUT_U64(sctx, BTRFS_SEND_A_UNENCODED_FILE_LEN,
+		    min(key.offset + btrfs_file_extent_num_bytes(leaf, ei) - offset,
+			len));
+	TLV_PUT_U64(sctx, BTRFS_SEND_A_UNENCODED_LEN,
+		    btrfs_file_extent_ram_bytes(leaf, ei));
+	TLV_PUT_U64(sctx, BTRFS_SEND_A_UNENCODED_OFFSET,
+		    offset - key.offset + btrfs_file_extent_offset(leaf, ei));
+	ret = btrfs_encoded_io_compression_from_extent(fs_info,
+				btrfs_file_extent_compression(leaf, ei));
+	if (ret < 0)
+		goto out;
+	TLV_PUT_U32(sctx, BTRFS_SEND_A_COMPRESSION, ret);
+	TLV_PUT_U32(sctx, BTRFS_SEND_A_ENCRYPTION, 0);
+
+	ret = put_data_header(sctx, disk_num_bytes);
+	if (ret < 0)
+		goto out;
+
+	/*
+	 * We want to do I/O directly into the send buffer, so get the next page
+	 * boundary in the send buffer. This means that there may be a gap
+	 * between the beginning of the command and the file data.
+	 */
+	data_offset = ALIGN(sctx->send_size, PAGE_SIZE);
+	if (data_offset > sctx->send_max_size ||
+	    sctx->send_max_size - data_offset < disk_num_bytes) {
+		ret = -EOVERFLOW;
+		goto out;
+	}
+
+	/*
+	 * Note that send_buf is a mapping of send_buf_pages, so this is really
+	 * reading into send_buf.
+	 */
+	ret = btrfs_encoded_read_regular_fill_pages(BTRFS_I(inode), offset,
+						    disk_bytenr, disk_num_bytes,
+						    sctx->send_buf_pages +
+						    (data_offset >> PAGE_SHIFT));
+	if (ret)
+		goto out;
+
+	hdr = (struct btrfs_cmd_header *)sctx->send_buf;
+	hdr->len = cpu_to_le32(sctx->send_size + disk_num_bytes - sizeof(*hdr));
+	hdr->crc = 0;
+	crc = btrfs_crc32c(0, sctx->send_buf, sctx->send_size);
+	crc = btrfs_crc32c(crc, sctx->send_buf + data_offset, disk_num_bytes);
+	hdr->crc = cpu_to_le32(crc);
+
+	ret = write_buf(sctx->send_filp, sctx->send_buf, sctx->send_size,
+			&sctx->send_off);
+	if (!ret) {
+		ret = write_buf(sctx->send_filp, sctx->send_buf + data_offset,
+				disk_num_bytes, &sctx->send_off);
+	}
+	sctx->send_size = 0;
+	sctx->put_data = false;
+
+tlv_put_failure:
+out:
+	fs_path_free(p);
+	iput(inode);
+	return ret;
+}
+
+static int send_extent_data(struct send_ctx *sctx, struct btrfs_path *path,
+			    const u64 offset, const u64 len)
+{
+	struct extent_buffer *leaf = path->nodes[0];
+	struct btrfs_file_extent_item *ei;
 	u64 read_size = max_send_read_size(sctx);
 	u64 sent = 0;
 
 	if (sctx->flags & BTRFS_SEND_FLAG_NO_FILE_DATA)
 		return send_update_extent(sctx, offset, len);
 
+	ei = btrfs_item_ptr(leaf, path->slots[0],
+			    struct btrfs_file_extent_item);
+	if ((sctx->flags & BTRFS_SEND_FLAG_COMPRESSED) &&
+	    btrfs_file_extent_compression(leaf, ei) != BTRFS_COMPRESS_NONE) {
+		bool is_inline = (btrfs_file_extent_type(leaf, ei) ==
+				  BTRFS_FILE_EXTENT_INLINE);
+
+		/*
+		 * Send the compressed extent unless the compressed data is
+		 * larger than the decompressed data. This can happen if we're
+		 * not sending the entire extent, either because it has been
+		 * partially overwritten/truncated or because this is a part of
+		 * the extent that we couldn't clone in clone_range().
+		 */
+		if (is_inline &&
+		    btrfs_file_extent_inline_item_len(leaf,
+						      path->slots[0]) <= len) {
+			return send_encoded_inline_extent(sctx, path, offset,
+							  len);
+		} else if (!is_inline &&
+			   btrfs_file_extent_disk_num_bytes(leaf, ei) <= len) {
+			return send_encoded_extent(sctx, path, offset, len);
+		}
+	}
+
 	while (sent < len) {
 		u64 size = min(len - sent, read_size);
 		int ret;
@@ -5240,12 +5440,9 @@ static int send_capabilities(struct send_ctx *sctx)
 	return ret;
 }
 
-static int clone_range(struct send_ctx *sctx,
-		       struct clone_root *clone_root,
-		       const u64 disk_byte,
-		       u64 data_offset,
-		       u64 offset,
-		       u64 len)
+static int clone_range(struct send_ctx *sctx, struct btrfs_path *dst_path,
+		       struct clone_root *clone_root, const u64 disk_byte,
+		       u64 data_offset, u64 offset, u64 len)
 {
 	struct btrfs_path *path;
 	struct btrfs_key key;
@@ -5269,7 +5466,7 @@ static int clone_range(struct send_ctx *sctx,
 	 */
 	if (clone_root->offset == 0 &&
 	    len == sctx->send_root->fs_info->sectorsize)
-		return send_extent_data(sctx, offset, len);
+		return send_extent_data(sctx, dst_path, offset, len);
 
 	path = alloc_path_for_send();
 	if (!path)
@@ -5366,7 +5563,8 @@ static int clone_range(struct send_ctx *sctx,
 
 			if (hole_len > len)
 				hole_len = len;
-			ret = send_extent_data(sctx, offset, hole_len);
+			ret = send_extent_data(sctx, dst_path, offset,
+					       hole_len);
 			if (ret < 0)
 				goto out;
 
@@ -5439,14 +5637,16 @@ static int clone_range(struct send_ctx *sctx,
 					if (ret < 0)
 						goto out;
 				}
-				ret = send_extent_data(sctx, offset + slen,
+				ret = send_extent_data(sctx, dst_path,
+						       offset + slen,
 						       clone_len - slen);
 			} else {
 				ret = send_clone(sctx, offset, clone_len,
 						 clone_root);
 			}
 		} else {
-			ret = send_extent_data(sctx, offset, clone_len);
+			ret = send_extent_data(sctx, dst_path, offset,
+					       clone_len);
 		}
 
 		if (ret < 0)
@@ -5478,7 +5678,7 @@ static int clone_range(struct send_ctx *sctx,
 	}
 
 	if (len > 0)
-		ret = send_extent_data(sctx, offset, len);
+		ret = send_extent_data(sctx, dst_path, offset, len);
 	else
 		ret = 0;
 out:
@@ -5509,10 +5709,10 @@ static int send_write_or_clone(struct send_ctx *sctx,
 				    struct btrfs_file_extent_item);
 		disk_byte = btrfs_file_extent_disk_bytenr(path->nodes[0], ei);
 		data_offset = btrfs_file_extent_offset(path->nodes[0], ei);
-		ret = clone_range(sctx, clone_root, disk_byte, data_offset,
-				  offset, end - offset);
+		ret = clone_range(sctx, path, clone_root, disk_byte,
+				  data_offset, offset, end - offset);
 	} else {
-		ret = send_extent_data(sctx, offset, end - offset);
+		ret = send_extent_data(sctx, path, offset, end - offset);
 	}
 	sctx->cur_inode_next_write_offset = end;
 	return ret;
-- 
2.35.1


* [PATCH v14 7/7] btrfs: send: enable support for stream v2 and compressed writes
  2022-03-17 17:25 [PATCH v14 0/7] btrfs: add send/receive support for reading/writing compressed data Omar Sandoval
                   ` (5 preceding siblings ...)
  2022-03-17 17:25 ` [PATCH v14 6/7] btrfs: send: send compressed extents with encoded writes Omar Sandoval
@ 2022-03-17 17:25 ` Omar Sandoval
  2022-03-17 17:25 ` [PATCH v14 01/10] btrfs-progs: receive: support v2 send stream larger tlv_len Omar Sandoval
                   ` (10 subsequent siblings)
  17 siblings, 0 replies; 29+ messages in thread
From: Omar Sandoval @ 2022-03-17 17:25 UTC (permalink / raw)
  To: linux-btrfs; +Cc: kernel-team

From: Omar Sandoval <osandov@fb.com>

Now that the new support is implemented, allow the ioctl to accept v2
and the compressed flag, and update the version in sysfs.

Signed-off-by: Omar Sandoval <osandov@fb.com>
---
 fs/btrfs/send.c            | 7 +++++--
 fs/btrfs/send.h            | 2 +-
 include/uapi/linux/btrfs.h | 3 ++-
 3 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index b0560be3053b..4567271ce642 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -690,8 +690,7 @@ static int send_header(struct send_ctx *sctx)
 	struct btrfs_stream_header hdr;
 
 	strcpy(hdr.magic, BTRFS_SEND_STREAM_MAGIC);
-	hdr.version = cpu_to_le32(BTRFS_SEND_STREAM_VERSION);
-
+	hdr.version = cpu_to_le32(sctx->proto);
 	return write_buf(sctx->send_filp, &hdr, sizeof(hdr),
 					&sctx->send_off);
 }
@@ -7663,6 +7662,10 @@ long btrfs_ioctl_send(struct inode *inode, struct btrfs_ioctl_send_args *arg)
 	} else {
 		sctx->proto = 1;
 	}
+	if ((arg->flags & BTRFS_SEND_FLAG_COMPRESSED) && sctx->proto < 2) {
+		ret = -EINVAL;
+		goto out;
+	}
 
 	sctx->send_filp = fget(arg->send_fd);
 	if (!sctx->send_filp) {
diff --git a/fs/btrfs/send.h b/fs/btrfs/send.h
index 805d8095209a..50a2aceae929 100644
--- a/fs/btrfs/send.h
+++ b/fs/btrfs/send.h
@@ -10,7 +10,7 @@
 #include "ctree.h"
 
 #define BTRFS_SEND_STREAM_MAGIC "btrfs-stream"
-#define BTRFS_SEND_STREAM_VERSION 1
+#define BTRFS_SEND_STREAM_VERSION 2
 
 /*
  * In send stream v1, no command is larger than 64k. In send stream v2, no limit
diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h
index b6f26a434b10..f54dc91e4025 100644
--- a/include/uapi/linux/btrfs.h
+++ b/include/uapi/linux/btrfs.h
@@ -788,7 +788,8 @@ struct btrfs_ioctl_received_subvol_args {
 	(BTRFS_SEND_FLAG_NO_FILE_DATA | \
 	 BTRFS_SEND_FLAG_OMIT_STREAM_HEADER | \
 	 BTRFS_SEND_FLAG_OMIT_END_CMD | \
-	 BTRFS_SEND_FLAG_VERSION)
+	 BTRFS_SEND_FLAG_VERSION | \
+	 BTRFS_SEND_FLAG_COMPRESSED)
 
 struct btrfs_ioctl_send_args {
 	__s64 send_fd;			/* in */
-- 
2.35.1


* [PATCH v14 01/10] btrfs-progs: receive: support v2 send stream larger tlv_len
  2022-03-17 17:25 [PATCH v14 0/7] btrfs: add send/receive support for reading/writing compressed data Omar Sandoval
                   ` (6 preceding siblings ...)
  2022-03-17 17:25 ` [PATCH v14 7/7] btrfs: send: enable support for stream v2 and compressed writes Omar Sandoval
@ 2022-03-17 17:25 ` Omar Sandoval
  2022-03-17 17:25 ` [PATCH v14 02/10] btrfs-progs: receive: dynamically allocate sctx->read_buf Omar Sandoval
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 29+ messages in thread
From: Omar Sandoval @ 2022-03-17 17:25 UTC (permalink / raw)
  To: linux-btrfs; +Cc: kernel-team

From: Boris Burkov <borisb@fb.com>

An encoded extent can be up to 128K in length, which exceeds the largest
value expressible by the current send stream format's 16-bit tlv_len
field. Since encoded writes cannot be split into multiple writes by
btrfs send, the send stream format must change to accommodate encoded
writes.

Supporting this changed format requires retooling how we store the
commands we have processed. We currently store pointers to the struct
btrfs_tlv_headers in the command buffer. This is not sufficient to
represent the new BTRFS_SEND_A_DATA format. Instead, parse the attribute
headers and store them in a new struct btrfs_send_attribute which has a
32-bit length field. This is transparent to users of the various TLV_GET
macros.
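
As a rough illustration of the idea above (not the btrfs-progs code
itself), the sketch below walks a v1-style attribute area and records
each attribute in a parsed struct whose length field is 32 bits wide.
The names sketch_attr, sketch_parse_attrs and SKETCH_ATTR_MAX are
hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ATTR_MAX 31	/* hypothetical bound, plays the role of BTRFS_SEND_A_MAX */

/* Parsed attribute: host-endian fields plus a pointer into the command buffer. */
struct sketch_attr {
	uint16_t type;
	uint32_t len;		/* 32 bits so a payload larger than 64K can be described */
	const uint8_t *data;	/* NULL means "attribute not present" */
};

static uint16_t get_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Walk an attribute area laid out as (type:le16, len:le16, payload)* and
 * fill attrs[]. Returns 0 on success, -1 if the buffer is malformed.
 */
static int sketch_parse_attrs(const uint8_t *buf, size_t buf_len,
			      struct sketch_attr attrs[SKETCH_ATTR_MAX + 1])
{
	size_t pos = 0;

	memset(attrs, 0, sizeof(attrs[0]) * (SKETCH_ATTR_MAX + 1));
	while (pos < buf_len) {
		uint16_t type, len;

		if (buf_len - pos < 4)
			return -1;	/* not enough room for a header */
		type = get_le16(buf + pos);
		len = get_le16(buf + pos + 2);
		pos += 4;
		if (type == 0 || type > SKETCH_ATTR_MAX || buf_len - pos < len)
			return -1;	/* bad type or truncated payload */
		attrs[type].type = type;
		attrs[type].len = len;
		attrs[type].data = buf + pos;
		pos += len;
	}
	assert(pos == buf_len);
	return 0;
}
```

Because callers only look at the parsed struct, the on-wire header
layout can change later without touching them, which is the property
the TLV_GET macros rely on.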

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Boris Burkov <boris@bur.io>
---
 common/send-stream.c | 34 +++++++++++++++++++++++++---------
 1 file changed, 25 insertions(+), 9 deletions(-)

diff --git a/common/send-stream.c b/common/send-stream.c
index e9be922b..7d182238 100644
--- a/common/send-stream.c
+++ b/common/send-stream.c
@@ -24,13 +24,23 @@
 #include "crypto/crc32c.h"
 #include "common/utils.h"
 
+struct btrfs_send_attribute {
+	u16 tlv_type;
+	/*
+	 * Note: in btrfs_tlv_header, this is __le16, but we need 32 bits for
+	 * attributes with file data as of version 2 of the send stream format
+	 */
+	u32 tlv_len;
+	char *data;
+};
+
 struct btrfs_send_stream {
 	char read_buf[BTRFS_SEND_BUF_SIZE];
 	int fd;
 
 	int cmd;
 	struct btrfs_cmd_header *cmd_hdr;
-	struct btrfs_tlv_header *cmd_attrs[BTRFS_SEND_A_MAX + 1];
+	struct btrfs_send_attribute cmd_attrs[BTRFS_SEND_A_MAX + 1];
 	u32 version;
 
 	/*
@@ -152,6 +162,7 @@ static int read_cmd(struct btrfs_send_stream *sctx)
 		struct btrfs_tlv_header *tlv_hdr;
 		u16 tlv_type;
 		u16 tlv_len;
+		struct btrfs_send_attribute *send_attr;
 
 		tlv_hdr = (struct btrfs_tlv_header *)data;
 		tlv_type = le16_to_cpu(tlv_hdr->tlv_type);
@@ -164,10 +175,15 @@ static int read_cmd(struct btrfs_send_stream *sctx)
 			goto out;
 		}
 
-		sctx->cmd_attrs[tlv_type] = tlv_hdr;
+		send_attr = &sctx->cmd_attrs[tlv_type];
+		send_attr->tlv_type = tlv_type;
+		send_attr->tlv_len = tlv_len;
+		pos += sizeof(*tlv_hdr);
+		data += sizeof(*tlv_hdr);
 
-		data += sizeof(*tlv_hdr) + tlv_len;
-		pos += sizeof(*tlv_hdr) + tlv_len;
+		send_attr->data = data;
+		pos += send_attr->tlv_len;
+		data += send_attr->tlv_len;
 	}
 
 	sctx->cmd = cmd;
@@ -180,7 +196,7 @@ out:
 static int tlv_get(struct btrfs_send_stream *sctx, int attr, void **data, int *len)
 {
 	int ret;
-	struct btrfs_tlv_header *hdr;
+	struct btrfs_send_attribute *send_attr;
 
 	if (attr <= 0 || attr > BTRFS_SEND_A_MAX) {
 		error("invalid attribute requested, attr = %d", attr);
@@ -188,15 +204,15 @@ static int tlv_get(struct btrfs_send_stream *sctx, int attr, void **data, int *l
 		goto out;
 	}
 
-	hdr = sctx->cmd_attrs[attr];
-	if (!hdr) {
+	send_attr = &sctx->cmd_attrs[attr];
+	if (!send_attr->data) {
 		error("attribute %d requested but not present", attr);
 		ret = -ENOENT;
 		goto out;
 	}
 
-	*len = le16_to_cpu(hdr->tlv_len);
-	*data = hdr + 1;
+	*len = send_attr->tlv_len;
+	*data = send_attr->data;
 
 	ret = 0;
 
-- 
2.35.1


* [PATCH v14 02/10] btrfs-progs: receive: dynamically allocate sctx->read_buf
  2022-03-17 17:25 [PATCH v14 0/7] btrfs: add send/receive support for reading/writing compressed data Omar Sandoval
                   ` (7 preceding siblings ...)
  2022-03-17 17:25 ` [PATCH v14 01/10] btrfs-progs: receive: support v2 send stream larger tlv_len Omar Sandoval
@ 2022-03-17 17:25 ` Omar Sandoval
  2022-03-17 17:25 ` [PATCH v14 03/10] btrfs-progs: receive: support v2 send stream DATA tlv format Omar Sandoval
                   ` (8 subsequent siblings)
  17 siblings, 0 replies; 29+ messages in thread
From: Omar Sandoval @ 2022-03-17 17:25 UTC (permalink / raw)
  To: linux-btrfs; +Cc: kernel-team

From: Boris Burkov <boris@bur.io>

In send stream v2, write commands can now be an arbitrary size. For that
reason, we can no longer allocate a fixed array in sctx for read_cmd.
Instead, read_cmd dynamically allocates sctx->read_buf. To avoid
needless reallocations, we reuse read_buf between read_cmd calls by also
keeping track of the size of the allocated buffer in sctx->read_buf_sz.

We do the first allocation of the old default size at the start of
processing the stream, and we only reallocate if we encounter a command
that needs a larger buffer.
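
The grow-only buffer policy described above can be sketched in
isolation as follows. This is a minimal sketch, assuming a stream
context that tracks only the buffer and its size; sketch_stream and
sketch_reserve are hypothetical names, not part of btrfs-progs.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the two fields the patch adds to the stream context. */
struct sketch_stream {
	char *read_buf;
	size_t read_buf_sz;
};

/*
 * Make sure the buffer can hold at least `need` bytes. The buffer only ever
 * grows, so repeated small commands reuse the existing allocation.
 * Returns 0 on success, -1 if realloc() fails (the old buffer stays valid).
 */
static int sketch_reserve(struct sketch_stream *s, size_t need)
{
	char *tmp;

	if (s->read_buf_sz >= need)
		return 0;

	tmp = realloc(s->read_buf, need);
	if (!tmp)
		return -1;

	/* realloc() may move the block, so pointers into the old one are stale. */
	s->read_buf = tmp;
	s->read_buf_sz = need;
	return 0;
}
```

Any pointer derived from the old buffer, such as a command header
pointer, has to be recomputed after a successful grow, which is why
the patch resets cmd_hdr after the realloc.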

Signed-off-by: Boris Burkov <boris@bur.io>
---
 common/send-stream.c   | 56 ++++++++++++++++++++++++++++--------------
 kernel-shared/send.h   |  6 ++++-
 libbtrfs/send-stream.c |  2 +-
 3 files changed, 43 insertions(+), 21 deletions(-)

diff --git a/common/send-stream.c b/common/send-stream.c
index 7d182238..421cd1bb 100644
--- a/common/send-stream.c
+++ b/common/send-stream.c
@@ -35,11 +35,11 @@ struct btrfs_send_attribute {
 };
 
 struct btrfs_send_stream {
-	char read_buf[BTRFS_SEND_BUF_SIZE];
+	char *read_buf;
+	size_t read_buf_sz;
 	int fd;
 
 	int cmd;
-	struct btrfs_cmd_header *cmd_hdr;
 	struct btrfs_send_attribute cmd_attrs[BTRFS_SEND_A_MAX + 1];
 	u32 version;
 
@@ -111,11 +111,12 @@ static int read_cmd(struct btrfs_send_stream *sctx)
 	u32 pos;
 	u32 crc;
 	u32 crc2;
+	struct btrfs_cmd_header *cmd_hdr;
+	size_t buf_len;
 
 	memset(sctx->cmd_attrs, 0, sizeof(sctx->cmd_attrs));
 
-	ASSERT(sizeof(*sctx->cmd_hdr) <= sizeof(sctx->read_buf));
-	ret = read_buf(sctx, sctx->read_buf, sizeof(*sctx->cmd_hdr));
+	ret = read_buf(sctx, sctx->read_buf, sizeof(*cmd_hdr));
 	if (ret < 0)
 		goto out;
 	if (ret) {
@@ -124,18 +125,25 @@ static int read_cmd(struct btrfs_send_stream *sctx)
 		goto out;
 	}
 
-	sctx->cmd_hdr = (struct btrfs_cmd_header *)sctx->read_buf;
-	cmd = le16_to_cpu(sctx->cmd_hdr->cmd);
-	cmd_len = le32_to_cpu(sctx->cmd_hdr->len);
+	cmd_hdr = (struct btrfs_cmd_header *)sctx->read_buf;
+	cmd_len = le32_to_cpu(cmd_hdr->len);
+	cmd = le16_to_cpu(cmd_hdr->cmd);
+	buf_len = sizeof(*cmd_hdr) + cmd_len;
+	if (sctx->read_buf_sz < buf_len) {
+		void *new_read_buf;
 
-	if (cmd_len + sizeof(*sctx->cmd_hdr) >= sizeof(sctx->read_buf)) {
-		ret = -EINVAL;
-		error("command length %u too big for buffer %zu",
-				cmd_len, sizeof(sctx->read_buf));
-		goto out;
+		new_read_buf = realloc(sctx->read_buf, buf_len);
+		if (!new_read_buf) {
+			ret = -ENOMEM;
+			error("failed to reallocate read buffer for cmd");
+			goto out;
+		}
+		sctx->read_buf = new_read_buf;
+		sctx->read_buf_sz = buf_len;
+		/* We need to reset cmd_hdr after realloc of sctx->read_buf */
+		cmd_hdr = (struct btrfs_cmd_header *)sctx->read_buf;
 	}
-
-	data = sctx->read_buf + sizeof(*sctx->cmd_hdr);
+	data = sctx->read_buf + sizeof(*cmd_hdr);
 	ret = read_buf(sctx, data, cmd_len);
 	if (ret < 0)
 		goto out;
@@ -145,11 +153,12 @@ static int read_cmd(struct btrfs_send_stream *sctx)
 		goto out;
 	}
 
-	crc = le32_to_cpu(sctx->cmd_hdr->crc);
-	sctx->cmd_hdr->crc = 0;
+	crc = le32_to_cpu(cmd_hdr->crc);
+	/* in send, crc is computed with header crc = 0, replicate that */
+	cmd_hdr->crc = 0;
 
 	crc2 = crc32c(0, (unsigned char*)sctx->read_buf,
-			sizeof(*sctx->cmd_hdr) + cmd_len);
+			sizeof(*cmd_hdr) + cmd_len);
 
 	if (crc != crc2) {
 		ret = -EINVAL;
@@ -537,19 +546,28 @@ int btrfs_read_and_process_send_stream(int fd,
 		goto out;
 	}
 
+	sctx.read_buf = malloc(BTRFS_SEND_BUF_SIZE_V1);
+	if (!sctx.read_buf) {
+		ret = -ENOMEM;
+		error("unable to allocate send stream read buffer");
+		goto out;
+	}
+	sctx.read_buf_sz = BTRFS_SEND_BUF_SIZE_V1;
+
 	while (1) {
 		ret = read_and_process_cmd(&sctx);
 		if (ret < 0) {
 			last_err = ret;
 			errors++;
 			if (max_errors > 0 && errors >= max_errors)
-				goto out;
+				break;
 		} else if (ret > 0) {
 			if (!honor_end_cmd)
 				ret = 0;
-			goto out;
+			break;
 		}
 	}
+	free(sctx.read_buf);
 
 out:
 	if (last_err && !ret)
diff --git a/kernel-shared/send.h b/kernel-shared/send.h
index e73f09df..e986b6c8 100644
--- a/kernel-shared/send.h
+++ b/kernel-shared/send.h
@@ -33,7 +33,11 @@ extern "C" {
 #define BTRFS_SEND_STREAM_MAGIC "btrfs-stream"
 #define BTRFS_SEND_STREAM_VERSION 1
 
-#define BTRFS_SEND_BUF_SIZE  (64 * 1024)
+/*
+ * In send stream v1, no command is larger than 64k. In send stream v2, no limit
+ * should be assumed.
+ */
+#define BTRFS_SEND_BUF_SIZE_V1 (64 * 1024)
 #define BTRFS_SEND_READ_SIZE (1024 * 48)
 
 enum btrfs_tlv_type {
diff --git a/libbtrfs/send-stream.c b/libbtrfs/send-stream.c
index 2b21d846..39cbb3ed 100644
--- a/libbtrfs/send-stream.c
+++ b/libbtrfs/send-stream.c
@@ -22,7 +22,7 @@
 #include "crypto/crc32c.h"
 
 struct btrfs_send_stream {
-	char read_buf[BTRFS_SEND_BUF_SIZE];
+	char read_buf[BTRFS_SEND_BUF_SIZE_V1];
 	int fd;
 
 	int cmd;
-- 
2.35.1


* [PATCH v14 03/10] btrfs-progs: receive: support v2 send stream DATA tlv format
  2022-03-17 17:25 [PATCH v14 0/7] btrfs: add send/receive support for reading/writing compressed data Omar Sandoval
                   ` (8 preceding siblings ...)
  2022-03-17 17:25 ` [PATCH v14 02/10] btrfs-progs: receive: dynamically allocate sctx->read_buf Omar Sandoval
@ 2022-03-17 17:25 ` Omar Sandoval
  2022-03-17 17:25 ` [PATCH v14 04/10] btrfs-progs: receive: add send stream v2 cmds and attrs to send.h Omar Sandoval
                   ` (7 subsequent siblings)
  17 siblings, 0 replies; 29+ messages in thread
From: Omar Sandoval @ 2022-03-17 17:25 UTC (permalink / raw)
  To: linux-btrfs; +Cc: kernel-team

From: Boris Burkov <borisb@fb.com>

The new format privileges the BTRFS_SEND_A_DATA attribute by
guaranteeing it will always be the last attribute in any command that
needs it, and by implicitly encoding the data length as the difference
between the total command length in the command header and the sizes of
the rest of the attributes (and of course the tlv_type identifying the
DATA attribute). To parse the new stream, we read the tlv_type first. If
it is not DATA, we proceed normally; if it is DATA, we don't parse a
tlv_len but compute the length from what remains of the command.

In addition, we add some bounds checking when parsing each chunk of
data, as well as for the tlv_len itself.
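
The header decoding rule above can be sketched as a standalone helper.
This is an illustration under assumptions, not the patch's code:
sketch_attr_header is a hypothetical name, SKETCH_A_DATA reuses the
value 19 that send.h assigns to BTRFS_SEND_A_DATA, and the caller is
assumed to keep *pos <= cmd_len.

```c
#include <stdint.h>

#define SKETCH_A_DATA 19	/* same value as BTRFS_SEND_A_DATA in send.h */

static uint16_t sketch_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Decode one attribute header at buf[*pos]. On success, *pos points at the
 * payload and *len_out holds its length; the caller then skips *len_out bytes.
 * Returns 0 on success, -1 if the command is truncated.
 */
static int sketch_attr_header(const uint8_t *buf, uint32_t cmd_len,
			      uint32_t *pos, int version,
			      uint16_t *type_out, uint32_t *len_out)
{
	uint16_t type;
	uint32_t len;

	if (cmd_len - *pos < 2)
		return -1;
	type = sketch_le16(buf + *pos);
	*pos += 2;

	if (version >= 2 && type == SKETCH_A_DATA) {
		/* v2 DATA: no length field, the payload is the rest of the command. */
		len = cmd_len - *pos;
	} else {
		if (cmd_len - *pos < 2)
			return -1;
		len = sketch_le16(buf + *pos);
		*pos += 2;
	}
	if (cmd_len - *pos < len)
		return -1;

	*type_out = type;
	*len_out = len;
	return 0;
}
```

Every subtraction is checked against the remaining length before it is
used, mirroring the "send stream is truncated" checks in the patch.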

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Boris Burkov <boris@bur.io>
---
 common/send-stream.c | 36 ++++++++++++++++++++++++++----------
 1 file changed, 26 insertions(+), 10 deletions(-)

diff --git a/common/send-stream.c b/common/send-stream.c
index 421cd1bb..81a830d9 100644
--- a/common/send-stream.c
+++ b/common/send-stream.c
@@ -168,28 +168,44 @@ static int read_cmd(struct btrfs_send_stream *sctx)
 
 	pos = 0;
 	while (pos < cmd_len) {
-		struct btrfs_tlv_header *tlv_hdr;
 		u16 tlv_type;
-		u16 tlv_len;
 		struct btrfs_send_attribute *send_attr;
 
-		tlv_hdr = (struct btrfs_tlv_header *)data;
-		tlv_type = le16_to_cpu(tlv_hdr->tlv_type);
-		tlv_len = le16_to_cpu(tlv_hdr->tlv_len);
+		if (cmd_len - pos < sizeof(__le16)) {
+			error("send stream is truncated");
+			ret = -EINVAL;
+			goto out;
+		}
+		tlv_type = le16_to_cpu(*(__le16 *)data);
 
 		if (tlv_type == 0 || tlv_type > BTRFS_SEND_A_MAX) {
-			error("invalid tlv in cmd tlv_type = %hu, tlv_len = %hu",
-					tlv_type, tlv_len);
+			error("invalid tlv in cmd tlv_type = %hu", tlv_type);
 			ret = -EINVAL;
 			goto out;
 		}
 
 		send_attr = &sctx->cmd_attrs[tlv_type];
 		send_attr->tlv_type = tlv_type;
-		send_attr->tlv_len = tlv_len;
-		pos += sizeof(*tlv_hdr);
-		data += sizeof(*tlv_hdr);
 
+		pos += sizeof(tlv_type);
+		data += sizeof(tlv_type);
+		if (sctx->version >= 2 && tlv_type == BTRFS_SEND_A_DATA) {
+			send_attr->tlv_len = cmd_len - pos;
+		} else {
+			if (cmd_len - pos < sizeof(__le16)) {
+				error("send stream is truncated");
+				ret = -EINVAL;
+				goto out;
+			}
+			send_attr->tlv_len = le16_to_cpu(*(__le16 *)data);
+			pos += sizeof(__le16);
+			data += sizeof(__le16);
+		}
+		if (cmd_len - pos < send_attr->tlv_len) {
+			error("send stream is truncated");
+			ret = -EINVAL;
+			goto out;
+		}
 		send_attr->data = data;
 		pos += send_attr->tlv_len;
 		data += send_attr->tlv_len;
-- 
2.35.1


* [PATCH v14 04/10] btrfs-progs: receive: add send stream v2 cmds and attrs to send.h
  2022-03-17 17:25 [PATCH v14 0/7] btrfs: add send/receive support for reading/writing compressed data Omar Sandoval
                   ` (9 preceding siblings ...)
  2022-03-17 17:25 ` [PATCH v14 03/10] btrfs-progs: receive: support v2 send stream DATA tlv format Omar Sandoval
@ 2022-03-17 17:25 ` Omar Sandoval
  2022-03-17 17:25 ` [PATCH v14 05/10] btrfs-progs: receive: process encoded_write commands Omar Sandoval
                   ` (6 subsequent siblings)
  17 siblings, 0 replies; 29+ messages in thread
From: Omar Sandoval @ 2022-03-17 17:25 UTC (permalink / raw)
  To: linux-btrfs; +Cc: kernel-team

From: Omar Sandoval <osandov@fb.com>

Update our copy of send.h from the kernel. This adds the new commands
and attributes for v2 as well as explicit enum numbering.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Boris Burkov <boris@bur.io>
Signed-off-by: Omar Sandoval <osandov@fb.com>
---
 kernel-shared/send.h | 138 ++++++++++++++++++++++++++-----------------
 1 file changed, 85 insertions(+), 53 deletions(-)

diff --git a/kernel-shared/send.h b/kernel-shared/send.h
index e986b6c8..b902d054 100644
--- a/kernel-shared/send.h
+++ b/kernel-shared/send.h
@@ -38,7 +38,6 @@ extern "C" {
  * should be assumed.
  */
 #define BTRFS_SEND_BUF_SIZE_V1 (64 * 1024)
-#define BTRFS_SEND_READ_SIZE (1024 * 48)
 
 enum btrfs_tlv_type {
 	BTRFS_TLV_U8,
@@ -72,77 +71,110 @@ struct btrfs_tlv_header {
 
 /* commands */
 enum btrfs_send_cmd {
-	BTRFS_SEND_C_UNSPEC,
+	BTRFS_SEND_C_UNSPEC = 0,
 
-	BTRFS_SEND_C_SUBVOL,
-	BTRFS_SEND_C_SNAPSHOT,
+	/* Version 1 */
+	BTRFS_SEND_C_SUBVOL = 1,
+	BTRFS_SEND_C_SNAPSHOT = 2,
 
-	BTRFS_SEND_C_MKFILE,
-	BTRFS_SEND_C_MKDIR,
-	BTRFS_SEND_C_MKNOD,
-	BTRFS_SEND_C_MKFIFO,
-	BTRFS_SEND_C_MKSOCK,
-	BTRFS_SEND_C_SYMLINK,
+	BTRFS_SEND_C_MKFILE = 3,
+	BTRFS_SEND_C_MKDIR = 4,
+	BTRFS_SEND_C_MKNOD = 5,
+	BTRFS_SEND_C_MKFIFO = 6,
+	BTRFS_SEND_C_MKSOCK = 7,
+	BTRFS_SEND_C_SYMLINK = 8,
 
-	BTRFS_SEND_C_RENAME,
-	BTRFS_SEND_C_LINK,
-	BTRFS_SEND_C_UNLINK,
-	BTRFS_SEND_C_RMDIR,
+	BTRFS_SEND_C_RENAME = 9,
+	BTRFS_SEND_C_LINK = 10,
+	BTRFS_SEND_C_UNLINK = 11,
+	BTRFS_SEND_C_RMDIR = 12,
 
-	BTRFS_SEND_C_SET_XATTR,
-	BTRFS_SEND_C_REMOVE_XATTR,
+	BTRFS_SEND_C_SET_XATTR = 13,
+	BTRFS_SEND_C_REMOVE_XATTR = 14,
 
-	BTRFS_SEND_C_WRITE,
-	BTRFS_SEND_C_CLONE,
+	BTRFS_SEND_C_WRITE = 15,
+	BTRFS_SEND_C_CLONE = 16,
 
-	BTRFS_SEND_C_TRUNCATE,
-	BTRFS_SEND_C_CHMOD,
-	BTRFS_SEND_C_CHOWN,
-	BTRFS_SEND_C_UTIMES,
+	BTRFS_SEND_C_TRUNCATE = 17,
+	BTRFS_SEND_C_CHMOD = 18,
+	BTRFS_SEND_C_CHOWN = 19,
+	BTRFS_SEND_C_UTIMES = 20,
 
-	BTRFS_SEND_C_END,
-	BTRFS_SEND_C_UPDATE_EXTENT,
-	__BTRFS_SEND_C_MAX,
+	BTRFS_SEND_C_END = 21,
+	BTRFS_SEND_C_UPDATE_EXTENT = 22,
+	BTRFS_SEND_C_MAX_V1 = 22,
+
+	/* Version 2 */
+	BTRFS_SEND_C_FALLOCATE = 23,
+	BTRFS_SEND_C_SETFLAGS = 24,
+	BTRFS_SEND_C_ENCODED_WRITE = 25,
+	BTRFS_SEND_C_MAX_V2 = 25,
+
+	/* End */
+	BTRFS_SEND_C_MAX = 25,
 };
-#define BTRFS_SEND_C_MAX (__BTRFS_SEND_C_MAX - 1)
 
 /* attributes in send stream */
 enum {
-	BTRFS_SEND_A_UNSPEC,
+	BTRFS_SEND_A_UNSPEC = 0,
 
-	BTRFS_SEND_A_UUID,
-	BTRFS_SEND_A_CTRANSID,
+	/* Version 1 */
+	BTRFS_SEND_A_UUID = 1,
+	BTRFS_SEND_A_CTRANSID = 2,
 
-	BTRFS_SEND_A_INO,
-	BTRFS_SEND_A_SIZE,
-	BTRFS_SEND_A_MODE,
-	BTRFS_SEND_A_UID,
-	BTRFS_SEND_A_GID,
-	BTRFS_SEND_A_RDEV,
-	BTRFS_SEND_A_CTIME,
-	BTRFS_SEND_A_MTIME,
-	BTRFS_SEND_A_ATIME,
-	BTRFS_SEND_A_OTIME,
+	BTRFS_SEND_A_INO = 3,
+	BTRFS_SEND_A_SIZE = 4,
+	BTRFS_SEND_A_MODE = 5,
+	BTRFS_SEND_A_UID = 6,
+	BTRFS_SEND_A_GID = 7,
+	BTRFS_SEND_A_RDEV = 8,
+	BTRFS_SEND_A_CTIME = 9,
+	BTRFS_SEND_A_MTIME = 10,
+	BTRFS_SEND_A_ATIME = 11,
+	BTRFS_SEND_A_OTIME = 12,
 
-	BTRFS_SEND_A_XATTR_NAME,
-	BTRFS_SEND_A_XATTR_DATA,
+	BTRFS_SEND_A_XATTR_NAME = 13,
+	BTRFS_SEND_A_XATTR_DATA = 14,
 
-	BTRFS_SEND_A_PATH,
-	BTRFS_SEND_A_PATH_TO,
-	BTRFS_SEND_A_PATH_LINK,
+	BTRFS_SEND_A_PATH = 15,
+	BTRFS_SEND_A_PATH_TO = 16,
+	BTRFS_SEND_A_PATH_LINK = 17,
 
-	BTRFS_SEND_A_FILE_OFFSET,
-	BTRFS_SEND_A_DATA,
+	BTRFS_SEND_A_FILE_OFFSET = 18,
+	/*
+	 * As of send stream v2, this attribute is special: it must be the last
+	 * attribute in a command, its header contains only the type, and its
+	 * length is implicitly the remaining length of the command.
+	 */
+	BTRFS_SEND_A_DATA = 19,
 
-	BTRFS_SEND_A_CLONE_UUID,
-	BTRFS_SEND_A_CLONE_CTRANSID,
-	BTRFS_SEND_A_CLONE_PATH,
-	BTRFS_SEND_A_CLONE_OFFSET,
-	BTRFS_SEND_A_CLONE_LEN,
+	BTRFS_SEND_A_CLONE_UUID = 20,
+	BTRFS_SEND_A_CLONE_CTRANSID = 21,
+	BTRFS_SEND_A_CLONE_PATH = 22,
+	BTRFS_SEND_A_CLONE_OFFSET = 23,
+	BTRFS_SEND_A_CLONE_LEN = 24,
 
-	__BTRFS_SEND_A_MAX,
+	BTRFS_SEND_A_MAX_V1 = 24,
+
+	/* Version 2 */
+	BTRFS_SEND_A_FALLOCATE_MODE = 25,
+
+	BTRFS_SEND_A_SETFLAGS_FLAGS = 26,
+
+	BTRFS_SEND_A_UNENCODED_FILE_LEN = 27,
+	BTRFS_SEND_A_UNENCODED_LEN = 28,
+	BTRFS_SEND_A_UNENCODED_OFFSET = 29,
+	/*
+	 * COMPRESSION and ENCRYPTION default to NONE (0) if omitted from
+	 * BTRFS_SEND_C_ENCODED_WRITE.
+	 */
+	BTRFS_SEND_A_COMPRESSION = 30,
+	BTRFS_SEND_A_ENCRYPTION = 31,
+	BTRFS_SEND_A_MAX_V2 = 31,
+
+	/* End */
+	BTRFS_SEND_A_MAX = 31,
 };
-#define BTRFS_SEND_A_MAX (__BTRFS_SEND_A_MAX - 1)
 
 #ifdef __cplusplus
 }
-- 
2.35.1


* [PATCH v14 05/10] btrfs-progs: receive: process encoded_write commands
  2022-03-17 17:25 [PATCH v14 0/7] btrfs: add send/receive support for reading/writing compressed data Omar Sandoval
                   ` (10 preceding siblings ...)
  2022-03-17 17:25 ` [PATCH v14 04/10] btrfs-progs: receive: add send stream v2 cmds and attrs to send.h Omar Sandoval
@ 2022-03-17 17:25 ` Omar Sandoval
  2022-03-17 17:25 ` [PATCH v14 06/10] btrfs-progs: receive: encoded_write fallback to explicit decode and write Omar Sandoval
                   ` (5 subsequent siblings)
  17 siblings, 0 replies; 29+ messages in thread
From: Omar Sandoval @ 2022-03-17 17:25 UTC (permalink / raw)
  To: linux-btrfs; +Cc: kernel-team

From: Boris Burkov <borisb@fb.com>

Add a new btrfs_send_op and support for both dumping and proper receive
processing which does actual encoded writes.

Encoded writes are only allowed on a file descriptor opened with an
extra flag that allows encoded writes, so we also add support for this
flag when opening or reusing a file for writing.

Signed-off-by: Boris Burkov <boris@bur.io>
---
 cmds/receive-dump.c  |  16 +++++-
 cmds/receive.c       |  48 ++++++++++++++++
 common/send-stream.c |  29 ++++++++++
 common/send-stream.h |   4 ++
 ioctl.h              | 132 +++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 228 insertions(+), 1 deletion(-)

diff --git a/cmds/receive-dump.c b/cmds/receive-dump.c
index 00ad4fd1..83701b62 100644
--- a/cmds/receive-dump.c
+++ b/cmds/receive-dump.c
@@ -318,6 +318,19 @@ static int print_update_extent(const char *path, u64 offset, u64 len,
 			  offset, len);
 }
 
+static int print_encoded_write(const char *path, const void *data, u64 offset,
+			       u64 len, u64 unencoded_file_len,
+			       u64 unencoded_len, u64 unencoded_offset,
+			       u32 compression, u32 encryption, void *user)
+{
+	return PRINT_DUMP(user, path, "encoded_write",
+			  "offset=%llu len=%llu, unencoded_file_len=%llu, "
+			  "unencoded_len=%llu, unencoded_offset=%llu, "
+			  "compression=%u, encryption=%u",
+			  offset, len, unencoded_file_len, unencoded_len,
+			  unencoded_offset, compression, encryption);
+}
+
 struct btrfs_send_ops btrfs_print_send_ops = {
 	.subvol = print_subvol,
 	.snapshot = print_snapshot,
@@ -339,5 +352,6 @@ struct btrfs_send_ops btrfs_print_send_ops = {
 	.chmod = print_chmod,
 	.chown = print_chown,
 	.utimes = print_utimes,
-	.update_extent = print_update_extent
+	.update_extent = print_update_extent,
+	.encoded_write = print_encoded_write,
 };
diff --git a/cmds/receive.c b/cmds/receive.c
index d106e554..8226ca32 100644
--- a/cmds/receive.c
+++ b/cmds/receive.c
@@ -29,12 +29,14 @@
 #include <assert.h>
 #include <getopt.h>
 #include <limits.h>
+#include <errno.h>
 
 #include <sys/stat.h>
 #include <sys/types.h>
 #include <sys/ioctl.h>
 #include <sys/time.h>
 #include <sys/types.h>
+#include <sys/uio.h>
 #include <sys/xattr.h>
 #include <uuid/uuid.h>
 
@@ -49,6 +51,7 @@
 #include "cmds/receive-dump.h"
 #include "common/help.h"
 #include "common/path-utils.h"
+#include "stubs.h"
 
 struct btrfs_receive
 {
@@ -982,6 +985,50 @@ static int process_update_extent(const char *path, u64 offset, u64 len,
 	return 0;
 }
 
+static int process_encoded_write(const char *path, const void *data, u64 offset,
+				 u64 len, u64 unencoded_file_len,
+				 u64 unencoded_len, u64 unencoded_offset,
+				 u32 compression, u32 encryption, void *user)
+{
+	int ret;
+	struct btrfs_receive *rctx = user;
+	char full_path[PATH_MAX];
+	struct iovec iov = { (char *)data, len };
+	struct btrfs_ioctl_encoded_io_args encoded = {
+		.iov = &iov,
+		.iovcnt = 1,
+		.offset = offset,
+		.len = unencoded_file_len,
+		.unencoded_len = unencoded_len,
+		.unencoded_offset = unencoded_offset,
+		.compression = compression,
+		.encryption = encryption,
+	};
+
+	if (encryption) {
+		error("encoded_write: encryption not supported");
+		return -EOPNOTSUPP;
+	}
+
+	ret = path_cat_out(full_path, rctx->full_subvol_path, path);
+	if (ret < 0) {
+		error("encoded_write: path invalid: %s", path);
+		return ret;
+	}
+
+	ret = open_inode_for_write(rctx, full_path);
+	if (ret < 0)
+		return ret;
+
+	ret = ioctl(rctx->write_fd, BTRFS_IOC_ENCODED_WRITE, &encoded);
+	if (ret < 0) {
+		ret = -errno;
+		error("encoded_write: writing to %s failed: %m", path);
+		return ret;
+	}
+	return 0;
+}
+
 static struct btrfs_send_ops send_ops = {
 	.subvol = process_subvol,
 	.snapshot = process_snapshot,
@@ -1004,6 +1051,7 @@ static struct btrfs_send_ops send_ops = {
 	.chown = process_chown,
 	.utimes = process_utimes,
 	.update_extent = process_update_extent,
+	.encoded_write = process_encoded_write,
 };
 
 static int do_receive(struct btrfs_receive *rctx, const char *tomnt,
diff --git a/common/send-stream.c b/common/send-stream.c
index 81a830d9..ce7c40f5 100644
--- a/common/send-stream.c
+++ b/common/send-stream.c
@@ -357,6 +357,8 @@ static int read_and_process_cmd(struct btrfs_send_stream *sctx)
 	struct timespec mt;
 	u8 uuid[BTRFS_UUID_SIZE];
 	u8 clone_uuid[BTRFS_UUID_SIZE];
+	u32 compression;
+	u32 encryption;
 	u64 tmp;
 	u64 tmp2;
 	u64 ctransid;
@@ -366,6 +368,9 @@ static int read_and_process_cmd(struct btrfs_send_stream *sctx)
 	u64 clone_offset;
 	u64 offset;
 	u64 ino;
+	u64 unencoded_file_len;
+	u64 unencoded_len;
+	u64 unencoded_offset;
 	int len;
 	int xattr_len;
 
@@ -452,6 +457,30 @@ static int read_and_process_cmd(struct btrfs_send_stream *sctx)
 		TLV_GET(sctx, BTRFS_SEND_A_DATA, &data, &len);
 		ret = sctx->ops->write(path, data, offset, len, sctx->user);
 		break;
+	case BTRFS_SEND_C_ENCODED_WRITE:
+		TLV_GET_STRING(sctx, BTRFS_SEND_A_PATH, &path);
+		TLV_GET_U64(sctx, BTRFS_SEND_A_FILE_OFFSET, &offset);
+		TLV_GET_U64(sctx, BTRFS_SEND_A_UNENCODED_FILE_LEN,
+			    &unencoded_file_len);
+		TLV_GET_U64(sctx, BTRFS_SEND_A_UNENCODED_LEN, &unencoded_len);
+		TLV_GET_U64(sctx, BTRFS_SEND_A_UNENCODED_OFFSET,
+			    &unencoded_offset);
+		/* Compression and encryption default to none if omitted. */
+		if (sctx->cmd_attrs[BTRFS_SEND_A_COMPRESSION].data)
+			TLV_GET_U32(sctx, BTRFS_SEND_A_COMPRESSION, &compression);
+		else
+			compression = BTRFS_ENCODED_IO_COMPRESSION_NONE;
+		if (sctx->cmd_attrs[BTRFS_SEND_A_ENCRYPTION].data)
+			TLV_GET_U32(sctx, BTRFS_SEND_A_ENCRYPTION, &encryption);
+		else
+			encryption = BTRFS_ENCODED_IO_ENCRYPTION_NONE;
+		TLV_GET(sctx, BTRFS_SEND_A_DATA, &data, &len);
+		ret = sctx->ops->encoded_write(path, data, offset, len,
+					       unencoded_file_len,
+					       unencoded_len, unencoded_offset,
+					       compression, encryption,
+					       sctx->user);
+		break;
 	case BTRFS_SEND_C_CLONE:
 		TLV_GET_STRING(sctx, BTRFS_SEND_A_PATH, &path);
 		TLV_GET_U64(sctx, BTRFS_SEND_A_FILE_OFFSET, &offset);
diff --git a/common/send-stream.h b/common/send-stream.h
index 2de51eac..44abbc9d 100644
--- a/common/send-stream.h
+++ b/common/send-stream.h
@@ -53,6 +53,10 @@ struct btrfs_send_ops {
 		      struct timespec *mt, struct timespec *ct,
 		      void *user);
 	int (*update_extent)(const char *path, u64 offset, u64 len, void *user);
+	int (*encoded_write)(const char *path, const void *data, u64 offset,
+			     u64 len, u64 unencoded_file_len, u64 unencoded_len,
+			     u64 unencoded_offset, u32 compression,
+			     u32 encryption, void *user);
 };
 
 int btrfs_read_and_process_send_stream(int fd,
diff --git a/ioctl.h b/ioctl.h
index 368a87b2..8adf63c2 100644
--- a/ioctl.h
+++ b/ioctl.h
@@ -777,6 +777,134 @@ struct btrfs_ioctl_get_subvol_rootref_args {
 };
 BUILD_ASSERT(sizeof(struct btrfs_ioctl_get_subvol_rootref_args) == 4096);
 
+/*
+ * Data and metadata for an encoded read or write.
+ *
+ * Encoded I/O bypasses any encoding automatically done by the filesystem (e.g.,
+ * compression). This can be used to read the compressed contents of a file or
+ * write pre-compressed data directly to a file.
+ *
+ * BTRFS_IOC_ENCODED_READ and BTRFS_IOC_ENCODED_WRITE are essentially
+ * preadv/pwritev with additional metadata about how the data is encoded and the
+ * size of the unencoded data.
+ *
+ * BTRFS_IOC_ENCODED_READ fills the given iovecs with the encoded data, fills
+ * the metadata fields, and returns the size of the encoded data. It reads one
+ * extent per call. It can also read data which is not encoded.
+ *
+ * BTRFS_IOC_ENCODED_WRITE uses the metadata fields, writes the encoded data
+ * from the iovecs, and returns the size of the encoded data. Note that the
+ * encoded data is not validated when it is written; if it is not valid (e.g.,
+ * it cannot be decompressed), then a subsequent read may return an error.
+ *
+ * Since the filesystem page cache contains decoded data, encoded I/O bypasses
+ * the page cache. Encoded I/O requires CAP_SYS_ADMIN.
+ */
+struct btrfs_ioctl_encoded_io_args {
+	/* Input parameters for both reads and writes. */
+
+	/*
+	 * iovecs containing encoded data.
+	 *
+	 * For reads, if the size of the encoded data is larger than the sum of
+	 * iov[n].iov_len for 0 <= n < iovcnt, then the ioctl fails with
+	 * ENOBUFS.
+	 *
+	 * For writes, the size of the encoded data is the sum of iov[n].iov_len
+	 * for 0 <= n < iovcnt. This must be less than 128 KiB (this limit may
+	 * increase in the future). This must also be less than or equal to
+	 * unencoded_len.
+	 */
+	const struct iovec __user *iov;
+	/* Number of iovecs. */
+	unsigned long iovcnt;
+	/*
+	 * Offset in file.
+	 *
+	 * For writes, must be aligned to the sector size of the filesystem.
+	 */
+	__s64 offset;
+	/* Currently must be zero. */
+	__u64 flags;
+
+	/*
+	 * For reads, the following members are output parameters that will
+	 * contain the returned metadata for the encoded data.
+	 * For writes, the following members must be set to the metadata for the
+	 * encoded data.
+	 */
+
+	/*
+	 * Length of the data in the file.
+	 *
+	 * Must be less than or equal to unencoded_len - unencoded_offset. For
+	 * writes, must be aligned to the sector size of the filesystem unless
+	 * the data ends at or beyond the current end of the file.
+	 */
+	__u64 len;
+	/*
+	 * Length of the unencoded (i.e., decrypted and decompressed) data.
+	 *
+	 * For writes, must be no more than 128 KiB (this limit may increase in
+	 * the future). If the unencoded data is actually longer than
+	 * unencoded_len, then it is truncated; if it is shorter, then it is
+	 * extended with zeroes.
+	 */
+	__u64 unencoded_len;
+	/*
+	 * Offset from the first byte of the unencoded data to the first byte of
+	 * logical data in the file.
+	 *
+	 * Must be less than unencoded_len.
+	 */
+	__u64 unencoded_offset;
+	/*
+	 * BTRFS_ENCODED_IO_COMPRESSION_* type.
+	 *
+	 * For writes, must not be BTRFS_ENCODED_IO_COMPRESSION_NONE.
+	 */
+	__u32 compression;
+	/* Currently always BTRFS_ENCODED_IO_ENCRYPTION_NONE. */
+	__u32 encryption;
+	/*
+	 * Reserved for future expansion.
+	 *
+	 * For reads, always returned as zero. Users should check for non-zero
+	 * bytes. If there are any, then the kernel has a newer version of this
+	 * structure with additional information that the user definition is
+	 * missing.
+	 *
+	 * For writes, must be zeroed.
+	 */
+	__u8 reserved[64];
+};
+
+/* Data is not compressed. */
+#define BTRFS_ENCODED_IO_COMPRESSION_NONE 0
+/* Data is compressed as a single zlib stream. */
+#define BTRFS_ENCODED_IO_COMPRESSION_ZLIB 1
+/*
+ * Data is compressed as a single zstd frame with the windowLog compression
+ * parameter set to no more than 17.
+ */
+#define BTRFS_ENCODED_IO_COMPRESSION_ZSTD 2
+/*
+ * Data is compressed sector by sector (using the sector size indicated by the
+ * name of the constant) with LZO1X and wrapped in the format documented in
+ * fs/btrfs/lzo.c. For writes, the compression sector size must match the
+ * filesystem sector size.
+ */
+#define BTRFS_ENCODED_IO_COMPRESSION_LZO_4K 3
+#define BTRFS_ENCODED_IO_COMPRESSION_LZO_8K 4
+#define BTRFS_ENCODED_IO_COMPRESSION_LZO_16K 5
+#define BTRFS_ENCODED_IO_COMPRESSION_LZO_32K 6
+#define BTRFS_ENCODED_IO_COMPRESSION_LZO_64K 7
+#define BTRFS_ENCODED_IO_COMPRESSION_TYPES 8
+
+/* Data is not encrypted. */
+#define BTRFS_ENCODED_IO_ENCRYPTION_NONE 0
+#define BTRFS_ENCODED_IO_ENCRYPTION_TYPES 1
+
 /* Error codes as returned by the kernel */
 enum btrfs_err_code {
 	notused,
@@ -951,6 +1079,10 @@ static inline char *btrfs_err_str(enum btrfs_err_code err_code)
 				struct btrfs_ioctl_ino_lookup_user_args)
 #define BTRFS_IOC_SNAP_DESTROY_V2 _IOW(BTRFS_IOCTL_MAGIC, 63, \
 				   struct btrfs_ioctl_vol_args_v2)
+#define BTRFS_IOC_ENCODED_READ _IOR(BTRFS_IOCTL_MAGIC, 64, \
+				    struct btrfs_ioctl_encoded_io_args)
+#define BTRFS_IOC_ENCODED_WRITE _IOW(BTRFS_IOCTL_MAGIC, 64, \
+				     struct btrfs_ioctl_encoded_io_args)
 
 #ifdef __cplusplus
 }
-- 
2.35.1



* [PATCH v14 06/10] btrfs-progs: receive: encoded_write fallback to explicit decode and write
  2022-03-17 17:25 [PATCH v14 0/7] btrfs: add send/receive support for reading/writing compressed data Omar Sandoval
                   ` (11 preceding siblings ...)
  2022-03-17 17:25 ` [PATCH v14 05/10] btrfs-progs: receive: process encoded_write commands Omar Sandoval
@ 2022-03-17 17:25 ` Omar Sandoval
  2022-03-17 17:25 ` [PATCH v14 07/10] btrfs-progs: receive: process fallocate commands Omar Sandoval
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 29+ messages in thread
From: Omar Sandoval @ 2022-03-17 17:25 UTC (permalink / raw)
  To: linux-btrfs; +Cc: kernel-team

From: Boris Burkov <boris@bur.io>

An encoded_write can fail if the file system it is being applied to does
not support encoded writes or if it can't find enough contiguous space
to accommodate the encoded extent. In those cases, we can likely still
process an encoded_write by explicitly decoding the data and doing a
normal write.
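
As a rough illustration of the fallback decision described above, here is a
minimal sketch. The helper name encoded_write_should_fall_back() is
hypothetical and not part of this patch; the errno values mirror the check
added to process_encoded_write(), and the per-errno comments are my reading
of the likely causes rather than an exhaustive list.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not part of the patch): decide whether a failed
 * BTRFS_IOC_ENCODED_WRITE should be retried by decompressing in userspace
 * and doing a normal write. Any other errno is treated as a hard failure.
 */
bool encoded_write_should_fall_back(int err)
{
	switch (err) {
	case ENOSPC:	/* e.g. no contiguous space for the encoded extent */
	case ENOTTY:	/* e.g. the ioctl is not supported here */
	case EINVAL:	/* e.g. arguments this filesystem cannot accept */
		return true;
	default:
		return false;
	}
}
```

A caller would pass errno right after the ioctl fails and only report an
error when the helper returns false.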

Add the necessary fallback path for decoding data compressed with zlib,
lzo, or zstd. zlib and zstd have reusable decoding context data
structures which we cache in the receive context so that we don't have
to recreate them on every encoded_write.
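
For the lzo case, the sector size is implied by the compression type rather
than sent separately. A small sketch of that mapping follows; the helper
name lzo_sector_size() is hypothetical, the two constants are copied from
the ioctl.h hunk earlier in this series, and returning 0 for non-LZO types
is an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the ioctl.h hunk in this series. */
#define BTRFS_ENCODED_IO_COMPRESSION_LZO_4K 3
#define BTRFS_ENCODED_IO_COMPRESSION_LZO_64K 7

/*
 * Hypothetical helper: return the LZO sector size in bytes for an LZO_*
 * compression type, or 0 if the type is not an LZO variant.
 */
uint32_t lzo_sector_size(uint32_t compression)
{
	uint32_t shift;

	if (compression < BTRFS_ENCODED_IO_COMPRESSION_LZO_4K ||
	    compression > BTRFS_ENCODED_IO_COMPRESSION_LZO_64K)
		return 0;
	/* LZO_4K maps to 1 << 12; each later constant doubles the size. */
	shift = compression - BTRFS_ENCODED_IO_COMPRESSION_LZO_4K + 12;
	assert(shift >= 12 && shift <= 16);
	return UINT32_C(1) << shift;
}
```

This is the same arithmetic decompress_and_write() uses to compute
sector_shift, with an explicit range check added in front of it.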

Finally, add a --force-decompress command line flag, which causes
receive to always use the fallback path rather than first attempting the
encoded write.

Signed-off-by: Boris Burkov <boris@bur.io>
---
 Documentation/btrfs-receive.rst |   5 +
 cmds/receive.c                  | 261 +++++++++++++++++++++++++++++++-
 2 files changed, 259 insertions(+), 7 deletions(-)

diff --git a/Documentation/btrfs-receive.rst b/Documentation/btrfs-receive.rst
index 86ffdcc6..b9a3cad6 100644
--- a/Documentation/btrfs-receive.rst
+++ b/Documentation/btrfs-receive.rst
@@ -57,6 +57,11 @@ A subvolume is made read-only after the receiving process finishes successfully
         If */proc* is not accessible, eg. in a chroot environment, use this option to
         tell us where this filesystem is mounted.
 
+--force-decompress
+        if the stream contains compressed data (see *--compressed-data* in
+        ``btrfs-send(8)``), always decompress it instead of writing it with
+        encoded I/O
+
 --dump
         dump the stream metadata, one line per operation
 
diff --git a/cmds/receive.c b/cmds/receive.c
index 8226ca32..5fd939ce 100644
--- a/cmds/receive.c
+++ b/cmds/receive.c
@@ -40,6 +40,10 @@
 #include <sys/xattr.h>
 #include <uuid/uuid.h>
 
+#include <lzo/lzo1x.h>
+#include <zlib.h>
+#include <zstd.h>
+
 #include "kernel-shared/ctree.h"
 #include "ioctl.h"
 #include "cmds/commands.h"
@@ -75,6 +79,12 @@ struct btrfs_receive
 	char cur_subvol_path[PATH_MAX];
 
 	int honor_end_cmd;
+
+	bool force_decompress;
+
+	/* Reuse stream objects for encoded_write decompression fallback */
+	ZSTD_DStream *zstd_dstream;
+	z_stream *zlib_stream;
 };
 
 static int finish_subvol(struct btrfs_receive *rctx)
@@ -985,6 +995,219 @@ static int process_update_extent(const char *path, u64 offset, u64 len,
 	return 0;
 }
 
+static int decompress_zlib(struct btrfs_receive *rctx, const char *encoded_data,
+			   u64 encoded_len, char *unencoded_data,
+			   u64 unencoded_len)
+{
+	bool init = false;
+	int ret;
+
+	if (!rctx->zlib_stream) {
+		init = true;
+		rctx->zlib_stream = malloc(sizeof(z_stream));
+		if (!rctx->zlib_stream) {
+			error("failed to allocate zlib stream %m");
+			return -ENOMEM;
+		}
+	}
+	rctx->zlib_stream->next_in = (void *)encoded_data;
+	rctx->zlib_stream->avail_in = encoded_len;
+	rctx->zlib_stream->next_out = (void *)unencoded_data;
+	rctx->zlib_stream->avail_out = unencoded_len;
+
+	if (init) {
+		rctx->zlib_stream->zalloc = Z_NULL;
+		rctx->zlib_stream->zfree = Z_NULL;
+		rctx->zlib_stream->opaque = Z_NULL;
+		ret = inflateInit(rctx->zlib_stream);
+	} else {
+		ret = inflateReset(rctx->zlib_stream);
+	}
+	if (ret != Z_OK) {
+		error("zlib inflate init failed: %d", ret);
+		return -EIO;
+	}
+
+	while (rctx->zlib_stream->avail_in > 0 &&
+	       rctx->zlib_stream->avail_out > 0) {
+		ret = inflate(rctx->zlib_stream, Z_FINISH);
+		if (ret == Z_STREAM_END) {
+			break;
+		} else if (ret != Z_OK) {
+			error("zlib inflate failed: %d", ret);
+			return -EIO;
+		}
+	}
+	return 0;
+}
+
+static int decompress_zstd(struct btrfs_receive *rctx, const char *encoded_buf,
+			   u64 encoded_len, char *unencoded_buf,
+			   u64 unencoded_len)
+{
+	ZSTD_inBuffer in_buf = {
+		.src = encoded_buf,
+		.size = encoded_len
+	};
+	ZSTD_outBuffer out_buf = {
+		.dst = unencoded_buf,
+		.size = unencoded_len
+	};
+	size_t ret;
+
+	if (!rctx->zstd_dstream) {
+		rctx->zstd_dstream = ZSTD_createDStream();
+		if (!rctx->zstd_dstream) {
+			error("failed to create zstd dstream");
+			return -ENOMEM;
+		}
+	}
+	ret = ZSTD_initDStream(rctx->zstd_dstream);
+	if (ZSTD_isError(ret)) {
+		error("failed to init zstd stream: %s", ZSTD_getErrorName(ret));
+		return -EIO;
+	}
+	while (in_buf.pos < in_buf.size && out_buf.pos < out_buf.size) {
+		ret = ZSTD_decompressStream(rctx->zstd_dstream, &out_buf, &in_buf);
+		if (ret == 0) {
+			break;
+		} else if (ZSTD_isError(ret)) {
+			error("failed to decompress zstd stream: %s",
+			      ZSTD_getErrorName(ret));
+			return -EIO;
+		}
+	}
+	return 0;
+}
+
+static int decompress_lzo(const char *encoded_data, u64 encoded_len,
+			  char *unencoded_data, u64 unencoded_len,
+			  unsigned int sector_size)
+{
+	uint32_t total_len;
+	size_t in_pos, out_pos;
+
+	if (encoded_len < 4) {
+		error("lzo header is truncated");
+		return -EIO;
+	}
+	memcpy(&total_len, encoded_data, 4);
+	total_len = le32toh(total_len);
+	if (total_len > encoded_len) {
+		error("lzo header is invalid");
+		return -EIO;
+	}
+
+	in_pos = 4;
+	out_pos = 0;
+	while (in_pos < total_len && out_pos < unencoded_len) {
+		size_t sector_remaining;
+		uint32_t src_len;
+		lzo_uint dst_len;
+		int ret;
+
+		sector_remaining = -in_pos % sector_size;
+		if (sector_remaining < 4) {
+			if (total_len - in_pos <= sector_remaining)
+				break;
+			in_pos += sector_remaining;
+		}
+
+		if (total_len - in_pos < 4) {
+			error("lzo segment header is truncated");
+			return -EIO;
+		}
+
+		memcpy(&src_len, encoded_data + in_pos, 4);
+		src_len = le32toh(src_len);
+		in_pos += 4;
+		if (src_len > total_len - in_pos) {
+			error("lzo segment header is invalid");
+			return -EIO;
+		}
+
+		dst_len = sector_size;
+		ret = lzo1x_decompress_safe((void *)(encoded_data + in_pos),
+					    src_len,
+					    (void *)(unencoded_data + out_pos),
+					    &dst_len, NULL);
+		if (ret != LZO_E_OK) {
+			error("lzo1x_decompress_safe failed: %d", ret);
+			return -EIO;
+		}
+
+		in_pos += src_len;
+		out_pos += dst_len;
+	}
+	return 0;
+}
+
+static int decompress_and_write(struct btrfs_receive *rctx,
+				const char *encoded_data, u64 offset,
+				u64 encoded_len, u64 unencoded_file_len,
+				u64 unencoded_len, u64 unencoded_offset,
+				u32 compression)
+{
+	int ret = 0;
+	size_t pos;
+	ssize_t w;
+	char *unencoded_data;
+	int sector_shift;
+
+	unencoded_data = calloc(unencoded_len, 1);
+	if (!unencoded_data) {
+		error("allocating space for unencoded data failed: %m");
+		return -errno;
+	}
+
+	switch (compression) {
+	case BTRFS_ENCODED_IO_COMPRESSION_ZLIB:
+		ret = decompress_zlib(rctx, encoded_data, encoded_len,
+				      unencoded_data, unencoded_len);
+		if (ret)
+			goto out;
+		break;
+	case BTRFS_ENCODED_IO_COMPRESSION_ZSTD:
+		ret = decompress_zstd(rctx, encoded_data, encoded_len,
+				      unencoded_data, unencoded_len);
+		if (ret)
+			goto out;
+		break;
+	case BTRFS_ENCODED_IO_COMPRESSION_LZO_4K:
+	case BTRFS_ENCODED_IO_COMPRESSION_LZO_8K:
+	case BTRFS_ENCODED_IO_COMPRESSION_LZO_16K:
+	case BTRFS_ENCODED_IO_COMPRESSION_LZO_32K:
+	case BTRFS_ENCODED_IO_COMPRESSION_LZO_64K:
+		sector_shift =
+			compression - BTRFS_ENCODED_IO_COMPRESSION_LZO_4K + 12;
+		ret = decompress_lzo(encoded_data, encoded_len, unencoded_data,
+				     unencoded_len, 1U << sector_shift);
+		if (ret)
+			goto out;
+		break;
+	default:
+		error("unknown compression: %d", compression);
+		ret = -EOPNOTSUPP;
+		goto out;
+	}
+
+	pos = unencoded_offset;
+	while (pos < unencoded_file_len) {
+		w = pwrite(rctx->write_fd, unencoded_data + pos,
+			   unencoded_file_len - pos, offset);
+		if (w < 0) {
+			ret = -errno;
+			error("writing unencoded data failed: %m");
+			goto out;
+		}
+		pos += w;
+		offset += w;
+	}
+out:
+	free(unencoded_data);
+	return ret;
+}
+
 static int process_encoded_write(const char *path, const void *data, u64 offset,
 				 u64 len, u64 unencoded_file_len,
 				 u64 unencoded_len, u64 unencoded_offset,
@@ -1020,13 +1243,21 @@ static int process_encoded_write(const char *path, const void *data, u64 offset,
 	if (ret < 0)
 		return ret;
 
-	ret = ioctl(rctx->write_fd, BTRFS_IOC_ENCODED_WRITE, &encoded);
-	if (ret < 0) {
-		ret = -errno;
-		error("encoded_write: writing to %s failed: %m", path);
-		return ret;
+	if (!rctx->force_decompress) {
+		ret = ioctl(rctx->write_fd, BTRFS_IOC_ENCODED_WRITE, &encoded);
+		if (ret >= 0)
+			return 0;
+		/* Fall back for these errors, fail hard for anything else. */
+		if (errno != ENOSPC && errno != ENOTTY && errno != EINVAL) {
+			ret = -errno;
+			error("encoded_write: writing to %s failed: %m", path);
+			return ret;
+		}
 	}
-	return 0;
+
+	return decompress_and_write(rctx, data, offset, len, unencoded_file_len,
+				    unencoded_len, unencoded_offset,
+				    compression);
 }
 
 static struct btrfs_send_ops send_ops = {
@@ -1204,6 +1435,12 @@ out:
 		close(rctx->dest_dir_fd);
 		rctx->dest_dir_fd = -1;
 	}
+	if (rctx->zstd_dstream)
+		ZSTD_freeDStream(rctx->zstd_dstream);
+	if (rctx->zlib_stream) {
+		inflateEnd(rctx->zlib_stream);
+		free(rctx->zlib_stream);
+	}
 
 	return ret;
 }
@@ -1234,6 +1471,9 @@ static const char * const cmd_receive_usage[] = {
 	"-m ROOTMOUNT     the root mount point of the destination filesystem.",
 	"                 If /proc is not accessible, use this to tell us where",
 	"                 this file system is mounted.",
+	"--force-decompress",
+	"                 if the stream contains compressed data, always",
+	"                 decompress it instead of writing it with encoded I/O",
 	"--dump           dump stream metadata, one line per operation,",
 	"                 does not require the MOUNT parameter",
 	"-v               deprecated, alias for global -v option",
@@ -1277,12 +1517,16 @@ static int cmd_receive(const struct cmd_struct *cmd, int argc, char **argv)
 	optind = 0;
 	while (1) {
 		int c;
-		enum { GETOPT_VAL_DUMP = 257 };
+		enum {
+			GETOPT_VAL_DUMP = 257,
+			GETOPT_VAL_FORCE_DECOMPRESS,
+		};
 		static const struct option long_opts[] = {
 			{ "max-errors", required_argument, NULL, 'E' },
 			{ "chroot", no_argument, NULL, 'C' },
 			{ "dump", no_argument, NULL, GETOPT_VAL_DUMP },
 			{ "quiet", no_argument, NULL, 'q' },
+			{ "force-decompress", no_argument, NULL, GETOPT_VAL_FORCE_DECOMPRESS },
 			{ NULL, 0, NULL, 0 }
 		};
 
@@ -1325,6 +1569,9 @@ static int cmd_receive(const struct cmd_struct *cmd, int argc, char **argv)
 		case GETOPT_VAL_DUMP:
 			dump = 1;
 			break;
+		case GETOPT_VAL_FORCE_DECOMPRESS:
+			rctx.force_decompress = true;
+			break;
 		default:
 			usage_unknown_option(cmd, argv);
 		}
-- 
2.35.1



* [PATCH v14 07/10] btrfs-progs: receive: process fallocate commands
  2022-03-17 17:25 [PATCH v14 0/7] btrfs: add send/receive support for reading/writing compressed data Omar Sandoval
                   ` (12 preceding siblings ...)
  2022-03-17 17:25 ` [PATCH v14 06/10] btrfs-progs: receive: encoded_write fallback to explicit decode and write Omar Sandoval
@ 2022-03-17 17:25 ` Omar Sandoval
  2022-03-17 17:25 ` [PATCH v14 08/10] btrfs-progs: receive: process setflags ioctl commands Omar Sandoval
                   ` (3 subsequent siblings)
  17 siblings, 0 replies; 29+ messages in thread
From: Omar Sandoval @ 2022-03-17 17:25 UTC (permalink / raw)
  To: linux-btrfs; +Cc: kernel-team

From: Boris Burkov <boris@bur.io>

Send stream v2 can emit fallocate commands, so receive must support them
as well. The implementation simply passes along the arguments to the
syscall. Note that mode is encoded as a u32 in the send stream but fallocate
takes an int, so there is an unsigned->signed conversion there.
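
To make the conversion concrete, here is a sketch of a stricter variant
that rejects values which do not fit in an int. This is an illustration
only: the helper name fallocate_mode_from_stream() and the -EINVAL return
are assumptions, and the patch itself reads the attribute into an int
without a range check.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdint.h>

/*
 * Hypothetical stricter variant: convert the u32 mode from the stream to
 * the int that fallocate(2) expects, rejecting values above INT_MAX
 * instead of relying on the implicit unsigned->signed conversion.
 */
int fallocate_mode_from_stream(uint32_t wire_mode, int *mode)
{
	if (wire_mode > (uint32_t)INT_MAX)
		return -EINVAL;
	*mode = (int)wire_mode;
	return 0;
}
```

The fallocate mode bits in use today are small, so in practice the
conversion preserves the value either way.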

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Boris Burkov <boris@bur.io>
---
 cmds/receive-dump.c  |  9 +++++++++
 cmds/receive.c       | 25 +++++++++++++++++++++++++
 common/send-stream.c |  9 +++++++++
 common/send-stream.h |  2 ++
 4 files changed, 45 insertions(+)

diff --git a/cmds/receive-dump.c b/cmds/receive-dump.c
index 83701b62..fa397bcf 100644
--- a/cmds/receive-dump.c
+++ b/cmds/receive-dump.c
@@ -331,6 +331,14 @@ static int print_encoded_write(const char *path, const void *data, u64 offset,
 			  unencoded_offset, compression, encryption);
 }
 
+static int print_fallocate(const char *path, int mode, u64 offset, u64 len,
+			   void *user)
+{
+	return PRINT_DUMP(user, path, "fallocate",
+			  "mode=%d offset=%llu len=%llu",
+			  mode, offset, len);
+}
+
 struct btrfs_send_ops btrfs_print_send_ops = {
 	.subvol = print_subvol,
 	.snapshot = print_snapshot,
@@ -354,4 +362,5 @@ struct btrfs_send_ops btrfs_print_send_ops = {
 	.utimes = print_utimes,
 	.update_extent = print_update_extent,
 	.encoded_write = print_encoded_write,
+	.fallocate = print_fallocate,
 };
diff --git a/cmds/receive.c b/cmds/receive.c
index 5fd939ce..4893d693 100644
--- a/cmds/receive.c
+++ b/cmds/receive.c
@@ -1260,6 +1260,30 @@ static int process_encoded_write(const char *path, const void *data, u64 offset,
 				    compression);
 }
 
+static int process_fallocate(const char *path, int mode, u64 offset, u64 len,
+			     void *user)
+{
+	int ret;
+	struct btrfs_receive *rctx = user;
+	char full_path[PATH_MAX];
+
+	ret = path_cat_out(full_path, rctx->full_subvol_path, path);
+	if (ret < 0) {
+		error("fallocate: path invalid: %s", path);
+		return ret;
+	}
+	ret = open_inode_for_write(rctx, full_path);
+	if (ret < 0)
+		return ret;
+	ret = fallocate(rctx->write_fd, mode, offset, len);
+	if (ret < 0) {
+		ret = -errno;
+		error("fallocate: fallocate on %s failed: %m", path);
+		return ret;
+	}
+	return 0;
+}
+
 static struct btrfs_send_ops send_ops = {
 	.subvol = process_subvol,
 	.snapshot = process_snapshot,
@@ -1283,6 +1307,7 @@ static struct btrfs_send_ops send_ops = {
 	.utimes = process_utimes,
 	.update_extent = process_update_extent,
 	.encoded_write = process_encoded_write,
+	.fallocate = process_fallocate,
 };
 
 static int do_receive(struct btrfs_receive *rctx, const char *tomnt,
diff --git a/common/send-stream.c b/common/send-stream.c
index ce7c40f5..2d0aa624 100644
--- a/common/send-stream.c
+++ b/common/send-stream.c
@@ -373,6 +373,7 @@ static int read_and_process_cmd(struct btrfs_send_stream *sctx)
 	u64 unencoded_offset;
 	int len;
 	int xattr_len;
+	int fallocate_mode;
 
 	ret = read_cmd(sctx);
 	if (ret)
@@ -537,6 +538,14 @@ static int read_and_process_cmd(struct btrfs_send_stream *sctx)
 	case BTRFS_SEND_C_END:
 		ret = 1;
 		break;
+	case BTRFS_SEND_C_FALLOCATE:
+		TLV_GET_STRING(sctx, BTRFS_SEND_A_PATH, &path);
+		TLV_GET_U32(sctx, BTRFS_SEND_A_FALLOCATE_MODE, &fallocate_mode);
+		TLV_GET_U64(sctx, BTRFS_SEND_A_FILE_OFFSET, &offset);
+		TLV_GET_U64(sctx, BTRFS_SEND_A_SIZE, &tmp);
+		ret = sctx->ops->fallocate(path, fallocate_mode, offset, tmp,
+					   sctx->user);
+		break;
 	}
 
 tlv_get_failed:
diff --git a/common/send-stream.h b/common/send-stream.h
index 44abbc9d..61a88d3d 100644
--- a/common/send-stream.h
+++ b/common/send-stream.h
@@ -57,6 +57,8 @@ struct btrfs_send_ops {
 			     u64 len, u64 unencoded_file_len, u64 unencoded_len,
 			     u64 unencoded_offset, u32 compression,
 			     u32 encryption, void *user);
+	int (*fallocate)(const char *path, int mode, u64 offset, u64 len,
+			 void *user);
 };
 
 int btrfs_read_and_process_send_stream(int fd,
-- 
2.35.1



* [PATCH v14 08/10] btrfs-progs: receive: process setflags ioctl commands
  2022-03-17 17:25 [PATCH v14 0/7] btrfs: add send/receive support for reading/writing compressed data Omar Sandoval
                   ` (13 preceding siblings ...)
  2022-03-17 17:25 ` [PATCH v14 07/10] btrfs-progs: receive: process fallocate commands Omar Sandoval
@ 2022-03-17 17:25 ` Omar Sandoval
  2022-03-17 17:25 ` [PATCH v14 09/10] btrfs-progs: send: stream v2 ioctl flags Omar Sandoval
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 29+ messages in thread
From: Omar Sandoval @ 2022-03-17 17:25 UTC (permalink / raw)
  To: linux-btrfs; +Cc: kernel-team

From: Boris Burkov <boris@bur.io>

In send stream v2, send can emit a command for setting inode flags via
the setflags ioctl. Pass the flags attribute through to the ioctl call
in receive.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Boris Burkov <boris@bur.io>
---
 cmds/receive-dump.c  |  6 ++++++
 cmds/receive.c       | 25 +++++++++++++++++++++++++
 common/send-stream.c |  7 +++++++
 common/send-stream.h |  1 +
 4 files changed, 39 insertions(+)

diff --git a/cmds/receive-dump.c b/cmds/receive-dump.c
index fa397bcf..df5991e1 100644
--- a/cmds/receive-dump.c
+++ b/cmds/receive-dump.c
@@ -339,6 +339,11 @@ static int print_fallocate(const char *path, int mode, u64 offset, u64 len,
 			  mode, offset, len);
 }
 
+static int print_setflags(const char *path, int flags, void *user)
+{
+	return PRINT_DUMP(user, path, "setflags", "flags=%d", flags);
+}
+
 struct btrfs_send_ops btrfs_print_send_ops = {
 	.subvol = print_subvol,
 	.snapshot = print_snapshot,
@@ -363,4 +368,5 @@ struct btrfs_send_ops btrfs_print_send_ops = {
 	.update_extent = print_update_extent,
 	.encoded_write = print_encoded_write,
 	.fallocate = print_fallocate,
+	.setflags = print_setflags,
 };
diff --git a/cmds/receive.c b/cmds/receive.c
index 4893d693..7f76a04f 100644
--- a/cmds/receive.c
+++ b/cmds/receive.c
@@ -38,6 +38,7 @@
 #include <sys/types.h>
 #include <sys/uio.h>
 #include <sys/xattr.h>
+#include <linux/fs.h>
 #include <uuid/uuid.h>
 
 #include <lzo/lzo1x.h>
@@ -1284,6 +1285,29 @@ static int process_fallocate(const char *path, int mode, u64 offset, u64 len,
 	return 0;
 }
 
+static int process_setflags(const char *path, int flags, void *user)
+{
+	int ret;
+	struct btrfs_receive *rctx = user;
+	char full_path[PATH_MAX];
+
+	ret = path_cat_out(full_path, rctx->full_subvol_path, path);
+	if (ret < 0) {
+		error("setflags: path invalid: %s", path);
+		return ret;
+	}
+	ret = open_inode_for_write(rctx, full_path);
+	if (ret < 0)
+		return ret;
+	ret = ioctl(rctx->write_fd, FS_IOC_SETFLAGS, &flags);
+	if (ret < 0) {
+		ret = -errno;
+		error("setflags: setflags ioctl on %s failed: %m", path);
+		return ret;
+	}
+	return 0;
+}
+
 static struct btrfs_send_ops send_ops = {
 	.subvol = process_subvol,
 	.snapshot = process_snapshot,
@@ -1308,6 +1332,7 @@ static struct btrfs_send_ops send_ops = {
 	.update_extent = process_update_extent,
 	.encoded_write = process_encoded_write,
 	.fallocate = process_fallocate,
+	.setflags = process_setflags,
 };
 
 static int do_receive(struct btrfs_receive *rctx, const char *tomnt,
diff --git a/common/send-stream.c b/common/send-stream.c
index 2d0aa624..21295cbb 100644
--- a/common/send-stream.c
+++ b/common/send-stream.c
@@ -374,6 +374,7 @@ static int read_and_process_cmd(struct btrfs_send_stream *sctx)
 	int len;
 	int xattr_len;
 	int fallocate_mode;
+	int setflags_flags;
 
 	ret = read_cmd(sctx);
 	if (ret)
@@ -546,8 +547,14 @@ static int read_and_process_cmd(struct btrfs_send_stream *sctx)
 		ret = sctx->ops->fallocate(path, fallocate_mode, offset, tmp,
 					   sctx->user);
 		break;
+	case BTRFS_SEND_C_SETFLAGS:
+		TLV_GET_STRING(sctx, BTRFS_SEND_A_PATH, &path);
+		TLV_GET_U32(sctx, BTRFS_SEND_A_SETFLAGS_FLAGS, &setflags_flags);
+		ret = sctx->ops->setflags(path, setflags_flags, sctx->user);
+		break;
 	}
 
+
 tlv_get_failed:
 out:
 	free(path);
diff --git a/common/send-stream.h b/common/send-stream.h
index 61a88d3d..3189f889 100644
--- a/common/send-stream.h
+++ b/common/send-stream.h
@@ -59,6 +59,7 @@ struct btrfs_send_ops {
 			     u32 encryption, void *user);
 	int (*fallocate)(const char *path, int mode, u64 offset, u64 len,
 			 void *user);
+	int (*setflags)(const char *path, int flags, void *user);
 };
 
 int btrfs_read_and_process_send_stream(int fd,
-- 
2.35.1



* [PATCH v14 09/10] btrfs-progs: send: stream v2 ioctl flags
  2022-03-17 17:25 [PATCH v14 0/7] btrfs: add send/receive support for reading/writing compressed data Omar Sandoval
                   ` (14 preceding siblings ...)
  2022-03-17 17:25 ` [PATCH v14 08/10] btrfs-progs: receive: process setflags ioctl commands Omar Sandoval
@ 2022-03-17 17:25 ` Omar Sandoval
  2022-03-17 17:25 ` [PATCH v14 10/10] btrfs-progs: receive: add tests for basic encoded_write send/receive Omar Sandoval
  2022-03-24 17:53 ` [PATCH v14 0/7] btrfs: add send/receive support for reading/writing compressed data Sweet Tea Dorminy
  17 siblings, 0 replies; 29+ messages in thread
From: Omar Sandoval @ 2022-03-17 17:25 UTC (permalink / raw)
  To: linux-btrfs; +Cc: kernel-team

From: Omar Sandoval <osandov@fb.com>

First, add a --proto option to allow specifying the desired send
protocol version. It defaults to one, the original version. In a couple
of releases, once people are aware that protocol revisions are happening,
we can change it to default to zero, which means the latest version
supported by the kernel. This is based on Dave Sterba's patch.

Also add a --compressed-data flag to instruct the kernel to use
encoded_write commands for compressed extents. This requires an explicit
opt-in, separate from the protocol version, because:

1. The user may not want compression on the receiving side, or may want
   a different compression algorithm/level on the receiving side.
2. It has a soft requirement for kernel support on the receiving side
   (btrfs-progs can fall back to decompressing and writing if the kernel
   doesn't support BTRFS_IOC_ENCODED_WRITE, but the user may not be
   prepared to pay that CPU cost). Going forward, since it's easier to
   update progs than the kernel, I think we'll want to make new send
   features that require kernel support opt-in, whereas anything that
   only requires a progs update can happen automatically.
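
The interaction between the two options, as documented in btrfs-send.rst
in this patch, can be sketched roughly as follows. The function name, its
parameters, and the -EINVAL return are all hypothetical and chosen for
illustration; this is not the code in cmds/send.c.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical sketch of the documented option semantics.
 *
 * proto_given:     whether --proto was passed
 * requested:       value passed to --proto (ignored if !proto_given)
 * compressed_data: whether --compressed-data was passed
 * supported:       highest version reported by the running kernel
 */
int resolve_send_proto(bool proto_given, uint32_t requested,
		       bool compressed_data, uint32_t supported,
		       uint32_t *out)
{
	uint32_t proto = 1;	/* default: the original protocol */

	if (proto_given)
		proto = requested ? requested : supported; /* 0 = highest */
	else if (compressed_data)
		proto = 2;	/* --compressed-data implies --proto 2 */

	if (compressed_data && proto < 2)
		return -EINVAL;	/* compressed data needs version 2 or higher */

	*out = proto;
	return 0;
}
```

For example, passing only --compressed-data resolves to version 2, while
--proto 1 --compressed-data is rejected in this sketch.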

Signed-off-by: Boris Burkov <boris@bur.io>
Signed-off-by: Omar Sandoval <osandov@fb.com>
---
 Documentation/btrfs-send.rst |  22 ++++++++
 cmds/send.c                  | 100 ++++++++++++++++++++++++++++++++++-
 ioctl.h                      |  19 ++++++-
 kernel-shared/send.h         |   2 +-
 4 files changed, 138 insertions(+), 5 deletions(-)

diff --git a/Documentation/btrfs-send.rst b/Documentation/btrfs-send.rst
index 4526532e..291c537e 100644
--- a/Documentation/btrfs-send.rst
+++ b/Documentation/btrfs-send.rst
@@ -60,6 +60,28 @@ please see section *SUBVOLUME FLAGS* in ``btrfs-subvolume(8)``.
         used to transfer changes. This mode is faster and is useful to show the
         differences in metadata.
 
+--proto <N>
+        use send protocol version N
+
+        The default is 1, which was the original protocol version. Version 2
+        encodes file data slightly more efficiently; it is also required for
+        sending compressed data directly (see *--compressed-data*). Version 2
+        requires at least btrfs-progs 5.18 on both the sender and receiver and
+        at least Linux 5.18 on the sender. Passing 0 means to use the highest
+        version supported by the running kernel.
+
+--compressed-data
+        send data that is compressed on the filesystem directly without
+        decompressing it
+
+        If the receiver supports the *BTRFS_IOC_ENCODED_WRITE* ioctl (added in
+        Linux 5.18), it can also write it directly without decompressing it.
+        Otherwise, the receiver will fall back to decompressing it and writing
+        it normally.
+
+        This requires protocol version 2 or higher. If *--proto* was not used,
+        then *--compressed-data* implies *--proto 2*.
+
 -q|--quiet
         (deprecated) alias for global *-q* option
 
diff --git a/cmds/send.c b/cmds/send.c
index 087af05c..b1adfeca 100644
--- a/cmds/send.c
+++ b/cmds/send.c
@@ -57,6 +57,8 @@ struct btrfs_send {
 	u64 clone_sources_count;
 
 	char *root_path;
+	u32 proto;
+	u32 proto_supported;
 };
 
 static int get_root_id(struct btrfs_send *sctx, const char *path, u64 *root_id)
@@ -259,6 +261,16 @@ static int do_send(struct btrfs_send *send, u64 parent_root_id,
 	memset(&io_send, 0, sizeof(io_send));
 	io_send.send_fd = pipefd[1];
 	send->send_fd = pipefd[0];
+	io_send.flags = flags;
+
+	if (send->proto_supported > 1) {
+		/*
+		 * Versioned stream supported, requesting default or specific
+		 * number.
+		 */
+		io_send.version = send->proto;
+		io_send.flags |= BTRFS_SEND_FLAG_VERSION;
+	}
 
 	if (!ret)
 		ret = pthread_create(&t_read, NULL, read_sent_data, send);
@@ -269,7 +281,6 @@ static int do_send(struct btrfs_send *send, u64 parent_root_id,
 		goto out;
 	}
 
-	io_send.flags = flags;
 	io_send.clone_sources = (__u64*)send->clone_sources;
 	io_send.clone_sources_count = send->clone_sources_count;
 	io_send.parent_root = parent_root_id;
@@ -421,6 +432,36 @@ static void free_send_info(struct btrfs_send *sctx)
 	sctx->root_path = NULL;
 }
 
+static u32 get_sysfs_proto_supported(void)
+{
+	int fd;
+	int ret;
+	char buf[32] = {};
+	char *end = NULL;
+	u64 version;
+
+	fd = sysfs_open_file("features/send_stream_version");
+	if (fd < 0) {
+		/*
+		 * No file is either no version support or old kernel with just
+		 * v1.
+		 */
+		return 1;
+	}
+	ret = sysfs_read_file(fd, buf, sizeof(buf));
+	close(fd);
+	if (ret <= 0)
+		return 1;
+	version = strtoull(buf, &end, 10);
+	if (version == ULLONG_MAX && errno == ERANGE)
+		return 1;
+	if (version > U32_MAX) {
+		warning("sysfs/send_stream_version too big: %llu", version);
+		version = 1;
+	}
+	return version;
+}
+
 static const char * const cmd_send_usage[] = {
 	"btrfs send [-ve] [-p <parent>] [-c <clone-src>] [-f <outfile>] <subvol> [<subvol>...]",
 	"Send the subvolume(s) to stdout.",
@@ -449,6 +490,11 @@ static const char * const cmd_send_usage[] = {
 	"                 does not contain any file data and thus cannot be used",
 	"                 to transfer changes. This mode is faster and useful to",
 	"                 show the differences in metadata.",
+	"--proto N        use protocol version N, or 0 to use the highest version",
+	"                 supported by the sending kernel (default: 1)",
+	"--compressed-data",
+	"                 send data that is compressed on the filesystem directly",
+	"                 without decompressing it",
 	"-v|--verbose     deprecated, alias for global -v option",
 	"-q|--quiet       deprecated, alias for global -q option",
 	HELPINFO_INSERT_GLOBALS,
@@ -471,9 +517,11 @@ static int cmd_send(const struct cmd_struct *cmd, int argc, char **argv)
 	int full_send = 1;
 	int new_end_cmd_semantic = 0;
 	u64 send_flags = 0;
+	u64 proto = 0;
 
 	memset(&send, 0, sizeof(send));
 	send.dump_fd = fileno(stdout);
+	send.proto = 1;
 	outname[0] = 0;
 
 	/*
@@ -489,11 +537,17 @@ static int cmd_send(const struct cmd_struct *cmd, int argc, char **argv)
 
 	optind = 0;
 	while (1) {
-		enum { GETOPT_VAL_SEND_NO_DATA = 256 };
+		enum {
+			GETOPT_VAL_SEND_NO_DATA = 256,
+			GETOPT_VAL_PROTO,
+			GETOPT_VAL_COMPRESSED_DATA,
+		};
 		static const struct option long_options[] = {
 			{ "verbose", no_argument, NULL, 'v' },
 			{ "quiet", no_argument, NULL, 'q' },
 			{ "no-data", no_argument, NULL, GETOPT_VAL_SEND_NO_DATA },
+			{ "proto", required_argument, NULL, GETOPT_VAL_PROTO },
+			{ "compressed-data", no_argument, NULL, GETOPT_VAL_COMPRESSED_DATA },
 			{ NULL, 0, NULL, 0 }
 		};
 		int c = getopt_long(argc, argv, "vqec:f:i:p:", long_options, NULL);
@@ -582,6 +636,18 @@ static int cmd_send(const struct cmd_struct *cmd, int argc, char **argv)
 		case GETOPT_VAL_SEND_NO_DATA:
 			send_flags |= BTRFS_SEND_FLAG_NO_FILE_DATA;
 			break;
+		case GETOPT_VAL_PROTO:
+			proto = arg_strtou64(optarg);
+			if (proto > U32_MAX) {
+				error("protocol version number too big %llu", proto);
+				ret = 1;
+				goto out;
+			}
+			send.proto = proto;
+			break;
+		case GETOPT_VAL_COMPRESSED_DATA:
+			send_flags |= BTRFS_SEND_FLAG_COMPRESSED;
+			break;
 		default:
 			usage_unknown_option(cmd, argv);
 		}
@@ -689,6 +755,36 @@ static int cmd_send(const struct cmd_struct *cmd, int argc, char **argv)
 	if ((send_flags & BTRFS_SEND_FLAG_NO_FILE_DATA) && bconf.verbose > 1)
 		if (bconf.verbose > 1)
 			fprintf(stderr, "Mode NO_FILE_DATA enabled\n");
+	send.proto_supported = get_sysfs_proto_supported();
+	if (send.proto_supported == 1) {
+		if (send.proto > send.proto_supported) {
+			error("requested version %u but kernel supports only %u",
+			      send.proto, send.proto_supported);
+			ret = -EPROTO;
+			goto out;
+		}
+	}
+	if (send_flags & BTRFS_SEND_FLAG_COMPRESSED) {
+		/*
+		 * If no protocol version was explicitly requested, then
+		 * --compressed-data implies --proto 2.
+		 */
+		if (send.proto == 1 && !proto)
+			send.proto = 2;
+
+		if (send.proto == 1) {
+			error("--compressed-data requires protocol version >= 2 (requested 1)");
+			ret = -EINVAL;
+			goto out;
+		} else if (send.proto == 0 && send.proto_supported < 2) {
+			error("kernel does not support --compressed-data");
+			ret = -EINVAL;
+			goto out;
+		}
+	}
+	if (bconf.verbose > 1)
+		fprintf(stderr, "Protocol version requested: %u (supported %u)\n",
+			send.proto, send.proto_supported);
 
 	for (i = optind; i < argc; i++) {
 		int is_first_subvol;
diff --git a/ioctl.h b/ioctl.h
index 8adf63c2..f19695e3 100644
--- a/ioctl.h
+++ b/ioctl.h
@@ -655,10 +655,24 @@ BUILD_ASSERT(sizeof(struct btrfs_ioctl_received_subvol_args_32) == 192);
  */
 #define BTRFS_SEND_FLAG_OMIT_END_CMD		0x4
 
+/*
+ * Read the protocol version in the structure
+ */
+#define BTRFS_SEND_FLAG_VERSION			0x8
+
+/*
+ * Send compressed data using the ENCODED_WRITE command instead of decompressing
+ * the data and sending it with the WRITE command. This requires protocol
+ * version >= 2.
+ */
+#define BTRFS_SEND_FLAG_COMPRESSED		0x10
+
 #define BTRFS_SEND_FLAG_MASK \
 	(BTRFS_SEND_FLAG_NO_FILE_DATA | \
 	 BTRFS_SEND_FLAG_OMIT_STREAM_HEADER | \
-	 BTRFS_SEND_FLAG_OMIT_END_CMD)
+	 BTRFS_SEND_FLAG_OMIT_END_CMD | \
+	 BTRFS_SEND_FLAG_VERSION | \
+	 BTRFS_SEND_FLAG_COMPRESSED)
 
 struct btrfs_ioctl_send_args {
 	__s64 send_fd;			/* in */
@@ -666,7 +680,8 @@ struct btrfs_ioctl_send_args {
 	__u64 __user *clone_sources;	/* in */
 	__u64 parent_root;		/* in */
 	__u64 flags;			/* in */
-	__u64 reserved[4];		/* in */
+	__u32 version;			/* in */
+	__u8 reserved[28];		/* in */
 };
 /*
  * Size of structure depends on pointer width, was not caught in the early
diff --git a/kernel-shared/send.h b/kernel-shared/send.h
index b902d054..1f20d01a 100644
--- a/kernel-shared/send.h
+++ b/kernel-shared/send.h
@@ -31,7 +31,7 @@ extern "C" {
 #endif
 
 #define BTRFS_SEND_STREAM_MAGIC "btrfs-stream"
-#define BTRFS_SEND_STREAM_VERSION 1
+#define BTRFS_SEND_STREAM_VERSION 2
 
 /*
  * In send stream v1, no command is larger than 64k. In send stream v2, no limit
-- 
2.35.1



* [PATCH v14 10/10] btrfs-progs: receive: add tests for basic encoded_write send/receive
  2022-03-17 17:25 [PATCH v14 0/7] btrfs: add send/receive support for reading/writing compressed data Omar Sandoval
                   ` (15 preceding siblings ...)
  2022-03-17 17:25 ` [PATCH v14 09/10] btrfs-progs: send: stream v2 ioctl flags Omar Sandoval
@ 2022-03-17 17:25 ` Omar Sandoval
  2022-03-24 17:53 ` [PATCH v14 0/7] btrfs: add send/receive support for reading/writing compressed data Sweet Tea Dorminy
  17 siblings, 0 replies; 29+ messages in thread
From: Omar Sandoval @ 2022-03-17 17:25 UTC (permalink / raw)
  To: linux-btrfs; +Cc: kernel-team

From: Boris Burkov <boris@bur.io>

Adapt the existing send/receive tests by passing '-o compress-force' to
the mount commands in a new test. After writing a few files in the
various compression formats, send/receive them with and without
--force-decompress to test both the encoded_write path and the fallback
to decode+write.

Signed-off-by: Boris Burkov <boris@bur.io>
---
 .../052-receive-write-encoded/test.sh         | 114 ++++++++++++++++++
 1 file changed, 114 insertions(+)
 create mode 100755 tests/misc-tests/052-receive-write-encoded/test.sh

diff --git a/tests/misc-tests/052-receive-write-encoded/test.sh b/tests/misc-tests/052-receive-write-encoded/test.sh
new file mode 100755
index 00000000..47330281
--- /dev/null
+++ b/tests/misc-tests/052-receive-write-encoded/test.sh
@@ -0,0 +1,114 @@
+#!/bin/bash
+#
+# test that we can send and receive encoded writes for three modes of
+# transparent compression: zlib, lzo, and zstd.
+
+source "$TEST_TOP/common"
+
+check_prereq mkfs.btrfs
+check_prereq btrfs
+
+setup_root_helper
+prepare_test_dev
+
+here=$(pwd)
+
+# assumes the filesystem exists, and does mount, write, snapshot, send, unmount
+# for the specified encoding option
+send_one() {
+	local str
+	local subv
+	local snap
+
+	algorithm="$1"
+	shift
+	str="$1"
+	shift
+
+	subv="subv-$algorithm"
+	snap="snap-$algorithm"
+
+	run_check_mount_test_dev "-o" "compress-force=$algorithm"
+	cd "$TEST_MNT" || _fail "cannot chdir to TEST_MNT"
+
+	run_check $SUDO_HELPER "$TOP/btrfs" subvolume create "$subv"
+	run_check $SUDO_HELPER dd if=/dev/zero of="$subv/file1" bs=1M count=1
+	run_check $SUDO_HELPER dd if=/dev/zero of="$subv/file2" bs=500K count=1
+	run_check $SUDO_HELPER "$TOP/btrfs" subvolume snapshot -r "$subv" "$snap"
+	run_check $SUDO_HELPER "$TOP/btrfs" send -f "$str" "$snap" "$@"
+
+	cd "$here" || _fail "cannot chdir back to test directory"
+	run_check_umount_test_dev
+}
+
+receive_one() {
+	local str
+	str="$1"
+	shift
+
+	run_check_mkfs_test_dev
+	run_check_mount_test_dev
+	run_check $SUDO_HELPER "$TOP/btrfs" receive "$@" -v -f "$str" "$TEST_MNT"
+	run_check_umount_test_dev
+	run_check rm -f -- "$str"
+}
+
+test_one_write_encoded() {
+	local str
+	local algorithm
+	algorithm="$1"
+	shift
+	str="$here/stream-$algorithm.stream"
+
+	run_check_mkfs_test_dev
+	send_one "$algorithm" "$str" --compressed-data
+	receive_one "$str" "$@"
+}
+
+test_one_stream_v1() {
+	local str
+	local algorithm
+	algorithm="$1"
+	shift
+	str="$here/stream-$algorithm.stream"
+
+	run_check_mkfs_test_dev
+	send_one "$algorithm" "$str" --proto 1
+	receive_one "$str" "$@"
+}
+
+test_mix_write_encoded() {
+	local strzlib
+	local strlzo
+	local strzstd
+	strzlib="$here/stream-zlib.stream"
+	strlzo="$here/stream-lzo.stream"
+	strzstd="$here/stream-zstd.stream"
+
+	run_check_mkfs_test_dev
+
+	send_one "zlib" "$strzlib" --compressed-data
+	send_one "lzo" "$strlzo" --compressed-data
+	send_one "zstd" "$strzstd" --compressed-data
+
+	receive_one "$strzlib"
+	receive_one "$strlzo"
+	receive_one "$strzstd"
+}
+
+test_one_write_encoded "zlib"
+test_one_write_encoded "lzo"
+test_one_write_encoded "zstd"
+
+# with decompression forced
+test_one_write_encoded "zlib" "--force-decompress"
+test_one_write_encoded "lzo" "--force-decompress"
+test_one_write_encoded "zstd" "--force-decompress"
+
+# send stream v1
+test_one_stream_v1 "zlib"
+test_one_stream_v1 "lzo"
+test_one_stream_v1 "zstd"
+
+# files use a mix of compression algorithms
+test_mix_write_encoded
-- 
2.35.1



* Re: [PATCH v14 2/7] btrfs: send: explicitly number commands and attributes
  2022-03-17 17:25 ` [PATCH v14 2/7] btrfs: send: explicitly number commands and attributes Omar Sandoval
@ 2022-03-24 17:52   ` Sweet Tea Dorminy
  0 siblings, 0 replies; 29+ messages in thread
From: Sweet Tea Dorminy @ 2022-03-24 17:52 UTC (permalink / raw)
  To: Omar Sandoval, linux-btrfs; +Cc: kernel-team


On 3/17/22 13:25, Omar Sandoval wrote:
> From: Omar Sandoval <osandov@fb.com>
>
> Commit e77fbf990316 ("btrfs: send: prepare for v2 protocol") added
> _BTRFS_SEND_C_MAX_V* macros equal to the maximum command number for the
s/macros/enums/?
> version plus 1, but as written this creates gaps in the number space.
> The maximum command number is currently 22, and __BTRFS_SEND_C_MAX_V1 is
> accordingly 23. But then __BTRFS_SEND_C_MAX_V2 is 24, suggesting that v2
> has a command numbered 23, and __BTRFS_SEND_C_MAX is 25, suggesting that
> 23 and 24 are valid commands.
>
> Instead, let's explicitly number all of the commands, attributes, and
> sentinel MAX constants.

If you send out another version: the last sentence explains what the change does, but not why you prefer it as a solution to the problem.

Nit: I think it would be slightly more elegant to set the MAX values to the appropriate symbolic value instead of numerical, to emphasize the relationship a tiny bit more. Perhaps e.g.

enum btrfs_send_cmds {
	...
	BTRFS_SEND_C_UPDATE_EXTENT,
	__BTRFS_SEND_CMDS_V1_MAX = BTRFS_SEND_C_UPDATE_EXTENT,
	BTRFS_SEND_C_FALLOCATE,
	...
	BTRFS_SEND_C_ENCODED_WRITE,
	__BTRFS_SEND_CMDS_V2_MAX = BTRFS_SEND_C_ENCODED_WRITE,
	BTRFS_SEND_C_CMDS_MAX = __BTRFS_SEND_CMDS_V2_MAX,
};

(either with or without explicitly setting the numerical values of the individual commands). Or perhaps #define the MAX values instead, still in terms of the symbolic constant?

I have a mild preference for not explicitly setting things to numerical values, so it's harder to duplicate a value by accident later on; but explicit setting does help if one day some cmd needs to be dropped from the middle of the list without renumbering everything, so shrug.

Nit: If you do stick with explicit numerical values everywhere, is there a chance you could line up the = vertically, to make it easier to scan down the list of numbers and verify they are in order and without holes in the sequence? I know vertical alignment is usually unneeded, but in this case I think it does add a bunch to the readability of the table.
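
To make the symbolic-MAX suggestion above concrete, here is a minimal standalone sketch. It assumes a shortened, hypothetical command list (the DEMO_* names are made up for illustration and are not the real send.h definitions). It only shows that a sentinel defined as an alias of the last command does not consume a number, so the next command continues without a gap.

```c
#include <assert.h>

/*
 * Hypothetical, shortened command list. The DEMO_* names and the choice of
 * commands are illustrative only and do not match the real send.h.
 */
enum demo_send_cmd {
	DEMO_SEND_C_UNSPEC,

	/* Version 1 */
	DEMO_SEND_C_SUBVOL,
	DEMO_SEND_C_WRITE,
	DEMO_SEND_C_V1_MAX = DEMO_SEND_C_WRITE,

	/* Version 2: numbering continues right after the v1 sentinel */
	DEMO_SEND_C_FALLOCATE,
	DEMO_SEND_C_ENCODED_WRITE,
	DEMO_SEND_C_V2_MAX = DEMO_SEND_C_ENCODED_WRITE,

	DEMO_SEND_C_MAX = DEMO_SEND_C_V2_MAX,
};

/* An aliasing sentinel consumes no number, so there is no gap after it. */
static_assert(DEMO_SEND_C_FALLOCATE == DEMO_SEND_C_V1_MAX + 1,
	      "gap after the v1 sentinel");
static_assert(DEMO_SEND_C_MAX == 4, "unexpected highest command number");

/* Highest valid command number for a protocol version, or -1 if unknown. */
static int demo_max_cmd(unsigned int proto)
{
	switch (proto) {
	case 1:
		return DEMO_SEND_C_V1_MAX;
	case 2:
		return DEMO_SEND_C_V2_MAX;
	default:
		return -1;
	}
}

/* A command is valid if it is not UNSPEC and not above the version's MAX. */
static int demo_cmd_valid(unsigned int proto, int cmd)
{
	int max = demo_max_cmd(proto);

	return max >= 0 && cmd > DEMO_SEND_C_UNSPEC && cmd <= max;
}
```

Note that here MAX means "last valid command" rather than "last valid command plus 1", so range checks use <= instead of <.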



* Re: [PATCH v14 4/7] btrfs: send: write larger chunks when using stream v2
  2022-03-17 17:25 ` [PATCH v14 4/7] btrfs: send: write larger chunks when using stream v2 Omar Sandoval
@ 2022-03-24 17:52   ` Sweet Tea Dorminy
  2022-03-30 17:05     ` Omar Sandoval
  0 siblings, 1 reply; 29+ messages in thread
From: Sweet Tea Dorminy @ 2022-03-24 17:52 UTC (permalink / raw)
  To: Omar Sandoval, linux-btrfs; +Cc: kernel-team


> diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
> index 1f141de3a7d6..02053fff80ca 100644
> --- a/fs/btrfs/send.c
> +++ b/fs/btrfs/send.c
> @@ -82,6 +82,7 @@ struct send_ctx {
>   	char *send_buf;
>   	u32 send_size;
>   	u32 send_max_size;
> +	bool put_data;
put_data's use seems to be about making sure put_data_header() isn't 
called more than once, which is not super obvious to me from the name; 
perhaps one of 'data_header_{set,setup,initialized}' might make it 
clearer? Or if it's actually about put_file_data, maybe moving the 
assertion there would make that clearer?
>   static int put_data_header(struct send_ctx *sctx, u32 len)
>   {
> -	struct btrfs_tlv_header *hdr;
> +	if (WARN_ON_ONCE(sctx->put_data))
> +		return -EINVAL;
> +	sctx->put_data = true;
> +	if (sctx->proto >= 2) {
> +		/*
> +		 * In v2, the data attribute header doesn't include a length; it
> +		 * is implicitly to the end of the command.
> +		 */
> +		if (sctx->send_max_size - sctx->send_size < 2 + len)
> +			return -EOVERFLOW;
> +		put_unaligned_le16(BTRFS_SEND_A_DATA,
> +				   sctx->send_buf + sctx->send_size);
> +		sctx->send_size += 2;
> +	} else {
> +		struct btrfs_tlv_header *hdr;
>   
> -	if (sctx->send_max_size - sctx->send_size < sizeof(*hdr) + len)
> -		return -EOVERFLOW;
> -	hdr = (struct btrfs_tlv_header *)(sctx->send_buf + sctx->send_size);
> -	put_unaligned_le16(BTRFS_SEND_A_DATA, &hdr->tlv_type);
> -	put_unaligned_le16(len, &hdr->tlv_len);
> -	sctx->send_size += sizeof(*hdr);
> +		if (sctx->send_max_size - sctx->send_size < sizeof(*hdr) + len)
> +			return -EOVERFLOW;
> +		hdr = (struct btrfs_tlv_header *)(sctx->send_buf +
> +						  sctx->send_size);
> +		put_unaligned_le16(BTRFS_SEND_A_DATA, &hdr->tlv_type);
> +		put_unaligned_le16(len, &hdr->tlv_len);
> +		sctx->send_size += sizeof(*hdr);
> +	}
>   	return 0;
>   }

I wish the 2s were named, and that there were more commonality between 
the two branches... Might I propose this alternative? It doesn't check 
the length's suitability until after adding the two fields, but I don't 
think anything bad happens from delaying the check.

static int put_data_header(struct send_ctx *sctx, u32 len)
{
         struct btrfs_tlv_header *hdr =
                 (struct btrfs_tlv_header *)(sctx->send_buf + sctx->send_size);

         if (WARN_ON_ONCE(sctx->put_data))
                 return -EINVAL;
         sctx->put_data = true;

         put_unaligned_le16(BTRFS_SEND_A_DATA, &hdr->tlv_type);
         sctx->send_size += sizeof(hdr->tlv_type);

         /*
          * In v2+, the data attribute header doesn't include a length; it is
          * implicitly to the end of the command.
          */
         if (sctx->proto == 1) {
                 put_unaligned_le16(len, &hdr->tlv_len);
                 sctx->send_size += sizeof(hdr->tlv_len);
         }

         if (sctx->send_max_size - sctx->send_size < len)
                 return -EOVERFLOW;

         return 0;
}



* Re: [PATCH v14 5/7] btrfs: send: allocate send buffer with alloc_page() and vmap() for v2
  2022-03-17 17:25 ` [PATCH v14 5/7] btrfs: send: allocate send buffer with alloc_page() and vmap() for v2 Omar Sandoval
@ 2022-03-24 17:53   ` Sweet Tea Dorminy
  2022-03-30 16:03     ` Omar Sandoval
  0 siblings, 1 reply; 29+ messages in thread
From: Sweet Tea Dorminy @ 2022-03-24 17:53 UTC (permalink / raw)
  To: Omar Sandoval, linux-btrfs; +Cc: kernel-team



On 3/17/22 13:25, Omar Sandoval wrote:
> From: Omar Sandoval <osandov@fb.com>
> 
> For encoded writes, we need the raw pages for reading compressed data
> directly via a bio.
Perhaps:
"For encoded writes, the existing btrfs_encoded_read*() functions expect 
a list of raw pages."

I think it would be better to continue just vmalloc'ing a large 
contiguous buffer and translating each page in the buffer into its raw 
page with something like is_vmalloc_addr(data) ? vmalloc_to_page(data) : 
virt_to_page(data). Vmalloc can request a higher-order allocation, which 
probably doesn't matter but might slightly improve memory locality. And 
in terms of readability, I somewhat like the elegance of having a single 
identical kvmalloc call to allocate send_buf in both cases, even if we 
do need to initialize the page list for some v2 commands.


* Re: [PATCH v14 0/7] btrfs: add send/receive support for reading/writing compressed data
  2022-03-17 17:25 [PATCH v14 0/7] btrfs: add send/receive support for reading/writing compressed data Omar Sandoval
                   ` (16 preceding siblings ...)
  2022-03-17 17:25 ` [PATCH v14 10/10] btrfs-progs: receive: add tests for basic encoded_write send/receive Omar Sandoval
@ 2022-03-24 17:53 ` Sweet Tea Dorminy
  17 siblings, 0 replies; 29+ messages in thread
From: Sweet Tea Dorminy @ 2022-03-24 17:53 UTC (permalink / raw)
  To: Omar Sandoval, linux-btrfs; +Cc: kernel-team

For the 7 kernel patches:
Reviewed-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me>

On 3/17/22 13:25, Omar Sandoval wrote:
> From: Omar Sandoval <osandov@fb.com>
> 
> This series adds support for sending compressed data via Btrfs send and
> btrfs-progs support for sending/receiving compressed data and writing it
> with BTRFS_IOC_ENCODED_WRITE, which was previously merged into
> misc-next. See the previous posting for more details and benchmarks [1].
> 
> Patches 1 and 2 are cleanups for Btrfs send. Patches 3-5 prepare some
> protocol changes for send stream v2. Patch 6 implements compressed send.
> Patch 7 enables send stream v2 and compressed send in the send ioctl
> when requested.
> 
> Changes since v13 [2]:
> 
> - Rebased on latest misc-next branch.
> - Dropped ioctl patches which are already in misc-next.
> 
> 1: https://lore.kernel.org/linux-btrfs/cover.1615922753.git.osandov@fb.com/
> 2: https://lore.kernel.org/linux-btrfs/cover.1644519257.git.osandov@fb.com/
> 
> Omar Sandoval (7):
>    btrfs: send: remove unused send_ctx::{total,cmd}_send_size
>    btrfs: send: explicitly number commands and attributes
>    btrfs: add send stream v2 definitions
>    btrfs: send: write larger chunks when using stream v2
>    btrfs: send: allocate send buffer with alloc_page() and vmap() for v2
>    btrfs: send: send compressed extents with encoded writes
>    btrfs: send: enable support for stream v2 and compressed writes
> 
>   fs/btrfs/ctree.h           |   6 +
>   fs/btrfs/inode.c           |  13 +-
>   fs/btrfs/send.c            | 324 +++++++++++++++++++++++++++++++++----
>   fs/btrfs/send.h            | 142 +++++++++-------
>   include/uapi/linux/btrfs.h |  10 +-
>   5 files changed, 395 insertions(+), 100 deletions(-)
> 
> The btrfs-progs patches were written by Boris Burkov with some updates
> from me. Patches 1-4 are preparation. Patch 5 implements encoded writes.
> Patch 6 implements the fallback to decompressing. Patches 7 and 8
> implement the other commands. Patch 9 adds the new `btrfs send` options.
> Patch 10 adds a test case.
> 
> Changes since v13:
> 
> - Rebased on latest devel branch.
> - Updated the btrfs_ioctl_encoded_io_args definition to the version that
>    was merged into misc-next.
> 
> Boris Burkov (8):
>    btrfs-progs: receive: support v2 send stream larger tlv_len
>    btrfs-progs: receive: dynamically allocate sctx->read_buf
>    btrfs-progs: receive: support v2 send stream DATA tlv format
>    btrfs-progs: receive: process encoded_write commands
>    btrfs-progs: receive: encoded_write fallback to explicit decode and
>      write
>    btrfs-progs: receive: process fallocate commands
>    btrfs-progs: receive: process setflags ioctl commands
>    btrfs-progs: receive: add tests for basic encoded_write send/receive
> 
> Omar Sandoval (2):
>    btrfs-progs: receive: add send stream v2 cmds and attrs to send.h
>    btrfs-progs: send: stream v2 ioctl flags
> 
>   Documentation/btrfs-receive.rst               |   5 +
>   Documentation/btrfs-send.rst                  |  22 ++
>   cmds/receive-dump.c                           |  31 +-
>   cmds/receive.c                                | 347 +++++++++++++++++-
>   cmds/send.c                                   | 100 ++++-
>   common/send-stream.c                          | 165 +++++++--
>   common/send-stream.h                          |   7 +
>   ioctl.h                                       | 151 +++++++-
>   kernel-shared/send.h                          | 146 +++++---
>   libbtrfs/send-stream.c                        |   2 +-
>   .../052-receive-write-encoded/test.sh         | 114 ++++++
>   11 files changed, 993 insertions(+), 97 deletions(-)
>   create mode 100755 tests/misc-tests/052-receive-write-encoded/test.sh
> 


* Re: [PATCH v14 5/7] btrfs: send: allocate send buffer with alloc_page() and vmap() for v2
  2022-03-24 17:53   ` Sweet Tea Dorminy
@ 2022-03-30 16:03     ` Omar Sandoval
  2022-03-30 16:33       ` Sweet Tea Dorminy
  0 siblings, 1 reply; 29+ messages in thread
From: Omar Sandoval @ 2022-03-30 16:03 UTC (permalink / raw)
  To: Sweet Tea Dorminy; +Cc: linux-btrfs, kernel-team

On Thu, Mar 24, 2022 at 01:53:20PM -0400, Sweet Tea Dorminy wrote:
> 
> 
> On 3/17/22 13:25, Omar Sandoval wrote:
> > From: Omar Sandoval <osandov@fb.com>
> > 
> > For encoded writes, we need the raw pages for reading compressed data
> > directly via a bio.
> Perhaps:
> "For encoded writes, the existing btrfs_encoded_read*() functions expect a
> list of raw pages."
> 
> I think it would be a better to continue just vmalloc'ing a large continuous
> buffer and translating each page in the buffer into its raw page with
> something like is_vmalloc_addr(data) ? vmalloc_to_page(data) :
> virt_to_page(data). Vmalloc can request a higher-order allocation, which
> probably doesn't matter but might slightly improve memory locality. And in
> terms of readability, I somewhat like the elegance of having a single
> identical kvmalloc call to allocate and send_buf in both cases, even if we
> do need to initialize the page list for some v2 commands.

I like this, but are we guaranteed that kvmalloc() will return a
page-aligned buffer? It seems reasonable to me that it would for
allocations of at least one page, but I can't find that written down
anywhere.


* Re: [PATCH v14 5/7] btrfs: send: allocate send buffer with alloc_page() and vmap() for v2
  2022-03-30 16:03     ` Omar Sandoval
@ 2022-03-30 16:33       ` Sweet Tea Dorminy
  2022-03-30 17:13         ` Omar Sandoval
  0 siblings, 1 reply; 29+ messages in thread
From: Sweet Tea Dorminy @ 2022-03-30 16:33 UTC (permalink / raw)
  To: Omar Sandoval; +Cc: linux-btrfs, kernel-team



On 3/30/22 12:03, Omar Sandoval wrote:
> On Thu, Mar 24, 2022 at 01:53:20PM -0400, Sweet Tea Dorminy wrote:
>>
>>
>> On 3/17/22 13:25, Omar Sandoval wrote:
>>> From: Omar Sandoval <osandov@fb.com>
>>>
>>> For encoded writes, we need the raw pages for reading compressed data
>>> directly via a bio.
>> Perhaps:
>> "For encoded writes, the existing btrfs_encoded_read*() functions expect a
>> list of raw pages."
>>
>> I think it would be a better to continue just vmalloc'ing a large continuous
>> buffer and translating each page in the buffer into its raw page with
>> something like is_vmalloc_addr(data) ? vmalloc_to_page(data) :
>> virt_to_page(data). Vmalloc can request a higher-order allocation, which
>> probably doesn't matter but might slightly improve memory locality. And in
>> terms of readability, I somewhat like the elegance of having a single
>> identical kvmalloc call to allocate and send_buf in both cases, even if we
>> do need to initialize the page list for some v2 commands.
> 
> I like this, but are we guaranteed that kvmalloc() will return a
> page-aligned buffer? It seems reasonable to me that it would for
> allocations of at least one page, but I can't find that written down
> anywhere.

Since vmalloc allocates whole pages, and kmalloc guarantees alignment to 
the allocation size for powers of 2 sizes (and PAGE_SIZE is required to 
be a power of 2), I think that adds up to a guarantee of page alignment 
both ways?

https://elixir.bootlin.com/linux/v5.17.1/source/include/linux/slab.h#L522 : 
kmalloc: "For @size of power of two bytes, the alignment is also 
guaranteed to be at least to the size."
https://elixir.bootlin.com/linux/v5.17.1/source/mm/vmalloc.c#L3180: 
vmalloc: " Allocate enough pages"...



* Re: [PATCH v14 4/7] btrfs: send: write larger chunks when using stream v2
  2022-03-24 17:52   ` Sweet Tea Dorminy
@ 2022-03-30 17:05     ` Omar Sandoval
  0 siblings, 0 replies; 29+ messages in thread
From: Omar Sandoval @ 2022-03-30 17:05 UTC (permalink / raw)
  To: Sweet Tea Dorminy; +Cc: linux-btrfs, kernel-team

On Thu, Mar 24, 2022 at 01:52:53PM -0400, Sweet Tea Dorminy wrote:
> 
> > diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
> > index 1f141de3a7d6..02053fff80ca 100644
> > --- a/fs/btrfs/send.c
> > +++ b/fs/btrfs/send.c
> > @@ -82,6 +82,7 @@ struct send_ctx {
> >   	char *send_buf;
> >   	u32 send_size;
> >   	u32 send_max_size;
> > +	bool put_data;
> put_data's use seems to be about making sure put_data_header() isn't called
> more than once, which is not super obvious to me from the name; perhaps one
> of 'data_header_{set,setup,initialized}' might make it clearer? Or if it's
> actually about put_file_data, maybe moving the assertion there would make
> that clearer?

The intention is to prevent adding another attribute after a data
attribute, since that's impossible with v2. Notice that it's also
checked in tlv_put(). So "put_data" means "was a data attribute already
added to this command?"; it's not specifically about the data header or
data itself. I'll add a comment to that effect.

> >   static int put_data_header(struct send_ctx *sctx, u32 len)
> >   {
> > -	struct btrfs_tlv_header *hdr;
> > +	if (WARN_ON_ONCE(sctx->put_data))
> > +		return -EINVAL;
> > +	sctx->put_data = true;
> > +	if (sctx->proto >= 2) {
> > +		/*
> > +		 * In v2, the data attribute header doesn't include a length; it
> > +		 * is implicitly to the end of the command.
> > +		 */
> > +		if (sctx->send_max_size - sctx->send_size < 2 + len)
> > +			return -EOVERFLOW;
> > +		put_unaligned_le16(BTRFS_SEND_A_DATA,
> > +				   sctx->send_buf + sctx->send_size);
> > +		sctx->send_size += 2;
> > +	} else {
> > +		struct btrfs_tlv_header *hdr;
> > -	if (sctx->send_max_size - sctx->send_size < sizeof(*hdr) + len)
> > -		return -EOVERFLOW;
> > -	hdr = (struct btrfs_tlv_header *)(sctx->send_buf + sctx->send_size);
> > -	put_unaligned_le16(BTRFS_SEND_A_DATA, &hdr->tlv_type);
> > -	put_unaligned_le16(len, &hdr->tlv_len);
> > -	sctx->send_size += sizeof(*hdr);
> > +		if (sctx->send_max_size - sctx->send_size < sizeof(*hdr) + len)
> > +			return -EOVERFLOW;
> > +		hdr = (struct btrfs_tlv_header *)(sctx->send_buf +
> > +						  sctx->send_size);
> > +		put_unaligned_le16(BTRFS_SEND_A_DATA, &hdr->tlv_type);
> > +		put_unaligned_le16(len, &hdr->tlv_len);
> > +		sctx->send_size += sizeof(*hdr);
> > +	}
> >   	return 0;
> >   }
> 
> I wish the 2s were named, and that there were more commonality between the
> two branches... Might I propose this alternative? It doesn't check the
> length's suitability until after adding the two fields, but I don't think
> anything bad happens from delaying the check.

We need to check that writing the header itself won't write past the end
of send_buf, so it'd have to look more like:

static int put_data_header(struct send_ctx *sctx, u32 len)
{
	struct btrfs_tlv_header *hdr =
		(struct btrfs_tlv_header *)(sctx->send_buf + sctx->send_size);

	if (WARN_ON_ONCE(sctx->put_data))
		return -EINVAL;
	sctx->put_data = true;

	if (sctx->send_max_size - sctx->send_size < sizeof(hdr->tlv_type))
		return -EOVERFLOW;
	put_unaligned_le16(BTRFS_SEND_A_DATA, &hdr->tlv_type);
	sctx->send_size += sizeof(hdr->tlv_type);

	/*
	 * In v2, the data attribute header doesn't include a length; it is
	 * implicitly to the end of the command.
	 */
	if (sctx->proto < 2) {
		if (sctx->send_max_size - sctx->send_size <
		    sizeof(hdr->tlv_len))
			return -EOVERFLOW;
		put_unaligned_le16(len, &hdr->tlv_len);
		sctx->send_size += sizeof(hdr->tlv_len);
	}

	if (sctx->send_max_size - sctx->send_size < len)
		return -EOVERFLOW;

	return 0;
}

Now we're checking for overflow in three places, shrug.

One thing that I like about the two separate cases is that it makes it
more clear that v2+ doesn't actually have a btrfs_tlv_header; it's just
a single __le16. If it's alright with you, I'll stick with my original
version, but I will replace the hard-coded 2s with sizeof(__le16).
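
For readers following along outside the kernel tree, the two-branch layout discussed above can be modeled in userspace roughly as below. This is only a sketch under assumptions: struct demo_send_ctx, demo_put_le16() and DEMO_SEND_A_DATA are hypothetical stand-ins for the kernel's send_ctx, put_unaligned_le16() and BTRFS_SEND_A_DATA, and the attribute value is a placeholder. It shows that v1 emits a 16-bit type plus a 16-bit length, that v2+ emits only the 16-bit type, and that the space check happens before anything is written.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_SEND_A_DATA 19	/* placeholder attribute number, illustration only */

/* Hypothetical stand-in for the relevant parts of the kernel's send_ctx. */
struct demo_send_ctx {
	uint8_t *send_buf;
	uint32_t send_size;	/* bytes already used in send_buf */
	uint32_t send_max_size;	/* capacity of send_buf */
	uint32_t proto;
	bool put_data;		/* was a data attribute already added? */
};

/* Store a 16-bit value in little-endian order at an unaligned address. */
static void demo_put_le16(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v & 0xff);
	p[1] = (uint8_t)(v >> 8);
}

static int demo_put_data_header(struct demo_send_ctx *sctx, uint32_t len)
{
	uint8_t *p = sctx->send_buf + sctx->send_size;
	uint32_t room = sctx->send_max_size - sctx->send_size;

	/* Only one data attribute per command, and it must come last. */
	if (sctx->put_data)
		return -EINVAL;
	sctx->put_data = true;

	if (sctx->proto >= 2) {
		/* v2+: a bare 16-bit type; data runs to the end of the command. */
		if (room < sizeof(uint16_t) + (uint64_t)len)
			return -EOVERFLOW;
		demo_put_le16(p, DEMO_SEND_A_DATA);
		sctx->send_size += sizeof(uint16_t);
	} else {
		/* v1: a 16-bit type followed by a 16-bit length. */
		if (room < 2 * sizeof(uint16_t) + (uint64_t)len)
			return -EOVERFLOW;
		demo_put_le16(p, DEMO_SEND_A_DATA);
		demo_put_le16(p + sizeof(uint16_t), (uint16_t)len);
		sctx->send_size += 2 * sizeof(uint16_t);
	}
	return 0;
}
```

Because each branch does its single bounds check up front, neither branch can write a partial header past the end of the buffer, which is the concern raised about the merged variant.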


* Re: [PATCH v14 5/7] btrfs: send: allocate send buffer with alloc_page() and vmap() for v2
  2022-03-30 16:33       ` Sweet Tea Dorminy
@ 2022-03-30 17:13         ` Omar Sandoval
  2022-03-30 18:48           ` Sweet Tea Dorminy
  0 siblings, 1 reply; 29+ messages in thread
From: Omar Sandoval @ 2022-03-30 17:13 UTC (permalink / raw)
  To: Sweet Tea Dorminy; +Cc: linux-btrfs, kernel-team

On Wed, Mar 30, 2022 at 12:33:48PM -0400, Sweet Tea Dorminy wrote:
> 
> 
> On 3/30/22 12:03, Omar Sandoval wrote:
> > On Thu, Mar 24, 2022 at 01:53:20PM -0400, Sweet Tea Dorminy wrote:
> > > 
> > > 
> > > On 3/17/22 13:25, Omar Sandoval wrote:
> > > > From: Omar Sandoval <osandov@fb.com>
> > > > 
> > > > For encoded writes, we need the raw pages for reading compressed data
> > > > directly via a bio.
> > > Perhaps:
> > > "For encoded writes, the existing btrfs_encoded_read*() functions expect a
> > > list of raw pages."
> > > 
> > > I think it would be a better to continue just vmalloc'ing a large continuous
> > > buffer and translating each page in the buffer into its raw page with
> > > something like is_vmalloc_addr(data) ? vmalloc_to_page(data) :
> > > virt_to_page(data). Vmalloc can request a higher-order allocation, which
> > > probably doesn't matter but might slightly improve memory locality. And in
> > > terms of readability, I somewhat like the elegance of having a single
> > > identical kvmalloc call to allocate and send_buf in both cases, even if we
> > > do need to initialize the page list for some v2 commands.
> > 
> > I like this, but are we guaranteed that kvmalloc() will return a
> > page-aligned buffer? It seems reasonable to me that it would for
> > allocations of at least one page, but I can't find that written down
> > anywhere.
> 
> Since vmalloc allocates whole pages, and kmalloc guarantees alignment to the
> allocation size for powers of 2 sizes (and PAGE_SIZE is required to be a
> power of 2), I think that adds up to a guarantee of page alignment both
> ways?
> 
> https://elixir.bootlin.com/linux/v5.17.1/source/include/linux/slab.h#L522 :
> kmalloc: "For @size of power of two bytes, the alignment is also guaranteed
> to be at least to the size."

Our allocation size is ALIGN(SZ_16K + BTRFS_MAX_COMPRESSED, PAGE_SIZE),
which is 144K for PAGE_SIZE = 4k. If I interpret the kmalloc() comment
very literally, since this isn't a power of two, it's not guaranteed to
be aligned, right?
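
The arithmetic can be checked in user space. This is only a minimal
sketch: it assumes BTRFS_MAX_COMPRESSED is 128K (which is consistent with
the 144K figure above), and the SKETCH_* constants and helper names are
made up for illustration, not kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* User-space stand-ins for the kernel constants (values assumed here). */
#define SKETCH_SZ_16K               ((size_t)16 * 1024)
#define SKETCH_BTRFS_MAX_COMPRESSED ((size_t)128 * 1024)

/* True if n has exactly one bit set. */
static bool is_power_of_two(size_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Round x up to a multiple of a; like ALIGN(), a must be a power of two. */
static size_t align_up(size_t x, size_t a)
{
	assert(is_power_of_two(a));
	return (x + a - 1) & ~(a - 1);
}

/* Size of the v2 send buffer for a given page size. */
static size_t send_buf_size_v2(size_t page_size)
{
	return align_up(SKETCH_SZ_16K + SKETCH_BTRFS_MAX_COMPRESSED, page_size);
}
```

With a 4k page size, send_buf_size_v2() gives 144K, and is_power_of_two()
is false for that value, so the kmalloc() comment quoted above does not
apply to this allocation.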

* Re: [PATCH v14 5/7] btrfs: send: allocate send buffer with alloc_page() and vmap() for v2
  2022-03-30 17:13         ` Omar Sandoval
@ 2022-03-30 18:48           ` Sweet Tea Dorminy
  2022-03-30 20:42             ` Omar Sandoval
  0 siblings, 1 reply; 29+ messages in thread
From: Sweet Tea Dorminy @ 2022-03-30 18:48 UTC (permalink / raw)
  To: Omar Sandoval; +Cc: linux-btrfs, kernel-team



On 3/30/22 13:13, Omar Sandoval wrote:
> On Wed, Mar 30, 2022 at 12:33:48PM -0400, Sweet Tea Dorminy wrote:
>>
>>
>> On 3/30/22 12:03, Omar Sandoval wrote:
>>> On Thu, Mar 24, 2022 at 01:53:20PM -0400, Sweet Tea Dorminy wrote:
>>>>
>>>>
>>>> On 3/17/22 13:25, Omar Sandoval wrote:
>>>>> From: Omar Sandoval <osandov@fb.com>
>>>>>
>>>>> For encoded writes, we need the raw pages for reading compressed data
>>>>> directly via a bio.
>>>> Perhaps:
>>>> "For encoded writes, the existing btrfs_encoded_read*() functions expect a
>>>> list of raw pages."
>>>>
>>>> I think it would be better to continue just vmalloc'ing a large contiguous
>>>> buffer and translating each page in the buffer into its raw page with
>>>> something like is_vmalloc_addr(data) ? vmalloc_to_page(data) :
>>>> virt_to_page(data). Vmalloc can request a higher-order allocation, which
>>>> probably doesn't matter but might slightly improve memory locality. And in
>>>> terms of readability, I somewhat like the elegance of having a single
>>>> identical kvmalloc call to allocate send_buf in both cases, even if we
>>>> do need to initialize the page list for some v2 commands.
>>>
>>> I like this, but are we guaranteed that kvmalloc() will return a
>>> page-aligned buffer? It seems reasonable to me that it would for
>>> allocations of at least one page, but I can't find that written down
>>> anywhere.
>>
>> Since vmalloc allocates whole pages, and kmalloc guarantees alignment to the
>> allocation size for powers of 2 sizes (and PAGE_SIZE is required to be a
>> power of 2), I think that adds up to a guarantee of page alignment both
>> ways?
>>
>> https://elixir.bootlin.com/linux/v5.17.1/source/include/linux/slab.h#L522 :
>> kmalloc: "For @size of power of two bytes, the alignment is also guaranteed
>> to be at least to the size."
> 
> Our allocation size is ALIGN(SZ_16K + BTRFS_MAX_COMPRESSED, PAGE_SIZE),
> which is 144K for PAGE_SIZE = 4k. If I interpret the kmalloc() comment
> very literally, since this isn't a power of two, it's not guaranteed to
> be aligned, right?

Ah, an excellent point.

Now that I think about it, the kmalloc path picks a slab to allocate 
from based on the log_2 of the size: 
https://elixir.bootlin.com/linux/v5.17.1/source/mm/slab_common.c#L733 so 
we'd end up wasting 128k-16k space using kmalloc, whether it's aligned 
or not, I think?

So maybe it should just always use vmalloc and get the page alignment?
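
To put a number on the waste, here is a rough sketch. It assumes the
allocator rounds the request up to the next power of two, as described
above; it models only that size choice, not the slab code itself, and the
function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Smallest power of two that is >= n (n must be nonzero and representable). */
static size_t roundup_pow_of_two_sketch(size_t n)
{
	size_t p = 1;

	assert(n != 0 && n <= SIZE_MAX / 2 + 1);
	while (p < n)
		p <<= 1;
	return p;
}

/* Bytes left unused if a request of n bytes gets a power-of-two sized slot. */
static size_t pow2_rounding_waste(size_t n)
{
	return roundup_pow_of_two_sketch(n) - n;
}
```

For the 144K request this rounds up to 256K, so pow2_rounding_waste()
returns 112K, which is the 128k-16k mentioned above.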

* Re: [PATCH v14 5/7] btrfs: send: allocate send buffer with alloc_page() and vmap() for v2
  2022-03-30 18:48           ` Sweet Tea Dorminy
@ 2022-03-30 20:42             ` Omar Sandoval
  2022-03-30 21:04               ` Sweet Tea Dorminy
  0 siblings, 1 reply; 29+ messages in thread
From: Omar Sandoval @ 2022-03-30 20:42 UTC (permalink / raw)
  To: Sweet Tea Dorminy; +Cc: linux-btrfs, kernel-team

On Wed, Mar 30, 2022 at 02:48:42PM -0400, Sweet Tea Dorminy wrote:
> 
> 
> On 3/30/22 13:13, Omar Sandoval wrote:
> > On Wed, Mar 30, 2022 at 12:33:48PM -0400, Sweet Tea Dorminy wrote:
> > > 
> > > 
> > > On 3/30/22 12:03, Omar Sandoval wrote:
> > > > On Thu, Mar 24, 2022 at 01:53:20PM -0400, Sweet Tea Dorminy wrote:
> > > > > 
> > > > > 
> > > > > On 3/17/22 13:25, Omar Sandoval wrote:
> > > > > > From: Omar Sandoval <osandov@fb.com>
> > > > > > 
> > > > > > For encoded writes, we need the raw pages for reading compressed data
> > > > > > directly via a bio.
> > > > > Perhaps:
> > > > > "For encoded writes, the existing btrfs_encoded_read*() functions expect a
> > > > > list of raw pages."
> > > > > 
> > > > > > I think it would be better to continue just vmalloc'ing a large contiguous
> > > > > > buffer and translating each page in the buffer into its raw page with
> > > > > > something like is_vmalloc_addr(data) ? vmalloc_to_page(data) :
> > > > > > virt_to_page(data). Vmalloc can request a higher-order allocation, which
> > > > > > probably doesn't matter but might slightly improve memory locality. And in
> > > > > > terms of readability, I somewhat like the elegance of having a single
> > > > > > identical kvmalloc call to allocate send_buf in both cases, even if we
> > > > > > do need to initialize the page list for some v2 commands.
> > > > 
> > > > I like this, but are we guaranteed that kvmalloc() will return a
> > > > page-aligned buffer? It seems reasonable to me that it would for
> > > > allocations of at least one page, but I can't find that written down
> > > > anywhere.
> > > 
> > > Since vmalloc allocates whole pages, and kmalloc guarantees alignment to the
> > > allocation size for powers of 2 sizes (and PAGE_SIZE is required to be a
> > > power of 2), I think that adds up to a guarantee of page alignment both
> > > ways?
> > > 
> > > https://elixir.bootlin.com/linux/v5.17.1/source/include/linux/slab.h#L522 :
> > > kmalloc: "For @size of power of two bytes, the alignment is also guaranteed
> > > to be at least to the size."
> > 
> > Our allocation size is ALIGN(SZ_16K + BTRFS_MAX_COMPRESSED, PAGE_SIZE),
> > which is 144K for PAGE_SIZE = 4k. If I interpret the kmalloc() comment
> > very literally, since this isn't a power of two, it's not guaranteed to
> > be aligned, right?
> 
> Ah, an excellent point.
> 
> Now that I think about it, the kmalloc path picks a slab to allocate from
> based on the log_2 of the size:
> https://elixir.bootlin.com/linux/v5.17.1/source/mm/slab_common.c#L733 so
> we'd end up wasting 128k-16k space using kmalloc, whether it's aligned or
> not, I think?
> 
> So maybe it should just always use vmalloc and get the page alignment?

Yeah, vmalloc()+vmalloc_to_page() is going to be more or less equivalent
to the vmap thing I'm doing here, but a lot cleaner. Replacing this
patch with the below patch seems to work:

diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index c0ca45dae6d6..e574d4f4a167 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -87,6 +87,7 @@ struct send_ctx {
 	 * command (since protocol v2, data must be the last attribute).
 	 */
 	bool put_data;
+	struct page **send_buf_pages;
 	u64 flags;	/* 'flags' member of btrfs_ioctl_send_args is u64 */
 	/* Protocol version compatibility requested */
 	u32 proto;
@@ -7486,12 +7487,32 @@ long btrfs_ioctl_send(struct inode *inode, struct btrfs_ioctl_send_args *arg)
 	sctx->clone_roots_cnt = arg->clone_sources_count;
 
 	if (sctx->proto >= 2) {
+		u32 send_buf_num_pages;
+
 		sctx->send_max_size = ALIGN(SZ_16K + BTRFS_MAX_COMPRESSED,
 					    PAGE_SIZE);
+		sctx->send_buf = vmalloc(sctx->send_max_size);
+		if (!sctx->send_buf) {
+			ret = -ENOMEM;
+			goto out;
+		}
+		send_buf_num_pages = sctx->send_max_size >> PAGE_SHIFT;
+		sctx->send_buf_pages = kcalloc(send_buf_num_pages,
+					       sizeof(*sctx->send_buf_pages),
+					       GFP_KERNEL);
+		if (!sctx->send_buf_pages) {
+			ret = -ENOMEM;
+			goto out;
+		}
+		for (i = 0; i < send_buf_num_pages; i++) {
+			sctx->send_buf_pages[i] =
+				vmalloc_to_page(sctx->send_buf +
+						(i << PAGE_SHIFT));
+		}
 	} else {
 		sctx->send_max_size = BTRFS_SEND_BUF_SIZE_V1;
+		sctx->send_buf = kvmalloc(sctx->send_max_size, GFP_KERNEL);
 	}
-	sctx->send_buf = kvmalloc(sctx->send_max_size, GFP_KERNEL);
 	if (!sctx->send_buf) {
 		ret = -ENOMEM;
 		goto out;
@@ -7684,6 +7705,7 @@ long btrfs_ioctl_send(struct inode *inode, struct btrfs_ioctl_send_args *arg)
 			fput(sctx->send_filp);
 
 		kvfree(sctx->clone_roots);
+		kfree(sctx->send_buf_pages);
 		kvfree(sctx->send_buf);
 
 		name_cache_free(sctx);
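
For reference, the indexing in the loop above can be modelled in user
space. This is only a sketch under stated assumptions: plain byte
pointers stand in for the struct page pointers that vmalloc_to_page()
returns, and the sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Number of whole pages in a buffer whose size is a multiple of the page size. */
static size_t sketch_num_pages(size_t size, unsigned int page_shift)
{
	assert((size & (((size_t)1 << page_shift) - 1)) == 0);
	return size >> page_shift;
}

/*
 * Record the start address of each page of a page-aligned buffer.
 * In the patch, vmalloc_to_page() is called on the same addresses.
 */
static void sketch_fill_pages(char *buf, size_t num_pages,
			      unsigned int page_shift, char **pages)
{
	size_t i;

	for (i = 0; i < num_pages; i++)
		pages[i] = buf + (i << page_shift);
}
```

With a page shift of 12 and the 144K buffer, sketch_num_pages() gives 36
entries, one per 4k page.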

* Re: [PATCH v14 5/7] btrfs: send: allocate send buffer with alloc_page() and vmap() for v2
  2022-03-30 20:42             ` Omar Sandoval
@ 2022-03-30 21:04               ` Sweet Tea Dorminy
  0 siblings, 0 replies; 29+ messages in thread
From: Sweet Tea Dorminy @ 2022-03-30 21:04 UTC (permalink / raw)
  To: Omar Sandoval; +Cc: linux-btrfs, kernel-team


> Yeah, vmalloc()+vmalloc_to_page() is going to be more or less equivalent
> to the vmap thing I'm doing here, but a lot cleaner. Replacing this
> patch with the below patch seems to work:
Looks great to me - thanks!

Thread overview: 29+ messages
2022-03-17 17:25 [PATCH v14 0/7] btrfs: add send/receive support for reading/writing compressed data Omar Sandoval
2022-03-17 17:25 ` [PATCH v14 1/7] btrfs: send: remove unused send_ctx::{total,cmd}_send_size Omar Sandoval
2022-03-17 17:25 ` [PATCH v14 2/7] btrfs: send: explicitly number commands and attributes Omar Sandoval
2022-03-24 17:52   ` Sweet Tea Dorminy
2022-03-17 17:25 ` [PATCH v14 3/7] btrfs: add send stream v2 definitions Omar Sandoval
2022-03-17 17:25 ` [PATCH v14 4/7] btrfs: send: write larger chunks when using stream v2 Omar Sandoval
2022-03-24 17:52   ` Sweet Tea Dorminy
2022-03-30 17:05     ` Omar Sandoval
2022-03-17 17:25 ` [PATCH v14 5/7] btrfs: send: allocate send buffer with alloc_page() and vmap() for v2 Omar Sandoval
2022-03-24 17:53   ` Sweet Tea Dorminy
2022-03-30 16:03     ` Omar Sandoval
2022-03-30 16:33       ` Sweet Tea Dorminy
2022-03-30 17:13         ` Omar Sandoval
2022-03-30 18:48           ` Sweet Tea Dorminy
2022-03-30 20:42             ` Omar Sandoval
2022-03-30 21:04               ` Sweet Tea Dorminy
2022-03-17 17:25 ` [PATCH v14 6/7] btrfs: send: send compressed extents with encoded writes Omar Sandoval
2022-03-17 17:25 ` [PATCH v14 7/7] btrfs: send: enable support for stream v2 and compressed writes Omar Sandoval
2022-03-17 17:25 ` [PATCH v14 01/10] btrfs-progs: receive: support v2 send stream larger tlv_len Omar Sandoval
2022-03-17 17:25 ` [PATCH v14 02/10] btrfs-progs: receive: dynamically allocate sctx->read_buf Omar Sandoval
2022-03-17 17:25 ` [PATCH v14 03/10] btrfs-progs: receive: support v2 send stream DATA tlv format Omar Sandoval
2022-03-17 17:25 ` [PATCH v14 04/10] btrfs-progs: receive: add send stream v2 cmds and attrs to send.h Omar Sandoval
2022-03-17 17:25 ` [PATCH v14 05/10] btrfs-progs: receive: process encoded_write commands Omar Sandoval
2022-03-17 17:25 ` [PATCH v14 06/10] btrfs-progs: receive: encoded_write fallback to explicit decode and write Omar Sandoval
2022-03-17 17:25 ` [PATCH v14 07/10] btrfs-progs: receive: process fallocate commands Omar Sandoval
2022-03-17 17:25 ` [PATCH v14 08/10] btrfs-progs: receive: process setflags ioctl commands Omar Sandoval
2022-03-17 17:25 ` [PATCH v14 09/10] btrfs-progs: send: stream v2 ioctl flags Omar Sandoval
2022-03-17 17:25 ` [PATCH v14 10/10] btrfs-progs: receive: add tests for basic encoded_write send/receive Omar Sandoval
2022-03-24 17:53 ` [PATCH v14 0/7] btrfs: add send/receive support for reading/writing compressed data Sweet Tea Dorminy