From: Omar Sandoval <osandov@osandov.com>
To: linux-btrfs@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org
Subject: [PATCH 0/9] btrfs: implement send/receive of compressed extents without decompressing
Date: Fri, 21 Aug 2020 00:39:50 -0700
Message-ID: <cover.1597994106.git.osandov@osandov.com>

This series uses the interface added in "fs: interface for directly
reading/writing compressed data" to send and receive compressed data
without wastefully decompressing and recompressing it. It does so by:

1. Bumping the send stream protocol version to 2.
2. Adding a new command, BTRFS_SEND_C_ENCODED_WRITE, which indicates a
   write using the new encoded I/O interface, along with its associated
   attributes.
3. Sending compressed extents with BTRFS_SEND_C_ENCODED_WRITE when
   requested by the user.
4. Falling back to decompressing and writing the decompressed data if
   encoded I/O fails.

Benchmarks
==========

I ran some benchmarks on send and receive of a zstd (level 3) compressed
snapshot of a server's root filesystem which is about 23GB when
compressed and 50GB when decompressed.

Send v1:
0.41user 81.97system 2:21.71elapsed 58%CPU (0avgtext+0avgdata 2900maxresident)k
47182656inputs+0outputs (10major+119minor)pagefaults 0swaps

Send compressed:
0.43user 60.53system 2:20.62elapsed 43%CPU (0avgtext+0avgdata 2836maxresident)k
47778864inputs+0outputs (8major+117minor)pagefaults 0swaps

In this case, the bottleneck for send is reading the metadata trees and
data from disk, so there's not much of a wall time improvement, but
since the kernel doesn't have to decompress the data in the compressed
case, it uses significantly less CPU and system time.

Receive v1 into a filesystem with compress=none:
15.58user 62.36system 7:34.44elapsed 17%CPU (0avgtext+0avgdata 3028maxresident)k
104719648inputs+105333248outputs (1major+140minor)pagefaults 0swaps

Receive v1 into a filesystem with compress-force=zstd:
15.45user 63.99system 5:11.57elapsed 25%CPU (0avgtext+0avgdata 3100maxresident)k
104587240inputs+105379328outputs (1major+143minor)pagefaults 0swaps

Receive compressed into a filesystem with compress-force=zstd:
7.95user 44.53system 3:42.79elapsed 23%CPU (0avgtext+0avgdata 2992maxresident)k
46909600inputs+21603216outputs (2major+176minor)pagefaults 0swaps

Without compressed receive, recompressing the data is still a wall time
win because it requires much less I/O. However, compressed receive
reduces the wall time even further.
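
To put numbers on the comparisons above, the short Python sketch below
converts the time(1) elapsed fields to seconds and computes the relative
savings. It only uses figures already quoted in this message; the helper
names `seconds` and `saving` are made up for the illustration.

```python
def seconds(elapsed):
    """Convert a time(1) elapsed field such as "7:34.44" to seconds."""
    minutes, secs = elapsed.split(":")
    return int(minutes) * 60 + float(secs)


def saving(before, after):
    """Fractional reduction when going from `before` to `after`."""
    return (before - after) / before


# Elapsed times copied from the three receive runs above.
v1_none = seconds("7:34.44")      # v1 into compress=none
v1_zstd = seconds("5:11.57")      # v1 into compress-force=zstd
compressed = seconds("3:42.79")   # compressed into compress-force=zstd

print(f"receive wall time vs. v1, compress=none:       {saving(v1_none, compressed):.0%}")
print(f"receive wall time vs. v1, compress-force=zstd: {saving(v1_zstd, compressed):.0%}")
# System time of the two send runs (81.97s for v1, 60.53s for compressed).
print(f"send system time vs. v1:                       {saving(81.97, 60.53):.0%}")
```

On these figures, compressed receive takes roughly half the wall time of
a v1 receive into an uncompressed filesystem and a bit over a quarter
less than a v1 receive with recompression; compressed send spends about
a quarter less system time.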

The v1 send stream is 50GB, and the v2 send stream is 23GB. The v1 send
stream compresses down to 17GB with zstd (level 3), so compressed send
gets pretty close with no extra CPU overhead. Compressed send is still
larger because we compress extents individually, which does not compress
as efficiently as compressing the entire filesystem representation in
one go.

# ls -lh v1.send v1.send.zst compressed.send
-rw-r--r-- 1 root root 23G Aug 17 12:34 compressed.send
-rw-r--r-- 1 root root 50G Aug 17 12:13 v1.send
-rw-r--r-- 1 root root 17G Aug 17 12:28 v1.send.zst

Protocol Updates
================

This series makes some changes to the send stream protocol beyond adding
the encoded write command/attributes and bumping the version. Namely, v1
has a 64k limit on the size of a write due to the 16-bit attribute
length. This is not enough for encoded writes, as compressed extents may
be up to 128k and cannot be split up. To address this, the
BTRFS_SEND_A_DATA attribute is treated specially in v2: its length is
implicitly the remaining length of the command (which has a 32-bit
length). This was the least bad of the options I considered.
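
As a rough illustration of the special case, the Python sketch below
parses the attributes of a single command payload. It is not the
btrfs-progs parser. It assumes that attributes are laid out as a
little-endian u16 type followed by a u16 length, that in v2 the DATA
attribute simply omits the length field, and that BTRFS_SEND_A_DATA has
the value 19; all three should be checked against send.h. The function
name `parse_attrs` is made up.

```python
import struct

# Assumed attribute number for BTRFS_SEND_A_DATA; verify against send.h.
SEND_A_DATA = 19


def parse_attrs(payload, version):
    """Split one command payload into a list of (type, value) pairs.

    Every attribute starts with a little-endian u16 type, normally
    followed by a u16 length. For v2, DATA is assumed to carry no length
    field: its value is whatever remains of the command payload.
    """
    attrs = []
    pos = 0
    while pos < len(payload):
        (tlv_type,) = struct.unpack_from("<H", payload, pos)
        pos += 2
        if version >= 2 and tlv_type == SEND_A_DATA:
            # Implicit length: everything up to the end of the command.
            attrs.append((tlv_type, payload[pos:]))
            break
        (tlv_len,) = struct.unpack_from("<H", payload, pos)
        pos += 2
        if pos + tlv_len > len(payload):
            raise ValueError("attribute overruns command")
        attrs.append((tlv_type, payload[pos:pos + tlv_len]))
        pos += tlv_len
    return attrs
```

With this layout DATA has to be the last attribute of a command, and a
128k compressed extent fits because only the 32-bit command length
bounds it.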

There are other commands that we've been wanting to add to the protocol:
fallocate and FS_IOC_SETFLAGS. This series reserves their command and
attribute numbers but does not implement kernel support for emitting
them. However, it does implement support in receive for them, so the
kernel can start emitting those whenever we get around to implementing
them.

Interface
=========

For the send ioctl, stream version 2 is opt-in, and compressed writes
are a separate opt-in that depends on stream version 2.

Accordingly, `btrfs send` now accepts a `--stream-version` option and a
`--compressed` option; the latter implies `--stream-version 2`.

`btrfs receive` also accepts a `--force-decompress` option that forces
the fallback to decompressing and writing the decompressed data.

These options are provided to give the user flexibility in case they
don't want their receiving filesystem to be compressed.
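
The receive-side behaviour described above (try the encoded write, fall
back to decompressing, or skip straight to the fallback with
`--force-decompress`) can be pictured roughly as follows. This is a
Python sketch, not the btrfs-progs C code: `encoded_write` and
`plain_write` are hypothetical callables standing in for the encoded I/O
interface and an ordinary write, and zlib stands in for whichever
algorithm actually compressed the extent.

```python
import zlib


def receive_encoded_write(encoded, encoded_write, plain_write,
                          force_decompress=False):
    """Write one compressed extent, falling back to a plain write.

    Returns "encoded" or "decompressed" to say which path was taken.
    Both callables are placeholders for the real I/O paths.
    """
    if not force_decompress:
        try:
            encoded_write(encoded)
            return "encoded"
        except OSError:
            # Encoded I/O was unavailable or refused; fall through.
            pass
    plain_write(zlib.decompress(encoded))
    return "decompressed"
```

Passing `force_decompress=True` models `--force-decompress`: the encoded
path is never attempted, so the receiving filesystem applies its own
compression settings to the decompressed data.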

Patches
=======

The kernel patches are based on kdave/misc-next plus my "fs: interface
for directly reading/writing compressed data" series. Patches 1-3 are
improvements to the generic send code.  Patches 4-7 do some preparation
for stream v2 and compressed send. Patch 8 implements compressed send.
Patch 9 modifies the ioctl to accept the new flags and enable the new
feature.

Omar Sandoval (9):
  btrfs: send: get rid of i_size logic in send_write()
  btrfs: send: avoid copying file data
  btrfs: send: use btrfs_file_extent_end() in send_write_or_clone()
  btrfs: add send_stream_version attribute to sysfs
  btrfs: add send stream v2 definitions
  btrfs: send: write larger chunks when using stream v2
  btrfs: send: allocate send buffer with alloc_page() and vmap() for v2
  btrfs: send: send compressed extents with encoded writes
  btrfs: send: enable support for stream v2 and compressed writes

 fs/btrfs/ctree.h           |   4 +
 fs/btrfs/inode.c           |   6 +-
 fs/btrfs/send.c            | 419 ++++++++++++++++++++++++++++---------
 fs/btrfs/send.h            |  33 ++-
 fs/btrfs/sysfs.c           |   9 +
 include/uapi/linux/btrfs.h |  17 +-
 6 files changed, 384 insertions(+), 104 deletions(-)

The btrfs-progs patches were written by Boris Burkov. Patches 1-5 are
preparation. Patch 6 implements encoded writes. Patch 7 implements the
fallback to decompressing. Patches 8-9 implement the other commands.
Patch 10 adds the new `btrfs send` options. Patch 11 adds a test case.

Boris Burkov (11):
  btrfs-progs: receive: support v2 send stream larger tlv_len
  btrfs-progs: receive: dynamically allocate sctx->read_buf
  btrfs-progs: receive: support v2 send stream DATA tlv format
  btrfs-progs: receive: add send stream v2 cmds and attrs to send.h
  btrfs-progs: receive: add stub implementation for pwritev2
  btrfs-progs: receive: process encoded_write commands
  btrfs-progs: receive: encoded_write fallback to explicit decode and
    write
  btrfs-progs: receive: process fallocate commands
  btrfs-progs: receive: process setflags ioctl commands
  btrfs-progs: send: stream v2 ioctl flags
  btrfs-progs: receive: add tests for basic encoded_write send/receive

 Makefile                                      |   4 +-
 cmds/receive-dump.c                           |  31 +-
 cmds/receive.c                                | 402 +++++++++++++++++-
 cmds/send.c                                   |  39 +-
 common/send-stream.c                          | 159 +++++--
 common/send-stream.h                          |   7 +
 configure.ac                                  |   1 +
 ioctl.h                                       |  17 +-
 libbtrfsutil/btrfs.h                          |  17 +-
 send.h                                        |  19 +-
 stubs.c                                       |  24 ++
 stubs.h                                       |  50 +++
 .../040-receive-write-encoded/test.sh         | 114 +++++
 13 files changed, 832 insertions(+), 52 deletions(-)
 create mode 100644 stubs.c
 create mode 100644 stubs.h
 create mode 100755 tests/misc-tests/040-receive-write-encoded/test.sh

Thanks!

-- 
2.28.0



Thread overview: 37+ messages
2020-08-21  7:39 Omar Sandoval [this message]
2020-08-21  7:39 ` [PATCH 1/9] btrfs: send: get rid of i_size logic in send_write() Omar Sandoval
2020-08-21 17:26   ` Filipe Manana
2020-08-24 17:39   ` Josef Bacik
2020-08-21  7:39 ` [PATCH 2/9] btrfs: send: avoid copying file data Omar Sandoval
2020-08-21 17:29   ` Filipe Manana
2020-08-24 21:34     ` Omar Sandoval
2020-08-24 17:47   ` Josef Bacik
2020-09-11 14:13   ` David Sterba
2020-09-14 22:04     ` Omar Sandoval
2020-09-15  8:14       ` David Sterba
2020-08-21  7:39 ` [PATCH 3/9] btrfs: send: use btrfs_file_extent_end() in send_write_or_clone() Omar Sandoval
2020-08-21 17:30   ` Filipe Manana
2020-08-21  7:39 ` [PATCH 4/9] btrfs: add send_stream_version attribute to sysfs Omar Sandoval
2020-08-21  7:39 ` [PATCH 5/9] btrfs: add send stream v2 definitions Omar Sandoval
2020-08-24 17:49   ` Josef Bacik
2020-08-21  7:39 ` [PATCH 6/9] btrfs: send: write larger chunks when using stream v2 Omar Sandoval
2020-08-24 17:57   ` Josef Bacik
2020-08-21  7:39 ` [PATCH 7/9] btrfs: send: allocate send buffer with alloc_page() and vmap() for v2 Omar Sandoval
2020-08-21  7:39 ` [PATCH 8/9] btrfs: send: send compressed extents with encoded writes Omar Sandoval
2020-08-24 17:32   ` Josef Bacik
2020-08-24 17:52     ` Omar Sandoval
2020-08-21  7:39 ` [PATCH 9/9] btrfs: send: enable support for stream v2 and compressed writes Omar Sandoval
2020-08-21  7:40 ` [PATCH 01/11] btrfs-progs: receive: support v2 send stream larger tlv_len Omar Sandoval
2020-08-21  7:40 ` [PATCH 02/11] btrfs-progs: receive: dynamically allocate sctx->read_buf Omar Sandoval
2020-08-21  7:40 ` [PATCH 03/11] btrfs-progs: receive: support v2 send stream DATA tlv format Omar Sandoval
2020-08-21  7:40 ` [PATCH 04/11] btrfs-progs: receive: add send stream v2 cmds and attrs to send.h Omar Sandoval
2020-08-21  7:40 ` [PATCH 05/11] btrfs-progs: receive: add stub implementation for pwritev2 Omar Sandoval
2020-08-21  7:40 ` [PATCH 06/11] btrfs-progs: receive: process encoded_write commands Omar Sandoval
2020-08-21  7:40 ` [PATCH 07/11] btrfs-progs: receive: encoded_write fallback to explicit decode and write Omar Sandoval
2020-08-21  7:40 ` [PATCH 08/11] btrfs-progs: receive: process fallocate commands Omar Sandoval
2020-08-21  7:40 ` [PATCH 09/11] btrfs-progs: receive: process setflags ioctl commands Omar Sandoval
2020-08-21  7:40 ` [PATCH 10/11] btrfs-progs: send: stream v2 ioctl flags Omar Sandoval
2020-08-21  7:40 ` [PATCH 11/11] btrfs-progs: receive: add tests for basic encoded_write send/receive Omar Sandoval
2020-08-24 19:57 ` [PATCH 0/9] btrfs: implement send/receive of compressed extents without decompressing David Sterba
2020-08-24 22:16   ` Omar Sandoval
2020-09-10 11:28 ` David Sterba
