linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Fabio M. De Francesco" <fmdefrancesco@gmail.com>
To: linux-btrfs@vger.kernel.org, Qu Wenruo <wqu@suse.com>
Cc: ira.weiny@intel.com
Subject: Re: [PATCH] btrfs: zlib: refactor how we prepare the input buffer
Date: Sat, 18 Jun 2022 08:14:26 +0200	[thread overview]
Message-ID: <2326236.NG923GbCHz@opensuse> (raw)
In-Reply-To: <d0bfc791b5509df7b9ad44e41ada197d1b3149b3.1655519730.git.wqu@suse.com>

On sabato 18 giugno 2022 04:39:28 CEST Qu Wenruo wrote:
> Inspired by recent kmap() change from Fabio M. De Francesco.
> 

Thanks!

>
> There are some weird behavior in zlib_compress_pages(), mostly around how
> we prepare the input buffer.
> 
> [BEFORE]
> - We hold a page mapped for a long time
>   This is making it much harder to convert kmap() to kmap_local_page(),
>   as such long mapped page can lead to nested mapped page.
> 
> - Different paths in the name of "optimization"
>   When we ran out of input buffer, we will grab the new input with two
>   different paths:
> 
>   * If there are more than one pages left, we copy the content into the
>     input buffer.
>     This behavior is introduced mostly for S390, as that arch needs
>     multiple pages as input buffer for hardware decompression.
> 
>   * If there is only one page left, we use that page from page cache
>     directly without copying the content.
> 
>   This is making page map/unmap much harder, especially due the latter
>   case.
> 
> [AFTER]
> This patch will change the behavior by introducing a new helper, to
> fulfill the input buffer:
> 
> - Only map one page when we do the content copy
> 
> - Unified path, by always copying the page content into workspace
>   input buffer
>   Yes, we're doing extra page copying. But the original optimization
>   only work for the last page of the input range.
> 
>   Thus I'd say the sacrifice is already not that big.
> 
> - Use kmap_local_page() and kunmap_local() instead
>   Now the lifespan for the mapped page is only during memcpy() call,
>   we're definitely fine to use kmap_local_page()/kunmap_local().
> 
> Cc: Fabio M. De Francesco <fmdefrancesco@gmail.com>
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
> Only tested on x86_64 for the correctness of the new helper.
>
> 
> But considering how small the window we need the page to be mapped, I
> think it should also work for x86 without any problem.
>

This patch passed 26/26 "compress" group tests of (x)fstests on a 32-bit 
QEMU + KVM VM (two tests were skipped because they need 5 or more disks, 
but I don't have enough free space).

Tested-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>

tweed32:/usr/lib/xfstests # ./check -g compress
FSTYP         -- btrfs
PLATFORM      -- Linux/i686 tweed32 5.19.0-rc2-vanilla-debug+ #46 SMP 
PREEMPT_DYNAMIC Sat Jun 18 07:30:28 CEST 2022
MKFS_OPTIONS  -- /dev/loop1
MOUNT_OPTIONS -- /dev/loop1 /mnt/scratch

btrfs/024 4s ...  3s
btrfs/026 8s ...  6s
btrfs/037 5s ...  3s
btrfs/038 3s ...  3s
btrfs/041 4s ...  2s
btrfs/062 47s ...  40s
btrfs/063 26s ...  22s
btrfs/067 44s ...  39s
btrfs/068 17s ...  13s
btrfs/070	[not run] btrfs and this test needs 5 or more disks in 
SCRATCH_DEV_POOL
btrfs/071	[not run] btrfs and this test needs 5 or more disks in 
SCRATCH_DEV_POOL
btrfs/072 46s ...  41s
btrfs/073 21s ...  22s
btrfs/074 48s ...  41s
btrfs/076 3s ...  3s
btrfs/103 3s ...  3s
btrfs/106 3s ...  3s
btrfs/109 4s ...  3s
btrfs/113 4s ...  3s
btrfs/138 63s ...  53s
btrfs/149 4s ...  3s
btrfs/183 4s ...  3s
btrfs/205 4s ...  3s
btrfs/234 5s ...  4s
btrfs/246 3s ...  3s
btrfs/251 3s ...  2s
Ran: btrfs/024 btrfs/026 btrfs/037 btrfs/038 btrfs/041 btrfs/062 btrfs/063 
btrfs/067 btrfs/068 btrfs/070 btrfs/071 btrfs/072 btrfs/073 btrfs/074 
btrfs/076 btrfs/103 btrfs/106 btrfs/109 btrfs/113 btrfs/138 btrfs/149 
btrfs/183 btrfs/205 btrfs/234 btrfs/246 btrfs/251
Not run: btrfs/070 btrfs/071
Passed all 26 tests

>
> ---
>  fs/btrfs/zlib.c | 85 ++++++++++++++++++++++++-------------------------
>  1 file changed, 41 insertions(+), 44 deletions(-)
> 
> diff --git a/fs/btrfs/zlib.c b/fs/btrfs/zlib.c
> index 767a0c6c9694..2cd4f6fb1537 100644
> --- a/fs/btrfs/zlib.c
> +++ b/fs/btrfs/zlib.c
> @@ -91,20 +91,54 @@ struct list_head *zlib_alloc_workspace(unsigned int 
level)
>  	return ERR_PTR(-ENOMEM);
>  }
>  
> +/*
> + * Copy the content from page cache into @workspace->buf.
> + *
> + * @total_in:		The original total input length.
> + * @fileoff_ret:	The file offset.
> + *			Will be increased by the number of bytes we 
read.
> + */
> +static void fill_input_buffer(struct workspace *workspace,
> +			      struct address_space *mapping,
> +			      unsigned long total_in, u64 
*fileoff_ret)
> +{
> +	unsigned long bytes_left = total_in - workspace->strm.total_in;
> +	const int input_pages = min(DIV_ROUND_UP(bytes_left, PAGE_SIZE),
> +				    workspace->buf_size / 
PAGE_SIZE);
> +	u64 file_offset = *fileoff_ret;
> +	int i;
> +
> +	/* Copy the content of each page into the input buffer. */
> +	for (i = 0; i < input_pages; i++) {
> +		struct page *in_page;
> +		void *addr;
> +
> +		in_page = find_get_page(mapping, file_offset >> 
PAGE_SHIFT);
> +
> +		addr = kmap_local_page(in_page);
> +		memcpy(workspace->buf + i * PAGE_SIZE, addr, 
PAGE_SIZE);
> +		kunmap_local(addr);
> +
> +		put_page(in_page);
> +		file_offset += PAGE_SIZE;
> +	}
> +	*fileoff_ret = file_offset;
> +	workspace->strm.next_in = workspace->buf;
> +	workspace->strm.avail_in = min_t(unsigned long, bytes_left,
> +					 workspace->buf_size);
> +}
> +
>  int zlib_compress_pages(struct list_head *ws, struct address_space 
*mapping,
>  		u64 start, struct page **pages, unsigned long 
*out_pages,
>  		unsigned long *total_in, unsigned long *total_out)
>  {
>  	struct workspace *workspace = list_entry(ws, struct workspace, 
list);
> +	/* Total input length. */
> +	const unsigned long len = *total_out;
>  	int ret;
> -	char *data_in;
>  	char *cpage_out;
>  	int nr_pages = 0;
> -	struct page *in_page = NULL;
>  	struct page *out_page = NULL;
> -	unsigned long bytes_left;
> -	unsigned int in_buf_pages;
> -	unsigned long len = *total_out;
>  	unsigned long nr_dest_pages = *out_pages;
>  	const unsigned long max_out = nr_dest_pages * PAGE_SIZE;
>  
> @@ -140,40 +174,8 @@ int zlib_compress_pages(struct list_head *ws, struct 
address_space *mapping,
>  		 * Get next input pages and copy the contents to
>  		 * the workspace buffer if required.
>  		 */
> -		if (workspace->strm.avail_in == 0) {
> -			bytes_left = len - workspace->strm.total_in;
> -			in_buf_pages = min(DIV_ROUND_UP(bytes_left, 
PAGE_SIZE),
> -					   workspace->buf_size / 
PAGE_SIZE);
> -			if (in_buf_pages > 1) {
> -				int i;
> -
> -				for (i = 0; i < in_buf_pages; i++) 
{
> -					if (in_page) {
> -						
kunmap(in_page);
> -						
put_page(in_page);
> -					}
> -					in_page = 
find_get_page(mapping,
> -								
start >> PAGE_SHIFT);
> -					data_in = kmap(in_page);
> -					memcpy(workspace->buf + i 
* PAGE_SIZE,
> -					       data_in, 
PAGE_SIZE);
> -					start += PAGE_SIZE;
> -				}
> -				workspace->strm.next_in = 
workspace->buf;
> -			} else {
> -				if (in_page) {
> -					kunmap(in_page);
> -					put_page(in_page);
> -				}
> -				in_page = find_get_page(mapping,
> -							start 
>> PAGE_SHIFT);
> -				data_in = kmap(in_page);
> -				start += PAGE_SIZE;
> -				workspace->strm.next_in = data_in;
> -			}
> -			workspace->strm.avail_in = min(bytes_left,
> -						       
(unsigned long) workspace->buf_size);
> -		}
> +		if (workspace->strm.avail_in == 0)
> +			fill_input_buffer(workspace, mapping, len, 
&start);
>  
>  		ret = zlib_deflate(&workspace->strm, Z_SYNC_FLUSH);
>  		if (ret != Z_OK) {
> @@ -266,11 +268,6 @@ int zlib_compress_pages(struct list_head *ws, struct 
address_space *mapping,
>  	*out_pages = nr_pages;
>  	if (out_page)
>  		kunmap(out_page);
> -
> -	if (in_page) {
> -		kunmap(in_page);
> -		put_page(in_page);
> -	}
>  	return ret;
>  }
>  
> -- 
> 2.36.1
> 
Good job!

With your patch, the logic of zlib_compress_pages() is much more 
understandable for people unfamiliar with this code.

Reviewed-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>

As a side effect (desired and important to me), I can now easily convert 
the remaining kmap() call sites in zlib.c.

Thanks again,

Fabio

PS: I'm adding Ira Weiny to the list of recipients.



  reply	other threads:[~2022-06-18  6:14 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-18  2:39 [PATCH] btrfs: zlib: refactor how we prepare the input buffer Qu Wenruo
2022-06-18  6:14 ` Fabio M. De Francesco [this message]
2022-06-20 16:08 ` David Sterba
2022-06-21  0:40   ` Qu Wenruo
2022-06-21  1:43     ` Qu Wenruo
2022-06-20 21:38 ` Ira Weiny

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2326236.NG923GbCHz@opensuse \
    --to=fmdefrancesco@gmail.com \
    --cc=ira.weiny@intel.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=wqu@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).