linux-btrfs.vger.kernel.org archive mirror
From: Ira Weiny <ira.weiny@intel.com>
To: Qu Wenruo <wqu@suse.com>
Cc: linux-btrfs@vger.kernel.org,
	"Fabio M . De Francesco" <fmdefrancesco@gmail.com>
Subject: Re: [PATCH] btrfs: zlib: refactor how we prepare the input buffer
Date: Mon, 20 Jun 2022 14:38:30 -0700
Message-ID: <YrDo1qgcXEKSQM7l@iweiny-desk3>
In-Reply-To: <d0bfc791b5509df7b9ad44e41ada197d1b3149b3.1655519730.git.wqu@suse.com>

On Sat, Jun 18, 2022 at 10:39:28AM +0800, Qu Wenruo wrote:
> Inspired by recent kmap() change from Fabio M. De Francesco.
> 
> There is some weird behavior in zlib_compress_pages(), mostly around how
> we prepare the input buffer.
> 
> [BEFORE]
> - We hold a page mapped for a long time
>   This makes it much harder to convert kmap() to kmap_local_page(),
>   as such a long-lived mapping can lead to nested mappings.
> 
> - Different paths in the name of "optimization"
>   When we run out of input buffer, we grab the new input via one of two
>   different paths:
> 
>   * If there is more than one page left, we copy the content into the
>     input buffer.
>     This behavior was introduced mostly for S390, as that arch needs
>     multiple pages as input buffer for hardware decompression.
> 
>   * If there is only one page left, we use that page from page cache
>     directly without copying the content.
> 
>   This makes page map/unmap much harder, especially due to the latter
>   case.
> 
> [AFTER]
> This patch changes the behavior by introducing a new helper to
> fill the input buffer:
> 
> - Only map one page when we do the content copy
> 
> - Unified path, by always copying the page content into workspace
>   input buffer
>   Yes, we're doing extra page copying. But the original optimization
>   only works for the last page of the input range.
>
>   Thus I'd say the sacrifice is not that big.
> 
> - Use kmap_local_page() and kunmap_local() instead
>   Now the page is only mapped for the duration of the memcpy() call,
>   so we're definitely fine to use kmap_local_page()/kunmap_local().

Thanks!  This helps a lot.  Minor issue below.

[snip]

> +static void fill_input_buffer(struct workspace *workspace,
> +			      struct address_space *mapping,
> +			      unsigned long total_in, u64 *fileoff_ret)
> +{
> +	unsigned long bytes_left = total_in - workspace->strm.total_in;
> +	const int input_pages = min(DIV_ROUND_UP(bytes_left, PAGE_SIZE),
> +				    workspace->buf_size / PAGE_SIZE);
> +	u64 file_offset = *fileoff_ret;
> +	int i;
> +
> +	/* Copy the content of each page into the input buffer. */
> +	for (i = 0; i < input_pages; i++) {
> +		struct page *in_page;
> +		void *addr;
> +
> +		in_page = find_get_page(mapping, file_offset >> PAGE_SHIFT);
> +
> +		addr = kmap_local_page(in_page);
> +		memcpy(workspace->buf + i * PAGE_SIZE, addr, PAGE_SIZE);
> +		kunmap_local(addr);

This should be memcpy_from_page().
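
A minimal userspace sketch of what that substitution might look like. The
memcpy_from_page() signature mirrors the kernel helper (destination, page,
offset, length), but struct page, kmap_local_page() and kunmap_local() here
are mock stand-ins, PAGE_SIZE is assumed to be 4 KiB, and copy_page_to_buf()
is a hypothetical name used only for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE 4096UL	/* assumption: 4 KiB pages */

/* Mock stand-in for the kernel's struct page (illustration only). */
struct page {
	char data[PAGE_SIZE];
};

/* Mock mapping helpers; the kernel versions set up a short-lived mapping. */
static void *kmap_local_page(struct page *page)
{
	return page->data;
}

static void kunmap_local(const void *addr)
{
	(void)addr;
}

/* Same shape as the kernel helper: map, copy out of the page, unmap. */
static void memcpy_from_page(char *to, struct page *page,
			     size_t offset, size_t len)
{
	char *from = kmap_local_page(page);

	assert(offset + len <= PAGE_SIZE);
	memcpy(to, from + offset, len);
	kunmap_local(from);
}

/* Hypothetical helper: copy the i-th input page into the workspace buffer. */
static void copy_page_to_buf(char *buf, int i, struct page *in_page)
{
	memcpy_from_page(buf + i * PAGE_SIZE, in_page, 0, PAGE_SIZE);
}
```

In the patch itself, the three lines quoted above would presumably collapse
into a single memcpy_from_page(workspace->buf + i * PAGE_SIZE, in_page, 0,
PAGE_SIZE) call, with no local addr variable needed.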

Ira

> +
> +		put_page(in_page);
> +		file_offset += PAGE_SIZE;
> +	}
> +	*fileoff_ret = file_offset;
> +	workspace->strm.next_in = workspace->buf;
> +	workspace->strm.avail_in = min_t(unsigned long, bytes_left,
> +					 workspace->buf_size);
> +}
> +
>  int zlib_compress_pages(struct list_head *ws, struct address_space *mapping,
>  		u64 start, struct page **pages, unsigned long *out_pages,
>  		unsigned long *total_in, unsigned long *total_out)
>  {
>  	struct workspace *workspace = list_entry(ws, struct workspace, list);
> +	/* Total input length. */
> +	const unsigned long len = *total_out;
>  	int ret;
> -	char *data_in;
>  	char *cpage_out;
>  	int nr_pages = 0;
> -	struct page *in_page = NULL;
>  	struct page *out_page = NULL;
> -	unsigned long bytes_left;
> -	unsigned int in_buf_pages;
> -	unsigned long len = *total_out;
>  	unsigned long nr_dest_pages = *out_pages;
>  	const unsigned long max_out = nr_dest_pages * PAGE_SIZE;
>  
> @@ -140,40 +174,8 @@ int zlib_compress_pages(struct list_head *ws, struct address_space *mapping,
>  		 * Get next input pages and copy the contents to
>  		 * the workspace buffer if required.
>  		 */
> -		if (workspace->strm.avail_in == 0) {
> -			bytes_left = len - workspace->strm.total_in;
> -			in_buf_pages = min(DIV_ROUND_UP(bytes_left, PAGE_SIZE),
> -					   workspace->buf_size / PAGE_SIZE);
> -			if (in_buf_pages > 1) {
> -				int i;
> -
> -				for (i = 0; i < in_buf_pages; i++) {
> -					if (in_page) {
> -						kunmap(in_page);
> -						put_page(in_page);
> -					}
> -					in_page = find_get_page(mapping,
> -								start >> PAGE_SHIFT);
> -					data_in = kmap(in_page);
> -					memcpy(workspace->buf + i * PAGE_SIZE,
> -					       data_in, PAGE_SIZE);
> -					start += PAGE_SIZE;
> -				}
> -				workspace->strm.next_in = workspace->buf;
> -			} else {
> -				if (in_page) {
> -					kunmap(in_page);
> -					put_page(in_page);
> -				}
> -				in_page = find_get_page(mapping,
> -							start >> PAGE_SHIFT);
> -				data_in = kmap(in_page);
> -				start += PAGE_SIZE;
> -				workspace->strm.next_in = data_in;
> -			}
> -			workspace->strm.avail_in = min(bytes_left,
> -						       (unsigned long) workspace->buf_size);
> -		}
> +		if (workspace->strm.avail_in == 0)
> +			fill_input_buffer(workspace, mapping, len, &start);
>  
>  		ret = zlib_deflate(&workspace->strm, Z_SYNC_FLUSH);
>  		if (ret != Z_OK) {
> @@ -266,11 +268,6 @@ int zlib_compress_pages(struct list_head *ws, struct address_space *mapping,
>  	*out_pages = nr_pages;
>  	if (out_page)
>  		kunmap(out_page);
> -
> -	if (in_page) {
> -		kunmap(in_page);
> -		put_page(in_page);
> -	}
>  	return ret;
>  }
>  
> -- 
> 2.36.1
> 
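
For reference, the sizing arithmetic in the quoted fill_input_buffer() can be
modeled in userspace roughly as below. This is only a sketch: refill_pages()
and refill_avail_in() are hypothetical names, PAGE_SIZE is assumed to be
4 KiB, and DIV_ROUND_UP is redefined locally rather than taken from the
kernel headers.

```c
#include <stddef.h>

#define PAGE_SIZE 4096UL	/* assumption: 4 KiB pages */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Pages copied in one refill: bounded by the data still to be consumed
 * and by how many whole pages fit in the workspace buffer.
 */
static unsigned long refill_pages(unsigned long bytes_left,
				  unsigned long buf_size)
{
	unsigned long need = DIV_ROUND_UP(bytes_left, PAGE_SIZE);
	unsigned long cap = buf_size / PAGE_SIZE;

	return need < cap ? need : cap;
}

/* Bytes made available to the compressor after the refill. */
static unsigned long refill_avail_in(unsigned long bytes_left,
				     unsigned long buf_size)
{
	return bytes_left < buf_size ? bytes_left : buf_size;
}
```

With a four-page (16384-byte) buffer and 10000 bytes left, this gives 3 pages
copied and 10000 bytes available; with 20000 bytes left it gives 4 pages and
16384 bytes. Each page is copied in full, and the byte count is what limits
how much of the last page the compressor actually reads.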

Thread overview: 6+ messages
2022-06-18  2:39 [PATCH] btrfs: zlib: refactor how we prepare the input buffer Qu Wenruo
2022-06-18  6:14 ` Fabio M. De Francesco
2022-06-20 16:08 ` David Sterba
2022-06-21  0:40   ` Qu Wenruo
2022-06-21  1:43     ` Qu Wenruo
2022-06-20 21:38 ` Ira Weiny [this message]
