From: David Sterba <dsterba@suse.cz>
To: Qu Wenruo <wqu@suse.com>
Cc: linux-btrfs@vger.kernel.org,
"Fabio M . De Francesco" <fmdefrancesco@gmail.com>
Subject: Re: [PATCH] btrfs: zlib: refactor how we prepare the input buffer
Date: Mon, 20 Jun 2022 18:08:06 +0200 [thread overview]
Message-ID: <20220620160806.GS20633@twin.jikos.cz> (raw)
In-Reply-To: <d0bfc791b5509df7b9ad44e41ada197d1b3149b3.1655519730.git.wqu@suse.com>
On Sat, Jun 18, 2022 at 10:39:28AM +0800, Qu Wenruo wrote:
> Inspired by recent kmap() change from Fabio M. De Francesco.
>
> There are some weird behavior in zlib_compress_pages(), mostly around how
> we prepare the input buffer.
>
> [BEFORE]
> - We hold a page mapped for a long time
> This is making it much harder to convert kmap() to kmap_local_page(),
> as such long mapped page can lead to nested mapped page.
>
> - Different paths in the name of "optimization"
> When we ran out of input buffer, we will grab the new input with two
> different paths:
>
> * If there are more than one pages left, we copy the content into the
> input buffer.
> This behavior is introduced mostly for S390, as that arch needs
> multiple pages as input buffer for hardware decompression.
>
> * If there is only one page left, we use that page from page cache
> directly without copying the content.
>
> This is making page map/unmap much harder, especially due the latter
> case.
>
> [AFTER]
> This patch will change the behavior by introducing a new helper, to
> fulfill the input buffer:
>
> - Only map one page when we do the content copy
>
> - Unified path, by always copying the page content into workspace
> input buffer
> Yes, we're doing extra page copying. But the original optimization
> only work for the last page of the input range.
>
> Thus I'd say the sacrifice is already not that big.
I don't like that the performance may drop and that there's extra memory
copyiing when not absolutely needed, OTOH it's in zlib code and I think
though it's in use today, the zstd is a sufficient replacement so the
perf drop should have low impact.
> - Use kmap_local_page() and kunmap_local() instead
> Now the lifespan for the mapped page is only during memcpy() call,
> we're definitely fine to use kmap_local_page()/kunmap_local().
>
> Cc: Fabio M. De Francesco <fmdefrancesco@gmail.com>
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
> Only tested on x86_64 for the correctness of the new helper.
>
> But considering how small the window we need the page to be mapped, I
> think it should also work for x86 without any problem.
> ---
> fs/btrfs/zlib.c | 85 ++++++++++++++++++++++++-------------------------
> 1 file changed, 41 insertions(+), 44 deletions(-)
>
> diff --git a/fs/btrfs/zlib.c b/fs/btrfs/zlib.c
> index 767a0c6c9694..2cd4f6fb1537 100644
> --- a/fs/btrfs/zlib.c
> +++ b/fs/btrfs/zlib.c
> @@ -91,20 +91,54 @@ struct list_head *zlib_alloc_workspace(unsigned int level)
> return ERR_PTR(-ENOMEM);
> }
>
> +/*
> + * Copy the content from page cache into @workspace->buf.
> + *
> + * @total_in: The original total input length.
> + * @fileoff_ret: The file offset.
> + * Will be increased by the number of bytes we read.
> + */
> +static void fill_input_buffer(struct workspace *workspace,
> + struct address_space *mapping,
> + unsigned long total_in, u64 *fileoff_ret)
> +{
> + unsigned long bytes_left = total_in - workspace->strm.total_in;
> + const int input_pages = min(DIV_ROUND_UP(bytes_left, PAGE_SIZE),
> + workspace->buf_size / PAGE_SIZE);
> + u64 file_offset = *fileoff_ret;
> + int i;
> +
> + /* Copy the content of each page into the input buffer. */
> + for (i = 0; i < input_pages; i++) {
> + struct page *in_page;
> + void *addr;
> +
> + in_page = find_get_page(mapping, file_offset >> PAGE_SHIFT);
> +
> + addr = kmap_local_page(in_page);
> + memcpy(workspace->buf + i * PAGE_SIZE, addr, PAGE_SIZE);
The workspace->buf is 1 page or 4 pages for x390, so here it looks
confusing that it could overflow and no bounds are explicitly checked
wherether the i * PAGE_SIZE offset is still OK. This would at least
deserve a comment and some runtime checks.
next prev parent reply other threads:[~2022-06-20 16:12 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-06-18 2:39 [PATCH] btrfs: zlib: refactor how we prepare the input buffer Qu Wenruo
2022-06-18 6:14 ` Fabio M. De Francesco
2022-06-20 16:08 ` David Sterba [this message]
2022-06-21 0:40 ` Qu Wenruo
2022-06-21 1:43 ` Qu Wenruo
2022-06-20 21:38 ` Ira Weiny
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20220620160806.GS20633@twin.jikos.cz \
--to=dsterba@suse.cz \
--cc=fmdefrancesco@gmail.com \
--cc=linux-btrfs@vger.kernel.org \
--cc=wqu@suse.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).