From: Matthew Wilcox <willy@infradead.org>
To: Nikolay Borisov <nborisov@suse.com>
Cc: linux-fsdevel@vger.kernel.org, viro@zeniv.linux.org.uk,
	david@fromorbit.com, djwong@kernel.org
Subject: Re: [PATCH] vfs: Optimize dedupe comparison
Date: Thu, 15 Jul 2021 15:30:56 +0100
Message-ID: <YPBGoDlf9T6kFjk1@casper.infradead.org>
In-Reply-To: <20210715141309.38443-1-nborisov@suse.com>

On Thu, Jul 15, 2021 at 05:13:09PM +0300, Nikolay Borisov wrote:
> Currently the comparison method vfs_dedupe_file_range_compare utilizes
> is a plain memcmp. This effectively means the code is doing byte-by-byte
> comparison. Instead, the code could do word-sized comparison without
> adverse effect on performance, provided that the comparison's length is
> at least as big as the native word size, as well as resulting memory
> addresses are properly aligned.

Sounds to me like somebody hasn't optimised memcmp() very well ...
is this x86-64?

> @@ -256,9 +257,35 @@ static int vfs_dedupe_file_range_compare(struct inode *src, loff_t srcoff,
>  		flush_dcache_page(src_page);
>  		flush_dcache_page(dest_page);
> 
> -		if (memcmp(src_addr + src_poff, dest_addr + dest_poff, cmp_len))
> -			same = false;
> 
> +		if (!IS_ALIGNED((unsigned long)(src_addr + src_poff), block_size) ||
> +		    !IS_ALIGNED((unsigned long)(dest_addr + dest_poff), block_size) ||
> +		    cmp_len < block_size) {

Can this even happen?  Surely we can only dedup on a block boundary and
blocks are required to be a power of two and at least 512 bytes in size?
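
The alignment argument can be checked in userspace. The sketch below is
only an illustration under stated assumptions: the page size of 4096, the
IS_ALIGNED_SKETCH macro and the count_unaligned() helper are hypothetical
names, not kernel code. It walks every power-of-two block size from 512
up to the page size and counts block-boundary offsets whose address is
not aligned to sizeof(unsigned long).

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed page size for this sketch; real kernels may use other sizes. */
#define SKETCH_PAGE_SIZE 4096UL

/* Userspace stand-in for the kernel's IS_ALIGNED(); 'a' must be a power of two. */
#define IS_ALIGNED_SKETCH(x, a) \
	((((uintptr_t)(x)) & ((uintptr_t)(a) - 1)) == 0)

/*
 * Hypothetical helper: for every power-of-two block size between 512 and
 * the page size, visit each block-boundary offset inside the page and
 * count the addresses that are not aligned to sizeof(unsigned long).
 */
static unsigned long count_unaligned(const unsigned char *page)
{
	unsigned long failures = 0;
	unsigned long bs, off;

	for (bs = 512; bs <= SKETCH_PAGE_SIZE; bs <<= 1) {
		for (off = 0; off < SKETCH_PAGE_SIZE; off += bs) {
			if (!IS_ALIGNED_SKETCH(page + off, sizeof(unsigned long)))
				failures++;
		}
	}
	return failures;
}
```

With a word-aligned base pointer the count is zero, so under these
assumptions only the short-length case could take the fallback branch.
Passing a deliberately misaligned base (for example page + 1) makes every
offset fail the check.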

> +			if (memcmp(src_addr + src_poff, dest_addr + dest_poff,
> +				   cmp_len))
> +				same = false;
> +		} else {
> +			int i;
> +			size_t blocks = cmp_len / block_size;
> +			loff_t rem_len = cmp_len - (blocks * block_size);
> +			unsigned long *src = src_addr + src_poff;
> +			unsigned long *dst = dest_addr + src_poff;
> +
> +			for (i = 0; i < blocks; i++) {
> +				if (src[i] - dst[i]) {
> +					same = false;
> +					goto finished;
> +				}
> +			}
> +
> +			if (rem_len) {
> +				src_addr += src_poff + (blocks * block_size);
> +				dest_addr += dest_poff + (blocks * block_size);
> +				if (memcmp(src_addr, dest_addr, rem_len))
> +					same = false;
> +			}
> +		}
> +finished:
>  		kunmap_atomic(dest_addr);
>  		kunmap_atomic(src_addr);
>  unlock:
> --
> 2.25.1
> 
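
For reference, the word-at-a-time idea in the quoted hunk can be sketched
in userspace roughly as follows. This is a minimal sketch, not the patch
itself: ranges_equal() and word_aligned() are hypothetical names, words
are loaded with memcpy() to stay within C aliasing rules, words are
compared with != rather than by subtraction, and each pointer is derived
only from its own buffer and offset.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* True if p is aligned to the native word size. */
static bool word_aligned(const void *p)
{
	return ((uintptr_t)p % sizeof(unsigned long)) == 0;
}

/*
 * Hypothetical helper: report whether two byte ranges hold identical data.
 * Falls back to memcmp() when either pointer is not word-aligned or the
 * range is shorter than one word; otherwise compares whole words and
 * finishes any trailing bytes with memcmp().
 */
static bool ranges_equal(const void *src, const void *dst, size_t len)
{
	const size_t word = sizeof(unsigned long);
	const unsigned char *s = src;
	const unsigned char *d = dst;
	size_t words, i;

	if (!word_aligned(s) || !word_aligned(d) || len < word)
		return memcmp(s, d, len) == 0;

	words = len / word;
	for (i = 0; i < words; i++) {
		unsigned long a, b;

		/* memcpy() loads avoid aliasing problems; compilers turn them into plain loads. */
		memcpy(&a, s + i * word, word);
		memcpy(&b, d + i * word, word);
		if (a != b)
			return false;
	}

	/* Compare the tail that does not fill a whole word (may be empty). */
	return memcmp(s + words * word, d + words * word,
		      len - words * word) == 0;
}
```

Whether a loop like this beats memcpy()'s sibling memcmp() depends on how
well memcmp() is optimised for the architecture, which is the open
question above.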

Thread overview: 8+ messages
2021-07-15 14:13 [PATCH] vfs: Optimize dedupe comparison Nikolay Borisov
2021-07-15 14:30 ` Matthew Wilcox [this message]
2021-07-15 14:44   ` Nikolay Borisov
2021-07-15 15:09     ` Matthew Wilcox
2021-07-15 22:33       ` Dave Chinner
2021-07-20 14:58         ` Nikolay Borisov
2021-07-20 15:12           ` Matthew Wilcox
2021-07-16 12:10       ` Nikolay Borisov
