All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: ansgar.loesser@kom.tu-darmstadt.de
Cc: "Linus Torvalds" <torvalds@linuxfoundation.org>,
	"Darrick J. Wong" <djwong@kernel.org>,
	"Josef Bacik" <josef@toxicpanda.com>,
	"Christoph Hellwig" <hch@lst.de>,
	"Amir Goldstein" <amir73il@gmail.com>,
	"Mark Fasheh" <mark@fasheh.com>,
	"Matthew Wilcox" <willy@infradead.org>,
	"Miklos Szeredi" <mszeredi@redhat.com>,
	"Al Viro" <viro@zeniv.linux.org.uk>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	"Security Officers" <security@kernel.org>,
	"Max Schlecht" <max.schlecht@informatik.hu-berlin.de>,
	"Björn Scheuermann" <scheuermann@kom.tu-darmstadt.de>
Subject: Re: [PATCH] vf/remap: return the amount of bytes actually deduplicated
Date: Fri, 15 Jul 2022 08:32:38 +1000	[thread overview]
Message-ID: <20220714223238.GH3600936@dread.disaster.area> (raw)
In-Reply-To: <b5805118-7e56-3d43-28e9-9e0198ee43f3@tu-darmstadt.de>

On Wed, Jul 13, 2022 at 08:51:44PM +0200, Ansgar Lößer wrote:
> When using the FIDEDUPRANGE ioctl, in case of success the requested size
> is returned. In some cases this might not be the actual amount of bytes
> deduplicated.
> 
> This change modifies vfs_dedupe_file_range() to report the actual amount
> of bytes deduplicated, instead of the requested amount.
> 
> Link: https://lore.kernel.org/linux-fsdevel/5548ef63-62f9-4f46-5793-03165ceccacc@tu-darmstadt.de/
> 
> Reported-by: Ansgar Lößer (ansgar.loesser@kom.tu-darmstadt.de)
> Reported-by: Max Schlecht (max.schlecht@informatik.hu-berlin.de)
> Reported-by: Björn Scheuermann (scheuermann@kom.tu-darmstadt.de)
> Signed-off-by: Ansgar Lößer <ansgar.loesser@kom.tu-darmstadt.de>
> ---
> 
> > Mind sending it with a sign-off and a short commit message?
> > 
> >               Linus
> Sure, thank you!
> 
> This is my first commit, so I hope it is ok like this.
> 
> 
>  fs/remap_range.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/remap_range.c b/fs/remap_range.c
> index e112b5424cdb..072c2c48aeed 100644
> --- a/fs/remap_range.c
> +++ b/fs/remap_range.c
> @@ -546,7 +546,7 @@ int vfs_dedupe_file_range(struct file *file, struct
> file_dedupe_range *same)
>                 else if (deduped < 0)
>                         info->status = deduped;
>                 else
> -                       info->bytes_deduped = len;
> +                       info->bytes_deduped = deduped;
> 
>  next_fdput:
>                 fdput(dst_fd);
> --
> 2.35.1

As I suspected would occur, this change causes test failures. e.g
generic/517 in fstests fails with:

generic/517 1s ... - output mismatch (see /home/dave/src/xfstests-dev/results//xfs_quota/generic/517.out.bad)
--- tests/generic/517.out   2019-10-29 11:47:07.464539451 +1100
+++ /home/dave/src/xfstests-dev/results//xfs_quota/generic/517.out.bad      2022-07-14 18:00:04.833536434 +1000
@@ -13,7 +13,7 @@
*
0786528 ae ae ae ae
0786532
-deduped 131172/131172 bytes at offset 65536
+deduped 131072/131172 bytes at offset 65536
XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
File content after first deduplication and before unmounting:
...
(Run 'diff -u /home/dave/src/xfstests-dev/tests/generic/517.out /home/dave/src/xfstests-dev/results//xfs_quota/generic/517.out.bad'  to see the entire diff)

The whole diff is:

-- /home/dave/src/xfstests-dev/tests/generic/517.out   2019-10-29 11:47:07.464539451 +1100
+++ /home/dave/src/xfstests-dev/results//xfs_quota/generic/517.out.bad  2022-07-14 18:00:04.833536434 +1000
@@ -13,7 +13,7 @@
 *
 0786528 ae ae ae ae
 0786532
-deduped 131172/131172 bytes at offset 65536
+deduped 131072/131172 bytes at offset 65536
 XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 File content after first deduplication and before unmounting:
 0000000 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
@@ -33,8 +33,6 @@
 0786532
 wrote 100/100 bytes at offset 0
 XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-deduped 100/100 bytes at offset 655360
-XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 File content after second deduplication:
 0000000 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
 *

This shows user visible API behaviours (i.e. "short dedupe"
behaviour) that may be unexpected by userspace.  So while the change
might be technically correct, it definitely changes the behaviour at
least one test expects to occur.

This is why I asked if this had been tested - it tells us if
userspace is likely to see issues with the API change. "Correct but
breaks tests" is basically a warning that we should expect
users/applications to notice the API change and that there's a good
likelyhood of downstream problems arising from it...

Linus, can you please revert this commit for the 5.19 series (before
the stable autosel bot sends it back to stable kernels, please!) to
give us more time to investigate and consider the impact of the the
API change on userspace applications before we commit to changing
the API.

-Dave.
-- 
Dave Chinner
david@fromorbit.com

  parent reply	other threads:[~2022-07-14 22:32 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-07-12 12:11 Information Leak: FIDEDUPERANGE ioctl allows reading writeonly files Ansgar Lößer
2022-07-12 17:33 ` Linus Torvalds
2022-07-12 18:43   ` Matthew Wilcox
2022-07-12 18:47     ` Linus Torvalds
2022-07-12 18:51       ` Linus Torvalds
2022-07-12 19:02   ` Josef Bacik
2022-07-12 19:07     ` Linus Torvalds
2022-07-12 19:23       ` Linus Torvalds
2022-07-12 20:03       ` Josef Bacik
2022-07-12 20:48         ` Linus Torvalds
2022-07-13  0:48           ` Darrick J. Wong
2022-07-13  2:58             ` Linus Torvalds
2022-07-13  4:14               ` Linus Torvalds
2022-07-13  6:46               ` Dave Chinner
2022-07-13  7:49                 ` [PATCH] fs/remap: constrain dedupe of EOF blocks Dave Chinner
2022-07-13  8:19                   ` Linus Torvalds
2022-07-13 17:18                   ` Ansgar Lößer
2022-07-13 17:26                     ` Linus Torvalds
2022-07-13 18:51                       ` [PATCH] vf/remap: return the amount of bytes actually deduplicated Ansgar Lößer
2022-07-13 19:09                         ` Linus Torvalds
2022-07-14  0:22                         ` Dave Chinner
2022-07-14  1:03                           ` Linus Torvalds
2022-07-16 21:15                           ` Mark Fasheh
2022-07-14 22:32                         ` Dave Chinner [this message]
2022-07-14 22:42                           ` Linus Torvalds
2022-07-14 23:15                             ` Dave Chinner
2022-07-13  8:16                 ` Information Leak: FIDEDUPERANGE ioctl allows reading writeonly files Linus Torvalds
2022-07-13 23:48                   ` Dave Chinner
2022-07-13 17:17                 ` Ansgar Lößer
2022-07-13 17:16             ` Ansgar Lößer
2022-07-13 22:43               ` Dave Chinner
2022-07-13 17:14   ` Ansgar Lößer
2022-07-13 18:03     ` Linus Torvalds

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220714223238.GH3600936@dread.disaster.area \
    --to=david@fromorbit.com \
    --cc=amir73il@gmail.com \
    --cc=ansgar.loesser@kom.tu-darmstadt.de \
    --cc=djwong@kernel.org \
    --cc=hch@lst.de \
    --cc=josef@toxicpanda.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=mark@fasheh.com \
    --cc=max.schlecht@informatik.hu-berlin.de \
    --cc=mszeredi@redhat.com \
    --cc=scheuermann@kom.tu-darmstadt.de \
    --cc=security@kernel.org \
    --cc=torvalds@linuxfoundation.org \
    --cc=viro@zeniv.linux.org.uk \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.