linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Scrub resume regression
@ 2020-01-15  9:03 Graham Cobb
  2020-01-15  9:33 ` Filipe Manana
  2020-01-15 12:51 ` David Sterba
  0 siblings, 2 replies; 10+ messages in thread
From: Graham Cobb @ 2020-01-15  9:03 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Sebastian Döring

OK, I have bisected the problem with scrub resume being broken by the
scrub ioctl ABI being changed.

The bad commit is:

Fail
06fe39ab15a6a47d4979460fcc17d33b1d72ccf9 is the first bad commit
commit 06fe39ab15a6a47d4979460fcc17d33b1d72ccf9
Author: Filipe Manana <fdmanana@suse.com>
Date:   Fri Dec 14 19:50:17 2018 +0000

    Btrfs: do not overwrite scrub error with fault error in scrub ioctl

    If scrub returned an error and then the copy_to_user() call did not
    succeed, we would overwrite the error returned by scrub with -EFAULT.
    Fix that by calling copy_to_user() only if btrfs_scrub_dev() returned
    success.

    Signed-off-by: Filipe Manana <fdmanana@suse.com>
    Reviewed-by: David Sterba <dsterba@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>

 fs/btrfs/ioctl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
bisect run success

It is important that scrub always returns the stats, even when it
returns an error. This is critical for cancel, as that is how
cancel/resume works, but it should also apply in case of other errors so
that the user can see how much of the scrub was done before the fatal error.

I am not sure in which kernel release this commit appeared but as this
breaks the "scrub resume" command completely, I think the fix for this
needs to be backported and may want to be considered by distro kernel
maintainers.

I will reply later with the simple reproducer program I created for the
bisection in case it is useful for testing.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Scrub resume regression
  2020-01-15  9:03 Scrub resume regression Graham Cobb
@ 2020-01-15  9:33 ` Filipe Manana
  2020-01-15 11:55   ` Graham Cobb
  2020-01-15 12:51 ` David Sterba
  1 sibling, 1 reply; 10+ messages in thread
From: Filipe Manana @ 2020-01-15  9:33 UTC (permalink / raw)
  To: Graham Cobb; +Cc: linux-btrfs, Sebastian Döring

On Wed, Jan 15, 2020 at 9:04 AM Graham Cobb <g.btrfs@cobb.uk.net> wrote:
>
> OK, I have bisected the problem with scrub resume being broken by the
> scrub ioctl ABI being changed.
>
> The bad commit is:
>
> Fail
> 06fe39ab15a6a47d4979460fcc17d33b1d72ccf9 is the first bad commit
> commit 06fe39ab15a6a47d4979460fcc17d33b1d72ccf9
> Author: Filipe Manana <fdmanana@suse.com>
> Date:   Fri Dec 14 19:50:17 2018 +0000
>
>     Btrfs: do not overwrite scrub error with fault error in scrub ioctl
>
>     If scrub returned an error and then the copy_to_user() call did not
>     succeed, we would overwrite the error returned by scrub with -EFAULT.
>     Fix that by calling copy_to_user() only if btrfs_scrub_dev() returned
>     success.
>
>     Signed-off-by: Filipe Manana <fdmanana@suse.com>
>     Reviewed-by: David Sterba <dsterba@suse.com>
>     Signed-off-by: David Sterba <dsterba@suse.com>
>
>  fs/btrfs/ioctl.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> bisect run success
>
> It is important that scrub always returns the stats, even when it
> returns an error. This is critical for cancel, as that is how
> cancel/resume works, but it should also apply in case of other errors so
> that the user can see how much of the scrub was done before the fatal error.
>
> I am not sure in which kernel release this commit appeared but as this
> breaks the "scrub resume" command completely, I think the fix for this
> needs to be backported and may want to be considered by distro kernel
> maintainers.
>
> I will reply later with the simple reproducer program I created for the
> bisection in case it is useful for testing.

No need to, it is simple to understand why it happens and the not
copying that stats in error case was not intentional.

Try this:

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 3a4bd5cd67fa..611dfe8cdbb1 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -4253,8 +4253,10 @@ static long btrfs_ioctl_scrub(struct file
*file, void __user *arg)
                              &sa->progress, sa->flags & BTRFS_SCRUB_READONLY,
                              0);

-       if (ret == 0 && copy_to_user(arg, sa, sizeof(*sa)))
-               ret = -EFAULT;
+       if (copy_to_user(arg, sa, sizeof(*sa))) {
+               if (!ret)
+                       ret = -EFAULT;
+       }

        if (!(sa->flags & BTRFS_SCRUB_READONLY))
                mnt_drop_write_file(file);

I'll later send a patch with a changelog to the list.
Thanks.



-- 
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: Scrub resume regression
  2020-01-15  9:33 ` Filipe Manana
@ 2020-01-15 11:55   ` Graham Cobb
  0 siblings, 0 replies; 10+ messages in thread
From: Graham Cobb @ 2020-01-15 11:55 UTC (permalink / raw)
  To: fdmanana; +Cc: linux-btrfs, Sebastian Döring

On 15/01/2020 09:33, Filipe Manana wrote:
> On Wed, Jan 15, 2020 at 9:04 AM Graham Cobb <g.btrfs@cobb.uk.net> wrote:
>>
>> OK, I have bisected the problem with scrub resume being broken by the
>> scrub ioctl ABI being changed.
[snip]
> 
> Try this:
> 
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index 3a4bd5cd67fa..611dfe8cdbb1 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -4253,8 +4253,10 @@ static long btrfs_ioctl_scrub(struct file
> *file, void __user *arg)
>                               &sa->progress, sa->flags & BTRFS_SCRUB_READONLY,
>                               0);
> 
> -       if (ret == 0 && copy_to_user(arg, sa, sizeof(*sa)))
> -               ret = -EFAULT;
> +       if (copy_to_user(arg, sa, sizeof(*sa))) {
> +               if (!ret)
> +                       ret = -EFAULT;
> +       }
> 
>         if (!(sa->flags & BTRFS_SCRUB_READONLY))
>                 mnt_drop_write_file(file);
> 
> I'll later send a patch with a changelog to the list.
> Thanks.

Thanks Filipe. I have tested that the patch builds and works with my
reproducer on a 5.4 kernel.

Later I will build a distro kernel with the patch and try my monthly
scrub-with-interruptions script. That will take a day or two.



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Scrub resume regression
  2020-01-15  9:03 Scrub resume regression Graham Cobb
  2020-01-15  9:33 ` Filipe Manana
@ 2020-01-15 12:51 ` David Sterba
  2020-01-15 13:10   ` Holger Hoffstätte
  1 sibling, 1 reply; 10+ messages in thread
From: David Sterba @ 2020-01-15 12:51 UTC (permalink / raw)
  To: Graham Cobb; +Cc: linux-btrfs, Sebastian Döring

On Wed, Jan 15, 2020 at 09:03:13AM +0000, Graham Cobb wrote:
> OK, I have bisected the problem with scrub resume being broken by the
> scrub ioctl ABI being changed.

Thanks for the bisecting, I'll forward the fix to 5.5-rc so it can be
picked by stable trees. The regression has been there since 5.1, which
at the moment is only for 5.4.x.

> The bad commit is:
> 
> 06fe39ab15a6a47d4979460fcc17d33b1d72ccf9 is the first bad commit
> commit 06fe39ab15a6a47d4979460fcc17d33b1d72ccf9
> Author: Filipe Manana <fdmanana@suse.com>
> Date:   Fri Dec 14 19:50:17 2018 +0000
> 
>     Btrfs: do not overwrite scrub error with fault error in scrub ioctl
[...]

> It is important that scrub always returns the stats, even when it
> returns an error. This is critical for cancel, as that is how
> cancel/resume works, but it should also apply in case of other errors so
> that the user can see how much of the scrub was done before the fatal error.

That's something we need to document in code and perhaps in the manual
pages too.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Scrub resume regression
  2020-01-15 12:51 ` David Sterba
@ 2020-01-15 13:10   ` Holger Hoffstätte
  2020-01-15 21:02     ` Sebastian Döring
  2020-01-16 14:02     ` David Sterba
  0 siblings, 2 replies; 10+ messages in thread
From: Holger Hoffstätte @ 2020-01-15 13:10 UTC (permalink / raw)
  To: linux-btrfs

On 1/15/20 1:51 PM, David Sterba wrote:
>> It is important that scrub always returns the stats, even when it
>> returns an error. This is critical for cancel, as that is how
>> cancel/resume works, but it should also apply in case of other errors so
>> that the user can see how much of the scrub was done before the fatal error.
> 
> That's something we need to document in code and perhaps in the manual
> pages too.

Isn't the real problem that cancel does not actually mean cancel,
but rather also implies "..and maybe continue"? IMHO cancel should cancel
(and say how much work was performed), while the intention to resume should
be called e.g. "pause". This makes the behaviour clear and prevents
accidental semantic overlap.

-h

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Scrub resume regression
  2020-01-15 13:10   ` Holger Hoffstätte
@ 2020-01-15 21:02     ` Sebastian Döring
  2020-01-16 14:02     ` David Sterba
  1 sibling, 0 replies; 10+ messages in thread
From: Sebastian Döring @ 2020-01-15 21:02 UTC (permalink / raw)
  To: linux-btrfs

> Isn't the real problem that cancel does not actually mean cancel,
> but rather also implies "..and maybe continue"? IMHO cancel should cancel
> (and say how much work was performed), while the intention to resume should
> be called e.g. "pause". This makes the behaviour clear and prevents
> accidental semantic overlap.
>

IMHO a better choice of wording might also be "interrupt" and "abort",
but maybe that's just being pedantic (cancel could then stay in place
for backwards compatibility). In any case the current man page for
btrfs-progs makes it somewhat clear what's supposed to happen and my
confusion was mostly caused by the bug.

BTW: Thanks to everyone who helped tracking down the cause of this
bug. Haven't had a chance to test the proposed patch yet, but am
looking forward to schedule large scrubs across multiple nights.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Scrub resume regression
  2020-01-15 13:10   ` Holger Hoffstätte
  2020-01-15 21:02     ` Sebastian Döring
@ 2020-01-16 14:02     ` David Sterba
  2020-01-17 15:59       ` Zygo Blaxell
  1 sibling, 1 reply; 10+ messages in thread
From: David Sterba @ 2020-01-16 14:02 UTC (permalink / raw)
  To: Holger Hoffstätte; +Cc: linux-btrfs

On Wed, Jan 15, 2020 at 02:10:42PM +0100, Holger Hoffstätte wrote:
> On 1/15/20 1:51 PM, David Sterba wrote:
> >> It is important that scrub always returns the stats, even when it
> >> returns an error. This is critical for cancel, as that is how
> >> cancel/resume works, but it should also apply in case of other errors so
> >> that the user can see how much of the scrub was done before the fatal error.
> > 
> > That's something we need to document in code and perhaps in the manual
> > pages too.
> 
> Isn't the real problem that cancel does not actually mean cancel,
> but rather also implies "..and maybe continue"? IMHO cancel should cancel
> (and say how much work was performed), while the intention to resume should
> be called e.g. "pause". This makes the behaviour clear and prevents
> accidental semantic overlap.

We can add 'pause', but for backward compatibility, cancel has to stay
as is. I personally think that saving the last position after cancel is
not a big deal. With 'pause' it will be less confusing for users and
will have also parity with balance commands.

start - pause - resume
start - cancel

One difference is that cancelling balance will also delete the state
(stored inside the filesystem metadata). If scrub start follows cancel,
the state is reset at the beginning. I'm not sure if adding an extra
option eg. 'scrub cancel --reset' is worth.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Scrub resume regression
  2020-01-16 14:02     ` David Sterba
@ 2020-01-17 15:59       ` Zygo Blaxell
  2020-01-17 18:39         ` Chris Murphy
  0 siblings, 1 reply; 10+ messages in thread
From: Zygo Blaxell @ 2020-01-17 15:59 UTC (permalink / raw)
  To: dsterba, Holger Hoffstätte, linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 1867 bytes --]

On Thu, Jan 16, 2020 at 03:02:27PM +0100, David Sterba wrote:
> On Wed, Jan 15, 2020 at 02:10:42PM +0100, Holger Hoffstätte wrote:
> > On 1/15/20 1:51 PM, David Sterba wrote:
> > >> It is important that scrub always returns the stats, even when it
> > >> returns an error. This is critical for cancel, as that is how
> > >> cancel/resume works, but it should also apply in case of other errors so
> > >> that the user can see how much of the scrub was done before the fatal error.
> > > 
> > > That's something we need to document in code and perhaps in the manual
> > > pages too.
> > 
> > Isn't the real problem that cancel does not actually mean cancel,
> > but rather also implies "..and maybe continue"? IMHO cancel should cancel
> > (and say how much work was performed), while the intention to resume should
> > be called e.g. "pause". This makes the behaviour clear and prevents
> > accidental semantic overlap.
> 
> We can add 'pause', but for backward compatibility, cancel has to stay
> as is. I personally think that saving the last position after cancel is
> not a big deal. With 'pause' it will be less confusing for users and
> will have also parity with balance commands.
> 
> start - pause - resume
> start - cancel
> 
> One difference is that cancelling balance will also delete the state
> (stored inside the filesystem metadata). If scrub start follows cancel,
> the state is reset at the beginning. I'm not sure if adding an extra
> option eg. 'scrub cancel --reset' is worth.

Currently 'scrub cancel' doesn't reset the state, so existing scripts will
be broken if 'cancel' is not exactly the same as 'pause'.  I have such
scripts, they call 'cancel' and 'resume' blindly according to load or
maintenance window boundaries, and they leave it to btrfs userspace to
sort out whether any new work should be done.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Scrub resume regression
  2020-01-17 15:59       ` Zygo Blaxell
@ 2020-01-17 18:39         ` Chris Murphy
  2020-01-17 19:39           ` Graham Cobb
  0 siblings, 1 reply; 10+ messages in thread
From: Chris Murphy @ 2020-01-17 18:39 UTC (permalink / raw)
  To: Zygo Blaxell; +Cc: David Sterba, Holger Hoffstätte, Btrfs BTRFS

It's no one's fault, it's just confusing. :P

Cancel word origin means more than stop, implies resetting state, to
obliterate or invalidate.

Pause and stop word origin suggests they're interchangeable, but in
practice with digital audio and video consumer gear, stop has come to
mean a kind of cancel. (I'm gonna ignore tape.) Where a start from a
stop will start at the very beginning. Whereas pause saves state and
unpause means resume.

Lightweight change, add new command stop, which saves state, and
cancel is an alias for backward compatibility. No other change.

Moderate change:
start = alias resume
stop = alias cancel
i.e. a stop then start does the same thing as a cancel then resume,
unless new command 'reset' is used
reset = stops, and resets state to the beginning

Heavier change that's linguistically sane, but breaks expectations of
today's cancel:
pause and unpause (alias resume), and start and stop (alias cancel).
The former is stateful, and the latter is stateless.

But this problem should be looked at by i18n folks before changing anything.

---
Chris Murphy

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Scrub resume regression
  2020-01-17 18:39         ` Chris Murphy
@ 2020-01-17 19:39           ` Graham Cobb
  0 siblings, 0 replies; 10+ messages in thread
From: Graham Cobb @ 2020-01-17 19:39 UTC (permalink / raw)
  To: Chris Murphy, Btrfs BTRFS

On 17/01/2020 18:39, Chris Murphy wrote:
> It's no one's fault, it's just confusing. :P
> 
> Cancel word origin means more than stop, implies resetting state, to
> obliterate or invalidate.
> 
> Pause and stop word origin suggests they're interchangeable, but in
> practice with digital audio and video consumer gear, stop has come to
> mean a kind of cancel. (I'm gonna ignore tape.) Where a start from a
> stop will start at the very beginning. Whereas pause saves state and
> unpause means resume.
> 
> Lightweight change, add new command stop, which saves state, and
> cancel is an alias for backward compatibility. No other change.

That seems fairly pointless.

> Moderate change:
> start = alias resume

start and resume do different things today. The distinction is important
as the "saved state" after a cancel stays around until the next scrub
starts (it could be months old).

> stop = alias cancel
> i.e. a stop then start does the same thing as a cancel then resume,
> unless new command 'reset' is used
> reset = stops, and resets state to the beginning

That really isn't useful. It is much more useful to be able to decide
whether to use the saved state at start time than it is to decide
whether to save state at stop time. To build on Zygo's example, I might
have a script which pauses the scrub when the system load goes up and
then decides whether to resume it or restart from scratch depending on
how long it has been paused for. Also, the saved state can be inspected
while the scrub is paused to allow the operator to estimate how long it
might take to complete.

> Heavier change that's linguistically sane, but breaks expectations of
> today's cancel:
> pause and unpause (alias resume), and start and stop (alias cancel).
> The former is stateful, and the latter is stateless.

Changing the meaning of the current start, resume, or cancel commands
isn't an option - these are built into user scripts.


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2020-01-17 19:39 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-15  9:03 Scrub resume regression Graham Cobb
2020-01-15  9:33 ` Filipe Manana
2020-01-15 11:55   ` Graham Cobb
2020-01-15 12:51 ` David Sterba
2020-01-15 13:10   ` Holger Hoffstätte
2020-01-15 21:02     ` Sebastian Döring
2020-01-16 14:02     ` David Sterba
2020-01-17 15:59       ` Zygo Blaxell
2020-01-17 18:39         ` Chris Murphy
2020-01-17 19:39           ` Graham Cobb

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).