* btrfs scrub status reports not running when it is
@ 2015-01-14 21:06 Sandy McArthur Jr
  2015-01-14 21:23 ` Marc Joliet
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Sandy McArthur Jr @ 2015-01-14 21:06 UTC (permalink / raw)
  To: linux-btrfs

Sometimes btrfs scrub status reports that it is not running when it still is.

I think this is a cosmetic bug. And I believe this is related to the
scrub completing on some drives before others in a multi-drive btrfs
filesystem that is not well balanced.

Based on `iostat 1` activity, the last drive in the btrfs filesystem
was still being scrubbed when I copied the output below. You can see
that the total bytes scrubbed keeps increasing despite the status
showing "not running". The last drive being scrubbed was not the
device that `mount` lists for the mount point:


# date ; echo ; btrfs scrub status /mcmedia/
Wed Jan 14 15:20:18 EST 2015

scrub status for 94b3345e-2589-423c-a228-d569bf94ab58
scrub started at Tue Jan 13 01:18:22 2015, interrupted after 136912
seconds, not running
total bytes scrubbed: 23.05TiB with 513 errors
error details: verify=19 csum=494
corrected errors: 512, uncorrectable errors: 1, unverified errors: 0


# date ; echo ; btrfs scrub status /mcmedia/
Wed Jan 14 15:21:25 EST 2015

scrub status for 94b3345e-2589-423c-a228-d569bf94ab58
scrub started at Tue Jan 13 01:18:22 2015, interrupted after 136982
seconds, not running
total bytes scrubbed: 23.06TiB with 513 errors
error details: verify=19 csum=494
corrected errors: 512, uncorrectable errors: 1, unverified errors: 0


# uname -a
Linux mcplex 3.18.2-gentoo #1 SMP Mon Jan 12 10:24:25 EST 2015 x86_64
Intel(R) Core(TM) i7-2600S CPU @ 2.80GHz GenuineIntel GNU/Linux

# btrfs --version
Btrfs v3.18.1


-- 
Sandy McArthur Jr

"He who dares not offend cannot be honest."
- Thomas Paine


* Re: btrfs scrub status reports not running when it is
  2015-01-14 21:06 btrfs scrub status reports not running when it is Sandy McArthur Jr
@ 2015-01-14 21:23 ` Marc Joliet
  2015-01-14 21:26 ` Sandy McArthur Jr
  2015-01-14 22:27 ` Zach Brown
  2 siblings, 0 replies; 7+ messages in thread
From: Marc Joliet @ 2015-01-14 21:23 UTC (permalink / raw)
  To: linux-btrfs


On Wed, 14 Jan 2015 16:06:02 -0500,
Sandy McArthur Jr <sandymac@gmail.com> wrote:

> Sometimes btrfs scrub status reports that it is not running when it still is.
[...]

FWIW, I (and one other person) reported this in the thread titled 'btrfs scrub
status misreports as "interrupted"' (starting on 2014-11-22).

> # uname -a
> Linux mcplex 3.18.2-gentoo #1 SMP Mon Jan 12 10:24:25 EST 2015 x86_64
> Intel(R) Core(TM) i7-2600S CPU @ 2.80GHz GenuineIntel GNU/Linux
>
> # btrfs --version
> Btrfs v3.18.1

Too bad it's still there. I'm on kernel 3.17.8 and userspace 3.18.1 and
didn't see this issue the last time I ran a scrub, so I was hoping it was
gone by now.

(On the upside, though, this isn't exactly the worst bug btrfs has ever
had ;) .)

Greetings
-- 
Marc Joliet
--
"People who think they know everything really annoy those of us who know we
don't" - Bjarne Stroustrup



* Re: btrfs scrub status reports not running when it is
  2015-01-14 21:06 btrfs scrub status reports not running when it is Sandy McArthur Jr
  2015-01-14 21:23 ` Marc Joliet
@ 2015-01-14 21:26 ` Sandy McArthur Jr
  2015-01-14 22:27 ` Zach Brown
  2 siblings, 0 replies; 7+ messages in thread
From: Sandy McArthur Jr @ 2015-01-14 21:26 UTC (permalink / raw)
  To: linux-btrfs

Okay, different output when the scrub is actually complete:

completed status:

scrub status for 94b3345e-2589-423c-a228-d569bf94ab58
scrub started at Tue Jan 13 01:18:22 2015 and finished after 139459 seconds
total bytes scrubbed: 23.30TiB with 513 errors
error details: verify=19 csum=494
corrected errors: 512, uncorrectable errors: 1, unverified errors: 0


Still, the output while the scrub is wrapping up is not intuitive to me:

"scrub started at Tue Jan 13 01:18:22 2015, interrupted after 136982
seconds, not running"



-- 
Sandy McArthur

"He who dares not offend cannot be honest."
- Thomas Paine


* Re: btrfs scrub status reports not running when it is
  2015-01-14 21:06 btrfs scrub status reports not running when it is Sandy McArthur Jr
  2015-01-14 21:23 ` Marc Joliet
  2015-01-14 21:26 ` Sandy McArthur Jr
@ 2015-01-14 22:27 ` Zach Brown
  2015-01-15 11:24   ` David Sterba
  2 siblings, 1 reply; 7+ messages in thread
From: Zach Brown @ 2015-01-14 22:27 UTC (permalink / raw)
  To: Sandy McArthur Jr; +Cc: linux-btrfs

On Wed, Jan 14, 2015 at 04:06:02PM -0500, Sandy McArthur Jr wrote:
> Sometimes btrfs scrub status reports that it is not running when it still is.
> 
> I think this is a cosmetic bug. And I believe this is related to the
> scrub completing on some drives before others in a multi-drive btrfs
> filesystem that is not well balanced.

Boy, I don't really know this code, but it looks like:

if (ss->in_progress)
	printf(", running for %llu seconds\n", ss->duration);
else
	printf(", interrupted after %llu seconds, not running\n",
			ss->duration);

in_progress = is_scrub_running_in_kernel(fdmnt, di_args, fi_args.num_devices);

static int is_scrub_running_in_kernel(int fd,
                struct btrfs_ioctl_dev_info_args *di_args, u64 max_devices)
{
        struct scrub_progress sp;
        int i;
        int ret;

        for (i = 0; i < max_devices; i++) {
                memset(&sp, 0, sizeof(sp));
                sp.scrub_args.devid = di_args[i].devid;
                ret = ioctl(fd, BTRFS_IOC_SCRUB_PROGRESS, &sp.scrub_args);
                if (ret < 0 && errno == ENODEV)
                        continue;
                if (ret < 0 && errno == ENOTCONN)
                        return 0;

It says that scrub isn't running if any devices have completed.  If you drop
all those ret < 0 conditional branches that are either noops or wrong, does it
work like you'd expect?

- z


* Re: btrfs scrub status reports not running when it is
  2015-01-14 22:27 ` Zach Brown
@ 2015-01-15 11:24   ` David Sterba
  2015-01-15 18:02     ` Zach Brown
  0 siblings, 1 reply; 7+ messages in thread
From: David Sterba @ 2015-01-15 11:24 UTC (permalink / raw)
  To: Zach Brown; +Cc: Sandy McArthur Jr, linux-btrfs

On Wed, Jan 14, 2015 at 02:27:17PM -0800, Zach Brown wrote:
> On Wed, Jan 14, 2015 at 04:06:02PM -0500, Sandy McArthur Jr wrote:
> > Sometimes btrfs scrub status reports that it is not running when it still is.
> > 
> > I think this is a cosmetic bug. And I believe this is related to the
> > scrub completing on some drives before others in a multi-drive btrfs
> > filesystem that is not well balanced.
> 
> Boy, I don't really know this code, but it looks like:
> 
> if (ss->in_progress)
> 	printf(", running for %llu seconds\n", ss->duration);
> else
> 	printf(", interrupted after %llu seconds, not running\n",
> 			ss->duration);
> 
> in_progress = is_scrub_running_in_kernel(fdmnt, di_args, fi_args.num_devices);
> 
> static int is_scrub_running_in_kernel(int fd,
>                 struct btrfs_ioctl_dev_info_args *di_args, u64 max_devices)
> {
>         struct scrub_progress sp;
>         int i;
>         int ret;
> 
>         for (i = 0; i < max_devices; i++) {
>                 memset(&sp, 0, sizeof(sp));
>                 sp.scrub_args.devid = di_args[i].devid;
>                 ret = ioctl(fd, BTRFS_IOC_SCRUB_PROGRESS, &sp.scrub_args);
>                 if (ret < 0 && errno == ENODEV)
>                         continue;
>                 if (ret < 0 && errno == ENOTCONN)
>                         return 0;
> 
> It says that scrub isn't running if any devices have completed.  If you drop
> all those ret < 0 conditional branches that are either noops or wrong, does it
> work like you'd expect?

Why wrong? The ioctl callback returns -ENODEV or -ENOTCONN that get
translated to the errno values and ioctl(...) returns -1 in both cases.
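
The convention described here can be seen without btrfs at all. What
follows is a minimal sketch, assuming only the usual ioctl(2) behaviour
on Linux: a failed call returns -1 and reports the reason through errno.
The helper name ioctl_errno is made up for illustration and is not part
of btrfs-progs. Called with an invalid file descriptor it yields EBADF
rather than ENODEV or ENOTCONN, but the mechanism is the same.

```c
#include <errno.h>
#include <sys/ioctl.h>

/*
 * Hypothetical helper: issue an ioctl and return 0 on success or the
 * errno value on failure. A negative return code from the kernel-side
 * handler (e.g. -ENODEV) reaches userspace as ioctl() == -1 with errno
 * set to the positive error number.
 */
int ioctl_errno(int fd, unsigned long request)
{
	errno = 0;
	if (ioctl(fd, request, (void *)0) < 0)
		return errno;
	return 0;
}
```

Calling ioctl_errno(-1, 0) returns EBADF, because -1 is never a valid
file descriptor.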


* Re: btrfs scrub status reports not running when it is
  2015-01-15 11:24   ` David Sterba
@ 2015-01-15 18:02     ` Zach Brown
  2015-01-19 17:45       ` David Sterba
  0 siblings, 1 reply; 7+ messages in thread
From: Zach Brown @ 2015-01-15 18:02 UTC (permalink / raw)
  To: dsterba, Sandy McArthur Jr, linux-btrfs

On Thu, Jan 15, 2015 at 12:24:41PM +0100, David Sterba wrote:
> On Wed, Jan 14, 2015 at 02:27:17PM -0800, Zach Brown wrote:
> > On Wed, Jan 14, 2015 at 04:06:02PM -0500, Sandy McArthur Jr wrote:
> > > Sometimes btrfs scrub status reports that it is not running when it still is.
> > > 
> > > I think this is a cosmetic bug. And I believe this is related to the
> > > scrub completing on some drives before others in a multi-drive btrfs
> > > filesystem that is not well balanced.
> > 
> > Boy, I don't really know this code, but it looks like:
> > 
> > if (ss->in_progress)
> > 	printf(", running for %llu seconds\n", ss->duration);
> > else
> > 	printf(", interrupted after %llu seconds, not running\n",
> > 			ss->duration);
> > 
> > in_progress = is_scrub_running_in_kernel(fdmnt, di_args, fi_args.num_devices);
> > 
> > static int is_scrub_running_in_kernel(int fd,
> >                 struct btrfs_ioctl_dev_info_args *di_args, u64 max_devices)
> > {
> >         struct scrub_progress sp;
> >         int i;
> >         int ret;
> > 
> >         for (i = 0; i < max_devices; i++) {
> >                 memset(&sp, 0, sizeof(sp));
> >                 sp.scrub_args.devid = di_args[i].devid;
> >                 ret = ioctl(fd, BTRFS_IOC_SCRUB_PROGRESS, &sp.scrub_args);
> >                 if (ret < 0 && errno == ENODEV)
> >                         continue;
> >                 if (ret < 0 && errno == ENOTCONN)
> >                         return 0;
> > 
> > It says that scrub isn't running if any devices have completed.  If you drop
> > all those ret < 0 conditional branches that are either noops or wrong, does it
> > work like you'd expect?
> 
> Why wrong? The ioctl callback returns -ENODEV or -ENOTCONN that get
> translated to the errno values and ioctl(...) returns -1 in both cases.

Wrong because returning 0 on the first ENOTCONN, instead of continuing
to find more devices which might still be scrubbing, leads to this
confusing status message.

That's my working theory, having spent 15 seconds reading code.  I would
not be surprised at all if I'm missing something here.
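
To make the theory concrete, here is a minimal sketch of the difference.
It assumes a simplified model: instead of calling
BTRFS_IOC_SCRUB_PROGRESS, each device's result is an entry in an array,
0 if the query succeeded (taken here to mean a scrub is still running on
that device) or the errno value it failed with. Both function names are
made up for illustration. This is not the btrfs-progs code, and the
actual fix may look different.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical model of the quoted loop: results[i] stands in for the
 * outcome of the per-device progress ioctl (0, ENODEV or ENOTCONN).
 * Assumption: a result of 0 means a scrub is running on that device.
 */
int scrub_running_first_enotconn(const int *results, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (results[i] == ENODEV)
			continue;
		if (results[i] == ENOTCONN)
			return 0;	/* gives up on the first idle device */
		if (results[i] == 0)
			return 1;
	}
	return 0;
}

/*
 * Same model, but every device is checked before concluding that
 * nothing is running.
 */
int scrub_running_any_device(const int *results, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (results[i] == 0)
			return 1;	/* at least one device is still busy */
	}
	return 0;
}
```

With results of { ENOTCONN, 0, ENODEV }, where the first device has
finished and the second is still being scrubbed, the first function
returns 0 and the second returns 1.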


- z


* Re: btrfs scrub status reports not running when it is
  2015-01-15 18:02     ` Zach Brown
@ 2015-01-19 17:45       ` David Sterba
  0 siblings, 0 replies; 7+ messages in thread
From: David Sterba @ 2015-01-19 17:45 UTC (permalink / raw)
  To: Zach Brown; +Cc: dsterba, Sandy McArthur Jr, linux-btrfs

On Thu, Jan 15, 2015 at 10:02:37AM -0800, Zach Brown wrote:
> > > It says that scrub isn't running if any devices have completed.  If you drop
> > > all those ret < 0 conditional branches that are either noops or wrong, does it
> > > work like you'd expect?
> > 
> > Why wrong? The ioctl callback returns -ENODEV or -ENOTCONN that get
> > translated to the errno values and ioctl(...) returns -1 in both cases.
> 
> Wrong because returning 0 on the first ENOTCONN, instead of continuing
> to find more devices which might still be scrubbing, leads to this
> confusing status message.

You're right, fix on the way.
