From: Chris Murphy <lists@colorremedies.com>
To: "Stefan Löwen" <stefan.loewen@gmail.com>
Cc: Chris Murphy <lists@colorremedies.com>,
	Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: btrfs send hung in pipe_wait
Date: Thu, 6 Sep 2018 12:16:47 -0600
Message-ID: <CAJCQCtTHxM+Bx8akyV+QdYch=y6-0hCf_3r1KonPC2vKsujkxQ@mail.gmail.com>
In-Reply-To: <090f8da0-c29c-da5f-6e5b-ec6961706508@gmail.com>

On Thu, Sep 6, 2018 at 10:03 AM, Stefan Löwen <stefan.loewen@gmail.com> wrote:
> I have one subvolume (rw) and 2 snapshots (ro) of it.
>
> I just tested 'btrfs send <subvol> > /dev/null' and that also shows no IO
> after a while but also no significant CPU usage.
> During this I tried 'ls' on the source subvolume and it hangs as well.
> dmesg has some interesting messages I think (see attached dmesg.log)
>

OK you've got a different problem.

[  186.898756] sd 2:0:0:0: [sdb] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
[  186.898762] sd 2:0:0:0: [sdb] tag#0 CDB: Read(10) 28 00 15 26 a0 d0 00 08 00 00
[  186.898764] print_req_error: I/O error, dev sdb, sector 354853072
[  187.109641] usb 2-1: reset SuperSpeed Gen 1 USB device number 2 using xhci_hcd
[  187.345245] usb 2-1: reset SuperSpeed Gen 1 USB device number 2 using xhci_hcd
[  187.657844] usb 2-1: reset SuperSpeed Gen 1 USB device number 2 using xhci_hcd
[  187.851336] usb 2-1: reset SuperSpeed Gen 1 USB device number 2 using xhci_hcd
[  188.026882] usb 2-1: reset SuperSpeed Gen 1 USB device number 2 using xhci_hcd
[  188.215881] usb 2-1: reset SuperSpeed Gen 1 USB device number 2 using xhci_hcd
[  188.247028] sd 2:0:0:0: [sdb] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
[  188.247041] sd 2:0:0:0: [sdb] tag#0 CDB: Read(10) 28 00 15 26 a8 d0 00 08 00 00
[  188.247048] print_req_error: I/O error, dev sdb, sector 354855120


These are read errors for specific sectors, so your drive has media
problems. I think that's the instigating problem here: a bunch of
other tasks depend on one or more reads completing, and those reads
never do. But weirdly there also isn't any kind of libata reset. At
least on SATA, by default we see a link reset after a command has not
returned in 30 seconds. That reset would totally clear the drive's
command queue, and then things can either recover or barf. But in your
case neither happens, and it just sits there with hung tasks.
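
As a cross-check on the log, the CDB bytes of the two failed Read(10)
commands can be decoded by hand. The sketch below assumes the standard
10-byte READ(10) layout (opcode 0x28, a 4-byte big-endian LBA in bytes
2-5, a 2-byte transfer length in bytes 7-8); the function name
decode_read10 is made up for illustration and is not part of any tool.

```python
def decode_read10(cdb_hex):
    """Decode a SCSI READ(10) CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex.replace(" ", ""))
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a 10-byte READ(10) CDB")
    # Bytes 2-5: starting logical block address, big-endian.
    lba = int.from_bytes(cdb[2:6], "big")
    # Bytes 7-8: number of logical blocks to transfer, big-endian.
    blocks = int.from_bytes(cdb[7:9], "big")
    return {"lba": lba, "blocks": blocks}


first = decode_read10("28 00 15 26 a0 d0 00 08 00 00")
second = decode_read10("28 00 15 26 a8 d0 00 08 00 00")
print(first)                         # {'lba': 354853072, 'blocks': 2048}
print(second)                        # {'lba': 354855120, 'blocks': 2048}
print(second["lba"] - first["lba"])  # 2048
```

The decoded LBAs match the sector numbers in the print_req_error
lines, and the two failures are back-to-back 2048-block reads. That the
kernel's sector numbers equal the CDB LBAs also suggests the device is
presenting 512-byte logical sectors, though that is an inference from
the log rather than something confirmed here.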

[  189.350360] BTRFS error (device sdb1): bdev /dev/sdb1 errs: wr 0, rd 2, flush 0, corrupt 0, gen 0

And that's the last we really see from Btrfs. After that, it's all
just hung task traces, which are rather unsurprising to me.
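
For anyone wanting to pull those per-device counters out of a saved
dmesg, here is a minimal sketch. The regular expression is modelled
only on the one line quoted above, so treat the format as an
assumption; parse_btrfs_errs is a hypothetical helper name. The same
counters are also available from 'btrfs device stats'.

```python
import re

# Pattern modelled on the single log line quoted in this message.
ERRS_RE = re.compile(
    r"bdev (?P<dev>\S+) errs: wr (?P<wr>\d+), rd (?P<rd>\d+), "
    r"flush (?P<flush>\d+), corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)


def parse_btrfs_errs(line):
    """Return (device, counters) for a Btrfs 'errs:' line, else None."""
    # Collapse whitespace so a mail-wrapped log line still matches.
    match = ERRS_RE.search(" ".join(line.split()))
    if match is None:
        return None
    counters = {k: int(v) for k, v in match.groupdict().items() if k != "dev"}
    return match.group("dev"), counters


line = ("[  189.350360] BTRFS error (device sdb1): bdev /dev/sdb1 errs: wr 0,\n"
        "rd 2, flush 0, corrupt 0, gen 0")
print(parse_btrfs_errs(line))
# ('/dev/sdb1', {'wr': 0, 'rd': 2, 'flush': 0, 'corrupt': 0, 'gen': 0})
```

Here rd 2 lines up with the two failed Read(10) commands earlier in
the log.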

Drives in USB enclosures add a whole bunch of complicating factors for
troubleshooting and repair, often including masking the actual logical
and physical sector size, the min and max IO size, the alignment
offset, and all kinds of other things. They can have all sorts of
bugs. I'm also not totally certain about the relationship between the
USB reset messages and the bad sector. As far as I know, the only way
we can get a sector LBA expressly noted in dmesg along with the failed
Read(10) command is if the drive has reported that discrete error back
to libata with sense information. So I'm accepting that as a reliable
error, rather than it being something like a cable. But the reset
messages could possibly be something else in addition to that.
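
To see what the enclosure is actually presenting to the kernel for
those values, the block layer exposes them under sysfs. A minimal
sketch, assuming the usual /sys/block/<dev>/ layout; the helper name
read_block_attrs and its sysfs_root parameter are made up here (the
parameter only exists so the function can be pointed at a test
directory). It only reads files.

```python
from pathlib import Path

# Attribute name -> path relative to /sys/block/<dev>/
ATTRS = {
    "logical_block_size": "queue/logical_block_size",
    "physical_block_size": "queue/physical_block_size",
    "minimum_io_size": "queue/minimum_io_size",
    "optimal_io_size": "queue/optimal_io_size",
    "alignment_offset": "alignment_offset",
}


def read_block_attrs(dev, sysfs_root="/sys/block"):
    """Read the I/O topology values the kernel reports for a block device."""
    base = Path(sysfs_root) / dev
    result = {}
    for name, rel in ATTRS.items():
        try:
            result[name] = int((base / rel).read_text().strip())
        except (OSError, ValueError):
            # Missing or unreadable attribute: record it as unknown.
            result[name] = None
    return result


if __name__ == "__main__":
    print(read_block_attrs("sdb"))
```

These are the values as reported through the USB bridge, which is
exactly the caveat above: they may differ from what the bare drive
would report on a SATA port.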

Anyway, the central issue is that sector 354855120 is having problems.
I can't tell from the trace if it's transient or persistent. If it's
transient, that might explain how you sometimes get send to start
working again briefly before it reverts to hanging. What do you get
for:

fdisk -l /dev/sdb
smartctl -x /dev/sdb
smartctl -l sct erc /dev/sdb

Those are all read-only commands; nothing is written or changed.



-- 
Chris Murphy
