From: Chris Murphy <lists@colorremedies.com>
To: Stefan Loewen <stefan.loewen@gmail.com>
Cc: Chris Murphy <lists@colorremedies.com>,
	Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: btrfs send hung in pipe_wait
Date: Fri, 7 Sep 2018 09:44:16 -0600
Message-ID: <CAJCQCtSa1-5Zae4_jqqhZk49YQ+6fKG+jgwcG2_uK5+sYfwCbQ@mail.gmail.com>
In-Reply-To: <CAHTTHimqg_wgqs0AXt73YzOv3ga7cAEUvbwMOVVT2JUVaNbsFQ@mail.gmail.com>

On Fri, Sep 7, 2018 at 6:47 AM, Stefan Loewen <stefan.loewen@gmail.com> wrote:
> Well... It seems it's not the hardware.
> I ran a long SMART check which ran through without errors and
> reallocation count is still 0.

That only checks the drive; it's an internal self-test. It doesn't check
anything else, including the connections.

Also, you do have a log with a read error and a sector LBA reported. So
there is a hardware issue, though it could just be transient.
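
Because a SMART self-test does not exercise the cable, bridge, or USB
controller, the kernel log is the other place to look. A minimal sketch of
a filter for it follows; the function name and the pattern list are
assumptions for illustration, not an exhaustive set of kernel messages:

```shell
#!/bin/sh
# Filter kernel log text on stdin for lines that hint at transport or
# media trouble. The pattern list is an assumption and is not exhaustive.
scan_kernel_log() {
    grep -iE 'I/O error|reset .*USB device|medium error|blocked for more than'
}
```

Typical use would be `dmesg | scan_kernel_log` or `journalctl -k | scan_kernel_log`.
No output (exit status 1 from grep) only means none of these patterns matched.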


> So I used clonezilla (partclone.btrfs) to mirror the drive to another
> drive (same model).
> Everything copied over just fine. No I/O error in dmesg.
>
> The new disk shows the same behavior.

So now I'm suspicious of USB behavior. Like I said earlier, when I've
got USB enclosed drives connected to my NUC, regardless of file system,
I routinely get hangs and USB resets. I have to connect all of my USB
enclosed drives to a good USB hub, or I have problems.



> So I created another subvolume, reflinked stuff over and found that it
> is enough to reflink one file, create a read-only snapshot and try to
> btrfs-send that. It's not happening with every file, but there are
> definitely multiple different files. The one I tested with is a 3.8GB
> ISO file.
> Even better:
> 'btrfs send --no-data snap-one > /dev/null'
> (snap-one containing just one iso file) hangs as well.

Do you have a list of steps to make this clear? It sounds like first
you copy a 3.8G ISO file to one subvolume, then reflink copy it into
another subvolume, then snapshot that 2nd subvolume, and try to send
the snapshot? But I want to be clear.
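
To pin the sequence down, here is a minimal sketch of the steps as guessed
above. The mount point, subvolume names, and file path are hypothetical
placeholders, not taken from the report, and the ISO is assumed to come
from a different filesystem so that the first copy is a real data copy.
By default the script only prints the commands (DRY_RUN=1):

```shell
#!/bin/sh
# Sketch of the suspected reproduction sequence. All paths and names are
# hypothetical placeholders; adjust them to the actual filesystem.
MNT="${MNT:-/mnt/btrfs}"          # assumed mount point of the btrfs filesystem
ISO="${ISO:-/path/to/file.iso}"   # assumed source file on another filesystem
DRY_RUN="${DRY_RUN:-1}"           # 1 = only print the commands

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

repro() {
    # 1. Plain copy of the ISO into a first subvolume.
    run btrfs subvolume create "$MNT/sub-a"
    run cp "$ISO" "$MNT/sub-a/file.iso"
    # 2. Reflink copy into a second subvolume.
    run btrfs subvolume create "$MNT/sub-b"
    run cp --reflink=always "$MNT/sub-a/file.iso" "$MNT/sub-b/file.iso"
    # 3. Read-only snapshot of the second subvolume.
    run btrfs subvolume snapshot -r "$MNT/sub-b" "$MNT/snap-one"
    # 4. Send the snapshot without file data, discarding the stream.
    run btrfs send --no-data -f /dev/null "$MNT/snap-one"
}
```

Calling `repro` prints the six commands; with DRY_RUN=0 it runs them, which
needs root and should only be done on a scratch filesystem.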

I've got piles of reflinked files in snapshots and they send OK,
although, like I said, I do sometimes get a 15-30 second hang during
sends.

> Still dmesg shows no IO errors, only "INFO: task btrfs-transacti:541
> blocked for more than 120 seconds." with associated call trace.
> btrfs-send reads some MB in the beginning, writes a few bytes and then
> hangs without further IO.
>
> copying the same file without --reflink, snapshotting and sending
> works without problems.
>
> I guess that pretty much eliminates bad sectors and points towards
> some problem with reflinks / btrfs metadata.

That's pretty weird. I'll keep trying and see if I hit this. What
happens if you downgrade to an older kernel, either 4.14 or 4.17, or
both? The send code is mainly in the kernel, whereas the receive code is
mainly in the user space tools, so for this testing you don't need to
downgrade the user space tools. If there's a bug here, I expect it's in
the kernel.




-- 
Chris Murphy
