From: Phil Karn <karn@ka9q.net>
To: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
Cc: Paul Jones <paul@pauljones.id.au>,
	"linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>
Subject: Re: Extremely slow device removals
Date: Sat, 2 May 2020 01:22:25 -0700
Message-ID: <CAMwB8mg5npwzxFrBw8gdBt7KPbTb=M8d_MAGtbQbCoJS0GoMgA@mail.gmail.com>
In-Reply-To: <20200502074237.GM10769@hungrycats.org>

On Sat, May 2, 2020 at 12:42 AM Zygo Blaxell
<ce3g8jdj@umail.furryterror.org> wrote:

> If you use btrfs replace to move data between drives then you get all
> the advantages you describe.  Don't do 'device remove' if you can possibly
> avoid it.

But I had to use remove to do what I originally wanted to do: replace
four 6TB drives with two 16TB drives.  I could replace two but I'd
still have to remove two more. I may give up on that latter part for
now, but my original hope was to move everything to a smaller and
especially quieter box than the 10-year-old 4U server I have now
that's banished to the garage because of the noise. (Working on its
console in single-user is much less pleasant than retiring to the
house and using my laptop.) I also wanted to retire all four 6TB
drives because they have over 35K hours (four years) of continuous run
time. They keep passing their SMART checks but I didn't want to keep
pushing my luck.

> If there's data corruption on one disk, btrfs can detect it and replace
> the lost data from the good copy.

That's a very good point I should have remembered. FS-agnostic RAID
depends on drive-level error detection, and being an early TCP/IP guy
I have always been a fan of end-to-end checks. That said, I can't
remember EVER having one of my drives silently corrupt data. When one
failed, I knew it. (Boy, did I know it.)  I can detect silent
corruption even in my ext4 or xfs file systems because I've been
experimenting for years with stashing SHA file hashes in an extended
attribute and periodically verifying them. This originated as a simple
deduplication tool with the attributes used only as a cache. But I
became intrigued by other uses for file-level hashes, like looking for
a file on a heterogeneous collection of machines by multicasting its
hash, and the aforementioned check for silent corruption. (Yes, I know
btrfs checks automatically, but I won't represent what I'm doing as
anything but purely experimental.)
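
For illustration, the hash-in-an-extended-attribute check described
above might be sketched roughly as follows. This is not the tool
mentioned in this message; the attribute name "user.sha256", the choice
of SHA-256, and the function names are all assumptions, and
os.setxattr/os.getxattr are available on Linux only.

```python
import hashlib
import os

# Hypothetical attribute name; the message does not say which one is used.
ATTR = "user.sha256"


def file_sha256(path, chunk_size=1 << 20):
    """Return the hex SHA-256 of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            h.update(block)
    return h.hexdigest()


def stamp(path, xattr_set=None):
    """Hash the file and stash the digest in an extended attribute."""
    setter = xattr_set or os.setxattr  # resolved lazily; Linux only
    digest = file_sha256(path)
    setter(path, ATTR, digest.encode("ascii"))
    return digest


def verify(path, xattr_get=None):
    """Return True if the stored digest matches the current contents,
    False on a mismatch, and None if no digest has been stored."""
    getter = xattr_get or os.getxattr  # resolved lazily; Linux only
    try:
        stored = getter(path, ATTR).decode("ascii")
    except OSError:
        return None  # attribute missing or xattrs unsupported
    return stored == file_sha256(path)
```

A mismatch here only means the contents differ from what was hashed; a
real tool would also need some way (for example a stored mtime) to tell
a legitimate modification from silent corruption.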

I've never seen a btrfs scrub produce errors either, except on one
system with faulty RAM, where they showed up very quickly. I was never
going to trust that system with real data anyway. (BTW, I believe
strongly in ECC RAM. I can't understand why it isn't universal given
that it costs little more.)

I'm beginning to think I should look at some of the less tightly
coupled ways to provide redundant storage, such as gluster.
