From: Gionatan Danti <g.danti@assyoma.it>
To: Reindl Harald <h.reindl@thelounge.net>
Cc: Roman Mamedov <rm@romanrm.net>,
	linux-raid@vger.kernel.org, g.danti@assyoma.it
Subject: Re: Filesystem corruption on RAID1
Date: Fri, 14 Jul 2017 12:46:57 +0200
Message-ID: <9eea45ddc0f80f4f4e238b5c2527a1fa@assyoma.it>
In-Reply-To: <770b09d3-cff6-b6b2-0a51-5d11e8bac7e9@thelounge.net>

On 14-07-2017 02:32 Reindl Harald wrote:
> because you won't be that happy when the kernel spits out a disk each
> time a random SATA command times out - the 4 RAID10 disks on my
> workstation are from 2011 and showed them too several times in the
> past while they are just fine
> 
> here you go:
> http://strugglers.net/~andy/blog/2015/11/09/linux-software-raid-and-drive-timeouts/

Hi, so a premature/preventive drive detachment is not a silver bullet, 
and I buy it. However, I would at least expect this behavior to be 
configurable. Maybe it is, and I am missing something?

Anyway, what really surprises me is *not* that the drive was not
detached, but rather that corruption was allowed to make its way into
real data. I would naively expect that when a WRITE_QUEUED or
CACHE_FLUSH command aborts/fails (which *will* cause data corruption if
not properly handled), the I/O layer has the following options:

a) retry the write/flush. You don't want to retry indefinitely, so the
kernel needs some kind of counter/threshold; when the threshold is
reached, continue with b). This would mask sporadic errors while
propagating recurring ones;

b) notify the upper layer that a write error happened. For synchronous
and direct writes, it can do so by simply returning the appropriate
error code to the calling function. In this case, the block layer should
return an error to the MD driver, which must act accordingly: for
example, by dropping the disk from the array.

c) do nothing. This seems to me by far the worst choice.

If b) is correctly implemented, it should prevent corruption from
accumulating on the drives.
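
To make the a) -> b) flow concrete, here is a minimal sketch in Python
of what I have in mind. It is only an illustration of the idea, not how
the kernel or MD actually implement it: the names write_with_retry,
WriteError and Mirror, the submit callables and the default threshold of
3 retries are all hypothetical.

```python
class WriteError(Exception):
    """Raised when a write still fails after all retries (option b)."""


def write_with_retry(submit, data, max_retries=3):
    """Option a): retry a failing write up to max_retries times.

    `submit` is a hypothetical callable that returns True on success
    and False when the command aborts or fails.
    """
    attempts = 0
    while attempts <= max_retries:
        if submit(data):
            return attempts  # number of retries that were needed
        attempts += 1
    # Option b): threshold reached, propagate the error upward.
    raise WriteError(f"write failed after {max_retries} retries")


class Mirror:
    """Toy stand-in for an array: drops a member when its write fails."""

    def __init__(self, members):
        self.members = dict(members)  # name -> submit callable
        self.failed = []

    def write(self, data):
        for name in list(self.members):
            try:
                write_with_retry(self.members[name], data)
            except WriteError:
                # Act on the error: kick the member out of the array.
                del self.members[name]
                self.failed.append(name)
        if not self.members:
            raise WriteError("no working members left")
```

The point of the sketch is that a sporadic failure is absorbed by the
retry loop, while a persistent one always surfaces as an error that the
upper layer has to handle; option c) would correspond to swallowing the
exception and leaving the member in place.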

Please also note the *type* of corrupted data: not only user data, but
also the filesystem journal and metadata. The latter should be protected
by the use of write barriers / FUA, so the filesystem should be able to
stop itself *before* corruption occurs.

So I have some very important questions:
- how does MD behave when flushing data to disk?
- does it propagate write barriers?
- when a write barrier fails, is the error propagated to the upper 
layers?

Thank you all.

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8

