From: "John Stoffel" <john@stoffel.org>
To: Ronnie Lazar <ronnie.lazar@vastdata.com>
Cc: linux-raid@vger.kernel.org, Asaf Levy <asaf@vastdata.com>
Subject: Re: Question about potential data consistency issues when writes failed in mdadm raid1
Date: Fri, 17 Mar 2023 16:58:43 -0400
Message-ID: <25620.54403.944889.209021@quad.stoffel.home>
In-Reply-To: <CALM_6_s7=eyDWFkirzg6ifqeeeF6-bnZD8n7=3=V+fm_qc34AQ@mail.gmail.com>

>>>>> "Ronnie" == Ronnie Lazar <ronnie.lazar@vastdata.com> writes:

> I'm trying to understand how mdadm protects against inconsistent data
> read in the face of failures that occur while writing to a device that
> has raid1.

You need to give a better test case, with examples. 

> Here is the scenario: I have set up raid1 that has 2 mirrors. First
> one is on local storage and the second is on remote storage.  The
> remote storage mirror is configured with write-mostly.

Configuration details?  And what is the remote device?  

> We have parallel jobs: 1 writing to an area on the device and the
> other reading from that area.

So you create /dev/md9 and are writing/reading from it, then the
system crashes and you lose the local half of the mirror, right?

> The write operation writes the data to the first mirror, and at that
> point the read operation reads the new data from the first mirror.

So how is your write succeeding if it's not written to both halves of
the MD device?  You need to give more details and maybe even some
example code showing what you're doing here. 
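
For instance, even a minimal sketch like the one below would tell us
what your writer and reader are actually doing.  The function name and
the use of fsync() here are my assumptions, not something from your
description; substitute whatever your jobs really do (O_DIRECT,
buffered I/O, etc.) and point it at the md device.

```python
import os


def write_and_read_back(path, offset, payload):
    """Write payload at offset, wait for fsync, then read the same range back.

    Hypothetical reproducer skeleton: 'path' would be the md device
    (or any test file), and fsync() stands in for whatever durability
    call the real writer uses.
    """
    fd = os.open(path, os.O_RDWR)
    try:
        written = os.pwrite(fd, payload, offset)
        os.fsync(fd)  # returns only once the kernel reports the data as stable
        return written, os.pread(fd, len(payload), offset)
    finally:
        os.close(fd)
```

Knowing whether your reader runs before or after the write call has
returned matters a lot for the answer.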

> Now, before data has been written to the second (remote) mirror a
> failure has occurred which caused the first machine to fail. When
> the machine comes up, the data is recovered from the second, remote,
> mirror.

Ah... some more details.  It sounds like you have a system A which is
writing to a site-local device as well as a device at a REMOTE site,
both in the same MD mirror, is this correct?

Are these iSCSI devices?  FibreChannel?  NBD devices?  More details
please.

> Now when reading from this area, the users will receive the older
> value, even though, in the first read they got the newer value that
> was written.

> Does mdadm protect against this inconsistency?

It shouldn't be returning success on the write until both sides of the
mirror are updated.  But we can't really tell until you give more
details and an example.
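
To make the distinction concrete, here is a toy model of the window
you describe.  It is only a sketch under my own assumptions (the class
and method names are made up, and this is not how the md driver is
written): a write lands on the local leg first and is acknowledged
only once the remote leg has it too.

```python
class ToyMirror:
    """Toy two-leg mirror; illustrative only, not md's implementation."""

    def __init__(self):
        # leg 0 = local, leg 1 = remote (hypothetical layout)
        self.legs = [{}, {}]

    def start_write(self, block, data):
        # Only the local leg has the new data; nothing is acknowledged yet.
        self.legs[0][block] = data
        return False  # write not yet acknowledged to the caller

    def finish_write(self, block):
        # The remote leg catches up; only now may the write be acknowledged.
        self.legs[1][block] = self.legs[0][block]
        return True

    def read(self, block, leg=0):
        return self.legs[leg].get(block)


m = ToyMirror()
m.legs[0][7] = m.legs[1][7] = "old"
acked = m.start_write(7, "new")
in_flight_read = m.read(7, leg=0)    # a concurrent reader can see "new"
after_crash_read = m.read(7, leg=1)  # if leg 0 is lost now, only "old" remains
```

In this model the reader saw data from a write that was never
acknowledged, so the question is whether your application treats that
read as a guarantee.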

I assume you're not building a RAID1 device and then writing to the
individual devices behind its back or something silly like that,
right?

John

