linux-raid.vger.kernel.org archive mirror
From: "Tkaczyk, Mariusz" <mariusz.tkaczyk@intel.com>
To: "linux-raid@vger.kernel.org" <linux-raid@vger.kernel.org>
Subject: Missing lock during partition read
Date: Thu, 5 Nov 2020 09:31:21 +0000
Message-ID: <SA0PR11MB4542ECA84F72506B39C3C9F1FFEE0@SA0PR11MB4542.namprd11.prod.outlook.com>

Hi all,

I found an issue caused by missing locking during partition detection.
I'm using md RAID 5 with IMSM metadata. On top of the array I created a
GPT with several partitions.

The problem is that a partition may stay read-only even after the parent
device (in this case the RAID array) becomes read-write. The issue doesn't
affect every partition: I have seen results where some partitions are
read-write and the rest are still read-only.
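
For anyone who wants to check for this state, here is a minimal Python
sketch that compares the "ro" sysfs attribute of a disk with that of its
partitions. It assumes the usual sysfs layout (/sys/block/<disk>/ro and
/sys/block/<disk>/<partition>/ro, with a "partition" attribute marking
partition directories). The function names are hypothetical and not part
of any tool.

```python
from pathlib import Path


def read_only_flags(disk, sysfs_root="/sys/block"):
    """Return {device_name: is_read_only} for a disk and its partitions."""
    disk_dir = Path(sysfs_root) / disk
    flags = {disk: (disk_dir / "ro").read_text().strip() == "1"}
    for child in sorted(disk_dir.iterdir()):
        # Partition directories carry a "partition" attribute file;
        # other subdirectories (e.g. "md", "queue") do not.
        if child.is_dir() and (child / "partition").exists():
            flags[child.name] = (child / "ro").read_text().strip() == "1"
    return flags


def stale_read_only(flags, disk):
    """List partitions that are read-only although the disk is read-write."""
    if flags[disk]:
        return []  # parent is read-only, so read-only partitions are expected
    return [name for name, ro in flags.items() if name != disk and ro]
```

On an affected system, stale_read_only(read_only_flags("md126"), "md126")
would be expected to list the partitions that were left behind.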

It is related to the assembly process for arrays with external metadata.
The array first appears read-only and is later switched to read-write.
The array's read-only state is respected, so if partition detection
starts at this stage, the partitions are created read-only.

The mode switch is driven from userspace by mdmon: it manages the array's
sysfs attribute "md/array_state", and the kernel switches the array to
read-write in that context. This is done by the set_disk_ro() function
(see array_state_store() in md.c).

So, as I wrote above, partition detection starts while the array is
read-only. I found that the issue occurs if mdmon changes the array state
in the background while detection is still running. The state change is
then applied only to the partitions already detected; it doesn't wait
for the rest to appear. Udev reports the md device change event
(generated by the "md/array_state" update) between the adds:

KERNEL[85844.484805] add /devices/virtual/block/md126/md126p1 (block)
KERNEL[85844.484853] change /devices/virtual/block/md126 (block)
KERNEL[85844.484912] add /devices/virtual/block/md126/md126p2 (block)
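
To spot this ordering in a longer capture, a small Python sketch along
these lines could be used. It assumes input lines in the format shown
above; the regular expression and function names are hypothetical. It
reports partition "add" events that arrive after a "change" event on the
parent device once at least one partition has already been added, which
only marks them as candidates for the problem.

```python
import re

# Matches lines such as:
# KERNEL[85844.484805] add /devices/virtual/block/md126/md126p1 (block)
EVENT_RE = re.compile(r"^(KERNEL|UDEV)\s*\[(\d+\.\d+)\]\s+(\w+)\s+(\S+)\s+\((\w+)\)$")


def parse_events(lines):
    """Return (timestamp, action, devpath) tuples for recognised lines."""
    events = []
    for line in lines:
        match = EVENT_RE.match(line.strip())
        if match:
            _source, stamp, action, devpath, _subsystem = match.groups()
            events.append((float(stamp), action, devpath))
    return events


def adds_after_parent_change(events, disk_path):
    """Partition adds that follow a parent 'change' seen in mid-scan."""
    seen_add = False
    seen_change = False
    late = []
    for _stamp, action, devpath in events:
        is_partition = devpath.startswith(disk_path + "/")
        if action == "add" and is_partition:
            if seen_change:
                late.append(devpath)
            else:
                seen_add = True
        elif action == "change" and devpath == disk_path and seen_add:
            seen_change = True
    return late
```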

This ends with /dev/md126p2 read-only. It can be fixed manually with
partprobe, but the system may drop to an emergency shell or dracut,
depending on configuration.

My understanding is that these two actions aren't synchronized, so a
race occurs. To prevent it, the shared resources should be locked.
This looks like an md problem; it cannot be reproduced on standalone
drives. What are your thoughts?
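
To make the suspected interleaving concrete, here is a toy Python model.
It is not kernel code. It assumes that each new partition takes a
snapshot of the parent's read-only flag shortly before it is registered,
and that the state change updates only the partitions registered so far.
ToyDisk, scan_partitions and the "racer" callback are hypothetical names
used for illustration only.

```python
class ToyDisk:
    """Simplified stand-in for a disk and its partition table."""

    def __init__(self):
        self.read_only = True
        self.partitions = {}  # name -> read-only flag

    def set_read_only(self, flag):
        # Update the disk and every partition registered so far.
        self.read_only = flag
        for name in self.partitions:
            self.partitions[name] = flag


def scan_partitions(disk, names, race_at=None, racer=None):
    """Register partitions; optionally run 'racer' inside the race window."""
    for name in names:
        inherited = disk.read_only  # snapshot of the parent's flag
        if name == race_at and racer is not None:
            racer()  # state change lands between snapshot and registration
        disk.partitions[name] = inherited


# Unsynchronized: the switch to read-write hits while p2 is being added.
racy = ToyDisk()
scan_partitions(racy, ["p1", "p2"], race_at="p2",
                racer=lambda: racy.set_read_only(False))

# Serialized: the switch runs only after the scan has finished.
serialized = ToyDisk()
scan_partitions(serialized, ["p1", "p2"])
serialized.set_read_only(False)
```

In the first run the model leaves p2 read-only on a read-write disk; in
the second run all partitions follow the disk. A lock held across the
whole scan would force the state change to run before or after it, which
is what the serialized case represents.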

TIA,
Mariusz


