All of lore.kernel.org
 help / color / mirror / Atom feed
* /sys/block/md126 still exists even after stopping the array
@ 2014-06-24 15:38 Francis Moreau
  2014-06-25  1:03 ` NeilBrown
  0 siblings, 1 reply; 23+ messages in thread
From: Francis Moreau @ 2014-06-24 15:38 UTC (permalink / raw)
  To: linux-raid

Hello,

I'm having the folloing behaviour with kernel 3.14.5 and mdadm v3.3.1.

After stopping all arrays, I still can see one of them in /sys/block/:

# cat /proc/mdstat
Personalities : [raid1]
md125 : active raid1 sdb3[1] sda3[0]
      483688448 blocks super 1.2 [2/2] [UU]
      [======>..............]  resync = 34.9% (169161280/483688448)
finish=44.0min speed=118852K/sec
      bitmap: 3/4 pages [12KB], 65536KB chunk

md126 : active raid1 sdb2[1] sda2[0]
      4038656 blocks super 1.2 [2/2] [UU]

md127 : active raid1 sdb1[1] sda1[0]
      524224 blocks super 1.0 [2/2] [UU]

unused devices: <none>

# mdadm --stop /dev/md12[567]
mdadm: stopped /dev/md125
mdadm: stopped /dev/md126
mdadm: stopped /dev/md127

# cat /proc/mdstat
Personalities : [raid1]
unused devices: <none>

# ls /sys/block/
md126  sda  sdb  sdc  sr0

# ls /sys/block/md126/md/
array_size  array_state  bitmap  chunk_size  component_size  layout
level  max_read_errors  metadata_version  new_dev  raid_disks
reshape_direction  reshape_position  resync_start  safe_mode_delay

# dmesg
....
[ 1573.715476] md125: detected capacity change from 495296970752 to 0
[ 1573.715626] md: md125 stopped.
[ 1573.715633] md: unbind<sdb3>
[ 1573.740681] md: export_rdev(sdb3)
[ 1573.740694] md: unbind<sda3>
[ 1573.754008] md: export_rdev(sda3)
[ 1573.773398] md126: detected capacity change from 4135583744 to 0
[ 1573.773403] md: md126 stopped.
[ 1573.773410] md: unbind<sdb2>
[ 1573.820652] md: export_rdev(sdb2)
[ 1573.820664] md: unbind<sda2>
[ 1573.873974] md: export_rdev(sda2)
[ 1573.889904] md127: detected capacity change from 536805376 to 0
[ 1573.889910] md: md127 stopped.
[ 1573.889917] md: unbind<sdb1>
[ 1573.913978] md: export_rdev(sdb1)
[ 1573.914033] md: unbind<sda1>
[ 1573.940627] md: export_rdev(sda1)

After waiting a couple of min, stopping again md126 worked:

[ 1835.755661] md: md126 stopped.

Is this expected ?

Thanks.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: /sys/block/md126 still exists even after stopping the array
  2014-06-24 15:38 /sys/block/md126 still exists even after stopping the array Francis Moreau
@ 2014-06-25  1:03 ` NeilBrown
  2014-06-25  6:59   ` Francis Moreau
                     ` (2 more replies)
  0 siblings, 3 replies; 23+ messages in thread
From: NeilBrown @ 2014-06-25  1:03 UTC (permalink / raw)
  To: Francis Moreau; +Cc: linux-raid

[-- Attachment #1: Type: text/plain, Size: 2523 bytes --]

On Tue, 24 Jun 2014 17:38:30 +0200 Francis Moreau <francis.moro@gmail.com>
wrote:

> Hello,
> 
> I'm having the folloing behaviour with kernel 3.14.5 and mdadm v3.3.1.
> 
> After stopping all arrays, I still can see one of them in /sys/block/:
> 
> # cat /proc/mdstat
> Personalities : [raid1]
> md125 : active raid1 sdb3[1] sda3[0]
>       483688448 blocks super 1.2 [2/2] [UU]
>       [======>..............]  resync = 34.9% (169161280/483688448)
> finish=44.0min speed=118852K/sec
>       bitmap: 3/4 pages [12KB], 65536KB chunk
> 
> md126 : active raid1 sdb2[1] sda2[0]
>       4038656 blocks super 1.2 [2/2] [UU]
> 
> md127 : active raid1 sdb1[1] sda1[0]
>       524224 blocks super 1.0 [2/2] [UU]
> 
> unused devices: <none>
> 
> # mdadm --stop /dev/md12[567]
> mdadm: stopped /dev/md125
> mdadm: stopped /dev/md126
> mdadm: stopped /dev/md127
> 
> # cat /proc/mdstat
> Personalities : [raid1]
> unused devices: <none>
> 
> # ls /sys/block/
> md126  sda  sdb  sdc  sr0
> 
> # ls /sys/block/md126/md/
> array_size  array_state  bitmap  chunk_size  component_size  layout
> level  max_read_errors  metadata_version  new_dev  raid_disks
> reshape_direction  reshape_position  resync_start  safe_mode_delay
> 
> # dmesg
> ....
> [ 1573.715476] md125: detected capacity change from 495296970752 to 0
> [ 1573.715626] md: md125 stopped.
> [ 1573.715633] md: unbind<sdb3>
> [ 1573.740681] md: export_rdev(sdb3)
> [ 1573.740694] md: unbind<sda3>
> [ 1573.754008] md: export_rdev(sda3)
> [ 1573.773398] md126: detected capacity change from 4135583744 to 0
> [ 1573.773403] md: md126 stopped.
> [ 1573.773410] md: unbind<sdb2>
> [ 1573.820652] md: export_rdev(sdb2)
> [ 1573.820664] md: unbind<sda2>
> [ 1573.873974] md: export_rdev(sda2)
> [ 1573.889904] md127: detected capacity change from 536805376 to 0
> [ 1573.889910] md: md127 stopped.
> [ 1573.889917] md: unbind<sdb1>
> [ 1573.913978] md: export_rdev(sdb1)
> [ 1573.914033] md: unbind<sda1>
> [ 1573.940627] md: export_rdev(sda1)
> 
> After waiting a couple of min, stopping again md126 worked:
> 
> [ 1835.755661] md: md126 stopped.
> 
> Is this expected ?

No overly surprising.

This is probably caused by udev, or something udev runs, opening /dev/md126
after it has been stopped.  This has the effect of creating an empty inactive
array.
e.g.

 mknod /dev/test b 9 57
 < /dev/test

will make /sys/block/md57 appear.

It is a bit untidy, but shouldn't cause problems.

NeilBrown

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: /sys/block/md126 still exists even after stopping the array
  2014-06-25  1:03 ` NeilBrown
@ 2014-06-25  6:59   ` Francis Moreau
  2014-07-24 13:40   ` Sebastian Parschauer
  2014-09-25 16:12   ` Francis Moreau
  2 siblings, 0 replies; 23+ messages in thread
From: Francis Moreau @ 2014-06-25  6:59 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid

Hello Neil,

On 06/25/2014 03:03 AM, NeilBrown wrote:
[...]

>>
>> After waiting a couple of min, stopping again md126 worked:
>>
>> [ 1835.755661] md: md126 stopped.
>>
>> Is this expected ?
> 
> No overly surprising.
> 
> This is probably caused by udev, or something udev runs, opening /dev/md126
> after it has been stopped.  This has the effect of creating an empty inactive
> array.
> e.g.
> 

Why opening /dev/md126 after it's been stopped didn't fail ?

Also why doing 'mdadm --stop /dev/md126' several times reports that the
array was stopped each time ?

>  mknod /dev/test b 9 57
>  < /dev/test
> 
> will make /sys/block/md57 appear.
> 
> It is a bit untidy, but shouldn't cause problems.
>

Another 'inconsistent' state I noticed is when a new array is created.
There's a time window when /sys//block/md127/md exists but the contained
disk (dev-sda1 dev-sdb1, for example) are not present in that directory.
This state (as the previous one) confuses my tool that is listening udev
events in order to register all block devices present in the system.

So I am wondering how I should deal with this 'transition' states and
how can detect them in a reliable way ?

Thanks.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: /sys/block/md126 still exists even after stopping the array
  2014-06-25  1:03 ` NeilBrown
  2014-06-25  6:59   ` Francis Moreau
@ 2014-07-24 13:40   ` Sebastian Parschauer
  2014-07-24 13:51     ` Artur Paszkiewicz
  2014-09-25 16:12   ` Francis Moreau
  2 siblings, 1 reply; 23+ messages in thread
From: Sebastian Parschauer @ 2014-07-24 13:40 UTC (permalink / raw)
  To: NeilBrown, Francis Moreau; +Cc: linux-raid

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 25.06.2014 03:03, NeilBrown wrote:
> On Tue, 24 Jun 2014 17:38:30 +0200 Francis Moreau
> <francis.moro@gmail.com> wrote:
> 
>> Hello,
>> 
>> I'm having the folloing behaviour with kernel 3.14.5 and mdadm
>> v3.3.1.
>> 
>> After stopping all arrays, I still can see one of them in
>> /sys/block/:
>> 
>> # cat /proc/mdstat Personalities : [raid1] md125 : active raid1
>> sdb3[1] sda3[0] 483688448 blocks super 1.2 [2/2] [UU] 
>> [======>..............]  resync = 34.9% (169161280/483688448) 
>> finish=44.0min speed=118852K/sec bitmap: 3/4 pages [12KB],
>> 65536KB chunk
>> 
>> md126 : active raid1 sdb2[1] sda2[0] 4038656 blocks super 1.2
>> [2/2] [UU]
>> 
>> md127 : active raid1 sdb1[1] sda1[0] 524224 blocks super 1.0
>> [2/2] [UU]
>> 
>> unused devices: <none>
>> 
>> # mdadm --stop /dev/md12[567] mdadm: stopped /dev/md125 mdadm:
>> stopped /dev/md126 mdadm: stopped /dev/md127
>> 
>> # cat /proc/mdstat Personalities : [raid1] unused devices:
>> <none>
>> 
>> # ls /sys/block/ md126  sda  sdb  sdc  sr0
>> 
>> # ls /sys/block/md126/md/ array_size  array_state  bitmap
>> chunk_size  component_size  layout level  max_read_errors
>> metadata_version  new_dev  raid_disks reshape_direction
>> reshape_position  resync_start  safe_mode_delay
>> 
>> # dmesg .... [ 1573.715476] md125: detected capacity change from
>> 495296970752 to 0 [ 1573.715626] md: md125 stopped. [
>> 1573.715633] md: unbind<sdb3> [ 1573.740681] md:
>> export_rdev(sdb3) [ 1573.740694] md: unbind<sda3> [ 1573.754008]
>> md: export_rdev(sda3) [ 1573.773398] md126: detected capacity
>> change from 4135583744 to 0 [ 1573.773403] md: md126 stopped. [
>> 1573.773410] md: unbind<sdb2> [ 1573.820652] md:
>> export_rdev(sdb2) [ 1573.820664] md: unbind<sda2> [ 1573.873974]
>> md: export_rdev(sda2) [ 1573.889904] md127: detected capacity
>> change from 536805376 to 0 [ 1573.889910] md: md127 stopped. [
>> 1573.889917] md: unbind<sdb1> [ 1573.913978] md:
>> export_rdev(sdb1) [ 1573.914033] md: unbind<sda1> [ 1573.940627]
>> md: export_rdev(sda1)
>> 
>> After waiting a couple of min, stopping again md126 worked:
>> 
>> [ 1835.755661] md: md126 stopped.
>> 
>> Is this expected ?
> 
> No overly surprising.
> 
> This is probably caused by udev, or something udev runs, opening
> /dev/md126 after it has been stopped.  This has the effect of
> creating an empty inactive array. e.g.
> 
> mknod /dev/test b 9 57 < /dev/test
> 
> will make /sys/block/md57 appear.
> 
> It is a bit untidy, but shouldn't cause problems.
> 
> NeilBrown
> 


Hi Neil,

we are also noticing this issue with udev-215 on Gentoo. They've
switched over to systemd. Deactivating all udev rules doesn't help. So
it must be an issue in the code. udev-204 works fine.
When stopping udev, then it is possible to stop the MD array and it
remains stopped (/sys/block/mdX disappears). Looks like they do
something really strange there.

Do you have any advice how to fix this in another way but to change
the kernel or to downgrade udev?

File a systemd bug?

Cheers,
Sebastian
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJT0QzGAAoJEH4DRb7WXajZXooIALBi2HzzXL0aLx6aKf3EHIRb
/PmRFhGdHZlGcETbgYflbuFYz30rLs3AYfOm+/PpGNpatppxuKkEddEq/2Lp/wJn
2cQt6cd2F6fOdgMq9MK3tM2nnj4HRPpqpq5huvnvijTYsPXRYvepjQUxFtd7/G/L
z/41cRGeeLOhlwfoUeWIwm1XKsetLgUl8VyKmxxtBRwCA5UZjfdE9u1X7TUxxA7d
ZRtASftmtXqjtxLHRzV+0LRSDgBBAYLzrJTlrgPf5oN+rW/NtjNxld5V0oqynrsW
WawC3ntc+z79tPtfHSLV1icshhmkfhcQzWjOcmW4RypFF/7h+3zTBBi262FEiGM=
=CvmT
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: /sys/block/md126 still exists even after stopping the array
  2014-07-24 13:40   ` Sebastian Parschauer
@ 2014-07-24 13:51     ` Artur Paszkiewicz
  0 siblings, 0 replies; 23+ messages in thread
From: Artur Paszkiewicz @ 2014-07-24 13:51 UTC (permalink / raw)
  To: Sebastian Parschauer, NeilBrown, Francis Moreau; +Cc: linux-raid

On 07/24/2014 03:40 PM, Sebastian Parschauer wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> On 25.06.2014 03:03, NeilBrown wrote:
>> On Tue, 24 Jun 2014 17:38:30 +0200 Francis Moreau
>> <francis.moro@gmail.com> wrote:
>>
>>> Hello,
>>>
>>> I'm having the folloing behaviour with kernel 3.14.5 and mdadm
>>> v3.3.1.
>>>
>>> After stopping all arrays, I still can see one of them in
>>> /sys/block/:
>>>
>>> # cat /proc/mdstat Personalities : [raid1] md125 : active raid1
>>> sdb3[1] sda3[0] 483688448 blocks super 1.2 [2/2] [UU] 
>>> [======>..............]  resync = 34.9% (169161280/483688448) 
>>> finish=44.0min speed=118852K/sec bitmap: 3/4 pages [12KB],
>>> 65536KB chunk
>>>
>>> md126 : active raid1 sdb2[1] sda2[0] 4038656 blocks super 1.2
>>> [2/2] [UU]
>>>
>>> md127 : active raid1 sdb1[1] sda1[0] 524224 blocks super 1.0
>>> [2/2] [UU]
>>>
>>> unused devices: <none>
>>>
>>> # mdadm --stop /dev/md12[567] mdadm: stopped /dev/md125 mdadm:
>>> stopped /dev/md126 mdadm: stopped /dev/md127
>>>
>>> # cat /proc/mdstat Personalities : [raid1] unused devices:
>>> <none>
>>>
>>> # ls /sys/block/ md126  sda  sdb  sdc  sr0
>>>
>>> # ls /sys/block/md126/md/ array_size  array_state  bitmap
>>> chunk_size  component_size  layout level  max_read_errors
>>> metadata_version  new_dev  raid_disks reshape_direction
>>> reshape_position  resync_start  safe_mode_delay
>>>
>>> # dmesg .... [ 1573.715476] md125: detected capacity change from
>>> 495296970752 to 0 [ 1573.715626] md: md125 stopped. [
>>> 1573.715633] md: unbind<sdb3> [ 1573.740681] md:
>>> export_rdev(sdb3) [ 1573.740694] md: unbind<sda3> [ 1573.754008]
>>> md: export_rdev(sda3) [ 1573.773398] md126: detected capacity
>>> change from 4135583744 to 0 [ 1573.773403] md: md126 stopped. [
>>> 1573.773410] md: unbind<sdb2> [ 1573.820652] md:
>>> export_rdev(sdb2) [ 1573.820664] md: unbind<sda2> [ 1573.873974]
>>> md: export_rdev(sda2) [ 1573.889904] md127: detected capacity
>>> change from 536805376 to 0 [ 1573.889910] md: md127 stopped. [
>>> 1573.889917] md: unbind<sdb1> [ 1573.913978] md:
>>> export_rdev(sdb1) [ 1573.914033] md: unbind<sda1> [ 1573.940627]
>>> md: export_rdev(sda1)
>>>
>>> After waiting a couple of min, stopping again md126 worked:
>>>
>>> [ 1835.755661] md: md126 stopped.
>>>
>>> Is this expected ?
>>
>> No overly surprising.
>>
>> This is probably caused by udev, or something udev runs, opening
>> /dev/md126 after it has been stopped.  This has the effect of
>> creating an empty inactive array. e.g.
>>
>> mknod /dev/test b 9 57 < /dev/test
>>
>> will make /sys/block/md57 appear.
>>
>> It is a bit untidy, but shouldn't cause problems.
>>
>> NeilBrown
>>
> 
> 
> Hi Neil,
> 
> we are also noticing this issue with udev-215 on Gentoo. They've
> switched over to systemd. Deactivating all udev rules doesn't help. So
> it must be an issue in the code. udev-204 works fine.
> When stopping udev, then it is possible to stop the MD array and it
> remains stopped (/sys/block/mdX disappears). Looks like they do
> something really strange there.
> 
> Do you have any advice how to fix this in another way but to change
> the kernel or to downgrade udev?
> 
> File a systemd bug?

Hi,

I have also noticed this issue. It is caused by this change in udev:

commit 3ebdb81ef088afd3b4c72b516beb5610f8c93a0d
Author: Kay Sievers <kay at vrfy.org>
Date:   Sun Apr 13 19:54:27 2014 -0700

     udev: serialize/synchronize block device event handling with file locks

http://cgit.freedesktop.org/systemd/systemd/commit/?id=3ebdb81ef088afd3b4c72b516beb5610f8c93a0d

It seems that they have already disabled this for dm for some reason,
but not for md:

commit e918a1b5a94f270186dca59156354acd2a596494
Author: Kay Sievers <kay@vrfy.org>
Date:   Tue Jun 3 16:49:38 2014 +0200

    udev: exclude device-mapper from block device ownership event locking

http://cgit.freedesktop.org/systemd/systemd/commit/?id=e918a1b5a94f270186dca59156354acd2a596494


Artur

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: /sys/block/md126 still exists even after stopping the array
  2014-06-25  1:03 ` NeilBrown
  2014-06-25  6:59   ` Francis Moreau
  2014-07-24 13:40   ` Sebastian Parschauer
@ 2014-09-25 16:12   ` Francis Moreau
  2014-09-26  0:33     ` NeilBrown
  2 siblings, 1 reply; 23+ messages in thread
From: Francis Moreau @ 2014-09-25 16:12 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid, sebastian.riemer

Hello,

On 06/25/2014 03:03 AM, NeilBrown wrote:
> On Tue, 24 Jun 2014 17:38:30 +0200 Francis Moreau <francis.moro@gmail.com>
> wrote:
> 
>> Hello,
>>
>> I'm having the folloing behaviour with kernel 3.14.5 and mdadm v3.3.1.
>>
>> After stopping all arrays, I still can see one of them in /sys/block/:
>>
>> # cat /proc/mdstat
>> Personalities : [raid1]
>> md125 : active raid1 sdb3[1] sda3[0]
>>       483688448 blocks super 1.2 [2/2] [UU]
>>       [======>..............]  resync = 34.9% (169161280/483688448)
>> finish=44.0min speed=118852K/sec
>>       bitmap: 3/4 pages [12KB], 65536KB chunk
>>
>> md126 : active raid1 sdb2[1] sda2[0]
>>       4038656 blocks super 1.2 [2/2] [UU]
>>
>> md127 : active raid1 sdb1[1] sda1[0]
>>       524224 blocks super 1.0 [2/2] [UU]
>>
>> unused devices: <none>
>>
>> # mdadm --stop /dev/md12[567]
>> mdadm: stopped /dev/md125
>> mdadm: stopped /dev/md126
>> mdadm: stopped /dev/md127
>>
>> # cat /proc/mdstat
>> Personalities : [raid1]
>> unused devices: <none>
>>
>> # ls /sys/block/
>> md126  sda  sdb  sdc  sr0
>>
>> # ls /sys/block/md126/md/
>> array_size  array_state  bitmap  chunk_size  component_size  layout
>> level  max_read_errors  metadata_version  new_dev  raid_disks
>> reshape_direction  reshape_position  resync_start  safe_mode_delay
>>
>> # dmesg
>> ....
>> [ 1573.715476] md125: detected capacity change from 495296970752 to 0
>> [ 1573.715626] md: md125 stopped.
>> [ 1573.715633] md: unbind<sdb3>
>> [ 1573.740681] md: export_rdev(sdb3)
>> [ 1573.740694] md: unbind<sda3>
>> [ 1573.754008] md: export_rdev(sda3)
>> [ 1573.773398] md126: detected capacity change from 4135583744 to 0
>> [ 1573.773403] md: md126 stopped.
>> [ 1573.773410] md: unbind<sdb2>
>> [ 1573.820652] md: export_rdev(sdb2)
>> [ 1573.820664] md: unbind<sda2>
>> [ 1573.873974] md: export_rdev(sda2)
>> [ 1573.889904] md127: detected capacity change from 536805376 to 0
>> [ 1573.889910] md: md127 stopped.
>> [ 1573.889917] md: unbind<sdb1>
>> [ 1573.913978] md: export_rdev(sdb1)
>> [ 1573.914033] md: unbind<sda1>
>> [ 1573.940627] md: export_rdev(sda1)
>>
>> After waiting a couple of min, stopping again md126 worked:
>>
>> [ 1835.755661] md: md126 stopped.
>>
>> Is this expected ?
> 
> No overly surprising.
> 
> This is probably caused by udev, or something udev runs, opening /dev/md126
> after it has been stopped.  This has the effect of creating an empty inactive
> array.
> e.g.
> 

Sorry for resurecting this again, but I'm still seeing this.

Sebastian saw the same behaviour with udev 215 and it appeared that this
specific version introduced a regression which resulted in the same
behavior I described initialy.

But in my case, the version of udev used is older (204).

I tried to find out what could have opened the md device by using fuser,
but fuser reports no users.

I took a look to the udev rules which are the one shipped by mdadm 3.3.2
but nothing keep the device opened during the remove event.

Could you give me some hints here to debug this ?

Thanks



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: /sys/block/md126 still exists even after stopping the array
  2014-09-25 16:12   ` Francis Moreau
@ 2014-09-26  0:33     ` NeilBrown
  2014-09-26 10:23       ` Francis Moreau
  0 siblings, 1 reply; 23+ messages in thread
From: NeilBrown @ 2014-09-26  0:33 UTC (permalink / raw)
  To: Francis Moreau; +Cc: linux-raid, sebastian.riemer

[-- Attachment #1: Type: text/plain, Size: 3853 bytes --]

On Thu, 25 Sep 2014 18:12:07 +0200 Francis Moreau <francis.moro@gmail.com>
wrote:

> Hello,
> 
> On 06/25/2014 03:03 AM, NeilBrown wrote:
> > On Tue, 24 Jun 2014 17:38:30 +0200 Francis Moreau <francis.moro@gmail.com>
> > wrote:
> > 
> >> Hello,
> >>
> >> I'm having the folloing behaviour with kernel 3.14.5 and mdadm v3.3.1.
> >>
> >> After stopping all arrays, I still can see one of them in /sys/block/:
> >>
> >> # cat /proc/mdstat
> >> Personalities : [raid1]
> >> md125 : active raid1 sdb3[1] sda3[0]
> >>       483688448 blocks super 1.2 [2/2] [UU]
> >>       [======>..............]  resync = 34.9% (169161280/483688448)
> >> finish=44.0min speed=118852K/sec
> >>       bitmap: 3/4 pages [12KB], 65536KB chunk
> >>
> >> md126 : active raid1 sdb2[1] sda2[0]
> >>       4038656 blocks super 1.2 [2/2] [UU]
> >>
> >> md127 : active raid1 sdb1[1] sda1[0]
> >>       524224 blocks super 1.0 [2/2] [UU]
> >>
> >> unused devices: <none>
> >>
> >> # mdadm --stop /dev/md12[567]
> >> mdadm: stopped /dev/md125
> >> mdadm: stopped /dev/md126
> >> mdadm: stopped /dev/md127
> >>
> >> # cat /proc/mdstat
> >> Personalities : [raid1]
> >> unused devices: <none>
> >>
> >> # ls /sys/block/
> >> md126  sda  sdb  sdc  sr0
> >>
> >> # ls /sys/block/md126/md/
> >> array_size  array_state  bitmap  chunk_size  component_size  layout
> >> level  max_read_errors  metadata_version  new_dev  raid_disks
> >> reshape_direction  reshape_position  resync_start  safe_mode_delay
> >>
> >> # dmesg
> >> ....
> >> [ 1573.715476] md125: detected capacity change from 495296970752 to 0
> >> [ 1573.715626] md: md125 stopped.
> >> [ 1573.715633] md: unbind<sdb3>
> >> [ 1573.740681] md: export_rdev(sdb3)
> >> [ 1573.740694] md: unbind<sda3>
> >> [ 1573.754008] md: export_rdev(sda3)
> >> [ 1573.773398] md126: detected capacity change from 4135583744 to 0
> >> [ 1573.773403] md: md126 stopped.
> >> [ 1573.773410] md: unbind<sdb2>
> >> [ 1573.820652] md: export_rdev(sdb2)
> >> [ 1573.820664] md: unbind<sda2>
> >> [ 1573.873974] md: export_rdev(sda2)
> >> [ 1573.889904] md127: detected capacity change from 536805376 to 0
> >> [ 1573.889910] md: md127 stopped.
> >> [ 1573.889917] md: unbind<sdb1>
> >> [ 1573.913978] md: export_rdev(sdb1)
> >> [ 1573.914033] md: unbind<sda1>
> >> [ 1573.940627] md: export_rdev(sda1)
> >>
> >> After waiting a couple of min, stopping again md126 worked:
> >>
> >> [ 1835.755661] md: md126 stopped.
> >>
> >> Is this expected ?
> > 
> > No overly surprising.
> > 
> > This is probably caused by udev, or something udev runs, opening /dev/md126
> > after it has been stopped.  This has the effect of creating an empty inactive
> > array.
> > e.g.
> > 
> 
> Sorry for resurecting this again, but I'm still seeing this.
> 
> Sebastian saw the same behaviour with udev 215 and it appeared that this
> specific version introduced a regression which resulted in the same
> behavior I described initialy.
> 
> But in my case, the version of udev used is older (204).
> 
> I tried to find out what could have opened the md device by using fuser,
> but fuser reports no users.

It is probably a transient open/close.

> 
> I took a look to the udev rules which are the one shipped by mdadm 3.3.2
> but nothing keep the device opened during the remove event.
> 
> Could you give me some hints here to debug this ?

Modify md_open in drivers/md/md.c to add
   printk("Opened by %s\n", current->comm);

and build a new kernel.  That will tell you the name of the process which
opened the device.

NeilBrown

> 
> Thanks
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: /sys/block/md126 still exists even after stopping the array
  2014-09-26  0:33     ` NeilBrown
@ 2014-09-26 10:23       ` Francis Moreau
  2014-09-26 10:44         ` NeilBrown
  0 siblings, 1 reply; 23+ messages in thread
From: Francis Moreau @ 2014-09-26 10:23 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid, sebastian.riemer

Hello Neil,

On 09/26/2014 02:33 AM, NeilBrown wrote:
> On Thu, 25 Sep 2014 18:12:07 +0200 Francis Moreau <francis.moro@gmail.com>
> wrote:
[...]
>> I tried to find out what could have opened the md device by using fuser,
>> but fuser reports no users.
> 
> It is probably a transient open/close.
> 

If it's open/close wouldn't the 'close' part make the device disapear ?

>>
>> I took a look to the udev rules which are the one shipped by mdadm 3.3.2
>> but nothing keep the device opened during the remove event.
>>
>> Could you give me some hints here to debug this ?
> 
> Modify md_open in drivers/md/md.c to add
>    printk("Opened by %s\n", current->comm);
> 
> and build a new kernel.  That will tell you the name of the process which
> opened the device.
> 

I did that I also added a trace in md_release() but strangely no trace
were outputed from there.

Here's the details of what I did:

--- %< ---
[root@localhost ~]# cat /proc/mdstat
Personalities : [raid1]
md125 : active raid1 vdc1[1] vdb1[0]
      65472 blocks super 1.0 [2/2] [UU]

md126 : active raid1 vdc2[1] vdb2[0]
      209536 blocks super 1.2 [2/2] [UU]

md127 : active raid1 vdb3[0] vdc3[1]
      1819584 blocks super 1.2 [2/2] [UU]

unused devices: <none>

[root@localhost ~]# mdadm --stop --scan

[root@localhost ~]# dmesg | grep md_
[    1.474207] md_open(): opened by mdadm
[    1.475316] md_open(): opened by mdadm
[    1.492880] md_open(): opened by mdadm
[    1.493201] md_open(): opened by mdadm
[    1.494690] md_open(): opened by mdadm
[    1.499369] md_open(): opened by mdadm
[    1.533566] md_open(): opened by mdadm
[    1.533697] md_open(): opened by mdadm
[    1.554419] md_open(): opened by mdadm
[    1.574451] md_open(): opened by mdadm
[    1.574666] md_open(): opened by mdadm
[    1.574877] md_open(): opened by mdadm
[    1.576822] md_open(): opened by systemd-udevd
[    1.576895] md_open(): opened by systemd-udevd
[    1.577029] md_open(): opened by systemd-udevd
[    1.581850] md_open(): opened by mdadm
[    1.584054] md_open(): opened by systemd-udevd
[    1.584770] md_open(): opened by mdadm
[    1.585175] md_open(): opened by mdadm
[    1.586328] md_open(): opened by systemd-udevd
[    1.586933] md_open(): opened by systemd-udevd
[    1.651265] md_open(): opened by mdadm
[    1.651320] md_open(): opened by mdadm
[    1.651364] md_open(): opened by mdadm
[    1.651437] md_open(): opened by mdadm
[    1.652376] md_open(): opened by mdadm
[    1.652452] md_open(): opened by mdadm
[   33.486704] md_open(): opened by mdadm
[   33.489259] md_open(): opened by mdadm
[   33.491000] md_open(): opened by mdadm
[   33.491767] md_open(): opened by systemd-udevd
[   33.692255] md_open(): opened by mdadm
[   33.692288] md_open(): opened by mdadm
[   33.692606] md_open(): opened by mdadm
[   33.692858] md_open(): opened by mdadm
[   33.692942] md_open(): opened by mdadm
[   33.693237] md_open(): opened by mdadm
[   33.694254] md_open(): opened by mdadm
[   33.694275] md_open(): opened by mdadm
[   33.694373] md_open(): opened by mdadm
[   33.695558] md_open(): opened by mdadm
[   33.695679] md_open(): opened by mdadm
[   33.695855] md_open(): opened by mdadm
[   33.695894] md_open(): opened by mdadm

[root@localhost ~]# ls /dev/md125
/dev/md125

[root@localhost ~]# fuser /dev/md125

[root@localhost ~]# ps aux | grep "mdadm\|systemd-udevd"
root       366  0.0  0.1  38172  1696 ?        Ss   06:04   0:00
/usr/lib/systemd/systemd-udevd
root       465  0.0  0.0   4964   924 ?        Ss   06:04   0:00
/sbin/mdadm --monitor --scan --daemonise --syslog
--pid-file=/run/mdadm/mdadm.pid

[root@localhost ~]# ls -l /proc/366/fd/
total 0
lrwx------ 1 root root 64 Sep 26 06:04 0 -> /dev/null
lrwx------ 1 root root 64 Sep 26 06:04 1 -> /dev/null
lrwx------ 1 root root 64 Sep 26 06:04 10 -> socket:[8665]
lr-x------ 1 root root 64 Sep 26 06:04 11 -> /etc/udev/hwdb.bin
lrwx------ 1 root root 64 Sep 26 06:04 12 -> anon_inode:[eventpoll]
lrwx------ 1 root root 64 Sep 26 06:04 2 -> /dev/null
lrwx------ 1 root root 64 Sep 26 06:04 3 -> socket:[8144]
lrwx------ 1 root root 64 Sep 26 06:04 4 -> socket:[8103]
lrwx------ 1 root root 64 Sep 26 06:04 5 -> socket:[8660]
lrwx------ 1 root root 64 Sep 26 06:04 6 -> /run/udev/queue.bin
lr-x------ 1 root root 64 Sep 26 06:04 7 -> anon_inode:inotify
lrwx------ 1 root root 64 Sep 26 06:04 8 -> anon_inode:[signalfd]
lrwx------ 1 root root 64 Sep 26 06:04 9 -> socket:[8664]

[root@localhost ~]# ls -l /proc/465/fd/
total 0
lrwx------ 1 root root 64 Sep 26 06:04 0 -> /dev/null
lrwx------ 1 root root 64 Sep 26 06:04 1 -> /dev/null
lrwx------ 1 root root 64 Sep 26 06:04 2 -> /dev/null
lr-x------ 1 root root 64 Sep 26 06:06 4 -> /proc/mdstat
lrwx------ 1 root root 64 Sep 26 06:06 5 -> socket:[10038]

[root@localhost ~]# cat /proc/mdstat
Personalities : [raid1]
unused devices: <none>

[root@localhost ~]# ls /sys/block/md125/md/
array_size  array_state  bitmap/  chunk_size  component_size  layout
level  max_read_errors  metadata_version  new_dev  raid_disks
reshape_direction  reshape_position  resync_start  safe_mode_delay

--- >% ---

So in my understanding, only mdadm and udevd are opening the MD devices
and mdamd was the last to open the device. For some unknown reasons,
md_release() is never called.

This happens with:

 - kernel 3.14.19
 - mdadm 3.3.2
 - systemd 208

Can you see something wrong here ?

Thanks.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: /sys/block/md126 still exists even after stopping the array
  2014-09-26 10:23       ` Francis Moreau
@ 2014-09-26 10:44         ` NeilBrown
  2014-09-26 11:23           ` Artur Paszkiewicz
  2014-09-26 12:21           ` Francis Moreau
  0 siblings, 2 replies; 23+ messages in thread
From: NeilBrown @ 2014-09-26 10:44 UTC (permalink / raw)
  To: Francis Moreau; +Cc: linux-raid, sebastian.riemer

[-- Attachment #1: Type: text/plain, Size: 6784 bytes --]

On Fri, 26 Sep 2014 12:23:27 +0200 Francis Moreau <francis.moro@gmail.com>
wrote:

> Hello Neil,
> 
> On 09/26/2014 02:33 AM, NeilBrown wrote:
> > On Thu, 25 Sep 2014 18:12:07 +0200 Francis Moreau <francis.moro@gmail.com>
> > wrote:
> [...]
> >> I tried to find out what could have opened the md device by using fuser,
> >> but fuser reports no users.
> > 
> > It is probably a transient open/close.
> > 
> 
> If it's open/close wouldn't the 'close' part make the device disapear ?

No. It's ... complicated.

> 
> >>
> >> I took a look to the udev rules which are the one shipped by mdadm 3.3.2
> >> but nothing keep the device opened during the remove event.
> >>
> >> Could you give me some hints here to debug this ?
> > 
> > Modify md_open in drivers/md/md.c to add
> >    printk("Opened by %s\n", current->comm);
> > 
> > and build a new kernel.  That will tell you the name of the process which
> > opened the device.
> > 
> 
> I did that I also added a trace in md_release() but strangely no trace
> were outputed from there.

Without seeing your patch I can't guess what it happening, but I am *certain*
that md_release() would get called providing md_open didn't return an error.

It might be helpful to print out the pid and the md device number too
 task_tgid_vnr(current)
will give you the pid.
  mdname(mddev)
give the name of the device.

Probably there is a 'change' event happening just before the 'remove' event,
and udev runs "mdadm" on the 'change' event, and that ends up happening after
the device has been removed.

Is this really a problem?  Can't you just ignore it and pretend it isn't
there?

NeilBrown

> 
> Here's the details of what I did:
> 
> --- %< ---
> [root@localhost ~]# cat /proc/mdstat
> Personalities : [raid1]
> md125 : active raid1 vdc1[1] vdb1[0]
>       65472 blocks super 1.0 [2/2] [UU]
> 
> md126 : active raid1 vdc2[1] vdb2[0]
>       209536 blocks super 1.2 [2/2] [UU]
> 
> md127 : active raid1 vdb3[0] vdc3[1]
>       1819584 blocks super 1.2 [2/2] [UU]
> 
> unused devices: <none>
> 
> [root@localhost ~]# mdadm --stop --scan
> 
> [root@localhost ~]# dmesg | grep md_
> [    1.474207] md_open(): opened by mdadm
> [    1.475316] md_open(): opened by mdadm
> [    1.492880] md_open(): opened by mdadm
> [    1.493201] md_open(): opened by mdadm
> [    1.494690] md_open(): opened by mdadm
> [    1.499369] md_open(): opened by mdadm
> [    1.533566] md_open(): opened by mdadm
> [    1.533697] md_open(): opened by mdadm
> [    1.554419] md_open(): opened by mdadm
> [    1.574451] md_open(): opened by mdadm
> [    1.574666] md_open(): opened by mdadm
> [    1.574877] md_open(): opened by mdadm
> [    1.576822] md_open(): opened by systemd-udevd
> [    1.576895] md_open(): opened by systemd-udevd
> [    1.577029] md_open(): opened by systemd-udevd
> [    1.581850] md_open(): opened by mdadm
> [    1.584054] md_open(): opened by systemd-udevd
> [    1.584770] md_open(): opened by mdadm
> [    1.585175] md_open(): opened by mdadm
> [    1.586328] md_open(): opened by systemd-udevd
> [    1.586933] md_open(): opened by systemd-udevd
> [    1.651265] md_open(): opened by mdadm
> [    1.651320] md_open(): opened by mdadm
> [    1.651364] md_open(): opened by mdadm
> [    1.651437] md_open(): opened by mdadm
> [    1.652376] md_open(): opened by mdadm
> [    1.652452] md_open(): opened by mdadm
> [   33.486704] md_open(): opened by mdadm
> [   33.489259] md_open(): opened by mdadm
> [   33.491000] md_open(): opened by mdadm
> [   33.491767] md_open(): opened by systemd-udevd
> [   33.692255] md_open(): opened by mdadm
> [   33.692288] md_open(): opened by mdadm
> [   33.692606] md_open(): opened by mdadm
> [   33.692858] md_open(): opened by mdadm
> [   33.692942] md_open(): opened by mdadm
> [   33.693237] md_open(): opened by mdadm
> [   33.694254] md_open(): opened by mdadm
> [   33.694275] md_open(): opened by mdadm
> [   33.694373] md_open(): opened by mdadm
> [   33.695558] md_open(): opened by mdadm
> [   33.695679] md_open(): opened by mdadm
> [   33.695855] md_open(): opened by mdadm
> [   33.695894] md_open(): opened by mdadm
> 
> [root@localhost ~]# ls /dev/md125
> /dev/md125
> 
> [root@localhost ~]# fuser /dev/md125
> 
> [root@localhost ~]# ps aux | grep "mdadm\|systemd-udevd"
> root       366  0.0  0.1  38172  1696 ?        Ss   06:04   0:00
> /usr/lib/systemd/systemd-udevd
> root       465  0.0  0.0   4964   924 ?        Ss   06:04   0:00
> /sbin/mdadm --monitor --scan --daemonise --syslog
> --pid-file=/run/mdadm/mdadm.pid
> 
> [root@localhost ~]# ls -l /proc/366/fd/
> total 0
> lrwx------ 1 root root 64 Sep 26 06:04 0 -> /dev/null
> lrwx------ 1 root root 64 Sep 26 06:04 1 -> /dev/null
> lrwx------ 1 root root 64 Sep 26 06:04 10 -> socket:[8665]
> lr-x------ 1 root root 64 Sep 26 06:04 11 -> /etc/udev/hwdb.bin
> lrwx------ 1 root root 64 Sep 26 06:04 12 -> anon_inode:[eventpoll]
> lrwx------ 1 root root 64 Sep 26 06:04 2 -> /dev/null
> lrwx------ 1 root root 64 Sep 26 06:04 3 -> socket:[8144]
> lrwx------ 1 root root 64 Sep 26 06:04 4 -> socket:[8103]
> lrwx------ 1 root root 64 Sep 26 06:04 5 -> socket:[8660]
> lrwx------ 1 root root 64 Sep 26 06:04 6 -> /run/udev/queue.bin
> lr-x------ 1 root root 64 Sep 26 06:04 7 -> anon_inode:inotify
> lrwx------ 1 root root 64 Sep 26 06:04 8 -> anon_inode:[signalfd]
> lrwx------ 1 root root 64 Sep 26 06:04 9 -> socket:[8664]
> 
> [root@localhost ~]# ls -l /proc/465/fd/
> total 0
> lrwx------ 1 root root 64 Sep 26 06:04 0 -> /dev/null
> lrwx------ 1 root root 64 Sep 26 06:04 1 -> /dev/null
> lrwx------ 1 root root 64 Sep 26 06:04 2 -> /dev/null
> lr-x------ 1 root root 64 Sep 26 06:06 4 -> /proc/mdstat
> lrwx------ 1 root root 64 Sep 26 06:06 5 -> socket:[10038]
> 
> [root@localhost ~]# cat /proc/mdstat
> Personalities : [raid1]
> unused devices: <none>
> 
> [root@localhost ~]# ls /sys/block/md125/md/
> array_size  array_state  bitmap/  chunk_size  component_size  layout
> level  max_read_errors  metadata_version  new_dev  raid_disks
> reshape_direction  reshape_position  resync_start  safe_mode_delay
> 
> --- >% ---
> 
> So in my understanding, only mdadm and udevd are opening the MD devices
> and mdamd was the last to open the device. For some unknown reasons,
> md_release() is never called.
> 
> This happens with:
> 
>  - kernel 3.14.19
>  - mdadm 3.3.2
>  - systemd 208
> 
> Can you see something wrong here ?
> 
> Thanks.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: /sys/block/md126 still exists even after stopping the array
  2014-09-26 10:44         ` NeilBrown
@ 2014-09-26 11:23           ` Artur Paszkiewicz
  2014-09-29  4:19             ` NeilBrown
  2014-09-26 12:21           ` Francis Moreau
  1 sibling, 1 reply; 23+ messages in thread
From: Artur Paszkiewicz @ 2014-09-26 11:23 UTC (permalink / raw)
  To: NeilBrown, Francis Moreau; +Cc: linux-raid, sebastian.riemer

On 09/26/2014 12:44 PM, NeilBrown wrote:
> On Fri, 26 Sep 2014 12:23:27 +0200 Francis Moreau <francis.moro@gmail.com>
> wrote:
> 
>> Hello Neil,
>>
>> On 09/26/2014 02:33 AM, NeilBrown wrote:
>>> On Thu, 25 Sep 2014 18:12:07 +0200 Francis Moreau <francis.moro@gmail.com>
>>> wrote:
>> [...]
>>>> I tried to find out what could have opened the md device by using fuser,
>>>> but fuser reports no users.
>>>
>>> It is probably a transient open/close.
>>>
>>
>> If it's open/close wouldn't the 'close' part make the device disapear ?
> 
> No. It's ... complicated.
> 
>>
>>>>
>>>> I took a look to the udev rules which are the one shipped by mdadm 3.3.2
>>>> but nothing keep the device opened during the remove event.
>>>>
>>>> Could you give me some hints here to debug this ?
>>>
>>> Modify md_open in drivers/md/md.c to add
>>>    printk("Opened by %s\n", current->comm);
>>>
>>> and build a new kernel.  That will tell you the name of the process which
>>> opened the device.
>>>
>>
>> I did that I also added a trace in md_release() but strangely no trace
>> were outputed from there.
> 
> Without seeing your patch I can't guess what it happening, but I am *certain*
> that md_release() would get called providing md_open didn't return an error.
> 
> It might be helpful to print out the pid and the md device number too
>  task_tgid_vnr(current)
> will give you the pid.
>   mdname(mddev)
> give the name of the device.
> 
> Probably there is a 'change' event happening just before the 'remove' event,
> and udev runs "mdadm" on the 'change' event, and that ends up happening after
> the device has been removed.
> 
> Is this really a problem?  Can't you just ignore it and pretend it isn't
> there?
> 
> NeilBrown
> 
>>
>> Here's the details of what I did:
>>
>> --- %< ---
>> [root@localhost ~]# cat /proc/mdstat
>> Personalities : [raid1]
>> md125 : active raid1 vdc1[1] vdb1[0]
>>       65472 blocks super 1.0 [2/2] [UU]
>>
>> md126 : active raid1 vdc2[1] vdb2[0]
>>       209536 blocks super 1.2 [2/2] [UU]
>>
>> md127 : active raid1 vdb3[0] vdc3[1]
>>       1819584 blocks super 1.2 [2/2] [UU]
>>
>> unused devices: <none>
>>
>> [root@localhost ~]# mdadm --stop --scan
>>
>> [root@localhost ~]# dmesg | grep md_
>> [    1.474207] md_open(): opened by mdadm
>> [    1.475316] md_open(): opened by mdadm
>> [    1.492880] md_open(): opened by mdadm
>> [    1.493201] md_open(): opened by mdadm
>> [    1.494690] md_open(): opened by mdadm
>> [    1.499369] md_open(): opened by mdadm
>> [    1.533566] md_open(): opened by mdadm
>> [    1.533697] md_open(): opened by mdadm
>> [    1.554419] md_open(): opened by mdadm
>> [    1.574451] md_open(): opened by mdadm
>> [    1.574666] md_open(): opened by mdadm
>> [    1.574877] md_open(): opened by mdadm
>> [    1.576822] md_open(): opened by systemd-udevd
>> [    1.576895] md_open(): opened by systemd-udevd
>> [    1.577029] md_open(): opened by systemd-udevd
>> [    1.581850] md_open(): opened by mdadm
>> [    1.584054] md_open(): opened by systemd-udevd
>> [    1.584770] md_open(): opened by mdadm
>> [    1.585175] md_open(): opened by mdadm
>> [    1.586328] md_open(): opened by systemd-udevd
>> [    1.586933] md_open(): opened by systemd-udevd
>> [    1.651265] md_open(): opened by mdadm
>> [    1.651320] md_open(): opened by mdadm
>> [    1.651364] md_open(): opened by mdadm
>> [    1.651437] md_open(): opened by mdadm
>> [    1.652376] md_open(): opened by mdadm
>> [    1.652452] md_open(): opened by mdadm
>> [   33.486704] md_open(): opened by mdadm
>> [   33.489259] md_open(): opened by mdadm
>> [   33.491000] md_open(): opened by mdadm
>> [   33.491767] md_open(): opened by systemd-udevd
>> [   33.692255] md_open(): opened by mdadm
>> [   33.692288] md_open(): opened by mdadm
>> [   33.692606] md_open(): opened by mdadm
>> [   33.692858] md_open(): opened by mdadm
>> [   33.692942] md_open(): opened by mdadm
>> [   33.693237] md_open(): opened by mdadm
>> [   33.694254] md_open(): opened by mdadm
>> [   33.694275] md_open(): opened by mdadm
>> [   33.694373] md_open(): opened by mdadm
>> [   33.695558] md_open(): opened by mdadm
>> [   33.695679] md_open(): opened by mdadm
>> [   33.695855] md_open(): opened by mdadm
>> [   33.695894] md_open(): opened by mdadm
>>
>> [root@localhost ~]# ls /dev/md125
>> /dev/md125
>>
>> [root@localhost ~]# fuser /dev/md125
>>
>> [root@localhost ~]# ps aux | grep "mdadm\|systemd-udevd"
>> root       366  0.0  0.1  38172  1696 ?        Ss   06:04   0:00
>> /usr/lib/systemd/systemd-udevd
>> root       465  0.0  0.0   4964   924 ?        Ss   06:04   0:00
>> /sbin/mdadm --monitor --scan --daemonise --syslog
>> --pid-file=/run/mdadm/mdadm.pid
>>
>> [root@localhost ~]# ls -l /proc/366/fd/
>> total 0
>> lrwx------ 1 root root 64 Sep 26 06:04 0 -> /dev/null
>> lrwx------ 1 root root 64 Sep 26 06:04 1 -> /dev/null
>> lrwx------ 1 root root 64 Sep 26 06:04 10 -> socket:[8665]
>> lr-x------ 1 root root 64 Sep 26 06:04 11 -> /etc/udev/hwdb.bin
>> lrwx------ 1 root root 64 Sep 26 06:04 12 -> anon_inode:[eventpoll]
>> lrwx------ 1 root root 64 Sep 26 06:04 2 -> /dev/null
>> lrwx------ 1 root root 64 Sep 26 06:04 3 -> socket:[8144]
>> lrwx------ 1 root root 64 Sep 26 06:04 4 -> socket:[8103]
>> lrwx------ 1 root root 64 Sep 26 06:04 5 -> socket:[8660]
>> lrwx------ 1 root root 64 Sep 26 06:04 6 -> /run/udev/queue.bin
>> lr-x------ 1 root root 64 Sep 26 06:04 7 -> anon_inode:inotify
>> lrwx------ 1 root root 64 Sep 26 06:04 8 -> anon_inode:[signalfd]
>> lrwx------ 1 root root 64 Sep 26 06:04 9 -> socket:[8664]
>>
>> [root@localhost ~]# ls -l /proc/465/fd/
>> total 0
>> lrwx------ 1 root root 64 Sep 26 06:04 0 -> /dev/null
>> lrwx------ 1 root root 64 Sep 26 06:04 1 -> /dev/null
>> lrwx------ 1 root root 64 Sep 26 06:04 2 -> /dev/null
>> lr-x------ 1 root root 64 Sep 26 06:06 4 -> /proc/mdstat
>> lrwx------ 1 root root 64 Sep 26 06:06 5 -> socket:[10038]
>>
>> [root@localhost ~]# cat /proc/mdstat
>> Personalities : [raid1]
>> unused devices: <none>
>>
>> [root@localhost ~]# ls /sys/block/md125/md/
>> array_size  array_state  bitmap/  chunk_size  component_size  layout
>> level  max_read_errors  metadata_version  new_dev  raid_disks
>> reshape_direction  reshape_position  resync_start  safe_mode_delay
>>
>> --- >% ---
>>
>> So in my understanding, only mdadm and udevd are opening the MD devices
>> and mdamd was the last to open the device. For some unknown reasons,
>> md_release() is never called.
>>
>> This happens with:
>>
>>  - kernel 3.14.19
>>  - mdadm 3.3.2
>>  - systemd 208
>>
>> Can you see something wrong here ?
>>
>> Thanks.
>> --

Hi,

I have also been debugging this issue and I came up with this
fix/workaround. It works for me. Can you take a look a this?

Thanks,
Artur

From c547e39789cde93d4a7ea1d3f845d61b82e4f0ed Mon Sep 17 00:00:00 2001
From: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Date: Fri, 26 Sep 2014 12:20:46 +0200
Subject: [PATCH] md: avoid creating new devices for stopped arrays in
 md_open()

When an array is about to be destroyed, set mddev->gendisk->private_data
to NULL as it is no longer needed and check it in md_open(). If
bdev->bd_disk->private_data is NULL, then this indicates that the array
is stopped and return -ENODEV.

Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
---
 drivers/md/md.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 1294238..7109d48 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -449,6 +449,7 @@ static void mddev_put(struct mddev *mddev)
                bs = mddev->bio_set;
                mddev->bio_set = NULL;
                if (mddev->gendisk) {
+                       mddev->gendisk->private_data = NULL;
                        /* We did a probe so need to clean up.  Call
                         * queue_work inside the spinlock so that
                         * flush_workqueue() after mddev_find will
@@ -6693,9 +6694,14 @@ static int md_open(struct block_device *bdev, fmode_t mode)
         * Succeed if we can lock the mddev, which confirms that
         * it isn't being stopped right now.
         */
-       struct mddev *mddev = mddev_find(bdev->bd_dev);
+       struct mddev *mddev;
        int err;

+       if (!bdev->bd_disk->private_data)
+               return -ENODEV;
+
+       mddev = mddev_find(bdev->bd_dev);
+
        if (!mddev)
                return -ENODEV;

-- 
1.8.4.5

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: /sys/block/md126 still exists even after stopping the array
  2014-09-26 10:44         ` NeilBrown
  2014-09-26 11:23           ` Artur Paszkiewicz
@ 2014-09-26 12:21           ` Francis Moreau
  2014-09-26 12:50             ` Francis Moreau
  2014-09-29  4:37             ` NeilBrown
  1 sibling, 2 replies; 23+ messages in thread
From: Francis Moreau @ 2014-09-26 12:21 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid, sebastian.riemer

On 09/26/2014 12:44 PM, NeilBrown wrote:
> On Fri, 26 Sep 2014 12:23:27 +0200 Francis Moreau <francis.moro@gmail.com>
> wrote:
> 
>> Hello Neil,
>>
>> On 09/26/2014 02:33 AM, NeilBrown wrote:
>>> On Thu, 25 Sep 2014 18:12:07 +0200 Francis Moreau <francis.moro@gmail.com>
>>> wrote:
>> [...]
>>>> I tried to find out what could have opened the md device by using fuser,
>>>> but fuser reports no users.
>>>
>>> It is probably a transient open/close.
>>>
>>
>> If it's open/close wouldn't the 'close' part make the device disapear ?
> 
> No. It's ... complicated.
> 
>>
>>>>
>>>> I took a look to the udev rules which are the one shipped by mdadm 3.3.2
>>>> but nothing keep the device opened during the remove event.
>>>>
>>>> Could you give me some hints here to debug this ?
>>>
>>> Modify md_open in drivers/md/md.c to add
>>>    printk("Opened by %s\n", current->comm);
>>>
>>> and build a new kernel.  That will tell you the name of the process which
>>> opened the device.
>>>
>>
>> I did that I also added a trace in md_release() but strangely no trace
>> were outputed from there.
> 
> Without seeing your patch I can't guess what it happening, but I am *certain*
> that md_release() would get called providing md_open didn't return an error.

Here's the patch:

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 73aedcb..08ead8d 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -6703,6 +6703,8 @@ static int md_open(struct block_device *bdev,
fmode_t mode)
 	struct mddev *mddev = mddev_find(bdev->bd_dev);
 	int err;

+	printk("md_open(): opened by %s\n", current->comm);
+
 	if (!mddev)
 		return -ENODEV;

@@ -6735,6 +6737,8 @@ static void md_release(struct gendisk *disk,
fmode_t mode)
 {
  	struct mddev *mddev = disk->private_data;

+	printk("md_release(): released by %s\n", current->comm);
+
 	BUG_ON(!mddev);
 	atomic_dec(&mddev->openers);
 	mddev_put(mddev);

> 
> It might be helpful to print out the pid and the md device number too
>  task_tgid_vnr(current)
> will give you the pid.
>   mdname(mddev)
> give the name of the device.
> 

Here's the new trace, this time md_release() was called, so I probably
did something wrong the first time, sorry for that.

[    1.470744] md_open(): md127 opened by mdadm [388]
[    1.485437] md_release(): md127 released by mdadm [388]
[    1.486888] md_open(): md126 opened by mdadm [381]
[    1.487468] md_release(): md126 released by mdadm [381]
[    1.488646] md_open(): md125 opened by mdadm [383]
[    1.489074] md_release(): md125 released by mdadm [383]
[    1.490555] md_open(): md127 opened by mdadm [385]
[    1.512556] md_release(): md127 released by mdadm [385]
[    1.512582] md_open(): md127 opened by mdadm [385]
[    1.512682] md_open(): md126 opened by mdadm [384]
[    1.553414] md_release(): md126 released by mdadm [384]
[    1.553442] md_open(): md126 opened by mdadm [384]
[    1.553549] md_open(): md125 opened by mdadm [382]
[    1.573263] md_release(): md125 released by mdadm [382]
[    1.573288] md_open(): md125 opened by mdadm [382]
[    1.601034] md_open(): md125 opened by mdadm [459]
[    1.601041] md_release(): md125 released by mdadm [459]
[    1.601065] md_open(): md126 opened by mdadm [459]
[    1.601067] md_release(): md126 released by mdadm [459]
[    1.601090] md_open(): md127 opened by mdadm [459]
[    1.601092] md_release(): md127 released by mdadm [459]
[    1.601130] md_open(): md127 opened by mdadm [459]
[    1.601220] md_release(): md127 released by mdadm [459]
[    1.601633] md_open(): md126 opened by mdadm [459]
[    1.601661] md_release(): md126 released by mdadm [459]
[    1.601673] md_open(): md125 opened by mdadm [459]
[    1.601695] md_release(): md125 released by mdadm [459]
[    1.606127] md_open(): md125 opened by mdadm [454]
[    1.608682] md_open(): md126 opened by mdadm [453]
[    1.609514] md_open(): md127 opened by mdadm [448]
[    1.622512] md_release(): md126 released by mdadm [453]
[    1.623028] md_release(): md127 released by mdadm [448]
[    1.625288] md_open(): md126 opened by systemd-udevd [363]
[    1.625391] md_release(): md125 released by mdadm [454]
[    1.625619] md_open(): md127 opened by systemd-udevd [368]
[    1.625737] md_open(): md125 opened by systemd-udevd [366]
[    1.637137] md_release(): md125 released by systemd-udevd [366]
[    1.643982] md_open(): md125 opened by mdadm [476]
[    1.644071] md_release(): md127 released by systemd-udevd [368]
[    1.647787] md_release(): md125 released by mdadm [382]
[    1.648171] md_release(): md126 released by systemd-udevd [363]
[    1.651629] md_open(): md126 opened by mdadm [479]
[    1.656666] md_open(): md127 opened by mdadm [480]
[    1.657771] md_release(): md125 released by mdadm [476]
[    1.659312] md_open(): md125 opened by systemd-udevd [365]
[    1.663193] md_release(): md127 released by mdadm [385]
[    1.673669] md_release(): md125 released by systemd-udevd [365]
[    1.685527] md_release(): md127 released by mdadm [480]
[    1.685599] md_release(): md126 released by mdadm [479]
[    1.686058] md_open(): md126 opened by systemd-udevd [366]
[    1.686282] md_release(): md126 released by systemd-udevd [366]
[    1.691024] md_open(): md127 opened by systemd-udevd [363]
[    1.695415] md_release(): md126 released by mdadm [384]
[    1.707163] md_release(): md127 released by systemd-udevd [363]

>>> mdadm --stop --scan <<<

[   89.975162] md_open(): md125 opened by mdadm [930]
[   89.975305] md_release(): md125 released by mdadm [930]
[   89.977434] md_open(): md125 opened by mdadm [932]
[   89.978813] md_open(): md125 opened by mdadm [930]
[   89.979365] md_release(): md125 released by mdadm [932]
[   89.979693] md_open(): md125 opened by systemd-udevd [931]
[   89.985790] md_release(): md125 released by systemd-udevd [931]
[   90.179911] md_release(): md125 released by mdadm [930]
[   90.180168] md_open(): md127 opened by mdadm [459]
[   90.180187] md_release(): md127 released by mdadm [459]
[   90.180199] md_open(): md126 opened by mdadm [459]
[   90.180205] md_release(): md126 released by mdadm [459]
[   90.180556] md_open(): md126 opened by mdadm [930]
[   90.180653] md_release(): md126 released by mdadm [930]
[   90.180690] md_open(): md126 opened by mdadm [930]
[   90.180758] md_open(): mdX opened by mdadm [459]
[   90.180995] md_open(): md125 opened by mdadm [459]
[   90.181056] md_release(): md125 released by mdadm [459]
[   90.182717] md_open(): md127 opened by mdadm [459]
[   90.182725] md_release(): md127 released by mdadm [459]
[   90.182732] md_open(): md126 opened by mdadm [459]
[   90.182761] md_release(): md126 released by mdadm [459]
[   90.182770] md_open(): md125 opened by mdadm [459]
[   90.182775] md_release(): md125 released by mdadm [459]
[   90.182940] md_release(): md126 released by mdadm [930]
[   90.183167] md_open(): md127 opened by mdadm [930]
[   90.183257] md_release(): md127 released by mdadm [930]
[   90.183288] md_open(): md127 opened by mdadm [930]
[   90.183461] md_open(): md127 opened by mdadm [459]
[   90.183488] md_release(): md127 released by mdadm [459]
[   90.183499] md_open(): md125 opened by mdadm [459]
[   90.183505] md_release(): md125 released by mdadm [459]
[   90.183686] md_release(): md127 released by mdadm [930]


> Probably there is a 'change' event happening just before the 'remove' event,
> and udev runs "mdadm" on the 'change' event, and that ends up happening after
> the device has been removed.
> 
> Is this really a problem?  Can't you just ignore it and pretend it isn't
> there?

Well, if you list the block devices that the kernel detected in order to
operate on them, it could. I don't know exactly what would be the result
to use it but it could confuse some tools.

Is there a way to check that the 'ghost' device has been removed by
poking sysfs ?

Thanks

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: /sys/block/md126 still exists even after stopping the array
  2014-09-26 12:21           ` Francis Moreau
@ 2014-09-26 12:50             ` Francis Moreau
  2014-09-29  4:47               ` NeilBrown
  2014-09-29  4:37             ` NeilBrown
  1 sibling, 1 reply; 23+ messages in thread
From: Francis Moreau @ 2014-09-26 12:50 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid, sebastian.riemer

On 09/26/2014 02:21 PM, Francis Moreau wrote:
[...]

> 
>>>> mdadm --stop --scan <<<
> 
> [   89.975162] md_open(): md125 opened by mdadm [930]
> [   89.975305] md_release(): md125 released by mdadm [930]
> [   89.977434] md_open(): md125 opened by mdadm [932]
> [   89.978813] md_open(): md125 opened by mdadm [930]
> [   89.979365] md_release(): md125 released by mdadm [932]
> [   89.979693] md_open(): md125 opened by systemd-udevd [931]
> [   89.985790] md_release(): md125 released by systemd-udevd [931]
> [   90.179911] md_release(): md125 released by mdadm [930]
> [   90.180168] md_open(): md127 opened by mdadm [459]
> [   90.180187] md_release(): md127 released by mdadm [459]
> [   90.180199] md_open(): md126 opened by mdadm [459]
> [   90.180205] md_release(): md126 released by mdadm [459]
> [   90.180556] md_open(): md126 opened by mdadm [930]
> [   90.180653] md_release(): md126 released by mdadm [930]
> [   90.180690] md_open(): md126 opened by mdadm [930]
> [   90.180758] md_open(): mdX opened by mdadm [459]

What is this 'mdX' device that mdadm operates on ?

It also doesn't have a counterpart release() call.


> [   90.180995] md_open(): md125 opened by mdadm [459]
> [   90.181056] md_release(): md125 released by mdadm [459]
> [   90.182717] md_open(): md127 opened by mdadm [459]
> [   90.182725] md_release(): md127 released by mdadm [459]
> [   90.182732] md_open(): md126 opened by mdadm [459]
> [   90.182761] md_release(): md126 released by mdadm [459]
> [   90.182770] md_open(): md125 opened by mdadm [459]
> [   90.182775] md_release(): md125 released by mdadm [459]
> [   90.182940] md_release(): md126 released by mdadm [930]
> [   90.183167] md_open(): md127 opened by mdadm [930]
> [   90.183257] md_release(): md127 released by mdadm [930]
> [   90.183288] md_open(): md127 opened by mdadm [930]
> [   90.183461] md_open(): md127 opened by mdadm [459]
> [   90.183488] md_release(): md127 released by mdadm [459]
> [   90.183499] md_open(): md125 opened by mdadm [459]
> [   90.183505] md_release(): md125 released by mdadm [459]
> [   90.183686] md_release(): md127 released by mdadm [930]
> 
> 
>> Probably there is a 'change' event happening just before the 'remove' event,
>> and udev runs "mdadm" on the 'change' event, and that ends up happening after
>> the device has been removed.
>>
>> Is this really a problem?  Can't you just ignore it and pretend it isn't
>> there?
> 
> Well, if you list the block devices that the kernel detected in order to
> operate on them, it could. I don't know exactly what would be the result
> to use it but it could confuse some tools.
> 
> Is there a way to check that the 'ghost' device has been removed by
> poking sysfs ?
> 
> Thanks
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: /sys/block/md126 still exists even after stopping the array
  2014-09-26 11:23           ` Artur Paszkiewicz
@ 2014-09-29  4:19             ` NeilBrown
  0 siblings, 0 replies; 23+ messages in thread
From: NeilBrown @ 2014-09-29  4:19 UTC (permalink / raw)
  To: Artur Paszkiewicz; +Cc: Francis Moreau, linux-raid, sebastian.riemer

[-- Attachment #1: Type: text/plain, Size: 9567 bytes --]

On Fri, 26 Sep 2014 13:23:41 +0200 Artur Paszkiewicz
<artur.paszkiewicz@intel.com> wrote:

> On 09/26/2014 12:44 PM, NeilBrown wrote:
> > On Fri, 26 Sep 2014 12:23:27 +0200 Francis Moreau <francis.moro@gmail.com>
> > wrote:
> > 
> >> Hello Neil,
> >>
> >> On 09/26/2014 02:33 AM, NeilBrown wrote:
> >>> On Thu, 25 Sep 2014 18:12:07 +0200 Francis Moreau <francis.moro@gmail.com>
> >>> wrote:
> >> [...]
> >>>> I tried to find out what could have opened the md device by using fuser,
> >>>> but fuser reports no users.
> >>>
> >>> It is probably a transient open/close.
> >>>
> >>
> >> If it's open/close wouldn't the 'close' part make the device disapear ?
> > 
> > No. It's ... complicated.
> > 
> >>
> >>>>
> >>>> I took a look to the udev rules which are the one shipped by mdadm 3.3.2
> >>>> but nothing keep the device opened during the remove event.
> >>>>
> >>>> Could you give me some hints here to debug this ?
> >>>
> >>> Modify md_open in drivers/md/md.c to add
> >>>    printk("Opened by %s\n", current->comm);
> >>>
> >>> and build a new kernel.  That will tell you the name of the process which
> >>> opened the device.
> >>>
> >>
> >> I did that I also added a trace in md_release() but strangely no trace
> >> were outputed from there.
> > 
> > Without seeing your patch I can't guess what it happening, but I am *certain*
> > that md_release() would get called providing md_open didn't return an error.
> > 
> > It might be helpful to print out the pid and the md device number too
> >  task_tgid_vnr(current)
> > will give you the pid.
> >   mdname(mddev)
> > give the name of the device.
> > 
> > Probably there is a 'change' event happening just before the 'remove' event,
> > and udev runs "mdadm" on the 'change' event, and that ends up happening after
> > the device has been removed.
> > 
> > Is this really a problem?  Can't you just ignore it and pretend it isn't
> > there?
> > 
> > NeilBrown
> > 
> >>
> >> Here's the details of what I did:
> >>
> >> --- %< ---
> >> [root@localhost ~]# cat /proc/mdstat
> >> Personalities : [raid1]
> >> md125 : active raid1 vdc1[1] vdb1[0]
> >>       65472 blocks super 1.0 [2/2] [UU]
> >>
> >> md126 : active raid1 vdc2[1] vdb2[0]
> >>       209536 blocks super 1.2 [2/2] [UU]
> >>
> >> md127 : active raid1 vdb3[0] vdc3[1]
> >>       1819584 blocks super 1.2 [2/2] [UU]
> >>
> >> unused devices: <none>
> >>
> >> [root@localhost ~]# mdadm --stop --scan
> >>
> >> [root@localhost ~]# dmesg | grep md_
> >> [    1.474207] md_open(): opened by mdadm
> >> [    1.475316] md_open(): opened by mdadm
> >> [    1.492880] md_open(): opened by mdadm
> >> [    1.493201] md_open(): opened by mdadm
> >> [    1.494690] md_open(): opened by mdadm
> >> [    1.499369] md_open(): opened by mdadm
> >> [    1.533566] md_open(): opened by mdadm
> >> [    1.533697] md_open(): opened by mdadm
> >> [    1.554419] md_open(): opened by mdadm
> >> [    1.574451] md_open(): opened by mdadm
> >> [    1.574666] md_open(): opened by mdadm
> >> [    1.574877] md_open(): opened by mdadm
> >> [    1.576822] md_open(): opened by systemd-udevd
> >> [    1.576895] md_open(): opened by systemd-udevd
> >> [    1.577029] md_open(): opened by systemd-udevd
> >> [    1.581850] md_open(): opened by mdadm
> >> [    1.584054] md_open(): opened by systemd-udevd
> >> [    1.584770] md_open(): opened by mdadm
> >> [    1.585175] md_open(): opened by mdadm
> >> [    1.586328] md_open(): opened by systemd-udevd
> >> [    1.586933] md_open(): opened by systemd-udevd
> >> [    1.651265] md_open(): opened by mdadm
> >> [    1.651320] md_open(): opened by mdadm
> >> [    1.651364] md_open(): opened by mdadm
> >> [    1.651437] md_open(): opened by mdadm
> >> [    1.652376] md_open(): opened by mdadm
> >> [    1.652452] md_open(): opened by mdadm
> >> [   33.486704] md_open(): opened by mdadm
> >> [   33.489259] md_open(): opened by mdadm
> >> [   33.491000] md_open(): opened by mdadm
> >> [   33.491767] md_open(): opened by systemd-udevd
> >> [   33.692255] md_open(): opened by mdadm
> >> [   33.692288] md_open(): opened by mdadm
> >> [   33.692606] md_open(): opened by mdadm
> >> [   33.692858] md_open(): opened by mdadm
> >> [   33.692942] md_open(): opened by mdadm
> >> [   33.693237] md_open(): opened by mdadm
> >> [   33.694254] md_open(): opened by mdadm
> >> [   33.694275] md_open(): opened by mdadm
> >> [   33.694373] md_open(): opened by mdadm
> >> [   33.695558] md_open(): opened by mdadm
> >> [   33.695679] md_open(): opened by mdadm
> >> [   33.695855] md_open(): opened by mdadm
> >> [   33.695894] md_open(): opened by mdadm
> >>
> >> [root@localhost ~]# ls /dev/md125
> >> /dev/md125
> >>
> >> [root@localhost ~]# fuser /dev/md125
> >>
> >> [root@localhost ~]# ps aux | grep "mdadm\|systemd-udevd"
> >> root       366  0.0  0.1  38172  1696 ?        Ss   06:04   0:00
> >> /usr/lib/systemd/systemd-udevd
> >> root       465  0.0  0.0   4964   924 ?        Ss   06:04   0:00
> >> /sbin/mdadm --monitor --scan --daemonise --syslog
> >> --pid-file=/run/mdadm/mdadm.pid
> >>
> >> [root@localhost ~]# ls -l /proc/366/fd/
> >> total 0
> >> lrwx------ 1 root root 64 Sep 26 06:04 0 -> /dev/null
> >> lrwx------ 1 root root 64 Sep 26 06:04 1 -> /dev/null
> >> lrwx------ 1 root root 64 Sep 26 06:04 10 -> socket:[8665]
> >> lr-x------ 1 root root 64 Sep 26 06:04 11 -> /etc/udev/hwdb.bin
> >> lrwx------ 1 root root 64 Sep 26 06:04 12 -> anon_inode:[eventpoll]
> >> lrwx------ 1 root root 64 Sep 26 06:04 2 -> /dev/null
> >> lrwx------ 1 root root 64 Sep 26 06:04 3 -> socket:[8144]
> >> lrwx------ 1 root root 64 Sep 26 06:04 4 -> socket:[8103]
> >> lrwx------ 1 root root 64 Sep 26 06:04 5 -> socket:[8660]
> >> lrwx------ 1 root root 64 Sep 26 06:04 6 -> /run/udev/queue.bin
> >> lr-x------ 1 root root 64 Sep 26 06:04 7 -> anon_inode:inotify
> >> lrwx------ 1 root root 64 Sep 26 06:04 8 -> anon_inode:[signalfd]
> >> lrwx------ 1 root root 64 Sep 26 06:04 9 -> socket:[8664]
> >>
> >> [root@localhost ~]# ls -l /proc/465/fd/
> >> total 0
> >> lrwx------ 1 root root 64 Sep 26 06:04 0 -> /dev/null
> >> lrwx------ 1 root root 64 Sep 26 06:04 1 -> /dev/null
> >> lrwx------ 1 root root 64 Sep 26 06:04 2 -> /dev/null
> >> lr-x------ 1 root root 64 Sep 26 06:06 4 -> /proc/mdstat
> >> lrwx------ 1 root root 64 Sep 26 06:06 5 -> socket:[10038]
> >>
> >> [root@localhost ~]# cat /proc/mdstat
> >> Personalities : [raid1]
> >> unused devices: <none>
> >>
> >> [root@localhost ~]# ls /sys/block/md125/md/
> >> array_size  array_state  bitmap/  chunk_size  component_size  layout
> >> level  max_read_errors  metadata_version  new_dev  raid_disks
> >> reshape_direction  reshape_position  resync_start  safe_mode_delay
> >>
> >> --- >% ---
> >>
> >> So in my understanding, only mdadm and udevd are opening the MD devices
> >> and mdamd was the last to open the device. For some unknown reasons,
> >> md_release() is never called.
> >>
> >> This happens with:
> >>
> >>  - kernel 3.14.19
> >>  - mdadm 3.3.2
> >>  - systemd 208
> >>
> >> Can you see something wrong here ?
> >>
> >> Thanks.
> >> --
> 
> Hi,
> 
> I have also been debugging this issue and I came up with this
> fix/workaround. It works for me. Can you take a look a this?
> 
> Thanks,
> Artur
> 
> >From c547e39789cde93d4a7ea1d3f845d61b82e4f0ed Mon Sep 17 00:00:00 2001
> From: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
> Date: Fri, 26 Sep 2014 12:20:46 +0200
> Subject: [PATCH] md: avoid creating new devices for stopped arrays in
>  md_open()
> 
> When an array is about to be destroyed, set mddev->gendisk->private_data
> to NULL as it is no longer needed and check it in md_open(). If
> bdev->bd_disk->private_data is NULL, then this indicates that the array
> is stopped and return -ENODEV.
> 
> Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
> ---
>  drivers/md/md.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 1294238..7109d48 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -449,6 +449,7 @@ static void mddev_put(struct mddev *mddev)
>                 bs = mddev->bio_set;
>                 mddev->bio_set = NULL;
>                 if (mddev->gendisk) {
> +                       mddev->gendisk->private_data = NULL;
>                         /* We did a probe so need to clean up.  Call
>                          * queue_work inside the spinlock so that
>                          * flush_workqueue() after mddev_find will
> @@ -6693,9 +6694,14 @@ static int md_open(struct block_device *bdev, fmode_t mode)
>          * Succeed if we can lock the mddev, which confirms that
>          * it isn't being stopped right now.
>          */
> -       struct mddev *mddev = mddev_find(bdev->bd_dev);
> +       struct mddev *mddev;
>         int err;
> 
> +       if (!bdev->bd_disk->private_data)
> +               return -ENODEV;
> +
> +       mddev = mddev_find(bdev->bd_dev);
> +
>         if (!mddev)
>                 return -ENODEV;
> 

Thanks, but I don't think this is a complete fix.
It creates a small window after an array is stopped during which an attempt
to open the device will fail.  Once mddev_delayed_delete() completes, the
device can be opened again.
So it might occasionally fix the symptom, but it is very dependant on timing
and won't always work.

NeilBrown


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: /sys/block/md126 still exists even after stopping the array
  2014-09-26 12:21           ` Francis Moreau
  2014-09-26 12:50             ` Francis Moreau
@ 2014-09-29  4:37             ` NeilBrown
  2014-09-29  8:45               ` Francis Moreau
  1 sibling, 1 reply; 23+ messages in thread
From: NeilBrown @ 2014-09-29  4:37 UTC (permalink / raw)
  To: Francis Moreau; +Cc: linux-raid, sebastian.riemer

[-- Attachment #1: Type: text/plain, Size: 8866 bytes --]

On Fri, 26 Sep 2014 14:21:04 +0200 Francis Moreau <francis.moro@gmail.com>
wrote:

> On 09/26/2014 12:44 PM, NeilBrown wrote:
> > On Fri, 26 Sep 2014 12:23:27 +0200 Francis Moreau <francis.moro@gmail.com>
> > wrote:
> > 
> >> Hello Neil,
> >>
> >> On 09/26/2014 02:33 AM, NeilBrown wrote:
> >>> On Thu, 25 Sep 2014 18:12:07 +0200 Francis Moreau <francis.moro@gmail.com>
> >>> wrote:
> >> [...]
> >>>> I tried to find out what could have opened the md device by using fuser,
> >>>> but fuser reports no users.
> >>>
> >>> It is probably a transient open/close.
> >>>
> >>
> >> If it's open/close wouldn't the 'close' part make the device disapear ?
> > 
> > No. It's ... complicated.
> > 
> >>
> >>>>
> >>>> I took a look to the udev rules which are the one shipped by mdadm 3.3.2
> >>>> but nothing keep the device opened during the remove event.
> >>>>
> >>>> Could you give me some hints here to debug this ?
> >>>
> >>> Modify md_open in drivers/md/md.c to add
> >>>    printk("Opened by %s\n", current->comm);
> >>>
> >>> and build a new kernel.  That will tell you the name of the process which
> >>> opened the device.
> >>>
> >>
> >> I did that I also added a trace in md_release() but strangely no trace
> >> were outputed from there.
> > 
> > Without seeing your patch I can't guess what it happening, but I am *certain*
> > that md_release() would get called providing md_open didn't return an error.
> 
> Here's the patch:
> 
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 73aedcb..08ead8d 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -6703,6 +6703,8 @@ static int md_open(struct block_device *bdev,
> fmode_t mode)
>  	struct mddev *mddev = mddev_find(bdev->bd_dev);
>  	int err;
> 
> +	printk("md_open(): opened by %s\n", current->comm);
> +
>  	if (!mddev)
>  		return -ENODEV;
> 
> @@ -6735,6 +6737,8 @@ static void md_release(struct gendisk *disk,
> fmode_t mode)
>  {
>   	struct mddev *mddev = disk->private_data;
> 
> +	printk("md_release(): released by %s\n", current->comm);
> +
>  	BUG_ON(!mddev);
>  	atomic_dec(&mddev->openers);
>  	mddev_put(mddev);
> 
> > 
> > It might be helpful to print out the pid and the md device number too
> >  task_tgid_vnr(current)
> > will give you the pid.
> >   mdname(mddev)
> > give the name of the device.
> > 
> 
> Here's the new trace, this time md_release() was called, so I probably
> did something wrong the first time, sorry for that.
> 
> [    1.470744] md_open(): md127 opened by mdadm [388]
> [    1.485437] md_release(): md127 released by mdadm [388]
> [    1.486888] md_open(): md126 opened by mdadm [381]
> [    1.487468] md_release(): md126 released by mdadm [381]
> [    1.488646] md_open(): md125 opened by mdadm [383]
> [    1.489074] md_release(): md125 released by mdadm [383]
> [    1.490555] md_open(): md127 opened by mdadm [385]
> [    1.512556] md_release(): md127 released by mdadm [385]
> [    1.512582] md_open(): md127 opened by mdadm [385]
> [    1.512682] md_open(): md126 opened by mdadm [384]
> [    1.553414] md_release(): md126 released by mdadm [384]
> [    1.553442] md_open(): md126 opened by mdadm [384]
> [    1.553549] md_open(): md125 opened by mdadm [382]
> [    1.573263] md_release(): md125 released by mdadm [382]
> [    1.573288] md_open(): md125 opened by mdadm [382]
> [    1.601034] md_open(): md125 opened by mdadm [459]
> [    1.601041] md_release(): md125 released by mdadm [459]
> [    1.601065] md_open(): md126 opened by mdadm [459]
> [    1.601067] md_release(): md126 released by mdadm [459]
> [    1.601090] md_open(): md127 opened by mdadm [459]
> [    1.601092] md_release(): md127 released by mdadm [459]
> [    1.601130] md_open(): md127 opened by mdadm [459]
> [    1.601220] md_release(): md127 released by mdadm [459]
> [    1.601633] md_open(): md126 opened by mdadm [459]
> [    1.601661] md_release(): md126 released by mdadm [459]
> [    1.601673] md_open(): md125 opened by mdadm [459]
> [    1.601695] md_release(): md125 released by mdadm [459]
> [    1.606127] md_open(): md125 opened by mdadm [454]
> [    1.608682] md_open(): md126 opened by mdadm [453]
> [    1.609514] md_open(): md127 opened by mdadm [448]
> [    1.622512] md_release(): md126 released by mdadm [453]
> [    1.623028] md_release(): md127 released by mdadm [448]
> [    1.625288] md_open(): md126 opened by systemd-udevd [363]
> [    1.625391] md_release(): md125 released by mdadm [454]
> [    1.625619] md_open(): md127 opened by systemd-udevd [368]
> [    1.625737] md_open(): md125 opened by systemd-udevd [366]
> [    1.637137] md_release(): md125 released by systemd-udevd [366]
> [    1.643982] md_open(): md125 opened by mdadm [476]
> [    1.644071] md_release(): md127 released by systemd-udevd [368]
> [    1.647787] md_release(): md125 released by mdadm [382]
> [    1.648171] md_release(): md126 released by systemd-udevd [363]
> [    1.651629] md_open(): md126 opened by mdadm [479]
> [    1.656666] md_open(): md127 opened by mdadm [480]
> [    1.657771] md_release(): md125 released by mdadm [476]
> [    1.659312] md_open(): md125 opened by systemd-udevd [365]
> [    1.663193] md_release(): md127 released by mdadm [385]
> [    1.673669] md_release(): md125 released by systemd-udevd [365]
> [    1.685527] md_release(): md127 released by mdadm [480]
> [    1.685599] md_release(): md126 released by mdadm [479]
> [    1.686058] md_open(): md126 opened by systemd-udevd [366]
> [    1.686282] md_release(): md126 released by systemd-udevd [366]
> [    1.691024] md_open(): md127 opened by systemd-udevd [363]
> [    1.695415] md_release(): md126 released by mdadm [384]
> [    1.707163] md_release(): md127 released by systemd-udevd [363]
> 
> >>> mdadm --stop --scan <<<
> 
> [   89.975162] md_open(): md125 opened by mdadm [930]
> [   89.975305] md_release(): md125 released by mdadm [930]
> [   89.977434] md_open(): md125 opened by mdadm [932]
> [   89.978813] md_open(): md125 opened by mdadm [930]
> [   89.979365] md_release(): md125 released by mdadm [932]
> [   89.979693] md_open(): md125 opened by systemd-udevd [931]
> [   89.985790] md_release(): md125 released by systemd-udevd [931]
> [   90.179911] md_release(): md125 released by mdadm [930]
> [   90.180168] md_open(): md127 opened by mdadm [459]
> [   90.180187] md_release(): md127 released by mdadm [459]
> [   90.180199] md_open(): md126 opened by mdadm [459]
> [   90.180205] md_release(): md126 released by mdadm [459]
> [   90.180556] md_open(): md126 opened by mdadm [930]
> [   90.180653] md_release(): md126 released by mdadm [930]
> [   90.180690] md_open(): md126 opened by mdadm [930]
> [   90.180758] md_open(): mdX opened by mdadm [459]
> [   90.180995] md_open(): md125 opened by mdadm [459]
> [   90.181056] md_release(): md125 released by mdadm [459]
> [   90.182717] md_open(): md127 opened by mdadm [459]
> [   90.182725] md_release(): md127 released by mdadm [459]
> [   90.182732] md_open(): md126 opened by mdadm [459]
> [   90.182761] md_release(): md126 released by mdadm [459]
> [   90.182770] md_open(): md125 opened by mdadm [459]
> [   90.182775] md_release(): md125 released by mdadm [459]
> [   90.182940] md_release(): md126 released by mdadm [930]
> [   90.183167] md_open(): md127 opened by mdadm [930]
> [   90.183257] md_release(): md127 released by mdadm [930]
> [   90.183288] md_open(): md127 opened by mdadm [930]
> [   90.183461] md_open(): md127 opened by mdadm [459]
> [   90.183488] md_release(): md127 released by mdadm [459]
> [   90.183499] md_open(): md125 opened by mdadm [459]
> [   90.183505] md_release(): md125 released by mdadm [459]
> [   90.183686] md_release(): md127 released by mdadm [930]

So what were pids 930 and 459?
One was presumably the "mdadm -Ss"  - probably 930.
Is 459 the "mdadm --monitor" ??  That might be useful hint.


> 
> 
> > Probably there is a 'change' event happening just before the 'remove' event,
> > and udev runs "mdadm" on the 'change' event, and that ends up happening after
> > the device has been removed.
> > 
> > Is this really a problem?  Can't you just ignore it and pretend it isn't
> > there?
> 
> Well, if you list the block devices that the kernel detected in order to
> operate on them, it could. I don't know exactly what would be the result
> to use it but it could confuse some tools.
> 
> Is there a way to check that the 'ghost' device has been removed by
> poking sysfs ?

If you look at /sys/block/md*/md/array_state, those that contain 'inactive'
or 'clear' might be 'ghosts', or might be in the process of being assembled.
If you write 'clear' to the same file they should disappear.... unless udev
does something to re-create them.

NeilBrown


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: /sys/block/md126 still exists even after stopping the array
  2014-09-26 12:50             ` Francis Moreau
@ 2014-09-29  4:47               ` NeilBrown
  0 siblings, 0 replies; 23+ messages in thread
From: NeilBrown @ 2014-09-29  4:47 UTC (permalink / raw)
  To: Francis Moreau; +Cc: linux-raid, sebastian.riemer

[-- Attachment #1: Type: text/plain, Size: 3413 bytes --]

On Fri, 26 Sep 2014 14:50:08 +0200 Francis Moreau <francis.moro@gmail.com>
wrote:

> On 09/26/2014 02:21 PM, Francis Moreau wrote:
> [...]
> 
> > 
> >>>> mdadm --stop --scan <<<
> > 
> > [   89.975162] md_open(): md125 opened by mdadm [930]
> > [   89.975305] md_release(): md125 released by mdadm [930]
> > [   89.977434] md_open(): md125 opened by mdadm [932]
> > [   89.978813] md_open(): md125 opened by mdadm [930]
> > [   89.979365] md_release(): md125 released by mdadm [932]
> > [   89.979693] md_open(): md125 opened by systemd-udevd [931]
> > [   89.985790] md_release(): md125 released by systemd-udevd [931]
> > [   90.179911] md_release(): md125 released by mdadm [930]
> > [   90.180168] md_open(): md127 opened by mdadm [459]
> > [   90.180187] md_release(): md127 released by mdadm [459]
> > [   90.180199] md_open(): md126 opened by mdadm [459]
> > [   90.180205] md_release(): md126 released by mdadm [459]
> > [   90.180556] md_open(): md126 opened by mdadm [930]
> > [   90.180653] md_release(): md126 released by mdadm [930]
> > [   90.180690] md_open(): md126 opened by mdadm [930]
> > [   90.180758] md_open(): mdX opened by mdadm [459]
> 
> What is this 'mdX' device that mdadm operates on ?
> 
> It also doesn't have a counterpart release() call.

'mdX' is the name used if mddev->gendisk is NULL.
In that case, md_open() will return an error (ERESTARTSYS).
As the 'open' failed, we wouldn't expect a matching close/release.

NeilBrown


> 
> 
> > [   90.180995] md_open(): md125 opened by mdadm [459]
> > [   90.181056] md_release(): md125 released by mdadm [459]
> > [   90.182717] md_open(): md127 opened by mdadm [459]
> > [   90.182725] md_release(): md127 released by mdadm [459]
> > [   90.182732] md_open(): md126 opened by mdadm [459]
> > [   90.182761] md_release(): md126 released by mdadm [459]
> > [   90.182770] md_open(): md125 opened by mdadm [459]
> > [   90.182775] md_release(): md125 released by mdadm [459]
> > [   90.182940] md_release(): md126 released by mdadm [930]
> > [   90.183167] md_open(): md127 opened by mdadm [930]
> > [   90.183257] md_release(): md127 released by mdadm [930]
> > [   90.183288] md_open(): md127 opened by mdadm [930]
> > [   90.183461] md_open(): md127 opened by mdadm [459]
> > [   90.183488] md_release(): md127 released by mdadm [459]
> > [   90.183499] md_open(): md125 opened by mdadm [459]
> > [   90.183505] md_release(): md125 released by mdadm [459]
> > [   90.183686] md_release(): md127 released by mdadm [930]
> > 
> > 
> >> Probably there is a 'change' event happening just before the 'remove' event,
> >> and udev runs "mdadm" on the 'change' event, and that ends up happening after
> >> the device has been removed.
> >>
> >> Is this really a problem?  Can't you just ignore it and pretend it isn't
> >> there?
> > 
> > Well, if you list the block devices that the kernel detected in order to
> > operate on them, it could. I don't know exactly what would be the result
> > to use it but it could confuse some tools.
> > 
> > Is there a way to check that the 'ghost' device has been removed by
> > poking sysfs ?
> > 
> > Thanks
> > 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: /sys/block/md126 still exists even after stopping the array
  2014-09-29  4:37             ` NeilBrown
@ 2014-09-29  8:45               ` Francis Moreau
  2014-09-29 21:56                 ` NeilBrown
  0 siblings, 1 reply; 23+ messages in thread
From: Francis Moreau @ 2014-09-29  8:45 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid, sebastian.riemer

Hello Neil,

On 09/29/2014 06:37 AM, NeilBrown wrote:
> On Fri, 26 Sep 2014 14:21:04 +0200 Francis Moreau <francis.moro@gmail.com>
> wrote:
> 
>> On 09/26/2014 12:44 PM, NeilBrown wrote:
>>> On Fri, 26 Sep 2014 12:23:27 +0200 Francis Moreau <francis.moro@gmail.com>
>>> wrote:
>>>
>>>> Hello Neil,
>>>>
>>>> On 09/26/2014 02:33 AM, NeilBrown wrote:
>>>>> On Thu, 25 Sep 2014 18:12:07 +0200 Francis Moreau <francis.moro@gmail.com>
>>>>> wrote:
>>>> [...]
>>>>>> I tried to find out what could have opened the md device by using fuser,
>>>>>> but fuser reports no users.
>>>>>
>>>>> It is probably a transient open/close.
>>>>>
>>>>
>>>> If it's open/close wouldn't the 'close' part make the device disapear ?
>>>
>>> No. It's ... complicated.
>>>
>>>>
>>>>>>
>>>>>> I took a look to the udev rules which are the one shipped by mdadm 3.3.2
>>>>>> but nothing keep the device opened during the remove event.
>>>>>>
>>>>>> Could you give me some hints here to debug this ?
>>>>>
>>>>> Modify md_open in drivers/md/md.c to add
>>>>>    printk("Opened by %s\n", current->comm);
>>>>>
>>>>> and build a new kernel.  That will tell you the name of the process which
>>>>> opened the device.
>>>>>
>>>>
>>>> I did that I also added a trace in md_release() but strangely no trace
>>>> were outputed from there.
>>>
>>> Without seeing your patch I can't guess what it happening, but I am *certain*
>>> that md_release() would get called providing md_open didn't return an error.
>>
>> Here's the patch:
>>
>> diff --git a/drivers/md/md.c b/drivers/md/md.c
>> index 73aedcb..08ead8d 100644
>> --- a/drivers/md/md.c
>> +++ b/drivers/md/md.c
>> @@ -6703,6 +6703,8 @@ static int md_open(struct block_device *bdev,
>> fmode_t mode)
>>  	struct mddev *mddev = mddev_find(bdev->bd_dev);
>>  	int err;
>>
>> +	printk("md_open(): opened by %s\n", current->comm);
>> +
>>  	if (!mddev)
>>  		return -ENODEV;
>>
>> @@ -6735,6 +6737,8 @@ static void md_release(struct gendisk *disk,
>> fmode_t mode)
>>  {
>>   	struct mddev *mddev = disk->private_data;
>>
>> +	printk("md_release(): released by %s\n", current->comm);
>> +
>>  	BUG_ON(!mddev);
>>  	atomic_dec(&mddev->openers);
>>  	mddev_put(mddev);
>>
>>>
>>> It might be helpful to print out the pid and the md device number too
>>>  task_tgid_vnr(current)
>>> will give you the pid.
>>>   mdname(mddev)
>>> give the name of the device.
>>>
>>
>> Here's the new trace, this time md_release() was called, so I probably
>> did something wrong the first time, sorry for that.
>>
>> [    1.470744] md_open(): md127 opened by mdadm [388]
>> [    1.485437] md_release(): md127 released by mdadm [388]
>> [    1.486888] md_open(): md126 opened by mdadm [381]
>> [    1.487468] md_release(): md126 released by mdadm [381]
>> [    1.488646] md_open(): md125 opened by mdadm [383]
>> [    1.489074] md_release(): md125 released by mdadm [383]
>> [    1.490555] md_open(): md127 opened by mdadm [385]
>> [    1.512556] md_release(): md127 released by mdadm [385]
>> [    1.512582] md_open(): md127 opened by mdadm [385]
>> [    1.512682] md_open(): md126 opened by mdadm [384]
>> [    1.553414] md_release(): md126 released by mdadm [384]
>> [    1.553442] md_open(): md126 opened by mdadm [384]
>> [    1.553549] md_open(): md125 opened by mdadm [382]
>> [    1.573263] md_release(): md125 released by mdadm [382]
>> [    1.573288] md_open(): md125 opened by mdadm [382]
>> [    1.601034] md_open(): md125 opened by mdadm [459]
>> [    1.601041] md_release(): md125 released by mdadm [459]
>> [    1.601065] md_open(): md126 opened by mdadm [459]
>> [    1.601067] md_release(): md126 released by mdadm [459]
>> [    1.601090] md_open(): md127 opened by mdadm [459]
>> [    1.601092] md_release(): md127 released by mdadm [459]
>> [    1.601130] md_open(): md127 opened by mdadm [459]
>> [    1.601220] md_release(): md127 released by mdadm [459]
>> [    1.601633] md_open(): md126 opened by mdadm [459]
>> [    1.601661] md_release(): md126 released by mdadm [459]
>> [    1.601673] md_open(): md125 opened by mdadm [459]
>> [    1.601695] md_release(): md125 released by mdadm [459]
>> [    1.606127] md_open(): md125 opened by mdadm [454]
>> [    1.608682] md_open(): md126 opened by mdadm [453]
>> [    1.609514] md_open(): md127 opened by mdadm [448]
>> [    1.622512] md_release(): md126 released by mdadm [453]
>> [    1.623028] md_release(): md127 released by mdadm [448]
>> [    1.625288] md_open(): md126 opened by systemd-udevd [363]
>> [    1.625391] md_release(): md125 released by mdadm [454]
>> [    1.625619] md_open(): md127 opened by systemd-udevd [368]
>> [    1.625737] md_open(): md125 opened by systemd-udevd [366]
>> [    1.637137] md_release(): md125 released by systemd-udevd [366]
>> [    1.643982] md_open(): md125 opened by mdadm [476]
>> [    1.644071] md_release(): md127 released by systemd-udevd [368]
>> [    1.647787] md_release(): md125 released by mdadm [382]
>> [    1.648171] md_release(): md126 released by systemd-udevd [363]
>> [    1.651629] md_open(): md126 opened by mdadm [479]
>> [    1.656666] md_open(): md127 opened by mdadm [480]
>> [    1.657771] md_release(): md125 released by mdadm [476]
>> [    1.659312] md_open(): md125 opened by systemd-udevd [365]
>> [    1.663193] md_release(): md127 released by mdadm [385]
>> [    1.673669] md_release(): md125 released by systemd-udevd [365]
>> [    1.685527] md_release(): md127 released by mdadm [480]
>> [    1.685599] md_release(): md126 released by mdadm [479]
>> [    1.686058] md_open(): md126 opened by systemd-udevd [366]
>> [    1.686282] md_release(): md126 released by systemd-udevd [366]
>> [    1.691024] md_open(): md127 opened by systemd-udevd [363]
>> [    1.695415] md_release(): md126 released by mdadm [384]
>> [    1.707163] md_release(): md127 released by systemd-udevd [363]
>>
>>>>> mdadm --stop --scan <<<
>>
>> [   89.975162] md_open(): md125 opened by mdadm [930]
>> [   89.975305] md_release(): md125 released by mdadm [930]
>> [   89.977434] md_open(): md125 opened by mdadm [932]
>> [   89.978813] md_open(): md125 opened by mdadm [930]
>> [   89.979365] md_release(): md125 released by mdadm [932]
>> [   89.979693] md_open(): md125 opened by systemd-udevd [931]
>> [   89.985790] md_release(): md125 released by systemd-udevd [931]
>> [   90.179911] md_release(): md125 released by mdadm [930]
>> [   90.180168] md_open(): md127 opened by mdadm [459]
>> [   90.180187] md_release(): md127 released by mdadm [459]
>> [   90.180199] md_open(): md126 opened by mdadm [459]
>> [   90.180205] md_release(): md126 released by mdadm [459]
>> [   90.180556] md_open(): md126 opened by mdadm [930]
>> [   90.180653] md_release(): md126 released by mdadm [930]
>> [   90.180690] md_open(): md126 opened by mdadm [930]
>> [   90.180758] md_open(): mdX opened by mdadm [459]
>> [   90.180995] md_open(): md125 opened by mdadm [459]
>> [   90.181056] md_release(): md125 released by mdadm [459]opened by mdadm [968]
>> [   90.182717] md_open(): md127 opened by mdadm [459]
>> [   90.182725] md_release(): md127 released by mdadm [459]
>> [   90.182732] md_open(): md126 opened by mdadm [459]
>> [   90.182761] md_release(): md126 released by mdadm [459]
>> [   90.182770] md_open(): md125 opened by mdadm [459]
>> [   90.182775] md_release(): md125 released by mdadm [459]
>> [   90.182940] md_release(): md126 released by mdadm [930]
>> [   90.183167] md_open(): md127 opened by mdadm [930]
>> [   90.183257] md_release(): md127 released by mdadm [930]
>> [   90.183288] md_open(): md127 opened by mdadm [930]
>> [   90.183461] md_open(): md127 opened by mdadm [459]
>> [   90.183488] md_release(): md127 released by mdadm [459]
>> [   90.183499] md_open(): md125 opened by mdadm [459]
>> [   90.183505] md_release(): md125 released by mdadm [459]
>> [   90.183686] md_release(): md127 released by mdadm [930]
> 
> So what were pids 930 and 459?
> One was presumably the "mdadm -Ss"  - probably 930.
> Is 459 the "mdadm --monitor" ??  That might be useful hint.
> 

yes.

[456] is:  /sbin/mdadm --monitor --scan --daemonise --syslog
--pid-file=/run/mdadm/mdadm.pid

and [930] is 'mdamd -Ss'.

> 
>>
>>
>>> Probably there is a 'change' event happening just before the 'remove' event,
>>> and udev runs "mdadm" on the 'change' event, and that ends up happening after
>>> the device has been removed.
>>>
>>> Is this really a problem?  Can't you just ignore it and pretend it isn't
>>> there?
>>
>> Well, if you list the block devices that the kernel detected in order to
>> operate on them, it could. I don't know exactly what would be the result
>> to use it but it could confuse some tools.
>>
>> Is there a way to check that the 'ghost' device has been removed by
>> poking sysfs ?
> 
> If you look at /sys/block/md*/md/array_state, those that contain 'inactive'
> or 'clear' might be 'ghosts', or might be in the process of being assembled.
> If you write 'clear' to the same file they should disappear.... unless udev
> does something to re-create them.
> 

It's in 'clear' state, and writing 'clear' doesn't make the device disapear.

[root@localhost ~]# dmesg -c >/dev/null
[root@localhost ~]# echo clear >/sys/block/md125/md/array_state
[root@localhost ~]# dmesg
[  254.106252] md: md125 stopped.
[  254.108182] md_open(): mdX opened by mdadm [968]

[  254.109103] md_open(): md125 opened by mdadm [459]
  [  254.109127] md_open(): md125 opened by mdadm [459]
  [  254.109281] md_release(): md125 released by mdadm [459]

  [  254.109337] md_open(): md125 opened by mdadm [968]
  [  254.109572] md_release(): md125 released by mdadm [968]

  [  254.109847] md_open(): md125 opened by systemd-udevd [967]
  [  254.109986] md_release(): md125 released by systemd-udevd [967]

In that sequence, it seems that mdadm [459] is missing a md_release()
here. Is this expected ?

Thanks

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: /sys/block/md126 still exists even after stopping the array
  2014-09-29  8:45               ` Francis Moreau
@ 2014-09-29 21:56                 ` NeilBrown
  2014-09-30  7:43                   ` Francis Moreau
  0 siblings, 1 reply; 23+ messages in thread
From: NeilBrown @ 2014-09-29 21:56 UTC (permalink / raw)
  To: Francis Moreau; +Cc: linux-raid, sebastian.riemer

[-- Attachment #1: Type: text/plain, Size: 3512 bytes --]

On Mon, 29 Sep 2014 10:45:17 +0200 Francis Moreau <francis.moro@gmail.com>
wrote:

> > So what were pids 930 and 459?
> > One was presumably the "mdadm -Ss"  - probably 930.
> > Is 459 the "mdadm --monitor" ??  That might be useful hint.
> > 
> 
> yes.
> 
> [456] is:  /sbin/mdadm --monitor --scan --daemonise --syslog
> --pid-file=/run/mdadm/mdadm.pid
> 
> and [930] is 'mdamd -Ss'.

Good.  Please try the patch below.

> > 
> >>
> >>
> >>> Probably there is a 'change' event happening just before the 'remove' event,
> >>> and udev runs "mdadm" on the 'change' event, and that ends up happening after
> >>> the device has been removed.
> >>>
> >>> Is this really a problem?  Can't you just ignore it and pretend it isn't
> >>> there?
> >>
> >> Well, if you list the block devices that the kernel detected in order to
> >> operate on them, it could. I don't know exactly what would be the result
> >> to use it but it could confuse some tools.
> >>
> >> Is there a way to check that the 'ghost' device has been removed by
> >> poking sysfs ?
> > 
> > If you look at /sys/block/md*/md/array_state, those that contain 'inactive'
> > or 'clear' might be 'ghosts', or might be in the process of being assembled.
> > If you write 'clear' to the same file they should disappear.... unless udev
> > does something to re-create them.
> > 
> 
> It's in 'clear' state, and writing 'clear' doesn't make the device disapear.
> 
> [root@localhost ~]# dmesg -c >/dev/null
> [root@localhost ~]# echo clear >/sys/block/md125/md/array_state
> [root@localhost ~]# dmesg
> [  254.106252] md: md125 stopped.
> [  254.108182] md_open(): mdX opened by mdadm [968]
> 
> [  254.109103] md_open(): md125 opened by mdadm [459]
>   [  254.109127] md_open(): md125 opened by mdadm [459]
>   [  254.109281] md_release(): md125 released by mdadm [459]
> 
>   [  254.109337] md_open(): md125 opened by mdadm [968]
>   [  254.109572] md_release(): md125 released by mdadm [968]
> 
>   [  254.109847] md_open(): md125 opened by systemd-udevd [967]
>   [  254.109986] md_release(): md125 released by systemd-udevd [967]
> 
> In that sequence, it seems that mdadm [459] is missing a md_release()
> here. Is this expected ?

Presumably the first md_open returned an error.  You could add another printk
at each 'return' to check.

Thanks,
NeilBrown


diff --git a/Monitor.c b/Monitor.c
index 5cb24fab8f2a..971d2ecbea72 100644
--- a/Monitor.c
+++ b/Monitor.c
@@ -460,7 +460,7 @@ static int check_array(struct state *st, struct mdstat_ent *mdstat,
 	mdu_array_info_t array;
 	struct mdstat_ent *mse = NULL, *mse2;
 	char *dev = st->devname;
-	int fd;
+	int fd = -1;
 	int i;
 	int remaining_disks;
 	int last_disk;
@@ -468,6 +468,27 @@ static int check_array(struct state *st, struct mdstat_ent *mdstat,
 
 	if (test)
 		alert("TestMessage", dev, NULL, ainfo);
+	if (st->devnm[0])
+		fd = open("/sys/block", O_RDONLY|O_DIRECTORY);
+	if (fd >= 0) {
+		/* Don't open the device unless it is present and
+		 * active in sysfs.
+		 */
+		char buf[10];
+		close(fd);
+		fd = sysfs_open(st->devnm, NULL, "array_state");
+		if (fd < 0 ||
+		    read(fd, buf, 10) < 5 ||
+		    strncmp(buf,"clear",5) == 0 ||
+		    strncmp(buf,"inact",5) == 0) {
+			if (fd >= 0)
+				close(fd);
+			if (!st->err)
+				alert("DeviceDisappeared", dev, NULL, ainfo);
+			st->err++;
+			return 0;
+		}
+	}
 	fd = open(dev, O_RDONLY);
 	if (fd < 0) {
 		if (!st->err)

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: /sys/block/md126 still exists even after stopping the array
  2014-09-29 21:56                 ` NeilBrown
@ 2014-09-30  7:43                   ` Francis Moreau
  2014-10-07  7:05                     ` Francis Moreau
  0 siblings, 1 reply; 23+ messages in thread
From: Francis Moreau @ 2014-09-30  7:43 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid, sebastian.riemer

Hi Neil,

On 09/29/2014 11:56 PM, NeilBrown wrote:
> On Mon, 29 Sep 2014 10:45:17 +0200 Francis Moreau <francis.moro@gmail.com>
> wrote:
> 
>>> So what were pids 930 and 459?
>>> One was presumably the "mdadm -Ss"  - probably 930.
>>> Is 459 the "mdadm --monitor" ??  That might be useful hint.
>>>
>>
>> yes.
>>
>> [456] is:  /sbin/mdadm --monitor --scan --daemonise --syslog
>> --pid-file=/run/mdadm/mdadm.pid
>>
>> and [930] is 'mdamd -Ss'.
> 
> Good.  Please try the patch below.
> 

After applying your patch, this is what I'm getting in syslog:

Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by mdadm [970]
Sep 30 03:40:07 localhost kernel: md_release(): md125 released by mdadm
[970]
Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by mdadm [972]
Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by mdadm [970]
Sep 30 03:40:07 localhost kernel: md_release(): md125 released by mdadm
[972]
Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by
systemd-udevd [971]
Sep 30 03:40:07 localhost systemd[1]: Cannot add dependency job for unit
mdmonitor-takeover.service, ignoring: Invalid argument
Sep 30 03:40:07 localhost systemd[1]: Started Software RAID monitoring
and management.
Sep 30 03:40:07 localhost kernel: md_release(): md125 released by
systemd-udevd [971]
Sep 30 03:40:08 localhost mdadm[466]: DeviceDisappeared event detected
on md device /dev/md125
Sep 30 03:40:08 localhost mdadm[466]: DeviceDisappeared event detected
on md device /dev/md126
Sep 30 03:40:08 localhost mdadm[466]: DeviceDisappeared event detected
on md device /dev/md127
Sep 30 03:40:08 localhost kernel: md125: detected capacity change from
1863254016 to 0
Sep 30 03:40:08 localhost kernel: md: md125 stopped.
Sep 30 03:40:08 localhost kernel: md: unbind<vdc3>
Sep 30 03:40:08 localhost kernel: md: export_rdev(vdc3)
Sep 30 03:40:08 localhost kernel: md: unbind<vdb3>
Sep 30 03:40:08 localhost kernel: md: export_rdev(vdb3)
Sep 30 03:40:08 localhost kernel: md_release(): md125 released by mdadm
[970]
Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [466]
Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
[466]
Sep 30 03:40:08 localhost kernel: md_open(): md126 opened by mdadm [466]
Sep 30 03:40:08 localhost kernel: md_release(): md126 released by mdadm
[466]
Sep 30 03:40:08 localhost kernel: md_open(): md126 opened by mdadm [970]
Sep 30 03:40:08 localhost kernel: md_release(): md126 released by mdadm
[970]
Sep 30 03:40:08 localhost kernel: md_open(): md126 opened by mdadm [970]
Sep 30 03:40:08 localhost kernel: md126: detected capacity change from
67043328 to 0
Sep 30 03:40:08 localhost kernel: md: md126 stopped.
Sep 30 03:40:08 localhost kernel: md: unbind<vdc1>
Sep 30 03:40:08 localhost kernel: md: export_rdev(vdc1)
Sep 30 03:40:08 localhost kernel: md: unbind<vdb1>
Sep 30 03:40:08 localhost kernel: md: export_rdev(vdb1)
Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [466]
Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
[466]
Sep 30 03:40:08 localhost kernel: md_release(): md126 released by mdadm
[970]
Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [970]
Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
[970]
Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [970]
Sep 30 03:40:08 localhost kernel: md127: detected capacity change from
214564864 to 0
Sep 30 03:40:08 localhost kernel: md: md127 stopped.
Sep 30 03:40:08 localhost kernel: md: unbind<vdc2>
Sep 30 03:40:08 localhost kernel: md: export_rdev(vdc2)
Sep 30 03:40:08 localhost kernel: md: unbind<vdb2>
Sep 30 03:40:08 localhost kernel: md: export_rdev(vdb2)
Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
[970]

The ghost device is no more present so your patch seems to have fixed my
issue. But I must admit I don't really understand what's going on :-/

Thanks

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: /sys/block/md126 still exists even after stopping the array
  2014-09-30  7:43                   ` Francis Moreau
@ 2014-10-07  7:05                     ` Francis Moreau
  2014-10-07 23:54                       ` NeilBrown
  0 siblings, 1 reply; 23+ messages in thread
From: Francis Moreau @ 2014-10-07  7:05 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid, sebastian.riemer

Hi Neil,

On 09/30/2014 09:43 AM, Francis Moreau wrote:
> Hi Neil,
> 
> On 09/29/2014 11:56 PM, NeilBrown wrote:
>> On Mon, 29 Sep 2014 10:45:17 +0200 Francis Moreau <francis.moro@gmail.com>
>> wrote:
>>
>>>> So what were pids 930 and 459?
>>>> One was presumably the "mdadm -Ss"  - probably 930.
>>>> Is 459 the "mdadm --monitor" ??  That might be useful hint.
>>>>
>>>
>>> yes.
>>>
>>> [456] is:  /sbin/mdadm --monitor --scan --daemonise --syslog
>>> --pid-file=/run/mdadm/mdadm.pid
>>>
>>> and [930] is 'mdamd -Ss'.
>>
>> Good.  Please try the patch below.
>>
> 
> After applying your patch, this is what I'm getting in syslog:
> 
> Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by mdadm [970]
> Sep 30 03:40:07 localhost kernel: md_release(): md125 released by mdadm
> [970]
> Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by mdadm [972]
> Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by mdadm [970]
> Sep 30 03:40:07 localhost kernel: md_release(): md125 released by mdadm
> [972]
> Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by
> systemd-udevd [971]
> Sep 30 03:40:07 localhost systemd[1]: Cannot add dependency job for unit
> mdmonitor-takeover.service, ignoring: Invalid argument
> Sep 30 03:40:07 localhost systemd[1]: Started Software RAID monitoring
> and management.
> Sep 30 03:40:07 localhost kernel: md_release(): md125 released by
> systemd-udevd [971]
> Sep 30 03:40:08 localhost mdadm[466]: DeviceDisappeared event detected
> on md device /dev/md125
> Sep 30 03:40:08 localhost mdadm[466]: DeviceDisappeared event detected
> on md device /dev/md126
> Sep 30 03:40:08 localhost mdadm[466]: DeviceDisappeared event detected
> on md device /dev/md127
> Sep 30 03:40:08 localhost kernel: md125: detected capacity change from
> 1863254016 to 0
> Sep 30 03:40:08 localhost kernel: md: md125 stopped.
> Sep 30 03:40:08 localhost kernel: md: unbind<vdc3>
> Sep 30 03:40:08 localhost kernel: md: export_rdev(vdc3)
> Sep 30 03:40:08 localhost kernel: md: unbind<vdb3>
> Sep 30 03:40:08 localhost kernel: md: export_rdev(vdb3)
> Sep 30 03:40:08 localhost kernel: md_release(): md125 released by mdadm
> [970]
> Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [466]
> Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
> [466]
> Sep 30 03:40:08 localhost kernel: md_open(): md126 opened by mdadm [466]
> Sep 30 03:40:08 localhost kernel: md_release(): md126 released by mdadm
> [466]
> Sep 30 03:40:08 localhost kernel: md_open(): md126 opened by mdadm [970]
> Sep 30 03:40:08 localhost kernel: md_release(): md126 released by mdadm
> [970]
> Sep 30 03:40:08 localhost kernel: md_open(): md126 opened by mdadm [970]
> Sep 30 03:40:08 localhost kernel: md126: detected capacity change from
> 67043328 to 0
> Sep 30 03:40:08 localhost kernel: md: md126 stopped.
> Sep 30 03:40:08 localhost kernel: md: unbind<vdc1>
> Sep 30 03:40:08 localhost kernel: md: export_rdev(vdc1)
> Sep 30 03:40:08 localhost kernel: md: unbind<vdb1>
> Sep 30 03:40:08 localhost kernel: md: export_rdev(vdb1)
> Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [466]
> Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
> [466]
> Sep 30 03:40:08 localhost kernel: md_release(): md126 released by mdadm
> [970]
> Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [970]
> Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
> [970]
> Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [970]
> Sep 30 03:40:08 localhost kernel: md127: detected capacity change from
> 214564864 to 0
> Sep 30 03:40:08 localhost kernel: md: md127 stopped.
> Sep 30 03:40:08 localhost kernel: md: unbind<vdc2>
> Sep 30 03:40:08 localhost kernel: md: export_rdev(vdc2)
> Sep 30 03:40:08 localhost kernel: md: unbind<vdb2>
> Sep 30 03:40:08 localhost kernel: md: export_rdev(vdb2)
> Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
> [970]
> 
> The ghost device is no more present so your patch seems to have fixed my
> issue. But I must admit I don't really understand what's going on :-/
> 

Since those 'ghost' devices are expected from the MD implementation
point of view, I'm wondering how am I supposed to detect them or maybe
how an application is supposed to recognized online arrays.

My application uses udev to detect et to get information about new
devices. I don't think the information exported by udev is enough to
figure this out. Also please note that since I rely on udev, I can't
really read information on /sys since this information may be out of
sync with the one returned by udev.

Thanks.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: /sys/block/md126 still exists even after stopping the array
  2014-10-07  7:05                     ` Francis Moreau
@ 2014-10-07 23:54                       ` NeilBrown
  2014-10-09  9:40                         ` Francis Moreau
  0 siblings, 1 reply; 23+ messages in thread
From: NeilBrown @ 2014-10-07 23:54 UTC (permalink / raw)
  To: Francis Moreau; +Cc: linux-raid, sebastian.riemer

[-- Attachment #1: Type: text/plain, Size: 6330 bytes --]

On Tue, 07 Oct 2014 09:05:43 +0200 Francis Moreau <francis.moro@gmail.com>
wrote:

> Hi Neil,
> 
> On 09/30/2014 09:43 AM, Francis Moreau wrote:
> > Hi Neil,
> > 
> > On 09/29/2014 11:56 PM, NeilBrown wrote:
> >> On Mon, 29 Sep 2014 10:45:17 +0200 Francis Moreau <francis.moro@gmail.com>
> >> wrote:
> >>
> >>>> So what were pids 930 and 459?
> >>>> One was presumably the "mdadm -Ss"  - probably 930.
> >>>> Is 459 the "mdadm --monitor" ??  That might be useful hint.
> >>>>
> >>>
> >>> yes.
> >>>
> >>> [456] is:  /sbin/mdadm --monitor --scan --daemonise --syslog
> >>> --pid-file=/run/mdadm/mdadm.pid
> >>>
> >>> and [930] is 'mdamd -Ss'.
> >>
> >> Good.  Please try the patch below.
> >>
> > 
> > After applying your patch, this is what I'm getting in syslog:
> > 
> > Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by mdadm [970]
> > Sep 30 03:40:07 localhost kernel: md_release(): md125 released by mdadm
> > [970]
> > Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by mdadm [972]
> > Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by mdadm [970]
> > Sep 30 03:40:07 localhost kernel: md_release(): md125 released by mdadm
> > [972]
> > Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by
> > systemd-udevd [971]
> > Sep 30 03:40:07 localhost systemd[1]: Cannot add dependency job for unit
> > mdmonitor-takeover.service, ignoring: Invalid argument
> > Sep 30 03:40:07 localhost systemd[1]: Started Software RAID monitoring
> > and management.
> > Sep 30 03:40:07 localhost kernel: md_release(): md125 released by
> > systemd-udevd [971]
> > Sep 30 03:40:08 localhost mdadm[466]: DeviceDisappeared event detected
> > on md device /dev/md125
> > Sep 30 03:40:08 localhost mdadm[466]: DeviceDisappeared event detected
> > on md device /dev/md126
> > Sep 30 03:40:08 localhost mdadm[466]: DeviceDisappeared event detected
> > on md device /dev/md127
> > Sep 30 03:40:08 localhost kernel: md125: detected capacity change from
> > 1863254016 to 0
> > Sep 30 03:40:08 localhost kernel: md: md125 stopped.
> > Sep 30 03:40:08 localhost kernel: md: unbind<vdc3>
> > Sep 30 03:40:08 localhost kernel: md: export_rdev(vdc3)
> > Sep 30 03:40:08 localhost kernel: md: unbind<vdb3>
> > Sep 30 03:40:08 localhost kernel: md: export_rdev(vdb3)
> > Sep 30 03:40:08 localhost kernel: md_release(): md125 released by mdadm
> > [970]
> > Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [466]
> > Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
> > [466]
> > Sep 30 03:40:08 localhost kernel: md_open(): md126 opened by mdadm [466]
> > Sep 30 03:40:08 localhost kernel: md_release(): md126 released by mdadm
> > [466]
> > Sep 30 03:40:08 localhost kernel: md_open(): md126 opened by mdadm [970]
> > Sep 30 03:40:08 localhost kernel: md_release(): md126 released by mdadm
> > [970]
> > Sep 30 03:40:08 localhost kernel: md_open(): md126 opened by mdadm [970]
> > Sep 30 03:40:08 localhost kernel: md126: detected capacity change from
> > 67043328 to 0
> > Sep 30 03:40:08 localhost kernel: md: md126 stopped.
> > Sep 30 03:40:08 localhost kernel: md: unbind<vdc1>
> > Sep 30 03:40:08 localhost kernel: md: export_rdev(vdc1)
> > Sep 30 03:40:08 localhost kernel: md: unbind<vdb1>
> > Sep 30 03:40:08 localhost kernel: md: export_rdev(vdb1)
> > Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [466]
> > Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
> > [466]
> > Sep 30 03:40:08 localhost kernel: md_release(): md126 released by mdadm
> > [970]
> > Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [970]
> > Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
> > [970]
> > Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [970]
> > Sep 30 03:40:08 localhost kernel: md127: detected capacity change from
> > 214564864 to 0
> > Sep 30 03:40:08 localhost kernel: md: md127 stopped.
> > Sep 30 03:40:08 localhost kernel: md: unbind<vdc2>
> > Sep 30 03:40:08 localhost kernel: md: export_rdev(vdc2)
> > Sep 30 03:40:08 localhost kernel: md: unbind<vdb2>
> > Sep 30 03:40:08 localhost kernel: md: export_rdev(vdb2)
> > Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
> > [970]
> > 
> > The ghost device is no more present so your patch seems to have fixed my
> > issue. But I must admit I don't really understand what's going on :-/
> > 
> 
> Since those 'ghost' devices are expected from the MD implementation
> point of view, I'm wondering how am I supposed to detect them or maybe
> how an application is supposed to recognized online arrays.

If your application is looking in /proc/mdstat, then the "ghost" devices will
be either "inactive" or not present at all.
If your application is looking in /sys/block/md*, then the "ghost" devices
will have "clear" or "inactive" in /sys/block/mdXX/md/array_state.

If you use the new "CREATE names=yes" line in mdadm.conf (mdadm 3.3 or
later), and use kernel 3.17 or later, and use names rather than numbers to
identify your arrays (/dev/md/home, /dev/md_root), then the "ghost" problem
will be gone, and names in /proc/mdstat will be e.g. "md_home", or "md_root"
rather than "md4" or "md127".

> 
> My application uses udev to detect et to get information about new
> devices. I don't think the information exported by udev is enough to
> figure this out. Also please note that since I rely on udev, I can't
> really read information on /sys since this information may be out of
> sync with the one returned by udev.

If udev reports that an array exists, then it really did exist when udev got
the message.  By the time your program gets run by udev, it might not exist
any more.
i.e. udev is always racy.
You should always treat any event from udev as a hint: 

  "Something happened to this device in the recent past.  Lots of other
  things might have happened since.  The device might not exist any more, or
  it might have been replaced with a completely different device.  So you
  might want to do something, or you might not, but whatever you do - be
  careful and don't blame me if things go wrong 'cause I'm just the
  messenger."

NeilBrown




[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: /sys/block/md126 still exists even after stopping the array
  2014-10-07 23:54                       ` NeilBrown
@ 2014-10-09  9:40                         ` Francis Moreau
  2014-10-09  9:55                           ` NeilBrown
  0 siblings, 1 reply; 23+ messages in thread
From: Francis Moreau @ 2014-10-09  9:40 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid, sebastian.riemer

On 10/08/2014 01:54 AM, NeilBrown wrote:
> On Tue, 07 Oct 2014 09:05:43 +0200 Francis Moreau <francis.moro@gmail.com>
> wrote:
> 
>> Hi Neil,
>>
>> On 09/30/2014 09:43 AM, Francis Moreau wrote:
>>> Hi Neil,
>>>
>>> On 09/29/2014 11:56 PM, NeilBrown wrote:
>>>> On Mon, 29 Sep 2014 10:45:17 +0200 Francis Moreau <francis.moro@gmail.com>
>>>> wrote:
>>>>
>>>>>> So what were pids 930 and 459?
>>>>>> One was presumably the "mdadm -Ss"  - probably 930.
>>>>>> Is 459 the "mdadm --monitor" ??  That might be useful hint.
>>>>>>
>>>>>
>>>>> yes.
>>>>>
>>>>> [456] is:  /sbin/mdadm --monitor --scan --daemonise --syslog
>>>>> --pid-file=/run/mdadm/mdadm.pid
>>>>>
>>>>> and [930] is 'mdamd -Ss'.
>>>>
>>>> Good.  Please try the patch below.
>>>>
>>>
>>> After applying your patch, this is what I'm getting in syslog:
>>>
>>> Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by mdadm [970]
>>> Sep 30 03:40:07 localhost kernel: md_release(): md125 released by mdadm
>>> [970]
>>> Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by mdadm [972]
>>> Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by mdadm [970]
>>> Sep 30 03:40:07 localhost kernel: md_release(): md125 released by mdadm
>>> [972]
>>> Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by
>>> systemd-udevd [971]
>>> Sep 30 03:40:07 localhost systemd[1]: Cannot add dependency job for unit
>>> mdmonitor-takeover.service, ignoring: Invalid argument
>>> Sep 30 03:40:07 localhost systemd[1]: Started Software RAID monitoring
>>> and management.
>>> Sep 30 03:40:07 localhost kernel: md_release(): md125 released by
>>> systemd-udevd [971]
>>> Sep 30 03:40:08 localhost mdadm[466]: DeviceDisappeared event detected
>>> on md device /dev/md125
>>> Sep 30 03:40:08 localhost mdadm[466]: DeviceDisappeared event detected
>>> on md device /dev/md126
>>> Sep 30 03:40:08 localhost mdadm[466]: DeviceDisappeared event detected
>>> on md device /dev/md127
>>> Sep 30 03:40:08 localhost kernel: md125: detected capacity change from
>>> 1863254016 to 0
>>> Sep 30 03:40:08 localhost kernel: md: md125 stopped.
>>> Sep 30 03:40:08 localhost kernel: md: unbind<vdc3>
>>> Sep 30 03:40:08 localhost kernel: md: export_rdev(vdc3)
>>> Sep 30 03:40:08 localhost kernel: md: unbind<vdb3>
>>> Sep 30 03:40:08 localhost kernel: md: export_rdev(vdb3)
>>> Sep 30 03:40:08 localhost kernel: md_release(): md125 released by mdadm
>>> [970]
>>> Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [466]
>>> Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
>>> [466]
>>> Sep 30 03:40:08 localhost kernel: md_open(): md126 opened by mdadm [466]
>>> Sep 30 03:40:08 localhost kernel: md_release(): md126 released by mdadm
>>> [466]
>>> Sep 30 03:40:08 localhost kernel: md_open(): md126 opened by mdadm [970]
>>> Sep 30 03:40:08 localhost kernel: md_release(): md126 released by mdadm
>>> [970]
>>> Sep 30 03:40:08 localhost kernel: md_open(): md126 opened by mdadm [970]
>>> Sep 30 03:40:08 localhost kernel: md126: detected capacity change from
>>> 67043328 to 0
>>> Sep 30 03:40:08 localhost kernel: md: md126 stopped.
>>> Sep 30 03:40:08 localhost kernel: md: unbind<vdc1>
>>> Sep 30 03:40:08 localhost kernel: md: export_rdev(vdc1)
>>> Sep 30 03:40:08 localhost kernel: md: unbind<vdb1>
>>> Sep 30 03:40:08 localhost kernel: md: export_rdev(vdb1)
>>> Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [466]
>>> Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
>>> [466]
>>> Sep 30 03:40:08 localhost kernel: md_release(): md126 released by mdadm
>>> [970]
>>> Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [970]
>>> Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
>>> [970]
>>> Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [970]
>>> Sep 30 03:40:08 localhost kernel: md127: detected capacity change from
>>> 214564864 to 0
>>> Sep 30 03:40:08 localhost kernel: md: md127 stopped.
>>> Sep 30 03:40:08 localhost kernel: md: unbind<vdc2>
>>> Sep 30 03:40:08 localhost kernel: md: export_rdev(vdc2)
>>> Sep 30 03:40:08 localhost kernel: md: unbind<vdb2>
>>> Sep 30 03:40:08 localhost kernel: md: export_rdev(vdb2)
>>> Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
>>> [970]
>>>
>>> The ghost device is no more present so your patch seems to have fixed my
>>> issue. But I must admit I don't really understand what's going on :-/
>>>
>>
>> Since those 'ghost' devices are expected from the MD implementation
>> point of view, I'm wondering how am I supposed to detect them or maybe
>> how an application is supposed to recognized online arrays.
> 
> If your application is looking in /proc/mdstat, then the "ghost" devices will
> be either "inactive" or not present at all.
> If your application is looking in /sys/block/md*, then the "ghost" devices
> will have "clear" or "inactive" in /sys/block/mdXX/md/array_state.
> 
> If you use the new "CREATE names=yes" line in mdadm.conf (mdadm 3.3 or
> later), and use kernel 3.17 or later, and use names rather than numbers to
> identify your arrays (/dev/md/home, /dev/md_root), then the "ghost" problem
> will be gone, and names in /proc/mdstat will be e.g. "md_home", or "md_root"
> rather than "md4" or "md127".
> 
>>
>> My application uses udev to detect et to get information about new
>> devices. I don't think the information exported by udev is enough to
>> figure this out. Also please note that since I rely on udev, I can't
>> really read information on /sys since this information may be out of
>> sync with the one returned by udev.
> 
> If udev reports that an array exists, then it really did exist when udev got
> the message.  By the time your program gets run by udev, it might not exist
> any more. i.e. udev is always racy.

Yes, but reading sysfs is also racy. I was thinking that the advantage
of using udev is that it gives me a *consistent* (perhaps outdated)
snapshot of the device state.

Thanks

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: /sys/block/md126 still exists even after stopping the array
  2014-10-09  9:40                         ` Francis Moreau
@ 2014-10-09  9:55                           ` NeilBrown
  2014-10-10 19:34                             ` Francis Moreau
  0 siblings, 1 reply; 23+ messages in thread
From: NeilBrown @ 2014-10-09  9:55 UTC (permalink / raw)
  To: Francis Moreau; +Cc: linux-raid, sebastian.riemer

[-- Attachment #1: Type: text/plain, Size: 7037 bytes --]

On Thu, 09 Oct 2014 11:40:25 +0200 Francis Moreau <francis.moro@gmail.com>
wrote:

> On 10/08/2014 01:54 AM, NeilBrown wrote:
> > On Tue, 07 Oct 2014 09:05:43 +0200 Francis Moreau <francis.moro@gmail.com>
> > wrote:
> > 
> >> Hi Neil,
> >>
> >> On 09/30/2014 09:43 AM, Francis Moreau wrote:
> >>> Hi Neil,
> >>>
> >>> On 09/29/2014 11:56 PM, NeilBrown wrote:
> >>>> On Mon, 29 Sep 2014 10:45:17 +0200 Francis Moreau <francis.moro@gmail.com>
> >>>> wrote:
> >>>>
> >>>>>> So what were pids 930 and 459?
> >>>>>> One was presumably the "mdadm -Ss"  - probably 930.
> >>>>>> Is 459 the "mdadm --monitor" ??  That might be useful hint.
> >>>>>>
> >>>>>
> >>>>> yes.
> >>>>>
> >>>>> [456] is:  /sbin/mdadm --monitor --scan --daemonise --syslog
> >>>>> --pid-file=/run/mdadm/mdadm.pid
> >>>>>
> >>>>> and [930] is 'mdamd -Ss'.
> >>>>
> >>>> Good.  Please try the patch below.
> >>>>
> >>>
> >>> After applying your patch, this is what I'm getting in syslog:
> >>>
> >>> Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by mdadm [970]
> >>> Sep 30 03:40:07 localhost kernel: md_release(): md125 released by mdadm
> >>> [970]
> >>> Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by mdadm [972]
> >>> Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by mdadm [970]
> >>> Sep 30 03:40:07 localhost kernel: md_release(): md125 released by mdadm
> >>> [972]
> >>> Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by
> >>> systemd-udevd [971]
> >>> Sep 30 03:40:07 localhost systemd[1]: Cannot add dependency job for unit
> >>> mdmonitor-takeover.service, ignoring: Invalid argument
> >>> Sep 30 03:40:07 localhost systemd[1]: Started Software RAID monitoring
> >>> and management.
> >>> Sep 30 03:40:07 localhost kernel: md_release(): md125 released by
> >>> systemd-udevd [971]
> >>> Sep 30 03:40:08 localhost mdadm[466]: DeviceDisappeared event detected
> >>> on md device /dev/md125
> >>> Sep 30 03:40:08 localhost mdadm[466]: DeviceDisappeared event detected
> >>> on md device /dev/md126
> >>> Sep 30 03:40:08 localhost mdadm[466]: DeviceDisappeared event detected
> >>> on md device /dev/md127
> >>> Sep 30 03:40:08 localhost kernel: md125: detected capacity change from
> >>> 1863254016 to 0
> >>> Sep 30 03:40:08 localhost kernel: md: md125 stopped.
> >>> Sep 30 03:40:08 localhost kernel: md: unbind<vdc3>
> >>> Sep 30 03:40:08 localhost kernel: md: export_rdev(vdc3)
> >>> Sep 30 03:40:08 localhost kernel: md: unbind<vdb3>
> >>> Sep 30 03:40:08 localhost kernel: md: export_rdev(vdb3)
> >>> Sep 30 03:40:08 localhost kernel: md_release(): md125 released by mdadm
> >>> [970]
> >>> Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [466]
> >>> Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
> >>> [466]
> >>> Sep 30 03:40:08 localhost kernel: md_open(): md126 opened by mdadm [466]
> >>> Sep 30 03:40:08 localhost kernel: md_release(): md126 released by mdadm
> >>> [466]
> >>> Sep 30 03:40:08 localhost kernel: md_open(): md126 opened by mdadm [970]
> >>> Sep 30 03:40:08 localhost kernel: md_release(): md126 released by mdadm
> >>> [970]
> >>> Sep 30 03:40:08 localhost kernel: md_open(): md126 opened by mdadm [970]
> >>> Sep 30 03:40:08 localhost kernel: md126: detected capacity change from
> >>> 67043328 to 0
> >>> Sep 30 03:40:08 localhost kernel: md: md126 stopped.
> >>> Sep 30 03:40:08 localhost kernel: md: unbind<vdc1>
> >>> Sep 30 03:40:08 localhost kernel: md: export_rdev(vdc1)
> >>> Sep 30 03:40:08 localhost kernel: md: unbind<vdb1>
> >>> Sep 30 03:40:08 localhost kernel: md: export_rdev(vdb1)
> >>> Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [466]
> >>> Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
> >>> [466]
> >>> Sep 30 03:40:08 localhost kernel: md_release(): md126 released by mdadm
> >>> [970]
> >>> Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [970]
> >>> Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
> >>> [970]
> >>> Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [970]
> >>> Sep 30 03:40:08 localhost kernel: md127: detected capacity change from
> >>> 214564864 to 0
> >>> Sep 30 03:40:08 localhost kernel: md: md127 stopped.
> >>> Sep 30 03:40:08 localhost kernel: md: unbind<vdc2>
> >>> Sep 30 03:40:08 localhost kernel: md: export_rdev(vdc2)
> >>> Sep 30 03:40:08 localhost kernel: md: unbind<vdb2>
> >>> Sep 30 03:40:08 localhost kernel: md: export_rdev(vdb2)
> >>> Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
> >>> [970]
> >>>
> >>> The ghost device is no more present so your patch seems to have fixed my
> >>> issue. But I must admit I don't really understand what's going on :-/
> >>>
> >>
> >> Since those 'ghost' devices are expected from the MD implementation
> >> point of view, I'm wondering how am I supposed to detect them or maybe
> >> how an application is supposed to recognized online arrays.
> > 
> > If your application is looking in /proc/mdstat, then the "ghost" devices will
> > be either "inactive" or not present at all.
> > If your application is looking in /sys/block/md*, then the "ghost" devices
> > will have "clear" or "inactive" in /sys/block/mdXX/md/array_state.
> > 
> > If you use the new "CREATE names=yes" line in mdadm.conf (mdadm 3.3 or
> > later), and use kernel 3.17 or later, and use names rather than numbers to
> > identify your arrays (/dev/md/home, /dev/md_root), then the "ghost" problem
> > will be gone, and names in /proc/mdstat will be e.g. "md_home", or "md_root"
> > rather than "md4" or "md127".
> > 
> >>
> >> My application uses udev to detect et to get information about new
> >> devices. I don't think the information exported by udev is enough to
> >> figure this out. Also please note that since I rely on udev, I can't
> >> really read information on /sys since this information may be out of
> >> sync with the one returned by udev.
> > 
> > If udev reports that an array exists, then it really did exist when udev got
> > the message.  By the time your program gets run by udev, it might not exist
> > any more. i.e. udev is always racy.
> 
> Yes, but reading sysfs is also racy. I was thinking that the advantage
> of using udev is that it gives me a *consistent* (perhaps outdated)
> snapshot of the device state.
> 

In what sense do you think sysfs is racy?  What exactly do you want to do
with the udev event?

The event from the kernel to udev only contains the identity of the device
and the type of event (add,change,remove).

Some drivers add extra 'environment' information.
 - 'dm' adds a 'cookie'.
 - bcache add a 'CACHED_UUID' and 'CACHED_LABEL'
 - libata-acpi adds a 'BAY_EVENT'

but in general there is nothing extra.

If udev adds stuff (which is probably does), it is just as racy as anything
that you might determine and add yourself.

NeilBrown

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: /sys/block/md126 still exists even after stopping the array
  2014-10-09  9:55                           ` NeilBrown
@ 2014-10-10 19:34                             ` Francis Moreau
  0 siblings, 0 replies; 23+ messages in thread
From: Francis Moreau @ 2014-10-10 19:34 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid

Hello Neil,

On 10/09/2014 11:55 AM, NeilBrown wrote:
> On Thu, 09 Oct 2014 11:40:25 +0200 Francis Moreau <francis.moro@gmail.com>
> wrote:
> 
[...]

>>
>> Yes, but reading sysfs is also racy. I was thinking that the advantage
>> of using udev is that it gives me a *consistent* (perhaps outdated)
>> snapshot of the device state.
>>
> 
> In what sense do you think sysfs is racy?  What exactly do you want to do
> with the udev event?
> 
> The event from the kernel to udev only contains the identity of the device
> and the type of event (add,change,remove).
> 
> Some drivers add extra 'environment' information.
>  - 'dm' adds a 'cookie'.
>  - bcache add a 'CACHED_UUID' and 'CACHED_LABEL'
>  - libata-acpi adds a 'BAY_EVENT'
> 
> but in general there is nothing extra.
> 
> If udev adds stuff (which is probably does), it is just as racy as anything
> that you might determine and add yourself.
> 

I was wrongly assuming that if I queried the udev DB through its library
it would have returned a consistent image of the state of the device
meaning that each property of the device would be coherent with the others.

But even if that was possible, it would be pointless since all accesses
to the device could still fail.

Thanks for enlighting me :)

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2014-10-10 19:34 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-06-24 15:38 /sys/block/md126 still exists even after stopping the array Francis Moreau
2014-06-25  1:03 ` NeilBrown
2014-06-25  6:59   ` Francis Moreau
2014-07-24 13:40   ` Sebastian Parschauer
2014-07-24 13:51     ` Artur Paszkiewicz
2014-09-25 16:12   ` Francis Moreau
2014-09-26  0:33     ` NeilBrown
2014-09-26 10:23       ` Francis Moreau
2014-09-26 10:44         ` NeilBrown
2014-09-26 11:23           ` Artur Paszkiewicz
2014-09-29  4:19             ` NeilBrown
2014-09-26 12:21           ` Francis Moreau
2014-09-26 12:50             ` Francis Moreau
2014-09-29  4:47               ` NeilBrown
2014-09-29  4:37             ` NeilBrown
2014-09-29  8:45               ` Francis Moreau
2014-09-29 21:56                 ` NeilBrown
2014-09-30  7:43                   ` Francis Moreau
2014-10-07  7:05                     ` Francis Moreau
2014-10-07 23:54                       ` NeilBrown
2014-10-09  9:40                         ` Francis Moreau
2014-10-09  9:55                           ` NeilBrown
2014-10-10 19:34                             ` Francis Moreau

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.