mdadm: one question about the readonly and readwrite feature

All of lore.kernel.org
 help / color / mirror / Atom feed

* mdadm: one question about the readonly and readwrite feature
@ 2017-03-22 12:00 Zhilong Liu
  2017-03-22 21:55 ` NeilBrown
  0 siblings, 1 reply; 11+ messages in thread
From: Zhilong Liu @ 2017-03-22 12:00 UTC (permalink / raw)
  To: NeilBrown, Jes Sorensen; +Cc: linux-raid

Hi, Neil;

   Excuse me, according to read 'mdadm/tests/ToTest', I'm a little 
confused about "readonly"
and "readwrite" feature, and I've no idea how to fix it. Thus I report 
this question and I'm sorry
for this long description email.

relevant linux/driver/md commit: 260fa034ef7a4ff8b73068b48ac497edd5217491

   My question: If the array has been set the MD_CLOSING flag, although 
hasn't removed the sysfs
folder because sysfs_remove_group() wasn't invoked, and now, how should 
mdadm continue to
control this 'readonly' array?
   Of course, once we cannot operate the array, the 'readwrite' feature 
would be never worked.

Test step:
# ./mdadm -CR /dev/md0 -b internal -l1 -n2 /dev/loop[0-1] --assume-clean
# ./mdadm -o /dev/mdX

# in md.h
enum mddev_flags {
         MD_ARRAY_FIRST_USE,     /* First use of array, needs 
initialization */
         MD_CLOSING,             /* If set, we are closing the array, do 
not open it then */

1. In mdadm tool:
   the func: Manage_ro(dv->devname, mdfd, -1) would be never invoked 
once the array has been
set 'readonly' before. the open_mddev() cannot get a valid file 
descriptor any more. Most of mdadm
commands would be failure, I have to execute the "echo clear > 
/sys/block/mdX/md/array_state".

# refer to mdadm.c
... ...
static int misc_list(struct mddev_dev *devlist,
                      struct mddev_ident *ident,
                      char *dump_directory,
                      struct supertype *ss, struct context *c)
{
... ...
                 switch(dv->devname[0] == '/') {
                         case 0:
                                 mdfd = open_dev(dv->devname);
                                 if (mdfd >= 0) break;
                         case 1:
                                 mdfd = open_mddev(dv->devname, 1);
                 }
                 if (mdfd>=0) {
                         switch(dv->disposition) {
                         case 'R':
                                 c->runstop = 1;
                                 rv |= Manage_run(dv->devname, mdfd, c); 
break;
                         case 'S':
                                 rv |= Manage_stop(dv->devname, mdfd, 
c->verbose, 0); break;
                         case 'o':
                                 rv |= Manage_ro(dv->devname, mdfd, 1); 
break;
                         case 'w':
                                 rv |= Manage_ro(dv->devname, mdfd, -1); 
break;
                         }
2. in md driver:
For readonly, the code path is:
ioctl(fd, STOP_ARRAY_RO, NULL)  - - > set_bit(MD_CLOSING, &mddev->flags) 
- - > md_set_readonly()

cut a piece of code: - -> md_set_readonly() of md.c:
... ...
         if (mddev->pers) {
                 __md_stop_writes(mddev);

                 err  = -ENXIO;
                 if (mddev->ro==1)
                         goto out;
                 mddev->ro = 1;
                 set_disk_ro(mddev->gendisk, 1);
                 clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
                 set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);   - - > 
I think it did nothing once readonly has been set.
                 md_wakeup_thread(mddev->thread);
                 sysfs_notify_dirent_safe(mddev->sysfs_state);
                 err = 0;
... ...

Thanks for your patience,
-Zhilong

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: mdadm: one question about the readonly and readwrite feature
  2017-03-22 12:00 mdadm: one question about the readonly and readwrite feature Zhilong Liu
@ 2017-03-22 21:55 ` NeilBrown
  2017-03-23  1:54   ` Guoqing Jiang
  0 siblings, 1 reply; 11+ messages in thread
From: NeilBrown @ 2017-03-22 21:55 UTC (permalink / raw)
  To: Zhilong Liu, Jes Sorensen; +Cc: linux-raid

[-- Attachment #1: Type: text/plain, Size: 4103 bytes --]

On Wed, Mar 22 2017, Zhilong Liu wrote:

> Hi, Neil;
>
>    Excuse me, according to read 'mdadm/tests/ToTest', I'm a little 
> confused about "readonly"
> and "readwrite" feature, and I've no idea how to fix it. Thus I report 
> this question and I'm sorry
> for this long description email.
>
> relevant linux/driver/md commit: 260fa034ef7a4ff8b73068b48ac497edd5217491
>
>    My question: If the array has been set the MD_CLOSING flag, although 
> hasn't removed the sysfs
> folder because sysfs_remove_group() wasn't invoked, and now, how should 
> mdadm continue to
> control this 'readonly' array?

MD_CLOSING should only be set for a short period or time to avoid
certain races.  After the operation that set it completes, it should be
cleared.
It looks like this is a bug that was introduced in
Commit: af8d8e6f0315 ("md: changes for MD_STILL_CLOSED flag")
when MD_STILL_CLOSED was renamed to MD_CLOSING.

If we already had the tests for readonly/readwrite that you are working
on, we might have caught the bug earlier - so I'm glad you are working
on these tests..

Thanks,
NeilBrown


>    Of course, once we cannot operate the array, the 'readwrite' feature 
> would be never worked.
>
> Test step:
> # ./mdadm -CR /dev/md0 -b internal -l1 -n2 /dev/loop[0-1] --assume-clean
> # ./mdadm -o /dev/mdX
>
> # in md.h
> enum mddev_flags {
>          MD_ARRAY_FIRST_USE,     /* First use of array, needs 
> initialization */
>          MD_CLOSING,             /* If set, we are closing the array, do 
> not open it then */
>
> 1. In mdadm tool:
>    the func: Manage_ro(dv->devname, mdfd, -1) would be never invoked 
> once the array has been
> set 'readonly' before. the open_mddev() cannot get a valid file 
> descriptor any more. Most of mdadm
> commands would be failure, I have to execute the "echo clear > 
> /sys/block/mdX/md/array_state".
>
> # refer to mdadm.c
> ... ...
> static int misc_list(struct mddev_dev *devlist,
>                       struct mddev_ident *ident,
>                       char *dump_directory,
>                       struct supertype *ss, struct context *c)
> {
> ... ...
>                  switch(dv->devname[0] == '/') {
>                          case 0:
>                                  mdfd = open_dev(dv->devname);
>                                  if (mdfd >= 0) break;
>                          case 1:
>                                  mdfd = open_mddev(dv->devname, 1);
>                  }
>                  if (mdfd>=0) {
>                          switch(dv->disposition) {
>                          case 'R':
>                                  c->runstop = 1;
>                                  rv |= Manage_run(dv->devname, mdfd, c); 
> break;
>                          case 'S':
>                                  rv |= Manage_stop(dv->devname, mdfd, 
> c->verbose, 0); break;
>                          case 'o':
>                                  rv |= Manage_ro(dv->devname, mdfd, 1); 
> break;
>                          case 'w':
>                                  rv |= Manage_ro(dv->devname, mdfd, -1); 
> break;
>                          }
> 2. in md driver:
> For readonly, the code path is:
> ioctl(fd, STOP_ARRAY_RO, NULL)  - - > set_bit(MD_CLOSING, &mddev->flags) 
> - - > md_set_readonly()
>
> cut a piece of code: - -> md_set_readonly() of md.c:
> ... ...
>          if (mddev->pers) {
>                  __md_stop_writes(mddev);
>
>                  err  = -ENXIO;
>                  if (mddev->ro==1)
>                          goto out;
>                  mddev->ro = 1;
>                  set_disk_ro(mddev->gendisk, 1);
>                  clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
>                  set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);   - - > 
> I think it did nothing once readonly has been set.
>                  md_wakeup_thread(mddev->thread);
>                  sysfs_notify_dirent_safe(mddev->sysfs_state);
>                  err = 0;
> ... ...
>
> Thanks for your patience,
> -Zhilong

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: mdadm: one question about the readonly and readwrite feature
  2017-03-22 21:55 ` NeilBrown
@ 2017-03-23  1:54   ` Guoqing Jiang
  2017-03-23  2:26     ` Guoqing Jiang
  2017-03-23  3:42     ` NeilBrown
  0 siblings, 2 replies; 11+ messages in thread
From: Guoqing Jiang @ 2017-03-23  1:54 UTC (permalink / raw)
  To: NeilBrown, Zhilong Liu, Jes Sorensen; +Cc: linux-raid



On 03/23/2017 05:55 AM, NeilBrown wrote:
> On Wed, Mar 22 2017, Zhilong Liu wrote:
>
>> Hi, Neil;
>>
>>     Excuse me, according to read 'mdadm/tests/ToTest', I'm a little
>> confused about "readonly"
>> and "readwrite" feature, and I've no idea how to fix it. Thus I report
>> this question and I'm sorry
>> for this long description email.
>>
>> relevant linux/driver/md commit: 260fa034ef7a4ff8b73068b48ac497edd5217491
>>
>>     My question: If the array has been set the MD_CLOSING flag, although
>> hasn't removed the sysfs
>> folder because sysfs_remove_group() wasn't invoked, and now, how should
>> mdadm continue to
>> control this 'readonly' array?
> MD_CLOSING should only be set for a short period or time to avoid
> certain races.  After the operation that set it completes, it should be
> cleared.
> It looks like this is a bug that was introduced in
> Commit: af8d8e6f0315 ("md: changes for MD_STILL_CLOSED flag")
> when MD_STILL_CLOSED was renamed to MD_CLOSING.

I guess it is because we set MD_CLOSING for STOP_ARRAY_RO cmd, then commit
af8d8e6f0315 ("md: changes for MD_STILL_CLOSED flag") did below changes:

@@ -7075,9 +7073,13 @@ static int md_open(struct block_device *bdev, 
fmode_t mode)
         if ((err = mutex_lock_interruptible(&mddev->open_mutex)))
                 goto out;

+       if (test_bit(MD_CLOSING, &mddev->flags)) {
+               mutex_unlock(&mddev->open_mutex);
+               return -ENODEV;
+       }

Maybe we need to differentiate "STOP_ARRAY" and "STOP_ARRAY_RO", or 
revert above
changes.

Thanks,
Guoqing

>
> If we already had the tests for readonly/readwrite that you are working
> on, we might have caught the bug earlier - so I'm glad you are working
> on these tests..
>
> Thanks,
> NeilBrown
>
>
>>     Of course, once we cannot operate the array, the 'readwrite' feature
>> would be never worked.
>>
>> Test step:
>> # ./mdadm -CR /dev/md0 -b internal -l1 -n2 /dev/loop[0-1] --assume-clean
>> # ./mdadm -o /dev/mdX
>>
>> # in md.h
>> enum mddev_flags {
>>           MD_ARRAY_FIRST_USE,     /* First use of array, needs
>> initialization */
>>           MD_CLOSING,             /* If set, we are closing the array, do
>> not open it then */
>>
>> 1. In mdadm tool:
>>     the func: Manage_ro(dv->devname, mdfd, -1) would be never invoked
>> once the array has been
>> set 'readonly' before. the open_mddev() cannot get a valid file
>> descriptor any more. Most of mdadm
>> commands would be failure, I have to execute the "echo clear >
>> /sys/block/mdX/md/array_state".
>>
>> # refer to mdadm.c
>> ... ...
>> static int misc_list(struct mddev_dev *devlist,
>>                        struct mddev_ident *ident,
>>                        char *dump_directory,
>>                        struct supertype *ss, struct context *c)
>> {
>> ... ...
>>                   switch(dv->devname[0] == '/') {
>>                           case 0:
>>                                   mdfd = open_dev(dv->devname);
>>                                   if (mdfd >= 0) break;
>>                           case 1:
>>                                   mdfd = open_mddev(dv->devname, 1);
>>                   }
>>                   if (mdfd>=0) {
>>                           switch(dv->disposition) {
>>                           case 'R':
>>                                   c->runstop = 1;
>>                                   rv |= Manage_run(dv->devname, mdfd, c);
>> break;
>>                           case 'S':
>>                                   rv |= Manage_stop(dv->devname, mdfd,
>> c->verbose, 0); break;
>>                           case 'o':
>>                                   rv |= Manage_ro(dv->devname, mdfd, 1);
>> break;
>>                           case 'w':
>>                                   rv |= Manage_ro(dv->devname, mdfd, -1);
>> break;
>>                           }
>> 2. in md driver:
>> For readonly, the code path is:
>> ioctl(fd, STOP_ARRAY_RO, NULL)  - - > set_bit(MD_CLOSING, &mddev->flags)
>> - - > md_set_readonly()
>>
>> cut a piece of code: - -> md_set_readonly() of md.c:
>> ... ...
>>           if (mddev->pers) {
>>                   __md_stop_writes(mddev);
>>
>>                   err  = -ENXIO;
>>                   if (mddev->ro==1)
>>                           goto out;
>>                   mddev->ro = 1;
>>                   set_disk_ro(mddev->gendisk, 1);
>>                   clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
>>                   set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);   - - >
>> I think it did nothing once readonly has been set.
>>                   md_wakeup_thread(mddev->thread);
>>                   sysfs_notify_dirent_safe(mddev->sysfs_state);
>>                   err = 0;
>> ... ...
>>
>> Thanks for your patience,
>> -Zhilong


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: mdadm: one question about the readonly and readwrite feature
  2017-03-23  1:54   ` Guoqing Jiang
@ 2017-03-23  2:26     ` Guoqing Jiang
  2017-03-23  3:42     ` NeilBrown
  1 sibling, 0 replies; 11+ messages in thread
From: Guoqing Jiang @ 2017-03-23  2:26 UTC (permalink / raw)
  To: NeilBrown, Zhilong Liu, Jes Sorensen; +Cc: linux-raid



On 03/23/2017 09:54 AM, Guoqing Jiang wrote:
>
>
> On 03/23/2017 05:55 AM, NeilBrown wrote:
>> On Wed, Mar 22 2017, Zhilong Liu wrote:
>>
>>> Hi, Neil;
>>>
>>>     Excuse me, according to read 'mdadm/tests/ToTest', I'm a little
>>> confused about "readonly"
>>> and "readwrite" feature, and I've no idea how to fix it. Thus I report
>>> this question and I'm sorry
>>> for this long description email.
>>>
>>> relevant linux/driver/md commit: 
>>> 260fa034ef7a4ff8b73068b48ac497edd5217491
>>>
>>>     My question: If the array has been set the MD_CLOSING flag, 
>>> although
>>> hasn't removed the sysfs
>>> folder because sysfs_remove_group() wasn't invoked, and now, how should
>>> mdadm continue to
>>> control this 'readonly' array?
>> MD_CLOSING should only be set for a short period or time to avoid
>> certain races.  After the operation that set it completes, it should be
>> cleared.
>> It looks like this is a bug that was introduced in
>> Commit: af8d8e6f0315 ("md: changes for MD_STILL_CLOSED flag")
>> when MD_STILL_CLOSED was renamed to MD_CLOSING.
>
> I guess it is because we set MD_CLOSING for STOP_ARRAY_RO cmd, then 
> commit
> af8d8e6f0315 ("md: changes for MD_STILL_CLOSED flag") did below changes:
>
> @@ -7075,9 +7073,13 @@ static int md_open(struct block_device *bdev, 
> fmode_t mode)
>         if ((err = mutex_lock_interruptible(&mddev->open_mutex)))
>                 goto out;
>
> +       if (test_bit(MD_CLOSING, &mddev->flags)) {
> +               mutex_unlock(&mddev->open_mutex);
> +               return -ENODEV;
> +       }
>
> Maybe we need to differentiate "STOP_ARRAY" and "STOP_ARRAY_RO", or 
> revert above
> changes.

Or maybe something like below (only compile test).

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 42e68b2e0b41..c40e863fe191 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -65,6 +65,7 @@
  #include <linux/raid/md_p.h>
  #include <linux/raid/md_u.h>
  #include <linux/slab.h>
+#include <linux/genhd.h>
  #include <trace/events/block.h>
  #include "md.h"
  #include "bitmap.h"
@@ -7237,7 +7238,7 @@ static int md_open(struct block_device *bdev, 
fmode_t mode)
         if ((err = mutex_lock_interruptible(&mddev->open_mutex)))
                 goto out;

-       if (test_bit(MD_CLOSING, &mddev->flags)) {
+       if (!get_disk_ro(mddev->gendisk) && test_bit(MD_CLOSING, 
&mddev->flags)) {
                 mutex_unlock(&mddev->open_mutex);
                 err = -ENODEV;
                 goto out;

Thanks,
Guoqing

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: mdadm: one question about the readonly and readwrite feature
  2017-03-23  1:54   ` Guoqing Jiang
  2017-03-23  2:26     ` Guoqing Jiang
@ 2017-03-23  3:42     ` NeilBrown
  2017-03-23  3:54       ` Zhilong Liu
  2017-03-23  6:50       ` Zhilong Liu
  1 sibling, 2 replies; 11+ messages in thread
From: NeilBrown @ 2017-03-23  3:42 UTC (permalink / raw)
  To: Guoqing Jiang, Zhilong Liu, Jes Sorensen; +Cc: linux-raid

[-- Attachment #1: Type: text/plain, Size: 2965 bytes --]

On Thu, Mar 23 2017, Guoqing Jiang wrote:

> On 03/23/2017 05:55 AM, NeilBrown wrote:
>> On Wed, Mar 22 2017, Zhilong Liu wrote:
>>
>>> Hi, Neil;
>>>
>>>     Excuse me, according to read 'mdadm/tests/ToTest', I'm a little
>>> confused about "readonly"
>>> and "readwrite" feature, and I've no idea how to fix it. Thus I report
>>> this question and I'm sorry
>>> for this long description email.
>>>
>>> relevant linux/driver/md commit: 260fa034ef7a4ff8b73068b48ac497edd5217491
>>>
>>>     My question: If the array has been set the MD_CLOSING flag, although
>>> hasn't removed the sysfs
>>> folder because sysfs_remove_group() wasn't invoked, and now, how should
>>> mdadm continue to
>>> control this 'readonly' array?
>> MD_CLOSING should only be set for a short period or time to avoid
>> certain races.  After the operation that set it completes, it should be
>> cleared.
>> It looks like this is a bug that was introduced in
>> Commit: af8d8e6f0315 ("md: changes for MD_STILL_CLOSED flag")
>> when MD_STILL_CLOSED was renamed to MD_CLOSING.
>
> I guess it is because we set MD_CLOSING for STOP_ARRAY_RO cmd, then commit
> af8d8e6f0315 ("md: changes for MD_STILL_CLOSED flag") did below changes:
>
> @@ -7075,9 +7073,13 @@ static int md_open(struct block_device *bdev, 
> fmode_t mode)
>          if ((err = mutex_lock_interruptible(&mddev->open_mutex)))
>                  goto out;
>
> +       if (test_bit(MD_CLOSING, &mddev->flags)) {
> +               mutex_unlock(&mddev->open_mutex);
> +               return -ENODEV;
> +       }
>
> Maybe we need to differentiate "STOP_ARRAY" and "STOP_ARRAY_RO", or 
> revert above
> changes.
>

No, your missing the point.
What you describe above would change the effect of what MD_CLOSING does.
We don't want to change it.  It is good.
The problem is that when it has finished doing what it needs to do,
we aren't clearing it. That is simply a bug.

So something like this is needed.

NeilBrown

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 7e168ff1ae90..567aba246497 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -6934,6 +6934,7 @@ static int md_ioctl(struct block_device *bdev, fmode_t mode,
 	void __user *argp = (void __user *)arg;
 	struct mddev *mddev = NULL;
 	int ro;
+	bool did_set_md_closing = false;
 
 	if (!md_ioctl_valid(cmd))
 		return -ENOTTY;
@@ -7023,7 +7024,9 @@ static int md_ioctl(struct block_device *bdev, fmode_t mode,
 			err = -EBUSY;
 			goto out;
 		}
+		WARN_ON_ONCE(test_bit(MD_CLOSING, &mddev->flags));
 		set_bit(MD_CLOSING, &mddev->flags);
+		did_set_md_closing = true;
 		mutex_unlock(&mddev->open_mutex);
 		sync_blockdev(bdev);
 	}
@@ -7216,6 +7219,8 @@ static int md_ioctl(struct block_device *bdev, fmode_t mode,
 		mddev->hold_active = 0;
 	mddev_unlock(mddev);
 out:
+	if (did_set_md_closing)
+		clear_bit(MD_CLOSING, &mddev->flags);
 	return err;
 }
 #ifdef CONFIG_COMPAT

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: mdadm: one question about the readonly and readwrite feature
  2017-03-23  3:42     ` NeilBrown
@ 2017-03-23  3:54       ` Zhilong Liu
  2017-03-23  6:50       ` Zhilong Liu
  1 sibling, 0 replies; 11+ messages in thread
From: Zhilong Liu @ 2017-03-23  3:54 UTC (permalink / raw)
  To: NeilBrown, Guoqing Jiang, Jes Sorensen; +Cc: linux-raid


On 03/23/2017 11:42 AM, NeilBrown wrote:
> On Thu, Mar 23 2017, Guoqing Jiang wrote:
>
>> On 03/23/2017 05:55 AM, NeilBrown wrote:
>>> On Wed, Mar 22 2017, Zhilong Liu wrote:
>>>
>>>> Hi, Neil;
>>>>
>>>>      Excuse me, according to read 'mdadm/tests/ToTest', I'm a little
>>>> confused about "readonly"
>>>> and "readwrite" feature, and I've no idea how to fix it. Thus I report
>>>> this question and I'm sorry
>>>> for this long description email.
>>>>
>>>> relevant linux/driver/md commit: 260fa034ef7a4ff8b73068b48ac497edd5217491
>>>>
>>>>      My question: If the array has been set the MD_CLOSING flag, although
>>>> hasn't removed the sysfs
>>>> folder because sysfs_remove_group() wasn't invoked, and now, how should
>>>> mdadm continue to
>>>> control this 'readonly' array?
>>> MD_CLOSING should only be set for a short period or time to avoid
>>> certain races.  After the operation that set it completes, it should be
>>> cleared.
>>> It looks like this is a bug that was introduced in
>>> Commit: af8d8e6f0315 ("md: changes for MD_STILL_CLOSED flag")
>>> when MD_STILL_CLOSED was renamed to MD_CLOSING.
>> I guess it is because we set MD_CLOSING for STOP_ARRAY_RO cmd, then commit
>> af8d8e6f0315 ("md: changes for MD_STILL_CLOSED flag") did below changes:
>>
>> @@ -7075,9 +7073,13 @@ static int md_open(struct block_device *bdev,
>> fmode_t mode)
>>           if ((err = mutex_lock_interruptible(&mddev->open_mutex)))
>>                   goto out;
>>
>> +       if (test_bit(MD_CLOSING, &mddev->flags)) {
>> +               mutex_unlock(&mddev->open_mutex);
>> +               return -ENODEV;
>> +       }
>>
>> Maybe we need to differentiate "STOP_ARRAY" and "STOP_ARRAY_RO", or
>> revert above
>> changes.
>>
> No, your missing the point.
> What you describe above would change the effect of what MD_CLOSING does.
> We don't want to change it.  It is good.
> The problem is that when it has finished doing what it needs to do,
> we aren't clearing it. That is simply a bug.
>
> So something like this is needed.

I can understand this method, :-). So I would do further learning on 
this code path.
I would response on it later. Thanks very much.

Thanks,
-Zhilong
> NeilBrown
>
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 7e168ff1ae90..567aba246497 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -6934,6 +6934,7 @@ static int md_ioctl(struct block_device *bdev, fmode_t mode,
>   	void __user *argp = (void __user *)arg;
>   	struct mddev *mddev = NULL;
>   	int ro;
> +	bool did_set_md_closing = false;
>   
>   	if (!md_ioctl_valid(cmd))
>   		return -ENOTTY;
> @@ -7023,7 +7024,9 @@ static int md_ioctl(struct block_device *bdev, fmode_t mode,
>   			err = -EBUSY;
>   			goto out;
>   		}
> +		WARN_ON_ONCE(test_bit(MD_CLOSING, &mddev->flags));
>   		set_bit(MD_CLOSING, &mddev->flags);
> +		did_set_md_closing = true;
>   		mutex_unlock(&mddev->open_mutex);
>   		sync_blockdev(bdev);
>   	}
> @@ -7216,6 +7219,8 @@ static int md_ioctl(struct block_device *bdev, fmode_t mode,
>   		mddev->hold_active = 0;
>   	mddev_unlock(mddev);
>   out:
> +	if (did_set_md_closing)
> +		clear_bit(MD_CLOSING, &mddev->flags);
>   	return err;
>   }
>   #ifdef CONFIG_COMPAT


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: mdadm: one question about the readonly and readwrite feature
  2017-03-23  3:42     ` NeilBrown
  2017-03-23  3:54       ` Zhilong Liu
@ 2017-03-23  6:50       ` Zhilong Liu
  2017-03-23  7:06         ` NeilBrown
  1 sibling, 1 reply; 11+ messages in thread
From: Zhilong Liu @ 2017-03-23  6:50 UTC (permalink / raw)
  To: NeilBrown, Guoqing Jiang, Jes Sorensen; +Cc: linux-raid


On 03/23/2017 11:42 AM, NeilBrown wrote:
> On Thu, Mar 23 2017, Guoqing Jiang wrote:
>
>> On 03/23/2017 05:55 AM, NeilBrown wrote:
>>> On Wed, Mar 22 2017, Zhilong Liu wrote:
>>>
>>>> Hi, Neil;
>>>>
>>>>      Excuse me, according to read 'mdadm/tests/ToTest', I'm a little
>>>> confused about "readonly"
>>>> and "readwrite" feature, and I've no idea how to fix it. Thus I report
>>>> this question and I'm sorry
>>>> for this long description email.
>>>>
>>>> relevant linux/driver/md commit: 260fa034ef7a4ff8b73068b48ac497edd5217491
>>>>
>>>>      My question: If the array has been set the MD_CLOSING flag, although
>>>> hasn't removed the sysfs
>>>> folder because sysfs_remove_group() wasn't invoked, and now, how should
>>>> mdadm continue to
>>>> control this 'readonly' array?
>>> MD_CLOSING should only be set for a short period or time to avoid
>>> certain races.  After the operation that set it completes, it should be
>>> cleared.
>>> It looks like this is a bug that was introduced in
>>> Commit: af8d8e6f0315 ("md: changes for MD_STILL_CLOSED flag")
>>> when MD_STILL_CLOSED was renamed to MD_CLOSING.
>> I guess it is because we set MD_CLOSING for STOP_ARRAY_RO cmd, then commit
>> af8d8e6f0315 ("md: changes for MD_STILL_CLOSED flag") did below changes:
>>
>> @@ -7075,9 +7073,13 @@ static int md_open(struct block_device *bdev,
>> fmode_t mode)
>>           if ((err = mutex_lock_interruptible(&mddev->open_mutex)))
>>                   goto out;
>>
>> +       if (test_bit(MD_CLOSING, &mddev->flags)) {
>> +               mutex_unlock(&mddev->open_mutex);
>> +               return -ENODEV;
>> +       }
>>
>> Maybe we need to differentiate "STOP_ARRAY" and "STOP_ARRAY_RO", or
>> revert above
>> changes.
>>
> No, your missing the point.
> What you describe above would change the effect of what MD_CLOSING does.
> We don't want to change it.  It is good.
> The problem is that when it has finished doing what it needs to do,
> we aren't clearing it. That is simply a bug.
>
> So something like this is needed.

Sorry again, I have another question in md_set_readonly() of md.c
Although it has stopped all md writes operations, and I still prefer
to do set_disk_ro after md_wakeup_thread. refer to following code.

cut the piece of code:
... ...
static int md_set_readonly(struct mddev *mddev, struct block_device *bdev)
{
... ...
         if (mddev->pers) {
                 __md_stop_writes(mddev);

                 err  = -ENXIO;
                 if (mddev->ro==1)
                         goto out;
                 mddev->ro = 1;
                 set_disk_ro(mddev->gendisk, 1);         ## --> I prefer 
to do it after md_wakeup_thread.
                 clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
                 set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
                 md_wakeup_thread(mddev->thread);     ## --> do 
set_disk_ro after this step.
                 sysfs_notify_dirent_safe(mddev->sysfs_state);
                 err = 0;
         }
out:
         mutex_unlock(&mddev->open_mutex);
         return err;
}

Thanks,
-Zhilong
> NeilBrown
>
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 7e168ff1ae90..567aba246497 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -6934,6 +6934,7 @@ static int md_ioctl(struct block_device *bdev, fmode_t mode,
>   	void __user *argp = (void __user *)arg;
>   	struct mddev *mddev = NULL;
>   	int ro;
> +	bool did_set_md_closing = false;
>   
>   	if (!md_ioctl_valid(cmd))
>   		return -ENOTTY;
> @@ -7023,7 +7024,9 @@ static int md_ioctl(struct block_device *bdev, fmode_t mode,
>   			err = -EBUSY;
>   			goto out;
>   		}
> +		WARN_ON_ONCE(test_bit(MD_CLOSING, &mddev->flags));
>   		set_bit(MD_CLOSING, &mddev->flags);
> +		did_set_md_closing = true;
>   		mutex_unlock(&mddev->open_mutex);
>   		sync_blockdev(bdev);
>   	}
> @@ -7216,6 +7219,8 @@ static int md_ioctl(struct block_device *bdev, fmode_t mode,
>   		mddev->hold_active = 0;
>   	mddev_unlock(mddev);
>   out:
> +	if (did_set_md_closing)
> +		clear_bit(MD_CLOSING, &mddev->flags);
>   	return err;
>   }
>   #ifdef CONFIG_COMPAT


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: mdadm: one question about the readonly and readwrite feature
  2017-03-23  6:50       ` Zhilong Liu
@ 2017-03-23  7:06         ` NeilBrown
  2017-03-23  8:14           ` Zhilong Liu
  0 siblings, 1 reply; 11+ messages in thread
From: NeilBrown @ 2017-03-23  7:06 UTC (permalink / raw)
  To: Zhilong Liu, Guoqing Jiang, Jes Sorensen; +Cc: linux-raid

[-- Attachment #1: Type: text/plain, Size: 4657 bytes --]

On Thu, Mar 23 2017, Zhilong Liu wrote:

> On 03/23/2017 11:42 AM, NeilBrown wrote:
>> On Thu, Mar 23 2017, Guoqing Jiang wrote:
>>
>>> On 03/23/2017 05:55 AM, NeilBrown wrote:
>>>> On Wed, Mar 22 2017, Zhilong Liu wrote:
>>>>
>>>>> Hi, Neil;
>>>>>
>>>>>      Excuse me, according to read 'mdadm/tests/ToTest', I'm a little
>>>>> confused about "readonly"
>>>>> and "readwrite" feature, and I've no idea how to fix it. Thus I report
>>>>> this question and I'm sorry
>>>>> for this long description email.
>>>>>
>>>>> relevant linux/driver/md commit: 260fa034ef7a4ff8b73068b48ac497edd5217491
>>>>>
>>>>>      My question: If the array has been set the MD_CLOSING flag, although
>>>>> hasn't removed the sysfs
>>>>> folder because sysfs_remove_group() wasn't invoked, and now, how should
>>>>> mdadm continue to
>>>>> control this 'readonly' array?
>>>> MD_CLOSING should only be set for a short period or time to avoid
>>>> certain races.  After the operation that set it completes, it should be
>>>> cleared.
>>>> It looks like this is a bug that was introduced in
>>>> Commit: af8d8e6f0315 ("md: changes for MD_STILL_CLOSED flag")
>>>> when MD_STILL_CLOSED was renamed to MD_CLOSING.
>>> I guess it is because we set MD_CLOSING for STOP_ARRAY_RO cmd, then commit
>>> af8d8e6f0315 ("md: changes for MD_STILL_CLOSED flag") did below changes:
>>>
>>> @@ -7075,9 +7073,13 @@ static int md_open(struct block_device *bdev,
>>> fmode_t mode)
>>>           if ((err = mutex_lock_interruptible(&mddev->open_mutex)))
>>>                   goto out;
>>>
>>> +       if (test_bit(MD_CLOSING, &mddev->flags)) {
>>> +               mutex_unlock(&mddev->open_mutex);
>>> +               return -ENODEV;
>>> +       }
>>>
>>> Maybe we need to differentiate "STOP_ARRAY" and "STOP_ARRAY_RO", or
>>> revert above
>>> changes.
>>>
>> No, your missing the point.
>> What you describe above would change the effect of what MD_CLOSING does.
>> We don't want to change it.  It is good.
>> The problem is that when it has finished doing what it needs to do,
>> we aren't clearing it. That is simply a bug.
>>
>> So something like this is needed.
>
> Sorry again, I have another question in md_set_readonly() of md.c
> Although it has stopped all md writes operations, and I still prefer
> to do set_disk_ro after md_wakeup_thread. refer to following code.

Why?
How do the actions performed by set_disk_ro() interact with the actions
performed by md_wakeup_thread()?

I'm not saying the code shouldn't be changed, just that there needs to
be a clear explanation of the benefit.

NeilBrown


>
> cut the piece of code:
> ... ...
> static int md_set_readonly(struct mddev *mddev, struct block_device *bdev)
> {
> ... ...
>          if (mddev->pers) {
>                  __md_stop_writes(mddev);
>
>                  err  = -ENXIO;
>                  if (mddev->ro==1)
>                          goto out;
>                  mddev->ro = 1;
>                  set_disk_ro(mddev->gendisk, 1);         ## --> I prefer 
> to do it after md_wakeup_thread.
>                  clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
>                  set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
>                  md_wakeup_thread(mddev->thread);     ## --> do 
> set_disk_ro after this step.
>                  sysfs_notify_dirent_safe(mddev->sysfs_state);
>                  err = 0;
>          }
> out:
>          mutex_unlock(&mddev->open_mutex);
>          return err;
> }
>
> Thanks,
> -Zhilong
>> NeilBrown
>>
>> diff --git a/drivers/md/md.c b/drivers/md/md.c
>> index 7e168ff1ae90..567aba246497 100644
>> --- a/drivers/md/md.c
>> +++ b/drivers/md/md.c
>> @@ -6934,6 +6934,7 @@ static int md_ioctl(struct block_device *bdev, fmode_t mode,
>>   	void __user *argp = (void __user *)arg;
>>   	struct mddev *mddev = NULL;
>>   	int ro;
>> +	bool did_set_md_closing = false;
>>   
>>   	if (!md_ioctl_valid(cmd))
>>   		return -ENOTTY;
>> @@ -7023,7 +7024,9 @@ static int md_ioctl(struct block_device *bdev, fmode_t mode,
>>   			err = -EBUSY;
>>   			goto out;
>>   		}
>> +		WARN_ON_ONCE(test_bit(MD_CLOSING, &mddev->flags));
>>   		set_bit(MD_CLOSING, &mddev->flags);
>> +		did_set_md_closing = true;
>>   		mutex_unlock(&mddev->open_mutex);
>>   		sync_blockdev(bdev);
>>   	}
>> @@ -7216,6 +7219,8 @@ static int md_ioctl(struct block_device *bdev, fmode_t mode,
>>   		mddev->hold_active = 0;
>>   	mddev_unlock(mddev);
>>   out:
>> +	if (did_set_md_closing)
>> +		clear_bit(MD_CLOSING, &mddev->flags);
>>   	return err;
>>   }
>>   #ifdef CONFIG_COMPAT

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: mdadm: one question about the readonly and readwrite feature
  2017-03-23  7:06         ` NeilBrown
@ 2017-03-23  8:14           ` Zhilong Liu
  2017-03-24  0:28             ` NeilBrown
  0 siblings, 1 reply; 11+ messages in thread
From: Zhilong Liu @ 2017-03-23  8:14 UTC (permalink / raw)
  To: NeilBrown, Guoqing Jiang, Jes Sorensen; +Cc: linux-raid


On 03/23/2017 03:06 PM, NeilBrown wrote:
> On Thu, Mar 23 2017, Zhilong Liu wrote:
>
>> On 03/23/2017 11:42 AM, NeilBrown wrote:
>>> On Thu, Mar 23 2017, Guoqing Jiang wrote:
>>>
>>>> On 03/23/2017 05:55 AM, NeilBrown wrote:
>>>>> On Wed, Mar 22 2017, Zhilong Liu wrote:
>>>>>
>>>>>> Hi, Neil;
>>>>>>
>>>>>>       Excuse me, according to read 'mdadm/tests/ToTest', I'm a little
>>>>>> confused about "readonly"
>>>>>> and "readwrite" feature, and I've no idea how to fix it. Thus I report
>>>>>> this question and I'm sorry
>>>>>> for this long description email.
>>>>>>
>>>>>> relevant linux/driver/md commit: 260fa034ef7a4ff8b73068b48ac497edd5217491
>>>>>>
>>>>>>       My question: If the array has been set the MD_CLOSING flag, although
>>>>>> hasn't removed the sysfs
>>>>>> folder because sysfs_remove_group() wasn't invoked, and now, how should
>>>>>> mdadm continue to
>>>>>> control this 'readonly' array?
>>>>> MD_CLOSING should only be set for a short period or time to avoid
>>>>> certain races.  After the operation that set it completes, it should be
>>>>> cleared.
>>>>> It looks like this is a bug that was introduced in
>>>>> Commit: af8d8e6f0315 ("md: changes for MD_STILL_CLOSED flag")
>>>>> when MD_STILL_CLOSED was renamed to MD_CLOSING.
>>>> I guess it is because we set MD_CLOSING for STOP_ARRAY_RO cmd, then commit
>>>> af8d8e6f0315 ("md: changes for MD_STILL_CLOSED flag") did below changes:
>>>>
>>>> @@ -7075,9 +7073,13 @@ static int md_open(struct block_device *bdev,
>>>> fmode_t mode)
>>>>            if ((err = mutex_lock_interruptible(&mddev->open_mutex)))
>>>>                    goto out;
>>>>
>>>> +       if (test_bit(MD_CLOSING, &mddev->flags)) {
>>>> +               mutex_unlock(&mddev->open_mutex);
>>>> +               return -ENODEV;
>>>> +       }
>>>>
>>>> Maybe we need to differentiate "STOP_ARRAY" and "STOP_ARRAY_RO", or
>>>> revert above
>>>> changes.
>>>>
>>> No, your missing the point.
>>> What you describe above would change the effect of what MD_CLOSING does.
>>> We don't want to change it.  It is good.
>>> The problem is that when it has finished doing what it needs to do,
>>> we aren't clearing it. That is simply a bug.
>>>
>>> So something like this is needed.
>> Sorry again, I have another question in md_set_readonly() of md.c
>> Although it has stopped all md writes operations, and I still prefer
>> to do set_disk_ro after md_wakeup_thread. refer to following code.
> Why?
> How do the actions performed by set_disk_ro() interact with the actions
> performed by md_wakeup_thread()?
>
> I'm not saying the code shouldn't be changed, just that there needs to
> be a clear explanation of the benefit.

here just according to my understanding for readonly code path, welcome
the correction in my comments if I misunderstand this feature, :-).
... ...
static int md_set_readonly(struct mddev *mddev, struct block_device *bdev)
{
         int err = 0;
         int did_freeze = 0;

         if (!test_bit(MD_RECOVERY_FROZEN, &mddev->recovery)) {
                 did_freeze = 1;
                 set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);  --> 
here has set bit as FROZEN
                 md_wakeup_thread(mddev->thread);
         }
... ...

         if (mddev->pers) {
                 __md_stop_writes(mddev);
/*
  *  set FROZEN again, maybe can use test_and_set_bit(FROZEN) better, it 
doesn't matter.
  *  it flushed and synced all data, bitmap and superblock to array.
  */
                 err  = -ENXIO;
                 if (mddev->ro==1)
                         goto out;
                 mddev->ro = 1;
/*
  *  Here, I means that set_disk_ro until the daemon thread has 
completed all operations
  *  include of sync and recovery progress. set_disk_ro when the array 
has cleaned totally,
  *  then it would be safer.
  *  I'm not sure MD_RECOVERY_NEEDED would be running once the array has 
set_disk_ro,
  *  actually I don't know how to test this scenario, thus asked this 
question.
  */
                 set_disk_ro(mddev->gendisk, 1);
                 clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
                 set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
                 md_wakeup_thread(mddev->thread);
                 sysfs_notify_dirent_safe(mddev->sysfs_state);

Thanks very much,
-Zhilong
> NeilBrown
>
>
>> cut the piece of code:
>> ... ...
>> static int md_set_readonly(struct mddev *mddev, struct block_device *bdev)
>> {
>> ... ...
>>           if (mddev->pers) {
>>                   __md_stop_writes(mddev);
>>
>>                   err  = -ENXIO;
>>                   if (mddev->ro==1)
>>                           goto out;
>>                   mddev->ro = 1;
>>                   set_disk_ro(mddev->gendisk, 1);         ## --> I prefer
>> to do it after md_wakeup_thread.
>>                   clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
>>                   set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
>>                   md_wakeup_thread(mddev->thread);     ## --> do
>> set_disk_ro after this step.
>>                   sysfs_notify_dirent_safe(mddev->sysfs_state);
>>                   err = 0;
>>           }
>> out:
>>           mutex_unlock(&mddev->open_mutex);
>>           return err;
>> }
>>
>> Thanks,
>> -Zhilong
>>> NeilBrown
>>>
>>> diff --git a/drivers/md/md.c b/drivers/md/md.c
>>> index 7e168ff1ae90..567aba246497 100644
>>> --- a/drivers/md/md.c
>>> +++ b/drivers/md/md.c
>>> @@ -6934,6 +6934,7 @@ static int md_ioctl(struct block_device *bdev, fmode_t mode,
>>>    	void __user *argp = (void __user *)arg;
>>>    	struct mddev *mddev = NULL;
>>>    	int ro;
>>> +	bool did_set_md_closing = false;
>>>    
>>>    	if (!md_ioctl_valid(cmd))
>>>    		return -ENOTTY;
>>> @@ -7023,7 +7024,9 @@ static int md_ioctl(struct block_device *bdev, fmode_t mode,
>>>    			err = -EBUSY;
>>>    			goto out;
>>>    		}
>>> +		WARN_ON_ONCE(test_bit(MD_CLOSING, &mddev->flags));
>>>    		set_bit(MD_CLOSING, &mddev->flags);
>>> +		did_set_md_closing = true;
>>>    		mutex_unlock(&mddev->open_mutex);
>>>    		sync_blockdev(bdev);
>>>    	}
>>> @@ -7216,6 +7219,8 @@ static int md_ioctl(struct block_device *bdev, fmode_t mode,
>>>    		mddev->hold_active = 0;
>>>    	mddev_unlock(mddev);
>>>    out:
>>> +	if (did_set_md_closing)
>>> +		clear_bit(MD_CLOSING, &mddev->flags);
>>>    	return err;
>>>    }
>>>    #ifdef CONFIG_COMPAT


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: mdadm: one question about the readonly and readwrite feature
  2017-03-23  8:14           ` Zhilong Liu
@ 2017-03-24  0:28             ` NeilBrown
  2017-03-24 15:44               ` Zhilong
  0 siblings, 1 reply; 11+ messages in thread
From: NeilBrown @ 2017-03-24  0:28 UTC (permalink / raw)
  To: Zhilong Liu, Guoqing Jiang, Jes Sorensen; +Cc: linux-raid

[-- Attachment #1: Type: text/plain, Size: 3561 bytes --]

On Thu, Mar 23 2017, Zhilong Liu wrote:

> On 03/23/2017 03:06 PM, NeilBrown wrote:
>> Why?
>> How do the actions performed by set_disk_ro() interact with the actions
>> performed by md_wakeup_thread()?
>>
>> I'm not saying the code shouldn't be changed, just that there needs to
>> be a clear explanation of the benefit.
>
> here just according to my understanding for readonly code path, welcome
> the correction in my comments if I misunderstand this feature, :-).

I'm sorry if this sounds harsh, but I get the impression that you aren't
really trying to understand the code.  You are just guessing about what
things might be doing, rather than doing careful research to determine
exactly what the code does.

Do you know what "set_disk_ro()" does?  What are the consequences of
call it?  Until you know that, you cannot make any reasonable assessment
on where the call should go.

Do you know why we set MD_RECOVERY_NEEDED and wake up the thread?
You seem to expect that it might cause some write out, however it
happens just after a call to __md_stop_writes() which aims to stop all
writes happening to the array.  So that seems unlikely (but is worth
checking).

> ... ...
> static int md_set_readonly(struct mddev *mddev, struct block_device *bdev)
> {
>          int err = 0;
>          int did_freeze = 0;
>
>          if (!test_bit(MD_RECOVERY_FROZEN, &mddev->recovery)) {
>                  did_freeze = 1;
>                  set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);  --> 
> here has set bit as FROZEN
>                  md_wakeup_thread(mddev->thread);
>          }
> ... ...
>
>          if (mddev->pers) {
>                  __md_stop_writes(mddev);
> /*
>   *  set FROZEN again, maybe can use test_and_set_bit(FROZEN) better, it 
> doesn't matter.

Correct, it doesn't matter.

>   *  it flushed and synced all data, bitmap and superblock to array.

Correct again.

>   */
>                  err  = -ENXIO;
>                  if (mddev->ro==1)
>                          goto out;
>                  mddev->ro = 1;
> /*
>   *  Here, I means that set_disk_ro until the daemon thread has 
> completed all operations
>   *  include of sync and recovery progress. set_disk_ro when the array 
> has cleaned totally,
>   *  then it would be safer.

Completed which operations exactly?  __md_stop_writes() has called
md_reap_sync_thread() so there cannot still be any recovery.  It has
called pers->quiesce() so there cannot be any outstanding io.
And just moving the call to afterwards would't cause it to wait for those
supposed operations to complete.

>   *  I'm not sure MD_RECOVERY_NEEDED would be running once the array has 
> set_disk_ro,
>   *  actually I don't know how to test this scenario, thus asked this 
> question.

The first step is to understand the code.
Your questions was "should we move this line to here", without asking any
questions about what the code does, or showing much understanding of it.

I do encourage you to ask questions, but it is best to start with
questions that make sense.
And once you have framed a question, try to answer it yourself.
Read the code (e.g. the code for set_disk_ro()) if you haven't read it
before, to be sure you understand what it does.
And if you don't know why some code does something, it often helps to
look through the git logs, or use "git blame" to find out when the code
was added or changed.  Often the changelog of the patch will explain the
purpose of the code.

NeilBrown

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: mdadm: one question about the readonly and readwrite feature
  2017-03-24  0:28             ` NeilBrown
@ 2017-03-24 15:44               ` Zhilong
  0 siblings, 0 replies; 11+ messages in thread
From: Zhilong @ 2017-03-24 15:44 UTC (permalink / raw)
  To: NeilBrown; +Cc: Guoqing Jiang, Jes Sorensen, linux-raid





发自我的 iPhone
> 在 2017年3月24日，08:28，NeilBrown <neilb@suse.com> 写道：
> 
>> On Thu, Mar 23 2017, Zhilong Liu wrote:
>> 
>>> On 03/23/2017 03:06 PM, NeilBrown wrote:
>>> Why?
>>> How do the actions performed by set_disk_ro() interact with the actions
>>> performed by md_wakeup_thread()?
>>> 
>>> I'm not saying the code shouldn't be changed, just that there needs to
>>> be a clear explanation of the benefit.
>> 
>> here just according to my understanding for readonly code path, welcome
>> the correction in my comments if I misunderstand this feature, :-).
> 
> I'm sorry if this sounds harsh, but I get the impression that you aren't
> really trying to understand the code.  You are just guessing about what
> things might be doing, rather than doing careful research to determine
> exactly what the code does.
> 
> Do you know what "set_disk_ro()" does?  What are the consequences of
> call it?  Until you know that, you cannot make any reasonable assessment
> on where the call should go.
> 
> Do you know why we set MD_RECOVERY_NEEDED and wake up the thread?
> You seem to expect that it might cause some write out, however it
> happens just after a call to __md_stop_writes() which aims to stop all
> writes happening to the array.  So that seems unlikely (but is worth
> checking).
> 
> 
>> ... ...
>> static int md_set_readonly(struct mddev *mddev, struct block_device *bdev)
>> {
>>         int err = 0;
>>         int did_freeze = 0;
>> 
>>         if (!test_bit(MD_RECOVERY_FROZEN, &mddev->recovery)) {
>>                 did_freeze = 1;
>>                 set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);  --> 
>> here has set bit as FROZEN
>>                 md_wakeup_thread(mddev->thread);
>>         }
>> ... ...
>> 
>>         if (mddev->pers) {
>>                 __md_stop_writes(mddev);
>> /*
>>  *  set FROZEN again, maybe can use test_and_set_bit(FROZEN) better, it 
>> doesn't matter.
> 
> Correct, it doesn't matter.
> 
> 
>>  *  it flushed and synced all data, bitmap and superblock to array.
> 
> Correct again.
> 
> 
>>  */
>>                 err  = -ENXIO;
>>                 if (mddev->ro==1)
>>                         goto out;
>>                 mddev->ro = 1;
>> /*
>>  *  Here, I means that set_disk_ro until the daemon thread has 
>> completed all operations
>>  *  include of sync and recovery progress. set_disk_ro when the array 
>> has cleaned totally,
>>  *  then it would be safer.
> 
> Completed which operations exactly?  __md_stop_writes() has called
> md_reap_sync_thread() so there cannot still be any recovery.  It has
> called pers->quiesce() so there cannot be any outstanding io.
> And just moving the call to afterwards would't cause it to wait for those
> supposed operations to complete.
> 
>>  *  I'm not sure MD_RECOVERY_NEEDED would be running once the array has 
>> set_disk_ro,
>>  *  actually I don't know how to test this scenario, thus asked this 
>> question.
> 
> The first step is to understand the code.
> Your questions was "should we move this line to here", without asking any
> questions about what the code does, or showing much understanding of it.
> 
> I do encourage you to ask questions, but it is best to start with
> questions that make sense.
> And once you have framed a question, try to answer it yourself.
> Read the code (e.g. the code for set_disk_ro()) if you haven't read it
> before, to be sure you understand what it does.
> And if you don't know why some code does something, it often helps to
> look through the git logs, or use "git blame" to find out when the code
> was added or changed.  Often the changelog of the patch will explain the
> purpose of the code.
> 

Sorry for the later response. I have a sick leave and go to hospital today.
many thanks, every suggestion is very inspiring to me. As a new student of this interesting project, I should keep more strict to myself, and I would continue to learn this project carefully and try to bring more meaningful question here.

For the above patch to clear the MD_CLOSING bit, I have tested and it works well. :-). 

Best regards,
-Zhilong

> NeilBrown
> 


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2017-03-24 15:44 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-22 12:00 mdadm: one question about the readonly and readwrite feature Zhilong Liu
2017-03-22 21:55 ` NeilBrown
2017-03-23  1:54   ` Guoqing Jiang
2017-03-23  2:26     ` Guoqing Jiang
2017-03-23  3:42     ` NeilBrown
2017-03-23  3:54       ` Zhilong Liu
2017-03-23  6:50       ` Zhilong Liu
2017-03-23  7:06         ` NeilBrown
2017-03-23  8:14           ` Zhilong Liu
2017-03-24  0:28             ` NeilBrown
2017-03-24 15:44               ` Zhilong

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.