* [PATCH 0/2] md: it panic after reshape from raid1 to raid5
@ 2021-12-09  5:55 Xiao Ni
  2021-12-09  5:55 ` [PATCH 1/2] md/raid0: Free r0conf memory when register integrity failed Xiao Ni
  2021-12-09  5:55 ` [PATCH 2/2] md: Move alloc/free acct bioset in to personality Xiao Ni
  0 siblings, 2 replies; 14+ messages in thread
From: Xiao Ni @ 2021-12-09  5:55 UTC (permalink / raw)
  To: song; +Cc: guoqing.jiang, ncroxon, linux-raid

Hi all,

After reshaping from raid1 to raid5, the kernel can panic when there are
in-flight I/Os.

The following steps reproduce the problem:
mdadm -CR /dev/md0 -l1 -n2 /dev/loop0 /dev/loop1
mdadm --wait /dev/md0
mkfs.xfs /dev/md0
mdadm /dev/md0 --grow -l5
mount /dev/md0 /mnt

These two patches fix this problem.

Xiao Ni (2):
  Free r0conf memory when register integrity failed
  Move alloc/free acct bioset in to personality

 drivers/md/md.c    | 27 +++++++++++++++++----------
 drivers/md/md.h    |  2 ++
 drivers/md/raid0.c | 28 ++++++++++++++++++++++++----
 drivers/md/raid5.c | 41 ++++++++++++++++++++++++++++++-----------
 4 files changed, 73 insertions(+), 25 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/2] md/raid0: Free r0conf memory when register integrity failed
  2021-12-09  5:55 [PATCH 0/2] md: it panic after reshape from raid1 to raid5 Xiao Ni
@ 2021-12-09  5:55 ` Xiao Ni
  2021-12-09 18:02   ` Song Liu
  2021-12-09  5:55 ` [PATCH 2/2] md: Move alloc/free acct bioset in to personality Xiao Ni
  1 sibling, 1 reply; 14+ messages in thread
From: Xiao Ni @ 2021-12-09  5:55 UTC (permalink / raw)
  To: song; +Cc: guoqing.jiang, ncroxon, linux-raid

raid0_run() doesn't free the conf memory when md_integrity_register()
fails. Free it on that error path, and move the conf-freeing code into
a single function.

Signed-off-by: Xiao Ni <xni@redhat.com>
---
 drivers/md/raid0.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
index 62c8b6adac70..3fa47df1c60e 100644
--- a/drivers/md/raid0.c
+++ b/drivers/md/raid0.c
@@ -356,6 +356,7 @@ static sector_t raid0_size(struct mddev *mddev, sector_t sectors, int raid_disks
 	return array_sectors;
 }
 
+static void free_conf(struct r0conf *conf);
 static void raid0_free(struct mddev *mddev, void *priv);
 
 static int raid0_run(struct mddev *mddev)
@@ -413,19 +414,30 @@ static int raid0_run(struct mddev *mddev)
 	dump_zones(mddev);
 
 	ret = md_integrity_register(mddev);
+	if (ret)
+		goto free;
 
 	return ret;
+
+free:
+	free_conf(conf);
+	return ret;
 }
 
-static void raid0_free(struct mddev *mddev, void *priv)
+static void free_conf(struct r0conf *conf)
 {
-	struct r0conf *conf = priv;
-
 	kfree(conf->strip_zone);
 	kfree(conf->devlist);
 	kfree(conf);
 }
 
+static void raid0_free(struct mddev *mddev, void *priv)
+{
+	struct r0conf *conf = priv;
+
+	free_conf(conf);
+}
+
 static void raid0_handle_discard(struct mddev *mddev, struct bio *bio)
 {
 	struct r0conf *conf = mddev->private;
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/2] md: Move alloc/free acct bioset in to personality
  2021-12-09  5:55 [PATCH 0/2] md: it panic after reshape from raid1 to raid5 Xiao Ni
  2021-12-09  5:55 ` [PATCH 1/2] md/raid0: Free r0conf memory when register integrity failed Xiao Ni
@ 2021-12-09  5:55 ` Xiao Ni
  2021-12-10  1:30   ` Guoqing Jiang
  1 sibling, 1 reply; 14+ messages in thread
From: Xiao Ni @ 2021-12-09  5:55 UTC (permalink / raw)
  To: song; +Cc: guoqing.jiang, ncroxon, linux-raid

Currently md_run() allocates the acct bioset, but only raid0 and raid5
need it. For example, the acct bioset is not created when a raid1 array
is created. After reshaping from raid1 to raid0/raid5, I/O will then
access the acct bioset and can panic on a NULL pointer dereference.

Move the alloc/free work into the personality: pers->run allocates the
acct bioset and pers->free releases it.

Fixes: daee2024715d ("md: check level before create and exit io_acct_set")
Signed-off-by: Xiao Ni <xni@redhat.com>
---
 drivers/md/md.c    | 27 +++++++++++++++++----------
 drivers/md/md.h    |  2 ++
 drivers/md/raid0.c | 10 +++++++++-
 drivers/md/raid5.c | 41 ++++++++++++++++++++++++++++++-----------
 4 files changed, 58 insertions(+), 22 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index e8666bdc0d28..0fc34a05a655 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -5878,13 +5878,6 @@ int md_run(struct mddev *mddev)
 		if (err)
 			goto exit_bio_set;
 	}
-	if (mddev->level != 1 && mddev->level != 10 &&
-	    !bioset_initialized(&mddev->io_acct_set)) {
-		err = bioset_init(&mddev->io_acct_set, BIO_POOL_SIZE,
-				  offsetof(struct md_io_acct, bio_clone), 0);
-		if (err)
-			goto exit_sync_set;
-	}
 
 	spin_lock(&pers_lock);
 	pers = find_pers(mddev->level, mddev->clevel);
@@ -6061,9 +6054,6 @@ int md_run(struct mddev *mddev)
 	module_put(pers->owner);
 	md_bitmap_destroy(mddev);
 abort:
-	if (mddev->level != 1 && mddev->level != 10)
-		bioset_exit(&mddev->io_acct_set);
-exit_sync_set:
 	bioset_exit(&mddev->sync_set);
 exit_bio_set:
 	bioset_exit(&mddev->bio_set);
@@ -8596,6 +8586,23 @@ void md_submit_discard_bio(struct mddev *mddev, struct md_rdev *rdev,
 }
 EXPORT_SYMBOL_GPL(md_submit_discard_bio);
 
+int acct_bioset_init(struct mddev *mddev)
+{
+	int err = 0;
+
+	if (!bioset_initialized(&mddev->io_acct_set))
+		err = bioset_init(&mddev->io_acct_set, BIO_POOL_SIZE,
+			offsetof(struct md_io_acct, bio_clone), 0);
+	return err;
+}
+EXPORT_SYMBOL_GPL(acct_bioset_init);
+
+void acct_bioset_exit(struct mddev *mddev)
+{
+	bioset_exit(&mddev->io_acct_set);
+}
+EXPORT_SYMBOL_GPL(acct_bioset_exit);
+
 static void md_end_io_acct(struct bio *bio)
 {
 	struct md_io_acct *md_io_acct = bio->bi_private;
diff --git a/drivers/md/md.h b/drivers/md/md.h
index 53ea7a6961de..f1bf3625ef4c 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -721,6 +721,8 @@ extern void md_error(struct mddev *mddev, struct md_rdev *rdev);
 extern void md_finish_reshape(struct mddev *mddev);
 void md_submit_discard_bio(struct mddev *mddev, struct md_rdev *rdev,
 			struct bio *bio, sector_t start, sector_t size);
+int acct_bioset_init(struct mddev *mddev);
+void acct_bioset_exit(struct mddev *mddev);
 void md_account_bio(struct mddev *mddev, struct bio **bio);
 
 extern bool __must_check md_flush_request(struct mddev *mddev, struct bio *bio);
diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
index 3fa47df1c60e..2391a4a63b4d 100644
--- a/drivers/md/raid0.c
+++ b/drivers/md/raid0.c
@@ -371,11 +371,16 @@ static int raid0_run(struct mddev *mddev)
 	if (md_check_no_bitmap(mddev))
 		return -EINVAL;
 
+	if (acct_bioset_init(mddev)) {
+		pr_err("md/raid0:%s: alloc acct bioset failed.\n", mdname(mddev));
+		return -ENOMEM;
+	}
+
 	/* if private is not null, we are here after takeover */
 	if (mddev->private == NULL) {
 		ret = create_strip_zones(mddev, &conf);
 		if (ret < 0)
-			return ret;
+			goto exit_acct_set;
 		mddev->private = conf;
 	}
 	conf = mddev->private;
@@ -421,6 +426,8 @@ static int raid0_run(struct mddev *mddev)
 
 free:
 	free_conf(conf);
+exit_acct_set:
+	acct_bioset_exit(mddev);
 	return ret;
 }
 
@@ -436,6 +443,7 @@ static void raid0_free(struct mddev *mddev, void *priv)
 	struct r0conf *conf = priv;
 
 	free_conf(conf);
+	acct_bioset_exit(mddev);
 }
 
 static void raid0_handle_discard(struct mddev *mddev, struct bio *bio)
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 1240a5c16af8..13afa8c5cc8a 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -7447,12 +7447,19 @@ static int raid5_run(struct mddev *mddev)
 	struct md_rdev *rdev;
 	struct md_rdev *journal_dev = NULL;
 	sector_t reshape_offset = 0;
-	int i;
+	int i, ret = 0;
 	long long min_offset_diff = 0;
 	int first = 1;
 
-	if (mddev_init_writes_pending(mddev) < 0)
+	if (acct_bioset_init(mddev)) {
+		pr_err("md/raid456:%s: alloc acct bioset failed.\n", mdname(mddev));
 		return -ENOMEM;
+	}
+
+	if (mddev_init_writes_pending(mddev) < 0) {
+		ret = -ENOMEM;
+		goto exit_acct_set;
+	}
 
 	if (mddev->recovery_cp != MaxSector)
 		pr_notice("md/raid:%s: not clean -- starting background reconstruction\n",
@@ -7483,7 +7490,8 @@ static int raid5_run(struct mddev *mddev)
 	    (mddev->bitmap_info.offset || mddev->bitmap_info.file)) {
 		pr_notice("md/raid:%s: array cannot have both journal and bitmap\n",
 			  mdname(mddev));
-		return -EINVAL;
+		ret = -EINVAL;
+		goto exit_acct_set;
 	}
 
 	if (mddev->reshape_position != MaxSector) {
@@ -7508,13 +7516,15 @@ static int raid5_run(struct mddev *mddev)
 		if (journal_dev) {
 			pr_warn("md/raid:%s: don't support reshape with journal - aborting.\n",
 				mdname(mddev));
-			return -EINVAL;
+			ret = -EINVAL;
+			goto exit_acct_set;
 		}
 
 		if (mddev->new_level != mddev->level) {
 			pr_warn("md/raid:%s: unsupported reshape required - aborting.\n",
 				mdname(mddev));
-			return -EINVAL;
+			ret = -EINVAL;
+			goto exit_acct_set;
 		}
 		old_disks = mddev->raid_disks - mddev->delta_disks;
 		/* reshape_position must be on a new-stripe boundary, and one
@@ -7530,7 +7540,8 @@ static int raid5_run(struct mddev *mddev)
 		if (sector_div(here_new, chunk_sectors * new_data_disks)) {
 			pr_warn("md/raid:%s: reshape_position not on a stripe boundary\n",
 				mdname(mddev));
-			return -EINVAL;
+			ret = -EINVAL;
+			goto exit_acct_set;
 		}
 		reshape_offset = here_new * chunk_sectors;
 		/* here_new is the stripe we will write to */
@@ -7552,7 +7563,8 @@ static int raid5_run(struct mddev *mddev)
 			else if (mddev->ro == 0) {
 				pr_warn("md/raid:%s: in-place reshape must be started in read-only mode - aborting\n",
 					mdname(mddev));
-				return -EINVAL;
+				ret = -EINVAL;
+				goto exit_acct_set;
 			}
 		} else if (mddev->reshape_backwards
 		    ? (here_new * chunk_sectors + min_offset_diff <=
@@ -7562,7 +7574,8 @@ static int raid5_run(struct mddev *mddev)
 			/* Reading from the same stripe as writing to - bad */
 			pr_warn("md/raid:%s: reshape_position too early for auto-recovery - aborting.\n",
 				mdname(mddev));
-			return -EINVAL;
+			ret = -EINVAL;
+			goto exit_acct_set;
 		}
 		pr_debug("md/raid:%s: reshape will continue\n", mdname(mddev));
 		/* OK, we should be able to continue; */
@@ -7586,8 +7599,10 @@ static int raid5_run(struct mddev *mddev)
 	else
 		conf = mddev->private;
 
-	if (IS_ERR(conf))
-		return PTR_ERR(conf);
+	if (IS_ERR(conf)) {
+		ret = PTR_ERR(conf);
+		goto exit_acct_set;
+	}
 
 	if (test_bit(MD_HAS_JOURNAL, &mddev->flags)) {
 		if (!journal_dev) {
@@ -7784,7 +7799,10 @@ static int raid5_run(struct mddev *mddev)
 	free_conf(conf);
 	mddev->private = NULL;
 	pr_warn("md/raid:%s: failed to run raid set.\n", mdname(mddev));
-	return -EIO;
+	ret = -EIO;
+exit_acct_set:
+	acct_bioset_exit(mddev);
+	return ret;
 }
 
 static void raid5_free(struct mddev *mddev, void *priv)
@@ -7792,6 +7810,7 @@ static void raid5_free(struct mddev *mddev, void *priv)
 	struct r5conf *conf = priv;
 
 	free_conf(conf);
+	acct_bioset_exit(mddev);
 	mddev->to_remove = &raid5_attrs_group;
 }
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] md/raid0: Free r0conf memory when register integrity failed
  2021-12-09  5:55 ` [PATCH 1/2] md/raid0: Free r0conf memory when register integrity failed Xiao Ni
@ 2021-12-09 18:02   ` Song Liu
  2021-12-10  1:18     ` Xiao Ni
  2021-12-10  1:22     ` Guoqing Jiang
  0 siblings, 2 replies; 14+ messages in thread
From: Song Liu @ 2021-12-09 18:02 UTC (permalink / raw)
  To: Xiao Ni; +Cc: Guoqing Jiang, Nigel Croxon, linux-raid

On Wed, Dec 8, 2021 at 9:55 PM Xiao Ni <xni@redhat.com> wrote:
>
> It doesn't free memory when register integrity failed. And move
> free conf codes into a single function.
>
> Signed-off-by: Xiao Ni <xni@redhat.com>
> ---
>  drivers/md/raid0.c | 18 +++++++++++++++---
>  1 file changed, 15 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
> index 62c8b6adac70..3fa47df1c60e 100644
> --- a/drivers/md/raid0.c
> +++ b/drivers/md/raid0.c
> @@ -356,6 +356,7 @@ static sector_t raid0_size(struct mddev *mddev, sector_t sectors, int raid_disks
>         return array_sectors;
>  }
>
> +static void free_conf(struct r0conf *conf);
>  static void raid0_free(struct mddev *mddev, void *priv);
>
>  static int raid0_run(struct mddev *mddev)
> @@ -413,19 +414,30 @@ static int raid0_run(struct mddev *mddev)
>         dump_zones(mddev);
>
>         ret = md_integrity_register(mddev);
> +       if (ret)
> +               goto free;
>
>         return ret;
> +
> +free:
> +       free_conf(conf);

Can we just use raid0_free() here? Also, after freeing conf,
we should set mddev->private to NULL.

Thanks,
Song

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] md/raid0: Free r0conf memory when register integrity failed
  2021-12-09 18:02   ` Song Liu
@ 2021-12-10  1:18     ` Xiao Ni
  2021-12-10  2:07       ` Guoqing Jiang
  2021-12-10  2:27       ` Song Liu
  2021-12-10  1:22     ` Guoqing Jiang
  1 sibling, 2 replies; 14+ messages in thread
From: Xiao Ni @ 2021-12-10  1:18 UTC (permalink / raw)
  To: Song Liu; +Cc: Guoqing Jiang, Nigel Croxon, linux-raid

On Fri, Dec 10, 2021 at 2:02 AM Song Liu <song@kernel.org> wrote:
>
> On Wed, Dec 8, 2021 at 9:55 PM Xiao Ni <xni@redhat.com> wrote:
> >
> > It doesn't free memory when register integrity failed. And move
> > free conf codes into a single function.
> >
> > Signed-off-by: Xiao Ni <xni@redhat.com>
> > ---
> >  drivers/md/raid0.c | 18 +++++++++++++++---
> >  1 file changed, 15 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
> > index 62c8b6adac70..3fa47df1c60e 100644
> > --- a/drivers/md/raid0.c
> > +++ b/drivers/md/raid0.c
> > @@ -356,6 +356,7 @@ static sector_t raid0_size(struct mddev *mddev, sector_t sectors, int raid_disks
> >         return array_sectors;
> >  }
> >
> > +static void free_conf(struct r0conf *conf);
> >  static void raid0_free(struct mddev *mddev, void *priv);
> >
> >  static int raid0_run(struct mddev *mddev)
> > @@ -413,19 +414,30 @@ static int raid0_run(struct mddev *mddev)
> >         dump_zones(mddev);
> >
> >         ret = md_integrity_register(mddev);
> > +       if (ret)
> > +               goto free;
> >
> >         return ret;
> > +
> > +free:
> > +       free_conf(conf);
>
> Can we just use raid0_free() here? Also, after freeing conf,

At first I used raid0_free() too, but it looks strange after adding
acct_bioset_exit() to raid0_free(): if creating the stripe zones fails,
there is no conf to free, although passing NULL to kfree() is harmless.

But I'm OK with using raid0_free() here if you prefer. I'll send the
next version.

> we should set mddev->private to NULL.
>

Yes.

Regards
Xiao


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] md/raid0: Free r0conf memory when register integrity failed
  2021-12-09 18:02   ` Song Liu
  2021-12-10  1:18     ` Xiao Ni
@ 2021-12-10  1:22     ` Guoqing Jiang
  2021-12-10  1:34       ` Xiao Ni
  2021-12-10  1:51       ` Xiao Ni
  1 sibling, 2 replies; 14+ messages in thread
From: Guoqing Jiang @ 2021-12-10  1:22 UTC (permalink / raw)
  To: Song Liu, Xiao Ni; +Cc: Nigel Croxon, linux-raid



On 12/10/21 2:02 AM, Song Liu wrote:
> On Wed, Dec 8, 2021 at 9:55 PM Xiao Ni<xni@redhat.com>  wrote:
>> It doesn't free memory when register integrity failed. And move
>> free conf codes into a single function.
>>
>> Signed-off-by: Xiao Ni<xni@redhat.com>
>> ---
>>   drivers/md/raid0.c | 18 +++++++++++++++---
>>   1 file changed, 15 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
>> index 62c8b6adac70..3fa47df1c60e 100644
>> --- a/drivers/md/raid0.c
>> +++ b/drivers/md/raid0.c
>> @@ -356,6 +356,7 @@ static sector_t raid0_size(struct mddev *mddev, sector_t sectors, int raid_disks
>>          return array_sectors;
>>   }
>>
>> +static void free_conf(struct r0conf *conf);
>>   static void raid0_free(struct mddev *mddev, void *priv);
>>
>>   static int raid0_run(struct mddev *mddev)
>> @@ -413,19 +414,30 @@ static int raid0_run(struct mddev *mddev)
>>          dump_zones(mddev);
>>
>>          ret = md_integrity_register(mddev);
>> +       if (ret)
>> +               goto free;
>>
>>          return ret;
>> +
>> +free:
>> +       free_conf(conf);
> Can we just use raid0_free() here? Also, after freeing conf,
> we should set mddev->private to NULL.

Agreed, like raid1_run() does. And we might need to check the
return value of pers->run in level_store as well.

Thanks,
Guoqing

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] md: Move alloc/free acct bioset in to personality
  2021-12-09  5:55 ` [PATCH 2/2] md: Move alloc/free acct bioset in to personality Xiao Ni
@ 2021-12-10  1:30   ` Guoqing Jiang
  0 siblings, 0 replies; 14+ messages in thread
From: Guoqing Jiang @ 2021-12-10  1:30 UTC (permalink / raw)
  To: Xiao Ni, song; +Cc: ncroxon, linux-raid



On 12/9/21 1:55 PM, Xiao Ni wrote:
> Now it alloc acct bioset in md_run and only raid0/raid5 need acct
> bioset. For example, it doesn't create acct bioset when creating
> raid1. Then reshape from raid1 to raid0/raid5, it will access acct
> bioset after reshaping. It can panic because of NULL pointer
> reference.

Thanks, I think the previous commit didn't take the reshape scenario
into account. Could you paste the relevant panic info into the commit header?

> We can move alloc/free jobs to personality. pers->run alloc acct
> bioset and pers->clean free it.

In the reshape case, the caller of pers->run is level_store, so this covers it.

Acked-by: Guoqing Jiang <guoqing.jiang@linux.dev>

Thanks,
Guoqing

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] md/raid0: Free r0conf memory when register integrity failed
  2021-12-10  1:22     ` Guoqing Jiang
@ 2021-12-10  1:34       ` Xiao Ni
  2021-12-10  1:43         ` Guoqing Jiang
  2021-12-10  2:29         ` Song Liu
  2021-12-10  1:51       ` Xiao Ni
  1 sibling, 2 replies; 14+ messages in thread
From: Xiao Ni @ 2021-12-10  1:34 UTC (permalink / raw)
  To: Guoqing Jiang; +Cc: Song Liu, Nigel Croxon, linux-raid

On Fri, Dec 10, 2021 at 9:22 AM Guoqing Jiang <guoqing.jiang@linux.dev> wrote:
>
>
>
> On 12/10/21 2:02 AM, Song Liu wrote:
> > On Wed, Dec 8, 2021 at 9:55 PM Xiao Ni<xni@redhat.com>  wrote:
> >> It doesn't free memory when register integrity failed. And move
> >> free conf codes into a single function.
> >>
> >> Signed-off-by: Xiao Ni<xni@redhat.com>
> >> ---
> >>   drivers/md/raid0.c | 18 +++++++++++++++---
> >>   1 file changed, 15 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
> >> index 62c8b6adac70..3fa47df1c60e 100644
> >> --- a/drivers/md/raid0.c
> >> +++ b/drivers/md/raid0.c
> >> @@ -356,6 +356,7 @@ static sector_t raid0_size(struct mddev *mddev, sector_t sectors, int raid_disks
> >>          return array_sectors;
> >>   }
> >>
> >> +static void free_conf(struct r0conf *conf);
> >>   static void raid0_free(struct mddev *mddev, void *priv);
> >>
> >>   static int raid0_run(struct mddev *mddev)
> >> @@ -413,19 +414,30 @@ static int raid0_run(struct mddev *mddev)
> >>          dump_zones(mddev);
> >>
> >>          ret = md_integrity_register(mddev);
> >> +       if (ret)
> >> +               goto free;
> >>
> >>          return ret;
> >> +
> >> +free:
> >> +       free_conf(conf);
> > Can we just use raid0_free() here? Also, after freeing conf,
> > we should set mddev->private to NULL.
>
> Agree, like what raid1_run did. And we might need to check the
> return value of pers->run in level_store as well.

Yes. level_store needs to check the return value and try to roll back to
the original state. I planned to fix this in a follow-up rather than in
this patch, which only fixes the NULL pointer dereference after reshape.

In fact, not only do we need to check pers->run in level_store, we also
need to handle integrity during reshape. Integrity is only supported by
raid0/raid1/raid10, so some integrity work (unregister/register) is
needed there as well.

I plan to fix these two problems in follow-up patches. Is that OK?

Regards
Xiao


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] md/raid0: Free r0conf memory when register integrity failed
  2021-12-10  1:34       ` Xiao Ni
@ 2021-12-10  1:43         ` Guoqing Jiang
  2021-12-10  2:29         ` Song Liu
  1 sibling, 0 replies; 14+ messages in thread
From: Guoqing Jiang @ 2021-12-10  1:43 UTC (permalink / raw)
  To: Xiao Ni; +Cc: Song Liu, Nigel Croxon, linux-raid



On 12/10/21 9:34 AM, Xiao Ni wrote:
> On Fri, Dec 10, 2021 at 9:22 AM Guoqing Jiang <guoqing.jiang@linux.dev> wrote:
>>
>>
>> On 12/10/21 2:02 AM, Song Liu wrote:
>>> On Wed, Dec 8, 2021 at 9:55 PM Xiao Ni<xni@redhat.com>  wrote:
>>>> It doesn't free memory when register integrity failed. And move
>>>> free conf codes into a single function.
>>>>
>>>> Signed-off-by: Xiao Ni<xni@redhat.com>
>>>> ---
>>>>    drivers/md/raid0.c | 18 +++++++++++++++---
>>>>    1 file changed, 15 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
>>>> index 62c8b6adac70..3fa47df1c60e 100644
>>>> --- a/drivers/md/raid0.c
>>>> +++ b/drivers/md/raid0.c
>>>> @@ -356,6 +356,7 @@ static sector_t raid0_size(struct mddev *mddev, sector_t sectors, int raid_disks
>>>>           return array_sectors;
>>>>    }
>>>>
>>>> +static void free_conf(struct r0conf *conf);
>>>>    static void raid0_free(struct mddev *mddev, void *priv);
>>>>
>>>>    static int raid0_run(struct mddev *mddev)
>>>> @@ -413,19 +414,30 @@ static int raid0_run(struct mddev *mddev)
>>>>           dump_zones(mddev);
>>>>
>>>>           ret = md_integrity_register(mddev);
>>>> +       if (ret)
>>>> +               goto free;
>>>>
>>>>           return ret;
>>>> +
>>>> +free:
>>>> +       free_conf(conf);
>>> Can we just use raid0_free() here? Also, after freeing conf,
>>> we should set mddev->private to NULL.
>> Agree, like what raid1_run did. And we might need to check the
>> return value of pers->run in level_store as well.
> Yes. It needs to check the return value and try to reback to the original
> state. I planed to fix this in the following days not this patch. This patch
> only fix the NULL reference problem after reshape.
>
> In fact, no only we need to check pers->run in level_store, we also need
> to check integrity during reshape. Now integrity only supports
> raid0/raid1/raid10,
> so it needs to do some jobs related with integrity (unregister/register)
>
> I plan to fix these two problems in the next patches. Is it OK?

Yes, fine from my side and thanks for your time.

Thanks,
Guoqing

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] md/raid0: Free r0conf memory when register integrity failed
  2021-12-10  1:22     ` Guoqing Jiang
  2021-12-10  1:34       ` Xiao Ni
@ 2021-12-10  1:51       ` Xiao Ni
  1 sibling, 0 replies; 14+ messages in thread
From: Xiao Ni @ 2021-12-10  1:51 UTC (permalink / raw)
  To: Guoqing Jiang; +Cc: Song Liu, Nigel Croxon, linux-raid

On Fri, Dec 10, 2021 at 9:22 AM Guoqing Jiang <guoqing.jiang@linux.dev> wrote:
>
>
>
> On 12/10/21 2:02 AM, Song Liu wrote:
> > On Wed, Dec 8, 2021 at 9:55 PM Xiao Ni<xni@redhat.com>  wrote:
> >> It doesn't free memory when register integrity failed. And move
> >> free conf codes into a single function.
> >>
> >> Signed-off-by: Xiao Ni<xni@redhat.com>
> >> ---
> >>   drivers/md/raid0.c | 18 +++++++++++++++---
> >>   1 file changed, 15 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
> >> index 62c8b6adac70..3fa47df1c60e 100644
> >> --- a/drivers/md/raid0.c
> >> +++ b/drivers/md/raid0.c
> >> @@ -356,6 +356,7 @@ static sector_t raid0_size(struct mddev *mddev, sector_t sectors, int raid_disks
> >>          return array_sectors;
> >>   }
> >>
> >> +static void free_conf(struct r0conf *conf);
> >>   static void raid0_free(struct mddev *mddev, void *priv);
> >>
> >>   static int raid0_run(struct mddev *mddev)
> >> @@ -413,19 +414,30 @@ static int raid0_run(struct mddev *mddev)
> >>          dump_zones(mddev);
> >>
> >>          ret = md_integrity_register(mddev);
> >> +       if (ret)
> >> +               goto free;
> >>
> >>          return ret;
> >> +
> >> +free:
> >> +       free_conf(conf);
> > Can we just use raid0_free() here? Also, after freeing conf,
> > we should set mddev->private to NULL.
>
> Agree, like what raid1_run did. And we might need to check the
> return value of pers->run in level_store as well.

Hi Guoqing

By the way, do you agree with following raid1_run() here? I just replied
to Song in another email; could you have a look at it? I explained there
why I wrote the new function free_conf().


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] md/raid0: Free r0conf memory when register integrity failed
  2021-12-10  1:18     ` Xiao Ni
@ 2021-12-10  2:07       ` Guoqing Jiang
  2021-12-10  2:17         ` Xiao Ni
  2021-12-10  2:27       ` Song Liu
  1 sibling, 1 reply; 14+ messages in thread
From: Guoqing Jiang @ 2021-12-10  2:07 UTC (permalink / raw)
  To: Xiao Ni, Song Liu; +Cc: Nigel Croxon, linux-raid



On 12/10/21 9:18 AM, Xiao Ni wrote:
> On Fri, Dec 10, 2021 at 2:02 AM Song Liu <song@kernel.org> wrote:
>> On Wed, Dec 8, 2021 at 9:55 PM Xiao Ni <xni@redhat.com> wrote:
>>> It doesn't free memory when register integrity failed. And move
>>> free conf codes into a single function.
>>>
>>> Signed-off-by: Xiao Ni <xni@redhat.com>
>>> ---
>>>   drivers/md/raid0.c | 18 +++++++++++++++---
>>>   1 file changed, 15 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
>>> index 62c8b6adac70..3fa47df1c60e 100644
>>> --- a/drivers/md/raid0.c
>>> +++ b/drivers/md/raid0.c
>>> @@ -356,6 +356,7 @@ static sector_t raid0_size(struct mddev *mddev, sector_t sectors, int raid_disks
>>>          return array_sectors;
>>>   }
>>>
>>> +static void free_conf(struct r0conf *conf);
>>>   static void raid0_free(struct mddev *mddev, void *priv);
>>>
>>>   static int raid0_run(struct mddev *mddev)
>>> @@ -413,19 +414,30 @@ static int raid0_run(struct mddev *mddev)
>>>          dump_zones(mddev);
>>>
>>>          ret = md_integrity_register(mddev);
>>> +       if (ret)
>>> +               goto free;
>>>
>>>          return ret;
>>> +
>>> +free:
>>> +       free_conf(conf);
>> Can we just use raid0_free() here? Also, after freeing conf,
> At first I used raid0_free too. But it looks strange after adding
> acct_bioset_exit
> in raid0_free. Because if creating stripe zones failed, it doesn't
> need to free conf,
> although it doesn't have problems that passes NULL to kfree.

Maybe something like this, just FYI.

diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
index 62c8b6adac70..c24bec49b36f 100644
--- a/drivers/md/raid0.c
+++ b/drivers/md/raid0.c
@@ -377,6 +377,12 @@ static int raid0_run(struct mddev *mddev)
                         return ret;
                 mddev->private = conf;
         }
+
+       if (acct_bioset_init(mddev)) {
+               pr_err("md/raid0:%s: alloc acct bioset failed.\n", mdname(mddev));
+               return -ENOMEM;
+       }
+
         conf = mddev->private;
         if (mddev->queue) {
                 struct md_rdev *rdev;
@@ -413,6 +419,11 @@ static int raid0_run(struct mddev *mddev)
         dump_zones(mddev);

         ret = md_integrity_register(mddev);
+       if (ret) {
+               raid0_free(mddev, mddev->private);
+               mddev->private = NULL;
+               acct_bioset_exit(mddev);
+       }

         return ret;
  }

Thanks,
Guoqing

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] md/raid0: Free r0conf memory when register integrity failed
  2021-12-10  2:07       ` Guoqing Jiang
@ 2021-12-10  2:17         ` Xiao Ni
  0 siblings, 0 replies; 14+ messages in thread
From: Xiao Ni @ 2021-12-10  2:17 UTC (permalink / raw)
  To: Guoqing Jiang; +Cc: Song Liu, Nigel Croxon, linux-raid

On Fri, Dec 10, 2021 at 10:07 AM Guoqing Jiang <guoqing.jiang@linux.dev> wrote:
>
> Maybe something like this, just FYI.
>
> diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
> index 62c8b6adac70..c24bec49b36f 100644
> --- a/drivers/md/raid0.c
> +++ b/drivers/md/raid0.c
> @@ -377,6 +377,12 @@ static int raid0_run(struct mddev *mddev)
>                          return ret;
>                  mddev->private = conf;
>          }
> +
> +       if (acct_bioset_init(mddev)) {
> +               pr_err("md/raid0:%s: alloc acct bioset failed.\n",
> mdname(mddev));
> +               return -ENOMEM;
> +       }
> +
>          conf = mddev->private;
>          if (mddev->queue) {
>                  struct md_rdev *rdev;
> @@ -413,6 +419,11 @@ static int raid0_run(struct mddev *mddev)
>          dump_zones(mddev);
>
>          ret = md_integrity_register(mddev);
> +       if (ret) {
> +               raid0_free(mddev, mddev->private);
> +               mddev->private = NULL;
> +               acct_bioset_exit(mddev);
> +       }
>
>          return ret;
>   }
>
> Thanks,
> Guoqing
>

This won't work. The way to fix the original problem is to free the
acct bioset of the old level in pers->free and create the acct bioset
of the new level in pers->run, so that no code in level_store needs to
change. That is why acct_bioset_init/acct_bioset_exit must live in
pers->run/pers->free.

Regards
Xiao


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] md/raid0: Free r0conf memory when register integrity failed
  2021-12-10  1:18     ` Xiao Ni
  2021-12-10  2:07       ` Guoqing Jiang
@ 2021-12-10  2:27       ` Song Liu
  1 sibling, 0 replies; 14+ messages in thread
From: Song Liu @ 2021-12-10  2:27 UTC (permalink / raw)
  To: Xiao Ni; +Cc: Guoqing Jiang, Nigel Croxon, linux-raid

On Thu, Dec 9, 2021 at 5:19 PM Xiao Ni <xni@redhat.com> wrote:
>
> On Fri, Dec 10, 2021 at 2:02 AM Song Liu <song@kernel.org> wrote:
> >
> > On Wed, Dec 8, 2021 at 9:55 PM Xiao Ni <xni@redhat.com> wrote:
> > >
> > > It doesn't free memory when register integrity failed. And move
> > > free conf codes into a single function.
> > >
> > > Signed-off-by: Xiao Ni <xni@redhat.com>
> > > ---
> > >  drivers/md/raid0.c | 18 +++++++++++++++---
> > >  1 file changed, 15 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
> > > index 62c8b6adac70..3fa47df1c60e 100644
> > > --- a/drivers/md/raid0.c
> > > +++ b/drivers/md/raid0.c
> > > @@ -356,6 +356,7 @@ static sector_t raid0_size(struct mddev *mddev, sector_t sectors, int raid_disks
> > >         return array_sectors;
> > >  }
> > >
> > > +static void free_conf(struct r0conf *conf);
> > >  static void raid0_free(struct mddev *mddev, void *priv);
> > >
> > >  static int raid0_run(struct mddev *mddev)
> > > @@ -413,19 +414,30 @@ static int raid0_run(struct mddev *mddev)
> > >         dump_zones(mddev);
> > >
> > >         ret = md_integrity_register(mddev);
> > > +       if (ret)
> > > +               goto free;
> > >
> > >         return ret;
> > > +
> > > +free:
> > > +       free_conf(conf);
> >
> > Can we just use raid0_free() here? Also, after freeing conf,
>
> At first I used raid0_free too. But it looks strange after adding
> acct_bioset_exit
> in raid0_free. Because if creating stripe zones failed, it doesn't
> need to free conf,
> although it doesn't have problems that passes NULL to kfree.
>
I see. Let's add free_conf() then. Maybe we can move both free_conf()
and raid0_free() ahead to avoid the extra declaration?

Thanks,
Song

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] md/raid0: Free r0conf memory when register integrity failed
  2021-12-10  1:34       ` Xiao Ni
  2021-12-10  1:43         ` Guoqing Jiang
@ 2021-12-10  2:29         ` Song Liu
  1 sibling, 0 replies; 14+ messages in thread
From: Song Liu @ 2021-12-10  2:29 UTC (permalink / raw)
  To: Xiao Ni; +Cc: Guoqing Jiang, Nigel Croxon, linux-raid

On Thu, Dec 9, 2021 at 5:34 PM Xiao Ni <xni@redhat.com> wrote:
>
> On Fri, Dec 10, 2021 at 9:22 AM Guoqing Jiang <guoqing.jiang@linux.dev> wrote:
> >
> >
> >
> > On 12/10/21 2:02 AM, Song Liu wrote:
> > > On Wed, Dec 8, 2021 at 9:55 PM Xiao Ni<xni@redhat.com>  wrote:
> > >> It doesn't free memory when register integrity failed. And move
> > >> free conf codes into a single function.
> > >>
> > >> Signed-off-by: Xiao Ni<xni@redhat.com>
> > >> ---
> > >>   drivers/md/raid0.c | 18 +++++++++++++++---
> > >>   1 file changed, 15 insertions(+), 3 deletions(-)
> > >>
> > >> diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
> > >> index 62c8b6adac70..3fa47df1c60e 100644
> > >> --- a/drivers/md/raid0.c
> > >> +++ b/drivers/md/raid0.c
> > >> @@ -356,6 +356,7 @@ static sector_t raid0_size(struct mddev *mddev, sector_t sectors, int raid_disks
> > >>          return array_sectors;
> > >>   }
> > >>
> > >> +static void free_conf(struct r0conf *conf);
> > >>   static void raid0_free(struct mddev *mddev, void *priv);
> > >>
> > >>   static int raid0_run(struct mddev *mddev)
> > >> @@ -413,19 +414,30 @@ static int raid0_run(struct mddev *mddev)
> > >>          dump_zones(mddev);
> > >>
> > >>          ret = md_integrity_register(mddev);
> > >> +       if (ret)
> > >> +               goto free;
> > >>
> > >>          return ret;
> > >> +
> > >> +free:
> > >> +       free_conf(conf);
> > > Can we just use raid0_free() here? Also, after freeing conf,
> > > we should set mddev->private to NULL.
> >
> > Agree, like what raid1_run did. And we might need to check the
> > return value of pers->run in level_store as well.
>
> Yes. It needs to check the return value and try to reback to the original
> state. I planed to fix this in the following days not this patch. This patch
> only fix the NULL reference problem after reshape.
>
> In fact, no only we need to check pers->run in level_store, we also need
> to check integrity during reshape. Now integrity only supports
> raid0/raid1/raid10,
> so it needs to do some jobs related with integrity (unregister/register)
>
> I plan to fix these two problems in the next patches. Is it OK?

Yes, that works. We don't have to fix all the issues in one set. :)

Song

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2021-12-10  2:29 UTC | newest]

Thread overview: 14+ messages
2021-12-09  5:55 [PATCH 0/2] md: it panic after reshape from raid1 to raid5 Xiao Ni
2021-12-09  5:55 ` [PATCH 1/2] md/raid0: Free r0conf memory when register integrity failed Xiao Ni
2021-12-09 18:02   ` Song Liu
2021-12-10  1:18     ` Xiao Ni
2021-12-10  2:07       ` Guoqing Jiang
2021-12-10  2:17         ` Xiao Ni
2021-12-10  2:27       ` Song Liu
2021-12-10  1:22     ` Guoqing Jiang
2021-12-10  1:34       ` Xiao Ni
2021-12-10  1:43         ` Guoqing Jiang
2021-12-10  2:29         ` Song Liu
2021-12-10  1:51       ` Xiao Ni
2021-12-09  5:55 ` [PATCH 2/2] md: Move alloc/free acct bioset in to personality Xiao Ni
2021-12-10  1:30   ` Guoqing Jiang
