From: "NeilBrown" <neilb@suse.de>
To: "Mikulas Patocka" <mpatocka@redhat.com>
Cc: "Song Liu" <song@kernel.org>,
	"Guoqing Jiang" <guoqing.jiang@linux.dev>,
	"Zdenek Kabelac" <zkabelac@redhat.com>,
	linux-raid@vger.kernel.org, dm-devel@redhat.com
Subject: Re: [dm-devel] [PATCH] md: fix a crash in mempool_free
Date: Fri, 04 Nov 2022 15:40:45 +1100
Message-ID: <166753684502.19313.12105294223332649758@noble.neil.brown.name>
In-Reply-To: <alpine.LRH.2.21.2211031121070.18305@file01.intranet.prod.int.rdu2.redhat.com>

On Fri, 04 Nov 2022, Mikulas Patocka wrote:
> There's a crash in mempool_free when running the lvm test
> shell/lvchange-rebuild-raid.sh.
> 
> The reason for the crash is this:
> * super_written calls atomic_dec_and_test(&mddev->pending_writes) and
>   wake_up(&mddev->sb_wait). Then it calls rdev_dec_pending(rdev, mddev)
>   and bio_put(bio).
> * so, the process that waited on sb_wait and that is woken up is racing
>   with bio_put(bio).
> * if the process wins the race, it calls bioset_exit before bio_put(bio)
>   is executed.
> * bio_put(bio) attempts to free a bio into a destroyed bio set - causing
>   a crash in mempool_free.
> 
> We fix this bug by moving bio_put before atomic_dec_and_test.
> 
> The function md_end_flush has a similar bug - we must call bio_put before
> we decrement the number of in-progress bios.
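
For illustration, the waiter side of this race looks roughly like the
sketch below. This is not the exact md code: the sync_set name and the
shape of the wait are assumptions based on the description above, shown
only to make the ordering problem concrete.

/*
 * Sketch of the thread that wake_up(&mddev->sb_wait) releases. Once
 * pending_writes reaches zero it is free to tear the bio_set down, so
 * anything the endio handler still does with the bio after the
 * atomic_dec_and_test()/wake_up() pair (the old bio_put() placement)
 * races with that teardown.
 */
static void waiter_side_sketch(struct mddev *mddev)
{
	/* Returns as soon as super_written drops pending_writes to 0. */
	wait_event(mddev->sb_wait,
		   atomic_read(&mddev->pending_writes) == 0);

	/*
	 * With the old ordering, super_written may still be about to call
	 * bio_put(bio) here; freeing that bio into a destroyed bio_set is
	 * the mempool_free crash quoted below. (sync_set is assumed to be
	 * the bio_set the superblock bio was allocated from.)
	 */
	bioset_exit(&mddev->sync_set);
}
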
> 
>  BUG: kernel NULL pointer dereference, address: 0000000000000000
>  #PF: supervisor write access in kernel mode
>  #PF: error_code(0x0002) - not-present page
>  PGD 11557f0067 P4D 11557f0067 PUD 0
>  Oops: 0002 [#1] PREEMPT SMP
>  CPU: 0 PID: 73 Comm: kworker/0:1 Not tainted 6.1.0-rc3 #5
>  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
>  Workqueue: kdelayd flush_expired_bios [dm_delay]
>  RIP: 0010:mempool_free+0x47/0x80
>  Code: 48 89 ef 5b 5d ff e0 f3 c3 48 89 f7 e8 32 45 3f 00 48 63 53 08 48 89 c6 3b 53 04 7d 2d 48 8b 43 10 8d 4a 01 48 89 df 89 4b 08 <48> 89 2c d0 e8 b0 45 3f 00 48 8d 7b 30 5b 5d 31 c9 ba 01 00 00 00
>  RSP: 0018:ffff88910036bda8 EFLAGS: 00010093
>  RAX: 0000000000000000 RBX: ffff8891037b65d8 RCX: 0000000000000001
>  RDX: 0000000000000000 RSI: 0000000000000202 RDI: ffff8891037b65d8
>  RBP: ffff8891447ba240 R08: 0000000000012908 R09: 00000000003d0900
>  R10: 0000000000000000 R11: 0000000000173544 R12: ffff889101a14000
>  R13: ffff8891562ac300 R14: ffff889102b41440 R15: ffffe8ffffa00d05
>  FS:  0000000000000000(0000) GS:ffff88942fa00000(0000) knlGS:0000000000000000
>  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>  CR2: 0000000000000000 CR3: 0000001102e99000 CR4: 00000000000006b0
>  Call Trace:
>   <TASK>
>   clone_endio+0xf4/0x1c0 [dm_mod]
>   clone_endio+0xf4/0x1c0 [dm_mod]
>   __submit_bio+0x76/0x120
>   submit_bio_noacct_nocheck+0xb6/0x2a0
>   flush_expired_bios+0x28/0x2f [dm_delay]
>   process_one_work+0x1b4/0x300
>   worker_thread+0x45/0x3e0
>   ? rescuer_thread+0x380/0x380
>   kthread+0xc2/0x100
>   ? kthread_complete_and_exit+0x20/0x20
>   ret_from_fork+0x1f/0x30
>   </TASK>
>  Modules linked in: brd dm_delay dm_raid dm_mod af_packet uvesafb cfbfillrect cfbimgblt cn cfbcopyarea fb font fbdev tun autofs4 binfmt_misc configfs ipv6 virtio_rng virtio_balloon rng_core virtio_net pcspkr net_failover failover qemu_fw_cfg button mousedev raid10 raid456 libcrc32c async_raid6_recov async_memcpy async_pq raid6_pq async_xor xor async_tx raid1 raid0 md_mod sd_mod t10_pi crc64_rocksoft crc64 virtio_scsi scsi_mod evdev psmouse bsg scsi_common [last unloaded: brd]
>  CR2: 0000000000000000
>  ---[ end trace 0000000000000000 ]---
> 
> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> Cc: stable@vger.kernel.org
> 
> ---
>  drivers/md/md.c |    6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> Index: linux-2.6/drivers/md/md.c
> ===================================================================
> --- linux-2.6.orig/drivers/md/md.c	2022-11-03 15:29:02.000000000 +0100
> +++ linux-2.6/drivers/md/md.c	2022-11-03 15:33:17.000000000 +0100
> @@ -509,13 +509,14 @@ static void md_end_flush(struct bio *bio
>  	struct md_rdev *rdev = bio->bi_private;
>  	struct mddev *mddev = rdev->mddev;
>  
> +	bio_put(bio);
> +
>  	rdev_dec_pending(rdev, mddev);
>  
>  	if (atomic_dec_and_test(&mddev->flush_pending)) {
>  		/* The pre-request flush has finished */
>  		queue_work(md_wq, &mddev->flush_work);
>  	}
> -	bio_put(bio);
>  }
>  
>  static void md_submit_flush_data(struct work_struct *ws);
> @@ -913,10 +914,11 @@ static void super_written(struct bio *bi
>  	} else
>  		clear_bit(LastDev, &rdev->flags);
>  
> +	bio_put(bio);
> +
>  	if (atomic_dec_and_test(&mddev->pending_writes))
>  		wake_up(&mddev->sb_wait);
>  	rdev_dec_pending(rdev, mddev);
> -	bio_put(bio);
>  }

Thanks. I think this is a clear improvement.
I think it would be a little better if the rdev_dec_pending were also
moved up.
Then both code fragments would be:
  bio_put ; rdev_dec_pending ; atomic_dec_and_test
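
Concretely, super_written would then look something like the sketch
below (the error handling from the quoted hunk is omitted, so this is
only a sketch of the ordering, not a replacement patch):

static void super_written(struct bio *bio)
{
	struct md_rdev *rdev = bio->bi_private;
	struct mddev *mddev = rdev->mddev;

	/* Drop the bio reference while the bio_set is still known to
	 * be alive. */
	bio_put(bio);

	/* Drop the rdev reference before any waiter can run. */
	rdev_dec_pending(rdev, mddev);

	/* Only now let a waiter on sb_wait proceed (and possibly tear
	 * down the bio_set and rdev). */
	if (atomic_dec_and_test(&mddev->pending_writes))
		wake_up(&mddev->sb_wait);
}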

Thanks,
NeilBrown

