* [PATCH] dm-integrity: revert adc0daad366b to fix recalculation
@ 2020-07-22 18:46 Mikulas Patocka
  2020-07-22 19:45 ` Mike Snitzer
  0 siblings, 1 reply; 6+ messages in thread
From: Mikulas Patocka @ 2020-07-22 18:46 UTC
  To: Mike Snitzer, Marian Csontos, Zdenek Kabelac; +Cc: dm-devel

Hi Mike

Please submit this to Linus and to RHEL-8.

Mikulas



From: Mikulas Patocka <mpatocka@redhat.com>

The patch adc0daad366b62ca1bce3e2958a40b0b71a8b8b3 broke recalculation on
dm-integrity. The patch replaces a private variable "suspending" with a
call to "dm_suspended".

The problem is that dm_suspended returns true not only during suspend, but
also during resume. This race condition could occur:
1. dm_integrity_resume calls queue_work(ic->recalc_wq, &ic->recalc_work)
2. integrity_recalc (&ic->recalc_work) preempts the current thread
3. integrity_recalc evaluates if (unlikely(dm_suspended(ic->ti))) goto unlock_ret;
   and the test is still true because the resume has not finished
4. integrity_recalc exits and no recalculation is done.

In order to fix this race condition, we stop using dm_suspended and start
using the variable "suspending" (that is only set during suspend, not
during resume).

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Fixes: adc0daad366b ("dm: report suspended device during destroy")
Cc: stable@vger.kernel.org	# v4.18+

---
 drivers/md/dm-integrity.c |   12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

Index: linux-2.6/drivers/md/dm-integrity.c
===================================================================
--- linux-2.6.orig/drivers/md/dm-integrity.c	2020-06-29 14:49:59.000000000 +0200
+++ linux-2.6/drivers/md/dm-integrity.c	2020-07-22 15:48:49.000000000 +0200
@@ -204,13 +204,12 @@ struct dm_integrity_c {
 	__u8 log2_blocks_per_bitmap_bit;
 
 	unsigned char mode;
+	int suspending;
 
 	int failed;
 
 	struct crypto_shash *internal_hash;
 
-	struct dm_target *ti;
-
 	/* these variables are locked with endio_wait.lock */
 	struct rb_root in_progress;
 	struct list_head wait_list;
@@ -2420,7 +2419,7 @@ static void integrity_writer(struct work
 	unsigned prev_free_sectors;
 
 	/* the following test is not needed, but it tests the replay code */
-	if (unlikely(dm_suspended(ic->ti)) && !ic->meta_dev)
+	if (READ_ONCE(ic->suspending) && !ic->meta_dev)
 		return;
 
 	spin_lock_irq(&ic->endio_wait.lock);
@@ -2481,7 +2480,7 @@ static void integrity_recalc(struct work
 
 next_chunk:
 
-	if (unlikely(dm_suspended(ic->ti)))
+	if (unlikely(READ_ONCE(ic->suspending)))
 		goto unlock_ret;
 
 	range.logical_sector = le64_to_cpu(ic->sb->recalc_sector);
@@ -2909,6 +2908,8 @@ static void dm_integrity_postsuspend(str
 
 	del_timer_sync(&ic->autocommit_timer);
 
+	WRITE_ONCE(ic->suspending, 1);
+
 	if (ic->recalc_wq)
 		drain_workqueue(ic->recalc_wq);
 
@@ -2937,6 +2938,8 @@ static void dm_integrity_postsuspend(str
 #endif
 	}
 
+	WRITE_ONCE(ic->suspending, 0);
+
 	BUG_ON(!RB_EMPTY_ROOT(&ic->in_progress));
 
 	ic->journal_uptodate = true;
@@ -3767,7 +3770,6 @@ static int dm_integrity_ctr(struct dm_ta
 	}
 	ti->private = ic;
 	ti->per_io_data_size = sizeof(struct dm_integrity_io);
-	ic->ti = ti;
 
 	ic->in_progress = RB_ROOT;
 	INIT_LIST_HEAD(&ic->wait_list);
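
The race in the commit message above can be sketched as a small single-threaded user-space model. This is only an illustration under stated assumptions: the struct and the model_* functions are hypothetical and are not kernel code, and the model assumes (as the commit message says) that the "suspended" state is still set while the resume path queues the recalculation work. It shows why a worker that checks the "suspended" state gives up immediately, while a worker that checks a private "suspending" flag proceeds.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
struct model_dev {
	bool suspended;  /* models what dm_suspended() would report */
	bool suspending; /* models the private ic->suspending flag */
};

/*
 * Models the check at the top of the integrity_recalc loop.
 * Returns the number of chunks that were recalculated.
 */
static int model_recalc(const struct model_dev *d, bool use_private_flag,
			int chunks)
{
	int done = 0;

	for (int i = 0; i < chunks; i++) {
		bool stop = use_private_flag ? d->suspending : d->suspended;

		if (stop)
			break; /* corresponds to "goto unlock_ret" */
		done++;
	}
	return done;
}

/*
 * Models resume: the queued worker runs (preempts the resuming thread)
 * before the "suspended" state is cleared. This ordering is the
 * assumption taken from the commit message.
 */
static int model_resume(struct model_dev *d, bool use_private_flag,
			int chunks)
{
	int done = model_recalc(d, use_private_flag, chunks);

	d->suspended = false;
	return done;
}
```

With a device that is suspended but not in the middle of a suspend, model_resume(&d, false, 4) recalculates nothing, whereas model_resume(&d, true, 4) recalculates all four chunks.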


* Re: dm-integrity: revert adc0daad366b to fix recalculation
  2020-07-22 18:46 [PATCH] dm-integrity: revert adc0daad366b to fix recalculation Mikulas Patocka
@ 2020-07-22 19:45 ` Mike Snitzer
  2020-07-22 20:02   ` Mikulas Patocka
  0 siblings, 1 reply; 6+ messages in thread
From: Mike Snitzer @ 2020-07-22 19:45 UTC
  To: Mikulas Patocka; +Cc: dm-devel, Marian Csontos, Zdenek Kabelac

On Wed, Jul 22 2020 at  2:46pm -0400,
Mikulas Patocka <mpatocka@redhat.com> wrote:

> Hi Mike
> 
> Please submit this to Linus and to RHEL-8.
> 
> Mikulas
> 
> 
> 
> From: Mikulas Patocka <mpatocka@redhat.com>
> 
> The patch adc0daad366b62ca1bce3e2958a40b0b71a8b8b3 broke recalculation on
> dm-integrity. The patch replaces a private variable "suspending" with a
> call to "dm_suspended".
> 
> The problem is that dm_suspended returns true not only during suspend, but
> also during resume. This race condition could occur:
> 1. dm_integrity_resume calls queue_work(ic->recalc_wq, &ic->recalc_work)
> 2. integrity_recalc (&ic->recalc_work) preempts the current thread
> 3. integrity_recalc calls if (unlikely(dm_suspended(ic->ti))) goto unlock_ret;
> 4. integrity_recalc exits and no recalculating is done.
> 
> In order to fix this race condition, we stop using dm_suspended and start
> using the variable "suspending" (that is only set during suspend, not
> during resume).
> 
> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> Fixes: adc0daad366b ("dm: report suspended device during destroy")
> Cc: stable@vger.kernel.org	# v4.18+

OK, but why not add a dm_suspending() to DM core?  Could be other
future targets would like this same info right?  I don't see harm in
elevating it.

Mike


* Re: dm-integrity: revert adc0daad366b to fix recalculation
  2020-07-22 19:45 ` Mike Snitzer
@ 2020-07-22 20:02   ` Mikulas Patocka
  2020-07-23 14:42     ` [PATCH v2] " Mikulas Patocka
  0 siblings, 1 reply; 6+ messages in thread
From: Mikulas Patocka @ 2020-07-22 20:02 UTC
  To: Mike Snitzer; +Cc: dm-devel, Marian Csontos, Zdenek Kabelac



On Wed, 22 Jul 2020, Mike Snitzer wrote:

> On Wed, Jul 22 2020 at  2:46pm -0400,
> Mikulas Patocka <mpatocka@redhat.com> wrote:
> 
> > Hi Mike
> > 
> > Please submit this to Linus and to RHEL-8.
> > 
> > Mikulas
> > 
> > 
> > 
> > From: Mikulas Patocka <mpatocka@redhat.com>
> > 
> > The patch adc0daad366b62ca1bce3e2958a40b0b71a8b8b3 broke recalculation on
> > dm-integrity. The patch replaces a private variable "suspending" with a
> > call to "dm_suspended".
> > 
> > The problem is that dm_suspended returns true not only during suspend, but
> > also during resume. This race condition could occur:
> > 1. dm_integrity_resume calls queue_work(ic->recalc_wq, &ic->recalc_work)
> > 2. integrity_recalc (&ic->recalc_work) preempts the current thread
> > 3. integrity_recalc calls if (unlikely(dm_suspended(ic->ti))) goto unlock_ret;
> > 4. integrity_recalc exits and no recalculating is done.
> > 
> > In order to fix this race condition, we stop using dm_suspended and start
> > using the variable "suspending" (that is only set during suspend, not
> > during resume).
> > 
> > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > Fixes: adc0daad366b ("dm: report suspended device during destroy")
> > Cc: stable@vger.kernel.org	# v4.18+
> 
> OK, but why not add a dm_suspending() to DM core?  Could be other
> future targets would like this same info right?  I don't see harm in
> elevating it.
> 
> Mike

Yes - it may be possible to add this.

Mikulas


* [PATCH v2] dm-integrity: revert adc0daad366b to fix recalculation
  2020-07-22 20:02   ` Mikulas Patocka
@ 2020-07-23 14:42     ` Mikulas Patocka
  2020-07-23 18:23       ` Mike Snitzer
  0 siblings, 1 reply; 6+ messages in thread
From: Mikulas Patocka @ 2020-07-23 14:42 UTC
  To: Mike Snitzer; +Cc: dm-devel, Marian Csontos, Zdenek Kabelac



On Wed, 22 Jul 2020, Mikulas Patocka wrote:

> 
> 
> On Wed, 22 Jul 2020, Mike Snitzer wrote:
> 
> > On Wed, Jul 22 2020 at  2:46pm -0400,
> > Mikulas Patocka <mpatocka@redhat.com> wrote:
> > 
> > > Hi Mike
> > > 
> > > Please submit this to Linus and to RHEL-8.
> > > 
> > > Mikulas
> > > 
> > > 
> > > 
> > > From: Mikulas Patocka <mpatocka@redhat.com>
> > > 
> > > The patch adc0daad366b62ca1bce3e2958a40b0b71a8b8b3 broke recalculation on
> > > dm-integrity. The patch replaces a private variable "suspending" with a
> > > call to "dm_suspended".
> > > 
> > > The problem is that dm_suspended returns true not only during suspend, but
> > > also during resume. This race condition could occur:
> > > 1. dm_integrity_resume calls queue_work(ic->recalc_wq, &ic->recalc_work)
> > > 2. integrity_recalc (&ic->recalc_work) preempts the current thread
> > > 3. integrity_recalc calls if (unlikely(dm_suspended(ic->ti))) goto unlock_ret;
> > > 4. integrity_recalc exits and no recalculating is done.
> > > 
> > > In order to fix this race condition, we stop using dm_suspended and start
> > > using the variable "suspending" (that is only set during suspend, not
> > > during resume).
> > > 
> > > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > > Fixes: adc0daad366b ("dm: report suspended device during destroy")
> > > Cc: stable@vger.kernel.org	# v4.18+
> > 
> > OK, but why not add a dm_suspending() to DM core?  Could be other
> > future targets would like this same info right?  I don't see harm in
> > elevating it.
> > 
> > Mike
> 
> Yes - it may be possible to add this.
> 
> Mikulas

From: Mikulas Patocka <mpatocka@redhat.com>

The patch adc0daad366b62ca1bce3e2958a40b0b71a8b8b3 broke recalculation on
dm-integrity. The patch replaces a private variable "suspending" with a
call to "dm_suspended".

The problem is that dm_suspended returns true not only during suspend, but
also during resume. This race condition could occur:
1. dm_integrity_resume calls queue_work(ic->recalc_wq, &ic->recalc_work)
2. integrity_recalc (&ic->recalc_work) preempts the current thread
3. integrity_recalc evaluates if (unlikely(dm_suspended(ic->ti))) goto unlock_ret;
   and the test is still true because the resume has not finished
4. integrity_recalc exits and no recalculation is done.

In order to fix this race condition, we add a function dm_suspending that
is only true during the postsuspend phase and use it instead of
dm_suspended.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Fixes: adc0daad366b ("dm: report suspended device during destroy")
Cc: stable@vger.kernel.org	# v4.18+

Index: rhel8/drivers/md/dm.c
===================================================================
--- rhel8.orig/drivers/md/dm.c
+++ rhel8/drivers/md/dm.c
@@ -140,6 +140,7 @@ EXPORT_SYMBOL_GPL(dm_bio_get_target_bio_
 #define DMF_NOFLUSH_SUSPENDING 5
 #define DMF_DEFERRED_REMOVE 6
 #define DMF_SUSPENDED_INTERNALLY 7
+#define DMF_SUSPENDING 8
 
 #define DM_NUMA_NODE NUMA_NO_NODE
 static int dm_numa_node = DM_NUMA_NODE;
@@ -2379,6 +2380,7 @@ static void __dm_destroy(struct mapped_d
 	if (!dm_suspended_md(md)) {
 		dm_table_presuspend_targets(map);
 		set_bit(DMF_SUSPENDED, &md->flags);
+		set_bit(DMF_SUSPENDING, &md->flags);
 		dm_table_postsuspend_targets(map);
 	}
 	/* dm_put_live_table must be before msleep, otherwise deadlock is possible */
@@ -2701,7 +2703,9 @@ retry:
 	if (r)
 		goto out_unlock;
 
+	set_bit(DMF_SUSPENDING, &md->flags);
 	dm_table_postsuspend_targets(map);
+	clear_bit(DMF_SUSPENDING, &md->flags);
 
 out_unlock:
 	mutex_unlock(&md->suspend_lock);
@@ -2798,7 +2802,9 @@ static void __dm_internal_suspend(struct
 	(void) __dm_suspend(md, map, suspend_flags, TASK_UNINTERRUPTIBLE,
 			    DMF_SUSPENDED_INTERNALLY);
 
+	set_bit(DMF_SUSPENDING, &md->flags);
 	dm_table_postsuspend_targets(map);
+	clear_bit(DMF_SUSPENDING, &md->flags);
 }
 
 static void __dm_internal_resume(struct mapped_device *md)
@@ -2951,6 +2957,11 @@ int dm_suspended_md(struct mapped_device
 	return test_bit(DMF_SUSPENDED, &md->flags);
 }
 
+static int dm_suspending_md(struct mapped_device *md)
+{
+	return test_bit(DMF_SUSPENDING, &md->flags);
+}
+
 int dm_suspended_internally_md(struct mapped_device *md)
 {
 	return test_bit(DMF_SUSPENDED_INTERNALLY, &md->flags);
@@ -2967,6 +2978,12 @@ int dm_suspended(struct dm_target *ti)
 }
 EXPORT_SYMBOL_GPL(dm_suspended);
 
+int dm_suspending(struct dm_target *ti)
+{
+	return dm_suspending_md(dm_table_get_md(ti->table));
+}
+EXPORT_SYMBOL_GPL(dm_suspending);
+
 int dm_noflush_suspending(struct dm_target *ti)
 {
 	return __noflush_suspending(dm_table_get_md(ti->table));
Index: rhel8/drivers/md/dm-integrity.c
===================================================================
--- rhel8.orig/drivers/md/dm-integrity.c
+++ rhel8/drivers/md/dm-integrity.c
@@ -2428,7 +2428,7 @@ static void integrity_writer(struct work
 	unsigned prev_free_sectors;
 
 	/* the following test is not needed, but it tests the replay code */
-	if (unlikely(dm_suspended(ic->ti)) && !ic->meta_dev)
+	if (unlikely(dm_suspending(ic->ti)) && !ic->meta_dev)
 		return;
 
 	spin_lock_irq(&ic->endio_wait.lock);
@@ -2489,7 +2489,7 @@ static void integrity_recalc(struct work
 
 next_chunk:
 
-	if (unlikely(dm_suspended(ic->ti)))
+	if (unlikely(dm_suspending(ic->ti)))
 		goto unlock_ret;
 
 	range.logical_sector = le64_to_cpu(ic->sb->recalc_sector);
Index: rhel8/include/linux/device-mapper.h
===================================================================
--- rhel8.orig/include/linux/device-mapper.h
+++ rhel8/include/linux/device-mapper.h
@@ -426,6 +426,7 @@ const char *dm_device_name(struct mapped
 int dm_copy_name_and_uuid(struct mapped_device *md, char *name, char *uuid);
 struct gendisk *dm_disk(struct mapped_device *md);
 int dm_suspended(struct dm_target *ti);
+int dm_suspending(struct dm_target *ti);
 int dm_noflush_suspending(struct dm_target *ti);
 void dm_accept_partial_bio(struct bio *bio, unsigned n_sectors);
 union map_info *dm_get_rq_mapinfo(struct request *rq);
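
The flag handling that this version adds to DM core can be sketched in user space as follows. This is a minimal model under stated assumptions, not the kernel implementation: all model_* names are hypothetical, the bit helpers are plain non-atomic stand-ins for set_bit/clear_bit/test_bit, and the target hooks are reduced to a single recorded observation. It shows that a flag set only around the postsuspend call is visible to the target during postsuspend but not during a later resume, even though the "suspended" bit is still set at that point.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit numbers; they only need to be distinct. */
enum { MODEL_SUSPENDED = 1, MODEL_SUSPENDING = 8 };

struct model_md {
	unsigned long flags;
	bool seen_in_postsuspend; /* what the target saw in postsuspend */
	bool seen_in_resume;      /* what the target saw in resume */
};

/* Non-atomic stand-ins for the kernel bit operations. */
static void model_set_bit(int nr, unsigned long *flags)
{
	*flags |= 1UL << nr;
}

static void model_clear_bit(int nr, unsigned long *flags)
{
	*flags &= ~(1UL << nr);
}

static bool model_test_bit(int nr, const unsigned long *flags)
{
	return (*flags >> nr) & 1UL;
}

/* Models dm_suspending(): true only while postsuspend is running. */
static bool model_suspending(const struct model_md *md)
{
	return model_test_bit(MODEL_SUSPENDING, &md->flags);
}

static void model_suspend(struct model_md *md)
{
	model_set_bit(MODEL_SUSPENDED, &md->flags);
	model_set_bit(MODEL_SUSPENDING, &md->flags);
	/* the target's postsuspend hook would run here */
	md->seen_in_postsuspend = model_suspending(md);
	model_clear_bit(MODEL_SUSPENDING, &md->flags);
}

static void model_resume(struct model_md *md)
{
	/* the target's resume hook would run here, still "suspended" */
	md->seen_in_resume = model_suspending(md);
	model_clear_bit(MODEL_SUSPENDED, &md->flags);
}
```

After model_suspend() followed by model_resume() on a zero-initialized struct model_md, seen_in_postsuspend is true and seen_in_resume is false, which is the property the recalculation worker relies on.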


* Re: [PATCH v2] dm-integrity: revert adc0daad366b to fix recalculation
  2020-07-23 14:42     ` [PATCH v2] " Mikulas Patocka
@ 2020-07-23 18:23       ` Mike Snitzer
  2020-07-23 18:39         ` Mikulas Patocka
  0 siblings, 1 reply; 6+ messages in thread
From: Mike Snitzer @ 2020-07-23 18:23 UTC
  To: Mikulas Patocka; +Cc: dm-devel, Marian Csontos, Zdenek Kabelac

On Thu, Jul 23 2020 at 10:42am -0400,
Mikulas Patocka <mpatocka@redhat.com> wrote:

> In order to fix this race condition, we add a function dm_suspending that
> is only true during the postsuspend phase and use it instead of
> dm_suspended.
> 

...

> Index: rhel8/drivers/md/dm.c
> ===================================================================
> --- rhel8.orig/drivers/md/dm.c
> +++ rhel8/drivers/md/dm.c
> @@ -140,6 +140,7 @@ EXPORT_SYMBOL_GPL(dm_bio_get_target_bio_
>  #define DMF_NOFLUSH_SUSPENDING 5
>  #define DMF_DEFERRED_REMOVE 6
>  #define DMF_SUSPENDED_INTERNALLY 7
> +#define DMF_SUSPENDING 8

Think I prefer DMF_POST_SUSPENDING.  If you're OK with that I can fix up
the code while I stage your commit.

Thanks,
Mike


* Re: [PATCH v2] dm-integrity: revert adc0daad366b to fix recalculation
  2020-07-23 18:23       ` Mike Snitzer
@ 2020-07-23 18:39         ` Mikulas Patocka
  0 siblings, 0 replies; 6+ messages in thread
From: Mikulas Patocka @ 2020-07-23 18:39 UTC
  To: Mike Snitzer; +Cc: dm-devel, Marian Csontos, Zdenek Kabelac



On Thu, 23 Jul 2020, Mike Snitzer wrote:

> On Thu, Jul 23 2020 at 10:42am -0400,
> Mikulas Patocka <mpatocka@redhat.com> wrote:
> 
> > In order to fix this race condition, we add a function dm_suspending that
> > is only true during the postsuspend phase and use it instead of
> > dm_suspended.
> > 
> 
> ...
> 
> > Index: rhel8/drivers/md/dm.c
> > ===================================================================
> > --- rhel8.orig/drivers/md/dm.c
> > +++ rhel8/drivers/md/dm.c
> > @@ -140,6 +140,7 @@ EXPORT_SYMBOL_GPL(dm_bio_get_target_bio_
> >  #define DMF_NOFLUSH_SUSPENDING 5
> >  #define DMF_DEFERRED_REMOVE 6
> >  #define DMF_SUSPENDED_INTERNALLY 7
> > +#define DMF_SUSPENDING 8
> 
> Think I prefer DMF_POST_SUSPENDING.  If you're OK with that I can fix up
> the code while I stage your commit.
> 
> Thanks,
> Mike

Yes - you can change it.

Mikulas
