* [PATCH] dm-integrity: revert adc0daad366b to fix recalculation
From: Mikulas Patocka @ 2020-07-22 18:46 UTC
  To: Mike Snitzer, Marian Csontos, Zdenek Kabelac; +Cc: dm-devel

Hi Mike

Please submit this to Linus and to RHEL-8.

Mikulas



From: Mikulas Patocka <mpatocka@redhat.com>

The patch adc0daad366b62ca1bce3e2958a40b0b71a8b8b3 broke recalculation on
dm-integrity. The patch replaces a private variable "suspending" with a
call to "dm_suspended".

The problem is that dm_suspended returns true not only during suspend, but
also during resume. This race condition could occur:
1. dm_integrity_resume calls queue_work(ic->recalc_wq, &ic->recalc_work)
2. integrity_recalc (&ic->recalc_work) preempts the current thread
3. integrity_recalc calls if (unlikely(dm_suspended(ic->ti))) goto unlock_ret;
4. integrity_recalc exits and no recalculating is done.

In order to fix this race condition, we stop using dm_suspended and start
using the variable "suspending" (that is only set during suspend, not
during resume).

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Fixes: adc0daad366b ("dm: report suspended device during destroy")
Cc: stable@vger.kernel.org # v4.18+

---
 drivers/md/dm-integrity.c |   12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

Index: linux-2.6/drivers/md/dm-integrity.c
===================================================================
--- linux-2.6.orig/drivers/md/dm-integrity.c	2020-06-29 14:49:59.000000000 +0200
+++ linux-2.6/drivers/md/dm-integrity.c	2020-07-22 15:48:49.000000000 +0200
@@ -204,13 +204,12 @@ struct dm_integrity_c {
 	__u8 log2_blocks_per_bitmap_bit;
 
 	unsigned char mode;
+	int suspending;
 
 	int failed;
 
 	struct crypto_shash *internal_hash;
 
-	struct dm_target *ti;
-
 	/* these variables are locked with endio_wait.lock */
 	struct rb_root in_progress;
 	struct list_head wait_list;
@@ -2420,7 +2419,7 @@ static void integrity_writer(struct work
 	unsigned prev_free_sectors;
 
 	/* the following test is not needed, but it tests the replay code */
-	if (unlikely(dm_suspended(ic->ti)) && !ic->meta_dev)
+	if (READ_ONCE(ic->suspending) && !ic->meta_dev)
 		return;
 
 	spin_lock_irq(&ic->endio_wait.lock);
@@ -2481,7 +2480,7 @@ static void integrity_recalc(struct work
 
 next_chunk:
 
-	if (unlikely(dm_suspended(ic->ti)))
+	if (unlikely(READ_ONCE(ic->suspending)))
 		goto unlock_ret;
 
 	range.logical_sector = le64_to_cpu(ic->sb->recalc_sector);
@@ -2909,6 +2908,8 @@ static void dm_integrity_postsuspend(str
 
 	del_timer_sync(&ic->autocommit_timer);
 
+	WRITE_ONCE(ic->suspending, 1);
+
 	if (ic->recalc_wq)
 		drain_workqueue(ic->recalc_wq);
 
@@ -2937,6 +2938,8 @@ static void dm_integrity_postsuspend(str
 #endif
 	}
 
+	WRITE_ONCE(ic->suspending, 0);
+
 	BUG_ON(!RB_EMPTY_ROOT(&ic->in_progress));
 
 	ic->journal_uptodate = true;
@@ -3767,7 +3770,6 @@ static int dm_integrity_ctr(struct dm_ta
 	}
 	ti->private = ic;
 	ti->per_io_data_size = sizeof(struct dm_integrity_io);
-	ic->ti = ti;
 
 	ic->in_progress = RB_ROOT;
 	INIT_LIST_HEAD(&ic->wait_list);
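
The race described in the commit message can be modelled in userspace. The following is only a sketch under stated assumptions: every name in it (model_dev, model_recalc_worker, model_suspend_resume_cycle) is hypothetical and not a kernel API, and the queued work is simply called at the worst possible moment rather than scheduled on a real workqueue. It shows that a worker which tests the core "suspended" state bails out when started from the resume hook, while a worker which tests a flag set only during postsuspend goes on to recalculate.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct model_dev {
	bool core_suspended;	/* models what dm_suspended() reports */
	bool suspending;	/* models the private flag, set only in postsuspend */
	bool recalc_ran;	/* set when the worker gets past its check */
};

/* Models integrity_recalc(): give up if the chosen flag says "stop". */
static void model_recalc_worker(struct model_dev *d, bool use_private_flag)
{
	bool stop = use_private_flag ? d->suspending : d->core_suspended;

	if (stop)
		return;
	d->recalc_ran = true;
}

/*
 * Runs one suspend/resume cycle and returns whether recalculation ran
 * after resume. The worker is invoked directly from the "resume hook",
 * which models step 2 of the race (the work item preempts the resume path).
 */
static bool model_suspend_resume_cycle(bool use_private_flag)
{
	struct model_dev d = { false, false, false };

	/* Suspend: the core state is set first, then postsuspend runs. */
	d.core_suspended = true;
	d.suspending = true;
	model_recalc_worker(&d, use_private_flag);
	assert(!d.recalc_ran);	/* both variants must stop during postsuspend */
	d.suspending = false;

	/* Resume: the hook queues work while the core still reports suspended. */
	model_recalc_worker(&d, use_private_flag);
	d.core_suspended = false;

	return d.recalc_ran;
}
```

Calling model_suspend_resume_cycle(false) models the behaviour after adc0daad366b (the worker exits without recalculating), and model_suspend_resume_cycle(true) models the behaviour with the private flag restored.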
* Re: dm-integrity: revert adc0daad366b to fix recalculation
From: Mike Snitzer @ 2020-07-22 19:45 UTC
  To: Mikulas Patocka; +Cc: dm-devel, Marian Csontos, Zdenek Kabelac

On Wed, Jul 22 2020 at 2:46pm -0400,
Mikulas Patocka <mpatocka@redhat.com> wrote:

> Hi Mike
>
> Please submit this to Linus and to RHEL-8.
>
> Mikulas
>
>
>
> From: Mikulas Patocka <mpatocka@redhat.com>
>
> The patch adc0daad366b62ca1bce3e2958a40b0b71a8b8b3 broke recalculation on
> dm-integrity. The patch replaces a private variable "suspending" with a
> call to "dm_suspended".
>
> The problem is that dm_suspended returns true not only during suspend, but
> also during resume. This race condition could occur:
> 1. dm_integrity_resume calls queue_work(ic->recalc_wq, &ic->recalc_work)
> 2. integrity_recalc (&ic->recalc_work) preempts the current thread
> 3. integrity_recalc calls if (unlikely(dm_suspended(ic->ti))) goto unlock_ret;
> 4. integrity_recalc exits and no recalculating is done.
>
> In order to fix this race condition, we stop using dm_suspended and start
> using the variable "suspending" (that is only set during suspend, not
> during resume).
>
> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> Fixes: adc0daad366b ("dm: report suspended device during destroy")
> Cc: stable@vger.kernel.org # v4.18+

OK, but why not add a dm_suspending() to DM core? Could be other
future targets would like this same info right? I don't see harm in
elevating it.

Mike
* Re: dm-integrity: revert adc0daad366b to fix recalculation
From: Mikulas Patocka @ 2020-07-22 20:02 UTC
  To: Mike Snitzer; +Cc: dm-devel, Marian Csontos, Zdenek Kabelac

On Wed, 22 Jul 2020, Mike Snitzer wrote:

> On Wed, Jul 22 2020 at 2:46pm -0400,
> Mikulas Patocka <mpatocka@redhat.com> wrote:
>
> > Hi Mike
> >
> > Please submit this to Linus and to RHEL-8.
> >
> > Mikulas
> >
> >
> >
> > From: Mikulas Patocka <mpatocka@redhat.com>
> >
> > The patch adc0daad366b62ca1bce3e2958a40b0b71a8b8b3 broke recalculation on
> > dm-integrity. The patch replaces a private variable "suspending" with a
> > call to "dm_suspended".
> >
> > The problem is that dm_suspended returns true not only during suspend, but
> > also during resume. This race condition could occur:
> > 1. dm_integrity_resume calls queue_work(ic->recalc_wq, &ic->recalc_work)
> > 2. integrity_recalc (&ic->recalc_work) preempts the current thread
> > 3. integrity_recalc calls if (unlikely(dm_suspended(ic->ti))) goto unlock_ret;
> > 4. integrity_recalc exits and no recalculating is done.
> >
> > In order to fix this race condition, we stop using dm_suspended and start
> > using the variable "suspending" (that is only set during suspend, not
> > during resume).
> >
> > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > Fixes: adc0daad366b ("dm: report suspended device during destroy")
> > Cc: stable@vger.kernel.org # v4.18+
>
> OK, but why not add a dm_suspending() to DM core? Could be other
> future targets would like this same info right? I don't see harm in
> elevating it.
>
> Mike

Yes - it may be possible to add this.

Mikulas
* [PATCH v2] dm-integrity: revert adc0daad366b to fix recalculation
From: Mikulas Patocka @ 2020-07-23 14:42 UTC
  To: Mike Snitzer; +Cc: dm-devel, Marian Csontos, Zdenek Kabelac

On Wed, 22 Jul 2020, Mikulas Patocka wrote:

>
>
> On Wed, 22 Jul 2020, Mike Snitzer wrote:
>
> > On Wed, Jul 22 2020 at 2:46pm -0400,
> > Mikulas Patocka <mpatocka@redhat.com> wrote:
> >
> > > Hi Mike
> > >
> > > Please submit this to Linus and to RHEL-8.
> > >
> > > Mikulas
> > >
> > >
> > >
> > > From: Mikulas Patocka <mpatocka@redhat.com>
> > >
> > > The patch adc0daad366b62ca1bce3e2958a40b0b71a8b8b3 broke recalculation on
> > > dm-integrity. The patch replaces a private variable "suspending" with a
> > > call to "dm_suspended".
> > >
> > > The problem is that dm_suspended returns true not only during suspend, but
> > > also during resume. This race condition could occur:
> > > 1. dm_integrity_resume calls queue_work(ic->recalc_wq, &ic->recalc_work)
> > > 2. integrity_recalc (&ic->recalc_work) preempts the current thread
> > > 3. integrity_recalc calls if (unlikely(dm_suspended(ic->ti))) goto unlock_ret;
> > > 4. integrity_recalc exits and no recalculating is done.
> > >
> > > In order to fix this race condition, we stop using dm_suspended and start
> > > using the variable "suspending" (that is only set during suspend, not
> > > during resume).
> > >
> > > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > > Fixes: adc0daad366b ("dm: report suspended device during destroy")
> > > Cc: stable@vger.kernel.org # v4.18+
> >
> > OK, but why not add a dm_suspending() to DM core? Could be other
> > future targets would like this same info right? I don't see harm in
> > elevating it.
> >
> > Mike
>
> Yes - it may be possible to add this.
>
> Mikulas

From: Mikulas Patocka <mpatocka@redhat.com>

The patch adc0daad366b62ca1bce3e2958a40b0b71a8b8b3 broke recalculation on
dm-integrity. The patch replaces a private variable "suspending" with a
call to "dm_suspended".

The problem is that dm_suspended returns true not only during suspend, but
also during resume. This race condition could occur:
1. dm_integrity_resume calls queue_work(ic->recalc_wq, &ic->recalc_work)
2. integrity_recalc (&ic->recalc_work) preempts the current thread
3. integrity_recalc calls if (unlikely(dm_suspended(ic->ti))) goto unlock_ret;
4. integrity_recalc exits and no recalculating is done.

In order to fix this race condition, we add a function dm_suspending that
is only true during the postsuspend phase and use it instead of
dm_suspended.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Fixes: adc0daad366b ("dm: report suspended device during destroy")
Cc: stable@vger.kernel.org # v4.18+

Index: rhel8/drivers/md/dm.c
===================================================================
--- rhel8.orig/drivers/md/dm.c
+++ rhel8/drivers/md/dm.c
@@ -140,6 +140,7 @@ EXPORT_SYMBOL_GPL(dm_bio_get_target_bio_
 #define DMF_NOFLUSH_SUSPENDING 5
 #define DMF_DEFERRED_REMOVE 6
 #define DMF_SUSPENDED_INTERNALLY 7
+#define DMF_SUSPENDING 8
 
 #define DM_NUMA_NODE NUMA_NO_NODE
 static int dm_numa_node = DM_NUMA_NODE;
@@ -2379,6 +2380,7 @@ static void __dm_destroy(struct mapped_d
 	if (!dm_suspended_md(md)) {
 		dm_table_presuspend_targets(map);
 		set_bit(DMF_SUSPENDED, &md->flags);
+		set_bit(DMF_SUSPENDING, &md->flags);
 		dm_table_postsuspend_targets(map);
 	}
 	/* dm_put_live_table must be before msleep, otherwise deadlock is possible */
@@ -2701,7 +2703,9 @@ retry:
 	if (r)
 		goto out_unlock;
 
+	set_bit(DMF_SUSPENDING, &md->flags);
 	dm_table_postsuspend_targets(map);
+	clear_bit(DMF_SUSPENDING, &md->flags);
 
 out_unlock:
 	mutex_unlock(&md->suspend_lock);
@@ -2798,7 +2802,9 @@ static void __dm_internal_suspend(struct
 	(void) __dm_suspend(md, map, suspend_flags, TASK_UNINTERRUPTIBLE,
 			    DMF_SUSPENDED_INTERNALLY);
 
+	set_bit(DMF_SUSPENDING, &md->flags);
 	dm_table_postsuspend_targets(map);
+	clear_bit(DMF_SUSPENDING, &md->flags);
 }
 
 static void __dm_internal_resume(struct mapped_device *md)
@@ -2951,6 +2957,11 @@ int dm_suspended_md(struct mapped_device
 	return test_bit(DMF_SUSPENDED, &md->flags);
 }
 
+static int dm_suspending_md(struct mapped_device *md)
+{
+	return test_bit(DMF_SUSPENDING, &md->flags);
+}
+
 int dm_suspended_internally_md(struct mapped_device *md)
 {
 	return test_bit(DMF_SUSPENDED_INTERNALLY, &md->flags);
@@ -2967,6 +2978,12 @@ int dm_suspended(struct dm_target *ti)
 }
 EXPORT_SYMBOL_GPL(dm_suspended);
 
+int dm_suspending(struct dm_target *ti)
+{
+	return dm_suspending_md(dm_table_get_md(ti->table));
+}
+EXPORT_SYMBOL_GPL(dm_suspending);
+
 int dm_noflush_suspending(struct dm_target *ti)
 {
 	return __noflush_suspending(dm_table_get_md(ti->table));
Index: rhel8/drivers/md/dm-integrity.c
===================================================================
--- rhel8.orig/drivers/md/dm-integrity.c
+++ rhel8/drivers/md/dm-integrity.c
@@ -2428,7 +2428,7 @@ static void integrity_writer(struct work
 	unsigned prev_free_sectors;
 
 	/* the following test is not needed, but it tests the replay code */
-	if (unlikely(dm_suspended(ic->ti)) && !ic->meta_dev)
+	if (unlikely(dm_suspending(ic->ti)) && !ic->meta_dev)
 		return;
 
 	spin_lock_irq(&ic->endio_wait.lock);
@@ -2489,7 +2489,7 @@ static void integrity_recalc(struct work
 
 next_chunk:
 
-	if (unlikely(dm_suspended(ic->ti)))
+	if (unlikely(dm_suspending(ic->ti)))
 		goto unlock_ret;
 
 	range.logical_sector = le64_to_cpu(ic->sb->recalc_sector);
Index: rhel8/include/linux/device-mapper.h
===================================================================
--- rhel8.orig/include/linux/device-mapper.h
+++ rhel8/include/linux/device-mapper.h
@@ -426,6 +426,7 @@ const char *dm_device_name(struct mapped
 int dm_copy_name_and_uuid(struct mapped_device *md, char *name, char *uuid);
 struct gendisk *dm_disk(struct mapped_device *md);
 int dm_suspended(struct dm_target *ti);
+int dm_suspending(struct dm_target *ti);
 int dm_noflush_suspending(struct dm_target *ti);
 void dm_accept_partial_bio(struct bio *bio, unsigned n_sectors);
 union map_info *dm_get_rq_mapinfo(struct request *rq);
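
The flag bracketing that v2 adds to DM core can be illustrated with a small userspace model. This is a sketch only, not the kernel implementation: all model_* names are hypothetical, the bit helpers are plain non-atomic stand-ins for set_bit/clear_bit/test_bit, and MODEL_DMF_SUSPENDED uses a placeholder bit number (only the value 8 for the suspending bit is taken from the patch). The assumption that target resume hooks run while the suspended bit is still set follows the commit message, which says dm_suspended returns true during resume.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit numbers: 8 mirrors DMF_SUSPENDING in the patch; 1 is a placeholder. */
enum { MODEL_DMF_SUSPENDED = 1, MODEL_DMF_SUSPENDING = 8 };

struct model_md { unsigned long flags; };

/* Non-atomic stand-ins for the kernel bit operations. */
static unsigned long model_mask(int nr)
{
	assert(nr >= 0 && nr < 32);
	return 1UL << nr;
}
static void model_set_bit(int nr, unsigned long *w)   { *w |= model_mask(nr); }
static void model_clear_bit(int nr, unsigned long *w) { *w &= ~model_mask(nr); }
static bool model_test_bit(int nr, const unsigned long *w)
{
	return (*w & model_mask(nr)) != 0;
}

static bool model_dm_suspended(const struct model_md *md)
{
	return model_test_bit(MODEL_DMF_SUSPENDED, &md->flags);
}
static bool model_dm_suspending(const struct model_md *md)
{
	return model_test_bit(MODEL_DMF_SUSPENDING, &md->flags);
}

typedef void (*model_hook)(struct model_md *md, void *ctx);

/* The suspending bit is set only around the postsuspend hook. */
static void model_suspend(struct model_md *md, model_hook postsuspend, void *ctx)
{
	model_set_bit(MODEL_DMF_SUSPENDED, &md->flags);
	model_set_bit(MODEL_DMF_SUSPENDING, &md->flags);
	postsuspend(md, ctx);
	model_clear_bit(MODEL_DMF_SUSPENDING, &md->flags);
}

/* The resume hook runs before the suspended bit is cleared. */
static void model_resume(struct model_md *md, model_hook resume, void *ctx)
{
	resume(md, ctx);
	model_clear_bit(MODEL_DMF_SUSPENDED, &md->flags);
}

/* Records what a target would observe from inside a hook. */
struct model_sample { bool suspended; bool suspending; };

static void model_record(struct model_md *md, void *ctx)
{
	struct model_sample *s = ctx;

	s->suspended = model_dm_suspended(md);
	s->suspending = model_dm_suspending(md);
}
```

Passing model_record as the hook to model_suspend and then to model_resume shows the distinction the patch relies on: both queries are true inside postsuspend, but only the "suspended" query is still true inside resume.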
* Re: [PATCH v2] dm-integrity: revert adc0daad366b to fix recalculation
From: Mike Snitzer @ 2020-07-23 18:23 UTC
  To: Mikulas Patocka; +Cc: dm-devel, Marian Csontos, Zdenek Kabelac

On Thu, Jul 23 2020 at 10:42am -0400,
Mikulas Patocka <mpatocka@redhat.com> wrote:

> In order to fix this race condition, we add a function dm_suspending that
> is only true during the postsuspend phase and use it instead of
> dm_suspended.
>

...

> Index: rhel8/drivers/md/dm.c
> ===================================================================
> --- rhel8.orig/drivers/md/dm.c
> +++ rhel8/drivers/md/dm.c
> @@ -140,6 +140,7 @@ EXPORT_SYMBOL_GPL(dm_bio_get_target_bio_
>  #define DMF_NOFLUSH_SUSPENDING 5
>  #define DMF_DEFERRED_REMOVE 6
>  #define DMF_SUSPENDED_INTERNALLY 7
> +#define DMF_SUSPENDING 8

Think I prefer DMF_POST_SUSPENDING. If you're OK with that I can fix up
the code while I stage your commit.

Thanks,
Mike
* Re: [PATCH v2] dm-integrity: revert adc0daad366b to fix recalculation
From: Mikulas Patocka @ 2020-07-23 18:39 UTC
  To: Mike Snitzer; +Cc: dm-devel, Marian Csontos, Zdenek Kabelac

On Thu, 23 Jul 2020, Mike Snitzer wrote:

> On Thu, Jul 23 2020 at 10:42am -0400,
> Mikulas Patocka <mpatocka@redhat.com> wrote:
>
> > In order to fix this race condition, we add a function dm_suspending that
> > is only true during the postsuspend phase and use it instead of
> > dm_suspended.
> >
>
> ...
>
> > Index: rhel8/drivers/md/dm.c
> > ===================================================================
> > --- rhel8.orig/drivers/md/dm.c
> > +++ rhel8/drivers/md/dm.c
> > @@ -140,6 +140,7 @@ EXPORT_SYMBOL_GPL(dm_bio_get_target_bio_
> >  #define DMF_NOFLUSH_SUSPENDING 5
> >  #define DMF_DEFERRED_REMOVE 6
> >  #define DMF_SUSPENDED_INTERNALLY 7
> > +#define DMF_SUSPENDING 8
>
> Think I prefer DMF_POST_SUSPENDING. If you're OK with that I can fix up
> the code while I stage your commit.
>
> Thanks,
> Mike

Yes - you can change it.

Mikulas