All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] btrfs: raid56: protect error_bitmap update with spinlock
@ 2023-01-20  8:56 Qu Wenruo
  2023-01-20 13:57 ` Christoph Hellwig
  0 siblings, 1 reply; 3+ messages in thread
From: Qu Wenruo @ 2023-01-20  8:56 UTC (permalink / raw)
  To: linux-btrfs

In the rework of raid56 code, there is very limited concurrency in the
endio context.

Most of works is done inside the sectors arrays, which different bios
will never touch the same sector.

But there is a cocurrency here for error_bitmap. Both read and write
endio functions need to touch them, and we can have multiple write bios
touching the same error bitmap if they all hit some errors.

And since bitmap_set() is not atomic, we need spinlock to protect the
error_bitmap, or we may miss some bits.

This patch will fix it by introducing btrfs_raid_bio::error_bitmap_lock,
and properly hold the spinlock during error_bitmap update.

Fixes: 2942a50dea74 ("btrfs: raid56: introduce btrfs_raid_bio::error_bitmap")
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/raid56.c | 4 ++++
 fs/btrfs/raid56.h | 7 +++++++
 2 files changed, 11 insertions(+)

diff --git a/fs/btrfs/raid56.c b/fs/btrfs/raid56.c
index 65ed4f326fb9..ee1b6b95a10b 100644
--- a/fs/btrfs/raid56.c
+++ b/fs/btrfs/raid56.c
@@ -951,6 +951,7 @@ static struct btrfs_raid_bio *alloc_rbio(struct btrfs_fs_info *fs_info,
 	init_waitqueue_head(&rbio->io_wait);
 	INIT_LIST_HEAD(&rbio->plug_list);
 	spin_lock_init(&rbio->bio_list_lock);
+	spin_lock_init(&rbio->error_bitmap_lock);
 	INIT_LIST_HEAD(&rbio->stripe_cache);
 	INIT_LIST_HEAD(&rbio->hash_list);
 	btrfs_get_bioc(bioc);
@@ -1426,12 +1427,15 @@ static void rbio_update_error_bitmap(struct btrfs_raid_bio *rbio, struct bio *bi
 	u32 bio_size = 0;
 	struct bio_vec *bvec;
 	struct bvec_iter_all iter_all;
+	unsigned long flags;
 
 	bio_for_each_segment_all(bvec, bio, iter_all)
 		bio_size += bvec->bv_len;
 
+	spin_lock_irqsave(&rbio->error_bitmap_lock, flags);
 	bitmap_set(rbio->error_bitmap, total_sector_nr,
 		   bio_size >> rbio->bioc->fs_info->sectorsize_bits);
+	spin_unlock_irqrestore(&rbio->error_bitmap_lock, flags);
 }
 
 /* Verify the data sectors at read time. */
diff --git a/fs/btrfs/raid56.h b/fs/btrfs/raid56.h
index 7c73a443939e..005612f5a312 100644
--- a/fs/btrfs/raid56.h
+++ b/fs/btrfs/raid56.h
@@ -119,6 +119,13 @@ struct btrfs_raid_bio {
 	/* Allocated with real_stripes-many pointers for finish_*() calls */
 	void **finish_pointers;
 
+	/*
+	 * Spinlock for error_bitmap.
+	 * Unlike sector arrays above, error_bitmap can be modified by
+	 * more than one bios at the same time.
+	 */
+	spinlock_t error_bitmap_lock;
+
 	/*
 	 * The bitmap recording where IO errors happened.
 	 * Each bit is corresponding to one sector in either bio_sectors[] or
-- 
2.39.0


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] btrfs: raid56: protect error_bitmap update with spinlock
  2023-01-20  8:56 [PATCH] btrfs: raid56: protect error_bitmap update with spinlock Qu Wenruo
@ 2023-01-20 13:57 ` Christoph Hellwig
  2023-01-20 23:46   ` Qu Wenruo
  0 siblings, 1 reply; 3+ messages in thread
From: Christoph Hellwig @ 2023-01-20 13:57 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

So the reads of the bitmap only happen after all bios have completed
and the reader side really doesn't need the lock.  Having it only
one side looks odd, and on really weakly ordered architecture might
lack all the barriers.  Should the reads of the bitmap also take
the lock for completeness, or can we get away with a big fat comment?

In a way that alsmost asks for just using a set_bit loop in the
completion handlers as that makes the ordering pretty obvious.

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] btrfs: raid56: protect error_bitmap update with spinlock
  2023-01-20 13:57 ` Christoph Hellwig
@ 2023-01-20 23:46   ` Qu Wenruo
  0 siblings, 0 replies; 3+ messages in thread
From: Qu Wenruo @ 2023-01-20 23:46 UTC (permalink / raw)
  To: Christoph Hellwig, Qu Wenruo; +Cc: linux-btrfs



On 2023/1/20 21:57, Christoph Hellwig wrote:
> So the reads of the bitmap only happen after all bios have completed
> and the reader side really doesn't need the lock.  Having it only
> one side looks odd, and on really weakly ordered architecture might
> lack all the barriers.  Should the reads of the bitmap also take
> the lock for completeness, or can we get away with a big fat comment?
> 
> In a way that alsmost asks for just using a set_bit loop in the
> completion handlers as that makes the ordering pretty obvious.

You're right, set_bit() loop looks more common.

Thanks,
Qu

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2023-01-20 23:47 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-01-20  8:56 [PATCH] btrfs: raid56: protect error_bitmap update with spinlock Qu Wenruo
2023-01-20 13:57 ` Christoph Hellwig
2023-01-20 23:46   ` Qu Wenruo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.