From: Dmitry Fomichev <dmitry.fomichev@wdc.com>
To: Jens Axboe <axboe@kernel.dk>,
	fio@vger.kernel.org, Aravind Ramesh <aravind.ramesh@wdc.com>,
	Bart Van Assche <bvanassche@acm.org>,
	Naohiro Aota <naohiro.aota@wdc.com>,
	Niklas Cassel <niklas.cassel@wdc.com>
Cc: Damien Le Moal <damien.lemoal@wdc.com>,
	Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>,
	Dmitry Fomichev <dmitry.fomichev@wdc.com>
Subject: [PATCH v3 10/38] zbd: do not lock conventional zones on I/O adjustment
Date: Thu,  7 Jan 2021 06:57:11 +0900
Message-ID: <20210106215739.264524-11-dmitry.fomichev@wdc.com>
In-Reply-To: <20210106215739.264524-1-dmitry.fomichev@wdc.com>

From: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

When a random workload runs against write pointer zones, I/Os are
adjusted to meet write pointer restrictions. For reads, I/O offsets
are adjusted to point to zones that contain data to read. For writes,
I/O offsets are adjusted to the write pointers of open zones.

However, when a random workload runs in a range that contains both
write pointer zones and conventional zones, I/Os to write pointer
zones can potentially be adjusted to conventional zones. The functions
zbd_find_zone() and zbd_convert_to_open_zone() search for zones
regardless of their type, and therefore they may return conventional
zones. These functions lock the found zone to guard its open status
and write pointer position, but this lock is meaningless for
conventional zones. This unwanted lock of conventional zones has been
observed to cause a deadlock.

Furthermore, zbd_convert_to_open_zone() may add the found conventional
zone to the array of open zones. However, conventional zones should
never be added to the array of open zones, because conventional zones
never take the "implicit open" condition and are not counted as part
of the device open zone management.
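
The rule above can be sketched in isolation as a search that never
selects a conventional zone. This is only an illustration under
simplifying assumptions, not the code in zbd.c: struct zone and
next_openable_zone() are hypothetical names, and only the has_wp and
open fields mirror fields used in the patch.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for fio's struct fio_zone_info. */
struct zone {
	bool has_wp;	/* true for write pointer (sequential) zones */
	bool open;	/* true if the zone is already counted as open */
};

/*
 * Return the index of the next zone after 'start' that may be opened,
 * wrapping around at 'nr_zones'. Conventional zones (has_wp == false)
 * and zones that are already open are skipped. 'start' itself is
 * checked last. Returns -1 if no candidate exists.
 */
int next_openable_zone(const struct zone *zones, int nr_zones, int start)
{
	for (int i = 1; i <= nr_zones; i++) {
		int idx = (start + i) % nr_zones;

		if (!zones[idx].has_wp)
			continue;	/* never open a conventional zone */
		if (zones[idx].open)
			continue;	/* already in the open zone set */
		return idx;
	}
	return -1;
}
```

With two conventional zones followed by one open and one closed
sequential zone, a search starting at zone 0 would skip the first
three candidates and return the closed sequential zone.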

To avoid the deadlock, modify zbd_find_zone() so that it does not lock
a zone when it checks a conventional zone, which has no write pointer.
To avoid both the deadlock and the opening of conventional zones,
modify zbd_convert_to_open_zone() to ignore conventional zones.
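
As a rough illustration of the locking rule for the read path, the
following standalone sketch takes a zone mutex only when the zone has a
write pointer. It is an assumption-laden model, not the patched
function: struct zone, lock_if_wp(), unlock_if_wp() and
find_readable_zone() are hypothetical names, the 'locked' flag exists
only so the behavior can be observed, and 'wp' is assumed to mark the
end of readable data for every zone type.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for fio's struct fio_zone_info. */
struct zone {
	pthread_mutex_t mutex;
	uint64_t start;		/* first byte of the zone */
	uint64_t wp;		/* end of readable data (assumption) */
	bool has_wp;		/* false for conventional zones */
	bool locked;		/* bookkeeping for this sketch only */
};

/* Lock the zone only if it has a write pointer to protect. */
void lock_if_wp(struct zone *z)
{
	if (!z->has_wp)
		return;
	pthread_mutex_lock(&z->mutex);
	z->locked = true;
}

/* Counterpart of lock_if_wp(): a no-op for conventional zones. */
void unlock_if_wp(struct zone *z)
{
	if (!z->has_wp)
		return;
	z->locked = false;
	pthread_mutex_unlock(&z->mutex);
}

/*
 * Scan zones[first..last) for a zone holding at least min_bs bytes of
 * readable data. Returns NULL if none is found. A returned zone is
 * locked only if it has a write pointer.
 */
struct zone *find_readable_zone(struct zone *zones, size_t first,
				size_t last, uint64_t min_bs)
{
	for (size_t i = first; i < last; i++) {
		struct zone *z = &zones[i];

		lock_if_wp(z);
		if (z->start + min_bs <= z->wp)
			return z;
		unlock_if_wp(z);
	}
	return NULL;
}
```

A caller of such a helper has to release the result with the same
conditional unlock, since a returned conventional zone was never
locked.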

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
---
 zbd.c | 31 ++++++++++++++++++++-----------
 1 file changed, 20 insertions(+), 11 deletions(-)

diff --git a/zbd.c b/zbd.c
index c1db215e..84ac6b6f 100644
--- a/zbd.c
+++ b/zbd.c
@@ -1033,7 +1033,8 @@ static uint32_t pick_random_zone_idx(const struct fio_file *f,
 /*
  * Modify the offset of an I/O unit that does not refer to an open zone such
  * that it refers to an open zone. Close an open zone and open a new zone if
- * necessary. This algorithm can only work correctly if all write pointers are
+ * necessary. The open zone is searched across sequential zones.
+ * This algorithm can only work correctly if all write pointers are
  * a multiple of the fio block size. The caller must neither hold z->mutex
  * nor f->zbd_info->mutex. Returns with z->mutex held upon success.
  */
@@ -1159,7 +1160,8 @@ open_other_zone:
 	/* Zone 'z' is full, so try to open a new zone. */
 	for (i = f->io_size / f->zbd_info->zone_size; i > 0; i--) {
 		zone_idx++;
-		zone_unlock(z);
+		if (z->has_wp)
+			zone_unlock(z);
 		z++;
 		if (!is_valid_offset(f, z->start)) {
 			/* Wrap-around. */
@@ -1167,6 +1169,8 @@ open_other_zone:
 			z = get_zone(f, zone_idx);
 		}
 		assert(is_valid_offset(f, z->start));
+		if (!z->has_wp)
+			continue;
 		zone_lock(td, f, z);
 		if (z->open)
 			continue;
@@ -1202,6 +1206,7 @@ out:
 	dprint(FD_ZBD, "%s(%s): returning zone %d\n", __func__, f->file_name,
 	       zone_idx);
 	io_u->offset = z->start;
+	assert(z->has_wp);
 	assert(z->cond != ZBD_ZONE_COND_OFFLINE);
 	return z;
 }
@@ -1228,12 +1233,12 @@ static struct fio_zone_info *zbd_replay_write_order(struct thread_data *td,
 }
 
 /*
- * Find another zone for which @io_u fits below the write pointer. Start
- * searching in zones @zb + 1 .. @zl and continue searching in zones
- * @zf .. @zb - 1.
+ * Find another zone for which @io_u fits in the readable data in the zone.
+ * Search in zones @zb + 1 .. @zl. For random workload, also search in zones
+ * @zb - 1 .. @zf.
  *
- * Either returns NULL or returns a zone pointer and holds the mutex for that
- * zone.
+ * Either returns NULL or returns a zone pointer. When the zone has write
+ * pointer, hold the mutex for the zone.
  */
 static struct fio_zone_info *
 zbd_find_zone(struct thread_data *td, struct io_u *io_u,
@@ -1250,19 +1255,23 @@ zbd_find_zone(struct thread_data *td, struct io_u *io_u,
 	 */
 	for (z1 = zb + 1, z2 = zb - 1; z1 < zl || z2 >= zf; z1++, z2--) {
 		if (z1 < zl && z1->cond != ZBD_ZONE_COND_OFFLINE) {
-			zone_lock(td, f, z1);
+			if (z1->has_wp)
+				zone_lock(td, f, z1);
 			if (z1->start + min_bs <= z1->wp)
 				return z1;
-			zone_unlock(z1);
+			if (z1->has_wp)
+				zone_unlock(z1);
 		} else if (!td_random(td)) {
 			break;
 		}
 		if (td_random(td) && z2 >= zf &&
 		    z2->cond != ZBD_ZONE_COND_OFFLINE) {
-			zone_lock(td, f, z2);
+			if (z2->has_wp)
+				zone_lock(td, f, z2);
 			if (z2->start + min_bs <= z2->wp)
 				return z2;
-			zone_unlock(z2);
+			if (z2->has_wp)
+				zone_unlock(z2);
 		}
 	}
 	dprint(FD_ZBD, "%s: adjusting random read offset failed\n",
-- 
2.28.0


