* [PATCH v2 00/16] lightnvm: next set of improvements for 5.2
@ 2019-03-22 14:48 Igor Konopko
  2019-03-22 14:48 ` [PATCH v2 01/16] lightnvm: pblk: warn when there are opened chunks Igor Konopko
                   ` (15 more replies)
  0 siblings, 16 replies; 30+ messages in thread
From: Igor Konopko @ 2019-03-22 14:48 UTC (permalink / raw)
  To: mb, javier, hans.holmberg; +Cc: linux-block, igor.j.konopko

This is another set of fixes and improvements to both pblk and the
lightnvm core.

The first patch is a leftover from the previous patchset, since we
decided to reorganize it a little. The second and third patches are the
most crucial, since they change the approach to the partial read path,
so a detailed review is especially needed there.

The remaining patches are further findings related to bugs or potential
improvements, mostly in corner cases.

Changes v1 -> v2:
-dropped some unneeded patches
-incorporated review feedback into some of the patches
-split the partial read path changes into two patches

Igor Konopko (16):
  lightnvm: pblk: warn when there are opened chunks
  lightnvm: pblk: IO path reorganization
  lightnvm: pblk: simplify partial read path
  lightnvm: pblk: OOB recovery for closed chunks fix
  lightnvm: pblk: propagate errors when reading meta
  lightnvm: pblk: recover only written metadata
  lightnvm: pblk: wait for inflight IOs in recovery
  lightnvm: pblk: remove internal IO timeout
  lightnvm: pblk: fix spin_unlock order
  lightnvm: pblk: kick writer on write recovery path
  lightnvm: pblk: fix update line wp in OOB recovery
  lightnvm: pblk: do not read OOB from emeta region
  lightnvm: pblk: store multiple copies of smeta
  lightnvm: pblk: GC error handling
  lightnvm: pblk: use nvm_rq_to_ppa_list()
  lightnvm: track inflight target creations

 drivers/lightnvm/core.c          |  19 ++-
 drivers/lightnvm/pblk-cache.c    |   7 +-
 drivers/lightnvm/pblk-core.c     | 179 +++++++++++++++++----
 drivers/lightnvm/pblk-gc.c       |   5 +-
 drivers/lightnvm/pblk-init.c     |  59 ++++---
 drivers/lightnvm/pblk-rb.c       |   2 +-
 drivers/lightnvm/pblk-read.c     | 336 +++++++++++----------------------------
 drivers/lightnvm/pblk-recovery.c | 103 ++++++++----
 drivers/lightnvm/pblk-rl.c       |   3 +-
 drivers/lightnvm/pblk-write.c    |   1 +
 drivers/lightnvm/pblk.h          |  31 ++--
 include/linux/lightnvm.h         |   2 +
 12 files changed, 391 insertions(+), 356 deletions(-)

-- 
2.9.5



* [PATCH v2 01/16] lightnvm: pblk: warn when there are opened chunks
  2019-03-22 14:48 [PATCH v2 00/16] lightnvm: next set of improvements for 5.2 Igor Konopko
@ 2019-03-22 14:48 ` Igor Konopko
  2019-03-25 11:32   ` Matias Bjørling
  2019-03-22 14:48 ` [PATCH v2 02/16] lightnvm: pblk: IO path reorganization Igor Konopko
                   ` (14 subsequent siblings)
  15 siblings, 1 reply; 30+ messages in thread
From: Igor Konopko @ 2019-03-22 14:48 UTC (permalink / raw)
  To: mb, javier, hans.holmberg; +Cc: linux-block, igor.j.konopko

In case of a factory pblk init, we might encounter a situation where
some chunks are still open. Based on the OCSSD spec, we are not allowed
to transition chunks directly from the open state to the free state, so
such a reset can lead to an IO error (even though most controllers will
allow such an operation anyway), so we want to at least warn users
about this situation.

Signed-off-by: Igor Konopko <igor.j.konopko@intel.com>
Reviewed-by: Javier González <javier@javigon.com>
---
 drivers/lightnvm/pblk-init.c | 22 +++++++++++++++-------
 1 file changed, 15 insertions(+), 7 deletions(-)

diff --git a/drivers/lightnvm/pblk-init.c b/drivers/lightnvm/pblk-init.c
index 1e227a0..e2e1193 100644
--- a/drivers/lightnvm/pblk-init.c
+++ b/drivers/lightnvm/pblk-init.c
@@ -712,7 +712,7 @@ static int pblk_set_provision(struct pblk *pblk, int nr_free_chks)
 }
 
 static int pblk_setup_line_meta_chk(struct pblk *pblk, struct pblk_line *line,
-				   struct nvm_chk_meta *meta)
+				   struct nvm_chk_meta *meta, int *opened)
 {
 	struct nvm_tgt_dev *dev = pblk->dev;
 	struct nvm_geo *geo = &dev->geo;
@@ -748,6 +748,9 @@ static int pblk_setup_line_meta_chk(struct pblk *pblk, struct pblk_line *line,
 			continue;
 		}
 
+		if (chunk->state & NVM_CHK_ST_OPEN)
+			(*opened)++;
+
 		if (!(chunk->state & NVM_CHK_ST_OFFLINE))
 			continue;
 
@@ -759,7 +762,7 @@ static int pblk_setup_line_meta_chk(struct pblk *pblk, struct pblk_line *line,
 }
 
 static long pblk_setup_line_meta(struct pblk *pblk, struct pblk_line *line,
-				 void *chunk_meta, int line_id)
+				 void *chunk_meta, int line_id, int *opened)
 {
 	struct pblk_line_mgmt *l_mg = &pblk->l_mg;
 	struct pblk_line_meta *lm = &pblk->lm;
@@ -773,7 +776,7 @@ static long pblk_setup_line_meta(struct pblk *pblk, struct pblk_line *line,
 	line->vsc = &l_mg->vsc_list[line_id];
 	spin_lock_init(&line->lock);
 
-	nr_bad_chks = pblk_setup_line_meta_chk(pblk, line, chunk_meta);
+	nr_bad_chks = pblk_setup_line_meta_chk(pblk, line, chunk_meta, opened);
 
 	chk_in_line = lm->blk_per_line - nr_bad_chks;
 	if (nr_bad_chks < 0 || nr_bad_chks > lm->blk_per_line ||
@@ -1019,12 +1022,12 @@ static int pblk_line_meta_init(struct pblk *pblk)
 	return 0;
 }
 
-static int pblk_lines_init(struct pblk *pblk)
+static int pblk_lines_init(struct pblk *pblk, bool factory)
 {
 	struct pblk_line_mgmt *l_mg = &pblk->l_mg;
 	struct pblk_line *line;
 	void *chunk_meta;
-	int nr_free_chks = 0;
+	int nr_free_chks = 0, nr_opened_chks = 0;
 	int i, ret;
 
 	ret = pblk_line_meta_init(pblk);
@@ -1059,7 +1062,8 @@ static int pblk_lines_init(struct pblk *pblk)
 		if (ret)
 			goto fail_free_lines;
 
-		nr_free_chks += pblk_setup_line_meta(pblk, line, chunk_meta, i);
+		nr_free_chks += pblk_setup_line_meta(pblk, line, chunk_meta, i,
+							&nr_opened_chks);
 
 		trace_pblk_line_state(pblk_disk_name(pblk), line->id,
 								line->state);
@@ -1071,6 +1075,10 @@ static int pblk_lines_init(struct pblk *pblk)
 		goto fail_free_lines;
 	}
 
+	if (factory && nr_opened_chks)
+		pblk_warn(pblk, "%d opened chunks during factory creation\n",
+				nr_opened_chks);
+
 	ret = pblk_set_provision(pblk, nr_free_chks);
 	if (ret)
 		goto fail_free_lines;
@@ -1235,7 +1243,7 @@ static void *pblk_init(struct nvm_tgt_dev *dev, struct gendisk *tdisk,
 		goto fail;
 	}
 
-	ret = pblk_lines_init(pblk);
+	ret = pblk_lines_init(pblk, flags & NVM_TARGET_FACTORY);
 	if (ret) {
 		pblk_err(pblk, "could not initialize lines\n");
 		goto fail_free_core;
-- 
2.9.5



* [PATCH v2 02/16] lightnvm: pblk: IO path reorganization
  2019-03-22 14:48 [PATCH v2 00/16] lightnvm: next set of improvements for 5.2 Igor Konopko
  2019-03-22 14:48 ` [PATCH v2 01/16] lightnvm: pblk: warn when there are opened chunks Igor Konopko
@ 2019-03-22 14:48 ` Igor Konopko
  2019-03-25  5:55   ` Javier González
  2019-03-22 14:48 ` [PATCH v2 03/16] lightnvm: pblk: simplify partial read path Igor Konopko
                   ` (13 subsequent siblings)
  15 siblings, 1 reply; 30+ messages in thread
From: Igor Konopko @ 2019-03-22 14:48 UTC (permalink / raw)
  To: mb, javier, hans.holmberg; +Cc: linux-block, igor.j.konopko

This patch prepares the read path for a new approach to partial read
handling (which is needed for multi-page bvec handling). The most
important change is to move the handling of the completed and failed
bio from pblk_make_rq() into the particular read and write functions.
This is needed since, after the partial read path changes, the
completed/failed bio will sometimes differ from the original one, so we
can no longer do this in pblk_make_rq(). The other changes are small
read path refactors intended to reduce the size of the following patch
with the partial read changes. The goal of this patch is not to change
functionality, but only to prepare the code for the changes that
follow.
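
A condensed sketch of the resulting pblk_make_rq() flow (based on the
diff below; discard handling and bio splitting are omitted, and the
function name here is only illustrative):

  static blk_qc_t make_rq_sketch(struct request_queue *q, struct bio *bio)
  {
          struct pblk *pblk = q->queuedata;

          if (bio_data_dir(bio) == READ)
                  pblk_submit_read(pblk, bio);    /* completes bio itself */
          else
                  pblk_write_to_cache(pblk, bio, PBLK_IOTYPE_USER);

          /* no NVM_IO_ERR/NVM_IO_DONE status left to translate here */
          return BLK_QC_T_NONE;
  }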

Signed-off-by: Igor Konopko <igor.j.konopko@intel.com>
---
 drivers/lightnvm/pblk-cache.c |  7 ++--
 drivers/lightnvm/pblk-init.c  | 14 ++------
 drivers/lightnvm/pblk-read.c  | 83 ++++++++++++++++++++-----------------------
 drivers/lightnvm/pblk.h       |  4 +--
 4 files changed, 47 insertions(+), 61 deletions(-)

diff --git a/drivers/lightnvm/pblk-cache.c b/drivers/lightnvm/pblk-cache.c
index c9fa26f..d2a3683 100644
--- a/drivers/lightnvm/pblk-cache.c
+++ b/drivers/lightnvm/pblk-cache.c
@@ -18,7 +18,7 @@
 
 #include "pblk.h"
 
-int pblk_write_to_cache(struct pblk *pblk, struct bio *bio, unsigned long flags)
+void pblk_write_to_cache(struct pblk *pblk, struct bio *bio, unsigned long flags)
 {
 	struct request_queue *q = pblk->dev->q;
 	struct pblk_w_ctx w_ctx;
@@ -43,6 +43,7 @@ int pblk_write_to_cache(struct pblk *pblk, struct bio *bio, unsigned long flags)
 		goto retry;
 	case NVM_IO_ERR:
 		pblk_pipeline_stop(pblk);
+		bio_io_error(bio);
 		goto out;
 	}
 
@@ -79,7 +80,9 @@ int pblk_write_to_cache(struct pblk *pblk, struct bio *bio, unsigned long flags)
 out:
 	generic_end_io_acct(q, REQ_OP_WRITE, &pblk->disk->part0, start_time);
 	pblk_write_should_kick(pblk);
-	return ret;
+
+	if (ret == NVM_IO_DONE)
+		bio_endio(bio);
 }
 
 /*
diff --git a/drivers/lightnvm/pblk-init.c b/drivers/lightnvm/pblk-init.c
index e2e1193..4211cd1 100644
--- a/drivers/lightnvm/pblk-init.c
+++ b/drivers/lightnvm/pblk-init.c
@@ -50,7 +50,6 @@ struct bio_set pblk_bio_set;
 static blk_qc_t pblk_make_rq(struct request_queue *q, struct bio *bio)
 {
 	struct pblk *pblk = q->queuedata;
-	int ret;
 
 	if (bio_op(bio) == REQ_OP_DISCARD) {
 		pblk_discard(pblk, bio);
@@ -65,7 +64,7 @@ static blk_qc_t pblk_make_rq(struct request_queue *q, struct bio *bio)
 	 */
 	if (bio_data_dir(bio) == READ) {
 		blk_queue_split(q, &bio);
-		ret = pblk_submit_read(pblk, bio);
+		pblk_submit_read(pblk, bio);
 	} else {
 		/* Prevent deadlock in the case of a modest LUN configuration
 		 * and large user I/Os. Unless stalled, the rate limiter
@@ -74,16 +73,7 @@ static blk_qc_t pblk_make_rq(struct request_queue *q, struct bio *bio)
 		if (pblk_get_secs(bio) > pblk_rl_max_io(&pblk->rl))
 			blk_queue_split(q, &bio);
 
-		ret = pblk_write_to_cache(pblk, bio, PBLK_IOTYPE_USER);
-	}
-
-	switch (ret) {
-	case NVM_IO_ERR:
-		bio_io_error(bio);
-		break;
-	case NVM_IO_DONE:
-		bio_endio(bio);
-		break;
+		pblk_write_to_cache(pblk, bio, PBLK_IOTYPE_USER);
 	}
 
 	return BLK_QC_T_NONE;
diff --git a/drivers/lightnvm/pblk-read.c b/drivers/lightnvm/pblk-read.c
index 6569746..597fe6d 100644
--- a/drivers/lightnvm/pblk-read.c
+++ b/drivers/lightnvm/pblk-read.c
@@ -179,7 +179,8 @@ static void pblk_end_user_read(struct bio *bio, int error)
 {
 	if (error && error != NVM_RSP_WARN_HIGHECC)
 		bio_io_error(bio);
-	bio_endio(bio);
+	else
+		bio_endio(bio);
 }
 
 static void __pblk_end_io_read(struct pblk *pblk, struct nvm_rq *rqd,
@@ -383,7 +384,6 @@ static int pblk_partial_read_bio(struct pblk *pblk, struct nvm_rq *rqd,
 
 	/* Free allocated pages in new bio */
 	pblk_bio_free_pages(pblk, rqd->bio, 0, rqd->bio->bi_vcnt);
-	__pblk_end_io_read(pblk, rqd, false);
 	return NVM_IO_ERR;
 }
 
@@ -428,7 +428,7 @@ static void pblk_read_rq(struct pblk *pblk, struct nvm_rq *rqd, struct bio *bio,
 	}
 }
 
-int pblk_submit_read(struct pblk *pblk, struct bio *bio)
+void pblk_submit_read(struct pblk *pblk, struct bio *bio)
 {
 	struct nvm_tgt_dev *dev = pblk->dev;
 	struct request_queue *q = dev->q;
@@ -436,9 +436,9 @@ int pblk_submit_read(struct pblk *pblk, struct bio *bio)
 	unsigned int nr_secs = pblk_get_secs(bio);
 	struct pblk_g_ctx *r_ctx;
 	struct nvm_rq *rqd;
+	struct bio *int_bio;
 	unsigned int bio_init_idx;
 	DECLARE_BITMAP(read_bitmap, NVM_MAX_VLBA);
-	int ret = NVM_IO_ERR;
 
 	generic_start_io_acct(q, REQ_OP_READ, bio_sectors(bio),
 			      &pblk->disk->part0);
@@ -449,74 +449,67 @@ int pblk_submit_read(struct pblk *pblk, struct bio *bio)
 
 	rqd->opcode = NVM_OP_PREAD;
 	rqd->nr_ppas = nr_secs;
-	rqd->bio = NULL; /* cloned bio if needed */
 	rqd->private = pblk;
 	rqd->end_io = pblk_end_io_read;
 
 	r_ctx = nvm_rq_to_pdu(rqd);
 	r_ctx->start_time = jiffies;
 	r_ctx->lba = blba;
-	r_ctx->private = bio; /* original bio */
 
 	/* Save the index for this bio's start. This is needed in case
 	 * we need to fill a partial read.
 	 */
 	bio_init_idx = pblk_get_bi_idx(bio);
 
-	if (pblk_alloc_rqd_meta(pblk, rqd))
-		goto fail_rqd_free;
+	if (pblk_alloc_rqd_meta(pblk, rqd)) {
+		bio_io_error(bio);
+		pblk_free_rqd(pblk, rqd, PBLK_READ);
+		return;
+	}
+
+	/* Clone read bio to deal internally with:
+	 * -read errors when reading from drive
+	 * -bio_advance() calls during l2p lookup and cache reads
+	 */
+	int_bio = bio_clone_fast(bio, GFP_KERNEL, &pblk_bio_set);
 
 	if (nr_secs > 1)
 		pblk_read_ppalist_rq(pblk, rqd, bio, blba, read_bitmap);
 	else
 		pblk_read_rq(pblk, rqd, bio, blba, read_bitmap);
 
+	r_ctx->private = bio; /* original bio */
+	rqd->bio = int_bio; /* internal bio */
+
 	if (bitmap_full(read_bitmap, nr_secs)) {
+		pblk_end_user_read(bio, 0);
 		atomic_inc(&pblk->inflight_io);
 		__pblk_end_io_read(pblk, rqd, false);
-		return NVM_IO_DONE;
+		return;
 	}
 
-	/* All sectors are to be read from the device */
-	if (bitmap_empty(read_bitmap, rqd->nr_ppas)) {
-		struct bio *int_bio = NULL;
-
-		/* Clone read bio to deal with read errors internally */
-		int_bio = bio_clone_fast(bio, GFP_KERNEL, &pblk_bio_set);
-		if (!int_bio) {
-			pblk_err(pblk, "could not clone read bio\n");
-			goto fail_end_io;
-		}
-
-		rqd->bio = int_bio;
-
-		if (pblk_submit_io(pblk, rqd)) {
+	if (!bitmap_empty(read_bitmap, rqd->nr_ppas)) {
+		/* The read bio request could be partially filled by the write
+		 * buffer, but there are some holes that need to be read from
+		 * the drive.
+		 */
+		bio_put(int_bio);
+		rqd->bio = NULL;
+		if (pblk_partial_read_bio(pblk, rqd, bio_init_idx, read_bitmap,
+					    nr_secs)) {
 			pblk_err(pblk, "read IO submission failed\n");
-			ret = NVM_IO_ERR;
-			goto fail_end_io;
+			bio_io_error(bio);
+			__pblk_end_io_read(pblk, rqd, false);
 		}
-
-		return NVM_IO_OK;
+		return;
 	}
 
-	/* The read bio request could be partially filled by the write buffer,
-	 * but there are some holes that need to be read from the drive.
-	 */
-	ret = pblk_partial_read_bio(pblk, rqd, bio_init_idx, read_bitmap,
-				    nr_secs);
-	if (ret)
-		goto fail_meta_free;
-
-	return NVM_IO_OK;
-
-fail_meta_free:
-	nvm_dev_dma_free(dev->parent, rqd->meta_list, rqd->dma_meta_list);
-fail_rqd_free:
-	pblk_free_rqd(pblk, rqd, PBLK_READ);
-	return ret;
-fail_end_io:
-	__pblk_end_io_read(pblk, rqd, false);
-	return ret;
+	/* All sectors are to be read from the device */
+	if (pblk_submit_io(pblk, rqd)) {
+		pblk_err(pblk, "read IO submission failed\n");
+		bio_io_error(bio);
+		__pblk_end_io_read(pblk, rqd, false);
+	}
 }
 
 static int read_ppalist_rq_gc(struct pblk *pblk, struct nvm_rq *rqd,
diff --git a/drivers/lightnvm/pblk.h b/drivers/lightnvm/pblk.h
index 381f074..a3c925d 100644
--- a/drivers/lightnvm/pblk.h
+++ b/drivers/lightnvm/pblk.h
@@ -868,7 +868,7 @@ void pblk_get_packed_meta(struct pblk *pblk, struct nvm_rq *rqd);
 /*
  * pblk user I/O write path
  */
-int pblk_write_to_cache(struct pblk *pblk, struct bio *bio,
+void pblk_write_to_cache(struct pblk *pblk, struct bio *bio,
 			unsigned long flags);
 int pblk_write_gc_to_cache(struct pblk *pblk, struct pblk_gc_rq *gc_rq);
 
@@ -894,7 +894,7 @@ void pblk_write_kick(struct pblk *pblk);
  * pblk read path
  */
 extern struct bio_set pblk_bio_set;
-int pblk_submit_read(struct pblk *pblk, struct bio *bio);
+void pblk_submit_read(struct pblk *pblk, struct bio *bio);
 int pblk_submit_read_gc(struct pblk *pblk, struct pblk_gc_rq *gc_rq);
 /*
  * pblk recovery
-- 
2.9.5



* [PATCH v2 03/16] lightnvm: pblk: simplify partial read path
  2019-03-22 14:48 [PATCH v2 00/16] lightnvm: next set of improvements for 5.2 Igor Konopko
  2019-03-22 14:48 ` [PATCH v2 01/16] lightnvm: pblk: warn when there are opened chunks Igor Konopko
  2019-03-22 14:48 ` [PATCH v2 02/16] lightnvm: pblk: IO path reorganization Igor Konopko
@ 2019-03-22 14:48 ` Igor Konopko
  2019-03-22 14:48 ` [PATCH v2 04/16] lightnvm: pblk: OOB recovery for closed chunks fix Igor Konopko
                   ` (12 subsequent siblings)
  15 siblings, 0 replies; 30+ messages in thread
From: Igor Konopko @ 2019-03-22 14:48 UTC (permalink / raw)
  To: mb, javier, hans.holmberg; +Cc: linux-block, igor.j.konopko

This patch changes the approach to handling the partial read path,
which is needed to support multi-page bvecs and is currently broken.

In the old approach, merging of data from the ring buffer and the drive
was done entirely by the driver. This had some disadvantages: the code
was complex and relied on bio internals, so it was hard to maintain and
strongly dependent on bio changes.

In the new approach, most of the handling is done by block layer
functions such as bio_split(), bio_chain() and generic_make_request(),
and it is generally less complex and easier to maintain. Some more
details of the new approach follow.

When a read bio arrives, it is cloned for pblk internal purposes. All
the L2P mapping, which includes copying data from the ring buffer to
the bio and thus the bio_advance() calls, is done on the cloned bio, so
the original bio is untouched. Later, if we find that we have a partial
read case, we still have the original bio untouched, so we can split it
and continue to process only the first part of it in the current
context, while the rest is issued as a separate bio request passed to
generic_make_request() for further processing.
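
The heart of the new flow is the standard block layer split-and-chain
pattern; a minimal sketch (variable names follow the diff below, where
'split' is the sector index at which the data source changes):

  split_bio = bio_split(bio, split * NR_PHY_IN_LOG, GFP_KERNEL,
                        &pblk_bio_set);
  bio_chain(split_bio, bio);      /* 'bio' (the remainder) completes back
                                   * to the caller only after 'split_bio'
                                   * does */
  generic_make_request(bio);      /* requeue the remainder */
  bio = split_bio;                /* keep serving the first part here */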

Signed-off-by: Igor Konopko <igor.j.konopko@intel.com>
---
 drivers/lightnvm/pblk-core.c |  16 +++
 drivers/lightnvm/pblk-read.c | 276 +++++++++++--------------------------------
 drivers/lightnvm/pblk.h      |  22 ++--
 3 files changed, 97 insertions(+), 217 deletions(-)

diff --git a/drivers/lightnvm/pblk-core.c b/drivers/lightnvm/pblk-core.c
index 39280c1..f2edec6 100644
--- a/drivers/lightnvm/pblk-core.c
+++ b/drivers/lightnvm/pblk-core.c
@@ -1529,6 +1529,22 @@ void pblk_rq_to_line_put(struct pblk *pblk, struct nvm_rq *rqd)
 		pblk_ppa_to_line_put(pblk, ppa_list[i]);
 }
 
+void pblk_rq_to_line_partial_put(struct pblk *pblk, struct nvm_rq *rqd,
+					int offset)
+{
+	struct ppa_addr ppa;
+	struct pblk_line *line;
+	int i = offset;
+
+	for (; i < rqd->nr_ppas; i++) {
+		ppa = rqd->ppa_list[i];
+		if (!pblk_ppa_empty(ppa) && !pblk_addr_in_cache(ppa)) {
+			line = pblk_ppa_to_line(pblk, ppa);
+			kref_put(&line->ref, pblk_line_put);
+		}
+	}
+}
+
 static void pblk_stop_writes(struct pblk *pblk, struct pblk_line *line)
 {
 	lockdep_assert_held(&pblk->l_mg.free_lock);
diff --git a/drivers/lightnvm/pblk-read.c b/drivers/lightnvm/pblk-read.c
index 597fe6d..264d1d7 100644
--- a/drivers/lightnvm/pblk-read.c
+++ b/drivers/lightnvm/pblk-read.c
@@ -41,28 +41,30 @@ static int pblk_read_from_cache(struct pblk *pblk, struct bio *bio,
 
 static void pblk_read_ppalist_rq(struct pblk *pblk, struct nvm_rq *rqd,
 				 struct bio *bio, sector_t blba,
-				 unsigned long *read_bitmap)
+				 struct pblk_read_info *info)
 {
 	void *meta_list = rqd->meta_list;
-	struct ppa_addr ppas[NVM_MAX_VLBA];
 	int nr_secs = rqd->nr_ppas;
 	bool advanced_bio = false;
-	int i, j = 0;
+	int i;
 
-	pblk_lookup_l2p_seq(pblk, ppas, blba, nr_secs);
+	pblk_lookup_l2p_seq(pblk, rqd->ppa_list, blba, nr_secs);
+	info->first_sec_from_cache = nr_secs;
+	info->first_sec_from_drive = nr_secs;
 
 	for (i = 0; i < nr_secs; i++) {
-		struct ppa_addr p = ppas[i];
 		struct pblk_sec_meta *meta = pblk_get_meta(pblk, meta_list, i);
 		sector_t lba = blba + i;
 
 retry:
-		if (pblk_ppa_empty(p)) {
+		if (pblk_ppa_empty(rqd->ppa_list[i])) {
 			__le64 addr_empty = cpu_to_le64(ADDR_EMPTY);
 
-			WARN_ON(test_and_set_bit(i, read_bitmap));
-			meta->lba = addr_empty;
+			info->secs_from_cache = true;
+			info->first_sec_from_cache = min_t(int, i,
+						info->first_sec_from_cache);
 
+			meta->lba = addr_empty;
 			if (unlikely(!advanced_bio)) {
 				bio_advance(bio, (i) * PBLK_EXPOSED_PAGE_SIZE);
 				advanced_bio = true;
@@ -74,13 +76,18 @@ static void pblk_read_ppalist_rq(struct pblk *pblk, struct nvm_rq *rqd,
 		/* Try to read from write buffer. The address is later checked
 		 * on the write buffer to prevent retrieving overwritten data.
 		 */
-		if (pblk_addr_in_cache(p)) {
-			if (!pblk_read_from_cache(pblk, bio, lba, p, i,
-								advanced_bio)) {
-				pblk_lookup_l2p_seq(pblk, &p, lba, 1);
+		if (pblk_addr_in_cache(rqd->ppa_list[i])) {
+			if (!pblk_read_from_cache(pblk, bio, lba,
+					rqd->ppa_list[i], i, advanced_bio)) {
+				pblk_lookup_l2p_seq(pblk, &rqd->ppa_list[i],
+							lba, 1);
 				goto retry;
 			}
-			WARN_ON(test_and_set_bit(i, read_bitmap));
+
+			info->secs_from_cache = true;
+			info->first_sec_from_cache = min_t(int, i,
+						info->first_sec_from_cache);
+
 			meta->lba = cpu_to_le64(lba);
 			advanced_bio = true;
 #ifdef CONFIG_NVM_PBLK_DEBUG
@@ -88,7 +95,9 @@ static void pblk_read_ppalist_rq(struct pblk *pblk, struct nvm_rq *rqd,
 #endif
 		} else {
 			/* Read from media non-cached sectors */
-			rqd->ppa_list[j++] = p;
+			info->secs_from_drive = true;
+			info->first_sec_from_drive = min_t(int, i,
+						info->first_sec_from_drive);
 		}
 
 next:
@@ -223,172 +232,8 @@ static void pblk_end_io_read(struct nvm_rq *rqd)
 	__pblk_end_io_read(pblk, rqd, true);
 }
 
-static void pblk_end_partial_read(struct nvm_rq *rqd)
-{
-	struct pblk *pblk = rqd->private;
-	struct pblk_g_ctx *r_ctx = nvm_rq_to_pdu(rqd);
-	struct pblk_pr_ctx *pr_ctx = r_ctx->private;
-	struct pblk_sec_meta *meta;
-	struct bio *new_bio = rqd->bio;
-	struct bio *bio = pr_ctx->orig_bio;
-	struct bio_vec src_bv, dst_bv;
-	void *meta_list = rqd->meta_list;
-	int bio_init_idx = pr_ctx->bio_init_idx;
-	unsigned long *read_bitmap = pr_ctx->bitmap;
-	int nr_secs = pr_ctx->orig_nr_secs;
-	int nr_holes = nr_secs - bitmap_weight(read_bitmap, nr_secs);
-	void *src_p, *dst_p;
-	int hole, i;
-
-	if (unlikely(nr_holes == 1)) {
-		struct ppa_addr ppa;
-
-		ppa = rqd->ppa_addr;
-		rqd->ppa_list = pr_ctx->ppa_ptr;
-		rqd->dma_ppa_list = pr_ctx->dma_ppa_list;
-		rqd->ppa_list[0] = ppa;
-	}
-
-	for (i = 0; i < nr_secs; i++) {
-		meta = pblk_get_meta(pblk, meta_list, i);
-		pr_ctx->lba_list_media[i] = le64_to_cpu(meta->lba);
-		meta->lba = cpu_to_le64(pr_ctx->lba_list_mem[i]);
-	}
-
-	/* Fill the holes in the original bio */
-	i = 0;
-	hole = find_first_zero_bit(read_bitmap, nr_secs);
-	do {
-		struct pblk_line *line;
-
-		line = pblk_ppa_to_line(pblk, rqd->ppa_list[i]);
-		kref_put(&line->ref, pblk_line_put);
-
-		meta = pblk_get_meta(pblk, meta_list, hole);
-		meta->lba = cpu_to_le64(pr_ctx->lba_list_media[i]);
-
-		src_bv = new_bio->bi_io_vec[i++];
-		dst_bv = bio->bi_io_vec[bio_init_idx + hole];
-
-		src_p = kmap_atomic(src_bv.bv_page);
-		dst_p = kmap_atomic(dst_bv.bv_page);
-
-		memcpy(dst_p + dst_bv.bv_offset,
-			src_p + src_bv.bv_offset,
-			PBLK_EXPOSED_PAGE_SIZE);
-
-		kunmap_atomic(src_p);
-		kunmap_atomic(dst_p);
-
-		mempool_free(src_bv.bv_page, &pblk->page_bio_pool);
-
-		hole = find_next_zero_bit(read_bitmap, nr_secs, hole + 1);
-	} while (hole < nr_secs);
-
-	bio_put(new_bio);
-	kfree(pr_ctx);
-
-	/* restore original request */
-	rqd->bio = NULL;
-	rqd->nr_ppas = nr_secs;
-
-	pblk_end_user_read(bio, rqd->error);
-	__pblk_end_io_read(pblk, rqd, false);
-}
-
-static int pblk_setup_partial_read(struct pblk *pblk, struct nvm_rq *rqd,
-			    unsigned int bio_init_idx,
-			    unsigned long *read_bitmap,
-			    int nr_holes)
-{
-	void *meta_list = rqd->meta_list;
-	struct pblk_g_ctx *r_ctx = nvm_rq_to_pdu(rqd);
-	struct pblk_pr_ctx *pr_ctx;
-	struct bio *new_bio, *bio = r_ctx->private;
-	int nr_secs = rqd->nr_ppas;
-	int i;
-
-	new_bio = bio_alloc(GFP_KERNEL, nr_holes);
-
-	if (pblk_bio_add_pages(pblk, new_bio, GFP_KERNEL, nr_holes))
-		goto fail_bio_put;
-
-	if (nr_holes != new_bio->bi_vcnt) {
-		WARN_ONCE(1, "pblk: malformed bio\n");
-		goto fail_free_pages;
-	}
-
-	pr_ctx = kzalloc(sizeof(struct pblk_pr_ctx), GFP_KERNEL);
-	if (!pr_ctx)
-		goto fail_free_pages;
-
-	for (i = 0; i < nr_secs; i++) {
-		struct pblk_sec_meta *meta = pblk_get_meta(pblk, meta_list, i);
-
-		pr_ctx->lba_list_mem[i] = le64_to_cpu(meta->lba);
-	}
-
-	new_bio->bi_iter.bi_sector = 0; /* internal bio */
-	bio_set_op_attrs(new_bio, REQ_OP_READ, 0);
-
-	rqd->bio = new_bio;
-	rqd->nr_ppas = nr_holes;
-
-	pr_ctx->orig_bio = bio;
-	bitmap_copy(pr_ctx->bitmap, read_bitmap, NVM_MAX_VLBA);
-	pr_ctx->bio_init_idx = bio_init_idx;
-	pr_ctx->orig_nr_secs = nr_secs;
-	r_ctx->private = pr_ctx;
-
-	if (unlikely(nr_holes == 1)) {
-		pr_ctx->ppa_ptr = rqd->ppa_list;
-		pr_ctx->dma_ppa_list = rqd->dma_ppa_list;
-		rqd->ppa_addr = rqd->ppa_list[0];
-	}
-	return 0;
-
-fail_free_pages:
-	pblk_bio_free_pages(pblk, new_bio, 0, new_bio->bi_vcnt);
-fail_bio_put:
-	bio_put(new_bio);
-
-	return -ENOMEM;
-}
-
-static int pblk_partial_read_bio(struct pblk *pblk, struct nvm_rq *rqd,
-				 unsigned int bio_init_idx,
-				 unsigned long *read_bitmap, int nr_secs)
-{
-	int nr_holes;
-	int ret;
-
-	nr_holes = nr_secs - bitmap_weight(read_bitmap, nr_secs);
-
-	if (pblk_setup_partial_read(pblk, rqd, bio_init_idx, read_bitmap,
-				    nr_holes))
-		return NVM_IO_ERR;
-
-	rqd->end_io = pblk_end_partial_read;
-
-	ret = pblk_submit_io(pblk, rqd);
-	if (ret) {
-		bio_put(rqd->bio);
-		pblk_err(pblk, "partial read IO submission failed\n");
-		goto err;
-	}
-
-	return NVM_IO_OK;
-
-err:
-	pblk_err(pblk, "failed to perform partial read\n");
-
-	/* Free allocated pages in new bio */
-	pblk_bio_free_pages(pblk, rqd->bio, 0, rqd->bio->bi_vcnt);
-	return NVM_IO_ERR;
-}
-
 static void pblk_read_rq(struct pblk *pblk, struct nvm_rq *rqd, struct bio *bio,
-			 sector_t lba, unsigned long *read_bitmap)
+			 sector_t lba, struct pblk_read_info *info)
 {
 	struct pblk_sec_meta *meta = pblk_get_meta(pblk, rqd->meta_list, 0);
 	struct ppa_addr ppa;
@@ -403,8 +248,8 @@ static void pblk_read_rq(struct pblk *pblk, struct nvm_rq *rqd, struct bio *bio,
 	if (pblk_ppa_empty(ppa)) {
 		__le64 addr_empty = cpu_to_le64(ADDR_EMPTY);
 
-		WARN_ON(test_and_set_bit(0, read_bitmap));
 		meta->lba = addr_empty;
+		info->secs_from_cache = true;
 		return;
 	}
 
@@ -417,14 +262,14 @@ static void pblk_read_rq(struct pblk *pblk, struct nvm_rq *rqd, struct bio *bio,
 			goto retry;
 		}
 
-		WARN_ON(test_and_set_bit(0, read_bitmap));
 		meta->lba = cpu_to_le64(lba);
-
+		info->secs_from_cache = true;
 #ifdef CONFIG_NVM_PBLK_DEBUG
 		atomic_long_inc(&pblk->cache_reads);
 #endif
 	} else {
 		rqd->ppa_addr = ppa;
+		info->secs_from_drive = true;
 	}
 }
 
@@ -436,15 +281,12 @@ void pblk_submit_read(struct pblk *pblk, struct bio *bio)
 	unsigned int nr_secs = pblk_get_secs(bio);
 	struct pblk_g_ctx *r_ctx;
 	struct nvm_rq *rqd;
-	struct bio *int_bio;
-	unsigned int bio_init_idx;
-	DECLARE_BITMAP(read_bitmap, NVM_MAX_VLBA);
+	struct bio *int_bio, *split_bio;
+	struct pblk_read_info info = {0};
 
 	generic_start_io_acct(q, REQ_OP_READ, bio_sectors(bio),
 			      &pblk->disk->part0);
 
-	bitmap_zero(read_bitmap, nr_secs);
-
 	rqd = pblk_alloc_rqd(pblk, PBLK_READ);
 
 	rqd->opcode = NVM_OP_PREAD;
@@ -456,11 +298,6 @@ void pblk_submit_read(struct pblk *pblk, struct bio *bio)
 	r_ctx->start_time = jiffies;
 	r_ctx->lba = blba;
 
-	/* Save the index for this bio's start. This is needed in case
-	 * we need to fill a partial read.
-	 */
-	bio_init_idx = pblk_get_bi_idx(bio);
-
 	if (pblk_alloc_rqd_meta(pblk, rqd)) {
 		bio_io_error(bio);
 		pblk_free_rqd(pblk, rqd, PBLK_READ);
@@ -474,34 +311,63 @@ void pblk_submit_read(struct pblk *pblk, struct bio *bio)
 	int_bio = bio_clone_fast(bio, GFP_KERNEL, &pblk_bio_set);
 
 	if (nr_secs > 1)
-		pblk_read_ppalist_rq(pblk, rqd, bio, blba, read_bitmap);
+		pblk_read_ppalist_rq(pblk, rqd, int_bio, blba, &info);
 	else
-		pblk_read_rq(pblk, rqd, bio, blba, read_bitmap);
+		pblk_read_rq(pblk, rqd, int_bio, blba, &info);
 
+split_retry:
 	r_ctx->private = bio; /* original bio */
 	rqd->bio = int_bio; /* internal bio */
 
-	if (bitmap_full(read_bitmap, nr_secs)) {
+	if (info.secs_from_cache && !info.secs_from_drive) {
 		pblk_end_user_read(bio, 0);
 		atomic_inc(&pblk->inflight_io);
 		__pblk_end_io_read(pblk, rqd, false);
 		return;
 	}
 
-	if (!bitmap_empty(read_bitmap, rqd->nr_ppas)) {
+	if (info.secs_from_cache && info.secs_from_drive) {
 		/* The read bio request could be partially filled by the write
 		 * buffer, but there are some holes that need to be read from
-		 * the drive.
+		 * the drive. In order to handle this, we will use block layer
+		 * mechanism to split this request in to smaller ones and make
+		 * a chain of it.
+		 */
+		int split = max_t(int, info.first_sec_from_cache,
+					info.first_sec_from_drive);
+
+		split_bio = bio_split(bio, split * NR_PHY_IN_LOG, GFP_KERNEL,
+					&pblk_bio_set);
+		bio_chain(split_bio, bio);
+		generic_make_request(bio);
+
+		/* New bio contains first N sectors of the previous one, so
+		 * we can continue to use existing rqd, but we need to shrink
+		 * the number of PPAs in it. We also need to release line
+		 * references on the rest of the PPAs. Finally we also need
+		 * to update pblk_read_info struct properly in order to avoid
+		 * another interation of l2p lookup.
+		 */
+		bio = split_bio;
+		pblk_rq_to_line_partial_put(pblk, rqd, split);
+#ifdef CONFIG_NVM_PBLK_DEBUG
+		atomic_long_sub((rqd->nr_ppas - split), &pblk->inflight_reads);
+#endif
+		rqd->nr_ppas = split;
+		if (rqd->nr_ppas == 1)
+			rqd->ppa_addr = rqd->ppa_list[0];
+
+		if (info.first_sec_from_cache > info.first_sec_from_drive)
+			info.secs_from_cache = false;
+		else
+			info.secs_from_drive = false;
+
+		/* Recreate int_bio - existing might have some needed internal
+		 * fields modified already.
 		 */
 		bio_put(int_bio);
-		rqd->bio = NULL;
-		if (pblk_partial_read_bio(pblk, rqd, bio_init_idx, read_bitmap,
-					    nr_secs)) {
-			pblk_err(pblk, "read IO submission failed\n");
-			bio_io_error(bio);
-			__pblk_end_io_read(pblk, rqd, false);
-		}
-		return;
+		int_bio = bio_clone_fast(bio, GFP_KERNEL, &pblk_bio_set);
+		goto split_retry;
 	}
 
 	/* All sectors are to be read from the device */
diff --git a/drivers/lightnvm/pblk.h b/drivers/lightnvm/pblk.h
index a3c925d..e284a93 100644
--- a/drivers/lightnvm/pblk.h
+++ b/drivers/lightnvm/pblk.h
@@ -123,18 +123,6 @@ struct pblk_g_ctx {
 	u64 lba;
 };
 
-/* partial read context */
-struct pblk_pr_ctx {
-	struct bio *orig_bio;
-	DECLARE_BITMAP(bitmap, NVM_MAX_VLBA);
-	unsigned int orig_nr_secs;
-	unsigned int bio_init_idx;
-	void *ppa_ptr;
-	dma_addr_t dma_ppa_list;
-	u64 lba_list_mem[NVM_MAX_VLBA];
-	u64 lba_list_media[NVM_MAX_VLBA];
-};
-
 /* Pad context */
 struct pblk_pad_rq {
 	struct pblk *pblk;
@@ -726,6 +714,14 @@ struct pblk_line_ws {
 	struct work_struct ws;
 };
 
+/* L2P lookup on read path info */
+struct pblk_read_info {
+	int first_sec_from_cache;
+	int first_sec_from_drive;
+	bool secs_from_cache;
+	bool secs_from_drive;
+};
+
 #define pblk_g_rq_size (sizeof(struct nvm_rq) + sizeof(struct pblk_g_ctx))
 #define pblk_w_rq_size (sizeof(struct nvm_rq) + sizeof(struct pblk_c_ctx))
 
@@ -809,6 +805,8 @@ struct pblk_line *pblk_line_get_first_data(struct pblk *pblk);
 struct pblk_line *pblk_line_replace_data(struct pblk *pblk);
 void pblk_ppa_to_line_put(struct pblk *pblk, struct ppa_addr ppa);
 void pblk_rq_to_line_put(struct pblk *pblk, struct nvm_rq *rqd);
+void pblk_rq_to_line_partial_put(struct pblk *pblk, struct nvm_rq *rqd,
+					int offset);
 int pblk_line_recov_alloc(struct pblk *pblk, struct pblk_line *line);
 void pblk_line_recov_close(struct pblk *pblk, struct pblk_line *line);
 struct pblk_line *pblk_line_get_data(struct pblk *pblk);
-- 
2.9.5



* [PATCH v2 04/16] lightnvm: pblk: OOB recovery for closed chunks fix
  2019-03-22 14:48 [PATCH v2 00/16] lightnvm: next set of improvements for 5.2 Igor Konopko
                   ` (2 preceding siblings ...)
  2019-03-22 14:48 ` [PATCH v2 03/16] lightnvm: pblk: simplify partial read path Igor Konopko
@ 2019-03-22 14:48 ` Igor Konopko
  2019-03-25  6:02   ` Javier González
  2019-03-22 14:48 ` [PATCH v2 05/16] lightnvm: pblk: propagate errors when reading meta Igor Konopko
                   ` (11 subsequent siblings)
  15 siblings, 1 reply; 30+ messages in thread
From: Igor Konopko @ 2019-03-22 14:48 UTC (permalink / raw)
  To: mb, javier, hans.holmberg; +Cc: linux-block, igor.j.konopko

In case of OOB recovery, when some of the chunks are in the closed
state, we calculate the number of written sectors in a line
incorrectly, because we always count the chunk WP, which for closed
chunks no longer reflects the number of written sectors in that
particular chunk. Based on the OCSSD 2.0 spec, the WP for closed chunks
is equal to SLBA + NLB, and here we only need the NLB (clba in the
code) value for the calculation. This patch takes the clba field
instead for such chunks.
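
A sketch of the per-chunk accounting this patch establishes (fields as
in the diff below); only the two states that can hold data are counted:

  if (chunk->state & NVM_CHK_ST_OFFLINE)
          continue;                       /* holds no data */

  if (chunk->state & NVM_CHK_ST_OPEN)
          written_secs += chunk->wp;      /* WP still counts written secs */
  else if (chunk->state & NVM_CHK_ST_CLOSED)
          written_secs += geo->clba;      /* WP is SLBA + NLB; take NLB */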

Signed-off-by: Igor Konopko <igor.j.konopko@intel.com>
---
 drivers/lightnvm/pblk-recovery.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/lightnvm/pblk-recovery.c b/drivers/lightnvm/pblk-recovery.c
index 83b467b..bcd3633 100644
--- a/drivers/lightnvm/pblk-recovery.c
+++ b/drivers/lightnvm/pblk-recovery.c
@@ -101,6 +101,8 @@ static void pblk_update_line_wp(struct pblk *pblk, struct pblk_line *line,
 
 static u64 pblk_sec_in_open_line(struct pblk *pblk, struct pblk_line *line)
 {
+	struct nvm_tgt_dev *dev = pblk->dev;
+	struct nvm_geo *geo = &dev->geo;
 	struct pblk_line_meta *lm = &pblk->lm;
 	int nr_bb = bitmap_weight(line->blk_bitmap, lm->blk_per_line);
 	u64 written_secs = 0;
@@ -113,7 +115,11 @@ static u64 pblk_sec_in_open_line(struct pblk *pblk, struct pblk_line *line)
 		if (chunk->state & NVM_CHK_ST_OFFLINE)
 			continue;
 
-		written_secs += chunk->wp;
+		if (chunk->state & NVM_CHK_ST_OPEN)
+			written_secs += chunk->wp;
+		else if (chunk->state & NVM_CHK_ST_CLOSED)
+			written_secs += geo->clba;
+
 		valid_chunks++;
 	}
 
-- 
2.9.5



* [PATCH v2 05/16] lightnvm: pblk: propagate errors when reading meta
  2019-03-22 14:48 [PATCH v2 00/16] lightnvm: next set of improvements for 5.2 Igor Konopko
                   ` (3 preceding siblings ...)
  2019-03-22 14:48 ` [PATCH v2 04/16] lightnvm: pblk: OOB recovery for closed chunks fix Igor Konopko
@ 2019-03-22 14:48 ` Igor Konopko
  2019-03-22 14:48 ` [PATCH v2 06/16] lightnvm: pblk: recover only written metadata Igor Konopko
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 30+ messages in thread
From: Igor Konopko @ 2019-03-22 14:48 UTC (permalink / raw)
  To: mb, javier, hans.holmberg; +Cc: linux-block, igor.j.konopko

Currently, when smeta/emeta/OOB is read, errors are not always
propagated correctly. This patch changes that behaviour and propagates
all error codes except the high ECC read warning status.
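
The same pattern is applied at each metadata read site (sketch): a high
ECC warning still returns usable data, so only other errors are logged
and turned into a failure:

  if (rqd.error && rqd.error != NVM_RSP_WARN_HIGHECC) {
          pblk_log_read_err(pblk, &rqd);
          ret = -EIO;     /* propagate the failure to the caller */
  }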

Signed-off-by: Igor Konopko <igor.j.konopko@intel.com>
Reviewed-by: Javier González <javier@javigon.com>
Reviewed-by: Hans Holmberg <hans.holmberg@cnexlabs.com>
---
 drivers/lightnvm/pblk-core.c     | 9 +++++++--
 drivers/lightnvm/pblk-recovery.c | 2 +-
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/lightnvm/pblk-core.c b/drivers/lightnvm/pblk-core.c
index f2edec6..06ac3f0 100644
--- a/drivers/lightnvm/pblk-core.c
+++ b/drivers/lightnvm/pblk-core.c
@@ -761,8 +761,10 @@ int pblk_line_smeta_read(struct pblk *pblk, struct pblk_line *line)
 
 	atomic_dec(&pblk->inflight_io);
 
-	if (rqd.error)
+	if (rqd.error && rqd.error != NVM_RSP_WARN_HIGHECC) {
 		pblk_log_read_err(pblk, &rqd);
+		ret = -EIO;
+	}
 
 clear_rqd:
 	pblk_free_rqd_meta(pblk, &rqd);
@@ -916,8 +918,11 @@ int pblk_line_emeta_read(struct pblk *pblk, struct pblk_line *line,
 
 	atomic_dec(&pblk->inflight_io);
 
-	if (rqd.error)
+	if (rqd.error && rqd.error != NVM_RSP_WARN_HIGHECC) {
 		pblk_log_read_err(pblk, &rqd);
+		ret = -EIO;
+		goto free_rqd_dma;
+	}
 
 	emeta_buf += rq_len;
 	left_ppas -= rq_ppas;
diff --git a/drivers/lightnvm/pblk-recovery.c b/drivers/lightnvm/pblk-recovery.c
index bcd3633..688fdeb 100644
--- a/drivers/lightnvm/pblk-recovery.c
+++ b/drivers/lightnvm/pblk-recovery.c
@@ -450,7 +450,7 @@ static int pblk_recov_scan_oob(struct pblk *pblk, struct pblk_line *line,
 	atomic_dec(&pblk->inflight_io);
 
 	/* If a read fails, do a best effort by padding the line and retrying */
-	if (rqd->error) {
+	if (rqd->error && rqd->error != NVM_RSP_WARN_HIGHECC) {
 		int pad_distance, ret;
 
 		if (padded) {
-- 
2.9.5



* [PATCH v2 06/16] lightnvm: pblk: recover only written metadata
  2019-03-22 14:48 [PATCH v2 00/16] lightnvm: next set of improvements for 5.2 Igor Konopko
                   ` (4 preceding siblings ...)
  2019-03-22 14:48 ` [PATCH v2 05/16] lightnvm: pblk: propagate errors when reading meta Igor Konopko
@ 2019-03-22 14:48 ` Igor Konopko
  2019-03-22 14:48 ` [PATCH v2 07/16] lightnvm: pblk: wait for inflight IOs in recovery Igor Konopko
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 30+ messages in thread
From: Igor Konopko @ 2019-03-22 14:48 UTC (permalink / raw)
  To: mb, javier, hans.holmberg; +Cc: linux-block, igor.j.konopko

This patch ensures that smeta was fully written before we even try to
read it, based on the chunk table state and the write pointer.
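
A sketch of the new pblk_line_was_written() check: smeta can be read
back only if its chunk is already closed, or still open with the write
pointer past the smeta sectors:

  if (chunk->state & NVM_CHK_ST_CLOSED ||
      (chunk->state & NVM_CHK_ST_OPEN && chunk->wp >= lm->smeta_sec))
          return 1;       /* smeta fully written, safe to read */

  return 0;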

Signed-off-by: Igor Konopko <igor.j.konopko@intel.com>
---
 drivers/lightnvm/pblk-recovery.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/lightnvm/pblk-recovery.c b/drivers/lightnvm/pblk-recovery.c
index 688fdeb..b2267bd 100644
--- a/drivers/lightnvm/pblk-recovery.c
+++ b/drivers/lightnvm/pblk-recovery.c
@@ -653,10 +653,12 @@ static int pblk_line_was_written(struct pblk_line *line,
 	bppa = pblk->luns[smeta_blk].bppa;
 	chunk = &line->chks[pblk_ppa_to_pos(geo, bppa)];
 
-	if (chunk->state & NVM_CHK_ST_FREE)
-		return 0;
+	if (chunk->state & NVM_CHK_ST_CLOSED ||
+	    (chunk->state & NVM_CHK_ST_OPEN
+	     && chunk->wp >= lm->smeta_sec))
+		return 1;
 
-	return 1;
+	return 0;
 }
 
 static bool pblk_line_is_open(struct pblk *pblk, struct pblk_line *line)
-- 
2.9.5



* [PATCH v2 07/16] lightnvm: pblk: wait for inflight IOs in recovery
  2019-03-22 14:48 [PATCH v2 00/16] lightnvm: next set of improvements for 5.2 Igor Konopko
                   ` (5 preceding siblings ...)
  2019-03-22 14:48 ` [PATCH v2 06/16] lightnvm: pblk: recover only written metadata Igor Konopko
@ 2019-03-22 14:48 ` Igor Konopko
  2019-03-25  6:18   ` Javier González
  2019-03-22 14:48 ` [PATCH v2 08/16] lightnvm: pblk: remove internal IO timeout Igor Konopko
                   ` (8 subsequent siblings)
  15 siblings, 1 reply; 30+ messages in thread
From: Igor Konopko @ 2019-03-22 14:48 UTC (permalink / raw)
  To: mb, javier, hans.holmberg; +Cc: linux-block, igor.j.konopko

This patch changes the behaviour of recovery padding in order to
support the case when some IOs were already submitted to the drive and
the next ones were not submitted due to a returned error.

Currently, in case of errors we simply exit the pad function without
waiting for inflight IOs, which leads to a panic on inflight IO
completion.

After the changes we always wait for all the inflight IOs before
exiting the function.
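
The underlying pattern is reference counting on the pad request: every
submitted pad IO holds a reference, and all error paths now converge on
dropping the submitter's reference and waiting (sketch of the new flow):

  fail_complete:
          /* drop our reference; IO completions drop theirs */
          kref_put(&pad_rq->ref, pblk_recov_complete);
          /* block until every already-submitted pad IO has finished */
          wait_for_completion_io_timeout(&pad_rq->wait,
                          msecs_to_jiffies(PBLK_COMMAND_TIMEOUT_MS));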

Signed-off-by: Igor Konopko <igor.j.konopko@intel.com>
---
 drivers/lightnvm/pblk-recovery.c | 25 ++++++++++++-------------
 1 file changed, 12 insertions(+), 13 deletions(-)

diff --git a/drivers/lightnvm/pblk-recovery.c b/drivers/lightnvm/pblk-recovery.c
index b2267bd..f7e383e 100644
--- a/drivers/lightnvm/pblk-recovery.c
+++ b/drivers/lightnvm/pblk-recovery.c
@@ -200,7 +200,7 @@ static int pblk_recov_pad_line(struct pblk *pblk, struct pblk_line *line,
 	rq_ppas = pblk_calc_secs(pblk, left_ppas, 0, false);
 	if (rq_ppas < pblk->min_write_pgs) {
 		pblk_err(pblk, "corrupted pad line %d\n", line->id);
-		goto fail_free_pad;
+		goto fail_complete;
 	}
 
 	rq_len = rq_ppas * geo->csecs;
@@ -209,7 +209,7 @@ static int pblk_recov_pad_line(struct pblk *pblk, struct pblk_line *line,
 						PBLK_VMALLOC_META, GFP_KERNEL);
 	if (IS_ERR(bio)) {
 		ret = PTR_ERR(bio);
-		goto fail_free_pad;
+		goto fail_complete;
 	}
 
 	bio->bi_iter.bi_sector = 0; /* internal bio */
@@ -218,8 +218,11 @@ static int pblk_recov_pad_line(struct pblk *pblk, struct pblk_line *line,
 	rqd = pblk_alloc_rqd(pblk, PBLK_WRITE_INT);
 
 	ret = pblk_alloc_rqd_meta(pblk, rqd);
-	if (ret)
-		goto fail_free_rqd;
+	if (ret) {
+		pblk_free_rqd(pblk, rqd, PBLK_WRITE_INT);
+		bio_put(bio);
+		goto fail_complete;
+	}
 
 	rqd->bio = bio;
 	rqd->opcode = NVM_OP_PWRITE;
@@ -266,7 +269,10 @@ static int pblk_recov_pad_line(struct pblk *pblk, struct pblk_line *line,
 	if (ret) {
 		pblk_err(pblk, "I/O submission failed: %d\n", ret);
 		pblk_up_chunk(pblk, rqd->ppa_list[0]);
-		goto fail_free_rqd;
+		kref_put(&pad_rq->ref, pblk_recov_complete);
+		pblk_free_rqd(pblk, rqd, PBLK_WRITE_INT);
+		bio_put(bio);
+		goto fail_complete;
 	}
 
 	left_line_ppas -= rq_ppas;
@@ -274,6 +280,7 @@ static int pblk_recov_pad_line(struct pblk *pblk, struct pblk_line *line,
 	if (left_ppas && left_line_ppas)
 		goto next_pad_rq;
 
+fail_complete:
 	kref_put(&pad_rq->ref, pblk_recov_complete);
 
 	if (!wait_for_completion_io_timeout(&pad_rq->wait,
@@ -289,14 +296,6 @@ static int pblk_recov_pad_line(struct pblk *pblk, struct pblk_line *line,
 free_rq:
 	kfree(pad_rq);
 	return ret;
-
-fail_free_rqd:
-	pblk_free_rqd(pblk, rqd, PBLK_WRITE_INT);
-	bio_put(bio);
-fail_free_pad:
-	kfree(pad_rq);
-	vfree(data);
-	return ret;
 }
 
 static int pblk_pad_distance(struct pblk *pblk, struct pblk_line *line)
-- 
2.9.5



* [PATCH v2 08/16] lightnvm: pblk: remove internal IO timeout
  2019-03-22 14:48 [PATCH v2 00/16] lightnvm: next set of improvements for 5.2 Igor Konopko
                   ` (6 preceding siblings ...)
  2019-03-22 14:48 ` [PATCH v2 07/16] lightnvm: pblk: wait for inflight IOs in recovery Igor Konopko
@ 2019-03-22 14:48 ` Igor Konopko
  2019-03-22 14:48 ` [PATCH v2 09/16] lightnvm: pblk: fix spin_unlock order Igor Konopko
                   ` (7 subsequent siblings)
  15 siblings, 0 replies; 30+ messages in thread
From: Igor Konopko @ 2019-03-22 14:48 UTC (permalink / raw)
  To: mb, javier, hans.holmberg; +Cc: linux-block, igor.j.konopko

Currently, during pblk padding an internal IO timeout is used which is
smaller than the default NVMe timeout. This can lead to various
use-after-free issues. Since in case of any IO timeout the NVMe and
block layers will handle the timeout by themselves and report it back
to us, there is no need to keep this internal timeout in pblk.

Signed-off-by: Igor Konopko <igor.j.konopko@intel.com>
---
 drivers/lightnvm/pblk-recovery.c | 7 +------
 drivers/lightnvm/pblk.h          | 2 --
 2 files changed, 1 insertion(+), 8 deletions(-)

diff --git a/drivers/lightnvm/pblk-recovery.c b/drivers/lightnvm/pblk-recovery.c
index f7e383e..65916f0 100644
--- a/drivers/lightnvm/pblk-recovery.c
+++ b/drivers/lightnvm/pblk-recovery.c
@@ -282,12 +282,7 @@ static int pblk_recov_pad_line(struct pblk *pblk, struct pblk_line *line,
 
 fail_complete:
 	kref_put(&pad_rq->ref, pblk_recov_complete);
-
-	if (!wait_for_completion_io_timeout(&pad_rq->wait,
-				msecs_to_jiffies(PBLK_COMMAND_TIMEOUT_MS))) {
-		pblk_err(pblk, "pad write timed out\n");
-		ret = -ETIME;
-	}
+	wait_for_completion(&pad_rq->wait);
 
 	if (!pblk_line_is_full(line))
 		pblk_err(pblk, "corrupted padded line: %d\n", line->id);
diff --git a/drivers/lightnvm/pblk.h b/drivers/lightnvm/pblk.h
index e284a93..3a84c8a 100644
--- a/drivers/lightnvm/pblk.h
+++ b/drivers/lightnvm/pblk.h
@@ -43,8 +43,6 @@
 
 #define PBLK_CACHE_NAME_LEN (DISK_NAME_LEN + 16)
 
-#define PBLK_COMMAND_TIMEOUT_MS 30000
-
 /* Max 512 LUNs per device */
 #define PBLK_MAX_LUNS_BITMAP (4)
 
-- 
2.9.5



* [PATCH v2 09/16] lightnvm: pblk: fix spin_unlock order
  2019-03-22 14:48 [PATCH v2 00/16] lightnvm: next set of improvements for 5.2 Igor Konopko
                   ` (7 preceding siblings ...)
  2019-03-22 14:48 ` [PATCH v2 08/16] lightnvm: pblk: remove internal IO timeout Igor Konopko
@ 2019-03-22 14:48 ` Igor Konopko
  2019-03-25 11:09   ` Matias Bjørling
  2019-03-22 14:48 ` [PATCH v2 10/16] lightnvm: pblk: kick writer on write recovery path Igor Konopko
                   ` (6 subsequent siblings)
  15 siblings, 1 reply; 30+ messages in thread
From: Igor Konopko @ 2019-03-22 14:48 UTC (permalink / raw)
  To: mb, javier, hans.holmberg; +Cc: linux-block, igor.j.konopko

In pblk_rb_tear_down_check() the spin_unlock() functions are not
called in the proper order. This patch fixes that.
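
The rule being restored is plain LIFO lock ordering (sketch):

  spin_lock(&rb->w_lock);
  spin_lock_irq(&rb->s_lock);
  /* ... inspect ring buffer state ... */
  spin_unlock_irq(&rb->s_lock);   /* release the inner lock first */
  spin_unlock(&rb->w_lock);       /* then the outer one */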

Fixes: a4bd217 ("lightnvm: physical block device (pblk) target")
Signed-off-by: Igor Konopko <igor.j.konopko@intel.com>
Reviewed-by: Javier González <javier@javigon.com>
Reviewed-by: Hans Holmberg <hans.holmberg@cnexlabs.com>
---
 drivers/lightnvm/pblk-rb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/lightnvm/pblk-rb.c b/drivers/lightnvm/pblk-rb.c
index 03c241b..3555014 100644
--- a/drivers/lightnvm/pblk-rb.c
+++ b/drivers/lightnvm/pblk-rb.c
@@ -799,8 +799,8 @@ int pblk_rb_tear_down_check(struct pblk_rb *rb)
 	}
 
 out:
-	spin_unlock(&rb->w_lock);
 	spin_unlock_irq(&rb->s_lock);
+	spin_unlock(&rb->w_lock);
 
 	return ret;
 }
-- 
2.9.5



* [PATCH v2 10/16] lightnvm: pblk: kick writer on write recovery path
  2019-03-22 14:48 [PATCH v2 00/16] lightnvm: next set of improvements for 5.2 Igor Konopko
                   ` (8 preceding siblings ...)
  2019-03-22 14:48 ` [PATCH v2 09/16] lightnvm: pblk: fix spin_unlock order Igor Konopko
@ 2019-03-22 14:48 ` Igor Konopko
  2019-03-25 11:13   ` Matias Bjørling
  2019-03-22 14:48 ` [PATCH v2 11/16] lightnvm: pblk: fix update line wp in OOB recovery Igor Konopko
                   ` (5 subsequent siblings)
  15 siblings, 1 reply; 30+ messages in thread
From: Igor Konopko @ 2019-03-22 14:48 UTC (permalink / raw)
  To: mb, javier, hans.holmberg; +Cc: linux-block, igor.j.konopko

In the write recovery path, there is a chance that the writer thread
is not active, so for sanity it is good to always kick it.

Signed-off-by: Igor Konopko <igor.j.konopko@intel.com>
Reviewed-by: Javier González <javier@javigon.com>
Reviewed-by: Hans Holmberg <hans.holmberg@cnexlabs.com>
---
 drivers/lightnvm/pblk-write.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/lightnvm/pblk-write.c b/drivers/lightnvm/pblk-write.c
index 6593dea..4e63f9b 100644
--- a/drivers/lightnvm/pblk-write.c
+++ b/drivers/lightnvm/pblk-write.c
@@ -228,6 +228,7 @@ static void pblk_submit_rec(struct work_struct *work)
 	mempool_free(recovery, &pblk->rec_pool);
 
 	atomic_dec(&pblk->inflight_io);
+	pblk_write_kick(pblk);
 }
 
 
-- 
2.9.5



* [PATCH v2 11/16] lightnvm: pblk: fix update line wp in OOB recovery
  2019-03-22 14:48 [PATCH v2 00/16] lightnvm: next set of improvements for 5.2 Igor Konopko
                   ` (9 preceding siblings ...)
  2019-03-22 14:48 ` [PATCH v2 10/16] lightnvm: pblk: kick writer on write recovery path Igor Konopko
@ 2019-03-22 14:48 ` Igor Konopko
  2019-03-25 11:18   ` Matias Bjørling
  2019-03-22 14:48 ` [PATCH v2 12/16] lightnvm: pblk: do not read OOB from emeta region Igor Konopko
                   ` (4 subsequent siblings)
  15 siblings, 1 reply; 30+ messages in thread
From: Igor Konopko @ 2019-03-22 14:48 UTC (permalink / raw)
  To: mb, javier, hans.holmberg; +Cc: linux-block, igor.j.konopko

In case of OOB recovery, we can hit a scenario where all the data in a
line was written and some part of emeta was written too. In such a case
the pblk_update_line_wp() function will call the pblk_alloc_page()
function, which will cause left_msecs to be set to a value below zero
(since this field does not track the emeta region) and thus lead to
multiple kernel warnings. This patch fixes that issue.
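
The fix clamps the counter under the lock that guards it (sketch of the
logic added below):

  spin_lock(&l_mg->free_lock);
  if (written_secs > line->left_msecs)
          line->left_msecs = 0;   /* the excess is emeta, which this
                                   * counter does not track */
  else
          line->left_msecs -= written_secs;
  spin_unlock(&l_mg->free_lock);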

Signed-off-by: Igor Konopko <igor.j.konopko@intel.com>
Reviewed-by: Javier González <javier@javigon.com>
---
 drivers/lightnvm/pblk-recovery.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/drivers/lightnvm/pblk-recovery.c b/drivers/lightnvm/pblk-recovery.c
index 65916f0..4e8f382 100644
--- a/drivers/lightnvm/pblk-recovery.c
+++ b/drivers/lightnvm/pblk-recovery.c
@@ -93,10 +93,24 @@ static int pblk_recov_l2p_from_emeta(struct pblk *pblk, struct pblk_line *line)
 static void pblk_update_line_wp(struct pblk *pblk, struct pblk_line *line,
 				u64 written_secs)
 {
+	struct pblk_line_mgmt *l_mg = &pblk->l_mg;
 	int i;
 
 	for (i = 0; i < written_secs; i += pblk->min_write_pgs)
-		pblk_alloc_page(pblk, line, pblk->min_write_pgs);
+		__pblk_alloc_page(pblk, line, pblk->min_write_pgs);
+
+	spin_lock(&l_mg->free_lock);
+	if (written_secs > line->left_msecs) {
+		/*
+		 * We have all data sectors written
+		 * and some emeta sectors written too.
+		 */
+		line->left_msecs = 0;
+	} else {
+		/* We have only some data sectors written. */
+		line->left_msecs -= written_secs;
+	}
+	spin_unlock(&l_mg->free_lock);
 }
 
 static u64 pblk_sec_in_open_line(struct pblk *pblk, struct pblk_line *line)
-- 
2.9.5



* [PATCH v2 12/16] lightnvm: pblk: do not read OOB from emeta region
  2019-03-22 14:48 [PATCH v2 00/16] lightnvm: next set of improvements for 5.2 Igor Konopko
                   ` (10 preceding siblings ...)
  2019-03-22 14:48 ` [PATCH v2 11/16] lightnvm: pblk: fix update line wp in OOB recovery Igor Konopko
@ 2019-03-22 14:48 ` Igor Konopko
  2019-03-25  6:23   ` Javier González
  2019-03-22 14:48 ` [PATCH v2 13/16] lightnvm: pblk: store multiple copies of smeta Igor Konopko
                   ` (3 subsequent siblings)
  15 siblings, 1 reply; 30+ messages in thread
From: Igor Konopko @ 2019-03-22 14:48 UTC (permalink / raw)
  To: mb, javier, hans.holmberg; +Cc: linux-block, igor.j.konopko

Emeta does not have a valid corresponding OOB metadata mapping, so
there is no need to try to recover the L2P mapping from it.
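
A sketch of the early-out added to the OOB scan loop (fields as in the
diff below):

  if (paddr >= line->emeta_ssec) {
          /* reached the emeta region: OOB lbas are not valid here */
          bio_put(bio);
          goto completed;
  }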

Signed-off-by: Igor Konopko <igor.j.konopko@intel.com>
---
 drivers/lightnvm/pblk-recovery.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/lightnvm/pblk-recovery.c b/drivers/lightnvm/pblk-recovery.c
index 4e8f382..74e5b17 100644
--- a/drivers/lightnvm/pblk-recovery.c
+++ b/drivers/lightnvm/pblk-recovery.c
@@ -479,6 +479,15 @@ static int pblk_recov_scan_oob(struct pblk *pblk, struct pblk_line *line,
 		goto retry_rq;
 	}
 
+	if (paddr >= line->emeta_ssec) {
+		/*
+		 * We reach emeta region and we don't want
+		 * to recover oob from emeta region.
+		 */
+		bio_put(bio);
+		goto completed;
+	}
+
 	pblk_get_packed_meta(pblk, rqd);
 	bio_put(bio);
 
@@ -499,6 +508,7 @@ static int pblk_recov_scan_oob(struct pblk *pblk, struct pblk_line *line,
 	if (left_ppas > 0)
 		goto next_rq;
 
+completed:
 #ifdef CONFIG_NVM_PBLK_DEBUG
 	WARN_ON(padded && !pblk_line_is_full(line));
 #endif
-- 
2.9.5



* [PATCH v2 13/16] lightnvm: pblk: store multiple copies of smeta
  2019-03-22 14:48 [PATCH v2 00/16] lightnvm: next set of improvements for 5.2 Igor Konopko
                   ` (11 preceding siblings ...)
  2019-03-22 14:48 ` [PATCH v2 12/16] lightnvm: pblk: do not read OOB from emeta region Igor Konopko
@ 2019-03-22 14:48 ` Igor Konopko
  2019-03-25  6:31   ` Javier González
  2019-03-22 14:48 ` [PATCH v2 14/16] lightnvm: pblk: GC error handling Igor Konopko
                   ` (2 subsequent siblings)
  15 siblings, 1 reply; 30+ messages in thread
From: Igor Konopko @ 2019-03-22 14:48 UTC (permalink / raw)
  To: mb, javier, hans.holmberg; +Cc: linux-block, igor.j.konopko

Currently only one copy of smeta is stored per line in pblk. This is
risky, because in case of a read error on such a chunk, we lose all the
data from the whole line, which leads to silent data corruption.

This patch changes this behaviour and allows storing more than one copy
of smeta (specified by a module parameter) in order to provide higher
reliability, by storing mirrored copies of the smeta struct and
providing the possibility to fail over to another copy of that struct
in case of a read error. Such an approach ensures that copies of this
critical structure are stored on different dies, and thus the predicted
UBER is improved multiple times.
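
On the read side, recovery simply fails over between the copies; a
condensed sketch of the new pblk_line_smeta_read() (names from the
diff below):

  for (i = 0; i < lm->smeta_copies; i++) {
          ret = pblk_line_smeta_read_copy(pblk, line,
                                          paddr + (i * lm->smeta_sec));
          if (!ret)
                  return 0;       /* one readable copy is enough */
  }
  return ret;                     /* all copies failed to read */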

Signed-off-by: Igor Konopko <igor.j.konopko@intel.com>
---
 drivers/lightnvm/pblk-core.c     | 124 ++++++++++++++++++++++++++++++++-------
 drivers/lightnvm/pblk-init.c     |  23 ++++++--
 drivers/lightnvm/pblk-recovery.c |  14 +++--
 drivers/lightnvm/pblk-rl.c       |   3 +-
 drivers/lightnvm/pblk.h          |   1 +
 5 files changed, 132 insertions(+), 33 deletions(-)

diff --git a/drivers/lightnvm/pblk-core.c b/drivers/lightnvm/pblk-core.c
index 06ac3f0..9d9a9e2 100644
--- a/drivers/lightnvm/pblk-core.c
+++ b/drivers/lightnvm/pblk-core.c
@@ -720,13 +720,14 @@ u64 pblk_line_smeta_start(struct pblk *pblk, struct pblk_line *line)
 	return bit * geo->ws_opt;
 }
 
-int pblk_line_smeta_read(struct pblk *pblk, struct pblk_line *line)
+static int pblk_line_smeta_read_copy(struct pblk *pblk,
+				     struct pblk_line *line, u64 paddr)
 {
 	struct nvm_tgt_dev *dev = pblk->dev;
+	struct nvm_geo *geo = &dev->geo;
 	struct pblk_line_meta *lm = &pblk->lm;
 	struct bio *bio;
 	struct nvm_rq rqd;
-	u64 paddr = pblk_line_smeta_start(pblk, line);
 	int i, ret;
 
 	memset(&rqd, 0, sizeof(struct nvm_rq));
@@ -749,8 +750,20 @@ int pblk_line_smeta_read(struct pblk *pblk, struct pblk_line *line)
 	rqd.nr_ppas = lm->smeta_sec;
 	rqd.is_seq = 1;
 
-	for (i = 0; i < lm->smeta_sec; i++, paddr++)
-		rqd.ppa_list[i] = addr_to_gen_ppa(pblk, paddr, line->id);
+	for (i = 0; i < rqd.nr_ppas; i++, paddr++) {
+		struct ppa_addr ppa = addr_to_gen_ppa(pblk, paddr, line->id);
+		int pos = pblk_ppa_to_pos(geo, ppa);
+
+		while (test_bit(pos, line->blk_bitmap)) {
+			paddr += pblk->min_write_pgs;
+			ppa = addr_to_gen_ppa(pblk, paddr, line->id);
+			pos = pblk_ppa_to_pos(geo, ppa);
+		}
+
+		rqd.ppa_list[i] = ppa;
+		pblk_get_meta(pblk, rqd.meta_list, i)->lba =
+				  cpu_to_le64(ADDR_EMPTY);
+	}
 
 	ret = pblk_submit_io_sync(pblk, &rqd);
 	if (ret) {
@@ -771,16 +784,63 @@ int pblk_line_smeta_read(struct pblk *pblk, struct pblk_line *line)
 	return ret;
 }
 
-static int pblk_line_smeta_write(struct pblk *pblk, struct pblk_line *line,
-				 u64 paddr)
+int pblk_line_smeta_read(struct pblk *pblk, struct pblk_line *line)
+{
+	struct pblk_line_meta *lm = &pblk->lm;
+	int i, ret = 0;
+	u64 paddr = pblk_line_smeta_start(pblk, line);
+
+	for (i = 0; i < lm->smeta_copies; i++) {
+		ret = pblk_line_smeta_read_copy(pblk, line,
+						paddr + (i * lm->smeta_sec));
+		if (!ret) {
+			/*
+			 * Just one successfully read copy of smeta is
+			 * enough for us for recovery, don't need to
+			 * read another one.
+			 */
+			return ret;
+		}
+	}
+	return ret;
+}
+
+static int pblk_line_smeta_write(struct pblk *pblk, struct pblk_line *line)
 {
 	struct nvm_tgt_dev *dev = pblk->dev;
+	struct nvm_geo *geo = &dev->geo;
 	struct pblk_line_meta *lm = &pblk->lm;
 	struct bio *bio;
 	struct nvm_rq rqd;
 	__le64 *lba_list = emeta_to_lbas(pblk, line->emeta->buf);
 	__le64 addr_empty = cpu_to_le64(ADDR_EMPTY);
-	int i, ret;
+	u64 paddr = 0;
+	int smeta_wr_len = lm->smeta_len;
+	int smeta_wr_sec = lm->smeta_sec;
+	int i, ret, rq_writes;
+
+	/*
+	 * Check if we can write all the smeta copies with
+	 * a single write command.
+	 * If yes -> copy smeta sector into multiple copies
+	 * in buffer to write.
+	 * If no -> issue writes one by one using the same
+	 * buffer space.
+	 * Only if all the copies are written correctly
+	 * we are treating this line as valid for proper
+	 * UBER reliability.
+	 */
+	if (lm->smeta_sec * lm->smeta_copies > pblk->max_write_pgs) {
+		rq_writes = lm->smeta_copies;
+	} else {
+		rq_writes = 1;
+		for (i = 1; i < lm->smeta_copies; i++) {
+			memcpy(line->smeta + i * lm->smeta_len,
+			       line->smeta, lm->smeta_len);
+		}
+		smeta_wr_len *= lm->smeta_copies;
+		smeta_wr_sec *= lm->smeta_copies;
+	}
 
 	memset(&rqd, 0, sizeof(struct nvm_rq));
 
@@ -788,7 +848,8 @@ static int pblk_line_smeta_write(struct pblk *pblk, struct pblk_line *line,
 	if (ret)
 		return ret;
 
-	bio = bio_map_kern(dev->q, line->smeta, lm->smeta_len, GFP_KERNEL);
+next_rq:
+	bio = bio_map_kern(dev->q, line->smeta, smeta_wr_len, GFP_KERNEL);
 	if (IS_ERR(bio)) {
 		ret = PTR_ERR(bio);
 		goto clear_rqd;
@@ -799,15 +860,23 @@ static int pblk_line_smeta_write(struct pblk *pblk, struct pblk_line *line,
 
 	rqd.bio = bio;
 	rqd.opcode = NVM_OP_PWRITE;
-	rqd.nr_ppas = lm->smeta_sec;
+	rqd.nr_ppas = smeta_wr_sec;
 	rqd.is_seq = 1;
 
-	for (i = 0; i < lm->smeta_sec; i++, paddr++) {
-		struct pblk_sec_meta *meta = pblk_get_meta(pblk,
-							   rqd.meta_list, i);
+	for (i = 0; i < rqd.nr_ppas; i++, paddr++) {
+		void *meta_list = rqd.meta_list;
+		struct ppa_addr ppa = addr_to_gen_ppa(pblk, paddr, line->id);
+		int pos = pblk_ppa_to_pos(geo, ppa);
 
-		rqd.ppa_list[i] = addr_to_gen_ppa(pblk, paddr, line->id);
-		meta->lba = lba_list[paddr] = addr_empty;
+		while (test_bit(pos, line->blk_bitmap)) {
+			paddr += pblk->min_write_pgs;
+			ppa = addr_to_gen_ppa(pblk, paddr, line->id);
+			pos = pblk_ppa_to_pos(geo, ppa);
+		}
+
+		rqd.ppa_list[i] = ppa;
+		pblk_get_meta(pblk, meta_list, i)->lba = addr_empty;
+		lba_list[paddr] = addr_empty;
 	}
 
 	ret = pblk_submit_io_sync_sem(pblk, &rqd);
@@ -822,8 +891,13 @@ static int pblk_line_smeta_write(struct pblk *pblk, struct pblk_line *line,
 	if (rqd.error) {
 		pblk_log_write_err(pblk, &rqd);
 		ret = -EIO;
+		goto clear_rqd;
 	}
 
+	rq_writes--;
+	if (rq_writes > 0)
+		goto next_rq;
+
 clear_rqd:
 	pblk_free_rqd_meta(pblk, &rqd);
 	return ret;
@@ -1020,7 +1094,7 @@ static void pblk_line_setup_metadata(struct pblk_line *line,
 	line->smeta = l_mg->sline_meta[meta_line];
 	line->emeta = l_mg->eline_meta[meta_line];
 
-	memset(line->smeta, 0, lm->smeta_len);
+	memset(line->smeta, 0, lm->smeta_len * lm->smeta_copies);
 	memset(line->emeta->buf, 0, lm->emeta_len[0]);
 
 	line->emeta->mem = 0;
@@ -1147,7 +1221,7 @@ static int pblk_line_init_bb(struct pblk *pblk, struct pblk_line *line,
 	struct pblk_line_mgmt *l_mg = &pblk->l_mg;
 	u64 off;
 	int bit = -1;
-	int emeta_secs;
+	int emeta_secs, smeta_secs;
 
 	line->sec_in_line = lm->sec_per_line;
 
@@ -1163,13 +1237,19 @@ static int pblk_line_init_bb(struct pblk *pblk, struct pblk_line *line,
 	}
 
 	/* Mark smeta metadata sectors as bad sectors */
-	bit = find_first_zero_bit(line->blk_bitmap, lm->blk_per_line);
-	off = bit * geo->ws_opt;
-	bitmap_set(line->map_bitmap, off, lm->smeta_sec);
-	line->sec_in_line -= lm->smeta_sec;
-	line->cur_sec = off + lm->smeta_sec;
+	smeta_secs = lm->smeta_sec * lm->smeta_copies;
+	bit = -1;
+	while (smeta_secs) {
+		bit = find_next_zero_bit(line->blk_bitmap, lm->blk_per_line,
+					bit + 1);
+		off = bit * geo->ws_opt;
+		bitmap_set(line->map_bitmap, off, geo->ws_opt);
+		line->cur_sec = off + geo->ws_opt;
+		smeta_secs -= lm->smeta_sec;
+	}
+	line->sec_in_line -= (lm->smeta_sec * lm->smeta_copies);
 
-	if (init && pblk_line_smeta_write(pblk, line, off)) {
+	if (init && pblk_line_smeta_write(pblk, line)) {
 		pblk_debug(pblk, "line smeta I/O failed. Retry\n");
 		return 0;
 	}
diff --git a/drivers/lightnvm/pblk-init.c b/drivers/lightnvm/pblk-init.c
index 4211cd1..5717ad4 100644
--- a/drivers/lightnvm/pblk-init.c
+++ b/drivers/lightnvm/pblk-init.c
@@ -27,6 +27,11 @@ static unsigned int write_buffer_size;
 module_param(write_buffer_size, uint, 0644);
 MODULE_PARM_DESC(write_buffer_size, "number of entries in a write buffer");
 
+static unsigned int smeta_copies = 1;
+
+module_param(smeta_copies, int, 0644);
+MODULE_PARM_DESC(smeta_copies, "number of smeta copies");
+
 struct pblk_global_caches {
 	struct kmem_cache	*ws;
 	struct kmem_cache	*rec;
@@ -867,7 +872,8 @@ static int pblk_line_mg_init(struct pblk *pblk)
 	 * emeta depends on the number of LUNs allocated to the pblk instance
 	 */
 	for (i = 0; i < PBLK_DATA_LINES; i++) {
-		l_mg->sline_meta[i] = kmalloc(lm->smeta_len, GFP_KERNEL);
+		l_mg->sline_meta[i] = kmalloc(lm->smeta_len
+						* lm->smeta_copies, GFP_KERNEL);
 		if (!l_mg->sline_meta[i])
 			goto fail_free_smeta;
 	}
@@ -967,6 +973,12 @@ static int pblk_line_meta_init(struct pblk *pblk)
 	lm->mid_thrs = lm->sec_per_line / 2;
 	lm->high_thrs = lm->sec_per_line / 4;
 	lm->meta_distance = (geo->all_luns / 2) * pblk->min_write_pgs;
+	lm->smeta_copies = smeta_copies;
+
+	if (lm->smeta_copies < 1 || lm->smeta_copies > geo->all_luns) {
+		pblk_err(pblk, "unsupported smeta copies parameter\n");
+		return -EINVAL;
+	}
 
 	/* Calculate necessary pages for smeta. See comment over struct
 	 * line_smeta definition
@@ -998,10 +1010,11 @@ static int pblk_line_meta_init(struct pblk *pblk)
 
 	lm->emeta_bb = geo->all_luns > i ? geo->all_luns - i : 0;
 
-	lm->min_blk_line = 1;
-	if (geo->all_luns > 1)
-		lm->min_blk_line += DIV_ROUND_UP(lm->smeta_sec +
-					lm->emeta_sec[0], geo->clba);
+	lm->min_blk_line = lm->smeta_copies;
+	if (geo->all_luns > lm->smeta_copies) {
+		lm->min_blk_line += DIV_ROUND_UP((lm->smeta_sec
+			* lm->smeta_copies) + lm->emeta_sec[0], geo->clba);
+	}
 
 	if (lm->min_blk_line > lm->blk_per_line) {
 		pblk_err(pblk, "config. not supported. Min. LUN in line:%d\n",
diff --git a/drivers/lightnvm/pblk-recovery.c b/drivers/lightnvm/pblk-recovery.c
index 74e5b17..038931d 100644
--- a/drivers/lightnvm/pblk-recovery.c
+++ b/drivers/lightnvm/pblk-recovery.c
@@ -51,7 +51,8 @@ static int pblk_recov_l2p_from_emeta(struct pblk *pblk, struct pblk_line *line)
 	if (!lba_list)
 		return 1;
 
-	data_start = pblk_line_smeta_start(pblk, line) + lm->smeta_sec;
+	data_start = pblk_line_smeta_start(pblk, line)
+					+ (lm->smeta_sec * lm->smeta_copies);
 	data_end = line->emeta_ssec;
 	nr_valid_lbas = le64_to_cpu(emeta_buf->nr_valid_lbas);
 
@@ -140,7 +141,8 @@ static u64 pblk_sec_in_open_line(struct pblk *pblk, struct pblk_line *line)
 	if (lm->blk_per_line - nr_bb != valid_chunks)
 		pblk_err(pblk, "recovery line %d is bad\n", line->id);
 
-	pblk_update_line_wp(pblk, line, written_secs - lm->smeta_sec);
+	pblk_update_line_wp(pblk, line, written_secs -
+					(lm->smeta_sec * lm->smeta_copies));
 
 	return written_secs;
 }
@@ -383,12 +385,14 @@ static int pblk_recov_scan_oob(struct pblk *pblk, struct pblk_line *line,
 	void *data;
 	dma_addr_t dma_ppa_list, dma_meta_list;
 	__le64 *lba_list;
-	u64 paddr = pblk_line_smeta_start(pblk, line) + lm->smeta_sec;
+	u64 paddr = pblk_line_smeta_start(pblk, line) +
+					(lm->smeta_sec * lm->smeta_copies);
 	bool padded = false;
 	int rq_ppas, rq_len;
 	int i, j;
 	int ret;
-	u64 left_ppas = pblk_sec_in_open_line(pblk, line) - lm->smeta_sec;
+	u64 left_ppas = pblk_sec_in_open_line(pblk, line) -
+					(lm->smeta_sec * lm->smeta_copies);
 
 	if (pblk_line_wps_are_unbalanced(pblk, line))
 		pblk_warn(pblk, "recovering unbalanced line (%d)\n", line->id);
@@ -722,7 +726,7 @@ struct pblk_line *pblk_recov_l2p(struct pblk *pblk)
 
 		line = &pblk->lines[i];
 
-		memset(smeta, 0, lm->smeta_len);
+		memset(smeta, 0, lm->smeta_len * lm->smeta_copies);
 		line->smeta = smeta;
 		line->lun_bitmap = ((void *)(smeta_buf)) +
 						sizeof(struct line_smeta);
diff --git a/drivers/lightnvm/pblk-rl.c b/drivers/lightnvm/pblk-rl.c
index b014957..944372c 100644
--- a/drivers/lightnvm/pblk-rl.c
+++ b/drivers/lightnvm/pblk-rl.c
@@ -218,7 +218,8 @@ void pblk_rl_init(struct pblk_rl *rl, int budget, int threshold)
 	unsigned int rb_windows;
 
 	/* Consider sectors used for metadata */
-	sec_meta = (lm->smeta_sec + lm->emeta_sec[0]) * l_mg->nr_free_lines;
+	sec_meta = ((lm->smeta_sec * lm->smeta_copies)
+			+ lm->emeta_sec[0]) * l_mg->nr_free_lines;
 	blk_meta = DIV_ROUND_UP(sec_meta, geo->clba);
 
 	rl->high = pblk->op_blks - blk_meta - lm->blk_per_line;
diff --git a/drivers/lightnvm/pblk.h b/drivers/lightnvm/pblk.h
index 3a84c8a..0999245 100644
--- a/drivers/lightnvm/pblk.h
+++ b/drivers/lightnvm/pblk.h
@@ -547,6 +547,7 @@ struct pblk_line_mgmt {
 struct pblk_line_meta {
 	unsigned int smeta_len;		/* Total length for smeta */
 	unsigned int smeta_sec;		/* Sectors needed for smeta */
+	unsigned int smeta_copies;	/* Number of smeta copies */
 
 	unsigned int emeta_len[4];	/* Lengths for emeta:
 					 *  [0]: Total
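
For illustration, with smeta_copies = 2 the head of a line is laid out
roughly as below (a sketch assuming one smeta copy per good block,
ws_opt sectors each; illustrative only, not part of the diff above):

	/*
	 * blk 0 (good): [ smeta copy 0 | user data ... ]
	 * blk 1 (bad) : skipped via line->blk_bitmap
	 * blk 2 (good): [ smeta copy 1 | user data ... ]
	 *
	 * pblk_line_init_bb() maps ws_opt sectors per copy, so user
	 * data starts right after the last smeta copy.
	 */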
-- 
2.9.5


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 14/16] lightnvm: pblk: GC error handling
  2019-03-22 14:48 [PATCH v2 00/16] lightnvm: next set of improvements for 5.2 Igor Konopko
                   ` (12 preceding siblings ...)
  2019-03-22 14:48 ` [PATCH v2 13/16] lightnvm: pblk: store multiple copies of smeta Igor Konopko
@ 2019-03-22 14:48 ` Igor Konopko
  2019-03-25 11:29   ` Matias Bjørling
  2019-03-22 14:48 ` [PATCH v2 15/16] lightnvm: pblk: use nvm_rq_to_ppa_list() Igor Konopko
  2019-03-22 14:48 ` [PATCH v2 16/16] lightnvm: track inflight target creations Igor Konopko
  15 siblings, 1 reply; 30+ messages in thread
From: Igor Konopko @ 2019-03-22 14:48 UTC (permalink / raw)
  To: mb, javier, hans.holmberg; +Cc: linux-block, igor.j.konopko

Currently, when there is an IO error (or similar) on the GC read path,
pblk still moves the line to the free state, which leads to data
mismatch issues.

This patch adds handling for such errors: after this change the line is
returned to the closed state, so it can still be used for reads. That
way we are at least able to return an error status to the user when the
affected data is read.
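
In pseudo-form, the new completion path behaves as in the following
sketch (a simplified view of __pblk_line_put() from the diff below,
illustrative only):

	if (line->w_err_gc->has_gc_err) {
		/* GC read failed: keep the data and return the line
		 * to the closed state, so reads can still reach it
		 */
		pblk_put_line_back(pblk, line);
	} else {
		/* all valid data was moved: the line can be freed */
		line->state = PBLK_LINESTATE_FREE;
	}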

Signed-off-by: Igor Konopko <igor.j.konopko@intel.com>
Reviewed-by: Javier González <javier@javigon.com>
Reviewed-by: Hans Holmberg <hans.holmberg@cnexlabs.com>
---
 drivers/lightnvm/pblk-core.c | 8 ++++++++
 drivers/lightnvm/pblk-gc.c   | 5 ++---
 drivers/lightnvm/pblk-read.c | 1 -
 drivers/lightnvm/pblk.h      | 2 ++
 4 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/lightnvm/pblk-core.c b/drivers/lightnvm/pblk-core.c
index 9d9a9e2..1b88e5e 100644
--- a/drivers/lightnvm/pblk-core.c
+++ b/drivers/lightnvm/pblk-core.c
@@ -1799,6 +1799,14 @@ static void __pblk_line_put(struct pblk *pblk, struct pblk_line *line)
 
 	spin_lock(&line->lock);
 	WARN_ON(line->state != PBLK_LINESTATE_GC);
+	if (line->w_err_gc->has_gc_err) {
+		spin_unlock(&line->lock);
+		pblk_err(pblk, "line %d had errors during GC\n", line->id);
+		pblk_put_line_back(pblk, line);
+		line->w_err_gc->has_gc_err = 0;
+		return;
+	}
+
 	line->state = PBLK_LINESTATE_FREE;
 	trace_pblk_line_state(pblk_disk_name(pblk), line->id,
 					line->state);
diff --git a/drivers/lightnvm/pblk-gc.c b/drivers/lightnvm/pblk-gc.c
index e23b192..63ee205 100644
--- a/drivers/lightnvm/pblk-gc.c
+++ b/drivers/lightnvm/pblk-gc.c
@@ -59,7 +59,7 @@ static void pblk_gc_writer_kick(struct pblk_gc *gc)
 	wake_up_process(gc->gc_writer_ts);
 }
 
-static void pblk_put_line_back(struct pblk *pblk, struct pblk_line *line)
+void pblk_put_line_back(struct pblk *pblk, struct pblk_line *line)
 {
 	struct pblk_line_mgmt *l_mg = &pblk->l_mg;
 	struct list_head *move_list;
@@ -98,8 +98,7 @@ static void pblk_gc_line_ws(struct work_struct *work)
 	/* Read from GC victim block */
 	ret = pblk_submit_read_gc(pblk, gc_rq);
 	if (ret) {
-		pblk_err(pblk, "failed GC read in line:%d (err:%d)\n",
-								line->id, ret);
+		line->w_err_gc->has_gc_err = 1;
 		goto out;
 	}
 
diff --git a/drivers/lightnvm/pblk-read.c b/drivers/lightnvm/pblk-read.c
index 264d1d7..5481dfa 100644
--- a/drivers/lightnvm/pblk-read.c
+++ b/drivers/lightnvm/pblk-read.c
@@ -494,7 +494,6 @@ int pblk_submit_read_gc(struct pblk *pblk, struct pblk_gc_rq *gc_rq)
 
 	if (pblk_submit_io_sync(pblk, &rqd)) {
 		ret = -EIO;
-		pblk_err(pblk, "GC read request failed\n");
 		goto err_free_bio;
 	}
 
diff --git a/drivers/lightnvm/pblk.h b/drivers/lightnvm/pblk.h
index 0999245..0bc4255 100644
--- a/drivers/lightnvm/pblk.h
+++ b/drivers/lightnvm/pblk.h
@@ -425,6 +425,7 @@ struct pblk_smeta {
 
 struct pblk_w_err_gc {
 	int has_write_err;
+	int has_gc_err;
 	__le64 *lba_list;
 };
 
@@ -916,6 +917,7 @@ void pblk_gc_free_full_lines(struct pblk *pblk);
 void pblk_gc_sysfs_state_show(struct pblk *pblk, int *gc_enabled,
 			      int *gc_active);
 int pblk_gc_sysfs_force(struct pblk *pblk, int force);
+void pblk_put_line_back(struct pblk *pblk, struct pblk_line *line);
 
 /*
  * pblk rate limiter
-- 
2.9.5


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 15/16] lightnvm: pblk: use nvm_rq_to_ppa_list()
  2019-03-22 14:48 [PATCH v2 00/16] lightnvm: next set of improvements for 5.2 Igor Konopko
                   ` (13 preceding siblings ...)
  2019-03-22 14:48 ` [PATCH v2 14/16] lightnvm: pblk: GC error handling Igor Konopko
@ 2019-03-22 14:48 ` Igor Konopko
  2019-03-22 14:48 ` [PATCH v2 16/16] lightnvm: track inflight target creations Igor Konopko
  15 siblings, 0 replies; 30+ messages in thread
From: Igor Konopko @ 2019-03-22 14:48 UTC (permalink / raw)
  To: mb, javier, hans.holmberg; +Cc: linux-block, igor.j.konopko

This patch replaces the few remaining direct usages of rqd->ppa_list[]
with the existing nvm_rq_to_ppa_list() helper. This is needed for
theoretical devices with ws_min/ws_opt equal to 1, where a request may
carry a single PPA in rqd->ppa_addr rather than an allocated list.
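
For reference, the helper simply selects between the inline PPA and the
PPA list depending on the request size, as defined in
include/linux/lightnvm.h:

	static inline struct ppa_addr *nvm_rq_to_ppa_list(struct nvm_rq *rqd)
	{
		return (rqd->nr_ppas > 1) ? rqd->ppa_list : &rqd->ppa_addr;
	}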

Signed-off-by: Igor Konopko <igor.j.konopko@intel.com>
Reviewed-by: Javier González <javier@javigon.com>
---
 drivers/lightnvm/pblk-core.c     | 26 ++++++++++++++------------
 drivers/lightnvm/pblk-recovery.c | 13 ++++++++-----
 2 files changed, 22 insertions(+), 17 deletions(-)

diff --git a/drivers/lightnvm/pblk-core.c b/drivers/lightnvm/pblk-core.c
index 1b88e5e..cd99c8c 100644
--- a/drivers/lightnvm/pblk-core.c
+++ b/drivers/lightnvm/pblk-core.c
@@ -562,11 +562,9 @@ int pblk_submit_io_sync(struct pblk *pblk, struct nvm_rq *rqd)
 
 int pblk_submit_io_sync_sem(struct pblk *pblk, struct nvm_rq *rqd)
 {
-	struct ppa_addr *ppa_list;
+	struct ppa_addr *ppa_list = nvm_rq_to_ppa_list(rqd);
 	int ret;
 
-	ppa_list = (rqd->nr_ppas > 1) ? rqd->ppa_list : &rqd->ppa_addr;
-
 	pblk_down_chunk(pblk, ppa_list[0]);
 	ret = pblk_submit_io_sync(pblk, rqd);
 	pblk_up_chunk(pblk, ppa_list[0]);
@@ -727,6 +725,7 @@ static int pblk_line_smeta_read_copy(struct pblk *pblk,
 	struct nvm_geo *geo = &dev->geo;
 	struct pblk_line_meta *lm = &pblk->lm;
 	struct bio *bio;
+	struct ppa_addr *ppa_list;
 	struct nvm_rq rqd;
 	int i, ret;
 
@@ -749,6 +748,7 @@ static int pblk_line_smeta_read_copy(struct pblk *pblk,
 	rqd.opcode = NVM_OP_PREAD;
 	rqd.nr_ppas = lm->smeta_sec;
 	rqd.is_seq = 1;
+	ppa_list = nvm_rq_to_ppa_list(&rqd);
 
 	for (i = 0; i < rqd.nr_ppas; i++, paddr++) {
 		struct ppa_addr ppa = addr_to_gen_ppa(pblk, paddr, line->id);
@@ -760,7 +760,7 @@ static int pblk_line_smeta_read_copy(struct pblk *pblk,
 			pos = pblk_ppa_to_pos(geo, ppa);
 		}
 
-		rqd.ppa_list[i] = ppa;
+		ppa_list[i] = ppa;
 		pblk_get_meta(pblk, rqd.meta_list, i)->lba =
 				  cpu_to_le64(ADDR_EMPTY);
 	}
@@ -811,6 +811,7 @@ static int pblk_line_smeta_write(struct pblk *pblk, struct pblk_line *line)
 	struct nvm_geo *geo = &dev->geo;
 	struct pblk_line_meta *lm = &pblk->lm;
 	struct bio *bio;
+	struct ppa_addr *ppa_list;
 	struct nvm_rq rqd;
 	__le64 *lba_list = emeta_to_lbas(pblk, line->emeta->buf);
 	__le64 addr_empty = cpu_to_le64(ADDR_EMPTY);
@@ -862,6 +863,7 @@ static int pblk_line_smeta_write(struct pblk *pblk, struct pblk_line *line)
 	rqd.opcode = NVM_OP_PWRITE;
 	rqd.nr_ppas = smeta_wr_sec;
 	rqd.is_seq = 1;
+	ppa_list = nvm_rq_to_ppa_list(&rqd);
 
 	for (i = 0; i < rqd.nr_ppas; i++, paddr++) {
 		void *meta_list = rqd.meta_list;
@@ -874,7 +876,7 @@ static int pblk_line_smeta_write(struct pblk *pblk, struct pblk_line *line)
 			pos = pblk_ppa_to_pos(geo, ppa);
 		}
 
-		rqd.ppa_list[i] = ppa;
+		ppa_list[i] = ppa;
 		pblk_get_meta(pblk, meta_list, i)->lba = addr_empty;
 		lba_list[paddr] = addr_empty;
 	}
@@ -910,8 +912,9 @@ int pblk_line_emeta_read(struct pblk *pblk, struct pblk_line *line,
 	struct nvm_geo *geo = &dev->geo;
 	struct pblk_line_mgmt *l_mg = &pblk->l_mg;
 	struct pblk_line_meta *lm = &pblk->lm;
-	void *ppa_list, *meta_list;
+	void *ppa_list_buf, *meta_list;
 	struct bio *bio;
+	struct ppa_addr *ppa_list;
 	struct nvm_rq rqd;
 	u64 paddr = line->emeta_ssec;
 	dma_addr_t dma_ppa_list, dma_meta_list;
@@ -927,7 +930,7 @@ int pblk_line_emeta_read(struct pblk *pblk, struct pblk_line *line,
 	if (!meta_list)
 		return -ENOMEM;
 
-	ppa_list = meta_list + pblk_dma_meta_size(pblk);
+	ppa_list_buf = meta_list + pblk_dma_meta_size(pblk);
 	dma_ppa_list = dma_meta_list + pblk_dma_meta_size(pblk);
 
 next_rq:
@@ -948,11 +951,12 @@ int pblk_line_emeta_read(struct pblk *pblk, struct pblk_line *line,
 
 	rqd.bio = bio;
 	rqd.meta_list = meta_list;
-	rqd.ppa_list = ppa_list;
+	rqd.ppa_list = ppa_list_buf;
 	rqd.dma_meta_list = dma_meta_list;
 	rqd.dma_ppa_list = dma_ppa_list;
 	rqd.opcode = NVM_OP_PREAD;
 	rqd.nr_ppas = rq_ppas;
+	ppa_list = nvm_rq_to_ppa_list(&rqd);
 
 	for (i = 0; i < rqd.nr_ppas; ) {
 		struct ppa_addr ppa = addr_to_gen_ppa(pblk, paddr, line_id);
@@ -980,7 +984,7 @@ int pblk_line_emeta_read(struct pblk *pblk, struct pblk_line *line,
 		}
 
 		for (j = 0; j < min; j++, i++, paddr++)
-			rqd.ppa_list[i] = addr_to_gen_ppa(pblk, paddr, line_id);
+			ppa_list[i] = addr_to_gen_ppa(pblk, paddr, line_id);
 	}
 
 	ret = pblk_submit_io_sync(pblk, &rqd);
@@ -1605,11 +1609,9 @@ void pblk_ppa_to_line_put(struct pblk *pblk, struct ppa_addr ppa)
 
 void pblk_rq_to_line_put(struct pblk *pblk, struct nvm_rq *rqd)
 {
-	struct ppa_addr *ppa_list;
+	struct ppa_addr *ppa_list = nvm_rq_to_ppa_list(rqd);
 	int i;
 
-	ppa_list = (rqd->nr_ppas > 1) ? rqd->ppa_list : &rqd->ppa_addr;
-
 	for (i = 0; i < rqd->nr_ppas; i++)
 		pblk_ppa_to_line_put(pblk, ppa_list[i]);
 }
diff --git a/drivers/lightnvm/pblk-recovery.c b/drivers/lightnvm/pblk-recovery.c
index 038931d..b80341f 100644
--- a/drivers/lightnvm/pblk-recovery.c
+++ b/drivers/lightnvm/pblk-recovery.c
@@ -187,6 +187,7 @@ static int pblk_recov_pad_line(struct pblk *pblk, struct pblk_line *line,
 	struct pblk_pad_rq *pad_rq;
 	struct nvm_rq *rqd;
 	struct bio *bio;
+	struct ppa_addr *ppa_list;
 	void *data;
 	__le64 *lba_list = emeta_to_lbas(pblk, line->emeta->buf);
 	u64 w_ptr = line->cur_sec;
@@ -247,6 +248,7 @@ static int pblk_recov_pad_line(struct pblk *pblk, struct pblk_line *line,
 	rqd->end_io = pblk_end_io_recov;
 	rqd->private = pad_rq;
 
+	ppa_list = nvm_rq_to_ppa_list(rqd);
 	meta_list = rqd->meta_list;
 
 	for (i = 0; i < rqd->nr_ppas; ) {
@@ -274,17 +276,17 @@ static int pblk_recov_pad_line(struct pblk *pblk, struct pblk_line *line,
 			lba_list[w_ptr] = addr_empty;
 			meta = pblk_get_meta(pblk, meta_list, i);
 			meta->lba = addr_empty;
-			rqd->ppa_list[i] = dev_ppa;
+			ppa_list[i] = dev_ppa;
 		}
 	}
 
 	kref_get(&pad_rq->ref);
-	pblk_down_chunk(pblk, rqd->ppa_list[0]);
+	pblk_down_chunk(pblk, ppa_list[0]);
 
 	ret = pblk_submit_io(pblk, rqd);
 	if (ret) {
 		pblk_err(pblk, "I/O submission failed: %d\n", ret);
-		pblk_up_chunk(pblk, rqd->ppa_list[0]);
+		pblk_up_chunk(pblk, ppa_list[0]);
 		kref_put(&pad_rq->ref, pblk_recov_complete);
 		pblk_free_rqd(pblk, rqd, PBLK_WRITE_INT);
 		bio_put(bio);
@@ -430,6 +432,7 @@ static int pblk_recov_scan_oob(struct pblk *pblk, struct pblk_line *line,
 	rqd->ppa_list = ppa_list;
 	rqd->dma_ppa_list = dma_ppa_list;
 	rqd->dma_meta_list = dma_meta_list;
+	ppa_list = nvm_rq_to_ppa_list(rqd);
 
 	if (pblk_io_aligned(pblk, rq_ppas))
 		rqd->is_seq = 1;
@@ -448,7 +451,7 @@ static int pblk_recov_scan_oob(struct pblk *pblk, struct pblk_line *line,
 		}
 
 		for (j = 0; j < pblk->min_write_pgs; j++, i++)
-			rqd->ppa_list[i] =
+			ppa_list[i] =
 				addr_to_gen_ppa(pblk, paddr + j, line->id);
 	}
 
@@ -505,7 +508,7 @@ static int pblk_recov_scan_oob(struct pblk *pblk, struct pblk_line *line,
 			continue;
 
 		line->nr_valid_lbas++;
-		pblk_update_map(pblk, lba, rqd->ppa_list[i]);
+		pblk_update_map(pblk, lba, ppa_list[i]);
 	}
 
 	left_ppas -= rq_ppas;
-- 
2.9.5


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 16/16] lightnvm: track inflight target creations
  2019-03-22 14:48 [PATCH v2 00/16] lightnvm: next set of improvements for 5.2 Igor Konopko
                   ` (14 preceding siblings ...)
  2019-03-22 14:48 ` [PATCH v2 15/16] lightnvm: pblk: use nvm_rq_to_ppa_list() Igor Konopko
@ 2019-03-22 14:48 ` Igor Konopko
  15 siblings, 0 replies; 30+ messages in thread
From: Igor Konopko @ 2019-03-22 14:48 UTC (permalink / raw)
  To: mb, javier, hans.holmberg; +Cc: linux-block, igor.j.konopko

This patch adds a counter that tracks inflight target creations.

While the creation process is in progress, the target is not yet on the
targets list. This leaves a window in which the whole lightnvm device
can be torn down by a concurrent nvm_unregister() call, ultimately
causing a kernel panic inside the target init function.

With this patch we can track such scenarios and delay completing
nvm_unregister() and freeing memory until the target creation has
finished.
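
The race being closed looks roughly like this (an illustrative
timeline, not code from the patch):

	/*
	 *  CPU0: ioctl(NVM_DEV_CREATE)    CPU1: device removal
	 *  __nvm_configure_create()
	 *    dev = nvm_find_nvm_dev()
	 *                                  nvm_unregister(dev)
	 *                                    target not yet on
	 *                                    dev->targets, dev freed
	 *    nvm_create_tgt(dev, ...)      use-after-free, panic
	 *
	 * With dev->create_inflight held non-zero across nvm_create_tgt(),
	 * nvm_unregister() now waits for the counter to drop to zero
	 * before tearing the device down.
	 */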

Signed-off-by: Igor Konopko <igor.j.konopko@intel.com>
---
 drivers/lightnvm/core.c  | 19 ++++++++++++++++++-
 include/linux/lightnvm.h |  2 ++
 2 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/drivers/lightnvm/core.c b/drivers/lightnvm/core.c
index c01f83b..1edb9d8 100644
--- a/drivers/lightnvm/core.c
+++ b/drivers/lightnvm/core.c
@@ -1083,6 +1083,7 @@ static int nvm_core_init(struct nvm_dev *dev)
 	INIT_LIST_HEAD(&dev->targets);
 	mutex_init(&dev->mlock);
 	spin_lock_init(&dev->lock);
+	dev->create_inflight = 0;
 
 	ret = nvm_register_map(dev);
 	if (ret)
@@ -1180,6 +1181,11 @@ void nvm_unregister(struct nvm_dev *dev)
 	struct nvm_target *t, *tmp;
 
 	mutex_lock(&dev->mlock);
+	while (dev->create_inflight > 0) {
+		mutex_unlock(&dev->mlock);
+		io_schedule();
+		mutex_lock(&dev->mlock);
+	}
 	list_for_each_entry_safe(t, tmp, &dev->targets, list) {
 		if (t->dev->parent != dev)
 			continue;
@@ -1198,6 +1204,7 @@ EXPORT_SYMBOL(nvm_unregister);
 static int __nvm_configure_create(struct nvm_ioctl_create *create)
 {
 	struct nvm_dev *dev;
+	int ret;
 
 	down_write(&nvm_lock);
 	dev = nvm_find_nvm_dev(create->dev);
@@ -1208,7 +1215,17 @@ static int __nvm_configure_create(struct nvm_ioctl_create *create)
 		return -EINVAL;
 	}
 
-	return nvm_create_tgt(dev, create);
+	mutex_lock(&dev->mlock);
+	dev->create_inflight++;
+	mutex_unlock(&dev->mlock);
+
+	ret = nvm_create_tgt(dev, create);
+
+	mutex_lock(&dev->mlock);
+	dev->create_inflight--;
+	mutex_unlock(&dev->mlock);
+
+	return ret;
 }
 
 static long nvm_ioctl_info(struct file *file, void __user *arg)
diff --git a/include/linux/lightnvm.h b/include/linux/lightnvm.h
index d3b0270..e462d1d 100644
--- a/include/linux/lightnvm.h
+++ b/include/linux/lightnvm.h
@@ -428,6 +428,8 @@ struct nvm_dev {
 	char name[DISK_NAME_LEN];
 	void *private_data;
 
+	int create_inflight;
+
 	void *rmap;
 
 	struct mutex mlock;
-- 
2.9.5


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 02/16] lightnvm: pblk: IO path reorganization
  2019-03-22 14:48 ` [PATCH v2 02/16] lightnvm: pblk: IO path reorganization Igor Konopko
@ 2019-03-25  5:55   ` Javier González
  0 siblings, 0 replies; 30+ messages in thread
From: Javier González @ 2019-03-25  5:55 UTC (permalink / raw)
  To: Konopko, Igor J; +Cc: Matias Bjørling, Hans Holmberg, linux-block

> On 22 Mar 2019, at 22.48, Igor Konopko <igor.j.konopko@intel.com> wrote:
> 
> This patch prepares the read path for a new approach to partial read
> handling (which is needed for multi-page bvec handling). The most
> important change is moving the handling of completed and failed bios
> from pblk_make_rq() into the particular read and write functions. This
> is needed since, after the partial read path changes, the
> completed/failed bio will sometimes differ from the original one, so
> we can no longer do this in pblk_make_rq(). The other changes are a
> small read path refactor that reduces the size of the follow-up patch
> with the partial read changes. Generally, the goal of this patch is
> not to change functionality, but just to prepare the code for the
> following changes.
> 
> Signed-off-by: Igor Konopko <igor.j.konopko@intel.com>
> ---
> drivers/lightnvm/pblk-cache.c |  7 ++--
> drivers/lightnvm/pblk-init.c  | 14 ++------
> drivers/lightnvm/pblk-read.c  | 83 ++++++++++++++++++++-----------------------
> drivers/lightnvm/pblk.h       |  4 +--
> 4 files changed, 47 insertions(+), 61 deletions(-)
> 
> diff --git a/drivers/lightnvm/pblk-cache.c b/drivers/lightnvm/pblk-cache.c
> index c9fa26f..d2a3683 100644
> --- a/drivers/lightnvm/pblk-cache.c
> +++ b/drivers/lightnvm/pblk-cache.c
> @@ -18,7 +18,7 @@
> 
> #include "pblk.h"
> 
> -int pblk_write_to_cache(struct pblk *pblk, struct bio *bio, unsigned long flags)
> +void pblk_write_to_cache(struct pblk *pblk, struct bio *bio, unsigned long flags)
> {
> 	struct request_queue *q = pblk->dev->q;
> 	struct pblk_w_ctx w_ctx;
> @@ -43,6 +43,7 @@ int pblk_write_to_cache(struct pblk *pblk, struct bio *bio, unsigned long flags)
> 		goto retry;
> 	case NVM_IO_ERR:
> 		pblk_pipeline_stop(pblk);
> +		bio_io_error(bio);
> 		goto out;
> 	}
> 
> @@ -79,7 +80,9 @@ int pblk_write_to_cache(struct pblk *pblk, struct bio *bio, unsigned long flags)
> out:
> 	generic_end_io_acct(q, REQ_OP_WRITE, &pblk->disk->part0, start_time);
> 	pblk_write_should_kick(pblk);
> -	return ret;
> +
> +	if (ret == NVM_IO_DONE)
> +		bio_endio(bio);
> }
> 
> /*
> diff --git a/drivers/lightnvm/pblk-init.c b/drivers/lightnvm/pblk-init.c
> index e2e1193..4211cd1 100644
> --- a/drivers/lightnvm/pblk-init.c
> +++ b/drivers/lightnvm/pblk-init.c
> @@ -50,7 +50,6 @@ struct bio_set pblk_bio_set;
> static blk_qc_t pblk_make_rq(struct request_queue *q, struct bio *bio)
> {
> 	struct pblk *pblk = q->queuedata;
> -	int ret;
> 
> 	if (bio_op(bio) == REQ_OP_DISCARD) {
> 		pblk_discard(pblk, bio);
> @@ -65,7 +64,7 @@ static blk_qc_t pblk_make_rq(struct request_queue *q, struct bio *bio)
> 	 */
> 	if (bio_data_dir(bio) == READ) {
> 		blk_queue_split(q, &bio);
> -		ret = pblk_submit_read(pblk, bio);
> +		pblk_submit_read(pblk, bio);
> 	} else {
> 		/* Prevent deadlock in the case of a modest LUN configuration
> 		 * and large user I/Os. Unless stalled, the rate limiter
> @@ -74,16 +73,7 @@ static blk_qc_t pblk_make_rq(struct request_queue *q, struct bio *bio)
> 		if (pblk_get_secs(bio) > pblk_rl_max_io(&pblk->rl))
> 			blk_queue_split(q, &bio);
> 
> -		ret = pblk_write_to_cache(pblk, bio, PBLK_IOTYPE_USER);
> -	}
> -
> -	switch (ret) {
> -	case NVM_IO_ERR:
> -		bio_io_error(bio);
> -		break;
> -	case NVM_IO_DONE:
> -		bio_endio(bio);
> -		break;
> +		pblk_write_to_cache(pblk, bio, PBLK_IOTYPE_USER);
> 	}
> 
> 	return BLK_QC_T_NONE;
> diff --git a/drivers/lightnvm/pblk-read.c b/drivers/lightnvm/pblk-read.c
> index 6569746..597fe6d 100644
> --- a/drivers/lightnvm/pblk-read.c
> +++ b/drivers/lightnvm/pblk-read.c
> @@ -179,7 +179,8 @@ static void pblk_end_user_read(struct bio *bio, int error)
> {
> 	if (error && error != NVM_RSP_WARN_HIGHECC)
> 		bio_io_error(bio);
> -	bio_endio(bio);
> +	else
> +		bio_endio(bio);
> }
> 
> static void __pblk_end_io_read(struct pblk *pblk, struct nvm_rq *rqd,
> @@ -383,7 +384,6 @@ static int pblk_partial_read_bio(struct pblk *pblk, struct nvm_rq *rqd,
> 
> 	/* Free allocated pages in new bio */
> 	pblk_bio_free_pages(pblk, rqd->bio, 0, rqd->bio->bi_vcnt);
> -	__pblk_end_io_read(pblk, rqd, false);
> 	return NVM_IO_ERR;
> }
> 
> @@ -428,7 +428,7 @@ static void pblk_read_rq(struct pblk *pblk, struct nvm_rq *rqd, struct bio *bio,
> 	}
> }
> 
> -int pblk_submit_read(struct pblk *pblk, struct bio *bio)
> +void pblk_submit_read(struct pblk *pblk, struct bio *bio)
> {
> 	struct nvm_tgt_dev *dev = pblk->dev;
> 	struct request_queue *q = dev->q;
> @@ -436,9 +436,9 @@ int pblk_submit_read(struct pblk *pblk, struct bio *bio)
> 	unsigned int nr_secs = pblk_get_secs(bio);
> 	struct pblk_g_ctx *r_ctx;
> 	struct nvm_rq *rqd;
> +	struct bio *int_bio;
> 	unsigned int bio_init_idx;
> 	DECLARE_BITMAP(read_bitmap, NVM_MAX_VLBA);
> -	int ret = NVM_IO_ERR;
> 
> 	generic_start_io_acct(q, REQ_OP_READ, bio_sectors(bio),
> 			      &pblk->disk->part0);
> @@ -449,74 +449,67 @@ int pblk_submit_read(struct pblk *pblk, struct bio *bio)
> 
> 	rqd->opcode = NVM_OP_PREAD;
> 	rqd->nr_ppas = nr_secs;
> -	rqd->bio = NULL; /* cloned bio if needed */
> 	rqd->private = pblk;
> 	rqd->end_io = pblk_end_io_read;
> 
> 	r_ctx = nvm_rq_to_pdu(rqd);
> 	r_ctx->start_time = jiffies;
> 	r_ctx->lba = blba;
> -	r_ctx->private = bio; /* original bio */
> 
> 	/* Save the index for this bio's start. This is needed in case
> 	 * we need to fill a partial read.
> 	 */
> 	bio_init_idx = pblk_get_bi_idx(bio);
> 
> -	if (pblk_alloc_rqd_meta(pblk, rqd))
> -		goto fail_rqd_free;
> +	if (pblk_alloc_rqd_meta(pblk, rqd)) {
> +		bio_io_error(bio);
> +		pblk_free_rqd(pblk, rqd, PBLK_READ);
> +		return;
> +	}
> +
> +	/* Clone read bio to deal internally with:
> +	 * -read errors when reading from drive
> +	 * -bio_advance() calls during l2p lookup and cache reads
> +	 */
> +	int_bio = bio_clone_fast(bio, GFP_KERNEL, &pblk_bio_set);
> 
> 	if (nr_secs > 1)
> 		pblk_read_ppalist_rq(pblk, rqd, bio, blba, read_bitmap);
> 	else
> 		pblk_read_rq(pblk, rqd, bio, blba, read_bitmap);
> 
> +	r_ctx->private = bio; /* original bio */
> +	rqd->bio = int_bio; /* internal bio */
> +
> 	if (bitmap_full(read_bitmap, nr_secs)) {
> +		pblk_end_user_read(bio, 0);
> 		atomic_inc(&pblk->inflight_io);
> 		__pblk_end_io_read(pblk, rqd, false);
> -		return NVM_IO_DONE;
> +		return;
> 	}
> 
> -	/* All sectors are to be read from the device */
> -	if (bitmap_empty(read_bitmap, rqd->nr_ppas)) {
> -		struct bio *int_bio = NULL;
> -
> -		/* Clone read bio to deal with read errors internally */
> -		int_bio = bio_clone_fast(bio, GFP_KERNEL, &pblk_bio_set);
> -		if (!int_bio) {
> -			pblk_err(pblk, "could not clone read bio\n");
> -			goto fail_end_io;
> -		}
> -
> -		rqd->bio = int_bio;
> -
> -		if (pblk_submit_io(pblk, rqd)) {
> +	if (!bitmap_empty(read_bitmap, rqd->nr_ppas)) {
> +		/* The read bio request could be partially filled by the write
> +		 * buffer, but there are some holes that need to be read from
> +		 * the drive.
> +		 */
> +		bio_put(int_bio);
> +		rqd->bio = NULL;
> +		if (pblk_partial_read_bio(pblk, rqd, bio_init_idx, read_bitmap,
> +					    nr_secs)) {
> 			pblk_err(pblk, "read IO submission failed\n");
> -			ret = NVM_IO_ERR;
> -			goto fail_end_io;
> +			bio_io_error(bio);
> +			__pblk_end_io_read(pblk, rqd, false);
> 		}
> -
> -		return NVM_IO_OK;
> +		return;
> 	}
> 
> -	/* The read bio request could be partially filled by the write buffer,
> -	 * but there are some holes that need to be read from the drive.
> -	 */
> -	ret = pblk_partial_read_bio(pblk, rqd, bio_init_idx, read_bitmap,
> -				    nr_secs);
> -	if (ret)
> -		goto fail_meta_free;
> -
> -	return NVM_IO_OK;
> -
> -fail_meta_free:
> -	nvm_dev_dma_free(dev->parent, rqd->meta_list, rqd->dma_meta_list);
> -fail_rqd_free:
> -	pblk_free_rqd(pblk, rqd, PBLK_READ);
> -	return ret;
> -fail_end_io:
> -	__pblk_end_io_read(pblk, rqd, false);
> -	return ret;
> +	/* All sectors are to be read from the device */
> +	if (pblk_submit_io(pblk, rqd)) {
> +		pblk_err(pblk, "read IO submission failed\n");
> +		bio_io_error(bio);
> +		__pblk_end_io_read(pblk, rqd, false);
> +	}
> }
> 
> static int read_ppalist_rq_gc(struct pblk *pblk, struct nvm_rq *rqd,
> diff --git a/drivers/lightnvm/pblk.h b/drivers/lightnvm/pblk.h
> index 381f074..a3c925d 100644
> --- a/drivers/lightnvm/pblk.h
> +++ b/drivers/lightnvm/pblk.h
> @@ -868,7 +868,7 @@ void pblk_get_packed_meta(struct pblk *pblk, struct nvm_rq *rqd);
> /*
>  * pblk user I/O write path
>  */
> -int pblk_write_to_cache(struct pblk *pblk, struct bio *bio,
> +void pblk_write_to_cache(struct pblk *pblk, struct bio *bio,
> 			unsigned long flags);
> int pblk_write_gc_to_cache(struct pblk *pblk, struct pblk_gc_rq *gc_rq);
> 
> @@ -894,7 +894,7 @@ void pblk_write_kick(struct pblk *pblk);
>  * pblk read path
>  */
> extern struct bio_set pblk_bio_set;
> -int pblk_submit_read(struct pblk *pblk, struct bio *bio);
> +void pblk_submit_read(struct pblk *pblk, struct bio *bio);
> int pblk_submit_read_gc(struct pblk *pblk, struct pblk_gc_rq *gc_rq);
> /*
>  * pblk recovery
> --
> 2.9.5

The patch looks sane.


Reviewed-by: Javier González <javier@javigon.com>


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 04/16] lightnvm: pblk: OOB recovery for closed chunks fix
  2019-03-22 14:48 ` [PATCH v2 04/16] lightnvm: pblk: OOB recovery for closed chunks fix Igor Konopko
@ 2019-03-25  6:02   ` Javier González
  2019-03-25 11:12     ` Igor Konopko
  0 siblings, 1 reply; 30+ messages in thread
From: Javier González @ 2019-03-25  6:02 UTC (permalink / raw)
  To: Konopko, Igor J; +Cc: Matias Bjørling, Hans Holmberg, linux-block

> On 22 Mar 2019, at 22.48, Igor Konopko <igor.j.konopko@intel.com> wrote:
> 
> In case of OOB recovery, when some of the chunks are in the closed
> state, we calculate the number of written sectors in the line
> incorrectly, because we always count the chunk WP, which for closed
> chunks no longer reflects the written sectors in that chunk. Based on
> the OCSSD 2.0 spec, the WP for closed chunks is equal to SLBA + NLB,
> and here we only need the NLB (clba in the code) value for the
> calculation. For such chunks this patch takes the clba field instead.
> 
> Signed-off-by: Igor Konopko <igor.j.konopko@intel.com>
> ---
> drivers/lightnvm/pblk-recovery.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/lightnvm/pblk-recovery.c b/drivers/lightnvm/pblk-recovery.c
> index 83b467b..bcd3633 100644
> --- a/drivers/lightnvm/pblk-recovery.c
> +++ b/drivers/lightnvm/pblk-recovery.c
> @@ -101,6 +101,8 @@ static void pblk_update_line_wp(struct pblk *pblk, struct pblk_line *line,
> 
> static u64 pblk_sec_in_open_line(struct pblk *pblk, struct pblk_line *line)
> {
> +	struct nvm_tgt_dev *dev = pblk->dev;
> +	struct nvm_geo *geo = &dev->geo;
> 	struct pblk_line_meta *lm = &pblk->lm;
> 	int nr_bb = bitmap_weight(line->blk_bitmap, lm->blk_per_line);
> 	u64 written_secs = 0;
> @@ -113,7 +115,11 @@ static u64 pblk_sec_in_open_line(struct pblk *pblk, struct pblk_line *line)
> 		if (chunk->state & NVM_CHK_ST_OFFLINE)
> 			continue;
> 
> -		written_secs += chunk->wp;
> +		if (chunk->state & NVM_CHK_ST_OPEN)
> +			written_secs += chunk->wp;
> +		else if (chunk->state & NVM_CHK_ST_CLOSED)
> +			written_secs += geo->clba;
> +
> 		valid_chunks++;
> 	}
> 
> --
> 2.9.5

Didn’t we agree that the patch was not needed?



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 07/16] lightnvm: pblk: wait for inflight IOs in recovery
  2019-03-22 14:48 ` [PATCH v2 07/16] lightnvm: pblk: wait for inflight IOs in recovery Igor Konopko
@ 2019-03-25  6:18   ` Javier González
  2019-03-25 11:17     ` Igor Konopko
  0 siblings, 1 reply; 30+ messages in thread
From: Javier González @ 2019-03-25  6:18 UTC (permalink / raw)
  To: Konopko, Igor J; +Cc: Matias Bjørling, Hans Holmberg, linux-block

> On 22 Mar 2019, at 22.48, Igor Konopko <igor.j.konopko@intel.com> wrote:
> 
> This patch changes the behaviour of recovery padding in order to
> support the case where some IOs were already submitted to the drive
> and subsequent ones are not submitted due to a returned error.
> 
> Currently, in case of errors, we simply exit the pad function without
> waiting for the inflight IOs, which leads to a panic when those IOs
> complete.
> 
> After this change we always wait for all the inflight IOs before
> exiting the function.
> 
> Signed-off-by: Igor Konopko <igor.j.konopko@intel.com>
> ---
> drivers/lightnvm/pblk-recovery.c | 25 ++++++++++++-------------
> 1 file changed, 12 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/lightnvm/pblk-recovery.c b/drivers/lightnvm/pblk-recovery.c
> index b2267bd..f7e383e 100644
> --- a/drivers/lightnvm/pblk-recovery.c
> +++ b/drivers/lightnvm/pblk-recovery.c
> @@ -200,7 +200,7 @@ static int pblk_recov_pad_line(struct pblk *pblk, struct pblk_line *line,
> 	rq_ppas = pblk_calc_secs(pblk, left_ppas, 0, false);
> 	if (rq_ppas < pblk->min_write_pgs) {
> 		pblk_err(pblk, "corrupted pad line %d\n", line->id);
> -		goto fail_free_pad;
> +		goto fail_complete;
> 	}
> 
> 	rq_len = rq_ppas * geo->csecs;
> @@ -209,7 +209,7 @@ static int pblk_recov_pad_line(struct pblk *pblk, struct pblk_line *line,
> 						PBLK_VMALLOC_META, GFP_KERNEL);
> 	if (IS_ERR(bio)) {
> 		ret = PTR_ERR(bio);
> -		goto fail_free_pad;
> +		goto fail_complete;
> 	}
> 
> 	bio->bi_iter.bi_sector = 0; /* internal bio */
> @@ -218,8 +218,11 @@ static int pblk_recov_pad_line(struct pblk *pblk, struct pblk_line *line,
> 	rqd = pblk_alloc_rqd(pblk, PBLK_WRITE_INT);
> 
> 	ret = pblk_alloc_rqd_meta(pblk, rqd);
> -	if (ret)
> -		goto fail_free_rqd;
> +	if (ret) {
> +		pblk_free_rqd(pblk, rqd, PBLK_WRITE_INT);
> +		bio_put(bio);
> +		goto fail_complete;
> +	}
> 
> 	rqd->bio = bio;
> 	rqd->opcode = NVM_OP_PWRITE;
> @@ -266,7 +269,10 @@ static int pblk_recov_pad_line(struct pblk *pblk, struct pblk_line *line,
> 	if (ret) {
> 		pblk_err(pblk, "I/O submission failed: %d\n", ret);
> 		pblk_up_chunk(pblk, rqd->ppa_list[0]);
> -		goto fail_free_rqd;
> +		kref_put(&pad_rq->ref, pblk_recov_complete);

Are you sure you need this? You are putting the reference on
fail_complete: already.

> +		pblk_free_rqd(pblk, rqd, PBLK_WRITE_INT);
> +		bio_put(bio);
> +		goto fail_complete;
> 	}
> 
> 	left_line_ppas -= rq_ppas;
> @@ -274,6 +280,7 @@ static int pblk_recov_pad_line(struct pblk *pblk, struct pblk_line *line,
> 	if (left_ppas && left_line_ppas)
> 		goto next_pad_rq;
> 
> +fail_complete:
> 	kref_put(&pad_rq->ref, pblk_recov_complete);
> 
> 	if (!wait_for_completion_io_timeout(&pad_rq->wait,
> @@ -289,14 +296,6 @@ static int pblk_recov_pad_line(struct pblk *pblk, struct pblk_line *line,
> free_rq:
> 	kfree(pad_rq);
> 	return ret;
> -
> -fail_free_rqd:
> -	pblk_free_rqd(pblk, rqd, PBLK_WRITE_INT);
> -	bio_put(bio);
> -fail_free_pad:
> -	kfree(pad_rq);
> -	vfree(data);
> -	return ret;
> }
> 
> static int pblk_pad_distance(struct pblk *pblk, struct pblk_line *line)
> --
> 2.9.5

Rest looks good


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 12/16] lightnvm: pblk: do not read OOB from emeta region
  2019-03-22 14:48 ` [PATCH v2 12/16] lightnvm: pblk: do not read OOB from emeta region Igor Konopko
@ 2019-03-25  6:23   ` Javier González
  2019-03-25 11:17     ` Igor Konopko
  0 siblings, 1 reply; 30+ messages in thread
From: Javier González @ 2019-03-25  6:23 UTC (permalink / raw)
  To: Konopko, Igor J; +Cc: Matias Bjørling, Hans Holmberg, linux-block

> On 22 Mar 2019, at 22.48, Igor Konopko <igor.j.konopko@intel.com> wrote:
> 
> The emeta region does not have valid corresponding OOB metadata, so
> there is no need to try to recover the L2P mapping from it.
> 
> Signed-off-by: Igor Konopko <igor.j.konopko@intel.com>
> ---
> drivers/lightnvm/pblk-recovery.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
> 
> diff --git a/drivers/lightnvm/pblk-recovery.c b/drivers/lightnvm/pblk-recovery.c
> index 4e8f382..74e5b17 100644
> --- a/drivers/lightnvm/pblk-recovery.c
> +++ b/drivers/lightnvm/pblk-recovery.c
> @@ -479,6 +479,15 @@ static int pblk_recov_scan_oob(struct pblk *pblk, struct pblk_line *line,
> 		goto retry_rq;
> 	}
> 
> +	if (paddr >= line->emeta_ssec) {
> +		/*
> +		 * We reached the emeta region and don't want
> +		 * to recover OOB from it.
> +		 */
> +		bio_put(bio);
> +		goto completed;
> +	}
> +
> 

This recovery path is supposed to be a last resource to recover an open
line. Since emeta_ssec is calculated based on the new state after
recovery, it is not guaranteed that the bad blocks are the same as when
the last pblk instance was interrupted. Thus, emeta_ssec is not to be
trusted.

I would rather do the full scan at this point.

Javier


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 13/16] lightnvm: pblk: store multiple copies of smeta
  2019-03-22 14:48 ` [PATCH v2 13/16] lightnvm: pblk: store multiple copies of smeta Igor Konopko
@ 2019-03-25  6:31   ` Javier González
  0 siblings, 0 replies; 30+ messages in thread
From: Javier González @ 2019-03-25  6:31 UTC (permalink / raw)
  To: Konopko, Igor J; +Cc: Matias Bjørling, Hans Holmberg, linux-block

> On 22 Mar 2019, at 22.48, Igor Konopko <igor.j.konopko@intel.com> wrote:
> 
> Currently only one copy of smeta is stored per line in pblk. This is
> risky, because in case of a read error on such a chunk we lose all the
> data from the whole line, which leads to silent data corruption.
> 
> This patch changes this behaviour and allows more than one copy of
> smeta (specified by a module parameter) to be stored, providing higher
> reliability by mirroring the smeta struct and making it possible to
> fail over to another copy of that struct in case of a read error. Such
> an approach ensures that copies of this critical structure are stored
> on different dies, so the predicted UBER is multiple times higher.
> 
> Signed-off-by: Igor Konopko <igor.j.konopko@intel.com>
> ---
> drivers/lightnvm/pblk-core.c     | 124 ++++++++++++++++++++++++++++++++-------
> drivers/lightnvm/pblk-init.c     |  23 ++++++--
> drivers/lightnvm/pblk-recovery.c |  14 +++--
> drivers/lightnvm/pblk-rl.c       |   3 +-
> drivers/lightnvm/pblk.h          |   1 +
> 5 files changed, 132 insertions(+), 33 deletions(-)
> 
> diff --git a/drivers/lightnvm/pblk-core.c b/drivers/lightnvm/pblk-core.c
> index 06ac3f0..9d9a9e2 100644
> --- a/drivers/lightnvm/pblk-core.c
> +++ b/drivers/lightnvm/pblk-core.c
> @@ -720,13 +720,14 @@ u64 pblk_line_smeta_start(struct pblk *pblk, struct pblk_line *line)
> 	return bit * geo->ws_opt;
> }
> 
> -int pblk_line_smeta_read(struct pblk *pblk, struct pblk_line *line)
> +static int pblk_line_smeta_read_copy(struct pblk *pblk,
> +				     struct pblk_line *line, u64 paddr)
> {
> 	struct nvm_tgt_dev *dev = pblk->dev;
> +	struct nvm_geo *geo = &dev->geo;
> 	struct pblk_line_meta *lm = &pblk->lm;
> 	struct bio *bio;
> 	struct nvm_rq rqd;
> -	u64 paddr = pblk_line_smeta_start(pblk, line);
> 	int i, ret;
> 
> 	memset(&rqd, 0, sizeof(struct nvm_rq));
> @@ -749,8 +750,20 @@ int pblk_line_smeta_read(struct pblk *pblk, struct pblk_line *line)
> 	rqd.nr_ppas = lm->smeta_sec;
> 	rqd.is_seq = 1;
> 
> -	for (i = 0; i < lm->smeta_sec; i++, paddr++)
> -		rqd.ppa_list[i] = addr_to_gen_ppa(pblk, paddr, line->id);
> +	for (i = 0; i < rqd.nr_ppas; i++, paddr++) {
> +		struct ppa_addr ppa = addr_to_gen_ppa(pblk, paddr, line->id);
> +		int pos = pblk_ppa_to_pos(geo, ppa);
> +
> +		while (test_bit(pos, line->blk_bitmap)) {
> +			paddr += pblk->min_write_pgs;
> +			ppa = addr_to_gen_ppa(pblk, paddr, line->id);
> +			pos = pblk_ppa_to_pos(geo, ppa);
> +		}
> +
> +		rqd.ppa_list[i] = ppa;
> +		pblk_get_meta(pblk, rqd.meta_list, i)->lba =
> +				  cpu_to_le64(ADDR_EMPTY);
> +	}
> 
> 	ret = pblk_submit_io_sync(pblk, &rqd);
> 	if (ret) {
> @@ -771,16 +784,63 @@ int pblk_line_smeta_read(struct pblk *pblk, struct pblk_line *line)
> 	return ret;
> }
> 
> -static int pblk_line_smeta_write(struct pblk *pblk, struct pblk_line *line,
> -				 u64 paddr)
> +int pblk_line_smeta_read(struct pblk *pblk, struct pblk_line *line)
> +{
> +	struct pblk_line_meta *lm = &pblk->lm;
> +	int i, ret = 0;
> +	u64 paddr = pblk_line_smeta_start(pblk, line);
> +
> +	for (i = 0; i < lm->smeta_copies; i++) {
> +		ret = pblk_line_smeta_read_copy(pblk, line,
> +						paddr + (i * lm->smeta_sec));
> +		if (!ret) {
> +			/*
> +			 * One successfully read copy of smeta is
> +			 * enough for recovery; there is no need
> +			 * to read another one.
> +			 */
> +			return ret;
> +		}
> +	}
> +	return ret;
> +}
> +
> +static int pblk_line_smeta_write(struct pblk *pblk, struct pblk_line *line)
> {
> 	struct nvm_tgt_dev *dev = pblk->dev;
> +	struct nvm_geo *geo = &dev->geo;
> 	struct pblk_line_meta *lm = &pblk->lm;
> 	struct bio *bio;
> 	struct nvm_rq rqd;
> 	__le64 *lba_list = emeta_to_lbas(pblk, line->emeta->buf);
> 	__le64 addr_empty = cpu_to_le64(ADDR_EMPTY);
> -	int i, ret;
> +	u64 paddr = 0;
> +	int smeta_wr_len = lm->smeta_len;
> +	int smeta_wr_sec = lm->smeta_sec;
> +	int i, ret, rq_writes;
> +
> +	/*
> +	 * Check if we can write all the smeta copies with
> +	 * a single write command.
> +	 * If yes -> duplicate the smeta sectors in the
> +	 * write buffer and issue a single write.
> +	 * If no -> issue the writes one by one, reusing
> +	 * the same buffer space.
> +	 * Only if all copies are written correctly do we
> +	 * treat this line as valid, for proper UBER
> +	 * reliability.
> +	 */
> +	if (lm->smeta_sec * lm->smeta_copies > pblk->max_write_pgs) {
> +		rq_writes = lm->smeta_copies;
> +	} else {
> +		rq_writes = 1;
> +		for (i = 1; i < lm->smeta_copies; i++) {
> +			memcpy(line->smeta + i * lm->smeta_len,
> +			       line->smeta, lm->smeta_len);
> +		}
> +		smeta_wr_len *= lm->smeta_copies;
> +		smeta_wr_sec *= lm->smeta_copies;
> +	}
> 
> 	memset(&rqd, 0, sizeof(struct nvm_rq));
> 
> @@ -788,7 +848,8 @@ static int pblk_line_smeta_write(struct pblk *pblk, struct pblk_line *line,
> 	if (ret)
> 		return ret;
> 
> -	bio = bio_map_kern(dev->q, line->smeta, lm->smeta_len, GFP_KERNEL);
> +next_rq:
> +	bio = bio_map_kern(dev->q, line->smeta, smeta_wr_len, GFP_KERNEL);
> 	if (IS_ERR(bio)) {
> 		ret = PTR_ERR(bio);
> 		goto clear_rqd;
> @@ -799,15 +860,23 @@ static int pblk_line_smeta_write(struct pblk *pblk, struct pblk_line *line,
> 
> 	rqd.bio = bio;
> 	rqd.opcode = NVM_OP_PWRITE;
> -	rqd.nr_ppas = lm->smeta_sec;
> +	rqd.nr_ppas = smeta_wr_sec;
> 	rqd.is_seq = 1;
> 
> -	for (i = 0; i < lm->smeta_sec; i++, paddr++) {
> -		struct pblk_sec_meta *meta = pblk_get_meta(pblk,
> -							   rqd.meta_list, i);
> +	for (i = 0; i < rqd.nr_ppas; i++, paddr++) {
> +		void *meta_list = rqd.meta_list;
> +		struct ppa_addr ppa = addr_to_gen_ppa(pblk, paddr, line->id);
> +		int pos = pblk_ppa_to_pos(geo, ppa);
> 
> -		rqd.ppa_list[i] = addr_to_gen_ppa(pblk, paddr, line->id);
> -		meta->lba = lba_list[paddr] = addr_empty;
> +		while (test_bit(pos, line->blk_bitmap)) {
> +			paddr += pblk->min_write_pgs;
> +			ppa = addr_to_gen_ppa(pblk, paddr, line->id);
> +			pos = pblk_ppa_to_pos(geo, ppa);
> +		}
> +
> +		rqd.ppa_list[i] = ppa;
> +		pblk_get_meta(pblk, meta_list, i)->lba = addr_empty;
> +		lba_list[paddr] = addr_empty;
> 	}
> 
> 	ret = pblk_submit_io_sync_sem(pblk, &rqd);
> @@ -822,8 +891,13 @@ static int pblk_line_smeta_write(struct pblk *pblk, struct pblk_line *line,
> 	if (rqd.error) {
> 		pblk_log_write_err(pblk, &rqd);
> 		ret = -EIO;
> +		goto clear_rqd;
> 	}
> 
> +	rq_writes--;
> +	if (rq_writes > 0)
> +		goto next_rq;
> +
> clear_rqd:
> 	pblk_free_rqd_meta(pblk, &rqd);
> 	return ret;
> @@ -1020,7 +1094,7 @@ static void pblk_line_setup_metadata(struct pblk_line *line,
> 	line->smeta = l_mg->sline_meta[meta_line];
> 	line->emeta = l_mg->eline_meta[meta_line];
> 
> -	memset(line->smeta, 0, lm->smeta_len);
> +	memset(line->smeta, 0, lm->smeta_len * lm->smeta_copies);
> 	memset(line->emeta->buf, 0, lm->emeta_len[0]);
> 
> 	line->emeta->mem = 0;
> @@ -1147,7 +1221,7 @@ static int pblk_line_init_bb(struct pblk *pblk, struct pblk_line *line,
> 	struct pblk_line_mgmt *l_mg = &pblk->l_mg;
> 	u64 off;
> 	int bit = -1;
> -	int emeta_secs;
> +	int emeta_secs, smeta_secs;
> 
> 	line->sec_in_line = lm->sec_per_line;
> 
> @@ -1163,13 +1237,19 @@ static int pblk_line_init_bb(struct pblk *pblk, struct pblk_line *line,
> 	}
> 
> 	/* Mark smeta metadata sectors as bad sectors */
> -	bit = find_first_zero_bit(line->blk_bitmap, lm->blk_per_line);
> -	off = bit * geo->ws_opt;
> -	bitmap_set(line->map_bitmap, off, lm->smeta_sec);
> -	line->sec_in_line -= lm->smeta_sec;
> -	line->cur_sec = off + lm->smeta_sec;
> +	smeta_secs = lm->smeta_sec * lm->smeta_copies;
> +	bit = -1;
> +	while (smeta_secs) {
> +		bit = find_next_zero_bit(line->blk_bitmap, lm->blk_per_line,
> +					bit + 1);
> +		off = bit * geo->ws_opt;
> +		bitmap_set(line->map_bitmap, off, geo->ws_opt);
> +		line->cur_sec = off + geo->ws_opt;
> +		smeta_secs -= lm->smeta_sec;
> +	}
> +	line->sec_in_line -= (lm->smeta_sec * lm->smeta_copies);
> 
> -	if (init && pblk_line_smeta_write(pblk, line, off)) {
> +	if (init && pblk_line_smeta_write(pblk, line)) {
> 		pblk_debug(pblk, "line smeta I/O failed. Retry\n");
> 		return 0;
> 	}
> diff --git a/drivers/lightnvm/pblk-init.c b/drivers/lightnvm/pblk-init.c
> index 4211cd1..5717ad4 100644
> --- a/drivers/lightnvm/pblk-init.c
> +++ b/drivers/lightnvm/pblk-init.c
> @@ -27,6 +27,11 @@ static unsigned int write_buffer_size;
> module_param(write_buffer_size, uint, 0644);
> MODULE_PARM_DESC(write_buffer_size, "number of entries in a write buffer");
> 
> +static unsigned int smeta_copies = 1;
> +
> +module_param(smeta_copies, int, 0644);
> +MODULE_PARM_DESC(smeta_copies, "number of smeta copies");
> +
> struct pblk_global_caches {
> 	struct kmem_cache	*ws;
> 	struct kmem_cache	*rec;
> @@ -867,7 +872,8 @@ static int pblk_line_mg_init(struct pblk *pblk)
> 	 * emeta depends on the number of LUNs allocated to the pblk instance
> 	 */
> 	for (i = 0; i < PBLK_DATA_LINES; i++) {
> -		l_mg->sline_meta[i] = kmalloc(lm->smeta_len, GFP_KERNEL);
> +		l_mg->sline_meta[i] = kmalloc(lm->smeta_len
> +						* lm->smeta_copies, GFP_KERNEL);
> 		if (!l_mg->sline_meta[i])
> 			goto fail_free_smeta;
> 	}
> @@ -967,6 +973,12 @@ static int pblk_line_meta_init(struct pblk *pblk)
> 	lm->mid_thrs = lm->sec_per_line / 2;
> 	lm->high_thrs = lm->sec_per_line / 4;
> 	lm->meta_distance = (geo->all_luns / 2) * pblk->min_write_pgs;
> +	lm->smeta_copies = smeta_copies;
> +
> +	if (lm->smeta_copies < 1 || lm->smeta_copies > geo->all_luns) {
> +		pblk_err(pblk, "unsupported smeta copies parameter\n");
> +		return -EINVAL;
> +	}
> 
> 	/* Calculate necessary pages for smeta. See comment over struct
> 	 * line_smeta definition
> @@ -998,10 +1010,11 @@ static int pblk_line_meta_init(struct pblk *pblk)
> 
> 	lm->emeta_bb = geo->all_luns > i ? geo->all_luns - i : 0;
> 
> -	lm->min_blk_line = 1;
> -	if (geo->all_luns > 1)
> -		lm->min_blk_line += DIV_ROUND_UP(lm->smeta_sec +
> -					lm->emeta_sec[0], geo->clba);
> +	lm->min_blk_line = lm->smeta_copies;
> +	if (geo->all_luns > lm->smeta_copies) {
> +		lm->min_blk_line += DIV_ROUND_UP((lm->smeta_sec
> +			* lm->smeta_copies) + lm->emeta_sec[0], geo->clba);
> +	}
> 
> 	if (lm->min_blk_line > lm->blk_per_line) {
> 		pblk_err(pblk, "config. not supported. Min. LUN in line:%d\n",
> diff --git a/drivers/lightnvm/pblk-recovery.c b/drivers/lightnvm/pblk-recovery.c
> index 74e5b17..038931d 100644
> --- a/drivers/lightnvm/pblk-recovery.c
> +++ b/drivers/lightnvm/pblk-recovery.c
> @@ -51,7 +51,8 @@ static int pblk_recov_l2p_from_emeta(struct pblk *pblk, struct pblk_line *line)
> 	if (!lba_list)
> 		return 1;
> 
> -	data_start = pblk_line_smeta_start(pblk, line) + lm->smeta_sec;
> +	data_start = pblk_line_smeta_start(pblk, line)
> +					+ (lm->smeta_sec * lm->smeta_copies);
> 	data_end = line->emeta_ssec;
> 	nr_valid_lbas = le64_to_cpu(emeta_buf->nr_valid_lbas);
> 
> @@ -140,7 +141,8 @@ static u64 pblk_sec_in_open_line(struct pblk *pblk, struct pblk_line *line)
> 	if (lm->blk_per_line - nr_bb != valid_chunks)
> 		pblk_err(pblk, "recovery line %d is bad\n", line->id);
> 
> -	pblk_update_line_wp(pblk, line, written_secs - lm->smeta_sec);
> +	pblk_update_line_wp(pblk, line, written_secs -
> +					(lm->smeta_sec * lm->smeta_copies));
> 
> 	return written_secs;
> }
> @@ -383,12 +385,14 @@ static int pblk_recov_scan_oob(struct pblk *pblk, struct pblk_line *line,
> 	void *data;
> 	dma_addr_t dma_ppa_list, dma_meta_list;
> 	__le64 *lba_list;
> -	u64 paddr = pblk_line_smeta_start(pblk, line) + lm->smeta_sec;
> +	u64 paddr = pblk_line_smeta_start(pblk, line) +
> +					(lm->smeta_sec * lm->smeta_copies);
> 	bool padded = false;
> 	int rq_ppas, rq_len;
> 	int i, j;
> 	int ret;
> -	u64 left_ppas = pblk_sec_in_open_line(pblk, line) - lm->smeta_sec;
> +	u64 left_ppas = pblk_sec_in_open_line(pblk, line) -
> +					(lm->smeta_sec * lm->smeta_copies);
> 
> 	if (pblk_line_wps_are_unbalanced(pblk, line))
> 		pblk_warn(pblk, "recovering unbalanced line (%d)\n", line->id);
> @@ -722,7 +726,7 @@ struct pblk_line *pblk_recov_l2p(struct pblk *pblk)
> 
> 		line = &pblk->lines[i];
> 
> -		memset(smeta, 0, lm->smeta_len);
> +		memset(smeta, 0, lm->smeta_len * lm->smeta_copies);
> 		line->smeta = smeta;
> 		line->lun_bitmap = ((void *)(smeta_buf)) +
> 						sizeof(struct line_smeta);
> diff --git a/drivers/lightnvm/pblk-rl.c b/drivers/lightnvm/pblk-rl.c
> index b014957..944372c 100644
> --- a/drivers/lightnvm/pblk-rl.c
> +++ b/drivers/lightnvm/pblk-rl.c
> @@ -218,7 +218,8 @@ void pblk_rl_init(struct pblk_rl *rl, int budget, int threshold)
> 	unsigned int rb_windows;
> 
> 	/* Consider sectors used for metadata */
> -	sec_meta = (lm->smeta_sec + lm->emeta_sec[0]) * l_mg->nr_free_lines;
> +	sec_meta = ((lm->smeta_sec * lm->smeta_copies)
> +			+ lm->emeta_sec[0]) * l_mg->nr_free_lines;
> 	blk_meta = DIV_ROUND_UP(sec_meta, geo->clba);
> 
> 	rl->high = pblk->op_blks - blk_meta - lm->blk_per_line;
> diff --git a/drivers/lightnvm/pblk.h b/drivers/lightnvm/pblk.h
> index 3a84c8a..0999245 100644
> --- a/drivers/lightnvm/pblk.h
> +++ b/drivers/lightnvm/pblk.h
> @@ -547,6 +547,7 @@ struct pblk_line_mgmt {
> struct pblk_line_meta {
> 	unsigned int smeta_len;		/* Total length for smeta */
> 	unsigned int smeta_sec;		/* Sectors needed for smeta */
> +	unsigned int smeta_copies;	/* Number of smeta copies */
> 
> 	unsigned int emeta_len[4];	/* Lengths for emeta:
> 					 *  [0]: Total
> --
> 2.9.5

Looks good.


Reviewed-by: Javier González <javier@javigon.com>


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 09/16] lightnvm: pblk: fix spin_unlock order
  2019-03-22 14:48 ` [PATCH v2 09/16] lightnvm: pblk: fix spin_unlock order Igor Konopko
@ 2019-03-25 11:09   ` Matias Bjørling
  0 siblings, 0 replies; 30+ messages in thread
From: Matias Bjørling @ 2019-03-25 11:09 UTC (permalink / raw)
  To: Igor Konopko, javier, hans.holmberg; +Cc: linux-block

On 3/22/19 3:48 PM, Igor Konopko wrote:
> In pblk_rb_tear_down_check() the spin_unlock() calls are not made in
> the proper order. This patch fixes that.
> 
> Fixes: a4bd217 ("lightnvm: physical block device (pblk) target")
> Signed-off-by: Igor Konopko <igor.j.konopko@intel.com>
> Reviewed-by: Javier González <javier@javigon.com>
> Reviewed-by: Hans Holmberg <hans.holmberg@cnexlabs.com>
> ---
>   drivers/lightnvm/pblk-rb.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/lightnvm/pblk-rb.c b/drivers/lightnvm/pblk-rb.c
> index 03c241b..3555014 100644
> --- a/drivers/lightnvm/pblk-rb.c
> +++ b/drivers/lightnvm/pblk-rb.c
> @@ -799,8 +799,8 @@ int pblk_rb_tear_down_check(struct pblk_rb *rb)
>   	}
>   
>   out:
> -	spin_unlock(&rb->w_lock);
>   	spin_unlock_irq(&rb->s_lock);
> +	spin_unlock(&rb->w_lock);
>   
>   	return ret;
>   }
> 

Thanks, applied.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 04/16] lightnvm: pblk: OOB recovery for closed chunks fix
  2019-03-25  6:02   ` Javier González
@ 2019-03-25 11:12     ` Igor Konopko
  0 siblings, 0 replies; 30+ messages in thread
From: Igor Konopko @ 2019-03-25 11:12 UTC (permalink / raw)
  To: Javier González, Matias Bjørling; +Cc: Hans Holmberg, linux-block



On 25.03.2019 07:02, Javier González wrote:
>> On 22 Mar 2019, at 22.48, Igor Konopko <igor.j.konopko@intel.com> wrote:
>>
>> In case of OOB recovery, when some of the chunks are in the closed
>> state, we calculate the number of written sectors in the line
>> incorrectly, because we always count the chunk WP, which for closed
>> chunks no longer reflects the written sectors in that chunk. Based on
>> the OCSSD 2.0 spec, the WP for closed chunks is equal to SLBA + NLB,
>> and here we only need the NLB (clba in the code) value for the
>> calculation. For such chunks this patch takes the clba field instead.
>>
>> Signed-off-by: Igor Konopko <igor.j.konopko@intel.com>
>> ---
>> drivers/lightnvm/pblk-recovery.c | 8 +++++++-
>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/lightnvm/pblk-recovery.c b/drivers/lightnvm/pblk-recovery.c
>> index 83b467b..bcd3633 100644
>> --- a/drivers/lightnvm/pblk-recovery.c
>> +++ b/drivers/lightnvm/pblk-recovery.c
>> @@ -101,6 +101,8 @@ static void pblk_update_line_wp(struct pblk *pblk, struct pblk_line *line,
>>
>> static u64 pblk_sec_in_open_line(struct pblk *pblk, struct pblk_line *line)
>> {
>> +	struct nvm_tgt_dev *dev = pblk->dev;
>> +	struct nvm_geo *geo = &dev->geo;
>> 	struct pblk_line_meta *lm = &pblk->lm;
>> 	int nr_bb = bitmap_weight(line->blk_bitmap, lm->blk_per_line);
>> 	u64 written_secs = 0;
>> @@ -113,7 +115,11 @@ static u64 pblk_sec_in_open_line(struct pblk *pblk, struct pblk_line *line)
>> 		if (chunk->state & NVM_CHK_ST_OFFLINE)
>> 			continue;
>>
>> -		written_secs += chunk->wp;
>> +		if (chunk->state & NVM_CHK_ST_OPEN)
>> +			written_secs += chunk->wp;
>> +		else if (chunk->state & NVM_CHK_ST_CLOSED)
>> +			written_secs += geo->clba;
>> +
>> 		valid_chunks++;
>> 	}
>>
>> --
>> 2.9.5
> 
> Didn’t we agree that the patch was not needed?
> 

I hope I explained what my motivation for these changes was. Honestly, 
I don't know whether we should fix this here or in the spec, which is 
why I sent it with v2 too. Matias should decide.
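
For context, a condensed standalone sketch of the counting rule under
discussion: open chunks contribute their write pointer, closed chunks a
full chunk of clba sectors. The struct and flag values below are
simplified stand-ins for the lightnvm ones, and wp is assumed here to
count from the chunk start:

enum chk_state {
	CHK_ST_OPEN	= 1 << 0,
	CHK_ST_CLOSED	= 1 << 1,
	CHK_ST_OFFLINE	= 1 << 2,
};

struct chk {
	int state;
	unsigned long long wp;
};

static unsigned long long written_secs(struct chk *chks, int nr_chks,
				       unsigned long long clba)
{
	unsigned long long secs = 0;
	int i;

	for (i = 0; i < nr_chks; i++) {
		if (chks[i].state & CHK_ST_OFFLINE)
			continue;
		if (chks[i].state & CHK_ST_OPEN)
			secs += chks[i].wp;	/* partially written */
		else if (chks[i].state & CHK_ST_CLOSED)
			secs += clba;		/* fully written chunk */
	}

	return secs;
}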

> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 10/16] lightnvm: pblk: kick writer on write recovery path
  2019-03-22 14:48 ` [PATCH v2 10/16] lightnvm: pblk: kick writer on write recovery path Igor Konopko
@ 2019-03-25 11:13   ` Matias Bjørling
  0 siblings, 0 replies; 30+ messages in thread
From: Matias Bjørling @ 2019-03-25 11:13 UTC (permalink / raw)
  To: Igor Konopko, javier, hans.holmberg; +Cc: linux-block

On 3/22/19 3:48 PM, Igor Konopko wrote:
> In case of write recovery path, there is a chance that writer thread
> is not active, so for sanity it would be good to always kick it.
> 
> Signed-off-by: Igor Konopko <igor.j.konopko@intel.com>
> Reviewed-by: Javier González <javier@javigon.com>
> Reviewed-by: Hans Holmberg <hans.holmberg@cnexlabs.com>
> ---
>   drivers/lightnvm/pblk-write.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/lightnvm/pblk-write.c b/drivers/lightnvm/pblk-write.c
> index 6593dea..4e63f9b 100644
> --- a/drivers/lightnvm/pblk-write.c
> +++ b/drivers/lightnvm/pblk-write.c
> @@ -228,6 +228,7 @@ static void pblk_submit_rec(struct work_struct *work)
>   	mempool_free(recovery, &pblk->rec_pool);
>   
>   	atomic_dec(&pblk->inflight_io);
> +	pblk_write_kick(pblk);
>   }
>   
>   
> 

Thanks Igor. I've updated the description a bit.

"In case of write recovery path, there is a chance that writer thread
is not active, kick immediately instead of waiting for timer."
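
The kick itself is cheap; pblk_write_kick() boils down to waking the
writer thread and re-arming its timer, roughly as in this sketch (the
pblk fields are flattened into parameters):

#include <linux/jiffies.h>
#include <linux/sched.h>
#include <linux/timer.h>

static void writer_kick(struct task_struct *writer_ts,
			struct timer_list *wtimer)
{
	/* Wake the writer now instead of waiting for the next tick. */
	wake_up_process(writer_ts);
	mod_timer(wtimer, jiffies + msecs_to_jiffies(1000));
}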

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 07/16] lightnvm: pblk: wait for inflight IOs in recovery
  2019-03-25  6:18   ` Javier González
@ 2019-03-25 11:17     ` Igor Konopko
  0 siblings, 0 replies; 30+ messages in thread
From: Igor Konopko @ 2019-03-25 11:17 UTC (permalink / raw)
  To: Javier González; +Cc: Matias Bjørling, Hans Holmberg, linux-block



On 25.03.2019 07:18, Javier González wrote:
> 
>> On 22 Mar 2019, at 22.48, Igor Konopko <igor.j.konopko@intel.com> wrote:
>>
>> This patch changes the behaviour of recovery padding in order to
>> support the case when some IOs were already submitted to the drive
>> and subsequent ones were not submitted due to a returned error.
>>
>> Currently, in case of errors, we simply exit the pad function without
>> waiting for inflight IOs, which leads to a panic on inflight IO
>> completion.
>>
>> After the changes we always wait for all the inflight IOs before
>> exiting the function.
>>
>> Signed-off-by: Igor Konopko <igor.j.konopko@intel.com>
>> ---
>> drivers/lightnvm/pblk-recovery.c | 25 ++++++++++++-------------
>> 1 file changed, 12 insertions(+), 13 deletions(-)
>>
>> diff --git a/drivers/lightnvm/pblk-recovery.c b/drivers/lightnvm/pblk-recovery.c
>> index b2267bd..f7e383e 100644
>> --- a/drivers/lightnvm/pblk-recovery.c
>> +++ b/drivers/lightnvm/pblk-recovery.c
>> @@ -200,7 +200,7 @@ static int pblk_recov_pad_line(struct pblk *pblk, struct pblk_line *line,
>> 	rq_ppas = pblk_calc_secs(pblk, left_ppas, 0, false);
>> 	if (rq_ppas < pblk->min_write_pgs) {
>> 		pblk_err(pblk, "corrupted pad line %d\n", line->id);
>> -		goto fail_free_pad;
>> +		goto fail_complete;
>> 	}
>>
>> 	rq_len = rq_ppas * geo->csecs;
>> @@ -209,7 +209,7 @@ static int pblk_recov_pad_line(struct pblk *pblk, struct pblk_line *line,
>> 						PBLK_VMALLOC_META, GFP_KERNEL);
>> 	if (IS_ERR(bio)) {
>> 		ret = PTR_ERR(bio);
>> -		goto fail_free_pad;
>> +		goto fail_complete;
>> 	}
>>
>> 	bio->bi_iter.bi_sector = 0; /* internal bio */
>> @@ -218,8 +218,11 @@ static int pblk_recov_pad_line(struct pblk *pblk, struct pblk_line *line,
>> 	rqd = pblk_alloc_rqd(pblk, PBLK_WRITE_INT);
>>
>> 	ret = pblk_alloc_rqd_meta(pblk, rqd);
>> -	if (ret)
>> -		goto fail_free_rqd;
>> +	if (ret) {
>> +		pblk_free_rqd(pblk, rqd, PBLK_WRITE_INT);
>> +		bio_put(bio);
>> +		goto fail_complete;
>> +	}
>>
>> 	rqd->bio = bio;
>> 	rqd->opcode = NVM_OP_PWRITE;
>> @@ -266,7 +269,10 @@ static int pblk_recov_pad_line(struct pblk *pblk, struct pblk_line *line,
>> 	if (ret) {
>> 		pblk_err(pblk, "I/O submission failed: %d\n", ret);
>> 		pblk_up_chunk(pblk, rqd->ppa_list[0]);
>> -		goto fail_free_rqd;
>> +		kref_put(&pad_rq->ref, pblk_recov_complete);
> 
> Are you sure you need this? You are putting the reference on
> fail_complete: already.
> 

On fail_complete: I'm always putting the "extra" reference which comes 
from the kref_init() call in line 214 (no changes to the existing 
behaviour here).

Here I'm putting the reference taken for a single request in line 283, 
which for successfully submitted IOs is released in pblk_end_io_recov() 
in line 177.
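
In other words, the padding code follows the usual kref fan-out pattern:
one reference held by the submitter, one per inflight IO, and a
completion fired by the last put. A standalone sketch of that pattern;
pad_ctx, pad_release() and submit_one() are illustrative names, not the
pblk ones:

#include <linux/completion.h>
#include <linux/kref.h>

struct pad_ctx {
	struct kref ref;
	struct completion wait;
};

static void pad_release(struct kref *ref)
{
	struct pad_ctx *ctx = container_of(ref, struct pad_ctx, ref);

	complete(&ctx->wait);	/* last reference dropped: all IOs done */
}

/* Hypothetical helper; its end_io path does kref_put(&ctx->ref, ...). */
static int submit_one(struct pad_ctx *ctx, int i);

static int pad_submit_all(struct pad_ctx *ctx, int nr_reqs)
{
	int i, ret = 0;

	kref_init(&ctx->ref);		/* the submitter's "extra" reference */
	init_completion(&ctx->wait);

	for (i = 0; i < nr_reqs; i++) {
		kref_get(&ctx->ref);	/* one reference per inflight IO */
		ret = submit_one(ctx, i);
		if (ret) {
			/* never reached end_io, so drop its ref here */
			kref_put(&ctx->ref, pad_release);
			break;
		}
	}

	kref_put(&ctx->ref, pad_release);	/* drop submitter reference */
	wait_for_completion(&ctx->wait);
	return ret;
}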

>> +		pblk_free_rqd(pblk, rqd, PBLK_WRITE_INT);
>> +		bio_put(bio);
>> +		goto fail_complete;
>> 	}
>>
>> 	left_line_ppas -= rq_ppas;
>> @@ -274,6 +280,7 @@ static int pblk_recov_pad_line(struct pblk *pblk, struct pblk_line *line,
>> 	if (left_ppas && left_line_ppas)
>> 		goto next_pad_rq;
>>
>> +fail_complete:
>> 	kref_put(&pad_rq->ref, pblk_recov_complete);
>>
>> 	if (!wait_for_completion_io_timeout(&pad_rq->wait,
>> @@ -289,14 +296,6 @@ static int pblk_recov_pad_line(struct pblk *pblk, struct pblk_line *line,
>> free_rq:
>> 	kfree(pad_rq);
>> 	return ret;
>> -
>> -fail_free_rqd:
>> -	pblk_free_rqd(pblk, rqd, PBLK_WRITE_INT);
>> -	bio_put(bio);
>> -fail_free_pad:
>> -	kfree(pad_rq);
>> -	vfree(data);
>> -	return ret;
>> }
>>
>> static int pblk_pad_distance(struct pblk *pblk, struct pblk_line *line)
>> --
>> 2.9.5
> 
> Rest looks good
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 12/16] lightnvm: pblk: do not read OOB from emeta region
  2019-03-25  6:23   ` Javier González
@ 2019-03-25 11:17     ` Igor Konopko
  0 siblings, 0 replies; 30+ messages in thread
From: Igor Konopko @ 2019-03-25 11:17 UTC (permalink / raw)
  To: Javier González; +Cc: Matias Bjørling, Hans Holmberg, linux-block



On 25.03.2019 07:23, Javier González wrote:
>> On 22 Mar 2019, at 22.48, Igor Konopko <igor.j.konopko@intel.com> wrote:
>>
>> The emeta region does not have a valid corresponding OOB metadata
>> mapping, so there is no need to try to recover the L2P mapping from it.
>>
>> Signed-off-by: Igor Konopko <igor.j.konopko@intel.com>
>> ---
>> drivers/lightnvm/pblk-recovery.c | 10 ++++++++++
>> 1 file changed, 10 insertions(+)
>>
>> diff --git a/drivers/lightnvm/pblk-recovery.c b/drivers/lightnvm/pblk-recovery.c
>> index 4e8f382..74e5b17 100644
>> --- a/drivers/lightnvm/pblk-recovery.c
>> +++ b/drivers/lightnvm/pblk-recovery.c
>> @@ -479,6 +479,15 @@ static int pblk_recov_scan_oob(struct pblk *pblk, struct pblk_line *line,
>> 		goto retry_rq;
>> 	}
>>
>> +	if (paddr >= line->emeta_ssec) {
>> +		/*
>> +		 * We reach emeta region and we don't want
>> +		 * to recover oob from emeta region.
>> +		 */
>> +		bio_put(bio);
>> +		goto completed;
>> +	}
>> +
>>
> 
> This recovery path is supposed to be a last resort to recover an open
> line. Since emeta_ssec is calculated based on the new state after
> recovery, it is not guaranteed that the bad blocks are the same as when
> the last pblk instance was interrupted. Thus, emeta_ssec is not to be
> trusted.
> 
> I would rather do the full scan at this point.
> 

Ok, makes sense, so we can drop this patch.
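
To make the concern concrete with hypothetical numbers (the real emeta
start also skips bad chunks, so this is deliberately simplified):

static void emeta_ssec_shift_example(void)
{
	/* All numbers are illustrative only. */
	unsigned int line_secs_then = 1000; /* usable sectors when written */
	unsigned int line_secs_now  =  950; /* one more chunk bad since */
	unsigned int emeta_sec      =  100; /* sectors reserved for emeta */

	unsigned int emeta_ssec_then = line_secs_then - emeta_sec; /* 900 */
	unsigned int emeta_ssec_now  = line_secs_now  - emeta_sec; /* 850 */

	/*
	 * Sectors 850..899 look like emeta to the new instance but held
	 * user data in the old one, so stopping the OOB scan at the
	 * freshly computed emeta_ssec would lose L2P mappings.
	 */
}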

> Javier
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 11/16] lightnvm: pblk: fix update line wp in OOB recovery
  2019-03-22 14:48 ` [PATCH v2 11/16] lightnvm: pblk: fix update line wp in OOB recovery Igor Konopko
@ 2019-03-25 11:18   ` Matias Bjørling
  0 siblings, 0 replies; 30+ messages in thread
From: Matias Bjørling @ 2019-03-25 11:18 UTC (permalink / raw)
  To: Igor Konopko, javier, hans.holmberg; +Cc: linux-block

On 3/22/19 3:48 PM, Igor Konopko wrote:
> In case of OOB recovery, we can hit a scenario where all the data in
> a line was written and some part of emeta was written too. In such
> a case the pblk_update_line_wp() function will call the
> pblk_alloc_page() function, which will cause left_msecs to be set to a
> value below zero (since this field does not track the emeta region)
> and thus will lead to multiple kernel warnings. This patch fixes that
> issue.
> 
> Signed-off-by: Igor Konopko <igor.j.konopko@intel.com>
> Reviewed-by: Javier González <javier@javigon.com>
> ---
>   drivers/lightnvm/pblk-recovery.c | 16 +++++++++++++++-
>   1 file changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/lightnvm/pblk-recovery.c b/drivers/lightnvm/pblk-recovery.c
> index 65916f0..4e8f382 100644
> --- a/drivers/lightnvm/pblk-recovery.c
> +++ b/drivers/lightnvm/pblk-recovery.c
> @@ -93,10 +93,24 @@ static int pblk_recov_l2p_from_emeta(struct pblk *pblk, struct pblk_line *line)
>   static void pblk_update_line_wp(struct pblk *pblk, struct pblk_line *line,
>   				u64 written_secs)
>   {
> +	struct pblk_line_mgmt *l_mg = &pblk->l_mg;
>   	int i;
>   
>   	for (i = 0; i < written_secs; i += pblk->min_write_pgs)
> -		pblk_alloc_page(pblk, line, pblk->min_write_pgs);
> +		__pblk_alloc_page(pblk, line, pblk->min_write_pgs);
> +
> +	spin_lock(&l_mg->free_lock);
> +	if (written_secs > line->left_msecs) {
> +		/*
> +		 * We have all data sectors written
> +		 * and some emeta sectors written too.
> +		 */
> +		line->left_msecs = 0;
> +	} else {
> +		/* We have only some data sectors written. */
> +		line->left_msecs -= written_secs;
> +	}
> +	spin_unlock(&l_mg->free_lock);
>   }
>   
>   static u64 pblk_sec_in_open_line(struct pblk *pblk, struct pblk_line *line)
> 

Thanks, applied.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 14/16] lightnvm: pblk: GC error handling
  2019-03-22 14:48 ` [PATCH v2 14/16] lightnvm: pblk: GC error handling Igor Konopko
@ 2019-03-25 11:29   ` Matias Bjørling
  0 siblings, 0 replies; 30+ messages in thread
From: Matias Bjørling @ 2019-03-25 11:29 UTC (permalink / raw)
  To: Igor Konopko, javier, hans.holmberg; +Cc: linux-block

On 3/22/19 3:48 PM, Igor Konopko wrote:
> Currently when there is an IO error (or similar) on GC read path, pblk
> still moves this line to free state

Which line does "this" refer to?

> , what leads to data mismatch issue.

what -> which leads to a data mismatch.
> 
> This patch adds a handling for such an error - after that changes this
> line will be returned to closed state so it can be still in use
> for reading - at least we will be able to return error status to user
> when user tries to read such a data.

Could you update the description? It is hard for me to parse.

> 
> Signed-off-by: Igor Konopko <igor.j.konopko@intel.com>
> Reviewed-by: Javier González <javier@javigon.com>
> Reviewed-by: Hans Holmberg <hans.holmberg@cnexlabs.com>
> ---
>   drivers/lightnvm/pblk-core.c | 8 ++++++++
>   drivers/lightnvm/pblk-gc.c   | 5 ++---
>   drivers/lightnvm/pblk-read.c | 1 -
>   drivers/lightnvm/pblk.h      | 2 ++
>   4 files changed, 12 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/lightnvm/pblk-core.c b/drivers/lightnvm/pblk-core.c
> index 9d9a9e2..1b88e5e 100644
> --- a/drivers/lightnvm/pblk-core.c
> +++ b/drivers/lightnvm/pblk-core.c
> @@ -1799,6 +1799,14 @@ static void __pblk_line_put(struct pblk *pblk, struct pblk_line *line)
>   
>   	spin_lock(&line->lock);
>   	WARN_ON(line->state != PBLK_LINESTATE_GC);
> +	if (line->w_err_gc->has_gc_err) {
> +		spin_unlock(&line->lock);
> +		pblk_err(pblk, "line %d had errors during GC\n", line->id);
> +		pblk_put_line_back(pblk, line);
> +		line->w_err_gc->has_gc_err = 0;
> +		return;
> +	}
> +
>   	line->state = PBLK_LINESTATE_FREE;
>   	trace_pblk_line_state(pblk_disk_name(pblk), line->id,
>   					line->state);
> diff --git a/drivers/lightnvm/pblk-gc.c b/drivers/lightnvm/pblk-gc.c
> index e23b192..63ee205 100644
> --- a/drivers/lightnvm/pblk-gc.c
> +++ b/drivers/lightnvm/pblk-gc.c
> @@ -59,7 +59,7 @@ static void pblk_gc_writer_kick(struct pblk_gc *gc)
>   	wake_up_process(gc->gc_writer_ts);
>   }
>   
> -static void pblk_put_line_back(struct pblk *pblk, struct pblk_line *line)
> +void pblk_put_line_back(struct pblk *pblk, struct pblk_line *line)
>   {
>   	struct pblk_line_mgmt *l_mg = &pblk->l_mg;
>   	struct list_head *move_list;
> @@ -98,8 +98,7 @@ static void pblk_gc_line_ws(struct work_struct *work)
>   	/* Read from GC victim block */
>   	ret = pblk_submit_read_gc(pblk, gc_rq);
>   	if (ret) {
> -		pblk_err(pblk, "failed GC read in line:%d (err:%d)\n",
> -								line->id, ret);
> +		line->w_err_gc->has_gc_err = 1;
>   		goto out;
>   	}
>   
> diff --git a/drivers/lightnvm/pblk-read.c b/drivers/lightnvm/pblk-read.c
> index 264d1d7..5481dfa 100644
> --- a/drivers/lightnvm/pblk-read.c
> +++ b/drivers/lightnvm/pblk-read.c
> @@ -494,7 +494,6 @@ int pblk_submit_read_gc(struct pblk *pblk, struct pblk_gc_rq *gc_rq)
>   
>   	if (pblk_submit_io_sync(pblk, &rqd)) {
>   		ret = -EIO;
> -		pblk_err(pblk, "GC read request failed\n");
>   		goto err_free_bio;
>   	}
>   
> diff --git a/drivers/lightnvm/pblk.h b/drivers/lightnvm/pblk.h
> index 0999245..0bc4255 100644
> --- a/drivers/lightnvm/pblk.h
> +++ b/drivers/lightnvm/pblk.h
> @@ -425,6 +425,7 @@ struct pblk_smeta {
>   
>   struct pblk_w_err_gc {
>   	int has_write_err;
> +	int has_gc_err;
>   	__le64 *lba_list;
>   };
>   
> @@ -916,6 +917,7 @@ void pblk_gc_free_full_lines(struct pblk *pblk);
>   void pblk_gc_sysfs_state_show(struct pblk *pblk, int *gc_enabled,
>   			      int *gc_active);
>   int pblk_gc_sysfs_force(struct pblk *pblk, int force);
> +void pblk_put_line_back(struct pblk *pblk, struct pblk_line *line);
>   
>   /*
>    * pblk rate limiter
> 
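
For the record, the intended flow of the patch, condensed from the hunks
above into one sketch (context and locking omitted):

	/* GC read path: on error, mark the line instead of freeing it. */
	if (pblk_submit_read_gc(pblk, gc_rq))
		line->w_err_gc->has_gc_err = 1;

	/* Final line put: a marked line goes back to closed, not free. */
	if (line->w_err_gc->has_gc_err) {
		pblk_put_line_back(pblk, line);	/* data stays readable */
		line->w_err_gc->has_gc_err = 0;
	} else {
		line->state = PBLK_LINESTATE_FREE;
	}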


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 01/16] lightnvm: pblk: warn when there are opened chunks
  2019-03-22 14:48 ` [PATCH v2 01/16] lightnvm: pblk: warn when there are opened chunks Igor Konopko
@ 2019-03-25 11:32   ` Matias Bjørling
  0 siblings, 0 replies; 30+ messages in thread
From: Matias Bjørling @ 2019-03-25 11:32 UTC (permalink / raw)
  To: Igor Konopko, javier, hans.holmberg; +Cc: linux-block

On 3/22/19 3:48 PM, Igor Konopko wrote:
> In case of factory pblk init, we might have a situation where there are
> some opened chunks. Based on the OCSSD spec we are not allowed to
> transform chunks from the open state directly to the free state, so
> such a reset can lead to an IO error (even though most controllers will
> allow such an operation anyway), so we would like to at least warn
> users about such a situation.

Let's fix up the spec to allow resets on opened chunks and omit this patch.

> 
> Signed-off-by: Igor Konopko <igor.j.konopko@intel.com>
> Reviewed-by: Javier González <javier@javigon.com>
> ---
>   drivers/lightnvm/pblk-init.c | 22 +++++++++++++++-------
>   1 file changed, 15 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/lightnvm/pblk-init.c b/drivers/lightnvm/pblk-init.c
> index 1e227a0..e2e1193 100644
> --- a/drivers/lightnvm/pblk-init.c
> +++ b/drivers/lightnvm/pblk-init.c
> @@ -712,7 +712,7 @@ static int pblk_set_provision(struct pblk *pblk, int nr_free_chks)
>   }
>   
>   static int pblk_setup_line_meta_chk(struct pblk *pblk, struct pblk_line *line,
> -				   struct nvm_chk_meta *meta)
> +				   struct nvm_chk_meta *meta, int *opened)
>   {
>   	struct nvm_tgt_dev *dev = pblk->dev;
>   	struct nvm_geo *geo = &dev->geo;
> @@ -748,6 +748,9 @@ static int pblk_setup_line_meta_chk(struct pblk *pblk, struct pblk_line *line,
>   			continue;
>   		}
>   
> +		if (chunk->state & NVM_CHK_ST_OPEN)
> +			(*opened)++;
> +
>   		if (!(chunk->state & NVM_CHK_ST_OFFLINE))
>   			continue;
>   
> @@ -759,7 +762,7 @@ static int pblk_setup_line_meta_chk(struct pblk *pblk, struct pblk_line *line,
>   }
>   
>   static long pblk_setup_line_meta(struct pblk *pblk, struct pblk_line *line,
> -				 void *chunk_meta, int line_id)
> +				 void *chunk_meta, int line_id, int *opened)
>   {
>   	struct pblk_line_mgmt *l_mg = &pblk->l_mg;
>   	struct pblk_line_meta *lm = &pblk->lm;
> @@ -773,7 +776,7 @@ static long pblk_setup_line_meta(struct pblk *pblk, struct pblk_line *line,
>   	line->vsc = &l_mg->vsc_list[line_id];
>   	spin_lock_init(&line->lock);
>   
> -	nr_bad_chks = pblk_setup_line_meta_chk(pblk, line, chunk_meta);
> +	nr_bad_chks = pblk_setup_line_meta_chk(pblk, line, chunk_meta, opened);
>   
>   	chk_in_line = lm->blk_per_line - nr_bad_chks;
>   	if (nr_bad_chks < 0 || nr_bad_chks > lm->blk_per_line ||
> @@ -1019,12 +1022,12 @@ static int pblk_line_meta_init(struct pblk *pblk)
>   	return 0;
>   }
>   
> -static int pblk_lines_init(struct pblk *pblk)
> +static int pblk_lines_init(struct pblk *pblk, bool factory)
>   {
>   	struct pblk_line_mgmt *l_mg = &pblk->l_mg;
>   	struct pblk_line *line;
>   	void *chunk_meta;
> -	int nr_free_chks = 0;
> +	int nr_free_chks = 0, nr_opened_chks = 0;
>   	int i, ret;
>   
>   	ret = pblk_line_meta_init(pblk);
> @@ -1059,7 +1062,8 @@ static int pblk_lines_init(struct pblk *pblk)
>   		if (ret)
>   			goto fail_free_lines;
>   
> -		nr_free_chks += pblk_setup_line_meta(pblk, line, chunk_meta, i);
> +		nr_free_chks += pblk_setup_line_meta(pblk, line, chunk_meta, i,
> +							&nr_opened_chks);
>   
>   		trace_pblk_line_state(pblk_disk_name(pblk), line->id,
>   								line->state);
> @@ -1071,6 +1075,10 @@ static int pblk_lines_init(struct pblk *pblk)
>   		goto fail_free_lines;
>   	}
>   
> +	if (factory && nr_opened_chks)
> +		pblk_warn(pblk, "%d opened chunks during factory creation\n",
> +				nr_opened_chks);
> +
>   	ret = pblk_set_provision(pblk, nr_free_chks);
>   	if (ret)
>   		goto fail_free_lines;
> @@ -1235,7 +1243,7 @@ static void *pblk_init(struct nvm_tgt_dev *dev, struct gendisk *tdisk,
>   		goto fail;
>   	}
>   
> -	ret = pblk_lines_init(pblk);
> +	ret = pblk_lines_init(pblk, flags & NVM_TARGET_FACTORY);
>   	if (ret) {
>   		pblk_err(pblk, "could not initialize lines\n");
>   		goto fail_free_core;
> 


^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2019-03-25 11:32 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-22 14:48 [PATCH v2 00/16] lightnvm: next set of improvements for 5.2 Igor Konopko
2019-03-22 14:48 ` [PATCH v2 01/16] lightnvm: pblk: warn when there are opened chunks Igor Konopko
2019-03-25 11:32   ` Matias Bjørling
2019-03-22 14:48 ` [PATCH v2 02/16] lightnvm: pblk: IO path reorganization Igor Konopko
2019-03-25  5:55   ` Javier González
2019-03-22 14:48 ` [PATCH v2 03/16] lightnvm: pblk: simplify partial read path Igor Konopko
2019-03-22 14:48 ` [PATCH v2 04/16] lightnvm: pblk: OOB recovery for closed chunks fix Igor Konopko
2019-03-25  6:02   ` Javier González
2019-03-25 11:12     ` Igor Konopko
2019-03-22 14:48 ` [PATCH v2 05/16] lightnvm: pblk: propagate errors when reading meta Igor Konopko
2019-03-22 14:48 ` [PATCH v2 06/16] lightnvm: pblk: recover only written metadata Igor Konopko
2019-03-22 14:48 ` [PATCH v2 07/16] lightnvm: pblk: wait for inflight IOs in recovery Igor Konopko
2019-03-25  6:18   ` Javier González
2019-03-25 11:17     ` Igor Konopko
2019-03-22 14:48 ` [PATCH v2 08/16] lightnvm: pblk: remove internal IO timeout Igor Konopko
2019-03-22 14:48 ` [PATCH v2 09/16] lightnvm: pblk: fix spin_unlock order Igor Konopko
2019-03-25 11:09   ` Matias Bjørling
2019-03-22 14:48 ` [PATCH v2 10/16] lightnvm: pblk: kick writer on write recovery path Igor Konopko
2019-03-25 11:13   ` Matias Bjørling
2019-03-22 14:48 ` [PATCH v2 11/16] lightnvm: pblk: fix update line wp in OOB recovery Igor Konopko
2019-03-25 11:18   ` Matias Bjørling
2019-03-22 14:48 ` [PATCH v2 12/16] lightnvm: pblk: do not read OOB from emeta region Igor Konopko
2019-03-25  6:23   ` Javier González
2019-03-25 11:17     ` Igor Konopko
2019-03-22 14:48 ` [PATCH v2 13/16] lightnvm: pblk: store multiple copies of smeta Igor Konopko
2019-03-25  6:31   ` Javier González
2019-03-22 14:48 ` [PATCH v2 14/16] lightnvm: pblk: GC error handling Igor Konopko
2019-03-25 11:29   ` Matias Bjørling
2019-03-22 14:48 ` [PATCH v2 15/16] lightnvm: pblk: use nvm_rq_to_ppa_list() Igor Konopko
2019-03-22 14:48 ` [PATCH v2 16/16] lightnvm: track inflight target creations Igor Konopko

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.