linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/5] lightnvm: pblk extra patches for 4.12
@ 2017-04-21 23:32 Javier González
  2017-04-21 23:32 ` [PATCH 1/5] lightnvm: pblk: fix race condition on line retry Javier González
                   ` (5 more replies)
  0 siblings, 6 replies; 16+ messages in thread
From: Javier González @ 2017-04-21 23:32 UTC (permalink / raw)
  To: mb; +Cc: linux-block, linux-kernel, Javier González

Hi Matias,

This is a couple of bug fixes for the pblk release patch.

Thanks,
Javier

Javier González (5):
  lightnvm: pblk: fix race condition on line retry
  lightnvm: pblk: fix bad error check
  lightnvm: pblk: fix memory leak on error path
  lightnvm: pblk: free metadata on line alloc failure
  lightnvm: pblk: fix erase counters on error fail

 drivers/lightnvm/pblk-core.c  | 60 ++++++++++++++++++++++++++-----------------
 drivers/lightnvm/pblk-gc.c    |  2 +-
 drivers/lightnvm/pblk-init.c  |  9 ++++---
 drivers/lightnvm/pblk-map.c   |  4 +--
 drivers/lightnvm/pblk-rl.c    |  6 +++--
 drivers/lightnvm/pblk-write.c |  7 +++--
 drivers/lightnvm/pblk.h       |  6 ++---
 7 files changed, 57 insertions(+), 37 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/5] lightnvm: pblk: fix race condition on line retry
  2017-04-21 23:32 [PATCH 0/5] lightnvm: pblk extra patches for 4.12 Javier González
@ 2017-04-21 23:32 ` Javier González
  2017-04-22  8:51   ` Matias Bjørling
  2017-04-21 23:32 ` [PATCH 2/5] lightnvm: pblk: fix bad error check Javier González
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 16+ messages in thread
From: Javier González @ 2017-04-21 23:32 UTC (permalink / raw)
  To: mb; +Cc: linux-block, linux-kernel, Javier González

When a pblk line fails (or is recovered), make sure to take the line
management lock.

Fixes: a4bd217b4326 "lightnvm: physical block device (pblk) target"

Signed-off-by: Javier González <javier@cnexlabs.com>
---
 drivers/lightnvm/pblk-core.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/lightnvm/pblk-core.c b/drivers/lightnvm/pblk-core.c
index a2bcd09..7065658 100644
--- a/drivers/lightnvm/pblk-core.c
+++ b/drivers/lightnvm/pblk-core.c
@@ -1050,13 +1050,14 @@ int pblk_line_recov_alloc(struct pblk *pblk, struct pblk_line *line)
 	spin_lock(&l_mg->free_lock);
 	l_mg->data_line = line;
 	list_del(&line->list);
-	spin_unlock(&l_mg->free_lock);
 
 	ret = pblk_line_prepare(pblk, line);
 	if (ret) {
 		list_add(&line->list, &l_mg->free_list);
+		spin_unlock(&l_mg->free_lock);
 		return ret;
 	}
+	spin_unlock(&l_mg->free_lock);
 
 	pblk_rl_free_lines_dec(&pblk->rl, line);
 
@@ -1140,6 +1141,7 @@ static struct pblk_line *pblk_line_retry(struct pblk *pblk,
 	line->invalid_bitmap = NULL;
 	line->smeta = NULL;
 	line->emeta = NULL;
+	l_mg->data_line = retry_line;
 	spin_unlock(&l_mg->free_lock);
 
 	if (pblk_line_erase(pblk, retry_line))
@@ -1147,8 +1149,6 @@ static struct pblk_line *pblk_line_retry(struct pblk *pblk,
 
 	pblk_rl_free_lines_dec(&pblk->rl, retry_line);
 
-	l_mg->data_line = retry_line;
-
 	return retry_line;
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 2/5] lightnvm: pblk: fix bad error check
  2017-04-21 23:32 [PATCH 0/5] lightnvm: pblk extra patches for 4.12 Javier González
  2017-04-21 23:32 ` [PATCH 1/5] lightnvm: pblk: fix race condition on line retry Javier González
@ 2017-04-21 23:32 ` Javier González
  2017-04-22  8:52   ` Matias Bjørling
  2017-04-21 23:32 ` [PATCH 3/5] lightnvm: pblk: fix memory leak on error path Javier González
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 16+ messages in thread
From: Javier González @ 2017-04-21 23:32 UTC (permalink / raw)
  To: mb; +Cc: linux-block, linux-kernel, Javier González

Fix bad error check

Fixes: a4bd217b4326 "lightnvm: physical block device (pblk) target"

Signed-off-by: Javier González <javier@cnexlabs.com>
---
 drivers/lightnvm/pblk-core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/lightnvm/pblk-core.c b/drivers/lightnvm/pblk-core.c
index 7065658..7eb62ec 100644
--- a/drivers/lightnvm/pblk-core.c
+++ b/drivers/lightnvm/pblk-core.c
@@ -1273,7 +1273,7 @@ struct pblk_line *pblk_line_replace_data(struct pblk *pblk)
 retry_setup:
 	if (!pblk_line_set_metadata(pblk, new, cur)) {
 		new = pblk_line_retry(pblk, new);
-		if (new)
+		if (!new)
 			return NULL;
 
 		goto retry_setup;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 3/5] lightnvm: pblk: fix memory leak on error path
  2017-04-21 23:32 [PATCH 0/5] lightnvm: pblk extra patches for 4.12 Javier González
  2017-04-21 23:32 ` [PATCH 1/5] lightnvm: pblk: fix race condition on line retry Javier González
  2017-04-21 23:32 ` [PATCH 2/5] lightnvm: pblk: fix bad error check Javier González
@ 2017-04-21 23:32 ` Javier González
  2017-04-22  8:57   ` Matias Bjørling
  2017-04-21 23:32 ` [PATCH 4/5] lightnvm: pblk: free metadata on line alloc failure Javier González
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 16+ messages in thread
From: Javier González @ 2017-04-21 23:32 UTC (permalink / raw)
  To: mb; +Cc: linux-block, linux-kernel, Javier González

When write recovery fails, Free memory for the recovery structure.

Fixes: a4bd217b4326 "lightnvm: physical block device (pblk) target"

Signed-off-by: Javier González <javier@cnexlabs.com>
---
 drivers/lightnvm/pblk-write.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/lightnvm/pblk-write.c b/drivers/lightnvm/pblk-write.c
index 74f7413..a896190 100644
--- a/drivers/lightnvm/pblk-write.c
+++ b/drivers/lightnvm/pblk-write.c
@@ -142,6 +142,7 @@ static void pblk_end_w_fail(struct pblk *pblk, struct nvm_rq *rqd)
 		/* Logic error */
 		if (bit > c_ctx->nr_valid) {
 			WARN_ONCE(1, "pblk: corrupted write request\n");
+			mempool_free(recovery, pblk->rec_pool);
 			goto out;
 		}
 
@@ -149,6 +150,7 @@ static void pblk_end_w_fail(struct pblk *pblk, struct nvm_rq *rqd)
 		entry = pblk_rb_sync_scan_entry(&pblk->rwb, &ppa);
 		if (!entry) {
 			pr_err("pblk: could not scan entry on write failure\n");
+			mempool_free(recovery, pblk->rec_pool);
 			goto out;
 		}
 
@@ -162,6 +164,7 @@ static void pblk_end_w_fail(struct pblk *pblk, struct nvm_rq *rqd)
 	ret = pblk_recov_setup_rq(pblk, c_ctx, recovery, comp_bits, c_entries);
 	if (ret) {
 		pr_err("pblk: could not recover from write failure\n");
+		mempool_free(recovery, pblk->rec_pool);
 		goto out;
 	}
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 4/5] lightnvm: pblk: free metadata on line alloc failure
  2017-04-21 23:32 [PATCH 0/5] lightnvm: pblk extra patches for 4.12 Javier González
                   ` (2 preceding siblings ...)
  2017-04-21 23:32 ` [PATCH 3/5] lightnvm: pblk: fix memory leak on error path Javier González
@ 2017-04-21 23:32 ` Javier González
  2017-04-22  9:02   ` Matias Bjørling
  2017-04-21 23:32 ` [PATCH 5/5] lightnvm: pblk: fix erase counters on error fail Javier González
  2017-04-23 18:00 ` [PATCH 0/5] lightnvm: pblk extra patches for 4.12 Matias Bjørling
  5 siblings, 1 reply; 16+ messages in thread
From: Javier González @ 2017-04-21 23:32 UTC (permalink / raw)
  To: mb; +Cc: linux-block, linux-kernel, Javier González

When a line allocation fails, for example, due to having too many bad
blocks, free its metadata correctly.

Fixes: a4bd217b4326 "lightnvm: physical block device (pblk) target"

Signed-off-by: Javier González <javier@cnexlabs.com>
---
 drivers/lightnvm/pblk-core.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/lightnvm/pblk-core.c b/drivers/lightnvm/pblk-core.c
index 7eb62ec..ac3742b 100644
--- a/drivers/lightnvm/pblk-core.c
+++ b/drivers/lightnvm/pblk-core.c
@@ -1127,6 +1127,7 @@ static struct pblk_line *pblk_line_retry(struct pblk *pblk,
 	spin_lock(&l_mg->free_lock);
 	retry_line = pblk_line_get(pblk);
 	if (!retry_line) {
+		l_mg->data_line = NULL;
 		spin_unlock(&l_mg->free_lock);
 		return NULL;
 	}
@@ -1134,18 +1135,17 @@ static struct pblk_line *pblk_line_retry(struct pblk *pblk,
 	retry_line->smeta = line->smeta;
 	retry_line->emeta = line->emeta;
 	retry_line->meta_line = line->meta_line;
-	retry_line->map_bitmap = line->map_bitmap;
-	retry_line->invalid_bitmap = line->invalid_bitmap;
 
-	line->map_bitmap = NULL;
-	line->invalid_bitmap = NULL;
-	line->smeta = NULL;
-	line->emeta = NULL;
+	pblk_line_free(pblk, line);
 	l_mg->data_line = retry_line;
 	spin_unlock(&l_mg->free_lock);
 
-	if (pblk_line_erase(pblk, retry_line))
+	if (pblk_line_erase(pblk, retry_line)) {
+		spin_lock(&l_mg->free_lock);
+		l_mg->data_line = NULL;
+		spin_unlock(&l_mg->free_lock);
 		return NULL;
+	}
 
 	pblk_rl_free_lines_dec(&pblk->rl, retry_line);
 
@@ -1299,6 +1299,8 @@ void pblk_line_free(struct pblk *pblk, struct pblk_line *line)
 
 	line->map_bitmap = NULL;
 	line->invalid_bitmap = NULL;
+	line->smeta = NULL;
+	line->emeta = NULL;
 }
 
 void pblk_line_put(struct kref *ref)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 5/5] lightnvm: pblk: fix erase counters on error fail
  2017-04-21 23:32 [PATCH 0/5] lightnvm: pblk extra patches for 4.12 Javier González
                   ` (3 preceding siblings ...)
  2017-04-21 23:32 ` [PATCH 4/5] lightnvm: pblk: free metadata on line alloc failure Javier González
@ 2017-04-21 23:32 ` Javier González
  2017-04-22  9:22   ` Matias Bjørling
  2017-04-23 18:00 ` [PATCH 0/5] lightnvm: pblk extra patches for 4.12 Matias Bjørling
  5 siblings, 1 reply; 16+ messages in thread
From: Javier González @ 2017-04-21 23:32 UTC (permalink / raw)
  To: mb; +Cc: linux-block, linux-kernel, Javier González

When block erases fail, these blocks are marked bad. The number of valid
blocks in the line was not updated, which could cause an infinite loop
on the erase path.

Fix this atomic counter and, in order to avoid taking an irq lock on the
interrupt context, make the erase counters atomic too.

Also, in the case that a significant number of blocks become bad in a
line, the result is the double shared metadata buffer (emeta) to stop
the pipeline until all metadata is flushed to the media. Increase the
number of metadata lines from 2 to 4 to avoid this case.

Fixes: a4bd217b4326 "lightnvm: physical block device (pblk) target"

Signed-off-by: Javier González <javier@cnexlabs.com>
---
 drivers/lightnvm/pblk-core.c  | 28 +++++++++++++++++++---------
 drivers/lightnvm/pblk-gc.c    |  2 +-
 drivers/lightnvm/pblk-init.c  |  9 ++++++---
 drivers/lightnvm/pblk-map.c   |  4 ++--
 drivers/lightnvm/pblk-rl.c    |  6 ++++--
 drivers/lightnvm/pblk-write.c |  4 ++--
 drivers/lightnvm/pblk.h       |  6 +++---
 7 files changed, 37 insertions(+), 22 deletions(-)

diff --git a/drivers/lightnvm/pblk-core.c b/drivers/lightnvm/pblk-core.c
index ac3742b..5e44768 100644
--- a/drivers/lightnvm/pblk-core.c
+++ b/drivers/lightnvm/pblk-core.c
@@ -29,6 +29,7 @@ static void pblk_mark_bb(struct pblk *pblk, struct pblk_line *line,
 	pr_debug("pblk: erase failed: line:%d, pos:%d\n", line->id, pos);
 	atomic_long_inc(&pblk->erase_failed);
 
+	atomic_dec(&line->blk_in_line);
 	if (test_and_set_bit(pos, line->blk_bitmap))
 		pr_err("pblk: attempted to erase bb: line:%d, pos:%d\n",
 							line->id, pos);
@@ -832,21 +833,28 @@ int pblk_line_erase(struct pblk *pblk, struct pblk_line *line)
 	struct ppa_addr ppa;
 	int bit = -1;
 
-	/* Erase one block at the time and only erase good blocks */
-	while ((bit = find_next_zero_bit(line->erase_bitmap, lm->blk_per_line,
-						bit + 1)) < lm->blk_per_line) {
+	/* Erase only good blocks, one at a time */
+	do {
+		spin_lock(&line->lock);
+		bit = find_next_zero_bit(line->erase_bitmap, lm->blk_per_line,
+								bit + 1);
+		if (bit >= lm->blk_per_line) {
+			spin_unlock(&line->lock);
+			break;
+		}
+
 		ppa = pblk->luns[bit].bppa; /* set ch and lun */
 		ppa.g.blk = line->id;
 
-		/* If the erase fails, the block is bad and should be marked */
-		line->left_eblks--;
+		atomic_dec(&line->left_eblks);
 		WARN_ON(test_and_set_bit(bit, line->erase_bitmap));
+		spin_unlock(&line->lock);
 
 		if (pblk_blk_erase_sync(pblk, ppa)) {
 			pr_err("pblk: failed to erase line %d\n", line->id);
 			return -ENOMEM;
 		}
-	}
+	} while (1);
 
 	return 0;
 }
@@ -1007,6 +1015,7 @@ static int pblk_line_init_bb(struct pblk *pblk, struct pblk_line *line,
 static int pblk_line_prepare(struct pblk *pblk, struct pblk_line *line)
 {
 	struct pblk_line_meta *lm = &pblk->lm;
+	int blk_in_line = atomic_read(&line->blk_in_line);
 
 	line->map_bitmap = mempool_alloc(pblk->line_meta_pool, GFP_ATOMIC);
 	if (!line->map_bitmap)
@@ -1030,12 +1039,13 @@ static int pblk_line_prepare(struct pblk *pblk, struct pblk_line *line)
 		return -EINTR;
 	}
 	line->state = PBLK_LINESTATE_OPEN;
+
+	atomic_set(&line->left_eblks, blk_in_line);
+	atomic_set(&line->left_seblks, blk_in_line);
 	spin_unlock(&line->lock);
 
 	/* Bad blocks do not need to be erased */
 	bitmap_copy(line->erase_bitmap, line->blk_bitmap, lm->blk_per_line);
-	line->left_eblks = line->blk_in_line;
-	atomic_set(&line->left_seblks, line->left_eblks);
 
 	kref_init(&line->ref);
 
@@ -1231,7 +1241,7 @@ struct pblk_line *pblk_line_replace_data(struct pblk *pblk)
 	left_seblks = atomic_read(&new->left_seblks);
 	if (left_seblks) {
 		/* If line is not fully erased, erase it */
-		if (new->left_eblks) {
+		if (atomic_read(&new->left_eblks)) {
 			if (pblk_line_erase(pblk, new))
 				return NULL;
 		} else {
diff --git a/drivers/lightnvm/pblk-gc.c b/drivers/lightnvm/pblk-gc.c
index f173fd4..eaf479c 100644
--- a/drivers/lightnvm/pblk-gc.c
+++ b/drivers/lightnvm/pblk-gc.c
@@ -332,7 +332,7 @@ static void pblk_gc_run(struct pblk *pblk)
 		}
 
 		line = list_first_entry(group_list, struct pblk_line, list);
-		nr_blocks_free += line->blk_in_line;
+		nr_blocks_free += atomic_read(&line->blk_in_line);
 
 		spin_lock(&line->lock);
 		WARN_ON(line->state != PBLK_LINESTATE_CLOSED);
diff --git a/drivers/lightnvm/pblk-init.c b/drivers/lightnvm/pblk-init.c
index 3996e4b..15b2787 100644
--- a/drivers/lightnvm/pblk-init.c
+++ b/drivers/lightnvm/pblk-init.c
@@ -678,6 +678,8 @@ static int pblk_lines_init(struct pblk *pblk)
 
 	nr_free_blks = 0;
 	for (i = 0; i < l_mg->nr_lines; i++) {
+		int blk_in_line;
+
 		line = &pblk->lines[i];
 
 		line->pblk = pblk;
@@ -693,14 +695,15 @@ static int pblk_lines_init(struct pblk *pblk)
 			goto fail_free_lines;
 		}
 
-		line->blk_in_line = lm->blk_per_line - nr_bad_blks;
-		if (line->blk_in_line < lm->min_blk_line) {
+		blk_in_line = lm->blk_per_line - nr_bad_blks;
+		if (blk_in_line < lm->min_blk_line) {
 			line->state = PBLK_LINESTATE_BAD;
 			list_add_tail(&line->list, &l_mg->bad_list);
 			continue;
 		}
 
-		nr_free_blks += line->blk_in_line;
+		nr_free_blks += blk_in_line;
+		atomic_set(&line->blk_in_line, blk_in_line);
 
 		l_mg->nr_free_lines++;
 		list_add_tail(&line->list, &l_mg->free_list);
diff --git a/drivers/lightnvm/pblk-map.c b/drivers/lightnvm/pblk-map.c
index 3f8bab4..17c1695 100644
--- a/drivers/lightnvm/pblk-map.c
+++ b/drivers/lightnvm/pblk-map.c
@@ -110,7 +110,7 @@ void pblk_map_erase_rq(struct pblk *pblk, struct nvm_rq *rqd,
 				continue;
 
 			set_bit(erase_lun, e_line->erase_bitmap);
-			e_line->left_eblks--;
+			atomic_dec(&e_line->left_eblks);
 			*erase_ppa = rqd->ppa_list[i];
 			erase_ppa->g.blk = e_line->id;
 
@@ -129,7 +129,7 @@ void pblk_map_erase_rq(struct pblk *pblk, struct nvm_rq *rqd,
 			return;
 
 		set_bit(i, e_line->erase_bitmap);
-		e_line->left_eblks--;
+		atomic_dec(&e_line->left_eblks);
 		*erase_ppa = pblk->luns[i].bppa; /* set ch and lun */
 		erase_ppa->g.blk = e_line->id;
 	}
diff --git a/drivers/lightnvm/pblk-rl.c b/drivers/lightnvm/pblk-rl.c
index 4042162..ab7cbb1 100644
--- a/drivers/lightnvm/pblk-rl.c
+++ b/drivers/lightnvm/pblk-rl.c
@@ -107,9 +107,10 @@ void pblk_rl_set_gc_rsc(struct pblk_rl *rl, int rsv)
 void pblk_rl_free_lines_inc(struct pblk_rl *rl, struct pblk_line *line)
 {
 	struct pblk *pblk = container_of(rl, struct pblk, rl);
+	int blk_in_line = atomic_read(&line->blk_in_line);
 	int ret;
 
-	atomic_add(line->blk_in_line, &rl->free_blocks);
+	atomic_add(blk_in_line, &rl->free_blocks);
 	/* Rates will not change that often - no need to lock update */
 	ret = pblk_rl_update_rates(rl, rl->rb_budget);
 
@@ -122,9 +123,10 @@ void pblk_rl_free_lines_inc(struct pblk_rl *rl, struct pblk_line *line)
 void pblk_rl_free_lines_dec(struct pblk_rl *rl, struct pblk_line *line)
 {
 	struct pblk *pblk = container_of(rl, struct pblk, rl);
+	int blk_in_line = atomic_read(&line->blk_in_line);
 	int ret;
 
-	atomic_sub(line->blk_in_line, &rl->free_blocks);
+	atomic_sub(blk_in_line, &rl->free_blocks);
 
 	/* Rates will not change that often - no need to lock update */
 	ret = pblk_rl_update_rates(rl, rl->rb_budget);
diff --git a/drivers/lightnvm/pblk-write.c b/drivers/lightnvm/pblk-write.c
index a896190..aef6fd7 100644
--- a/drivers/lightnvm/pblk-write.c
+++ b/drivers/lightnvm/pblk-write.c
@@ -244,7 +244,7 @@ static int pblk_setup_w_rq(struct pblk *pblk, struct nvm_rq *rqd,
 	}
 
 	ppa_set_empty(&erase_ppa);
-	if (likely(!e_line || !e_line->left_eblks))
+	if (likely(!e_line || !atomic_read(&e_line->left_eblks)))
 		pblk_map_rq(pblk, rqd, c_ctx->sentry, lun_bitmap, valid, 0);
 	else
 		pblk_map_erase_rq(pblk, rqd, c_ctx->sentry, lun_bitmap,
@@ -257,7 +257,7 @@ static int pblk_setup_w_rq(struct pblk *pblk, struct nvm_rq *rqd,
 			struct nvm_geo *geo = &dev->geo;
 			int bit;
 
-			e_line->left_eblks++;
+			atomic_inc(&e_line->left_eblks);
 			bit = erase_ppa.g.lun * geo->nr_chnls + erase_ppa.g.ch;
 			WARN_ON(!test_and_clear_bit(bit, e_line->erase_bitmap));
 			up(&pblk->erase_sem);
diff --git a/drivers/lightnvm/pblk.h b/drivers/lightnvm/pblk.h
index c82120c..02dd18f 100644
--- a/drivers/lightnvm/pblk.h
+++ b/drivers/lightnvm/pblk.h
@@ -363,14 +363,14 @@ struct pblk_line {
 
 	unsigned int sec_in_line;	/* Number of usable secs in line */
 
-	unsigned int blk_in_line;	/* Number of good blocks in line */
+	atomic_t blk_in_line;		/* Number of good blocks in line */
 	unsigned long *blk_bitmap;	/* Bitmap for valid/invalid blocks */
 	unsigned long *erase_bitmap;	/* Bitmap for erased blocks */
 
 	unsigned long *map_bitmap;	/* Bitmap for mapped sectors in line */
 	unsigned long *invalid_bitmap;	/* Bitmap for invalid sectors in line */
 
-	int left_eblks;			/* Blocks left for erasing */
+	atomic_t left_eblks;		/* Blocks left for erasing */
 	atomic_t left_seblks;		/* Blocks left for sync erasing */
 
 	int left_msecs;			/* Sectors left for mapping */
@@ -383,7 +383,7 @@ struct pblk_line {
 	spinlock_t lock;		/* Necessary for invalid_bitmap only */
 };
 
-#define PBLK_DATA_LINES 2
+#define PBLK_DATA_LINES 4
 
 enum{
 	PBLK_KMALLOC_META = 1,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/5] lightnvm: pblk: fix race condition on line retry
  2017-04-21 23:32 ` [PATCH 1/5] lightnvm: pblk: fix race condition on line retry Javier González
@ 2017-04-22  8:51   ` Matias Bjørling
  0 siblings, 0 replies; 16+ messages in thread
From: Matias Bjørling @ 2017-04-22  8:51 UTC (permalink / raw)
  To: Javier González; +Cc: linux-block, linux-kernel, Javier González

On 04/22/2017 01:32 AM, Javier González wrote:
> When a pblk line fails (or is recovered), make sure to take the line
> management lock.
>
> Fixes: a4bd217b4326 "lightnvm: physical block device (pblk) target"
>
> Signed-off-by: Javier González <javier@cnexlabs.com>
> ---
>  drivers/lightnvm/pblk-core.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/lightnvm/pblk-core.c b/drivers/lightnvm/pblk-core.c
> index a2bcd09..7065658 100644
> --- a/drivers/lightnvm/pblk-core.c
> +++ b/drivers/lightnvm/pblk-core.c
> @@ -1050,13 +1050,14 @@ int pblk_line_recov_alloc(struct pblk *pblk, struct pblk_line *line)
>  	spin_lock(&l_mg->free_lock);
>  	l_mg->data_line = line;
>  	list_del(&line->list);
> -	spin_unlock(&l_mg->free_lock);
>
>  	ret = pblk_line_prepare(pblk, line);
>  	if (ret) {
>  		list_add(&line->list, &l_mg->free_list);
> +		spin_unlock(&l_mg->free_lock);
>  		return ret;
>  	}
> +	spin_unlock(&l_mg->free_lock);
>
>  	pblk_rl_free_lines_dec(&pblk->rl, line);
>
> @@ -1140,6 +1141,7 @@ static struct pblk_line *pblk_line_retry(struct pblk *pblk,
>  	line->invalid_bitmap = NULL;
>  	line->smeta = NULL;
>  	line->emeta = NULL;
> +	l_mg->data_line = retry_line;
>  	spin_unlock(&l_mg->free_lock);
>
>  	if (pblk_line_erase(pblk, retry_line))
> @@ -1147,8 +1149,6 @@ static struct pblk_line *pblk_line_retry(struct pblk *pblk,
>
>  	pblk_rl_free_lines_dec(&pblk->rl, retry_line);
>
> -	l_mg->data_line = retry_line;
> -
>  	return retry_line;
>  }
>
>

Reviewed-by: Matias Bjørling <matias@cnexlabs.com>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/5] lightnvm: pblk: fix bad error check
  2017-04-21 23:32 ` [PATCH 2/5] lightnvm: pblk: fix bad error check Javier González
@ 2017-04-22  8:52   ` Matias Bjørling
  0 siblings, 0 replies; 16+ messages in thread
From: Matias Bjørling @ 2017-04-22  8:52 UTC (permalink / raw)
  To: Javier González; +Cc: linux-block, linux-kernel, Javier González

On 04/22/2017 01:32 AM, Javier González wrote:
> Fix bad error check
>
> Fixes: a4bd217b4326 "lightnvm: physical block device (pblk) target"
>
> Signed-off-by: Javier González <javier@cnexlabs.com>
> ---
>  drivers/lightnvm/pblk-core.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/lightnvm/pblk-core.c b/drivers/lightnvm/pblk-core.c
> index 7065658..7eb62ec 100644
> --- a/drivers/lightnvm/pblk-core.c
> +++ b/drivers/lightnvm/pblk-core.c
> @@ -1273,7 +1273,7 @@ struct pblk_line *pblk_line_replace_data(struct pblk *pblk)
>  retry_setup:
>  	if (!pblk_line_set_metadata(pblk, new, cur)) {
>  		new = pblk_line_retry(pblk, new);
> -		if (new)
> +		if (!new)
>  			return NULL;
>
>  		goto retry_setup;
>

Reviewed-by: Matias Bjørling <matias@cnexlabs.com>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/5] lightnvm: pblk: fix memory leak on error path
  2017-04-21 23:32 ` [PATCH 3/5] lightnvm: pblk: fix memory leak on error path Javier González
@ 2017-04-22  8:57   ` Matias Bjørling
  0 siblings, 0 replies; 16+ messages in thread
From: Matias Bjørling @ 2017-04-22  8:57 UTC (permalink / raw)
  To: Javier González; +Cc: linux-block, linux-kernel, Javier González



On 04/22/2017 01:32 AM, Javier González wrote:
> When write recovery fails, Free memory for the recovery structure.
>
> Fixes: a4bd217b4326 "lightnvm: physical block device (pblk) target"
>
> Signed-off-by: Javier González <javier@cnexlabs.com>
> ---
>  drivers/lightnvm/pblk-write.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/lightnvm/pblk-write.c b/drivers/lightnvm/pblk-write.c
> index 74f7413..a896190 100644
> --- a/drivers/lightnvm/pblk-write.c
> +++ b/drivers/lightnvm/pblk-write.c
> @@ -142,6 +142,7 @@ static void pblk_end_w_fail(struct pblk *pblk, struct nvm_rq *rqd)
>  		/* Logic error */
>  		if (bit > c_ctx->nr_valid) {
>  			WARN_ONCE(1, "pblk: corrupted write request\n");
> +			mempool_free(recovery, pblk->rec_pool);
>  			goto out;
>  		}
>
> @@ -149,6 +150,7 @@ static void pblk_end_w_fail(struct pblk *pblk, struct nvm_rq *rqd)
>  		entry = pblk_rb_sync_scan_entry(&pblk->rwb, &ppa);
>  		if (!entry) {
>  			pr_err("pblk: could not scan entry on write failure\n");
> +			mempool_free(recovery, pblk->rec_pool);
>  			goto out;
>  		}
>
> @@ -162,6 +164,7 @@ static void pblk_end_w_fail(struct pblk *pblk, struct nvm_rq *rqd)
>  	ret = pblk_recov_setup_rq(pblk, c_ctx, recovery, comp_bits, c_entries);
>  	if (ret) {
>  		pr_err("pblk: could not recover from write failure\n");
> +		mempool_free(recovery, pblk->rec_pool);
>  		goto out;
>  	}
>
>
Reviewed-by: Matias Bjørling <matias@cnexlabs.com>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 4/5] lightnvm: pblk: free metadata on line alloc failure
  2017-04-21 23:32 ` [PATCH 4/5] lightnvm: pblk: free metadata on line alloc failure Javier González
@ 2017-04-22  9:02   ` Matias Bjørling
  0 siblings, 0 replies; 16+ messages in thread
From: Matias Bjørling @ 2017-04-22  9:02 UTC (permalink / raw)
  To: Javier González; +Cc: linux-block, linux-kernel, Javier González



On 04/22/2017 01:32 AM, Javier González wrote:
> When a line allocation fails, for example, due to having too many bad
> blocks, free its metadata correctly.
>
> Fixes: a4bd217b4326 "lightnvm: physical block device (pblk) target"
>
> Signed-off-by: Javier González <javier@cnexlabs.com>
> ---
>  drivers/lightnvm/pblk-core.c | 16 +++++++++-------
>  1 file changed, 9 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/lightnvm/pblk-core.c b/drivers/lightnvm/pblk-core.c
> index 7eb62ec..ac3742b 100644
> --- a/drivers/lightnvm/pblk-core.c
> +++ b/drivers/lightnvm/pblk-core.c
> @@ -1127,6 +1127,7 @@ static struct pblk_line *pblk_line_retry(struct pblk *pblk,
>  	spin_lock(&l_mg->free_lock);
>  	retry_line = pblk_line_get(pblk);
>  	if (!retry_line) {
> +		l_mg->data_line = NULL;
>  		spin_unlock(&l_mg->free_lock);
>  		return NULL;
>  	}
> @@ -1134,18 +1135,17 @@ static struct pblk_line *pblk_line_retry(struct pblk *pblk,
>  	retry_line->smeta = line->smeta;
>  	retry_line->emeta = line->emeta;
>  	retry_line->meta_line = line->meta_line;
> -	retry_line->map_bitmap = line->map_bitmap;
> -	retry_line->invalid_bitmap = line->invalid_bitmap;
>
> -	line->map_bitmap = NULL;
> -	line->invalid_bitmap = NULL;
> -	line->smeta = NULL;
> -	line->emeta = NULL;
> +	pblk_line_free(pblk, line);
>  	l_mg->data_line = retry_line;
>  	spin_unlock(&l_mg->free_lock);
>
> -	if (pblk_line_erase(pblk, retry_line))
> +	if (pblk_line_erase(pblk, retry_line)) {
> +		spin_lock(&l_mg->free_lock);
> +		l_mg->data_line = NULL;
> +		spin_unlock(&l_mg->free_lock);
>  		return NULL;
> +	}
>
>  	pblk_rl_free_lines_dec(&pblk->rl, retry_line);
>
> @@ -1299,6 +1299,8 @@ void pblk_line_free(struct pblk *pblk, struct pblk_line *line)
>
>  	line->map_bitmap = NULL;
>  	line->invalid_bitmap = NULL;
> +	line->smeta = NULL;
> +	line->emeta = NULL;
>  }
>
>  void pblk_line_put(struct kref *ref)
>

Reviewed-by: Matias Bjørling <matias@cnexlabs.com>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 5/5] lightnvm: pblk: fix erase counters on error fail
  2017-04-21 23:32 ` [PATCH 5/5] lightnvm: pblk: fix erase counters on error fail Javier González
@ 2017-04-22  9:22   ` Matias Bjørling
  2017-04-22  9:31     ` Javier González
  0 siblings, 1 reply; 16+ messages in thread
From: Matias Bjørling @ 2017-04-22  9:22 UTC (permalink / raw)
  To: Javier González; +Cc: linux-block, linux-kernel, Javier González

On 04/22/2017 01:32 AM, Javier González wrote:
> When block erases fail, these blocks are marked bad. The number of valid
> blocks in the line was not updated, which could cause an infinite loop
> on the erase path.
>
> Fix this atomic counter and, in order to avoid taking an irq lock on the
> interrupt context, make the erase counters atomic too.

I can't find out where the counters are used in irq context? Can you 
point me in the right direction? I'll prefer for these counters to go in 
under the existing line_lock.

>
> Also, in the case that a significant number of blocks become bad in a
> line, the result is the double shared metadata buffer (emeta) to stop
> the pipeline until all metadata is flushed to the media. Increase the
> number of metadata lines from 2 to 4 to avoid this case.

How does moving to 4 lines solve this case? The way I read it is that it 
only postpones when this occurs?

>
> Fixes: a4bd217b4326 "lightnvm: physical block device (pblk) target"
>
> Signed-off-by: Javier González <javier@cnexlabs.com>
> ---
>  drivers/lightnvm/pblk-core.c  | 28 +++++++++++++++++++---------
>  drivers/lightnvm/pblk-gc.c    |  2 +-
>  drivers/lightnvm/pblk-init.c  |  9 ++++++---
>  drivers/lightnvm/pblk-map.c   |  4 ++--
>  drivers/lightnvm/pblk-rl.c    |  6 ++++--
>  drivers/lightnvm/pblk-write.c |  4 ++--
>  drivers/lightnvm/pblk.h       |  6 +++---
>  7 files changed, 37 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/lightnvm/pblk-core.c b/drivers/lightnvm/pblk-core.c
> index ac3742b..5e44768 100644
> --- a/drivers/lightnvm/pblk-core.c
> +++ b/drivers/lightnvm/pblk-core.c
> @@ -29,6 +29,7 @@ static void pblk_mark_bb(struct pblk *pblk, struct pblk_line *line,
>  	pr_debug("pblk: erase failed: line:%d, pos:%d\n", line->id, pos);
>  	atomic_long_inc(&pblk->erase_failed);
>
> +	atomic_dec(&line->blk_in_line);
>  	if (test_and_set_bit(pos, line->blk_bitmap))
>  		pr_err("pblk: attempted to erase bb: line:%d, pos:%d\n",
>  							line->id, pos);
> @@ -832,21 +833,28 @@ int pblk_line_erase(struct pblk *pblk, struct pblk_line *line)
>  	struct ppa_addr ppa;
>  	int bit = -1;
>
> -	/* Erase one block at the time and only erase good blocks */
> -	while ((bit = find_next_zero_bit(line->erase_bitmap, lm->blk_per_line,
> -						bit + 1)) < lm->blk_per_line) {
> +	/* Erase only good blocks, one at a time */
> +	do {
> +		spin_lock(&line->lock);
> +		bit = find_next_zero_bit(line->erase_bitmap, lm->blk_per_line,
> +								bit + 1);
> +		if (bit >= lm->blk_per_line) {
> +			spin_unlock(&line->lock);
> +			break;
> +		}
> +
>  		ppa = pblk->luns[bit].bppa; /* set ch and lun */
>  		ppa.g.blk = line->id;
>
> -		/* If the erase fails, the block is bad and should be marked */
> -		line->left_eblks--;
> +		atomic_dec(&line->left_eblks);
>  		WARN_ON(test_and_set_bit(bit, line->erase_bitmap));
> +		spin_unlock(&line->lock);
>
>  		if (pblk_blk_erase_sync(pblk, ppa)) {
>  			pr_err("pblk: failed to erase line %d\n", line->id);
>  			return -ENOMEM;
>  		}
> -	}
> +	} while (1);
>
>  	return 0;
>  }
> @@ -1007,6 +1015,7 @@ static int pblk_line_init_bb(struct pblk *pblk, struct pblk_line *line,
>  static int pblk_line_prepare(struct pblk *pblk, struct pblk_line *line)
>  {
>  	struct pblk_line_meta *lm = &pblk->lm;
> +	int blk_in_line = atomic_read(&line->blk_in_line);
>
>  	line->map_bitmap = mempool_alloc(pblk->line_meta_pool, GFP_ATOMIC);
>  	if (!line->map_bitmap)
> @@ -1030,12 +1039,13 @@ static int pblk_line_prepare(struct pblk *pblk, struct pblk_line *line)
>  		return -EINTR;
>  	}
>  	line->state = PBLK_LINESTATE_OPEN;
> +
> +	atomic_set(&line->left_eblks, blk_in_line);
> +	atomic_set(&line->left_seblks, blk_in_line);
>  	spin_unlock(&line->lock);
>
>  	/* Bad blocks do not need to be erased */
>  	bitmap_copy(line->erase_bitmap, line->blk_bitmap, lm->blk_per_line);
> -	line->left_eblks = line->blk_in_line;
> -	atomic_set(&line->left_seblks, line->left_eblks);
>
>  	kref_init(&line->ref);
>
> @@ -1231,7 +1241,7 @@ struct pblk_line *pblk_line_replace_data(struct pblk *pblk)
>  	left_seblks = atomic_read(&new->left_seblks);
>  	if (left_seblks) {
>  		/* If line is not fully erased, erase it */
> -		if (new->left_eblks) {
> +		if (atomic_read(&new->left_eblks)) {
>  			if (pblk_line_erase(pblk, new))
>  				return NULL;
>  		} else {
> diff --git a/drivers/lightnvm/pblk-gc.c b/drivers/lightnvm/pblk-gc.c
> index f173fd4..eaf479c 100644
> --- a/drivers/lightnvm/pblk-gc.c
> +++ b/drivers/lightnvm/pblk-gc.c
> @@ -332,7 +332,7 @@ static void pblk_gc_run(struct pblk *pblk)
>  		}
>
>  		line = list_first_entry(group_list, struct pblk_line, list);
> -		nr_blocks_free += line->blk_in_line;
> +		nr_blocks_free += atomic_read(&line->blk_in_line);
>
>  		spin_lock(&line->lock);
>  		WARN_ON(line->state != PBLK_LINESTATE_CLOSED);
> diff --git a/drivers/lightnvm/pblk-init.c b/drivers/lightnvm/pblk-init.c
> index 3996e4b..15b2787 100644
> --- a/drivers/lightnvm/pblk-init.c
> +++ b/drivers/lightnvm/pblk-init.c
> @@ -678,6 +678,8 @@ static int pblk_lines_init(struct pblk *pblk)
>
>  	nr_free_blks = 0;
>  	for (i = 0; i < l_mg->nr_lines; i++) {
> +		int blk_in_line;
> +
>  		line = &pblk->lines[i];
>
>  		line->pblk = pblk;
> @@ -693,14 +695,15 @@ static int pblk_lines_init(struct pblk *pblk)
>  			goto fail_free_lines;
>  		}
>
> -		line->blk_in_line = lm->blk_per_line - nr_bad_blks;
> -		if (line->blk_in_line < lm->min_blk_line) {
> +		blk_in_line = lm->blk_per_line - nr_bad_blks;
> +		if (blk_in_line < lm->min_blk_line) {
>  			line->state = PBLK_LINESTATE_BAD;
>  			list_add_tail(&line->list, &l_mg->bad_list);
>  			continue;
>  		}
>
> -		nr_free_blks += line->blk_in_line;
> +		nr_free_blks += blk_in_line;
> +		atomic_set(&line->blk_in_line, blk_in_line);
>
>  		l_mg->nr_free_lines++;
>  		list_add_tail(&line->list, &l_mg->free_list);
> diff --git a/drivers/lightnvm/pblk-map.c b/drivers/lightnvm/pblk-map.c
> index 3f8bab4..17c1695 100644
> --- a/drivers/lightnvm/pblk-map.c
> +++ b/drivers/lightnvm/pblk-map.c
> @@ -110,7 +110,7 @@ void pblk_map_erase_rq(struct pblk *pblk, struct nvm_rq *rqd,
>  				continue;
>
>  			set_bit(erase_lun, e_line->erase_bitmap);
> -			e_line->left_eblks--;
> +			atomic_dec(&e_line->left_eblks);
>  			*erase_ppa = rqd->ppa_list[i];
>  			erase_ppa->g.blk = e_line->id;
>
> @@ -129,7 +129,7 @@ void pblk_map_erase_rq(struct pblk *pblk, struct nvm_rq *rqd,
>  			return;
>
>  		set_bit(i, e_line->erase_bitmap);
> -		e_line->left_eblks--;
> +		atomic_dec(&e_line->left_eblks);
>  		*erase_ppa = pblk->luns[i].bppa; /* set ch and lun */
>  		erase_ppa->g.blk = e_line->id;
>  	}
> diff --git a/drivers/lightnvm/pblk-rl.c b/drivers/lightnvm/pblk-rl.c
> index 4042162..ab7cbb1 100644
> --- a/drivers/lightnvm/pblk-rl.c
> +++ b/drivers/lightnvm/pblk-rl.c
> @@ -107,9 +107,10 @@ void pblk_rl_set_gc_rsc(struct pblk_rl *rl, int rsv)
>  void pblk_rl_free_lines_inc(struct pblk_rl *rl, struct pblk_line *line)
>  {
>  	struct pblk *pblk = container_of(rl, struct pblk, rl);
> +	int blk_in_line = atomic_read(&line->blk_in_line);
>  	int ret;
>
> -	atomic_add(line->blk_in_line, &rl->free_blocks);
> +	atomic_add(blk_in_line, &rl->free_blocks);
>  	/* Rates will not change that often - no need to lock update */
>  	ret = pblk_rl_update_rates(rl, rl->rb_budget);
>
> @@ -122,9 +123,10 @@ void pblk_rl_free_lines_inc(struct pblk_rl *rl, struct pblk_line *line)
>  void pblk_rl_free_lines_dec(struct pblk_rl *rl, struct pblk_line *line)
>  {
>  	struct pblk *pblk = container_of(rl, struct pblk, rl);
> +	int blk_in_line = atomic_read(&line->blk_in_line);
>  	int ret;
>
> -	atomic_sub(line->blk_in_line, &rl->free_blocks);
> +	atomic_sub(blk_in_line, &rl->free_blocks);
>
>  	/* Rates will not change that often - no need to lock update */
>  	ret = pblk_rl_update_rates(rl, rl->rb_budget);
> diff --git a/drivers/lightnvm/pblk-write.c b/drivers/lightnvm/pblk-write.c
> index a896190..aef6fd7 100644
> --- a/drivers/lightnvm/pblk-write.c
> +++ b/drivers/lightnvm/pblk-write.c
> @@ -244,7 +244,7 @@ static int pblk_setup_w_rq(struct pblk *pblk, struct nvm_rq *rqd,
>  	}
>
>  	ppa_set_empty(&erase_ppa);
> -	if (likely(!e_line || !e_line->left_eblks))
> +	if (likely(!e_line || !atomic_read(&e_line->left_eblks)))
>  		pblk_map_rq(pblk, rqd, c_ctx->sentry, lun_bitmap, valid, 0);
>  	else
>  		pblk_map_erase_rq(pblk, rqd, c_ctx->sentry, lun_bitmap,
> @@ -257,7 +257,7 @@ static int pblk_setup_w_rq(struct pblk *pblk, struct nvm_rq *rqd,
>  			struct nvm_geo *geo = &dev->geo;
>  			int bit;
>
> -			e_line->left_eblks++;
> +			atomic_inc(&e_line->left_eblks);
>  			bit = erase_ppa.g.lun * geo->nr_chnls + erase_ppa.g.ch;
>  			WARN_ON(!test_and_clear_bit(bit, e_line->erase_bitmap));
>  			up(&pblk->erase_sem);
> diff --git a/drivers/lightnvm/pblk.h b/drivers/lightnvm/pblk.h
> index c82120c..02dd18f 100644
> --- a/drivers/lightnvm/pblk.h
> +++ b/drivers/lightnvm/pblk.h
> @@ -363,14 +363,14 @@ struct pblk_line {
>
>  	unsigned int sec_in_line;	/* Number of usable secs in line */
>
> -	unsigned int blk_in_line;	/* Number of good blocks in line */
> +	atomic_t blk_in_line;		/* Number of good blocks in line */
>  	unsigned long *blk_bitmap;	/* Bitmap for valid/invalid blocks */
>  	unsigned long *erase_bitmap;	/* Bitmap for erased blocks */
>
>  	unsigned long *map_bitmap;	/* Bitmap for mapped sectors in line */
>  	unsigned long *invalid_bitmap;	/* Bitmap for invalid sectors in line */
>
> -	int left_eblks;			/* Blocks left for erasing */
> +	atomic_t left_eblks;		/* Blocks left for erasing */
>  	atomic_t left_seblks;		/* Blocks left for sync erasing */
>
>  	int left_msecs;			/* Sectors left for mapping */
> @@ -383,7 +383,7 @@ struct pblk_line {
>  	spinlock_t lock;		/* Necessary for invalid_bitmap only */
>  };
>
> -#define PBLK_DATA_LINES 2
> +#define PBLK_DATA_LINES 4

Why this change? I like to keep new features for 4.13. Only bugfixes for 
4.12.

>
>  enum{
>  	PBLK_KMALLOC_META = 1,
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 5/5] lightnvm: pblk: fix erase counters on error fail
  2017-04-22  9:22   ` Matias Bjørling
@ 2017-04-22  9:31     ` Javier González
  2017-04-23 17:59       ` Matias Bjørling
  0 siblings, 1 reply; 16+ messages in thread
From: Javier González @ 2017-04-22  9:31 UTC (permalink / raw)
  To: Matias Bjørling; +Cc: linux-block, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2115 bytes --]


> On 22 Apr 2017, at 11.22, Matias Bjørling <mb@lightnvm.io> wrote:
> 
> On 04/22/2017 01:32 AM, Javier González wrote:
>> When block erases fail, these blocks are marked bad. The number of valid
>> blocks in the line was not updated, which could cause an infinite loop
>> on the erase path.
>> 
>> Fix this atomic counter and, in order to avoid taking an irq lock on the
>> interrupt context, make the erase counters atomic too.
> 
> I can't find out where the counters are used in irq context? Can you
> point me in the right direction? I'll prefer for these counters to go
> in under the existing line_lock.
> 

This counter is line->blk_in_line, which is used on pblk_mark_bb. This
is triggered on the erase completion path. Note that we do not want to
wait until the recovery kicks in on the workqueue since the line might
start getting recycled straight away if we are close to reaching
capacity. This is indeed the scenario that triggers the race condition.

Making all erase counters atomic allows us to avoid taking the
line_lock. Note that the counters do not depend on each other.

>> Also, in the case that a significant number of blocks become bad in a
>> line, the result is the double shared metadata buffer (emeta) to stop
>> the pipeline until all metadata is flushed to the media. Increase the
>> number of metadata lines from 2 to 4 to avoid this case.
> 
> How does moving to 4 lines solve this case? The way I read it is that
> it only postpones when this occurs?
> 

The chances of having more than 2 lines falling out of blocks after
pre-condition are slim. Adding two more lines should be enough.

>> 
>> [...]
>> 
>> -#define PBLK_DATA_LINES 2
>> +#define PBLK_DATA_LINES 4
> 
> Why this change? I like to keep new features for 4.13. Only bugfixes for 4.12.

This is the 4 lines referred above. I see it as a bug fix since we risk
stalling the pipeline if a line goes above a certain number of bad
blocks on initialization, but we can leave it out and fix this when we
add in-line metadata on the write thread for 4.12

Thanks,
Javier

[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 5/5] lightnvm: pblk: fix erase counters on error fail
  2017-04-22  9:31     ` Javier González
@ 2017-04-23 17:59       ` Matias Bjørling
  2017-04-23 18:18         ` Javier González
  0 siblings, 1 reply; 16+ messages in thread
From: Matias Bjørling @ 2017-04-23 17:59 UTC (permalink / raw)
  To: Javier González; +Cc: linux-block, linux-kernel

On 04/22/2017 11:31 AM, Javier González wrote:
>
>> On 22 Apr 2017, at 11.22, Matias Bjørling <mb@lightnvm.io> wrote:
>>
>> On 04/22/2017 01:32 AM, Javier González wrote:
>>> When block erases fail, these blocks are marked bad. The number of valid
>>> blocks in the line was not updated, which could cause an infinite loop
>>> on the erase path.
>>>
>>> Fix this atomic counter and, in order to avoid taking an irq lock on the
>>> interrupt context, make the erase counters atomic too.
>>
>> I can't find out where the counters are used in irq context? Can you
>> point me in the right direction? I'll prefer for these counters to go
>> in under the existing line_lock.
>>
>
> This counter is line->blk_in_line, which is used on pblk_mark_bb. This
> is triggered on the erase completion path. Note that we do not want to
> wait until the recovery kicks in on the workqueue since the line might
> start getting recycled straight away if we are close to reaching
> capacity. This is indeed the scenario that triggers the race condition.
>
> Making all erase counters atomic allows us to avoid taking the
> line_lock. Note that the counters do not depend on each other.
>
>>> Also, in the case that a significant number of blocks become bad in a
>>> line, the result is the double shared metadata buffer (emeta) to stop
>>> the pipeline until all metadata is flushed to the media. Increase the
>>> number of metadata lines from 2 to 4 to avoid this case.
>>
>> How does moving to 4 lines solve this case? The way I read it is that
>> it only postpones when this occurs?
>>
>
> The chances of having more than 2 lines falling out of blocks after
> pre-condition are slim. Adding two more lines should be enough.
>
>>>
>>> [...]
>>>
>>> -#define PBLK_DATA_LINES 2
>>> +#define PBLK_DATA_LINES 4
>>
>> Why this change? I like to keep new features for 4.13. Only bugfixes for 4.12.
>
> This is the 4 lines referred above. I see it as a bug fix since we risk
> stalling the pipeline if a line goes above a certain number of bad
> blocks on initialization, but we can leave it out and fix this when we
> add in-line metadata on the write thread for 4.12
>
> Thanks,
> Javier
>

Okay. It tickles me a bit. However, to make it pretty, some refactoring 
is needed, which we won't push for this release.

Reviewed-by: Matias Bjørling <matias@cnexlabs.com>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 0/5] lightnvm: pblk extra patches for 4.12
  2017-04-21 23:32 [PATCH 0/5] lightnvm: pblk extra patches for 4.12 Javier González
                   ` (4 preceding siblings ...)
  2017-04-21 23:32 ` [PATCH 5/5] lightnvm: pblk: fix erase counters on error fail Javier González
@ 2017-04-23 18:00 ` Matias Bjørling
  2017-04-23 22:58   ` Jens Axboe
  5 siblings, 1 reply; 16+ messages in thread
From: Matias Bjørling @ 2017-04-23 18:00 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Javier González, linux-block, linux-kernel, Javier González

On 04/22/2017 01:32 AM, Javier González wrote:
> Hi Matias,
>
> This is a couple of bug fixes for the pblk release patch.
>
> Thanks,
> Javier
>
> Javier González (5):
>   lightnvm: pblk: fix race condition on line retry
>   lightnvm: pblk: fix bad error check
>   lightnvm: pblk: fix memory leak on error path
>   lightnvm: pblk: free metadata on line alloc failure
>   lightnvm: pblk: fix erase counters on error fail
>
>  drivers/lightnvm/pblk-core.c  | 60 ++++++++++++++++++++++++++-----------------
>  drivers/lightnvm/pblk-gc.c    |  2 +-
>  drivers/lightnvm/pblk-init.c  |  9 ++++---
>  drivers/lightnvm/pblk-map.c   |  4 +--
>  drivers/lightnvm/pblk-rl.c    |  6 +++--
>  drivers/lightnvm/pblk-write.c |  7 +++--
>  drivers/lightnvm/pblk.h       |  6 ++---
>  7 files changed, 57 insertions(+), 37 deletions(-)
>

Hi Jens,

Could you pick these up as well? This will properly be the last for the 
4.12 window. Thank you!

-Matias

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 5/5] lightnvm: pblk: fix erase counters on error fail
  2017-04-23 17:59       ` Matias Bjørling
@ 2017-04-23 18:18         ` Javier González
  0 siblings, 0 replies; 16+ messages in thread
From: Javier González @ 2017-04-23 18:18 UTC (permalink / raw)
  To: Matias Bjørling; +Cc: linux-block, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2673 bytes --]

> On 23 Apr 2017, at 19.59, Matias Bjørling <mb@lightnvm.io> wrote:
> 
> On 04/22/2017 11:31 AM, Javier González wrote:
>>> On 22 Apr 2017, at 11.22, Matias Bjørling <mb@lightnvm.io> wrote:
>>> 
>>> On 04/22/2017 01:32 AM, Javier González wrote:
>>>> When block erases fail, these blocks are marked bad. The number of valid
>>>> blocks in the line was not updated, which could cause an infinite loop
>>>> on the erase path.
>>>> 
>>>> Fix this atomic counter and, in order to avoid taking an irq lock on the
>>>> interrupt context, make the erase counters atomic too.
>>> 
>>> I can't find out where the counters are used in irq context? Can you
>>> point me in the right direction? I'll prefer for these counters to go
>>> in under the existing line_lock.
>> 
>> This counter is line->blk_in_line, which is used on pblk_mark_bb. This
>> is triggered on the erase completion path. Note that we do not want to
>> wait until the recovery kicks in on the workqueue since the line might
>> start getting recycled straight away if we are close to reaching
>> capacity. This is indeed the scenario that triggers the race condition.
>> 
>> Making all erase counters atomic allows us to avoid taking the
>> line_lock. Note that the counters do not depend on each other.
>> 
>>>> Also, in the case that a significant number of blocks become bad in a
>>>> line, the result is the double shared metadata buffer (emeta) to stop
>>>> the pipeline until all metadata is flushed to the media. Increase the
>>>> number of metadata lines from 2 to 4 to avoid this case.
>>> 
>>> How does moving to 4 lines solve this case? The way I read it is that
>>> it only postpones when this occurs?
>> 
>> The chances of having more than 2 lines falling out of blocks after
>> pre-condition are slim. Adding two more lines should be enough.
>> 
>>>> [...]
>>>> 
>>>> -#define PBLK_DATA_LINES 2
>>>> +#define PBLK_DATA_LINES 4
>>> 
>>> Why this change? I like to keep new features for 4.13. Only bugfixes for 4.12.
>> 
>> This is the 4 lines referred above. I see it as a bug fix since we risk
>> stalling the pipeline if a line goes above a certain number of bad
>> blocks on initialization, but we can leave it out and fix this when we
>> add in-line metadata on the write thread for 4.12
>> 
>> Thanks,
>> Javier
> 
> Okay. It tickles me a bit. However, to make it pretty, some
> refactoring is needed, which we won't push for this release.
> 
> Reviewed-by: Matias Bjørling <matias@cnexlabs.com>


Yes, you are right. I think it will become much better as we implement
the FTL log and refactor the metadata path.

Thanks!
Javier

[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 0/5] lightnvm: pblk extra patches for 4.12
  2017-04-23 18:00 ` [PATCH 0/5] lightnvm: pblk extra patches for 4.12 Matias Bjørling
@ 2017-04-23 22:58   ` Jens Axboe
  0 siblings, 0 replies; 16+ messages in thread
From: Jens Axboe @ 2017-04-23 22:58 UTC (permalink / raw)
  To: Matias Bjørling
  Cc: Javier González, linux-block, linux-kernel, Javier González

On Sun, Apr 23 2017, Matias Bjørling wrote:
> On 04/22/2017 01:32 AM, Javier González wrote:
> >Hi Matias,
> >
> >This is a couple of bug fixes for the pblk release patch.
> >
> >Thanks,
> >Javier
> >
> >Javier González (5):
> >  lightnvm: pblk: fix race condition on line retry
> >  lightnvm: pblk: fix bad error check
> >  lightnvm: pblk: fix memory leak on error path
> >  lightnvm: pblk: free metadata on line alloc failure
> >  lightnvm: pblk: fix erase counters on error fail
> >
> > drivers/lightnvm/pblk-core.c  | 60 ++++++++++++++++++++++++++-----------------
> > drivers/lightnvm/pblk-gc.c    |  2 +-
> > drivers/lightnvm/pblk-init.c  |  9 ++++---
> > drivers/lightnvm/pblk-map.c   |  4 +--
> > drivers/lightnvm/pblk-rl.c    |  6 +++--
> > drivers/lightnvm/pblk-write.c |  7 +++--
> > drivers/lightnvm/pblk.h       |  6 ++---
> > 7 files changed, 57 insertions(+), 37 deletions(-)
> >
> 
> Hi Jens,
> 
> Could you pick these up as well? This will properly be the last for
> the 4.12 window. Thank you!

Yep, added for 4.12, thanks.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2017-04-23 22:58 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-21 23:32 [PATCH 0/5] lightnvm: pblk extra patches for 4.12 Javier González
2017-04-21 23:32 ` [PATCH 1/5] lightnvm: pblk: fix race condition on line retry Javier González
2017-04-22  8:51   ` Matias Bjørling
2017-04-21 23:32 ` [PATCH 2/5] lightnvm: pblk: fix bad error check Javier González
2017-04-22  8:52   ` Matias Bjørling
2017-04-21 23:32 ` [PATCH 3/5] lightnvm: pblk: fix memory leak on error path Javier González
2017-04-22  8:57   ` Matias Bjørling
2017-04-21 23:32 ` [PATCH 4/5] lightnvm: pblk: free metadata on line alloc failure Javier González
2017-04-22  9:02   ` Matias Bjørling
2017-04-21 23:32 ` [PATCH 5/5] lightnvm: pblk: fix erase counters on error fail Javier González
2017-04-22  9:22   ` Matias Bjørling
2017-04-22  9:31     ` Javier González
2017-04-23 17:59       ` Matias Bjørling
2017-04-23 18:18         ` Javier González
2017-04-23 18:00 ` [PATCH 0/5] lightnvm: pblk extra patches for 4.12 Matias Bjørling
2017-04-23 22:58   ` Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).