linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/5] lightnvm: pblk: handle bad sectors in the emeta area correctly
@ 2018-01-31  2:06 Javier González
  2018-01-31  2:06 ` [PATCH 2/5] lightnvm: pblk: check data lines version on recovery Javier González
                   ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: Javier González @ 2018-01-31  2:06 UTC (permalink / raw)
  To: mb; +Cc: linux-block, linux-kernel, Hans Holmberg

From: Hans Holmberg <hans.holmberg@cnexlabs.com>

Unless we check if there are bad sectors in the entire emeta-area
we risk ending up with valid bitmap / available sector count inconsistency.
This results in lines with a bad chunk at the last LUN marked as bad,
so go through the whole emeta area and mark up the invalid sectors.

Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com>
---
 drivers/lightnvm/pblk-core.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/lightnvm/pblk-core.c b/drivers/lightnvm/pblk-core.c
index 0487b9340c1d..9027cf2ed1d8 100644
--- a/drivers/lightnvm/pblk-core.c
+++ b/drivers/lightnvm/pblk-core.c
@@ -1021,6 +1021,7 @@ static int pblk_line_init_bb(struct pblk *pblk, struct pblk_line *line,
 	int nr_bb = 0;
 	u64 off;
 	int bit = -1;
+	int emeta_secs;
 
 	line->sec_in_line = lm->sec_per_line;
 
@@ -1055,18 +1056,18 @@ static int pblk_line_init_bb(struct pblk *pblk, struct pblk_line *line,
 	/* Mark emeta metadata sectors as bad sectors. We need to consider bad
 	 * blocks to make sure that there are enough sectors to store emeta
 	 */
-	off = lm->sec_per_line - lm->emeta_sec[0];
-	bitmap_set(line->invalid_bitmap, off, lm->emeta_sec[0]);
-	while (nr_bb) {
+	emeta_secs = lm->emeta_sec[0];
+	off = lm->sec_per_line;
+	while (emeta_secs) {
 		off -= geo->sec_per_pl;
 		if (!test_bit(off, line->invalid_bitmap)) {
 			bitmap_set(line->invalid_bitmap, off, geo->sec_per_pl);
-			nr_bb--;
+			emeta_secs -= geo->sec_per_pl;
 		}
 	}
 
-	line->sec_in_line -= lm->emeta_sec[0];
 	line->emeta_ssec = off;
+	line->sec_in_line -= lm->emeta_sec[0];
 	line->nr_valid_lbas = 0;
 	line->left_msecs = line->sec_in_line;
 	*line->vsc = cpu_to_le32(line->sec_in_line);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/5] lightnvm: pblk: check data lines version on recovery
  2018-01-31  2:06 [PATCH 1/5] lightnvm: pblk: handle bad sectors in the emeta area correctly Javier González
@ 2018-01-31  2:06 ` Javier González
  2018-01-31  2:06 ` [PATCH 3/5] lightnvm: pblk: export write amplification counters to sysfs Javier González
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 15+ messages in thread
From: Javier González @ 2018-01-31  2:06 UTC (permalink / raw)
  To: mb; +Cc: linux-block, linux-kernel, Hans Holmberg, Javier González

From: Hans Holmberg <hans.holmberg@cnexlabs.com>

As a preparation for future bumps of data line persistent storage
versions, we need to start checking the emeta line version during
recovery. Also slit up the current emeta/smeta version into two
bytes (major,minor).

Recovering lines with the same major number as the current pblk data
line version must succeed. This means that any changes in the
persistent format must be:

 (1) Backward compatible: if we switch back to and older
     kernel, recovery of lines stored with major == current_major
     and minor > current_minor must succeed.

 (2) Forward compatible: switching to a newer kernel,
     recovery of lines stored with major=current_major and
     minor < minor must handle the data format differences
     gracefully(i.e. initialize new data structures to default values).

If we detect lines that have a different major number than
the current we must abort recovery. The user must manually
migrate the data in this case.

Previously the version stored in the emeta header was copied
from smeta, which has version 1, so we need to set the minor
version to 1.

Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com>
Signed-off-by: Javier González <javier@cnexlabs.com>
---
 drivers/lightnvm/pblk-core.c     |  9 ++++++++-
 drivers/lightnvm/pblk-recovery.c | 26 ++++++++++++++++++++++++--
 drivers/lightnvm/pblk.h          | 16 ++++++++++++++--
 3 files changed, 46 insertions(+), 5 deletions(-)

diff --git a/drivers/lightnvm/pblk-core.c b/drivers/lightnvm/pblk-core.c
index 9027cf2ed1d8..155e42a26293 100644
--- a/drivers/lightnvm/pblk-core.c
+++ b/drivers/lightnvm/pblk-core.c
@@ -975,7 +975,8 @@ static int pblk_line_init_metadata(struct pblk *pblk, struct pblk_line *line,
 	memcpy(smeta_buf->header.uuid, pblk->instance_uuid, 16);
 	smeta_buf->header.id = cpu_to_le32(line->id);
 	smeta_buf->header.type = cpu_to_le16(line->type);
-	smeta_buf->header.version = SMETA_VERSION;
+	smeta_buf->header.version_major = SMETA_VERSION_MAJOR;
+	smeta_buf->header.version_minor = SMETA_VERSION_MINOR;
 
 	/* Start metadata */
 	smeta_buf->seq_nr = cpu_to_le64(line->seq_nr);
@@ -998,6 +999,12 @@ static int pblk_line_init_metadata(struct pblk *pblk, struct pblk_line *line,
 	/* End metadata */
 	memcpy(&emeta_buf->header, &smeta_buf->header,
 						sizeof(struct line_header));
+
+	emeta_buf->header.version_major = EMETA_VERSION_MAJOR;
+	emeta_buf->header.version_minor = EMETA_VERSION_MINOR;
+	emeta_buf->header.crc = cpu_to_le32(
+			pblk_calc_meta_header_crc(pblk, &emeta_buf->header));
+
 	emeta_buf->seq_nr = cpu_to_le64(line->seq_nr);
 	emeta_buf->nr_lbas = cpu_to_le64(line->sec_in_line);
 	emeta_buf->nr_valid_lbas = cpu_to_le64(0);
diff --git a/drivers/lightnvm/pblk-recovery.c b/drivers/lightnvm/pblk-recovery.c
index 1d5e961bf5e0..a30fe203d454 100644
--- a/drivers/lightnvm/pblk-recovery.c
+++ b/drivers/lightnvm/pblk-recovery.c
@@ -826,6 +826,25 @@ static u64 pblk_line_emeta_start(struct pblk *pblk, struct pblk_line *line)
 	return emeta_start;
 }
 
+static int pblk_recov_check_line_version(struct pblk *pblk,
+					 struct line_emeta *emeta)
+{
+	struct line_header *header = &emeta->header;
+
+	if (header->version_major != EMETA_VERSION_MAJOR) {
+		pr_err("pblk: line major version mismatch: %d, expected: %d\n",
+		       header->version_major, EMETA_VERSION_MAJOR);
+		return 1;
+	}
+
+#ifdef NVM_DEBUG
+	if (header->version_minor > EMETA_VERSION_MINOR)
+		pr_info("pblk: newer line minor version found: %d\n", line_v);
+#endif
+
+	return 0;
+}
+
 struct pblk_line *pblk_recov_l2p(struct pblk *pblk)
 {
 	struct pblk_line_meta *lm = &pblk->lm;
@@ -873,9 +892,9 @@ struct pblk_line *pblk_recov_l2p(struct pblk *pblk)
 		if (le32_to_cpu(smeta_buf->header.identifier) != PBLK_MAGIC)
 			continue;
 
-		if (smeta_buf->header.version != SMETA_VERSION) {
+		if (smeta_buf->header.version_major != SMETA_VERSION_MAJOR) {
 			pr_err("pblk: found incompatible line version %u\n",
-					le16_to_cpu(smeta_buf->header.version));
+					smeta_buf->header.version_major);
 			return ERR_PTR(-EINVAL);
 		}
 
@@ -943,6 +962,9 @@ struct pblk_line *pblk_recov_l2p(struct pblk *pblk)
 			goto next;
 		}
 
+		if (pblk_recov_check_line_version(pblk, line->emeta->buf))
+			return ERR_PTR(-EINVAL);
+
 		if (pblk_recov_l2p_from_emeta(pblk, line))
 			pblk_recov_l2p_from_oob(pblk, line);
 
diff --git a/drivers/lightnvm/pblk.h b/drivers/lightnvm/pblk.h
index 8c357fb6538e..fae2526f80b2 100644
--- a/drivers/lightnvm/pblk.h
+++ b/drivers/lightnvm/pblk.h
@@ -320,14 +320,26 @@ enum {
 };
 
 #define PBLK_MAGIC 0x70626c6b /*pblk*/
-#define SMETA_VERSION cpu_to_le16(1)
+
+/* emeta/smeta persistent storage format versions:
+ * Changes in major version requires offline migration.
+ * Changes in minor version are handled automatically during
+ * recovery.
+ */
+
+#define SMETA_VERSION_MAJOR (0)
+#define SMETA_VERSION_MINOR (1)
+
+#define EMETA_VERSION_MAJOR (0)
+#define EMETA_VERSION_MINOR (1)
 
 struct line_header {
 	__le32 crc;
 	__le32 identifier;	/* pblk identifier */
 	__u8 uuid[16];		/* instance uuid */
 	__le16 type;		/* line type */
-	__le16 version;		/* type version */
+	__u8 version_major;	/* version major */
+	__u8 version_minor;	/* version minor */
 	__le32 id;		/* line id for current line */
 };
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 3/5] lightnvm: pblk: export write amplification counters to sysfs
  2018-01-31  2:06 [PATCH 1/5] lightnvm: pblk: handle bad sectors in the emeta area correctly Javier González
  2018-01-31  2:06 ` [PATCH 2/5] lightnvm: pblk: check data lines version on recovery Javier González
@ 2018-01-31  2:06 ` Javier González
  2018-01-31  2:06 ` [PATCH 4/5] lightnvm: pblk: add padding distribution sysfs attribute Javier González
  2018-01-31  2:06 ` [PATCH 5/5] lightnvm: pblk: refactor bad block identification Javier González
  3 siblings, 0 replies; 15+ messages in thread
From: Javier González @ 2018-01-31  2:06 UTC (permalink / raw)
  To: mb; +Cc: linux-block, linux-kernel, Hans Holmberg, Javier González

From: Hans Holmberg <hans.holmberg@cnexlabs.com>

In a SSD, write amplification, WA, is defined as the average
number of page writes per user page write. Write amplification
negatively affects write performance and decreases the lifetime
of the disk, so it's a useful metric to add to sysfs.

In plkb's case, the number of writes per user sector is the sum of:

    (1) number of user writes
    (2) number of sectors written by the garbage collector
    (3) number of sectors padded (i.e. due to syncs)

This patch adds persistent counters for 1-3 and two sysfs attributes
to export these along with WA calculated with five decimals:

    write_amp_mileage: the accumulated write amplification stats
                      for the lifetime of the pblk instance

    write_amp_trip: resetable stats to facilitate delta measurements,
                    values reset at creation and if 0 is written
                    to the attribute.

64-bit counters are used as a 32 bit counter would wrap around
already after about 17 TB worth of user data. It will take a
long long time before the 64 bit sector counters wrap around.

The counters are stored after the bad block bitmap in the first
emeta sector of each written line. There is plenty of space in the
first emeta sector, so we don't need to bump the major version of
the line data format.

Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com>
Signed-off-by: Javier González <javier@cnexlabs.com>
---
 drivers/lightnvm/pblk-cache.c    |  4 ++
 drivers/lightnvm/pblk-core.c     |  6 +++
 drivers/lightnvm/pblk-init.c     | 11 ++++-
 drivers/lightnvm/pblk-map.c      |  2 +
 drivers/lightnvm/pblk-rb.c       |  3 ++
 drivers/lightnvm/pblk-recovery.c | 25 ++++++++++++
 drivers/lightnvm/pblk-sysfs.c    | 86 +++++++++++++++++++++++++++++++++++++++-
 drivers/lightnvm/pblk.h          | 42 ++++++++++++++++----
 8 files changed, 169 insertions(+), 10 deletions(-)

diff --git a/drivers/lightnvm/pblk-cache.c b/drivers/lightnvm/pblk-cache.c
index 000fcad38136..29a23111b31c 100644
--- a/drivers/lightnvm/pblk-cache.c
+++ b/drivers/lightnvm/pblk-cache.c
@@ -63,6 +63,8 @@ int pblk_write_to_cache(struct pblk *pblk, struct bio *bio, unsigned long flags)
 		bio_advance(bio, PBLK_EXPOSED_PAGE_SIZE);
 	}
 
+	atomic64_add(nr_entries, &pblk->user_wa);
+
 #ifdef CONFIG_NVM_DEBUG
 	atomic_long_add(nr_entries, &pblk->inflight_writes);
 	atomic_long_add(nr_entries, &pblk->req_writes);
@@ -117,6 +119,8 @@ int pblk_write_gc_to_cache(struct pblk *pblk, struct pblk_gc_rq *gc_rq)
 	WARN_ONCE(gc_rq->secs_to_gc != valid_entries,
 					"pblk: inconsistent GC write\n");
 
+	atomic64_add(valid_entries, &pblk->gc_wa);
+
 #ifdef CONFIG_NVM_DEBUG
 	atomic_long_add(valid_entries, &pblk->inflight_writes);
 	atomic_long_add(valid_entries, &pblk->recov_gc_writes);
diff --git a/drivers/lightnvm/pblk-core.c b/drivers/lightnvm/pblk-core.c
index 155e42a26293..22e61cd4f801 100644
--- a/drivers/lightnvm/pblk-core.c
+++ b/drivers/lightnvm/pblk-core.c
@@ -1630,11 +1630,16 @@ void pblk_line_close_meta(struct pblk *pblk, struct pblk_line *line)
 	struct pblk_line_meta *lm = &pblk->lm;
 	struct pblk_emeta *emeta = line->emeta;
 	struct line_emeta *emeta_buf = emeta->buf;
+	struct wa_counters *wa = emeta_to_wa(lm, emeta_buf);
 
 	/* No need for exact vsc value; avoid a big line lock and take aprox. */
 	memcpy(emeta_to_vsc(pblk, emeta_buf), l_mg->vsc_list, lm->vsc_list_len);
 	memcpy(emeta_to_bb(emeta_buf), line->blk_bitmap, lm->blk_bitmap_len);
 
+	wa->user = cpu_to_le64(atomic64_read(&pblk->user_wa));
+	wa->pad = cpu_to_le64(atomic64_read(&pblk->pad_wa));
+	wa->gc = cpu_to_le64(atomic64_read(&pblk->gc_wa));
+
 	emeta_buf->nr_valid_lbas = cpu_to_le64(line->nr_valid_lbas);
 	emeta_buf->crc = cpu_to_le32(pblk_calc_emeta_crc(pblk, emeta_buf));
 
@@ -1837,6 +1842,7 @@ void pblk_update_map_dev(struct pblk *pblk, sector_t lba,
 #endif
 	/* Invalidate and discard padded entries */
 	if (lba == ADDR_EMPTY) {
+		atomic64_inc(&pblk->pad_wa);
 #ifdef CONFIG_NVM_DEBUG
 		atomic_long_inc(&pblk->padded_wb);
 #endif
diff --git a/drivers/lightnvm/pblk-init.c b/drivers/lightnvm/pblk-init.c
index 93d671ca518e..7eedc5daebc8 100644
--- a/drivers/lightnvm/pblk-init.c
+++ b/drivers/lightnvm/pblk-init.c
@@ -559,8 +559,8 @@ static unsigned int calc_emeta_len(struct pblk *pblk)
 
 	/* Round to sector size so that lba_list starts on its own sector */
 	lm->emeta_sec[1] = DIV_ROUND_UP(
-			sizeof(struct line_emeta) + lm->blk_bitmap_len,
-			geo->sec_size);
+			sizeof(struct line_emeta) + lm->blk_bitmap_len +
+			sizeof(struct wa_counters), geo->sec_size);
 	lm->emeta_len[1] = lm->emeta_sec[1] * geo->sec_size;
 
 	/* Round to sector size so that vsc_list starts on its own sector */
@@ -991,6 +991,13 @@ static void *pblk_init(struct nvm_tgt_dev *dev, struct gendisk *tdisk,
 	if (flags & NVM_TARGET_FACTORY)
 		pblk_setup_uuid(pblk);
 
+	atomic64_set(&pblk->user_wa, 0);
+	atomic64_set(&pblk->pad_wa, 0);
+	atomic64_set(&pblk->gc_wa, 0);
+	pblk->user_rst_wa = 0;
+	pblk->pad_rst_wa = 0;
+	pblk->gc_rst_wa = 0;
+
 #ifdef CONFIG_NVM_DEBUG
 	atomic_long_set(&pblk->inflight_writes, 0);
 	atomic_long_set(&pblk->padded_writes, 0);
diff --git a/drivers/lightnvm/pblk-map.c b/drivers/lightnvm/pblk-map.c
index 7445e6430c52..04e08d76ea5f 100644
--- a/drivers/lightnvm/pblk-map.c
+++ b/drivers/lightnvm/pblk-map.c
@@ -65,6 +65,8 @@ static void pblk_map_page_data(struct pblk *pblk, unsigned int sentry,
 			lba_list[paddr] = cpu_to_le64(w_ctx->lba);
 			if (lba_list[paddr] != addr_empty)
 				line->nr_valid_lbas++;
+			else
+				atomic64_inc(&pblk->pad_wa);
 		} else {
 			lba_list[paddr] = meta_list[i].lba = addr_empty;
 			__pblk_map_invalidate(pblk, line, paddr);
diff --git a/drivers/lightnvm/pblk-rb.c b/drivers/lightnvm/pblk-rb.c
index ec8fc314646b..7044b5599cc4 100644
--- a/drivers/lightnvm/pblk-rb.c
+++ b/drivers/lightnvm/pblk-rb.c
@@ -622,6 +622,9 @@ unsigned int pblk_rb_read_to_bio(struct pblk_rb *rb, struct nvm_rq *rqd,
 		}
 	}
 
+	atomic64_add(pad, &((struct pblk *)
+			(container_of(rb, struct pblk, rwb)))->pad_wa);
+
 #ifdef CONFIG_NVM_DEBUG
 	atomic_long_add(pad, &((struct pblk *)
 			(container_of(rb, struct pblk, rwb)))->padded_writes);
diff --git a/drivers/lightnvm/pblk-recovery.c b/drivers/lightnvm/pblk-recovery.c
index a30fe203d454..e75a1af2eebe 100644
--- a/drivers/lightnvm/pblk-recovery.c
+++ b/drivers/lightnvm/pblk-recovery.c
@@ -845,6 +845,29 @@ static int pblk_recov_check_line_version(struct pblk *pblk,
 	return 0;
 }
 
+static void pblk_recov_wa_counters(struct pblk *pblk,
+				   struct line_emeta *emeta)
+{
+	struct pblk_line_meta *lm = &pblk->lm;
+	struct line_header *header = &emeta->header;
+	struct wa_counters *wa = emeta_to_wa(lm, emeta);
+
+	/* WA counters were introduced in emeta version 0.2 */
+	if (header->version_major > 0 || header->version_minor >= 2) {
+		u64 user = le64_to_cpu(wa->user);
+		u64 pad = le64_to_cpu(wa->pad);
+		u64 gc = le64_to_cpu(wa->gc);
+
+		atomic64_set(&pblk->user_wa, user);
+		atomic64_set(&pblk->pad_wa, pad);
+		atomic64_set(&pblk->gc_wa, gc);
+
+		pblk->user_rst_wa = user;
+		pblk->pad_rst_wa = pad;
+		pblk->gc_rst_wa = gc;
+	}
+}
+
 struct pblk_line *pblk_recov_l2p(struct pblk *pblk)
 {
 	struct pblk_line_meta *lm = &pblk->lm;
@@ -965,6 +988,8 @@ struct pblk_line *pblk_recov_l2p(struct pblk *pblk)
 		if (pblk_recov_check_line_version(pblk, line->emeta->buf))
 			return ERR_PTR(-EINVAL);
 
+		pblk_recov_wa_counters(pblk, line->emeta->buf);
+
 		if (pblk_recov_l2p_from_emeta(pblk, line))
 			pblk_recov_l2p_from_oob(pblk, line);
 
diff --git a/drivers/lightnvm/pblk-sysfs.c b/drivers/lightnvm/pblk-sysfs.c
index 620bab853579..4804bbd32d5f 100644
--- a/drivers/lightnvm/pblk-sysfs.c
+++ b/drivers/lightnvm/pblk-sysfs.c
@@ -298,6 +298,49 @@ static ssize_t pblk_sysfs_get_sec_per_write(struct pblk *pblk, char *page)
 	return snprintf(page, PAGE_SIZE, "%d\n", pblk->sec_per_write);
 }
 
+static ssize_t pblk_get_write_amp(u64 user, u64 gc, u64 pad,
+				  char *page)
+{
+	int sz;
+
+	sz = snprintf(page, PAGE_SIZE,
+			"user:%lld gc:%lld pad:%lld WA:",
+			user, gc, pad);
+
+	if (!user) {
+		sz += snprintf(page + sz, PAGE_SIZE - sz, "NaN\n");
+	} else {
+		u64 wa_int, wa_frac, wa_frac_int;
+
+		wa_int = user + gc + pad;
+		sector_div(wa_int, user);
+
+		wa_frac_int = (user + gc + pad) * 100000;
+		sector_div(wa_frac_int, user);
+		div_u64_rem(wa_frac_int, 100000, (u32 *)&wa_frac);
+
+		sz += snprintf(page + sz, PAGE_SIZE - sz, "%lld.%05lld\n",
+							wa_int, wa_frac);
+	}
+
+	return sz;
+}
+
+static ssize_t pblk_sysfs_get_write_amp_mileage(struct pblk *pblk, char *page)
+{
+	return pblk_get_write_amp(atomic64_read(&pblk->user_wa),
+		atomic64_read(&pblk->gc_wa), atomic64_read(&pblk->pad_wa),
+		page);
+}
+
+static ssize_t pblk_sysfs_get_write_amp_trip(struct pblk *pblk, char *page)
+{
+	return pblk_get_write_amp(
+		atomic64_read(&pblk->user_wa) - pblk->user_rst_wa,
+		atomic64_read(&pblk->gc_wa) - pblk->gc_rst_wa,
+		atomic64_read(&pblk->pad_wa) - pblk->pad_rst_wa, page);
+}
+
 #ifdef CONFIG_NVM_DEBUG
 static ssize_t pblk_sysfs_stats_debug(struct pblk *pblk, char *page)
 {
@@ -360,6 +403,30 @@ static ssize_t pblk_sysfs_set_sec_per_write(struct pblk *pblk,
 	return len;
 }
 
+static ssize_t pblk_sysfs_set_write_amp_trip(struct pblk *pblk,
+			const char *page, size_t len)
+{
+	size_t c_len;
+	int reset_value;
+
+	c_len = strcspn(page, "\n");
+	if (c_len >= len)
+		return -EINVAL;
+
+	if (kstrtouint(page, 0, &reset_value))
+		return -EINVAL;
+
+	if (reset_value !=  0)
+		return -EINVAL;
+
+	pblk->user_rst_wa = atomic64_read(&pblk->user_wa);
+	pblk->pad_rst_wa = atomic64_read(&pblk->pad_wa);
+	pblk->gc_rst_wa = atomic64_read(&pblk->gc_wa);
+
+	return len;
+}
+
+
 static struct attribute sys_write_luns = {
 	.name = "write_luns",
 	.mode = 0444,
@@ -410,6 +477,16 @@ static struct attribute sys_max_sec_per_write = {
 	.mode = 0644,
 };
 
+static struct attribute sys_write_amp_mileage = {
+	.name = "write_amp_mileage",
+	.mode = 0444,
+};
+
+static struct attribute sys_write_amp_trip = {
+	.name = "write_amp_trip",
+	.mode = 0644,
+};
+
 #ifdef CONFIG_NVM_DEBUG
 static struct attribute sys_stats_debug_attr = {
 	.name = "stats",
@@ -428,6 +505,8 @@ static struct attribute *pblk_attrs[] = {
 	&sys_stats_ppaf_attr,
 	&sys_lines_attr,
 	&sys_lines_info_attr,
+	&sys_write_amp_mileage,
+	&sys_write_amp_trip,
 #ifdef CONFIG_NVM_DEBUG
 	&sys_stats_debug_attr,
 #endif
@@ -457,6 +536,10 @@ static ssize_t pblk_sysfs_show(struct kobject *kobj, struct attribute *attr,
 		return pblk_sysfs_lines_info(pblk, buf);
 	else if (strcmp(attr->name, "max_sec_per_write") == 0)
 		return pblk_sysfs_get_sec_per_write(pblk, buf);
+	else if (strcmp(attr->name, "write_amp_mileage") == 0)
+		return pblk_sysfs_get_write_amp_mileage(pblk, buf);
+	else if (strcmp(attr->name, "write_amp_trip") == 0)
+		return pblk_sysfs_get_write_amp_trip(pblk, buf);
 #ifdef CONFIG_NVM_DEBUG
 	else if (strcmp(attr->name, "stats") == 0)
 		return pblk_sysfs_stats_debug(pblk, buf);
@@ -473,7 +556,8 @@ static ssize_t pblk_sysfs_store(struct kobject *kobj, struct attribute *attr,
 		return pblk_sysfs_gc_force(pblk, buf, len);
 	else if (strcmp(attr->name, "max_sec_per_write") == 0)
 		return pblk_sysfs_set_sec_per_write(pblk, buf, len);
-
+	else if (strcmp(attr->name, "write_amp_trip") == 0)
+		return pblk_sysfs_set_write_amp_trip(pblk, buf, len);
 	return 0;
 }
 
diff --git a/drivers/lightnvm/pblk.h b/drivers/lightnvm/pblk.h
index fae2526f80b2..4b7d8618631f 100644
--- a/drivers/lightnvm/pblk.h
+++ b/drivers/lightnvm/pblk.h
@@ -331,7 +331,7 @@ enum {
 #define SMETA_VERSION_MINOR (1)
 
 #define EMETA_VERSION_MAJOR (0)
-#define EMETA_VERSION_MINOR (1)
+#define EMETA_VERSION_MINOR (2)
 
 struct line_header {
 	__le32 crc;
@@ -361,11 +361,13 @@ struct line_smeta {
 	__le64 lun_bitmap[];
 };
 
+
 /*
  * Metadata layout in media:
  *	First sector:
  *		1. struct line_emeta
  *		2. bad block bitmap (u64 * window_wr_lun)
+ *		3. write amplification counters
  *	Mid sectors (start at lbas_sector):
  *		3. nr_lbas (u64) forming lba list
  *	Last sectors (start at vsc_sector):
@@ -389,7 +391,15 @@ struct line_emeta {
 	__le32 next_id;		/* Line id for next line */
 	__le64 nr_lbas;		/* Number of lbas mapped in line */
 	__le64 nr_valid_lbas;	/* Number of valid lbas mapped in line */
-	__le64 bb_bitmap[];	/* Updated bad block bitmap for line */
+	__le64 bb_bitmap[];     /* Updated bad block bitmap for line */
+};
+
+
+/* Write amplification counters stored on media */
+struct wa_counters {
+	__le64 user;		/* Number of user written sectors */
+	__le64 gc;		/* Number of sectors written by GC*/
+	__le64 pad;		/* Number of padded sectors */
 };
 
 struct pblk_emeta {
@@ -519,10 +529,11 @@ struct pblk_line_meta {
 	unsigned int smeta_sec;		/* Sectors needed for smeta */
 
 	unsigned int emeta_len[4];	/* Lengths for emeta:
-					 *  [0]: Total length
-					 *  [1]: struct line_emeta length
-					 *  [2]: L2P portion length
-					 *  [3]: vsc list length
+					 *  [0]: Total
+					 *  [1]: struct line_emeta +
+					 *       bb_bitmap + struct wa_counters
+					 *  [2]: L2P portion
+					 *  [3]: vsc
 					 */
 	unsigned int emeta_sec[4];	/* Sectors needed for emeta. Same layout
 					 * as emeta_len
@@ -604,8 +615,19 @@ struct pblk {
 	int sec_per_write;
 
 	unsigned char instance_uuid[16];
+
+	/* Persistent write amplification counters, 4kb sector I/Os */
+	atomic64_t user_wa;		/* Sectors written by user */
+	atomic64_t gc_wa;		/* Sectors written by GC */
+	atomic64_t pad_wa;		/* Padded sectors written */
+
+	/* Reset values for delta write amplification measurements */
+	u64 user_rst_wa;
+	u64 gc_rst_wa;
+	u64 pad_rst_wa;
+
 #ifdef CONFIG_NVM_DEBUG
-	/* All debug counters apply to 4kb sector I/Os */
+	/* Non-persistent debug counters, 4kb sector I/Os */
 	atomic_long_t inflight_writes;	/* Inflight writes (user and gc) */
 	atomic_long_t padded_writes;	/* Sectors padded due to flush/fua */
 	atomic_long_t padded_wb;	/* Sectors padded in write buffer */
@@ -900,6 +922,12 @@ static inline void *emeta_to_bb(struct line_emeta *emeta)
 	return emeta->bb_bitmap;
 }
 
+static inline void *emeta_to_wa(struct pblk_line_meta *lm,
+				struct line_emeta *emeta)
+{
+	return emeta->bb_bitmap + lm->blk_bitmap_len;
+}
+
 static inline void *emeta_to_lbas(struct pblk *pblk, struct line_emeta *emeta)
 {
 	return ((void *)emeta + pblk->lm.emeta_len[1]);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 4/5] lightnvm: pblk: add padding distribution sysfs attribute
  2018-01-31  2:06 [PATCH 1/5] lightnvm: pblk: handle bad sectors in the emeta area correctly Javier González
  2018-01-31  2:06 ` [PATCH 2/5] lightnvm: pblk: check data lines version on recovery Javier González
  2018-01-31  2:06 ` [PATCH 3/5] lightnvm: pblk: export write amplification counters to sysfs Javier González
@ 2018-01-31  2:06 ` Javier González
  2018-01-31  8:44   ` Matias Bjørling
  2018-01-31  2:06 ` [PATCH 5/5] lightnvm: pblk: refactor bad block identification Javier González
  3 siblings, 1 reply; 15+ messages in thread
From: Javier González @ 2018-01-31  2:06 UTC (permalink / raw)
  To: mb; +Cc: linux-block, linux-kernel, Hans Holmberg, Javier González

From: Hans Holmberg <hans.holmberg@cnexlabs.com>

When pblk receives a sync, all data up to that point in the write buffer
must be comitted to persistent storage, and as flash memory comes with a
minimal write size there is a significant cost involved both in terms
of time for completing the sync and in terms of write amplification
padded sectors for filling up to the minimal write size.

In order to get a better understanding of the costs involved for syncs,
Add a sysfs attribute to pblk: padded_dist, showing a normalized
distribution of sectors padded. In order to facilitate measurements of
specific workloads during the lifetime of the pblk instance, the
distribution can be reset by writing 0 to the attribute.

Do this by introducing counters for each possible padding:
{0..(minimal write size - 1)} and calculate the normalized distribution
when showing the attribute.

Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com>
Signed-off-by: Javier González <javier@cnexlabs.com>
---
 drivers/lightnvm/pblk-init.c  | 16 ++++++++--
 drivers/lightnvm/pblk-rb.c    | 15 +++++-----
 drivers/lightnvm/pblk-sysfs.c | 68 +++++++++++++++++++++++++++++++++++++++++++
 drivers/lightnvm/pblk.h       |  6 +++-
 4 files changed, 95 insertions(+), 10 deletions(-)

diff --git a/drivers/lightnvm/pblk-init.c b/drivers/lightnvm/pblk-init.c
index 7eedc5daebc8..a5e3510c0f38 100644
--- a/drivers/lightnvm/pblk-init.c
+++ b/drivers/lightnvm/pblk-init.c
@@ -921,6 +921,7 @@ static void pblk_free(struct pblk *pblk)
 {
 	pblk_luns_free(pblk);
 	pblk_lines_free(pblk);
+	kfree(pblk->pad_dist);
 	pblk_line_meta_free(pblk);
 	pblk_core_free(pblk);
 	pblk_l2p_free(pblk);
@@ -998,11 +999,13 @@ static void *pblk_init(struct nvm_tgt_dev *dev, struct gendisk *tdisk,
 	pblk->pad_rst_wa = 0;
 	pblk->gc_rst_wa = 0;
 
+	atomic_long_set(&pblk->nr_flush, 0);
+	pblk->nr_flush_rst = 0;
+
 #ifdef CONFIG_NVM_DEBUG
 	atomic_long_set(&pblk->inflight_writes, 0);
 	atomic_long_set(&pblk->padded_writes, 0);
 	atomic_long_set(&pblk->padded_wb, 0);
-	atomic_long_set(&pblk->nr_flush, 0);
 	atomic_long_set(&pblk->req_writes, 0);
 	atomic_long_set(&pblk->sub_writes, 0);
 	atomic_long_set(&pblk->sync_writes, 0);
@@ -1034,10 +1037,17 @@ static void *pblk_init(struct nvm_tgt_dev *dev, struct gendisk *tdisk,
 		goto fail_free_luns;
 	}
 
+	pblk->pad_dist = kzalloc((pblk->min_write_pgs - 1) * sizeof(atomic64_t),
+				 GFP_KERNEL);
+	if (!pblk->pad_dist) {
+		ret = -ENOMEM;
+		goto fail_free_line_meta;
+	}
+
 	ret = pblk_core_init(pblk);
 	if (ret) {
 		pr_err("pblk: could not initialize core\n");
-		goto fail_free_line_meta;
+		goto fail_free_pad_dist;
 	}
 
 	ret = pblk_l2p_init(pblk);
@@ -1097,6 +1107,8 @@ static void *pblk_init(struct nvm_tgt_dev *dev, struct gendisk *tdisk,
 	pblk_l2p_free(pblk);
 fail_free_core:
 	pblk_core_free(pblk);
+fail_free_pad_dist:
+	kfree(pblk->pad_dist);
 fail_free_line_meta:
 	pblk_line_meta_free(pblk);
 fail_free_luns:
diff --git a/drivers/lightnvm/pblk-rb.c b/drivers/lightnvm/pblk-rb.c
index 7044b5599cc4..264372d7b537 100644
--- a/drivers/lightnvm/pblk-rb.c
+++ b/drivers/lightnvm/pblk-rb.c
@@ -437,9 +437,7 @@ static int pblk_rb_may_write_flush(struct pblk_rb *rb, unsigned int nr_entries,
 	if (bio->bi_opf & REQ_PREFLUSH) {
 		struct pblk *pblk = container_of(rb, struct pblk, rwb);
 
-#ifdef CONFIG_NVM_DEBUG
 		atomic_long_inc(&pblk->nr_flush);
-#endif
 		if (pblk_rb_flush_point_set(&pblk->rwb, bio, mem))
 			*io_ret = NVM_IO_OK;
 	}
@@ -620,14 +618,17 @@ unsigned int pblk_rb_read_to_bio(struct pblk_rb *rb, struct nvm_rq *rqd,
 			pr_err("pblk: could not pad page in write bio\n");
 			return NVM_IO_ERR;
 		}
+
+		if (pad < pblk->min_write_pgs)
+			atomic64_inc(&pblk->pad_dist[pad - 1]);
+		else
+			pr_warn("pblk: padding more than min. sectors\n");
+
+		atomic64_add(pad, &pblk->pad_wa);
 	}
 
-	atomic64_add(pad, &((struct pblk *)
-			(container_of(rb, struct pblk, rwb)))->pad_wa);
-
 #ifdef CONFIG_NVM_DEBUG
-	atomic_long_add(pad, &((struct pblk *)
-			(container_of(rb, struct pblk, rwb)))->padded_writes);
+	atomic_long_add(pad, &pblk->padded_writes);
 #endif
 
 	return NVM_IO_OK;
diff --git a/drivers/lightnvm/pblk-sysfs.c b/drivers/lightnvm/pblk-sysfs.c
index 4804bbd32d5f..f902ac00d071 100644
--- a/drivers/lightnvm/pblk-sysfs.c
+++ b/drivers/lightnvm/pblk-sysfs.c
@@ -341,6 +341,38 @@ static ssize_t pblk_sysfs_get_write_amp_trip(struct pblk *pblk, char *page)
 		atomic64_read(&pblk->pad_wa) - pblk->pad_rst_wa, page);
 }
 
+static ssize_t pblk_sysfs_get_padding_dist(struct pblk *pblk, char *page)
+{
+	int sz = 0;
+	unsigned long long total;
+	unsigned long long total_buckets = 0;
+	int buckets = pblk->min_write_pgs - 1;
+	int i;
+
+	total = atomic64_read(&pblk->nr_flush) - pblk->nr_flush_rst;
+
+	for (i = 0; i < buckets; i++)
+		total_buckets += atomic64_read(&pblk->pad_dist[i]);
+
+	if (!total) {
+		for (i = 0; i < (buckets + 1); i++)
+			sz += snprintf(page + sz, PAGE_SIZE - sz,
+				"%d:0 ", i);
+		sz += snprintf(page + sz, PAGE_SIZE - sz, "\n");
+
+		return sz;
+	}
+
+	sz += snprintf(page + sz, PAGE_SIZE - sz, "0:%lld%% ",
+		((total - total_buckets) * 100) / total);
+	for (i = 0; i < buckets; i++)
+		sz += snprintf(page + sz, PAGE_SIZE - sz, "%d:%lld%% ", i + 1,
+			(atomic64_read(&pblk->pad_dist[i]) * 100) / total);
+	sz += snprintf(page + sz, PAGE_SIZE - sz, "\n");
+
+	return sz;
+}
+
 #ifdef CONFIG_NVM_DEBUG
 static ssize_t pblk_sysfs_stats_debug(struct pblk *pblk, char *page)
 {
@@ -427,6 +459,32 @@ static ssize_t pblk_sysfs_set_write_amp_trip(struct pblk *pblk,
 }
 
 
+static ssize_t pblk_sysfs_set_padding_dist(struct pblk *pblk,
+			const char *page, size_t len)
+{
+	size_t c_len;
+	int reset_value;
+	int buckets = pblk->min_write_pgs - 1;
+	int i;
+
+	c_len = strcspn(page, "\n");
+	if (c_len >= len)
+		return -EINVAL;
+
+	if (kstrtouint(page, 0, &reset_value))
+		return -EINVAL;
+
+	if (reset_value !=  0)
+		return -EINVAL;
+
+	for (i = 0; i < buckets; i++)
+		atomic64_set(&pblk->pad_dist[i], 0);
+
+	pblk->nr_flush_rst = atomic64_read(&pblk->nr_flush);
+
+	return len;
+}
+
 static struct attribute sys_write_luns = {
 	.name = "write_luns",
 	.mode = 0444,
@@ -487,6 +545,11 @@ static struct attribute sys_write_amp_trip = {
 	.mode = 0644,
 };
 
+static struct attribute sys_padding_dist = {
+	.name = "padding_dist",
+	.mode = 0644,
+};
+
 #ifdef CONFIG_NVM_DEBUG
 static struct attribute sys_stats_debug_attr = {
 	.name = "stats",
@@ -507,6 +570,7 @@ static struct attribute *pblk_attrs[] = {
 	&sys_lines_info_attr,
 	&sys_write_amp_mileage,
 	&sys_write_amp_trip,
+	&sys_padding_dist,
 #ifdef CONFIG_NVM_DEBUG
 	&sys_stats_debug_attr,
 #endif
@@ -540,6 +604,8 @@ static ssize_t pblk_sysfs_show(struct kobject *kobj, struct attribute *attr,
 		return pblk_sysfs_get_write_amp_mileage(pblk, buf);
 	else if (strcmp(attr->name, "write_amp_trip") == 0)
 		return pblk_sysfs_get_write_amp_trip(pblk, buf);
+	else if (strcmp(attr->name, "padding_dist") == 0)
+		return pblk_sysfs_get_padding_dist(pblk, buf);
 #ifdef CONFIG_NVM_DEBUG
 	else if (strcmp(attr->name, "stats") == 0)
 		return pblk_sysfs_stats_debug(pblk, buf);
@@ -558,6 +624,8 @@ static ssize_t pblk_sysfs_store(struct kobject *kobj, struct attribute *attr,
 		return pblk_sysfs_set_sec_per_write(pblk, buf, len);
 	else if (strcmp(attr->name, "write_amp_trip") == 0)
 		return pblk_sysfs_set_write_amp_trip(pblk, buf, len);
+	else if (strcmp(attr->name, "padding_dist") == 0)
+		return pblk_sysfs_set_padding_dist(pblk, buf, len);
 	return 0;
 }
 
diff --git a/drivers/lightnvm/pblk.h b/drivers/lightnvm/pblk.h
index 4b7d8618631f..88720e2441c0 100644
--- a/drivers/lightnvm/pblk.h
+++ b/drivers/lightnvm/pblk.h
@@ -626,12 +626,16 @@ struct pblk {
 	u64 gc_rst_wa;
 	u64 pad_rst_wa;
 
+	/* Counters used for calculating padding distribution */
+	atomic64_t *pad_dist;		/* Padding distribution buckets */
+	u64 nr_flush_rst;		/* Flushes reset value for pad dist.*/
+	atomic_long_t nr_flush;		/* Number of flush/fua I/O */
+
 #ifdef CONFIG_NVM_DEBUG
 	/* Non-persistent debug counters, 4kb sector I/Os */
 	atomic_long_t inflight_writes;	/* Inflight writes (user and gc) */
 	atomic_long_t padded_writes;	/* Sectors padded due to flush/fua */
 	atomic_long_t padded_wb;	/* Sectors padded in write buffer */
-	atomic_long_t nr_flush;		/* Number of flush/fua I/O */
 	atomic_long_t req_writes;	/* Sectors stored on write buffer */
 	atomic_long_t sub_writes;	/* Sectors submitted from buffer */
 	atomic_long_t sync_writes;	/* Sectors synced to media */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 5/5] lightnvm: pblk: refactor bad block identification
  2018-01-31  2:06 [PATCH 1/5] lightnvm: pblk: handle bad sectors in the emeta area correctly Javier González
                   ` (2 preceding siblings ...)
  2018-01-31  2:06 ` [PATCH 4/5] lightnvm: pblk: add padding distribution sysfs attribute Javier González
@ 2018-01-31  2:06 ` Javier González
  2018-01-31  8:51   ` Matias Bjørling
  3 siblings, 1 reply; 15+ messages in thread
From: Javier González @ 2018-01-31  2:06 UTC (permalink / raw)
  To: mb; +Cc: linux-block, linux-kernel, Javier González

In preparation for the OCSSD 2.0 spec. bad block identification,
refactor the current code to generalize bad block get/set functions and
structures.

Signed-off-by: Javier González <javier@cnexlabs.com>
---
 drivers/lightnvm/pblk-init.c | 213 +++++++++++++++++++++++--------------------
 drivers/lightnvm/pblk.h      |   6 --
 2 files changed, 112 insertions(+), 107 deletions(-)

diff --git a/drivers/lightnvm/pblk-init.c b/drivers/lightnvm/pblk-init.c
index a5e3510c0f38..86a94a7faa96 100644
--- a/drivers/lightnvm/pblk-init.c
+++ b/drivers/lightnvm/pblk-init.c
@@ -365,7 +365,25 @@ static void pblk_luns_free(struct pblk *pblk)
 	kfree(pblk->luns);
 }
 
-static void pblk_free_line_bitmaps(struct pblk_line *line)
+static void pblk_line_mg_free(struct pblk *pblk)
+{
+	struct pblk_line_mgmt *l_mg = &pblk->l_mg;
+	int i;
+
+	kfree(l_mg->bb_template);
+	kfree(l_mg->bb_aux);
+	kfree(l_mg->vsc_list);
+
+	for (i = 0; i < PBLK_DATA_LINES; i++) {
+		kfree(l_mg->sline_meta[i]);
+		pblk_mfree(l_mg->eline_meta[i]->buf, l_mg->emeta_alloc_type);
+		kfree(l_mg->eline_meta[i]);
+	}
+
+	kfree(pblk->lines);
+}
+
+static void pblk_line_meta_free(struct pblk_line *line)
 {
 	kfree(line->blk_bitmap);
 	kfree(line->erase_bitmap);
@@ -382,40 +400,16 @@ static void pblk_lines_free(struct pblk *pblk)
 		line = &pblk->lines[i];
 
 		pblk_line_free(pblk, line);
-		pblk_free_line_bitmaps(line);
+		pblk_line_meta_free(line);
 	}
 	spin_unlock(&l_mg->free_lock);
 }
 
-static void pblk_line_meta_free(struct pblk *pblk)
+static int pblk_bb_get_tbl(struct nvm_tgt_dev *dev, struct pblk_lun *rlun,
+			   u8 *blks, int nr_blks)
 {
-	struct pblk_line_mgmt *l_mg = &pblk->l_mg;
-	int i;
-
-	kfree(l_mg->bb_template);
-	kfree(l_mg->bb_aux);
-	kfree(l_mg->vsc_list);
-
-	for (i = 0; i < PBLK_DATA_LINES; i++) {
-		kfree(l_mg->sline_meta[i]);
-		pblk_mfree(l_mg->eline_meta[i]->buf, l_mg->emeta_alloc_type);
-		kfree(l_mg->eline_meta[i]);
-	}
-
-	kfree(pblk->lines);
-}
-
-static int pblk_bb_discovery(struct nvm_tgt_dev *dev, struct pblk_lun *rlun)
-{
-	struct nvm_geo *geo = &dev->geo;
 	struct ppa_addr ppa;
-	u8 *blks;
-	int nr_blks, ret;
-
-	nr_blks = geo->nr_chks * geo->plane_mode;
-	blks = kmalloc(nr_blks, GFP_KERNEL);
-	if (!blks)
-		return -ENOMEM;
+	int ret;
 
 	ppa.ppa = 0;
 	ppa.g.ch = rlun->bppa.g.ch;
@@ -423,34 +417,56 @@ static int pblk_bb_discovery(struct nvm_tgt_dev *dev, struct pblk_lun *rlun)
 
 	ret = nvm_get_tgt_bb_tbl(dev, ppa, blks);
 	if (ret)
-		goto out;
+		return ret;
 
 	nr_blks = nvm_bb_tbl_fold(dev->parent, blks, nr_blks);
-	if (nr_blks < 0) {
-		ret = nr_blks;
-		goto out;
-	}
-
-	rlun->bb_list = blks;
+	if (nr_blks < 0)
+		return -EIO;
 
 	return 0;
-out:
-	kfree(blks);
-	return ret;
+}
+
+static void *pblk_bb_get_log(struct pblk *pblk)
+{
+	struct nvm_tgt_dev *dev = pblk->dev;
+	struct nvm_geo *geo = &dev->geo;
+	u8 *log;
+	int i, nr_blks, blk_per_lun;
+	int ret;
+
+	blk_per_lun = geo->nr_chks * geo->plane_mode;
+	nr_blks = blk_per_lun * geo->all_luns;
+
+	log = kmalloc(nr_blks, GFP_KERNEL);
+	if (!log)
+		return ERR_PTR(-ENOMEM);
+
+	for (i = 0; i < geo->all_luns; i++) {
+		struct pblk_lun *rlun = &pblk->luns[i];
+		u8 *log_pos = log + i * blk_per_lun;
+
+		ret = pblk_bb_get_tbl(dev, rlun, log_pos, blk_per_lun);
+		if (ret) {
+			kfree(log);
+			return ERR_PTR(-EIO);
+		}
+	}
+
+	return log;
 }
 
 static int pblk_bb_line(struct pblk *pblk, struct pblk_line *line,
-			int blk_per_line)
+			u8 *bb_log, int blk_per_line)
 {
 	struct nvm_tgt_dev *dev = pblk->dev;
 	struct nvm_geo *geo = &dev->geo;
-	struct pblk_lun *rlun;
-	int bb_cnt = 0;
-	int i;
+	int i, bb_cnt = 0;
 
 	for (i = 0; i < blk_per_line; i++) {
-		rlun = &pblk->luns[i];
-		if (rlun->bb_list[line->id] == NVM_BLK_T_FREE)
+		struct pblk_lun *rlun = &pblk->luns[i];
+		u8 *lun_bb_log = bb_log + i * blk_per_line;
+
+		if (lun_bb_log[line->id] == NVM_BLK_T_FREE)
 			continue;
 
 		set_bit(pblk_ppa_to_pos(geo, rlun->bppa), line->blk_bitmap);
@@ -460,29 +476,12 @@ static int pblk_bb_line(struct pblk *pblk, struct pblk_line *line,
 	return bb_cnt;
 }
 
-static int pblk_alloc_line_bitmaps(struct pblk *pblk, struct pblk_line *line)
-{
-	struct pblk_line_meta *lm = &pblk->lm;
-
-	line->blk_bitmap = kzalloc(lm->blk_bitmap_len, GFP_KERNEL);
-	if (!line->blk_bitmap)
-		return -ENOMEM;
-
-	line->erase_bitmap = kzalloc(lm->blk_bitmap_len, GFP_KERNEL);
-	if (!line->erase_bitmap) {
-		kfree(line->blk_bitmap);
-		return -ENOMEM;
-	}
-
-	return 0;
-}
-
 static int pblk_luns_init(struct pblk *pblk, struct ppa_addr *luns)
 {
 	struct nvm_tgt_dev *dev = pblk->dev;
 	struct nvm_geo *geo = &dev->geo;
 	struct pblk_lun *rlun;
-	int i, ret;
+	int i;
 
 	/* TODO: Implement unbalanced LUN support */
 	if (geo->nr_luns < 0) {
@@ -505,13 +504,6 @@ static int pblk_luns_init(struct pblk *pblk, struct ppa_addr *luns)
 		rlun->bppa = luns[lunid];
 
 		sema_init(&rlun->wr_sem, 1);
-
-		ret = pblk_bb_discovery(dev, rlun);
-		if (ret) {
-			while (--i >= 0)
-				kfree(pblk->luns[i].bb_list);
-			return ret;
-		}
 	}
 
 	return 0;
@@ -689,6 +681,26 @@ static int pblk_lines_alloc_metadata(struct pblk *pblk)
 	return -ENOMEM;
 }
 
+static int pblk_setup_line_meta(struct pblk *pblk, struct pblk_line *line,
+				void *chunk_log, long *nr_bad_blks)
+{
+	struct pblk_line_meta *lm = &pblk->lm;
+
+	line->blk_bitmap = kzalloc(lm->blk_bitmap_len, GFP_KERNEL);
+	if (!line->blk_bitmap)
+		return -ENOMEM;
+
+	line->erase_bitmap = kzalloc(lm->blk_bitmap_len, GFP_KERNEL);
+	if (!line->erase_bitmap) {
+		kfree(line->blk_bitmap);
+		return -ENOMEM;
+	}
+
+	*nr_bad_blks = pblk_bb_line(pblk, line, chunk_log, lm->blk_per_line);
+
+	return 0;
+}
+
 static int pblk_lines_init(struct pblk *pblk)
 {
 	struct nvm_tgt_dev *dev = pblk->dev;
@@ -696,8 +708,9 @@ static int pblk_lines_init(struct pblk *pblk)
 	struct pblk_line_mgmt *l_mg = &pblk->l_mg;
 	struct pblk_line_meta *lm = &pblk->lm;
 	struct pblk_line *line;
+	void *chunk_log;
 	unsigned int smeta_len, emeta_len;
-	long nr_bad_blks, nr_free_blks;
+	long nr_bad_blks = 0, nr_free_blks = 0;
 	int bb_distance, max_write_ppas, mod;
 	int i, ret;
 
@@ -771,13 +784,12 @@ static int pblk_lines_init(struct pblk *pblk)
 	if (lm->min_blk_line > lm->blk_per_line) {
 		pr_err("pblk: config. not supported. Min. LUN in line:%d\n",
 							lm->blk_per_line);
-		ret = -EINVAL;
-		goto fail;
+		return -EINVAL;
 	}
 
 	ret = pblk_lines_alloc_metadata(pblk);
 	if (ret)
-		goto fail;
+		return ret;
 
 	l_mg->bb_template = kzalloc(lm->sec_bitmap_len, GFP_KERNEL);
 	if (!l_mg->bb_template) {
@@ -821,9 +833,16 @@ static int pblk_lines_init(struct pblk *pblk)
 		goto fail_free_bb_aux;
 	}
 
-	nr_free_blks = 0;
+	chunk_log = pblk_bb_get_log(pblk);
+	if (IS_ERR(chunk_log)) {
+		pr_err("pblk: could not get bad block log (%lu)\n",
+							PTR_ERR(chunk_log));
+		ret = PTR_ERR(chunk_log);
+		goto fail_free_lines;
+	}
+
 	for (i = 0; i < l_mg->nr_lines; i++) {
-		int blk_in_line;
+		int chk_in_line;
 
 		line = &pblk->lines[i];
 
@@ -835,26 +854,20 @@ static int pblk_lines_init(struct pblk *pblk)
 		line->vsc = &l_mg->vsc_list[i];
 		spin_lock_init(&line->lock);
 
-		ret = pblk_alloc_line_bitmaps(pblk, line);
+		ret = pblk_setup_line_meta(pblk, line, chunk_log, &nr_bad_blks);
 		if (ret)
-			goto fail_free_lines;
+			goto fail_free_chunk_log;
 
-		nr_bad_blks = pblk_bb_line(pblk, line, lm->blk_per_line);
-		if (nr_bad_blks < 0 || nr_bad_blks > lm->blk_per_line) {
-			pblk_free_line_bitmaps(line);
-			ret = -EINVAL;
-			goto fail_free_lines;
-		}
-
-		blk_in_line = lm->blk_per_line - nr_bad_blks;
-		if (blk_in_line < lm->min_blk_line) {
+		chk_in_line = lm->blk_per_line - nr_bad_blks;
+		if (nr_bad_blks < 0 || nr_bad_blks > lm->blk_per_line ||
+					chk_in_line < lm->min_blk_line) {
 			line->state = PBLK_LINESTATE_BAD;
 			list_add_tail(&line->list, &l_mg->bad_list);
 			continue;
 		}
 
-		nr_free_blks += blk_in_line;
-		atomic_set(&line->blk_in_line, blk_in_line);
+		nr_free_blks += chk_in_line;
+		atomic_set(&line->blk_in_line, chk_in_line);
 
 		l_mg->nr_free_lines++;
 		list_add_tail(&line->list, &l_mg->free_list);
@@ -862,23 +875,21 @@ static int pblk_lines_init(struct pblk *pblk)
 
 	pblk_set_provision(pblk, nr_free_blks);
 
-	/* Cleanup per-LUN bad block lists - managed within lines on run-time */
-	for (i = 0; i < geo->all_luns; i++)
-		kfree(pblk->luns[i].bb_list);
-
+	kfree(chunk_log);
 	return 0;
-fail_free_lines:
+
+fail_free_chunk_log:
+	kfree(chunk_log);
 	while (--i >= 0)
-		pblk_free_line_bitmaps(&pblk->lines[i]);
+		pblk_line_meta_free(&pblk->lines[i]);
+fail_free_lines:
+	kfree(pblk->lines);
 fail_free_bb_aux:
 	kfree(l_mg->bb_aux);
 fail_free_bb_template:
 	kfree(l_mg->bb_template);
 fail_free_meta:
-	pblk_line_meta_free(pblk);
-fail:
-	for (i = 0; i < geo->all_luns; i++)
-		kfree(pblk->luns[i].bb_list);
+	pblk_line_mg_free(pblk);
 
 	return ret;
 }
@@ -922,7 +933,7 @@ static void pblk_free(struct pblk *pblk)
 	pblk_luns_free(pblk);
 	pblk_lines_free(pblk);
 	kfree(pblk->pad_dist);
-	pblk_line_meta_free(pblk);
+	pblk_line_mg_free(pblk);
 	pblk_core_free(pblk);
 	pblk_l2p_free(pblk);
 
@@ -1110,7 +1121,7 @@ static void *pblk_init(struct nvm_tgt_dev *dev, struct gendisk *tdisk,
 fail_free_pad_dist:
 	kfree(pblk->pad_dist);
 fail_free_line_meta:
-	pblk_line_meta_free(pblk);
+	pblk_line_mg_free(pblk);
 fail_free_luns:
 	pblk_luns_free(pblk);
 fail:
diff --git a/drivers/lightnvm/pblk.h b/drivers/lightnvm/pblk.h
index 88720e2441c0..282dfc8780e8 100644
--- a/drivers/lightnvm/pblk.h
+++ b/drivers/lightnvm/pblk.h
@@ -201,12 +201,6 @@ struct pblk_rb {
 
 struct pblk_lun {
 	struct ppa_addr bppa;
-
-	u8 *bb_list;			/* Bad block list for LUN. Only used on
-					 * bring up. Bad blocks are managed
-					 * within lines on run-time.
-					 */
-
 	struct semaphore wr_sem;
 };
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/5] lightnvm: pblk: add padding distribution sysfs attribute
  2018-01-31  2:06 ` [PATCH 4/5] lightnvm: pblk: add padding distribution sysfs attribute Javier González
@ 2018-01-31  8:44   ` Matias Bjørling
  2018-02-06  9:27     ` Hans Holmberg
  0 siblings, 1 reply; 15+ messages in thread
From: Matias Bjørling @ 2018-01-31  8:44 UTC (permalink / raw)
  To: Javier González
  Cc: linux-block, linux-kernel, Hans Holmberg, Javier González

On 01/31/2018 03:06 AM, Javier González wrote:
> From: Hans Holmberg <hans.holmberg@cnexlabs.com>
> 
> When pblk receives a sync, all data up to that point in the write buffer
> must be comitted to persistent storage, and as flash memory comes with a
> minimal write size there is a significant cost involved both in terms
> of time for completing the sync and in terms of write amplification
> padded sectors for filling up to the minimal write size.
> 
> In order to get a better understanding of the costs involved for syncs,
> Add a sysfs attribute to pblk: padded_dist, showing a normalized
> distribution of sectors padded. In order to facilitate measurements of
> specific workloads during the lifetime of the pblk instance, the
> distribution can be reset by writing 0 to the attribute.
> 
> Do this by introducing counters for each possible padding:
> {0..(minimal write size - 1)} and calculate the normalized distribution
> when showing the attribute.
> 
> Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com>
> Signed-off-by: Javier González <javier@cnexlabs.com>
> ---
>   drivers/lightnvm/pblk-init.c  | 16 ++++++++--
>   drivers/lightnvm/pblk-rb.c    | 15 +++++-----
>   drivers/lightnvm/pblk-sysfs.c | 68 +++++++++++++++++++++++++++++++++++++++++++
>   drivers/lightnvm/pblk.h       |  6 +++-
>   4 files changed, 95 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/lightnvm/pblk-init.c b/drivers/lightnvm/pblk-init.c
> index 7eedc5daebc8..a5e3510c0f38 100644
> --- a/drivers/lightnvm/pblk-init.c
> +++ b/drivers/lightnvm/pblk-init.c
> @@ -921,6 +921,7 @@ static void pblk_free(struct pblk *pblk)
>   {
>   	pblk_luns_free(pblk);
>   	pblk_lines_free(pblk);
> +	kfree(pblk->pad_dist);
>   	pblk_line_meta_free(pblk);
>   	pblk_core_free(pblk);
>   	pblk_l2p_free(pblk);
> @@ -998,11 +999,13 @@ static void *pblk_init(struct nvm_tgt_dev *dev, struct gendisk *tdisk,
>   	pblk->pad_rst_wa = 0;
>   	pblk->gc_rst_wa = 0;
>   
> +	atomic_long_set(&pblk->nr_flush, 0);
> +	pblk->nr_flush_rst = 0;
> +
>   #ifdef CONFIG_NVM_DEBUG
>   	atomic_long_set(&pblk->inflight_writes, 0);
>   	atomic_long_set(&pblk->padded_writes, 0);
>   	atomic_long_set(&pblk->padded_wb, 0);
> -	atomic_long_set(&pblk->nr_flush, 0);
>   	atomic_long_set(&pblk->req_writes, 0);
>   	atomic_long_set(&pblk->sub_writes, 0);
>   	atomic_long_set(&pblk->sync_writes, 0);
> @@ -1034,10 +1037,17 @@ static void *pblk_init(struct nvm_tgt_dev *dev, struct gendisk *tdisk,
>   		goto fail_free_luns;
>   	}
>   
> +	pblk->pad_dist = kzalloc((pblk->min_write_pgs - 1) * sizeof(atomic64_t),
> +				 GFP_KERNEL);
> +	if (!pblk->pad_dist) {
> +		ret = -ENOMEM;
> +		goto fail_free_line_meta;
> +	}
> +
>   	ret = pblk_core_init(pblk);
>   	if (ret) {
>   		pr_err("pblk: could not initialize core\n");
> -		goto fail_free_line_meta;
> +		goto fail_free_pad_dist;
>   	}
>   
>   	ret = pblk_l2p_init(pblk);
> @@ -1097,6 +1107,8 @@ static void *pblk_init(struct nvm_tgt_dev *dev, struct gendisk *tdisk,
>   	pblk_l2p_free(pblk);
>   fail_free_core:
>   	pblk_core_free(pblk);
> +fail_free_pad_dist:
> +	kfree(pblk->pad_dist);
>   fail_free_line_meta:
>   	pblk_line_meta_free(pblk);
>   fail_free_luns:
> diff --git a/drivers/lightnvm/pblk-rb.c b/drivers/lightnvm/pblk-rb.c
> index 7044b5599cc4..264372d7b537 100644
> --- a/drivers/lightnvm/pblk-rb.c
> +++ b/drivers/lightnvm/pblk-rb.c
> @@ -437,9 +437,7 @@ static int pblk_rb_may_write_flush(struct pblk_rb *rb, unsigned int nr_entries,
>   	if (bio->bi_opf & REQ_PREFLUSH) {
>   		struct pblk *pblk = container_of(rb, struct pblk, rwb);
>   
> -#ifdef CONFIG_NVM_DEBUG
>   		atomic_long_inc(&pblk->nr_flush);
> -#endif
>   		if (pblk_rb_flush_point_set(&pblk->rwb, bio, mem))
>   			*io_ret = NVM_IO_OK;
>   	}
> @@ -620,14 +618,17 @@ unsigned int pblk_rb_read_to_bio(struct pblk_rb *rb, struct nvm_rq *rqd,
>   			pr_err("pblk: could not pad page in write bio\n");
>   			return NVM_IO_ERR;
>   		}
> +
> +		if (pad < pblk->min_write_pgs)
> +			atomic64_inc(&pblk->pad_dist[pad - 1]);
> +		else
> +			pr_warn("pblk: padding more than min. sectors\n");

Curious, when would this happen? Would it be an implementation error or 
something a user did wrong?

> +
> +		atomic64_add(pad, &pblk->pad_wa);
>   	}
>   
> -	atomic64_add(pad, &((struct pblk *)
> -			(container_of(rb, struct pblk, rwb)))->pad_wa);
> -
>   #ifdef CONFIG_NVM_DEBUG
> -	atomic_long_add(pad, &((struct pblk *)
> -			(container_of(rb, struct pblk, rwb)))->padded_writes);
> +	atomic_long_add(pad, &pblk->padded_writes);
>   #endif
>   
>   	return NVM_IO_OK;
> diff --git a/drivers/lightnvm/pblk-sysfs.c b/drivers/lightnvm/pblk-sysfs.c
> index 4804bbd32d5f..f902ac00d071 100644
> --- a/drivers/lightnvm/pblk-sysfs.c
> +++ b/drivers/lightnvm/pblk-sysfs.c
> @@ -341,6 +341,38 @@ static ssize_t pblk_sysfs_get_write_amp_trip(struct pblk *pblk, char *page)
>   		atomic64_read(&pblk->pad_wa) - pblk->pad_rst_wa, page);
>   }
>   
> +static ssize_t pblk_sysfs_get_padding_dist(struct pblk *pblk, char *page)
> +{
> +	int sz = 0;
> +	unsigned long long total;
> +	unsigned long long total_buckets = 0;
> +	int buckets = pblk->min_write_pgs - 1;
> +	int i;
> +
> +	total = atomic64_read(&pblk->nr_flush) - pblk->nr_flush_rst;
> +
> +	for (i = 0; i < buckets; i++)
> +		total_buckets += atomic64_read(&pblk->pad_dist[i]);
> +

total_buckets are first used later. The calculation could be moved below 
the next statement.

> +	if (!total) {
> +		for (i = 0; i < (buckets + 1); i++)
> +			sz += snprintf(page + sz, PAGE_SIZE - sz,
> +				"%d:0 ", i);
> +		sz += snprintf(page + sz, PAGE_SIZE - sz, "\n");
> +
> +		return sz;
> +	}
> +
> +	sz += snprintf(page + sz, PAGE_SIZE - sz, "0:%lld%% ",
> +		((total - total_buckets) * 100) / total);
> +	for (i = 0; i < buckets; i++)
> +		sz += snprintf(page + sz, PAGE_SIZE - sz, "%d:%lld%% ", i + 1,
> +			(atomic64_read(&pblk->pad_dist[i]) * 100) / total);
> +	sz += snprintf(page + sz, PAGE_SIZE - sz, "\n");
> +
> +	return sz;
> +}
> +
>   #ifdef CONFIG_NVM_DEBUG
>   static ssize_t pblk_sysfs_stats_debug(struct pblk *pblk, char *page)
>   {
> @@ -427,6 +459,32 @@ static ssize_t pblk_sysfs_set_write_amp_trip(struct pblk *pblk,
>   }
>   
>   
> +static ssize_t pblk_sysfs_set_padding_dist(struct pblk *pblk,
> +			const char *page, size_t len)
> +{
> +	size_t c_len;
> +	int reset_value;
> +	int buckets = pblk->min_write_pgs - 1;
> +	int i;
> +
> +	c_len = strcspn(page, "\n");
> +	if (c_len >= len)
> +		return -EINVAL;
> +
> +	if (kstrtouint(page, 0, &reset_value))
> +		return -EINVAL;
> +
> +	if (reset_value !=  0)
> +		return -EINVAL;
> +
> +	for (i = 0; i < buckets; i++)
> +		atomic64_set(&pblk->pad_dist[i], 0);
> +
> +	pblk->nr_flush_rst = atomic64_read(&pblk->nr_flush);
> +
> +	return len;
> +}
> +
>   static struct attribute sys_write_luns = {
>   	.name = "write_luns",
>   	.mode = 0444,
> @@ -487,6 +545,11 @@ static struct attribute sys_write_amp_trip = {
>   	.mode = 0644,
>   };
>   
> +static struct attribute sys_padding_dist = {
> +	.name = "padding_dist",
> +	.mode = 0644,
> +};
> +
>   #ifdef CONFIG_NVM_DEBUG
>   static struct attribute sys_stats_debug_attr = {
>   	.name = "stats",
> @@ -507,6 +570,7 @@ static struct attribute *pblk_attrs[] = {
>   	&sys_lines_info_attr,
>   	&sys_write_amp_mileage,
>   	&sys_write_amp_trip,
> +	&sys_padding_dist,
>   #ifdef CONFIG_NVM_DEBUG
>   	&sys_stats_debug_attr,
>   #endif
> @@ -540,6 +604,8 @@ static ssize_t pblk_sysfs_show(struct kobject *kobj, struct attribute *attr,
>   		return pblk_sysfs_get_write_amp_mileage(pblk, buf);
>   	else if (strcmp(attr->name, "write_amp_trip") == 0)
>   		return pblk_sysfs_get_write_amp_trip(pblk, buf);
> +	else if (strcmp(attr->name, "padding_dist") == 0)
> +		return pblk_sysfs_get_padding_dist(pblk, buf);
>   #ifdef CONFIG_NVM_DEBUG
>   	else if (strcmp(attr->name, "stats") == 0)
>   		return pblk_sysfs_stats_debug(pblk, buf);
> @@ -558,6 +624,8 @@ static ssize_t pblk_sysfs_store(struct kobject *kobj, struct attribute *attr,
>   		return pblk_sysfs_set_sec_per_write(pblk, buf, len);
>   	else if (strcmp(attr->name, "write_amp_trip") == 0)
>   		return pblk_sysfs_set_write_amp_trip(pblk, buf, len);
> +	else if (strcmp(attr->name, "padding_dist") == 0)
> +		return pblk_sysfs_set_padding_dist(pblk, buf, len);
>   	return 0;
>   }
>   
> diff --git a/drivers/lightnvm/pblk.h b/drivers/lightnvm/pblk.h
> index 4b7d8618631f..88720e2441c0 100644
> --- a/drivers/lightnvm/pblk.h
> +++ b/drivers/lightnvm/pblk.h
> @@ -626,12 +626,16 @@ struct pblk {
>   	u64 gc_rst_wa;
>   	u64 pad_rst_wa;
>   
> +	/* Counters used for calculating padding distribution */
> +	atomic64_t *pad_dist;		/* Padding distribution buckets */
> +	u64 nr_flush_rst;		/* Flushes reset value for pad dist.*/
> +	atomic_long_t nr_flush;		/* Number of flush/fua I/O */
> +
>   #ifdef CONFIG_NVM_DEBUG
>   	/* Non-persistent debug counters, 4kb sector I/Os */
>   	atomic_long_t inflight_writes;	/* Inflight writes (user and gc) */
>   	atomic_long_t padded_writes;	/* Sectors padded due to flush/fua */
>   	atomic_long_t padded_wb;	/* Sectors padded in write buffer */
> -	atomic_long_t nr_flush;		/* Number of flush/fua I/O */
>   	atomic_long_t req_writes;	/* Sectors stored on write buffer */
>   	atomic_long_t sub_writes;	/* Sectors submitted from buffer */
>   	atomic_long_t sync_writes;	/* Sectors synced to media */
> 

Looks good to me. Let me know if you want to send a new patch, or if I 
may make the change when I pick it up.

Also, should the padding be stored in the on-disk metadata as well?

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 5/5] lightnvm: pblk: refactor bad block identification
  2018-01-31  2:06 ` [PATCH 5/5] lightnvm: pblk: refactor bad block identification Javier González
@ 2018-01-31  8:51   ` Matias Bjørling
  2018-01-31  9:13     ` Javier Gonzalez
  0 siblings, 1 reply; 15+ messages in thread
From: Matias Bjørling @ 2018-01-31  8:51 UTC (permalink / raw)
  To: Javier González; +Cc: linux-block, linux-kernel, Javier González

On 01/31/2018 03:06 AM, Javier González wrote:
> In preparation for the OCSSD 2.0 spec. bad block identification,
> refactor the current code to generalize bad block get/set functions and
> structures.
> 
> Signed-off-by: Javier González <javier@cnexlabs.com>
> ---
>   drivers/lightnvm/pblk-init.c | 213 +++++++++++++++++++++++--------------------
>   drivers/lightnvm/pblk.h      |   6 --
>   2 files changed, 112 insertions(+), 107 deletions(-)
> 
> diff --git a/drivers/lightnvm/pblk-init.c b/drivers/lightnvm/pblk-init.c
> index a5e3510c0f38..86a94a7faa96 100644
> --- a/drivers/lightnvm/pblk-init.c
> +++ b/drivers/lightnvm/pblk-init.c
> @@ -365,7 +365,25 @@ static void pblk_luns_free(struct pblk *pblk)
>   	kfree(pblk->luns);
>   }
>   
> -static void pblk_free_line_bitmaps(struct pblk_line *line)
> +static void pblk_line_mg_free(struct pblk *pblk)
> +{
> +	struct pblk_line_mgmt *l_mg = &pblk->l_mg;
> +	int i;
> +
> +	kfree(l_mg->bb_template);
> +	kfree(l_mg->bb_aux);
> +	kfree(l_mg->vsc_list);
> +
> +	for (i = 0; i < PBLK_DATA_LINES; i++) {
> +		kfree(l_mg->sline_meta[i]);
> +		pblk_mfree(l_mg->eline_meta[i]->buf, l_mg->emeta_alloc_type);
> +		kfree(l_mg->eline_meta[i]);
> +	}
> +
> +	kfree(pblk->lines);
> +}
> +
> +static void pblk_line_meta_free(struct pblk_line *line)
>   {
>   	kfree(line->blk_bitmap);
>   	kfree(line->erase_bitmap);
> @@ -382,40 +400,16 @@ static void pblk_lines_free(struct pblk *pblk)
>   		line = &pblk->lines[i];
>   
>   		pblk_line_free(pblk, line);
> -		pblk_free_line_bitmaps(line);
> +		pblk_line_meta_free(line);
>   	}
>   	spin_unlock(&l_mg->free_lock);
>   }
>   
> -static void pblk_line_meta_free(struct pblk *pblk)
> +static int pblk_bb_get_tbl(struct nvm_tgt_dev *dev, struct pblk_lun *rlun,
> +			   u8 *blks, int nr_blks)
>   {
> -	struct pblk_line_mgmt *l_mg = &pblk->l_mg;
> -	int i;
> -
> -	kfree(l_mg->bb_template);
> -	kfree(l_mg->bb_aux);
> -	kfree(l_mg->vsc_list);
> -
> -	for (i = 0; i < PBLK_DATA_LINES; i++) {
> -		kfree(l_mg->sline_meta[i]);
> -		pblk_mfree(l_mg->eline_meta[i]->buf, l_mg->emeta_alloc_type);
> -		kfree(l_mg->eline_meta[i]);
> -	}
> -
> -	kfree(pblk->lines);
> -}
> -
> -static int pblk_bb_discovery(struct nvm_tgt_dev *dev, struct pblk_lun *rlun)
> -{
> -	struct nvm_geo *geo = &dev->geo;
>   	struct ppa_addr ppa;
> -	u8 *blks;
> -	int nr_blks, ret;
> -
> -	nr_blks = geo->nr_chks * geo->plane_mode;
> -	blks = kmalloc(nr_blks, GFP_KERNEL);
> -	if (!blks)
> -		return -ENOMEM;
> +	int ret;
>   
>   	ppa.ppa = 0;
>   	ppa.g.ch = rlun->bppa.g.ch;
> @@ -423,34 +417,56 @@ static int pblk_bb_discovery(struct nvm_tgt_dev *dev, struct pblk_lun *rlun)
>   
>   	ret = nvm_get_tgt_bb_tbl(dev, ppa, blks);
>   	if (ret)
> -		goto out;
> +		return ret;
>   
>   	nr_blks = nvm_bb_tbl_fold(dev->parent, blks, nr_blks);
> -	if (nr_blks < 0) {
> -		ret = nr_blks;
> -		goto out;
> -	}
> -
> -	rlun->bb_list = blks;
> +	if (nr_blks < 0)
> +		return -EIO;
>   
>   	return 0;
> -out:
> -	kfree(blks);
> -	return ret;
> +}
> +
> +static void *pblk_bb_get_log(struct pblk *pblk)
> +{
> +	struct nvm_tgt_dev *dev = pblk->dev;
> +	struct nvm_geo *geo = &dev->geo;
> +	u8 *log;
> +	int i, nr_blks, blk_per_lun;
> +	int ret;
> +
> +	blk_per_lun = geo->nr_chks * geo->plane_mode;
> +	nr_blks = blk_per_lun * geo->all_luns;
> +
> +	log = kmalloc(nr_blks, GFP_KERNEL);
> +	if (!log)
> +		return ERR_PTR(-ENOMEM);
> +
> +	for (i = 0; i < geo->all_luns; i++) {
> +		struct pblk_lun *rlun = &pblk->luns[i];
> +		u8 *log_pos = log + i * blk_per_lun;
> +
> +		ret = pblk_bb_get_tbl(dev, rlun, log_pos, blk_per_lun);
> +		if (ret) {
> +			kfree(log);
> +			return ERR_PTR(-EIO);
> +		}
> +	}
> +
> +	return log;
>   }
>   
>   static int pblk_bb_line(struct pblk *pblk, struct pblk_line *line,
> -			int blk_per_line)
> +			u8 *bb_log, int blk_per_line)
>   {
>   	struct nvm_tgt_dev *dev = pblk->dev;
>   	struct nvm_geo *geo = &dev->geo;
> -	struct pblk_lun *rlun;
> -	int bb_cnt = 0;
> -	int i;
> +	int i, bb_cnt = 0;
>   
>   	for (i = 0; i < blk_per_line; i++) {
> -		rlun = &pblk->luns[i];
> -		if (rlun->bb_list[line->id] == NVM_BLK_T_FREE)
> +		struct pblk_lun *rlun = &pblk->luns[i];
> +		u8 *lun_bb_log = bb_log + i * blk_per_line;
> +
> +		if (lun_bb_log[line->id] == NVM_BLK_T_FREE)
>   			continue;
>   
>   		set_bit(pblk_ppa_to_pos(geo, rlun->bppa), line->blk_bitmap);
> @@ -460,29 +476,12 @@ static int pblk_bb_line(struct pblk *pblk, struct pblk_line *line,
>   	return bb_cnt;
>   }
>   
> -static int pblk_alloc_line_bitmaps(struct pblk *pblk, struct pblk_line *line)
> -{
> -	struct pblk_line_meta *lm = &pblk->lm;
> -
> -	line->blk_bitmap = kzalloc(lm->blk_bitmap_len, GFP_KERNEL);
> -	if (!line->blk_bitmap)
> -		return -ENOMEM;
> -
> -	line->erase_bitmap = kzalloc(lm->blk_bitmap_len, GFP_KERNEL);
> -	if (!line->erase_bitmap) {
> -		kfree(line->blk_bitmap);
> -		return -ENOMEM;
> -	}
> -
> -	return 0;
> -}
> -
>   static int pblk_luns_init(struct pblk *pblk, struct ppa_addr *luns)
>   {
>   	struct nvm_tgt_dev *dev = pblk->dev;
>   	struct nvm_geo *geo = &dev->geo;
>   	struct pblk_lun *rlun;
> -	int i, ret;
> +	int i;
>   
>   	/* TODO: Implement unbalanced LUN support */
>   	if (geo->nr_luns < 0) {
> @@ -505,13 +504,6 @@ static int pblk_luns_init(struct pblk *pblk, struct ppa_addr *luns)
>   		rlun->bppa = luns[lunid];
>   
>   		sema_init(&rlun->wr_sem, 1);
> -
> -		ret = pblk_bb_discovery(dev, rlun);
> -		if (ret) {
> -			while (--i >= 0)
> -				kfree(pblk->luns[i].bb_list);
> -			return ret;
> -		}
>   	}
>   
>   	return 0;
> @@ -689,6 +681,26 @@ static int pblk_lines_alloc_metadata(struct pblk *pblk)
>   	return -ENOMEM;
>   }
>   
> +static int pblk_setup_line_meta(struct pblk *pblk, struct pblk_line *line,
> +				void *chunk_log, long *nr_bad_blks)
> +{
> +	struct pblk_line_meta *lm = &pblk->lm;
> +
> +	line->blk_bitmap = kzalloc(lm->blk_bitmap_len, GFP_KERNEL);
> +	if (!line->blk_bitmap)
> +		return -ENOMEM;
> +
> +	line->erase_bitmap = kzalloc(lm->blk_bitmap_len, GFP_KERNEL);
> +	if (!line->erase_bitmap) {
> +		kfree(line->blk_bitmap);
> +		return -ENOMEM;
> +	}
> +
> +	*nr_bad_blks = pblk_bb_line(pblk, line, chunk_log, lm->blk_per_line);
> +
> +	return 0;
> +}
> +
>   static int pblk_lines_init(struct pblk *pblk)
>   {
>   	struct nvm_tgt_dev *dev = pblk->dev;
> @@ -696,8 +708,9 @@ static int pblk_lines_init(struct pblk *pblk)
>   	struct pblk_line_mgmt *l_mg = &pblk->l_mg;
>   	struct pblk_line_meta *lm = &pblk->lm;
>   	struct pblk_line *line;
> +	void *chunk_log;
>   	unsigned int smeta_len, emeta_len;
> -	long nr_bad_blks, nr_free_blks;
> +	long nr_bad_blks = 0, nr_free_blks = 0;
>   	int bb_distance, max_write_ppas, mod;
>   	int i, ret;
>   
> @@ -771,13 +784,12 @@ static int pblk_lines_init(struct pblk *pblk)
>   	if (lm->min_blk_line > lm->blk_per_line) {
>   		pr_err("pblk: config. not supported. Min. LUN in line:%d\n",
>   							lm->blk_per_line);
> -		ret = -EINVAL;
> -		goto fail;
> +		return -EINVAL;
>   	}
>   
>   	ret = pblk_lines_alloc_metadata(pblk);
>   	if (ret)
> -		goto fail;
> +		return ret;
>   
>   	l_mg->bb_template = kzalloc(lm->sec_bitmap_len, GFP_KERNEL);
>   	if (!l_mg->bb_template) {
> @@ -821,9 +833,16 @@ static int pblk_lines_init(struct pblk *pblk)
>   		goto fail_free_bb_aux;
>   	}
>   
> -	nr_free_blks = 0;
> +	chunk_log = pblk_bb_get_log(pblk);
> +	if (IS_ERR(chunk_log)) {
> +		pr_err("pblk: could not get bad block log (%lu)\n",
> +							PTR_ERR(chunk_log));
> +		ret = PTR_ERR(chunk_log);
> +		goto fail_free_lines;
> +	}
> +
>   	for (i = 0; i < l_mg->nr_lines; i++) {
> -		int blk_in_line;
> +		int chk_in_line;
>   
>   		line = &pblk->lines[i];
>   
> @@ -835,26 +854,20 @@ static int pblk_lines_init(struct pblk *pblk)
>   		line->vsc = &l_mg->vsc_list[i];
>   		spin_lock_init(&line->lock);
>   
> -		ret = pblk_alloc_line_bitmaps(pblk, line);
> +		ret = pblk_setup_line_meta(pblk, line, chunk_log, &nr_bad_blks);
>   		if (ret)
> -			goto fail_free_lines;
> +			goto fail_free_chunk_log;
>   
> -		nr_bad_blks = pblk_bb_line(pblk, line, lm->blk_per_line);
> -		if (nr_bad_blks < 0 || nr_bad_blks > lm->blk_per_line) {
> -			pblk_free_line_bitmaps(line);
> -			ret = -EINVAL;
> -			goto fail_free_lines;
> -		}
> -
> -		blk_in_line = lm->blk_per_line - nr_bad_blks;
> -		if (blk_in_line < lm->min_blk_line) {
> +		chk_in_line = lm->blk_per_line - nr_bad_blks;
> +		if (nr_bad_blks < 0 || nr_bad_blks > lm->blk_per_line ||
> +					chk_in_line < lm->min_blk_line) {
>   			line->state = PBLK_LINESTATE_BAD;
>   			list_add_tail(&line->list, &l_mg->bad_list);
>   			continue;
>   		}
>   
> -		nr_free_blks += blk_in_line;
> -		atomic_set(&line->blk_in_line, blk_in_line);
> +		nr_free_blks += chk_in_line;
> +		atomic_set(&line->blk_in_line, chk_in_line);
>   
>   		l_mg->nr_free_lines++;
>   		list_add_tail(&line->list, &l_mg->free_list);
> @@ -862,23 +875,21 @@ static int pblk_lines_init(struct pblk *pblk)
>   
>   	pblk_set_provision(pblk, nr_free_blks);
>   
> -	/* Cleanup per-LUN bad block lists - managed within lines on run-time */
> -	for (i = 0; i < geo->all_luns; i++)
> -		kfree(pblk->luns[i].bb_list);
> -
> +	kfree(chunk_log);
>   	return 0;
> -fail_free_lines:
> +
> +fail_free_chunk_log:
> +	kfree(chunk_log);
>   	while (--i >= 0)
> -		pblk_free_line_bitmaps(&pblk->lines[i]);
> +		pblk_line_meta_free(&pblk->lines[i]);
> +fail_free_lines:
> +	kfree(pblk->lines);
>   fail_free_bb_aux:
>   	kfree(l_mg->bb_aux);
>   fail_free_bb_template:
>   	kfree(l_mg->bb_template);
>   fail_free_meta:
> -	pblk_line_meta_free(pblk);
> -fail:
> -	for (i = 0; i < geo->all_luns; i++)
> -		kfree(pblk->luns[i].bb_list);
> +	pblk_line_mg_free(pblk);
>   
>   	return ret;
>   }
> @@ -922,7 +933,7 @@ static void pblk_free(struct pblk *pblk)
>   	pblk_luns_free(pblk);
>   	pblk_lines_free(pblk);
>   	kfree(pblk->pad_dist);
> -	pblk_line_meta_free(pblk);
> +	pblk_line_mg_free(pblk);
>   	pblk_core_free(pblk);
>   	pblk_l2p_free(pblk);
>   
> @@ -1110,7 +1121,7 @@ static void *pblk_init(struct nvm_tgt_dev *dev, struct gendisk *tdisk,
>   fail_free_pad_dist:
>   	kfree(pblk->pad_dist);
>   fail_free_line_meta:
> -	pblk_line_meta_free(pblk);
> +	pblk_line_mg_free(pblk);
>   fail_free_luns:
>   	pblk_luns_free(pblk);
>   fail:
> diff --git a/drivers/lightnvm/pblk.h b/drivers/lightnvm/pblk.h
> index 88720e2441c0..282dfc8780e8 100644
> --- a/drivers/lightnvm/pblk.h
> +++ b/drivers/lightnvm/pblk.h
> @@ -201,12 +201,6 @@ struct pblk_rb {
>   
>   struct pblk_lun {
>   	struct ppa_addr bppa;
> -
> -	u8 *bb_list;			/* Bad block list for LUN. Only used on
> -					 * bring up. Bad blocks are managed
> -					 * within lines on run-time.
> -					 */
> -
>   	struct semaphore wr_sem;
>   };
>   
> 

Looks good to me.

With respect to the 2.0 implementation. I am thinking that get_bb and 
set_bb should behave in the following way:

For get_bb (in nvme/host/lightnvm.c)
1.2: Keep the implementation as is.
2.0: Use the report chunk command, and redo the get_bb format.

For set_bb (in nvme/host/lightnvm.c)
1.2: Business as usual
2.0: Report error.

For 2.0, there will be added a function pointer for get report chunk 
implementation, where the client can ask for full chunk information. 
Similarly here, client will fail the function call if the drive is 1.2.

I hope to post the small 2.0 identify implementation Tomorrow or Friday, 
and then you can post the report chunk implementation that you have done 
if you like.

We also need to take a look at the new error codes, which does not align 
with the 1.2 specification (now that they actually follow the nvme 
specification ;))

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 5/5] lightnvm: pblk: refactor bad block identification
  2018-01-31  8:51   ` Matias Bjørling
@ 2018-01-31  9:13     ` Javier Gonzalez
  2018-01-31 18:24       ` Matias Bjørling
  0 siblings, 1 reply; 15+ messages in thread
From: Javier Gonzalez @ 2018-01-31  9:13 UTC (permalink / raw)
  To: Matias Bjørling; +Cc: Javier González, linux-block, linux-kernel

DQoNCj4gT24gMzEgSmFuIDIwMTgsIGF0IDE2LjUxLCBNYXRpYXMgQmrDuHJsaW5nIDxtYkBsaWdo
dG52bS5pbz4gd3JvdGU6DQo+IA0KPj4gT24gMDEvMzEvMjAxOCAwMzowNiBBTSwgSmF2aWVyIEdv
bnrDoWxleiB3cm90ZToNCj4+IEluIHByZXBhcmF0aW9uIGZvciB0aGUgT0NTU0QgMi4wIHNwZWMu
IGJhZCBibG9jayBpZGVudGlmaWNhdGlvbiwNCj4+IHJlZmFjdG9yIHRoZSBjdXJyZW50IGNvZGUg
dG8gZ2VuZXJhbGl6ZSBiYWQgYmxvY2sgZ2V0L3NldCBmdW5jdGlvbnMgYW5kDQo+PiBzdHJ1Y3R1
cmVzLg0KPj4gU2lnbmVkLW9mZi1ieTogSmF2aWVyIEdvbnrDoWxleiA8amF2aWVyQGNuZXhsYWJz
LmNvbT4NCj4+IC0tLQ0KPj4gIGRyaXZlcnMvbGlnaHRudm0vcGJsay1pbml0LmMgfCAyMTMgKysr
KysrKysrKysrKysrKysrKysrKystLS0tLS0tLS0tLS0tLS0tLS0tLQ0KPj4gIGRyaXZlcnMvbGln
aHRudm0vcGJsay5oICAgICAgfCAgIDYgLS0NCj4+ICAyIGZpbGVzIGNoYW5nZWQsIDExMiBpbnNl
cnRpb25zKCspLCAxMDcgZGVsZXRpb25zKC0pDQo+PiBkaWZmIC0tZ2l0IGEvZHJpdmVycy9saWdo
dG52bS9wYmxrLWluaXQuYyBiL2RyaXZlcnMvbGlnaHRudm0vcGJsay1pbml0LmMNCj4+IGluZGV4
IGE1ZTM1MTBjMGYzOC4uODZhOTRhN2ZhYTk2IDEwMDY0NA0KPj4gLS0tIGEvZHJpdmVycy9saWdo
dG52bS9wYmxrLWluaXQuYw0KPj4gKysrIGIvZHJpdmVycy9saWdodG52bS9wYmxrLWluaXQuYw0K
Pj4gQEAgLTM2NSw3ICszNjUsMjUgQEAgc3RhdGljIHZvaWQgcGJsa19sdW5zX2ZyZWUoc3RydWN0
IHBibGsgKnBibGspDQo+PiAgICAgIGtmcmVlKHBibGstPmx1bnMpOw0KPj4gIH0NCj4+ICAtc3Rh
dGljIHZvaWQgcGJsa19mcmVlX2xpbmVfYml0bWFwcyhzdHJ1Y3QgcGJsa19saW5lICpsaW5lKQ0K
Pj4gK3N0YXRpYyB2b2lkIHBibGtfbGluZV9tZ19mcmVlKHN0cnVjdCBwYmxrICpwYmxrKQ0KPj4g
K3sNCj4+ICsgICAgc3RydWN0IHBibGtfbGluZV9tZ210ICpsX21nID0gJnBibGstPmxfbWc7DQo+
PiArICAgIGludCBpOw0KPj4gKw0KPj4gKyAgICBrZnJlZShsX21nLT5iYl90ZW1wbGF0ZSk7DQo+
PiArICAgIGtmcmVlKGxfbWctPmJiX2F1eCk7DQo+PiArICAgIGtmcmVlKGxfbWctPnZzY19saXN0
KTsNCj4+ICsNCj4+ICsgICAgZm9yIChpID0gMDsgaSA8IFBCTEtfREFUQV9MSU5FUzsgaSsrKSB7
DQo+PiArICAgICAgICBrZnJlZShsX21nLT5zbGluZV9tZXRhW2ldKTsNCj4+ICsgICAgICAgIHBi
bGtfbWZyZWUobF9tZy0+ZWxpbmVfbWV0YVtpXS0+YnVmLCBsX21nLT5lbWV0YV9hbGxvY190eXBl
KTsNCj4+ICsgICAgICAgIGtmcmVlKGxfbWctPmVsaW5lX21ldGFbaV0pOw0KPj4gKyAgICB9DQo+
PiArDQo+PiArICAgIGtmcmVlKHBibGstPmxpbmVzKTsNCj4+ICt9DQo+PiArDQo+PiArc3RhdGlj
IHZvaWQgcGJsa19saW5lX21ldGFfZnJlZShzdHJ1Y3QgcGJsa19saW5lICpsaW5lKQ0KPj4gIHsN
Cj4+ICAgICAga2ZyZWUobGluZS0+YmxrX2JpdG1hcCk7DQo+PiAgICAgIGtmcmVlKGxpbmUtPmVy
YXNlX2JpdG1hcCk7DQo+PiBAQCAtMzgyLDQwICs0MDAsMTYgQEAgc3RhdGljIHZvaWQgcGJsa19s
aW5lc19mcmVlKHN0cnVjdCBwYmxrICpwYmxrKQ0KPj4gICAgICAgICAgbGluZSA9ICZwYmxrLT5s
aW5lc1tpXTsNCj4+ICAgICAgICAgICAgcGJsa19saW5lX2ZyZWUocGJsaywgbGluZSk7DQo+PiAt
ICAgICAgICBwYmxrX2ZyZWVfbGluZV9iaXRtYXBzKGxpbmUpOw0KPj4gKyAgICAgICAgcGJsa19s
aW5lX21ldGFfZnJlZShsaW5lKTsNCj4+ICAgICAgfQ0KPj4gICAgICBzcGluX3VubG9jaygmbF9t
Zy0+ZnJlZV9sb2NrKTsNCj4+ICB9DQo+PiAgLXN0YXRpYyB2b2lkIHBibGtfbGluZV9tZXRhX2Zy
ZWUoc3RydWN0IHBibGsgKnBibGspDQo+PiArc3RhdGljIGludCBwYmxrX2JiX2dldF90Ymwoc3Ry
dWN0IG52bV90Z3RfZGV2ICpkZXYsIHN0cnVjdCBwYmxrX2x1biAqcmx1biwNCj4+ICsgICAgICAg
ICAgICAgICB1OCAqYmxrcywgaW50IG5yX2Jsa3MpDQo+PiAgew0KPj4gLSAgICBzdHJ1Y3QgcGJs
a19saW5lX21nbXQgKmxfbWcgPSAmcGJsay0+bF9tZzsNCj4+IC0gICAgaW50IGk7DQo+PiAtDQo+
PiAtICAgIGtmcmVlKGxfbWctPmJiX3RlbXBsYXRlKTsNCj4+IC0gICAga2ZyZWUobF9tZy0+YmJf
YXV4KTsNCj4+IC0gICAga2ZyZWUobF9tZy0+dnNjX2xpc3QpOw0KPj4gLQ0KPj4gLSAgICBmb3Ig
KGkgPSAwOyBpIDwgUEJMS19EQVRBX0xJTkVTOyBpKyspIHsNCj4+IC0gICAgICAgIGtmcmVlKGxf
bWctPnNsaW5lX21ldGFbaV0pOw0KPj4gLSAgICAgICAgcGJsa19tZnJlZShsX21nLT5lbGluZV9t
ZXRhW2ldLT5idWYsIGxfbWctPmVtZXRhX2FsbG9jX3R5cGUpOw0KPj4gLSAgICAgICAga2ZyZWUo
bF9tZy0+ZWxpbmVfbWV0YVtpXSk7DQo+PiAtICAgIH0NCj4+IC0NCj4+IC0gICAga2ZyZWUocGJs
ay0+bGluZXMpOw0KPj4gLX0NCj4+IC0NCj4+IC1zdGF0aWMgaW50IHBibGtfYmJfZGlzY292ZXJ5
KHN0cnVjdCBudm1fdGd0X2RldiAqZGV2LCBzdHJ1Y3QgcGJsa19sdW4gKnJsdW4pDQo+PiAtew0K
Pj4gLSAgICBzdHJ1Y3QgbnZtX2dlbyAqZ2VvID0gJmRldi0+Z2VvOw0KPj4gICAgICBzdHJ1Y3Qg
cHBhX2FkZHIgcHBhOw0KPj4gLSAgICB1OCAqYmxrczsNCj4+IC0gICAgaW50IG5yX2Jsa3MsIHJl
dDsNCj4+IC0NCj4+IC0gICAgbnJfYmxrcyA9IGdlby0+bnJfY2hrcyAqIGdlby0+cGxhbmVfbW9k
ZTsNCj4+IC0gICAgYmxrcyA9IGttYWxsb2MobnJfYmxrcywgR0ZQX0tFUk5FTCk7DQo+PiAtICAg
IGlmICghYmxrcykNCj4+IC0gICAgICAgIHJldHVybiAtRU5PTUVNOw0KPj4gKyAgICBpbnQgcmV0
Ow0KPj4gICAgICAgIHBwYS5wcGEgPSAwOw0KPj4gICAgICBwcGEuZy5jaCA9IHJsdW4tPmJwcGEu
Zy5jaDsNCj4+IEBAIC00MjMsMzQgKzQxNyw1NiBAQCBzdGF0aWMgaW50IHBibGtfYmJfZGlzY292
ZXJ5KHN0cnVjdCBudm1fdGd0X2RldiAqZGV2LCBzdHJ1Y3QgcGJsa19sdW4gKnJsdW4pDQo+PiAg
ICAgICAgcmV0ID0gbnZtX2dldF90Z3RfYmJfdGJsKGRldiwgcHBhLCBibGtzKTsNCj4+ICAgICAg
aWYgKHJldCkNCj4+IC0gICAgICAgIGdvdG8gb3V0Ow0KPj4gKyAgICAgICAgcmV0dXJuIHJldDsN
Cj4+ICAgICAgICBucl9ibGtzID0gbnZtX2JiX3RibF9mb2xkKGRldi0+cGFyZW50LCBibGtzLCBu
cl9ibGtzKTsNCj4+IC0gICAgaWYgKG5yX2Jsa3MgPCAwKSB7DQo+PiAtICAgICAgICByZXQgPSBu
cl9ibGtzOw0KPj4gLSAgICAgICAgZ290byBvdXQ7DQo+PiAtICAgIH0NCj4+IC0NCj4+IC0gICAg
cmx1bi0+YmJfbGlzdCA9IGJsa3M7DQo+PiArICAgIGlmIChucl9ibGtzIDwgMCkNCj4+ICsgICAg
ICAgIHJldHVybiAtRUlPOw0KPj4gICAgICAgIHJldHVybiAwOw0KPj4gLW91dDoNCj4+IC0gICAg
a2ZyZWUoYmxrcyk7DQo+PiAtICAgIHJldHVybiByZXQ7DQo+PiArfQ0KPj4gKw0KPj4gK3N0YXRp
YyB2b2lkICpwYmxrX2JiX2dldF9sb2coc3RydWN0IHBibGsgKnBibGspDQo+PiArew0KPj4gKyAg
ICBzdHJ1Y3QgbnZtX3RndF9kZXYgKmRldiA9IHBibGstPmRldjsNCj4+ICsgICAgc3RydWN0IG52
bV9nZW8gKmdlbyA9ICZkZXYtPmdlbzsNCj4+ICsgICAgdTggKmxvZzsNCj4+ICsgICAgaW50IGks
IG5yX2Jsa3MsIGJsa19wZXJfbHVuOw0KPj4gKyAgICBpbnQgcmV0Ow0KPj4gKw0KPj4gKyAgICBi
bGtfcGVyX2x1biA9IGdlby0+bnJfY2hrcyAqIGdlby0+cGxhbmVfbW9kZTsNCj4+ICsgICAgbnJf
YmxrcyA9IGJsa19wZXJfbHVuICogZ2VvLT5hbGxfbHVuczsNCj4+ICsNCj4+ICsgICAgbG9nID0g
a21hbGxvYyhucl9ibGtzLCBHRlBfS0VSTkVMKTsNCj4+ICsgICAgaWYgKCFsb2cpDQo+PiArICAg
ICAgICByZXR1cm4gRVJSX1BUUigtRU5PTUVNKTsNCj4+ICsNCj4+ICsgICAgZm9yIChpID0gMDsg
aSA8IGdlby0+YWxsX2x1bnM7IGkrKykgew0KPj4gKyAgICAgICAgc3RydWN0IHBibGtfbHVuICpy
bHVuID0gJnBibGstPmx1bnNbaV07DQo+PiArICAgICAgICB1OCAqbG9nX3BvcyA9IGxvZyArIGkg
KiBibGtfcGVyX2x1bjsNCj4+ICsNCj4+ICsgICAgICAgIHJldCA9IHBibGtfYmJfZ2V0X3RibChk
ZXYsIHJsdW4sIGxvZ19wb3MsIGJsa19wZXJfbHVuKTsNCj4+ICsgICAgICAgIGlmIChyZXQpIHsN
Cj4+ICsgICAgICAgICAgICBrZnJlZShsb2cpOw0KPj4gKyAgICAgICAgICAgIHJldHVybiBFUlJf
UFRSKC1FSU8pOw0KPj4gKyAgICAgICAgfQ0KPj4gKyAgICB9DQo+PiArDQo+PiArICAgIHJldHVy
biBsb2c7DQo+PiAgfQ0KPj4gICAgc3RhdGljIGludCBwYmxrX2JiX2xpbmUoc3RydWN0IHBibGsg
KnBibGssIHN0cnVjdCBwYmxrX2xpbmUgKmxpbmUsDQo+PiAtICAgICAgICAgICAgaW50IGJsa19w
ZXJfbGluZSkNCj4+ICsgICAgICAgICAgICB1OCAqYmJfbG9nLCBpbnQgYmxrX3Blcl9saW5lKQ0K
Pj4gIHsNCj4+ICAgICAgc3RydWN0IG52bV90Z3RfZGV2ICpkZXYgPSBwYmxrLT5kZXY7DQo+PiAg
ICAgIHN0cnVjdCBudm1fZ2VvICpnZW8gPSAmZGV2LT5nZW87DQo+PiAtICAgIHN0cnVjdCBwYmxr
X2x1biAqcmx1bjsNCj4+IC0gICAgaW50IGJiX2NudCA9IDA7DQo+PiAtICAgIGludCBpOw0KPj4g
KyAgICBpbnQgaSwgYmJfY250ID0gMDsNCj4+ICAgICAgICBmb3IgKGkgPSAwOyBpIDwgYmxrX3Bl
cl9saW5lOyBpKyspIHsNCj4+IC0gICAgICAgIHJsdW4gPSAmcGJsay0+bHVuc1tpXTsNCj4+IC0g
ICAgICAgIGlmIChybHVuLT5iYl9saXN0W2xpbmUtPmlkXSA9PSBOVk1fQkxLX1RfRlJFRSkNCj4+
ICsgICAgICAgIHN0cnVjdCBwYmxrX2x1biAqcmx1biA9ICZwYmxrLT5sdW5zW2ldOw0KPj4gKyAg
ICAgICAgdTggKmx1bl9iYl9sb2cgPSBiYl9sb2cgKyBpICogYmxrX3Blcl9saW5lOw0KPj4gKw0K
Pj4gKyAgICAgICAgaWYgKGx1bl9iYl9sb2dbbGluZS0+aWRdID09IE5WTV9CTEtfVF9GUkVFKQ0K
Pj4gICAgICAgICAgICAgIGNvbnRpbnVlOw0KPj4gICAgICAgICAgICBzZXRfYml0KHBibGtfcHBh
X3RvX3BvcyhnZW8sIHJsdW4tPmJwcGEpLCBsaW5lLT5ibGtfYml0bWFwKTsNCj4+IEBAIC00NjAs
MjkgKzQ3NiwxMiBAQCBzdGF0aWMgaW50IHBibGtfYmJfbGluZShzdHJ1Y3QgcGJsayAqcGJsaywg
c3RydWN0IHBibGtfbGluZSAqbGluZSwNCj4+ICAgICAgcmV0dXJuIGJiX2NudDsNCj4+ICB9DQo+
PiAgLXN0YXRpYyBpbnQgcGJsa19hbGxvY19saW5lX2JpdG1hcHMoc3RydWN0IHBibGsgKnBibGss
IHN0cnVjdCBwYmxrX2xpbmUgKmxpbmUpDQo+PiAtew0KPj4gLSAgICBzdHJ1Y3QgcGJsa19saW5l
X21ldGEgKmxtID0gJnBibGstPmxtOw0KPj4gLQ0KPj4gLSAgICBsaW5lLT5ibGtfYml0bWFwID0g
a3phbGxvYyhsbS0+YmxrX2JpdG1hcF9sZW4sIEdGUF9LRVJORUwpOw0KPj4gLSAgICBpZiAoIWxp
bmUtPmJsa19iaXRtYXApDQo+PiAtICAgICAgICByZXR1cm4gLUVOT01FTTsNCj4+IC0NCj4+IC0g
ICAgbGluZS0+ZXJhc2VfYml0bWFwID0ga3phbGxvYyhsbS0+YmxrX2JpdG1hcF9sZW4sIEdGUF9L
RVJORUwpOw0KPj4gLSAgICBpZiAoIWxpbmUtPmVyYXNlX2JpdG1hcCkgew0KPj4gLSAgICAgICAg
a2ZyZWUobGluZS0+YmxrX2JpdG1hcCk7DQo+PiAtICAgICAgICByZXR1cm4gLUVOT01FTTsNCj4+
IC0gICAgfQ0KPj4gLQ0KPj4gLSAgICByZXR1cm4gMDsNCj4+IC19DQo+PiAtDQo+PiAgc3RhdGlj
IGludCBwYmxrX2x1bnNfaW5pdChzdHJ1Y3QgcGJsayAqcGJsaywgc3RydWN0IHBwYV9hZGRyICps
dW5zKQ0KPj4gIHsNCj4+ICAgICAgc3RydWN0IG52bV90Z3RfZGV2ICpkZXYgPSBwYmxrLT5kZXY7
DQo+PiAgICAgIHN0cnVjdCBudm1fZ2VvICpnZW8gPSAmZGV2LT5nZW87DQo+PiAgICAgIHN0cnVj
dCBwYmxrX2x1biAqcmx1bjsNCj4+IC0gICAgaW50IGksIHJldDsNCj4+ICsgICAgaW50IGk7DQo+
PiAgICAgICAgLyogVE9ETzogSW1wbGVtZW50IHVuYmFsYW5jZWQgTFVOIHN1cHBvcnQgKi8NCj4+
ICAgICAgaWYgKGdlby0+bnJfbHVucyA8IDApIHsNCj4+IEBAIC01MDUsMTMgKzUwNCw2IEBAIHN0
YXRpYyBpbnQgcGJsa19sdW5zX2luaXQoc3RydWN0IHBibGsgKnBibGssIHN0cnVjdCBwcGFfYWRk
ciAqbHVucykNCj4+ICAgICAgICAgIHJsdW4tPmJwcGEgPSBsdW5zW2x1bmlkXTsNCj4+ICAgICAg
ICAgICAgc2VtYV9pbml0KCZybHVuLT53cl9zZW0sIDEpOw0KPj4gLQ0KPj4gLSAgICAgICAgcmV0
ID0gcGJsa19iYl9kaXNjb3ZlcnkoZGV2LCBybHVuKTsNCj4+IC0gICAgICAgIGlmIChyZXQpIHsN
Cj4+IC0gICAgICAgICAgICB3aGlsZSAoLS1pID49IDApDQo+PiAtICAgICAgICAgICAgICAgIGtm
cmVlKHBibGstPmx1bnNbaV0uYmJfbGlzdCk7DQo+PiAtICAgICAgICAgICAgcmV0dXJuIHJldDsN
Cj4+IC0gICAgICAgIH0NCj4+ICAgICAgfQ0KPj4gICAgICAgIHJldHVybiAwOw0KPj4gQEAgLTY4
OSw2ICs2ODEsMjYgQEAgc3RhdGljIGludCBwYmxrX2xpbmVzX2FsbG9jX21ldGFkYXRhKHN0cnVj
dCBwYmxrICpwYmxrKQ0KPj4gICAgICByZXR1cm4gLUVOT01FTTsNCj4+ICB9DQo+PiAgK3N0YXRp
YyBpbnQgcGJsa19zZXR1cF9saW5lX21ldGEoc3RydWN0IHBibGsgKnBibGssIHN0cnVjdCBwYmxr
X2xpbmUgKmxpbmUsDQo+PiArICAgICAgICAgICAgICAgIHZvaWQgKmNodW5rX2xvZywgbG9uZyAq
bnJfYmFkX2Jsa3MpDQo+PiArew0KPj4gKyAgICBzdHJ1Y3QgcGJsa19saW5lX21ldGEgKmxtID0g
JnBibGstPmxtOw0KPj4gKw0KPj4gKyAgICBsaW5lLT5ibGtfYml0bWFwID0ga3phbGxvYyhsbS0+
YmxrX2JpdG1hcF9sZW4sIEdGUF9LRVJORUwpOw0KPj4gKyAgICBpZiAoIWxpbmUtPmJsa19iaXRt
YXApDQo+PiArICAgICAgICByZXR1cm4gLUVOT01FTTsNCj4+ICsNCj4+ICsgICAgbGluZS0+ZXJh
c2VfYml0bWFwID0ga3phbGxvYyhsbS0+YmxrX2JpdG1hcF9sZW4sIEdGUF9LRVJORUwpOw0KPj4g
KyAgICBpZiAoIWxpbmUtPmVyYXNlX2JpdG1hcCkgew0KPj4gKyAgICAgICAga2ZyZWUobGluZS0+
YmxrX2JpdG1hcCk7DQo+PiArICAgICAgICByZXR1cm4gLUVOT01FTTsNCj4+ICsgICAgfQ0KPj4g
Kw0KPj4gKyAgICAqbnJfYmFkX2Jsa3MgPSBwYmxrX2JiX2xpbmUocGJsaywgbGluZSwgY2h1bmtf
bG9nLCBsbS0+YmxrX3Blcl9saW5lKTsNCj4+ICsNCj4+ICsgICAgcmV0dXJuIDA7DQo+PiArfQ0K
Pj4gKw0KPj4gIHN0YXRpYyBpbnQgcGJsa19saW5lc19pbml0KHN0cnVjdCBwYmxrICpwYmxrKQ0K
Pj4gIHsNCj4+ICAgICAgc3RydWN0IG52bV90Z3RfZGV2ICpkZXYgPSBwYmxrLT5kZXY7DQo+PiBA
QCAtNjk2LDggKzcwOCw5IEBAIHN0YXRpYyBpbnQgcGJsa19saW5lc19pbml0KHN0cnVjdCBwYmxr
ICpwYmxrKQ0KPj4gICAgICBzdHJ1Y3QgcGJsa19saW5lX21nbXQgKmxfbWcgPSAmcGJsay0+bF9t
ZzsNCj4+ICAgICAgc3RydWN0IHBibGtfbGluZV9tZXRhICpsbSA9ICZwYmxrLT5sbTsNCj4+ICAg
ICAgc3RydWN0IHBibGtfbGluZSAqbGluZTsNCj4+ICsgICAgdm9pZCAqY2h1bmtfbG9nOw0KPj4g
ICAgICB1bnNpZ25lZCBpbnQgc21ldGFfbGVuLCBlbWV0YV9sZW47DQo+PiAtICAgIGxvbmcgbnJf
YmFkX2Jsa3MsIG5yX2ZyZWVfYmxrczsNCj4+ICsgICAgbG9uZyBucl9iYWRfYmxrcyA9IDAsIG5y
X2ZyZWVfYmxrcyA9IDA7DQo+PiAgICAgIGludCBiYl9kaXN0YW5jZSwgbWF4X3dyaXRlX3BwYXMs
IG1vZDsNCj4+ICAgICAgaW50IGksIHJldDsNCj4+ICBAQCAtNzcxLDEzICs3ODQsMTIgQEAgc3Rh
dGljIGludCBwYmxrX2xpbmVzX2luaXQoc3RydWN0IHBibGsgKnBibGspDQo+PiAgICAgIGlmIChs
bS0+bWluX2Jsa19saW5lID4gbG0tPmJsa19wZXJfbGluZSkgew0KPj4gICAgICAgICAgcHJfZXJy
KCJwYmxrOiBjb25maWcuIG5vdCBzdXBwb3J0ZWQuIE1pbi4gTFVOIGluIGxpbmU6JWRcbiIsDQo+
PiAgICAgICAgICAgICAgICAgICAgICAgICAgICAgIGxtLT5ibGtfcGVyX2xpbmUpOw0KPj4gLSAg
ICAgICAgcmV0ID0gLUVJTlZBTDsNCj4+IC0gICAgICAgIGdvdG8gZmFpbDsNCj4+ICsgICAgICAg
IHJldHVybiAtRUlOVkFMOw0KPj4gICAgICB9DQo+PiAgICAgICAgcmV0ID0gcGJsa19saW5lc19h
bGxvY19tZXRhZGF0YShwYmxrKTsNCj4+ICAgICAgaWYgKHJldCkNCj4+IC0gICAgICAgIGdvdG8g
ZmFpbDsNCj4+ICsgICAgICAgIHJldHVybiByZXQ7DQo+PiAgICAgICAgbF9tZy0+YmJfdGVtcGxh
dGUgPSBremFsbG9jKGxtLT5zZWNfYml0bWFwX2xlbiwgR0ZQX0tFUk5FTCk7DQo+PiAgICAgIGlm
ICghbF9tZy0+YmJfdGVtcGxhdGUpIHsNCj4+IEBAIC04MjEsOSArODMzLDE2IEBAIHN0YXRpYyBp
bnQgcGJsa19saW5lc19pbml0KHN0cnVjdCBwYmxrICpwYmxrKQ0KPj4gICAgICAgICAgZ290byBm
YWlsX2ZyZWVfYmJfYXV4Ow0KPj4gICAgICB9DQo+PiAgLSAgICBucl9mcmVlX2Jsa3MgPSAwOw0K
Pj4gKyAgICBjaHVua19sb2cgPSBwYmxrX2JiX2dldF9sb2cocGJsayk7DQo+PiArICAgIGlmIChJ
U19FUlIoY2h1bmtfbG9nKSkgew0KPj4gKyAgICAgICAgcHJfZXJyKCJwYmxrOiBjb3VsZCBub3Qg
Z2V0IGJhZCBibG9jayBsb2cgKCVsdSlcbiIsDQo+PiArICAgICAgICAgICAgICAgICAgICAgICAg
ICAgIFBUUl9FUlIoY2h1bmtfbG9nKSk7DQo+PiArICAgICAgICByZXQgPSBQVFJfRVJSKGNodW5r
X2xvZyk7DQo+PiArICAgICAgICBnb3RvIGZhaWxfZnJlZV9saW5lczsNCj4+ICsgICAgfQ0KPj4g
Kw0KPj4gICAgICBmb3IgKGkgPSAwOyBpIDwgbF9tZy0+bnJfbGluZXM7IGkrKykgew0KPj4gLSAg
ICAgICAgaW50IGJsa19pbl9saW5lOw0KPj4gKyAgICAgICAgaW50IGNoa19pbl9saW5lOw0KPj4g
ICAgICAgICAgICBsaW5lID0gJnBibGstPmxpbmVzW2ldOw0KPj4gIEBAIC04MzUsMjYgKzg1NCwy
MCBAQCBzdGF0aWMgaW50IHBibGtfbGluZXNfaW5pdChzdHJ1Y3QgcGJsayAqcGJsaykNCj4+ICAg
ICAgICAgIGxpbmUtPnZzYyA9ICZsX21nLT52c2NfbGlzdFtpXTsNCj4+ICAgICAgICAgIHNwaW5f
bG9ja19pbml0KCZsaW5lLT5sb2NrKTsNCj4+ICAtICAgICAgICByZXQgPSBwYmxrX2FsbG9jX2xp
bmVfYml0bWFwcyhwYmxrLCBsaW5lKTsNCj4+ICsgICAgICAgIHJldCA9IHBibGtfc2V0dXBfbGlu
ZV9tZXRhKHBibGssIGxpbmUsIGNodW5rX2xvZywgJm5yX2JhZF9ibGtzKTsNCj4+ICAgICAgICAg
IGlmIChyZXQpDQo+PiAtICAgICAgICAgICAgZ290byBmYWlsX2ZyZWVfbGluZXM7DQo+PiArICAg
ICAgICAgICAgZ290byBmYWlsX2ZyZWVfY2h1bmtfbG9nOw0KPj4gIC0gICAgICAgIG5yX2JhZF9i
bGtzID0gcGJsa19iYl9saW5lKHBibGssIGxpbmUsIGxtLT5ibGtfcGVyX2xpbmUpOw0KPj4gLSAg
ICAgICAgaWYgKG5yX2JhZF9ibGtzIDwgMCB8fCBucl9iYWRfYmxrcyA+IGxtLT5ibGtfcGVyX2xp
bmUpIHsNCj4+IC0gICAgICAgICAgICBwYmxrX2ZyZWVfbGluZV9iaXRtYXBzKGxpbmUpOw0KPj4g
LSAgICAgICAgICAgIHJldCA9IC1FSU5WQUw7DQo+PiAtICAgICAgICAgICAgZ290byBmYWlsX2Zy
ZWVfbGluZXM7DQo+PiAtICAgICAgICB9DQo+PiAtDQo+PiAtICAgICAgICBibGtfaW5fbGluZSA9
IGxtLT5ibGtfcGVyX2xpbmUgLSBucl9iYWRfYmxrczsNCj4+IC0gICAgICAgIGlmIChibGtfaW5f
bGluZSA8IGxtLT5taW5fYmxrX2xpbmUpIHsNCj4+ICsgICAgICAgIGNoa19pbl9saW5lID0gbG0t
PmJsa19wZXJfbGluZSAtIG5yX2JhZF9ibGtzOw0KPj4gKyAgICAgICAgaWYgKG5yX2JhZF9ibGtz
IDwgMCB8fCBucl9iYWRfYmxrcyA+IGxtLT5ibGtfcGVyX2xpbmUgfHwNCj4+ICsgICAgICAgICAg
ICAgICAgICAgIGNoa19pbl9saW5lIDwgbG0tPm1pbl9ibGtfbGluZSkgew0KPj4gICAgICAgICAg
ICAgIGxpbmUtPnN0YXRlID0gUEJMS19MSU5FU1RBVEVfQkFEOw0KPj4gICAgICAgICAgICAgIGxp
c3RfYWRkX3RhaWwoJmxpbmUtPmxpc3QsICZsX21nLT5iYWRfbGlzdCk7DQo+PiAgICAgICAgICAg
ICAgY29udGludWU7DQo+PiAgICAgICAgICB9DQo+PiAgLSAgICAgICAgbnJfZnJlZV9ibGtzICs9
IGJsa19pbl9saW5lOw0KPj4gLSAgICAgICAgYXRvbWljX3NldCgmbGluZS0+YmxrX2luX2xpbmUs
IGJsa19pbl9saW5lKTsNCj4+ICsgICAgICAgIG5yX2ZyZWVfYmxrcyArPSBjaGtfaW5fbGluZTsN
Cj4+ICsgICAgICAgIGF0b21pY19zZXQoJmxpbmUtPmJsa19pbl9saW5lLCBjaGtfaW5fbGluZSk7
DQo+PiAgICAgICAgICAgIGxfbWctPm5yX2ZyZWVfbGluZXMrKzsNCj4+ICAgICAgICAgIGxpc3Rf
YWRkX3RhaWwoJmxpbmUtPmxpc3QsICZsX21nLT5mcmVlX2xpc3QpOw0KPj4gQEAgLTg2MiwyMyAr
ODc1LDIxIEBAIHN0YXRpYyBpbnQgcGJsa19saW5lc19pbml0KHN0cnVjdCBwYmxrICpwYmxrKQ0K
Pj4gICAgICAgIHBibGtfc2V0X3Byb3Zpc2lvbihwYmxrLCBucl9mcmVlX2Jsa3MpOw0KPj4gIC0g
ICAgLyogQ2xlYW51cCBwZXItTFVOIGJhZCBibG9jayBsaXN0cyAtIG1hbmFnZWQgd2l0aGluIGxp
bmVzIG9uIHJ1bi10aW1lICovDQo+PiAtICAgIGZvciAoaSA9IDA7IGkgPCBnZW8tPmFsbF9sdW5z
OyBpKyspDQo+PiAtICAgICAgICBrZnJlZShwYmxrLT5sdW5zW2ldLmJiX2xpc3QpOw0KPj4gLQ0K
Pj4gKyAgICBrZnJlZShjaHVua19sb2cpOw0KPj4gICAgICByZXR1cm4gMDsNCj4+IC1mYWlsX2Zy
ZWVfbGluZXM6DQo+PiArDQo+PiArZmFpbF9mcmVlX2NodW5rX2xvZzoNCj4+ICsgICAga2ZyZWUo
Y2h1bmtfbG9nKTsNCj4+ICAgICAgd2hpbGUgKC0taSA+PSAwKQ0KPj4gLSAgICAgICAgcGJsa19m
cmVlX2xpbmVfYml0bWFwcygmcGJsay0+bGluZXNbaV0pOw0KPj4gKyAgICAgICAgcGJsa19saW5l
X21ldGFfZnJlZSgmcGJsay0+bGluZXNbaV0pOw0KPj4gK2ZhaWxfZnJlZV9saW5lczoNCj4+ICsg
ICAga2ZyZWUocGJsay0+bGluZXMpOw0KPj4gIGZhaWxfZnJlZV9iYl9hdXg6DQo+PiAgICAgIGtm
cmVlKGxfbWctPmJiX2F1eCk7DQo+PiAgZmFpbF9mcmVlX2JiX3RlbXBsYXRlOg0KPj4gICAgICBr
ZnJlZShsX21nLT5iYl90ZW1wbGF0ZSk7DQo+PiAgZmFpbF9mcmVlX21ldGE6DQo+PiAtICAgIHBi
bGtfbGluZV9tZXRhX2ZyZWUocGJsayk7DQo+PiAtZmFpbDoNCj4+IC0gICAgZm9yIChpID0gMDsg
aSA8IGdlby0+YWxsX2x1bnM7IGkrKykNCj4+IC0gICAgICAgIGtmcmVlKHBibGstPmx1bnNbaV0u
YmJfbGlzdCk7DQo+PiArICAgIHBibGtfbGluZV9tZ19mcmVlKHBibGspOw0KPj4gICAgICAgIHJl
dHVybiByZXQ7DQo+PiAgfQ0KPj4gQEAgLTkyMiw3ICs5MzMsNyBAQCBzdGF0aWMgdm9pZCBwYmxr
X2ZyZWUoc3RydWN0IHBibGsgKnBibGspDQo+PiAgICAgIHBibGtfbHVuc19mcmVlKHBibGspOw0K
Pj4gICAgICBwYmxrX2xpbmVzX2ZyZWUocGJsayk7DQo+PiAgICAgIGtmcmVlKHBibGstPnBhZF9k
aXN0KTsNCj4+IC0gICAgcGJsa19saW5lX21ldGFfZnJlZShwYmxrKTsNCj4+ICsgICAgcGJsa19s
aW5lX21nX2ZyZWUocGJsayk7DQo+PiAgICAgIHBibGtfY29yZV9mcmVlKHBibGspOw0KPj4gICAg
ICBwYmxrX2wycF9mcmVlKHBibGspOw0KPj4gIEBAIC0xMTEwLDcgKzExMjEsNyBAQCBzdGF0aWMg
dm9pZCAqcGJsa19pbml0KHN0cnVjdCBudm1fdGd0X2RldiAqZGV2LCBzdHJ1Y3QgZ2VuZGlzayAq
dGRpc2ssDQo+PiAgZmFpbF9mcmVlX3BhZF9kaXN0Og0KPj4gICAgICBrZnJlZShwYmxrLT5wYWRf
ZGlzdCk7DQo+PiAgZmFpbF9mcmVlX2xpbmVfbWV0YToNCj4+IC0gICAgcGJsa19saW5lX21ldGFf
ZnJlZShwYmxrKTsNCj4+ICsgICAgcGJsa19saW5lX21nX2ZyZWUocGJsayk7DQo+PiAgZmFpbF9m
cmVlX2x1bnM6DQo+PiAgICAgIHBibGtfbHVuc19mcmVlKHBibGspOw0KPj4gIGZhaWw6DQo+PiBk
aWZmIC0tZ2l0IGEvZHJpdmVycy9saWdodG52bS9wYmxrLmggYi9kcml2ZXJzL2xpZ2h0bnZtL3Bi
bGsuaA0KPj4gaW5kZXggODg3MjBlMjQ0MWMwLi4yODJkZmM4NzgwZTggMTAwNjQ0DQo+PiAtLS0g
YS9kcml2ZXJzL2xpZ2h0bnZtL3BibGsuaA0KPj4gKysrIGIvZHJpdmVycy9saWdodG52bS9wYmxr
LmgNCj4+IEBAIC0yMDEsMTIgKzIwMSw2IEBAIHN0cnVjdCBwYmxrX3JiIHsNCj4+ICAgIHN0cnVj
dCBwYmxrX2x1biB7DQo+PiAgICAgIHN0cnVjdCBwcGFfYWRkciBicHBhOw0KPj4gLQ0KPj4gLSAg
ICB1OCAqYmJfbGlzdDsgICAgICAgICAgICAvKiBCYWQgYmxvY2sgbGlzdCBmb3IgTFVOLiBPbmx5
IHVzZWQgb24NCj4+IC0gICAgICAgICAgICAgICAgICAgICAqIGJyaW5nIHVwLiBCYWQgYmxvY2tz
IGFyZSBtYW5hZ2VkDQo+PiAtICAgICAgICAgICAgICAgICAgICAgKiB3aXRoaW4gbGluZXMgb24g
cnVuLXRpbWUuDQo+PiAtICAgICAgICAgICAgICAgICAgICAgKi8NCj4+IC0NCj4+ICAgICAgc3Ry
dWN0IHNlbWFwaG9yZSB3cl9zZW07DQo+PiAgfTsNCj4+ICANCj4gDQo+IExvb2tzIGdvb2QgdG8g
bWUuDQo+IA0KPiBXaXRoIHJlc3BlY3QgdG8gdGhlIDIuMCBpbXBsZW1lbnRhdGlvbi4gSSBhbSB0
aGlua2luZyB0aGF0IGdldF9iYiBhbmQgc2V0X2JiIHNob3VsZCBiZWhhdmUgaW4gdGhlIGZvbGxv
d2luZyB3YXk6DQo+IA0KPiBGb3IgZ2V0X2JiIChpbiBudm1lL2hvc3QvbGlnaHRudm0uYykNCj4g
MS4yOiBLZWVwIHRoZSBpbXBsZW1lbnRhdGlvbiBhcyBpcy4NCj4gMi4wOiBVc2UgdGhlIHJlcG9y
dCBjaHVuayBjb21tYW5kLCBhbmQgcmVkbyB0aGUgZ2V0X2JiIGZvcm1hdC4NCj4gDQo+IEZvciBz
ZXRfYmIgKGluIG52bWUvaG9zdC9saWdodG52bS5jKQ0KPiAxLjI6IEJ1c2luZXNzIGFzIHVzdWFs
DQo+IDIuMDogUmVwb3J0IGVycm9yLg0KDQpJIGhhdmUgYSBwYXRjaGVzIGFic3RyYWN0aW5nIHRo
aXMsIHdoaWNoIEkgdGhpbmsgaXQgbWFrZXMgaXQgY2xlYW5lci4gSSBjYW4gcHVzaCBpdCBuZXh0
IHdlZWsgZm9yIHJldmlldy4gSeKAmW0gdHJhdmVsaW5nIHRoaXMgd2Vlay4gKElmIHlvdSB3YW50
IHRvIGdldCBhIGdsaW1wc2UgSSBjYW4gcG9pbnQgeW91IHRvIHRoZSBjb2RlKS4gDQoNCj4gDQo+
IEZvciAyLjAsIHRoZXJlIHdpbGwgYmUgYWRkZWQgYSBmdW5jdGlvbiBwb2ludGVyIGZvciBnZXQg
cmVwb3J0IGNodW5rIGltcGxlbWVudGF0aW9uLCB3aGVyZSB0aGUgY2xpZW50IGNhbiBhc2sgZm9y
IGZ1bGwgY2h1bmsgaW5mb3JtYXRpb24uIFNpbWlsYXJseSBoZXJlLCBjbGllbnQgd2lsbCBmYWls
IHRoZSBmdW5jdGlvbiBjYWxsIGlmIHRoZSBkcml2ZSBpcyAxLjIuDQo+IA0KPiBJIGhvcGUgdG8g
cG9zdCB0aGUgc21hbGwgMi4wIGlkZW50aWZ5IGltcGxlbWVudGF0aW9uIFRvbW9ycm93IG9yIEZy
aWRheSwgYW5kIHRoZW4geW91IGNhbiBwb3N0IHRoZSByZXBvcnQgY2h1bmsgaW1wbGVtZW50YXRp
b24gdGhhdCB5b3UgaGF2ZSBkb25lIGlmIHlvdSBsaWtlDQoNClN1cmUuIEkgaGF2ZSBwYXRjaGVz
IGltcGxlbWVudGluZyAyLjAgdG9vLCBidXQgeW91IGNhbiBwdXNoIHlvdXIgdmVyc2lvbi4gT25l
IHRoaW5nIEnigJlkIGxpa2UgdG8gZ2V0IGluIGlzIGEgZ2VuZXJpYyBnZW9tZXRyeSBzdHJ1Y3R1
cmUgdGhhdCBzaW1wbGlmaWVzIHRoZSBpZGVudGlmeSBjb21tYW5kIGFuZCBwdXRzIHRoZSBsb2dp
YyBpbiB0aGUgaWRlbnRpZnkgcGF0aC4gDQoNClRoaXMgbWFrZXMgdGhlIGJhc2UgZ3JvdXAgc3Ry
dWN0dXJlIGZyb20gMS4yIGdvIGF3YXkgbWFrZXMgdGhlIDEuMiBhbmQgMi4wIHBhdGhzIGVhc2ll
ciB0byBtYWludGFpbiBhdCB0aGUgdGFyZ2V0IGxldmVsLiANCg0KSSBjYW4gcG9pbnQgeW91IHRv
IHRoZSBwYXRjaGVzIHRvbW9ycm93IGlmIHlvdSB3YW50IHRvIGFsaWduIHdpdGggaXQuIE90aGVy
d2lzZSwgSSBjYW4ganVzdCByZWJhc2UgYW5kIHNlbmQgdGhlbSBuZXh0IHdlZWsuIEJ1dCBwbGVh
c2UsIGNvbnNpZGVyIHRoaXMgYmVmb3JlIG1ha2luZyBhIG5ldyBhYnN0cmFjdGlvbiB0byBtYWtl
IHRoZSAyIHNwZWNzIGNvZXhpc3QgLSBtb3N0IG9mIHRoZSB3b3JrIGlzIGRvbmUgYWxyZWFkeS4u
Lg0KDQoNCj4gV2UgYWxzbyBuZWVkIHRvIHRha2UgYSBsb29rIGF0IHRoZSBuZXcgZXJyb3IgY29k
ZXMsIHdoaWNoIGRvZXMgbm90IGFsaWduIHdpdGggdGhlIDEuMiBzcGVjaWZpY2F0aW9uIChub3cg
dGhhdCB0aGV5IGFjdHVhbGx5IGZvbGxvdyB0aGUgbnZtZSBzcGVjaWZpY2F0aW9uIDspKQ0KDQpD
b29sLiBMZXTigJlzIG1haW50YWluIHRoZSAxLjIgZXJyb3JzIGZvciBjb21wYXRpYmlsaXR5IGFu
ZCBtYWtlIHRoZSAyLjAgcGF0aCBjbGVhbiAtIHRoYXQgaXMgbnZtZSBjb21wYXRpYmxlIDspDQoN
Ckphdmllci4g

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 5/5] lightnvm: pblk: refactor bad block identification
  2018-01-31  9:13     ` Javier Gonzalez
@ 2018-01-31 18:24       ` Matias Bjørling
  2018-02-04 10:37         ` Javier Gonzalez
  0 siblings, 1 reply; 15+ messages in thread
From: Matias Bjørling @ 2018-01-31 18:24 UTC (permalink / raw)
  To: Javier Gonzalez; +Cc: Javier González, linux-block, linux-kernel

On 01/31/2018 10:13 AM, Javier Gonzalez wrote:
> 
> 
>> On 31 Jan 2018, at 16.51, Matias Bjørling <mb@lightnvm.io> wrote:
>>
>>> On 01/31/2018 03:06 AM, Javier González wrote:
>>> In preparation for the OCSSD 2.0 spec. bad block identification,
>>> refactor the current code to generalize bad block get/set functions and
>>> structures.
>>> Signed-off-by: Javier González <javier@cnexlabs.com>
>>> ---
>>>   drivers/lightnvm/pblk-init.c | 213 +++++++++++++++++++++++--------------------
>>>   drivers/lightnvm/pblk.h      |   6 --
>>>   2 files changed, 112 insertions(+), 107 deletions(-)
>>> diff --git a/drivers/lightnvm/pblk-init.c b/drivers/lightnvm/pblk-init.c
>>> index a5e3510c0f38..86a94a7faa96 100644
>>> --- a/drivers/lightnvm/pblk-init.c
>>> +++ b/drivers/lightnvm/pblk-init.c
>>> @@ -365,7 +365,25 @@ static void pblk_luns_free(struct pblk *pblk)
>>>       kfree(pblk->luns);
>>>   }
>>>   -static void pblk_free_line_bitmaps(struct pblk_line *line)
>>> +static void pblk_line_mg_free(struct pblk *pblk)
>>> +{
>>> +    struct pblk_line_mgmt *l_mg = &pblk->l_mg;
>>> +    int i;
>>> +
>>> +    kfree(l_mg->bb_template);
>>> +    kfree(l_mg->bb_aux);
>>> +    kfree(l_mg->vsc_list);
>>> +
>>> +    for (i = 0; i < PBLK_DATA_LINES; i++) {
>>> +        kfree(l_mg->sline_meta[i]);
>>> +        pblk_mfree(l_mg->eline_meta[i]->buf, l_mg->emeta_alloc_type);
>>> +        kfree(l_mg->eline_meta[i]);
>>> +    }
>>> +
>>> +    kfree(pblk->lines);
>>> +}
>>> +
>>> +static void pblk_line_meta_free(struct pblk_line *line)
>>>   {
>>>       kfree(line->blk_bitmap);
>>>       kfree(line->erase_bitmap);
>>> @@ -382,40 +400,16 @@ static void pblk_lines_free(struct pblk *pblk)
>>>           line = &pblk->lines[i];
>>>             pblk_line_free(pblk, line);
>>> -        pblk_free_line_bitmaps(line);
>>> +        pblk_line_meta_free(line);
>>>       }
>>>       spin_unlock(&l_mg->free_lock);
>>>   }
>>>   -static void pblk_line_meta_free(struct pblk *pblk)
>>> +static int pblk_bb_get_tbl(struct nvm_tgt_dev *dev, struct pblk_lun *rlun,
>>> +               u8 *blks, int nr_blks)
>>>   {
>>> -    struct pblk_line_mgmt *l_mg = &pblk->l_mg;
>>> -    int i;
>>> -
>>> -    kfree(l_mg->bb_template);
>>> -    kfree(l_mg->bb_aux);
>>> -    kfree(l_mg->vsc_list);
>>> -
>>> -    for (i = 0; i < PBLK_DATA_LINES; i++) {
>>> -        kfree(l_mg->sline_meta[i]);
>>> -        pblk_mfree(l_mg->eline_meta[i]->buf, l_mg->emeta_alloc_type);
>>> -        kfree(l_mg->eline_meta[i]);
>>> -    }
>>> -
>>> -    kfree(pblk->lines);
>>> -}
>>> -
>>> -static int pblk_bb_discovery(struct nvm_tgt_dev *dev, struct pblk_lun *rlun)
>>> -{
>>> -    struct nvm_geo *geo = &dev->geo;
>>>       struct ppa_addr ppa;
>>> -    u8 *blks;
>>> -    int nr_blks, ret;
>>> -
>>> -    nr_blks = geo->nr_chks * geo->plane_mode;
>>> -    blks = kmalloc(nr_blks, GFP_KERNEL);
>>> -    if (!blks)
>>> -        return -ENOMEM;
>>> +    int ret;
>>>         ppa.ppa = 0;
>>>       ppa.g.ch = rlun->bppa.g.ch;
>>> @@ -423,34 +417,56 @@ static int pblk_bb_discovery(struct nvm_tgt_dev *dev, struct pblk_lun *rlun)
>>>         ret = nvm_get_tgt_bb_tbl(dev, ppa, blks);
>>>       if (ret)
>>> -        goto out;
>>> +        return ret;
>>>         nr_blks = nvm_bb_tbl_fold(dev->parent, blks, nr_blks);
>>> -    if (nr_blks < 0) {
>>> -        ret = nr_blks;
>>> -        goto out;
>>> -    }
>>> -
>>> -    rlun->bb_list = blks;
>>> +    if (nr_blks < 0)
>>> +        return -EIO;
>>>         return 0;
>>> -out:
>>> -    kfree(blks);
>>> -    return ret;
>>> +}
>>> +
>>> +static void *pblk_bb_get_log(struct pblk *pblk)
>>> +{
>>> +    struct nvm_tgt_dev *dev = pblk->dev;
>>> +    struct nvm_geo *geo = &dev->geo;
>>> +    u8 *log;
>>> +    int i, nr_blks, blk_per_lun;
>>> +    int ret;
>>> +
>>> +    blk_per_lun = geo->nr_chks * geo->plane_mode;
>>> +    nr_blks = blk_per_lun * geo->all_luns;
>>> +
>>> +    log = kmalloc(nr_blks, GFP_KERNEL);
>>> +    if (!log)
>>> +        return ERR_PTR(-ENOMEM);
>>> +
>>> +    for (i = 0; i < geo->all_luns; i++) {
>>> +        struct pblk_lun *rlun = &pblk->luns[i];
>>> +        u8 *log_pos = log + i * blk_per_lun;
>>> +
>>> +        ret = pblk_bb_get_tbl(dev, rlun, log_pos, blk_per_lun);
>>> +        if (ret) {
>>> +            kfree(log);
>>> +            return ERR_PTR(-EIO);
>>> +        }
>>> +    }
>>> +
>>> +    return log;
>>>   }
>>>     static int pblk_bb_line(struct pblk *pblk, struct pblk_line *line,
>>> -            int blk_per_line)
>>> +            u8 *bb_log, int blk_per_line)
>>>   {
>>>       struct nvm_tgt_dev *dev = pblk->dev;
>>>       struct nvm_geo *geo = &dev->geo;
>>> -    struct pblk_lun *rlun;
>>> -    int bb_cnt = 0;
>>> -    int i;
>>> +    int i, bb_cnt = 0;
>>>         for (i = 0; i < blk_per_line; i++) {
>>> -        rlun = &pblk->luns[i];
>>> -        if (rlun->bb_list[line->id] == NVM_BLK_T_FREE)
>>> +        struct pblk_lun *rlun = &pblk->luns[i];
>>> +        u8 *lun_bb_log = bb_log + i * blk_per_line;
>>> +
>>> +        if (lun_bb_log[line->id] == NVM_BLK_T_FREE)
>>>               continue;
>>>             set_bit(pblk_ppa_to_pos(geo, rlun->bppa), line->blk_bitmap);
>>> @@ -460,29 +476,12 @@ static int pblk_bb_line(struct pblk *pblk, struct pblk_line *line,
>>>       return bb_cnt;
>>>   }
>>>   -static int pblk_alloc_line_bitmaps(struct pblk *pblk, struct pblk_line *line)
>>> -{
>>> -    struct pblk_line_meta *lm = &pblk->lm;
>>> -
>>> -    line->blk_bitmap = kzalloc(lm->blk_bitmap_len, GFP_KERNEL);
>>> -    if (!line->blk_bitmap)
>>> -        return -ENOMEM;
>>> -
>>> -    line->erase_bitmap = kzalloc(lm->blk_bitmap_len, GFP_KERNEL);
>>> -    if (!line->erase_bitmap) {
>>> -        kfree(line->blk_bitmap);
>>> -        return -ENOMEM;
>>> -    }
>>> -
>>> -    return 0;
>>> -}
>>> -
>>>   static int pblk_luns_init(struct pblk *pblk, struct ppa_addr *luns)
>>>   {
>>>       struct nvm_tgt_dev *dev = pblk->dev;
>>>       struct nvm_geo *geo = &dev->geo;
>>>       struct pblk_lun *rlun;
>>> -    int i, ret;
>>> +    int i;
>>>         /* TODO: Implement unbalanced LUN support */
>>>       if (geo->nr_luns < 0) {
>>> @@ -505,13 +504,6 @@ static int pblk_luns_init(struct pblk *pblk, struct ppa_addr *luns)
>>>           rlun->bppa = luns[lunid];
>>>             sema_init(&rlun->wr_sem, 1);
>>> -
>>> -        ret = pblk_bb_discovery(dev, rlun);
>>> -        if (ret) {
>>> -            while (--i >= 0)
>>> -                kfree(pblk->luns[i].bb_list);
>>> -            return ret;
>>> -        }
>>>       }
>>>         return 0;
>>> @@ -689,6 +681,26 @@ static int pblk_lines_alloc_metadata(struct pblk *pblk)
>>>       return -ENOMEM;
>>>   }
>>>   +static int pblk_setup_line_meta(struct pblk *pblk, struct pblk_line *line,
>>> +                void *chunk_log, long *nr_bad_blks)
>>> +{
>>> +    struct pblk_line_meta *lm = &pblk->lm;
>>> +
>>> +    line->blk_bitmap = kzalloc(lm->blk_bitmap_len, GFP_KERNEL);
>>> +    if (!line->blk_bitmap)
>>> +        return -ENOMEM;
>>> +
>>> +    line->erase_bitmap = kzalloc(lm->blk_bitmap_len, GFP_KERNEL);
>>> +    if (!line->erase_bitmap) {
>>> +        kfree(line->blk_bitmap);
>>> +        return -ENOMEM;
>>> +    }
>>> +
>>> +    *nr_bad_blks = pblk_bb_line(pblk, line, chunk_log, lm->blk_per_line);
>>> +
>>> +    return 0;
>>> +}
>>> +
>>>   static int pblk_lines_init(struct pblk *pblk)
>>>   {
>>>       struct nvm_tgt_dev *dev = pblk->dev;
>>> @@ -696,8 +708,9 @@ static int pblk_lines_init(struct pblk *pblk)
>>>       struct pblk_line_mgmt *l_mg = &pblk->l_mg;
>>>       struct pblk_line_meta *lm = &pblk->lm;
>>>       struct pblk_line *line;
>>> +    void *chunk_log;
>>>       unsigned int smeta_len, emeta_len;
>>> -    long nr_bad_blks, nr_free_blks;
>>> +    long nr_bad_blks = 0, nr_free_blks = 0;
>>>       int bb_distance, max_write_ppas, mod;
>>>       int i, ret;
>>>   @@ -771,13 +784,12 @@ static int pblk_lines_init(struct pblk *pblk)
>>>       if (lm->min_blk_line > lm->blk_per_line) {
>>>           pr_err("pblk: config. not supported. Min. LUN in line:%d\n",
>>>                               lm->blk_per_line);
>>> -        ret = -EINVAL;
>>> -        goto fail;
>>> +        return -EINVAL;
>>>       }
>>>         ret = pblk_lines_alloc_metadata(pblk);
>>>       if (ret)
>>> -        goto fail;
>>> +        return ret;
>>>         l_mg->bb_template = kzalloc(lm->sec_bitmap_len, GFP_KERNEL);
>>>       if (!l_mg->bb_template) {
>>> @@ -821,9 +833,16 @@ static int pblk_lines_init(struct pblk *pblk)
>>>           goto fail_free_bb_aux;
>>>       }
>>>   -    nr_free_blks = 0;
>>> +    chunk_log = pblk_bb_get_log(pblk);
>>> +    if (IS_ERR(chunk_log)) {
>>> +        pr_err("pblk: could not get bad block log (%lu)\n",
>>> +                            PTR_ERR(chunk_log));
>>> +        ret = PTR_ERR(chunk_log);
>>> +        goto fail_free_lines;
>>> +    }
>>> +
>>>       for (i = 0; i < l_mg->nr_lines; i++) {
>>> -        int blk_in_line;
>>> +        int chk_in_line;
>>>             line = &pblk->lines[i];
>>>   @@ -835,26 +854,20 @@ static int pblk_lines_init(struct pblk *pblk)
>>>           line->vsc = &l_mg->vsc_list[i];
>>>           spin_lock_init(&line->lock);
>>>   -        ret = pblk_alloc_line_bitmaps(pblk, line);
>>> +        ret = pblk_setup_line_meta(pblk, line, chunk_log, &nr_bad_blks);
>>>           if (ret)
>>> -            goto fail_free_lines;
>>> +            goto fail_free_chunk_log;
>>>   -        nr_bad_blks = pblk_bb_line(pblk, line, lm->blk_per_line);
>>> -        if (nr_bad_blks < 0 || nr_bad_blks > lm->blk_per_line) {
>>> -            pblk_free_line_bitmaps(line);
>>> -            ret = -EINVAL;
>>> -            goto fail_free_lines;
>>> -        }
>>> -
>>> -        blk_in_line = lm->blk_per_line - nr_bad_blks;
>>> -        if (blk_in_line < lm->min_blk_line) {
>>> +        chk_in_line = lm->blk_per_line - nr_bad_blks;
>>> +        if (nr_bad_blks < 0 || nr_bad_blks > lm->blk_per_line ||
>>> +                    chk_in_line < lm->min_blk_line) {
>>>               line->state = PBLK_LINESTATE_BAD;
>>>               list_add_tail(&line->list, &l_mg->bad_list);
>>>               continue;
>>>           }
>>>   -        nr_free_blks += blk_in_line;
>>> -        atomic_set(&line->blk_in_line, blk_in_line);
>>> +        nr_free_blks += chk_in_line;
>>> +        atomic_set(&line->blk_in_line, chk_in_line);
>>>             l_mg->nr_free_lines++;
>>>           list_add_tail(&line->list, &l_mg->free_list);
>>> @@ -862,23 +875,21 @@ static int pblk_lines_init(struct pblk *pblk)
>>>         pblk_set_provision(pblk, nr_free_blks);
>>>   -    /* Cleanup per-LUN bad block lists - managed within lines on run-time */
>>> -    for (i = 0; i < geo->all_luns; i++)
>>> -        kfree(pblk->luns[i].bb_list);
>>> -
>>> +    kfree(chunk_log);
>>>       return 0;
>>> -fail_free_lines:
>>> +
>>> +fail_free_chunk_log:
>>> +    kfree(chunk_log);
>>>       while (--i >= 0)
>>> -        pblk_free_line_bitmaps(&pblk->lines[i]);
>>> +        pblk_line_meta_free(&pblk->lines[i]);
>>> +fail_free_lines:
>>> +    kfree(pblk->lines);
>>>   fail_free_bb_aux:
>>>       kfree(l_mg->bb_aux);
>>>   fail_free_bb_template:
>>>       kfree(l_mg->bb_template);
>>>   fail_free_meta:
>>> -    pblk_line_meta_free(pblk);
>>> -fail:
>>> -    for (i = 0; i < geo->all_luns; i++)
>>> -        kfree(pblk->luns[i].bb_list);
>>> +    pblk_line_mg_free(pblk);
>>>         return ret;
>>>   }
>>> @@ -922,7 +933,7 @@ static void pblk_free(struct pblk *pblk)
>>>       pblk_luns_free(pblk);
>>>       pblk_lines_free(pblk);
>>>       kfree(pblk->pad_dist);
>>> -    pblk_line_meta_free(pblk);
>>> +    pblk_line_mg_free(pblk);
>>>       pblk_core_free(pblk);
>>>       pblk_l2p_free(pblk);
>>>   @@ -1110,7 +1121,7 @@ static void *pblk_init(struct nvm_tgt_dev *dev, struct gendisk *tdisk,
>>>   fail_free_pad_dist:
>>>       kfree(pblk->pad_dist);
>>>   fail_free_line_meta:
>>> -    pblk_line_meta_free(pblk);
>>> +    pblk_line_mg_free(pblk);
>>>   fail_free_luns:
>>>       pblk_luns_free(pblk);
>>>   fail:
>>> diff --git a/drivers/lightnvm/pblk.h b/drivers/lightnvm/pblk.h
>>> index 88720e2441c0..282dfc8780e8 100644
>>> --- a/drivers/lightnvm/pblk.h
>>> +++ b/drivers/lightnvm/pblk.h
>>> @@ -201,12 +201,6 @@ struct pblk_rb {
>>>     struct pblk_lun {
>>>       struct ppa_addr bppa;
>>> -
>>> -    u8 *bb_list;            /* Bad block list for LUN. Only used on
>>> -                     * bring up. Bad blocks are managed
>>> -                     * within lines on run-time.
>>> -                     */
>>> -
>>>       struct semaphore wr_sem;
>>>   };
>>>   
>>
>> Looks good to me.
>>
>> With respect to the 2.0 implementation. I am thinking that get_bb and set_bb should behave in the following way:
>>
>> For get_bb (in nvme/host/lightnvm.c)
>> 1.2: Keep the implementation as is.
>> 2.0: Use the report chunk command, and redo the get_bb format.
>>
>> For set_bb (in nvme/host/lightnvm.c)
>> 1.2: Business as usual
>> 2.0: Report error.
> 
> I have a patches abstracting this, which I think it makes it cleaner. I can push it next week for review. I’m traveling this week. (If you want to get a glimpse I can point you to the code).
> 

Yes, please do. Thanks

>>
>> For 2.0, there will be added a function pointer for get report chunk implementation, where the client can ask for full chunk information. Similarly here, client will fail the function call if the drive is 1.2.
>>
>> I hope to post the small 2.0 identify implementation Tomorrow or Friday, and then you can post the report chunk implementation that you have done if you like
> 
> Sure. I have patches implementing 2.0 too, but you can push your version. One thing I’d like to get in is a generic geometry structure that simplifies the identify command and puts the logic in the identify path.

Agree, that comes naturally when adding the support.

> 
> This makes the base group structure from 1.2 go away makes the 1.2 and 2.0 paths easier to maintain at the target level.
> 
> I can point you to the patches tomorrow if you want to align with it. Otherwise, I can just rebase and send them next week. But please, consider this before making a new abstraction to make the 2 specs coexist - most of the work is done already...

Thanks. Yes, they should co-exist.
> 
> 
>> We also need to take a look at the new error codes, which does not align with the 1.2 specification (now that they actually follow the nvme specification ;))
> 
> Cool. Let’s maintain the 1.2 errors for compatibility and make the 2.0 path clean - that is nvme compatible ;)
> 
> Javier.
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 5/5] lightnvm: pblk: refactor bad block identification
  2018-01-31 18:24       ` Matias Bjørling
@ 2018-02-04 10:37         ` Javier Gonzalez
  2018-02-04 12:54           ` Matias Bjørling
  0 siblings, 1 reply; 15+ messages in thread
From: Javier Gonzalez @ 2018-02-04 10:37 UTC (permalink / raw)
  To: Matias Bjørling; +Cc: linux-block, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 657 bytes --]


> On 31 Jan 2018, at 19.24, Matias Bjørling <mb@lightnvm.io> wrote:
> 
> On 01/31/2018 10:13 AM, Javier Gonzalez wrote:
>>> On 31 Jan 2018, at 16.51, Matias Bjørling <mb@lightnvm.io> wrote:
>>> 
>> I have a patches abstracting this, which I think it makes it cleaner. I can push it next week for review. I’m traveling this week. (If you want to get a glimpse I can point you to the code).
> 
> Yes, please do. Thanks

This is the release candidate for 2.0 support based on 4.17. I'll rebase
on top of you 2.0 support. We'll see if all changes make it to 4.17
then.

https://github.com/OpenChannelSSD/linux/tree/for-4.17/spec20

Javier

[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 5/5] lightnvm: pblk: refactor bad block identification
  2018-02-04 10:37         ` Javier Gonzalez
@ 2018-02-04 12:54           ` Matias Bjørling
  2018-02-04 13:09             ` Javier Gonzalez
  0 siblings, 1 reply; 15+ messages in thread
From: Matias Bjørling @ 2018-02-04 12:54 UTC (permalink / raw)
  To: Javier Gonzalez; +Cc: linux-block, linux-kernel

On 02/04/2018 11:37 AM, Javier Gonzalez wrote:
> 
>> On 31 Jan 2018, at 19.24, Matias Bjørling <mb@lightnvm.io> wrote:
>>
>> On 01/31/2018 10:13 AM, Javier Gonzalez wrote:
>>>> On 31 Jan 2018, at 16.51, Matias Bjørling <mb@lightnvm.io> wrote:
>>>>
>>> I have a patches abstracting this, which I think it makes it cleaner. I can push it next week for review. I’m traveling this week. (If you want to get a glimpse I can point you to the code).
>>
>> Yes, please do. Thanks
> 
> This is the release candidate for 2.0 support based on 4.17. I'll rebase
> on top of you 2.0 support. We'll see if all changes make it to 4.17
> then.
> 
> https://github.com/OpenChannelSSD/linux/tree/for-4.17/spec20
> 
> Javier
> 

Great. I look forward to be patches being cleaned up and posted. I do 
see some nitpicks here and there, which we properly can take a couple of 
stabs at.

One think that generally stands out to me is the "if 1.2 support", else, 
... statements. These could be structured better by having dedicated 
setup functions for 1.2 and 2.0.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 5/5] lightnvm: pblk: refactor bad block identification
  2018-02-04 12:54           ` Matias Bjørling
@ 2018-02-04 13:09             ` Javier Gonzalez
  0 siblings, 0 replies; 15+ messages in thread
From: Javier Gonzalez @ 2018-02-04 13:09 UTC (permalink / raw)
  To: Matias Bjørling; +Cc: linux-block, linux-kernel

DQo+IE9uIDQgRmViIDIwMTgsIGF0IDEzLjU1LCBNYXRpYXMgQmrDuHJsaW5nIDxtYkBsaWdodG52
bS5pbz4gd3JvdGU6DQo+IA0KPiBPbiAwMi8wNC8yMDE4IDExOjM3IEFNLCBKYXZpZXIgR29uemFs
ZXogd3JvdGU6DQo+Pj4gT24gMzEgSmFuIDIwMTgsIGF0IDE5LjI0LCBNYXRpYXMgQmrDuHJsaW5n
IDxtYkBsaWdodG52bS5pbz4gd3JvdGU6DQo+Pj4gDQo+Pj4gT24gMDEvMzEvMjAxOCAxMDoxMyBB
TSwgSmF2aWVyIEdvbnphbGV6IHdyb3RlOg0KPj4+Pj4gT24gMzEgSmFuIDIwMTgsIGF0IDE2LjUx
LCBNYXRpYXMgQmrDuHJsaW5nIDxtYkBsaWdodG52bS5pbz4gd3JvdGU6DQo+Pj4+PiANCj4+Pj4g
SSBoYXZlIGEgcGF0Y2hlcyBhYnN0cmFjdGluZyB0aGlzLCB3aGljaCBJIHRoaW5rIGl0IG1ha2Vz
IGl0IGNsZWFuZXIuIEkgY2FuIHB1c2ggaXQgbmV4dCB3ZWVrIGZvciByZXZpZXcuIEnigJltIHRy
YXZlbGluZyB0aGlzIHdlZWsuIChJZiB5b3Ugd2FudCB0byBnZXQgYSBnbGltcHNlIEkgY2FuIHBv
aW50IHlvdSB0byB0aGUgY29kZSkuDQo+Pj4gDQo+Pj4gWWVzLCBwbGVhc2UgZG8uIFRoYW5rcw0K
Pj4gVGhpcyBpcyB0aGUgcmVsZWFzZSBjYW5kaWRhdGUgZm9yIDIuMCBzdXBwb3J0IGJhc2VkIG9u
IDQuMTcuIEknbGwgcmViYXNlDQo+PiBvbiB0b3Agb2YgeW91IDIuMCBzdXBwb3J0LiBXZSdsbCBz
ZWUgaWYgYWxsIGNoYW5nZXMgbWFrZSBpdCB0byA0LjE3DQo+PiB0aGVuLg0KPj4gaHR0cHM6Ly9n
aXRodWIuY29tL09wZW5DaGFubmVsU1NEL2xpbnV4L3RyZWUvZm9yLTQuMTcvc3BlYzIwDQo+PiBK
YXZpZXINCj4gDQo+IEdyZWF0LiBJIGxvb2sgZm9yd2FyZCB0byBiZSBwYXRjaGVzIGJlaW5nIGNs
ZWFuZWQgdXAgYW5kIHBvc3RlZC4gSSBkbyBzZWUgc29tZSBuaXRwaWNrcyBoZXJlIGFuZCB0aGVy
ZSwgd2hpY2ggd2UgcHJvcGVybHkgY2FuIHRha2UgYSBjb3VwbGUgb2Ygc3RhYnMgYXQuDQoNClN1
cmUuIFRoaXMgaXMgc3RpbGwgaW4gZGV2ZWxvcG1lbnQ7IGp1c3Qgd2FudGVkIHRvIHBvaW50IHRv
IHRoZSBhYnN0cmFjdGlvbnMgSeKAmW0gdGhpbmtpbmcgb2Ygc28gdGhhdCB3ZSBkb27igJl0IGRv
IHRoZSBzYW1lIHdvcmsgdHdpY2UuIA0KDQpJ4oCZbGwgd2FpdCBmb3IgcG9zdGluZyB1bnRpbCB5
b3UgZG8gdGhlIDIuMCBpZGVudGlmeSwgc2luY2UgdGhlIG9sZCB2ZXJzaW9uIGlzIGltcGxlbWVu
dGVkIG9uIHRoZSBmaXJzdCBwYXRjaCBvZiB0aGlzIHNlcmllcy4gDQoNCj4gT25lIHRoaW5rIHRo
YXQgZ2VuZXJhbGx5IHN0YW5kcyBvdXQgdG8gbWUgaXMgdGhlICJpZiAxLjIgc3VwcG9ydCIsIGVs
c2UsIC4uLiBzdGF0ZW1lbnRzLiBUaGVzZSBjb3VsZCBiZSBzdHJ1Y3R1cmVkIGJldHRlciBieSBo
YXZpbmcgZGVkaWNhdGVkIHNldHVwIGZ1bmN0aW9ucyBmb3IgMS4yIGFuZCAyLjAuDQoNCldlIGhh
dmUgdGhpcyBjb25zdHJ1Y3Rpb24gYm90aCBpbiBwYmxrIGFuZCBpbiBjb3JlIGZvciBhZGRyZXNz
IHRyYW5zbGF0aW9uLiBOb3RlIHRoYXQgd2UgbmVlZCB0byBoYXZlIHRoZW0gc2VwYXJhdGVkIHRv
IHN1cHBvcnQgbXVsdGkgaW5zdGFuY2UgYW5kIGtlZXAgY2hhbm5lbHMgZGVjb3VwbGVkIGZyb20g
ZWFjaCBpbnN0YW5jZS4gDQoNCkkgYXNzdW1lIDIgaWYuLi50aGVuIGlzIGNoZWFwZXIgdGhhbiBk
b2luZyAyIGRlLXJlZmVyZW5jZXMgdG8gZnVuY3Rpb24gcG9pbnRlcnMuIFRoaXMgaXMgdGhlIHdh
eSBpdCBpcyBkb25lIG9uIGxlZ2FjeSBwYXRocyBpbiBvdGhlciBwbGFjZXMgKGUuZy4sIG5vbiBt
cSBzY3NpKSwgYnV0IEkgY2FuIGxvb2sgaW50byBob3cgcG9pbnRlciBmdW5jdGlvbnMgd291bGQg
bG9vayBsaWtlIGFuZCBtZWFzdXJlIHRoZSBwZXJmb3JtYW5jZSBpbXBhY3QuIA0KDQpKYXZpZXI=

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/5] lightnvm: pblk: add padding distribution sysfs attribute
  2018-01-31  8:44   ` Matias Bjørling
@ 2018-02-06  9:27     ` Hans Holmberg
  2018-02-06 10:50       ` Matias Bjørling
  0 siblings, 1 reply; 15+ messages in thread
From: Hans Holmberg @ 2018-02-06  9:27 UTC (permalink / raw)
  To: Matias Bjørling
  Cc: Javier González, linux-block, Linux Kernel Mailing List,
	Hans Holmberg, Javier González

Hi Matias,

On Wed, Jan 31, 2018 at 9:44 AM, Matias Bj=C3=B8rling <mb@lightnvm.io> wrot=
e:
> On 01/31/2018 03:06 AM, Javier Gonz=C3=A1lez wrote:
>>
>> From: Hans Holmberg <hans.holmberg@cnexlabs.com>
>>
>> When pblk receives a sync, all data up to that point in the write buffer
>> must be comitted to persistent storage, and as flash memory comes with a
>> minimal write size there is a significant cost involved both in terms
>> of time for completing the sync and in terms of write amplification
>> padded sectors for filling up to the minimal write size.
>>
>> In order to get a better understanding of the costs involved for syncs,
>> Add a sysfs attribute to pblk: padded_dist, showing a normalized
>> distribution of sectors padded. In order to facilitate measurements of
>> specific workloads during the lifetime of the pblk instance, the
>> distribution can be reset by writing 0 to the attribute.
>>
>> Do this by introducing counters for each possible padding:
>> {0..(minimal write size - 1)} and calculate the normalized distribution
>> when showing the attribute.
>>
>> Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com>
>> Signed-off-by: Javier Gonz=C3=A1lez <javier@cnexlabs.com>
>> ---
>>   drivers/lightnvm/pblk-init.c  | 16 ++++++++--
>>   drivers/lightnvm/pblk-rb.c    | 15 +++++-----
>>   drivers/lightnvm/pblk-sysfs.c | 68
>> +++++++++++++++++++++++++++++++++++++++++++
>>   drivers/lightnvm/pblk.h       |  6 +++-
>>   4 files changed, 95 insertions(+), 10 deletions(-)
>>
>> diff --git a/drivers/lightnvm/pblk-init.c b/drivers/lightnvm/pblk-init.c
>> index 7eedc5daebc8..a5e3510c0f38 100644
>> --- a/drivers/lightnvm/pblk-init.c
>> +++ b/drivers/lightnvm/pblk-init.c
>> @@ -921,6 +921,7 @@ static void pblk_free(struct pblk *pblk)
>>   {
>>         pblk_luns_free(pblk);
>>         pblk_lines_free(pblk);
>> +       kfree(pblk->pad_dist);
>>         pblk_line_meta_free(pblk);
>>         pblk_core_free(pblk);
>>         pblk_l2p_free(pblk);
>> @@ -998,11 +999,13 @@ static void *pblk_init(struct nvm_tgt_dev *dev,
>> struct gendisk *tdisk,
>>         pblk->pad_rst_wa =3D 0;
>>         pblk->gc_rst_wa =3D 0;
>>   +     atomic_long_set(&pblk->nr_flush, 0);
>> +       pblk->nr_flush_rst =3D 0;
>> +
>>   #ifdef CONFIG_NVM_DEBUG
>>         atomic_long_set(&pblk->inflight_writes, 0);
>>         atomic_long_set(&pblk->padded_writes, 0);
>>         atomic_long_set(&pblk->padded_wb, 0);
>> -       atomic_long_set(&pblk->nr_flush, 0);
>>         atomic_long_set(&pblk->req_writes, 0);
>>         atomic_long_set(&pblk->sub_writes, 0);
>>         atomic_long_set(&pblk->sync_writes, 0);
>> @@ -1034,10 +1037,17 @@ static void *pblk_init(struct nvm_tgt_dev *dev,
>> struct gendisk *tdisk,
>>                 goto fail_free_luns;
>>         }
>>   +     pblk->pad_dist =3D kzalloc((pblk->min_write_pgs - 1) *
>> sizeof(atomic64_t),
>> +                                GFP_KERNEL);
>> +       if (!pblk->pad_dist) {
>> +               ret =3D -ENOMEM;
>> +               goto fail_free_line_meta;
>> +       }
>> +
>>         ret =3D pblk_core_init(pblk);
>>         if (ret) {
>>                 pr_err("pblk: could not initialize core\n");
>> -               goto fail_free_line_meta;
>> +               goto fail_free_pad_dist;
>>         }
>>         ret =3D pblk_l2p_init(pblk);
>> @@ -1097,6 +1107,8 @@ static void *pblk_init(struct nvm_tgt_dev *dev,
>> struct gendisk *tdisk,
>>         pblk_l2p_free(pblk);
>>   fail_free_core:
>>         pblk_core_free(pblk);
>> +fail_free_pad_dist:
>> +       kfree(pblk->pad_dist);
>>   fail_free_line_meta:
>>         pblk_line_meta_free(pblk);
>>   fail_free_luns:
>> diff --git a/drivers/lightnvm/pblk-rb.c b/drivers/lightnvm/pblk-rb.c
>> index 7044b5599cc4..264372d7b537 100644
>> --- a/drivers/lightnvm/pblk-rb.c
>> +++ b/drivers/lightnvm/pblk-rb.c
>> @@ -437,9 +437,7 @@ static int pblk_rb_may_write_flush(struct pblk_rb *r=
b,
>> unsigned int nr_entries,
>>         if (bio->bi_opf & REQ_PREFLUSH) {
>>                 struct pblk *pblk =3D container_of(rb, struct pblk, rwb)=
;
>>   -#ifdef CONFIG_NVM_DEBUG
>>                 atomic_long_inc(&pblk->nr_flush);
>> -#endif
>>                 if (pblk_rb_flush_point_set(&pblk->rwb, bio, mem))
>>                         *io_ret =3D NVM_IO_OK;
>>         }
>> @@ -620,14 +618,17 @@ unsigned int pblk_rb_read_to_bio(struct pblk_rb *r=
b,
>> struct nvm_rq *rqd,
>>                         pr_err("pblk: could not pad page in write bio\n"=
);
>>                         return NVM_IO_ERR;
>>                 }
>> +
>> +               if (pad < pblk->min_write_pgs)
>> +                       atomic64_inc(&pblk->pad_dist[pad - 1]);
>> +               else
>> +                       pr_warn("pblk: padding more than min. sectors\n"=
);
>
>
> Curious, when would this happen? Would it be an implementation error or
> something a user did wrong?

This would be an implementation error - and this check is just a
safeguard to make sure we won't go out of bounds when updating the
statistics.

>
>
>> +
>> +               atomic64_add(pad, &pblk->pad_wa);
>>         }
>>   -     atomic64_add(pad, &((struct pblk *)
>> -                       (container_of(rb, struct pblk, rwb)))->pad_wa);
>> -
>>   #ifdef CONFIG_NVM_DEBUG
>> -       atomic_long_add(pad, &((struct pblk *)
>> -                       (container_of(rb, struct pblk,
>> rwb)))->padded_writes);
>> +       atomic_long_add(pad, &pblk->padded_writes);
>>   #endif
>>         return NVM_IO_OK;
>> diff --git a/drivers/lightnvm/pblk-sysfs.c b/drivers/lightnvm/pblk-sysfs=
.c
>> index 4804bbd32d5f..f902ac00d071 100644
>> --- a/drivers/lightnvm/pblk-sysfs.c
>> +++ b/drivers/lightnvm/pblk-sysfs.c
>> @@ -341,6 +341,38 @@ static ssize_t pblk_sysfs_get_write_amp_trip(struct
>> pblk *pblk, char *page)
>>                 atomic64_read(&pblk->pad_wa) - pblk->pad_rst_wa, page);
>>   }
>>   +static ssize_t pblk_sysfs_get_padding_dist(struct pblk *pblk, char
>> *page)
>> +{
>> +       int sz =3D 0;
>> +       unsigned long long total;
>> +       unsigned long long total_buckets =3D 0;
>> +       int buckets =3D pblk->min_write_pgs - 1;
>> +       int i;
>> +
>> +       total =3D atomic64_read(&pblk->nr_flush) - pblk->nr_flush_rst;
>> +
>> +       for (i =3D 0; i < buckets; i++)
>> +               total_buckets +=3D atomic64_read(&pblk->pad_dist[i]);
>> +
>
>
> total_buckets are first used later. The calculation could be moved below =
the
> next statement.

I saw that you fixed this on the core branch, thanks.

>
>
>> +       if (!total) {
>> +               for (i =3D 0; i < (buckets + 1); i++)
>> +                       sz +=3D snprintf(page + sz, PAGE_SIZE - sz,
>> +                               "%d:0 ", i);
>> +               sz +=3D snprintf(page + sz, PAGE_SIZE - sz, "\n");
>> +
>> +               return sz;
>> +       }
>> +
>> +       sz +=3D snprintf(page + sz, PAGE_SIZE - sz, "0:%lld%% ",
>> +               ((total - total_buckets) * 100) / total);
>> +       for (i =3D 0; i < buckets; i++)
>> +               sz +=3D snprintf(page + sz, PAGE_SIZE - sz, "%d:%lld%% "=
, i
>> + 1,
>> +                       (atomic64_read(&pblk->pad_dist[i]) * 100) /
>> total);
>> +       sz +=3D snprintf(page + sz, PAGE_SIZE - sz, "\n");
>> +
>> +       return sz;
>> +}
>> +
>>   #ifdef CONFIG_NVM_DEBUG
>>   static ssize_t pblk_sysfs_stats_debug(struct pblk *pblk, char *page)
>>   {
>> @@ -427,6 +459,32 @@ static ssize_t pblk_sysfs_set_write_amp_trip(struct
>> pblk *pblk,
>>   }
>>     +static ssize_t pblk_sysfs_set_padding_dist(struct pblk *pblk,
>> +                       const char *page, size_t len)
>> +{
>> +       size_t c_len;
>> +       int reset_value;
>> +       int buckets =3D pblk->min_write_pgs - 1;
>> +       int i;
>> +
>> +       c_len =3D strcspn(page, "\n");
>> +       if (c_len >=3D len)
>> +               return -EINVAL;
>> +
>> +       if (kstrtouint(page, 0, &reset_value))
>> +               return -EINVAL;
>> +
>> +       if (reset_value !=3D  0)
>> +               return -EINVAL;
>> +
>> +       for (i =3D 0; i < buckets; i++)
>> +               atomic64_set(&pblk->pad_dist[i], 0);
>> +
>> +       pblk->nr_flush_rst =3D atomic64_read(&pblk->nr_flush);
>> +
>> +       return len;
>> +}
>> +
>>   static struct attribute sys_write_luns =3D {
>>         .name =3D "write_luns",
>>         .mode =3D 0444,
>> @@ -487,6 +545,11 @@ static struct attribute sys_write_amp_trip =3D {
>>         .mode =3D 0644,
>>   };
>>   +static struct attribute sys_padding_dist =3D {
>> +       .name =3D "padding_dist",
>> +       .mode =3D 0644,
>> +};
>> +
>>   #ifdef CONFIG_NVM_DEBUG
>>   static struct attribute sys_stats_debug_attr =3D {
>>         .name =3D "stats",
>> @@ -507,6 +570,7 @@ static struct attribute *pblk_attrs[] =3D {
>>         &sys_lines_info_attr,
>>         &sys_write_amp_mileage,
>>         &sys_write_amp_trip,
>> +       &sys_padding_dist,
>>   #ifdef CONFIG_NVM_DEBUG
>>         &sys_stats_debug_attr,
>>   #endif
>> @@ -540,6 +604,8 @@ static ssize_t pblk_sysfs_show(struct kobject *kobj,
>> struct attribute *attr,
>>                 return pblk_sysfs_get_write_amp_mileage(pblk, buf);
>>         else if (strcmp(attr->name, "write_amp_trip") =3D=3D 0)
>>                 return pblk_sysfs_get_write_amp_trip(pblk, buf);
>> +       else if (strcmp(attr->name, "padding_dist") =3D=3D 0)
>> +               return pblk_sysfs_get_padding_dist(pblk, buf);
>>   #ifdef CONFIG_NVM_DEBUG
>>         else if (strcmp(attr->name, "stats") =3D=3D 0)
>>                 return pblk_sysfs_stats_debug(pblk, buf);
>> @@ -558,6 +624,8 @@ static ssize_t pblk_sysfs_store(struct kobject *kobj=
,
>> struct attribute *attr,
>>                 return pblk_sysfs_set_sec_per_write(pblk, buf, len);
>>         else if (strcmp(attr->name, "write_amp_trip") =3D=3D 0)
>>                 return pblk_sysfs_set_write_amp_trip(pblk, buf, len);
>> +       else if (strcmp(attr->name, "padding_dist") =3D=3D 0)
>> +               return pblk_sysfs_set_padding_dist(pblk, buf, len);
>>         return 0;
>>   }
>>   diff --git a/drivers/lightnvm/pblk.h b/drivers/lightnvm/pblk.h
>> index 4b7d8618631f..88720e2441c0 100644
>> --- a/drivers/lightnvm/pblk.h
>> +++ b/drivers/lightnvm/pblk.h
>> @@ -626,12 +626,16 @@ struct pblk {
>>         u64 gc_rst_wa;
>>         u64 pad_rst_wa;
>>   +     /* Counters used for calculating padding distribution */
>> +       atomic64_t *pad_dist;           /* Padding distribution buckets =
*/
>> +       u64 nr_flush_rst;               /* Flushes reset value for pad
>> dist.*/
>> +       atomic_long_t nr_flush;         /* Number of flush/fua I/O */
>> +
>>   #ifdef CONFIG_NVM_DEBUG
>>         /* Non-persistent debug counters, 4kb sector I/Os */
>>         atomic_long_t inflight_writes;  /* Inflight writes (user and gc)
>> */
>>         atomic_long_t padded_writes;    /* Sectors padded due to flush/f=
ua
>> */
>>         atomic_long_t padded_wb;        /* Sectors padded in write buffe=
r
>> */
>> -       atomic_long_t nr_flush;         /* Number of flush/fua I/O */
>>         atomic_long_t req_writes;       /* Sectors stored on write buffe=
r
>> */
>>         atomic_long_t sub_writes;       /* Sectors submitted from buffer
>> */
>>         atomic_long_t sync_writes;      /* Sectors synced to media */
>>
>
> Looks good to me. Let me know if you want to send a new patch, or if I ma=
y
> make the change when I pick it up.

Thanks for the review, i saw that you already reworked the patch a bit
on your core branch.
However, i found a build issue when building for i386, so i'll fix
that and submit a V2(that includes your update)

> Also, should the padding be stored in the on-disk metadata as well?

I don't see the need to do this, as I believe one would only be
interested in measuring the padding distribution on specific workloads
- and this would not require persisting the counters.

Thanks, Hans

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/5] lightnvm: pblk: add padding distribution sysfs attribute
  2018-02-06  9:27     ` Hans Holmberg
@ 2018-02-06 10:50       ` Matias Bjørling
  2018-02-06 11:28         ` Hans Holmberg
  0 siblings, 1 reply; 15+ messages in thread
From: Matias Bjørling @ 2018-02-06 10:50 UTC (permalink / raw)
  To: Hans Holmberg
  Cc: Javier González, linux-block, Linux Kernel Mailing List,
	Hans Holmberg, Javier González

On 02/06/2018 10:27 AM, Hans Holmberg wrote:
> Hi Matias,
> 
> On Wed, Jan 31, 2018 at 9:44 AM, Matias Bjørling <mb@lightnvm.io> wrote:
>> On 01/31/2018 03:06 AM, Javier González wrote:
>>>
>>> From: Hans Holmberg <hans.holmberg@cnexlabs.com>
>>>
>>> When pblk receives a sync, all data up to that point in the write buffer
>>> must be comitted to persistent storage, and as flash memory comes with a
>>> minimal write size there is a significant cost involved both in terms
>>> of time for completing the sync and in terms of write amplification
>>> padded sectors for filling up to the minimal write size.
>>>
>>> In order to get a better understanding of the costs involved for syncs,
>>> Add a sysfs attribute to pblk: padded_dist, showing a normalized
>>> distribution of sectors padded. In order to facilitate measurements of
>>> specific workloads during the lifetime of the pblk instance, the
>>> distribution can be reset by writing 0 to the attribute.
>>>
>>> Do this by introducing counters for each possible padding:
>>> {0..(minimal write size - 1)} and calculate the normalized distribution
>>> when showing the attribute.
>>>
>>> Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com>
>>> Signed-off-by: Javier González <javier@cnexlabs.com>
>>> ---
>>>    drivers/lightnvm/pblk-init.c  | 16 ++++++++--
>>>    drivers/lightnvm/pblk-rb.c    | 15 +++++-----
>>>    drivers/lightnvm/pblk-sysfs.c | 68
>>> +++++++++++++++++++++++++++++++++++++++++++
>>>    drivers/lightnvm/pblk.h       |  6 +++-
>>>    4 files changed, 95 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/drivers/lightnvm/pblk-init.c b/drivers/lightnvm/pblk-init.c
>>> index 7eedc5daebc8..a5e3510c0f38 100644
>>> --- a/drivers/lightnvm/pblk-init.c
>>> +++ b/drivers/lightnvm/pblk-init.c
>>> @@ -921,6 +921,7 @@ static void pblk_free(struct pblk *pblk)
>>>    {
>>>          pblk_luns_free(pblk);
>>>          pblk_lines_free(pblk);
>>> +       kfree(pblk->pad_dist);
>>>          pblk_line_meta_free(pblk);
>>>          pblk_core_free(pblk);
>>>          pblk_l2p_free(pblk);
>>> @@ -998,11 +999,13 @@ static void *pblk_init(struct nvm_tgt_dev *dev,
>>> struct gendisk *tdisk,
>>>          pblk->pad_rst_wa = 0;
>>>          pblk->gc_rst_wa = 0;
>>>    +     atomic_long_set(&pblk->nr_flush, 0);
>>> +       pblk->nr_flush_rst = 0;
>>> +
>>>    #ifdef CONFIG_NVM_DEBUG
>>>          atomic_long_set(&pblk->inflight_writes, 0);
>>>          atomic_long_set(&pblk->padded_writes, 0);
>>>          atomic_long_set(&pblk->padded_wb, 0);
>>> -       atomic_long_set(&pblk->nr_flush, 0);
>>>          atomic_long_set(&pblk->req_writes, 0);
>>>          atomic_long_set(&pblk->sub_writes, 0);
>>>          atomic_long_set(&pblk->sync_writes, 0);
>>> @@ -1034,10 +1037,17 @@ static void *pblk_init(struct nvm_tgt_dev *dev,
>>> struct gendisk *tdisk,
>>>                  goto fail_free_luns;
>>>          }
>>>    +     pblk->pad_dist = kzalloc((pblk->min_write_pgs - 1) *
>>> sizeof(atomic64_t),
>>> +                                GFP_KERNEL);
>>> +       if (!pblk->pad_dist) {
>>> +               ret = -ENOMEM;
>>> +               goto fail_free_line_meta;
>>> +       }
>>> +
>>>          ret = pblk_core_init(pblk);
>>>          if (ret) {
>>>                  pr_err("pblk: could not initialize core\n");
>>> -               goto fail_free_line_meta;
>>> +               goto fail_free_pad_dist;
>>>          }
>>>          ret = pblk_l2p_init(pblk);
>>> @@ -1097,6 +1107,8 @@ static void *pblk_init(struct nvm_tgt_dev *dev,
>>> struct gendisk *tdisk,
>>>          pblk_l2p_free(pblk);
>>>    fail_free_core:
>>>          pblk_core_free(pblk);
>>> +fail_free_pad_dist:
>>> +       kfree(pblk->pad_dist);
>>>    fail_free_line_meta:
>>>          pblk_line_meta_free(pblk);
>>>    fail_free_luns:
>>> diff --git a/drivers/lightnvm/pblk-rb.c b/drivers/lightnvm/pblk-rb.c
>>> index 7044b5599cc4..264372d7b537 100644
>>> --- a/drivers/lightnvm/pblk-rb.c
>>> +++ b/drivers/lightnvm/pblk-rb.c
>>> @@ -437,9 +437,7 @@ static int pblk_rb_may_write_flush(struct pblk_rb *rb,
>>> unsigned int nr_entries,
>>>          if (bio->bi_opf & REQ_PREFLUSH) {
>>>                  struct pblk *pblk = container_of(rb, struct pblk, rwb);
>>>    -#ifdef CONFIG_NVM_DEBUG
>>>                  atomic_long_inc(&pblk->nr_flush);
>>> -#endif
>>>                  if (pblk_rb_flush_point_set(&pblk->rwb, bio, mem))
>>>                          *io_ret = NVM_IO_OK;
>>>          }
>>> @@ -620,14 +618,17 @@ unsigned int pblk_rb_read_to_bio(struct pblk_rb *rb,
>>> struct nvm_rq *rqd,
>>>                          pr_err("pblk: could not pad page in write bio\n");
>>>                          return NVM_IO_ERR;
>>>                  }
>>> +
>>> +               if (pad < pblk->min_write_pgs)
>>> +                       atomic64_inc(&pblk->pad_dist[pad - 1]);
>>> +               else
>>> +                       pr_warn("pblk: padding more than min. sectors\n");
>>
>>
>> Curious, when would this happen? Would it be an implementation error or
>> something a user did wrong?
> 
> This would be an implementation error - and this check is just a
> safeguard to make sure we won't go out of bounds when updating the
> statistics.
> 

Ah, does it make sense to have such defensive programming, when the 
value is never persisted? An 64 bit integer is quite large, and if only 
used for workloads for a single instance, it would practically never 
trigger, and if it did, do one care?

<snip>

>>
>> Looks good to me. Let me know if you want to send a new patch, or if I may
>> make the change when I pick it up.
> 
> Thanks for the review, i saw that you already reworked the patch a bit
> on your core branch.
> However, i found a build issue when building for i386, so i'll fix
> that and submit a V2(that includes your update)

Ok. I assume this is the kbuild mail I asked about a couple of days ago. 
I'll wait for v2.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/5] lightnvm: pblk: add padding distribution sysfs attribute
  2018-02-06 10:50       ` Matias Bjørling
@ 2018-02-06 11:28         ` Hans Holmberg
  0 siblings, 0 replies; 15+ messages in thread
From: Hans Holmberg @ 2018-02-06 11:28 UTC (permalink / raw)
  To: Matias Bjørling
  Cc: Javier González, linux-block, Linux Kernel Mailing List,
	Hans Holmberg, Javier González

On Tue, Feb 6, 2018 at 11:50 AM, Matias Bj=C3=B8rling <mb@lightnvm.io> wrot=
e:
> On 02/06/2018 10:27 AM, Hans Holmberg wrote:
>>
>> Hi Matias,
>>
>> On Wed, Jan 31, 2018 at 9:44 AM, Matias Bj=C3=B8rling <mb@lightnvm.io> w=
rote:
>>>
>>> On 01/31/2018 03:06 AM, Javier Gonz=C3=A1lez wrote:
>>>>
>>>>
>>>> From: Hans Holmberg <hans.holmberg@cnexlabs.com>
>>>>
>>>> When pblk receives a sync, all data up to that point in the write buff=
er
>>>> must be comitted to persistent storage, and as flash memory comes with=
 a
>>>> minimal write size there is a significant cost involved both in terms
>>>> of time for completing the sync and in terms of write amplification
>>>> padded sectors for filling up to the minimal write size.
>>>>
>>>> In order to get a better understanding of the costs involved for syncs=
,
>>>> Add a sysfs attribute to pblk: padded_dist, showing a normalized
>>>> distribution of sectors padded. In order to facilitate measurements of
>>>> specific workloads during the lifetime of the pblk instance, the
>>>> distribution can be reset by writing 0 to the attribute.
>>>>
>>>> Do this by introducing counters for each possible padding:
>>>> {0..(minimal write size - 1)} and calculate the normalized distributio=
n
>>>> when showing the attribute.
>>>>
>>>> Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com>
>>>> Signed-off-by: Javier Gonz=C3=A1lez <javier@cnexlabs.com>
>>>> ---
>>>>    drivers/lightnvm/pblk-init.c  | 16 ++++++++--
>>>>    drivers/lightnvm/pblk-rb.c    | 15 +++++-----
>>>>    drivers/lightnvm/pblk-sysfs.c | 68
>>>> +++++++++++++++++++++++++++++++++++++++++++
>>>>    drivers/lightnvm/pblk.h       |  6 +++-
>>>>    4 files changed, 95 insertions(+), 10 deletions(-)
>>>>
>>>> diff --git a/drivers/lightnvm/pblk-init.c b/drivers/lightnvm/pblk-init=
.c
>>>> index 7eedc5daebc8..a5e3510c0f38 100644
>>>> --- a/drivers/lightnvm/pblk-init.c
>>>> +++ b/drivers/lightnvm/pblk-init.c
>>>> @@ -921,6 +921,7 @@ static void pblk_free(struct pblk *pblk)
>>>>    {
>>>>          pblk_luns_free(pblk);
>>>>          pblk_lines_free(pblk);
>>>> +       kfree(pblk->pad_dist);
>>>>          pblk_line_meta_free(pblk);
>>>>          pblk_core_free(pblk);
>>>>          pblk_l2p_free(pblk);
>>>> @@ -998,11 +999,13 @@ static void *pblk_init(struct nvm_tgt_dev *dev,
>>>> struct gendisk *tdisk,
>>>>          pblk->pad_rst_wa =3D 0;
>>>>          pblk->gc_rst_wa =3D 0;
>>>>    +     atomic_long_set(&pblk->nr_flush, 0);
>>>> +       pblk->nr_flush_rst =3D 0;
>>>> +
>>>>    #ifdef CONFIG_NVM_DEBUG
>>>>          atomic_long_set(&pblk->inflight_writes, 0);
>>>>          atomic_long_set(&pblk->padded_writes, 0);
>>>>          atomic_long_set(&pblk->padded_wb, 0);
>>>> -       atomic_long_set(&pblk->nr_flush, 0);
>>>>          atomic_long_set(&pblk->req_writes, 0);
>>>>          atomic_long_set(&pblk->sub_writes, 0);
>>>>          atomic_long_set(&pblk->sync_writes, 0);
>>>> @@ -1034,10 +1037,17 @@ static void *pblk_init(struct nvm_tgt_dev *dev=
,
>>>> struct gendisk *tdisk,
>>>>                  goto fail_free_luns;
>>>>          }
>>>>    +     pblk->pad_dist =3D kzalloc((pblk->min_write_pgs - 1) *
>>>> sizeof(atomic64_t),
>>>> +                                GFP_KERNEL);
>>>> +       if (!pblk->pad_dist) {
>>>> +               ret =3D -ENOMEM;
>>>> +               goto fail_free_line_meta;
>>>> +       }
>>>> +
>>>>          ret =3D pblk_core_init(pblk);
>>>>          if (ret) {
>>>>                  pr_err("pblk: could not initialize core\n");
>>>> -               goto fail_free_line_meta;
>>>> +               goto fail_free_pad_dist;
>>>>          }
>>>>          ret =3D pblk_l2p_init(pblk);
>>>> @@ -1097,6 +1107,8 @@ static void *pblk_init(struct nvm_tgt_dev *dev,
>>>> struct gendisk *tdisk,
>>>>          pblk_l2p_free(pblk);
>>>>    fail_free_core:
>>>>          pblk_core_free(pblk);
>>>> +fail_free_pad_dist:
>>>> +       kfree(pblk->pad_dist);
>>>>    fail_free_line_meta:
>>>>          pblk_line_meta_free(pblk);
>>>>    fail_free_luns:
>>>> diff --git a/drivers/lightnvm/pblk-rb.c b/drivers/lightnvm/pblk-rb.c
>>>> index 7044b5599cc4..264372d7b537 100644
>>>> --- a/drivers/lightnvm/pblk-rb.c
>>>> +++ b/drivers/lightnvm/pblk-rb.c
>>>> @@ -437,9 +437,7 @@ static int pblk_rb_may_write_flush(struct pblk_rb
>>>> *rb,
>>>> unsigned int nr_entries,
>>>>          if (bio->bi_opf & REQ_PREFLUSH) {
>>>>                  struct pblk *pblk =3D container_of(rb, struct pblk, r=
wb);
>>>>    -#ifdef CONFIG_NVM_DEBUG
>>>>                  atomic_long_inc(&pblk->nr_flush);
>>>> -#endif
>>>>                  if (pblk_rb_flush_point_set(&pblk->rwb, bio, mem))
>>>>                          *io_ret =3D NVM_IO_OK;
>>>>          }
>>>> @@ -620,14 +618,17 @@ unsigned int pblk_rb_read_to_bio(struct pblk_rb
>>>> *rb,
>>>> struct nvm_rq *rqd,
>>>>                          pr_err("pblk: could not pad page in write
>>>> bio\n");
>>>>                          return NVM_IO_ERR;
>>>>                  }
>>>> +
>>>> +               if (pad < pblk->min_write_pgs)
>>>> +                       atomic64_inc(&pblk->pad_dist[pad - 1]);
>>>> +               else
>>>> +                       pr_warn("pblk: padding more than min.
>>>> sectors\n");
>>>
>>>
>>>
>>> Curious, when would this happen? Would it be an implementation error or
>>> something a user did wrong?
>>
>>
>> This would be an implementation error - and this check is just a
>> safeguard to make sure we won't go out of bounds when updating the
>> statistics.
>>
>
> Ah, does it make sense to have such defensive programming, when the value=
 is
> never persisted? An 64 bit integer is quite large, and if only used for
> workloads for a single instance, it would practically never trigger, and =
if
> it did, do one care?

Ah, sorry for being unclear,  it's not for checking if the pad
counters overflow  - I wanted to make sure that we won't index
pad_dist with negative numbers if we mess up the padding calculations
in the future.

>
> <snip>
>
>>>
>>> Looks good to me. Let me know if you want to send a new patch, or if I
>>> may
>>> make the change when I pick it up.
>>
>>
>> Thanks for the review, i saw that you already reworked the patch a bit
>> on your core branch.
>> However, i found a build issue when building for i386, so i'll fix
>> that and submit a V2(that includes your update)
>
>
> Ok. I assume this is the kbuild mail I asked about a couple of days ago.
> I'll wait for v2.

I did see that particular mail, but it's most likely the same issue
that the V2 will fix.

Thanks,
Hans

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2018-02-06 11:28 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-01-31  2:06 [PATCH 1/5] lightnvm: pblk: handle bad sectors in the emeta area correctly Javier González
2018-01-31  2:06 ` [PATCH 2/5] lightnvm: pblk: check data lines version on recovery Javier González
2018-01-31  2:06 ` [PATCH 3/5] lightnvm: pblk: export write amplification counters to sysfs Javier González
2018-01-31  2:06 ` [PATCH 4/5] lightnvm: pblk: add padding distribution sysfs attribute Javier González
2018-01-31  8:44   ` Matias Bjørling
2018-02-06  9:27     ` Hans Holmberg
2018-02-06 10:50       ` Matias Bjørling
2018-02-06 11:28         ` Hans Holmberg
2018-01-31  2:06 ` [PATCH 5/5] lightnvm: pblk: refactor bad block identification Javier González
2018-01-31  8:51   ` Matias Bjørling
2018-01-31  9:13     ` Javier Gonzalez
2018-01-31 18:24       ` Matias Bjørling
2018-02-04 10:37         ` Javier Gonzalez
2018-02-04 12:54           ` Matias Bjørling
2018-02-04 13:09             ` Javier Gonzalez

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).