All of lore.kernel.org
 help / color / mirror / Atom feed
* UBI: Bitrot checking
@ 2015-03-29 12:13 ` Richard Weinberger
  0 siblings, 0 replies; 49+ messages in thread
From: Richard Weinberger @ 2015-03-29 12:13 UTC (permalink / raw)
  To: linux-mtd; +Cc: linux-kernel, dedekind1, boris.brezillon

As of today UBI does not offer a way to reliable deal with
data retention and especially read disturb. Read disturb means that
a page X will face bitflips if you read page Y,
Y != X but within the same block, very often.
People who care about data retention often have cron jobs on their
targets which read from time to time whole UBI volumes.
Something like "dd if=/dev/ubi0_0 of=/dev/zero" every week.
If UBI faces bitflips while reading it will schedule the affected PEB
for scrubbing.
The major downside of this approach is that you don't catch all pages.
e.g. UBI EC and VID headers are not always read.
Also UBI internal data structure like volume table are not checked.
These pages are mostly read at attach time. So, if you run a
cron job and reboot often you are on the safe side.
With fastmap the issue is even worse as you don't scan every PEB at
attach time. In fact, I've seen read disturb issues on a customer's
target where he has installed such a cronjob.
But some targets still died due to read disturb issues.
It turned out that only targets with very high uptimes were affected.

To overcome that issue this patch series adds a bitrot checking
mechanism to UBI.
It can be triggered by writing to /sys/class/ubi/ubiX/trigger_bitrot_check.
Then UBI will read every PEB and schedule scrubbing or an appropriated
action if it detects flipped bits. Reading is done within the wear-leveling
thread such that it does not block other IO. I can also think of adding
a new thread to UBI which has a very low priority.
User can then replace the dd command in their cron jobs by a simple
"echo 1 > /sys/class/ubi/ubiX/trigger_bitrot_check".

This series is only part one of two. Part two will add new ioctl()
commands to the UBI device nodes such that we can develop something
like a ubi-healthd.
This daemon will be able to fetch statistics about every known PEB
and can trigger scrubbing. Especially for MLC NAND support more
advanced techniques are needed to deal with data retention. 

[PATCH 1/4] UBI: Introduce ubi_schedule_fm_work()
[PATCH 2/4] UBI: Introduce prepare_erase_work()
[PATCH 3/4] UBI: Introduce in_pq()
[PATCH 4/4] UBI: Implement bitrot checking

Thanks,
//richard

^ permalink raw reply	[flat|nested] 49+ messages in thread

* UBI: Bitrot checking
@ 2015-03-29 12:13 ` Richard Weinberger
  0 siblings, 0 replies; 49+ messages in thread
From: Richard Weinberger @ 2015-03-29 12:13 UTC (permalink / raw)
  To: linux-mtd; +Cc: boris.brezillon, linux-kernel, dedekind1

As of today UBI does not offer a way to reliable deal with
data retention and especially read disturb. Read disturb means that
a page X will face bitflips if you read page Y,
Y != X but within the same block, very often.
People who care about data retention often have cron jobs on their
targets which read from time to time whole UBI volumes.
Something like "dd if=/dev/ubi0_0 of=/dev/zero" every week.
If UBI faces bitflips while reading it will schedule the affected PEB
for scrubbing.
The major downside of this approach is that you don't catch all pages.
e.g. UBI EC and VID headers are not always read.
Also UBI internal data structure like volume table are not checked.
These pages are mostly read at attach time. So, if you run a
cron job and reboot often you are on the safe side.
With fastmap the issue is even worse as you don't scan every PEB at
attach time. In fact, I've seen read disturb issues on a customer's
target where he has installed such a cronjob.
But some targets still died due to read disturb issues.
It turned out that only targets with very high uptimes were affected.

To overcome that issue this patch series adds a bitrot checking
mechanism to UBI.
It can be triggered by writing to /sys/class/ubi/ubiX/trigger_bitrot_check.
Then UBI will read every PEB and schedule scrubbing or an appropriated
action if it detects flipped bits. Reading is done within the wear-leveling
thread such that it does not block other IO. I can also think of adding
a new thread to UBI which has a very low priority.
User can then replace the dd command in their cron jobs by a simple
"echo 1 > /sys/class/ubi/ubiX/trigger_bitrot_check".

This series is only part one of two. Part two will add new ioctl()
commands to the UBI device nodes such that we can develop something
like a ubi-healthd.
This daemon will be able to fetch statistics about every known PEB
and can trigger scrubbing. Especially for MLC NAND support more
advanced techniques are needed to deal with data retention. 

[PATCH 1/4] UBI: Introduce ubi_schedule_fm_work()
[PATCH 2/4] UBI: Introduce prepare_erase_work()
[PATCH 3/4] UBI: Introduce in_pq()
[PATCH 4/4] UBI: Implement bitrot checking

Thanks,
//richard

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [PATCH 1/4] UBI: Introduce ubi_schedule_fm_work()
  2015-03-29 12:13 ` Richard Weinberger
@ 2015-03-29 12:13   ` Richard Weinberger
  -1 siblings, 0 replies; 49+ messages in thread
From: Richard Weinberger @ 2015-03-29 12:13 UTC (permalink / raw)
  To: linux-mtd; +Cc: linux-kernel, dedekind1, boris.brezillon, Richard Weinberger

Wrap the fastmap work scheduling code into a new function
such that it can be reused.

Signed-off-by: Richard Weinberger <richard@nod.at>
---
 drivers/mtd/ubi/fastmap-wl.c |  5 +----
 drivers/mtd/ubi/ubi.h        | 12 ++++++++++++
 2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/mtd/ubi/fastmap-wl.c b/drivers/mtd/ubi/fastmap-wl.c
index b2a6653..fc0e2ff 100644
--- a/drivers/mtd/ubi/fastmap-wl.c
+++ b/drivers/mtd/ubi/fastmap-wl.c
@@ -237,10 +237,7 @@ static struct ubi_wl_entry *get_peb_for_wl(struct ubi_device *ubi)
 		/* We cannot update the fastmap here because this
 		 * function is called in atomic context.
 		 * Let's fail here and refill/update it as soon as possible. */
-		if (!ubi->fm_work_scheduled) {
-			ubi->fm_work_scheduled = 1;
-			schedule_work(&ubi->fm_work);
-		}
+		ubi_schedule_fm_work(ubi);
 		return NULL;
 	}
 
diff --git a/drivers/mtd/ubi/ubi.h b/drivers/mtd/ubi/ubi.h
index 11f8fb0..cb3dcd0 100644
--- a/drivers/mtd/ubi/ubi.h
+++ b/drivers/mtd/ubi/ubi.h
@@ -881,8 +881,20 @@ size_t ubi_calc_fm_size(struct ubi_device *ubi);
 int ubi_update_fastmap(struct ubi_device *ubi);
 int ubi_scan_fastmap(struct ubi_device *ubi, struct ubi_attach_info *ai,
 		     int fm_anchor);
+/**
+ * ubi_schedule_fm_work - Schedule UBI fastmap work
+ * @ubi: UBI device description object
+ */
+static inline void ubi_schedule_fm_work(struct ubi_device *ubi)
+{
+	if (!ubi->fm_work_scheduled) {
+		ubi->fm_work_scheduled = 1;
+		schedule_work(&ubi->fm_work);
+	}
+}
 #else
 static inline int ubi_update_fastmap(struct ubi_device *ubi) { return 0; }
+static inline void ubi_schedule_fm_work(struct ubi_device *ubi) { }
 #endif
 
 /* block.c */
-- 
1.8.4.5


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 1/4] UBI: Introduce ubi_schedule_fm_work()
@ 2015-03-29 12:13   ` Richard Weinberger
  0 siblings, 0 replies; 49+ messages in thread
From: Richard Weinberger @ 2015-03-29 12:13 UTC (permalink / raw)
  To: linux-mtd; +Cc: Richard Weinberger, boris.brezillon, linux-kernel, dedekind1

Wrap the fastmap work scheduling code into a new function
such that it can be reused.

Signed-off-by: Richard Weinberger <richard@nod.at>
---
 drivers/mtd/ubi/fastmap-wl.c |  5 +----
 drivers/mtd/ubi/ubi.h        | 12 ++++++++++++
 2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/mtd/ubi/fastmap-wl.c b/drivers/mtd/ubi/fastmap-wl.c
index b2a6653..fc0e2ff 100644
--- a/drivers/mtd/ubi/fastmap-wl.c
+++ b/drivers/mtd/ubi/fastmap-wl.c
@@ -237,10 +237,7 @@ static struct ubi_wl_entry *get_peb_for_wl(struct ubi_device *ubi)
 		/* We cannot update the fastmap here because this
 		 * function is called in atomic context.
 		 * Let's fail here and refill/update it as soon as possible. */
-		if (!ubi->fm_work_scheduled) {
-			ubi->fm_work_scheduled = 1;
-			schedule_work(&ubi->fm_work);
-		}
+		ubi_schedule_fm_work(ubi);
 		return NULL;
 	}
 
diff --git a/drivers/mtd/ubi/ubi.h b/drivers/mtd/ubi/ubi.h
index 11f8fb0..cb3dcd0 100644
--- a/drivers/mtd/ubi/ubi.h
+++ b/drivers/mtd/ubi/ubi.h
@@ -881,8 +881,20 @@ size_t ubi_calc_fm_size(struct ubi_device *ubi);
 int ubi_update_fastmap(struct ubi_device *ubi);
 int ubi_scan_fastmap(struct ubi_device *ubi, struct ubi_attach_info *ai,
 		     int fm_anchor);
+/**
+ * ubi_schedule_fm_work - Schedule UBI fastmap work
+ * @ubi: UBI device description object
+ */
+static inline void ubi_schedule_fm_work(struct ubi_device *ubi)
+{
+	if (!ubi->fm_work_scheduled) {
+		ubi->fm_work_scheduled = 1;
+		schedule_work(&ubi->fm_work);
+	}
+}
 #else
 static inline int ubi_update_fastmap(struct ubi_device *ubi) { return 0; }
+static inline void ubi_schedule_fm_work(struct ubi_device *ubi) { }
 #endif
 
 /* block.c */
-- 
1.8.4.5

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 2/4] UBI: Introduce prepare_erase_work()
  2015-03-29 12:13 ` Richard Weinberger
@ 2015-03-29 12:13   ` Richard Weinberger
  -1 siblings, 0 replies; 49+ messages in thread
From: Richard Weinberger @ 2015-03-29 12:13 UTC (permalink / raw)
  To: linux-mtd; +Cc: linux-kernel, dedekind1, boris.brezillon, Richard Weinberger

Split ubi_work creation and scheduling code such that
we can re-use the creation code where we schedule ubi work
on our own.

Signed-off-by: Richard Weinberger <richard@nod.at>
---
 drivers/mtd/ubi/wl.c | 40 +++++++++++++++++++++++++++++++---------
 1 file changed, 31 insertions(+), 9 deletions(-)

diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c
index cefc767..13594e9 100644
--- a/drivers/mtd/ubi/wl.c
+++ b/drivers/mtd/ubi/wl.c
@@ -569,29 +569,26 @@ static int erase_worker(struct ubi_device *ubi, struct ubi_work *wl_wrk,
 			int shutdown);
 
 /**
- * schedule_erase - schedule an erase work.
+ * prepare_erase_work - prepare an erase work.
  * @ubi: UBI device description object
  * @e: the WL entry of the physical eraseblock to erase
  * @vol_id: the volume ID that last used this PEB
  * @lnum: the last used logical eraseblock number for the PEB
  * @torture: if the physical eraseblock has to be tortured
  *
- * This function returns zero in case of success and a %-ENOMEM in case of
- * failure.
+ * This function returns a struct ubi_work in case of success
+ * and an ERR_PTR(%-ENOMEM) in case of failure.
  */
-static int schedule_erase(struct ubi_device *ubi, struct ubi_wl_entry *e,
-			  int vol_id, int lnum, int torture)
+static struct ubi_work *prepare_erase_work(struct ubi_wl_entry *e, int vol_id,
+					   int lnum, int torture)
 {
 	struct ubi_work *wl_wrk;
 
 	ubi_assert(e);
 
-	dbg_wl("schedule erasure of PEB %d, EC %d, torture %d",
-	       e->pnum, e->ec, torture);
-
 	wl_wrk = kmalloc(sizeof(struct ubi_work), GFP_NOFS);
 	if (!wl_wrk)
-		return -ENOMEM;
+		return ERR_PTR(-ENOMEM);
 
 	wl_wrk->func = &erase_worker;
 	wl_wrk->e = e;
@@ -599,6 +596,31 @@ static int schedule_erase(struct ubi_device *ubi, struct ubi_wl_entry *e,
 	wl_wrk->lnum = lnum;
 	wl_wrk->torture = torture;
 
+	return wl_wrk;
+}
+
+/**
+ * schedule_erase - schedule an erase work.
+ * @ubi: UBI device description object
+ * @e: the WL entry of the physical eraseblock to erase
+ * @vol_id: the volume ID that last used this PEB
+ * @lnum: the last used logical eraseblock number for the PEB
+ * @torture: if the physical eraseblock has to be tortured
+ *
+ * This function returns zero in case of success and a %-ENOMEM in case of
+ * failure.
+ */
+static int schedule_erase(struct ubi_device *ubi, struct ubi_wl_entry *e,
+			  int vol_id, int lnum, int torture)
+{
+	struct ubi_work *wl_wrk = prepare_erase_work(e, vol_id, lnum, torture);
+
+	dbg_wl("schedule erasure of PEB %d, EC %d, torture %d",
+	       e->pnum, e->ec, torture);
+
+	if (IS_ERR(wl_wrk))
+		return PTR_ERR(wl_wrk);
+
 	schedule_ubi_work(ubi, wl_wrk);
 	return 0;
 }
-- 
1.8.4.5


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 2/4] UBI: Introduce prepare_erase_work()
@ 2015-03-29 12:13   ` Richard Weinberger
  0 siblings, 0 replies; 49+ messages in thread
From: Richard Weinberger @ 2015-03-29 12:13 UTC (permalink / raw)
  To: linux-mtd; +Cc: Richard Weinberger, boris.brezillon, linux-kernel, dedekind1

Split ubi_work creation and scheduling code such that
we can re-use the creation code where we schedule ubi work
on our own.

Signed-off-by: Richard Weinberger <richard@nod.at>
---
 drivers/mtd/ubi/wl.c | 40 +++++++++++++++++++++++++++++++---------
 1 file changed, 31 insertions(+), 9 deletions(-)

diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c
index cefc767..13594e9 100644
--- a/drivers/mtd/ubi/wl.c
+++ b/drivers/mtd/ubi/wl.c
@@ -569,29 +569,26 @@ static int erase_worker(struct ubi_device *ubi, struct ubi_work *wl_wrk,
 			int shutdown);
 
 /**
- * schedule_erase - schedule an erase work.
+ * prepare_erase_work - prepare an erase work.
  * @ubi: UBI device description object
  * @e: the WL entry of the physical eraseblock to erase
  * @vol_id: the volume ID that last used this PEB
  * @lnum: the last used logical eraseblock number for the PEB
  * @torture: if the physical eraseblock has to be tortured
  *
- * This function returns zero in case of success and a %-ENOMEM in case of
- * failure.
+ * This function returns a struct ubi_work in case of success
+ * and an ERR_PTR(%-ENOMEM) in case of failure.
  */
-static int schedule_erase(struct ubi_device *ubi, struct ubi_wl_entry *e,
-			  int vol_id, int lnum, int torture)
+static struct ubi_work *prepare_erase_work(struct ubi_wl_entry *e, int vol_id,
+					   int lnum, int torture)
 {
 	struct ubi_work *wl_wrk;
 
 	ubi_assert(e);
 
-	dbg_wl("schedule erasure of PEB %d, EC %d, torture %d",
-	       e->pnum, e->ec, torture);
-
 	wl_wrk = kmalloc(sizeof(struct ubi_work), GFP_NOFS);
 	if (!wl_wrk)
-		return -ENOMEM;
+		return ERR_PTR(-ENOMEM);
 
 	wl_wrk->func = &erase_worker;
 	wl_wrk->e = e;
@@ -599,6 +596,31 @@ static int schedule_erase(struct ubi_device *ubi, struct ubi_wl_entry *e,
 	wl_wrk->lnum = lnum;
 	wl_wrk->torture = torture;
 
+	return wl_wrk;
+}
+
+/**
+ * schedule_erase - schedule an erase work.
+ * @ubi: UBI device description object
+ * @e: the WL entry of the physical eraseblock to erase
+ * @vol_id: the volume ID that last used this PEB
+ * @lnum: the last used logical eraseblock number for the PEB
+ * @torture: if the physical eraseblock has to be tortured
+ *
+ * This function returns zero in case of success and a %-ENOMEM in case of
+ * failure.
+ */
+static int schedule_erase(struct ubi_device *ubi, struct ubi_wl_entry *e,
+			  int vol_id, int lnum, int torture)
+{
+	struct ubi_work *wl_wrk = prepare_erase_work(e, vol_id, lnum, torture);
+
+	dbg_wl("schedule erasure of PEB %d, EC %d, torture %d",
+	       e->pnum, e->ec, torture);
+
+	if (IS_ERR(wl_wrk))
+		return PTR_ERR(wl_wrk);
+
 	schedule_ubi_work(ubi, wl_wrk);
 	return 0;
 }
-- 
1.8.4.5

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 3/4] UBI: Introduce in_pq()
  2015-03-29 12:13 ` Richard Weinberger
@ 2015-03-29 12:13   ` Richard Weinberger
  -1 siblings, 0 replies; 49+ messages in thread
From: Richard Weinberger @ 2015-03-29 12:13 UTC (permalink / raw)
  To: linux-mtd; +Cc: linux-kernel, dedekind1, boris.brezillon, Richard Weinberger

This function works like in_wl_tree() but checks whether an ubi_wl_entry
is currently in the protection queue.
We need this function to query the current state of an ubi_wl_entry.

Signed-off-by: Richard Weinberger <richard@nod.at>
---
 drivers/mtd/ubi/wl.c | 30 +++++++++++++++++++++++-------
 1 file changed, 23 insertions(+), 7 deletions(-)

diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c
index 13594e9..9b11db9 100644
--- a/drivers/mtd/ubi/wl.c
+++ b/drivers/mtd/ubi/wl.c
@@ -278,6 +278,27 @@ static int in_wl_tree(struct ubi_wl_entry *e, struct rb_root *root)
 }
 
 /**
+ * in_pq - check if a wear-leveling entry is present in the protection queue.
+ * @ubi: UBI device description object
+ * @e: the wear-leveling entry to check
+ *
+ * This function returns non-zero if @e is in the protection queue and zero
+ * if it is not.
+ */
+static inline int in_pq(const struct ubi_device *ubi, struct ubi_wl_entry *e)
+{
+	struct ubi_wl_entry *p;
+	int i;
+
+	for (i = 0; i < UBI_PROT_QUEUE_LEN; ++i)
+		list_for_each_entry(p, &ubi->pq[i], u.list)
+			if (p == e)
+				return 1;
+
+	return 0;
+}
+
+/**
  * prot_queue_add - add physical eraseblock to the protection queue.
  * @ubi: UBI device description object
  * @e: the physical eraseblock to add
@@ -1754,16 +1775,11 @@ static int self_check_in_wl_tree(const struct ubi_device *ubi,
 static int self_check_in_pq(const struct ubi_device *ubi,
 			    struct ubi_wl_entry *e)
 {
-	struct ubi_wl_entry *p;
-	int i;
-
 	if (!ubi_dbg_chk_gen(ubi))
 		return 0;
 
-	for (i = 0; i < UBI_PROT_QUEUE_LEN; ++i)
-		list_for_each_entry(p, &ubi->pq[i], u.list)
-			if (p == e)
-				return 0;
+	if (in_pq(ubi, e))
+		return 0;
 
 	ubi_err(ubi, "self-check failed for PEB %d, EC %d, Protect queue",
 		e->pnum, e->ec);
-- 
1.8.4.5


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 3/4] UBI: Introduce in_pq()
@ 2015-03-29 12:13   ` Richard Weinberger
  0 siblings, 0 replies; 49+ messages in thread
From: Richard Weinberger @ 2015-03-29 12:13 UTC (permalink / raw)
  To: linux-mtd; +Cc: Richard Weinberger, boris.brezillon, linux-kernel, dedekind1

This function works like in_wl_tree() but checks whether an ubi_wl_entry
is currently in the protection queue.
We need this function to query the current state of an ubi_wl_entry.

Signed-off-by: Richard Weinberger <richard@nod.at>
---
 drivers/mtd/ubi/wl.c | 30 +++++++++++++++++++++++-------
 1 file changed, 23 insertions(+), 7 deletions(-)

diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c
index 13594e9..9b11db9 100644
--- a/drivers/mtd/ubi/wl.c
+++ b/drivers/mtd/ubi/wl.c
@@ -278,6 +278,27 @@ static int in_wl_tree(struct ubi_wl_entry *e, struct rb_root *root)
 }
 
 /**
+ * in_pq - check if a wear-leveling entry is present in the protection queue.
+ * @ubi: UBI device description object
+ * @e: the wear-leveling entry to check
+ *
+ * This function returns non-zero if @e is in the protection queue and zero
+ * if it is not.
+ */
+static inline int in_pq(const struct ubi_device *ubi, struct ubi_wl_entry *e)
+{
+	struct ubi_wl_entry *p;
+	int i;
+
+	for (i = 0; i < UBI_PROT_QUEUE_LEN; ++i)
+		list_for_each_entry(p, &ubi->pq[i], u.list)
+			if (p == e)
+				return 1;
+
+	return 0;
+}
+
+/**
  * prot_queue_add - add physical eraseblock to the protection queue.
  * @ubi: UBI device description object
  * @e: the physical eraseblock to add
@@ -1754,16 +1775,11 @@ static int self_check_in_wl_tree(const struct ubi_device *ubi,
 static int self_check_in_pq(const struct ubi_device *ubi,
 			    struct ubi_wl_entry *e)
 {
-	struct ubi_wl_entry *p;
-	int i;
-
 	if (!ubi_dbg_chk_gen(ubi))
 		return 0;
 
-	for (i = 0; i < UBI_PROT_QUEUE_LEN; ++i)
-		list_for_each_entry(p, &ubi->pq[i], u.list)
-			if (p == e)
-				return 0;
+	if (in_pq(ubi, e))
+		return 0;
 
 	ubi_err(ubi, "self-check failed for PEB %d, EC %d, Protect queue",
 		e->pnum, e->ec);
-- 
1.8.4.5

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 4/4] UBI: Implement bitrot checking
  2015-03-29 12:13 ` Richard Weinberger
@ 2015-03-29 12:13   ` Richard Weinberger
  -1 siblings, 0 replies; 49+ messages in thread
From: Richard Weinberger @ 2015-03-29 12:13 UTC (permalink / raw)
  To: linux-mtd; +Cc: linux-kernel, dedekind1, boris.brezillon, Richard Weinberger

This patch implements bitrot checking for UBI.
ubi_wl_trigger_bitrot_check() triggers a re-read of every
PEB. If a bitflip is detected PEBs in use will get scrubbed
and free ones erased.

Signed-off-by: Richard Weinberger <richard@nod.at>
---
 drivers/mtd/ubi/build.c |  39 +++++++++++++
 drivers/mtd/ubi/ubi.h   |   4 ++
 drivers/mtd/ubi/wl.c    | 146 ++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 189 insertions(+)

diff --git a/drivers/mtd/ubi/build.c b/drivers/mtd/ubi/build.c
index 9690cf9..f58330b 100644
--- a/drivers/mtd/ubi/build.c
+++ b/drivers/mtd/ubi/build.c
@@ -118,6 +118,9 @@ static struct class_attribute ubi_version =
 
 static ssize_t dev_attribute_show(struct device *dev,
 				  struct device_attribute *attr, char *buf);
+static ssize_t trigger_bitrot_check(struct device *dev,
+				    struct device_attribute *mattr,
+				    const char *data, size_t count);
 
 /* UBI device attributes (correspond to files in '/<sysfs>/class/ubi/ubiX') */
 static struct device_attribute dev_eraseblock_size =
@@ -142,6 +145,8 @@ static struct device_attribute dev_bgt_enabled =
 	__ATTR(bgt_enabled, S_IRUGO, dev_attribute_show, NULL);
 static struct device_attribute dev_mtd_num =
 	__ATTR(mtd_num, S_IRUGO, dev_attribute_show, NULL);
+static struct device_attribute dev_trigger_bitrot_check =
+	__ATTR(trigger_bitrot_check, S_IWUSR, NULL, trigger_bitrot_check);
 
 /**
  * ubi_volume_notify - send a volume change notification.
@@ -334,6 +339,36 @@ int ubi_major2num(int major)
 	return ubi_num;
 }
 
+/* "Store" method for file '/<sysfs>/class/ubi/ubiX/trigger_bitrot_check' */
+static ssize_t trigger_bitrot_check(struct device *dev,
+				    struct device_attribute *mattr,
+				    const char *data, size_t count)
+{
+	struct ubi_device *ubi;
+	int ret;
+
+	ubi = container_of(dev, struct ubi_device, dev);
+	ubi = ubi_get_device(ubi->ubi_num);
+	if (!ubi) {
+		ret = -ENODEV;
+		goto out;
+	}
+
+	if (atomic_inc_return(&ubi->bit_rot_work) != 1) {
+		ret = -EBUSY;
+		goto out_dec;
+	}
+
+	ubi_wl_trigger_bitrot_check(ubi);
+	ret = count;
+
+out_dec:
+	atomic_dec(&ubi->bit_rot_work);
+out:
+	ubi_put_device(ubi);
+	return ret;
+}
+
 /* "Show" method for files in '/<sysfs>/class/ubi/ubiX/' */
 static ssize_t dev_attribute_show(struct device *dev,
 				  struct device_attribute *attr, char *buf)
@@ -445,6 +480,9 @@ static int ubi_sysfs_init(struct ubi_device *ubi, int *ref)
 	if (err)
 		return err;
 	err = device_create_file(&ubi->dev, &dev_mtd_num);
+	if (err)
+		return err;
+	err = device_create_file(&ubi->dev, &dev_trigger_bitrot_check);
 	return err;
 }
 
@@ -465,6 +503,7 @@ static void ubi_sysfs_close(struct ubi_device *ubi)
 	device_remove_file(&ubi->dev, &dev_total_eraseblocks);
 	device_remove_file(&ubi->dev, &dev_avail_eraseblocks);
 	device_remove_file(&ubi->dev, &dev_eraseblock_size);
+	device_remove_file(&ubi->dev, &dev_trigger_bitrot_check);
 	device_unregister(&ubi->dev);
 }
 
diff --git a/drivers/mtd/ubi/ubi.h b/drivers/mtd/ubi/ubi.h
index cb3dcd0..da88cd8 100644
--- a/drivers/mtd/ubi/ubi.h
+++ b/drivers/mtd/ubi/ubi.h
@@ -39,6 +39,7 @@
 #include <linux/notifier.h>
 #include <linux/mtd/mtd.h>
 #include <linux/mtd/ubi.h>
+#include <linux/atomic.h>
 #include <asm/pgtable.h>
 
 #include "ubi-media.h"
@@ -464,6 +465,7 @@ struct ubi_debug_info {
  * @bgt_thread: background thread description object
  * @thread_enabled: if the background thread is enabled
  * @bgt_name: background thread name
+ * @bit_rot_work: non-zero if a bit rot check is running
  *
  * @flash_size: underlying MTD device size (in bytes)
  * @peb_count: count of physical eraseblocks on the MTD device
@@ -567,6 +569,7 @@ struct ubi_device {
 	struct task_struct *bgt_thread;
 	int thread_enabled;
 	char bgt_name[sizeof(UBI_BGT_NAME_PATTERN)+2];
+	atomic_t bit_rot_work;
 
 	/* I/O sub-system's stuff */
 	long long flash_size;
@@ -834,6 +837,7 @@ int ubi_wl_put_fm_peb(struct ubi_device *ubi, struct ubi_wl_entry *used_e,
 int ubi_is_erase_work(struct ubi_work *wrk);
 void ubi_refill_pools(struct ubi_device *ubi);
 int ubi_ensure_anchor_pebs(struct ubi_device *ubi);
+void ubi_wl_trigger_bitrot_check(struct ubi_device *ubi);
 
 /* io.c */
 int ubi_io_read(const struct ubi_device *ubi, void *buf, int pnum, int offset,
diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c
index 9b11db9..784bb52 100644
--- a/drivers/mtd/ubi/wl.c
+++ b/drivers/mtd/ubi/wl.c
@@ -1447,6 +1447,150 @@ static void tree_destroy(struct ubi_device *ubi, struct rb_root *root)
 }
 
 /**
+ * bitrot_check_worker - physical eraseblock bitrot check worker function.
+ * @ubi: UBI device description object
+ * @wl_wrk: the work object
+ * @shutdown: non-zero if the worker has to free memory and exit
+ *
+ * This function reads a physical eraseblock and schedules scrubbing if
+ * bit flips are detected.
+ */
+static int bitrot_check_worker(struct ubi_device *ubi, struct ubi_work *wl_wrk,
+			       int shutdown)
+{
+	struct ubi_wl_entry *e = wl_wrk->e;
+	int err;
+
+	kfree(wl_wrk);
+	if (shutdown) {
+		dbg_wl("cancel bitrot check of PEB %d", e->pnum);
+		wl_entry_destroy(ubi, e);
+		return 0;
+	}
+
+	mutex_lock(&ubi->buf_mutex);
+	err = ubi_io_read(ubi, ubi->peb_buf, e->pnum, 0, ubi->peb_size);
+	mutex_unlock(&ubi->buf_mutex);
+	if (err == UBI_IO_BITFLIPS) {
+		dbg_wl("found bitflips in PEB %d", e->pnum);
+		spin_lock(&ubi->wl_lock);
+
+		if (in_pq(ubi, e)) {
+			prot_queue_del(ubi, e->pnum);
+			wl_tree_add(e, &ubi->scrub);
+			spin_unlock(&ubi->wl_lock);
+
+			err = ensure_wear_leveling(ubi, 1);
+		}
+		else if (in_wl_tree(e, &ubi->used)) {
+			rb_erase(&e->u.rb, &ubi->used);
+			wl_tree_add(e, &ubi->scrub);
+			spin_unlock(&ubi->wl_lock);
+
+			err = ensure_wear_leveling(ubi, 1);
+		}
+		else if (in_wl_tree(e, &ubi->free)) {
+			rb_erase(&e->u.rb, &ubi->free);
+			spin_unlock(&ubi->wl_lock);
+
+			wl_wrk = prepare_erase_work(e, -1, -1, 1);
+			if (IS_ERR(wl_wrk)) {
+				err = PTR_ERR(wl_wrk);
+				goto out;
+			}
+
+			__schedule_ubi_work(ubi, wl_wrk);
+			err = 0;
+		}
+		/*
+		 * e is target of a move operation, all we can do is kicking
+		 * wear leveling such that we can catch it later or wear
+		 * leveling itself scrubbs the PEB.
+		 */
+		else if (ubi->move_to == e || ubi->move_from == e) {
+			spin_unlock(&ubi->wl_lock);
+
+			err = ensure_wear_leveling(ubi, 1);
+		}
+		/*
+		 * e is member of a fastmap pool. We are not allowed to
+		 * remove it from that pool as the on-flash fastmap data
+		 * structure refers to it. Let's schedule a new fastmap write
+		 * such that the said PEB can get released.
+		 */
+		else {
+			ubi_schedule_fm_work(ubi);
+			spin_unlock(&ubi->wl_lock);
+
+			err = 0;
+		}
+	}
+	else {
+		/*
+		 * Ignore read errors as we return only work related errors.
+		 * Read errors will be logged by ubi_io_read().
+		 */
+		err = 0;
+	}
+
+out:
+	atomic_dec(&ubi->bit_rot_work);
+	ubi_assert(atomic_read(&ubi->bit_rot_work) >= 0);
+	return err;
+}
+
+/**
+ * schedule_bitrot_check - schedule a bitrot check work.
+ * @ubi: UBI device description object
+ * @e: the WL entry of the physical eraseblock to check
+ *
+ * This function returns zero in case of success and a %-ENOMEM in case of
+ * failure.
+ */
+static int schedule_bitrot_check(struct ubi_device *ubi,
+				 struct ubi_wl_entry *e)
+{
+	struct ubi_work *wl_wrk;
+
+	ubi_assert(e);
+
+	dbg_wl("schedule bitrot check of PEB %d", e->pnum);
+
+	wl_wrk = kmalloc(sizeof(struct ubi_work), GFP_NOFS);
+	if (!wl_wrk)
+		return -ENOMEM;
+
+	wl_wrk->func = &bitrot_check_worker;
+	wl_wrk->e = e;
+
+	schedule_ubi_work(ubi, wl_wrk);
+	return 0;
+}
+
+/**
+ * ubi_wl_trigger_bitrot_check - triggers a re-read of all physical erase
+ * blocks.
+ * @ubi: UBI device description object
+ */
+void ubi_wl_trigger_bitrot_check(struct ubi_device *ubi)
+{
+	int i;
+	struct ubi_wl_entry *e;
+
+	ubi_msg(ubi, "Running a full read check");
+
+	for (i = 0; i < ubi->peb_count; i++) {
+		spin_lock(&ubi->wl_lock);
+		e = ubi->lookuptbl[i];
+		spin_unlock(&ubi->wl_lock);
+		if (e) {
+			atomic_inc(&ubi->bit_rot_work);
+			schedule_bitrot_check(ubi, e);
+		}
+	}
+}
+
+/**
  * ubi_thread - UBI background thread.
  * @u: the UBI device description object pointer
  */
@@ -1646,6 +1790,8 @@ int ubi_wl_init(struct ubi_device *ubi, struct ubi_attach_info *ai)
 	ubi->avail_pebs -= reserved_pebs;
 	ubi->rsvd_pebs += reserved_pebs;
 
+	atomic_set(&ubi->bit_rot_work, 0);
+
 	/* Schedule wear-leveling if needed */
 	err = ensure_wear_leveling(ubi, 0);
 	if (err)
-- 
1.8.4.5


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 4/4] UBI: Implement bitrot checking
@ 2015-03-29 12:13   ` Richard Weinberger
  0 siblings, 0 replies; 49+ messages in thread
From: Richard Weinberger @ 2015-03-29 12:13 UTC (permalink / raw)
  To: linux-mtd; +Cc: Richard Weinberger, boris.brezillon, linux-kernel, dedekind1

This patch implements bitrot checking for UBI.
ubi_wl_trigger_bitrot_check() triggers a re-read of every
PEB. If a bitflip is detected PEBs in use will get scrubbed
and free ones erased.

Signed-off-by: Richard Weinberger <richard@nod.at>
---
 drivers/mtd/ubi/build.c |  39 +++++++++++++
 drivers/mtd/ubi/ubi.h   |   4 ++
 drivers/mtd/ubi/wl.c    | 146 ++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 189 insertions(+)

diff --git a/drivers/mtd/ubi/build.c b/drivers/mtd/ubi/build.c
index 9690cf9..f58330b 100644
--- a/drivers/mtd/ubi/build.c
+++ b/drivers/mtd/ubi/build.c
@@ -118,6 +118,9 @@ static struct class_attribute ubi_version =
 
 static ssize_t dev_attribute_show(struct device *dev,
 				  struct device_attribute *attr, char *buf);
+static ssize_t trigger_bitrot_check(struct device *dev,
+				    struct device_attribute *mattr,
+				    const char *data, size_t count);
 
 /* UBI device attributes (correspond to files in '/<sysfs>/class/ubi/ubiX') */
 static struct device_attribute dev_eraseblock_size =
@@ -142,6 +145,8 @@ static struct device_attribute dev_bgt_enabled =
 	__ATTR(bgt_enabled, S_IRUGO, dev_attribute_show, NULL);
 static struct device_attribute dev_mtd_num =
 	__ATTR(mtd_num, S_IRUGO, dev_attribute_show, NULL);
+static struct device_attribute dev_trigger_bitrot_check =
+	__ATTR(trigger_bitrot_check, S_IWUSR, NULL, trigger_bitrot_check);
 
 /**
  * ubi_volume_notify - send a volume change notification.
@@ -334,6 +339,36 @@ int ubi_major2num(int major)
 	return ubi_num;
 }
 
+/* "Store" method for file '/<sysfs>/class/ubi/ubiX/trigger_bitrot_check' */
+static ssize_t trigger_bitrot_check(struct device *dev,
+				    struct device_attribute *mattr,
+				    const char *data, size_t count)
+{
+	struct ubi_device *ubi;
+	int ret;
+
+	ubi = container_of(dev, struct ubi_device, dev);
+	ubi = ubi_get_device(ubi->ubi_num);
+	if (!ubi) {
+		ret = -ENODEV;
+		goto out;
+	}
+
+	if (atomic_inc_return(&ubi->bit_rot_work) != 1) {
+		ret = -EBUSY;
+		goto out_dec;
+	}
+
+	ubi_wl_trigger_bitrot_check(ubi);
+	ret = count;
+
+out_dec:
+	atomic_dec(&ubi->bit_rot_work);
+out:
+	ubi_put_device(ubi);
+	return ret;
+}
+
 /* "Show" method for files in '/<sysfs>/class/ubi/ubiX/' */
 static ssize_t dev_attribute_show(struct device *dev,
 				  struct device_attribute *attr, char *buf)
@@ -445,6 +480,9 @@ static int ubi_sysfs_init(struct ubi_device *ubi, int *ref)
 	if (err)
 		return err;
 	err = device_create_file(&ubi->dev, &dev_mtd_num);
+	if (err)
+		return err;
+	err = device_create_file(&ubi->dev, &dev_trigger_bitrot_check);
 	return err;
 }
 
@@ -465,6 +503,7 @@ static void ubi_sysfs_close(struct ubi_device *ubi)
 	device_remove_file(&ubi->dev, &dev_total_eraseblocks);
 	device_remove_file(&ubi->dev, &dev_avail_eraseblocks);
 	device_remove_file(&ubi->dev, &dev_eraseblock_size);
+	device_remove_file(&ubi->dev, &dev_trigger_bitrot_check);
 	device_unregister(&ubi->dev);
 }
 
diff --git a/drivers/mtd/ubi/ubi.h b/drivers/mtd/ubi/ubi.h
index cb3dcd0..da88cd8 100644
--- a/drivers/mtd/ubi/ubi.h
+++ b/drivers/mtd/ubi/ubi.h
@@ -39,6 +39,7 @@
 #include <linux/notifier.h>
 #include <linux/mtd/mtd.h>
 #include <linux/mtd/ubi.h>
+#include <linux/atomic.h>
 #include <asm/pgtable.h>
 
 #include "ubi-media.h"
@@ -464,6 +465,7 @@ struct ubi_debug_info {
  * @bgt_thread: background thread description object
  * @thread_enabled: if the background thread is enabled
  * @bgt_name: background thread name
+ * @bit_rot_work: non-zero if a bit rot check is running
  *
  * @flash_size: underlying MTD device size (in bytes)
  * @peb_count: count of physical eraseblocks on the MTD device
@@ -567,6 +569,7 @@ struct ubi_device {
 	struct task_struct *bgt_thread;
 	int thread_enabled;
 	char bgt_name[sizeof(UBI_BGT_NAME_PATTERN)+2];
+	atomic_t bit_rot_work;
 
 	/* I/O sub-system's stuff */
 	long long flash_size;
@@ -834,6 +837,7 @@ int ubi_wl_put_fm_peb(struct ubi_device *ubi, struct ubi_wl_entry *used_e,
 int ubi_is_erase_work(struct ubi_work *wrk);
 void ubi_refill_pools(struct ubi_device *ubi);
 int ubi_ensure_anchor_pebs(struct ubi_device *ubi);
+void ubi_wl_trigger_bitrot_check(struct ubi_device *ubi);
 
 /* io.c */
 int ubi_io_read(const struct ubi_device *ubi, void *buf, int pnum, int offset,
diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c
index 9b11db9..784bb52 100644
--- a/drivers/mtd/ubi/wl.c
+++ b/drivers/mtd/ubi/wl.c
@@ -1447,6 +1447,150 @@ static void tree_destroy(struct ubi_device *ubi, struct rb_root *root)
 }
 
 /**
+ * bitrot_check_worker - physical eraseblock bitrot check worker function.
+ * @ubi: UBI device description object
+ * @wl_wrk: the work object
+ * @shutdown: non-zero if the worker has to free memory and exit
+ *
+ * This function reads a physical eraseblock and schedules scrubbing if
+ * bit flips are detected.
+ */
+static int bitrot_check_worker(struct ubi_device *ubi, struct ubi_work *wl_wrk,
+			       int shutdown)
+{
+	struct ubi_wl_entry *e = wl_wrk->e;
+	int err;
+
+	kfree(wl_wrk);
+	if (shutdown) {
+		dbg_wl("cancel bitrot check of PEB %d", e->pnum);
+		wl_entry_destroy(ubi, e);
+		return 0;
+	}
+
+	mutex_lock(&ubi->buf_mutex);
+	err = ubi_io_read(ubi, ubi->peb_buf, e->pnum, 0, ubi->peb_size);
+	mutex_unlock(&ubi->buf_mutex);
+	if (err == UBI_IO_BITFLIPS) {
+		dbg_wl("found bitflips in PEB %d", e->pnum);
+		spin_lock(&ubi->wl_lock);
+
+		if (in_pq(ubi, e)) {
+			prot_queue_del(ubi, e->pnum);
+			wl_tree_add(e, &ubi->scrub);
+			spin_unlock(&ubi->wl_lock);
+
+			err = ensure_wear_leveling(ubi, 1);
+		}
+		else if (in_wl_tree(e, &ubi->used)) {
+			rb_erase(&e->u.rb, &ubi->used);
+			wl_tree_add(e, &ubi->scrub);
+			spin_unlock(&ubi->wl_lock);
+
+			err = ensure_wear_leveling(ubi, 1);
+		}
+		else if (in_wl_tree(e, &ubi->free)) {
+			rb_erase(&e->u.rb, &ubi->free);
+			spin_unlock(&ubi->wl_lock);
+
+			wl_wrk = prepare_erase_work(e, -1, -1, 1);
+			if (IS_ERR(wl_wrk)) {
+				err = PTR_ERR(wl_wrk);
+				goto out;
+			}
+
+			__schedule_ubi_work(ubi, wl_wrk);
+			err = 0;
+		}
+		/*
+		 * e is target of a move operation, all we can do is kicking
+		 * wear leveling such that we can catch it later or wear
+		 * leveling itself scrubbs the PEB.
+		 */
+		else if (ubi->move_to == e || ubi->move_from == e) {
+			spin_unlock(&ubi->wl_lock);
+
+			err = ensure_wear_leveling(ubi, 1);
+		}
+		/*
+		 * e is member of a fastmap pool. We are not allowed to
+		 * remove it from that pool as the on-flash fastmap data
+		 * structure refers to it. Let's schedule a new fastmap write
+		 * such that the said PEB can get released.
+		 */
+		else {
+			ubi_schedule_fm_work(ubi);
+			spin_unlock(&ubi->wl_lock);
+
+			err = 0;
+		}
+	}
+	else {
+		/*
+		 * Ignore read errors as we return only work related errors.
+		 * Read errors will be logged by ubi_io_read().
+		 */
+		err = 0;
+	}
+
+out:
+	atomic_dec(&ubi->bit_rot_work);
+	ubi_assert(atomic_read(&ubi->bit_rot_work) >= 0);
+	return err;
+}
+
+/**
+ * schedule_bitrot_check - schedule a bitrot check work.
+ * @ubi: UBI device description object
+ * @e: the WL entry of the physical eraseblock to check
+ *
+ * This function returns zero in case of success and a %-ENOMEM in case of
+ * failure.
+ */
+static int schedule_bitrot_check(struct ubi_device *ubi,
+				 struct ubi_wl_entry *e)
+{
+	struct ubi_work *wl_wrk;
+
+	ubi_assert(e);
+
+	dbg_wl("schedule bitrot check of PEB %d", e->pnum);
+
+	wl_wrk = kmalloc(sizeof(struct ubi_work), GFP_NOFS);
+	if (!wl_wrk)
+		return -ENOMEM;
+
+	wl_wrk->func = &bitrot_check_worker;
+	wl_wrk->e = e;
+
+	schedule_ubi_work(ubi, wl_wrk);
+	return 0;
+}
+
+/**
+ * ubi_wl_trigger_bitrot_check - triggers a re-read of all physical erase
+ * blocks.
+ * @ubi: UBI device description object
+ */
+void ubi_wl_trigger_bitrot_check(struct ubi_device *ubi)
+{
+	int i;
+	struct ubi_wl_entry *e;
+
+	ubi_msg(ubi, "Running a full read check");
+
+	for (i = 0; i < ubi->peb_count; i++) {
+		spin_lock(&ubi->wl_lock);
+		e = ubi->lookuptbl[i];
+		spin_unlock(&ubi->wl_lock);
+		if (e) {
+			atomic_inc(&ubi->bit_rot_work);
+			schedule_bitrot_check(ubi, e);
+		}
+	}
+}
+
+/**
  * ubi_thread - UBI background thread.
  * @u: the UBI device description object pointer
  */
@@ -1646,6 +1790,8 @@ int ubi_wl_init(struct ubi_device *ubi, struct ubi_attach_info *ai)
 	ubi->avail_pebs -= reserved_pebs;
 	ubi->rsvd_pebs += reserved_pebs;
 
+	atomic_set(&ubi->bit_rot_work, 0);
+
 	/* Schedule wear-leveling if needed */
 	err = ensure_wear_leveling(ubi, 0);
 	if (err)
-- 
1.8.4.5

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] UBI: Implement bitrot checking
  2015-03-29 12:13   ` Richard Weinberger
@ 2015-04-02 17:34     ` Andrea Scian
  -1 siblings, 0 replies; 49+ messages in thread
From: Andrea Scian @ 2015-04-02 17:34 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: linux-mtd, boris.brezillon, linux-kernel, dedekind1


Richard,

Il 29/03/2015 14:13, Richard Weinberger ha scritto:
> +	mutex_lock(&ubi->buf_mutex);
> +	err = ubi_io_read(ubi, ubi->peb_buf, e->pnum, 0, ubi->peb_size);
> +	mutex_unlock(&ubi->buf_mutex);
> +	if (err == UBI_IO_BITFLIPS) {
> +		dbg_wl("found bitflips in PEB %d", e->pnum);
> +		spin_lock(&ubi->wl_lock);
> +

IIUC you trigger the action as soon as you have a bitflip error, is this
correct?

Isn't this too much conservative? You usually have a RBER on MLC devices
that's between 1E-7 (for brand new devices) and 1E-4 (for devices with
1k-2k P/E cycle after 100k-300k read-without-P/E)

Having a few bitflips on a block read is more that usual and current ECC
can correct more that 16 bit error over 512/1KiB.

WDYT?

Kind Regards,

-- 

Andrea SCIAN

DAVE Embedded Systems

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] UBI: Implement bitrot checking
@ 2015-04-02 17:34     ` Andrea Scian
  0 siblings, 0 replies; 49+ messages in thread
From: Andrea Scian @ 2015-04-02 17:34 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: boris.brezillon, linux-mtd, linux-kernel, dedekind1


Richard,

Il 29/03/2015 14:13, Richard Weinberger ha scritto:
> +	mutex_lock(&ubi->buf_mutex);
> +	err = ubi_io_read(ubi, ubi->peb_buf, e->pnum, 0, ubi->peb_size);
> +	mutex_unlock(&ubi->buf_mutex);
> +	if (err == UBI_IO_BITFLIPS) {
> +		dbg_wl("found bitflips in PEB %d", e->pnum);
> +		spin_lock(&ubi->wl_lock);
> +

IIUC you trigger the action as soon as you have a bitflip error, is this
correct?

Isn't this too much conservative? You usually have a RBER on MLC devices
that's between 1E-7 (for brand new devices) and 1E-4 (for devices with
1k-2k P/E cycle after 100k-300k read-without-P/E)

Having a few bitflips on a block read is more that usual and current ECC
can correct more that 16 bit error over 512/1KiB.

WDYT?

Kind Regards,

-- 

Andrea SCIAN

DAVE Embedded Systems

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] UBI: Implement bitrot checking
  2015-04-02 17:34     ` Andrea Scian
@ 2015-04-02 17:54       ` Richard Weinberger
  -1 siblings, 0 replies; 49+ messages in thread
From: Richard Weinberger @ 2015-04-02 17:54 UTC (permalink / raw)
  To: Andrea Scian; +Cc: linux-mtd, boris.brezillon, linux-kernel, dedekind1

Hi!

Am 02.04.2015 um 19:34 schrieb Andrea Scian:
> 
> Richard,
> 
> Il 29/03/2015 14:13, Richard Weinberger ha scritto:
>> +	mutex_lock(&ubi->buf_mutex);
>> +	err = ubi_io_read(ubi, ubi->peb_buf, e->pnum, 0, ubi->peb_size);
>> +	mutex_unlock(&ubi->buf_mutex);
>> +	if (err == UBI_IO_BITFLIPS) {
>> +		dbg_wl("found bitflips in PEB %d", e->pnum);
>> +		spin_lock(&ubi->wl_lock);
>> +
> 
> IIUC you trigger the action as soon as you have a bitflip error, is this
> correct?

I trigger it as soon UBI sees the bitflip. This depends on the configured MTD
bitflip_threshold.

> Isn't this too much conservative? You usually have a RBER on MLC devices
> that's between 1E-7 (for brand new devices) and 1E-4 (for devices with
> 1k-2k P/E cycle after 100k-300k read-without-P/E)
> 
> Having a few bitflips on a block read is more that usual and current ECC
> can correct more that 16 bit error over 512/1KiB.

Please see above. :)

Thanks,
//richard


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] UBI: Implement bitrot checking
@ 2015-04-02 17:54       ` Richard Weinberger
  0 siblings, 0 replies; 49+ messages in thread
From: Richard Weinberger @ 2015-04-02 17:54 UTC (permalink / raw)
  To: Andrea Scian; +Cc: boris.brezillon, linux-mtd, linux-kernel, dedekind1

Hi!

Am 02.04.2015 um 19:34 schrieb Andrea Scian:
> 
> Richard,
> 
> Il 29/03/2015 14:13, Richard Weinberger ha scritto:
>> +	mutex_lock(&ubi->buf_mutex);
>> +	err = ubi_io_read(ubi, ubi->peb_buf, e->pnum, 0, ubi->peb_size);
>> +	mutex_unlock(&ubi->buf_mutex);
>> +	if (err == UBI_IO_BITFLIPS) {
>> +		dbg_wl("found bitflips in PEB %d", e->pnum);
>> +		spin_lock(&ubi->wl_lock);
>> +
> 
> IIUC you trigger the action as soon as you have a bitflip error, is this
> correct?

I trigger it as soon UBI sees the bitflip. This depends on the configured MTD
bitflip_threshold.

> Isn't this too much conservative? You usually have a RBER on MLC devices
> that's between 1E-7 (for brand new devices) and 1E-4 (for devices with
> 1k-2k P/E cycle after 100k-300k read-without-P/E)
> 
> Having a few bitflips on a block read is more that usual and current ECC
> can correct more that 16 bit error over 512/1KiB.

Please see above. :)

Thanks,
//richard

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] UBI: Implement bitrot checking
  2015-04-02 17:54       ` Richard Weinberger
@ 2015-04-02 19:19         ` Andrea Scian
  -1 siblings, 0 replies; 49+ messages in thread
From: Andrea Scian @ 2015-04-02 19:19 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: linux-mtd, boris.brezillon, linux-kernel, dedekind1

Il 02/04/2015 19:54, Richard Weinberger ha scritto:
> Hi!
>
> Am 02.04.2015 um 19:34 schrieb Andrea Scian:
>> Richard,
>>
>> Il 29/03/2015 14:13, Richard Weinberger ha scritto:
>>> +	mutex_lock(&ubi->buf_mutex);
>>> +	err = ubi_io_read(ubi, ubi->peb_buf, e->pnum, 0, ubi->peb_size);
>>> +	mutex_unlock(&ubi->buf_mutex);
>>> +	if (err == UBI_IO_BITFLIPS) {
>>> +		dbg_wl("found bitflips in PEB %d", e->pnum);
>>> +		spin_lock(&ubi->wl_lock);
>>> +
>> IIUC you trigger the action as soon as you have a bitflip error, is this
>> correct?
> I trigger it as soon UBI sees the bitflip. This depends on the configured MTD
> bitflip_threshold.

ops.. I confused UBI bitflip error with MTD bitflip error ;-)
thanks for point this out and help me understanding the code

now it's clear and it sounds good to me

Kind Regards,

-- 

Andrea SCIAN

DAVE Embedded Systems


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] UBI: Implement bitrot checking
@ 2015-04-02 19:19         ` Andrea Scian
  0 siblings, 0 replies; 49+ messages in thread
From: Andrea Scian @ 2015-04-02 19:19 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: boris.brezillon, linux-mtd, linux-kernel, dedekind1

Il 02/04/2015 19:54, Richard Weinberger ha scritto:
> Hi!
>
> Am 02.04.2015 um 19:34 schrieb Andrea Scian:
>> Richard,
>>
>> Il 29/03/2015 14:13, Richard Weinberger ha scritto:
>>> +	mutex_lock(&ubi->buf_mutex);
>>> +	err = ubi_io_read(ubi, ubi->peb_buf, e->pnum, 0, ubi->peb_size);
>>> +	mutex_unlock(&ubi->buf_mutex);
>>> +	if (err == UBI_IO_BITFLIPS) {
>>> +		dbg_wl("found bitflips in PEB %d", e->pnum);
>>> +		spin_lock(&ubi->wl_lock);
>>> +
>> IIUC you trigger the action as soon as you have a bitflip error, is this
>> correct?
> I trigger it as soon UBI sees the bitflip. This depends on the configured MTD
> bitflip_threshold.

ops.. I confused UBI bitflip error with MTD bitflip error ;-)
thanks for point this out and help me understanding the code

now it's clear and it sounds good to me

Kind Regards,

-- 

Andrea SCIAN

DAVE Embedded Systems

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] UBI: Implement bitrot checking
  2015-04-02 19:19         ` Andrea Scian
@ 2015-04-08 10:34           ` Richard Weinberger
  -1 siblings, 0 replies; 49+ messages in thread
From: Richard Weinberger @ 2015-04-08 10:34 UTC (permalink / raw)
  To: Andrea Scian; +Cc: linux-mtd, boris.brezillon, linux-kernel, dedekind1

Am 02.04.2015 um 21:19 schrieb Andrea Scian:
> Il 02/04/2015 19:54, Richard Weinberger ha scritto:
>> Hi!
>>
>> Am 02.04.2015 um 19:34 schrieb Andrea Scian:
>>> Richard,
>>>
>>> Il 29/03/2015 14:13, Richard Weinberger ha scritto:
>>>> +	mutex_lock(&ubi->buf_mutex);
>>>> +	err = ubi_io_read(ubi, ubi->peb_buf, e->pnum, 0, ubi->peb_size);
>>>> +	mutex_unlock(&ubi->buf_mutex);
>>>> +	if (err == UBI_IO_BITFLIPS) {
>>>> +		dbg_wl("found bitflips in PEB %d", e->pnum);
>>>> +		spin_lock(&ubi->wl_lock);
>>>> +
>>> IIUC you trigger the action as soon as you have a bitflip error, is this
>>> correct?
>> I trigger it as soon UBI sees the bitflip. This depends on the configured MTD
>> bitflip_threshold.
> 
> ops.. I confused UBI bitflip error with MTD bitflip error ;-)
> thanks for point this out and help me understanding the code
> 
> now it's clear and it sounds good to me

Is this an implicit Reviewed-by? :)

Thanks,
//richard

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] UBI: Implement bitrot checking
@ 2015-04-08 10:34           ` Richard Weinberger
  0 siblings, 0 replies; 49+ messages in thread
From: Richard Weinberger @ 2015-04-08 10:34 UTC (permalink / raw)
  To: Andrea Scian; +Cc: boris.brezillon, linux-mtd, linux-kernel, dedekind1

Am 02.04.2015 um 21:19 schrieb Andrea Scian:
> Il 02/04/2015 19:54, Richard Weinberger ha scritto:
>> Hi!
>>
>> Am 02.04.2015 um 19:34 schrieb Andrea Scian:
>>> Richard,
>>>
>>> Il 29/03/2015 14:13, Richard Weinberger ha scritto:
>>>> +	mutex_lock(&ubi->buf_mutex);
>>>> +	err = ubi_io_read(ubi, ubi->peb_buf, e->pnum, 0, ubi->peb_size);
>>>> +	mutex_unlock(&ubi->buf_mutex);
>>>> +	if (err == UBI_IO_BITFLIPS) {
>>>> +		dbg_wl("found bitflips in PEB %d", e->pnum);
>>>> +		spin_lock(&ubi->wl_lock);
>>>> +
>>> IIUC you trigger the action as soon as you have a bitflip error, is this
>>> correct?
>> I trigger it as soon UBI sees the bitflip. This depends on the configured MTD
>> bitflip_threshold.
> 
> ops.. I confused UBI bitflip error with MTD bitflip error ;-)
> thanks for point this out and help me understanding the code
> 
> now it's clear and it sounds good to me

Is this an implicit Reviewed-by? :)

Thanks,
//richard

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] UBI: Implement bitrot checking
  2015-03-29 12:13   ` Richard Weinberger
  (?)
  (?)
@ 2015-04-08 11:48   ` David Oberhollenzer
  -1 siblings, 0 replies; 49+ messages in thread
From: David Oberhollenzer @ 2015-04-08 11:48 UTC (permalink / raw)
  To: Richard Weinberger, linux-mtd; +Cc: boris.brezillon, linux-kernel, dedekind1

Reviewed-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] UBI: Implement bitrot checking
  2015-04-08 10:34           ` Richard Weinberger
@ 2015-04-08 21:02             ` Andrea Scian
  -1 siblings, 0 replies; 49+ messages in thread
From: Andrea Scian @ 2015-04-08 21:02 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: linux-mtd, boris.brezillon, linux-kernel, dedekind1

Il 08/04/2015 12:34, Richard Weinberger ha scritto:
> Am 02.04.2015 um 21:19 schrieb Andrea Scian:
>> Il 02/04/2015 19:54, Richard Weinberger ha scritto:
>>> Hi!
>>>
>>> Am 02.04.2015 um 19:34 schrieb Andrea Scian:
>>>> Richard,
>>>>
>>>> Il 29/03/2015 14:13, Richard Weinberger ha scritto:
>>>>> +	mutex_lock(&ubi->buf_mutex);
>>>>> +	err = ubi_io_read(ubi, ubi->peb_buf, e->pnum, 0, ubi->peb_size);
>>>>> +	mutex_unlock(&ubi->buf_mutex);
>>>>> +	if (err == UBI_IO_BITFLIPS) {
>>>>> +		dbg_wl("found bitflips in PEB %d", e->pnum);
>>>>> +		spin_lock(&ubi->wl_lock);
>>>>> +
>>>> IIUC you trigger the action as soon as you have a bitflip error, is this
>>>> correct?
>>> I trigger it as soon UBI sees the bitflip. This depends on the configured MTD
>>> bitflip_threshold.
>>
>> ops.. I confused UBI bitflip error with MTD bitflip error ;-)
>> thanks for point this out and help me understanding the code
>>
>> now it's clear and it sounds good to me
> 
> Is this an implicit Reviewed-by? :)

Yes

Reviewed-by: Andrea Scian <andrea.scian@dave.eu>

(I have lot to learn about submitting kernel patches ;-) )

Unfortunately I cannot review it from a deep technical point of view
because some part of bitrot_check_worker() are not so clear to me (I'm
not an UBI expert and I need to dig a little bit further to have better
understanding), but the idea IMHO is right and it's worth having into UBI

Kind Regards,

-- 

Andrea SCIAN

DAVE Embedded Systems

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] UBI: Implement bitrot checking
@ 2015-04-08 21:02             ` Andrea Scian
  0 siblings, 0 replies; 49+ messages in thread
From: Andrea Scian @ 2015-04-08 21:02 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: boris.brezillon, linux-mtd, linux-kernel, dedekind1

Il 08/04/2015 12:34, Richard Weinberger ha scritto:
> Am 02.04.2015 um 21:19 schrieb Andrea Scian:
>> Il 02/04/2015 19:54, Richard Weinberger ha scritto:
>>> Hi!
>>>
>>> Am 02.04.2015 um 19:34 schrieb Andrea Scian:
>>>> Richard,
>>>>
>>>> Il 29/03/2015 14:13, Richard Weinberger ha scritto:
>>>>> +	mutex_lock(&ubi->buf_mutex);
>>>>> +	err = ubi_io_read(ubi, ubi->peb_buf, e->pnum, 0, ubi->peb_size);
>>>>> +	mutex_unlock(&ubi->buf_mutex);
>>>>> +	if (err == UBI_IO_BITFLIPS) {
>>>>> +		dbg_wl("found bitflips in PEB %d", e->pnum);
>>>>> +		spin_lock(&ubi->wl_lock);
>>>>> +
>>>> IIUC you trigger the action as soon as you have a bitflip error, is this
>>>> correct?
>>> I trigger it as soon UBI sees the bitflip. This depends on the configured MTD
>>> bitflip_threshold.
>>
>> ops.. I confused UBI bitflip error with MTD bitflip error ;-)
>> thanks for point this out and help me understanding the code
>>
>> now it's clear and it sounds good to me
> 
> Is this an implicit Reviewed-by? :)

Yes

Reviewed-by: Andrea Scian <andrea.scian@dave.eu>

(I have lot to learn about submitting kernel patches ;-) )

Unfortunately I cannot review it from a deep technical point of view
because some part of bitrot_check_worker() are not so clear to me (I'm
not an UBI expert and I need to dig a little bit further to have better
understanding), but the idea IMHO is right and it's worth having into UBI

Kind Regards,

-- 

Andrea SCIAN

DAVE Embedded Systems

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] UBI: Implement bitrot checking
  2015-03-29 12:13   ` Richard Weinberger
                     ` (2 preceding siblings ...)
  (?)
@ 2015-04-12 14:12   ` Boris Brezillon
  2015-04-12 16:09     ` Richard Weinberger
  -1 siblings, 1 reply; 49+ messages in thread
From: Boris Brezillon @ 2015-04-12 14:12 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: linux-mtd, linux-kernel, dedekind1

Hi Richard,

Sorry for the late reply.

On Sun, 29 Mar 2015 14:13:17 +0200
Richard Weinberger <richard@nod.at> wrote:

> This patch implements bitrot checking for UBI.
> ubi_wl_trigger_bitrot_check() triggers a re-read of every
> PEB. If a bitflip is detected PEBs in use will get scrubbed
> and free ones erased.

As you'll see, I didn't have much to say about the 'UBI bitrot
detection' mechanism, so this review is a collection of
nitpicks :-).

> 
> Signed-off-by: Richard Weinberger <richard@nod.at>
> ---
>  drivers/mtd/ubi/build.c |  39 +++++++++++++
>  drivers/mtd/ubi/ubi.h   |   4 ++
>  drivers/mtd/ubi/wl.c    | 146 ++++++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 189 insertions(+)
> 
> diff --git a/drivers/mtd/ubi/build.c b/drivers/mtd/ubi/build.c
> index 9690cf9..f58330b 100644
> --- a/drivers/mtd/ubi/build.c
> +++ b/drivers/mtd/ubi/build.c
> @@ -118,6 +118,9 @@ static struct class_attribute ubi_version =
>  
>  static ssize_t dev_attribute_show(struct device *dev,
>  				  struct device_attribute *attr, char *buf);
> +static ssize_t trigger_bitrot_check(struct device *dev,
> +				    struct device_attribute *mattr,
> +				    const char *data, size_t count);
>  
>  /* UBI device attributes (correspond to files in '/<sysfs>/class/ubi/ubiX') */
>  static struct device_attribute dev_eraseblock_size =
> @@ -142,6 +145,8 @@ static struct device_attribute dev_bgt_enabled =
>  	__ATTR(bgt_enabled, S_IRUGO, dev_attribute_show, NULL);
>  static struct device_attribute dev_mtd_num =
>  	__ATTR(mtd_num, S_IRUGO, dev_attribute_show, NULL);
> +static struct device_attribute dev_trigger_bitrot_check =
> +	__ATTR(trigger_bitrot_check, S_IWUSR, NULL, trigger_bitrot_check);

How about making this attribute a RW one, so that users could check
if there's a bitrot check in progress.

>  
>  /**
>   * ubi_volume_notify - send a volume change notification.
> @@ -334,6 +339,36 @@ int ubi_major2num(int major)
>  	return ubi_num;
>  }
>  
> +/* "Store" method for file '/<sysfs>/class/ubi/ubiX/trigger_bitrot_check' */
> +static ssize_t trigger_bitrot_check(struct device *dev,
> +				    struct device_attribute *mattr,
> +				    const char *data, size_t count)
> +{
> +	struct ubi_device *ubi;
> +	int ret;
> +

Maybe that's on purpose, but you do not check the value passed in data
(in your documention you suggest to do an
echo 1 > /sys/class/ubi/ubiX/trigger_bitrot_check).

> +	ubi = container_of(dev, struct ubi_device, dev);
> +	ubi = ubi_get_device(ubi->ubi_num);
> +	if (!ubi) {
> +		ret = -ENODEV;
> +		goto out;
> +	}
> +
> +	if (atomic_inc_return(&ubi->bit_rot_work) != 1) {
> +		ret = -EBUSY;
> +		goto out_dec;
> +	}
> +
> +	ubi_wl_trigger_bitrot_check(ubi);
> +	ret = count;
> +
> +out_dec:
> +	atomic_dec(&ubi->bit_rot_work);
> +out:
> +	ubi_put_device(ubi);
> +	return ret;
> +}
> +
>  /* "Show" method for files in '/<sysfs>/class/ubi/ubiX/' */
>  static ssize_t dev_attribute_show(struct device *dev,
>  				  struct device_attribute *attr, char *buf)
> @@ -445,6 +480,9 @@ static int ubi_sysfs_init(struct ubi_device *ubi, int *ref)
>  	if (err)
>  		return err;
>  	err = device_create_file(&ubi->dev, &dev_mtd_num);
> +	if (err)
> +		return err;
> +	err = device_create_file(&ubi->dev, &dev_trigger_bitrot_check);
>  	return err;

You don't seem to control the return code, so, how about replacing
those 2 lines by:

	return device_create_file(&ubi->dev, &dev_trigger_bitrot_check);

>  }

[...]

> diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c
> index 9b11db9..784bb52 100644
> --- a/drivers/mtd/ubi/wl.c
> +++ b/drivers/mtd/ubi/wl.c
> @@ -1447,6 +1447,150 @@ static void tree_destroy(struct ubi_device *ubi, struct rb_root *root)
>  }
>  
>  /**
> + * bitrot_check_worker - physical eraseblock bitrot check worker function.
> + * @ubi: UBI device description object
> + * @wl_wrk: the work object
> + * @shutdown: non-zero if the worker has to free memory and exit
> + *
> + * This function reads a physical eraseblock and schedules scrubbing if
> + * bit flips are detected.
> + */
> +static int bitrot_check_worker(struct ubi_device *ubi, struct ubi_work *wl_wrk,
> +			       int shutdown)
> +{
> +	struct ubi_wl_entry *e = wl_wrk->e;
> +	int err;
> +
> +	kfree(wl_wrk);
> +	if (shutdown) {
> +		dbg_wl("cancel bitrot check of PEB %d", e->pnum);
> +		wl_entry_destroy(ubi, e);
> +		return 0;
> +	}
> +
> +	mutex_lock(&ubi->buf_mutex);
> +	err = ubi_io_read(ubi, ubi->peb_buf, e->pnum, 0, ubi->peb_size);
> +	mutex_unlock(&ubi->buf_mutex);
> +	if (err == UBI_IO_BITFLIPS) {
> +		dbg_wl("found bitflips in PEB %d", e->pnum);
> +		spin_lock(&ubi->wl_lock);
> +
> +		if (in_pq(ubi, e)) {
> +			prot_queue_del(ubi, e->pnum);
> +			wl_tree_add(e, &ubi->scrub);
> +			spin_unlock(&ubi->wl_lock);
> +
> +			err = ensure_wear_leveling(ubi, 1);
> +		}
> +		else if (in_wl_tree(e, &ubi->used)) {
> +			rb_erase(&e->u.rb, &ubi->used);
> +			wl_tree_add(e, &ubi->scrub);
> +			spin_unlock(&ubi->wl_lock);
> +
> +			err = ensure_wear_leveling(ubi, 1);
> +		}
> +		else if (in_wl_tree(e, &ubi->free)) {
> +			rb_erase(&e->u.rb, &ubi->free);
> +			spin_unlock(&ubi->wl_lock);
> +
> +			wl_wrk = prepare_erase_work(e, -1, -1, 1);
> +			if (IS_ERR(wl_wrk)) {
> +				err = PTR_ERR(wl_wrk);
> +				goto out;
> +			}
> +
> +			__schedule_ubi_work(ubi, wl_wrk);
> +			err = 0;
> +		}
> +		/*
> +		 * e is target of a move operation, all we can do is kicking
> +		 * wear leveling such that we can catch it later or wear
> +		 * leveling itself scrubbs the PEB.
> +		 */
> +		else if (ubi->move_to == e || ubi->move_from == e) {
> +			spin_unlock(&ubi->wl_lock);
> +
> +			err = ensure_wear_leveling(ubi, 1);
> +		}
> +		/*
> +		 * e is member of a fastmap pool. We are not allowed to
> +		 * remove it from that pool as the on-flash fastmap data
> +		 * structure refers to it. Let's schedule a new fastmap write
> +		 * such that the said PEB can get released.
> +		 */
> +		else {
> +			ubi_schedule_fm_work(ubi);
> +			spin_unlock(&ubi->wl_lock);
> +
> +			err = 0;
> +		}

Nitpick, but checkpatch complains about 'else' or 'else if' statements
that are not on the '}' line.

> +	}
> +	else {
> +		/*
> +		 * Ignore read errors as we return only work related errors.
> +		 * Read errors will be logged by ubi_io_read().
> +		 */
> +		err = 0;
> +	}

Nitpicking again, but you can avoid another level of indentation by
doing the following:

	if (err != UBI_IO_BITFLIPS) {
		err = 0;
		goto out;
	}

	dbg_wl("found bitflips in PEB %d", e->pnum);
	spin_lock(&ubi->wl_lock);
	/* ... */

> +
> +out:
> +	atomic_dec(&ubi->bit_rot_work);
> +	ubi_assert(atomic_read(&ubi->bit_rot_work) >= 0);

How about replacing those two lines by:

	ubi_assert(atomic_dec_return(&ubi->bit_rot_work) >= 0);

> +	return err;
> +}

Best Regards,

Boris


-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] UBI: Implement bitrot checking
  2015-03-29 12:13   ` Richard Weinberger
                     ` (3 preceding siblings ...)
  (?)
@ 2015-04-12 15:14   ` Boris Brezillon
  2015-04-12 16:14     ` Richard Weinberger
  -1 siblings, 1 reply; 49+ messages in thread
From: Boris Brezillon @ 2015-04-12 15:14 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: linux-mtd, linux-kernel, dedekind1

Second pass on this patch :-).

On Sun, 29 Mar 2015 14:13:17 +0200
Richard Weinberger <richard@nod.at> wrote:

>  /**
> + * bitrot_check_worker - physical eraseblock bitrot check worker function.
> + * @ubi: UBI device description object
> + * @wl_wrk: the work object
> + * @shutdown: non-zero if the worker has to free memory and exit
> + *
> + * This function reads a physical eraseblock and schedules scrubbing if
> + * bit flips are detected.
> + */
> +static int bitrot_check_worker(struct ubi_device *ubi, struct ubi_work *wl_wrk,
> +			       int shutdown)
> +{
> +	struct ubi_wl_entry *e = wl_wrk->e;
> +	int err;
> +
> +	kfree(wl_wrk);
> +	if (shutdown) {
> +		dbg_wl("cancel bitrot check of PEB %d", e->pnum);
> +		wl_entry_destroy(ubi, e);
> +		return 0;
> +	}
> +
> +	mutex_lock(&ubi->buf_mutex);
> +	err = ubi_io_read(ubi, ubi->peb_buf, e->pnum, 0, ubi->peb_size);
> +	mutex_unlock(&ubi->buf_mutex);
> +	if (err == UBI_IO_BITFLIPS) {
> +		dbg_wl("found bitflips in PEB %d", e->pnum);
> +		spin_lock(&ubi->wl_lock);
> +
> +		if (in_pq(ubi, e)) {
> +			prot_queue_del(ubi, e->pnum);
> +			wl_tree_add(e, &ubi->scrub);
> +			spin_unlock(&ubi->wl_lock);
> +
> +			err = ensure_wear_leveling(ubi, 1);
> +		}
> +		else if (in_wl_tree(e, &ubi->used)) {
> +			rb_erase(&e->u.rb, &ubi->used);
> +			wl_tree_add(e, &ubi->scrub);
> +			spin_unlock(&ubi->wl_lock);
> +
> +			err = ensure_wear_leveling(ubi, 1);
> +		}
> +		else if (in_wl_tree(e, &ubi->free)) {
> +			rb_erase(&e->u.rb, &ubi->free);
> +			spin_unlock(&ubi->wl_lock);
> +

IMHO the following code chunk, starting here:

> +			wl_wrk = prepare_erase_work(e, -1, -1, 1);
> +			if (IS_ERR(wl_wrk)) {
> +				err = PTR_ERR(wl_wrk);
> +				goto out;
> +			}
> +
> +			__schedule_ubi_work(ubi, wl_wrk);

and ending here ^, could be placed in an helper function
(re_erase_peb ?)

> +			err = 0;
> +		}
> +		/*
> +		 * e is target of a move operation, all we can do is kicking
> +		 * wear leveling such that we can catch it later or wear
> +		 * leveling itself scrubbs the PEB.
> +		 */
> +		else if (ubi->move_to == e || ubi->move_from == e) {
> +			spin_unlock(&ubi->wl_lock);
> +
> +			err = ensure_wear_leveling(ubi, 1);
> +		}
> +		/*
> +		 * e is member of a fastmap pool. We are not allowed to
> +		 * remove it from that pool as the on-flash fastmap data
> +		 * structure refers to it. Let's schedule a new fastmap write
> +		 * such that the said PEB can get released.
> +		 */
> +		else {
> +			ubi_schedule_fm_work(ubi);
> +			spin_unlock(&ubi->wl_lock);
> +
> +			err = 0;
> +		}

I'm nitpicking again, but I like to have a single place where spinlocks
are locked and unlocked, so here is a rework suggestion for the code
inside the 'if (err == UBI_IO_BITFLIPS)' statement:

		bool ensure_wl = false, scrub = false, re_erase = false;

		spin_lock(&ubi->wl_lock);

		if (in_pq(ubi, e)) {
			prot_queue_del(ubi, e->pnum);
			scrub = true;
		} else if (in_wl_tree(e, &ubi->used)) {
			rb_erase(&e->u.rb, &ubi->used);
			scrub = true;
		} else if (in_wl_tree(e, &ubi->free)) {
			rb_erase(&e->u.rb, &ubi->free);
			re_erase = true;
		} else if (ubi->move_to == e || ubi->move_from == e) {
			ensure_wl = true;
		} else {
			ubi_schedule_fm_work(ubi);
		}

		if (scrub)
			wl_tree_add(e, &ubi->scrub);

		spin_unlock(&ubi->wl_lock);

		if (scrub || ensure_wl)
			err = ensure_wear_leveling(ubi, 1);
		else if (re_erase)
			err = re_erase_peb(ubi, e);
		else
			err = 0;

Best Regards,

Boris

-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] UBI: Implement bitrot checking
  2015-04-12 14:12   ` Boris Brezillon
@ 2015-04-12 16:09     ` Richard Weinberger
  2015-04-12 16:43       ` Boris Brezillon
  0 siblings, 1 reply; 49+ messages in thread
From: Richard Weinberger @ 2015-04-12 16:09 UTC (permalink / raw)
  To: Boris Brezillon; +Cc: linux-mtd, linux-kernel, dedekind1

Am 12.04.2015 um 16:12 schrieb Boris Brezillon:
> Hi Richard,
> 
> Sorry for the late reply.
> 
> On Sun, 29 Mar 2015 14:13:17 +0200
> Richard Weinberger <richard@nod.at> wrote:
> 
>> This patch implements bitrot checking for UBI.
>> ubi_wl_trigger_bitrot_check() triggers a re-read of every
>> PEB. If a bitflip is detected PEBs in use will get scrubbed
>> and free ones erased.
> 
> As you'll see, I didn't have much to say about the 'UBI bitrot
> detection' mechanism, so this review is a collection of
> nitpicks :-).
> 
>>
>> Signed-off-by: Richard Weinberger <richard@nod.at>
>> ---
>>  drivers/mtd/ubi/build.c |  39 +++++++++++++
>>  drivers/mtd/ubi/ubi.h   |   4 ++
>>  drivers/mtd/ubi/wl.c    | 146 ++++++++++++++++++++++++++++++++++++++++++++++++
>>  3 files changed, 189 insertions(+)
>>
>> diff --git a/drivers/mtd/ubi/build.c b/drivers/mtd/ubi/build.c
>> index 9690cf9..f58330b 100644
>> --- a/drivers/mtd/ubi/build.c
>> +++ b/drivers/mtd/ubi/build.c
>> @@ -118,6 +118,9 @@ static struct class_attribute ubi_version =
>>  
>>  static ssize_t dev_attribute_show(struct device *dev,
>>  				  struct device_attribute *attr, char *buf);
>> +static ssize_t trigger_bitrot_check(struct device *dev,
>> +				    struct device_attribute *mattr,
>> +				    const char *data, size_t count);
>>  
>>  /* UBI device attributes (correspond to files in '/<sysfs>/class/ubi/ubiX') */
>>  static struct device_attribute dev_eraseblock_size =
>> @@ -142,6 +145,8 @@ static struct device_attribute dev_bgt_enabled =
>>  	__ATTR(bgt_enabled, S_IRUGO, dev_attribute_show, NULL);
>>  static struct device_attribute dev_mtd_num =
>>  	__ATTR(mtd_num, S_IRUGO, dev_attribute_show, NULL);
>> +static struct device_attribute dev_trigger_bitrot_check =
>> +	__ATTR(trigger_bitrot_check, S_IWUSR, NULL, trigger_bitrot_check);
> 
> How about making this attribute a RW one, so that users could check
> if there's a bitrot check in progress.

As the check will be initiated only by userspace and writing to the trigger
while a check is running will return anyway a EBUSY I don't really see
a point why userspace would check for it.

>>  
>>  /**
>>   * ubi_volume_notify - send a volume change notification.
>> @@ -334,6 +339,36 @@ int ubi_major2num(int major)
>>  	return ubi_num;
>>  }
>>  
>> +/* "Store" method for file '/<sysfs>/class/ubi/ubiX/trigger_bitrot_check' */
>> +static ssize_t trigger_bitrot_check(struct device *dev,
>> +				    struct device_attribute *mattr,
>> +				    const char *data, size_t count)
>> +{
>> +	struct ubi_device *ubi;
>> +	int ret;
>> +
> 
> Maybe that's on purpose, but you do not check the value passed in data
> (in your documention you suggest to do an
> echo 1 > /sys/class/ubi/ubiX/trigger_bitrot_check).

Yeah, the example using "1", but why should I limit it to it?
The idea was that any write will trigger a check.

>> +	ubi = container_of(dev, struct ubi_device, dev);
>> +	ubi = ubi_get_device(ubi->ubi_num);
>> +	if (!ubi) {
>> +		ret = -ENODEV;
>> +		goto out;
>> +	}
>> +
>> +	if (atomic_inc_return(&ubi->bit_rot_work) != 1) {
>> +		ret = -EBUSY;
>> +		goto out_dec;
>> +	}
>> +
>> +	ubi_wl_trigger_bitrot_check(ubi);
>> +	ret = count;
>> +
>> +out_dec:
>> +	atomic_dec(&ubi->bit_rot_work);
>> +out:
>> +	ubi_put_device(ubi);
>> +	return ret;
>> +}
>> +
>>  /* "Show" method for files in '/<sysfs>/class/ubi/ubiX/' */
>>  static ssize_t dev_attribute_show(struct device *dev,
>>  				  struct device_attribute *attr, char *buf)
>> @@ -445,6 +480,9 @@ static int ubi_sysfs_init(struct ubi_device *ubi, int *ref)
>>  	if (err)
>>  		return err;
>>  	err = device_create_file(&ubi->dev, &dev_mtd_num);
>> +	if (err)
>> +		return err;
>> +	err = device_create_file(&ubi->dev, &dev_trigger_bitrot_check);
>>  	return err;
> 
> You don't seem to control the return code, so, how about replacing
> those 2 lines by:
> 
> 	return device_create_file(&ubi->dev, &dev_trigger_bitrot_check);

I did it exactly like the existing code does.
But I can replace the lines...

>>  }
> 
> [...]
> 
>> diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c
>> index 9b11db9..784bb52 100644
>> --- a/drivers/mtd/ubi/wl.c
>> +++ b/drivers/mtd/ubi/wl.c
>> @@ -1447,6 +1447,150 @@ static void tree_destroy(struct ubi_device *ubi, struct rb_root *root)
>>  }
>>  
>>  /**
>> + * bitrot_check_worker - physical eraseblock bitrot check worker function.
>> + * @ubi: UBI device description object
>> + * @wl_wrk: the work object
>> + * @shutdown: non-zero if the worker has to free memory and exit
>> + *
>> + * This function reads a physical eraseblock and schedules scrubbing if
>> + * bit flips are detected.
>> + */
>> +static int bitrot_check_worker(struct ubi_device *ubi, struct ubi_work *wl_wrk,
>> +			       int shutdown)
>> +{
>> +	struct ubi_wl_entry *e = wl_wrk->e;
>> +	int err;
>> +
>> +	kfree(wl_wrk);
>> +	if (shutdown) {
>> +		dbg_wl("cancel bitrot check of PEB %d", e->pnum);
>> +		wl_entry_destroy(ubi, e);
>> +		return 0;
>> +	}
>> +
>> +	mutex_lock(&ubi->buf_mutex);
>> +	err = ubi_io_read(ubi, ubi->peb_buf, e->pnum, 0, ubi->peb_size);
>> +	mutex_unlock(&ubi->buf_mutex);
>> +	if (err == UBI_IO_BITFLIPS) {
>> +		dbg_wl("found bitflips in PEB %d", e->pnum);
>> +		spin_lock(&ubi->wl_lock);
>> +
>> +		if (in_pq(ubi, e)) {
>> +			prot_queue_del(ubi, e->pnum);
>> +			wl_tree_add(e, &ubi->scrub);
>> +			spin_unlock(&ubi->wl_lock);
>> +
>> +			err = ensure_wear_leveling(ubi, 1);
>> +		}
>> +		else if (in_wl_tree(e, &ubi->used)) {
>> +			rb_erase(&e->u.rb, &ubi->used);
>> +			wl_tree_add(e, &ubi->scrub);
>> +			spin_unlock(&ubi->wl_lock);
>> +
>> +			err = ensure_wear_leveling(ubi, 1);
>> +		}
>> +		else if (in_wl_tree(e, &ubi->free)) {
>> +			rb_erase(&e->u.rb, &ubi->free);
>> +			spin_unlock(&ubi->wl_lock);
>> +
>> +			wl_wrk = prepare_erase_work(e, -1, -1, 1);
>> +			if (IS_ERR(wl_wrk)) {
>> +				err = PTR_ERR(wl_wrk);
>> +				goto out;
>> +			}
>> +
>> +			__schedule_ubi_work(ubi, wl_wrk);
>> +			err = 0;
>> +		}
>> +		/*
>> +		 * e is target of a move operation, all we can do is kicking
>> +		 * wear leveling such that we can catch it later or wear
>> +		 * leveling itself scrubbs the PEB.
>> +		 */
>> +		else if (ubi->move_to == e || ubi->move_from == e) {
>> +			spin_unlock(&ubi->wl_lock);
>> +
>> +			err = ensure_wear_leveling(ubi, 1);
>> +		}
>> +		/*
>> +		 * e is member of a fastmap pool. We are not allowed to
>> +		 * remove it from that pool as the on-flash fastmap data
>> +		 * structure refers to it. Let's schedule a new fastmap write
>> +		 * such that the said PEB can get released.
>> +		 */
>> +		else {
>> +			ubi_schedule_fm_work(ubi);
>> +			spin_unlock(&ubi->wl_lock);
>> +
>> +			err = 0;
>> +		}
> 
> Nitpick, but checkpatch complains about 'else' or 'else if' statements
> that are not on the '}' line.

I like it as is because I can nicely place the comment above the else {.
And checkpatch is not our lawmaker.

>> +	}
>> +	else {
>> +		/*
>> +		 * Ignore read errors as we return only work related errors.
>> +		 * Read errors will be logged by ubi_io_read().
>> +		 */
>> +		err = 0;
>> +	}
> 
> Nitpicking again, but you can avoid another level of indentation by
> doing the following:
> 
> 	if (err != UBI_IO_BITFLIPS) {
> 		err = 0;
> 		goto out;
> 	}
> 
> 	dbg_wl("found bitflips in PEB %d", e->pnum);
> 	spin_lock(&ubi->wl_lock);
> 	/* ... */
> 
>> +
>> +out:
>> +	atomic_dec(&ubi->bit_rot_work);
>> +	ubi_assert(atomic_read(&ubi->bit_rot_work) >= 0);
> 
> How about replacing those two lines by:
> 
> 	ubi_assert(atomic_dec_return(&ubi->bit_rot_work) >= 0);

True, I always forget how many nice atomic_* helper we have. :)

Thanks,
//richard

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] UBI: Implement bitrot checking
  2015-04-12 15:14   ` [PATCH 4/4] UBI: Implement bitrot checking Boris Brezillon
@ 2015-04-12 16:14     ` Richard Weinberger
  2015-04-12 16:31       ` Boris Brezillon
  0 siblings, 1 reply; 49+ messages in thread
From: Richard Weinberger @ 2015-04-12 16:14 UTC (permalink / raw)
  To: Boris Brezillon; +Cc: linux-mtd, linux-kernel, dedekind1

Am 12.04.2015 um 17:14 schrieb Boris Brezillon:
> Second pass on this patch :-).
> 
> On Sun, 29 Mar 2015 14:13:17 +0200
> Richard Weinberger <richard@nod.at> wrote:
> 
>>  /**
>> + * bitrot_check_worker - physical eraseblock bitrot check worker function.
>> + * @ubi: UBI device description object
>> + * @wl_wrk: the work object
>> + * @shutdown: non-zero if the worker has to free memory and exit
>> + *
>> + * This function reads a physical eraseblock and schedules scrubbing if
>> + * bit flips are detected.
>> + */
>> +static int bitrot_check_worker(struct ubi_device *ubi, struct ubi_work *wl_wrk,
>> +			       int shutdown)
>> +{
>> +	struct ubi_wl_entry *e = wl_wrk->e;
>> +	int err;
>> +
>> +	kfree(wl_wrk);
>> +	if (shutdown) {
>> +		dbg_wl("cancel bitrot check of PEB %d", e->pnum);
>> +		wl_entry_destroy(ubi, e);
>> +		return 0;
>> +	}
>> +
>> +	mutex_lock(&ubi->buf_mutex);
>> +	err = ubi_io_read(ubi, ubi->peb_buf, e->pnum, 0, ubi->peb_size);
>> +	mutex_unlock(&ubi->buf_mutex);
>> +	if (err == UBI_IO_BITFLIPS) {
>> +		dbg_wl("found bitflips in PEB %d", e->pnum);
>> +		spin_lock(&ubi->wl_lock);
>> +
>> +		if (in_pq(ubi, e)) {
>> +			prot_queue_del(ubi, e->pnum);
>> +			wl_tree_add(e, &ubi->scrub);
>> +			spin_unlock(&ubi->wl_lock);
>> +
>> +			err = ensure_wear_leveling(ubi, 1);
>> +		}
>> +		else if (in_wl_tree(e, &ubi->used)) {
>> +			rb_erase(&e->u.rb, &ubi->used);
>> +			wl_tree_add(e, &ubi->scrub);
>> +			spin_unlock(&ubi->wl_lock);
>> +
>> +			err = ensure_wear_leveling(ubi, 1);
>> +		}
>> +		else if (in_wl_tree(e, &ubi->free)) {
>> +			rb_erase(&e->u.rb, &ubi->free);
>> +			spin_unlock(&ubi->wl_lock);
>> +
> 
> IMHO the following code chunk, starting here:
> 
>> +			wl_wrk = prepare_erase_work(e, -1, -1, 1);
>> +			if (IS_ERR(wl_wrk)) {
>> +				err = PTR_ERR(wl_wrk);
>> +				goto out;
>> +			}
>> +
>> +			__schedule_ubi_work(ubi, wl_wrk);
> 
> and ending here ^, could be placed in an helper function
> (re_erase_peb ?)

As long we have only one user of that pattern I'd keep it as is.
We have in UBI already a gazillion helper functions.

>> +			err = 0;
>> +		}
>> +		/*
>> +		 * e is target of a move operation, all we can do is kicking
>> +		 * wear leveling such that we can catch it later or wear
>> +		 * leveling itself scrubbs the PEB.
>> +		 */
>> +		else if (ubi->move_to == e || ubi->move_from == e) {
>> +			spin_unlock(&ubi->wl_lock);
>> +
>> +			err = ensure_wear_leveling(ubi, 1);
>> +		}
>> +		/*
>> +		 * e is member of a fastmap pool. We are not allowed to
>> +		 * remove it from that pool as the on-flash fastmap data
>> +		 * structure refers to it. Let's schedule a new fastmap write
>> +		 * such that the said PEB can get released.
>> +		 */
>> +		else {
>> +			ubi_schedule_fm_work(ubi);
>> +			spin_unlock(&ubi->wl_lock);
>> +
>> +			err = 0;
>> +		}
> 
> I'm nitpicking again, but I like to have a single place where spinlocks
> are locked and unlocked, so here is a rework suggestion for the code
> inside the 'if (err == UBI_IO_BITFLIPS)' statement:

A single lock/unlock place is nice but in this case the whole logic fits
into a single page on screen. "do_this" and "do_that" variables don't make
the code more readable IMHO.
But as with all nitpicks it is a matter of taste and we could waste multiple
days on such things.

Thanks,
//richard

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] UBI: Implement bitrot checking
  2015-04-12 16:14     ` Richard Weinberger
@ 2015-04-12 16:31       ` Boris Brezillon
  2015-04-12 16:32         ` Richard Weinberger
  0 siblings, 1 reply; 49+ messages in thread
From: Boris Brezillon @ 2015-04-12 16:31 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: linux-mtd, linux-kernel, dedekind1

On Sun, 12 Apr 2015 18:14:40 +0200
Richard Weinberger <richard@nod.at> wrote:

> > IMHO the following code chunk, starting here:
> > 
> >> +			wl_wrk = prepare_erase_work(e, -1, -1, 1);
> >> +			if (IS_ERR(wl_wrk)) {
> >> +				err = PTR_ERR(wl_wrk);
> >> +				goto out;
> >> +			}
> >> +
> >> +			__schedule_ubi_work(ubi, wl_wrk);
> > 
> > and ending here ^, could be placed in an helper function
> > (re_erase_peb ?)
> 
> As long we have only one user of that pattern I'd keep it as is.
> We have in UBI already a gazillion helper functions.

Okay, then maybe you should comment what you're doing here: erase an
already erased PEB where bitflips have occured.

> 
> >> +			err = 0;
> >> +		}
> >> +		/*
> >> +		 * e is target of a move operation, all we can do is kicking
> >> +		 * wear leveling such that we can catch it later or wear
> >> +		 * leveling itself scrubbs the PEB.
> >> +		 */
> >> +		else if (ubi->move_to == e || ubi->move_from == e) {
> >> +			spin_unlock(&ubi->wl_lock);
> >> +
> >> +			err = ensure_wear_leveling(ubi, 1);
> >> +		}
> >> +		/*
> >> +		 * e is member of a fastmap pool. We are not allowed to
> >> +		 * remove it from that pool as the on-flash fastmap data
> >> +		 * structure refers to it. Let's schedule a new fastmap write
> >> +		 * such that the said PEB can get released.
> >> +		 */
> >> +		else {
> >> +			ubi_schedule_fm_work(ubi);
> >> +			spin_unlock(&ubi->wl_lock);
> >> +
> >> +			err = 0;
> >> +		}
> > 
> > I'm nitpicking again, but I like to have a single place where spinlocks
> > are locked and unlocked, so here is a rework suggestion for the code
> > inside the 'if (err == UBI_IO_BITFLIPS)' statement:
> 
> A single lock/unlock place is nice but in this case the whole logic fits
> into a single page on screen. "do_this" and "do_that" variables don't make
> the code more readable IMHO.
> But as with all nitpicks it is a matter of taste and we could waste multiple
> days on such things.

Isn't that the whole point of code reviews :-P ?


-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] UBI: Implement bitrot checking
  2015-04-12 16:31       ` Boris Brezillon
@ 2015-04-12 16:32         ` Richard Weinberger
  0 siblings, 0 replies; 49+ messages in thread
From: Richard Weinberger @ 2015-04-12 16:32 UTC (permalink / raw)
  To: Boris Brezillon; +Cc: linux-mtd, linux-kernel, dedekind1

Am 12.04.2015 um 18:31 schrieb Boris Brezillon:
> On Sun, 12 Apr 2015 18:14:40 +0200
> Richard Weinberger <richard@nod.at> wrote:
> 
>>> IMHO the following code chunk, starting here:
>>>
>>>> +			wl_wrk = prepare_erase_work(e, -1, -1, 1);
>>>> +			if (IS_ERR(wl_wrk)) {
>>>> +				err = PTR_ERR(wl_wrk);
>>>> +				goto out;
>>>> +			}
>>>> +
>>>> +			__schedule_ubi_work(ubi, wl_wrk);
>>>
>>> and ending here ^, could be placed in an helper function
>>> (re_erase_peb ?)
>>
>> As long we have only one user of that pattern I'd keep it as is.
>> We have in UBI already a gazillion helper functions.
> 
> Okay, then maybe you should comment what you're doing here: erase an
> already erased PEB where bitflips have occured.

Makes sense!

>>
>>>> +			err = 0;
>>>> +		}
>>>> +		/*
>>>> +		 * e is target of a move operation, all we can do is kicking
>>>> +		 * wear leveling such that we can catch it later or wear
>>>> +		 * leveling itself scrubbs the PEB.
>>>> +		 */
>>>> +		else if (ubi->move_to == e || ubi->move_from == e) {
>>>> +			spin_unlock(&ubi->wl_lock);
>>>> +
>>>> +			err = ensure_wear_leveling(ubi, 1);
>>>> +		}
>>>> +		/*
>>>> +		 * e is member of a fastmap pool. We are not allowed to
>>>> +		 * remove it from that pool as the on-flash fastmap data
>>>> +		 * structure refers to it. Let's schedule a new fastmap write
>>>> +		 * such that the said PEB can get released.
>>>> +		 */
>>>> +		else {
>>>> +			ubi_schedule_fm_work(ubi);
>>>> +			spin_unlock(&ubi->wl_lock);
>>>> +
>>>> +			err = 0;
>>>> +		}
>>>
>>> I'm nitpicking again, but I like to have a single place where spinlocks
>>> are locked and unlocked, so here is a rework suggestion for the code
>>> inside the 'if (err == UBI_IO_BITFLIPS)' statement:
>>
>> A single lock/unlock place is nice but in this case the whole logic fits
>> into a single page on screen. "do_this" and "do_that" variables don't make
>> the code more readable IMHO.
>> But as with all nitpicks it is a matter of taste and we could waste multiple
>> days on such things.
> 
> Isn't that the whole point of code reviews :-P ?

;-)

Thanks,
//richard

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] UBI: Implement bitrot checking
  2015-04-12 16:09     ` Richard Weinberger
@ 2015-04-12 16:43       ` Boris Brezillon
  2015-04-12 16:55         ` Richard Weinberger
  0 siblings, 1 reply; 49+ messages in thread
From: Boris Brezillon @ 2015-04-12 16:43 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: linux-mtd, linux-kernel, dedekind1

On Sun, 12 Apr 2015 18:09:23 +0200
Richard Weinberger <richard@nod.at> wrote:

> Am 12.04.2015 um 16:12 schrieb Boris Brezillon:
> > Hi Richard,
> > 
> > Sorry for the late reply.
> > 
> > On Sun, 29 Mar 2015 14:13:17 +0200
> > Richard Weinberger <richard@nod.at> wrote:
> > 
> >> This patch implements bitrot checking for UBI.
> >> ubi_wl_trigger_bitrot_check() triggers a re-read of every
> >> PEB. If a bitflip is detected PEBs in use will get scrubbed
> >> and free ones erased.
> > 
> > As you'll see, I didn't have much to say about the 'UBI bitrot
> > detection' mechanism, so this review is a collection of
> > nitpicks :-).
> > 
> >>
> >> Signed-off-by: Richard Weinberger <richard@nod.at>
> >> ---
> >>  drivers/mtd/ubi/build.c |  39 +++++++++++++
> >>  drivers/mtd/ubi/ubi.h   |   4 ++
> >>  drivers/mtd/ubi/wl.c    | 146 ++++++++++++++++++++++++++++++++++++++++++++++++
> >>  3 files changed, 189 insertions(+)
> >>
> >> diff --git a/drivers/mtd/ubi/build.c b/drivers/mtd/ubi/build.c
> >> index 9690cf9..f58330b 100644
> >> --- a/drivers/mtd/ubi/build.c
> >> +++ b/drivers/mtd/ubi/build.c
> >> @@ -118,6 +118,9 @@ static struct class_attribute ubi_version =
> >>  
> >>  static ssize_t dev_attribute_show(struct device *dev,
> >>  				  struct device_attribute *attr, char *buf);
> >> +static ssize_t trigger_bitrot_check(struct device *dev,
> >> +				    struct device_attribute *mattr,
> >> +				    const char *data, size_t count);
> >>  
> >>  /* UBI device attributes (correspond to files in '/<sysfs>/class/ubi/ubiX') */
> >>  static struct device_attribute dev_eraseblock_size =
> >> @@ -142,6 +145,8 @@ static struct device_attribute dev_bgt_enabled =
> >>  	__ATTR(bgt_enabled, S_IRUGO, dev_attribute_show, NULL);
> >>  static struct device_attribute dev_mtd_num =
> >>  	__ATTR(mtd_num, S_IRUGO, dev_attribute_show, NULL);
> >> +static struct device_attribute dev_trigger_bitrot_check =
> >> +	__ATTR(trigger_bitrot_check, S_IWUSR, NULL, trigger_bitrot_check);
> > 
> > How about making this attribute a RW one, so that users could check
> > if there's a bitrot check in progress.
> 
> As the check will be initiated only by userspace and writing to the trigger
> while a check is running will return anyway a EBUSY I don't really see
> a point why userspace would check for it.

Sometime you just want to know whether something is running or not (in
this case the bitrot check) without risking to trigger a new action...

> 
> >>  
> >>  /**
> >>   * ubi_volume_notify - send a volume change notification.
> >> @@ -334,6 +339,36 @@ int ubi_major2num(int major)
> >>  	return ubi_num;
> >>  }
> >>  
> >> +/* "Store" method for file '/<sysfs>/class/ubi/ubiX/trigger_bitrot_check' */
> >> +static ssize_t trigger_bitrot_check(struct device *dev,
> >> +				    struct device_attribute *mattr,
> >> +				    const char *data, size_t count)
> >> +{
> >> +	struct ubi_device *ubi;
> >> +	int ret;
> >> +
> > 
> > Maybe that's on purpose, but you do not check the value passed in data
> > (in your documention you suggest to do an
> > echo 1 > /sys/class/ubi/ubiX/trigger_bitrot_check).
> 
> Yeah, the example using "1", but why should I limit it to it?
> The idea was that any write will trigger a check.

Okay.


[...]

> >> +		/*
> >> +		 * e is member of a fastmap pool. We are not allowed to
> >> +		 * remove it from that pool as the on-flash fastmap data
> >> +		 * structure refers to it. Let's schedule a new fastmap write
> >> +		 * such that the said PEB can get released.
> >> +		 */
> >> +		else {
> >> +			ubi_schedule_fm_work(ubi);
> >> +			spin_unlock(&ubi->wl_lock);
> >> +
> >> +			err = 0;
> >> +		}
> > 
> > Nitpick, but checkpatch complains about 'else' or 'else if' statements
> > that are not on the '}' line.
> 
> I like it as is because I can nicely place the comment above the else {.
> And checkpatch is not our lawmaker.

You could put your comment after the braces.
Anyway, you might dislike the coding style imposed by kernel
developers/maintainers, but this is what keeps the kernel code
consistent across the different subsystems.
I agree that some checks done by checkpatch can be a bit restrictive in
some cases (like the 80 characters limit), but I really think the
braces and else[ if] placements should be enforced.
This being said, this is your call to make, so I won't complain about
it anymore ;-).

> 
> >> +	}
> >> +	else {
> >> +		/*
> >> +		 * Ignore read errors as we return only work related errors.
> >> +		 * Read errors will be logged by ubi_io_read().
> >> +		 */
> >> +		err = 0;
> >> +	}
> > 
> > Nitpicking again, but you can avoid another level of indentation by
> > doing the following:
> > 
> > 	if (err != UBI_IO_BITFLIPS) {
> > 		err = 0;
> > 		goto out;
> > 	}
> > 
> > 	dbg_wl("found bitflips in PEB %d", e->pnum);
> > 	spin_lock(&ubi->wl_lock);
> > 	/* ... */

You didn't answer to that one.



-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] UBI: Implement bitrot checking
  2015-04-12 16:43       ` Boris Brezillon
@ 2015-04-12 16:55         ` Richard Weinberger
  2015-04-12 20:42             ` Andrea Scian
  0 siblings, 1 reply; 49+ messages in thread
From: Richard Weinberger @ 2015-04-12 16:55 UTC (permalink / raw)
  To: Boris Brezillon; +Cc: linux-mtd, linux-kernel, dedekind1

Am 12.04.2015 um 18:43 schrieb Boris Brezillon:
> On Sun, 12 Apr 2015 18:09:23 +0200
> Richard Weinberger <richard@nod.at> wrote:
> 
>> Am 12.04.2015 um 16:12 schrieb Boris Brezillon:
>>> Hi Richard,
>>>
>>> Sorry for the late reply.
>>>
>>> On Sun, 29 Mar 2015 14:13:17 +0200
>>> Richard Weinberger <richard@nod.at> wrote:
>>>
>>>> This patch implements bitrot checking for UBI.
>>>> ubi_wl_trigger_bitrot_check() triggers a re-read of every
>>>> PEB. If a bitflip is detected PEBs in use will get scrubbed
>>>> and free ones erased.
>>>
>>> As you'll see, I didn't have much to say about the 'UBI bitrot
>>> detection' mechanism, so this review is a collection of
>>> nitpicks :-).
>>>
>>>>
>>>> Signed-off-by: Richard Weinberger <richard@nod.at>
>>>> ---
>>>>  drivers/mtd/ubi/build.c |  39 +++++++++++++
>>>>  drivers/mtd/ubi/ubi.h   |   4 ++
>>>>  drivers/mtd/ubi/wl.c    | 146 ++++++++++++++++++++++++++++++++++++++++++++++++
>>>>  3 files changed, 189 insertions(+)
>>>>
>>>> diff --git a/drivers/mtd/ubi/build.c b/drivers/mtd/ubi/build.c
>>>> index 9690cf9..f58330b 100644
>>>> --- a/drivers/mtd/ubi/build.c
>>>> +++ b/drivers/mtd/ubi/build.c
>>>> @@ -118,6 +118,9 @@ static struct class_attribute ubi_version =
>>>>  
>>>>  static ssize_t dev_attribute_show(struct device *dev,
>>>>  				  struct device_attribute *attr, char *buf);
>>>> +static ssize_t trigger_bitrot_check(struct device *dev,
>>>> +				    struct device_attribute *mattr,
>>>> +				    const char *data, size_t count);
>>>>  
>>>>  /* UBI device attributes (correspond to files in '/<sysfs>/class/ubi/ubiX') */
>>>>  static struct device_attribute dev_eraseblock_size =
>>>> @@ -142,6 +145,8 @@ static struct device_attribute dev_bgt_enabled =
>>>>  	__ATTR(bgt_enabled, S_IRUGO, dev_attribute_show, NULL);
>>>>  static struct device_attribute dev_mtd_num =
>>>>  	__ATTR(mtd_num, S_IRUGO, dev_attribute_show, NULL);
>>>> +static struct device_attribute dev_trigger_bitrot_check =
>>>> +	__ATTR(trigger_bitrot_check, S_IWUSR, NULL, trigger_bitrot_check);
>>>
>>> How about making this attribute a RW one, so that users could check
>>> if there's a bitrot check in progress.
>>
>> As the check will be initiated only by userspace and writing to the trigger
>> while a check is running will return anyway a EBUSY I don't really see
>> a point why userspace would check for it.
> 
> Sometime you just want to know whether something is running or not (in
> this case the bitrot check) without risking to trigger a new action...

Why would they care?
But I can add this feature, no problem.

>>
>>>>  
>>>>  /**
>>>>   * ubi_volume_notify - send a volume change notification.
>>>> @@ -334,6 +339,36 @@ int ubi_major2num(int major)
>>>>  	return ubi_num;
>>>>  }
>>>>  
>>>> +/* "Store" method for file '/<sysfs>/class/ubi/ubiX/trigger_bitrot_check' */
>>>> +static ssize_t trigger_bitrot_check(struct device *dev,
>>>> +				    struct device_attribute *mattr,
>>>> +				    const char *data, size_t count)
>>>> +{
>>>> +	struct ubi_device *ubi;
>>>> +	int ret;
>>>> +
>>>
>>> Maybe that's on purpose, but you do not check the value passed in data
>>> (in your documention you suggest to do an
>>> echo 1 > /sys/class/ubi/ubiX/trigger_bitrot_check).
>>
>> Yeah, the example using "1", but why should I limit it to it?
>> The idea was that any write will trigger a check.
> 
> Okay.
> 
> 
> [...]
> 
>>>> +		/*
>>>> +		 * e is member of a fastmap pool. We are not allowed to
>>>> +		 * remove it from that pool as the on-flash fastmap data
>>>> +		 * structure refers to it. Let's schedule a new fastmap write
>>>> +		 * such that the said PEB can get released.
>>>> +		 */
>>>> +		else {
>>>> +			ubi_schedule_fm_work(ubi);
>>>> +			spin_unlock(&ubi->wl_lock);
>>>> +
>>>> +			err = 0;
>>>> +		}
>>>
>>> Nitpick, but checkpatch complains about 'else' or 'else if' statements
>>> that are not on the '}' line.
>>
>> I like it as is because I can nicely place the comment above the else {.
>> And checkpatch is not our lawmaker.
> 
> You could put your comment after the braces.
> Anyway, you might dislike the coding style imposed by kernel
> developers/maintainers, but this is what keeps the kernel code
> consistent across the different subsystems.
> I agree that some checks done by checkpatch can be a bit restrictive in
> some cases (like the 80 characters limit), but I really think the
> braces and else[ if] placements should be enforced.
> This being said, this is your call to make, so I won't complain about
> it anymore ;-).

It is corner case which is not handled by the kernel coding style IMHO.
The sad thing is that checkpatch is not developed by kernel developers.

>>
>>>> +	}
>>>> +	else {
>>>> +		/*
>>>> +		 * Ignore read errors as we return only work related errors.
>>>> +		 * Read errors will be logged by ubi_io_read().
>>>> +		 */
>>>> +		err = 0;
>>>> +	}
>>>
>>> Nitpicking again, but you can avoid another level of indentation by
>>> doing the following:
>>>
>>> 	if (err != UBI_IO_BITFLIPS) {
>>> 		err = 0;
>>> 		goto out;
>>> 	}
>>>
>>> 	dbg_wl("found bitflips in PEB %d", e->pnum);
>>> 	spin_lock(&ubi->wl_lock);
>>> 	/* ... */
> 
> You didn't answer to that one.

Whoops.
Yeah, that makes sense!

Thanks,
//richard

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] UBI: Implement bitrot checking
  2015-03-29 12:13   ` Richard Weinberger
                     ` (4 preceding siblings ...)
  (?)
@ 2015-04-12 17:01   ` Boris Brezillon
  2015-04-12 17:09     ` Richard Weinberger
  2015-04-12 17:36     ` Richard Weinberger
  -1 siblings, 2 replies; 49+ messages in thread
From: Boris Brezillon @ 2015-04-12 17:01 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: linux-mtd, linux-kernel, dedekind1

Hi Richard,

After the 'coding style related'/'useless' comments, now comes a real
question related to the approach you've taken :-).

On Sun, 29 Mar 2015 14:13:17 +0200
Richard Weinberger <richard@nod.at> wrote:

[...]
> +
> +/**
> + * ubi_wl_trigger_bitrot_check - triggers a re-read of all physical erase
> + * blocks.
> + * @ubi: UBI device description object
> + */
> +void ubi_wl_trigger_bitrot_check(struct ubi_device *ubi)
> +{
> +	int i;
> +	struct ubi_wl_entry *e;
> +
> +	ubi_msg(ubi, "Running a full read check");
> +
> +	for (i = 0; i < ubi->peb_count; i++) {
> +		spin_lock(&ubi->wl_lock);
> +		e = ubi->lookuptbl[i];
> +		spin_unlock(&ubi->wl_lock);
> +		if (e) {
> +			atomic_inc(&ubi->bit_rot_work);
> +			schedule_bitrot_check(ubi, e);
> +		}
> +	}

Do we really need to create a ubi_work per PEB ?
Couldn't we create a single work being rescheduled inside the worker
function (after updating the ubi_wl_entry of course).

I'm pretty sure I'm missing something obvious that you'll probably
point out ;-).


-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] UBI: Implement bitrot checking
  2015-04-12 17:01   ` Boris Brezillon
@ 2015-04-12 17:09     ` Richard Weinberger
  2015-04-12 19:20       ` Boris Brezillon
  2015-04-12 17:36     ` Richard Weinberger
  1 sibling, 1 reply; 49+ messages in thread
From: Richard Weinberger @ 2015-04-12 17:09 UTC (permalink / raw)
  To: Boris Brezillon; +Cc: linux-mtd, linux-kernel, dedekind1

Am 12.04.2015 um 19:01 schrieb Boris Brezillon:
> Hi Richard,
> 
> After the 'coding style related'/'useless' comments, now comes a real
> question related to the approach you've taken :-).
> 
> On Sun, 29 Mar 2015 14:13:17 +0200
> Richard Weinberger <richard@nod.at> wrote:
> 
> [...]
>> +
>> +/**
>> + * ubi_wl_trigger_bitrot_check - triggers a re-read of all physical erase
>> + * blocks.
>> + * @ubi: UBI device description object
>> + */
>> +void ubi_wl_trigger_bitrot_check(struct ubi_device *ubi)
>> +{
>> +	int i;
>> +	struct ubi_wl_entry *e;
>> +
>> +	ubi_msg(ubi, "Running a full read check");
>> +
>> +	for (i = 0; i < ubi->peb_count; i++) {
>> +		spin_lock(&ubi->wl_lock);
>> +		e = ubi->lookuptbl[i];
>> +		spin_unlock(&ubi->wl_lock);
>> +		if (e) {
>> +			atomic_inc(&ubi->bit_rot_work);
>> +			schedule_bitrot_check(ubi, e);
>> +		}
>> +	}
> 
> Do we really need to create a ubi_work per PEB ?
> Couldn't we create a single work being rescheduled inside the worker
> function (after updating the ubi_wl_entry of course).

Currently the UBI worker thread handles one PEB per ubi_work. I didn't wanted
to break that pattern. The downside of that approach is that we need more memory.
A few KiB per run.

I'm not sure if I understood your idea. You mean that we schedule one check for
PEB N and this work will re-schedule again a work for PEB N+1?
Using that approach we can safe memory, yes. But is it worth the hassle?
I'd like to avoid works which schedule again other works.
In the current way it is clear where the work is scheduled and how much.

> I'm pretty sure I'm missing something obvious that you'll probably
> point out ;-).

No no, it is a very good question.

Thanks,
//richard

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] UBI: Implement bitrot checking
  2015-04-12 17:01   ` Boris Brezillon
  2015-04-12 17:09     ` Richard Weinberger
@ 2015-04-12 17:36     ` Richard Weinberger
  1 sibling, 0 replies; 49+ messages in thread
From: Richard Weinberger @ 2015-04-12 17:36 UTC (permalink / raw)
  To: Boris Brezillon; +Cc: linux-mtd, linux-kernel, dedekind1

Am 12.04.2015 um 19:01 schrieb Boris Brezillon:
> Hi Richard,
> 
> After the 'coding style related'/'useless' comments, now comes a real
> question related to the approach you've taken :-).
> 
> On Sun, 29 Mar 2015 14:13:17 +0200
> Richard Weinberger <richard@nod.at> wrote:
> 
> [...]
>> +
>> +/**
>> + * ubi_wl_trigger_bitrot_check - triggers a re-read of all physical erase
>> + * blocks.
>> + * @ubi: UBI device description object
>> + */
>> +void ubi_wl_trigger_bitrot_check(struct ubi_device *ubi)
>> +{
>> +	int i;
>> +	struct ubi_wl_entry *e;
>> +
>> +	ubi_msg(ubi, "Running a full read check");
>> +
>> +	for (i = 0; i < ubi->peb_count; i++) {
>> +		spin_lock(&ubi->wl_lock);
>> +		e = ubi->lookuptbl[i];
>> +		spin_unlock(&ubi->wl_lock);
>> +		if (e) {
>> +			atomic_inc(&ubi->bit_rot_work);
>> +			schedule_bitrot_check(ubi, e);
>> +		}

After re-reading this loop I realized that I've missed a possible race against
other workers. It can happen that we schedule eX and while being scheduled it
is possible that eX turns bad and some worker frees eX under us.
Not very likely but definitely possible.
Let's see if I can find a solution for that without adding new locks or refcounter to
ubi_wl_entry.

I can think of adding a check to the schedule work code which ensures that work
targeting eX is scheduled only once.


Thanks,
//richard

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] UBI: Implement bitrot checking
  2015-04-12 17:09     ` Richard Weinberger
@ 2015-04-12 19:20       ` Boris Brezillon
  2015-04-12 19:53         ` Richard Weinberger
  0 siblings, 1 reply; 49+ messages in thread
From: Boris Brezillon @ 2015-04-12 19:20 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: linux-mtd, linux-kernel, dedekind1

On Sun, 12 Apr 2015 19:09:11 +0200
Richard Weinberger <richard@nod.at> wrote:

> Am 12.04.2015 um 19:01 schrieb Boris Brezillon:
> > Hi Richard,
> > 
> > After the 'coding style related'/'useless' comments, now comes a real
> > question related to the approach you've taken :-).
> > 
> > On Sun, 29 Mar 2015 14:13:17 +0200
> > Richard Weinberger <richard@nod.at> wrote:
> > 
> > [...]
> >> +
> >> +/**
> >> + * ubi_wl_trigger_bitrot_check - triggers a re-read of all physical erase
> >> + * blocks.
> >> + * @ubi: UBI device description object
> >> + */
> >> +void ubi_wl_trigger_bitrot_check(struct ubi_device *ubi)
> >> +{
> >> +	int i;
> >> +	struct ubi_wl_entry *e;
> >> +
> >> +	ubi_msg(ubi, "Running a full read check");
> >> +
> >> +	for (i = 0; i < ubi->peb_count; i++) {
> >> +		spin_lock(&ubi->wl_lock);
> >> +		e = ubi->lookuptbl[i];
> >> +		spin_unlock(&ubi->wl_lock);
> >> +		if (e) {
> >> +			atomic_inc(&ubi->bit_rot_work);
> >> +			schedule_bitrot_check(ubi, e);
> >> +		}
> >> +	}
> > 
> > Do we really need to create a ubi_work per PEB ?
> > Couldn't we create a single work being rescheduled inside the worker
> > function (after updating the ubi_wl_entry of course).
> 
> Currently the UBI worker thread handles one PEB per ubi_work. I didn't wanted
> to break that pattern. The downside of that approach is that we need more memory.
> A few KiB per run.
> 
> I'm not sure if I understood your idea. You mean that we schedule one check for
> PEB N and this work will re-schedule again a work for PEB N+1?

That's exactly what I meant.

> Using that approach we can safe memory, yes. But is it worth the hassle?

Unless I'm missing something, it should be pretty easy to implement:
adding the following lines at the end of bitrot_check_worker() should do
the trick

	if (e->pnum + 1 < ubi->peb_count) {
		wl_wrk->e = ubi->lookuptbl[e->pnum + 1];
		__schedule_ubi_work(ubi, wl_wrk);
	} else {
		atomic_dec(&ubi->bit_rot_work);
	}
	

> I'd like to avoid works which schedule again other works.
> In the current way it is clear where the work is scheduled and how much.

Yes, but the memory consumption induced by this approach can be pretty
big on modern NAND chips (on 32 bit platforms, ubi_work is 32 octets
large, and on modern NANDs you often have 4096 blocks, so a UBI device
of 4000 block is pretty common => 4000 * 32 = 125 KiB).

For standard wear leveling requests, using a ubi_work per request is
sensible since you can't know in advance which block will be queued for
wear-leveling operation next time.
In your case, you're scanning all blocks in ascending order, which
makes it a good candidate for this 'one work for all bitrot checks'
approach.



-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] UBI: Implement bitrot checking
  2015-04-12 19:20       ` Boris Brezillon
@ 2015-04-12 19:53         ` Richard Weinberger
  2015-04-12 21:24           ` Boris Brezillon
  0 siblings, 1 reply; 49+ messages in thread
From: Richard Weinberger @ 2015-04-12 19:53 UTC (permalink / raw)
  To: Boris Brezillon; +Cc: linux-mtd, linux-kernel, dedekind1

Am 12.04.2015 um 21:20 schrieb Boris Brezillon:
> Unless I'm missing something, it should be pretty easy to implement:
> adding the following lines at the end of bitrot_check_worker() should do
> the trick
> 
> 	if (e->pnum + 1 < ubi->peb_count) {
> 		wl_wrk->e = ubi->lookuptbl[e->pnum + 1];
> 		__schedule_ubi_work(ubi, wl_wrk);
> 	} else {
> 		atomic_dec(&ubi->bit_rot_work);
> 	}
> 	

It will suffer from the same race issue as my current approach.
While e is scheduled another worker could free it in case of an fatal
error.

>> I'd like to avoid works which schedule again other works.
>> In the current way it is clear where the work is scheduled and how much.
> 
> Yes, but the memory consumption induced by this approach can be pretty
> big on modern NAND chips (on 32 bit platforms, ubi_work is 32 octets
> large, and on modern NANDs you often have 4096 blocks, so a UBI device
> of 4000 block is pretty common => 4000 * 32 = 125 KiB).

While I agree that consuming memory is not very nice I don't think that 125KiB
is a big deal.

> For standard wear leveling requests, using a ubi_work per request is
> sensible since you can't know in advance which block will be queued for
> wear-leveling operation next time.
> In your case, you're scanning all blocks in ascending order, which
> makes it a good candidate for this 'one work for all bitrot checks'
> approach.

The good news is that I have an idea to solve both problems the race and
the memory issue. It should be pretty easy to implement.
Patches will materialize in a few days.

Thanks,
//richard

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] UBI: Implement bitrot checking (linux-mtd Digest, Vol 145, Issue 24)
  2015-04-12 16:55         ` Richard Weinberger
@ 2015-04-12 20:42             ` Andrea Scian
  0 siblings, 0 replies; 49+ messages in thread
From: Andrea Scian @ 2015-04-12 20:42 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: linux-mtd, Boris Brezillon, linux-kernel, dedekind1


Il 12/04/2015 18:55, Richard Weinberger ha scritto:
> Am 12.04.2015 um 18:43 schrieb Boris Brezillon:
>> On Sun, 12 Apr 2015 18:09:23 +0200
>> Richard Weinberger <richard@nod.at> wrote:
>>
>>> Am 12.04.2015 um 16:12 schrieb Boris Brezillon:
>>>> Hi Richard,
>>>>
>>>> Sorry for the late reply.
>>>>
>>>> On Sun, 29 Mar 2015 14:13:17 +0200
>>>> Richard Weinberger <richard@nod.at> wrote:
>>>>
>>>>> This patch implements bitrot checking for UBI.
>>>>> ubi_wl_trigger_bitrot_check() triggers a re-read of every
>>>>> PEB. If a bitflip is detected PEBs in use will get scrubbed
>>>>> and free ones erased.
>>>>
>>>> As you'll see, I didn't have much to say about the 'UBI bitrot
>>>> detection' mechanism, so this review is a collection of
>>>> nitpicks :-).
>>>>
>>>>>
>>>>> Signed-off-by: Richard Weinberger <richard@nod.at>
>>>>> ---
>>>>>   drivers/mtd/ubi/build.c |  39 +++++++++++++
>>>>>   drivers/mtd/ubi/ubi.h   |   4 ++
>>>>>   drivers/mtd/ubi/wl.c    | 146 ++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>   3 files changed, 189 insertions(+)
>>>>>
>>>>> diff --git a/drivers/mtd/ubi/build.c b/drivers/mtd/ubi/build.c
>>>>> index 9690cf9..f58330b 100644
>>>>> --- a/drivers/mtd/ubi/build.c
>>>>> +++ b/drivers/mtd/ubi/build.c
>>>>> @@ -118,6 +118,9 @@ static struct class_attribute ubi_version =
>>>>>
>>>>>   static ssize_t dev_attribute_show(struct device *dev,
>>>>>   				  struct device_attribute *attr, char *buf);
>>>>> +static ssize_t trigger_bitrot_check(struct device *dev,
>>>>> +				    struct device_attribute *mattr,
>>>>> +				    const char *data, size_t count);
>>>>>
>>>>>   /* UBI device attributes (correspond to files in '/<sysfs>/class/ubi/ubiX') */
>>>>>   static struct device_attribute dev_eraseblock_size =
>>>>> @@ -142,6 +145,8 @@ static struct device_attribute dev_bgt_enabled =
>>>>>   	__ATTR(bgt_enabled, S_IRUGO, dev_attribute_show, NULL);
>>>>>   static struct device_attribute dev_mtd_num =
>>>>>   	__ATTR(mtd_num, S_IRUGO, dev_attribute_show, NULL);
>>>>> +static struct device_attribute dev_trigger_bitrot_check =
>>>>> +	__ATTR(trigger_bitrot_check, S_IWUSR, NULL, trigger_bitrot_check);
>>>>
>>>> How about making this attribute a RW one, so that users could check
>>>> if there's a bitrot check in progress.
>>>
>>> As the check will be initiated only by userspace and writing to the trigger
>>> while a check is running will return anyway a EBUSY I don't really see
>>> a point why userspace would check for it.
>>
>> Sometime you just want to know whether something is running or not (in
>> this case the bitrot check) without risking to trigger a new action...
>
> Why would they care?

I think is always useful to give some additional information in 
userspace, from both debugging and diagnostic point of view.

> But I can add this feature, no problem.

Thanks ;-)

May I ask if can be useful to abort the (IMHO quite long running) operation?
I think it can be useful to save power, e.g. when running on batteries: 
smart systems will trigger the operation when charging and aborting it 
if on batteries (or on low batteries).

What happens if the system need to reboot in the middle of scanning?
Probably nothing at all but I think it's worth asking ;-)
Anyway I think it's better if we can, on runlevel 6, shutdown the 
operation in a clean way

To ask a little bit more from the current implementation, can it be 
useful expand sysfs entry with the current status (stopped, running, 
completed)?
In this way the userspace knows whenever the operation it has triggered, 
it completed successfully or something interrupt it (e.g. an internal 
error). I will schedule a new operation sooner if I have no evidence 
that the last one completed successfully.. WDYT?
But maybe all of this stuff will be implemented inside a daemon with 
additional ioctl() (IIRC Richard already is working on this).

>
>>>
>>>>>
>>>>>   /**
>>>>>    * ubi_volume_notify - send a volume change notification.
>>>>> @@ -334,6 +339,36 @@ int ubi_major2num(int major)
>>>>>   	return ubi_num;
>>>>>   }
>>>>>
>>>>> +/* "Store" method for file '/<sysfs>/class/ubi/ubiX/trigger_bitrot_check' */
>>>>> +static ssize_t trigger_bitrot_check(struct device *dev,
>>>>> +				    struct device_attribute *mattr,
>>>>> +				    const char *data, size_t count)
>>>>> +{
>>>>> +	struct ubi_device *ubi;
>>>>> +	int ret;
>>>>> +
>>>>
>>>> Maybe that's on purpose, but you do not check the value passed in data
>>>> (in your documention you suggest to do an
>>>> echo 1 > /sys/class/ubi/ubiX/trigger_bitrot_check).
>>>
>>> Yeah, the example using "1", but why should I limit it to it?
>>> The idea was that any write will trigger a check.
>>
>> Okay.

See above my reason for having something more that 1 value (or having 
more than one sysfs entry ;-) )

Kind Regards,

Andrea SCIAN

-- 

Andrea SCIAN

DAVE Embedded Systems

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] UBI: Implement bitrot checking (linux-mtd Digest, Vol 145, Issue 24)
@ 2015-04-12 20:42             ` Andrea Scian
  0 siblings, 0 replies; 49+ messages in thread
From: Andrea Scian @ 2015-04-12 20:42 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: Boris Brezillon, linux-mtd, linux-kernel, dedekind1


Il 12/04/2015 18:55, Richard Weinberger ha scritto:
> Am 12.04.2015 um 18:43 schrieb Boris Brezillon:
>> On Sun, 12 Apr 2015 18:09:23 +0200
>> Richard Weinberger <richard@nod.at> wrote:
>>
>>> Am 12.04.2015 um 16:12 schrieb Boris Brezillon:
>>>> Hi Richard,
>>>>
>>>> Sorry for the late reply.
>>>>
>>>> On Sun, 29 Mar 2015 14:13:17 +0200
>>>> Richard Weinberger <richard@nod.at> wrote:
>>>>
>>>>> This patch implements bitrot checking for UBI.
>>>>> ubi_wl_trigger_bitrot_check() triggers a re-read of every
>>>>> PEB. If a bitflip is detected PEBs in use will get scrubbed
>>>>> and free ones erased.
>>>>
>>>> As you'll see, I didn't have much to say about the 'UBI bitrot
>>>> detection' mechanism, so this review is a collection of
>>>> nitpicks :-).
>>>>
>>>>>
>>>>> Signed-off-by: Richard Weinberger <richard@nod.at>
>>>>> ---
>>>>>   drivers/mtd/ubi/build.c |  39 +++++++++++++
>>>>>   drivers/mtd/ubi/ubi.h   |   4 ++
>>>>>   drivers/mtd/ubi/wl.c    | 146 ++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>   3 files changed, 189 insertions(+)
>>>>>
>>>>> diff --git a/drivers/mtd/ubi/build.c b/drivers/mtd/ubi/build.c
>>>>> index 9690cf9..f58330b 100644
>>>>> --- a/drivers/mtd/ubi/build.c
>>>>> +++ b/drivers/mtd/ubi/build.c
>>>>> @@ -118,6 +118,9 @@ static struct class_attribute ubi_version =
>>>>>
>>>>>   static ssize_t dev_attribute_show(struct device *dev,
>>>>>   				  struct device_attribute *attr, char *buf);
>>>>> +static ssize_t trigger_bitrot_check(struct device *dev,
>>>>> +				    struct device_attribute *mattr,
>>>>> +				    const char *data, size_t count);
>>>>>
>>>>>   /* UBI device attributes (correspond to files in '/<sysfs>/class/ubi/ubiX') */
>>>>>   static struct device_attribute dev_eraseblock_size =
>>>>> @@ -142,6 +145,8 @@ static struct device_attribute dev_bgt_enabled =
>>>>>   	__ATTR(bgt_enabled, S_IRUGO, dev_attribute_show, NULL);
>>>>>   static struct device_attribute dev_mtd_num =
>>>>>   	__ATTR(mtd_num, S_IRUGO, dev_attribute_show, NULL);
>>>>> +static struct device_attribute dev_trigger_bitrot_check =
>>>>> +	__ATTR(trigger_bitrot_check, S_IWUSR, NULL, trigger_bitrot_check);
>>>>
>>>> How about making this attribute a RW one, so that users could check
>>>> if there's a bitrot check in progress.
>>>
>>> As the check will be initiated only by userspace and writing to the trigger
>>> while a check is running will return anyway a EBUSY I don't really see
>>> a point why userspace would check for it.
>>
>> Sometime you just want to know whether something is running or not (in
>> this case the bitrot check) without risking to trigger a new action...
>
> Why would they care?

I think is always useful to give some additional information in 
userspace, from both debugging and diagnostic point of view.

> But I can add this feature, no problem.

Thanks ;-)

May I ask if can be useful to abort the (IMHO quite long running) operation?
I think it can be useful to save power, e.g. when running on batteries: 
smart systems will trigger the operation when charging and aborting it 
if on batteries (or on low batteries).

What happens if the system need to reboot in the middle of scanning?
Probably nothing at all but I think it's worth asking ;-)
Anyway I think it's better if we can, on runlevel 6, shutdown the 
operation in a clean way

To ask a little bit more from the current implementation, can it be 
useful expand sysfs entry with the current status (stopped, running, 
completed)?
In this way the userspace knows whenever the operation it has triggered, 
it completed successfully or something interrupt it (e.g. an internal 
error). I will schedule a new operation sooner if I have no evidence 
that the last one completed successfully.. WDYT?
But maybe all of this stuff will be implemented inside a daemon with 
additional ioctl() (IIRC Richard already is working on this).

>
>>>
>>>>>
>>>>>   /**
>>>>>    * ubi_volume_notify - send a volume change notification.
>>>>> @@ -334,6 +339,36 @@ int ubi_major2num(int major)
>>>>>   	return ubi_num;
>>>>>   }
>>>>>
>>>>> +/* "Store" method for file '/<sysfs>/class/ubi/ubiX/trigger_bitrot_check' */
>>>>> +static ssize_t trigger_bitrot_check(struct device *dev,
>>>>> +				    struct device_attribute *mattr,
>>>>> +				    const char *data, size_t count)
>>>>> +{
>>>>> +	struct ubi_device *ubi;
>>>>> +	int ret;
>>>>> +
>>>>
>>>> Maybe that's on purpose, but you do not check the value passed in data
>>>> (in your documention you suggest to do an
>>>> echo 1 > /sys/class/ubi/ubiX/trigger_bitrot_check).
>>>
>>> Yeah, the example using "1", but why should I limit it to it?
>>> The idea was that any write will trigger a check.
>>
>> Okay.

See above my reason for having something more that 1 value (or having 
more than one sysfs entry ;-) )

Kind Regards,

Andrea SCIAN

-- 

Andrea SCIAN

DAVE Embedded Systems

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] UBI: Implement bitrot checking (linux-mtd Digest, Vol 145, Issue 24)
  2015-04-12 20:42             ` Andrea Scian
@ 2015-04-12 21:01               ` Richard Weinberger
  -1 siblings, 0 replies; 49+ messages in thread
From: Richard Weinberger @ 2015-04-12 21:01 UTC (permalink / raw)
  To: Andrea Scian; +Cc: linux-mtd, Boris Brezillon, linux-kernel, dedekind1

Am 12.04.2015 um 22:42 schrieb Andrea Scian:
> 
> Il 12/04/2015 18:55, Richard Weinberger ha scritto:
>> Am 12.04.2015 um 18:43 schrieb Boris Brezillon:
>>> On Sun, 12 Apr 2015 18:09:23 +0200
>>> Richard Weinberger <richard@nod.at> wrote:
>>>
>>>> Am 12.04.2015 um 16:12 schrieb Boris Brezillon:
>>>>> Hi Richard,
>>>>>
>>>>> Sorry for the late reply.
>>>>>
>>>>> On Sun, 29 Mar 2015 14:13:17 +0200
>>>>> Richard Weinberger <richard@nod.at> wrote:
>>>>>
>>>>>> This patch implements bitrot checking for UBI.
>>>>>> ubi_wl_trigger_bitrot_check() triggers a re-read of every
>>>>>> PEB. If a bitflip is detected PEBs in use will get scrubbed
>>>>>> and free ones erased.
>>>>>
>>>>> As you'll see, I didn't have much to say about the 'UBI bitrot
>>>>> detection' mechanism, so this review is a collection of
>>>>> nitpicks :-).
>>>>>
>>>>>>
>>>>>> Signed-off-by: Richard Weinberger <richard@nod.at>
>>>>>> ---
>>>>>>   drivers/mtd/ubi/build.c |  39 +++++++++++++
>>>>>>   drivers/mtd/ubi/ubi.h   |   4 ++
>>>>>>   drivers/mtd/ubi/wl.c    | 146 ++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>>   3 files changed, 189 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/mtd/ubi/build.c b/drivers/mtd/ubi/build.c
>>>>>> index 9690cf9..f58330b 100644
>>>>>> --- a/drivers/mtd/ubi/build.c
>>>>>> +++ b/drivers/mtd/ubi/build.c
>>>>>> @@ -118,6 +118,9 @@ static struct class_attribute ubi_version =
>>>>>>
>>>>>>   static ssize_t dev_attribute_show(struct device *dev,
>>>>>>                     struct device_attribute *attr, char *buf);
>>>>>> +static ssize_t trigger_bitrot_check(struct device *dev,
>>>>>> +                    struct device_attribute *mattr,
>>>>>> +                    const char *data, size_t count);
>>>>>>
>>>>>>   /* UBI device attributes (correspond to files in '/<sysfs>/class/ubi/ubiX') */
>>>>>>   static struct device_attribute dev_eraseblock_size =
>>>>>> @@ -142,6 +145,8 @@ static struct device_attribute dev_bgt_enabled =
>>>>>>       __ATTR(bgt_enabled, S_IRUGO, dev_attribute_show, NULL);
>>>>>>   static struct device_attribute dev_mtd_num =
>>>>>>       __ATTR(mtd_num, S_IRUGO, dev_attribute_show, NULL);
>>>>>> +static struct device_attribute dev_trigger_bitrot_check =
>>>>>> +    __ATTR(trigger_bitrot_check, S_IWUSR, NULL, trigger_bitrot_check);
>>>>>
>>>>> How about making this attribute a RW one, so that users could check
>>>>> if there's a bitrot check in progress.
>>>>
>>>> As the check will be initiated only by userspace and writing to the trigger
>>>> while a check is running will return anyway a EBUSY I don't really see
>>>> a point why userspace would check for it.
>>>
>>> Sometime you just want to know whether something is running or not (in
>>> this case the bitrot check) without risking to trigger a new action...
>>
>> Why would they care?
> 
> I think is always useful to give some additional information in userspace, from both debugging and diagnostic point of view.

The question is, why does userspace care?
Other UBI operations are also not visible...

>> But I can add this feature, no problem.
> 
> Thanks ;-)
> 
> May I ask if can be useful to abort the (IMHO quite long running) operation?
> I think it can be useful to save power, e.g. when running on batteries: smart systems will trigger the operation when charging and aborting it if on batteries (or on low batteries).

If the system is running on low power mode just don't trigger the run...
Userspace controls it.

> What happens if the system need to reboot in the middle of scanning?

Just reboot, UBI can handle that. Work will be canceled.

> Probably nothing at all but I think it's worth asking ;-)
> Anyway I think it's better if we can, on runlevel 6, shutdown the operation in a clean way
> 
> To ask a little bit more from the current implementation, can it be useful expand sysfs entry with the current status (stopped, running, completed)?
> In this way the userspace knows whenever the operation it has triggered, it completed successfully or something interrupt it (e.g. an internal error). I will schedule a new
> operation sooner if I have no evidence that the last one completed successfully.. WDYT?
> But maybe all of this stuff will be implemented inside a daemon with additional ioctl() (IIRC Richard already is working on this).

That's the plan. The interface proposed in that patch series it designed to be a simple replacement for the dd if=/dev/ubiXY of=/dev/null hack.
The next step is adding a more advanced ioctl() interface to support a clever deamon.

Thanks,
//richard

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] UBI: Implement bitrot checking (linux-mtd Digest, Vol 145, Issue 24)
@ 2015-04-12 21:01               ` Richard Weinberger
  0 siblings, 0 replies; 49+ messages in thread
From: Richard Weinberger @ 2015-04-12 21:01 UTC (permalink / raw)
  To: Andrea Scian; +Cc: Boris Brezillon, linux-mtd, linux-kernel, dedekind1

Am 12.04.2015 um 22:42 schrieb Andrea Scian:
> 
> Il 12/04/2015 18:55, Richard Weinberger ha scritto:
>> Am 12.04.2015 um 18:43 schrieb Boris Brezillon:
>>> On Sun, 12 Apr 2015 18:09:23 +0200
>>> Richard Weinberger <richard@nod.at> wrote:
>>>
>>>> Am 12.04.2015 um 16:12 schrieb Boris Brezillon:
>>>>> Hi Richard,
>>>>>
>>>>> Sorry for the late reply.
>>>>>
>>>>> On Sun, 29 Mar 2015 14:13:17 +0200
>>>>> Richard Weinberger <richard@nod.at> wrote:
>>>>>
>>>>>> This patch implements bitrot checking for UBI.
>>>>>> ubi_wl_trigger_bitrot_check() triggers a re-read of every
>>>>>> PEB. If a bitflip is detected PEBs in use will get scrubbed
>>>>>> and free ones erased.
>>>>>
>>>>> As you'll see, I didn't have much to say about the 'UBI bitrot
>>>>> detection' mechanism, so this review is a collection of
>>>>> nitpicks :-).
>>>>>
>>>>>>
>>>>>> Signed-off-by: Richard Weinberger <richard@nod.at>
>>>>>> ---
>>>>>>   drivers/mtd/ubi/build.c |  39 +++++++++++++
>>>>>>   drivers/mtd/ubi/ubi.h   |   4 ++
>>>>>>   drivers/mtd/ubi/wl.c    | 146 ++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>>   3 files changed, 189 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/mtd/ubi/build.c b/drivers/mtd/ubi/build.c
>>>>>> index 9690cf9..f58330b 100644
>>>>>> --- a/drivers/mtd/ubi/build.c
>>>>>> +++ b/drivers/mtd/ubi/build.c
>>>>>> @@ -118,6 +118,9 @@ static struct class_attribute ubi_version =
>>>>>>
>>>>>>   static ssize_t dev_attribute_show(struct device *dev,
>>>>>>                     struct device_attribute *attr, char *buf);
>>>>>> +static ssize_t trigger_bitrot_check(struct device *dev,
>>>>>> +                    struct device_attribute *mattr,
>>>>>> +                    const char *data, size_t count);
>>>>>>
>>>>>>   /* UBI device attributes (correspond to files in '/<sysfs>/class/ubi/ubiX') */
>>>>>>   static struct device_attribute dev_eraseblock_size =
>>>>>> @@ -142,6 +145,8 @@ static struct device_attribute dev_bgt_enabled =
>>>>>>       __ATTR(bgt_enabled, S_IRUGO, dev_attribute_show, NULL);
>>>>>>   static struct device_attribute dev_mtd_num =
>>>>>>       __ATTR(mtd_num, S_IRUGO, dev_attribute_show, NULL);
>>>>>> +static struct device_attribute dev_trigger_bitrot_check =
>>>>>> +    __ATTR(trigger_bitrot_check, S_IWUSR, NULL, trigger_bitrot_check);
>>>>>
>>>>> How about making this attribute a RW one, so that users could check
>>>>> if there's a bitrot check in progress.
>>>>
>>>> As the check will be initiated only by userspace and writing to the trigger
>>>> while a check is running will return anyway a EBUSY I don't really see
>>>> a point why userspace would check for it.
>>>
>>> Sometime you just want to know whether something is running or not (in
>>> this case the bitrot check) without risking to trigger a new action...
>>
>> Why would they care?
> 
> I think is always useful to give some additional information in userspace, from both debugging and diagnostic point of view.

The question is, why does userspace care?
Other UBI operations are also not visible...

>> But I can add this feature, no problem.
> 
> Thanks ;-)
> 
> May I ask if can be useful to abort the (IMHO quite long running) operation?
> I think it can be useful to save power, e.g. when running on batteries: smart systems will trigger the operation when charging and aborting it if on batteries (or on low batteries).

If the system is running on low power mode just don't trigger the run...
Userspace controls it.

> What happens if the system need to reboot in the middle of scanning?

Just reboot, UBI can handle that. Work will be canceled.

> Probably nothing at all but I think it's worth asking ;-)
> Anyway I think it's better if we can, on runlevel 6, shutdown the operation in a clean way
> 
> To ask a little bit more from the current implementation, can it be useful expand sysfs entry with the current status (stopped, running, completed)?
> In this way the userspace knows whenever the operation it has triggered, it completed successfully or something interrupt it (e.g. an internal error). I will schedule a new
> operation sooner if I have no evidence that the last one completed successfully.. WDYT?
> But maybe all of this stuff will be implemented inside a daemon with additional ioctl() (IIRC Richard already is working on this).

That's the plan. The interface proposed in that patch series it designed to be a simple replacement for the dd if=/dev/ubiXY of=/dev/null hack.
The next step is adding a more advanced ioctl() interface to support a clever deamon.

Thanks,
//richard

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] UBI: Implement bitrot checking
  2015-04-12 19:53         ` Richard Weinberger
@ 2015-04-12 21:24           ` Boris Brezillon
  2015-04-12 21:34             ` Richard Weinberger
  0 siblings, 1 reply; 49+ messages in thread
From: Boris Brezillon @ 2015-04-12 21:24 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: linux-mtd, linux-kernel, dedekind1

On Sun, 12 Apr 2015 21:53:12 +0200
Richard Weinberger <richard@nod.at> wrote:

> Am 12.04.2015 um 21:20 schrieb Boris Brezillon:
> > Unless I'm missing something, it should be pretty easy to implement:
> > adding the following lines at the end of bitrot_check_worker() should do
> > the trick
> > 
> > 	if (e->pnum + 1 < ubi->peb_count) {
> > 		wl_wrk->e = ubi->lookuptbl[e->pnum + 1];
> > 		__schedule_ubi_work(ubi, wl_wrk);
> > 	} else {
> > 		atomic_dec(&ubi->bit_rot_work);
> > 	}
> > 	
> 
> It will suffer from the same race issue as my current approach.
> While e is scheduled another worker could free it in case of an fatal
> error.

Right, I guess grabing wl_lock before retrieving ->e (and iterating
over the lookuptbl until it's != NULL) would partly solve the problem.
But I'm not sure how you would handle this sequence (not sure it can
happen though):
1/ schedule bitrot check on PEB X
2/ execute some operation on PEB x that might free PEB X entry in the
   lookuptbl
3/ execute bitrot check on PEB X

In this case the ->e is invalid (pointing to a freed memory region) at
the time bitrot check is executed.

Of course, if you're guaranteed that ubi_work are executed in the
correct order (FIFO) this should never happen, because the scheduled
operation messing up with lookuptbl entry could have been detected
before bitrot work insertion.

> 
> >> I'd like to avoid works which schedule again other works.
> >> In the current way it is clear where the work is scheduled and how much.
> > 
> > Yes, but the memory consumption induced by this approach can be pretty
> > big on modern NAND chips (on 32 bit platforms, ubi_work is 32 octets
> > large, and on modern NANDs you often have 4096 blocks, so a UBI device
> > of 4000 block is pretty common => 4000 * 32 = 125 KiB).
> 
> While I agree that consuming memory is not very nice I don't think that 125KiB
> is a big deal.

Hm, a few weeks ago, when I suggested to store information about PEBs in
order to better choose the next block to be checked for bitrot, one of
your argument to reject that approach was the memory consumption of
such a design.
In my case the only thing I needed was the following structure (one
instance per PEB):

struct ubi_peb_statistics {
	struct list_head node;
	int pnum;
	int bitflips;
	int last_full_read; /* in seconds */
	int last_partial_write; /* in seconds */
};

which is 24 bytes large.

I definitely understand the memory consumption argument, but that's not
something you can change depending on who's proposing the solution :-).

> 
> > For standard wear leveling requests, using a ubi_work per request is
> > sensible since you can't know in advance which block will be queued for
> > wear-leveling operation next time.
> > In your case, you're scanning all blocks in ascending order, which
> > makes it a good candidate for this 'one work for all bitrot checks'
> > approach.
> 
> The good news is that I have an idea to solve both problems the race and
> the memory issue. It should be pretty easy to implement.
> Patches will materialize in a few days.

Great!

Best Regards,

Boris


-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] UBI: Implement bitrot checking (linux-mtd Digest, Vol 145, Issue 24)
  2015-04-12 21:01               ` Richard Weinberger
  (?)
@ 2015-04-12 21:30               ` Boris Brezillon
  2015-04-12 21:37                 ` Richard Weinberger
  -1 siblings, 1 reply; 49+ messages in thread
From: Boris Brezillon @ 2015-04-12 21:30 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: Andrea Scian, linux-mtd, linux-kernel, dedekind1

On Sun, 12 Apr 2015 23:01:27 +0200
Richard Weinberger <richard@nod.at> wrote:


> >>>>>> +static struct device_attribute dev_trigger_bitrot_check =
> >>>>>> +    __ATTR(trigger_bitrot_check, S_IWUSR, NULL, trigger_bitrot_check);
> >>>>>
> >>>>> How about making this attribute a RW one, so that users could check
> >>>>> if there's a bitrot check in progress.
> >>>>
> >>>> As the check will be initiated only by userspace and writing to the trigger
> >>>> while a check is running will return anyway a EBUSY I don't really see
> >>>> a point why userspace would check for it.
> >>>
> >>> Sometime you just want to know whether something is running or not (in
> >>> this case the bitrot check) without risking to trigger a new action...
> >>
> >> Why would they care?
> > 
> > I think is always useful to give some additional information in userspace, from both debugging and diagnostic point of view.
> 
> The question is, why does userspace care?
> Other UBI operations are also not visible...

Yes, but AFAIK other wear-leveling operations are not directly triggered
by user-space.


-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] UBI: Implement bitrot checking (linux-mtd Digest, Vol 145, Issue 24)
  2015-04-12 21:01               ` Richard Weinberger
@ 2015-04-12 21:33                 ` Andrea Scian
  -1 siblings, 0 replies; 49+ messages in thread
From: Andrea Scian @ 2015-04-12 21:33 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: linux-mtd, Boris Brezillon, linux-kernel, dedekind1

Il 12/04/2015 23:01, Richard Weinberger ha scritto:
> Am 12.04.2015 um 22:42 schrieb Andrea Scian:
>> Il 12/04/2015 18:55, Richard Weinberger ha scritto:
>>> Am 12.04.2015 um 18:43 schrieb Boris Brezillon:
>>>> On Sun, 12 Apr 2015 18:09:23 +0200
>>>> Richard Weinberger <richard@nod.at> wrote:
>>>>
>>>>> Am 12.04.2015 um 16:12 schrieb Boris Brezillon:
>>>>>> Hi Richard,
>>>>>>
>>>>>> Sorry for the late reply.
>>>>>>
>>>>>> On Sun, 29 Mar 2015 14:13:17 +0200
>>>>>> Richard Weinberger <richard@nod.at> wrote:
>>>>>>
>>>>>>> This patch implements bitrot checking for UBI.
>>>>>>> ubi_wl_trigger_bitrot_check() triggers a re-read of every
>>>>>>> PEB. If a bitflip is detected PEBs in use will get scrubbed
>>>>>>> and free ones erased.
>>>>>> As you'll see, I didn't have much to say about the 'UBI bitrot
>>>>>> detection' mechanism, so this review is a collection of
>>>>>> nitpicks :-).
>>>>>>
>>>>>>> Signed-off-by: Richard Weinberger <richard@nod.at>
>>>>>>> ---
>>>>>>>    drivers/mtd/ubi/build.c |  39 +++++++++++++
>>>>>>>    drivers/mtd/ubi/ubi.h   |   4 ++
>>>>>>>    drivers/mtd/ubi/wl.c    | 146 ++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>>>    3 files changed, 189 insertions(+)
>>>>>>>
>>>>>>> diff --git a/drivers/mtd/ubi/build.c b/drivers/mtd/ubi/build.c
>>>>>>> index 9690cf9..f58330b 100644
>>>>>>> --- a/drivers/mtd/ubi/build.c
>>>>>>> +++ b/drivers/mtd/ubi/build.c
>>>>>>> @@ -118,6 +118,9 @@ static struct class_attribute ubi_version =
>>>>>>>
>>>>>>>    static ssize_t dev_attribute_show(struct device *dev,
>>>>>>>                      struct device_attribute *attr, char *buf);
>>>>>>> +static ssize_t trigger_bitrot_check(struct device *dev,
>>>>>>> +                    struct device_attribute *mattr,
>>>>>>> +                    const char *data, size_t count);
>>>>>>>
>>>>>>>    /* UBI device attributes (correspond to files in '/<sysfs>/class/ubi/ubiX') */
>>>>>>>    static struct device_attribute dev_eraseblock_size =
>>>>>>> @@ -142,6 +145,8 @@ static struct device_attribute dev_bgt_enabled =
>>>>>>>        __ATTR(bgt_enabled, S_IRUGO, dev_attribute_show, NULL);
>>>>>>>    static struct device_attribute dev_mtd_num =
>>>>>>>        __ATTR(mtd_num, S_IRUGO, dev_attribute_show, NULL);
>>>>>>> +static struct device_attribute dev_trigger_bitrot_check =
>>>>>>> +    __ATTR(trigger_bitrot_check, S_IWUSR, NULL, trigger_bitrot_check);
>>>>>> How about making this attribute a RW one, so that users could check
>>>>>> if there's a bitrot check in progress.
>>>>> As the check will be initiated only by userspace and writing to the trigger
>>>>> while a check is running will return anyway a EBUSY I don't really see
>>>>> a point why userspace would check for it.
>>>> Sometime you just want to know whether something is running or not (in
>>>> this case the bitrot check) without risking to trigger a new action...
>>> Why would they care?
>> I think is always useful to give some additional information in userspace, from both debugging and diagnostic point of view.
> The question is, why does userspace care?

Because the userspace trigger it

> Other UBI operations are also not visible...

I don't really know much about UBI internals, but I think not many 
operation are triggered from userspace (apart from ubiformat and mount ;-) )


>>> But I can add this feature, no problem.
>> Thanks ;-)
>>
>> May I ask if can be useful to abort the (IMHO quite long running) operation?
>> I think it can be useful to save power, e.g. when running on batteries: smart systems will trigger the operation when charging and aborting it if on batteries (or on low batteries).
> If the system is running on low power mode just don't trigger the run...
> Userspace controls it.

It heavily depends on how long the operation takes, we may have 4 to 32 
GiB devices so I think it may take more than just a few minutes to scan 
the whole device (and additional time depending on how many scrub 
operation are needed).
You may start the operation when power is not an issue (e.g. while 
charging) and when it is (e.g. when running on batteries and you need to 
save any mAh you have ;-) ) it may be still running for a while..
I know that this is some kind of advance feature, but I would like to 
point this out for comments.

>
>> What happens if the system need to reboot in the middle of scanning?
> Just reboot, UBI can handle that. Work will be canceled.

Thanks for the confirm

>
>> Probably nothing at all but I think it's worth asking ;-)
>> Anyway I think it's better if we can, on runlevel 6, shutdown the operation in a clean way
>>
>> To ask a little bit more from the current implementation, can it be useful expand sysfs entry with the current status (stopped, running, completed)?
>> In this way the userspace knows whenever the operation it has triggered, it completed successfully or something interrupt it (e.g. an internal error). I will schedule a new
>> operation sooner if I have no evidence that the last one completed successfully.. WDYT?
>> But maybe all of this stuff will be implemented inside a daemon with additional ioctl() (IIRC Richard already is working on this).
> That's the plan. The interface proposed in that patch series it designed to be a simple replacement for the dd if=/dev/ubiXY of=/dev/null hack.

dd can be stopped if needed and you may also have the progress status :-)

for sure you probably don't need all of the above additional stuff but 
it may be useful anyway
Maybe it's better to have an initial working implementation and add some 
more (backward compatible?) features later.

> The next step is adding a more advanced ioctl() interface to support a clever deamon.

Kind Regards and thanks for you comments,

-- 

Andrea SCIAN

DAVE Embedded Systems


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] UBI: Implement bitrot checking (linux-mtd Digest, Vol 145, Issue 24)
@ 2015-04-12 21:33                 ` Andrea Scian
  0 siblings, 0 replies; 49+ messages in thread
From: Andrea Scian @ 2015-04-12 21:33 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: Boris Brezillon, linux-mtd, linux-kernel, dedekind1

Il 12/04/2015 23:01, Richard Weinberger ha scritto:
> Am 12.04.2015 um 22:42 schrieb Andrea Scian:
>> Il 12/04/2015 18:55, Richard Weinberger ha scritto:
>>> Am 12.04.2015 um 18:43 schrieb Boris Brezillon:
>>>> On Sun, 12 Apr 2015 18:09:23 +0200
>>>> Richard Weinberger <richard@nod.at> wrote:
>>>>
>>>>> Am 12.04.2015 um 16:12 schrieb Boris Brezillon:
>>>>>> Hi Richard,
>>>>>>
>>>>>> Sorry for the late reply.
>>>>>>
>>>>>> On Sun, 29 Mar 2015 14:13:17 +0200
>>>>>> Richard Weinberger <richard@nod.at> wrote:
>>>>>>
>>>>>>> This patch implements bitrot checking for UBI.
>>>>>>> ubi_wl_trigger_bitrot_check() triggers a re-read of every
>>>>>>> PEB. If a bitflip is detected PEBs in use will get scrubbed
>>>>>>> and free ones erased.
>>>>>> As you'll see, I didn't have much to say about the 'UBI bitrot
>>>>>> detection' mechanism, so this review is a collection of
>>>>>> nitpicks :-).
>>>>>>
>>>>>>> Signed-off-by: Richard Weinberger <richard@nod.at>
>>>>>>> ---
>>>>>>>    drivers/mtd/ubi/build.c |  39 +++++++++++++
>>>>>>>    drivers/mtd/ubi/ubi.h   |   4 ++
>>>>>>>    drivers/mtd/ubi/wl.c    | 146 ++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>>>    3 files changed, 189 insertions(+)
>>>>>>>
>>>>>>> diff --git a/drivers/mtd/ubi/build.c b/drivers/mtd/ubi/build.c
>>>>>>> index 9690cf9..f58330b 100644
>>>>>>> --- a/drivers/mtd/ubi/build.c
>>>>>>> +++ b/drivers/mtd/ubi/build.c
>>>>>>> @@ -118,6 +118,9 @@ static struct class_attribute ubi_version =
>>>>>>>
>>>>>>>    static ssize_t dev_attribute_show(struct device *dev,
>>>>>>>                      struct device_attribute *attr, char *buf);
>>>>>>> +static ssize_t trigger_bitrot_check(struct device *dev,
>>>>>>> +                    struct device_attribute *mattr,
>>>>>>> +                    const char *data, size_t count);
>>>>>>>
>>>>>>>    /* UBI device attributes (correspond to files in '/<sysfs>/class/ubi/ubiX') */
>>>>>>>    static struct device_attribute dev_eraseblock_size =
>>>>>>> @@ -142,6 +145,8 @@ static struct device_attribute dev_bgt_enabled =
>>>>>>>        __ATTR(bgt_enabled, S_IRUGO, dev_attribute_show, NULL);
>>>>>>>    static struct device_attribute dev_mtd_num =
>>>>>>>        __ATTR(mtd_num, S_IRUGO, dev_attribute_show, NULL);
>>>>>>> +static struct device_attribute dev_trigger_bitrot_check =
>>>>>>> +    __ATTR(trigger_bitrot_check, S_IWUSR, NULL, trigger_bitrot_check);
>>>>>> How about making this attribute a RW one, so that users could check
>>>>>> if there's a bitrot check in progress.
>>>>> As the check will be initiated only by userspace and writing to the trigger
>>>>> while a check is running will return anyway a EBUSY I don't really see
>>>>> a point why userspace would check for it.
>>>> Sometime you just want to know whether something is running or not (in
>>>> this case the bitrot check) without risking to trigger a new action...
>>> Why would they care?
>> I think is always useful to give some additional information in userspace, from both debugging and diagnostic point of view.
> The question is, why does userspace care?

Because the userspace trigger it

> Other UBI operations are also not visible...

I don't really know much about UBI internals, but I think not many 
operation are triggered from userspace (apart from ubiformat and mount ;-) )


>>> But I can add this feature, no problem.
>> Thanks ;-)
>>
>> May I ask if can be useful to abort the (IMHO quite long running) operation?
>> I think it can be useful to save power, e.g. when running on batteries: smart systems will trigger the operation when charging and aborting it if on batteries (or on low batteries).
> If the system is running on low power mode just don't trigger the run...
> Userspace controls it.

It heavily depends on how long the operation takes, we may have 4 to 32 
GiB devices so I think it may take more than just a few minutes to scan 
the whole device (and additional time depending on how many scrub 
operation are needed).
You may start the operation when power is not an issue (e.g. while 
charging) and when it is (e.g. when running on batteries and you need to 
save any mAh you have ;-) ) it may be still running for a while..
I know that this is some kind of advance feature, but I would like to 
point this out for comments.

>
>> What happens if the system need to reboot in the middle of scanning?
> Just reboot, UBI can handle that. Work will be canceled.

Thanks for the confirm

>
>> Probably nothing at all but I think it's worth asking ;-)
>> Anyway I think it's better if we can, on runlevel 6, shutdown the operation in a clean way
>>
>> To ask a little bit more from the current implementation, can it be useful expand sysfs entry with the current status (stopped, running, completed)?
>> In this way the userspace knows whenever the operation it has triggered, it completed successfully or something interrupt it (e.g. an internal error). I will schedule a new
>> operation sooner if I have no evidence that the last one completed successfully.. WDYT?
>> But maybe all of this stuff will be implemented inside a daemon with additional ioctl() (IIRC Richard already is working on this).
> That's the plan. The interface proposed in that patch series it designed to be a simple replacement for the dd if=/dev/ubiXY of=/dev/null hack.

dd can be stopped if needed and you may also have the progress status :-)

for sure you probably don't need all of the above additional stuff but 
it may be useful anyway
Maybe it's better to have an initial working implementation and add some 
more (backward compatible?) features later.

> The next step is adding a more advanced ioctl() interface to support a clever deamon.

Kind Regards and thanks for you comments,

-- 

Andrea SCIAN

DAVE Embedded Systems

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] UBI: Implement bitrot checking
  2015-04-12 21:24           ` Boris Brezillon
@ 2015-04-12 21:34             ` Richard Weinberger
  2015-04-13  3:36               ` nick
  0 siblings, 1 reply; 49+ messages in thread
From: Richard Weinberger @ 2015-04-12 21:34 UTC (permalink / raw)
  To: Boris Brezillon; +Cc: linux-mtd, linux-kernel, dedekind1

Am 12.04.2015 um 23:24 schrieb Boris Brezillon:
> On Sun, 12 Apr 2015 21:53:12 +0200
> Richard Weinberger <richard@nod.at> wrote:
> 
>> Am 12.04.2015 um 21:20 schrieb Boris Brezillon:
>>> Unless I'm missing something, it should be pretty easy to implement:
>>> adding the following lines at the end of bitrot_check_worker() should do
>>> the trick
>>>
>>> 	if (e->pnum + 1 < ubi->peb_count) {
>>> 		wl_wrk->e = ubi->lookuptbl[e->pnum + 1];
>>> 		__schedule_ubi_work(ubi, wl_wrk);
>>> 	} else {
>>> 		atomic_dec(&ubi->bit_rot_work);
>>> 	}
>>> 	
>>
>> It will suffer from the same race issue as my current approach.
>> While e is scheduled another worker could free it in case of an fatal
>> error.
> 
> Right, I guess grabing wl_lock before retrieving ->e (and iterating
> over the lookuptbl until it's != NULL) would partly solve the problem.
> But I'm not sure how you would handle this sequence (not sure it can
> happen though):
> 1/ schedule bitrot check on PEB X
> 2/ execute some operation on PEB x that might free PEB X entry in the
>    lookuptbl
> 3/ execute bitrot check on PEB X
> 
> In this case the ->e is invalid (pointing to a freed memory region) at
> the time bitrot check is executed.

We have to make sure that we remove e from the data structure.
This is what UBI currently does for all works.
That way only one work can operate on an ubi_wl_entry.

> Of course, if you're guaranteed that ubi_work are executed in the
> correct order (FIFO) this should never happen, because the scheduled
> operation messing up with lookuptbl entry could have been detected
> before bitrot work insertion.
> 
>>
>>>> I'd like to avoid works which schedule again other works.
>>>> In the current way it is clear where the work is scheduled and how much.
>>>
>>> Yes, but the memory consumption induced by this approach can be pretty
>>> big on modern NAND chips (on 32 bit platforms, ubi_work is 32 octets
>>> large, and on modern NANDs you often have 4096 blocks, so a UBI device
>>> of 4000 block is pretty common => 4000 * 32 = 125 KiB).
>>
>> While I agree that consuming memory is not very nice I don't think that 125KiB
>> is a big deal.
> 
> Hm, a few weeks ago, when I suggested to store information about PEBs in
> order to better choose the next block to be checked for bitrot, one of
> your argument to reject that approach was the memory consumption of
> such a design.
> In my case the only thing I needed was the following structure (one
> instance per PEB):
> 
> struct ubi_peb_statistics {
> 	struct list_head node;
> 	int pnum;
> 	int bitflips;
> 	int last_full_read; /* in seconds */
> 	int last_partial_write; /* in seconds */
> };
> 
> which is 24 bytes large.
> 
> I definitely understand the memory consumption argument, but that's not
> something you can change depending on who's proposing the solution :-).

Yeah, but this structure remains in memory forever, right?
In the bitrot case we allocate the memory only temporary.

That said, my arguments are not perfect nor irreversible,
it can happen that I mess up or was simply wrong.
Just beat me down with my own arguments when I deserve it.

Thanks,
//richard

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] UBI: Implement bitrot checking (linux-mtd Digest, Vol 145, Issue 24)
  2015-04-12 21:30               ` Boris Brezillon
@ 2015-04-12 21:37                 ` Richard Weinberger
  0 siblings, 0 replies; 49+ messages in thread
From: Richard Weinberger @ 2015-04-12 21:37 UTC (permalink / raw)
  To: Boris Brezillon; +Cc: Andrea Scian, linux-mtd, linux-kernel, dedekind1

Am 12.04.2015 um 23:30 schrieb Boris Brezillon:
> On Sun, 12 Apr 2015 23:01:27 +0200
> Richard Weinberger <richard@nod.at> wrote:
> 
> 
>>>>>>>> +static struct device_attribute dev_trigger_bitrot_check =
>>>>>>>> +    __ATTR(trigger_bitrot_check, S_IWUSR, NULL, trigger_bitrot_check);
>>>>>>>
>>>>>>> How about making this attribute a RW one, so that users could check
>>>>>>> if there's a bitrot check in progress.
>>>>>>
>>>>>> As the check will be initiated only by userspace and writing to the trigger
>>>>>> while a check is running will return anyway a EBUSY I don't really see
>>>>>> a point why userspace would check for it.
>>>>>
>>>>> Sometime you just want to know whether something is running or not (in
>>>>> this case the bitrot check) without risking to trigger a new action...
>>>>
>>>> Why would they care?
>>>
>>> I think is always useful to give some additional information in userspace, from both debugging and diagnostic point of view.
>>
>> The question is, why does userspace care?
>> Other UBI operations are also not visible...
> 
> Yes, but AFAIK other wear-leveling operations are not directly triggered
> by user-space.

If userspace trigger is, it knows that it was triggered.
Unless you have multiple tools/scripts which want to trigger it,
which is IMHO a problem anyways.

But as stated before, I can add an indicator.

Thanks,
//richard

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] UBI: Implement bitrot checking (linux-mtd Digest, Vol 145, Issue 24)
  2015-04-12 21:33                 ` Andrea Scian
@ 2015-04-12 21:42                   ` Richard Weinberger
  -1 siblings, 0 replies; 49+ messages in thread
From: Richard Weinberger @ 2015-04-12 21:42 UTC (permalink / raw)
  To: Andrea Scian; +Cc: linux-mtd, Boris Brezillon, linux-kernel, dedekind1

Am 12.04.2015 um 23:33 schrieb Andrea Scian:
>>> I think is always useful to give some additional information in userspace, from both debugging and diagnostic point of view.
>> The question is, why does userspace care?
> 
> Because the userspace trigger it

As written in the previous mail, then userspace knows anyway the state.

> 
>> Other UBI operations are also not visible...
> 
> I don't really know much about UBI internals, but I think not many operation are triggered from userspace (apart from ubiformat and mount ;-) )
> 
> 
>>>> But I can add this feature, no problem.
>>> Thanks ;-)
>>>
>>> May I ask if can be useful to abort the (IMHO quite long running) operation?
>>> I think it can be useful to save power, e.g. when running on batteries: smart systems will trigger the operation when charging and aborting it if on batteries (or on low
>>> batteries).
>> If the system is running on low power mode just don't trigger the run...
>> Userspace controls it.
> 
> It heavily depends on how long the operation takes, we may have 4 to 32 GiB devices so I think it may take more than just a few minutes to scan the whole device (and additional
> time depending on how many scrub operation are needed).
> You may start the operation when power is not an issue (e.g. while charging) and when it is (e.g. when running on batteries and you need to save any mAh you have ;-) ) it may be
> still running for a while..
> I know that this is some kind of advance feature, but I would like to point this out for comments.

In such a scenario one definitely wants the emerging ioctl() interface.

>>
>>> What happens if the system need to reboot in the middle of scanning?
>> Just reboot, UBI can handle that. Work will be canceled.
> 
> Thanks for the confirm
> 
>>
>>> Probably nothing at all but I think it's worth asking ;-)
>>> Anyway I think it's better if we can, on runlevel 6, shutdown the operation in a clean way
>>>
>>> To ask a little bit more from the current implementation, can it be useful expand sysfs entry with the current status (stopped, running, completed)?
>>> In this way the userspace knows whenever the operation it has triggered, it completed successfully or something interrupt it (e.g. an internal error). I will schedule a new
>>> operation sooner if I have no evidence that the last one completed successfully.. WDYT?
>>> But maybe all of this stuff will be implemented inside a daemon with additional ioctl() (IIRC Richard already is working on this).
>> That's the plan. The interface proposed in that patch series it designed to be a simple replacement for the dd if=/dev/ubiXY of=/dev/null hack.
> 
> dd can be stopped if needed and you may also have the progress status :-)
> 
> for sure you probably don't need all of the above additional stuff but it may be useful anyway
> Maybe it's better to have an initial working implementation and add some more (backward compatible?) features later.

I agree.

BTW: Thanks a lot for your comments, Boris and Andrea!

Thanks,
//richard

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] UBI: Implement bitrot checking (linux-mtd Digest, Vol 145, Issue 24)
@ 2015-04-12 21:42                   ` Richard Weinberger
  0 siblings, 0 replies; 49+ messages in thread
From: Richard Weinberger @ 2015-04-12 21:42 UTC (permalink / raw)
  To: Andrea Scian; +Cc: Boris Brezillon, linux-mtd, linux-kernel, dedekind1

Am 12.04.2015 um 23:33 schrieb Andrea Scian:
>>> I think is always useful to give some additional information in userspace, from both debugging and diagnostic point of view.
>> The question is, why does userspace care?
> 
> Because the userspace trigger it

As written in the previous mail, then userspace knows anyway the state.

> 
>> Other UBI operations are also not visible...
> 
> I don't really know much about UBI internals, but I think not many operation are triggered from userspace (apart from ubiformat and mount ;-) )
> 
> 
>>>> But I can add this feature, no problem.
>>> Thanks ;-)
>>>
>>> May I ask if can be useful to abort the (IMHO quite long running) operation?
>>> I think it can be useful to save power, e.g. when running on batteries: smart systems will trigger the operation when charging and aborting it if on batteries (or on low
>>> batteries).
>> If the system is running on low power mode just don't trigger the run...
>> Userspace controls it.
> 
> It heavily depends on how long the operation takes, we may have 4 to 32 GiB devices so I think it may take more than just a few minutes to scan the whole device (and additional
> time depending on how many scrub operation are needed).
> You may start the operation when power is not an issue (e.g. while charging) and when it is (e.g. when running on batteries and you need to save any mAh you have ;-) ) it may be
> still running for a while..
> I know that this is some kind of advance feature, but I would like to point this out for comments.

In such a scenario one definitely wants the emerging ioctl() interface.

>>
>>> What happens if the system need to reboot in the middle of scanning?
>> Just reboot, UBI can handle that. Work will be canceled.
> 
> Thanks for the confirm
> 
>>
>>> Probably nothing at all but I think it's worth asking ;-)
>>> Anyway I think it's better if we can, on runlevel 6, shutdown the operation in a clean way
>>>
>>> To ask a little bit more from the current implementation, can it be useful expand sysfs entry with the current status (stopped, running, completed)?
>>> In this way the userspace knows whenever the operation it has triggered, it completed successfully or something interrupt it (e.g. an internal error). I will schedule a new
>>> operation sooner if I have no evidence that the last one completed successfully.. WDYT?
>>> But maybe all of this stuff will be implemented inside a daemon with additional ioctl() (IIRC Richard already is working on this).
>> That's the plan. The interface proposed in that patch series it designed to be a simple replacement for the dd if=/dev/ubiXY of=/dev/null hack.
> 
> dd can be stopped if needed and you may also have the progress status :-)
> 
> for sure you probably don't need all of the above additional stuff but it may be useful anyway
> Maybe it's better to have an initial working implementation and add some more (backward compatible?) features later.

I agree.

BTW: Thanks a lot for your comments, Boris and Andrea!

Thanks,
//richard

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] UBI: Implement bitrot checking
  2015-04-12 21:34             ` Richard Weinberger
@ 2015-04-13  3:36               ` nick
  0 siblings, 0 replies; 49+ messages in thread
From: nick @ 2015-04-13  3:36 UTC (permalink / raw)
  To: Richard Weinberger, Boris Brezillon; +Cc: linux-mtd, linux-kernel, dedekind1



On 2015-04-12 05:34 PM, Richard Weinberger wrote:
>>> >> While I agree that consuming memory is not very nice I don't think that 125KiB
>>> >> is a big deal.
>> > 
>> > Hm, a few weeks ago, when I suggested to store information about PEBs in
>> > order to better choose the next block to be checked for bitrot, one of
>> > your argument to reject that approach was the memory consumption of
>> > such a design.
>> > In my case the only thing I needed was the following structure (one
>> > instance per PEB):
>> > 
>> > struct ubi_peb_statistics {
>> > 	struct list_head node;
>> > 	int pnum;
>> > 	int bitflips;
>> > 	int last_full_read; /* in seconds */
>> > 	int last_partial_write; /* in seconds */
>> > };
>> > 
>> > which is 24 bytes large.
>> > 
>> > I definitely understand the memory consumption argument, but that's not
>> > something you can change depending on who's proposing the solution :-).
> Yeah, but this structure remains in memory forever, right?
> In the bitrot case we allocate the memory only temporary.
> 
> That said, my arguments are not perfect nor irreversible,
> it can happen that I mess up or was simply wrong.
> Just beat me down with my own arguments when I deserve it.
> 
> Thanks,
> //richard
Richard and others,
This seems to be like the way we are handling page tables in the kernel. Further more due
to this if this is overall a good idea otherwise, I would recommend looking into the ratio
of storing the structure as a percent of overall memory on various systems to see if that
much memory is using storing the PEBs this way. Generally if it's over 2% of total memory
I would recommend finding a different solution, on the high end page structures take 1.5%
of overall memory on the high end for all systems I am currently aware of. Another area to
compare for doing something like is the driver core or slabs that need to be there for a 
useable system i.e. the one for task_structs and therefore are always in kernel memory.
Just a option for a MTD newcomer,
Nick

^ permalink raw reply	[flat|nested] 49+ messages in thread

* linux-mtd digest emails (was Re: [PATCH 4/4] UBI: Implement bitrot checking)
  2015-04-12 21:42                   ` Richard Weinberger
@ 2015-04-13 17:17                     ` Brian Norris
  -1 siblings, 0 replies; 49+ messages in thread
From: Brian Norris @ 2015-04-13 17:17 UTC (permalink / raw)
  To: linux-mtd
  Cc: Andrea Scian, Boris Brezillon, linux-mtd, linux-kernel,
	dedekind1, Richard Weinberger

Hi everyone,

linux-mtd administrative note:

Please don't start conversations based on mailing list "digest" emails.
It's not helpful for others (e.g., it likely breaks threading). Also,
the mailing list filters these types of messages out, presumably based
on the email subject, so many subscribers probably did not see your
email.

So, a recent thread was under the subject:

 Re: [PATCH 4/4] UBI: Implement bitrot checking (linux-mtd Digest, Vol 145, Issue 24) 

But it should be using:

 Re: [PATCH 4/4] UBI: Implement bitrot checking

I've released a few messages for now, but I can't guarantee that such
messages will get through in the future (and certainly not promptly).

Thanks,
Brian

^ permalink raw reply	[flat|nested] 49+ messages in thread

* linux-mtd digest emails (was Re: [PATCH 4/4] UBI: Implement bitrot checking)
@ 2015-04-13 17:17                     ` Brian Norris
  0 siblings, 0 replies; 49+ messages in thread
From: Brian Norris @ 2015-04-13 17:17 UTC (permalink / raw)
  To: linux-mtd
  Cc: Boris Brezillon, dedekind1, Richard Weinberger, linux-kernel,
	Andrea Scian, linux-mtd

Hi everyone,

linux-mtd administrative note:

Please don't start conversations based on mailing list "digest" emails.
It's not helpful for others (e.g., it likely breaks threading). Also,
the mailing list filters these types of messages out, presumably based
on the email subject, so many subscribers probably did not see your
email.

So, a recent thread was under the subject:

 Re: [PATCH 4/4] UBI: Implement bitrot checking (linux-mtd Digest, Vol 145, Issue 24) 

But it should be using:

 Re: [PATCH 4/4] UBI: Implement bitrot checking

I've released a few messages for now, but I can't guarantee that such
messages will get through in the future (and certainly not promptly).

Thanks,
Brian

^ permalink raw reply	[flat|nested] 49+ messages in thread

end of thread, other threads:[~2015-04-13 17:18 UTC | newest]

Thread overview: 49+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-03-29 12:13 UBI: Bitrot checking Richard Weinberger
2015-03-29 12:13 ` Richard Weinberger
2015-03-29 12:13 ` [PATCH 1/4] UBI: Introduce ubi_schedule_fm_work() Richard Weinberger
2015-03-29 12:13   ` Richard Weinberger
2015-03-29 12:13 ` [PATCH 2/4] UBI: Introduce prepare_erase_work() Richard Weinberger
2015-03-29 12:13   ` Richard Weinberger
2015-03-29 12:13 ` [PATCH 3/4] UBI: Introduce in_pq() Richard Weinberger
2015-03-29 12:13   ` Richard Weinberger
2015-03-29 12:13 ` [PATCH 4/4] UBI: Implement bitrot checking Richard Weinberger
2015-03-29 12:13   ` Richard Weinberger
2015-04-02 17:34   ` Andrea Scian
2015-04-02 17:34     ` Andrea Scian
2015-04-02 17:54     ` Richard Weinberger
2015-04-02 17:54       ` Richard Weinberger
2015-04-02 19:19       ` Andrea Scian
2015-04-02 19:19         ` Andrea Scian
2015-04-08 10:34         ` Richard Weinberger
2015-04-08 10:34           ` Richard Weinberger
2015-04-08 21:02           ` Andrea Scian
2015-04-08 21:02             ` Andrea Scian
2015-04-08 11:48   ` David Oberhollenzer
2015-04-12 14:12   ` Boris Brezillon
2015-04-12 16:09     ` Richard Weinberger
2015-04-12 16:43       ` Boris Brezillon
2015-04-12 16:55         ` Richard Weinberger
2015-04-12 20:42           ` [PATCH 4/4] UBI: Implement bitrot checking (linux-mtd Digest, Vol 145, Issue 24) Andrea Scian
2015-04-12 20:42             ` Andrea Scian
2015-04-12 21:01             ` Richard Weinberger
2015-04-12 21:01               ` Richard Weinberger
2015-04-12 21:30               ` Boris Brezillon
2015-04-12 21:37                 ` Richard Weinberger
2015-04-12 21:33               ` Andrea Scian
2015-04-12 21:33                 ` Andrea Scian
2015-04-12 21:42                 ` Richard Weinberger
2015-04-12 21:42                   ` Richard Weinberger
2015-04-13 17:17                   ` linux-mtd digest emails (was Re: [PATCH 4/4] UBI: Implement bitrot checking) Brian Norris
2015-04-13 17:17                     ` Brian Norris
2015-04-12 15:14   ` [PATCH 4/4] UBI: Implement bitrot checking Boris Brezillon
2015-04-12 16:14     ` Richard Weinberger
2015-04-12 16:31       ` Boris Brezillon
2015-04-12 16:32         ` Richard Weinberger
2015-04-12 17:01   ` Boris Brezillon
2015-04-12 17:09     ` Richard Weinberger
2015-04-12 19:20       ` Boris Brezillon
2015-04-12 19:53         ` Richard Weinberger
2015-04-12 21:24           ` Boris Brezillon
2015-04-12 21:34             ` Richard Weinberger
2015-04-13  3:36               ` nick
2015-04-12 17:36     ` Richard Weinberger
     [not found] <mailman.40253.1428858576.22890.linux-mtd@lists.infradead.org>
     [not found] <mailman.38750.1427638218.22890.linux-mtd@lists.infradead.org>

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.