linux-nvdimm.lists.01.org archive mirror
* [ndctl PATCH v4 0/6] Add ndctl check-namespace
@ 2017-04-07 23:17 Vishal Verma
  2017-04-07 23:17 ` [ndctl PATCH v4 1/6] libndctl: add a ndctl_namespace_is_active helper Vishal Verma
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: Vishal Verma @ 2017-04-07 23:17 UTC (permalink / raw)
  To: linux-nvdimm

Changes in v4:
- Change the bitmap code to use the kernel's GPLv2 routines instead of the
  LGPL ccan/bitmap.
- Upgrade a few messages from 'info' to 'err'

Changes in v3:
- Move the addition of ccan/bitmap to its own patch(es) (Dan)
- Drop the changelog update from the spec (Dan)
- Fix the [verse] section in the documentation text for check-namespace (Dan)
- Unify all namespace_disable paths to perform checking for a mounted
  filesystem (Dan)
- Change the logging to use util/log.h (Dan)
- Use BTT_START_OFFSET for the initial offset, and store it in bttc (Jeff, Dan)
- Fix a number of lines > 80 chars (everything but strings) (Jeff)
- Fix short write error handling, add fsync (Jeff)
- Save system page size in bttc to avoid calling sysconf repeatedly (Jeff)
- In check_log_map(), loop through the entire log even in case of an error,
  and if there was a saved error, fail. (Jeff)
- btt-check.sh: in the post repair test, validate that the data read back
  is the same as what was written (Jeff)
- Stop playing games with pre-adding/subtracting the initial 4K offset (Jeff)
- btt_read_info doesn't need to use 'rc', return directly.

Changes in v2:
- Move checking functionality to a separate file (Dan, Jeff)
- Rename btt-structs.h to check.h (Dan)
- Don't provide a configure option for building the checker, always
  build it in. (Dan, Jeff)
- Fix the Documentation example to also include disable-namespace (Linda)
- Update the description text to note the namespace needs to be disabled
  before checking (Linda)
- Use util/size.h for sizes (Dan)
- Use --repair to do repairs instead of --dry-run to disable repairs (Dan)
- Fix btt_read_info short read error handling (Jeff)
- Simplify the map lookup/write routines (Jeff)
- Differentiate the use of BTT_PG_SIZE, sysconf(_SC_PAGESIZE), and SZ_4K
  (for the fixed start offset) in the different places they're used (Jeff)
- Add the missing msync when copying over info2 (Jeff)
- Add unit tests to test the checker (Jeff)
- Add a missing error case check in do_xaction_namespace for check
- Add a --force option that allows running on an active namespace (Jeff)
- Add a bitmap test for checking all internal blocks are referenced exactly
  once between the map and flog (Jeff)
- Remove unused #defines in check.h
- Add comments to explain what we do with raw_mode (Jeff)
- Add some sanity checking when parsing an arena's metadata (Jeff)
- Refactor some read-verify sequences into a helper that combines the two (Jeff)
- Additional bounds checking on the 'offset' in recover_first_sb attempt 3 (Jeff)
- Add a missing ACTION_DESTROY string in parse_namespace_options (Dan)
- Use uXX, and cpu_to_XX from ccan/endian (Dan)
- Move the fletcher64 routine to util/ as it is shared by builtin-dimm.c (Dan)
- Open the raw block device only once with O_EXCL instead of every time on
  read/write/mmap (Dan)
- Add a new 'inform' routine in util/usage.c, and use it for some non-critical
  messages (Dan)
- Remove namespace_is_offline() from builtin-check.c. Instead, use
  util_namespace_active() from util/json.c
- Add a missing return value check after info block restoration in
  discover_arenas

Vishal Verma (6):
  libndctl: add a ndctl_namespace_is_active helper
  libndctl: add a ndctl_namespace_disable_safe() API
  ndctl: move the fletcher64 routine to util/
  util: add util/bitmap in preparation for the BTT checker
  ndctl: add a BTT check utility
  ndctl, test: Add a unit test for the BTT checker

 Documentation/Makefile.am               |   1 +
 Documentation/ndctl-check-namespace.txt |  64 +++
 Documentation/ndctl.txt                 |   1 +
 Makefile.am                             |   4 +-
 builtin.h                               |   1 +
 contrib/ndctl                           |   3 +
 ndctl/Makefile.am                       |   1 +
 ndctl/builtin-check.c                   | 988 ++++++++++++++++++++++++++++++++
 ndctl/builtin-dimm.c                    |  18 +-
 ndctl/builtin-list.c                    |   2 +-
 ndctl/builtin-xaction-namespace.c       | 112 ++--
 ndctl/check.h                           | 127 ++++
 ndctl/lib/libndctl.c                    |  59 ++
 ndctl/lib/libndctl.sym                  |   2 +
 ndctl/libndctl.h.in                     |   3 +
 ndctl/ndctl.c                           |   1 +
 test/Makefile.am                        |   5 +-
 test/btt-check.sh                       | 172 ++++++
 util/bitmap.c                           | 115 ++++
 util/bitmap.h                           |  32 ++
 util/fletcher.c                         |  23 +
 util/fletcher.h                         |   8 +
 util/json.c                             |  17 +-
 util/json.h                             |   1 -
 util/util.h                             |  12 +
 25 files changed, 1696 insertions(+), 76 deletions(-)
 create mode 100644 Documentation/ndctl-check-namespace.txt
 create mode 100644 ndctl/builtin-check.c
 create mode 100644 ndctl/check.h
 create mode 100755 test/btt-check.sh
 create mode 100644 util/bitmap.c
 create mode 100644 util/bitmap.h
 create mode 100644 util/fletcher.c
 create mode 100644 util/fletcher.h

-- 
2.9.3

_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


* [ndctl PATCH v4 1/6] libndctl: add a ndctl_namespace_is_active helper
  2017-04-07 23:17 [ndctl PATCH v4 0/6] Add ndctl check-namespace Vishal Verma
@ 2017-04-07 23:17 ` Vishal Verma
  2017-04-07 23:17 ` [ndctl PATCH v4 2/6] libndctl: add a ndctl_namespace_disable_safe() API Vishal Verma
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Vishal Verma @ 2017-04-07 23:17 UTC (permalink / raw)
  To: linux-nvdimm

The pattern of checking if a namespace is currently active was repeated
in many places. Convert the scattered usage into a libndctl API. This
gets rid of util_namespace_active from util/json.c which was an awkward
place for this anyway.

Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
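To illustrate the rule the new helper encodes, here is a minimal,
self-contained sketch. It does not use libndctl; struct ns_state and
ns_is_active() are hypothetical stand-ins invented for this note, and
only the boolean logic mirrors ndctl_namespace_is_active() from the
diff below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical model of the state libndctl tracks per namespace.
 * has_* says whether a btt/pfn/dax device claims the namespace,
 * *_enabled says whether that device (or the raw namespace) is enabled.
 */
struct ns_state {
	bool has_btt, btt_enabled;
	bool has_pfn, pfn_enabled;
	bool has_dax, dax_enabled;
	bool raw_enabled;
};

/* Same decision as the helper in this patch, applied to the model. */
static bool ns_is_active(const struct ns_state *s)
{
	assert(s != NULL);

	if (s->has_btt && s->btt_enabled)
		return true;
	if (s->has_pfn && s->pfn_enabled)
		return true;
	if (s->has_dax && s->dax_enabled)
		return true;
	/* The raw namespace only counts when nothing claims it. */
	return !s->has_btt && !s->has_pfn && !s->has_dax && s->raw_enabled;
}
```

Note the last case: in this model a namespace that is claimed by a
disabled btt is reported as inactive even if the raw namespace itself
is enabled, which is what the expression in the patch evaluates to.
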
 ndctl/builtin-list.c   |  2 +-
 ndctl/lib/libndctl.c   | 15 +++++++++++++++
 ndctl/lib/libndctl.sym |  1 +
 ndctl/libndctl.h.in    |  2 ++
 util/json.c            | 17 +----------------
 util/json.h            |  1 -
 6 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/ndctl/builtin-list.c b/ndctl/builtin-list.c
index e8d0070..536d333 100644
--- a/ndctl/builtin-list.c
+++ b/ndctl/builtin-list.c
@@ -84,7 +84,7 @@ static struct json_object *list_namespaces(struct ndctl_region *region,
 		if (param.mode && mode_to_type(param.mode) != mode)
 			continue;
 
-		if (!list.idle && !util_namespace_active(ndns))
+		if (!list.idle && !ndctl_namespace_is_active(ndns))
 			continue;
 
 		if (!jnamespaces) {
diff --git a/ndctl/lib/libndctl.c b/ndctl/lib/libndctl.c
index 090ec0b..ae029c5 100644
--- a/ndctl/lib/libndctl.c
+++ b/ndctl/lib/libndctl.c
@@ -3234,6 +3234,21 @@ static void region_refresh_children(struct ndctl_region *region)
 	daxs_init(region);
 }
 
+NDCTL_EXPORT bool ndctl_namespace_is_active(struct ndctl_namespace *ndns)
+{
+	struct ndctl_btt *btt = ndctl_namespace_get_btt(ndns);
+	struct ndctl_pfn *pfn = ndctl_namespace_get_pfn(ndns);
+	struct ndctl_dax *dax = ndctl_namespace_get_dax(ndns);
+
+	if ((btt && ndctl_btt_is_enabled(btt))
+			|| (pfn && ndctl_pfn_is_enabled(pfn))
+			|| (dax && ndctl_dax_is_enabled(dax))
+			|| (!btt && !pfn && !dax
+				&& ndctl_namespace_is_enabled(ndns)))
+		return true;
+	return false;
+}
+
 /*
  * Return 0 if enabled, < 0 if failed to enable, and > 0 if claimed by
  * another device and that device is enabled.  In the > 0 case a
diff --git a/ndctl/lib/libndctl.sym b/ndctl/lib/libndctl.sym
index ca5165a..705ec4c 100644
--- a/ndctl/lib/libndctl.sym
+++ b/ndctl/lib/libndctl.sym
@@ -172,6 +172,7 @@ global:
 	ndctl_namespace_enable;
 	ndctl_namespace_disable;
 	ndctl_namespace_disable_invalidate;
+	ndctl_namespace_is_active;
 	ndctl_namespace_is_valid;
 	ndctl_namespace_is_configured;
 	ndctl_namespace_delete;
diff --git a/ndctl/libndctl.h.in b/ndctl/libndctl.h.in
index d38aa45..586eb26 100644
--- a/ndctl/libndctl.h.in
+++ b/ndctl/libndctl.h.in
@@ -13,6 +13,7 @@
 #ifndef _LIBNDCTL_H_
 #define _LIBNDCTL_H_
 
+#include <stdbool.h>
 #include <stdarg.h>
 #include <unistd.h>
 
@@ -484,6 +485,7 @@ int ndctl_namespace_is_enabled(struct ndctl_namespace *ndns);
 int ndctl_namespace_enable(struct ndctl_namespace *ndns);
 int ndctl_namespace_disable(struct ndctl_namespace *ndns);
 int ndctl_namespace_disable_invalidate(struct ndctl_namespace *ndns);
+bool ndctl_namespace_is_active(struct ndctl_namespace *ndns);
 int ndctl_namespace_is_valid(struct ndctl_namespace *ndns);
 int ndctl_namespace_is_configured(struct ndctl_namespace *ndns);
 int ndctl_namespace_delete(struct ndctl_namespace *ndns);
diff --git a/util/json.c b/util/json.c
index d6a8d4c..82d8073 100644
--- a/util/json.c
+++ b/util/json.c
@@ -86,21 +86,6 @@ struct json_object *util_dimm_to_json(struct ndctl_dimm *dimm)
 	return NULL;
 }
 
-bool util_namespace_active(struct ndctl_namespace *ndns)
-{
-	struct ndctl_btt *btt = ndctl_namespace_get_btt(ndns);
-	struct ndctl_pfn *pfn = ndctl_namespace_get_pfn(ndns);
-	struct ndctl_dax *dax = ndctl_namespace_get_dax(ndns);
-
-	if ((btt && ndctl_btt_is_enabled(btt))
-			|| (pfn && ndctl_pfn_is_enabled(pfn))
-			|| (dax && ndctl_dax_is_enabled(dax))
-			|| (!btt && !pfn && !dax
-				&& ndctl_namespace_is_enabled(ndns)))
-		return true;
-	return false;
-}
-
 struct json_object *util_daxctl_dev_to_json(struct daxctl_dev *dev)
 {
 	const char *devname = daxctl_dev_get_devname(dev);
@@ -334,7 +319,7 @@ struct json_object *util_namespace_to_json(struct ndctl_namespace *ndns,
 		json_object_object_add(jndns, "blockdev", jobj);
 	}
 
-	if (!util_namespace_active(ndns)) {
+	if (!ndctl_namespace_is_active(ndns)) {
 		jobj = json_object_new_string("disabled");
 		if (!jobj)
 			goto err;
diff --git a/util/json.h b/util/json.h
index a9afb2d..2449c2d 100644
--- a/util/json.h
+++ b/util/json.h
@@ -6,7 +6,6 @@
 
 struct json_object;
 void util_display_json_array(FILE *f_out, struct json_object *jarray, int jflag);
-bool util_namespace_active(struct ndctl_namespace *ndns);
 struct json_object *util_bus_to_json(struct ndctl_bus *bus);
 struct json_object *util_dimm_to_json(struct ndctl_dimm *dimm);
 struct json_object *util_mapping_to_json(struct ndctl_mapping *mapping);
-- 
2.9.3


* [ndctl PATCH v4 2/6] libndctl: add a ndctl_namespace_disable_safe() API
  2017-04-07 23:17 [ndctl PATCH v4 0/6] Add ndctl check-namespace Vishal Verma
  2017-04-07 23:17 ` [ndctl PATCH v4 1/6] libndctl: add a ndctl_namespace_is_active helper Vishal Verma
@ 2017-04-07 23:17 ` Vishal Verma
  2017-04-07 23:18 ` [ndctl PATCH v4 3/6] ndctl: move the fletcher64 routine to util/ Vishal Verma
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Vishal Verma @ 2017-04-07 23:17 UTC (permalink / raw)
  To: linux-nvdimm

Disabling a namespace which has a filesystem mounted on it is unsafe as
filesystems are not prepared for a block device to be yanked from under
them. The destroy_namespace routine checked for an active mount by
performing an O_EXCL open of the backing block device, but many other
callers of ndctl_namespace_disable* could benefit from this checking.

Codify the mounted filesystem check in a new libndctl API -
ndctl_namespace_disable_safe(), and use it for the destroy/disable
namespace ndctl commands as well as the upcoming check-namespace
command.

Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
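The mounted-filesystem check relies on a Linux-specific behaviour of
open(2): O_EXCL without O_CREAT on a block device fails with EBUSY
while the device is in use (for example, mounted). A minimal sketch of
just that probe follows. probe_exclusive_open() is a hypothetical name
used for illustration and is not part of libndctl; unlike the API in
this patch, it only probes and does not disable anything while the
descriptor is held.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to open /dev/<bdev> exclusively.
 * Returns 0 if the exclusive open succeeded, or a negative errno
 * (-EBUSY when the block device is in use).
 */
static int probe_exclusive_open(const char *bdev)
{
	char path[64];
	int fd, n, err;

	assert(bdev != NULL);

	/* Bounded formatting, so an unexpectedly long name cannot overflow. */
	n = snprintf(path, sizeof(path), "/dev/%s", bdev);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -ENAMETOOLONG;

	fd = open(path, O_RDWR | O_EXCL);
	if (fd < 0) {
		/* Save errno before any other call can overwrite it. */
		err = errno;
		return -err;
	}
	close(fd);
	return 0;
}
```

As the comment in the diff says, a probe like this still leaves a
TOCTOU window between the open and whatever is done next; the patch
narrows it by disabling the namespace while the descriptor is open.
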
 ndctl/builtin-xaction-namespace.c | 46 +++++++--------------------------------
 ndctl/lib/libndctl.c              | 44 +++++++++++++++++++++++++++++++++++++
 ndctl/lib/libndctl.sym            |  1 +
 ndctl/libndctl.h.in               |  1 +
 4 files changed, 54 insertions(+), 38 deletions(-)

diff --git a/ndctl/builtin-xaction-namespace.c b/ndctl/builtin-xaction-namespace.c
index 46d651e..d6b0c37 100644
--- a/ndctl/builtin-xaction-namespace.c
+++ b/ndctl/builtin-xaction-namespace.c
@@ -731,10 +731,7 @@ static int namespace_destroy(struct ndctl_region *region,
 	struct ndctl_pfn *pfn = ndctl_namespace_get_pfn(ndns);
 	struct ndctl_dax *dax = ndctl_namespace_get_dax(ndns);
 	struct ndctl_btt *btt = ndctl_namespace_get_btt(ndns);
-	const char *bdev = NULL;
-	bool dax_active = false;
-	char path[50];
-	int fd, rc;
+	int rc;
 
 	if (ndctl_region_get_ro(region)) {
 		error("%s: read-only, re-configuration disabled\n",
@@ -742,42 +739,15 @@ static int namespace_destroy(struct ndctl_region *region,
 		return -ENXIO;
 	}
 
-	if (pfn && ndctl_pfn_is_enabled(pfn))
-		bdev = ndctl_pfn_get_block_device(pfn);
-	else if (dax && ndctl_dax_is_enabled(dax))
-		dax_active = true;
-	else if (btt && ndctl_btt_is_enabled(btt))
-		bdev = ndctl_btt_get_block_device(btt);
-	else if (ndctl_namespace_is_enabled(ndns))
-		bdev = ndctl_namespace_get_block_device(ndns);
-
-	if ((bdev || dax_active) && !force) {
+	if (ndctl_namespace_is_active(ndns) && !force) {
 		error("%s is active, specify --force for re-configuration\n",
 				devname);
 		return -EBUSY;
-	} else if (bdev) {
-		sprintf(path, "/dev/%s", bdev);
-		fd = open(path, O_RDWR|O_EXCL);
-		if (fd >= 0) {
-			/*
-			 * Got it, now block new mounts while we have it
-			 * pinned.
-			 */
-			ndctl_namespace_disable_invalidate(ndns);
-			close(fd);
-		} else {
-			/*
-			 * Yes, TOCTOU hole, but if you're racing namespace
-			 * creation you have other problems, and there's nothing
-			 * stopping the !bdev case from racing to mount an fs or
-			 * re-enabling the namepace.
-			 */
-			error("%s: %s failed exlusive open: %s\n",
-					devname, bdev, strerror(errno));
-			return -errno;
-		}
-	} else if (dax_active)
-		ndctl_namespace_disable_invalidate(ndns);
+	} else {
+		rc = ndctl_namespace_disable_safe(ndns);
+		if (rc)
+			return rc;
+	}
 
 	if (pfn || btt || dax) {
 		rc = zero_info_block(ndns);
@@ -869,7 +839,7 @@ static int do_xaction_namespace(const char *namespace,
 					continue;
 				switch (action) {
 				case ACTION_DISABLE:
-					rc = ndctl_namespace_disable_invalidate(ndns);
+					rc = ndctl_namespace_disable_safe(ndns);
 					break;
 				case ACTION_ENABLE:
 					rc = ndctl_namespace_enable(ndns);
diff --git a/ndctl/lib/libndctl.c b/ndctl/lib/libndctl.c
index ae029c5..a3481b1 100644
--- a/ndctl/lib/libndctl.c
+++ b/ndctl/lib/libndctl.c
@@ -3346,6 +3346,50 @@ NDCTL_EXPORT int ndctl_namespace_disable_invalidate(struct ndctl_namespace *ndns
 	return ndctl_namespace_disable(ndns);
 }
 
+NDCTL_EXPORT int ndctl_namespace_disable_safe(struct ndctl_namespace *ndns)
+{
+	const char *devname = ndctl_namespace_get_devname(ndns);
+	struct ndctl_ctx *ctx = ndctl_namespace_get_ctx(ndns);
+	struct ndctl_pfn *pfn = ndctl_namespace_get_pfn(ndns);
+	struct ndctl_btt *btt = ndctl_namespace_get_btt(ndns);
+	const char *bdev = NULL;
+	char path[50];
+	int fd;
+
+	if (pfn && ndctl_pfn_is_enabled(pfn))
+		bdev = ndctl_pfn_get_block_device(pfn);
+	else if (btt && ndctl_btt_is_enabled(btt))
+		bdev = ndctl_btt_get_block_device(btt);
+	else if (ndctl_namespace_is_enabled(ndns))
+		bdev = ndctl_namespace_get_block_device(ndns);
+
+	if (bdev) {
+		sprintf(path, "/dev/%s", bdev);
+		fd = open(path, O_RDWR|O_EXCL);
+		if (fd >= 0) {
+			/*
+			 * Got it, now block new mounts while we have it
+			 * pinned.
+			 */
+			ndctl_namespace_disable_invalidate(ndns);
+			close(fd);
+		} else {
+			/*
+			 * Yes, TOCTOU hole, but if you're racing namespace
+			 * creation you have other problems, and there's nothing
+			 * stopping the !bdev case from racing to mount an fs or
+			 * re-enabling the namespace.
+			 */
+			dbg(ctx, "%s: %s failed exclusive open: %s\n",
+					devname, bdev, strerror(errno));
+			return -errno;
+		}
+	} else
+		ndctl_namespace_disable_invalidate(ndns);
+
+	return 0;
+}
+
 static int pmem_namespace_is_configured(struct ndctl_namespace *ndns)
 {
 	if (ndctl_namespace_get_size(ndns) < ND_MIN_NAMESPACE_SIZE)
diff --git a/ndctl/lib/libndctl.sym b/ndctl/lib/libndctl.sym
index 705ec4c..9d2a568 100644
--- a/ndctl/lib/libndctl.sym
+++ b/ndctl/lib/libndctl.sym
@@ -172,6 +172,7 @@ global:
 	ndctl_namespace_enable;
 	ndctl_namespace_disable;
 	ndctl_namespace_disable_invalidate;
+	ndctl_namespace_disable_safe;
 	ndctl_namespace_is_active;
 	ndctl_namespace_is_valid;
 	ndctl_namespace_is_configured;
diff --git a/ndctl/libndctl.h.in b/ndctl/libndctl.h.in
index 586eb26..bcc569c 100644
--- a/ndctl/libndctl.h.in
+++ b/ndctl/libndctl.h.in
@@ -485,6 +485,7 @@ int ndctl_namespace_is_enabled(struct ndctl_namespace *ndns);
 int ndctl_namespace_enable(struct ndctl_namespace *ndns);
 int ndctl_namespace_disable(struct ndctl_namespace *ndns);
 int ndctl_namespace_disable_invalidate(struct ndctl_namespace *ndns);
+int ndctl_namespace_disable_safe(struct ndctl_namespace *ndns);
 bool ndctl_namespace_is_active(struct ndctl_namespace *ndns);
 int ndctl_namespace_is_valid(struct ndctl_namespace *ndns);
 int ndctl_namespace_is_configured(struct ndctl_namespace *ndns);
-- 
2.9.3


* [ndctl PATCH v4 3/6] ndctl: move the fletcher64 routine to util/
  2017-04-07 23:17 [ndctl PATCH v4 0/6] Add ndctl check-namespace Vishal Verma
  2017-04-07 23:17 ` [ndctl PATCH v4 1/6] libndctl: add a ndctl_namespace_is_active helper Vishal Verma
  2017-04-07 23:17 ` [ndctl PATCH v4 2/6] libndctl: add a ndctl_namespace_disable_safe() API Vishal Verma
@ 2017-04-07 23:18 ` Vishal Verma
  2017-04-07 23:18 ` [ndctl PATCH v4 4/6] util: add util/bitmap in preparation for the BTT checker Vishal Verma
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Vishal Verma @ 2017-04-07 23:18 UTC (permalink / raw)
  To: linux-nvdimm

In preparation for check-namespace, since it will also use the
fletcher64 routine, move it to util/ so that it can be shared by both
builtin-check.c and builtin-dimm.c

Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
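For readers unfamiliar with the checksum, here is a small standalone
sketch of the same running-sum scheme. It is not the code being moved:
fletcher64_le() and load_le32() are names made up for this note, the
sketch depends only on the C standard library rather than ccan, and it
always interprets the buffer as little-endian (the 'le == true' case
of the routine in the diff).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read one 32-bit little-endian word, independent of host byte order. */
static uint32_t load_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
	       (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/*
 * Sum the buffer as 32-bit words: lo32 is the running sum of words,
 * hi32 is the running sum of lo32. Bytes past the last full word are
 * ignored, as in the original routine.
 */
static uint64_t fletcher64_le(const void *addr, size_t len)
{
	const uint8_t *buf = addr;
	uint32_t lo32 = 0;
	uint64_t hi32 = 0;
	size_t i;

	assert(addr != NULL || len == 0);

	for (i = 0; i < len / 4; i++) {
		lo32 += load_le32(buf + 4 * i);	/* wraps modulo 2^32 */
		hi32 += lo32;
	}
	return hi32 << 32 | lo32;
}
```

For the four words 1, 2, 3, 4 the running sums are lo32 = 1, 3, 6, 10
and hi32 = 1, 4, 10, 20, so the result is (20 << 32) | 10.
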
 Makefile.am          |  3 ++-
 ndctl/builtin-dimm.c | 18 ++----------------
 util/fletcher.c      | 23 +++++++++++++++++++++++
 util/fletcher.h      |  8 ++++++++
 4 files changed, 35 insertions(+), 17 deletions(-)
 create mode 100644 util/fletcher.c
 create mode 100644 util/fletcher.h

diff --git a/Makefile.am b/Makefile.am
index 06cd1b0..5453b2a 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -68,6 +68,7 @@ libutil_a_SOURCES = \
 	util/help.c \
 	util/strbuf.c \
 	util/wrapper.c \
-	util/filter.c
+	util/filter.c \
+	util/fletcher.c
 
 nobase_include_HEADERS = daxctl/libdaxctl.h
diff --git a/ndctl/builtin-dimm.c b/ndctl/builtin-dimm.c
index 637b10b..93f9530 100644
--- a/ndctl/builtin-dimm.c
+++ b/ndctl/builtin-dimm.c
@@ -22,6 +22,7 @@
 #include <util/json.h>
 #include <util/filter.h>
 #include <json-c/json.h>
+#include <util/fletcher.h>
 #include <ndctl/libndctl.h>
 #include <util/parse-options.h>
 #include <ccan/minmax/minmax.h>
@@ -358,7 +359,7 @@ struct nvdimm_data {
 };
 
 /*
- * Note, best_seq(), inc_seq(), fletcher64(), sizeof_namespace_index()
+ * Note, best_seq(), inc_seq(), sizeof_namespace_index()
  * nvdimm_num_label_slots(), label_validate(), and label_write_index()
  * are copied from drivers/nvdimm/label.c in the Linux kernel with the
  * following modifications:
@@ -371,21 +372,6 @@ struct nvdimm_data {
  * 7/ dropped clear_bit_le() usage in label_write_index
  */
 
-static u64 fletcher64(void *addr, size_t len, bool le)
-{
-	u32 *buf = addr;
-	u32 lo32 = 0;
-	u64 hi32 = 0;
-	size_t i;
-
-	for (i = 0; i < len / sizeof(u32); i++) {
-		lo32 += le ? le32_to_cpu((le32) buf[i]) : buf[i];
-		hi32 += lo32;
-	}
-
-	return hi32 << 32 | lo32;
-}
-
 static unsigned inc_seq(unsigned seq)
 {
 	static const unsigned next[] = { 0, 2, 3, 1 };
diff --git a/util/fletcher.c b/util/fletcher.c
new file mode 100644
index 0000000..cee2fc3
--- /dev/null
+++ b/util/fletcher.c
@@ -0,0 +1,23 @@
+#include <stdlib.h>
+#include <stdbool.h>
+#include <util/fletcher.h>
+#include <ccan/endian/endian.h>
+#include <ccan/short_types/short_types.h>
+
+/*
+ * Note, fletcher64() is copied from drivers/nvdimm/label.c in the Linux kernel
+ */
+u64 fletcher64(void *addr, size_t len, bool le)
+{
+	u32 *buf = addr;
+	u32 lo32 = 0;
+	u64 hi32 = 0;
+	size_t i;
+
+	for (i = 0; i < len / sizeof(u32); i++) {
+		lo32 += le ? le32_to_cpu((le32) buf[i]) : buf[i];
+		hi32 += lo32;
+	}
+
+	return hi32 << 32 | lo32;
+}
diff --git a/util/fletcher.h b/util/fletcher.h
new file mode 100644
index 0000000..e3bbce3
--- /dev/null
+++ b/util/fletcher.h
@@ -0,0 +1,8 @@
+#ifndef _NDCTL_FLETCHER_H_
+#define _NDCTL_FLETCHER_H_
+
+#include <ccan/short_types/short_types.h>
+
+u64 fletcher64(void *addr, size_t len, bool le);
+
+#endif /* _NDCTL_FLETCHER_H_ */
-- 
2.9.3


* [ndctl PATCH v4 4/6] util: add util/bitmap in preparation for the BTT checker
  2017-04-07 23:17 [ndctl PATCH v4 0/6] Add ndctl check-namespace Vishal Verma
                   ` (2 preceding siblings ...)
  2017-04-07 23:18 ` [ndctl PATCH v4 3/6] ndctl: move the fletcher64 routine to util/ Vishal Verma
@ 2017-04-07 23:18 ` Vishal Verma
  2017-04-07 23:18 ` [ndctl PATCH v4 5/6] ndctl: add a BTT check utility Vishal Verma
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Vishal Verma @ 2017-04-07 23:18 UTC (permalink / raw)
  To: linux-nvdimm

The BTT checker will include a bitmap test where we mark a bit for
each post-map and free block, and check if the bitmap is full. Add
util/bitmap based on the kernel's bitmap code to facilitate this.

Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
---
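A rough sketch of the kind of test described above, for illustration
only. It does not use the util/bitmap helpers added by this patch;
blocks_referenced_once() is a hypothetical function, and the flat
'refs' array standing in for "every block found in the map and the
free list" is an assumption about how a caller might collect them.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/*
 * Mark one bit per referenced block, rejecting out-of-range and
 * repeated references, then verify that every bit ended up set.
 * Returns true only if each of the 'nblocks' blocks appears once.
 */
static bool blocks_referenced_once(const unsigned int *refs, size_t nrefs,
		unsigned int nblocks)
{
	size_t nwords = (nblocks + WORD_BITS - 1) / WORD_BITS;
	unsigned long *seen;
	bool ok = true;
	size_t i;

	assert(refs != NULL || nrefs == 0);

	seen = calloc(nwords ? nwords : 1, sizeof(*seen));
	if (!seen)
		return false;

	for (i = 0; i < nrefs && ok; i++) {
		unsigned int blk = refs[i];

		if (blk >= nblocks ||
		    (seen[blk / WORD_BITS] & (1UL << (blk % WORD_BITS))))
			ok = false;	/* out of range or referenced twice */
		else
			seen[blk / WORD_BITS] |= 1UL << (blk % WORD_BITS);
	}

	/* "Bitmap full" check: any clear bit is an unreferenced block. */
	for (i = 0; i < nblocks && ok; i++)
		if (!(seen[i / WORD_BITS] & (1UL << (i % WORD_BITS))))
			ok = false;

	free(seen);
	return ok;
}
```

The helpers in the diff below do the marking and the full check a word
at a time with masks; the bit-at-a-time loops here trade speed for
readability.
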
 Makefile.am   |   3 +-
 util/bitmap.c | 115 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 util/bitmap.h |  32 ++++++++++++++++
 util/util.h   |  11 ++++++
 4 files changed, 160 insertions(+), 1 deletion(-)
 create mode 100644 util/bitmap.c
 create mode 100644 util/bitmap.h

diff --git a/Makefile.am b/Makefile.am
index 5453b2a..2b46736 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -69,6 +69,7 @@ libutil_a_SOURCES = \
 	util/strbuf.c \
 	util/wrapper.c \
 	util/filter.c \
-	util/fletcher.c
+	util/fletcher.c\
+	util/bitmap.c
 
 nobase_include_HEADERS = daxctl/libdaxctl.h
diff --git a/util/bitmap.c b/util/bitmap.c
new file mode 100644
index 0000000..31e8c3a
--- /dev/null
+++ b/util/bitmap.c
@@ -0,0 +1,115 @@
+#include <stdlib.h>
+#include <util/size.h>
+#include <util/util.h>
+#include <util/bitmap.h>
+#include <ccan/endian/endian.h>
+#include <ccan/minmax/minmax.h>
+#include <ccan/short_types/short_types.h>
+
+unsigned long *bitmap_alloc(unsigned long nbits)
+{
+	return calloc(BITS_TO_LONGS(nbits), sizeof(unsigned long));
+}
+
+void bitmap_set(unsigned long *map, unsigned int start, int len)
+{
+	unsigned long *p = map + BIT_WORD(start);
+	const unsigned int size = start + len;
+	int bits_to_set = BITS_PER_LONG - (start % BITS_PER_LONG);
+	unsigned long mask_to_set = BITMAP_FIRST_WORD_MASK(start);
+
+	while (len - bits_to_set >= 0) {
+		*p |= mask_to_set;
+		len -= bits_to_set;
+		bits_to_set = BITS_PER_LONG;
+		mask_to_set = ~0UL;
+		p++;
+	}
+	if (len) {
+		mask_to_set &= BITMAP_LAST_WORD_MASK(size);
+		*p |= mask_to_set;
+	}
+}
+
+void bitmap_clear(unsigned long *map, unsigned int start, int len)
+{
+	unsigned long *p = map + BIT_WORD(start);
+	const unsigned int size = start + len;
+	int bits_to_clear = BITS_PER_LONG - (start % BITS_PER_LONG);
+	unsigned long mask_to_clear = BITMAP_FIRST_WORD_MASK(start);
+
+	while (len - bits_to_clear >= 0) {
+		*p &= ~mask_to_clear;
+		len -= bits_to_clear;
+		bits_to_clear = BITS_PER_LONG;
+		mask_to_clear = ~0UL;
+		p++;
+	}
+	if (len) {
+		mask_to_clear &= BITMAP_LAST_WORD_MASK(size);
+		*p &= ~mask_to_clear;
+	}
+}
+
+/**
+ * test_bit - Determine whether a bit is set
+ * @nr: bit number to test
+ * @addr: Address to start counting from
+ */
+int test_bit(unsigned int nr, const volatile unsigned long *addr)
+{
+	return 1UL & (addr[BIT_WORD(nr)] >> (nr & (BITS_PER_LONG-1)));
+}
+
+/*
+ * This is a common helper function for find_next_bit and
+ * find_next_zero_bit.  The difference is the "invert" argument, which
+ * is XORed with each fetched word before searching it for one bits.
+ */
+static unsigned long _find_next_bit(const unsigned long *addr,
+		unsigned long nbits, unsigned long start, unsigned long invert)
+{
+	unsigned long tmp;
+
+	if (!nbits || start >= nbits)
+		return nbits;
+
+	tmp = addr[start / BITS_PER_LONG] ^ invert;
+
+	/* Handle 1st word. */
+	tmp &= BITMAP_FIRST_WORD_MASK(start);
+	start = round_down(start, BITS_PER_LONG);
+
+	while (!tmp) {
+		start += BITS_PER_LONG;
+		if (start >= nbits)
+			return nbits;
+
+		tmp = addr[start / BITS_PER_LONG] ^ invert;
+	}
+
+	return min(start + __builtin_ffsl(tmp) - 1, nbits);
+}
+
+/*
+ * Find the next set bit in a memory region.
+ */
+unsigned long find_next_bit(const unsigned long *addr, unsigned long size,
+			    unsigned long offset)
+{
+	return _find_next_bit(addr, size, offset, 0UL);
+}
+
+unsigned long find_next_zero_bit(const unsigned long *addr, unsigned long size,
+				 unsigned long offset)
+{
+	return _find_next_bit(addr, size, offset, ~0UL);
+}
+
+int bitmap_full(const unsigned long *src, unsigned int nbits)
+{
+	if (small_const_nbits(nbits))
+		return ! (~(*src) & BITMAP_LAST_WORD_MASK(nbits));
+
+	return find_next_zero_bit(src, nbits, 0UL) == nbits;
+}
diff --git a/util/bitmap.h b/util/bitmap.h
new file mode 100644
index 0000000..826ae28
--- /dev/null
+++ b/util/bitmap.h
@@ -0,0 +1,32 @@
+#ifndef _NDCTL_BITMAP_H_
+#define _NDCTL_BITMAP_H_
+
+#include <util/size.h>
+#include <ccan/short_types/short_types.h>
+
+#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
+
+#define BIT(nr)			(1UL << (nr))
+#define BIT_MASK(nr)		(1UL << ((nr) % BITS_PER_LONG))
+#define BIT_WORD(nr)		((nr) / BITS_PER_LONG)
+#define BITS_PER_BYTE		8
+#define BITS_TO_LONGS(nr)	DIV_ROUND_UP(nr, BITS_PER_BYTE * sizeof(long))
+
+#define BITMAP_FIRST_WORD_MASK(start) (~0UL << ((start) & (BITS_PER_LONG - 1)))
+#define BITMAP_LAST_WORD_MASK(nbits) (~0UL >> (-(nbits) & (BITS_PER_LONG - 1)))
+
+#define small_const_nbits(nbits) \
+	(__builtin_constant_p(nbits) && (nbits) <= BITS_PER_LONG)
+
+unsigned long *bitmap_alloc(unsigned long nbits);
+void bitmap_set(unsigned long *map, unsigned int start, int len);
+void bitmap_clear(unsigned long *map, unsigned int start, int len);
+int test_bit(unsigned int nr, const volatile unsigned long *addr);
+unsigned long find_next_bit(const unsigned long *addr, unsigned long size,
+			    unsigned long offset);
+unsigned long find_next_zero_bit(const unsigned long *addr, unsigned long size,
+				 unsigned long offset);
+int bitmap_full(const unsigned long *src, unsigned int nbits);
+
+
+#endif /* _NDCTL_BITMAP_H_ */
diff --git a/util/util.h b/util/util.h
index e0e5f26..620eb1c 100644
--- a/util/util.h
+++ b/util/util.h
@@ -23,6 +23,17 @@
 
 #define alloc_nr(x) (((x)+16)*3/2)
 
+#define __round_mask(x, y) ((__typeof__(x))((y)-1))
+#define round_up(x, y) ((((x)-1) | __round_mask(x, y))+1)
+#define round_down(x, y) ((x) & ~__round_mask(x, y))
+
+#define rounddown(x, y) (				\
+{							\
+	typeof(x) __x = (x);				\
+	__x - (__x % (y));				\
+}							\
+)
+
 /*
  * Realloc the buffer pointed at by variable 'x' so that it can hold
  * at least 'nr' entries; the number of entries currently allocated
-- 
2.9.3


* [ndctl PATCH v4 5/6] ndctl: add a BTT check utility
  2017-04-07 23:17 [ndctl PATCH v4 0/6] Add ndctl check-namespace Vishal Verma
                   ` (3 preceding siblings ...)
  2017-04-07 23:18 ` [ndctl PATCH v4 4/6] util: add util/bitmap in preparation for the BTT checker Vishal Verma
@ 2017-04-07 23:18 ` Vishal Verma
  2017-04-07 23:18 ` [ndctl PATCH v4 6/6] ndctl, test: Add a unit test for the BTT checker Vishal Verma
  2017-04-08  2:14 ` [ndctl PATCH v4 0/6] Add ndctl check-namespace Dan Williams
  6 siblings, 0 replies; 8+ messages in thread
From: Vishal Verma @ 2017-04-07 23:18 UTC (permalink / raw)
  To: linux-nvdimm

Add the check-namespace command to ndctl. This will check the BTT
metadata layout for the given namespace, and if requested, correct any
errors found. Not all metadata corruption is detectable or fixable.

Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Linda Knippers <linda.knippers@hpe.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 Documentation/Makefile.am               |   1 +
 Documentation/ndctl-check-namespace.txt |  64 +++
 Documentation/ndctl.txt                 |   1 +
 builtin.h                               |   1 +
 contrib/ndctl                           |   3 +
 ndctl/Makefile.am                       |   1 +
 ndctl/builtin-check.c                   | 988 ++++++++++++++++++++++++++++++++
 ndctl/builtin-xaction-namespace.c       |  66 ++-
 ndctl/check.h                           | 127 ++++
 ndctl/ndctl.c                           |   1 +
 test/Makefile.am                        |   2 +
 util/util.h                             |   1 +
 12 files changed, 1254 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/ndctl-check-namespace.txt
 create mode 100644 ndctl/builtin-check.c
 create mode 100644 ndctl/check.h

diff --git a/Documentation/Makefile.am b/Documentation/Makefile.am
index 6daeb56..eea11e0 100644
--- a/Documentation/Makefile.am
+++ b/Documentation/Makefile.am
@@ -12,6 +12,7 @@ man1_MANS = \
 	ndctl-disable-namespace.1 \
 	ndctl-create-namespace.1 \
 	ndctl-destroy-namespace.1 \
+	ndctl-check-namespace.1 \
 	ndctl-list.1 \
 	daxctl-list.1
 
diff --git a/Documentation/ndctl-check-namespace.txt b/Documentation/ndctl-check-namespace.txt
new file mode 100644
index 0000000..232f22d
--- /dev/null
+++ b/Documentation/ndctl-check-namespace.txt
@@ -0,0 +1,64 @@
+ndctl-check-namespace(1)
+=========================
+
+NAME
+----
+ndctl-check-namespace - check namespace metadata consistency
+
+SYNOPSIS
+--------
+[verse]
+'ndctl check-namespace' <namespace> [<options>]
+
+DESCRIPTION
+-----------
+
+A namespace in the 'sector' mode will have metadata on it to describe
+the kernel BTT (Block Translation Table). The check-namespace command
+can be used to check the consistency of this metadata, and optionally,
+also attempt to repair it, if it has enough information to do so.
+
+As a precaution, the namespace must be disabled before it is checked.
+With --force, check-namespace disables an active namespace itself
+before checking it.
+
+EXAMPLES
+--------
+
+Check a namespace (only report errors)
+[verse]
+ndctl disable-namespace namespace0.0
+ndctl check-namespace namespace0.0
+
+Check a namespace, and perform repairs if possible
+[verse]
+ndctl disable-namespace namespace0.0
+ndctl check-namespace --repair namespace0.0
+
+OPTIONS
+-------
+-R::
+--repair::
+	Perform metadata repairs if possible. Without this option,
+	the raw namespace contents will not be touched.
+
+-f::
+--force::
+	Unless this option is specified, a check-namespace operation
+	will fail if the namespace is presently active. Specifying
+	--force causes the namespace to be disabled before checking.
+
+-v::
+--verbose::
+	Emit debug messages for the namespace check process.
+
+-r::
+--region=::
+include::xable-region-options.txt[]
+
+SEE ALSO
+--------
+linkndctl:ndctl-disable-namespace[1],
+linkndctl:ndctl-enable-namespace[1],
+http://pmem.io/documents/NVDIMM_Namespace_Spec.pdf[NVDIMM Namespace
+Specification]
diff --git a/Documentation/ndctl.txt b/Documentation/ndctl.txt
index 883a59c..c26cc2f 100644
--- a/Documentation/ndctl.txt
+++ b/Documentation/ndctl.txt
@@ -34,6 +34,7 @@ SEE ALSO
 --------
 linkndctl:ndctl-create-namespace[1],
 linkndctl:ndctl-destroy-namespace[1],
+linkndctl:ndctl-check-namespace[1],
 linkndctl:ndctl-enable-region[1],
 linkndctl:ndctl-disable-region[1],
 linkndctl:ndctl-enable-dimm[1],
diff --git a/builtin.h b/builtin.h
index 9b66196..200bd8e 100644
--- a/builtin.h
+++ b/builtin.h
@@ -13,6 +13,7 @@ int cmd_enable_namespace(int argc, const char **argv, void *ctx);
 int cmd_create_namespace(int argc, const char **argv, void *ctx);
 int cmd_destroy_namespace(int argc, const char **argv, void *ctx);
 int cmd_disable_namespace(int argc, const char **argv, void *ctx);
+int cmd_check_namespace(int argc, const char **argv, void *ctx);
 int cmd_enable_region(int argc, const char **argv, void *ctx);
 int cmd_disable_region(int argc, const char **argv, void *ctx);
 int cmd_enable_dimm(int argc, const char **argv, void *ctx);
diff --git a/contrib/ndctl b/contrib/ndctl
index ea7303c..c97adcc 100755
--- a/contrib/ndctl
+++ b/contrib/ndctl
@@ -194,6 +194,9 @@ __ndctl_comp_non_option_args()
 	destroy-namespace)
 		opts="$(__ndctl_get_ns) all"
 		;;
+	check-namespace)
+		opts="$(__ndctl_get_ns -i) all"
+		;;
 	enable-region)
 		opts="$(__ndctl_get_regions -i) all"
 		;;
diff --git a/ndctl/Makefile.am b/ndctl/Makefile.am
index c563e94..f9158d9 100644
--- a/ndctl/Makefile.am
+++ b/ndctl/Makefile.am
@@ -5,6 +5,7 @@ bin_PROGRAMS = ndctl
 ndctl_SOURCES = ndctl.c \
 		builtin-create-nfit.c \
 		builtin-xaction-namespace.c \
+		builtin-check.c \
 		builtin-xable-region.c \
 		builtin-dimm.c \
 		 ../util/log.c \
diff --git a/ndctl/builtin-check.c b/ndctl/builtin-check.c
new file mode 100644
index 0000000..3b30a98
--- /dev/null
+++ b/ndctl/builtin-check.c
@@ -0,0 +1,988 @@
+/*
+ * Copyright(c) 2015-2016 Intel Corporation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of version 2 of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ */
+#include <stdio.h>
+#include <fcntl.h>
+#include <errno.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <limits.h>
+#include <stdbool.h>
+#include <sys/mman.h>
+#include <util/log.h>
+#include <uuid/uuid.h>
+#include <sys/types.h>
+#include <util/json.h>
+#include <util/size.h>
+#include <util/util.h>
+#include <util/bitmap.h>
+#include <util/fletcher.h>
+#include <ndctl/libndctl.h>
+#include <ccan/endian/endian.h>
+#include <ccan/minmax/minmax.h>
+#include <ccan/array_size/array_size.h>
+#include <ccan/short_types/short_types.h>
+#include "check.h"
+
+#ifdef HAVE_NDCTL_H
+#include <linux/ndctl.h>
+#else
+#include <ndctl.h>
+#endif
+
+static int repair_msg(struct btt_chk *bttc)
+{
+	info(bttc, "  Run with --repair to make the changes\n");
+	return 0;
+}
+
+/**
+ * btt_read_info - read an info block from a given offset
+ * @bttc:	the main btt_chk structure for this btt
+ * @btt_sb:	struct btt_sb where the info block will be copied into
+ * @off:	offset in the raw namespace to read the info block from
+ *
+ * This uses 'pread' to read the info block, and not mmap+loads,
+ * as this is used before the mappings are set up.
+ */
+static int btt_read_info(struct btt_chk *bttc, struct btt_sb *btt_sb, u64 off)
+{
+	ssize_t size;
+
+	size = pread(bttc->fd, btt_sb, sizeof(*btt_sb), off);
+	if (size < 0) {
+		err(bttc, "unable to read info block: %s\n",
+			strerror(errno));
+		return -errno;
+	}
+	if (size != sizeof(*btt_sb)) {
+		err(bttc, "short read of info block: %zd\n", size);
+		return -ENXIO;
+	}
+
+	return 0;
+}
+
+/**
+ * btt_write_info - write an info block to the given offset
+ * @bttc:	the main btt_chk structure for this btt
+ * @btt_sb:	struct btt_sb where the info block will be copied from
+ * @off:	offset in the raw namespace to write the info block to
+ *
+ * This uses 'pwrite' to write the info block, and not mmap+stores,
+ * as this is used before the mappings are set up.
+ */
+static int btt_write_info(struct btt_chk *bttc, struct btt_sb *btt_sb, u64 off)
+{
+	ssize_t size;
+	int rc;
+
+	if (!bttc->opts->repair) {
+		err(bttc, "BTT info block at offset %#lx needs to be restored\n",
+			off);
+		repair_msg(bttc);
+		return -1;
+	}
+	info(bttc, "Restoring BTT info block at offset %#lx\n", off);
+
+	size = pwrite(bttc->fd, btt_sb, sizeof(*btt_sb), off);
+	if (size < 0) {
+		err(bttc, "unable to write the info block: %s\n",
+			strerror(errno));
+		return -errno;
+	}
+	if (size != sizeof(*btt_sb)) {
+		err(bttc, "short write of the info block: %zd\n", size);
+		return -ENXIO;
+	}
+
+	rc = fsync(bttc->fd);
+	if (rc < 0)
+		return -errno;
+	return 0;
+}
+
+/**
+ * btt_copy_to_info2 - restore the backup info block using the main one
+ * @a:		the arena_info handle for this arena
+ *
+ * Called when a corrupted backup info block is detected. Copies the
+ * main info block over to the backup location. This is done using
+ * mmap + stores, and thus needs a msync.
+ */
+static int btt_copy_to_info2(struct arena_info *a)
+{
+	void *ms_align;
+	size_t ms_size;
+
+	if (!a->bttc->opts->repair) {
+		err(a->bttc, "Arena %d: BTT info2 needs to be restored\n",
+			a->num);
+		return repair_msg(a->bttc);
+	}
+	info(a->bttc, "Arena %d: Restoring BTT info2\n", a->num);
+	memcpy(a->map.info2, a->map.info, BTT_INFO_SIZE);
+
+	ms_align = (void *)rounddown((u64)a->map.info2, a->bttc->sys_page_size);
+	ms_size = max(BTT_INFO_SIZE, a->bttc->sys_page_size);
+	if (msync(ms_align, ms_size, MS_SYNC) < 0)
+		return -errno;
+
+	return 0;
+}
+
+/*
+ * btt_map_lookup - given a pre-map Arena Block Address, return the post-map ABA
+ * @a:		the arena_info handle for this arena
+ * @lba:	the logical block address for which we are performing the lookup
+ *
+ * This will correctly account for map entries in the 'initial state'
+ */
+static u32 btt_map_lookup(struct arena_info *a, u32 lba)
+{
+	u32 raw_mapping;
+
+	raw_mapping = le32_to_cpu(a->map.map[lba]);
+	if (raw_mapping & MAP_ENT_NORMAL)
+		return raw_mapping & MAP_LBA_MASK;
+	else
+		return lba;
+}
+
+static int btt_map_write(struct arena_info *a, u32 lba, u32 mapping)
+{
+	void *ms_align;
+
+	if (!a->bttc->opts->repair) {
+		err(a->bttc,
+			"Arena %d: map[%#x] needs to be updated to %#x\n",
+			a->num, lba, mapping);
+		return repair_msg(a->bttc);
+	}
+	info(a->bttc, "Arena %d: Updating map[%#x] to %#x\n", a->num,
+		lba, mapping);
+
+	/*
+	 * We want to set neither of the Z or E flags, and in the actual
+	 * layout, this means setting the bit positions of both to '1' to
+	 * indicate a 'normal' map entry
+	 */
+	mapping |= MAP_ENT_NORMAL;
+	a->map.map[lba] = cpu_to_le32(mapping);
+
+	ms_align = (void *)rounddown((u64)&a->map.map[lba],
+		a->bttc->sys_page_size);
+	if (msync(ms_align, a->bttc->sys_page_size, MS_SYNC) < 0)
+		return -errno;
+
+	return 0;
+}
+
+static void btt_log_read_pair(struct arena_info *a, u32 lane,
+			struct log_entry *ent)
+{
+	memcpy(ent, &a->map.log[lane * 2], 2 * sizeof(struct log_entry));
+}
+
+/*
+ * This function accepts two log entries, and uses the sequence number to
+ * find the 'older' entry. The return value indicates which of the two was
+ * the 'old' entry
+ */
+static int btt_log_get_old(struct log_entry *ent)
+{
+	int old;
+
+	if (ent[0].seq == 0) {
+		ent[0].seq = cpu_to_le32(1);
+		return 0;
+	}
+
+	if (le32_to_cpu(ent[0].seq) < le32_to_cpu(ent[1].seq)) {
+		if (le32_to_cpu(ent[1].seq) - le32_to_cpu(ent[0].seq) == 1)
+			old = 0;
+		else
+			old = 1;
+	} else {
+		if (le32_to_cpu(ent[0].seq) - le32_to_cpu(ent[1].seq) == 1)
+			old = 1;
+		else
+			old = 0;
+	}
+
+	return old;
+}
+
+static int btt_log_read(struct arena_info *a, u32 lane, struct log_entry *ent)
+{
+	int new_ent;
+	struct log_entry log[2];
+
+	if (ent == NULL)
+		return -EINVAL;
+	btt_log_read_pair(a, lane, log);
+	new_ent = 1 - btt_log_get_old(log);
+	memcpy(ent, &log[new_ent], sizeof(struct log_entry));
+	return 0;
+}
+
+static int btt_checksum_verify(struct btt_sb *btt_sb)
+{
+	uint64_t sum;
+	le64 sum_save;
+
+	BUILD_BUG_ON(sizeof(struct btt_sb) != SZ_4K);
+
+	sum_save = btt_sb->checksum;
+	btt_sb->checksum = 0;
+	sum = fletcher64(btt_sb, sizeof(*btt_sb), 1);
+	/* restore the checksum in the buffer, even on a mismatch */
+	btt_sb->checksum = sum_save;
+	if (sum != le64_to_cpu(sum_save))
+		return 1;
+
+	return 0;
+}
+
+/*
+ * Never pass a mmapped buffer to this as it will attempt to write to
+ * the buffer, and we want writes to only happen in a controlled fashion.
+ * In the non --repair case, even if such a buffer is passed, the write will
+ * result in a fault due to the readonly mmap flags.
+ */
+static int btt_info_verify(struct btt_chk *bttc, struct btt_sb *btt_sb)
+{
+	if (memcmp(btt_sb->signature, BTT_SIG, BTT_SIG_LEN) != 0)
+		return -ENXIO;
+
+	if (!uuid_is_null(btt_sb->parent_uuid))
+		if (uuid_compare(bttc->parent_uuid, btt_sb->parent_uuid) != 0)
+			return -ENXIO;
+
+	if (btt_checksum_verify(btt_sb))
+		return -ENXIO;
+
+	return 0;
+}
+
+static int btt_info_read_verify(struct btt_chk *bttc, struct btt_sb *btt_sb,
+	u64 off)
+{
+	int rc;
+
+	rc = btt_read_info(bttc, btt_sb, off);
+	if (rc)
+		return rc;
+	rc = btt_info_verify(bttc, btt_sb);
+	if (rc)
+		return rc;
+	return 0;
+}
+
+enum btt_errcodes {
+	BTT_OK = 0,
+	BTT_LOG_EQL_SEQ = 0x100,
+	BTT_LOG_OOB_SEQ,
+	BTT_LOG_OOB_LBA,
+	BTT_LOG_OOB_OLD,
+	BTT_LOG_OOB_NEW,
+	BTT_LOG_MAP_ERR,
+	BTT_MAP_OOB,
+	BTT_BITMAP_ERROR,
+};
+
+static void btt_xlat_status(struct arena_info *a, int errcode)
+{
+	switch (errcode) {
+	case BTT_OK:
+		break;
+	case BTT_LOG_EQL_SEQ:
+		err(a->bttc,
+			"arena %d: found a pair of log entries with the same sequence number\n",
+			a->num);
+		break;
+	case BTT_LOG_OOB_SEQ:
+		err(a->bttc,
+			"arena %d: found a log entry with an out of bounds sequence number\n",
+			a->num);
+		break;
+	case BTT_LOG_OOB_LBA:
+		err(a->bttc,
+			"arena %d: found a log entry with an out of bounds LBA\n",
+			a->num);
+		break;
+	case BTT_LOG_OOB_OLD:
+		err(a->bttc,
+			"arena %d: found a log entry with an out of bounds 'old' mapping\n",
+			a->num);
+		break;
+	case BTT_LOG_OOB_NEW:
+		err(a->bttc,
+			"arena %d: found a log entry with an out of bounds 'new' mapping\n",
+			a->num);
+		break;
+	case BTT_LOG_MAP_ERR:
+		info(a->bttc,
+			"arena %d: found a log entry that does not match with a map entry\n",
+			a->num);
+		break;
+	case BTT_MAP_OOB:
+		err(a->bttc,
+			"arena %d: found a map entry that is out of bounds\n",
+			a->num);
+		break;
+	case BTT_BITMAP_ERROR:
+		err(a->bttc,
+			"arena %d: bitmap error: internal blocks are incorrectly referenced\n",
+			a->num);
+		break;
+	default:
+		err(a->bttc, "arena %d: unknown error: %d\n",
+			a->num, errcode);
+	}
+}
+
+/* Check that log entries are self consistent */
+static int btt_check_log_entries(struct arena_info *a)
+{
+	unsigned int i;
+	int rc = 0;
+
+	/*
+	 * First, check both 'slots' for sequence numbers being distinct
+	 * and in bounds
+	 */
+	for (i = 0; i < (2 * a->nfree); i += 2) {
+		if (a->map.log[i].seq == a->map.log[i + 1].seq)
+			return BTT_LOG_EQL_SEQ;
+		if (a->map.log[i].seq > 3 || a->map.log[i + 1].seq > 3)
+			return BTT_LOG_OOB_SEQ;
+	}
+	/*
+	 * Next, check only the 'new' slot in each lane for the remaining
+	 * entries being in bounds
+	 */
+	for (i = 0; i < a->nfree; i++) {
+		struct log_entry log;
+
+		rc = btt_log_read(a, i, &log);
+		if (rc)
+			return rc;
+
+		if (log.lba >= a->external_nlba)
+			return BTT_LOG_OOB_LBA;
+		if (log.old_map >= a->internal_nlba)
+			return BTT_LOG_OOB_OLD;
+		if (log.new_map >= a->internal_nlba)
+			return BTT_LOG_OOB_NEW;
+	}
+	return rc;
+}
+
+/* Check that map entries are self consistent */
+static int btt_check_map_entries(struct arena_info *a)
+{
+	unsigned int i;
+	u32 mapping;
+
+	for (i = 0; i < a->external_nlba; i++) {
+		mapping = btt_map_lookup(a, i);
+		if (mapping >= a->internal_nlba)
+			return BTT_MAP_OOB;
+	}
+	return 0;
+}
+
+/* Check that each flog entry has the correct corresponding map entry */
+static int btt_check_log_map(struct arena_info *a)
+{
+	unsigned int i;
+	u32 mapping;
+	int rc = 0, rc_saved = 0;
+
+	for (i = 0; i < a->nfree; i++) {
+		struct log_entry log;
+
+		rc = btt_log_read(a, i, &log);
+		if (rc)
+			return rc;
+		mapping = btt_map_lookup(a, log.lba);
+
+		/*
+		 * Case where the flog was written, but map couldn't be
+		 * updated. The kernel should also be able to detect and
+		 * fix this condition.
+		 */
+		if (log.new_map != mapping && log.old_map == mapping) {
+			info(a->bttc,
+				"arena %d: log[%d].new_map (%#x) doesn't match map[%#x] (%#x)\n",
+				a->num, i, log.new_map, log.lba, mapping);
+			rc = btt_map_write(a, log.lba, log.new_map);
+			if (rc)
+				rc_saved = rc;
+		}
+	}
+	return rc_saved ? BTT_LOG_MAP_ERR : 0;
+}
+
+static int btt_check_info2(struct arena_info *a)
+{
+	/*
+	 * Repair info2 if needed. The main info-block can be trusted
+	 * as it has been verified during arena discovery
+	 */
+	if (memcmp(a->map.info2, a->map.info, BTT_INFO_SIZE))
+		return btt_copy_to_info2(a);
+	return 0;
+}
+
+/*
+ * This will create a bitmap where each bit corresponds to an internal
+ * 'block'. Between the BTT map and flog (representing 'free' blocks),
+ * every single internal block must be represented exactly once. This
+ * check will detect cases where either one or more blocks are never
+ * referenced, or if a block is referenced more than once.
+ */
+static int btt_check_bitmap(struct arena_info *a)
+{
+	unsigned long *bm;
+	u32 i, btt_mapping;
+	int rc = 0;
+
+	bm = bitmap_alloc(a->internal_nlba);
+	if (bm == NULL)
+		return -ENOMEM;
+
+	/* map 'external_nlba' number of map entries */
+	for (i = 0; i < a->external_nlba; i++) {
+		btt_mapping = btt_map_lookup(a, i);
+		if (test_bit(btt_mapping, bm)) {
+			info(a->bttc,
+				"arena %d: internal block %#x is referenced by two map entries\n",
+				a->num, btt_mapping);
+			rc = BTT_BITMAP_ERROR;
+			goto out;
+		}
+		bitmap_set(bm, btt_mapping, 1);
+	}
+
+	/* map 'nfree' number of flog entries */
+	for (i = 0; i < a->nfree; i++) {
+		struct log_entry log;
+
+		rc = btt_log_read(a, i, &log);
+		if (rc)
+			goto out;
+		if (test_bit(log.old_map, bm)) {
+			info(a->bttc,
+				"arena %d: internal block %#x is referenced by two map/log entries\n",
+				a->num, log.old_map);
+			rc = BTT_BITMAP_ERROR;
+			goto out;
+		}
+		bitmap_set(bm, log.old_map, 1);
+	}
+
+	/* check that the bitmap is full */
+	if (!bitmap_full(bm, a->internal_nlba))
+		rc = BTT_BITMAP_ERROR;
+ out:
+	free(bm);
+	return rc;
+}
+
+static int btt_check_arenas(struct btt_chk *bttc)
+{
+	struct arena_info *a = NULL;
+	int i, rc = 0;
+
+	for (i = 0; i < bttc->num_arenas; i++) {
+		info(bttc, "checking arena %d\n", i);
+		a = &bttc->arena[i];
+		rc = btt_check_log_entries(a);
+		if (rc)
+			break;
+		rc = btt_check_map_entries(a);
+		if (rc)
+			break;
+		rc = btt_check_log_map(a);
+		if (rc)
+			break;
+		rc = btt_check_info2(a);
+		if (rc)
+			break;
+		/*
+		 * bitmap test has to be after check_log_map so that any
+		 * pending log updates have been performed. Otherwise the
+		 * bitmap test may result in a false positive
+		 */
+		rc = btt_check_bitmap(a);
+		if (rc)
+			break;
+	}
+
+	btt_xlat_status(a, rc);
+	if (rc)
+		return -ENXIO;
+	return 0;
+}
+
+/*
+ * This copies over information from the info block to the arena_info struct.
+ * The main difference is that all the offsets (infooff, mapoff etc) were
+ * relative to the arena in the info block, but in arena_info, we use
+ * arena_off to make these offsets absolute, i.e. relative to the start of
+ * the raw namespace.
+ */
+static int btt_parse_meta(struct arena_info *arena, struct btt_sb *btt_sb,
+				u64 arena_off)
+{
+	arena->internal_nlba = le32_to_cpu(btt_sb->internal_nlba);
+	arena->internal_lbasize = le32_to_cpu(btt_sb->internal_lbasize);
+	arena->external_nlba = le32_to_cpu(btt_sb->external_nlba);
+	arena->external_lbasize = le32_to_cpu(btt_sb->external_lbasize);
+	arena->nfree = le32_to_cpu(btt_sb->nfree);
+
+	if (arena->internal_nlba - arena->external_nlba != arena->nfree)
+		return -ENXIO;
+	if (arena->internal_lbasize != arena->external_lbasize)
+		return -ENXIO;
+
+	arena->version_major = le16_to_cpu(btt_sb->version_major);
+	arena->version_minor = le16_to_cpu(btt_sb->version_minor);
+
+	arena->nextoff = (btt_sb->nextoff == 0) ? 0 : (arena_off +
+			le64_to_cpu(btt_sb->nextoff));
+	arena->infooff = arena_off;
+	arena->dataoff = arena_off + le64_to_cpu(btt_sb->dataoff);
+	arena->mapoff = arena_off + le64_to_cpu(btt_sb->mapoff);
+	arena->logoff = arena_off + le64_to_cpu(btt_sb->logoff);
+	arena->info2off = arena_off + le64_to_cpu(btt_sb->info2off);
+
+	arena->size = (le64_to_cpu(btt_sb->nextoff) > 0)
+		? (le64_to_cpu(btt_sb->nextoff))
+		: (arena->info2off - arena->infooff + BTT_INFO_SIZE);
+
+	arena->flags = le32_to_cpu(btt_sb->flags);
+	if (arena->flags & IB_FLAG_ERROR_MASK) {
+		err(arena->bttc, "Info block error flag is set, aborting\n");
+		return -ENXIO;
+	}
+	return 0;
+}
+
+static int btt_discover_arenas(struct btt_chk *bttc)
+{
+	int ret = 0;
+	struct arena_info *arena;
+	struct btt_sb *btt_sb;
+	size_t remaining = bttc->rawsize;
+	size_t cur_off = bttc->start_off;
+	u64 cur_nlba = 0;
+	int i = 0;
+
+	btt_sb = calloc(1, sizeof(*btt_sb));
+	if (!btt_sb)
+		return -ENOMEM;
+
+	while (remaining) {
+		/* Alloc memory for arena */
+		arena = realloc(bttc->arena, (i + 1) * sizeof(*arena));
+		if (!arena) {
+			ret = -ENOMEM;
+			goto out;
+		} else {
+			bttc->arena = arena;
+			arena = &bttc->arena[i];
+			/* zero the new memory */
+			memset(arena, 0, sizeof(*arena));
+		}
+
+		arena->infooff = cur_off;
+		ret = btt_read_info(bttc, btt_sb, cur_off);
+		if (ret)
+			goto out;
+
+		if (btt_info_verify(bttc, btt_sb) != 0) {
+			u64 offset;
+
+			/* Try to find the backup info block */
+			if (remaining <= ARENA_MAX_SIZE)
+				offset = rounddown(bttc->rawsize, SZ_4K) -
+					BTT_INFO_SIZE;
+			else
+				offset = cur_off + ARENA_MAX_SIZE -
+					BTT_INFO_SIZE;
+
+			info(bttc,
+				"Arena %d: Attempting to recover info-block using info2\n", i);
+			ret = btt_read_info(bttc, btt_sb, offset);
+			if (ret) {
+				err(bttc, "Unable to read backup info block (offset %#lx)\n",
+					offset);
+				goto out;
+			}
+			ret = btt_info_verify(bttc, btt_sb);
+			if (ret) {
+				err(bttc, "Backup info block (offset %#lx) verification failed\n",
+					offset);
+				goto out;
+			}
+			ret = btt_write_info(bttc, btt_sb, cur_off);
+			if (ret) {
+				err(bttc, "Restoration of the info block failed: %d\n",
+					ret);
+				goto out;
+			}
+		}
+
+		arena->num = i;
+		arena->bttc = bttc;
+		arena->external_lba_start = cur_nlba;
+		ret = btt_parse_meta(arena, btt_sb, cur_off);
+		if (ret) {
+			err(bttc, "Problem parsing arena[%d] metadata\n", i);
+			goto out;
+		}
+		remaining -= arena->size;
+		cur_off += arena->size;
+		cur_nlba += arena->external_nlba;
+		i++;
+
+		if (arena->nextoff == 0)
+			break;
+	}
+	bttc->num_arenas = i;
+	bttc->nlba = cur_nlba;
+	info(bttc, "found %d BTT arena%s\n", bttc->num_arenas,
+		(bttc->num_arenas > 1) ? "s" : "");
+	free(btt_sb);
+	return ret;
+
+ out:
+	free(bttc->arena);
+	free(btt_sb);
+	return ret;
+}
+
+static int btt_create_mappings(struct btt_chk *bttc)
+{
+	struct arena_info *a;
+	int mmap_flags;
+	int i;
+
+	if (!bttc->opts->repair)
+		mmap_flags = PROT_READ;
+	else
+		mmap_flags = PROT_READ|PROT_WRITE;
+
+	for (i = 0; i < bttc->num_arenas; i++) {
+		a = &bttc->arena[i];
+		a->map.info_len = BTT_INFO_SIZE;
+		a->map.info = mmap(NULL, a->map.info_len, mmap_flags,
+			MAP_SHARED, bttc->fd, a->infooff);
+		if (a->map.info == MAP_FAILED) {
+			err(bttc, "mmap arena[%d].info [sz = %#lx, off = %#lx] failed: %d\n",
+				i, a->map.info_len, a->infooff, errno);
+			return -errno;
+		}
+
+		a->map.data_len = a->mapoff - a->dataoff;
+		a->map.data = mmap(NULL, a->map.data_len, mmap_flags,
+			MAP_SHARED, bttc->fd, a->dataoff);
+		if (a->map.data == MAP_FAILED) {
+			err(bttc, "mmap arena[%d].data [sz = %#lx, off = %#lx] failed: %d\n",
+				i, a->map.data_len, a->dataoff, errno);
+			return -errno;
+		}
+
+		a->map.map_len = a->logoff - a->mapoff;
+		a->map.map = mmap(NULL, a->map.map_len, mmap_flags,
+			MAP_SHARED, bttc->fd, a->mapoff);
+		if (a->map.map == MAP_FAILED) {
+			err(bttc, "mmap arena[%d].map [sz = %#lx, off = %#lx] failed: %d\n",
+				i, a->map.map_len, a->mapoff, errno);
+			return -errno;
+		}
+
+		a->map.log_len = a->info2off - a->logoff;
+		a->map.log = mmap(NULL, a->map.log_len, mmap_flags,
+			MAP_SHARED, bttc->fd, a->logoff);
+		if (a->map.log == MAP_FAILED) {
+			err(bttc, "mmap arena[%d].log [sz = %#lx, off = %#lx] failed: %d\n",
+				i, a->map.log_len, a->logoff, errno);
+			return -errno;
+		}
+
+		a->map.info2_len = BTT_INFO_SIZE;
+		a->map.info2 = mmap(NULL, a->map.info2_len, mmap_flags,
+			MAP_SHARED, bttc->fd, a->info2off);
+		if (a->map.info2 == MAP_FAILED) {
+			err(bttc, "mmap arena[%d].info2 [sz = %#lx, off = %#lx] failed: %d\n",
+				i, a->map.info2_len, a->info2off, errno);
+			return -errno;
+		}
+	}
+
+	return 0;
+}
+
+static void btt_remove_mappings(struct btt_chk *bttc)
+{
+	struct arena_info *a;
+	int i;
+
+	for (i = 0; i < bttc->num_arenas; i++) {
+		a = &bttc->arena[i];
+		if (a->map.info)
+			munmap(a->map.info, a->map.info_len);
+		if (a->map.data)
+			munmap(a->map.data, a->map.data_len);
+		if (a->map.map)
+			munmap(a->map.map, a->map.map_len);
+		if (a->map.log)
+			munmap(a->map.log, a->map.log_len);
+		if (a->map.info2)
+			munmap(a->map.info2, a->map.info2_len);
+	}
+}
+
+static int btt_recover_first_sb(struct btt_chk *bttc)
+{
+	int rc, est_arenas = 0;
+	u64 offset, remaining;
+	struct btt_sb *btt_sb;
+
+	/* Estimate the number of arenas */
+	remaining = bttc->rawsize - bttc->start_off;
+	while (remaining) {
+		if (remaining < ARENA_MIN_SIZE && est_arenas == 0)
+			return -EINVAL;
+		if (remaining > ARENA_MAX_SIZE) {
+			/* full-size arena */
+			remaining -= ARENA_MAX_SIZE;
+			est_arenas++;
+			continue;
+		}
+		if (remaining < ARENA_MIN_SIZE) {
+			/* 'remaining' was too small for another arena */
+			break;
+		} else {
+			/* last, short arena */
+			remaining = 0;
+			est_arenas++;
+			break;
+		}
+	}
+	info(bttc, "estimated arenas: %d, remaining bytes: %#lx\n",
+		est_arenas, remaining);
+
+	btt_sb = malloc(2 * sizeof(*btt_sb));
+	if (btt_sb == NULL)
+		return -ENOMEM;
+	/* Read the original first info block into btt_sb[0] */
+	rc = btt_read_info(bttc, &btt_sb[0], bttc->start_off);
+	if (rc)
+		goto out;
+
+	/* Attempt 1: try recovery from expected end of the first arena */
+	if (est_arenas == 1)
+		offset = rounddown(bttc->rawsize - remaining, SZ_4K) -
+			BTT_INFO_SIZE;
+	else
+		offset = ARENA_MAX_SIZE - BTT_INFO_SIZE + bttc->start_off;
+
+	info(bttc, "Attempting to recover info-block from end-of-arena offset %#lx\n",
+		offset);
+	rc = btt_info_read_verify(bttc, &btt_sb[1], offset);
+	if (rc == 0) {
+		rc = btt_write_info(bttc, &btt_sb[1], bttc->start_off);
+		goto out;
+	}
+
+	/*
+	 * Attempt 2: From the very end of 'rawsize', try to copy the fields
+	 * that are constant in every arena (only valid when multiple arenas
+	 * are present)
+	 */
+	if (est_arenas > 1) {
+		offset = rounddown(bttc->rawsize - remaining, SZ_4K) -
+			BTT_INFO_SIZE;
+		info(bttc, "Attempting to recover info-block from end offset %#lx\n",
+			offset);
+		rc = btt_info_read_verify(bttc, &btt_sb[1], offset);
+		if (rc)
+			goto out;
+		/* copy over the arena0 specific fields from btt_sb[0] */
+		btt_sb[1].flags = btt_sb[0].flags;
+		btt_sb[1].external_nlba = btt_sb[0].external_nlba;
+		btt_sb[1].internal_nlba = btt_sb[0].internal_nlba;
+		btt_sb[1].nextoff = btt_sb[0].nextoff;
+		btt_sb[1].dataoff = btt_sb[0].dataoff;
+		btt_sb[1].mapoff = btt_sb[0].mapoff;
+		btt_sb[1].logoff = btt_sb[0].logoff;
+		btt_sb[1].info2off = btt_sb[0].info2off;
+		btt_sb[1].checksum = btt_sb[0].checksum;
+		rc = btt_info_verify(bttc, &btt_sb[1]);
+		if (rc == 0) {
+			rc = btt_write_info(bttc, &btt_sb[1], bttc->start_off);
+			goto out;
+		}
+	}
+
+	/*
+	 * Attempt 3: use info2off as-is, and check if we find a valid info
+	 * block at that location.
+	 */
+	offset = le64_to_cpu(btt_sb[0].info2off);
+	if (offset > min(bttc->rawsize - BTT_INFO_SIZE,
+			ARENA_MAX_SIZE - BTT_INFO_SIZE + bttc->start_off)) {
+		rc = -ENXIO;
+		goto out;
+	}
+	if (offset) {
+		info(bttc, "Attempting to recover info-block from info2 offset %#lx\n",
+			offset);
+		rc = btt_info_read_verify(bttc, &btt_sb[1],
+			offset + bttc->start_off);
+		if (rc == 0) {
+			rc = btt_write_info(bttc, &btt_sb[1], bttc->start_off);
+			goto out;
+		}
+	} else
+		rc = -ENXIO;
+ out:
+	free(btt_sb);
+	return rc;
+}
+
+int namespace_check(struct ndctl_namespace *ndns, struct check_opts *opts)
+{
+	const char *devname = ndctl_namespace_get_devname(ndns);
+	int raw_mode, rc, disabled_flag = 0, open_flags;
+	struct btt_sb *btt_sb;
+	struct btt_chk *bttc;
+	char path[50];
+
+	bttc = calloc(1, sizeof(*bttc));
+	if (bttc == NULL)
+		return -ENOMEM;
+
+	log_init(&bttc->ctx, devname, "NDCTL_CHECK_NAMESPACE");
+	if (opts->verbose)
+		bttc->ctx.log_priority = LOG_DEBUG;
+
+	bttc->opts = opts;
+	bttc->start_off = BTT_START_OFFSET;
+	bttc->sys_page_size = sysconf(_SC_PAGESIZE);
+	bttc->rawsize = ndctl_namespace_get_size(ndns);
+	ndctl_namespace_get_uuid(ndns, bttc->parent_uuid);
+
+	info(bttc, "checking %s\n", devname);
+	if (ndctl_namespace_is_active(ndns)) {
+		if (opts->force) {
+			rc = ndctl_namespace_disable_safe(ndns);
+			if (rc)
+				goto out_bttc;
+			disabled_flag = 1;
+		} else {
+			err(bttc, "%s: check aborted, namespace online\n",
+				devname);
+			rc = -EBUSY;
+			goto out_bttc;
+		}
+	}
+
+	/* In typical usage, the current raw_mode should be false. */
+	raw_mode = ndctl_namespace_get_raw_mode(ndns);
+
+	/*
+	 * Putting the namespace into raw mode will allow us to access
+	 * the btt metadata.
+	 */
+	rc = ndctl_namespace_set_raw_mode(ndns, 1);
+	if (rc < 0) {
+		err(bttc, "%s: failed to set the raw mode flag: %d\n",
+			devname, rc);
+		goto out_ns;
+	}
+	/*
+	 * Now enable the namespace.  This will result in a pmem device
+	 * node showing up in /dev that is in raw mode.
+	 */
+	rc = ndctl_namespace_enable(ndns);
+	if (rc != 0) {
+		err(bttc, "%s: failed to enable in raw mode: %d\n",
+			devname, rc);
+		goto out_ns;
+	}
+
+	sprintf(path, "/dev/%s", ndctl_namespace_get_block_device(ndns));
+	bttc->path = path;
+
+	btt_sb = malloc(sizeof(*btt_sb));
+	if (btt_sb == NULL) {
+		rc = -ENOMEM;
+		goto out_ns;
+	}
+
+	if (!bttc->opts->repair)
+		open_flags = O_RDONLY|O_EXCL;
+	else
+		open_flags = O_RDWR|O_EXCL;
+
+	bttc->fd = open(bttc->path, open_flags);
+	if (bttc->fd < 0) {
+		err(bttc, "unable to open %s: %s\n",
+			bttc->path, strerror(errno));
+		rc = -errno;
+		goto out_sb;
+	}
+
+	rc = btt_info_read_verify(bttc, btt_sb, bttc->start_off);
+	if (rc) {
+		rc = btt_recover_first_sb(bttc);
+		if (rc) {
+			err(bttc, "Unable to recover any BTT info blocks\n");
+			goto out_close;
+		}
+		rc = btt_info_read_verify(bttc, btt_sb, bttc->start_off);
+		if (rc)
+			goto out_close;
+	}
+	rc = btt_discover_arenas(bttc);
+	if (rc)
+		goto out_close;
+
+	rc = btt_create_mappings(bttc);
+	if (rc)
+		goto out_close;
+
+	rc = btt_check_arenas(bttc);
+
+	btt_remove_mappings(bttc);
+ out_close:
+	close(bttc->fd);
+ out_sb:
+	free(btt_sb);
+ out_ns:
+	ndctl_namespace_set_raw_mode(ndns, raw_mode);
+	ndctl_namespace_disable_invalidate(ndns);
+	if (disabled_flag)
+		if (ndctl_namespace_enable(ndns) < 0)
+			err(bttc, "%s: failed to re-enable namespace\n",
+				devname);
+ out_bttc:
+	free(bttc);
+	return rc;
+}
diff --git a/ndctl/builtin-xaction-namespace.c b/ndctl/builtin-xaction-namespace.c
index d6b0c37..8c62d14 100644
--- a/ndctl/builtin-xaction-namespace.c
+++ b/ndctl/builtin-xaction-namespace.c
@@ -28,6 +28,7 @@
 #include <util/parse-options.h>
 #include <ccan/minmax/minmax.h>
 #include <ccan/array_size/array_size.h>
+#include "check.h"
 
 #ifdef HAVE_NDCTL_H
 #include <linux/ndctl.h>
@@ -37,6 +38,7 @@
 
 static bool verbose;
 static bool force;
+static bool repair;
 static struct parameters {
 	bool do_scan;
 	bool mode_default;
@@ -112,6 +114,10 @@ OPT_STRING('a', "align", &param.align, "align", \
 	"specify the namespace alignment in bytes (default: 2M)"), \
 OPT_BOOLEAN('f', "force", &force, "reconfigure namespace even if currently active")
 
+#define CHECK_OPTIONS() \
+OPT_BOOLEAN('R', "repair", &repair, "perform metadata repairs"), \
+OPT_BOOLEAN('f', "force", &force, "check namespace even if currently active")
+
 static const struct option base_options[] = {
 	BASE_OPTIONS(),
 	OPT_END(),
@@ -130,11 +136,18 @@ static const struct option create_options[] = {
 	OPT_END(),
 };
 
+static const struct option check_options[] = {
+	BASE_OPTIONS(),
+	CHECK_OPTIONS(),
+	OPT_END(),
+};
+
 enum namespace_action {
 	ACTION_ENABLE,
 	ACTION_DISABLE,
 	ACTION_CREATE,
 	ACTION_DESTROY,
+	ACTION_CHECK,
 };
 
 static int set_defaults(enum namespace_action mode)
@@ -268,8 +281,26 @@ static const char *parse_namespace_options(int argc, const char **argv,
 	rc = set_defaults(mode);
 
 	if (argc == 0 && mode != ACTION_CREATE) {
-		error("specify a namespace to %s, or \"all\"\n",
-				mode == ACTION_ENABLE ? "enable" : "disable");
+		const char *action_string;
+
+		switch (mode) {
+			case ACTION_ENABLE:
+				action_string = "enable";
+				break;
+			case ACTION_DISABLE:
+				action_string = "disable";
+				break;
+			case ACTION_DESTROY:
+				action_string = "destroy";
+				break;
+			case ACTION_CHECK:
+				action_string = "check";
+				break;
+			default:
+				action_string = "<>";
+				break;
+		}
+		error("specify a namespace to %s, or \"all\"\n", action_string);
 		rc = -EINVAL;
 	}
 	for (i = mode == ACTION_CREATE ? 0 : 1; i < argc; i++) {
@@ -793,6 +824,7 @@ static int do_xaction_namespace(const char *namespace,
 	struct ndctl_namespace *ndns, *_n;
 	int rc = -ENXIO, success = 0;
 	struct ndctl_region *region;
+	struct check_opts opts;
 	const char *ndns_name;
 	struct ndctl_bus *bus;
 
@@ -847,6 +879,14 @@ static int do_xaction_namespace(const char *namespace,
 				case ACTION_DESTROY:
 					rc = namespace_destroy(region, ndns);
 					break;
+				case ACTION_CHECK:
+					opts.verbose = verbose;
+					opts.repair = repair;
+					opts.force = force;
+					rc = namespace_check(ndns, &opts);
+					if (rc < 0)
+						return rc;
+					break;
 				case ACTION_CREATE:
 					rc = namespace_reconfig(region, ndns);
 					if (rc < 0)
@@ -965,3 +1005,25 @@ int cmd_destroy_namespace(int argc , const char **argv, void *ctx)
 		return 0;
 	}
 }
+
+int cmd_check_namespace(int argc , const char **argv, void *ctx)
+{
+	char *xable_usage = "ndctl check-namespace <namespace> [<options>]";
+	const char *namespace = parse_namespace_options(argc, argv,
+			ACTION_CHECK, check_options, xable_usage);
+	int checked;
+
+	checked = do_xaction_namespace(namespace, ACTION_CHECK, ctx);
+	if (checked < 0) {
+		fprintf(stderr, "error checking namespaces: %s\n",
+				strerror(-checked));
+		return checked;
+	} else if (checked == 0) {
+		fprintf(stderr, "checked 0 namespaces\n");
+		return 0;
+	} else {
+		fprintf(stderr, "checked %d namespace%s\n", checked,
+				checked > 1 ? "s" : "");
+		return 0;
+	}
+}
diff --git a/ndctl/check.h b/ndctl/check.h
new file mode 100644
index 0000000..f0200db
--- /dev/null
+++ b/ndctl/check.h
@@ -0,0 +1,127 @@
+/*
+ * Copyright (c) 2016, Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ */
+
+#ifndef _CHECK_H
+#define _CHECK_H
+
+#include <util/log.h>
+#include <ccan/endian/endian.h>
+#include <ccan/short_types/short_types.h>
+
+#define BTT_SIG_LEN 16
+#define BTT_SIG "BTT_ARENA_INFO\0"
+#define MAP_TRIM_SHIFT 31
+#define MAP_ERR_SHIFT 30
+#define MAP_LBA_MASK (~((1U << MAP_TRIM_SHIFT) | (1U << MAP_ERR_SHIFT)))
+#define MAP_ENT_NORMAL 0xC0000000
+#define ARENA_MIN_SIZE (1UL << 24)	/* 16 MB */
+#define ARENA_MAX_SIZE (1ULL << 39)	/* 512 GB */
+#define BTT_INFO_SIZE 4096
+#define BTT_START_OFFSET 4096
+#define IB_FLAG_ERROR_MASK 0x00000001
+
+struct log_entry {
+	le32 lba;
+	le32 old_map;
+	le32 new_map;
+	le32 seq;
+	le64 padding[2];
+};
+
+struct btt_sb {
+	u8 signature[BTT_SIG_LEN];
+	u8 uuid[16];
+	u8 parent_uuid[16];
+	le32 flags;
+	le16 version_major;
+	le16 version_minor;
+	le32 external_lbasize;
+	le32 external_nlba;
+	le32 internal_lbasize;
+	le32 internal_nlba;
+	le32 nfree;
+	le32 infosize;
+	le64 nextoff;
+	le64 dataoff;
+	le64 mapoff;
+	le64 logoff;
+	le64 info2off;
+	u8 padding[3968];
+	le64 checksum;
+};
+
+struct free_entry {
+	u32 block;
+	u8 sub;
+	u8 seq;
+};
+
+struct arena_map {
+	struct btt_sb *info;
+	size_t info_len;
+	void *data;
+	size_t data_len;
+	u32 *map;
+	size_t map_len;
+	struct log_entry *log;
+	size_t log_len;
+	struct btt_sb *info2;
+	size_t info2_len;
+};
+
+struct check_opts {
+	bool verbose;
+	bool force;
+	bool repair;
+};
+
+struct btt_chk {
+	char *path;
+	int fd;
+	uuid_t parent_uuid;
+	unsigned long long rawsize;
+	unsigned long long nlba;
+	int start_off;
+	int num_arenas;
+	long sys_page_size;
+	struct arena_info *arena;
+	struct check_opts *opts;
+	struct log_ctx ctx;
+};
+
+
+struct arena_info {
+	struct arena_map map;
+	u64 size;	/* Total bytes for this arena */
+	u64 external_lba_start;
+	u32 internal_nlba;
+	u32 internal_lbasize;
+	u32 external_nlba;
+	u32 external_lbasize;
+	u32 nfree;
+	u16 version_major;
+	u16 version_minor;
+	u64 nextoff;
+	u64 infooff;
+	u64 dataoff;
+	u64 mapoff;
+	u64 logoff;
+	u64 info2off;
+	u32 flags;
+	int num;
+	struct btt_chk *bttc;
+};
+
+int namespace_check(struct ndctl_namespace *ndns, struct check_opts *opts);
+
+#endif
diff --git a/ndctl/ndctl.c b/ndctl/ndctl.c
index 80a0491..0678a9a 100644
--- a/ndctl/ndctl.c
+++ b/ndctl/ndctl.c
@@ -57,6 +57,7 @@ static struct cmd_struct commands[] = {
 	{ "disable-namespace", cmd_disable_namespace },
 	{ "create-namespace", cmd_create_namespace },
 	{ "destroy-namespace", cmd_destroy_namespace },
+	{ "check-namespace", cmd_check_namespace },
 	{ "enable-region", cmd_enable_region },
 	{ "disable-region", cmd_disable_region },
 	{ "enable-dimm", cmd_enable_dimm },
diff --git a/test/Makefile.am b/test/Makefile.am
index cd2226f..24afcea 100644
--- a/test/Makefile.am
+++ b/test/Makefile.am
@@ -84,6 +84,7 @@ device_dax_SOURCES = \
 		dax-pmd.c \
 		$(testcore) \
 		../ndctl/builtin-xaction-namespace.c \
+		../ndctl/builtin-check.c \
 		../util/json.c
 device_dax_LDADD = \
 		$(LIBNDCTL_LIB) \
@@ -95,6 +96,7 @@ multi_pmem_SOURCES = \
 		multi-pmem.c \
 		$(testcore) \
 		../ndctl/builtin-xaction-namespace.c \
+		../ndctl/builtin-check.c \
 		../util/json.c
 multi_pmem_LDADD = \
 		$(LIBNDCTL_LIB) \
diff --git a/util/util.h b/util/util.h
index 620eb1c..04103ce 100644
--- a/util/util.h
+++ b/util/util.h
@@ -55,6 +55,7 @@
 #define zfree(ptr) ({ free(*ptr); *ptr = NULL; })
 
 #define BUILD_BUG_ON_ZERO(e) (sizeof(struct { int:-!!(e); }))
+#define BUILD_BUG_ON(condition) ((void)sizeof(char[1 - 2*!!(condition)]))
 
 static inline const char *skip_prefix(const char *str, const char *prefix)
 {
-- 
2.9.3
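
For readers following the check.h hunk above, here is a minimal sketch of
two properties the header implies: the on-media structures have fixed
sizes (struct btt_sb fills exactly BTT_INFO_SIZE bytes, struct log_entry
is 32 bytes), and the top two bits of a map entry are flags while the rest
is the block number. The *_sketch struct names and the map_entry_*()
helpers are hypothetical and are not part of the patch; the sketch also
swaps in <stdint.h> types for the ccan le32/le64 types and ignores
endianness. How builtin-check.c actually consumes these definitions is not
shown in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Constants as in the check.h hunk (mask written with unsigned constants) */
#define MAP_TRIM_SHIFT 31
#define MAP_ERR_SHIFT 30
#define MAP_LBA_MASK (~((1U << MAP_TRIM_SHIFT) | (1U << MAP_ERR_SHIFT)))
#define MAP_ENT_NORMAL 0xC0000000
#define BTT_SIG_LEN 16
#define BTT_INFO_SIZE 4096

/* Same compile-time check that the patch adds to util/util.h */
#define BUILD_BUG_ON(condition) ((void)sizeof(char[1 - 2*!!(condition)]))

/* Hypothetical mirror of struct log_entry, host-endian types only */
struct log_entry_sketch {
	uint32_t lba, old_map, new_map, seq;
	uint64_t padding[2];
};

/* Hypothetical mirror of struct btt_sb, host-endian types only */
struct btt_sb_sketch {
	uint8_t signature[BTT_SIG_LEN];
	uint8_t uuid[16];
	uint8_t parent_uuid[16];
	uint32_t flags;
	uint16_t version_major, version_minor;
	uint32_t external_lbasize, external_nlba;
	uint32_t internal_lbasize, internal_nlba;
	uint32_t nfree, infosize;
	uint64_t nextoff, dataoff, mapoff, logoff, info2off;
	uint8_t padding[3968];
	uint64_t checksum;
};

/* Fails to compile if either layout drifts from its expected size */
static inline void check_layout(void)
{
	BUILD_BUG_ON(sizeof(struct btt_sb_sketch) != BTT_INFO_SIZE);
	BUILD_BUG_ON(sizeof(struct log_entry_sketch) != 32);
}

/* Block number with both flag bits stripped */
static inline uint32_t map_entry_lba(uint32_t ent)
{
	return ent & MAP_LBA_MASK;
}

static inline bool map_entry_trim(uint32_t ent)
{
	return (ent >> MAP_TRIM_SHIFT) & 1;
}

static inline bool map_entry_err(uint32_t ent)
{
	return (ent >> MAP_ERR_SHIFT) & 1;
}

/* Both flag bits set is the pattern the header names MAP_ENT_NORMAL */
static inline bool map_entry_is_normal(uint32_t ent)
{
	return (ent & MAP_ENT_NORMAL) == MAP_ENT_NORMAL;
}
```

A BUILD_BUG_ON of this kind costs nothing at run time, so it is a cheap
guard against a field being added or resized without the padding being
adjusted to match.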

_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


* [ndctl PATCH v4 6/6] ndctl, test: Add a unit test for the BTT checker
  2017-04-07 23:17 [ndctl PATCH v4 0/6] Add ndctl check-namespace Vishal Verma
                   ` (4 preceding siblings ...)
  2017-04-07 23:18 ` [ndctl PATCH v4 5/6] ndctl: add a BTT check utility Vishal Verma
@ 2017-04-07 23:18 ` Vishal Verma
  2017-04-08  2:14 ` [ndctl PATCH v4 0/6] Add ndctl check-namespace Dan Williams
  6 siblings, 0 replies; 8+ messages in thread
From: Vishal Verma @ 2017-04-07 23:18 UTC (permalink / raw)
  To: linux-nvdimm

Add a new unit test that will set up BTTs, corrupt them in known ways,
and test that the checker is able to detect or repair the corruption in
the expected way.

Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 test/Makefile.am  |   3 +-
 test/btt-check.sh | 172 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 174 insertions(+), 1 deletion(-)
 create mode 100755 test/btt-check.sh

diff --git a/test/Makefile.am b/test/Makefile.am
index 24afcea..b09d2dd 100644
--- a/test/Makefile.am
+++ b/test/Makefile.am
@@ -9,7 +9,8 @@ TESTS =\
 	create.sh \
 	clear.sh \
 	dax-errors.sh \
-	daxdev-errors.sh
+	daxdev-errors.sh \
+	btt-check.sh
 
 check_PROGRAMS =\
 	libndctl \
diff --git a/test/btt-check.sh b/test/btt-check.sh
new file mode 100755
index 0000000..11821d2
--- /dev/null
+++ b/test/btt-check.sh
@@ -0,0 +1,172 @@
+#!/bin/bash -E
+
+[ -f "../ndctl/ndctl" ] && [ -x "../ndctl/ndctl" ] && ndctl="../ndctl/ndctl"
+[ -f "./ndctl/ndctl" ] && [ -x "./ndctl/ndctl" ] && ndctl="./ndctl/ndctl"
+[ -z "$ndctl" ] && echo "Couldn't find an ndctl binary" && exit 1
+bus="nfit_test.0"
+json2var="s/[{}\",]//g; s/:/=/g"
+dev=""
+mode=""
+size=""
+sector_size=""
+blockdev=""
+bs=4096
+rc=77
+
+trap 'err $LINENO' ERR
+
+# sample json:
+# {
+#   "dev":"namespace5.0",
+#   "mode":"sector",
+#   "size":32440320,
+#   "uuid":"51805176-e124-4635-ae17-0e6a4a16671a",
+#   "sector_size":4096,
+#   "blockdev":"pmem5s"
+# }
+
+# $1: Line number
+# $2: exit code
+err()
+{
+	[ -n "$2" ] && rc="$2"
+	echo "test/btt-check: failed at line $1"
+	exit "$rc"
+}
+
+create()
+{
+	json=$($ndctl create-namespace -b "$bus" -t pmem -m sector)
+	eval "$(echo "$json" | sed -e "$json2var")"
+	[ -n "$dev" ] || err "$LINENO" 2
+	[ "$mode" = "sector" ] || err "$LINENO" 2
+	[ -n "$size" ] || err "$LINENO" 2
+	[ -n "$sector_size" ] || err "$LINENO" 2
+	[ -n "$blockdev" ] || err "$LINENO" 2
+	[ $size -gt 0 ] || err "$LINENO" 2
+}
+
+reset()
+{
+	$ndctl disable-region -b "$bus" all
+	$ndctl zero-labels -b "$bus" all
+	$ndctl enable-region -b "$bus" all
+}
+
+# re-enable the BTT namespace, and do IO to it in an attempt to
+# verify it still comes up ok, and functions as expected
+post_repair_test()
+{
+	echo "${FUNCNAME[0]}: I/O to BTT namespace"
+	test -b /dev/$blockdev
+	dd if=/dev/urandom of=test-bin bs=$sector_size count=$((size/sector_size)) > /dev/null 2>&1
+	dd if=test-bin of=/dev/$blockdev bs=$sector_size count=$((size/sector_size)) > /dev/null 2>&1
+	dd if=/dev/$blockdev of=test-bin-read bs=$sector_size count=$((size/sector_size)) > /dev/null 2>&1
+	diff test-bin test-bin-read
+	rm -f test-bin*
+	echo "done"
+}
+
+test_normal()
+{
+	echo "=== ${FUNCNAME[0]} ==="
+	# disable the namespace
+	$ndctl disable-namespace $dev
+	$ndctl check-namespace $dev
+	$ndctl enable-namespace $dev
+	post_repair_test
+}
+
+test_force()
+{
+	echo "=== ${FUNCNAME[0]} ==="
+	$ndctl check-namespace --force $dev
+	post_repair_test
+}
+
+set_raw()
+{
+	$ndctl disable-namespace $dev
+	echo -n "set raw_mode: "
+	echo 1 | tee /sys/bus/nd/devices/$dev/force_raw
+	$ndctl enable-namespace $dev
+	raw_bdev="${blockdev%%s}"
+	test -b /dev/$raw_bdev
+	raw_size="$(cat /sys/bus/nd/devices/$dev/size)"
+}
+
+unset_raw()
+{
+	$ndctl disable-namespace $dev
+	echo -n "set raw_mode: "
+	echo 0 | tee /sys/bus/nd/devices/$dev/force_raw
+	$ndctl enable-namespace $dev
+	raw_bdev=""
+}
+
+test_bad_info2()
+{
+	echo "=== ${FUNCNAME[0]} ==="
+	set_raw
+	seek="$((raw_size/bs - 1))"
+	echo "wiping info2 block (offset = $seek blocks)"
+	dd if=/dev/zero of=/dev/$raw_bdev bs=$bs count=1 seek=$seek
+	unset_raw
+	$ndctl disable-namespace $dev
+	$ndctl check-namespace $dev 2>&1 | grep "info2 needs to be restored"
+	$ndctl check-namespace --repair $dev
+	$ndctl enable-namespace $dev
+	post_repair_test
+}
+
+test_bad_info()
+{
+	echo "=== ${FUNCNAME[0]} ==="
+	set_raw
+	echo "wiping info block"
+	dd if=/dev/zero of=/dev/$raw_bdev bs=$bs count=1 seek=1
+	unset_raw
+	$ndctl disable-namespace $dev
+	$ndctl check-namespace $dev 2>&1 | grep "info block at offset 0x1000 needs to be restored"
+	$ndctl check-namespace --repair $dev
+	$ndctl enable-namespace $dev
+	post_repair_test
+}
+
+test_bitmap()
+{
+	echo "=== ${FUNCNAME[0]} ==="
+	reset && create
+	set_raw
+	# scribble over the last 4K of the map
+	rm -f /tmp/scribble
+	for (( i=0 ; i<512 ; i++ )); do
+		echo -n -e \\x1e\\x1e\\x00\\xc0\\x1e\\x1e\\x00\\xc0 >> /tmp/scribble
+	done
+	seek="$((raw_size/bs - (256*64/bs) - 2))"
+	echo "scribbling over map entries (offset = $seek blocks)"
+	dd if=/tmp/scribble of=/dev/$raw_bdev bs=$bs seek=$seek
+	rm -f /tmp/scribble
+	unset_raw
+	$ndctl disable-namespace $dev
+	$ndctl check-namespace $dev 2>&1 | grep "bitmap error"
+	# This is not repairable
+	reset && create
+}
+
+do_tests()
+{
+	test_normal
+	test_force
+	test_bad_info2
+	test_bad_info
+	test_bitmap
+}
+
+# setup (reset nfit_test dimms, create the BTT namespace)
+modprobe nfit_test
+rc=1
+reset && create
+do_tests
+reset
+exit 0
-- 
2.9.3
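
The seek arithmetic in test_bad_info2() and test_bitmap() can be restated
as a small C sketch. The helper names below are hypothetical and do not
exist in ndctl. Reading the "256*64" term as the size of the log area that
sits between the map and the info2 block is an assumption on my part; the
script itself does not say so. The last helper decodes the 4-byte pattern
that test_bitmap() writes repeatedly, showing that every scribbled map
entry carries the same value.

```c
#include <assert.h>
#include <stdint.h>

/* The "256*64" term from test_bitmap(); assumed to be the log area size */
#define LOG_BYTES (256 * 64)

/* test_bad_info2(): the info2 block is the last bs-sized block */
static inline uint64_t info2_seek(uint64_t raw_size, uint64_t bs)
{
	return raw_size / bs - 1;
}

/*
 * test_bitmap(): step back over the assumed log area and the info2 block,
 * then one more block to land on the last bs-sized block of the map.
 */
static inline uint64_t map_tail_seek(uint64_t raw_size, uint64_t bs)
{
	return raw_size / bs - (LOG_BYTES / bs) - 2;
}

/* Assemble a little-endian 32-bit value from four bytes */
static inline uint32_t le32_from_bytes(const unsigned char b[4])
{
	return (uint32_t)b[0] | (uint32_t)b[1] << 8 |
		(uint32_t)b[2] << 16 | (uint32_t)b[3] << 24;
}
```

For a hypothetical 32 MiB raw namespace with bs=4096, info2_seek() gives
block 8191 and map_tail_seek() gives block 8186. The scribble bytes
1e 1e 00 c0 decode to 0xC0001E1E, that is, both flag bits set plus block
number 0x1E1E, so the 1024 entries written by the loop all point at the
same block.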


* Re: [ndctl PATCH v4 0/6] Add ndctl check-namespace
  2017-04-07 23:17 [ndctl PATCH v4 0/6] Add ndctl check-namespace Vishal Verma
                   ` (5 preceding siblings ...)
  2017-04-07 23:18 ` [ndctl PATCH v4 6/6] ndctl, test: Add a unit test for the BTT checker Vishal Verma
@ 2017-04-08  2:14 ` Dan Williams
  6 siblings, 0 replies; 8+ messages in thread
From: Dan Williams @ 2017-04-08  2:14 UTC (permalink / raw)
  To: Vishal Verma; +Cc: linux-nvdimm

On Fri, Apr 7, 2017 at 4:17 PM, Vishal Verma <vishal.l.verma@intel.com> wrote:
> Changes in v4:
> - Change the bitmap code to the kernel's GPLv2 Routines instead of the
>   LGPL ccan/bitmap.
> - Upgrade a few messages from 'info' to 'err'

Thanks Vishal! Applied and pushed out to 'pending'.

https://github.com/pmem/ndctl/tree/pending

end of thread, other threads:[~2017-04-08  2:14 UTC | newest]

Thread overview: 8+ messages
2017-04-07 23:17 [ndctl PATCH v4 0/6] Add ndctl check-namespace Vishal Verma
2017-04-07 23:17 ` [ndctl PATCH v4 1/6] libndctl: add a ndctl_namespace_is_active helper Vishal Verma
2017-04-07 23:17 ` [ndctl PATCH v4 2/6] libndctl: add a ndctl_namespace_disable_safe() API Vishal Verma
2017-04-07 23:18 ` [ndctl PATCH v4 3/6] ndctl: move the fletcher64 routine to util/ Vishal Verma
2017-04-07 23:18 ` [ndctl PATCH v4 4/6] util: add util/bitmap in preparation for the BTT checker Vishal Verma
2017-04-07 23:18 ` [ndctl PATCH v4 5/6] ndctl: add a BTT check utility Vishal Verma
2017-04-07 23:18 ` [ndctl PATCH v4 6/6] ndctl, test: Add a unit test for the BTT checker Vishal Verma
2017-04-08  2:14 ` [ndctl PATCH v4 0/6] Add ndctl check-namespace Dan Williams
