linux-bcache.vger.kernel.org archive mirror
* Re: [RFC PATCH v6 0/7] nvm page allocator for bcache
  2021-02-08 14:26 [RFC PATCH v6 0/7] nvm page allocator for bcache Qiaowei Ren
@ 2021-02-08 13:49 ` Coly Li
  2021-02-09  2:30   ` Ren, Qiaowei
  2021-02-08 14:26 ` [RFC PATCH v6 1/7] bcache: add initial data structures for nvm pages Qiaowei Ren
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 11+ messages in thread
From: Coly Li @ 2021-02-08 13:49 UTC (permalink / raw)
  To: Qiaowei Ren, Jianpeng Ma; +Cc: linux-bcache

On 2/8/21 10:26 PM, Qiaowei Ren wrote:
> This series implements an nvm pages allocator for bcache. The idea came
> from a discussion with Coly about nvdimm use cases in the kernel. Coly
> sent the following email introducing what we planned to do:
> 
> https://lore.kernel.org/linux-bcache/bc7e71ec-97eb-b226-d4fc-d8b64c1ef41a@suse.de/
> 
> This series focuses on the first step in the above email: this patch set
> implements a generic framework in bcache to allocate/release NV-memory
> pages, and to provide each requestor its allocated pages after reboot.
> To do this, a simple buddy system is implemented to manage NV-memory
> pages.
> 
> This set includes one testing module which can be used for simple test
> cases. The next step is to store the bcache log or internal btree nodes
> on nvdimm via these buddy APIs to do more testing.
> 
> Qiaowei Ren (7):
>   bcache: add initial data structures for nvm pages
>   bcache: initialize the nvm pages allocator
>   bcache: initialization of the buddy
>   bcache: bch_nvm_alloc_pages() of the buddy
>   bcache: bch_nvm_free_pages() of the buddy
>   bcache: get allocated pages from specific owner
>   bcache: persist owner info when alloc/free pages.

I tested the v6 patch set, and it works with the current bcache part
changes. Sorry for not responding to the previous series in time on the
list, and thank you all for fixing the known issues from the previous
version.

Although the series is still marked RFC, IMHO the patches are in good
shape for an EXPERIMENTAL series.

I will carry them with my other bcache changes in the v5.12 for-next
branch, and so far they hold up well in my smoke testing.

There is one thing I would like you to clarify: in some patches the
author and the first Signed-off-by person are not identical. Please
make the first Signed-off-by person match the From/Author field. I
guess most of the work was done by both of you; if that is true, the
second author can add a Co-authored-by: tag after the first
Signed-off-by: line.

The v6 series is under testing now, so it is unnecessary to post one
more version just for the above changes. I'd like to make these changes
on my side if you can provide me some hints.

Thanks for the contribution; the tiny NVDIMM pages allocator works.

Coly Li


* [RFC PATCH v6 0/7] nvm page allocator for bcache
@ 2021-02-08 14:26 Qiaowei Ren
  2021-02-08 13:49 ` Coly Li
                   ` (7 more replies)
  0 siblings, 8 replies; 11+ messages in thread
From: Qiaowei Ren @ 2021-02-08 14:26 UTC (permalink / raw)
  To: Coly Li; +Cc: Qiaowei Ren, Jianpeng Ma, linux-bcache

This series implements an nvm pages allocator for bcache. The idea came
from a discussion with Coly about nvdimm use cases in the kernel. Coly
sent the following email introducing what we planned to do:

https://lore.kernel.org/linux-bcache/bc7e71ec-97eb-b226-d4fc-d8b64c1ef41a@suse.de/

This series focuses on the first step in the above email: this patch set
implements a generic framework in bcache to allocate/release NV-memory
pages, and to provide each requestor its allocated pages after reboot.
To do this, a simple buddy system is implemented to manage NV-memory
pages.

This set includes one testing module which can be used for simple test
cases. The next step is to store the bcache log or internal btree nodes
on nvdimm via these buddy APIs to do more testing. A minimal usage
sketch of the exported API follows below.
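
For reference, a caller-side sketch of the buddy APIs this series
exports (illustration only, not part of the patches; the owner uuid
below is hypothetical, real owners use their cache set UUID):

	void *p;
	struct bch_extent *head;
	static const char demo_uuid[16] = "0123456789abcde";	/* hypothetical */

	/* allocate 2^2 = 4 contiguous nvm pages for this owner */
	p = bch_nvm_alloc_pages(2, demo_uuid);
	if (!p)
		return -ENOMEM;

	/* after a reboot, the owner recovers its pages by uuid ... */
	head = bch_get_allocated_pages(demo_uuid);

	/* ... and releases them when no longer needed */
	bch_nvm_free_pages(p, 2, demo_uuid);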

Qiaowei Ren (7):
  bcache: add initial data structures for nvm pages
  bcache: initialize the nvm pages allocator
  bcache: initialization of the buddy
  bcache: bch_nvm_alloc_pages() of the buddy
  bcache: bch_nvm_free_pages() of the buddy
  bcache: get allocated pages from specific owner
  bcache: persist owner info when alloc/free pages.

 drivers/md/bcache/Kconfig       |   6 +
 drivers/md/bcache/Makefile      |   2 +-
 drivers/md/bcache/nvm-pages.c   | 853 ++++++++++++++++++++++++++++++++
 drivers/md/bcache/nvm-pages.h   | 112 +++++
 drivers/md/bcache/super.c       |   3 +
 include/uapi/linux/bcache-nvm.h | 188 +++++++
 6 files changed, 1163 insertions(+), 1 deletion(-)
 create mode 100644 drivers/md/bcache/nvm-pages.c
 create mode 100644 drivers/md/bcache/nvm-pages.h
 create mode 100644 include/uapi/linux/bcache-nvm.h

-- 
2.17.1



* [RFC PATCH v6 1/7] bcache: add initial data structures for nvm pages
  2021-02-08 14:26 [RFC PATCH v6 0/7] nvm page allocator for bcache Qiaowei Ren
  2021-02-08 13:49 ` Coly Li
@ 2021-02-08 14:26 ` Qiaowei Ren
  2021-02-08 14:26 ` [RFC PATCH v6 2/7] bcache: initialize the nvm pages allocator Qiaowei Ren
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Qiaowei Ren @ 2021-02-08 14:26 UTC (permalink / raw)
  To: Coly Li; +Cc: Qiaowei Ren, Jianpeng Ma, linux-bcache

This patch adds the prototype data structures for the nvm pages
allocator:

- struct bch_nvm_pages_sb
This is the super block allocated on each nvdimm namespace. An nvdimm
set may have multiple namespaces; bch_nvm_pages_sb->set_uuid is used
to mark which nvdimm set this namespace belongs to. Normally the
bcache cache set UUID is used to initialize this uuid, to connect the
nvdimm set to a specific bcache cache set.

- struct bch_owner_list_head
This is a table of the heads of all owner lists. An owner list records
which page(s) are allocated to which owner. After reboot from a power
failure, an owner may find all of its requested and allocated pages in
the owner list via a handle derived from its UUID.

- struct bch_nvm_pages_owner_head
This is the head of an owner list. Each owner has exactly one owner
list, and an nvm page belongs to exactly one owner. uuid[] is set to
the owner's uuid; for bcache it is the cache set uuid. label is not
mandatory; it is a human-readable string for debug purposes. The
pointers in recs[] reference separate nvm pages which hold the tables
of struct bch_pgalloc_rec.

- struct bch_nvm_pgalloc_recs
This structure occupies a whole page. owner_uuid should match the uuid
in struct bch_nvm_pages_owner_head, and recs[] is the actual table
containing all allocation records.

- struct bch_pgalloc_rec
Each structure records a range of allocated nvm pages. pgoff is the
offset, in units of page size, of this allocated nvm page range.
Adjacent page ranges of the same owner can be merged into a larger
one, therefore the nr of a record is NOT always a power of 2.
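
As a hypothetical illustration of the last point (the offsets below are
made up), two adjacent allocations by the same owner may end up stored
as a single record:

	struct bch_pgalloc_rec r1 = { .pgoff = 16, .nr = 4 };	/* pages 16..19 */
	struct bch_pgalloc_rec r2 = { .pgoff = 20, .nr = 2 };	/* pages 20..21 */
	/* 16 + 4 == 20, so the two ranges can be merged into: */
	struct bch_pgalloc_rec merged = { .pgoff = 16, .nr = 6 };	/* 6 is not a power of 2 */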

Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
---
 include/uapi/linux/bcache-nvm.h | 195 ++++++++++++++++++++++++++++++++
 1 file changed, 195 insertions(+)
 create mode 100644 include/uapi/linux/bcache-nvm.h

diff --git a/include/uapi/linux/bcache-nvm.h b/include/uapi/linux/bcache-nvm.h
new file mode 100644
index 000000000000..61108bf2a63e
--- /dev/null
+++ b/include/uapi/linux/bcache-nvm.h
@@ -0,0 +1,195 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+
+#ifndef _UAPI_BCACHE_NVM_H
+#define _UAPI_BCACHE_NVM_H
+
+/*
+ * Bcache on NVDIMM data structures
+ */
+
+/*
+ * - struct bch_nvm_pages_sb
+ *   This is the super block allocated on each nvdimm namespace. An nvdimm
+ * set may have multiple namespaces; bch_nvm_pages_sb->set_uuid is used to
+ * mark which nvdimm set this namespace belongs to. Normally the bcache
+ * cache set UUID is used to initialize this uuid, to connect the nvdimm
+ * set to a specific bcache cache set.
+ *
+ * - struct bch_owner_list_head
+ *   This is a table of the heads of all owner lists. An owner list records
+ * which page(s) are allocated to which owner. After reboot from a power
+ * failure, an owner may find all of its requested and allocated pages in
+ * the owner list via a handle derived from its UUID.
+ *
+ * - struct bch_nvm_pages_owner_head
+ *   This is the head of an owner list. Each owner has exactly one owner
+ * list, and an nvm page belongs to exactly one owner. uuid[] is set to the
+ * owner's uuid; for bcache it is the cache set uuid. label is not
+ * mandatory; it is a human-readable string for debug purposes. The
+ * pointers in recs[] reference separate nvm pages which hold the tables of
+ * struct bch_pgalloc_rec.
+ *
+ * - struct bch_nvm_pgalloc_recs
+ *   This structure occupies a whole page. owner_uuid should match the uuid
+ * in struct bch_nvm_pages_owner_head, and recs[] is the actual table
+ * containing all allocation records.
+ *
+ * - struct bch_pgalloc_rec
+ *   Each structure records a range of allocated nvm pages. pgoff is the
+ * offset, in units of page size, of this allocated nvm page range.
+ * Adjacent page ranges of the same owner can be merged into a larger one,
+ * therefore the nr of a record is NOT always a power of 2.
+ *
+ *
+ * Memory layout on nvdimm namespace 0
+ *
+ *    0 +---------------------------------+
+ *      |                                 |
+ *  4KB +---------------------------------+
+ *      |         bch_nvm_pages_sb        |
+ *  8KB +---------------------------------+ <--- bch_nvm_pages_sb.bch_owner_list_head
+ *      |       bch_owner_list_head       |
+ *      |                                 |
+ * 16KB +---------------------------------+ <--- bch_owner_list_head.heads[0].recs[0]
+ *      |       bch_nvm_pgalloc_recs      |
+ *      |  (nvm pages internal usage)     |
+ * 24KB +---------------------------------+
+ *      |                                 |
+ *      |                                 |
+ * 16MB +---------------------------------+
+ *      |      allocable nvm pages        |
+ *      |      for buddy allocator        |
+ * end  +---------------------------------+
+ *
+ *
+ *
+ * Memory layout on nvdimm namespace N
+ * (doesn't have owner list)
+ *
+ *    0 +---------------------------------+
+ *      |                                 |
+ *  4KB +---------------------------------+
+ *      |         bch_nvm_pages_sb        |
+ *  8KB +---------------------------------+
+ *      |                                 |
+ *      |                                 |
+ *      |                                 |
+ *      |                                 |
+ *      |                                 |
+ *      |                                 |
+ * 16MB +---------------------------------+
+ *      |      allocable nvm pages        |
+ *      |      for buddy allocator        |
+ * end  +---------------------------------+
+ *
+ */
+
+#include <linux/types.h>
+
+/* In bytes */
+#define BCH_NVM_PAGES_SB_OFFSET			4096
+#define BCH_NVM_PAGES_OFFSET			(16 << 20)
+
+#define BCH_NVM_PAGES_LABEL_SIZE		32
+#define BCH_NVM_PAGES_NAMESPACES_MAX		8
+
+#define BCH_NVM_PAGES_OWNER_LIST_HEAD_OFFSET	(8<<10)
+#define BCH_NVM_PAGES_SYS_RECS_HEAD_OFFSET	(16<<10)
+
+#define BCH_NVM_PAGES_SB_VERSION		0
+#define BCH_NVM_PAGES_SB_VERSION_MAX		0
+
+static const char bch_nvm_pages_magic[] = {
+	0x17, 0xbd, 0x53, 0x7f, 0x1b, 0x23, 0xd6, 0x83,
+	0x46, 0xa4, 0xf8, 0x28, 0x17, 0xda, 0xec, 0xa9 };
+static const char bch_nvm_pages_pgalloc_magic[] = {
+	0x39, 0x25, 0x3f, 0xf7, 0x27, 0x17, 0xd0, 0xb9,
+	0x10, 0xe6, 0xd2, 0xda, 0x38, 0x68, 0x26, 0xae };
+
+struct bch_pgalloc_rec {
+	__u32			pgoff;
+	__u32			nr;
+};
+
+struct bch_nvm_pgalloc_recs {
+union {
+	struct {
+		struct bch_nvm_pages_owner_head	*owner;
+		struct bch_nvm_pgalloc_recs	*next;
+		__u8				magic[16];
+		__u8				owner_uuid[16];
+		__u32				size;
+		__u32				used;
+		__u64				_pad[4];
+		struct bch_pgalloc_rec		recs[];
+	};
+	__u8	pad[8192];
+};
+};
+#define BCH_MAX_RECS					\
+	((sizeof(struct bch_nvm_pgalloc_recs) -		\
+	 offsetof(struct bch_nvm_pgalloc_recs, recs)) /	\
+	 sizeof(struct bch_pgalloc_rec))
+
+struct bch_nvm_pages_owner_head {
+	__u8			uuid[16];
+	char			label[BCH_NVM_PAGES_LABEL_SIZE];
+	/* Per-namespace own lists */
+	struct bch_nvm_pgalloc_recs	*recs[BCH_NVM_PAGES_NAMESPACES_MAX];
+};
+
+/* heads[0] is always for nvm_pages internal usage */
+struct bch_owner_list_head {
+union {
+	struct {
+		__u32				size;
+		__u32				used;
+		__u64				_pad[4];
+		struct bch_nvm_pages_owner_head	heads[];
+	};
+	__u8	pad[8192];
+};
+};
+#define BCH_MAX_OWNER_LIST				\
+	((sizeof(struct bch_owner_list_head) -		\
+	 offsetof(struct bch_owner_list_head, heads)) /	\
+	 sizeof(struct bch_nvm_pages_owner_head))
+
+/* The on-media bit order is local CPU order */
+struct bch_nvm_pages_sb {
+	__u64			csum;
+	__u64			ns_start;
+	__u64			sb_offset;
+	__u64			version;
+	__u8			magic[16];
+	__u8			uuid[16];
+	__u32			page_size;
+	__u32			total_namespaces_nr;
+	__u32			this_namespace_nr;
+	union {
+		__u8		set_uuid[16];
+		__u64		set_magic;
+	};
+
+	__u64			flags;
+	__u64			seq;
+
+	__u64			feature_compat;
+	__u64			feature_incompat;
+	__u64			feature_ro_compat;
+
+	/* For allocable nvm pages from buddy systems */
+	__u64			pages_offset;
+	__u64			pages_total;
+
+	__u64			pad[8];
+
+	/* Only on the first name space */
+	struct bch_owner_list_head	*owner_list_head;
+
+	/* Just for csum_set() */
+	__u32			keys;
+	__u64			d[0];
+};
+
+#endif /* _UAPI_BCACHE_NVM_H */
-- 
2.17.1



* [RFC PATCH v6 2/7] bcache: initialize the nvm pages allocator
  2021-02-08 14:26 [RFC PATCH v6 0/7] nvm page allocator for bcache Qiaowei Ren
  2021-02-08 13:49 ` Coly Li
  2021-02-08 14:26 ` [RFC PATCH v6 1/7] bcache: add initial data structures for nvm pages Qiaowei Ren
@ 2021-02-08 14:26 ` Qiaowei Ren
  2021-02-08 14:26 ` [RFC PATCH v6 3/7] bcache: initialization of the buddy Qiaowei Ren
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Qiaowei Ren @ 2021-02-08 14:26 UTC (permalink / raw)
  To: Coly Li; +Cc: Qiaowei Ren, Jianpeng Ma, linux-bcache

This patch defines the prototype in-memory data structures and
initializes the nvm pages allocator.

The nvm address space managed by this allocator can consist of many nvm
namespaces, and several namespaces can be composed into one nvm set,
like a cache set. In this initial implementation, only one set is
supported.

Users of this nvm pages allocator need to call bch_register_namespace()
to register an nvdimm device (like /dev/pmemX) with the allocator as an
instance of struct bch_nvm_namespace.
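
A caller-side sketch of the registration contract (illustration only,
not part of the patch; the device path is an example, and errors are
returned via ERR_PTR):

	struct bch_nvm_namespace *ns;

	ns = bch_register_namespace("/dev/pmem0");	/* example path */
	if (IS_ERR(ns)) {
		pr_err("failed to register nvdimm namespace: %ld\n",
		       PTR_ERR(ns));
		return PTR_ERR(ns);
	}
	/* on success, ns->kaddr is the DAX-mapped base of the namespace */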

Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
---
 drivers/md/bcache/Kconfig       |   6 +
 drivers/md/bcache/Makefile      |   2 +-
 drivers/md/bcache/nvm-pages.c   | 404 ++++++++++++++++++++++++++++++++
 drivers/md/bcache/nvm-pages.h   |  92 ++++++++
 drivers/md/bcache/super.c       |   3 +
 include/uapi/linux/bcache-nvm.h |   7 -
 6 files changed, 506 insertions(+), 8 deletions(-)
 create mode 100644 drivers/md/bcache/nvm-pages.c
 create mode 100644 drivers/md/bcache/nvm-pages.h

diff --git a/drivers/md/bcache/Kconfig b/drivers/md/bcache/Kconfig
index d1ca4d059c20..fdec9905ef40 100644
--- a/drivers/md/bcache/Kconfig
+++ b/drivers/md/bcache/Kconfig
@@ -35,3 +35,9 @@ config BCACHE_ASYNC_REGISTRATION
 	device path into this file will returns immediately and the real
 	registration work is handled in kernel work queue in asynchronous
 	way.
+
+config BCACHE_NVM_PAGES
+	bool "NVDIMM support for bcache (EXPERIMENTAL)"
+	depends on BCACHE
+	help
+	  nvm pages allocator for bcache.
diff --git a/drivers/md/bcache/Makefile b/drivers/md/bcache/Makefile
index 5b87e59676b8..948e5ed2ca66 100644
--- a/drivers/md/bcache/Makefile
+++ b/drivers/md/bcache/Makefile
@@ -4,4 +4,4 @@ obj-$(CONFIG_BCACHE)	+= bcache.o
 
 bcache-y		:= alloc.o bset.o btree.o closure.o debug.o extents.o\
 	io.o journal.o movinggc.o request.o stats.o super.o sysfs.o trace.o\
-	util.o writeback.o features.o
+	util.o writeback.o features.o nvm-pages.o
diff --git a/drivers/md/bcache/nvm-pages.c b/drivers/md/bcache/nvm-pages.c
new file mode 100644
index 000000000000..4fa8e2764773
--- /dev/null
+++ b/drivers/md/bcache/nvm-pages.c
@@ -0,0 +1,404 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Nvdimm page-buddy allocator
+ *
+ * Copyright (c) 2021, Intel Corporation.
+ * Copyright (c) 2021, Qiaowei Ren <qiaowei.ren@intel.com>.
+ * Copyright (c) 2021, Jianpeng Ma <jianpeng.ma@intel.com>.
+ */
+
+#include "bcache.h"
+#include "nvm-pages.h"
+
+#include <linux/slab.h>
+#include <linux/list.h>
+#include <linux/mutex.h>
+#include <linux/dax.h>
+#include <linux/pfn_t.h>
+#include <linux/libnvdimm.h>
+#include <linux/mm_types.h>
+#include <linux/err.h>
+#include <linux/pagemap.h>
+#include <linux/bitmap.h>
+#include <linux/blkdev.h>
+
+#ifdef CONFIG_BCACHE_NVM_PAGES
+
+static const char bch_nvm_pages_magic[] = {
+	0x17, 0xbd, 0x53, 0x7f, 0x1b, 0x23, 0xd6, 0x83,
+	0x46, 0xa4, 0xf8, 0x28, 0x17, 0xda, 0xec, 0xa9 };
+static const char bch_nvm_pages_pgalloc_magic[] = {
+	0x39, 0x25, 0x3f, 0xf7, 0x27, 0x17, 0xd0, 0xb9,
+	0x10, 0xe6, 0xd2, 0xda, 0x38, 0x68, 0x26, 0xae };
+
+struct bch_nvm_set *only_set;
+
+static struct bch_owner_list *alloc_owner_list(const char *owner_uuid,
+		const char *label, int total_namespaces)
+{
+	struct bch_owner_list *owner_list;
+
+	owner_list = kzalloc(sizeof(*owner_list), GFP_KERNEL);
+	if (!owner_list)
+		return NULL;
+
+	owner_list->alloced_recs = kcalloc(total_namespaces,
+			sizeof(struct bch_nvm_alloced_recs *), GFP_KERNEL);
+	if (!owner_list->alloced_recs) {
+		kfree(owner_list);
+		return NULL;
+	}
+
+	if (owner_uuid)
+		memcpy(owner_list->owner_uuid, owner_uuid, 16);
+	if (label)
+		memcpy(owner_list->label, label, BCH_NVM_PAGES_LABEL_SIZE);
+
+	return owner_list;
+}
+
+static void release_extents(struct bch_nvm_alloced_recs *extents)
+{
+	struct list_head *list = extents->extent_head.next;
+	struct bch_extent *extent;
+
+	while (list != &extents->extent_head) {
+		extent = container_of(list, struct bch_extent, list);
+		list_del(list);
+		kfree(extent);
+		list = extents->extent_head.next;
+	}
+	kfree(extents);
+}
+
+static void release_owner_info(struct bch_nvm_set *nvm_set)
+{
+	struct bch_owner_list *owner_list;
+	int i, j;
+
+	for (i = 0; i < nvm_set->owner_list_used; i++) {
+		owner_list = nvm_set->owner_lists[i];
+		for (j = 0; j < nvm_set->total_namespaces_nr; j++) {
+			if (owner_list->alloced_recs[j])
+				release_extents(owner_list->alloced_recs[j]);
+		}
+		kfree(owner_list->alloced_recs);
+		kfree(owner_list);
+	}
+	kfree(nvm_set->owner_lists);
+}
+
+static void release_nvm_namespaces(struct bch_nvm_set *nvm_set)
+{
+	int i;
+
+	for (i = 0; i < nvm_set->total_namespaces_nr; i++) {
+		blkdev_put(nvm_set->nss[i]->bdev, FMODE_READ|FMODE_WRITE|FMODE_EXEC);
+		kfree(nvm_set->nss[i]);
+	}
+
+	kfree(nvm_set->nss);
+}
+
+static void release_nvm_set(struct bch_nvm_set *nvm_set)
+{
+	release_nvm_namespaces(nvm_set);
+	release_owner_info(nvm_set);
+	kfree(nvm_set);
+}
+
+static void *nvm_pgoff_to_vaddr(struct bch_nvm_namespace *ns, pgoff_t pgoff)
+{
+	return ns->kaddr + (pgoff << PAGE_SHIFT);
+}
+
+static int init_owner_info(struct bch_nvm_namespace *ns)
+{
+	struct bch_owner_list_head *owner_list_head;
+	struct bch_nvm_pages_owner_head *owner_head;
+	struct bch_nvm_pgalloc_recs *nvm_pgalloc_recs;
+	struct bch_owner_list *owner_list;
+	struct bch_nvm_alloced_recs *extents;
+	struct bch_extent *extent;
+	u32 i, j, k;
+
+	owner_list_head = (struct bch_owner_list_head *)
+			(ns->kaddr + BCH_NVM_PAGES_OWNER_LIST_HEAD_OFFSET);
+
+	mutex_lock(&only_set->lock);
+	only_set->owner_list_size = owner_list_head->size;
+	only_set->owner_list_used = owner_list_head->used;
+
+	for (i = 0; i < owner_list_head->used; i++) {
+		owner_head = &owner_list_head->heads[i];
+		owner_list = alloc_owner_list(owner_head->uuid, owner_head->label,
+				only_set->total_namespaces_nr);
+		if (!owner_list) {
+			mutex_unlock(&only_set->lock);
+			return -ENOMEM;
+		}
+
+		for (j = 0; j < only_set->total_namespaces_nr; j++) {
+			if (!only_set->nss[j] || !owner_head->recs[j])
+				continue;
+
+			nvm_pgalloc_recs = (struct bch_nvm_pgalloc_recs *)
+					((long)owner_head->recs[j] + ns->kaddr);
+			if (memcmp(nvm_pgalloc_recs->magic, bch_nvm_pages_pgalloc_magic, 16)) {
+				pr_info("invalid bch_nvmpages_pgalloc_magic\n");
+				mutex_unlock(&only_set->lock);
+				return -EINVAL;
+			}
+
+			extents = kzalloc(sizeof(*extents), GFP_KERNEL);
+			if (!extents) {
+				mutex_unlock(&only_set->lock);
+				return -ENOMEM;
+			}
+
+			extents->ns = only_set->nss[j];
+			INIT_LIST_HEAD(&extents->extent_head);
+			owner_list->alloced_recs[j] = extents;
+
+			do {
+				struct bch_pgalloc_rec *rec;
+
+				for (k = 0; k < nvm_pgalloc_recs->used; k++) {
+					rec = &nvm_pgalloc_recs->recs[k];
+					extent = kzalloc(sizeof(*extent), GFP_KERNEL);
+					if (!extent) {
+						mutex_unlock(&only_set->lock);
+						return -ENOMEM;
+					}
+					extent->kaddr = nvm_pgoff_to_vaddr(extents->ns, rec->pgoff);
+					extent->nr = rec->nr;
+					list_add_tail(&extent->list, &extents->extent_head);
+				}
+				extents->nr += nvm_pgalloc_recs->used;
+
+				if (nvm_pgalloc_recs->next) {
+					nvm_pgalloc_recs = (struct bch_nvm_pgalloc_recs *)
+						((long)nvm_pgalloc_recs->next + ns->kaddr);
+					if (memcmp(nvm_pgalloc_recs->magic,
+						bch_nvm_pages_pgalloc_magic, 16)) {
+						pr_info("invalid bch_nvmpages_pgalloc_magic\n");
+						mutex_unlock(&only_set->lock);
+						return -EINVAL;
+					}
+				} else
+					nvm_pgalloc_recs = NULL;
+			} while (nvm_pgalloc_recs);
+		}
+		only_set->owner_lists[i] = owner_list;
+		owner_list->nvm_set = only_set;
+	}
+	mutex_unlock(&only_set->lock);
+
+	return 0;
+}
+
+static bool attach_nvm_set(struct bch_nvm_namespace *ns)
+{
+	bool rc = true;
+
+	mutex_lock(&only_set->lock);
+	if (only_set->nss) {
+		if (memcmp(ns->sb.set_uuid, only_set->set_uuid, 16)) {
+			pr_info("namespace id does't match nvm set\n");
+			rc = false;
+			goto unlock;
+		}
+
+		if (only_set->nss[ns->sb.this_namespace_nr]) {
+			pr_info("already has the same position(%d) nvm\n",
+					ns->sb.this_namespace_nr);
+			rc = false;
+			goto unlock;
+		}
+	} else {
+		memcpy(only_set->set_uuid, ns->sb.set_uuid, 16);
+		only_set->total_namespaces_nr = ns->sb.total_namespaces_nr;
+		only_set->nss = kcalloc(only_set->total_namespaces_nr,
+				sizeof(struct bch_nvm_namespace *), GFP_KERNEL);
+		only_set->owner_lists = kcalloc(BCH_MAX_OWNER_LIST,
+				sizeof(struct bch_owner_list *), GFP_KERNEL);
+		if (!only_set->nss || !only_set->owner_lists) {
+			pr_info("can't alloc nss or owner_list\n");
+			kfree(only_set->nss);
+			kfree(only_set->owner_lists);
+			rc = false;
+			goto unlock;
+		}
+	}
+
+	only_set->nss[ns->sb.this_namespace_nr] = ns;
+
+unlock:
+	mutex_unlock(&only_set->lock);
+	return rc;
+}
+
+static int read_nvdimm_meta_super(struct block_device *bdev,
+			      struct bch_nvm_namespace *ns)
+{
+	struct page *page;
+	struct bch_nvm_pages_sb *sb;
+
+	page = read_cache_page_gfp(bdev->bd_inode->i_mapping,
+			BCH_NVM_PAGES_SB_OFFSET >> PAGE_SHIFT, GFP_KERNEL);
+
+	if (IS_ERR(page))
+		return -EIO;
+
+	sb = page_address(page) + offset_in_page(BCH_NVM_PAGES_SB_OFFSET);
+	memcpy(&ns->sb, sb, sizeof(struct bch_nvm_pages_sb));
+
+	put_page(page);
+
+	return 0;
+}
+
+struct bch_nvm_namespace *bch_register_namespace(const char *dev_path)
+{
+	struct bch_nvm_namespace *ns;
+	int err;
+	pgoff_t pgoff;
+	char buf[BDEVNAME_SIZE];
+	struct block_device *bdev;
+	uint64_t expected_csum;
+	int id;
+	char *path = NULL;
+
+	path = kstrndup(dev_path, 512, GFP_KERNEL);
+	if (!path) {
+		pr_err("kstrndup failed\n");
+		return ERR_PTR(-ENOMEM);
+	}
+
+	bdev = blkdev_get_by_path(strim(path),
+				  FMODE_READ|FMODE_WRITE|FMODE_EXEC,
+				  only_set);
+	if (IS_ERR(bdev)) {
+		pr_info("get %s error\n", dev_path);
+		kfree(path);
+		return ERR_PTR(PTR_ERR(bdev));
+	}
+
+	ns = kmalloc(sizeof(struct bch_nvm_namespace), GFP_KERNEL);
+	if (!ns)
+		goto bdput;
+
+	err = -EIO;
+	if (read_nvdimm_meta_super(bdev, ns)) {
+		pr_info("%s read nvdimm meta super block failed.\n",
+			bdevname(bdev, buf));
+		goto free_ns;
+	}
+
+	if (memcmp(ns->sb.magic, bch_nvm_pages_magic, 16)) {
+		pr_info("invalid bch_nvm_pages_magic\n");
+		goto free_ns;
+	}
+
+	if (ns->sb.sb_offset != BCH_NVM_PAGES_SB_OFFSET) {
+		pr_info("invalid superblock offset\n");
+		goto free_ns;
+	}
+
+	if (ns->sb.total_namespaces_nr != 1) {
+		pr_info("only one nvm device\n");
+		goto free_ns;
+	}
+
+	expected_csum = csum_set(&ns->sb);
+	if (expected_csum != ns->sb.csum) {
+		pr_info("csum is not match with expected one\n");
+		goto free_ns;
+	}
+
+	err = -EOPNOTSUPP;
+	if (!bdev_dax_supported(bdev, ns->sb.page_size)) {
+		pr_info("%s don't support DAX\n", bdevname(bdev, buf));
+		goto free_ns;
+	}
+
+	err = -EINVAL;
+	if (bdev_dax_pgoff(bdev, 0, ns->sb.page_size, &pgoff)) {
+		pr_info("invalid offset of %s\n", bdevname(bdev, buf));
+		goto free_ns;
+	}
+
+	err = -ENOMEM;
+	ns->dax_dev = fs_dax_get_by_bdev(bdev);
+	if (!ns->dax_dev) {
+		pr_info("can't by dax device by %s\n", bdevname(bdev, buf));
+		goto free_ns;
+	}
+
+	err = -EINVAL;
+	id = dax_read_lock();
+	if (dax_direct_access(ns->dax_dev, pgoff, ns->sb.pages_total,
+			      &ns->kaddr, &ns->start_pfn) <= 0) {
+		pr_info("dax_direct_access error\n");
+		dax_read_unlock(id);
+		goto free_ns;
+	}
+	dax_read_unlock(id);
+
+
+	err = -EEXIST;
+	if (!attach_nvm_set(ns))
+		goto free_ns;
+
+	ns->page_size = ns->sb.page_size;
+	ns->pages_offset = ns->sb.pages_offset;
+	ns->pages_total = ns->sb.pages_total;
+	ns->free = 0;
+	ns->bdev = bdev;
+	ns->nvm_set = only_set;
+
+	mutex_init(&ns->lock);
+
+	if (ns->sb.this_namespace_nr == 0) {
+		pr_info("only first namespace contain owner info\n");
+		err = init_owner_info(ns);
+		if (err < 0) {
+			pr_info("init_owner_info met error %d\n", err);
+			goto free_ns;
+		}
+	}
+
+	kfree(path);
+	return ns;
+free_ns:
+	kfree(ns);
+bdput:
+	blkdev_put(bdev, FMODE_READ|FMODE_WRITE|FMODE_EXEC);
+	kfree(path);
+	return ERR_PTR(err);
+}
+EXPORT_SYMBOL_GPL(bch_register_namespace);
+
+int __init bch_nvm_init(void)
+{
+	only_set = kzalloc(sizeof(*only_set), GFP_KERNEL);
+	if (!only_set)
+		return -ENOMEM;
+
+	only_set->total_namespaces_nr = 0;
+	only_set->owner_lists = NULL;
+	only_set->nss = NULL;
+
+	mutex_init(&only_set->lock);
+
+	pr_info("bcache nvm init\n");
+	return 0;
+}
+
+void bch_nvm_exit(void)
+{
+	release_nvm_set(only_set);
+	pr_info("bcache nvm exit\n");
+}
+
+#endif
diff --git a/drivers/md/bcache/nvm-pages.h b/drivers/md/bcache/nvm-pages.h
new file mode 100644
index 000000000000..1b10b4b6db0f
--- /dev/null
+++ b/drivers/md/bcache/nvm-pages.h
@@ -0,0 +1,92 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef _BCACHE_NVM_PAGES_H
+#define _BCACHE_NVM_PAGES_H
+
+#include <linux/bcache-nvm.h>
+
+/*
+ * Bcache NVDIMM in memory data structures
+ */
+
+/*
+ * The following three structures in memory records which page(s) allocated
+ * to which owner. After reboot from power failure, they will be initialized
+ * based on nvm pages superblock in NVDIMM device.
+ */
+struct bch_extent {
+	void *kaddr;
+	u32 nr;
+	struct list_head list;
+};
+
+struct bch_nvm_alloced_recs {
+	u32  nr;
+	struct bch_nvm_namespace *ns;
+	struct list_head extent_head;
+};
+
+struct bch_owner_list {
+	u8  owner_uuid[16];
+	char label[BCH_NVM_PAGES_LABEL_SIZE];
+
+	struct bch_nvm_set *nvm_set;
+	struct bch_nvm_alloced_recs **alloced_recs;
+};
+
+struct bch_nvm_namespace {
+	struct bch_nvm_pages_sb sb;
+	void *kaddr;
+
+	u8 uuid[16];
+	u64 free;
+	u32 page_size;
+	u64 pages_offset;
+	u64 pages_total;
+	pfn_t start_pfn;
+
+	struct dax_device *dax_dev;
+	struct block_device *bdev;
+	struct bch_nvm_set *nvm_set;
+
+	struct mutex lock;
+};
+
+/*
+ * A set of namespaces. Currently only one set can be supported.
+ */
+struct bch_nvm_set {
+	u8 set_uuid[16];
+	u32 total_namespaces_nr;
+
+	u32 owner_list_size;
+	u32 owner_list_used;
+	struct bch_owner_list **owner_lists;
+
+	struct bch_nvm_namespace **nss;
+
+	struct mutex lock;
+};
+extern struct bch_nvm_set *only_set;
+
+#ifdef CONFIG_BCACHE_NVM_PAGES
+
+struct bch_nvm_namespace *bch_register_namespace(const char *dev_path);
+int bch_nvm_init(void);
+void bch_nvm_exit(void);
+
+#else
+
+static inline struct bch_nvm_namespace *bch_register_namespace(const char *dev_path)
+{
+	return NULL;
+}
+static inline int bch_nvm_init(void)
+{
+	return 0;
+}
+static inline void bch_nvm_exit(void) { }
+
+#endif /* CONFIG_BCACHE_NVM_PAGES */
+
+#endif /* _BCACHE_NVM_PAGES_H */
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 2047a9cccdb5..7fffb6ccfb0c 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -14,6 +14,7 @@
 #include "request.h"
 #include "writeback.h"
 #include "features.h"
+#include "nvm-pages.h"
 
 #include <linux/blkdev.h>
 #include <linux/debugfs.h>
@@ -2815,6 +2816,7 @@ static void bcache_exit(void)
 {
 	bch_debug_exit();
 	bch_request_exit();
+	bch_nvm_exit();
 	if (bcache_kobj)
 		kobject_put(bcache_kobj);
 	if (bcache_wq)
@@ -2894,6 +2896,7 @@ static int __init bcache_init(void)
 
 	bch_debug_init();
 	closure_debug_init();
+	bch_nvm_init();
 
 	bcache_is_reboot = false;
 
diff --git a/include/uapi/linux/bcache-nvm.h b/include/uapi/linux/bcache-nvm.h
index 61108bf2a63e..0a6dc4a6e470 100644
--- a/include/uapi/linux/bcache-nvm.h
+++ b/include/uapi/linux/bcache-nvm.h
@@ -99,13 +99,6 @@
 #define BCH_NVM_PAGES_SB_VERSION		0
 #define BCH_NVM_PAGES_SB_VERSION_MAX		0
 
-static const char bch_nvm_pages_magic[] = {
-	0x17, 0xbd, 0x53, 0x7f, 0x1b, 0x23, 0xd6, 0x83,
-	0x46, 0xa4, 0xf8, 0x28, 0x17, 0xda, 0xec, 0xa9 };
-static const char bch_nvm_pages_pgalloc_magic[] = {
-	0x39, 0x25, 0x3f, 0xf7, 0x27, 0x17, 0xd0, 0xb9,
-	0x10, 0xe6, 0xd2, 0xda, 0x38, 0x68, 0x26, 0xae };
-
 struct bch_pgalloc_rec {
 	__u32			pgoff;
 	__u32			nr;
-- 
2.17.1



* [RFC PATCH v6 3/7] bcache: initialization of the buddy
  2021-02-08 14:26 [RFC PATCH v6 0/7] nvm page allocator for bcache Qiaowei Ren
                   ` (2 preceding siblings ...)
  2021-02-08 14:26 ` [RFC PATCH v6 2/7] bcache: initialize the nvm pages allocator Qiaowei Ren
@ 2021-02-08 14:26 ` Qiaowei Ren
  2021-02-08 14:26 ` [RFC PATCH v6 4/7] bcache: bch_nvm_alloc_pages() " Qiaowei Ren
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Qiaowei Ren @ 2021-02-08 14:26 UTC (permalink / raw)
  To: Coly Li; +Cc: Qiaowei Ren, Jianpeng Ma, linux-bcache

The nvm pages allocator implements a simple buddy system to manage the
nvm address space. This patch initializes this buddy for a new
namespace.

The unit of alloc/free in the buddy is a page. DAX devices have their
own struct page (in DRAM or PMEM):

	struct {        /* ZONE_DEVICE pages */
		/** @pgmap: Points to the hosting device page map. */
		struct dev_pagemap *pgmap;
		void *zone_device_data;
		/*
		 * ZONE_DEVICE private pages are counted as being
		 * mapped so the next 3 words hold the mapping, index,
		 * and private fields from the source anonymous or
		 * page cache page while the page is migrated to device
		 * private memory.
		 * ZONE_DEVICE MEMORY_DEVICE_FS_DAX pages also
		 * use the mapping, index, and private fields when
		 * pmem backed DAX files are mapped.
		 */
	};

ZONE_DEVICE pages only use pgmap; the other 4 words [16/32 bytes] are
unused. So the second/third words are used as a 'struct list_head' to
link the page into the buddy free lists. The fourth word (normally
struct page::index) stores pgoff, the page offset within the dax
device. The fifth word (normally struct page::private) stores the
buddy order, and page_type is used to store the buddy flags.
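
To illustrate how init_nvm_free_space() carves a clear bitmap region
into the largest naturally aligned power-of-two blocks, here is a
standalone userspace sketch of the same arithmetic (illustration only;
the region values are made up):

	#include <stdio.h>

	#define BCH_MAX_ORDER 20

	int main(void)
	{
		unsigned long pgoff = 4096;	/* example: free region starts at page 4096 */
		unsigned long pages = 6000;	/* example: region holds 6000 free pages */

		while (pages) {
			int i;

			/* largest order that is aligned at pgoff and still fits */
			for (i = BCH_MAX_ORDER - 1; i > 0; i--)
				if ((pgoff % (1UL << i)) == 0 && pages >= (1UL << i))
					break;

			printf("free block: pgoff=%lu order=%d\n", pgoff, i);
			pgoff += 1UL << i;
			pages -= 1UL << i;
		}
		return 0;
	}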

Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
---
 drivers/md/bcache/nvm-pages.c | 75 ++++++++++++++++++++++++++++++++++-
 drivers/md/bcache/nvm-pages.h |  5 +++
 2 files changed, 78 insertions(+), 2 deletions(-)

diff --git a/drivers/md/bcache/nvm-pages.c b/drivers/md/bcache/nvm-pages.c
index 4fa8e2764773..7efb99c0fc07 100644
--- a/drivers/md/bcache/nvm-pages.c
+++ b/drivers/md/bcache/nvm-pages.c
@@ -93,6 +93,7 @@ static void release_nvm_namespaces(struct bch_nvm_set *nvm_set)
 	int i;
 
 	for (i = 0; i < nvm_set->total_namespaces_nr; i++) {
+		kvfree(nvm_set->nss[i]->pages_bitmap);
 		blkdev_put(nvm_set->nss[i]->bdev, FMODE_READ|FMODE_WRITE|FMODE_EXEC);
 		kfree(nvm_set->nss[i]);
 	}
@@ -112,6 +113,17 @@ static void *nvm_pgoff_to_vaddr(struct bch_nvm_namespace *ns, pgoff_t pgoff)
 	return ns->kaddr + (pgoff << PAGE_SHIFT);
 }
 
+static struct page *nvm_vaddr_to_page(struct bch_nvm_namespace *ns, void *addr)
+{
+	return virt_to_page(addr);
+}
+
+static inline void remove_owner_space(struct bch_nvm_namespace *ns,
+					pgoff_t pgoff, u32 nr)
+{
+	bitmap_set(ns->pages_bitmap, pgoff, nr);
+}
+
 static int init_owner_info(struct bch_nvm_namespace *ns)
 {
 	struct bch_owner_list_head *owner_list_head;
@@ -129,6 +141,8 @@ static int init_owner_info(struct bch_nvm_namespace *ns)
 	only_set->owner_list_size = owner_list_head->size;
 	only_set->owner_list_used = owner_list_head->used;
 
+	remove_owner_space(ns, 0, ns->pages_offset/ns->page_size);
+
 	for (i = 0; i < owner_list_head->used; i++) {
 		owner_head = &owner_list_head->heads[i];
 		owner_list = alloc_owner_list(owner_head->uuid, owner_head->label,
@@ -162,6 +176,8 @@ static int init_owner_info(struct bch_nvm_namespace *ns)
 
 			do {
 				struct bch_pgalloc_rec *rec;
+				int order;
+				struct page *page;
 
 				for (k = 0; k < nvm_pgalloc_recs->used; k++) {
 					rec = &nvm_pgalloc_recs->recs[k];
@@ -172,7 +188,17 @@ static int init_owner_info(struct bch_nvm_namespace *ns)
 					}
 					extent->kaddr = nvm_pgoff_to_vaddr(extents->ns, rec->pgoff);
 					extent->nr = rec->nr;
+					WARN_ON(!is_power_of_2(extent->nr));
+
+					/* init struct page: index/private */
+					order = ilog2(extent->nr);
+					page = nvm_vaddr_to_page(ns, extent->kaddr);
+					set_page_private(page, order);
+					page->index = rec->pgoff;
+
 					list_add_tail(&extent->list, &extents->extent_head);
+					/* remove already allocated space */
+					remove_owner_space(extents->ns, rec->pgoff, rec->nr);
 				}
 				extents->nr += nvm_pgalloc_recs->used;
 
@@ -197,6 +223,36 @@ static int init_owner_info(struct bch_nvm_namespace *ns)
 	return 0;
 }
 
+static void init_nvm_free_space(struct bch_nvm_namespace *ns)
+{
+	unsigned int start, end, i;
+	struct page *page;
+	long long pages;
+	pgoff_t pgoff_start;
+
+	bitmap_for_each_clear_region(ns->pages_bitmap, start, end, 0, ns->pages_total) {
+		pgoff_start = start;
+		pages = end - start;
+
+		while (pages) {
+			for (i = BCH_MAX_ORDER - 1; i > 0; i--) {
+				if ((pgoff_start % (1 << i) == 0) && (pages >= (1 << i)))
+					break;
+			}
+
+			page = nvm_vaddr_to_page(ns, nvm_pgoff_to_vaddr(ns, pgoff_start));
+			page->index = pgoff_start;
+			set_page_private(page, i);
+			__SetPageBuddy(page);
+			list_add((struct list_head *)&page->zone_device_data, &ns->free_area[i]);
+
+			pgoff_start += 1 << i;
+			pages -= 1 << i;
+		}
+	}
+
+}
+
 static bool attach_nvm_set(struct bch_nvm_namespace *ns)
 {
 	bool rc = true;
@@ -261,7 +317,7 @@ static int read_nvdimm_meta_super(struct block_device *bdev,
 struct bch_nvm_namespace *bch_register_namespace(const char *dev_path)
 {
 	struct bch_nvm_namespace *ns;
-	int err;
+	int i, err;
 	pgoff_t pgoff;
 	char buf[BDEVNAME_SIZE];
 	struct block_device *bdev;
@@ -357,6 +413,16 @@ struct bch_nvm_namespace *bch_register_namespace(const char *dev_path)
 	ns->bdev = bdev;
 	ns->nvm_set = only_set;
 
+	ns->pages_bitmap = kvcalloc(BITS_TO_LONGS(ns->pages_total),
+					sizeof(unsigned long), GFP_KERNEL);
+	if (!ns->pages_bitmap) {
+		err = -ENOMEM;
+		goto free_ns;
+	}
+
+	for (i = 0; i < BCH_MAX_ORDER; i++)
+		INIT_LIST_HEAD(&ns->free_area[i]);
+
 	mutex_init(&ns->lock);
 
 	if (ns->sb.this_namespace_nr == 0) {
@@ -364,12 +430,17 @@ struct bch_nvm_namespace *bch_register_namespace(const char *dev_path)
 		err = init_owner_info(ns);
 		if (err < 0) {
 			pr_info("init_owner_info met error %d\n", err);
-			goto free_ns;
+			goto free_bitmap;
 		}
+		/* init buddy allocator */
+		init_nvm_free_space(ns);
 	}
 
 	kfree(path);
 	return ns;
+
+free_bitmap:
+	kvfree(ns->pages_bitmap);
 free_ns:
 	kfree(ns);
 bdput:
diff --git a/drivers/md/bcache/nvm-pages.h b/drivers/md/bcache/nvm-pages.h
index 1b10b4b6db0f..ed3431daae06 100644
--- a/drivers/md/bcache/nvm-pages.h
+++ b/drivers/md/bcache/nvm-pages.h
@@ -34,6 +34,7 @@ struct bch_owner_list {
 	struct bch_nvm_alloced_recs **alloced_recs;
 };
 
+#define BCH_MAX_ORDER 20
 struct bch_nvm_namespace {
 	struct bch_nvm_pages_sb sb;
 	void *kaddr;
@@ -45,6 +46,10 @@ struct bch_nvm_namespace {
 	u64 pages_total;
 	pfn_t start_pfn;
 
+	unsigned long *pages_bitmap;
+	struct list_head free_area[BCH_MAX_ORDER];
+
+
 	struct dax_device *dax_dev;
 	struct block_device *bdev;
 	struct bch_nvm_set *nvm_set;
-- 
2.17.1



* [RFC PATCH v6 4/7] bcache: bch_nvm_alloc_pages() of the buddy
  2021-02-08 14:26 [RFC PATCH v6 0/7] nvm page allocator for bcache Qiaowei Ren
                   ` (3 preceding siblings ...)
  2021-02-08 14:26 ` [RFC PATCH v6 3/7] bcache: initialization of the buddy Qiaowei Ren
@ 2021-02-08 14:26 ` Qiaowei Ren
  2021-02-08 14:26 ` [RFC PATCH v6 5/7] bcache: bch_nvm_free_pages() " Qiaowei Ren
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Qiaowei Ren @ 2021-02-08 14:26 UTC (permalink / raw)
  To: Coly Li; +Cc: Qiaowei Ren, Jianpeng Ma, linux-bcache

This patch implements bch_nvm_alloc_pages() of the buddy allocator.
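
The allocation path takes the first free block whose order is at least
the requested order and splits it down, releasing the upper half as a
new free buddy at each step. A standalone sketch of the split
arithmetic (illustration only; the values are made up):

	#include <stdio.h>

	int main(void)
	{
		unsigned long index = 0;	/* example: free block found at pgoff 0 */
		int i = 5;			/* example: its order is 5 (32 pages) */
		int order = 2;			/* example: caller requested order 2 */

		while (i != order) {
			/* the upper half becomes a free buddy of order i - 1 */
			printf("free buddy: pgoff=%lu order=%d\n",
			       index + (1UL << (i - 1)), i - 1);
			i--;
		}
		printf("allocated: pgoff=%lu order=%d\n", index, order);
		return 0;
	}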

Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
---
 drivers/md/bcache/nvm-pages.c | 121 ++++++++++++++++++++++++++++++++++
 drivers/md/bcache/nvm-pages.h |   6 ++
 2 files changed, 127 insertions(+)

diff --git a/drivers/md/bcache/nvm-pages.c b/drivers/md/bcache/nvm-pages.c
index 7efb99c0fc07..0b992c17ce47 100644
--- a/drivers/md/bcache/nvm-pages.c
+++ b/drivers/md/bcache/nvm-pages.c
@@ -124,6 +124,127 @@ static inline void remove_owner_space(struct bch_nvm_namespace *ns,
 	bitmap_set(ns->pages_bitmap, pgoff, nr);
 }
 
+/* If not found, it will create if create == true */
+static struct bch_owner_list *find_owner_list(const char *owner_uuid, bool create)
+{
+	struct bch_owner_list *owner_list;
+	int i;
+
+	for (i = 0; i < only_set->owner_list_used; i++) {
+		if (!memcmp(owner_uuid, only_set->owner_lists[i]->owner_uuid, 16))
+			return only_set->owner_lists[i];
+	}
+
+	if (create) {
+		owner_list = alloc_owner_list(owner_uuid, NULL, only_set->total_namespaces_nr);
+		only_set->owner_lists[only_set->owner_list_used++] = owner_list;
+		return owner_list;
+	} else
+		return NULL;
+}
+
+static struct bch_nvm_alloced_recs *find_nvm_alloced_recs(struct bch_owner_list *owner_list,
+		struct bch_nvm_namespace *ns, bool create)
+{
+	int position = ns->sb.this_namespace_nr;
+
+	if (create && !owner_list->alloced_recs[position]) {
+		struct bch_nvm_alloced_recs *alloced_recs =
+			kzalloc(sizeof(*alloced_recs), GFP_KERNEL|__GFP_NOFAIL);
+
+		alloced_recs->ns = ns;
+		INIT_LIST_HEAD(&alloced_recs->extent_head);
+		owner_list->alloced_recs[position] = alloced_recs;
+		return alloced_recs;
+	} else
+		return owner_list->alloced_recs[position];
+}
+
+static inline void *extent_end_addr(struct bch_extent *extent)
+{
+	return extent->kaddr + ((u64)(extent->nr) << PAGE_SHIFT);
+}
+
+static void add_extent(struct bch_nvm_alloced_recs *alloced_recs, void *addr, int order)
+{
+	struct list_head *list = alloced_recs->extent_head.next;
+	struct bch_extent *extent, *tmp;
+	void *end_addr = addr + (((u64)1 << order) << PAGE_SHIFT);
+
+	while (list != &alloced_recs->extent_head) {
+		extent = container_of(list, struct bch_extent, list);
+		if (addr > extent->kaddr) {
+			list = list->next;
+			continue;
+		}
+		break;
+	}
+
+	extent = kzalloc(sizeof(*extent), GFP_KERNEL);
+	extent->kaddr = addr;
+	extent->nr = 1 << order;
+	list_add_tail(&extent->list, list);
+	alloced_recs->nr++;
+}
+
+void *bch_nvm_alloc_pages(int order, const char *owner_uuid)
+{
+	void *kaddr = NULL;
+	struct bch_owner_list *owner_list;
+	struct bch_nvm_alloced_recs *alloced_recs;
+	int i, j;
+
+	mutex_lock(&only_set->lock);
+	owner_list = find_owner_list(owner_uuid, true);
+
+	for (j = 0; j < only_set->total_namespaces_nr; j++) {
+		struct bch_nvm_namespace *ns = only_set->nss[j];
+
+		if (!ns || (ns->free < (1 << order)))
+			continue;
+
+		for (i = order; i < BCH_MAX_ORDER; i++) {
+			struct list_head *list;
+			struct page *page, *buddy_page;
+
+			if (list_empty(&ns->free_area[i]))
+				continue;
+
+			list = ns->free_area[i].next;
+			page = container_of((void *)list, struct page, zone_device_data);
+
+			list_del(list);
+
+			while (i != order) {
+				buddy_page = nvm_vaddr_to_page(ns,
+					nvm_pgoff_to_vaddr(ns, page->index + (1 << (i - 1))));
+				set_page_private(buddy_page, i - 1);
+				buddy_page->index = page->index + (1 << (i - 1));
+				__SetPageBuddy(buddy_page);
+				list_add((struct list_head *)&buddy_page->zone_device_data,
+					&ns->free_area[i - 1]);
+				i--;
+			}
+
+			set_page_private(page, order);
+			__ClearPageBuddy(page);
+			ns->free -= 1 << order;
+			kaddr = nvm_pgoff_to_vaddr(ns, page->index);
+			break;
+		}
+
+		if (i != BCH_MAX_ORDER) {
+			alloced_recs = find_nvm_alloced_recs(owner_list, ns, true);
+			add_extent(alloced_recs, kaddr, order);
+			break;
+		}
+	}
+
+	mutex_unlock(&only_set->lock);
+	return kaddr;
+}
+EXPORT_SYMBOL_GPL(bch_nvm_alloc_pages);
+
 static int init_owner_info(struct bch_nvm_namespace *ns)
 {
 	struct bch_owner_list_head *owner_list_head;
diff --git a/drivers/md/bcache/nvm-pages.h b/drivers/md/bcache/nvm-pages.h
index ed3431daae06..10157d993126 100644
--- a/drivers/md/bcache/nvm-pages.h
+++ b/drivers/md/bcache/nvm-pages.h
@@ -79,6 +79,7 @@ extern struct bch_nvm_set *only_set;
 struct bch_nvm_namespace *bch_register_namespace(const char *dev_path);
 int bch_nvm_init(void);
 void bch_nvm_exit(void);
+void *bch_nvm_alloc_pages(int order, const char *owner_uuid);
 
 #else
 
@@ -92,6 +93,11 @@ static inline int bch_nvm_init(void)
 }
 static inline void bch_nvm_exit(void) { }
 
+static inline void *bch_nvm_alloc_pages(int order, const char *owner_uuid)
+{
+	return NULL;
+}
+
 #endif /* CONFIG_BCACHE_NVM_PAGES */
 
 #endif /* _BCACHE_NVM_PAGES_H */
-- 
2.17.1



* [RFC PATCH v6 5/7] bcache: bch_nvm_free_pages() of the buddy
  2021-02-08 14:26 [RFC PATCH v6 0/7] nvm page allocator for bcache Qiaowei Ren
                   ` (4 preceding siblings ...)
  2021-02-08 14:26 ` [RFC PATCH v6 4/7] bcache: bch_nvm_alloc_pages() " Qiaowei Ren
@ 2021-02-08 14:26 ` Qiaowei Ren
  2021-02-08 14:26 ` [RFC PATCH v6 6/7] bcache: get allocated pages from specific owner Qiaowei Ren
  2021-02-08 14:26 ` [RFC PATCH v6 7/7] bcache: persist owner info when alloc/free pages Qiaowei Ren
  7 siblings, 0 replies; 11+ messages in thread
From: Qiaowei Ren @ 2021-02-08 14:26 UTC (permalink / raw)
  To: Coly Li; +Cc: Qiaowei Ren, Jianpeng Ma, linux-bcache

This patch implements bch_nvm_free_pages() of the buddy allocator.
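
The free path merges the freed block with its buddy whenever the buddy
is also free and of the same order. The buddy's page offset is found by
flipping the order bit, and the merged parent's by clearing it. A
standalone sketch of that arithmetic (illustration only; the values are
made up):

	#include <stdio.h>

	int main(void)
	{
		unsigned long pgoff = 12;	/* example: freeing an order-2 block at page 12 */
		int order = 2;

		unsigned long buddy  = pgoff ^ (1UL << order);	/* 12 ^ 4 == 8 */
		unsigned long parent = pgoff & ~(1UL << order);	/* 12 & ~4 == 8 */

		printf("buddy=%lu, merged parent=%lu at order %d\n",
		       buddy, parent, order + 1);
		return 0;
	}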

Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
---
 drivers/md/bcache/nvm-pages.c | 143 ++++++++++++++++++++++++++++++++--
 drivers/md/bcache/nvm-pages.h |   3 +
 2 files changed, 138 insertions(+), 8 deletions(-)

diff --git a/drivers/md/bcache/nvm-pages.c b/drivers/md/bcache/nvm-pages.c
index 0b992c17ce47..b40bdbac873f 100644
--- a/drivers/md/bcache/nvm-pages.c
+++ b/drivers/md/bcache/nvm-pages.c
@@ -168,8 +168,7 @@ static inline void *extent_end_addr(struct bch_extent *extent)
 static void add_extent(struct bch_nvm_alloced_recs *alloced_recs, void *addr, int order)
 {
 	struct list_head *list = alloced_recs->extent_head.next;
-	struct bch_extent *extent, *tmp;
-	void *end_addr = addr + (((u64)1 << order) << PAGE_SHIFT);
+	struct bch_extent *extent;
 
 	while (list != &alloced_recs->extent_head) {
 		extent = container_of(list, struct bch_extent, list);
@@ -187,6 +186,136 @@ static void add_extent(struct bch_nvm_alloced_recs *alloced_recs, void *addr, in
 	alloced_recs->nr++;
 }
 
+static inline void *nvm_end_addr(struct bch_nvm_namespace *ns)
+{
+	return ns->kaddr + (ns->pages_total << PAGE_SHIFT);
+}
+
+static inline bool in_nvm_range(struct bch_nvm_namespace *ns,
+		void *start_addr, void *end_addr)
+{
+	return (start_addr >= ns->kaddr) && (end_addr <= nvm_end_addr(ns));
+}
+
+static struct bch_nvm_namespace *find_nvm_by_addr(void *addr, int order)
+{
+	int i;
+	struct bch_nvm_namespace *ns;
+
+	for (i = 0; i < only_set->total_namespaces_nr; i++) {
+		ns = only_set->nss[i];
+		if (ns && in_nvm_range(ns, addr, addr + (((u64)1 << order) << PAGE_SHIFT)))
+			return ns;
+	}
+	return NULL;
+}
+
+static int remove_extent(struct bch_nvm_alloced_recs *alloced_recs, void *addr, int order)
+{
+	struct list_head *list = alloced_recs->extent_head.next;
+	struct bch_extent *extent;
+
+	while (list != &alloced_recs->extent_head) {
+		extent = container_of(list, struct bch_extent, list);
+
+		if (addr < extent->kaddr)
+			return -ENOENT;
+		if (addr > extent->kaddr) {
+			list = list->next;
+			continue;
+		}
+
+		WARN_ON(extent->nr != (1 << order));
+		list_del(list);
+		kfree(extent);
+		alloced_recs->nr--;
+		break;
+	}
+	return (list == &alloced_recs->extent_head) ? -ENOENT : 0;
+}
+
+static void __free_space(struct bch_nvm_namespace *ns, void *addr, int order)
+{
+	unsigned int add_pages = (1 << order);
+	pgoff_t pgoff;
+	struct page *page;
+
+	page = nvm_vaddr_to_page(ns, addr);
+	WARN_ON((!page) || (page->private != order));
+	pgoff = page->index;
+
+	while (order < BCH_MAX_ORDER - 1) {
+		struct page *buddy_page;
+
+		pgoff_t buddy_pgoff = pgoff ^ (1 << order);
+		pgoff_t parent_pgoff = pgoff & ~(1 << order);
+
+		if ((parent_pgoff + (1 << (order + 1)) > ns->pages_total))
+			break;
+
+		buddy_page = nvm_vaddr_to_page(ns, nvm_pgoff_to_vaddr(ns, buddy_pgoff));
+		WARN_ON(!buddy_page);
+
+		if (PageBuddy(buddy_page) && (buddy_page->private == order)) {
+			list_del((struct list_head *)&buddy_page->zone_device_data);
+			__ClearPageBuddy(buddy_page);
+			pgoff = parent_pgoff;
+			order++;
+			continue;
+		}
+		break;
+	}
+
+	page = nvm_vaddr_to_page(ns, nvm_pgoff_to_vaddr(ns, pgoff));
+	WARN_ON(!page);
+	list_add((struct list_head *)&page->zone_device_data, &ns->free_area[order]);
+	page->index = pgoff;
+	set_page_private(page, order);
+	__SetPageBuddy(page);
+	ns->free += add_pages;
+}
+
+void bch_nvm_free_pages(void *addr, int order, const char *owner_uuid)
+{
+	struct bch_nvm_namespace *ns;
+	struct bch_owner_list *owner_list;
+	struct bch_nvm_alloced_recs *alloced_recs;
+	int r;
+
+	mutex_lock(&only_set->lock);
+
+	ns = find_nvm_by_addr(addr, order);
+	if (!ns) {
+		pr_info("can't find nvm_dev by kaddr %p\n", addr);
+		goto unlock;
+	}
+
+	owner_list = find_owner_list(owner_uuid, false);
+	if (!owner_list) {
+		pr_info("can't found owner(uuid=%s)\n", owner_uuid);
+		goto unlock;
+	}
+
+	alloced_recs = find_nvm_alloced_recs(owner_list, ns, false);
+	if (!alloced_recs) {
+		pr_info("can't find alloced_recs(uuid=%s)\n", ns->uuid);
+		goto unlock;
+	}
+
+	r = remove_extent(alloced_recs, addr, order);
+	if (r < 0) {
+		pr_info("can't find extent\n");
+		goto unlock;
+	}
+
+	__free_space(ns, addr, order);
+
+unlock:
+	mutex_unlock(&only_set->lock);
+}
+EXPORT_SYMBOL_GPL(bch_nvm_free_pages);
+
+
 void *bch_nvm_alloc_pages(int order, const char *owner_uuid)
 {
 	void *kaddr = NULL;
@@ -276,7 +405,6 @@ static int init_owner_info(struct bch_nvm_namespace *ns)
 		for (j = 0; j < only_set->total_namespaces_nr; j++) {
 			if (!only_set->nss[j] || !owner_head->recs[j])
 				continue;
-
 			nvm_pgalloc_recs = (struct bch_nvm_pgalloc_recs *)
 					((long)owner_head->recs[j] + ns->kaddr);
 			if (memcmp(nvm_pgalloc_recs->magic, bch_nvm_pages_pgalloc_magic, 16)) {
@@ -348,7 +476,7 @@ static void init_nvm_free_space(struct bch_nvm_namespace *ns)
 {
 	unsigned int start, end, i;
 	struct page *page;
-	long long pages;
+	u64 pages;
 	pgoff_t pgoff_start;
 
 	bitmap_for_each_clear_region(ns->pages_bitmap, start, end, 0, ns->pages_total) {
@@ -364,9 +492,8 @@ static void init_nvm_free_space(struct bch_nvm_namespace *ns)
 			page = nvm_vaddr_to_page(ns, nvm_pgoff_to_vaddr(ns, pgoff_start));
 			page->index = pgoff_start;
 			set_page_private(page, i);
-			__SetPageBuddy(page);
-			list_add((struct list_head *)&page->zone_device_data, &ns->free_area[i]);
-
+			/* in order to update ns->free */
+			__free_space(ns, nvm_pgoff_to_vaddr(ns, pgoff_start), i);
 			pgoff_start += 1 << i;
 			pages -= 1 << i;
 		}
@@ -530,7 +657,7 @@ struct bch_nvm_namespace *bch_register_namespace(const char *dev_path)
 	ns->page_size = ns->sb.page_size;
 	ns->pages_offset = ns->sb.pages_offset;
 	ns->pages_total = ns->sb.pages_total;
-	ns->free = 0;
+	ns->free = 0; /* increased by __free_space() */
 	ns->bdev = bdev;
 	ns->nvm_set = only_set;
 
diff --git a/drivers/md/bcache/nvm-pages.h b/drivers/md/bcache/nvm-pages.h
index 10157d993126..1bc3129f2482 100644
--- a/drivers/md/bcache/nvm-pages.h
+++ b/drivers/md/bcache/nvm-pages.h
@@ -80,6 +80,7 @@ struct bch_nvm_namespace *bch_register_namespace(const char *dev_path);
 int bch_nvm_init(void);
 void bch_nvm_exit(void);
 void *bch_nvm_alloc_pages(int order, const char *owner_uuid);
+void bch_nvm_free_pages(void *addr, int order, const char *owner_uuid);
 
 #else
 
@@ -98,6 +99,8 @@ static inline void *bch_nvm_alloc_pages(int order, const char *owner_uuid)
 	return NULL;
 }
 
+static inline void bch_nvm_free_pages(void *addr, int order, const char *owner_uuid) { }
+
 #endif /* CONFIG_BCACHE_NVM_PAGES */
 
 #endif /* _BCACHE_NVM_PAGES_H */
-- 
2.17.1



* [RFC PATCH v6 6/7] bcache: get allocated pages from specific owner
  2021-02-08 14:26 [RFC PATCH v6 0/7] nvm page allocator for bcache Qiaowei Ren
                   ` (5 preceding siblings ...)
  2021-02-08 14:26 ` [RFC PATCH v6 5/7] bcache: bch_nvm_free_pages() " Qiaowei Ren
@ 2021-02-08 14:26 ` Qiaowei Ren
  2021-02-08 14:26 ` [RFC PATCH v6 7/7] bcache: persist owner info when alloc/free pages Qiaowei Ren
  7 siblings, 0 replies; 11+ messages in thread
From: Qiaowei Ren @ 2021-02-08 14:26 UTC (permalink / raw)
  To: Coly Li; +Cc: Qiaowei Ren, Jianpeng Ma, linux-bcache

This patch implements bch_get_allocated_pages() of the buddy, which is
used to get the allocated pages of a specific owner.
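
A caller-side sketch of walking the returned extents (illustration
only; the returned head is itself the first extent, with the remaining
copies linked behind it, and demo_uuid is a hypothetical owner uuid):

	struct bch_extent *head, *e;

	head = bch_get_allocated_pages(demo_uuid);
	if (head) {
		pr_info("extent: kaddr=%p nr=%u\n", head->kaddr, head->nr);
		list_for_each_entry(e, &head->list, list)
			pr_info("extent: kaddr=%p nr=%u\n", e->kaddr, e->nr);
	}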

Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
---
 drivers/md/bcache/nvm-pages.c | 39 +++++++++++++++++++++++++++++++++++
 drivers/md/bcache/nvm-pages.h |  6 ++++++
 2 files changed, 45 insertions(+)

diff --git a/drivers/md/bcache/nvm-pages.c b/drivers/md/bcache/nvm-pages.c
index b40bdbac873f..2b079a277e88 100644
--- a/drivers/md/bcache/nvm-pages.c
+++ b/drivers/md/bcache/nvm-pages.c
@@ -374,6 +374,45 @@ void *bch_nvm_alloc_pages(int order, const char *owner_uuid)
 }
 EXPORT_SYMBOL_GPL(bch_nvm_alloc_pages);
 
+struct bch_extent *bch_get_allocated_pages(const char *owner_uuid)
+{
+	struct bch_owner_list *owner_list = find_owner_list(owner_uuid, false);
+	struct bch_nvm_alloced_recs *alloced_recs;
+	struct bch_extent *head = NULL, *e, *tmp;
+	int i;
+
+	if (!owner_list)
+		return NULL;
+
+	for (i = 0; i < only_set->total_namespaces_nr; i++) {
+		struct list_head *l;
+
+		alloced_recs = owner_list->alloced_recs[i];
+
+		if (!alloced_recs || alloced_recs->nr == 0)
+			continue;
+
+		l = alloced_recs->extent_head.next;
+		while (l != &alloced_recs->extent_head) {
+			e = container_of(l, struct bch_extent, list);
+			tmp = kzalloc(sizeof(*tmp), GFP_KERNEL|__GFP_NOFAIL);
+
+			INIT_LIST_HEAD(&tmp->list);
+			tmp->kaddr = e->kaddr;
+			tmp->nr = e->nr;
+
+			if (head)
+				list_add_tail(&tmp->list, &head->list);
+			else
+				head = tmp;
+
+			l = l->next;
+		}
+	}
+	return head;
+}
+EXPORT_SYMBOL_GPL(bch_get_allocated_pages);
+
 static int init_owner_info(struct bch_nvm_namespace *ns)
 {
 	struct bch_owner_list_head *owner_list_head;
diff --git a/drivers/md/bcache/nvm-pages.h b/drivers/md/bcache/nvm-pages.h
index 1bc3129f2482..8ffae11c7c61 100644
--- a/drivers/md/bcache/nvm-pages.h
+++ b/drivers/md/bcache/nvm-pages.h
@@ -81,6 +81,7 @@ int bch_nvm_init(void);
 void bch_nvm_exit(void);
 void *bch_nvm_alloc_pages(int order, const char *owner_uuid);
 void bch_nvm_free_pages(void *addr, int order, const char *owner_uuid);
+struct bch_extent *bch_get_allocated_pages(const char *owner_uuid);
 
 #else
 
@@ -101,6 +102,11 @@ static inline void *bch_nvm_alloc_pages(int order, const char *owner_uuid)
 
 static inline void bch_nvm_free_pages(void *addr, int order, const char *owner_uuid) { }
 
+static inline struct bch_extent *bch_get_allocated_pages(const char *owner_uuid)
+{
+	return NULL;
+}
+
 #endif /* CONFIG_BCACHE_NVM_PAGES */
 
 #endif /* _BCACHE_NVM_PAGES_H */
-- 
2.17.1



* [RFC PATCH v6 7/7] bcache: persist owner info when alloc/free pages.
  2021-02-08 14:26 [RFC PATCH v6 0/7] nvm page allocator for bcache Qiaowei Ren
                   ` (6 preceding siblings ...)
  2021-02-08 14:26 ` [RFC PATCH v6 6/7] bcache: get allocated pages from specific owner Qiaowei Ren
@ 2021-02-08 14:26 ` Qiaowei Ren
  7 siblings, 0 replies; 11+ messages in thread
From: Qiaowei Ren @ 2021-02-08 14:26 UTC (permalink / raw)
  To: Coly Li; +Cc: Qiaowei Ren, Jianpeng Ma, linux-bcache

This patch implements persisting the owner info to the nvdimm device
when pages are allocated or freed.
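
On media, the ->recs[] and ->next pointers hold namespace-relative byte
offsets rather than virtual addresses: write_owner_info() stores
offsets, and init_owner_info() adds ns->kaddr back when reading them.
Hypothetical helpers showing the two views (illustration only, not part
of the patch):

	/* persisted view: byte offset relative to the namespace start */
	static u64 vaddr_to_nvm_off(struct bch_nvm_namespace *ns, void *addr)
	{
		return (u64)(addr - ns->kaddr);
	}

	/* in-memory view: rebuilt when the owner info is read after reboot */
	static void *nvm_off_to_vaddr(struct bch_nvm_namespace *ns, u64 off)
	{
		return ns->kaddr + off;
	}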

Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
---
 drivers/md/bcache/nvm-pages.c | 93 ++++++++++++++++++++++++++++++++++-
 1 file changed, 92 insertions(+), 1 deletion(-)

diff --git a/drivers/md/bcache/nvm-pages.c b/drivers/md/bcache/nvm-pages.c
index 2b079a277e88..c350dcd696dd 100644
--- a/drivers/md/bcache/nvm-pages.c
+++ b/drivers/md/bcache/nvm-pages.c
@@ -210,6 +210,19 @@ static struct bch_nvm_namespace *find_nvm_by_addr(void *addr, int order)
 	return NULL;
 }
 
+static void init_pgalloc_recs(struct bch_nvm_pgalloc_recs *recs, const char *owner_uuid)
+{
+	memset(recs, 0, sizeof(struct bch_nvm_pgalloc_recs));
+	memcpy(recs->magic, bch_nvm_pages_pgalloc_magic, 16);
+	memcpy(recs->owner_uuid, owner_uuid, 16);
+	recs->size = BCH_MAX_RECS;
+}
+
+static pgoff_t vaddr_to_nvm_pgoff(struct bch_nvm_namespace *ns, void *kaddr)
+{
+	return (kaddr - ns->kaddr) / PAGE_SIZE;
+}
+
 static int remove_extent(struct bch_nvm_alloced_recs *alloced_recs, void *addr, int order)
 {
 	struct list_head *list = alloced_recs->extent_head.next;
@@ -234,6 +247,82 @@ static int remove_extent(struct bch_nvm_alloced_recs *alloced_recs, void *addr,
 	return (list == &alloced_recs->extent_head) ? -ENOENT : 0;
 }
 
+#define BCH_RECS_LEN (sizeof(struct bch_nvm_pgalloc_recs))
+
+static void write_owner_info(void)
+{
+	struct bch_owner_list *owner_list;
+	struct bch_nvm_pgalloc_recs *recs;
+	struct bch_nvm_namespace *ns = only_set->nss[0];
+	struct bch_owner_list_head *owner_list_head;
+	struct bch_nvm_pages_owner_head *owner_head;
+	u64 recs_pos = BCH_NVM_PAGES_SYS_RECS_HEAD_OFFSET;
+	struct list_head *list;
+	int i, j;
+
+	owner_list_head = kzalloc(sizeof(*owner_list_head), GFP_KERNEL);
+	recs = kmalloc(sizeof(*recs), GFP_KERNEL);
+	if (!owner_list_head || !recs) {
+		pr_info("can't alloc memory\n");
+		goto free_resource;
+	}
+
+	owner_list_head->size = BCH_MAX_OWNER_LIST;
+	WARN_ON(only_set->owner_list_used > owner_list_head->size);
+
+	/* an in-memory owner may not contain any allocated pages */
+	for (i = 0; i < only_set->owner_list_used; i++) {
+		owner_head = &owner_list_head->heads[i];
+		owner_list = only_set->owner_lists[i];
+
+		memcpy(owner_head->uuid, owner_list->owner_uuid, 16);
+
+		for (j = 0; j < only_set->total_namespaces_nr; j++) {
+			struct bch_nvm_alloced_recs *extents = owner_list->alloced_recs[j];
+
+			if (!extents || !extents->nr)
+				continue;
+
+			init_pgalloc_recs(recs, owner_list->owner_uuid);
+
+			BUG_ON(recs_pos >= BCH_NVM_PAGES_OFFSET);
+			owner_head->recs[j] = (struct bch_nvm_pgalloc_recs *)(uintptr_t)recs_pos;
+
+			for (list = extents->extent_head.next;
+				list != &extents->extent_head;
+				list = list->next) {
+				struct bch_extent *extent;
+
+				extent = container_of(list, struct bch_extent, list);
+
+				if (recs->used == recs->size) {
+					BUG_ON(recs_pos >= BCH_NVM_PAGES_OFFSET);
+					recs->next = (struct bch_nvm_pgalloc_recs *)
+							(uintptr_t)(recs_pos + BCH_RECS_LEN);
+					memcpy_flushcache(ns->kaddr + recs_pos, recs, BCH_RECS_LEN);
+					init_pgalloc_recs(recs, owner_list->owner_uuid);
+					recs_pos += BCH_RECS_LEN;
+				}
+
+				recs->recs[recs->used].pgoff =
+					vaddr_to_nvm_pgoff(only_set->nss[j], extent->kaddr);
+				recs->recs[recs->used].nr = extent->nr;
+				recs->used++;
+			}
+
+			memcpy_flushcache(ns->kaddr + recs_pos, recs, BCH_RECS_LEN);
+			recs_pos += sizeof(struct bch_nvm_pgalloc_recs);
+		}
+	}
+
+	owner_list_head->used = only_set->owner_list_used;
+	memcpy_flushcache(ns->kaddr + BCH_NVM_PAGES_OWNER_LIST_HEAD_OFFSET,
+			 (void *)owner_list_head, sizeof(struct bch_owner_list_head));
+free_resource:
+	kfree(owner_list_head);
+	kfree(recs);
+}
+
 static void __free_space(struct bch_nvm_namespace *ns, void *addr, int order)
 {
 	unsigned int add_pages = (1 << order);
@@ -309,6 +398,7 @@ void bch_nvm_free_pages(void *addr, int order, const char *owner_uuid)
 	}
 
 	__free_space(ns, addr, order);
+	write_owner_info();
 
 unlock:
 	mutex_unlock(&only_set->lock);
@@ -368,7 +458,8 @@ void *bch_nvm_alloc_pages(int order, const char *owner_uuid)
 			break;
 		}
 	}
-
+	if (kaddr)
+		write_owner_info();
 	mutex_unlock(&only_set->lock);
 	return kaddr;
 }
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread
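
To make the on-media layout written by write_owner_info() above
concrete, here is a recovery-time walk over the persisted records. This
is a sketch, not code from the series; it relies only on what the patch
itself stores (owner_head->recs[j] and recs->next hold byte offsets into
namespace 0, cast through uintptr_t), and the record field types are
assumed:

	/*
	 * Walk one owner's persisted allocation records in namespace j.
	 * Stored "pointers" are byte offsets, so rebase them onto
	 * ns->kaddr before dereferencing; a zero offset ends the chain
	 * because init_pgalloc_recs() zeroes the next field.
	 */
	static void walk_owner_recs(struct bch_nvm_namespace *ns,
				    struct bch_nvm_pages_owner_head *head,
				    int j)
	{
		u64 off = (u64)(uintptr_t)head->recs[j];

		while (off) {
			struct bch_nvm_pgalloc_recs *recs = ns->kaddr + off;
			int k;

			for (k = 0; k < recs->used; k++)
				pr_info("pgoff %lu, %u page(s)\n",
					(unsigned long)recs->recs[k].pgoff,
					recs->recs[k].nr);
			off = (u64)(uintptr_t)recs->next;
		}
	}

Note that write_owner_info() regenerates the whole record area from
BCH_NVM_PAGES_SYS_RECS_HEAD_OFFSET on every allocation and free, under
only_set->lock, so the persisted state is always a full snapshot rather
than an incremental log.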

* RE: [RFC PATCH v6 0/7] nvm page allocator for bcache
  2021-02-08 13:49 ` Coly Li
@ 2021-02-09  2:30   ` Ren, Qiaowei
  2021-02-09  3:26     ` Coly Li
  0 siblings, 1 reply; 11+ messages in thread
From: Ren, Qiaowei @ 2021-02-09  2:30 UTC (permalink / raw)
  To: Coly Li, Ma, Jianpeng; +Cc: linux-bcache


> -----Original Message-----
> From: Coly Li <colyli@suse.de>
> Sent: Monday, February 8, 2021 9:50 PM
> To: Ren, Qiaowei <qiaowei.ren@intel.com>; Ma, Jianpeng
> <jianpeng.ma@intel.com>
> Cc: linux-bcache@vger.kernel.org
> Subject: Re: [RFC PATCH v6 0/7] nvm page allocator for bcache
> 
> On 2/8/21 10:26 PM, Qiaowei Ren wrote:
> > This series implements nvm pages allocator for bcache. This idea is
> > from one discussion about nvdimm use case in kernel together with
> > Coly. Coly sent the following email about this idea to give some
> > introduction on what we will do before:
> >
> > https://lore.kernel.org/linux-bcache/bc7e71ec-97eb-b226-d4fc-d8b64c1ef
> > 41a@suse.de/
> >
> > Here this series focus on the first step in above email, that is to
> > say, this patch set implements a generic framework in bcache to
> > allocate/release NV-memory pages, and provide allocated pages for each
> requestor after reboot.
> > In order to do this, one simple buddy system is implemented to manage
> > NV-memory pages.
> >
> > This set includes one testing module which can be used for simple test
> cases.
> > Next need to stroe bcache log or internal btree nodes into nvdimm
> > based on these buddy apis to do more testing.
> >
> > Qiaowei Ren (7):
> >   bcache: add initial data structures for nvm pages
> >   bcache: initialize the nvm pages allocator
> >   bcache: initialization of the buddy
> >   bcache: bch_nvm_alloc_pages() of the buddy
> >   bcache: bch_nvm_free_pages() of the buddy
> >   bcache: get allocated pages from specific owner
> >   bcache: persist owner info when alloc/free pages.
> 
> I test the V6 patch set, it works with current bcache part change. Sorry for
> not response for the previous series in time on list, but thank you all to fix
> the known issues in previous version.
> 
> Although the series is still marked as RFC patches, but IMHO they are in good
> shape for an EXPERIMENTAL series.
> 
> I will have them with my other bcache changes in the v5.12 for-next, and it is
> so far so good in my smoking testing.
> 
> There is one thing I feel should be clarified from you, I see some patches the
> author and the first signed-off-by person is not identical.
> Please make the first SOB people to be the same one in the From/Author
> field. And I guess maybe most of the work are done by both of you, if this is
> true, the second author can use a Co-authored-by: tag after the first Signed-
> off-by: person.
> 
Yes, that is true, but the From/Author field should be Jianpeng. Thanks.

> The v6 series is under testing now, so it is unnecessary to post one more
> version for the above changes. I'd like to change them from my side if you
> may provide me some hints.
> 
> Thanks for the contribution, the tiny NVDIMM pages allcoator works.
> 
> Coly Li

^ permalink raw reply	[flat|nested] 11+ messages in thread
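
For reference, the authorship fix-up being discussed would look roughly
like this in each affected patch (a sketch of the tag ordering Coly
describes; note that current mainline documentation spells this tag
Co-developed-by:, while Co-authored-by: is the spelling used in this
thread):

	From: Jianpeng Ma <jianpeng.ma@intel.com>

	Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
	Co-authored-by: Qiaowei Ren <qiaowei.ren@intel.com>
	Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>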

* Re: [RFC PATCH v6 0/7] nvm page allocator for bcache
  2021-02-09  2:30   ` Ren, Qiaowei
@ 2021-02-09  3:26     ` Coly Li
  0 siblings, 0 replies; 11+ messages in thread
From: Coly Li @ 2021-02-09  3:26 UTC (permalink / raw)
  To: Ren, Qiaowei, Ma, Jianpeng; +Cc: linux-bcache

On 2/9/21 10:30 AM, Ren, Qiaowei wrote:
> 
>> -----Original Message-----
>> From: Coly Li <colyli@suse.de>
>> Sent: Monday, February 8, 2021 9:50 PM
>> To: Ren, Qiaowei <qiaowei.ren@intel.com>; Ma, Jianpeng
>> <jianpeng.ma@intel.com>
>> Cc: linux-bcache@vger.kernel.org
>> Subject: Re: [RFC PATCH v6 0/7] nvm page allocator for bcache
>>
>> On 2/8/21 10:26 PM, Qiaowei Ren wrote:
>>> This series implements nvm pages allocator for bcache. This idea is
>>> from one discussion about nvdimm use case in kernel together with
>>> Coly. Coly sent the following email about this idea to give some
>>> introduction on what we will do before:
>>>
>>> https://lore.kernel.org/linux-bcache/bc7e71ec-97eb-b226-d4fc-d8b64c1ef
>>> 41a@suse.de/
>>>
>>> Here this series focus on the first step in above email, that is to
>>> say, this patch set implements a generic framework in bcache to
>>> allocate/release NV-memory pages, and provide allocated pages for each
>> requestor after reboot.
>>> In order to do this, one simple buddy system is implemented to manage
>>> NV-memory pages.
>>>
>>> This set includes one testing module which can be used for simple test
>> cases.
>>> Next need to stroe bcache log or internal btree nodes into nvdimm
>>> based on these buddy apis to do more testing.
>>>
>>> Qiaowei Ren (7):
>>>   bcache: add initial data structures for nvm pages
>>>   bcache: initialize the nvm pages allocator
>>>   bcache: initialization of the buddy
>>>   bcache: bch_nvm_alloc_pages() of the buddy
>>>   bcache: bch_nvm_free_pages() of the buddy
>>>   bcache: get allocated pages from specific owner
>>>   bcache: persist owner info when alloc/free pages.
>>
>> I test the V6 patch set, it works with current bcache part change. Sorry for
>> not response for the previous series in time on list, but thank you all to fix
>> the known issues in previous version.
>>
>> Although the series is still marked as RFC patches, but IMHO they are in good
>> shape for an EXPERIMENTAL series.
>>
>> I will have them with my other bcache changes in the v5.12 for-next, and it is
>> so far so good in my smoking testing.
>>
>> There is one thing I feel should be clarified from you, I see some patches the
>> author and the first signed-off-by person is not identical.
>> Please make the first SOB people to be the same one in the From/Author
>> field. And I guess maybe most of the work are done by both of you, if this is
>> true, the second author can use a Co-authored-by: tag after the first Signed-
>> off-by: person.
>>
> Yes, it is true, but the From/Author field should be Jianpeng. Thanks.

OK, I will modify it from my side; you don't need to post one more
version for this.

[snipped]

Thanks.

Coly Li

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2021-02-09  3:30 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-08 14:26 [RFC PATCH v6 0/7] nvm page allocator for bcache Qiaowei Ren
2021-02-08 13:49 ` Coly Li
2021-02-09  2:30   ` Ren, Qiaowei
2021-02-09  3:26     ` Coly Li
2021-02-08 14:26 ` [RFC PATCH v6 1/7] bcache: add initial data structures for nvm pages Qiaowei Ren
2021-02-08 14:26 ` [RFC PATCH v6 2/7] bcache: initialize the nvm pages allocator Qiaowei Ren
2021-02-08 14:26 ` [RFC PATCH v6 3/7] bcache: initialization of the buddy Qiaowei Ren
2021-02-08 14:26 ` [RFC PATCH v6 4/7] bcache: bch_nvm_alloc_pages() " Qiaowei Ren
2021-02-08 14:26 ` [RFC PATCH v6 5/7] bcache: bch_nvm_free_pages() " Qiaowei Ren
2021-02-08 14:26 ` [RFC PATCH v6 6/7] bcache: get allocated pages from specific owner Qiaowei Ren
2021-02-08 14:26 ` [RFC PATCH v6 7/7] bcache: persist owner info when alloc/free pages Qiaowei Ren

This is a public inbox; see the mirroring instructions
for how to clone and mirror all data and code used for this inbox,
as well as URLs for NNTP newsgroup(s).