* [PATCH v7 0/3] erofs-utils: optimize buffer allocation logic
       [not found] <20210122171153.27404-1-hsiangkao.ref@aol.com>
@ 2021-01-22 17:11 ` Gao Xiang via Linux-erofs
  2021-01-22 17:11   ` [PATCH v7 1/3] erofs-utils: get rid of `end' argument from erofs_mapbh() Gao Xiang via Linux-erofs
                     ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Gao Xiang via Linux-erofs @ 2021-01-22 17:11 UTC (permalink / raw)
  To: linux-erofs

Hi all,

This series introduces bucket lists for mapped buffer blocks to speed up
buffer allocation and buffer mapping. Thanks to Weiwen for the
contribution!

changes since v6:
 - introduce erofs_bfind_for_attach to clean up erofs_balloc();
 - use a new formula mentioned by Jianan to calculate used_before;
 - only DBG_BUGON() in debug builds when __erofs_battach() returns < 0.

Thanks,
Gao Xiang  

Gao Xiang (1):
  erofs-utils: introduce erofs_bfind_for_attach()

Hu Weiwen (2):
  erofs-utils: get rid of `end' argument from erofs_mapbh()
  erofs-utils: optimize buffer allocation logic

 include/erofs/cache.h |   3 +-
 lib/cache.c           | 186 ++++++++++++++++++++++++++++++++----------
 lib/compress.c        |   2 +-
 lib/inode.c           |  10 +--
 lib/xattr.c           |   2 +-
 mkfs/main.c           |   2 +-
 6 files changed, 151 insertions(+), 54 deletions(-)

-- 
2.24.0



* [PATCH v7 1/3] erofs-utils: get rid of `end' argument from erofs_mapbh()
  2021-01-22 17:11 ` [PATCH v7 0/3] erofs-utils: optimize buffer allocation logic Gao Xiang via Linux-erofs
@ 2021-01-22 17:11   ` Gao Xiang via Linux-erofs
  2021-02-06 15:28     ` Li GuiFu via Linux-erofs
  2021-01-22 17:11   ` [PATCH v7 2/3] erofs-utils: introduce erofs_bfind_for_attach() Gao Xiang via Linux-erofs
  2021-01-22 17:11   ` [PATCH v7 3/3] erofs-utils: optimize buffer allocation logic Gao Xiang via Linux-erofs
  2 siblings, 1 reply; 7+ messages in thread
From: Gao Xiang via Linux-erofs @ 2021-01-22 17:11 UTC (permalink / raw)
  To: linux-erofs

From: Hu Weiwen <sehuww@mail.scut.edu.cn>

The `end` argument is completely broken now, so get rid of it.
It could be reintroduced later if needed.

Signed-off-by: Hu Weiwen <sehuww@mail.scut.edu.cn>
Signed-off-by: Gao Xiang <hsiangkao@aol.com>
---
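Note (illustration only, not part of the patch): a minimal sketch of the
semantics left after dropping `end`, assuming a toy array in place of the
real blkh list. The names toy_bb, toy_tail and toy_mapbh are hypothetical;
only the "map everything up to and including bb, or everything when bb is
NULL" behaviour is taken from the diff below.

```c
#include <assert.h>
#include <stddef.h>

#define NULL_ADDR ((unsigned int)-1)

/* toy stand-in for struct erofs_buffer_block (hypothetical) */
struct toy_bb {
	unsigned int blkaddr;	/* NULL_ADDR until the block is mapped */
	unsigned int nblocks;	/* size of the buffer block in fs blocks */
};

static unsigned int toy_tail;	/* next free block address */

/* map every block from the front of v[] up to and including bb */
static unsigned int toy_mapbh(struct toy_bb *v, size_t n, struct toy_bb *bb)
{
	size_t i;

	/* bb, if given, must be an element of v[] */
	assert(!bb || (bb >= v && bb < v + n));

	if (!bb || bb->blkaddr == NULL_ADDR) {
		for (i = 0; i < n; i++) {
			if (v[i].blkaddr == NULL_ADDR) {
				v[i].blkaddr = toy_tail;
				toy_tail += v[i].nblocks;
			}
			/* stop after the requested block has been mapped */
			if (&v[i] == bb)
				break;
		}
	}
	return toy_tail;
}
```

With three unmapped blocks of 1, 2 and 1 fs blocks, toy_mapbh(v, 3, &v[1])
maps the first two and leaves the third untouched; passing NULL maps the
rest.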
 include/erofs/cache.h |  2 +-
 lib/cache.c           |  6 ++----
 lib/compress.c        |  2 +-
 lib/inode.c           | 10 +++++-----
 lib/xattr.c           |  2 +-
 mkfs/main.c           |  2 +-
 6 files changed, 11 insertions(+), 13 deletions(-)

diff --git a/include/erofs/cache.h b/include/erofs/cache.h
index 8c171f5a130e..f8dff67b9736 100644
--- a/include/erofs/cache.h
+++ b/include/erofs/cache.h
@@ -95,7 +95,7 @@ struct erofs_buffer_head *erofs_balloc(int type, erofs_off_t size,
 struct erofs_buffer_head *erofs_battach(struct erofs_buffer_head *bh,
 					int type, unsigned int size);
 
-erofs_blk_t erofs_mapbh(struct erofs_buffer_block *bb, bool end);
+erofs_blk_t erofs_mapbh(struct erofs_buffer_block *bb);
 bool erofs_bflush(struct erofs_buffer_block *bb);
 
 void erofs_bdrop(struct erofs_buffer_head *bh, bool tryrevoke);
diff --git a/lib/cache.c b/lib/cache.c
index 0d5c4a5d48de..32a58311f563 100644
--- a/lib/cache.c
+++ b/lib/cache.c
@@ -257,16 +257,14 @@ static erofs_blk_t __erofs_mapbh(struct erofs_buffer_block *bb)
 	return blkaddr;
 }
 
-erofs_blk_t erofs_mapbh(struct erofs_buffer_block *bb, bool end)
+erofs_blk_t erofs_mapbh(struct erofs_buffer_block *bb)
 {
 	struct erofs_buffer_block *t, *nt;
 
 	if (!bb || bb->blkaddr == NULL_ADDR) {
 		list_for_each_entry_safe(t, nt, &blkh.list, list) {
-			if (!end && (t == bb || nt == &blkh))
-				break;
 			(void)__erofs_mapbh(t);
-			if (end && t == bb)
+			if (t == bb)
 				break;
 		}
 	}
diff --git a/lib/compress.c b/lib/compress.c
index 86db940b6edd..2b1f93c389ff 100644
--- a/lib/compress.c
+++ b/lib/compress.c
@@ -416,7 +416,7 @@ int erofs_write_compressed_file(struct erofs_inode *inode)
 
 	memset(compressmeta, 0, Z_EROFS_LEGACY_MAP_HEADER_SIZE);
 
-	blkaddr = erofs_mapbh(bh->block, true);	/* start_blkaddr */
+	blkaddr = erofs_mapbh(bh->block);	/* start_blkaddr */
 	ctx.blkaddr = blkaddr;
 	ctx.metacur = compressmeta + Z_EROFS_LEGACY_MAP_HEADER_SIZE;
 	ctx.head = ctx.tail = 0;
diff --git a/lib/inode.c b/lib/inode.c
index 640068f4147c..ee0afacd4b40 100644
--- a/lib/inode.c
+++ b/lib/inode.c
@@ -148,7 +148,7 @@ static int __allocate_inode_bh_data(struct erofs_inode *inode,
 	inode->bh_data = bh;
 
 	/* get blkaddr of the bh */
-	ret = erofs_mapbh(bh->block, true);
+	ret = erofs_mapbh(bh->block);
 	DBG_BUGON(ret < 0);
 
 	/* write blocks except for the tail-end block */
@@ -522,7 +522,7 @@ int erofs_prepare_tail_block(struct erofs_inode *inode)
 		bh->op = &erofs_skip_write_bhops;
 
 		/* get blkaddr of bh */
-		ret = erofs_mapbh(bh->block, true);
+		ret = erofs_mapbh(bh->block);
 		DBG_BUGON(ret < 0);
 		inode->u.i_blkaddr = bh->block->blkaddr;
 
@@ -632,7 +632,7 @@ int erofs_write_tail_end(struct erofs_inode *inode)
 		int ret;
 		erofs_off_t pos;
 
-		erofs_mapbh(bh->block, true);
+		erofs_mapbh(bh->block);
 		pos = erofs_btell(bh, true) - EROFS_BLKSIZ;
 		ret = dev_write(inode->idata, pos, inode->idata_size);
 		if (ret)
@@ -881,7 +881,7 @@ void erofs_fixup_meta_blkaddr(struct erofs_inode *rootdir)
 	struct erofs_buffer_head *const bh = rootdir->bh;
 	erofs_off_t off, meta_offset;
 
-	erofs_mapbh(bh->block, true);
+	erofs_mapbh(bh->block);
 	off = erofs_btell(bh, false);
 
 	if (off > rootnid_maxoffset)
@@ -900,7 +900,7 @@ erofs_nid_t erofs_lookupnid(struct erofs_inode *inode)
 	if (!bh)
 		return inode->nid;
 
-	erofs_mapbh(bh->block, true);
+	erofs_mapbh(bh->block);
 	off = erofs_btell(bh, false);
 
 	meta_offset = blknr_to_addr(sbi.meta_blkaddr);
diff --git a/lib/xattr.c b/lib/xattr.c
index 49ebb9c2f539..8b7bcb126fe9 100644
--- a/lib/xattr.c
+++ b/lib/xattr.c
@@ -575,7 +575,7 @@ int erofs_build_shared_xattrs_from_path(const char *path)
 	}
 	bh->op = &erofs_skip_write_bhops;
 
-	erofs_mapbh(bh->block, true);
+	erofs_mapbh(bh->block);
 	off = erofs_btell(bh, false);
 
 	sbi.xattr_blkaddr = off / EROFS_BLKSIZ;
diff --git a/mkfs/main.c b/mkfs/main.c
index abd48be0fa4f..d9c4c7fff5c1 100644
--- a/mkfs/main.c
+++ b/mkfs/main.c
@@ -304,7 +304,7 @@ int erofs_mkfs_update_super_block(struct erofs_buffer_head *bh,
 		round_up(EROFS_SUPER_END, EROFS_BLKSIZ);
 	char *buf;
 
-	*blocks         = erofs_mapbh(NULL, true);
+	*blocks         = erofs_mapbh(NULL);
 	sb.blocks       = cpu_to_le32(*blocks);
 	sb.root_nid     = cpu_to_le16(root_nid);
 	memcpy(sb.uuid, sbi.uuid, sizeof(sb.uuid));
-- 
2.24.0



* [PATCH v7 2/3] erofs-utils: introduce erofs_bfind_for_attach()
  2021-01-22 17:11 ` [PATCH v7 0/3] erofs-utils: optimize buffer allocation logic Gao Xiang via Linux-erofs
  2021-01-22 17:11   ` [PATCH v7 1/3] erofs-utils: get rid of `end' argument from erofs_mapbh() Gao Xiang via Linux-erofs
@ 2021-01-22 17:11   ` Gao Xiang via Linux-erofs
  2021-02-06 15:29     ` Li GuiFu via Linux-erofs
  2021-01-22 17:11   ` [PATCH v7 3/3] erofs-utils: optimize buffer allocation logic Gao Xiang via Linux-erofs
  2 siblings, 1 reply; 7+ messages in thread
From: Gao Xiang via Linux-erofs @ 2021-01-22 17:11 UTC (permalink / raw)
  To: linux-erofs

From: Gao Xiang <hsiangkao@aol.com>

Separate erofs_bfind_for_attach() from erofs_balloc() to make the logic
clearer.

Cc: Hu Weiwen <sehuww@mail.scut.edu.cn>
Signed-off-by: Gao Xiang <hsiangkao@aol.com>
---
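Note (illustration only, not part of the patch): the split follows a common
C pattern where the finder returns an error code and hands back its result
through an out-parameter, with NULL meaning "nothing reusable, allocate a
new block". A rough sketch under simplified assumptions (a plain array
instead of blkh.list, no alignment or extent handling; toy_bb,
toy_find_for_attach and TOY_BLKSIZ are hypothetical names):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_BLKSIZ 4096u

struct toy_bb {
	unsigned int off;	/* bytes already used in this buffer block */
};

/*
 * Return 0 and store the block to reuse in *bbp (NULL if none fits),
 * or -ENOSPC if the request can never fit into one fs block.
 */
static int toy_find_for_attach(struct toy_bb *v, size_t n,
			       unsigned int size, struct toy_bb **bbp)
{
	struct toy_bb *best = NULL;
	unsigned int usedmax = 0;
	size_t i;

	assert(bbp);
	if (size > TOY_BLKSIZ)
		return -ENOSPC;

	for (i = 0; i < n; i++) {
		unsigned int used_before = v[i].off % TOY_BLKSIZ;
		unsigned int used = used_before + size;

		/* skip if the buffer block is just full */
		if (!used_before)
			continue;
		/* skip if the data would cross a block boundary */
		if (used > TOY_BLKSIZ)
			continue;
		/* prefer the candidate that ends up fullest */
		if (used > usedmax) {
			usedmax = used;
			best = &v[i];
		}
	}
	*bbp = best;
	return 0;
}
```

The caller can then keep a single "if (bb) ... else ..." branch instead of
a goto, which is the shape erofs_balloc() takes in the diff below.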
 lib/cache.c | 81 +++++++++++++++++++++++++++++++++--------------------
 1 file changed, 50 insertions(+), 31 deletions(-)

diff --git a/lib/cache.c b/lib/cache.c
index 32a58311f563..f02413d0f887 100644
--- a/lib/cache.c
+++ b/lib/cache.c
@@ -125,25 +125,25 @@ int erofs_bh_balloon(struct erofs_buffer_head *bh, erofs_off_t incr)
 	return __erofs_battach(bb, NULL, incr, 1, 0, false);
 }
 
-struct erofs_buffer_head *erofs_balloc(int type, erofs_off_t size,
-				       unsigned int required_ext,
-				       unsigned int inline_ext)
+static int erofs_bfind_for_attach(int type, erofs_off_t size,
+				  unsigned int required_ext,
+				  unsigned int inline_ext,
+				  unsigned int alignsize,
+				  struct erofs_buffer_block **bbp)
 {
 	struct erofs_buffer_block *cur, *bb;
-	struct erofs_buffer_head *bh;
-	unsigned int alignsize, used0, usedmax;
-
-	int ret = get_alignsize(type, &type);
-
-	if (ret < 0)
-		return ERR_PTR(ret);
-	alignsize = ret;
+	unsigned int used0, usedmax;
 
 	used0 = (size + required_ext) % EROFS_BLKSIZ + inline_ext;
+	/* inline data should be in the same fs block */
+	if (used0 > EROFS_BLKSIZ)
+		return -ENOSPC;
+
 	usedmax = 0;
 	bb = NULL;
 
 	list_for_each_entry(cur, &blkh.list, list) {
+		int ret;
 		unsigned int used_before, used;
 
 		used_before = cur->buffers.off % EROFS_BLKSIZ;
@@ -179,34 +179,53 @@ struct erofs_buffer_head *erofs_balloc(int type, erofs_off_t size,
 			usedmax = used;
 		}
 	}
+	*bbp = bb;
+	return 0;
+}
+
+struct erofs_buffer_head *erofs_balloc(int type, erofs_off_t size,
+				       unsigned int required_ext,
+				       unsigned int inline_ext)
+{
+	struct erofs_buffer_block *bb;
+	struct erofs_buffer_head *bh;
+	unsigned int alignsize;
+
+	int ret = get_alignsize(type, &type);
+
+	if (ret < 0)
+		return ERR_PTR(ret);
+	alignsize = ret;
+
+	/* try to find if we could reuse an allocated buffer block */
+	ret = erofs_bfind_for_attach(type, size, required_ext, inline_ext,
+				     alignsize, &bb);
+	if (ret)
+		return ERR_PTR(ret);
 
 	if (bb) {
 		bh = malloc(sizeof(struct erofs_buffer_head));
 		if (!bh)
 			return ERR_PTR(-ENOMEM);
-		goto found;
-	}
-
-	/* allocate a new buffer block */
-	if (used0 > EROFS_BLKSIZ)
-		return ERR_PTR(-ENOSPC);
-
-	bb = malloc(sizeof(struct erofs_buffer_block));
-	if (!bb)
-		return ERR_PTR(-ENOMEM);
+	} else {
+		/* get a new buffer block instead */
+		bb = malloc(sizeof(struct erofs_buffer_block));
+		if (!bb)
+			return ERR_PTR(-ENOMEM);
 
-	bb->type = type;
-	bb->blkaddr = NULL_ADDR;
-	bb->buffers.off = 0;
-	init_list_head(&bb->buffers.list);
-	list_add_tail(&bb->list, &blkh.list);
+		bb->type = type;
+		bb->blkaddr = NULL_ADDR;
+		bb->buffers.off = 0;
+		init_list_head(&bb->buffers.list);
+		list_add_tail(&bb->list, &blkh.list);
 
-	bh = malloc(sizeof(struct erofs_buffer_head));
-	if (!bh) {
-		free(bb);
-		return ERR_PTR(-ENOMEM);
+		bh = malloc(sizeof(struct erofs_buffer_head));
+		if (!bh) {
+			free(bb);
+			return ERR_PTR(-ENOMEM);
+		}
 	}
-found:
+
 	ret = __erofs_battach(bb, bh, size, alignsize,
 			      required_ext + inline_ext, false);
 	if (ret < 0)
-- 
2.24.0



* [PATCH v7 3/3] erofs-utils: optimize buffer allocation logic
  2021-01-22 17:11 ` [PATCH v7 0/3] erofs-utils: optimize buffer allocation logic Gao Xiang via Linux-erofs
  2021-01-22 17:11   ` [PATCH v7 1/3] erofs-utils: get rid of `end' argument from erofs_mapbh() Gao Xiang via Linux-erofs
  2021-01-22 17:11   ` [PATCH v7 2/3] erofs-utils: introduce erofs_bfind_for_attach() Gao Xiang via Linux-erofs
@ 2021-01-22 17:11   ` Gao Xiang via Linux-erofs
  2021-02-06 15:29     ` Li GuiFu via Linux-erofs
  2 siblings, 1 reply; 7+ messages in thread
From: Gao Xiang via Linux-erofs @ 2021-01-22 17:11 UTC (permalink / raw)
  To: linux-erofs

From: Hu Weiwen <sehuww@mail.scut.edu.cn>

When using EROFS to pack our dataset, which consists of millions of
files, mkfs.erofs is very slow compared with mksquashfs.

The bottleneck is the `erofs_balloc' and `erofs_mapbh' functions, which
iterate over all previously allocated buffer blocks, making the
complexity of the algorithm O(N^2), where N is the number of files.

With this patch:

* a global `last_mapped_block' is maintained to avoid a full scan in the
`erofs_mapbh' function.

* a global `mapped_buckets' maintains a list of already mapped buffer
blocks for each type and for each possible number of bytes used in the
last EROFS_BLKSIZ. It is then used to identify the most suitable block in
future `erofs_balloc' calls, avoiding a full scan. Note that not-mapped
(and the last mapped) blocks can be expanded, so we deal with them
separately.
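The bucket lookup can be sketched roughly as follows. This is a simplified
illustration rather than the code in this patch: it keeps a counter per
bucket instead of a list, ignores the block type, and the names toy_bucket,
toy_bucket_add and toy_bucket_find are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_BLKSIZ 4096u

/* toy_bucket[u]: number of mapped blocks whose last fs block has u bytes used */
static unsigned int toy_bucket[TOY_BLKSIZ];

static void toy_bucket_add(unsigned int off)
{
	toy_bucket[off % TOY_BLKSIZ]++;
}

/*
 * Return the tail usage of the fullest mapped block that can still take
 * `need' more bytes at the given alignment, or -1 if there is none.
 */
static int toy_bucket_find(unsigned int need, unsigned int alignsize)
{
	unsigned int u;

	assert(alignsize > 0);
	if (need >= TOY_BLKSIZ)
		return -1;

	/* largest aligned start offset that still leaves `need' bytes */
	u = (TOY_BLKSIZ - need) / alignsize * alignsize;

	/* walk down towards emptier blocks; bucket 0 (full blocks) is skipped */
	for (; u > 0; u--)
		if (toy_bucket[u])
			return (int)u;
	return -1;
}
```

The walk visits at most EROFS_BLKSIZ buckets no matter how many buffer
blocks exist, which is what removes the per-allocation full scan.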

When I tested it with the ImageNet dataset (1.33M files, 147GiB), it took
about 4 hours. Most of the time was spent on I/O.

Cc: Huang Jianan <jnhuang95@gmail.com>
Signed-off-by: Hu Weiwen <sehuww@mail.scut.edu.cn>
Signed-off-by: Gao Xiang <hsiangkao@aol.com>
---
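Note (illustration only, not part of the patch): the `last_mapped_block'
idea amounts to keeping a cursor so that each buffer block is visited once
over the whole run. A minimal sketch, assuming an array plus an index in
place of the real list and pointer; toy_bb, toy_mapper and toy_mapbh are
hypothetical names, and the `steps' counter exists only to make the cost
visible.

```c
#include <assert.h>
#include <stddef.h>

#define NULL_ADDR ((unsigned int)-1)

struct toy_bb {
	unsigned int blkaddr;	/* NULL_ADDR until the block is mapped */
	unsigned int nblocks;	/* size of the buffer block in fs blocks */
};

struct toy_mapper {
	struct toy_bb *v;
	size_t n;
	size_t next;		/* index of the first not-yet-mapped block */
	unsigned int tail;	/* next free block address */
	unsigned long steps;	/* blocks visited so far */
};

static unsigned int toy_mapbh(struct toy_mapper *m, struct toy_bb *bb)
{
	/* already mapped: nothing to walk */
	if (bb && bb->blkaddr != NULL_ADDR)
		return bb->blkaddr;

	/* resume right after the last mapped block instead of the list head */
	while (m->next < m->n) {
		struct toy_bb *t = &m->v[m->next++];

		assert(t->blkaddr == NULL_ADDR);
		t->blkaddr = m->tail;
		m->tail += t->nblocks;
		m->steps++;
		if (t == bb)
			break;
	}
	return m->tail;
}
```

Across any sequence of calls, `steps' never exceeds the number of blocks,
so the total mapping cost is linear rather than quadratic.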
 include/erofs/cache.h |   1 +
 lib/cache.c           | 105 ++++++++++++++++++++++++++++++++++++------
 2 files changed, 93 insertions(+), 13 deletions(-)

diff --git a/include/erofs/cache.h b/include/erofs/cache.h
index f8dff67b9736..611ca5b8432b 100644
--- a/include/erofs/cache.h
+++ b/include/erofs/cache.h
@@ -39,6 +39,7 @@ struct erofs_buffer_head {
 
 struct erofs_buffer_block {
 	struct list_head list;
+	struct list_head mapped_list;
 
 	erofs_blk_t blkaddr;
 	int type;
diff --git a/lib/cache.c b/lib/cache.c
index f02413d0f887..40d3b1f3f4d5 100644
--- a/lib/cache.c
+++ b/lib/cache.c
@@ -18,6 +18,11 @@ static struct erofs_buffer_block blkh = {
 };
 static erofs_blk_t tail_blkaddr;
 
+/* buckets for all mapped buffer blocks to boost up allocation */
+static struct list_head mapped_buckets[META + 1][EROFS_BLKSIZ];
+/* last mapped buffer block to accelerate erofs_mapbh() */
+static struct erofs_buffer_block *last_mapped_block = &blkh;
+
 static bool erofs_bh_flush_drop_directly(struct erofs_buffer_head *bh)
 {
 	return erofs_bh_flush_generic_end(bh);
@@ -62,15 +67,32 @@ struct erofs_bhops erofs_buf_write_bhops = {
 /* return buffer_head of erofs super block (with size 0) */
 struct erofs_buffer_head *erofs_buffer_init(void)
 {
+	int i, j;
 	struct erofs_buffer_head *bh = erofs_balloc(META, 0, 0, 0);
 
 	if (IS_ERR(bh))
 		return bh;
 
 	bh->op = &erofs_skip_write_bhops;
+
+	for (i = 0; i < ARRAY_SIZE(mapped_buckets); i++)
+		for (j = 0; j < ARRAY_SIZE(mapped_buckets[0]); j++)
+			init_list_head(&mapped_buckets[i][j]);
 	return bh;
 }
 
+static void erofs_bupdate_mapped(struct erofs_buffer_block *bb)
+{
+	struct list_head *bkt;
+
+	if (bb->blkaddr == NULL_ADDR)
+		return;
+
+	bkt = mapped_buckets[bb->type] + bb->buffers.off % EROFS_BLKSIZ;
+	list_del(&bb->mapped_list);
+	list_add_tail(&bb->mapped_list, bkt);
+}
+
 /* return occupied bytes in specific buffer block if succeed */
 static int __erofs_battach(struct erofs_buffer_block *bb,
 			   struct erofs_buffer_head *bh,
@@ -110,6 +132,7 @@ static int __erofs_battach(struct erofs_buffer_block *bb,
 		/* need to update the tail_blkaddr */
 		if (tailupdate)
 			tail_blkaddr = blkaddr + BLK_ROUND_UP(bb->buffers.off);
+		erofs_bupdate_mapped(bb);
 	}
 	return (alignedoffset + incr) % EROFS_BLKSIZ;
 }
@@ -132,20 +155,62 @@ static int erofs_bfind_for_attach(int type, erofs_off_t size,
 				  struct erofs_buffer_block **bbp)
 {
 	struct erofs_buffer_block *cur, *bb;
-	unsigned int used0, usedmax;
+	unsigned int used0, usedmax, used;
+	int used_before, ret;
 
 	used0 = (size + required_ext) % EROFS_BLKSIZ + inline_ext;
 	/* inline data should be in the same fs block */
 	if (used0 > EROFS_BLKSIZ)
 		return -ENOSPC;
 
+	if (!used0 || alignsize == EROFS_BLKSIZ) {
+		*bbp = NULL;
+		return 0;
+	}
+
 	usedmax = 0;
 	bb = NULL;
 
-	list_for_each_entry(cur, &blkh.list, list) {
-		int ret;
-		unsigned int used_before, used;
+	/* try to find a most-fit mapped buffer block first */
+	if (size + required_ext + inline_ext >= EROFS_BLKSIZ)
+		goto skip_mapped;
+
+	used_before = rounddown(EROFS_BLKSIZ -
+				(size + required_ext + inline_ext), alignsize);
+	do {
+		struct list_head *bt = mapped_buckets[type] + used_before;
 
+		if (list_empty(bt))
+			continue;
+		cur = list_first_entry(bt, struct erofs_buffer_block,
+				       mapped_list);
+
+		/* last mapped block can be expanded, don't handle it here */
+		if (cur == last_mapped_block)
+			continue;
+
+		ret = __erofs_battach(cur, NULL, size, alignsize,
+				      required_ext + inline_ext, true);
+		if (ret < 0) {
+			DBG_BUGON(1);
+			continue;
+		}
+
+		/* should contain all data in the current block */
+		used = ret + required_ext + inline_ext;
+		DBG_BUGON(used > EROFS_BLKSIZ);
+
+		bb = cur;
+		usedmax = used;
+		break;
+	} while (--used_before > 0);
+
+skip_mapped:
+	/* try to start from the last mapped one, which can be expanded */
+	cur = last_mapped_block;
+	if (cur == &blkh)
+		cur = list_next_entry(cur, list);
+	for (; cur != &blkh; cur = list_next_entry(cur, list)) {
 		used_before = cur->buffers.off % EROFS_BLKSIZ;
 
 		/* skip if buffer block is just full */
@@ -195,6 +260,8 @@ struct erofs_buffer_head *erofs_balloc(int type, erofs_off_t size,
 
 	if (ret < 0)
 		return ERR_PTR(ret);
+
+	DBG_BUGON(type < 0 || type > META);
 	alignsize = ret;
 
 	/* try to find if we could reuse an allocated buffer block */
@@ -218,6 +285,7 @@ struct erofs_buffer_head *erofs_balloc(int type, erofs_off_t size,
 		bb->buffers.off = 0;
 		init_list_head(&bb->buffers.list);
 		list_add_tail(&bb->list, &blkh.list);
+		init_list_head(&bb->mapped_list);
 
 		bh = malloc(sizeof(struct erofs_buffer_head));
 		if (!bh) {
@@ -266,8 +334,11 @@ static erofs_blk_t __erofs_mapbh(struct erofs_buffer_block *bb)
 {
 	erofs_blk_t blkaddr;
 
-	if (bb->blkaddr == NULL_ADDR)
+	if (bb->blkaddr == NULL_ADDR) {
 		bb->blkaddr = tail_blkaddr;
+		last_mapped_block = bb;
+		erofs_bupdate_mapped(bb);
+	}
 
 	blkaddr = bb->blkaddr + BLK_ROUND_UP(bb->buffers.off);
 	if (blkaddr > tail_blkaddr)
@@ -278,15 +349,18 @@ static erofs_blk_t __erofs_mapbh(struct erofs_buffer_block *bb)
 
 erofs_blk_t erofs_mapbh(struct erofs_buffer_block *bb)
 {
-	struct erofs_buffer_block *t, *nt;
+	struct erofs_buffer_block *t = last_mapped_block;
 
-	if (!bb || bb->blkaddr == NULL_ADDR) {
-		list_for_each_entry_safe(t, nt, &blkh.list, list) {
-			(void)__erofs_mapbh(t);
-			if (t == bb)
-				break;
-		}
-	}
+	if (bb && bb->blkaddr != NULL_ADDR)
+		return bb->blkaddr;
+	do {
+		t = list_next_entry(t, list);
+		if (t == &blkh)
+			break;
+
+		DBG_BUGON(t->blkaddr != NULL_ADDR);
+		(void)__erofs_mapbh(t);
+	} while (t != bb);
 	return tail_blkaddr;
 }
 
@@ -328,6 +402,7 @@ bool erofs_bflush(struct erofs_buffer_block *bb)
 
 		erofs_dbg("block %u to %u flushed", p->blkaddr, blkaddr - 1);
 
+		list_del(&p->mapped_list);
 		list_del(&p->list);
 		free(p);
 	}
@@ -351,6 +426,10 @@ void erofs_bdrop(struct erofs_buffer_head *bh, bool tryrevoke)
 	if (!list_empty(&bb->buffers.list))
 		return;
 
+	if (bb == last_mapped_block)
+		last_mapped_block = list_prev_entry(bb, list);
+
+	list_del(&bb->mapped_list);
 	list_del(&bb->list);
 	free(bb);
 
-- 
2.24.0



* Re: [PATCH v7 1/3] erofs-utils: get rid of `end' argument from erofs_mapbh()
  2021-01-22 17:11   ` [PATCH v7 1/3] erofs-utils: get rid of `end' argument from erofs_mapbh() Gao Xiang via Linux-erofs
@ 2021-02-06 15:28     ` Li GuiFu via Linux-erofs
  0 siblings, 0 replies; 7+ messages in thread
From: Li GuiFu via Linux-erofs @ 2021-02-06 15:28 UTC (permalink / raw)
  To: Gao Xiang, linux-erofs



On 2021/1/23 1:11, Gao Xiang via Linux-erofs wrote:
> From: Hu Weiwen <sehuww@mail.scut.edu.cn>
> 
> The `end` argument is completely broken now, so get rid of it.
> It could be reintroduced later if needed.
> 
> Signed-off-by: Hu Weiwen <sehuww@mail.scut.edu.cn>
> Signed-off-by: Gao Xiang <hsiangkao@aol.com>
> ---
>  include/erofs/cache.h |  2 +-
>  lib/cache.c           |  6 ++----
>  lib/compress.c        |  2 +-
>  lib/inode.c           | 10 +++++-----
>  lib/xattr.c           |  2 +-
>  mkfs/main.c           |  2 +-
>  6 files changed, 11 insertions(+), 13 deletions(-)
> 

It looks good
Reviewed-by: Li Guifu <bluce.lee@aliyun.com>

Thanks,


* Re: [PATCH v7 2/3] erofs-utils: introduce erofs_bfind_for_attach()
  2021-01-22 17:11   ` [PATCH v7 2/3] erofs-utils: introduce erofs_bfind_for_attach() Gao Xiang via Linux-erofs
@ 2021-02-06 15:29     ` Li GuiFu via Linux-erofs
  0 siblings, 0 replies; 7+ messages in thread
From: Li GuiFu via Linux-erofs @ 2021-02-06 15:29 UTC (permalink / raw)
  To: Gao Xiang, linux-erofs



On 2021/1/23 1:11, Gao Xiang via Linux-erofs wrote:
> From: Gao Xiang <hsiangkao@aol.com>
> 
> Separate erofs_bfind_for_attach() from erofs_balloc() to make the logic
> clearer.
> 
> Cc: Hu Weiwen <sehuww@mail.scut.edu.cn>
> Signed-off-by: Gao Xiang <hsiangkao@aol.com>
> ---
>  lib/cache.c | 81 +++++++++++++++++++++++++++++++++--------------------
>  1 file changed, 50 insertions(+), 31 deletions(-)
> 

It looks good
Reviewed-by: Li Guifu <bluce.lee@aliyun.com>

Thanks,


* Re: [PATCH v7 3/3] erofs-utils: optimize buffer allocation logic
  2021-01-22 17:11   ` [PATCH v7 3/3] erofs-utils: optimize buffer allocation logic Gao Xiang via Linux-erofs
@ 2021-02-06 15:29     ` Li GuiFu via Linux-erofs
  0 siblings, 0 replies; 7+ messages in thread
From: Li GuiFu via Linux-erofs @ 2021-02-06 15:29 UTC (permalink / raw)
  To: Gao Xiang, linux-erofs



On 2021/1/23 1:11, Gao Xiang via Linux-erofs wrote:
> From: Hu Weiwen <sehuww@mail.scut.edu.cn>
> 
> When using EROFS to pack our dataset, which consists of millions of
> files, mkfs.erofs is very slow compared with mksquashfs.
> 
> The bottleneck is the `erofs_balloc' and `erofs_mapbh' functions, which
> iterate over all previously allocated buffer blocks, making the
> complexity of the algorithm O(N^2), where N is the number of files.
> 
> With this patch:
> 
> * a global `last_mapped_block' is maintained to avoid a full scan in the
> `erofs_mapbh' function.
> 
> * a global `mapped_buckets' maintains a list of already mapped buffer
> blocks for each type and for each possible number of bytes used in the
> last EROFS_BLKSIZ. It is then used to identify the most suitable block in
> future `erofs_balloc' calls, avoiding a full scan. Note that not-mapped
> (and the last mapped) blocks can be expanded, so we deal with them
> separately.
> 
> When I tested it with the ImageNet dataset (1.33M files, 147GiB), it took
> about 4 hours. Most of the time was spent on I/O.
> 
> Cc: Huang Jianan <jnhuang95@gmail.com>
> Signed-off-by: Hu Weiwen <sehuww@mail.scut.edu.cn>
> Signed-off-by: Gao Xiang <hsiangkao@aol.com>
> ---
>  include/erofs/cache.h |   1 +
>  lib/cache.c           | 105 ++++++++++++++++++++++++++++++++++++------
>  2 files changed, 93 insertions(+), 13 deletions(-)
> 

It looks good
Reviewed-by: Li Guifu <bluce.lee@aliyun.com>

Thanks,


end of thread, other threads:[~2021-02-06 15:30 UTC | newest]

Thread overview: 7+ messages
     [not found] <20210122171153.27404-1-hsiangkao.ref@aol.com>
2021-01-22 17:11 ` [PATCH v7 0/3] erofs-utils: optimize buffer allocation logic Gao Xiang via Linux-erofs
2021-01-22 17:11   ` [PATCH v7 1/3] erofs-utils: get rid of `end' argument from erofs_mapbh() Gao Xiang via Linux-erofs
2021-02-06 15:28     ` Li GuiFu via Linux-erofs
2021-01-22 17:11   ` [PATCH v7 2/3] erofs-utils: introduce erofs_bfind_for_attach() Gao Xiang via Linux-erofs
2021-02-06 15:29     ` Li GuiFu via Linux-erofs
2021-01-22 17:11   ` [PATCH v7 3/3] erofs-utils: optimize buffer allocation logic Gao Xiang via Linux-erofs
2021-02-06 15:29     ` Li GuiFu via Linux-erofs
