All of lore.kernel.org
* [PATCH for-4.19 00/12] erofs fixes for linux-4.19.y
@ 2019-02-20  9:18 ` Gao Xiang
  0 siblings, 0 replies; 68+ messages in thread
From: Gao Xiang @ 2019-02-20  9:18 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, Chao Yu, linux-erofs, miaoxie, Gao Xiang

This series backports bugfixes which are already merged in upstream
Linux. We found these issues in our commercial products; they are
serious and should be fixed immediately.

Note that it also includes some xarray modifications, since the
upcoming patches depend heavily on them; taking them now reduces
conflicts later.

All patches have been tested again as a whole.

Thanks,
Gao Xiang

Chen Gong (1):
  staging: erofs: replace BUG_ON with DBG_BUGON in data.c

Gao Xiang (11):
  staging: erofs: fix a bug when applying cache strategy
  staging: erofs: complete error handling of z_erofs_do_read_page
  staging: erofs: drop multiref support temporarily
  staging: erofs: remove the redundant d_rehash() for the root dentry
  staging: erofs: fix race when the managed cache is enabled
  staging: erofs: atomic_cond_read_relaxed on ref-locked workgroup
  staging: erofs: fix `erofs_workgroup_{try_to_freeze, unfreeze}'
  staging: erofs: add a full barrier in erofs_workgroup_unfreeze
  staging: erofs: {dir,inode,super}.c: rectify BUG_ONs
  staging: erofs: unzip_{pagevec.h,vle.c}: rectify BUG_ONs
  staging: erofs: unzip_vle_lz4.c,utils.c: rectify BUG_ONs

 drivers/staging/erofs/data.c          |  31 ++++---
 drivers/staging/erofs/dir.c           |   7 +-
 drivers/staging/erofs/inode.c         |  10 ++-
 drivers/staging/erofs/internal.h      |  71 ++++++++++------
 drivers/staging/erofs/super.c         |  19 ++---
 drivers/staging/erofs/unzip_pagevec.h |   2 +-
 drivers/staging/erofs/unzip_vle.c     |  97 ++++++++--------------
 drivers/staging/erofs/unzip_vle.h     |  12 +--
 drivers/staging/erofs/unzip_vle_lz4.c |   2 +-
 drivers/staging/erofs/utils.c         | 150 +++++++++++++++++++++++-----------
 include/linux/xarray.h                |  48 +++++++++++
 11 files changed, 271 insertions(+), 178 deletions(-)

-- 
2.14.4


^ permalink raw reply	[flat|nested] 68+ messages in thread


* [PATCH for-4.19 01/12] staging: erofs: fix a bug when applying cache strategy
  2019-02-20  9:18 ` Gao Xiang
@ 2019-02-20  9:18   ` Gao Xiang
  -1 siblings, 0 replies; 68+ messages in thread
From: Gao Xiang @ 2019-02-20  9:18 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, Chao Yu, linux-erofs, miaoxie, Gao Xiang

commit 0734ffbf574ee813b20899caef2fe0ed502bb783 upstream.

As described in Kconfig, the last compressed pack should be cached
for further reading for either `EROFS_FS_ZIP_CACHE_UNIPOLAR' or
`EROFS_FS_ZIP_CACHE_BIPOLAR' by design.

However, there is a bug in z_erofs_do_read_page: it switches
`initial' to `false' at the very beginning, before it decides
whether to cache the last compressed pack.

The caching strategy works properly after applying this patch.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
---
 drivers/staging/erofs/unzip_vle.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/erofs/unzip_vle.c b/drivers/staging/erofs/unzip_vle.c
index 0346630b67c8..35ae0865c1fd 100644
--- a/drivers/staging/erofs/unzip_vle.c
+++ b/drivers/staging/erofs/unzip_vle.c
@@ -624,7 +624,7 @@ static int z_erofs_do_read_page(struct z_erofs_vle_frontend *fe,
 	/* go ahead the next map_blocks */
 	debugln("%s: [out-of-range] pos %llu", __func__, offset + cur);
 
-	if (!z_erofs_vle_work_iter_end(builder))
+	if (z_erofs_vle_work_iter_end(builder))
 		fe->initial = false;
 
 	map->m_la = offset + cur;
-- 
2.14.4


^ permalink raw reply related	[flat|nested] 68+ messages in thread


* [PATCH for-4.19 02/12] staging: erofs: complete error handling of z_erofs_do_read_page
  2019-02-20  9:18 ` Gao Xiang
@ 2019-02-20  9:18   ` Gao Xiang
  -1 siblings, 0 replies; 68+ messages in thread
From: Gao Xiang @ 2019-02-20  9:18 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, Chao Yu, linux-erofs, miaoxie, Gao Xiang

commit 1e05ff36e6921ca61bdbf779f81a602863569ee3 upstream.

This patch completes the error handling code of z_erofs_do_read_page.
PG_error is set when a read error happens, so that
z_erofs_onlinepage_endio will unlock the page without setting
PG_uptodate.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Conflicts:
	drivers/staging/erofs/unzip_vle.c

Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
---
 drivers/staging/erofs/unzip_vle.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/drivers/staging/erofs/unzip_vle.c b/drivers/staging/erofs/unzip_vle.c
index 35ae0865c1fd..9ae1f8833b72 100644
--- a/drivers/staging/erofs/unzip_vle.c
+++ b/drivers/staging/erofs/unzip_vle.c
@@ -605,8 +605,8 @@ static int z_erofs_do_read_page(struct z_erofs_vle_frontend *fe,
 #endif
 
 	enum z_erofs_page_type page_type;
-	unsigned cur, end, spiltted, index;
-	int err;
+	unsigned int cur, end, spiltted, index;
+	int err = 0;
 
 	/* register locked file pages as online pages in pack */
 	z_erofs_onlinepage_init(page);
@@ -633,12 +633,11 @@ static int z_erofs_do_read_page(struct z_erofs_vle_frontend *fe,
 	if (unlikely(err))
 		goto err_out;
 
-	/* deal with hole (FIXME! broken now) */
 	if (unlikely(!(map->m_flags & EROFS_MAP_MAPPED)))
 		goto hitted;
 
 	DBG_BUGON(map->m_plen != 1 << sbi->clusterbits);
-	BUG_ON(erofs_blkoff(map->m_pa));
+	DBG_BUGON(erofs_blkoff(map->m_pa));
 
 	err = z_erofs_vle_work_iter_begin(builder, sb, map, &fe->owned_head);
 	if (unlikely(err))
@@ -683,7 +682,7 @@ static int z_erofs_do_read_page(struct z_erofs_vle_frontend *fe,
 
 		err = z_erofs_vle_work_add_page(builder,
 			newpage, Z_EROFS_PAGE_TYPE_EXCLUSIVE);
-		if (!err)
+		if (likely(!err))
 			goto retry;
 	}
 
@@ -694,9 +693,10 @@ static int z_erofs_do_read_page(struct z_erofs_vle_frontend *fe,
 
 	/* FIXME! avoid the last relundant fixup & endio */
 	z_erofs_onlinepage_fixup(page, index, true);
-	++spiltted;
 
-	/* also update nr_pages and increase queued_pages */
+	/* bump up the number of spiltted parts of a page */
+	++spiltted;
+	/* also update nr_pages */
 	work->nr_pages = max_t(pgoff_t, work->nr_pages, index + 1);
 next_part:
 	/* can be used for verification */
@@ -706,16 +706,18 @@ static int z_erofs_do_read_page(struct z_erofs_vle_frontend *fe,
 	if (end > 0)
 		goto repeat;
 
+out:
 	/* FIXME! avoid the last relundant fixup & endio */
 	z_erofs_onlinepage_endio(page);
 
 	debugln("%s, finish page: %pK spiltted: %u map->m_llen %llu",
 		__func__, page, spiltted, map->m_llen);
-	return 0;
+	return err;
 
+	/* if some error occurred while processing this page */
 err_out:
-	/* TODO: the missing error handing cases */
-	return err;
+	SetPageError(page);
+	goto out;
 }
 
 static void z_erofs_vle_unzip_kickoff(void *ptr, int bios)
-- 
2.14.4


^ permalink raw reply related	[flat|nested] 68+ messages in thread


* [PATCH for-4.19 03/12] staging: erofs: replace BUG_ON with DBG_BUGON in data.c
  2019-02-20  9:18 ` Gao Xiang
@ 2019-02-20  9:18   ` Gao Xiang
  -1 siblings, 0 replies; 68+ messages in thread
From: Gao Xiang @ 2019-02-20  9:18 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, Chao Yu, linux-erofs, miaoxie, Chen Gong, Gao Xiang

From: Chen Gong <gongchen4@huawei.com>

commit 9141b60cf6a53c99f8a9309bf8e1c6650a6785c1 upstream.

This patch replaces BUG_ON with DBG_BUGON in data.c and adds the
necessary error handling.

Signed-off-by: Chen Gong <gongchen4@huawei.com>
Reviewed-by: Gao Xiang <gaoxiang25@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
---
 drivers/staging/erofs/data.c | 31 ++++++++++++++++++++-----------
 1 file changed, 20 insertions(+), 11 deletions(-)

diff --git a/drivers/staging/erofs/data.c b/drivers/staging/erofs/data.c
index ac263a180253..894e60ecebe2 100644
--- a/drivers/staging/erofs/data.c
+++ b/drivers/staging/erofs/data.c
@@ -25,7 +25,7 @@ static inline void read_endio(struct bio *bio)
 		struct page *page = bvec->bv_page;
 
 		/* page is already locked */
-		BUG_ON(PageUptodate(page));
+		DBG_BUGON(PageUptodate(page));
 
 		if (unlikely(err))
 			SetPageError(page);
@@ -91,12 +91,12 @@ static int erofs_map_blocks_flatmode(struct inode *inode,
 	struct erofs_map_blocks *map,
 	int flags)
 {
+	int err = 0;
 	erofs_blk_t nblocks, lastblk;
 	u64 offset = map->m_la;
 	struct erofs_vnode *vi = EROFS_V(inode);
 
 	trace_erofs_map_blocks_flatmode_enter(inode, map, flags);
-	BUG_ON(is_inode_layout_compression(inode));
 
 	nblocks = DIV_ROUND_UP(inode->i_size, PAGE_SIZE);
 	lastblk = nblocks - is_inode_layout_inline(inode);
@@ -123,18 +123,27 @@ static int erofs_map_blocks_flatmode(struct inode *inode,
 		map->m_plen = inode->i_size - offset;
 
 		/* inline data should locate in one meta block */
-		BUG_ON(erofs_blkoff(map->m_pa) + map->m_plen > PAGE_SIZE);
+		if (erofs_blkoff(map->m_pa) + map->m_plen > PAGE_SIZE) {
+			DBG_BUGON(1);
+			err = -EIO;
+			goto err_out;
+		}
+
 		map->m_flags |= EROFS_MAP_META;
 	} else {
 		errln("internal error @ nid: %llu (size %llu), m_la 0x%llx",
 			vi->nid, inode->i_size, map->m_la);
-		BUG();
+		DBG_BUGON(1);
+		err = -EIO;
+		goto err_out;
 	}
 
 out:
 	map->m_llen = map->m_plen;
+
+err_out:
 	trace_erofs_map_blocks_flatmode_exit(inode, map, flags, 0);
-	return 0;
+	return err;
 }
 
 #ifdef CONFIG_EROFS_FS_ZIP
@@ -190,7 +199,7 @@ static inline struct bio *erofs_read_raw_page(
 	erofs_off_t current_block = (erofs_off_t)page->index;
 	int err;
 
-	BUG_ON(!nblocks);
+	DBG_BUGON(!nblocks);
 
 	if (PageUptodate(page)) {
 		err = 0;
@@ -233,7 +242,7 @@ static inline struct bio *erofs_read_raw_page(
 		}
 
 		/* for RAW access mode, m_plen must be equal to m_llen */
-		BUG_ON(map.m_plen != map.m_llen);
+		DBG_BUGON(map.m_plen != map.m_llen);
 
 		blknr = erofs_blknr(map.m_pa);
 		blkoff = erofs_blkoff(map.m_pa);
@@ -243,7 +252,7 @@ static inline struct bio *erofs_read_raw_page(
 			void *vsrc, *vto;
 			struct page *ipage;
 
-			BUG_ON(map.m_plen > PAGE_SIZE);
+			DBG_BUGON(map.m_plen > PAGE_SIZE);
 
 			ipage = erofs_get_meta_page(inode->i_sb, blknr, 0);
 
@@ -270,7 +279,7 @@ static inline struct bio *erofs_read_raw_page(
 		}
 
 		/* pa must be block-aligned for raw reading */
-		BUG_ON(erofs_blkoff(map.m_pa) != 0);
+		DBG_BUGON(erofs_blkoff(map.m_pa));
 
 		/* max # of continuous pages */
 		if (nblocks > DIV_ROUND_UP(map.m_plen, PAGE_SIZE))
@@ -331,7 +340,7 @@ static int erofs_raw_access_readpage(struct file *file, struct page *page)
 	if (IS_ERR(bio))
 		return PTR_ERR(bio);
 
-	BUG_ON(bio != NULL);	/* since we have only one bio -- must be NULL */
+	DBG_BUGON(bio);	/* since we have only one bio -- must be NULL */
 	return 0;
 }
 
@@ -369,7 +378,7 @@ static int erofs_raw_access_readpages(struct file *filp,
 		/* pages could still be locked */
 		put_page(page);
 	}
-	BUG_ON(!list_empty(pages));
+	DBG_BUGON(!list_empty(pages));
 
 	/* the rare case (end in gaps) */
 	if (unlikely(bio != NULL))
-- 
2.14.4


^ permalink raw reply related	[flat|nested] 68+ messages in thread


* [PATCH for-4.19 04/12] staging: erofs: drop multiref support temporarily
  2019-02-20  9:18 ` Gao Xiang
@ 2019-02-20  9:18   ` Gao Xiang
  -1 siblings, 0 replies; 68+ messages in thread
From: Gao Xiang @ 2019-02-20  9:18 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, Chao Yu, linux-erofs, miaoxie, Gao Xiang

commit e5e3abbadf0dbd1068f64f8abe70401c5a178180 upstream.

Multiref support means that a compressed page can have more than
one reference, which is designed for on-disk data deduplication.
However, mkfs doesn't support this mode at the moment, and the
kernel implementation is broken as well.

Let's drop multiref support for now; if it is fully implemented
in the future, this commit can be reverted.

Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Conflicts:
	drivers/staging/erofs/unzip_vle.c

Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
---
 drivers/staging/erofs/unzip_vle.c | 40 +++++++--------------------------------
 drivers/staging/erofs/unzip_vle.h | 12 +-----------
 2 files changed, 8 insertions(+), 44 deletions(-)

diff --git a/drivers/staging/erofs/unzip_vle.c b/drivers/staging/erofs/unzip_vle.c
index 9ae1f8833b72..63accc4527ce 100644
--- a/drivers/staging/erofs/unzip_vle.c
+++ b/drivers/staging/erofs/unzip_vle.c
@@ -293,12 +293,9 @@ z_erofs_vle_work_lookup(struct super_block *sb,
 	*grp_ret = grp = container_of(egrp,
 		struct z_erofs_vle_workgroup, obj);
 
-#ifndef CONFIG_EROFS_FS_ZIP_MULTIREF
 	work = z_erofs_vle_grab_work(grp, pageofs);
+	/* if multiref is disabled, `primary' is always true */
 	primary = true;
-#else
-	BUG();
-#endif
 
 	DBG_BUGON(work->pageofs != pageofs);
 
@@ -365,12 +362,9 @@ z_erofs_vle_work_register(struct super_block *sb,
 	struct z_erofs_vle_workgroup *grp = *grp_ret;
 	struct z_erofs_vle_work *work;
 
-#ifndef CONFIG_EROFS_FS_ZIP_MULTIREF
+	/* if multiref is disabled, grp should never be nullptr */
 	BUG_ON(grp != NULL);
-#else
-	if (grp != NULL)
-		goto skip;
-#endif
+
 	/* no available workgroup, let's allocate one */
 	grp = kmem_cache_zalloc(z_erofs_workgroup_cachep, GFP_NOFS);
 	if (unlikely(grp == NULL))
@@ -393,13 +387,7 @@ z_erofs_vle_work_register(struct super_block *sb,
 	*hosted = true;
 
 	newgrp = true;
-#ifdef CONFIG_EROFS_FS_ZIP_MULTIREF
-skip:
-	/* currently unimplemented */
-	BUG();
-#else
 	work = z_erofs_vle_grab_primary_work(grp);
-#endif
 	work->pageofs = pageofs;
 
 	mutex_init(&work->lock);
@@ -798,10 +786,8 @@ static int z_erofs_vle_unzip(struct super_block *sb,
 	const unsigned clusterpages = erofs_clusterpages(sbi);
 
 	struct z_erofs_pagevec_ctor ctor;
-	unsigned nr_pages;
-#ifndef CONFIG_EROFS_FS_ZIP_MULTIREF
-	unsigned sparsemem_pages = 0;
-#endif
+	unsigned int nr_pages;
+	unsigned int sparsemem_pages = 0;
 	struct page *pages_onstack[Z_EROFS_VLE_VMAP_ONSTACK_PAGES];
 	struct page **pages, **compressed_pages, *page;
 	unsigned i, llen;
@@ -813,11 +799,7 @@ static int z_erofs_vle_unzip(struct super_block *sb,
 	int err;
 
 	might_sleep();
-#ifndef CONFIG_EROFS_FS_ZIP_MULTIREF
 	work = z_erofs_vle_grab_primary_work(grp);
-#else
-	BUG();
-#endif
 	BUG_ON(!READ_ONCE(work->nr_pages));
 
 	mutex_lock(&work->lock);
@@ -868,13 +850,11 @@ static int z_erofs_vle_unzip(struct super_block *sb,
 			pagenr = z_erofs_onlinepage_index(page);
 
 		BUG_ON(pagenr >= nr_pages);
-
-#ifndef CONFIG_EROFS_FS_ZIP_MULTIREF
 		BUG_ON(pages[pagenr] != NULL);
-		++sparsemem_pages;
-#endif
+
 		pages[pagenr] = page;
 	}
+	sparsemem_pages = i;
 
 	z_erofs_pagevec_ctor_exit(&ctor, true);
 
@@ -904,10 +884,8 @@ static int z_erofs_vle_unzip(struct super_block *sb,
 		pagenr = z_erofs_onlinepage_index(page);
 
 		BUG_ON(pagenr >= nr_pages);
-#ifndef CONFIG_EROFS_FS_ZIP_MULTIREF
 		BUG_ON(pages[pagenr] != NULL);
 		++sparsemem_pages;
-#endif
 		pages[pagenr] = page;
 
 		overlapped = true;
@@ -933,12 +911,10 @@ static int z_erofs_vle_unzip(struct super_block *sb,
 	if (err != -ENOTSUPP)
 		goto out_percpu;
 
-#ifndef CONFIG_EROFS_FS_ZIP_MULTIREF
 	if (sparsemem_pages >= nr_pages) {
 		BUG_ON(sparsemem_pages > nr_pages);
 		goto skip_allocpage;
 	}
-#endif
 
 	for (i = 0; i < nr_pages; ++i) {
 		if (pages[i] != NULL)
@@ -947,9 +923,7 @@ static int z_erofs_vle_unzip(struct super_block *sb,
 		pages[i] = __stagingpage_alloc(page_pool, GFP_NOFS);
 	}
 
-#ifndef CONFIG_EROFS_FS_ZIP_MULTIREF
 skip_allocpage:
-#endif
 	vout = erofs_vmap(pages, nr_pages);
 
 	err = z_erofs_vle_unzip_vmap(compressed_pages,
diff --git a/drivers/staging/erofs/unzip_vle.h b/drivers/staging/erofs/unzip_vle.h
index 393998500865..3316bc36965d 100644
--- a/drivers/staging/erofs/unzip_vle.h
+++ b/drivers/staging/erofs/unzip_vle.h
@@ -47,13 +47,6 @@ static inline bool z_erofs_gather_if_stagingpage(struct list_head *page_pool,
 #define Z_EROFS_VLE_INLINE_PAGEVECS     3
 
 struct z_erofs_vle_work {
-	/* struct z_erofs_vle_work *left, *right; */
-
-#ifdef CONFIG_EROFS_FS_ZIP_MULTIREF
-	struct list_head list;
-
-	atomic_t refcount;
-#endif
 	struct mutex lock;
 
 	/* I: decompression offset in page */
@@ -107,10 +100,8 @@ static inline void z_erofs_vle_set_workgrp_fmt(
 	grp->flags = fmt | (grp->flags & ~Z_EROFS_VLE_WORKGRP_FMT_MASK);
 }
 
-#ifdef CONFIG_EROFS_FS_ZIP_MULTIREF
-#error multiref decompression is unimplemented yet
-#else
 
+/* definitions if multiref is disabled */
 #define z_erofs_vle_grab_primary_work(grp)	(&(grp)->work)
 #define z_erofs_vle_grab_work(grp, pageofs)	(&(grp)->work)
 #define z_erofs_vle_work_workgroup(wrk, primary)	\
@@ -118,7 +109,6 @@ static inline void z_erofs_vle_set_workgrp_fmt(
 		struct z_erofs_vle_workgroup, work) : \
 		({ BUG(); (void *)NULL; }))
 
-#endif
 
 #define Z_EROFS_WORKGROUP_SIZE       sizeof(struct z_erofs_vle_workgroup)
 
-- 
2.14.4


^ permalink raw reply related	[flat|nested] 68+ messages in thread

 
-#endif
 
 #define Z_EROFS_WORKGROUP_SIZE       sizeof(struct z_erofs_vle_workgroup)
 
-- 
2.14.4

^ permalink raw reply related	[flat|nested] 68+ messages in thread
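The sparsemem_pages accounting that survives the multiref removal above can be modeled in plain C. This is an illustrative userspace sketch, not kernel code: fill_missing_slots() is a hypothetical name, and malloc() stands in for __stagingpage_alloc().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Userspace model of the sparsemem_pages bookkeeping in
 * z_erofs_vle_unzip(): decompressed page slots are filled from the
 * pagevec first, sparsemem_pages counts how many slots are already
 * populated, and the allocation loop is skipped entirely when nothing
 * is missing (the "skip_allocpage" fast path in the diff above).
 */
static unsigned int fill_missing_slots(void *pages[], unsigned int nr_pages,
				       unsigned int sparsemem_pages)
{
	unsigned int i, allocated = 0;

	if (sparsemem_pages >= nr_pages)	/* skip_allocpage */
		return 0;

	for (i = 0; i < nr_pages; ++i) {
		if (pages[i] != NULL)
			continue;
		pages[i] = malloc(1);		/* __stagingpage_alloc() stand-in */
		++allocated;
	}
	return allocated;
}
```

Only the slots left NULL after the pagevec pass are backed by staging allocations, which is why the counter must be accurate.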

* [PATCH for-4.19 05/12] staging: erofs: remove the redundant d_rehash() for the root dentry
  2019-02-20  9:18 ` Gao Xiang
@ 2019-02-20  9:18   ` Gao Xiang
  -1 siblings, 0 replies; 68+ messages in thread
From: Gao Xiang @ 2019-02-20  9:18 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, Chao Yu, linux-erofs, miaoxie, Gao Xiang, Al Viro

commit e9c892465583c8f42d61fafe30970d36580925df upstream.

As Al pointed out, there is actually no need at all to
d_rehash() the root dentry; fix it.

Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
---
 drivers/staging/erofs/super.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/drivers/staging/erofs/super.c b/drivers/staging/erofs/super.c
index 2df9768edac9..434616b3e76c 100644
--- a/drivers/staging/erofs/super.c
+++ b/drivers/staging/erofs/super.c
@@ -404,12 +404,6 @@ static int erofs_read_super(struct super_block *sb,
 
 	erofs_register_super(sb);
 
-	/*
-	 * We already have a positive dentry, which was instantiated
-	 * by d_make_root. Just need to d_rehash it.
-	 */
-	d_rehash(sb->s_root);
-
 	if (!silent)
 		infoln("mounted on %s with opts: %s.", dev_name,
 			(char *)data);
-- 
2.14.4


^ permalink raw reply related	[flat|nested] 68+ messages in thread


* [PATCH for-4.19 06/12] staging: erofs: fix race when the managed cache is enabled
  2019-02-20  9:18 ` Gao Xiang
@ 2019-02-20  9:18   ` Gao Xiang
  -1 siblings, 0 replies; 68+ messages in thread
From: Gao Xiang @ 2019-02-20  9:18 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, Chao Yu, linux-erofs, miaoxie, Gao Xiang,
	Matthew Wilcox

commit 51232df5e4b268936beccde5248f312a316800be upstream.

When the managed cache is enabled, the last reference count
of a workgroup must be used for its workstation.

Otherwise, it could lead to incorrect (un)freezes in
the reclaim path, which would be harmful.

A typical race is as follows:

Thread 1 (In the reclaim path)  Thread 2
workgroup_freeze(grp, 1)                                refcnt = 1
...
workgroup_unfreeze(grp, 1)                              refcnt = 1
                                workgroup_get(grp)      refcnt = 2 (x)
workgroup_put(grp)                                      refcnt = 1 (x)
                                ...unexpected behaviors

* grp is detached but still used, which violates the
  cache-managed freeze constraint.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>

Conflicts:
	drivers/staging/erofs/utils.c
Updates:
	include/linux/xarray.h:
		add xa_untag_pointer,xa_tag_pointer,xa_pointer_tag
		from upstream 3159f943aafd in order to reduce
		conflicts.

Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
---
 drivers/staging/erofs/internal.h |   1 +
 drivers/staging/erofs/utils.c    | 138 +++++++++++++++++++++++++++------------
 include/linux/xarray.h           |  48 ++++++++++++++
 3 files changed, 145 insertions(+), 42 deletions(-)

diff --git a/drivers/staging/erofs/internal.h b/drivers/staging/erofs/internal.h
index e6313c54e3ad..122ea5016f3b 100644
--- a/drivers/staging/erofs/internal.h
+++ b/drivers/staging/erofs/internal.h
@@ -240,6 +240,7 @@ static inline bool erofs_workgroup_get(struct erofs_workgroup *grp, int *ocnt)
 }
 
 #define __erofs_workgroup_get(grp)	atomic_inc(&(grp)->refcount)
+#define __erofs_workgroup_put(grp)	atomic_dec(&(grp)->refcount)
 
 extern int erofs_workgroup_put(struct erofs_workgroup *grp);
 
diff --git a/drivers/staging/erofs/utils.c b/drivers/staging/erofs/utils.c
index 595cf90af9bb..389f6182721e 100644
--- a/drivers/staging/erofs/utils.c
+++ b/drivers/staging/erofs/utils.c
@@ -87,12 +87,21 @@ int erofs_register_workgroup(struct super_block *sb,
 		grp = (void *)((unsigned long)grp |
 			1UL << RADIX_TREE_EXCEPTIONAL_SHIFT);
 
-	err = radix_tree_insert(&sbi->workstn_tree,
-		grp->index, grp);
+	/*
+	 * Bump up reference count before making this workgroup
+	 * visible to other users in order to avoid potential UAF
+	 * without serialized by erofs_workstn_lock.
+	 */
+	__erofs_workgroup_get(grp);
 
-	if (!err) {
-		__erofs_workgroup_get(grp);
-	}
+	err = radix_tree_insert(&sbi->workstn_tree,
+				grp->index, grp);
+	if (unlikely(err))
+		/*
+		 * it's safe to decrease since the workgroup isn't visible
+		 * and refcount >= 2 (cannot be freezed).
+		 */
+		__erofs_workgroup_put(grp);
 
 	erofs_workstn_unlock(sbi);
 	radix_tree_preload_end();
@@ -101,19 +110,94 @@ int erofs_register_workgroup(struct super_block *sb,
 
 extern void erofs_workgroup_free_rcu(struct erofs_workgroup *grp);
 
+static void  __erofs_workgroup_free(struct erofs_workgroup *grp)
+{
+	atomic_long_dec(&erofs_global_shrink_cnt);
+	erofs_workgroup_free_rcu(grp);
+}
+
 int erofs_workgroup_put(struct erofs_workgroup *grp)
 {
 	int count = atomic_dec_return(&grp->refcount);
 
 	if (count == 1)
 		atomic_long_inc(&erofs_global_shrink_cnt);
-	else if (!count) {
-		atomic_long_dec(&erofs_global_shrink_cnt);
-		erofs_workgroup_free_rcu(grp);
-	}
+	else if (!count)
+		__erofs_workgroup_free(grp);
 	return count;
 }
 
+#ifdef EROFS_FS_HAS_MANAGED_CACHE
+/* for cache-managed case, customized reclaim paths exist */
+static void erofs_workgroup_unfreeze_final(struct erofs_workgroup *grp)
+{
+	erofs_workgroup_unfreeze(grp, 0);
+	__erofs_workgroup_free(grp);
+}
+
+bool erofs_try_to_release_workgroup(struct erofs_sb_info *sbi,
+				    struct erofs_workgroup *grp,
+				    bool cleanup)
+{
+	/*
+	 * for managed cache enabled, the refcount of workgroups
+	 * themselves could be < 0 (freezed). So there is no guarantee
+	 * that all refcount > 0 if managed cache is enabled.
+	 */
+	if (!erofs_workgroup_try_to_freeze(grp, 1))
+		return false;
+
+	/*
+	 * note that all cached pages should be unlinked
+	 * before delete it from the radix tree.
+	 * Otherwise some cached pages of an orphan old workgroup
+	 * could be still linked after the new one is available.
+	 */
+	if (erofs_try_to_free_all_cached_pages(sbi, grp)) {
+		erofs_workgroup_unfreeze(grp, 1);
+		return false;
+	}
+
+	/*
+	 * it is impossible to fail after the workgroup is freezed,
+	 * however in order to avoid some race conditions, add a
+	 * DBG_BUGON to observe this in advance.
+	 */
+	DBG_BUGON(xa_untag_pointer(radix_tree_delete(&sbi->workstn_tree,
+						     grp->index)) != grp);
+
+	/*
+	 * if managed cache is enable, the last refcount
+	 * should indicate the related workstation.
+	 */
+	erofs_workgroup_unfreeze_final(grp);
+	return true;
+}
+
+#else
+/* for nocache case, no customized reclaim path at all */
+bool erofs_try_to_release_workgroup(struct erofs_sb_info *sbi,
+				    struct erofs_workgroup *grp,
+				    bool cleanup)
+{
+	int cnt = atomic_read(&grp->refcount);
+
+	DBG_BUGON(cnt <= 0);
+	DBG_BUGON(cleanup && cnt != 1);
+
+	if (cnt > 1)
+		return false;
+
+	DBG_BUGON(xa_untag_pointer(radix_tree_delete(&sbi->workstn_tree,
+						     grp->index)) != grp);
+
+	/* (rarely) could be grabbed again when freeing */
+	erofs_workgroup_put(grp);
+	return true;
+}
+
+#endif
+
 unsigned long erofs_shrink_workstation(struct erofs_sb_info *sbi,
 				       unsigned long nr_shrink,
 				       bool cleanup)
@@ -130,44 +214,14 @@ unsigned long erofs_shrink_workstation(struct erofs_sb_info *sbi,
 		batch, first_index, PAGEVEC_SIZE);
 
 	for (i = 0; i < found; ++i) {
-		int cnt;
-		struct erofs_workgroup *grp = (void *)
-			((unsigned long)batch[i] &
-				~RADIX_TREE_EXCEPTIONAL_ENTRY);
+		struct erofs_workgroup *grp = xa_untag_pointer(batch[i]);
 
 		first_index = grp->index + 1;
 
-		cnt = atomic_read(&grp->refcount);
-		BUG_ON(cnt <= 0);
-
-		if (cleanup)
-			BUG_ON(cnt != 1);
-
-#ifndef EROFS_FS_HAS_MANAGED_CACHE
-		else if (cnt > 1)
-#else
-		if (!erofs_workgroup_try_to_freeze(grp, 1))
-#endif
+		/* try to shrink each valid workgroup */
+		if (!erofs_try_to_release_workgroup(sbi, grp, cleanup))
 			continue;
 
-		if (radix_tree_delete(&sbi->workstn_tree,
-			grp->index) != grp) {
-#ifdef EROFS_FS_HAS_MANAGED_CACHE
-skip:
-			erofs_workgroup_unfreeze(grp, 1);
-#endif
-			continue;
-		}
-
-#ifdef EROFS_FS_HAS_MANAGED_CACHE
-		if (erofs_try_to_free_all_cached_pages(sbi, grp))
-			goto skip;
-
-		erofs_workgroup_unfreeze(grp, 1);
-#endif
-		/* (rarely) grabbed again when freeing */
-		erofs_workgroup_put(grp);
-
 		++freed;
 		if (unlikely(!--nr_shrink))
 			break;
diff --git a/include/linux/xarray.h b/include/linux/xarray.h
index 2dfc8006fe64..1053785830c7 100644
--- a/include/linux/xarray.h
+++ b/include/linux/xarray.h
@@ -21,4 +21,52 @@
 #define xa_unlock_irqrestore(xa, flags) \
 				spin_unlock_irqrestore(&(xa)->xa_lock, flags)
 
+/**
+ * xa_tag_pointer() - Create an XArray entry for a tagged pointer.
+ * @p: Plain pointer.
+ * @tag: Tag value (0, 1 or 3).
+ *
+ * If the user of the XArray prefers, they can tag their pointers instead
+ * of storing value entries.  Three tags are available (0, 1 and 3).
+ * These are distinct from the xa_mark_t as they are not replicated up
+ * through the array and cannot be searched for.
+ *
+ * Context: Any context.
+ * Return: An XArray entry.
+ */
+static inline void *xa_tag_pointer(void *p, unsigned long tag)
+{
+	return (void *)((unsigned long)p | tag);
+}
+
+/**
+ * xa_untag_pointer() - Turn an XArray entry into a plain pointer.
+ * @entry: XArray entry.
+ *
+ * If you have stored a tagged pointer in the XArray, call this function
+ * to get the untagged version of the pointer.
+ *
+ * Context: Any context.
+ * Return: A pointer.
+ */
+static inline void *xa_untag_pointer(void *entry)
+{
+	return (void *)((unsigned long)entry & ~3UL);
+}
+
+/**
+ * xa_pointer_tag() - Get the tag stored in an XArray entry.
+ * @entry: XArray entry.
+ *
+ * If you have stored a tagged pointer in the XArray, call this function
+ * to get the tag of that pointer.
+ *
+ * Context: Any context.
+ * Return: A tag.
+ */
+static inline unsigned int xa_pointer_tag(void *entry)
+{
+	return (unsigned long)entry & 3UL;
+}
+
 #endif /* _LINUX_XARRAY_H */
-- 
2.14.4


^ permalink raw reply related	[flat|nested] 68+ messages in thread
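The three xarray.h helpers introduced in the patch above are self-contained enough to exercise in userspace. The sketch below copies them verbatim and relies on the usual assumption that ordinary objects are at least 4-byte aligned, leaving the low two bits free to carry the tag (0, 1 or 3).

```c
#include <assert.h>
#include <stddef.h>

/* userspace copies of the helpers added to include/linux/xarray.h */
static inline void *xa_tag_pointer(void *p, unsigned long tag)
{
	return (void *)((unsigned long)p | tag);
}

static inline void *xa_untag_pointer(void *entry)
{
	return (void *)((unsigned long)entry & ~3UL);
}

static inline unsigned int xa_pointer_tag(void *entry)
{
	return (unsigned long)entry & 3UL;
}
```

This is what lets erofs store its "exceptional" workgroup entries and recover the plain pointer with xa_untag_pointer() in the shrinker loop, instead of open-coding the RADIX_TREE_EXCEPTIONAL masking.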


* [PATCH for-4.19 07/12] staging: erofs: atomic_cond_read_relaxed on ref-locked workgroup
  2019-02-20  9:18 ` Gao Xiang
@ 2019-02-20  9:18   ` Gao Xiang
  -1 siblings, 0 replies; 68+ messages in thread
From: Gao Xiang @ 2019-02-20  9:18 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, Chao Yu, linux-erofs, miaoxie, Gao Xiang

commit df134b8d17b90c1e7720e318d36416b57424ff7a upstream.

It's better to use atomic_cond_read_relaxed(), which on ARM64 is
currently implemented with hardware instructions that monitor for
changes to a variable, instead of open-coded busy waiting.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
---
 drivers/staging/erofs/internal.h | 30 ++++++++++++++++++------------
 1 file changed, 18 insertions(+), 12 deletions(-)

diff --git a/drivers/staging/erofs/internal.h b/drivers/staging/erofs/internal.h
index 122ea5016f3b..0fff0d30fed7 100644
--- a/drivers/staging/erofs/internal.h
+++ b/drivers/staging/erofs/internal.h
@@ -211,23 +211,29 @@ static inline void erofs_workgroup_unfreeze(
 	preempt_enable();
 }
 
+#if defined(CONFIG_SMP)
+static inline int erofs_wait_on_workgroup_freezed(struct erofs_workgroup *grp)
+{
+	return atomic_cond_read_relaxed(&grp->refcount,
+					VAL != EROFS_LOCKED_MAGIC);
+}
+#else
+static inline int erofs_wait_on_workgroup_freezed(struct erofs_workgroup *grp)
+{
+	int v = atomic_read(&grp->refcount);
+
+	/* workgroup is never freezed on uniprocessor systems */
+	DBG_BUGON(v == EROFS_LOCKED_MAGIC);
+	return v;
+}
+#endif
+
 static inline bool erofs_workgroup_get(struct erofs_workgroup *grp, int *ocnt)
 {
-	const int locked = (int)EROFS_LOCKED_MAGIC;
 	int o;
 
 repeat:
-	o = atomic_read(&grp->refcount);
-
-	/* spin if it is temporarily locked at the reclaim path */
-	if (unlikely(o == locked)) {
-#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK)
-		do
-			cpu_relax();
-		while (atomic_read(&grp->refcount) == locked);
-#endif
-		goto repeat;
-	}
+	o = erofs_wait_on_workgroup_freezed(grp);
 
 	if (unlikely(o <= 0))
 		return -1;
-- 
2.14.4


^ permalink raw reply related	[flat|nested] 68+ messages in thread
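The busy-wait being replaced can be modeled with C11 atomics. This is a hypothetical userspace analogue of erofs_wait_on_workgroup_freezed(), with LOCKED_MAGIC mirroring EROFS_LOCKED_MAGIC from internal.h; in the kernel, expressing the loop as atomic_cond_read_relaxed() lets ARM64 wait on the cache line (WFE) instead of burning cycles on relaxed loads.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

/* mirrors EROFS_LOCKED_MAGIC from drivers/staging/erofs/internal.h */
#define LOCKED_MAGIC	((int)(INT_MIN | 0xE0F510CCL))

/*
 * Userspace sketch of erofs_wait_on_workgroup_freezed(): the refcount
 * doubles as a lock word while the workgroup is frozen, so readers
 * spin with relaxed loads until it holds a real count again.
 */
static int wait_on_refcount_frozen(atomic_int *refcount)
{
	int v;

	do {
		v = atomic_load_explicit(refcount, memory_order_relaxed);
	} while (v == LOCKED_MAGIC);	/* spin only while frozen */
	return v;
}
```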


* [PATCH for-4.19 08/12] staging: erofs: fix `erofs_workgroup_{try_to_freeze, unfreeze}'
  2019-02-20  9:18 ` Gao Xiang
@ 2019-02-20  9:18   ` Gao Xiang
  -1 siblings, 0 replies; 68+ messages in thread
From: Gao Xiang @ 2019-02-20  9:18 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, Chao Yu, linux-erofs, miaoxie, Gao Xiang

commit 73f5c66df3e26ab750cefcb9a3e08c71c9f79cad upstream.

There are two minor issues in the current freeze interface:

   1) Freeze interfaces are unrelated to CONFIG_DEBUG_SPINLOCK,
      so fix the incorrect conditions;

   2) For SMP platforms, preemption should also be disabled before
      atomic_cmpxchg() in case some high-priority task preempts
      between atomic_cmpxchg() and preempt_disable() and then spins
      on the locked refcount.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
---
 drivers/staging/erofs/internal.h | 41 ++++++++++++++++++++++++----------------
 1 file changed, 25 insertions(+), 16 deletions(-)

diff --git a/drivers/staging/erofs/internal.h b/drivers/staging/erofs/internal.h
index 0fff0d30fed7..c4c9caf84639 100644
--- a/drivers/staging/erofs/internal.h
+++ b/drivers/staging/erofs/internal.h
@@ -184,40 +184,49 @@ struct erofs_workgroup {
 
 #define EROFS_LOCKED_MAGIC     (INT_MIN | 0xE0F510CCL)
 
-static inline bool erofs_workgroup_try_to_freeze(
-	struct erofs_workgroup *grp, int v)
+#if defined(CONFIG_SMP)
+static inline bool erofs_workgroup_try_to_freeze(struct erofs_workgroup *grp,
+						 int val)
 {
-#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK)
-	if (v != atomic_cmpxchg(&grp->refcount,
-		v, EROFS_LOCKED_MAGIC))
-		return false;
 	preempt_disable();
-#else
-	preempt_disable();
-	if (atomic_read(&grp->refcount) != v) {
+	if (val != atomic_cmpxchg(&grp->refcount, val, EROFS_LOCKED_MAGIC)) {
 		preempt_enable();
 		return false;
 	}
-#endif
 	return true;
 }
 
-static inline void erofs_workgroup_unfreeze(
-	struct erofs_workgroup *grp, int v)
+static inline void erofs_workgroup_unfreeze(struct erofs_workgroup *grp,
+					    int orig_val)
 {
-#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK)
-	atomic_set(&grp->refcount, v);
-#endif
+	atomic_set(&grp->refcount, orig_val);
 	preempt_enable();
 }
 
-#if defined(CONFIG_SMP)
 static inline int erofs_wait_on_workgroup_freezed(struct erofs_workgroup *grp)
 {
 	return atomic_cond_read_relaxed(&grp->refcount,
 					VAL != EROFS_LOCKED_MAGIC);
 }
 #else
+static inline bool erofs_workgroup_try_to_freeze(struct erofs_workgroup *grp,
+						 int val)
+{
+	preempt_disable();
+	/* no need to spin on UP platforms, let's just disable preemption. */
+	if (val != atomic_read(&grp->refcount)) {
+		preempt_enable();
+		return false;
+	}
+	return true;
+}
+
+static inline void erofs_workgroup_unfreeze(struct erofs_workgroup *grp,
+					    int orig_val)
+{
+	preempt_enable();
+}
+
 static inline int erofs_wait_on_workgroup_freezed(struct erofs_workgroup *grp)
 {
 	int v = atomic_read(&grp->refcount);
-- 
2.14.4


^ permalink raw reply related	[flat|nested] 68+ messages in thread
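The corrected SMP ordering (preemption disabled before the cmpxchg) can be sketched with C11 atomics. This is a userspace model under stated assumptions: there is no preempt_disable() outside the kernel, so it appears only as comments, and LOCKED_MAGIC here is an illustrative stand-in value.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define LOCKED_MAGIC	(-12345)	/* stand-in for EROFS_LOCKED_MAGIC */

/*
 * Model of the fixed SMP freeze path: the cmpxchg happens only after
 * preemption is disabled, so the winner cannot be preempted while it
 * holds the frozen refcount, and losers back out immediately.
 */
static bool workgroup_try_to_freeze(atomic_int *refcount, int val)
{
	int expected = val;

	/* preempt_disable() would go here in the kernel */
	if (!atomic_compare_exchange_strong(refcount, &expected,
					    LOCKED_MAGIC)) {
		/* preempt_enable() on failure */
		return false;
	}
	return true;
}

static void workgroup_unfreeze(atomic_int *refcount, int orig_val)
{
	atomic_store(refcount, orig_val);
	/* preempt_enable() */
}
```

A second freeze attempt while frozen fails, because the stored value is LOCKED_MAGIC rather than the expected count.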

* [PATCH for-4.19 08/12] staging: erofs: fix `erofs_workgroup_{try_to_freeze, unfreeze}'
@ 2019-02-20  9:18   ` Gao Xiang
  0 siblings, 0 replies; 68+ messages in thread
From: Gao Xiang @ 2019-02-20  9:18 UTC (permalink / raw)


commit 73f5c66df3e26ab750cefcb9a3e08c71c9f79cad upstream.

There are two minor issues in the current freeze interface:

   1) Freeze interfaces have not related with CONFIG_DEBUG_SPINLOCK,
      therefore fix the incorrect conditions;

   2) For SMP platforms, it should also disable preemption before
      doing atomic_cmpxchg in case that some high priority tasks
      preempt between atomic_cmpxchg and disable_preempt, then spin
      on the locked refcount later.

Reviewed-by: Chao Yu <yuchao0 at huawei.com>
Signed-off-by: Gao Xiang <gaoxiang25 at huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh at linuxfoundation.org>
Signed-off-by: Gao Xiang <gaoxiang25 at huawei.com>
---
 drivers/staging/erofs/internal.h | 41 ++++++++++++++++++++++++----------------
 1 file changed, 25 insertions(+), 16 deletions(-)

diff --git a/drivers/staging/erofs/internal.h b/drivers/staging/erofs/internal.h
index 0fff0d30fed7..c4c9caf84639 100644
--- a/drivers/staging/erofs/internal.h
+++ b/drivers/staging/erofs/internal.h
@@ -184,40 +184,49 @@ struct erofs_workgroup {
 
 #define EROFS_LOCKED_MAGIC     (INT_MIN | 0xE0F510CCL)
 
-static inline bool erofs_workgroup_try_to_freeze(
-	struct erofs_workgroup *grp, int v)
+#if defined(CONFIG_SMP)
+static inline bool erofs_workgroup_try_to_freeze(struct erofs_workgroup *grp,
+						 int val)
 {
-#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK)
-	if (v != atomic_cmpxchg(&grp->refcount,
-		v, EROFS_LOCKED_MAGIC))
-		return false;
 	preempt_disable();
-#else
-	preempt_disable();
-	if (atomic_read(&grp->refcount) != v) {
+	if (val != atomic_cmpxchg(&grp->refcount, val, EROFS_LOCKED_MAGIC)) {
 		preempt_enable();
 		return false;
 	}
-#endif
 	return true;
 }
 
-static inline void erofs_workgroup_unfreeze(
-	struct erofs_workgroup *grp, int v)
+static inline void erofs_workgroup_unfreeze(struct erofs_workgroup *grp,
+					    int orig_val)
 {
-#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK)
-	atomic_set(&grp->refcount, v);
-#endif
+	atomic_set(&grp->refcount, orig_val);
 	preempt_enable();
 }
 
-#if defined(CONFIG_SMP)
 static inline int erofs_wait_on_workgroup_freezed(struct erofs_workgroup *grp)
 {
 	return atomic_cond_read_relaxed(&grp->refcount,
 					VAL != EROFS_LOCKED_MAGIC);
 }
 #else
+static inline bool erofs_workgroup_try_to_freeze(struct erofs_workgroup *grp,
+						 int val)
+{
+	preempt_disable();
+	/* no need to spin on UP platforms, let's just disable preemption. */
+	if (val != atomic_read(&grp->refcount)) {
+		preempt_enable();
+		return false;
+	}
+	return true;
+}
+
+static inline void erofs_workgroup_unfreeze(struct erofs_workgroup *grp,
+					    int orig_val)
+{
+	preempt_enable();
+}
+
 static inline int erofs_wait_on_workgroup_freezed(struct erofs_workgroup *grp)
 {
 	int v = atomic_read(&grp->refcount);
-- 
2.14.4


* [PATCH for-4.19 09/12] staging: erofs: add a full barrier in erofs_workgroup_unfreeze
  2019-02-20  9:18 ` Gao Xiang
@ 2019-02-20  9:18   ` Gao Xiang
  -1 siblings, 0 replies; 68+ messages in thread
From: Gao Xiang @ 2019-02-20  9:18 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, Chao Yu, linux-erofs, miaoxie, Gao Xiang

commit 948bbdb1818b7ad6e539dad4fbd2dd4650793ea9 upstream.

Just like other generic locks, insert a full barrier
to prevent memory reordering.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
---
 drivers/staging/erofs/internal.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/staging/erofs/internal.h b/drivers/staging/erofs/internal.h
index c4c9caf84639..2e9a9a4b4226 100644
--- a/drivers/staging/erofs/internal.h
+++ b/drivers/staging/erofs/internal.h
@@ -199,6 +199,11 @@ static inline bool erofs_workgroup_try_to_freeze(struct erofs_workgroup *grp,
 static inline void erofs_workgroup_unfreeze(struct erofs_workgroup *grp,
 					    int orig_val)
 {
+	/*
+	 * other observers should notice all modifications
+	 * in the freezing period.
+	 */
+	smp_mb();
 	atomic_set(&grp->refcount, orig_val);
 	preempt_enable();
 }
-- 
2.14.4




* [PATCH for-4.19 10/12] staging: erofs: {dir,inode,super}.c: rectify BUG_ONs
  2019-02-20  9:18 ` Gao Xiang
@ 2019-02-20  9:18   ` Gao Xiang
  -1 siblings, 0 replies; 68+ messages in thread
From: Gao Xiang @ 2019-02-20  9:18 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, Chao Yu, linux-erofs, miaoxie, Gao Xiang

commit 8b987bca2d09649683cbe496419a011df8c08493 upstream.

Remove all redundant BUG_ONs, and turn the remaining
useful ones into DBG_BUGONs.

Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Conflicts:
	drivers/staging/erofs/super.c

Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
---
 drivers/staging/erofs/dir.c   |  7 +++++--
 drivers/staging/erofs/inode.c | 10 ++++++++--
 drivers/staging/erofs/super.c | 13 ++++++-------
 3 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/drivers/staging/erofs/dir.c b/drivers/staging/erofs/dir.c
index be6ae3b1bdbe..04b84ff31d03 100644
--- a/drivers/staging/erofs/dir.c
+++ b/drivers/staging/erofs/dir.c
@@ -53,8 +53,11 @@ static int erofs_fill_dentries(struct dir_context *ctx,
 			strnlen(de_name, maxsize - nameoff) :
 			le16_to_cpu(de[1].nameoff) - nameoff;
 
-		/* the corrupted directory found */
-		BUG_ON(de_namelen < 0);
+		/* a corrupted entry is found */
+		if (unlikely(de_namelen < 0)) {
+			DBG_BUGON(1);
+			return -EIO;
+		}
 
 #ifdef CONFIG_EROFS_FS_DEBUG
 		dbg_namelen = min(EROFS_NAME_LEN - 1, de_namelen);
diff --git a/drivers/staging/erofs/inode.c b/drivers/staging/erofs/inode.c
index fbf6ff25cd1b..9e7815f55a17 100644
--- a/drivers/staging/erofs/inode.c
+++ b/drivers/staging/erofs/inode.c
@@ -132,7 +132,13 @@ static int fill_inline_data(struct inode *inode, void *data, unsigned m_pofs)
 			return -ENOMEM;
 
 		m_pofs += vi->inode_isize + vi->xattr_isize;
-		BUG_ON(m_pofs + inode->i_size > PAGE_SIZE);
+
+		/* inline symlink data shouldn't across page boundary as well */
+		if (unlikely(m_pofs + inode->i_size > PAGE_SIZE)) {
+			DBG_BUGON(1);
+			kfree(lnk);
+			return -EIO;
+		}
 
 		/* get in-page inline data */
 		memcpy(lnk, data + m_pofs, inode->i_size);
@@ -170,7 +176,7 @@ static int fill_inode(struct inode *inode, int isdir)
 		return PTR_ERR(page);
 	}
 
-	BUG_ON(!PageUptodate(page));
+	DBG_BUGON(!PageUptodate(page));
 	data = page_address(page);
 
 	err = read_inode(inode, data + ofs);
diff --git a/drivers/staging/erofs/super.c b/drivers/staging/erofs/super.c
index 434616b3e76c..b0583cdb079a 100644
--- a/drivers/staging/erofs/super.c
+++ b/drivers/staging/erofs/super.c
@@ -40,7 +40,6 @@ static int erofs_init_inode_cache(void)
 
 static void erofs_exit_inode_cache(void)
 {
-	BUG_ON(erofs_inode_cachep == NULL);
 	kmem_cache_destroy(erofs_inode_cachep);
 }
 
@@ -265,8 +264,8 @@ static int managed_cache_releasepage(struct page *page, gfp_t gfp_mask)
 	int ret = 1;	/* 0 - busy */
 	struct address_space *const mapping = page->mapping;
 
-	BUG_ON(!PageLocked(page));
-	BUG_ON(mapping->a_ops != &managed_cache_aops);
+	DBG_BUGON(!PageLocked(page));
+	DBG_BUGON(mapping->a_ops != &managed_cache_aops);
 
 	if (PagePrivate(page))
 		ret = erofs_try_to_free_cached_page(mapping, page);
@@ -279,10 +278,10 @@ static void managed_cache_invalidatepage(struct page *page,
 {
 	const unsigned int stop = length + offset;
 
-	BUG_ON(!PageLocked(page));
+	DBG_BUGON(!PageLocked(page));
 
-	/* Check for overflow */
-	BUG_ON(stop > PAGE_SIZE || stop < length);
+	/* Check for potential overflow in debug mode */
+	DBG_BUGON(stop > PAGE_SIZE || stop < length);
 
 	if (offset == 0 && stop == PAGE_SIZE)
 		while (!managed_cache_releasepage(page, GFP_NOFS))
@@ -619,7 +618,7 @@ static int erofs_show_options(struct seq_file *seq, struct dentry *root)
 
 static int erofs_remount(struct super_block *sb, int *flags, char *data)
 {
-	BUG_ON(!sb_rdonly(sb));
+	DBG_BUGON(!sb_rdonly(sb));
 
 	*flags |= SB_RDONLY;
 	return 0;
-- 
2.14.4




* [PATCH for-4.19 11/12] staging: erofs: unzip_{pagevec.h,vle.c}: rectify BUG_ONs
  2019-02-20  9:18 ` Gao Xiang
@ 2019-02-20  9:18   ` Gao Xiang
  -1 siblings, 0 replies; 68+ messages in thread
From: Gao Xiang @ 2019-02-20  9:18 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, Chao Yu, linux-erofs, miaoxie, Gao Xiang

commit 70b17991d89554cdd16f3e4fb0179bcc03c808d9 upstream.

Remove all redundant BUG_ONs, and turn the remaining
useful ones into DBG_BUGONs.

Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Conflicts:
	drivers/staging/erofs/unzip_vle.c

Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
---
 drivers/staging/erofs/unzip_pagevec.h |  2 +-
 drivers/staging/erofs/unzip_vle.c     | 35 ++++++++++++++---------------------
 2 files changed, 15 insertions(+), 22 deletions(-)

diff --git a/drivers/staging/erofs/unzip_pagevec.h b/drivers/staging/erofs/unzip_pagevec.h
index 0956615b86f7..23856ba2742d 100644
--- a/drivers/staging/erofs/unzip_pagevec.h
+++ b/drivers/staging/erofs/unzip_pagevec.h
@@ -150,7 +150,7 @@ z_erofs_pagevec_ctor_dequeue(struct z_erofs_pagevec_ctor *ctor,
 	erofs_vtptr_t t;
 
 	if (unlikely(ctor->index >= ctor->nr)) {
-		BUG_ON(ctor->next == NULL);
+		DBG_BUGON(!ctor->next);
 		z_erofs_pagevec_ctor_pagedown(ctor, true);
 	}
 
diff --git a/drivers/staging/erofs/unzip_vle.c b/drivers/staging/erofs/unzip_vle.c
index 63accc4527ce..b82af55c7f84 100644
--- a/drivers/staging/erofs/unzip_vle.c
+++ b/drivers/staging/erofs/unzip_vle.c
@@ -18,9 +18,6 @@ static struct kmem_cache *z_erofs_workgroup_cachep __read_mostly;
 
 void z_erofs_exit_zip_subsystem(void)
 {
-	BUG_ON(z_erofs_workqueue == NULL);
-	BUG_ON(z_erofs_workgroup_cachep == NULL);
-
 	destroy_workqueue(z_erofs_workqueue);
 	kmem_cache_destroy(z_erofs_workgroup_cachep);
 }
@@ -363,7 +360,10 @@ z_erofs_vle_work_register(struct super_block *sb,
 	struct z_erofs_vle_work *work;
 
 	/* if multiref is disabled, grp should never be nullptr */
-	BUG_ON(grp != NULL);
+	if (unlikely(grp)) {
+		DBG_BUGON(1);
+		return ERR_PTR(-EINVAL);
+	}
 
 	/* no available workgroup, let's allocate one */
 	grp = kmem_cache_zalloc(z_erofs_workgroup_cachep, GFP_NOFS);
@@ -742,7 +742,7 @@ static inline void z_erofs_vle_read_endio(struct bio *bio)
 		bool cachemngd = false;
 
 		DBG_BUGON(PageUptodate(page));
-		BUG_ON(page->mapping == NULL);
+		DBG_BUGON(!page->mapping);
 
 #ifdef EROFS_FS_HAS_MANAGED_CACHE
 		if (unlikely(mngda == NULL && !z_erofs_is_stagingpage(page))) {
@@ -800,7 +800,7 @@ static int z_erofs_vle_unzip(struct super_block *sb,
 
 	might_sleep();
 	work = z_erofs_vle_grab_primary_work(grp);
-	BUG_ON(!READ_ONCE(work->nr_pages));
+	DBG_BUGON(!READ_ONCE(work->nr_pages));
 
 	mutex_lock(&work->lock);
 	nr_pages = work->nr_pages;
@@ -849,8 +849,8 @@ static int z_erofs_vle_unzip(struct super_block *sb,
 		else
 			pagenr = z_erofs_onlinepage_index(page);
 
-		BUG_ON(pagenr >= nr_pages);
-		BUG_ON(pages[pagenr] != NULL);
+		DBG_BUGON(pagenr >= nr_pages);
+		DBG_BUGON(pages[pagenr]);
 
 		pages[pagenr] = page;
 	}
@@ -873,9 +873,8 @@ static int z_erofs_vle_unzip(struct super_block *sb,
 		if (z_erofs_is_stagingpage(page))
 			continue;
 #ifdef EROFS_FS_HAS_MANAGED_CACHE
-		else if (page->mapping == mngda) {
-			BUG_ON(PageLocked(page));
-			BUG_ON(!PageUptodate(page));
+		if (page->mapping == mngda) {
+			DBG_BUGON(!PageUptodate(page));
 			continue;
 		}
 #endif
@@ -883,8 +882,8 @@ static int z_erofs_vle_unzip(struct super_block *sb,
 		/* only non-head page could be reused as a compressed page */
 		pagenr = z_erofs_onlinepage_index(page);
 
-		BUG_ON(pagenr >= nr_pages);
-		BUG_ON(pages[pagenr] != NULL);
+		DBG_BUGON(pagenr >= nr_pages);
+		DBG_BUGON(pages[pagenr]);
 		++sparsemem_pages;
 		pages[pagenr] = page;
 
@@ -894,9 +893,6 @@ static int z_erofs_vle_unzip(struct super_block *sb,
 	llen = (nr_pages << PAGE_SHIFT) - work->pageofs;
 
 	if (z_erofs_vle_workgrp_fmt(grp) == Z_EROFS_VLE_WORKGRP_FMT_PLAIN) {
-		/* FIXME! this should be fixed in the future */
-		BUG_ON(grp->llen != llen);
-
 		err = z_erofs_vle_plain_copy(compressed_pages, clusterpages,
 			pages, nr_pages, work->pageofs);
 		goto out;
@@ -911,10 +907,8 @@ static int z_erofs_vle_unzip(struct super_block *sb,
 	if (err != -ENOTSUPP)
 		goto out_percpu;
 
-	if (sparsemem_pages >= nr_pages) {
-		BUG_ON(sparsemem_pages > nr_pages);
+	if (sparsemem_pages >= nr_pages)
 		goto skip_allocpage;
-	}
 
 	for (i = 0; i < nr_pages; ++i) {
 		if (pages[i] != NULL)
@@ -1007,7 +1001,7 @@ static void z_erofs_vle_unzip_wq(struct work_struct *work)
 		struct z_erofs_vle_unzip_io_sb, io.u.work);
 	LIST_HEAD(page_pool);
 
-	BUG_ON(iosb->io.head == Z_EROFS_VLE_WORKGRP_TAIL_CLOSED);
+	DBG_BUGON(iosb->io.head == Z_EROFS_VLE_WORKGRP_TAIL_CLOSED);
 	z_erofs_vle_unzip_all(iosb->sb, &iosb->io, &page_pool);
 
 	put_pages_list(&page_pool);
@@ -1336,7 +1330,6 @@ static inline int __z_erofs_vle_normalaccess_readpages(
 			continue;
 		}
 
-		BUG_ON(PagePrivate(page));
 		set_page_private(page, (unsigned long)head);
 		head = page;
 	}
-- 
2.14.4




* [PATCH for-4.19 12/12] staging: erofs: unzip_vle_lz4.c,utils.c: rectify BUG_ONs
  2019-02-20  9:18 ` Gao Xiang
@ 2019-02-20  9:18   ` Gao Xiang
  -1 siblings, 0 replies; 68+ messages in thread
From: Gao Xiang @ 2019-02-20  9:18 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, Chao Yu, linux-erofs, miaoxie, Gao Xiang

commit b8e076a6ef253e763bfdb81e5c72bcc828b0fbeb upstream.

Remove all redundant BUG_ONs, and turn the remaining
useful ones into DBG_BUGONs.

Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
---
 drivers/staging/erofs/unzip_vle_lz4.c |  2 +-
 drivers/staging/erofs/utils.c         | 12 ++++++------
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/staging/erofs/unzip_vle_lz4.c b/drivers/staging/erofs/unzip_vle_lz4.c
index f5b665f15be5..9cb35cd33365 100644
--- a/drivers/staging/erofs/unzip_vle_lz4.c
+++ b/drivers/staging/erofs/unzip_vle_lz4.c
@@ -57,7 +57,7 @@ int z_erofs_vle_plain_copy(struct page **compressed_pages,
 			if (compressed_pages[j] != page)
 				continue;
 
-			BUG_ON(mirrored[j]);
+			DBG_BUGON(mirrored[j]);
 			memcpy(percpu_data + j * PAGE_SIZE, dst, PAGE_SIZE);
 			mirrored[j] = true;
 			break;
diff --git a/drivers/staging/erofs/utils.c b/drivers/staging/erofs/utils.c
index 389f6182721e..ceee288fbe90 100644
--- a/drivers/staging/erofs/utils.c
+++ b/drivers/staging/erofs/utils.c
@@ -23,9 +23,6 @@ struct page *erofs_allocpage(struct list_head *pool, gfp_t gfp)
 		list_del(&page->lru);
 	} else {
 		page = alloc_pages(gfp | __GFP_NOFAIL, 0);
-
-		BUG_ON(page == NULL);
-		BUG_ON(page->mapping != NULL);
 	}
 	return page;
 }
@@ -60,7 +57,7 @@ struct erofs_workgroup *erofs_find_workgroup(
 		/* decrease refcount added by erofs_workgroup_put */
 		if (unlikely(oldcount == 1))
 			atomic_long_dec(&erofs_global_shrink_cnt);
-		BUG_ON(index != grp->index);
+		DBG_BUGON(index != grp->index);
 	}
 	rcu_read_unlock();
 	return grp;
@@ -73,8 +70,11 @@ int erofs_register_workgroup(struct super_block *sb,
 	struct erofs_sb_info *sbi;
 	int err;
 
-	/* grp->refcount should not < 1 */
-	BUG_ON(!atomic_read(&grp->refcount));
+	/* grp shouldn't be broken or used before */
+	if (unlikely(atomic_read(&grp->refcount) != 1)) {
+		DBG_BUGON(1);
+		return -EINVAL;
+	}
 
 	err = radix_tree_preload(GFP_NOFS);
 	if (err)
-- 
2.14.4




* Re: [PATCH for-4.19 00/12] erofs fixes for linux-4.19.y
  2019-02-20  9:18 ` Gao Xiang
@ 2019-02-22  8:35   ` Greg Kroah-Hartman
  -1 siblings, 0 replies; 68+ messages in thread
From: Greg Kroah-Hartman @ 2019-02-22  8:35 UTC (permalink / raw)
  To: Gao Xiang; +Cc: stable, Chao Yu, linux-erofs, miaoxie

On Wed, Feb 20, 2019 at 05:18:42PM +0800, Gao Xiang wrote:
> This series backports bugfixes already merged in linux upstream
> which we found these issues in our commercial products, which
> are serious and should be fixed immediately.
> 
> Note that it also includes some xarray modification since
> upcoming patches heavily needs it, which can reduce more
> conflicts later.
> 
> All patches have been tested again as a whole.

Some of these patches are not in 4.20, so if a user were to move to that
kernel, they would see a "regression" in the filesystem functionality,
right?  That's not ok, and because of that reason alone, I can't take
this whole series for 4.19.y right now.

So can you prepare a set of patches for 4.20.y also with the missing
patches you have included here?

Also, are all of these really "fixes"?  They feel like they are basic
"get things working properly" type of patches.  The BUG_ONs are not
really fixes as no one _should_ be hitting those in a normal system,
right?  Are they really necessary for systems running 4.19?

thanks,

greg k-h



* Re: [PATCH for-4.19 00/12] erofs fixes for linux-4.19.y
  2019-02-22  8:35   ` Greg Kroah-Hartman
@ 2019-02-22  9:03     ` Gao Xiang
  -1 siblings, 0 replies; 68+ messages in thread
From: Gao Xiang @ 2019-02-22  9:03 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: stable, Chao Yu, linux-erofs, miaoxie

Hi Greg,

On 2019/2/22 16:35, Greg Kroah-Hartman wrote:
> On Wed, Feb 20, 2019 at 05:18:42PM +0800, Gao Xiang wrote:
>> This series backports bugfixes already merged in linux upstream
>> which we found these issues in our commercial products, which
>> are serious and should be fixed immediately.
>>
>> Note that it also includes some xarray modification since
>> upcoming patches heavily needs it, which can reduce more
>> conflicts later.
>>
>> All patches have been tested again as a whole.
> 
> Some of these patches are not in 4.20, so if a user were to move to that
> kernel, they would see a "regression" in the filesystem functionality,
> right?  That's not ok, and because of that reason alone, I can't take
> this whole series for 4.19.y right now.

Yes, you are right. I think 4.20 should be fixed as well.
I will do that later for 4.20; the reason I did 4.19 first is just
that it is an LTS kernel...

> 
> So can you prepare a set of patches for 4.20.y also with the missing
> patches you have included here?

I will do that.

> 
> Also, are all of these really "fixes"?  They feel like they are basic
> "get things working properly" type of patches.  The BUG_ONs are not
> really fixes as no one _should_ be hitting those in a normal system,
> right?  Are they really necessary for systems running 4.19?

Most of these BUG_ONs will crash the kernel when corrupted images
are mounted. Since I plan to run EROFS on the linux 4.19 LTS kernel
in AOSP, these patches are important for that...

Thanks,
Gao Xiang

> 
> thanks,
> 
> greg k-h
> 


* Re: [PATCH for-4.19 02/12] staging: erofs: complete error handing of z_erofs_do_read_page
  2019-02-20  9:18   ` Gao Xiang
@ 2019-02-25 14:59     ` Greg Kroah-Hartman
  -1 siblings, 0 replies; 68+ messages in thread
From: Greg Kroah-Hartman @ 2019-02-25 14:59 UTC (permalink / raw)
  To: Gao Xiang; +Cc: stable, Chao Yu, linux-erofs, miaoxie

On Wed, Feb 20, 2019 at 05:18:44PM +0800, Gao Xiang wrote:
> commit 1e05ff36e6921ca61bdbf779f81a602863569ee3 upstream.
> 
> This patch completes error handing code of z_erofs_do_read_page.
> PG_error will be set when some read error happens, therefore
> z_erofs_onlinepage_endio will unlock this page without setting
> PG_uptodate.
> 
> Reviewed-by: Chao Yu <yucxhao0@huawei.com>
> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> 
> Conflicts:
> 	drivers/staging/erofs/unzip_vle.c

These types of lines are not needed in backports, they just are clutter.
I'll go fix it up...

greg k-h

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH for-4.19 02/12] staging: erofs: complete error handing of z_erofs_do_read_page
  2019-02-25 14:59     ` Greg Kroah-Hartman
@ 2019-02-25 15:04       ` Gao Xiang
  -1 siblings, 0 replies; 68+ messages in thread
From: Gao Xiang @ 2019-02-25 15:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: stable, Chao Yu, linux-erofs, miaoxie



On 2019/2/25 22:59, Greg Kroah-Hartman wrote:
> On Wed, Feb 20, 2019 at 05:18:44PM +0800, Gao Xiang wrote:
>> commit 1e05ff36e6921ca61bdbf779f81a602863569ee3 upstream.
>>
>> This patch completes error handing code of z_erofs_do_read_page.
>> PG_error will be set when some read error happens, therefore
>> z_erofs_onlinepage_endio will unlock this page without setting
>> PG_uptodate.
>>
>> Reviewed-by: Chao Yu <yucxhao0@huawei.com>
>> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
>> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>>
>> Conflicts:
>> 	drivers/staging/erofs/unzip_vle.c
> 
> These types of lines are not needed in backports, they just are clutter.
> I'll go fix it up...

Thanks, Greg... I also added some comments in some patches.
If they are not needed, I will take note for future submissions...

Thanks,
Gao Xiang

> 
> greg k-h
> 

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH for-4.19 06/12] staging: erofs: fix race when the managed cache is enabled
  2019-02-20  9:18   ` Gao Xiang
@ 2019-02-25 15:04     ` Greg Kroah-Hartman
  -1 siblings, 0 replies; 68+ messages in thread
From: Greg Kroah-Hartman @ 2019-02-25 15:04 UTC (permalink / raw)
  To: Gao Xiang; +Cc: stable, Chao Yu, linux-erofs, miaoxie, Matthew Wilcox

On Wed, Feb 20, 2019 at 05:18:48PM +0800, Gao Xiang wrote:
> commit 51232df5e4b268936beccde5248f312a316800be upstream.
> 
> When the managed cache is enabled, the last reference count
> of a workgroup must be used for its workstation.
> 
> Otherwise, it could lead to incorrect (un)freezes in
> the reclaim path, and it would be harmful.
> 
> A typical race as follows:
> 
> Thread 1 (In the reclaim path)  Thread 2
> workgroup_freeze(grp, 1)                                refcnt = 1
> ...
> workgroup_unfreeze(grp, 1)                              refcnt = 1
>                                 workgroup_get(grp)      refcnt = 2 (x)
> workgroup_put(grp)                                      refcnt = 1 (x)
>                                 ...unexpected behaviors
> 
> * grp is detached but still used, which violates cache-managed
>   freeze constraint.
> 
> Reviewed-by: Chao Yu <yuchao0@huawei.com>
> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
> 
> Conflicts:
> 	drivers/staging/erofs/utils.c
> Updates:
> 	include/linux/xarray.h:
> 		add xa_untag_pointer,xa_tag_pointer,xa_pointer_tag
> 		from upstream 3159f943aafd in order to reduce
> 		conflicts.

No, sorry, I don't want to add xarray.h to 4.19.y, that's crazy.

And even if we did, you do not slip it in as part of a different patch,
it should come in as its own patch, with the same git commit id that it
landed in 4.20 with.

Please fix this up...

greg k-h

^ permalink raw reply	[flat|nested] 68+ messages in thread
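[Editorial note] The freeze/get ordering that the patch above fixes can be sketched in userspace with C11 atomics. This is a hypothetical illustration, not the erofs code itself: the names and the WG_FROZEN sentinel are invented for the example. The point is that reclaim may only freeze by atomically swapping the *last* reference for a sentinel, and a lookup must refuse to increment a frozen counter instead of blindly taking a reference the way Thread 2 does in the diagram.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical sentinel; the real code uses a magic "locked" refcount value. */
#define WG_FROZEN (-1)

struct workgroup {
	atomic_int refcnt;
};

/*
 * Reclaim path: detaching is only legal while holding the last reference,
 * so freeze by atomically swapping refcnt 1 -> sentinel.  If anyone else
 * still holds a reference (refcnt > 1), the cmpxchg fails and reclaim
 * backs off.
 */
static bool workgroup_try_freeze(struct workgroup *grp)
{
	int expected = 1;

	return atomic_compare_exchange_strong(&grp->refcnt, &expected,
					      WG_FROZEN);
}

/*
 * Lookup path: Thread 2 in the diagram goes wrong by incrementing refcnt
 * unconditionally.  A safe get must observe the sentinel and bail out
 * (the caller then retries the whole lookup) instead of resurrecting a
 * group that reclaim already owns.
 */
static bool workgroup_get(struct workgroup *grp)
{
	int old = atomic_load(&grp->refcnt);

	do {
		if (old == WG_FROZEN)
			return false;	/* frozen by reclaim; retry later */
	} while (!atomic_compare_exchange_weak(&grp->refcnt, &old, old + 1));
	return true;
}
```

With this shape, the window shown in the diagram closes: between freeze and unfreeze no other thread can take a new reference, so the last-reference invariant that the reclaim path relies on actually holds.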

* Re: [PATCH for-4.19 06/12] staging: erofs: fix race when the managed cache is enabled
  2019-02-25 15:04     ` Greg Kroah-Hartman
@ 2019-02-25 15:07       ` Gao Xiang
  -1 siblings, 0 replies; 68+ messages in thread
From: Gao Xiang @ 2019-02-25 15:07 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: stable, Chao Yu, linux-erofs, miaoxie, Matthew Wilcox



On 2019/2/25 23:04, Greg Kroah-Hartman wrote:
> On Wed, Feb 20, 2019 at 05:18:48PM +0800, Gao Xiang wrote:
>> commit 51232df5e4b268936beccde5248f312a316800be upstream.
>>
>> When the managed cache is enabled, the last reference count
>> of a workgroup must be used for its workstation.
>>
>> Otherwise, it could lead to incorrect (un)freezes in
>> the reclaim path, and it would be harmful.
>>
>> A typical race as follows:
>>
>> Thread 1 (In the reclaim path)  Thread 2
>> workgroup_freeze(grp, 1)                                refcnt = 1
>> ...
>> workgroup_unfreeze(grp, 1)                              refcnt = 1
>>                                 workgroup_get(grp)      refcnt = 2 (x)
>> workgroup_put(grp)                                      refcnt = 1 (x)
>>                                 ...unexpected behaviors
>>
>> * grp is detached but still used, which violates cache-managed
>>   freeze constraint.
>>
>> Reviewed-by: Chao Yu <yuchao0@huawei.com>
>> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
>> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
>>
>> Conflicts:
>> 	drivers/staging/erofs/utils.c
>> Updates:
>> 	include/linux/xarray.h:
>> 		add xa_untag_pointer,xa_tag_pointer,xa_pointer_tag
>> 		from upstream 3159f943aafd in order to reduce
>> 		conflicts.
> 
> No, sorry, I don't want to add xarray.h to 4.19.y, that's crazy.

Or could I define xa_untag_pointer, xa_tag_pointer and xa_pointer_tag
internally in an erofs header? Would that be acceptable?

Thanks,
Gao Xiang

> 
> And even if we did, you do not slip it in as part of a different patch,
> it should come in as its own patch, with the same git commit id that it
> landed in 4.20 with.
> 
> Please fix this up...
> 
> greg k-h
> 

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH for-4.19 06/12] staging: erofs: fix race when the managed cache is enabled
  2019-02-25 15:04     ` Greg Kroah-Hartman
@ 2019-02-25 15:25       ` Matthew Wilcox
  -1 siblings, 0 replies; 68+ messages in thread
From: Matthew Wilcox @ 2019-02-25 15:25 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: Gao Xiang, stable, Chao Yu, linux-erofs, miaoxie

On Mon, Feb 25, 2019 at 04:04:49PM +0100, Greg Kroah-Hartman wrote:
> On Wed, Feb 20, 2019 at 05:18:48PM +0800, Gao Xiang wrote:
> > commit 51232df5e4b268936beccde5248f312a316800be upstream.
> > Updates:
> > 	include/linux/xarray.h:
> > 		add xa_untag_pointer,xa_tag_pointer,xa_pointer_tag
> > 		from upstream 3159f943aafd in order to reduce
> > 		conflicts.
> 
> No, sorry, I don't want to add xarray.h to 4.19.y, that's crazy.

I gave this a quick look when it came past, and I don't particularly
object to this piece going into 4.19.y.  A full-on backport of XArray
to 4.19 will be ... interesting, but essentially this is just some
boilerplate.

> And even if we did, you do not slip it in as part of a different patch,
> it should come in as its own patch, with the same git commit id that it
> landed in 4.20 with.

Putting in all of 3159f943aafdbacb2f94c38fdaadabf2bbde2a14 would be a
bad idea; it actually ended up breaking m68k in a rather unexpected way
which required 66ee620f06f99d72475db6eb638559ba608c7dee, which in turn
caused a memory consumption regression ...

But putting in the xa_tag_pointer() API doesn't concern me.

> Please fix this up...
> 
> greg k-h

^ permalink raw reply	[flat|nested] 68+ messages in thread
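[Editorial note] The three helpers under discussion are indeed tiny. The sketch below paraphrases what upstream 3159f943aafd adds to include/linux/xarray.h (modulo kernel annotations): a tag is stuffed into the two low bits of an aligned pointer and masked back out, which is why this subset is boilerplate enough to backport on its own.

```c
/*
 * Tagged-pointer helpers, paraphrased from the xa_*_pointer() API in
 * upstream commit 3159f943aafd.  Pointers to 4-byte-aligned objects
 * always have their two low bits clear, so those bits can carry a
 * small tag (0-3) alongside the pointer in the same word.
 */
static inline void *xa_tag_pointer(void *p, unsigned long tag)
{
	/* OR the tag into the (zero) low bits of the pointer. */
	return (void *)((unsigned long)p | tag);
}

static inline void *xa_untag_pointer(void *entry)
{
	/* Mask the tag back out to recover the original pointer. */
	return (void *)((unsigned long)entry & ~3UL);
}

static inline unsigned int xa_pointer_tag(void *entry)
{
	/* Extract just the 2-bit tag. */
	return (unsigned int)((unsigned long)entry & 3UL);
}
```

Because these helpers touch no radix-tree or XArray internals, backporting them alone does not drag in the rest of the XArray conversion.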

* Re: [PATCH for-4.19 00/12] erofs fixes for linux-4.19.y
  2019-02-20  9:18 ` Gao Xiang
@ 2019-02-25 15:28   ` Greg Kroah-Hartman
  -1 siblings, 0 replies; 68+ messages in thread
From: Greg Kroah-Hartman @ 2019-02-25 15:28 UTC (permalink / raw)
  To: Gao Xiang; +Cc: stable, Chao Yu, linux-erofs, miaoxie

On Wed, Feb 20, 2019 at 05:18:42PM +0800, Gao Xiang wrote:
> This series backports bugfixes already merged in linux upstream
> which we found these issues in our commerical products, which
> are serious and should be fixed immediately.
> 
> Note that it also includes some xarray modification since
> upcoming patches heavily needs it, which can reduce more
> conflicts later.
> 
> All patches have been tested again as a whole.

All but one now queued up, thanks.

greg k-h

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH for-4.19 06/12] staging: erofs: fix race when the managed cache is enabled
  2019-02-25 15:07       ` Gao Xiang
@ 2019-02-25 15:51         ` Greg Kroah-Hartman
  -1 siblings, 0 replies; 68+ messages in thread
From: Greg Kroah-Hartman @ 2019-02-25 15:51 UTC (permalink / raw)
  To: Gao Xiang; +Cc: stable, Chao Yu, linux-erofs, miaoxie, Matthew Wilcox

On Mon, Feb 25, 2019 at 11:07:19PM +0800, Gao Xiang wrote:
> 
> 
> On 2019/2/25 23:04, Greg Kroah-Hartman wrote:
> > On Wed, Feb 20, 2019 at 05:18:48PM +0800, Gao Xiang wrote:
> >> commit 51232df5e4b268936beccde5248f312a316800be upstream.
> >>
> >> When the managed cache is enabled, the last reference count
> >> of a workgroup must be used for its workstation.
> >>
> >> Otherwise, it could lead to incorrect (un)freezes in
> >> the reclaim path, and it would be harmful.
> >>
> >> A typical race as follows:
> >>
> >> Thread 1 (In the reclaim path)  Thread 2
> >> workgroup_freeze(grp, 1)                                refcnt = 1
> >> ...
> >> workgroup_unfreeze(grp, 1)                              refcnt = 1
> >>                                 workgroup_get(grp)      refcnt = 2 (x)
> >> workgroup_put(grp)                                      refcnt = 1 (x)
> >>                                 ...unexpected behaviors
> >>
> >> * grp is detached but still used, which violates cache-managed
> >>   freeze constraint.
> >>
> >> Reviewed-by: Chao Yu <yuchao0@huawei.com>
> >> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
> >> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> >> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
> >>
> >> Conflicts:
> >> 	drivers/staging/erofs/utils.c
> >> Updates:
> >> 	include/linux/xarray.h:
> >> 		add xa_untag_pointer,xa_tag_pointer,xa_pointer_tag
> >> 		from upstream 3159f943aafd in order to reduce
> >> 		conflicts.
> > 
> > No, sorry, I don't want to add xarray.h to 4.19.y, that's crazy.
> 
> Or can I define these xa_untag_pointer,xa_tag_pointer,xa_pointer_tag in
> a erofs header internally? it is acceptable?

No, that's not ok.

If you want to backport a subset of the api, and Matthew agrees with it
(I think he did), then let's backport a subset, as a single patch, that
matches the original patch comments and git id.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH for-4.19 06/12] staging: erofs: fix race when the managed cache is enabled
  2019-02-25 15:25       ` Matthew Wilcox
@ 2019-02-25 15:52         ` Greg Kroah-Hartman
  -1 siblings, 0 replies; 68+ messages in thread
From: Greg Kroah-Hartman @ 2019-02-25 15:52 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Gao Xiang, stable, Chao Yu, linux-erofs, miaoxie

On Mon, Feb 25, 2019 at 07:25:48AM -0800, Matthew Wilcox wrote:
> On Mon, Feb 25, 2019 at 04:04:49PM +0100, Greg Kroah-Hartman wrote:
> > On Wed, Feb 20, 2019 at 05:18:48PM +0800, Gao Xiang wrote:
> > > commit 51232df5e4b268936beccde5248f312a316800be upstream.
> > > Updates:
> > > 	include/linux/xarray.h:
> > > 		add xa_untag_pointer,xa_tag_pointer,xa_pointer_tag
> > > 		from upstream 3159f943aafd in order to reduce
> > > 		conflicts.
> > 
> > No, sorry, I don't want to add xarray.h to 4.19.y, that's crazy.
> 
> I gave this a quick look when it came past, and I don't particularly
> object to this piece going into 4.19.y.  A full-on backport of XArray
> to 4.19 will be ... interesting, but essentially this is just some
> boilerplate.
> 
> > And even if we did, you do not slip it in as part of a different patch,
> > it should come in as its own patch, with the same git commit id that it
> > landed in 4.20 with.
> 
> Putting in all of 3159f943aafdbacb2f94c38fdaadabf2bbde2a14 would be a
> bad idea; it actually ended up breaking m68k in a rather unexpected way
> which required 66ee620f06f99d72475db6eb638559ba608c7dee, which in turn
> caused a memory consumption regression ...

Where did the chain of regressions stop?  That would be good to know, as
we will have to deal with this over time :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH for-4.19 06/12] staging: erofs: fix race when the managed cache is enabled
  2019-02-25 15:51         ` Greg Kroah-Hartman
@ 2019-02-25 15:57           ` Gao Xiang
  -1 siblings, 0 replies; 68+ messages in thread
From: Gao Xiang @ 2019-02-25 15:57 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Matthew Wilcox
  Cc: Gao Xiang, miaoxie, linux-erofs, stable


On 2019/2/25 23:51, Greg Kroah-Hartman wrote:
> On Mon, Feb 25, 2019 at 11:07:19PM +0800, Gao Xiang wrote:
>>
>> On 2019/2/25 23:04, Greg Kroah-Hartman wrote:
>>> On Wed, Feb 20, 2019 at 05:18:48PM +0800, Gao Xiang wrote:
>>>> commit 51232df5e4b268936beccde5248f312a316800be upstream.
>>>>
>>>> When the managed cache is enabled, the last reference count
>>>> of a workgroup must be used for its workstation.
>>>>
>>>> Otherwise, it could lead to incorrect (un)freezes in
>>>> the reclaim path, and it would be harmful.
>>>>
>>>> A typical race as follows:
>>>>
>>>> Thread 1 (In the reclaim path)  Thread 2
>>>> workgroup_freeze(grp, 1)                                refcnt = 1
>>>> ...
>>>> workgroup_unfreeze(grp, 1)                              refcnt = 1
>>>>                                 workgroup_get(grp)      refcnt = 2 (x)
>>>> workgroup_put(grp)                                      refcnt = 1 (x)
>>>>                                 ...unexpected behaviors
>>>>
>>>> * grp is detached but still used, which violates cache-managed
>>>>   freeze constraint.
>>>>
>>>> Reviewed-by: Chao Yu <yuchao0@huawei.com>
>>>> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
>>>> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>>>> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
>>>>
>>>> Conflicts:
>>>> 	drivers/staging/erofs/utils.c
>>>> Updates:
>>>> 	include/linux/xarray.h:
>>>> 		add xa_untag_pointer,xa_tag_pointer,xa_pointer_tag
>>>> 		from upstream 3159f943aafd in order to reduce
>>>> 		conflicts.
>>> No, sorry, I don't want to add xarray.h to 4.19.y, that's crazy.
>> Or can I define these xa_untag_pointer,xa_tag_pointer,xa_pointer_tag in
>> a erofs header internally? it is acceptable?
> No, that's not ok.
>
> If you want to backport a subset of the api, and Matthew agrees with it
> (I think he did), then let's backport a subset, as a single patch, that
> matches the original patch comments and git id.

OK, I will backport a minimal subset of upstream 3159f943aafd, with
xa_untag_pointer, xa_tag_pointer and xa_pointer_tag as its only valid part.

I will do this soon and thanks Greg and Matthew for taking time on this patch :)

Thanks,
Gao Xiang


>
> thanks,
>
> greg k-h

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH for-4.19 06/12] staging: erofs: fix race when the managed cache is enabled
  2019-02-25 15:52         ` Greg Kroah-Hartman
@ 2019-02-25 16:04           ` Matthew Wilcox
  -1 siblings, 0 replies; 68+ messages in thread
From: Matthew Wilcox @ 2019-02-25 16:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: Gao Xiang, stable, Chao Yu, linux-erofs, miaoxie

On Mon, Feb 25, 2019 at 04:52:43PM +0100, Greg Kroah-Hartman wrote:
> On Mon, Feb 25, 2019 at 07:25:48AM -0800, Matthew Wilcox wrote:
> > Putting in all of 3159f943aafdbacb2f94c38fdaadabf2bbde2a14 would be a
> > bad idea; it actually ended up breaking m68k in a rather unexpected way
> > which required 66ee620f06f99d72475db6eb638559ba608c7dee, which in turn
> > caused a memory consumption regression ...
> 
> Where did the chain of regressions stop?  That would be good to know, as
> we will have to deal with this over time :)

As far as I know, that terminated with
f32f004cddf86d63a9c42994bbce9f4e2f07c9fa but by the point you get there,
we've brought in the whole of the XArray which I, like you, would be
reluctant to do at this point.

It may make sense to bring in the whole of the XArray at some point to
make backporting other fixes easier, but I think we should give it a
few more months to be sure it's solid.


* [PATCH for-4.19 1/2] xarray: Replace exceptional entries
  2019-02-25 15:51         ` Greg Kroah-Hartman
@ 2019-02-25 17:58           ` Gao Xiang
  -1 siblings, 0 replies; 68+ messages in thread
From: Gao Xiang @ 2019-02-25 17:58 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, linux-erofs, Chao Yu, Matthew Wilcox, Gao Xiang

From: Matthew Wilcox <willy@infradead.org>

commit 3159f943aafdbacb2f94c38fdaadabf2bbde2a14 upstream.

Introduce xarray value entries and tagged pointers to replace radix
tree exceptional entries.  This is a slight change in encoding to allow
the use of an extra bit (we can now store BITS_PER_LONG - 1 bits in a
value entry).  It is also a change in emphasis; exceptional entries are
intimidating and different.  As the comment explains, you can choose
to store values or pointers in the xarray and they are both first-class
citizens.

Signed-off-by: Matthew Wilcox <willy@infradead.org>
Reviewed-by: Josef Bacik <jbacik@fb.com>
[ take the minimum subset of tagged pointer support only. ]
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
---
 drivers/staging/erofs/utils.c | 18 ++++++----------
 include/linux/xarray.h        | 48 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 54 insertions(+), 12 deletions(-)

diff --git a/drivers/staging/erofs/utils.c b/drivers/staging/erofs/utils.c
index 595cf90af9bb..bdee9bd09f11 100644
--- a/drivers/staging/erofs/utils.c
+++ b/drivers/staging/erofs/utils.c
@@ -35,7 +35,6 @@ static atomic_long_t erofs_global_shrink_cnt;
 
 #ifdef CONFIG_EROFS_FS_ZIP
 
-/* radix_tree and the future XArray both don't use tagptr_t yet */
 struct erofs_workgroup *erofs_find_workgroup(
 	struct super_block *sb, pgoff_t index, bool *tag)
 {
@@ -47,9 +46,8 @@ struct erofs_workgroup *erofs_find_workgroup(
 	rcu_read_lock();
 	grp = radix_tree_lookup(&sbi->workstn_tree, index);
 	if (grp != NULL) {
-		*tag = radix_tree_exceptional_entry(grp);
-		grp = (void *)((unsigned long)grp &
-			~RADIX_TREE_EXCEPTIONAL_ENTRY);
+		*tag = xa_pointer_tag(grp);
+		grp = xa_untag_pointer(grp);
 
 		if (erofs_workgroup_get(grp, &oldcount)) {
 			/* prefer to relax rcu read side */
@@ -83,9 +81,7 @@ int erofs_register_workgroup(struct super_block *sb,
 	sbi = EROFS_SB(sb);
 	erofs_workstn_lock(sbi);
 
-	if (tag)
-		grp = (void *)((unsigned long)grp |
-			1UL << RADIX_TREE_EXCEPTIONAL_SHIFT);
+	grp = xa_tag_pointer(grp, tag);
 
 	err = radix_tree_insert(&sbi->workstn_tree,
 		grp->index, grp);
@@ -131,9 +127,7 @@ unsigned long erofs_shrink_workstation(struct erofs_sb_info *sbi,
 
 	for (i = 0; i < found; ++i) {
 		int cnt;
-		struct erofs_workgroup *grp = (void *)
-			((unsigned long)batch[i] &
-				~RADIX_TREE_EXCEPTIONAL_ENTRY);
+		struct erofs_workgroup *grp = xa_untag_pointer(batch[i]);
 
 		first_index = grp->index + 1;
 
@@ -150,8 +144,8 @@ unsigned long erofs_shrink_workstation(struct erofs_sb_info *sbi,
 #endif
 			continue;
 
-		if (radix_tree_delete(&sbi->workstn_tree,
-			grp->index) != grp) {
+		if (xa_untag_pointer(radix_tree_delete(&sbi->workstn_tree,
+			grp->index)) != grp) {
 #ifdef EROFS_FS_HAS_MANAGED_CACHE
 skip:
 			erofs_workgroup_unfreeze(grp, 1);
diff --git a/include/linux/xarray.h b/include/linux/xarray.h
index 2dfc8006fe64..6f5c380f2d80 100644
--- a/include/linux/xarray.h
+++ b/include/linux/xarray.h
@@ -9,6 +9,54 @@
 
 #include <linux/spinlock.h>
 
+/**
+ * xa_tag_pointer() - Create an XArray entry for a tagged pointer.
+ * @p: Plain pointer.
+ * @tag: Tag value (0, 1 or 3).
+ *
+ * If the user of the XArray prefers, they can tag their pointers instead
+ * of storing value entries.  Three tags are available (0, 1 and 3).
+ * These are distinct from the xa_mark_t as they are not replicated up
+ * through the array and cannot be searched for.
+ *
+ * Context: Any context.
+ * Return: An XArray entry.
+ */
+static inline void *xa_tag_pointer(void *p, unsigned long tag)
+{
+	return (void *)((unsigned long)p | tag);
+}
+
+/**
+ * xa_untag_pointer() - Turn an XArray entry into a plain pointer.
+ * @entry: XArray entry.
+ *
+ * If you have stored a tagged pointer in the XArray, call this function
+ * to get the untagged version of the pointer.
+ *
+ * Context: Any context.
+ * Return: A pointer.
+ */
+static inline void *xa_untag_pointer(void *entry)
+{
+	return (void *)((unsigned long)entry & ~3UL);
+}
+
+/**
+ * xa_pointer_tag() - Get the tag stored in an XArray entry.
+ * @entry: XArray entry.
+ *
+ * If you have stored a tagged pointer in the XArray, call this function
+ * to get the tag of that pointer.
+ *
+ * Context: Any context.
+ * Return: A tag.
+ */
+static inline unsigned int xa_pointer_tag(void *entry)
+{
+	return (unsigned long)entry & 3UL;
+}
+
 #define xa_trylock(xa)		spin_trylock(&(xa)->xa_lock)
 #define xa_lock(xa)		spin_lock(&(xa)->xa_lock)
 #define xa_unlock(xa)		spin_unlock(&(xa)->xa_lock)
-- 
2.11.0



* [PATCH for-4.19 2/2] staging: erofs: fix race when the managed cache is enabled
  2019-02-25 17:58           ` Gao Xiang
@ 2019-02-25 17:58             ` Gao Xiang
  -1 siblings, 0 replies; 68+ messages in thread
From: Gao Xiang @ 2019-02-25 17:58 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, linux-erofs, Chao Yu, Gao Xiang

From: Gao Xiang <gaoxiang25@huawei.com>

commit 51232df5e4b268936beccde5248f312a316800be upstream.

When the managed cache is enabled, the last reference count
of a workgroup must be used for its workstation.

Otherwise, it could lead to incorrect (un)freezes in
the reclaim path, and it would be harmful.

A typical race is as follows:

Thread 1 (In the reclaim path)  Thread 2
workgroup_freeze(grp, 1)                                refcnt = 1
...
workgroup_unfreeze(grp, 1)                              refcnt = 1
                                workgroup_get(grp)      refcnt = 2 (x)
workgroup_put(grp)                                      refcnt = 1 (x)
                                ...unexpected behaviors

* grp is detached but still used, which violates cache-managed
  freeze constraint.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
---
 drivers/staging/erofs/internal.h |   1 +
 drivers/staging/erofs/utils.c    | 134 +++++++++++++++++++++++++++------------
 2 files changed, 96 insertions(+), 39 deletions(-)

diff --git a/drivers/staging/erofs/internal.h b/drivers/staging/erofs/internal.h
index e6313c54e3ad..122ea5016f3b 100644
--- a/drivers/staging/erofs/internal.h
+++ b/drivers/staging/erofs/internal.h
@@ -240,6 +240,7 @@ static inline bool erofs_workgroup_get(struct erofs_workgroup *grp, int *ocnt)
 }
 
 #define __erofs_workgroup_get(grp)	atomic_inc(&(grp)->refcount)
+#define __erofs_workgroup_put(grp)	atomic_dec(&(grp)->refcount)
 
 extern int erofs_workgroup_put(struct erofs_workgroup *grp);
 
diff --git a/drivers/staging/erofs/utils.c b/drivers/staging/erofs/utils.c
index bdee9bd09f11..bcabbb85d40e 100644
--- a/drivers/staging/erofs/utils.c
+++ b/drivers/staging/erofs/utils.c
@@ -83,12 +83,21 @@ int erofs_register_workgroup(struct super_block *sb,
 
 	grp = xa_tag_pointer(grp, tag);
 
-	err = radix_tree_insert(&sbi->workstn_tree,
-		grp->index, grp);
+	/*
+	 * Bump up reference count before making this workgroup
+	 * visible to other users in order to avoid potential UAF
+	 * without serialized by erofs_workstn_lock.
+	 */
+	__erofs_workgroup_get(grp);
 
-	if (!err) {
-		__erofs_workgroup_get(grp);
-	}
+	err = radix_tree_insert(&sbi->workstn_tree,
+				grp->index, grp);
+	if (unlikely(err))
+		/*
+		 * it's safe to decrease since the workgroup isn't visible
+		 * and refcount >= 2 (cannot be freezed).
+		 */
+		__erofs_workgroup_put(grp);
 
 	erofs_workstn_unlock(sbi);
 	radix_tree_preload_end();
@@ -97,19 +106,94 @@ int erofs_register_workgroup(struct super_block *sb,
 
 extern void erofs_workgroup_free_rcu(struct erofs_workgroup *grp);
 
+static void  __erofs_workgroup_free(struct erofs_workgroup *grp)
+{
+	atomic_long_dec(&erofs_global_shrink_cnt);
+	erofs_workgroup_free_rcu(grp);
+}
+
 int erofs_workgroup_put(struct erofs_workgroup *grp)
 {
 	int count = atomic_dec_return(&grp->refcount);
 
 	if (count == 1)
 		atomic_long_inc(&erofs_global_shrink_cnt);
-	else if (!count) {
-		atomic_long_dec(&erofs_global_shrink_cnt);
-		erofs_workgroup_free_rcu(grp);
-	}
+	else if (!count)
+		__erofs_workgroup_free(grp);
 	return count;
 }
 
+#ifdef EROFS_FS_HAS_MANAGED_CACHE
+/* for cache-managed case, customized reclaim paths exist */
+static void erofs_workgroup_unfreeze_final(struct erofs_workgroup *grp)
+{
+	erofs_workgroup_unfreeze(grp, 0);
+	__erofs_workgroup_free(grp);
+}
+
+bool erofs_try_to_release_workgroup(struct erofs_sb_info *sbi,
+				    struct erofs_workgroup *grp,
+				    bool cleanup)
+{
+	/*
+	 * for managed cache enabled, the refcount of workgroups
+	 * themselves could be < 0 (freezed). So there is no guarantee
+	 * that all refcount > 0 if managed cache is enabled.
+	 */
+	if (!erofs_workgroup_try_to_freeze(grp, 1))
+		return false;
+
+	/*
+	 * note that all cached pages should be unlinked
+	 * before delete it from the radix tree.
+	 * Otherwise some cached pages of an orphan old workgroup
+	 * could be still linked after the new one is available.
+	 */
+	if (erofs_try_to_free_all_cached_pages(sbi, grp)) {
+		erofs_workgroup_unfreeze(grp, 1);
+		return false;
+	}
+
+	/*
+	 * it is impossible to fail after the workgroup is freezed,
+	 * however in order to avoid some race conditions, add a
+	 * DBG_BUGON to observe this in advance.
+	 */
+	DBG_BUGON(xa_untag_pointer(radix_tree_delete(&sbi->workstn_tree,
+						     grp->index)) != grp);
+
+	/*
+	 * if managed cache is enable, the last refcount
+	 * should indicate the related workstation.
+	 */
+	erofs_workgroup_unfreeze_final(grp);
+	return true;
+}
+
+#else
+/* for nocache case, no customized reclaim path at all */
+bool erofs_try_to_release_workgroup(struct erofs_sb_info *sbi,
+				    struct erofs_workgroup *grp,
+				    bool cleanup)
+{
+	int cnt = atomic_read(&grp->refcount);
+
+	DBG_BUGON(cnt <= 0);
+	DBG_BUGON(cleanup && cnt != 1);
+
+	if (cnt > 1)
+		return false;
+
+	DBG_BUGON(xa_untag_pointer(radix_tree_delete(&sbi->workstn_tree,
+						     grp->index)) != grp);
+
+	/* (rarely) could be grabbed again when freeing */
+	erofs_workgroup_put(grp);
+	return true;
+}
+
+#endif
+
 unsigned long erofs_shrink_workstation(struct erofs_sb_info *sbi,
 				       unsigned long nr_shrink,
 				       bool cleanup)
@@ -126,42 +210,14 @@ unsigned long erofs_shrink_workstation(struct erofs_sb_info *sbi,
 		batch, first_index, PAGEVEC_SIZE);
 
 	for (i = 0; i < found; ++i) {
-		int cnt;
 		struct erofs_workgroup *grp = xa_untag_pointer(batch[i]);
 
 		first_index = grp->index + 1;
 
-		cnt = atomic_read(&grp->refcount);
-		BUG_ON(cnt <= 0);
-
-		if (cleanup)
-			BUG_ON(cnt != 1);
-
-#ifndef EROFS_FS_HAS_MANAGED_CACHE
-		else if (cnt > 1)
-#else
-		if (!erofs_workgroup_try_to_freeze(grp, 1))
-#endif
+		/* try to shrink each valid workgroup */
+		if (!erofs_try_to_release_workgroup(sbi, grp, cleanup))
 			continue;
 
-		if (xa_untag_pointer(radix_tree_delete(&sbi->workstn_tree,
-			grp->index)) != grp) {
-#ifdef EROFS_FS_HAS_MANAGED_CACHE
-skip:
-			erofs_workgroup_unfreeze(grp, 1);
-#endif
-			continue;
-		}
-
-#ifdef EROFS_FS_HAS_MANAGED_CACHE
-		if (erofs_try_to_free_all_cached_pages(sbi, grp))
-			goto skip;
-
-		erofs_workgroup_unfreeze(grp, 1);
-#endif
-		/* (rarely) grabbed again when freeing */
-		erofs_workgroup_put(grp);
-
 		++freed;
 		if (unlikely(!--nr_shrink))
 			break;
-- 
2.11.0



* Re: [PATCH for-4.19 1/2] xarray: Replace exceptional entries
  2019-02-25 17:58           ` Gao Xiang
@ 2019-02-25 18:27             ` Matthew Wilcox
  -1 siblings, 0 replies; 68+ messages in thread
From: Matthew Wilcox @ 2019-02-25 18:27 UTC (permalink / raw)
  To: Gao Xiang; +Cc: stable, Greg Kroah-Hartman, linux-erofs, Chao Yu, Gao Xiang

On Tue, Feb 26, 2019 at 01:58:08AM +0800, Gao Xiang wrote:
> +/**
> + * xa_tag_pointer() - Create an XArray entry for a tagged pointer.
> + * @p: Plain pointer.
> + * @tag: Tag value (0, 1 or 3).
> + *
> + * If the user of the XArray prefers, they can tag their pointers instead
> + * of storing value entries.  Three tags are available (0, 1 and 3).
> + * These are distinct from the xa_mark_t as they are not replicated up
> + * through the array and cannot be searched for.
> + *
> + * Context: Any context.
> + * Return: An XArray entry.
> + */
> +static inline void *xa_tag_pointer(void *p, unsigned long tag)
> +{
> +	return (void *)((unsigned long)p | tag);
> +}

I think we have to diverge from upstream here.  Part of the original
commit is changing the format of internal & exceptional entries to give
us an extra bit.  This implementation of xa_tag_pointer would transform
a pointer tagged with value 1 into an internal pointer, which would
break the radix tree.

I would suggest:

+/**
+ * xa_tag_pointer() - Create an XArray entry for a tagged pointer.
+ * @p: Plain pointer.
+ * @tag: Tag value (0 or 1).
+ *
+ * If the user of the XArray prefers, they can tag their pointers instead
+ * of storing value entries.  Two tags are available (0 and 1).
+ * These are distinct from the xa_mark_t as they are not replicated up
+ * through the array and cannot be searched for.
+ *
+ * Context: Any context.
+ * Return: An XArray entry.
+ */
+static inline void *xa_tag_pointer(void *p, unsigned long tag)
+{
+	BUG_ON(tag > 1);
+	return (void *)((unsigned long)p | (tag << 1));
+}

xa_untag_pointer() and xa_pointer_tag() will need corresponding changes.


* Re: [PATCH for-4.19 1/2] xarray: Replace exceptional entries
  2019-02-25 18:27             ` Matthew Wilcox
@ 2019-02-26  1:21               ` Gao Xiang
  -1 siblings, 0 replies; 68+ messages in thread
From: Gao Xiang @ 2019-02-26  1:21 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Gao Xiang, stable, Greg Kroah-Hartman, linux-erofs, Chao Yu, Miao Xie

Hi Matthew,

On 2019/2/26 2:27, Matthew Wilcox wrote:
> On Tue, Feb 26, 2019 at 01:58:08AM +0800, Gao Xiang wrote:
>> +/**
>> + * xa_tag_pointer() - Create an XArray entry for a tagged pointer.
>> + * @p: Plain pointer.
>> + * @tag: Tag value (0, 1 or 3).
>> + *
>> + * If the user of the XArray prefers, they can tag their pointers instead
>> + * of storing value entries.  Three tags are available (0, 1 and 3).
>> + * These are distinct from the xa_mark_t as they are not replicated up
>> + * through the array and cannot be searched for.
>> + *
>> + * Context: Any context.
>> + * Return: An XArray entry.
>> + */
>> +static inline void *xa_tag_pointer(void *p, unsigned long tag)
>> +{
>> +	return (void *)((unsigned long)p | tag);
>> +}
> 
> I think we have to diverge from upstream here.  Part of the original
> commit is changing the format of internal & exceptional entries to give
> us an extra bit.  This implementation of xa_tag_pointer would transform
> a pointer tagged with value 1 into an internal pointer, which would
> break the radix tree.

Sorry for the delayed reply. I noticed this: the radix tree uses 1x
(bit 1) for storing an exceptional entry,
#define RADIX_TREE_EXCEPTIONAL_ENTRY   2
but the XArray uses x1 (bit 0) instead.

But I didn't notice that this commit changes the radix tree encoding as well;
let me fix it soon, and thanks for pointing it out :)

Thanks,
Gao Xiang

> 
> I would suggest:
> 
> +/**
> + * xa_tag_pointer() - Create an XArray entry for a tagged pointer.
> + * @p: Plain pointer.
> + * @tag: Tag value (0 or 1).
> + *
> + * If the user of the XArray prefers, they can tag their pointers instead
> + * of storing value entries.  Two tags are available (0 and 1).
> + * These are distinct from the xa_mark_t as they are not replicated up
> + * through the array and cannot be searched for.
> + *
> + * Context: Any context.
> + * Return: An XArray entry.
> + */
> +static inline void *xa_tag_pointer(void *p, unsigned long tag)
> +{
> +	BUG_ON(tag > 1);
> +	return (void *)((unsigned long)p | (tag << 1));
> +}
> 
> xa_untag_pointer() and xa_pointer_tag() will need corresponding changes.
> 

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH v2 for-4.19 1/2] xarray: Replace exceptional entries
  2019-02-25 18:27             ` Matthew Wilcox
@ 2019-02-26  5:14               ` Gao Xiang
  -1 siblings, 0 replies; 68+ messages in thread
From: Gao Xiang @ 2019-02-26  5:14 UTC (permalink / raw)
  To: stable, Greg Kroah-Hartman, Matthew Wilcox
  Cc: linux-erofs, Chao Yu, Miao Xie, Gao Xiang

From: Matthew Wilcox <willy@infradead.org>

commit 3159f943aafdbacb2f94c38fdaadabf2bbde2a14 upstream.

Introduce xarray value entries and tagged pointers to replace radix
tree exceptional entries.  This is a slight change in encoding to allow
the use of an extra bit (we can now store BITS_PER_LONG - 1 bits in a
value entry).  It is also a change in emphasis; exceptional entries are
intimidating and different.  As the comment explains, you can choose
to store values or pointers in the xarray and they are both first-class
citizens.

Signed-off-by: Matthew Wilcox <willy@infradead.org>
Reviewed-by: Josef Bacik <jbacik@fb.com>
[ take the minimum subset of tagged pointer support only. ]
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
---
change log v2:
 - fix the tagged pointer APIs so they work properly, as pointed out by Matthew;

 drivers/staging/erofs/utils.c | 18 +++++----------
 include/linux/xarray.h        | 52 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 58 insertions(+), 12 deletions(-)

diff --git a/drivers/staging/erofs/utils.c b/drivers/staging/erofs/utils.c
index 595cf90af9bb..bdee9bd09f11 100644
--- a/drivers/staging/erofs/utils.c
+++ b/drivers/staging/erofs/utils.c
@@ -35,7 +35,6 @@ static atomic_long_t erofs_global_shrink_cnt;
 
 #ifdef CONFIG_EROFS_FS_ZIP
 
-/* radix_tree and the future XArray both don't use tagptr_t yet */
 struct erofs_workgroup *erofs_find_workgroup(
 	struct super_block *sb, pgoff_t index, bool *tag)
 {
@@ -47,9 +46,8 @@ struct erofs_workgroup *erofs_find_workgroup(
 	rcu_read_lock();
 	grp = radix_tree_lookup(&sbi->workstn_tree, index);
 	if (grp != NULL) {
-		*tag = radix_tree_exceptional_entry(grp);
-		grp = (void *)((unsigned long)grp &
-			~RADIX_TREE_EXCEPTIONAL_ENTRY);
+		*tag = xa_pointer_tag(grp);
+		grp = xa_untag_pointer(grp);
 
 		if (erofs_workgroup_get(grp, &oldcount)) {
 			/* prefer to relax rcu read side */
@@ -83,9 +81,7 @@ int erofs_register_workgroup(struct super_block *sb,
 	sbi = EROFS_SB(sb);
 	erofs_workstn_lock(sbi);
 
-	if (tag)
-		grp = (void *)((unsigned long)grp |
-			1UL << RADIX_TREE_EXCEPTIONAL_SHIFT);
+	grp = xa_tag_pointer(grp, tag);
 
 	err = radix_tree_insert(&sbi->workstn_tree,
 		grp->index, grp);
@@ -131,9 +127,7 @@ unsigned long erofs_shrink_workstation(struct erofs_sb_info *sbi,
 
 	for (i = 0; i < found; ++i) {
 		int cnt;
-		struct erofs_workgroup *grp = (void *)
-			((unsigned long)batch[i] &
-				~RADIX_TREE_EXCEPTIONAL_ENTRY);
+		struct erofs_workgroup *grp = xa_untag_pointer(batch[i]);
 
 		first_index = grp->index + 1;
 
@@ -150,8 +144,8 @@ unsigned long erofs_shrink_workstation(struct erofs_sb_info *sbi,
 #endif
 			continue;
 
-		if (radix_tree_delete(&sbi->workstn_tree,
-			grp->index) != grp) {
+		if (xa_untag_pointer(radix_tree_delete(&sbi->workstn_tree,
+			grp->index)) != grp) {
 #ifdef EROFS_FS_HAS_MANAGED_CACHE
 skip:
 			erofs_workgroup_unfreeze(grp, 1);
diff --git a/include/linux/xarray.h b/include/linux/xarray.h
index 2dfc8006fe64..51ae9779d08b 100644
--- a/include/linux/xarray.h
+++ b/include/linux/xarray.h
@@ -9,6 +9,58 @@
 
 #include <linux/spinlock.h>
 
+/**
+ * xa_tag_pointer() - Create an XArray entry for a tagged pointer.
+ * @p: Plain pointer.
+ * @tag: Tag value (0 or 1).
+ *
+ * If the user of the XArray prefers, they can tag their pointers instead
+ * of storing value entries.  Two tags are available (0 and 1).
+ * These are distinct from the xa_mark_t as they are not replicated up
+ * through the array and cannot be searched for.
+ *
+ * Context: Any context.
+ * Return: An XArray entry.
+ */
+static inline void *xa_tag_pointer(void *p, unsigned long tag)
+{
+	if (__builtin_constant_p(tag))
+		BUILD_BUG_ON(tag > 1);
+	else
+		BUG_ON(tag > 1);
+	return (void *)((unsigned long)p | (tag << 1));
+}
+
+/**
+ * xa_untag_pointer() - Turn an XArray entry into a plain pointer.
+ * @entry: XArray entry.
+ *
+ * If you have stored a tagged pointer in the XArray, call this function
+ * to get the untagged version of the pointer.
+ *
+ * Context: Any context.
+ * Return: A pointer.
+ */
+static inline void *xa_untag_pointer(void *entry)
+{
+	return (void *)((unsigned long)entry & ~3UL);
+}
+
+/**
+ * xa_pointer_tag() - Get the tag stored in an XArray entry.
+ * @entry: XArray entry.
+ *
+ * If you have stored a tagged pointer in the XArray, call this function
+ * to get the tag of that pointer.
+ *
+ * Context: Any context.
+ * Return: A tag.
+ */
+static inline unsigned int xa_pointer_tag(void *entry)
+{
+	return ((unsigned long)entry & 3UL) >> 1;
+}
+
 #define xa_trylock(xa)		spin_trylock(&(xa)->xa_lock)
 #define xa_lock(xa)		spin_lock(&(xa)->xa_lock)
 #define xa_unlock(xa)		spin_unlock(&(xa)->xa_lock)
-- 
2.14.5


^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v2 for-4.19 2/2] staging: erofs: fix race when the managed cache is enabled
  2019-02-26  5:14               ` Gao Xiang
@ 2019-02-26  5:14                 ` Gao Xiang
  -1 siblings, 0 replies; 68+ messages in thread
From: Gao Xiang @ 2019-02-26  5:14 UTC (permalink / raw)
  To: stable, Greg Kroah-Hartman, Matthew Wilcox
  Cc: linux-erofs, Chao Yu, Miao Xie, Gao Xiang

commit 51232df5e4b268936beccde5248f312a316800be upstream.

When the managed cache is enabled, the last reference count
of a workgroup must be used for its workstation.

Otherwise, it could lead to incorrect (un)freezes in
the reclaim path, and it would be harmful.

A typical race is as follows:

Thread 1 (In the reclaim path)  Thread 2
workgroup_freeze(grp, 1)                                refcnt = 1
...
workgroup_unfreeze(grp, 1)                              refcnt = 1
                                workgroup_get(grp)      refcnt = 2 (x)
workgroup_put(grp)                                      refcnt = 1 (x)
                                ...unexpected behaviors

* grp is detached but still used, which violates cache-managed
  freeze constraint.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
---
 drivers/staging/erofs/internal.h |   1 +
 drivers/staging/erofs/utils.c    | 134 +++++++++++++++++++++++++++------------
 2 files changed, 96 insertions(+), 39 deletions(-)

diff --git a/drivers/staging/erofs/internal.h b/drivers/staging/erofs/internal.h
index e6313c54e3ad..122ea5016f3b 100644
--- a/drivers/staging/erofs/internal.h
+++ b/drivers/staging/erofs/internal.h
@@ -240,6 +240,7 @@ static inline bool erofs_workgroup_get(struct erofs_workgroup *grp, int *ocnt)
 }
 
 #define __erofs_workgroup_get(grp)	atomic_inc(&(grp)->refcount)
+#define __erofs_workgroup_put(grp)	atomic_dec(&(grp)->refcount)
 
 extern int erofs_workgroup_put(struct erofs_workgroup *grp);
 
diff --git a/drivers/staging/erofs/utils.c b/drivers/staging/erofs/utils.c
index bdee9bd09f11..bcabbb85d40e 100644
--- a/drivers/staging/erofs/utils.c
+++ b/drivers/staging/erofs/utils.c
@@ -83,12 +83,21 @@ int erofs_register_workgroup(struct super_block *sb,
 
 	grp = xa_tag_pointer(grp, tag);
 
-	err = radix_tree_insert(&sbi->workstn_tree,
-		grp->index, grp);
+	/*
+	 * Bump up reference count before making this workgroup
+	 * visible to other users in order to avoid potential UAF
+	 * without serialized by erofs_workstn_lock.
+	 */
+	__erofs_workgroup_get(grp);
 
-	if (!err) {
-		__erofs_workgroup_get(grp);
-	}
+	err = radix_tree_insert(&sbi->workstn_tree,
+				grp->index, grp);
+	if (unlikely(err))
+		/*
+		 * it's safe to decrease since the workgroup isn't visible
+		 * and refcount >= 2 (cannot be freezed).
+		 */
+		__erofs_workgroup_put(grp);
 
 	erofs_workstn_unlock(sbi);
 	radix_tree_preload_end();
@@ -97,19 +106,94 @@ int erofs_register_workgroup(struct super_block *sb,
 
 extern void erofs_workgroup_free_rcu(struct erofs_workgroup *grp);
 
+static void  __erofs_workgroup_free(struct erofs_workgroup *grp)
+{
+	atomic_long_dec(&erofs_global_shrink_cnt);
+	erofs_workgroup_free_rcu(grp);
+}
+
 int erofs_workgroup_put(struct erofs_workgroup *grp)
 {
 	int count = atomic_dec_return(&grp->refcount);
 
 	if (count == 1)
 		atomic_long_inc(&erofs_global_shrink_cnt);
-	else if (!count) {
-		atomic_long_dec(&erofs_global_shrink_cnt);
-		erofs_workgroup_free_rcu(grp);
-	}
+	else if (!count)
+		__erofs_workgroup_free(grp);
 	return count;
 }
 
+#ifdef EROFS_FS_HAS_MANAGED_CACHE
+/* for cache-managed case, customized reclaim paths exist */
+static void erofs_workgroup_unfreeze_final(struct erofs_workgroup *grp)
+{
+	erofs_workgroup_unfreeze(grp, 0);
+	__erofs_workgroup_free(grp);
+}
+
+bool erofs_try_to_release_workgroup(struct erofs_sb_info *sbi,
+				    struct erofs_workgroup *grp,
+				    bool cleanup)
+{
+	/*
+	 * for managed cache enabled, the refcount of workgroups
+	 * themselves could be < 0 (freezed). So there is no guarantee
+	 * that all refcount > 0 if managed cache is enabled.
+	 */
+	if (!erofs_workgroup_try_to_freeze(grp, 1))
+		return false;
+
+	/*
+	 * note that all cached pages should be unlinked
+	 * before delete it from the radix tree.
+	 * Otherwise some cached pages of an orphan old workgroup
+	 * could be still linked after the new one is available.
+	 */
+	if (erofs_try_to_free_all_cached_pages(sbi, grp)) {
+		erofs_workgroup_unfreeze(grp, 1);
+		return false;
+	}
+
+	/*
+	 * it is impossible to fail after the workgroup is freezed,
+	 * however in order to avoid some race conditions, add a
+	 * DBG_BUGON to observe this in advance.
+	 */
+	DBG_BUGON(xa_untag_pointer(radix_tree_delete(&sbi->workstn_tree,
+						     grp->index)) != grp);
+
+	/*
+	 * if managed cache is enable, the last refcount
+	 * should indicate the related workstation.
+	 */
+	erofs_workgroup_unfreeze_final(grp);
+	return true;
+}
+
+#else
+/* for nocache case, no customized reclaim path at all */
+bool erofs_try_to_release_workgroup(struct erofs_sb_info *sbi,
+				    struct erofs_workgroup *grp,
+				    bool cleanup)
+{
+	int cnt = atomic_read(&grp->refcount);
+
+	DBG_BUGON(cnt <= 0);
+	DBG_BUGON(cleanup && cnt != 1);
+
+	if (cnt > 1)
+		return false;
+
+	DBG_BUGON(xa_untag_pointer(radix_tree_delete(&sbi->workstn_tree,
+						     grp->index)) != grp);
+
+	/* (rarely) could be grabbed again when freeing */
+	erofs_workgroup_put(grp);
+	return true;
+}
+
+#endif
+
 unsigned long erofs_shrink_workstation(struct erofs_sb_info *sbi,
 				       unsigned long nr_shrink,
 				       bool cleanup)
@@ -126,42 +210,14 @@ unsigned long erofs_shrink_workstation(struct erofs_sb_info *sbi,
 		batch, first_index, PAGEVEC_SIZE);
 
 	for (i = 0; i < found; ++i) {
-		int cnt;
 		struct erofs_workgroup *grp = xa_untag_pointer(batch[i]);
 
 		first_index = grp->index + 1;
 
-		cnt = atomic_read(&grp->refcount);
-		BUG_ON(cnt <= 0);
-
-		if (cleanup)
-			BUG_ON(cnt != 1);
-
-#ifndef EROFS_FS_HAS_MANAGED_CACHE
-		else if (cnt > 1)
-#else
-		if (!erofs_workgroup_try_to_freeze(grp, 1))
-#endif
+		/* try to shrink each valid workgroup */
+		if (!erofs_try_to_release_workgroup(sbi, grp, cleanup))
 			continue;
 
-		if (xa_untag_pointer(radix_tree_delete(&sbi->workstn_tree,
-			grp->index)) != grp) {
-#ifdef EROFS_FS_HAS_MANAGED_CACHE
-skip:
-			erofs_workgroup_unfreeze(grp, 1);
-#endif
-			continue;
-		}
-
-#ifdef EROFS_FS_HAS_MANAGED_CACHE
-		if (erofs_try_to_free_all_cached_pages(sbi, grp))
-			goto skip;
-
-		erofs_workgroup_unfreeze(grp, 1);
-#endif
-		/* (rarely) grabbed again when freeing */
-		erofs_workgroup_put(grp);
-
 		++freed;
 		if (unlikely(!--nr_shrink))
 			break;
-- 
2.14.5


^ permalink raw reply related	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 for-4.19 1/2] xarray: Replace exceptional entries
  2019-02-26  5:14               ` Gao Xiang
@ 2019-02-26 12:43                 ` Gao Xiang
  -1 siblings, 0 replies; 68+ messages in thread
From: Gao Xiang @ 2019-02-26 12:43 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: stable, Greg Kroah-Hartman, linux-erofs, Chao Yu, Miao Xie

Hi Matthew,

On 2019/2/26 13:14, Gao Xiang wrote:
> From: Matthew Wilcox <willy@infradead.org>
> 
> commit 3159f943aafdbacb2f94c38fdaadabf2bbde2a14 upstream.
> 
> Introduce xarray value entries and tagged pointers to replace radix
> tree exceptional entries.  This is a slight change in encoding to allow
> the use of an extra bit (we can now store BITS_PER_LONG - 1 bits in a
> value entry).  It is also a change in emphasis; exceptional entries are
> intimidating and different.  As the comment explains, you can choose
> to store values or pointers in the xarray and they are both first-class
> citizens.
> 
> Signed-off-by: Matthew Wilcox <willy@infradead.org>
> Reviewed-by: Josef Bacik <jbacik@fb.com>
> [ take the minimum subset of tagged pointer support only. ]
> Cc: Matthew Wilcox <willy@infradead.org>
> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
> ---
> change log v2:
>  - fix tagged pointer apis to make it work properly pointed out by Matthew;

Is this version ok? Could you kindly give an "Acked-by" tag for this patch?

The following patch is important for erofs, and I'd like to backport it ASAP:
 staging: erofs: fix race when the managed cache is enabled

It will be of great help. Thanks in advance!

Thanks,
Gao Xiang

> 
>  drivers/staging/erofs/utils.c | 18 +++++----------
>  include/linux/xarray.h        | 52 +++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 58 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/staging/erofs/utils.c b/drivers/staging/erofs/utils.c
> index 595cf90af9bb..bdee9bd09f11 100644
> --- a/drivers/staging/erofs/utils.c
> +++ b/drivers/staging/erofs/utils.c
> @@ -35,7 +35,6 @@ static atomic_long_t erofs_global_shrink_cnt;
>  
>  #ifdef CONFIG_EROFS_FS_ZIP
>  
> -/* radix_tree and the future XArray both don't use tagptr_t yet */
>  struct erofs_workgroup *erofs_find_workgroup(
>  	struct super_block *sb, pgoff_t index, bool *tag)
>  {
> @@ -47,9 +46,8 @@ struct erofs_workgroup *erofs_find_workgroup(
>  	rcu_read_lock();
>  	grp = radix_tree_lookup(&sbi->workstn_tree, index);
>  	if (grp != NULL) {
> -		*tag = radix_tree_exceptional_entry(grp);
> -		grp = (void *)((unsigned long)grp &
> -			~RADIX_TREE_EXCEPTIONAL_ENTRY);
> +		*tag = xa_pointer_tag(grp);
> +		grp = xa_untag_pointer(grp);
>  
>  		if (erofs_workgroup_get(grp, &oldcount)) {
>  			/* prefer to relax rcu read side */
> @@ -83,9 +81,7 @@ int erofs_register_workgroup(struct super_block *sb,
>  	sbi = EROFS_SB(sb);
>  	erofs_workstn_lock(sbi);
>  
> -	if (tag)
> -		grp = (void *)((unsigned long)grp |
> -			1UL << RADIX_TREE_EXCEPTIONAL_SHIFT);
> +	grp = xa_tag_pointer(grp, tag);
>  
>  	err = radix_tree_insert(&sbi->workstn_tree,
>  		grp->index, grp);
> @@ -131,9 +127,7 @@ unsigned long erofs_shrink_workstation(struct erofs_sb_info *sbi,
>  
>  	for (i = 0; i < found; ++i) {
>  		int cnt;
> -		struct erofs_workgroup *grp = (void *)
> -			((unsigned long)batch[i] &
> -				~RADIX_TREE_EXCEPTIONAL_ENTRY);
> +		struct erofs_workgroup *grp = xa_untag_pointer(batch[i]);
>  
>  		first_index = grp->index + 1;
>  
> @@ -150,8 +144,8 @@ unsigned long erofs_shrink_workstation(struct erofs_sb_info *sbi,
>  #endif
>  			continue;
>  
> -		if (radix_tree_delete(&sbi->workstn_tree,
> -			grp->index) != grp) {
> +		if (xa_untag_pointer(radix_tree_delete(&sbi->workstn_tree,
> +			grp->index)) != grp) {
>  #ifdef EROFS_FS_HAS_MANAGED_CACHE
>  skip:
>  			erofs_workgroup_unfreeze(grp, 1);
> diff --git a/include/linux/xarray.h b/include/linux/xarray.h
> index 2dfc8006fe64..51ae9779d08b 100644
> --- a/include/linux/xarray.h
> +++ b/include/linux/xarray.h
> @@ -9,6 +9,58 @@
>  
>  #include <linux/spinlock.h>
>  
> +/**
> + * xa_tag_pointer() - Create an XArray entry for a tagged pointer.
> + * @p: Plain pointer.
> + * @tag: Tag value (0 or 1).
> + *
> + * If the user of the XArray prefers, they can tag their pointers instead
> + * of storing value entries.  Two tags are available (0 and 1).
> + * These are distinct from the xa_mark_t as they are not replicated up
> + * through the array and cannot be searched for.
> + *
> + * Context: Any context.
> + * Return: An XArray entry.
> + */
> +static inline void *xa_tag_pointer(void *p, unsigned long tag)
> +{
> +	if (__builtin_constant_p(tag))
> +		BUILD_BUG_ON(tag > 1);
> +	else
> +		BUG_ON(tag > 1);
> +	return (void *)((unsigned long)p | (tag << 1));
> +}
> +
> +/**
> + * xa_untag_pointer() - Turn an XArray entry into a plain pointer.
> + * @entry: XArray entry.
> + *
> + * If you have stored a tagged pointer in the XArray, call this function
> + * to get the untagged version of the pointer.
> + *
> + * Context: Any context.
> + * Return: A pointer.
> + */
> +static inline void *xa_untag_pointer(void *entry)
> +{
> +	return (void *)((unsigned long)entry & ~3UL);
> +}
> +
> +/**
> + * xa_pointer_tag() - Get the tag stored in an XArray entry.
> + * @entry: XArray entry.
> + *
> + * If you have stored a tagged pointer in the XArray, call this function
> + * to get the tag of that pointer.
> + *
> + * Context: Any context.
> + * Return: A tag.
> + */
> +static inline unsigned int xa_pointer_tag(void *entry)
> +{
> +	return ((unsigned long)entry & 3UL) >> 1;
> +}
> +
>  #define xa_trylock(xa)		spin_trylock(&(xa)->xa_lock)
>  #define xa_lock(xa)		spin_lock(&(xa)->xa_lock)
>  #define xa_unlock(xa)		spin_unlock(&(xa)->xa_lock)
> 

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 for-4.19 1/2] xarray: Replace exceptional entries
  2019-02-26 12:43                 ` Gao Xiang
@ 2019-03-04  5:13                   ` Gao Xiang
  -1 siblings, 0 replies; 68+ messages in thread
From: Gao Xiang @ 2019-03-04  5:13 UTC (permalink / raw)
  To: Matthew Wilcox, Greg Kroah-Hartman; +Cc: stable, linux-erofs, Chao Yu, Miao Xie

ping? Can these two patches be merged into linux-4.19? :'(

Thanks,
Gao Xiang

On 2019/2/26 20:43, Gao Xiang wrote:
> Hi Matthew,
> 
> On 2019/2/26 13:14, Gao Xiang wrote:
>> From: Matthew Wilcox <willy@infradead.org>
>>
>> commit 3159f943aafdbacb2f94c38fdaadabf2bbde2a14 upstream.
>>
>> Introduce xarray value entries and tagged pointers to replace radix
>> tree exceptional entries.  This is a slight change in encoding to allow
>> the use of an extra bit (we can now store BITS_PER_LONG - 1 bits in a
>> value entry).  It is also a change in emphasis; exceptional entries are
>> intimidating and different.  As the comment explains, you can choose
>> to store values or pointers in the xarray and they are both first-class
>> citizens.
>>
>> Signed-off-by: Matthew Wilcox <willy@infradead.org>
>> Reviewed-by: Josef Bacik <jbacik@fb.com>
>> [ take the minimum subset of tagged pointer support only. ]
>> Cc: Matthew Wilcox <willy@infradead.org>
>> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
>> ---
>> change log v2:
>>  - fix tagged pointer apis to make it work properly pointed out by Matthew;
> 
> Is this version ok? Could you kindly give an "Acked-by" tag for this patch?
> 
> The following patch is important for erofs, I'd like to backport it ASAP...
>  staging: erofs: fix race when the managed cache is enabled
> 
> It will be of great help. Thanks in advance!
> 
> Thanks,
> Gao Xiang
> 
>>
>>  drivers/staging/erofs/utils.c | 18 +++++----------
>>  include/linux/xarray.h        | 52 +++++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 58 insertions(+), 12 deletions(-)
>>
>> diff --git a/drivers/staging/erofs/utils.c b/drivers/staging/erofs/utils.c
>> index 595cf90af9bb..bdee9bd09f11 100644
>> --- a/drivers/staging/erofs/utils.c
>> +++ b/drivers/staging/erofs/utils.c
>> @@ -35,7 +35,6 @@ static atomic_long_t erofs_global_shrink_cnt;
>>  
>>  #ifdef CONFIG_EROFS_FS_ZIP
>>  
>> -/* radix_tree and the future XArray both don't use tagptr_t yet */
>>  struct erofs_workgroup *erofs_find_workgroup(
>>  	struct super_block *sb, pgoff_t index, bool *tag)
>>  {
>> @@ -47,9 +46,8 @@ struct erofs_workgroup *erofs_find_workgroup(
>>  	rcu_read_lock();
>>  	grp = radix_tree_lookup(&sbi->workstn_tree, index);
>>  	if (grp != NULL) {
>> -		*tag = radix_tree_exceptional_entry(grp);
>> -		grp = (void *)((unsigned long)grp &
>> -			~RADIX_TREE_EXCEPTIONAL_ENTRY);
>> +		*tag = xa_pointer_tag(grp);
>> +		grp = xa_untag_pointer(grp);
>>  
>>  		if (erofs_workgroup_get(grp, &oldcount)) {
>>  			/* prefer to relax rcu read side */
>> @@ -83,9 +81,7 @@ int erofs_register_workgroup(struct super_block *sb,
>>  	sbi = EROFS_SB(sb);
>>  	erofs_workstn_lock(sbi);
>>  
>> -	if (tag)
>> -		grp = (void *)((unsigned long)grp |
>> -			1UL << RADIX_TREE_EXCEPTIONAL_SHIFT);
>> +	grp = xa_tag_pointer(grp, tag);
>>  
>>  	err = radix_tree_insert(&sbi->workstn_tree,
>>  		grp->index, grp);
>> @@ -131,9 +127,7 @@ unsigned long erofs_shrink_workstation(struct erofs_sb_info *sbi,
>>  
>>  	for (i = 0; i < found; ++i) {
>>  		int cnt;
>> -		struct erofs_workgroup *grp = (void *)
>> -			((unsigned long)batch[i] &
>> -				~RADIX_TREE_EXCEPTIONAL_ENTRY);
>> +		struct erofs_workgroup *grp = xa_untag_pointer(batch[i]);
>>  
>>  		first_index = grp->index + 1;
>>  
>> @@ -150,8 +144,8 @@ unsigned long erofs_shrink_workstation(struct erofs_sb_info *sbi,
>>  #endif
>>  			continue;
>>  
>> -		if (radix_tree_delete(&sbi->workstn_tree,
>> -			grp->index) != grp) {
>> +		if (xa_untag_pointer(radix_tree_delete(&sbi->workstn_tree,
>> +			grp->index)) != grp) {
>>  #ifdef EROFS_FS_HAS_MANAGED_CACHE
>>  skip:
>>  			erofs_workgroup_unfreeze(grp, 1);
>> diff --git a/include/linux/xarray.h b/include/linux/xarray.h
>> index 2dfc8006fe64..51ae9779d08b 100644
>> --- a/include/linux/xarray.h
>> +++ b/include/linux/xarray.h
>> @@ -9,6 +9,58 @@
>>  
>>  #include <linux/spinlock.h>
>>  
>> +/**
>> + * xa_tag_pointer() - Create an XArray entry for a tagged pointer.
>> + * @p: Plain pointer.
>> + * @tag: Tag value (0 or 1).
>> + *
>> + * If the user of the XArray prefers, they can tag their pointers instead
>> + * of storing value entries.  Two tags are available (0 and 1).
>> + * These are distinct from the xa_mark_t as they are not replicated up
>> + * through the array and cannot be searched for.
>> + *
>> + * Context: Any context.
>> + * Return: An XArray entry.
>> + */
>> +static inline void *xa_tag_pointer(void *p, unsigned long tag)
>> +{
>> +	if (__builtin_constant_p(tag))
>> +		BUILD_BUG_ON(tag > 1);
>> +	else
>> +		BUG_ON(tag > 1);
>> +	return (void *)((unsigned long)p | (tag << 1));
>> +}
>> +
>> +/**
>> + * xa_untag_pointer() - Turn an XArray entry into a plain pointer.
>> + * @entry: XArray entry.
>> + *
>> + * If you have stored a tagged pointer in the XArray, call this function
>> + * to get the untagged version of the pointer.
>> + *
>> + * Context: Any context.
>> + * Return: A pointer.
>> + */
>> +static inline void *xa_untag_pointer(void *entry)
>> +{
>> +	return (void *)((unsigned long)entry & ~3UL);
>> +}
>> +
>> +/**
>> + * xa_pointer_tag() - Get the tag stored in an XArray entry.
>> + * @entry: XArray entry.
>> + *
>> + * If you have stored a tagged pointer in the XArray, call this function
>> + * to get the tag of that pointer.
>> + *
>> + * Context: Any context.
>> + * Return: A tag.
>> + */
>> +static inline unsigned int xa_pointer_tag(void *entry)
>> +{
>> +	return ((unsigned long)entry & 3UL) >> 1;
>> +}
>> +
>>  #define xa_trylock(xa)		spin_trylock(&(xa)->xa_lock)
>>  #define xa_lock(xa)		spin_lock(&(xa)->xa_lock)
>>  #define xa_unlock(xa)		spin_unlock(&(xa)->xa_lock)
>>

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v2 for-4.19 1/2] xarray: Replace exceptional entries
  2019-03-04  5:13                   ` Gao Xiang
@ 2019-03-13  9:18                     ` Gao Xiang
  -1 siblings, 0 replies; 68+ messages in thread
From: Gao Xiang @ 2019-03-13  9:18 UTC (permalink / raw)
  To: Matthew Wilcox, Greg Kroah-Hartman; +Cc: stable, linux-erofs, Chao Yu, Miao Xie

ping? ...

On 2019/3/4 13:13, Gao Xiang wrote:
> ping? Can these two patches be merged into linux-4.19? :'(
> 
> Thanks,
> Gao Xiang
> 
> On 2019/2/26 20:43, Gao Xiang wrote:
>> Hi Matthew,
>>
>> On 2019/2/26 13:14, Gao Xiang wrote:
>>> From: Matthew Wilcox <willy@infradead.org>
>>>
>>> commit 3159f943aafdbacb2f94c38fdaadabf2bbde2a14 upstream.
>>>
>>> Introduce xarray value entries and tagged pointers to replace radix
>>> tree exceptional entries.  This is a slight change in encoding to allow
>>> the use of an extra bit (we can now store BITS_PER_LONG - 1 bits in a
>>> value entry).  It is also a change in emphasis; exceptional entries are
>>> intimidating and different.  As the comment explains, you can choose
>>> to store values or pointers in the xarray and they are both first-class
>>> citizens.
>>>
>>> Signed-off-by: Matthew Wilcox <willy@infradead.org>
>>> Reviewed-by: Josef Bacik <jbacik@fb.com>
>>> [ take the minimum subset of tagged pointer support only. ]
>>> Cc: Matthew Wilcox <willy@infradead.org>
>>> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
>>> ---
>>> change log v2:
>>>  - fix tagged pointer apis to make it work properly pointed out by Matthew;
>>
>> Is this version ok? Could you kindly give an "Acked-by" tag for this patch?
>>
>> The following patch is important for erofs, I'd like to backport it ASAP...
>>  staging: erofs: fix race when the managed cache is enabled
>>
>> It will be of great help. Thanks in advance!
>>
>> Thanks,
>> Gao Xiang
>>
>>>
>>>  drivers/staging/erofs/utils.c | 18 +++++----------
>>>  include/linux/xarray.h        | 52 +++++++++++++++++++++++++++++++++++++++++++
>>>  2 files changed, 58 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/drivers/staging/erofs/utils.c b/drivers/staging/erofs/utils.c
>>> index 595cf90af9bb..bdee9bd09f11 100644
>>> --- a/drivers/staging/erofs/utils.c
>>> +++ b/drivers/staging/erofs/utils.c
>>> @@ -35,7 +35,6 @@ static atomic_long_t erofs_global_shrink_cnt;
>>>  
>>>  #ifdef CONFIG_EROFS_FS_ZIP
>>>  
>>> -/* radix_tree and the future XArray both don't use tagptr_t yet */
>>>  struct erofs_workgroup *erofs_find_workgroup(
>>>  	struct super_block *sb, pgoff_t index, bool *tag)
>>>  {
>>> @@ -47,9 +46,8 @@ struct erofs_workgroup *erofs_find_workgroup(
>>>  	rcu_read_lock();
>>>  	grp = radix_tree_lookup(&sbi->workstn_tree, index);
>>>  	if (grp != NULL) {
>>> -		*tag = radix_tree_exceptional_entry(grp);
>>> -		grp = (void *)((unsigned long)grp &
>>> -			~RADIX_TREE_EXCEPTIONAL_ENTRY);
>>> +		*tag = xa_pointer_tag(grp);
>>> +		grp = xa_untag_pointer(grp);
>>>  
>>>  		if (erofs_workgroup_get(grp, &oldcount)) {
>>>  			/* prefer to relax rcu read side */
>>> @@ -83,9 +81,7 @@ int erofs_register_workgroup(struct super_block *sb,
>>>  	sbi = EROFS_SB(sb);
>>>  	erofs_workstn_lock(sbi);
>>>  
>>> -	if (tag)
>>> -		grp = (void *)((unsigned long)grp |
>>> -			1UL << RADIX_TREE_EXCEPTIONAL_SHIFT);
>>> +	grp = xa_tag_pointer(grp, tag);
>>>  
>>>  	err = radix_tree_insert(&sbi->workstn_tree,
>>>  		grp->index, grp);
>>> @@ -131,9 +127,7 @@ unsigned long erofs_shrink_workstation(struct erofs_sb_info *sbi,
>>>  
>>>  	for (i = 0; i < found; ++i) {
>>>  		int cnt;
>>> -		struct erofs_workgroup *grp = (void *)
>>> -			((unsigned long)batch[i] &
>>> -				~RADIX_TREE_EXCEPTIONAL_ENTRY);
>>> +		struct erofs_workgroup *grp = xa_untag_pointer(batch[i]);
>>>  
>>>  		first_index = grp->index + 1;
>>>  
>>> @@ -150,8 +144,8 @@ unsigned long erofs_shrink_workstation(struct erofs_sb_info *sbi,
>>>  #endif
>>>  			continue;
>>>  
>>> -		if (radix_tree_delete(&sbi->workstn_tree,
>>> -			grp->index) != grp) {
>>> +		if (xa_untag_pointer(radix_tree_delete(&sbi->workstn_tree,
>>> +			grp->index)) != grp) {
>>>  #ifdef EROFS_FS_HAS_MANAGED_CACHE
>>>  skip:
>>>  			erofs_workgroup_unfreeze(grp, 1);
>>> diff --git a/include/linux/xarray.h b/include/linux/xarray.h
>>> index 2dfc8006fe64..51ae9779d08b 100644
>>> --- a/include/linux/xarray.h
>>> +++ b/include/linux/xarray.h
>>> @@ -9,6 +9,58 @@
>>>  
>>>  #include <linux/spinlock.h>
>>>  
>>> +/**
>>> + * xa_tag_pointer() - Create an XArray entry for a tagged pointer.
>>> + * @p: Plain pointer.
>>> + * @tag: Tag value (0 or 1).
>>> + *
>>> + * If the user of the XArray prefers, they can tag their pointers instead
>>> + * of storing value entries.  Two tags are available (0 and 1).
>>> + * These are distinct from the xa_mark_t as they are not replicated up
>>> + * through the array and cannot be searched for.
>>> + *
>>> + * Context: Any context.
>>> + * Return: An XArray entry.
>>> + */
>>> +static inline void *xa_tag_pointer(void *p, unsigned long tag)
>>> +{
>>> +	if (__builtin_constant_p(tag))
>>> +		BUILD_BUG_ON(tag > 1);
>>> +	else
>>> +		BUG_ON(tag > 1);
>>> +	return (void *)((unsigned long)p | (tag << 1));
>>> +}
>>> +
>>> +/**
>>> + * xa_untag_pointer() - Turn an XArray entry into a plain pointer.
>>> + * @entry: XArray entry.
>>> + *
>>> + * If you have stored a tagged pointer in the XArray, call this function
>>> + * to get the untagged version of the pointer.
>>> + *
>>> + * Context: Any context.
>>> + * Return: A pointer.
>>> + */
>>> +static inline void *xa_untag_pointer(void *entry)
>>> +{
>>> +	return (void *)((unsigned long)entry & ~3UL);
>>> +}
>>> +
>>> +/**
>>> + * xa_pointer_tag() - Get the tag stored in an XArray entry.
>>> + * @entry: XArray entry.
>>> + *
>>> + * If you have stored a tagged pointer in the XArray, call this function
>>> + * to get the tag of that pointer.
>>> + *
>>> + * Context: Any context.
>>> + * Return: A tag.
>>> + */
>>> +static inline unsigned int xa_pointer_tag(void *entry)
>>> +{
>>> +	return ((unsigned long)entry & 3UL) >> 1;
>>> +}
>>> +
>>>  #define xa_trylock(xa)		spin_trylock(&(xa)->xa_lock)
>>>  #define xa_lock(xa)		spin_lock(&(xa)->xa_lock)
>>>  #define xa_unlock(xa)		spin_unlock(&(xa)->xa_lock)
>>>

^ permalink raw reply	[flat|nested] 68+ messages in thread

end of thread, other threads:[~2019-03-13  9:19 UTC | newest]

Thread overview: 68+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-02-20  9:18 [PATCH for-4.19 00/12] erofs fixes for linux-4.19.y Gao Xiang
2019-02-20  9:18 ` [PATCH for-4.19 01/12] staging: erofs: fix a bug when appling cache strategy Gao Xiang
2019-02-20  9:18 ` [PATCH for-4.19 02/12] staging: erofs: complete error handing of z_erofs_do_read_page Gao Xiang
2019-02-25 14:59   ` Greg Kroah-Hartman
2019-02-25 15:04     ` Gao Xiang
2019-02-20  9:18 ` [PATCH for-4.19 03/12] staging: erofs: replace BUG_ON with DBG_BUGON in data.c Gao Xiang
2019-02-20  9:18 ` [PATCH for-4.19 04/12] staging: erofs: drop multiref support temporarily Gao Xiang
2019-02-20  9:18 ` [PATCH for-4.19 05/12] staging: erofs: remove the redundant d_rehash() for the root dentry Gao Xiang
2019-02-20  9:18 ` [PATCH for-4.19 06/12] staging: erofs: fix race when the managed cache is enabled Gao Xiang
2019-02-25 15:04   ` Greg Kroah-Hartman
2019-02-25 15:07     ` Gao Xiang
2019-02-25 15:51       ` Greg Kroah-Hartman
2019-02-25 15:57         ` Gao Xiang
2019-02-25 17:58         ` [PATCH for-4.19 1/2] xarray: Replace exceptional entries Gao Xiang
2019-02-25 17:58           ` [PATCH for-4.19 2/2] staging: erofs: fix race when the managed cache is enabled Gao Xiang
2019-02-25 18:27           ` [PATCH for-4.19 1/2] xarray: Replace exceptional entries Matthew Wilcox
2019-02-26  1:21             ` Gao Xiang
2019-02-26  5:14             ` [PATCH v2 " Gao Xiang
2019-02-26  5:14               ` [PATCH v2 for-4.19 2/2] staging: erofs: fix race when the managed cache is enabled Gao Xiang
2019-02-26 12:43               ` [PATCH v2 for-4.19 1/2] xarray: Replace exceptional entries Gao Xiang
2019-03-04  5:13                 ` Gao Xiang
2019-03-13  9:18                   ` Gao Xiang
2019-02-25 15:25     ` [PATCH for-4.19 06/12] staging: erofs: fix race when the managed cache is enabled Matthew Wilcox
2019-02-25 15:52       ` Greg Kroah-Hartman
2019-02-25 16:04         ` Matthew Wilcox
2019-02-20  9:18 ` [PATCH for-4.19 07/12] staging: erofs: atomic_cond_read_relaxed on ref-locked workgroup Gao Xiang
2019-02-20  9:18 ` [PATCH for-4.19 08/12] staging: erofs: fix `erofs_workgroup_{try_to_freeze, unfreeze}' Gao Xiang
2019-02-20  9:18 ` [PATCH for-4.19 09/12] staging: erofs: add a full barrier in erofs_workgroup_unfreeze Gao Xiang
2019-02-20  9:18 ` [PATCH for-4.19 10/12] staging: erofs: {dir,inode,super}.c: rectify BUG_ONs Gao Xiang
2019-02-20  9:18 ` [PATCH for-4.19 11/12] staging: erofs: unzip_{pagevec.h,vle.c}: rectify BUG_ONs Gao Xiang
2019-02-20  9:18 ` [PATCH for-4.19 12/12] staging: erofs: unzip_vle_lz4.c,utils.c: rectify BUG_ONs Gao Xiang
2019-02-22  8:35 ` [PATCH for-4.19 00/12] erofs fixes for linux-4.19.y Greg Kroah-Hartman
2019-02-22  9:03   ` Gao Xiang
2019-02-25 15:28 ` Greg Kroah-Hartman
