* [PATCH 0/7] staging: erofs: decompression fixes & cleanups
From: Gao Xiang @ 2018-12-07 16:19 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel
  Cc: LKML, linux-erofs, Chao Yu, Miao Xie, weidu.du, Gao Xiang

Hi,

This patchset has been pending on the erofs mailing list for months.
It mainly focuses on cleaning up the io submission flow in the
decompression subsystem in order to remove the #ifdefs in the related
functions, which were used to differentiate whether cached
decompression is enabled (rather than doing in-place decompression
only).

These patches have been tested by our Android beta users over these
months, and no new issues have been reported.

Thanks,
Gao Xiang

Gao Xiang (7):
  staging: erofs: fix use-after-free of on-stack `z_erofs_vle_unzip_io'
  staging: erofs: introduce MNGD_MAPPING helper
  staging: erofs: localize UNALLOCATED_CACHED_PAGE placeholder
  staging: erofs: revisit the page submission flow
  staging: erofs: refine compressed pages preload flow
  staging: erofs: redefine where `owned_workgrp_t' points
  staging: erofs: simplify `z_erofs_vle_submit_all'

 drivers/staging/erofs/internal.h  |   6 +-
 drivers/staging/erofs/unzip_vle.c | 580 ++++++++++++++++++++++++--------------
 drivers/staging/erofs/unzip_vle.h |   4 +-
 3 files changed, 377 insertions(+), 213 deletions(-)

-- 
2.14.4


* [PATCH 1/7] staging: erofs: fix use-after-free of on-stack `z_erofs_vle_unzip_io'
From: Gao Xiang @ 2018-12-07 16:19 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel
  Cc: LKML, linux-erofs, Chao Yu, Miao Xie, weidu.du, Gao Xiang

The root cause is the following race:
 Thread #0                         Thread #1

 z_erofs_vle_unzip_kickoff         z_erofs_submit_and_unzip

                                    struct z_erofs_vle_unzip_io io[]
   atomic_add_return()
                                    wait_event()
                                    [end of function]
   wake_up()

Since `io' is on Thread #1's stack, the structure goes out of scope
as soon as wait_event() returns, yet Thread #0 may still dereference
it in the subsequent wake_up().

Fix it by taking the waitqueue lock around atomic_add_return() and
wake_up(), which makes the counter update and the wakeup atomic with
respect to the waiter and closes the race.
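
A condensed sketch of both sides after the fix (the waiter line is
simplified from z_erofs_submit_and_unzip(); placing the two threads
together here is only for illustration):

	/* Thread #1 (waiter): `io' lives on this stack frame */
	wait_event(io->u.wait, !atomic_read(&io->pending_bios));
	/* once wait_event() returns, `io' goes out of scope */

	/* Thread #0 (waker): check and wake atomically under wq.lock */
	spin_lock_irqsave(&io->u.wait.lock, flags);
	if (!atomic_add_return(bios, &io->pending_bios))
		wake_up_locked(&io->u.wait);
	spin_unlock_irqrestore(&io->u.wait.lock, flags);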

kernel message:

Unable to handle kernel paging request at virtual address 97f7052caa1303dc
...
Workqueue: kverityd verity_work
task: ffffffe32bcb8000 task.stack: ffffffe3298a0000
PC is at __wake_up_common+0x48/0xa8
LR is at __wake_up+0x3c/0x58
...
Call trace:
...
[<ffffff94a08ff648>] __wake_up_common+0x48/0xa8
[<ffffff94a08ff8b8>] __wake_up+0x3c/0x58
[<ffffff94a0c11b60>] z_erofs_vle_unzip_kickoff+0x40/0x64
[<ffffff94a0c118e4>] z_erofs_vle_read_endio+0x94/0x134
[<ffffff94a0c83c9c>] bio_endio+0xe4/0xf8
[<ffffff94a1076540>] dec_pending+0x134/0x32c
[<ffffff94a1076f28>] clone_endio+0x90/0xf4
[<ffffff94a0c83c9c>] bio_endio+0xe4/0xf8
[<ffffff94a1095024>] verity_work+0x210/0x368
[<ffffff94a08c4150>] process_one_work+0x188/0x4b4
[<ffffff94a08c45bc>] worker_thread+0x140/0x458
[<ffffff94a08cad48>] kthread+0xec/0x108
[<ffffff94a0883ab4>] ret_from_fork+0x10/0x1c
Code: d1006273 54000260 f9400804 b9400019 (b85fc081)
---[ end trace be9dde154f677cd1 ]---

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
---
 drivers/staging/erofs/unzip_vle.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/staging/erofs/unzip_vle.c b/drivers/staging/erofs/unzip_vle.c
index f5d088fdf7f2..4404ea6fb9e4 100644
--- a/drivers/staging/erofs/unzip_vle.c
+++ b/drivers/staging/erofs/unzip_vle.c
@@ -744,13 +744,18 @@ static void z_erofs_vle_unzip_kickoff(void *ptr, int bios)
 	struct z_erofs_vle_unzip_io *io = tagptr_unfold_ptr(t);
 	bool background = tagptr_unfold_tags(t);
 
-	if (atomic_add_return(bios, &io->pending_bios))
+	if (!background) {
+		unsigned long flags;
+
+		spin_lock_irqsave(&io->u.wait.lock, flags);
+		if (!atomic_add_return(bios, &io->pending_bios))
+			wake_up_locked(&io->u.wait);
+		spin_unlock_irqrestore(&io->u.wait.lock, flags);
 		return;
+	}
 
-	if (background)
+	if (!atomic_add_return(bios, &io->pending_bios))
 		queue_work(z_erofs_workqueue, &io->u.work);
-	else
-		wake_up(&io->u.wait);
 }
 
 static inline void z_erofs_vle_read_endio(struct bio *bio)
-- 
2.14.4


* [PATCH 2/7] staging: erofs: introduce MNGD_MAPPING helper
From: Gao Xiang @ 2018-12-07 16:19 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel
  Cc: LKML, linux-erofs, Chao Yu, Miao Xie, weidu.du, Gao Xiang

This patch introduces the MNGD_MAPPING helper to wrap up
sbi->managed_cache->i_mapping, which will be used to cut down
the number of #ifdefs in a single function.

No logic changes.
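
As a quick illustration (the call site below is a sketch rather than
a quote of the code), a user that previously needed an #ifdef pair:

	#ifdef EROFS_FS_HAS_MANAGED_CACHE
		struct address_space *const mc = sbi->managed_cache->i_mapping;
	#endif

can now be written unconditionally, since MNGD_MAPPING(sbi) simply
evaluates to NULL when the managed cache is disabled:

	struct address_space *const mc = MNGD_MAPPING(sbi);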

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
---

The following lines will be changed in the next cleanup patch.
-	noio_outoforder = grab_managed_cache_pages(mngda,
+	noio_outoforder = grab_managed_cache_pages(mc,
 		erofs_blknr(map->m_pa),
 		grp->compressed_pages, erofs_blknr(map->m_plen),
 		/* compressed page caching selection strategy */


 drivers/staging/erofs/internal.h  |  4 ++++
 drivers/staging/erofs/unzip_vle.c | 29 +++++++++++++----------------
 2 files changed, 17 insertions(+), 16 deletions(-)

diff --git a/drivers/staging/erofs/internal.h b/drivers/staging/erofs/internal.h
index 892944355867..b78d6e4c12ab 100644
--- a/drivers/staging/erofs/internal.h
+++ b/drivers/staging/erofs/internal.h
@@ -295,6 +295,10 @@ extern int erofs_try_to_free_all_cached_pages(struct erofs_sb_info *sbi,
 	struct erofs_workgroup *egrp);
 extern int erofs_try_to_free_cached_page(struct address_space *mapping,
 	struct page *page);
+
+#define MNGD_MAPPING(sbi)	((sbi)->managed_cache->i_mapping)
+#else
+#define MNGD_MAPPING(sbi)	(NULL)
 #endif
 
 #define DEFAULT_MAX_SYNC_DECOMPRESS_PAGES	3
diff --git a/drivers/staging/erofs/unzip_vle.c b/drivers/staging/erofs/unzip_vle.c
index 4404ea6fb9e4..ac2e30474520 100644
--- a/drivers/staging/erofs/unzip_vle.c
+++ b/drivers/staging/erofs/unzip_vle.c
@@ -165,7 +165,7 @@ int erofs_try_to_free_all_cached_pages(struct erofs_sb_info *sbi,
 {
 	struct z_erofs_vle_workgroup *const grp =
 		container_of(egrp, struct z_erofs_vle_workgroup, obj);
-	struct address_space *const mapping = sbi->managed_cache->i_mapping;
+	struct address_space *const mapping = MNGD_MAPPING(sbi);
 	const int clusterpages = erofs_clusterpages(sbi);
 	int i;
 
@@ -617,7 +617,7 @@ static int z_erofs_do_read_page(struct z_erofs_vle_frontend *fe,
 	struct z_erofs_vle_work *work = builder->work;
 
 #ifdef EROFS_FS_HAS_MANAGED_CACHE
-	struct address_space *const mngda = sbi->managed_cache->i_mapping;
+	struct address_space *const mc = MNGD_MAPPING(sbi);
 	struct z_erofs_vle_workgroup *grp;
 	bool noio_outoforder;
 #endif
@@ -665,7 +665,7 @@ static int z_erofs_do_read_page(struct z_erofs_vle_frontend *fe,
 	grp = fe->builder.grp;
 
 	/* let's do out-of-order decompression for noio */
-	noio_outoforder = grab_managed_cache_pages(mngda,
+	noio_outoforder = grab_managed_cache_pages(mc,
 		erofs_blknr(map->m_pa),
 		grp->compressed_pages, erofs_blknr(map->m_plen),
 		/* compressed page caching selection strategy */
@@ -764,7 +764,7 @@ static inline void z_erofs_vle_read_endio(struct bio *bio)
 	unsigned int i;
 	struct bio_vec *bvec;
 #ifdef EROFS_FS_HAS_MANAGED_CACHE
-	struct address_space *mngda = NULL;
+	struct address_space *mc = NULL;
 #endif
 
 	bio_for_each_segment_all(bvec, bio, i) {
@@ -775,18 +775,18 @@ static inline void z_erofs_vle_read_endio(struct bio *bio)
 		BUG_ON(!page->mapping);
 
 #ifdef EROFS_FS_HAS_MANAGED_CACHE
-		if (unlikely(!mngda && !z_erofs_is_stagingpage(page))) {
+		if (unlikely(!mc && !z_erofs_is_stagingpage(page))) {
 			struct inode *const inode = page->mapping->host;
 			struct super_block *const sb = inode->i_sb;
 
-			mngda = EROFS_SB(sb)->managed_cache->i_mapping;
+			mc = MNGD_MAPPING(EROFS_SB(sb));
 		}
 
 		/*
-		 * If mngda has not gotten, it equals NULL,
+		 * If mc has not been got yet, it equals NULL;
 		 * however, page->mapping can never be NULL if working properly.
 		 */
-		cachemngd = (page->mapping == mngda);
+		cachemngd = (page->mapping == mc);
 #endif
 
 		if (unlikely(err))
@@ -810,9 +810,6 @@ static int z_erofs_vle_unzip(struct super_block *sb,
 	struct list_head *page_pool)
 {
 	struct erofs_sb_info *const sbi = EROFS_SB(sb);
-#ifdef EROFS_FS_HAS_MANAGED_CACHE
-	struct address_space *const mngda = sbi->managed_cache->i_mapping;
-#endif
 	const unsigned int clusterpages = erofs_clusterpages(sbi);
 
 	struct z_erofs_pagevec_ctor ctor;
@@ -903,7 +900,7 @@ static int z_erofs_vle_unzip(struct super_block *sb,
 		if (z_erofs_is_stagingpage(page))
 			continue;
 #ifdef EROFS_FS_HAS_MANAGED_CACHE
-		else if (page->mapping == mngda) {
+		if (page->mapping == MNGD_MAPPING(sbi)) {
 			BUG_ON(PageLocked(page));
 			BUG_ON(!PageUptodate(page));
 			continue;
@@ -981,7 +978,7 @@ static int z_erofs_vle_unzip(struct super_block *sb,
 		page = compressed_pages[i];
 
 #ifdef EROFS_FS_HAS_MANAGED_CACHE
-		if (page->mapping == mngda)
+		if (page->mapping == MNGD_MAPPING(sbi))
 			continue;
 #endif
 		/* recycle all individual staging pages */
@@ -1114,7 +1111,7 @@ static bool z_erofs_vle_submit_all(struct super_block *sb,
 	const unsigned int clusterpages = erofs_clusterpages(sbi);
 	const gfp_t gfp = GFP_NOFS;
 #ifdef EROFS_FS_HAS_MANAGED_CACHE
-	struct address_space *const mngda = sbi->managed_cache->i_mapping;
+	struct address_space *const mc = MNGD_MAPPING(sbi);
 	struct z_erofs_vle_workgroup *lstgrp_noio = NULL, *lstgrp_io = NULL;
 #endif
 	struct z_erofs_vle_unzip_io *ios[1 + __FSIO_1];
@@ -1187,7 +1184,7 @@ static bool z_erofs_vle_submit_all(struct super_block *sb,
 			cachemngd = true;
 			goto do_allocpage;
 		} else if (page) {
-			if (page->mapping != mngda)
+			if (page->mapping != mc)
 				BUG_ON(PageUptodate(page));
 			else if (recover_managed_page(grp, page)) {
 				/* page is uptodate, skip io submission */
@@ -1210,7 +1207,7 @@ static bool z_erofs_vle_submit_all(struct super_block *sb,
 				goto repeat;
 #ifdef EROFS_FS_HAS_MANAGED_CACHE
 			} else if (cachemngd && !add_to_page_cache_lru(page,
-				mngda, first_index + i, gfp)) {
+				   mc, first_index + i, gfp)) {
 				set_page_private(page, (unsigned long)grp);
 				SetPagePrivate(page);
 #endif
-- 
2.14.4


* [PATCH 3/7] staging: erofs: localize UNALLOCATED_CACHED_PAGE placeholder
From: Gao Xiang @ 2018-12-07 16:19 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel
  Cc: LKML, linux-erofs, Chao Yu, Miao Xie, weidu.du, Gao Xiang

In practice, in order to do cached decompression rather than reusing
file pages for in-place decompression only, and to make full use of
the pages in page_pool instead of allocating new pages whenever
possible, an unallocated placeholder was introduced to mark such
slots in compressed_pages[]; they are replaced with real pages at
submission time.

Previously, EROFS_UNALLOCATED_CACHED_PAGE was defined in internal.h,
which is unnecessary since it is only used internally by the
decompression subsystem. Move it to unzip_vle.c and rename it to
PAGE_UNALLOCATED.
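
The placeholder's lifecycle, condensed from the two call sites in
this file (the exact control flow and error handling are omitted):

	/* preload: reserve the slot without allocating a page yet */
	cmpxchg(compressed_pages + i, NULL, PAGE_UNALLOCATED);

	/* submission: notice the placeholder and allocate for real */
	if (page == PAGE_UNALLOCATED) {
		cachemngd = true;	/* add to the managed cache below */
		page = __stagingpage_alloc(pagepool, gfp);
	}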

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
---
 drivers/staging/erofs/internal.h  |  2 --
 drivers/staging/erofs/unzip_vle.c | 10 ++++++++--
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/staging/erofs/internal.h b/drivers/staging/erofs/internal.h
index b78d6e4c12ab..e049d00c087a 100644
--- a/drivers/staging/erofs/internal.h
+++ b/drivers/staging/erofs/internal.h
@@ -289,8 +289,6 @@ static inline void erofs_workstation_cleanup_all(struct super_block *sb)
 }
 
 #ifdef EROFS_FS_HAS_MANAGED_CACHE
-#define EROFS_UNALLOCATED_CACHED_PAGE	((void *)0x5F0EF00D)
-
 extern int erofs_try_to_free_all_cached_pages(struct erofs_sb_info *sbi,
 	struct erofs_workgroup *egrp);
 extern int erofs_try_to_free_cached_page(struct address_space *mapping,
diff --git a/drivers/staging/erofs/unzip_vle.c b/drivers/staging/erofs/unzip_vle.c
index ac2e30474520..f39e495f4369 100644
--- a/drivers/staging/erofs/unzip_vle.c
+++ b/drivers/staging/erofs/unzip_vle.c
@@ -15,6 +15,12 @@
 
 #include <trace/events/erofs.h>
 
+/*
+ * a compressed_pages[] placeholder to avoid the slot being
+ * filled with file pages for in-place decompression.
+ */
+#define PAGE_UNALLOCATED     ((void *)0x5F0E4B1D)
+
 static struct workqueue_struct *z_erofs_workqueue __read_mostly;
 static struct kmem_cache *z_erofs_workgroup_cachep __read_mostly;
 
@@ -147,7 +153,7 @@ static bool grab_managed_cache_pages(struct address_space *mapping,
 			noio = false;
 			if (!reserve_allocation)
 				continue;
-			page = EROFS_UNALLOCATED_CACHED_PAGE;
+			page = PAGE_UNALLOCATED;
 		}
 
 		if (!cmpxchg(compressed_pages + i, NULL, page))
@@ -1180,7 +1186,7 @@ static bool z_erofs_vle_submit_all(struct super_block *sb,
 #ifdef EROFS_FS_HAS_MANAGED_CACHE
 		cachemngd = false;
 
-		if (page == EROFS_UNALLOCATED_CACHED_PAGE) {
+		if (page == PAGE_UNALLOCATED) {
 			cachemngd = true;
 			goto do_allocpage;
 		} else if (page) {
-- 
2.14.4


* [PATCH 4/7] staging: erofs: revisit the page submission flow
From: Gao Xiang @ 2018-12-07 16:19 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel
  Cc: LKML, linux-erofs, Chao Yu, Miao Xie, weidu.du, Gao Xiang

Previously, the submission flow interacted with the cached compressed
pages reclaim path in a tricky way, and it could become buggy if the
reclaim path were changed later without keeping those tricky
restrictions. For example, PagePrivate(page) is currently evaluated
without taking the page lock (it only follows a wait_on_page_locked(),
which closes the race merely under the current reclaim behaviour),
and nothing handles the potential page truncation case.

In addition, the function is full of #ifdefs, which makes it hard to
understand and maintain. This patch fixes all of the above.
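
The reworked flow handles the truncation case with the usual
lock-and-recheck pattern; roughly (condensed from the new
pickup_page_for_submission() below):

	lock_page(page);
	if (page->mapping == mc) {
		/* still in the managed cache: safe to (re)attach grp */
		set_page_private(page, (unsigned long)grp);
		SetPagePrivate(page);
	} else {
		/* truncated while unlocked: this page cannot be
		 * reused, allocate a fresh cache-managed page instead */
		unlock_page(page);
		put_page(page);
	}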

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
---
 drivers/staging/erofs/unzip_vle.c | 183 +++++++++++++++++++++++---------------
 1 file changed, 112 insertions(+), 71 deletions(-)

diff --git a/drivers/staging/erofs/unzip_vle.c b/drivers/staging/erofs/unzip_vle.c
index f39e495f4369..de41f7b739ac 100644
--- a/drivers/staging/erofs/unzip_vle.c
+++ b/drivers/staging/erofs/unzip_vle.c
@@ -1047,6 +1047,107 @@ static void z_erofs_vle_unzip_wq(struct work_struct *work)
 	kvfree(iosb);
 }
 
+static struct page *
+pickup_page_for_submission(struct z_erofs_vle_workgroup *grp,
+			   unsigned int nr,
+			   struct list_head *pagepool,
+			   struct address_space *mc,
+			   gfp_t gfp)
+{
+	/* determined at compile time to avoid too many #ifdefs */
+	const bool nocache = __builtin_constant_p(mc) ? !mc : false;
+	const pgoff_t index = grp->obj.index;
+	bool tocache = false;
+
+	struct address_space *mapping;
+	struct page *oldpage, *page;
+
+repeat:
+	page = READ_ONCE(grp->compressed_pages[nr]);
+	oldpage = page;
+
+	if (!page)
+		goto out_allocpage;
+
+	/*
+	 * the cached page has not been allocated yet, but a
+	 * placeholder is out there; prepare it now.
+	 */
+	if (!nocache && page == PAGE_UNALLOCATED) {
+		tocache = true;
+		goto out_allocpage;
+	}
+
+	mapping = READ_ONCE(page->mapping);
+
+	/*
+	 * if managed cache is disabled, there is no way to
+	 * get such a cached-like page.
+	 */
+	if (nocache) {
+		/* should be locked, not uptodate, and not truncated */
+		DBG_BUGON(!PageLocked(page));
+		DBG_BUGON(PageUptodate(page));
+		DBG_BUGON(!mapping);
+		goto out;
+	}
+
+	/*
+	 * unmanaged (file) pages are all locked solidly,
+	 * therefore it is impossible for `mapping' to be NULL.
+	 */
+	if (mapping && mapping != mc)
+		/* ought to be unmanaged pages */
+		goto out;
+
+	lock_page(page);
+
+	/* the page is still in manage cache */
+	if (page->mapping == mc) {
+		WRITE_ONCE(grp->compressed_pages[nr], page);
+
+		if (!PagePrivate(page)) {
+			set_page_private(page, (unsigned long)grp);
+			SetPagePrivate(page);
+		}
+
+		/* no need to submit io if it is already up-to-date */
+		if (PageUptodate(page)) {
+			unlock_page(page);
+			page = NULL;
+		}
+		goto out;
+	}
+
+	/*
+	 * the managed page has been truncated; it's unsafe to
+	 * reuse it, so let's allocate a new cache-managed page.
+	 */
+	DBG_BUGON(page->mapping);
+
+	tocache = true;
+	unlock_page(page);
+	put_page(page);
+out_allocpage:
+	page = __stagingpage_alloc(pagepool, gfp);
+	if (oldpage != cmpxchg(&grp->compressed_pages[nr], oldpage, page)) {
+		list_add(&page->lru, pagepool);
+		cpu_relax();
+		goto repeat;
+	}
+	if (nocache || !tocache)
+		goto out;
+	if (add_to_page_cache_lru(page, mc, index + nr, gfp)) {
+		page->mapping = Z_EROFS_MAPPING_STAGING;
+		goto out;
+	}
+
+	set_page_private(page, (unsigned long)grp);
+	SetPagePrivate(page);
+out:	/* the only exit (for tracing and debugging) */
+	return page;
+}
+
 static inline struct z_erofs_vle_unzip_io *
 prepare_io_handler(struct super_block *sb,
 		   struct z_erofs_vle_unzip_io *io,
@@ -1082,26 +1183,6 @@ prepare_io_handler(struct super_block *sb,
 }
 
 #ifdef EROFS_FS_HAS_MANAGED_CACHE
-/* true - unlocked (noio), false - locked (need submit io) */
-static inline bool recover_managed_page(struct z_erofs_vle_workgroup *grp,
-					struct page *page)
-{
-	wait_on_page_locked(page);
-	if (PagePrivate(page) && PageUptodate(page))
-		return true;
-
-	lock_page(page);
-	if (unlikely(!PagePrivate(page))) {
-		set_page_private(page, (unsigned long)grp);
-		SetPagePrivate(page);
-	}
-	if (unlikely(PageUptodate(page))) {
-		unlock_page(page);
-		return true;
-	}
-	return false;
-}
-
 #define __FSIO_1 1
 #else
 #define __FSIO_1 0
@@ -1117,7 +1198,6 @@ static bool z_erofs_vle_submit_all(struct super_block *sb,
 	const unsigned int clusterpages = erofs_clusterpages(sbi);
 	const gfp_t gfp = GFP_NOFS;
 #ifdef EROFS_FS_HAS_MANAGED_CACHE
-	struct address_space *const mc = MNGD_MAPPING(sbi);
 	struct z_erofs_vle_workgroup *lstgrp_noio = NULL, *lstgrp_io = NULL;
 #endif
 	struct z_erofs_vle_unzip_io *ios[1 + __FSIO_1];
@@ -1156,13 +1236,9 @@ static bool z_erofs_vle_submit_all(struct super_block *sb,
 
 	do {
 		struct z_erofs_vle_workgroup *grp;
-		struct page **compressed_pages, *oldpage, *page;
 		pgoff_t first_index;
-		unsigned int i = 0;
-#ifdef EROFS_FS_HAS_MANAGED_CACHE
-		unsigned int noio = 0;
-		bool cachemngd;
-#endif
+		struct page *page;
+		unsigned int i = 0, bypass = 0;
 		int err;
 
 		/* no possible 'owned_head' equals the following */
@@ -1173,51 +1249,18 @@ static bool z_erofs_vle_submit_all(struct super_block *sb,
 
 		/* close the main owned chain at first */
 		owned_head = cmpxchg(&grp->next, Z_EROFS_VLE_WORKGRP_TAIL,
-			Z_EROFS_VLE_WORKGRP_TAIL_CLOSED);
+				     Z_EROFS_VLE_WORKGRP_TAIL_CLOSED);
 
 		first_index = grp->obj.index;
-		compressed_pages = grp->compressed_pages;
-
 		force_submit |= (first_index != last_index + 1);
-repeat:
-		/* fulfill all compressed pages */
-		oldpage = page = READ_ONCE(compressed_pages[i]);
 
-#ifdef EROFS_FS_HAS_MANAGED_CACHE
-		cachemngd = false;
-
-		if (page == PAGE_UNALLOCATED) {
-			cachemngd = true;
-			goto do_allocpage;
-		} else if (page) {
-			if (page->mapping != mc)
-				BUG_ON(PageUptodate(page));
-			else if (recover_managed_page(grp, page)) {
-				/* page is uptodate, skip io submission */
-				force_submit = true;
-				++noio;
-				goto skippage;
-			}
-		} else {
-do_allocpage:
-#else
-		if (page)
-			BUG_ON(PageUptodate(page));
-		else {
-#endif
-			page = __stagingpage_alloc(pagepool, gfp);
-
-			if (oldpage != cmpxchg(compressed_pages + i,
-				oldpage, page)) {
-				list_add(&page->lru, pagepool);
-				goto repeat;
-#ifdef EROFS_FS_HAS_MANAGED_CACHE
-			} else if (cachemngd && !add_to_page_cache_lru(page,
-				   mc, first_index + i, gfp)) {
-				set_page_private(page, (unsigned long)grp);
-				SetPagePrivate(page);
-#endif
-			}
+repeat:
+		page = pickup_page_for_submission(grp, i, pagepool,
+						  MNGD_MAPPING(sbi), gfp);
+		if (!page) {
+			force_submit = true;
+			++bypass;
+			goto skippage;
 		}
 
 		if (bio && force_submit) {
@@ -1240,14 +1283,12 @@ static bool z_erofs_vle_submit_all(struct super_block *sb,
 
 		force_submit = false;
 		last_index = first_index + i;
-#ifdef EROFS_FS_HAS_MANAGED_CACHE
 skippage:
-#endif
 		if (++i < clusterpages)
 			goto repeat;
 
 #ifdef EROFS_FS_HAS_MANAGED_CACHE
-		if (noio < clusterpages) {
+		if (bypass < clusterpages) {
 			lstgrp_io = grp;
 		} else {
 			z_erofs_vle_owned_workgrp_t iogrp_next =
-- 
2.14.4


* [PATCH 5/7] staging: erofs: refine compressed pages preload flow
From: Gao Xiang @ 2018-12-07 16:19 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel
  Cc: LKML, linux-erofs, Chao Yu, Miao Xie, weidu.du, Gao Xiang

Currently, there are two kinds of compressed pages in erofs:
  1) file pages for in-place decompression and
  2) managed pages for cached decompression.
Both are stored in grp->compressed_pages[].

Managed pages could already exist or could be preloaded in this
round; in detail, a slot can be:
  1) already valid (loaded in some previous round);
  2) PAGE_UNALLOCATED, to be allocated at the time of submission;
  3) just found in the managed cache, holding an extra page reference.
Currently, 1) and 3) can be distinguished by lock_page and checking
PG_private, which is guaranteed by the reclaim path, but it's better
to double-check by using an extra tag.

This patch reworks the preload flow by introducing such a tag with
a tagged pointer; too many #ifdefs are removed as well.
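
In short, the `justfound' bit is folded into the unused low bit of
the page pointer, so no extra field is needed; roughly (condensed
from the code below):

	/* a 1-bit tagged pointer: the tag lives in the low bit */
	typedef tagptr1_t compressed_page_t;

	/* preload side: fold the `justfound' tag into the pointer */
	compressed_page_t t = tag_compressed_page_justfound(page);
	cmpxchg_relaxed(&pages[i], NULL, tagptr_cast_ptr(t));

	/* submission side: unfold both parts again */
	t = tagptr_init(compressed_page_t, READ_ONCE(pages[i]));
	justfound = tagptr_unfold_tags(t);
	page = tagptr_unfold_ptr(t);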

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
---

This patch adds a new compressed_page_t type because the erofs
tagged pointer implementation defines 1-bit tagptr1_t, 2-bit
tagptr2_t, ... structures to wrap up the different tagged pointer
uses. It is better to give them meaningful type names rather than
using tagptr1_t, tagptr2_t, ... directly.

Thanks,
Gao Xiang

 drivers/staging/erofs/unzip_vle.c | 166 ++++++++++++++++++++++++++++----------
 1 file changed, 123 insertions(+), 43 deletions(-)

diff --git a/drivers/staging/erofs/unzip_vle.c b/drivers/staging/erofs/unzip_vle.c
index de41f7b739ac..2b494f29e2c2 100644
--- a/drivers/staging/erofs/unzip_vle.c
+++ b/drivers/staging/erofs/unzip_vle.c
@@ -21,6 +21,21 @@
  */
 #define PAGE_UNALLOCATED     ((void *)0x5F0E4B1D)
 
+/* how to allocate cached pages for a workgroup */
+enum z_erofs_cache_alloctype {
+	DONTALLOC,	/* don't allocate any cached pages */
+	DELAYEDALLOC,	/* delayed allocation (at the time of submitting io) */
+};
+
+/*
+ * tagged pointer with 1-bit tag for all compressed pages
+ * tag 0 - the page is just found with an extra page reference
+ */
+typedef tagptr1_t compressed_page_t;
+
+#define tag_compressed_page_justfound(page) \
+	tagptr_fold(compressed_page_t, page, 1)
+
 static struct workqueue_struct *z_erofs_workqueue __read_mostly;
 static struct kmem_cache *z_erofs_workgroup_cachep __read_mostly;
 
@@ -131,38 +146,58 @@ struct z_erofs_vle_work_builder {
 	{ .work = NULL, .role = Z_EROFS_VLE_WORK_PRIMARY_FOLLOWED }
 
 #ifdef EROFS_FS_HAS_MANAGED_CACHE
-
-static bool grab_managed_cache_pages(struct address_space *mapping,
-				     erofs_blk_t start,
-				     struct page **compressed_pages,
-				     int clusterblks,
-				     bool reserve_allocation)
+static void preload_compressed_pages(struct z_erofs_vle_work_builder *bl,
+				     struct address_space *mc,
+				     pgoff_t index,
+				     unsigned int clusterpages,
+				     enum z_erofs_cache_alloctype type,
+				     struct list_head *pagepool,
+				     gfp_t gfp)
 {
-	bool noio = true;
-	unsigned int i;
+	struct page **const pages = bl->compressed_pages;
+	const unsigned int remaining = bl->compressed_deficit;
+	bool standalone = true;
+	unsigned int i, j = 0;
+
+	if (bl->role < Z_EROFS_VLE_WORK_PRIMARY_FOLLOWED)
+		return;
+
+	gfp = mapping_gfp_constraint(mc, gfp) & ~__GFP_RECLAIM;
 
-	/* TODO: optimize by introducing find_get_pages_range */
-	for (i = 0; i < clusterblks; ++i) {
-		struct page *page, *found;
+	index += clusterpages - remaining;
 
-		if (READ_ONCE(compressed_pages[i]))
+	for (i = 0; i < remaining; ++i) {
+		struct page *page;
+		compressed_page_t t;
+
+		/* the compressed page was loaded before */
+		if (READ_ONCE(pages[i]))
 			continue;
 
-		page = found = find_get_page(mapping, start + i);
-		if (!found) {
-			noio = false;
-			if (!reserve_allocation)
-				continue;
-			page = PAGE_UNALLOCATED;
+		page = find_get_page(mc, index + i);
+
+		if (page) {
+			t = tag_compressed_page_justfound(page);
+		} else if (type == DELAYEDALLOC) {
+			t = tagptr_init(compressed_page_t, PAGE_UNALLOCATED);
+		} else {	/* DONTALLOC */
+			if (standalone)
+				j = i;
+			standalone = false;
+			continue;
 		}
 
-		if (!cmpxchg(compressed_pages + i, NULL, page))
+		if (!cmpxchg_relaxed(&pages[i], NULL, tagptr_cast_ptr(t)))
 			continue;
 
-		if (found)
-			put_page(found);
+		if (page)
+			put_page(page);
 	}
-	return noio;
+	bl->compressed_pages += j;
+	bl->compressed_deficit = remaining - j;
+
+	if (standalone)
+		bl->role = Z_EROFS_VLE_WORK_PRIMARY;
 }
 
 /* called by erofs_shrinker to get rid of all compressed_pages */
@@ -234,6 +269,17 @@ int erofs_try_to_free_cached_page(struct address_space *mapping,
 	}
 	return ret;
 }
+#else
+static void preload_compressed_pages(struct z_erofs_vle_work_builder *bl,
+				     struct address_space *mc,
+				     pgoff_t index,
+				     unsigned int clusterpages,
+				     enum z_erofs_cache_alloctype type,
+				     struct list_head *pagepool,
+				     gfp_t gfp)
+{
+	/* nowhere to load compressed pages from */
+}
 #endif
 
 /* page_type must be Z_EROFS_PAGE_TYPE_EXCLUSIVE */
@@ -608,6 +654,26 @@ struct z_erofs_vle_frontend {
 	.owned_head = Z_EROFS_VLE_WORKGRP_TAIL, \
 	.backmost = true, }
 
+#ifdef EROFS_FS_HAS_MANAGED_CACHE
+static inline bool
+should_alloc_managed_pages(struct z_erofs_vle_frontend *fe, erofs_off_t la)
+{
+	if (fe->backmost)
+		return true;
+
+	if (EROFS_FS_ZIP_CACHE_LVL >= 2)
+		return la < fe->headoffset;
+
+	return false;
+}
+#else
+static inline bool
+should_alloc_managed_pages(struct z_erofs_vle_frontend *fe, erofs_off_t la)
+{
+	return false;
+}
+#endif
+
 static int z_erofs_do_read_page(struct z_erofs_vle_frontend *fe,
 				struct page *page,
 				struct list_head *page_pool)
@@ -622,12 +688,7 @@ static int z_erofs_do_read_page(struct z_erofs_vle_frontend *fe,
 	bool tight = builder_is_followed(builder);
 	struct z_erofs_vle_work *work = builder->work;
 
-#ifdef EROFS_FS_HAS_MANAGED_CACHE
-	struct address_space *const mc = MNGD_MAPPING(sbi);
-	struct z_erofs_vle_workgroup *grp;
-	bool noio_outoforder;
-#endif
-
+	enum z_erofs_cache_alloctype cache_strategy;
 	enum z_erofs_page_type page_type;
 	unsigned int cur, end, spiltted, index;
 	int err = 0;
@@ -667,20 +728,16 @@ static int z_erofs_do_read_page(struct z_erofs_vle_frontend *fe,
 	if (unlikely(err))
 		goto err_out;
 
-#ifdef EROFS_FS_HAS_MANAGED_CACHE
-	grp = fe->builder.grp;
-
-	/* let's do out-of-order decompression for noio */
-	noio_outoforder = grab_managed_cache_pages(mc,
-		erofs_blknr(map->m_pa),
-		grp->compressed_pages, erofs_blknr(map->m_plen),
-		/* compressed page caching selection strategy */
-		fe->backmost | (EROFS_FS_ZIP_CACHE_LVL >= 2 ?
-				map->m_la < fe->headoffset : 0));
-
-	if (noio_outoforder && builder_is_followed(builder))
-		builder->role = Z_EROFS_VLE_WORK_PRIMARY;
-#endif
+	/* preload all compressed pages (maybe downgrade role if necessary) */
+	if (should_alloc_managed_pages(fe, map->m_la))
+		cache_strategy = DELAYEDALLOC;
+	else
+		cache_strategy = DONTALLOC;
+
+	preload_compressed_pages(builder, MNGD_MAPPING(sbi),
+				 map->m_pa / PAGE_SIZE,
+				 map->m_plen / PAGE_SIZE,
+				 cache_strategy, page_pool, GFP_KERNEL);
 
 	tight &= builder_is_followed(builder);
 	work = builder->work;
@@ -1062,6 +1119,9 @@ pickup_page_for_submission(struct z_erofs_vle_workgroup *grp,
 	struct address_space *mapping;
 	struct page *oldpage, *page;
 
+	compressed_page_t t;
+	int justfound;
+
 repeat:
 	page = READ_ONCE(grp->compressed_pages[nr]);
 	oldpage = page;
@@ -1078,6 +1138,11 @@ pickup_page_for_submission(struct z_erofs_vle_workgroup *grp,
 		goto out_allocpage;
 	}
 
+	/* process the target tagged pointer */
+	t = tagptr_init(compressed_page_t, page);
+	justfound = tagptr_unfold_tags(t);
+	page = tagptr_unfold_ptr(t);
+
 	mapping = READ_ONCE(page->mapping);
 
 	/*
@@ -1085,7 +1150,10 @@ pickup_page_for_submission(struct z_erofs_vle_workgroup *grp,
 	 * get such a cached-like page.
 	 */
 	if (nocache) {
-		/* should be locked, not uptodate, and not truncated */
+		/* if managed cache is disabled, `justfound' is impossible */
+		DBG_BUGON(justfound);
+
+		/* and it should be locked, not uptodate, and not truncated */
 		DBG_BUGON(!PageLocked(page));
 		DBG_BUGON(PageUptodate(page));
 		DBG_BUGON(!mapping);
@@ -1102,11 +1170,22 @@ pickup_page_for_submission(struct z_erofs_vle_workgroup *grp,
 
 	lock_page(page);
 
+	/* only true if page reclaim goes wrong, should never happen */
+	DBG_BUGON(justfound && PagePrivate(page));
+
 	/* the page is still in manage cache */
 	if (page->mapping == mc) {
 		WRITE_ONCE(grp->compressed_pages[nr], page);
 
 		if (!PagePrivate(page)) {
+			/*
+			 * under the current restriction, a page
+			 * already in compressed_pages[] can never
+			 * be !PagePrivate(page) here.
+			 */
+			DBG_BUGON(!justfound);
+
+			justfound = 0;
 			set_page_private(page, (unsigned long)grp);
 			SetPagePrivate(page);
 		}
@@ -1124,6 +1203,7 @@ pickup_page_for_submission(struct z_erofs_vle_workgroup *grp,
 	 * reuse this one, let's allocate a new cache-managed page.
 	 */
 	DBG_BUGON(page->mapping);
+	DBG_BUGON(!justfound);
 
 	tocache = true;
 	unlock_page(page);
-- 
2.14.4


* [PATCH 6/7] staging: erofs: redefine where `owned_workgrp_t' points
From: Gao Xiang @ 2018-12-07 16:19 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel
  Cc: LKML, linux-erofs, Chao Yu, Miao Xie, weidu.du, Gao Xiang

By design, workgroups are queued as linked lists.

Previously, `owned_workgrp_t' pointed to the next
`z_erofs_vle_workgroup', which isn't flexible enough to simplify
`z_erofs_vle_submit_all'.

Let's fix it by pointing to the next `owned_workgrp_t' field instead
and using container_of to get the corresponding
`z_erofs_vle_workgroup'.
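
Condensed from the code below, walking the chain then becomes:

	/* `next' points at the next workgroup's `next' field itself,
	 * so a queue head can simply be initialized to `&grp->next' */
	z_erofs_vle_owned_workgrp_t owned = READ_ONCE(grp->next);

	/* and the owning workgroup is recovered with container_of */
	grp = container_of(owned, struct z_erofs_vle_workgroup, next);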

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
---
 drivers/staging/erofs/unzip_vle.c | 12 +++++++-----
 drivers/staging/erofs/unzip_vle.h |  4 ++--
 2 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/staging/erofs/unzip_vle.c b/drivers/staging/erofs/unzip_vle.c
index 2b494f29e2c2..a8cf324f0b9c 100644
--- a/drivers/staging/erofs/unzip_vle.c
+++ b/drivers/staging/erofs/unzip_vle.c
@@ -333,7 +333,7 @@ static inline bool try_to_claim_workgroup(
 			    *owned_head) != Z_EROFS_VLE_WORKGRP_NIL)
 			goto retry;
 
-		*owned_head = grp;
+		*owned_head = &grp->next;
 		*hosted = true;
 	} else if (grp->next == Z_EROFS_VLE_WORKGRP_TAIL) {
 		/*
@@ -488,7 +488,8 @@ z_erofs_vle_work_register(const struct z_erofs_vle_work_finder *f,
 		}
 	}
 
-	*f->owned_head = *f->grp_ret = grp;
+	*f->owned_head = &grp->next;
+	*f->grp_ret = grp;
 	return work;
 }
 
@@ -1084,7 +1085,7 @@ static void z_erofs_vle_unzip_all(struct super_block *sb,
 		/* no possible that 'owned' equals NULL */
 		DBG_BUGON(owned == Z_EROFS_VLE_WORKGRP_NIL);
 
-		grp = owned;
+		grp = container_of(owned, struct z_erofs_vle_workgroup, next);
 		owned = READ_ONCE(grp->next);
 
 		z_erofs_vle_unzip(sb, grp, page_pool);
@@ -1325,7 +1326,8 @@ static bool z_erofs_vle_submit_all(struct super_block *sb,
 		DBG_BUGON(owned_head == Z_EROFS_VLE_WORKGRP_TAIL_CLOSED);
 		DBG_BUGON(owned_head == Z_EROFS_VLE_WORKGRP_NIL);
 
-		grp = owned_head;
+		grp = container_of(owned_head,
+				   struct z_erofs_vle_workgroup, next);
 
 		/* close the main owned chain at first */
 		owned_head = cmpxchg(&grp->next, Z_EROFS_VLE_WORKGRP_TAIL,
@@ -1382,7 +1384,7 @@ static bool z_erofs_vle_submit_all(struct super_block *sb,
 				WRITE_ONCE(lstgrp_io->next, iogrp_next);
 
 			if (!lstgrp_noio)
-				ios[0]->head = grp;
+				ios[0]->head = &grp->next;
 			else
 				WRITE_ONCE(lstgrp_noio->next, grp);
 
diff --git a/drivers/staging/erofs/unzip_vle.h b/drivers/staging/erofs/unzip_vle.h
index 3316bc36965d..5a4e1b62c0d1 100644
--- a/drivers/staging/erofs/unzip_vle.h
+++ b/drivers/staging/erofs/unzip_vle.h
@@ -67,13 +67,13 @@ struct z_erofs_vle_work {
 #define Z_EROFS_VLE_WORKGRP_FMT_LZ4          1
 #define Z_EROFS_VLE_WORKGRP_FMT_MASK         1
 
-typedef struct z_erofs_vle_workgroup *z_erofs_vle_owned_workgrp_t;
+typedef void *z_erofs_vle_owned_workgrp_t;
 
 struct z_erofs_vle_workgroup {
 	struct erofs_workgroup obj;
 	struct z_erofs_vle_work work;
 
-	/* next owned workgroup */
+	/* point to next owned_workgrp_t */
 	z_erofs_vle_owned_workgrp_t next;
 
 	/* compressed pages (including multi-usage pages) */
-- 
2.14.4


* [PATCH 7/7] staging: erofs: simplify `z_erofs_vle_submit_all'
From: Gao Xiang @ 2018-12-07 16:19 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel
  Cc: LKML, linux-erofs, Chao Yu, Miao Xie, weidu.du, Gao Xiang

Previously, there were too many hacks, such as `__FSIO_1',
`lstgrp_noio' and `lstgrp_io', in `z_erofs_vle_submit_all'.

Revisit the whole process by properly introducing a jobqueue to
represent each type of queued workgroup, and hide all of the
complexity behind separate, independent functions.

After this patch, two independent jobqueues exist if the managed
cache is enabled, or one jobqueue if it is disabled.
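
For instance, in the synchronous (force_fg) case the consumer side
boils down to the following (condensed from z_erofs_submit_and_unzip()
below):

	struct z_erofs_vle_unzip_io io[NR_JOBQUEUES];

	/* bypass jobqueue: all pages uptodate, no io, decompress now */
	z_erofs_vle_unzip_all(sb, &io[JQ_BYPASS], pagepool);

	/* submit jobqueue: wait until all bios complete, then unzip */
	wait_event(io[JQ_SUBMIT].u.wait,
		   !atomic_read(&io[JQ_SUBMIT].pending_bios));
	z_erofs_vle_unzip_all(sb, &io[JQ_SUBMIT], pagepool);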

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
---
 drivers/staging/erofs/unzip_vle.c | 195 ++++++++++++++++++++++----------------
 1 file changed, 113 insertions(+), 82 deletions(-)

diff --git a/drivers/staging/erofs/unzip_vle.c b/drivers/staging/erofs/unzip_vle.c
index a8cf324f0b9c..500046f271cb 100644
--- a/drivers/staging/erofs/unzip_vle.c
+++ b/drivers/staging/erofs/unzip_vle.c
@@ -1229,33 +1229,28 @@ pickup_page_for_submission(struct z_erofs_vle_workgroup *grp,
 	return page;
 }
 
-static inline struct z_erofs_vle_unzip_io *
-prepare_io_handler(struct super_block *sb,
-		   struct z_erofs_vle_unzip_io *io,
-		   bool background)
+static struct z_erofs_vle_unzip_io *
+jobqueue_init(struct super_block *sb,
+	      struct z_erofs_vle_unzip_io *io,
+	      bool foreground)
 {
 	struct z_erofs_vle_unzip_io_sb *iosb;
 
-	if (!background) {
+	if (foreground) {
 		/* waitqueue available for foreground io */
-		BUG_ON(!io);
+		DBG_BUGON(!io);
 
 		init_waitqueue_head(&io->u.wait);
 		atomic_set(&io->pending_bios, 0);
 		goto out;
 	}
 
-	if (io)
-		BUG();
-	else {
-		/* allocate extra io descriptor for background io */
-		iosb = kvzalloc(sizeof(struct z_erofs_vle_unzip_io_sb),
+	iosb = kvzalloc(sizeof(struct z_erofs_vle_unzip_io_sb),
 			GFP_KERNEL | __GFP_NOFAIL);
-		BUG_ON(!iosb);
-
-		io = &iosb->io;
-	}
+	DBG_BUGON(!iosb);
 
+	/* initialize fields in the allocated descriptor */
+	io = &iosb->io;
 	iosb->sb = sb;
 	INIT_WORK(&io->u.work, z_erofs_vle_unzip_wq);
 out:
@@ -1263,27 +1258,105 @@ prepare_io_handler(struct super_block *sb,
 	return io;
 }
 
+/* define workgroup jobqueue types */
+enum {
 #ifdef EROFS_FS_HAS_MANAGED_CACHE
-#define __FSIO_1 1
+	JQ_BYPASS,
+#endif
+	JQ_SUBMIT,
+	NR_JOBQUEUES,
+};
+
+static void *jobqueueset_init(struct super_block *sb,
+			      z_erofs_vle_owned_workgrp_t qtail[],
+			      struct z_erofs_vle_unzip_io *q[],
+			      struct z_erofs_vle_unzip_io *fgq,
+			      bool forcefg)
+{
+#ifdef EROFS_FS_HAS_MANAGED_CACHE
+	/*
+	 * if managed cache is enabled, bypass jobqueue is needed,
+	 * no need to read from device for all workgroups in this queue.
+	 */
+	q[JQ_BYPASS] = jobqueue_init(sb, fgq + JQ_BYPASS, true);
+	qtail[JQ_BYPASS] = &q[JQ_BYPASS]->head;
+#endif
+
+	q[JQ_SUBMIT] = jobqueue_init(sb, fgq + JQ_SUBMIT, forcefg);
+	qtail[JQ_SUBMIT] = &q[JQ_SUBMIT]->head;
+
+	return tagptr_cast_ptr(tagptr_fold(tagptr1_t, q[JQ_SUBMIT], !forcefg));
+}
+
+#ifdef EROFS_FS_HAS_MANAGED_CACHE
+static void move_to_bypass_jobqueue(struct z_erofs_vle_workgroup *grp,
+				    z_erofs_vle_owned_workgrp_t qtail[],
+				    z_erofs_vle_owned_workgrp_t owned_head)
+{
+	z_erofs_vle_owned_workgrp_t *const submit_qtail = qtail[JQ_SUBMIT];
+	z_erofs_vle_owned_workgrp_t *const bypass_qtail = qtail[JQ_BYPASS];
+
+	DBG_BUGON(owned_head == Z_EROFS_VLE_WORKGRP_TAIL_CLOSED);
+	if (owned_head == Z_EROFS_VLE_WORKGRP_TAIL)
+		owned_head = Z_EROFS_VLE_WORKGRP_TAIL_CLOSED;
+
+	WRITE_ONCE(grp->next, Z_EROFS_VLE_WORKGRP_TAIL_CLOSED);
+
+	WRITE_ONCE(*submit_qtail, owned_head);
+	WRITE_ONCE(*bypass_qtail, &grp->next);
+
+	qtail[JQ_BYPASS] = &grp->next;
+}
+
+static bool postsubmit_is_all_bypassed(struct z_erofs_vle_unzip_io *q[],
+				       unsigned int nr_bios,
+				       bool force_fg)
+{
+	/*
+	 * although background is preferred, no one is pending for submission;
+	 * don't issue a workqueue for decompression but drop it directly instead.
+	 */
+	if (force_fg || nr_bios)
+		return false;
+
+	kvfree(container_of(q[JQ_SUBMIT],
+			    struct z_erofs_vle_unzip_io_sb,
+			    io));
+	return true;
+}
 #else
-#define __FSIO_1 0
+static void move_to_bypass_jobqueue(struct z_erofs_vle_workgroup *grp,
+				    z_erofs_vle_owned_workgrp_t qtail[],
+				    z_erofs_vle_owned_workgrp_t owned_head)
+{
+	/* impossible to bypass submission for managed cache disabled */
+	DBG_BUGON(1);
+}
+
+static bool postsubmit_is_all_bypassed(struct z_erofs_vle_unzip_io *q[],
+				       unsigned int nr_bios,
+				       bool force_fg)
+{
+	/* bios should be >0 if managed cache is disabled */
+	DBG_BUGON(!nr_bios);
+	return false;
+}
 #endif
 
 static bool z_erofs_vle_submit_all(struct super_block *sb,
 				   z_erofs_vle_owned_workgrp_t owned_head,
 				   struct list_head *pagepool,
-				   struct z_erofs_vle_unzip_io *fg_io,
+				   struct z_erofs_vle_unzip_io *fgq,
 				   bool force_fg)
 {
 	struct erofs_sb_info *const sbi = EROFS_SB(sb);
 	const unsigned int clusterpages = erofs_clusterpages(sbi);
 	const gfp_t gfp = GFP_NOFS;
-#ifdef EROFS_FS_HAS_MANAGED_CACHE
-	struct z_erofs_vle_workgroup *lstgrp_noio = NULL, *lstgrp_io = NULL;
-#endif
-	struct z_erofs_vle_unzip_io *ios[1 + __FSIO_1];
+
+	z_erofs_vle_owned_workgrp_t qtail[NR_JOBQUEUES];
+	struct z_erofs_vle_unzip_io *q[NR_JOBQUEUES];
 	struct bio *bio;
-	tagptr1_t bi_private;
+	void *bi_private;
 	/* since bio will be NULL, no need to initialize last_index */
 	pgoff_t uninitialized_var(last_index);
 	bool force_submit = false;
@@ -1292,28 +1365,13 @@ static bool z_erofs_vle_submit_all(struct super_block *sb,
 	if (unlikely(owned_head == Z_EROFS_VLE_WORKGRP_TAIL))
 		return false;
 
-	/*
-	 * force_fg == 1, (io, fg_io[0]) no io, (io, fg_io[1]) need submit io
-	 * force_fg == 0, (io, fg_io[0]) no io; (io[1], bg_io) need submit io
-	 */
-#ifdef EROFS_FS_HAS_MANAGED_CACHE
-	ios[0] = prepare_io_handler(sb, fg_io + 0, false);
-#endif
-
-	if (force_fg) {
-		ios[__FSIO_1] = prepare_io_handler(sb, fg_io + __FSIO_1, false);
-		bi_private = tagptr_fold(tagptr1_t, ios[__FSIO_1], 0);
-	} else {
-		ios[__FSIO_1] = prepare_io_handler(sb, NULL, true);
-		bi_private = tagptr_fold(tagptr1_t, ios[__FSIO_1], 1);
-	}
-
-	nr_bios = 0;
 	force_submit = false;
 	bio = NULL;
+	nr_bios = 0;
+	bi_private = jobqueueset_init(sb, qtail, q, fgq, force_fg);
 
 	/* by default, all need io submission */
-	ios[__FSIO_1]->head = owned_head;
+	q[JQ_SUBMIT]->head = owned_head;
 
 	do {
 		struct z_erofs_vle_workgroup *grp;
@@ -1353,8 +1411,9 @@ static bool z_erofs_vle_submit_all(struct super_block *sb,
 
 		if (!bio) {
 			bio = erofs_grab_bio(sb, first_index + i,
-				BIO_MAX_PAGES, z_erofs_vle_read_endio, true);
-			bio->bi_private = tagptr_cast_ptr(bi_private);
+					     BIO_MAX_PAGES,
+					     z_erofs_vle_read_endio, true);
+			bio->bi_private = bi_private;
 
 			++nr_bios;
 		}
@@ -1369,47 +1428,19 @@ static bool z_erofs_vle_submit_all(struct super_block *sb,
 		if (++i < clusterpages)
 			goto repeat;
 
-#ifdef EROFS_FS_HAS_MANAGED_CACHE
-		if (bypass < clusterpages) {
-			lstgrp_io = grp;
-		} else {
-			z_erofs_vle_owned_workgrp_t iogrp_next =
-				owned_head == Z_EROFS_VLE_WORKGRP_TAIL ?
-				Z_EROFS_VLE_WORKGRP_TAIL_CLOSED :
-				owned_head;
-
-			if (!lstgrp_io)
-				ios[1]->head = iogrp_next;
-			else
-				WRITE_ONCE(lstgrp_io->next, iogrp_next);
-
-			if (!lstgrp_noio)
-				ios[0]->head = &grp->next;
-			else
-				WRITE_ONCE(lstgrp_noio->next, grp);
-
-			lstgrp_noio = grp;
-		}
-#endif
+		if (bypass < clusterpages)
+			qtail[JQ_SUBMIT] = &grp->next;
+		else
+			move_to_bypass_jobqueue(grp, qtail, owned_head);
 	} while (owned_head != Z_EROFS_VLE_WORKGRP_TAIL);
 
 	if (bio)
 		__submit_bio(bio, REQ_OP_READ, 0);
 
-#ifndef EROFS_FS_HAS_MANAGED_CACHE
-	BUG_ON(!nr_bios);
-#else
-	if (lstgrp_noio)
-		WRITE_ONCE(lstgrp_noio->next, Z_EROFS_VLE_WORKGRP_TAIL_CLOSED);
-
-	if (!force_fg && !nr_bios) {
-		kvfree(container_of(ios[1],
-			struct z_erofs_vle_unzip_io_sb, io));
+	if (postsubmit_is_all_bypassed(q, nr_bios, force_fg))
 		return true;
-	}
-#endif
 
-	z_erofs_vle_unzip_kickoff(tagptr_cast_ptr(bi_private), nr_bios);
+	z_erofs_vle_unzip_kickoff(bi_private, nr_bios);
 	return true;
 }
 
@@ -1418,23 +1449,23 @@ static void z_erofs_submit_and_unzip(struct z_erofs_vle_frontend *f,
 				     bool force_fg)
 {
 	struct super_block *sb = f->inode->i_sb;
-	struct z_erofs_vle_unzip_io io[1 + __FSIO_1];
+	struct z_erofs_vle_unzip_io io[NR_JOBQUEUES];
 
 	if (!z_erofs_vle_submit_all(sb, f->owned_head, pagepool, io, force_fg))
 		return;
 
 #ifdef EROFS_FS_HAS_MANAGED_CACHE
-	z_erofs_vle_unzip_all(sb, &io[0], pagepool);
+	z_erofs_vle_unzip_all(sb, &io[JQ_BYPASS], pagepool);
 #endif
 	if (!force_fg)
 		return;
 
 	/* wait until all bios are completed */
-	wait_event(io[__FSIO_1].u.wait,
-		!atomic_read(&io[__FSIO_1].pending_bios));
+	wait_event(io[JQ_SUBMIT].u.wait,
+		   !atomic_read(&io[JQ_SUBMIT].pending_bios));
 
 	/* let's synchronous decompression */
-	z_erofs_vle_unzip_all(sb, &io[__FSIO_1], pagepool);
+	z_erofs_vle_unzip_all(sb, &io[JQ_SUBMIT], pagepool);
 }
 
 static int z_erofs_vle_normalaccess_readpage(struct file *file,
-- 
2.14.4

