[PATCH 0/5] cachefiles: some bugfixes for clean object/send req/poll

linux-erofs.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed

* [PATCH 0/5] cachefiles: some bugfixes for clean object/send req/poll
@ 2024-04-24  3:34 libaokun
  2024-04-24  3:34 ` [PATCH 1/5] cachefiles: stop sending new request when dropping object libaokun
                   ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: libaokun @ 2024-04-24  3:34 UTC (permalink / raw)
  To: netfs
  Cc: libaokun, jlayton, linux-kernel, dhowells, linux-fsdevel,
	linux-cachefs, linux-erofs

From: Baokun Li <libaokun1@huawei.com>

Hello everyone!

Recently we found some bugs while doing tests on cachefiles ondemand mode,
and this patchset is a fix for some of those issues. The following is a
brief overview of the patches, see the patches for more details.

Patch 1-3: After an object has been cleaned up, make sure it has no
outstanding requests and that the corresponding ondemand_object_worker
has exited, otherwise it may use-after-free.

Patch 4: Cyclic allocation of msg_id to avoid msg_id reuse misleading
the daemon to cause hung.

Patch 5: Hold xas_lock during polling to avoid dereferencing reqs causing
use-after-free.

Comments and questions are, as always, welcome.

Thanks,
Baokun

Baokun Li (3):
  cachefiles: stop sending new request when dropping object
  cachefiles: flush all requests for the object that is being dropped
  cachefiles: cyclic allocation of msg_id to avoid reuse

Hou Tao (1):
  cachefiles: flush ondemand_object_worker during clean object

Jingbo Xu (1):
  cachefiles: add missing lock protection when polling

 fs/cachefiles/daemon.c   |   4 +-
 fs/cachefiles/internal.h |   3 +
 fs/cachefiles/ondemand.c | 120 ++++++++++++++++++++++++++-------------
 3 files changed, 86 insertions(+), 41 deletions(-)

-- 
2.39.2

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/5] cachefiles: stop sending new request when dropping object
  2024-04-24  3:34 [PATCH 0/5] cachefiles: some bugfixes for clean object/send req/poll libaokun
@ 2024-04-24  3:34 ` libaokun
  2024-04-24  3:34 ` [PATCH 2/5] cachefiles: flush all requests for the object that is being dropped libaokun
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 11+ messages in thread
From: libaokun @ 2024-04-24  3:34 UTC (permalink / raw)
  To: netfs
  Cc: libaokun, jlayton, linux-kernel, dhowells, linux-fsdevel,
	linux-cachefs, linux-erofs

From: Baokun Li <libaokun1@huawei.com>

Added CACHEFILES_ONDEMAND_OBJSTATE_DROPPING indicates that the cachefiles
object is being dropped, and is set after the close request for the dropped
object completes, and no new requests are allowed to be sent after this
state.

Signed-off-by: Baokun Li <libaokun1@huawei.com>
---
 fs/cachefiles/internal.h |  2 ++
 fs/cachefiles/ondemand.c | 10 ++++++++--
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h
index d33169f0018b..8ecd296cc1c4 100644
--- a/fs/cachefiles/internal.h
+++ b/fs/cachefiles/internal.h
@@ -48,6 +48,7 @@ enum cachefiles_object_state {
 	CACHEFILES_ONDEMAND_OBJSTATE_CLOSE, /* Anonymous fd closed by daemon or initial state */
 	CACHEFILES_ONDEMAND_OBJSTATE_OPEN, /* Anonymous fd associated with object is available */
 	CACHEFILES_ONDEMAND_OBJSTATE_REOPENING, /* Object that was closed and is being reopened. */
+	CACHEFILES_ONDEMAND_OBJSTATE_DROPPING, /* Object is being dropped. */
 };
 
 struct cachefiles_ondemand_info {
@@ -332,6 +333,7 @@ cachefiles_ondemand_set_object_##_state(struct cachefiles_object *object) \
 CACHEFILES_OBJECT_STATE_FUNCS(open, OPEN);
 CACHEFILES_OBJECT_STATE_FUNCS(close, CLOSE);
 CACHEFILES_OBJECT_STATE_FUNCS(reopening, REOPENING);
+CACHEFILES_OBJECT_STATE_FUNCS(dropping, DROPPING);
 
 static inline bool cachefiles_ondemand_is_reopening_read(struct cachefiles_req *req)
 {
diff --git a/fs/cachefiles/ondemand.c b/fs/cachefiles/ondemand.c
index 4ba42f1fa3b4..73da4d4eaa9b 100644
--- a/fs/cachefiles/ondemand.c
+++ b/fs/cachefiles/ondemand.c
@@ -422,7 +422,8 @@ static int cachefiles_ondemand_send_req(struct cachefiles_object *object,
 		 */
 		xas_lock(&xas);
 
-		if (test_bit(CACHEFILES_DEAD, &cache->flags)) {
+		if (test_bit(CACHEFILES_DEAD, &cache->flags) ||
+		    cachefiles_ondemand_object_is_dropping(object)) {
 			xas_unlock(&xas);
 			ret = -EIO;
 			goto out;
@@ -463,7 +464,8 @@ static int cachefiles_ondemand_send_req(struct cachefiles_object *object,
 	 * If error occurs after creating the anonymous fd,
 	 * cachefiles_ondemand_fd_release() will set object to close.
 	 */
-	if (opcode == CACHEFILES_OP_OPEN)
+	if (opcode == CACHEFILES_OP_OPEN &&
+	    !cachefiles_ondemand_object_is_dropping(object))
 		cachefiles_ondemand_set_object_close(object);
 	kfree(req);
 	return ret;
@@ -562,8 +564,12 @@ int cachefiles_ondemand_init_object(struct cachefiles_object *object)
 
 void cachefiles_ondemand_clean_object(struct cachefiles_object *object)
 {
+	if (!object->ondemand)
+		return;
+
 	cachefiles_ondemand_send_req(object, CACHEFILES_OP_CLOSE, 0,
 			cachefiles_ondemand_init_close_req, NULL);
+	cachefiles_ondemand_set_object_dropping(object);
 }
 
 int cachefiles_ondemand_init_obj_info(struct cachefiles_object *object,
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/5] cachefiles: flush all requests for the object that is being dropped
  2024-04-24  3:34 [PATCH 0/5] cachefiles: some bugfixes for clean object/send req/poll libaokun
  2024-04-24  3:34 ` [PATCH 1/5] cachefiles: stop sending new request when dropping object libaokun
@ 2024-04-24  3:34 ` libaokun
  2024-04-24  3:34 ` [PATCH 3/5] cachefiles: flush ondemand_object_worker during clean object libaokun
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 11+ messages in thread
From: libaokun @ 2024-04-24  3:34 UTC (permalink / raw)
  To: netfs
  Cc: libaokun, jlayton, linux-kernel, dhowells, linux-fsdevel,
	linux-cachefs, linux-erofs

From: Baokun Li <libaokun1@huawei.com>

Because after an object is dropped, requests for that object are useless,
flush them to avoid causing other problems.

Signed-off-by: Baokun Li <libaokun1@huawei.com>
---
 fs/cachefiles/ondemand.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/fs/cachefiles/ondemand.c b/fs/cachefiles/ondemand.c
index 73da4d4eaa9b..d24bff43499b 100644
--- a/fs/cachefiles/ondemand.c
+++ b/fs/cachefiles/ondemand.c
@@ -564,12 +564,31 @@ int cachefiles_ondemand_init_object(struct cachefiles_object *object)
 
 void cachefiles_ondemand_clean_object(struct cachefiles_object *object)
 {
+	unsigned long index;
+	struct cachefiles_req *req;
+	struct cachefiles_cache *cache;
+
 	if (!object->ondemand)
 		return;
 
 	cachefiles_ondemand_send_req(object, CACHEFILES_OP_CLOSE, 0,
 			cachefiles_ondemand_init_close_req, NULL);
+
+	if (!object->ondemand->ondemand_id)
+		return;
+
+	/* Flush all requests for the object that is being dropped. */
+	cache = object->volume->cache;
+	xa_lock(&cache->reqs);
 	cachefiles_ondemand_set_object_dropping(object);
+	xa_for_each(&cache->reqs, index, req) {
+		if (req->object == object) {
+			req->error = -EIO;
+			complete(&req->done);
+			__xa_erase(&cache->reqs, index);
+		}
+	}
+	xa_unlock(&cache->reqs);
 }
 
 int cachefiles_ondemand_init_obj_info(struct cachefiles_object *object,
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 3/5] cachefiles: flush ondemand_object_worker during clean object
  2024-04-24  3:34 [PATCH 0/5] cachefiles: some bugfixes for clean object/send req/poll libaokun
  2024-04-24  3:34 ` [PATCH 1/5] cachefiles: stop sending new request when dropping object libaokun
  2024-04-24  3:34 ` [PATCH 2/5] cachefiles: flush all requests for the object that is being dropped libaokun
@ 2024-04-24  3:34 ` libaokun
  2024-04-25  5:41   ` Jia Zhu via Linux-erofs
  2024-04-24  3:34 ` [PATCH 4/5] cachefiles: cyclic allocation of msg_id to avoid reuse libaokun
  2024-04-24  3:34 ` [PATCH 5/5] cachefiles: add missing lock protection when polling libaokun
  4 siblings, 1 reply; 11+ messages in thread
From: libaokun @ 2024-04-24  3:34 UTC (permalink / raw)
  To: netfs
  Cc: libaokun, jlayton, linux-kernel, dhowells, linux-fsdevel,
	linux-cachefs, linux-erofs

From: Hou Tao <houtao1@huawei.com>

When queuing ondemand_object_worker() to re-open the object,
cachefiles_object is not pinned. The cachefiles_object may be freed when
the pending read request is completed intentionally and the related
erofs is umounted. If ondemand_object_worker() runs after the object is
freed, it will incur use-after-free problem as shown below.

process A  processs B  process C  process D

cachefiles_ondemand_send_req()
// send a read req X
// wait for its completion

           // close ondemand fd
           cachefiles_ondemand_fd_release()
           // set object as CLOSE

                       cachefiles_ondemand_daemon_read()
                       // set object as REOPENING
                       queue_work(fscache_wq, &info->ondemand_work)

                                // close /dev/cachefiles
                                cachefiles_daemon_release
                                cachefiles_flush_reqs
                                complete(&req->done)

// read req X is completed
// umount the erofs fs
cachefiles_put_object()
// object will be freed
cachefiles_ondemand_deinit_obj_info()
kmem_cache_free(object)
                       // both info and object are freed
                       ondemand_object_worker()

When dropping an object, it is no longer necessary to reopen the object,
so use cancel_work_sync() to cancel or wait for ondemand_object_worker()
to complete.

Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Baokun Li <libaokun1@huawei.com>
---
 fs/cachefiles/ondemand.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/cachefiles/ondemand.c b/fs/cachefiles/ondemand.c
index d24bff43499b..f6440b3e7368 100644
--- a/fs/cachefiles/ondemand.c
+++ b/fs/cachefiles/ondemand.c
@@ -589,6 +589,9 @@ void cachefiles_ondemand_clean_object(struct cachefiles_object *object)
 		}
 	}
 	xa_unlock(&cache->reqs);
+
+	/* Wait for ondemand_object_worker() to finish to avoid UAF. */
+	cancel_work_sync(&object->ondemand->ondemand_work);
 }
 
 int cachefiles_ondemand_init_obj_info(struct cachefiles_object *object,
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 4/5] cachefiles: cyclic allocation of msg_id to avoid reuse
  2024-04-24  3:34 [PATCH 0/5] cachefiles: some bugfixes for clean object/send req/poll libaokun
                   ` (2 preceding siblings ...)
  2024-04-24  3:34 ` [PATCH 3/5] cachefiles: flush ondemand_object_worker during clean object libaokun
@ 2024-04-24  3:34 ` libaokun
  2024-04-24  3:34 ` [PATCH 5/5] cachefiles: add missing lock protection when polling libaokun
  4 siblings, 0 replies; 11+ messages in thread
From: libaokun @ 2024-04-24  3:34 UTC (permalink / raw)
  To: netfs
  Cc: libaokun, jlayton, linux-kernel, dhowells, linux-fsdevel,
	linux-cachefs, linux-erofs

From: Baokun Li <libaokun1@huawei.com>

Reusing the msg_id after a maliciously completed reopen request may cause
a read request to remain unprocessed and result in a hung, as shown below:

       t1       |      t2       |      t3
-------------------------------------------------
cachefiles_ondemand_select_req
 cachefiles_ondemand_object_is_close(A)
 cachefiles_ondemand_set_object_reopening(A)
 queue_work(fscache_object_wq, &info->work)
                ondemand_object_worker
                 cachefiles_ondemand_init_object(A)
                  cachefiles_ondemand_send_req(OPEN)
                    // get msg_id 6
                    wait_for_completion(&req_A->done)
cachefiles_ondemand_daemon_read
 // read msg_id 6 req_A
 cachefiles_ondemand_get_fd
 copy_to_user
                                // Malicious completion msg_id 6
                                copen 6,-1
                // reopen fails, want daemon to close fd,
                // then set object to close, retrigger reopen
                                cachefiles_ondemand_init_object(B)
                                 cachefiles_ondemand_send_req(OPEN)
                                 // new open req_B reuse msg_id 6
// daemon successfully copen msg_id 6, so it won't close the fd.
// object is always reopening, so read requests are not processed
// resulting in a hung

Therefore allocate the msg_id cyclically to avoid reusing the msg_id for
a very short duration of time causing the above problem.

Fixes: c8383054506c ("cachefiles: notify the user daemon when looking up cookie")
Signed-off-by: Baokun Li <libaokun1@huawei.com>
---
 fs/cachefiles/internal.h |  1 +
 fs/cachefiles/ondemand.c | 92 +++++++++++++++++++++++-----------------
 2 files changed, 54 insertions(+), 39 deletions(-)

diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h
index 8ecd296cc1c4..9200c00f3e98 100644
--- a/fs/cachefiles/internal.h
+++ b/fs/cachefiles/internal.h
@@ -128,6 +128,7 @@ struct cachefiles_cache {
 	unsigned long			req_id_next;
 	struct xarray			ondemand_ids;	/* xarray for ondemand_id allocation */
 	u32				ondemand_id_next;
+	u32				msg_id_next;
 };
 
 static inline bool cachefiles_in_ondemand_mode(struct cachefiles_cache *cache)
diff --git a/fs/cachefiles/ondemand.c b/fs/cachefiles/ondemand.c
index f6440b3e7368..6171e1a8cfa1 100644
--- a/fs/cachefiles/ondemand.c
+++ b/fs/cachefiles/ondemand.c
@@ -404,51 +404,65 @@ static int cachefiles_ondemand_send_req(struct cachefiles_object *object,
 	if (ret)
 		goto out;
 
-	do {
-		/*
-		 * Stop enqueuing the request when daemon is dying. The
-		 * following two operations need to be atomic as a whole.
-		 *   1) check cache state, and
-		 *   2) enqueue request if cache is alive.
-		 * Otherwise the request may be enqueued after xarray has been
-		 * flushed, leaving the orphan request never being completed.
-		 *
-		 * CPU 1			CPU 2
-		 * =====			=====
-		 *				test CACHEFILES_DEAD bit
-		 * set CACHEFILES_DEAD bit
-		 * flush requests in the xarray
-		 *				enqueue the request
-		 */
-		xas_lock(&xas);
-
-		if (test_bit(CACHEFILES_DEAD, &cache->flags) ||
-		    cachefiles_ondemand_object_is_dropping(object)) {
-			xas_unlock(&xas);
-			ret = -EIO;
-			goto out;
-		}
+retry:
+	/*
+	 * Stop enqueuing the request when daemon is dying. The
+	 * following two operations need to be atomic as a whole.
+	 *   1) check cache state, and
+	 *   2) enqueue request if cache is alive.
+	 * Otherwise the request may be enqueued after xarray has been
+	 * flushed, leaving the orphan request never being completed.
+	 *
+	 * CPU 1			CPU 2
+	 * =====			=====
+	 *				test CACHEFILES_DEAD bit
+	 * set CACHEFILES_DEAD bit
+	 * flush requests in the xarray
+	 *				enqueue the request
+	 */
+	xas_lock(&xas);
 
-		/* coupled with the barrier in cachefiles_flush_reqs() */
-		smp_mb();
+	if (test_bit(CACHEFILES_DEAD, &cache->flags) ||
+	    cachefiles_ondemand_object_is_dropping(object)) {
+		xas_unlock(&xas);
+		ret = -EIO;
+		goto out;
+	}
 
-		if (opcode == CACHEFILES_OP_CLOSE &&
-			!cachefiles_ondemand_object_is_open(object)) {
-			WARN_ON_ONCE(object->ondemand->ondemand_id == 0);
-			xas_unlock(&xas);
-			ret = -EIO;
-			goto out;
-		}
+	/* coupled with the barrier in cachefiles_flush_reqs() */
+	smp_mb();
+
+	if (opcode == CACHEFILES_OP_CLOSE &&
+		!cachefiles_ondemand_object_is_open(object)) {
+		WARN_ON_ONCE(object->ondemand->ondemand_id == 0);
+		xas_unlock(&xas);
+		ret = -EIO;
+		goto out;
+	}
 
+	/*
+	 * Cyclically find a free xas to avoid msg_id reuse that would
+	 * cause the daemon to successfully copen a stale msg_id.
+	 */
+	xas.xa_index = cache->msg_id_next;
+	xas_find_marked(&xas, UINT_MAX, XA_FREE_MARK);
+	if (xas.xa_node == XAS_RESTART) {
 		xas.xa_index = 0;
-		xas_find_marked(&xas, UINT_MAX, XA_FREE_MARK);
-		if (xas.xa_node == XAS_RESTART)
-			xas_set_err(&xas, -EBUSY);
-		xas_store(&xas, req);
+		xas_find_marked(&xas, cache->msg_id_next - 1, XA_FREE_MARK);
+	}
+	if (xas.xa_node == XAS_RESTART)
+		xas_set_err(&xas, -EBUSY);
+
+	xas_store(&xas, req);
+	if (xas_valid(&xas)) {
+		cache->msg_id_next = xas.xa_index + 1;
 		xas_clear_mark(&xas, XA_FREE_MARK);
 		xas_set_mark(&xas, CACHEFILES_REQ_NEW);
-		xas_unlock(&xas);
-	} while (xas_nomem(&xas, GFP_KERNEL));
+	}
+
+	xas_unlock(&xas);
+	if (xas_nomem(&xas, GFP_KERNEL))
+		goto retry;
 
 	ret = xas_error(&xas);
 	if (ret)
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 5/5] cachefiles: add missing lock protection when polling
  2024-04-24  3:34 [PATCH 0/5] cachefiles: some bugfixes for clean object/send req/poll libaokun
                   ` (3 preceding siblings ...)
  2024-04-24  3:34 ` [PATCH 4/5] cachefiles: cyclic allocation of msg_id to avoid reuse libaokun
@ 2024-04-24  3:34 ` libaokun
  2024-04-24  4:29   ` Gao Xiang
  2024-04-24  5:46   ` Jia Zhu via Linux-erofs
  4 siblings, 2 replies; 11+ messages in thread
From: libaokun @ 2024-04-24  3:34 UTC (permalink / raw)
  To: netfs
  Cc: libaokun, Joseph Qi, jlayton, linux-kernel, dhowells,
	linux-fsdevel, linux-cachefs, Gao Xiang, linux-erofs

From: Jingbo Xu <jefflexu@linux.alibaba.com>

Add missing lock protection in poll routine when iterating xarray,
otherwise:

Even with RCU read lock held, only the slot of the radix tree is
ensured to be pinned there, while the data structure (e.g. struct
cachefiles_req) stored in the slot has no such guarantee.  The poll
routine will iterate the radix tree and dereference cachefiles_req
accordingly.  Thus RCU read lock is not adequate in this case and
spinlock is needed here.

Fixes: b817e22b2e91 ("cachefiles: narrow the scope of triggering EPOLLIN events in ondemand mode")
Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Baokun Li <libaokun1@huawei.com>
---
 fs/cachefiles/daemon.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/cachefiles/daemon.c b/fs/cachefiles/daemon.c
index 6465e2574230..73ed2323282a 100644
--- a/fs/cachefiles/daemon.c
+++ b/fs/cachefiles/daemon.c
@@ -365,14 +365,14 @@ static __poll_t cachefiles_daemon_poll(struct file *file,
 
 	if (cachefiles_in_ondemand_mode(cache)) {
 		if (!xa_empty(&cache->reqs)) {
-			rcu_read_lock();
+			xas_lock(&xas);
 			xas_for_each_marked(&xas, req, ULONG_MAX, CACHEFILES_REQ_NEW) {
 				if (!cachefiles_ondemand_is_reopening_read(req)) {
 					mask |= EPOLLIN;
 					break;
 				}
 			}
-			rcu_read_unlock();
+			xas_unlock(&xas);
 		}
 	} else {
 		if (test_bit(CACHEFILES_STATE_CHANGED, &cache->flags))
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 5/5] cachefiles: add missing lock protection when polling
  2024-04-24  3:34 ` [PATCH 5/5] cachefiles: add missing lock protection when polling libaokun
@ 2024-04-24  4:29   ` Gao Xiang
  2024-04-24  6:23     ` Baokun Li
  2024-04-24  5:46   ` Jia Zhu via Linux-erofs
  1 sibling, 1 reply; 11+ messages in thread
From: Gao Xiang @ 2024-04-24  4:29 UTC (permalink / raw)
  To: libaokun, netfs
  Cc: Joseph Qi, jlayton, linux-kernel, dhowells, linux-fsdevel,
	linux-cachefs, linux-erofs

Hi Baokun,

On 2024/4/24 11:34, libaokun@huaweicloud.com wrote:
> From: Jingbo Xu <jefflexu@linux.alibaba.com>
> 
> Add missing lock protection in poll routine when iterating xarray,
> otherwise:
> 
> Even with RCU read lock held, only the slot of the radix tree is
> ensured to be pinned there, while the data structure (e.g. struct
> cachefiles_req) stored in the slot has no such guarantee.  The poll
> routine will iterate the radix tree and dereference cachefiles_req
> accordingly.  Thus RCU read lock is not adequate in this case and
> spinlock is needed here.
> 
> Fixes: b817e22b2e91 ("cachefiles: narrow the scope of triggering EPOLLIN events in ondemand mode")
> Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com>
> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>

I'm not sure why this patch didn't send upstream,
https://gitee.com/anolis/cloud-kernel/commit/324ecaaa10fefb0e3d94b547e3170e40b90cda1f

But since we're now working on upstreaming, so let's drop
the previous in-house review tags..

Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>

Thanks,
Gao Xiang

> Signed-off-by: Baokun Li <libaokun1@huawei.com>
> ---
>   fs/cachefiles/daemon.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/cachefiles/daemon.c b/fs/cachefiles/daemon.c
> index 6465e2574230..73ed2323282a 100644
> --- a/fs/cachefiles/daemon.c
> +++ b/fs/cachefiles/daemon.c
> @@ -365,14 +365,14 @@ static __poll_t cachefiles_daemon_poll(struct file *file,
>   
>   	if (cachefiles_in_ondemand_mode(cache)) {
>   		if (!xa_empty(&cache->reqs)) {
> -			rcu_read_lock();
> +			xas_lock(&xas);
>   			xas_for_each_marked(&xas, req, ULONG_MAX, CACHEFILES_REQ_NEW) {
>   				if (!cachefiles_ondemand_is_reopening_read(req)) {
>   					mask |= EPOLLIN;
>   					break;
>   				}
>   			}
> -			rcu_read_unlock();
> +			xas_unlock(&xas);
>   		}
>   	} else {
>   		if (test_bit(CACHEFILES_STATE_CHANGED, &cache->flags))

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 5/5] cachefiles: add missing lock protection when polling
  2024-04-24  3:34 ` [PATCH 5/5] cachefiles: add missing lock protection when polling libaokun
  2024-04-24  4:29   ` Gao Xiang
@ 2024-04-24  5:46   ` Jia Zhu via Linux-erofs
  1 sibling, 0 replies; 11+ messages in thread
From: Jia Zhu via Linux-erofs @ 2024-04-24  5:46 UTC (permalink / raw)
  To: libaokun, netfs
  Cc: Joseph Qi, jlayton, linux-kernel, dhowells, linux-fsdevel,
	linux-cachefs, Gao Xiang, linux-erofs



在 2024/4/24 11:34, libaokun@huaweicloud.com 写道:
> From: Jingbo Xu <jefflexu@linux.alibaba.com>
> 
> Add missing lock protection in poll routine when iterating xarray,
> otherwise:
> 
> Even with RCU read lock held, only the slot of the radix tree is
> ensured to be pinned there, while the data structure (e.g. struct
> cachefiles_req) stored in the slot has no such guarantee.  The poll
> routine will iterate the radix tree and dereference cachefiles_req
> accordingly.  Thus RCU read lock is not adequate in this case and
> spinlock is needed here.
> 
> Fixes: b817e22b2e91 ("cachefiles: narrow the scope of triggering EPOLLIN events in ondemand mode")
> Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com>
> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
> Signed-off-by: Baokun Li <libaokun1@huawei.com>

Thanks for catching this.

Reviewed-by: Jia Zhu <zhujia.zj@bytedance.com>

> ---
>   fs/cachefiles/daemon.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/cachefiles/daemon.c b/fs/cachefiles/daemon.c
> index 6465e2574230..73ed2323282a 100644
> --- a/fs/cachefiles/daemon.c
> +++ b/fs/cachefiles/daemon.c
> @@ -365,14 +365,14 @@ static __poll_t cachefiles_daemon_poll(struct file *file,
>   
>   	if (cachefiles_in_ondemand_mode(cache)) {
>   		if (!xa_empty(&cache->reqs)) {
> -			rcu_read_lock();
> +			xas_lock(&xas);
>   			xas_for_each_marked(&xas, req, ULONG_MAX, CACHEFILES_REQ_NEW) {
>   				if (!cachefiles_ondemand_is_reopening_read(req)) {
>   					mask |= EPOLLIN;
>   					break;
>   				}
>   			}
> -			rcu_read_unlock();
> +			xas_unlock(&xas);
>   		}
>   	} else {
>   		if (test_bit(CACHEFILES_STATE_CHANGED, &cache->flags))

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 5/5] cachefiles: add missing lock protection when polling
  2024-04-24  4:29   ` Gao Xiang
@ 2024-04-24  6:23     ` Baokun Li
  0 siblings, 0 replies; 11+ messages in thread
From: Baokun Li @ 2024-04-24  6:23 UTC (permalink / raw)
  To: Gao Xiang, netfs
  Cc: libaokun, Joseph Qi, jlayton, linux-kernel, dhowells,
	linux-fsdevel, linux-cachefs, linux-erofs

Hi Xiang,

On 2024/4/24 12:29, Gao Xiang wrote:
> Hi Baokun,
>
> On 2024/4/24 11:34, libaokun@huaweicloud.com wrote:
>> From: Jingbo Xu <jefflexu@linux.alibaba.com>
>>
>> Add missing lock protection in poll routine when iterating xarray,
>> otherwise:
>>
>> Even with RCU read lock held, only the slot of the radix tree is
>> ensured to be pinned there, while the data structure (e.g. struct
>> cachefiles_req) stored in the slot has no such guarantee.  The poll
>> routine will iterate the radix tree and dereference cachefiles_req
>> accordingly.  Thus RCU read lock is not adequate in this case and
>> spinlock is needed here.
>>
>> Fixes: b817e22b2e91 ("cachefiles: narrow the scope of triggering 
>> EPOLLIN events in ondemand mode")
>> Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com>
>> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
>> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
>
> I'm not sure why this patch didn't send upstream,
> https://gitee.com/anolis/cloud-kernel/commit/324ecaaa10fefb0e3d94b547e3170e40b90cda1f 
>
>
Yes, this issue blocks our tests, so this commit is adapted to upstream 
here.

> But since we're now working on upstreaming, so let's drop
> the previous in-house review tags..
>
> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
>
> Thanks,
> Gao Xiang

Ok, thanks for the review!

Cheers,
Baokun
>
>> Signed-off-by: Baokun Li <libaokun1@huawei.com>
>> ---
>>   fs/cachefiles/daemon.c | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/cachefiles/daemon.c b/fs/cachefiles/daemon.c
>> index 6465e2574230..73ed2323282a 100644
>> --- a/fs/cachefiles/daemon.c
>> +++ b/fs/cachefiles/daemon.c
>> @@ -365,14 +365,14 @@ static __poll_t cachefiles_daemon_poll(struct 
>> file *file,
>>         if (cachefiles_in_ondemand_mode(cache)) {
>>           if (!xa_empty(&cache->reqs)) {
>> -            rcu_read_lock();
>> +            xas_lock(&xas);
>>               xas_for_each_marked(&xas, req, ULONG_MAX, 
>> CACHEFILES_REQ_NEW) {
>>                   if (!cachefiles_ondemand_is_reopening_read(req)) {
>>                       mask |= EPOLLIN;
>>                       break;
>>                   }
>>               }
>> -            rcu_read_unlock();
>> +            xas_unlock(&xas);
>>           }
>>       } else {
>>           if (test_bit(CACHEFILES_STATE_CHANGED, &cache->flags))



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/5] cachefiles: flush ondemand_object_worker during clean object
  2024-04-24  3:34 ` [PATCH 3/5] cachefiles: flush ondemand_object_worker during clean object libaokun
@ 2024-04-25  5:41   ` Jia Zhu via Linux-erofs
  2024-04-25  6:53     ` Baokun Li
  0 siblings, 1 reply; 11+ messages in thread
From: Jia Zhu via Linux-erofs @ 2024-04-25  5:41 UTC (permalink / raw)
  To: libaokun, netfs
  Cc: jlayton, linux-kernel, dhowells, linux-fsdevel, linux-cachefs,
	linux-erofs

Thanks for catching this. How about adding a Fixes tag.

Reviewed-by: Jia Zhu <zhujia.zj@bytedance.com>


在 2024/4/24 11:34, libaokun@huaweicloud.com 写道:
> From: Hou Tao <houtao1@huawei.com>
> 
> When queuing ondemand_object_worker() to re-open the object,
> cachefiles_object is not pinned. The cachefiles_object may be freed when
> the pending read request is completed intentionally and the related
> erofs is umounted. If ondemand_object_worker() runs after the object is
> freed, it will incur use-after-free problem as shown below.
> 
> process A  processs B  process C  process D
> 
> cachefiles_ondemand_send_req()
> // send a read req X
> // wait for its completion
> 
>             // close ondemand fd
>             cachefiles_ondemand_fd_release()
>             // set object as CLOSE
> 
>                         cachefiles_ondemand_daemon_read()
>                         // set object as REOPENING
>                         queue_work(fscache_wq, &info->ondemand_work)
> 
>                                  // close /dev/cachefiles
>                                  cachefiles_daemon_release
>                                  cachefiles_flush_reqs
>                                  complete(&req->done)
> 
> // read req X is completed
> // umount the erofs fs
> cachefiles_put_object()
> // object will be freed
> cachefiles_ondemand_deinit_obj_info()
> kmem_cache_free(object)
>                         // both info and object are freed
>                         ondemand_object_worker()
> 
> When dropping an object, it is no longer necessary to reopen the object,
> so use cancel_work_sync() to cancel or wait for ondemand_object_worker()
> to complete.
> 
> Signed-off-by: Hou Tao <houtao1@huawei.com>
> Signed-off-by: Baokun Li <libaokun1@huawei.com>
> ---
>   fs/cachefiles/ondemand.c | 3 +++
>   1 file changed, 3 insertions(+)
> 
> diff --git a/fs/cachefiles/ondemand.c b/fs/cachefiles/ondemand.c
> index d24bff43499b..f6440b3e7368 100644
> --- a/fs/cachefiles/ondemand.c
> +++ b/fs/cachefiles/ondemand.c
> @@ -589,6 +589,9 @@ void cachefiles_ondemand_clean_object(struct cachefiles_object *object)
>   		}
>   	}
>   	xa_unlock(&cache->reqs);
> +
> +	/* Wait for ondemand_object_worker() to finish to avoid UAF. */
> +	cancel_work_sync(&object->ondemand->ondemand_work);
>   }
>   
>   int cachefiles_ondemand_init_obj_info(struct cachefiles_object *object,

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/5] cachefiles: flush ondemand_object_worker during clean object
  2024-04-25  5:41   ` Jia Zhu via Linux-erofs
@ 2024-04-25  6:53     ` Baokun Li
  0 siblings, 0 replies; 11+ messages in thread
From: Baokun Li @ 2024-04-25  6:53 UTC (permalink / raw)
  To: Jia Zhu, netfs
  Cc: libaokun, jlayton, linux-kernel, dhowells, linux-fsdevel,
	linux-cachefs, linux-erofs

Hi Jia,

On 2024/4/25 13:41, Jia Zhu wrote:
> Thanks for catching this. How about adding a Fixes tag.
>
> Reviewed-by: Jia Zhu <zhujia.zj@bytedance.com>
>
>
Ok, I will add the Fixes tag in the next iteration.
Thank you very much for your review!

Cheers!
Baokun
> 在 2024/4/24 11:34, libaokun@huaweicloud.com 写道:
>> From: Hou Tao <houtao1@huawei.com>
>>
>> When queuing ondemand_object_worker() to re-open the object,
>> cachefiles_object is not pinned. The cachefiles_object may be freed when
>> the pending read request is completed intentionally and the related
>> erofs is umounted. If ondemand_object_worker() runs after the object is
>> freed, it will incur use-after-free problem as shown below.
>>
>> process A  processs B  process C  process D
>>
>> cachefiles_ondemand_send_req()
>> // send a read req X
>> // wait for its completion
>>
>>             // close ondemand fd
>>             cachefiles_ondemand_fd_release()
>>             // set object as CLOSE
>>
>>                         cachefiles_ondemand_daemon_read()
>>                         // set object as REOPENING
>>                         queue_work(fscache_wq, &info->ondemand_work)
>>
>>                                  // close /dev/cachefiles
>>                                  cachefiles_daemon_release
>>                                  cachefiles_flush_reqs
>>                                  complete(&req->done)
>>
>> // read req X is completed
>> // umount the erofs fs
>> cachefiles_put_object()
>> // object will be freed
>> cachefiles_ondemand_deinit_obj_info()
>> kmem_cache_free(object)
>>                         // both info and object are freed
>>                         ondemand_object_worker()
>>
>> When dropping an object, it is no longer necessary to reopen the object,
>> so use cancel_work_sync() to cancel or wait for ondemand_object_worker()
>> to complete.
>>
>> Signed-off-by: Hou Tao <houtao1@huawei.com>
>> Signed-off-by: Baokun Li <libaokun1@huawei.com>
>> ---
>>   fs/cachefiles/ondemand.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/fs/cachefiles/ondemand.c b/fs/cachefiles/ondemand.c
>> index d24bff43499b..f6440b3e7368 100644
>> --- a/fs/cachefiles/ondemand.c
>> +++ b/fs/cachefiles/ondemand.c
>> @@ -589,6 +589,9 @@ void cachefiles_ondemand_clean_object(struct 
>> cachefiles_object *object)
>>           }
>>       }
>>       xa_unlock(&cache->reqs);
>> +
>> +    /* Wait for ondemand_object_worker() to finish to avoid UAF. */
>> + cancel_work_sync(&object->ondemand->ondemand_work);
>>   }
>>     int cachefiles_ondemand_init_obj_info(struct cachefiles_object 
>> *object,



^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2024-04-25  6:53 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-24  3:34 [PATCH 0/5] cachefiles: some bugfixes for clean object/send req/poll libaokun
2024-04-24  3:34 ` [PATCH 1/5] cachefiles: stop sending new request when dropping object libaokun
2024-04-24  3:34 ` [PATCH 2/5] cachefiles: flush all requests for the object that is being dropped libaokun
2024-04-24  3:34 ` [PATCH 3/5] cachefiles: flush ondemand_object_worker during clean object libaokun
2024-04-25  5:41   ` Jia Zhu via Linux-erofs
2024-04-25  6:53     ` Baokun Li
2024-04-24  3:34 ` [PATCH 4/5] cachefiles: cyclic allocation of msg_id to avoid reuse libaokun
2024-04-24  3:34 ` [PATCH 5/5] cachefiles: add missing lock protection when polling libaokun
2024-04-24  4:29   ` Gao Xiang
2024-04-24  6:23     ` Baokun Li
2024-04-24  5:46   ` Jia Zhu via Linux-erofs

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).