linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 2 0/2] xfs: fix inline link path race
@ 2021-11-15  2:24 Ian Kent
  2021-11-15  2:24 ` [PATCH 2 1/2] vfs: check dentry is still valid in get_link() Ian Kent
  2021-11-15  2:24 ` [PATCH 2 2/2] xfs: make sure link path does not go away at access Ian Kent
  0 siblings, 2 replies; 6+ messages in thread
From: Ian Kent @ 2021-11-15  2:24 UTC (permalink / raw)
  To: xfs, Darrick J. Wong, Christoph Hellwig
  Cc: Miklos Szeredi, Brian Foster, Al Viro, David Howells,
	linux-fsdevel, Kernel Mailing List

If the inode of an inline symlink is released (dropped) while a path walk
that is in rcu-walk mode is occuring the inode ->get_link() method can be
NULL when VFS dereferences it causing a crash. But, since the release can
occur at any time there's a small but finite possibility the link path
text could be freed while it's being used.

Changes since v1:
- don't bother trying to rcu-free the link path since there could
  be side effects from the xfs reclaim code.
---

Ian Kent (2):
      vfs: check dentry is still valid in get_link()
      xfs: make sure link path does not go away at access


 fs/xfs/xfs_iops.c | 3 +++
 1 file changed, 3 insertions(+)

--
Ian


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 2 1/2] vfs: check dentry is still valid in get_link()
  2021-11-15  2:24 [PATCH 2 0/2] xfs: fix inline link path race Ian Kent
@ 2021-11-15  2:24 ` Ian Kent
  2021-11-15 14:25   ` Brian Foster
  2021-11-15  2:24 ` [PATCH 2 2/2] xfs: make sure link path does not go away at access Ian Kent
  1 sibling, 1 reply; 6+ messages in thread
From: Ian Kent @ 2021-11-15  2:24 UTC (permalink / raw)
  To: xfs, Darrick J. Wong, Christoph Hellwig
  Cc: Miklos Szeredi, Brian Foster, Al Viro, David Howells,
	linux-fsdevel, Kernel Mailing List

When following a trailing symlink in rcu-walk mode it's possible for
the dentry to become invalid between the last dentry seq lock check
and getting the link (eg. an unlink) leading to a backtrace similar
to this:

crash> bt
PID: 10964  TASK: ffff951c8aa92f80  CPU: 3   COMMAND: "TaniumCX"
…
 #7 [ffffae44d0a6fbe0] page_fault at ffffffff8d6010fe
    [exception RIP: unknown or invalid address]
    RIP: 0000000000000000  RSP: ffffae44d0a6fc90  RFLAGS: 00010246
    RAX: ffffffff8da3cc80  RBX: ffffae44d0a6fd30  RCX: 0000000000000000
    RDX: ffffae44d0a6fd98  RSI: ffff951aa9af3008  RDI: 0000000000000000
    RBP: 0000000000000000   R8: ffffae44d0a6fb94   R9: 0000000000000000
    R10: ffff951c95d8c318  R11: 0000000000080000  R12: ffffae44d0a6fd98
    R13: ffff951aa9af3008  R14: ffff951c8c9eb840  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #8 [ffffae44d0a6fc90] trailing_symlink at ffffffff8cf24e61
 #9 [ffffae44d0a6fcc8] path_lookupat at ffffffff8cf261d1
#10 [ffffae44d0a6fd28] filename_lookup at ffffffff8cf2a700
#11 [ffffae44d0a6fe40] vfs_statx at ffffffff8cf1dbc4
#12 [ffffae44d0a6fe98] __do_sys_newstat at ffffffff8cf1e1f9
#13 [ffffae44d0a6ff38] do_syscall_64 at ffffffff8cc0420b

Most of the time this is not a problem because the inode is unchanged
while the rcu read lock is held.

But xfs can re-use inodes which can result in the inode ->get_link()
method becoming invalid (or NULL).

This case needs to be checked for in fs/namei.c:get_link() and if
detected the walk re-started.

Signed-off-by: Ian Kent <raven@themaw.net>
---
 0 files changed

diff --git a/fs/namei.c b/fs/namei.c
index 1946d9667790..9a48a6106516 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1760,8 +1760,11 @@ static const char *pick_link(struct nameidata *nd, struct path *link,
 	if (!res) {
 		const char * (*get)(struct dentry *, struct inode *,
 				struct delayed_call *);
-		get = inode->i_op->get_link;
+		get = READ_ONCE(inode->i_op->get_link);
 		if (nd->flags & LOOKUP_RCU) {
+			/* Does the inode still match the associated dentry? */
+			if (unlikely(read_seqcount_retry(&link->dentry->d_seq, last->seq)))
+				return ERR_PTR(-ECHILD);
 			res = get(NULL, inode, &last->done);
 			if (res == ERR_PTR(-ECHILD) && try_to_unlazy(nd))
 				res = get(link->dentry, inode, &last->done);



^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 2 2/2] xfs: make sure link path does not go away at access
  2021-11-15  2:24 [PATCH 2 0/2] xfs: fix inline link path race Ian Kent
  2021-11-15  2:24 ` [PATCH 2 1/2] vfs: check dentry is still valid in get_link() Ian Kent
@ 2021-11-15  2:24 ` Ian Kent
  2021-11-15 14:25   ` Brian Foster
  2021-11-15 22:26   ` Dave Chinner
  1 sibling, 2 replies; 6+ messages in thread
From: Ian Kent @ 2021-11-15  2:24 UTC (permalink / raw)
  To: xfs, Darrick J. Wong, Christoph Hellwig
  Cc: Miklos Szeredi, Brian Foster, Al Viro, David Howells,
	linux-fsdevel, Kernel Mailing List

When following an inline symlink in rcu-walk mode it's possible to
succeed in getting the ->get_link() method pointer but the link path
string be deallocated while it's being used.

This is becuase of the xfs inode reclaim mechanism. While rcu freeing
the link path can prevent it from being freed during use the inode
reclaim could assign a new value to the field at any time outside of
the path walk and result in an invalid link path pointer being
returned. Admittedly a very small race window but possible.

The best way to mitigate this risk is to return -ECHILD to the VFS
if the inline symlink method, ->get_link(), is called in rcu-walk mode
so the VFS can switch to ref-walk mode or redo the walk if the inode
has become invalid.

If it's discovered that staying in rcu-walk mode gives a worth while
performance improvement (unlikely) then the link path could be freed
under rcu once potential side effects of the xfs inode reclaim
sub-system have been analysed and dealt with if needed.

Signed-off-by: Ian Kent <raven@themaw.net>
---
 fs/xfs/xfs_iops.c |    3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index a607d6aca5c4..0a96183c5381 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -520,6 +520,9 @@ xfs_vn_get_link_inline(
 	struct xfs_inode	*ip = XFS_I(inode);
 	char			*link;
 
+	if (!dentry)
+		return ERR_PTR(-ECHILD);
+
 	ASSERT(ip->i_df.if_format == XFS_DINODE_FMT_LOCAL);
 
 	/*



^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 2 1/2] vfs: check dentry is still valid in get_link()
  2021-11-15  2:24 ` [PATCH 2 1/2] vfs: check dentry is still valid in get_link() Ian Kent
@ 2021-11-15 14:25   ` Brian Foster
  0 siblings, 0 replies; 6+ messages in thread
From: Brian Foster @ 2021-11-15 14:25 UTC (permalink / raw)
  To: Ian Kent
  Cc: xfs, Darrick J. Wong, Christoph Hellwig, Miklos Szeredi, Al Viro,
	David Howells, linux-fsdevel, Kernel Mailing List

On Mon, Nov 15, 2021 at 10:24:22AM +0800, Ian Kent wrote:
> When following a trailing symlink in rcu-walk mode it's possible for
> the dentry to become invalid between the last dentry seq lock check
> and getting the link (eg. an unlink) leading to a backtrace similar
> to this:
> 
> crash> bt
> PID: 10964  TASK: ffff951c8aa92f80  CPU: 3   COMMAND: "TaniumCX"
> …
>  #7 [ffffae44d0a6fbe0] page_fault at ffffffff8d6010fe
>     [exception RIP: unknown or invalid address]
>     RIP: 0000000000000000  RSP: ffffae44d0a6fc90  RFLAGS: 00010246
>     RAX: ffffffff8da3cc80  RBX: ffffae44d0a6fd30  RCX: 0000000000000000
>     RDX: ffffae44d0a6fd98  RSI: ffff951aa9af3008  RDI: 0000000000000000
>     RBP: 0000000000000000   R8: ffffae44d0a6fb94   R9: 0000000000000000
>     R10: ffff951c95d8c318  R11: 0000000000080000  R12: ffffae44d0a6fd98
>     R13: ffff951aa9af3008  R14: ffff951c8c9eb840  R15: 0000000000000000
>     ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
>  #8 [ffffae44d0a6fc90] trailing_symlink at ffffffff8cf24e61
>  #9 [ffffae44d0a6fcc8] path_lookupat at ffffffff8cf261d1
> #10 [ffffae44d0a6fd28] filename_lookup at ffffffff8cf2a700
> #11 [ffffae44d0a6fe40] vfs_statx at ffffffff8cf1dbc4
> #12 [ffffae44d0a6fe98] __do_sys_newstat at ffffffff8cf1e1f9
> #13 [ffffae44d0a6ff38] do_syscall_64 at ffffffff8cc0420b
> 
> Most of the time this is not a problem because the inode is unchanged
> while the rcu read lock is held.
> 
> But xfs can re-use inodes which can result in the inode ->get_link()
> method becoming invalid (or NULL).
> 
> This case needs to be checked for in fs/namei.c:get_link() and if
> detected the walk re-started.
> 
> Signed-off-by: Ian Kent <raven@themaw.net>
> ---

Reviewed-by: Brian Foster <bfoster@redhat.com>

>  0 files changed
> 
> diff --git a/fs/namei.c b/fs/namei.c
> index 1946d9667790..9a48a6106516 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -1760,8 +1760,11 @@ static const char *pick_link(struct nameidata *nd, struct path *link,
>  	if (!res) {
>  		const char * (*get)(struct dentry *, struct inode *,
>  				struct delayed_call *);
> -		get = inode->i_op->get_link;
> +		get = READ_ONCE(inode->i_op->get_link);
>  		if (nd->flags & LOOKUP_RCU) {
> +			/* Does the inode still match the associated dentry? */
> +			if (unlikely(read_seqcount_retry(&link->dentry->d_seq, last->seq)))
> +				return ERR_PTR(-ECHILD);
>  			res = get(NULL, inode, &last->done);
>  			if (res == ERR_PTR(-ECHILD) && try_to_unlazy(nd))
>  				res = get(link->dentry, inode, &last->done);
> 
> 


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2 2/2] xfs: make sure link path does not go away at access
  2021-11-15  2:24 ` [PATCH 2 2/2] xfs: make sure link path does not go away at access Ian Kent
@ 2021-11-15 14:25   ` Brian Foster
  2021-11-15 22:26   ` Dave Chinner
  1 sibling, 0 replies; 6+ messages in thread
From: Brian Foster @ 2021-11-15 14:25 UTC (permalink / raw)
  To: Ian Kent
  Cc: xfs, Darrick J. Wong, Christoph Hellwig, Miklos Szeredi, Al Viro,
	David Howells, linux-fsdevel, Kernel Mailing List

On Mon, Nov 15, 2021 at 10:24:28AM +0800, Ian Kent wrote:
> When following an inline symlink in rcu-walk mode it's possible to
> succeed in getting the ->get_link() method pointer but the link path
> string be deallocated while it's being used.
> 
> This is becuase of the xfs inode reclaim mechanism. While rcu freeing
> the link path can prevent it from being freed during use the inode
> reclaim could assign a new value to the field at any time outside of
> the path walk and result in an invalid link path pointer being
> returned. Admittedly a very small race window but possible.
> 
> The best way to mitigate this risk is to return -ECHILD to the VFS
> if the inline symlink method, ->get_link(), is called in rcu-walk mode
> so the VFS can switch to ref-walk mode or redo the walk if the inode
> has become invalid.
> 
> If it's discovered that staying in rcu-walk mode gives a worth while
> performance improvement (unlikely) then the link path could be freed
> under rcu once potential side effects of the xfs inode reclaim
> sub-system have been analysed and dealt with if needed.
> 
> Signed-off-by: Ian Kent <raven@themaw.net>
> ---

Reviewed-by: Brian Foster <bfoster@redhat.com>

>  fs/xfs/xfs_iops.c |    3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
> index a607d6aca5c4..0a96183c5381 100644
> --- a/fs/xfs/xfs_iops.c
> +++ b/fs/xfs/xfs_iops.c
> @@ -520,6 +520,9 @@ xfs_vn_get_link_inline(
>  	struct xfs_inode	*ip = XFS_I(inode);
>  	char			*link;
>  
> +	if (!dentry)
> +		return ERR_PTR(-ECHILD);
> +
>  	ASSERT(ip->i_df.if_format == XFS_DINODE_FMT_LOCAL);
>  
>  	/*
> 
> 


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2 2/2] xfs: make sure link path does not go away at access
  2021-11-15  2:24 ` [PATCH 2 2/2] xfs: make sure link path does not go away at access Ian Kent
  2021-11-15 14:25   ` Brian Foster
@ 2021-11-15 22:26   ` Dave Chinner
  1 sibling, 0 replies; 6+ messages in thread
From: Dave Chinner @ 2021-11-15 22:26 UTC (permalink / raw)
  To: Ian Kent
  Cc: xfs, Darrick J. Wong, Christoph Hellwig, Miklos Szeredi,
	Brian Foster, Al Viro, David Howells, linux-fsdevel,
	Kernel Mailing List

On Mon, Nov 15, 2021 at 10:24:28AM +0800, Ian Kent wrote:
> When following an inline symlink in rcu-walk mode it's possible to
> succeed in getting the ->get_link() method pointer but the link path
> string be deallocated while it's being used.
> 
> This is becuase of the xfs inode reclaim mechanism. While rcu freeing
> the link path can prevent it from being freed during use the inode
> reclaim could assign a new value to the field at any time outside of
> the path walk and result in an invalid link path pointer being
> returned. Admittedly a very small race window but possible.
> 
> The best way to mitigate this risk is to return -ECHILD to the VFS
> if the inline symlink method, ->get_link(), is called in rcu-walk mode
> so the VFS can switch to ref-walk mode or redo the walk if the inode
> has become invalid.
> 
> If it's discovered that staying in rcu-walk mode gives a worth while
> performance improvement (unlikely) then the link path could be freed
> under rcu once potential side effects of the xfs inode reclaim
> sub-system have been analysed and dealt with if needed.
> 
> Signed-off-by: Ian Kent <raven@themaw.net>
> ---
>  fs/xfs/xfs_iops.c |    3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
> index a607d6aca5c4..0a96183c5381 100644
> --- a/fs/xfs/xfs_iops.c
> +++ b/fs/xfs/xfs_iops.c
> @@ -520,6 +520,9 @@ xfs_vn_get_link_inline(
>  	struct xfs_inode	*ip = XFS_I(inode);
>  	char			*link;
>  
> +	if (!dentry)
> +		return ERR_PTR(-ECHILD);
> +
>  	ASSERT(ip->i_df.if_format == XFS_DINODE_FMT_LOCAL);

NACK. As I just mentioned in the original thread, we can fix this
inode reuse within the RCU grace period problem realtively easily
without needing to turn off lockless pathwalk support for inline
symlinks.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-11-15 22:43 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-15  2:24 [PATCH 2 0/2] xfs: fix inline link path race Ian Kent
2021-11-15  2:24 ` [PATCH 2 1/2] vfs: check dentry is still valid in get_link() Ian Kent
2021-11-15 14:25   ` Brian Foster
2021-11-15  2:24 ` [PATCH 2 2/2] xfs: make sure link path does not go away at access Ian Kent
2021-11-15 14:25   ` Brian Foster
2021-11-15 22:26   ` Dave Chinner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).