linux-ext4.vger.kernel.org archive mirror
* [PATCH] ext4: avoid utf8_strncasecmp() with unstable name
@ 2020-05-30  6:02 Eric Biggers
  2020-05-30  6:17 ` Gabriel Krisman Bertazi
  2020-05-30 17:18 ` Matthew Wilcox
  0 siblings, 2 replies; 9+ messages in thread
From: Eric Biggers @ 2020-05-30  6:02 UTC
  To: linux-ext4
  Cc: linux-fsdevel, linux-f2fs-devel, stable, Al Viro,
	Daniel Rosenberg, Gabriel Krisman Bertazi

From: Eric Biggers <ebiggers@google.com>

If the dentry name passed to ->d_compare() fits in dentry::d_iname, then
it may be concurrently modified by a rename.  This can cause undefined
behavior (possibly out-of-bounds memory accesses or crashes) in
utf8_strncasecmp(), since fs/unicode/ isn't written to handle strings
that may be concurrently modified.

Fix this by first copying the filename to a stack buffer if needed.
This way we get a stable snapshot of the filename.

Fixes: b886ee3e778e ("ext4: Support case-insensitive file name lookups")
Cc: <stable@vger.kernel.org> # v5.2+
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Daniel Rosenberg <drosen@google.com>
Cc: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
Signed-off-by: Eric Biggers <ebiggers@google.com>
---
 fs/ext4/dir.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/fs/ext4/dir.c b/fs/ext4/dir.c
index c654205f648dd..19aef8328bb18 100644
--- a/fs/ext4/dir.c
+++ b/fs/ext4/dir.c
@@ -675,6 +675,7 @@ static int ext4_d_compare(const struct dentry *dentry, unsigned int len,
 	struct qstr qstr = {.name = str, .len = len };
 	const struct dentry *parent = READ_ONCE(dentry->d_parent);
 	const struct inode *inode = READ_ONCE(parent->d_inode);
+	char strbuf[DNAME_INLINE_LEN];
 
 	if (!inode || !IS_CASEFOLDED(inode) ||
 	    !EXT4_SB(inode->i_sb)->s_encoding) {
@@ -683,6 +684,22 @@ static int ext4_d_compare(const struct dentry *dentry, unsigned int len,
 		return memcmp(str, name->name, len);
 	}
 
+	/*
+	 * If the dentry name is stored in-line, then it may be concurrently
+	 * modified by a rename.  If this happens, the VFS will eventually retry
+	 * the lookup, so it doesn't matter what ->d_compare() returns.
+	 * However, it's unsafe to call utf8_strncasecmp() with an unstable
+	 * string.  Therefore, we have to copy the name into a temporary buffer.
+	 */
+	if (len <= DNAME_INLINE_LEN - 1) {
+		unsigned int i;
+
+		for (i = 0; i < len; i++)
+			strbuf[i] = READ_ONCE(str[i]);
+		strbuf[len] = 0;
+		qstr.name = strbuf;
+	}
+
 	return ext4_ci_compare(inode, name, &qstr, false);
 }
 
-- 
2.26.2



* Re: [PATCH] ext4: avoid utf8_strncasecmp() with unstable name
  2020-05-30  6:02 [PATCH] ext4: avoid utf8_strncasecmp() with unstable name Eric Biggers
@ 2020-05-30  6:17 ` Gabriel Krisman Bertazi
  2020-05-30  6:44   ` Eric Biggers
  2020-05-30 17:18 ` Matthew Wilcox
  1 sibling, 1 reply; 9+ messages in thread
From: Gabriel Krisman Bertazi @ 2020-05-30  6:17 UTC
  To: Eric Biggers
  Cc: linux-ext4, linux-fsdevel, linux-f2fs-devel, stable, Al Viro,
	Daniel Rosenberg, Gabriel Krisman Bertazi

Eric Biggers <ebiggers@kernel.org> writes:

> From: Eric Biggers <ebiggers@google.com>
>
> If the dentry name passed to ->d_compare() fits in dentry::d_iname, then
> it may be concurrently modified by a rename.  This can cause undefined
> behavior (possibly out-of-bounds memory accesses or crashes) in
> utf8_strncasecmp(), since fs/unicode/ isn't written to handle strings
> that may be concurrently modified.
>
> Fix this by first copying the filename to a stack buffer if needed.
> This way we get a stable snapshot of the filename.
>
> Fixes: b886ee3e778e ("ext4: Support case-insensitive file name lookups")
> Cc: <stable@vger.kernel.org> # v5.2+
> Cc: Al Viro <viro@zeniv.linux.org.uk>
> Cc: Daniel Rosenberg <drosen@google.com>
> Cc: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
> Signed-off-by: Eric Biggers <ebiggers@google.com>
> ---
>  fs/ext4/dir.c | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
>
> diff --git a/fs/ext4/dir.c b/fs/ext4/dir.c
> index c654205f648dd..19aef8328bb18 100644
> --- a/fs/ext4/dir.c
> +++ b/fs/ext4/dir.c
> @@ -675,6 +675,7 @@ static int ext4_d_compare(const struct dentry *dentry, unsigned int len,
>  	struct qstr qstr = {.name = str, .len = len };
>  	const struct dentry *parent = READ_ONCE(dentry->d_parent);
>  	const struct inode *inode = READ_ONCE(parent->d_inode);
> +	char strbuf[DNAME_INLINE_LEN];
>  
>  	if (!inode || !IS_CASEFOLDED(inode) ||
>  	    !EXT4_SB(inode->i_sb)->s_encoding) {
> @@ -683,6 +684,22 @@ static int ext4_d_compare(const struct dentry *dentry, unsigned int len,
>  		return memcmp(str, name->name, len);
>  	}
>  
> +	/*
> +	 * If the dentry name is stored in-line, then it may be concurrently
> +	 * modified by a rename.  If this happens, the VFS will eventually retry
> +	 * the lookup, so it doesn't matter what ->d_compare() returns.
> +	 * However, it's unsafe to call utf8_strncasecmp() with an unstable
> +	 * string.  Therefore, we have to copy the name into a temporary buffer.
> +	 */
> +	if (len <= DNAME_INLINE_LEN - 1) {
> +		unsigned int i;
> +
> +		for (i = 0; i < len; i++)
> +			strbuf[i] = READ_ONCE(str[i]);
> +		strbuf[len] = 0;
> +		qstr.name = strbuf;
> +	}
> +

Could we avoid this if the casefolded version were cached in the dentry?
Then we could use utf8_strncasecmp_folded which would be safe.  Would
this be acceptable for vfs?

>  	return ext4_ci_compare(inode, name, &qstr, false);
>  }

-- 
Gabriel Krisman Bertazi


* Re: [PATCH] ext4: avoid utf8_strncasecmp() with unstable name
  2020-05-30  6:17 ` Gabriel Krisman Bertazi
@ 2020-05-30  6:44   ` Eric Biggers
  0 siblings, 0 replies; 9+ messages in thread
From: Eric Biggers @ 2020-05-30  6:44 UTC
  To: Gabriel Krisman Bertazi
  Cc: linux-ext4, linux-fsdevel, linux-f2fs-devel, stable, Al Viro,
	Daniel Rosenberg, Gabriel Krisman Bertazi

On Sat, May 30, 2020 at 02:17:02AM -0400, Gabriel Krisman Bertazi wrote:
> > +	/*
> > +	 * If the dentry name is stored in-line, then it may be concurrently
> > +	 * modified by a rename.  If this happens, the VFS will eventually retry
> > +	 * the lookup, so it doesn't matter what ->d_compare() returns.
> > +	 * However, it's unsafe to call utf8_strncasecmp() with an unstable
> > +	 * string.  Therefore, we have to copy the name into a temporary buffer.
> > +	 */
> > +	if (len <= DNAME_INLINE_LEN - 1) {
> > +		unsigned int i;
> > +
> > +		for (i = 0; i < len; i++)
> > +			strbuf[i] = READ_ONCE(str[i]);
> > +		strbuf[len] = 0;
> > +		qstr.name = strbuf;
> > +	}
> > +
> 
> Could we avoid this if the casefolded version were cached in the dentry?
> Then we could use utf8_strncasecmp_folded which would be safe.  Would
> this be acceptable for vfs?

The VFS assumes that each dentry has one name, the one in d_name.  That's what
it passes to ->d_compare(), and that's what it updates in __d_move().

So while ext4 and f2fs could put the casefolded name in ->d_fsdata,
->d_compare() wouldn't actually have access to it (unless we added d_fsdata as a
parameter to ->d_compare()).  Also, the casefolded name would get outdated when
__d_move() changes d_name.

We could instead make d_name always be the casefolded name.  I'm not sure that
would be possible, though.  For one, I don't think ->lookup() is allowed to just
change the dentry name.  It would also make getcwd(), /proc/*/fd/, etc. always
show casefolded names, which could be problematic.  And probably other issues I
can't think of off the top of my head.

- Eric


* Re: [PATCH] ext4: avoid utf8_strncasecmp() with unstable name
  2020-05-30  6:02 [PATCH] ext4: avoid utf8_strncasecmp() with unstable name Eric Biggers
  2020-05-30  6:17 ` Gabriel Krisman Bertazi
@ 2020-05-30 17:18 ` Matthew Wilcox
  2020-05-30 17:35   ` Eric Biggers
  1 sibling, 1 reply; 9+ messages in thread
From: Matthew Wilcox @ 2020-05-30 17:18 UTC
  To: Eric Biggers
  Cc: linux-ext4, linux-fsdevel, linux-f2fs-devel, stable, Al Viro,
	Daniel Rosenberg, Gabriel Krisman Bertazi

On Fri, May 29, 2020 at 11:02:16PM -0700, Eric Biggers wrote:
> +	if (len <= DNAME_INLINE_LEN - 1) {
> +		unsigned int i;
> +
> +		for (i = 0; i < len; i++)
> +			strbuf[i] = READ_ONCE(str[i]);
> +		strbuf[len] = 0;

This READ_ONCE is going to force the compiler to use byte accesses.
What's wrong with using a plain memcpy()?

> +		qstr.name = strbuf;
> +	}
> +
>  	return ext4_ci_compare(inode, name, &qstr, false);
>  }
>  
> -- 
> 2.26.2
> 


* Re: [PATCH] ext4: avoid utf8_strncasecmp() with unstable name
  2020-05-30 17:18 ` Matthew Wilcox
@ 2020-05-30 17:35   ` Eric Biggers
  2020-05-30 17:59     ` Al Viro
  2020-05-30 20:41     ` Matthew Wilcox
  0 siblings, 2 replies; 9+ messages in thread
From: Eric Biggers @ 2020-05-30 17:35 UTC
  To: Matthew Wilcox
  Cc: linux-ext4, linux-fsdevel, linux-f2fs-devel, stable, Al Viro,
	Daniel Rosenberg, Gabriel Krisman Bertazi

On Sat, May 30, 2020 at 10:18:14AM -0700, Matthew Wilcox wrote:
> On Fri, May 29, 2020 at 11:02:16PM -0700, Eric Biggers wrote:
> > +	if (len <= DNAME_INLINE_LEN - 1) {
> > +		unsigned int i;
> > +
> > +		for (i = 0; i < len; i++)
> > +			strbuf[i] = READ_ONCE(str[i]);
> > +		strbuf[len] = 0;
> 
> This READ_ONCE is going to force the compiler to use byte accesses.
> What's wrong with using a plain memcpy()?
> 

It's undefined behavior when the source can be concurrently modified.

Compilers can assume that it's not, and remove the memcpy() (instead just using
the source data directly) if they can prove that the destination array is never
modified again before it goes out of scope.

Do you have any suggestions that don't involve undefined behavior?

- Eric
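
The byte-wise copy discussed above can be sketched in userspace roughly
as follows.  This is only an illustration, not the patch itself:
snapshot_name() and INLINE_LEN are hypothetical names, and a read through
a volatile-qualified pointer stands in for the kernel's READ_ONCE().  It
shows the shape of the technique (every source byte is read exactly once
into a private buffer); it does not demonstrate a concurrent rename.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's DNAME_INLINE_LEN. */
#define INLINE_LEN 32

/*
 * Copy len bytes from src into buf one byte at a time through a
 * volatile-qualified pointer, then NUL-terminate.  The compiler has to
 * perform each volatile read, so later users of buf see the copied
 * bytes rather than fresh reads of src.  Returns buf.
 */
static char *snapshot_name(char buf[INLINE_LEN], const char *src, size_t len)
{
	const volatile char *vsrc = src;
	size_t i;

	/* Only names that fit in the inline buffer are copied. */
	assert(len <= INLINE_LEN - 1);
	for (i = 0; i < len; i++)
		buf[i] = vsrc[i];
	buf[len] = '\0';
	return buf;
}
```

Once the copy is made, a later change to the source no longer affects
what the comparison code sees, which is the property the patch relies on.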


* Re: [PATCH] ext4: avoid utf8_strncasecmp() with unstable name
  2020-05-30 17:35   ` Eric Biggers
@ 2020-05-30 17:59     ` Al Viro
  2020-06-01  6:45       ` Eric Biggers
  2020-05-30 20:41     ` Matthew Wilcox
  1 sibling, 1 reply; 9+ messages in thread
From: Al Viro @ 2020-05-30 17:59 UTC
  To: Eric Biggers
  Cc: Matthew Wilcox, linux-ext4, linux-fsdevel, linux-f2fs-devel,
	stable, Daniel Rosenberg, Gabriel Krisman Bertazi

On Sat, May 30, 2020 at 10:35:47AM -0700, Eric Biggers wrote:
> On Sat, May 30, 2020 at 10:18:14AM -0700, Matthew Wilcox wrote:
> > On Fri, May 29, 2020 at 11:02:16PM -0700, Eric Biggers wrote:
> > > +	if (len <= DNAME_INLINE_LEN - 1) {
> > > +		unsigned int i;
> > > +
> > > +		for (i = 0; i < len; i++)
> > > +			strbuf[i] = READ_ONCE(str[i]);
> > > +		strbuf[len] = 0;
> > 
> > This READ_ONCE is going to force the compiler to use byte accesses.
> > What's wrong with using a plain memcpy()?
> > 
> 
> It's undefined behavior when the source can be concurrently modified.
> 
> Compilers can assume that it's not, and remove the memcpy() (instead just using
> the source data directly) if they can prove that the destination array is never
> modified again before it goes out of scope.
> 
> Do you have any suggestions that don't involve undefined behavior?

Even memcpy(strbuf, (volatile void *)str, len)?  It's been a while since I've
looked at these parts of C99...


* Re: [PATCH] ext4: avoid utf8_strncasecmp() with unstable name
  2020-05-30 17:35   ` Eric Biggers
  2020-05-30 17:59     ` Al Viro
@ 2020-05-30 20:41     ` Matthew Wilcox
  2020-06-01  7:05       ` Eric Biggers
  1 sibling, 1 reply; 9+ messages in thread
From: Matthew Wilcox @ 2020-05-30 20:41 UTC
  To: Eric Biggers
  Cc: linux-ext4, linux-fsdevel, linux-f2fs-devel, stable, Al Viro,
	Daniel Rosenberg, Gabriel Krisman Bertazi

On Sat, May 30, 2020 at 10:35:47AM -0700, Eric Biggers wrote:
> On Sat, May 30, 2020 at 10:18:14AM -0700, Matthew Wilcox wrote:
> > On Fri, May 29, 2020 at 11:02:16PM -0700, Eric Biggers wrote:
> > > +	if (len <= DNAME_INLINE_LEN - 1) {
> > > +		unsigned int i;
> > > +
> > > +		for (i = 0; i < len; i++)
> > > +			strbuf[i] = READ_ONCE(str[i]);
> > > +		strbuf[len] = 0;
> > 
> > This READ_ONCE is going to force the compiler to use byte accesses.
> > What's wrong with using a plain memcpy()?
> > 
> 
> It's undefined behavior when the source can be concurrently modified.
> 
> Compilers can assume that it's not, and remove the memcpy() (instead just using
> the source data directly) if they can prove that the destination array is never
> modified again before it goes out of scope.
> 
> Do you have any suggestions that don't involve undefined behavior?

void *memcpy_unsafe(void *dst, volatile void *src, __kernel_size_t);

It can just call memcpy() of course, but the compiler can't reason about
this function because it's not a stdlib function.
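
A userspace rendering of that wrapper might look like the sketch below.
Several things here are assumptions rather than part of the proposal:
size_t replaces __kernel_size_t, the source pointer is const-qualified,
and the GCC/Clang noinline attribute is added so the call is not inlined
within a single translation unit.  Whether that is enough to stop a
compiler from reasoning about the copy is not something this sketch
settles.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical memcpy_unsafe(): an out-of-line wrapper around memcpy()
 * that accepts a volatile-qualified source.  memcpy() itself takes a
 * non-volatile pointer, so the qualifier is cast away explicitly.
 */
__attribute__((noinline))
void *memcpy_unsafe(void *dst, const volatile void *src, size_t n)
{
	assert(dst != NULL && src != NULL);
	return memcpy(dst, (const void *)src, n);
}
```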


* Re: [PATCH] ext4: avoid utf8_strncasecmp() with unstable name
  2020-05-30 17:59     ` Al Viro
@ 2020-06-01  6:45       ` Eric Biggers
  0 siblings, 0 replies; 9+ messages in thread
From: Eric Biggers @ 2020-06-01  6:45 UTC
  To: Al Viro
  Cc: Matthew Wilcox, linux-ext4, linux-fsdevel, linux-f2fs-devel,
	stable, Daniel Rosenberg, Gabriel Krisman Bertazi

On Sat, May 30, 2020 at 06:59:07PM +0100, Al Viro wrote:
> On Sat, May 30, 2020 at 10:35:47AM -0700, Eric Biggers wrote:
> > On Sat, May 30, 2020 at 10:18:14AM -0700, Matthew Wilcox wrote:
> > > On Fri, May 29, 2020 at 11:02:16PM -0700, Eric Biggers wrote:
> > > > +	if (len <= DNAME_INLINE_LEN - 1) {
> > > > +		unsigned int i;
> > > > +
> > > > +		for (i = 0; i < len; i++)
> > > > +			strbuf[i] = READ_ONCE(str[i]);
> > > > +		strbuf[len] = 0;
> > > 
> > > This READ_ONCE is going to force the compiler to use byte accesses.
> > > What's wrong with using a plain memcpy()?
> > > 
> > 
> > It's undefined behavior when the source can be concurrently modified.
> > 
> > Compilers can assume that it's not, and remove the memcpy() (instead just using
> > the source data directly) if they can prove that the destination array is never
> > modified again before it goes out of scope.
> > 
> > Do you have any suggestions that don't involve undefined behavior?
> 
> Even memcpy(strbuf, (volatile void *)str, len)?  It's been a while since I've
> looked at these parts of C99...

That doesn't make sense.  memcpy() takes a non-volatile pointer, so the pointer
just gets implicitly cast back to (void *), and you get a compiler warning.

- Eric


* Re: [PATCH] ext4: avoid utf8_strncasecmp() with unstable name
  2020-05-30 20:41     ` Matthew Wilcox
@ 2020-06-01  7:05       ` Eric Biggers
  0 siblings, 0 replies; 9+ messages in thread
From: Eric Biggers @ 2020-06-01  7:05 UTC
  To: Matthew Wilcox
  Cc: linux-ext4, linux-fsdevel, linux-f2fs-devel, stable, Al Viro,
	Daniel Rosenberg, Gabriel Krisman Bertazi

On Sat, May 30, 2020 at 01:41:32PM -0700, Matthew Wilcox wrote:
> On Sat, May 30, 2020 at 10:35:47AM -0700, Eric Biggers wrote:
> > On Sat, May 30, 2020 at 10:18:14AM -0700, Matthew Wilcox wrote:
> > > On Fri, May 29, 2020 at 11:02:16PM -0700, Eric Biggers wrote:
> > > > +	if (len <= DNAME_INLINE_LEN - 1) {
> > > > +		unsigned int i;
> > > > +
> > > > +		for (i = 0; i < len; i++)
> > > > +			strbuf[i] = READ_ONCE(str[i]);
> > > > +		strbuf[len] = 0;
> > > 
> > > This READ_ONCE is going to force the compiler to use byte accesses.
> > > What's wrong with using a plain memcpy()?
> > > 
> > 
> > It's undefined behavior when the source can be concurrently modified.
> > 
> > Compilers can assume that it's not, and remove the memcpy() (instead just using
> > the source data directly) if they can prove that the destination array is never
> > modified again before it goes out of scope.
> > 
> > Do you have any suggestions that don't involve undefined behavior?
> 
> void *memcpy_unsafe(void *dst, volatile void *src, __kernel_size_t);
> 
> It can just call memcpy() of course, but the compiler can't reason about
> this function because it's not a stdlib function.

The compiler can still reason about it if it's in the same file, if it's an
inline function, or if link-time-optimization is enabled.  (LTO isn't yet
supported by the mainline kernel, but people have been working on it.)

Also, as I mentioned to Al, it's necessary to cast away 'volatile' to call
memcpy().  So the 'volatile' serves no purpose.

How about using barrier(), which expands to asm("" : : : "memory") to tell the
compiler that memory was clobbered?

        if (len <= DNAME_INLINE_LEN - 1) {
                memcpy(strbuf, str, len);
                strbuf[len] = 0;
                /* prevent compiler from optimizing out the temporary buffer */
                barrier();
        }

I think it's still technically undefined to call memcpy() on concurrently
modified memory at all, but I think the above would be okay in practice...

Using 'noinline' could be another option.

- Eric
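
For experimenting outside the kernel, the memcpy()-plus-barrier variant
above could be approximated as in the following sketch.  The names
snapshot_name_barrier(), compiler_barrier() and INLINE_LEN are
hypothetical, and the empty asm with a "memory" clobber is GCC/Clang
syntax standing in for the kernel's barrier().  As noted above, calling
memcpy() on concurrently modified memory may still be technically
undefined, so this only illustrates the idea.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's DNAME_INLINE_LEN. */
#define INLINE_LEN 32

/* Compiler-only barrier (GCC/Clang); stand-in for the kernel's barrier(). */
#define compiler_barrier() __asm__ __volatile__("" : : : "memory")

/* Copy len bytes of src into buf, NUL-terminate, and return buf. */
static char *snapshot_name_barrier(char buf[INLINE_LEN], const char *src,
				   size_t len)
{
	assert(len <= INLINE_LEN - 1);
	memcpy(buf, src, len);
	buf[len] = '\0';
	/* prevent compiler from optimizing out the temporary buffer */
	compiler_barrier();
	return buf;
}
```

The barrier emits no instructions; it only tells the compiler that memory
may have changed, so values it has cached from before the barrier cannot
be assumed still valid afterwards.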


Thread overview: 9+ messages
2020-05-30  6:02 [PATCH] ext4: avoid utf8_strncasecmp() with unstable name Eric Biggers
2020-05-30  6:17 ` Gabriel Krisman Bertazi
2020-05-30  6:44   ` Eric Biggers
2020-05-30 17:18 ` Matthew Wilcox
2020-05-30 17:35   ` Eric Biggers
2020-05-30 17:59     ` Al Viro
2020-06-01  6:45       ` Eric Biggers
2020-05-30 20:41     ` Matthew Wilcox
2020-06-01  7:05       ` Eric Biggers
