linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] fsnotify: fix mem overwritten
@ 2017-05-31  3:54 石祤
  2017-05-31  3:54 ` [PATCH 1/2] fs/dcache.c: New copy_dname method 石祤
  2017-05-31  3:54 ` [PATCH 2/2] fsnotify: use method copy_dname copying filename 石祤
  0 siblings, 2 replies; 5+ messages in thread
From: 石祤 @ 2017-05-31  3:54 UTC (permalink / raw)
  To: leilei.lin, viro, linux-kernel, linux-fsdevel, zhiche.yy

From: "leilei.lin" <leilei.lin@alibaba-inc.com>

Slub alloced mem overwritten ocurrs when fsnotify thread was copying the
dentry name and another rename thread change the dentry name at same
time.

These patches do the following:

1. A new copy_dname method was created which copy file_name to new alloc mem.
   The later patch (of 2) would use this method

2. Use the new copy_dname method instead of using the point of dentry->name,
   which may be modified anytime

We can use script below to reproduce overwritten warning

```
#!/usr/bin/python

import os
import random
import time
import string
import multiprocessing

WRITE_SIZE = 100
filename = "/watch/tdc_admin.LOG"
#filename = "/watch/tdc_admin.LOG.1234567890.1234567890.1234567890"

def file_op_process():
    for j in range(10):
        n = random.randrange(0, 10)
        tobe_wrote = "".join(random.sample(string.ascii_letters, 10))
        for i in xrange(n):
            try:
                os.rename(filename, filename + ".1123123123")
            except OSError:
                pass

        for i in xrange(n):
            f = file(filename, "w+")
            f.write(tobe_wrote * i * (1024 / 2))
            f.flush()

            f.close()


if __name__ == '__main__':
    process_list = []
    while True:
        for i in range(100):
            p0 = multiprocessing.Process(target=file_op_process)
            p0.start()
            process_list.append(p0)

        #time.sleep(0.002)
        for p in process_list:
            if p.is_alive():
                p.join(0.01)
            else:
                del p

```

leilei.lin (2):
  fs/dcache.c: New copy_dname method
  fsnotify: use method copy_dname copying filename

 fs/dcache.c            | 36 ++++++++++++++++++++++++++++++++++++
 fs/notify/fsnotify.c   | 14 ++++++++++++--
 include/linux/dcache.h |  2 ++
 3 files changed, 50 insertions(+), 2 deletions(-)

-- 
2.8.4.31.g9ed660f

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 1/2] fs/dcache.c: New copy_dname method
  2017-05-31  3:54 [PATCH 0/2] fsnotify: fix mem overwritten 石祤
@ 2017-05-31  3:54 ` 石祤
  2017-05-31  3:54 ` [PATCH 2/2] fsnotify: use method copy_dname copying filename 石祤
  1 sibling, 0 replies; 5+ messages in thread
From: 石祤 @ 2017-05-31  3:54 UTC (permalink / raw)
  To: leilei.lin, viro, linux-kernel, linux-fsdevel, zhiche.yy

From: "leilei.lin" <leilei.lin@alibaba-inc.com>

locklessly and integrally copy dentry name. It will be used later for
bugfix

Signed-off-by: leilei.lin <leilei.lin@alibaba-inc.com>
---
 fs/dcache.c            | 38 ++++++++++++++++++++++++++++++++++++++
 include/linux/dcache.h |  2 ++
 2 files changed, 40 insertions(+)

diff --git a/fs/dcache.c b/fs/dcache.c
index 95d71ed..056249b 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -3536,6 +3536,44 @@ void d_tmpfile(struct dentry *dentry, struct inode *inode)
 }
 EXPORT_SYMBOL(d_tmpfile);
 
+int copy_dname(struct dentry *dentry, char **name)
+{
+	unsigned int seq;
+	const char *dname;
+
+	/* in case that dentry was being free */
+	rcu_read_lock();
+
+	dname = dentry->d_name.name;
+
+	if (unlikely(!dname)) {
+		rcu_read_unlock();
+		return 0;
+	}
+
+	if (unlikely(dname != dentry->d_iname)) {
+		*name = kstrdup(dname, GFP_ATOMIC);
+		rcu_read_unlock();
+		if (unlikely(!*name))
+			return -ENOMEM;
+
+		return 0;
+	}
+
+	rcu_read_unlock();
+	*name = kmalloc(DNAME_INLINE_LEN, GFP_KERNEL);
+	if (unlikely(!*name))
+		return -ENOMEM;
+
+	do {
+		seq = read_seqcount_begin(&dentry->d_seq);
+		strcpy(*name, dname);
+	} while (read_seqcount_retry(&dentry->d_seq, seq));
+
+	return 0;
+}
+EXPORT_SYMBOL(copy_dname);
+
 static __initdata unsigned long dhash_entries;
 static int __init set_dhash_entries(char *str)
 {
diff --git a/include/linux/dcache.h b/include/linux/dcache.h
index d2e38dc..a1b2aeb 100644
--- a/include/linux/dcache.h
+++ b/include/linux/dcache.h
@@ -296,6 +296,8 @@ extern char *d_path(const struct path *, char *, int);
 extern char *dentry_path_raw(struct dentry *, char *, int);
 extern char *dentry_path(struct dentry *, char *, int);
 
+extern int copy_dname(struct dentry *dentry, char **name);
+
 /* Allocation counts.. */
 
 /**
-- 
2.8.4.31.g9ed660f

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 2/2] fsnotify: use method copy_dname copying filename
  2017-05-31  3:54 [PATCH 0/2] fsnotify: fix mem overwritten 石祤
  2017-05-31  3:54 ` [PATCH 1/2] fs/dcache.c: New copy_dname method 石祤
@ 2017-05-31  3:54 ` 石祤
  2017-08-04  3:58   ` 林守磊
  1 sibling, 1 reply; 5+ messages in thread
From: 石祤 @ 2017-05-31  3:54 UTC (permalink / raw)
  To: leilei.lin, viro, linux-kernel, linux-fsdevel, zhiche.yy

From: "leilei.lin" <leilei.lin@alibaba-inc.com>

Kernel got panicked in inotify_handle_event, while running the rename
operation against the same file. The root cause is that the race between
fsnotify thread and file rename thread.  When fsnotify thread was
copying the dentry name, another rename thread could change the dentry
name at same time. With slub_debug=FZ boot args, this bug will trigger
a trace like the following:

[   87.733653] =============================================================================
[   87.735350] BUG kmalloc-64 (Not tainted): Redzone overwritten
[   87.736550] -----------------------------------------------------------------------------

[   87.738466] Disabling lock debugging due to kernel taint
[   87.739556] INFO: 0xffff8e487a50b0f8-0xffff8e487a50b0fc. First byte 0x33 instead of 0xcc
[   87.741188] INFO: Slab 0xfffff116c0e942c0 objects=46 used=43 fp=0xffff8e487a50bf80 flags=0xffff8000000101
[   87.743133] INFO: Object 0xffff8e487a50b0b8 @offset=184 fp=0xffff8e487a50b0b8

[   87.744942] Redzone ffff8e487a50b0b0: cc cc cc cc cc cc cc cc                          ........
[   87.746743] Object ffff8e487a50b0b8: b8 b0 50 7a 48 8e ff ff b8 b0 50 7a 48 8e ff ff  ..PzH.....PzH...
[   87.748621] Object ffff8e487a50b0c8: 60 75 7e 7b 48 8e ff ff 08 00 00 08 00 00 00 00  `u~{H...........
[   87.750583] Object ffff8e487a50b0d8: 01 00 00 00 00 00 00 00 0d 00 00 00 74 64 63 5f  ............tdc_
[   87.752541] Object ffff8e487a50b0e8: 61 64 6d 69 6e 2e 4c 4f 47 2e 31 31 32 33 31 32  admin.LOG.112312
[   87.754431] Redzone ffff8e487a50b0f8: 33 31 32 33 00 cc cc cc                          3123....
[   87.756172] Padding ffff8e487a50b100: 00 00 00 00 00 00 00 00                          ........
[   87.757988] CPU: 0 PID: 286 Comm: python Tainted: G    B           4.11.0-rc4+ #29
[   87.759574] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014
[   87.761878] Call Trace:
[   87.762381]  dump_stack+0x65/0x88
[   87.763063]  print_trailer+0x15d/0x250
[   87.763833]  check_bytes_and_report+0xcd/0x110
[   87.764731]  check_object+0x1ce/0x290
[   87.765472]  free_debug_processing+0x9c/0x2e3
[   87.766362]  ? inotify_free_event+0xe/0x10
[   87.767191]  __slab_free+0x1ba/0x2b0
[   87.767922]  ? async_page_fault+0x28/0x30
[   87.768731]  ? inotify_free_event+0xe/0x10
[   87.769558]  kfree+0x165/0x1a0
[   87.770184]  inotify_free_event+0xe/0x10
[   87.770974]  fsnotify_destroy_event+0x51/0x70
[   87.771851]  inotify_read+0x1dc/0x3a0
[   87.772582]  ? do_wait_intr_irq+0xa0/0xa0
[   87.773388]  __vfs_read+0x37/0x150
[   87.774078]  ? security_file_permission+0x9d/0xc0
[   87.775009]  vfs_read+0x8c/0x130
[   87.775657]  SyS_read+0x55/0xc0
[   87.776328]  entry_SYSCALL_64_fastpath+0x1e/0xad
[   87.777280] RIP: 0033:0x7fcc1375b210
[   87.778001] RSP: 002b:00007ffe2f00b838 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[   87.779513] RAX: ffffffffffffffda RBX: 00007fcc1303d7b8 RCX: 00007fcc1375b210
[   87.780932] RDX: 0000000000005c70 RSI: 00000000013fe9f4 RDI: 0000000000000004
[   87.782337] RBP: 00007fcc1303d760 R08: 0000000000000080 R09: 0000000000005c95
[   87.783780] R10: 0000000000000073 R11: 0000000000000246 R12: 0000000000005c95
[   87.785203] R13: 0000000000002708 R14: 0000000000005ca1 R15: 00007fcc1303d7b8
[   87.786636] FIX kmalloc-64: Restoring 0xffff8e487a50b0f8-0xffff8e487a50b0fc=0xcc

[   87.789388] FIX kmalloc-64: Object at 0xffff8e487a50b0b8 not freed

Graph below is the flow of inotify subsystem handling
notify event. If a rename syscall happened simultaneously,
for example filename of "foobar" is rename to
"foobar_longername", which would access memory illegally.

            CPU 1                                       CPU 2

     fsnotify()
       inotify_handle_event()
         strlen(file_name) // file_name -> "foobar"

                                                    rename happen
                                                    file_name -> "foobar_longername"

         alloc_len += len + 1;
         event = kmalloc(alloc_len, GFP_KERNEL);
         strcpy(event->name, file_name); -> overwritten

Signed-off-by: leilei.lin <leilei.lin@alibaba-inc.com>
---
 fs/notify/fsnotify.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
index b41515d..2c6f94d 100644
--- a/fs/notify/fsnotify.c
+++ b/fs/notify/fsnotify.c
@@ -91,6 +91,7 @@ int __fsnotify_parent(const struct path *path, struct dentry *dentry, __u32 mask
 	struct dentry *parent;
 	struct inode *p_inode;
 	int ret = 0;
+	char *name = NULL;
 
 	if (!dentry)
 		dentry = path->dentry;
@@ -108,14 +109,23 @@ int __fsnotify_parent(const struct path *path, struct dentry *dentry, __u32 mask
 		 * specifies these are events which came from a child. */
 		mask |= FS_EVENT_ON_CHILD;
 
+		ret = copy_dname(dentry, &name);
+
+		if (ret) {
+			dput(parent);
+			return ret;
+		}
+
 		if (path)
 			ret = fsnotify(p_inode, mask, path, FSNOTIFY_EVENT_PATH,
-				       dentry->d_name.name, 0);
+				       name, 0);
 		else
 			ret = fsnotify(p_inode, mask, dentry->d_inode, FSNOTIFY_EVENT_INODE,
-				       dentry->d_name.name, 0);
+				       name, 0);
 	}
 
+	kfree(name);
+
 	dput(parent);
 
 	return ret;
-- 
2.8.4.31.g9ed660f

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH 2/2] fsnotify: use method copy_dname copying filename
  2017-05-31  3:54 ` [PATCH 2/2] fsnotify: use method copy_dname copying filename 石祤
@ 2017-08-04  3:58   ` 林守磊
  2017-08-04  5:18     ` Al Viro
  0 siblings, 1 reply; 5+ messages in thread
From: 林守磊 @ 2017-08-04  3:58 UTC (permalink / raw)
  To: 石祤,
	viro, linux-kernel, linux-fsdevel, zhiche.yy, torvalds, linux-mm

Hi all

I sent this patch two months ago, then I found CVE from this link last night

    http://seclists.org/oss-sec/2017/q3/240

which not only references this patch, but also provides a upstream fix

    https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=49d31c2f389acfe83417083e1208422b4091cd9

I was wondering why @viro hadn't noticed this mail (And @viro fixed
this). I'm new here and nobody,
trying to do my best to help the this linux community. I was looking
forword to your reply, because some
insufficiency may exists in my work, I'd like to learn from you guys

Thanks and humble enough to wait your reply

在 2017年5月31日 上午11:54,石祤 <linxiulei@gmail.com> 写道:
> From: "leilei.lin" <leilei.lin@alibaba-inc.com>
>
> Kernel got panicked in inotify_handle_event, while running the rename
> operation against the same file. The root cause is that the race between
> fsnotify thread and file rename thread.  When fsnotify thread was
> copying the dentry name, another rename thread could change the dentry
> name at same time. With slub_debug=FZ boot args, this bug will trigger
> a trace like the following:
>
> [   87.733653] =============================================================================
> [   87.735350] BUG kmalloc-64 (Not tainted): Redzone overwritten
> [   87.736550] -----------------------------------------------------------------------------
>
> [   87.738466] Disabling lock debugging due to kernel taint
> [   87.739556] INFO: 0xffff8e487a50b0f8-0xffff8e487a50b0fc. First byte 0x33 instead of 0xcc
> [   87.741188] INFO: Slab 0xfffff116c0e942c0 objects=46 used=43 fp=0xffff8e487a50bf80 flags=0xffff8000000101
> [   87.743133] INFO: Object 0xffff8e487a50b0b8 @offset=184 fp=0xffff8e487a50b0b8
>
> [   87.744942] Redzone ffff8e487a50b0b0: cc cc cc cc cc cc cc cc                          ........
> [   87.746743] Object ffff8e487a50b0b8: b8 b0 50 7a 48 8e ff ff b8 b0 50 7a 48 8e ff ff  ..PzH.....PzH...
> [   87.748621] Object ffff8e487a50b0c8: 60 75 7e 7b 48 8e ff ff 08 00 00 08 00 00 00 00  `u~{H...........
> [   87.750583] Object ffff8e487a50b0d8: 01 00 00 00 00 00 00 00 0d 00 00 00 74 64 63 5f  ............tdc_
> [   87.752541] Object ffff8e487a50b0e8: 61 64 6d 69 6e 2e 4c 4f 47 2e 31 31 32 33 31 32  admin.LOG.112312
> [   87.754431] Redzone ffff8e487a50b0f8: 33 31 32 33 00 cc cc cc                          3123....
> [   87.756172] Padding ffff8e487a50b100: 00 00 00 00 00 00 00 00                          ........
> [   87.757988] CPU: 0 PID: 286 Comm: python Tainted: G    B           4.11.0-rc4+ #29
> [   87.759574] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014
> [   87.761878] Call Trace:
> [   87.762381]  dump_stack+0x65/0x88
> [   87.763063]  print_trailer+0x15d/0x250
> [   87.763833]  check_bytes_and_report+0xcd/0x110
> [   87.764731]  check_object+0x1ce/0x290
> [   87.765472]  free_debug_processing+0x9c/0x2e3
> [   87.766362]  ? inotify_free_event+0xe/0x10
> [   87.767191]  __slab_free+0x1ba/0x2b0
> [   87.767922]  ? async_page_fault+0x28/0x30
> [   87.768731]  ? inotify_free_event+0xe/0x10
> [   87.769558]  kfree+0x165/0x1a0
> [   87.770184]  inotify_free_event+0xe/0x10
> [   87.770974]  fsnotify_destroy_event+0x51/0x70
> [   87.771851]  inotify_read+0x1dc/0x3a0
> [   87.772582]  ? do_wait_intr_irq+0xa0/0xa0
> [   87.773388]  __vfs_read+0x37/0x150
> [   87.774078]  ? security_file_permission+0x9d/0xc0
> [   87.775009]  vfs_read+0x8c/0x130
> [   87.775657]  SyS_read+0x55/0xc0
> [   87.776328]  entry_SYSCALL_64_fastpath+0x1e/0xad
> [   87.777280] RIP: 0033:0x7fcc1375b210
> [   87.778001] RSP: 002b:00007ffe2f00b838 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
> [   87.779513] RAX: ffffffffffffffda RBX: 00007fcc1303d7b8 RCX: 00007fcc1375b210
> [   87.780932] RDX: 0000000000005c70 RSI: 00000000013fe9f4 RDI: 0000000000000004
> [   87.782337] RBP: 00007fcc1303d760 R08: 0000000000000080 R09: 0000000000005c95
> [   87.783780] R10: 0000000000000073 R11: 0000000000000246 R12: 0000000000005c95
> [   87.785203] R13: 0000000000002708 R14: 0000000000005ca1 R15: 00007fcc1303d7b8
> [   87.786636] FIX kmalloc-64: Restoring 0xffff8e487a50b0f8-0xffff8e487a50b0fc=0xcc
>
> [   87.789388] FIX kmalloc-64: Object at 0xffff8e487a50b0b8 not freed
>
> Graph below is the flow of inotify subsystem handling
> notify event. If a rename syscall happened simultaneously,
> for example filename of "foobar" is rename to
> "foobar_longername", which would access memory illegally.
>
>             CPU 1                                       CPU 2
>
>      fsnotify()
>        inotify_handle_event()
>          strlen(file_name) // file_name -> "foobar"
>
>                                                     rename happen
>                                                     file_name -> "foobar_longername"
>
>          alloc_len += len + 1;
>          event = kmalloc(alloc_len, GFP_KERNEL);
>          strcpy(event->name, file_name); -> overwritten
>
> Signed-off-by: leilei.lin <leilei.lin@alibaba-inc.com>
> ---
>  fs/notify/fsnotify.c | 14 ++++++++++++--
>  1 file changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
> index b41515d..2c6f94d 100644
> --- a/fs/notify/fsnotify.c
> +++ b/fs/notify/fsnotify.c
> @@ -91,6 +91,7 @@ int __fsnotify_parent(const struct path *path, struct dentry *dentry, __u32 mask
>         struct dentry *parent;
>         struct inode *p_inode;
>         int ret = 0;
> +       char *name = NULL;
>
>         if (!dentry)
>                 dentry = path->dentry;
> @@ -108,14 +109,23 @@ int __fsnotify_parent(const struct path *path, struct dentry *dentry, __u32 mask
>                  * specifies these are events which came from a child. */
>                 mask |= FS_EVENT_ON_CHILD;
>
> +               ret = copy_dname(dentry, &name);
> +
> +               if (ret) {
> +                       dput(parent);
> +                       return ret;
> +               }
> +
>                 if (path)
>                         ret = fsnotify(p_inode, mask, path, FSNOTIFY_EVENT_PATH,
> -                                      dentry->d_name.name, 0);
> +                                      name, 0);
>                 else
>                         ret = fsnotify(p_inode, mask, dentry->d_inode, FSNOTIFY_EVENT_INODE,
> -                                      dentry->d_name.name, 0);
> +                                      name, 0);
>         }
>
> +       kfree(name);
> +
>         dput(parent);
>
>         return ret;
> --
> 2.8.4.31.g9ed660f
>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 2/2] fsnotify: use method copy_dname copying filename
  2017-08-04  3:58   ` 林守磊
@ 2017-08-04  5:18     ` Al Viro
  0 siblings, 0 replies; 5+ messages in thread
From: Al Viro @ 2017-08-04  5:18 UTC (permalink / raw)
  To: 林守磊
  Cc: 石祤,
	linux-kernel, linux-fsdevel, zhiche.yy, torvalds, linux-mm

On Fri, Aug 04, 2017 at 11:58:41AM +0800, 林守磊 wrote:
> Hi all
> 
> I sent this patch two months ago, then I found CVE from this link last night
> 
>     http://seclists.org/oss-sec/2017/q3/240
> 
> which not only references this patch, but also provides a upstream fix
> 
>     https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=49d31c2f389acfe83417083e1208422b4091cd9
> 
> I was wondering why @viro hadn't noticed this mail (And @viro fixed
> this). I'm new here and nobody,
> trying to do my best to help the this linux community. I was looking
> forword to your reply, because some
> insufficiency may exists in my work, I'd like to learn from you guys
> 
> Thanks and humble enough to wait your reply

Fair enough.  As for the reasons why allocation of name copy is a bad idea,
consider this: only short (embedded) names get overwritten on rename.
External ones (i.e. anything longer than 32 bytes or so) are unmodified
until freed.  And their lifetime is controlled by a refcount, so we can
trivially get a guaranteed to be stable name in that case - all it takes
is bumping the refcount and the name _will_ stay around until we drop
the reference.  So we are left with the case of short names and that
is trivial to deal with - 32-byte array is small enough, so we can bloody
well have it on caller's stack and copy the (short) name there.
That approach avoids all the headache with allocation, allocation failure
handling, etc.  Stack footprint is not much higher (especially compared
to how much idiotify and friends stomp on the stack) and it's obviously
cheaper - we only copy the name in short case and we never go into
allocator.  And it's just as easy to use as "make a dynamic copy" variant
of API...

Al, still buried in packing boxes at the moment...

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2017-08-04  5:18 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-31  3:54 [PATCH 0/2] fsnotify: fix mem overwritten 石祤
2017-05-31  3:54 ` [PATCH 1/2] fs/dcache.c: New copy_dname method 石祤
2017-05-31  3:54 ` [PATCH 2/2] fsnotify: use method copy_dname copying filename 石祤
2017-08-04  3:58   ` 林守磊
2017-08-04  5:18     ` Al Viro

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).