linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] mm/hugetlb: Fix a race between hugetlb sysctl handlers
@ 2020-08-28  3:11 Muchun Song
  2020-08-28 14:46 ` Mike Kravetz
  2020-08-28 14:59 ` Andi Kleen
  0 siblings, 2 replies; 5+ messages in thread
From: Muchun Song @ 2020-08-28  3:11 UTC (permalink / raw)
  To: mike.kravetz, akpm; +Cc: ak, linux-mm, linux-kernel, Muchun Song

There is a race between the assignment of `table->data` and write value
to the pointer of `table->data` in the __do_proc_doulongvec_minmax() on
the other thread.

CPU0:                                 CPU1:
                                      proc_sys_write
hugetlb_sysctl_handler                  proc_sys_call_handler
hugetlb_sysctl_handler_common             hugetlb_sysctl_handler
  table->data = &tmp;                       hugetlb_sysctl_handler_common
                                              table->data = &tmp;
    proc_doulongvec_minmax
      do_proc_doulongvec_minmax           sysctl_head_finish
        __do_proc_doulongvec_minmax         unuse_table
          i = table->data;
          *i = val;  // corrupt CPU1's stack

Fix this by duplicating the `table`, and only update the duplicate of
it. And introduce a helper of proc_hugetlb_doulongvec_minmax() to
simplify the code.

The following oops was seen:

    BUG: kernel NULL pointer dereference, address: 0000000000000000
    #PF: supervisor instruction fetch in kernel mode
    #PF: error_code(0x0010) - not-present page
    Code: Bad RIP value.
    ...
    Call Trace:
     ? set_max_huge_pages+0x3da/0x4f0
     ? alloc_pool_huge_page+0x150/0x150
     ? proc_doulongvec_minmax+0x46/0x60
     ? hugetlb_sysctl_handler_common+0x1c7/0x200
     ? nr_hugepages_store+0x20/0x20
     ? copy_fd_bitmaps+0x170/0x170
     ? hugetlb_sysctl_handler+0x1e/0x20
     ? proc_sys_call_handler+0x2f1/0x300
     ? unregister_sysctl_table+0xb0/0xb0
     ? __fd_install+0x78/0x100
     ? proc_sys_write+0x14/0x20
     ? __vfs_write+0x4d/0x90
     ? vfs_write+0xef/0x240
     ? ksys_write+0xc0/0x160
     ? __ia32_sys_read+0x50/0x50
     ? __close_fd+0x129/0x150
     ? __x64_sys_write+0x43/0x50
     ? do_syscall_64+0x6c/0x200
     ? entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: e5ff215941d5 ("hugetlb: multiple hstates for multiple page sizes")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
---
chagelogs in v2:
 1. Add more details about how the race happened to the commit message.
 2. Remove unnecessary assignment of table->maxlen.

 mm/hugetlb.c | 26 ++++++++++++++++++++------
 1 file changed, 20 insertions(+), 6 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index a301c2d672bf..4c2a2620eeed 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3454,6 +3454,22 @@ static unsigned int allowed_mems_nr(struct hstate *h)
 }
 
 #ifdef CONFIG_SYSCTL
+static int proc_hugetlb_doulongvec_minmax(struct ctl_table *table, int write,
+					  void *buffer, size_t *length,
+					  loff_t *ppos, unsigned long *out)
+{
+	struct ctl_table dup_table;
+
+	/*
+	 * In order to avoid races with __do_proc_doulongvec_minmax(), we
+	 * can duplicate the @table and alter the duplicate of it.
+	 */
+	dup_table = *table;
+	dup_table.data = out;
+
+	return proc_doulongvec_minmax(&dup_table, write, buffer, length, ppos);
+}
+
 static int hugetlb_sysctl_handler_common(bool obey_mempolicy,
 			 struct ctl_table *table, int write,
 			 void *buffer, size_t *length, loff_t *ppos)
@@ -3465,9 +3481,8 @@ static int hugetlb_sysctl_handler_common(bool obey_mempolicy,
 	if (!hugepages_supported())
 		return -EOPNOTSUPP;
 
-	table->data = &tmp;
-	table->maxlen = sizeof(unsigned long);
-	ret = proc_doulongvec_minmax(table, write, buffer, length, ppos);
+	ret = proc_hugetlb_doulongvec_minmax(table, write, buffer, length, ppos,
+					     &tmp);
 	if (ret)
 		goto out;
 
@@ -3510,9 +3525,8 @@ int hugetlb_overcommit_handler(struct ctl_table *table, int write,
 	if (write && hstate_is_gigantic(h))
 		return -EINVAL;
 
-	table->data = &tmp;
-	table->maxlen = sizeof(unsigned long);
-	ret = proc_doulongvec_minmax(table, write, buffer, length, ppos);
+	ret = proc_hugetlb_doulongvec_minmax(table, write, buffer, length, ppos,
+					     &tmp);
 	if (ret)
 		goto out;
 
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] mm/hugetlb: Fix a race between hugetlb sysctl handlers
  2020-08-28  3:11 [PATCH v2] mm/hugetlb: Fix a race between hugetlb sysctl handlers Muchun Song
@ 2020-08-28 14:46 ` Mike Kravetz
  2020-08-28 14:59 ` Andi Kleen
  1 sibling, 0 replies; 5+ messages in thread
From: Mike Kravetz @ 2020-08-28 14:46 UTC (permalink / raw)
  To: Muchun Song, akpm; +Cc: ak, linux-mm, linux-kernel

On 8/27/20 8:11 PM, Muchun Song wrote:
> There is a race between the assignment of `table->data` and write value
> to the pointer of `table->data` in the __do_proc_doulongvec_minmax() on
> the other thread.
> 
> CPU0:                                 CPU1:
>                                       proc_sys_write
> hugetlb_sysctl_handler                  proc_sys_call_handler
> hugetlb_sysctl_handler_common             hugetlb_sysctl_handler
>   table->data = &tmp;                       hugetlb_sysctl_handler_common
>                                               table->data = &tmp;
>     proc_doulongvec_minmax
>       do_proc_doulongvec_minmax           sysctl_head_finish
>         __do_proc_doulongvec_minmax         unuse_table
>           i = table->data;
>           *i = val;  // corrupt CPU1's stack
> 
> Fix this by duplicating the `table`, and only update the duplicate of
> it. And introduce a helper of proc_hugetlb_doulongvec_minmax() to
> simplify the code.
> 
> The following oops was seen:
> 
>     BUG: kernel NULL pointer dereference, address: 0000000000000000
>     #PF: supervisor instruction fetch in kernel mode
>     #PF: error_code(0x0010) - not-present page
>     Code: Bad RIP value.
>     ...
>     Call Trace:
>      ? set_max_huge_pages+0x3da/0x4f0
>      ? alloc_pool_huge_page+0x150/0x150
>      ? proc_doulongvec_minmax+0x46/0x60
>      ? hugetlb_sysctl_handler_common+0x1c7/0x200
>      ? nr_hugepages_store+0x20/0x20
>      ? copy_fd_bitmaps+0x170/0x170
>      ? hugetlb_sysctl_handler+0x1e/0x20
>      ? proc_sys_call_handler+0x2f1/0x300
>      ? unregister_sysctl_table+0xb0/0xb0
>      ? __fd_install+0x78/0x100
>      ? proc_sys_write+0x14/0x20
>      ? __vfs_write+0x4d/0x90
>      ? vfs_write+0xef/0x240
>      ? ksys_write+0xc0/0x160
>      ? __ia32_sys_read+0x50/0x50
>      ? __close_fd+0x129/0x150
>      ? __x64_sys_write+0x43/0x50
>      ? do_syscall_64+0x6c/0x200
>      ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 
> Fixes: e5ff215941d5 ("hugetlb: multiple hstates for multiple page sizes")
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>

Thank you!

Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>

-- 
Mike Kravetz

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] mm/hugetlb: Fix a race between hugetlb sysctl handlers
  2020-08-28  3:11 [PATCH v2] mm/hugetlb: Fix a race between hugetlb sysctl handlers Muchun Song
  2020-08-28 14:46 ` Mike Kravetz
@ 2020-08-28 14:59 ` Andi Kleen
  2020-08-28 17:14   ` Mike Kravetz
  1 sibling, 1 reply; 5+ messages in thread
From: Andi Kleen @ 2020-08-28 14:59 UTC (permalink / raw)
  To: Muchun Song; +Cc: mike.kravetz, akpm, linux-mm, linux-kernel

> Fixes: e5ff215941d5 ("hugetlb: multiple hstates for multiple page sizes")

I believe the Fixes line is still not correct. The original patch
didn't have that problem. Please identify which patch added
the problematic code.

-Andi

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] mm/hugetlb: Fix a race between hugetlb sysctl handlers
  2020-08-28 14:59 ` Andi Kleen
@ 2020-08-28 17:14   ` Mike Kravetz
  2020-08-28 20:30     ` Andi Kleen
  0 siblings, 1 reply; 5+ messages in thread
From: Mike Kravetz @ 2020-08-28 17:14 UTC (permalink / raw)
  To: Andi Kleen, Muchun Song; +Cc: akpm, linux-mm, linux-kernel

On 8/28/20 7:59 AM, Andi Kleen wrote:
>> Fixes: e5ff215941d5 ("hugetlb: multiple hstates for multiple page sizes")
> 
> I believe the Fixes line is still not correct. The original patch
> didn't have that problem. Please identify which patch added
> the problematic code.

Hi Andi,

I agree with Muchun's assessment that the issue was caused by e5ff215941d5.
Why?

Commit e5ff215941d5 changed initialization of the .data field in the
ctl_table structure for hugetlb_sysctl_handler.  This was required because
the commit introduced multiple hstates.  As a result, it can not be known
until runtime the value for .data.  This code was added to
hugetlb_sysctl_handler to set the value at runtime.

@@ -1037,8 +1109,19 @@ int hugetlb_sysctl_handler(struct ctl_table *table, int write,
                           struct file *file, void __user *buffer,
                           size_t *length, loff_t *ppos)
 {
+       struct hstate *h = &default_hstate;
+       unsigned long tmp;
+
+       if (!write)
+               tmp = h->max_huge_pages;
+
+       table->data = &tmp;
+       table->maxlen = sizeof(unsigned long);

The problem is that two threads running in parallel can modify table->data
in the global data structure without any synchronization.  Worse yet, is
that that value is local to their stacks.  That was the root cause of the
issue addressed by Muchun's patch.

Does that analysis make sense?  Or, are we missing something.
-- 
Mike Kravetz

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] mm/hugetlb: Fix a race between hugetlb sysctl handlers
  2020-08-28 17:14   ` Mike Kravetz
@ 2020-08-28 20:30     ` Andi Kleen
  0 siblings, 0 replies; 5+ messages in thread
From: Andi Kleen @ 2020-08-28 20:30 UTC (permalink / raw)
  To: Mike Kravetz; +Cc: Muchun Song, akpm, linux-mm, linux-kernel

On Fri, Aug 28, 2020 at 10:14:16AM -0700, Mike Kravetz wrote:
> On 8/28/20 7:59 AM, Andi Kleen wrote:
> >> Fixes: e5ff215941d5 ("hugetlb: multiple hstates for multiple page sizes")
> > 
> > I believe the Fixes line is still not correct. The original patch
> > didn't have that problem. Please identify which patch added
> > the problematic code.
> 
> Hi Andi,
> 
> I agree with Muchun's assessment that the issue was caused by e5ff215941d5.
> Why?

Yes when checking the code again I agree. It was introduced with that
patch. Patches look ok to me now.

-Andi

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2020-08-28 20:53 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-28  3:11 [PATCH v2] mm/hugetlb: Fix a race between hugetlb sysctl handlers Muchun Song
2020-08-28 14:46 ` Mike Kravetz
2020-08-28 14:59 ` Andi Kleen
2020-08-28 17:14   ` Mike Kravetz
2020-08-28 20:30     ` Andi Kleen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).