All of lore.kernel.org
 help / color / mirror / Atom feed
From: Muchun Song <songmuchun@bytedance.com>
To: mike.kravetz@oracle.com, Andrew Morton <akpm@linux-foundation.org>
Cc: ak@linux.intel.com,
	Linux Memory Management List <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] mm/hugetlb: Fix a race between hugetlb sysctl handlers
Date: Tue, 25 Aug 2020 10:42:58 +0800	[thread overview]
Message-ID: <CAMZfGtUCeD_zLr+VY--3fcCuB+_P14tSPPeeSR6aG5KnBU36Gg@mail.gmail.com> (raw)
In-Reply-To: <20200822095328.61306-1-songmuchun@bytedance.com>

Hi Andrew and Mike,

On Sat, Aug 22, 2020 at 5:53 PM Muchun Song <songmuchun@bytedance.com> wrote:
>
> There is a race between the assignment of `table->data` and write value
> to the pointer of `table->data` in the __do_proc_doulongvec_minmax().
> Fix this by duplicating the `table`, and only update the duplicate of
> it. And introduce a helper of proc_hugetlb_doulongvec_minmax() to
> simplify the code.

I am sorry, I didn't expose more details about how the race happened.

CPU0:                                     CPU1:
                                          proc_sys_write
hugetlb_sysctl_handler                      proc_sys_call_handler
hugetlb_sysctl_handler_common                 hugetlb_sysctl_handler
  table->data = &tmp;                           hugetlb_sysctl_handler_common
                                                  table->data = &tmp;
    proc_doulongvec_minmax
      do_proc_doulongvec_minmax             sysctl_head_finish
        __do_proc_doulongvec_minmax
          i = table->data;
          *i = val;     // corrupt CPU1 stack

>
> The following oops was seen:
>
>     BUG: kernel NULL pointer dereference, address: 0000000000000000
>     #PF: supervisor instruction fetch in kernel mode
>     #PF: error_code(0x0010) - not-present page
>     Code: Bad RIP value.

Here we can see the "Bad RIP value", so the stack frame is corrupted by
others.

>     ...
>     Call Trace:
>      ? set_max_huge_pages+0x3da/0x4f0
>      ? alloc_pool_huge_page+0x150/0x150
>      ? proc_doulongvec_minmax+0x46/0x60
>      ? hugetlb_sysctl_handler_common+0x1c7/0x200
>      ? nr_hugepages_store+0x20/0x20
>      ? copy_fd_bitmaps+0x170/0x170
>      ? hugetlb_sysctl_handler+0x1e/0x20
>      ? proc_sys_call_handler+0x2f1/0x300
>      ? unregister_sysctl_table+0xb0/0xb0
>      ? __fd_install+0x78/0x100
>      ? proc_sys_write+0x14/0x20
>      ? __vfs_write+0x4d/0x90
>      ? vfs_write+0xef/0x240
>      ? ksys_write+0xc0/0x160
>      ? __ia32_sys_read+0x50/0x50
>      ? __close_fd+0x129/0x150
>      ? __x64_sys_write+0x43/0x50
>      ? do_syscall_64+0x6c/0x200
>      ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> Fixes: e5ff215941d5 ("hugetlb: multiple hstates for multiple page sizes")
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> ---
>  mm/hugetlb.c | 27 +++++++++++++++++++++------
>  1 file changed, 21 insertions(+), 6 deletions(-)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index a301c2d672bf..818d6125af49 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -3454,6 +3454,23 @@ static unsigned int allowed_mems_nr(struct hstate *h)
>  }
>
>  #ifdef CONFIG_SYSCTL
> +static int proc_hugetlb_doulongvec_minmax(struct ctl_table *table, int write,
> +                                         void *buffer, size_t *length,
> +                                         loff_t *ppos, unsigned long *out)
> +{
> +       struct ctl_table dup_table;
> +
> +       /*
> +        * In order to avoid races with __do_proc_doulongvec_minmax(), we
> +        * can duplicate the @table and alter the duplicate of it.
> +        */
> +       dup_table = *table;
> +       dup_table.data = out;
> +       dup_table.maxlen = sizeof(unsigned long);
> +
> +       return proc_doulongvec_minmax(&dup_table, write, buffer, length, ppos);
> +}
> +
>  static int hugetlb_sysctl_handler_common(bool obey_mempolicy,
>                          struct ctl_table *table, int write,
>                          void *buffer, size_t *length, loff_t *ppos)
> @@ -3465,9 +3482,8 @@ static int hugetlb_sysctl_handler_common(bool obey_mempolicy,
>         if (!hugepages_supported())
>                 return -EOPNOTSUPP;
>
> -       table->data = &tmp;
> -       table->maxlen = sizeof(unsigned long);
> -       ret = proc_doulongvec_minmax(table, write, buffer, length, ppos);
> +       ret = proc_hugetlb_doulongvec_minmax(table, write, buffer, length, ppos,
> +                                            &tmp);
>         if (ret)
>                 goto out;
>
> @@ -3510,9 +3526,8 @@ int hugetlb_overcommit_handler(struct ctl_table *table, int write,
>         if (write && hstate_is_gigantic(h))
>                 return -EINVAL;
>
> -       table->data = &tmp;
> -       table->maxlen = sizeof(unsigned long);
> -       ret = proc_doulongvec_minmax(table, write, buffer, length, ppos);
> +       ret = proc_hugetlb_doulongvec_minmax(table, write, buffer, length, ppos,
> +                                            &tmp);
>         if (ret)
>                 goto out;
>
> --
> 2.11.0
>


-- 
Yours,
Muchun

  parent reply	other threads:[~2020-08-25  2:43 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-08-22  9:53 [PATCH] mm/hugetlb: Fix a race between hugetlb sysctl handlers Muchun Song
2020-08-24 20:59 ` Andrew Morton
2020-08-24 21:19   ` Mike Kravetz
2020-08-25  3:01     ` [External] " Muchun Song
2020-08-25  3:01       ` Muchun Song
2020-08-26  0:01       ` Mike Kravetz
2020-08-26  2:47         ` Muchun Song
2020-08-26  2:47           ` Muchun Song
2020-08-27 21:51           ` Mike Kravetz
2020-08-28  2:33             ` Muchun Song
2020-08-28  2:33               ` Muchun Song
2020-08-25  2:42 ` Muchun Song [this message]
2020-08-25  2:42   ` Muchun Song
2020-08-25 15:25 ` Andi Kleen
2020-08-26  2:34   ` [Phishing Risk] [External] " Muchun Song
2020-08-26  2:34     ` Muchun Song

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAMZfGtUCeD_zLr+VY--3fcCuB+_P14tSPPeeSR6aG5KnBU36Gg@mail.gmail.com \
    --to=songmuchun@bytedance.com \
    --cc=ak@linux.intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mike.kravetz@oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.