From: Jann Horn <jannh@google.com>
To: Dmitry Safonov <dima@arista.com>
Cc: kernel list <linux-kernel@vger.kernel.org>,
	Adrian Reber <adrian@lisas.de>, Andrei Vagin <avagin@openvz.org>,
	Andy Lutomirski <luto@kernel.org>, Arnd Bergmann <arnd@arndb.de>,
	Christian Brauner <christian.brauner@ubuntu.com>,
	Cyrill Gorcunov <gorcunov@openvz.org>,
	Dmitry Safonov <0x7f454c46@gmail.com>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	"H. Peter Anvin" <hpa@zytor.com>, Ingo Molnar <mingo@redhat.com>,
	Jeff Dike <jdike@addtoit.com>, Oleg Nesterov <oleg@redhat.com>,
	Pavel Emelyanov <xemul@virtuozzo.com>,
	Shuah Khan <shuah@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Vincenzo Frascino <vincenzo.frascino@arm.com>,
	containers@lists.linux-foundation.org, criu@openvz.org,
	Linux API <linux-api@vger.kernel.org>,
	"the arch/x86 maintainers" <x86@kernel.org>
Subject: Re: [PATCHv3 15/27] x86/vdso: Allocate timens vdso
Date: Thu, 25 Apr 2019 20:32:17 +0200
Message-ID: <CAG48ez3nT_RtaHrjpKPRZDYTyzxX49QBZXJ+u2AZmFy5Wao4wQ@mail.gmail.com>
In-Reply-To: <20190425161416.26600-16-dima@arista.com>

On Thu, Apr 25, 2019 at 6:14 PM Dmitry Safonov <dima@arista.com> wrote:
>
> As it has been discussed on timens RFC, adding a new conditional branch
> `if (inside_time_ns)` on VDSO for all processes is undesirable.
> It will add a penalty for everybody as branch predictor may mispredict
> the jump. Also there are instruction cache lines wasted on cmp/jmp.
>
> Those effects of introducing time namespace are very much unwanted,
> bearing in mind how much work has been spent on micro-optimising
> vdso code.
>
> The proposal is to allocate a second vdso code with dynamically
> patched out (disabled by static_branch) timens code at boot time.
>
> Allocate another vdso and copy original code.
[...]
> diff --git a/arch/x86/entry/vdso/vma.c b/arch/x86/entry/vdso/vma.c
> index 80cbb2167eba..6aae9c0d400d 100644
> --- a/arch/x86/entry/vdso/vma.c
> +++ b/arch/x86/entry/vdso/vma.c
[...]
>  static vm_fault_t vdso_fault(const struct vm_special_mapping *sm,
>                       struct vm_area_struct *vma, struct vm_fault *vmf)
>  {
>         const struct vdso_image *image = vma->vm_mm->context.vdso_image;
> +       unsigned long offset = vmf->pgoff << PAGE_SHIFT;
>
>         if (!image || (vmf->pgoff << PAGE_SHIFT) >= image->size)
>                 return VM_FAULT_SIGBUS;
>
> -       vmf->page = virt_to_page(image->text + (vmf->pgoff << PAGE_SHIFT));
> +       if (current_timens_offsets() && image->text_timens)

I'm pretty sure that accessing `current` in here is wrong. AFAIK this
fault handler can be invoked on remote processes, through interfaces
like /proc/$pid/mem and process_vm_readv(); in that case, the kernel
should install a page based on the time namespace of the target
process, not based on the time namespace of the caller.

> +               vmf->page = vmalloc_to_page(image->text_timens + offset);
> +       else
> +               vmf->page = virt_to_page(image->text + offset);
> +
>         get_page(vmf->page);
>         return 0;
>  }
[...]
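
To illustrate the concern about `current`, here is a minimal user-space
sketch, not kernel code. Every `toy_*` name is hypothetical, and keeping
an `in_timens` flag on the mm is purely an assumption made for the
example; where the kernel should keep that per-target state is not
settled in this mail. It only shows that a handler keyed on the caller
and a handler keyed on the owner of the mapping can disagree when the
fault is triggered remotely.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these types are the kernel's real structures. */
struct toy_mm {
	bool in_timens;		/* assumption: per-address-space timens flag */
};

struct toy_task {
	struct toy_mm *mm;
};

struct toy_vma {
	struct toy_mm *vm_mm;	/* address space that owns this mapping */
};

enum toy_page { TOY_PAGE_TEXT, TOY_PAGE_TEXT_TIMENS };

/* Stand-in for `current`: the task that is executing the fault path. */
static struct toy_task *toy_current;

/* Decision keyed on the caller, as in the quoted hunk. */
static enum toy_page fault_by_current(const struct toy_vma *vma)
{
	(void)vma;
	return toy_current->mm->in_timens ? TOY_PAGE_TEXT_TIMENS : TOY_PAGE_TEXT;
}

/* Decision keyed on the target, i.e. the owner of the mapping. */
static enum toy_page fault_by_vma(const struct toy_vma *vma)
{
	return vma->vm_mm->in_timens ? TOY_PAGE_TEXT_TIMENS : TOY_PAGE_TEXT;
}

/* Simulate a remote access: `caller` faults in a page of another mm. */
static enum toy_page remote_fault(struct toy_task *caller,
				  const struct toy_vma *target_vma,
				  enum toy_page (*handler)(const struct toy_vma *))
{
	toy_current = caller;
	return handler(target_vma);
}

/* A task outside any time namespace touches the vdso of one inside. */
void toy_demo(void)
{
	struct toy_mm host_mm = { .in_timens = false };
	struct toy_mm ns_mm = { .in_timens = true };
	struct toy_task debugger = { .mm = &host_mm };
	struct toy_vma target_vdso = { .vm_mm = &ns_mm };

	/* Keyed on the caller: the target gets the non-timens page. */
	assert(remote_fault(&debugger, &target_vdso, fault_by_current)
	       == TOY_PAGE_TEXT);
	/* Keyed on the target: the timens page is installed. */
	assert(remote_fault(&debugger, &target_vdso, fault_by_vma)
	       == TOY_PAGE_TEXT_TIMENS);
}
```

In this model the first handler installs a page chosen by the time
namespace of the caller, which is the behaviour questioned above; the
second one depends only on the target.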
