All of lore.kernel.org
 help / color / mirror / Atom feed
From: u3557@miso.sublimeip.com (Amnon Shiloh)
To: oleg@redhat.com (Oleg Nesterov)
Cc: rostedt@goodmis.org (Steven Rostedt),
	fweisbec@gmail.com (Frederic Weisbecker),
	mingo@redhat.com (Ingo Molnar),
	a.p.zijlstra@chello.nl (Peter Zijlstra),
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] arch_check_bp_in_kernelspace: fix the range check
Date: Thu, 22 Nov 2012 04:30:43 +1100 (EST)	[thread overview]
Message-ID: <20121121173043.F0319592076@miso.sublimeip.com> (raw)
In-Reply-To: <20121121141627.GB21030@redhat.com>

Hi Oleg,

Yes, I can see that "arch/x86/kernel/vsyscall_64.c"
has changed dramatically since I last looked at it.

Since this is the case, I no longer need to trap the vsyscall page.

Now however, that "vsyscall" was effectively replaced by vdso, it
creates a new problem for me and probably for anyone else who uses
some form of checkpoint/restore:

Suppose a process is checkpointed because the system needs to reboot
for a kernel-upgrade, then restored on the new and different kernel.
The new VDSO page may no longer match the new kernel - it could for
example fetch data from addresses in the vsyscall page that now
contain different things; or in case the hardware also was changed,
it may use machine-instructions that are now illegal.

As I don't mind to forego the "fast" sys_time(), my obvious solution
is to disable the vdso for traced processes that may be checkpointed.

One way to do it would be by brute-force: straight after "execve"
unmap the tracee's vdso page, then manipulate the ELF tables in
its memory so the VDSO entry is gone and the library will not go
looking for it.  Alternately, the function-table within the VDSO
page can be erased.

I just wonder whether you know of an easier and more standard way
to disable the vdso in user-mode - ideally on a per-process basis,
or otherwise, if it's too hard, on the whole computer.  I searched
the web and found references to "/proc/sys/vm/vdso_enable", but I
have no such file or "sysctl" option on my system.

Best Regards,
Amnon.


> 
> Hi Amnon,
> 
> Please read my previous email ;)
> http://marc.info/?l=linux-kernel&m=135342649119153
> 
> On 11/21, u3557@miso.sublimeip.com wrote:
> >
> > Hi Oleg,
> >
> > > Or. Perhaps we can define TRAP_VSYSCALL and change emulate_vsyscall() to
> > > do
> > >
> > >
> > > 	if (current->ptrace && test_thread_flag(TIF_SYSCALL_TRACE))
> > > 		send_sigtrap(TRAP_VSYSCALL, ...);
> > >
> > > if it returns true?
> > >
> >
> > I wish it were possible, but the vsyscall page is entered in user-mode,
> 
> Only in NATIVE mode. emulate_vsyscall() runs in kernel mode.
> 
> And in the NATIVE mode PTRACE_SYSCALL should work just fine, because:
> 
> > The vsyscall page was designed in order to avoid user/kernel context
> > switches,
> 
> True, it was. But not today. Please look at __vsyscall_page:
> 
> 	__vsyscall_page:
> 
> 		mov $__NR_gettimeofday, %rax
> 		syscall
> 		ret
> 
> If you want the "fast" sys_time() without entering the kernel, you can
> use __vdso_time(). And since vdso has the user-space mapping you can
> insert "int3" or use hw breakpoints.
> 
> At least this is my understanding after I glanced at the new implementation.
> 
> 
> However. It is not that I think that TRAP_VSYSCALL is really good idea.
> At least it needs another option...
> 
> Oleg.
> 
> 


  reply	other threads:[~2012-11-21 17:30 UTC|newest]

Thread overview: 60+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-11-09 18:29 [PATCH] arch_check_bp_in_kernelspace: fix the range check Oleg Nesterov
2012-11-09 18:30 ` Oleg Nesterov
2012-11-19 17:47   ` Oleg Nesterov
2012-11-19 18:25     ` Steven Rostedt
2012-11-20 10:33       ` u3557
2012-11-20 15:48       ` Oleg Nesterov
2012-11-20 15:55         ` Steven Rostedt
2012-11-20 18:32         ` Oleg Nesterov
2012-11-20 23:16           ` u3557
2012-11-21 14:16             ` Oleg Nesterov
2012-11-21 17:30               ` Amnon Shiloh [this message]
2012-11-22 16:12                 ` vdso && cr (Was: arch_check_bp_in_kernelspace: fix the range check) Oleg Nesterov
2012-11-22 20:57                   ` Pavel Emelyanov
2012-11-23  0:20                     ` vdso && cr (Was: arch_check_bp_in_kernelspace: fix the range Amnon Shiloh
2012-11-23 17:45                       ` Oleg Nesterov
2012-11-24 12:47                         ` Amnon Shiloh
2012-11-23 17:42                     ` vdso && cr (Was: arch_check_bp_in_kernelspace: fix the range check) Oleg Nesterov
2012-11-23  9:14                   ` arch_check_bp_in_kernelspace: fix the range check Amnon Shiloh
2012-11-23 16:33                     ` Oleg Nesterov
2012-11-23 17:05                       ` Oleg Nesterov
2012-11-24 14:14                         ` Amnon Shiloh
2012-11-24 13:45                       ` Amnon Shiloh
2012-11-25 22:55                         ` Oleg Nesterov
2012-11-25 23:48                           ` Amnon Shiloh
2012-12-02 19:30                             ` PTRACE_SYSCALL && vsyscall (Was: arch_check_bp_in_kernelspace: fix the range check) Oleg Nesterov
2012-12-02 23:54                               ` u3557
2012-12-04 17:59                                 ` Oleg Nesterov
2012-12-04 22:44                                   ` u3557
2013-01-08 17:08                                   ` Pedro Alves
2013-01-09 17:52                                     ` Oleg Nesterov
2013-01-10  6:54                                       ` u3557
2013-01-12 18:12                                         ` Oleg Nesterov
2013-01-14  2:31                                           ` u3557
2013-01-14 16:01                                             ` Oleg Nesterov
2013-02-18  1:39                                               ` prctl(PR_SET_MM) Amnon Shiloh
2013-02-18  5:44                                                 ` prctl(PR_SET_MM) Randy Dunlap
2013-02-18 15:21                                                 ` prctl(PR_SET_MM) Steven Rostedt
2013-02-18 16:33                                                   ` prctl(PR_SET_MM) Amnon Shiloh
2013-02-18 19:49                                                     ` prctl(PR_SET_MM) Steven Rostedt
2013-02-19  6:25                                                       ` prctl(PR_SET_MM) Amnon Shiloh
2013-02-20  8:39                                                         ` prctl(PR_SET_MM) Cyrill Gorcunov
2013-02-20  9:38                                                           ` prctl(PR_SET_MM) Amnon Shiloh
2013-02-20 10:51                                                             ` prctl(PR_SET_MM) Cyrill Gorcunov
2013-02-20 11:16                                                               ` prctl(PR_SET_MM) Amnon Shiloh
2013-02-21  7:46                                                               ` prctl(PR_SET_MM) Amnon Shiloh
2013-02-21  8:00                                                                 ` prctl(PR_SET_MM) Cyrill Gorcunov
2013-02-21  8:03                                                                   ` prctl(PR_SET_MM) Amnon Shiloh
2013-02-21  8:09                                                                     ` prctl(PR_SET_MM) Cyrill Gorcunov
2013-02-21 22:18                                                                   ` prctl(PR_SET_MM) Andrew Morton
2013-02-21 22:42                                                                     ` prctl(PR_SET_MM) Cyrill Gorcunov
2013-02-22  1:18                                                                     ` prctl(PR_SET_MM) Amnon Shiloh
2013-02-22 14:23                                                         ` prctl(PR_SET_MM) Denys Vlasenko
2012-12-05  9:29                               ` PTRACE_SYSCALL && vsyscall (Was: arch_check_bp_in_kernelspace: fix the range check) Jan Kratochvil
2012-12-05 13:14                                 ` u3557
2012-11-26  9:44                   ` vdso && cr " Cyrill Gorcunov
2012-11-26 12:27                     ` Andrey Wagin
2012-11-26 12:55                       ` Amnon Shiloh
2012-11-26 14:18                         ` Cyrill Gorcunov
2012-11-26 14:26                           ` vdso && cr (Was: arch_check_bp_in_kernelspace: fix the range Amnon Shiloh
2012-11-26 14:41                             ` vdso && cr Cyrill Gorcunov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20121121173043.F0319592076@miso.sublimeip.com \
    --to=u3557@miso.sublimeip.com \
    --cc=a.p.zijlstra@chello.nl \
    --cc=fweisbec@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=oleg@redhat.com \
    --cc=rostedt@goodmis.org \
    --cc=u3557@dialix.com.au \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.