From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1757087AbcHZAEN (ORCPT ); Thu, 25 Aug 2016 20:04:13 -0400
Received: from mail-qk0-f170.google.com ([209.85.220.170]:33563 "EHLO
        mail-qk0-f170.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1756332AbcHZAEH (ORCPT ); Thu, 25 Aug 2016 20:04:07 -0400
MIME-Version: 1.0
In-Reply-To: <3AD1D5AF-552E-4345-855A-36ECC4B545DE@zytor.com>
References: <20160825152110.25663-1-dsafonov@virtuozzo.com>
        <3AD1D5AF-552E-4345-855A-36ECC4B545DE@zytor.com>
From: Dmitry Safonov <0x7f454c46@gmail.com>
Date: Fri, 26 Aug 2016 01:53:43 +0300
Message-ID:
Subject: Re: [RFC 0/3] Put vdso in ramfs-like filesystem (vdsofs)
To: "H. Peter Anvin"
Cc: Dmitry Safonov, linux-kernel@vger.kernel.org, Ingo Molnar,
        Andy Lutomirski, Thomas Gleixner, X86 ML, Oleg Nesterov,
        Steven Rostedt, viro@zeniv.linux.org.uk
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

2016-08-25 23:49 GMT+03:00 H. Peter Anvin:
> On August 25, 2016 8:21:07 AM PDT, Dmitry Safonov wrote:
>>This patches set is cleanly RFC and is not supposed to be applied.
>>Also for RFC time it builds only on x86_64.
>>
>>So, in a mail thread Oleg told that it would be worth to introduce vm_file
>>for vdso mappings as currently uprobes can not be placed on vDSO VMAs [1].
>>In this patches set I introduce in-kernel filesystem for vdso files.
>>After patches vDSO VMA now has inode and is just a private file mapping:
>>7ffcc4b2b000-7ffcc4b2d000 r--p 00000000 00:00 0                          [vvar]
>>7ffcc4b2d000-7ffcc4b2f000 r-xp 00000000 00:09 18                         [vdso]
>>
>>Then I introduce interface in uprobe_events to insert uprobes in vdso.
>>FWIW:
>> [~]# cd kernel/linux
>> [linux]# readelf --syms arch/x86/entry/vdso/vdso64.so
>>Symbol table '.dynsym' contains 11 entries:
>>   Num:    Value          Size Type    Bind   Vis      Ndx Name
>>     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
>>     1: 0000000000000470     0 SECTION LOCAL  DEFAULT    8
>>     2: 00000000000008d0   885 FUNC    WEAK   DEFAULT   12 clock_gettime@@LINUX_2.6
>>     3: 0000000000000c50   472 FUNC    GLOBAL DEFAULT   12 __vdso_gettimeofday@@LINUX_2.6
>>     4: 0000000000000c50   472 FUNC    WEAK   DEFAULT   12 gettimeofday@@LINUX_2.6
>>     5: 0000000000000e30    21 FUNC    GLOBAL DEFAULT   12 __vdso_time@@LINUX_2.6
>>     6: 0000000000000e30    21 FUNC    WEAK   DEFAULT   12 time@@LINUX_2.6
>>     7: 00000000000008d0   885 FUNC    GLOBAL DEFAULT   12 __vdso_clock_gettime@@LINUX_2.6
>>     8: 0000000000000000     0 OBJECT  GLOBAL DEFAULT  ABS LINUX_2.6
>>     9: 0000000000000e50    41 FUNC    GLOBAL DEFAULT   12 __vdso_getcpu@@LINUX_2.6
>>    10: 0000000000000e50    41 FUNC    WEAK   DEFAULT   12 getcpu@@LINUX_2.6
>> [~]# cd /sys/kernel/debug/tracing/
>> [tracing]# echo 'p:clock_gettime :vdso:/64:0x8d0' > uprobe_events
>> [tracing]# echo 'p:gettimeofday :vdso:/64:0xc50' >> uprobe_events
>> [tracing]# echo 'p:time :vdso:/64:0xe30' >> uprobe_events
>> [tracing]# echo 1 > events/uprobes/enable
>> [tracing]# su test # it has UID=1001
>> [tracing]$ date
>> Thu Aug 25 17:19:29 MSK 2016
>> [tracing]$ exit
>> [tracing]# cat trace
>> # tracer: nop
>> #
>> # entries-in-buffer/entries-written: 175/175   #P:4
>> #
>> #                              _-----=> irqs-off
>> #                             / _----=> need-resched
>> #                            | / _---=> hardirq/softirq
>> #                            || / _--=> preempt-depth
>> #                            ||| /     delay
>> #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
>> #              | |       |   ||||       |         |
>>             bash-11560 [001] d...   316.470236: time: (0x7ffcacebae30)
>>             bash-11560 [001] d...   316.471436: gettimeofday: (0x7ffcacebac50)
>>             bash-11560 [001] d...   316.477550: time: (0x7ffcacebae30)
>>             bash-11560 [001] d...   316.477655: time: (0x7ffcacebae30)
>>           mktemp-11568 [001] d...   316.479589: gettimeofday: (0x7ffc603f0c50)
>>             date-11571 [001] d...   316.481890: clock_gettime: (0x7ffec9db58d0)
>>[...]
>>
>>If this approach will be decided as fine, I will prepare a better version,
>>fixing the following things:
>>o put vdsofs in generic fs/* dir
>>o support other archs and vdso blobs
>>o remove BUG_ON()'s and UID==1001 check
>>o remove extern's and use headers only
>>o refactor code in create_trace_uprobe()
>>o add some state to (struct trace_uprobe), so i.e., `cat uprobe_events` will
>>  print those uprobes as vdso-based
>>o document this interface in Documentation/trace/uprobetracer.txt
>>o prepare nice patches set?
>>
>>So, opinions? Is it worth to add something like this?
>>
>>[1]: https://lkml.org/lkml/2016/7/12/346
>>
>>Dmitry Safonov (3):
>>  x86/vdso: create vdso file, use it for mapping
>>  uprobe: drop isdigit() check in create_trace_uprobe
>>  uprobe: add vdso support
>>
>>Cc: Oleg Nesterov
>>Cc: Al Viro
>>Cc: Steven Rostedt
>>Cc: Andy Lutomirski
>>Cc: Thomas Gleixner
>>Cc: Ingo Molnar
>>Cc: "H. Peter Anvin"
>>Cc: x86@kernel.org
>>Cc: Dmitry Safonov <0x7f454c46@gmail.com>
>>
>> arch/x86/entry/vdso/vma.c   | 148 ++++++++++++++++++++++++++++++++++++++++++--
>> kernel/trace/trace_uprobe.c |  50 +++++++++++----
>> 2 files changed, 180 insertions(+), 18 deletions(-)
>
> I think there is a lot to be said for this idea. However, a private
> mapping is definitely wrong for the vvar data; for the vdso code it could
> be considered either way I suppose.

Thanks for your reply.
As you can see, I preserved the pure pfn mapping for vvar:
7ffcc4b2b000-7ffcc4b2d000 r--p 00000000 00:00 0                          [vvar]
7ffcc4b2d000-7ffcc4b2f000 r-xp 00000000 00:09 18                         [vdso]
(no inode number).
I also think it would be useless to do the same for vvar, as it holds only
data and there is no point in probing it.

--
             Dmitry
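
For readers who want to check the two maps lines above by hand, here is a
minimal sketch in Python (not part of the patch set). It parses a
/proc/<pid>/maps line, reports whether the mapping has an inode, and relates
a traced address back to a symbol offset from the readelf output. The helper
names (parse_maps_line, is_file_backed, probe_address, offset_in_page) are
made up for illustration. The low-12-bit comparison assumes 4 KiB pages and
only works here because the probed offsets (0x8d0, 0xc50, 0xe30) are all
below one page and the vdso base is page-aligned.

```python
# Sample lines copied from the mail above.
VVAR_LINE = "7ffcc4b2b000-7ffcc4b2d000 r--p 00000000 00:00 0 [vvar]"
VDSO_LINE = "7ffcc4b2d000-7ffcc4b2f000 r-xp 00000000 00:09 18 [vdso]"

PAGE_MASK = 0xFFF  # assumption: 4 KiB pages, as on x86_64


def parse_maps_line(line):
    """Split one /proc/<pid>/maps line into its fields."""
    fields = line.split(None, 5)
    addr, perms, offset, dev, inode = fields[:5]
    # The name column is optional (anonymous mappings have none).
    name = fields[5].strip() if len(fields) > 5 else ""
    start, end = (int(part, 16) for part in addr.split("-"))
    return {
        "start": start,
        "end": end,
        "perms": perms,
        "offset": int(offset, 16),
        "dev": dev,
        "inode": int(inode),
        "name": name,
    }


def is_file_backed(entry):
    """A mapping with inode 0 has no backing file, like [vvar] above."""
    return entry["inode"] != 0


def probe_address(entry, symbol_offset):
    """Virtual address of a symbol, given its offset inside the mapping."""
    return entry["start"] + symbol_offset


def offset_in_page(address):
    """Low bits of an address; equals the symbol offset only if it is < 4 KiB."""
    return address & PAGE_MASK


if __name__ == "__main__":
    for line in (VVAR_LINE, VDSO_LINE):
        entry = parse_maps_line(line)
        print(entry["name"], "file-backed:", is_file_backed(entry))
    vdso = parse_maps_line(VDSO_LINE)
    print(hex(probe_address(vdso, 0xE30)))
```

The addresses in the trace differ per process because each one gets its own
randomized vdso base, but their low bits still line up with the offsets
passed to uprobe_events.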