* [PATCH V4 0/3] perf tools and x86 PTI entry trampolines
From: Adrian Hunter @ 2018-06-06 12:54 UTC
  To: Thomas Gleixner, Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Andy Lutomirski, H. Peter Anvin,
	Andi Kleen, Alexander Shishkin, Dave Hansen, Joerg Roedel,
	Jiri Olsa, linux-kernel, x86

Hi

Here is V4 of patches to support x86 PTI entry trampolines in perf tools.

Patches also here:
	http://git.infradead.org/users/ahunter/linux-perf.git/shortlog/refs/heads/perf-tools-kpti-v4
	git://git.infradead.org/users/ahunter/linux-perf.git perf-tools-kpti-v4

V3 patches also here:
	http://git.infradead.org/users/ahunter/linux-perf.git/shortlog/refs/heads/perf-tools-kpti-v3
	git://git.infradead.org/users/ahunter/linux-perf.git perf-tools-kpti-v3

V2 patches also here:
	http://git.infradead.org/users/ahunter/linux-perf.git/shortlog/refs/heads/perf-tools-kpti-v2
	git://git.infradead.org/users/ahunter/linux-perf.git perf-tools-kpti-v2

V1 patches also here:
	http://git.infradead.org/users/ahunter/linux-perf.git/shortlog/refs/heads/perf-tools-kpti-v1
	git://git.infradead.org/users/ahunter/linux-perf.git perf-tools-kpti-v1


Changes Since V3:
	kallsyms: Simplify update_iter_mod()
		Added comment
		Added Andi's Ack

	kallsyms, x86: Export addresses of PTI entry trampolines
		Expanded commit message
		Used for_each_possible_cpu()
		Added Andi's Ack even though logic changed slightly

	x86: Add entry trampolines to kcore
		Re-based
		Added Andi's Ack

	perf tools: Add machine__nr_cpus_avail()
	perf tools: Workaround missing maps for x86 PTI entry trampolines
	perf tools: Fix map_groups__split_kallsyms() for entry trampoline symbols
	perf tools: Allow for extra kernel maps
	perf tools: Create maps for x86 PTI entry trampolines
	perf tools: Synthesize and process mmap events for x86 PTI entry trampolines
	perf buildid-cache: kcore_copy: Keep phdr data in a list
	perf buildid-cache: kcore_copy: Keep a count of phdrs
	perf buildid-cache: kcore_copy: Calculate offset from phnum
	perf buildid-cache: kcore_copy: Layout sections
	perf buildid-cache: kcore_copy: Iterate phdrs
	perf buildid-cache: kcore_copy: Get rid of kernel_map
	perf buildid-cache: kcore_copy: Copy x86 PTI entry trampoline sections
	perf buildid-cache: kcore_copy: Amend the offset of sections that remap kernel text
		Dropped because they have been applied

Changes Since V2:

	x86: Add entry trampolines to kcore
	x86: kcore: Give entry trampolines all the same offset in kcore
		Combined into a single patch
		Added comment
		Expand commit message

	perf tools: Add machine__is() to identify machine arch
		Dropped because it has been applied

	perf tools: Fix kernel_start for PTI on x86
		Dropped because it has been applied

Changes Since V1:

	perf tools: Use the _stest symbol to identify the kernel map when loading kcore
		Dropped because it has been applied

	perf tools: Add machine__is() to identify machine arch
		New patch

	perf tools: Fix kernel_start for PTI on x86
		Moved definition of machine__is() to a separate patch

	perf tools: Add machine__nr_cpus_avail()
		New patch

	perf tools: Workaround missing maps for x86 PTI entry trampolines
		Use machine__nr_cpus_avail()

	perf tools: Create maps for x86 PTI entry trampolines
		Re-based

Changes Since RFC:

	Change description 'x86_64 KPTI' to 'x86 PTI'

	Rename 'special' kernel map to 'extra' kernel map etc

	kallsyms: Simplify update_iter_mod()
		Expand commit message

	perf tools: Fix kernel_start for PTI on x86
		Amend machine__is() to check if machine is NULL

	perf tools: Workaround missing maps for x86 PTI entry trampolines
		Simplify find_entry_trampoline()
		Add comment before struct extra_kernel_map /* Kernel-space
		maps for symbols that are outside the main kernel map and
		module maps */

	perf tools: Create maps for x86 PTI entry trampolines
		Move code presently only used by x86_64 into arch

	perf tools: Synthesize and process mmap events for x86 PTI entry
	trampolines
		Fix spelling 'kernal' -> 'kernel'
		Rename 'special' kernel map to 'extra' kernel map etc
		Move code presently only used by x86_64 into arch

	perf buildid-cache: kcore_copy: Keep phdr data in a list
		Expand commit message
		Rename 'list' -> 'node'

	perf buildid-cache: kcore_copy: Get rid of kernel_map
		Expand commit message
		Add phdr_data__new()
		Rename 'kcore_copy__new_phdr' -> 'kcore_copy_info__addnew'


Original Cover email:

Perf tools do not know about x86 PTI entry trampolines - see example
below.  These patches add a workaround, namely "perf tools: Workaround
missing maps for x86 PTI entry trampolines", which has the limitation
that it hard codes the addresses.  Note that the workaround will work for
old kernels and old perf.data files, but not for future kernels if the
trampoline addresses are ever changed.

At present, perf tools use /proc/kallsyms to construct a memory map for
the kernel.  Recording such a map in the perf.data file is necessary to
deal with kernel relocation and KASLR.
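
As a rough illustration of what reading /proc/kallsyms involves, the sketch
below parses lines of the form "<hex address> <type> <name>" and searches a
kallsyms-format file for one symbol.  It is not the perf tools
implementation: struct ksym, parse_kallsyms_line() and find_ksym() are
hypothetical names made up for this example, and the optional module column
is simply ignored.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record for one kallsyms line (not a perf tools structure). */
struct ksym {
	unsigned long long addr;
	char type;
	char name[128];
};

/*
 * Parse one line of the form "<hex address> <type> <name>".
 * Returns 0 on success, -1 if the line does not match that shape.
 */
int parse_kallsyms_line(const char *line, struct ksym *sym)
{
	assert(line && sym);
	if (sscanf(line, "%llx %c %127s", &sym->addr, &sym->type, sym->name) != 3)
		return -1;
	return 0;
}

/*
 * Scan a kallsyms-format file for a symbol called name.
 * Returns 0 and fills *sym if found, -1 otherwise.
 */
int find_ksym(const char *path, const char *name, struct ksym *sym)
{
	char line[512];
	FILE *f = fopen(path, "r");
	int ret = -1;

	if (!f)
		return -1;
	while (fgets(line, sizeof(line), f)) {
		if (!parse_kallsyms_line(line, sym) && !strcmp(sym->name, name)) {
			ret = 0;
			break;
		}
	}
	fclose(f);
	return ret;
}
```

For example, find_ksym("/proc/kallsyms", "_stext", &sym) would look up the
start of kernel text.  Note that unprivileged readers may see all addresses
as zero, depending on the kptr_restrict setting.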

While adding symbols for the trampolines to /proc/kallsyms is reasonable on
its own terms, the motivation here is to have perf tools use them to create
memory maps in the same fashion as is done for the kernel text.

So the first 2 patches add symbols to /proc/kallsyms for the trampolines:

      kallsyms: Simplify update_iter_mod()
      kallsyms, x86: Export addresses of syscall trampolines

perf tools have the ability to use /proc/kcore (in conjunction with
/proc/kallsyms) as the kernel image. So the next 2 patches add program
headers for the trampolines to the kcore ELF:

      x86: Add entry trampolines to kcore
      x86: kcore: Give entry trampolines all the same offset in kcore
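
To make the role of the kcore program headers more concrete, here is a
minimal sketch of how a consumer of an ELF64 image could find the PT_LOAD
program header that covers a sampled address and translate that address to
a file offset.  It assumes the Linux <elf.h> definitions; the function names
find_load_phdr() and phdr_file_offset() are hypothetical and this is not
the perf tools code.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return the index of the PT_LOAD program header whose virtual address
 * range contains addr, or -1 if no such header exists.
 */
int find_load_phdr(const Elf64_Phdr *phdrs, size_t n, uint64_t addr)
{
	assert(phdrs || n == 0);
	for (size_t i = 0; i < n; i++) {
		const Elf64_Phdr *p = &phdrs[i];

		if (p->p_type != PT_LOAD)
			continue;
		/* Subtract first so the range check cannot overflow. */
		if (addr >= p->p_vaddr && addr - p->p_vaddr < p->p_memsz)
			return (int)i;
	}
	return -1;
}

/* File offset within the ELF image that corresponds to addr inside *p. */
uint64_t phdr_file_offset(const Elf64_Phdr *p, uint64_t addr)
{
	return p->p_offset + (addr - p->p_vaddr);
}
```

With a program header covering the trampoline range, an address such as
0xfffffe00000e201b from the example below would resolve to a header index
instead of -1, which is what lets the sample be attributed to a map.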

It is worth noting that, with the kcore changes alone, perf tools require
no changes to recognise the trampolines when using /proc/kcore.

Similarly, if perf tools are used with a matching kallsyms only (by denying
access to /proc/kcore or a vmlinux image), then the kallsyms patches are
sufficient to recognise the trampolines with no changes needed to the
tools.

However, in the general case, when using vmlinux or dealing with
relocations, perf tools need memory maps for the trampolines.  Because the
kernel text map is constructed as a special case, using the same approach
for the trampolines means treating them as a special case also, which
requires a number of changes to perf tools, and the remaining patches deal
with that.


Example: make a program that does lots of small syscalls e.g.

	$ cat uname_x_n.c

	#include <sys/utsname.h>
	#include <stdlib.h>

	int main(int argc, char *argv[])
	{
		long n = argc > 1 ? strtol(argv[1], NULL, 0) : 0;
		struct utsname u;

		while (n--)
			uname(&u);

		return 0;
	}

and then:

	sudo perf record uname_x_n 100000
	sudo perf report --stdio

Before the changes, there are unknown symbols:

 # Overhead  Command    Shared Object     Symbol
 # ........  .........  ................  ..................................
 #
    41.91%  uname_x_n  [kernel.vmlinux]  [k] syscall_return_via_sysret
    19.22%  uname_x_n  [kernel.vmlinux]  [k] copy_user_enhanced_fast_string
    18.70%  uname_x_n  [unknown]         [k] 0xfffffe00000e201b
     4.09%  uname_x_n  libc-2.19.so      [.] __GI___uname
     3.08%  uname_x_n  [kernel.vmlinux]  [k] do_syscall_64
     3.02%  uname_x_n  [unknown]         [k] 0xfffffe00000e2025
     2.32%  uname_x_n  [kernel.vmlinux]  [k] down_read
     2.27%  uname_x_n  ld-2.19.so        [.] _dl_start
     1.97%  uname_x_n  [unknown]         [k] 0xfffffe00000e201e
     1.25%  uname_x_n  [kernel.vmlinux]  [k] up_read
     1.02%  uname_x_n  [unknown]         [k] 0xfffffe00000e200c
     0.99%  uname_x_n  [kernel.vmlinux]  [k] entry_SYSCALL_64
     0.16%  uname_x_n  [kernel.vmlinux]  [k] flush_signal_handlers
     0.01%  perf       [kernel.vmlinux]  [k] native_sched_clock
     0.00%  perf       [kernel.vmlinux]  [k] native_write_msr

After the changes there are not:

 # Overhead  Command    Shared Object     Symbol
 # ........  .........  ................  ..................................
 #
    41.91%  uname_x_n  [kernel.vmlinux]  [k] syscall_return_via_sysret
    24.70%  uname_x_n  [kernel.vmlinux]  [k] entry_SYSCALL_64_trampoline
    19.22%  uname_x_n  [kernel.vmlinux]  [k] copy_user_enhanced_fast_string
     4.09%  uname_x_n  libc-2.19.so      [.] __GI___uname
     3.08%  uname_x_n  [kernel.vmlinux]  [k] do_syscall_64
     2.32%  uname_x_n  [kernel.vmlinux]  [k] down_read
     2.27%  uname_x_n  ld-2.19.so        [.] _dl_start
     1.25%  uname_x_n  [kernel.vmlinux]  [k] up_read
     0.99%  uname_x_n  [kernel.vmlinux]  [k] entry_SYSCALL_64
     0.16%  uname_x_n  [kernel.vmlinux]  [k] flush_signal_handlers
     0.01%  perf       [kernel.vmlinux]  [k] native_sched_clock
     0.00%  perf       [kernel.vmlinux]  [k] native_write_msr


Adrian Hunter (2):
      kallsyms: Simplify update_iter_mod()
      x86: Add entry trampolines to kcore

Alexander Shishkin (1):
      kallsyms, x86: Export addresses of PTI entry trampolines

 arch/x86/mm/cpu_entry_area.c | 33 ++++++++++++++++++++++++++++
 fs/proc/kcore.c              |  7 ++++--
 include/linux/kcore.h        | 13 +++++++++++
 kernel/kallsyms.c            | 51 ++++++++++++++++++++++++++++++++------------
 4 files changed, 88 insertions(+), 16 deletions(-)


Regards
Adrian

