[PATCH 0/2] [RFC] jump-label/x86: Compress jmps to 2 bytes where possible

linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH 0/2] [RFC] jump-label/x86: Compress jmps to 2 bytes where possible
@ 2012-01-18 19:53 Steven Rostedt
  2012-01-18 19:53 ` [PATCH 1/2] jump labels: Add infrastructure to update jump labels at compile time Steven Rostedt
  2012-01-18 19:53 ` [PATCH 2/2] jump labels/x86: Use etiher 5 byte or 2 byte jumps Steven Rostedt
  0 siblings, 2 replies; 12+ messages in thread
From: Steven Rostedt @ 2012-01-18 19:53 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Frederic Weisbecker,
	H. Peter Anvin, Mathieu Desnoyers, Jason Baron

As discussed previously, it can save a lot of text space if we could
use the 2 byte jump for jump labels instead of always using the 5 byte
version. But to do this, the jmps must be put into the kernel first
(always on). Then at compile time, the jmps are examined and converted
to nops.

At boot up, the jump-labels are examined again and the nops are converted
to the CPU native best 5 byte nops for just the 5 byte nop locations.

The actual op is examined to determine if the jmp was 2 or 5 bytes, and
dealt with accordingly.

I forward ported old work I did before and cleaned it up a bit.
This can also be made to make a "default on" case at compile time if
necessary.

Thoughts?

-- Steve

Steven Rostedt (2):
      jump labels: Add infrastructure to update jump labels at compile time
      jump labels/x86: Use etiher 5 byte or 2 byte jumps

----
 Makefile                          |    7 +
 arch/Kconfig                      |    6 +
 arch/x86/Kconfig                  |    1 +
 arch/x86/include/asm/jump_label.h |    2 +-
 arch/x86/kernel/jump_label.c      |   86 +++++++++-
 scripts/Makefile                  |    1 +
 scripts/Makefile.build            |   15 ++-
 scripts/update_jump_label.c       |  335 +++++++++++++++++++++++++++++++++++++
 scripts/update_jump_label.h       |  283 +++++++++++++++++++++++++++++++
 9 files changed, 726 insertions(+), 10 deletions(-)

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/2] jump labels: Add infrastructure to update jump labels at compile time
  2012-01-18 19:53 [PATCH 0/2] [RFC] jump-label/x86: Compress jmps to 2 bytes where possible Steven Rostedt
@ 2012-01-18 19:53 ` Steven Rostedt
  2012-01-19 14:24   ` Mathieu Desnoyers
  2012-01-18 19:53 ` [PATCH 2/2] jump labels/x86: Use etiher 5 byte or 2 byte jumps Steven Rostedt
  1 sibling, 1 reply; 12+ messages in thread
From: Steven Rostedt @ 2012-01-18 19:53 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Frederic Weisbecker,
	H. Peter Anvin, Mathieu Desnoyers, Jason Baron

[-- Attachment #1: 0001-jump-labels-Add-infrastructure-to-update-jump-labels.patch --]
[-- Type: text/plain, Size: 20842 bytes --]

From: Steven Rostedt <srostedt@redhat.com>

Add the infrastructure to allow architectures to modify the jump label
locations at compile time. This is mainly for x86, where the jmps may
be either 2 bytes or 5 bytes. Instead of wasting 5 bytes for all jump labels,
this code will let x86 put in a jmp instead of a 5 byte nop. Then the
assembler will make either a 2 byte or 5 byte jmp depending on where
the target is.

At compile time, this code will replace the jmps with either a 2 byte
or 5 byte nop depending on the size of the jmp that was added.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 Makefile                    |    7 +
 arch/Kconfig                |    6 +
 scripts/Makefile            |    1 +
 scripts/Makefile.build      |   15 ++-
 scripts/update_jump_label.c |  335 +++++++++++++++++++++++++++++++++++++++++++
 scripts/update_jump_label.h |  283 ++++++++++++++++++++++++++++++++++++
 6 files changed, 645 insertions(+), 2 deletions(-)
 create mode 100644 scripts/update_jump_label.c
 create mode 100644 scripts/update_jump_label.h

diff --git a/Makefile b/Makefile
index adddd11..2b3fdf3 100644
--- a/Makefile
+++ b/Makefile
@@ -611,6 +611,13 @@ ifdef CONFIG_DYNAMIC_FTRACE
 endif
 endif
 
+ifdef CONFIG_JUMP_LABEL
+	ifdef CONFIG_HAVE_BUILD_TIME_JUMP_LABEL
+		BUILD_UPDATE_JUMP_LABEL := y
+		export BUILD_UPDATE_JUMP_LABEL
+	endif
+endif
+
 # We trigger additional mismatches with less inlining
 ifdef CONFIG_DEBUG_SECTION_MISMATCH
 KBUILD_CFLAGS += $(call cc-option, -fno-inline-functions-called-once)
diff --git a/arch/Kconfig b/arch/Kconfig
index 4b0669c..8fa6934 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -169,6 +169,12 @@ config HAVE_PERF_EVENTS_NMI
 	  subsystem.  Also has support for calculating CPU cycle events
 	  to determine how many clock cycles in a given period.
 
+config HAVE_BUILD_TIME_JUMP_LABEL
+       bool
+       help
+	If an arch uses scripts/update_jump_label to patch in jump nops
+	at build time, then it must enable this option.
+
 config HAVE_ARCH_JUMP_LABEL
 	bool
 
diff --git a/scripts/Makefile b/scripts/Makefile
index df7678f..738b65c 100644
--- a/scripts/Makefile
+++ b/scripts/Makefile
@@ -13,6 +13,7 @@ hostprogs-$(CONFIG_LOGO)         += pnmtologo
 hostprogs-$(CONFIG_VT)           += conmakehash
 hostprogs-$(CONFIG_IKCONFIG)     += bin2c
 hostprogs-$(BUILD_C_RECORDMCOUNT) += recordmcount
+hostprogs-$(BUILD_UPDATE_JUMP_LABEL) += update_jump_label
 
 always		:= $(hostprogs-y) $(hostprogs-m)
 
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index d2b366c..8a84b80 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -258,6 +258,15 @@ cmd_modversions =								\
 	fi;
 endif
 
+ifdef BUILD_UPDATE_JUMP_LABEL
+update_jump_label_source := $(srctree)/scripts/update_jump_label.c \
+			$(srctree)/scripts/update_jump_label.h
+cmd_update_jump_label =						\
+	if [ $(@) != "scripts/mod/empty.o" ]; then		\
+		$(objtree)/scripts/update_jump_label "$(@)";	\
+	fi;
+endif
+
 ifdef CONFIG_FTRACE_MCOUNT_RECORD
 ifdef BUILD_C_RECORDMCOUNT
 ifeq ("$(origin RECORDMCOUNT_WARN)", "command line")
@@ -294,6 +303,7 @@ define rule_cc_o_c
 	$(cmd_modversions)						  \
 	$(call echo-cmd,record_mcount)					  \
 	$(cmd_record_mcount)						  \
+	$(cmd_update_jump_label)					  \
 	scripts/basic/fixdep $(depfile) $@ '$(call make-cmd,cc_o_c)' >    \
 	                                              $(dot-target).tmp;  \
 	rm -f $(depfile);						  \
@@ -301,13 +311,14 @@ define rule_cc_o_c
 endef
 
 # Built-in and composite module parts
-$(obj)/%.o: $(src)/%.c $(recordmcount_source) FORCE
+$(obj)/%.o: $(src)/%.c $(recordmcount_source) $(update_jump_label_source) FORCE
 	$(call cmd,force_checksrc)
 	$(call if_changed_rule,cc_o_c)
 
 # Single-part modules are special since we need to mark them in $(MODVERDIR)
 
-$(single-used-m): $(obj)/%.o: $(src)/%.c $(recordmcount_source) FORCE
+$(single-used-m): $(obj)/%.o: $(src)/%.c $(recordmcount_source) \
+		  $(update_jump_label_source) FORCE
 	$(call cmd,force_checksrc)
 	$(call if_changed_rule,cc_o_c)
 	@{ echo $(@:.o=.ko); echo $@; } > $(MODVERDIR)/$(@F:.o=.mod)
diff --git a/scripts/update_jump_label.c b/scripts/update_jump_label.c
new file mode 100644
index 0000000..399d898
--- /dev/null
+++ b/scripts/update_jump_label.c
@@ -0,0 +1,335 @@
+/*
+ * update_jump_label.c: replace jmps with nops at compile time.
+ * Copyright 2010 Steven Rostedt <srostedt@redhat.com>, Red Hat Inc.
+ *  Parsing of the elf file was influenced by recordmcount.c
+ *  originally written by and copyright to John F. Reiser <jreiser@BitWagon.com>.
+ */
+
+/*
+ * Note, this code is originally designed for x86, but may be used by
+ * other archs to do the nop updates at compile time instead of at boot time.
+ * X86 uses this as an optimization, as jmps can be either 2 bytes or 5 bytes.
+ * Inserting a 2 byte where possible helps with both CPU performance and
+ * icache strain.
+ */
+#include <sys/types.h>
+#include <sys/mman.h>
+#include <sys/stat.h>
+#include <getopt.h>
+#include <elf.h>
+#include <fcntl.h>
+#include <setjmp.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdarg.h>
+#include <string.h>
+#include <unistd.h>
+
+static int fd_map;	/* File descriptor for file being modified. */
+static struct stat sb;	/* Remember .st_size, etc. */
+static int mmap_failed; /* Boolean flag. */
+
+static void die(const char *err, const char *fmt, ...)
+{
+	va_list ap;
+
+	if (err)
+		perror(err);
+
+	if (fmt) {
+		va_start(ap, fmt);
+		fprintf(stderr, "Fatal error:  ");
+		vfprintf(stderr, fmt, ap);
+		fprintf(stderr, "\n");
+		va_end(ap);
+	}
+
+	exit(1);
+}
+
+static void usage(char **argv)
+{
+	char *arg = argv[0];
+	char *p = arg+strlen(arg);
+
+	while (p >= arg && *p != '/')
+		p--;
+	p++;
+
+	printf("usage: %s file\n"
+	       "\n", p);
+	exit(-1);
+}
+
+/* w8rev, w8nat, ...: Handle endianness. */
+
+static uint64_t w8rev(uint64_t const x)
+{
+	return   ((0xff & (x >> (0 * 8))) << (7 * 8))
+	       | ((0xff & (x >> (1 * 8))) << (6 * 8))
+	       | ((0xff & (x >> (2 * 8))) << (5 * 8))
+	       | ((0xff & (x >> (3 * 8))) << (4 * 8))
+	       | ((0xff & (x >> (4 * 8))) << (3 * 8))
+	       | ((0xff & (x >> (5 * 8))) << (2 * 8))
+	       | ((0xff & (x >> (6 * 8))) << (1 * 8))
+	       | ((0xff & (x >> (7 * 8))) << (0 * 8));
+}
+
+static uint32_t w4rev(uint32_t const x)
+{
+	return   ((0xff & (x >> (0 * 8))) << (3 * 8))
+	       | ((0xff & (x >> (1 * 8))) << (2 * 8))
+	       | ((0xff & (x >> (2 * 8))) << (1 * 8))
+	       | ((0xff & (x >> (3 * 8))) << (0 * 8));
+}
+
+static uint32_t w2rev(uint16_t const x)
+{
+	return   ((0xff & (x >> (0 * 8))) << (1 * 8))
+	       | ((0xff & (x >> (1 * 8))) << (0 * 8));
+}
+
+static uint64_t w8nat(uint64_t const x)
+{
+	return x;
+}
+
+static uint32_t w4nat(uint32_t const x)
+{
+	return x;
+}
+
+static uint32_t w2nat(uint16_t const x)
+{
+	return x;
+}
+
+static uint64_t (*w8)(uint64_t);
+static uint32_t (*w)(uint32_t);
+static uint32_t (*w2)(uint16_t);
+
+/* ulseek, uread, ...:  Check return value for errors. */
+
+static off_t
+ulseek(int const fd, off_t const offset, int const whence)
+{
+	off_t const w = lseek(fd, offset, whence);
+	if (w == (off_t)-1)
+		die("lseek", NULL);
+
+	return w;
+}
+
+static size_t
+uread(int const fd, void *const buf, size_t const count)
+{
+	size_t const n = read(fd, buf, count);
+	if (n != count)
+		die("read", NULL);
+
+	return n;
+}
+
+static size_t
+uwrite(int const fd, void const *const buf, size_t const count)
+{
+	size_t const n = write(fd, buf, count);
+	if (n != count)
+		die("write", NULL);
+
+	return n;
+}
+
+static void *
+umalloc(size_t size)
+{
+	void *const addr = malloc(size);
+	if (addr == 0)
+		die("malloc", "malloc failed: %zu bytes\n", size);
+
+	return addr;
+}
+
+/*
+ * Get the whole file as a programming convenience in order to avoid
+ * malloc+lseek+read+free of many pieces.  If successful, then mmap
+ * avoids copying unused pieces; else just read the whole file.
+ * Open for both read and write; new info will be appended to the file.
+ * Use MAP_PRIVATE so that a few changes to the in-memory ElfXX_Ehdr
+ * do not propagate to the file until an explicit overwrite at the last.
+ * This preserves most aspects of consistency (all except .st_size)
+ * for simultaneous readers of the file while we are appending to it.
+ * However, multiple writers still are bad.  We choose not to use
+ * locking because it is expensive and the use case of kernel build
+ * makes multiple writers unlikely.
+ */
+static void *mmap_file(char const *fname)
+{
+	void *addr;
+
+	fd_map = open(fname, O_RDWR);
+	if (fd_map < 0 || fstat(fd_map, &sb) < 0)
+		die(fname, "failed to open file");
+
+	if (!S_ISREG(sb.st_mode))
+		die(NULL, "not a regular file: %s\n", fname);
+
+	addr = mmap(0, sb.st_size, PROT_READ|PROT_WRITE, MAP_PRIVATE,
+		    fd_map, 0);
+
+	mmap_failed = 0;
+	if (addr == MAP_FAILED) {
+		mmap_failed = 1;
+		addr = umalloc(sb.st_size);
+		uread(fd_map, addr, sb.st_size);
+	}
+	return addr;
+}
+
+static void munmap_file(void *addr)
+{
+	if (!mmap_failed)
+		munmap(addr, sb.st_size);
+	else
+		free(addr);
+	close(fd_map);
+}
+
+static unsigned char ideal_nop5_x86_64[5] = { 0x0f, 0x1f, 0x44, 0x00, 0x00 };
+static unsigned char ideal_nop5_x86_32[5] = { 0x3e, 0x8d, 0x74, 0x26, 0x00 };
+static unsigned char ideal_nop2_x86[2] = { 0x66, 0x90 };
+static unsigned char *ideal_nop;
+
+static int (*make_nop)(void *map, size_t const offset);
+
+static int make_nop_x86(void *map, size_t const offset)
+{
+	unsigned char *op;
+	unsigned char *nop;
+	int size;
+
+	/* Determine which type of jmp this is 2 byte or 5. */
+	op = map + offset;
+	switch (*op) {
+	case 0xeb: /* 2 byte */
+		size = 2;
+		nop = ideal_nop2_x86;
+		break;
+	case 0xe9: /* 5 byte */
+		size = 5;
+		nop = ideal_nop;
+		break;
+	default:
+		die(NULL, "Bad jump label section (bad op %x)\n", *op);
+		__builtin_unreachable();
+	}
+
+	/* convert to nop */
+	ulseek(fd_map, offset, SEEK_SET);
+	uwrite(fd_map, nop, size);
+	return 0;
+}
+
+/* 32 bit and 64 bit are very similar */
+#include "update_jump_label.h"
+#define UPDATE_JUMP_LABEL_64
+#include "update_jump_label.h"
+
+static int do_file(const char *fname)
+{
+	Elf32_Ehdr *const ehdr = mmap_file(fname);
+	unsigned int reltype = 0;
+
+	w = w4nat;
+	w2 = w2nat;
+	w8 = w8nat;
+	switch (ehdr->e_ident[EI_DATA]) {
+		static unsigned int const endian = 1;
+	default:
+		die(NULL, "unrecognized ELF data encoding %d: %s\n",
+			ehdr->e_ident[EI_DATA], fname);
+		break;
+	case ELFDATA2LSB:
+		if (*(unsigned char const *)&endian != 1) {
+			/* main() is big endian, file.o is little endian. */
+			w = w4rev;
+			w2 = w2rev;
+			w8 = w8rev;
+		}
+		break;
+	case ELFDATA2MSB:
+		if (*(unsigned char const *)&endian != 0) {
+			/* main() is little endian, file.o is big endian. */
+			w = w4rev;
+			w2 = w2rev;
+			w8 = w8rev;
+		}
+		break;
+	}  /* end switch */
+
+	if (memcmp(ELFMAG, ehdr->e_ident, SELFMAG) != 0 ||
+	    w2(ehdr->e_type) != ET_REL ||
+	    ehdr->e_ident[EI_VERSION] != EV_CURRENT)
+		die(NULL, "unrecognized ET_REL file %s\n", fname);
+
+	switch (w2(ehdr->e_machine)) {
+	default:
+		die(NULL, "unrecognized e_machine %d %s\n",
+		    w2(ehdr->e_machine), fname);
+		break;
+	case EM_386:
+		reltype = R_386_32;
+		make_nop = make_nop_x86;
+		ideal_nop = ideal_nop5_x86_32;
+		break;
+	case EM_ARM:	 reltype = R_ARM_ABS32;
+			 break;
+	case EM_IA_64:	 reltype = R_IA64_IMM64; break;
+	case EM_MIPS:	 /* reltype: e_class    */ break;
+	case EM_PPC:	 reltype = R_PPC_ADDR32;   break;
+	case EM_PPC64:	 reltype = R_PPC64_ADDR64; break;
+	case EM_S390:    /* reltype: e_class    */ break;
+	case EM_SH:	 reltype = R_SH_DIR32;                 break;
+	case EM_SPARCV9: reltype = R_SPARC_64;     break;
+	case EM_X86_64:
+		make_nop = make_nop_x86;
+		ideal_nop = ideal_nop5_x86_64;
+		reltype = R_X86_64_64;
+		break;
+	}  /* end switch */
+
+	switch (ehdr->e_ident[EI_CLASS]) {
+	default:
+		die(NULL, "unrecognized ELF class %d %s\n",
+		    ehdr->e_ident[EI_CLASS], fname);
+		break;
+	case ELFCLASS32:
+		if (w2(ehdr->e_ehsize) != sizeof(Elf32_Ehdr)
+		||  w2(ehdr->e_shentsize) != sizeof(Elf32_Shdr))
+			die(NULL, "unrecognized ET_REL file: %s\n", fname);
+
+		do_func32(ehdr, fname, reltype);
+		break;
+	case ELFCLASS64: {
+		Elf64_Ehdr *const ghdr = (Elf64_Ehdr *)ehdr;
+		if (w2(ghdr->e_ehsize) != sizeof(Elf64_Ehdr)
+		||  w2(ghdr->e_shentsize) != sizeof(Elf64_Shdr))
+			die(NULL, "unrecognized ET_REL file: %s\n", fname);
+
+		do_func64(ghdr, fname, reltype);
+		break;
+	}
+	}  /* end switch */
+
+	munmap_file(ehdr);
+	return 0;
+}
+
+int main(int argc, char **argv)
+{
+	if (argc != 2)
+		usage(argv);
+
+	return do_file(argv[1]);
+}
+
diff --git a/scripts/update_jump_label.h b/scripts/update_jump_label.h
new file mode 100644
index 0000000..8810557
--- /dev/null
+++ b/scripts/update_jump_label.h
@@ -0,0 +1,283 @@
+/*
+ * update_jump_label.h
+ *
+ * This code was taken out of recordmcount.c written by
+ * Copyright 2009 John F. Reiser <jreiser@BitWagon.com>.  All rights reserved.
+ *
+ * The original code had the same algorithms for both 32bit
+ * and 64bit ELF files, but the code was duplicated to support
+ * the difference in structures that were used. This
+ * file creates a macro of everything that is different between
+ * the 64 and 32 bit code, such that by including this header
+ * twice we can create both sets of functions by including this
+ * header once with RECORD_MCOUNT_64 undefined, and again with
+ * it defined.
+ *
+ * This conversion to macros was done by:
+ * Copyright 2010 Steven Rostedt <srostedt@redhat.com>, Red Hat Inc.
+ *
+ * Licensed under the GNU General Public License, version 2 (GPLv2).
+ */
+
+#undef EBITS
+#undef _w
+#undef _align
+#undef _size
+
+#ifdef UPDATE_JUMP_LABEL_64
+# define EBITS			64
+# define _w			w8
+# define _align			7u
+# define _size			8
+#else
+# define EBITS			32
+# define _w			w
+# define _align			3u
+# define _size			4
+#endif
+
+#define _FBITS(x, e)	x##e
+#define FBITS(x, e)	_FBITS(x, e)
+#define FUNC(x)		FBITS(x, EBITS)
+
+#undef Elf_Addr
+#undef Elf_Ehdr
+#undef Elf_Shdr
+#undef Elf_Rel
+#undef Elf_Rela
+#undef Elf_Sym
+#undef ELF_R_SYM
+#undef ELF_R_TYPE
+
+#define __ATTACH(x, y, z)	x##y##z
+#define ATTACH(x, y, z)		__ATTACH(x, y, z)
+
+#define Elf_Addr	ATTACH(Elf, EBITS, _Addr)
+#define Elf_Ehdr	ATTACH(Elf, EBITS, _Ehdr)
+#define Elf_Shdr	ATTACH(Elf, EBITS, _Shdr)
+#define Elf_Rel		ATTACH(Elf, EBITS, _Rel)
+#define Elf_Rela	ATTACH(Elf, EBITS, _Rela)
+#define Elf_Sym		ATTACH(Elf, EBITS, _Sym)
+#define uint_t		ATTACH(uint, EBITS, _t)
+#define ELF_R_SYM	ATTACH(ELF, EBITS, _R_SYM)
+#define ELF_R_TYPE	ATTACH(ELF, EBITS, _R_TYPE)
+
+#undef get_shdr
+#define get_shdr(ehdr) ((Elf_Shdr *)(_w((ehdr)->e_shoff) + (void *)(ehdr)))
+
+#undef get_section_loc
+#define get_section_loc(ehdr, shdr)(_w((shdr)->sh_offset) + (void *)(ehdr))
+
+/* Functions and pointers that do_file() may override for specific e_machine. */
+
+static void FUNC(get_sym_str_and_relp)(Elf_Shdr const *const relhdr,
+				 Elf_Ehdr const *const ehdr,
+				 Elf_Sym const **sym0,
+				 char const **str0,
+				 Elf_Rel const **relp)
+{
+	Elf_Shdr *const shdr0 = get_shdr(ehdr);
+	unsigned const symsec_sh_link = w(relhdr->sh_link);
+	Elf_Shdr const *const symsec = &shdr0[symsec_sh_link];
+	Elf_Shdr const *const strsec = &shdr0[w(symsec->sh_link)];
+	Elf_Rel const *const rel0 =
+		(Elf_Rel const *)get_section_loc(ehdr, relhdr);
+
+	*sym0 = (Elf_Sym const *)get_section_loc(ehdr, symsec);
+
+	*str0 = (char const *)get_section_loc(ehdr, strsec);
+
+	*relp = rel0;
+}
+
+/*
+ * Read the relocation table again, but this time its called on sections
+ * that are not going to be traced. The mcount calls here will be converted
+ * into nops.
+ */
+static void FUNC(nop_jump_label)(Elf_Shdr const *const relhdr,
+		       Elf_Ehdr const *const ehdr,
+		       const char *const txtname)
+{
+	Elf_Shdr *const shdr0 = get_shdr(ehdr);
+	Elf_Sym const *sym0;
+	char const *str0;
+	Elf_Rel const *relp;
+	Elf_Rela const *relap;
+	Elf_Shdr const *const shdr = &shdr0[w(relhdr->sh_info)];
+	unsigned rel_entsize = w(relhdr->sh_entsize);
+	unsigned const nrel = _w(relhdr->sh_size) / rel_entsize;
+	int t;
+
+	FUNC(get_sym_str_and_relp)(relhdr, ehdr, &sym0, &str0, &relp);
+
+	for (t = nrel; t > 0; t -= 3) {
+		int ret = -1;
+
+		relap = (Elf_Rela const *)relp;
+		printf("rel offset=%lx info=%lx sym=%lx type=%lx addend=%lx\n",
+		       (long)relap->r_offset, (long)relap->r_info,
+		       (long)ELF_R_SYM(relap->r_info),
+		       (long)ELF_R_TYPE(relap->r_info),
+		       (long)relap->r_addend);
+
+		if (0 && make_nop)
+			ret = make_nop((void *)ehdr, shdr->sh_offset + relp->r_offset);
+
+		/* jump label sections are paired in threes */
+		relp = (Elf_Rel const *)(rel_entsize * 3 + (void *)relp);
+	}
+}
+
+/* Find relocation section hdr for a given section */
+static const Elf_Shdr *
+FUNC(find_relhdr)(const Elf_Ehdr *ehdr, const Elf_Shdr *shdr)
+{
+	const Elf_Shdr *shdr0 = get_shdr(ehdr);
+	int nhdr = w2(ehdr->e_shnum);
+	const Elf_Shdr *hdr;
+	int i;
+
+	for (hdr = shdr0, i = 0; i < nhdr; hdr = &shdr0[++i]) {
+		if (w(hdr->sh_type) != SHT_REL &&
+		    w(hdr->sh_type) != SHT_RELA)
+			continue;
+
+		/*
+		 * The relocation section's info field holds
+		 * the section index that it represents.
+		 */
+		if (shdr == &shdr0[w(hdr->sh_info)])
+			return hdr;
+	}
+	return NULL;
+}
+
+/* Find a section headr based on name and type */
+static const Elf_Shdr *
+FUNC(find_shdr)(const Elf_Ehdr *ehdr, const char *name, uint_t type)
+{
+	const Elf_Shdr *shdr0 = get_shdr(ehdr);
+	const Elf_Shdr *shstr = &shdr0[w2(ehdr->e_shstrndx)];
+	const char *shstrtab = (char *)get_section_loc(ehdr, shstr);
+	int nhdr = w2(ehdr->e_shnum);
+	const Elf_Shdr *hdr;
+	const char *hdrname;
+	int i;
+
+	for (hdr = shdr0, i = 0; i < nhdr; hdr = &shdr0[++i]) {
+		if (w(hdr->sh_type) != type)
+			continue;
+
+		/* If we are just looking for a section by type (ie. SYMTAB) */
+		if (!name)
+			return hdr;
+
+		hdrname = &shstrtab[w(hdr->sh_name)];
+		if (strcmp(hdrname, name) == 0)
+			return hdr;
+	}
+	return NULL;
+}
+
+static void
+FUNC(section_update)(const Elf_Ehdr *ehdr, const Elf_Shdr *symhdr,
+		     unsigned shtype, const Elf_Rel *rel, void *data)
+{
+	const Elf_Shdr *shdr0 = get_shdr(ehdr);
+	const Elf_Shdr *targethdr;
+	const Elf_Rela *rela;
+	const Elf_Sym *syment;
+	uint_t offset = _w(rel->r_offset);
+	uint_t info = _w(rel->r_info);
+	uint_t sym = ELF_R_SYM(info);
+	uint_t type = ELF_R_TYPE(info);
+	uint_t addend;
+	uint_t targetloc;
+
+	if (shtype == SHT_RELA) {
+		rela = (const Elf_Rela *)rel;
+		addend = _w(rela->r_addend);
+	} else
+		addend = _w(*(int *)(data + offset));
+
+	syment = (const Elf_Sym *)get_section_loc(ehdr, symhdr);
+	targethdr = &shdr0[w2(syment[sym].st_shndx)];
+	targetloc = _w(targethdr->sh_offset);
+
+	/* TODO, need a separate function for all archs */
+	if (type != R_386_32)
+		die(NULL, "Arch relocation type %d not supported", type);
+
+	targetloc += addend;
+
+#if 0
+	printf("offset=%x target=%x shoffset=%x add=%x indx=%x\n",
+	       offset, targetloc, _w(targethdr->sh_offset), addend,
+	       targetloc - _w(targethdr->sh_offset));
+#endif
+	*(uint_t *)(data + offset) = targetloc;
+}
+
+/* Overall supervision for Elf32 ET_REL file. */
+static void
+FUNC(do_func)(Elf_Ehdr *ehdr, char const *const fname, unsigned const reltype)
+{
+	const Elf_Shdr *jlshdr;
+	const Elf_Shdr *jlrhdr;
+	const Elf_Shdr *symhdr;
+	const Elf_Rel *rel;
+	unsigned size;
+	unsigned cnt;
+	unsigned i;
+	uint_t type;
+	void *jdata;
+	void *data;
+
+	jlshdr = FUNC(find_shdr)(ehdr, "__jump_table", SHT_PROGBITS);
+	if (!jlshdr)
+		return;
+
+	jlrhdr = FUNC(find_relhdr)(ehdr, jlshdr);
+	if (!jlrhdr)
+		return;
+
+	/*
+	 * Create and fill in the __jump_table section and use it to
+	 * find the offsets into the text that we want to update.
+	 * We create it so that we do not depend on the order of the
+	 * relocations, and use the table directly, as it is broken
+	 * up into sections.
+	 */
+	size = _w(jlshdr->sh_size);
+	data = umalloc(size);
+
+	jdata = (void *)get_section_loc(ehdr, jlshdr);
+	memcpy(data, jdata, size);
+
+	cnt = _w(jlrhdr->sh_size) / w(jlrhdr->sh_entsize);
+
+	rel = (const Elf_Rel *)get_section_loc(ehdr, jlrhdr);
+
+	/* Is this as Rel or Rela? */
+	type = w(jlrhdr->sh_type);
+
+	symhdr = FUNC(find_shdr)(ehdr, NULL, SHT_SYMTAB);
+
+	for (i = 0; i < cnt; i++) {
+		FUNC(section_update)(ehdr, symhdr, type, rel, data);
+		rel = (void *)rel + w(jlrhdr->sh_entsize);
+	}
+
+	/*
+	 * This is specific to x86. The jump_table is stored in three
+	 * long words. The first is the location of the jmp target we
+	 * must update.
+	 */
+	cnt = size / sizeof(uint_t);
+
+	for (i = 0; i < cnt; i += 3)
+		make_nop((void *)ehdr, *(uint_t *)(data + i * sizeof(uint_t)));
+
+	free(data);
+}
-- 
1.7.8.3



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/2] jump labels/x86: Use etiher 5 byte or 2 byte jumps
  2012-01-18 19:53 [PATCH 0/2] [RFC] jump-label/x86: Compress jmps to 2 bytes where possible Steven Rostedt
  2012-01-18 19:53 ` [PATCH 1/2] jump labels: Add infrastructure to update jump labels at compile time Steven Rostedt
@ 2012-01-18 19:53 ` Steven Rostedt
  2012-01-19 12:22   ` Ingo Molnar
  2012-01-19 14:41   ` Mathieu Desnoyers
  1 sibling, 2 replies; 12+ messages in thread
From: Steven Rostedt @ 2012-01-18 19:53 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Frederic Weisbecker,
	H. Peter Anvin, Mathieu Desnoyers, Jason Baron

[-- Attachment #1: 0002-jump-labels-x86-Use-etiher-5-byte-or-2-byte-jumps.patch --]
[-- Type: text/plain, Size: 5389 bytes --]

From: Steven Rostedt <srostedt@redhat.com>

Have the jump labels add a "jmp" in the assembly instead
of a default nop. This will cause the assembler to put in
either a 2 byte or 5 byte jmp depending on where the target
lable is.

Then at compile time, the update_jump_label code will replace
the jmps with either 2 or 5 byte nops.

On boot up, the code can be examined to see if the jump label
uses either a 2 or 5 byte nop and replace it.

By allowing the jump labels to be 2 bytes, it speeds up the
nops, not only 2 byte nops are faster than 5 byte nops, but also
because it saves on cache foot print.

   text    data     bss     dec     hex filename
13403667 3666856 2998272 20068795 13239bb ../nobackup/mxtest/vmlinux-old
13398536 3666856 2998272 20063664 13225b0 ../nobackup/mxtest/vmlinux-new

Converting the current v3.2 trace points saved 5,131 bytes.
As more places use jump labels, this will have a bigger savings.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 arch/x86/Kconfig                  |    1 +
 arch/x86/include/asm/jump_label.h |    2 +-
 arch/x86/kernel/jump_label.c      |   86 ++++++++++++++++++++++++++++++++++---
 3 files changed, 81 insertions(+), 8 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index efb4294..b5004c1 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -61,6 +61,7 @@ config X86
 	select HAVE_ARCH_KMEMCHECK
 	select HAVE_USER_RETURN_NOTIFIER
 	select HAVE_ARCH_JUMP_LABEL
+	select HAVE_BUILD_TIME_JUMP_LABEL
 	select HAVE_TEXT_POKE_SMP
 	select HAVE_GENERIC_HARDIRQS
 	select HAVE_SPARSE_IRQ
diff --git a/arch/x86/include/asm/jump_label.h b/arch/x86/include/asm/jump_label.h
index a32b18c..872b3e1 100644
--- a/arch/x86/include/asm/jump_label.h
+++ b/arch/x86/include/asm/jump_label.h
@@ -14,7 +14,7 @@
 static __always_inline bool arch_static_branch(struct jump_label_key *key)
 {
 	asm goto("1:"
-		JUMP_LABEL_INITIAL_NOP
+		"jmp %l[l_yes]\n"
 		".pushsection __jump_table,  \"aw\" \n\t"
 		_ASM_ALIGN "\n\t"
 		_ASM_PTR "1b, %l[l_yes], %c0 \n\t"
diff --git a/arch/x86/kernel/jump_label.c b/arch/x86/kernel/jump_label.c
index ea9d5f2f..d5b84de 100644
--- a/arch/x86/kernel/jump_label.c
+++ b/arch/x86/kernel/jump_label.c
@@ -16,12 +16,27 @@
 
 #ifdef HAVE_JUMP_LABEL
 
+static unsigned char nop_short[] = { P6_NOP2 };
+
+/* These are the nops added at compile time */
+#ifdef CONFIG_X86_32
+static unsigned char default_nop[5] = { 0x3e, 0x8d, 0x74, 0x26, 0x00 };
+#else
+static unsigned char default_nop[5] = { 0x0f, 0x1f, 0x44, 0x00, 0x00 };
+#endif
+
+static int update_nops;
+
 union jump_code_union {
 	char code[JUMP_LABEL_NOP_SIZE];
 	struct {
 		char jump;
 		int offset;
-	} __attribute__((packed));
+	} __packed;
+	struct {
+		char jump_short;
+		char offset_short;
+	} __packed;
 };
 
 static void __jump_label_transform(struct jump_entry *entry,
@@ -29,20 +44,70 @@ static void __jump_label_transform(struct jump_entry *entry,
 				   void *(*poker)(void *, const void *, size_t))
 {
 	union jump_code_union code;
+	unsigned char nop;
+	unsigned char op;
+	unsigned size;
+	void *ip = (void *)entry->code;
+	void *ideal = (void *)ideal_nops[NOP_ATOMIC5];
+
+	/* Use probe_kernel_read()? */
+	op = *(unsigned char *)ip;
+	nop = ideal_nops[NOP_ATOMIC5][0];
 
 	if (type == JUMP_LABEL_ENABLE) {
-		code.jump = 0xe9;
-		code.offset = entry->target -
-				(entry->code + JUMP_LABEL_NOP_SIZE);
-	} else
-		memcpy(&code, ideal_nops[NOP_ATOMIC5], JUMP_LABEL_NOP_SIZE);
+		if (op == 0xe9 || op == 0xeb)
+			/* Already enabled. Warn? */
+			return;
+
+		if (memcmp(ip, nop_short, 2) == 0) {
+			size = 2;
+			code.jump_short = 0xeb;
+			code.offset = entry->target -
+				(entry->code + 2);
+			/* Check for overflow ? */
+		} else if (memcmp(ip, ideal, 5) == 0 ||
+			   memcmp(ip, default_nop, 5)) {
+			size = JUMP_LABEL_NOP_SIZE;
+			code.jump = 0xe9;
+			code.offset = entry->target - (entry->code + size);
+		} else
+			BUG();
 
-	(*poker)((void *)entry->code, &code, JUMP_LABEL_NOP_SIZE);
+	} else {
+		/* Check if already disabled */
+		if (memcmp(ip, nop_short, 2) == 0)
+			return;
+
+		if (memcmp(ip, ideal, 5) == 0)
+			return;
+
+		/* This may need to update from default to ideal */
+		if (update_nops && memcmp(ip, default_nop, 5) == 0) {
+			/* Set to the ideal nop */
+			size = JUMP_LABEL_NOP_SIZE;
+			memcpy(&code, ideal_nops[NOP_ATOMIC5], size);
+
+		} else if (op == 0xe9) {
+			/* Replace a 5 byte jmp */
+			size = JUMP_LABEL_NOP_SIZE;
+			memcpy(&code, ideal_nops[NOP_ATOMIC5], size);
+		} else if (op == 0xeb) {
+			/* Replace a 2 byte jmp */
+			size = 2;
+			memcpy(&code, nop_short, size);
+		} else
+			BUG();
+	}
+
+	(*poker)((void *)entry->code, &code, size);
 }
 
 void arch_jump_label_transform(struct jump_entry *entry,
 			       enum jump_label_type type)
 {
+	/* All nops should be updated to the ideal nop by now */
+	update_nops = 0;
+
 	get_online_cpus();
 	mutex_lock(&text_mutex);
 	__jump_label_transform(entry, type, text_poke_smp);
@@ -53,6 +118,13 @@ void arch_jump_label_transform(struct jump_entry *entry,
 void arch_jump_label_transform_static(struct jump_entry *entry,
 				      enum jump_label_type type)
 {
+	/*
+	 * If the default nop does not equal the ideal nop, then
+	 * update them.
+	 */
+	if (memcmp(default_nop, ideal_nops[NOP_ATOMIC5], 5) != 0)
+		update_nops = 1;
+
 	__jump_label_transform(entry, type, text_poke_early);
 }
 
-- 
1.7.8.3



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] jump labels/x86: Use etiher 5 byte or 2 byte jumps
  2012-01-18 19:53 ` [PATCH 2/2] jump labels/x86: Use etiher 5 byte or 2 byte jumps Steven Rostedt
@ 2012-01-19 12:22   ` Ingo Molnar
  2012-01-19 14:41   ` Mathieu Desnoyers
  1 sibling, 0 replies; 12+ messages in thread
From: Ingo Molnar @ 2012-01-19 12:22 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Andrew Morton, Thomas Gleixner,
	Frederic Weisbecker, H. Peter Anvin, Mathieu Desnoyers,
	Jason Baron


* Steven Rostedt <rostedt@goodmis.org> wrote:

> From: Steven Rostedt <srostedt@redhat.com>
> 
> Have the jump labels add a "jmp" in the assembly instead
> of a default nop. This will cause the assembler to put in
> either a 2 byte or 5 byte jmp depending on where the target
> lable is.
> 
> Then at compile time, the update_jump_label code will replace
> the jmps with either 2 or 5 byte nops.
> 
> On boot up, the code can be examined to see if the jump label
> uses either a 2 or 5 byte nop and replace it.
> 
> By allowing the jump labels to be 2 bytes, it speeds up the
> nops, not only 2 byte nops are faster than 5 byte nops, but also
> because it saves on cache foot print.
> 
>    text    data     bss     dec     hex filename
> 13403667 3666856 2998272 20068795 13239bb ../nobackup/mxtest/vmlinux-old
> 13398536 3666856 2998272 20063664 13225b0 ../nobackup/mxtest/vmlinux-new
> 
> Converting the current v3.2 trace points saved 5,131 bytes.
> As more places use jump labels, this will have a bigger savings.

Yummie!

This is very nice, as that's 5K saved from rather hot codepaths 
quite often, so this feature increases the effective instruction 
cache size of the CPU in essence.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/2] jump labels: Add infrastructure to update jump labels at compile time
  2012-01-18 19:53 ` [PATCH 1/2] jump labels: Add infrastructure to update jump labels at compile time Steven Rostedt
@ 2012-01-19 14:24   ` Mathieu Desnoyers
  2012-01-19 14:52     ` Steven Rostedt
  0 siblings, 1 reply; 12+ messages in thread
From: Mathieu Desnoyers @ 2012-01-19 14:24 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Frederic Weisbecker, H. Peter Anvin, Jason Baron

Hi Steven,

I really like this patchset! A few comments below,

* Steven Rostedt (rostedt@goodmis.org) wrote:
> From: Steven Rostedt <srostedt@redhat.com>
> 
> Add the infrastructure to allow architectures to modify the jump label
> locations at compile time. This is mainly for x86, where the jmps may

is it "at compile time" or "after compilation" ? AFAIU, this
update_jump_label script gets executed on the .o, which makes it
technically more part of the linking stage than the compilation
stage.

> be either 2 bytes or 5 bytes. Instead of wasting 5 bytes for all jump labels,
> this code will let x86 put in a jmp instead of a 5 byte nop. Then the
> assembler will make either a 2 byte or 5 byte jmp depending on where
> the target is.
> 
> At compile time, this code will replace the jmps with either a 2 byte
> or 5 byte nop depending on the size of the jmp that was added.
> 

A small note somewhere saying that this 2 vs 5 bytes choice done by the
compiler is a pessimistic approximation would be good, so people don't
end up complaining if they see that the compiler chose a 5-byte nop for
a 120 bytes offset.

> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> ---
>  Makefile                    |    7 +
>  arch/Kconfig                |    6 +
>  scripts/Makefile            |    1 +
>  scripts/Makefile.build      |   15 ++-
>  scripts/update_jump_label.c |  335 +++++++++++++++++++++++++++++++++++++++++++
>  scripts/update_jump_label.h |  283 ++++++++++++++++++++++++++++++++++++
>  6 files changed, 645 insertions(+), 2 deletions(-)
>  create mode 100644 scripts/update_jump_label.c
>  create mode 100644 scripts/update_jump_label.h
> 
[...]
> diff --git a/scripts/update_jump_label.c b/scripts/update_jump_label.c
> new file mode 100644
> index 0000000..399d898
> --- /dev/null
> +++ b/scripts/update_jump_label.c
> @@ -0,0 +1,335 @@
> +/*
> + * update_jump_label.c: replace jmps with nops at compile time.
> + * Copyright 2010 Steven Rostedt <srostedt@redhat.com>, Red Hat Inc.
> + *  Parsing of the elf file was influenced by recordmcount.c
> + *  originally written by and copyright to John F. Reiser <jreiser@BitWagon.com>.
> + */
> +
> +/*
> + * Note, this code is originally designed for x86, but may be used by
> + * other archs to do the nop updates at compile time instead of at boot time.
> + * X86 uses this as an optimization, as jmps can be either 2 bytes or 5 bytes.
> + * Inserting a 2 byte where possible helps with both CPU performance and
> + * icache strain.
> + */
> +#include <sys/types.h>
> +#include <sys/mman.h>
> +#include <sys/stat.h>
> +#include <getopt.h>
> +#include <elf.h>
> +#include <fcntl.h>
> +#include <setjmp.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <stdarg.h>
> +#include <string.h>
> +#include <unistd.h>
> +
> +static int fd_map;	/* File descriptor for file being modified. */
> +static struct stat sb;	/* Remember .st_size, etc. */
> +static int mmap_failed; /* Boolean flag. */
> +
> +static void die(const char *err, const char *fmt, ...)
> +{
> +	va_list ap;
> +
> +	if (err)
> +		perror(err);
> +
> +	if (fmt) {
> +		va_start(ap, fmt);
> +		fprintf(stderr, "Fatal error:  ");
> +		vfprintf(stderr, fmt, ap);
> +		fprintf(stderr, "\n");
> +		va_end(ap);
> +	}
> +
> +	exit(1);
> +}
> +
> +static void usage(char **argv)
> +{
> +	char *arg = argv[0];
> +	char *p = arg+strlen(arg);

I'm pretty sure checkpatch will complain about the coding style of this
file. I won't point out the various nits, but there is a need for
cleanup before pulling it.

> +
> +	while (p >= arg && *p != '/')
> +		p--;
> +	p++;
> +
> +	printf("usage: %s file\n"
> +	       "\n", p);
> +	exit(-1);
> +}

[...]

> +/*
> + * Read the relocation table again, but this time its called on sections
> + * that are not going to be traced. The mcount calls here will be converted

mcount calls -> jump labels

> + * into nops.
> + */
> +static void FUNC(nop_jump_label)(Elf_Shdr const *const relhdr,
> +		       Elf_Ehdr const *const ehdr,
> +		       const char *const txtname)
> +{
> +	Elf_Shdr *const shdr0 = get_shdr(ehdr);
> +	Elf_Sym const *sym0;
> +	char const *str0;
> +	Elf_Rel const *relp;
> +	Elf_Rela const *relap;
> +	Elf_Shdr const *const shdr = &shdr0[w(relhdr->sh_info)];
> +	unsigned rel_entsize = w(relhdr->sh_entsize);
> +	unsigned const nrel = _w(relhdr->sh_size) / rel_entsize;
> +	int t;
> +
> +	FUNC(get_sym_str_and_relp)(relhdr, ehdr, &sym0, &str0, &relp);
> +
> +	for (t = nrel; t > 0; t -= 3) {
> +		int ret = -1;
> +
> +		relap = (Elf_Rela const *)relp;
> +		printf("rel offset=%lx info=%lx sym=%lx type=%lx addend=%lx\n",
> +		       (long)relap->r_offset, (long)relap->r_info,
> +		       (long)ELF_R_SYM(relap->r_info),
> +		       (long)ELF_R_TYPE(relap->r_info),
> +		       (long)relap->r_addend);
> +
> +		if (0 && make_nop)
> +			ret = make_nop((void *)ehdr, shdr->sh_offset + relp->r_offset);

if (0 &&  -> leftover code ?

This whole FUNC(nop_jump_label) function seems to be unused.

> +
> +		/* jump label sections are paired in threes */
> +		relp = (Elf_Rel const *)(rel_entsize * 3 + (void *)relp);
> +	}
> +}
> +
> +/* Find relocation section hdr for a given section */
> +static const Elf_Shdr *
> +FUNC(find_relhdr)(const Elf_Ehdr *ehdr, const Elf_Shdr *shdr)
> +{
> +	const Elf_Shdr *shdr0 = get_shdr(ehdr);
> +	int nhdr = w2(ehdr->e_shnum);
> +	const Elf_Shdr *hdr;
> +	int i;
> +
> +	for (hdr = shdr0, i = 0; i < nhdr; hdr = &shdr0[++i]) {
> +		if (w(hdr->sh_type) != SHT_REL &&
> +		    w(hdr->sh_type) != SHT_RELA)
> +			continue;
> +
> +		/*
> +		 * The relocation section's info field holds
> +		 * the section index that it represents.
> +		 */
> +		if (shdr == &shdr0[w(hdr->sh_info)])
> +			return hdr;
> +	}
> +	return NULL;
> +}
> +
> +/* Find a section headr based on name and type */
> +static const Elf_Shdr *
> +FUNC(find_shdr)(const Elf_Ehdr *ehdr, const char *name, uint_t type)
> +{
> +	const Elf_Shdr *shdr0 = get_shdr(ehdr);
> +	const Elf_Shdr *shstr = &shdr0[w2(ehdr->e_shstrndx)];
> +	const char *shstrtab = (char *)get_section_loc(ehdr, shstr);
> +	int nhdr = w2(ehdr->e_shnum);
> +	const Elf_Shdr *hdr;
> +	const char *hdrname;
> +	int i;
> +
> +	for (hdr = shdr0, i = 0; i < nhdr; hdr = &shdr0[++i]) {
> +		if (w(hdr->sh_type) != type)
> +			continue;
> +
> +		/* If we are just looking for a section by type (ie. SYMTAB) */
> +		if (!name)
> +			return hdr;
> +
> +		hdrname = &shstrtab[w(hdr->sh_name)];
> +		if (strcmp(hdrname, name) == 0)
> +			return hdr;
> +	}
> +	return NULL;
> +}
> +
> +static void
> +FUNC(section_update)(const Elf_Ehdr *ehdr, const Elf_Shdr *symhdr,
> +		     unsigned shtype, const Elf_Rel *rel, void *data)
> +{
> +	const Elf_Shdr *shdr0 = get_shdr(ehdr);
> +	const Elf_Shdr *targethdr;
> +	const Elf_Rela *rela;
> +	const Elf_Sym *syment;
> +	uint_t offset = _w(rel->r_offset);
> +	uint_t info = _w(rel->r_info);
> +	uint_t sym = ELF_R_SYM(info);
> +	uint_t type = ELF_R_TYPE(info);
> +	uint_t addend;
> +	uint_t targetloc;
> +
> +	if (shtype == SHT_RELA) {
> +		rela = (const Elf_Rela *)rel;
> +		addend = _w(rela->r_addend);
> +	} else
> +		addend = _w(*(int *)(data + offset));
> +
> +	syment = (const Elf_Sym *)get_section_loc(ehdr, symhdr);
> +	targethdr = &shdr0[w2(syment[sym].st_shndx)];
> +	targetloc = _w(targethdr->sh_offset);
> +
> +	/* TODO, need a separate function for all archs */

maybe this is where you want to call FUNC(nop_jump_label) too ?

> +	if (type != R_386_32)
> +		die(NULL, "Arch relocation type %d not supported", type);
> +
> +	targetloc += addend;
> +
> +#if 0
> +	printf("offset=%x target=%x shoffset=%x add=%x indx=%x\n",
> +	       offset, targetloc, _w(targethdr->sh_offset), addend,
> +	       targetloc - _w(targethdr->sh_offset));
> +#endif
> +	*(uint_t *)(data + offset) = targetloc;
> +}
> +
> +/* Overall supervision for Elf32 ET_REL file. */
> +static void
> +FUNC(do_func)(Elf_Ehdr *ehdr, char const *const fname, unsigned const reltype)
> +{
> +	const Elf_Shdr *jlshdr;
> +	const Elf_Shdr *jlrhdr;
> +	const Elf_Shdr *symhdr;
> +	const Elf_Rel *rel;
> +	unsigned size;
> +	unsigned cnt;
> +	unsigned i;
> +	uint_t type;
> +	void *jdata;
> +	void *data;
> +
> +	jlshdr = FUNC(find_shdr)(ehdr, "__jump_table", SHT_PROGBITS);
> +	if (!jlshdr)
> +		return;
> +
> +	jlrhdr = FUNC(find_relhdr)(ehdr, jlshdr);
> +	if (!jlrhdr)
> +		return;
> +
> +	/*
> +	 * Create and fill in the __jump_table section and use it to
> +	 * find the offsets into the text that we want to update.
> +	 * We create it so that we do not depend on the order of the
> +	 * relocations, and use the table directly, as it is broken
> +	 * up into sections.
> +	 */
> +	size = _w(jlshdr->sh_size);
> +	data = umalloc(size);
> +
> +	jdata = (void *)get_section_loc(ehdr, jlshdr);
> +	memcpy(data, jdata, size);
> +
> +	cnt = _w(jlrhdr->sh_size) / w(jlrhdr->sh_entsize);
> +
> +	rel = (const Elf_Rel *)get_section_loc(ehdr, jlrhdr);
> +
> +	/* Is this as Rel or Rela? */
> +	type = w(jlrhdr->sh_type);
> +
> +	symhdr = FUNC(find_shdr)(ehdr, NULL, SHT_SYMTAB);
> +
> +	for (i = 0; i < cnt; i++) {
> +		FUNC(section_update)(ehdr, symhdr, type, rel, data);
> +		rel = (void *)rel + w(jlrhdr->sh_entsize);
> +	}
> +
> +	/*
> +	 * This is specific to x86. The jump_table is stored in three
> +	 * long words. The first is the location of the jmp target we
> +	 * must update.
> +	 */
> +	cnt = size / sizeof(uint_t);
> +
> +	for (i = 0; i < cnt; i += 3)
> +		make_nop((void *)ehdr, *(uint_t *)(data + i * sizeof(uint_t)));
> +

Why is this function iterating twice on the __jump_table section rather
than simply writing the nops in the first pass, within the
section_update function ?

Thanks,

Mathieu


> +	free(data);
> +}
> -- 
> 1.7.8.3
> 
> 

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] jump labels/x86: Use etiher 5 byte or 2 byte jumps
  2012-01-18 19:53 ` [PATCH 2/2] jump labels/x86: Use etiher 5 byte or 2 byte jumps Steven Rostedt
  2012-01-19 12:22   ` Ingo Molnar
@ 2012-01-19 14:41   ` Mathieu Desnoyers
  2012-01-19 14:46     ` H. Peter Anvin
  2012-01-19 14:56     ` Steven Rostedt
  1 sibling, 2 replies; 12+ messages in thread
From: Mathieu Desnoyers @ 2012-01-19 14:41 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Frederic Weisbecker, H. Peter Anvin, Jason Baron

* Steven Rostedt (rostedt@goodmis.org) wrote:
> From: Steven Rostedt <srostedt@redhat.com>
> 
> Have the jump labels add a "jmp" in the assembly instead
> of a default nop. This will cause the assembler to put in
> either a 2 byte or 5 byte jmp depending on where the target
> lable is.
> 
> Then at compile time, the update_jump_label code will replace
> the jmps with either 2 or 5 byte nops.
> 
> On boot up, the code can be examined to see if the jump label
> uses either a 2 or 5 byte nop and replace it.
> 
> By allowing the jump labels to be 2 bytes, it speeds up the
> nops, not only 2 byte nops are faster than 5 byte nops, but also
> because it saves on cache foot print.
> 
>    text    data     bss     dec     hex filename
> 13403667 3666856 2998272 20068795 13239bb ../nobackup/mxtest/vmlinux-old
> 13398536 3666856 2998272 20063664 13225b0 ../nobackup/mxtest/vmlinux-new
> 
> Converting the current v3.2 trace points saved 5,131 bytes.
> As more places use jump labels, this will have a bigger savings.
> 
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> ---
>  arch/x86/Kconfig                  |    1 +
>  arch/x86/include/asm/jump_label.h |    2 +-
>  arch/x86/kernel/jump_label.c      |   86 ++++++++++++++++++++++++++++++++++---
>  3 files changed, 81 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index efb4294..b5004c1 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -61,6 +61,7 @@ config X86
>  	select HAVE_ARCH_KMEMCHECK
>  	select HAVE_USER_RETURN_NOTIFIER
>  	select HAVE_ARCH_JUMP_LABEL
> +	select HAVE_BUILD_TIME_JUMP_LABEL
>  	select HAVE_TEXT_POKE_SMP
>  	select HAVE_GENERIC_HARDIRQS
>  	select HAVE_SPARSE_IRQ
> diff --git a/arch/x86/include/asm/jump_label.h b/arch/x86/include/asm/jump_label.h
> index a32b18c..872b3e1 100644
> --- a/arch/x86/include/asm/jump_label.h
> +++ b/arch/x86/include/asm/jump_label.h
> @@ -14,7 +14,7 @@
>  static __always_inline bool arch_static_branch(struct jump_label_key *key)
>  {
>  	asm goto("1:"
> -		JUMP_LABEL_INITIAL_NOP
> +		"jmp %l[l_yes]\n"

Is it possible that the compiler choose a jump that is not 2 or 5-byte ?
e.g. a jmp rel16 (e9 opcode) on 32-bit x86, or any of the other
instruction listed under the JMP-Jump instruction in the Intel insn
manual ?

>  		".pushsection __jump_table,  \"aw\" \n\t"
>  		_ASM_ALIGN "\n\t"
>  		_ASM_PTR "1b, %l[l_yes], %c0 \n\t"
> diff --git a/arch/x86/kernel/jump_label.c b/arch/x86/kernel/jump_label.c
> index ea9d5f2f..d5b84de 100644
> --- a/arch/x86/kernel/jump_label.c
> +++ b/arch/x86/kernel/jump_label.c
> @@ -16,12 +16,27 @@
>  
>  #ifdef HAVE_JUMP_LABEL
>  
> +static unsigned char nop_short[] = { P6_NOP2 };
> +
> +/* These are the nops added at compile time */
> +#ifdef CONFIG_X86_32
> +static unsigned char default_nop[5] = { 0x3e, 0x8d, 0x74, 0x26, 0x00 };
> +#else
> +static unsigned char default_nop[5] = { 0x0f, 0x1f, 0x44, 0x00, 0x00 };
> +#endif
> +
> +static int update_nops;
> +
>  union jump_code_union {
>  	char code[JUMP_LABEL_NOP_SIZE];
>  	struct {
>  		char jump;
>  		int offset;
> -	} __attribute__((packed));
> +	} __packed;
> +	struct {
> +		char jump_short;
> +		char offset_short;
> +	} __packed;
>  };
>  
>  static void __jump_label_transform(struct jump_entry *entry,
> @@ -29,20 +44,70 @@ static void __jump_label_transform(struct jump_entry *entry,
>  				   void *(*poker)(void *, const void *, size_t))
>  {
>  	union jump_code_union code;
> +	unsigned char nop;
> +	unsigned char op;
> +	unsigned size;
> +	void *ip = (void *)entry->code;
> +	void *ideal = (void *)ideal_nops[NOP_ATOMIC5];

"void *" should possibly be "unsigned char *" here to respect the nop
place-holder typing.

> +
> +	/* Use probe_kernel_read()? */
> +	op = *(unsigned char *)ip;
> +	nop = ideal_nops[NOP_ATOMIC5][0];
>  
>  	if (type == JUMP_LABEL_ENABLE) {
> -		code.jump = 0xe9;
> -		code.offset = entry->target -
> -				(entry->code + JUMP_LABEL_NOP_SIZE);
> -	} else
> -		memcpy(&code, ideal_nops[NOP_ATOMIC5], JUMP_LABEL_NOP_SIZE);
> +		if (op == 0xe9 || op == 0xeb)
> +			/* Already enabled. Warn? */

This could be caused by failure to run the link-time script, or running
the transform twice. A warning would indeed be welcome, as this should
never happen.

> +			return;
> +
> +		if (memcmp(ip, nop_short, 2) == 0) {
> +			size = 2;
> +			code.jump_short = 0xeb;
> +			code.offset = entry->target -
> +				(entry->code + 2);
> +			/* Check for overflow ? */
> +		} else if (memcmp(ip, ideal, 5) == 0 ||
> +			   memcmp(ip, default_nop, 5)) {
> +			size = JUMP_LABEL_NOP_SIZE;
> +			code.jump = 0xe9;
> +			code.offset = entry->target - (entry->code + size);
> +		} else
> +			BUG();
>  
> -	(*poker)((void *)entry->code, &code, JUMP_LABEL_NOP_SIZE);
> +	} else {
> +		/* Check if already disabled */
> +		if (memcmp(ip, nop_short, 2) == 0)
> +			return;
> +
> +		if (memcmp(ip, ideal, 5) == 0)
> +			return;
> +
> +		/* This may need to update from default to ideal */
> +		if (update_nops && memcmp(ip, default_nop, 5) == 0) {
> +			/* Set to the ideal nop */
> +			size = JUMP_LABEL_NOP_SIZE;
> +			memcpy(&code, ideal_nops[NOP_ATOMIC5], size);

whiteline.

> +
> +		} else if (op == 0xe9) {
> +			/* Replace a 5 byte jmp */
> +			size = JUMP_LABEL_NOP_SIZE;
> +			memcpy(&code, ideal_nops[NOP_ATOMIC5], size);
> +		} else if (op == 0xeb) {
> +			/* Replace a 2 byte jmp */
> +			size = 2;
> +			memcpy(&code, nop_short, size);
> +		} else
> +			BUG();
> +	}

Thanks,

Mathieu

> +
> +	(*poker)((void *)entry->code, &code, size);
>  }
>  
>  void arch_jump_label_transform(struct jump_entry *entry,
>  			       enum jump_label_type type)
>  {
> +	/* All nops should be updated to the ideal nop by now */
> +	update_nops = 0;
> +
>  	get_online_cpus();
>  	mutex_lock(&text_mutex);
>  	__jump_label_transform(entry, type, text_poke_smp);
> @@ -53,6 +118,13 @@ void arch_jump_label_transform(struct jump_entry *entry,
>  void arch_jump_label_transform_static(struct jump_entry *entry,
>  				      enum jump_label_type type)
>  {
> +	/*
> +	 * If the default nop does not equal the ideal nop, then
> +	 * update them.
> +	 */
> +	if (memcmp(default_nop, ideal_nops[NOP_ATOMIC5], 5) != 0)
> +		update_nops = 1;
> +
>  	__jump_label_transform(entry, type, text_poke_early);
>  }
>  
> -- 
> 1.7.8.3
> 
> 

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] jump labels/x86: Use etiher 5 byte or 2 byte jumps
  2012-01-19 14:41   ` Mathieu Desnoyers
@ 2012-01-19 14:46     ` H. Peter Anvin
  2012-01-19 14:58       ` Steven Rostedt
  2012-01-19 14:56     ` Steven Rostedt
  1 sibling, 1 reply; 12+ messages in thread
From: H. Peter Anvin @ 2012-01-19 14:46 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Steven Rostedt, linux-kernel, Ingo Molnar, Andrew Morton,
	Thomas Gleixner, Frederic Weisbecker, Jason Baron

On 01/19/2012 06:41 AM, Mathieu Desnoyers wrote:
>> diff --git a/arch/x86/include/asm/jump_label.h b/arch/x86/include/asm/jump_label.h
>> index a32b18c..872b3e1 100644
>> --- a/arch/x86/include/asm/jump_label.h
>> +++ b/arch/x86/include/asm/jump_label.h
>> @@ -14,7 +14,7 @@
>>  static __always_inline bool arch_static_branch(struct jump_label_key *key)
>>  {
>>  	asm goto("1:"
>> -		JUMP_LABEL_INITIAL_NOP
>> +		"jmp %l[l_yes]\n"
> 
> Is it possible that the compiler choose a jump that is not 2 or 5-byte ?
> e.g. a jmp rel16 (e9 opcode) on 32-bit x86, or any of the other
> instruction listed under the JMP-Jump instruction in the Intel insn
> manual ?
> 

No.

>>  				   void *(*poker)(void *, const void *, size_t))
>>  {
>>  	union jump_code_union code;
>> +	unsigned char nop;
>> +	unsigned char op;
>> +	unsigned size;
>> +	void *ip = (void *)entry->code;
>> +	void *ideal = (void *)ideal_nops[NOP_ATOMIC5];
> 
> "void *" should possibly be "unsigned char *" here to respect the nop
> place-holder typing.
> 

const unsigned char * please.

>> +
>> +	/* Use probe_kernel_read()? */
>> +	op = *(unsigned char *)ip;
>> +	nop = ideal_nops[NOP_ATOMIC5][0];
>>  
>>  	if (type == JUMP_LABEL_ENABLE) {
>> -		code.jump = 0xe9;
>> -		code.offset = entry->target -
>> -				(entry->code + JUMP_LABEL_NOP_SIZE);
>> -	} else
>> -		memcpy(&code, ideal_nops[NOP_ATOMIC5], JUMP_LABEL_NOP_SIZE);
>> +		if (op == 0xe9 || op == 0xeb)
>> +			/* Already enabled. Warn? */
> 
> This could be caused by failure to run the link-time script, or running
> the transform twice. A warning would indeed be welcome, as this should
> never happen.
> 

Warning?  No.  ERROR.  Something very bad could be happening here.  We
have covered this before.

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/2] jump labels: Add infrastructure to update jump labels at compile time
  2012-01-19 14:24   ` Mathieu Desnoyers
@ 2012-01-19 14:52     ` Steven Rostedt
  0 siblings, 0 replies; 12+ messages in thread
From: Steven Rostedt @ 2012-01-19 14:52 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Frederic Weisbecker, H. Peter Anvin, Jason Baron

On Thu, 2012-01-19 at 09:24 -0500, Mathieu Desnoyers wrote:
> Hi Steven,
> 
> I really like this patchset! A few comments below,
> 
> * Steven Rostedt (rostedt@goodmis.org) wrote:
> > From: Steven Rostedt <srostedt@redhat.com>
> > 
> > Add the infrastructure to allow architectures to modify the jump label
> > locations at compile time. This is mainly for x86, where the jmps may
> 
> is it "at compile time" or "after compilation" ? AFAIU, this
> update_jump_label script gets executed on the .o, which makes it
> technically more part of the linking stage than the compilation
> stage.

We usually denote things as "at compile time" or "at run time" whether
it is the linker or not, we don't differentiate because it's not
important. The "at compile time" means, it happens once before you boot
the kernel, as suppose to, it happens every time your run the kernel.
That's the important part.


> 
> > be either 2 bytes or 5 bytes. Instead of wasting 5 bytes for all jump labels,
> > this code will let x86 put in a jmp instead of a 5 byte nop. Then the
> > assembler will make either a 2 byte or 5 byte jmp depending on where
> > the target is.
> > 
> > At compile time, this code will replace the jmps with either a 2 byte
> > or 5 byte nop depending on the size of the jmp that was added.
> > 
> 
> A small note somewhere saying that this 2 vs 5 bytes choice done by the
> compiler is a pessimistic approximation would be good, so people don't
> end up complaining if they see that the compiler chose a 5-byte nop for
> a 120 bytes offset.

Hmm, that really doesn't have anything to do with this patch. If someone
is anal enough to complain, then they can complain to the gcc people. As
it would do the same with other jmps not associated with jumplabels too.


> 
> > Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> > ---
> >  Makefile                    |    7 +
> >  arch/Kconfig                |    6 +
> >  scripts/Makefile            |    1 +
> >  scripts/Makefile.build      |   15 ++-
> >  scripts/update_jump_label.c |  335 +++++++++++++++++++++++++++++++++++++++++++
> >  scripts/update_jump_label.h |  283 ++++++++++++++++++++++++++++++++++++
> >  6 files changed, 645 insertions(+), 2 deletions(-)
> >  create mode 100644 scripts/update_jump_label.c
> >  create mode 100644 scripts/update_jump_label.h
> > 
> [...]
> > diff --git a/scripts/update_jump_label.c b/scripts/update_jump_label.c
> > new file mode 100644
> > index 0000000..399d898
> > --- /dev/null
> > +++ b/scripts/update_jump_label.c
> > @@ -0,0 +1,335 @@
> > +/*
> > + * update_jump_label.c: replace jmps with nops at compile time.
> > + * Copyright 2010 Steven Rostedt <srostedt@redhat.com>, Red Hat Inc.
> > + *  Parsing of the elf file was influenced by recordmcount.c
> > + *  originally written by and copyright to John F. Reiser <jreiser@BitWagon.com>.
> > + */
> > +
> > +/*
> > + * Note, this code is originally designed for x86, but may be used by
> > + * other archs to do the nop updates at compile time instead of at boot time.
> > + * X86 uses this as an optimization, as jmps can be either 2 bytes or 5 bytes.
> > + * Inserting a 2 byte where possible helps with both CPU performance and
> > + * icache strain.
> > + */
> > +#include <sys/types.h>
> > +#include <sys/mman.h>
> > +#include <sys/stat.h>
> > +#include <getopt.h>
> > +#include <elf.h>
> > +#include <fcntl.h>
> > +#include <setjmp.h>
> > +#include <stdio.h>
> > +#include <stdlib.h>
> > +#include <stdarg.h>
> > +#include <string.h>
> > +#include <unistd.h>
> > +
> > +static int fd_map;	/* File descriptor for file being modified. */
> > +static struct stat sb;	/* Remember .st_size, etc. */
> > +static int mmap_failed; /* Boolean flag. */
> > +
> > +static void die(const char *err, const char *fmt, ...)
> > +{
> > +	va_list ap;
> > +
> > +	if (err)
> > +		perror(err);
> > +
> > +	if (fmt) {
> > +		va_start(ap, fmt);
> > +		fprintf(stderr, "Fatal error:  ");
> > +		vfprintf(stderr, fmt, ap);
> > +		fprintf(stderr, "\n");
> > +		va_end(ap);
> > +	}
> > +
> > +	exit(1);
> > +}
> > +
> > +static void usage(char **argv)
> > +{
> > +	char *arg = argv[0];
> > +	char *p = arg+strlen(arg);
> 
> I'm pretty sure checkpatch will complain about the coding style of this
> file. I won't point out the various nits, but there is a need for
> cleanup before pulling it.

Yeah, this code was based off of the record_mcount.c file. I need to
clean that up too. And not just a trivial clean up, but a complete
overhaul. It's had a couple already. The guy who wrote the original code
use some really strange coding conventions and was really hard to read
and understand. I spent days cleaning it up to make it presentable for
Linux. I now know this code enough to make another pass at it and make
it a bit more readable.

I can fix this little thing up now.

> 
> > +
> > +	while (p >= arg && *p != '/')
> > +		p--;
> > +	p++;
> > +
> > +	printf("usage: %s file\n"
> > +	       "\n", p);
> > +	exit(-1);
> > +}
> 
> [...]
> 
> > +/*
> > + * Read the relocation table again, but this time its called on sections
> > + * that are not going to be traced. The mcount calls here will be converted
> 
> mcount calls -> jump labels

Hehe, like I said, this was based off of the record_mcount code. I did
state that in the original code. Thanks, I'll fix this too.

> 
> > + * into nops.
> > + */
> > +static void FUNC(nop_jump_label)(Elf_Shdr const *const relhdr,
> > +		       Elf_Ehdr const *const ehdr,
> > +		       const char *const txtname)
> > +{
> > +	Elf_Shdr *const shdr0 = get_shdr(ehdr);
> > +	Elf_Sym const *sym0;
> > +	char const *str0;
> > +	Elf_Rel const *relp;
> > +	Elf_Rela const *relap;
> > +	Elf_Shdr const *const shdr = &shdr0[w(relhdr->sh_info)];
> > +	unsigned rel_entsize = w(relhdr->sh_entsize);
> > +	unsigned const nrel = _w(relhdr->sh_size) / rel_entsize;
> > +	int t;
> > +
> > +	FUNC(get_sym_str_and_relp)(relhdr, ehdr, &sym0, &str0, &relp);
> > +
> > +	for (t = nrel; t > 0; t -= 3) {
> > +		int ret = -1;
> > +
> > +		relap = (Elf_Rela const *)relp;
> > +		printf("rel offset=%lx info=%lx sym=%lx type=%lx addend=%lx\n",
> > +		       (long)relap->r_offset, (long)relap->r_info,
> > +		       (long)ELF_R_SYM(relap->r_info),
> > +		       (long)ELF_R_TYPE(relap->r_info),
> > +		       (long)relap->r_addend);
> > +
> > +		if (0 && make_nop)
> > +			ret = make_nop((void *)ehdr, shdr->sh_offset + relp->r_offset);
> 
> if (0 &&  -> leftover code ?
> 
> This whole FUNC(nop_jump_label) function seems to be unused.

Yeah, I still need to remove this crap. I posted this as an RFC to get
ideas before I ripped out all the debug (and crap). This was code that I
wasn't sure I would use. Yes, it will be removed before the final
version. I probably can nuke it now, as I'm now comfortable that this
will not be needed.

> 
> > +
> > +		/* jump label sections are paired in threes */
> > +		relp = (Elf_Rel const *)(rel_entsize * 3 + (void *)relp);
> > +	}
> > +}
> > +
> > +/* Find relocation section hdr for a given section */
> > +static const Elf_Shdr *
> > +FUNC(find_relhdr)(const Elf_Ehdr *ehdr, const Elf_Shdr *shdr)
> > +{
> > +	const Elf_Shdr *shdr0 = get_shdr(ehdr);
> > +	int nhdr = w2(ehdr->e_shnum);
> > +	const Elf_Shdr *hdr;
> > +	int i;
> > +
> > +	for (hdr = shdr0, i = 0; i < nhdr; hdr = &shdr0[++i]) {
> > +		if (w(hdr->sh_type) != SHT_REL &&
> > +		    w(hdr->sh_type) != SHT_RELA)
> > +			continue;
> > +
> > +		/*
> > +		 * The relocation section's info field holds
> > +		 * the section index that it represents.
> > +		 */
> > +		if (shdr == &shdr0[w(hdr->sh_info)])
> > +			return hdr;
> > +	}
> > +	return NULL;
> > +}
> > +
> > +/* Find a section headr based on name and type */
> > +static const Elf_Shdr *
> > +FUNC(find_shdr)(const Elf_Ehdr *ehdr, const char *name, uint_t type)
> > +{
> > +	const Elf_Shdr *shdr0 = get_shdr(ehdr);
> > +	const Elf_Shdr *shstr = &shdr0[w2(ehdr->e_shstrndx)];
> > +	const char *shstrtab = (char *)get_section_loc(ehdr, shstr);
> > +	int nhdr = w2(ehdr->e_shnum);
> > +	const Elf_Shdr *hdr;
> > +	const char *hdrname;
> > +	int i;
> > +
> > +	for (hdr = shdr0, i = 0; i < nhdr; hdr = &shdr0[++i]) {
> > +		if (w(hdr->sh_type) != type)
> > +			continue;
> > +
> > +		/* If we are just looking for a section by type (ie. SYMTAB) */
> > +		if (!name)
> > +			return hdr;
> > +
> > +		hdrname = &shstrtab[w(hdr->sh_name)];
> > +		if (strcmp(hdrname, name) == 0)
> > +			return hdr;
> > +	}
> > +	return NULL;
> > +}
> > +
> > +static void
> > +FUNC(section_update)(const Elf_Ehdr *ehdr, const Elf_Shdr *symhdr,
> > +		     unsigned shtype, const Elf_Rel *rel, void *data)
> > +{
> > +	const Elf_Shdr *shdr0 = get_shdr(ehdr);
> > +	const Elf_Shdr *targethdr;
> > +	const Elf_Rela *rela;
> > +	const Elf_Sym *syment;
> > +	uint_t offset = _w(rel->r_offset);
> > +	uint_t info = _w(rel->r_info);
> > +	uint_t sym = ELF_R_SYM(info);
> > +	uint_t type = ELF_R_TYPE(info);
> > +	uint_t addend;
> > +	uint_t targetloc;
> > +
> > +	if (shtype == SHT_RELA) {
> > +		rela = (const Elf_Rela *)rel;
> > +		addend = _w(rela->r_addend);
> > +	} else
> > +		addend = _w(*(int *)(data + offset));
> > +
> > +	syment = (const Elf_Sym *)get_section_loc(ehdr, symhdr);
> > +	targethdr = &shdr0[w2(syment[sym].st_shndx)];
> > +	targetloc = _w(targethdr->sh_offset);
> > +
> > +	/* TODO, need a separate function for all archs */
> 
> maybe this is where you want to call FUNC(nop_jump_label) too ?

It would be where something similar goes, but not that function. I
should still nuke it.

> 
> > +	if (type != R_386_32)
> > +		die(NULL, "Arch relocation type %d not supported", type);
> > +
> > +	targetloc += addend;
> > +
> > +#if 0
> > +	printf("offset=%x target=%x shoffset=%x add=%x indx=%x\n",
> > +	       offset, targetloc, _w(targethdr->sh_offset), addend,
> > +	       targetloc - _w(targethdr->sh_offset));
> > +#endif

This obviously needs to be nuked too.

> > +	*(uint_t *)(data + offset) = targetloc;
> > +}
> > +
> > +/* Overall supervision for Elf32 ET_REL file. */
> > +static void
> > +FUNC(do_func)(Elf_Ehdr *ehdr, char const *const fname, unsigned const reltype)
> > +{
> > +	const Elf_Shdr *jlshdr;
> > +	const Elf_Shdr *jlrhdr;
> > +	const Elf_Shdr *symhdr;
> > +	const Elf_Rel *rel;
> > +	unsigned size;
> > +	unsigned cnt;
> > +	unsigned i;
> > +	uint_t type;
> > +	void *jdata;
> > +	void *data;
> > +
> > +	jlshdr = FUNC(find_shdr)(ehdr, "__jump_table", SHT_PROGBITS);
> > +	if (!jlshdr)
> > +		return;
> > +
> > +	jlrhdr = FUNC(find_relhdr)(ehdr, jlshdr);
> > +	if (!jlrhdr)
> > +		return;
> > +
> > +	/*
> > +	 * Create and fill in the __jump_table section and use it to
> > +	 * find the offsets into the text that we want to update.
> > +	 * We create it so that we do not depend on the order of the
> > +	 * relocations, and use the table directly, as it is broken
> > +	 * up into sections.
> > +	 */
> > +	size = _w(jlshdr->sh_size);
> > +	data = umalloc(size);
> > +
> > +	jdata = (void *)get_section_loc(ehdr, jlshdr);
> > +	memcpy(data, jdata, size);
> > +
> > +	cnt = _w(jlrhdr->sh_size) / w(jlrhdr->sh_entsize);
> > +
> > +	rel = (const Elf_Rel *)get_section_loc(ehdr, jlrhdr);
> > +
> > +	/* Is this as Rel or Rela? */
> > +	type = w(jlrhdr->sh_type);
> > +
> > +	symhdr = FUNC(find_shdr)(ehdr, NULL, SHT_SYMTAB);
> > +
> > +	for (i = 0; i < cnt; i++) {
> > +		FUNC(section_update)(ehdr, symhdr, type, rel, data);
> > +		rel = (void *)rel + w(jlrhdr->sh_entsize);
> > +	}
> > +
> > +	/*
> > +	 * This is specific to x86. The jump_table is stored in three
> > +	 * long words. The first is the location of the jmp target we
> > +	 * must update.
> > +	 */
> > +	cnt = size / sizeof(uint_t);
> > +
> > +	for (i = 0; i < cnt; i += 3)
> > +		make_nop((void *)ehdr, *(uint_t *)(data + i * sizeof(uint_t)));
> > +
> 
> Why is this function iterating twice on the __jump_table section rather
> than simply writing the nops in the first pass, within the
> section_update function ?

To simplify things. There's also no guarantee that the relocation table
is in order. It usually is, but I don't want to assume that. Thus I need
to first read the entire relocation table to find the locations of all
the jump labels, and fill in the locations as they would have been
linked. And after that is complete, then I can iterate over that section
and look at every third entry which is the jump label location.

-- Steve




^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] jump labels/x86: Use etiher 5 byte or 2 byte jumps
  2012-01-19 14:41   ` Mathieu Desnoyers
  2012-01-19 14:46     ` H. Peter Anvin
@ 2012-01-19 14:56     ` Steven Rostedt
  2012-01-19 14:58       ` H. Peter Anvin
  1 sibling, 1 reply; 12+ messages in thread
From: Steven Rostedt @ 2012-01-19 14:56 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Frederic Weisbecker, H. Peter Anvin, Jason Baron

On Thu, 2012-01-19 at 09:41 -0500, Mathieu Desnoyers wrote:
> * Steven Rostedt (rostedt@goodmis.org) wrote:
> > From: Steven Rostedt <srostedt@redhat.com>
> > 
> > Have the jump labels add a "jmp" in the assembly instead
> > of a default nop. This will cause the assembler to put in
> > either a 2 byte or 5 byte jmp depending on where the target
> > lable is.
> > 
> > Then at compile time, the update_jump_label code will replace
> > the jmps with either 2 or 5 byte nops.
> > 
> > On boot up, the code can be examined to see if the jump label
> > uses either a 2 or 5 byte nop and replace it.
> > 
> > By allowing the jump labels to be 2 bytes, it speeds up the
> > nops, not only 2 byte nops are faster than 5 byte nops, but also
> > because it saves on cache foot print.
> > 
> >    text    data     bss     dec     hex filename
> > 13403667 3666856 2998272 20068795 13239bb ../nobackup/mxtest/vmlinux-old
> > 13398536 3666856 2998272 20063664 13225b0 ../nobackup/mxtest/vmlinux-new
> > 
> > Converting the current v3.2 trace points saved 5,131 bytes.
> > As more places use jump labels, this will have a bigger savings.
> > 
> > Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> > ---
> >  arch/x86/Kconfig                  |    1 +
> >  arch/x86/include/asm/jump_label.h |    2 +-
> >  arch/x86/kernel/jump_label.c      |   86 ++++++++++++++++++++++++++++++++++---
> >  3 files changed, 81 insertions(+), 8 deletions(-)
> > 
> > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> > index efb4294..b5004c1 100644
> > --- a/arch/x86/Kconfig
> > +++ b/arch/x86/Kconfig
> > @@ -61,6 +61,7 @@ config X86
> >  	select HAVE_ARCH_KMEMCHECK
> >  	select HAVE_USER_RETURN_NOTIFIER
> >  	select HAVE_ARCH_JUMP_LABEL
> > +	select HAVE_BUILD_TIME_JUMP_LABEL
> >  	select HAVE_TEXT_POKE_SMP
> >  	select HAVE_GENERIC_HARDIRQS
> >  	select HAVE_SPARSE_IRQ
> > diff --git a/arch/x86/include/asm/jump_label.h b/arch/x86/include/asm/jump_label.h
> > index a32b18c..872b3e1 100644
> > --- a/arch/x86/include/asm/jump_label.h
> > +++ b/arch/x86/include/asm/jump_label.h
> > @@ -14,7 +14,7 @@
> >  static __always_inline bool arch_static_branch(struct jump_label_key *key)
> >  {
> >  	asm goto("1:"
> > -		JUMP_LABEL_INITIAL_NOP
> > +		"jmp %l[l_yes]\n"
> 
> Is it possible that the compiler choose a jump that is not 2 or 5-byte ?
> e.g. a jmp rel16 (e9 opcode) on 32-bit x86, or any of the other

Isn't e9 the 5 byte opcode?

> instruction listed under the JMP-Jump instruction in the Intel insn
> manual ?

If it does, it will fail to compile. Look at the make_nop_x86(). It will
die if it doesn't see either a eb or e9


> 
> >  		".pushsection __jump_table,  \"aw\" \n\t"
> >  		_ASM_ALIGN "\n\t"
> >  		_ASM_PTR "1b, %l[l_yes], %c0 \n\t"
> > diff --git a/arch/x86/kernel/jump_label.c b/arch/x86/kernel/jump_label.c
> > index ea9d5f2f..d5b84de 100644
> > --- a/arch/x86/kernel/jump_label.c
> > +++ b/arch/x86/kernel/jump_label.c
> > @@ -16,12 +16,27 @@
> >  
> >  #ifdef HAVE_JUMP_LABEL
> >  
> > +static unsigned char nop_short[] = { P6_NOP2 };
> > +
> > +/* These are the nops added at compile time */
> > +#ifdef CONFIG_X86_32
> > +static unsigned char default_nop[5] = { 0x3e, 0x8d, 0x74, 0x26, 0x00 };
> > +#else
> > +static unsigned char default_nop[5] = { 0x0f, 0x1f, 0x44, 0x00, 0x00 };
> > +#endif
> > +
> > +static int update_nops;
> > +
> >  union jump_code_union {
> >  	char code[JUMP_LABEL_NOP_SIZE];
> >  	struct {
> >  		char jump;
> >  		int offset;
> > -	} __attribute__((packed));
> > +	} __packed;
> > +	struct {
> > +		char jump_short;
> > +		char offset_short;
> > +	} __packed;
> >  };
> >  
> >  static void __jump_label_transform(struct jump_entry *entry,
> > @@ -29,20 +44,70 @@ static void __jump_label_transform(struct jump_entry *entry,
> >  				   void *(*poker)(void *, const void *, size_t))
> >  {
> >  	union jump_code_union code;
> > +	unsigned char nop;
> > +	unsigned char op;
> > +	unsigned size;
> > +	void *ip = (void *)entry->code;
> > +	void *ideal = (void *)ideal_nops[NOP_ATOMIC5];
> 
> "void *" should possibly be "unsigned char *" here to respect the nop
> place-holder typing.
> 
> > +
> > +	/* Use probe_kernel_read()? */
> > +	op = *(unsigned char *)ip;
> > +	nop = ideal_nops[NOP_ATOMIC5][0];
> >  
> >  	if (type == JUMP_LABEL_ENABLE) {
> > -		code.jump = 0xe9;
> > -		code.offset = entry->target -
> > -				(entry->code + JUMP_LABEL_NOP_SIZE);
> > -	} else
> > -		memcpy(&code, ideal_nops[NOP_ATOMIC5], JUMP_LABEL_NOP_SIZE);
> > +		if (op == 0xe9 || op == 0xeb)
> > +			/* Already enabled. Warn? */
> 
> This could be caused by failure to run the link-time script, or running
> the transform twice. A warning would indeed be welcome, as this should
> never happen.

That's why I commented this :-)

-- Steve



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] jump labels/x86: Use etiher 5 byte or 2 byte jumps
  2012-01-19 14:56     ` Steven Rostedt
@ 2012-01-19 14:58       ` H. Peter Anvin
  0 siblings, 0 replies; 12+ messages in thread
From: H. Peter Anvin @ 2012-01-19 14:58 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Mathieu Desnoyers, linux-kernel, Ingo Molnar, Andrew Morton,
	Thomas Gleixner, Frederic Weisbecker, Jason Baron

On 01/19/2012 06:56 AM, Steven Rostedt wrote:
>>
>> Is it possible that the compiler choose a jump that is not 2 or 5-byte ?
>> e.g. a jmp rel16 (e9 opcode) on 32-bit x86, or any of the other
>
> Isn't e9 the 5 byte opcode?
>

I think Mathieu is thinking of 66 e9 rel16 (66 e9 xx xx) which isn't 
used by the compiler since it truncates the result to 16 bits.

	-hpa

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] jump labels/x86: Use etiher 5 byte or 2 byte jumps
  2012-01-19 14:46     ` H. Peter Anvin
@ 2012-01-19 14:58       ` Steven Rostedt
  2012-01-19 15:19         ` Steven Rostedt
  0 siblings, 1 reply; 12+ messages in thread
From: Steven Rostedt @ 2012-01-19 14:58 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Mathieu Desnoyers, linux-kernel, Ingo Molnar, Andrew Morton,
	Thomas Gleixner, Frederic Weisbecker, Jason Baron

On Thu, 2012-01-19 at 06:46 -0800, H. Peter Anvin wrote:

> >>  				   void *(*poker)(void *, const void *, size_t))
> >>  {
> >>  	union jump_code_union code;
> >> +	unsigned char nop;
> >> +	unsigned char op;
> >> +	unsigned size;
> >> +	void *ip = (void *)entry->code;
> >> +	void *ideal = (void *)ideal_nops[NOP_ATOMIC5];
> > 
> > "void *" should possibly be "unsigned char *" here to respect the nop
> > place-holder typing.
> > 
> 
> const unsigned char * please.

OK.

> 
> >> +
> >> +	/* Use probe_kernel_read()? */
> >> +	op = *(unsigned char *)ip;
> >> +	nop = ideal_nops[NOP_ATOMIC5][0];
> >>  
> >>  	if (type == JUMP_LABEL_ENABLE) {
> >> -		code.jump = 0xe9;
> >> -		code.offset = entry->target -
> >> -				(entry->code + JUMP_LABEL_NOP_SIZE);
> >> -	} else
> >> -		memcpy(&code, ideal_nops[NOP_ATOMIC5], JUMP_LABEL_NOP_SIZE);
> >> +		if (op == 0xe9 || op == 0xeb)
> >> +			/* Already enabled. Warn? */
> > 
> > This could be caused by failure to run the link-time script, or running
> > the transform twice. A warning would indeed be welcome, as this should
> > never happen.
> > 
> 
> Warning?  No.  ERROR.  Something very bad could be happening here.  We
> have covered this before.

Heh, not this exactly. We covered run time errors, this is build time
errors.

But I agree, it should error. The mcount code errors on problems, this
should too.

-- Steve



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] jump labels/x86: Use etiher 5 byte or 2 byte jumps
  2012-01-19 14:58       ` Steven Rostedt
@ 2012-01-19 15:19         ` Steven Rostedt
  0 siblings, 0 replies; 12+ messages in thread
From: Steven Rostedt @ 2012-01-19 15:19 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Mathieu Desnoyers, linux-kernel, Ingo Molnar, Andrew Morton,
	Thomas Gleixner, Frederic Weisbecker, Jason Baron

On Thu, 2012-01-19 at 09:58 -0500, Steven Rostedt wrote:
> On Thu, 2012-01-19 at 06:46 -0800, H. Peter Anvin wrote:

> > 
> > >> +
> > >> +	/* Use probe_kernel_read()? */
> > >> +	op = *(unsigned char *)ip;
> > >> +	nop = ideal_nops[NOP_ATOMIC5][0];
> > >>  
> > >>  	if (type == JUMP_LABEL_ENABLE) {
> > >> -		code.jump = 0xe9;
> > >> -		code.offset = entry->target -
> > >> -				(entry->code + JUMP_LABEL_NOP_SIZE);
> > >> -	} else
> > >> -		memcpy(&code, ideal_nops[NOP_ATOMIC5], JUMP_LABEL_NOP_SIZE);
> > >> +		if (op == 0xe9 || op == 0xeb)
> > >> +			/* Already enabled. Warn? */
> > > 
> > > This could be caused by failure to run the link-time script, or running
> > > the transform twice. A warning would indeed be welcome, as this should
> > > never happen.
> > > 
> > 
> > Warning?  No.  ERROR.  Something very bad could be happening here.  We
> > have covered this before.
> 
> Heh, not this exactly. We covered run time errors, this is build time
> errors.

Oops, this is patch 2 not 1. It is run time error.

> 
> But I agree, it should error. The mcount code errors on problems, this
> should too.

I'll have to look at this in more detail, to make sure jump labels are
consistent in their updates. It may legitimately enable them twice. If
that's the case, what should be done is to make sure the entire
instruction is what we expect it to be, otherwise BUG().

-- Steve



^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2012-01-19 15:19 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-01-18 19:53 [PATCH 0/2] [RFC] jump-label/x86: Compress jmps to 2 bytes where possible Steven Rostedt
2012-01-18 19:53 ` [PATCH 1/2] jump labels: Add infrastructure to update jump labels at compile time Steven Rostedt
2012-01-19 14:24   ` Mathieu Desnoyers
2012-01-19 14:52     ` Steven Rostedt
2012-01-18 19:53 ` [PATCH 2/2] jump labels/x86: Use etiher 5 byte or 2 byte jumps Steven Rostedt
2012-01-19 12:22   ` Ingo Molnar
2012-01-19 14:41   ` Mathieu Desnoyers
2012-01-19 14:46     ` H. Peter Anvin
2012-01-19 14:58       ` Steven Rostedt
2012-01-19 15:19         ` Steven Rostedt
2012-01-19 14:56     ` Steven Rostedt
2012-01-19 14:58       ` H. Peter Anvin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).