[tip:,objtool/core] x86/insn: Support big endian cross-compiles
diff mbox series

Message ID 160208761921.7002.1321765913567405137.tip-bot2@tip-bot2
State In Next
Commit 2a522b53c47051d3bf98748418f4f8e5f20d2c04
Headers show
Series
  • [tip:,objtool/core] x86/insn: Support big endian cross-compiles
Related show

Commit Message

tip-bot2 for Aditya Srivastava Oct. 7, 2020, 4:20 p.m. UTC
The following commit has been merged into the objtool/core branch of tip:

Commit-ID:     2a522b53c47051d3bf98748418f4f8e5f20d2c04
Gitweb:        https://git.kernel.org/tip/2a522b53c47051d3bf98748418f4f8e5f20d2c04
Author:        Martin Schwidefsky <schwidefsky@de.ibm.com>
AuthorDate:    Mon, 05 Oct 2020 17:50:31 +02:00
Committer:     Josh Poimboeuf <jpoimboe@redhat.com>
CommitterDate: Tue, 06 Oct 2020 09:32:29 -05:00

x86/insn: Support big endian cross-compiles

x86 instruction decoder code is shared across the kernel source and the
tools. Currently objtool seems to be the only tool from build tools needed
which breaks x86 cross compilation on big endian systems. Make the x86
instruction decoder build host endianness agnostic to support x86 cross
compilation and enable objtool to implement endianness awareness for
big endian architectures support.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Co-developed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
---
 arch/x86/include/asm/insn.h       |  33 +++++++++-
 arch/x86/lib/insn.c               | 101 +++++++++++++----------------
 tools/arch/x86/include/asm/insn.h |  33 +++++++++-
 tools/arch/x86/lib/insn.c         | 101 +++++++++++++----------------
 4 files changed, 160 insertions(+), 108 deletions(-)

Comments

Peter Zijlstra Oct. 9, 2020, 8:38 p.m. UTC | #1
On Wed, Oct 07, 2020 at 04:20:19PM -0000, tip-bot2 for Martin Schwidefsky wrote:
> The following commit has been merged into the objtool/core branch of tip:
> 
> Commit-ID:     2a522b53c47051d3bf98748418f4f8e5f20d2c04
> Gitweb:        https://git.kernel.org/tip/2a522b53c47051d3bf98748418f4f8e5f20d2c04
> Author:        Martin Schwidefsky <schwidefsky@de.ibm.com>
> AuthorDate:    Mon, 05 Oct 2020 17:50:31 +02:00
> Committer:     Josh Poimboeuf <jpoimboe@redhat.com>
> CommitterDate: Tue, 06 Oct 2020 09:32:29 -05:00
> 
> x86/insn: Support big endian cross-compiles
> 
> x86 instruction decoder code is shared across the kernel source and the
> tools. Currently objtool seems to be the only tool from build tools needed
> which breaks x86 cross compilation on big endian systems. Make the x86
> instruction decoder build host endianness agnostic to support x86 cross
> compilation and enable objtool to implement endianness awareness for
> big endian architectures support.
> 
> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
> Co-developed-by: Vasily Gorbik <gor@linux.ibm.com>
> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
> Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>

This commit breaks the x86 build with CONFIG_X86_DECODER_SELFTEST=y.

I've asked Boris to truncate tip/objtool/core.
Borislav Petkov Oct. 9, 2020, 8:49 p.m. UTC | #2
On Fri, Oct 09, 2020 at 10:38:22PM +0200, Peter Zijlstra wrote:
> On Wed, Oct 07, 2020 at 04:20:19PM -0000, tip-bot2 for Martin Schwidefsky wrote:
> > The following commit has been merged into the objtool/core branch of tip:
> > 
> > Commit-ID:     2a522b53c47051d3bf98748418f4f8e5f20d2c04
> > Gitweb:        https://git.kernel.org/tip/2a522b53c47051d3bf98748418f4f8e5f20d2c04
> > Author:        Martin Schwidefsky <schwidefsky@de.ibm.com>
> > AuthorDate:    Mon, 05 Oct 2020 17:50:31 +02:00
> > Committer:     Josh Poimboeuf <jpoimboe@redhat.com>
> > CommitterDate: Tue, 06 Oct 2020 09:32:29 -05:00
> > 
> > x86/insn: Support big endian cross-compiles
> > 
> > x86 instruction decoder code is shared across the kernel source and the
> > tools. Currently objtool seems to be the only tool from build tools needed
> > which breaks x86 cross compilation on big endian systems. Make the x86
> > instruction decoder build host endianness agnostic to support x86 cross
> > compilation and enable objtool to implement endianness awareness for
> > big endian architectures support.
> > 
> > Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
> > Co-developed-by: Vasily Gorbik <gor@linux.ibm.com>
> > Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
> > Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
> > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> 
> This commit breaks the x86 build with CONFIG_X86_DECODER_SELFTEST=y.
> 
> I've asked Boris to truncate tip/objtool/core.

Yeah, top 4 are gone until this is resolved.

What I would suggest is to have a look at how tools/ headers are kept
separate from kernel proper ones, see tools/include/ and how those
headers there are full of dummy definitions just so it builds.

And then including a global one like linux/kernel.h is just looking for
trouble:

In file included from ./include/uapi/linux/byteorder/little_endian.h:12,
                 from ./include/linux/byteorder/little_endian.h:5,
                 from /usr/include/x86_64-linux-gnu/asm/byteorder.h:5,
                 from ./arch/x86/include/asm/insn.h:10,
                 from arch/x86/tools/insn_sanity.c:21:
./tools/include/linux/types.h:30:18: error: conflicting types for ‘u64’
   30 | typedef uint64_t u64;
      |                  ^~~
In file included from /usr/include/asm-generic/types.h:7,
                 from /usr/include/x86_64-linux-gnu/asm/types.h:1,
                 from ./tools/include/linux/types.h:10,
                 from ./include/uapi/linux/byteorder/little_endian.h:12,
                 from ./include/linux/byteorder/little_endian.h:5,
                 from /usr/include/x86_64-linux-gnu/asm/byteorder.h:5,
                 from ./arch/x86/include/asm/insn.h:10,
                 from arch/x86/tools/insn_sanity.c:21:
./include/asm-generic/int-ll64.h:23:15: note: previous declaration of ‘u64’ was here
   23 | typedef __u64 u64;
      |               ^~~
In file included from ./include/uapi/linux/byteorder/little_endian.h:12,
                 from ./include/linux/byteorder/little_endian.h:5,
                 from /usr/include/x86_64-linux-gnu/asm/byteorder.h:5,
                 from ./arch/x86/include/asm/insn.h:10,
                 from arch/x86/tools/insn_sanity.c:21:
./tools/include/linux/types.h:31:17: error: conflicting types for ‘s64’
   31 | typedef int64_t s64;
      |                 ^~~
In file included from /usr/include/asm-generic/types.h:7,
                 from /usr/include/x86_64-linux-gnu/asm/types.h:1,
                 from ./tools/include/linux/types.h:10,
                 from ./include/uapi/linux/byteorder/little_endian.h:12,
                 from ./include/linux/byteorder/little_endian.h:5,
                 from /usr/include/x86_64-linux-gnu/asm/byteorder.h:5,
                 from ./arch/x86/include/asm/insn.h:10,
                 from arch/x86/tools/insn_sanity.c:21:
./include/asm-generic/int-ll64.h:22:15: note: previous declaration of ‘s64’ was here
   22 | typedef __s64 s64;
      |               ^~~
In file included from ./arch/x86/lib/insn.c:8,
                 from arch/x86/tools/insn_sanity.c:23:
./tools/include/linux/kernel.h:87: warning: "cpu_to_le16" redefined
   87 | #define cpu_to_le16
      | 
In file included from ./include/linux/byteorder/little_endian.h:11,
                 from /usr/include/x86_64-linux-gnu/asm/byteorder.h:5,
                 from ./arch/x86/include/asm/insn.h:10,
                 from arch/x86/tools/insn_sanity.c:21:
./include/linux/byteorder/generic.h:90: note: this is the location of the previous definition
   90 | #define cpu_to_le16 __cpu_to_le16
      | 
In file included from ./arch/x86/lib/insn.c:8,
                 from arch/x86/tools/insn_sanity.c:23:
./tools/include/linux/kernel.h:88: warning: "cpu_to_le32" redefined
   88 | #define cpu_to_le32
      | 
In file included from ./include/linux/byteorder/little_endian.h:11,
                 from /usr/include/x86_64-linux-gnu/asm/byteorder.h:5,
                 from ./arch/x86/include/asm/insn.h:10,
                 from arch/x86/tools/insn_sanity.c:21:
./include/linux/byteorder/generic.h:88: note: this is the location of the previous definition
   88 | #define cpu_to_le32 __cpu_to_le32
      | 
In file included from ./arch/x86/lib/insn.c:8,
                 from arch/x86/tools/insn_sanity.c:23:
./tools/include/linux/kernel.h:89: warning: "cpu_to_le64" redefined
   89 | #define cpu_to_le64
      | 
In file included from ./include/linux/byteorder/little_endian.h:11,
                 from /usr/include/x86_64-linux-gnu/asm/byteorder.h:5,
                 from ./arch/x86/include/asm/insn.h:10,
                 from arch/x86/tools/insn_sanity.c:21:
./include/linux/byteorder/generic.h:86: note: this is the location of the previous definition
   86 | #define cpu_to_le64 __cpu_to_le64
      | 
In file included from ./arch/x86/lib/insn.c:8,
                 from arch/x86/tools/insn_sanity.c:23:
./tools/include/linux/kernel.h:90: warning: "le16_to_cpu" redefined
   90 | #define le16_to_cpu
      | 
In file included from ./include/linux/byteorder/little_endian.h:11,
                 from /usr/include/x86_64-linux-gnu/asm/byteorder.h:5,
                 from ./arch/x86/include/asm/insn.h:10,
                 from arch/x86/tools/insn_sanity.c:21:
./include/linux/byteorder/generic.h:91: note: this is the location of the previous definition
   91 | #define le16_to_cpu __le16_to_cpu
      | 
In file included from ./arch/x86/lib/insn.c:8,
                 from arch/x86/tools/insn_sanity.c:23:
./tools/include/linux/kernel.h:91: warning: "le32_to_cpu" redefined
   91 | #define le32_to_cpu
      | 
In file included from ./include/linux/byteorder/little_endian.h:11,
                 from /usr/include/x86_64-linux-gnu/asm/byteorder.h:5,
                 from ./arch/x86/include/asm/insn.h:10,
                 from arch/x86/tools/insn_sanity.c:21:
./include/linux/byteorder/generic.h:89: note: this is the location of the previous definition
   89 | #define le32_to_cpu __le32_to_cpu
      | 
In file included from ./arch/x86/lib/insn.c:8,
                 from arch/x86/tools/insn_sanity.c:23:
./tools/include/linux/kernel.h:92: warning: "le64_to_cpu" redefined
   92 | #define le64_to_cpu
      | 
In file included from ./include/linux/byteorder/little_endian.h:11,
                 from /usr/include/x86_64-linux-gnu/asm/byteorder.h:5,
                 from ./arch/x86/include/asm/insn.h:10,
                 from arch/x86/tools/insn_sanity.c:21:
./include/linux/byteorder/generic.h:87: note: this is the location of the previous definition
   87 | #define le64_to_cpu __le64_to_cpu
      | 
In file included from ./arch/x86/lib/insn.c:8,
                 from arch/x86/tools/insn_sanity.c:23:
./tools/include/linux/kernel.h:93: warning: "cpu_to_be16" redefined
   93 | #define cpu_to_be16 bswap_16
      | 
In file included from ./include/linux/byteorder/little_endian.h:11,
                 from /usr/include/x86_64-linux-gnu/asm/byteorder.h:5,
                 from ./arch/x86/include/asm/insn.h:10,
                 from arch/x86/tools/insn_sanity.c:21:
./include/linux/byteorder/generic.h:96: note: this is the location of the previous definition
   96 | #define cpu_to_be16 __cpu_to_be16
      | 
In file included from ./arch/x86/lib/insn.c:8,
                 from arch/x86/tools/insn_sanity.c:23:
./tools/include/linux/kernel.h:94: warning: "cpu_to_be32" redefined
   94 | #define cpu_to_be32 bswap_32
      | 
In file included from ./include/linux/byteorder/little_endian.h:11,
                 from /usr/include/x86_64-linux-gnu/asm/byteorder.h:5,
                 from ./arch/x86/include/asm/insn.h:10,
                 from arch/x86/tools/insn_sanity.c:21:
./include/linux/byteorder/generic.h:94: note: this is the location of the previous definition
   94 | #define cpu_to_be32 __cpu_to_be32
      | 
In file included from ./arch/x86/lib/insn.c:8,
                 from arch/x86/tools/insn_sanity.c:23:
./tools/include/linux/kernel.h:95: warning: "cpu_to_be64" redefined
   95 | #define cpu_to_be64 bswap_64
      | 
In file included from ./include/linux/byteorder/little_endian.h:11,
                 from /usr/include/x86_64-linux-gnu/asm/byteorder.h:5,
                 from ./arch/x86/include/asm/insn.h:10,
                 from arch/x86/tools/insn_sanity.c:21:
./include/linux/byteorder/generic.h:92: note: this is the location of the previous definition
   92 | #define cpu_to_be64 __cpu_to_be64
      | 
In file included from ./arch/x86/lib/insn.c:8,
                 from arch/x86/tools/insn_sanity.c:23:
./tools/include/linux/kernel.h:96: warning: "be16_to_cpu" redefined
   96 | #define be16_to_cpu bswap_16
      | 
In file included from ./include/linux/byteorder/little_endian.h:11,
                 from /usr/include/x86_64-linux-gnu/asm/byteorder.h:5,
                 from ./arch/x86/include/asm/insn.h:10,
                 from arch/x86/tools/insn_sanity.c:21:
./include/linux/byteorder/generic.h:97: note: this is the location of the previous definition
   97 | #define be16_to_cpu __be16_to_cpu
      | 
In file included from ./arch/x86/lib/insn.c:8,
                 from arch/x86/tools/insn_sanity.c:23:
./tools/include/linux/kernel.h:97: warning: "be32_to_cpu" redefined
   97 | #define be32_to_cpu bswap_32
      | 
In file included from ./include/linux/byteorder/little_endian.h:11,
                 from /usr/include/x86_64-linux-gnu/asm/byteorder.h:5,
                 from ./arch/x86/include/asm/insn.h:10,
                 from arch/x86/tools/insn_sanity.c:21:
./include/linux/byteorder/generic.h:95: note: this is the location of the previous definition
   95 | #define be32_to_cpu __be32_to_cpu
      | 
In file included from ./arch/x86/lib/insn.c:8,
                 from arch/x86/tools/insn_sanity.c:23:
./tools/include/linux/kernel.h:98: warning: "be64_to_cpu" redefined
   98 | #define be64_to_cpu bswap_64
      | 
In file included from ./include/linux/byteorder/little_endian.h:11,
                 from /usr/include/x86_64-linux-gnu/asm/byteorder.h:5,
                 from ./arch/x86/include/asm/insn.h:10,
                 from arch/x86/tools/insn_sanity.c:21:
./include/linux/byteorder/generic.h:93: note: this is the location of the previous definition
   93 | #define be64_to_cpu __be64_to_cpu
      | 
In file included from ./arch/x86/lib/insn.c:8,
                 from arch/x86/tools/insn_sanity.c:23:
./tools/include/linux/kernel.h:105: warning: "ARRAY_SIZE" redefined
  105 | #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + __must_be_array(arr))
      | 
arch/x86/tools/insn_sanity.c:19: note: this is the location of the previous definition
   19 | #define ARRAY_SIZE(a) (sizeof(a)/sizeof(a[0]))
      | 
make[1]: *** [scripts/Makefile.host:95: arch/x86/tools/insn_sanity] Error 1
make[1]: *** Waiting for unfinished jobs....
make: *** [arch/x86/Makefile:267: bzImage] Error 2
make: *** Waiting for unfinished jobs....
Vasily Gorbik Oct. 10, 2020, 2:02 p.m. UTC | #3
On Fri, Oct 09, 2020 at 10:49:21PM +0200, Borislav Petkov wrote:
> On Fri, Oct 09, 2020 at 10:38:22PM +0200, Peter Zijlstra wrote:
> > On Wed, Oct 07, 2020 at 04:20:19PM -0000, tip-bot2 for Martin Schwidefsky wrote:
> > > The following commit has been merged into the objtool/core branch of tip:
> > > 
> > > Commit-ID:     2a522b53c47051d3bf98748418f4f8e5f20d2c04
> > > Gitweb:        https://git.kernel.org/tip/2a522b53c47051d3bf98748418f4f8e5f20d2c04
> > > 
> > > x86/insn: Support big endian cross-compiles
> > 
> > This commit breaks the x86 build with CONFIG_X86_DECODER_SELFTEST=y.
> > 
> > I've asked Boris to truncate tip/objtool/core.
> 
> Yeah, top 4 are gone until this is resolved.
> 
> What I would suggest is to have a look at how tools/ headers are kept
> separate from kernel proper ones, see tools/include/ and how those
> headers there are full of dummy definitions just so it builds.
> 
> And then including a global one like linux/kernel.h is just looking for
> trouble:
> 
> In file included from ./include/uapi/linux/byteorder/little_endian.h:12,
>                  from ./include/linux/byteorder/little_endian.h:5,
>                  from /usr/include/x86_64-linux-gnu/asm/byteorder.h:5,
>                  from ./arch/x86/include/asm/insn.h:10,
>                  from arch/x86/tools/insn_sanity.c:21:
> ./tools/include/linux/types.h:30:18: error: conflicting types for ‘u64’
>    30 | typedef uint64_t u64;

Sigh... I have not realized there are more usages of insn.c which are
conditionally compiled. It's not like you grep *.c files to find who
includes them regularity.

Looks like there is no way to find common byte swapping helpers for
the kernel and tools then. Even though tools provide quite a bunch of
them in tools/include/. So, completely avoiding mixing "kernel" and
"userspace" headers would look like the following (delta to commit
mentioned above):
---

diff --git a/arch/x86/include/asm/insn.h b/arch/x86/include/asm/insn.h
index 004e27bdf121..68197fe18a11 100644
--- a/arch/x86/include/asm/insn.h
+++ b/arch/x86/include/asm/insn.h
@@ -7,7 +7,13 @@
  * Copyright (C) IBM Corporation, 2009
  */
 
+#ifdef __KERNEL__
 #include <asm/byteorder.h>
+#define insn_cpu_to_le32 cpu_to_le32
+#else
+#include <endian.h>
+#define insn_cpu_to_le32 htole32
+#endif
 /* insn_attr_t is defined in inat.h */
 #include <asm/inat.h>
 
@@ -47,7 +53,7 @@ static inline void insn_field_set(struct insn_field *p, insn_value_t v,
 				  unsigned char n)
 {
 	p->value = v;
-	p->little = __cpu_to_le32(v);
+	p->little = insn_cpu_to_le32(v);
 	p->nbytes = n;
 }
 
diff --git a/arch/x86/lib/insn.c b/arch/x86/lib/insn.c
index 520b31fc1f1a..003f32ff7798 100644
--- a/arch/x86/lib/insn.c
+++ b/arch/x86/lib/insn.c
@@ -5,7 +5,6 @@
  * Copyright (C) IBM Corporation, 2002, 2004, 2009
  */
 
-#include <linux/kernel.h>
 #ifdef __KERNEL__
 #include <linux/string.h>
 #else
@@ -16,15 +15,23 @@
 
 #include <asm/emulate_prefix.h>
 
+#ifdef __KERNEL__
+#define insn_le32_to_cpu le32_to_cpu
+#define insn_le16_to_cpu le16_to_cpu
+#else
+#define insn_le32_to_cpu le32toh
+#define insn_le16_to_cpu le16toh
+#endif
+
 #define leXX_to_cpu(t, r)						\
 ({									\
 	__typeof__(t) v;						\
 	switch (sizeof(t)) {						\
-	case 4: v = le32_to_cpu(r); break;				\
-	case 2: v = le16_to_cpu(r); break;				\
+	case 4: v = insn_le32_to_cpu(r); break;				\
+	case 2: v = insn_le16_to_cpu(r); break;				\
 	case 1:	v = r; break;						\
-	default:							\
-		BUILD_BUG(); break;					\
+	default: /* relying on -Wuninitialized to report this */	\
+		break;							\
 	}								\
 	v;								\
 })
--
And the same for the tools/*
No linux/kernel.h means no BUILD_BUG(), but -Wuninitialized actually
does a decent job in this case:
arch/x86/../../../arch/x86/lib/insn.c:605:37: error: variable 'v' is
		uninitialized when used here [-Werror,-Wuninitialized]
                insn_field_set(&insn->immediate2, get_next(long, insn), 1);
                                                  ^~~~~~~~~~~~~~~~~~~~

Masami, Josh,
would that be acceptable?

Should I resent the entire patch series again with these changes squashed?
Or just as a separate commit which would go on top?
Josh Poimboeuf Oct. 10, 2020, 5:44 p.m. UTC | #4
On Fri, Oct 09, 2020 at 10:49:21PM +0200, Borislav Petkov wrote:
> On Fri, Oct 09, 2020 at 10:38:22PM +0200, Peter Zijlstra wrote:
> > On Wed, Oct 07, 2020 at 04:20:19PM -0000, tip-bot2 for Martin Schwidefsky wrote:
> > > The following commit has been merged into the objtool/core branch of tip:
> > > 
> > > Commit-ID:     2a522b53c47051d3bf98748418f4f8e5f20d2c04
> > > Gitweb:        https://git.kernel.org/tip/2a522b53c47051d3bf98748418f4f8e5f20d2c04
> > > Author:        Martin Schwidefsky <schwidefsky@de.ibm.com>
> > > AuthorDate:    Mon, 05 Oct 2020 17:50:31 +02:00
> > > Committer:     Josh Poimboeuf <jpoimboe@redhat.com>
> > > CommitterDate: Tue, 06 Oct 2020 09:32:29 -05:00
> > > 
> > > x86/insn: Support big endian cross-compiles
> > > 
> > > x86 instruction decoder code is shared across the kernel source and the
> > > tools. Currently objtool seems to be the only tool from build tools needed
> > > which breaks x86 cross compilation on big endian systems. Make the x86
> > > instruction decoder build host endianness agnostic to support x86 cross
> > > compilation and enable objtool to implement endianness awareness for
> > > big endian architectures support.
> > > 
> > > Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
> > > Co-developed-by: Vasily Gorbik <gor@linux.ibm.com>
> > > Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
> > > Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
> > > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> > 
> > This commit breaks the x86 build with CONFIG_X86_DECODER_SELFTEST=y.
> > 
> > I've asked Boris to truncate tip/objtool/core.
> 
> Yeah, top 4 are gone until this is resolved.

Masami, I wonder if we even need these selftests anymore?  Objtool
already decodes the entire kernel.
Borislav Petkov Oct. 11, 2020, 2:40 p.m. UTC | #5
On Sat, Oct 10, 2020 at 04:02:10PM +0200, Vasily Gorbik wrote:
> Should I resent the entire patch series again with these changes
> squashed? Or just as a separate commit which would go on top?

You can wait at least a week (merge window starts tomorrow) and then
resend the entire series once the final approach is agreed upon.

Thx.
Masami Hiramatsu Oct. 12, 2020, 12:02 a.m. UTC | #6
On Sat, 10 Oct 2020 16:02:10 +0200
Vasily Gorbik <gor@linux.ibm.com> wrote:

> On Fri, Oct 09, 2020 at 10:49:21PM +0200, Borislav Petkov wrote:
> > On Fri, Oct 09, 2020 at 10:38:22PM +0200, Peter Zijlstra wrote:
> > > On Wed, Oct 07, 2020 at 04:20:19PM -0000, tip-bot2 for Martin Schwidefsky wrote:
> > > > The following commit has been merged into the objtool/core branch of tip:
> > > > 
> > > > Commit-ID:     2a522b53c47051d3bf98748418f4f8e5f20d2c04
> > > > Gitweb:        https://git.kernel.org/tip/2a522b53c47051d3bf98748418f4f8e5f20d2c04
> > > > 
> > > > x86/insn: Support big endian cross-compiles
> > > 
> > > This commit breaks the x86 build with CONFIG_X86_DECODER_SELFTEST=y.
> > > 
> > > I've asked Boris to truncate tip/objtool/core.
> > 
> > Yeah, top 4 are gone until this is resolved.
> > 
> > What I would suggest is to have a look at how tools/ headers are kept
> > separate from kernel proper ones, see tools/include/ and how those
> > headers there are full of dummy definitions just so it builds.
> > 
> > And then including a global one like linux/kernel.h is just looking for
> > trouble:
> > 
> > In file included from ./include/uapi/linux/byteorder/little_endian.h:12,
> >                  from ./include/linux/byteorder/little_endian.h:5,
> >                  from /usr/include/x86_64-linux-gnu/asm/byteorder.h:5,
> >                  from ./arch/x86/include/asm/insn.h:10,
> >                  from arch/x86/tools/insn_sanity.c:21:
> > ./tools/include/linux/types.h:30:18: error: conflicting types for ‘u64’
> >    30 | typedef uint64_t u64;
> 
> Sigh... I have not realized there are more usages of insn.c which are
> conditionally compiled. It's not like you grep *.c files to find who
> includes them regularity.

Yes, x86 insn library code is used for the sanity check tool too.

> 
> Looks like there is no way to find common byte swapping helpers for
> the kernel and tools then. Even though tools provide quite a bunch of
> them in tools/include/. So, completely avoiding mixing "kernel" and
> "userspace" headers would look like the following (delta to commit
> mentioned above):
> ---
> 
> diff --git a/arch/x86/include/asm/insn.h b/arch/x86/include/asm/insn.h
> index 004e27bdf121..68197fe18a11 100644
> --- a/arch/x86/include/asm/insn.h
> +++ b/arch/x86/include/asm/insn.h
> @@ -7,7 +7,13 @@
>   * Copyright (C) IBM Corporation, 2009
>   */
>  
> +#ifdef __KERNEL__
>  #include <asm/byteorder.h>
> +#define insn_cpu_to_le32 cpu_to_le32
> +#else
> +#include <endian.h>
> +#define insn_cpu_to_le32 htole32
> +#endif
>  /* insn_attr_t is defined in inat.h */
>  #include <asm/inat.h>
>  
> @@ -47,7 +53,7 @@ static inline void insn_field_set(struct insn_field *p, insn_value_t v,
>  				  unsigned char n)
>  {
>  	p->value = v;
> -	p->little = __cpu_to_le32(v);
> +	p->little = insn_cpu_to_le32(v);
>  	p->nbytes = n;
>  }
>  
> diff --git a/arch/x86/lib/insn.c b/arch/x86/lib/insn.c
> index 520b31fc1f1a..003f32ff7798 100644
> --- a/arch/x86/lib/insn.c
> +++ b/arch/x86/lib/insn.c
> @@ -5,7 +5,6 @@
>   * Copyright (C) IBM Corporation, 2002, 2004, 2009
>   */
>  
> -#include <linux/kernel.h>
>  #ifdef __KERNEL__
>  #include <linux/string.h>
>  #else
> @@ -16,15 +15,23 @@
>  
>  #include <asm/emulate_prefix.h>
>  
> +#ifdef __KERNEL__
> +#define insn_le32_to_cpu le32_to_cpu
> +#define insn_le16_to_cpu le16_to_cpu
> +#else
> +#define insn_le32_to_cpu le32toh
> +#define insn_le16_to_cpu le16toh
> +#endif
> +
>  #define leXX_to_cpu(t, r)						\
>  ({									\
>  	__typeof__(t) v;						\
>  	switch (sizeof(t)) {						\
> -	case 4: v = le32_to_cpu(r); break;				\
> -	case 2: v = le16_to_cpu(r); break;				\
> +	case 4: v = insn_le32_to_cpu(r); break;				\
> +	case 2: v = insn_le16_to_cpu(r); break;				\
>  	case 1:	v = r; break;						\
> -	default:							\
> -		BUILD_BUG(); break;					\
> +	default: /* relying on -Wuninitialized to report this */	\
> +		break;							\
>  	}								\
>  	v;								\
>  })
> --
> And the same for the tools/*
> No linux/kernel.h means no BUILD_BUG(), but -Wuninitialized actually
> does a decent job in this case:
> arch/x86/../../../arch/x86/lib/insn.c:605:37: error: variable 'v' is
> 		uninitialized when used here [-Werror,-Wuninitialized]
>                 insn_field_set(&insn->immediate2, get_next(long, insn), 1);
>                                                   ^~~~~~~~~~~~~~~~~~~~

Can you initialize v with 0 ? Anyway it will be optimized out while
compiling the code.

> 
> Masami, Josh,
> would that be acceptable?

Yes.

Thank you,
Masami Hiramatsu Oct. 12, 2020, 12:12 a.m. UTC | #7
On Sat, 10 Oct 2020 12:44:15 -0500
Josh Poimboeuf <jpoimboe@redhat.com> wrote:

> On Fri, Oct 09, 2020 at 10:49:21PM +0200, Borislav Petkov wrote:
> > On Fri, Oct 09, 2020 at 10:38:22PM +0200, Peter Zijlstra wrote:
> > > On Wed, Oct 07, 2020 at 04:20:19PM -0000, tip-bot2 for Martin Schwidefsky wrote:
> > > > The following commit has been merged into the objtool/core branch of tip:
> > > > 
> > > > Commit-ID:     2a522b53c47051d3bf98748418f4f8e5f20d2c04
> > > > Gitweb:        https://git.kernel.org/tip/2a522b53c47051d3bf98748418f4f8e5f20d2c04
> > > > Author:        Martin Schwidefsky <schwidefsky@de.ibm.com>
> > > > AuthorDate:    Mon, 05 Oct 2020 17:50:31 +02:00
> > > > Committer:     Josh Poimboeuf <jpoimboe@redhat.com>
> > > > CommitterDate: Tue, 06 Oct 2020 09:32:29 -05:00
> > > > 
> > > > x86/insn: Support big endian cross-compiles
> > > > 
> > > > x86 instruction decoder code is shared across the kernel source and the
> > > > tools. Currently objtool seems to be the only tool from build tools needed
> > > > which breaks x86 cross compilation on big endian systems. Make the x86
> > > > instruction decoder build host endianness agnostic to support x86 cross
> > > > compilation and enable objtool to implement endianness awareness for
> > > > big endian architectures support.
> > > > 
> > > > Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
> > > > Co-developed-by: Vasily Gorbik <gor@linux.ibm.com>
> > > > Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
> > > > Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
> > > > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> > > 
> > > This commit breaks the x86 build with CONFIG_X86_DECODER_SELFTEST=y.
> > > 
> > > I've asked Boris to truncate tip/objtool/core.
> > 
> > Yeah, top 4 are gone until this is resolved.
> 
> Masami, I wonder if we even need these selftests anymore?  Objtool
> already decodes the entire kernel.

No, they have different roles. The selftest checks if the decoder
works correctly by comparing with the output of objdump.

As far as I can see, the objtool relies on the sanity of the decoder
(it trusts the output of the decoder).

Thank you,
Josh Poimboeuf Oct. 12, 2020, 3:39 p.m. UTC | #8
On Mon, Oct 12, 2020 at 09:12:36AM +0900, Masami Hiramatsu wrote:
> On Sat, 10 Oct 2020 12:44:15 -0500
> Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> 
> > On Fri, Oct 09, 2020 at 10:49:21PM +0200, Borislav Petkov wrote:
> > > On Fri, Oct 09, 2020 at 10:38:22PM +0200, Peter Zijlstra wrote:
> > > > On Wed, Oct 07, 2020 at 04:20:19PM -0000, tip-bot2 for Martin Schwidefsky wrote:
> > > > > The following commit has been merged into the objtool/core branch of tip:
> > > > > 
> > > > > Commit-ID:     2a522b53c47051d3bf98748418f4f8e5f20d2c04
> > > > > Gitweb:        https://git.kernel.org/tip/2a522b53c47051d3bf98748418f4f8e5f20d2c04
> > > > > Author:        Martin Schwidefsky <schwidefsky@de.ibm.com>
> > > > > AuthorDate:    Mon, 05 Oct 2020 17:50:31 +02:00
> > > > > Committer:     Josh Poimboeuf <jpoimboe@redhat.com>
> > > > > CommitterDate: Tue, 06 Oct 2020 09:32:29 -05:00
> > > > > 
> > > > > x86/insn: Support big endian cross-compiles
> > > > > 
> > > > > x86 instruction decoder code is shared across the kernel source and the
> > > > > tools. Currently objtool seems to be the only tool from build tools needed
> > > > > which breaks x86 cross compilation on big endian systems. Make the x86
> > > > > instruction decoder build host endianness agnostic to support x86 cross
> > > > > compilation and enable objtool to implement endianness awareness for
> > > > > big endian architectures support.
> > > > > 
> > > > > Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
> > > > > Co-developed-by: Vasily Gorbik <gor@linux.ibm.com>
> > > > > Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
> > > > > Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
> > > > > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> > > > 
> > > > This commit breaks the x86 build with CONFIG_X86_DECODER_SELFTEST=y.
> > > > 
> > > > I've asked Boris to truncate tip/objtool/core.
> > > 
> > > Yeah, top 4 are gone until this is resolved.
> > 
> > Masami, I wonder if we even need these selftests anymore?  Objtool
> > already decodes the entire kernel.
> 
> No, they have different roles. The selftest checks if the decoder
> works correctly by comparing with the output of objdump.
> 
> As far as I can see, the objtool relies on the sanity of the decoder
> (it trusts the output of the decoder).

Ok.  I wonder if we should move the decoder selftest to the 'tools'
subdirectory.
Masami Hiramatsu Oct. 14, 2020, 7:28 a.m. UTC | #9
On Mon, 12 Oct 2020 10:39:49 -0500
Josh Poimboeuf <jpoimboe@redhat.com> wrote:

> On Mon, Oct 12, 2020 at 09:12:36AM +0900, Masami Hiramatsu wrote:
> > On Sat, 10 Oct 2020 12:44:15 -0500
> > Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> > 
> > > On Fri, Oct 09, 2020 at 10:49:21PM +0200, Borislav Petkov wrote:
> > > > On Fri, Oct 09, 2020 at 10:38:22PM +0200, Peter Zijlstra wrote:
> > > > > On Wed, Oct 07, 2020 at 04:20:19PM -0000, tip-bot2 for Martin Schwidefsky wrote:
> > > > > > The following commit has been merged into the objtool/core branch of tip:
> > > > > > 
> > > > > > Commit-ID:     2a522b53c47051d3bf98748418f4f8e5f20d2c04
> > > > > > Gitweb:        https://git.kernel.org/tip/2a522b53c47051d3bf98748418f4f8e5f20d2c04
> > > > > > Author:        Martin Schwidefsky <schwidefsky@de.ibm.com>
> > > > > > AuthorDate:    Mon, 05 Oct 2020 17:50:31 +02:00
> > > > > > Committer:     Josh Poimboeuf <jpoimboe@redhat.com>
> > > > > > CommitterDate: Tue, 06 Oct 2020 09:32:29 -05:00
> > > > > > 
> > > > > > x86/insn: Support big endian cross-compiles
> > > > > > 
> > > > > > x86 instruction decoder code is shared across the kernel source and the
> > > > > > tools. Currently objtool seems to be the only tool from build tools needed
> > > > > > which breaks x86 cross compilation on big endian systems. Make the x86
> > > > > > instruction decoder build host endianness agnostic to support x86 cross
> > > > > > compilation and enable objtool to implement endianness awareness for
> > > > > > big endian architectures support.
> > > > > > 
> > > > > > Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
> > > > > > Co-developed-by: Vasily Gorbik <gor@linux.ibm.com>
> > > > > > Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
> > > > > > Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
> > > > > > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> > > > > 
> > > > > This commit breaks the x86 build with CONFIG_X86_DECODER_SELFTEST=y.
> > > > > 
> > > > > I've asked Boris to truncate tip/objtool/core.
> > > > 
> > > > Yeah, top 4 are gone until this is resolved.
> > > 
> > > Masami, I wonder if we even need these selftests anymore?  Objtool
> > > already decodes the entire kernel.
> > 
> > No, they have different roles. The selftest checks if the decoder
> > works correctly by comparing with the output of objdump.
> > 
> > As far as I can see, the objtool relies on the sanity of the decoder
> > (it trusts the output of the decoder).
> 
> Ok.  I wonder if we should move the decoder selftest to the 'tools'
> subdirectory.

It is in the arch/x86/tools, so it is already in a kind of tools :)
But yeah, it was considered to be used only on x86. But if someone
start trying to run it on non-x86, cross compiling, we need to
reconsider that.

Thank you,
Ian Rogers Oct. 15, 2020, 6:24 a.m. UTC | #10
On Wed, Oct 14, 2020 at 12:29 AM Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> On Mon, 12 Oct 2020 10:39:49 -0500
> Josh Poimboeuf <jpoimboe@redhat.com> wrote:
>
> > On Mon, Oct 12, 2020 at 09:12:36AM +0900, Masami Hiramatsu wrote:
> > > On Sat, 10 Oct 2020 12:44:15 -0500
> > > Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> > >
> > > > On Fri, Oct 09, 2020 at 10:49:21PM +0200, Borislav Petkov wrote:
> > > > > On Fri, Oct 09, 2020 at 10:38:22PM +0200, Peter Zijlstra wrote:
> > > > > > On Wed, Oct 07, 2020 at 04:20:19PM -0000, tip-bot2 for Martin Schwidefsky wrote:
> > > > > > > The following commit has been merged into the objtool/core branch of tip:
> > > > > > >
> > > > > > > Commit-ID:     2a522b53c47051d3bf98748418f4f8e5f20d2c04
> > > > > > > Gitweb:        https://git.kernel.org/tip/2a522b53c47051d3bf98748418f4f8e5f20d2c04
> > > > > > > Author:        Martin Schwidefsky <schwidefsky@de.ibm.com>
> > > > > > > AuthorDate:    Mon, 05 Oct 2020 17:50:31 +02:00
> > > > > > > Committer:     Josh Poimboeuf <jpoimboe@redhat.com>
> > > > > > > CommitterDate: Tue, 06 Oct 2020 09:32:29 -05:00
> > > > > > >
> > > > > > > x86/insn: Support big endian cross-compiles
> > > > > > >
> > > > > > > x86 instruction decoder code is shared across the kernel source and the
> > > > > > > tools. Currently objtool seems to be the only tool from build tools needed
> > > > > > > which breaks x86 cross compilation on big endian systems. Make the x86
> > > > > > > instruction decoder build host endianness agnostic to support x86 cross
> > > > > > > compilation and enable objtool to implement endianness awareness for
> > > > > > > big endian architectures support.
> > > > > > >
> > > > > > > Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
> > > > > > > Co-developed-by: Vasily Gorbik <gor@linux.ibm.com>
> > > > > > > Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
> > > > > > > Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
> > > > > > > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> > > > > >
> > > > > > This commit breaks the x86 build with CONFIG_X86_DECODER_SELFTEST=y.
> > > > > >
> > > > > > I've asked Boris to truncate tip/objtool/core.
> > > > >
> > > > > Yeah, top 4 are gone until this is resolved.
> > > >
> > > > Masami, I wonder if we even need these selftests anymore?  Objtool
> > > > already decodes the entire kernel.
> > >
> > > No, they have different roles. The selftest checks if the decoder
> > > works correctly by comparing with the output of objdump.
> > >
> > > As far as I can see, the objtool relies on the sanity of the decoder
> > > (it trusts the output of the decoder).
> >
> > Ok.  I wonder if we should move the decoder selftest to the 'tools'
> > subdirectory.
>
> It is in the arch/x86/tools, so it is already in a kind of tools :)
> But yeah, it was considered to be used only on x86. But if someone
> start trying to run it on non-x86, cross compiling, we need to
> reconsider that.
>
> Thank you,
>
> --
> Masami Hiramatsu <mhiramat@kernel.org>

There is undefined behavior that is present in the x86 insn.c code as
described in:
https://lore.kernel.org/lkml/20190724184512.162887-4-nums@google.com/

I resent this patch fixing other potential undefined behavior:
https://lore.kernel.org/lkml/20201015062148.1437894-1-irogers@google.com/T/#t

The misaligned loads will likely break on anything non-x86.

Thanks,
Ian
Vasily Gorbik Nov. 3, 2020, 7:35 p.m. UTC | #11
On Wed, Oct 14, 2020 at 04:28:59PM +0900, Masami Hiramatsu wrote:
> On Mon, 12 Oct 2020 10:39:49 -0500
> Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> 
> > On Mon, Oct 12, 2020 at 09:12:36AM +0900, Masami Hiramatsu wrote:
> > > On Sat, 10 Oct 2020 12:44:15 -0500
> > > Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> > > 
> > > > On Fri, Oct 09, 2020 at 10:49:21PM +0200, Borislav Petkov wrote:
> > > > > On Fri, Oct 09, 2020 at 10:38:22PM +0200, Peter Zijlstra wrote:
> > > > > > On Wed, Oct 07, 2020 at 04:20:19PM -0000, tip-bot2 for Martin Schwidefsky wrote:
> > > > > > > The following commit has been merged into the objtool/core branch of tip:
> > > > > > > 
> > > > > > > Commit-ID:     2a522b53c47051d3bf98748418f4f8e5f20d2c04
> > > > > > > Gitweb:        https://git.kernel.org/tip/2a522b53c47051d3bf98748418f4f8e5f20d2c04
> > > > > > > Author:        Martin Schwidefsky <schwidefsky@de.ibm.com>
> > > > > > > AuthorDate:    Mon, 05 Oct 2020 17:50:31 +02:00
> > > > > > > Committer:     Josh Poimboeuf <jpoimboe@redhat.com>
> > > > > > > CommitterDate: Tue, 06 Oct 2020 09:32:29 -05:00
> > > > > > > 
> > > > > > > x86/insn: Support big endian cross-compiles
> > > > > > > 
> > > > > > > x86 instruction decoder code is shared across the kernel source and the
> > > > > > > tools. Currently objtool seems to be the only tool from build tools needed
> > > > > > > which breaks x86 cross compilation on big endian systems. Make the x86
> > > > > > > instruction decoder build host endianness agnostic to support x86 cross
> > > > > > > compilation and enable objtool to implement endianness awareness for
> > > > > > > big endian architectures support.
> > > > > > > 
> > > > > > > Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
> > > > > > > Co-developed-by: Vasily Gorbik <gor@linux.ibm.com>
> > > > > > > Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
> > > > > > > Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
> > > > > > > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> > > > > > 
> > > > > > This commit breaks the x86 build with CONFIG_X86_DECODER_SELFTEST=y.
> > > > > > 
> > > > > > I've asked Boris to truncate tip/objtool/core.
> > > > > 
> > > > > Yeah, top 4 are gone until this is resolved.
> > > > 
> > > > Masami, I wonder if we even need these selftests anymore?  Objtool
> > > > already decodes the entire kernel.
> > > 
> > > No, they have different roles. The selftest checks if the decoder
> > > works correctly by comparing with the output of objdump.
> > > 
> > > As far as I can see, the objtool relies on the sanity of the decoder
> > > (it trusts the output of the decoder).
> > 
> > Ok.  I wonder if we should move the decoder selftest to the 'tools'
> > subdirectory.
> 
> It is in the arch/x86/tools, so it is already in a kind of tools :)
> But yeah, it was considered to be used only on x86. But if someone
> start trying to run it on non-x86, cross compiling, we need to
> reconsider that.

I actually tried to move it to tools/testing/selftests and encountered
several problems with kselftest build in general:
- out of source build is broken if path is relative,
- out of source build headers partially installed in
  $(srcdir)arch/x86/include/generated/ instead of $(objdir), when
  kselftests are called from the kbuild,
- out of source test runs is broken,
- kernel headers are installed unconditionally.

These things impede moving decoder selftests to kselftests.

On the other hand making the decoder selftest work "in place" seems
trivial. The following fix on top of jpoimboe/objtool/core fixes the
build, as well as cross-compilation. With that I can cross-compile
x86 kernel on s390 with CONFIG_X86_DECODER_SELFTEST=y and posttest runs
just fine.

Vasily Gorbik (1):
  x86/tools: Use tools headers for instruction decoder selftests

 arch/x86/tools/Makefile      | 8 ++++----
 arch/x86/tools/insn_sanity.c | 4 ----
 2 files changed, 4 insertions(+), 8 deletions(-)

Patch
diff mbox series

diff --git a/arch/x86/include/asm/insn.h b/arch/x86/include/asm/insn.h
index 5c1ae3e..004e27b 100644
--- a/arch/x86/include/asm/insn.h
+++ b/arch/x86/include/asm/insn.h
@@ -7,9 +7,12 @@ 
  * Copyright (C) IBM Corporation, 2009
  */
 
+#include <asm/byteorder.h>
 /* insn_attr_t is defined in inat.h */
 #include <asm/inat.h>
 
+#if defined(__BYTE_ORDER) ? __BYTE_ORDER == __LITTLE_ENDIAN : defined(__LITTLE_ENDIAN)
+
 struct insn_field {
 	union {
 		insn_value_t value;
@@ -20,6 +23,36 @@  struct insn_field {
 	unsigned char nbytes;
 };
 
+static inline void insn_field_set(struct insn_field *p, insn_value_t v,
+				  unsigned char n)
+{
+	p->value = v;
+	p->nbytes = n;
+}
+
+#else
+
+struct insn_field {
+	insn_value_t value;
+	union {
+		insn_value_t little;
+		insn_byte_t bytes[4];
+	};
+	/* !0 if we've run insn_get_xxx() for this field */
+	unsigned char got;
+	unsigned char nbytes;
+};
+
+static inline void insn_field_set(struct insn_field *p, insn_value_t v,
+				  unsigned char n)
+{
+	p->value = v;
+	p->little = __cpu_to_le32(v);
+	p->nbytes = n;
+}
+
+#endif
+
 struct insn {
 	struct insn_field prefixes;	/*
 					 * Prefixes
diff --git a/arch/x86/lib/insn.c b/arch/x86/lib/insn.c
index 4042795..520b31f 100644
--- a/arch/x86/lib/insn.c
+++ b/arch/x86/lib/insn.c
@@ -5,6 +5,7 @@ 
  * Copyright (C) IBM Corporation, 2002, 2004, 2009
  */
 
+#include <linux/kernel.h>
 #ifdef __KERNEL__
 #include <linux/string.h>
 #else
@@ -15,15 +16,28 @@ 
 
 #include <asm/emulate_prefix.h>
 
+#define leXX_to_cpu(t, r)						\
+({									\
+	__typeof__(t) v;						\
+	switch (sizeof(t)) {						\
+	case 4: v = le32_to_cpu(r); break;				\
+	case 2: v = le16_to_cpu(r); break;				\
+	case 1:	v = r; break;						\
+	default:							\
+		BUILD_BUG(); break;					\
+	}								\
+	v;								\
+})
+
 /* Verify next sizeof(t) bytes can be on the same instruction */
 #define validate_next(t, insn, n)	\
 	((insn)->next_byte + sizeof(t) + n <= (insn)->end_kaddr)
 
 #define __get_next(t, insn)	\
-	({ t r = *(t*)insn->next_byte; insn->next_byte += sizeof(t); r; })
+	({ t r = *(t*)insn->next_byte; insn->next_byte += sizeof(t); leXX_to_cpu(t, r); })
 
 #define __peek_nbyte_next(t, insn, n)	\
-	({ t r = *(t*)((insn)->next_byte + n); r; })
+	({ t r = *(t*)((insn)->next_byte + n); leXX_to_cpu(t, r); })
 
 #define get_next(t, insn)	\
 	({ if (unlikely(!validate_next(t, insn, 0))) goto err_out; __get_next(t, insn); })
@@ -157,8 +171,7 @@  found:
 		b = peek_next(insn_byte_t, insn);
 		attr = inat_get_opcode_attribute(b);
 		if (inat_is_rex_prefix(attr)) {
-			insn->rex_prefix.value = b;
-			insn->rex_prefix.nbytes = 1;
+			insn_field_set(&insn->rex_prefix, b, 1);
 			insn->next_byte++;
 			if (X86_REX_W(b))
 				/* REX.W overrides opnd_size */
@@ -295,8 +308,7 @@  void insn_get_modrm(struct insn *insn)
 
 	if (inat_has_modrm(insn->attr)) {
 		mod = get_next(insn_byte_t, insn);
-		modrm->value = mod;
-		modrm->nbytes = 1;
+		insn_field_set(modrm, mod, 1);
 		if (inat_is_group(insn->attr)) {
 			pfx_id = insn_last_prefix_id(insn);
 			insn->attr = inat_get_group_attribute(mod, pfx_id,
@@ -334,7 +346,7 @@  int insn_rip_relative(struct insn *insn)
 	 * For rip-relative instructions, the mod field (top 2 bits)
 	 * is zero and the r/m field (bottom 3 bits) is 0x5.
 	 */
-	return (modrm->nbytes && (modrm->value & 0xc7) == 0x5);
+	return (modrm->nbytes && (modrm->bytes[0] & 0xc7) == 0x5);
 }
 
 /**
@@ -353,11 +365,11 @@  void insn_get_sib(struct insn *insn)
 	if (!insn->modrm.got)
 		insn_get_modrm(insn);
 	if (insn->modrm.nbytes) {
-		modrm = (insn_byte_t)insn->modrm.value;
+		modrm = insn->modrm.bytes[0];
 		if (insn->addr_bytes != 2 &&
 		    X86_MODRM_MOD(modrm) != 3 && X86_MODRM_RM(modrm) == 4) {
-			insn->sib.value = get_next(insn_byte_t, insn);
-			insn->sib.nbytes = 1;
+			insn_field_set(&insn->sib,
+				       get_next(insn_byte_t, insn), 1);
 		}
 	}
 	insn->sib.got = 1;
@@ -407,19 +419,18 @@  void insn_get_displacement(struct insn *insn)
 		if (mod == 3)
 			goto out;
 		if (mod == 1) {
-			insn->displacement.value = get_next(signed char, insn);
-			insn->displacement.nbytes = 1;
+			insn_field_set(&insn->displacement,
+				       get_next(signed char, insn), 1);
 		} else if (insn->addr_bytes == 2) {
 			if ((mod == 0 && rm == 6) || mod == 2) {
-				insn->displacement.value =
-					 get_next(short, insn);
-				insn->displacement.nbytes = 2;
+				insn_field_set(&insn->displacement,
+					       get_next(short, insn), 2);
 			}
 		} else {
 			if ((mod == 0 && rm == 5) || mod == 2 ||
 			    (mod == 0 && base == 5)) {
-				insn->displacement.value = get_next(int, insn);
-				insn->displacement.nbytes = 4;
+				insn_field_set(&insn->displacement,
+					       get_next(int, insn), 4);
 			}
 		}
 	}
@@ -435,18 +446,14 @@  static int __get_moffset(struct insn *insn)
 {
 	switch (insn->addr_bytes) {
 	case 2:
-		insn->moffset1.value = get_next(short, insn);
-		insn->moffset1.nbytes = 2;
+		insn_field_set(&insn->moffset1, get_next(short, insn), 2);
 		break;
 	case 4:
-		insn->moffset1.value = get_next(int, insn);
-		insn->moffset1.nbytes = 4;
+		insn_field_set(&insn->moffset1, get_next(int, insn), 4);
 		break;
 	case 8:
-		insn->moffset1.value = get_next(int, insn);
-		insn->moffset1.nbytes = 4;
-		insn->moffset2.value = get_next(int, insn);
-		insn->moffset2.nbytes = 4;
+		insn_field_set(&insn->moffset1, get_next(int, insn), 4);
+		insn_field_set(&insn->moffset2, get_next(int, insn), 4);
 		break;
 	default:	/* opnd_bytes must be modified manually */
 		goto err_out;
@@ -464,13 +471,11 @@  static int __get_immv32(struct insn *insn)
 {
 	switch (insn->opnd_bytes) {
 	case 2:
-		insn->immediate.value = get_next(short, insn);
-		insn->immediate.nbytes = 2;
+		insn_field_set(&insn->immediate, get_next(short, insn), 2);
 		break;
 	case 4:
 	case 8:
-		insn->immediate.value = get_next(int, insn);
-		insn->immediate.nbytes = 4;
+		insn_field_set(&insn->immediate, get_next(int, insn), 4);
 		break;
 	default:	/* opnd_bytes must be modified manually */
 		goto err_out;
@@ -487,18 +492,15 @@  static int __get_immv(struct insn *insn)
 {
 	switch (insn->opnd_bytes) {
 	case 2:
-		insn->immediate1.value = get_next(short, insn);
-		insn->immediate1.nbytes = 2;
+		insn_field_set(&insn->immediate1, get_next(short, insn), 2);
 		break;
 	case 4:
-		insn->immediate1.value = get_next(int, insn);
+		insn_field_set(&insn->immediate1, get_next(int, insn), 4);
 		insn->immediate1.nbytes = 4;
 		break;
 	case 8:
-		insn->immediate1.value = get_next(int, insn);
-		insn->immediate1.nbytes = 4;
-		insn->immediate2.value = get_next(int, insn);
-		insn->immediate2.nbytes = 4;
+		insn_field_set(&insn->immediate1, get_next(int, insn), 4);
+		insn_field_set(&insn->immediate2, get_next(int, insn), 4);
 		break;
 	default:	/* opnd_bytes must be modified manually */
 		goto err_out;
@@ -515,12 +517,10 @@  static int __get_immptr(struct insn *insn)
 {
 	switch (insn->opnd_bytes) {
 	case 2:
-		insn->immediate1.value = get_next(short, insn);
-		insn->immediate1.nbytes = 2;
+		insn_field_set(&insn->immediate1, get_next(short, insn), 2);
 		break;
 	case 4:
-		insn->immediate1.value = get_next(int, insn);
-		insn->immediate1.nbytes = 4;
+		insn_field_set(&insn->immediate1, get_next(int, insn), 4);
 		break;
 	case 8:
 		/* ptr16:64 is not exist (no segment) */
@@ -528,8 +528,7 @@  static int __get_immptr(struct insn *insn)
 	default:	/* opnd_bytes must be modified manually */
 		goto err_out;
 	}
-	insn->immediate2.value = get_next(unsigned short, insn);
-	insn->immediate2.nbytes = 2;
+	insn_field_set(&insn->immediate2, get_next(unsigned short, insn), 2);
 	insn->immediate1.got = insn->immediate2.got = 1;
 
 	return 1;
@@ -565,22 +564,17 @@  void insn_get_immediate(struct insn *insn)
 
 	switch (inat_immediate_size(insn->attr)) {
 	case INAT_IMM_BYTE:
-		insn->immediate.value = get_next(signed char, insn);
-		insn->immediate.nbytes = 1;
+		insn_field_set(&insn->immediate, get_next(signed char, insn), 1);
 		break;
 	case INAT_IMM_WORD:
-		insn->immediate.value = get_next(short, insn);
-		insn->immediate.nbytes = 2;
+		insn_field_set(&insn->immediate, get_next(short, insn), 2);
 		break;
 	case INAT_IMM_DWORD:
-		insn->immediate.value = get_next(int, insn);
-		insn->immediate.nbytes = 4;
+		insn_field_set(&insn->immediate, get_next(int, insn), 4);
 		break;
 	case INAT_IMM_QWORD:
-		insn->immediate1.value = get_next(int, insn);
-		insn->immediate1.nbytes = 4;
-		insn->immediate2.value = get_next(int, insn);
-		insn->immediate2.nbytes = 4;
+		insn_field_set(&insn->immediate1, get_next(int, insn), 4);
+		insn_field_set(&insn->immediate2, get_next(int, insn), 4);
 		break;
 	case INAT_IMM_PTR:
 		if (!__get_immptr(insn))
@@ -599,8 +593,7 @@  void insn_get_immediate(struct insn *insn)
 		goto err_out;
 	}
 	if (inat_has_second_immediate(insn->attr)) {
-		insn->immediate2.value = get_next(signed char, insn);
-		insn->immediate2.nbytes = 1;
+		insn_field_set(&insn->immediate2, get_next(signed char, insn), 1);
 	}
 done:
 	insn->immediate.got = 1;
diff --git a/tools/arch/x86/include/asm/insn.h b/tools/arch/x86/include/asm/insn.h
index 568854b..b9b6928 100644
--- a/tools/arch/x86/include/asm/insn.h
+++ b/tools/arch/x86/include/asm/insn.h
@@ -7,9 +7,12 @@ 
  * Copyright (C) IBM Corporation, 2009
  */
 
+#include <asm/byteorder.h>
 /* insn_attr_t is defined in inat.h */
 #include "inat.h"
 
+#if defined(__BYTE_ORDER) ? __BYTE_ORDER == __LITTLE_ENDIAN : defined(__LITTLE_ENDIAN)
+
 struct insn_field {
 	union {
 		insn_value_t value;
@@ -20,6 +23,36 @@  struct insn_field {
 	unsigned char nbytes;
 };
 
+static inline void insn_field_set(struct insn_field *p, insn_value_t v,
+				  unsigned char n)
+{
+	p->value = v;
+	p->nbytes = n;
+}
+
+#else
+
+struct insn_field {
+	insn_value_t value;
+	union {
+		insn_value_t little;
+		insn_byte_t bytes[4];
+	};
+	/* !0 if we've run insn_get_xxx() for this field */
+	unsigned char got;
+	unsigned char nbytes;
+};
+
+static inline void insn_field_set(struct insn_field *p, insn_value_t v,
+				  unsigned char n)
+{
+	p->value = v;
+	p->little = __cpu_to_le32(v);
+	p->nbytes = n;
+}
+
+#endif
+
 struct insn {
 	struct insn_field prefixes;	/*
 					 * Prefixes
diff --git a/tools/arch/x86/lib/insn.c b/tools/arch/x86/lib/insn.c
index 0151dfc..77e92aa 100644
--- a/tools/arch/x86/lib/insn.c
+++ b/tools/arch/x86/lib/insn.c
@@ -5,6 +5,7 @@ 
  * Copyright (C) IBM Corporation, 2002, 2004, 2009
  */
 
+#include <linux/kernel.h>
 #ifdef __KERNEL__
 #include <linux/string.h>
 #else
@@ -15,15 +16,28 @@ 
 
 #include "../include/asm/emulate_prefix.h"
 
+#define leXX_to_cpu(t, r)						\
+({									\
+	__typeof__(t) v;						\
+	switch (sizeof(t)) {						\
+	case 4: v = le32_to_cpu(r); break;				\
+	case 2: v = le16_to_cpu(r); break;				\
+	case 1:	v = r; break;						\
+	default:							\
+		BUILD_BUG(); break;					\
+	}								\
+	v;								\
+})
+
 /* Verify next sizeof(t) bytes can be on the same instruction */
 #define validate_next(t, insn, n)	\
 	((insn)->next_byte + sizeof(t) + n <= (insn)->end_kaddr)
 
 #define __get_next(t, insn)	\
-	({ t r = *(t*)insn->next_byte; insn->next_byte += sizeof(t); r; })
+	({ t r = *(t*)insn->next_byte; insn->next_byte += sizeof(t); leXX_to_cpu(t, r); })
 
 #define __peek_nbyte_next(t, insn, n)	\
-	({ t r = *(t*)((insn)->next_byte + n); r; })
+	({ t r = *(t*)((insn)->next_byte + n); leXX_to_cpu(t, r); })
 
 #define get_next(t, insn)	\
 	({ if (unlikely(!validate_next(t, insn, 0))) goto err_out; __get_next(t, insn); })
@@ -157,8 +171,7 @@  found:
 		b = peek_next(insn_byte_t, insn);
 		attr = inat_get_opcode_attribute(b);
 		if (inat_is_rex_prefix(attr)) {
-			insn->rex_prefix.value = b;
-			insn->rex_prefix.nbytes = 1;
+			insn_field_set(&insn->rex_prefix, b, 1);
 			insn->next_byte++;
 			if (X86_REX_W(b))
 				/* REX.W overrides opnd_size */
@@ -295,8 +308,7 @@  void insn_get_modrm(struct insn *insn)
 
 	if (inat_has_modrm(insn->attr)) {
 		mod = get_next(insn_byte_t, insn);
-		modrm->value = mod;
-		modrm->nbytes = 1;
+		insn_field_set(modrm, mod, 1);
 		if (inat_is_group(insn->attr)) {
 			pfx_id = insn_last_prefix_id(insn);
 			insn->attr = inat_get_group_attribute(mod, pfx_id,
@@ -334,7 +346,7 @@  int insn_rip_relative(struct insn *insn)
 	 * For rip-relative instructions, the mod field (top 2 bits)
 	 * is zero and the r/m field (bottom 3 bits) is 0x5.
 	 */
-	return (modrm->nbytes && (modrm->value & 0xc7) == 0x5);
+	return (modrm->nbytes && (modrm->bytes[0] & 0xc7) == 0x5);
 }
 
 /**
@@ -353,11 +365,11 @@  void insn_get_sib(struct insn *insn)
 	if (!insn->modrm.got)
 		insn_get_modrm(insn);
 	if (insn->modrm.nbytes) {
-		modrm = (insn_byte_t)insn->modrm.value;
+		modrm = insn->modrm.bytes[0];
 		if (insn->addr_bytes != 2 &&
 		    X86_MODRM_MOD(modrm) != 3 && X86_MODRM_RM(modrm) == 4) {
-			insn->sib.value = get_next(insn_byte_t, insn);
-			insn->sib.nbytes = 1;
+			insn_field_set(&insn->sib,
+				       get_next(insn_byte_t, insn), 1);
 		}
 	}
 	insn->sib.got = 1;
@@ -407,19 +419,18 @@  void insn_get_displacement(struct insn *insn)
 		if (mod == 3)
 			goto out;
 		if (mod == 1) {
-			insn->displacement.value = get_next(signed char, insn);
-			insn->displacement.nbytes = 1;
+			insn_field_set(&insn->displacement,
+				       get_next(signed char, insn), 1);
 		} else if (insn->addr_bytes == 2) {
 			if ((mod == 0 && rm == 6) || mod == 2) {
-				insn->displacement.value =
-					 get_next(short, insn);
-				insn->displacement.nbytes = 2;
+				insn_field_set(&insn->displacement,
+					       get_next(short, insn), 2);
 			}
 		} else {
 			if ((mod == 0 && rm == 5) || mod == 2 ||
 			    (mod == 0 && base == 5)) {
-				insn->displacement.value = get_next(int, insn);
-				insn->displacement.nbytes = 4;
+				insn_field_set(&insn->displacement,
+					       get_next(int, insn), 4);
 			}
 		}
 	}
@@ -435,18 +446,14 @@  static int __get_moffset(struct insn *insn)
 {
 	switch (insn->addr_bytes) {
 	case 2:
-		insn->moffset1.value = get_next(short, insn);
-		insn->moffset1.nbytes = 2;
+		insn_field_set(&insn->moffset1, get_next(short, insn), 2);
 		break;
 	case 4:
-		insn->moffset1.value = get_next(int, insn);
-		insn->moffset1.nbytes = 4;
+		insn_field_set(&insn->moffset1, get_next(int, insn), 4);
 		break;
 	case 8:
-		insn->moffset1.value = get_next(int, insn);
-		insn->moffset1.nbytes = 4;
-		insn->moffset2.value = get_next(int, insn);
-		insn->moffset2.nbytes = 4;
+		insn_field_set(&insn->moffset1, get_next(int, insn), 4);
+		insn_field_set(&insn->moffset2, get_next(int, insn), 4);
 		break;
 	default:	/* opnd_bytes must be modified manually */
 		goto err_out;
@@ -464,13 +471,11 @@  static int __get_immv32(struct insn *insn)
 {
 	switch (insn->opnd_bytes) {
 	case 2:
-		insn->immediate.value = get_next(short, insn);
-		insn->immediate.nbytes = 2;
+		insn_field_set(&insn->immediate, get_next(short, insn), 2);
 		break;
 	case 4:
 	case 8:
-		insn->immediate.value = get_next(int, insn);
-		insn->immediate.nbytes = 4;
+		insn_field_set(&insn->immediate, get_next(int, insn), 4);
 		break;
 	default:	/* opnd_bytes must be modified manually */
 		goto err_out;
@@ -487,18 +492,15 @@  static int __get_immv(struct insn *insn)
 {
 	switch (insn->opnd_bytes) {
 	case 2:
-		insn->immediate1.value = get_next(short, insn);
-		insn->immediate1.nbytes = 2;
+		insn_field_set(&insn->immediate1, get_next(short, insn), 2);
 		break;
 	case 4:
-		insn->immediate1.value = get_next(int, insn);
+		insn_field_set(&insn->immediate1, get_next(int, insn), 4);
 		insn->immediate1.nbytes = 4;
 		break;
 	case 8:
-		insn->immediate1.value = get_next(int, insn);
-		insn->immediate1.nbytes = 4;
-		insn->immediate2.value = get_next(int, insn);
-		insn->immediate2.nbytes = 4;
+		insn_field_set(&insn->immediate1, get_next(int, insn), 4);
+		insn_field_set(&insn->immediate2, get_next(int, insn), 4);
 		break;
 	default:	/* opnd_bytes must be modified manually */
 		goto err_out;
@@ -515,12 +517,10 @@  static int __get_immptr(struct insn *insn)
 {
 	switch (insn->opnd_bytes) {
 	case 2:
-		insn->immediate1.value = get_next(short, insn);
-		insn->immediate1.nbytes = 2;
+		insn_field_set(&insn->immediate1, get_next(short, insn), 2);
 		break;
 	case 4:
-		insn->immediate1.value = get_next(int, insn);
-		insn->immediate1.nbytes = 4;
+		insn_field_set(&insn->immediate1, get_next(int, insn), 4);
 		break;
 	case 8:
 		/* ptr16:64 is not exist (no segment) */
@@ -528,8 +528,7 @@  static int __get_immptr(struct insn *insn)
 	default:	/* opnd_bytes must be modified manually */
 		goto err_out;
 	}
-	insn->immediate2.value = get_next(unsigned short, insn);
-	insn->immediate2.nbytes = 2;
+	insn_field_set(&insn->immediate2, get_next(unsigned short, insn), 2);
 	insn->immediate1.got = insn->immediate2.got = 1;
 
 	return 1;
@@ -565,22 +564,17 @@  void insn_get_immediate(struct insn *insn)
 
 	switch (inat_immediate_size(insn->attr)) {
 	case INAT_IMM_BYTE:
-		insn->immediate.value = get_next(signed char, insn);
-		insn->immediate.nbytes = 1;
+		insn_field_set(&insn->immediate, get_next(signed char, insn), 1);
 		break;
 	case INAT_IMM_WORD:
-		insn->immediate.value = get_next(short, insn);
-		insn->immediate.nbytes = 2;
+		insn_field_set(&insn->immediate, get_next(short, insn), 2);
 		break;
 	case INAT_IMM_DWORD:
-		insn->immediate.value = get_next(int, insn);
-		insn->immediate.nbytes = 4;
+		insn_field_set(&insn->immediate, get_next(int, insn), 4);
 		break;
 	case INAT_IMM_QWORD:
-		insn->immediate1.value = get_next(int, insn);
-		insn->immediate1.nbytes = 4;
-		insn->immediate2.value = get_next(int, insn);
-		insn->immediate2.nbytes = 4;
+		insn_field_set(&insn->immediate1, get_next(int, insn), 4);
+		insn_field_set(&insn->immediate2, get_next(int, insn), 4);
 		break;
 	case INAT_IMM_PTR:
 		if (!__get_immptr(insn))
@@ -599,8 +593,7 @@  void insn_get_immediate(struct insn *insn)
 		goto err_out;
 	}
 	if (inat_has_second_immediate(insn->attr)) {
-		insn->immediate2.value = get_next(signed char, insn);
-		insn->immediate2.nbytes = 1;
+		insn_field_set(&insn->immediate2, get_next(signed char, insn), 1);
 	}
 done:
 	insn->immediate.got = 1;