linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] tools/perf: fix the word selected in find_*_bit
@ 2016-06-15 11:42 Madhavan Srinivasan
  2016-06-15 12:44 ` George Spelvin
  2016-06-15 19:51 ` Yury Norov
  0 siblings, 2 replies; 16+ messages in thread
From: Madhavan Srinivasan @ 2016-06-15 11:42 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev
  Cc: Madhavan Srinivasan, Arnaldo Carvalho de Melo, Adrian Hunter,
	Borislav Petkov, David Ahern, George Spelvin, Jiri Olsa,
	Namhyung Kim, Rasmus Villemoes, Wang Nan, Yury Norov,
	Michael Ellerman

When decoding the perf_regs mask in regs_dump__printf(),
we loop through the mask using find_first_bit and find_next_bit functions.
And mask is of type "u64". But "u64" is send as a "unsigned long *" to
lib functions along with sizeof().

While the exisitng code works fine in most of the case, when using a 32bit perf
on a 64bit kernel (Big Endian), we end reading the wrong word. In find_first_bit(),
one word at a time (based on BITS_PER_LONG) is loaded and
checked for any bit set. In 32bit BE userspace,
BITS_PER_LONG turns out to be 32, and for a mask value of
"0x00000000000000ff", find_first_bit will return 32, instead of 0.
Reason for this is that, value in the word0 is all zeros and value
in word1 is 0xff. Ideally, second word in the mask should be loaded
and searched. Patch swaps the word to look incase of 32bit BE.

Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: George Spelvin <linux@horizon.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Yury Norov <yury.norov@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
---
 tools/lib/find_bit.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/tools/lib/find_bit.c b/tools/lib/find_bit.c
index 9122a9e80046..996b3e04324f 100644
--- a/tools/lib/find_bit.c
+++ b/tools/lib/find_bit.c
@@ -37,7 +37,12 @@ static unsigned long _find_next_bit(const unsigned long *addr,
 	if (!nbits || start >= nbits)
 		return nbits;

+#if (__BYTE_ORDER == __BIG_ENDIAN) && (BITS_PER_LONG != 64)
+	tmp = addr[(((nbits - 1)/BITS_PER_LONG) - (start / BITS_PER_LONG))]
+								^ invert;
+#else
 	tmp = addr[start / BITS_PER_LONG] ^ invert;
+#endif

 	/* Handle 1st word. */
 	tmp &= BITMAP_FIRST_WORD_MASK(start);
@@ -48,7 +53,12 @@ static unsigned long _find_next_bit(const unsigned long *addr,
 		if (start >= nbits)
 			return nbits;

+#if (__BYTE_ORDER == __BIG_ENDIAN) && (BITS_PER_LONG != 64)
+		tmp = addr[(((nbits - 1)/BITS_PER_LONG) - (start / BITS_PER_LONG))]
+								^ invert;
+#else
 		tmp = addr[start / BITS_PER_LONG] ^ invert;
+#endif
 	}

 	return min(start + __ffs(tmp), nbits);
@@ -75,8 +85,15 @@ unsigned long find_first_bit(const unsigned long *addr, unsigned long size)
 	unsigned long idx;

 	for (idx = 0; idx * BITS_PER_LONG < size; idx++) {
+#if (__BYTE_ORDER == __BIG_ENDIAN) && (BITS_PER_LONG != 64)
+		if (addr[(((size-1)/BITS_PER_LONG) - idx)])
+			return min(idx * BITS_PER_LONG +
+				__ffs(addr[(((size-1)/BITS_PER_LONG) - idx)]),
+									size);
+#else
 		if (addr[idx])
 			return min(idx * BITS_PER_LONG + __ffs(addr[idx]), size);
+#endif
 	}

 	return size;
--
1.9.1

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] tools/perf: fix the word selected in find_*_bit
  2016-06-15 11:42 [PATCH] tools/perf: fix the word selected in find_*_bit Madhavan Srinivasan
@ 2016-06-15 12:44 ` George Spelvin
  2016-06-16  7:21   ` Madhavan Srinivasan
  2016-06-15 19:51 ` Yury Norov
  1 sibling, 1 reply; 16+ messages in thread
From: George Spelvin @ 2016-06-15 12:44 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev, maddy
  Cc: acme, adrian.hunter, bp, dsahern, jolsa, linux, linux, mpe,
	namhyung, wangnan0, yury.norov

Madhavan Srinivasan wrote:
> +#if (__BYTE_ORDER == __BIG_ENDIAN) && (BITS_PER_LONG != 64)
> +	tmp = addr[(((nbits - 1)/BITS_PER_LONG) - (start / BITS_PER_LONG))]
> +								^ invert;
> +#else
>  	tmp = addr[start / BITS_PER_LONG] ^ invert;
> +#endif

Than you for diagnosing this problem, but I don't think the fix
is correct.

1) It's not clear that all users of _find_next_bit and for_each_set_bit()
   want this change.
2) Is your code even correct?  I'd think you'd want addr[x ^ 1].  Are you
   sure you shpuld be reversing the whole array, and not just the halves of
   each 64-bit word?
3) You've now broken the case of 32-bit big-endian kernel.

I think the proper solution is uglier than this. :-(

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] tools/perf: fix the word selected in find_*_bit
  2016-06-15 11:42 [PATCH] tools/perf: fix the word selected in find_*_bit Madhavan Srinivasan
  2016-06-15 12:44 ` George Spelvin
@ 2016-06-15 19:51 ` Yury Norov
  2016-06-15 21:11   ` Yury Norov
  2016-06-16  7:15   ` Madhavan Srinivasan
  1 sibling, 2 replies; 16+ messages in thread
From: Yury Norov @ 2016-06-15 19:51 UTC (permalink / raw)
  To: Madhavan Srinivasan
  Cc: linux-kernel, linuxppc-dev, Arnaldo Carvalho de Melo,
	Adrian Hunter, Borislav Petkov, David Ahern, George Spelvin,
	Jiri Olsa, Namhyung Kim, Rasmus Villemoes, Wang Nan, Yury Norov,
	Michael Ellerman

Hi Madhavan,

On Wed, Jun 15, 2016 at 05:12:53PM +0530, Madhavan Srinivasan wrote:
> When decoding the perf_regs mask in regs_dump__printf(),
> we loop through the mask using find_first_bit and find_next_bit functions.
> And mask is of type "u64". But "u64" is send as a "unsigned long *" to
> lib functions along with sizeof().
> 
> While the exisitng code works fine in most of the case, when using a 32bit perf
> on a 64bit kernel (Big Endian), we end reading the wrong word. In find_first_bit(),
> one word at a time (based on BITS_PER_LONG) is loaded and
> checked for any bit set. In 32bit BE userspace,
> BITS_PER_LONG turns out to be 32, and for a mask value of
> "0x00000000000000ff", find_first_bit will return 32, instead of 0.
> Reason for this is that, value in the word0 is all zeros and value
> in word1 is 0xff. Ideally, second word in the mask should be loaded
> and searched. Patch swaps the word to look incase of 32bit BE.

I think this is not a problem of find_bit() at all. You have wrong
typecast as the source of problem (tools/perf/util/session.c"):

940 static void regs_dump__printf(u64 mask, u64 *regs)
941 {
942         unsigned rid, i = 0;
943 
944         for_each_set_bit(rid, (unsigned long *) &mask, sizeof(mask) * 8) {
                                          ^^^^ Here ^^^^
945                 u64 val = regs[i++];
946 
947                 printf(".... %-5s 0x%" PRIx64 "\n",
948                        perf_reg_name(rid), val);
949         }
950 }

But for some reason you change correct find_bit()...

Though proper fix is like this for me:

static void regs_dump__printf(u64 mask, u64 *regs)
{
        unsigned rid, i = 0;
        unsigned long _mask[sizeof(mask)/sizeof(unsigned long)];

        _mask[0] = mask & ULONG_MAX;
        if (sizeof(mask) > sizeof(unsigned long))
                _mask[1] = mask >> BITS_PER_LONG;

        for_each_set_bit(rid, _mask, sizeof(mask) * BITS_PER_BYTE) {
                u64 val = regs[i++];

                printf(".... %-5s 0x%" PRIx64 "\n",
                       perf_reg_name(rid), val);
        }
}

Maybe there already is some macro doing the conversion for you...

Yury.

> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
> Cc: Adrian Hunter <adrian.hunter@intel.com>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: David Ahern <dsahern@gmail.com>
> Cc: George Spelvin <linux@horizon.com>
> Cc: Jiri Olsa <jolsa@redhat.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
> Cc: Wang Nan <wangnan0@huawei.com>
> Cc: Yury Norov <yury.norov@gmail.com>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
> ---
>  tools/lib/find_bit.c | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
> 
> diff --git a/tools/lib/find_bit.c b/tools/lib/find_bit.c
> index 9122a9e80046..996b3e04324f 100644
> --- a/tools/lib/find_bit.c
> +++ b/tools/lib/find_bit.c
> @@ -37,7 +37,12 @@ static unsigned long _find_next_bit(const unsigned long *addr,
>  	if (!nbits || start >= nbits)
>  		return nbits;
> 
> +#if (__BYTE_ORDER == __BIG_ENDIAN) && (BITS_PER_LONG != 64)
> +	tmp = addr[(((nbits - 1)/BITS_PER_LONG) - (start / BITS_PER_LONG))]
> +								^ invert;
> +#else
>  	tmp = addr[start / BITS_PER_LONG] ^ invert;
> +#endif
> 
>  	/* Handle 1st word. */
>  	tmp &= BITMAP_FIRST_WORD_MASK(start);
> @@ -48,7 +53,12 @@ static unsigned long _find_next_bit(const unsigned long *addr,
>  		if (start >= nbits)
>  			return nbits;
> 
> +#if (__BYTE_ORDER == __BIG_ENDIAN) && (BITS_PER_LONG != 64)
> +		tmp = addr[(((nbits - 1)/BITS_PER_LONG) - (start / BITS_PER_LONG))]
> +								^ invert;
> +#else
>  		tmp = addr[start / BITS_PER_LONG] ^ invert;
> +#endif
>  	}
> 
>  	return min(start + __ffs(tmp), nbits);
> @@ -75,8 +85,15 @@ unsigned long find_first_bit(const unsigned long *addr, unsigned long size)
>  	unsigned long idx;
> 
>  	for (idx = 0; idx * BITS_PER_LONG < size; idx++) {
> +#if (__BYTE_ORDER == __BIG_ENDIAN) && (BITS_PER_LONG != 64)
> +		if (addr[(((size-1)/BITS_PER_LONG) - idx)])
> +			return min(idx * BITS_PER_LONG +
> +				__ffs(addr[(((size-1)/BITS_PER_LONG) - idx)]),
> +									size);
> +#else
>  		if (addr[idx])
>  			return min(idx * BITS_PER_LONG + __ffs(addr[idx]), size);
> +#endif
>  	}
> 
>  	return size;
> --
> 1.9.1

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] tools/perf: fix the word selected in find_*_bit
  2016-06-15 19:51 ` Yury Norov
@ 2016-06-15 21:11   ` Yury Norov
  2016-06-15 21:29     ` Arnaldo Carvalho de Melo
  2016-06-16 13:11     ` Madhavan Srinivasan
  2016-06-16  7:15   ` Madhavan Srinivasan
  1 sibling, 2 replies; 16+ messages in thread
From: Yury Norov @ 2016-06-15 21:11 UTC (permalink / raw)
  To: Madhavan Srinivasan
  Cc: linux-kernel, linuxppc-dev, Arnaldo Carvalho de Melo,
	Adrian Hunter, Borislav Petkov, David Ahern, George Spelvin,
	Jiri Olsa, Namhyung Kim, Rasmus Villemoes, Wang Nan, Yury Norov,
	Michael Ellerman

On Wed, Jun 15, 2016 at 10:51:27PM +0300, Yury Norov wrote:
> Hi Madhavan,
> 
> On Wed, Jun 15, 2016 at 05:12:53PM +0530, Madhavan Srinivasan wrote:
> > When decoding the perf_regs mask in regs_dump__printf(),
> > we loop through the mask using find_first_bit and find_next_bit functions.
> > And mask is of type "u64". But "u64" is send as a "unsigned long *" to
> > lib functions along with sizeof().
> > 
> > While the exisitng code works fine in most of the case, when using a 32bit perf
> > on a 64bit kernel (Big Endian), we end reading the wrong word. In find_first_bit(),
> > one word at a time (based on BITS_PER_LONG) is loaded and
> > checked for any bit set. In 32bit BE userspace,
> > BITS_PER_LONG turns out to be 32, and for a mask value of
> > "0x00000000000000ff", find_first_bit will return 32, instead of 0.
> > Reason for this is that, value in the word0 is all zeros and value
> > in word1 is 0xff. Ideally, second word in the mask should be loaded
> > and searched. Patch swaps the word to look incase of 32bit BE.
> 
> I think this is not a problem of find_bit() at all. You have wrong
> typecast as the source of problem (tools/perf/util/session.c"):
> 
> 940 static void regs_dump__printf(u64 mask, u64 *regs)
> 941 {
> 942         unsigned rid, i = 0;
> 943 
> 944         for_each_set_bit(rid, (unsigned long *) &mask, sizeof(mask) * 8) {
>                                           ^^^^ Here ^^^^
> 945                 u64 val = regs[i++];
> 946 
> 947                 printf(".... %-5s 0x%" PRIx64 "\n",
> 948                        perf_reg_name(rid), val);
> 949         }
> 950 }
> 
> But for some reason you change correct find_bit()...
> 
> Though proper fix is like this for me:
> 
> static void regs_dump__printf(u64 mask, u64 *regs)
> {
>         unsigned rid, i = 0;
>         unsigned long _mask[sizeof(mask)/sizeof(unsigned long)];
> 
>         _mask[0] = mask & ULONG_MAX;
>         if (sizeof(mask) > sizeof(unsigned long))
>                 _mask[1] = mask >> BITS_PER_LONG;
> 
>         for_each_set_bit(rid, _mask, sizeof(mask) * BITS_PER_BYTE) {
>                 u64 val = regs[i++];
> 
>                 printf(".... %-5s 0x%" PRIx64 "\n",
>                        perf_reg_name(rid), val);
>         }
> }
> 
> Maybe there already is some macro doing the conversion for you...

yes it is, cpu_to_le64() is what you want

> 
> Yury.
> 
> > Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
> > Cc: Adrian Hunter <adrian.hunter@intel.com>
> > Cc: Borislav Petkov <bp@suse.de>
> > Cc: David Ahern <dsahern@gmail.com>
> > Cc: George Spelvin <linux@horizon.com>
> > Cc: Jiri Olsa <jolsa@redhat.com>
> > Cc: Namhyung Kim <namhyung@kernel.org>
> > Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
> > Cc: Wang Nan <wangnan0@huawei.com>
> > Cc: Yury Norov <yury.norov@gmail.com>
> > Cc: Michael Ellerman <mpe@ellerman.id.au>
> > Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
> > ---
> >  tools/lib/find_bit.c | 17 +++++++++++++++++
> >  1 file changed, 17 insertions(+)
> > 
> > diff --git a/tools/lib/find_bit.c b/tools/lib/find_bit.c
> > index 9122a9e80046..996b3e04324f 100644
> > --- a/tools/lib/find_bit.c
> > +++ b/tools/lib/find_bit.c
> > @@ -37,7 +37,12 @@ static unsigned long _find_next_bit(const unsigned long *addr,
> >  	if (!nbits || start >= nbits)
> >  		return nbits;
> > 
> > +#if (__BYTE_ORDER == __BIG_ENDIAN) && (BITS_PER_LONG != 64)
> > +	tmp = addr[(((nbits - 1)/BITS_PER_LONG) - (start / BITS_PER_LONG))]
> > +								^ invert;
> > +#else
> >  	tmp = addr[start / BITS_PER_LONG] ^ invert;
> > +#endif
> > 
> >  	/* Handle 1st word. */
> >  	tmp &= BITMAP_FIRST_WORD_MASK(start);
> > @@ -48,7 +53,12 @@ static unsigned long _find_next_bit(const unsigned long *addr,
> >  		if (start >= nbits)
> >  			return nbits;
> > 
> > +#if (__BYTE_ORDER == __BIG_ENDIAN) && (BITS_PER_LONG != 64)
> > +		tmp = addr[(((nbits - 1)/BITS_PER_LONG) - (start / BITS_PER_LONG))]
> > +								^ invert;
> > +#else
> >  		tmp = addr[start / BITS_PER_LONG] ^ invert;
> > +#endif
> >  	}
> > 
> >  	return min(start + __ffs(tmp), nbits);
> > @@ -75,8 +85,15 @@ unsigned long find_first_bit(const unsigned long *addr, unsigned long size)
> >  	unsigned long idx;
> > 
> >  	for (idx = 0; idx * BITS_PER_LONG < size; idx++) {
> > +#if (__BYTE_ORDER == __BIG_ENDIAN) && (BITS_PER_LONG != 64)
> > +		if (addr[(((size-1)/BITS_PER_LONG) - idx)])
> > +			return min(idx * BITS_PER_LONG +
> > +				__ffs(addr[(((size-1)/BITS_PER_LONG) - idx)]),
> > +									size);
> > +#else
> >  		if (addr[idx])
> >  			return min(idx * BITS_PER_LONG + __ffs(addr[idx]), size);
> > +#endif
> >  	}
> > 
> >  	return size;
> > --
> > 1.9.1

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] tools/perf: fix the word selected in find_*_bit
  2016-06-15 21:11   ` Yury Norov
@ 2016-06-15 21:29     ` Arnaldo Carvalho de Melo
  2016-06-16  1:27       ` [PATCH] tools include: Fix wrong macro definitions for cpu_to_le* for big endian He Kuang
                         ` (2 more replies)
  2016-06-16 13:11     ` Madhavan Srinivasan
  1 sibling, 3 replies; 16+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-06-15 21:29 UTC (permalink / raw)
  To: Yury Norov
  Cc: Madhavan Srinivasan, linux-kernel, linuxppc-dev, Adrian Hunter,
	Borislav Petkov, David Ahern, George Spelvin, Jiri Olsa,
	Namhyung Kim, Rasmus Villemoes, Wang Nan, Yury Norov,
	Michael Ellerman

Em Thu, Jun 16, 2016 at 12:11:04AM +0300, Yury Norov escreveu:
> On Wed, Jun 15, 2016 at 10:51:27PM +0300, Yury Norov wrote:
> > Maybe there already is some macro doing the conversion for you...
> 
> yes it is, cpu_to_le64() is what you want

Beware that the cpu_to_le64() in tools/perf is bogus, we need to grab a
copy from the kernel sources.

- Arnaldo

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH] tools include: Fix wrong macro definitions for cpu_to_le* for big endian
  2016-06-15 21:29     ` Arnaldo Carvalho de Melo
@ 2016-06-16  1:27       ` He Kuang
  2016-06-16  1:32       ` He Kuang
  2016-06-16  1:35       ` [PATCH] tools/perf: fix the word selected in find_*_bit Hekuang
  2 siblings, 0 replies; 16+ messages in thread
From: He Kuang @ 2016-06-16  1:27 UTC (permalink / raw)
  To: peterz, mingo, acme, alexander.shishkin, hekuang, wangnan0,
	adrian.hunter, ak, maddy, bp, dsahern, linux, jolsa, namhyung,
	linux, yury.norov, mpe
  Cc: linux-kernel, linuxppc-dev

From: Wang Nan <wangnan0@huawei.com>

The cpu_to_le* macros in kernel.h are defined without considering
endianese. This patch includes "byteoder/generic.h" instead to fix the
bug, and removes redundant le64_to_cpu definition in intel-bts.c.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
---
 tools/include/linux/kernel.h | 5 ++---
 tools/perf/util/intel-bts.c  | 5 -----
 2 files changed, 2 insertions(+), 8 deletions(-)

diff --git a/tools/include/linux/kernel.h b/tools/include/linux/kernel.h
index 76df535..6145e41 100644
--- a/tools/include/linux/kernel.h
+++ b/tools/include/linux/kernel.h
@@ -64,11 +64,10 @@
 #endif
 
 /*
- * Both need more care to handle endianness
+ * Need more care to handle endianness
  * (Don't use bitmap_copy_le() for now)
  */
-#define cpu_to_le64(x)	(x)
-#define cpu_to_le32(x)	(x)
+#include <linux/byteorder/generic.h>
 
 static inline int
 vscnprintf(char *buf, size_t size, const char *fmt, va_list args)
diff --git a/tools/perf/util/intel-bts.c b/tools/perf/util/intel-bts.c
index 9df9960..0e632c4 100644
--- a/tools/perf/util/intel-bts.c
+++ b/tools/perf/util/intel-bts.c
@@ -40,11 +40,6 @@
 #define INTEL_BTS_ERR_NOINSN  5
 #define INTEL_BTS_ERR_LOST    9
 
-#if __BYTE_ORDER == __BIG_ENDIAN
-#define le64_to_cpu bswap_64
-#else
-#define le64_to_cpu
-#endif
 
 struct intel_bts {
 	struct auxtrace			auxtrace;
-- 
1.8.3.4

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH] tools include: Fix wrong macro definitions for cpu_to_le* for big endian
  2016-06-15 21:29     ` Arnaldo Carvalho de Melo
  2016-06-16  1:27       ` [PATCH] tools include: Fix wrong macro definitions for cpu_to_le* for big endian He Kuang
@ 2016-06-16  1:32       ` He Kuang
  2016-06-16  1:32         ` [PATCH 1/2] tools include: Sync byteorder/generic.h He Kuang
  2016-06-16  1:32         ` [PATCH 2/2] tools include: Fix wrong macro definitions for cpu_to_le* for big endian He Kuang
  2016-06-16  1:35       ` [PATCH] tools/perf: fix the word selected in find_*_bit Hekuang
  2 siblings, 2 replies; 16+ messages in thread
From: He Kuang @ 2016-06-16  1:32 UTC (permalink / raw)
  To: peterz, mingo, acme, alexander.shishkin, hekuang, wangnan0,
	adrian.hunter, ak, maddy, bp, dsahern, linux, jolsa, namhyung,
	linux, yury.norov, mpe
  Cc: linux-kernel, linuxppc-dev

From: Wang Nan <wangnan0@huawei.com>

The cpu_to_le* macros in kernel.h are defined without considering
endianese. This patch includes "byteoder/generic.h" instead to fix the
bug, and removes redundant le64_to_cpu definition in intel-bts.c.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
---
 tools/include/linux/kernel.h | 5 ++---
 tools/perf/util/intel-bts.c  | 5 -----
 2 files changed, 2 insertions(+), 8 deletions(-)

diff --git a/tools/include/linux/kernel.h b/tools/include/linux/kernel.h
index 76df535..6145e41 100644
--- a/tools/include/linux/kernel.h
+++ b/tools/include/linux/kernel.h
@@ -64,11 +64,10 @@
 #endif
 
 /*
- * Both need more care to handle endianness
+ * Need more care to handle endianness
  * (Don't use bitmap_copy_le() for now)
  */
-#define cpu_to_le64(x)	(x)
-#define cpu_to_le32(x)	(x)
+#include <linux/byteorder/generic.h>
 
 static inline int
 vscnprintf(char *buf, size_t size, const char *fmt, va_list args)
diff --git a/tools/perf/util/intel-bts.c b/tools/perf/util/intel-bts.c
index 9df9960..0e632c4 100644
--- a/tools/perf/util/intel-bts.c
+++ b/tools/perf/util/intel-bts.c
@@ -40,11 +40,6 @@
 #define INTEL_BTS_ERR_NOINSN  5
 #define INTEL_BTS_ERR_LOST    9
 
-#if __BYTE_ORDER == __BIG_ENDIAN
-#define le64_to_cpu bswap_64
-#else
-#define le64_to_cpu
-#endif
 
 struct intel_bts {
 	struct auxtrace			auxtrace;
-- 
1.8.3.4

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/2] tools include: Sync byteorder/generic.h
  2016-06-16  1:32       ` He Kuang
@ 2016-06-16  1:32         ` He Kuang
  2016-06-16  6:39           ` Jiri Olsa
  2016-06-16  1:32         ` [PATCH 2/2] tools include: Fix wrong macro definitions for cpu_to_le* for big endian He Kuang
  1 sibling, 1 reply; 16+ messages in thread
From: He Kuang @ 2016-06-16  1:32 UTC (permalink / raw)
  To: peterz, mingo, acme, alexander.shishkin, hekuang, wangnan0,
	adrian.hunter, ak, maddy, bp, dsahern, linux, jolsa, namhyung,
	linux, yury.norov, mpe
  Cc: linux-kernel, linuxppc-dev

From: Wang Nan <wangnan0@huawei.com>

This patch copies "include/linux/byteorder/generic.h" to
"tools/include/linux/byteorder/generic.h" to enable other libraries to
use macros in it.

tools/perf/MANIFEST is also updated for 'make perf-*-src-pkg'.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
---
 tools/include/linux/byteorder/generic.h | 48 +++++++++++++++++++++++++++++++++
 tools/perf/MANIFEST                     |  1 +
 2 files changed, 49 insertions(+)
 create mode 100644 tools/include/linux/byteorder/generic.h

diff --git a/tools/include/linux/byteorder/generic.h b/tools/include/linux/byteorder/generic.h
new file mode 100644
index 0000000..41b4507
--- /dev/null
+++ b/tools/include/linux/byteorder/generic.h
@@ -0,0 +1,48 @@
+#ifndef _TOOLS_LINUX_BYTEORDER_GENERIC_H
+#define _TOOLS_LINUX_BYTEORDER_GENERIC_H
+
+#include <endian.h>
+#include <byteswap.h>
+
+#define cpu_to_le64 __cpu_to_le64
+#define le64_to_cpu __le64_to_cpu
+#define cpu_to_le32 __cpu_to_le32
+#define le32_to_cpu __le32_to_cpu
+#define cpu_to_le16 __cpu_to_le16
+#define le16_to_cpu __le16_to_cpu
+#define cpu_to_be64 __cpu_to_be64
+#define be64_to_cpu __be64_to_cpu
+#define cpu_to_be32 __cpu_to_be32
+#define be32_to_cpu __be32_to_cpu
+#define cpu_to_be16 __cpu_to_be16
+#define be16_to_cpu __be16_to_cpu
+
+#if __BYTE_ORDER == __BIG_ENDIAN
+#define __cpu_to_le16 bswap_16
+#define __cpu_to_le32 bswap_32
+#define __cpu_to_le64 bswap_64
+#define __le16_to_cpu bswap_16
+#define __le32_to_cpu bswap_32
+#define __le64_to_cpu bswap_64
+#define __cpu_to_be16
+#define __cpu_to_be32
+#define __cpu_to_be64
+#define __be16_to_cpu
+#define __be32_to_cpu
+#define __be64_to_cpu
+#else
+#define __cpu_to_le16
+#define __cpu_to_le32
+#define __cpu_to_le64
+#define __le16_to_cpu
+#define __le32_to_cpu
+#define __le64_to_cpu
+#define __cpu_to_be16 bswap_16
+#define __cpu_to_be32 bswap_32
+#define __cpu_to_be64 bswap_64
+#define __be16_to_cpu bswap_16
+#define __be32_to_cpu bswap_32
+#define __be64_to_cpu bswap_64
+#endif
+
+#endif /* _TOOLS_LINUX_BYTEORDER_GENERIC_H */
diff --git a/tools/perf/MANIFEST b/tools/perf/MANIFEST
index 8c8c6b9..80ac3d4 100644
--- a/tools/perf/MANIFEST
+++ b/tools/perf/MANIFEST
@@ -46,6 +46,7 @@ tools/include/asm-generic/bitops/hweight.h
 tools/include/asm-generic/bitops.h
 tools/include/linux/atomic.h
 tools/include/linux/bitops.h
+tools/include/linux/byteorder/generic.h
 tools/include/linux/compiler.h
 tools/include/linux/filter.h
 tools/include/linux/hash.h
-- 
1.8.5.2

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 2/2] tools include: Fix wrong macro definitions for cpu_to_le* for big endian
  2016-06-16  1:32       ` He Kuang
  2016-06-16  1:32         ` [PATCH 1/2] tools include: Sync byteorder/generic.h He Kuang
@ 2016-06-16  1:32         ` He Kuang
  2016-06-16  6:21           ` kbuild test robot
  2016-06-16  6:39           ` Jiri Olsa
  1 sibling, 2 replies; 16+ messages in thread
From: He Kuang @ 2016-06-16  1:32 UTC (permalink / raw)
  To: peterz, mingo, acme, alexander.shishkin, hekuang, wangnan0,
	adrian.hunter, ak, maddy, bp, dsahern, linux, jolsa, namhyung,
	linux, yury.norov, mpe
  Cc: linux-kernel, linuxppc-dev

From: Wang Nan <wangnan0@huawei.com>

The cpu_to_le* macros in kernel.h are defined without considering
endianese. This patch includes "byteoder/generic.h" instead to fix the
bug, and removes redundant le64_to_cpu definition in intel-bts.c.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
---
 tools/include/linux/kernel.h | 5 ++---
 tools/perf/util/intel-bts.c  | 5 -----
 2 files changed, 2 insertions(+), 8 deletions(-)

diff --git a/tools/include/linux/kernel.h b/tools/include/linux/kernel.h
index 76df535..6145e41 100644
--- a/tools/include/linux/kernel.h
+++ b/tools/include/linux/kernel.h
@@ -64,11 +64,10 @@
 #endif
 
 /*
- * Both need more care to handle endianness
+ * Need more care to handle endianness
  * (Don't use bitmap_copy_le() for now)
  */
-#define cpu_to_le64(x)	(x)
-#define cpu_to_le32(x)	(x)
+#include <linux/byteorder/generic.h>
 
 static inline int
 vscnprintf(char *buf, size_t size, const char *fmt, va_list args)
diff --git a/tools/perf/util/intel-bts.c b/tools/perf/util/intel-bts.c
index 9df9960..0e632c4 100644
--- a/tools/perf/util/intel-bts.c
+++ b/tools/perf/util/intel-bts.c
@@ -40,11 +40,6 @@
 #define INTEL_BTS_ERR_NOINSN  5
 #define INTEL_BTS_ERR_LOST    9
 
-#if __BYTE_ORDER == __BIG_ENDIAN
-#define le64_to_cpu bswap_64
-#else
-#define le64_to_cpu
-#endif
 
 struct intel_bts {
 	struct auxtrace			auxtrace;
-- 
1.8.5.2

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] tools/perf: fix the word selected in find_*_bit
  2016-06-15 21:29     ` Arnaldo Carvalho de Melo
  2016-06-16  1:27       ` [PATCH] tools include: Fix wrong macro definitions for cpu_to_le* for big endian He Kuang
  2016-06-16  1:32       ` He Kuang
@ 2016-06-16  1:35       ` Hekuang
  2 siblings, 0 replies; 16+ messages in thread
From: Hekuang @ 2016-06-16  1:35 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Yury Norov
  Cc: Madhavan Srinivasan, linux-kernel, linuxppc-dev, Adrian Hunter,
	Borislav Petkov, David Ahern, George Spelvin, Jiri Olsa,
	Namhyung Kim, Rasmus Villemoes, Wang Nan, Yury Norov,
	Michael Ellerman



在 2016/6/16 5:29, Arnaldo Carvalho de Melo 写道:
> Em Thu, Jun 16, 2016 at 12:11:04AM +0300, Yury Norov escreveu:
>> On Wed, Jun 15, 2016 at 10:51:27PM +0300, Yury Norov wrote:
>>> Maybe there already is some macro doing the conversion for you...
>> yes it is, cpu_to_le64() is what you want
> Beware that the cpu_to_le64() in tools/perf is bogus, we need to grab a
> copy from the kernel sources.
>
> - Arnaldo
>
[PATCH 1/2] tools include: Sync byteorder/generic.h
[PATCH 2/2] tools include: Fix wrong macro definitions for cpu_to_le* 
for big endian

Here're two patches related to this issue, sorry for wrongly sent two more
reduntant mails.

Thank you.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] tools include: Fix wrong macro definitions for cpu_to_le* for big endian
  2016-06-16  1:32         ` [PATCH 2/2] tools include: Fix wrong macro definitions for cpu_to_le* for big endian He Kuang
@ 2016-06-16  6:21           ` kbuild test robot
  2016-06-16  6:39           ` Jiri Olsa
  1 sibling, 0 replies; 16+ messages in thread
From: kbuild test robot @ 2016-06-16  6:21 UTC (permalink / raw)
  To: He Kuang
  Cc: kbuild-all, peterz, mingo, acme, alexander.shishkin, hekuang,
	wangnan0, adrian.hunter, ak, maddy, bp, dsahern, linux, jolsa,
	namhyung, linux, yury.norov, mpe, linux-kernel, linuxppc-dev

[-- Attachment #1: Type: text/plain, Size: 3128 bytes --]

Hi,

[auto build test ERROR on tip/perf/core]
[also build test ERROR on v4.7-rc3 next-20160615]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/He-Kuang/tools-include-Fix-wrong-macro-definitions-for-cpu_to_le-for-big-endian/20160616-092924
config: x86_64-allmodconfig (attached as .config)
compiler: gcc-6 (Debian 6.1.1-1) 6.1.1 20160430
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All errors (new ones prefixed by >>):

   In file included from tools/include/linux/list.h:6:0,
                    from elf.h:23,
                    from special.h:22,
                    from special.c:26:
>> tools/include/linux/kernel.h:70:37: fatal error: linux/byteorder/generic.h: No such file or directory
   compilation terminated.
   In file included from tools/include/linux/list.h:6:0,
                    from elf.h:23,
                    from builtin-check.c:32:
>> tools/include/linux/kernel.h:70:37: fatal error: linux/byteorder/generic.h: No such file or directory
   compilation terminated.
>> mv: cannot stat 'tools/objtool/.special.o.tmp': No such file or directory
   make[4]: *** [tools/objtool/special.o] Error 1
>> mv: cannot stat 'tools/objtool/.builtin-check.o.tmp': No such file or directory
   make[4]: *** [tools/objtool/builtin-check.o] Error 1
   In file included from tools/include/linux/list.h:6:0,
                    from elf.h:23,
                    from elf.c:30:
>> tools/include/linux/kernel.h:70:37: fatal error: linux/byteorder/generic.h: No such file or directory
   compilation terminated.
>> mv: cannot stat 'tools/objtool/.elf.o.tmp': No such file or directory
   make[4]: *** [tools/objtool/elf.o] Error 1
   In file included from tools/include/linux/list.h:6:0,
                    from arch/x86/../../elf.h:23,
                    from arch/x86/decode.c:26:
>> tools/include/linux/kernel.h:70:37: fatal error: linux/byteorder/generic.h: No such file or directory
   compilation terminated.
>> mv: cannot stat 'tools/objtool/arch/x86/.decode.o.tmp': No such file or directory
   make[5]: *** [tools/objtool/arch/x86/decode.o] Error 1
   make[5]: Target '__build' not remade because of errors.
   make[4]: *** [arch/x86] Error 2
   make[4]: Target '__build' not remade because of errors.
   make[3]: *** [tools/objtool/objtool-in.o] Error 2
   make[3]: Target 'all' not remade because of errors.
   make[2]: *** [objtool] Error 2
   make[1]: *** [tools/objtool] Error 2
   make[1]: Target 'prepare' not remade because of errors.
   make: *** [sub-make] Error 2

vim +70 tools/include/linux/kernel.h

    64	#endif
    65	
    66	/*
    67	 * Need more care to handle endianness
    68	 * (Don't use bitmap_copy_le() for now)
    69	 */
  > 70	#include <linux/byteorder/generic.h>
    71	
    72	static inline int
    73	vscnprintf(char *buf, size_t size, const char *fmt, va_list args)

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/octet-stream, Size: 54751 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] tools include: Fix wrong macro definitions for cpu_to_le* for big endian
  2016-06-16  1:32         ` [PATCH 2/2] tools include: Fix wrong macro definitions for cpu_to_le* for big endian He Kuang
  2016-06-16  6:21           ` kbuild test robot
@ 2016-06-16  6:39           ` Jiri Olsa
  1 sibling, 0 replies; 16+ messages in thread
From: Jiri Olsa @ 2016-06-16  6:39 UTC (permalink / raw)
  To: He Kuang
  Cc: peterz, mingo, acme, alexander.shishkin, wangnan0, adrian.hunter,
	ak, maddy, bp, dsahern, linux, namhyung, linux, yury.norov, mpe,
	linux-kernel, linuxppc-dev

On Thu, Jun 16, 2016 at 01:32:09AM +0000, He Kuang wrote:
> From: Wang Nan <wangnan0@huawei.com>
> 
> The cpu_to_le* macros in kernel.h are defined without considering
> endianese. This patch includes "byteoder/generic.h" instead to fix the
> bug, and removes redundant le64_to_cpu definition in intel-bts.c.
> 
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Signed-off-by: He Kuang <hekuang@huawei.com>
> ---
>  tools/include/linux/kernel.h | 5 ++---
>  tools/perf/util/intel-bts.c  | 5 -----
>  2 files changed, 2 insertions(+), 8 deletions(-)
> 
> diff --git a/tools/include/linux/kernel.h b/tools/include/linux/kernel.h
> index 76df535..6145e41 100644
> --- a/tools/include/linux/kernel.h
> +++ b/tools/include/linux/kernel.h
> @@ -64,11 +64,10 @@
>  #endif
>  
>  /*
> - * Both need more care to handle endianness
> + * Need more care to handle endianness
>   * (Don't use bitmap_copy_le() for now)

what's the purpose of this comment now?

>   */
> -#define cpu_to_le64(x)	(x)
> -#define cpu_to_le32(x)	(x)
> +#include <linux/byteorder/generic.h>
>  
>  static inline int
>  vscnprintf(char *buf, size_t size, const char *fmt, va_list args)
> diff --git a/tools/perf/util/intel-bts.c b/tools/perf/util/intel-bts.c
> index 9df9960..0e632c4 100644
> --- a/tools/perf/util/intel-bts.c
> +++ b/tools/perf/util/intel-bts.c
> @@ -40,11 +40,6 @@
>  #define INTEL_BTS_ERR_NOINSN  5
>  #define INTEL_BTS_ERR_LOST    9
>  
> -#if __BYTE_ORDER == __BIG_ENDIAN
> -#define le64_to_cpu bswap_64
> -#else
> -#define le64_to_cpu
> -#endif

the purpose of this patchset is to unify these macros right?

there're more conversion defines in:
  util/intel-pt-decoder/intel-pt-pkt-decoder.c,

please remove them as well

thanks,
jirka

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/2] tools include: Sync byteorder/generic.h
  2016-06-16  1:32         ` [PATCH 1/2] tools include: Sync byteorder/generic.h He Kuang
@ 2016-06-16  6:39           ` Jiri Olsa
  0 siblings, 0 replies; 16+ messages in thread
From: Jiri Olsa @ 2016-06-16  6:39 UTC (permalink / raw)
  To: He Kuang
  Cc: peterz, mingo, acme, alexander.shishkin, wangnan0, adrian.hunter,
	ak, maddy, bp, dsahern, linux, namhyung, linux, yury.norov, mpe,
	linux-kernel, linuxppc-dev

On Thu, Jun 16, 2016 at 01:32:08AM +0000, He Kuang wrote:
> From: Wang Nan <wangnan0@huawei.com>
> 
> This patch copies "include/linux/byteorder/generic.h" to
> "tools/include/linux/byteorder/generic.h" to enable other libraries to
> use macros in it.

it's not the file copied, as the changelog suggest,
just several macros, please fix the changelog

thanks,
jirka

> 
> tools/perf/MANIFEST is also updated for 'make perf-*-src-pkg'.
> 
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Signed-off-by: He Kuang <hekuang@huawei.com>
> ---
>  tools/include/linux/byteorder/generic.h | 48 +++++++++++++++++++++++++++++++++
>  tools/perf/MANIFEST                     |  1 +
>  2 files changed, 49 insertions(+)
>  create mode 100644 tools/include/linux/byteorder/generic.h
> 
> diff --git a/tools/include/linux/byteorder/generic.h b/tools/include/linux/byteorder/generic.h
> new file mode 100644
> index 0000000..41b4507
> --- /dev/null
> +++ b/tools/include/linux/byteorder/generic.h
> @@ -0,0 +1,48 @@
> +#ifndef _TOOLS_LINUX_BYTEORDER_GENERIC_H
> +#define _TOOLS_LINUX_BYTEORDER_GENERIC_H
> +
> +#include <endian.h>
> +#include <byteswap.h>
> +
> +#define cpu_to_le64 __cpu_to_le64
> +#define le64_to_cpu __le64_to_cpu
> +#define cpu_to_le32 __cpu_to_le32
> +#define le32_to_cpu __le32_to_cpu
> +#define cpu_to_le16 __cpu_to_le16
> +#define le16_to_cpu __le16_to_cpu
> +#define cpu_to_be64 __cpu_to_be64
> +#define be64_to_cpu __be64_to_cpu
> +#define cpu_to_be32 __cpu_to_be32
> +#define be32_to_cpu __be32_to_cpu
> +#define cpu_to_be16 __cpu_to_be16
> +#define be16_to_cpu __be16_to_cpu
> +
> +#if __BYTE_ORDER == __BIG_ENDIAN
> +#define __cpu_to_le16 bswap_16
> +#define __cpu_to_le32 bswap_32
> +#define __cpu_to_le64 bswap_64
> +#define __le16_to_cpu bswap_16
> +#define __le32_to_cpu bswap_32
> +#define __le64_to_cpu bswap_64
> +#define __cpu_to_be16
> +#define __cpu_to_be32
> +#define __cpu_to_be64
> +#define __be16_to_cpu
> +#define __be32_to_cpu
> +#define __be64_to_cpu
> +#else
> +#define __cpu_to_le16
> +#define __cpu_to_le32
> +#define __cpu_to_le64
> +#define __le16_to_cpu
> +#define __le32_to_cpu
> +#define __le64_to_cpu
> +#define __cpu_to_be16 bswap_16
> +#define __cpu_to_be32 bswap_32
> +#define __cpu_to_be64 bswap_64
> +#define __be16_to_cpu bswap_16
> +#define __be32_to_cpu bswap_32
> +#define __be64_to_cpu bswap_64
> +#endif
> +
> +#endif /* _TOOLS_LINUX_BYTEORDER_GENERIC_H */
> diff --git a/tools/perf/MANIFEST b/tools/perf/MANIFEST
> index 8c8c6b9..80ac3d4 100644
> --- a/tools/perf/MANIFEST
> +++ b/tools/perf/MANIFEST
> @@ -46,6 +46,7 @@ tools/include/asm-generic/bitops/hweight.h
>  tools/include/asm-generic/bitops.h
>  tools/include/linux/atomic.h
>  tools/include/linux/bitops.h
> +tools/include/linux/byteorder/generic.h
>  tools/include/linux/compiler.h
>  tools/include/linux/filter.h
>  tools/include/linux/hash.h
> -- 
> 1.8.5.2
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] tools/perf: fix the word selected in find_*_bit
  2016-06-15 19:51 ` Yury Norov
  2016-06-15 21:11   ` Yury Norov
@ 2016-06-16  7:15   ` Madhavan Srinivasan
  1 sibling, 0 replies; 16+ messages in thread
From: Madhavan Srinivasan @ 2016-06-16  7:15 UTC (permalink / raw)
  To: Yury Norov
  Cc: linux-kernel, linuxppc-dev, Arnaldo Carvalho de Melo,
	Adrian Hunter, Borislav Petkov, David Ahern, George Spelvin,
	Jiri Olsa, Namhyung Kim, Rasmus Villemoes, Wang Nan, Yury Norov,
	Michael Ellerman



On Thursday 16 June 2016 01:21 AM, Yury Norov wrote:
> Hi Madhavan,
>
> On Wed, Jun 15, 2016 at 05:12:53PM +0530, Madhavan Srinivasan wrote:
>> When decoding the perf_regs mask in regs_dump__printf(),
>> we loop through the mask using find_first_bit and find_next_bit functions.
>> And mask is of type "u64". But "u64" is send as a "unsigned long *" to
>> lib functions along with sizeof().
>>
>> While the exisitng code works fine in most of the case, when using a 32bit perf
>> on a 64bit kernel (Big Endian), we end reading the wrong word. In find_first_bit(),
>> one word at a time (based on BITS_PER_LONG) is loaded and
>> checked for any bit set. In 32bit BE userspace,
>> BITS_PER_LONG turns out to be 32, and for a mask value of
>> "0x00000000000000ff", find_first_bit will return 32, instead of 0.
>> Reason for this is that, value in the word0 is all zeros and value
>> in word1 is 0xff. Ideally, second word in the mask should be loaded
>> and searched. Patch swaps the word to look incase of 32bit BE.
> I think this is not a problem of find_bit() at all. You have wrong
> typecast as the source of problem (tools/perf/util/session.c"):
>
> 940 static void regs_dump__printf(u64 mask, u64 *regs)
> 941 {
> 942         unsigned rid, i = 0;
> 943 
> 944         for_each_set_bit(rid, (unsigned long *) &mask, sizeof(mask) * 8) {
>                                           ^^^^ Here ^^^^
> 945                 u64 val = regs[i++];
> 946 
> 947                 printf(".... %-5s 0x%" PRIx64 "\n",
> 948                        perf_reg_name(rid), val);
> 949         }
> 950 }
>
> But for some reason you change correct find_bit()...
>
> Though proper fix is like this for me:
>
> static void regs_dump__printf(u64 mask, u64 *regs)
> {
>         unsigned rid, i = 0;
>         unsigned long _mask[sizeof(mask)/sizeof(unsigned long)];
>
>         _mask[0] = mask & ULONG_MAX;
>         if (sizeof(mask) > sizeof(unsigned long))
>                 _mask[1] = mask >> BITS_PER_LONG;
>
>         for_each_set_bit(rid, _mask, sizeof(mask) * BITS_PER_BYTE) {
>                 u64 val = regs[i++];
>
>                 printf(".... %-5s 0x%" PRIx64 "\n",
>                        perf_reg_name(rid), val);
>         }
> }
>
> Maybe there already is some macro doing the conversion for you...

Agreed, but reason for proposing fix in lib side is to avoid conversion
on each case (if any in future).

I will repost the fix as suggested.

Maddy
> Yury.
>
>> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
>> Cc: Adrian Hunter <adrian.hunter@intel.com>
>> Cc: Borislav Petkov <bp@suse.de>
>> Cc: David Ahern <dsahern@gmail.com>
>> Cc: George Spelvin <linux@horizon.com>
>> Cc: Jiri Olsa <jolsa@redhat.com>
>> Cc: Namhyung Kim <namhyung@kernel.org>
>> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
>> Cc: Wang Nan <wangnan0@huawei.com>
>> Cc: Yury Norov <yury.norov@gmail.com>
>> Cc: Michael Ellerman <mpe@ellerman.id.au>
>> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
>> ---
>>  tools/lib/find_bit.c | 17 +++++++++++++++++
>>  1 file changed, 17 insertions(+)
>>
>> diff --git a/tools/lib/find_bit.c b/tools/lib/find_bit.c
>> index 9122a9e80046..996b3e04324f 100644
>> --- a/tools/lib/find_bit.c
>> +++ b/tools/lib/find_bit.c
>> @@ -37,7 +37,12 @@ static unsigned long _find_next_bit(const unsigned long *addr,
>>  	if (!nbits || start >= nbits)
>>  		return nbits;
>>
>> +#if (__BYTE_ORDER == __BIG_ENDIAN) && (BITS_PER_LONG != 64)
>> +	tmp = addr[(((nbits - 1)/BITS_PER_LONG) - (start / BITS_PER_LONG))]
>> +								^ invert;
>> +#else
>>  	tmp = addr[start / BITS_PER_LONG] ^ invert;
>> +#endif
>>
>>  	/* Handle 1st word. */
>>  	tmp &= BITMAP_FIRST_WORD_MASK(start);
>> @@ -48,7 +53,12 @@ static unsigned long _find_next_bit(const unsigned long *addr,
>>  		if (start >= nbits)
>>  			return nbits;
>>
>> +#if (__BYTE_ORDER == __BIG_ENDIAN) && (BITS_PER_LONG != 64)
>> +		tmp = addr[(((nbits - 1)/BITS_PER_LONG) - (start / BITS_PER_LONG))]
>> +								^ invert;
>> +#else
>>  		tmp = addr[start / BITS_PER_LONG] ^ invert;
>> +#endif
>>  	}
>>
>>  	return min(start + __ffs(tmp), nbits);
>> @@ -75,8 +85,15 @@ unsigned long find_first_bit(const unsigned long *addr, unsigned long size)
>>  	unsigned long idx;
>>
>>  	for (idx = 0; idx * BITS_PER_LONG < size; idx++) {
>> +#if (__BYTE_ORDER == __BIG_ENDIAN) && (BITS_PER_LONG != 64)
>> +		if (addr[(((size-1)/BITS_PER_LONG) - idx)])
>> +			return min(idx * BITS_PER_LONG +
>> +				__ffs(addr[(((size-1)/BITS_PER_LONG) - idx)]),
>> +									size);
>> +#else
>>  		if (addr[idx])
>>  			return min(idx * BITS_PER_LONG + __ffs(addr[idx]), size);
>> +#endif
>>  	}
>>
>>  	return size;
>> --
>> 1.9.1

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] tools/perf: fix the word selected in find_*_bit
  2016-06-15 12:44 ` George Spelvin
@ 2016-06-16  7:21   ` Madhavan Srinivasan
  0 siblings, 0 replies; 16+ messages in thread
From: Madhavan Srinivasan @ 2016-06-16  7:21 UTC (permalink / raw)
  To: George Spelvin, linux-kernel, linuxppc-dev
  Cc: acme, adrian.hunter, bp, dsahern, jolsa, linux, linux, mpe,
	namhyung, wangnan0, yury.norov



On Wednesday 15 June 2016 06:14 PM, George Spelvin wrote:
> Madhavan Srinivasan wrote:
>> +#if (__BYTE_ORDER == __BIG_ENDIAN) && (BITS_PER_LONG != 64)
>> +	tmp = addr[(((nbits - 1)/BITS_PER_LONG) - (start / BITS_PER_LONG))]
>> +								^ invert;
>> +#else
>>  	tmp = addr[start / BITS_PER_LONG] ^ invert;
>> +#endif
> Than you for diagnosing this problem, but I don't think the fix
> is correct.
>
> 1) It's not clear that all users of _find_next_bit and for_each_set_bit()
>    want this change.
> 2) Is your code even correct?  I'd think you'd want addr[x ^ 1].  Are you
>    sure you shpuld be reversing the whole array, and not just the halves of
>    each 64-bit word?
> 3) You've now broken the case of 32-bit big-endian kernel.

Yes. But looks like we havent hit this case yet. Will post a fix.

Maddy

>
> I think the proper solution is uglier than this. :-(
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] tools/perf: fix the word selected in find_*_bit
  2016-06-15 21:11   ` Yury Norov
  2016-06-15 21:29     ` Arnaldo Carvalho de Melo
@ 2016-06-16 13:11     ` Madhavan Srinivasan
  1 sibling, 0 replies; 16+ messages in thread
From: Madhavan Srinivasan @ 2016-06-16 13:11 UTC (permalink / raw)
  To: Yury Norov
  Cc: linux-kernel, linuxppc-dev, Arnaldo Carvalho de Melo,
	Adrian Hunter, Borislav Petkov, David Ahern, George Spelvin,
	Jiri Olsa, Namhyung Kim, Rasmus Villemoes, Wang Nan, Yury Norov,
	Michael Ellerman



On Thursday 16 June 2016 02:41 AM, Yury Norov wrote:
> On Wed, Jun 15, 2016 at 10:51:27PM +0300, Yury Norov wrote:
>> Hi Madhavan,
>>
>> On Wed, Jun 15, 2016 at 05:12:53PM +0530, Madhavan Srinivasan wrote:
>>> When decoding the perf_regs mask in regs_dump__printf(),
>>> we loop through the mask using find_first_bit and find_next_bit functions.
>>> And mask is of type "u64". But "u64" is send as a "unsigned long *" to
>>> lib functions along with sizeof().
>>>
>>> While the exisitng code works fine in most of the case, when using a 32bit perf
>>> on a 64bit kernel (Big Endian), we end reading the wrong word. In find_first_bit(),
>>> one word at a time (based on BITS_PER_LONG) is loaded and
>>> checked for any bit set. In 32bit BE userspace,
>>> BITS_PER_LONG turns out to be 32, and for a mask value of
>>> "0x00000000000000ff", find_first_bit will return 32, instead of 0.
>>> Reason for this is that, value in the word0 is all zeros and value
>>> in word1 is 0xff. Ideally, second word in the mask should be loaded
>>> and searched. Patch swaps the word to look incase of 32bit BE.
>> I think this is not a problem of find_bit() at all. You have wrong
>> typecast as the source of problem (tools/perf/util/session.c"):
>>
>> 940 static void regs_dump__printf(u64 mask, u64 *regs)
>> 941 {
>> 942         unsigned rid, i = 0;
>> 943 
>> 944         for_each_set_bit(rid, (unsigned long *) &mask, sizeof(mask) * 8) {
>>                                           ^^^^ Here ^^^^
>> 945                 u64 val = regs[i++];
>> 946 
>> 947                 printf(".... %-5s 0x%" PRIx64 "\n",
>> 948                        perf_reg_name(rid), val);
>> 949         }
>> 950 }
>>
>> But for some reason you change correct find_bit()...
>>
>> Though proper fix is like this for me:
>>
>> static void regs_dump__printf(u64 mask, u64 *regs)
>> {
>>         unsigned rid, i = 0;
>>         unsigned long _mask[sizeof(mask)/sizeof(unsigned long)];
>>
>>         _mask[0] = mask & ULONG_MAX;
>>         if (sizeof(mask) > sizeof(unsigned long))
>>                 _mask[1] = mask >> BITS_PER_LONG;
>>
>>         for_each_set_bit(rid, _mask, sizeof(mask) * BITS_PER_BYTE) {
>>                 u64 val = regs[i++];
>>
>>                 printf(".... %-5s 0x%" PRIx64 "\n",
>>                        perf_reg_name(rid), val);
>>         }
>> }
>>
>> Maybe there already is some macro doing the conversion for you...
> yes it is, cpu_to_le64() is what you want

no wait, on second look, cpu_to_le64() is not right.
Because we will end up swapping within 32bit.
But what you suggested looks to be fine. I can repost
this with one minor tweak, right shift with 32 instead
of BITS_PER_LONG (since I see compiler errors in 64bit).

Maddy

>
>> Yury.
>>
>>> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
>>> Cc: Adrian Hunter <adrian.hunter@intel.com>
>>> Cc: Borislav Petkov <bp@suse.de>
>>> Cc: David Ahern <dsahern@gmail.com>
>>> Cc: George Spelvin <linux@horizon.com>
>>> Cc: Jiri Olsa <jolsa@redhat.com>
>>> Cc: Namhyung Kim <namhyung@kernel.org>
>>> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
>>> Cc: Wang Nan <wangnan0@huawei.com>
>>> Cc: Yury Norov <yury.norov@gmail.com>
>>> Cc: Michael Ellerman <mpe@ellerman.id.au>
>>> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
>>> ---
>>>  tools/lib/find_bit.c | 17 +++++++++++++++++
>>>  1 file changed, 17 insertions(+)
>>>
>>> diff --git a/tools/lib/find_bit.c b/tools/lib/find_bit.c
>>> index 9122a9e80046..996b3e04324f 100644
>>> --- a/tools/lib/find_bit.c
>>> +++ b/tools/lib/find_bit.c
>>> @@ -37,7 +37,12 @@ static unsigned long _find_next_bit(const unsigned long *addr,
>>>  	if (!nbits || start >= nbits)
>>>  		return nbits;
>>>
>>> +#if (__BYTE_ORDER == __BIG_ENDIAN) && (BITS_PER_LONG != 64)
>>> +	tmp = addr[(((nbits - 1)/BITS_PER_LONG) - (start / BITS_PER_LONG))]
>>> +								^ invert;
>>> +#else
>>>  	tmp = addr[start / BITS_PER_LONG] ^ invert;
>>> +#endif
>>>
>>>  	/* Handle 1st word. */
>>>  	tmp &= BITMAP_FIRST_WORD_MASK(start);
>>> @@ -48,7 +53,12 @@ static unsigned long _find_next_bit(const unsigned long *addr,
>>>  		if (start >= nbits)
>>>  			return nbits;
>>>
>>> +#if (__BYTE_ORDER == __BIG_ENDIAN) && (BITS_PER_LONG != 64)
>>> +		tmp = addr[(((nbits - 1)/BITS_PER_LONG) - (start / BITS_PER_LONG))]
>>> +								^ invert;
>>> +#else
>>>  		tmp = addr[start / BITS_PER_LONG] ^ invert;
>>> +#endif
>>>  	}
>>>
>>>  	return min(start + __ffs(tmp), nbits);
>>> @@ -75,8 +85,15 @@ unsigned long find_first_bit(const unsigned long *addr, unsigned long size)
>>>  	unsigned long idx;
>>>
>>>  	for (idx = 0; idx * BITS_PER_LONG < size; idx++) {
>>> +#if (__BYTE_ORDER == __BIG_ENDIAN) && (BITS_PER_LONG != 64)
>>> +		if (addr[(((size-1)/BITS_PER_LONG) - idx)])
>>> +			return min(idx * BITS_PER_LONG +
>>> +				__ffs(addr[(((size-1)/BITS_PER_LONG) - idx)]),
>>> +									size);
>>> +#else
>>>  		if (addr[idx])
>>>  			return min(idx * BITS_PER_LONG + __ffs(addr[idx]), size);
>>> +#endif
>>>  	}
>>>
>>>  	return size;
>>> --
>>> 1.9.1

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2016-06-16 13:12 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-06-15 11:42 [PATCH] tools/perf: fix the word selected in find_*_bit Madhavan Srinivasan
2016-06-15 12:44 ` George Spelvin
2016-06-16  7:21   ` Madhavan Srinivasan
2016-06-15 19:51 ` Yury Norov
2016-06-15 21:11   ` Yury Norov
2016-06-15 21:29     ` Arnaldo Carvalho de Melo
2016-06-16  1:27       ` [PATCH] tools include: Fix wrong macro definitions for cpu_to_le* for big endian He Kuang
2016-06-16  1:32       ` He Kuang
2016-06-16  1:32         ` [PATCH 1/2] tools include: Sync byteorder/generic.h He Kuang
2016-06-16  6:39           ` Jiri Olsa
2016-06-16  1:32         ` [PATCH 2/2] tools include: Fix wrong macro definitions for cpu_to_le* for big endian He Kuang
2016-06-16  6:21           ` kbuild test robot
2016-06-16  6:39           ` Jiri Olsa
2016-06-16  1:35       ` [PATCH] tools/perf: fix the word selected in find_*_bit Hekuang
2016-06-16 13:11     ` Madhavan Srinivasan
2016-06-16  7:15   ` Madhavan Srinivasan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).