All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] perf/core: Fix the mask in perf_output_sample_regs
@ 2016-07-03 18:01 Madhavan Srinivasan
  2016-07-04  7:19 ` Yury Norov
  2016-08-11 12:27 ` Peter Zijlstra
  0 siblings, 2 replies; 5+ messages in thread
From: Madhavan Srinivasan @ 2016-07-03 18:01 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev
  Cc: Madhavan Srinivasan, Yury Norov, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
	Michael Ellerman

When decoding the perf_regs mask in perf_output_sample_regs(),
we loop through the mask using find_first_bit and find_next_bit functions.
While the exisitng code works fine in most of the case,
the logic is broken for 32bit kernel (Big Endian).
When reading u64 mask using (u32 *)(&val)[0], find_*_bit() assumes it gets
lower 32bits of u64 but instead gets upper 32bits which is wrong.
Proposed fix is to swap the words of the u64 to handle this case.
This is _not_ endianness swap.

Suggested-by: Yury Norov <ynorov@caviumnetworks.com>
Cc: Yury Norov <ynorov@caviumnetworks.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
---
 include/linux/bitmap.h |  2 ++
 kernel/events/core.c   |  4 +++-
 lib/bitmap.c           | 19 +++++++++++++++++++
 3 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
index e9b0b9ab07e5..d95b422db183 100644
--- a/include/linux/bitmap.h
+++ b/include/linux/bitmap.h
@@ -188,6 +188,8 @@ extern int bitmap_print_to_pagebuf(bool list, char *buf,
 #define small_const_nbits(nbits) \
 	(__builtin_constant_p(nbits) && (nbits) <= BITS_PER_LONG)
 
+extern void bitmap_from_u64(unsigned long *dst, u64 mask);
+
 static inline void bitmap_zero(unsigned long *dst, unsigned int nbits)
 {
 	if (small_const_nbits(nbits))
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 9c51ec3f0f44..613fec95ea4c 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5205,8 +5205,10 @@ perf_output_sample_regs(struct perf_output_handle *handle,
 			struct pt_regs *regs, u64 mask)
 {
 	int bit;
+	DECLARE_BITMAP(_mask, 64);
 
-	for_each_set_bit(bit, (const unsigned long *) &mask,
+	bitmap_from_u64(_mask, mask);
+	for_each_set_bit(bit, _mask,
 			 sizeof(mask) * BITS_PER_BYTE) {
 		u64 val;
 
diff --git a/lib/bitmap.c b/lib/bitmap.c
index c66da508cbf7..522f1b4c6078 100644
--- a/lib/bitmap.c
+++ b/lib/bitmap.c
@@ -1170,3 +1170,22 @@ void bitmap_copy_le(unsigned long *dst, const unsigned long *src, unsigned int n
 }
 EXPORT_SYMBOL(bitmap_copy_le);
 #endif
+
+/*
+ * bitmap_from_u64 - Check and swap words within u64.
+ *  @mask: source bitmap
+ *  @dst:  destination bitmap
+ *
+ * In 32bit Big Endian kernel, when using (u32 *)(&val)[*]
+ * to read u64 mask, we will get wrong word.
+ * That is "(u32 *)(&val)[0]" gets upper 32 bits,
+ * but expected could be lower 32bits of u64.
+ */
+void bitmap_from_u64(unsigned long *dst, u64 mask)
+{
+	dst[0] = mask & ULONG_MAX;
+
+	if (sizeof(mask) > sizeof(unsigned long))
+		dst[1] = mask >> 32;
+}
+EXPORT_SYMBOL(bitmap_from_u64);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] perf/core: Fix the mask in perf_output_sample_regs
  2016-07-03 18:01 [PATCH] perf/core: Fix the mask in perf_output_sample_regs Madhavan Srinivasan
@ 2016-07-04  7:19 ` Yury Norov
  2016-07-04 22:41   ` Arnaldo Carvalho de Melo
  2016-08-11 12:27 ` Peter Zijlstra
  1 sibling, 1 reply; 5+ messages in thread
From: Yury Norov @ 2016-07-04  7:19 UTC (permalink / raw)
  To: Madhavan Srinivasan
  Cc: linux-kernel, linuxppc-dev, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
	Michael Ellerman

On Sun, Jul 03, 2016 at 11:31:58PM +0530, Madhavan Srinivasan wrote:
> When decoding the perf_regs mask in perf_output_sample_regs(),
> we loop through the mask using find_first_bit and find_next_bit functions.
> While the exisitng code works fine in most of the case,
> the logic is broken for 32bit kernel (Big Endian).
> When reading u64 mask using (u32 *)(&val)[0], find_*_bit() assumes it gets
> lower 32bits of u64 but instead gets upper 32bits which is wrong.
> Proposed fix is to swap the words of the u64 to handle this case.
> This is _not_ endianness swap.


In fact, it's broken for 32-bit LE as well if mask is zero and next
word on stack is not zero. The rest is OK.

Reviewed-by: Yury Norov <ynorov@caviumnetworks.com>

> 
> Suggested-by: Yury Norov <ynorov@caviumnetworks.com>
> Cc: Yury Norov <ynorov@caviumnetworks.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
> ---
>  include/linux/bitmap.h |  2 ++
>  kernel/events/core.c   |  4 +++-
>  lib/bitmap.c           | 19 +++++++++++++++++++
>  3 files changed, 24 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
> index e9b0b9ab07e5..d95b422db183 100644
> --- a/include/linux/bitmap.h
> +++ b/include/linux/bitmap.h
> @@ -188,6 +188,8 @@ extern int bitmap_print_to_pagebuf(bool list, char *buf,
>  #define small_const_nbits(nbits) \
>  	(__builtin_constant_p(nbits) && (nbits) <= BITS_PER_LONG)
>  
> +extern void bitmap_from_u64(unsigned long *dst, u64 mask);
> +
>  static inline void bitmap_zero(unsigned long *dst, unsigned int nbits)
>  {
>  	if (small_const_nbits(nbits))
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 9c51ec3f0f44..613fec95ea4c 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -5205,8 +5205,10 @@ perf_output_sample_regs(struct perf_output_handle *handle,
>  			struct pt_regs *regs, u64 mask)
>  {
>  	int bit;
> +	DECLARE_BITMAP(_mask, 64);
>  
> -	for_each_set_bit(bit, (const unsigned long *) &mask,
> +	bitmap_from_u64(_mask, mask);
> +	for_each_set_bit(bit, _mask,
>  			 sizeof(mask) * BITS_PER_BYTE) {
>  		u64 val;
>  
> diff --git a/lib/bitmap.c b/lib/bitmap.c
> index c66da508cbf7..522f1b4c6078 100644
> --- a/lib/bitmap.c
> +++ b/lib/bitmap.c
> @@ -1170,3 +1170,22 @@ void bitmap_copy_le(unsigned long *dst, const unsigned long *src, unsigned int n
>  }
>  EXPORT_SYMBOL(bitmap_copy_le);
>  #endif
> +
> +/*
> + * bitmap_from_u64 - Check and swap words within u64.
> + *  @mask: source bitmap
> + *  @dst:  destination bitmap
> + *
> + * In 32bit Big Endian kernel, when using (u32 *)(&val)[*]
> + * to read u64 mask, we will get wrong word.
> + * That is "(u32 *)(&val)[0]" gets upper 32 bits,
> + * but expected could be lower 32bits of u64.
> + */
> +void bitmap_from_u64(unsigned long *dst, u64 mask)
> +{
> +	dst[0] = mask & ULONG_MAX;
> +
> +	if (sizeof(mask) > sizeof(unsigned long))
> +		dst[1] = mask >> 32;
> +}
> +EXPORT_SYMBOL(bitmap_from_u64);
> -- 
> 1.9.1

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] perf/core: Fix the mask in perf_output_sample_regs
  2016-07-04  7:19 ` Yury Norov
@ 2016-07-04 22:41   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 5+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-07-04 22:41 UTC (permalink / raw)
  To: Yury Norov
  Cc: Madhavan Srinivasan, linux-kernel, linuxppc-dev, Peter Zijlstra,
	Ingo Molnar, Alexander Shishkin, Jiri Olsa, Michael Ellerman

Em Mon, Jul 04, 2016 at 10:19:06AM +0300, Yury Norov escreveu:
> On Sun, Jul 03, 2016 at 11:31:58PM +0530, Madhavan Srinivasan wrote:
> > When decoding the perf_regs mask in perf_output_sample_regs(),
> > we loop through the mask using find_first_bit and find_next_bit functions.
> > While the exisitng code works fine in most of the case,
> > the logic is broken for 32bit kernel (Big Endian).
> > When reading u64 mask using (u32 *)(&val)[0], find_*_bit() assumes it gets
> > lower 32bits of u64 but instead gets upper 32bits which is wrong.
> > Proposed fix is to swap the words of the u64 to handle this case.
> > This is _not_ endianness swap.
> 
> 
> In fact, it's broken for 32-bit LE as well if mask is zero and next
> word on stack is not zero. The rest is OK.
> 
> Reviewed-by: Yury Norov <ynorov@caviumnetworks.com>

Waiting a bit for more acks, as IIRC several other people had comments
on this patch, ok?

- Arnaldo
 
> > 
> > Suggested-by: Yury Norov <ynorov@caviumnetworks.com>
> > Cc: Yury Norov <ynorov@caviumnetworks.com>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Ingo Molnar <mingo@redhat.com>
> > Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
> > Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> > Cc: Jiri Olsa <jolsa@kernel.org>
> > Cc: Michael Ellerman <mpe@ellerman.id.au>
> > Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
> > ---
> >  include/linux/bitmap.h |  2 ++
> >  kernel/events/core.c   |  4 +++-
> >  lib/bitmap.c           | 19 +++++++++++++++++++
> >  3 files changed, 24 insertions(+), 1 deletion(-)
> > 
> > diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
> > index e9b0b9ab07e5..d95b422db183 100644
> > --- a/include/linux/bitmap.h
> > +++ b/include/linux/bitmap.h
> > @@ -188,6 +188,8 @@ extern int bitmap_print_to_pagebuf(bool list, char *buf,
> >  #define small_const_nbits(nbits) \
> >  	(__builtin_constant_p(nbits) && (nbits) <= BITS_PER_LONG)
> >  
> > +extern void bitmap_from_u64(unsigned long *dst, u64 mask);
> > +
> >  static inline void bitmap_zero(unsigned long *dst, unsigned int nbits)
> >  {
> >  	if (small_const_nbits(nbits))
> > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > index 9c51ec3f0f44..613fec95ea4c 100644
> > --- a/kernel/events/core.c
> > +++ b/kernel/events/core.c
> > @@ -5205,8 +5205,10 @@ perf_output_sample_regs(struct perf_output_handle *handle,
> >  			struct pt_regs *regs, u64 mask)
> >  {
> >  	int bit;
> > +	DECLARE_BITMAP(_mask, 64);
> >  
> > -	for_each_set_bit(bit, (const unsigned long *) &mask,
> > +	bitmap_from_u64(_mask, mask);
> > +	for_each_set_bit(bit, _mask,
> >  			 sizeof(mask) * BITS_PER_BYTE) {
> >  		u64 val;
> >  
> > diff --git a/lib/bitmap.c b/lib/bitmap.c
> > index c66da508cbf7..522f1b4c6078 100644
> > --- a/lib/bitmap.c
> > +++ b/lib/bitmap.c
> > @@ -1170,3 +1170,22 @@ void bitmap_copy_le(unsigned long *dst, const unsigned long *src, unsigned int n
> >  }
> >  EXPORT_SYMBOL(bitmap_copy_le);
> >  #endif
> > +
> > +/*
> > + * bitmap_from_u64 - Check and swap words within u64.
> > + *  @mask: source bitmap
> > + *  @dst:  destination bitmap
> > + *
> > + * In 32bit Big Endian kernel, when using (u32 *)(&val)[*]
> > + * to read u64 mask, we will get wrong word.
> > + * That is "(u32 *)(&val)[0]" gets upper 32 bits,
> > + * but expected could be lower 32bits of u64.
> > + */
> > +void bitmap_from_u64(unsigned long *dst, u64 mask)
> > +{
> > +	dst[0] = mask & ULONG_MAX;
> > +
> > +	if (sizeof(mask) > sizeof(unsigned long))
> > +		dst[1] = mask >> 32;
> > +}
> > +EXPORT_SYMBOL(bitmap_from_u64);
> > -- 
> > 1.9.1

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] perf/core: Fix the mask in perf_output_sample_regs
  2016-07-03 18:01 [PATCH] perf/core: Fix the mask in perf_output_sample_regs Madhavan Srinivasan
  2016-07-04  7:19 ` Yury Norov
@ 2016-08-11 12:27 ` Peter Zijlstra
  2016-08-16  5:29   ` Madhavan Srinivasan
  1 sibling, 1 reply; 5+ messages in thread
From: Peter Zijlstra @ 2016-08-11 12:27 UTC (permalink / raw)
  To: Madhavan Srinivasan
  Cc: linux-kernel, linuxppc-dev, Yury Norov, Ingo Molnar,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
	Michael Ellerman


Sorry, found it in my inbox while clearing out backlog..

On Sun, Jul 03, 2016 at 11:31:58PM +0530, Madhavan Srinivasan wrote:
> When decoding the perf_regs mask in perf_output_sample_regs(),
> we loop through the mask using find_first_bit and find_next_bit functions.
> While the exisitng code works fine in most of the case,
> the logic is broken for 32bit kernel (Big Endian).
> When reading u64 mask using (u32 *)(&val)[0], find_*_bit() assumes it gets
> lower 32bits of u64 but instead gets upper 32bits which is wrong.
> Proposed fix is to swap the words of the u64 to handle this case.

> This is _not_ endianness swap.

But it looks an awful lot like it..

> +++ b/kernel/events/core.c
> @@ -5205,8 +5205,10 @@ perf_output_sample_regs(struct perf_output_handle *handle,
>  			struct pt_regs *regs, u64 mask)
>  {
>  	int bit;
> +	DECLARE_BITMAP(_mask, 64);
>  
> -	for_each_set_bit(bit, (const unsigned long *) &mask,
> +	bitmap_from_u64(_mask, mask);
> +	for_each_set_bit(bit, _mask,
>  			 sizeof(mask) * BITS_PER_BYTE) {
>  		u64 val;

> +++ b/lib/bitmap.c

> +void bitmap_from_u64(unsigned long *dst, u64 mask)
> +{
> +	dst[0] = mask & ULONG_MAX;
> +
> +	if (sizeof(mask) > sizeof(unsigned long))
> +		dst[1] = mask >> 32;
> +}
> +EXPORT_SYMBOL(bitmap_from_u64);

Looks small enough for an inline.

Alternatively you can go all the way and add bitmap_from_u64array(), but
that seems massive overkill.

Tedious stuff.. I can't come up with anything prettier :/

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] perf/core: Fix the mask in perf_output_sample_regs
  2016-08-11 12:27 ` Peter Zijlstra
@ 2016-08-16  5:29   ` Madhavan Srinivasan
  0 siblings, 0 replies; 5+ messages in thread
From: Madhavan Srinivasan @ 2016-08-16  5:29 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, linuxppc-dev, Yury Norov, Ingo Molnar,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
	Michael Ellerman



On Thursday 11 August 2016 05:57 PM, Peter Zijlstra wrote:
> Sorry, found it in my inbox while clearing out backlog..
>
> On Sun, Jul 03, 2016 at 11:31:58PM +0530, Madhavan Srinivasan wrote:
>> When decoding the perf_regs mask in perf_output_sample_regs(),
>> we loop through the mask using find_first_bit and find_next_bit functions.
>> While the exisitng code works fine in most of the case,
>> the logic is broken for 32bit kernel (Big Endian).
>> When reading u64 mask using (u32 *)(&val)[0], find_*_bit() assumes it gets
>> lower 32bits of u64 but instead gets upper 32bits which is wrong.
>> Proposed fix is to swap the words of the u64 to handle this case.
>> This is _not_ endianness swap.
> But it looks an awful lot like it..
Hit this issue when testing my perf_arch_regs patchset. Yep exactly
the reason for adding that comment in the commit message.


>
>> +++ b/kernel/events/core.c
>> @@ -5205,8 +5205,10 @@ perf_output_sample_regs(struct perf_output_handle *handle,
>>   			struct pt_regs *regs, u64 mask)
>>   {
>>   	int bit;
>> +	DECLARE_BITMAP(_mask, 64);
>>   
>> -	for_each_set_bit(bit, (const unsigned long *) &mask,
>> +	bitmap_from_u64(_mask, mask);
>> +	for_each_set_bit(bit, _mask,
>>   			 sizeof(mask) * BITS_PER_BYTE) {
>>   		u64 val;
>> +++ b/lib/bitmap.c
>> +void bitmap_from_u64(unsigned long *dst, u64 mask)
>> +{
>> +	dst[0] = mask & ULONG_MAX;
>> +
>> +	if (sizeof(mask) > sizeof(unsigned long))
>> +		dst[1] = mask >> 32;
>> +}
>> +EXPORT_SYMBOL(bitmap_from_u64);
> Looks small enough for an inline.
>
> Alternatively you can go all the way and add bitmap_from_u64array(), but
> that seems massive overkill.

Ok will make it inline and resend.

Maddy

>
> Tedious stuff.. I can't come up with anything prettier :/
>

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2016-08-16  5:29 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-07-03 18:01 [PATCH] perf/core: Fix the mask in perf_output_sample_regs Madhavan Srinivasan
2016-07-04  7:19 ` Yury Norov
2016-07-04 22:41   ` Arnaldo Carvalho de Melo
2016-08-11 12:27 ` Peter Zijlstra
2016-08-16  5:29   ` Madhavan Srinivasan

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.