linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present()
@ 2019-02-14  6:23 Michael Ellerman
  2019-02-14 16:31 ` Jan Kara
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: Michael Ellerman @ 2019-02-14  6:23 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: aneesh.kumar, jack, erhard_f, linux-kernel, linux-mm

In v4.20 we changed our pgd/pud_present() to check for _PAGE_PRESENT
rather than just checking that the value is non-zero, e.g.:

  static inline int pgd_present(pgd_t pgd)
  {
 -       return !pgd_none(pgd);
 +       return (pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
  }

Unfortunately this is broken on big endian, as the result of the
bitwise && is truncated to int, which is always zero because
_PAGE_PRESENT is 0x8000000000000000ul. This means pgd_present() and
pud_present() are always false at compile time, and the compiler
elides the subsequent code.

Remarkably with that bug present we are still able to boot and run
with few noticeable effects. However under some work loads we are able
to trigger a warning in the ext4 code:

  WARNING: CPU: 11 PID: 29593 at fs/ext4/inode.c:3927 .ext4_set_page_dirty+0x70/0xb0
  CPU: 11 PID: 29593 Comm: debugedit Not tainted 4.20.0-rc1 #1
  ...
  NIP .ext4_set_page_dirty+0x70/0xb0
  LR  .set_page_dirty+0xa0/0x150
  Call Trace:
   .set_page_dirty+0xa0/0x150
   .unmap_page_range+0xbf0/0xe10
   .unmap_vmas+0x84/0x130
   .unmap_region+0xe8/0x190
   .__do_munmap+0x2f0/0x510
   .__vm_munmap+0x80/0x110
   .__se_sys_munmap+0x14/0x30
   system_call+0x5c/0x70

The fix is simple, we need to convert the result of the bitwise && to
an int before returning it.

Thanks to Jan Kara and Aneesh for help with debugging.

Fixes: da7ad366b497 ("powerpc/mm/book3s: Update pmd_present to look at _PAGE_PRESENT bit")
Cc: stable@vger.kernel.org # v4.20+
Reported-by: Erhard F. <erhard_f@mailbox.org>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/include/asm/book3s/64/pgtable.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index c9bfe526ca9d..d8c8d7c9df15 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -904,7 +904,7 @@ static inline int pud_none(pud_t pud)
 
 static inline int pud_present(pud_t pud)
 {
-	return (pud_raw(pud) & cpu_to_be64(_PAGE_PRESENT));
+	return !!(pud_raw(pud) & cpu_to_be64(_PAGE_PRESENT));
 }
 
 extern struct page *pud_page(pud_t pud);
@@ -951,7 +951,7 @@ static inline int pgd_none(pgd_t pgd)
 
 static inline int pgd_present(pgd_t pgd)
 {
-	return (pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
+	return !!(pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
 }
 
 static inline pte_t pgd_pte(pgd_t pgd)
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH] powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present()
  2019-02-14  6:23 [PATCH] powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present() Michael Ellerman
@ 2019-02-14 16:31 ` Jan Kara
  2019-02-16 10:55 ` Balbir Singh
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 16+ messages in thread
From: Jan Kara @ 2019-02-14 16:31 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: linuxppc-dev, aneesh.kumar, jack, erhard_f, linux-kernel, linux-mm

On Thu 14-02-19 17:23:39, Michael Ellerman wrote:
> In v4.20 we changed our pgd/pud_present() to check for _PAGE_PRESENT
> rather than just checking that the value is non-zero, e.g.:
> 
>   static inline int pgd_present(pgd_t pgd)
>   {
>  -       return !pgd_none(pgd);
>  +       return (pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
>   }
> 
> Unfortunately this is broken on big endian, as the result of the
> bitwise && is truncated to int, which is always zero because
> _PAGE_PRESENT is 0x8000000000000000ul. This means pgd_present() and
> pud_present() are always false at compile time, and the compiler
> elides the subsequent code.
> 
> Remarkably with that bug present we are still able to boot and run
> with few noticeable effects. However under some work loads we are able
> to trigger a warning in the ext4 code:

Wow, good catch. I wouldn't believe there are so few bad effects from
such a major breakage! :)

								Honza

> 
>   WARNING: CPU: 11 PID: 29593 at fs/ext4/inode.c:3927 .ext4_set_page_dirty+0x70/0xb0
>   CPU: 11 PID: 29593 Comm: debugedit Not tainted 4.20.0-rc1 #1
>   ...
>   NIP .ext4_set_page_dirty+0x70/0xb0
>   LR  .set_page_dirty+0xa0/0x150
>   Call Trace:
>    .set_page_dirty+0xa0/0x150
>    .unmap_page_range+0xbf0/0xe10
>    .unmap_vmas+0x84/0x130
>    .unmap_region+0xe8/0x190
>    .__do_munmap+0x2f0/0x510
>    .__vm_munmap+0x80/0x110
>    .__se_sys_munmap+0x14/0x30
>    system_call+0x5c/0x70
> 
> The fix is simple, we need to convert the result of the bitwise && to
> an int before returning it.
> 
> Thanks to Jan Kara and Aneesh for help with debugging.
> 
> Fixes: da7ad366b497 ("powerpc/mm/book3s: Update pmd_present to look at _PAGE_PRESENT bit")
> Cc: stable@vger.kernel.org # v4.20+
> Reported-by: Erhard F. <erhard_f@mailbox.org>
> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
> ---
>  arch/powerpc/include/asm/book3s/64/pgtable.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
> index c9bfe526ca9d..d8c8d7c9df15 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> @@ -904,7 +904,7 @@ static inline int pud_none(pud_t pud)
>  
>  static inline int pud_present(pud_t pud)
>  {
> -	return (pud_raw(pud) & cpu_to_be64(_PAGE_PRESENT));
> +	return !!(pud_raw(pud) & cpu_to_be64(_PAGE_PRESENT));
>  }
>  
>  extern struct page *pud_page(pud_t pud);
> @@ -951,7 +951,7 @@ static inline int pgd_none(pgd_t pgd)
>  
>  static inline int pgd_present(pgd_t pgd)
>  {
> -	return (pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
> +	return !!(pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
>  }
>  
>  static inline pte_t pgd_pte(pgd_t pgd)
> -- 
> 2.20.1
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present()
  2019-02-14  6:23 [PATCH] powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present() Michael Ellerman
  2019-02-14 16:31 ` Jan Kara
@ 2019-02-16 10:55 ` Balbir Singh
  2019-02-16 14:22   ` Segher Boessenkool
  2019-02-16 13:15 ` Andreas Schwab
  2019-02-17  8:21 ` Michael Ellerman
  3 siblings, 1 reply; 16+ messages in thread
From: Balbir Singh @ 2019-02-16 10:55 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: linuxppc-dev, aneesh.kumar, jack, erhard_f, linux-kernel, linux-mm

On Thu, Feb 14, 2019 at 05:23:39PM +1100, Michael Ellerman wrote:
> In v4.20 we changed our pgd/pud_present() to check for _PAGE_PRESENT
> rather than just checking that the value is non-zero, e.g.:
> 
>   static inline int pgd_present(pgd_t pgd)
>   {
>  -       return !pgd_none(pgd);
>  +       return (pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
>   }
> 
> Unfortunately this is broken on big endian, as the result of the
> bitwise && is truncated to int, which is always zero because

Not sure why that should happen, why is the result an int? What
causes the casting of pgd_t & be64 to be truncated to an int.

> _PAGE_PRESENT is 0x8000000000000000ul. This means pgd_present() and
> pud_present() are always false at compile time, and the compiler
> elides the subsequent code.
> 
> Remarkably with that bug present we are still able to boot and run
> with few noticeable effects. However under some work loads we are able
> to trigger a warning in the ext4 code:
> 
>   WARNING: CPU: 11 PID: 29593 at fs/ext4/inode.c:3927 .ext4_set_page_dirty+0x70/0xb0
>   CPU: 11 PID: 29593 Comm: debugedit Not tainted 4.20.0-rc1 #1
>   ...
>   NIP .ext4_set_page_dirty+0x70/0xb0
>   LR  .set_page_dirty+0xa0/0x150
>   Call Trace:
>    .set_page_dirty+0xa0/0x150
>    .unmap_page_range+0xbf0/0xe10
>    .unmap_vmas+0x84/0x130
>    .unmap_region+0xe8/0x190
>    .__do_munmap+0x2f0/0x510
>    .__vm_munmap+0x80/0x110
>    .__se_sys_munmap+0x14/0x30
>    system_call+0x5c/0x70
> 
> The fix is simple, we need to convert the result of the bitwise && to
> an int before returning it.
> 
> Thanks to Jan Kara and Aneesh for help with debugging.
> 
> Fixes: da7ad366b497 ("powerpc/mm/book3s: Update pmd_present to look at _PAGE_PRESENT bit")
> Cc: stable@vger.kernel.org # v4.20+
> Reported-by: Erhard F. <erhard_f@mailbox.org>
> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
> ---
>  arch/powerpc/include/asm/book3s/64/pgtable.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
> index c9bfe526ca9d..d8c8d7c9df15 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> @@ -904,7 +904,7 @@ static inline int pud_none(pud_t pud)
>  
>  static inline int pud_present(pud_t pud)
>  {
> -	return (pud_raw(pud) & cpu_to_be64(_PAGE_PRESENT));
> +	return !!(pud_raw(pud) & cpu_to_be64(_PAGE_PRESENT));
>  }
>  
>  extern struct page *pud_page(pud_t pud);
> @@ -951,7 +951,7 @@ static inline int pgd_none(pgd_t pgd)
>  
>  static inline int pgd_present(pgd_t pgd)
>  {
> -	return (pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
> +	return !!(pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
>  }
>

Care to put a big FAT warning, so that we don't repeat this again
(as in authors planning on changing these bits). 

Balbir Singh.
  

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present()
  2019-02-14  6:23 [PATCH] powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present() Michael Ellerman
  2019-02-14 16:31 ` Jan Kara
  2019-02-16 10:55 ` Balbir Singh
@ 2019-02-16 13:15 ` Andreas Schwab
  2019-02-17  8:26   ` Michael Ellerman
  2019-02-17  8:21 ` Michael Ellerman
  3 siblings, 1 reply; 16+ messages in thread
From: Andreas Schwab @ 2019-02-16 13:15 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: linuxppc-dev, linux-mm, erhard_f, jack, aneesh.kumar, linux-kernel

On Feb 14 2019, Michael Ellerman <mpe@ellerman.id.au> wrote:

> The fix is simple, we need to convert the result of the bitwise && to
> an int before returning it.

Alternatively, the return type could be changed to bool, so that the
compiler does the right thing by itself.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present()
  2019-02-16 10:55 ` Balbir Singh
@ 2019-02-16 14:22   ` Segher Boessenkool
  2019-02-17  6:23     ` Balbir Singh
  2019-02-17  8:34     ` Michael Ellerman
  0 siblings, 2 replies; 16+ messages in thread
From: Segher Boessenkool @ 2019-02-16 14:22 UTC (permalink / raw)
  To: Balbir Singh
  Cc: Michael Ellerman, erhard_f, jack, linuxppc-dev, linux-kernel,
	linux-mm, aneesh.kumar

Hi all,

On Sat, Feb 16, 2019 at 09:55:11PM +1100, Balbir Singh wrote:
> On Thu, Feb 14, 2019 at 05:23:39PM +1100, Michael Ellerman wrote:
> > In v4.20 we changed our pgd/pud_present() to check for _PAGE_PRESENT
> > rather than just checking that the value is non-zero, e.g.:
> > 
> >   static inline int pgd_present(pgd_t pgd)
> >   {
> >  -       return !pgd_none(pgd);
> >  +       return (pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
> >   }
> > 
> > Unfortunately this is broken on big endian, as the result of the
> > bitwise && is truncated to int, which is always zero because

(Bitwise "&" of course).

> Not sure why that should happen, why is the result an int? What
> causes the casting of pgd_t & be64 to be truncated to an int.

Yes, it's not obvious as written...  It's simply that the return type of
pgd_present is int.  So it is truncated _after_ the bitwise and.


Segher

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present()
  2019-02-16 14:22   ` Segher Boessenkool
@ 2019-02-17  6:23     ` Balbir Singh
  2019-02-17  8:34       ` Michael Ellerman
  2019-02-17  8:34     ` Michael Ellerman
  1 sibling, 1 reply; 16+ messages in thread
From: Balbir Singh @ 2019-02-17  6:23 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Michael Ellerman, erhard_f, jack, linuxppc-dev, linux-kernel,
	linux-mm, aneesh.kumar

On Sat, Feb 16, 2019 at 08:22:12AM -0600, Segher Boessenkool wrote:
> Hi all,
> 
> On Sat, Feb 16, 2019 at 09:55:11PM +1100, Balbir Singh wrote:
> > On Thu, Feb 14, 2019 at 05:23:39PM +1100, Michael Ellerman wrote:
> > > In v4.20 we changed our pgd/pud_present() to check for _PAGE_PRESENT
> > > rather than just checking that the value is non-zero, e.g.:
> > > 
> > >   static inline int pgd_present(pgd_t pgd)
> > >   {
> > >  -       return !pgd_none(pgd);
> > >  +       return (pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
> > >   }
> > > 
> > > Unfortunately this is broken on big endian, as the result of the
> > > bitwise && is truncated to int, which is always zero because
> 
> (Bitwise "&" of course).
> 
> > Not sure why that should happen, why is the result an int? What
> > causes the casting of pgd_t & be64 to be truncated to an int.
> 
> Yes, it's not obvious as written...  It's simply that the return type of
> pgd_present is int.  So it is truncated _after_ the bitwise and.
>

Thanks, I am surprised the compiler does not complain about the truncation
of bits. I wonder if we are missing -Wconversion

Balbir

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present()
  2019-02-14  6:23 [PATCH] powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present() Michael Ellerman
                   ` (2 preceding siblings ...)
  2019-02-16 13:15 ` Andreas Schwab
@ 2019-02-17  8:21 ` Michael Ellerman
  3 siblings, 0 replies; 16+ messages in thread
From: Michael Ellerman @ 2019-02-17  8:21 UTC (permalink / raw)
  To: Michael Ellerman, linuxppc-dev
  Cc: linux-mm, erhard_f, jack, aneesh.kumar, linux-kernel

On Thu, 2019-02-14 at 06:23:39 UTC, Michael Ellerman wrote:
> In v4.20 we changed our pgd/pud_present() to check for _PAGE_PRESENT
> rather than just checking that the value is non-zero, e.g.:
> 
>   static inline int pgd_present(pgd_t pgd)
>   {
>  -       return !pgd_none(pgd);
>  +       return (pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
>   }
> 
> Unfortunately this is broken on big endian, as the result of the
> bitwise && is truncated to int, which is always zero because
> _PAGE_PRESENT is 0x8000000000000000ul. This means pgd_present() and
> pud_present() are always false at compile time, and the compiler
> elides the subsequent code.
> 
> Remarkably with that bug present we are still able to boot and run
> with few noticeable effects. However under some work loads we are able
> to trigger a warning in the ext4 code:
> 
>   WARNING: CPU: 11 PID: 29593 at fs/ext4/inode.c:3927 .ext4_set_page_dirty+0x70/0xb0
>   CPU: 11 PID: 29593 Comm: debugedit Not tainted 4.20.0-rc1 #1
>   ...
>   NIP .ext4_set_page_dirty+0x70/0xb0
>   LR  .set_page_dirty+0xa0/0x150
>   Call Trace:
>    .set_page_dirty+0xa0/0x150
>    .unmap_page_range+0xbf0/0xe10
>    .unmap_vmas+0x84/0x130
>    .unmap_region+0xe8/0x190
>    .__do_munmap+0x2f0/0x510
>    .__vm_munmap+0x80/0x110
>    .__se_sys_munmap+0x14/0x30
>    system_call+0x5c/0x70
> 
> The fix is simple, we need to convert the result of the bitwise && to
> an int before returning it.
> 
> Thanks to Jan Kara and Aneesh for help with debugging.
> 
> Fixes: da7ad366b497 ("powerpc/mm/book3s: Update pmd_present to look at _PAGE_PRESENT bit")
> Cc: stable@vger.kernel.org # v4.20+
> Reported-by: Erhard F. <erhard_f@mailbox.org>
> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

Applied to powerpc fixes.

https://git.kernel.org/powerpc/c/a58007621be33e9f7c7bed5d5ff8ecb9

cheers

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present()
  2019-02-16 13:15 ` Andreas Schwab
@ 2019-02-17  8:26   ` Michael Ellerman
  0 siblings, 0 replies; 16+ messages in thread
From: Michael Ellerman @ 2019-02-17  8:26 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: linuxppc-dev, linux-mm, erhard_f, jack, aneesh.kumar, linux-kernel

Andreas Schwab <schwab@linux-m68k.org> writes:

> On Feb 14 2019, Michael Ellerman <mpe@ellerman.id.au> wrote:
>
>> The fix is simple, we need to convert the result of the bitwise && to
>> an int before returning it.
>
> Alternatively, the return type could be changed to bool, so that the
> compiler does the right thing by itself.

Yes that would be preferable. All other architectures return an int so
I wasn't game to switch to bool for a fix. But I don't see why it should
matter so I'll do a patch using bool for next.

cheers

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present()
  2019-02-17  6:23     ` Balbir Singh
@ 2019-02-17  8:34       ` Michael Ellerman
  2019-02-17 21:55         ` Balbir Singh
  0 siblings, 1 reply; 16+ messages in thread
From: Michael Ellerman @ 2019-02-17  8:34 UTC (permalink / raw)
  To: Balbir Singh, Segher Boessenkool
  Cc: erhard_f, jack, linuxppc-dev, linux-kernel, linux-mm, aneesh.kumar

Balbir Singh <bsingharora@gmail.com> writes:
> On Sat, Feb 16, 2019 at 08:22:12AM -0600, Segher Boessenkool wrote:
>> Hi all,
>> 
>> On Sat, Feb 16, 2019 at 09:55:11PM +1100, Balbir Singh wrote:
>> > On Thu, Feb 14, 2019 at 05:23:39PM +1100, Michael Ellerman wrote:
>> > > In v4.20 we changed our pgd/pud_present() to check for _PAGE_PRESENT
>> > > rather than just checking that the value is non-zero, e.g.:
>> > > 
>> > >   static inline int pgd_present(pgd_t pgd)
>> > >   {
>> > >  -       return !pgd_none(pgd);
>> > >  +       return (pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
>> > >   }
>> > > 
>> > > Unfortunately this is broken on big endian, as the result of the
>> > > bitwise && is truncated to int, which is always zero because
>> 
>> (Bitwise "&" of course).
>> 
>> > Not sure why that should happen, why is the result an int? What
>> > causes the casting of pgd_t & be64 to be truncated to an int.
>> 
>> Yes, it's not obvious as written...  It's simply that the return type of
>> pgd_present is int.  So it is truncated _after_ the bitwise and.
>>
>
> Thanks, I am surprised the compiler does not complain about the truncation
> of bits. I wonder if we are missing -Wconversion

Good luck with that :)

What I should start doing is building with it enabled and then comparing
the output before and after commits to make sure we're not introducing
new cases.

cheers

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present()
  2019-02-16 14:22   ` Segher Boessenkool
  2019-02-17  6:23     ` Balbir Singh
@ 2019-02-17  8:34     ` Michael Ellerman
  1 sibling, 0 replies; 16+ messages in thread
From: Michael Ellerman @ 2019-02-17  8:34 UTC (permalink / raw)
  To: Segher Boessenkool, Balbir Singh
  Cc: erhard_f, jack, linuxppc-dev, linux-kernel, linux-mm, aneesh.kumar

Segher Boessenkool <segher@kernel.crashing.org> writes:

> Hi all,
>
> On Sat, Feb 16, 2019 at 09:55:11PM +1100, Balbir Singh wrote:
>> On Thu, Feb 14, 2019 at 05:23:39PM +1100, Michael Ellerman wrote:
>> > In v4.20 we changed our pgd/pud_present() to check for _PAGE_PRESENT
>> > rather than just checking that the value is non-zero, e.g.:
>> > 
>> >   static inline int pgd_present(pgd_t pgd)
>> >   {
>> >  -       return !pgd_none(pgd);
>> >  +       return (pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
>> >   }
>> > 
>> > Unfortunately this is broken on big endian, as the result of the
>> > bitwise && is truncated to int, which is always zero because
>
> (Bitwise "&" of course).

Thanks, I fixed that up.

cheers

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present()
  2019-02-17  8:34       ` Michael Ellerman
@ 2019-02-17 21:55         ` Balbir Singh
  2019-02-18  0:49           ` Michael Ellerman
  0 siblings, 1 reply; 16+ messages in thread
From: Balbir Singh @ 2019-02-17 21:55 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Segher Boessenkool, erhard_f, jack, linuxppc-dev, linux-kernel,
	linux-mm, aneesh.kumar

On Sun, Feb 17, 2019 at 07:34:20PM +1100, Michael Ellerman wrote:
> Balbir Singh <bsingharora@gmail.com> writes:
> > On Sat, Feb 16, 2019 at 08:22:12AM -0600, Segher Boessenkool wrote:
> >> Hi all,
> >> 
> >> On Sat, Feb 16, 2019 at 09:55:11PM +1100, Balbir Singh wrote:
> >> > On Thu, Feb 14, 2019 at 05:23:39PM +1100, Michael Ellerman wrote:
> >> > > In v4.20 we changed our pgd/pud_present() to check for _PAGE_PRESENT
> >> > > rather than just checking that the value is non-zero, e.g.:
> >> > > 
> >> > >   static inline int pgd_present(pgd_t pgd)
> >> > >   {
> >> > >  -       return !pgd_none(pgd);
> >> > >  +       return (pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
> >> > >   }
> >> > > 
> >> > > Unfortunately this is broken on big endian, as the result of the
> >> > > bitwise && is truncated to int, which is always zero because
> >> 
> >> (Bitwise "&" of course).
> >> 
> >> > Not sure why that should happen, why is the result an int? What
> >> > causes the casting of pgd_t & be64 to be truncated to an int.
> >> 
> >> Yes, it's not obvious as written...  It's simply that the return type of
> >> pgd_present is int.  So it is truncated _after_ the bitwise and.
> >>
> >
> > Thanks, I am surprised the compiler does not complain about the truncation
> > of bits. I wonder if we are missing -Wconversion
> 
> Good luck with that :)
> 
> What I should start doing is building with it enabled and then comparing
> the output before and after commits to make sure we're not introducing
> new cases.
>

Fair enough, my point was that the compiler can help out. I'll see what
-Wconversion finds on my local build :)

Balbir Singh. 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present()
  2019-02-17 21:55         ` Balbir Singh
@ 2019-02-18  0:49           ` Michael Ellerman
  2019-02-19 12:01             ` Balbir Singh
  2019-02-19 20:15             ` Segher Boessenkool
  0 siblings, 2 replies; 16+ messages in thread
From: Michael Ellerman @ 2019-02-18  0:49 UTC (permalink / raw)
  To: Balbir Singh
  Cc: Segher Boessenkool, erhard_f, jack, linuxppc-dev, linux-kernel,
	linux-mm, aneesh.kumar

Balbir Singh <bsingharora@gmail.com> writes:
> On Sun, Feb 17, 2019 at 07:34:20PM +1100, Michael Ellerman wrote:
>> Balbir Singh <bsingharora@gmail.com> writes:
>> > On Sat, Feb 16, 2019 at 08:22:12AM -0600, Segher Boessenkool wrote:
>> >> On Sat, Feb 16, 2019 at 09:55:11PM +1100, Balbir Singh wrote:
>> >> > On Thu, Feb 14, 2019 at 05:23:39PM +1100, Michael Ellerman wrote:
>> >> > > In v4.20 we changed our pgd/pud_present() to check for _PAGE_PRESENT
>> >> > > rather than just checking that the value is non-zero, e.g.:
>> >> > > 
>> >> > >   static inline int pgd_present(pgd_t pgd)
>> >> > >   {
>> >> > >  -       return !pgd_none(pgd);
>> >> > >  +       return (pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
>> >> > >   }
>> >> > > 
>> >> > > Unfortunately this is broken on big endian, as the result of the
>> >> > > bitwise && is truncated to int, which is always zero because
>> >> 
>> >> (Bitwise "&" of course).
>> >> 
>> >> > Not sure why that should happen, why is the result an int? What
>> >> > causes the casting of pgd_t & be64 to be truncated to an int.
>> >> 
>> >> Yes, it's not obvious as written...  It's simply that the return type of
>> >> pgd_present is int.  So it is truncated _after_ the bitwise and.
>> >>
>> >
>> > Thanks, I am surprised the compiler does not complain about the truncation
>> > of bits. I wonder if we are missing -Wconversion
>> 
>> Good luck with that :)
>> 
>> What I should start doing is building with it enabled and then comparing
>> the output before and after commits to make sure we're not introducing
>> new cases.
>
> Fair enough, my point was that the compiler can help out. I'll see what
> -Wconversion finds on my local build :)

I get about 43MB of warnings here :)

cheers

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present()
  2019-02-18  0:49           ` Michael Ellerman
@ 2019-02-19 12:01             ` Balbir Singh
  2019-02-19 20:15             ` Segher Boessenkool
  1 sibling, 0 replies; 16+ messages in thread
From: Balbir Singh @ 2019-02-19 12:01 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Segher Boessenkool, erhard_f, jack, linuxppc-dev, linux-kernel,
	linux-mm, aneesh.kumar

On Mon, Feb 18, 2019 at 11:49:18AM +1100, Michael Ellerman wrote:
> Balbir Singh <bsingharora@gmail.com> writes:
> > On Sun, Feb 17, 2019 at 07:34:20PM +1100, Michael Ellerman wrote:
> >> Balbir Singh <bsingharora@gmail.com> writes:
> >> > On Sat, Feb 16, 2019 at 08:22:12AM -0600, Segher Boessenkool wrote:
> >> >> On Sat, Feb 16, 2019 at 09:55:11PM +1100, Balbir Singh wrote:
> >> >> > On Thu, Feb 14, 2019 at 05:23:39PM +1100, Michael Ellerman wrote:
> >> >> > > In v4.20 we changed our pgd/pud_present() to check for _PAGE_PRESENT
> >> >> > > rather than just checking that the value is non-zero, e.g.:
> >> >> > > 
> >> >> > >   static inline int pgd_present(pgd_t pgd)
> >> >> > >   {
> >> >> > >  -       return !pgd_none(pgd);
> >> >> > >  +       return (pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
> >> >> > >   }
> >> >> > > 
> >> >> > > Unfortunately this is broken on big endian, as the result of the
> >> >> > > bitwise && is truncated to int, which is always zero because
> >> >> 
> >> >> (Bitwise "&" of course).
> >> >> 
> >> >> > Not sure why that should happen, why is the result an int? What
> >> >> > causes the casting of pgd_t & be64 to be truncated to an int.
> >> >> 
> >> >> Yes, it's not obvious as written...  It's simply that the return type of
> >> >> pgd_present is int.  So it is truncated _after_ the bitwise and.
> >> >>
> >> >
> >> > Thanks, I am surprised the compiler does not complain about the truncation
> >> > of bits. I wonder if we are missing -Wconversion
> >> 
> >> Good luck with that :)
> >> 
> >> What I should start doing is building with it enabled and then comparing
> >> the output before and after commits to make sure we're not introducing
> >> new cases.
> >
> > Fair enough, my point was that the compiler can help out. I'll see what
> > -Wconversion finds on my local build :)
> 
> I get about 43MB of warnings here :)
>

I got about 181M with a failed build :(, but the warnings pointed to some cases
that can be a good project for cleanup

For example

1. 
static inline long regs_return_value(struct pt_regs *regs)
{
        if (is_syscall_success(regs))
                return regs->gpr[3];
        else
                return -regs->gpr[3];
}

In the case of is_syscall_success() returning false, we should ensure that
regs->gpr[3] is negative and capped within a certain limit, but it might
be an expensive check

2.
static inline void mark_hpte_slot_valid(unsigned char *hpte_slot_array,
                                        unsigned int index, unsigned int hidx)
{
        hpte_slot_array[index] = (hidx << 1) | 0x1;
}

hidx is 3 bits, but the argument is unsigned int. The caller probably does a
hidx & 0x7, but it's not clear from the code

3. hash__pmd_bad (pmd_bad) and hash__pud_bad (pud_bad) have issues similar to what was found,
but since the the page table indices are below 32, the macros are safe :)

And a few more, but I am not sure why I spent time looking at possible issues,
may be I am being stupid or overly pessimistic :)

Balbir



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present()
  2019-02-18  0:49           ` Michael Ellerman
  2019-02-19 12:01             ` Balbir Singh
@ 2019-02-19 20:15             ` Segher Boessenkool
  2019-02-20 11:18               ` Michael Ellerman
  1 sibling, 1 reply; 16+ messages in thread
From: Segher Boessenkool @ 2019-02-19 20:15 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Balbir Singh, erhard_f, jack, linuxppc-dev, linux-kernel,
	linux-mm, aneesh.kumar

On Mon, Feb 18, 2019 at 11:49:18AM +1100, Michael Ellerman wrote:
> Balbir Singh <bsingharora@gmail.com> writes:
> > Fair enough, my point was that the compiler can help out. I'll see what
> > -Wconversion finds on my local build :)
> 
> I get about 43MB of warnings here :)

Yes, -Wconversion complains about a lot of things that are idiomatic C.
There is a reason -Wconversion is not in -Wall or -Wextra.


Segher

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present()
  2019-02-19 20:15             ` Segher Boessenkool
@ 2019-02-20 11:18               ` Michael Ellerman
  2019-02-20 14:51                 ` Segher Boessenkool
  0 siblings, 1 reply; 16+ messages in thread
From: Michael Ellerman @ 2019-02-20 11:18 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Balbir Singh, erhard_f, jack, linuxppc-dev, linux-kernel,
	linux-mm, aneesh.kumar

Segher Boessenkool <segher@kernel.crashing.org> writes:
> On Mon, Feb 18, 2019 at 11:49:18AM +1100, Michael Ellerman wrote:
>> Balbir Singh <bsingharora@gmail.com> writes:
>> > Fair enough, my point was that the compiler can help out. I'll see what
>> > -Wconversion finds on my local build :)
>> 
>> I get about 43MB of warnings here :)
>
> Yes, -Wconversion complains about a lot of things that are idiomatic C.
> There is a reason -Wconversion is not in -Wall or -Wextra.

Actually a lot of those go away when I add -Wno-sign-conversion.

And what's left seems mostly reasonable, they all indicate the
possibility of a bug I think.

In fact this works and would have caught the bug:

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index d8c8d7c9df15..3114e3f368e2 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -904,7 +904,12 @@ static inline int pud_none(pud_t pud)
 
 static inline int pud_present(pud_t pud)
 {
+	__diag_push();
+	__diag_warn(GCC, 8, "-Wconversion", "ulong -> int");
+
 	return !!(pud_raw(pud) & cpu_to_be64(_PAGE_PRESENT));
+
+	__diag_pop();
 }
 
 extern struct page *pud_page(pud_t pud);



Obviously we're not going to instrument every function like that. But we
could start instrumenting particular files.

cheers

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH] powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present()
  2019-02-20 11:18               ` Michael Ellerman
@ 2019-02-20 14:51                 ` Segher Boessenkool
  0 siblings, 0 replies; 16+ messages in thread
From: Segher Boessenkool @ 2019-02-20 14:51 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Balbir Singh, erhard_f, jack, linuxppc-dev, linux-kernel,
	linux-mm, aneesh.kumar

On Wed, Feb 20, 2019 at 10:18:38PM +1100, Michael Ellerman wrote:
> Segher Boessenkool <segher@kernel.crashing.org> writes:
> > On Mon, Feb 18, 2019 at 11:49:18AM +1100, Michael Ellerman wrote:
> >> Balbir Singh <bsingharora@gmail.com> writes:
> >> > Fair enough, my point was that the compiler can help out. I'll see what
> >> > -Wconversion finds on my local build :)
> >> 
> >> I get about 43MB of warnings here :)
> >
> > Yes, -Wconversion complains about a lot of things that are idiomatic C.
> > There is a reason -Wconversion is not in -Wall or -Wextra.
> 
> Actually a lot of those go away when I add -Wno-sign-conversion.
> 
> And what's left seems mostly reasonable, they all indicate the
> possibility of a bug I think.
> 
> In fact this works and would have caught the bug:
> 
> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
> index d8c8d7c9df15..3114e3f368e2 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> @@ -904,7 +904,12 @@ static inline int pud_none(pud_t pud)
>  
>  static inline int pud_present(pud_t pud)
>  {
> +	__diag_push();
> +	__diag_warn(GCC, 8, "-Wconversion", "ulong -> int");
> +
>  	return !!(pud_raw(pud) & cpu_to_be64(_PAGE_PRESENT));
> +
> +	__diag_pop();
>  }
>  
>  extern struct page *pud_page(pud_t pud);
> 
> 
> 
> Obviously we're not going to instrument every function like that. But we
> could start instrumenting particular files.

So you want to instrument the functions that you know are buggy, using some
weird incantations to catch only those errors you already know about?

(I am worried this does not scale, in many dimensions).


Segher

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2019-02-20 14:52 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-02-14  6:23 [PATCH] powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present() Michael Ellerman
2019-02-14 16:31 ` Jan Kara
2019-02-16 10:55 ` Balbir Singh
2019-02-16 14:22   ` Segher Boessenkool
2019-02-17  6:23     ` Balbir Singh
2019-02-17  8:34       ` Michael Ellerman
2019-02-17 21:55         ` Balbir Singh
2019-02-18  0:49           ` Michael Ellerman
2019-02-19 12:01             ` Balbir Singh
2019-02-19 20:15             ` Segher Boessenkool
2019-02-20 11:18               ` Michael Ellerman
2019-02-20 14:51                 ` Segher Boessenkool
2019-02-17  8:34     ` Michael Ellerman
2019-02-16 13:15 ` Andreas Schwab
2019-02-17  8:26   ` Michael Ellerman
2019-02-17  8:21 ` Michael Ellerman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).