All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH -next v2] mm: annotate a data race in page_zonenum()
@ 2020-02-13 18:38 Qian Cai
  2020-02-13 19:26 ` Paul E. McKenney
  2020-02-13 21:20 ` John Hubbard
  0 siblings, 2 replies; 10+ messages in thread
From: Qian Cai @ 2020-02-13 18:38 UTC (permalink / raw)
  To: paulmck
  Cc: akpm, elver, david, jack, jhubbard, ira.weiny, dan.j.williams,
	linux-mm, linux-kernel, Qian Cai

 BUG: KCSAN: data-race in page_cpupid_xchg_last / put_page

 write (marked) to 0xfffffc0d48ec1a00 of 8 bytes by task 91442 on cpu 3:
  page_cpupid_xchg_last+0x51/0x80
  page_cpupid_xchg_last at mm/mmzone.c:109 (discriminator 11)
  wp_page_reuse+0x3e/0xc0
  wp_page_reuse at mm/memory.c:2453
  do_wp_page+0x472/0x7b0
  do_wp_page at mm/memory.c:2798
  __handle_mm_fault+0xcb0/0xd00
  handle_pte_fault at mm/memory.c:4049
  (inlined by) __handle_mm_fault at mm/memory.c:4163
  handle_mm_fault+0xfc/0x2f0
  handle_mm_fault at mm/memory.c:4200
  do_page_fault+0x263/0x6f9
  do_user_addr_fault at arch/x86/mm/fault.c:1465
  (inlined by) do_page_fault at arch/x86/mm/fault.c:1539
  page_fault+0x34/0x40

 read to 0xfffffc0d48ec1a00 of 8 bytes by task 94817 on cpu 69:
  put_page+0x15a/0x1f0
  page_zonenum at include/linux/mm.h:923
  (inlined by) is_zone_device_page at include/linux/mm.h:929
  (inlined by) page_is_devmap_managed at include/linux/mm.h:948
  (inlined by) put_page at include/linux/mm.h:1023
  wp_page_copy+0x571/0x930
  wp_page_copy at mm/memory.c:2615
  do_wp_page+0x107/0x7b0
  __handle_mm_fault+0xcb0/0xd00
  handle_mm_fault+0xfc/0x2f0
  do_page_fault+0x263/0x6f9
  page_fault+0x34/0x40

 Reported by Kernel Concurrency Sanitizer on:
 CPU: 69 PID: 94817 Comm: systemd-udevd Tainted: G        W  O L 5.5.0-next-20200204+ #6
 Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019

A page never changes its zone number. The zone number happens to be
stored in the same word as other bits which are modified, but the zone
number bits will never be modified by any other write, so it can accept
a reload of the zone bits after an intervening write and it don't need
to use READ_ONCE(). Thus, annotate this data race using
ASSERT_EXCLUSIVE_BITS() to also assert that there are no concurrent
writes to it.

Suggested-by: Marco Elver <elver@google.com>
Signed-off-by: Qian Cai <cai@lca.pw>
---

v2: use ASSERT_EXCLUSIVE_BITS().

BTW, not sure if it is easier for Andrew with Paul to pick this up (with
Andrew's ACK), since ASSERT_EXCLUSIVE_BITS() is in -rcu tree only (or likely
tomorrow's -next tree).

 include/linux/mm.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 52269e56c514..0d70fafd055c 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -920,6 +920,7 @@ vm_fault_t alloc_set_pte(struct vm_fault *vmf, struct mem_cgroup *memcg,
 
 static inline enum zone_type page_zonenum(const struct page *page)
 {
+	ASSERT_EXCLUSIVE_BITS(page->flags, ZONES_MASK << ZONES_PGSHIFT);
 	return (page->flags >> ZONES_PGSHIFT) & ZONES_MASK;
 }
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH -next v2] mm: annotate a data race in page_zonenum()
  2020-02-13 18:38 [PATCH -next v2] mm: annotate a data race in page_zonenum() Qian Cai
@ 2020-02-13 19:26 ` Paul E. McKenney
  2020-02-14 14:30     ` Qian Cai
  2020-02-13 21:20 ` John Hubbard
  1 sibling, 1 reply; 10+ messages in thread
From: Paul E. McKenney @ 2020-02-13 19:26 UTC (permalink / raw)
  To: Qian Cai
  Cc: akpm, elver, david, jack, jhubbard, ira.weiny, dan.j.williams,
	linux-mm, linux-kernel

On Thu, Feb 13, 2020 at 01:38:09PM -0500, Qian Cai wrote:
>  BUG: KCSAN: data-race in page_cpupid_xchg_last / put_page
> 
>  write (marked) to 0xfffffc0d48ec1a00 of 8 bytes by task 91442 on cpu 3:
>   page_cpupid_xchg_last+0x51/0x80
>   page_cpupid_xchg_last at mm/mmzone.c:109 (discriminator 11)
>   wp_page_reuse+0x3e/0xc0
>   wp_page_reuse at mm/memory.c:2453
>   do_wp_page+0x472/0x7b0
>   do_wp_page at mm/memory.c:2798
>   __handle_mm_fault+0xcb0/0xd00
>   handle_pte_fault at mm/memory.c:4049
>   (inlined by) __handle_mm_fault at mm/memory.c:4163
>   handle_mm_fault+0xfc/0x2f0
>   handle_mm_fault at mm/memory.c:4200
>   do_page_fault+0x263/0x6f9
>   do_user_addr_fault at arch/x86/mm/fault.c:1465
>   (inlined by) do_page_fault at arch/x86/mm/fault.c:1539
>   page_fault+0x34/0x40
> 
>  read to 0xfffffc0d48ec1a00 of 8 bytes by task 94817 on cpu 69:
>   put_page+0x15a/0x1f0
>   page_zonenum at include/linux/mm.h:923
>   (inlined by) is_zone_device_page at include/linux/mm.h:929
>   (inlined by) page_is_devmap_managed at include/linux/mm.h:948
>   (inlined by) put_page at include/linux/mm.h:1023
>   wp_page_copy+0x571/0x930
>   wp_page_copy at mm/memory.c:2615
>   do_wp_page+0x107/0x7b0
>   __handle_mm_fault+0xcb0/0xd00
>   handle_mm_fault+0xfc/0x2f0
>   do_page_fault+0x263/0x6f9
>   page_fault+0x34/0x40
> 
>  Reported by Kernel Concurrency Sanitizer on:
>  CPU: 69 PID: 94817 Comm: systemd-udevd Tainted: G        W  O L 5.5.0-next-20200204+ #6
>  Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019
> 
> A page never changes its zone number. The zone number happens to be
> stored in the same word as other bits which are modified, but the zone
> number bits will never be modified by any other write, so it can accept
> a reload of the zone bits after an intervening write and it don't need
> to use READ_ONCE(). Thus, annotate this data race using
> ASSERT_EXCLUSIVE_BITS() to also assert that there are no concurrent
> writes to it.
> 
> Suggested-by: Marco Elver <elver@google.com>
> Signed-off-by: Qian Cai <cai@lca.pw>
> ---
> 
> v2: use ASSERT_EXCLUSIVE_BITS().
> 
> BTW, not sure if it is easier for Andrew with Paul to pick this up (with
> Andrew's ACK), since ASSERT_EXCLUSIVE_BITS() is in -rcu tree only (or likely
> tomorrow's -next tree).

Here are the options I know of, any of which work for me:

1.	I take the patch given appropriate acks/reviews.

2.	Someone hangs onto the patch until the KCSAN infrastructure
	hits mainline and then sends it up via whatever path.

3.	One way to do #2 is to merge the -rcu tree's "kcsan" branch and
	then queue this patch on top of that, again sending this patch
	along once KCSAN hits mainline.  Unusually for the -rcu tree,
	the "kcsan" branch is not supposed to be rebased.

Either way, just let me know!

							Thanx, Paul

>  include/linux/mm.h | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 52269e56c514..0d70fafd055c 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -920,6 +920,7 @@ vm_fault_t alloc_set_pte(struct vm_fault *vmf, struct mem_cgroup *memcg,
>  
>  static inline enum zone_type page_zonenum(const struct page *page)
>  {
> +	ASSERT_EXCLUSIVE_BITS(page->flags, ZONES_MASK << ZONES_PGSHIFT);
>  	return (page->flags >> ZONES_PGSHIFT) & ZONES_MASK;
>  }
>  
> -- 
> 1.8.3.1
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH -next v2] mm: annotate a data race in page_zonenum()
  2020-02-13 18:38 [PATCH -next v2] mm: annotate a data race in page_zonenum() Qian Cai
  2020-02-13 19:26 ` Paul E. McKenney
@ 2020-02-13 21:20 ` John Hubbard
  2020-02-19 21:55     ` Qian Cai
  1 sibling, 1 reply; 10+ messages in thread
From: John Hubbard @ 2020-02-13 21:20 UTC (permalink / raw)
  To: Qian Cai, paulmck
  Cc: akpm, elver, david, jack, ira.weiny, dan.j.williams, linux-mm,
	linux-kernel

On 2/13/20 10:38 AM, Qian Cai wrote:
>  BUG: KCSAN: data-race in page_cpupid_xchg_last / put_page
> 
>  write (marked) to 0xfffffc0d48ec1a00 of 8 bytes by task 91442 on cpu 3:
>   page_cpupid_xchg_last+0x51/0x80
>   page_cpupid_xchg_last at mm/mmzone.c:109 (discriminator 11)
>   wp_page_reuse+0x3e/0xc0
>   wp_page_reuse at mm/memory.c:2453
>   do_wp_page+0x472/0x7b0
>   do_wp_page at mm/memory.c:2798
>   __handle_mm_fault+0xcb0/0xd00
>   handle_pte_fault at mm/memory.c:4049
>   (inlined by) __handle_mm_fault at mm/memory.c:4163
>   handle_mm_fault+0xfc/0x2f0
>   handle_mm_fault at mm/memory.c:4200
>   do_page_fault+0x263/0x6f9
>   do_user_addr_fault at arch/x86/mm/fault.c:1465
>   (inlined by) do_page_fault at arch/x86/mm/fault.c:1539
>   page_fault+0x34/0x40
> 
>  read to 0xfffffc0d48ec1a00 of 8 bytes by task 94817 on cpu 69:
>   put_page+0x15a/0x1f0
>   page_zonenum at include/linux/mm.h:923
>   (inlined by) is_zone_device_page at include/linux/mm.h:929
>   (inlined by) page_is_devmap_managed at include/linux/mm.h:948
>   (inlined by) put_page at include/linux/mm.h:1023
>   wp_page_copy+0x571/0x930
>   wp_page_copy at mm/memory.c:2615
>   do_wp_page+0x107/0x7b0
>   __handle_mm_fault+0xcb0/0xd00
>   handle_mm_fault+0xfc/0x2f0
>   do_page_fault+0x263/0x6f9
>   page_fault+0x34/0x40
> 
>  Reported by Kernel Concurrency Sanitizer on:
>  CPU: 69 PID: 94817 Comm: systemd-udevd Tainted: G        W  O L 5.5.0-next-20200204+ #6
>  Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019
> 
> A page never changes its zone number. The zone number happens to be
> stored in the same word as other bits which are modified, but the zone
> number bits will never be modified by any other write, so it can accept
> a reload of the zone bits after an intervening write and it don't need
> to use READ_ONCE(). Thus, annotate this data race using
> ASSERT_EXCLUSIVE_BITS() to also assert that there are no concurrent
> writes to it.
> 
> Suggested-by: Marco Elver <elver@google.com>
> Signed-off-by: Qian Cai <cai@lca.pw>
> ---
> 
> v2: use ASSERT_EXCLUSIVE_BITS().


Much cleaner, thanks to this new macro. You can add:


    Reviewed-by: John Hubbard <jhubbard@nvidia.com>


thanks,
-- 
John Hubbard
NVIDIA

> 
> BTW, not sure if it is easier for Andrew with Paul to pick this up (with
> Andrew's ACK), since ASSERT_EXCLUSIVE_BITS() is in -rcu tree only (or likely
> tomorrow's -next tree).
> 
>  include/linux/mm.h | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 52269e56c514..0d70fafd055c 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -920,6 +920,7 @@ vm_fault_t alloc_set_pte(struct vm_fault *vmf, struct mem_cgroup *memcg,
>  
>  static inline enum zone_type page_zonenum(const struct page *page)
>  {
> +	ASSERT_EXCLUSIVE_BITS(page->flags, ZONES_MASK << ZONES_PGSHIFT);
>  	return (page->flags >> ZONES_PGSHIFT) & ZONES_MASK;
>  }
>  
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH -next v2] mm: annotate a data race in page_zonenum()
  2020-02-13 19:26 ` Paul E. McKenney
@ 2020-02-14 14:30     ` Qian Cai
  0 siblings, 0 replies; 10+ messages in thread
From: Qian Cai @ 2020-02-14 14:30 UTC (permalink / raw)
  To: paulmck
  Cc: akpm, elver, david, jack, jhubbard, ira.weiny, dan.j.williams,
	linux-mm, linux-kernel

On Thu, 2020-02-13 at 11:26 -0800, Paul E. McKenney wrote:
> On Thu, Feb 13, 2020 at 01:38:09PM -0500, Qian Cai wrote:
> >  BUG: KCSAN: data-race in page_cpupid_xchg_last / put_page
> > 
> >  write (marked) to 0xfffffc0d48ec1a00 of 8 bytes by task 91442 on cpu 3:
> >   page_cpupid_xchg_last+0x51/0x80
> >   page_cpupid_xchg_last at mm/mmzone.c:109 (discriminator 11)
> >   wp_page_reuse+0x3e/0xc0
> >   wp_page_reuse at mm/memory.c:2453
> >   do_wp_page+0x472/0x7b0
> >   do_wp_page at mm/memory.c:2798
> >   __handle_mm_fault+0xcb0/0xd00
> >   handle_pte_fault at mm/memory.c:4049
> >   (inlined by) __handle_mm_fault at mm/memory.c:4163
> >   handle_mm_fault+0xfc/0x2f0
> >   handle_mm_fault at mm/memory.c:4200
> >   do_page_fault+0x263/0x6f9
> >   do_user_addr_fault at arch/x86/mm/fault.c:1465
> >   (inlined by) do_page_fault at arch/x86/mm/fault.c:1539
> >   page_fault+0x34/0x40
> > 
> >  read to 0xfffffc0d48ec1a00 of 8 bytes by task 94817 on cpu 69:
> >   put_page+0x15a/0x1f0
> >   page_zonenum at include/linux/mm.h:923
> >   (inlined by) is_zone_device_page at include/linux/mm.h:929
> >   (inlined by) page_is_devmap_managed at include/linux/mm.h:948
> >   (inlined by) put_page at include/linux/mm.h:1023
> >   wp_page_copy+0x571/0x930
> >   wp_page_copy at mm/memory.c:2615
> >   do_wp_page+0x107/0x7b0
> >   __handle_mm_fault+0xcb0/0xd00
> >   handle_mm_fault+0xfc/0x2f0
> >   do_page_fault+0x263/0x6f9
> >   page_fault+0x34/0x40
> > 
> >  Reported by Kernel Concurrency Sanitizer on:
> >  CPU: 69 PID: 94817 Comm: systemd-udevd Tainted: G        W  O L 5.5.0-next-20200204+ #6
> >  Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019
> > 
> > A page never changes its zone number. The zone number happens to be
> > stored in the same word as other bits which are modified, but the zone
> > number bits will never be modified by any other write, so it can accept
> > a reload of the zone bits after an intervening write and it don't need
> > to use READ_ONCE(). Thus, annotate this data race using
> > ASSERT_EXCLUSIVE_BITS() to also assert that there are no concurrent
> > writes to it.
> > 
> > Suggested-by: Marco Elver <elver@google.com>
> > Signed-off-by: Qian Cai <cai@lca.pw>
> > ---
> > 
> > v2: use ASSERT_EXCLUSIVE_BITS().
> > 
> > BTW, not sure if it is easier for Andrew with Paul to pick this up (with
> > Andrew's ACK), since ASSERT_EXCLUSIVE_BITS() is in -rcu tree only (or likely
> > tomorrow's -next tree).
> 
> Here are the options I know of, any of which work for me:
> 
> 1.	I take the patch given appropriate acks/reviews.
> 
> 2.	Someone hangs onto the patch until the KCSAN infrastructure
> 	hits mainline and then sends it up via whatever path.
> 
> 3.	One way to do #2 is to merge the -rcu tree's "kcsan" branch and
> 	then queue this patch on top of that, again sending this patch
> 	along once KCSAN hits mainline.  Unusually for the -rcu tree,
> 	the "kcsan" branch is not supposed to be rebased.
> 
> Either way, just let me know!

I suppose it is up to Andrew what he prefers. I may have another pile of patches
depends on some of those KCSAN things (except the data_race() macro) are coming,
so I'd like to find a way to get into -next first rather than waiting to send
them in April (once KCSAN hit the mainline) if possible.

> 
> 							Thanx, Paul
> 
> >  include/linux/mm.h | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index 52269e56c514..0d70fafd055c 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -920,6 +920,7 @@ vm_fault_t alloc_set_pte(struct vm_fault *vmf, struct mem_cgroup *memcg,
> >  
> >  static inline enum zone_type page_zonenum(const struct page *page)
> >  {
> > +	ASSERT_EXCLUSIVE_BITS(page->flags, ZONES_MASK << ZONES_PGSHIFT);
> >  	return (page->flags >> ZONES_PGSHIFT) & ZONES_MASK;
> >  }
> >  
> > -- 
> > 1.8.3.1
> > 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH -next v2] mm: annotate a data race in page_zonenum()
@ 2020-02-14 14:30     ` Qian Cai
  0 siblings, 0 replies; 10+ messages in thread
From: Qian Cai @ 2020-02-14 14:30 UTC (permalink / raw)
  To: paulmck
  Cc: akpm, elver, david, jack, jhubbard, ira.weiny, dan.j.williams,
	linux-mm, linux-kernel

On Thu, 2020-02-13 at 11:26 -0800, Paul E. McKenney wrote:
> On Thu, Feb 13, 2020 at 01:38:09PM -0500, Qian Cai wrote:
> >  BUG: KCSAN: data-race in page_cpupid_xchg_last / put_page
> > 
> >  write (marked) to 0xfffffc0d48ec1a00 of 8 bytes by task 91442 on cpu 3:
> >   page_cpupid_xchg_last+0x51/0x80
> >   page_cpupid_xchg_last at mm/mmzone.c:109 (discriminator 11)
> >   wp_page_reuse+0x3e/0xc0
> >   wp_page_reuse at mm/memory.c:2453
> >   do_wp_page+0x472/0x7b0
> >   do_wp_page at mm/memory.c:2798
> >   __handle_mm_fault+0xcb0/0xd00
> >   handle_pte_fault at mm/memory.c:4049
> >   (inlined by) __handle_mm_fault at mm/memory.c:4163
> >   handle_mm_fault+0xfc/0x2f0
> >   handle_mm_fault at mm/memory.c:4200
> >   do_page_fault+0x263/0x6f9
> >   do_user_addr_fault at arch/x86/mm/fault.c:1465
> >   (inlined by) do_page_fault at arch/x86/mm/fault.c:1539
> >   page_fault+0x34/0x40
> > 
> >  read to 0xfffffc0d48ec1a00 of 8 bytes by task 94817 on cpu 69:
> >   put_page+0x15a/0x1f0
> >   page_zonenum at include/linux/mm.h:923
> >   (inlined by) is_zone_device_page at include/linux/mm.h:929
> >   (inlined by) page_is_devmap_managed at include/linux/mm.h:948
> >   (inlined by) put_page at include/linux/mm.h:1023
> >   wp_page_copy+0x571/0x930
> >   wp_page_copy at mm/memory.c:2615
> >   do_wp_page+0x107/0x7b0
> >   __handle_mm_fault+0xcb0/0xd00
> >   handle_mm_fault+0xfc/0x2f0
> >   do_page_fault+0x263/0x6f9
> >   page_fault+0x34/0x40
> > 
> >  Reported by Kernel Concurrency Sanitizer on:
> >  CPU: 69 PID: 94817 Comm: systemd-udevd Tainted: G        W  O L 5.5.0-next-20200204+ #6
> >  Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019
> > 
> > A page never changes its zone number. The zone number happens to be
> > stored in the same word as other bits which are modified, but the zone
> > number bits will never be modified by any other write, so it can accept
> > a reload of the zone bits after an intervening write and it don't need
> > to use READ_ONCE(). Thus, annotate this data race using
> > ASSERT_EXCLUSIVE_BITS() to also assert that there are no concurrent
> > writes to it.
> > 
> > Suggested-by: Marco Elver <elver@google.com>
> > Signed-off-by: Qian Cai <cai@lca.pw>
> > ---
> > 
> > v2: use ASSERT_EXCLUSIVE_BITS().
> > 
> > BTW, not sure if it is easier for Andrew with Paul to pick this up (with
> > Andrew's ACK), since ASSERT_EXCLUSIVE_BITS() is in -rcu tree only (or likely
> > tomorrow's -next tree).
> 
> Here are the options I know of, any of which work for me:
> 
> 1.	I take the patch given appropriate acks/reviews.
> 
> 2.	Someone hangs onto the patch until the KCSAN infrastructure
> 	hits mainline and then sends it up via whatever path.
> 
> 3.	One way to do #2 is to merge the -rcu tree's "kcsan" branch and
> 	then queue this patch on top of that, again sending this patch
> 	along once KCSAN hits mainline.  Unusually for the -rcu tree,
> 	the "kcsan" branch is not supposed to be rebased.
> 
> Either way, just let me know!

I suppose it is up to Andrew what he prefers. I may have another pile of patches
depends on some of those KCSAN things (except the data_race() macro) are coming,
so I'd like to find a way to get into -next first rather than waiting to send
them in April (once KCSAN hit the mainline) if possible.

> 
> 							Thanx, Paul
> 
> >  include/linux/mm.h | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index 52269e56c514..0d70fafd055c 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -920,6 +920,7 @@ vm_fault_t alloc_set_pte(struct vm_fault *vmf, struct mem_cgroup *memcg,
> >  
> >  static inline enum zone_type page_zonenum(const struct page *page)
> >  {
> > +	ASSERT_EXCLUSIVE_BITS(page->flags, ZONES_MASK << ZONES_PGSHIFT);
> >  	return (page->flags >> ZONES_PGSHIFT) & ZONES_MASK;
> >  }
> >  
> > -- 
> > 1.8.3.1
> > 


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH -next v2] mm: annotate a data race in page_zonenum()
  2020-02-14 14:30     ` Qian Cai
  (?)
@ 2020-02-14 16:16     ` Paul E. McKenney
  2020-02-18 12:42       ` Qian Cai
  -1 siblings, 1 reply; 10+ messages in thread
From: Paul E. McKenney @ 2020-02-14 16:16 UTC (permalink / raw)
  To: Qian Cai
  Cc: akpm, elver, david, jack, jhubbard, ira.weiny, dan.j.williams,
	linux-mm, linux-kernel

On Fri, Feb 14, 2020 at 09:30:26AM -0500, Qian Cai wrote:
> On Thu, 2020-02-13 at 11:26 -0800, Paul E. McKenney wrote:
> > On Thu, Feb 13, 2020 at 01:38:09PM -0500, Qian Cai wrote:
> > >  BUG: KCSAN: data-race in page_cpupid_xchg_last / put_page
> > > 
> > >  write (marked) to 0xfffffc0d48ec1a00 of 8 bytes by task 91442 on cpu 3:
> > >   page_cpupid_xchg_last+0x51/0x80
> > >   page_cpupid_xchg_last at mm/mmzone.c:109 (discriminator 11)
> > >   wp_page_reuse+0x3e/0xc0
> > >   wp_page_reuse at mm/memory.c:2453
> > >   do_wp_page+0x472/0x7b0
> > >   do_wp_page at mm/memory.c:2798
> > >   __handle_mm_fault+0xcb0/0xd00
> > >   handle_pte_fault at mm/memory.c:4049
> > >   (inlined by) __handle_mm_fault at mm/memory.c:4163
> > >   handle_mm_fault+0xfc/0x2f0
> > >   handle_mm_fault at mm/memory.c:4200
> > >   do_page_fault+0x263/0x6f9
> > >   do_user_addr_fault at arch/x86/mm/fault.c:1465
> > >   (inlined by) do_page_fault at arch/x86/mm/fault.c:1539
> > >   page_fault+0x34/0x40
> > > 
> > >  read to 0xfffffc0d48ec1a00 of 8 bytes by task 94817 on cpu 69:
> > >   put_page+0x15a/0x1f0
> > >   page_zonenum at include/linux/mm.h:923
> > >   (inlined by) is_zone_device_page at include/linux/mm.h:929
> > >   (inlined by) page_is_devmap_managed at include/linux/mm.h:948
> > >   (inlined by) put_page at include/linux/mm.h:1023
> > >   wp_page_copy+0x571/0x930
> > >   wp_page_copy at mm/memory.c:2615
> > >   do_wp_page+0x107/0x7b0
> > >   __handle_mm_fault+0xcb0/0xd00
> > >   handle_mm_fault+0xfc/0x2f0
> > >   do_page_fault+0x263/0x6f9
> > >   page_fault+0x34/0x40
> > > 
> > >  Reported by Kernel Concurrency Sanitizer on:
> > >  CPU: 69 PID: 94817 Comm: systemd-udevd Tainted: G        W  O L 5.5.0-next-20200204+ #6
> > >  Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019
> > > 
> > > A page never changes its zone number. The zone number happens to be
> > > stored in the same word as other bits which are modified, but the zone
> > > number bits will never be modified by any other write, so it can accept
> > > a reload of the zone bits after an intervening write and it don't need
> > > to use READ_ONCE(). Thus, annotate this data race using
> > > ASSERT_EXCLUSIVE_BITS() to also assert that there are no concurrent
> > > writes to it.
> > > 
> > > Suggested-by: Marco Elver <elver@google.com>
> > > Signed-off-by: Qian Cai <cai@lca.pw>
> > > ---
> > > 
> > > v2: use ASSERT_EXCLUSIVE_BITS().
> > > 
> > > BTW, not sure if it is easier for Andrew with Paul to pick this up (with
> > > Andrew's ACK), since ASSERT_EXCLUSIVE_BITS() is in -rcu tree only (or likely
> > > tomorrow's -next tree).
> > 
> > Here are the options I know of, any of which work for me:
> > 
> > 1.	I take the patch given appropriate acks/reviews.
> > 
> > 2.	Someone hangs onto the patch until the KCSAN infrastructure
> > 	hits mainline and then sends it up via whatever path.
> > 
> > 3.	One way to do #2 is to merge the -rcu tree's "kcsan" branch and
> > 	then queue this patch on top of that, again sending this patch
> > 	along once KCSAN hits mainline.  Unusually for the -rcu tree,
> > 	the "kcsan" branch is not supposed to be rebased.
> > 
> > Either way, just let me know!
> 
> I suppose it is up to Andrew what he prefers. I may have another pile of patches
> depends on some of those KCSAN things (except the data_race() macro) are coming,
> so I'd like to find a way to get into -next first rather than waiting to send
> them in April (once KCSAN hit the mainline) if possible.

Any of the three options above would get you into -next quickly.

So just let me know how you would like to proceed.

							Thanx, Paul

> > >  include/linux/mm.h | 1 +
> > >  1 file changed, 1 insertion(+)
> > > 
> > > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > > index 52269e56c514..0d70fafd055c 100644
> > > --- a/include/linux/mm.h
> > > +++ b/include/linux/mm.h
> > > @@ -920,6 +920,7 @@ vm_fault_t alloc_set_pte(struct vm_fault *vmf, struct mem_cgroup *memcg,
> > >  
> > >  static inline enum zone_type page_zonenum(const struct page *page)
> > >  {
> > > +	ASSERT_EXCLUSIVE_BITS(page->flags, ZONES_MASK << ZONES_PGSHIFT);
> > >  	return (page->flags >> ZONES_PGSHIFT) & ZONES_MASK;
> > >  }
> > >  
> > > -- 
> > > 1.8.3.1
> > > 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH -next v2] mm: annotate a data race in page_zonenum()
  2020-02-14 16:16     ` Paul E. McKenney
@ 2020-02-18 12:42       ` Qian Cai
  0 siblings, 0 replies; 10+ messages in thread
From: Qian Cai @ 2020-02-18 12:42 UTC (permalink / raw)
  To: paulmck
  Cc: akpm, elver, david, jack, jhubbard, ira.weiny, dan.j.williams,
	linux-mm, linux-kernel



> On Feb 14, 2020, at 11:16 AM, Paul E. McKenney <paulmck@kernel.org> wrote:
> 
> Any of the three options above would get you into -next quickly.
> 
> So just let me know how you would like to proceed.

Unless Andrew has any objection, I think it makes sense that you pick this up as it is pretty much a low risk patch.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH -next v2] mm: annotate a data race in page_zonenum()
  2020-02-13 21:20 ` John Hubbard
@ 2020-02-19 21:55     ` Qian Cai
  0 siblings, 0 replies; 10+ messages in thread
From: Qian Cai @ 2020-02-19 21:55 UTC (permalink / raw)
  To: akpm
  Cc: John Hubbard, paulmck, elver, david, jack, ira.weiny,
	dan.j.williams, linux-mm, linux-kernel

Andrew, since you had picked the similar one which also depends
on ASSERT_EXCLUSIVE_*, can you pick up this patch as well?

On Thu, 2020-02-13 at 13:20 -0800, John Hubbard wrote:
> On 2/13/20 10:38 AM, Qian Cai wrote:
> >  BUG: KCSAN: data-race in page_cpupid_xchg_last / put_page
> > 
> >  write (marked) to 0xfffffc0d48ec1a00 of 8 bytes by task 91442 on cpu 3:
> >   page_cpupid_xchg_last+0x51/0x80
> >   page_cpupid_xchg_last at mm/mmzone.c:109 (discriminator 11)
> >   wp_page_reuse+0x3e/0xc0
> >   wp_page_reuse at mm/memory.c:2453
> >   do_wp_page+0x472/0x7b0
> >   do_wp_page at mm/memory.c:2798
> >   __handle_mm_fault+0xcb0/0xd00
> >   handle_pte_fault at mm/memory.c:4049
> >   (inlined by) __handle_mm_fault at mm/memory.c:4163
> >   handle_mm_fault+0xfc/0x2f0
> >   handle_mm_fault at mm/memory.c:4200
> >   do_page_fault+0x263/0x6f9
> >   do_user_addr_fault at arch/x86/mm/fault.c:1465
> >   (inlined by) do_page_fault at arch/x86/mm/fault.c:1539
> >   page_fault+0x34/0x40
> > 
> >  read to 0xfffffc0d48ec1a00 of 8 bytes by task 94817 on cpu 69:
> >   put_page+0x15a/0x1f0
> >   page_zonenum at include/linux/mm.h:923
> >   (inlined by) is_zone_device_page at include/linux/mm.h:929
> >   (inlined by) page_is_devmap_managed at include/linux/mm.h:948
> >   (inlined by) put_page at include/linux/mm.h:1023
> >   wp_page_copy+0x571/0x930
> >   wp_page_copy at mm/memory.c:2615
> >   do_wp_page+0x107/0x7b0
> >   __handle_mm_fault+0xcb0/0xd00
> >   handle_mm_fault+0xfc/0x2f0
> >   do_page_fault+0x263/0x6f9
> >   page_fault+0x34/0x40
> > 
> >  Reported by Kernel Concurrency Sanitizer on:
> >  CPU: 69 PID: 94817 Comm: systemd-udevd Tainted: G        W  O L 5.5.0-next-20200204+ #6
> >  Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019
> > 
> > A page never changes its zone number. The zone number happens to be
> > stored in the same word as other bits which are modified, but the zone
> > number bits will never be modified by any other write, so it can accept
> > a reload of the zone bits after an intervening write and it don't need
> > to use READ_ONCE(). Thus, annotate this data race using
> > ASSERT_EXCLUSIVE_BITS() to also assert that there are no concurrent
> > writes to it.
> > 
> > Suggested-by: Marco Elver <elver@google.com>
> > Signed-off-by: Qian Cai <cai@lca.pw>
> > ---
> > 
> > v2: use ASSERT_EXCLUSIVE_BITS().
> 
> 
> Much cleaner, thanks to this new macro. You can add:
> 
> 
>     Reviewed-by: John Hubbard <jhubbard@nvidia.com>
> 
> 
> thanks,

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH -next v2] mm: annotate a data race in page_zonenum()
@ 2020-02-19 21:55     ` Qian Cai
  0 siblings, 0 replies; 10+ messages in thread
From: Qian Cai @ 2020-02-19 21:55 UTC (permalink / raw)
  To: akpm
  Cc: John Hubbard, paulmck, elver, david, jack, ira.weiny,
	dan.j.williams, linux-mm, linux-kernel

Andrew, since you had picked the similar one which also depends
on ASSERT_EXCLUSIVE_*, can you pick up this patch as well?

On Thu, 2020-02-13 at 13:20 -0800, John Hubbard wrote:
> On 2/13/20 10:38 AM, Qian Cai wrote:
> >  BUG: KCSAN: data-race in page_cpupid_xchg_last / put_page
> > 
> >  write (marked) to 0xfffffc0d48ec1a00 of 8 bytes by task 91442 on cpu 3:
> >   page_cpupid_xchg_last+0x51/0x80
> >   page_cpupid_xchg_last at mm/mmzone.c:109 (discriminator 11)
> >   wp_page_reuse+0x3e/0xc0
> >   wp_page_reuse at mm/memory.c:2453
> >   do_wp_page+0x472/0x7b0
> >   do_wp_page at mm/memory.c:2798
> >   __handle_mm_fault+0xcb0/0xd00
> >   handle_pte_fault at mm/memory.c:4049
> >   (inlined by) __handle_mm_fault at mm/memory.c:4163
> >   handle_mm_fault+0xfc/0x2f0
> >   handle_mm_fault at mm/memory.c:4200
> >   do_page_fault+0x263/0x6f9
> >   do_user_addr_fault at arch/x86/mm/fault.c:1465
> >   (inlined by) do_page_fault at arch/x86/mm/fault.c:1539
> >   page_fault+0x34/0x40
> > 
> >  read to 0xfffffc0d48ec1a00 of 8 bytes by task 94817 on cpu 69:
> >   put_page+0x15a/0x1f0
> >   page_zonenum at include/linux/mm.h:923
> >   (inlined by) is_zone_device_page at include/linux/mm.h:929
> >   (inlined by) page_is_devmap_managed at include/linux/mm.h:948
> >   (inlined by) put_page at include/linux/mm.h:1023
> >   wp_page_copy+0x571/0x930
> >   wp_page_copy at mm/memory.c:2615
> >   do_wp_page+0x107/0x7b0
> >   __handle_mm_fault+0xcb0/0xd00
> >   handle_mm_fault+0xfc/0x2f0
> >   do_page_fault+0x263/0x6f9
> >   page_fault+0x34/0x40
> > 
> >  Reported by Kernel Concurrency Sanitizer on:
> >  CPU: 69 PID: 94817 Comm: systemd-udevd Tainted: G        W  O L 5.5.0-next-20200204+ #6
> >  Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019
> > 
> > A page never changes its zone number. The zone number happens to be
> > stored in the same word as other bits which are modified, but the zone
> > number bits will never be modified by any other write, so it can accept
> > a reload of the zone bits after an intervening write and it don't need
> > to use READ_ONCE(). Thus, annotate this data race using
> > ASSERT_EXCLUSIVE_BITS() to also assert that there are no concurrent
> > writes to it.
> > 
> > Suggested-by: Marco Elver <elver@google.com>
> > Signed-off-by: Qian Cai <cai@lca.pw>
> > ---
> > 
> > v2: use ASSERT_EXCLUSIVE_BITS().
> 
> 
> Much cleaner, thanks to this new macro. You can add:
> 
> 
>     Reviewed-by: John Hubbard <jhubbard@nvidia.com>
> 
> 
> thanks,


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH -next v2] mm: annotate a data race in page_zonenum()
  2020-02-19 21:55     ` Qian Cai
  (?)
@ 2020-02-19 23:47     ` Paul E. McKenney
  -1 siblings, 0 replies; 10+ messages in thread
From: Paul E. McKenney @ 2020-02-19 23:47 UTC (permalink / raw)
  To: Qian Cai
  Cc: akpm, John Hubbard, elver, david, jack, ira.weiny,
	dan.j.williams, linux-mm, linux-kernel

On Wed, Feb 19, 2020 at 04:55:40PM -0500, Qian Cai wrote:
> Andrew, since you had picked the similar one which also depends
> on ASSERT_EXCLUSIVE_*, can you pick up this patch as well?
> 
> On Thu, 2020-02-13 at 13:20 -0800, John Hubbard wrote:
> > On 2/13/20 10:38 AM, Qian Cai wrote:
> > >  BUG: KCSAN: data-race in page_cpupid_xchg_last / put_page
> > > 
> > >  write (marked) to 0xfffffc0d48ec1a00 of 8 bytes by task 91442 on cpu 3:
> > >   page_cpupid_xchg_last+0x51/0x80
> > >   page_cpupid_xchg_last at mm/mmzone.c:109 (discriminator 11)
> > >   wp_page_reuse+0x3e/0xc0
> > >   wp_page_reuse at mm/memory.c:2453
> > >   do_wp_page+0x472/0x7b0
> > >   do_wp_page at mm/memory.c:2798
> > >   __handle_mm_fault+0xcb0/0xd00
> > >   handle_pte_fault at mm/memory.c:4049
> > >   (inlined by) __handle_mm_fault at mm/memory.c:4163
> > >   handle_mm_fault+0xfc/0x2f0
> > >   handle_mm_fault at mm/memory.c:4200
> > >   do_page_fault+0x263/0x6f9
> > >   do_user_addr_fault at arch/x86/mm/fault.c:1465
> > >   (inlined by) do_page_fault at arch/x86/mm/fault.c:1539
> > >   page_fault+0x34/0x40
> > > 
> > >  read to 0xfffffc0d48ec1a00 of 8 bytes by task 94817 on cpu 69:
> > >   put_page+0x15a/0x1f0
> > >   page_zonenum at include/linux/mm.h:923
> > >   (inlined by) is_zone_device_page at include/linux/mm.h:929
> > >   (inlined by) page_is_devmap_managed at include/linux/mm.h:948
> > >   (inlined by) put_page at include/linux/mm.h:1023
> > >   wp_page_copy+0x571/0x930
> > >   wp_page_copy at mm/memory.c:2615
> > >   do_wp_page+0x107/0x7b0
> > >   __handle_mm_fault+0xcb0/0xd00
> > >   handle_mm_fault+0xfc/0x2f0
> > >   do_page_fault+0x263/0x6f9
> > >   page_fault+0x34/0x40
> > > 
> > >  Reported by Kernel Concurrency Sanitizer on:
> > >  CPU: 69 PID: 94817 Comm: systemd-udevd Tainted: G        W  O L 5.5.0-next-20200204+ #6
> > >  Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019
> > > 
> > > A page never changes its zone number. The zone number happens to be
> > > stored in the same word as other bits which are modified, but the zone
> > > number bits will never be modified by any other write, so it can accept
> > > a reload of the zone bits after an intervening write and it don't need
> > > to use READ_ONCE(). Thus, annotate this data race using
> > > ASSERT_EXCLUSIVE_BITS() to also assert that there are no concurrent
> > > writes to it.
> > > 
> > > Suggested-by: Marco Elver <elver@google.com>
> > > Signed-off-by: Qian Cai <cai@lca.pw>
> > > ---
> > > 
> > > v2: use ASSERT_EXCLUSIVE_BITS().
> > 
> > 
> > Much cleaner, thanks to this new macro. You can add:
> > 
> > 
> >     Reviewed-by: John Hubbard <jhubbard@nvidia.com>

It looks like Andrew already pulled this one in, but let me know if
he drops it for whatever reason.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2020-02-19 23:47 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-13 18:38 [PATCH -next v2] mm: annotate a data race in page_zonenum() Qian Cai
2020-02-13 19:26 ` Paul E. McKenney
2020-02-14 14:30   ` Qian Cai
2020-02-14 14:30     ` Qian Cai
2020-02-14 16:16     ` Paul E. McKenney
2020-02-18 12:42       ` Qian Cai
2020-02-13 21:20 ` John Hubbard
2020-02-19 21:55   ` Qian Cai
2020-02-19 21:55     ` Qian Cai
2020-02-19 23:47     ` Paul E. McKenney

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.