linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH -next] mm/percpu: fix data-race with pcpu_nr_empty_pop_pages
@ 2021-10-25  7:00 Yuanzheng Song
  2021-10-25  7:50 ` Christoph Lameter
  0 siblings, 1 reply; 4+ messages in thread
From: Yuanzheng Song @ 2021-10-25  7:00 UTC (permalink / raw)
  To: dennis, tj, cl, akpm, linux-mm, linux-kernel

When reading the pcpu_nr_empty_pop_pages in pcpu_alloc()
and writing the pcpu_nr_empty_pop_pages in
pcpu_update_empty_pages() at the same time,
the data-race occurs.

===========
read-write to 0xffffffff882fdd4c of 4 bytes by task 9424 on cpu 0:
 pcpu_update_empty_pages
 pcpu_chunk_populated
 pcpu_balance_populated
 pcpu_balance_workfn
 process_one_work
 worker_thread
 kthread
 ret_from_fork

read to 0xffffffff882fdd4c of 4 bytes by task 9386 on cpu 3:
 pcpu_alloc
 __alloc_percpu_gfp
 fib_nh_common_init
 fib_nh_init
 fib_create_info
 fib_table_insert
 fib_magic
 ......
 sock_sendmsg_nosec
 sock_sendmsg
 __sys_sendto
 __do_sys_sendto
 __se_sys_sendto
 __x64_sys_sendto
 do_syscall_64
 entry_SYSCALL_64_after_hwframe
============

The same problem will occur in these functions:
pcpu_reclaim_populated(), pcpu_update_empty_pages(),
pcpu_isolate_chunk().

To fix this issue, use READ_ONCE() and WRITE_ONCE() to
read and write the pcpu_nr_empty_pop_pages.

Signed-off-by: Yuanzheng Song <songyuanzheng@huawei.com>
---
 mm/percpu.c | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/mm/percpu.c b/mm/percpu.c
index 293009cc03ef..e8ef92e698ab 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -574,7 +574,9 @@ static void pcpu_isolate_chunk(struct pcpu_chunk *chunk)
 
 	if (!chunk->isolated) {
 		chunk->isolated = true;
-		pcpu_nr_empty_pop_pages -= chunk->nr_empty_pop_pages;
+		WRITE_ONCE(pcpu_nr_empty_pop_pages,
+			   READ_ONCE(pcpu_nr_empty_pop_pages) -
+			   chunk->nr_empty_pop_pages);
 	}
 	list_move(&chunk->list, &pcpu_chunk_lists[pcpu_to_depopulate_slot]);
 }
@@ -585,7 +587,9 @@ static void pcpu_reintegrate_chunk(struct pcpu_chunk *chunk)
 
 	if (chunk->isolated) {
 		chunk->isolated = false;
-		pcpu_nr_empty_pop_pages += chunk->nr_empty_pop_pages;
+		WRITE_ONCE(pcpu_nr_empty_pop_pages,
+			   READ_ONCE(pcpu_nr_empty_pop_pages) +
+			   chunk->nr_empty_pop_pages);
 		pcpu_chunk_relocate(chunk, -1);
 	}
 }
@@ -603,7 +607,8 @@ static inline void pcpu_update_empty_pages(struct pcpu_chunk *chunk, int nr)
 {
 	chunk->nr_empty_pop_pages += nr;
 	if (chunk != pcpu_reserved_chunk && !chunk->isolated)
-		pcpu_nr_empty_pop_pages += nr;
+		WRITE_ONCE(pcpu_nr_empty_pop_pages,
+			   READ_ONCE(pcpu_nr_empty_pop_pages) + nr);
 }
 
 /*
@@ -1874,7 +1879,7 @@ static void __percpu *pcpu_alloc(size_t size, size_t align, bool reserved,
 		mutex_unlock(&pcpu_alloc_mutex);
 	}
 
-	if (pcpu_nr_empty_pop_pages < PCPU_EMPTY_POP_PAGES_LOW)
+	if (READ_ONCE(pcpu_nr_empty_pop_pages) < PCPU_EMPTY_POP_PAGES_LOW)
 		pcpu_schedule_balance_work();
 
 	/* clear the areas and return address relative to base address */
@@ -2062,7 +2067,7 @@ static void pcpu_balance_populated(void)
 		pcpu_atomic_alloc_failed = false;
 	} else {
 		nr_to_pop = clamp(PCPU_EMPTY_POP_PAGES_HIGH -
-				  pcpu_nr_empty_pop_pages,
+				  READ_ONCE(pcpu_nr_empty_pop_pages),
 				  0, PCPU_EMPTY_POP_PAGES_HIGH);
 	}
 
@@ -2163,7 +2168,8 @@ static void pcpu_reclaim_populated(void)
 				break;
 
 			/* reintegrate chunk to prevent atomic alloc failures */
-			if (pcpu_nr_empty_pop_pages < PCPU_EMPTY_POP_PAGES_HIGH) {
+			if (READ_ONCE(pcpu_nr_empty_pop_pages) <
+			    PCPU_EMPTY_POP_PAGES_HIGH) {
 				reintegrate = true;
 				goto end_chunk;
 			}
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH -next] mm/percpu: fix data-race with pcpu_nr_empty_pop_pages
  2021-10-25  7:00 [PATCH -next] mm/percpu: fix data-race with pcpu_nr_empty_pop_pages Yuanzheng Song
@ 2021-10-25  7:50 ` Christoph Lameter
  2021-10-26  2:41   ` Dennis Zhou
  0 siblings, 1 reply; 4+ messages in thread
From: Christoph Lameter @ 2021-10-25  7:50 UTC (permalink / raw)
  To: Yuanzheng Song; +Cc: dennis, tj, akpm, linux-mm, linux-kernel

On Mon, 25 Oct 2021, Yuanzheng Song wrote:

> When reading the pcpu_nr_empty_pop_pages in pcpu_alloc()
> and writing the pcpu_nr_empty_pop_pages in
> pcpu_update_empty_pages() at the same time,
> the data-race occurs.

Looks like a use case for the atomic RMV instructions.

> To fix this issue, use READ_ONCE() and WRITE_ONCE() to
> read and write the pcpu_nr_empty_pop_pages.

Never thought that READ_ONCE and WRITE_ONCE can fix races like
this. Really?

> diff --git a/mm/percpu.c b/mm/percpu.c
> index 293009cc03ef..e8ef92e698ab 100644
> --- a/mm/percpu.c
> +++ b/mm/percpu.c
> @@ -574,7 +574,9 @@ static void pcpu_isolate_chunk(struct pcpu_chunk *chunk)
>
>  	if (!chunk->isolated) {
>  		chunk->isolated = true;
> -		pcpu_nr_empty_pop_pages -= chunk->nr_empty_pop_pages;
> +		WRITE_ONCE(pcpu_nr_empty_pop_pages,
> +			   READ_ONCE(pcpu_nr_empty_pop_pages) -
> +			   chunk->nr_empty_pop_pages);

atomic_sub()?

>  	}
>  	list_move(&chunk->list, &pcpu_chunk_lists[pcpu_to_depopulate_slot]);
>  }
> @@ -585,7 +587,9 @@ static void pcpu_reintegrate_chunk(struct pcpu_chunk *chunk)
>
>  	if (chunk->isolated) {
>  		chunk->isolated = false;
> -		pcpu_nr_empty_pop_pages += chunk->nr_empty_pop_pages;
> +		WRITE_ONCE(pcpu_nr_empty_pop_pages,
> +			   READ_ONCE(pcpu_nr_empty_pop_pages) +
> +			   chunk->nr_empty_pop_pages);

atomic_add()?


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH -next] mm/percpu: fix data-race with pcpu_nr_empty_pop_pages
  2021-10-25  7:50 ` Christoph Lameter
@ 2021-10-26  2:41   ` Dennis Zhou
  2021-10-27  7:12     ` songyuanzheng
  0 siblings, 1 reply; 4+ messages in thread
From: Dennis Zhou @ 2021-10-26  2:41 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Yuanzheng Song, dennis, tj, akpm, linux-mm, linux-kernel

Hello,

On Mon, Oct 25, 2021 at 09:50:48AM +0200, Christoph Lameter wrote:
> On Mon, 25 Oct 2021, Yuanzheng Song wrote:
> 
> > When reading the pcpu_nr_empty_pop_pages in pcpu_alloc()
> > and writing the pcpu_nr_empty_pop_pages in
> > pcpu_update_empty_pages() at the same time,
> > the data-race occurs.
> 
> Looks like a use case for the atomic RMV instructions.
> 

Yeah. I see 2 options. Switch the variable over to an atomic or we can
move the read behind pcpu_lock. All the writes are already behind it
othewise that would actually be problematic. In this particular case,
reading a wrong # of empty pages isn't a big deal as eventually the
background work will get scheduled.

Thanks,
Dennis

> > To fix this issue, use READ_ONCE() and WRITE_ONCE() to
> > read and write the pcpu_nr_empty_pop_pages.
> 
> Never thought that READ_ONCE and WRITE_ONCE can fix races like
> this. Really?
> 
> > diff --git a/mm/percpu.c b/mm/percpu.c
> > index 293009cc03ef..e8ef92e698ab 100644
> > --- a/mm/percpu.c
> > +++ b/mm/percpu.c
> > @@ -574,7 +574,9 @@ static void pcpu_isolate_chunk(struct pcpu_chunk *chunk)
> >
> >  	if (!chunk->isolated) {
> >  		chunk->isolated = true;
> > -		pcpu_nr_empty_pop_pages -= chunk->nr_empty_pop_pages;
> > +		WRITE_ONCE(pcpu_nr_empty_pop_pages,
> > +			   READ_ONCE(pcpu_nr_empty_pop_pages) -
> > +			   chunk->nr_empty_pop_pages);
> 
> atomic_sub()?
> 
> >  	}
> >  	list_move(&chunk->list, &pcpu_chunk_lists[pcpu_to_depopulate_slot]);
> >  }
> > @@ -585,7 +587,9 @@ static void pcpu_reintegrate_chunk(struct pcpu_chunk *chunk)
> >
> >  	if (chunk->isolated) {
> >  		chunk->isolated = false;
> > -		pcpu_nr_empty_pop_pages += chunk->nr_empty_pop_pages;
> > +		WRITE_ONCE(pcpu_nr_empty_pop_pages,
> > +			   READ_ONCE(pcpu_nr_empty_pop_pages) +
> > +			   chunk->nr_empty_pop_pages);
> 
> atomic_add()?
> 


^ permalink raw reply	[flat|nested] 4+ messages in thread

* RE: [PATCH -next] mm/percpu: fix data-race with pcpu_nr_empty_pop_pages
  2021-10-26  2:41   ` Dennis Zhou
@ 2021-10-27  7:12     ` songyuanzheng
  0 siblings, 0 replies; 4+ messages in thread
From: songyuanzheng @ 2021-10-27  7:12 UTC (permalink / raw)
  To: Dennis Zhou, Christoph Lameter; +Cc: tj, akpm, linux-mm, linux-kernel

Hello,

Thanks for the advice, Dennis Zhou and Christoph Lameter. 
I really appreciate it.
I edited this patch by changing the pcpu_nr_empty_pop_pages to atomic_t variable.

Here is the v2 patch: https://patchwork.kernel.org/project/linux-mm/patch/20211026084312.2138852-1-songyuanzheng@huawei.com/.
Would you mind reviewing it again?

Thanks,
Yuanzheng Song

-----Original Message-----
From: Dennis Zhou [mailto:dennis@kernel.org] 
Sent: Tuesday, October 26, 2021 10:42 AM
To: Christoph Lameter <cl@gentwo.de>
Cc: songyuanzheng <songyuanzheng@huawei.com>; dennis@kernel.org; tj@kernel.org; akpm@linux-foundation.org; linux-mm@kvack.org; linux-kernel@vger.kernel.org
Subject: Re: [PATCH -next] mm/percpu: fix data-race with pcpu_nr_empty_pop_pages

Hello,

On Mon, Oct 25, 2021 at 09:50:48AM +0200, Christoph Lameter wrote:
> On Mon, 25 Oct 2021, Yuanzheng Song wrote:
> 
> > When reading the pcpu_nr_empty_pop_pages in pcpu_alloc() and writing 
> > the pcpu_nr_empty_pop_pages in
> > pcpu_update_empty_pages() at the same time, the data-race occurs.
> 
> Looks like a use case for the atomic RMV instructions.
> 

Yeah. I see 2 options. Switch the variable over to an atomic or we can move the read behind pcpu_lock. All the writes are already behind it othewise that would actually be problematic. In this particular case, reading a wrong # of empty pages isn't a big deal as eventually the background work will get scheduled.

Thanks,
Dennis

> > To fix this issue, use READ_ONCE() and WRITE_ONCE() to read and 
> > write the pcpu_nr_empty_pop_pages.
> 
> Never thought that READ_ONCE and WRITE_ONCE can fix races like this. 
> Really?
> 
> > diff --git a/mm/percpu.c b/mm/percpu.c index 
> > 293009cc03ef..e8ef92e698ab 100644
> > --- a/mm/percpu.c
> > +++ b/mm/percpu.c
> > @@ -574,7 +574,9 @@ static void pcpu_isolate_chunk(struct pcpu_chunk 
> > *chunk)
> >
> >  	if (!chunk->isolated) {
> >  		chunk->isolated = true;
> > -		pcpu_nr_empty_pop_pages -= chunk->nr_empty_pop_pages;
> > +		WRITE_ONCE(pcpu_nr_empty_pop_pages,
> > +			   READ_ONCE(pcpu_nr_empty_pop_pages) -
> > +			   chunk->nr_empty_pop_pages);
> 
> atomic_sub()?
> 
> >  	}
> >  	list_move(&chunk->list, 
> > &pcpu_chunk_lists[pcpu_to_depopulate_slot]);
> >  }
> > @@ -585,7 +587,9 @@ static void pcpu_reintegrate_chunk(struct 
> > pcpu_chunk *chunk)
> >
> >  	if (chunk->isolated) {
> >  		chunk->isolated = false;
> > -		pcpu_nr_empty_pop_pages += chunk->nr_empty_pop_pages;
> > +		WRITE_ONCE(pcpu_nr_empty_pop_pages,
> > +			   READ_ONCE(pcpu_nr_empty_pop_pages) +
> > +			   chunk->nr_empty_pop_pages);
> 
> atomic_add()?
> 


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2021-10-27  7:12 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-25  7:00 [PATCH -next] mm/percpu: fix data-race with pcpu_nr_empty_pop_pages Yuanzheng Song
2021-10-25  7:50 ` Christoph Lameter
2021-10-26  2:41   ` Dennis Zhou
2021-10-27  7:12     ` songyuanzheng

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).