* [PATCH v3 0/2] A few cleanup and fixup patches for swap
@ 2022-06-25  9:33 Miaohe Lin
  2022-06-25  9:33 ` [PATCH v3 1/2] mm/swapfile: fix possible data races of inuse_pages Miaohe Lin
  2022-06-25  9:33 ` [PATCH v3 2/2] mm/swap: remove swap_cache_info statistics Miaohe Lin
  0 siblings, 2 replies; 8+ messages in thread
From: Miaohe Lin @ 2022-06-25  9:33 UTC (permalink / raw)
  To: akpm
  Cc: david, ying.huang, songmuchun, quic_qiancai, linux-mm,
	linux-kernel, linmiaohe

Hi everyone,
This series contains a cleanup patch that removes the unneeded
swap_cache_info statistics and a bugfix patch that avoids data races on
inuse_pages. More details can be found in the respective changelogs. Thanks!

---
v3:
  rebase on linux-next-20220624
  drop patch 1/3 per Huang, Ying
  collect Reviewed-by tag per David, Muchun, Acked-by tag per Huang, Ying
  use WRITE_ONCE to pair with READ_ONCE in patch 2/3
v2:
  collect Reviewed-by tag per David
  drop patch "mm/swapfile: avoid confusing swap cache statistics"
  add a new patch to remove swap_cache_info statistics per David
  Many thanks David for review and comment.
---
Miaohe Lin (2):
  mm/swapfile: fix possible data races of inuse_pages
  mm/swap: remove swap_cache_info statistics

 mm/swap_state.c | 17 -----------------
 mm/swapfile.c   |  8 ++++----
 2 files changed, 4 insertions(+), 21 deletions(-)

-- 
2.23.0



* [PATCH v3 1/2] mm/swapfile: fix possible data races of inuse_pages
  2022-06-25  9:33 [PATCH v3 0/2] A few cleanup and fixup patches for swap Miaohe Lin
@ 2022-06-25  9:33 ` Miaohe Lin
  2022-06-27  1:29   ` Huang, Ying
  2022-06-27 12:43   ` Qian Cai
  2022-06-25  9:33 ` [PATCH v3 2/2] mm/swap: remove swap_cache_info statistics Miaohe Lin
  1 sibling, 2 replies; 8+ messages in thread
From: Miaohe Lin @ 2022-06-25  9:33 UTC (permalink / raw)
  To: akpm
  Cc: david, ying.huang, songmuchun, quic_qiancai, linux-mm,
	linux-kernel, linmiaohe

si->inuse_pages can still be accessed concurrently: swap_show() and
si_swapinfo() read it with plain loads outside the si->lock critical
section, which results in data races. Use READ_ONCE() and WRITE_ONCE() to
fix them. Note that these data races should be harmless because the value
is only used for showing swap info.

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
---
 mm/swapfile.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index edc3420d30e7..5c8681a3f1d9 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -695,7 +695,7 @@ static void swap_range_alloc(struct swap_info_struct *si, unsigned long offset,
 		si->lowest_bit += nr_entries;
 	if (end == si->highest_bit)
 		WRITE_ONCE(si->highest_bit, si->highest_bit - nr_entries);
-	si->inuse_pages += nr_entries;
+	WRITE_ONCE(si->inuse_pages, si->inuse_pages + nr_entries);
 	if (si->inuse_pages == si->pages) {
 		si->lowest_bit = si->max;
 		si->highest_bit = 0;
@@ -732,7 +732,7 @@ static void swap_range_free(struct swap_info_struct *si, unsigned long offset,
 			add_to_avail_list(si);
 	}
 	atomic_long_add(nr_entries, &nr_swap_pages);
-	si->inuse_pages -= nr_entries;
+	WRITE_ONCE(si->inuse_pages, si->inuse_pages - nr_entries);
 	if (si->flags & SWP_BLKDEV)
 		swap_slot_free_notify =
 			si->bdev->bd_disk->fops->swap_slot_free_notify;
@@ -2641,7 +2641,7 @@ static int swap_show(struct seq_file *swap, void *v)
 	}
 
 	bytes = si->pages << (PAGE_SHIFT - 10);
-	inuse = si->inuse_pages << (PAGE_SHIFT - 10);
+	inuse = READ_ONCE(si->inuse_pages) << (PAGE_SHIFT - 10);
 
 	file = si->swap_file;
 	len = seq_file_path(swap, file, " \t\n\\");
@@ -3260,7 +3260,7 @@ void si_swapinfo(struct sysinfo *val)
 		struct swap_info_struct *si = swap_info[type];
 
 		if ((si->flags & SWP_USED) && !(si->flags & SWP_WRITEOK))
-			nr_to_be_unused += si->inuse_pages;
+			nr_to_be_unused += READ_ONCE(si->inuse_pages);
 	}
 	val->freeswap = atomic_long_read(&nr_swap_pages) + nr_to_be_unused;
 	val->totalswap = total_swap_pages + nr_to_be_unused;
-- 
2.23.0
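
The pairing used in this patch can be illustrated with a minimal user-space
sketch. It is not kernel code: the READ_ONCE()/WRITE_ONCE() macros below are
simplified volatile-cast stand-ins for the kernel's macros, a pthread mutex
plays the role of si->lock, and struct demo_swap_info, demo_range_alloc(),
demo_range_free() and demo_inuse_kb() are hypothetical names invented for
this example.

```c
#include <assert.h>
#include <pthread.h>

/*
 * Simplified user-space stand-ins for the kernel's READ_ONCE() and
 * WRITE_ONCE(). The volatile cast asks the compiler for exactly one
 * access to the object. __typeof__ is a GCC/Clang extension; the real
 * kernel macros perform additional checks.
 */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical, cut-down analogue of struct swap_info_struct. */
struct demo_swap_info {
	pthread_mutex_t lock;		/* plays the role of si->lock */
	unsigned int pages;
	unsigned int inuse_pages;
};

/*
 * Writers are serialized by the lock, so the plain read on the
 * right-hand side is safe. Only the store is marked, because it can
 * race with the lockless reader below.
 */
static void demo_range_alloc(struct demo_swap_info *si, unsigned int nr)
{
	pthread_mutex_lock(&si->lock);
	WRITE_ONCE(si->inuse_pages, si->inuse_pages + nr);
	pthread_mutex_unlock(&si->lock);
}

static void demo_range_free(struct demo_swap_info *si, unsigned int nr)
{
	pthread_mutex_lock(&si->lock);
	assert(si->inuse_pages >= nr);	/* guard against underflow */
	WRITE_ONCE(si->inuse_pages, si->inuse_pages - nr);
	pthread_mutex_unlock(&si->lock);
}

/*
 * Lockless reader in the spirit of swap_show(): the marked load pairs
 * with the marked stores above. The result may be slightly stale.
 */
static unsigned long demo_inuse_kb(struct demo_swap_info *si,
				   unsigned int page_shift)
{
	return (unsigned long)READ_ONCE(si->inuse_pages) << (page_shift - 10);
}
```

The reader never takes the lock, so it can observe a value that is already
out of date. That is acceptable here for the same reason the changelog
gives: the value is only displayed. Portable user-space code would normally
use C11 atomics instead of volatile casts; the casts are used only to mirror
the kernel idiom.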



* [PATCH v3 2/2] mm/swap: remove swap_cache_info statistics
  2022-06-25  9:33 [PATCH v3 0/2] A few cleanup and fixup patches for swap Miaohe Lin
  2022-06-25  9:33 ` [PATCH v3 1/2] mm/swapfile: fix possible data races of inuse_pages Miaohe Lin
@ 2022-06-25  9:33 ` Miaohe Lin
  1 sibling, 0 replies; 8+ messages in thread
From: Miaohe Lin @ 2022-06-25  9:33 UTC (permalink / raw)
  To: akpm
  Cc: david, ying.huang, songmuchun, quic_qiancai, linux-mm,
	linux-kernel, linmiaohe

The swap_cache_info statistics cannot easily be used to tune system
performance because they are not easily accessible. They also do not
provide really useful information when an OOM occurs. Removing these
statistics also helps mitigate unneeded contention on the global
swap_cache_info cacheline.

Suggested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
---
 mm/swap_state.c | 17 -----------------
 1 file changed, 17 deletions(-)

diff --git a/mm/swap_state.c b/mm/swap_state.c
index dd142624172b..e166051566f4 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -59,24 +59,11 @@ static bool enable_vma_readahead __read_mostly = true;
 #define GET_SWAP_RA_VAL(vma)					\
 	(atomic_long_read(&(vma)->swap_readahead_info) ? : 4)
 
-#define INC_CACHE_INFO(x)	data_race(swap_cache_info.x++)
-#define ADD_CACHE_INFO(x, nr)	data_race(swap_cache_info.x += (nr))
-
-static struct {
-	unsigned long add_total;
-	unsigned long del_total;
-	unsigned long find_success;
-	unsigned long find_total;
-} swap_cache_info;
-
 static atomic_t swapin_readahead_hits = ATOMIC_INIT(4);
 
 void show_swap_cache_info(void)
 {
 	printk("%lu pages in swap cache\n", total_swapcache_pages());
-	printk("Swap cache stats: add %lu, delete %lu, find %lu/%lu\n",
-		swap_cache_info.add_total, swap_cache_info.del_total,
-		swap_cache_info.find_success, swap_cache_info.find_total);
 	printk("Free swap  = %ldkB\n",
 		get_nr_swap_pages() << (PAGE_SHIFT - 10));
 	printk("Total swap = %lukB\n", total_swap_pages << (PAGE_SHIFT - 10));
@@ -133,7 +120,6 @@ int add_to_swap_cache(struct page *page, swp_entry_t entry,
 		address_space->nrpages += nr;
 		__mod_node_page_state(page_pgdat(page), NR_FILE_PAGES, nr);
 		__mod_lruvec_page_state(page, NR_SWAPCACHE, nr);
-		ADD_CACHE_INFO(add_total, nr);
 unlock:
 		xas_unlock_irq(&xas);
 	} while (xas_nomem(&xas, gfp));
@@ -173,7 +159,6 @@ void __delete_from_swap_cache(struct folio *folio,
 	address_space->nrpages -= nr;
 	__node_stat_mod_folio(folio, NR_FILE_PAGES, -nr);
 	__lruvec_stat_mod_folio(folio, NR_SWAPCACHE, -nr);
-	ADD_CACHE_INFO(del_total, nr);
 }
 
 /**
@@ -349,12 +334,10 @@ struct page *lookup_swap_cache(swp_entry_t entry, struct vm_area_struct *vma,
 	page = find_get_page(swap_address_space(entry), swp_offset(entry));
 	put_swap_device(si);
 
-	INC_CACHE_INFO(find_total);
 	if (page) {
 		bool vma_ra = swap_use_vma_readahead();
 		bool readahead;
 
-		INC_CACHE_INFO(find_success);
 		/*
 		 * At the moment, we don't support PG_readahead for anon THP
 		 * so let's bail out rather than confusing the readahead stat.
-- 
2.23.0
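
Several lines in this series, such as the printk() calls kept in
show_swap_cache_info() and the inuse computation in swap_show(), convert a
page count to kilobytes with a shift by (PAGE_SHIFT - 10). A page is
2^PAGE_SHIFT bytes and a kilobyte here is 2^10 bytes, so shifting left by the
difference multiplies by the number of kilobytes per page. The sketch below
assumes 4 KiB pages (a shift of 12), which is only an assumption for the
example; DEMO_PAGE_SHIFT and demo_pages_to_kb() are hypothetical names.

```c
#include <assert.h>

/* Assumption for this example only: 4 KiB pages, i.e. a shift of 12. */
#define DEMO_PAGE_SHIFT 12

/* The shift count must not go negative for the conversion to be valid. */
static_assert(DEMO_PAGE_SHIFT >= 10, "pages must be at least 1 kB");

/* Convert a page count to kilobytes: pages * (page size / 1024). */
static unsigned long demo_pages_to_kb(unsigned long nr_pages)
{
	return nr_pages << (DEMO_PAGE_SHIFT - 10);
}
```

With the assumed 4 KiB page size each page contributes 4 kB, so 256 pages
correspond to 1024 kB.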



* Re: [PATCH v3 1/2] mm/swapfile: fix possible data races of inuse_pages
  2022-06-25  9:33 ` [PATCH v3 1/2] mm/swapfile: fix possible data races of inuse_pages Miaohe Lin
@ 2022-06-27  1:29   ` Huang, Ying
  2022-06-27 12:43   ` Qian Cai
  1 sibling, 0 replies; 8+ messages in thread
From: Huang, Ying @ 2022-06-27  1:29 UTC (permalink / raw)
  To: Miaohe Lin; +Cc: akpm, david, songmuchun, quic_qiancai, linux-mm, linux-kernel

Miaohe Lin <linmiaohe@huawei.com> writes:

> si->inuse_pages can still be accessed concurrently: swap_show() and
> si_swapinfo() read it with plain loads outside the si->lock critical
> section, which results in data races. Use READ_ONCE() and WRITE_ONCE() to
> fix them. Note that these data races should be harmless because the value
> is only used for showing swap info.
>
> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
> Reviewed-by: David Hildenbrand <david@redhat.com>

Reviewed-by: "Huang, Ying" <ying.huang@intel.com>

Thanks!

Best Regards,
Huang, Ying


* Re: [PATCH v3 1/2] mm/swapfile: fix possible data races of inuse_pages
  2022-06-25  9:33 ` [PATCH v3 1/2] mm/swapfile: fix possible data races of inuse_pages Miaohe Lin
  2022-06-27  1:29   ` Huang, Ying
@ 2022-06-27 12:43   ` Qian Cai
  2022-06-27 13:27     ` Miaohe Lin
  1 sibling, 1 reply; 8+ messages in thread
From: Qian Cai @ 2022-06-27 12:43 UTC (permalink / raw)
  To: Miaohe Lin; +Cc: akpm, david, ying.huang, songmuchun, linux-mm, linux-kernel

On Sat, Jun 25, 2022 at 05:33:45PM +0800, Miaohe Lin wrote:
> si->inuse_pages can still be accessed concurrently: swap_show() and
> si_swapinfo() read it with plain loads outside the si->lock critical
> section, which results in data races. Use READ_ONCE() and WRITE_ONCE() to
> fix them. Note that these data races should be harmless because the value
> is only used for showing swap info.

Was this found by KCSAN? If so, it would be useful to record the exact
KCSAN report in the commit message.


* Re: [PATCH v3 1/2] mm/swapfile: fix possible data races of inuse_pages
  2022-06-27 12:43   ` Qian Cai
@ 2022-06-27 13:27     ` Miaohe Lin
  2022-06-27 13:47       ` Qian Cai
  0 siblings, 1 reply; 8+ messages in thread
From: Miaohe Lin @ 2022-06-27 13:27 UTC (permalink / raw)
  To: Qian Cai; +Cc: akpm, david, ying.huang, songmuchun, linux-mm, linux-kernel

On 2022/6/27 20:43, Qian Cai wrote:
> On Sat, Jun 25, 2022 at 05:33:45PM +0800, Miaohe Lin wrote:
>> si->inuse_pages can still be accessed concurrently: swap_show() and
>> si_swapinfo() read it with plain loads outside the si->lock critical
>> section, which results in data races. Use READ_ONCE() and WRITE_ONCE() to
>> fix them. Note that these data races should be harmless because the value
>> is only used for showing swap info.
> 
> Was this found by KCSAN? If so, it would be useful to record the exact
> KCSAN report in the commit message.

Sorry, it was found via code inspection.

Thanks.

> .
> 



* Re: [PATCH v3 1/2] mm/swapfile: fix possible data races of inuse_pages
  2022-06-27 13:27     ` Miaohe Lin
@ 2022-06-27 13:47       ` Qian Cai
  2022-06-28  1:56         ` Huang, Ying
  0 siblings, 1 reply; 8+ messages in thread
From: Qian Cai @ 2022-06-27 13:47 UTC (permalink / raw)
  To: Miaohe Lin; +Cc: akpm, david, ying.huang, songmuchun, linux-mm, linux-kernel

On Mon, Jun 27, 2022 at 09:27:43PM +0800, Miaohe Lin wrote:
> On 2022/6/27 20:43, Qian Cai wrote:
> > On Sat, Jun 25, 2022 at 05:33:45PM +0800, Miaohe Lin wrote:
> >> si->inuse_pages can still be accessed concurrently: swap_show() and
> >> si_swapinfo() read it with plain loads outside the si->lock critical
> >> section, which results in data races. Use READ_ONCE() and WRITE_ONCE() to
> >> fix them. Note that these data races should be harmless because the value
> >> is only used for showing swap info.
> > 
> > Was this found by KCSAN? If so, it would be useful to record the exact
> > KCSAN report in the commit message.
> 
> Sorry, it was found via code inspection.

Well, if we are going to do a WRITE_ONCE() in those places just for
documentation purposes now, I think we will need to fix all places in the mm
subsystem to be consistent.


* Re: [PATCH v3 1/2] mm/swapfile: fix possible data races of inuse_pages
  2022-06-27 13:47       ` Qian Cai
@ 2022-06-28  1:56         ` Huang, Ying
  0 siblings, 0 replies; 8+ messages in thread
From: Huang, Ying @ 2022-06-28  1:56 UTC (permalink / raw)
  To: Qian Cai; +Cc: Miaohe Lin, akpm, david, songmuchun, linux-mm, linux-kernel

Qian Cai <quic_qiancai@quicinc.com> writes:

> On Mon, Jun 27, 2022 at 09:27:43PM +0800, Miaohe Lin wrote:
>> On 2022/6/27 20:43, Qian Cai wrote:
>> > On Sat, Jun 25, 2022 at 05:33:45PM +0800, Miaohe Lin wrote:
>> >> si->inuse_pages can still be accessed concurrently: swap_show() and
>> >> si_swapinfo() read it with plain loads outside the si->lock critical
>> >> section, which results in data races. Use READ_ONCE() and WRITE_ONCE() to
>> >> fix them. Note that these data races should be harmless because the value
>> >> is only used for showing swap info.
>> > 
>> > Was this found by KCSAN? If so, it would be useful to record the exact
>> > KCSAN report in the commit message.
>> 
>> Sorry, it was found via code inspection.
>
> Well, if we are going to do a WRITE_ONCE() in those places just for
> documentation purposes now, I think we will need to fix all places in the mm
> subsystem to be consistent.

We have already done this in swapfile.c; please search for "WRITE_ONCE"
in that file.

Best Regards,
Huang, Ying

