All of lore.kernel.org
 help / color / mirror / Atom feed
From: Miaohe Lin <linmiaohe@huawei.com>
To: "Huang, Ying" <ying.huang@intel.com>
Cc: <akpm@linux-foundation.org>, <dennis@kernel.org>,
	<tim.c.chen@linux.intel.com>, <hughd@google.com>,
	<hannes@cmpxchg.org>, <mhocko@suse.com>, <iamjoonsoo.kim@lge.com>,
	<alexs@kernel.org>, <david@redhat.com>, <minchan@kernel.org>,
	<richard.weiyang@gmail.com>, <linux-kernel@vger.kernel.org>,
	<linux-mm@kvack.org>
Subject: Re: [PATCH v2 3/5] swap: fix do_swap_page() race with swapoff
Date: Mon, 19 Apr 2021 14:54:50 +0800	[thread overview]
Message-ID: <38d0ac56-62e0-76b8-36ef-711089b88d91@huawei.com> (raw)
In-Reply-To: <87k0ozko5c.fsf@yhuang6-desk1.ccr.corp.intel.com>

On 2021/4/19 10:23, Huang, Ying wrote:
> Miaohe Lin <linmiaohe@huawei.com> writes:
> 
>> When I was investigating the swap code, I found the below possible race
>> window:
>>
>> CPU 1                                   	CPU 2
>> -----                                   	-----
>> do_swap_page
> 
> This is OK for swap cache cases.  So
> 
>   if (data_race(si->flags & SWP_SYNCHRONOUS_IO))
> 
> should be shown here.

Ok.

> 
>>   swap_readpage(skip swap cache case)
>>     if (data_race(sis->flags & SWP_FS_OPS)) {
>>                                         	swapoff
>> 					  	  p->flags = &= ~SWP_VALID;
>> 					  	  ..
>> 					  	  synchronize_rcu();
>> 					  	  ..
>> 					  	  p->swap_file = NULL;
>>     struct file *swap_file = sis->swap_file;
>>     struct address_space *mapping = swap_file->f_mapping;[oops!]
>>
>> Note that for the pages that are swapped in through swap cache, this isn't
>> an issue. Because the page is locked, and the swap entry will be marked
>> with SWAP_HAS_CACHE, so swapoff() can not proceed until the page has been
>> unlocked.
>>
>> Using current get/put_swap_device() to guard against concurrent swapoff for
>> swap_readpage() looks terrible because swap_readpage() may take really long
>> time. And this race may not be really pernicious because swapoff is usually
>> done when system shutdown only. To reduce the performance overhead on the
>> hot-path as much as possible, it appears we can use the percpu_ref to close
>> this race window(as suggested by Huang, Ying).
> 
> I still suggest to squash PATCH 1-3, at least PATCH 1-2.  That will
> change the relevant code together and make it easier to review.
> 

Will squash PATCH 1-2. Thanks.

> Best Regards,
> Huang, Ying
> 
>> Fixes: 0bcac06f27d7 ("mm,swap: skip swapcache for swapin of synchronous device")
>> Reported-by: kernel test robot <lkp@intel.com> (auto build test ERROR)
>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
>> ---
>>  include/linux/swap.h | 9 +++++++++
>>  mm/memory.c          | 9 +++++++++
>>  2 files changed, 18 insertions(+)
>>
>> diff --git a/include/linux/swap.h b/include/linux/swap.h
>> index 993693b38109..523c2411a135 100644
>> --- a/include/linux/swap.h
>> +++ b/include/linux/swap.h
>> @@ -528,6 +528,15 @@ static inline struct swap_info_struct *swp_swap_info(swp_entry_t entry)
>>  	return NULL;
>>  }
>>  
>> +static inline struct swap_info_struct *get_swap_device(swp_entry_t entry)
>> +{
>> +	return NULL;
>> +}
>> +
>> +static inline void put_swap_device(struct swap_info_struct *si)
>> +{
>> +}
>> +
>>  #define swap_address_space(entry)		(NULL)
>>  #define get_nr_swap_pages()			0L
>>  #define total_swap_pages			0L
>> diff --git a/mm/memory.c b/mm/memory.c
>> index 27014c3bde9f..7a2fe12cf641 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -3311,6 +3311,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
>>  {
>>  	struct vm_area_struct *vma = vmf->vma;
>>  	struct page *page = NULL, *swapcache;
>> +	struct swap_info_struct *si = NULL;
>>  	swp_entry_t entry;
>>  	pte_t pte;
>>  	int locked;
>> @@ -3338,6 +3339,10 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
>>  		goto out;
>>  	}
>>  
>> +	/* Prevent swapoff from happening to us. */
>> +	si = get_swap_device(entry);
>> +	if (unlikely(!si))
>> +		goto out;
>>  
>>  	delayacct_set_flag(current, DELAYACCT_PF_SWAPIN);
>>  	page = lookup_swap_cache(entry, vma, vmf->address);
>> @@ -3514,6 +3519,8 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
>>  unlock:
>>  	pte_unmap_unlock(vmf->pte, vmf->ptl);
>>  out:
>> +	if (si)
>> +		put_swap_device(si);
>>  	return ret;
>>  out_nomap:
>>  	pte_unmap_unlock(vmf->pte, vmf->ptl);
>> @@ -3525,6 +3532,8 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
>>  		unlock_page(swapcache);
>>  		put_page(swapcache);
>>  	}
>> +	if (si)
>> +		put_swap_device(si);
>>  	return ret;
>>  }
> .
> 


  reply	other threads:[~2021-04-19  6:56 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-17  9:40 [PATCH v2 0/5] close various race windows for swap Miaohe Lin
2021-04-17  9:40 ` [PATCH v2 1/5] mm/swapfile: add percpu_ref support " Miaohe Lin
2021-04-19  2:48   ` Huang, Ying
2021-04-19  2:48     ` Huang, Ying
2021-04-19  6:46     ` Miaohe Lin
2021-04-19  7:09       ` Huang, Ying
2021-04-19  7:09         ` Huang, Ying
2021-04-19  7:35         ` Miaohe Lin
2021-04-19  7:52           ` Huang, Ying
2021-04-19  7:52             ` Huang, Ying
2021-04-19  8:20             ` Miaohe Lin
2021-04-17  9:40 ` [PATCH v2 2/5] mm/swapfile: use percpu_ref to serialize against concurrent swapoff Miaohe Lin
2021-04-19  2:54   ` Huang, Ying
2021-04-19  2:54     ` Huang, Ying
2021-04-19  6:57     ` Miaohe Lin
2021-04-17  9:40 ` [PATCH v2 3/5] swap: fix do_swap_page() race with swapoff Miaohe Lin
2021-04-19  2:23   ` Huang, Ying
2021-04-19  2:23     ` Huang, Ying
2021-04-19  6:54     ` Miaohe Lin [this message]
2021-04-17  9:40 ` [PATCH v2 4/5] mm/swap: remove confusing checking for non_swap_entry() in swap_ra_info() Miaohe Lin
2021-04-19  1:53   ` Huang, Ying
2021-04-19  1:53     ` Huang, Ying
2021-04-19  6:46     ` Miaohe Lin
2021-04-17  9:40 ` [PATCH v2 5/5] mm/shmem: fix shmem_swapin() race with swapoff Miaohe Lin
2021-04-19  2:15   ` Huang, Ying
2021-04-19  2:15     ` Huang, Ying
2021-04-19  6:49     ` Miaohe Lin
2021-04-19  7:04       ` Huang, Ying
2021-04-19  7:04         ` Huang, Ying
2021-04-19  7:14         ` Miaohe Lin
2021-04-19  7:41           ` Huang, Ying
2021-04-19  7:41             ` Huang, Ying
2021-04-19  8:18             ` Miaohe Lin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=38d0ac56-62e0-76b8-36ef-711089b88d91@huawei.com \
    --to=linmiaohe@huawei.com \
    --cc=akpm@linux-foundation.org \
    --cc=alexs@kernel.org \
    --cc=david@redhat.com \
    --cc=dennis@kernel.org \
    --cc=hannes@cmpxchg.org \
    --cc=hughd@google.com \
    --cc=iamjoonsoo.kim@lge.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=minchan@kernel.org \
    --cc=richard.weiyang@gmail.com \
    --cc=tim.c.chen@linux.intel.com \
    --cc=ying.huang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.