linux-kernel.vger.kernel.org archive mirror
From: Miaohe Lin <linmiaohe@huawei.com>
To: "Huang, Ying" <ying.huang@intel.com>
Cc: <akpm@linux-foundation.org>, <dennis@kernel.org>,
	<tim.c.chen@linux.intel.com>, <hughd@google.com>,
	<hannes@cmpxchg.org>, <mhocko@suse.com>, <iamjoonsoo.kim@lge.com>,
	<alexs@kernel.org>, <david@redhat.com>, <minchan@kernel.org>,
	<richard.weiyang@gmail.com>, <linux-kernel@vger.kernel.org>,
	<linux-mm@kvack.org>
Subject: Re: [PATCH v2 2/5] mm/swapfile: use percpu_ref to serialize against concurrent swapoff
Date: Mon, 19 Apr 2021 14:57:13 +0800
Message-ID: <16a20b86-b690-9397-def1-1171828c245e@huawei.com>
In-Reply-To: <87a6pvkmqi.fsf@yhuang6-desk1.ccr.corp.intel.com>

On 2021/4/19 10:54, Huang, Ying wrote:
> Miaohe Lin <linmiaohe@huawei.com> writes:
> 
>> Use percpu_ref to serialize against concurrent swapoff. Also remove the
>> SWP_VALID flag because it's used together with RCU solution.
>>
>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
>> ---
>>  include/linux/swap.h |  3 +--
>>  mm/swapfile.c        | 43 +++++++++++++++++--------------------------
>>  2 files changed, 18 insertions(+), 28 deletions(-)
>>
>> diff --git a/include/linux/swap.h b/include/linux/swap.h
>> index 8be36eb58b7a..993693b38109 100644
>> --- a/include/linux/swap.h
>> +++ b/include/linux/swap.h
>> @@ -177,7 +177,6 @@ enum {
>>  	SWP_PAGE_DISCARD = (1 << 10),	/* freed swap page-cluster discards */
>>  	SWP_STABLE_WRITES = (1 << 11),	/* no overwrite PG_writeback pages */
>>  	SWP_SYNCHRONOUS_IO = (1 << 12),	/* synchronous IO is efficient */
>> -	SWP_VALID	= (1 << 13),	/* swap is valid to be operated on? */
>>  					/* add others here before... */
>>  	SWP_SCANNING	= (1 << 14),	/* refcount in scan_swap_map */
>>  };
>> @@ -514,7 +513,7 @@ sector_t swap_page_sector(struct page *page);
>>  
>>  static inline void put_swap_device(struct swap_info_struct *si)
>>  {
>> -	rcu_read_unlock();
>> +	percpu_ref_put(&si->users);
>>  }
>>  
>>  #else /* CONFIG_SWAP */
>> diff --git a/mm/swapfile.c b/mm/swapfile.c
>> index 66515a3a2824..90e197bc2eeb 100644
>> --- a/mm/swapfile.c
>> +++ b/mm/swapfile.c
>> @@ -1279,18 +1279,12 @@ static unsigned char __swap_entry_free_locked(struct swap_info_struct *p,
>>   * via preventing the swap device from being swapoff, until
>>   * put_swap_device() is called.  Otherwise return NULL.
>>   *
>> - * The entirety of the RCU read critical section must come before the
>> - * return from or after the call to synchronize_rcu() in
>> - * enable_swap_info() or swapoff().  So if "si->flags & SWP_VALID" is
>> - * true, the si->map, si->cluster_info, etc. must be valid in the
>> - * critical section.
>> - *
>>   * Notice that swapoff or swapoff+swapon can still happen before the
>> - * rcu_read_lock() in get_swap_device() or after the rcu_read_unlock()
>> - * in put_swap_device() if there isn't any other way to prevent
>> - * swapoff, such as page lock, page table lock, etc.  The caller must
>> - * be prepared for that.  For example, the following situation is
>> - * possible.
>> + * percpu_ref_tryget_live() in get_swap_device() or after the
>> + * percpu_ref_put() in put_swap_device() if there isn't any other way
>> + * to prevent swapoff, such as page lock, page table lock, etc.  The
>> + * caller must be prepared for that.  For example, the following
>> + * situation is possible.
>>   *
>>   *   CPU1				CPU2
>>   *   do_swap_page()
>> @@ -1318,21 +1312,24 @@ struct swap_info_struct *get_swap_device(swp_entry_t entry)
>>  	si = swp_swap_info(entry);
>>  	if (!si)
>>  		goto bad_nofile;
>> -
>> -	rcu_read_lock();
>> -	if (data_race(!(si->flags & SWP_VALID)))
>> -		goto unlock_out;
>> +	if (!percpu_ref_tryget_live(&si->users))
>> +		goto out;
>> +	/*
>> +	 * Guarantee we will not reference uninitialized fields
>> +	 * of swap_info_struct.
>> +	 */
> 
> /*
>  * Guarantee that si->users is checked before accessing other fields of
>  * swap_info_struct.
>  */
> 
>> +	smp_rmb();
> 
> Usually, smp_rmb() needs to be paired with smp_wmb().  Some comments are
> needed for that.  Here smp_rmb() is paired with the spin_unlock() after
> setup_swap_info() in enable_swap_info().
> 
>>  	offset = swp_offset(entry);
>>  	if (offset >= si->max)
>> -		goto unlock_out;
>> +		goto put_out;
>>  
>>  	return si;
>>  bad_nofile:
>>  	pr_err("%s: %s%08lx\n", __func__, Bad_file, entry.val);
>>  out:
>>  	return NULL;
>> -unlock_out:
>> -	rcu_read_unlock();
>> +put_out:
>> +	percpu_ref_put(&si->users);
>>  	return NULL;
>>  }
>>  
>> @@ -2475,7 +2472,7 @@ static void setup_swap_info(struct swap_info_struct *p, int prio,
>>  
>>  static void _enable_swap_info(struct swap_info_struct *p)
>>  {
>> -	p->flags |= SWP_WRITEOK | SWP_VALID;
>> +	p->flags |= SWP_WRITEOK;
>>  	atomic_long_add(p->pages, &nr_swap_pages);
>>  	total_swap_pages += p->pages;
>>  
>> @@ -2507,7 +2504,7 @@ static void enable_swap_info(struct swap_info_struct *p, int prio,
>>  	spin_unlock(&swap_lock);
>>  	/*
>>  	 * Guarantee swap_map, cluster_info, etc. fields are valid
>> -	 * between get/put_swap_device() if SWP_VALID bit is set
>> +	 * between get/put_swap_device().
>>  	 */
> 
> The comment needs to be revised.  Something like the below?
> 
> /* Finished initializing swap device, now it's safe to reference it */
> 

All looks good to me. Will do. Many thanks!

> Best Regards,
> Huang, Ying
> 
>>  	percpu_ref_resurrect(&p->users);
>>  	spin_lock(&swap_lock);
>> @@ -2625,12 +2622,6 @@ SYSCALL_DEFINE1(swapoff, const char __user *, specialfile)
>>  
>>  	reenable_swap_slots_cache_unlock();
>>  
>> -	spin_lock(&swap_lock);
>> -	spin_lock(&p->lock);
>> -	p->flags &= ~SWP_VALID;		/* mark swap device as invalid */
>> -	spin_unlock(&p->lock);
>> -	spin_unlock(&swap_lock);
>> -
>>  	percpu_ref_kill(&p->users);
>>  	/*
>>  	 * We need synchronize_rcu() here to protect the accessing
> 
> .
> 
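
To make the ordering discussed above concrete, here is a minimal userspace
sketch of the publish/consume pairing. It is not the kernel code, only a
model under stated assumptions: the toy_* names and struct toy_swap_info
are hypothetical, C11 atomics stand in for the kernel primitives, a release
store plays the role of the spin_unlock() in enable_swap_info() followed by
percpu_ref_resurrect(), and an acquire fence plays the role of the
smp_rmb() in get_swap_device(). The sketch deliberately does not model how
swapoff waits for existing users to drain after percpu_ref_kill().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct swap_info_struct (not the kernel type). */
struct toy_swap_info {
	atomic_bool live;	/* models "si->users has not been killed" */
	atomic_long users;	/* models the reference count in si->users */
	unsigned long max;	/* plain field that must be set up before use */
};

void toy_init(struct toy_swap_info *si)
{
	atomic_init(&si->live, false);
	atomic_init(&si->users, 0);
	si->max = 0;
}

/* Models enable_swap_info(): set up the fields first, then publish. */
void toy_enable(struct toy_swap_info *si, unsigned long max)
{
	si->max = max;
	/* Release: the write to ->max is visible before ->live reads true. */
	atomic_store_explicit(&si->live, true, memory_order_release);
}

/* Models put_swap_device(). */
void toy_put(struct toy_swap_info *si)
{
	long old = atomic_fetch_sub_explicit(&si->users, 1,
					     memory_order_release);
	assert(old > 0);	/* a put without a matching get is a bug */
}

/* Models get_swap_device(); returns NULL when the device is unusable. */
struct toy_swap_info *toy_get(struct toy_swap_info *si, unsigned long offset)
{
	if (!atomic_load_explicit(&si->live, memory_order_relaxed))
		return NULL;
	atomic_fetch_add_explicit(&si->users, 1, memory_order_relaxed);
	/* Stand-in for smp_rmb(): ->live is checked before ->max is read. */
	atomic_thread_fence(memory_order_acquire);
	if (offset >= si->max) {
		toy_put(si);
		return NULL;
	}
	return si;
}

/* Models only the "no new references" half of percpu_ref_kill(). */
void toy_kill(struct toy_swap_info *si)
{
	atomic_store_explicit(&si->live, false, memory_order_release);
}
```

A caller would pair every successful toy_get() with one toy_put(), the
same way get_swap_device() and put_swap_device() are paired in the patch.
The point of the sketch is only that the reader-side barrier needs a
writer-side counterpart; which kernel primitive provides that counterpart
is exactly what the requested comment should document.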

