All of lore.kernel.org
 help / color / mirror / Atom feed
From: Miaohe Lin <linmiaohe@huawei.com>
To: "Huang, Ying" <ying.huang@intel.com>
Cc: <akpm@linux-foundation.org>, <hannes@cmpxchg.org>,
	<mhocko@suse.com>, <iamjoonsoo.kim@lge.com>, <vbabka@suse.cz>,
	<alex.shi@linux.alibaba.com>, <willy@infradead.org>,
	<minchan@kernel.org>, <richard.weiyang@gmail.com>,
	<hughd@google.com>, <tim.c.chen@linux.intel.com>,
	<linux-kernel@vger.kernel.org>, <linux-mm@kvack.org>
Subject: Re: [PATCH 2/5] swap: fix do_swap_page() race with swapoff
Date: Wed, 14 Apr 2021 11:27:21 +0800	[thread overview]
Message-ID: <489bcf5f-1441-a775-913d-cc8b54dc6a2a@huawei.com> (raw)
In-Reply-To: <87eefdsgwx.fsf@yhuang6-desk1.ccr.corp.intel.com>

On 2021/4/14 11:07, Huang, Ying wrote:
> Miaohe Lin <linmiaohe@huawei.com> writes:
> 
>> On 2021/4/13 9:27, Huang, Ying wrote:
>>> Miaohe Lin <linmiaohe@huawei.com> writes:
>>>
>>>> When I was investigating the swap code, I found the below possible race
>>>> window:
>>>>
>>>> CPU 1					CPU 2
>>>> -----					-----
>>>> do_swap_page
>>>>   synchronous swap_readpage
>>>>     alloc_page_vma
>>>> 					swapoff
>>>> 					  release swap_file, bdev, or ...
>>>>       swap_readpage
>>>> 	check sis->flags is ok
>>>> 	  access swap_file, bdev...[oops!]
>>>> 					    si->flags = 0
>>>>
>>>> Using current get/put_swap_device() to guard against concurrent swapoff for
>>>> swap_readpage() looks terrible because swap_readpage() may take really long
>>>> time. And this race may not be really pernicious because swapoff is usually
>>>> done when system shutdown only. To reduce the performance overhead on the
>>>> hot-path as much as possible, it appears we can use the percpu_ref to close
>>>> this race window(as suggested by Huang, Ying).
>>>>
>>>> Fixes: 235b62176712 ("mm/swap: add cluster lock")
>>>
>>> This isn't the commit that introduces the race.  You can use `git blame`
>>> find out the correct commit.  For this it's commit 0bcac06f27d7 "mm,
>>> swap: skip swapcache for swapin of synchronous device".
>>>
>>
>> Sorry about it! What I refer to is commit eb085574a752 ("mm, swap: fix race between
>> swapoff and some swap operations"). And I think this commit does not fix the race
>> condition completely, so I reuse the Fixes tag inside it.
>>
>>> And I suggest to merge 1/5 and 2/5 to make it easy to get the full
>>> picture.
>>>
>>>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
>>>> ---
>>>>  include/linux/swap.h |  2 +-
>>>>  mm/memory.c          | 10 ++++++++++
>>>>  mm/swapfile.c        | 28 +++++++++++-----------------
>>>>  3 files changed, 22 insertions(+), 18 deletions(-)
>>>>
>>>> diff --git a/include/linux/swap.h b/include/linux/swap.h
>>>> index 849ba5265c11..9066addb57fd 100644
>>>> --- a/include/linux/swap.h
>>>> +++ b/include/linux/swap.h
>>>> @@ -513,7 +513,7 @@ sector_t swap_page_sector(struct page *page);
>>>>  
>>>>  static inline void put_swap_device(struct swap_info_struct *si)
>>>>  {
>>>> -	rcu_read_unlock();
>>>> +	percpu_ref_put(&si->users);
>>>>  }
>>>>  
>>>>  #else /* CONFIG_SWAP */
>>>> diff --git a/mm/memory.c b/mm/memory.c
>>>> index cc71a445c76c..8543c47b955c 100644
>>>> --- a/mm/memory.c
>>>> +++ b/mm/memory.c
>>>> @@ -3311,6 +3311,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
>>>>  {
>>>>  	struct vm_area_struct *vma = vmf->vma;
>>>>  	struct page *page = NULL, *swapcache;
>>>> +	struct swap_info_struct *si = NULL;
>>>>  	swp_entry_t entry;
>>>>  	pte_t pte;
>>>>  	int locked;
>>>> @@ -3339,6 +3340,11 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
>>>>  	}
>>>>  
>>>>
>>>
>>> I suggest to add comments here as follows (words copy from Matthew Wilcox)
>>>
>>> 	/* Prevent swapoff from happening to us */
>>
>> Ok.
>>
>>>
>>>> +	si = get_swap_device(entry);
>>>> +	/* In case we raced with swapoff. */
>>>> +	if (unlikely(!si))
>>>> +		goto out;
>>>> +
>>>
>>> Because we wrap the whole do_swap_page() with get/put_swap_device()
>>> now.  We can remove several get/put_swap_device() for function called by
>>> do_swap_page().  That can be another optimization patch.
>>
>> I tried to remove several get/put_swap_device() for function called
>> by do_swap_page() only before I send this series. But it seems they have
>> other callers without proper get/put_swap_device().
> 
> Then we need to revise these callers instead.  Anyway, can be another
> series.

Yes. can be another series.
Thanks.

> 
> Best Regards,
> Huang, Ying
> 
> .
> 


  reply	other threads:[~2021-04-14  3:27 UTC|newest]

Thread overview: 73+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-08 13:08 [PATCH 0/5] close various race windows for swap Miaohe Lin
2021-04-08 13:08 ` [PATCH 1/5] mm/swapfile: add percpu_ref support " Miaohe Lin
2021-04-12  3:30   ` Huang, Ying
2021-04-12  3:30     ` Huang, Ying
2021-04-12  6:59     ` Miaohe Lin
2021-04-12  7:24     ` Huang, Ying
2021-04-12  7:24       ` Huang, Ying
2021-04-13 12:39       ` Miaohe Lin
2021-04-14  1:17         ` Huang, Ying
2021-04-14  1:17           ` Huang, Ying
2021-04-14  1:58           ` Miaohe Lin
2021-04-14  2:06             ` Huang, Ying
2021-04-14  2:06               ` Huang, Ying
2021-04-14  3:44               ` Dennis Zhou
2021-04-14  3:59                 ` Huang, Ying
2021-04-14  3:59                   ` Huang, Ying
2021-04-14  4:05                   ` Dennis Zhou
2021-04-14  5:44                     ` Huang, Ying
2021-04-14  5:44                       ` Huang, Ying
2021-04-14 14:53                       ` Dennis Zhou
2021-04-15  3:16                         ` Miaohe Lin
2021-04-15  4:20                           ` Dennis Zhou
2021-04-15  9:17                             ` Miaohe Lin
2021-04-15  5:24                         ` Huang, Ying
2021-04-15  5:24                           ` Huang, Ying
2021-04-15 14:31                           ` Dennis Zhou
2021-04-16  0:54                             ` Huang, Ying
2021-04-16  0:54                               ` Huang, Ying
2021-04-16  2:27                             ` Miaohe Lin
2021-04-16  6:25                               ` Huang, Ying
2021-04-16  6:25                                 ` Huang, Ying
2021-04-16  8:30                                 ` Miaohe Lin
2021-04-08 13:08 ` [PATCH 2/5] swap: fix do_swap_page() race with swapoff Miaohe Lin
2021-04-08 21:34   ` Tim Chen
2021-04-09  8:42     ` Miaohe Lin
2021-04-09 17:17       ` Tim Chen
2021-04-10  3:17         ` Miaohe Lin
2021-04-12  1:44           ` Huang, Ying
2021-04-12  1:44             ` Huang, Ying
2021-04-12  3:24             ` Miaohe Lin
2021-04-08 21:37   ` kernel test robot
2021-04-09  8:46     ` Miaohe Lin
2021-04-08 22:56   ` kernel test robot
2021-04-13  1:27   ` Huang, Ying
2021-04-13  1:27     ` Huang, Ying
2021-04-13 19:24     ` Tim Chen
2021-04-14  1:04       ` Huang, Ying
2021-04-14  1:04         ` Huang, Ying
2021-04-14  2:20         ` Miaohe Lin
2021-04-14 16:13         ` Tim Chen
2021-04-15  3:19           ` Miaohe Lin
2021-04-14  2:55     ` Miaohe Lin
2021-04-14  3:07       ` Huang, Ying
2021-04-14  3:07         ` Huang, Ying
2021-04-14  3:27         ` Miaohe Lin [this message]
2021-04-08 13:08 ` [PATCH 3/5] mm/swap_state: fix get_shadow_from_swap_cache() " Miaohe Lin
2021-04-13  1:33   ` Huang, Ying
2021-04-13  1:33     ` Huang, Ying
2021-04-14  2:42     ` Miaohe Lin
2021-04-08 13:08 ` [PATCH 4/5] mm/swap_state: fix potential faulted in race in swap_ra_info() Miaohe Lin
2021-04-09  8:50   ` Huang, Ying
2021-04-09  8:50     ` Huang, Ying
2021-04-09  9:00     ` Miaohe Lin
2021-04-12  0:55       ` Huang, Ying
2021-04-12  0:55         ` Huang, Ying
2021-04-12  3:17         ` Miaohe Lin
2021-04-08 13:08 ` [PATCH 5/5] mm/swap_state: fix swap_cluster_readahead() race with swapoff Miaohe Lin
2021-04-13  1:36   ` Huang, Ying
2021-04-13  1:36     ` Huang, Ying
2021-04-14  2:43     ` Miaohe Lin
2021-04-08 14:55 ` [PATCH 0/5] close various race windows for swap riteshh
2021-04-09  8:01   ` Miaohe Lin
2021-04-08 20:46 [PATCH 2/5] swap: fix do_swap_page() race with swapoff kernel test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=489bcf5f-1441-a775-913d-cc8b54dc6a2a@huawei.com \
    --to=linmiaohe@huawei.com \
    --cc=akpm@linux-foundation.org \
    --cc=alex.shi@linux.alibaba.com \
    --cc=hannes@cmpxchg.org \
    --cc=hughd@google.com \
    --cc=iamjoonsoo.kim@lge.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=minchan@kernel.org \
    --cc=richard.weiyang@gmail.com \
    --cc=tim.c.chen@linux.intel.com \
    --cc=vbabka@suse.cz \
    --cc=willy@infradead.org \
    --cc=ying.huang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.