Subject: Re: [Question] Is there a race window between swapoff vs synchronous swap_readpage
To: "Huang, Ying"
CC: Linux-MM, linux-kernel, Andrew Morton, Matthew Wilcox, Yu Zhao, "Shakeel Butt", Alex Shi, "Minchan Kim"
From: Miaohe Lin
Message-ID: <7d2126a2-e67e-cadb-d732-77f8d54a2f0c@huawei.com>
Date: Tue, 30 Mar 2021 19:21:54 +0800
In-Reply-To: <87h7kt9ufw.fsf@yhuang6-desk1.ccr.corp.intel.com>

On 2021/3/30 11:44, Huang, Ying wrote:
> Miaohe Lin writes:
>
>> On 2021/3/30 9:57, Huang, Ying wrote:
>>> Hi, Miaohe,
>>>
>>> Miaohe Lin writes:
>>>
>>>> Hi all,
>>>> I am investigating the swap code, and I found the below possible race window:
>>>>
>>>> CPU 1                                          CPU 2
>>>> -----                                          -----
>>>> do_swap_page
>>>>   skip swapcache case (synchronous swap_readpage)
>>>>     alloc_page_vma
>>>>                                                swapoff
>>>>                                                  release swap_file, bdev, or ...
>>>>       swap_readpage
>>>>         check sis->flags is ok
>>>>           access swap_file, bdev or ...[oops!]
>>>>                                                    si->flags = 0
>>>>
>>>> The swapcache case is ok because swapoff will wait on the page_lock of the swapcache page.
>>>> Will this really happen, or am I missing something?
>>>> Any reply would be greatly appreciated. Thanks! :)
>>>
>>> This appears possible. Even for the swapcache case, we can't guarantee the
>>
>> Many thanks for the reply!
>>
>>> swap entry gotten from the page table is always valid too. The
>>
>> The page table may change at any time, and we may thus do some useless work.
>> But the pte_same() check could handle these races correctly as long as they do not
>> result in an oops.
>>
>>> underlying swap device can be swapped off at the same time. So we use
>>> get/put_swap_device() for that. Maybe we need similar stuff here.
>>
>> Using get/put_swap_device() to guard against swapoff for swap_readpage() sounds
>> really bad, as swap_readpage() may take a really long time. Also, such a race may not be
>> really hurtful because swapoff is usually done only at system shutdown.
>> I cannot figure out a simple and stable fix for this. Any suggestions, or
>> could anyone help get rid of this race?
>
> Some reference counting on the swap device can prevent the swap device from
> being swapped off. To reduce the performance overhead on the hot path as
> much as possible, it appears we can use the percpu_ref.
>

Sounds like a good idea. Many thanks for your suggestion. :)

> Best Regards,
> Huang, Ying
> .
>
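
To make the reference-counting idea above concrete, here is a small user-space model in Python. It is only a sketch under stated assumptions, not kernel code: the class SwapDeviceModel, its methods, and read_page() are hypothetical names invented for illustration, and a plain lock-protected counter stands in for percpu_ref, so it shows the ordering guarantee but none of the per-CPU fast path. The roles loosely mirror percpu_ref_tryget_live(), percpu_ref_put(), and percpu_ref_kill() followed by waiting for the count to drain.

```python
import threading


class SwapDeviceModel:
    """Toy stand-in for a swap device guarded by a reference count (hypothetical)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._drained = threading.Condition(self._lock)
        self._users = 0          # readers currently inside the guarded section
        self._dying = False      # set once swapoff has started
        self.backing = "bdev"    # placeholder for swap_file / bdev

    def tryget(self):
        """Enter the guarded section; fails once swapoff has begun."""
        with self._lock:
            if self._dying:
                return False
            self._users += 1
            return True

    def put(self):
        """Leave the guarded section and wake a waiting swapoff if idle."""
        with self._lock:
            self._users -= 1
            if self._users == 0:
                self._drained.notify_all()

    def swapoff(self):
        """Refuse new readers, wait for current ones, then release the backing."""
        with self._lock:
            self._dying = True
            while self._users > 0:
                self._drained.wait()
            self.backing = None


def read_page(dev):
    """Model of the synchronous read path: touch the backing only under a reference."""
    if not dev.tryget():
        return None              # device is going away; the caller would retry the fault
    try:
        return f"read from {dev.backing}"
    finally:
        dev.put()
```

In this model a reader that already holds a reference keeps the backing alive until its put(), and a reader arriving after swapoff() has started gets None instead of touching released state, which is the window shown in the CPU 1 / CPU 2 diagram. The concern about long-running reads still applies: swapoff() blocks for as long as any reader holds its reference.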