From: chengming.zhou@linux.dev
To: hannes@cmpxchg.org, yosryahmed@google.com, nphamcs@gmail.com,
	chengming.zhou@linux.dev, akpm@linux-foundation.org
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Zhongkun He
Subject: [RFC PATCH] mm: add folio in swapcache if swapin from zswap
Date: Sat, 23 Mar 2024 00:39:39 +0800
Message-Id: <20240322163939.17846-1-chengming.zhou@linux.dev>

From: Chengming Zhou

There is a report of data corruption caused by double swapin, which is
only possible in the skip-swapcache path on SWP_SYNCHRONOUS_IO backends.

The root cause is that zswap is not like other "normal" swap backends:
it does not keep a copy of the data after the first swapin. So if the
folio from the first swapin cannot be installed in the page table, we
just free it directly. Then, on the second swapin, we cannot find
anything in zswap and read wrong data from the swapfile, which is how
the data corruption happens.

We can fix it by always adding the folio to the swapcache when we know
the pinned swap entry can be found in zswap, so the folio is not freed
even if it cannot be installed during the first swapin.

The check for whether the swap entry is in zswap must happen after the
entry is pinned; only then is the result of the check stable.

Reported-by: Zhongkun He
Closes: https://lore.kernel.org/all/CACSyD1N+dUvsu8=zV9P691B9bVq33erwOXNTmEaUbi9DrDeJzw@mail.gmail.com
Signed-off-by: Chengming Zhou
---
 include/linux/zswap.h |  6 ++++++
 mm/memory.c           | 28 ++++++++++++++++++++++++----
 mm/zswap.c            | 10 ++++++++++
 3 files changed, 40 insertions(+), 4 deletions(-)

diff --git a/include/linux/zswap.h b/include/linux/zswap.h
index 2a85b941db97..180d0b1f0886 100644
--- a/include/linux/zswap.h
+++ b/include/linux/zswap.h
@@ -36,6 +36,7 @@ void zswap_memcg_offline_cleanup(struct mem_cgroup *memcg);
 void zswap_lruvec_state_init(struct lruvec *lruvec);
 void zswap_folio_swapin(struct folio *folio);
 bool is_zswap_enabled(void);
+bool zswap_find(swp_entry_t swp);
 #else
 
 struct zswap_lruvec_state {};
@@ -65,6 +66,11 @@ static inline bool is_zswap_enabled(void)
 	return false;
 }
 
+static inline bool zswap_find(swp_entry_t swp)
+{
+	return false;
+}
+
 #endif
 
 #endif /* _LINUX_ZSWAP_H */
diff --git a/mm/memory.c b/mm/memory.c
index 4f2caf1c3c4d..a564b2b8faca 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4031,18 +4031,38 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 					ret = VM_FAULT_OOM;
 					goto out_page;
 				}
+
+				/*
+				 * We have to add the folio into swapcache if
+				 * this pinned swap entry is found in zswap,
+				 * which won't keep copy of data after swapin.
+				 * Or data will just get lost if later folio
+				 * can't be installed successfully in pagetable.
+				 */
+				if (zswap_find(entry)) {
+					if (add_to_swap_cache(folio, entry,
+							GFP_KERNEL, &shadow)) {
+						ret = VM_FAULT_OOM;
+						goto out_page;
+					}
+					swapcache = folio;
+					need_clear_cache = false;
+				} else {
+					shadow = get_shadow_from_swap_cache(entry);
+					/* To provide entry to swap_read_folio() */
+					folio->swap = entry;
+				}
+
 				mem_cgroup_swapin_uncharge_swap(entry);
 
-				shadow = get_shadow_from_swap_cache(entry);
 				if (shadow)
 					workingset_refault(folio, shadow);
 
 				folio_add_lru(folio);
 
-				/* To provide entry to swap_read_folio() */
-				folio->swap = entry;
 				swap_read_folio(folio, true, NULL);
-				folio->private = NULL;
+				if (need_clear_cache)
+					folio->private = NULL;
 			}
 		} else {
 			page = swapin_readahead(entry, GFP_HIGHUSER_MOVABLE,
diff --git a/mm/zswap.c b/mm/zswap.c
index c4979c76d58e..84a904a788a3 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -1601,6 +1601,16 @@ void zswap_invalidate(swp_entry_t swp)
 		zswap_entry_free(entry);
 }
 
+bool zswap_find(swp_entry_t swp)
+{
+	pgoff_t offset = swp_offset(swp);
+	struct xarray *tree = swap_zswap_tree(swp);
+	struct zswap_entry *entry;
+
+	entry = xa_find(tree, &offset, offset, XA_PRESENT);
+	return entry != NULL;
+}
+
 int zswap_swapon(int type, unsigned long nr_pages)
 {
 	struct xarray *trees, *tree;
-- 
2.20.1