Date: Fri, 22 Mar 2024 22:33:13 +0000
References: <20240322163939.17846-1-chengming.zhou@linux.dev>
Subject: Re: [RFC PATCH] mm: add folio in swapcache if swapin from zswap
From: Yosry Ahmed
To: Barry Song <21cnbao@gmail.com>
Cc: chengming.zhou@linux.dev, hannes@cmpxchg.org, nphamcs@gmail.com,
    akpm@linux-foundation.org, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, Zhongkun He

On Sat, Mar 23, 2024 at 10:41:32AM +1300, Barry Song wrote:
> On Sat, Mar 23, 2024 at 8:38 AM Yosry Ahmed wrote:
> >
> > On Fri, Mar 22, 2024 at 9:40 AM wrote:
> > >
> > > From: Chengming Zhou
> > >
> > > There is a report of data corruption caused by double swapin, which is
> > > only possible in the skip swapcache path on SWP_SYNCHRONOUS_IO backends.
> > >
> > > The root cause is that zswap is not like other "normal" swap backends,
> > > it won't keep the copy of data after the first time of swapin. So if
>
> I don't quite understand this, so once we load a page from zswap, zswap
> will free it even though do_swap_page might not set it to PTE?
>
> shouldn't zswap free the memory after notify_free just like zram?

It's an optimization that zswap has: exclusive loads. After a page is
swapped in, it can stick around in the swapcache for a while. In this
case, there would be two copies in memory with zram (compressed and
uncompressed). Zswap implements exclusive loads to drop the compressed
copy. The folio is marked as dirty so that any attempt to reclaim it
causes a new write (compression) to zswap. It also allows for a lot of
cleanups and more straightforward entry lifetime tracking in zswap.

It is mostly fine; the problem here happens because we skip the
swapcache during swapin, so there is a possibility that we load the
folio from zswap and then just drop it without stashing it anywhere.

>
> > > the folio in the first time of swapin can't be installed in the pagetable
> > > successfully and we just free it directly. Then in the second time of
> > > swapin, we can't find anything in zswap and read wrong data from swapfile,
> > > so this data corruption problem happened.
> > >
> > > We can fix it by always adding the folio into swapcache if we know the
> > > pinned swap entry can be found in zswap, so it won't get freed even though
> > > it can't be installed successfully in the first time of swapin.
> >
> > A concurrent faulting thread could have already checked the swapcache
> > before we add the folio to it, right?
> > In this case, that thread will
> > go ahead and call swap_read_folio() anyway.
> >
> > Also, I suspect the zswap lookup might hurt performance. Would it be
> > better to add the folio back to zswap upon failure? This should be
> > detectable by checking if the folio is dirty as I mentioned in the bug
> > report thread.
>
> I don't like the idea either as sync-io is the fast path for zram etc.
> or, can we use the way of zram to free compressed data?

I don't think we want to stop doing exclusive loads in zswap due to this
interaction with zram, which shouldn't be common.

I think we can solve this by just writing the folio back to zswap upon
failure as I mentioned.
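
To make the failure mode and the store-back idea concrete, here is a
rough userspace model. It is not kernel code and not a patch: every name
in it (toy_zswap, toy_folio, toy_fault, and so on) is made up for
illustration. It models a single swap entry, an exclusive load that drops
the compressed copy and marks the folio dirty, and a fault that either
installs the folio or drops it. With store_back_on_failure set, a dirty
folio is stored back into the toy zswap before being dropped.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model, NOT kernel code. All names are hypothetical. */

#define GOOD_DATA   42  /* what the page really contains */
#define STALE_DATA  -1  /* whatever sits in the swap slot on the device */

struct toy_zswap {
	bool present;	/* is there a compressed copy for the entry? */
	int data;
};

struct toy_folio {
	int data;
	bool dirty;
};

/* Exclusive load: give the data to the folio, drop the compressed copy. */
static bool toy_zswap_load(struct toy_zswap *z, struct toy_folio *f)
{
	if (!z->present)
		return false;
	f->data = z->data;
	f->dirty = true;	/* the folio now holds the only copy */
	z->present = false;
	return true;
}

static void toy_zswap_store(struct toy_zswap *z, const struct toy_folio *f)
{
	assert(!z->present);	/* in this model we only store after a load */
	z->data = f->data;
	z->present = true;
}

/* Swapin that skips the swapcache: try zswap, else read the device. */
static void toy_swap_read(struct toy_zswap *z, struct toy_folio *f)
{
	if (!toy_zswap_load(z, f)) {
		f->data = STALE_DATA;
		f->dirty = false;
	}
}

/*
 * One fault. Returns the data that got mapped, or 0 if the folio could
 * not be installed and the fault has to be retried. On failure the folio
 * is dropped; if store_back_on_failure is set, a dirty folio is written
 * back to the toy zswap first.
 */
static int toy_fault(struct toy_zswap *z, bool install_ok,
		     bool store_back_on_failure)
{
	struct toy_folio f;

	toy_swap_read(z, &f);
	if (install_ok)
		return f.data;
	if (store_back_on_failure && f.dirty)
		toy_zswap_store(z, &f);
	return 0;
}

/* First fault fails to install, the retry succeeds. */
static int toy_double_swapin(bool store_back_on_failure)
{
	struct toy_zswap z = { .present = true, .data = GOOD_DATA };

	toy_fault(&z, false, store_back_on_failure);
	return toy_fault(&z, true, store_back_on_failure);
}
```

In this model, toy_double_swapin(false) ends up mapping STALE_DATA, which
is the corruption case, while toy_double_swapin(true) maps GOOD_DATA
because the retry finds the copy that was stored back. The dirty check is
what keeps the extra store off the path where the data came from the
device.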