Date: Thu, 20 Aug 2020 12:41:20 +0800
From: Gao Xiang
To: "Huang, Ying"
Cc: Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Rafael Aquini, Carlos Maiolino, Eric Sandeen, stable
Subject: Re: [PATCH] mm, THP, swap: fix allocating cluster for swapfile by mistake
Message-ID: <20200820044120.GB12374@xiangao.remote.csb>
References: <20200819195613.24269-1-hsiangkao@redhat.com> <871rk2x7bb.fsf@yhuang-dev.intel.com>
In-Reply-To: <871rk2x7bb.fsf@yhuang-dev.intel.com>

Hi Ying,

On Thu, Aug 20, 2020 at 12:36:08PM +0800, Huang, Ying wrote:
> Gao Xiang writes:
>
> > SWP_FS doesn't mean the device is a file-backed swap device;
> > it just means each writeback request should go through the fs
> > by DIO.
> > Or it'll just use extents added by .swap_activate(),
> > but it also works as a file-backed swap device.
> >
> > So in order to achieve the goal of the original patch,
> > SWP_BLKDEV should be used instead.
> >
> > FS corruption can be observed with SSD device + XFS +
> > fragmented swapfile due to CONFIG_THP_SWAP=y.
> >
> > Fixes: f0eea189e8e9 ("mm, THP, swap: Don't allocate huge cluster for file backed swap device")
> > Fixes: 38d8b4e6bdc8 ("mm, THP, swap: delay splitting THP during swap out")
> > Cc: "Huang, Ying"
> > Cc: stable
> > Signed-off-by: Gao Xiang
>
> Good catch! The fix itself looks good to me! Although the description is
> a little confusing.
>
> After some digging, it seems that SWP_FS is set on the swap devices
> which make swap entry read/write go through the file system specific
> callback (now used by swap over NFS only).

Okay, let me send out v2 with the updated commit message in
https://lore.kernel.org/r/20200820012409.GB5846@xiangao.remote.csb/

Thanks,
Gao Xiang

>
> Best Regards,
> Huang, Ying
>
> > ---
> >
> > I reproduced the issue with the following details:
> >
> > Environment:
> >   QEMU + upstream kernel + buildroot + NVMe (2 GB)
> >
> > Kernel config:
> >   CONFIG_BLK_DEV_NVME=y
> >   CONFIG_THP_SWAP=y
> >
> > Some reproducible steps:
> >   mkfs.xfs -f /dev/nvme0n1
> >   mkdir /tmp/mnt
> >   mount /dev/nvme0n1 /tmp/mnt
> >   bs="32k"
> >   sz="1024m" # doesn't matter too much, I also tried 16m
> >   xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
> >   xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
> >   xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
> >   xfs_io -f -c "pwrite -F -S 0 -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
> >   xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fsync" /tmp/mnt/sw
> >
> >   mkswap /tmp/mnt/sw
> >   swapon /tmp/mnt/sw
> >
> >   stress --vm 2 --vm-bytes 600M # doesn't matter too much as well
> >
> > Symptoms:
> >   - FS corruption (e.g.
> >     checksum failure)
> >   - memory corruption at: 0xd2808010
> >   - segfault
> >     ...
> >
> >  mm/swapfile.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/mm/swapfile.c b/mm/swapfile.c
> > index 6c26916e95fd..2937daf3ca02 100644
> > --- a/mm/swapfile.c
> > +++ b/mm/swapfile.c
> > @@ -1074,7 +1074,7 @@ int get_swap_pages(int n_goal, swp_entry_t swp_entries[], int entry_size)
> >  			goto nextsi;
> >  		}
> >  		if (size == SWAPFILE_CLUSTER) {
> > -			if (!(si->flags & SWP_FS))
> > +			if (si->flags & SWP_BLKDEV)
> >  				n_ret = swap_alloc_cluster(si, swp_entries);
> >  		} else
> >  			n_ret = scan_swap_map_slots(si, SWAP_HAS_CACHE,
>