Date: Wed, 16 Feb 2022 14:58:36 -0700
From: Yu Zhao
To: "Huang, Ying"
Cc: Mauricio Faria de Oliveira , Minchan Kim , Andrew Morton , Yang Shi , Miaohe Lin , linux-mm@kvack.org, linux-block@vger.kernel.org
Subject: Re: [PATCH v3] mm: fix race between MADV_FREE reclaim and blkdev direct IO read
References: <20220131230255.789059-1-mfo@canonical.com> <87o837cnnw.fsf@yhuang6-desk2.ccr.corp.intel.com>
In-Reply-To: <87o837cnnw.fsf@yhuang6-desk2.ccr.corp.intel.com>

On Wed, Feb 16, 2022 at 02:48:19PM +0800, Huang, Ying wrote:
> Yu Zhao writes:
>
> > On Wed, Feb 02, 2022 at 06:27:47PM -0300, Mauricio Faria de Oliveira wrote:
> >> On Wed, Feb 2, 2022 at 4:56 PM Yu Zhao wrote:
> >> >
> >> > On Mon, Jan 31, 2022 at 08:02:55PM -0300, Mauricio Faria de Oliveira wrote:
> >> > > Problem:
> >> > > =======
> >> >
> >> > Thanks for the update. A couple of quick questions:
> >> >
> >> > > Userspace might read the zero-page instead of actual data from a
> >> > > direct IO read on a block device if the buffers have been called
> >> > > madvise(MADV_FREE) on earlier (this is discussed below) due to a
> >> > > race between page reclaim on MADV_FREE and blkdev direct IO read.
> >> >
> >> > 1) would page migration be affected as well?
> >>
> >> Could you please elaborate on the potential problem you considered?
> >>
> >> I checked migrate_pages() -> try_to_migrate() holds the page lock,
> >> thus shouldn't race with shrink_page_list() -> with try_to_unmap()
> >> (where the issue with MADV_FREE is), but maybe I didn't get you
> >> correctly.
> >
> > Could the race exist between DIO and migration? While DIO is writing
> > to a page, could migration unmap it and copy the data from this page
> > to a new page?
>
> Check the migrate_pages() code,
>
>   migrate_pages
>     unmap_and_move
>       __unmap_and_move
>         try_to_migrate // set PTE to swap entry with PTL
>         move_to_new_page
>           migrate_page
>             folio_migrate_mapping
>               folio_ref_count(folio) != expected_count // check page ref count
>             folio_migrate_copy
>
> The page ref count is checked after unmapping and before copying. This
> is good, but it appears that we need a memory barrier between checking
> page ref count and copying page.

I didn't look into this but, off the top of my head, this should be
similar if not identical to the DIO case. Therefore, it requires two
barriers -- before and after the refcnt check (which may or may not
exist).
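
To make the two-barrier point concrete, here is a minimal userspace
sketch using C11 atomics. It is only a model of the ordering being
discussed, not kernel code: struct model_page, model_migrate(),
model_pin() and model_unpin() are hypothetical names, and the seq_cst
fences merely mark where a barrier would have to sit. Whether the real
migration and GUP paths already provide equivalent ordering is exactly
the open question above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical userspace stand-in for a page; not a kernel structure. */
struct model_page {
    atomic_int refcount;  /* models the page ref count */
    atomic_bool mapped;   /* models a present PTE */
    char data[64];        /* models the page contents */
};

/* Drop a reference taken by model_pin(). */
static void model_unpin(struct model_page *page)
{
    int old = atomic_fetch_sub_explicit(&page->refcount, 1,
                                        memory_order_release);
    assert(old > 0);
    (void)old;
}

/* DIO side: take a reference first, then recheck the mapping. */
static bool model_pin(struct model_page *page)
{
    atomic_fetch_add_explicit(&page->refcount, 1, memory_order_relaxed);

    /* Order the ref count increment before the mapping recheck. */
    atomic_thread_fence(memory_order_seq_cst);

    if (!atomic_load_explicit(&page->mapped, memory_order_relaxed)) {
        model_unpin(page);  /* lost the race: page already unmapped */
        return false;
    }
    return true;
}

/* Migration side: unmap, check the ref count, then copy. */
static bool model_migrate(struct model_page *src, struct model_page *dst,
                          int expected_count)
{
    atomic_store_explicit(&src->mapped, false, memory_order_relaxed);

    /* Barrier 1: order the unmap before the ref count read. */
    atomic_thread_fence(memory_order_seq_cst);

    if (atomic_load_explicit(&src->refcount, memory_order_relaxed) !=
        expected_count) {
        /* Someone holds an extra reference: restore the mapping, bail. */
        atomic_store_explicit(&src->mapped, true, memory_order_relaxed);
        return false;
    }

    /* Barrier 2: order the ref count read before the data copy. */
    atomic_thread_fence(memory_order_seq_cst);

    memcpy(dst->data, src->data, sizeof(dst->data));
    return true;
}
```

With both fences in place, the pinning side and the migrating side
cannot both succeed: either model_migrate() observes the elevated ref
count and backs off, or model_pin() observes the cleared mapping and
drops its reference. Without barrier 2, the copy could in principle be
reordered ahead of the ref count check.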