From: "Huang, Ying"
To: Yu Zhao
Cc: Mauricio Faria de Oliveira, Minchan Kim, Andrew Morton, Yang Shi,
    Miaohe Lin, linux-mm@kvack.org, linux-block@vger.kernel.org
Subject: Re: [PATCH v3] mm: fix race between MADV_FREE reclaim and blkdev direct IO read
References: <20220131230255.789059-1-mfo@canonical.com> <87o837cnnw.fsf@yhuang6-desk2.ccr.corp.intel.com>
Date: Thu, 17 Feb 2022 14:08:31 +0800
In-Reply-To: (Yu Zhao's message of "Wed, 16 Feb 2022 14:58:36 -0700")
Message-ID: <87zgmq6n4w.fsf@yhuang6-desk2.ccr.corp.intel.com>

Yu Zhao writes:

> On Wed, Feb 16, 2022 at 02:48:19PM +0800, Huang, Ying wrote:
>> Yu Zhao writes:
>>
>> > On Wed, Feb 02, 2022 at 06:27:47PM -0300, Mauricio Faria de Oliveira wrote:
>> >> On Wed, Feb 2, 2022 at 4:56 PM Yu Zhao wrote:
>> >> >
>> >> > On Mon, Jan 31, 2022 at 08:02:55PM -0300, Mauricio Faria de Oliveira wrote:
>> >> > > Problem:
>> >> > > =======
>> >> >
>> >> > Thanks for the update. A couple of quick questions:
>> >> >
>> >> > > Userspace might read the zero-page instead of actual data from a
>> >> > > direct IO read on a block device if the buffers have been called
>> >> > > madvise(MADV_FREE) on earlier (this is discussed below) due to a
>> >> > > race between page reclaim on MADV_FREE and blkdev direct IO read.
>> >> >
>> >> > 1) would page migration be affected as well?
>> >>
>> >> Could you please elaborate on the potential problem you considered?
>> >>
>> >> I checked migrate_pages() -> try_to_migrate() holds the page lock,
>> >> thus shouldn't race with shrink_page_list() -> with try_to_unmap()
>> >> (where the issue with MADV_FREE is), but maybe I didn't get you
>> >> correctly.
>> >
>> > Could the race exist between DIO and migration? While DIO is writing
>> > to a page, could migration unmap it and copy the data from this page
>> > to a new page?
>>
>> Check the migrate_pages() code,
>>
>>   migrate_pages
>>     unmap_and_move
>>       __unmap_and_move
>>         try_to_migrate // set PTE to swap entry with PTL
>>         move_to_new_page
>>           migrate_page
>>             folio_migrate_mapping
>>               folio_ref_count(folio) != expected_count // check page ref count
>>             folio_migrate_copy
>>
>> The page ref count is checked after unmapping and before copying. This
>> is good, but it appears that we need a memory barrier between checking
>> page ref count and copying page.
>
> I didn't look into this but, off the top of head, this should be
> similar if not identical to the DIO case. Therefore, it requires two
> barriers -- before and after the refcnt check (which may or may not
> exist).

Yes. I think so too.

Best Regards,
Huang, Ying
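
For illustration only, the ordering discussed in this thread (unmap, check
the reference count, then copy, with a barrier on each side of the check)
can be modeled in userspace with C11 atomics. This is a sketch under
assumptions, not kernel code: struct model_page, model_pin(), model_unpin()
and model_migrate() are hypothetical names, the "mapped" flag stands in for
a present PTE, and the full fences stand in for whatever barriers the real
code paths would need.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

#define MODEL_PAGE_SIZE 64

/* Hypothetical userspace stand-in for a page; not a kernel structure. */
struct model_page {
	atomic_int refcount;	/* models the page reference count */
	atomic_bool mapped;	/* models "a PTE still points at this page" */
	unsigned char data[MODEL_PAGE_SIZE];
};

static void model_page_init(struct model_page *p, bool mapped,
			    unsigned char fill)
{
	atomic_init(&p->refcount, 1);
	atomic_init(&p->mapped, mapped);
	memset(p->data, fill, sizeof(p->data));
}

/* Pinning side (DIO-like): take a reference, then recheck the mapping. */
static bool model_pin(struct model_page *p)
{
	atomic_fetch_add_explicit(&p->refcount, 1, memory_order_relaxed);
	/* Order the refcount increment before the mapping recheck. */
	atomic_thread_fence(memory_order_seq_cst);
	if (!atomic_load_explicit(&p->mapped, memory_order_relaxed)) {
		/* Page was unmapped under us: drop the reference again. */
		atomic_fetch_sub_explicit(&p->refcount, 1,
					  memory_order_relaxed);
		return false;
	}
	return true;
}

static void model_unpin(struct model_page *p)
{
	atomic_fetch_sub_explicit(&p->refcount, 1, memory_order_relaxed);
}

/* Migration side: unmap, barrier, refcount check, barrier, copy. */
static bool model_migrate(struct model_page *src, struct model_page *dst,
			  int expected_count)
{
	assert(src != NULL && dst != NULL);

	/* Step 1: "unmap" the source page. */
	atomic_store_explicit(&src->mapped, false, memory_order_relaxed);

	/* Barrier 1: order the unmap before the refcount read. */
	atomic_thread_fence(memory_order_seq_cst);

	/* Step 2: bail out if anyone else holds a reference. */
	if (atomic_load_explicit(&src->refcount, memory_order_relaxed) !=
	    expected_count) {
		/* Simplification: the model just marks the page mapped again. */
		atomic_store_explicit(&src->mapped, true,
				      memory_order_relaxed);
		return false;
	}

	/* Barrier 2: order the refcount read before reading the data. */
	atomic_thread_fence(memory_order_seq_cst);

	/* Step 3: copy the contents to the new page. */
	memcpy(dst->data, src->data, MODEL_PAGE_SIZE);
	return true;
}
```

In this model, model_migrate() refuses to copy while model_pin() holds an
extra reference, and model_pin() fails once the page has been unmapped. The
first fence pairs with the one in model_pin() so that at least one side
observes the other's write; the second keeps the copy from being reordered
ahead of the refcount check.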