From: "Huang, Ying"
To: Alistair Popple
Cc: Yang Shi, Andrew Morton, Zi Yan, Baolin Wang, Oscar Salvador, Matthew Wilcox
Subject: Re: [RFC 2/6] mm/migrate_pages: split unmap_and_move() to _unmap() and _move()
Date: Tue, 27 Sep 2022 09:51:21 +0800
Message-ID: <87ill937qe.fsf@yhuang6-desk2.ccr.corp.intel.com>
In-Reply-To: <87fsgdllmb.fsf@nvdebian.thelocal> (Alistair Popple's message of "Tue, 27 Sep 2022 10:02:33 +1000")
References: <20220921060616.73086-1-ying.huang@intel.com> <20220921060616.73086-3-ying.huang@intel.com> <87o7v2lbn4.fsf@nvdebian.thelocal> <87fsgdllmb.fsf@nvdebian.thelocal>

Alistair Popple writes:

> Yang Shi writes:
>
>> On Mon, Sep 26, 2022 at 2:37 AM Alistair Popple wrote:
>>>
>>>
>>> Huang Ying writes:
>>>
>>> > This is a preparation patch to batch the page unmapping and moving
>>> > for the normal pages and THP.
>>> >
>>> > In this patch, unmap_and_move() is split to migrate_page_unmap() and
>>> > migrate_page_move(). So, we can batch _unmap() and _move() in
>>> > different loops later. To pass some information between unmap and
>>> > move, the original unused newpage->mapping and newpage->private are
>>> > used.
>>>
>>> This looks like it could cause a deadlock between two threads migrating
>>> the same pages if force == true && mode != MIGRATE_ASYNC as
>>> migrate_page_unmap() will call lock_page() while holding the lock on
>>> other pages in the list. Therefore the two threads could deadlock if the
>>> pages are in a different order.
>>
>> It seems unlikely to me since the page has to be isolated from lru
>> before migration. The isolating from lru is atomic, so the two threads
>> unlikely see the same pages on both lists.
>
> Oh thanks! That is a good point and I agree since lru isolation is
> atomic the two threads won't see the same pages. migrate_vma_setup()
> does LRU isolation after locking the page which is why the potential
> exists there. We could potentially switch that around but given
> ZONE_DEVICE pages aren't on an lru it wouldn't help much.
>
>> But there might be other cases which may incur deadlock, for example,
>> filesystem writeback IIUC. Some filesystems may lock a bunch of pages
>> then write them back in a batch. The same pages may be on the
>> migration list and they are also dirty and seen by writeback. I'm not
>> sure whether I miss something that could prevent such a deadlock from
>> happening.
>
> I'm not overly familiar with that area but I would assume any filesystem
> code doing this would already have to deal with deadlock potential.

Thank you very much for pointing this out.  I think the deadlock is a
real issue.  In any case, we shouldn't forbid other places in the kernel
from locking 2 pages at the same time.

The simplest solution is to batch page migration only if mode ==
MIGRATE_ASYNC.  Then we may consider falling back to non-batch mode if
mode != MIGRATE_ASYNC and trylock on a page fails.

Best Regards,
Huang, Ying

[snip]
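
To make the proposed policy concrete, here is a minimal user-space
sketch of the idea, not the actual patch.  Everything in it is a
hypothetical stand-in: struct fake_page, fake_trylock_page(),
fake_unlock_page() and lock_batch() only model the behaviour discussed
above, and the enum merely mirrors the names of the kernel's migrate
modes.  The assumption is that in MIGRATE_ASYNC mode a busy page is
simply skipped, while in the other modes a failed trylock makes the
batch give up all the locks it took and tell the caller to migrate the
pages one at a time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the kernel definitions. */
enum migrate_mode { MIGRATE_ASYNC, MIGRATE_SYNC_LIGHT, MIGRATE_SYNC };

struct fake_page {
	bool locked;	/* models the page lock bit */
};

/* Models a trylock: take the lock only if it is free, never wait. */
static bool fake_trylock_page(struct fake_page *page)
{
	if (page->locked)
		return false;
	page->locked = true;
	return true;
}

static void fake_unlock_page(struct fake_page *page)
{
	page->locked = false;
}

/*
 * Try to lock every page of a batch without ever blocking.
 *
 * held[i] records whether this call owns the lock on pages[i].
 * Returns true if batching may proceed.  Returns false if the caller
 * should fall back to non-batch mode; in that case no lock taken by
 * this call is still held, so a later blocking lock on a single page
 * cannot be one half of an ABBA deadlock.
 */
static bool lock_batch(struct fake_page **pages, bool *held, size_t n,
		       enum migrate_mode mode)
{
	assert(pages && held);

	for (size_t i = 0; i < n; i++)
		held[i] = false;

	for (size_t i = 0; i < n; i++) {
		held[i] = fake_trylock_page(pages[i]);
		if (held[i] || mode == MIGRATE_ASYNC)
			continue;	/* async mode just skips a busy page */

		/* Sync modes: release what we took and give up batching. */
		for (size_t j = 0; j < i; j++) {
			if (held[j]) {
				fake_unlock_page(pages[j]);
				held[j] = false;
			}
		}
		return false;
	}
	return true;
}
```

The point of the sketch is that the batch path never sleeps on a page
lock while it holds another one; the blocking lock is only taken in the
fallback path, where at most one page is locked at a time.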