From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=3.0 tests=INCLUDES_PATCH, MAILING_LIST_MULTI,PDS_BTC_ID,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS, USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9DD42C43603 for ; Thu, 5 Dec 2019 11:32:01 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 41C1F20707 for ; Thu, 5 Dec 2019 11:32:01 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 41C1F20707 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=kernel.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 8A32B6B0FEE; Thu, 5 Dec 2019 06:32:00 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 854A96B0FEF; Thu, 5 Dec 2019 06:32:00 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 742B76B0FF0; Thu, 5 Dec 2019 06:32:00 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0097.hostedemail.com [216.40.44.97]) by kanga.kvack.org (Postfix) with ESMTP id 5B3E26B0FEE for ; Thu, 5 Dec 2019 06:32:00 -0500 (EST) Received: from smtpin09.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with SMTP id 16F412C78 for ; Thu, 5 Dec 2019 11:32:00 +0000 (UTC) X-FDA: 76230873600.09.plate42_4b9b59b602c10 X-HE-Tag: plate42_4b9b59b602c10 X-Filterd-Recvd-Size: 8128 Received: from mail-wm1-f67.google.com (mail-wm1-f67.google.com [209.85.128.67]) by imf22.hostedemail.com (Postfix) with ESMTP for ; Thu, 5 Dec 2019 11:31:59 +0000 (UTC) Received: by mail-wm1-f67.google.com with SMTP id c20so1804214wmb.0 for ; Thu, 05 Dec 2019 03:31:59 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to:user-agent; bh=JjJxU4HP6eaRRyFxsdouinTrpCEow/Z7i31krH2kwCs=; b=Cr08Qjh3GKg28KWLGtVnJtNmCM4LAticlkhuhFqHJ4h2i/M/GFRBiWI01GU8axq1fi e2es6hSe10yq78jLXf1N0BGOQ8sZiQYnBM+Y6jPBcsIX7TmMnmVgvwWWpuRiRtjUdfn8 V60dHr81mG2ObTnSwfEBYIEV2HA+GmmDyl9HzcU3SoEK1smQemyBvpf6Da1oAYWhzMt5 xfIg9pPL7OagaE8QeBBXbu0wZDGSV57CBlBueFULCYS6ERRunUS2iws+qu1tDpC8TCRK 9G9pc5fQdE3H5C1LB+4RqMs7Atcl7qdX2MvFdPwSIXnCci7h7kUZqXJ+JOHHFmyToNDx 8L6A== X-Gm-Message-State: APjAAAUtiNOdsD4yMohEj8KF9XBEN4cERKlZLul+WL/iYRTrq6F/p77m 1/rDX+4ovLayB/j4BsvKWuc= X-Google-Smtp-Source: APXvYqzpDPHksDnygy5Fg+9DCfEclhSTLYJunPk0DVHsHi1gH3jci6q0s1hvYgMua9JGOYKPgXNpXQ== X-Received: by 2002:a1c:61d7:: with SMTP id v206mr4592095wmb.13.1575545518284; Thu, 05 Dec 2019 03:31:58 -0800 (PST) Received: from localhost (prg-ext-pat.suse.com. [213.151.95.130]) by smtp.gmail.com with ESMTPSA id z11sm11966035wrt.82.2019.12.05.03.31.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 05 Dec 2019 03:31:57 -0800 (PST) Date: Thu, 5 Dec 2019 12:31:56 +0100 From: Michal Hocko To: Yang Shi Cc: fabecassis@nvidia.com, jhubbard@nvidia.com, cl@linux.com, vbabka@suse.cz, mgorman@techsingularity.net, akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, stable@vger.kernel.org Subject: Re: [v2 PATCH] mm: move_pages: return valid node id in status if the page is already on the target node Message-ID: <20191205113156.GE28317@dhcp22.suse.cz> References: <1575519678-86510-1-git-send-email-yang.shi@linux.alibaba.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1575519678-86510-1-git-send-email-yang.shi@linux.alibaba.com> User-Agent: Mutt/1.10.1 (2018-07-13) X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Thu 05-12-19 12:21:18, Yang Shi wrote: > Felix Abecassis reports move_pages() would return random status if the > pages are already on the target node by the below test program: > > ---8<--- > > int main(void) > { > const long node_id = 1; > const long page_size = sysconf(_SC_PAGESIZE); > const int64_t num_pages = 8; > > unsigned long nodemask = 1 << node_id; > long ret = set_mempolicy(MPOL_BIND, &nodemask, sizeof(nodemask)); > if (ret < 0) > return (EXIT_FAILURE); > > void **pages = malloc(sizeof(void*) * num_pages); > for (int i = 0; i < num_pages; ++i) { > pages[i] = mmap(NULL, page_size, PROT_WRITE | PROT_READ, > MAP_PRIVATE | MAP_POPULATE | MAP_ANONYMOUS, > -1, 0); > if (pages[i] == MAP_FAILED) > return (EXIT_FAILURE); > } > > ret = set_mempolicy(MPOL_DEFAULT, NULL, 0); > if (ret < 0) > return (EXIT_FAILURE); > > int *nodes = malloc(sizeof(int) * num_pages); > int *status = malloc(sizeof(int) * num_pages); > for (int i = 0; i < num_pages; ++i) { > nodes[i] = node_id; > status[i] = 0xd0; /* simulate garbage values */ > } > > ret = move_pages(0, num_pages, pages, nodes, status, MPOL_MF_MOVE); > printf("move_pages: %ld\n", ret); > for (int i = 0; i < num_pages; ++i) > printf("status[%d] = %d\n", i, status[i]); > } > ---8<--- > > Then running the program would return nonsense status values: > $ ./move_pages_bug > move_pages: 0 > status[0] = 208 > status[1] = 208 > status[2] = 208 > status[3] = 208 > status[4] = 208 > status[5] = 208 > status[6] = 208 > status[7] = 208 > > This is because the status is not set if the page is already on the > target node, but move_pages() should return valid status as long as it > succeeds. The valid status may be errno or node id. > > We can't simply initialize status array to zero since the pages may be > not on node 0. Fix it by updating status with node id which the page is > already on. And, it looks we have to update the status inside > add_page_for_migration() since the page struct is not available outside > it. The code is indeed more complex than I wanted but I couldn't figure an easier way back then. I wanted to keep store_status at a single place because the failure handling is quite complex already. > Make add_page_for_migration() return 1 if store_status() is failed in > order to not mix up the status value since -EFAULT is also a valid > status. Can we simply return 1 when there is something to migrate instead? Something like diff --git a/mm/migrate.c b/mm/migrate.c index 4fe45d1428c8..f3730804b8d4 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -1516,9 +1516,9 @@ static int do_move_pages_to_node(struct mm_struct *mm, /* * Resolves the given address to a struct page, isolates it from the LRU and * puts it to the given pagelist. - * Returns -errno if the page cannot be found/isolated or 0 when it has been - * queued or the page doesn't need to be migrated because it is already on - * the target node + * Returns -errno if the page cannot be found/isolated or 0 when it doesn't have + * to be migrate or 1 dwhen it has been queued or the page doesn't need to be + * migrated because it is already on the target node */ static int add_page_for_migration(struct mm_struct *mm, unsigned long addr, int node, struct list_head *pagelist, bool migrate_all) @@ -1557,7 +1557,7 @@ static int add_page_for_migration(struct mm_struct *mm, unsigned long addr, if (PageHuge(page)) { if (PageHead(page)) { isolate_huge_page(page, pagelist); - err = 0; + err = 1; } } else { struct page *head; @@ -1567,7 +1567,7 @@ static int add_page_for_migration(struct mm_struct *mm, unsigned long addr, if (err) goto out_putpage; - err = 0; + err = 1; list_add_tail(&head->lru, pagelist); mod_node_page_state(page_pgdat(head), NR_ISOLATED_ANON + page_is_file_cache(head), @@ -1644,8 +1644,14 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes, */ err = add_page_for_migration(mm, addr, current_node, &pagelist, flags & MPOL_MF_MOVE_ALL); - if (!err) + if (!err) { + err = store_status(status, i, current_node, 1); + if (err) + goto out_flush; + continue; + } else if (err > 0) { continue; + } err = store_status(status, i, err, 1); if (err) this would still keep store_status ugliness at a single place. > > Fixes: a49bd4d71637 ("mm, numa: rework do_pages_move") > Reported-by: Felix Abecassis > Tested-by: Felix Abecassis > Cc: John Hubbard > Cc: Michal Hocko > Cc: Christoph Lameter > Cc: Vlastimil Babka > Cc: Mel Gorman > Cc: 4.17+ > Signed-off-by: Yang Shi > --- > v2: *Correted the return value when add_page_for_migration() returns 1. > > John noticed another return value inconsistency between the implementation and > the manpage. The manpage says it should return -ENOENT if the page is already > on the target node, but it doesn't. It looks the original code didn't return > -ENOENT either, I'm not sure if this is a document issue or not. Anyway this > is another issue, once we confirm it we can fix it later. I do not remember all the details but my recollection is that there were several inconsistencies present before I touched the code and I've decided to not touch them without a clear usecase. -- Michal Hocko SUSE Labs