From: Yang Shi
Date: Fri, 29 Jan 2021 12:59:06 -0800
Subject: Re: [RFC][PATCH 06/13] mm/migrate: update migration order during on hotplug events
To: Dave Hansen
Cc: Linux Kernel Mailing List, Linux MM, Yang Shi, David Rientjes, Huang Ying, Dan Williams, David Hildenbrand, Oscar Salvador
In-Reply-To: <20210126003423.8D2B5637@viggo.jf.intel.com>
References: <20210126003411.2AC51464@viggo.jf.intel.com> <20210126003423.8D2B5637@viggo.jf.intel.com>

On Mon, Jan 25, 2021 at 4:41 PM Dave Hansen wrote:
>
>
> From: Dave Hansen
>
> Reclaim-based migration is attempting to optimize data placement in
> memory based on the system topology. If the system changes, so must
> the migration ordering.
>
> The implementation here is pretty simple and entirely unoptimized. On
> any memory or CPU hotplug events, assume that a node was added or
> removed and recalculate all migration targets. This ensures that the
> node_demotion[] array is always ready to be used in case the new
> reclaim mode is enabled.
>
> This recalculation is far from optimal, most glaringly that it does
> not even attempt to figure out if nodes are actually coming or going.
> But, given the expected paucity of hotplug events, this should be
> fine.
>
> Signed-off-by: Dave Hansen
> Cc: Yang Shi
> Cc: David Rientjes
> Cc: Huang Ying
> Cc: Dan Williams
> Cc: David Hildenbrand
> Cc: osalvador
> ---
>
>  b/mm/migrate.c |   97 +++++++++++++++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 95 insertions(+), 2 deletions(-)
>
> diff -puN mm/migrate.c~enable-numa-demotion mm/migrate.c
> --- a/mm/migrate.c~enable-numa-demotion	2021-01-25 16:23:11.850866703 -0800
> +++ b/mm/migrate.c	2021-01-25 16:23:11.855866703 -0800
> @@ -49,6 +49,7 @@
>  #include
>  #include
>  #include
> +#include
>
>  #include
>
> @@ -3135,6 +3136,7 @@ void migrate_vma_finalize(struct migrate
>  EXPORT_SYMBOL(migrate_vma_finalize);
>  #endif /* CONFIG_DEVICE_PRIVATE */
>
> +#if defined(CONFIG_MEMORY_HOTPLUG)
>  /* Disable reclaim-based migration. */
>  static void disable_all_migrate_targets(void)
>  {
> @@ -3191,7 +3193,7 @@ static int establish_migrate_target(int
>   * with itself. Exclusion is provided by memory hotplug events
>   * being single-threaded.
>   */
> -void __set_migration_target_nodes(void)
> +static void __set_migration_target_nodes(void)
>  {
>  	nodemask_t next_pass = NODE_MASK_NONE;
>  	nodemask_t this_pass = NODE_MASK_NONE;
> @@ -3253,9 +3255,100 @@ again:
>  	goto again;
>  }
>
> -void set_migration_target_nodes(void)
> +/*
> + * For callers that do not hold get_online_mems() already.
> + */
> +static void set_migration_target_nodes(void)

Aha, it is changed to static here. I think this hunk should be folded into the previous patch.

>  {
>  	get_online_mems();
>  	__set_migration_target_nodes();
>  	put_online_mems();
>  }
> +
> +/*
> + * React to hotplug events that might affect the migration targes

s/targes/targets

> + * like events that online or offline NUMA nodes.
> + *
> + * The ordering is also currently dependent on which nodes have
> + * CPUs. That means we need CPU on/offline notification too.
> + */
> +static int migration_online_cpu(unsigned int cpu)
> +{
> +	set_migration_target_nodes();
> +	return 0;
> +}
> +
> +static int migration_offline_cpu(unsigned int cpu)
> +{
> +	set_migration_target_nodes();
> +	return 0;
> +}
> +
> +/*
> + * This leaves migrate-on-reclaim transiently disabled
> + * between the MEM_GOING_OFFLINE and MEM_OFFLINE events.
> + * This runs reclaim-based micgration is enabled or not.

s/micgration/migration

> + * This ensures that the user can turn reclaim-based
> + * migration at any time without needing to recalcuate

s/recalcuate/recalculate

> + * migration targets.
> + *
> + * These callbacks already hold get_online_mems(). That
> + * is why __set_migration_target_nodes() can be used as
> + * opposed to set_migration_target_nodes().
> + */
> +static int __meminit migrate_on_reclaim_callback(struct notifier_block *self,
> +						 unsigned long action, void *arg)
> +{
> +	switch (action) {
> +	case MEM_GOING_OFFLINE:
> +		/*
> +		 * Make sure there are not transient states where
> +		 * an offline node is a migration target. This
> +		 * will leave migration disabled until the offline
> +		 * completes and the MEM_OFFLINE case below runs.
> +		 */
> +		disable_all_migrate_targets();

Don't we need smp_wmb() here? In the previous patch the comment says a write memory barrier is needed to guarantee readers see consistent values. The smp_wmb() is called by __set_migration_target_nodes(). So, it seems it'd be better to move smp_wmb() into disable_all_migrate_targets() (a rough sketch of that is at the end of this mail).

> +		break;
> +	case MEM_OFFLINE:
> +	case MEM_ONLINE:
> +		/*
> +		 * Recalculate the target nodes once the node
> +		 * reaches its final state (online or offline).
> +		 */
> +		__set_migration_target_nodes();
> +		break;
> +	case MEM_CANCEL_OFFLINE:
> +		/*
> +		 * MEM_GOING_OFFLINE disabled all the migration
> +		 * targets. Reenable them.
> +		 */
> +		__set_migration_target_nodes();
> +		break;
> +	case MEM_GOING_ONLINE:
> +	case MEM_CANCEL_ONLINE:
> +		break;
> +	}
> +
> +	return notifier_from_errno(0);
> +}
> +
> +static int __init migrate_on_reclaim_init(void)
> +{
> +	int ret;
> +
> +	ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "migrate on reclaim",
> +				migration_online_cpu,
> +				migration_offline_cpu);
> +	/*
> +	 * In the unlikely case that this fails, the automatic
> +	 * migration targets may become suboptimal for nodes
> +	 * where N_CPU changes. With such a small impact in a
> +	 * rare case, do not bother trying to do anything special.
> +	 */
> +	WARN_ON(ret < 0);
> +
> +	hotplug_memory_notifier(migrate_on_reclaim_callback, 100);
> +	return 0;
> +}
> +late_initcall(migrate_on_reclaim_init);
> +#endif /* CONFIG_MEMORY_HOTPLUG */
> _
>
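
For reference, a minimal sketch of the suggestion above (moving the write barrier into the disable helper) could look like the code below. The node_demotion[] array, the NUMA_NO_NODE default and the loop body are assumed from the earlier patches in this series rather than taken from this one, so treat it as an illustration of the idea, not the actual change:

/*
 * Hypothetical sketch, not the actual patch: fold the write barrier
 * into disable_all_migrate_targets() so every caller publishes the
 * "disabled" state itself.
 */
static void disable_all_migrate_targets(void)
{
	int node;

	/* Point every node at "no demotion target". */
	for_each_online_node(node)
		node_demotion[node] = NUMA_NO_NODE;

	/*
	 * Make sure readers of node_demotion[] observe the disabled
	 * state before the array is repopulated, instead of relying
	 * on the smp_wmb() issued later by
	 * __set_migration_target_nodes().
	 */
	smp_wmb();
}

That way the MEM_GOING_OFFLINE path would not depend on a barrier that only runs once __set_migration_target_nodes() is reached.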