From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: [RFC][PATCH 3/9] mm/migrate: update migration order during on hotplug events
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, Dave Hansen ,
 yang.shi@linux.alibaba.com, rientjes@google.com, ying.huang@intel.com,
 dan.j.williams@intel.com, david@redhat.com
From: Dave Hansen
Date: Wed, 07 Oct 2020 09:17:41 -0700
References: <20201007161736.ACC6E387@viggo.jf.intel.com>
In-Reply-To: <20201007161736.ACC6E387@viggo.jf.intel.com>
Message-Id: <20201007161741.DDC85648@viggo.jf.intel.com>
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

From: Dave Hansen

Reclaim-based migration is attempting to optimize data placement in
memory based on the system topology.  If the system changes, so must
the migration ordering.

The implementation here is pretty simple and entirely unoptimized.  On
any memory or CPU hotplug event, assume that a node was added or
removed and recalculate all migration targets.  This ensures that the
node_demotion[] array is always ready to be used in case the new
reclaim mode is enabled.

This recalculation is far from optimal, most glaringly in that it does
not even attempt to figure out if nodes are actually coming or going.
But, given the expected paucity of hotplug events, this should be fine.

Signed-off-by: Dave Hansen
Cc: Yang Shi
Cc: David Rientjes
Cc: Huang Ying
Cc: Dan Williams
Cc: David Hildenbrand
---

 b/mm/migrate.c |   93 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 93 insertions(+)

diff -puN mm/migrate.c~enable-numa-demotion mm/migrate.c
--- a/mm/migrate.c~enable-numa-demotion	2020-10-07 09:15:28.260642449 -0700
+++ b/mm/migrate.c	2020-10-07 09:15:28.266642449 -0700
@@ -49,6 +49,7 @@
 #include
 #include
 #include
+#include
 #include
@@ -3241,9 +3242,101 @@ again:
 		goto again;
 }
 
+/*
+ * For callers that do not hold get_online_mems() already.
+ */
 void set_migration_target_nodes(void)
 {
 	get_online_mems();
 	__set_migration_target_nodes();
 	put_online_mems();
 }
+
+/*
+ * React to hotplug events that might affect the migration targets
+ * like events that online or offline NUMA nodes.
+ *
+ * The ordering is also currently dependent on which nodes have
+ * CPUs.  That means we need CPU on/offline notification too.
+ */
+static int migration_online_cpu(unsigned int cpu)
+{
+	set_migration_target_nodes();
+	return 0;
+}
+
+static int migration_offline_cpu(unsigned int cpu)
+{
+	set_migration_target_nodes();
+	return 0;
+}
+
+/*
+ * This leaves migrate-on-reclaim transiently disabled
+ * between the MEM_GOING_OFFLINE and MEM_OFFLINE events.
+ * This runs whether reclaim-based migration is enabled or not.
+ * This ensures that the user can turn on reclaim-based
+ * migration at any time without needing to recalculate
+ * migration targets.
+ *
+ * These callbacks already hold get_online_mems().  That
+ * is why __set_migration_target_nodes() can be used as
+ * opposed to set_migration_target_nodes().
+ */
+#if defined(CONFIG_MEMORY_HOTPLUG)
+static int __meminit migrate_on_reclaim_callback(struct notifier_block *self,
+						 unsigned long action, void *arg)
+{
+	switch (action) {
+	case MEM_GOING_OFFLINE:
+		/*
+		 * Make sure there are not transient states where
+		 * an offline node is a migration target.  This
+		 * will leave migration disabled until the offline
+		 * completes and the MEM_OFFLINE case below runs.
+		 */
+		disable_all_migrate_targets();
+		break;
+	case MEM_OFFLINE:
+	case MEM_ONLINE:
+		/*
+		 * Recalculate the target nodes once the node
+		 * reaches its final state (online or offline).
+		 */
+		__set_migration_target_nodes();
+		break;
+	case MEM_CANCEL_OFFLINE:
+		/*
+		 * MEM_GOING_OFFLINE disabled all the migration
+		 * targets.  Reenable them.
+		 */
+		__set_migration_target_nodes();
+		break;
+	case MEM_GOING_ONLINE:
+	case MEM_CANCEL_ONLINE:
+		break;
+	}
+
+	return notifier_from_errno(0);
+}
+
+static int __init migrate_on_reclaim_init(void)
+{
+	int ret;
+
+	ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "migrate on reclaim",
+				migration_online_cpu,
+				migration_offline_cpu);
+	/*
+	 * In the unlikely case that this fails, the automatic
+	 * migration targets may become suboptimal for nodes
+	 * where N_CPU changes.  With such a small impact in a
+	 * rare case, do not bother trying to do anything special.
+	 */
+	WARN_ON(ret < 0);
+
+	hotplug_memory_notifier(migrate_on_reclaim_callback, 100);
+	return 0;
+}
+late_initcall(migrate_on_reclaim_init);
+#endif /* CONFIG_MEMORY_HOTPLUG */
_
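
For readers who want to poke at the event handling outside the kernel, the
following is a rough user-space sketch of the state machine in
migrate_on_reclaim_callback(). It is not part of the patch. Every sim_* and
SIM_* name is hypothetical, and the "next higher-numbered online node" policy
is only a stand-in for whatever __set_migration_target_nodes() really
computes. The sketch only illustrates which events disable the targets, which
ones recalculate them, and which ones are ignored.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's MEM_* hotplug actions. */
enum sim_mem_action {
	SIM_MEM_GOING_ONLINE,
	SIM_MEM_CANCEL_ONLINE,
	SIM_MEM_ONLINE,
	SIM_MEM_GOING_OFFLINE,
	SIM_MEM_CANCEL_OFFLINE,
	SIM_MEM_OFFLINE,
};

#define SIM_MAX_NODES	4
#define SIM_NO_TARGET	(-1)

static bool sim_node_online[SIM_MAX_NODES];
static int sim_node_demotion[SIM_MAX_NODES];	/* models node_demotion[] */
static unsigned int sim_recalc_count;		/* how often we recalculated */

/* Models disable_all_migrate_targets(): no node has a demotion target. */
static void sim_disable_all_targets(void)
{
	for (int i = 0; i < SIM_MAX_NODES; i++)
		sim_node_demotion[i] = SIM_NO_TARGET;
}

/*
 * Models __set_migration_target_nodes() with a toy policy (an assumption,
 * not the kernel's algorithm): each online node demotes to the next
 * higher-numbered online node. Everything is rebuilt from scratch, just
 * like the unoptimized recalculation described in the changelog.
 */
static void sim_set_targets(void)
{
	sim_disable_all_targets();
	sim_recalc_count++;

	for (int i = 0; i < SIM_MAX_NODES; i++) {
		if (!sim_node_online[i])
			continue;
		for (int j = i + 1; j < SIM_MAX_NODES; j++) {
			if (sim_node_online[j]) {
				sim_node_demotion[i] = j;
				break;
			}
		}
	}
}

/* Mirrors the switch statement in the memory hotplug callback. */
static void sim_hotplug_event(enum sim_mem_action action, int node)
{
	switch (action) {
	case SIM_MEM_GOING_OFFLINE:
		/* Never leave a node that is going away as a target. */
		sim_disable_all_targets();
		break;
	case SIM_MEM_ONLINE:
		sim_node_online[node] = true;
		sim_set_targets();
		break;
	case SIM_MEM_OFFLINE:
		sim_node_online[node] = false;
		sim_set_targets();
		break;
	case SIM_MEM_CANCEL_OFFLINE:
		/* The offline was aborted: restore the old targets. */
		sim_set_targets();
		break;
	case SIM_MEM_GOING_ONLINE:
	case SIM_MEM_CANCEL_ONLINE:
		break;
	}
}
```

Onlining nodes 0 and 1 makes node 0 demote to node 1. A GOING_OFFLINE event
for node 1 then clears every target until either CANCEL_OFFLINE or OFFLINE
arrives, which is the transiently disabled window the comment above the
callback describes.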