linux-kernel.vger.kernel.org archive mirror
From: David Hildenbrand <david@redhat.com>
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org,
	linux-hyperv@vger.kernel.org,
	David Hildenbrand <david@redhat.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Yumei Huang <yuhuang@redhat.com>,
	Igor Mammedov <imammedo@redhat.com>, Baoquan He <bhe@redhat.com>,
	Eduardo Habkost <ehabkost@redhat.com>,
	Milan Zamazal <mzamazal@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Haiyang Zhang <haiyangz@microsoft.com>,
	"K. Y. Srinivasan" <kys@microsoft.com>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Michal Hocko <mhocko@kernel.org>, Michal Hocko <mhocko@suse.com>,
	Oscar Salvador <osalvador@suse.de>,
	Paul Mackerras <paulus@samba.org>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Stephen Hemminger <sthemmin@microsoft.com>,
	Wei Liu <wei.liu@kernel.org>,
	Wei Yang <richard.weiyang@gmail.com>
Subject: [PATCH v2 0/8] mm/memory_hotplug: allow to specify a default online_type
Date: Tue, 17 Mar 2020 11:49:34 +0100
Message-ID: <20200317104942.11178-1-david@redhat.com>

Distributions nowadays use udev rules ([1] [2]) to specify whether and
how to online hotplugged memory. The rules keep getting more complex as
special cases accumulate. Because of these special cases,
CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE cannot be used, so all memory hotplug
is handled via udev rules.

Every time we hotplug memory, the udev rule will come to the same
conclusion. Hyper-V in particular (and soon also virtio-mem) adds a lot of
memory in separate memory blocks and waits for memory to get onlined by user
space before continuing to add more memory blocks (to not add memory faster
than it is getting onlined). This of course slows down the whole memory
hotplug process.

To make the job of distributions easier and to avoid udev rules that get
more and more complicated, let's extend the mechanism provided by
- /sys/devices/system/memory/auto_online_blocks
- "memhp_default_state=" on the kernel cmdline
so that "online_movable" and "online_kernel" can be specified as well.
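
As a minimal sketch (not part of this series), setting the default from
user space could look like the following. The helper names
is_valid_online_type and set_default_online_type are hypothetical, and the
optional second argument exists only so the helper can be pointed at a
scratch file instead of sysfs. The accepted values are the two existing
ones ("offline", "online") plus the two added by this series.

```shell
#!/bin/bash

# Online types accepted by auto_online_blocks once this series is applied.
is_valid_online_type() {
  case "$1" in
    offline|online|online_kernel|online_movable) return 0 ;;
    *) return 1 ;;
  esac
}

# Write the default online type to the control file.
# $1: online type
# $2: control file (assumption: defaults to the real sysfs path)
set_default_online_type() {
  local type="$1"
  local file="${2:-/sys/devices/system/memory/auto_online_blocks}"

  if ! is_valid_online_type "$type"; then
    echo "Unknown online type: $type" >&2
    return 1
  fi
  echo "$type" > "$file"
}
```

Run as root, "set_default_online_type online_movable" would only affect
memory blocks hotplugged afterwards; blocks that are already present and
offline still have to be onlined explicitly.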

v1 -> v2:
- Tweaked some patch descriptions
- Added
-- "powernv/memtrace: always online added memory blocks"
-- "hv_balloon: don't check for memhp_auto_online manually"
-- "mm/memory_hotplug: unexport memhp_auto_online"
- "mm/memory_hotplug: convert memhp_auto_online to store an online_type"
-- No longer touches hv/memtrace code


=== Example /usr/libexec/config-memhotplug ===

#!/bin/bash

VIRT=$(systemd-detect-virt --vm)
ARCH=$(uname -p)

sense_virtio_mem() {
  if [ -d "/sys/bus/virtio/drivers/virtio_mem/" ]; then
    DEVICES=$(find /sys/bus/virtio/drivers/virtio_mem/ -maxdepth 1 -type l | wc -l)
    if [ "$DEVICES" != "0" ]; then
        return 0
    fi
  fi
  return 1
}

if [ ! -e "/sys/devices/system/memory/auto_online_blocks" ]; then
  echo "Memory hotplug configuration support missing in the kernel"
  exit 1
fi

if grep "memhp_default_state=" /proc/cmdline > /dev/null; then
  echo "Memory hotplug configuration overridden in kernel cmdline (memhp_default_state=)"
  exit 1
fi

if [ "$VIRT" == "microsoft" ]; then
  echo "Detected Hyper-V on $ARCH"
  # Hyper-V wants all memory in ZONE_NORMAL
  ONLINE_TYPE="online_kernel"
elif sense_virtio_mem; then
  echo "Detected virtio-mem on $ARCH"
  # virtio-mem wants all memory in ZONE_NORMAL
  ONLINE_TYPE="online_kernel"
elif [ "$ARCH" == "s390x" ] || [ "$ARCH" == "s390" ]; then
  echo "Detected $ARCH"
  # standby memory should not be onlined automatically
  ONLINE_TYPE="offline"
elif [ "$ARCH" == "ppc64" ] || [ "$ARCH" == "ppc64le" ]; then
  echo "Detected $ARCH"
  # PPC64 onlines all hotplugged memory right from the kernel
  ONLINE_TYPE="offline"
elif [ "$VIRT" == "none" ]; then
  echo "Detected bare-metal on $ARCH"
  # Bare metal users expect hotplugged memory to be unpluggable. We assume
  # that ZONE imbalances on such enterprise servers cannot happen and is
  # properly documented
  ONLINE_TYPE="online_movable"
else
  # TODO: Hypervisors that want to unplug DIMMs and can guarantee that ZONE
  # imbalances won't happen
  echo "Detected $VIRT on $ARCH"
  # Usually, ballooning is used in virtual environments, so memory should go to
  # ZONE_NORMAL. However, sometimes "movable_node" is relevant.
  ONLINE_TYPE="online"
fi

echo "Selected online_type:" $ONLINE_TYPE

# Configure what to do with memory that will be hotplugged in the future
if ! echo "$ONLINE_TYPE" 2>/dev/null > /sys/devices/system/memory/auto_online_blocks; then
  echo "Memory hotplug cannot be configured (e.g., old kernel or missing permissions)"
  # A backup udev rule should handle old kernels if necessary
  exit 1
fi

# Process all already plugged blocks (e.g., DIMMs, but also Hyper-V or virtio-mem)
if [ "$ONLINE_TYPE" != "offline" ]; then
  for MEMORY in /sys/devices/system/memory/memory*; do
    STATE=$(cat "$MEMORY/state")
    if [ "$STATE" == "offline" ]; then
        echo "$ONLINE_TYPE" > "$MEMORY/state"
    fi
    fi
  done
fi
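
To check what the loop above did, one could summarize the block states
afterwards. This is only a sketch: count_block_states is a hypothetical
helper, and its optional argument is an assumption added so that it can be
run against a scratch directory instead of the real sysfs tree.

```shell
#!/bin/bash

# Print "<state> <count>" for every state found among the memory blocks.
# $1: directory holding memoryN/state files
#     (assumption: defaults to the real sysfs path)
count_block_states() {
  local root="${1:-/sys/devices/system/memory}"
  local block

  for block in "$root"/memory*; do
    # Skip entries without a readable state file (e.g., an unmatched glob)
    [ -r "$block/state" ] || continue
    cat "$block/state"
  done | sort | uniq -c | awk '{ print $2, $1 }'
}
```

Blocks that remain "offline" after the script ran would show up in this
summary and point at onlining requests the kernel rejected.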


=== Example /usr/lib/systemd/system/config-memhotplug.service ===

[Unit]
Description=Configure memory hotplug behavior
DefaultDependencies=no
Conflicts=shutdown.target
Before=sysinit.target shutdown.target
After=systemd-modules-load.service
ConditionPathExists=|/sys/devices/system/memory/auto_online_blocks

[Service]
ExecStart=/usr/libexec/config-memhotplug
Type=oneshot
TimeoutSec=0
RemainAfterExit=yes

[Install]
WantedBy=sysinit.target


=== Example modification to the 40-redhat.rules [2] ===

diff --git a/40-redhat.rules b/40-redhat.rules-new
index 2c690e5..168fd03 100644
--- a/40-redhat.rules
+++ b/40-redhat.rules-new
@@ -6,6 +6,9 @@ SUBSYSTEM=="cpu", ACTION=="add", TEST=="online", ATTR{online}=="0", ATTR{online}
 # Memory hotadd request
 SUBSYSTEM!="memory", GOTO="memory_hotplug_end"
 ACTION!="add", GOTO="memory_hotplug_end"
+# memory hotplug behavior configured
+PROGRAM=="grep online /sys/devices/system/memory/auto_online_blocks", GOTO="memory_hotplug_end"
+
 PROGRAM="/bin/uname -p", RESULT=="s390*", GOTO="memory_hotplug_end"

 ENV{.state}="online"

===
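
The added PROGRAM== check relies on "grep online" succeeding for "online",
"online_kernel" and "online_movable" but not for "offline", which does not
contain that substring. A sketch of the same test as a shell function
(auto_online_configured is a hypothetical name, and the optional file
argument is an assumption added for trying it outside of sysfs):

```shell
#!/bin/bash

# Succeed if the control file holds an online type other than "offline".
# $1: control file (assumption: defaults to the real sysfs path)
auto_online_configured() {
  local file="${1:-/sys/devices/system/memory/auto_online_blocks}"

  # A missing file (old kernel) counts as "not configured"
  grep -q online "$file" 2>/dev/null
}
```

With "offline" configured, or on kernels without the file, the function
fails, which corresponds to the rule falling through to the existing udev
logic.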


[1] https://github.com/lnykryn/systemd-rhel/pull/281
[2] https://github.com/lnykryn/systemd-rhel/blob/staging/rules/40-redhat.rules

Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Yumei Huang <yuhuang@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Milan Zamazal <mzamazal@redhat.com>

David Hildenbrand (8):
  drivers/base/memory: rename MMOP_ONLINE_KEEP to MMOP_ONLINE
  drivers/base/memory: map MMOP_OFFLINE to 0
  drivers/base/memory: store mapping between MMOP_* and string in an
    array
  powernv/memtrace: always online added memory blocks
  hv_balloon: don't check for memhp_auto_online manually
  mm/memory_hotplug: unexport memhp_auto_online
  mm/memory_hotplug: convert memhp_auto_online to store an online_type
  mm/memory_hotplug: allow to specify a default online_type

 arch/powerpc/platforms/powernv/memtrace.c | 14 ++---
 drivers/base/memory.c                     | 71 ++++++++++++-----------
 drivers/hv/hv_balloon.c                   | 25 ++++----
 include/linux/memory_hotplug.h            | 13 ++++-
 mm/memory_hotplug.c                       | 16 ++---
 5 files changed, 69 insertions(+), 70 deletions(-)

-- 
2.24.1

