linux-mm.kvack.org archive mirror
* [PATCH v1 0/2] drivers/base/memory: clarify some memory block properties
@ 2021-02-01 10:51 David Hildenbrand
  2021-02-01 10:51 ` [PATCH v1 1/2] drivers/base/memory: don't store phys_device in memory blocks David Hildenbrand
  2021-02-01 10:51 ` [PATCH v1 2/2] Documentation: sysfs/memory: clarify some memory block device properties David Hildenbrand
  0 siblings, 2 replies; 7+ messages in thread
From: David Hildenbrand @ 2021-02-01 10:51 UTC
  To: linux-kernel; +Cc: linux-mm, David Hildenbrand

Let's update parts of our documentation for
/sys/devices/system/memory/memoryX/ properties, especially stating which
properties are nowadays legacy interfaces.

David Hildenbrand (2):
  drivers/base/memory: don't store phys_device in memory blocks
  Documentation: sysfs/memory: clarify some memory block device
    properties

 .../ABI/testing/sysfs-devices-memory          | 58 ++++++++++++-------
 .../admin-guide/mm/memory-hotplug.rst         | 20 +++----
 drivers/base/memory.c                         | 23 +++-----
 include/linux/memory.h                        |  3 +-
 4 files changed, 56 insertions(+), 48 deletions(-)

-- 
2.29.2




* [PATCH v1 1/2] drivers/base/memory: don't store phys_device in memory blocks
  2021-02-01 10:51 [PATCH v1 0/2] drivers/base/memory: clarify some memory block properties David Hildenbrand
@ 2021-02-01 10:51 ` David Hildenbrand
  2021-02-01 13:00   ` kernel test robot
  2021-02-01 15:58   ` Michal Hocko
  2021-02-01 10:51 ` [PATCH v1 2/2] Documentation: sysfs/memory: clarify some memory block device properties David Hildenbrand
  1 sibling, 2 replies; 7+ messages in thread
From: David Hildenbrand @ 2021-02-01 10:51 UTC
  To: linux-kernel
  Cc: linux-mm, David Hildenbrand, Andrew Morton, Dave Hansen,
	Michal Hocko, Oscar Salvador, Greg Kroah-Hartman,
	Gerald Schaefer, Jonathan Corbet, Rafael J. Wysocki,
	Mauro Carvalho Chehab, Ilya Dryomov, Vaibhav Jain, Tom Rix,
	Geert Uytterhoeven, linux-doc

No need to store the value for each and every memory block, as we can
easily query the value at runtime. Reshuffle the members to optimize the
memory layout. Also, let's clarify what the interface once was used for
and why it's legacy nowadays.
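
To illustrate the layout point, here is a minimal userspace sketch, not
kernel code. It assumes that struct device is at least pointer-aligned
and uses a hypothetical stand-in (struct fake_device) for it; the helper
names are made up for illustration, and the real sizes depend on the
architecture and the kernel config.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct device: only its alignment matters here. */
struct fake_device {
	void *payload[4];
};

/* Member order before the patch, including the stored phys_device. */
struct memory_block_before {
	unsigned long start_section_nr;
	unsigned long state;
	int online_type;
	int phys_device;
	struct fake_device dev;
	int nid;
};

/* Member order after the patch: phys_device dropped, nid moved before dev. */
struct memory_block_after {
	unsigned long start_section_nr;
	unsigned long state;
	int online_type;
	int nid;
	struct fake_device dev;
};

/* Size of the modelled structure before the patch. */
size_t block_size_before(void)
{
	return sizeof(struct memory_block_before);
}

/* Size of the modelled structure after the patch. */
size_t block_size_after(void)
{
	return sizeof(struct memory_block_after);
}

/* Bytes saved per memory block in this model. */
size_t block_bytes_saved(void)
{
	return block_size_before() - block_size_after();
}
```

With this stand-in, a typical LP64 build should save more than just the
dropped int, because nid no longer forces tail padding after dev.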

"phys_device" was used on s390x in older versions of lsmem[2]/chmem[3],
back when they were still part of s390-tools. They were later replaced
by the variants in util-linux. For example, RHEL6 and RHEL7 contain
lsmem/chmem from s390-utils. RHEL8 switched to versions from util-linux
on s390x [4].

"phys_device" was added with sysfs support for memory hotplug in
commit 3947be1969a9 ("[PATCH] memory hotplug: sysfs and add/remove
functions") in 2005. It always returned 0.

s390x started returning something != 0 on some setups (if sclp.rzm is
set by HW) in 2010 via commit 57b552ba0b2f ("memory hotplug/s390: set
phys_device").

For s390x, it allowed for identifying which memory block devices belong
to the same storage increment (RZM). Only if all memory block devices
comprising a single storage increment were offline could the memory
actually be removed in the hypervisor.
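
The grouping can be modelled in userspace as follows. This is a sketch
under my own assumption that the increment number is simply the physical
start address divided by the increment size; model_phys_device() and its
parameters are hypothetical names, not the s390x implementation.

```c
#include <stdint.h>

/*
 * Userspace model, not the kernel implementation: map the first page frame
 * of a memory block to the number of the storage increment containing it.
 * page_size and increment_size are in bytes. An increment_size of 0 models
 * hardware that reports no increment size, in which case 0 is returned.
 */
uint64_t model_phys_device(uint64_t start_pfn, uint64_t page_size,
			   uint64_t increment_size)
{
	if (increment_size == 0)
		return 0;
	return (start_pfn * page_size) / increment_size;
}
```

For example, with 4 KiB pages, illustrative 128 MiB memory blocks and a
256 MiB increment, the blocks starting at PFN 65536 and PFN 98304 map to
the same increment, so both would have to be offline before that
increment could be removed.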

Since commit e5d709bb5fb7 ("s390/memory hotplug: provide
memory_block_size_bytes() function") in 2013, a memory block device
spans at least one storage increment, which is why the interface isn't
really helpful/used anymore (except by old lsmem/chmem tools).

There were once RFC patches to make use of "phys_device" in ACPI context;
however, the underlying problem could be solved using different
interfaces [1].

[1] https://patchwork.kernel.org/patch/2163871/
[2] https://github.com/ibm-s390-tools/s390-tools/blob/v2.1.0/zconf/lsmem
[3] https://github.com/ibm-s390-tools/s390-tools/blob/v2.1.0/zconf/chmem
[4] https://bugzilla.redhat.com/show_bug.cgi?id=1504134

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: Vaibhav Jain <vaibhav@linux.ibm.com>
Cc: Tom Rix <trix@redhat.com>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: linux-doc@vger.kernel.org
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 .../ABI/testing/sysfs-devices-memory          |  5 ++--
 .../admin-guide/mm/memory-hotplug.rst         |  4 ++--
 drivers/base/memory.c                         | 23 ++++++++-----------
 include/linux/memory.h                        |  3 +--
 4 files changed, 15 insertions(+), 20 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-devices-memory b/Documentation/ABI/testing/sysfs-devices-memory
index 246a45b96d22..58dbc592bc57 100644
--- a/Documentation/ABI/testing/sysfs-devices-memory
+++ b/Documentation/ABI/testing/sysfs-devices-memory
@@ -26,8 +26,9 @@ Date:		September 2008
 Contact:	Badari Pulavarty <pbadari@us.ibm.com>
 Description:
 		The file /sys/devices/system/memory/memoryX/phys_device
-		is read-only and is designed to show the name of physical
-		memory device.  Implementation is currently incomplete.
+		is read-only;  it is a legacy interface only ever used on s390x
+		to expose the covered storage increment.
+Users:		Legacy s390-tools lsmem/chmem
 
 What:		/sys/devices/system/memory/memoryX/phys_index
 Date:		September 2008
diff --git a/Documentation/admin-guide/mm/memory-hotplug.rst b/Documentation/admin-guide/mm/memory-hotplug.rst
index 5c4432c96c4b..245739f55ac7 100644
--- a/Documentation/admin-guide/mm/memory-hotplug.rst
+++ b/Documentation/admin-guide/mm/memory-hotplug.rst
@@ -160,8 +160,8 @@ Under each memory block, you can see 5 files:
 
                     "online_movable", "online", "offline" command
                     which will be performed on all sections in the block.
-``phys_device``     read-only: designed to show the name of physical memory
-                    device.  This is not well implemented now.
+``phys_device``	    read-only: legacy interface only ever used on s390x to
+		    expose the covered storage increment.
 ``removable``       read-only: contains an integer value indicating
                     whether the memory block is removable or not
                     removable.  A value of 1 indicates that the memory
diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index 901e379676be..16959d339172 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -290,20 +290,20 @@ static ssize_t state_store(struct device *dev, struct device_attribute *attr,
 }
 
 /*
- * phys_device is a bad name for this.  What I really want
- * is a way to differentiate between memory ranges that
- * are part of physical devices that constitute
- * a complete removable unit or fru.
- * i.e. do these ranges belong to the same physical device,
- * s.t. if I offline all of these sections I can then
- * remove the physical device?
+ * Legacy interface that we cannot remove: s390x exposes the storage increment
+ * covered by a memory block, allowing for identifying which memory blocks
+ * comprise a storage increment. Since a memory block spans complete
+ * storage increments nowadays, this interface is basically unused. Other
+ * archs never exposed != 0.
  */
 static ssize_t phys_device_show(struct device *dev,
 				struct device_attribute *attr, char *buf)
 {
 	struct memory_block *mem = to_memory_block(dev);
+	unsigned long start_pfn = section_nr_to_pfn(mem->start_section_nr);
 
-	return sysfs_emit(buf, "%d\n", mem->phys_device);
+	return sysfs_emit(buf, "%d\n",
+			  arch_get_memory_phys_device(start_pfn));
 }
 
 #ifdef CONFIG_MEMORY_HOTREMOVE
@@ -488,11 +488,7 @@ static DEVICE_ATTR_WO(soft_offline_page);
 static DEVICE_ATTR_WO(hard_offline_page);
 #endif
 
-/*
- * Note that phys_device is optional.  It is here to allow for
- * differentiation between which *physical* devices each
- * section belongs to...
- */
+/* See phys_device_show(). */
 int __weak arch_get_memory_phys_device(unsigned long start_pfn)
 {
 	return 0;
@@ -589,7 +585,6 @@ static int init_memory_block(unsigned long block_id, unsigned long state)
 	mem->start_section_nr = block_id * sections_per_block;
 	mem->state = state;
 	start_pfn = section_nr_to_pfn(mem->start_section_nr);
-	mem->phys_device = arch_get_memory_phys_device(start_pfn);
 	mem->nid = NUMA_NO_NODE;
 
 	ret = register_memory(mem);
diff --git a/include/linux/memory.h b/include/linux/memory.h
index 439a89e758d8..4da95e684e20 100644
--- a/include/linux/memory.h
+++ b/include/linux/memory.h
@@ -27,9 +27,8 @@ struct memory_block {
 	unsigned long start_section_nr;
 	unsigned long state;		/* serialized by the dev->lock */
 	int online_type;		/* for passing data to online routine */
-	int phys_device;		/* to which fru does this belong? */
-	struct device dev;
 	int nid;			/* NID for this memory block */
+	struct device dev;
 };
 
 int arch_get_memory_phys_device(unsigned long start_pfn);
-- 
2.29.2




* [PATCH v1 2/2] Documentation: sysfs/memory: clarify some memory block device properties
  2021-02-01 10:51 [PATCH v1 0/2] drivers/base/memory: clarify some memory block properties David Hildenbrand
  2021-02-01 10:51 ` [PATCH v1 1/2] drivers/base/memory: don't store phys_device in memory blocks David Hildenbrand
@ 2021-02-01 10:51 ` David Hildenbrand
  2021-02-01 16:00   ` Michal Hocko
  1 sibling, 1 reply; 7+ messages in thread
From: David Hildenbrand @ 2021-02-01 10:51 UTC
  To: linux-kernel
  Cc: linux-mm, David Hildenbrand, Andrew Morton, Dave Hansen,
	Michal Hocko, Oscar Salvador, Jonathan Corbet,
	Greg Kroah-Hartman, Jonathan Cameron, Ilya Dryomov,
	Mauro Carvalho Chehab, Geert Uytterhoeven, linux-doc

In commit 53cdc1cb29e8 ("drivers/base/memory.c: indicate all memory blocks
as removable") we changed the output of the "removable" property of memory
devices to return "1" if and only if the kernel supports memory offlining.

Let's update documentation, stating that the interface is legacy. Also
update documentation of the "state" and "valid_zones" properties.
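
As a usage illustration for the updated "valid_zones" text: for an
offline block, the first listed zone is the default used when writing
"online". A minimal sketch of extracting it in userspace follows; it
assumes the zones are whitespace-separated, and first_valid_zone() is a
hypothetical helper, not an existing API.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the first whitespace-separated token of a valid_zones string into
 * out. Returns the token length, or -1 if there is no token or it does
 * not fit into out (including the terminating NUL).
 */
int first_valid_zone(const char *valid_zones, char *out, size_t out_len)
{
	size_t start = strspn(valid_zones, " \t\n");
	size_t len = strcspn(valid_zones + start, " \t\n");

	if (len == 0 || len >= out_len)
		return -1;
	memcpy(out, valid_zones + start, len);
	out[len] = '\0';
	return (int)len;
}
```

A caller would read memoryX/valid_zones into a buffer and pass it in;
for a string such as "Normal Movable" the helper yields "Normal".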

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: David Hildenbrand <david@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: linux-doc@vger.kernel.org
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 .../ABI/testing/sysfs-devices-memory          | 53 ++++++++++++-------
 .../admin-guide/mm/memory-hotplug.rst         | 16 +++---
 2 files changed, 41 insertions(+), 28 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-devices-memory b/Documentation/ABI/testing/sysfs-devices-memory
index 58dbc592bc57..d8b0f80b9e33 100644
--- a/Documentation/ABI/testing/sysfs-devices-memory
+++ b/Documentation/ABI/testing/sysfs-devices-memory
@@ -13,13 +13,13 @@ What:		/sys/devices/system/memory/memoryX/removable
 Date:		June 2008
 Contact:	Badari Pulavarty <pbadari@us.ibm.com>
 Description:
-		The file /sys/devices/system/memory/memoryX/removable
-		indicates whether this memory block is removable or not.
-		This is useful for a user-level agent to determine
-		identify removable sections of the memory before attempting
-		potentially expensive hot-remove memory operation
+		The file /sys/devices/system/memory/memoryX/removable is a
+		legacy interface used to indicate whether a memory block is
+		likely to be offlineable or not.  Newer kernel versions return
+		"1" if and only if the kernel supports memory offlining.
 Users:		hotplug memory remove tools
 		http://www.ibm.com/developerworks/wikis/display/LinuxP/powerpc-utils
+		lsmem/chmem part of util-linux
 
 What:		/sys/devices/system/memory/memoryX/phys_device
 Date:		September 2008
@@ -44,23 +44,25 @@ Date:		September 2008
 Contact:	Badari Pulavarty <pbadari@us.ibm.com>
 Description:
 		The file /sys/devices/system/memory/memoryX/state
-		is read-write.  When read, its contents show the
-		online/offline state of the memory section.  When written,
-		root can toggle the the online/offline state of a removable
-		memory section (see removable file description above)
-		using the following commands::
+		is read-write.  When read, it returns the online/offline
+		state of the memory block.  When written, root can toggle
+		the online/offline state of a memory block using the following
+		commands::
 
 		  # echo online > /sys/devices/system/memory/memoryX/state
 		  # echo offline > /sys/devices/system/memory/memoryX/state
 
-		For example, if /sys/devices/system/memory/memory22/removable
-		contains a value of 1 and
-		/sys/devices/system/memory/memory22/state contains the
-		string "online" the following command can be executed by
-		by root to offline that section::
-
-		  # echo offline > /sys/devices/system/memory/memory22/state
-
+		On newer kernel versions, advanced states can be specified
+		when onlining to select a target zone: "online_movable"
+		selects the movable zone.  "online_kernel" selects the
+		applicable kernel zone (DMA, DMA32, or Normal).  However,
+		after successfully setting one of the advanced states,
+		reading the file will return "online"; the zone information
+		can be obtained via "valid_zones" instead.
+
+		While onlining is unlikely to fail, there are no guarantees
+		that offlining will succeed.  Offlining is more likely to
+		succeed if "valid_zones" indicates "Movable".
 Users:		hotplug memory remove tools
 		http://www.ibm.com/developerworks/wikis/display/LinuxP/powerpc-utils
 
@@ -70,8 +72,19 @@ Date:           July 2014
 Contact:	Zhang Zhen <zhenzhang.zhang@huawei.com>
 Description:
 		The file /sys/devices/system/memory/memoryX/valid_zones	is
-		read-only and is designed to show which zone this memory
-		block can be onlined to.
+		read-only.
+
+		For online memory blocks, it returns in which zone memory
+		provided by a memory block is managed.  If multiple zones
+		apply (not applicable for hotplugged memory), "None" is returned
+		and the memory block cannot be offlined.
+
+		For offline memory blocks, it returns by which zone memory
+		provided by a memory block can be managed when onlining.
+		The first returned zone ("default") will be used when setting
+		the state of an offline memory block to "online".  Only one of
+		the kernel zones (DMA, DMA32, Normal) is applicable for a single
+		memory block.
 
 What:		/sys/devices/system/memoryX/nodeY
 Date:		October 2009
diff --git a/Documentation/admin-guide/mm/memory-hotplug.rst b/Documentation/admin-guide/mm/memory-hotplug.rst
index 245739f55ac7..5307f90738aa 100644
--- a/Documentation/admin-guide/mm/memory-hotplug.rst
+++ b/Documentation/admin-guide/mm/memory-hotplug.rst
@@ -162,14 +162,14 @@ Under each memory block, you can see 5 files:
                     which will be performed on all sections in the block.
 ``phys_device``	    read-only: legacy interface only ever used on s390x to
 		    expose the covered storage increment.
-``removable``       read-only: contains an integer value indicating
-                    whether the memory block is removable or not
-                    removable.  A value of 1 indicates that the memory
-                    block is removable and a value of 0 indicates that
-                    it is not removable. A memory block is removable only if
-                    every section in the block is removable.
-``valid_zones``     read-only: designed to show which zones this memory block
-		    can be onlined to.
+``removable``	    read-only: legacy interface that indicated whether a memory
+		    block was likely to be offlineable or not.  Newer kernel
+		    versions return "1" if and only if the kernel supports
+		    memory offlining.
+``valid_zones``     read-only: designed to show by which zone memory provided by
+		    a memory block is managed, and to show by which zone memory
+		    provided by an offline memory block could be managed when
+		    onlining.
 
 		    The first column shows it`s default zone.
 
-- 
2.29.2




* Re: [PATCH v1 1/2] drivers/base/memory: don't store phys_device in memory blocks
  2021-02-01 10:51 ` [PATCH v1 1/2] drivers/base/memory: don't store phys_device in memory blocks David Hildenbrand
@ 2021-02-01 13:00   ` kernel test robot
  2021-02-01 13:27     ` David Hildenbrand
  2021-02-01 15:58   ` Michal Hocko
  1 sibling, 1 reply; 7+ messages in thread
From: kernel test robot @ 2021-02-01 13:00 UTC
  To: David Hildenbrand, linux-kernel
  Cc: kbuild-all, linux-mm, David Hildenbrand, Andrew Morton,
	Dave Hansen, Michal Hocko, Oscar Salvador, Greg Kroah-Hartman,
	Gerald Schaefer

[-- Attachment #1: Type: text/plain, Size: 3682 bytes --]

Hi David,

I love your patch! Perhaps something to improve:

[auto build test WARNING on linux/master]
[also build test WARNING on driver-core/driver-core-testing next-20210125]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/David-Hildenbrand/drivers-base-memory-clarify-some-memory-block-properties/20210201-185331
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 2ab38c17aac10bf55ab3efde4c4db3893d8691d2
config: x86_64-randconfig-s031-20210201 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-15) 9.3.0
reproduce:
        # apt-get install sparse
        # sparse version: v0.6.3-215-g0fb77bb6-dirty
        # https://github.com/0day-ci/linux/commit/614341d29c44f8965a604b9fd5d09eb0b652864c
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review David-Hildenbrand/drivers-base-memory-clarify-some-memory-block-properties/20210201-185331
        git checkout 614341d29c44f8965a604b9fd5d09eb0b652864c
        # save the attached .config to linux build tree
        make W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=x86_64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   drivers/base/memory.c: In function 'init_memory_block':
>> drivers/base/memory.c:573:16: warning: variable 'start_pfn' set but not used [-Wunused-but-set-variable]
     573 |  unsigned long start_pfn;
         |                ^~~~~~~~~


vim +/start_pfn +573 drivers/base/memory.c

96b2c0fc8e74a6 Nathan Fontenot   2013-06-04  569  
40ba2cde77e764 Wei Yang          2020-06-23  570  static int init_memory_block(unsigned long block_id, unsigned long state)
e4619c857d1d76 Nathan Fontenot   2010-10-19  571  {
0c2c99b1b8ab5d Nathan Fontenot   2011-01-20  572  	struct memory_block *mem;
e4619c857d1d76 Nathan Fontenot   2010-10-19 @573  	unsigned long start_pfn;
e4619c857d1d76 Nathan Fontenot   2010-10-19  574  	int ret = 0;
e4619c857d1d76 Nathan Fontenot   2010-10-19  575  
dd625285910d3c David Hildenbrand 2019-07-18  576  	mem = find_memory_block_by_id(block_id);
db051a0dac13db David Hildenbrand 2019-07-18  577  	if (mem) {
db051a0dac13db David Hildenbrand 2019-07-18  578  		put_device(&mem->dev);
db051a0dac13db David Hildenbrand 2019-07-18  579  		return -EEXIST;
db051a0dac13db David Hildenbrand 2019-07-18  580  	}
0c2c99b1b8ab5d Nathan Fontenot   2011-01-20  581  	mem = kzalloc(sizeof(*mem), GFP_KERNEL);
e4619c857d1d76 Nathan Fontenot   2010-10-19  582  	if (!mem)
e4619c857d1d76 Nathan Fontenot   2010-10-19  583  		return -ENOMEM;
e4619c857d1d76 Nathan Fontenot   2010-10-19  584  
1811582587c43b David Hildenbrand 2019-07-18  585  	mem->start_section_nr = block_id * sections_per_block;
e4619c857d1d76 Nathan Fontenot   2010-10-19  586  	mem->state = state;
d33601644cd3b0 Nathan Fontenot   2011-01-20  587  	start_pfn = section_nr_to_pfn(mem->start_section_nr);
d84f2f5a755208 David Hildenbrand 2019-09-23  588  	mem->nid = NUMA_NO_NODE;
e4619c857d1d76 Nathan Fontenot   2010-10-19  589  
0c2c99b1b8ab5d Nathan Fontenot   2011-01-20  590  	ret = register_memory(mem);
0c2c99b1b8ab5d Nathan Fontenot   2011-01-20  591  
0c2c99b1b8ab5d Nathan Fontenot   2011-01-20  592  	return ret;
0c2c99b1b8ab5d Nathan Fontenot   2011-01-20  593  }
0c2c99b1b8ab5d Nathan Fontenot   2011-01-20  594  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 31841 bytes --]


* Re: [PATCH v1 1/2] drivers/base/memory: don't store phys_device in memory blocks
  2021-02-01 13:00   ` kernel test robot
@ 2021-02-01 13:27     ` David Hildenbrand
  0 siblings, 0 replies; 7+ messages in thread
From: David Hildenbrand @ 2021-02-01 13:27 UTC
  To: kernel test robot, linux-kernel
  Cc: kbuild-all, linux-mm, Andrew Morton, Dave Hansen, Michal Hocko,
	Oscar Salvador, Greg Kroah-Hartman, Gerald Schaefer

[...]

> All warnings (new ones prefixed by >>):
> 
>     drivers/base/memory.c: In function 'init_memory_block':
>>> drivers/base/memory.c:573:16: warning: variable 'start_pfn' set but not used [-Wunused-but-set-variable]
>       573 |  unsigned long start_pfn;
>           |                ^~~~~~~~~

Indeed, we no longer need start_pfn in init_memory_block().

-- 
Thanks,

David / dhildenb




* Re: [PATCH v1 1/2] drivers/base/memory: don't store phys_device in memory blocks
  2021-02-01 10:51 ` [PATCH v1 1/2] drivers/base/memory: don't store phys_device in memory blocks David Hildenbrand
  2021-02-01 13:00   ` kernel test robot
@ 2021-02-01 15:58   ` Michal Hocko
  1 sibling, 0 replies; 7+ messages in thread
From: Michal Hocko @ 2021-02-01 15:58 UTC
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Andrew Morton, Dave Hansen,
	Oscar Salvador, Greg Kroah-Hartman, Gerald Schaefer,
	Jonathan Corbet, Rafael J. Wysocki, Mauro Carvalho Chehab,
	Ilya Dryomov, Vaibhav Jain, Tom Rix, Geert Uytterhoeven,
	linux-doc

On Mon 01-02-21 11:51:57, David Hildenbrand wrote:
> No need to store the value for each and every memory block, as we can
> easily query the value at runtime. Reshuffle the members to optimize the
> memory layout. Also, let's clarify what the interface once was used for
> and why it's legacy nowadays.
> 
> "phys_device" was used on s390x in older versions of lsmem[2]/chmem[3],
> back when they were still part of s390-tools. They were later replaced
> by the variants in util-linux. For example, RHEL6 and RHEL7 contain
> lsmem/chmem from s390-utils. RHEL8 switched to versions from util-linux
> on s390x [4].
> 
> "phys_device" was added with sysfs support for memory hotplug in
> commit 3947be1969a9 ("[PATCH] memory hotplug: sysfs and add/remove
> functions") in 2005. It always returned 0.
> 
> s390x started returning something != 0 on some setups (if sclp.rzm is
> set by HW) in 2010 via commit 57b552ba0b2f ("memory hotplug/s390: set
> phys_device").
> 
> For s390x, it allowed for identifying which memory block devices belong
> to the same storage increment (RZM). Only if all memory block devices
> comprising a single storage increment were offline could the memory
> actually be removed in the hypervisor.
> 
> Since commit e5d709bb5fb7 ("s390/memory hotplug: provide
> memory_block_size_bytes() function") in 2013, a memory block device
> spans at least one storage increment, which is why the interface isn't
> really helpful/used anymore (except by old lsmem/chmem tools).
> 
> There were once RFC patches to make use of "phys_device" in ACPI context;
> however, the underlying problem could be solved using different
> interfaces [1].
> 
> [1] https://patchwork.kernel.org/patch/2163871/
> [2] https://github.com/ibm-s390-tools/s390-tools/blob/v2.1.0/zconf/lsmem
> [3] https://github.com/ibm-s390-tools/s390-tools/blob/v2.1.0/zconf/chmem
> [4] https://bugzilla.redhat.com/show_bug.cgi?id=1504134

Thanks for an excellent changelog!
 
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Dave Hansen <dave.hansen@intel.com>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
> Cc: Jonathan Corbet <corbet@lwn.net>
> Cc: "Rafael J. Wysocki" <rafael@kernel.org>
> Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> Cc: Ilya Dryomov <idryomov@gmail.com>
> Cc: Vaibhav Jain <vaibhav@linux.ibm.com>
> Cc: Tom Rix <trix@redhat.com>
> Cc: Geert Uytterhoeven <geert+renesas@glider.be>
> Cc: linux-doc@vger.kernel.org
> Signed-off-by: David Hildenbrand <david@redhat.com>

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  .../ABI/testing/sysfs-devices-memory          |  5 ++--
>  .../admin-guide/mm/memory-hotplug.rst         |  4 ++--
>  drivers/base/memory.c                         | 23 ++++++++-----------
>  include/linux/memory.h                        |  3 +--
>  4 files changed, 15 insertions(+), 20 deletions(-)
> 
> diff --git a/Documentation/ABI/testing/sysfs-devices-memory b/Documentation/ABI/testing/sysfs-devices-memory
> index 246a45b96d22..58dbc592bc57 100644
> --- a/Documentation/ABI/testing/sysfs-devices-memory
> +++ b/Documentation/ABI/testing/sysfs-devices-memory
> @@ -26,8 +26,9 @@ Date:		September 2008
>  Contact:	Badari Pulavarty <pbadari@us.ibm.com>
>  Description:
>  		The file /sys/devices/system/memory/memoryX/phys_device
> -		is read-only and is designed to show the name of physical
> -		memory device.  Implementation is currently incomplete.
> +		is read-only;  it is a legacy interface only ever used on s390x
> +		to expose the covered storage increment.
> +Users:		Legacy s390-tools lsmem/chmem
>  
>  What:		/sys/devices/system/memory/memoryX/phys_index
>  Date:		September 2008
> diff --git a/Documentation/admin-guide/mm/memory-hotplug.rst b/Documentation/admin-guide/mm/memory-hotplug.rst
> index 5c4432c96c4b..245739f55ac7 100644
> --- a/Documentation/admin-guide/mm/memory-hotplug.rst
> +++ b/Documentation/admin-guide/mm/memory-hotplug.rst
> @@ -160,8 +160,8 @@ Under each memory block, you can see 5 files:
>  
>                      "online_movable", "online", "offline" command
>                      which will be performed on all sections in the block.
> -``phys_device``     read-only: designed to show the name of physical memory
> -                    device.  This is not well implemented now.
> +``phys_device``	    read-only: legacy interface only ever used on s390x to
> +		    expose the covered storage increment.
>  ``removable``       read-only: contains an integer value indicating
>                      whether the memory block is removable or not
>                      removable.  A value of 1 indicates that the memory
> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
> index 901e379676be..16959d339172 100644
> --- a/drivers/base/memory.c
> +++ b/drivers/base/memory.c
> @@ -290,20 +290,20 @@ static ssize_t state_store(struct device *dev, struct device_attribute *attr,
>  }
>  
>  /*
> - * phys_device is a bad name for this.  What I really want
> - * is a way to differentiate between memory ranges that
> - * are part of physical devices that constitute
> - * a complete removable unit or fru.
> - * i.e. do these ranges belong to the same physical device,
> - * s.t. if I offline all of these sections I can then
> - * remove the physical device?
> + * Legacy interface that we cannot remove: s390x exposes the storage increment
> + * covered by a memory block, allowing for identifying which memory blocks
> + * comprise a storage increment. Since a memory block spans complete
> + * storage increments nowadays, this interface is basically unused. Other
> + * archs never exposed != 0.
>   */
>  static ssize_t phys_device_show(struct device *dev,
>  				struct device_attribute *attr, char *buf)
>  {
>  	struct memory_block *mem = to_memory_block(dev);
> +	unsigned long start_pfn = section_nr_to_pfn(mem->start_section_nr);
>  
> -	return sysfs_emit(buf, "%d\n", mem->phys_device);
> +	return sysfs_emit(buf, "%d\n",
> +			  arch_get_memory_phys_device(start_pfn));
>  }
>  
>  #ifdef CONFIG_MEMORY_HOTREMOVE
> @@ -488,11 +488,7 @@ static DEVICE_ATTR_WO(soft_offline_page);
>  static DEVICE_ATTR_WO(hard_offline_page);
>  #endif
>  
> -/*
> - * Note that phys_device is optional.  It is here to allow for
> - * differentiation between which *physical* devices each
> - * section belongs to...
> - */
> +/* See phys_device_show(). */
>  int __weak arch_get_memory_phys_device(unsigned long start_pfn)
>  {
>  	return 0;
> @@ -589,7 +585,6 @@ static int init_memory_block(unsigned long block_id, unsigned long state)
>  	mem->start_section_nr = block_id * sections_per_block;
>  	mem->state = state;
>  	start_pfn = section_nr_to_pfn(mem->start_section_nr);
> -	mem->phys_device = arch_get_memory_phys_device(start_pfn);
>  	mem->nid = NUMA_NO_NODE;
>  
>  	ret = register_memory(mem);
> diff --git a/include/linux/memory.h b/include/linux/memory.h
> index 439a89e758d8..4da95e684e20 100644
> --- a/include/linux/memory.h
> +++ b/include/linux/memory.h
> @@ -27,9 +27,8 @@ struct memory_block {
>  	unsigned long start_section_nr;
>  	unsigned long state;		/* serialized by the dev->lock */
>  	int online_type;		/* for passing data to online routine */
> -	int phys_device;		/* to which fru does this belong? */
> -	struct device dev;
>  	int nid;			/* NID for this memory block */
> +	struct device dev;
>  };
>  
>  int arch_get_memory_phys_device(unsigned long start_pfn);
> -- 
> 2.29.2
> 

-- 
Michal Hocko
SUSE Labs



* Re: [PATCH v1 2/2] Documentation: sysfs/memory: clarify some memory block device properties
  2021-02-01 10:51 ` [PATCH v1 2/2] Documentation: sysfs/memory: clarify some memory block device properties David Hildenbrand
@ 2021-02-01 16:00   ` Michal Hocko
  0 siblings, 0 replies; 7+ messages in thread
From: Michal Hocko @ 2021-02-01 16:00 UTC
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Andrew Morton, Dave Hansen,
	Oscar Salvador, Jonathan Corbet, Greg Kroah-Hartman,
	Jonathan Cameron, Ilya Dryomov, Mauro Carvalho Chehab,
	Geert Uytterhoeven, linux-doc

On Mon 01-02-21 11:51:58, David Hildenbrand wrote:
> In commit 53cdc1cb29e8 ("drivers/base/memory.c: indicate all memory blocks
> as removable") we changed the output of the "removable" property of memory
> devices to return "1" if and only if the kernel supports memory offlining.
> 
> Let's update documentation, stating that the interface is legacy. Also
> update documentation of the "state" and "valid_zones" properties.
> 
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Dave Hansen <dave.hansen@intel.com>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Jonathan Corbet <corbet@lwn.net>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Cc: Ilya Dryomov <idryomov@gmail.com>
> Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> Cc: Geert Uytterhoeven <geert+renesas@glider.be>
> Cc: linux-doc@vger.kernel.org
> Signed-off-by: David Hildenbrand <david@redhat.com>

Acked-by: Michal Hocko <mhocko@suse.com>

Thanks!

> ---
>  .../ABI/testing/sysfs-devices-memory          | 53 ++++++++++++-------
>  .../admin-guide/mm/memory-hotplug.rst         | 16 +++---
>  2 files changed, 41 insertions(+), 28 deletions(-)
> 
> diff --git a/Documentation/ABI/testing/sysfs-devices-memory b/Documentation/ABI/testing/sysfs-devices-memory
> index 58dbc592bc57..d8b0f80b9e33 100644
> --- a/Documentation/ABI/testing/sysfs-devices-memory
> +++ b/Documentation/ABI/testing/sysfs-devices-memory
> @@ -13,13 +13,13 @@ What:		/sys/devices/system/memory/memoryX/removable
>  Date:		June 2008
>  Contact:	Badari Pulavarty <pbadari@us.ibm.com>
>  Description:
> -		The file /sys/devices/system/memory/memoryX/removable
> -		indicates whether this memory block is removable or not.
> -		This is useful for a user-level agent to determine
> -		identify removable sections of the memory before attempting
> -		potentially expensive hot-remove memory operation
> +		The file /sys/devices/system/memory/memoryX/removable is a
> +		legacy interface used to indicate whether a memory block is
> +		likely to be offlineable or not.  Newer kernel versions return
> +		"1" if and only if the kernel supports memory offlining.
>  Users:		hotplug memory remove tools
>  		http://www.ibm.com/developerworks/wikis/display/LinuxP/powerpc-utils
> +		lsmem/chmem part of util-linux
>  
>  What:		/sys/devices/system/memory/memoryX/phys_device
>  Date:		September 2008
> @@ -44,23 +44,25 @@ Date:		September 2008
>  Contact:	Badari Pulavarty <pbadari@us.ibm.com>
>  Description:
>  		The file /sys/devices/system/memory/memoryX/state
> -		is read-write.  When read, its contents show the
> -		online/offline state of the memory section.  When written,
> -		root can toggle the the online/offline state of a removable
> -		memory section (see removable file description above)
> -		using the following commands::
> +		is read-write.  When read, it returns the online/offline
> +		state of the memory block.  When written, root can toggle
> +		the online/offline state of a memory block using the following
> +		commands::
>  
>  		  # echo online > /sys/devices/system/memory/memoryX/state
>  		  # echo offline > /sys/devices/system/memory/memoryX/state
>  
> -		For example, if /sys/devices/system/memory/memory22/removable
> -		contains a value of 1 and
> -		/sys/devices/system/memory/memory22/state contains the
> -		string "online" the following command can be executed by
> -		by root to offline that section::
> -
> -		  # echo offline > /sys/devices/system/memory/memory22/state
> -
> +		On newer kernel versions, advanced states can be specified
> +		when onlining to select a target zone: "online_movable"
> +		selects the movable zone.  "online_kernel" selects the
> +		applicable kernel zone (DMA, DMA32, or Normal).  However,
> +		after successfully setting one of the advanced states,
> +		reading the file will return "online"; the zone information
> +		can be obtained via "valid_zones" instead.
> +
> +		While onlining is unlikely to fail, there are no guarantees
> +		that offlining will succeed.  Offlining is more likely to
> +		succeed if "valid_zones" indicates "Movable".
>  Users:		hotplug memory remove tools
>  		http://www.ibm.com/developerworks/wikis/display/LinuxP/powerpc-utils
>  
> @@ -70,8 +72,19 @@ Date:           July 2014
>  Contact:	Zhang Zhen <zhenzhang.zhang@huawei.com>
>  Description:
>  		The file /sys/devices/system/memory/memoryX/valid_zones	is
> -		read-only and is designed to show which zone this memory
> -		block can be onlined to.
> +		read-only.
> +
> +		For online memory blocks, it returns in which zone memory
> +		provided by a memory block is managed.  If multiple zones
> +		apply (not applicable for hotplugged memory), "None" is returned
> +		and the memory block cannot be offlined.
> +
> +		For offline memory blocks, it returns by which zone memory
> +		provided by a memory block can be managed when onlining.
> +		The first returned zone ("default") will be used when setting
> +		the state of an offline memory block to "online".  Only one of
> +		the kernel zones (DMA, DMA32, Normal) is applicable for a single
> +		memory block.
>  
>  What:		/sys/devices/system/memoryX/nodeY
>  Date:		October 2009
> diff --git a/Documentation/admin-guide/mm/memory-hotplug.rst b/Documentation/admin-guide/mm/memory-hotplug.rst
> index 245739f55ac7..5307f90738aa 100644
> --- a/Documentation/admin-guide/mm/memory-hotplug.rst
> +++ b/Documentation/admin-guide/mm/memory-hotplug.rst
> @@ -162,14 +162,14 @@ Under each memory block, you can see 5 files:
>                      which will be performed on all sections in the block.
>  ``phys_device``	    read-only: legacy interface only ever used on s390x to
>  		    expose the covered storage increment.
> -``removable``       read-only: contains an integer value indicating
> -                    whether the memory block is removable or not
> -                    removable.  A value of 1 indicates that the memory
> -                    block is removable and a value of 0 indicates that
> -                    it is not removable. A memory block is removable only if
> -                    every section in the block is removable.
> -``valid_zones``     read-only: designed to show which zones this memory block
> -		    can be onlined to.
> +``removable``	    read-only: legacy interface that indicated whether a memory
> +		    block was likely to be offlineable or not.  Newer kernel
> +		    versions return "1" if and only if the kernel supports
> +		    memory offlining.
> +``valid_zones``     read-only: designed to show by which zone memory provided by
> +		    a memory block is managed, and to show by which zone memory
> +		    provided by an offline memory block could be managed when
> +		    onlining.
>  
>  		    The first column shows its default zone.
>  
> -- 
> 2.29.2
> 
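
As a rough illustration of how user space might consume the properties
described in the quoted documentation, here is a minimal sketch in POSIX
shell. The function names (memblock_default_zone, memblock_offline_hint)
are hypothetical and not part of util-linux or powerpc-utils. The sketch
only relies on what the text above states: the first entry of
"valid_zones" is the default zone, and offlining is more likely to
succeed when "valid_zones" indicates "Movable". It deliberately ignores
"removable", since that is now a legacy interface.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; $1 is a memory block
# directory such as /sys/devices/system/memory/memory22.

# Print the first zone listed in valid_zones. For an offline block this
# is the zone that "echo online > state" would select by default.
memblock_default_zone() {
    read -r first _rest < "$1/valid_zones" || [ -n "$first" ] || return 1
    printf '%s\n' "$first"
}

# Exit 0 if the block is online and managed by the Movable zone, the
# case the documentation describes as more likely to offline cleanly.
# This is a hint only; offlining can still fail.
memblock_offline_hint() {
    read -r state < "$1/state" || [ -n "$state" ] || return 2
    [ "$state" = "online" ] || return 1
    [ "$(memblock_default_zone "$1")" = "Movable" ]
}
```

A caller could run memblock_offline_hint on a block before writing
"offline" to its state file, and should still check the result of that
write, since the hint gives no guarantee.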

-- 
Michal Hocko
SUSE Labs



end of thread, other threads:[~2021-02-01 16:01 UTC | newest]

Thread overview: 7+ messages
2021-02-01 10:51 [PATCH v1 0/2] drivers/base/memory: clarify some memory block properties David Hildenbrand
2021-02-01 10:51 ` [PATCH v1 1/2] drivers/base/memory: don't store phys_device in memory blocks David Hildenbrand
2021-02-01 13:00   ` kernel test robot
2021-02-01 13:27     ` David Hildenbrand
2021-02-01 15:58   ` Michal Hocko
2021-02-01 10:51 ` [PATCH v1 2/2] Documentation: sysfs/memory: clarify some memory block device properties David Hildenbrand
2021-02-01 16:00   ` Michal Hocko
