linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v1 0/3] kernel/resource: make walk_system_ram_res() and walk_mem_res() search the whole tree
@ 2021-03-22 16:01 David Hildenbrand
  2021-03-22 16:01 ` [PATCH v1 1/3] kernel/resource: make walk_system_ram_res() find all busy IORESOURCE_SYSTEM_RAM resources David Hildenbrand
                   ` (3 more replies)
  0 siblings, 4 replies; 19+ messages in thread
From: David Hildenbrand @ 2021-03-22 16:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, David Hildenbrand, Andrew Morton, Andy Shevchenko,
	Baoquan He, Borislav Petkov, Brijesh Singh, Daniel Vetter,
	Dan Williams, Dave Hansen, Dave Young, Eric Biederman,
	Greg Kroah-Hartman, H. Peter Anvin, Ingo Molnar, Keith Busch,
	Mauro Carvalho Chehab, Michal Hocko, Oscar Salvador, Qian Cai,
	Thomas Gleixner, Tom Lendacky, Vivek Goyal

Playing with kdump+virtio-mem I noticed that kexec_file_load() does not
consider System RAM added via dax/kmem and virtio-mem when preparing the
elf header for kdump. Looking into the details, the logic used in
walk_system_ram_res() and walk_mem_res() seems to be outdated.

walk_system_ram_range() already does the right thing, let's change
walk_system_ram_res() and walk_mem_res(), and clean up.

Loading a kdump kernel via "kexec -p -s" ... will result in the kdump
kernel to also dump dax/kmem and virtio-mem added System RAM now.

Note: kexec-tools on x86-64 also have to be updated to consider this
memory in the kexec_load() case when processing /proc/iomem.

Against next-20210322.

David Hildenbrand (3):
  kernel/resource: make walk_system_ram_res() find all busy
    IORESOURCE_SYSTEM_RAM resources
  kernel/resource: make walk_mem_res() find all busy IORESOURCE_MEM
    resources
  kernel/resource: remove first_lvl / siblings_only logic

 kernel/resource.c | 45 ++++++++++++---------------------------------
 1 file changed, 12 insertions(+), 33 deletions(-)

-- 
2.29.2



^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v1 1/3] kernel/resource: make walk_system_ram_res() find all busy IORESOURCE_SYSTEM_RAM resources
  2021-03-22 16:01 [PATCH v1 0/3] kernel/resource: make walk_system_ram_res() and walk_mem_res() search the whole tree David Hildenbrand
@ 2021-03-22 16:01 ` David Hildenbrand
  2021-03-22 16:10   ` Dan Williams
                     ` (4 more replies)
  2021-03-22 16:01 ` [PATCH v1 2/3] kernel/resource: make walk_mem_res() find all busy IORESOURCE_MEM resources David Hildenbrand
                   ` (2 subsequent siblings)
  3 siblings, 5 replies; 19+ messages in thread
From: David Hildenbrand @ 2021-03-22 16:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, David Hildenbrand, Andrew Morton, Greg Kroah-Hartman,
	Dan Williams, Daniel Vetter, Andy Shevchenko,
	Mauro Carvalho Chehab, Dave Young, Baoquan He, Vivek Goyal,
	Dave Hansen, Keith Busch, Michal Hocko, Qian Cai, Oscar Salvador,
	Eric Biederman, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H. Peter Anvin, Tom Lendacky, Brijesh Singh, x86, kexec

It used to be true that we can have busy system RAM only on the first level
in the resourc tree. However, this is no longer holds for driver-managed
system RAM (i.e., added via dax/kmem and virtio-mem), which gets added on
lower levels.

We have two users of walk_system_ram_res(), which currently only
consideres the first level:
a) kernel/kexec_file.c:kexec_walk_resources() -- We properly skip
   IORESOURCE_SYSRAM_DRIVER_MANAGED resources via
   locate_mem_hole_callback(), so even after this change, we won't be
   placing kexec images onto dax/kmem and virtio-mem added memory. No
   change.
b) arch/x86/kernel/crash.c:fill_up_crash_elf_data() -- we're currently
   not adding relevant ranges to the crash elf info, resulting in them
   not getting dumped via kdump.

This change fixes loading a crashkernel via kexec_file_load() and including
dax/kmem and virtio-mem added System RAM in the crashdump on x86-64. Note
that e.g,, arm64 relies on memblock data and, therefore, always considers
all added System RAM already.

Let's find all busy IORESOURCE_SYSTEM_RAM resources, making the function
behave like walk_system_ram_range().

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Cc: Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: x86@kernel.org
Cc: kexec@lists.infradead.org
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 kernel/resource.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/resource.c b/kernel/resource.c
index 627e61b0c124..4efd6e912279 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -457,7 +457,7 @@ int walk_system_ram_res(u64 start, u64 end, void *arg,
 {
 	unsigned long flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
 
-	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, true,
+	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, false,
 				     arg, func);
 }
 
-- 
2.29.2



^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v1 2/3] kernel/resource: make walk_mem_res() find all busy IORESOURCE_MEM resources
  2021-03-22 16:01 [PATCH v1 0/3] kernel/resource: make walk_system_ram_res() and walk_mem_res() search the whole tree David Hildenbrand
  2021-03-22 16:01 ` [PATCH v1 1/3] kernel/resource: make walk_system_ram_res() find all busy IORESOURCE_SYSTEM_RAM resources David Hildenbrand
@ 2021-03-22 16:01 ` David Hildenbrand
  2021-03-22 16:11   ` Dan Williams
  2021-03-23 11:08   ` Andy Shevchenko
  2021-03-22 16:02 ` [PATCH v1 3/3] kernel/resource: remove first_lvl / siblings_only logic David Hildenbrand
  2021-03-23 11:05 ` [PATCH v1 0/3] kernel/resource: make walk_system_ram_res() and walk_mem_res() search the whole tree Andy Shevchenko
  3 siblings, 2 replies; 19+ messages in thread
From: David Hildenbrand @ 2021-03-22 16:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, David Hildenbrand, Andrew Morton, Greg Kroah-Hartman,
	Dan Williams, Daniel Vetter, Andy Shevchenko,
	Mauro Carvalho Chehab, Dave Young, Baoquan He, Vivek Goyal,
	Dave Hansen, Keith Busch, Michal Hocko, Qian Cai, Oscar Salvador,
	Eric Biederman, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H. Peter Anvin, Tom Lendacky, Brijesh Singh, x86, kexec

It used to be true that we can have system RAM only on the first level
in the resourc tree. However, this is no longer holds for driver-managed
system RAM (i.e., dax/kmem and virtio-mem).

The function walk_mem_res() only consideres the first level and is
used in arch/x86/mm/ioremap.c:__ioremap_check_mem() only. We currently
fail to identify System RAM added by dax/kmem and virtio-mem as
"IORES_MAP_SYSTEM_RAM", for example, allowing for remapping of such
"normal RAM" in __ioremap_caller().

Let's find all busy IORESOURCE_MEM resources, making the function
behave similar to walk_system_ram_res().

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Cc: Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: x86@kernel.org
Cc: kexec@lists.infradead.org
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 kernel/resource.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/resource.c b/kernel/resource.c
index 4efd6e912279..16e0c7e8ed24 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -470,7 +470,7 @@ int walk_mem_res(u64 start, u64 end, void *arg,
 {
 	unsigned long flags = IORESOURCE_MEM | IORESOURCE_BUSY;
 
-	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, true,
+	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, false,
 				     arg, func);
 }
 
-- 
2.29.2



^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v1 3/3] kernel/resource: remove first_lvl / siblings_only logic
  2021-03-22 16:01 [PATCH v1 0/3] kernel/resource: make walk_system_ram_res() and walk_mem_res() search the whole tree David Hildenbrand
  2021-03-22 16:01 ` [PATCH v1 1/3] kernel/resource: make walk_system_ram_res() find all busy IORESOURCE_SYSTEM_RAM resources David Hildenbrand
  2021-03-22 16:01 ` [PATCH v1 2/3] kernel/resource: make walk_mem_res() find all busy IORESOURCE_MEM resources David Hildenbrand
@ 2021-03-22 16:02 ` David Hildenbrand
  2021-03-22 16:12   ` Dan Williams
  2021-03-23 11:11   ` Andy Shevchenko
  2021-03-23 11:05 ` [PATCH v1 0/3] kernel/resource: make walk_system_ram_res() and walk_mem_res() search the whole tree Andy Shevchenko
  3 siblings, 2 replies; 19+ messages in thread
From: David Hildenbrand @ 2021-03-22 16:02 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, David Hildenbrand, Andrew Morton, Greg Kroah-Hartman,
	Dan Williams, Daniel Vetter, Andy Shevchenko,
	Mauro Carvalho Chehab, Dave Young, Baoquan He, Vivek Goyal,
	Dave Hansen, Keith Busch, Michal Hocko, Qian Cai, Oscar Salvador,
	Eric Biederman, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H. Peter Anvin, Tom Lendacky, Brijesh Singh, x86, kexec

All IORESOURCE_SYSTEM_RAM and IORESOURCE_MEM now properly consider the
whole resource tree, not just the first level. Let's drop the unused
first_lvl / siblings_only logic.

All functions properly search the whole tree, so remove documentation
that indicates that some functions behave differently.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Cc: Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: x86@kernel.org
Cc: kexec@lists.infradead.org
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 kernel/resource.c | 45 ++++++++++++---------------------------------
 1 file changed, 12 insertions(+), 33 deletions(-)

diff --git a/kernel/resource.c b/kernel/resource.c
index 16e0c7e8ed24..7e00239a023a 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -64,12 +64,8 @@ static DEFINE_RWLOCK(resource_lock);
 static struct resource *bootmem_resource_free;
 static DEFINE_SPINLOCK(bootmem_resource_lock);
 
-static struct resource *next_resource(struct resource *p, bool sibling_only)
+static struct resource *next_resource(struct resource *p)
 {
-	/* Caller wants to traverse through siblings only */
-	if (sibling_only)
-		return p->sibling;
-
 	if (p->child)
 		return p->child;
 	while (!p->sibling && p->parent)
@@ -81,7 +77,7 @@ static void *r_next(struct seq_file *m, void *v, loff_t *pos)
 {
 	struct resource *p = v;
 	(*pos)++;
-	return (void *)next_resource(p, false);
+	return (void *)next_resource(p);
 }
 
 #ifdef CONFIG_PROC_FS
@@ -330,14 +326,10 @@ EXPORT_SYMBOL(release_resource);
  * of the resource that's within [@start..@end]; if none is found, returns
  * -ENODEV.  Returns -EINVAL for invalid parameters.
  *
- * This function walks the whole tree and not just first level children
- * unless @first_lvl is true.
- *
  * @start:	start address of the resource searched for
  * @end:	end address of same resource
  * @flags:	flags which the resource must have
  * @desc:	descriptor the resource must have
- * @first_lvl:	walk only the first level children, if set
  * @res:	return ptr, if resource found
  *
  * The caller must specify @start, @end, @flags, and @desc
@@ -345,9 +337,8 @@ EXPORT_SYMBOL(release_resource);
  */
 static int find_next_iomem_res(resource_size_t start, resource_size_t end,
 			       unsigned long flags, unsigned long desc,
-			       bool first_lvl, struct resource *res)
+			       struct resource *res)
 {
-	bool siblings_only = true;
 	struct resource *p;
 
 	if (!res)
@@ -358,7 +349,7 @@ static int find_next_iomem_res(resource_size_t start, resource_size_t end,
 
 	read_lock(&resource_lock);
 
-	for (p = iomem_resource.child; p; p = next_resource(p, siblings_only)) {
+	for (p = iomem_resource.child; p; p = next_resource(p)) {
 		/* If we passed the resource we are looking for, stop */
 		if (p->start > end) {
 			p = NULL;
@@ -369,13 +360,6 @@ static int find_next_iomem_res(resource_size_t start, resource_size_t end,
 		if (p->end < start)
 			continue;
 
-		/*
-		 * Now that we found a range that matches what we look for,
-		 * check the flags and the descriptor. If we were not asked to
-		 * use only the first level, start looking at children as well.
-		 */
-		siblings_only = first_lvl;
-
 		if ((p->flags & flags) != flags)
 			continue;
 		if ((desc != IORES_DESC_NONE) && (desc != p->desc))
@@ -402,14 +386,14 @@ static int find_next_iomem_res(resource_size_t start, resource_size_t end,
 
 static int __walk_iomem_res_desc(resource_size_t start, resource_size_t end,
 				 unsigned long flags, unsigned long desc,
-				 bool first_lvl, void *arg,
+				 void *arg,
 				 int (*func)(struct resource *, void *))
 {
 	struct resource res;
 	int ret = -EINVAL;
 
 	while (start < end &&
-	       !find_next_iomem_res(start, end, flags, desc, first_lvl, &res)) {
+	       !find_next_iomem_res(start, end, flags, desc, &res)) {
 		ret = (*func)(&res, arg);
 		if (ret)
 			break;
@@ -431,7 +415,6 @@ static int __walk_iomem_res_desc(resource_size_t start, resource_size_t end,
  * @arg: function argument for the callback @func
  * @func: callback function that is called for each qualifying resource area
  *
- * This walks through whole tree and not just first level children.
  * All the memory ranges which overlap start,end and also match flags and
  * desc are valid candidates.
  *
@@ -441,7 +424,7 @@ static int __walk_iomem_res_desc(resource_size_t start, resource_size_t end,
 int walk_iomem_res_desc(unsigned long desc, unsigned long flags, u64 start,
 		u64 end, void *arg, int (*func)(struct resource *, void *))
 {
-	return __walk_iomem_res_desc(start, end, flags, desc, false, arg, func);
+	return __walk_iomem_res_desc(start, end, flags, desc, arg, func);
 }
 EXPORT_SYMBOL_GPL(walk_iomem_res_desc);
 
@@ -457,8 +440,8 @@ int walk_system_ram_res(u64 start, u64 end, void *arg,
 {
 	unsigned long flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
 
-	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, false,
-				     arg, func);
+	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, arg,
+				     func);
 }
 
 /*
@@ -470,17 +453,14 @@ int walk_mem_res(u64 start, u64 end, void *arg,
 {
 	unsigned long flags = IORESOURCE_MEM | IORESOURCE_BUSY;
 
-	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, false,
-				     arg, func);
+	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, arg,
+				     func);
 }
 
 /*
  * This function calls the @func callback against all memory ranges of type
  * System RAM which are marked as IORESOURCE_SYSTEM_RAM and IORESOUCE_BUSY.
  * It is to be used only for System RAM.
- *
- * This will find System RAM ranges that are children of top-level resources
- * in addition to top-level System RAM resources.
  */
 int walk_system_ram_range(unsigned long start_pfn, unsigned long nr_pages,
 			  void *arg, int (*func)(unsigned long, unsigned long, void *))
@@ -495,8 +475,7 @@ int walk_system_ram_range(unsigned long start_pfn, unsigned long nr_pages,
 	end = ((u64)(start_pfn + nr_pages) << PAGE_SHIFT) - 1;
 	flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
 	while (start < end &&
-	       !find_next_iomem_res(start, end, flags, IORES_DESC_NONE,
-				    false, &res)) {
+	       !find_next_iomem_res(start, end, flags, IORES_DESC_NONE, &res)) {
 		pfn = PFN_UP(res.start);
 		end_pfn = PFN_DOWN(res.end + 1);
 		if (end_pfn > pfn)
-- 
2.29.2



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v1 1/3] kernel/resource: make walk_system_ram_res() find all busy IORESOURCE_SYSTEM_RAM resources
  2021-03-22 16:01 ` [PATCH v1 1/3] kernel/resource: make walk_system_ram_res() find all busy IORESOURCE_SYSTEM_RAM resources David Hildenbrand
@ 2021-03-22 16:10   ` Dan Williams
  2021-03-23  9:40   ` David Hildenbrand
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 19+ messages in thread
From: Dan Williams @ 2021-03-22 16:10 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Linux Kernel Mailing List, Linux MM, Andrew Morton,
	Greg Kroah-Hartman, Daniel Vetter, Andy Shevchenko,
	Mauro Carvalho Chehab, Dave Young, Baoquan He, Vivek Goyal,
	Dave Hansen, Keith Busch, Michal Hocko, Qian Cai, Oscar Salvador,
	Eric Biederman, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H. Peter Anvin, Tom Lendacky, Brijesh Singh, X86 ML,
	Kexec Mailing List

On Mon, Mar 22, 2021 at 9:02 AM David Hildenbrand <david@redhat.com> wrote:
>
> It used to be true that we can have busy system RAM only on the first level
> in the resourc tree. However, this is no longer holds for driver-managed
> system RAM (i.e., added via dax/kmem and virtio-mem), which gets added on
> lower levels.
>
> We have two users of walk_system_ram_res(), which currently only
> consideres the first level:
> a) kernel/kexec_file.c:kexec_walk_resources() -- We properly skip
>    IORESOURCE_SYSRAM_DRIVER_MANAGED resources via
>    locate_mem_hole_callback(), so even after this change, we won't be
>    placing kexec images onto dax/kmem and virtio-mem added memory. No
>    change.
> b) arch/x86/kernel/crash.c:fill_up_crash_elf_data() -- we're currently
>    not adding relevant ranges to the crash elf info, resulting in them
>    not getting dumped via kdump.
>
> This change fixes loading a crashkernel via kexec_file_load() and including
> dax/kmem and virtio-mem added System RAM in the crashdump on x86-64. Note
> that e.g,, arm64 relies on memblock data and, therefore, always considers
> all added System RAM already.
>
> Let's find all busy IORESOURCE_SYSTEM_RAM resources, making the function
> behave like walk_system_ram_range().
>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> Cc: Signed-off-by: David Hildenbrand <david@redhat.com>
> Cc: Dave Young <dyoung@redhat.com>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: Vivek Goyal <vgoyal@redhat.com>
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Keith Busch <keith.busch@intel.com>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Qian Cai <cai@lca.pw>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Eric Biederman <ebiederm@xmission.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Tom Lendacky <thomas.lendacky@amd.com>
> Cc: Brijesh Singh <brijesh.singh@amd.com>
> Cc: x86@kernel.org
> Cc: kexec@lists.infradead.org
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  kernel/resource.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/resource.c b/kernel/resource.c
> index 627e61b0c124..4efd6e912279 100644
> --- a/kernel/resource.c
> +++ b/kernel/resource.c
> @@ -457,7 +457,7 @@ int walk_system_ram_res(u64 start, u64 end, void *arg,
>  {
>         unsigned long flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
>
> -       return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, true,
> +       return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, false,
>                                      arg, func);

Looks good,

Reviewed-by: Dan Williams <dan.j.williams@intel.com>


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v1 2/3] kernel/resource: make walk_mem_res() find all busy IORESOURCE_MEM resources
  2021-03-22 16:01 ` [PATCH v1 2/3] kernel/resource: make walk_mem_res() find all busy IORESOURCE_MEM resources David Hildenbrand
@ 2021-03-22 16:11   ` Dan Williams
  2021-03-23 11:08   ` Andy Shevchenko
  1 sibling, 0 replies; 19+ messages in thread
From: Dan Williams @ 2021-03-22 16:11 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Linux Kernel Mailing List, Linux MM, Andrew Morton,
	Greg Kroah-Hartman, Daniel Vetter, Andy Shevchenko,
	Mauro Carvalho Chehab, Dave Young, Baoquan He, Vivek Goyal,
	Dave Hansen, Keith Busch, Michal Hocko, Qian Cai, Oscar Salvador,
	Eric Biederman, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H. Peter Anvin, Tom Lendacky, Brijesh Singh, X86 ML,
	Kexec Mailing List

On Mon, Mar 22, 2021 at 9:02 AM David Hildenbrand <david@redhat.com> wrote:
>
> It used to be true that we can have system RAM only on the first level
> in the resourc tree. However, this is no longer holds for driver-managed
> system RAM (i.e., dax/kmem and virtio-mem).
>
> The function walk_mem_res() only consideres the first level and is
> used in arch/x86/mm/ioremap.c:__ioremap_check_mem() only. We currently
> fail to identify System RAM added by dax/kmem and virtio-mem as
> "IORES_MAP_SYSTEM_RAM", for example, allowing for remapping of such
> "normal RAM" in __ioremap_caller().
>
> Let's find all busy IORESOURCE_MEM resources, making the function
> behave similar to walk_system_ram_res().

Looks good,

Reviewed-by: Dan Williams <dan.j.williams@intel.com>


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v1 3/3] kernel/resource: remove first_lvl / siblings_only logic
  2021-03-22 16:02 ` [PATCH v1 3/3] kernel/resource: remove first_lvl / siblings_only logic David Hildenbrand
@ 2021-03-22 16:12   ` Dan Williams
  2021-03-23 11:11   ` Andy Shevchenko
  1 sibling, 0 replies; 19+ messages in thread
From: Dan Williams @ 2021-03-22 16:12 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Linux Kernel Mailing List, Linux MM, Andrew Morton,
	Greg Kroah-Hartman, Daniel Vetter, Andy Shevchenko,
	Mauro Carvalho Chehab, Dave Young, Baoquan He, Vivek Goyal,
	Dave Hansen, Keith Busch, Michal Hocko, Qian Cai, Oscar Salvador,
	Eric Biederman, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H. Peter Anvin, Tom Lendacky, Brijesh Singh, X86 ML,
	Kexec Mailing List

On Mon, Mar 22, 2021 at 9:03 AM David Hildenbrand <david@redhat.com> wrote:
>
> All IORESOURCE_SYSTEM_RAM and IORESOURCE_MEM now properly consider the
> whole resource tree, not just the first level. Let's drop the unused
> first_lvl / siblings_only logic.
>
> All functions properly search the whole tree, so remove documentation
> that indicates that some functions behave differently.

Looks good, and the staging of the potential regressions as standalone
lead-in patches makes sense.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v1 1/3] kernel/resource: make walk_system_ram_res() find all busy IORESOURCE_SYSTEM_RAM resources
  2021-03-22 16:01 ` [PATCH v1 1/3] kernel/resource: make walk_system_ram_res() find all busy IORESOURCE_SYSTEM_RAM resources David Hildenbrand
  2021-03-22 16:10   ` Dan Williams
@ 2021-03-23  9:40   ` David Hildenbrand
  2021-03-23 11:07     ` Andy Shevchenko
  2021-03-23 11:06   ` Andy Shevchenko
                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 19+ messages in thread
From: David Hildenbrand @ 2021-03-23  9:40 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, Andrew Morton, Greg Kroah-Hartman, Dan Williams,
	Daniel Vetter, Andy Shevchenko, Mauro Carvalho Chehab,
	Dave Young, Baoquan He, Vivek Goyal, Dave Hansen, Keith Busch,
	Michal Hocko, Qian Cai, Oscar Salvador, Eric Biederman,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	Tom Lendacky, Brijesh Singh, x86, kexec

On 22.03.21 17:01, David Hildenbrand wrote:
> It used to be true that we can have busy system RAM only on the first level
> in the resourc tree. However, this is no longer holds for driver-managed
> system RAM (i.e., added via dax/kmem and virtio-mem), which gets added on
> lower levels.
> 
> We have two users of walk_system_ram_res(), which currently only
> consideres the first level:
> a) kernel/kexec_file.c:kexec_walk_resources() -- We properly skip
>     IORESOURCE_SYSRAM_DRIVER_MANAGED resources via
>     locate_mem_hole_callback(), so even after this change, we won't be
>     placing kexec images onto dax/kmem and virtio-mem added memory. No
>     change.
> b) arch/x86/kernel/crash.c:fill_up_crash_elf_data() -- we're currently
>     not adding relevant ranges to the crash elf info, resulting in them
>     not getting dumped via kdump.
> 
> This change fixes loading a crashkernel via kexec_file_load() and including
> dax/kmem and virtio-mem added System RAM in the crashdump on x86-64. Note
> that e.g,, arm64 relies on memblock data and, therefore, always considers
> all added System RAM already.
> 
> Let's find all busy IORESOURCE_SYSTEM_RAM resources, making the function
> behave like walk_system_ram_range().
> 
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> Cc: Signed-off-by: David Hildenbrand <david@redhat.com>

^ My copy-paste action when creating the cc list slipped in a duplicate 
  SO in all 3 patches. I can resend if desired.

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v1 0/3] kernel/resource: make walk_system_ram_res() and walk_mem_res() search the whole tree
  2021-03-22 16:01 [PATCH v1 0/3] kernel/resource: make walk_system_ram_res() and walk_mem_res() search the whole tree David Hildenbrand
                   ` (2 preceding siblings ...)
  2021-03-22 16:02 ` [PATCH v1 3/3] kernel/resource: remove first_lvl / siblings_only logic David Hildenbrand
@ 2021-03-23 11:05 ` Andy Shevchenko
  3 siblings, 0 replies; 19+ messages in thread
From: Andy Shevchenko @ 2021-03-23 11:05 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Andrew Morton, Baoquan He,
	Borislav Petkov, Brijesh Singh, Daniel Vetter, Dan Williams,
	Dave Hansen, Dave Young, Eric Biederman, Greg Kroah-Hartman,
	H. Peter Anvin, Ingo Molnar, Keith Busch, Mauro Carvalho Chehab,
	Michal Hocko, Oscar Salvador, Qian Cai, Thomas Gleixner,
	Tom Lendacky, Vivek Goyal

On Mon, Mar 22, 2021 at 05:01:57PM +0100, David Hildenbrand wrote:
> Playing with kdump+virtio-mem I noticed that kexec_file_load() does not
> consider System RAM added via dax/kmem and virtio-mem when preparing the
> elf header for kdump. Looking into the details, the logic used in
> walk_system_ram_res() and walk_mem_res() seems to be outdated.
> 
> walk_system_ram_range() already does the right thing, let's change
> walk_system_ram_res() and walk_mem_res(), and clean up.
> 
> Loading a kdump kernel via "kexec -p -s" ... will result in the kdump
> kernel to also dump dax/kmem and virtio-mem added System RAM now.
> 
> Note: kexec-tools on x86-64 also have to be updated to consider this
> memory in the kexec_load() case when processing /proc/iomem.


> Against next-20210322.

Hint: `git format-patch --base ...`

> 
> David Hildenbrand (3):
>   kernel/resource: make walk_system_ram_res() find all busy
>     IORESOURCE_SYSTEM_RAM resources
>   kernel/resource: make walk_mem_res() find all busy IORESOURCE_MEM
>     resources
>   kernel/resource: remove first_lvl / siblings_only logic
> 
>  kernel/resource.c | 45 ++++++++++++---------------------------------
>  1 file changed, 12 insertions(+), 33 deletions(-)
> 
> -- 
> 2.29.2
> 

-- 
With Best Regards,
Andy Shevchenko




^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v1 1/3] kernel/resource: make walk_system_ram_res() find all busy IORESOURCE_SYSTEM_RAM resources
  2021-03-22 16:01 ` [PATCH v1 1/3] kernel/resource: make walk_system_ram_res() find all busy IORESOURCE_SYSTEM_RAM resources David Hildenbrand
  2021-03-22 16:10   ` Dan Williams
  2021-03-23  9:40   ` David Hildenbrand
@ 2021-03-23 11:06   ` Andy Shevchenko
  2021-03-23 11:25     ` David Hildenbrand
  2021-03-23 14:33   ` Baoquan He
  2021-03-24 11:18   ` Oscar Salvador
  4 siblings, 1 reply; 19+ messages in thread
From: Andy Shevchenko @ 2021-03-23 11:06 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Andrew Morton, Greg Kroah-Hartman,
	Dan Williams, Daniel Vetter, Mauro Carvalho Chehab, Dave Young,
	Baoquan He, Vivek Goyal, Dave Hansen, Keith Busch, Michal Hocko,
	Qian Cai, Oscar Salvador, Eric Biederman, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, H. Peter Anvin, Tom Lendacky,
	Brijesh Singh, x86, kexec

On Mon, Mar 22, 2021 at 05:01:58PM +0100, David Hildenbrand wrote:
> It used to be true that we can have busy system RAM only on the first level
> in the resourc tree. However, this is no longer holds for driver-managed
> system RAM (i.e., added via dax/kmem and virtio-mem), which gets added on
> lower levels.
> 
> We have two users of walk_system_ram_res(), which currently only
> consideres the first level:
> a) kernel/kexec_file.c:kexec_walk_resources() -- We properly skip
>    IORESOURCE_SYSRAM_DRIVER_MANAGED resources via
>    locate_mem_hole_callback(), so even after this change, we won't be
>    placing kexec images onto dax/kmem and virtio-mem added memory. No
>    change.
> b) arch/x86/kernel/crash.c:fill_up_crash_elf_data() -- we're currently
>    not adding relevant ranges to the crash elf info, resulting in them
>    not getting dumped via kdump.
> 
> This change fixes loading a crashkernel via kexec_file_load() and including

"...fixes..." effectively means to me that Fixes tag should be provided.

> dax/kmem and virtio-mem added System RAM in the crashdump on x86-64. Note
> that e.g,, arm64 relies on memblock data and, therefore, always considers
> all added System RAM already.
> 
> Let's find all busy IORESOURCE_SYSTEM_RAM resources, making the function
> behave like walk_system_ram_range().
> 
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> Cc: Signed-off-by: David Hildenbrand <david@redhat.com>
> Cc: Dave Young <dyoung@redhat.com>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: Vivek Goyal <vgoyal@redhat.com>
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Keith Busch <keith.busch@intel.com>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Qian Cai <cai@lca.pw>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Eric Biederman <ebiederm@xmission.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Tom Lendacky <thomas.lendacky@amd.com>
> Cc: Brijesh Singh <brijesh.singh@amd.com>
> Cc: x86@kernel.org
> Cc: kexec@lists.infradead.org
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  kernel/resource.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/resource.c b/kernel/resource.c
> index 627e61b0c124..4efd6e912279 100644
> --- a/kernel/resource.c
> +++ b/kernel/resource.c
> @@ -457,7 +457,7 @@ int walk_system_ram_res(u64 start, u64 end, void *arg,
>  {
>  	unsigned long flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
>  
> -	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, true,
> +	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, false,
>  				     arg, func);
>  }
>  
> -- 
> 2.29.2
> 

-- 
With Best Regards,
Andy Shevchenko




^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v1 1/3] kernel/resource: make walk_system_ram_res() find all busy IORESOURCE_SYSTEM_RAM resources
  2021-03-23  9:40   ` David Hildenbrand
@ 2021-03-23 11:07     ` Andy Shevchenko
  0 siblings, 0 replies; 19+ messages in thread
From: Andy Shevchenko @ 2021-03-23 11:07 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Andrew Morton, Greg Kroah-Hartman,
	Dan Williams, Daniel Vetter, Mauro Carvalho Chehab, Dave Young,
	Baoquan He, Vivek Goyal, Dave Hansen, Keith Busch, Michal Hocko,
	Qian Cai, Oscar Salvador, Eric Biederman, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, H. Peter Anvin, Tom Lendacky,
	Brijesh Singh, x86, kexec

On Tue, Mar 23, 2021 at 10:40:33AM +0100, David Hildenbrand wrote:
> On 22.03.21 17:01, David Hildenbrand wrote:
> > It used to be true that we can have busy system RAM only on the first level
> > in the resourc tree. However, this is no longer holds for driver-managed
> > system RAM (i.e., added via dax/kmem and virtio-mem), which gets added on
> > lower levels.
> > 
> > We have two users of walk_system_ram_res(), which currently only
> > consideres the first level:
> > a) kernel/kexec_file.c:kexec_walk_resources() -- We properly skip
> >     IORESOURCE_SYSRAM_DRIVER_MANAGED resources via
> >     locate_mem_hole_callback(), so even after this change, we won't be
> >     placing kexec images onto dax/kmem and virtio-mem added memory. No
> >     change.
> > b) arch/x86/kernel/crash.c:fill_up_crash_elf_data() -- we're currently
> >     not adding relevant ranges to the crash elf info, resulting in them
> >     not getting dumped via kdump.
> > 
> > This change fixes loading a crashkernel via kexec_file_load() and including
> > dax/kmem and virtio-mem added System RAM in the crashdump on x86-64. Note
> > that e.g,, arm64 relies on memblock data and, therefore, always considers
> > all added System RAM already.
> > 
> > Let's find all busy IORESOURCE_SYSTEM_RAM resources, making the function
> > behave like walk_system_ram_range().
> > 
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > Cc: Dan Williams <dan.j.williams@intel.com>
> > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> > Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> > Cc: Signed-off-by: David Hildenbrand <david@redhat.com>
> 
> ^ My copy-paste action when creating the cc list slipped in a duplicate  SO
> in all 3 patches. I can resend if desired.

I think to address my comments you will need to resend anyway (as v2).

-- 
With Best Regards,
Andy Shevchenko




^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v1 2/3] kernel/resource: make walk_mem_res() find all busy IORESOURCE_MEM resources
  2021-03-22 16:01 ` [PATCH v1 2/3] kernel/resource: make walk_mem_res() find all busy IORESOURCE_MEM resources David Hildenbrand
  2021-03-22 16:11   ` Dan Williams
@ 2021-03-23 11:08   ` Andy Shevchenko
  2021-03-23 11:26     ` David Hildenbrand
  1 sibling, 1 reply; 19+ messages in thread
From: Andy Shevchenko @ 2021-03-23 11:08 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Andrew Morton, Greg Kroah-Hartman,
	Dan Williams, Daniel Vetter, Mauro Carvalho Chehab, Dave Young,
	Baoquan He, Vivek Goyal, Dave Hansen, Keith Busch, Michal Hocko,
	Qian Cai, Oscar Salvador, Eric Biederman, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, H. Peter Anvin, Tom Lendacky,
	Brijesh Singh, x86, kexec

On Mon, Mar 22, 2021 at 05:01:59PM +0100, David Hildenbrand wrote:
> It used to be true that we can have system RAM only on the first level
> in the resourc tree. However, this is no longer holds for driver-managed
> system RAM (i.e., dax/kmem and virtio-mem).
> 
> The function walk_mem_res() only consideres the first level and is
> used in arch/x86/mm/ioremap.c:__ioremap_check_mem() only. We currently
> fail to identify System RAM added by dax/kmem and virtio-mem as
> "IORES_MAP_SYSTEM_RAM", for example, allowing for remapping of such
> "normal RAM" in __ioremap_caller().

Here I dunno, but consider to add Fixes tag if it fixes known bad behaviour.

> Let's find all busy IORESOURCE_MEM resources, making the function
> behave similar to walk_system_ram_res().
> 
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> Cc: Signed-off-by: David Hildenbrand <david@redhat.com>
> Cc: Dave Young <dyoung@redhat.com>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: Vivek Goyal <vgoyal@redhat.com>
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Keith Busch <keith.busch@intel.com>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Qian Cai <cai@lca.pw>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Eric Biederman <ebiederm@xmission.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Tom Lendacky <thomas.lendacky@amd.com>
> Cc: Brijesh Singh <brijesh.singh@amd.com>
> Cc: x86@kernel.org
> Cc: kexec@lists.infradead.org
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  kernel/resource.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/resource.c b/kernel/resource.c
> index 4efd6e912279..16e0c7e8ed24 100644
> --- a/kernel/resource.c
> +++ b/kernel/resource.c
> @@ -470,7 +470,7 @@ int walk_mem_res(u64 start, u64 end, void *arg,
>  {
>  	unsigned long flags = IORESOURCE_MEM | IORESOURCE_BUSY;
>  
> -	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, true,
> +	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, false,
>  				     arg, func);
>  }
>  
> -- 
> 2.29.2
> 

-- 
With Best Regards,
Andy Shevchenko




^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v1 3/3] kernel/resource: remove first_lvl / siblings_only logic
  2021-03-22 16:02 ` [PATCH v1 3/3] kernel/resource: remove first_lvl / siblings_only logic David Hildenbrand
  2021-03-22 16:12   ` Dan Williams
@ 2021-03-23 11:11   ` Andy Shevchenko
  2021-03-23 11:19     ` David Hildenbrand
  1 sibling, 1 reply; 19+ messages in thread
From: Andy Shevchenko @ 2021-03-23 11:11 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Andrew Morton, Greg Kroah-Hartman,
	Dan Williams, Daniel Vetter, Mauro Carvalho Chehab, Dave Young,
	Baoquan He, Vivek Goyal, Dave Hansen, Keith Busch, Michal Hocko,
	Qian Cai, Oscar Salvador, Eric Biederman, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, H. Peter Anvin, Tom Lendacky,
	Brijesh Singh, x86, kexec

On Mon, Mar 22, 2021 at 05:02:00PM +0100, David Hildenbrand wrote:
> All IORESOURCE_SYSTEM_RAM and IORESOURCE_MEM now properly consider the
> whole resource tree, not just the first level. Let's drop the unused
> first_lvl / siblings_only logic.
> 
> All functions properly search the whole tree, so remove documentation
> that indicates that some functions behave differently.


Like this clean up!
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>

Although a few nit-picks below.

> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> Cc: Signed-off-by: David Hildenbrand <david@redhat.com>
> Cc: Dave Young <dyoung@redhat.com>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: Vivek Goyal <vgoyal@redhat.com>
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Keith Busch <keith.busch@intel.com>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Qian Cai <cai@lca.pw>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Eric Biederman <ebiederm@xmission.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Tom Lendacky <thomas.lendacky@amd.com>
> Cc: Brijesh Singh <brijesh.singh@amd.com>
> Cc: x86@kernel.org
> Cc: kexec@lists.infradead.org
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  kernel/resource.c | 45 ++++++++++++---------------------------------
>  1 file changed, 12 insertions(+), 33 deletions(-)
> 
> diff --git a/kernel/resource.c b/kernel/resource.c
> index 16e0c7e8ed24..7e00239a023a 100644
> --- a/kernel/resource.c
> +++ b/kernel/resource.c
> @@ -64,12 +64,8 @@ static DEFINE_RWLOCK(resource_lock);
>  static struct resource *bootmem_resource_free;
>  static DEFINE_SPINLOCK(bootmem_resource_lock);
>  
> -static struct resource *next_resource(struct resource *p, bool sibling_only)
> +static struct resource *next_resource(struct resource *p)
>  {
> -	/* Caller wants to traverse through siblings only */
> -	if (sibling_only)
> -		return p->sibling;
> -
>  	if (p->child)
>  		return p->child;
>  	while (!p->sibling && p->parent)
> @@ -81,7 +77,7 @@ static void *r_next(struct seq_file *m, void *v, loff_t *pos)
>  {
>  	struct resource *p = v;
>  	(*pos)++;
> -	return (void *)next_resource(p, false);
> +	return (void *)next_resource(p);
>  }
>  
>  #ifdef CONFIG_PROC_FS
> @@ -330,14 +326,10 @@ EXPORT_SYMBOL(release_resource);
>   * of the resource that's within [@start..@end]; if none is found, returns
>   * -ENODEV.  Returns -EINVAL for invalid parameters.
>   *
> - * This function walks the whole tree and not just first level children
> - * unless @first_lvl is true.
> - *
>   * @start:	start address of the resource searched for
>   * @end:	end address of same resource
>   * @flags:	flags which the resource must have
>   * @desc:	descriptor the resource must have
> - * @first_lvl:	walk only the first level children, if set
>   * @res:	return ptr, if resource found
>   *
>   * The caller must specify @start, @end, @flags, and @desc
> @@ -345,9 +337,8 @@ EXPORT_SYMBOL(release_resource);
>   */
>  static int find_next_iomem_res(resource_size_t start, resource_size_t end,
>  			       unsigned long flags, unsigned long desc,
> -			       bool first_lvl, struct resource *res)
> +			       struct resource *res)
>  {
> -	bool siblings_only = true;
>  	struct resource *p;
>  
>  	if (!res)
> @@ -358,7 +349,7 @@ static int find_next_iomem_res(resource_size_t start, resource_size_t end,
>  
>  	read_lock(&resource_lock);
>  
> -	for (p = iomem_resource.child; p; p = next_resource(p, siblings_only)) {
> +	for (p = iomem_resource.child; p; p = next_resource(p)) {
>  		/* If we passed the resource we are looking for, stop */
>  		if (p->start > end) {
>  			p = NULL;
> @@ -369,13 +360,6 @@ static int find_next_iomem_res(resource_size_t start, resource_size_t end,
>  		if (p->end < start)
>  			continue;
>  
> -		/*
> -		 * Now that we found a range that matches what we look for,
> -		 * check the flags and the descriptor. If we were not asked to
> -		 * use only the first level, start looking at children as well.
> -		 */
> -		siblings_only = first_lvl;
> -
>  		if ((p->flags & flags) != flags)
>  			continue;
>  		if ((desc != IORES_DESC_NONE) && (desc != p->desc))
> @@ -402,14 +386,14 @@ static int find_next_iomem_res(resource_size_t start, resource_size_t end,
>  
>  static int __walk_iomem_res_desc(resource_size_t start, resource_size_t end,
>  				 unsigned long flags, unsigned long desc,
> -				 bool first_lvl, void *arg,

> +				 void *arg,
>  				 int (*func)(struct resource *, void *))

Can it be one line?

>  {
>  	struct resource res;
>  	int ret = -EINVAL;
>  
>  	while (start < end &&
> -	       !find_next_iomem_res(start, end, flags, desc, first_lvl, &res)) {
> +	       !find_next_iomem_res(start, end, flags, desc, &res)) {
>  		ret = (*func)(&res, arg);
>  		if (ret)
>  			break;
> @@ -431,7 +415,6 @@ static int __walk_iomem_res_desc(resource_size_t start, resource_size_t end,
>   * @arg: function argument for the callback @func
>   * @func: callback function that is called for each qualifying resource area
>   *
> - * This walks through whole tree and not just first level children.
>   * All the memory ranges which overlap start,end and also match flags and
>   * desc are valid candidates.
>   *
> @@ -441,7 +424,7 @@ static int __walk_iomem_res_desc(resource_size_t start, resource_size_t end,
>  int walk_iomem_res_desc(unsigned long desc, unsigned long flags, u64 start,
>  		u64 end, void *arg, int (*func)(struct resource *, void *))
>  {
> -	return __walk_iomem_res_desc(start, end, flags, desc, false, arg, func);
> +	return __walk_iomem_res_desc(start, end, flags, desc, arg, func);
>  }
>  EXPORT_SYMBOL_GPL(walk_iomem_res_desc);
>  
> @@ -457,8 +440,8 @@ int walk_system_ram_res(u64 start, u64 end, void *arg,
>  {
>  	unsigned long flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
>  
> -	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, false,
> -				     arg, func);

> +	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, arg,
> +				     func);

I guess you may do it on one line.

>  }
>  
>  /*
> @@ -470,17 +453,14 @@ int walk_mem_res(u64 start, u64 end, void *arg,
>  {
>  	unsigned long flags = IORESOURCE_MEM | IORESOURCE_BUSY;
>  
> -	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, false,
> -				     arg, func);
> +	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, arg,
> +				     func);

Ditto.

>  }
>  
>  /*
>   * This function calls the @func callback against all memory ranges of type
>   * System RAM which are marked as IORESOURCE_SYSTEM_RAM and IORESOUCE_BUSY.
>   * It is to be used only for System RAM.
> - *
> - * This will find System RAM ranges that are children of top-level resources
> - * in addition to top-level System RAM resources.
>   */
>  int walk_system_ram_range(unsigned long start_pfn, unsigned long nr_pages,
>  			  void *arg, int (*func)(unsigned long, unsigned long, void *))
> @@ -495,8 +475,7 @@ int walk_system_ram_range(unsigned long start_pfn, unsigned long nr_pages,
>  	end = ((u64)(start_pfn + nr_pages) << PAGE_SHIFT) - 1;
>  	flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
>  	while (start < end &&
> -	       !find_next_iomem_res(start, end, flags, IORES_DESC_NONE,
> -				    false, &res)) {
> +	       !find_next_iomem_res(start, end, flags, IORES_DESC_NONE, &res)) {
>  		pfn = PFN_UP(res.start);
>  		end_pfn = PFN_DOWN(res.end + 1);
>  		if (end_pfn > pfn)
> -- 
> 2.29.2
> 

-- 
With Best Regards,
Andy Shevchenko




^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v1 3/3] kernel/resource: remove first_lvl / siblings_only logic
  2021-03-23 11:11   ` Andy Shevchenko
@ 2021-03-23 11:19     ` David Hildenbrand
  0 siblings, 0 replies; 19+ messages in thread
From: David Hildenbrand @ 2021-03-23 11:19 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: linux-kernel, linux-mm, Andrew Morton, Greg Kroah-Hartman,
	Dan Williams, Daniel Vetter, Mauro Carvalho Chehab, Dave Young,
	Baoquan He, Vivek Goyal, Dave Hansen, Keith Busch, Michal Hocko,
	Qian Cai, Oscar Salvador, Eric Biederman, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, H. Peter Anvin, Tom Lendacky,
	Brijesh Singh, x86, kexec

On 23.03.21 12:11, Andy Shevchenko wrote:
> On Mon, Mar 22, 2021 at 05:02:00PM +0100, David Hildenbrand wrote:
>> All IORESOURCE_SYSTEM_RAM and IORESOURCE_MEM now properly consider the
>> whole resource tree, not just the first level. Let's drop the unused
>> first_lvl / siblings_only logic.
>>
>> All functions properly search the whole tree, so remove documentation
>> that indicates that some functions behave differently.
> 
> 
> Like this clean up!
> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> 
> Although a few nit-picks below.
> 
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>> Cc: Dan Williams <dan.j.williams@intel.com>
>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
>> Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
>> Cc: Signed-off-by: David Hildenbrand <david@redhat.com>
>> Cc: Dave Young <dyoung@redhat.com>
>> Cc: Baoquan He <bhe@redhat.com>
>> Cc: Vivek Goyal <vgoyal@redhat.com>
>> Cc: Dave Hansen <dave.hansen@linux.intel.com>
>> Cc: Keith Busch <keith.busch@intel.com>
>> Cc: Michal Hocko <mhocko@suse.com>
>> Cc: Qian Cai <cai@lca.pw>
>> Cc: Oscar Salvador <osalvador@suse.de>
>> Cc: Eric Biederman <ebiederm@xmission.com>
>> Cc: Thomas Gleixner <tglx@linutronix.de>
>> Cc: Ingo Molnar <mingo@redhat.com>
>> Cc: Borislav Petkov <bp@alien8.de>
>> Cc: "H. Peter Anvin" <hpa@zytor.com>
>> Cc: Tom Lendacky <thomas.lendacky@amd.com>
>> Cc: Brijesh Singh <brijesh.singh@amd.com>
>> Cc: x86@kernel.org
>> Cc: kexec@lists.infradead.org
>> Signed-off-by: David Hildenbrand <david@redhat.com>
>> ---
>>   kernel/resource.c | 45 ++++++++++++---------------------------------
>>   1 file changed, 12 insertions(+), 33 deletions(-)
>>
>> diff --git a/kernel/resource.c b/kernel/resource.c
>> index 16e0c7e8ed24..7e00239a023a 100644
>> --- a/kernel/resource.c
>> +++ b/kernel/resource.c
>> @@ -64,12 +64,8 @@ static DEFINE_RWLOCK(resource_lock);
>>   static struct resource *bootmem_resource_free;
>>   static DEFINE_SPINLOCK(bootmem_resource_lock);
>>   
>> -static struct resource *next_resource(struct resource *p, bool sibling_only)
>> +static struct resource *next_resource(struct resource *p)
>>   {
>> -	/* Caller wants to traverse through siblings only */
>> -	if (sibling_only)
>> -		return p->sibling;
>> -
>>   	if (p->child)
>>   		return p->child;
>>   	while (!p->sibling && p->parent)
>> @@ -81,7 +77,7 @@ static void *r_next(struct seq_file *m, void *v, loff_t *pos)
>>   {
>>   	struct resource *p = v;
>>   	(*pos)++;
>> -	return (void *)next_resource(p, false);
>> +	return (void *)next_resource(p);
>>   }
>>   
>>   #ifdef CONFIG_PROC_FS
>> @@ -330,14 +326,10 @@ EXPORT_SYMBOL(release_resource);
>>    * of the resource that's within [@start..@end]; if none is found, returns
>>    * -ENODEV.  Returns -EINVAL for invalid parameters.
>>    *
>> - * This function walks the whole tree and not just first level children
>> - * unless @first_lvl is true.
>> - *
>>    * @start:	start address of the resource searched for
>>    * @end:	end address of same resource
>>    * @flags:	flags which the resource must have
>>    * @desc:	descriptor the resource must have
>> - * @first_lvl:	walk only the first level children, if set
>>    * @res:	return ptr, if resource found
>>    *
>>    * The caller must specify @start, @end, @flags, and @desc
>> @@ -345,9 +337,8 @@ EXPORT_SYMBOL(release_resource);
>>    */
>>   static int find_next_iomem_res(resource_size_t start, resource_size_t end,
>>   			       unsigned long flags, unsigned long desc,
>> -			       bool first_lvl, struct resource *res)
>> +			       struct resource *res)
>>   {
>> -	bool siblings_only = true;
>>   	struct resource *p;
>>   
>>   	if (!res)
>> @@ -358,7 +349,7 @@ static int find_next_iomem_res(resource_size_t start, resource_size_t end,
>>   
>>   	read_lock(&resource_lock);
>>   
>> -	for (p = iomem_resource.child; p; p = next_resource(p, siblings_only)) {
>> +	for (p = iomem_resource.child; p; p = next_resource(p)) {
>>   		/* If we passed the resource we are looking for, stop */
>>   		if (p->start > end) {
>>   			p = NULL;
>> @@ -369,13 +360,6 @@ static int find_next_iomem_res(resource_size_t start, resource_size_t end,
>>   		if (p->end < start)
>>   			continue;
>>   
>> -		/*
>> -		 * Now that we found a range that matches what we look for,
>> -		 * check the flags and the descriptor. If we were not asked to
>> -		 * use only the first level, start looking at children as well.
>> -		 */
>> -		siblings_only = first_lvl;
>> -
>>   		if ((p->flags & flags) != flags)
>>   			continue;
>>   		if ((desc != IORES_DESC_NONE) && (desc != p->desc))
>> @@ -402,14 +386,14 @@ static int find_next_iomem_res(resource_size_t start, resource_size_t end,
>>   
>>   static int __walk_iomem_res_desc(resource_size_t start, resource_size_t end,
>>   				 unsigned long flags, unsigned long desc,
>> -				 bool first_lvl, void *arg,
> 
>> +				 void *arg,
>>   				 int (*func)(struct resource *, void *))
> 
> Can it be one line?
> 
>>   {
>>   	struct resource res;
>>   	int ret = -EINVAL;
>>   
>>   	while (start < end &&
>> -	       !find_next_iomem_res(start, end, flags, desc, first_lvl, &res)) {
>> +	       !find_next_iomem_res(start, end, flags, desc, &res)) {
>>   		ret = (*func)(&res, arg);
>>   		if (ret)
>>   			break;
>> @@ -431,7 +415,6 @@ static int __walk_iomem_res_desc(resource_size_t start, resource_size_t end,
>>    * @arg: function argument for the callback @func
>>    * @func: callback function that is called for each qualifying resource area
>>    *
>> - * This walks through whole tree and not just first level children.
>>    * All the memory ranges which overlap start,end and also match flags and
>>    * desc are valid candidates.
>>    *
>> @@ -441,7 +424,7 @@ static int __walk_iomem_res_desc(resource_size_t start, resource_size_t end,
>>   int walk_iomem_res_desc(unsigned long desc, unsigned long flags, u64 start,
>>   		u64 end, void *arg, int (*func)(struct resource *, void *))
>>   {
>> -	return __walk_iomem_res_desc(start, end, flags, desc, false, arg, func);
>> +	return __walk_iomem_res_desc(start, end, flags, desc, arg, func);
>>   }
>>   EXPORT_SYMBOL_GPL(walk_iomem_res_desc);
>>   
>> @@ -457,8 +440,8 @@ int walk_system_ram_res(u64 start, u64 end, void *arg,
>>   {
>>   	unsigned long flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
>>   
>> -	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, false,
>> -				     arg, func);
> 
>> +	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, arg,
>> +				     func);
> 
> I guess you may do it on one line.
> 
>>   }
>>   
>>   /*
>> @@ -470,17 +453,14 @@ int walk_mem_res(u64 start, u64 end, void *arg,
>>   {
>>   	unsigned long flags = IORESOURCE_MEM | IORESOURCE_BUSY;
>>   
>> -	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, false,
>> -				     arg, func);
>> +	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, arg,
>> +				     func);
> 
> Ditto.

To all your comments:

"The preferred limit on the length of a single line is 80 columns."

"Statements longer than 80 columns should be broken into sensible chunks 
... unless exceeding 80 columns significantly increases readability"

I don't think it significantly increases readability.


-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v1 1/3] kernel/resource: make walk_system_ram_res() find all busy IORESOURCE_SYSTEM_RAM resources
  2021-03-23 11:06   ` Andy Shevchenko
@ 2021-03-23 11:25     ` David Hildenbrand
  0 siblings, 0 replies; 19+ messages in thread
From: David Hildenbrand @ 2021-03-23 11:25 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: linux-kernel, linux-mm, Andrew Morton, Greg Kroah-Hartman,
	Dan Williams, Daniel Vetter, Mauro Carvalho Chehab, Dave Young,
	Baoquan He, Vivek Goyal, Dave Hansen, Keith Busch, Michal Hocko,
	Qian Cai, Oscar Salvador, Eric Biederman, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, H. Peter Anvin, Tom Lendacky,
	Brijesh Singh, x86, kexec

On 23.03.21 12:06, Andy Shevchenko wrote:
> On Mon, Mar 22, 2021 at 05:01:58PM +0100, David Hildenbrand wrote:
>> It used to be true that we can have busy system RAM only on the first level
>> in the resourc tree. However, this is no longer holds for driver-managed
>> system RAM (i.e., added via dax/kmem and virtio-mem), which gets added on
>> lower levels.
>>
>> We have two users of walk_system_ram_res(), which currently only
>> consideres the first level:
>> a) kernel/kexec_file.c:kexec_walk_resources() -- We properly skip
>>     IORESOURCE_SYSRAM_DRIVER_MANAGED resources via
>>     locate_mem_hole_callback(), so even after this change, we won't be
>>     placing kexec images onto dax/kmem and virtio-mem added memory. No
>>     change.
>> b) arch/x86/kernel/crash.c:fill_up_crash_elf_data() -- we're currently
>>     not adding relevant ranges to the crash elf info, resulting in them
>>     not getting dumped via kdump.
>>
>> This change fixes loading a crashkernel via kexec_file_load() and including
> 
> "...fixes..." effectively means to me that Fixes tag should be provided.

We can certainly add, although it doesn't really affect the running 
kernel, but only crashdumps taken in the kdump kernel:

Fixes: ebf71552bb0e ("virtio-mem: Add parent resource for all added 
"System RAM"")
Fixes: c221c0b0308f ("device-dax: "Hotplug" persistent memory for use 
like normal RAM")

Thanks

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v1 2/3] kernel/resource: make walk_mem_res() find all busy IORESOURCE_MEM resources
  2021-03-23 11:08   ` Andy Shevchenko
@ 2021-03-23 11:26     ` David Hildenbrand
  0 siblings, 0 replies; 19+ messages in thread
From: David Hildenbrand @ 2021-03-23 11:26 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: linux-kernel, linux-mm, Andrew Morton, Greg Kroah-Hartman,
	Dan Williams, Daniel Vetter, Mauro Carvalho Chehab, Dave Young,
	Baoquan He, Vivek Goyal, Dave Hansen, Keith Busch, Michal Hocko,
	Qian Cai, Oscar Salvador, Eric Biederman, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, H. Peter Anvin, Tom Lendacky,
	Brijesh Singh, x86, kexec

On 23.03.21 12:08, Andy Shevchenko wrote:
> On Mon, Mar 22, 2021 at 05:01:59PM +0100, David Hildenbrand wrote:
>> It used to be true that we can have system RAM only on the first level
>> in the resourc tree. However, this is no longer holds for driver-managed
>> system RAM (i.e., dax/kmem and virtio-mem).
>>
>> The function walk_mem_res() only consideres the first level and is
>> used in arch/x86/mm/ioremap.c:__ioremap_check_mem() only. We currently
>> fail to identify System RAM added by dax/kmem and virtio-mem as
>> "IORES_MAP_SYSTEM_RAM", for example, allowing for remapping of such
>> "normal RAM" in __ioremap_caller().
> 
> Here I dunno, but consider to add Fixes tag if it fixes known bad behaviour.

Haven't observed it in the wild. We can add the same fixes tags as to 
patch #1.

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v1 1/3] kernel/resource: make walk_system_ram_res() find all busy IORESOURCE_SYSTEM_RAM resources
  2021-03-22 16:01 ` [PATCH v1 1/3] kernel/resource: make walk_system_ram_res() find all busy IORESOURCE_SYSTEM_RAM resources David Hildenbrand
                     ` (2 preceding siblings ...)
  2021-03-23 11:06   ` Andy Shevchenko
@ 2021-03-23 14:33   ` Baoquan He
  2021-03-24 11:18   ` Oscar Salvador
  4 siblings, 0 replies; 19+ messages in thread
From: Baoquan He @ 2021-03-23 14:33 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Andrew Morton, Greg Kroah-Hartman,
	Dan Williams, Daniel Vetter, Andy Shevchenko,
	Mauro Carvalho Chehab, Dave Young, Vivek Goyal, Dave Hansen,
	Keith Busch, Michal Hocko, Qian Cai, Oscar Salvador,
	Eric Biederman, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H. Peter Anvin, Tom Lendacky, Brijesh Singh, x86, kexec

On 03/22/21 at 05:01pm, David Hildenbrand wrote:
> It used to be true that we can have busy system RAM only on the first level
> in the resourc tree. However, this is no longer holds for driver-managed
> system RAM (i.e., added via dax/kmem and virtio-mem), which gets added on
> lower levels.
> 
> We have two users of walk_system_ram_res(), which currently only
> consideres the first level:
> a) kernel/kexec_file.c:kexec_walk_resources() -- We properly skip
>    IORESOURCE_SYSRAM_DRIVER_MANAGED resources via
>    locate_mem_hole_callback(), so even after this change, we won't be
>    placing kexec images onto dax/kmem and virtio-mem added memory. No
>    change.
> b) arch/x86/kernel/crash.c:fill_up_crash_elf_data() -- we're currently
>    not adding relevant ranges to the crash elf info, resulting in them
>    not getting dumped via kdump.
> 
> This change fixes loading a crashkernel via kexec_file_load() and including
> dax/kmem and virtio-mem added System RAM in the crashdump on x86-64. Note
> that e.g,, arm64 relies on memblock data and, therefore, always considers
> all added System RAM already.
> 
> Let's find all busy IORESOURCE_SYSTEM_RAM resources, making the function
> behave like walk_system_ram_range().
> 
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> Cc: Signed-off-by: David Hildenbrand <david@redhat.com>
> Cc: Dave Young <dyoung@redhat.com>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: Vivek Goyal <vgoyal@redhat.com>
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Keith Busch <keith.busch@intel.com>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Qian Cai <cai@lca.pw>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Eric Biederman <ebiederm@xmission.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Tom Lendacky <thomas.lendacky@amd.com>
> Cc: Brijesh Singh <brijesh.singh@amd.com>
> Cc: x86@kernel.org
> Cc: kexec@lists.infradead.org
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  kernel/resource.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/resource.c b/kernel/resource.c
> index 627e61b0c124..4efd6e912279 100644
> --- a/kernel/resource.c
> +++ b/kernel/resource.c
> @@ -457,7 +457,7 @@ int walk_system_ram_res(u64 start, u64 end, void *arg,
>  {
>  	unsigned long flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
>  
> -	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, true,
> +	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, false,
>  				     arg, func);

Thanks, David, this is a good fix.

Acked-by: Baoquan He <bhe@redhat.com>

>  }
>  
> -- 
> 2.29.2
> 



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v1 1/3] kernel/resource: make walk_system_ram_res() find all busy IORESOURCE_SYSTEM_RAM resources
  2021-03-22 16:01 ` [PATCH v1 1/3] kernel/resource: make walk_system_ram_res() find all busy IORESOURCE_SYSTEM_RAM resources David Hildenbrand
                     ` (3 preceding siblings ...)
  2021-03-23 14:33   ` Baoquan He
@ 2021-03-24 11:18   ` Oscar Salvador
  2021-03-24 11:28     ` David Hildenbrand
  4 siblings, 1 reply; 19+ messages in thread
From: Oscar Salvador @ 2021-03-24 11:18 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Andrew Morton, Greg Kroah-Hartman,
	Dan Williams, Daniel Vetter, Andy Shevchenko,
	Mauro Carvalho Chehab, Dave Young, Baoquan He, Vivek Goyal,
	Dave Hansen, Keith Busch, Michal Hocko, Qian Cai, Eric Biederman,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	Tom Lendacky, Brijesh Singh, x86, kexec

On Mon, Mar 22, 2021 at 05:01:58PM +0100, David Hildenbrand wrote:
> It used to be true that we can have busy system RAM only on the first level
> in the resourc tree. However, this is no longer holds for driver-managed
> system RAM (i.e., added via dax/kmem and virtio-mem), which gets added on
> lower levels.

Let me ask some rookie questions:

What does "busy" term stand for here?
Why resources coming from virtio-mem are added at a lower levels?

> 
> We have two users of walk_system_ram_res(), which currently only
> consideres the first level:
> a) kernel/kexec_file.c:kexec_walk_resources() -- We properly skip
>    IORESOURCE_SYSRAM_DRIVER_MANAGED resources via
>    locate_mem_hole_callback(), so even after this change, we won't be
>    placing kexec images onto dax/kmem and virtio-mem added memory. No
>    change.
> b) arch/x86/kernel/crash.c:fill_up_crash_elf_data() -- we're currently
>    not adding relevant ranges to the crash elf info, resulting in them
>    not getting dumped via kdump.
> 
> This change fixes loading a crashkernel via kexec_file_load() and including
> dax/kmem and virtio-mem added System RAM in the crashdump on x86-64. Note
> that e.g,, arm64 relies on memblock data and, therefore, always considers
> all added System RAM already.
> 
> Let's find all busy IORESOURCE_SYSTEM_RAM resources, making the function
> behave like walk_system_ram_range().
> 
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> Cc: Signed-off-by: David Hildenbrand <david@redhat.com>
> Cc: Dave Young <dyoung@redhat.com>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: Vivek Goyal <vgoyal@redhat.com>
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Keith Busch <keith.busch@intel.com>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Qian Cai <cai@lca.pw>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Eric Biederman <ebiederm@xmission.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Tom Lendacky <thomas.lendacky@amd.com>
> Cc: Brijesh Singh <brijesh.singh@amd.com>
> Cc: x86@kernel.org
> Cc: kexec@lists.infradead.org
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  kernel/resource.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/resource.c b/kernel/resource.c
> index 627e61b0c124..4efd6e912279 100644
> --- a/kernel/resource.c
> +++ b/kernel/resource.c
> @@ -457,7 +457,7 @@ int walk_system_ram_res(u64 start, u64 end, void *arg,
>  {
>  	unsigned long flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
>  
> -	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, true,
> +	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, false,
>  				     arg, func);
>  }
>  
> -- 
> 2.29.2
> 
> 

-- 
Oscar Salvador
SUSE L3


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v1 1/3] kernel/resource: make walk_system_ram_res() find all busy IORESOURCE_SYSTEM_RAM resources
  2021-03-24 11:18   ` Oscar Salvador
@ 2021-03-24 11:28     ` David Hildenbrand
  0 siblings, 0 replies; 19+ messages in thread
From: David Hildenbrand @ 2021-03-24 11:28 UTC (permalink / raw)
  To: Oscar Salvador
  Cc: linux-kernel, linux-mm, Andrew Morton, Greg Kroah-Hartman,
	Dan Williams, Daniel Vetter, Andy Shevchenko,
	Mauro Carvalho Chehab, Dave Young, Baoquan He, Vivek Goyal,
	Dave Hansen, Keith Busch, Michal Hocko, Qian Cai, Eric Biederman,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	Tom Lendacky, Brijesh Singh, x86, kexec

On 24.03.21 12:18, Oscar Salvador wrote:
> On Mon, Mar 22, 2021 at 05:01:58PM +0100, David Hildenbrand wrote:
>> It used to be true that we can have busy system RAM only on the first level
>> in the resourc tree. However, this is no longer holds for driver-managed
>> system RAM (i.e., added via dax/kmem and virtio-mem), which gets added on
>> lower levels.
> 
> Let me ask some rookie questions:
> 
> What does "busy" term stand for here?

IORESOURCE_BUSY - here: actually added, not just some reserved range / container.


> Why resources coming from virtio-mem are added at a lower levels?

Some information can be had from ebf71552bb0e690cad523ad175e8c4c89a33c333

commit ebf71552bb0e690cad523ad175e8c4c89a33c333
Author: David Hildenbrand <david@redhat.com>
Date:   Thu May 7 16:01:35 2020 +0200

     virtio-mem: Add parent resource for all added "System RAM"
     
     Let's add a parent resource, named after the virtio device (inspired by
     drivers/dax/kmem.c). This allows user space to identify which memory
     belongs to which virtio-mem device.
     
     With this change and two virtio-mem devices:
             :/# cat /proc/iomem
             00000000-00000fff : Reserved
             00001000-0009fbff : System RAM
             [...]
             140000000-333ffffff : virtio0
               140000000-147ffffff : System RAM
               148000000-14fffffff : System RAM
               150000000-157ffffff : System RAM
             [...]
             334000000-3033ffffff : virtio1
               338000000-33fffffff : System RAM
               340000000-347ffffff : System RAM
               348000000-34fffffff : System RAM
             [...]



For dax/kmem it comes naturally due to the "Persistent Memory" and
device parent resources like:

             140000000-33fffffff : Persistent Memory
               140000000-1481fffff : namespace0.0
               150000000-33fffffff : dax0.0
                 150000000-33fffffff : System RAM (kmem)
             3280000000-32ffffffff : PCI Bus 0000:00


Thanks

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2021-03-24 11:28 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-22 16:01 [PATCH v1 0/3] kernel/resource: make walk_system_ram_res() and walk_mem_res() search the whole tree David Hildenbrand
2021-03-22 16:01 ` [PATCH v1 1/3] kernel/resource: make walk_system_ram_res() find all busy IORESOURCE_SYSTEM_RAM resources David Hildenbrand
2021-03-22 16:10   ` Dan Williams
2021-03-23  9:40   ` David Hildenbrand
2021-03-23 11:07     ` Andy Shevchenko
2021-03-23 11:06   ` Andy Shevchenko
2021-03-23 11:25     ` David Hildenbrand
2021-03-23 14:33   ` Baoquan He
2021-03-24 11:18   ` Oscar Salvador
2021-03-24 11:28     ` David Hildenbrand
2021-03-22 16:01 ` [PATCH v1 2/3] kernel/resource: make walk_mem_res() find all busy IORESOURCE_MEM resources David Hildenbrand
2021-03-22 16:11   ` Dan Williams
2021-03-23 11:08   ` Andy Shevchenko
2021-03-23 11:26     ` David Hildenbrand
2021-03-22 16:02 ` [PATCH v1 3/3] kernel/resource: remove first_lvl / siblings_only logic David Hildenbrand
2021-03-22 16:12   ` Dan Williams
2021-03-23 11:11   ` Andy Shevchenko
2021-03-23 11:19     ` David Hildenbrand
2021-03-23 11:05 ` [PATCH v1 0/3] kernel/resource: make walk_system_ram_res() and walk_mem_res() search the whole tree Andy Shevchenko

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).