linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3 v3] add reserved e820 ranges to the kdump kernel e820 table
@ 2018-09-21  7:32 Lianbo Jiang
  2018-09-21  7:32 ` [PATCH 1/3 v3] resource: fix an error which walks through iomem resources Lianbo Jiang
                   ` (3 more replies)
  0 siblings, 4 replies; 24+ messages in thread
From: Lianbo Jiang @ 2018-09-21  7:32 UTC (permalink / raw)
  To: linux-kernel
  Cc: kexec, tglx, mingo, hpa, x86, akpm, dan.j.williams,
	thomas.lendacky, bhelgaas, baiyaowei, tiwai, bp, brijesh.singh,
	dyoung, bhe

E820 reserved ranges is useful in kdump kernel, we have added this in
kexec-tools code.

One reason is PCI mmconf (extended mode) requires reserved region otherwise
it falls back to legacy mode.

Furthermore, when AMD SME kdump support, it needs to map dmi table area as
unencrypted. For normal boot, these ranges sit in e820 reserved ranges,
thus the early ioremap code naturally map them as unencrypted. If we also
have same e820 reserve setup in kdump kernel then it will just work like
normal kernel.

Kdump uses walk_iomem_res_desc to iterate resources, then adds matched desc
to e820 table for the kdump kernel.

But IORES_DESC_NONE resource type includes several different e820 types, we
need add exact e820 type to the kdump kernel e820 table, thus it also needs
an extra checking in memmap_entry_callback() to match the e820 type and
resource name.

By the way, we also fix an error which walks through iomem resources, the
values of the function parameter may be modified in the while loop of
__walk_iomem_res_desc(), which will cause us to not get the desired result
in some cases.

Changes since v2:
1. Modified the value of flags to "0", when walking through the whole
tree for e820 reserved ranges.
2. Modified the invalid SOB chain issue.

Lianbo Jiang (3):
  resource: fix an error which walks through iomem resources
  x86/kexec_file: add e820 entry in case e820 type string matches to io
    resource name
  x86/kexec_file: add reserved e820 ranges to kdump kernel e820 table

 arch/x86/include/asm/e820/api.h |  2 ++
 arch/x86/kernel/crash.c         | 11 ++++++++++-
 arch/x86/kernel/e820.c          |  2 +-
 kernel/resource.c               |  3 +++
 4 files changed, 16 insertions(+), 2 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 1/3 v3] resource: fix an error which walks through iomem resources
  2018-09-21  7:32 [PATCH 0/3 v3] add reserved e820 ranges to the kdump kernel e820 table Lianbo Jiang
@ 2018-09-21  7:32 ` Lianbo Jiang
  2018-09-24 17:52   ` Bjorn Helgaas
  2018-09-24 22:14   ` [PATCH 0/3] find_next_iomem_res() fixes Bjorn Helgaas
  2018-09-21  7:32 ` [PATCH 2/3 v3] x86/kexec_file: add e820 entry in case e820 type string matches to io resource name Lianbo Jiang
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 24+ messages in thread
From: Lianbo Jiang @ 2018-09-21  7:32 UTC (permalink / raw)
  To: linux-kernel
  Cc: kexec, tglx, mingo, hpa, x86, akpm, dan.j.williams,
	thomas.lendacky, bhelgaas, baiyaowei, tiwai, bp, brijesh.singh,
	dyoung, bhe

When we walk through iomem resources by calling walk_iomem_res_desc(),
the values of the function parameter may be modified in the while loop
of __walk_iomem_res_desc(), which will cause us to not get the desired
result in some cases.

At present, it only restores the original value of res->end, but it
doesn't restore the original value of res->flags in the while loop of
__walk_iomem _res_desc(). Whenever the find_next_iomem_res() finds a
resource and returns the result, the original values of this resource
will be modified, which might lead to an error in the next loop. For
example:

The original value of resource flags is:
 res->flags=0x80000200(initial value)

p->flags   _ 0x81000200 _                _ 0x80000200 _
          /              \              /              \
|________|_______A________|____|_....._|______B_________|..........___|
0                                                            0xffffffff
                (memory address ranges)

Note: if ((p->flags & res->flags) != res->flags) continue;

When the resource A is found, the original value of this resource flags
will be changed to 0x81000200(res->flags=0x81000200), and continue to
look for the next resource, when the loop reaches resource B, it can not
get the resource B any more(you can refer to the for loop of find_next
_iomem_res()), because the value of conditional expression will become
true and will also jump the resource B.

In fact, we should get the resource A and B when we walk through the
whole tree, but it only gets the resource A, the resource B is missed.

Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
---
 kernel/resource.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/resource.c b/kernel/resource.c
index 30e1bc68503b..f5d9fc70a04c 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -375,6 +375,7 @@ static int __walk_iomem_res_desc(struct resource *res, unsigned long desc,
 				 int (*func)(struct resource *, void *))
 {
 	u64 orig_end = res->end;
+	u64 orig_flags = res->flags;
 	int ret = -1;
 
 	while ((res->start < res->end) &&
@@ -385,6 +386,7 @@ static int __walk_iomem_res_desc(struct resource *res, unsigned long desc,
 
 		res->start = res->end + 1;
 		res->end = orig_end;
+		res->flags = orig_flags;
 	}
 
 	return ret;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 2/3 v3] x86/kexec_file: add e820 entry in case e820 type string matches to io resource name
  2018-09-21  7:32 [PATCH 0/3 v3] add reserved e820 ranges to the kdump kernel e820 table Lianbo Jiang
  2018-09-21  7:32 ` [PATCH 1/3 v3] resource: fix an error which walks through iomem resources Lianbo Jiang
@ 2018-09-21  7:32 ` Lianbo Jiang
  2018-09-21  7:32 ` [PATCH 3/3 v3] x86/kexec_file: add reserved e820 ranges to kdump kernel e820 table Lianbo Jiang
  2018-10-16  2:56 ` [PATCH 0/3 v3] add reserved e820 ranges to the " Dave Young
  3 siblings, 0 replies; 24+ messages in thread
From: Lianbo Jiang @ 2018-09-21  7:32 UTC (permalink / raw)
  To: linux-kernel
  Cc: kexec, tglx, mingo, hpa, x86, akpm, dan.j.williams,
	thomas.lendacky, bhelgaas, baiyaowei, tiwai, bp, brijesh.singh,
	dyoung, bhe

kdump uses walk_iomem_res_desc() to iterate io resources, then adds matched
desc to e820 table for kdump kernel.

But IORES_DESC_NONE resource type includes several different e820 types,
we need add exact e820 type to kdump kernel e820 table, thus it also needs
an extra checking in memmap_entry_callback() to match the e820 type and
resource name.

Suggested-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
---
 arch/x86/include/asm/e820/api.h | 2 ++
 arch/x86/kernel/crash.c         | 6 +++++-
 arch/x86/kernel/e820.c          | 2 +-
 kernel/resource.c               | 1 +
 4 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/e820/api.h b/arch/x86/include/asm/e820/api.h
index 62be73b23d5c..6d5451b36e80 100644
--- a/arch/x86/include/asm/e820/api.h
+++ b/arch/x86/include/asm/e820/api.h
@@ -42,6 +42,8 @@ extern void e820__register_nosave_regions(unsigned long limit_pfn);
 
 extern int  e820__get_entry_type(u64 start, u64 end);
 
+extern const char *e820_type_to_string(struct e820_entry *entry);
+
 /*
  * Returns true iff the specified range [start,end) is completely contained inside
  * the ISA region.
diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index f631a3f15587..ae724a6e0a5f 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -37,6 +37,7 @@
 #include <asm/reboot.h>
 #include <asm/virtext.h>
 #include <asm/intel_pt.h>
+#include <asm/e820/api.h>
 
 /* Used while preparing memory map entries for second kernel */
 struct crash_memmap_data {
@@ -314,11 +315,14 @@ static int memmap_entry_callback(struct resource *res, void *arg)
 	struct crash_memmap_data *cmd = arg;
 	struct boot_params *params = cmd->params;
 	struct e820_entry ei;
+	const char *name;
 
 	ei.addr = res->start;
 	ei.size = resource_size(res);
 	ei.type = cmd->type;
-	add_e820_entry(params, &ei);
+	name = e820_type_to_string(&ei);
+	if (res->name && !strcmp(name, res->name))
+		add_e820_entry(params, &ei);
 
 	return 0;
 }
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index c88c23c658c1..f9761b2f7abb 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -1012,7 +1012,7 @@ void __init e820__finish_early_params(void)
 	}
 }
 
-static const char *__init e820_type_to_string(struct e820_entry *entry)
+const char *e820_type_to_string(struct e820_entry *entry)
 {
 	switch (entry->type) {
 	case E820_TYPE_RESERVED_KERN:	/* Fall-through: */
diff --git a/kernel/resource.c b/kernel/resource.c
index f5d9fc70a04c..cc90633f35f9 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -366,6 +366,7 @@ static int find_next_iomem_res(struct resource *res, unsigned long desc,
 		res->end = p->end;
 	res->flags = p->flags;
 	res->desc = p->desc;
+	res->name = p->name;
 	return 0;
 }
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 3/3 v3] x86/kexec_file: add reserved e820 ranges to kdump kernel e820 table
  2018-09-21  7:32 [PATCH 0/3 v3] add reserved e820 ranges to the kdump kernel e820 table Lianbo Jiang
  2018-09-21  7:32 ` [PATCH 1/3 v3] resource: fix an error which walks through iomem resources Lianbo Jiang
  2018-09-21  7:32 ` [PATCH 2/3 v3] x86/kexec_file: add e820 entry in case e820 type string matches to io resource name Lianbo Jiang
@ 2018-09-21  7:32 ` Lianbo Jiang
  2018-10-16  2:56 ` [PATCH 0/3 v3] add reserved e820 ranges to the " Dave Young
  3 siblings, 0 replies; 24+ messages in thread
From: Lianbo Jiang @ 2018-09-21  7:32 UTC (permalink / raw)
  To: linux-kernel
  Cc: kexec, tglx, mingo, hpa, x86, akpm, dan.j.williams,
	thomas.lendacky, bhelgaas, baiyaowei, tiwai, bp, brijesh.singh,
	dyoung, bhe

E820 reserved ranges is useful in kdump kernel, we have added this in
kexec-tools code.

One reason is PCI mmconf (extended mode) requires reserved region
otherwise it falls back to legacy mode.

When AMD SME kdump support, it needs to map dmi table area as unencrypted.
For normal boot, these ranges sit in e820 reserved ranges, thus the early
ioremap code naturally map them as unencrypted. If we also have same e820
reserve setup in kdump kernel then it will just work like normal kernel.

Suggested-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
---
 arch/x86/kernel/crash.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index ae724a6e0a5f..3460be990e0c 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -384,6 +384,11 @@ int crash_setup_memmap_entries(struct kimage *image, struct boot_params *params)
 	walk_iomem_res_desc(IORES_DESC_ACPI_NV_STORAGE, flags, 0, -1, &cmd,
 			memmap_entry_callback);
 
+	/* Add all reserved ranges */
+	cmd.type = E820_TYPE_RESERVED;
+	walk_iomem_res_desc(IORES_DESC_NONE, 0, 0, -1, &cmd,
+			memmap_entry_callback);
+
 	/* Add crashk_low_res region */
 	if (crashk_low_res.end) {
 		ei.addr = crashk_low_res.start;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/3 v3] resource: fix an error which walks through iomem resources
  2018-09-21  7:32 ` [PATCH 1/3 v3] resource: fix an error which walks through iomem resources Lianbo Jiang
@ 2018-09-24 17:52   ` Bjorn Helgaas
  2018-09-25  7:08     ` lijiang
  2018-09-24 22:14   ` [PATCH 0/3] find_next_iomem_res() fixes Bjorn Helgaas
  1 sibling, 1 reply; 24+ messages in thread
From: Bjorn Helgaas @ 2018-09-24 17:52 UTC (permalink / raw)
  To: Lianbo Jiang
  Cc: linux-kernel, kexec, tglx, mingo, hpa, x86, akpm, dan.j.williams,
	thomas.lendacky, bhelgaas, baiyaowei, tiwai, bp, brijesh.singh,
	dyoung, bhe

On Fri, Sep 21, 2018 at 03:32:09PM +0800, Lianbo Jiang wrote:
> When we walk through iomem resources by calling walk_iomem_res_desc(),
> the values of the function parameter may be modified in the while loop
> of __walk_iomem_res_desc(), which will cause us to not get the desired
> result in some cases.

If I understand correctly, the issue is caused by the interaction
between __walk_iomem_res_desc() and find_next_iomem_res() in this
path:

  __walk_iomem_res_desc
    find_next_iomem_res
      res->flags = p->flags;            # <-- problem

This path is used by the following interfaces, and I think your patch
would fix the issue for them:

  walk_iomem_res_desc()
  walk_system_ram_res()
  walk_mem_res()

However, find_next_iomem_res() is also used directly by
walk_system_ram_range().  I think that path has the same problem, and
your patch does not fix that path.

I have a few more comments related to the existing code that I'll post
soon.

> At present, it only restores the original value of res->end, but it
> doesn't restore the original value of res->flags in the while loop of
> __walk_iomem _res_desc(). Whenever the find_next_iomem_res() finds a
> resource and returns the result, the original values of this resource
> will be modified, which might lead to an error in the next loop. For
> example:
> 
> The original value of resource flags is:
>  res->flags=0x80000200(initial value)
> 
> p->flags   _ 0x81000200 _                _ 0x80000200 _
>           /              \              /              \
> |________|_______A________|____|_....._|______B_________|..........___|
> 0                                                            0xffffffff
>                 (memory address ranges)
> 
> Note: if ((p->flags & res->flags) != res->flags) continue;
> 
> When the resource A is found, the original value of this resource flags
> will be changed to 0x81000200(res->flags=0x81000200), and continue to
> look for the next resource, when the loop reaches resource B, it can not
> get the resource B any more(you can refer to the for loop of find_next
> _iomem_res()), because the value of conditional expression will become
> true and will also jump the resource B.
> 
> In fact, we should get the resource A and B when we walk through the
> whole tree, but it only gets the resource A, the resource B is missed.
> 
> Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
> ---
>  kernel/resource.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/kernel/resource.c b/kernel/resource.c
> index 30e1bc68503b..f5d9fc70a04c 100644
> --- a/kernel/resource.c
> +++ b/kernel/resource.c
> @@ -375,6 +375,7 @@ static int __walk_iomem_res_desc(struct resource *res, unsigned long desc,
>  				 int (*func)(struct resource *, void *))
>  {
>  	u64 orig_end = res->end;
> +	u64 orig_flags = res->flags;
>  	int ret = -1;
>  
>  	while ((res->start < res->end) &&
> @@ -385,6 +386,7 @@ static int __walk_iomem_res_desc(struct resource *res, unsigned long desc,
>  
>  		res->start = res->end + 1;
>  		res->end = orig_end;
> +		res->flags = orig_flags;
>  	}
>  
>  	return ret;

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 0/3] find_next_iomem_res() fixes
  2018-09-21  7:32 ` [PATCH 1/3 v3] resource: fix an error which walks through iomem resources Lianbo Jiang
  2018-09-24 17:52   ` Bjorn Helgaas
@ 2018-09-24 22:14   ` Bjorn Helgaas
  2018-09-24 22:14     ` [PATCH 1/3] x86/kexec: Correct KEXEC_BACKUP_SRC_END off-by-one error Bjorn Helgaas
                       ` (3 more replies)
  1 sibling, 4 replies; 24+ messages in thread
From: Bjorn Helgaas @ 2018-09-24 22:14 UTC (permalink / raw)
  To: Lianbo Jiang
  Cc: Vivek Goyal, linux-kernel, kexec, tglx, mingo, hpa, x86, akpm,
	dan.j.williams, thomas.lendacky, baiyaowei, tiwai, bp,
	brijesh.singh, dyoung, bhe

Hi Lianbo,

These three patches are a possible replacement for your first patch
("[PATCH 1/3 v3] resource: fix an error which walks through iomem
resources").

I think the interface of find_next_iomem_res() can be improved to make
the code easier to read and also avoid the errors you're fixing.

I can't test these, so they've only been compiled.  If you can test
them and if you like them, feel free to incorporate them into your
series.  If not, just drop them (but please at least fix the same
error in walk_system_ram_range()).

---

Bjorn Helgaas (3):
      x86/kexec: Correct KEXEC_BACKUP_SRC_END off-by-one error
      resource: Include resource end in walk_*() interfaces
      resource: Fix find_next_iomem_res() iteration issue


 arch/x86/include/asm/kexec.h |    2 -
 kernel/resource.c            |   96 ++++++++++++++++++------------------------
 2 files changed, 43 insertions(+), 55 deletions(-)

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 1/3] x86/kexec: Correct KEXEC_BACKUP_SRC_END off-by-one error
  2018-09-24 22:14   ` [PATCH 0/3] find_next_iomem_res() fixes Bjorn Helgaas
@ 2018-09-24 22:14     ` Bjorn Helgaas
  2018-09-24 22:14     ` [PATCH 2/3] resource: Include resource end in walk_*() interfaces Bjorn Helgaas
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 24+ messages in thread
From: Bjorn Helgaas @ 2018-09-24 22:14 UTC (permalink / raw)
  To: Lianbo Jiang
  Cc: Vivek Goyal, linux-kernel, kexec, tglx, mingo, hpa, x86, akpm,
	dan.j.williams, thomas.lendacky, baiyaowei, tiwai, bp,
	brijesh.singh, dyoung, bhe

From: Bjorn Helgaas <bhelgaas@google.com>

The only use of KEXEC_BACKUP_SRC_END is as an argument to
walk_system_ram_res():

  int crash_load_segments(struct kimage *image)
  {
    ...
    walk_system_ram_res(KEXEC_BACKUP_SRC_START, KEXEC_BACKUP_SRC_END,
                        image, determine_backup_region);

walk_system_ram_res() expects "start, end" arguments that are inclusive,
i.e., the range to be walked includes both the start and end addresses.

KEXEC_BACKUP_SRC_END was previously defined as (640 * 1024UL), which is the
first address *past* the desired 0-64KB range.

Define KEXEC_BACKUP_SRC_END as (640 * 1024UL - 1) so the KEXEC_BACKUP_SRC
region is [0-0xffff], not [0-0x10000].

Fixes: dd5f726076cc ("kexec: support for kexec on panic using new system call")
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 arch/x86/include/asm/kexec.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index f327236f0fa7..5125fca472bb 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -67,7 +67,7 @@ struct kimage;
 
 /* Memory to backup during crash kdump */
 #define KEXEC_BACKUP_SRC_START	(0UL)
-#define KEXEC_BACKUP_SRC_END	(640 * 1024UL)	/* 640K */
+#define KEXEC_BACKUP_SRC_END	(640 * 1024UL - 1)	/* 640K */
 
 /*
  * CPU does not save ss and sp on stack if execution is already


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 2/3] resource: Include resource end in walk_*() interfaces
  2018-09-24 22:14   ` [PATCH 0/3] find_next_iomem_res() fixes Bjorn Helgaas
  2018-09-24 22:14     ` [PATCH 1/3] x86/kexec: Correct KEXEC_BACKUP_SRC_END off-by-one error Bjorn Helgaas
@ 2018-09-24 22:14     ` Bjorn Helgaas
  2018-09-24 22:15     ` [PATCH 3/3] resource: Fix find_next_iomem_res() iteration issue Bjorn Helgaas
  2018-09-26  9:22     ` [PATCH 0/3] find_next_iomem_res() fixes lijiang
  3 siblings, 0 replies; 24+ messages in thread
From: Bjorn Helgaas @ 2018-09-24 22:14 UTC (permalink / raw)
  To: Lianbo Jiang
  Cc: Vivek Goyal, linux-kernel, kexec, tglx, mingo, hpa, x86, akpm,
	dan.j.williams, thomas.lendacky, baiyaowei, tiwai, bp,
	brijesh.singh, dyoung, bhe

From: Bjorn Helgaas <bhelgaas@google.com>

find_next_iomem_res() finds an iomem resource that covers part of a range
described by "start, end".  All callers expect that range to be inclusive,
i.e., both start and end are included, but find_next_iomem_res() doesn't
handle the end address correctly.

If it finds an iomem resource that contains exactly the end address, it
skips it, e.g., if "start, end" is [0x0-0x10000] and there happens to be an
iomem resource [mem 0x10000-0x10000] (the single byte at 0x10000), we skip
it:

  find_next_iomem_res(...)
  {
    start = 0x0;
    end = 0x10000;
    for (p = next_resource(...)) {
      # p->start = 0x10000;
      # p->end = 0x10000;
      # we *should* return this resource, but this condition is false:
      if ((p->end >= start) && (p->start < end))
        break;

Adjust find_next_iomem_res() so it allows a resource that includes the
single byte at the end of the range.  This is a corner case that we
probably don't see in practice.

Fixes: 58c1b5b07907 ("[PATCH] memory hotadd fixes: find_next_system_ram catch range fix")
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 kernel/resource.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/resource.c b/kernel/resource.c
index 30e1bc68503b..155ec873ea4d 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -319,7 +319,7 @@ int release_resource(struct resource *old)
 EXPORT_SYMBOL(release_resource);
 
 /*
- * Finds the lowest iomem resource existing within [res->start.res->end).
+ * Finds the lowest iomem resource existing within [res->start..res->end].
  * The caller must specify res->start, res->end, res->flags, and optionally
  * desc.  If found, returns 0, res is overwritten, if not found, returns -1.
  * This function walks the whole tree and not just first level children until
@@ -352,7 +352,7 @@ static int find_next_iomem_res(struct resource *res, unsigned long desc,
 			p = NULL;
 			break;
 		}
-		if ((p->end >= start) && (p->start < end))
+		if ((p->end >= start) && (p->start <= end))
 			break;
 	}
 


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 3/3] resource: Fix find_next_iomem_res() iteration issue
  2018-09-24 22:14   ` [PATCH 0/3] find_next_iomem_res() fixes Bjorn Helgaas
  2018-09-24 22:14     ` [PATCH 1/3] x86/kexec: Correct KEXEC_BACKUP_SRC_END off-by-one error Bjorn Helgaas
  2018-09-24 22:14     ` [PATCH 2/3] resource: Include resource end in walk_*() interfaces Bjorn Helgaas
@ 2018-09-24 22:15     ` Bjorn Helgaas
  2018-09-25  8:58       ` Baoquan He
  2018-09-27  5:27       ` lijiang
  2018-09-26  9:22     ` [PATCH 0/3] find_next_iomem_res() fixes lijiang
  3 siblings, 2 replies; 24+ messages in thread
From: Bjorn Helgaas @ 2018-09-24 22:15 UTC (permalink / raw)
  To: Lianbo Jiang
  Cc: Vivek Goyal, linux-kernel, kexec, tglx, mingo, hpa, x86, akpm,
	dan.j.williams, thomas.lendacky, baiyaowei, tiwai, bp,
	brijesh.singh, dyoung, bhe

From: Bjorn Helgaas <bhelgaas@google.com>

Previously find_next_iomem_res() used "*res" as both an input parameter for
the range to search and the type of resource to search for, and an output
parameter for the resource we found, which makes the interface confusing
and hard to use correctly.

All callers allocate a single struct resource and use it for repeated calls
to find_next_iomem_res().  When find_next_iomem_res() returns a resource,
it overwrites the start, end, flags, and desc members of the struct.  If we
call find_next_iomem_res() again, we must update or restore these fields.

The callers (__walk_iomem_res_desc() and walk_system_ram_range()) do not
restore res->flags, so if the caller is searching for flags of
IORESOURCE_MEM | IORESOURCE_BUSY and finds a resource with flags of
IORESOURCE_MEM | IORESOURCE_BUSY | IORESOURCE_SYSRAM, the next search will
find only resources marked as IORESOURCE_SYSRAM.

Fix this by restructuring the interface so it takes explicit "start, end,
flags" parameters and uses "*res" only as an output parameter.

Original-patch: http://lore.kernel.org/lkml/20180921073211.20097-2-lijiang@redhat.com
Based-on-patch-by: Lianbo Jiang <lijiang@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 kernel/resource.c |   94 +++++++++++++++++++++++------------------------------
 1 file changed, 41 insertions(+), 53 deletions(-)

diff --git a/kernel/resource.c b/kernel/resource.c
index 155ec873ea4d..9891ea90cc8f 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -319,23 +319,26 @@ int release_resource(struct resource *old)
 EXPORT_SYMBOL(release_resource);
 
 /*
- * Finds the lowest iomem resource existing within [res->start..res->end].
- * The caller must specify res->start, res->end, res->flags, and optionally
- * desc.  If found, returns 0, res is overwritten, if not found, returns -1.
- * This function walks the whole tree and not just first level children until
- * and unless first_level_children_only is true.
+ * Finds the lowest iomem resource that covers part of [start..end].  The
+ * caller must specify start, end, flags, and desc (which may be
+ * IORES_DESC_NONE).
+ *
+ * If a resource is found, returns 0 and *res is overwritten with the part
+ * of the resource that's within [start..end]; if none is found, returns
+ * -1.
+ *
+ * This function walks the whole tree and not just first level children
+ * unless first_level_children_only is true.
  */
-static int find_next_iomem_res(struct resource *res, unsigned long desc,
-			       bool first_level_children_only)
+static int find_next_iomem_res(resource_size_t start, resource_size_t end,
+			       unsigned long flags, unsigned long desc,
+			       bool first_level_children_only,
+			       struct resource *res)
 {
-	resource_size_t start, end;
 	struct resource *p;
 	bool sibling_only = false;
 
 	BUG_ON(!res);
-
-	start = res->start;
-	end = res->end;
 	BUG_ON(start >= end);
 
 	if (first_level_children_only)
@@ -344,7 +347,7 @@ static int find_next_iomem_res(struct resource *res, unsigned long desc,
 	read_lock(&resource_lock);
 
 	for (p = iomem_resource.child; p; p = next_resource(p, sibling_only)) {
-		if ((p->flags & res->flags) != res->flags)
+		if ((p->flags & flags) != flags)
 			continue;
 		if ((desc != IORES_DESC_NONE) && (desc != p->desc))
 			continue;
@@ -359,32 +362,31 @@ static int find_next_iomem_res(struct resource *res, unsigned long desc,
 	read_unlock(&resource_lock);
 	if (!p)
 		return -1;
+
 	/* copy data */
-	if (res->start < p->start)
-		res->start = p->start;
-	if (res->end > p->end)
-		res->end = p->end;
+	res->start = max(start, p->start);
+	res->end = min(end, p->end);
 	res->flags = p->flags;
 	res->desc = p->desc;
 	return 0;
 }
 
-static int __walk_iomem_res_desc(struct resource *res, unsigned long desc,
-				 bool first_level_children_only,
-				 void *arg,
+static int __walk_iomem_res_desc(resource_size_t start, resource_size_t end,
+				 unsigned long flags, unsigned long desc,
+				 bool first_level_children_only, void *arg,
 				 int (*func)(struct resource *, void *))
 {
-	u64 orig_end = res->end;
+	struct resource res;
 	int ret = -1;
 
-	while ((res->start < res->end) &&
-	       !find_next_iomem_res(res, desc, first_level_children_only)) {
-		ret = (*func)(res, arg);
+	while (start < end &&
+	       !find_next_iomem_res(start, end, flags, desc,
+				    first_level_children_only, &res)) {
+		ret = (*func)(&res, arg);
 		if (ret)
 			break;
 
-		res->start = res->end + 1;
-		res->end = orig_end;
+		start = res.end + 1;
 	}
 
 	return ret;
@@ -407,13 +409,7 @@ static int __walk_iomem_res_desc(struct resource *res, unsigned long desc,
 int walk_iomem_res_desc(unsigned long desc, unsigned long flags, u64 start,
 		u64 end, void *arg, int (*func)(struct resource *, void *))
 {
-	struct resource res;
-
-	res.start = start;
-	res.end = end;
-	res.flags = flags;
-
-	return __walk_iomem_res_desc(&res, desc, false, arg, func);
+	return __walk_iomem_res_desc(start, end, flags, desc, false, arg, func);
 }
 EXPORT_SYMBOL_GPL(walk_iomem_res_desc);
 
@@ -427,13 +423,9 @@ EXPORT_SYMBOL_GPL(walk_iomem_res_desc);
 int walk_system_ram_res(u64 start, u64 end, void *arg,
 				int (*func)(struct resource *, void *))
 {
-	struct resource res;
-
-	res.start = start;
-	res.end = end;
-	res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
+	unsigned long flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
 
-	return __walk_iomem_res_desc(&res, IORES_DESC_NONE, true,
+	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, true,
 				     arg, func);
 }
 
@@ -444,13 +436,9 @@ int walk_system_ram_res(u64 start, u64 end, void *arg,
 int walk_mem_res(u64 start, u64 end, void *arg,
 		 int (*func)(struct resource *, void *))
 {
-	struct resource res;
-
-	res.start = start;
-	res.end = end;
-	res.flags = IORESOURCE_MEM | IORESOURCE_BUSY;
+	unsigned long flags = IORESOURCE_MEM | IORESOURCE_BUSY;
 
-	return __walk_iomem_res_desc(&res, IORES_DESC_NONE, true,
+	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, true,
 				     arg, func);
 }
 
@@ -464,25 +452,25 @@ int walk_mem_res(u64 start, u64 end, void *arg,
 int walk_system_ram_range(unsigned long start_pfn, unsigned long nr_pages,
 		void *arg, int (*func)(unsigned long, unsigned long, void *))
 {
+	resource_size_t start, end;
+	unsigned long flags;
 	struct resource res;
 	unsigned long pfn, end_pfn;
-	u64 orig_end;
 	int ret = -1;
 
-	res.start = (u64) start_pfn << PAGE_SHIFT;
-	res.end = ((u64)(start_pfn + nr_pages) << PAGE_SHIFT) - 1;
-	res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
-	orig_end = res.end;
-	while ((res.start < res.end) &&
-		(find_next_iomem_res(&res, IORES_DESC_NONE, true) >= 0)) {
+	start = (u64) start_pfn << PAGE_SHIFT;
+	end = ((u64)(start_pfn + nr_pages) << PAGE_SHIFT) - 1;
+	flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
+	while (start < end &&
+	       !find_next_iomem_res(start, end, flags, IORES_DESC_NONE,
+				    true, &res)) {
 		pfn = (res.start + PAGE_SIZE - 1) >> PAGE_SHIFT;
 		end_pfn = (res.end + 1) >> PAGE_SHIFT;
 		if (end_pfn > pfn)
 			ret = (*func)(pfn, end_pfn - pfn, arg);
 		if (ret)
 			break;
-		res.start = res.end + 1;
-		res.end = orig_end;
+		start = res.end + 1;
 	}
 	return ret;
 }


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/3 v3] resource: fix an error which walks through iomem resources
  2018-09-24 17:52   ` Bjorn Helgaas
@ 2018-09-25  7:08     ` lijiang
  0 siblings, 0 replies; 24+ messages in thread
From: lijiang @ 2018-09-25  7:08 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-kernel, kexec, tglx, mingo, hpa, x86, akpm, dan.j.williams,
	thomas.lendacky, bhelgaas, baiyaowei, tiwai, bp, brijesh.singh,
	dyoung, bhe

在 2018年09月25日 01:52, Bjorn Helgaas 写道:
> On Fri, Sep 21, 2018 at 03:32:09PM +0800, Lianbo Jiang wrote:
>> When we walk through iomem resources by calling walk_iomem_res_desc(),
>> the values of the function parameter may be modified in the while loop
>> of __walk_iomem_res_desc(), which will cause us to not get the desired
>> result in some cases.
> 
> If I understand correctly, the issue is caused by the interaction
> between __walk_iomem_res_desc() and find_next_iomem_res() in this
> path:
> 
>   __walk_iomem_res_desc
>     find_next_iomem_res
>       res->flags = p->flags;            # <-- problem
> 
> This path is used by the following interfaces, and I think your patch
> would fix the issue for them:
> 
>   walk_iomem_res_desc()
>   walk_system_ram_res()
>   walk_mem_res()
> 
> However, find_next_iomem_res() is also used directly by
> walk_system_ram_range().  I think that path has the same problem, and
> your patch does not fix that path.
> 
Thanks for your comment. 
Originally, my patch 1 only fixed this issue in kdump path, of course, i can
also improve this patch and fix the same issue in walk_system_ram_range().
If you have fixed this issue, it's good to me.

> I have a few more comments related to the existing code that I'll post
> soon.
> 
>> At present, it only restores the original value of res->end, but it
>> doesn't restore the original value of res->flags in the while loop of
>> __walk_iomem _res_desc(). Whenever the find_next_iomem_res() finds a
>> resource and returns the result, the original values of this resource
>> will be modified, which might lead to an error in the next loop. For
>> example:
>>
>> The original value of resource flags is:
>>  res->flags=0x80000200(initial value)
>>
>> p->flags   _ 0x81000200 _                _ 0x80000200 _
>>           /              \              /              \
>> |________|_______A________|____|_....._|______B_________|..........___|
>> 0                                                            0xffffffff
>>                 (memory address ranges)
>>
>> Note: if ((p->flags & res->flags) != res->flags) continue;
>>
>> When the resource A is found, the original value of this resource flags
>> will be changed to 0x81000200(res->flags=0x81000200), and continue to
>> look for the next resource, when the loop reaches resource B, it can not
>> get the resource B any more(you can refer to the for loop of find_next
>> _iomem_res()), because the value of conditional expression will become
>> true and will also jump the resource B.
>>
>> In fact, we should get the resource A and B when we walk through the
>> whole tree, but it only gets the resource A, the resource B is missed.
>>
>> Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
>> ---
>>  kernel/resource.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/kernel/resource.c b/kernel/resource.c
>> index 30e1bc68503b..f5d9fc70a04c 100644
>> --- a/kernel/resource.c
>> +++ b/kernel/resource.c
>> @@ -375,6 +375,7 @@ static int __walk_iomem_res_desc(struct resource *res, unsigned long desc,
>>  				 int (*func)(struct resource *, void *))
>>  {
>>  	u64 orig_end = res->end;
>> +	u64 orig_flags = res->flags;
>>  	int ret = -1;
>>  
>>  	while ((res->start < res->end) &&
>> @@ -385,6 +386,7 @@ static int __walk_iomem_res_desc(struct resource *res, unsigned long desc,
>>  
>>  		res->start = res->end + 1;
>>  		res->end = orig_end;
>> +		res->flags = orig_flags;
>>  	}
>>  
>>  	return ret;

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/3] resource: Fix find_next_iomem_res() iteration issue
  2018-09-24 22:15     ` [PATCH 3/3] resource: Fix find_next_iomem_res() iteration issue Bjorn Helgaas
@ 2018-09-25  8:58       ` Baoquan He
  2018-09-25 11:20         ` Baoquan He
  2018-09-27  5:27       ` lijiang
  1 sibling, 1 reply; 24+ messages in thread
From: Baoquan He @ 2018-09-25  8:58 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Lianbo Jiang, Vivek Goyal, linux-kernel, kexec, tglx, mingo, hpa,
	x86, akpm, dan.j.williams, thomas.lendacky, baiyaowei, tiwai, bp,
	brijesh.singh, dyoung

Hi Bjorn,

On 09/24/18 at 05:15pm, Bjorn Helgaas wrote:
> @@ -359,32 +362,31 @@ static int find_next_iomem_res(struct resource *res, unsigned long desc,
>  	read_unlock(&resource_lock);
>  	if (!p)
>  		return -1;
> +
>  	/* copy data */
> -	if (res->start < p->start)
> -		res->start = p->start;
> -	if (res->end > p->end)
> -		res->end = p->end;
> +	res->start = max(start, p->start);
> +	res->end = min(end, p->end);
>  	res->flags = p->flags;

I think this fix is good. However, is it OK to keep res->flags always,
never touch it in find_next_iomem_res()? We just iterate and update
region, its start and end. So just removing that "res->flags = p->flags;"
line might involve much less code changes.

Thanks
Baoquan

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/3] resource: Fix find_next_iomem_res() iteration issue
  2018-09-25  8:58       ` Baoquan He
@ 2018-09-25 11:20         ` Baoquan He
  0 siblings, 0 replies; 24+ messages in thread
From: Baoquan He @ 2018-09-25 11:20 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Lianbo Jiang, Vivek Goyal, linux-kernel, kexec, tglx, mingo, hpa,
	x86, akpm, dan.j.williams, thomas.lendacky, baiyaowei, tiwai, bp,
	brijesh.singh, dyoung

On 09/25/18 at 04:58pm, Baoquan He wrote:
> Hi Bjorn,
> 
> On 09/24/18 at 05:15pm, Bjorn Helgaas wrote:
> > @@ -359,32 +362,31 @@ static int find_next_iomem_res(struct resource *res, unsigned long desc,
> >  	read_unlock(&resource_lock);
> >  	if (!p)
> >  		return -1;
> > +
> >  	/* copy data */
> > -	if (res->start < p->start)
> > -		res->start = p->start;
> > -	if (res->end > p->end)
> > -		res->end = p->end;
> > +	res->start = max(start, p->start);
> > +	res->end = min(end, p->end);
> >  	res->flags = p->flags;
> 
> I think this fix is good. However, is it OK to keep res->flags always,
> never touch it in find_next_iomem_res()? We just iterate and update
> region, its start and end. So just removing that "res->flags = p->flags;"
> line might involve much less code changes.

Rethink about it, I was wrong. Please ignore my comment.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/3] find_next_iomem_res() fixes
  2018-09-24 22:14   ` [PATCH 0/3] find_next_iomem_res() fixes Bjorn Helgaas
                       ` (2 preceding siblings ...)
  2018-09-24 22:15     ` [PATCH 3/3] resource: Fix find_next_iomem_res() iteration issue Bjorn Helgaas
@ 2018-09-26  9:22     ` lijiang
  2018-09-26 13:36       ` lijiang
  3 siblings, 1 reply; 24+ messages in thread
From: lijiang @ 2018-09-26  9:22 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: dan.j.williams, brijesh.singh, bhe, thomas.lendacky, tiwai, x86,
	kexec, linux-kernel, mingo, baiyaowei, hpa, tglx, bp, dyoung,
	akpm, Vivek Goyal



在 2018年09月25日 06:14, Bjorn Helgaas 写道:
> Hi Lianbo,
> 
> These three patches are a possible replacement for your first patch
> ("[PATCH 1/3 v3] resource: fix an error which walks through iomem
> resources").
> 
> I think the interface of find_next_iomem_res() can be improved to make
> the code easier to read and also avoid the errors you're fixing.
> 
> I can't test these, so they've only been compiled.  If you can test
> them and if you like them, feel free to incorporate them into your
> series.  If not, just drop them (but please at least fix the same
> error in walk_system_ram_range()).
> 
Hi, Bjorn
Sorry for my late reply about this. No need to incorporate them into
my patch series, you might freely submit them to upstream.

I will test your patches whether it can also work well for my case, and
you will get feedback about my test later.

Thanks.
Lianbo

> ---
> 
> Bjorn Helgaas (3):
>       x86/kexec: Correct KEXEC_BACKUP_SRC_END off-by-one error
>       resource: Include resource end in walk_*() interfaces
>       resource: Fix find_next_iomem_res() iteration issue
> 
> 
>  arch/x86/include/asm/kexec.h |    2 -
>  kernel/resource.c            |   96 ++++++++++++++++++------------------------
>  2 files changed, 43 insertions(+), 55 deletions(-)
> 
> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/3] find_next_iomem_res() fixes
  2018-09-26  9:22     ` [PATCH 0/3] find_next_iomem_res() fixes lijiang
@ 2018-09-26 13:36       ` lijiang
  0 siblings, 0 replies; 24+ messages in thread
From: lijiang @ 2018-09-26 13:36 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: dan.j.williams, brijesh.singh, bhe, thomas.lendacky, tiwai, x86,
	kexec, linux-kernel, mingo, baiyaowei, hpa, tglx, bp, dyoung,
	akpm, Vivek Goyal

在 2018年09月26日 17:22, lijiang 写道:
> 
> 
> 在 2018年09月25日 06:14, Bjorn Helgaas 写道:
>> Hi Lianbo,
>>
>> These three patches are a possible replacement for your first patch
>> ("[PATCH 1/3 v3] resource: fix an error which walks through iomem
>> resources").
>>
>> I think the interface of find_next_iomem_res() can be improved to make
>> the code easier to read and also avoid the errors you're fixing.
>>
>> I can't test these, so they've only been compiled.  If you can test
>> them and if you like them, feel free to incorporate them into your
>> series.  If not, just drop them (but please at least fix the same
>> error in walk_system_ram_range()).
>>
> Hi, Bjorn
> Sorry for my late reply about this. No need to incorporate them into
> my patch series, you might freely submit them to upstream.
> 
> I will test your patches whether it can also work well for my case, and
> you will get feedback about my test later.
> 
For my case, your patches can also work well.

> Thanks.
> Lianbo
> 
>> ---
>>
>> Bjorn Helgaas (3):
>>       x86/kexec: Correct KEXEC_BACKUP_SRC_END off-by-one error
>>       resource: Include resource end in walk_*() interfaces
>>       resource: Fix find_next_iomem_res() iteration issue
>>
>>
>>  arch/x86/include/asm/kexec.h |    2 -
>>  kernel/resource.c            |   96 ++++++++++++++++++------------------------
>>  2 files changed, 43 insertions(+), 55 deletions(-)
>>
>> _______________________________________________
>> kexec mailing list
>> kexec@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/kexec
>>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/3] resource: Fix find_next_iomem_res() iteration issue
  2018-09-24 22:15     ` [PATCH 3/3] resource: Fix find_next_iomem_res() iteration issue Bjorn Helgaas
  2018-09-25  8:58       ` Baoquan He
@ 2018-09-27  5:27       ` lijiang
  2018-09-27 14:03         ` Bjorn Helgaas
  1 sibling, 1 reply; 24+ messages in thread
From: lijiang @ 2018-09-27  5:27 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: dan.j.williams, brijesh.singh, bhe, thomas.lendacky, tiwai, x86,
	kexec, linux-kernel, mingo, baiyaowei, hpa, tglx, bp, dyoung,
	akpm, Vivek Goyal

在 2018年09月25日 06:15, Bjorn Helgaas 写道:
> From: Bjorn Helgaas <bhelgaas@google.com>
> 
> Previously find_next_iomem_res() used "*res" as both an input parameter for
> the range to search and the type of resource to search for, and an output
> parameter for the resource we found, which makes the interface confusing
> and hard to use correctly.
> 
> All callers allocate a single struct resource and use it for repeated calls
> to find_next_iomem_res().  When find_next_iomem_res() returns a resource,
> it overwrites the start, end, flags, and desc members of the struct.  If we
> call find_next_iomem_res() again, we must update or restore these fields.
> 
> The callers (__walk_iomem_res_desc() and walk_system_ram_range()) do not
> restore res->flags, so if the caller is searching for flags of
> IORESOURCE_MEM | IORESOURCE_BUSY and finds a resource with flags of
> IORESOURCE_MEM | IORESOURCE_BUSY | IORESOURCE_SYSRAM, the next search will
> find only resources marked as IORESOURCE_SYSRAM.
> 
> Fix this by restructuring the interface so it takes explicit "start, end,
> flags" parameters and uses "*res" only as an output parameter.
> 

Hi, Bjorn
I personally suggest that some comments might be added in the code, make it clear
and easy to understand, then which could avoid the old confusion and more code changes.

Thanks
Lianbo
> Original-patch: http://lore.kernel.org/lkml/20180921073211.20097-2-lijiang@redhat.com
> Based-on-patch-by: Lianbo Jiang <lijiang@redhat.com>
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> ---
>  kernel/resource.c |   94 +++++++++++++++++++++++------------------------------
>  1 file changed, 41 insertions(+), 53 deletions(-)
> 
> diff --git a/kernel/resource.c b/kernel/resource.c
> index 155ec873ea4d..9891ea90cc8f 100644
> --- a/kernel/resource.c
> +++ b/kernel/resource.c
> @@ -319,23 +319,26 @@ int release_resource(struct resource *old)
>  EXPORT_SYMBOL(release_resource);
>  
>  /*
> - * Finds the lowest iomem resource existing within [res->start..res->end].
> - * The caller must specify res->start, res->end, res->flags, and optionally
> - * desc.  If found, returns 0, res is overwritten, if not found, returns -1.
> - * This function walks the whole tree and not just first level children until
> - * and unless first_level_children_only is true.
> + * Finds the lowest iomem resource that covers part of [start..end].  The
> + * caller must specify start, end, flags, and desc (which may be
> + * IORES_DESC_NONE).
> + *
> + * If a resource is found, returns 0 and *res is overwritten with the part
> + * of the resource that's within [start..end]; if none is found, returns
> + * -1.
> + *
> + * This function walks the whole tree and not just first level children
> + * unless first_level_children_only is true.
>   */
> -static int find_next_iomem_res(struct resource *res, unsigned long desc,
> -			       bool first_level_children_only)
> +static int find_next_iomem_res(resource_size_t start, resource_size_t end,
> +			       unsigned long flags, unsigned long desc,
> +			       bool first_level_children_only,
> +			       struct resource *res)
>  {
> -	resource_size_t start, end;
>  	struct resource *p;
>  	bool sibling_only = false;
>  
>  	BUG_ON(!res);
> -
> -	start = res->start;
> -	end = res->end;
>  	BUG_ON(start >= end);
>  
>  	if (first_level_children_only)
> @@ -344,7 +347,7 @@ static int find_next_iomem_res(struct resource *res, unsigned long desc,
>  	read_lock(&resource_lock);
>  
>  	for (p = iomem_resource.child; p; p = next_resource(p, sibling_only)) {
> -		if ((p->flags & res->flags) != res->flags)
> +		if ((p->flags & flags) != flags)
>  			continue;
>  		if ((desc != IORES_DESC_NONE) && (desc != p->desc))
>  			continue;
> @@ -359,32 +362,31 @@ static int find_next_iomem_res(struct resource *res, unsigned long desc,
>  	read_unlock(&resource_lock);
>  	if (!p)
>  		return -1;
> +
>  	/* copy data */
> -	if (res->start < p->start)
> -		res->start = p->start;
> -	if (res->end > p->end)
> -		res->end = p->end;
> +	res->start = max(start, p->start);
> +	res->end = min(end, p->end);
>  	res->flags = p->flags;
>  	res->desc = p->desc;
>  	return 0;
>  }
>  
> -static int __walk_iomem_res_desc(struct resource *res, unsigned long desc,
> -				 bool first_level_children_only,
> -				 void *arg,
> +static int __walk_iomem_res_desc(resource_size_t start, resource_size_t end,
> +				 unsigned long flags, unsigned long desc,
> +				 bool first_level_children_only, void *arg,
>  				 int (*func)(struct resource *, void *))
>  {
> -	u64 orig_end = res->end;
> +	struct resource res;
>  	int ret = -1;
>  
> -	while ((res->start < res->end) &&
> -	       !find_next_iomem_res(res, desc, first_level_children_only)) {
> -		ret = (*func)(res, arg);
> +	while (start < end &&
> +	       !find_next_iomem_res(start, end, flags, desc,
> +				    first_level_children_only, &res)) {
> +		ret = (*func)(&res, arg);
>  		if (ret)
>  			break;
>  
> -		res->start = res->end + 1;
> -		res->end = orig_end;
> +		start = res.end + 1;
>  	}
>  
>  	return ret;
> @@ -407,13 +409,7 @@ static int __walk_iomem_res_desc(struct resource *res, unsigned long desc,
>  int walk_iomem_res_desc(unsigned long desc, unsigned long flags, u64 start,
>  		u64 end, void *arg, int (*func)(struct resource *, void *))
>  {
> -	struct resource res;
> -
> -	res.start = start;
> -	res.end = end;
> -	res.flags = flags;
> -
> -	return __walk_iomem_res_desc(&res, desc, false, arg, func);
> +	return __walk_iomem_res_desc(start, end, flags, desc, false, arg, func);
>  }
>  EXPORT_SYMBOL_GPL(walk_iomem_res_desc);
>  
> @@ -427,13 +423,9 @@ EXPORT_SYMBOL_GPL(walk_iomem_res_desc);
>  int walk_system_ram_res(u64 start, u64 end, void *arg,
>  				int (*func)(struct resource *, void *))
>  {
> -	struct resource res;
> -
> -	res.start = start;
> -	res.end = end;
> -	res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
> +	unsigned long flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
>  
> -	return __walk_iomem_res_desc(&res, IORES_DESC_NONE, true,
> +	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, true,
>  				     arg, func);
>  }
>  
> @@ -444,13 +436,9 @@ int walk_system_ram_res(u64 start, u64 end, void *arg,
>  int walk_mem_res(u64 start, u64 end, void *arg,
>  		 int (*func)(struct resource *, void *))
>  {
> -	struct resource res;
> -
> -	res.start = start;
> -	res.end = end;
> -	res.flags = IORESOURCE_MEM | IORESOURCE_BUSY;
> +	unsigned long flags = IORESOURCE_MEM | IORESOURCE_BUSY;
>  
> -	return __walk_iomem_res_desc(&res, IORES_DESC_NONE, true,
> +	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, true,
>  				     arg, func);
>  }
>  
> @@ -464,25 +452,25 @@ int walk_mem_res(u64 start, u64 end, void *arg,
>  int walk_system_ram_range(unsigned long start_pfn, unsigned long nr_pages,
>  		void *arg, int (*func)(unsigned long, unsigned long, void *))
>  {
> +	resource_size_t start, end;
> +	unsigned long flags;
>  	struct resource res;
>  	unsigned long pfn, end_pfn;
> -	u64 orig_end;
>  	int ret = -1;
>  
> -	res.start = (u64) start_pfn << PAGE_SHIFT;
> -	res.end = ((u64)(start_pfn + nr_pages) << PAGE_SHIFT) - 1;
> -	res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
> -	orig_end = res.end;
> -	while ((res.start < res.end) &&
> -		(find_next_iomem_res(&res, IORES_DESC_NONE, true) >= 0)) {
> +	start = (u64) start_pfn << PAGE_SHIFT;
> +	end = ((u64)(start_pfn + nr_pages) << PAGE_SHIFT) - 1;
> +	flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
> +	while (start < end &&
> +	       !find_next_iomem_res(start, end, flags, IORES_DESC_NONE,
> +				    true, &res)) {
>  		pfn = (res.start + PAGE_SIZE - 1) >> PAGE_SHIFT;
>  		end_pfn = (res.end + 1) >> PAGE_SHIFT;
>  		if (end_pfn > pfn)
>  			ret = (*func)(pfn, end_pfn - pfn, arg);
>  		if (ret)
>  			break;
> -		res.start = res.end + 1;
> -		res.end = orig_end;
> +		start = res.end + 1;
>  	}
>  	return ret;
>  }
> 
> 
> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/3] resource: Fix find_next_iomem_res() iteration issue
  2018-09-27  5:27       ` lijiang
@ 2018-09-27 14:03         ` Bjorn Helgaas
  2018-09-28  5:09           ` lijiang
  2018-09-28 13:10           ` Borislav Petkov
  0 siblings, 2 replies; 24+ messages in thread
From: Bjorn Helgaas @ 2018-09-27 14:03 UTC (permalink / raw)
  To: lijiang
  Cc: dan.j.williams, brijesh.singh, bhe, thomas.lendacky, tiwai, x86,
	kexec, linux-kernel, mingo, baiyaowei, hpa, tglx, bp, dyoung,
	akpm, Vivek Goyal

On Thu, Sep 27, 2018 at 01:27:41PM +0800, lijiang wrote:
> 在 2018年09月25日 06:15, Bjorn Helgaas 写道:
> > From: Bjorn Helgaas <bhelgaas@google.com>
> > 
> > Previously find_next_iomem_res() used "*res" as both an input parameter for
> > the range to search and the type of resource to search for, and an output
> > parameter for the resource we found, which makes the interface confusing
> > and hard to use correctly.
> > 
> > All callers allocate a single struct resource and use it for repeated calls
> > to find_next_iomem_res().  When find_next_iomem_res() returns a resource,
> > it overwrites the start, end, flags, and desc members of the struct.  If we
> > call find_next_iomem_res() again, we must update or restore these fields.
> > 
> > The callers (__walk_iomem_res_desc() and walk_system_ram_range()) do not
> > restore res->flags, so if the caller is searching for flags of
> > IORESOURCE_MEM | IORESOURCE_BUSY and finds a resource with flags of
> > IORESOURCE_MEM | IORESOURCE_BUSY | IORESOURCE_SYSRAM, the next search will
> > find only resources marked as IORESOURCE_SYSRAM.
> > 
> > Fix this by restructuring the interface so it takes explicit "start, end,
> > flags" parameters and uses "*res" only as an output parameter.
> 
> Hi, Bjorn
> I personally suggest that some comments might be added in the code, make it clear
> and easy to understand, then which could avoid the old confusion and more code changes.

Since I think the current interface (using *res as both input and
output parameters that have very different meanings) is confusing,
it's hard for *me* to write comments that make it less confusing, but
of course, you're welcome to propose something.

My opinion (probably not universally shared) is that my proposal would
make the code more readable, and it's worth doing even though the diff
is larger.

Anyway, I'll post these patches independently and see if anybody else
has an opinion.

Bjorn

> > Original-patch: http://lore.kernel.org/lkml/20180921073211.20097-2-lijiang@redhat.com
> > Based-on-patch-by: Lianbo Jiang <lijiang@redhat.com>
> > Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> > ---
> >  kernel/resource.c |   94 +++++++++++++++++++++++------------------------------
> >  1 file changed, 41 insertions(+), 53 deletions(-)
> > 
> > diff --git a/kernel/resource.c b/kernel/resource.c
> > index 155ec873ea4d..9891ea90cc8f 100644
> > --- a/kernel/resource.c
> > +++ b/kernel/resource.c
> > @@ -319,23 +319,26 @@ int release_resource(struct resource *old)
> >  EXPORT_SYMBOL(release_resource);
> >  
> >  /*
> > - * Finds the lowest iomem resource existing within [res->start..res->end].
> > - * The caller must specify res->start, res->end, res->flags, and optionally
> > - * desc.  If found, returns 0, res is overwritten, if not found, returns -1.
> > - * This function walks the whole tree and not just first level children until
> > - * and unless first_level_children_only is true.
> > + * Finds the lowest iomem resource that covers part of [start..end].  The
> > + * caller must specify start, end, flags, and desc (which may be
> > + * IORES_DESC_NONE).
> > + *
> > + * If a resource is found, returns 0 and *res is overwritten with the part
> > + * of the resource that's within [start..end]; if none is found, returns
> > + * -1.
> > + *
> > + * This function walks the whole tree and not just first level children
> > + * unless first_level_children_only is true.
> >   */
> > -static int find_next_iomem_res(struct resource *res, unsigned long desc,
> > -			       bool first_level_children_only)
> > +static int find_next_iomem_res(resource_size_t start, resource_size_t end,
> > +			       unsigned long flags, unsigned long desc,
> > +			       bool first_level_children_only,
> > +			       struct resource *res)
> >  {
> > -	resource_size_t start, end;
> >  	struct resource *p;
> >  	bool sibling_only = false;
> >  
> >  	BUG_ON(!res);
> > -
> > -	start = res->start;
> > -	end = res->end;
> >  	BUG_ON(start >= end);
> >  
> >  	if (first_level_children_only)
> > @@ -344,7 +347,7 @@ static int find_next_iomem_res(struct resource *res, unsigned long desc,
> >  	read_lock(&resource_lock);
> >  
> >  	for (p = iomem_resource.child; p; p = next_resource(p, sibling_only)) {
> > -		if ((p->flags & res->flags) != res->flags)
> > +		if ((p->flags & flags) != flags)
> >  			continue;
> >  		if ((desc != IORES_DESC_NONE) && (desc != p->desc))
> >  			continue;
> > @@ -359,32 +362,31 @@ static int find_next_iomem_res(struct resource *res, unsigned long desc,
> >  	read_unlock(&resource_lock);
> >  	if (!p)
> >  		return -1;
> > +
> >  	/* copy data */
> > -	if (res->start < p->start)
> > -		res->start = p->start;
> > -	if (res->end > p->end)
> > -		res->end = p->end;
> > +	res->start = max(start, p->start);
> > +	res->end = min(end, p->end);
> >  	res->flags = p->flags;
> >  	res->desc = p->desc;
> >  	return 0;
> >  }
> >  
> > -static int __walk_iomem_res_desc(struct resource *res, unsigned long desc,
> > -				 bool first_level_children_only,
> > -				 void *arg,
> > +static int __walk_iomem_res_desc(resource_size_t start, resource_size_t end,
> > +				 unsigned long flags, unsigned long desc,
> > +				 bool first_level_children_only, void *arg,
> >  				 int (*func)(struct resource *, void *))
> >  {
> > -	u64 orig_end = res->end;
> > +	struct resource res;
> >  	int ret = -1;
> >  
> > -	while ((res->start < res->end) &&
> > -	       !find_next_iomem_res(res, desc, first_level_children_only)) {
> > -		ret = (*func)(res, arg);
> > +	while (start < end &&
> > +	       !find_next_iomem_res(start, end, flags, desc,
> > +				    first_level_children_only, &res)) {
> > +		ret = (*func)(&res, arg);
> >  		if (ret)
> >  			break;
> >  
> > -		res->start = res->end + 1;
> > -		res->end = orig_end;
> > +		start = res.end + 1;
> >  	}
> >  
> >  	return ret;
> > @@ -407,13 +409,7 @@ static int __walk_iomem_res_desc(struct resource *res, unsigned long desc,
> >  int walk_iomem_res_desc(unsigned long desc, unsigned long flags, u64 start,
> >  		u64 end, void *arg, int (*func)(struct resource *, void *))
> >  {
> > -	struct resource res;
> > -
> > -	res.start = start;
> > -	res.end = end;
> > -	res.flags = flags;
> > -
> > -	return __walk_iomem_res_desc(&res, desc, false, arg, func);
> > +	return __walk_iomem_res_desc(start, end, flags, desc, false, arg, func);
> >  }
> >  EXPORT_SYMBOL_GPL(walk_iomem_res_desc);
> >  
> > @@ -427,13 +423,9 @@ EXPORT_SYMBOL_GPL(walk_iomem_res_desc);
> >  int walk_system_ram_res(u64 start, u64 end, void *arg,
> >  				int (*func)(struct resource *, void *))
> >  {
> > -	struct resource res;
> > -
> > -	res.start = start;
> > -	res.end = end;
> > -	res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
> > +	unsigned long flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
> >  
> > -	return __walk_iomem_res_desc(&res, IORES_DESC_NONE, true,
> > +	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, true,
> >  				     arg, func);
> >  }
> >  
> > @@ -444,13 +436,9 @@ int walk_system_ram_res(u64 start, u64 end, void *arg,
> >  int walk_mem_res(u64 start, u64 end, void *arg,
> >  		 int (*func)(struct resource *, void *))
> >  {
> > -	struct resource res;
> > -
> > -	res.start = start;
> > -	res.end = end;
> > -	res.flags = IORESOURCE_MEM | IORESOURCE_BUSY;
> > +	unsigned long flags = IORESOURCE_MEM | IORESOURCE_BUSY;
> >  
> > -	return __walk_iomem_res_desc(&res, IORES_DESC_NONE, true,
> > +	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, true,
> >  				     arg, func);
> >  }
> >  
> > @@ -464,25 +452,25 @@ int walk_mem_res(u64 start, u64 end, void *arg,
> >  int walk_system_ram_range(unsigned long start_pfn, unsigned long nr_pages,
> >  		void *arg, int (*func)(unsigned long, unsigned long, void *))
> >  {
> > +	resource_size_t start, end;
> > +	unsigned long flags;
> >  	struct resource res;
> >  	unsigned long pfn, end_pfn;
> > -	u64 orig_end;
> >  	int ret = -1;
> >  
> > -	res.start = (u64) start_pfn << PAGE_SHIFT;
> > -	res.end = ((u64)(start_pfn + nr_pages) << PAGE_SHIFT) - 1;
> > -	res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
> > -	orig_end = res.end;
> > -	while ((res.start < res.end) &&
> > -		(find_next_iomem_res(&res, IORES_DESC_NONE, true) >= 0)) {
> > +	start = (u64) start_pfn << PAGE_SHIFT;
> > +	end = ((u64)(start_pfn + nr_pages) << PAGE_SHIFT) - 1;
> > +	flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
> > +	while (start < end &&
> > +	       !find_next_iomem_res(start, end, flags, IORES_DESC_NONE,
> > +				    true, &res)) {
> >  		pfn = (res.start + PAGE_SIZE - 1) >> PAGE_SHIFT;
> >  		end_pfn = (res.end + 1) >> PAGE_SHIFT;
> >  		if (end_pfn > pfn)
> >  			ret = (*func)(pfn, end_pfn - pfn, arg);
> >  		if (ret)
> >  			break;
> > -		res.start = res.end + 1;
> > -		res.end = orig_end;
> > +		start = res.end + 1;
> >  	}
> >  	return ret;
> >  }
> > 
> > 
> > _______________________________________________
> > kexec mailing list
> > kexec@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/kexec
> > 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/3] resource: Fix find_next_iomem_res() iteration issue
  2018-09-27 14:03         ` Bjorn Helgaas
@ 2018-09-28  5:09           ` lijiang
  2018-09-28 13:10           ` Borislav Petkov
  1 sibling, 0 replies; 24+ messages in thread
From: lijiang @ 2018-09-28  5:09 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: dan.j.williams, brijesh.singh, bhe, thomas.lendacky, tiwai, x86,
	kexec, linux-kernel, mingo, baiyaowei, hpa, tglx, bp, dyoung,
	akpm, Vivek Goyal



在 2018年09月27日 22:03, Bjorn Helgaas 写道:
> On Thu, Sep 27, 2018 at 01:27:41PM +0800, lijiang wrote:
>> 在 2018年09月25日 06:15, Bjorn Helgaas 写道:
>>> From: Bjorn Helgaas <bhelgaas@google.com>
>>>
>>> Previously find_next_iomem_res() used "*res" as both an input parameter for
>>> the range to search and the type of resource to search for, and an output
>>> parameter for the resource we found, which makes the interface confusing
>>> and hard to use correctly.
>>>
>>> All callers allocate a single struct resource and use it for repeated calls
>>> to find_next_iomem_res().  When find_next_iomem_res() returns a resource,
>>> it overwrites the start, end, flags, and desc members of the struct.  If we
>>> call find_next_iomem_res() again, we must update or restore these fields.
>>>
>>> The callers (__walk_iomem_res_desc() and walk_system_ram_range()) do not
>>> restore res->flags, so if the caller is searching for flags of
>>> IORESOURCE_MEM | IORESOURCE_BUSY and finds a resource with flags of
>>> IORESOURCE_MEM | IORESOURCE_BUSY | IORESOURCE_SYSRAM, the next search will
>>> find only resources marked as IORESOURCE_SYSRAM.
>>>
>>> Fix this by restructuring the interface so it takes explicit "start, end,
>>> flags" parameters and uses "*res" only as an output parameter.
>>
>> Hi, Bjorn
>> I personally suggest that some comments might be added in the code, make it clear
>> and easy to understand, then which could avoid the old confusion and more code changes.
> 
> Since I think the current interface (using *res as both input and
> output parameters that have very different meanings) is confusing,
> it's hard for *me* to write comments that make it less confusing, but
> of course, you're welcome to propose something.
> 
> My opinion (probably not universally shared) is that my proposal would
> make the code more readable, and it's worth doing even though the diff
> is larger.
> 
> Anyway, I'll post these patches independently and see if anybody else
> has an opinion.
> 

I don't mind at all, because that is just my own opinion about more code
changes.

Anyway, thank you to improve this patch.

Regards,
Lianbo

> Bjorn
> 
>>> Original-patch: http://lore.kernel.org/lkml/20180921073211.20097-2-lijiang@redhat.com
>>> Based-on-patch-by: Lianbo Jiang <lijiang@redhat.com>
>>> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>>> ---
>>>  kernel/resource.c |   94 +++++++++++++++++++++++------------------------------
>>>  1 file changed, 41 insertions(+), 53 deletions(-)
>>>
>>> diff --git a/kernel/resource.c b/kernel/resource.c
>>> index 155ec873ea4d..9891ea90cc8f 100644
>>> --- a/kernel/resource.c
>>> +++ b/kernel/resource.c
>>> @@ -319,23 +319,26 @@ int release_resource(struct resource *old)
>>>  EXPORT_SYMBOL(release_resource);
>>>  
>>>  /*
>>> - * Finds the lowest iomem resource existing within [res->start..res->end].
>>> - * The caller must specify res->start, res->end, res->flags, and optionally
>>> - * desc.  If found, returns 0, res is overwritten, if not found, returns -1.
>>> - * This function walks the whole tree and not just first level children until
>>> - * and unless first_level_children_only is true.
>>> + * Finds the lowest iomem resource that covers part of [start..end].  The
>>> + * caller must specify start, end, flags, and desc (which may be
>>> + * IORES_DESC_NONE).
>>> + *
>>> + * If a resource is found, returns 0 and *res is overwritten with the part
>>> + * of the resource that's within [start..end]; if none is found, returns
>>> + * -1.
>>> + *
>>> + * This function walks the whole tree and not just first level children
>>> + * unless first_level_children_only is true.
>>>   */
>>> -static int find_next_iomem_res(struct resource *res, unsigned long desc,
>>> -			       bool first_level_children_only)
>>> +static int find_next_iomem_res(resource_size_t start, resource_size_t end,
>>> +			       unsigned long flags, unsigned long desc,
>>> +			       bool first_level_children_only,
>>> +			       struct resource *res)
>>>  {
>>> -	resource_size_t start, end;
>>>  	struct resource *p;
>>>  	bool sibling_only = false;
>>>  
>>>  	BUG_ON(!res);
>>> -
>>> -	start = res->start;
>>> -	end = res->end;
>>>  	BUG_ON(start >= end);
>>>  
>>>  	if (first_level_children_only)
>>> @@ -344,7 +347,7 @@ static int find_next_iomem_res(struct resource *res, unsigned long desc,
>>>  	read_lock(&resource_lock);
>>>  
>>>  	for (p = iomem_resource.child; p; p = next_resource(p, sibling_only)) {
>>> -		if ((p->flags & res->flags) != res->flags)
>>> +		if ((p->flags & flags) != flags)
>>>  			continue;
>>>  		if ((desc != IORES_DESC_NONE) && (desc != p->desc))
>>>  			continue;
>>> @@ -359,32 +362,31 @@ static int find_next_iomem_res(struct resource *res, unsigned long desc,
>>>  	read_unlock(&resource_lock);
>>>  	if (!p)
>>>  		return -1;
>>> +
>>>  	/* copy data */
>>> -	if (res->start < p->start)
>>> -		res->start = p->start;
>>> -	if (res->end > p->end)
>>> -		res->end = p->end;
>>> +	res->start = max(start, p->start);
>>> +	res->end = min(end, p->end);
>>>  	res->flags = p->flags;
>>>  	res->desc = p->desc;
>>>  	return 0;
>>>  }
>>>  
>>> -static int __walk_iomem_res_desc(struct resource *res, unsigned long desc,
>>> -				 bool first_level_children_only,
>>> -				 void *arg,
>>> +static int __walk_iomem_res_desc(resource_size_t start, resource_size_t end,
>>> +				 unsigned long flags, unsigned long desc,
>>> +				 bool first_level_children_only, void *arg,
>>>  				 int (*func)(struct resource *, void *))
>>>  {
>>> -	u64 orig_end = res->end;
>>> +	struct resource res;
>>>  	int ret = -1;
>>>  
>>> -	while ((res->start < res->end) &&
>>> -	       !find_next_iomem_res(res, desc, first_level_children_only)) {
>>> -		ret = (*func)(res, arg);
>>> +	while (start < end &&
>>> +	       !find_next_iomem_res(start, end, flags, desc,
>>> +				    first_level_children_only, &res)) {
>>> +		ret = (*func)(&res, arg);
>>>  		if (ret)
>>>  			break;
>>>  
>>> -		res->start = res->end + 1;
>>> -		res->end = orig_end;
>>> +		start = res.end + 1;
>>>  	}
>>>  
>>>  	return ret;
>>> @@ -407,13 +409,7 @@ static int __walk_iomem_res_desc(struct resource *res, unsigned long desc,
>>>  int walk_iomem_res_desc(unsigned long desc, unsigned long flags, u64 start,
>>>  		u64 end, void *arg, int (*func)(struct resource *, void *))
>>>  {
>>> -	struct resource res;
>>> -
>>> -	res.start = start;
>>> -	res.end = end;
>>> -	res.flags = flags;
>>> -
>>> -	return __walk_iomem_res_desc(&res, desc, false, arg, func);
>>> +	return __walk_iomem_res_desc(start, end, flags, desc, false, arg, func);
>>>  }
>>>  EXPORT_SYMBOL_GPL(walk_iomem_res_desc);
>>>  
>>> @@ -427,13 +423,9 @@ EXPORT_SYMBOL_GPL(walk_iomem_res_desc);
>>>  int walk_system_ram_res(u64 start, u64 end, void *arg,
>>>  				int (*func)(struct resource *, void *))
>>>  {
>>> -	struct resource res;
>>> -
>>> -	res.start = start;
>>> -	res.end = end;
>>> -	res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
>>> +	unsigned long flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
>>>  
>>> -	return __walk_iomem_res_desc(&res, IORES_DESC_NONE, true,
>>> +	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, true,
>>>  				     arg, func);
>>>  }
>>>  
>>> @@ -444,13 +436,9 @@ int walk_system_ram_res(u64 start, u64 end, void *arg,
>>>  int walk_mem_res(u64 start, u64 end, void *arg,
>>>  		 int (*func)(struct resource *, void *))
>>>  {
>>> -	struct resource res;
>>> -
>>> -	res.start = start;
>>> -	res.end = end;
>>> -	res.flags = IORESOURCE_MEM | IORESOURCE_BUSY;
>>> +	unsigned long flags = IORESOURCE_MEM | IORESOURCE_BUSY;
>>>  
>>> -	return __walk_iomem_res_desc(&res, IORES_DESC_NONE, true,
>>> +	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, true,
>>>  				     arg, func);
>>>  }
>>>  
>>> @@ -464,25 +452,25 @@ int walk_mem_res(u64 start, u64 end, void *arg,
>>>  int walk_system_ram_range(unsigned long start_pfn, unsigned long nr_pages,
>>>  		void *arg, int (*func)(unsigned long, unsigned long, void *))
>>>  {
>>> +	resource_size_t start, end;
>>> +	unsigned long flags;
>>>  	struct resource res;
>>>  	unsigned long pfn, end_pfn;
>>> -	u64 orig_end;
>>>  	int ret = -1;
>>>  
>>> -	res.start = (u64) start_pfn << PAGE_SHIFT;
>>> -	res.end = ((u64)(start_pfn + nr_pages) << PAGE_SHIFT) - 1;
>>> -	res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
>>> -	orig_end = res.end;
>>> -	while ((res.start < res.end) &&
>>> -		(find_next_iomem_res(&res, IORES_DESC_NONE, true) >= 0)) {
>>> +	start = (u64) start_pfn << PAGE_SHIFT;
>>> +	end = ((u64)(start_pfn + nr_pages) << PAGE_SHIFT) - 1;
>>> +	flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
>>> +	while (start < end &&
>>> +	       !find_next_iomem_res(start, end, flags, IORES_DESC_NONE,
>>> +				    true, &res)) {
>>>  		pfn = (res.start + PAGE_SIZE - 1) >> PAGE_SHIFT;
>>>  		end_pfn = (res.end + 1) >> PAGE_SHIFT;
>>>  		if (end_pfn > pfn)
>>>  			ret = (*func)(pfn, end_pfn - pfn, arg);
>>>  		if (ret)
>>>  			break;
>>> -		res.start = res.end + 1;
>>> -		res.end = orig_end;
>>> +		start = res.end + 1;
>>>  	}
>>>  	return ret;
>>>  }
>>>
>>>
>>> _______________________________________________
>>> kexec mailing list
>>> kexec@lists.infradead.org
>>> http://lists.infradead.org/mailman/listinfo/kexec
>>>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/3] resource: Fix find_next_iomem_res() iteration issue
  2018-09-27 14:03         ` Bjorn Helgaas
  2018-09-28  5:09           ` lijiang
@ 2018-09-28 13:10           ` Borislav Petkov
  1 sibling, 0 replies; 24+ messages in thread
From: Borislav Petkov @ 2018-09-28 13:10 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: lijiang, dan.j.williams, brijesh.singh, bhe, thomas.lendacky,
	tiwai, x86, kexec, linux-kernel, mingo, baiyaowei, hpa, tglx,
	dyoung, akpm, Vivek Goyal

On Thu, Sep 27, 2018 at 09:03:51AM -0500, Bjorn Helgaas wrote:
> Since I think the current interface (using *res as both input and
> output parameters that have very different meanings) is confusing,

FTR, I too, think that this whole machinery in resource.c with passing
in a function and a struct resource pointer and is clumsy and could use
a rewrite. When there's time... ;-\

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/3 v3] add reserved e820 ranges to the kdump kernel e820 table
  2018-09-21  7:32 [PATCH 0/3 v3] add reserved e820 ranges to the kdump kernel e820 table Lianbo Jiang
                   ` (2 preceding siblings ...)
  2018-09-21  7:32 ` [PATCH 3/3 v3] x86/kexec_file: add reserved e820 ranges to kdump kernel e820 table Lianbo Jiang
@ 2018-10-16  2:56 ` Dave Young
  2018-10-16  3:45   ` lijiang
  3 siblings, 1 reply; 24+ messages in thread
From: Dave Young @ 2018-10-16  2:56 UTC (permalink / raw)
  To: Lianbo Jiang
  Cc: linux-kernel, kexec, tglx, mingo, hpa, x86, akpm, dan.j.williams,
	thomas.lendacky, bhelgaas, baiyaowei, tiwai, bp, brijesh.singh,
	bhe

On 09/21/18 at 03:32pm, Lianbo Jiang wrote:
> E820 reserved ranges is useful in kdump kernel, we have added this in
> kexec-tools code.
> 
> One reason is PCI mmconf (extended mode) requires reserved region otherwise
> it falls back to legacy mode.
> 
> Furthermore, when AMD SME kdump support, it needs to map dmi table area as
> unencrypted. For normal boot, these ranges sit in e820 reserved ranges,
> thus the early ioremap code naturally map them as unencrypted. If we also
> have same e820 reserve setup in kdump kernel then it will just work like
> normal kernel.
> 
> Kdump uses walk_iomem_res_desc to iterate resources, then adds matched desc
> to e820 table for the kdump kernel.
> 
> But IORES_DESC_NONE resource type includes several different e820 types, we
> need add exact e820 type to the kdump kernel e820 table, thus it also needs
> an extra checking in memmap_entry_callback() to match the e820 type and
> resource name.
> 
> By the way, we also fix an error which walks through iomem resources, the
> values of the function parameter may be modified in the while loop of
> __walk_iomem_res_desc(), which will cause us to not get the desired result
> in some cases.
> 
> Changes since v2:
> 1. Modified the value of flags to "0", when walking through the whole
> tree for e820 reserved ranges.
> 2. Modified the invalid SOB chain issue.
> 
> Lianbo Jiang (3):
>   resource: fix an error which walks through iomem resources
>   x86/kexec_file: add e820 entry in case e820 type string matches to io
>     resource name
>   x86/kexec_file: add reserved e820 ranges to kdump kernel e820 table

Lianbo, since Bjorn has fixed the resource issue in another patchset,
can you rebase your patch 2 and 3 on top of his patches and resend?

> 
>  arch/x86/include/asm/e820/api.h |  2 ++
>  arch/x86/kernel/crash.c         | 11 ++++++++++-
>  arch/x86/kernel/e820.c          |  2 +-
>  kernel/resource.c               |  3 +++
>  4 files changed, 16 insertions(+), 2 deletions(-)
> 
> -- 
> 2.17.1
> 

Thanks
Dave

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/3 v3] add reserved e820 ranges to the kdump kernel e820 table
  2018-10-16  2:56 ` [PATCH 0/3 v3] add reserved e820 ranges to the " Dave Young
@ 2018-10-16  3:45   ` lijiang
  0 siblings, 0 replies; 24+ messages in thread
From: lijiang @ 2018-10-16  3:45 UTC (permalink / raw)
  To: Dave Young
  Cc: linux-kernel, kexec, tglx, mingo, hpa, x86, akpm, dan.j.williams,
	thomas.lendacky, bhelgaas, baiyaowei, tiwai, bp, brijesh.singh,
	bhe

在 2018年10月16日 10:56, Dave Young 写道:
> On 09/21/18 at 03:32pm, Lianbo Jiang wrote:
>> E820 reserved ranges is useful in kdump kernel, we have added this in
>> kexec-tools code.
>>
>> One reason is PCI mmconf (extended mode) requires reserved region otherwise
>> it falls back to legacy mode.
>>
>> Furthermore, when AMD SME kdump support, it needs to map dmi table area as
>> unencrypted. For normal boot, these ranges sit in e820 reserved ranges,
>> thus the early ioremap code naturally map them as unencrypted. If we also
>> have same e820 reserve setup in kdump kernel then it will just work like
>> normal kernel.
>>
>> Kdump uses walk_iomem_res_desc to iterate resources, then adds matched desc
>> to e820 table for the kdump kernel.
>>
>> But IORES_DESC_NONE resource type includes several different e820 types, we
>> need add exact e820 type to the kdump kernel e820 table, thus it also needs
>> an extra checking in memmap_entry_callback() to match the e820 type and
>> resource name.
>>
>> By the way, we also fix an error which walks through iomem resources, the
>> values of the function parameter may be modified in the while loop of
>> __walk_iomem_res_desc(), which will cause us to not get the desired result
>> in some cases.
>>
>> Changes since v2:
>> 1. Modified the value of flags to "0", when walking through the whole
>> tree for e820 reserved ranges.
>> 2. Modified the invalid SOB chain issue.
>>
>> Lianbo Jiang (3):
>>   resource: fix an error which walks through iomem resources
>>   x86/kexec_file: add e820 entry in case e820 type string matches to io
>>     resource name
>>   x86/kexec_file: add reserved e820 ranges to kdump kernel e820 table
> 
> Lianbo, since Bjorn has fixed the resource issue in another patchset,
> can you rebase your patch 2 and 3 on top of his patches and resend?
> 
Ok, Thanks for your reminder.
I will adjust these patches and post again.

Thanks.
Lianbo
>>
>>  arch/x86/include/asm/e820/api.h |  2 ++
>>  arch/x86/kernel/crash.c         | 11 ++++++++++-
>>  arch/x86/kernel/e820.c          |  2 +-
>>  kernel/resource.c               |  3 +++
>>  4 files changed, 16 insertions(+), 2 deletions(-)
>>
>> -- 
>> 2.17.1
>>
> 
> Thanks
> Dave
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/3] resource: Fix find_next_iomem_res() iteration issue
  2018-10-09 17:30     ` Bjorn Helgaas
@ 2018-10-09 17:35       ` Borislav Petkov
  0 siblings, 0 replies; 24+ messages in thread
From: Borislav Petkov @ 2018-10-09 17:35 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Bjorn Helgaas, Linux Kernel Mailing List, lijiang, Vivek Goyal,
	kexec, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	Andrew Morton, Dan Williams, thomas.lendacky, baiyaowei,
	Takashi Iwai, brijesh.singh, dyoung, Baoquan He

On Tue, Oct 09, 2018 at 12:30:33PM -0500, Bjorn Helgaas wrote:
> Sorry, I don't know what happened here because I didn't see these
> comments until today.  I suspect what happened was that my Gmail
> filter auto-filed them in my linux-kernel folder, which I don't read.
> On my @google.com account, I have another filter that pull out things
> addressed directly to me, which I *do* read.  But this thread didn't
> cc that account until the tip-bot message, which *did* cc my @google
> account because that's how I signed off the patches.  Sigh.

No worries, it all should be taken care of now! :-)

> Anyway, it looks like this stuff is on its way; let me know
> (bhelgaas@google.com) if I should do anything else.  I would address
> your comments above, but since this seems to be applied and I saw a
> cleanup patch from you, I assume you already took care of them.

Yeah, I think I've covered them all but if you see something out of the
ordinary, let me know so that I can fix it.

Thanks!

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/3] resource: Fix find_next_iomem_res() iteration issue
  2018-09-28 16:41   ` Borislav Petkov
@ 2018-10-09 17:30     ` Bjorn Helgaas
  2018-10-09 17:35       ` Borislav Petkov
  0 siblings, 1 reply; 24+ messages in thread
From: Bjorn Helgaas @ 2018-10-09 17:30 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Bjorn Helgaas, Linux Kernel Mailing List, lijiang, Vivek Goyal,
	kexec, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	Andrew Morton, Dan Williams, thomas.lendacky, baiyaowei,
	Takashi Iwai, brijesh.singh, dyoung, Baoquan He, Bjorn Helgaas

On Fri, Sep 28, 2018 at 11:42 AM Borislav Petkov <bp@suse.de> wrote:
>
> On Thu, Sep 27, 2018 at 09:22:09AM -0500, Bjorn Helgaas wrote:
> > From: Bjorn Helgaas <bhelgaas@google.com>
> >
> > Previously find_next_iomem_res() used "*res" as both an input parameter for
> > the range to search and the type of resource to search for, and an output
> > parameter for the resource we found, which makes the interface confusing.
> >
> > The current callers use find_next_iomem_res() incorrectly because they
> > allocate a single struct resource and use it for repeated calls to
> > find_next_iomem_res().  When find_next_iomem_res() returns a resource, it
> > overwrites the start, end, flags, and desc members of the struct.  If we
> > call find_next_iomem_res() again, we must update or restore these fields.
> > The previous code restored res.start and res.end, but not res.flags or
> > res.desc.
>
> ... which is a sure sign that the design of this thing is not the best one.
>
> >
> > Since the callers did not restore res.flags, if they searched for flags
> > IORESOURCE_MEM | IORESOURCE_BUSY and found a resource with flags
> > IORESOURCE_MEM | IORESOURCE_BUSY | IORESOURCE_SYSRAM, the next search would
> > incorrectly skip resources unless they were also marked as
> > IORESOURCE_SYSRAM.
>
> Nice example!
>
> > Fix this by restructuring the interface so it takes explicit "start, end,
> > flags" parameters and uses "*res" only as an output parameter.
> >
> > Original-patch: http://lore.kernel.org/lkml/20180921073211.20097-2-lijiang@redhat.com
> > Based-on-patch-by: Lianbo Jiang <lijiang@redhat.com>
> > Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> > ---
> >  kernel/resource.c |   94 +++++++++++++++++++++++------------------------------
> >  1 file changed, 41 insertions(+), 53 deletions(-)
> >
> > diff --git a/kernel/resource.c b/kernel/resource.c
> > index 155ec873ea4d..9891ea90cc8f 100644
> > --- a/kernel/resource.c
> > +++ b/kernel/resource.c
> > @@ -319,23 +319,26 @@ int release_resource(struct resource *old)
> >  EXPORT_SYMBOL(release_resource);
> >
> >  /*
>
> I guess this could be made kernel-doc, since you're touching it...
>
> > - * Finds the lowest iomem resource existing within [res->start..res->end].
> > - * The caller must specify res->start, res->end, res->flags, and optionally
> > - * desc.  If found, returns 0, res is overwritten, if not found, returns -1.
> > - * This function walks the whole tree and not just first level children until
> > - * and unless first_level_children_only is true.
> > + * Finds the lowest iomem resource that covers part of [start..end].  The
> > + * caller must specify start, end, flags, and desc (which may be
> > + * IORES_DESC_NONE).
> > + *
> > + * If a resource is found, returns 0 and *res is overwritten with the part
> > + * of the resource that's within [start..end]; if none is found, returns
> > + * -1.
> > + *
> > + * This function walks the whole tree and not just first level children
> > + * unless first_level_children_only is true.
>
> ... and then prepend that with '@' - @first_level_children_only to refer
> to the function parameter.
>
> >   */
> > -static int find_next_iomem_res(struct resource *res, unsigned long desc,
> > -                            bool first_level_children_only)
> > +static int find_next_iomem_res(resource_size_t start, resource_size_t end,
> > +                            unsigned long flags, unsigned long desc,
> > +                            bool first_level_children_only,
> > +                            struct resource *res)
> >  {
> > -     resource_size_t start, end;
> >       struct resource *p;
> >       bool sibling_only = false;
> >
> >       BUG_ON(!res);
> > -
> > -     start = res->start;
> > -     end = res->end;
> >       BUG_ON(start >= end);
>
> And since we're touching this, maybe replace that BUG_ON() fun with
> simply return -EINVAL or some error code...
>
> >
> >       if (first_level_children_only)
>
>         if (first_level_children_only)
>                 sibling_only = true;
>
> So this is just silly - replacing a bool function param with a local bool
> var.
>
> You could kill that, shorten first_level_children_only's name and use it
> directly.
>
> Depending on how much cleanup it amounts to, you could make that a
> separate cleanup patch ontop, to keep the changes from the cleanup
> separate.
>
> > @@ -344,7 +347,7 @@ static int find_next_iomem_res(struct resource *res, unsigned long desc,
> >       read_lock(&resource_lock);
> >
> >       for (p = iomem_resource.child; p; p = next_resource(p, sibling_only)) {
> > -             if ((p->flags & res->flags) != res->flags)
> > +             if ((p->flags & flags) != flags)
> >                       continue;
> >               if ((desc != IORES_DESC_NONE) && (desc != p->desc))
> >                       continue;
> > @@ -359,32 +362,31 @@ static int find_next_iomem_res(struct resource *res, unsigned long desc,
> >       read_unlock(&resource_lock);
> >       if (!p)
> >               return -1;
> > +
> >       /* copy data */
> > -     if (res->start < p->start)
> > -             res->start = p->start;
> > -     if (res->end > p->end)
> > -             res->end = p->end;
> > +     res->start = max(start, p->start);
> > +     res->end = min(end, p->end);
>
> Should we use the min_t and max_t versions here for typechecking?
>
> >       res->flags = p->flags;
> >       res->desc = p->desc;
> >       return 0;
> >  }
> >
> > -static int __walk_iomem_res_desc(struct resource *res, unsigned long desc,
> > -                              bool first_level_children_only,
> > -                              void *arg,
> > +static int __walk_iomem_res_desc(resource_size_t start, resource_size_t end,
> > +                              unsigned long flags, unsigned long desc,
> > +                              bool first_level_children_only, void *arg,
> >                                int (*func)(struct resource *, void *))
> >  {
> > -     u64 orig_end = res->end;
> > +     struct resource res;
> >       int ret = -1;
> >
> > -     while ((res->start < res->end) &&
> > -            !find_next_iomem_res(res, desc, first_level_children_only)) {
> > -             ret = (*func)(res, arg);
> > +     while (start < end &&
> > +            !find_next_iomem_res(start, end, flags, desc,
> > +                                 first_level_children_only, &res)) {
> > +             ret = (*func)(&res, arg);
> >               if (ret)
> >                       break;
> >
> > -             res->start = res->end + 1;
> > -             res->end = orig_end;
> > +             start = res.end + 1;
> >       }
> >
> >       return ret;
> > @@ -407,13 +409,7 @@ static int __walk_iomem_res_desc(struct resource *res, unsigned long desc,
> >  int walk_iomem_res_desc(unsigned long desc, unsigned long flags, u64 start,
> >               u64 end, void *arg, int (*func)(struct resource *, void *))
>
> Align that function's parameters on the opening brace, pls, while you're
> at it.
>
> >  {
> > -     struct resource res;
> > -
> > -     res.start = start;
> > -     res.end = end;
> > -     res.flags = flags;
> > -
> > -     return __walk_iomem_res_desc(&res, desc, false, arg, func);
> > +     return __walk_iomem_res_desc(start, end, flags, desc, false, arg, func);
> >  }
> >  EXPORT_SYMBOL_GPL(walk_iomem_res_desc);
> >
> > @@ -427,13 +423,9 @@ EXPORT_SYMBOL_GPL(walk_iomem_res_desc);
> >  int walk_system_ram_res(u64 start, u64 end, void *arg,
> >                               int (*func)(struct resource *, void *))
>
> Ditto.
>
> >  {
> > -     struct resource res;
> > -
> > -     res.start = start;
> > -     res.end = end;
> > -     res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
> > +     unsigned long flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
> >
> > -     return __walk_iomem_res_desc(&res, IORES_DESC_NONE, true,
> > +     return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, true,
> >                                    arg, func);
> >  }
> >
> > @@ -444,13 +436,9 @@ int walk_system_ram_res(u64 start, u64 end, void *arg,
> >  int walk_mem_res(u64 start, u64 end, void *arg,
> >                int (*func)(struct resource *, void *))
> >  {
> > -     struct resource res;
> > -
> > -     res.start = start;
> > -     res.end = end;
> > -     res.flags = IORESOURCE_MEM | IORESOURCE_BUSY;
> > +     unsigned long flags = IORESOURCE_MEM | IORESOURCE_BUSY;
> >
> > -     return __walk_iomem_res_desc(&res, IORES_DESC_NONE, true,
> > +     return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, true,
> >                                    arg, func);
> >  }
> >
> > @@ -464,25 +452,25 @@ int walk_mem_res(u64 start, u64 end, void *arg,
> >  int walk_system_ram_range(unsigned long start_pfn, unsigned long nr_pages,
> >               void *arg, int (*func)(unsigned long, unsigned long, void *))
>
> Ditto.
>
> With that addressed:
>
> Reviewed-by: Borislav Petkov <bp@suse.de>
>
> All good stuff and a charm to review, lemme know if I should take them
> or you can carry them.

Sorry, I don't know what happened here because I didn't see these
comments until today.  I suspect what happened was that my Gmail
filter auto-filed them in my linux-kernel folder, which I don't read.
On my @google.com account, I have another filter that pull out things
addressed directly to me, which I *do* read.  But this thread didn't
cc that account until the tip-bot message, which *did* cc my @google
account because that's how I signed off the patches.  Sigh.

Anyway, it looks like this stuff is on its way; let me know
(bhelgaas@google.com) if I should do anything else.  I would address
your comments above, but since this seems to be applied and I saw a
cleanup patch from you, I assume you already took care of them.

Bjorn

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/3] resource: Fix find_next_iomem_res() iteration issue
  2018-09-27 14:22 ` [PATCH 3/3] resource: Fix find_next_iomem_res() iteration issue Bjorn Helgaas
@ 2018-09-28 16:41   ` Borislav Petkov
  2018-10-09 17:30     ` Bjorn Helgaas
  0 siblings, 1 reply; 24+ messages in thread
From: Borislav Petkov @ 2018-09-28 16:41 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-kernel, Lianbo Jiang, Vivek Goyal, kexec, tglx, mingo, hpa,
	x86, akpm, dan.j.williams, thomas.lendacky, baiyaowei, tiwai,
	brijesh.singh, dyoung, bhe

On Thu, Sep 27, 2018 at 09:22:09AM -0500, Bjorn Helgaas wrote:
> From: Bjorn Helgaas <bhelgaas@google.com>
> 
> Previously find_next_iomem_res() used "*res" as both an input parameter for
> the range to search and the type of resource to search for, and an output
> parameter for the resource we found, which makes the interface confusing.
> 
> The current callers use find_next_iomem_res() incorrectly because they
> allocate a single struct resource and use it for repeated calls to
> find_next_iomem_res().  When find_next_iomem_res() returns a resource, it
> overwrites the start, end, flags, and desc members of the struct.  If we
> call find_next_iomem_res() again, we must update or restore these fields.
> The previous code restored res.start and res.end, but not res.flags or
> res.desc.

... which is a sure sign that the design of this thing is not the best one.

> 
> Since the callers did not restore res.flags, if they searched for flags
> IORESOURCE_MEM | IORESOURCE_BUSY and found a resource with flags
> IORESOURCE_MEM | IORESOURCE_BUSY | IORESOURCE_SYSRAM, the next search would
> incorrectly skip resources unless they were also marked as
> IORESOURCE_SYSRAM.

Nice example!

> Fix this by restructuring the interface so it takes explicit "start, end,
> flags" parameters and uses "*res" only as an output parameter.
> 
> Original-patch: http://lore.kernel.org/lkml/20180921073211.20097-2-lijiang@redhat.com
> Based-on-patch-by: Lianbo Jiang <lijiang@redhat.com>
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> ---
>  kernel/resource.c |   94 +++++++++++++++++++++++------------------------------
>  1 file changed, 41 insertions(+), 53 deletions(-)
> 
> diff --git a/kernel/resource.c b/kernel/resource.c
> index 155ec873ea4d..9891ea90cc8f 100644
> --- a/kernel/resource.c
> +++ b/kernel/resource.c
> @@ -319,23 +319,26 @@ int release_resource(struct resource *old)
>  EXPORT_SYMBOL(release_resource);
>  
>  /*

I guess this could be made kernel-doc, since you're touching it...

> - * Finds the lowest iomem resource existing within [res->start..res->end].
> - * The caller must specify res->start, res->end, res->flags, and optionally
> - * desc.  If found, returns 0, res is overwritten, if not found, returns -1.
> - * This function walks the whole tree and not just first level children until
> - * and unless first_level_children_only is true.
> + * Finds the lowest iomem resource that covers part of [start..end].  The
> + * caller must specify start, end, flags, and desc (which may be
> + * IORES_DESC_NONE).
> + *
> + * If a resource is found, returns 0 and *res is overwritten with the part
> + * of the resource that's within [start..end]; if none is found, returns
> + * -1.
> + *
> + * This function walks the whole tree and not just first level children
> + * unless first_level_children_only is true.

... and then prepend that with '@' - @first_level_children_only to refer
to the function parameter.

>   */
> -static int find_next_iomem_res(struct resource *res, unsigned long desc,
> -			       bool first_level_children_only)
> +static int find_next_iomem_res(resource_size_t start, resource_size_t end,
> +			       unsigned long flags, unsigned long desc,
> +			       bool first_level_children_only,
> +			       struct resource *res)
>  {
> -	resource_size_t start, end;
>  	struct resource *p;
>  	bool sibling_only = false;
>  
>  	BUG_ON(!res);
> -
> -	start = res->start;
> -	end = res->end;
>  	BUG_ON(start >= end);

And since we're touching this, maybe replace that BUG_ON() fun with
simply return -EINVAL or some error code...

>  
>  	if (first_level_children_only)

        if (first_level_children_only)
                sibling_only = true;

So this is just silly - replacing a bool function param with a local bool
var.

You could kill that, shorten first_level_children_only's name and use it
directly.

Depending on how much cleanup it amounts to, you could make that a
separate cleanup patch ontop, to keep the changes from the cleanup
separate.

> @@ -344,7 +347,7 @@ static int find_next_iomem_res(struct resource *res, unsigned long desc,
>  	read_lock(&resource_lock);
>  
>  	for (p = iomem_resource.child; p; p = next_resource(p, sibling_only)) {
> -		if ((p->flags & res->flags) != res->flags)
> +		if ((p->flags & flags) != flags)
>  			continue;
>  		if ((desc != IORES_DESC_NONE) && (desc != p->desc))
>  			continue;
> @@ -359,32 +362,31 @@ static int find_next_iomem_res(struct resource *res, unsigned long desc,
>  	read_unlock(&resource_lock);
>  	if (!p)
>  		return -1;
> +
>  	/* copy data */
> -	if (res->start < p->start)
> -		res->start = p->start;
> -	if (res->end > p->end)
> -		res->end = p->end;
> +	res->start = max(start, p->start);
> +	res->end = min(end, p->end);

Should we use the min_t and max_t versions here for typechecking?

>  	res->flags = p->flags;
>  	res->desc = p->desc;
>  	return 0;
>  }
>  
> -static int __walk_iomem_res_desc(struct resource *res, unsigned long desc,
> -				 bool first_level_children_only,
> -				 void *arg,
> +static int __walk_iomem_res_desc(resource_size_t start, resource_size_t end,
> +				 unsigned long flags, unsigned long desc,
> +				 bool first_level_children_only, void *arg,
>  				 int (*func)(struct resource *, void *))
>  {
> -	u64 orig_end = res->end;
> +	struct resource res;
>  	int ret = -1;
>  
> -	while ((res->start < res->end) &&
> -	       !find_next_iomem_res(res, desc, first_level_children_only)) {
> -		ret = (*func)(res, arg);
> +	while (start < end &&
> +	       !find_next_iomem_res(start, end, flags, desc,
> +				    first_level_children_only, &res)) {
> +		ret = (*func)(&res, arg);
>  		if (ret)
>  			break;
>  
> -		res->start = res->end + 1;
> -		res->end = orig_end;
> +		start = res.end + 1;
>  	}
>  
>  	return ret;
> @@ -407,13 +409,7 @@ static int __walk_iomem_res_desc(struct resource *res, unsigned long desc,
>  int walk_iomem_res_desc(unsigned long desc, unsigned long flags, u64 start,
>  		u64 end, void *arg, int (*func)(struct resource *, void *))

Align that function's parameters on the opening brace, pls, while you're
at it.

>  {
> -	struct resource res;
> -
> -	res.start = start;
> -	res.end = end;
> -	res.flags = flags;
> -
> -	return __walk_iomem_res_desc(&res, desc, false, arg, func);
> +	return __walk_iomem_res_desc(start, end, flags, desc, false, arg, func);
>  }
>  EXPORT_SYMBOL_GPL(walk_iomem_res_desc);
>  
> @@ -427,13 +423,9 @@ EXPORT_SYMBOL_GPL(walk_iomem_res_desc);
>  int walk_system_ram_res(u64 start, u64 end, void *arg,
>  				int (*func)(struct resource *, void *))

Ditto.

>  {
> -	struct resource res;
> -
> -	res.start = start;
> -	res.end = end;
> -	res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
> +	unsigned long flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
>  
> -	return __walk_iomem_res_desc(&res, IORES_DESC_NONE, true,
> +	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, true,
>  				     arg, func);
>  }
>  
> @@ -444,13 +436,9 @@ int walk_system_ram_res(u64 start, u64 end, void *arg,
>  int walk_mem_res(u64 start, u64 end, void *arg,
>  		 int (*func)(struct resource *, void *))
>  {
> -	struct resource res;
> -
> -	res.start = start;
> -	res.end = end;
> -	res.flags = IORESOURCE_MEM | IORESOURCE_BUSY;
> +	unsigned long flags = IORESOURCE_MEM | IORESOURCE_BUSY;
>  
> -	return __walk_iomem_res_desc(&res, IORES_DESC_NONE, true,
> +	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, true,
>  				     arg, func);
>  }
>  
> @@ -464,25 +452,25 @@ int walk_mem_res(u64 start, u64 end, void *arg,
>  int walk_system_ram_range(unsigned long start_pfn, unsigned long nr_pages,
>  		void *arg, int (*func)(unsigned long, unsigned long, void *))

Ditto.

With that addressed:

Reviewed-by: Borislav Petkov <bp@suse.de>

All good stuff and a charm to review, lemme know if I should take them
or you can carry them.

Thanks!

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 3/3] resource: Fix find_next_iomem_res() iteration issue
  2018-09-27 14:21 [PATCH 0/3] find_next_iomem_res() fixes Bjorn Helgaas
@ 2018-09-27 14:22 ` Bjorn Helgaas
  2018-09-28 16:41   ` Borislav Petkov
  0 siblings, 1 reply; 24+ messages in thread
From: Bjorn Helgaas @ 2018-09-27 14:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lianbo Jiang, Vivek Goyal, kexec, tglx, mingo, hpa, x86, akpm,
	dan.j.williams, thomas.lendacky, baiyaowei, tiwai, bp,
	brijesh.singh, dyoung, bhe

From: Bjorn Helgaas <bhelgaas@google.com>

Previously find_next_iomem_res() used "*res" as both an input parameter for
the range to search and the type of resource to search for, and an output
parameter for the resource we found, which makes the interface confusing.

The current callers use find_next_iomem_res() incorrectly because they
allocate a single struct resource and use it for repeated calls to
find_next_iomem_res().  When find_next_iomem_res() returns a resource, it
overwrites the start, end, flags, and desc members of the struct.  If we
call find_next_iomem_res() again, we must update or restore these fields.
The previous code restored res.start and res.end, but not res.flags or
res.desc.

Since the callers did not restore res.flags, if they searched for flags
IORESOURCE_MEM | IORESOURCE_BUSY and found a resource with flags
IORESOURCE_MEM | IORESOURCE_BUSY | IORESOURCE_SYSRAM, the next search would
incorrectly skip resources unless they were also marked as
IORESOURCE_SYSRAM.

Fix this by restructuring the interface so it takes explicit "start, end,
flags" parameters and uses "*res" only as an output parameter.

Original-patch: http://lore.kernel.org/lkml/20180921073211.20097-2-lijiang@redhat.com
Based-on-patch-by: Lianbo Jiang <lijiang@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 kernel/resource.c |   94 +++++++++++++++++++++++------------------------------
 1 file changed, 41 insertions(+), 53 deletions(-)

diff --git a/kernel/resource.c b/kernel/resource.c
index 155ec873ea4d..9891ea90cc8f 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -319,23 +319,26 @@ int release_resource(struct resource *old)
 EXPORT_SYMBOL(release_resource);
 
 /*
- * Finds the lowest iomem resource existing within [res->start..res->end].
- * The caller must specify res->start, res->end, res->flags, and optionally
- * desc.  If found, returns 0, res is overwritten, if not found, returns -1.
- * This function walks the whole tree and not just first level children until
- * and unless first_level_children_only is true.
+ * Finds the lowest iomem resource that covers part of [start..end].  The
+ * caller must specify start, end, flags, and desc (which may be
+ * IORES_DESC_NONE).
+ *
+ * If a resource is found, returns 0 and *res is overwritten with the part
+ * of the resource that's within [start..end]; if none is found, returns
+ * -1.
+ *
+ * This function walks the whole tree and not just first level children
+ * unless first_level_children_only is true.
  */
-static int find_next_iomem_res(struct resource *res, unsigned long desc,
-			       bool first_level_children_only)
+static int find_next_iomem_res(resource_size_t start, resource_size_t end,
+			       unsigned long flags, unsigned long desc,
+			       bool first_level_children_only,
+			       struct resource *res)
 {
-	resource_size_t start, end;
 	struct resource *p;
 	bool sibling_only = false;
 
 	BUG_ON(!res);
-
-	start = res->start;
-	end = res->end;
 	BUG_ON(start >= end);
 
 	if (first_level_children_only)
@@ -344,7 +347,7 @@ static int find_next_iomem_res(struct resource *res, unsigned long desc,
 	read_lock(&resource_lock);
 
 	for (p = iomem_resource.child; p; p = next_resource(p, sibling_only)) {
-		if ((p->flags & res->flags) != res->flags)
+		if ((p->flags & flags) != flags)
 			continue;
 		if ((desc != IORES_DESC_NONE) && (desc != p->desc))
 			continue;
@@ -359,32 +362,31 @@ static int find_next_iomem_res(struct resource *res, unsigned long desc,
 	read_unlock(&resource_lock);
 	if (!p)
 		return -1;
+
 	/* copy data */
-	if (res->start < p->start)
-		res->start = p->start;
-	if (res->end > p->end)
-		res->end = p->end;
+	res->start = max(start, p->start);
+	res->end = min(end, p->end);
 	res->flags = p->flags;
 	res->desc = p->desc;
 	return 0;
 }
 
-static int __walk_iomem_res_desc(struct resource *res, unsigned long desc,
-				 bool first_level_children_only,
-				 void *arg,
+static int __walk_iomem_res_desc(resource_size_t start, resource_size_t end,
+				 unsigned long flags, unsigned long desc,
+				 bool first_level_children_only, void *arg,
 				 int (*func)(struct resource *, void *))
 {
-	u64 orig_end = res->end;
+	struct resource res;
 	int ret = -1;
 
-	while ((res->start < res->end) &&
-	       !find_next_iomem_res(res, desc, first_level_children_only)) {
-		ret = (*func)(res, arg);
+	while (start < end &&
+	       !find_next_iomem_res(start, end, flags, desc,
+				    first_level_children_only, &res)) {
+		ret = (*func)(&res, arg);
 		if (ret)
 			break;
 
-		res->start = res->end + 1;
-		res->end = orig_end;
+		start = res.end + 1;
 	}
 
 	return ret;
@@ -407,13 +409,7 @@ static int __walk_iomem_res_desc(struct resource *res, unsigned long desc,
 int walk_iomem_res_desc(unsigned long desc, unsigned long flags, u64 start,
 		u64 end, void *arg, int (*func)(struct resource *, void *))
 {
-	struct resource res;
-
-	res.start = start;
-	res.end = end;
-	res.flags = flags;
-
-	return __walk_iomem_res_desc(&res, desc, false, arg, func);
+	return __walk_iomem_res_desc(start, end, flags, desc, false, arg, func);
 }
 EXPORT_SYMBOL_GPL(walk_iomem_res_desc);
 
@@ -427,13 +423,9 @@ EXPORT_SYMBOL_GPL(walk_iomem_res_desc);
 int walk_system_ram_res(u64 start, u64 end, void *arg,
 				int (*func)(struct resource *, void *))
 {
-	struct resource res;
-
-	res.start = start;
-	res.end = end;
-	res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
+	unsigned long flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
 
-	return __walk_iomem_res_desc(&res, IORES_DESC_NONE, true,
+	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, true,
 				     arg, func);
 }
 
@@ -444,13 +436,9 @@ int walk_system_ram_res(u64 start, u64 end, void *arg,
 int walk_mem_res(u64 start, u64 end, void *arg,
 		 int (*func)(struct resource *, void *))
 {
-	struct resource res;
-
-	res.start = start;
-	res.end = end;
-	res.flags = IORESOURCE_MEM | IORESOURCE_BUSY;
+	unsigned long flags = IORESOURCE_MEM | IORESOURCE_BUSY;
 
-	return __walk_iomem_res_desc(&res, IORES_DESC_NONE, true,
+	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, true,
 				     arg, func);
 }
 
@@ -464,25 +452,25 @@ int walk_mem_res(u64 start, u64 end, void *arg,
 int walk_system_ram_range(unsigned long start_pfn, unsigned long nr_pages,
 		void *arg, int (*func)(unsigned long, unsigned long, void *))
 {
+	resource_size_t start, end;
+	unsigned long flags;
 	struct resource res;
 	unsigned long pfn, end_pfn;
-	u64 orig_end;
 	int ret = -1;
 
-	res.start = (u64) start_pfn << PAGE_SHIFT;
-	res.end = ((u64)(start_pfn + nr_pages) << PAGE_SHIFT) - 1;
-	res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
-	orig_end = res.end;
-	while ((res.start < res.end) &&
-		(find_next_iomem_res(&res, IORES_DESC_NONE, true) >= 0)) {
+	start = (u64) start_pfn << PAGE_SHIFT;
+	end = ((u64)(start_pfn + nr_pages) << PAGE_SHIFT) - 1;
+	flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
+	while (start < end &&
+	       !find_next_iomem_res(start, end, flags, IORES_DESC_NONE,
+				    true, &res)) {
 		pfn = (res.start + PAGE_SIZE - 1) >> PAGE_SHIFT;
 		end_pfn = (res.end + 1) >> PAGE_SHIFT;
 		if (end_pfn > pfn)
 			ret = (*func)(pfn, end_pfn - pfn, arg);
 		if (ret)
 			break;
-		res.start = res.end + 1;
-		res.end = orig_end;
+		start = res.end + 1;
 	}
 	return ret;
 }


^ permalink raw reply related	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2018-10-16  3:45 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-21  7:32 [PATCH 0/3 v3] add reserved e820 ranges to the kdump kernel e820 table Lianbo Jiang
2018-09-21  7:32 ` [PATCH 1/3 v3] resource: fix an error which walks through iomem resources Lianbo Jiang
2018-09-24 17:52   ` Bjorn Helgaas
2018-09-25  7:08     ` lijiang
2018-09-24 22:14   ` [PATCH 0/3] find_next_iomem_res() fixes Bjorn Helgaas
2018-09-24 22:14     ` [PATCH 1/3] x86/kexec: Correct KEXEC_BACKUP_SRC_END off-by-one error Bjorn Helgaas
2018-09-24 22:14     ` [PATCH 2/3] resource: Include resource end in walk_*() interfaces Bjorn Helgaas
2018-09-24 22:15     ` [PATCH 3/3] resource: Fix find_next_iomem_res() iteration issue Bjorn Helgaas
2018-09-25  8:58       ` Baoquan He
2018-09-25 11:20         ` Baoquan He
2018-09-27  5:27       ` lijiang
2018-09-27 14:03         ` Bjorn Helgaas
2018-09-28  5:09           ` lijiang
2018-09-28 13:10           ` Borislav Petkov
2018-09-26  9:22     ` [PATCH 0/3] find_next_iomem_res() fixes lijiang
2018-09-26 13:36       ` lijiang
2018-09-21  7:32 ` [PATCH 2/3 v3] x86/kexec_file: add e820 entry in case e820 type string matches to io resource name Lianbo Jiang
2018-09-21  7:32 ` [PATCH 3/3 v3] x86/kexec_file: add reserved e820 ranges to kdump kernel e820 table Lianbo Jiang
2018-10-16  2:56 ` [PATCH 0/3 v3] add reserved e820 ranges to the " Dave Young
2018-10-16  3:45   ` lijiang
2018-09-27 14:21 [PATCH 0/3] find_next_iomem_res() fixes Bjorn Helgaas
2018-09-27 14:22 ` [PATCH 3/3] resource: Fix find_next_iomem_res() iteration issue Bjorn Helgaas
2018-09-28 16:41   ` Borislav Petkov
2018-10-09 17:30     ` Bjorn Helgaas
2018-10-09 17:35       ` Borislav Petkov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).