linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3 v9] add reserved e820 ranges to the kdump kernel e820 table
@ 2019-03-21 10:33 Lianbo Jiang
  2019-03-21 10:33 ` [PATCH 1/3 v9] x86/mm: Change the examination condition to avoid confusion Lianbo Jiang
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Lianbo Jiang @ 2019-03-21 10:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: kexec, tglx, mingo, bp, akpm, dave.hansen, luto, peterz, x86,
	hpa, dyoung, bhe, Thomas.Lendacky

This patchset did three things:
a). Change the examination condition to avoid confusion

Following the commit <0e4c12b45aa8> ("x86/mm, resource: Use
PAGE_KERNEL protection for ioremap of memory pages"), here
it is really checking for the 'IORES_DESC_ACPI_*' values.
Therefore, it is necessary to change the examination condition
to avoid confusion.

b). add a new I/O resource descriptor 'IORES_DESC_RESERVED'

When doing kexec_file_load, the first kernel needs to pass the e820
reserved ranges to the second kernel. But kernel can not exactly
match the e820 reserved ranges when walking through the iomem resources
with the descriptor 'IORES_DESC_NONE', because several e820 types(
e.g. E820_TYPE_RESERVED_KERN/E820_TYPE_RAM/E820_TYPE_UNUSABLE/E820
_TYPE_RESERVED) are converted to the descriptor 'IORES_DESC_NONE'. It
may pass these four types to the kdump kernel, that is not desired result.

So, this patch adds a new I/O resource descriptor 'IORES_DESC_RESERVED'
for the iomem resources search interfaces. It is helpful to exactly
match the reserved resource ranges when walking through iomem resources.

In addition, since the new descriptor 'IORES_DESC_RESERVED' is introduced,
these code originally related to the descriptor 'IORES_DESC_NONE' need to
be updated. Otherwise, it will be easily confused and also cause some
errors. Because the 'E820_TYPE_RESERVED' type is converted to the new
descriptor 'IORES_DESC_RESERVED' instead of 'IORES_DESC_NONE', it has been
changed.

c). add the e820 reserved ranges to kdump kernel e820 table

At present, when use the kexec_file_load syscall to load the kernel image
and initramfs(for example: kexec -s -p xxx), kernel does not pass the e820
reserved ranges to the second kernel, which might cause two problems:

The first one is the MMCONFIG issue. The basic problem is that this device
is in PCI segment 1 and the kernel PCI probing can not find it without all
the e820 I/O reservations being present in the e820 table. And the kdump
kernel does not have those reservations because the kexec command does not
pass the I/O reservation via the "memmap=xxx" command line option. (This
problem does not show up for other vendors, as SGI is apparently the
actually fails for everyone, but devices in segment 0 are then found by
some legacy lookup method.) The workaround for this is to pass the I/O
reserved regions to the kdump kernel.

MMCONFIG(aka ECAM) space is described in the ACPI MCFG table. If you don't
have ECAM: (a) PCI devices won't work at all on non-x86 systems that use
only ECAM for config access, (b) you won't be albe to access devices on
non-0 segments, (c) you won't be able to access extended config space(
address 0x100-0xffff), which means none of the Extended Capabilities will
be available(AER, ACS, ATS, etc). [Bjorn's comment]

The second issue is that the SME kdump kernel doesn't work without the
e820 reserved ranges. When SME is active in kdump kernel, actually, those
reserved regions are still decrypted, but because those reserved ranges are
not present at all in kdump kernel e820 table, those reserved regions are
considered as encrypted, it goes wrong.

The e820 reserved range is useful in kdump kernel, so it is necessary to
pass the e820 reserved ranges to kdump kernel.

Changes since v1:
1. Modified the value of flags to "0", when walking through the whole
tree for e820 reserved ranges.

Changes since v2:
1. Modified the value of flags to "0", when walking through the whole
tree for e820 reserved ranges.
2. Modified the invalid SOB chain issue.

Changes since v3:
1. Dropped [PATCH 1/3 v3] resource: fix an error which walks through iomem
   resources. Please refer to this commit <010a93bf97c7> "resource: Fix
   find_next_iomem_res() iteration issue"

Changes since v4:
1. Improve the patch log, and add kernel log.

Changes since v5:
1. Rewrite these patches log.

Changes since v6:
1. Modify the [PATCH 1/2], and add the new I/O resource descriptor
   'IORES_DESC_RESERVED' for the iomem resources search interfaces,
   and also updates these codes relates to 'IORES_DESC_NONE'.
2. Modify the [PATCH 2/2], and walk through io resource based on the
   new descriptor 'IORES_DESC_RESERVED'.
3. Update patch log.

Changes since v7:
1. Improve patch log.
2. Improve this function __ioremap_check_desc_other().
3. Modify code comment in the __ioremap_check_desc_other()

Changes since v8:
1. Get rid of all changes about ia64.(Borislav's suggestion)
2. Change the examination condition to the 'IORES_DESC_ACPI_*'.
3. Modify the signature. This patch(add the new I/O resource
   descriptor 'IORES_DESC_RESERVED') was suggested by Boris.

Lianbo Jiang (3):
  x86/mm: Change the examination condition to avoid confusion
  resource: add the new I/O resource descriptor 'IORES_DESC_RESERVED'
  x86/kexec_file: add reserved e820 ranges to kdump kernel e820 table

 arch/x86/kernel/crash.c | 6 ++++++
 arch/x86/kernel/e820.c  | 2 +-
 arch/x86/mm/ioremap.c   | 3 ++-
 include/linux/ioport.h  | 1 +
 kernel/resource.c       | 6 +++---
 5 files changed, 13 insertions(+), 5 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/3 v9] x86/mm: Change the examination condition to avoid confusion
  2019-03-21 10:33 [PATCH 0/3 v9] add reserved e820 ranges to the kdump kernel e820 table Lianbo Jiang
@ 2019-03-21 10:33 ` Lianbo Jiang
  2019-03-22 17:51   ` Borislav Petkov
  2019-03-21 10:33 ` [PATCH 2/3 v9] resource: add the new I/O resource descriptor 'IORES_DESC_RESERVED' Lianbo Jiang
  2019-03-21 10:33 ` [PATCH 3/3 v9] x86/kexec_file: add reserved e820 ranges to kdump kernel e820 table Lianbo Jiang
  2 siblings, 1 reply; 13+ messages in thread
From: Lianbo Jiang @ 2019-03-21 10:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: kexec, tglx, mingo, bp, akpm, dave.hansen, luto, peterz, x86,
	hpa, dyoung, bhe, Thomas.Lendacky

Following the commit <0e4c12b45aa8> ("x86/mm, resource: Use
PAGE_KERNEL protection for ioremap of memory pages"), here
it is really checking for the 'IORES_DESC_ACPI_*' values.
Therefore, it is necessary to change the examination condition
to avoid confusion.

Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
---
 arch/x86/mm/ioremap.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 0029604af8a4..0e3ba620612d 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -83,7 +83,8 @@ static bool __ioremap_check_ram(struct resource *res)
 
 static int __ioremap_check_desc_other(struct resource *res)
 {
-	return (res->desc != IORES_DESC_NONE);
+	return ((res->desc == IORES_DESC_ACPI_TABLES) ||
+		(res->desc == IORES_DESC_ACPI_NV_STORAGE));
 }
 
 static int __ioremap_res_check(struct resource *res, void *arg)
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/3 v9] resource: add the new I/O resource descriptor 'IORES_DESC_RESERVED'
  2019-03-21 10:33 [PATCH 0/3 v9] add reserved e820 ranges to the kdump kernel e820 table Lianbo Jiang
  2019-03-21 10:33 ` [PATCH 1/3 v9] x86/mm: Change the examination condition to avoid confusion Lianbo Jiang
@ 2019-03-21 10:33 ` Lianbo Jiang
  2019-03-22 19:28   ` Borislav Petkov
  2019-03-21 10:33 ` [PATCH 3/3 v9] x86/kexec_file: add reserved e820 ranges to kdump kernel e820 table Lianbo Jiang
  2 siblings, 1 reply; 13+ messages in thread
From: Lianbo Jiang @ 2019-03-21 10:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: kexec, tglx, mingo, bp, akpm, dave.hansen, luto, peterz, x86,
	hpa, dyoung, bhe, Thomas.Lendacky

When doing kexec_file_load, the first kernel needs to pass the e820
reserved ranges to the second kernel. But kernel can not exactly
match the e820 reserved ranges when walking through the iomem resources
with the descriptor 'IORES_DESC_NONE', because several e820 types(
e.g. E820_TYPE_RESERVED_KERN/E820_TYPE_RAM/E820_TYPE_UNUSABLE/E820
_TYPE_RESERVED) are converted to the descriptor 'IORES_DESC_NONE'. It
may pass these four types to the kdump kernel, that is not desired result.

So, this patch adds a new I/O resource descriptor 'IORES_DESC_RESERVED'
for the iomem resources search interfaces. It is helpful to exactly
match the reserved resource ranges when walking through iomem resources.

In addition, since the new descriptor 'IORES_DESC_RESERVED' is introduced,
these code originally related to the descriptor 'IORES_DESC_NONE' need to
be updated. Otherwise, it will be easily confused and also cause some
errors. Because the 'E820_TYPE_RESERVED' type is converted to the new
descriptor 'IORES_DESC_RESERVED' instead of 'IORES_DESC_NONE', it has been
changed.

Suggested-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
---
 arch/x86/kernel/e820.c | 2 +-
 include/linux/ioport.h | 1 +
 kernel/resource.c      | 6 +++---
 3 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index 2879e234e193..16fcde196243 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -1050,10 +1050,10 @@ static unsigned long __init e820_type_to_iores_desc(struct e820_entry *entry)
 	case E820_TYPE_NVS:		return IORES_DESC_ACPI_NV_STORAGE;
 	case E820_TYPE_PMEM:		return IORES_DESC_PERSISTENT_MEMORY;
 	case E820_TYPE_PRAM:		return IORES_DESC_PERSISTENT_MEMORY_LEGACY;
+	case E820_TYPE_RESERVED:	return IORES_DESC_RESERVED;
 	case E820_TYPE_RESERVED_KERN:	/* Fall-through: */
 	case E820_TYPE_RAM:		/* Fall-through: */
 	case E820_TYPE_UNUSABLE:	/* Fall-through: */
-	case E820_TYPE_RESERVED:	/* Fall-through: */
 	default:			return IORES_DESC_NONE;
 	}
 }
diff --git a/include/linux/ioport.h b/include/linux/ioport.h
index da0ebaec25f0..6ed59de48bd5 100644
--- a/include/linux/ioport.h
+++ b/include/linux/ioport.h
@@ -133,6 +133,7 @@ enum {
 	IORES_DESC_PERSISTENT_MEMORY_LEGACY	= 5,
 	IORES_DESC_DEVICE_PRIVATE_MEMORY	= 6,
 	IORES_DESC_DEVICE_PUBLIC_MEMORY		= 7,
+	IORES_DESC_RESERVED			= 8,
 };
 
 /* helpers to define resources */
diff --git a/kernel/resource.c b/kernel/resource.c
index e81b17b53fa5..ee7348761858 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -990,7 +990,7 @@ __reserve_region_with_split(struct resource *root, resource_size_t start,
 	res->start = start;
 	res->end = end;
 	res->flags = type | IORESOURCE_BUSY;
-	res->desc = IORES_DESC_NONE;
+	res->desc = IORES_DESC_RESERVED;
 
 	while (1) {
 
@@ -1025,7 +1025,7 @@ __reserve_region_with_split(struct resource *root, resource_size_t start,
 				next_res->start = conflict->end + 1;
 				next_res->end = end;
 				next_res->flags = type | IORESOURCE_BUSY;
-				next_res->desc = IORES_DESC_NONE;
+				next_res->desc = IORES_DESC_RESERVED;
 			}
 		} else {
 			res->start = conflict->end + 1;
@@ -1488,7 +1488,7 @@ static int __init reserve_setup(char *str)
 			res->start = io_start;
 			res->end = io_start + io_num - 1;
 			res->flags |= IORESOURCE_BUSY;
-			res->desc = IORES_DESC_NONE;
+			res->desc = IORES_DESC_RESERVED;
 			res->child = NULL;
 			if (request_resource(parent, res) == 0)
 				reserved = x+1;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 3/3 v9] x86/kexec_file: add reserved e820 ranges to kdump kernel e820 table
  2019-03-21 10:33 [PATCH 0/3 v9] add reserved e820 ranges to the kdump kernel e820 table Lianbo Jiang
  2019-03-21 10:33 ` [PATCH 1/3 v9] x86/mm: Change the examination condition to avoid confusion Lianbo Jiang
  2019-03-21 10:33 ` [PATCH 2/3 v9] resource: add the new I/O resource descriptor 'IORES_DESC_RESERVED' Lianbo Jiang
@ 2019-03-21 10:33 ` Lianbo Jiang
  2 siblings, 0 replies; 13+ messages in thread
From: Lianbo Jiang @ 2019-03-21 10:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: kexec, tglx, mingo, bp, akpm, dave.hansen, luto, peterz, x86,
	hpa, dyoung, bhe, Thomas.Lendacky

At present, when use the kexec_file_load syscall to load the kernel image
and initramfs(for example: kexec -s -p xxx), kernel does not pass the e820
reserved ranges to the second kernel, which might cause two problems:

The first one is the MMCONFIG issue. The basic problem is that this device
is in PCI segment 1 and the kernel PCI probing can not find it without all
the e820 I/O reservations being present in the e820 table. And the kdump
kernel does not have those reservations because the kexec command does not
pass the I/O reservation via the "memmap=xxx" command line option. (This
problem does not show up for other vendors, as SGI is apparently the
actually fails for everyone, but devices in segment 0 are then found by
some legacy lookup method.) The workaround for this is to pass the I/O
reserved regions to the kdump kernel.

MMCONFIG(aka ECAM) space is described in the ACPI MCFG table. If you don't
have ECAM: (a) PCI devices won't work at all on non-x86 systems that use
only ECAM for config access, (b) you won't be albe to access devices on
non-0 segments, (c) you won't be able to access extended config space(
address 0x100-0xffff), which means none of the Extended Capabilities will
be available(AER, ACS, ATS, etc). [Bjorn's comment]

The second issue is that the SME kdump kernel doesn't work without the
e820 reserved ranges. When SME is active in kdump kernel, actually, those
reserved regions are still decrypted, but because those reserved ranges are
not present at all in kdump kernel e820 table, those reserved regions are
considered as encrypted, it goes wrong.

The e820 reserved range is useful in kdump kernel, so it is necessary to
pass the e820 reserved ranges to kdump kernel.

Suggested-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
---
 arch/x86/kernel/crash.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index 17ffc869cab8..1db2754df9e9 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -381,6 +381,12 @@ int crash_setup_memmap_entries(struct kimage *image, struct boot_params *params)
 	walk_iomem_res_desc(IORES_DESC_ACPI_NV_STORAGE, flags, 0, -1, &cmd,
 			memmap_entry_callback);
 
+	/* Add e820 reserved ranges */
+	cmd.type = E820_TYPE_RESERVED;
+	flags = IORESOURCE_MEM;
+	walk_iomem_res_desc(IORES_DESC_RESERVED, flags, 0, -1, &cmd,
+			   memmap_entry_callback);
+
 	/* Add crashk_low_res region */
 	if (crashk_low_res.end) {
 		ei.addr = crashk_low_res.start;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3 v9] x86/mm: Change the examination condition to avoid confusion
  2019-03-21 10:33 ` [PATCH 1/3 v9] x86/mm: Change the examination condition to avoid confusion Lianbo Jiang
@ 2019-03-22 17:51   ` Borislav Petkov
  2019-03-25  3:11     ` lijiang
  0 siblings, 1 reply; 13+ messages in thread
From: Borislav Petkov @ 2019-03-22 17:51 UTC (permalink / raw)
  To: Lianbo Jiang
  Cc: linux-kernel, kexec, tglx, mingo, akpm, dave.hansen, luto,
	peterz, x86, hpa, dyoung, bhe, Thomas.Lendacky

On Thu, Mar 21, 2019 at 06:33:07PM +0800, Lianbo Jiang wrote:
> Following the commit <0e4c12b45aa8> ("x86/mm, resource: Use
> PAGE_KERNEL protection for ioremap of memory pages"),

The proper commit quotation format is done by adding this to your
.gitconfig:

[core]
        abbrev = 12
[alias]
        one = show -s --pretty='format:%h (\"%s\")'

and then doing:

$ git one <commit id>

which will give you

0e4c12b45aa8 ("x86/mm, resource: Use PAGE_KERNEL protection for ioremap of memory pages")

> here it is really checking for the 'IORES_DESC_ACPI_*' values.

Well, it is not really checking that.

> Therefore, it is necessary to change the examination condition
> to avoid confusion.

What confusion?

The justification for that change sounds really fishy.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/3 v9] resource: add the new I/O resource descriptor 'IORES_DESC_RESERVED'
  2019-03-21 10:33 ` [PATCH 2/3 v9] resource: add the new I/O resource descriptor 'IORES_DESC_RESERVED' Lianbo Jiang
@ 2019-03-22 19:28   ` Borislav Petkov
  2019-03-25  6:53     ` lijiang
  0 siblings, 1 reply; 13+ messages in thread
From: Borislav Petkov @ 2019-03-22 19:28 UTC (permalink / raw)
  To: Lianbo Jiang
  Cc: linux-kernel, kexec, tglx, mingo, akpm, dave.hansen, luto,
	peterz, x86, hpa, dyoung, bhe, Thomas.Lendacky

On Thu, Mar 21, 2019 at 06:33:08PM +0800, Lianbo Jiang wrote:
> When doing kexec_file_load, the first kernel needs to pass the e820

Please end function names with parentheses.

> reserved ranges to the second kernel.

... because... ?

> But kernel can not exactly match the e820 reserved ranges
     ^
     the

> when walking through the iomem resources with the descriptor
> 'IORES_DESC_NONE', because several e820 types( e.g.
> E820_TYPE_RESERVED_KERN/E820_TYPE_RAM/E820_TYPE_UNUSABLE/E820
> _TYPE_RESERVED) are converted to the descriptor 'IORES_DESC_NONE'.
> It may pass these four types to the kdump kernel, that is not desired result.

Rewrite that sentence.

> So, this patch adds a new I/O resource descriptor 'IORES_DESC_RESERVED'

Avoid having "This patch" or "This commit" in the commit message. It is
tautologically useless.

Also, do

$ git grep 'This patch' Documentation/process

for more details.

> for the iomem resources search interfaces. It is helpful to exactly
> match the reserved resource ranges when walking through iomem resources.
> 
> In addition, since the new descriptor 'IORES_DESC_RESERVED' is introduced,
> these code originally related to the descriptor 'IORES_DESC_NONE' need to

"the code"

> be updated.

> Otherwise, it will be easily confused and also cause some errors.

What errors?

> Because the 'E820_TYPE_RESERVED' type is converted to the new
> descriptor 'IORES_DESC_RESERVED' instead of 'IORES_DESC_NONE', it has been
> changed.

That sentence I cannot parse.

> Suggested-by: Borislav Petkov <bp@suse.de>
> Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
> ---
>  arch/x86/kernel/e820.c | 2 +-
>  include/linux/ioport.h | 1 +
>  kernel/resource.c      | 6 +++---
>  3 files changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
> index 2879e234e193..16fcde196243 100644
> --- a/arch/x86/kernel/e820.c
> +++ b/arch/x86/kernel/e820.c
> @@ -1050,10 +1050,10 @@ static unsigned long __init e820_type_to_iores_desc(struct e820_entry *entry)
>  	case E820_TYPE_NVS:		return IORES_DESC_ACPI_NV_STORAGE;
>  	case E820_TYPE_PMEM:		return IORES_DESC_PERSISTENT_MEMORY;
>  	case E820_TYPE_PRAM:		return IORES_DESC_PERSISTENT_MEMORY_LEGACY;
> +	case E820_TYPE_RESERVED:	return IORES_DESC_RESERVED;
>  	case E820_TYPE_RESERVED_KERN:	/* Fall-through: */
>  	case E820_TYPE_RAM:		/* Fall-through: */
>  	case E820_TYPE_UNUSABLE:	/* Fall-through: */
> -	case E820_TYPE_RESERVED:	/* Fall-through: */
>  	default:			return IORES_DESC_NONE;
>  	}
>  }
> diff --git a/include/linux/ioport.h b/include/linux/ioport.h
> index da0ebaec25f0..6ed59de48bd5 100644
> --- a/include/linux/ioport.h
> +++ b/include/linux/ioport.h
> @@ -133,6 +133,7 @@ enum {
>  	IORES_DESC_PERSISTENT_MEMORY_LEGACY	= 5,
>  	IORES_DESC_DEVICE_PRIVATE_MEMORY	= 6,
>  	IORES_DESC_DEVICE_PUBLIC_MEMORY		= 7,
> +	IORES_DESC_RESERVED			= 8,
>  };
>  
>  /* helpers to define resources */

IORES_DESC_RESERVED is supposed to represent E820_TYPE_RESERVED. And if
that is the case, then all three hunks below look wrong to me. If you
want to pass E820_TYPE_RESERVED ranges, then do that explicitly.

> diff --git a/kernel/resource.c b/kernel/resource.c
> index e81b17b53fa5..ee7348761858 100644
> --- a/kernel/resource.c
> +++ b/kernel/resource.c
> @@ -990,7 +990,7 @@ __reserve_region_with_split(struct resource *root, resource_size_t start,
>  	res->start = start;
>  	res->end = end;
>  	res->flags = type | IORESOURCE_BUSY;
> -	res->desc = IORES_DESC_NONE;
> +	res->desc = IORES_DESC_RESERVED;
>  
>  	while (1) {
>  
> @@ -1025,7 +1025,7 @@ __reserve_region_with_split(struct resource *root, resource_size_t start,
>  				next_res->start = conflict->end + 1;
>  				next_res->end = end;
>  				next_res->flags = type | IORESOURCE_BUSY;
> -				next_res->desc = IORES_DESC_NONE;
> +				next_res->desc = IORES_DESC_RESERVED;
>  			}
>  		} else {
>  			res->start = conflict->end + 1;
> @@ -1488,7 +1488,7 @@ static int __init reserve_setup(char *str)
>  			res->start = io_start;
>  			res->end = io_start + io_num - 1;
>  			res->flags |= IORESOURCE_BUSY;
> -			res->desc = IORES_DESC_NONE;
> +			res->desc = IORES_DESC_RESERVED;
>  			res->child = NULL;
>  			if (request_resource(parent, res) == 0)
>  				reserved = x+1;
> -- 
> 2.17.1
> 

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3 v9] x86/mm: Change the examination condition to avoid confusion
  2019-03-22 17:51   ` Borislav Petkov
@ 2019-03-25  3:11     ` lijiang
  2019-03-25  6:40       ` Borislav Petkov
  0 siblings, 1 reply; 13+ messages in thread
From: lijiang @ 2019-03-25  3:11 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-kernel, kexec, tglx, mingo, akpm, dave.hansen, luto,
	peterz, x86, hpa, dyoung, bhe, Thomas.Lendacky

在 2019年03月23日 01:51, Borislav Petkov 写道:
> On Thu, Mar 21, 2019 at 06:33:07PM +0800, Lianbo Jiang wrote:
>> Following the commit <0e4c12b45aa8> ("x86/mm, resource: Use
>> PAGE_KERNEL protection for ioremap of memory pages"),
> 
> The proper commit quotation format is done by adding this to your
> .gitconfig:
> 
> [core]
>         abbrev = 12
> [alias]
>         one = show -s --pretty='format:%h (\"%s\")'
> 
> and then doing:
> 
> $ git one <commit id>
> 
> which will give you
> 
> 0e4c12b45aa8 ("x86/mm, resource: Use PAGE_KERNEL protection for ioremap of memory pages")

Nice. I added them to my .gitconfig. It works. Thank you very much.

> 
>> here it is really checking for the 'IORES_DESC_ACPI_*' values.
> 
> Well, it is not really checking that.

I mean it needs to find all the value of the 'IORES_DESC_ACPI_*' type.

> 
>> Therefore, it is necessary to change the examination condition
>> to avoid confusion.
> 
> What confusion?

As above mentioned, it needs to find all the value of the 'IORES_DESC_ACPI_*'
type, so we should explicitly use the 'IORES_DESC_ACPI_*' type as the check
condition instead of the 'IORES_DESC_NONE'.

Thanks.
Lianbo

> 
> The justification for that change sounds really fishy.
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3 v9] x86/mm: Change the examination condition to avoid confusion
  2019-03-25  3:11     ` lijiang
@ 2019-03-25  6:40       ` Borislav Petkov
  2019-03-25  9:20         ` lijiang
  0 siblings, 1 reply; 13+ messages in thread
From: Borislav Petkov @ 2019-03-25  6:40 UTC (permalink / raw)
  To: lijiang
  Cc: linux-kernel, kexec, tglx, mingo, akpm, dave.hansen, luto,
	peterz, x86, hpa, dyoung, bhe, Thomas.Lendacky

On Mon, Mar 25, 2019 at 11:11:45AM +0800, lijiang wrote:
> I mean it needs to find all the value of the 'IORES_DESC_ACPI_*' type.

A function called __ioremap_check_desc_other() needs to find
IORES_DESC_ACPI_* types...

No, still don't know what you're trying to do.

> As above mentioned, it needs to find all the value of the 'IORES_DESC_ACPI_*'
> type, so we should explicitly use the 'IORES_DESC_ACPI_*' type as the check
> condition instead of the 'IORES_DESC_NONE'.

And now the same question I'm asking you each time: WHY does it need to find
the ACPI types?

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/3 v9] resource: add the new I/O resource descriptor 'IORES_DESC_RESERVED'
  2019-03-22 19:28   ` Borislav Petkov
@ 2019-03-25  6:53     ` lijiang
  2019-03-25 12:24       ` Borislav Petkov
  0 siblings, 1 reply; 13+ messages in thread
From: lijiang @ 2019-03-25  6:53 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-kernel, kexec, tglx, mingo, akpm, dave.hansen, luto,
	peterz, x86, hpa, dyoung, bhe, Thomas.Lendacky

在 2019年03月23日 03:28, Borislav Petkov 写道:
> On Thu, Mar 21, 2019 at 06:33:08PM +0800, Lianbo Jiang wrote:
>> When doing kexec_file_load, the first kernel needs to pass the e820
> 
> Please end function names with parentheses.
> 
>> reserved ranges to the second kernel.
> 
> ... because... ?
> 
>> But kernel can not exactly match the e820 reserved ranges
>      ^
>      the
> 
>> when walking through the iomem resources with the descriptor
>> 'IORES_DESC_NONE', because several e820 types( e.g.
>> E820_TYPE_RESERVED_KERN/E820_TYPE_RAM/E820_TYPE_UNUSABLE/E820
>> _TYPE_RESERVED) are converted to the descriptor 'IORES_DESC_NONE'.
>> It may pass these four types to the kdump kernel, that is not desired result.
> 
> Rewrite that sentence.
> 
>> So, this patch adds a new I/O resource descriptor 'IORES_DESC_RESERVED'
> 
> Avoid having "This patch" or "This commit" in the commit message. It is
> tautologically useless.
> 
> Also, do
> 
> $ git grep 'This patch' Documentation/process
> 
> for more details.
> 
>> for the iomem resources search interfaces. It is helpful to exactly
>> match the reserved resource ranges when walking through iomem resources.
>>
>> In addition, since the new descriptor 'IORES_DESC_RESERVED' is introduced,
>> these code originally related to the descriptor 'IORES_DESC_NONE' need to
> 
> "the code"
> 
>> be updated.
> 
>> Otherwise, it will be easily confused and also cause some errors.
> 
> What errors?
> 
>> Because the 'E820_TYPE_RESERVED' type is converted to the new
>> descriptor 'IORES_DESC_RESERVED' instead of 'IORES_DESC_NONE', it has been
>> changed.
> 
> That sentence I cannot parse.

Thanks for your comment. I will improve the patch log next post.

> 
>> Suggested-by: Borislav Petkov <bp@suse.de>
>> Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
>> ---
>>  arch/x86/kernel/e820.c | 2 +-
>>  include/linux/ioport.h | 1 +
>>  kernel/resource.c      | 6 +++---
>>  3 files changed, 5 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
>> index 2879e234e193..16fcde196243 100644
>> --- a/arch/x86/kernel/e820.c
>> +++ b/arch/x86/kernel/e820.c
>> @@ -1050,10 +1050,10 @@ static unsigned long __init e820_type_to_iores_desc(struct e820_entry *entry)
>>  	case E820_TYPE_NVS:		return IORES_DESC_ACPI_NV_STORAGE;
>>  	case E820_TYPE_PMEM:		return IORES_DESC_PERSISTENT_MEMORY;
>>  	case E820_TYPE_PRAM:		return IORES_DESC_PERSISTENT_MEMORY_LEGACY;
>> +	case E820_TYPE_RESERVED:	return IORES_DESC_RESERVED;
>>  	case E820_TYPE_RESERVED_KERN:	/* Fall-through: */
>>  	case E820_TYPE_RAM:		/* Fall-through: */
>>  	case E820_TYPE_UNUSABLE:	/* Fall-through: */
>> -	case E820_TYPE_RESERVED:	/* Fall-through: */
>>  	default:			return IORES_DESC_NONE;
>>  	}
>>  }
>> diff --git a/include/linux/ioport.h b/include/linux/ioport.h
>> index da0ebaec25f0..6ed59de48bd5 100644
>> --- a/include/linux/ioport.h
>> +++ b/include/linux/ioport.h
>> @@ -133,6 +133,7 @@ enum {
>>  	IORES_DESC_PERSISTENT_MEMORY_LEGACY	= 5,
>>  	IORES_DESC_DEVICE_PRIVATE_MEMORY	= 6,
>>  	IORES_DESC_DEVICE_PUBLIC_MEMORY		= 7,
>> +	IORES_DESC_RESERVED			= 8,
>>  };
>>  
>>  /* helpers to define resources */
> 
> IORES_DESC_RESERVED is supposed to represent E820_TYPE_RESERVED. And if> that is the case, then all three hunks below look wrong to me. If you
> want to pass E820_TYPE_RESERVED ranges, then do that explicitly.

In this function, i printed its values, and only got the value of reserved
type, so i changed the IORES_DESC_NONE to the IORES_DESC_RESERVED.

In addition, after the new descriptor 'IORES_DESC_RESERVED' is introduced,
the IORES_DESC_NONE does not include the IORES_DESC_RESERVED any more, it
could miss to handle the value of the reserved type.

Do you mean i should never touch the three chunks? If i made a mistake, i
will remove this changes next post.

Thanks.
Lianbo

> 
>> diff --git a/kernel/resource.c b/kernel/resource.c
>> index e81b17b53fa5..ee7348761858 100644
>> --- a/kernel/resource.c
>> +++ b/kernel/resource.c
>> @@ -990,7 +990,7 @@ __reserve_region_with_split(struct resource *root, resource_size_t start,
>>  	res->start = start;
>>  	res->end = end;
>>  	res->flags = type | IORESOURCE_BUSY;
>> -	res->desc = IORES_DESC_NONE;
>> +	res->desc = IORES_DESC_RESERVED;
>>  
>>  	while (1) {
>>  
>> @@ -1025,7 +1025,7 @@ __reserve_region_with_split(struct resource *root, resource_size_t start,
>>  				next_res->start = conflict->end + 1;
>>  				next_res->end = end;
>>  				next_res->flags = type | IORESOURCE_BUSY;
>> -				next_res->desc = IORES_DESC_NONE;
>> +				next_res->desc = IORES_DESC_RESERVED;
>>  			}
>>  		} else {
>>  			res->start = conflict->end + 1;
>> @@ -1488,7 +1488,7 @@ static int __init reserve_setup(char *str)
>>  			res->start = io_start;
>>  			res->end = io_start + io_num - 1;
>>  			res->flags |= IORESOURCE_BUSY;
>> -			res->desc = IORES_DESC_NONE;
>> +			res->desc = IORES_DESC_RESERVED;
>>  			res->child = NULL;
>>  			if (request_resource(parent, res) == 0)
>>  				reserved = x+1;
>> -- 
>> 2.17.1
>>
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3 v9] x86/mm: Change the examination condition to avoid confusion
  2019-03-25  6:40       ` Borislav Petkov
@ 2019-03-25  9:20         ` lijiang
  2019-03-25 12:15           ` Borislav Petkov
  0 siblings, 1 reply; 13+ messages in thread
From: lijiang @ 2019-03-25  9:20 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-kernel, kexec, tglx, mingo, akpm, dave.hansen, luto,
	peterz, x86, hpa, dyoung, bhe, Thomas.Lendacky

在 2019年03月25日 14:40, Borislav Petkov 写道:
> On Mon, Mar 25, 2019 at 11:11:45AM +0800, lijiang wrote:
>> I mean it needs to find all the value of the 'IORES_DESC_ACPI_*' type.
> 
> A function called __ioremap_check_desc_other() needs to find
> IORES_DESC_ACPI_* types...
> 
> No, still don't know what you're trying to do.

Let's look at the discussion in patch v8, please refer to this link:
https://lkml.org/lkml/2019/3/16/15

I did a test according to Tom's reply, and the test indicated his suggestion was
correct, we should change this to check for IORES_DESC_ACPI_* values.

> 
>> As above mentioned, it needs to find all the value of the 'IORES_DESC_ACPI_*'
>> type, so we should explicitly use the 'IORES_DESC_ACPI_*' type as the check
>> condition instead of the 'IORES_DESC_NONE'.
> 
> And now the same question I'm asking you each time: WHY does it need to find
> the ACPI types?
> 

When SEV is enabled and the page being mapped is in memory, need to ensure the
memory encryption attribute is also enabled in the resulting mapping.

I believe Tom knows better than me. :-)

Thanks.
Lianbo

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3 v9] x86/mm: Change the examination condition to avoid confusion
  2019-03-25  9:20         ` lijiang
@ 2019-03-25 12:15           ` Borislav Petkov
  0 siblings, 0 replies; 13+ messages in thread
From: Borislav Petkov @ 2019-03-25 12:15 UTC (permalink / raw)
  To: lijiang
  Cc: linux-kernel, kexec, tglx, mingo, akpm, dave.hansen, luto,
	peterz, x86, hpa, dyoung, bhe, Thomas.Lendacky

On Mon, Mar 25, 2019 at 05:20:43PM +0800, lijiang wrote:
> Let's look at the discussion in patch v8, please refer to this link:
> https://lkml.org/lkml/2019/3/16/15
> 
> I did a test according to Tom's reply, and the test indicated his suggestion was
> correct, we should change this to check for IORES_DESC_ACPI_* values.

I know that discussion - I was on CC.

Your patch still doesn't explain *why* this change is needed and the
fact that you "did a test" and it worked doesn't answer my question a
single bit. In fact, it tells me that you have tried something at random
without even understanding why this is needed and makes me even more
suspicious towards what you're doing.

So slow down pls *think* why this change is needed and then *explain*
that in the commit message.

Thx.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/3 v9] resource: add the new I/O resource descriptor 'IORES_DESC_RESERVED'
  2019-03-25  6:53     ` lijiang
@ 2019-03-25 12:24       ` Borislav Petkov
  2019-03-28 14:00         ` lijiang
  0 siblings, 1 reply; 13+ messages in thread
From: Borislav Petkov @ 2019-03-25 12:24 UTC (permalink / raw)
  To: lijiang
  Cc: linux-kernel, kexec, tglx, mingo, akpm, dave.hansen, luto,
	peterz, x86, hpa, dyoung, bhe, Thomas.Lendacky

On Mon, Mar 25, 2019 at 02:53:02PM +0800, lijiang wrote:
> In this function, i printed its values, and only got the value of reserved
> type, so i changed the IORES_DESC_NONE to the IORES_DESC_RESERVED.
> 
> In addition, after the new descriptor 'IORES_DESC_RESERVED' is introduced,
> the IORES_DESC_NONE does not include the IORES_DESC_RESERVED any more, it
> could miss to handle the value of the reserved type.

Yes, IORES_DESC_RESERVED is supposed to denote the e820 reserved type.
Why should IORES_DESC_NONE include it ?!?!

IORES_DESC_NONE is, well, an invalid, i.e., "none" type:

/*
 * I/O Resource Descriptors
 *
 * Descriptors are used by walk_iomem_res_desc() and region_intersects()
 * for searching a specific resource range in the iomem table.  Assign
 * a new descriptor when a resource range supports the search interfaces.
 * Otherwise, resource.desc must be set to IORES_DESC_NONE (0).
 */

> Do you mean i should never touch the three chunks? If i made a mistake, i
> will remove this changes next post.

I'm looking at the hunks below and you're changing ->desc assignments in
some random function which doesn't look like you know what you're doing.
Maybe it gets you what you want but it sure as hell doesn't look right
to me.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/3 v9] resource: add the new I/O resource descriptor 'IORES_DESC_RESERVED'
  2019-03-25 12:24       ` Borislav Petkov
@ 2019-03-28 14:00         ` lijiang
  0 siblings, 0 replies; 13+ messages in thread
From: lijiang @ 2019-03-28 14:00 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-kernel, kexec, tglx, mingo, akpm, dave.hansen, luto,
	peterz, x86, hpa, dyoung, bhe, Thomas.Lendacky

在 2019年03月25日 20:24, Borislav Petkov 写道:
> On Mon, Mar 25, 2019 at 02:53:02PM +0800, lijiang wrote:
>> In this function, i printed its values, and only got the value of reserved
>> type, so i changed the IORES_DESC_NONE to the IORES_DESC_RESERVED.
>>
>> In addition, after the new descriptor 'IORES_DESC_RESERVED' is introduced,
>> the IORES_DESC_NONE does not include the IORES_DESC_RESERVED any more, it
>> could miss to handle the value of the reserved type.
> 
> Yes, IORES_DESC_RESERVED is supposed to denote the e820 reserved type.
> Why should IORES_DESC_NONE include it ?!?!
> 
> IORES_DESC_NONE is, well, an invalid, i.e., "none" type:

Yes, i see. That indicates an empty area, or "void" type.

> 
> /*
>  * I/O Resource Descriptors
>  *
>  * Descriptors are used by walk_iomem_res_desc() and region_intersects()
>  * for searching a specific resource range in the iomem table.  Assign
>  * a new descriptor when a resource range supports the search interfaces.
>  * Otherwise, resource.desc must be set to IORES_DESC_NONE (0).
>  */
> 
>> Do you mean i should never touch the three chunks? If i made a mistake, i
>> will remove this changes next post.
> 
> I'm looking at the hunks below and you're changing ->desc assignments in
> some random function which doesn't look like you know what you're doing.
> Maybe it gets you what you want but it sure as hell doesn't look right
> to me.
> 
I have realized that there could be a problem with this changes. Indeed, here
it denotes an empty area instead of a reserved area.

BTW: Looks like the name of this function(__reserve_region_with_split) is a bit
misleading.

Thank you for pointing out this problem, i will correct them next post.

Lianbo

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2019-03-28 14:00 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-21 10:33 [PATCH 0/3 v9] add reserved e820 ranges to the kdump kernel e820 table Lianbo Jiang
2019-03-21 10:33 ` [PATCH 1/3 v9] x86/mm: Change the examination condition to avoid confusion Lianbo Jiang
2019-03-22 17:51   ` Borislav Petkov
2019-03-25  3:11     ` lijiang
2019-03-25  6:40       ` Borislav Petkov
2019-03-25  9:20         ` lijiang
2019-03-25 12:15           ` Borislav Petkov
2019-03-21 10:33 ` [PATCH 2/3 v9] resource: add the new I/O resource descriptor 'IORES_DESC_RESERVED' Lianbo Jiang
2019-03-22 19:28   ` Borislav Petkov
2019-03-25  6:53     ` lijiang
2019-03-25 12:24       ` Borislav Petkov
2019-03-28 14:00         ` lijiang
2019-03-21 10:33 ` [PATCH 3/3 v9] x86/kexec_file: add reserved e820 ranges to kdump kernel e820 table Lianbo Jiang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).