* [bug report] system panic at nfit_get_smbios_id+0x6e/0xf0 [nfit] during boot
@ 2021-05-01  2:27 Yi Zhang
  2021-05-01  6:05 ` Dan Williams
  0 siblings, 1 reply; 8+ messages in thread
From: Yi Zhang @ 2021-05-01  2:27 UTC (permalink / raw)
  To: linux-nvdimm

Hi

With the latest Linux tree, my DCPMM server failed to boot with the
panic log below. Please help check it, and let me know if you need any
testing for it.

[   15.882889] BUG: unable to handle page fault for address: ffffffffffffffa8
[   15.889761] #PF: supervisor read access in kernel mode
[   15.894900] #PF: error_code(0x0000) - not-present page
[   15.900039] PGD fc2813067 P4D fc2813067 PUD fc2815067 PMD 0
[   15.905697] Oops: 0000 [#1] SMP NOPTI
[   15.909364] CPU: 22 PID: 1024 Comm: systemd-udevd Tainted: G          I       5.12.0+ #1
[   15.917448] Hardware name: Dell Inc. PowerEdge R640/06NR82, BIOS 2.10.0 11/12/2020
[   15.925013] RIP: 0010:nfit_get_smbios_id+0x6e/0xf0 [nfit]
[   15.930413] Code: b1 f3 49 8b 84 24 c0 00 00 00 49 8d 8c 24 c0 00 00 00 48 8d 50 a0 48 39 c1 75 0f eb 49 48 8b 42 60 48 8d 50 a0 48 39 c1 74 3c <48> 8b 42 08 48 85 c0 75 04 48 8b 42 10 39 58 04 75 e1 0f b7 50 2c
[   15.949160] RSP: 0018:ffff9c28c284bb10 EFLAGS: 00010286
[   15.954383] RAX: 0000000000000000 RBX: 0000000000000020 RCX: ffff897b832d8cd8
[   15.961507] RDX: ffffffffffffffa0 RSI: ffff9c28c284bb46 RDI: ffff897b832d8c98
[   15.968631] RBP: ffff9c28c284bb46 R08: ffffffffc08c982c R09: ffff9c28c284bb6c
[   15.975763] R10: 0000000000000058 R11: ffff897b4bfb0aee R12: ffff897b832d8c18
[   15.982888] R13: ffff897b832d8c98 R14: ffff897b4bfb0038 R15: ffff897b4bfb1800
[   15.990021] FS:  00007fa1960ab180(0000) GS:ffff898a7ff80000(0000) knlGS:0000000000000000
[   15.998107] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   16.003854] CR2: ffffffffffffffa8 CR3: 00000001086ac004 CR4: 00000000007706e0
[   16.010984] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   16.018119] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   16.025249] PKRU: 55555554
[   16.027954] Call Trace:
[   16.030407]  skx_get_nvdimm_info+0x56/0x130 [skx_edac]
[   16.035546]  skx_get_dimm_config+0x1f5/0x213 [skx_edac]
[   16.040770]  skx_register_mci+0x132/0x1c0 [skx_edac]
[   16.045737]  ? skx_show_retry_rd_err_log+0x190/0x190 [skx_edac]
[   16.051657]  skx_init+0x344/0xe87 [skx_edac]
[   16.055930]  ? skx_adxl_get+0x179/0x179 [skx_edac]
[   16.060722]  do_one_initcall+0x41/0x1d0
[   16.064560]  ? __cond_resched+0x15/0x30
[   16.068399]  ? kmem_cache_alloc_trace+0x3d/0x420
[   16.073019]  do_init_module+0x5a/0x240
[   16.076771]  load_module+0x1b5f/0x1c40
[   16.080525]  ? __kernel_read+0x14a/0x2c0
[   16.084450]  ? __do_sys_finit_module+0xad/0x110
[   16.088982]  __do_sys_finit_module+0xad/0x110
[   16.093343]  do_syscall_64+0x39/0x80
[   16.096921]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[   16.101975] RIP: 0033:0x7fa194c8852d
[   16.105554] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 2b 79 2c 00 f7 d8 64 89 01 48
[   16.124298] RSP: 002b:00007fff5daec098 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[   16.131864] RAX: ffffffffffffffda RBX: 000055ccd2937dc0 RCX: 00007fa194c8852d
[   16.138996] RDX: 0000000000000000 RSI: 00007fa1957fc86d RDI: 000000000000001c
[   16.146129] RBP: 00007fa1957fc86d R08: 0000000000000000 R09: 00007fff5daec1c0
[   16.153260] R10: 000000000000001c R11: 0000000000000246 R12: 0000000000000000
[   16.160392] R13: 000055ccd2838c30 R14: 0000000000020000 R15: 0000000000000000
[   16.167527] Modules linked in: skx_edac(+) x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel ipmi_ssif mgag200 i2c_algo_bit kvm drm_kms_helper iTCO_wdt iTCO_vendor_support syscopyarea sysfillrect sysimgblt fb_sys_fops drm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel acpi_ipmi rapl ipmi_si mei_me intel_cstate intel_uncore mei i2c_i801 pcspkr wmi_bmof ipmi_devintf intel_pch_thermal i2c_smbus lpc_ich ipmi_msghandler acpi_power_meter ip_tables xfs libcrc32c sd_mod t10_pi sg ahci libahci libata tg3 megaraid_sas crc32c_intel nfit wmi libnvdimm dm_mirror dm_region_hash dm_log dm_mod
[   16.220349] CR2: ffffffffffffffa8
[   16.223674] ---[ end trace 3e1fbf6e28c10643 ]---
[   16.231424] RIP: 0010:nfit_get_smbios_id+0x6e/0xf0 [nfit]
[   16.236822] Code: b1 f3 49 8b 84 24 c0 00 00 00 49 8d 8c 24 c0 00 00 00 48 8d 50 a0 48 39 c1 75 0f eb 49 48 8b 42 60 48 8d 50 a0 48 39 c1 74 3c <48> 8b 42 08 48 85 c0 75 04 48 8b 42 10 39 58 04 75 e1 0f b7 50 2c
[   16.255568] RSP: 0018:ffff9c28c284bb10 EFLAGS: 00010286
[   16.260794] RAX: 0000000000000000 RBX: 0000000000000020 RCX: ffff897b832d8cd8
[   16.267925] RDX: ffffffffffffffa0 RSI: ffff9c28c284bb46 RDI: ffff897b832d8c98
[   16.275057] RBP: ffff9c28c284bb46 R08: ffffffffc08c982c R09: ffff9c28c284bb6c
[   16.282189] R10: 0000000000000058 R11: ffff897b4bfb0aee R12: ffff897b832d8c18
[   16.289313] R13: ffff897b832d8c98 R14: ffff897b4bfb0038 R15: ffff897b4bfb1800
[   16.296440] FS:  00007fa1960ab180(0000) GS:ffff898a7ff80000(0000) knlGS:0000000000000000
[   16.304525] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   16.310271] CR2: ffffffffffffffa8 CR3: 00000001086ac004 CR4: 00000000007706e0
[   16.317401] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   16.324526] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   16.331660] PKRU: 55555554
[   16.334370] Kernel panic - not syncing: Fatal exception
[   16.755349] Kernel Offset: 0x32600000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[   16.769170] ---[ end Kernel panic - not syncing: Fatal exception ]---
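
A note on the fault address, as a rough reading of the register dump rather than a confirmed diagnosis: the "Code:" bytes look like a list_for_each_entry()-style walk in which a 'next' pointer of 0 (RAX) is converted to its containing entry by subtracting 0x60 (giving RDX) and then read at offset 8 (giving CR2). The user-space C sketch below only redoes that arithmetic. The macro and function names are hypothetical, and which structure fields sit at those offsets is an assumption not stated in the report.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Offsets taken from the "Code:" bytes of the oops:
 *   49 8b 84 24 c0 00 00 00   mov  0xc0(%r12),%rax   ; list head's 'next'
 *   48 8d 50 a0               lea  -0x60(%rax),%rdx  ; node -> containing entry
 *   48 8b 42 08               mov  0x8(%rdx),%rax    ; faulting load
 * Which structure fields these offsets belong to is not stated in the
 * report, so the names below are placeholders.
 */
#define NODE_OFFSET  UINT64_C(0x60)
#define FIELD_OFFSET UINT64_C(0x08)

/* container_of()-style step, done on integers so wrap-around is defined. */
static inline uint64_t entry_from_node(uint64_t node)
{
	return node - NODE_OFFSET;
}

/* Address touched when the first field of that entry is read. */
static inline uint64_t faulting_address(uint64_t node)
{
	return entry_from_node(node) + FIELD_OFFSET;
}

/* Cross-check against the register dump: RAX = 0, RDX and CR2 as logged. */
static inline void check_against_oops(void)
{
	assert(entry_from_node(0) == UINT64_C(0xffffffffffffffa0));  /* RDX */
	assert(faulting_address(0) == UINT64_C(0xffffffffffffffa8)); /* CR2 */
}
```

If that reading is right, the list head at 0xc0(%r12) held a NULL 'next' pointer when it was walked, which would be consistent with a list that was never populated or initialized.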
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org


* Re: [bug report] system panic at nfit_get_smbios_id+0x6e/0xf0 [nfit] during boot
  2021-05-01  2:27 [bug report] system panic at nfit_get_smbios_id+0x6e/0xf0 [nfit] during boot Yi Zhang
@ 2021-05-01  6:05 ` Dan Williams
  2021-05-06  3:05   ` Yi Zhang
  0 siblings, 1 reply; 8+ messages in thread
From: Dan Williams @ 2021-05-01  6:05 UTC (permalink / raw)
  To: Yi Zhang; +Cc: linux-nvdimm

On Fri, Apr 30, 2021 at 7:28 PM Yi Zhang <yi.zhang@redhat.com> wrote:
>
> Hi
>
> With the latest Linux tree, my DCPMM server failed to boot with the
> panic log below. Please help check it, and let me know if you need any
> testing for it.

So v5.12 is ok but v5.12+ is not?

Might you be able to bisect?

If not can you send the nfit.gz from this command:

acpidump -n NFIT | gzip -c > nfit.gz

Also can you send the full dmesg? I don't suppose you see a message of
this format before this failure:

                        dev_err(acpi_desc->dev, "SPA %d missing DCR %d\n",
                                        spa->range_index, dcr);

* Re: [bug report] system panic at nfit_get_smbios_id+0x6e/0xf0 [nfit] during boot
  2021-05-01  6:05 ` Dan Williams
@ 2021-05-06  3:05   ` Yi Zhang
  2021-05-06 17:27     ` Kaneda, Erik
  0 siblings, 1 reply; 8+ messages in thread
From: Yi Zhang @ 2021-05-06  3:05 UTC (permalink / raw)
  To: Dan Williams, robert.moore; +Cc: linux-nvdimm, erik.kaneda, rafael.j.wysocki

On Sat, May 1, 2021 at 2:05 PM Dan Williams <dan.j.williams@intel.com> wrote:
>
> On Fri, Apr 30, 2021 at 7:28 PM Yi Zhang <yi.zhang@redhat.com> wrote:
> >
> > Hi
> >
> > With the latest Linux tree, my DCPMM server failed to boot with the
> > panic log below. Please help check it, and let me know if you need any
> > testing for it.
>
> So v5.12 is ok but v5.12+ is not?
>
> Might you be able to bisect?

Hi Dan
This issue was introduced with this patch, let me know if you need more info.

commit cf16b05c607bd716a0a5726dc8d577a89fdc1777
Author: Bob Moore <robert.moore@intel.com>
Date:   Tue Apr 6 14:30:15 2021 -0700

    ACPICA: ACPI 6.4: NFIT: add Location Cookie field

    Also, update struct size to reflect these changes in nfit core driver.

    ACPICA commit af60199a9a1de9e6844929fd4cc22334522ed195

    Link: https://github.com/acpica/acpica/commit/af60199a
    Cc: Dan Williams <dan.j.williams@intel.com>
    Signed-off-by: Bob Moore <robert.moore@intel.com>
    Signed-off-by: Erik Kaneda <erik.kaneda@intel.com>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

>
> If not can you send the nfit.gz from this command:
>
> acpidump -n NFIT | gzip -c > nfit.gz
>
> Also can you send the full dmesg? I don't suppose you see a message of
> this format before this failure:
>
>                         dev_err(acpi_desc->dev, "SPA %d missing DCR %d\n",
>                                         spa->range_index, dcr);
>

* RE: [bug report] system panic at nfit_get_smbios_id+0x6e/0xf0 [nfit] during boot
  2021-05-06  3:05   ` Yi Zhang
@ 2021-05-06 17:27     ` Kaneda, Erik
  2021-05-06 21:17         ` Dan Williams
  0 siblings, 1 reply; 8+ messages in thread
From: Kaneda, Erik @ 2021-05-06 17:27 UTC (permalink / raw)
  To: Yi Zhang, Williams, Dan J, Moore, Robert; +Cc: linux-nvdimm, Wysocki, Rafael J



> -----Original Message-----
> From: Yi Zhang <yi.zhang@redhat.com>
> Sent: Wednesday, May 5, 2021 8:05 PM
> To: Williams, Dan J <dan.j.williams@intel.com>; Moore, Robert
> <robert.moore@intel.com>
> Cc: linux-nvdimm <linux-nvdimm@lists.01.org>; Kaneda, Erik
> <erik.kaneda@intel.com>; Wysocki, Rafael J <rafael.j.wysocki@intel.com>
> Subject: Re: [bug report] system panic at nfit_get_smbios_id+0x6e/0xf0
> [nfit] during boot
> 
> On Sat, May 1, 2021 at 2:05 PM Dan Williams <dan.j.williams@intel.com>
> wrote:
> >
> > On Fri, Apr 30, 2021 at 7:28 PM Yi Zhang <yi.zhang@redhat.com> wrote:
> > >
> > > Hi
> > >
> > > With the latest Linux tree, my DCPMM server failed to boot with the
> > > panic log below. Please help check it, and let me know if you need any
> > > testing for it.
> >
> > So v5.12 is ok but v5.12+ is not?
> >
> > Might you be able to bisect?
> 
> Hi Dan
> This issue was introduced with this patch, let me know if you need more info.
> 
> commit cf16b05c607bd716a0a5726dc8d577a89fdc1777
> Author: Bob Moore <robert.moore@intel.com>
> Date:   Tue Apr 6 14:30:15 2021 -0700
> 
>     ACPICA: ACPI 6.4: NFIT: add Location Cookie field
> 
>     Also, update struct size to reflect these changes in nfit core driver.
> 
>     ACPICA commit af60199a9a1de9e6844929fd4cc22334522ed195
> 
>     Link: https://github.com/acpica/acpica/commit/af60199a
>     Cc: Dan Williams <dan.j.williams@intel.com>
>     Signed-off-by: Bob Moore <robert.moore@intel.com>
>     Signed-off-by: Erik Kaneda <erik.kaneda@intel.com>
>     Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 

It's likely that this change makes the nfit driver parse the ACPI table as though the location cookie field were always present, so the NFIT was parsed incorrectly. Does the NFIT table on this platform contain a valid cookie field?
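
To illustrate the suspected mismatch, here is a minimal user-space sketch. It assumes the SPA Range Structure is 56 bytes before ACPI 6.4 and 64 bytes with the trailing 8-byte cookie. The struct below is an illustrative mirror with flattened, hypothetical field names, not the ACPICA definition, and strict_length_ok() is a made-up helper standing in for a "length == sizeof(struct)" check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative mirror of the SPA Range Structure; not the ACPICA header. */
struct spa_range {
	uint16_t type;
	uint16_t length;          /* size of this entry as reported by firmware */
	uint16_t range_index;
	uint16_t flags;
	uint32_t reserved;
	uint32_t proximity_domain;
	uint8_t  range_guid[16];
	uint64_t address;
	uint64_t range_length;
	uint64_t memory_mapping;
	uint64_t location_cookie; /* field added by ACPI 6.4 */
};

/* The grown definition is 64 bytes; everything before the cookie is 56. */
static_assert(sizeof(struct spa_range) == 64, "ACPI 6.4 layout");
static_assert(offsetof(struct spa_range, location_cookie) == 56,
	      "size without the cookie");

/* A fixed-size check as it behaves once the struct definition has grown. */
static inline bool strict_length_ok(uint16_t reported_length)
{
	return reported_length == sizeof(struct spa_range);
}
```

Under those assumptions, firmware that reports a 56-byte entry fails strict_length_ok(), so every such SPA entry would be rejected even though the table itself is well formed.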

> >
> > If not can you send the nfit.gz from this command:
> >
> > acpidump -n NFIT | gzip -c > nfit.gz
> >
> > Also can you send the full dmesg? I don't suppose you see a message of
> > this format before this failure:
> >
> >                         dev_err(acpi_desc->dev, "SPA %d missing DCR %d\n",
> >                                         spa->range_index, dcr);
> >


* Re: [bug report] system panic at nfit_get_smbios_id+0x6e/0xf0 [nfit] during boot
  2021-05-06 17:27     ` Kaneda, Erik
@ 2021-05-06 21:17         ` Dan Williams
  0 siblings, 0 replies; 8+ messages in thread
From: Dan Williams @ 2021-05-06 21:17 UTC (permalink / raw)
  To: Kaneda, Erik
  Cc: Yi Zhang, Moore, Robert, linux-nvdimm, Wysocki, Rafael J, nvdimm

[-- Attachment #1: Type: text/plain, Size: 2214 bytes --]

On Thu, May 6, 2021 at 10:28 AM Kaneda, Erik <erik.kaneda@intel.com> wrote:
>
>
>
> > -----Original Message-----
> > From: Yi Zhang <yi.zhang@redhat.com>
> > Sent: Wednesday, May 5, 2021 8:05 PM
> > To: Williams, Dan J <dan.j.williams@intel.com>; Moore, Robert
> > <robert.moore@intel.com>
> > Cc: linux-nvdimm <linux-nvdimm@lists.01.org>; Kaneda, Erik
> > <erik.kaneda@intel.com>; Wysocki, Rafael J <rafael.j.wysocki@intel.com>
> > Subject: Re: [bug report] system panic at nfit_get_smbios_id+0x6e/0xf0
> > [nfit] during boot
> >
> > On Sat, May 1, 2021 at 2:05 PM Dan Williams <dan.j.williams@intel.com>
> > wrote:
> > >
> > > On Fri, Apr 30, 2021 at 7:28 PM Yi Zhang <yi.zhang@redhat.com> wrote:
> > > >
> > > > Hi
> > > >
> > > > With the latest Linux tree, my DCPMM server failed to boot with the
> > > > panic log below. Please help check it, and let me know if you need any
> > > > testing for it.
> > >
> > > So v5.12 is ok but v5.12+ is not?
> > >
> > > Might you be able to bisect?
> >
> > Hi Dan
> > This issue was introduced with this patch, let me know if you need more info.
> >
> > commit cf16b05c607bd716a0a5726dc8d577a89fdc1777
> > Author: Bob Moore <robert.moore@intel.com>
> > Date:   Tue Apr 6 14:30:15 2021 -0700
> >
> >     ACPICA: ACPI 6.4: NFIT: add Location Cookie field
> >
> >     Also, update struct size to reflect these changes in nfit core driver.
> >
> >     ACPICA commit af60199a9a1de9e6844929fd4cc22334522ed195
> >
> >     Link: https://github.com/acpica/acpica/commit/af60199a
> >     Cc: Dan Williams <dan.j.williams@intel.com>
> >     Signed-off-by: Bob Moore <robert.moore@intel.com>
> >     Signed-off-by: Erik Kaneda <erik.kaneda@intel.com>
> >     Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
>
> It's likely that this change makes the nfit driver parse the ACPI table as though the location cookie field were always present, so the NFIT was parsed incorrectly. Does the NFIT table on this platform contain a valid cookie field?
>

This was my fault. When I saw the size change fly by, I should have
remembered to go update all the places that do "sizeof(struct
acpi_nfit_system_address)".
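
As a rough user-space sketch of what those call sites need instead of a bare sizeof(): pick the expected size from the entry's flags. The names here are hypothetical, the 64-byte figure and the bit position of ACPI_NFIT_LOCATION_COOKIE_VALID (bit 2) are assumptions that should be checked against the ACPICA headers, and the code is not a substitute for the driver change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit position of ACPI_NFIT_LOCATION_COOKIE_VALID; verify in ACPICA. */
#define COOKIE_VALID_FLAG (UINT16_C(1) << 2)

/* Assumed sizes: 64 bytes with the 8-byte cookie, 56 bytes without it. */
#define SPA_SIZE_WITH_COOKIE    ((size_t)64)
#define SPA_SIZE_WITHOUT_COOKIE (SPA_SIZE_WITH_COOKIE - 8)

static_assert(SPA_SIZE_WITHOUT_COOKIE == 56, "size without the cookie");

/* Expected entry size depends on whether firmware says the cookie is there. */
static inline size_t expected_spa_size(uint16_t flags)
{
	return (flags & COOKIE_VALID_FLAG) ? SPA_SIZE_WITH_COOKIE
					   : SPA_SIZE_WITHOUT_COOKIE;
}

/* Validate the firmware-reported length against the flag-derived size. */
static inline bool spa_length_ok(uint16_t flags, uint16_t reported_length)
{
	return reported_length == expected_spa_size(flags);
}
```

With this shape, a 56-byte entry without the flag and a 64-byte entry with the flag both pass, while a length that disagrees with the flag is still rejected.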

Yi Zhang, can you give the attached patch a try:

[-- Attachment #2: nfit-fix.patch --]
[-- Type: text/x-patch, Size: 9735 bytes --]

ACPI: NFIT: Fix support for variable 'SPA' structure size

From: Dan Williams <dan.j.williams@intel.com>

ACPI 6.4 introduced the "SpaLocationCookie" to the NFIT "System Physical
Address (SPA) Range Structure". The presence of that new field is
indicated by the ACPI_NFIT_LOCATION_COOKIE_VALID flag. Pre-ACPI-6.4
firmware implementations omit the flag and maintain the original size of
the structure.

Update the implementation to check that flag to determine the size
rather than the ACPI 6.4 compliant definition of 'struct
acpi_nfit_system_address' from the Linux ACPICA definitions.

Update the test infrastructure for the new expectations as well, i.e.
continue to emulate the ACPI 6.3 definition of that structure.

Without this fix the kernel fails to validate 'SPA' structures and this
leads to a crash in nfit_get_smbios_id() since that routine assumes that
SPAs are valid if it finds valid SMBIOS tables.

    BUG: unable to handle page fault for address: ffffffffffffffa8
    [..]
    Call Trace:
     skx_get_nvdimm_info+0x56/0x130 [skx_edac]
     skx_get_dimm_config+0x1f5/0x213 [skx_edac]
     skx_register_mci+0x132/0x1c0 [skx_edac]

Reported-by: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/acpi/nfit/core.c         |   17 +++++++++++----
 tools/testing/nvdimm/test/nfit.c |   42 +++++++++++++++++++++++---------------
 2 files changed, 37 insertions(+), 22 deletions(-)

diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
index 958aaac869e8..bfecb79e8c82 100644
--- a/drivers/acpi/nfit/core.c
+++ b/drivers/acpi/nfit/core.c
@@ -686,29 +686,36 @@ int nfit_spa_type(struct acpi_nfit_system_address *spa)
 	return -1;
 }
 
-static bool add_spa(struct acpi_nfit_desc *acpi_desc,
+static size_t sizeof_spa(struct acpi_nfit_system_address *spa)
+{
+	if (spa->flags & ACPI_NFIT_LOCATION_COOKIE_VALID)
+		return sizeof(*spa);
+	return sizeof(*spa) - 8;
+}
+
+static bool noinline add_spa(struct acpi_nfit_desc *acpi_desc,
 		struct nfit_table_prev *prev,
 		struct acpi_nfit_system_address *spa)
 {
 	struct device *dev = acpi_desc->dev;
 	struct nfit_spa *nfit_spa;
 
-	if (spa->header.length != sizeof(*spa))
+	if (spa->header.length != sizeof_spa(spa))
 		return false;
 
 	list_for_each_entry(nfit_spa, &prev->spas, list) {
-		if (memcmp(nfit_spa->spa, spa, sizeof(*spa)) == 0) {
+		if (memcmp(nfit_spa->spa, spa, sizeof_spa(spa)) == 0) {
 			list_move_tail(&nfit_spa->list, &acpi_desc->spas);
 			return true;
 		}
 	}
 
-	nfit_spa = devm_kzalloc(dev, sizeof(*nfit_spa) + sizeof(*spa),
+	nfit_spa = devm_kzalloc(dev, sizeof(*nfit_spa) + sizeof_spa(spa),
 			GFP_KERNEL);
 	if (!nfit_spa)
 		return false;
 	INIT_LIST_HEAD(&nfit_spa->list);
-	memcpy(nfit_spa->spa, spa, sizeof(*spa));
+	memcpy(nfit_spa->spa, spa, sizeof_spa(spa));
 	list_add_tail(&nfit_spa->list, &acpi_desc->spas);
 	dev_dbg(dev, "spa index: %d type: %s\n",
 			spa->range_index,
diff --git a/tools/testing/nvdimm/test/nfit.c b/tools/testing/nvdimm/test/nfit.c
index 9b185bf82da8..54f367cbadae 100644
--- a/tools/testing/nvdimm/test/nfit.c
+++ b/tools/testing/nvdimm/test/nfit.c
@@ -1871,9 +1871,16 @@ static void smart_init(struct nfit_test *t)
 	}
 }
 
+static size_t sizeof_spa(struct acpi_nfit_system_address *spa)
+{
+	/* until spa location cookie support is added... */
+	return sizeof(*spa) - 8;
+}
+
 static int nfit_test0_alloc(struct nfit_test *t)
 {
-	size_t nfit_size = sizeof(struct acpi_nfit_system_address) * NUM_SPA
+	struct acpi_nfit_system_address *spa = NULL;
+	size_t nfit_size = sizeof_spa(spa) * NUM_SPA
 			+ sizeof(struct acpi_nfit_memory_map) * NUM_MEM
 			+ sizeof(struct acpi_nfit_control_region) * NUM_DCR
 			+ offsetof(struct acpi_nfit_control_region,
@@ -1937,7 +1944,8 @@ static int nfit_test0_alloc(struct nfit_test *t)
 
 static int nfit_test1_alloc(struct nfit_test *t)
 {
-	size_t nfit_size = sizeof(struct acpi_nfit_system_address) * 2
+	struct acpi_nfit_system_address *spa = NULL;
+	size_t nfit_size = sizeof_spa(spa) * 2
 		+ sizeof(struct acpi_nfit_memory_map) * 2
 		+ offsetof(struct acpi_nfit_control_region, window_size) * 2;
 	int i;
@@ -2000,7 +2008,7 @@ static void nfit_test0_setup(struct nfit_test *t)
 	 */
 	spa = nfit_buf;
 	spa->header.type = ACPI_NFIT_TYPE_SYSTEM_ADDRESS;
-	spa->header.length = sizeof(*spa);
+	spa->header.length = sizeof_spa(spa);
 	memcpy(spa->range_guid, to_nfit_uuid(NFIT_SPA_PM), 16);
 	spa->range_index = 0+1;
 	spa->address = t->spa_set_dma[0];
@@ -2014,7 +2022,7 @@ static void nfit_test0_setup(struct nfit_test *t)
 	 */
 	spa = nfit_buf + offset;
 	spa->header.type = ACPI_NFIT_TYPE_SYSTEM_ADDRESS;
-	spa->header.length = sizeof(*spa);
+	spa->header.length = sizeof_spa(spa);
 	memcpy(spa->range_guid, to_nfit_uuid(NFIT_SPA_PM), 16);
 	spa->range_index = 1+1;
 	spa->address = t->spa_set_dma[1];
@@ -2024,7 +2032,7 @@ static void nfit_test0_setup(struct nfit_test *t)
 	/* spa2 (dcr0) dimm0 */
 	spa = nfit_buf + offset;
 	spa->header.type = ACPI_NFIT_TYPE_SYSTEM_ADDRESS;
-	spa->header.length = sizeof(*spa);
+	spa->header.length = sizeof_spa(spa);
 	memcpy(spa->range_guid, to_nfit_uuid(NFIT_SPA_DCR), 16);
 	spa->range_index = 2+1;
 	spa->address = t->dcr_dma[0];
@@ -2034,7 +2042,7 @@ static void nfit_test0_setup(struct nfit_test *t)
 	/* spa3 (dcr1) dimm1 */
 	spa = nfit_buf + offset;
 	spa->header.type = ACPI_NFIT_TYPE_SYSTEM_ADDRESS;
-	spa->header.length = sizeof(*spa);
+	spa->header.length = sizeof_spa(spa);
 	memcpy(spa->range_guid, to_nfit_uuid(NFIT_SPA_DCR), 16);
 	spa->range_index = 3+1;
 	spa->address = t->dcr_dma[1];
@@ -2044,7 +2052,7 @@ static void nfit_test0_setup(struct nfit_test *t)
 	/* spa4 (dcr2) dimm2 */
 	spa = nfit_buf + offset;
 	spa->header.type = ACPI_NFIT_TYPE_SYSTEM_ADDRESS;
-	spa->header.length = sizeof(*spa);
+	spa->header.length = sizeof_spa(spa);
 	memcpy(spa->range_guid, to_nfit_uuid(NFIT_SPA_DCR), 16);
 	spa->range_index = 4+1;
 	spa->address = t->dcr_dma[2];
@@ -2054,7 +2062,7 @@ static void nfit_test0_setup(struct nfit_test *t)
 	/* spa5 (dcr3) dimm3 */
 	spa = nfit_buf + offset;
 	spa->header.type = ACPI_NFIT_TYPE_SYSTEM_ADDRESS;
-	spa->header.length = sizeof(*spa);
+	spa->header.length = sizeof_spa(spa);
 	memcpy(spa->range_guid, to_nfit_uuid(NFIT_SPA_DCR), 16);
 	spa->range_index = 5+1;
 	spa->address = t->dcr_dma[3];
@@ -2064,7 +2072,7 @@ static void nfit_test0_setup(struct nfit_test *t)
 	/* spa6 (bdw for dcr0) dimm0 */
 	spa = nfit_buf + offset;
 	spa->header.type = ACPI_NFIT_TYPE_SYSTEM_ADDRESS;
-	spa->header.length = sizeof(*spa);
+	spa->header.length = sizeof_spa(spa);
 	memcpy(spa->range_guid, to_nfit_uuid(NFIT_SPA_BDW), 16);
 	spa->range_index = 6+1;
 	spa->address = t->dimm_dma[0];
@@ -2074,7 +2082,7 @@ static void nfit_test0_setup(struct nfit_test *t)
 	/* spa7 (bdw for dcr1) dimm1 */
 	spa = nfit_buf + offset;
 	spa->header.type = ACPI_NFIT_TYPE_SYSTEM_ADDRESS;
-	spa->header.length = sizeof(*spa);
+	spa->header.length = sizeof_spa(spa);
 	memcpy(spa->range_guid, to_nfit_uuid(NFIT_SPA_BDW), 16);
 	spa->range_index = 7+1;
 	spa->address = t->dimm_dma[1];
@@ -2084,7 +2092,7 @@ static void nfit_test0_setup(struct nfit_test *t)
 	/* spa8 (bdw for dcr2) dimm2 */
 	spa = nfit_buf + offset;
 	spa->header.type = ACPI_NFIT_TYPE_SYSTEM_ADDRESS;
-	spa->header.length = sizeof(*spa);
+	spa->header.length = sizeof_spa(spa);
 	memcpy(spa->range_guid, to_nfit_uuid(NFIT_SPA_BDW), 16);
 	spa->range_index = 8+1;
 	spa->address = t->dimm_dma[2];
@@ -2094,7 +2102,7 @@ static void nfit_test0_setup(struct nfit_test *t)
 	/* spa9 (bdw for dcr3) dimm3 */
 	spa = nfit_buf + offset;
 	spa->header.type = ACPI_NFIT_TYPE_SYSTEM_ADDRESS;
-	spa->header.length = sizeof(*spa);
+	spa->header.length = sizeof_spa(spa);
 	memcpy(spa->range_guid, to_nfit_uuid(NFIT_SPA_BDW), 16);
 	spa->range_index = 9+1;
 	spa->address = t->dimm_dma[3];
@@ -2581,7 +2589,7 @@ static void nfit_test0_setup(struct nfit_test *t)
 		/* spa10 (dcr4) dimm4 */
 		spa = nfit_buf + offset;
 		spa->header.type = ACPI_NFIT_TYPE_SYSTEM_ADDRESS;
-		spa->header.length = sizeof(*spa);
+		spa->header.length = sizeof_spa(spa);
 		memcpy(spa->range_guid, to_nfit_uuid(NFIT_SPA_DCR), 16);
 		spa->range_index = 10+1;
 		spa->address = t->dcr_dma[4];
@@ -2595,7 +2603,7 @@ static void nfit_test0_setup(struct nfit_test *t)
 		 */
 		spa = nfit_buf + offset;
 		spa->header.type = ACPI_NFIT_TYPE_SYSTEM_ADDRESS;
-		spa->header.length = sizeof(*spa);
+		spa->header.length = sizeof_spa(spa);
 		memcpy(spa->range_guid, to_nfit_uuid(NFIT_SPA_PM), 16);
 		spa->range_index = 11+1;
 		spa->address = t->spa_set_dma[2];
@@ -2605,7 +2613,7 @@ static void nfit_test0_setup(struct nfit_test *t)
 		/* spa12 (bdw for dcr4) dimm4 */
 		spa = nfit_buf + offset;
 		spa->header.type = ACPI_NFIT_TYPE_SYSTEM_ADDRESS;
-		spa->header.length = sizeof(*spa);
+		spa->header.length = sizeof_spa(spa);
 		memcpy(spa->range_guid, to_nfit_uuid(NFIT_SPA_BDW), 16);
 		spa->range_index = 12+1;
 		spa->address = t->dimm_dma[4];
@@ -2739,7 +2747,7 @@ static void nfit_test1_setup(struct nfit_test *t)
 	/* spa0 (flat range with no bdw aliasing) */
 	spa = nfit_buf + offset;
 	spa->header.type = ACPI_NFIT_TYPE_SYSTEM_ADDRESS;
-	spa->header.length = sizeof(*spa);
+	spa->header.length = sizeof_spa(spa);
 	memcpy(spa->range_guid, to_nfit_uuid(NFIT_SPA_PM), 16);
 	spa->range_index = 0+1;
 	spa->address = t->spa_set_dma[0];
@@ -2749,7 +2757,7 @@ static void nfit_test1_setup(struct nfit_test *t)
 	/* virtual cd region */
 	spa = nfit_buf + offset;
 	spa->header.type = ACPI_NFIT_TYPE_SYSTEM_ADDRESS;
-	spa->header.length = sizeof(*spa);
+	spa->header.length = sizeof_spa(spa);
 	memcpy(spa->range_guid, to_nfit_uuid(NFIT_SPA_VCD), 16);
 	spa->range_index = 0;
 	spa->address = t->spa_set_dma[1];

@@ -2074,7 +2082,7 @@ static void nfit_test0_setup(struct nfit_test *t)
 	/* spa7 (bdw for dcr1) dimm1 */
 	spa = nfit_buf + offset;
 	spa->header.type = ACPI_NFIT_TYPE_SYSTEM_ADDRESS;
-	spa->header.length = sizeof(*spa);
+	spa->header.length = sizeof_spa(spa);
 	memcpy(spa->range_guid, to_nfit_uuid(NFIT_SPA_BDW), 16);
 	spa->range_index = 7+1;
 	spa->address = t->dimm_dma[1];
@@ -2084,7 +2092,7 @@ static void nfit_test0_setup(struct nfit_test *t)
 	/* spa8 (bdw for dcr2) dimm2 */
 	spa = nfit_buf + offset;
 	spa->header.type = ACPI_NFIT_TYPE_SYSTEM_ADDRESS;
-	spa->header.length = sizeof(*spa);
+	spa->header.length = sizeof_spa(spa);
 	memcpy(spa->range_guid, to_nfit_uuid(NFIT_SPA_BDW), 16);
 	spa->range_index = 8+1;
 	spa->address = t->dimm_dma[2];
@@ -2094,7 +2102,7 @@ static void nfit_test0_setup(struct nfit_test *t)
 	/* spa9 (bdw for dcr3) dimm3 */
 	spa = nfit_buf + offset;
 	spa->header.type = ACPI_NFIT_TYPE_SYSTEM_ADDRESS;
-	spa->header.length = sizeof(*spa);
+	spa->header.length = sizeof_spa(spa);
 	memcpy(spa->range_guid, to_nfit_uuid(NFIT_SPA_BDW), 16);
 	spa->range_index = 9+1;
 	spa->address = t->dimm_dma[3];
@@ -2581,7 +2589,7 @@ static void nfit_test0_setup(struct nfit_test *t)
 		/* spa10 (dcr4) dimm4 */
 		spa = nfit_buf + offset;
 		spa->header.type = ACPI_NFIT_TYPE_SYSTEM_ADDRESS;
-		spa->header.length = sizeof(*spa);
+		spa->header.length = sizeof_spa(spa);
 		memcpy(spa->range_guid, to_nfit_uuid(NFIT_SPA_DCR), 16);
 		spa->range_index = 10+1;
 		spa->address = t->dcr_dma[4];
@@ -2595,7 +2603,7 @@ static void nfit_test0_setup(struct nfit_test *t)
 		 */
 		spa = nfit_buf + offset;
 		spa->header.type = ACPI_NFIT_TYPE_SYSTEM_ADDRESS;
-		spa->header.length = sizeof(*spa);
+		spa->header.length = sizeof_spa(spa);
 		memcpy(spa->range_guid, to_nfit_uuid(NFIT_SPA_PM), 16);
 		spa->range_index = 11+1;
 		spa->address = t->spa_set_dma[2];
@@ -2605,7 +2613,7 @@ static void nfit_test0_setup(struct nfit_test *t)
 		/* spa12 (bdw for dcr4) dimm4 */
 		spa = nfit_buf + offset;
 		spa->header.type = ACPI_NFIT_TYPE_SYSTEM_ADDRESS;
-		spa->header.length = sizeof(*spa);
+		spa->header.length = sizeof_spa(spa);
 		memcpy(spa->range_guid, to_nfit_uuid(NFIT_SPA_BDW), 16);
 		spa->range_index = 12+1;
 		spa->address = t->dimm_dma[4];
@@ -2739,7 +2747,7 @@ static void nfit_test1_setup(struct nfit_test *t)
 	/* spa0 (flat range with no bdw aliasing) */
 	spa = nfit_buf + offset;
 	spa->header.type = ACPI_NFIT_TYPE_SYSTEM_ADDRESS;
-	spa->header.length = sizeof(*spa);
+	spa->header.length = sizeof_spa(spa);
 	memcpy(spa->range_guid, to_nfit_uuid(NFIT_SPA_PM), 16);
 	spa->range_index = 0+1;
 	spa->address = t->spa_set_dma[0];
@@ -2749,7 +2757,7 @@ static void nfit_test1_setup(struct nfit_test *t)
 	/* virtual cd region */
 	spa = nfit_buf + offset;
 	spa->header.type = ACPI_NFIT_TYPE_SYSTEM_ADDRESS;
-	spa->header.length = sizeof(*spa);
+	spa->header.length = sizeof_spa(spa);
 	memcpy(spa->range_guid, to_nfit_uuid(NFIT_SPA_VCD), 16);
 	spa->range_index = 0;
 	spa->address = t->spa_set_dma[1];
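
A note on the idiom used in the test changes above: sizeof_spa() can be handed a NULL pointer because sizeof only inspects the type of its operand and never evaluates it. The following is a minimal standalone sketch of that idea, not kernel code. struct demo_spa is a hypothetical stand-in whose field layout is assumed to mirror struct acpi_nfit_system_address with a trailing 8-byte location cookie, and the demo_* helpers are illustrative names only.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct acpi_nfit_system_address (assumed layout). */
struct demo_spa {
	uint16_t type;             /* header.type */
	uint16_t length;           /* header.length */
	uint16_t range_index;
	uint16_t flags;
	uint32_t reserved;
	uint32_t proximity_domain;
	uint8_t  range_guid[16];
	uint64_t address;
	uint64_t range_length;
	uint64_t memory_mapping;
	uint64_t location_cookie;  /* trailing field added for ACPI 6.4 */
};

/* Size of an entry that does not carry the location cookie. */
static size_t demo_sizeof_spa(const struct demo_spa *spa)
{
	/* sizeof does not evaluate *spa, so a NULL pointer is never dereferenced */
	return sizeof(*spa) - sizeof(spa->location_cookie);
}

/* Bytes needed for `count` cookie-less entries, as the test allocators compute. */
static size_t demo_table_bytes(size_t count)
{
	const struct demo_spa *spa = NULL;

	return demo_sizeof_spa(spa) * count;
}
```

With this assumed layout, demo_table_bytes(2) comes to 112 bytes rather than the 128 that sizeof(struct demo_spa) * 2 would give.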


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [bug report] system panic at nfit_get_smbios_id+0x6e/0xf0 [nfit] during boot
  2021-05-06 21:17         ` Dan Williams
@ 2021-05-07  1:41           ` Yi Zhang
  -1 siblings, 0 replies; 8+ messages in thread
From: Yi Zhang @ 2021-05-07  1:41 UTC (permalink / raw)
  To: Dan Williams
  Cc: Kaneda, Erik, Moore, Robert, linux-nvdimm, Wysocki, Rafael J, nvdimm

On Fri, May 7, 2021 at 5:17 AM Dan Williams <dan.j.williams@intel.com> wrote:
>
> On Thu, May 6, 2021 at 10:28 AM Kaneda, Erik <erik.kaneda@intel.com> wrote:
> >
> >
> >
> > > -----Original Message-----
> > > From: Yi Zhang <yi.zhang@redhat.com>
> > > Sent: Wednesday, May 5, 2021 8:05 PM
> > > To: Williams, Dan J <dan.j.williams@intel.com>; Moore, Robert
> > > <robert.moore@intel.com>
> > > Cc: linux-nvdimm <linux-nvdimm@lists.01.org>; Kaneda, Erik
> > > <erik.kaneda@intel.com>; Wysocki, Rafael J <rafael.j.wysocki@intel.com>
> > > Subject: Re: [bug report] system panic at nfit_get_smbios_id+0x6e/0xf0
> > > [nfit] during boot
> > >
> > > On Sat, May 1, 2021 at 2:05 PM Dan Williams <dan.j.williams@intel.com>
> > > wrote:
> > > >
> > > > On Fri, Apr 30, 2021 at 7:28 PM Yi Zhang <yi.zhang@redhat.com> wrote:
> > > > >
> > > > > Hi
> > > > >
> > > > > > With the latest Linux tree, my DCPMM server failed to boot with the
> > > > > > panic log below. Please help check it, and let me know if you need
> > > > > > any testing from me.
> > > >
> > > > So v5.12 is ok but v5.12+ is not?
> > > >
> > > > Might you be able to bisect?
> > >
> > > Hi Dan
> > > This issue was introduced by the patch below; let me know if you need more info.
> > >
> > > commit cf16b05c607bd716a0a5726dc8d577a89fdc1777
> > > Author: Bob Moore <robert.moore@intel.com>
> > > Date:   Tue Apr 6 14:30:15 2021 -0700
> > >
> > >     ACPICA: ACPI 6.4: NFIT: add Location Cookie field
> > >
> > >     Also, update struct size to reflect these changes in nfit core driver.
> > >
> > >     ACPICA commit af60199a9a1de9e6844929fd4cc22334522ed195
> > >
> > >     Link: https://github.com/acpica/acpica/commit/af60199a
> > >     Cc: Dan Williams <dan.j.williams@intel.com>
> > >     Signed-off-by: Bob Moore <robert.moore@intel.com>
> > >     Signed-off-by: Erik Kaneda <erik.kaneda@intel.com>
> > >     Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > >
> >
> > It's likely that this change made the nfit driver parse the ACPI table as if the location cookie field were always present, so the NFIT was parsed incorrectly. Does the NFIT table on this platform contain a valid cookie field?
> >
>
> This was my fault. When I saw the size change fly by, I should have
> remembered to go update all the places that do "sizeof(struct
> acpi_nfit_system_address)".
>
> Yi Zhang, can you give the attached patch a try:

Hi Dan
My DCPMM server boots up now with this patch, feel free to add:
Tested-by: Yi Zhang <yi.zhang@redhat.com>
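
To illustrate the question Erik raises in the quoted text above (whether the table actually carries a valid cookie), here is a minimal sketch, assuming the expected entry size depends on a "cookie valid" flag. This is not the attached patch. struct sketch_spa, SKETCH_COOKIE_VALID and its bit position are hypothetical names and values chosen for illustration, and the field layout is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bit; the real bit assignment is not shown in this thread. */
#define SKETCH_COOKIE_VALID (1u << 2)

/* Hypothetical stand-in for struct acpi_nfit_system_address (assumed layout). */
struct sketch_spa {
	uint16_t type;             /* header.type */
	uint16_t length;           /* header.length as reported by firmware */
	uint16_t range_index;
	uint16_t flags;
	uint32_t reserved;
	uint32_t proximity_domain;
	uint8_t  range_guid[16];
	uint64_t address;
	uint64_t range_length;
	uint64_t memory_mapping;
	uint64_t location_cookie;  /* only present when the flag is set */
};

/* Expected on-table size: full struct only when the cookie is flagged valid. */
static size_t sketch_expected_len(const struct sketch_spa *spa)
{
	if (spa->flags & SKETCH_COOKIE_VALID)
		return sizeof(*spa);
	return sizeof(*spa) - sizeof(spa->location_cookie);
}

/* Accept an entry only if its reported length matches the expected size. */
static bool sketch_len_ok(const struct sketch_spa *spa)
{
	return spa->length == sketch_expected_len(spa);
}
```

Under these assumptions, an entry that reports the shorter length with the flag clear passes sketch_len_ok(), whereas a check against plain sizeof(struct sketch_spa) would not match it.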


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2021-05-07  1:41 UTC | newest]

Thread overview: 8+ messages
2021-05-01  2:27 [bug report] system panic at nfit_get_smbios_id+0x6e/0xf0 [nfit] during boot Yi Zhang
2021-05-01  6:05 ` Dan Williams
2021-05-06  3:05   ` Yi Zhang
2021-05-06 17:27     ` Kaneda, Erik
2021-05-06 21:17       ` Dan Williams
2021-05-06 21:17         ` Dan Williams
2021-05-07  1:41         ` Yi Zhang
2021-05-07  1:41           ` Yi Zhang
