* PCI HPMC on C240 with alternatives Patching
@ 2019-05-24  6:58 Sven Schnelle
  2019-05-24 10:50 ` Sven Schnelle
  0 siblings, 1 reply; 7+ messages in thread
From: Sven Schnelle @ 2019-05-24  6:58 UTC (permalink / raw)
  To: linux-parisc

Hi List,

I recently got my hands on an old C240. I see a kernel oops pretty early during
boot when alternatives patching is enabled:

[   40.810794] sym53c8xx 0000:00:13.0: enabling device (0150 -> 0153)
[   40.894350] sym0: <875> rev 0x4 at pci 0000:00:13.0 irq 22
[   41.047461] sym0: No NVRAM, ID 7, Fast-20, SE, parity checking


[   50.337981] Backtrace:
[   50.366087]  [<105b61bc>] sym_hcb_attach+0x668/0x840
[   50.425150]  [<105af6c4>] sym_attach.constprop.0+0x188/0x378
[   50.492476]  [<105ae518>] sym2_probe+0x40c/0x4c4
[   50.547424]  [<104e52e4>] pci_device_probe+0xb0/0x150
[   50.607543]  [<1056cab0>] really_probe+0x2ac/0x3f8
[   50.664530]  [<1056d2f4>] driver_probe_device+0x51c/0x534
[   50.728743]  [<1056d614>] device_driver_attach+0x54/0x98
[   50.791921]  [<1056d7a0>] __driver_attach+0x148/0x160
[   50.852010]  [<1056a2dc>] bus_for_each_dev+0x7c/0xbc
[   50.911067]  [<1056c100>] driver_attach+0x2c/0x4c
[   50.967022]  [<1056b95c>] bus_add_driver+0x1b4/0x210
[   51.026071]  [<1056dfb8>] driver_register+0xdc/0x12c
[   51.085129]  [<104e4894>] __pci_register_driver+0x4c/0x6c
[   51.149385]  [<10137d4c>] sym2_init+0xc0/0x134
[   51.202244]  [<1018b81c>] do_one_initcall+0x84/0x1c0
[   51.261301]  [<1010158c>] kernel_init_freeable+0x2b8/0x2d0
[   51.326580]  [<107e6040>] kernel_init+0x20/0x140
[   51.381523]  [<1019301c>] ret_from_kernel_thread+0x1c/0x24
[   51.464420]
[   51.482098] High Priority Machine Check (HPMC): Code=1 (High-priority machine check (HPMC)) at addr 00000000
[   51.599065] CPU: 0 PID: 1 Comm: swapper Not tainted 5.2.0-rc1-32bit+ #83
[   51.678775] Hardware name: 9000/782/C240+
[   51.726427]
[   51.744090]      YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
[   51.800018] PSW: 00000000000001001111111100001111 Not tainted
[   51.868370] r00-03  0004ff0f 109f07c4 105b61bc 2fcc09c0
[   51.930512] r04-07  2f44a000 108ced74 0098967f 00000008
[   51.992659] r08-11  00988000 01000000 00000002 2f44a3d4
[   52.054804] r12-15  10a57fc4 109e5800 000000fd f0100000
[   52.116948] r16-19  f000168c f000020c f0000204 0001602c
[   52.179094] r20-23  2f44a000 0000000f 2fc19870 2f44a920
[   52.241238] r24-27  00000040 0001602c 00016014 108fc7c4
[   52.303381] r28-31  0000001f 00000000 2fcc0a00 00000000
[   52.365519] sr00-03  00000000 00000000 00000000 00000000
[   52.428693] sr04-07  00000000 00000000 00000000 00000000
[   52.491854]
[   52.509524] IASQ: 00000000 00000000 IAOQ: 104d5708 104d570c
[   52.575798]  IIR: 48623fd9    ISR: 0024007f  IOR: 300c09ac
[   52.641036]  CPU:        0   CR30: 2fcc0000 CR31: ffff1558
[   52.706270]  ORIG_R28: 00000000
[   52.743614]  IAOQ[0]: ioread8+0x34/0x5c
[   52.789215]  IAOQ[1]: ioread8+0x38/0x5c
[   52.834821]  RP(r2): sym_hcb_attach+0x668/0x840
[   52.888676] Backtrace:
[   52.916709]  [<105b61bc>] sym_hcb_attach+0x668/0x840
[   52.975765]  [<105af6c4>] sym_attach.constprop.0+0x188/0x378
[   53.043088]  [<105ae518>] sym2_probe+0x40c/0x4c4
[   53.098009]  [<104e52e4>] pci_device_probe+0xb0/0x150
[   53.158101]  [<1056cab0>] really_probe+0x2ac/0x3f8
[   53.215080]  [<1056d2f4>] driver_probe_device+0x51c/0x534
[   53.279292]  [<1056d614>] device_driver_attach+0x54/0x98
[   53.342472]  [<1056d7a0>] __driver_attach+0x148/0x160
[   53.402557]  [<1056a2dc>] bus_for_each_dev+0x7c/0xbc
[   53.461616]  [<1056c100>] driver_attach+0x2c/0x4c
[   53.517569]  [<1056b95c>] bus_add_driver+0x1b4/0x210
[   53.576612]  [<1056dfb8>] driver_register+0xdc/0x12c
[   53.635669]  [<104e4894>] __pci_register_driver+0x4c/0x6c
[   53.699896]  [<10137d4c>] sym2_init+0xc0/0x134
[   53.752740]  [<1018b81c>] do_one_initcall+0x84/0x1c0
[   53.811785]  [<1010158c>] kernel_init_freeable+0x2b8/0x2d0
[   53.877046]  [<107e6040>] kernel_init+0x20/0x140
[   53.931969]  [<1019301c>] ret_from_kernel_thread+0x1c/0x24
[   53.997196]
[   54.016034] Kernel panic - not syncing: High Priority Machine Check (HPMC)
[   54.111012] Rebooting in 10 seconds..

The full dmesg can be found at https://stackframe.org/crashlog.txt

This also happens sometimes with the tulip driver, so it's likely not related to
sym53c8xx itself. The crash location in source is:

0x1059e394 is in sym_hcb_attach (/home/svens/parisc-linux/src/drivers/scsi/sym53c8xx_2/sym_hipd.c:1038).
1029		/*
1030		*  Start script (exchange values)
1031		*/
1032		OUTL(np, nc_dsa, np->hcb_ba);
1033		OUTL_DSP(np, pc);
1034		/*
1035		 *  Wait 'til done (with timeout)
1036		 */
1037		for (i=0; i<SYM_SNOOP_TIMEOUT; i++)
1038			if (INB(np, nc_istat) & (INTF|SIP|DIP))  <-- crash
1039				break;
1040		if (i>=SYM_SNOOP_TIMEOUT) {
1041			printf ("CACHE TEST FAILED: timeout.\n");
1042			return (0x20);

My (wild) guess is that we're patching away some memory barrier or cache flush,
so the SCRIPTS engine in the SCSI controller starts executing garbage and triggers
a PCI bus read/write to an invalid address. The INB() is probably reported as the
HPMC location only because of the delay between writing DSP and the chip actually
starting to fetch instructions/data.
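
The suspected failure mode can be pictured with a toy model of a write-back
cache in front of a device that reads memory directly. This is only a sketch:
the struct and function names are hypothetical, and it does not describe how
the PA-RISC caches or the sym53c8xx chip actually work. It merely shows why a
device could fetch stale data if a flush were patched out.

```c
#include <stdbool.h>
#include <stdint.h>

/* One cache line in a toy write-back cache (hypothetical model). */
struct toy_line {
	uint32_t memory;  /* what a non-coherent device sees */
	uint32_t cached;  /* what the CPU sees */
	bool dirty;       /* cached value not yet written back */
};

/* CPU store: lands in the cache only. */
static void cpu_write(struct toy_line *l, uint32_t v)
{
	l->cached = v;
	l->dirty = true;
}

/* Stand-in for a cache flush: write the dirty line back to memory. */
static void cpu_flush(struct toy_line *l)
{
	if (l->dirty) {
		l->memory = l->cached;
		l->dirty = false;
	}
}

/* Non-coherent device fetch: bypasses the CPU cache. */
static uint32_t device_read(const struct toy_line *l)
{
	return l->memory;
}
```

In this model, a device_read() issued after cpu_write() but before cpu_flush()
still returns the old memory contents, which is the kind of garbage fetch the
guess above describes.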

Does that ring a bell for anyone on the list? Otherwise I can check the
alternatives patching over the weekend; I think there are not that many locations.

The good thing is that it's reproducible: it always crashes, either in SCSI or in
Tulip.

Regards
Sven


* Re: PCI HPMC on C240 with alternatives Patching
  2019-05-24  6:58 PCI HPMC on C240 with alternatives Patching Sven Schnelle
@ 2019-05-24 10:50 ` Sven Schnelle
  2019-05-24 11:32   ` Sven Schnelle
  0 siblings, 1 reply; 7+ messages in thread
From: Sven Schnelle @ 2019-05-24 10:50 UTC (permalink / raw)
  To: linux-parisc

On Fri, May 24, 2019 at 08:58:50AM +0200, Sven Schnelle wrote:
> Hi List,
> 
> i recently got my hands on an old C240. I see a Kernel oops pretty early when
> alternatives patching is enabled:
> [..]
> My (wild) guess is that we're patching away some memory barrier or cache flush
> so the SCRIPTS engine in the SCSI controller starts executing garbage and triggers
> a PCI bus read/write to an invalid address. The reason the INB() is given as the
> HPMC location is likely caused by the delay between writing DSPS and the chip actually
> starting to fetch insn/data.
> 
> Does that ring any bell for someone on the list? Otherwise i can check the
> alternatives patching over the weekend, i think there are not that many locations.
> 
> The good thing is it's reproducible - it always crashes. Either in SCSI or in
> Tulip.

Did a quick test: removing ALT_COND_NO_IOC_FDC from asm_io_fdc() seems to fix this
issue. I haven't looked into it in more detail, though.

index 73ca89a47f49..d83b1adf2f3f 100644
--- a/arch/parisc/include/asm/cache.h
+++ b/arch/parisc/include/asm/cache.h
@@ -52,7 +52,6 @@ void parisc_setup_cache_timing(void);

 #define asm_io_fdc(addr) asm volatile("fdc %%r0(%0)" \
                        ALTERNATIVE(ALT_COND_NO_DCACHE, INSN_NOP) \
-                       ALTERNATIVE(ALT_COND_NO_IOC_FDC, INSN_NOP) \
                        : : "r" (addr) : "memory")
 #define asm_io_sync()  asm volatile("sync" \
                        ALTERNATIVE(ALT_COND_NO_DCACHE, INSN_NOP) \
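
To make the effect of dropping that line easier to follow, here is a small
user-space sketch of how condition-based patching decides whether an fdc
becomes a nop. It is modelled loosely on the checks visible in
apply_alternatives(); the flag values, the instruction encodings, and the
struct machine type are illustrative assumptions, not the kernel's definitions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative values only; the real flags and opcodes live in the kernel. */
#define COND_NO_DCACHE   (1u << 0)   /* patch if there is no data cache */
#define COND_NO_IOC_FDC  (1u << 1)   /* patch if IO-PDIR needs no flush */
#define INSN_FDC         0x11111111u /* stands in for "fdc %r0(addr)" */
#define INSN_NOP         0x22222222u /* stands in for a nop */

struct alt_entry {
	uint32_t *insn;  /* instruction to patch */
	uint32_t cond;   /* condition under which to patch */
	uint32_t repl;   /* replacement instruction */
};

struct machine {
	bool has_dcache;
	bool iopdir_needs_flush;  /* what firmware reports */
};

/* Replace every instruction whose condition holds; return the patch count. */
static size_t apply_alts(struct alt_entry *alts, size_t n,
			 const struct machine *m)
{
	size_t applied = 0;

	for (size_t i = 0; i < n; i++) {
		if ((alts[i].cond & COND_NO_DCACHE) && m->has_dcache)
			continue;  /* cache present: keep the flush */
		if ((alts[i].cond & COND_NO_IOC_FDC) && m->iopdir_needs_flush)
			continue;  /* firmware says flushes are required */
		*alts[i].insn = alts[i].repl;
		applied++;
	}
	return applied;
}

/* Patch one asm_io_fdc()-like site, with or without the NO_IOC_FDC entry. */
static uint32_t patch_io_fdc(const struct machine *m, bool with_ioc_alt)
{
	uint32_t insn = INSN_FDC;
	struct alt_entry alts[2] = {
		{ &insn, COND_NO_DCACHE, INSN_NOP },
		{ &insn, COND_NO_IOC_FDC, INSN_NOP },
	};

	apply_alts(alts, with_ioc_alt ? 2 : 1, m);
	return insn;
}
```

With a machine that has a data cache but reports that no IO-PDIR flush is
needed, patch_io_fdc(&m, true) yields the nop, while patch_io_fdc(&m, false)
keeps the fdc, which matches the behaviour change of the quick test above.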


Sven


* Re: PCI HPMC on C240 with alternatives Patching
  2019-05-24 10:50 ` Sven Schnelle
@ 2019-05-24 11:32   ` Sven Schnelle
  2019-05-24 15:38     ` Sven Schnelle
  0 siblings, 1 reply; 7+ messages in thread
From: Sven Schnelle @ 2019-05-24 11:32 UTC (permalink / raw)
  To: linux-parisc

Hi List,

On Fri, May 24, 2019 at 12:50:03PM +0200, Sven Schnelle wrote:
> On Fri, May 24, 2019 at 08:58:50AM +0200, Sven Schnelle wrote:
> > Hi List,
> > 
> > i recently got my hands on an old C240. I see a Kernel oops pretty early when
> > alternatives patching is enabled:
> > [..]
> > My (wild) guess is that we're patching away some memory barrier or cache flush
> > so the SCRIPTS engine in the SCSI controller starts executing garbage and triggers
> > a PCI bus read/write to an invalid address. The reason the INB() is given as the
> > HPMC location is likely caused by the delay between writing DSPS and the chip actually
> > starting to fetch insn/data.
> > 
> > Does that ring any bell for someone on the list? Otherwise i can check the
> > alternatives patching over the weekend, i think there are not that many locations.
> > 
> > The good thing is it's reproducible - it always crashes. Either in SCSI or in
> > Tulip.
> 
> Did a quick test, removing ALT_COND_NO_IOC_FDC from asm_io_fdc() seems to fix this
> issue. Haven't looked in more detail into this though.

Added some debugging:

[   25.405365] boot_cpu_data.pdc_capabilities: 2

So the Non-coherent IO-PDIR bit is clear, i.e. PDC claims that IO-PDIR fetches
are performed coherently, *BUT*:

When this bit is clear, flushes and syncs are not required. This
bit is only applicable to SBAs, and does not apply to Legacy IOAs.

With my limited understanding I would think that the C240 has a 'Legacy IOA' while
the C3xxx has an SBA? So I think we would need to add a check for whether we have
an IOA or an SBA in the alternatives code?
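
The debug value can be checked by hand. The sketch below assumes the bit
position given in the kernel comment in alternative.c (bit #61 in big-endian
numbering of a 64-bit word, i.e. mask 0x4); the macro and function names are
made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Big-endian bit numbering of a 64-bit word: bit 0 is the MSB. */
#define BE_BIT64(n)     (UINT64_C(1) << (63 - (n)))

/* Assumed position of the "Non-coherent IO-PDIR" bit (bit #61). */
#define CAP_IOPDIR_FDC  BE_BIT64(61)

/* True if the capabilities word asks for flush/sync on IO-PDIR updates. */
static bool iopdir_needs_flush(uint64_t capabilities)
{
	return (capabilities & CAP_IOPDIR_FDC) != 0;
}
```

Under that assumption the bit is clear in the C240 value of 2 and set in a
value such as 0x35.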

Sven



* Re: PCI HPMC on C240 with alternatives Patching
  2019-05-24 11:32   ` Sven Schnelle
@ 2019-05-24 15:38     ` Sven Schnelle
  2019-05-24 17:59       ` Rolf Eike Beer
  2019-05-24 19:58       ` Helge Deller
  0 siblings, 2 replies; 7+ messages in thread
From: Sven Schnelle @ 2019-05-24 15:38 UTC (permalink / raw)
  To: linux-parisc

Hi List,

On Fri, May 24, 2019 at 01:32:41PM +0200, Sven Schnelle wrote:
> > Did a quick test, removing ALT_COND_NO_IOC_FDC from asm_io_fdc() seems to fix this
> > issue. Haven't looked in more detail into this though.
> 
> Added some debugging:
> 
> [   25.405365] boot_cpu_data.pdc_capabilities: 2
> 
> So PDC says IO-PDIR fetches are not performed coherently, *BUT*:
> 
> When this bit is clear, flushes and syncs are not required. This
> bit is only applicable to SBAs, and does not apply to Legacy IOAs.
> 
> With my limited understand i would think that C240 has a 'Legacy IOA' while
> C3xxx has SBA? So i think we would need to add some check whether we have
> an IOA or SBA in the alternatives code?

I wrote the patch below to check for legacy IO adapters. Is HPHW_BCPORT the right
type? On my C240 both GSC adapters are HPHW_BCPORT.


From a5a444d0eb4960d7a1c7c4acf5eeb86b4e11e358 Mon Sep 17 00:00:00 2001
From: Sven Schnelle <svens@stackframe.org>
Date: Fri, 24 May 2019 17:33:28 +0200
Subject: [PATCH] parisc: fix alternative patching for Legacy IO systems

On systems with legacy IO Adapters we must ignore the IO-PDIR
bit we get in the PDC_MODEL response. This fixes booting on
HP9000/C240.

Signed-off-by: Sven Schnelle <svens@stackframe.org>
---
 arch/parisc/include/asm/hardware.h |  1 +
 arch/parisc/kernel/alternative.c   |  8 +++++---
 arch/parisc/kernel/drivers.c       | 14 ++++++++++++++
 3 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/arch/parisc/include/asm/hardware.h b/arch/parisc/include/asm/hardware.h
index 9d3d7737c58b..5fb7a3c3eb46 100644
--- a/arch/parisc/include/asm/hardware.h
+++ b/arch/parisc/include/asm/hardware.h
@@ -121,6 +121,7 @@ extern void init_parisc_bus(void);
 extern struct device *hwpath_to_device(struct hardware_path *modpath);
 extern void device_to_hwpath(struct device *dev, struct hardware_path *path);
 extern int machine_has_merced_bus(void);
+extern int machine_has_ioa(void);
 
 /* inventory.c: */
 extern void do_memory_inventory(void);
diff --git a/arch/parisc/kernel/alternative.c b/arch/parisc/kernel/alternative.c
index bf2274e01a96..ca6368e6e96a 100644
--- a/arch/parisc/kernel/alternative.c
+++ b/arch/parisc/kernel/alternative.c
@@ -25,6 +25,7 @@ void __init_or_module apply_alternatives(struct alt_instr *start,
 	struct alt_instr *entry;
 	int index = 0, applied = 0;
 	int num_cpus = num_online_cpus();
+	int has_ioa = machine_has_ioa();
 
 	for (entry = start; entry < end; entry++, index++) {
 
@@ -53,10 +54,11 @@ void __init_or_module apply_alternatives(struct alt_instr *start,
 		/*
 		 * If the PDC_MODEL capabilities has Non-coherent IO-PDIR bit
 		 * set (bit #61, big endian), we have to flush and sync every
-		 * time IO-PDIR is changed in Ike/Astro.
+		 * time IO-PDIR is changed in Ike/Astro. If legacy IOAs are
+		 * present we're not allowed to skip these flushes/syncs.
 		 */
-		if ((cond & ALT_COND_NO_IOC_FDC) &&
-			(boot_cpu_data.pdc.capabilities & PDC_MODEL_IOPDIR_FDC))
+		if (((cond & ALT_COND_NO_IOC_FDC) && (has_ioa ||
+		      (boot_cpu_data.pdc.capabilities & PDC_MODEL_IOPDIR_FDC))))
 			continue;
 
 		/* Want to replace pdtlb by a pdtlb,l instruction? */
diff --git a/arch/parisc/kernel/drivers.c b/arch/parisc/kernel/drivers.c
index 00a181f1ecc6..4c93b416df99 100644
--- a/arch/parisc/kernel/drivers.c
+++ b/arch/parisc/kernel/drivers.c
@@ -282,6 +282,20 @@ int __init machine_has_merced_bus(void)
 	return ret ? 1 : 0;
 }
 
+static int __init is_IOA_device(struct device *dev, void *data)
+{
+	struct parisc_device *pdev = to_parisc_device(dev);
+
+	if (!check_dev(dev))
+		return 0;
+	return pdev->id.hw_type == HPHW_BCPORT;
+}
+
+int __init machine_has_ioa(void)
+{
+	return !!for_each_padev(is_IOA_device, NULL);
+}
+
 /**
  * find_pa_parent_type - Find a parent of a specific type
  * @dev: The device to start searching from
-- 
2.20.1
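
For readers who want to try the idea outside the kernel, the following
user-space sketch mimics the callback-iterator pattern the patch relies on and
the resulting patching condition. The device types, struct toy_dev, and
for_each_toy_dev() are hypothetical stand-ins, not the parisc bus API, and
returning bool is a choice made for this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of hardware types. */
enum toy_hw_type { TOY_NPROC, TOY_BCPORT, TOY_BRIDGE };

struct toy_dev {
	enum toy_hw_type type;
};

/* Call fn for each device; stop and return its result once it is non-zero. */
static int for_each_toy_dev(const struct toy_dev *devs, size_t n,
			    int (*fn)(const struct toy_dev *, void *),
			    void *data)
{
	for (size_t i = 0; i < n; i++) {
		int ret = fn(&devs[i], data);

		if (ret)
			return ret;
	}
	return 0;
}

/* Callback: non-zero for a bus converter port (treated as a legacy IOA). */
static int is_ioa_device(const struct toy_dev *dev, void *data)
{
	(void)data;
	return dev->type == TOY_BCPORT;
}

static bool machine_has_ioa(const struct toy_dev *devs, size_t n)
{
	return for_each_toy_dev(devs, n, is_ioa_device, NULL) != 0;
}

/* The fdc must stay if a legacy IOA exists or firmware asks for flushes. */
static bool keep_io_fdc(bool has_ioa, bool pdc_needs_flush)
{
	return has_ioa || pdc_needs_flush;
}
```

The iterator stops at the first match, so a single bus converter port is
enough for keep_io_fdc() to leave the flush in place regardless of the
firmware bit.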



* Re: PCI HPMC on C240 with alternatives Patching
  2019-05-24 15:38     ` Sven Schnelle
@ 2019-05-24 17:59       ` Rolf Eike Beer
  2019-05-24 19:58       ` Helge Deller
  1 sibling, 0 replies; 7+ messages in thread
From: Rolf Eike Beer @ 2019-05-24 17:59 UTC (permalink / raw)
  To: linux-parisc


> diff --git a/arch/parisc/include/asm/hardware.h b/arch/parisc/include/asm/hardware.h
> index 9d3d7737c58b..5fb7a3c3eb46 100644
> --- a/arch/parisc/include/asm/hardware.h
> +++ b/arch/parisc/include/asm/hardware.h
> @@ -121,6 +121,7 @@ extern void init_parisc_bus(void);
>  extern struct device *hwpath_to_device(struct hardware_path *modpath);
>  extern void device_to_hwpath(struct device *dev, struct hardware_path *path);
>  extern int machine_has_merced_bus(void);
> +extern int machine_has_ioa(void);

This could return bool.



* Re: PCI HPMC on C240 with alternatives Patching
  2019-05-24 15:38     ` Sven Schnelle
  2019-05-24 17:59       ` Rolf Eike Beer
@ 2019-05-24 19:58       ` Helge Deller
  2019-05-24 21:11         ` Sven Schnelle
  1 sibling, 1 reply; 7+ messages in thread
From: Helge Deller @ 2019-05-24 19:58 UTC (permalink / raw)
  To: Sven Schnelle; +Cc: linux-parisc

* Sven Schnelle <svens@stackframe.org>:
> On Fri, May 24, 2019 at 01:32:41PM +0200, Sven Schnelle wrote:
> > > Did a quick test, removing ALT_COND_NO_IOC_FDC from asm_io_fdc() seems to fix this
> > > issue. Haven't looked in more detail into this though.
> >
> > Added some debugging:
> > [   25.405365] boot_cpu_data.pdc_capabilities: 2

If it had booted, one could see that via:
# grep capabilities /proc/cpuinfo
capabilities    : os64 iopdir_fdc needs_equivalent_aliasing (0x35)

> > So PDC says IO-PDIR fetches are not performed coherently, *BUT*:
> >
> > When this bit is clear, flushes and syncs are not required. This
> > bit is only applicable to SBAs, and does not apply to Legacy IOAs.
> >
> > With my limited understand i would think that C240 has a 'Legacy IOA' while
> > C3xxx has SBA? So i think we would need to add some check whether we have
> > an IOA or SBA in the alternatives code?
>
> I did the patch below to check for legacy IO Adapters. Is HW_BCPORT the right
> type? On my C240 both GSC Adapters are HW_BCPORT.

I'm not sure.
It seems to depend on the CPU.
See the comment in drivers/parisc/ccio-dma.c, lines 607ff.:

        /* FIXME: PCX_W platforms don't need FDC/SYNC. (eg C360)
        **        PCX-U/U+ do. (eg C200/C240)
        **        PCX-T'? Don't know. (eg C110 or similar K-class)
        **
        ** See PDC_MODEL/option 0/SW_CAP word for "Non-coherent IO-PDIR bit".
        **
        ** "Since PCX-U employs an offset hash that is incompatible with
        ** the real mode coherence index generation of U2, the PDIR entry
        ** must be flushed to memory to retain coherence."


Can you try this patch instead?


diff --git a/arch/parisc/kernel/alternative.c b/arch/parisc/kernel/alternative.c
index bf2274e01a96..7c574b21f834 100644
--- a/arch/parisc/kernel/alternative.c
+++ b/arch/parisc/kernel/alternative.c
@@ -56,7 +56,8 @@ void __init_or_module apply_alternatives(struct alt_instr *start,
 		 * time IO-PDIR is changed in Ike/Astro.
 		 */
 		if ((cond & ALT_COND_NO_IOC_FDC) &&
-			(boot_cpu_data.pdc.capabilities & PDC_MODEL_IOPDIR_FDC))
+			((boot_cpu_data.cpu_type < pcxw) ||
+			 (boot_cpu_data.pdc.capabilities & PDC_MODEL_IOPDIR_FDC)))
 			continue;

 		/* Want to replace pdtlb by a pdtlb,l instruction? */
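
The new condition can be exercised in isolation with a short sketch. The enum
ordering below is an assumption about how the CPU types are ordered (oldest
first, so that "< pcxw" covers PCX-U/U+ and earlier), and the 0x4 mask is
derived from the "bit #61, big endian" comment; neither is copied from the
kernel headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed ordering, oldest to newest; only the relative order matters. */
enum cpu_type {
	pcx, pcxs, pcxt, pcxt_, pcxl, pcxl2,
	pcxu, pcxu_, pcxw, pcxw_, pcxw2, mako, mako2
};

/* Assumed mask of the "Non-coherent IO-PDIR" capability bit. */
#define CAP_IOPDIR_FDC (1u << 2)

/*
 * True if the fdc in asm_io_fdc() must be kept, i.e. the alternative
 * must NOT be applied: either the CPU predates PCX-W or firmware sets
 * the non-coherent IO-PDIR bit.
 */
static bool must_keep_io_fdc(enum cpu_type cpu, uint32_t capabilities)
{
	return cpu < pcxw || (capabilities & CAP_IOPDIR_FDC) != 0;
}
```

For a PCX-U+ machine reporting capabilities 2 this returns true, so the flush
stays; for a PCX-W or newer CPU the decision falls back to the firmware bit.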


Helge


* Re: PCI HPMC on C240 with alternatives Patching
  2019-05-24 19:58       ` Helge Deller
@ 2019-05-24 21:11         ` Sven Schnelle
  0 siblings, 0 replies; 7+ messages in thread
From: Sven Schnelle @ 2019-05-24 21:11 UTC (permalink / raw)
  To: Helge Deller; +Cc: linux-parisc

Hi Helge,

On Fri, May 24, 2019 at 09:58:30PM +0200, Helge Deller wrote:

> > I did the patch below to check for legacy IO Adapters. Is HW_BCPORT the right
> > type? On my C240 both GSC Adapters are HW_BCPORT.
> 
> I'm not sure.
> Seems to be dependend on the CPU.
> See comment in drivers/parisc/ccio-dma.c, line 607ff:
> 
>         /* FIXME: PCX_W platforms don't need FDC/SYNC. (eg C360)
>         **        PCX-U/U+ do. (eg C200/C240)
>         **        PCX-T'? Don't know. (eg C110 or similar K-class)
>         **
>         ** See PDC_MODEL/option 0/SW_CAP word for "Non-coherent IO-PDIR bit".
>         **
>         ** "Since PCX-U employs an offset hash that is incompatible with
>         ** the real mode coherence index generation of U2, the PDIR entry
>         ** must be flushed to memory to retain coherence."
> 
> 
> Can you try this patch instead?
> [..]

Works on my C240 and C3750. Thanks!

Regards
Sven

