LinuxPPC-Dev Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH v2] powerpc/eeh: Fix EEH handling for hugepages in ioremap space.
@ 2021-04-12  7:52 Mahesh Salgaonkar
  2021-04-13  0:53 ` Oliver O'Halloran
  2021-04-19  3:59 ` Michael Ellerman
  0 siblings, 2 replies; 4+ messages in thread
From: Mahesh Salgaonkar @ 2021-04-12  7:52 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Aneesh Kumar K.V, Oliver O'Halloran, stable

During the EEH MMIO error checking, the current implementation fails to map
the (virtual) MMIO address back to the pci device on radix with hugepage
mappings for I/O. This results into failure to dispatch EEH event with no
recovery even when EEH capability has been enabled on the device.

eeh_check_failure(token)		# token = virtual MMIO address
  addr = eeh_token_to_phys(token);
  edev = eeh_addr_cache_get_dev(addr);
  if (!edev)
	return 0;
  eeh_dev_check_failure(edev);	<= Dispatch the EEH event

In case of hugepage mappings, eeh_token_to_phys() has a bug in virt -> phys
translation that results in wrong physical address, which is then passed to
eeh_addr_cache_get_dev() to match it against cached pci I/O address ranges
to get to a PCI device. Hence, it fails to find a match and the EEH event
never gets dispatched leaving the device in failed state.

The commit 33439620680be ("powerpc/eeh: Handle hugepages in ioremap space")
introduced following logic to translate virt to phys for hugepage mappings:

eeh_token_to_phys():
+	pa = pte_pfn(*ptep);
+
+	/* On radix we can do hugepage mappings for io, so handle that */
+       if (hugepage_shift) {
+               pa <<= hugepage_shift;			<= This is wrong
+               pa |= token & ((1ul << hugepage_shift) - 1);
+       }

This patch fixes the virt -> phys translation in eeh_token_to_phys()
function.

$ cat /sys/kernel/debug/powerpc/eeh_address_cache
mem addr range [0x0000040080000000-0x00000400807fffff]: 0030:01:00.1
mem addr range [0x0000040080800000-0x0000040080ffffff]: 0030:01:00.1
mem addr range [0x0000040081000000-0x00000400817fffff]: 0030:01:00.0
mem addr range [0x0000040081800000-0x0000040081ffffff]: 0030:01:00.0
mem addr range [0x0000040082000000-0x000004008207ffff]: 0030:01:00.1
mem addr range [0x0000040082080000-0x00000400820fffff]: 0030:01:00.0
mem addr range [0x0000040082100000-0x000004008210ffff]: 0030:01:00.1
mem addr range [0x0000040082110000-0x000004008211ffff]: 0030:01:00.0

Above is the list of cached io address ranges of pci 0030:01:00.<fn>.

Before this patch:

Tracing 'arg1' of function eeh_addr_cache_get_dev() during error injection
clearly shows that 'addr=' contains wrong physical address:

   kworker/u16:0-7       [001] ....   108.883775: eeh_addr_cache_get_dev:
	   (eeh_addr_cache_get_dev+0xc/0xf0) addr=0x80103000a510

dmesg shows no EEH recovery messages:

[  108.563768] bnx2x: [bnx2x_timer:5801(eth2)]MFW seems hanged: drv_pulse (0x9ae) != mcp_pulse (0x7fff)
[  108.563788] bnx2x: [bnx2x_hw_stats_update:870(eth2)]NIG timer max (4294967295)
[  108.883788] bnx2x: [bnx2x_acquire_hw_lock:2013(eth1)]lock_status 0xffffffff  resource_bit 0x1
[  108.884407] bnx2x 0030:01:00.0 eth1: MDC/MDIO access timeout
[  108.884976] bnx2x 0030:01:00.0 eth1: MDC/MDIO access timeout
<..>

After this patch:

eeh_addr_cache_get_dev() trace shows correct physical address:

  <idle>-0       [001] ..s.  1043.123828: eeh_addr_cache_get_dev:
	  (eeh_addr_cache_get_dev+0xc/0xf0) addr=0x40080bc7cd8

dmesg logs shows EEH recovery getting triggerred:

[  964.323980] bnx2x: [bnx2x_timer:5801(eth2)]MFW seems hanged: drv_pulse (0x746f) != mcp_pulse (0x7fff)
[  964.323991] EEH: Recovering PHB#30-PE#10000
[  964.324002] EEH: PE location: N/A, PHB location: N/A
[  964.324006] EEH: Frozen PHB#30-PE#10000 detected
<..>

Cc: stable@vger.kernel.org
Fixes: 33439620680be ("powerpc/eeh: Handle hugepages in ioremap space")
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reported-by: Dominic DeMarco <ddemarc@us.ibm.com>
---
Change in v2:
- Fixed error reported by checkpatch.pl.
---
 arch/powerpc/kernel/eeh.c |   11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index cd60bc1c87011..7040e430a1249 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -362,14 +362,11 @@ static inline unsigned long eeh_token_to_phys(unsigned long token)
 	pa = pte_pfn(*ptep);
 
 	/* On radix we can do hugepage mappings for io, so handle that */
-	if (hugepage_shift) {
-		pa <<= hugepage_shift;
-		pa |= token & ((1ul << hugepage_shift) - 1);
-	} else {
-		pa <<= PAGE_SHIFT;
-		pa |= token & (PAGE_SIZE - 1);
-	}
+	if (!hugepage_shift)
+		hugepage_shift = PAGE_SHIFT;
 
+	pa <<= PAGE_SHIFT;
+	pa |= token & ((1ul << hugepage_shift) - 1);
 	return pa;
 }
 



^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH v2] powerpc/eeh: Fix EEH handling for hugepages in ioremap space.
  2021-04-12  7:52 [PATCH v2] powerpc/eeh: Fix EEH handling for hugepages in ioremap space Mahesh Salgaonkar
@ 2021-04-13  0:53 ` Oliver O'Halloran
  2021-04-14  8:31   ` Mahesh J Salgaonkar
  2021-04-19  3:59 ` Michael Ellerman
  1 sibling, 1 reply; 4+ messages in thread
From: Oliver O'Halloran @ 2021-04-13  0:53 UTC (permalink / raw)
  To: Mahesh Salgaonkar; +Cc: linuxppc-dev, stable, Aneesh Kumar K.V

On Mon, Apr 12, 2021 at 5:52 PM Mahesh Salgaonkar <mahesh@linux.ibm.com> wrote:
>
> During the EEH MMIO error checking, the current implementation fails to map
> the (virtual) MMIO address back to the pci device on radix with hugepage
> mappings for I/O. This results into failure to dispatch EEH event with no
> recovery even when EEH capability has been enabled on the device.
>
> eeh_check_failure(token)                # token = virtual MMIO address
>   addr = eeh_token_to_phys(token);
>   edev = eeh_addr_cache_get_dev(addr);
>   if (!edev)
>         return 0;
>   eeh_dev_check_failure(edev);  <= Dispatch the EEH event
>
> In case of hugepage mappings, eeh_token_to_phys() has a bug in virt -> phys
> translation that results in wrong physical address, which is then passed to
> eeh_addr_cache_get_dev() to match it against cached pci I/O address ranges
> to get to a PCI device. Hence, it fails to find a match and the EEH event
> never gets dispatched leaving the device in failed state.
>
> The commit 33439620680be ("powerpc/eeh: Handle hugepages in ioremap space")
> introduced following logic to translate virt to phys for hugepage mappings:
>
> eeh_token_to_phys():
> +       pa = pte_pfn(*ptep);
> +
> +       /* On radix we can do hugepage mappings for io, so handle that */
> +       if (hugepage_shift) {
> +               pa <<= hugepage_shift;                  <= This is wrong
> +               pa |= token & ((1ul << hugepage_shift) - 1);
> +       }

I think I vaguely remember thinking "is this right?" at the time.
Apparently not!

Reviewed-by: Oliver O'Halloran <oohall@gmail.com>


It would probably be a good idea to add a debugfs interface to help
with testing the address translation. Maybe something like:

echo <mmio addr> > /sys/kernel/debug/powerpc/eeh_addr_check

Then in the kernel:

struct resource *r = lookup_resource(mmio_addr);
void *virt = ioremap_resource(r);
ret = eeh_check_failure(virt);
iounmap(virt)

return ret;

A little tedious, but then you can write a selftest :)

Oliver

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH v2] powerpc/eeh: Fix EEH handling for hugepages in ioremap space.
  2021-04-13  0:53 ` Oliver O'Halloran
@ 2021-04-14  8:31   ` Mahesh J Salgaonkar
  0 siblings, 0 replies; 4+ messages in thread
From: Mahesh J Salgaonkar @ 2021-04-14  8:31 UTC (permalink / raw)
  To: Oliver O'Halloran; +Cc: linuxppc-dev, stable, Aneesh Kumar K.V

On 2021-04-13 10:53:39 Tue, Oliver O'Halloran wrote:
> On Mon, Apr 12, 2021 at 5:52 PM Mahesh Salgaonkar <mahesh@linux.ibm.com> wrote:
> >
> > During the EEH MMIO error checking, the current implementation fails to map
> > the (virtual) MMIO address back to the pci device on radix with hugepage
> > mappings for I/O. This results into failure to dispatch EEH event with no
> > recovery even when EEH capability has been enabled on the device.
> >
> > eeh_check_failure(token)                # token = virtual MMIO address
> >   addr = eeh_token_to_phys(token);
> >   edev = eeh_addr_cache_get_dev(addr);
> >   if (!edev)
> >         return 0;
> >   eeh_dev_check_failure(edev);  <= Dispatch the EEH event
> >
> > In case of hugepage mappings, eeh_token_to_phys() has a bug in virt -> phys
> > translation that results in wrong physical address, which is then passed to
> > eeh_addr_cache_get_dev() to match it against cached pci I/O address ranges
> > to get to a PCI device. Hence, it fails to find a match and the EEH event
> > never gets dispatched leaving the device in failed state.
> >
> > The commit 33439620680be ("powerpc/eeh: Handle hugepages in ioremap space")
> > introduced following logic to translate virt to phys for hugepage mappings:
> >
> > eeh_token_to_phys():
> > +       pa = pte_pfn(*ptep);
> > +
> > +       /* On radix we can do hugepage mappings for io, so handle that */
> > +       if (hugepage_shift) {
> > +               pa <<= hugepage_shift;                  <= This is wrong
> > +               pa |= token & ((1ul << hugepage_shift) - 1);
> > +       }
> 
> I think I vaguely remember thinking "is this right?" at the time.
> Apparently not!
> 
> Reviewed-by: Oliver O'Halloran <oohall@gmail.com>
> 

Thanks for the review.

> 
> It would probably be a good idea to add a debugfs interface to help
> with testing the address translation. Maybe something like:
> 
> echo <mmio addr> > /sys/kernel/debug/powerpc/eeh_addr_check
> 
> Then in the kernel:
> 
> struct resource *r = lookup_resource(mmio_addr);
> void *virt = ioremap_resource(r);
> ret = eeh_check_failure(virt);
> iounmap(virt)
> 
> return ret;
> 
> A little tedious, but then you can write a selftest :)

Sure, will give a try.

Thanks,
-Mahesh.

-- 
Mahesh J Salgaonkar

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH v2] powerpc/eeh: Fix EEH handling for hugepages in ioremap space.
  2021-04-12  7:52 [PATCH v2] powerpc/eeh: Fix EEH handling for hugepages in ioremap space Mahesh Salgaonkar
  2021-04-13  0:53 ` Oliver O'Halloran
@ 2021-04-19  3:59 ` Michael Ellerman
  1 sibling, 0 replies; 4+ messages in thread
From: Michael Ellerman @ 2021-04-19  3:59 UTC (permalink / raw)
  To: Mahesh Salgaonkar, linuxppc-dev
  Cc: Aneesh Kumar K.V, Oliver O'Halloran, stable

On Mon, 12 Apr 2021 13:22:50 +0530, Mahesh Salgaonkar wrote:
> During the EEH MMIO error checking, the current implementation fails to map
> the (virtual) MMIO address back to the pci device on radix with hugepage
> mappings for I/O. This results into failure to dispatch EEH event with no
> recovery even when EEH capability has been enabled on the device.
> 
> eeh_check_failure(token)		# token = virtual MMIO address
>   addr = eeh_token_to_phys(token);
>   edev = eeh_addr_cache_get_dev(addr);
>   if (!edev)
> 	return 0;
>   eeh_dev_check_failure(edev);	<= Dispatch the EEH event
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/eeh: Fix EEH handling for hugepages in ioremap space.
      https://git.kernel.org/powerpc/c/5ae5bc12d0728db60a0aa9b62160ffc038875f1a

cheers

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, back to index

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-12  7:52 [PATCH v2] powerpc/eeh: Fix EEH handling for hugepages in ioremap space Mahesh Salgaonkar
2021-04-13  0:53 ` Oliver O'Halloran
2021-04-14  8:31   ` Mahesh J Salgaonkar
2021-04-19  3:59 ` Michael Ellerman

LinuxPPC-Dev Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linuxppc-dev/0 linuxppc-dev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linuxppc-dev linuxppc-dev/ https://lore.kernel.org/linuxppc-dev \
		linuxppc-dev@lists.ozlabs.org linuxppc-dev@ozlabs.org
	public-inbox-index linuxppc-dev

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.ozlabs.lists.linuxppc-dev


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git