From: Grant Grundler <grantgrundler@gmail.com>
To: John David Anglin <dave.anglin@bell.net>
Cc: Helge Deller <deller@gmx.de>,
	Carlo Pisani <carlojpisani@gmail.com>,
	debian-hppa@lists.debian.org,
	linux-parisc <linux-parisc@vger.kernel.org>
Subject: Re: kernel 4.15.7/64bit, C3600 is unstable during heavy I/O on PCI
Date: Mon, 19 Mar 2018 20:30:24 +0100
Message-ID: <CAP6odjj4DTSox4K9318OT2nMqvhSw3Fhz6DY+iYduvfj6m6W2w@mail.gmail.com>
In-Reply-To: <b2fd86ec-a83a-5133-fa66-29229595851c@bell.net>

Hi John!

On Sat, Mar 17, 2018 at 6:47 PM, John David Anglin <dave.anglin@bell.net> wrote:
> Hi Grant,
>
> On 2018-03-17 12:12 PM, Grant Grundler wrote:
>>
>> "Master Abort" means the MMIO
>> transaction timed out - usually due to the device not responding to an
>> MMIO read.
>
> In lba_pci.c and sba_iommu.c, it says "BE WARNED: register writes are
> posted" and need to be followed by a read.  It seems there are some
> routines in these modules that have writes that aren't followed by a read.
> One is lba_wr_cfg(). Another might be the macro
> LBA_CFG_RESTORE().  Are these okay?

I looked through the two examples you point out and I *think* both are
ok. lba_wr_cfg() issues an MMIO write and immediately afterwards calls
LBA_CFG_MASTER_ABORT_CHECK(), which performs an MMIO read from the same
base address.

The LBA_CFG_RESTORE() is "lazy" - the next MMIO read will flush those
three writes and (I believe) any following MMIO writes will still be
issued in order.
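
To illustrate the "a read flushes posted writes" behavior described
above, here is a minimal user-space sketch. It is a toy model, not code
from lba_pci.c: struct toy_dev, toy_writel(), toy_readl() and
toy_device_sees() are hypothetical names invented for the example, and
real bridges implement posting in hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NREGS   4   /* registers in the toy device (assumption) */
#define TOY_POSTED  8   /* depth of the toy posted-write buffer (assumption) */

struct toy_dev {
    uint32_t regs[TOY_NREGS];          /* what the device has actually "seen" */
    struct {
        unsigned idx;
        uint32_t val;
    } posted[TOY_POSTED];              /* writes still sitting in the bridge */
    size_t n_posted;
};

/* A posted write returns immediately; the device has not seen it yet. */
static void toy_writel(struct toy_dev *d, unsigned idx, uint32_t val)
{
    assert(idx < TOY_NREGS && d->n_posted < TOY_POSTED);
    d->posted[d->n_posted].idx = idx;
    d->posted[d->n_posted].val = val;
    d->n_posted++;
}

/* A read cannot pass earlier writes: drain them in order, then read. */
static uint32_t toy_readl(struct toy_dev *d, unsigned idx)
{
    assert(idx < TOY_NREGS);
    for (size_t i = 0; i < d->n_posted; i++)
        d->regs[d->posted[i].idx] = d->posted[i].val;
    d->n_posted = 0;
    return d->regs[idx];
}

/* Device-side view of a register, ignoring anything still posted. */
static uint32_t toy_device_sees(const struct toy_dev *d, unsigned idx)
{
    assert(idx < TOY_NREGS);
    return d->regs[idx];
}
```

In this model, three toy_writel() calls followed by a single
toy_readl() of any register leave the device with all three writes
applied in issue order, which is the "lazy" flush pattern.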

Typically, the problem with posted MMIO writes is that DMA or other
events don't start until the MMIO write is "seen" by the device. This
matters when specific timing between MMIO transactions is required or
when the write triggers a side effect (e.g. device reset, frame buffer
update, etc.).

> It seems probable that the problem that Carlo is having is a conflict
> between devices.

Hrm. I don't know. I haven't yet looked at the latest dump that Carlo
helpfully provided as I'm still traveling. Why do you suspect this?

I'm skeptical about "conflict between devices" (due to lba_wr_cfg())
for two reasons:
1) configuration space accesses are usually not part of normal IO
device transaction processing.
2) I've nearly always found that PCI Master Aborts (on MMIO reads) are
just a symptom of something else going wrong and not the root
cause.

Typically, the issues I recall running into involve a driver
hitting a corner case where the device is still performing DMA to an
address that the driver has already unmapped. This will wedge the IOMMU
(sba), and subsequent MMIO reads will then generate an HPMC.
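
The failure sequence above (DMA to an already-unmapped address, then a
failing MMIO read) can be sketched as a toy state machine. This is only
an illustration under stated assumptions: struct toy_iommu and the
toy_*() helpers are hypothetical, the "wedged" flag stands in for the
real sba state, and the -1 return stands in for the HPMC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_PDIR_ENTRIES 16   /* size of the toy I/O page directory (assumption) */

struct toy_iommu {
    bool valid[TOY_PDIR_ENTRIES];  /* true while a DMA mapping is live */
    bool wedged;                   /* set once DMA hits an invalid entry */
};

/* Driver side: create and tear down a mapping for one slot. */
static void toy_map(struct toy_iommu *io, unsigned slot)
{
    assert(slot < TOY_PDIR_ENTRIES);
    io->valid[slot] = true;
}

static void toy_unmap(struct toy_iommu *io, unsigned slot)
{
    assert(slot < TOY_PDIR_ENTRIES);
    io->valid[slot] = false;
}

/* Device side: DMA through a slot. An invalid slot wedges the IOMMU. */
static bool toy_device_dma(struct toy_iommu *io, unsigned slot)
{
    assert(slot < TOY_PDIR_ENTRIES);
    if (!io->valid[slot]) {
        io->wedged = true;
        return false;
    }
    return true;
}

/* CPU side: an MMIO read fails once the IOMMU is wedged (HPMC stand-in). */
static int toy_mmio_read(const struct toy_iommu *io, uint32_t *val)
{
    if (io->wedged)
        return -1;
    *val = 0;   /* dummy register contents */
    return 0;
}
```

The point of the model is the ordering: the read that fails is not the
cause, it only reports that the device kept using a mapping after
toy_unmap(). The driver has to stop the device's DMA first.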

The hard part is to determine what the corner case is based on a DMA
address (as reported in SER PIM output). It requires deeper
understanding of the DMA programming for the given SATA controller
(driver directing HW what to do), how transaction completions are
reported (SATA controller HW) and handled (driver operation).

In the past, I've sorted out several of these issues for the tg3 and
tulip NIC drivers, and I can say with confidence that some issues still
remain in the tulip driver shutdown path. I gave up on trying to fix
those and later lost interest.

cheers,
grant
