linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: Linux Kernel Mentorship Summer 2020
       [not found] <CABGxfO5aN2kJd8YOhbOALw5j7Eq1-rArC_O10RQyFUUhM6YGqw@mail.gmail.com>
@ 2020-03-18 15:04 ` Bjorn Helgaas
       [not found]   ` <CABGxfO5Mrzx0OKpuzzc+3X2J+s1oHbxzbHpCB_FzopPp+hgw=w@mail.gmail.com>
  0 siblings, 1 reply; 3+ messages in thread
From: Bjorn Helgaas @ 2020-03-18 15:04 UTC (permalink / raw)
  To: Saheed Bolarinwa; +Cc: Bjorn Helgaas, Shuah Khan, linux-pci

[+cc linux-pci because there are lots of interesting wrinkles in this
issue -- as background, my idea for this project was to make
pcie_capability_read_word() return errors  more like
pci_read_config_word() does.  When a PCI error occurs,
pci_read_config_word() returns ~0 data, but
pcie_capability_read_word() return 0 data.]

On Mon, Mar 16, 2020 at 02:03:57PM +0100, Saheed Bolarinwa wrote:
> I have checked the function the  pcie_capability_read_word() that
> sets *val = 0  when pci_read_config_word() fails.
> 
> I am still trying to get familiar to the code, I just wondering why
> the result of pci_read_config_word()  is not being returned directly
> since  *val  is passed into it.

pci_read_config_word() needs to return two things:

  1) A success/failure indication.  This is either 0 or an error like
  PCIBIOS_DEVICE_NOT_FOUND, PCIBIOS_BAD_REGISTER_NUMBER, etc.  This is
  the function return value.

  2) A 16-bit value read from PCI config space.  This is what goes in
  *val.

pcie_capability_read_word() is similar.  The idea of this project is
to change part 2, the value in *val.

The reason is that sometimes the config read fails on PCI.  This can
happen because the device has been physically removed, it's been
electrically isolated, or some other PCI error.  When a config read
fails, the PCI host bridge generally returns ~0 data to satisfy the
CPU load instruction.

The PCI core, i.e., pci_read_config_word() can't tell whether ~0 means
(a) the config read was successful and the register on the PCI card
happened to contain ~0, or (b) the config read failed because of a PCI
error.

Only the driver knows whether ~0 is a valid value for a config space
register, so the driver has to be involved in looking for this error.

My idea for this project is to make this checking more uniform between
pci_read_config_word() and pcie_capability_read_word().

For example, pci_read_config_word() sets *val = ~0 when it returns
PCIBIOS_DEVICE_NOT_FOUND.

On the other hand, pcie_capability_read_word() sets *val = 0 when it
returns -EINVAL, PCIBIOS_DEVICE_NOT_FOUND, etc.

> This is what is done inside pci_read_config_word() when
> pci_bus_read_config_word() is called. I tried to find the definition
> of  pci_bus_read_config_word() but I couldn't find it.  I am not
> sure I need it, though.

pci_bus_read_config_word() and similar functions are defined by the
PCI_OP_READ() macro in drivers/pci/access.c.  It's sort of annoying
because grep/cscope/etc don't find those functions very well.  But
you're right; they aren't really relevant to this project.

> I am also confused with the last statement of the Project
> description on the Project List
> <https://wiki.linuxfoundation.org/lkmp/lkmp_project_list> page. It
> says, "*This will require changes to some callers of
> pci_read_config_word() that assume the read returns 0 on error.*" I
> think the callers pcie_capability_read_word() are supposedly being
> referred to here.

Yes, that's a typo.  The idea is to change pcie_capability_read_*(),
so we'd probably need to change some callers to match.

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Fwd: Linux Kernel Mentorship Summer 2020
       [not found]   ` <CABGxfO5Mrzx0OKpuzzc+3X2J+s1oHbxzbHpCB_FzopPp+hgw=w@mail.gmail.com>
@ 2020-03-19 15:57     ` Saheed Bolarinwa
  2020-03-19 20:55     ` Bjorn Helgaas
  1 sibling, 0 replies; 3+ messages in thread
From: Saheed Bolarinwa @ 2020-03-19 15:57 UTC (permalink / raw)
  To: linux-pci

---------- Forwarded message ---------
From: Saheed Bolarinwa <refactormyself@gmail.com>
Date: Thu, Mar 19, 2020 at 4:33 PM
Subject: Re: Linux Kernel Mentorship Summer 2020
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: Bjorn Helgaas <bjorn@helgaas.com>, Shuah Khan
<skhan@linuxfoundation.org>, <linux-pci@vger.kernel.org>


Hello,
Thank you for the detailed explanation.

So please just to clarify, I think it will be better to ONLY set *val
= 0 when returning -EINVAL
In this case, we can just return pci_read_config_word(), then there is
nothing to reset.

Then the remaining tasks will now be to deal with the references to
pcie_capability_read_word().

- Saheed

On Wed, Mar 18, 2020 at 4:04 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> [+cc linux-pci because there are lots of interesting wrinkles in this
> issue -- as background, my idea for this project was to make
> pcie_capability_read_word() return errors  more like
> pci_read_config_word() does.  When a PCI error occurs,
> pci_read_config_word() returns ~0 data, but
> pcie_capability_read_word() return 0 data.]
>
> On Mon, Mar 16, 2020 at 02:03:57PM +0100, Saheed Bolarinwa wrote:
> > I have checked the function the  pcie_capability_read_word() that
> > sets *val = 0  when pci_read_config_word() fails.
> >
> > I am still trying to get familiar to the code, I just wondering why
> > the result of pci_read_config_word()  is not being returned directly
> > since  *val  is passed into it.
>
> pci_read_config_word() needs to return two things:
>
>   1) A success/failure indication.  This is either 0 or an error like
>   PCIBIOS_DEVICE_NOT_FOUND, PCIBIOS_BAD_REGISTER_NUMBER, etc.  This is
>   the function return value.
>
>   2) A 16-bit value read from PCI config space.  This is what goes in
>   *val.
>
> pcie_capability_read_word() is similar.  The idea of this project is
> to change part 2, the value in *val.
>
> The reason is that sometimes the config read fails on PCI.  This can
> happen because the device has been physically removed, it's been
> electrically isolated, or some other PCI error.  When a config read
> fails, the PCI host bridge generally returns ~0 data to satisfy the
> CPU load instruction.
>
> The PCI core, i.e., pci_read_config_word() can't tell whether ~0 means
> (a) the config read was successful and the register on the PCI card
> happened to contain ~0, or (b) the config read failed because of a PCI
> error.
>
> Only the driver knows whether ~0 is a valid value for a config space
> register, so the driver has to be involved in looking for this error.
>
> My idea for this project is to make this checking more uniform between
> pci_read_config_word() and pcie_capability_read_word().
>
> For example, pci_read_config_word() sets *val = ~0 when it returns
> PCIBIOS_DEVICE_NOT_FOUND.
>
> On the other hand, pcie_capability_read_word() sets *val = 0 when it
> returns -EINVAL, PCIBIOS_DEVICE_NOT_FOUND, etc.
>
> > This is what is done inside pci_read_config_word() when
> > pci_bus_read_config_word() is called. I tried to find the definition
> > of  pci_bus_read_config_word() but I couldn't find it.  I am not
> > sure I need it, though.
>
> pci_bus_read_config_word() and similar functions are defined by the
> PCI_OP_READ() macro in drivers/pci/access.c.  It's sort of annoying
> because grep/cscope/etc don't find those functions very well.  But
> you're right; they aren't really relevant to this project.
>
> > I am also confused with the last statement of the Project
> > description on the Project List
> > <https://wiki.linuxfoundation.org/lkmp/lkmp_project_list> page. It
> > says, "*This will require changes to some callers of
> > pci_read_config_word() that assume the read returns 0 on error.*" I
> > think the callers pcie_capability_read_word() are supposedly being
> > referred to here.
>
> Yes, that's a typo.  The idea is to change pcie_capability_read_*(),
> so we'd probably need to change some callers to match.

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: Linux Kernel Mentorship Summer 2020
       [not found]   ` <CABGxfO5Mrzx0OKpuzzc+3X2J+s1oHbxzbHpCB_FzopPp+hgw=w@mail.gmail.com>
  2020-03-19 15:57     ` Fwd: " Saheed Bolarinwa
@ 2020-03-19 20:55     ` Bjorn Helgaas
  1 sibling, 0 replies; 3+ messages in thread
From: Bjorn Helgaas @ 2020-03-19 20:55 UTC (permalink / raw)
  To: Saheed Bolarinwa; +Cc: Bjorn Helgaas, Shuah Khan, linux-pci

1) Your first couple emails were multi-part MIME messages, which do
not make it to the mailing lists reliably [1].  It looks like maybe
they were sent via Gmail, which I'm sorry to say is not very convenient
in this respect.  There should be a "plain text" option in the three
dots menu next to the trash can in the draft window, which helps a
little bit.

2) Please use "interleaved" style [2] when responding, so the
conversation is easier to follow.  Gmail again makes this more of a
hassle than it should be.

3) The "**val = 0*" below is confusing; I think the surrounding
asterisks were unhelpfully added by Gmail to indicate italicization.
If you just use plain text that should help.

On Thu, Mar 19, 2020 at 04:33:34PM +0100, Saheed Bolarinwa wrote:
> So please just to clarify, I think it will be better to ONLY set **val
> = 0* when
> returning -EINVAL
> In this case, we can just *return pci_read_config_word(),* then there is
> nothing to reset.

4) I'm not sure we should be returning -EINVAL from
pcie_capability_read_*().  Those functions can also return errors from
pci_read_config_*(), which are generally PCIBIOS_SUCCESSFUL (0) or a
PCIBIOS_* error (a positive integer).  It seems sort of weird to
return either a negative errno or a positive PCIBIOS* number on error.

Other similar cases, e.g., bonito64_pcibios_read(), return
PCIBIOS_BAD_REGISTER_NUMBER if the address isn't aligned correctly.

I suggest you do some research and see whether anything would break if
we made pcie_capability_read_*() return PCIBIOS_BAD_REGISTER_NUMBER in
that case.  If not, that could be its own preliminary patch.

5) Then there are two other questions.  The first is whether we should
set *val = 0 before returning -EINVAL/PCIBIOS_BAD_REGISTER_NUMBER.
This situation is a programming error (the caller should know to use
the correct alignment).  I looked at other config accessors that
return PCIBIOS_BAD_REGISTER_NUMBER; some of them set *val = ~0
(ixp4xx_pci_read_config(), msp_pcibios_read_config_word()), but most
do not (bonito64_pcibios_read(), loongson_pcibios_read(),
msc_pcibios_read(), nile4_pcibios_read(), etc).

Again, this will require some research to make sure we don't break
anything, but I suspect we may be able to leave *val untouched when
returning error due to unaligned address.  This should be a separate
patch.

6) The second question is whether pcie_capability_read_*() should set
*val = 0 before passing on an error from pci_read_config_word().  This
is the case I originally envisioned for this project, and my thought
was that it should not set *val = 0, and it should just leave *val as
whatever pci_read_config_word() set it to.

My reasoning is that callers of pci_read_config_word() really need to
check for "val == ~0" if they care about detecting errors, and it
probably should be the same for pcie_capability_read_*().

This also requires research to see how it would affect callers.  Many
callers check for success and never look at *val if
pcie_capability_read_*() fails; those are easy.  Some callers consume
the data (*val) without checking for success -- those are the
important ones to look at.

I think the first step here is to *identify* those callers and analyze
whether they need to check for failure and how they should do it.

[1] http://vger.kernel.org/majordomo-info.html
[2] https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

> On Wed, Mar 18, 2020 at 4:04 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
> 
> > [+cc linux-pci because there are lots of interesting wrinkles in this
> > issue -- as background, my idea for this project was to make
> > pcie_capability_read_word() return errors  more like
> > pci_read_config_word() does.  When a PCI error occurs,
> > pci_read_config_word() returns ~0 data, but
> > pcie_capability_read_word() return 0 data.]
> >
> > On Mon, Mar 16, 2020 at 02:03:57PM +0100, Saheed Bolarinwa wrote:
> > > I have checked the function the  pcie_capability_read_word() that
> > > sets *val = 0  when pci_read_config_word() fails.
> > >
> > > I am still trying to get familiar to the code, I just wondering why
> > > the result of pci_read_config_word()  is not being returned directly
> > > since  *val  is passed into it.
> >
> > pci_read_config_word() needs to return two things:
> >
> >   1) A success/failure indication.  This is either 0 or an error like
> >   PCIBIOS_DEVICE_NOT_FOUND, PCIBIOS_BAD_REGISTER_NUMBER, etc.  This is
> >   the function return value.
> >
> >   2) A 16-bit value read from PCI config space.  This is what goes in
> >   *val.
> >
> > pcie_capability_read_word() is similar.  The idea of this project is
> > to change part 2, the value in *val.
> >
> > The reason is that sometimes the config read fails on PCI.  This can
> > happen because the device has been physically removed, it's been
> > electrically isolated, or some other PCI error.  When a config read
> > fails, the PCI host bridge generally returns ~0 data to satisfy the
> > CPU load instruction.
> >
> > The PCI core, i.e., pci_read_config_word() can't tell whether ~0 means
> > (a) the config read was successful and the register on the PCI card
> > happened to contain ~0, or (b) the config read failed because of a PCI
> > error.
> >
> > Only the driver knows whether ~0 is a valid value for a config space
> > register, so the driver has to be involved in looking for this error.
> >
> > My idea for this project is to make this checking more uniform between
> > pci_read_config_word() and pcie_capability_read_word().
> >
> > For example, pci_read_config_word() sets *val = ~0 when it returns
> > PCIBIOS_DEVICE_NOT_FOUND.
> >
> > On the other hand, pcie_capability_read_word() sets *val = 0 when it
> > returns -EINVAL, PCIBIOS_DEVICE_NOT_FOUND, etc.
> >
> > > This is what is done inside pci_read_config_word() when
> > > pci_bus_read_config_word() is called. I tried to find the definition
> > > of  pci_bus_read_config_word() but I couldn't find it.  I am not
> > > sure I need it, though.
> >
> > pci_bus_read_config_word() and similar functions are defined by the
> > PCI_OP_READ() macro in drivers/pci/access.c.  It's sort of annoying
> > because grep/cscope/etc don't find those functions very well.  But
> > you're right; they aren't really relevant to this project.
> >
> > > I am also confused with the last statement of the Project
> > > description on the Project List
> > > <https://wiki.linuxfoundation.org/lkmp/lkmp_project_list> page. It
> > > says, "*This will require changes to some callers of
> > > pci_read_config_word() that assume the read returns 0 on error.*" I
> > > think the callers pcie_capability_read_word() are supposedly being
> > > referred to here.
> >
> > Yes, that's a typo.  The idea is to change pcie_capability_read_*(),
> > so we'd probably need to change some callers to match.
> >

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2020-03-19 20:55 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <CABGxfO5aN2kJd8YOhbOALw5j7Eq1-rArC_O10RQyFUUhM6YGqw@mail.gmail.com>
2020-03-18 15:04 ` Linux Kernel Mentorship Summer 2020 Bjorn Helgaas
     [not found]   ` <CABGxfO5Mrzx0OKpuzzc+3X2J+s1oHbxzbHpCB_FzopPp+hgw=w@mail.gmail.com>
2020-03-19 15:57     ` Fwd: " Saheed Bolarinwa
2020-03-19 20:55     ` Bjorn Helgaas

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).