linux-arch.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: Linux 5.15-rc2
       [not found] ` <20210920134424.GA346531@roeck-us.net>
@ 2021-09-20 16:18   ` Linus Torvalds
  2021-09-20 17:04     ` Linus Torvalds
  0 siblings, 1 reply; 5+ messages in thread
From: Linus Torvalds @ 2021-09-20 16:18 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Linux Kernel Mailing List, linux-sparc, David S. Miller, linux-arch

On Mon, Sep 20, 2021 at 6:44 AM Guenter Roeck <linux@roeck-us.net> wrote:
>
> Details for build failures below. Several improvements since last week,
> but it looks like the alpha related pci_iounmap patches still need some
> tweaking (see error log at the very end).

Bah.

I thought I had tested sparc64, but I was wrong.

Silly me, I had only tested the 32-bit case.

That sparc64 thing is being particularly stupid: sparc64 uses
GENERIC_PCI_IOMAP, but declares its own empty pci_iounmap() because it
didn't use GENERIC_IOMAP that does this all right.

And because it didn't use the generic iomap code, it's all actually
entirely buggy, in that it seems to think that pci_iounmap() is about
unmapping ports (like ioport_unmap) and thus a no-op. But no,
pci_iounmap() is supposed to unmap anything that pci_iomap() mapped,
which includes the actual MMIO range too.

Basically, the whole idea of "pci_iomap()" is that you give it a PCI
device and the index to the BAR in that device, and it maps that BAR -
whether it is MMIO or PIO. And then you can use that __iomem pointer
for ioread*() and friends (or you can use readl()/writel() if you know
it was MMIO).

You can give it a maximum length if you want, but by default it just
maps the whole PCI BAR, so the default usage would just be

    void __iomem *map = pci_iomap(pdev, bar, 0);

And then you do whatever IO using that 'map' base pointer, and once
you're done you do "pci_iounmap()" on it all.

And then the trick most cases use is that they know that the PIO case
is just always a fixed map, so for PIO that "map/unmap" part os a
no-op. But generally ONLY for the PIO case.

And the sparc64 code seems to think it's only used for PIO, and makes
pci_iounmap() a no-op in general. Which is all kinds of completely
broken.

This is the same bug that the broken inline function in
<asm-generic/io.h> had, and that I added a big comment about in commit
316e8d79a095 ("pci_iounmap'2: Electric Boogaloo: try to make sense of
it all"):

+ * This code is odd, and the ARCH_HAS/ARCH_WANTS #define logic comes
+ * from legacy <asm-generic/io.h> header file behavior. In particular,
+ * it would seem to make sense to do the iounmap(p) for the non-IO-space
+ * case here regardless, but that's not what the old header file code
+ * did. Probably incorrectly, but this is meant to be bug-for-bug
+ * compatible.

but I intentionally didn't fix the bug in that commit, because I
wanted to just try to keep the odd old logic as closely as possible.

It looks like a big part of the "people do their own pci_iounmap()"
thing is that they do it badly and with bugs.

This was all meant to uncover and fix warnings, but it seems to be
uncovering bigger issues.

Of course, most of the time the "pci_iounmap()" only happens at driver
unload time, so it's basically only a kernel virtual memory mapping
leak, which may be why people didn't realize how buggy their own
implementations were.

What the normal GENERIC_IOMAP code does is:

 - it "fake maps" the PIO space at an invalid fixed virtual address

   Since we know that a PIO address on PCI is just a 16-bit number,
this fake virtual window is small and easy to do:

        /*
         * We encode the physical PIO addresses (0-0xffff) into the
         * pointer by offsetting them with a constant (0x10000) and
         * assuming that all the low addresses are always PIO. That means
         * we can do some sanity checks on the low bits, and don't
         * need to just take things for granted.
         */
        #define PIO_OFFSET      0x10000UL
        #define PIO_MASK        0x0ffffUL
        #define PIO_RESERVED    0x40000UL

   so the logic is basically that we can trivially test whether a
"void __iomem *" pointer is a PIO pointer or not: if the pointer value
is in that range of PIO_OFFSET..PIO_OFFSET+PIO_MASK range, it's PIO,
otherwise it's mmio.

 - the MMIO space acts using all the normal ioremap() logic, and we
can tell the end result apart from PIO with the above trivial thing.

 - the GENERIC_IOMAP code internally just has a IO_COND(adds, is_pio,
is_mmio) helper macro, which sets "port" for the is_pio case, and
"addr" for the is_mmio case, so you can do trivial things like this:

        unsigned int ioread8(const void __iomem *addr)
        {
                IO_COND(addr, return inb(port), return readb(addr));
                return 0xff;
        }

   which does the "inb(port)" for the PIO case, and the "readb(addr)"
for the MMIO case.

 - and lookie here what the GENERIC_IOMAP code for pci_iounmap() is:

        void pci_iounmap(struct pci_dev *dev, void __iomem * addr)
        {
                IO_COND(addr, /* nothing */, iounmap(addr));
        }

  IOW, for the "is_pio" case it does nothing, and for the "is_mmio"
case it does "iounmap()".

So the GENERIC_IOMAP code is actually really simple and should just
work for pretty much everybody. All it requires is that fake kernel
virtual address range at PIO_OFFSET (you can override the default
values if you want - maybe your architecture really wants to put MMIO
in those virtual addresses, but I don't think there's a lot of reason
to generally want to do it)

But despite that, people think they should implement their own code,
and then they clearly get it HORRIBLY WRONG.

Anyway, this email ended up being a long explanation of what the code
_should_ do, in the hope that some enterprising kernel developer
decides "Oh, this sounds like an easy thing to fix". But you do need
to be able to test the end result at least a tiny bit.

Because I suspect that the real fix for sparc64 is to just get rid of
its broken non-GENERIC_IOMAP code, and just do "select GENERIC_IOMAP"

And I don't think sparc64 is the only architecture that should go "Oh,
I should just use GENERIC_IOMAP instead of implementing it badly by
hand".

Anyone?

               Linus

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Linux 5.15-rc2
  2021-09-20 16:18   ` Linux 5.15-rc2 Linus Torvalds
@ 2021-09-20 17:04     ` Linus Torvalds
  2021-09-20 18:03       ` Linus Torvalds
  2021-09-20 19:14       ` John Paul Adrian Glaubitz
  0 siblings, 2 replies; 5+ messages in thread
From: Linus Torvalds @ 2021-09-20 17:04 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Linux Kernel Mailing List, linux-sparc, David S. Miller, linux-arch

On Mon, Sep 20, 2021 at 9:18 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Anyway, this email ended up being a long explanation of what the code
> _should_ do, in the hope that some enterprising kernel developer
> decides "Oh, this sounds like an easy thing to fix". But you do need
> to be able to test the end result at least a tiny bit.

In the meantime, the build fix is trivial: make that broken sparc
pci_iounmap() definition depend on CONFIG_PCI being set.

But let me build a few more sparc configs (and this time do it
properly for both 32-bit and 64-bit) before I actually commit it and
push it out.

              Linus

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Linux 5.15-rc2
  2021-09-20 17:04     ` Linus Torvalds
@ 2021-09-20 18:03       ` Linus Torvalds
  2021-09-20 19:14       ` John Paul Adrian Glaubitz
  1 sibling, 0 replies; 5+ messages in thread
From: Linus Torvalds @ 2021-09-20 18:03 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Linux Kernel Mailing List, linux-sparc, David S. Miller, linux-arch

On Mon, Sep 20, 2021 at 10:04 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> But let me build a few more sparc configs (and this time do it
> properly for both 32-bit and 64-bit) before I actually commit it and
> push it out.

Actually, I think I _had_ tested sparc64 properly before, it's just
that I had only done the bigger configurations that had CONFIG_PCI
enabled and that didn't show the failure due to that.

But I have now done the allnoconfig and tinyconfig builds that showed
the problem, and verified that the trivial "depend on CONFIG_PCI"
fixes things for me.

Looks like the only remaining problem in your build configuration list
that isn't queued up somewhere is that odd nouveau use of pwrsrc and
the errno range thing. I'll apply it directly.

I'm sure that other configs will have a lot of other cases, but it is
good to see at least your set shrinking.

            Linus

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Linux 5.15-rc2
  2021-09-20 17:04     ` Linus Torvalds
  2021-09-20 18:03       ` Linus Torvalds
@ 2021-09-20 19:14       ` John Paul Adrian Glaubitz
  2021-09-20 20:11         ` Linus Torvalds
  1 sibling, 1 reply; 5+ messages in thread
From: John Paul Adrian Glaubitz @ 2021-09-20 19:14 UTC (permalink / raw)
  To: Linus Torvalds, Guenter Roeck
  Cc: Linux Kernel Mailing List, linux-sparc, David S. Miller, linux-arch

Hi Linus!

On 9/20/21 19:04, Linus Torvalds wrote:
> On Mon, Sep 20, 2021 at 9:18 AM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>>
>> Anyway, this email ended up being a long explanation of what the code
>> _should_ do, in the hope that some enterprising kernel developer
>> decides "Oh, this sounds like an easy thing to fix". But you do need
>> to be able to test the end result at least a tiny bit.
> 
> In the meantime, the build fix is trivial: make that broken sparc
> pci_iounmap() definition depend on CONFIG_PCI being set.
> 
> But let me build a few more sparc configs (and this time do it
> properly for both 32-bit and 64-bit) before I actually commit it and
> push it out.

If you want to get feedback whether the kernel actually boots, let me know.

I could test boot on a SPARC T5 LDOM (SPARC VM logical domain).

Adrian

-- 
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer - glaubitz@debian.org
`. `'   Freie Universitaet Berlin - glaubitz@physik.fu-berlin.de
  `-    GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Linux 5.15-rc2
  2021-09-20 19:14       ` John Paul Adrian Glaubitz
@ 2021-09-20 20:11         ` Linus Torvalds
  0 siblings, 0 replies; 5+ messages in thread
From: Linus Torvalds @ 2021-09-20 20:11 UTC (permalink / raw)
  To: John Paul Adrian Glaubitz
  Cc: Guenter Roeck, Linux Kernel Mailing List, linux-sparc,
	David S. Miller, linux-arch

On Mon, Sep 20, 2021 at 12:14 PM John Paul Adrian Glaubitz
<glaubitz@physik.fu-berlin.de> wrote:
>
> If you want to get feedback whether the kernel actually boots, let me know.

So having looked around more sparc64 actually looks to be ok as-is,
because it doesn't do any ioremap at all, and the PIO accesses are
done at physical address zero.

Sparc uses a special IO memory address space and can basically map all
of PCI that way, and it looks like the hardware does all the required
special things for the PIO range at address 0-0xffff.

So it turns out that the "missing iounmap()" is actually ok on sparc,
because it's a no-op anyway - because the ioremap() was just a pointer
cast with no actual remapping necessary.

And the generic IOMAP thing does assume that PIO is special, in ways
that sparc doesn't need. On x86, PIO is not remapped, but also uses
different instructions, so it's not just pointer games that could be
done at iomap/unmap case.

(And on many other architectures you need to do different
synchronization, even if you could perhaps otherwise make the
PIO-vs-MMIO be only about the pointer mapping - so "writeb()" and
"outb()" aren't just different in the addressing).

End result: the only downside of sparc not using the generic iomap is
likely that sparc will happily use a NULL __iomap pointer (error) and
basically use it as a PIO access. But since other architectures like
x86-64 would warn for that case (see 'bad_io_access()' in
lib/iomap.c), even that isn't actually a big deal - any such bugs
would have been found elsewhere.

And having looked at this, I'm starting to suspect that sparc oddity
is _why_ the fallback version in <asm-generic/io.h> was so broken. It
did the right thing on sparc, but leaks iomap remappings almost
anywhere else. But maybe sparc ended up being the only user of it?

           Linus

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2021-09-21  3:19 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <CAHk-=wirexiZR+VO=H3xemGKOMkh8OasmXaKXTKUmAKYCzi8AQ@mail.gmail.com>
     [not found] ` <20210920134424.GA346531@roeck-us.net>
2021-09-20 16:18   ` Linux 5.15-rc2 Linus Torvalds
2021-09-20 17:04     ` Linus Torvalds
2021-09-20 18:03       ` Linus Torvalds
2021-09-20 19:14       ` John Paul Adrian Glaubitz
2021-09-20 20:11         ` Linus Torvalds

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).