From: Jeremy Fitzhardinge <jeremy@goop.org>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Ingo Molnar <mingo@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	"H. Peter Anvin" <hpa@zytor.com>,
	the arch/x86 maintainers <x86@kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Xen-devel <xen-devel@lists.xensource.com>,
	Keir Fraser <keir.fraser@eu.citrix.com>
Subject: Re: [PATCH RFC] x86/acpi: don't ignore I/O APICs just because there's no local APIC
Date: Wed, 17 Jun 2009 10:32:06 -0700
Message-ID: <4A392896.9090408@goop.org>
In-Reply-To: <m1vdmvxe3u.fsf@fess.ebiederm.org>

On 06/17/09 05:02, Eric W. Biederman wrote:
> Trying to understand what is going on I just read through Xen 3.4 and the
> accompanying 2.6.18 kernel source.
>    

Thanks very much for spending time on this.  I really appreciate it.

> Xen has a horrible api with respect to io_apics.  They aren't even real
> io_apics when Xen is done ``abstracting'' them.
>
> Xen gives us the vector to write.  But we get to assign that
> vector arbitrarily to an ioapic and vector.
>
> We are required to use a hypercall when performing the write.
> Xen overrides the delivery_mode and destination, and occasionally
> the mask bit.
>    

Yes, it's a bit mad.  All those writes are really conveying is the 
vector, and Xen gave that to us in the first place.

> We still have to handle polarity and the trigger mode.  Despite
> the fact that Xen has acpi and mp tables parsers of its own.
>
> I expect it would have been easier and simpler all around if there
> was just a map_gsi event channel hypercall.  But Xen has an abi
> and an existing set of calls so could aren't worth worrying about
> much.
>    

Actually I was discussing this with Keir yesterday.  We're definitely 
open to changing the dom0 API to make things simpler on the Linux side.  
(The dom0 ABI is more fluid than the domU one, and these changes would 
be backwards-compatible anyway.)

One of the options we discussed was changing the API to get rid of the 
exposed vector, and just replace it with an operation to directly bind a 
gsi to a pirq (internal Xen physical interrupt handle, if you will), so 
that Xen ends up doing all the I/O APIC programming internally, as well 
as the local APIC.
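
To make the shape of that operation concrete, here is a rough userspace
sketch of what a direct gsi -> pirq bind might look like from the
caller's side.  Nothing here exists in Xen or Linux: every name
(mock_bind_gsi_to_pirq, the table, the enums) is made up for
illustration, and the real interface may well differ.  The point is
only that the caller hands over a gsi plus trigger/polarity and gets
back an opaque handle, with no vector in sight.

```c
/* Hypothetical model of the proposed bind operation.  None of these
 * names exist in Xen or Linux; they are illustrative assumptions. */
enum mock_trigger  { MOCK_EDGE, MOCK_LEVEL };
enum mock_polarity { MOCK_ACTIVE_HIGH, MOCK_ACTIVE_LOW };

#define MOCK_MAX_PIRQS 16

struct mock_binding {
    int in_use;
    unsigned int gsi;
    enum mock_trigger trigger;
    enum mock_polarity polarity;
};

/* Stands in for hypervisor-private state.  The guest never sees a
 * vector, only the index into this table (the "pirq"). */
static struct mock_binding mock_pirq_table[MOCK_MAX_PIRQS];

/* Bind a gsi to a pirq handle.
 * Returns the pirq (table index), or -1 if the table is full.
 * Binding a gsi that is already bound returns the existing handle. */
static int mock_bind_gsi_to_pirq(unsigned int gsi,
                                 enum mock_trigger trigger,
                                 enum mock_polarity polarity)
{
    int free_slot = -1;

    for (int i = 0; i < MOCK_MAX_PIRQS; i++) {
        if (mock_pirq_table[i].in_use) {
            if (mock_pirq_table[i].gsi == gsi)
                return i;               /* already bound */
        } else if (free_slot < 0) {
            free_slot = i;              /* remember first free slot */
        }
    }

    if (free_slot < 0)
        return -1;                      /* no pirqs left */

    mock_pirq_table[free_slot] =
        (struct mock_binding){ 1, gsi, trigger, polarity };
    return free_slot;
}
```

In this model the I/O APIC and local APIC programming would all happen
behind mock_bind_gsi_to_pirq(); the caller only keeps the returned
handle.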

On the Linux side, I think it means we can just point 
pcibios_enable/disable_irq to our own xen_pci_irq_enable/disable 
functions to create the binding between a PCI device and an irq.

I haven't prototyped this yet, or even looked into it very closely, but 
it seems like a promising approach to avoid almost all interaction with 
the apic layer of the kernel.  xen_pci_irq_enable() would have to make 
its own calls to acpi_pci_irq_lookup() to map pci_dev+pin -> gsi, so we 
would still need to make sure ACPI is up to that job.
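
As a very rough sketch of that flow (userspace mock, not kernel code):
the enable hook looks up the gsi for a device+pin, binds it, and stores
the result in the device.  The mock_ types, the routing table and
mock_bind_gsi() are all invented stand-ins; in particular
mock_irq_lookup() only plays the role of acpi_pci_irq_lookup() and does
not reproduce its real signature, and the identity gsi -> pirq mapping
is purely an assumption to keep the example short.

```c
#include <stddef.h>

/* All mock_* names are hypothetical stand-ins for illustration. */
struct mock_pci_dev {
    unsigned int devfn;
    unsigned int pin;   /* 1..4 = INTA..INTD, 0 = no interrupt pin */
    int irq;            /* set on enable, -1 while unbound */
};

struct mock_route { unsigned int devfn, pin, gsi; };

/* Stand-in for the routing information ACPI would provide. */
static const struct mock_route mock_routes[] = {
    { 0x08, 1, 16 },
    { 0x08, 2, 17 },
    { 0x10, 1, 18 },
};

/* Plays the role of the ACPI lookup: pci_dev+pin -> gsi, or -1. */
static int mock_irq_lookup(const struct mock_pci_dev *dev)
{
    for (size_t i = 0; i < sizeof(mock_routes) / sizeof(mock_routes[0]); i++)
        if (mock_routes[i].devfn == dev->devfn &&
            mock_routes[i].pin == dev->pin)
            return (int)mock_routes[i].gsi;
    return -1;
}

/* Plays the role of the bind operation; identity mapping is an
 * assumption made only for this sketch. */
static int mock_bind_gsi(unsigned int gsi)
{
    return (int)gsi;
}

static int mock_xen_pci_irq_enable(struct mock_pci_dev *dev)
{
    int gsi, pirq;

    if (dev->pin == 0)
        return 0;               /* device has no interrupt pin */

    gsi = mock_irq_lookup(dev);
    if (gsi < 0)
        return -1;              /* no routing entry found */

    pirq = mock_bind_gsi((unsigned int)gsi);
    if (pirq < 0)
        return -1;

    dev->irq = pirq;
    return 0;
}

static void mock_xen_pci_irq_disable(struct mock_pci_dev *dev)
{
    dev->irq = -1;
}

/* Function-pointer hooks modelled on pcibios_enable/disable_irq. */
static int (*mock_pcibios_enable_irq)(struct mock_pci_dev *dev);
static void (*mock_pcibios_disable_irq)(struct mock_pci_dev *dev);

static void mock_xen_setup_pci_irq_hooks(void)
{
    mock_pcibios_enable_irq = mock_xen_pci_irq_enable;
    mock_pcibios_disable_irq = mock_xen_pci_irq_disable;
}
```

Once mock_xen_setup_pci_irq_hooks() has run, callers go through the
hooks and never touch the apic layer; whether the real lookup can be
used this way is exactly the open question above.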

> Xen's ioapic affinity management logic looks like it only works
> on sunny days if you don't stress it too hard.

Could you be a bit more specific?  Are you referring to problems that 
you've fixed in the kernel which are still present in Xen?

> Of course the hard part of driving the hardware Xen doesn't want
> to share.
>    

Yes; it has to handle everything relating to physical CPUs, as the 
kernel only has virtual CPUs.

> It looks like the only thing Xen gains by pushing out the work of
> setting the polarity and setting edge/level triggering is our database
> of motherboards which get those things wrong.
>    

Avoiding duplication of effort is a non-trivial benefit.

> So I expect the thing to do is factor out acpi_parse_ioapic,
> mp_register_ioapic so we can share information on borked BIOSes
> between the Xen dom0 port and otherwise push Xen pseudo apic handling
> off into its strange little corner.

Yes, that's what I'll look into.

     J

