xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Jan Beulich <JBeulich@suse.com>,
	xen-devel <xen-devel@lists.xenproject.org>
Cc: Brian Woods <brian.woods@amd.com>,
	Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Subject: Re: [Xen-devel] [PATCH 4/9] AMD/IOMMU: introduce 128-bit IRTE non-guest-APIC IRTE format
Date: Tue, 18 Jun 2019 12:57:54 +0100	[thread overview]
Message-ID: <56de6c45-ab04-a919-9caf-140a6faab338@citrix.com> (raw)
In-Reply-To: <5D024E6F0200007800237E25@prv1-mh.provo.novell.com>

On 13/06/2019 14:23, Jan Beulich wrote:
> This is in preparation of actually enabling x2APIC mode, which requires
> this wider IRTE format to be used.
>
> A specific remark regarding the first hunk changing
> amd_iommu_ioapic_update_ire(): This bypass was introduced for XSA-36,
> i.e. by 94d4a1119d ("AMD,IOMMU: Clean up old entries in remapping
> tables when creating new one"). Other code introduced by that change has
> meanwhile disappeared or further changed, and I wonder if - rather than
> adding an x2apic_enabled check to the conditional - the bypass couldn't
> be deleted altogether. For now the goal is to affect the non-x2APIC
> paths as little as possible.
>
> Take the liberty and use the new "fresh" flag to suppress an unneeded
> flush in update_intremap_entry_from_ioapic().

What is the meaning of fresh?  Wouldn't "needs_update" be a more
descriptive name?

>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> Note that AMD's doc says Lowest Priority ("Arbitrated" by their naming)
> mode is unavailable in x2APIC mode, but they've confirmed this to be a
> mistake on their part.
>
> --- a/xen/drivers/passthrough/amd/iommu_intr.c
> +++ b/xen/drivers/passthrough/amd/iommu_intr.c
> @@ -35,12 +35,34 @@ struct irte_basic {
>      unsigned int :8;
>  };
>  
> +struct irte_full {
> +    unsigned int remap_en:1;
> +    unsigned int sup_io_pf:1;
> +    unsigned int int_type:3;
> +    unsigned int rq_eoi:1;
> +    unsigned int dm:1;
> +    unsigned int guest_mode:1; /* MBZ */

/* MBZ - not implemented yet. */

Seeing as interrupt posting will be a minor tweak to this data
structure, rather than implementing a new one.

> +    unsigned int dest_lo:24;
> +    unsigned int :32;
> +    unsigned int vector:8;
> +    unsigned int :24;
> +    unsigned int :24;
> +    unsigned int dest_hi:8;

The manual says that we should prefer aligned 64bit access, so some
raw_{lo,hi} fields here will allow...

> @@ -136,7 +170,21 @@ static void free_intremap_entry(unsigned
>  {
>      union irte_ptr entry = get_intremap_entry(seg, bdf, offset);
>  
> -    *entry.basic = (struct irte_basic){};
> +    switch ( irte_mode )
> +    {
> +    case irte_basic:
> +        *entry.basic = (struct irte_basic){};
> +        break;
> +
> +    case irte_full:
> +        entry.full->remap_en = 0;
> +        wmb();

... this to become

entry._128->raw_lo = 0;
smp_wmb();
entry._128->raw_hi = 0;

The interrupt mapping table is allocated in WB memory and accessed
coherently, so an sfence instruction isn't necessary.  All that matters
is that remap_en gets cleared first.

> +        *entry.full = (struct irte_full){};
> +        break;
> +
> +    default:
> +        ASSERT_UNREACHABLE();
> +    }
>  
>      __clear_bit(offset, get_ivrs_mappings(seg)[bdf].intremap_inuse);
>  }
> @@ -154,8 +202,38 @@ static void update_intremap_entry(union
>          .dest = dest,
>          .vector = vector,
>      };
> +    struct irte_full full = {
> +        .remap_en = 1,
> +        .sup_io_pf = 0,
> +        .int_type = int_type,
> +        .rq_eoi = 0,
> +        .dm = dest_mode,
> +        .dest_lo = dest,
> +        .dest_hi = dest >> 24,
> +        .vector = vector,
> +    };

Looking at the resulting code after this patch, I think these structures
should move into their respective case blocks, to help the compiler to
avoid initialising both.

> +
> +    switch ( irte_mode )
> +    {
> +        __uint128_t ret;
> +        union {
> +            __uint128_t raw;
> +            struct irte_full full;
> +        } old;
> +
> +    case irte_basic:
> +        *entry.basic = basic;
> +        break;
> +
> +    case irte_full:
> +        old.full = *entry.full;
> +        ret = cmpxchg16b(entry.full, &old, &full);
> +        ASSERT(ret == old.raw);

Similarly, this can be implemented with

entry.full->remap_en = 0;
smp_wmb();
entry._128->raw_hi = full.raw_hi;
smp_wmb();
entry._128->raw_lo = full.raw_lo;

which avoids using a locked operation.

> +        break;
>  
> -    *entry.basic = basic;
> +    default:
> +        ASSERT_UNREACHABLE();
> +    }
>  }
>  
>  static inline int get_rte_index(const struct IO_APIC_route_entry *rte)
> @@ -169,6 +247,11 @@ static inline void set_rte_index(struct
>      rte->delivery_mode = offset >> 8;
>  }
>  
> +static inline unsigned int get_full_dest(const struct irte_full *entry)
> +{
> +    return entry->dest_lo | (entry->dest_hi << 24);

Given your observation on my still-not-upstream-yet patch about GCC not
honouring the type of bitfields, doesn't dest_hi need explicitly casting
to unsigned int before the shift, to avoid it being performed as int?

> @@ -280,6 +392,18 @@ int __init amd_iommu_setup_ioapic_remapp
>              dest_mode = rte.dest_mode;
>              dest = rte.dest.logical.logical_dest;
>  
> +            if ( iommu->ctrl.xt_en )
> +            {
> +                /*
> +                 * In x2APIC mode we have no way of discovering the high 24
> +                 * bits of the destination of an already enabled interrupt.

Yes we do.  We read the interrupt remapping table, which is where that
information lives.

Any firmware driver which is following the spec won't have let an IRTE
get cached, then played with the table without appropriate flushing.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

  reply	other threads:[~2019-06-18 11:58 UTC|newest]

Thread overview: 56+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-13 13:14 [Xen-devel] [PATCH 0/9] x86: AMD x2APIC support Jan Beulich
2019-06-13 13:22 ` [Xen-devel] [PATCH 1/9] AMD/IOMMU: use bit field for extended feature register Jan Beulich
2019-06-17 19:07   ` Woods, Brian
2019-06-18  9:37     ` Jan Beulich
2019-06-17 20:23   ` Andrew Cooper
2019-06-18  9:33     ` Jan Beulich
2019-06-13 13:22 ` [Xen-devel] [PATCH 2/9] AMD/IOMMU: use bit field for control register Jan Beulich
2019-06-18  9:54   ` Andrew Cooper
2019-06-18 10:45     ` Jan Beulich
2019-06-13 13:23 ` [Xen-devel] [PATCH 3/9] AMD/IOMMU: use bit field for IRTE Jan Beulich
2019-06-18 10:37   ` Andrew Cooper
2019-06-18 11:53     ` Jan Beulich
2019-06-18 12:16       ` Andrew Cooper
2019-06-18 12:55         ` Jan Beulich
2019-06-18 11:31   ` Andrew Cooper
2019-06-18 11:47     ` Jan Beulich
2019-06-13 13:23 ` [Xen-devel] [PATCH 4/9] AMD/IOMMU: introduce 128-bit IRTE non-guest-APIC IRTE format Jan Beulich
2019-06-18 11:57   ` Andrew Cooper [this message]
2019-06-18 15:31     ` Jan Beulich
2019-06-13 13:24 ` [Xen-devel] [PATCH 5/9] AMD/IOMMU: split amd_iommu_init_one() Jan Beulich
2019-06-18 12:17   ` Andrew Cooper
2019-06-13 13:25 ` [Xen-devel] [PATCH 6/9] AMD/IOMMU: allow enabling with IRQ not yet set up Jan Beulich
2019-06-18 12:22   ` Andrew Cooper
2019-06-13 13:26 ` [Xen-devel] [PATCH 7/9] AMD/IOMMU: adjust setup of internal interrupt for x2APIC mode Jan Beulich
2019-06-18 12:35   ` Andrew Cooper
2019-06-13 13:27 ` [Xen-devel] [PATCH 8/9] AMD/IOMMU: enable x2APIC mode when available Jan Beulich
2019-06-18 13:40   ` Andrew Cooper
2019-06-18 14:02     ` Jan Beulich
2019-06-13 13:28 ` [Xen-devel] [PATCH RFC 9/9] AMD/IOMMU: correct IRTE updating Jan Beulich
2019-06-18 13:28   ` Andrew Cooper
2019-06-18 14:58     ` Jan Beulich
2019-06-27 15:15 ` [Xen-devel] [PATCH v2 00/10] x86: AMD x2APIC support Jan Beulich
2019-06-27 15:19   ` [Xen-devel] [PATCH v2 01/10] AMD/IOMMU: restrict feature logging Jan Beulich
2019-07-01 15:37     ` Andrew Cooper
2019-07-01 15:59     ` Woods, Brian
2019-06-27 15:19   ` [Xen-devel] [PATCH v2 02/10] AMD/IOMMU: use bit field for extended feature register Jan Beulich
2019-07-02 12:09     ` Andrew Cooper
2019-07-02 13:48       ` Jan Beulich
2019-07-16 16:02       ` Jan Beulich
2019-06-27 15:20   ` [Xen-devel] [PATCH v2 03/10] AMD/IOMMU: use bit field for control register Jan Beulich
2019-07-02 12:20     ` Andrew Cooper
2019-06-27 15:20   ` [Xen-devel] [PATCH v2 04/10] AMD/IOMMU: use bit field for IRTE Jan Beulich
2019-07-02 12:33     ` Andrew Cooper
2019-07-02 13:56       ` Jan Beulich
2019-06-27 15:21   ` [Xen-devel] [PATCH v2 05/10] AMD/IOMMU: introduce 128-bit IRTE non-guest-APIC IRTE format Jan Beulich
2019-07-02 14:41     ` Andrew Cooper
2019-07-03  8:46       ` Jan Beulich
2019-07-16  6:39       ` Jan Beulich
2019-06-27 15:21   ` [Xen-devel] [PATCH v2 06/10] AMD/IOMMU: split amd_iommu_init_one() Jan Beulich
2019-06-27 15:22   ` [Xen-devel] [PATCH v2 07/10] AMD/IOMMU: allow enabling with IRQ not yet set up Jan Beulich
2019-06-27 15:22   ` [Xen-devel] [PATCH v2 08/10] AMD/IOMMU: adjust setup of internal interrupt for x2APIC mode Jan Beulich
2019-06-27 15:23   ` [Xen-devel] [PATCH v2 09/10] AMD/IOMMU: enable x2APIC mode when available Jan Beulich
2019-07-02 14:50     ` Andrew Cooper
2019-06-27 15:23   ` [Xen-devel] [PATCH RFC v2 10/10] AMD/IOMMU: correct IRTE updating Jan Beulich
2019-07-02 15:08     ` Andrew Cooper
2019-07-03  8:55       ` Jan Beulich

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=56de6c45-ab04-a919-9caf-140a6faab338@citrix.com \
    --to=andrew.cooper3@citrix.com \
    --cc=JBeulich@suse.com \
    --cc=brian.woods@amd.com \
    --cc=suravee.suthikulpanit@amd.com \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).