From: Sasha Levin <sashal-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
To: lantianyu1986-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org
Cc: alex.williamson-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org,
Lan Tianyu <Tianyu.Lan-0li6OtcxBFHby3iVrkZq2A@public.gmane.org>,
arnd-r2nGTMty4D4@public.gmane.org,
gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r@public.gmane.org,
nicolas.ferre-UWL1GkI3JZL3oGB3hsPCZA@public.gmane.org,
linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
michael.h.kelley-0li6OtcxBFHby3iVrkZq2A@public.gmane.org,
vkuznets-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org,
iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org,
mchehab+samsung-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org,
akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org,
kys-0li6OtcxBFHby3iVrkZq2A@public.gmane.org,
davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org
Subject: Re: [PATCH 2/3] HYPERV/IOMMU: Add Hyper-V stub IOMMU driver
Date: Fri, 1 Feb 2019 09:51:30 -0500
Message-ID: <20190201145130.GW3973@sasha-vm>
In-Reply-To: <1548929853-25877-3-git-send-email-Tianyu.Lan-0li6OtcxBFHby3iVrkZq2A@public.gmane.org>
Hi Tianyu,
A few comments below.
On Thu, Jan 31, 2019 at 06:17:32PM +0800, lantianyu1986-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org wrote:
>From: Lan Tianyu <Tianyu.Lan-0li6OtcxBFHby3iVrkZq2A@public.gmane.org>
>
>On the bare metal, enabling X2APIC mode requires interrupt remapping
>function which helps to deliver irq to cpu with 32-bit APIC ID.
>Hyper-V doesn't provide interrupt remapping function so far and Hyper-V
>MSI protocol already supports to deliver interrupt to the CPU whose
>virtual processor index is more than 255. IO-APIC interrupt still has
>8-bit APIC ID limitation.
>
>This patch is to add Hyper-V stub IOMMU driver in order to enable
>X2APIC mode successfully in Hyper-V Linux guest. The driver returns X2APIC
>interrupt remapping capability when X2APIC mode is available. Otherwise,
>it creates a Hyper-V irq domain to limit IO-APIC interrupts' affinity
>and make sure cpus assigned with IO-APIC interrupt have 8-bit APIC ID.
>
>Signed-off-by: Lan Tianyu <Tianyu.Lan-0li6OtcxBFHby3iVrkZq2A@public.gmane.org>
>---
> drivers/iommu/Kconfig | 7 ++
> drivers/iommu/Makefile | 1 +
> drivers/iommu/hyperv-iommu.c | 189 ++++++++++++++++++++++++++++++++++++++++++
> drivers/iommu/irq_remapping.c | 3 +
> drivers/iommu/irq_remapping.h | 1 +
> 5 files changed, 201 insertions(+)
> create mode 100644 drivers/iommu/hyperv-iommu.c
>
>diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
>index 45d7021..5c397c0 100644
>--- a/drivers/iommu/Kconfig
>+++ b/drivers/iommu/Kconfig
>@@ -437,4 +437,11 @@ config QCOM_IOMMU
> help
> Support for IOMMU on certain Qualcomm SoCs.
>
>+config HYPERV_IOMMU
>+ bool "Hyper-V stub IOMMU support"
>+ depends on HYPERV
Should this also select IOMMU_API?
>+ help
>+ Hyper-V stub IOMMU driver provides capability to run
>+ Linux guest with X2APIC mode enabled.
>+
> endif # IOMMU_SUPPORT
>diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
>index a158a68..8c71a15 100644
>--- a/drivers/iommu/Makefile
>+++ b/drivers/iommu/Makefile
>@@ -32,3 +32,4 @@ obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o
> obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o
> obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
> obj-$(CONFIG_QCOM_IOMMU) += qcom_iommu.o
>+obj-$(CONFIG_HYPERV_IOMMU) += hyperv-iommu.o
>diff --git a/drivers/iommu/hyperv-iommu.c b/drivers/iommu/hyperv-iommu.c
>new file mode 100644
>index 0000000..a64b747
>--- /dev/null
>+++ b/drivers/iommu/hyperv-iommu.c
>@@ -0,0 +1,189 @@
>+// SPDX-License-Identifier: GPL-2.0
>+
>+#define pr_fmt(fmt) "HYPERV-IR: " fmt
>+
>+#include <linux/types.h>
>+#include <linux/interrupt.h>
>+#include <linux/irq.h>
>+#include <linux/iommu.h>
>+#include <linux/module.h>
>+
>+#include <asm/hw_irq.h>
>+#include <asm/io_apic.h>
>+#include <asm/irq_remapping.h>
>+#include <asm/hypervisor.h>
>+
>+#include "irq_remapping.h"
>+
>+/*
>+ * According IO-APIC spec, IO APIC has a 24-entry Interrupt
>+ * Redirection Table.
Can the spec be linked somewhere? In the commit message, or here?
>+ */
>+#define IOAPIC_REMAPPING_ENTRY 24
>+
>+static cpumask_t ioapic_max_cpumask = { CPU_BITS_NONE };
>+struct irq_domain *ioapic_ir_domain;
>+
>+static int hyperv_ir_set_affinity(struct irq_data *data,
>+ const struct cpumask *mask, bool force)
>+{
>+ struct irq_data *parent = data->parent_data;
>+ struct irq_cfg *cfg = irqd_cfg(data);
>+ struct IO_APIC_route_entry *entry;
>+ cpumask_t cpumask;
>+ int ret;
>+
>+ cpumask_andnot(&cpumask, mask, &ioapic_max_cpumask);
>+
>+ /* Return error If new irq affinity is out of ioapic_max_cpumask. */
>+ if (!cpumask_empty(&cpumask))
>+ return -EINVAL;
>+
>+ ret = parent->chip->irq_set_affinity(parent, mask, force);
>+ if (ret < 0 || ret == IRQ_SET_MASK_OK_DONE)
>+ return ret;
>+
>+ entry = data->chip_data;
>+ entry->dest = cfg->dest_apicid;
>+ entry->vector = cfg->vector;
>+ send_cleanup_vector(cfg);
>+
>+ return 0;
>+}
>+
>+static struct irq_chip hyperv_ir_chip = {
>+ .name = "HYPERV-IR",
>+ .irq_ack = apic_ack_irq,
>+ .irq_set_affinity = hyperv_ir_set_affinity,
>+};
>+
>+static int hyperv_irq_remapping_alloc(struct irq_domain *domain,
>+ unsigned int virq, unsigned int nr_irqs,
>+ void *arg)
>+{
>+ struct irq_alloc_info *info = arg;
>+ struct IO_APIC_route_entry *entry;
What's the role of this variable? It is assigned once, later on in the
function, and never read after that.
>+ struct irq_data *irq_data;
>+ struct irq_desc *desc;
>+ struct irq_cfg *cfg;
>+ int ret = 0;
>+
>+ if (!info || info->type != X86_IRQ_ALLOC_TYPE_IOAPIC || nr_irqs > 1)
>+ return -EINVAL;
>+
>+ ret = irq_domain_alloc_irqs_parent(domain, virq, nr_irqs, arg);
>+ if (ret < 0)
>+ goto fail;
>+
>+ irq_data = irq_domain_get_irq_data(domain, virq);
>+ cfg = irqd_cfg(irq_data);
Is this actually being used anywhere, or do we only need it for the
check below? It's not clear from the code why we're calling irqd_cfg()
and ignoring the result.
>+ if (!irq_data || !cfg) {
You already passed irq_data to irqd_cfg() on the line above this one, so
it's a bit late to check that it's not NULL.
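To make the ordering concrete, here is a rough userspace sketch, not
kernel code. All mock_* names are made up for illustration, and the mock
lookup helper is assumed to dereference its argument unconditionally; I'm
not claiming that about the real irqd_cfg().

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct irq_cfg and struct irq_data. */
struct mock_cfg {
	unsigned int vector;
	unsigned int dest_apicid;
};

struct mock_irq_data {
	struct mock_cfg *cfg;
};

/* Stand-in for the cfg lookup; assumed to dereference its argument. */
static struct mock_cfg *mock_irqd_cfg(struct mock_irq_data *d)
{
	return d->cfg;
}

/* Validate the pointer first, and only then look up the cfg. */
static int mock_alloc_check(struct mock_irq_data *irq_data)
{
	struct mock_cfg *cfg;

	if (!irq_data)
		return -EINVAL;

	cfg = mock_irqd_cfg(irq_data);
	if (!cfg)
		return -EINVAL;

	return 0;
}
```

With the checks in this order, a NULL irq_data never reaches the lookup.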
>+ ret = -EINVAL;
>+ goto fail;
>+ }
>+
>+ irq_data->chip = &hyperv_ir_chip;
>+
>+ /*
>+ * Save IOAPIC entry pointer here in order to set vector and
>+ * and dest_apicid in the hyperv_irq_remappng_activate()
>+ * and hyperv_ir_set_affinity(). IOAPIC driver ignores
>+ * cfg->dest_apicid and cfg->vector when irq remapping
>+ * mode is enabled. Detail see ioapic_configure_entry().
>+ */
>+ irq_data->chip_data = entry = info->ioapic_entry;
>+
>+ /*
>+ * Hypver-V IO APIC irq affinity should be in the scope of
>+ * ioapic_max_cpumask because no irq remapping support.
>+ */
>+ desc = irq_data_to_desc(irq_data);
>+ cpumask_and(desc->irq_common_data.affinity,
>+ desc->irq_common_data.affinity,
>+ &ioapic_max_cpumask);
>+
>+ fail:
>+ return ret;
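The fail path doesn't actually free anything? If the irq_data/cfg check
fails after irq_domain_alloc_irqs_parent() has succeeded, the parent
allocation is never released.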
>+}
>+
>+static void hyperv_irq_remapping_free(struct irq_domain *domain,
>+ unsigned int virq, unsigned int nr_irqs)
>+{
>+ irq_domain_free_irqs_common(domain, virq, nr_irqs);
>+}
>+
>+static int hyperv_irq_remappng_activate(struct irq_domain *domain,
>+ struct irq_data *irq_data, bool reserve)
>+{
>+ struct irq_cfg *cfg = irqd_cfg(irq_data);
>+ struct IO_APIC_route_entry *entry = irq_data->chip_data;
>+
>+ entry->dest = cfg->dest_apicid;
>+ entry->vector = cfg->vector;
>+
>+ return 0;
>+}
>+
>+static struct irq_domain_ops hyperv_ir_domain_ops = {
>+ .alloc = hyperv_irq_remapping_alloc,
>+ .free = hyperv_irq_remapping_free,
>+ .activate = hyperv_irq_remappng_activate,
>+};
>+
>+static int __init hyperv_prepare_irq_remapping(void)
>+{
>+ struct fwnode_handle *fn;
>+ u32 apic_id;
>+ int i;
>+
>+ if (x86_hyper_type != X86_HYPER_MS_HYPERV ||
>+ !x2apic_supported())
>+ return -ENODEV;
>+
>+ fn = irq_domain_alloc_named_id_fwnode("HYPERV-IR", 0);
>+ if (!fn)
>+ return -EFAULT;
Why EFAULT? The only reason irq_domain_alloc_named_id_fwnode() might
fail is running out of memory, so -ENOMEM looks like a better fit.
>+
>+ ioapic_ir_domain =
>+ irq_domain_create_hierarchy(arch_get_ir_parent_domain(),
>+ 0, IOAPIC_REMAPPING_ENTRY, fn,
>+ &hyperv_ir_domain_ops, NULL);
>+
>+ irq_domain_free_fwnode(fn);
>+
>+ /*
>+ * Hyper-V doesn't provide irq remapping function for
>+ * IO-APIC and so IO-APIC only accepts 8-bit APIC ID.
>+ * Prepare max cpu affinity for IOAPIC irqs. Scan cpu 0-255
>+ * and set cpu into ioapic_max_cpumask if its APIC ID is less
>+ * than 255.
Off-by-one in the comment: the code sets the CPU in the affinity mask if
its APIC ID is less than 256 (i.e. at most 255), not less than 255.
>+ */
>+ for (i = 0; i < 256; i++) {
>+ apic_id = cpu_physical_id(i);
>+ if (apic_id > 255)
>+ continue;
>+
>+ cpumask_set_cpu(i, &ioapic_max_cpumask);
>+ }
I'm curious here: assuming we have a large number of CPUs, what
guarantee do we have that this mask will have anything set? What happens
if it remains empty?
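As a rough illustration of both points, here is a userspace sketch, not
kernel code. The mock_* names and the apic_ids[] array (standing in for
cpu_physical_id()) are hypothetical, and returning -ENODEV for an empty
mask is just one possible policy I'm assuming for the example, not
something the patch does.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_MAX_CPUS		256	/* CPUs scanned, as in the loop above */
#define MOCK_MAX_APIC_ID	255	/* largest ID that fits in 8 bits */

/*
 * Mark every scanned CPU whose APIC ID fits in 8 bits. mask[] must hold
 * at least min(ncpus, MOCK_MAX_CPUS) entries. Returns -ENODEV if no CPU
 * qualifies, so the caller never ends up with an empty affinity mask.
 */
static int mock_build_ioapic_mask(const uint32_t *apic_ids, size_t ncpus,
				  bool *mask)
{
	size_t i, nset = 0;

	for (i = 0; i < ncpus && i < MOCK_MAX_CPUS; i++) {
		mask[i] = apic_ids[i] <= MOCK_MAX_APIC_ID;
		if (mask[i])
			nset++;
	}

	return nset ? 0 : -ENODEV;
}
```

Writing the bound as "<= 255" keeps the comment and the code in
agreement, and the explicit count makes the empty-mask case visible to
the caller.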
>+
>+ return 0;
>+}
>+
>+static int __init hyperv_enable_irq_remapping(void)
>+{
>+ return IRQ_REMAP_X2APIC_MODE;
>+}
>+
>+static struct irq_domain *hyperv_get_ir_irq_domain(struct irq_alloc_info *info)
>+{
>+ if (info->type == X86_IRQ_ALLOC_TYPE_IOAPIC)
>+ return ioapic_ir_domain;
>+ else
>+ return NULL;
>+}
>+
>+struct irq_remap_ops hyperv_irq_remap_ops = {
>+ .prepare = hyperv_prepare_irq_remapping,
>+ .enable = hyperv_enable_irq_remapping,
>+ .get_ir_irq_domain = hyperv_get_ir_irq_domain,
>+};
>diff --git a/drivers/iommu/irq_remapping.c b/drivers/iommu/irq_remapping.c
>index b94ebd4..81cf290 100644
>--- a/drivers/iommu/irq_remapping.c
>+++ b/drivers/iommu/irq_remapping.c
>@@ -103,6 +103,9 @@ int __init irq_remapping_prepare(void)
> else if (IS_ENABLED(CONFIG_AMD_IOMMU) &&
> amd_iommu_irq_ops.prepare() == 0)
> remap_ops = &amd_iommu_irq_ops;
>+ else if (IS_ENABLED(CONFIG_HYPERV_IOMMU) &&
>+ hyperv_irq_remap_ops.prepare() == 0)
>+ remap_ops = &hyperv_irq_remap_ops;
> else
> return -ENOSYS;
>
>diff --git a/drivers/iommu/irq_remapping.h b/drivers/iommu/irq_remapping.h
>index 0afef6e..f8609e9 100644
>--- a/drivers/iommu/irq_remapping.h
>+++ b/drivers/iommu/irq_remapping.h
>@@ -64,6 +64,7 @@ struct irq_remap_ops {
>
> extern struct irq_remap_ops intel_irq_remap_ops;
> extern struct irq_remap_ops amd_iommu_irq_ops;
>+extern struct irq_remap_ops hyperv_irq_remap_ops;
>
> #else /* CONFIG_IRQ_REMAP */
>
>--
>2.7.4
>