Date: Tue, 24 Nov 2020 08:08:10 +0000
From: Marc Zyngier <maz@kernel.org>
To: Shenming Lu <lushenming@huawei.com>
Cc: James Morse, Julien Thierry, Suzuki K Poulose, Eric Auger,
    linux-arm-kernel@lists.infradead.org, kvmarm@lists.cs.columbia.edu,
    kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
    Christoffer Dall, Alex Williamson, Kirti Wankhede, Cornelia Huck,
    Neo Jia, wanghaibin.wang@huawei.com, yuzenghui@huawei.com
Subject: Re: [RFC PATCH v1 1/4] irqchip/gic-v4.1: Plumb get_irqchip_state
 VLPI callback
In-Reply-To: <7bc7e428-cfd5-6171-dc1e-4be097c46690@huawei.com>
References: <20201123065410.1915-1-lushenming@huawei.com>
 <20201123065410.1915-2-lushenming@huawei.com>
 <7bc7e428-cfd5-6171-dc1e-4be097c46690@huawei.com>
List-ID: <linux-kernel.vger.kernel.org>

On 2020-11-24 07:38, Shenming Lu wrote:
> On 2020/11/23 17:01, Marc Zyngier wrote:
>> On 2020-11-23 06:54, Shenming Lu wrote:
>>> From: Zenghui Yu <yuzenghui@huawei.com>
>>>
>>> Up to now, the irq_get_irqchip_state() callback of its_irq_chip
>>> has been left unimplemented, since there was no architectural way
>>> to get a VLPI's pending state before GICv4.1.
>>> GICv4.1, however, does provide one for VLPIs.
>>>
>>> With GICv4.1, after unmapping the vPE, which cleans and invalidates
>>> any caching of the VPT, we can get the VLPI's pending state by
>>
>> This is a crucial note: without this unmapping and invalidation,
>> the pending bits are not generally accessible (they could be cached
>> in a GIC private structure, cache or otherwise).
>>
>>> peeking at the VPT. So we implement the irq_get_irqchip_state()
>>> callback of its_irq_chip to do it.
>>>
>>> Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
>>> Signed-off-by: Shenming Lu <lushenming@huawei.com>
>>> ---
>>>  drivers/irqchip/irq-gic-v3-its.c | 38 ++++++++++++++++++++++++++++++++
>>>  1 file changed, 38 insertions(+)
>>>
>>> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
>>> index 0fec31931e11..287003cacac7 100644
>>> --- a/drivers/irqchip/irq-gic-v3-its.c
>>> +++ b/drivers/irqchip/irq-gic-v3-its.c
>>> @@ -1695,6 +1695,43 @@ static void its_irq_compose_msi_msg(struct irq_data *d, struct msi_msg *msg)
>>>      iommu_dma_compose_msi_msg(irq_data_get_msi_desc(d), msg);
>>>  }
>>>
>>> +static bool its_peek_vpt(struct its_vpe *vpe, irq_hw_number_t hwirq)
>>> +{
>>> +    int mask = hwirq % BITS_PER_BYTE;
>>
>> nit: this isn't a mask, but a shift instead. BIT(hwirq % BPB) would
>> give you a mask.
>
> Ok, I will correct it.
>
>>
>>> +    void *va;
>>> +    u8 *pt;
>>> +
>>> +    va = page_address(vpe->vpt_page);
>>> +    pt = va + hwirq / BITS_PER_BYTE;
>>> +
>>> +    return !!(*pt & (1U << mask));
>>> +}
>>> +
>>> +static int its_irq_get_irqchip_state(struct irq_data *d,
>>> +                     enum irqchip_irq_state which, bool *val)
>>> +{
>>> +    struct its_device *its_dev = irq_data_get_irq_chip_data(d);
>>> +    struct its_vlpi_map *map = get_vlpi_map(d);
>>> +
>>> +    if (which != IRQCHIP_STATE_PENDING)
>>> +        return -EINVAL;
>>> +
>>> +    /* not intended for physical LPI's pending state */
>>> +    if (!map)
>>> +        return -EINVAL;
>>> +
>>> +    /*
>>> +     * In GICv4.1, a VMAPP with {V,Alloc}=={0,1} cleans and invalidates
>>> +     * any caching of the VPT associated with the vPEID held in the GIC.
>>> +     */
>>> +    if (!is_v4_1(its_dev->its) || atomic_read(&map->vpe->vmapp_count))
>>
>> It isn't clear to me what prevents this from racing against a mapping
>> of the VPE. Actually, since we only hold the LPI irqdesc lock, I'm
>> pretty sure nothing prevents it.
>
> Yes, should we hold the vmovp_lock?

That's not helping, because of the VPE activation.

> And is it necessary to also hold this lock in
> its_vpe_irq_domain_activate/deactivate?

Well, you'd need that, but that's unnecessarily complex AFAICT.

>
>>
>>> +        return -EACCES;
>>
>> I can sort of buy EACCES for a VPE that is currently mapped, but a
>> non-4.1 ITS should definitely return EINVAL.
>
> Alright, EINVAL looks better.
>
>>
>>> +
>>> +    *val = its_peek_vpt(map->vpe, map->vintid);
>>> +
>>> +    return 0;
>>> +}
>>> +
>>>  static int its_irq_set_irqchip_state(struct irq_data *d,
>>>                       enum irqchip_irq_state which,
>>>                       bool state)
>>> @@ -1975,6 +2012,7 @@ static struct irq_chip its_irq_chip = {
>>>      .irq_eoi        = irq_chip_eoi_parent,
>>>      .irq_set_affinity    = its_set_affinity,
>>>      .irq_compose_msi_msg    = its_irq_compose_msi_msg,
>>> +    .irq_get_irqchip_state    = its_irq_get_irqchip_state,
>>
>> My biggest issue with this is that it isn't a reliable interface.
>> It happens to work in the context of KVM, because you make sure it
>> is called at the right time, but that doesn't make it safe in general
>> (anyone with the interrupt number is allowed to call this at any
>> time).
>
> We check the vmapp_count in it to ensure that the vPE has been
> unmapped, and let the caller do the unmapping (they should know
> whether it is the right time). If the unmapping is not done, we just
> return a failure.

And without guaranteeing mutual exclusion against a concurrent VMAPP,
checking vmapp_count means nothing (you'd need the lock described
above, and you'd have to start sprinkling it around in other places as
well).

>>
>> Is there a problem with poking at the VPT page from the KVM side?
>> The code should be exactly the same (maybe simpler even), and at
>> least you'd be guaranteed to be in the correct context.
>
> Yeah, that also seems a good choice.
> If you prefer it, we can try to realize it in v2.

I'd certainly prefer that. Let me know if you spot any implementation
issue with that.

Thanks,

        M.
-- 
Jazz is not dead. It just smells funny...