Cc: baolu.lu@linux.intel.com, "Liu, Yi L", "alex.williamson@redhat.com",
 "hch@lst.de", "jasowang@redhat.com", "joro@8bytes.org",
 "jean-philippe@linaro.org", "parav@mellanox.com", "lkml@metux.net",
 "pbonzini@redhat.com", "lushenming@huawei.com", "eric.auger@redhat.com",
 "corbet@lwn.net", "Raj, Ashok", "yi.l.liu@linux.intel.com", "Tian, Jun J",
 "Wu, Hao", "Jiang, Dave", "jacob.jun.pan@linux.intel.com",
 "kwankhede@nvidia.com", "robin.murphy@arm.com", "kvm@vger.kernel.org",
 "iommu@lists.linux-foundation.org", "dwmw2@infradead.org",
 "linux-kernel@vger.kernel.org", "david@gibson.dropbear.id.au",
 "nicolinc@nvidia.com"
Subject: Re: [RFC 06/20] iommu: Add iommu_device_init[exit]_user_dma interfaces
To: "Tian, Kevin", Jason Gunthorpe
References: <20210919063848.1476776-1-yi.l.liu@intel.com>
 <20210919063848.1476776-7-yi.l.liu@intel.com>
 <20210921170943.GS327412@nvidia.com>
 <20210922123931.GI327412@nvidia.com>
 <20210927150928.GA1517957@nvidia.com>
 <20210928115751.GK964074@nvidia.com>
 <9a314095-3db9-30fc-2ed9-4e46d385036d@linux.intel.com>
 <20210928140712.GL964074@nvidia.com>
 <4ba3294b-1628-0522-17ff-8aa38ed5a615@linux.intel.com>
From: Lu Baolu
Message-ID: <96999691-f056-d3ca-bcdf-e55e8d040517@linux.intel.com>
Date: Wed, 29 Sep 2021 10:38:36 +0800
X-Mailing-List: kvm@vger.kernel.org

On 9/29/21 10:29 AM, Tian, Kevin wrote:
>> From: Lu Baolu
>> Sent: Wednesday, September 29, 2021 10:22 AM
>>
>> On 9/28/21 10:07 PM, Jason Gunthorpe wrote:
>>> On Tue, Sep 28, 2021 at 09:35:05PM +0800, Lu Baolu wrote:
>>>> Another issue is, when putting a device into user-dma mode, all devices
>>>> belonging to the same iommu group shouldn't be bound with a kernel-dma
>>>> driver. Kevin's prototype checks this by READ_ONCE(dev->driver). This is
>>>> not lock safe as discussed below,
>>>>
>>>> https://lore.kernel.org/linux-iommu/20210927130935.GZ964074@nvidia.com/
>>>>
>>>> Any guidance on this?
>>>
>>> Something like this?
>>>
>>>
>>> int iommu_set_device_dma_owner(struct device *dev, enum device_dma_owner mode,
>>>                                struct file *user_owner)
>>> {
>>>         struct iommu_group *group = group_from_dev(dev);
>>>
>>>         spin_lock(&iommu_group->dma_owner_lock);
>>>         switch (mode) {
>>>         case DMA_OWNER_KERNEL:
>>>                 if (iommu_group->dma_users[DMA_OWNER_USERSPACE])
>>>                         return -EBUSY;
>>>                 break;
>>>         case DMA_OWNER_SHARED:
>>>                 break;
>>>         case DMA_OWNER_USERSPACE:
>>>                 if (iommu_group->dma_users[DMA_OWNER_KERNEL])
>>>                         return -EBUSY;
>>>                 if (iommu_group->dma_owner_file != user_owner) {
>>>                         if (iommu_group->dma_users[DMA_OWNER_USERSPACE])
>>>                                 return -EPERM;
>>>                         get_file(user_owner);
>>>                         iommu_group->dma_owner_file = user_owner;
>>>                 }
>>>                 break;
>>>         default:
>>>                 spin_unlock(&iommu_group->dma_owner_lock);
>>>                 return -EINVAL;
>>>         }
>>>         iommu_group->dma_users[mode]++;
>>>         spin_unlock(&iommu_group->dma_owner_lock);
>>>         return 0;
>>> }
>>>
>>> int iommu_release_device_dma_owner(struct device *dev,
>>>                                    enum device_dma_owner mode)
>>> {
>>>         struct iommu_group *group = group_from_dev(dev);
>>>
>>>         spin_lock(&iommu_group->dma_owner_lock);
>>>         if (WARN_ON(!iommu_group->dma_users[mode]))
>>>                 goto err_unlock;
>>>         if (!iommu_group->dma_users[mode]--) {
>>>                 if (mode == DMA_OWNER_USERSPACE) {
>>>                         fput(iommu_group->dma_owner_file);
>>>                         iommu_group->dma_owner_file = NULL;
>>>                 }
>>>         }
>>> err_unlock:
>>>         spin_unlock(&iommu_group->dma_owner_lock);
>>> }
>>>
>>>
>>> Where, the driver core does before probe:
>>>
>>>    iommu_set_device_dma_owner(dev, DMA_OWNER_KERNEL, NULL)
>>>
>>> pci_stub/etc does in their probe func:
>>>
>>>    iommu_set_device_dma_owner(dev, DMA_OWNER_SHARED, NULL)
>>>
>>> And vfio/iommfd does when a struct vfio_device FD is attached:
>>>
>>>    iommu_set_device_dma_owner(dev, DMA_OWNER_USERSPACE, group_file/iommu_file)
>>
>> Really good design. It also helps alleviating some pains elsewhere in
>> the iommu core.
>>
>> Just a nit comment, we also need DMA_OWNER_NONE which will be set when
>> the driver core unbinds the driver from the device.
>>
>
> Not necessarily. NONE is represented by none of dma_user[mode]
> is valid.
>

Fair enough.

Best regards,
baolu
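
For anyone following the thread, the counting scheme discussed above can be tried out in user space. What follows is only a minimal sketch under stated assumptions, not the code proposed in the thread and not a kernel API: struct group_model, the group_* functions, DMA_OWNER_COUNT and the opaque owner token are hypothetical names, a pthread mutex stands in for dma_owner_lock, and a plain pointer stands in for the struct file reference (no get_file()/fput() counting). The sketch drops the lock on every return path and clears the owner once the userspace count reaches zero, which is one reading of the intent of the quoted draft. It also shows Kevin's point that "no owner" needs no separate mode: it is simply the state in which every counter is zero.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Modes named in the thread; DMA_OWNER_COUNT is an added helper. */
enum device_dma_owner {
    DMA_OWNER_KERNEL,
    DMA_OWNER_SHARED,
    DMA_OWNER_USERSPACE,
    DMA_OWNER_COUNT,
};

/* Hypothetical stand-in for the per-group state (not struct iommu_group). */
struct group_model {
    pthread_mutex_t lock;                    /* plays the role of dma_owner_lock */
    unsigned int dma_users[DMA_OWNER_COUNT]; /* one user count per mode */
    const void *owner_token;                 /* plays the role of dma_owner_file */
};

void group_model_init(struct group_model *g)
{
    pthread_mutex_init(&g->lock, NULL);
    for (int i = 0; i < DMA_OWNER_COUNT; i++)
        g->dma_users[i] = 0;
    g->owner_token = NULL;
}

/* Claim the group for one more user of 'mode'; returns 0 or a negative errno. */
int group_set_dma_owner(struct group_model *g, int mode, const void *token)
{
    int ret = 0;

    if (mode < 0 || mode >= DMA_OWNER_COUNT)
        return -EINVAL;
    if (mode == DMA_OWNER_USERSPACE && !token)
        return -EINVAL; /* assumption: a userspace claim must name its owner */

    pthread_mutex_lock(&g->lock);
    switch (mode) {
    case DMA_OWNER_KERNEL:
        if (g->dma_users[DMA_OWNER_USERSPACE])
            ret = -EBUSY;   /* kernel and userspace DMA exclude each other */
        break;
    case DMA_OWNER_USERSPACE:
        if (g->dma_users[DMA_OWNER_KERNEL])
            ret = -EBUSY;
        else if (g->dma_users[DMA_OWNER_USERSPACE] && g->owner_token != token)
            ret = -EPERM;   /* a different userspace owner already holds it */
        else
            g->owner_token = token;
        break;
    default:                /* DMA_OWNER_SHARED coexists with either side */
        break;
    }
    if (ret == 0)
        g->dma_users[mode]++;
    pthread_mutex_unlock(&g->lock);
    return ret;
}

/* Drop one user of 'mode'; the last userspace user also clears the owner. */
int group_release_dma_owner(struct group_model *g, int mode)
{
    int ret = 0;

    if (mode < 0 || mode >= DMA_OWNER_COUNT)
        return -EINVAL;

    pthread_mutex_lock(&g->lock);
    if (g->dma_users[mode] == 0) {
        ret = -EINVAL;      /* unbalanced release */
    } else {
        if (mode == DMA_OWNER_USERSPACE)
            assert(g->owner_token != NULL); /* invariant of this model */
        if (--g->dma_users[mode] == 0 && mode == DMA_OWNER_USERSPACE)
            g->owner_token = NULL;
    }
    pthread_mutex_unlock(&g->lock);
    return ret;
}

/* The implicit NONE state: true when every per-mode counter is zero. */
int group_has_no_owner(struct group_model *g)
{
    int none = 1;

    pthread_mutex_lock(&g->lock);
    for (int i = 0; i < DMA_OWNER_COUNT; i++)
        if (g->dma_users[i])
            none = 0;
    pthread_mutex_unlock(&g->lock);
    return none;
}
```

In this model a kernel claim followed by a userspace claim fails with -EBUSY, a second userspace claim with a different token fails with -EPERM, and once every claim has been released group_has_no_owner() reports true again without any DMA_OWNER_NONE value.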