From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753454AbeDSTy4 (ORCPT ); Thu, 19 Apr 2018 15:54:56 -0400 Received: from mx1.redhat.com ([209.132.183.28]:44792 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752153AbeDSTyx (ORCPT ); Thu, 19 Apr 2018 15:54:53 -0400 Date: Thu, 19 Apr 2018 13:54:52 -0600 From: Alex Williamson To: Xu Yandong Cc: , , , , Kirti Wankhede Subject: Re: [PATCH] vfio iommu type1: no need to check task->mm if task has been destroyed Message-ID: <20180419135452.72d0ff32@w520.home> In-Reply-To: <20180419101926.5700f8a9@w520.home> References: <20180418105545.13488-1-xuyandong2@huawei.com> <20180419101926.5700f8a9@w520.home> MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, 19 Apr 2018 10:19:26 -0600 Alex Williamson wrote: > [cc +Kirti] > > On Wed, 18 Apr 2018 18:55:45 +0800 > Xu Yandong wrote: > > > The task structure in vfio_dma struct used to identify the same > > task who map it or other task who shares same adress space is > > allowed to unmap. But if the task who map it has exited, mm of > > the task has been set to null, we should unmap the vfio dma directly. > > > > Signed-off-by: Xu Yandong > > --- > > Hi all, > > When I unplug a vcpu from a VM lanched with a VFIO hostdev device, > > I found that the *vfio_dma* mapped by this vcpu task could not be unmaped > > in the future, so I send this patch to unmap vfio_dma directly if the > > task who mapped it has exited. > > > > Howerver this patch may introduce a new security risk because any task can > > unmap the *vfio_dma* if the mapper task has exited. > > Well that's unexpected, but adding some debugging code I can clearly > see that the map and unmap ioctls are typically called by the various > processor threads, which all share the same mm_struct (so accounting is > correct regardless of which CPU does the unmap). I don't think the fix > below is correct though, it's not for a security risk, but for > accounting issue and correctness issues. The pages are mapped and > accounted against the users locked memory limits, if we simply bail > out, both the IOMMU mappings and the limit accounting are wrong. > Perhaps rather than referencing the calling task_struct in the vfio_dma > on mapping, we should traverse to the highest parent task sharing the > same mm_struct. Kirti, any thoughts since this code originated for > mdev support? Thanks, I think something like below is a better solution. More research required on group_leader and if it needs to be sanity tested or if testing mm_struct is redundant, but I think it should resolve the failing test case, all mappings reference the same task regardless of which vCPU triggers it. Thanks, Alex diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c index 5c212bf29640..3a1d3695c3fb 100644 --- a/drivers/vfio/vfio_iommu_type1.c +++ b/drivers/vfio/vfio_iommu_type1.c @@ -1093,6 +1093,7 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu, int ret = 0, prot = 0; uint64_t mask; struct vfio_dma *dma; + struct task_struct *task; /* Verify that none of our __u64 fields overflow */ if (map->size != size || map->vaddr != vaddr || map->iova != iova) @@ -1131,8 +1132,12 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu, dma->iova = iova; dma->vaddr = vaddr; dma->prot = prot; - get_task_struct(current); - dma->task = current; + + task = (current->mm == current->group_leader->mm ? + current->group_leader : current); + get_task_struct(task); + dma->task = task; + dma->pfn_list = RB_ROOT; /* Insert zero-sized and grow as we map chunks of it */