From: Jason Gunthorpe
To: linux-mm@kvack.org, Jerome Glisse, Ralph Campbell, John Hubbard, Felix.Kuehling@amd.com
Cc: linux-rdma@vger.kernel.org, dri-devel@lists.freedesktop.org, amd-gfx@lists.freedesktop.org, Alex Deucher, Ben Skeggs, Boris Ostrovsky, Christian König, David Zhou, Dennis Dalessandro, Juergen Gross, Mike Marciniszyn, Oleksandr Andrushchenko, Petr Cvek, Stefano Stabellini, nouveau@lists.freedesktop.org, xen-devel@lists.xenproject.org, Christoph Hellwig, Jason Gunthorpe
Subject: [PATCH v2 09/15] xen/gntdev: use mmu_range_notifier_insert
Date: Mon, 28 Oct 2019 17:10:26 -0300
Message-Id: <20191028201032.6352-10-jgg@ziepe.ca>
In-Reply-To: <20191028201032.6352-1-jgg@ziepe.ca>
References: <20191028201032.6352-1-jgg@ziepe.ca>

From: Jason Gunthorpe

gntdev simply wants to monitor a specific VMA for any notifier events;
this can be done straightforwardly using mmu_range_notifier_insert()
over the VMA's VA range.

The notifier should be attached until the original VMA is destroyed.

It is unclear if any of this is even sane, but at least a lot of
duplicate code is removed.

Cc: Oleksandr Andrushchenko
Cc: Boris Ostrovsky
Cc: xen-devel@lists.xenproject.org
Cc: Juergen Gross
Cc: Stefano Stabellini
Signed-off-by: Jason Gunthorpe
---
 drivers/xen/gntdev-common.h |   8 +-
 drivers/xen/gntdev.c        | 180 ++++++++++--------------------
 2 files changed, 49 insertions(+), 139 deletions(-)
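The shape of the conversion, for readers new to this API: each
gntdev_grant_map embeds a struct mmu_range_notifier, inserts it over the
VMA at mmap time, and removes it when the VMA goes away. Below is a
minimal sketch of that lifecycle. It uses only the calls visible in this
patch (mmu_range_notifier_insert_locked(), mmu_range_notifier_remove(),
mmu_notifier_range_blockable()); the demo_* names and the wiring of the
ops pointer on the embedded notifier are illustrative assumptions, not
gntdev code:

```c
#include <linux/mm.h>
#include <linux/mmu_notifier.h>

/* Hypothetical per-mapping object; gntdev's equivalent is struct
 * gntdev_grant_map with its embedded notifier. */
struct demo_map {
	struct mmu_range_notifier notifier;
	struct vm_area_struct *vma;
};

/* Called for any invalidation event overlapping the inserted range. */
static bool demo_invalidate(struct mmu_range_notifier *mn,
			    const struct mmu_notifier_range *range,
			    unsigned long cur_seq)
{
	struct demo_map *map = container_of(mn, struct demo_map, notifier);

	/* Returning false is only permitted for non-blockable events;
	 * the core will come back in a blockable context. */
	if (!mmu_notifier_range_blockable(range))
		return false;

	/* ... tear down device/hypervisor references to the pages,
	 * clamped to map->vma as gntdev_invalidate() does ... */
	return true;
}

static const struct mmu_range_notifier_ops demo_ops = {
	.invalidate = demo_invalidate,
};

/* From ->mmap, with mmap_sem already held (hence the _locked insert). */
static int demo_attach(struct demo_map *map, struct vm_area_struct *vma)
{
	map->vma = vma;
	map->notifier.ops = &demo_ops;	/* assumed wiring of the ops */
	return mmu_range_notifier_insert_locked(&map->notifier,
						vma->vm_start,
						vma->vm_end - vma->vm_start,
						vma->vm_mm);
}

/* From the VMA ->close handler, mirroring gntdev_vma_close(). */
static void demo_detach(struct demo_map *map)
{
	mmu_range_notifier_remove(&map->notifier);
	map->vma = NULL;
}
```

The payoff is visible in the diffstat: the per-mm mmu_notifier, the
freeable_maps bookkeeping, and the hand-rolled range checks all collapse
into the single embedded range notifier.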
diff --git a/drivers/xen/gntdev-common.h b/drivers/xen/gntdev-common.h
index 2f8b949c3eeb14..b201fdd20b667b 100644
--- a/drivers/xen/gntdev-common.h
+++ b/drivers/xen/gntdev-common.h
@@ -21,15 +21,8 @@ struct gntdev_dmabuf_priv;
 struct gntdev_priv {
 	/* Maps with visible offsets in the file descriptor. */
 	struct list_head maps;
-	/*
-	 * Maps that are not visible; will be freed on munmap.
-	 * Only populated if populate_freeable_maps == 1
-	 */
-	struct list_head freeable_maps;
 	/* lock protects maps and freeable_maps. */
 	struct mutex lock;
-	struct mm_struct *mm;
-	struct mmu_notifier mn;
 
 #ifdef CONFIG_XEN_GRANT_DMA_ALLOC
 	/* Device for which DMA memory is allocated. */
@@ -49,6 +42,7 @@ struct gntdev_unmap_notify {
 };
 
 struct gntdev_grant_map {
+	struct mmu_range_notifier notifier;
 	struct list_head next;
 	struct vm_area_struct *vma;
 	int index;
diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index a446a7221e13e9..12d626670bebbc 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -65,7 +65,6 @@ MODULE_PARM_DESC(limit, "Maximum number of grants that may be mapped by "
 static atomic_t pages_mapped = ATOMIC_INIT(0);
 
 static int use_ptemod;
-#define populate_freeable_maps use_ptemod
 
 static int unmap_grant_pages(struct gntdev_grant_map *map,
 			     int offset, int pages);
@@ -251,12 +250,6 @@ void gntdev_put_map(struct gntdev_priv *priv, struct gntdev_grant_map *map)
 		evtchn_put(map->notify.event);
 	}
 
-	if (populate_freeable_maps && priv) {
-		mutex_lock(&priv->lock);
-		list_del(&map->next);
-		mutex_unlock(&priv->lock);
-	}
-
 	if (map->pages && !use_ptemod)
 		unmap_grant_pages(map, 0, map->count);
 	gntdev_free_map(map);
@@ -445,17 +438,9 @@ static void gntdev_vma_close(struct vm_area_struct *vma)
 	struct gntdev_priv *priv = file->private_data;
 
 	pr_debug("gntdev_vma_close %p\n", vma);
-	if (use_ptemod) {
-		/* It is possible that an mmu notifier could be running
-		 * concurrently, so take priv->lock to ensure that the vma won't
-		 * vanishing during the unmap_grant_pages call, since we will
-		 * spin here until that completes. Such a concurrent call will
-		 * not do any unmapping, since that has been done prior to
-		 * closing the vma, but it may still iterate the unmap_ops list.
-		 */
-		mutex_lock(&priv->lock);
+	if (use_ptemod && map->vma == vma) {
+		mmu_range_notifier_remove(&map->notifier);
 		map->vma = NULL;
-		mutex_unlock(&priv->lock);
-	}
+	}
 	vma->vm_private_data = NULL;
 	gntdev_put_map(priv, map);
@@ -477,109 +462,44 @@ static const struct vm_operations_struct gntdev_vmops = {
 
 /* ------------------------------------------------------------------ */
 
-static bool in_range(struct gntdev_grant_map *map,
-	      unsigned long start, unsigned long end)
-{
-	if (!map->vma)
-		return false;
-	if (map->vma->vm_start >= end)
-		return false;
-	if (map->vma->vm_end <= start)
-		return false;
-
-	return true;
-}
-
-static int unmap_if_in_range(struct gntdev_grant_map *map,
-			     unsigned long start, unsigned long end,
-			     bool blockable)
+static bool gntdev_invalidate(struct mmu_range_notifier *mn,
+			      const struct mmu_notifier_range *range,
+			      unsigned long cur_seq)
 {
+	struct gntdev_grant_map *map =
+		container_of(mn, struct gntdev_grant_map, notifier);
 	unsigned long mstart, mend;
 	int err;
 
-	if (!in_range(map, start, end))
-		return 0;
+	if (!mmu_notifier_range_blockable(range))
+		return false;
 
-	if (!blockable)
-		return -EAGAIN;
+	/*
+	 * If the VMA is split or otherwise changed the notifier is not
+	 * updated, but we don't want to process VA's outside the modified
+	 * VMA. FIXME: It would be much more understandable to just prevent
+	 * modifying the VMA in the first place.
+ */ + if (map->vma->vm_start >=3D range->end || + map->vma->vm_end <=3D range->start) + return true; =20 - mstart =3D max(start, map->vma->vm_start); - mend =3D min(end, map->vma->vm_end); + mstart =3D max(range->start, map->vma->vm_start); + mend =3D min(range->end, map->vma->vm_end); pr_debug("map %d+%d (%lx %lx), range %lx %lx, mrange %lx %lx\n", map->index, map->count, map->vma->vm_start, map->vma->vm_end, - start, end, mstart, mend); + range->start, range->end, mstart, mend); err =3D unmap_grant_pages(map, (mstart - map->vma->vm_start) >> PAGE_SHIFT, (mend - mstart) >> PAGE_SHIFT); WARN_ON(err); =20 - return 0; -} - -static int mn_invl_range_start(struct mmu_notifier *mn, - const struct mmu_notifier_range *range) -{ - struct gntdev_priv *priv =3D container_of(mn, struct gntdev_priv, mn); - struct gntdev_grant_map *map; - int ret =3D 0; - - if (mmu_notifier_range_blockable(range)) - mutex_lock(&priv->lock); - else if (!mutex_trylock(&priv->lock)) - return -EAGAIN; - - list_for_each_entry(map, &priv->maps, next) { - ret =3D unmap_if_in_range(map, range->start, range->end, - mmu_notifier_range_blockable(range)); - if (ret) - goto out_unlock; - } - list_for_each_entry(map, &priv->freeable_maps, next) { - ret =3D unmap_if_in_range(map, range->start, range->end, - mmu_notifier_range_blockable(range)); - if (ret) - goto out_unlock; - } - -out_unlock: - mutex_unlock(&priv->lock); - - return ret; -} - -static void mn_release(struct mmu_notifier *mn, - struct mm_struct *mm) -{ - struct gntdev_priv *priv =3D container_of(mn, struct gntdev_priv, mn); - struct gntdev_grant_map *map; - int err; - - mutex_lock(&priv->lock); - list_for_each_entry(map, &priv->maps, next) { - if (!map->vma) - continue; - pr_debug("map %d+%d (%lx %lx)\n", - map->index, map->count, - map->vma->vm_start, map->vma->vm_end); - err =3D unmap_grant_pages(map, /* offset */ 0, map->count); - WARN_ON(err); - } - list_for_each_entry(map, &priv->freeable_maps, next) { - if (!map->vma) - continue; - pr_debug("map %d+%d (%lx %lx)\n", - map->index, map->count, - map->vma->vm_start, map->vma->vm_end); - err =3D unmap_grant_pages(map, /* offset */ 0, map->count); - WARN_ON(err); - } - mutex_unlock(&priv->lock); + return true; } =20 -static const struct mmu_notifier_ops gntdev_mmu_ops =3D { - .release =3D mn_release, - .invalidate_range_start =3D mn_invl_range_start, +static const struct mmu_range_notifier_ops gntdev_mmu_ops =3D { + .invalidate =3D gntdev_invalidate, }; =20 /* ------------------------------------------------------------------ */ @@ -594,7 +514,6 @@ static int gntdev_open(struct inode *inode, struct fi= le *flip) return -ENOMEM; =20 INIT_LIST_HEAD(&priv->maps); - INIT_LIST_HEAD(&priv->freeable_maps); mutex_init(&priv->lock); =20 #ifdef CONFIG_XEN_GNTDEV_DMABUF @@ -606,17 +525,6 @@ static int gntdev_open(struct inode *inode, struct f= ile *flip) } #endif =20 - if (use_ptemod) { - priv->mm =3D get_task_mm(current); - if (!priv->mm) { - kfree(priv); - return -ENOMEM; - } - priv->mn.ops =3D &gntdev_mmu_ops; - ret =3D mmu_notifier_register(&priv->mn, priv->mm); - mmput(priv->mm); - } - if (ret) { kfree(priv); return ret; @@ -653,16 +561,12 @@ static int gntdev_release(struct inode *inode, stru= ct file *flip) list_del(&map->next); gntdev_put_map(NULL /* already removed */, map); } - WARN_ON(!list_empty(&priv->freeable_maps)); mutex_unlock(&priv->lock); =20 #ifdef CONFIG_XEN_GNTDEV_DMABUF gntdev_dmabuf_fini(priv->dmabuf_priv); #endif =20 - if (use_ptemod) - mmu_notifier_unregister(&priv->mn, priv->mm); - kfree(priv); return 
 	return 0;
 }
@@ -723,8 +627,6 @@ static long gntdev_ioctl_unmap_grant_ref(struct gntdev_priv *priv,
 	map = gntdev_find_map_index(priv, op.index >> PAGE_SHIFT, op.count);
 	if (map) {
 		list_del(&map->next);
-		if (populate_freeable_maps)
-			list_add_tail(&map->next, &priv->freeable_maps);
 		err = 0;
 	}
 	mutex_unlock(&priv->lock);
@@ -1096,11 +998,6 @@ static int gntdev_mmap(struct file *flip, struct vm_area_struct *vma)
 		goto unlock_out;
 	if (use_ptemod && map->vma)
 		goto unlock_out;
-	if (use_ptemod && priv->mm != vma->vm_mm) {
-		pr_warn("Huh? Other mm?\n");
-		goto unlock_out;
-	}
-
 	refcount_inc(&map->users);
 
 	vma->vm_ops = &gntdev_vmops;
@@ -1111,10 +1008,6 @@ static int gntdev_mmap(struct file *flip, struct vm_area_struct *vma)
 		vma->vm_flags |= VM_DONTCOPY;
 
 	vma->vm_private_data = map;
-
-	if (use_ptemod)
-		map->vma = vma;
-
 	if (map->flags) {
 		if ((vma->vm_flags & VM_WRITE) &&
 		    (map->flags & GNTMAP_readonly))
@@ -1125,8 +1018,28 @@ static int gntdev_mmap(struct file *flip, struct vm_area_struct *vma)
 			map->flags |= GNTMAP_readonly;
 	}
 
+	if (use_ptemod) {
+		map->vma = vma;
+		err = mmu_range_notifier_insert_locked(
+			&map->notifier, vma->vm_start,
+			vma->vm_end - vma->vm_start, vma->vm_mm);
+		if (err)
+			goto out_unlock_put;
+	}
 	mutex_unlock(&priv->lock);
 
+	/*
+	 * gntdev takes the address of the PTE in find_grant_ptes() and passes
+	 * it to the hypervisor in gntdev_map_grant_pages(). The purpose of
+	 * the notifier is to prevent the hypervisor pointer to the PTE from
+	 * going stale.
+	 *
+	 * Since this vma's mappings can't be touched without the mmap_sem,
+	 * and we are holding it now, there is no need for the notifier_range
+	 * locking pattern.
+	 */
+	mmu_range_read_begin(&map->notifier);
+
 	if (use_ptemod) {
 		map->pages_vm_start = vma->vm_start;
 		err = apply_to_page_range(vma->vm_mm, vma->vm_start,
@@ -1175,8 +1088,11 @@ static int gntdev_mmap(struct file *flip, struct vm_area_struct *vma)
 	mutex_unlock(&priv->lock);
 out_put_map:
 	if (use_ptemod) {
-		map->vma = NULL;
 		unmap_grant_pages(map, 0, map->count);
+		if (map->vma) {
+			mmu_range_notifier_remove(&map->notifier);
+			map->vma = NULL;
+		}
 	}
 	gntdev_put_map(priv, map);
 	return err;
-- 
2.23.0
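For contrast with the comment in gntdev_mmap() above: a consumer that is
not protected by mmap_sem would have to use the full notifier_range
locking pattern that gntdev gets to skip. Below is a sketch of that
pattern, assuming the begin/retry pairing this series describes
(mmu_range_read_begin()/mmu_range_read_retry(); the retry helper does not
appear in this patch, so treat its name and signature as assumptions).
The demo_program_hw()/demo_commit_hw() helpers are hypothetical, and the
demo_map sketch from the earlier note is assumed to gain a
struct mutex hw_lock shared with its invalidate() callback:

```c
/* Hypothetical consumer without mmap_sem protection: sample the
 * notifier sequence, build the device mapping, then commit it under
 * the same lock the invalidate() callback takes, retrying if an
 * invalidation raced in between. */
static int demo_setup_mapping(struct demo_map *map)
{
	unsigned long seq;
	int ret;

	for (;;) {
		seq = mmu_range_read_begin(&map->notifier);

		ret = demo_program_hw(map);	/* hypothetical */
		if (ret)
			return ret;

		mutex_lock(&map->hw_lock);	/* also taken in invalidate() */
		if (!mmu_range_read_retry(&map->notifier, seq))
			break;			/* still valid: commit below */
		mutex_unlock(&map->hw_lock);	/* raced, rebuild */
	}

	demo_commit_hw(map);			/* hypothetical */
	mutex_unlock(&map->hw_lock);
	return 0;
}
```

Because gntdev_mmap() holds mmap_sem while it programs the PTEs, nothing
can modify the VMA's mappings concurrently, so per the patch's own
comment a bare mmu_range_read_begin() suffices and the retry loop is
unnecessary there.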