From: Jason Gunthorpe
To: linux-mm@kvack.org, Jerome Glisse, Ralph Campbell, John Hubbard,
 Felix.Kuehling@amd.com
Cc: linux-rdma@vger.kernel.org, dri-devel@lists.freedesktop.org,
 amd-gfx@lists.freedesktop.org, Alex Deucher, Ben Skeggs,
 Boris Ostrovsky, Christian König, David Zhou, Dennis Dalessandro,
 Juergen Gross, Mike Marciniszyn, Oleksandr Andrushchenko, Petr Cvek,
 Stefano Stabellini, nouveau@lists.freedesktop.org,
 xen-devel@lists.xenproject.org, Christoph Hellwig, Jason Gunthorpe
Subject: [PATCH v2 06/15] RDMA/hfi1: Use mmu_range_notifier_insert for user_exp_rcv
Date: Mon, 28 Oct 2019 17:10:23 -0300
Message-Id: <20191028201032.6352-7-jgg@ziepe.ca>
X-Mailer: git-send-email 2.23.0
In-Reply-To: <20191028201032.6352-1-jgg@ziepe.ca>
References: <20191028201032.6352-1-jgg@ziepe.ca>

From: Jason Gunthorpe

This converts one of the two users of mmu_notifiers to use the new API.
The conversion is fairly straightforward; however, the existing use of
notifiers here seems to be racy.

Cc: Mike Marciniszyn
Cc: Dennis Dalessandro
Signed-off-by: Jason Gunthorpe
---
 drivers/infiniband/hw/hfi1/file_ops.c     |   2 +-
 drivers/infiniband/hw/hfi1/hfi.h          |   2 +-
 drivers/infiniband/hw/hfi1/user_exp_rcv.c | 146 +++++++++-------------
 drivers/infiniband/hw/hfi1/user_exp_rcv.h |   3 +-
 4 files changed, 60 insertions(+), 93 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/file_ops.c b/drivers/infiniband/hw/hfi1/file_ops.c
index f9a7e9d29c8ba2..7c5e3fb224139a 100644
--- a/drivers/infiniband/hw/hfi1/file_ops.c
+++ b/drivers/infiniband/hw/hfi1/file_ops.c
@@ -1138,7 +1138,7 @@ static int get_ctxt_info(struct hfi1_filedata *fd, unsigned long arg, u32 len)
 			HFI1_CAP_UGET_MASK(uctxt->flags, MASK) |
 			HFI1_CAP_KGET_MASK(uctxt->flags, K2U);
 	/* adjust flag if this fd is not able to cache */
-	if (!fd->handler)
+	if (!fd->use_mn)
 		cinfo.runtime_flags |= HFI1_CAP_TID_UNMAP; /* no caching */
 
 	cinfo.num_active = hfi1_count_active_units();
diff --git a/drivers/infiniband/hw/hfi1/hfi.h b/drivers/infiniband/hw/hfi1/hfi.h
index fa45350a9a1d32..fc10d65fc3e13c 100644
--- a/drivers/infiniband/hw/hfi1/hfi.h
+++ b/drivers/infiniband/hw/hfi1/hfi.h
@@ -1444,7 +1444,7 @@ struct hfi1_filedata {
 	/* for cpu affinity; -1 if none */
 	int rec_cpu_num;
 	u32 tid_n_pinned;
-	struct mmu_rb_handler *handler;
+	bool use_mn;
 	struct tid_rb_node **entry_to_rb;
 	spinlock_t tid_lock; /* protect tid_[limit,used] counters */
 	u32 tid_limit;
diff --git a/drivers/infiniband/hw/hfi1/user_exp_rcv.c b/drivers/infiniband/hw/hfi1/user_exp_rcv.c
index 3592a9ec155e85..a1ab3bd334f89e 100644
--- a/drivers/infiniband/hw/hfi1/user_exp_rcv.c
+++ b/drivers/infiniband/hw/hfi1/user_exp_rcv.c
@@ -59,11 +59,11 @@ static int set_rcvarray_entry(struct hfi1_filedata *fd,
 			      struct tid_user_buf *tbuf,
 			      u32 rcventry, struct tid_group *grp,
 			      u16 pageidx, unsigned int npages);
-static int tid_rb_insert(void *arg, struct mmu_rb_node *node);
 static void cacheless_tid_rb_remove(struct hfi1_filedata *fdata,
 				    struct tid_rb_node *tnode);
-static void tid_rb_remove(void *arg, struct mmu_rb_node *node);
-static int tid_rb_invalidate(void *arg, struct mmu_rb_node *mnode);
+static bool tid_rb_invalidate(struct mmu_range_notifier *mrn,
+			      const struct mmu_notifier_range *range,
+			      unsigned long cur_seq);
 static int program_rcvarray(struct hfi1_filedata *fd, struct tid_user_buf *,
 			    struct tid_group *grp,
 			    unsigned int start, u16 count,
@@ -73,10 +73,8 @@ static int unprogram_rcvarray(struct hfi1_filedata *fd, u32 tidinfo,
 			      struct tid_group **grp);
 static void clear_tid_node(struct hfi1_filedata *fd, struct tid_rb_node *node);
 
-static struct mmu_rb_ops tid_rb_ops = {
-	.insert = tid_rb_insert,
-	.remove = tid_rb_remove,
-	.invalidate = tid_rb_invalidate
+static const struct mmu_range_notifier_ops tid_mn_ops = {
+	.invalidate = tid_rb_invalidate,
 };
 
 /*
@@ -87,7 +85,6 @@ static struct mmu_rb_ops tid_rb_ops = {
 int hfi1_user_exp_rcv_init(struct hfi1_filedata *fd,
 			   struct hfi1_ctxtdata *uctxt)
 {
-	struct hfi1_devdata *dd = uctxt->dd;
 	int ret = 0;
 
 	spin_lock_init(&fd->tid_lock);
@@ -109,20 +106,7 @@ int hfi1_user_exp_rcv_init(struct hfi1_filedata *fd,
 			fd->entry_to_rb = NULL;
 			return -ENOMEM;
 		}
-
-		/*
-		 * Register MMU notifier callbacks. If the registration
-		 * fails, continue without TID caching for this context.
-		 */
-		ret = hfi1_mmu_rb_register(fd, fd->mm, &tid_rb_ops,
-					   dd->pport->hfi1_wq,
-					   &fd->handler);
-		if (ret) {
-			dd_dev_info(dd,
-				    "Failed MMU notifier registration %d\n",
-				    ret);
-			ret = 0;
-		}
+		fd->use_mn = true;
 	}
 
 	/*
@@ -139,7 +123,7 @@ int hfi1_user_exp_rcv_init(struct hfi1_filedata *fd,
 	 * init.
 	 */
 	spin_lock(&fd->tid_lock);
-	if (uctxt->subctxt_cnt && fd->handler) {
+	if (uctxt->subctxt_cnt && fd->use_mn) {
 		u16 remainder;
 
 		fd->tid_limit = uctxt->expected_count / uctxt->subctxt_cnt;
@@ -158,18 +142,10 @@ void hfi1_user_exp_rcv_free(struct hfi1_filedata *fd)
 {
 	struct hfi1_ctxtdata *uctxt = fd->uctxt;
 
-	/*
-	 * The notifier would have been removed when the process'es mm
-	 * was freed.
-	 */
-	if (fd->handler) {
-		hfi1_mmu_rb_unregister(fd->handler);
-	} else {
-		if (!EXP_TID_SET_EMPTY(uctxt->tid_full_list))
-			unlock_exp_tids(uctxt, &uctxt->tid_full_list, fd);
-		if (!EXP_TID_SET_EMPTY(uctxt->tid_used_list))
-			unlock_exp_tids(uctxt, &uctxt->tid_used_list, fd);
-	}
+	if (!EXP_TID_SET_EMPTY(uctxt->tid_full_list))
+		unlock_exp_tids(uctxt, &uctxt->tid_full_list, fd);
+	if (!EXP_TID_SET_EMPTY(uctxt->tid_used_list))
+		unlock_exp_tids(uctxt, &uctxt->tid_used_list, fd);
 
 	kfree(fd->invalid_tids);
 	fd->invalid_tids = NULL;
@@ -201,7 +177,7 @@ static void unpin_rcv_pages(struct hfi1_filedata *fd,
 
 	if (mapped) {
 		pci_unmap_single(dd->pcidev, node->dma_addr,
-				 node->mmu.len, PCI_DMA_FROMDEVICE);
+				 node->npages * PAGE_SIZE, PCI_DMA_FROMDEVICE);
 		pages = &node->pages[idx];
 	} else {
 		pages = &tidbuf->pages[idx];
@@ -777,8 +753,8 @@ static int set_rcvarray_entry(struct hfi1_filedata *fd,
 		return -EFAULT;
 	}
 
-	node->mmu.addr = tbuf->vaddr + (pageidx * PAGE_SIZE);
-	node->mmu.len = npages * PAGE_SIZE;
+	node->notifier.ops = &tid_mn_ops;
+	node->fdata = fd;
 	node->phys = page_to_phys(pages[0]);
 	node->npages = npages;
 	node->rcventry = rcventry;
@@ -787,23 +763,34 @@ static int set_rcvarray_entry(struct hfi1_filedata *fd,
 	node->freed = false;
 	memcpy(node->pages, pages, sizeof(struct page *) * npages);
 
-	if (!fd->handler)
-		ret = tid_rb_insert(fd, &node->mmu);
-	else
-		ret = hfi1_mmu_rb_insert(fd->handler, &node->mmu);
-
-	if (ret) {
-		hfi1_cdbg(TID, "Failed to insert RB node %u 0x%lx, 0x%lx %d",
-			  node->rcventry, node->mmu.addr, node->phys, ret);
-		pci_unmap_single(dd->pcidev, phys, npages * PAGE_SIZE,
-				 PCI_DMA_FROMDEVICE);
-		kfree(node);
-		return -EFAULT;
+	if (fd->use_mn) {
+		ret = mmu_range_notifier_insert(
+			&node->notifier, tbuf->vaddr + (pageidx * PAGE_SIZE),
+			npages * PAGE_SIZE, fd->mm);
+		if (ret)
+			goto out_unmap;
+		/*
+		 * FIXME: This is in the wrong order, the notifier should be
+		 * established before the pages are pinned by pin_rcv_pages.
+		 */
+		mmu_range_read_begin(&node->notifier);
 	}
+	fd->entry_to_rb[node->rcventry - uctxt->expected_base] = node;
+
 	hfi1_put_tid(dd, rcventry, PT_EXPECTED, phys, ilog2(npages) + 1);
 	trace_hfi1_exp_tid_reg(uctxt->ctxt, fd->subctxt, rcventry, npages,
-			       node->mmu.addr, node->phys, phys);
+			       node->notifier.interval_tree.start, node->phys,
+			       phys);
 	return 0;
+
+out_unmap:
+	hfi1_cdbg(TID, "Failed to insert RB node %u 0x%lx, 0x%lx %d",
+		  node->rcventry, node->notifier.interval_tree.start,
+		  node->phys, ret);
+	pci_unmap_single(dd->pcidev, phys, npages * PAGE_SIZE,
+			 PCI_DMA_FROMDEVICE);
+	kfree(node);
+	return -EFAULT;
 }
 
 static int unprogram_rcvarray(struct hfi1_filedata *fd, u32 tidinfo,
@@ -833,10 +820,9 @@ static int unprogram_rcvarray(struct hfi1_filedata *fd, u32 tidinfo,
 	if (grp)
 		*grp = node->grp;
 
-	if (!fd->handler)
-		cacheless_tid_rb_remove(fd, node);
-	else
-		hfi1_mmu_rb_remove(fd->handler, &node->mmu);
+	if (fd->use_mn)
+		mmu_range_notifier_remove(&node->notifier);
+	cacheless_tid_rb_remove(fd, node);
 
 	return 0;
 }
@@ -847,7 +833,8 @@ static void clear_tid_node(struct hfi1_filedata *fd, struct tid_rb_node *node)
 	struct hfi1_devdata *dd = uctxt->dd;
 
 	trace_hfi1_exp_tid_unreg(uctxt->ctxt, fd->subctxt, node->rcventry,
-				 node->npages, node->mmu.addr, node->phys,
+				 node->npages,
+				 node->notifier.interval_tree.start, node->phys,
 				 node->dma_addr);
 
 	/*
@@ -894,30 +881,29 @@ static void unlock_exp_tids(struct hfi1_ctxtdata *uctxt,
 				if (!node || node->rcventry != rcventry)
 					continue;
 
+				if (fd->use_mn)
+					mmu_range_notifier_remove(
+						&node->notifier);
 				cacheless_tid_rb_remove(fd, node);
 			}
 		}
 	}
 }
 
-/*
- * Always return 0 from this function. A non-zero return indicates that the
- * remove operation will be called and that memory should be unpinned.
- * However, the driver cannot unpin out from under PSM. Instead, retain the
- * memory (by returning 0) and inform PSM that the memory is going away. PSM
- * will call back later when it has removed the memory from its list.
- */
-static int tid_rb_invalidate(void *arg, struct mmu_rb_node *mnode)
+static bool tid_rb_invalidate(struct mmu_range_notifier *mrn,
+			      const struct mmu_notifier_range *range,
+			      unsigned long cur_seq)
 {
-	struct hfi1_filedata *fdata = arg;
-	struct hfi1_ctxtdata *uctxt = fdata->uctxt;
 	struct tid_rb_node *node =
-		container_of(mnode, struct tid_rb_node, mmu);
+		container_of(mrn, struct tid_rb_node, notifier);
+	struct hfi1_filedata *fdata = node->fdata;
+	struct hfi1_ctxtdata *uctxt = fdata->uctxt;
 
 	if (node->freed)
-		return 0;
+		return true;
 
-	trace_hfi1_exp_tid_inval(uctxt->ctxt, fdata->subctxt, node->mmu.addr,
+	trace_hfi1_exp_tid_inval(uctxt->ctxt, fdata->subctxt,
+				 node->notifier.interval_tree.start,
 				 node->rcventry, node->npages, node->dma_addr);
 	node->freed = true;
 
@@ -946,18 +932,7 @@ static int tid_rb_invalidate(void *arg, struct mmu_rb_node *mnode)
 		fdata->invalid_tid_idx++;
 	}
 	spin_unlock(&fdata->invalid_lock);
-	return 0;
-}
-
-static int tid_rb_insert(void *arg, struct mmu_rb_node *node)
-{
-	struct hfi1_filedata *fdata = arg;
-	struct tid_rb_node *tnode =
-		container_of(node, struct tid_rb_node, mmu);
-	u32 base = fdata->uctxt->expected_base;
-
-	fdata->entry_to_rb[tnode->rcventry - base] = tnode;
-	return 0;
+	return true;
 }
 
 static void cacheless_tid_rb_remove(struct hfi1_filedata *fdata,
@@ -968,12 +943,3 @@ static void cacheless_tid_rb_remove(struct hfi1_filedata *fdata,
 	fdata->entry_to_rb[tnode->rcventry - base] = NULL;
 	clear_tid_node(fdata, tnode);
 }
-
-static void tid_rb_remove(void *arg, struct mmu_rb_node *node)
-{
-	struct hfi1_filedata *fdata = arg;
-	struct tid_rb_node *tnode =
-		container_of(node, struct tid_rb_node, mmu);
-
-	cacheless_tid_rb_remove(fdata, tnode);
-}
diff --git a/drivers/infiniband/hw/hfi1/user_exp_rcv.h b/drivers/infiniband/hw/hfi1/user_exp_rcv.h
index 43b105de1d5427..b5314db083b125 100644
--- a/drivers/infiniband/hw/hfi1/user_exp_rcv.h
+++ b/drivers/infiniband/hw/hfi1/user_exp_rcv.h
@@ -65,7 +65,8 @@ struct tid_user_buf {
 };
 
 struct tid_rb_node {
-	struct mmu_rb_node mmu;
+	struct mmu_range_notifier notifier;
+	struct hfi1_filedata *fdata;
 	unsigned long phys;
 	struct tid_group *grp;
 	u32 rcventry;
-- 
2.23.0