From: Yongji Xie
Date: Fri, 26 Mar 2021 14:56:05 +0800
Subject: Re: [PATCH v5 08/11] vduse: Implement an MMU-based IOMMU driver
To: Jason Wang
Cc: "Michael S. Tsirkin", Stefan Hajnoczi, Stefano Garzarella, Parav Pandit,
    Christoph Hellwig, Randy Dunlap, Matthew Wilcox, viro@zeniv.linux.org.uk,
    Jens Axboe, bcrl@kvack.org, Jonathan Corbet, Mika Penttilä, Dan Carpenter,
    virtualization@lists.linux-foundation.org, netdev@vger.kernel.org,
    kvm@vger.kernel.org, linux-fsdevel@vger.kernel.org
In-Reply-To: <75e3b941-dfd2-ebbc-d752-8f25c1f14cab@redhat.com>

On Fri, Mar 26, 2021 at 2:16 PM Jason Wang wrote:
>
> On 2021/3/26 at 1:14 PM, Yongji Xie wrote:
>
> +       }
> +       map->bounce_page = page;
> +
> +       /* paired with vduse_domain_map_page() */
> +       smp_mb();
>
> So this is suspicious. It's better to explain it like: we need to make
> sure A must be done after B.
>
> OK. I see. It's used to protect this pattern:
>
>     vduse_domain_alloc_bounce_page:        vduse_domain_map_page:
>     write map->bounce_page                 write map->orig_phys
>     mb()                                   mb()
>     read map->orig_phys                    read map->bounce_page
>
> Make sure there will always be a path to do the bouncing.
>
> Ok.
>
> And it looks to me the iotlb_lock is sufficient to do the
> synchronization here. E.g. is there any reason that you don't take it
> in vduse_domain_map_bounce_page()?
>
> Yes, we can. But the performance in multi-queue cases will go down if
> we use iotlb_lock on this critical path.
>
> And what's more, is there any way to avoid holding the spinlock during
> bouncing?
>
> Looks like we can't. In the case that multiple page faults happen on
> the same page, we should make sure the bouncing is done before any
> page fault handler returns.
>
> So it looks to me all this extra complexity comes from the fact that
> bounce_page and orig_phys are set in different places, so we need to
> do the bouncing in two places.
>
> I wonder how much we can gain from the "lazy" bouncing in the page
> fault handler. The buffer mapped via dma_ops from the virtio driver is
> expected to be accessed by userspace soon. It looks to me we can do
> all of that during dma_map(); then things would be greatly simplified.
>
> If so, we need to allocate lots of pages from the pool reserved for
> atomic memory allocation requests.
>
> This should be fine; a lot of drivers try to allocate pages in atomic
> context. The point is to simplify the code and make it easy to
> determine its correctness, so we can add optimizations on top simply
> by benchmarking the difference.
>
> OK. I will use this way in the next version.
>
> E.g. we have several places that access orig_phys:
>
> 1) map_page(), write
> 2) unmap_page(), write
> 3) page fault handler, read
>
> It's not clear to me how they are synchronized. Or, if it is
> synchronized implicitly (via the iova allocator?), we'd better
> document it.
>
> Yes.
>
> Or simply use a spinlock (which is the preferable way I'd like to go).
> We probably don't need to worry too much about the cost of the
> spinlock since the iova allocator uses it heavily.
>
> Actually the iova allocator implements a per-CPU cache to optimize it.
>
> Thanks,
> Yongji
>
> Right, but from a quick glance, I guess what you meant is that usually
> there's no lock contention except during CPU hot-plug. This can work,
> but the problem is that such synchronization depends on the internal
> implementation of the IOVA allocator, which is kind of fragile. I
> still think we should do that on our own.

I might be missing something, but it looks like we don't need any
synchronization if the page fault handler is removed as you suggested.
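Roughly something like the sketch below is what I have in mind for doing
the bouncing at dma_map() time. This is only an illustration that ignores
highmem and partial-page details; alloc_iova_from_domain() and
to_bounce_map() are made-up helper names, not the real code:

static dma_addr_t vduse_domain_map_page(struct vduse_iova_domain *domain,
                                        struct page *page,
                                        unsigned long offset, size_t size,
                                        enum dma_data_direction dir)
{
        /* Hypothetical helpers, for illustration only. */
        unsigned long iova = alloc_iova_from_domain(domain, size);
        struct vduse_bounce_map *map = to_bounce_map(domain, iova);

        map->orig_phys = page_to_phys(page) + offset;
        if (!map->bounce_page)
                /* May be called from atomic context, hence GFP_ATOMIC. */
                map->bounce_page = alloc_page(GFP_ATOMIC);

        /* Bounce eagerly at map time instead of lazily in a page fault. */
        if (map->bounce_page &&
            (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL))
                memcpy(page_address(map->bounce_page),
                       page_address(page) + offset, size);

        return (dma_addr_t)iova;
}

The unmap path would do the reverse copy for DMA_FROM_DEVICE and then free
the iova. With the bouncing done eagerly here, the smp_mb() pairing and the
fault-time copy can go away.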
We should not access the same orig_phys concurrently (in map_page() and
unmap_page()) unless we free the iova before accessing it.

Thanks,
Yongji