From: Christian König
Reply-To: christian.koenig@amd.com
Subject: Re: [PATCH hmm 00/15] Consolidate the mmu notifier interval_tree and locking
To: Jason Gunthorpe, "christian.koenig@amd.com"
Cc: Andrea Arcangeli, Ralph Campbell, "linux-rdma@vger.kernel.org",
 John Hubbard, "Felix.Kuehling@amd.com", "amd-gfx@lists.freedesktop.org",
 "linux-mm@kvack.org", Jerome Glisse, "dri-devel@lists.freedesktop.org",
 Ben Skeggs
Date: Thu, 17 Oct 2019 10:54:43 +0200
Message-ID: <2df298e2-ee91-ef40-5da9-2bc1af3a17be@gmail.com>
In-Reply-To: <20191016160444.GB3430@mellanox.com>
References: <20191015181242.8343-1-jgg@ziepe.ca> <20191016160444.GB3430@mellanox.com>
X-Mailing-List: linux-rdma@vger.kernel.org

On 16.10.19 at 18:04, Jason Gunthorpe wrote:
> On Wed, Oct 16, 2019 at 10:58:02AM +0200, Christian König wrote:
>> On 15.10.19 at 20:12, Jason Gunthorpe wrote:
>>> From: Jason Gunthorpe
>>>
>>> 8 of the mmu_notifier using drivers (i915_gem, radeon_mn, umem_odp, hfi1,
>>> scif_dma, vhost, gntdev, hmm) are using a common pattern where
>>> they only use invalidate_range_start/end and immediately check the
>>> invalidating range against some driver data structure to tell if the
>>> driver is interested. Half of them use an interval_tree, the others are
>>> simple linear search lists.
>>>
>>> Of the ones I checked they largely seem to have various kinds of races,
>>> bugs and poor implementation.
>>> This is a result of the complexity in how
>>> the notifier interacts with get_user_pages(). It is extremely difficult to
>>> use it correctly.
>>>
>>> Consolidate all of this code together into the core mmu_notifier and
>>> provide a locking scheme similar to hmm_mirror that allows the user to
>>> safely use get_user_pages() and reliably know if the page list still
>>> matches the mm.
>> That sounds really good, but could you outline for a moment how that is
>> achieved?
> It uses the same basic scheme as hmm and rdma odp, outlined in the
> revisions to hmm.rst later on.
>
> Basically,
>
>   seq = mmu_range_read_begin(&mrn);
>
>   // This is a speculative region
>   .. get_user_pages()/hmm_range_fault() ..

How do we enforce that this get_user_pages()/hmm_range_fault() doesn't
see outdated page table information?

In other words, how is the following race prevented:

CPU A                           CPU B
invalidate_range_start()
                                mmu_range_read_begin()
                                get_user_pages()/hmm_range_fault()
Updating the ptes
invalidate_range_end()

I mean get_user_pages() tries to circumvent this issue by grabbing a
reference to the pages in question, but that isn't sufficient for the
SVM use case.

That's the reason why we had this horrible solution with a r/w lock and
a linked list of BOs in an interval tree.

Regards,
Christian.

>   // Result cannot be dereferenced
>
>   take_lock(driver->update);
>   if (mmu_range_read_retry(&mrn, range.notifier_seq)) {
>      // collision! The results are not correct
>      goto again
>   }
>
>   // no collision, and now under lock. Now we can de-reference the pages/etc
>   // program HW
>   // Now the invalidate callback is responsible to synchronize against changes
>   unlock(driver->update)
>
> Basically, anything that was using hmm_mirror correctly transitions
> over fairly trivially, just with the modification to store a sequence
> number to close that race described in the hmm commit.
>
> For something like AMD gpu I expect it to transition to use dma_fence
> from the notifier for coherency right before it unlocks driver->update.
>
> Jason
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
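
The sequence-count scheme quoted in this message, and the race asked
about in the reply, can be sketched as a small user-space model. This is
only an illustration under stated assumptions. It is single-threaded,
with CPU A's actions scripted by hand through a callback, so it says
nothing about real locking or memory ordering. Every name in it
(range_notifier, read_begin, read_retry, try_map, finish_invalidation,
and so on) is hypothetical and not a kernel API. Treating an odd
sequence value as "invalidation in flight" is also an assumption of this
sketch, not something the thread states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct range_notifier {
	unsigned long seq;	/* assumption: odd while an invalidation is in flight */
};

struct driver {
	struct range_notifier mrn;
	int page_table_value;	/* stands in for the CPU page tables */
	int hw_mapping;		/* stands in for what the device was programmed with */
};

/* CPU A side: the three steps from the race diagram. */
static void invalidate_start(struct driver *d)
{
	d->mrn.seq++;		/* even -> odd */
}

static void update_ptes(struct driver *d, int new_value)
{
	d->page_table_value = new_value;
}

static void invalidate_end(struct driver *d)
{
	assert(d->mrn.seq & 1);	/* must pair with invalidate_start() */
	d->mrn.seq++;		/* odd -> even */
}

/* CPU B side: sample the sequence before the speculative region. */
static unsigned long read_begin(const struct range_notifier *mrn)
{
	return mrn->seq;
}

/* True if the sample was taken mid-invalidation or the sequence moved since. */
static bool read_retry(const struct range_notifier *mrn, unsigned long seq)
{
	return (seq & 1) || mrn->seq != seq;
}

typedef void (*interleave_fn)(struct driver *d);

/*
 * One attempt at CPU B's sequence. The optional callback runs inside the
 * speculative region to script what CPU A does "concurrently".
 * Returns true if the device mapping was programmed.
 */
static bool try_map(struct driver *d, interleave_fn during_speculation)
{
	unsigned long seq = read_begin(&d->mrn);
	int snapshot = d->page_table_value;	/* get_user_pages() stand-in */

	if (during_speculation)
		during_speculation(d);

	/* A real driver would take its update lock here. */
	if (read_retry(&d->mrn, seq))
		return false;		/* collision: snapshot may be stale */

	d->hw_mapping = snapshot;	/* "program HW" */
	/* ...and drop the update lock here. */
	return true;
}

/* Scripted remainder of CPU A's work: update the ptes, then end. */
static void finish_invalidation(struct driver *d)
{
	update_ptes(d, 2);
	invalidate_end(d);
}
```

To replay the race from the diagram, call invalidate_start() first, then
try_map(&d, finish_invalidation). The sample is taken while the sequence
is odd, so the attempt is rejected and the stale snapshot never reaches
hw_mapping. A second try_map(&d, NULL) after the invalidation has ended
succeeds and picks up the updated value. In this model the reader simply
retries; whether a real implementation retries or waits at the begin
step is left open here.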