From mboxrd@z Thu Jan 1 00:00:00 1970 Return-path: Received: from mail-wm0-f43.google.com ([74.125.82.43]:52256 "EHLO mail-wm0-f43.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751269AbeCTRsB (ORCPT ); Tue, 20 Mar 2018 13:48:01 -0400 Received: by mail-wm0-f43.google.com with SMTP id l9so5138943wmh.2 for ; Tue, 20 Mar 2018 10:48:00 -0700 (PDT) Reply-To: christian.koenig@amd.com Subject: Re: [Linaro-mm-sig] [PATCH 1/5] dma-buf: add optional invalidate_mappings callback v2 To: Daniel Vetter , =?UTF-8?Q?Christian_K=c3=b6nig?= Cc: "moderated list:DMA BUFFER SHARING FRAMEWORK" , Daniel Vetter , dri-devel , amd-gfx list , "open list:DMA BUFFER SHARING FRAMEWORK" References: <20180316132049.1748-1-christian.koenig@amd.com> <20180316132049.1748-2-christian.koenig@amd.com> <152120831102.25315.4326885184264378830@mail.alporthouse.com> <21879456-db47-589c-b5e2-dfe8333d9e4c@gmail.com> <152147480241.18954.4556582215766884582@mail.alporthouse.com> <0bd85f69-c64c-70d1-a4a0-10ae0ed8b4e8@gmail.com> <19ed21a5-805d-271f-9120-49e0c00f510f@amd.com> <20180320140810.GU14155@phenom.ffwll.local> From: =?UTF-8?Q?Christian_K=c3=b6nig?= Message-ID: <37ba7394-2a5c-a0bc-cc51-c8a0edc2991d@gmail.com> Date: Tue, 20 Mar 2018 18:47:57 +0100 MIME-Version: 1.0 In-Reply-To: <20180320140810.GU14155@phenom.ffwll.local> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Content-Language: en-US Sender: linux-media-owner@vger.kernel.org List-ID: Am 20.03.2018 um 15:08 schrieb Daniel Vetter: > [SNIP] > For the in-driver reservation path (CS) having a slow-path that grabs a > temporary reference, drops the vram lock and then locks the reservation > normally (using the acquire context used already for the entire CS) is a > bit tricky, but totally feasible. Ttm doesn't do that though. That is exactly what we do in amdgpu as well, it's just not very efficient nor reliable to retry getting the right pages for a submission over and over again. > [SNIP] > Note that there are 2 paths for i915 userptr. One is the mmu notifier, the > other one is the root-only hack we have for dubious reasons (or that I > really don't see the point in myself). Well I'm referring to i915_gem_userptr.c, if that isn't what you are exposing then just feel free to ignore this whole discussion. >> For coherent usage you need to install some lock to prevent concurrent >> get_user_pages(), command submission and >> invalidate_range_start/invalidate_range_end from the MMU notifier. >> >> Otherwise you can't guarantee that you are actually accessing the right page >> in the case of a fork() or mprotect(). > Yeah doing that with a full lock will create endless amounts of issues, > but I don't see why we need that. Userspace racing stuff with itself gets > to keep all the pieces. This is like racing DIRECT_IO against mprotect and > fork. First of all I strongly disagree on that. A thread calling fork() because it wants to run a command is not something we can forbid just because we have a gfx stack loaded. That the video driver is not capable of handling that correct is certainly not the problem of userspace. Second it's not only userspace racing here, you can get into this kind of issues just because of transparent huge page support where the background daemon tries to reallocate the page tables into bigger chunks. And if I'm not completely mistaken you can also open up quite a bunch of security problems if you suddenly access the wrong page. > Leaking the IOMMU mappings otoh means rogue userspace could do a bunch of > stray writes (I don't see anywhere code in amdgpu_mn.c to unmap at least > the gpu side PTEs to make stuff inaccessible) and wreak the core kernel's > book-keeping. > > In i915 we guarantee that we call set_page_dirty/mark_page_accessed only > after all the mappings are really gone (both GPU PTEs and sg mapping), > guaranteeing that any stray writes from either the GPU or IOMMU will > result in faults (except bugs in the IOMMU, but can't have it all, "IOMMU > actually works" is an assumption behind device isolation). Well exactly that's the point, the handling in i915 looks incorrect to me. You need to call set_page_dirty/mark_page_accessed way before the mapping is destroyed. To be more precise for userptrs it must be called from the invalidate_range_start, but i915 seems to delegate everything into a background worker to avoid the locking problems. >> Felix and I hammered for quite some time on amdgpu until all of this was >> handled correctly, see drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c. > Maybe we should have more shared code in this, it seems to be a source of > endless amounts of fun ... > >> I can try to gather the lockdep splat from my mail history, but it >> essentially took us multiple years to get rid of all of them. > I'm very much interested in specifically the splat that makes it > impossible for you folks to remove the sg mappings. That one sounds bad. > And would essentially make mmu_notifiers useless for their primary use > case, which is handling virtual machines where you really have to make > sure the IOMMU mapping is gone before you claim it's gone, because there's > no 2nd level of device checks (like GPU PTEs, or command checker) catching > stray writes. Well to be more precise the problem is not that we can't destroy the sg table, but rather that we can't grab the locks to do so. See when you need to destroy the sg table you usually need to grab the same lock you grabbed when you created it. And all locks taken while in an MMU notifier can only depend on memory allocation with GFP_NOIO, which is not really feasible for gfx drivers. Regards, Christian. From mboxrd@z Thu Jan 1 00:00:00 1970 From: =?UTF-8?Q?Christian_K=c3=b6nig?= Subject: Re: [Linaro-mm-sig] [PATCH 1/5] dma-buf: add optional invalidate_mappings callback v2 Date: Tue, 20 Mar 2018 18:47:57 +0100 Message-ID: <37ba7394-2a5c-a0bc-cc51-c8a0edc2991d@gmail.com> References: <20180316132049.1748-1-christian.koenig@amd.com> <20180316132049.1748-2-christian.koenig@amd.com> <152120831102.25315.4326885184264378830@mail.alporthouse.com> <21879456-db47-589c-b5e2-dfe8333d9e4c@gmail.com> <152147480241.18954.4556582215766884582@mail.alporthouse.com> <0bd85f69-c64c-70d1-a4a0-10ae0ed8b4e8@gmail.com> <19ed21a5-805d-271f-9120-49e0c00f510f@amd.com> <20180320140810.GU14155@phenom.ffwll.local> Reply-To: christian.koenig-5C7GfCeVMHo@public.gmane.org Mime-Version: 1.0 Content-Type: text/plain; charset="utf-8"; Format="flowed" Content-Transfer-Encoding: base64 Return-path: In-Reply-To: <20180320140810.GU14155-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org> Content-Language: en-US List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: amd-gfx-bounces-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org Sender: "amd-gfx" To: Daniel Vetter , =?UTF-8?Q?Christian_K=c3=b6nig?= Cc: "moderated list:DMA BUFFER SHARING FRAMEWORK" , Daniel Vetter , amd-gfx list , dri-devel , "open list:DMA BUFFER SHARING FRAMEWORK" List-Id: dri-devel@lists.freedesktop.org QW0gMjAuMDMuMjAxOCB1bSAxNTowOCBzY2hyaWViIERhbmllbCBWZXR0ZXI6Cj4gW1NOSVBdCj4g Rm9yIHRoZSBpbi1kcml2ZXIgcmVzZXJ2YXRpb24gcGF0aCAoQ1MpIGhhdmluZyBhIHNsb3ctcGF0 aCB0aGF0IGdyYWJzIGEKPiB0ZW1wb3JhcnkgcmVmZXJlbmNlLCBkcm9wcyB0aGUgdnJhbSBsb2Nr IGFuZCB0aGVuIGxvY2tzIHRoZSByZXNlcnZhdGlvbgo+IG5vcm1hbGx5ICh1c2luZyB0aGUgYWNx dWlyZSBjb250ZXh0IHVzZWQgYWxyZWFkeSBmb3IgdGhlIGVudGlyZSBDUykgaXMgYQo+IGJpdCB0 cmlja3ksIGJ1dCB0b3RhbGx5IGZlYXNpYmxlLiBUdG0gZG9lc24ndCBkbyB0aGF0IHRob3VnaC4K ClRoYXQgaXMgZXhhY3RseSB3aGF0IHdlIGRvIGluIGFtZGdwdSBhcyB3ZWxsLCBpdCdzIGp1c3Qg bm90IHZlcnkgCmVmZmljaWVudCBub3IgcmVsaWFibGUgdG8gcmV0cnkgZ2V0dGluZyB0aGUgcmln aHQgcGFnZXMgZm9yIGEgc3VibWlzc2lvbiAKb3ZlciBhbmQgb3ZlciBhZ2Fpbi4KCj4gW1NOSVBd Cj4gTm90ZSB0aGF0IHRoZXJlIGFyZSAyIHBhdGhzIGZvciBpOTE1IHVzZXJwdHIuIE9uZSBpcyB0 aGUgbW11IG5vdGlmaWVyLCB0aGUKPiBvdGhlciBvbmUgaXMgdGhlIHJvb3Qtb25seSBoYWNrIHdl IGhhdmUgZm9yIGR1YmlvdXMgcmVhc29ucyAob3IgdGhhdCBJCj4gcmVhbGx5IGRvbid0IHNlZSB0 aGUgcG9pbnQgaW4gbXlzZWxmKS4KCldlbGwgSSdtIHJlZmVycmluZyB0byBpOTE1X2dlbV91c2Vy cHRyLmMsIGlmIHRoYXQgaXNuJ3Qgd2hhdCB5b3UgYXJlIApleHBvc2luZyB0aGVuIGp1c3QgZmVl bCBmcmVlIHRvIGlnbm9yZSB0aGlzIHdob2xlIGRpc2N1c3Npb24uCgo+PiBGb3IgY29oZXJlbnQg dXNhZ2UgeW91IG5lZWQgdG8gaW5zdGFsbCBzb21lIGxvY2sgdG8gcHJldmVudCBjb25jdXJyZW50 Cj4+IGdldF91c2VyX3BhZ2VzKCksIGNvbW1hbmQgc3VibWlzc2lvbiBhbmQKPj4gaW52YWxpZGF0 ZV9yYW5nZV9zdGFydC9pbnZhbGlkYXRlX3JhbmdlX2VuZCBmcm9tIHRoZSBNTVUgbm90aWZpZXIu Cj4+Cj4+IE90aGVyd2lzZSB5b3UgY2FuJ3QgZ3VhcmFudGVlIHRoYXQgeW91IGFyZSBhY3R1YWxs eSBhY2Nlc3NpbmcgdGhlIHJpZ2h0IHBhZ2UKPj4gaW4gdGhlIGNhc2Ugb2YgYSBmb3JrKCkgb3Ig bXByb3RlY3QoKS4KPiBZZWFoIGRvaW5nIHRoYXQgd2l0aCBhIGZ1bGwgbG9jayB3aWxsIGNyZWF0 ZSBlbmRsZXNzIGFtb3VudHMgb2YgaXNzdWVzLAo+IGJ1dCBJIGRvbid0IHNlZSB3aHkgd2UgbmVl ZCB0aGF0LiBVc2Vyc3BhY2UgcmFjaW5nIHN0dWZmIHdpdGggaXRzZWxmIGdldHMKPiB0byBrZWVw IGFsbCB0aGUgcGllY2VzLiBUaGlzIGlzIGxpa2UgcmFjaW5nIERJUkVDVF9JTyBhZ2FpbnN0IG1w cm90ZWN0IGFuZAo+IGZvcmsuCgpGaXJzdCBvZiBhbGwgSSBzdHJvbmdseSBkaXNhZ3JlZSBvbiB0 aGF0LiBBIHRocmVhZCBjYWxsaW5nIGZvcmsoKSAKYmVjYXVzZSBpdCB3YW50cyB0byBydW4gYSBj b21tYW5kIGlzIG5vdCBzb21ldGhpbmcgd2UgY2FuIGZvcmJpZCBqdXN0IApiZWNhdXNlIHdlIGhh dmUgYSBnZnggc3RhY2sgbG9hZGVkLiBUaGF0IHRoZSB2aWRlbyBkcml2ZXIgaXMgbm90IGNhcGFi bGUgCm9mIGhhbmRsaW5nIHRoYXQgY29ycmVjdCBpcyBjZXJ0YWlubHkgbm90IHRoZSBwcm9ibGVt IG9mIHVzZXJzcGFjZS4KClNlY29uZCBpdCdzIG5vdCBvbmx5IHVzZXJzcGFjZSByYWNpbmcgaGVy ZSwgeW91IGNhbiBnZXQgaW50byB0aGlzIGtpbmQgCm9mIGlzc3VlcyBqdXN0IGJlY2F1c2Ugb2Yg dHJhbnNwYXJlbnQgaHVnZSBwYWdlIHN1cHBvcnQgd2hlcmUgdGhlIApiYWNrZ3JvdW5kIGRhZW1v biB0cmllcyB0byByZWFsbG9jYXRlIHRoZSBwYWdlIHRhYmxlcyBpbnRvIGJpZ2dlciBjaHVua3Mu CgpBbmQgaWYgSSdtIG5vdCBjb21wbGV0ZWx5IG1pc3Rha2VuIHlvdSBjYW4gYWxzbyBvcGVuIHVw IHF1aXRlIGEgYnVuY2ggb2YgCnNlY3VyaXR5IHByb2JsZW1zIGlmIHlvdSBzdWRkZW5seSBhY2Nl c3MgdGhlIHdyb25nIHBhZ2UuCgo+IExlYWtpbmcgdGhlIElPTU1VIG1hcHBpbmdzIG90b2ggbWVh bnMgcm9ndWUgdXNlcnNwYWNlIGNvdWxkIGRvIGEgYnVuY2ggb2YKPiBzdHJheSB3cml0ZXMgKEkg ZG9uJ3Qgc2VlIGFueXdoZXJlIGNvZGUgaW4gYW1kZ3B1X21uLmMgdG8gdW5tYXAgYXQgbGVhc3QK PiB0aGUgZ3B1IHNpZGUgUFRFcyB0byBtYWtlIHN0dWZmIGluYWNjZXNzaWJsZSkgYW5kIHdyZWFr IHRoZSBjb3JlIGtlcm5lbCdzCj4gYm9vay1rZWVwaW5nLgo+Cj4gSW4gaTkxNSB3ZSBndWFyYW50 ZWUgdGhhdCB3ZSBjYWxsIHNldF9wYWdlX2RpcnR5L21hcmtfcGFnZV9hY2Nlc3NlZCBvbmx5Cj4g YWZ0ZXIgYWxsIHRoZSBtYXBwaW5ncyBhcmUgcmVhbGx5IGdvbmUgKGJvdGggR1BVIFBURXMgYW5k IHNnIG1hcHBpbmcpLAo+IGd1YXJhbnRlZWluZyB0aGF0IGFueSBzdHJheSB3cml0ZXMgZnJvbSBl aXRoZXIgdGhlIEdQVSBvciBJT01NVSB3aWxsCj4gcmVzdWx0IGluIGZhdWx0cyAoZXhjZXB0IGJ1 Z3MgaW4gdGhlIElPTU1VLCBidXQgY2FuJ3QgaGF2ZSBpdCBhbGwsICJJT01NVQo+IGFjdHVhbGx5 IHdvcmtzIiBpcyBhbiBhc3N1bXB0aW9uIGJlaGluZCBkZXZpY2UgaXNvbGF0aW9uKS4KV2VsbCBl eGFjdGx5IHRoYXQncyB0aGUgcG9pbnQsIHRoZSBoYW5kbGluZyBpbiBpOTE1IGxvb2tzIGluY29y cmVjdCB0byAKbWUuIFlvdSBuZWVkIHRvIGNhbGwgc2V0X3BhZ2VfZGlydHkvbWFya19wYWdlX2Fj Y2Vzc2VkIHdheSBiZWZvcmUgdGhlIAptYXBwaW5nIGlzIGRlc3Ryb3llZC4KClRvIGJlIG1vcmUg cHJlY2lzZSBmb3IgdXNlcnB0cnMgaXQgbXVzdCBiZSBjYWxsZWQgZnJvbSB0aGUgCmludmFsaWRh dGVfcmFuZ2Vfc3RhcnQsIGJ1dCBpOTE1IHNlZW1zIHRvIGRlbGVnYXRlIGV2ZXJ5dGhpbmcgaW50 byBhIApiYWNrZ3JvdW5kIHdvcmtlciB0byBhdm9pZCB0aGUgbG9ja2luZyBwcm9ibGVtcy4KCj4+ IEZlbGl4IGFuZCBJIGhhbW1lcmVkIGZvciBxdWl0ZSBzb21lIHRpbWUgb24gYW1kZ3B1IHVudGls IGFsbCBvZiB0aGlzIHdhcwo+PiBoYW5kbGVkIGNvcnJlY3RseSwgc2VlIGRyaXZlcnMvZ3B1L2Ry bS9hbWQvYW1kZ3B1L2FtZGdwdV9tbi5jLgo+IE1heWJlIHdlIHNob3VsZCBoYXZlIG1vcmUgc2hh cmVkIGNvZGUgaW4gdGhpcywgaXQgc2VlbXMgdG8gYmUgYSBzb3VyY2Ugb2YKPiBlbmRsZXNzIGFt b3VudHMgb2YgZnVuIC4uLgo+Cj4+IEkgY2FuIHRyeSB0byBnYXRoZXIgdGhlIGxvY2tkZXAgc3Bs YXQgZnJvbSBteSBtYWlsIGhpc3RvcnksIGJ1dCBpdAo+PiBlc3NlbnRpYWxseSB0b29rIHVzIG11 bHRpcGxlIHllYXJzIHRvIGdldCByaWQgb2YgYWxsIG9mIHRoZW0uCj4gSSdtIHZlcnkgbXVjaCBp bnRlcmVzdGVkIGluIHNwZWNpZmljYWxseSB0aGUgc3BsYXQgdGhhdCBtYWtlcyBpdAo+IGltcG9z c2libGUgZm9yIHlvdSBmb2xrcyB0byByZW1vdmUgdGhlIHNnIG1hcHBpbmdzLiBUaGF0IG9uZSBz b3VuZHMgYmFkLgo+IEFuZCB3b3VsZCBlc3NlbnRpYWxseSBtYWtlIG1tdV9ub3RpZmllcnMgdXNl bGVzcyBmb3IgdGhlaXIgcHJpbWFyeSB1c2UKPiBjYXNlLCB3aGljaCBpcyBoYW5kbGluZyB2aXJ0 dWFsIG1hY2hpbmVzIHdoZXJlIHlvdSByZWFsbHkgaGF2ZSB0byBtYWtlCj4gc3VyZSB0aGUgSU9N TVUgbWFwcGluZyBpcyBnb25lIGJlZm9yZSB5b3UgY2xhaW0gaXQncyBnb25lLCBiZWNhdXNlIHRo ZXJlJ3MKPiBubyAybmQgbGV2ZWwgb2YgZGV2aWNlIGNoZWNrcyAobGlrZSBHUFUgUFRFcywgb3Ig Y29tbWFuZCBjaGVja2VyKSBjYXRjaGluZwo+IHN0cmF5IHdyaXRlcy4KCldlbGwgdG8gYmUgbW9y ZSBwcmVjaXNlIHRoZSBwcm9ibGVtIGlzIG5vdCB0aGF0IHdlIGNhbid0IGRlc3Ryb3kgdGhlIHNn IAp0YWJsZSwgYnV0IHJhdGhlciB0aGF0IHdlIGNhbid0IGdyYWIgdGhlIGxvY2tzIHRvIGRvIHNv LgoKU2VlIHdoZW4geW91IG5lZWQgdG8gZGVzdHJveSB0aGUgc2cgdGFibGUgeW91IHVzdWFsbHkg bmVlZCB0byBncmFiIHRoZSAKc2FtZSBsb2NrIHlvdSBncmFiYmVkIHdoZW4geW91IGNyZWF0ZWQg aXQuCgpBbmQgYWxsIGxvY2tzIHRha2VuIHdoaWxlIGluIGFuIE1NVSBub3RpZmllciBjYW4gb25s eSBkZXBlbmQgb24gbWVtb3J5IAphbGxvY2F0aW9uIHdpdGggR0ZQX05PSU8sIHdoaWNoIGlzIG5v dCByZWFsbHkgZmVhc2libGUgZm9yIGdmeCBkcml2ZXJzLgoKUmVnYXJkcywKQ2hyaXN0aWFuLgpf X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fXwphbWQtZ2Z4IG1h aWxpbmcgbGlzdAphbWQtZ2Z4QGxpc3RzLmZyZWVkZXNrdG9wLm9yZwpodHRwczovL2xpc3RzLmZy ZWVkZXNrdG9wLm9yZy9tYWlsbWFuL2xpc3RpbmZvL2FtZC1nZngK