From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-qg0-f45.google.com (mail-qg0-f45.google.com [209.85.192.45]) by kanga.kvack.org (Postfix) with ESMTP id 0F7CF6B0253 for ; Thu, 28 Jan 2016 12:57:46 -0500 (EST)
Received: by mail-qg0-f45.google.com with SMTP id e32so45692417qgf.3 for ; Thu, 28 Jan 2016 09:57:46 -0800 (PST)
Received: from mail-qg0-x232.google.com (mail-qg0-x232.google.com. [2607:f8b0:400d:c04::232]) by mx.google.com with ESMTPS id 69si13250982qgg.2.2016.01.28.09.57.45 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 28 Jan 2016 09:57:45 -0800 (PST)
Received: by mail-qg0-x232.google.com with SMTP id b35so45377188qge.0 for ; Thu, 28 Jan 2016 09:57:45 -0800 (PST)
Date: Thu, 28 Jan 2016 18:55:37 +0100
From: Jerome Glisse
Subject: [LSF/MM ATTEND] HMM (heterogeneous memory manager) and GPU
Message-ID: <20160128175536.GA20797@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
Sender: owner-linux-mm@kvack.org
List-ID:
To: lsf-pc@lists.linux-foundation.org, "linux-mm@kvack.org"

Hi,

I would like to attend LSF/MM this year to discuss HMM
(Heterogeneous Memory Manager) and, more generally, all topics
related to GPU and heterogeneous memory architecture (including
persistent memory).

I want to discuss how to move forward with HMM merging, and I
hope that by MM summit time I will be able to share more
information publicly on devices which rely on HMM.

Jerome Glisse

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-wm0-f46.google.com (mail-wm0-f46.google.com [74.125.82.46]) by kanga.kvack.org (Postfix) with ESMTP id 08A986B0009 for ; Fri, 29 Jan 2016 04:50:32 -0500 (EST)
Received: by mail-wm0-f46.google.com with SMTP id r129so60690900wmr.0 for ; Fri, 29 Jan 2016 01:50:31 -0800 (PST)
Received: from mail-wm0-x233.google.com (mail-wm0-x233.google.com. [2a00:1450:400c:c09::233]) by mx.google.com with ESMTPS id dz12si21021354wjb.180.2016.01.29.01.50.30 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 29 Jan 2016 01:50:30 -0800 (PST)
Received: by mail-wm0-x233.google.com with SMTP id r129so60690127wmr.0 for ; Fri, 29 Jan 2016 01:50:30 -0800 (PST)
Date: Fri, 29 Jan 2016 11:50:28 +0200
From: "Kirill A. Shutemov"
Subject: Re: [LSF/MM ATTEND] HMM (heterogeneous memory manager) and GPU
Message-ID: <20160129095028.GA10767@node.shutemov.name>
References: <20160128175536.GA20797@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20160128175536.GA20797@gmail.com>
Sender: owner-linux-mm@kvack.org
List-ID:
To: Jerome Glisse
Cc: lsf-pc@lists.linux-foundation.org, "linux-mm@kvack.org"

On Thu, Jan 28, 2016 at 06:55:37PM +0100, Jerome Glisse wrote:
> Hi,
>
> I would like to attend LSF/MM this year to discuss HMM
> (Heterogeneous Memory Manager) and, more generally, all topics
> related to GPU and heterogeneous memory architecture (including
> persistent memory).

How is persistent memory heterogeneous?

I thought it's either in the same cache coherency domain (DAX case) or is
not memory for the kernel -- behind the block layer.
Do we have yet another option?

-- 
 Kirill A. Shutemov
From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-wm0-f47.google.com (mail-wm0-f47.google.com [74.125.82.47]) by kanga.kvack.org (Postfix) with ESMTP id A9EE66B0009 for ; Fri, 29 Jan 2016 08:35:47 -0500 (EST)
Received: by mail-wm0-f47.google.com with SMTP id l66so54789532wml.0 for ; Fri, 29 Jan 2016 05:35:47 -0800 (PST)
Received: from mail-wm0-x22d.google.com (mail-wm0-x22d.google.com. [2a00:1450:400c:c09::22d]) by mx.google.com with ESMTPS id w77si11099336wme.5.2016.01.29.05.35.46 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 29 Jan 2016 05:35:46 -0800 (PST)
Received: by mail-wm0-x22d.google.com with SMTP id p63so68452958wmp.1 for ; Fri, 29 Jan 2016 05:35:46 -0800 (PST)
Date: Fri, 29 Jan 2016 14:35:38 +0100
From: Jerome Glisse
Subject: Re: [LSF/MM ATTEND] HMM (heterogeneous memory manager) and GPU
Message-ID: <20160129133537.GA26044@gmail.com>
References: <20160128175536.GA20797@gmail.com> <20160129095028.GA10767@node.shutemov.name>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <20160129095028.GA10767@node.shutemov.name>
Sender: owner-linux-mm@kvack.org
List-ID:
To: "Kirill A. Shutemov"
Cc: lsf-pc@lists.linux-foundation.org, "linux-mm@kvack.org"

On Fri, Jan 29, 2016 at 11:50:28AM +0200, Kirill A. Shutemov wrote:
> On Thu, Jan 28, 2016 at 06:55:37PM +0100, Jerome Glisse wrote:
> > Hi,
> >
> > I would like to attend LSF/MM this year to discuss HMM
> > (Heterogeneous Memory Manager) and, more generally, all topics
> > related to GPU and heterogeneous memory architecture (including
> > persistent memory).
>
> How is persistent memory heterogeneous?
>
> I thought it's either in the same cache coherency domain (DAX case) or is
> not memory for the kernel -- behind the block layer.
> Do we have yet another option?

Right now it is not, but I am interested in the DMA mapping issue.
But from what I have seen on roadmaps, we are heading toward a world
with a deeper memory hierarchy: very fast cache near the CPU in the GB
range, regular memory like DDR, and slower persistent (or similar)
memory with enormous capacity.

On top of this you have things like GPU memory (which is my main topic
of interest) and other similar things like FPGA. GPUs are not going
away; GPU memory bandwidth is in the TB/s range, and on GPU roadmaps
the gap with CPU memory bandwidth keeps getting bigger.

So I believe this hierarchy of memory adds a layer of complexity on top
of NUMA. The technology is not ready, but it might be worth discussing
it and seeing if there is anything to do on top of NUMA.

Also note that things like GPU memory can be either visible or
invisible from the CPU's point of view; moreover, it can be cache
coherent or not. Though the latter is only enabled through specific
APIs where the application is aware that it loses cache coherency with
the CPU.

Cheers,
Jerome

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-qg0-f44.google.com (mail-qg0-f44.google.com [209.85.192.44]) by kanga.kvack.org (Postfix) with ESMTP id 4D71D6B0005 for ; Mon, 1 Feb 2016 10:46:11 -0500 (EST)
Received: by mail-qg0-f44.google.com with SMTP id b35so120673436qge.0 for ; Mon, 01 Feb 2016 07:46:11 -0800 (PST)
Received: from e38.co.us.ibm.com (e38.co.us.ibm.com. [32.97.110.159]) by mx.google.com with ESMTPS id b84si31959130qhd.120.2016.02.01.07.46.09 for (version=TLS1 cipher=AES128-SHA bits=128/128); Mon, 01 Feb 2016 07:46:10 -0800 (PST)
Received: from localhost by e38.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only!
	Violators will be prosecuted for from ; Mon, 1 Feb 2016 08:46:09 -0700
Received: from b03cxnp08027.gho.boulder.ibm.com (b03cxnp08027.gho.boulder.ibm.com [9.17.130.19]) by d03dlp02.boulder.ibm.com (Postfix) with ESMTP id 93AF13E4003F for ; Mon, 1 Feb 2016 08:46:06 -0700 (MST)
Received: from d03av03.boulder.ibm.com (d03av03.boulder.ibm.com [9.17.195.169]) by b03cxnp08027.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id u11Fk67Z29688002 for ; Mon, 1 Feb 2016 08:46:06 -0700
Received: from d03av03.boulder.ibm.com (localhost [127.0.0.1]) by d03av03.boulder.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id u11Fk60r008798 for ; Mon, 1 Feb 2016 08:46:06 -0700
From: "Aneesh Kumar K.V"
Subject: Re: [LSF/MM ATTEND] HMM (heterogeneous memory manager) and GPU
In-Reply-To: <20160128175536.GA20797@gmail.com>
References: <20160128175536.GA20797@gmail.com>
Date: Mon, 01 Feb 2016 21:16:02 +0530
Message-ID: <87bn805t8l.fsf@linux.vnet.ibm.com>
MIME-Version: 1.0
Content-Type: text/plain
Sender: owner-linux-mm@kvack.org
List-ID:
To: Jerome Glisse, lsf-pc@lists.linux-foundation.org, "linux-mm@kvack.org"

Jerome Glisse writes:

> Hi,
>
> I would like to attend LSF/MM this year to discuss HMM
> (Heterogeneous Memory Manager) and, more generally, all topics
> related to GPU and heterogeneous memory architecture (including
> persistent memory).
>
> I want to discuss how to move forward with HMM merging, and I
> hope that by MM summit time I will be able to share more
> information publicly on devices which rely on HMM.
>

As I mentioned in my request-to-attend mail, I would like to attend this
discussion. I am wondering whether we can split the series further into
the mmu_notifier bits and the page table mirroring bits. Can the
mmu_notifier changes go in early so that we can merge the page table
mirroring later? Can the page table mirroring bits be built as a kernel
module?

-aneesh
From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-qg0-f44.google.com (mail-qg0-f44.google.com [209.85.192.44]) by kanga.kvack.org (Postfix) with ESMTP id DD32C6B0005 for ; Tue, 2 Feb 2016 18:03:26 -0500 (EST)
Received: by mail-qg0-f44.google.com with SMTP id b35so3206525qge.0 for ; Tue, 02 Feb 2016 15:03:26 -0800 (PST)
Received: from mail-qg0-x231.google.com (mail-qg0-x231.google.com. [2607:f8b0:400d:c04::231]) by mx.google.com with ESMTPS id z2si3148336qkg.60.2016.02.02.15.03.26 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 02 Feb 2016 15:03:26 -0800 (PST)
Received: by mail-qg0-x231.google.com with SMTP id e32so3112618qgf.3 for ; Tue, 02 Feb 2016 15:03:26 -0800 (PST)
Date: Wed, 3 Feb 2016 00:03:16 +0100
From: Jerome Glisse
Subject: Re: [LSF/MM ATTEND] HMM (heterogeneous memory manager) and GPU
Message-ID: <20160202230314.GA5183@gmail.com>
References: <20160128175536.GA20797@gmail.com> <87bn805t8l.fsf@linux.vnet.ibm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <87bn805t8l.fsf@linux.vnet.ibm.com>
Sender: owner-linux-mm@kvack.org
List-ID:
To: "Aneesh Kumar K.V"
Cc: lsf-pc@lists.linux-foundation.org, "linux-mm@kvack.org"

On Mon, Feb 01, 2016 at 09:16:02PM +0530, Aneesh Kumar K.V wrote:
> Jerome Glisse writes:
>
> > Hi,
> >
> > I would like to attend LSF/MM this year to discuss HMM
> > (Heterogeneous Memory Manager) and, more generally, all topics
> > related to GPU and heterogeneous memory architecture (including
> > persistent memory).
> >
> > I want to discuss how to move forward with HMM merging, and I
> > hope that by MM summit time I will be able to share more
> > information publicly on devices which rely on HMM.
> >
>
> As I mentioned in my request-to-attend mail, I would like to attend this
> discussion.
> I am wondering whether we can split the series further into
> the mmu_notifier bits and the page table mirroring bits. Can the
> mmu_notifier changes go in early so that we can merge the page table
> mirroring later?

Well, the mmu_notifier bits can be upstreamed on their own, but they
would not be useful by themselves. Maybe on the KVM side; I need to
investigate.

> Can the page table mirroring bits be built as a kernel module?

Well, I am not sure this is a good idea. Memory migration requires
hooking into the page fault code path, and it relies on the mirrored
page table to service faults on memory that has been migrated.

Jerome

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-pf0-f178.google.com (mail-pf0-f178.google.com [209.85.192.178]) by kanga.kvack.org (Postfix) with ESMTP id 10B846B0005 for ; Tue, 2 Feb 2016 19:41:03 -0500 (EST)
Received: by mail-pf0-f178.google.com with SMTP id n128so3210240pfn.3 for ; Tue, 02 Feb 2016 16:41:03 -0800 (PST)
Received: from bombadil.infradead.org (bombadil.infradead.org.
	[2001:1868:205::9]) by mx.google.com with ESMTPS id v16si5094613pfa.129.2016.02.02.16.41.01 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 02 Feb 2016 16:41:02 -0800 (PST)
Message-ID: <1454460057.4788.117.camel@infradead.org>
Subject: Re: [LSF/MM ATTEND] HMM (heterogeneous memory manager) and GPU
From: David Woodhouse
Date: Wed, 03 Feb 2016 00:40:57 +0000
In-Reply-To: <20160128175536.GA20797@gmail.com>
References: <20160128175536.GA20797@gmail.com>
Content-Type: multipart/signed; micalg="sha-1"; protocol="application/x-pkcs7-signature"; boundary="=-ZMKB05rXY+bxHZZBUqO0"
Mime-Version: 1.0
Sender: owner-linux-mm@kvack.org
List-ID:
To: Jerome Glisse, lsf-pc@lists.linux-foundation.org, "linux-mm@kvack.org"
Cc: joro@8bytes.org

--=-ZMKB05rXY+bxHZZBUqO0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

On Thu, 2016-01-28 at 18:55 +0100, Jerome Glisse wrote:
>
> I would like to attend LSF/MM this year to discuss HMM
> (Heterogeneous Memory Manager) and, more generally, all topics
> related to GPU and heterogeneous memory architecture (including
> persistent memory).
>
> I want to discuss how to move forward with HMM merging, and I
> hope that by MM summit time I will be able to share more
> information publicly on devices which rely on HMM.

There are a few related issues here around Shared Virtual Memory, and
lifetime management of the associated MM, and the proposal discussed at
the Kernel Summit for "off-CPU tasks".

I've hit a situation with the Intel SVM code in 4.4 where the device
driver binds a PASID, and also has mmap() functionality on the same
file descriptor that the PASID is associated with.

So on process exit, the MM doesn't die because the PASID binding still
exists. The VMA of the mmap doesn't die because the MM still exists. So
the underlying file remains open because the VMA still exists. And the
PASID binding thus doesn't die because the file is still open.
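
The cycle described above can be sketched as a small user-space toy
model in C. This is only an illustration of the four dependencies as
stated in the text, not kernel code: toy_state, toy_settle and
toy_exit_leaks are hypothetical names, and the model assumes that
process exit drops the process's own hold on the MM and on the file
descriptor.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model (hypothetical, not kernel code): an object stays alive
 * for as long as something else still pins it. */
struct toy_state {
	bool process_alive; /* the process pins the MM and its open fd */
	bool mm_alive;
	bool vma_alive;     /* the mmap() of the device file */
	bool file_open;
	bool pasid_bound;
};

/* Apply the four dependencies from the text until nothing changes. */
static void toy_settle(struct toy_state *s)
{
	bool changed = true;

	while (changed) {
		changed = false;
		/* The MM dies once neither the process nor a PASID binding pins it. */
		if (s->mm_alive && !s->process_alive && !s->pasid_bound) {
			s->mm_alive = false;
			changed = true;
		}
		/* The VMA is torn down together with the MM. */
		if (s->vma_alive && !s->mm_alive) {
			s->vma_alive = false;
			changed = true;
		}
		/* The file closes once neither the fd table nor a VMA pins it. */
		if (s->file_open && !s->process_alive && !s->vma_alive) {
			s->file_open = false;
			changed = true;
		}
		/* The PASID binding is dropped when the file is released. */
		if (s->pasid_bound && !s->file_open) {
			s->pasid_bound = false;
			changed = true;
		}
	}
}

/* Returns true if the MM is still alive after the process has exited. */
bool toy_exit_leaks(bool pasid_bound, bool has_mmap)
{
	struct toy_state s = {
		.process_alive = false, /* the process has just exited */
		.mm_alive = true,
		.vma_alive = has_mmap,
		.file_open = true,
		.pasid_bound = pasid_bound,
	};

	toy_settle(&s);
	assert(!s.vma_alive || s.mm_alive); /* a VMA never outlives its MM */
	return s.mm_alive;
}
```

In this model toy_exit_leaks(true, true) reports a leak, while removing
either the PASID binding or the mmap() lets everything be torn down,
which matches the observation that it is the combination of the two on
one file descriptor that keeps the MM alive.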

I've posted a patch¹ which moves us closer to the amd_iommu_v2 model,
although I'm still *strongly* resisting the temptation to call out into
device driver code from the mmu_notifier's release callback.

I would like to attend LSF/MM this year so we can continue to work on
those issues, now that we actually have some hardware in the field and
a better idea of how we can build a unified access model for SVM across
the different IOMMU types.

-- 
David Woodhouse                            Open Source Technology Centre
David.Woodhouse@intel.com                              Intel Corporation

¹ http://www.spinics.net/lists/linux-mm/msg100230.html

--=-ZMKB05rXY+bxHZZBUqO0--

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-wm0-f44.google.com (mail-wm0-f44.google.com [74.125.82.44]) by kanga.kvack.org (Postfix) with ESMTP id 3142E6B0005 for ; Wed, 3 Feb 2016 03:13:38 -0500 (EST)
Received: by mail-wm0-f44.google.com with SMTP id l66so151402294wml.0 for ; Wed, 03 Feb 2016 00:13:38 -0800 (PST)
Received: from mail-wm0-x22e.google.com (mail-wm0-x22e.google.com.
	[2a00:1450:400c:c09::22e]) by mx.google.com with ESMTPS id y190si28797097wme.93.2016.02.03.00.13.37 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 03 Feb 2016 00:13:37 -0800 (PST)
Received: by mail-wm0-x22e.google.com with SMTP id 128so152997813wmz.1 for ; Wed, 03 Feb 2016 00:13:37 -0800 (PST)
MIME-Version: 1.0
In-Reply-To: <1454460057.4788.117.camel@infradead.org>
References: <20160128175536.GA20797@gmail.com> <1454460057.4788.117.camel@infradead.org>
From: Oded Gabbay
Date: Wed, 3 Feb 2016 10:13:07 +0200
Message-ID:
Subject: Re: [LSF/MM ATTEND] HMM (heterogeneous memory manager) and GPU
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Sender: owner-linux-mm@kvack.org
List-ID:
To: David Woodhouse
Cc: Jerome Glisse, lsf-pc@lists.linux-foundation.org, "linux-mm@kvack.org", Joerg Roedel

On Wed, Feb 3, 2016 at 2:40 AM, David Woodhouse wrote:
> On Thu, 2016-01-28 at 18:55 +0100, Jerome Glisse wrote:
>>
>> I would like to attend LSF/MM this year to discuss HMM
>> (Heterogeneous Memory Manager) and, more generally, all topics
>> related to GPU and heterogeneous memory architecture (including
>> persistent memory).
>>
>> I want to discuss how to move forward with HMM merging, and I
>> hope that by MM summit time I will be able to share more
>> information publicly on devices which rely on HMM.
>
> There are a few related issues here around Shared Virtual Memory, and
> lifetime management of the associated MM, and the proposal discussed at
> the Kernel Summit for "off-CPU tasks".
>
> I've hit a situation with the Intel SVM code in 4.4 where the device
> driver binds a PASID, and also has mmap() functionality on the same
> file descriptor that the PASID is associated with.
>
> So on process exit, the MM doesn't die because the PASID binding still
> exists. The VMA of the mmap doesn't die because the MM still exists. So
> the underlying file remains open because the VMA still exists.
> And the PASID binding thus doesn't die because the file is still open.
>

Why connect the PASID to the FD in the first place?
Why not tie everything to the MM?

> I've posted a patch¹ which moves us closer to the amd_iommu_v2 model,
> although I'm still *strongly* resisting the temptation to call out into
> device driver code from the mmu_notifier's release callback.

You mean you are resisting doing this (taken from amdkfd):

--------------
static const struct mmu_notifier_ops kfd_process_mmu_notifier_ops = {
	.release = kfd_process_notifier_release,
};

process->mmu_notifier.ops = &kfd_process_mmu_notifier_ops;
-----------

Why, if I may ask?

    Oded

>
> I would like to attend LSF/MM this year so we can continue to work on
> those issues, now that we actually have some hardware in the field and
> a better idea of how we can build a unified access model for SVM across
> the different IOMMU types.
>
> --
> David Woodhouse                            Open Source Technology Centre
> David.Woodhouse@intel.com                              Intel Corporation
>
>
> ¹ http://www.spinics.net/lists/linux-mm/msg100230.html

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-pa0-f44.google.com (mail-pa0-f44.google.com [209.85.220.44]) by kanga.kvack.org (Postfix) with ESMTP id 1D6186B0005 for ; Wed, 3 Feb 2016 03:40:58 -0500 (EST)
Received: by mail-pa0-f44.google.com with SMTP id ho8so9971193pac.2 for ; Wed, 03 Feb 2016 00:40:58 -0800 (PST)
Received: from bombadil.infradead.org (bombadil.infradead.org.
	[2001:1868:205::9]) by mx.google.com with ESMTPS id tc5si7838902pab.176.2016.02.03.00.40.57 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 03 Feb 2016 00:40:57 -0800 (PST)
Message-ID: <1454488853.4788.142.camel@infradead.org>
Subject: Re: [LSF/MM ATTEND] HMM (heterogeneous memory manager) and GPU
From: David Woodhouse
Date: Wed, 03 Feb 2016 08:40:53 +0000
In-Reply-To:
References: <20160128175536.GA20797@gmail.com> <1454460057.4788.117.camel@infradead.org>
Content-Type: multipart/signed; micalg="sha-1"; protocol="application/x-pkcs7-signature"; boundary="=-gxnqeKZlpm2pRfnVHCKF"
Mime-Version: 1.0
Sender: owner-linux-mm@kvack.org
List-ID:
To: Oded Gabbay
Cc: Jerome Glisse, lsf-pc@lists.linux-foundation.org, "linux-mm@kvack.org", Joerg Roedel

--=-gxnqeKZlpm2pRfnVHCKF
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

On Wed, 2016-02-03 at 10:13 +0200, Oded Gabbay wrote:
>
> > So on process exit, the MM doesn't die because the PASID binding still
> > exists. The VMA of the mmap doesn't die because the MM still exists. So
> > the underlying file remains open because the VMA still exists. And the
> > PASID binding thus doesn't die because the file is still open.
> >
> Why connect the PASID to the FD in the first place ?
> Why not tie everything to the MM ?

That's actually a question for the device driver in question, of
course; it's not the generic SVM support code which chooses *when* to
bind/unbind PASIDs. We just provide those functions for the driver to
call.

But the answer is that that's the normal resource tracking model.
Resources hang off the file and are cleared up when the file is closed.
(And exit_files() is called later than exit_mm()).

> > I've posted a patch¹ which moves us closer to the amd_iommu_v2 model,
> > although I'm still *strongly* resisting the temptation to call out into
> > device driver code from the mmu_notifier's release callback.
>
> You mean you are resisting doing this (taken from amdkfd):
>
> --------------
> static const struct mmu_notifier_ops kfd_process_mmu_notifier_ops = {
> 	.release = kfd_process_notifier_release,
> };
>
> process->mmu_notifier.ops = &kfd_process_mmu_notifier_ops;
> -----------
>
> Why, if I may ask ?

The KISS principle, especially as it relates to device drivers. We just
Do Not Want random device drivers being called in that context.

It's OK for amdkfd where you have sufficient clue to deal with it; it's
more than "just a device driver". But when we get discrete devices with
PASID support (and the required TLP prefix support in our root ports at
last!) we're going to see SVM supported in many more device drivers,
and we should make it simple.

Having the mmu_notifier release callback exposed to drivers is going to
strongly encourage them to do the WRONG thing, because they need to
interact with their hardware and *wait* for the PASID to be entirely
retired through the pipeline before they tell the IOMMU to flush it.

The patch at http://www.spinics.net/lists/linux-mm/msg100230.html
addresses this by clearing the PASID from the PASID table (in core
IOMMU code) when the process exits so that all subsequent accesses to
that PASID then take faults. The device driver can then clean up its
binding for that PASID in its own time.

It is a fairly fundamental rule that faulting access to *one* PASID
should not adversely affect behaviour for *other* PASIDs, of course.
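
The approach described above can be sketched as a user-space toy model
in C. It is an illustration under stated assumptions, not the code from
the patch: the table layout and the names toy_bind, toy_mm_exit,
toy_device_access, toy_driver_unbind and toy_driver_bound are all
hypothetical. The point it shows is that core code only clears the
table entry at process exit, so later accesses with that PASID fault,
while the driver's own binding state is released separately and other
PASIDs are unaffected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_PASID 8

/* Hypothetical per-PASID state; not the real IOMMU PASID table layout. */
struct toy_pasid_entry {
	bool present;      /* entry is visible to the simulated IOMMU */
	bool driver_bound; /* driver still holds its own binding state */
};

static struct toy_pasid_entry toy_table[TOY_MAX_PASID];

/* Driver binds a PASID: the entry becomes usable by the device. */
void toy_bind(size_t pasid)
{
	assert(pasid < TOY_MAX_PASID);
	toy_table[pasid].present = true;
	toy_table[pasid].driver_bound = true;
}

/* Core code at process exit: clear the entry, never call the driver. */
void toy_mm_exit(size_t pasid)
{
	assert(pasid < TOY_MAX_PASID);
	toy_table[pasid].present = false;
}

/* Simulated device access: true if translated, false if it faults. */
bool toy_device_access(size_t pasid)
{
	assert(pasid < TOY_MAX_PASID);
	return toy_table[pasid].present;
}

/* The driver cleans up its binding later, in its own time. */
void toy_driver_unbind(size_t pasid)
{
	assert(pasid < TOY_MAX_PASID);
	toy_table[pasid].present = false;
	toy_table[pasid].driver_bound = false;
}

/* Reports whether the driver still holds binding state for a PASID. */
bool toy_driver_bound(size_t pasid)
{
	assert(pasid < TOY_MAX_PASID);
	return toy_table[pasid].driver_bound;
}
```

With two PASIDs bound, calling toy_mm_exit() on one makes accesses to
it fault while accesses to the other still succeed, and the driver's
binding for the exited PASID stays in place until toy_driver_unbind()
is called.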
-- 
David Woodhouse                            Open Source Technology Centre
David.Woodhouse@intel.com                              Intel Corporation

--=-gxnqeKZlpm2pRfnVHCKF--

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-wm0-f53.google.com (mail-wm0-f53.google.com [74.125.82.53])
    by kanga.kvack.org (Postfix) with ESMTP id 050B66B0005
    for ; Wed, 3 Feb 2016 04:21:40 -0500 (EST)
Received: by mail-wm0-f53.google.com with SMTP id l66so60414887wml.0
    for ; Wed, 03 Feb 2016 01:21:39 -0800 (PST)
Received: from mail-wm0-x233.google.com (mail-wm0-x233.google.com. [2a00:1450:400c:c09::233])
    by mx.google.com with ESMTPS id ju2si8605131wjb.192.2016.02.03.01.21.38
    for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
    Wed, 03 Feb 2016 01:21:38 -0800 (PST)
Received: by mail-wm0-x233.google.com with SMTP id l66so153900200wml.0
    for ; Wed, 03 Feb 2016 01:21:38 -0800 (PST)
MIME-Version: 1.0
In-Reply-To: <1454488853.4788.142.camel@infradead.org>
References: <20160128175536.GA20797@gmail.com>
    <1454460057.4788.117.camel@infradead.org>
    <1454488853.4788.142.camel@infradead.org>
From: Oded Gabbay
Date: Wed, 3 Feb 2016 11:21:08 +0200
Message-ID:
Subject: Re: [LSF/MM ATTEND] HMM (heterogeneous memory manager) and GPU
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Sender: owner-linux-mm@kvack.org
List-ID:
To: David Woodhouse
Cc: Jerome Glisse, lsf-pc@lists.linux-foundation.org,
    "linux-mm@kvack.org", Joerg Roedel

On Wed, Feb 3, 2016 at 10:40 AM, David Woodhouse wrote:
> On Wed, 2016-02-03 at 10:13 +0200, Oded Gabbay wrote:
>>
>> > So on process exit, the MM doesn't die because the PASID binding still
>> > exists.
>> > The VMA of the mmap doesn't die because the MM still exists. So
>> > the underlying file remains open because the VMA still exists. And the
>> > PASID binding thus doesn't die because the file is still open.
>> >
>> Why connect the PASID to the FD in the first place?
>> Why not tie everything to the MM?
>
> That's actually a question for the device driver in question, of
> course; it's not the generic SVM support code which chooses *when* to
> bind/unbind PASIDs. We just provide those functions for the driver to
> call.
>
> But the answer is that that's the normal resource tracking model.
> Resources hang off the file and are cleared up when the file is closed.
>
> (And exit_files() is called later than exit_mm()).
>
>> > I've posted a patch¹ which moves us closer to the amd_iommu_v2 model,
>> > although I'm still *strongly* resisting the temptation to call out into
>> > device driver code from the mmu_notifier's release callback.
>>
>> You mean you are resisting doing this (taken from amdkfd):
>>
>> --------------
>> static const struct mmu_notifier_ops kfd_process_mmu_notifier_ops = {
>>     .release = kfd_process_notifier_release,
>> };
>>
>> process->mmu_notifier.ops = &kfd_process_mmu_notifier_ops;
>> -----------
>>
>> Why, if I may ask?
>
> The KISS principle, especially as it relates to device drivers.
> We just Do Not Want random device drivers being called in that context.
>
> It's OK for amdkfd where you have sufficient clue to deal with it;
> it's more than "just a device driver".
>
> But when we get discrete devices with PASID support (and the required
> TLP prefix support in our root ports at last!) we're going to see SVM
> supported in many more device drivers, and we should make it simple.
>
> Having the mmu_notifier release callback exposed to drivers is going to
> strongly encourage them to do the WRONG thing, because they need to
> interact with their hardware and *wait* for the PASID to be entirely
> retired through the pipeline before they tell the IOMMU to flush it.
>
> The patch at http://www.spinics.net/lists/linux-mm/msg100230.html
> addresses this by clearing the PASID from the PASID table (in core
> IOMMU code) when the process exits so that all subsequent accesses to
> that PASID then take faults. The device driver can then clean up its
> binding for that PASID in its own time.

OK, so I think I got a little confused, but looking at your code I
see that you register SVM for the mm notifier (intel_mm_release),
therefore I guess what you meant to say is that you don't want to call a
device driver callback from your mm notifier callback, correct? (like
the amd_iommu_v2 does when it calls ev_state->inv_ctx_cb inside its
mn_release)

Because you can't really control what the device driver will do, i.e.
if it decides to register itself with the mm notifier in its own code.

And because you don't call the device driver, the driver can/will get
errors for using this PASID (since you unbound it) and the device
driver is supposed to handle it. Did I understand that correctly?

If I understood it correctly, doesn't it confuse an error/fault
with normal unbinding? Wouldn't it be better to actively notify them and
indeed *wait* until the device driver has cleared its H/W pipeline before
"pulling the carpet from under their feet"?

In our case (AMD GPUs), if we have such an error it could leave the GPU
stuck. That's why we even reset the wavefronts inside the GPU, if we
can't gracefully remove the work from the GPU (see
kfd_unbind_process_from_device).

In the patch's comment you wrote:
"Hardware designers have confirmed that the resulting 'PASID not present'
faults should be handled just as gracefully as 'page not present' faults"

Unless *all* the H/W that is going to use SVM is designed by the same
company, I don't think we can say such a thing. And even then, in my
experience, H/W designers can be "creative" sometimes.

Just my 2 cents.

Oded

>
> It is a fairly fundamental rule that faulting access to *one* PASID
> should not adversely affect behaviour for *other* PASIDs, of course.
>
> --
> David Woodhouse                            Open Source Technology Centre
> David.Woodhouse@intel.com                              Intel Corporation
>

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-pa0-f41.google.com (mail-pa0-f41.google.com [209.85.220.41])
    by kanga.kvack.org (Postfix) with ESMTP id 29B766B0005
    for ; Wed, 3 Feb 2016 05:15:13 -0500 (EST)
Received: by mail-pa0-f41.google.com with SMTP id uo6so11513441pac.1
    for ; Wed, 03 Feb 2016 02:15:13 -0800 (PST)
Received: from bombadil.infradead.org (bombadil.infradead.org.
    [2001:1868:205::9])
    by mx.google.com with ESMTPS id e3si8374034pas.149.2016.02.03.02.15.12
    for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
    Wed, 03 Feb 2016 02:15:12 -0800 (PST)
Message-ID: <1454494508.4788.154.camel@infradead.org>
Subject: Re: [LSF/MM ATTEND] HMM (heterogeneous memory manager) and GPU
From: David Woodhouse
Date: Wed, 03 Feb 2016 10:15:08 +0000
In-Reply-To:
References: <20160128175536.GA20797@gmail.com>
    <1454460057.4788.117.camel@infradead.org>
    <1454488853.4788.142.camel@infradead.org>
Content-Type: multipart/signed; micalg="sha-1";
    protocol="application/x-pkcs7-signature";
    boundary="=-QXtkjhYtl8t4xEgGRGLb"
Mime-Version: 1.0
Sender: owner-linux-mm@kvack.org
List-ID:
To: Oded Gabbay
Cc: Jerome Glisse, lsf-pc@lists.linux-foundation.org,
    "linux-mm@kvack.org", Joerg Roedel

--=-QXtkjhYtl8t4xEgGRGLb
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

On Wed, 2016-02-03 at 11:21 +0200, Oded Gabbay wrote:

> OK, so I think I got a little confused, but looking at your code I
> see that you register SVM for the mm notifier (intel_mm_release),
> therefore I guess what you meant to say is that you don't want to call a
> device driver callback from your mm notifier callback, correct? (like
> the amd_iommu_v2 does when it calls ev_state->inv_ctx_cb inside its
> mn_release)

Right.

> Because you can't really control what the device driver will do, i.e.
> if it decides to register itself with the mm notifier in its own code.

Right. I can't *prevent* them from doing it. But I don't need to
encourage or facilitate it :)

> And because you don't call the device driver, the driver can/will get
> errors for using this PASID (since you unbound it) and the device
> driver is supposed to handle it. Did I understand that correctly?

In the case of an unclean exit, yes. In an orderly shutdown of the
process, one would hope that the device context is relinquished cleanly
rather than the process simply exiting.

And yes, the device and its driver are expected to handle faults. If
they don't do that, they are broken :)

> If I understood it correctly, doesn't it confuse an error/fault
> with normal unbinding? Wouldn't it be better to actively notify them and
> indeed *wait* until the device driver has cleared its H/W pipeline before
> "pulling the carpet from under their feet"?
>
> In our case (AMD GPUs), if we have such an error it could leave the GPU
> stuck. That's why we even reset the wavefronts inside the GPU, if we
> can't gracefully remove the work from the GPU (see
> kfd_unbind_process_from_device).

But a rogue process can easily trigger faults: just request access to
an address that doesn't exist. My conversation with the hardware
designers was not about the peculiarities of any specific
implementation, but just getting them to confirm my assertion that if a
device *doesn't* cleanly handle faults on *one* PASID without screwing
over all the *other* PASIDs, then it is utterly broken by design and
should never get to production.

I *do* anticipate broken hardware which will crap itself completely
when it takes a fault, and have implemented a callback from the fault
handler so that the driver gets notified when a fault *happens* (even
on a PASID which is still alive), and can prod the broken hardware if
it needs to.

But I wasn't expecting it to be the norm.

> In the patch's comment you wrote:
> "Hardware designers have confirmed that the resulting 'PASID not present'
> faults should be handled just as gracefully as 'page not present' faults"
>
> Unless *all* the H/W that is going to use SVM is designed by the same
> company, I don't think we can say such a thing. And even then, in my
> experience, H/W designers can be "creative" sometimes.

If we have to turn it into a 'page not present' fault instead of a
'PASID not present' fault, that's easy enough to do by pointing it at a
dummy PML4 (the zero page will do).
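
The dummy-PML4 idea and the fault callback mentioned above can be illustrated with a small user-space model in C. It is a sketch only, under stated assumptions: it is not the SVM code from the patch, only the top-level table is modelled, and every name (sk_pml4, sk_pasid_slot, sk_fault_cb, sk_translate, sk_retire_to_dummy, sk_count_fault) is hypothetical. It shows how pointing a retired PASID at an all-zero top-level table changes the reported fault from 'PASID not present' to 'page not present', and how an optional driver hook could be told about each fault.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_PML4_ENTRIES 512
#define SK_PRESENT      1ull

enum sk_fault { SK_NO_FAULT, SK_PAGE_NOT_PRESENT, SK_PASID_NOT_PRESENT };

/* Toy top-level page table; bit 0 of an entry means "present". */
struct sk_pml4 {
	uint64_t entry[SK_PML4_ENTRIES];
};

/* Dummy table in which no entry is present (the "zero page"). */
static const struct sk_pml4 sk_dummy_pml4 = { { 0 } };

/* Toy PASID slot; a NULL root models "PASID not present". */
struct sk_pasid_slot {
	const struct sk_pml4 *root;
};

/* Optional driver hook called on every fault (hypothetical signature). */
typedef void (*sk_fault_cb)(unsigned int pasid, uint64_t addr,
			    enum sk_fault why);

static unsigned int sk_faults_seen;

/* Example hook that only counts the faults it is told about. */
static void sk_count_fault(unsigned int pasid, uint64_t addr,
			   enum sk_fault why)
{
	(void)pasid;
	(void)addr;
	(void)why;
	sk_faults_seen++;
}

/* Look up addr for a PASID and report which fault, if any, it takes. */
static enum sk_fault sk_translate(const struct sk_pasid_slot *slot,
				  unsigned int pasid, uint64_t addr,
				  sk_fault_cb cb)
{
	enum sk_fault why = SK_NO_FAULT;

	assert(slot != NULL);
	if (slot->root == NULL)
		why = SK_PASID_NOT_PRESENT;
	else if (!(slot->root->entry[(addr >> 39) & 0x1ff] & SK_PRESENT))
		why = SK_PAGE_NOT_PRESENT;	/* top-level entry missing */

	if (why != SK_NO_FAULT && cb != NULL)
		cb(pasid, addr, why);
	return why;
}

/* Retire a PASID by pointing it at the dummy table, not by clearing it. */
static void sk_retire_to_dummy(struct sk_pasid_slot *slot)
{
	assert(slot != NULL);
	slot->root = &sk_dummy_pml4;
}
```

With a slot retired through sk_retire_to_dummy(), every lookup reports SK_PAGE_NOT_PRESENT, so a device that copes with ordinary page faults never sees the 'PASID not present' case in this model.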

But I stand by my assertion that any hardware which doesn't handle at
least a 'page not present' fault in a given PASID without screwing over
all the other users of the hardware is BROKEN.

We could *almost* forgive hardware for stalling when it sees a 'PASID
not present' fault. Since that *does* require OS participation.

-- 
David Woodhouse                            Open Source Technology Centre
David.Woodhouse@intel.com                              Intel Corporation

--=-QXtkjhYtl8t4xEgGRGLb--

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-wm0-f53.google.com (mail-wm0-f53.google.com [74.125.82.53])
    by kanga.kvack.org (Postfix) with ESMTP id 4FC396B0005
    for ; Wed, 3 Feb 2016 06:02:21 -0500 (EST)
Received: by mail-wm0-f53.google.com with SMTP id l66so64429371wml.0
    for ; Wed, 03 Feb 2016 03:02:21 -0800 (PST)
Received: from mail-wm0-x234.google.com (mail-wm0-x234.google.com.
    [2a00:1450:400c:c09::234])
    by mx.google.com with ESMTPS id z11si9188574wjw.127.2016.02.03.03.02.19
    for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
    Wed, 03 Feb 2016 03:02:20 -0800 (PST)
Received: by mail-wm0-x234.google.com with SMTP id p63so64398330wmp.1
    for ; Wed, 03 Feb 2016 03:02:19 -0800 (PST)
MIME-Version: 1.0
In-Reply-To: <1454494508.4788.154.camel@infradead.org>
References: <20160128175536.GA20797@gmail.com>
    <1454460057.4788.117.camel@infradead.org>
    <1454488853.4788.142.camel@infradead.org>
    <1454494508.4788.154.camel@infradead.org>
From: Oded Gabbay
Date: Wed, 3 Feb 2016 13:01:49 +0200
Message-ID:
Subject: Re: [LSF/MM ATTEND] HMM (heterogeneous memory manager) and GPU
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Sender: owner-linux-mm@kvack.org
List-ID:
To: David Woodhouse
Cc: Jerome Glisse, lsf-pc@lists.linux-foundation.org,
    "linux-mm@kvack.org", Joerg Roedel

On Wed, Feb 3, 2016 at 12:15 PM, David Woodhouse wrote:
> On Wed, 2016-02-03 at 11:21 +0200, Oded Gabbay wrote:
>
>> OK, so I think I got a little confused, but looking at your code I
>> see that you register SVM for the mm notifier (intel_mm_release),
>> therefore I guess what you meant to say is that you don't want to call a
>> device driver callback from your mm notifier callback, correct? (like
>> the amd_iommu_v2 does when it calls ev_state->inv_ctx_cb inside its
>> mn_release)
>
> Right.
>
>> Because you can't really control what the device driver will do, i.e.
>> if it decides to register itself with the mm notifier in its own code.
>
> Right. I can't *prevent* them from doing it. But I don't need to
> encourage or facilitate it :)
>
>> And because you don't call the device driver, the driver can/will get
>> errors for using this PASID (since you unbound it) and the device
>> driver is supposed to handle it. Did I understand that correctly?
>
> In the case of an unclean exit, yes. In an orderly shutdown of the
> process, one would hope that the device context is relinquished cleanly
> rather than the process simply exiting.
>
> And yes, the device and its driver are expected to handle faults. If
> they don't do that, they are broken :)
>
>> If I understood it correctly, doesn't it confuse an error/fault
>> with normal unbinding? Wouldn't it be better to actively notify them and
>> indeed *wait* until the device driver has cleared its H/W pipeline before
>> "pulling the carpet from under their feet"?
>>
>> In our case (AMD GPUs), if we have such an error it could leave the GPU
>> stuck. That's why we even reset the wavefronts inside the GPU, if we
>> can't gracefully remove the work from the GPU (see
>> kfd_unbind_process_from_device).
>
> But a rogue process can easily trigger faults: just request access to
> an address that doesn't exist. My conversation with the hardware
> designers was not about the peculiarities of any specific
> implementation, but just getting them to confirm my assertion that if a
> device *doesn't* cleanly handle faults on *one* PASID without screwing
> over all the *other* PASIDs, then it is utterly broken by design and
> should never get to production.

Yes, that is agreed: address errors should not affect the H/W itself,
nor other processes.

>
> I *do* anticipate broken hardware which will crap itself completely
> when it takes a fault, and have implemented a callback from the fault
> handler so that the driver gets notified when a fault *happens* (even
> on a PASID which is still alive), and can prod the broken hardware if
> it needs to.
>
> But I wasn't expecting it to be the norm.
>

Yeah, I guess that after a few H/W iterations the "correct"
implementation will be the norm.

>> In the patch's comment you wrote:
>> "Hardware designers have confirmed that the resulting 'PASID not present'
>> faults should be handled just as gracefully as 'page not present' faults"
>>
>> Unless *all* the H/W that is going to use SVM is designed by the same
>> company, I don't think we can say such a thing. And even then, in my
>> experience, H/W designers can be "creative" sometimes.
>
> If we have to turn it into a 'page not present' fault instead of a
> 'PASID not present' fault, that's easy enough to do by pointing it at a
> dummy PML4 (the zero page will do).
>
> But I stand by my assertion that any hardware which doesn't handle at
> least a 'page not present' fault in a given PASID without screwing over
> all the other users of the hardware is BROKEN.

Totally agreed!

>
> We could *almost* forgive hardware for stalling when it sees a 'PASID
> not present' fault. Since that *does* require OS participation.
>
> --
> David Woodhouse                            Open Source Technology Centre
> David.Woodhouse@intel.com                              Intel Corporation
>

Another, perhaps trivial, question.
When there is an address fault, who handles it? The SVM driver, or
each device driver?

In other words, is the model the same as the (AMD) IOMMU, where the
amd_iommu driver binds to the IOMMU H/W and that driver (amd_iommu/v2)
is the only one which handles the PPR events?

If that is the case, then with SVM, how will the device driver be made
aware of faults, if the SVM driver won't notify it about them, because
it has already severed the connection between PASID and process?

If the model is that each device driver gets a direct fault
notification (via interrupt or some other way) then that is a different
story.

Oded
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wm0-f53.google.com (mail-wm0-f53.google.com [74.125.82.53]) by kanga.kvack.org (Postfix) with ESMTP id 156F56B0253 for ; Wed, 3 Feb 2016 06:07:39 -0500 (EST) Received: by mail-wm0-f53.google.com with SMTP id l66so158074026wml.0 for ; Wed, 03 Feb 2016 03:07:39 -0800 (PST) Received: from mail-wm0-x229.google.com (mail-wm0-x229.google.com. [2a00:1450:400c:c09::229]) by mx.google.com with ESMTPS id j139si29683798wmg.65.2016.02.03.03.07.37 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 03 Feb 2016 03:07:38 -0800 (PST) Received: by mail-wm0-x229.google.com with SMTP id l66so158073433wml.0 for ; Wed, 03 Feb 2016 03:07:37 -0800 (PST) MIME-Version: 1.0 In-Reply-To: References: <20160128175536.GA20797@gmail.com> <1454460057.4788.117.camel@infradead.org> <1454488853.4788.142.camel@infradead.org> <1454494508.4788.154.camel@infradead.org> From: Oded Gabbay Date: Wed, 3 Feb 2016 13:07:07 +0200 Message-ID: Subject: Re: [LSF/MM ATTEND] HMM (heterogeneous memory manager) and GPU Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Sender: owner-linux-mm@kvack.org List-ID: To: David Woodhouse Cc: Jerome Glisse , lsf-pc@lists.linux-foundation.org, "linux-mm@kvack.org" , Joerg Roedel On Wed, Feb 3, 2016 at 1:01 PM, Oded Gabbay wrote: > On Wed, Feb 3, 2016 at 12:15 PM, David Woodhouse wr= ote: >> On Wed, 2016-02-03 at 11:21 +0200, Oded Gabbay wrote: >> >>> OK, so I think I got confused up a little, but looking at your code I >>> see that you register SVM for the mm notifier (intel_mm_release), >>> therefore I guess what you meant to say you don't want to call a >>> device driver callback from your mm notifier callback, correct ? (like >>> the amd_iommu_v2 does when it calls ev_state->inv_ctx_cb inside its >>> mn_release) >> >> Right. >> >>> Because you can't really control what the device driver will do, i.e. 
>>> if it decides to register itself to the mm notifier in its own code.
>>
>> Right. I can't *prevent* them from doing it. But I don't need to
>> encourage or facilitate it :)
>>
>>> And because you don't call the device driver, the driver can/will get
>>> errors for using this PASID (since you unbinded it) and the device
>>> driver is supposed to handle it. Did I understood that correctly ?
>>
>> In the case of an unclean exit, yes. In an orderly shutdown of the
>> process, one would hope that the device context is relinquished cleanly
>> rather than the process simply exiting.
>>
>> And yes, the device and its driver are expected to handle faults. If
>> they don't do that, they are broken :)
>>
>>> If I understood it correctly, doesn't it confuses between error/fault
>>> and normal unbinding ? Won't it be better to actively notify them and
>>> indeed *wait* until the device driver cleared its H/W pipeline before
>>> "pulling the carpet under their feet" ?
>>>
>>> In our case (AMD GPUs), if we have such an error it could make the GPU
>>> stuck. That's why we even reset the wavefronts inside the GPU, if we
>>> can't gracefully remove the work from the GPU (see
>>> kfd_unbind_process_from_device)
>>
>> But a rogue process can easily trigger faults -- just request access to
>> an address that doesn't exist. My conversation with the hardware
>> designers was not about the peculiarities of any specific
>> implementation, but just getting them to confirm my assertion that if a
>> device *doesn't* cleanly handle faults on *one* PASID without screwing
>> over all the *other* PASIDs, then it is utterly broken by design and
>> should never get to production.
>
> Yes, that is agreed, address errors should not affect the H/W itself,
> nor other processes.
>
>>
>> I *do* anticipate broken hardware which will crap itself completely
>> when it takes a fault, and have implemented a callback from the fault
>> handler so that the driver gets notified when a fault *happens* (even
>> on a PASID which is still alive), and can prod the broken hardware if
>> it needs to.
>>
>> But I wasn't expecting it to be the norm.
>>
> Yeah, I guess that after a few H/W iterations the "correct"
> implementation will be the norm.
>
>>> In the patch's comment you wrote:
>>> "Hardware designers have confirmed that the resulting 'PASID not present'
>>> faults should be handled just as gracefully as 'page not present' faults"
>>>
>>> Unless *all* the H/W that is going to use SVM is designed by the same
>>> company, I don't think we can say such a thing. And even then, from my
>>> experience, H/W designers can be "creative" sometimes.
>>
>> If we have to turn it into a 'page not present' fault instead of a
>> 'PASID not present' fault, that's easy enough to do by pointing it at a
>> dummy PML4 (the zero page will do).
>>
>> But I stand by my assertion that any hardware which doesn't handle at
>> least a 'page not present' fault in a given PASID without screwing over
>> all the other users of the hardware is BROKEN.
>
> Totally agreed!
>
>>
>> We could *almost* forgive hardware for stalling when it sees a 'PASID
>> not present' fault. Since that *does* require OS participation.
>>
>> --
>> David Woodhouse Open Source Technology Centre
>> David.Woodhouse@intel.com Intel Corporation
>>
>
> Another, perhaps trivial, question.
> When there is an address fault, who handles it ? the SVM driver, or
> each device driver ?
>
> In other words, is the model the same as (AMD) IOMMU where it binds
> amd_iommu driver to the IOMMU H/W, and that driver (amd_iommu/v2) is
> the only one which handles the PPR events ?
>
> If that is the case, then with SVM, how will the device driver be made
> aware of faults, if the SVM driver won't notify him about them,
> because it has already severed the connection between PASID and
> process ?
>
> If the model is that each device driver gets a direct fault
> notification (via interrupt or some other way) then that is a
> different story.
>
> Oded

And another question, if I may, aren't you afraid of "false positive"
prints to dmesg ? I mean, I'm pretty sure page faults / pasid faults
errors will be logged somewhere, probably to dmesg. Aren't you
concerned of the users seeing those errors and thinking they may have
a bug, while actually the errors were only caused by process
termination ?

Or in that case you say that the application is broken, because if it
still had something running in the H/W, it should not have closed
itself ?
I can accept that, I just want to know what is our answer when people
will start to complain :)

Thanks,

Oded

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-pf0-f174.google.com (mail-pf0-f174.google.com [209.85.192.174]) by kanga.kvack.org (Postfix) with ESMTP id DB10B6B0005 for ; Wed, 3 Feb 2016 06:35:55 -0500 (EST)
Received: by mail-pf0-f174.google.com with SMTP id w123so12713089pfb.0 for ; Wed, 03 Feb 2016 03:35:55 -0800 (PST)
Received: from bombadil.infradead.org (bombadil.infradead.org.
[2001:1868:205::9]) by mx.google.com with ESMTPS id x72si3764304pfi.196.2016.02.03.03.35.55 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 03 Feb 2016 03:35:55 -0800 (PST)
Message-ID: <1454499350.4788.170.camel@infradead.org>
Subject: Re: [LSF/MM ATTEND] HMM (heterogeneous memory manager) and GPU
From: David Woodhouse
Date: Wed, 03 Feb 2016 11:35:50 +0000
In-Reply-To:
References: <20160128175536.GA20797@gmail.com> <1454460057.4788.117.camel@infradead.org> <1454488853.4788.142.camel@infradead.org> <1454494508.4788.154.camel@infradead.org>
Content-Type: multipart/signed; micalg="sha-1"; protocol="application/x-pkcs7-signature"; boundary="=-LVl6c2/rXDT6cGV14bp8"
Mime-Version: 1.0
Sender: owner-linux-mm@kvack.org
List-ID:
To: Oded Gabbay
Cc: Jerome Glisse , lsf-pc@lists.linux-foundation.org, "linux-mm@kvack.org" , Joerg Roedel

--=-LVl6c2/rXDT6cGV14bp8
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

On Wed, 2016-02-03 at 13:07 +0200, Oded Gabbay wrote:
> > Another, perhaps trivial, question.
> > When there is an address fault, who handles it ? the SVM driver, or
> > each device driver ?
> >
> > In other words, is the model the same as (AMD) IOMMU where it binds
> > amd_iommu driver to the IOMMU H/W, and that driver (amd_iommu/v2) is
> > the only one which handles the PPR events ?
> >
> > If that is the case, then with SVM, how will the device driver be made
> > aware of faults, if the SVM driver won't notify him about them,
> > because it has already severed the connection between PASID and
> > process ?

In the ideal case, there's no need for the device driver to get
involved at all. When a page isn't found in the page tables, the IOMMU
code calls handle_mm_fault() and either populates the page and sends a
'success' response, or sends an 'invalid fault' response back.

To account for broken hardware, we *have* added a callback into the
device driver when these faults happen.
Ideally it should never be used, of course.

In the case where the process has gone away, the PASID is still
assigned and we still hold mm_count on the MM, just not mm_users. This
callback into the device driver still occurs if a fault happens during
process exit between the exit_mm() and exit_files() stage.

> And another question, if I may, aren't you afraid of "false positive"
> prints to dmesg ? I mean, I'm pretty sure page faults / pasid faults
> errors will be logged somewhere, probably to dmesg. Aren't you
> concerned of the users seeing those errors and thinking they may have
> a bug, while actually the errors were only caused by process
> termination ?

If that's the case, it's easy enough to silence them. We are already
explicitly testing for the 'defunct mm' case in our fault handler, to
prevent us from faulting more pages into an obsolescent MM after its
mm_users reaches zero and its page tables are supposed to have been
torn down. That's the 'if(!atomic_inc_not_zero(&svm->mm->mm_users))
goto bad_req;' part.

> Or in that case you say that the application is broken, because if it
> still had something running in the H/W, it should not have closed
> itself ?

That's also true but it's still nice to avoid confusion. Even if only
to disambiguate cause and effect -- we don't want people to see PASID
faults which were caused by the process crashing, and to think that
they might be involved in *causing* that process to crash...
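
As a rough illustration of the 'defunct mm' test described above, here
is a small userspace model in C. It is only a sketch and not the kernel
code: struct toy_mm, toy_inc_not_zero() and toy_handle_page_request()
are made-up names, and C11 atomics stand in for the kernel's atomic_t
helpers. The assumption it encodes is the one stated in the text:
mm_count keeps the structure alive, while a page request is only
serviced if mm_users can still be raised from a non-zero value.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two reference counts on an mm. */
struct toy_mm {
    atomic_int mm_users;  /* users of the address space itself */
    atomic_int mm_count;  /* references keeping the structure alive */
};

/* Increment *v unless it is zero; true if the increment happened. */
static bool toy_inc_not_zero(atomic_int *v)
{
    int old = atomic_load(v);

    while (old != 0) {
        /* On failure, 'old' is refreshed with the current value. */
        if (atomic_compare_exchange_weak(v, &old, old + 1))
            return true;
    }
    return false;
}

enum toy_response { TOY_RESP_SUCCESS, TOY_RESP_INVALID };

/* Model of the fault path: refuse to touch an mm with no users left. */
static enum toy_response toy_handle_page_request(struct toy_mm *mm)
{
    assert(mm != NULL);

    if (!toy_inc_not_zero(&mm->mm_users))
        return TOY_RESP_INVALID;        /* the 'goto bad_req' case */

    /* A real handler would fault the page in at this point. */

    atomic_fetch_sub(&mm->mm_users, 1); /* drop the temporary reference */
    return TOY_RESP_SUCCESS;
}
```

With mm_users at 1 the request is answered with TOY_RESP_SUCCESS and
the count returns to 1; once mm_users has dropped to 0 every request
gets TOY_RESP_INVALID and the count is never resurrected, even though
mm_count may still be held.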

-- 
David Woodhouse Open Source Technology Centre
David.Woodhouse@intel.com Intel Corporation

--=-LVl6c2/rXDT6cGV14bp8--

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-pa0-f50.google.com (mail-pa0-f50.google.com [209.85.220.50]) by kanga.kvack.org (Postfix) with ESMTP id D2C336B0005 for ; Wed, 3 Feb 2016 06:41:55 -0500 (EST)
Received: by mail-pa0-f50.google.com with SMTP id cy9so12485992pac.0 for ; Wed, 03 Feb 2016 03:41:55 -0800 (PST)
Received: from bombadil.infradead.org (bombadil.infradead.org. [2001:1868:205::9]) by mx.google.com with ESMTPS id kv12si8839831pab.194.2016.02.03.03.41.55 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 03 Feb 2016 03:41:55 -0800 (PST)
Message-ID: <1454499710.4788.176.camel@infradead.org>
Subject: Re: [LSF/MM ATTEND] HMM (heterogeneous memory manager) and GPU
From: David Woodhouse
Date: Wed, 03 Feb 2016 11:41:50 +0000
In-Reply-To: <1454499350.4788.170.camel@infradead.org>
References: <20160128175536.GA20797@gmail.com> <1454460057.4788.117.camel@infradead.org> <1454488853.4788.142.camel@infradead.org> <1454494508.4788.154.camel@infradead.org> <1454499350.4788.170.camel@infradead.org>
Content-Type: multipart/signed; micalg="sha-1"; protocol="application/x-pkcs7-signature"; boundary="=-ZvPHwSq+j4tWP9vHU58T"
Mime-Version: 1.0
Sender: owner-linux-mm@kvack.org
List-ID:
To: Oded Gabbay
Cc: Jerome Glisse , lsf-pc@lists.linux-foundation.org, "linux-mm@kvack.org" , Joerg Roedel

--=-ZvPHwSq+j4tWP9vHU58T
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

On Wed, 2016-02-03 at 11:35 +0000, David Woodhouse wrote:
>
> In the ideal case, there's no need
for the device driver to get
> involved at all. When a page isn't found in the page tables, the IOMMU
> code calls handle_mm_fault() and either populates the page and sends a
> 'success' response, or sends an 'invalid fault' response back.

I missed a bit here; I should have made it explicit:

The device hardware receives that page-request response, successful or
otherwise, and is supposed to act on it accordingly. The device's own
request then fails, and it should have some coherent way of reporting
that to the device driver.

The point is that there should be no need to 'short-circuit' that and
pass notification directly from the IOMMU code to the device driver
that "there was a fault on PASID x". That direct notification hack
doesn't even *tell* us which device-side context was affected, if
there's more than one context accessing a given PASID.

(Actually, in the Intel case for integrated devices, there *are* some
opaque¹ bits in the page-request which do include that information. But
that's horrid, and not a solution for the general case.)

-- 
David Woodhouse Open Source Technology Centre
David.Woodhouse@intel.com Intel Corporation

¹ to the IOMMU code.
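
To make the last point concrete, the following toy model in C contrasts
the two notification paths. Everything in it is hypothetical (struct
toy_context, toy_device_complete(), toy_candidates_for_pasid() and the
response enum are illustrative names, not any real driver or IOMMU
interface). It assumes only what the message says: the page-request
response goes back to the device context that issued the request,
whereas a notification keyed on the PASID alone cannot tell contexts
sharing that PASID apart.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_prq_response { TOY_PRQ_SUCCESS, TOY_PRQ_INVALID };

/* One device-side context; several may share the same PASID. */
struct toy_context {
    int pasid;
    bool request_failed;
};

/* Device path: the response reaches exactly the requesting context. */
static void toy_device_complete(struct toy_context *ctx,
                                enum toy_prq_response resp)
{
    assert(ctx != NULL);
    ctx->request_failed = (resp == TOY_PRQ_INVALID);
}

/* Shortcut path: how many contexts could "fault on PASID x" mean? */
static size_t toy_candidates_for_pasid(const struct toy_context *ctxs,
                                       size_t n, int pasid)
{
    size_t hits = 0;

    for (size_t i = 0; i < n; i++) {
        if (ctxs[i].pasid == pasid)
            hits++;
    }
    return hits;
}
```

If two contexts use the same PASID, completing an invalid response on
one of them marks only that context as failed, while the PASID-only
lookup reports two candidates and so cannot say which one faulted.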

--=-ZvPHwSq+j4tWP9vHU58T--

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-wm0-f51.google.com (mail-wm0-f51.google.com [74.125.82.51]) by kanga.kvack.org (Postfix) with ESMTP id 048746B0009 for ; Wed, 3 Feb 2016 06:42:24 -0500 (EST)
Received: by mail-wm0-f51.google.com with SMTP id l66so66020091wml.0 for ; Wed, 03 Feb 2016 03:42:23 -0800 (PST)
Received: from mail-wm0-x233.google.com (mail-wm0-x233.google.com. [2a00:1450:400c:c09::233]) by mx.google.com with ESMTPS id p192si10483458wmd.30.2016.02.03.03.42.22 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 03 Feb 2016 03:42:23 -0800 (PST)
Received: by mail-wm0-x233.google.com with SMTP id p63so160848892wmp.1 for ; Wed, 03 Feb 2016 03:42:22 -0800 (PST)
MIME-Version: 1.0
In-Reply-To: <1454499350.4788.170.camel@infradead.org>
References: <20160128175536.GA20797@gmail.com> <1454460057.4788.117.camel@infradead.org> <1454488853.4788.142.camel@infradead.org> <1454494508.4788.154.camel@infradead.org> <1454499350.4788.170.camel@infradead.org>
From: Oded Gabbay
Date: Wed, 3 Feb 2016 13:41:53 +0200
Message-ID:
Subject: Re: [LSF/MM ATTEND] HMM (heterogeneous memory manager) and GPU
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Sender: owner-linux-mm@kvack.org
List-ID:
To: David Woodhouse
Cc: Jerome Glisse , lsf-pc@lists.linux-foundation.org, "linux-mm@kvack.org" , Joerg Roedel

On Wed, Feb 3, 2016 at 1:35 PM, David Woodhouse wrote:
> On Wed, 2016-02-03 at 13:07 +0200, Oded Gabbay wrote:
>> > Another, perhaps trivial, question.
>> > When there is an address fault, who handles it ?
the SVM driver, or
>> > each device driver ?
>> >
>> > In other words, is the model the same as (AMD) IOMMU where it binds
>> > amd_iommu driver to the IOMMU H/W, and that driver (amd_iommu/v2) is
>> > the only one which handles the PPR events ?
>> >
>> > If that is the case, then with SVM, how will the device driver be made
>> > aware of faults, if the SVM driver won't notify him about them,
>> > because it has already severed the connection between PASID and
>> > process ?
>
> In the ideal case, there's no need for the device driver to get
> involved at all. When a page isn't found in the page tables, the IOMMU
> code calls handle_mm_fault() and either populates the page and sends a
> 'success' response, or sends an 'invalid fault' response back.
>
> To account for broken hardware, we *have* added a callback into the
> device driver when these faults happen. Ideally it should never be
> used, of course.
>
> In the case where the process has gone away, the PASID is still
> assigned and we still hold mm_count on the MM, just not mm_users. This
> callback into the device driver still occurs if a fault happens during
> process exit between the exit_mm() and exit_files() stage.
>
>> And another question, if I may, aren't you afraid of "false positive"
>> prints to dmesg ? I mean, I'm pretty sure page faults / pasid faults
>> errors will be logged somewhere, probably to dmesg. Aren't you
>> concerned of the users seeing those errors and thinking they may have
>> a bug, while actually the errors were only caused by process
>> termination ?
>
> If that's the case, it's easy enough to silence them. We are already
> explicitly testing for the 'defunct mm' case in our fault handler, to
> prevent us from faulting more pages into an obsolescent MM after its
> mm_users reaches zero and its page tables are supposed to have been
> torn down. That's the 'if(!atomic_inc_not_zero(&svm->mm->mm_users))
> goto bad_req;' part.
>
>> Or in that case you say that the application is broken, because if it
>> still had something running in the H/W, it should not have closed
>> itself ?
>
> That's also true but it's still nice to avoid confusion. Even if only
> to disambiguate cause and effect -- we don't want people to see PASID
> faults which were caused by the process crashing, and to think that
> they might be involved in *causing* that process to crash...

Yes, that's why in our model, we aim to kill all running waves
*before* the amd_iommu_v2 driver unbinds the PASID.

>
> --
> David Woodhouse Open Source Technology Centre
> David.Woodhouse@intel.com Intel Corporation
>

It seems you have most of your bases covered. I'll stop harassing you now :)
But in seriousness, it's interesting to see the different approaches
taken to handling pretty much the same type of H/W (IOMMU).

Thanks for your patience in answering my questions.

Oded

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-pf0-f173.google.com (mail-pf0-f173.google.com [209.85.192.173]) by kanga.kvack.org (Postfix) with ESMTP id A47AB6B0005 for ; Wed, 3 Feb 2016 07:22:35 -0500 (EST)
Received: by mail-pf0-f173.google.com with SMTP id 65so13237736pfd.2 for ; Wed, 03 Feb 2016 04:22:35 -0800 (PST)
Received: from bombadil.infradead.org (bombadil.infradead.org.
[2001:1868:205::9]) by mx.google.com with ESMTPS id qo15si9121265pab.12.2016.02.03.04.22.32 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 03 Feb 2016 04:22:32 -0800 (PST) Message-ID: <1454502148.4788.185.camel@infradead.org> Subject: Re: [LSF/MM ATTEND] HMM (heterogeneous memory manager) and GPU From: David Woodhouse Date: Wed, 03 Feb 2016 12:22:28 +0000 In-Reply-To: References: <20160128175536.GA20797@gmail.com> <1454460057.4788.117.camel@infradead.org> <1454488853.4788.142.camel@infradead.org> <1454494508.4788.154.camel@infradead.org> <1454499350.4788.170.camel@infradead.org> Content-Type: multipart/signed; micalg="sha-1"; protocol="application/x-pkcs7-signature"; boundary="=-KhBbKOHbAAL91o7W84Vq" Mime-Version: 1.0 Sender: owner-linux-mm@kvack.org List-ID: To: Oded Gabbay Cc: Jerome Glisse , lsf-pc@lists.linux-foundation.org, "linux-mm@kvack.org" , Joerg Roedel --=-KhBbKOHbAAL91o7W84Vq Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Wed, 2016-02-03 at 13:41 +0200, Oded Gabbay wrote: >=20 > It seems you have most of your bases covered. I'll stop harassing you now= :) > But in seriousness, its interesting to see the different approaches > taken to handling pretty much the same type of H/W (IOMMU). Well, the point is that we need to settle on a model we can *all* use. It's all very well having vendor-specific intel_svm_bind_mm() and amd_iommu_bind_pasid() functions with subtly different semantics, while the only devices we support for Intel are integrated graphics and our PCIe root ports don't even support discrete devices with PASID capabilities =E2=80=94 and while the only device using the AMD version is t= he AMD GPU. But we *are* starting to see additional devices with PASID capabilities, and it won't be long before we really do have to support third-party discrete devices. So we do need a coherent API for SVM, as an extension of the DMA API. 
And that means we have to settle on the semantics we want for it :)

With the commit I showed earlier, I've moved the Intel model somewhat
closer to the AMD model, no longer holding mm_users on the MM in
question. I think we can come up with something acceptable. There are
Power and ARM incarnations of SVM also in the works, I believe.

-- 
David Woodhouse                            Open Source Technology Centre
David.Woodhouse@intel.com                              Intel Corporation

--=-KhBbKOHbAAL91o7W84Vq
Content-Type: application/x-pkcs7-signature; name="smime.p7s"
Content-Disposition: attachment; filename="smime.p7s"
Content-Transfer-Encoding: base64

--=-KhBbKOHbAAL91o7W84Vq--

From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 25 Feb 2016 14:49:33 +0100
From: Joerg Roedel
Subject: Re: [LSF/MM ATTEND] HMM (heterogeneous memory manager) and GPU
Message-ID: <20160225134933.GD22747@8bytes.org>
References: <20160128175536.GA20797@gmail.com> <1454460057.4788.117.camel@infradead.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <1454460057.4788.117.camel@infradead.org>
Sender: owner-linux-mm@kvack.org
List-ID:
To: David Woodhouse
Cc: Jerome Glisse , lsf-pc@lists.linux-foundation.org, "linux-mm@kvack.org"

Hey,

On Wed, Feb 03, 2016 at 12:40:57AM +0000, David Woodhouse wrote:
> There are a few related issues here around Shared Virtual Memory, and
> lifetime management of the associated MM, and the proposal discussed at
> the Kernel Summit for "off-CPU tasks".
>
> I've hit a situation with the Intel SVM code in 4.4 where the device
> driver binds a PASID, and also has mmap() functionality on the same
> file descriptor that the PASID is associated with.
>
> So on process exit, the MM doesn't die because the PASID binding still
> exists. The VMA of the mmap doesn't die because the MM still exists. So
> the underlying file remains open because the VMA still exists. And the
> PASID binding thus doesn't die because the file is still open.
>
> I've posted a patch¹ which moves us closer to the amd_iommu_v2 model,
> although I'm still *strongly* resisting the temptation to call out into
> device driver code from the mmu_notifier's release callback.
>
> I would like to attend LSF/MM this year so we can continue to work on
> those issues - now that we actually have some hardware in the field and
> a better idea of how we can build a unified access model for SVM across
> the different IOMMU types.
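
The reference cycle described in the quoted text can be made concrete
with a toy model. This is a sketch under stated assumptions, not kernel
code: struct toy_state and every toy_*() function are hypothetical, and
plain integers stand in for the real reference counts. It compares a
binding that pins mm_users (the 4.4 behaviour described above) with one
that holds only mm_count (the direction the patch is said to take).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: plain ints stand in for kernel reference counts. */
struct toy_state {
    int  mm_users;            /* users of the address space */
    int  mm_count;            /* references to the mm structure itself */
    int  file_refs;           /* references to the device file */
    bool vma_mapped;          /* mmap() of the device file still present */
    bool pasid_bound;         /* PASID binding still present */
    bool binding_pins_users;  /* true: binding holds mm_users; false: mm_count */
};

static void toy_put_users(struct toy_state *s);

static void toy_setup(struct toy_state *s, bool binding_pins_users)
{
    s->mm_users = 1;   /* the process itself */
    s->mm_count = 1;   /* all mm_users together account for one mm_count */
    s->file_refs = 2;  /* one for the fd table, one for the VMA */
    s->vma_mapped = true;
    s->pasid_bound = true;
    s->binding_pins_users = binding_pins_users;
    if (binding_pins_users)
        s->mm_users++;
    else
        s->mm_count++;
}

/* Drop a file reference; the last one releases the file and unbinds the PASID. */
static void toy_put_file(struct toy_state *s)
{
    assert(s->file_refs > 0);
    if (--s->file_refs != 0 || !s->pasid_bound)
        return;
    s->pasid_bound = false;
    if (s->binding_pins_users)
        toy_put_users(s);
    else
        s->mm_count--;
}

/* Drop an address-space user; the last one tears down the VMAs. */
static void toy_put_users(struct toy_state *s)
{
    assert(s->mm_users > 0);
    if (--s->mm_users != 0)
        return;
    if (s->vma_mapped) {
        s->vma_mapped = false;
        toy_put_file(s);   /* the VMA's reference on the file */
    }
    s->mm_count--;         /* the reference held on behalf of mm_users */
}

static void toy_process_exit(struct toy_state *s)
{
    toy_put_users(s);  /* exit_mm() stage */
    toy_put_file(s);   /* exit_files() stage */
}

static bool toy_fully_released(const struct toy_state *s)
{
    return s->mm_count == 0 && s->file_refs == 0 && !s->pasid_bound;
}
```

Calling toy_setup(&s, true) followed by toy_process_exit(&s) leaves the
MM, the VMA, the file and the binding all alive, each kept by the next.
With toy_setup(&s, false) the same exit sequence unwinds everything,
because the address space is torn down as soon as the process drops its
own mm_users reference.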

That sounds very interesting and I'd like to participate in this
discussion. Unfortunately I can't make it to the MM summit this year, so
I didn't even apply for an invitation. But if this gets discussed there
I am interested in the outcome.

I still have a prototype for the off-cpu task concept on my list of
things to implement. The problem is that I can't really test any changes
I make because I don't have SVM hardware and on the AMD side the
user-space part needed for testing only runs on Ubuntu with some
AMD-provided kernel :(

	Joerg