Date: Thu, 24 Mar 2022 16:41:14 -0600
From: Alex Williamson
To: Jason Gunthorpe
Cc: Jason Gunthorpe via iommu, Jean-Philippe Brucker, Chaitanya Kulkarni,
 kvm@vger.kernel.org, Niklas Schnelle, Jason Wang, Cornelia Huck,
 Kevin Tian, Daniel Jordan, "Michael S. Tsirkin", Joao Martins,
 David Gibson, "Daniel P. Berrangé"
Subject: Re: [PATCH RFC 04/12] kernel/user: Allow user::locked_vm to be
 usable for iommufd
Message-ID: <20220324164114.78f2e63a.alex.williamson@redhat.com>
In-Reply-To: <20220324222739.GZ11336@nvidia.com>
References: <4-v1-e79cd8d168e8+6-iommufd_jgg@nvidia.com>
 <808a871b3918dc067031085de3e8af6b49c6ef89.camel@linux.ibm.com>
 <20220322145741.GH11336@nvidia.com>
 <20220322092923.5bc79861.alex.williamson@redhat.com>
 <20220322161521.GJ11336@nvidia.com>
 <20220324144015.031ca277.alex.williamson@redhat.com>
 <20220324222739.GZ11336@nvidia.com>
Organization: Red Hat
X-Mailing-List: kvm@vger.kernel.org

On Thu, 24 Mar 2022 19:27:39 -0300
Jason Gunthorpe wrote:

> On Thu, Mar 24, 2022 at 02:40:15PM -0600, Alex Williamson wrote:
> > On Tue, 22 Mar 2022 13:15:21 -0300
> > Jason Gunthorpe via iommu wrote:
> >
> > > On Tue, Mar 22, 2022 at 09:29:23AM -0600, Alex Williamson wrote:
> > >
> > > > I'm still picking my way through the series, but the later compat
> > > > interface doesn't mention this difference as an outstanding issue.
> > > > Doesn't this difference need to be accounted for in how libvirt
> > > > manages VM resource limits?
> > >
> > > AFAICT, no, but it should be checked.
> > >
> > > > AIUI libvirt uses some form of prlimit(2) to set process locked
> > > > memory limits.
> > >
> > > Yes, and ulimit does work fully. prlimit adjusts the value:
> > >
> > > int do_prlimit(struct task_struct *tsk, unsigned int resource,
> > > 		struct rlimit *new_rlim, struct rlimit *old_rlim)
> > > {
> > > 	rlim = tsk->signal->rlim + resource;
> > > [..]
> > > 		if (new_rlim)
> > > 			*rlim = *new_rlim;
> > >
> > > Which vfio reads back here:
> > >
> > > drivers/vfio/vfio_iommu_type1.c:        unsigned long pfn, limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
> > > drivers/vfio/vfio_iommu_type1.c:        unsigned long limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
> > >
> > > And iommufd does the same read back:
> > >
> > > 	lock_limit =
> > > 		task_rlimit(pages->source_task, RLIMIT_MEMLOCK) >> PAGE_SHIFT;
> > > 	npages = pages->npinned - pages->last_npinned;
> > > 	do {
> > > 		cur_pages = atomic_long_read(&pages->source_user->locked_vm);
> > > 		new_pages = cur_pages + npages;
> > > 		if (new_pages > lock_limit)
> > > 			return -ENOMEM;
> > > 	} while (atomic_long_cmpxchg(&pages->source_user->locked_vm, cur_pages,
> > > 				     new_pages) != cur_pages);
> > >
> > > So it does work essentially the same.
> >
> > Well, except for the part about vfio updating mm->locked_vm and iommufd
> > updating user->locked_vm, a per-process counter versus a per-user
> > counter.  prlimit specifically sets process resource limits, which get
> > reflected in task_rlimit.
>
> Indeed, but that is not how the majority of other things seem to
> operate.
>
> > For example, let's say a user has two 4GB VMs and they're hot-adding
> > vfio devices to each of them, so libvirt needs to dynamically modify
> > the locked memory limit for each VM.  AIUI, libvirt would look at the
> > VM size and call prlimit to set that value.  If libvirt does this to
> > both VMs, then each has a task_rlimit of 4GB.  In vfio we add pinned
> > pages to mm->locked_vm, so this works well.  In the iommufd loop above,
> > we're comparing a per-task/process limit to a per-user counter.  So I'm
> > a bit lost how both VMs can pin their pages here.
>
> I don't know anything about libvirt - it seems strange to use a
> securityish feature like ulimit but not security isolate processes
> with real users.
>
> But if it really does this then it really does this.
>
> So at the very least VFIO container has to keep working this way.
>
> The next question is if we want iommufd's own device node to work this
> way and try to change libvirt somehow. It seems libvirt will have to
> deal with this at some point as io_uring will trigger the same problem.
>
> > > This whole area is a bit peculiar (eg mlock itself works differently),
> > > IMHO, but with most of the places doing pins voting to use
> > > user->locked_vm as the charge it seems the right path in today's
> > > kernel.
> >
> > The philosophy of whether it's ultimately a better choice for the
> > kernel aside, if userspace breaks because we're accounting in a
> > per-user pool rather than a per-process pool, then our compatibility
> > layer ain't so transparent.
>
> Sure, if it doesn't work it doesn't work. Let's be sure and clearly
> document what the compatibility issue is and then we have to keep it
> per-process.
>
> And the same reasoning likely means I can't change RDMA either as qemu
> will break just as well when qemu uses rdma mode.
>
> Which is pretty sucky, but it is what it is..

I added Daniel Berrangé to the cc list for my previous reply; hopefully
he can comment on whether libvirt has the sort of user security model
you allude to above, which might make this a non-issue for this use
case.  Unfortunately it's extremely difficult to prove that there are no
such use cases out there even if libvirt is ok.  Thanks,

Alex
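
For readers following the thread, the two-VM example can be modelled in
user space.  This is only a sketch, not kernel code: struct user_model,
struct proc_model, pin_per_process(), pin_per_user() and
two_vm_scenario() are hypothetical names made up for illustration, 4 KiB
pages are assumed, and details such as the CAP_IPC_LOCK bypass and the
cmpxchg retry loop are left out.  It shows the same RLIMIT_MEMLOCK value
being compared against a per-process counter (as vfio type1 is described
above) and against a per-user counter (as in the quoted iommufd loop).

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12                  /* assumption: 4 KiB pages */
#define GIB (UINT64_C(1) << 30)

/* Hypothetical stand-in for the per-user counter (user->locked_vm). */
struct user_model {
	uint64_t locked_vm;            /* pages charged to the user */
};

/* Hypothetical stand-in for one VM process owned by that user. */
struct proc_model {
	uint64_t rlimit_memlock;       /* bytes, as set through prlimit(2) */
	uint64_t locked_vm;            /* pages, models mm->locked_vm */
	struct user_model *user;
};

/* Per-process accounting: the process limit guards the process counter. */
static bool pin_per_process(struct proc_model *p, uint64_t npages)
{
	uint64_t limit = p->rlimit_memlock >> PAGE_SHIFT;

	if (p->locked_vm + npages > limit)
		return false;
	p->locked_vm += npages;
	return true;
}

/* Per-user accounting: the process limit guards the shared user counter. */
static bool pin_per_user(struct proc_model *p, uint64_t npages)
{
	uint64_t limit = p->rlimit_memlock >> PAGE_SHIFT;

	if (p->user->locked_vm + npages > limit)
		return false;
	p->user->locked_vm += npages;
	return true;
}

/*
 * Two 4 GiB VMs owned by the same user, each with a 4 GiB memlock limit.
 * Returns how many of them managed to pin all of their memory.
 */
int two_vm_scenario(bool per_user)
{
	struct user_model u = { 0 };
	struct proc_model vm[2] = {
		{ 4 * GIB, 0, &u },
		{ 4 * GIB, 0, &u },
	};
	uint64_t npages = (4 * GIB) >> PAGE_SHIFT;
	int ok = 0;

	for (int i = 0; i < 2; i++)
		ok += per_user ? pin_per_user(&vm[i], npages)
			       : pin_per_process(&vm[i], npages);
	return ok;
}
```

Under these assumptions two_vm_scenario(false) returns 2 and
two_vm_scenario(true) returns 1: with the per-user counter the first VM
uses up the whole 4 GiB allowance, so the second VM's pin is refused even
though its own process limit has not been touched.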