From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.4 required=3.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI, MSGID_FROM_MTA_HEADER,SPF_HELO_NONE,SPF_PASS,USER_AGENT_SANE_1 autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6D926C54FCB for ; Thu, 23 Apr 2020 19:12:27 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 3FF1920728 for ; Thu, 23 Apr 2020 19:12:27 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=Mellanox.com header.i=@Mellanox.com header.b="HRsRzsNf" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728116AbgDWTM0 (ORCPT ); Thu, 23 Apr 2020 15:12:26 -0400 Received: from mail-eopbgr140043.outbound.protection.outlook.com ([40.107.14.43]:8000 "EHLO EUR01-VE1-obe.outbound.protection.outlook.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726296AbgDWTM0 (ORCPT ); Thu, 23 Apr 2020 15:12:26 -0400 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=Y6ddtwzKjYqA1kwmfU1b1fLD3uGCN4yiU76aAX9igazYevUNTMUaUDeDgYJ+mWeQpLVziW3K7nmIPUv9cK38b3ULV8A0WjfPmCcxThEQ3ypJP4OmjWzyTZUlQtCEehbOAq3wSwn8hgP8UiLTZMn0ZHAcdGPU74Yx8tymDliBwUL1utOVkHD04o2PCT/SjMZQD59NvBHtCphGVh5fpuh1mZyGPfWjbMqs9tc7pzFZ7ev6KGZ9BIoYNtBA0vcTE3ZhLBMbz6faQaEpyXSsQW33xawHgkPy8RgbvC5HkX5GRqqLYDkVgSirWt3O+n2ofdGaSDtjnBjVlEN2L/2FOtM5pg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=lB+iWplYEKT4bgSJmyTWVF/M1IfZC6avqDk+oCMgJgU=; b=ZnKsh3koCmqxPbcvOXqRBpzFDqrkcikrUEezceAMBRBWPJp+zHnHRDtN5EgK+zbqN1BYPJZGPHt37DYOFH5Jv8gprBdAlmCxAdmJuNeCK3o28yq4updaEveu7Z6oPE+hLIKDuG4uMymKg+tdmgeLv44lFrlcqCHAjvEl+cRTRRIK8w0Mkr8rXZVk5WW3VDtKdxyOrtgFygP1w8UQigX2plqw3S7tiIIwTYtJQIQp9DqbbbBR33PgH252W49iaCTwz6mCqGbBzBBf9UIi/JPIQ61MH4Tu5pmgxD1I/F/LYvoYv9x0BujGzWXcyRdqTO0A/HwE2h/M57P861UJkCMc+A== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=mellanox.com; dmarc=pass action=none header.from=mellanox.com; dkim=pass header.d=mellanox.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Mellanox.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=lB+iWplYEKT4bgSJmyTWVF/M1IfZC6avqDk+oCMgJgU=; b=HRsRzsNforrb7kS1r5ApyYjedGNAnf9t9oWUL0VQtTmTJh0wp/t1mEkneOttzXbnrRSq2tI4qYO3f+JatgCLlPAehJ9FRl0ujF9nRXehRPFhOfCKyNhlkjgopKS0TcotEKKib0bhwe8dosOvxO6nvjyE9bGeXVlgDwoG2WzoiII= Authentication-Results: spf=none (sender IP is ) smtp.mailfrom=jgg@mellanox.com; Received: from VI1PR05MB4141.eurprd05.prod.outlook.com (2603:10a6:803:44::15) by VI1PR05MB4958.eurprd05.prod.outlook.com (2603:10a6:803:58::31) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.2921.29; Thu, 23 Apr 2020 19:12:21 +0000 Received: from VI1PR05MB4141.eurprd05.prod.outlook.com ([fe80::a47b:e3cd:7d6d:5d4e]) by VI1PR05MB4141.eurprd05.prod.outlook.com ([fe80::a47b:e3cd:7d6d:5d4e%6]) with mapi id 15.20.2921.030; Thu, 23 Apr 2020 19:12:20 +0000 Date: Thu, 23 Apr 2020 16:12:17 -0300 From: Jason Gunthorpe To: "Raj, Ashok" Cc: "Tian, Kevin" , "Jiang, Dave" , "vkoul@kernel.org" , "megha.dey@linux.intel.com" , "maz@kernel.org" , "bhelgaas@google.com" , "rafael@kernel.org" , "gregkh@linuxfoundation.org" , "tglx@linutronix.de" , "hpa@zytor.com" , "alex.williamson@redhat.com" , "Pan, Jacob jun" , "Liu, Yi L" , "Lu, Baolu" , "Kumar, Sanjay K" , "Luck, Tony" , "Lin, Jing" , "Williams, Dan J" , "kwankhede@nvidia.com" , "eric.auger@redhat.com" , "parav@mellanox.com" , "dmaengine@vger.kernel.org" , "linux-kernel@vger.kernel.org" , "x86@kernel.org" , "linux-pci@vger.kernel.org" , "kvm@vger.kernel.org" Subject: Re: [PATCH RFC 00/15] Add VFIO mediated device support and IMS support for the idxd driver. Message-ID: <20200423191217.GD13640@mellanox.com> References: <158751095889.36773.6009825070990637468.stgit@djiang5-desk3.ch.intel.com> <20200421235442.GO11945@mellanox.com> <20200422115017.GQ11945@mellanox.com> <20200422211436.GA103345@otc-nc-03> Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20200422211436.GA103345@otc-nc-03> User-Agent: Mutt/1.9.4 (2018-02-28) X-ClientProxiedBy: MN2PR02CA0007.namprd02.prod.outlook.com (2603:10b6:208:fc::20) To VI1PR05MB4141.eurprd05.prod.outlook.com (2603:10a6:803:44::15) MIME-Version: 1.0 X-MS-Exchange-MessageSentRepresentingType: 1 Received: from mlx.ziepe.ca (142.68.57.212) by MN2PR02CA0007.namprd02.prod.outlook.com (2603:10b6:208:fc::20) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.2937.13 via Frontend Transport; Thu, 23 Apr 2020 19:12:20 +0000 Received: from jgg by mlx.ziepe.ca with local (Exim 4.90_1) (envelope-from ) id 1jRhGj-0006uL-8Y; Thu, 23 Apr 2020 16:12:17 -0300 X-Originating-IP: [142.68.57.212] X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-HT: Tenant X-MS-Office365-Filtering-Correlation-Id: 09c67aba-d63e-4d0b-f348-08d7e7ba3f24 X-MS-TrafficTypeDiagnostic: VI1PR05MB4958:|VI1PR05MB4958: X-MS-Exchange-Transport-Forked: True X-Microsoft-Antispam-PRVS: X-MS-Oob-TLC-OOBClassifiers: OLM:9508; X-Forefront-PRVS: 03827AF76E X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:VI1PR05MB4141.eurprd05.prod.outlook.com;PTR:;CAT:NONE;SFTY:;SFS:(10009020)(4636009)(39860400002)(366004)(376002)(346002)(396003)(136003)(8936002)(478600001)(2906002)(5660300002)(6916009)(2616005)(52116002)(36756003)(4326008)(86362001)(81156014)(9746002)(9786002)(1076003)(8676002)(66556008)(33656002)(26005)(316002)(66946007)(54906003)(186003)(7416002)(66476007)(24400500001);DIR:OUT;SFP:1101; Received-SPF: None (protection.outlook.com: mellanox.com does not designate permitted sender hosts) X-MS-Exchange-SenderADCheck: 1 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: wAM+pprR4yUQbLq3p0e6S3VFp5WeiA+AJgY2ul9RP76s29+E35WJruVo4pt/lIV36j6bC6LXXjJ7k0KMmS/FLbje5y65zPKRDV4AOULEeDxFA6/i5zkEalFcp8eK0M7+IzrUyXehscQ36GP/eKNJeenCTw4Woba6ckcvXh+pVD3INMnqRtsowKc6vrKZ/wzOIwnD5C3q8+DoxBP5fWOiOeIudI82d2kN9d0tZgu1NTphpOxLWRIkkbXQkx+PVer0XWZWT1Lx9GyoH+ZjEelHCkLNtMecIzgJF6xW1jfQ15FwU0Xvv1exdTR1iUz3NIQsitPBZMXoNII/Q8YSpmRoBs++qySGioIAo6VdTBEkVTJOTAyqcQZDH0uvCe0tCKrVxsE/eIIX0MYEMec1sLX4Ne9JUdP3Wvu6/Ic4Ac9s8iRQh0xFpDoLcZGLkP/mMPQe93KyqmOnBZHli/dmBuelWaHf5pgTw4M23EFJwJef5bt38Y0qZ/DKjdeSKd0edGsp X-MS-Exchange-AntiSpam-MessageData: UVsdrsnYOrM37NsticKTdTYtEw6FXqS3d4QQCS77P/R5AKDlxlyp5Nlk2wQliKq8tgEaHpgtE/uAkPNPX03Skrp6R7myNFmKY9ztCaZtDGuo19K9nmyiE23cla+RwIkl8kAuexmu4LLrxAyE0xe5Kg== X-OriginatorOrg: Mellanox.com X-MS-Exchange-CrossTenant-Network-Message-Id: 09c67aba-d63e-4d0b-f348-08d7e7ba3f24 X-MS-Exchange-CrossTenant-OriginalArrivalTime: 23 Apr 2020 19:12:20.7642 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: a652971c-7d2e-4d9b-a6a4-d149256f461b X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: 4WvMtKZ3GfNIGv8F5B8H111rLMRsTHkJkH2xxmaj5rHkXF4cT/ROnd//62ZAP6/EeNQ9l8GfoR6tlSLAyEpp6g== X-MS-Exchange-Transport-CrossTenantHeadersStamped: VI1PR05MB4958 Sender: dmaengine-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: dmaengine@vger.kernel.org On Wed, Apr 22, 2020 at 02:14:36PM -0700, Raj, Ashok wrote: > Hi Jason > > > > > > > > > I'm feeling really skeptical that adding all this PCI config space and > > > > MMIO BAR emulation to the kernel just to cram this into a VFIO > > > > interface is a good idea, that kind of stuff is much safer in > > > > userspace. > > > > > > > > Particularly since vfio is not really needed once a driver is using > > > > the PASID stuff. We already have general code for drivers to use to > > > > attach a PASID to a mm_struct - and using vfio while disabling all the > > > > DMA/iommu config really seems like an abuse. > > > > > > Well, this series is for virtualizing idxd device to VMs, instead of > > > supporting SVA for bare metal processes. idxd implements a > > > hardware-assisted mediated device technique called Intel Scalable > > > I/O Virtualization, > > > > I'm familiar with the intel naming scheme. > > > > > which allows each Assignable Device Interface (ADI, e.g. a work > > > queue) tagged with an unique PASID to ensure fine-grained DMA > > > isolation when those ADIs are assigned to different VMs. For this > > > purpose idxd utilizes the VFIO mdev framework and IOMMU aux-domain > > > extension. Bare metal SVA will be enabled for idxd later by using > > > the general SVA code that you mentioned. Both paths will co-exist > > > in the end so there is no such case of disabling DMA/iommu config. > > > > Again, if you will have a normal SVA interface, there is no need for a > > VFIO version, just use normal SVA for both. > > > > PCI emulation should try to be in userspace, not the kernel, for > > security. > > Not sure we completely understand your proposal. Mediated devices > are software constructed and they have protected resources like > interrupts and stuff and VFIO already provids abstractions to export > to user space. > > Native SVA is simply passing the process CR3 handle to IOMMU so > IOMMU knows how to walk process page tables, kernel handles things > like page-faults, doing device tlb invalidations and such. > That by itself doesn't translate to what a guest typically does > with a VDEV. There are other control paths that need to be serviced > from the kernel code via VFIO. For speed path operations like > ringing doorbells and such they are directly managed from guest. You don't need vfio to mmap BAR pages to userspace. The unique thing that vfio gives is it provides a way to program the classic non-PASID iommu, which you are not using here. > How do you propose to use the existing SVA api's to also provide > full device emulation as opposed to using an existing infrastructure > that's already in place? You'd provide the 'full device emulation' in userspace (eg qemu), along side all the other device emulation. Device emulation does not belong in the kernel without a very good reason. You get the doorbell BAR page from your own char dev You setup a PASID IOMMU configuration over your own char dev Interrupt delivery is triggering a generic event fd What is VFIO needed for? > Perhaps Alex can ease Jason's concerns? Last we talked Alex also had doubts on what mdev should be used for. It is a feature that seems to lack boundaries, and I'll note that when the discussion came up for VDPA, they eventually choose not to use VFIO. Jason