From: Parav Pandit
To: "Tian, Kevin", LKML, Joerg Roedel, Jason Gunthorpe, Lu Baolu, David Woodhouse, "iommu@lists.linux-foundation.org", "kvm@vger.kernel.org", "Alex Williamson (alex.williamson@redhat.com)", Jason Wang
CC: Eric Auger, Jonathan Corbet, "Raj, Ashok", "Liu, Yi L", "Wu, Hao", "Jiang, Dave", Jacob Pan, Jean-Philippe Brucker, David Gibson, Kirti Wankhede, Robin Murphy
Subject: RE: [RFC] /dev/ioasid uAPI proposal
Date: Tue, 1 Jun 2021 17:30:51 +0000

> From: Tian, Kevin
> Sent: Thursday, May 27, 2021 1:28 PM

> 5.6. I/O page fault
> +++++++++++++++
>
> (uAPI is TBD. Here is just about the high-level flow from host IOMMU driver
> to guest IOMMU driver and backwards).
>
> - Host IOMMU driver receives a page request with raw fault_data {rid,
>   pasid, addr};
>
> - Host IOMMU driver identifies the faulting I/O page table according to
>   information registered by IOASID fault handler;
>
> - IOASID fault handler is called with raw fault_data (rid, pasid, addr), which
>   is saved in ioasid_data->fault_data (used for response);
>
> - IOASID fault handler generates an user fault_data (ioasid, addr), links it
>   to the shared ring buffer and triggers eventfd to userspace;
>
> - Upon received event, Qemu needs to find the virtual routing information
>   (v_rid + v_pasid) of the device attached to the faulting ioasid. If there are
>   multiple, pick a random one. This should be fine since the purpose is to
>   fix the I/O page table on the guest;
>
> - Qemu generates a virtual I/O page fault through vIOMMU into guest,
>   carrying the virtual fault data (v_rid, v_pasid, addr);
>
Why does it have to be through vIOMMU?
For a VFIO PCI device, have you considered reusing the same PRI interface to inject the page fault into the guest?
This eliminates the need for any new v_rid.
It will also route the page fault request and response through the right vfio device.

> - Guest IOMMU driver fixes up the fault, updates the I/O page table, and
>   then sends a page response with virtual completion data (v_rid, v_pasid,
>   response_code) to vIOMMU;
>
What about fixing up the fault for the mmu page table in the guest as well?
Or did you mean both when you said "updates the I/O page table" above?
It is unclear to me whether a single nested page table is maintained, or two (one for cr3 references and the other for the iommu).
Can you please clarify?

> - Qemu finds the pending fault event, converts virtual completion data
>   into (ioasid, response_code), and then calls a /dev/ioasid ioctl to
>   complete the pending fault;
>
Once a virtual PRI request/response interface is done for a VFIO PCI device, it can be a generic interface among multiple vIOMMUs.

> - /dev/ioasid finds out the pending fault data {rid, pasid, addr} saved in
>   ioasid_data->fault_data, and then calls iommu api to complete it with
>   {rid, pasid, response_code};
>
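
For readers following the thread, the fault flow quoted above can be modelled in a few lines of userspace code. This is only a sketch under stated assumptions, not the proposed uAPI (which the RFC says is TBD). The class and method names (HostModel, VmmModel, register, attach, page_request, inject_next_fault, page_response, complete_fault) and the "SUCCESS" response code are hypothetical and exist only for illustration. The quoted text says to pick a random route when several devices share an ioasid; the sketch picks the first one instead so that the result is reproducible. It also keeps a single fault_data slot per ioasid, mirroring the wording of the quoted flow.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class RawFault:
    """Raw fault_data reported by the host IOMMU: {rid, pasid, addr}."""
    rid: int
    pasid: int
    addr: int


@dataclass
class IoasidData:
    """Per-IOASID state; fault_data keeps the raw fault for the response."""
    ioasid: int
    fault_data: Optional[RawFault] = None


class HostModel:
    """Hypothetical stand-in for the host IOMMU driver plus /dev/ioasid."""

    def __init__(self) -> None:
        self.by_route: Dict[Tuple[int, int], IoasidData] = {}
        self.by_ioasid: Dict[int, IoasidData] = {}
        # Shared ring of user fault_data entries: (ioasid, addr).
        self.ring: Deque[Tuple[int, int]] = deque()

    def register(self, ioasid: int, rid: int, pasid: int) -> None:
        """Record which IOASID owns the I/O page table for (rid, pasid)."""
        data = self.by_ioasid.setdefault(ioasid, IoasidData(ioasid))
        self.by_route[(rid, pasid)] = data

    def page_request(self, raw: RawFault) -> None:
        """Identify the faulting IOASID, save the raw fault, queue the user fault."""
        data = self.by_route[(raw.rid, raw.pasid)]
        data.fault_data = raw
        self.ring.append((data.ioasid, raw.addr))
        # A real handler would trigger an eventfd here; omitted in this model.

    def complete_fault(self, ioasid: int, response_code: str) -> Tuple[int, int, str]:
        """Turn (ioasid, response_code) back into {rid, pasid, response_code}."""
        data = self.by_ioasid[ioasid]
        raw = data.fault_data
        if raw is None:
            raise LookupError(f"no pending fault for ioasid {ioasid}")
        data.fault_data = None
        return (raw.rid, raw.pasid, response_code)


class VmmModel:
    """Hypothetical stand-in for Qemu and its vIOMMU."""

    def __init__(self, host: HostModel) -> None:
        self.host = host
        self.routes: Dict[int, List[Tuple[int, int]]] = {}  # ioasid -> [(v_rid, v_pasid)]
        self.pending: Dict[Tuple[int, int], int] = {}        # (v_rid, v_pasid) -> ioasid

    def attach(self, ioasid: int, v_rid: int, v_pasid: int) -> None:
        self.routes.setdefault(ioasid, []).append((v_rid, v_pasid))

    def inject_next_fault(self) -> Tuple[int, int, int]:
        """Pop one user fault and build the virtual fault (v_rid, v_pasid, addr)."""
        ioasid, addr = self.host.ring.popleft()
        # The RFC text says "pick a random one"; first entry is used here.
        v_rid, v_pasid = self.routes[ioasid][0]
        self.pending[(v_rid, v_pasid)] = ioasid
        return (v_rid, v_pasid, addr)

    def page_response(self, v_rid: int, v_pasid: int, response_code: str) -> Tuple[int, int, str]:
        """Convert the guest's virtual completion data and complete the fault."""
        ioasid = self.pending.pop((v_rid, v_pasid))
        return self.host.complete_fault(ioasid, response_code)


def demo() -> Tuple[int, int, str]:
    host = HostModel()
    host.register(ioasid=7, rid=0x100, pasid=5)
    vmm = VmmModel(host)
    vmm.attach(ioasid=7, v_rid=0x08, v_pasid=1)
    host.page_request(RawFault(rid=0x100, pasid=5, addr=0xDEAD000))
    v_rid, v_pasid, _addr = vmm.inject_next_fault()
    # The guest IOMMU driver would fix up its I/O page table at this point.
    return vmm.page_response(v_rid, v_pasid, "SUCCESS")
```

Calling demo() walks one fault from the host page request to the final completion. The model makes the point of the questions above visible: the only place v_rid and v_pasid appear is in the VmmModel translation step, which is the part a PRI-based injection path would replace.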