From: Shameerali Kolothum Thodi
To: Jason Gunthorpe, Alex Williamson
CC: Jonathan Corbet, "linux-doc@vger.kernel.org", Cornelia Huck,
 "kvm@vger.kernel.org", Kirti Wankhede, Max Gurtovoy, Yishai Hadas,
 "Zengtao (B)", liulongfang
Subject: RE: [PATCH RFC v2] vfio: Documentation for the migration region
Date: Wed, 1 Dec 2021 09:54:27 +0000
Message-ID: <90226a3c13a2404086dc555e4aced7cb@huawei.com>
References: <0-v2-45a95932a4c6+37-vfio_mig_doc_jgg@nvidia.com>
 <20211130102611.71394253.alex.williamson@redhat.com>
 <20211130185910.GD4670@nvidia.com>
 <20211130153541.131c9729.alex.williamson@redhat.com>
 <20211201031407.GG4670@nvidia.com>
In-Reply-To: <20211201031407.GG4670@nvidia.com>
X-Mailing-List: linux-doc@vger.kernel.org

> -----Original Message-----
> From: Jason Gunthorpe [mailto:jgg@nvidia.com]
> Sent: 01 December 2021 03:14
> To: Alex Williamson
> Cc: Jonathan Corbet; linux-doc@vger.kernel.org; Cornelia Huck;
> kvm@vger.kernel.org; Kirti Wankhede; Max Gurtovoy;
> Shameerali Kolothum Thodi; Yishai Hadas
> Subject: Re: [PATCH RFC v2] vfio: Documentation for the migration region
>
> On Tue, Nov 30, 2021 at 03:35:41PM -0700, Alex Williamson wrote:
>
> > > From what HNS said the device driver would have to trap every MMIO to
> > > implement NDMA as it must prevent touches to the physical HW MMIO to
> > > maintain the NDMA state.
> > >
> > > The issue is that the HW migration registers can stop processing the
> > > queue and thus enter NDMA but a MMIO touch can resume queue
> > > processing, so NDMA cannot be sustained.
> > >
> > > Trapping every MMIO would have a huge negative performance impact. So
> > > it doesn't make sense to do so for a device that is not intended to be
> > > used in any situation where NDMA is required.
> >
> > But migration is a cooperative activity with userspace. If necessary
> > we can impose a requirement that mmap access to regions (other than the
> > migration region itself) are dropped when we're in the NDMA or !RUNNING
> > device_state.
>
> It is always NDMA|RUNNING, so we can't fully drop access to
> MMIO. Userspace would have to transfer from direct MMIO to
> trapping. With enough new kernel infrastructure and qemu support it
> could be done.

As far as our devices are concerned, we put the dev queue into a PAUSE state
in the !RUNNING state. And since we don't have any P2P support, is it OK to
put the onus on userspace here, i.e. that it won't try to access the MMIO
during the !RUNNING state?

So just to make it clear: if a device declares that it doesn't support NDMA
and P2P, is the v1 version of the spec good enough, or do we still need to
handle the case where a malicious user might try MMIO access in the !RUNNING
state, and have kernel infrastructure in place to safeguard against that?

>
> Even so, we can't trap accesses through the IOMMU so such a scheme
> would still require removing IOMMU access to the device. Given that the
> basic qemu mitigation for no NDMA support is to eliminate P2P cases by
> removing the IOMMU mappings this doesn't seem to advance anything and
> only creates complexity.
>
> At least I'm not going to insist that hns do all kinds of work like
> this for an edge case they don't care about as a precondition to get a
> migration driver.

Yes. That's our concern too.

(Just a note to clarify that these are not HNS devices per se. HNS actually
stands for HiSilicon Network Subsystem and doesn't currently have live
migration capability. The devices capable of live migration are HiSilicon
Accelerator devices.)

Thanks,
Shameer
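
For readers following the thread, the MMIO handling discussed above can be summarised as a small decision table. The following is only an illustrative sketch, not the VFIO uAPI and not code from any driver: the names DEV_STATE_RUNNING, DEV_STATE_NDMA, struct dev_caps, and mmio_policy_for() are all hypothetical, and the bit values are arbitrary. It assumes that !RUNNING means mmap access is dropped (Alex's suggestion) and that NDMA|RUNNING on an NDMA-capable device needs trapping (Jason's point).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical state bits for illustration only; NOT the VFIO uAPI. */
#define DEV_STATE_RUNNING (1u << 0)
#define DEV_STATE_NDMA    (1u << 1)

/* Hypothetical capability a driver might declare. */
struct dev_caps {
    bool supports_ndma;
};

enum mmio_policy {
    MMIO_DIRECT,  /* mmap'd MMIO goes straight to the hardware */
    MMIO_TRAPPED, /* MMIO would have to be trapped and emulated */
    MMIO_BLOCKED, /* mmap access to device regions is dropped */
};

/*
 * Map a device_state to the MMIO handling discussed in the thread.
 * Whether MMIO_BLOCKED is enforced by the kernel or merely promised
 * by userspace is exactly the open question; this sketch does not
 * answer it.
 */
static enum mmio_policy mmio_policy_for(uint32_t state,
                                        const struct dev_caps *caps)
{
    assert(caps != NULL);

    /* !RUNNING: the queue is paused, so device MMIO must not be touched. */
    if (!(state & DEV_STATE_RUNNING))
        return MMIO_BLOCKED;

    /*
     * NDMA|RUNNING: an MMIO touch could resume queue processing, so
     * sustaining NDMA would need every access trapped.
     */
    if ((state & DEV_STATE_NDMA) && caps->supports_ndma)
        return MMIO_TRAPPED;

    return MMIO_DIRECT;
}
```

Under these assumptions, a device that declares no NDMA support only ever sees MMIO_DIRECT (running) or MMIO_BLOCKED (!RUNNING), which is the case the question above is about.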