From mboxrd@z Thu Jan  1 00:00:00 1970
Date:   Wed, 8 Aug 2018 23:31:46 +0300
From:   "Michael S. Tsirkin" <mst@redhat.com>
To:     Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc:     Christoph Hellwig <hch@infradead.org>,
        Will Deacon <will.deacon@arm.com>,
        Anshuman Khandual <khandual@linux.vnet.ibm.com>,
        virtualization@lists.linux-foundation.org,
        linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
        aik@ozlabs.ru, robh@kernel.org, joe@perches.com,
        elfring@users.sourceforge.net, david@gibson.dropbear.id.au,
        jasowang@redhat.com, mpe@ellerman.id.au, linuxram@us.ibm.com,
        haren@linux.vnet.ibm.com, paulus@samba.org,
        srikar@linux.vnet.ibm.com, robin.murphy@arm.com,
        jean-philippe.brucker@arm.com, marc.zyngier@arm.com
Subject: Re: [RFC 0/4] Virtio uses DMA API for all devices
Message-ID: <20180808232210-mutt-send-email-mst@kernel.org>
References: <20180806094243.GA16032@infradead.org>
 <6c707d6d33ac25a42265c2e9b521c2416d72c739.camel@kernel.crashing.org>
 <20180807062117.GD32709@infradead.org>
 <aa59c7f8556bd4b332394a1dcf2d4a8faf3dc4a2.camel@kernel.crashing.org>
 <20180807135505.GA29034@infradead.org>
 <2103ecfe52d23cec03f185d08a87bfad9c9d82b5.camel@kernel.crashing.org>
 <20180808063158.GA2474@infradead.org>
 <4b596883892b5cb5560bef26fcd249e7107173ac.camel@kernel.crashing.org>
 <20180808123036.GA2525@infradead.org>
 <b8b9150a747453c070ad3b0e4c92d2b1b052ad06.camel@kernel.crashing.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <b8b9150a747453c070ad3b0e4c92d2b1b052ad06.camel@kernel.crashing.org>
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Aug 08, 2018 at 11:18:13PM +1000, Benjamin Herrenschmidt wrote:
> Sure, but all of this is just the configuration of the iommu. But I
> think we agree here, and your point remains valid, indeed my proposed
> hack:
> 
> >       if ((flags & VIRTIO_F_IOMMU_PLATFORM) || arch_virtio_wants_dma_ops())
> 
> Will only work if the IOMMU and non-IOMMU path are completely equivalent.
> 
> We can provide that guarantee for our secure VM case, but not generally so if
> we were to go down the route of a quirk in virtio, it might be better to
> make it painfully obvious that it's specific to that one case with a different
> kind of turd:
> 
> -	if (xen_domain())
> +	if (xen_domain() || pseries_secure_vm())
> 		return true;

I don't think it's pseries-specific, actually. E.g. I suspect AMD SEV
might benefit from the same kind of hack.
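
To make the point concrete, here is a rough userspace sketch of what a
non-pseries-specific version of the check could look like. Everything in
it is hypothetical: struct guest_env, the mem_protected flag and
virtio_use_dma_api() are illustration-only names, not kernel API, and it
assumes that pseries secure VMs and SEV guests can be described by one
common "guest memory is hidden from the host" predicate.

```c
#include <stdbool.h>

/* Hypothetical model of the guest state the check would look at. */
struct guest_env {
	bool iommu_platform; /* VIRTIO_F_IOMMU_PLATFORM was negotiated */
	bool xen_domain;     /* stands in for xen_domain() */
	bool mem_protected;  /* assumption: set for any guest whose memory
	                      * the host cannot access directly */
};

/* Returns true when virtio should go through the DMA API. */
static bool virtio_use_dma_api(const struct guest_env *env)
{
	if (env->iommu_platform)
		return true;
	if (env->xen_domain)
		return true;
	/* One generic condition instead of one test per platform. */
	if (env->mem_protected)
		return true;
	return false;
}
```

The only difference from the quoted hunk is that the extra condition is
keyed on a property of the guest rather than on the platform name.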


> So to summarize, and make sure I'm not missing something, the approaches
> at hand are:
> 
>  1- The above, which is a one liner and contained in the guest, so that's nice, but
> also means another turd in virtio which isn't ...
> 
>  2- We force pseries to always set VIRTIO_F_IOMMU_PLATFORM, but with the current
> architecture on our side that will force virtio to always go through an emulated
> iommu, as pseries doesn't have the concept of a real bypass window, and thus will
> impact performance for both secure and non-secure VMs.
> 
>  3- Invent a property that can be put in selected PCI device tree nodes that
> indicates that for that device specifically, the iommu can be bypassed, along with
> a hypercall to turn that bypass on/off. Virtio would then use VIRTIO_F_IOMMU_PLATFORM
> but its DT nodes would also have that property and Linux would notice it and turn
> bypass on.

For completeness, virtio could also have its own bounce buffer,
separate from the DMA API one. I don't see much benefit to this,
though.
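
As a rough illustration of what that would involve, below is a minimal
userspace sketch of a private bounce buffer. All names in it
(struct bounce_pool, bounce_map(), bounce_unmap(), BOUNCE_SIZE) are
made up for the example, the "shared" array merely stands in for memory
the host is allowed to see, and the allocator never frees anything. It
is not how virtio or the DMA API actually implement bouncing.

```c
#include <stddef.h>
#include <string.h>

#define BOUNCE_SIZE 4096

/* Hypothetical pool; "shared" stands in for host-visible memory. */
struct bounce_pool {
	unsigned char shared[BOUNCE_SIZE];
	size_t used; /* bump pointer, always <= BOUNCE_SIZE */
};

/*
 * Copy a private buffer into the shared pool before handing it to the
 * device. Returns the shared slot, or NULL if the pool is exhausted.
 */
static void *bounce_map(struct bounce_pool *p, const void *priv, size_t len)
{
	void *slot;

	if (len > BOUNCE_SIZE - p->used)
		return NULL;
	slot = p->shared + p->used;
	memcpy(slot, priv, len);
	p->used += len;
	return slot;
}

/* Copy device-written data from the shared slot back to private memory. */
static void bounce_unmap(void *priv, const void *slot, size_t len)
{
	memcpy(priv, slot, len);
}
```

Even this toy version ends up with a map/unmap pair around every
transfer, which is the shape the DMA API already provides.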


> The resulting properties of those options are:
> 
> 1- Is what I want because it's the simplest, provides the best performance now,
>    and works without code changes to qemu or non-secure Linux. However it does
>    add a tiny turd to virtio which is annoying.
> 
> 2- This works but it puts the iommu in the way always, thus reducing virtio performance
>    across the board for pseries unless we only do that for secure VMs but that is
>    difficult (as discussed earlier).
> 
> 3- This would recover the performance lost in -2-, however it requires qemu *and*
>    guest changes. Specifically, existing guests (RHEL 7 etc...) would get the
>    performance hit of -2- unless modified to call that 'enable bypass' call, which
>    isn't great.
> 
> So imho we have to choose one of 3 not-great solutions here... Unless I missed
> something in your ideas of course.
> 
> Cheers,
> Ben.