From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 3E34BEB64D9 for ; Tue, 4 Jul 2023 17:15:45 +0000 (UTC) Received: from list by lists.xenproject.org with outflank-mailman.558581.872803 (Exim 4.92) (envelope-from ) id 1qGjcS-0004o6-1t; Tue, 04 Jul 2023 17:15:16 +0000 X-Outflank-Mailman: Message body and most headers restored to incoming version Received: by outflank-mailman (output) from mailman id 558581.872803; Tue, 04 Jul 2023 17:15:16 +0000 Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1qGjcR-0004nz-Vb; Tue, 04 Jul 2023 17:15:15 +0000 Received: by outflank-mailman (input) for mailman id 558581; Tue, 04 Jul 2023 17:15:15 +0000 Received: from se1-gles-flk1-in.inumbo.com ([94.247.172.50] helo=se1-gles-flk1.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1qGjcR-0004nt-1u for xen-devel@lists.xenproject.org; Tue, 04 Jul 2023 17:15:15 +0000 Received: from mail-pl1-x631.google.com (mail-pl1-x631.google.com [2607:f8b0:4864:20::631]) by se1-gles-flk1.inumbo.com (Halon) with ESMTPS id 555009b9-1a8e-11ee-8611-37d641c3527e; Tue, 04 Jul 2023 19:15:12 +0200 (CEST) Received: by mail-pl1-x631.google.com with SMTP id d9443c01a7336-1b89e10d356so10624025ad.3 for ; Tue, 04 Jul 2023 10:15:12 -0700 (PDT) X-BeenThere: xen-devel@lists.xenproject.org List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xenproject.org Precedence: list Sender: "Xen-devel" X-Inumbo-ID: 555009b9-1a8e-11ee-8611-37d641c3527e DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1688490911; x=1691082911; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=tKZ68cydAkPbtFvf5MYcmHYlMLMsBvPQpIORG93kvps=; b=ZJ+6K+Cc+zO2x+/Th/Qw8/RImWEGtcIwUEk++smcOi9U6oAFmmnjoAbx+stgg2cR/7 naPTqEYWvJQHzJtSeumaeHo0mrR8ryAAm/wzB4Vy3HBkpPAv5KQ20lknynVb/WjDtClb 1av1OBA8XA626AlD4hIdcr2LYmnEXkmsJE2a6hLYvy9bHpJYrjRirtuWjhB1QOFIXNJ+ AnJxaRvrGEWPf6p4pvwbW43Wzpan7P8Y02St28Bg4sVaQJDjakJ3/TYx8QgReuPuhJyk PVDPN4gPZShhEbFxYzHnmixDKIVMxe/FIc6+WYr3TqaBPw/7ACdfNPXLJzvgjNZw2eHi QW2A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1688490911; x=1691082911; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=tKZ68cydAkPbtFvf5MYcmHYlMLMsBvPQpIORG93kvps=; b=JkBxBHnrq/Vy7KbNSmcR63Z7HVfvkxLs8uKHVvfrkbcQqQC31lBWvcU+n5oPe9pKMQ +riq1kw0q1SFbc1kYksE58PC8jHIi2ITA9UJUpOUsaBkktTiV+YR4GWQ0JNhF6x+K1P7 eqaZy45HuevFgrCAizasDofZs5vHcFfSFBWMTP4E2i7ijIhIZlSjYl1tV1sWofGe2I+3 1S2QalNqNyFuIwv2d+j/HJ03HE62o/hERFCX6tWihBvErgwZLrkb+aBfofUzQvOC8e+M Lswfys+WagO2AIgVibfAJaBF5z4FI3mDflyb9nC5QroBFp4Mpf4sCNbZbJUZoNiNimH7 wTSA== X-Gm-Message-State: ABy/qLYwysa+JnMcRVd3yQeShkVUc+OHktlftnVxyL32YzWrpizXN7hu GzLHmUwhNmvNmv97UoQ6HQbrGvqPzuvT1uIAHl8= X-Google-Smtp-Source: APBJJlGxG6QAD6OaWx4SUrnqUir385+eBtxw6rU9uUeMEDOPkEm+/+Wd+dYB2Wxyx8tymUh/paezx1Ygwf6kN3stRZw= X-Received: by 2002:a17:90a:7b87:b0:263:8083:bdcf with SMTP id z7-20020a17090a7b8700b002638083bdcfmr10927832pjc.15.1688490910662; Tue, 04 Jul 2023 10:15:10 -0700 (PDT) MIME-Version: 1.0 References: <20230621131214.9398-1-petr.pavlu@suse.com> <20230621131214.9398-3-petr.pavlu@suse.com> <15e31609-6c45-7372-76ee-0adf7a64fe88@epam.com> In-Reply-To: From: Oleksandr Tyshchenko Date: Tue, 4 Jul 2023 20:14:59 +0300 Message-ID: Subject: Re: [PATCH 2/2] xen/virtio: Avoid use of the dom0 backend in dom0 To: =?UTF-8?Q?Roger_Pau_Monn=C3=A9?= , Stefano Stabellini , Juergen Gross , =?UTF-8?Q?Marek_Marczykowski=2DG=C3=B3recki?= Cc: Oleksandr Tyshchenko , Petr Pavlu , "xen-devel@lists.xenproject.org" , "linux-kernel@vger.kernel.org" , vikram.garhwal@amd.com Content-Type: multipart/alternative; boundary="000000000000efc60b05ffac6c55" --000000000000efc60b05ffac6c55 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Tue, Jul 4, 2023 at 5:49=E2=80=AFPM Roger Pau Monn=C3=A9 wrote: Hello all. [sorry for the possible format issues] On Tue, Jul 04, 2023 at 01:43:46PM +0200, Marek Marczykowski-G=C3=B3recki w= rote: > > Hi, > > > > FWIW, I have ran into this issue some time ago too. I run Xen on top of > > KVM and then passthrough some of the virtio devices (network one > > specifically) into a (PV) guest. So, I hit both cases, the dom0 one and > > domU one. As a temporary workaround I needed to disable > > CONFIG_XEN_VIRTIO completely (just disabling > > CONFIG_XEN_VIRTIO_FORCE_GRANT was not enough to fix it). > > With that context in place, the actual response below. > > > > On Tue, Jul 04, 2023 at 12:39:40PM +0200, Juergen Gross wrote: > > > On 04.07.23 09:48, Roger Pau Monn=C3=A9 wrote: > > > > On Thu, Jun 29, 2023 at 03:44:04PM -0700, Stefano Stabellini wrote: > > > > > On Thu, 29 Jun 2023, Oleksandr Tyshchenko wrote: > > > > > > On 29.06.23 04:00, Stefano Stabellini wrote: > > > > > > > I think we need to add a second way? It could be anything tha= t > can help > > > > > > > us distinguish between a non-grants-capable virtio backend an= d > a > > > > > > > grants-capable virtio backend, such as: > > > > > > > - a string on xenstore > > > > > > > - a xen param > > > > > > > - a special PCI configuration register value > > > > > > > - something in the ACPI tables > > > > > > > - the QEMU machine type > > > > > > > > > > > > > > > > > > Yes, I remember there was a discussion regarding that. The poin= t > is to > > > > > > choose a solution to be functional for both PV and HVM *and* to > be able > > > > > > to support a hotplug. IIRC, the xenstore could be a possible > candidate. > > > > > > > > > > xenstore would be among the easiest to make work. The only > downside is > > > > > the dependency on xenstore which otherwise virtio+grants doesn't > have. > > > > > > > > I would avoid introducing a dependency on xenstore, if nothing else > we > > > > know it's a performance bottleneck. > > > > > > > > We would also need to map the virtio device topology into xenstore, > so > > > > that we can pass different options for each device. > > > > > > This aspect (different options) is important. How do you want to pass > virtio > > > device configuration parameters from dom0 to the virtio backend > domain? You > > > probably need something like Xenstore (a virtio based alternative lik= e > virtiofs > > > would work, too) for that purpose. > > > > > > Mapping the topology should be rather easy via the PCI-Id, e.g.: > > > > > > /local/domain/42/device/virtio/0000:00:1c.0/backend > > > > While I agree this would probably be the simplest to implement, I don't > > like introducing xenstore dependency into virtio frontend either. > > Toolstack -> backend communication is probably easier to solve, as it's > > much more flexible (could use qemu cmdline, QMP, other similar > > mechanisms for non-qemu backends etc). > > I also think features should be exposed uniformly for devices, it's at > least weird to have certain features exposed in the PCI config space > while other features exposed in xenstore. > > For virtio-mmio this might get a bit confusing, are we going to add > xenstore entries based on the position of the device config mmio > region? > > I think on Arm PCI enumeration is not (usually?) done by the firmware, > at which point the SBDF expected by the tools/backend might be > different than the value assigned by the guest OS. > > I think there are two slightly different issues, one is how to pass > information to virtio backends, I think doing this initially based on > xenstore is not that bad, because it's an internal detail of the > backend implementation. However passing information to virtio > frontends using xenstore is IMO a bad idea, there's already a way to > negotiate features between virtio frontends and backends, and Xen > should just expand and use that. > > On Arm with device-tree we have a special bindings which purpose is to inform us whether we need to use grants for virtio and backend domid for a particular device.Here on x86, we don't have a device tree, so cannot (easily?) reuse this logic. I have just recollected one idea suggested by Stefano some time ago [1]. The context of discussion was about what to do when device-tree and ACPI cannot be reused (or something like that).The idea won't cover virtio-mmio, but I have heard that virtio-mmio usage with x86 Xen is rather unusual case= . I will paste the text below for convenience. ********** Part 1 (intro): We could reuse a PCI config space register to expose the backend id. However this solution requires a backend change (QEMU) to expose the backend id via an emulated register for each emulated device. To avoid having to introduce a special config space register in all emulated PCI devices (virtio-net, virtio-block, etc) I wonder if we could add a special PCI config space register at the emulated PCI Root Complex level. Basically the workflow would be as follow: - Linux recognizes the PCI Root Complex as a Xen PCI Root Complex - Linux writes to special PCI config space register of the Xen PCI Root Complex the PCI device id (basically the BDF) - The Xen PCI Root Complex emulated by Xen answers by writing back to the same location the backend id (domid of the backend) - Linux reads back the same PCI config space register of the Xen PCI Root Complex and learn the relevant domid Part 2 (clarification): I think using a special config space register in the root complex would not be terrible in terms of guest changes because it is easy to introduce a new root complex driver in Linux and other OSes. The root complex would still be ECAM compatible so the regular ECAM driver would still work. A new driver would only be necessary if you want to be able to access the special config space register. ********** What do you think about it? Are there any pitfalls, etc? This also requires system changes, but at least without virtio spec changes. [1] https://lore.kernel.org/xen-devel/alpine.DEB.2.22.394.2210061747590.3690179= @ubuntu-linux-20-04-desktop/ --=20 Regards, Oleksandr Tyshchenko --000000000000efc60b05ffac6c55 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable


=
On Tue, Jul 4, 2023 at 5:49=E2=80=AFP= M Roger Pau Monn=C3=A9 <roger.pa= u@citrix.com> wrote:
=
Hello all.

[sorry for the possible= format issues]


On Tue, Jul 04, 2023 at 01:43:46PM +0200, Marek Marczykows= ki-G=C3=B3recki wrote:
> Hi,
>
> FWIW, I have ran into this issue some time ago too. I run Xen on top o= f
> KVM and then passthrough some of the virtio devices (network one
> specifically) into a (PV) guest. So, I hit both cases, the dom0 one an= d
> domU one. As a temporary workaround I needed to disable
> CONFIG_XEN_VIRTIO completely (just disabling
> CONFIG_XEN_VIRTIO_FORCE_GRANT was not enough to fix it).
> With that context in place, the actual response below.
>
> On Tue, Jul 04, 2023 at 12:39:40PM +0200, Juergen Gross wrote:
> > On 04.07.23 09:48, Roger Pau Monn=C3=A9 wrote:
> > > On Thu, Jun 29, 2023 at 03:44:04PM -0700, Stefano Stabellini= wrote:
> > > > On Thu, 29 Jun 2023, Oleksandr Tyshchenko wrote:
> > > > > On 29.06.23 04:00, Stefano Stabellini wrote:
> > > > > > I think we need to add a second way? It could= be anything that can help
> > > > > > us distinguish between a non-grants-capable v= irtio backend and a
> > > > > > grants-capable virtio backend, such as:
> > > > > > - a string on xenstore
> > > > > > - a xen param
> > > > > > - a special PCI configuration register value<= br> > > > > > > - something in the ACPI tables
> > > > > > - the QEMU machine type
> > > > >
> > > > >
> > > > > Yes, I remember there was a discussion regarding t= hat. The point is to
> > > > > choose a solution to be functional for both PV and= HVM *and* to be able
> > > > > to support a hotplug. IIRC, the xenstore could be = a possible candidate.
> > > >
> > > > xenstore would be among the easiest to make work. The o= nly downside is
> > > > the dependency on xenstore which otherwise virtio+grant= s doesn't have.
> > >
> > > I would avoid introducing a dependency on xenstore, if nothi= ng else we
> > > know it's a performance bottleneck.
> > >
> > > We would also need to map the virtio device topology into xe= nstore, so
> > > that we can pass different options for each device.
> >
> > This aspect (different options) is important. How do you want to = pass virtio
> > device configuration parameters from dom0 to the virtio backend d= omain? You
> > probably need something like Xenstore (a virtio based alternative= like virtiofs
> > would work, too) for that purpose.
> >
> > Mapping the topology should be rather easy via the PCI-Id, e.g.:<= br> > >
> > /local/domain/42/device/virtio/0000:00:1c.0/backend
>
> While I agree this would probably be the simplest to implement, I don&= #39;t
> like introducing xenstore dependency into virtio frontend either.
> Toolstack -> backend communication is probably easier to solve, as = it's
> much more flexible (could use qemu cmdline, QMP, other similar
> mechanisms for non-qemu backends etc).

I also think features should be exposed uniformly for devices, it's at<= br> least weird to have certain features exposed in the PCI config space
while other features exposed in xenstore.

For virtio-mmio this might get a bit confusing, are we going to add
xenstore entries based on the position of the device config mmio
region?

I think on Arm PCI enumeration is not (usually?) done by the firmware,
at which point the SBDF expected by the tools/backend might be
different than the value assigned by the guest OS.

I think there are two slightly different issues, one is how to pass
information to virtio backends, I think doing this initially based on
xenstore is not that bad, because it's an internal detail of the
backend implementation. However passing information to virtio
frontends using xenstore is IMO a bad idea, there's already a way to negotiate features between virtio frontends and backends, and Xen
should just expand and use that.


On Arm with device-tree we have a special bindings which purpo= se is to inform us whether we need to use grants for virtio and backend dom= id for a particular device.Here on x86, we don't have a device tree, so= cannot (easily?) reuse this logic.

I have just recollected o= ne idea suggested by Stefano some time ago [1]. The context of discussion w= as about what to do when device-tree and ACPI cannot be reused (or somethin= g like that).The idea won't cover virtio-mmio, but I have heard that vi= rtio-mmio usage with x86 Xen is rather unusual case.

I will p= aste the text below for convenience.=C2=A0

*******= ***

Part 1 (intro):

We could reuse a PCI config= space register to expose the backend id.
However this solution requires= a backend change (QEMU) to expose the
backend id via an emulated regist= er for each emulated device.

To avoid having to introduce a special = config space register in all
emulated PCI devices (virtio-net, virtio-bl= ock, etc) I wonder if we
could add a special PCI config space register a= t the emulated PCI Root
Complex level.

Basically the workflow wou= ld be as follow:

- Linux recognizes the PCI Root Complex as a Xen PC= I Root Complex
- Linux writes to special PCI config space register of th= e Xen PCI Root
=C2=A0 Complex the PCI device id (basically the BDF)
-= The Xen PCI Root Complex emulated by Xen answers by writing back to
=C2= =A0 the same location the backend id (domid of the backend)
- Linux read= s back the same PCI config space register of the Xen PCI
=C2=A0 Root Com= plex and learn the relevant domid

Part 2 (clarification):

I think using a special config space register in the root comple= x would
not be terrible in terms of guest changes because it is easy to<= br>introduce a new root complex driver in Linux and other OSes. The rootcomplex would still be ECAM compatible so the regular ECAM driver wouldstill work. A new driver would only be necessary if you want to be ableto access the special config space register.


**********
What= do=C2=A0you=C2=A0think about it? Are there any pitfalls, etc? This also re= quires system changes, but at least without virtio spec changes.=C2=A0=C2= =A0



= --
Regards,<= /span>

= Oleksand= r Tyshchenko
--000000000000efc60b05ffac6c55--