linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Si-Wei Liu <si-wei.liu@oracle.com>
To: Jason Wang <jasowang@redhat.com>
Cc: Eugenio Perez Martin <eperezma@redhat.com>,
	mst@redhat.com, xuanzhuo@linux.alibaba.com, dtatulea@nvidia.com,
	virtualization@lists.linux-foundation.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/4] vhost-vdpa: reset vendor specific mapping to initial state in .release
Date: Wed, 18 Oct 2023 23:46:35 -0700	[thread overview]
Message-ID: <4ba8b63f-61e8-44cf-8e01-e467bd27ed5b@oracle.com> (raw)
In-Reply-To: <CACGkMEvebhMo5dfcyq2MFhBvFVNbwrqVofJXaBe9Vmn1O4qVjA@mail.gmail.com>



On 10/18/2023 7:53 PM, Jason Wang wrote:
> On Wed, Oct 18, 2023 at 4:49 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>>
>>
>> On 10/18/2023 12:00 AM, Jason Wang wrote:
>>>> Unfortunately, it's a must to stick to ABI. I agree it's a mess but we
>>>> don't have a better choice. Or we can fail the probe if userspace
>>>> doesn't ack this feature.
>>> Antoher idea we can just do the following in vhost_vdpa reset?
>>>
>>> config->reset()
>>> if (IOTLB_PERSIST is not set) {
>>>       config->reset_map()
>>> }
>>>
>>> Then we don't have the burden to maintain them in the parent?
>>>
>>> Thanks
>> Please see my earlier response in the other email, thanks.
>>
>> ----------------%<----------------%<----------------
>>
>> First, the ideal fix would be to leave this reset_vendor_mappings()
>> emulation code on the individual driver itself, which already has the
>> broken behavior.
> So the point is, not about whether the existing behavior is "broken"
> or not.
Hold on, I thought earlier we all agreed upon that the existing behavior 
of vendor driver self-clearing maps during .reset violates the vhost 
iotlb abstraction and also breaks the .set_map/.dma_map API. This is 
100% buggy driver implementation itself that we should discourage or 
eliminate as much as possible (that's part of the goal for this series), 
but here you seem to go existentialism and suggests the very opposite 
that every .set_map/.dma_map driver implementation, regardless being the 
current or the new/upcoming, should unconditionally try to emulate the 
broken reset behavior for the sake of not breaking older userspace. Set 
aside the criteria and definition for how userspace can be broken, can 
we step back to the original question why we think it's broken, and what 
we can do to promote good driver implementation instead of discuss the 
implementation details? Reading the below response I found my major 
points are not heard even if written for quite a few times. It's not 
that I don't understand the importance of not breaking old userspace, I 
appreciate your questions and extra patience, however I do feel the 
"broken" part is very relevant to our discussion here.

If it's broken (in the sense of vhost IOTLB API) that you agree, I think 
we should at least allow good driver implementations; and when you think 
about the possibility of those valid good driver cases 
(.set_map/.dma_map implementations that do not clear maps in .reset),  
you might be able to see why it's coded the way as it is now.

>   It's about whether we could stick to the old behaviour without
> too much cost. And I believe we could.
>
> And just to clarify here, reset_vendor_mappings() = config->reset_map()
>
>> But today there's no backend feature negotiation
>> between vhost-vdpa and the parent driver. Do we want to send down the
>> acked_backend_features to parent drivers?
> There's no need to do that with the above code, or anything I missed here?
>
> config->reset()
> if (IOTLB_PERSIST is not set) {
>        config->reset_map()
> }
Implementation issue: this implies reset_map() has to be there for every 
.set_map implementations, but vendor driver implementation for custom 
IOMMU could well implement DMA ops by itself instead of .reset_map. This 
won't work for every set_map driver (think about the vduse case).

But this is not the the point I was making. I think if you agree this is 
purely buggy driver implementation of its own, we should try to isolate 
this buggy behavior to individual driver rather than overload vhost-vdpa 
or vdpa core's role to help implement the emulation of broken driver 
behavior. I don't get why .reset is special here, the abuse of .reset to 
manipulate mapping could also happen in other IOMMU unrelated driver 
entries like in .suspend, or in queue_reset. If someday userspace is 
found coded around similar buggy driver implementation in other driver 
ops, do we want to follow and duplicate the same emulation in vdpa core 
as the precedent is already set here around .reset?
The buggy driver can fail in a lot of other ways indefinitely during 
reset, if there's a buggy driver that's already broken the way as how it 
is and happens to survive with all userspace apps, we just don't care 
and let it be. There's no way we can enumerate all those buggy behaviors 
in .reset_map itself, it's overloading that driver API too much.
>> Second, IOTLB_PERSIST is needed but not sufficient. Due to lack of
>> backend feature negotiation in parent driver, if vhost-vdpa has to
>> provide the old-behaviour emulation for compatibility on driver's
>> behalf, it needs to be done per-driver basis. There could be good
>> on-chip or vendor IOMMU implementation which doesn't clear the IOTLB in
>> .reset, and vendor specific IOMMU doesn't have to provide .reset_map,
> Then we just don't offer IOTLB_PRESIST, isn't this by design?
Think about the vduse case, it can work with DMA ops directly so doesn't 
have to implement .reset_map, unless for some specific good reason. 
Because it's a conforming and valid/good driver implementation, we may 
still allow it to advertise IOTLB_PERSIST to userspace. Which belongs to 
the 3rd bullet below:

https://lore.kernel.org/virtualization/1696928580-7520-4-git-send-email-si-wei.liu@oracle.com/

There are 3 cases that backend may claim this feature bit on:

- parent device that has to work with platform IOMMU
- parent device with on-chip IOMMU that has the expected
   .reset_map support in driver
- parent device with vendor specific IOMMU implementation
   that explicitly declares the specific backend feature

>
>> we
>> should allow these good driver implementations rather than
>> unconditionally stick to some specific problematic behavior for every
>> other good driver.
> Then you can force reset_map() with set_map() that is what I suggest
> in another thread, no?
This is exactly what I was afraid of that broken behavior emulation may 
become a dangerous slippery slope - in principle we should encourage 
good driver implementation, as they can work totally fine with older 
userspace. Why do they have to bother emulating broken behavior just 
because some other driver's misbehaving? And what's the boundary for 
this hack, do drivers backed by platform IOMMU even have to emulate (if 
not why not, and is there substantial difference in between)? After 
getting through all of this, do you still believe everything is just as 
easy and simple as what thought to be?

Btw, I thought I was expecting but still haven't got the clear answers 
to what was the goal to do all this, we spent a lot of time trying to 
unbreak userspace, but looks to me as if we were trying every possible 
way to break userspace or try to approximate to the same brokenness 
mlx5_vdpa may have caused to the userspace. What we will get eventually 
from these lengthy discussions? On the other hand, if you think it from 
vhost-vdpa user perspective, you'll clearly see there's just a couple of 
ways to unbreak userspace from the internal broken map which is out of 
sync with vhost-vdpa iotlb after device reset. If this brokenness was 
something universally done from the vhost-vdpa layer itself, I'd feel 
it's more of a shared problem, but this is not the case I see it here. 
While the long standing mlx5_vdpa/vdpa_sim issue is 100% misuse of 
.reset op in a wrong way per IOMMU API definition. Why leaving this 
discrepancy to the individual driver is not even an option, I'm still 
not sure?


Thanks,
-Siwei

>
>> Then we need a set of device flags (backend_features
>> bit again?) to indicate the specific driver needs upper layer's help on
>> old-behaviour emulation.
>>
>> Last but not least, I'm not sure how to properly emulate
>> reset_vendor_mappings() from vhost-vdpa layer. If a vendor driver has no
>> .reset_map op implemented, or if .reset_map has a slightly different
>> implementation than what it used to reset the iotlb in the .reset op,
> See above, for reset_vendor_mappings() I meant config->reset_map() exactly.
>
> Thanks
>
>> then this either becomes effectively dead code if no one ends up using,
>> or the vhost-vdpa emulation is helpless and limited in scope, unable to
>> cover all the cases.
>>
>> ----------------%<----------------%<----------------
>>


  reply	other threads:[~2023-10-19  6:47 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-10  9:02 [PATCH 0/4] vdpa: decouple reset of iotlb mapping from device reset Si-Wei Liu
2023-10-10  9:02 ` [PATCH 1/4] vdpa: introduce .reset_map operation callback Si-Wei Liu
2023-10-13  2:49   ` Jason Wang
2023-10-13  7:36     ` Si-Wei Liu
2023-10-16  5:30       ` Jason Wang
2023-10-10  9:02 ` [PATCH 2/4] vhost-vdpa: reset vendor specific mapping to initial state in .release Si-Wei Liu
2023-10-11 11:21   ` Eugenio Perez Martin
2023-10-12  6:18     ` Si-Wei Liu
2023-10-13  3:01   ` Jason Wang
2023-10-13  7:35     ` Si-Wei Liu
2023-10-16  6:32       ` Jason Wang
2023-10-16 11:28         ` Eugenio Perez Martin
2023-10-16 20:30           ` Si-Wei Liu
2023-10-17  2:35             ` Jason Wang
2023-10-17 13:58               ` Eugenio Perez Martin
2023-10-18  4:35               ` Si-Wei Liu
2023-10-18  5:27                 ` Jason Wang
2023-10-18  7:00                   ` Jason Wang
2023-10-18  8:49                     ` Si-Wei Liu
2023-10-19  2:53                       ` Jason Wang
2023-10-19  6:46                         ` Si-Wei Liu [this message]
2023-10-19  8:27                           ` Jason Wang
2023-10-19 14:39                             ` Eugenio Perez Martin
2023-10-19 22:28                               ` Si-Wei Liu
2023-10-20  4:11                                 ` Jason Wang
2023-10-20  5:57                                   ` Si-Wei Liu
2023-10-18  8:44                   ` Si-Wei Liu
2023-10-18 11:14                     ` Eugenio Perez Martin
2023-10-18 23:21                       ` Si-Wei Liu
2023-10-19  2:48                         ` Jason Wang
2023-10-19 22:57                   ` Si-Wei Liu
2023-10-16 20:10         ` Si-Wei Liu
2023-10-10  9:02 ` [PATCH 3/4] vhost-vdpa: introduce IOTLB_PERSIST backend feature bit Si-Wei Liu
2023-10-10  9:03 ` [PATCH 4/4] vdpa/mlx5: implement .reset_map driver op Si-Wei Liu
2023-10-13  3:04   ` Jason Wang
2023-10-13  7:55     ` Si-Wei Liu
2023-10-11 11:30 ` [PATCH 0/4] vdpa: decouple reset of iotlb mapping from device reset Eugenio Perez Martin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4ba8b63f-61e8-44cf-8e01-e467bd27ed5b@oracle.com \
    --to=si-wei.liu@oracle.com \
    --cc=dtatulea@nvidia.com \
    --cc=eperezma@redhat.com \
    --cc=jasowang@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mst@redhat.com \
    --cc=virtualization@lists.linux-foundation.org \
    --cc=xuanzhuo@linux.alibaba.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).