From: Si-Wei Liu <si-wei.liu@oracle.com>
To: Jason Wang <jasowang@redhat.com>
Cc: mst@redhat.com, eperezma@redhat.com, xuanzhuo@linux.alibaba.com,
dtatulea@nvidia.com, virtualization@lists.linux-foundation.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/4] vhost-vdpa: reset vendor specific mapping to initial state in .release
Date: Mon, 16 Oct 2023 13:10:50 -0700
Message-ID: <aa1c177f-7d3f-4b66-8f0b-ab2c4bf40084@oracle.com>
In-Reply-To: <CACGkMEt_zvBM=ysbXZJEC1sdbCk=BpcWvtjeuP_L2WH4ke1dWQ@mail.gmail.com>
On 10/15/2023 11:32 PM, Jason Wang wrote:
> On Fri, Oct 13, 2023 at 3:36 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>>
>>
>> On 10/12/2023 8:01 PM, Jason Wang wrote:
>>> On Tue, Oct 10, 2023 at 5:05 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>>>> Devices with on-chip IOMMU or vendor specific IOTLB implementation
>>>> may need to restore iotlb mapping to the initial or default state
>>>> using the .reset_map op, as it's desirable for some parent devices
>>>> to solely manipulate mappings by its own, independent of virtio device
>>>> state. For instance, device reset does not cause mapping go away on
>>>> such IOTLB model in need of persistent mapping. Before vhost-vdpa
>>>> is going away, give them a chance to reset iotlb back to the initial
>>>> state in vhost_vdpa_cleanup().
>>>>
>>>> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
>>>> ---
>>>> drivers/vhost/vdpa.c | 16 ++++++++++++++++
>>>> 1 file changed, 16 insertions(+)
>>>>
>>>> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
>>>> index 851535f..a3f8160 100644
>>>> --- a/drivers/vhost/vdpa.c
>>>> +++ b/drivers/vhost/vdpa.c
>>>> @@ -131,6 +131,15 @@ static struct vhost_vdpa_as *vhost_vdpa_find_alloc_as(struct vhost_vdpa *v,
>>>> return vhost_vdpa_alloc_as(v, asid);
>>>> }
>>>>
>>>> +static void vhost_vdpa_reset_map(struct vhost_vdpa *v, u32 asid)
>>>> +{
>>>> + struct vdpa_device *vdpa = v->vdpa;
>>>> + const struct vdpa_config_ops *ops = vdpa->config;
>>>> +
>>>> + if (ops->reset_map)
>>>> + ops->reset_map(vdpa, asid);
>>>> +}
>>>> +
>>>> static int vhost_vdpa_remove_as(struct vhost_vdpa *v, u32 asid)
>>>> {
>>>> struct vhost_vdpa_as *as = asid_to_as(v, asid);
>>>> @@ -140,6 +149,13 @@ static int vhost_vdpa_remove_as(struct vhost_vdpa *v, u32 asid)
>>>>
>>>> hlist_del(&as->hash_link);
>>>> vhost_vdpa_iotlb_unmap(v, &as->iotlb, 0ULL, 0ULL - 1, asid);
>>>> + /*
>>>> + * Devices with vendor specific IOMMU may need to restore
>>>> + * iotlb to the initial or default state which is not done
>>>> + * through device reset, as the IOTLB mapping manipulation
>>>> + * could be decoupled from the virtio device life cycle.
>>>> + */
>>> Should we do this according to whether IOTLB_PERSIST is set?
>> Well, in theory it seems so, but it's actually an unnecessary code
>> change, as that is how a vDPA parent behind a platform IOMMU works
>> today, and userspace doesn't break as of today. :)
> Well, this is a question I've asked before. You have explained
> that one of the reasons we don't break userspace is that it may
> couple IOTLB reset with vDPA reset as well. One example is QEMU.
Nope, it was the opposite. Maybe it was not clear enough, so let me try
once more: userspace CANNOT decouple IOTLB reset from vDPA reset today.
This is because a bug/discrepancy in mlx5_vdpa and vdpa_sim already
breaks userspace's expectation, leaving the vhost-vdpa mapping interface
inconsistent and unable to behave as it promised and should have
done. Only when it sees the IOTLB_PERSIST flag can userspace
*reliably* trust the vhost-vdpa kernel interface to decouple IOTLB reset
from vDPA reset. Without seeing this flag, no matter how the code in
QEMU was written, today's older userspace could never assume that the
mappings will *definitely* be cleared by vDPA reset. If any userspace
implementation wants consistent behavior across all vDPA parent
devices, it still has to *explicitly* clear all existing mappings on its
own by sending a bunch of unmap (iotlb invalidate) requests to the
vhost-vdpa kernel interface before resetting the vDPA backend.

In brief, userspace is already broken by the kernel implementation today,
and new userspace needs some device flag to know for sure whether the
kernel bug has been fixed; older userspace doesn't care about preserving
the broken kernel behavior at all, regardless of whether or not it wants
to decouple IOTLB reset from vDPA reset.
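
To make the "explicitly clear before reset" point concrete, here is a
rough sketch of what a parent-agnostic userspace would have to do. It is
only an illustration, not code taken from QEMU: struct mapping and the
helper names are hypothetical, and it assumes the fd is an open
vhost-vdpa device with VHOST_BACKEND_F_IOTLB_MSG_V2 already negotiated.

```c
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/vhost.h>

/* Hypothetical bookkeeping: one entry per mapping userspace installed. */
struct mapping {
	uint64_t iova;
	uint64_t size;
};

/* Fill a v2 IOTLB message asking vhost-vdpa to drop [iova, iova + size). */
static void fill_unmap_msg(struct vhost_msg_v2 *msg, uint64_t iova,
			   uint64_t size)
{
	memset(msg, 0, sizeof(*msg));
	msg->type = VHOST_IOTLB_MSG_V2;
	msg->iotlb.iova = iova;
	msg->iotlb.size = size;
	msg->iotlb.type = VHOST_IOTLB_INVALIDATE;
}

/*
 * Unmap everything userspace knows about, then reset the device by
 * writing status 0. Returns 0 on success, -1 on the first failure.
 */
static int unmap_all_then_reset(int fd, const struct mapping *maps, size_t n)
{
	uint8_t status = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		struct vhost_msg_v2 msg;

		fill_unmap_msg(&msg, maps[i].iova, maps[i].size);
		if (write(fd, &msg, sizeof(msg)) != (ssize_t)sizeof(msg))
			return -1;
	}

	return ioctl(fd, VHOST_VDPA_SET_STATUS, &status);
}
```

With this ordering the outcome does not depend on whether the parent
happens to drop or keep mappings across reset.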
>
>> As explained in previous threads [1][2], when IOTLB_PERSIST is not set
>> it doesn't necessarily mean the iotlb will definitely be destroyed
>> across reset (think about the platform IOMMU case), so userspace today
>> is already tolerant of either a good or a bad IOMMU. Not checking
>> whether IOTLB_PERSIST is set is intentional: there's no point in
>> emulating bad IOMMU behavior even for older userspace (doing such
>> emulation improperly would result in even worse performance).
> For two reasons:
>
> 1) backend features need to be acked by userspace; this is by design
There's no breakage on this part. The IOTLB_PERSIST backend feature
won't be set if userspace doesn't ack it.
> 2) keeping the odd behaviour seems safer, as we can't audit
> every userspace program
We definitely don't have to audit every userspace program, but I cannot
think of a case where a sane userspace program could be broken. Can you
elaborate on one or two potential userspace usages that may break
because of this? As said, platform IOMMU already does it this way.
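
For reference, the ack flow for backend features that point 1) relies on
could look roughly like the sketch below from the userspace side. The
helper names are hypothetical, and the fallback bit number for
VHOST_BACKEND_F_IOTLB_PERSIST is an assumption for headers that do not
carry the definition from patch 3/4 yet.

```c
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/vhost.h>

#ifndef VHOST_BACKEND_F_IOTLB_PERSIST
/* Assumption: bit number used when the installed uapi header lacks it. */
#define VHOST_BACKEND_F_IOTLB_PERSIST 0x8
#endif

/* Non-zero only if IOTLB_PERSIST is part of the acked feature set. */
static int iotlb_persist_acked(uint64_t acked)
{
	return (acked >> VHOST_BACKEND_F_IOTLB_PERSIST) & 1;
}

/*
 * Ack only the intersection of what the kernel offers and what this
 * userspace understands. A feature userspace never asks for is never
 * set, so older userspace keeps seeing the old feature set.
 */
static int negotiate_backend_features(int fd, uint64_t wanted,
				      uint64_t *acked)
{
	uint64_t offered;

	if (ioctl(fd, VHOST_GET_BACKEND_FEATURES, &offered) < 0)
		return -1;

	*acked = offered & wanted;
	return ioctl(fd, VHOST_SET_BACKEND_FEATURES, acked);
}
```

Only when iotlb_persist_acked() returns non-zero may userspace skip the
explicit unmap before reset; otherwise it has to assume nothing about
what reset does to the mappings.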
Regards,
-Siwei
>
> Thanks
>
>> I think
>> the purpose of the IOTLB_PERSIST flag is just to give userspace 100%
>> certainty of persistent iotlb mapping not getting lost across vdpa reset.
>>
>> Thanks,
>> -Siwei
>>
>> [1]
>> https://lore.kernel.org/virtualization/9f118fc9-4f6f-dd67-a291-be78152e47fd@oracle.com/
>> [2]
>> https://lore.kernel.org/virtualization/3364adfd-1eb7-8bce-41f9-bfe5473f1f2e@oracle.com/
>>> Otherwise
>>> we may break old userspace.
>>>
>>> Thanks
>>>
>>>> + vhost_vdpa_reset_map(v, asid);
>>>> kfree(as);
>>>>
>>>> return 0;
>>>> --
>>>> 1.8.3.1
>>>>