linux-kernel.vger.kernel.org archive mirror
From: Si-Wei Liu <si-wei.liu@oracle.com>
To: Jason Wang <jasowang@redhat.com>
Cc: mst@redhat.com, eperezma@redhat.com, xuanzhuo@linux.alibaba.com,
	dtatulea@nvidia.com, virtualization@lists.linux-foundation.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/4] vhost-vdpa: reset vendor specific mapping to initial state in .release
Date: Mon, 16 Oct 2023 13:10:50 -0700
Message-ID: <aa1c177f-7d3f-4b66-8f0b-ab2c4bf40084@oracle.com>
In-Reply-To: <CACGkMEt_zvBM=ysbXZJEC1sdbCk=BpcWvtjeuP_L2WH4ke1dWQ@mail.gmail.com>



On 10/15/2023 11:32 PM, Jason Wang wrote:
> On Fri, Oct 13, 2023 at 3:36 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>>
>>
>> On 10/12/2023 8:01 PM, Jason Wang wrote:
>>> On Tue, Oct 10, 2023 at 5:05 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>>>> Devices with on-chip IOMMU or vendor specific IOTLB implementation
>>>> may need to restore iotlb mapping to the initial or default state
>>>> using the .reset_map op, as it's desirable for some parent devices
>>>> to solely manipulate mappings by its own, independent of virtio device
>>>> state. For instance, device reset does not cause mapping go away on
>>>> such IOTLB model in need of persistent mapping. Before vhost-vdpa
>>>> is going away, give them a chance to reset iotlb back to the initial
>>>> state in vhost_vdpa_cleanup().
>>>>
>>>> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
>>>> ---
>>>>    drivers/vhost/vdpa.c | 16 ++++++++++++++++
>>>>    1 file changed, 16 insertions(+)
>>>>
>>>> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
>>>> index 851535f..a3f8160 100644
>>>> --- a/drivers/vhost/vdpa.c
>>>> +++ b/drivers/vhost/vdpa.c
>>>> @@ -131,6 +131,15 @@ static struct vhost_vdpa_as *vhost_vdpa_find_alloc_as(struct vhost_vdpa *v,
>>>>           return vhost_vdpa_alloc_as(v, asid);
>>>>    }
>>>>
>>>> +static void vhost_vdpa_reset_map(struct vhost_vdpa *v, u32 asid)
>>>> +{
>>>> +       struct vdpa_device *vdpa = v->vdpa;
>>>> +       const struct vdpa_config_ops *ops = vdpa->config;
>>>> +
>>>> +       if (ops->reset_map)
>>>> +               ops->reset_map(vdpa, asid);
>>>> +}
>>>> +
>>>>    static int vhost_vdpa_remove_as(struct vhost_vdpa *v, u32 asid)
>>>>    {
>>>>           struct vhost_vdpa_as *as = asid_to_as(v, asid);
>>>> @@ -140,6 +149,13 @@ static int vhost_vdpa_remove_as(struct vhost_vdpa *v, u32 asid)
>>>>
>>>>           hlist_del(&as->hash_link);
>>>>           vhost_vdpa_iotlb_unmap(v, &as->iotlb, 0ULL, 0ULL - 1, asid);
>>>> +       /*
>>>> +        * Devices with vendor specific IOMMU may need to restore
>>>> +        * iotlb to the initial or default state which is not done
>>>> +        * through device reset, as the IOTLB mapping manipulation
>>>> +        * could be decoupled from the virtio device life cycle.
>>>> +        */
>>> Should we do this according to whether IOTLB_PERSIST is set?
>> Well, in theory it seems so, but it's actually an unnecessary code
>> change, as that is how a vDPA parent behind a platform IOMMU works
>> today, and userspace doesn't break as of today. :)
> Well, this is a question I've asked before. You have explained
> that one of the reasons we don't break userspace is that it may
> couple IOTLB reset with vDPA reset as well. One example is QEMU.
Nope, it was the opposite. Maybe it was not clear enough, so let me try
once more: userspace CANNOT decouple IOTLB reset from vDPA reset today.
This is because a bug/discrepancy in mlx5_vdpa and vdpa_sim already
breaks userspace's expectation, which keeps the vhost-vdpa mapping
interface from behaving as it promised and should have. Only when it
sees the IOTLB_PERSIST flag can userspace *reliably* trust the
vhost-vdpa kernel interface to decouple IOTLB reset from vDPA reset.
Without this flag, no matter how the code in QEMU was written, today's
older userspace could never assume that the mappings will *definitely*
be cleared by vDPA reset. If any userspace implementation wants
consistent behavior across all vDPA parent devices, it still has to
*explicitly* clear all existing mappings on its own, by sending a bunch
of unmap (iotlb invalidate) requests to the vhost-vdpa kernel interface
before resetting the vDPA backend.

In brief, userspace is already broken by the kernel implementation
today, and new userspace needs some device flag to know for sure whether
the kernel bug has already been fixed; older userspace doesn't care
about preserving the broken kernel behavior at all, regardless of
whether or not it wants to decouple IOTLB reset from vDPA reset.
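
The userspace policy described above can be sketched as a small toy
model. This is not taken from QEMU or any real VMM: struct
sketch_backend, sketch_unmap_one() and sketch_prepare_reset() are
hypothetical names, SKETCH_F_IOTLB_PERSIST stands in for the
IOTLB_PERSIST backend feature, and its bit value is illustrative only.
The model just shows that, without the acked flag, userspace tears down
every mapping itself before the reset instead of relying on the reset
to do it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative bit only; the real value lives in the kernel UAPI headers. */
#define SKETCH_F_IOTLB_PERSIST (1ULL << 8)

/* Toy stand-in for a vhost-vdpa backend as seen from userspace. */
struct sketch_backend {
	uint64_t acked_features;	/* backend features userspace has acked */
	size_t nr_mappings;		/* mappings userspace has installed */
	size_t unmaps_sent;		/* explicit invalidate requests issued */
};

/* Hypothetical helper: models sending one iotlb invalidate request. */
static void sketch_unmap_one(struct sketch_backend *b)
{
	assert(b->nr_mappings > 0);
	b->nr_mappings--;
	b->unmaps_sent++;
}

/*
 * Called right before resetting the backend. Returns true when the
 * existing mappings may be relied upon to survive the reset, false
 * when userspace had to clear them explicitly to get the same result
 * on every parent device.
 */
static bool sketch_prepare_reset(struct sketch_backend *b)
{
	assert(b != NULL);

	if (b->acked_features & SKETCH_F_IOTLB_PERSIST)
		return true;

	while (b->nr_mappings > 0)
		sketch_unmap_one(b);

	return false;
}
```

With the flag acked, sketch_prepare_reset() leaves the mappings alone;
without it, the caller ends up with zero mappings regardless of how the
parent device would have treated them across reset.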

>
>> As explained in previous threads [1][2], when IOTLB_PERSIST is not set
>> it doesn't necessarily mean the iotlb will definitely be destroyed
>> across reset (think about the platform IOMMU case), so userspace today
>> is already tolerant of either a good or a bad IOMMU. Not checking
>> whether IOTLB_PERSIST is set is intentional: there's no point in
>> emulating bad IOMMU behavior even for older userspace (improper
>> emulation would only result in even worse performance).
> For two reasons:
>
> 1) backend features need to be acked by userspace; this is by design
There's no breakage on this part. Backend feature IOTLB_PERSIST won't be 
set if userspace doesn't ack.
> 2) keeping the odd behaviour seems safer, as we can't audit
> every userspace program
We definitely don't have to audit every userspace program, and I cannot
think of a case where a sane userspace program would break. Can you
elaborate on one or two potential userspace usages that may break
because of this? As said, a platform IOMMU already behaves this way.
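
For reference, the teardown path under discussion can be modelled in a
few lines of standalone C. This is only a toy mirroring the shape of
vhost_vdpa_reset_map() from the quoted diff: the toy_* types and
functions are hypothetical, the void return type of the callback is an
assumption, and the real vhost_vdpa_remove_as() also unhashes the
address space and unmaps the whole range before this point. It shows
that the call is unconditional on the vhost-vdpa side and becomes a
no-op for parents that do not provide the op.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_vdpa;

/* Toy counterpart of the config ops table; reset_map is optional. */
struct toy_config_ops {
	void (*reset_map)(struct toy_vdpa *vdpa, uint32_t asid);
};

struct toy_vdpa {
	const struct toy_config_ops *config;
	int default_map_restored;	/* set by the toy parent below */
};

/* Toy parent with a vendor specific IOTLB: restores its default mapping. */
static void toy_parent_reset_map(struct toy_vdpa *vdpa, uint32_t asid)
{
	(void)asid;
	vdpa->default_map_restored = 1;
}

/* Same shape as vhost_vdpa_reset_map(): call the op only if it exists. */
static void toy_reset_map(struct toy_vdpa *vdpa, uint32_t asid)
{
	const struct toy_config_ops *ops;

	assert(vdpa != NULL && vdpa->config != NULL);
	ops = vdpa->config;

	if (ops->reset_map)
		ops->reset_map(vdpa, asid);
}

/* Toy teardown: no check of any backend feature before resetting. */
static void toy_remove_as(struct toy_vdpa *vdpa, uint32_t asid)
{
	toy_reset_map(vdpa, asid);
}
```

A parent that leaves .reset_map unset, as a device behind a platform
IOMMU would in this model, is simply unaffected by the call.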

Regards,
-Siwei
>
> Thanks
>
>> I think
>> the purpose of the IOTLB_PERSIST flag is just to give userspace 100%
>> certainty of persistent iotlb mapping not getting lost across vdpa reset.
>>
>> Thanks,
>> -Siwei
>>
>> [1]
>> https://lore.kernel.org/virtualization/9f118fc9-4f6f-dd67-a291-be78152e47fd@oracle.com/
>> [2]
>> https://lore.kernel.org/virtualization/3364adfd-1eb7-8bce-41f9-bfe5473f1f2e@oracle.com/
>>>    Otherwise
>>> we may break old userspace.
>>>
>>> Thanks
>>>
>>>> +       vhost_vdpa_reset_map(v, asid);
>>>>           kfree(as);
>>>>
>>>>           return 0;
>>>> --
>>>> 1.8.3.1
>>>>


