From: "Michael S. Tsirkin" <mst@redhat.com>
To: Jason Wang <jasowang@redhat.com>
Cc: kvm@vger.kernel.org, virtualization@lists.linux-foundation.org,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
Jintack Lim <jintack@cs.columbia.edu>
Subject: Re: [PATCH net V2 4/4] vhost: log dirty page correctly
Date: Tue, 25 Dec 2018 11:25:32 -0500
Message-ID: <20181225111716-mutt-send-email-mst@kernel.org>
In-Reply-To: <9e57732f-2d42-173f-9297-42821f34ab8f@redhat.com>
On Tue, Dec 25, 2018 at 05:43:25PM +0800, Jason Wang wrote:
>
> On 2018/12/25 1:41 AM, Michael S. Tsirkin wrote:
> > On Mon, Dec 24, 2018 at 11:43:31AM +0800, Jason Wang wrote:
> > > On 2018/12/14 9:20 PM, Michael S. Tsirkin wrote:
> > > > On Fri, Dec 14, 2018 at 10:43:03AM +0800, Jason Wang wrote:
> > > > > On 2018/12/13 10:31 PM, Michael S. Tsirkin wrote:
> > > > > > > Just to make sure I understand this. It looks to me we should:
> > > > > > >
> > > > > > > - allow passing GIOVA->GPA through UAPI
> > > > > > >
> > > > > > > - cache GIOVA->GPA somewhere but still use GIOVA->HVA in device IOTLB for
> > > > > > > performance
> > > > > > >
> > > > > > > Is this what you suggest?
> > > > > > >
> > > > > > > Thanks
> > > > > > Not really. We already have GPA->HVA, so I suggested a flag to pass
> > > > > > GIOVA->GPA in the IOTLB.
> > > > > >
> > > > > > This has advantages for security since a single table needs
> > > > > > then to be validated to ensure guest does not corrupt
> > > > > > QEMU memory.
> > > > > >
> > > > > I wonder how much we can gain through this. Currently, qemu IOMMU gives
> > > > > GIOVA->GPA mapping, and qemu vhost code will translate GPA to HVA then pass
> > > > > GIOVA->HVA to vhost. It looks no difference to me.
> > > > >
> > > > > Thanks
> > > > The difference is in security not in performance. Getting a bad HVA
> > > > corrupts QEMU memory and it might be guest controlled. Very risky.
> > > How can this be controlled by guest? HVA was generated from qemu ram blocks
> > > which is totally under the control of qemu memory core instead of guest.
> > >
> > >
> > > Thanks
> > It is ultimately under guest influence as guest supplies IOVA->GPA
> > translations. qemu translates GPA->HVA and gives the translated result
> > to the kernel. If it's not buggy and kernel isn't buggy it's all
> > fine.
>
>
> If qemu provides buggy GPA->HVA, we can't workaround this. And I don't get
> the point why we even want to try this. Buggy qemu code can crash itself in
> many ways.
>
>
> >
> > But that's the approach that was proven not to work in the 20th century.
> > In the 21st century we are trying defence in depth approach.
> >
> > My point is that a single code path that is responsible for
> > the HVA translations is better than two.
> >
>
> So the difference is whether or not we use the memory table information:
>
> Current:
>
> 1) SET_MEM_TABLE: GPA->HVA
>
> 2) Qemu GIOVA->GPA
>
> 3) Qemu GPA->HVA
>
> 4) IOTLB_UPDATE: GIOVA->HVA
>
> If I understand correctly, you want to drop step 3 because it might be buggy,
> even though it is just 19 lines of code in qemu (vhost_memory_region_lookup()).
> This will end up with:
>
> 1) Do GPA->HVA translation in IOTLB_UPDATE path (I believe we won't want to
> do it during device IOTLB lookup).
>
> 2) Extra bits to enable this capability.
>
> So this looks like it needs more code in the kernel than what qemu does in
> userspace. Is this really worthwhile?
>
> Thanks
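For reference, step 3 of the current flow quoted above (GPA->HVA) amounts to a range lookup in the memory table. What follows is only a minimal sketch under that assumption, not the actual qemu vhost_memory_region_lookup() code; struct mem_region and gpa_to_hva() are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one memory table region: GPA range -> HVA base. */
struct mem_region {
    uint64_t gpa;   /* guest physical start address */
    uint64_t size;  /* length of the region in bytes */
    uint64_t hva;   /* host (userspace) virtual start address */
};

/*
 * Translate [gpa, gpa + len) to an HVA using the memory table.
 * Returns 0 on success, -1 if the range does not fit inside one region.
 * The comparisons are written so that they cannot overflow.
 */
int gpa_to_hva(const struct mem_region *regs, size_t n,
               uint64_t gpa, uint64_t len, uint64_t *hva)
{
    assert(hva != NULL);

    for (size_t i = 0; i < n; i++) {
        const struct mem_region *r = &regs[i];

        if (gpa >= r->gpa && len <= r->size &&
            gpa - r->gpa <= r->size - len) {
            *hva = r->hva + (gpa - r->gpa);
            return 0;
        }
    }
    return -1;
}
```

The point of the sketch is that a wrong answer from this lookup ends up as an HVA in the IOTLB update, so whoever runs it is the one that has to get the bounds checks right.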
So there are several points I would like to make:

1. At the moment, without an iommu, it is possible to
change GPA->HVA mappings and everything keeps working,
because a change in memory tables flushes the rings.
However, I don't see the iotlb cache being invalidated
on that path - did I miss it? If it is not there, it's
a related minor bug.
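To illustrate point 1, here is a toy model of the behaviour I would expect: replacing the memory table also drops every cached IOVA->HVA entry. This is a sketch under that assumption only, it is not vhost code, and all toy_* names and the fixed-size cache are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_IOTLB_MAX 8

/* One cached IOVA->HVA translation (hypothetical layout). */
struct toy_iotlb_entry {
    uint64_t iova;
    uint64_t size;
    uint64_t hva;
};

/* Toy device: an iotlb cache plus a counter bumped on each memory table change. */
struct toy_dev {
    struct toy_iotlb_entry iotlb[TOY_IOTLB_MAX];
    size_t iotlb_used;
    uint64_t mem_table_gen;
};

/* Cache one translation; returns -1 when the toy cache is full. */
int toy_iotlb_add(struct toy_dev *d, uint64_t iova, uint64_t size, uint64_t hva)
{
    assert(d != NULL);

    if (d->iotlb_used == TOY_IOTLB_MAX)
        return -1;
    d->iotlb[d->iotlb_used++] = (struct toy_iotlb_entry){ iova, size, hva };
    return 0;
}

/*
 * Memory table change: any cached HVA may now be stale, so the whole
 * cache is dropped and has to be refilled through iotlb misses.
 */
void toy_set_mem_table(struct toy_dev *d)
{
    assert(d != NULL);

    d->mem_table_gen++;
    d->iotlb_used = 0;
}
```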
2. qemu already has a GPA. Discarding it and re-calculating it
when logging is on just seems wrong.
However, if you would like to *also* keep the HVA in the iotlb
to avoid doing extra translations, that sounds like a
reasonable optimization.
3. It also means that the hva->gpa translation only runs
when logging is enabled. That is a rarely exercised
path, so any bugs there will not be caught.

So long term I really would like us to move away from
hva->gpa translations and keep them for legacy userspace only,
but I don't really mind how we do it.
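To make point 3 concrete, below is a rough sketch of a write-logging path that has to go hva->gpa before it can set a dirty bit. It assumes a 4096-byte page and one bit per guest page in the log; struct mem_region, hva_to_gpa() and log_write_hva() are hypothetical names, not the vhost implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u  /* assumed page size for the dirty log */

/* Hypothetical memory table region: GPA range -> HVA base. */
struct mem_region {
    uint64_t gpa;
    uint64_t size;
    uint64_t hva;
};

/* Reverse lookup: find the GPA that backs a single HVA. */
int hva_to_gpa(const struct mem_region *regs, size_t n,
               uint64_t hva, uint64_t *gpa)
{
    assert(gpa != NULL);

    for (size_t i = 0; i < n; i++) {
        if (hva >= regs[i].hva && hva - regs[i].hva < regs[i].size) {
            *gpa = regs[i].gpa + (hva - regs[i].hva);
            return 0;
        }
    }
    return -1;
}

/*
 * Log a one-byte write at hva. With logging off this returns early,
 * so the reverse translation is never exercised.
 */
int log_write_hva(const struct mem_region *regs, size_t n,
                  uint8_t *bitmap, uint64_t bitmap_pages,
                  uint64_t hva, int log_enabled)
{
    uint64_t gpa, page;

    if (!log_enabled)
        return 0;
    if (hva_to_gpa(regs, n, hva, &gpa) != 0)
        return -1;
    page = gpa / TOY_PAGE_SIZE;
    if (page >= bitmap_pages)
        return -1;
    bitmap[page / 8] |= (uint8_t)(1u << (page % 8));
    return 0;
}
```

Everything after the early return only runs during logging (e.g. migration), which is why bugs in it tend to go unnoticed.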
How about:
- a new flag to pass an iotlb entry with *both* a gpa and an hva
- for legacy userspace, calculate the gpa on iotlb update,
  so the device then uses a shared code path
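Roughly, the two bullets above could look like the sketch below. This is not an existing UAPI: TOY_IOTLB_F_HAS_GPA, the message and entry layouts and all function names are hypothetical, and the legacy branch simply assumes the hva falls inside exactly one memory table region.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_IOTLB_F_HAS_GPA 0x1u  /* hypothetical flag: message carries a gpa */

/* Hypothetical memory table region: GPA range -> HVA base. */
struct mem_region {
    uint64_t gpa, size, hva;
};

/* Hypothetical iotlb update message from userspace. */
struct toy_iotlb_msg {
    uint64_t iova, size, hva, gpa;
    uint32_t flags;
};

/* Cached entry: always holds both the hva and the gpa. */
struct toy_iotlb_entry {
    uint64_t iova, size, hva, gpa;
};

/* Legacy helper: derive the gpa from the hva via the memory table. */
int hva_to_gpa(const struct mem_region *regs, size_t n,
               uint64_t hva, uint64_t *gpa)
{
    for (size_t i = 0; i < n; i++) {
        if (hva >= regs[i].hva && hva - regs[i].hva < regs[i].size) {
            *gpa = regs[i].gpa + (hva - regs[i].hva);
            return 0;
        }
    }
    return -1;
}

/* iotlb update: new userspace passes the gpa, legacy gets it calculated once here. */
int toy_iotlb_update(const struct mem_region *regs, size_t n,
                     const struct toy_iotlb_msg *m, struct toy_iotlb_entry *e)
{
    uint64_t gpa;

    assert(m != NULL && e != NULL);
    if (m->flags & TOY_IOTLB_F_HAS_GPA)
        gpa = m->gpa;
    else if (hva_to_gpa(regs, n, m->hva, &gpa) != 0)
        return -1;
    *e = (struct toy_iotlb_entry){ m->iova, m->size, m->hva, gpa };
    return 0;
}

/* Shared logging path: the gpa comes straight from the cached entry. */
uint64_t toy_log_gpa(const struct toy_iotlb_entry *e, uint64_t iova)
{
    assert(iova >= e->iova && iova - e->iova < e->size);
    return e->gpa + (iova - e->iova);
}
```

With this shape the hva->gpa calculation runs on every legacy iotlb update rather than only while logging, and the logging path itself is identical for both kinds of userspace.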
What do you think?
--
MST