* [Qemu-devel] Logging dirty pages from vhost-net in-kernel with vIOMMU
@ 2018-12-04 18:37 Jintack Lim
  2018-12-05  1:30 ` Jason Wang
  0 siblings, 1 reply; 12+ messages in thread
From: Jintack Lim @ 2018-12-04 18:37 UTC (permalink / raw)
  To: QEMU Devel Mailing List; +Cc: Jason Wang, Michael S. Tsirkin

Hi,

I'm wondering how the current implementation logs dirty pages during
migration from vhost-net (in-kernel) when a vIOMMU is used.

I understand how vhost-net logs GPAs when not using a vIOMMU. But when
we use vhost with a vIOMMU, shouldn't vhost-net log the translated
address (GPA) instead of the address written in the descriptor (IOVA)?
It looks like the current implementation just logs the IOVA without
translation in vhost_get_vq_desc() in drivers/vhost/net.c, and QEMU
doesn't seem to do any further translation of the dirty log when
syncing.

I might be missing something. Could somebody shed some light on this?

Thanks,
Jintack
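
To make the issue concrete, here is a minimal, self-contained sketch in
plain userspace C. It is not the vhost code; every name in it is invented
for illustration. The dirty log is indexed by guest page frame number, so
if the logging path uses the raw descriptor address (an IOVA once a vIOMMU
is in the picture), it marks the wrong page, and the page that actually
changed is never reported to migration.

/* Illustrative only -- not the vhost implementation; all names are made up. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define PAGE_SHIFT 12
#define LOG_PAGES  1024                      /* toy guest of 4 MiB */

static uint8_t dirty_log[LOG_PAGES / 8];     /* one bit per guest page, GPA-indexed */

static void log_dirty(uint64_t addr)
{
    uint64_t pfn = addr >> PAGE_SHIFT;

    if (pfn < LOG_PAGES)
        dirty_log[pfn / 8] |= 1u << (pfn % 8);
}

static int is_dirty(uint64_t gpa)
{
    uint64_t pfn = gpa >> PAGE_SHIFT;

    return dirty_log[pfn / 8] & (1u << (pfn % 8));
}

int main(void)
{
    /* Assume the vIOMMU maps IOVA 0x200000 to GPA 0x3000 for this buffer. */
    uint64_t iova = 0x200000, gpa = 0x3000;

    log_dirty(iova);                         /* what the buggy path effectively does */
    printf("GPA 0x%llx dirty? %s\n",
           (unsigned long long)gpa, is_dirty(gpa) ? "yes" : "no");   /* "no"  */

    memset(dirty_log, 0, sizeof(dirty_log));
    log_dirty(gpa);                          /* what migration actually needs */
    printf("GPA 0x%llx dirty? %s\n",
           (unsigned long long)gpa, is_dirty(gpa) ? "yes" : "no");   /* "yes" */
    return 0;
}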


* Re: [Qemu-devel] Logging dirty pages from vhost-net in-kernel with vIOMMU
  2018-12-04 18:37 [Qemu-devel] Logging dirty pages from vhost-net in-kernel with vIOMMU Jintack Lim
@ 2018-12-05  1:30 ` Jason Wang
  2018-12-05  1:59   ` Michael S. Tsirkin
  2018-12-05 14:47   ` Jintack Lim
  0 siblings, 2 replies; 12+ messages in thread
From: Jason Wang @ 2018-12-05  1:30 UTC (permalink / raw)
  To: Jintack Lim, QEMU Devel Mailing List; +Cc: Michael S. Tsirkin


On 2018/12/5 2:37 AM, Jintack Lim wrote:
> Hi,
>
> I'm wondering how the current implementation works when logging dirty
> pages during migration from vhost-net (in kernel) when used vIOMMU.
>
> I understand how vhost-net logs GPAs when not using vIOMMU. But when
> we use vhost with vIOMMU, then shouldn't vhost-net need to log the
> translated address (GPA) instead of the address written in the
> descriptor (IOVA) ? The current implementation looks like vhost-net
> just logs IOVA without translation in vhost_get_vq_desc() in
> drivers/vhost/net.c. It seems like QEMU doesn't do any further
> translation of the dirty log when syncing.
>
> I might be missing something. Could somebody shed some light on this?


Good catch. It looks like a bug to me. Want to post a patch for this?

Thanks


>
> Thanks,
> Jintack
>
>


* Re: [Qemu-devel] Logging dirty pages from vhost-net in-kernel with vIOMMU
  2018-12-05  1:30 ` Jason Wang
@ 2018-12-05  1:59   ` Michael S. Tsirkin
  2018-12-05  3:02     ` Jason Wang
  2018-12-05 14:47   ` Jintack Lim
  1 sibling, 1 reply; 12+ messages in thread
From: Michael S. Tsirkin @ 2018-12-05  1:59 UTC (permalink / raw)
  To: Jason Wang; +Cc: Jintack Lim, QEMU Devel Mailing List

On Wed, Dec 05, 2018 at 09:30:19AM +0800, Jason Wang wrote:
> 
> On 2018/12/5 上午2:37, Jintack Lim wrote:
> > Hi,
> > 
> > I'm wondering how the current implementation works when logging dirty
> > pages during migration from vhost-net (in kernel) when used vIOMMU.
> > 
> > I understand how vhost-net logs GPAs when not using vIOMMU. But when
> > we use vhost with vIOMMU, then shouldn't vhost-net need to log the
> > translated address (GPA) instead of the address written in the
> > descriptor (IOVA) ? The current implementation looks like vhost-net
> > just logs IOVA without translation in vhost_get_vq_desc() in
> > drivers/vhost/net.c. It seems like QEMU doesn't do any further
> > translation of the dirty log when syncing.
> > 
> > I might be missing something. Could somebody shed some light on this?
> 
> 
> Good catch. It looks like a bug to me. Want to post a patch for this?

This isn't going to be a quick fix: the IOTLB UAPI translates
IOVA values directly to uaddr.

So to fix it, we need to change the IOVA messages to translate to GPA
so that the GPA can be logged.

For existing userspace we can try the reverse translation uaddr->gpa
as a hack for logging, but that translation was never guaranteed to be
unique.

Jason, I think you'll have to work on it given the complexity.
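
To illustrate what changing the messages would look like, the sketch below
caches an IOTLB entry that also carries a GPA. The iova/size/uaddr fields
mirror the existing vhost_iotlb_msg layout in linux/vhost.h; the extra gpa
field and all the code around it are hypothetical, only to show the idea.

/* Hypothetical extension of an IOTLB entry: with a GPA cached alongside the
 * uaddr, the write path can log the correct guest page. Illustration only. */
#include <stdint.h>
#include <stdio.h>

struct iotlb_entry {
    uint64_t iova;       /* start of the mapping in IOVA space          */
    uint64_t size;
    uint64_t uaddr;      /* QEMU virtual address backing the mapping    */
    uint64_t gpa;        /* hypothetical addition: guest physical base  */
    uint8_t  perm;       /* RO / WO / RW                                */
};

static uint64_t iova_to_gpa(const struct iotlb_entry *e, uint64_t iova)
{
    return e->gpa + (iova - e->iova);
}

int main(void)
{
    struct iotlb_entry e = {
        .iova = 0x200000, .size = 0x10000,
        .uaddr = 0x7f0000200000ULL, .gpa = 0x3000, .perm = 3,
    };
    uint64_t desc_addr = 0x201000;               /* IOVA from a descriptor */

    printf("log dirty GPA 0x%llx instead of IOVA 0x%llx\n",
           (unsigned long long)iova_to_gpa(&e, desc_addr),
           (unsigned long long)desc_addr);
    return 0;
}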

> Thanks
> 
> 
> > 
> > Thanks,
> > Jintack
> > 
> > 


* Re: [Qemu-devel] Logging dirty pages from vhost-net in-kernel with vIOMMU
  2018-12-05  1:59   ` Michael S. Tsirkin
@ 2018-12-05  3:02     ` Jason Wang
  2018-12-05 13:32       ` Michael S. Tsirkin
  0 siblings, 1 reply; 12+ messages in thread
From: Jason Wang @ 2018-12-05  3:02 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Jintack Lim, QEMU Devel Mailing List


On 2018/12/5 9:59 AM, Michael S. Tsirkin wrote:
> On Wed, Dec 05, 2018 at 09:30:19AM +0800, Jason Wang wrote:
>> On 2018/12/5 上午2:37, Jintack Lim wrote:
>>> Hi,
>>>
>>> I'm wondering how the current implementation works when logging dirty
>>> pages during migration from vhost-net (in kernel) when used vIOMMU.
>>>
>>> I understand how vhost-net logs GPAs when not using vIOMMU. But when
>>> we use vhost with vIOMMU, then shouldn't vhost-net need to log the
>>> translated address (GPA) instead of the address written in the
>>> descriptor (IOVA) ? The current implementation looks like vhost-net
>>> just logs IOVA without translation in vhost_get_vq_desc() in
>>> drivers/vhost/net.c. It seems like QEMU doesn't do any further
>>> translation of the dirty log when syncing.
>>>
>>> I might be missing something. Could somebody shed some light on this?
>>
>> Good catch. It looks like a bug to me. Want to post a patch for this?
> This isn't going to be a quick fix: IOTLB UAPI is translating
> IOVA values directly to uaddr.
>
> So to fix it, we need to change IOVA messages to translate to GPA
> so GPA can be logged.
>
> for existing userspace We can try reverse translation uaddr->gpa as a
> hack for logging but that translation was never guaranteed to be unique.


We have the memory table in vhost as well, so it looks like we can do
this in the kernel as well without disturbing the UAPI?

Thanks
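
A rough sketch of that idea follows: purely illustrative userspace C,
assuming only the region layout that userspace already passes to vhost via
VHOST_SET_MEM_TABLE (guest_phys_addr, memory_size, userspace_addr); none of
it is actual kernel code. vhost already resolves the descriptor's IOVA to a
uaddr through the IOTLB for the data copy, so the missing piece would be a
uaddr-to-GPA lookup over the memory table right before writing the dirty log.

/* Illustrative sketch only, not kernel code.  The region fields mirror the
 * entries userspace passes via VHOST_SET_MEM_TABLE; everything else is
 * made up. */
#include <stdint.h>
#include <stdio.h>

struct mem_region {
    uint64_t guest_phys_addr;    /* GPA where the region starts      */
    uint64_t memory_size;        /* length in bytes                  */
    uint64_t userspace_addr;     /* QEMU virtual address backing it  */
};

/* Toy memory table: two guest RAM regions. */
static const struct mem_region table[] = {
    { 0x000000000ULL, 0x80000000, 0x7f0000000000ULL },
    { 0x100000000ULL, 0x80000000, 0x7f8000000000ULL },
};

/* Translate a host virtual address back to a GPA; returns 0 on success. */
static int uaddr_to_gpa(uint64_t uaddr, uint64_t *gpa)
{
    for (unsigned i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
        const struct mem_region *r = &table[i];

        if (uaddr >= r->userspace_addr &&
            uaddr - r->userspace_addr < r->memory_size) {
            *gpa = r->guest_phys_addr + (uaddr - r->userspace_addr);
            return 0;
        }
    }
    return -1;                   /* no region covers this address */
}

int main(void)
{
    uint64_t gpa;

    if (!uaddr_to_gpa(0x7f8000001000ULL, &gpa))
        printf("gpa = 0x%llx\n", (unsigned long long)gpa);   /* 0x100001000 */
    return 0;
}

The obvious caveat, raised below, is that this stops at the first matching
region, which is only correct if a given uaddr maps back to a single GPA.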


>
> Jason I think you'll have to work on it given the complexity.
>
>> Thanks
>>
>>
>>> Thanks,
>>> Jintack
>>>
>>>


* Re: [Qemu-devel] Logging dirty pages from vhost-net in-kernel with vIOMMU
  2018-12-05  3:02     ` Jason Wang
@ 2018-12-05 13:32       ` Michael S. Tsirkin
  2018-12-06  7:27         ` Jason Wang
  0 siblings, 1 reply; 12+ messages in thread
From: Michael S. Tsirkin @ 2018-12-05 13:32 UTC (permalink / raw)
  To: Jason Wang; +Cc: Jintack Lim, QEMU Devel Mailing List

On Wed, Dec 05, 2018 at 11:02:11AM +0800, Jason Wang wrote:
> 
> On 2018/12/5 上午9:59, Michael S. Tsirkin wrote:
> > On Wed, Dec 05, 2018 at 09:30:19AM +0800, Jason Wang wrote:
> > > On 2018/12/5 上午2:37, Jintack Lim wrote:
> > > > Hi,
> > > > 
> > > > I'm wondering how the current implementation works when logging dirty
> > > > pages during migration from vhost-net (in kernel) when used vIOMMU.
> > > > 
> > > > I understand how vhost-net logs GPAs when not using vIOMMU. But when
> > > > we use vhost with vIOMMU, then shouldn't vhost-net need to log the
> > > > translated address (GPA) instead of the address written in the
> > > > descriptor (IOVA) ? The current implementation looks like vhost-net
> > > > just logs IOVA without translation in vhost_get_vq_desc() in
> > > > drivers/vhost/net.c. It seems like QEMU doesn't do any further
> > > > translation of the dirty log when syncing.
> > > > 
> > > > I might be missing something. Could somebody shed some light on this?
> > > 
> > > Good catch. It looks like a bug to me. Want to post a patch for this?
> > This isn't going to be a quick fix: IOTLB UAPI is translating
> > IOVA values directly to uaddr.
> > 
> > So to fix it, we need to change IOVA messages to translate to GPA
> > so GPA can be logged.
> > 
> > for existing userspace We can try reverse translation uaddr->gpa as a
> > hack for logging but that translation was never guaranteed to be unique.
> 
> 
> We have memory table in vhost as well, so looks like we can do this in
> kernel as well without disturbing UAPI?
> 
> Thanks

Let me try to rephrase.

Yes, as a temporary bugfix we can do the uaddr to gpa translations.
It is probably good enough for what QEMU does now.

However, it can break some legal userspace, since it is possible to
have multiple UADDR mappings for a single GPA.
In that setup the vhost table would only have one of these,
and it's possible that the IOTLB would use another one.

And generally it's a better idea security-wise to make the
IOTLB talk in GPA terms. This way whoever sets the static
GPA-to-UADDR mappings controls security, and the dynamic
and more fragile IOVA mappings cannot break QEMU security.

So we need a UAPI extension with a feature flag.
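
For illustration, such an extension would typically be probed and
acknowledged from userspace roughly as in the sketch below.
VHOST_GET_FEATURES and VHOST_SET_FEATURES are existing ioctls, but the
flag name, its bit number, and whether the final design would use these
ioctls or a separate backend-features mechanism are all hypothetical here.

/* Hypothetical userspace probe for a new "IOTLB messages carry GPA" flag.
 * VHOST_GET_FEATURES / VHOST_SET_FEATURES exist; the flag below does not. */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <linux/vhost.h>

#define VHOST_F_IOTLB_MSG_GPA (1ULL << 63)      /* made-up name and bit */

int main(void)
{
    uint64_t features;
    int fd = open("/dev/vhost-net", O_RDWR);

    if (fd < 0) { perror("open /dev/vhost-net"); return 1; }
    if (ioctl(fd, VHOST_GET_FEATURES, &features) < 0) {
        perror("VHOST_GET_FEATURES");
        return 1;
    }

    if (features & VHOST_F_IOTLB_MSG_GPA) {
        /* Ack the flag (together with whatever else we support); from now
         * on IOTLB updates would carry the GPA so vhost can log it. */
        features &= VHOST_F_IOTLB_MSG_GPA;      /* toy: ack only this bit */
        if (ioctl(fd, VHOST_SET_FEATURES, &features) < 0)
            perror("VHOST_SET_FEATURES");
    } else {
        /* Old kernel: fall back to the uaddr->gpa hack described above. */
        printf("GPA-aware IOTLB messages not supported\n");
    }
    return 0;
}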

> 
> > 
> > Jason I think you'll have to work on it given the complexity.
> > 
> > > Thanks
> > > 
> > > 
> > > > Thanks,
> > > > Jintack
> > > > 
> > > > 


* Re: [Qemu-devel] Logging dirty pages from vhost-net in-kernel with vIOMMU
  2018-12-05  1:30 ` Jason Wang
  2018-12-05  1:59   ` Michael S. Tsirkin
@ 2018-12-05 14:47   ` Jintack Lim
  2018-12-06  7:33     ` Jason Wang
  1 sibling, 1 reply; 12+ messages in thread
From: Jintack Lim @ 2018-12-05 14:47 UTC (permalink / raw)
  To: Jason Wang; +Cc: QEMU Devel Mailing List, Michael S. Tsirkin

On Tue, Dec 4, 2018 at 8:30 PM Jason Wang <jasowang@redhat.com> wrote:
>
>
> On 2018/12/5 上午2:37, Jintack Lim wrote:
> > Hi,
> >
> > I'm wondering how the current implementation works when logging dirty
> > pages during migration from vhost-net (in kernel) when used vIOMMU.
> >
> > I understand how vhost-net logs GPAs when not using vIOMMU. But when
> > we use vhost with vIOMMU, then shouldn't vhost-net need to log the
> > translated address (GPA) instead of the address written in the
> > descriptor (IOVA) ? The current implementation looks like vhost-net
> > just logs IOVA without translation in vhost_get_vq_desc() in
> > drivers/vhost/net.c. It seems like QEMU doesn't do any further
> > translation of the dirty log when syncing.
> >
> > I might be missing something. Could somebody shed some light on this?
>
>
> Good catch. It looks like a bug to me. Want to post a patch for this?

Thanks for the confirmation.

What would be a good setup to catch this kind of migration bug? I
tried to observe it in the VM, expecting to see network applications
not receiving data correctly on the destination, but I wasn't
successful (i.e. the VM on the destination just worked fine). I didn't
even see anything going wrong when I disabled vhost logging
completely without using a vIOMMU.

What I did is run multiple network benchmarks (e.g. netperf TCP
stream and my own tool that checks the correctness of received data)
in a VM without vhost dirty page logging, and the benchmarks still ran
fine on the destination. I checked the used ring at the time the VM
was stopped on the source for migration, and it had multiple
descriptors that were (probably) not yet processed by the VM. Do you
have any insight into how it could just work, and what would be a good
setup to catch this?

About sending a patch, as Michael suggested, I think it's better for
you to handle this case - this is not my area of expertise, yet :-)

>
> Thanks
>
>
> >
> > Thanks,
> > Jintack
> >
> >
>


* Re: [Qemu-devel] Logging dirty pages from vhost-net in-kernel with vIOMMU
  2018-12-05 13:32       ` Michael S. Tsirkin
@ 2018-12-06  7:27         ` Jason Wang
  0 siblings, 0 replies; 12+ messages in thread
From: Jason Wang @ 2018-12-06  7:27 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Jintack Lim, QEMU Devel Mailing List


On 2018/12/5 9:32 PM, Michael S. Tsirkin wrote:
> On Wed, Dec 05, 2018 at 11:02:11AM +0800, Jason Wang wrote:
>> On 2018/12/5 上午9:59, Michael S. Tsirkin wrote:
>>> On Wed, Dec 05, 2018 at 09:30:19AM +0800, Jason Wang wrote:
>>>> On 2018/12/5 上午2:37, Jintack Lim wrote:
>>>>> Hi,
>>>>>
>>>>> I'm wondering how the current implementation works when logging dirty
>>>>> pages during migration from vhost-net (in kernel) when used vIOMMU.
>>>>>
>>>>> I understand how vhost-net logs GPAs when not using vIOMMU. But when
>>>>> we use vhost with vIOMMU, then shouldn't vhost-net need to log the
>>>>> translated address (GPA) instead of the address written in the
>>>>> descriptor (IOVA) ? The current implementation looks like vhost-net
>>>>> just logs IOVA without translation in vhost_get_vq_desc() in
>>>>> drivers/vhost/net.c. It seems like QEMU doesn't do any further
>>>>> translation of the dirty log when syncing.
>>>>>
>>>>> I might be missing something. Could somebody shed some light on this?
>>>> Good catch. It looks like a bug to me. Want to post a patch for this?
>>> This isn't going to be a quick fix: IOTLB UAPI is translating
>>> IOVA values directly to uaddr.
>>>
>>> So to fix it, we need to change IOVA messages to translate to GPA
>>> so GPA can be logged.
>>>
>>> for existing userspace We can try reverse translation uaddr->gpa as a
>>> hack for logging but that translation was never guaranteed to be unique.
>>
>> We have memory table in vhost as well, so looks like we can do this in
>> kernel as well without disturbing UAPI?
>>
>> Thanks
> Let me try to rephrase.
>
> Yes, as a temporary bugfix we can do the uaddr to gpa translations.
> It is probably good enough for what QEMU does now.
>
> However it can break some legal userspace, since it is possible to
> have multiple UADDR mappings for a single GPA.
> In that setup the vhost table would only have one of these
> and it's possible that IOTLB would use another one.


Since we are logging the GPA, it doesn't matter which UADDR is used in
this case, because we end up with the same GPA either way. Maybe you
mean multiple GPA mappings for a single UADDR? Then we may want to log
all possible GPAs in this case.
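
A small variant of the earlier reverse-lookup sketch (again illustrative
userspace C with made-up names) shows what "log all possible GPAs" would
mean: instead of returning after the first region that covers the uaddr,
the scan continues and logs every match.

/* Illustrative only: two GPA ranges backed by the same uaddr range, so the
 * logging path marks every candidate guest page rather than just the first. */
#include <stdint.h>
#include <stdio.h>

struct mem_region {
    uint64_t guest_phys_addr, memory_size, userspace_addr;
};

static const struct mem_region table[] = {
    { 0x000000000ULL, 0x10000000, 0x7f0000000000ULL },
    { 0x100000000ULL, 0x10000000, 0x7f0000000000ULL },  /* same uaddr range */
};

static void log_gpa(uint64_t gpa)
{
    printf("log dirty page at GPA 0x%llx\n", (unsigned long long)gpa);
}

static void log_all_gpas_for_uaddr(uint64_t uaddr)
{
    for (unsigned i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
        const struct mem_region *r = &table[i];

        if (uaddr >= r->userspace_addr &&
            uaddr - r->userspace_addr < r->memory_size)
            log_gpa(r->guest_phys_addr + (uaddr - r->userspace_addr));
    }
}

int main(void)
{
    log_all_gpas_for_uaddr(0x7f0000002000ULL);   /* logs 0x2000 and 0x100002000 */
    return 0;
}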


>
> And generally it's a better idea security-wise to make
> iotlb talk in GPA terms. This way whoever sets the static
> GPA-to-UADDR mappings controls security, and the dynamic
> and more fragile iova mappings can not break QEMU security.


AFAIK, this may only work if the memory table and the IOTLB entries
were set by different processes. Considering it's all set by QEMU, and
QEMU goes through the GPA-UADDR mapping before setting the device
IOTLB, it's probably not a gain for us now.


>
> So we need a UAPI extension with a feature flag.
>

Yes.

Thanks


>>> Jason I think you'll have to work on it given the complexity.
>>>
>>>> Thanks
>>>>
>>>>
>>>>> Thanks,
>>>>> Jintack
>>>>>
>>>>>


* Re: [Qemu-devel] Logging dirty pages from vhost-net in-kernel with vIOMMU
  2018-12-05 14:47   ` Jintack Lim
@ 2018-12-06  7:33     ` Jason Wang
  2018-12-06 12:11       ` Jintack Lim
  0 siblings, 1 reply; 12+ messages in thread
From: Jason Wang @ 2018-12-06  7:33 UTC (permalink / raw)
  To: Jintack Lim; +Cc: QEMU Devel Mailing List, Michael S. Tsirkin


On 2018/12/5 10:47 PM, Jintack Lim wrote:
> On Tue, Dec 4, 2018 at 8:30 PM Jason Wang <jasowang@redhat.com> wrote:
>>
>> On 2018/12/5 上午2:37, Jintack Lim wrote:
>>> Hi,
>>>
>>> I'm wondering how the current implementation works when logging dirty
>>> pages during migration from vhost-net (in kernel) when used vIOMMU.
>>>
>>> I understand how vhost-net logs GPAs when not using vIOMMU. But when
>>> we use vhost with vIOMMU, then shouldn't vhost-net need to log the
>>> translated address (GPA) instead of the address written in the
>>> descriptor (IOVA) ? The current implementation looks like vhost-net
>>> just logs IOVA without translation in vhost_get_vq_desc() in
>>> drivers/vhost/net.c. It seems like QEMU doesn't do any further
>>> translation of the dirty log when syncing.
>>>
>>> I might be missing something. Could somebody shed some light on this?
>>
>> Good catch. It looks like a bug to me. Want to post a patch for this?
> Thanks for the confirmation.
>
> What would be a good setup to catch this kind of migration bug? I
> tried to observe it in the VM expecting to see network applications
> not getting data correctly on the destination, but it was not
> successful (i.e. the VM on the destination just worked fine.) I didn't
> even see anything going wrong when I disabled the vhost logging
> completely without using vIOMMU.
>
> What I did is I ran multiple network benchmarks (e.g. netperf tcp
> stream and my own one to check correctness of received data) in a VM
> without vhost dirty page logging, and the benchmarks just ran fine in
> the destination. I checked the used ring at the time the VM is stopped
> in the source for migration, and it had multiple descriptors that is
> (probably) not processed in the VM yet. Do you have any insight how it
> could just work and what would be a good setup to catch this?


According to past experience, it could be reproduced by doing scp from 
host to guest during migration.


>
> About sending a patch, as Michael suggested, I think it's better for
> you to handle this case - this is not my area of expertise, yet :-)


No problem, I will fix this.

Thanks for spotting this issue.


>> Thanks
>>
>>
>>> Thanks,
>>> Jintack
>>>
>>>


* Re: [Qemu-devel] Logging dirty pages from vhost-net in-kernel with vIOMMU
  2018-12-06  7:33     ` Jason Wang
@ 2018-12-06 12:11       ` Jintack Lim
  2018-12-06 12:44         ` Jason Wang
  0 siblings, 1 reply; 12+ messages in thread
From: Jintack Lim @ 2018-12-06 12:11 UTC (permalink / raw)
  To: Jason Wang; +Cc: QEMU Devel Mailing List, Michael S. Tsirkin

On Thu, Dec 6, 2018 at 2:33 AM Jason Wang <jasowang@redhat.com> wrote:
>
>
> On 2018/12/5 下午10:47, Jintack Lim wrote:
> > On Tue, Dec 4, 2018 at 8:30 PM Jason Wang <jasowang@redhat.com> wrote:
> >>
> >> On 2018/12/5 上午2:37, Jintack Lim wrote:
> >>> Hi,
> >>>
> >>> I'm wondering how the current implementation works when logging dirty
> >>> pages during migration from vhost-net (in kernel) when used vIOMMU.
> >>>
> >>> I understand how vhost-net logs GPAs when not using vIOMMU. But when
> >>> we use vhost with vIOMMU, then shouldn't vhost-net need to log the
> >>> translated address (GPA) instead of the address written in the
> >>> descriptor (IOVA) ? The current implementation looks like vhost-net
> >>> just logs IOVA without translation in vhost_get_vq_desc() in
> >>> drivers/vhost/net.c. It seems like QEMU doesn't do any further
> >>> translation of the dirty log when syncing.
> >>>
> >>> I might be missing something. Could somebody shed some light on this?
> >>
> >> Good catch. It looks like a bug to me. Want to post a patch for this?
> > Thanks for the confirmation.
> >
> > What would be a good setup to catch this kind of migration bug? I
> > tried to observe it in the VM expecting to see network applications
> > not getting data correctly on the destination, but it was not
> > successful (i.e. the VM on the destination just worked fine.) I didn't
> > even see anything going wrong when I disabled the vhost logging
> > completely without using vIOMMU.
> >
> > What I did is I ran multiple network benchmarks (e.g. netperf tcp
> > stream and my own one to check correctness of received data) in a VM
> > without vhost dirty page logging, and the benchmarks just ran fine in
> > the destination. I checked the used ring at the time the VM is stopped
> > in the source for migration, and it had multiple descriptors that is
> > (probably) not processed in the VM yet. Do you have any insight how it
> > could just work and what would be a good setup to catch this?
>
>
> According to past experience, it could be reproduced by doing scp from
> host to guest during migration.
>

Thanks. I actually tried that, but didn't see any problem either - I
copied a large file from host to guest during migration (the copy
continued on the destination) and checked MD5 hashes using md5sum,
but the copied file had the same checksum as the one on the host.

Do you recall what kind of symptom you observed when the dirty pages
were not migrated correctly with scp?

>
> >
> > About sending a patch, as Michael suggested, I think it's better for
> > you to handle this case - this is not my area of expertise, yet :-)
>
>
> No problem, I will fix this.
>
> Thanks for spotting this issue.
>
>
> >> Thanks
> >>
> >>
> >>> Thanks,
> >>> Jintack
> >>>
> >>>
>


* Re: [Qemu-devel] Logging dirty pages from vhost-net in-kernel with vIOMMU
  2018-12-06 12:11       ` Jintack Lim
@ 2018-12-06 12:44         ` Jason Wang
  2018-12-07 12:37           ` Jason Wang
  0 siblings, 1 reply; 12+ messages in thread
From: Jason Wang @ 2018-12-06 12:44 UTC (permalink / raw)
  To: Jintack Lim; +Cc: QEMU Devel Mailing List, Michael S. Tsirkin


On 2018/12/6 8:11 PM, Jintack Lim wrote:
> On Thu, Dec 6, 2018 at 2:33 AM Jason Wang <jasowang@redhat.com> wrote:
>>
>> On 2018/12/5 下午10:47, Jintack Lim wrote:
>>> On Tue, Dec 4, 2018 at 8:30 PM Jason Wang <jasowang@redhat.com> wrote:
>>>> On 2018/12/5 上午2:37, Jintack Lim wrote:
>>>>> Hi,
>>>>>
>>>>> I'm wondering how the current implementation works when logging dirty
>>>>> pages during migration from vhost-net (in kernel) when used vIOMMU.
>>>>>
>>>>> I understand how vhost-net logs GPAs when not using vIOMMU. But when
>>>>> we use vhost with vIOMMU, then shouldn't vhost-net need to log the
>>>>> translated address (GPA) instead of the address written in the
>>>>> descriptor (IOVA) ? The current implementation looks like vhost-net
>>>>> just logs IOVA without translation in vhost_get_vq_desc() in
>>>>> drivers/vhost/net.c. It seems like QEMU doesn't do any further
>>>>> translation of the dirty log when syncing.
>>>>>
>>>>> I might be missing something. Could somebody shed some light on this?
>>>> Good catch. It looks like a bug to me. Want to post a patch for this?
>>> Thanks for the confirmation.
>>>
>>> What would be a good setup to catch this kind of migration bug? I
>>> tried to observe it in the VM expecting to see network applications
>>> not getting data correctly on the destination, but it was not
>>> successful (i.e. the VM on the destination just worked fine.) I didn't
>>> even see anything going wrong when I disabled the vhost logging
>>> completely without using vIOMMU.
>>>
>>> What I did is I ran multiple network benchmarks (e.g. netperf tcp
>>> stream and my own one to check correctness of received data) in a VM
>>> without vhost dirty page logging, and the benchmarks just ran fine in
>>> the destination. I checked the used ring at the time the VM is stopped
>>> in the source for migration, and it had multiple descriptors that is
>>> (probably) not processed in the VM yet. Do you have any insight how it
>>> could just work and what would be a good setup to catch this?
>>
>> According to past experience, it could be reproduced by doing scp from
>> host to guest during migration.
>>
> Thanks. I actually tried that, but didn't see any problem either - I
> copied a large file during migration from host to guest (the copy
> continued on the destination), and checked md5 hashes using md5sum,
> but the copied file had the same checksum as the one in the host.
>
> Do you recall what kind of symptom you observed when the dirty pages
> were not migrated correctly with scp?


Yes, the point is to make the migration converge before the scp ends
(e.g. set the migration speed to a very big value). If the scp ends
before the migration does, we won't catch the bug. And it's better to
do several rounds of migration during the scp.

Anyway, let me try to reproduce it tomorrow.

Thanks


>
>>> About sending a patch, as Michael suggested, I think it's better for
>>> you to handle this case - this is not my area of expertise, yet :-)
>>
>> No problem, I will fix this.
>>
>> Thanks for spotting this issue.
>>
>>
>>>> Thanks
>>>>
>>>>
>>>>> Thanks,
>>>>> Jintack
>>>>>
>>>>>
>


* Re: [Qemu-devel] Logging dirty pages from vhost-net in-kernel with vIOMMU
  2018-12-06 12:44         ` Jason Wang
@ 2018-12-07 12:37           ` Jason Wang
  2018-12-09 18:31             ` Jintack Lim
  0 siblings, 1 reply; 12+ messages in thread
From: Jason Wang @ 2018-12-07 12:37 UTC (permalink / raw)
  To: Jintack Lim; +Cc: QEMU Devel Mailing List, Michael S. Tsirkin


On 2018/12/6 8:44 PM, Jason Wang wrote:
>
> On 2018/12/6 下午8:11, Jintack Lim wrote:
>> On Thu, Dec 6, 2018 at 2:33 AM Jason Wang <jasowang@redhat.com> wrote:
>>>
>>> On 2018/12/5 下午10:47, Jintack Lim wrote:
>>>> On Tue, Dec 4, 2018 at 8:30 PM Jason Wang <jasowang@redhat.com> wrote:
>>>>> On 2018/12/5 上午2:37, Jintack Lim wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I'm wondering how the current implementation works when logging 
>>>>>> dirty
>>>>>> pages during migration from vhost-net (in kernel) when used vIOMMU.
>>>>>>
>>>>>> I understand how vhost-net logs GPAs when not using vIOMMU. But when
>>>>>> we use vhost with vIOMMU, then shouldn't vhost-net need to log the
>>>>>> translated address (GPA) instead of the address written in the
>>>>>> descriptor (IOVA) ? The current implementation looks like vhost-net
>>>>>> just logs IOVA without translation in vhost_get_vq_desc() in
>>>>>> drivers/vhost/net.c. It seems like QEMU doesn't do any further
>>>>>> translation of the dirty log when syncing.
>>>>>>
>>>>>> I might be missing something. Could somebody shed some light on 
>>>>>> this?
>>>>> Good catch. It looks like a bug to me. Want to post a patch for this?
>>>> Thanks for the confirmation.
>>>>
>>>> What would be a good setup to catch this kind of migration bug? I
>>>> tried to observe it in the VM expecting to see network applications
>>>> not getting data correctly on the destination, but it was not
>>>> successful (i.e. the VM on the destination just worked fine.) I didn't
>>>> even see anything going wrong when I disabled the vhost logging
>>>> completely without using vIOMMU.
>>>>
>>>> What I did is I ran multiple network benchmarks (e.g. netperf tcp
>>>> stream and my own one to check correctness of received data) in a VM
>>>> without vhost dirty page logging, and the benchmarks just ran fine in
>>>> the destination. I checked the used ring at the time the VM is stopped
>>>> in the source for migration, and it had multiple descriptors that is
>>>> (probably) not processed in the VM yet. Do you have any insight how it
>>>> could just work and what would be a good setup to catch this?
>>>
>>> According to past experience, it could be reproduced by doing scp from
>>> host to guest during migration.
>>>
>> Thanks. I actually tried that, but didn't see any problem either - I
>> copied a large file during migration from host to guest (the copy
>> continued on the destination), and checked md5 hashes using md5sum,
>> but the copied file had the same checksum as the one in the host.
>>
>> Do you recall what kind of symptom you observed when the dirty pages
>> were not migrated correctly with scp?
>
>
> Yes,  the point is to make the migration converge before the end of 
> scp (e.g set migration speed to a very big value). If scp end before 
> migration, we won't catch the bug. And it's better to do several 
> rounds of migration during scp.
>
> Anyway, let me try to reproduce it tomorrow.
>

Looks like I can reproduce this; scp gives the following error:

scp /home/file root@192.168.100.4:/home
file                                     63% 1301MB  58.1MB/s   00:12 ETA
Received disconnect from 192.168.100.4: 2: Packet corrupt
lost connection

FYI, I use the following CLI:

numactl --cpunodebind 0 --membind 0 $qemu_path $img_path \
            -netdev tap,id=hn0,vhost=on \
            -device ioh3420,id=root.1,chassis=1 \
            -device virtio-net-pci,bus=root.1,netdev=hn0,ats=on,disable-legacy=on,disable-modern=off,iommu_platform=on \
            -device intel-iommu,device-iotlb=on \
            -M q35 -m 4G -enable-kvm -cpu host -smp 2 $@

Thanks


> Thanks


* Re: [Qemu-devel] Logging dirty pages from vhost-net in-kernel with vIOMMU
  2018-12-07 12:37           ` Jason Wang
@ 2018-12-09 18:31             ` Jintack Lim
  0 siblings, 0 replies; 12+ messages in thread
From: Jintack Lim @ 2018-12-09 18:31 UTC (permalink / raw)
  To: Jason Wang; +Cc: QEMU Devel Mailing List, Michael S. Tsirkin

On Fri, Dec 7, 2018 at 7:37 AM Jason Wang <jasowang@redhat.com> wrote:
>
>
> On 2018/12/6 下午8:44, Jason Wang wrote:
> >
> > On 2018/12/6 下午8:11, Jintack Lim wrote:
> >> On Thu, Dec 6, 2018 at 2:33 AM Jason Wang <jasowang@redhat.com> wrote:
> >>>
> >>> On 2018/12/5 下午10:47, Jintack Lim wrote:
> >>>> On Tue, Dec 4, 2018 at 8:30 PM Jason Wang <jasowang@redhat.com> wrote:
> >>>>> On 2018/12/5 上午2:37, Jintack Lim wrote:
> >>>>>> Hi,
> >>>>>>
> >>>>>> I'm wondering how the current implementation works when logging
> >>>>>> dirty
> >>>>>> pages during migration from vhost-net (in kernel) when used vIOMMU.
> >>>>>>
> >>>>>> I understand how vhost-net logs GPAs when not using vIOMMU. But when
> >>>>>> we use vhost with vIOMMU, then shouldn't vhost-net need to log the
> >>>>>> translated address (GPA) instead of the address written in the
> >>>>>> descriptor (IOVA) ? The current implementation looks like vhost-net
> >>>>>> just logs IOVA without translation in vhost_get_vq_desc() in
> >>>>>> drivers/vhost/net.c. It seems like QEMU doesn't do any further
> >>>>>> translation of the dirty log when syncing.
> >>>>>>
> >>>>>> I might be missing something. Could somebody shed some light on
> >>>>>> this?
> >>>>> Good catch. It looks like a bug to me. Want to post a patch for this?
> >>>> Thanks for the confirmation.
> >>>>
> >>>> What would be a good setup to catch this kind of migration bug? I
> >>>> tried to observe it in the VM expecting to see network applications
> >>>> not getting data correctly on the destination, but it was not
> >>>> successful (i.e. the VM on the destination just worked fine.) I didn't
> >>>> even see anything going wrong when I disabled the vhost logging
> >>>> completely without using vIOMMU.
> >>>>
> >>>> What I did is I ran multiple network benchmarks (e.g. netperf tcp
> >>>> stream and my own one to check correctness of received data) in a VM
> >>>> without vhost dirty page logging, and the benchmarks just ran fine in
> >>>> the destination. I checked the used ring at the time the VM is stopped
> >>>> in the source for migration, and it had multiple descriptors that is
> >>>> (probably) not processed in the VM yet. Do you have any insight how it
> >>>> could just work and what would be a good setup to catch this?
> >>>
> >>> According to past experience, it could be reproduced by doing scp from
> >>> host to guest during migration.
> >>>
> >> Thanks. I actually tried that, but didn't see any problem either - I
> >> copied a large file during migration from host to guest (the copy
> >> continued on the destination), and checked md5 hashes using md5sum,
> >> but the copied file had the same checksum as the one in the host.
> >>
> >> Do you recall what kind of symptom you observed when the dirty pages
> >> were not migrated correctly with scp?
> >
> >
> > Yes,  the point is to make the migration converge before the end of
> > scp (e.g set migration speed to a very big value). If scp end before
> > migration, we won't catch the bug. And it's better to do several
> > rounds of migration during scp.
> >
> > Anyway, let me try to reproduce it tomorrow.
> >
>
> Looks like I can reproduce this, scp give the following error to me:
>
> scp /home/file root@192.168.100.4:/home
> file                                           63% 1301MB 58.1MB/s
> 00:12 ETAReceived disconnect from 192.168.100.4: 2: Packet corrupt
> lost connection

Thanks for sharing this.

I was able to reproduce the bug. I observed different md5sums on the
host and in the guest after several tries. I didn't observe the
disconnect you saw, but the differing md5sum is enough to show the
bug, I guess.

Thanks,
Jintack

>
> FYI, I use the following cli:
>
> numactl --cpunodebind 0 --membind 0 $qemu_path $img_path \
>             -netdev tap,id=hn0,vhost=on \
>             -device ioh3420,id=root.1,chassis=1 \
>             -device
> virtio-net-pci,bus=root.1,netdev=hn0,ats=on,disable-legacy=on,disable-modern=off,iommu_platform=on
> \
>             -device intel-iommu,device-iotlb=on \
>             -M q35 -m 4G -enable-kvm -cpu host -smp 2 $@
>
> Thanks
>
>
> > Thanks
>

