From: Avi Kivity <avi@redhat.com>
To: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
Cc: "Fernando Luis Vázquez Cao" <fernando@oss.ntt.co.jp>,
	kvm@vger.kernel.org, qemu-devel@nongnu.org,
	"\"大村圭(oomura kei)\"" <ohmura.kei@lab.ntt.co.jp>,
	"Takuya Yoshikawa" <yoshikawa.takuya@oss.ntt.co.jp>,
	anthony@codemonkey.ws, "Andrea Arcangeli" <aarcange@redhat.com>,
	"Chris Wright" <chrisw@redhat.com>
Subject: Re: [RFC] KVM Fault Tolerance: Kemari for KVM
Date: Tue, 17 Nov 2009 14:15:21 +0200
Message-ID: <4B0293D9.7000302@redhat.com>
In-Reply-To: <4B028334.1070004@lab.ntt.co.jp>

On 11/17/2009 01:04 PM, Yoshiaki Tamura wrote:
>> What I mean is:
>>
>> - choose synchronization point A
>> - start copying memory for synchronization point A
>>   - output is delayed
>> - choose synchronization point B
>> - copy memory for A and B
>>    if guest touches memory not yet copied for A, COW it
>> - once A copying is complete, release A output
>> - continue copying memory for B
>> - choose synchronization point C
>>
>> by keeping two synchronization points active, you don't have any 
>> pauses.  The cost is maintaining copy-on-write so we can copy dirty 
>> pages for A while the guest keeps executing.
>
>
> The overall idea seems good, but if I'm understanding correctly, we 
> need a buffer for copying memory locally, and when it gets full, or 
> when we COW the memory for B, we still have to pause the guest to 
> prevent it from overwriting pages that have not yet been copied.  Correct?

Yes.  During the COW the guest cannot access that particular page, but 
vcpus that touch other pages continue to run, so synchronization would 
generally be pauseless.
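
A minimal, self-contained toy sketch of that scheme in C (all names and
the simulation itself are made up for illustration; this is not Kemari
or QEMU code).  Pages dirtied before sync point A are pushed to the
standby in the background; a guest write to a page that still belongs
to A snapshots it first, so only the writing vcpu stalls, and only for
that one page:

#include <stdio.h>

#define NPAGES 8

static int  guest_mem[NPAGES];   /* live guest memory                     */
static int  snap[NPAGES];        /* preserved "checkpoint A" copies (COW) */
static char dirty_a[NPAGES];     /* pages dirtied before sync point A     */
static char snapped[NPAGES];     /* pages already snapshotted via COW     */

/* Guest write while A is still being transferred: if the page belongs to
 * checkpoint A and has not been snapshotted yet, copy it aside first. */
static void guest_write(int page, int val)
{
    if (dirty_a[page] && !snapped[page]) {
        snap[page] = guest_mem[page];
        snapped[page] = 1;
        printf("COW page %d (only this vcpu stalls, briefly)\n", page);
    }
    guest_mem[page] = val;
}

/* Background transfer of checkpoint A: send the snapshot if the guest has
 * already overwritten the page, otherwise the live contents still hold
 * A's view. */
static void send_to_standby(int page)
{
    int val = snapped[page] ? snap[page] : guest_mem[page];
    printf("send page %d = %d for checkpoint A\n", page, val);
}

int main(void)
{
    dirty_a[1] = dirty_a[3] = dirty_a[5] = 1;   /* dirtied before point A */

    guest_write(3, 42);            /* guest runs ahead toward sync point B */

    for (int p = 0; p < NPAGES; p++)            /* transfer A concurrently */
        if (dirty_a[p])
            send_to_standby(p);

    printf("checkpoint A complete: release A's buffered output\n");
    return 0;
}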

> To make things simple, we would like to start with synchronous 
> transmission and tackle asynchronous transmission later.

Of course.  I'm just worried that realistic workloads will drive the 
latency beyond acceptable limits.

>
>>>> How many pages do you copy per synchronization point for reasonably 
>>>> difficult workloads?
>>>
>>> That is very workload-dependent, but if you take a look at the examples
>>> below you can get a feeling of how Kemari behaves.
>>>
>>> IOzone            Kemari sync interval[ms]  dirtied pages
>>> ---------------------------------------------------------
>>> buffered + fsync                       400           3000
>>> O_SYNC                                  10             80
>>>
>>> In summary, if the guest executes few I/O operations, the interval
>>> between Kemari synchronization points will increase and the number of
>>> dirtied pages will grow accordingly.
>>
>> In the example above, the externally observed latency grows to 400 
>> ms, yes?
>
> Not exactly.  The sync interval refers to the interval between 
> synchronization points observed while the workload is running.  In the 
> example above, when the observed sync interval is 400ms, it takes 
> about 150ms to sync the VMs with 3000 dirtied pages.  Kemari resumes I/O 
> operations immediately once the synchronization is finished, and thus 
> the externally observed latency is 150ms in this case.
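
(A rough sanity check of that figure, assuming 4 KiB pages and a
dedicated gigabit link, neither of which is stated above:

    3000 pages x 4 KiB             ~= 12 MB per synchronization
    12 MB at ~117 MB/s (1 Gbit/s)  ~= 100 ms of wire time

so on the order of 150 ms for the whole synchronization, once the pause
and protocol overhead are added, looks plausible.)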

Not sure I understand.

If a packet is output from a guest immediately after a synchronization 
point, doesn't it need to be delayed until the next synchronization 
point?  So it's not just the guest pause time that matters, but also the 
interval between sync points?
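
To illustrate with made-up numbers (not measurements), using the 400 ms 
interval from the table above:

    t =   0 ms   sync point N committed; its buffered output is released
    t =   1 ms   guest emits packet P
    t = 400 ms   sync point N+1 is chosen
    t = 550 ms   N+1 committed; only now can P leave the machine

If so, the worst-case externally observed latency would be on the order 
of the sync interval plus the sync time, not the sync time alone.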

-- 
error compiling committee.c: too many arguments to function

