From: Dor Laor <dlaor@redhat.com>
To: Ori Mamluk <omamluk@zerto.com>
Cc: "Kevin Wolf" <kwolf@redhat.com>, "עודד קדם" <oded@zerto.com>,
	"תומר בן אור" <tomer@zertodata.com>,
	qemu-devel@nongnu.org, "Yair Kuszpet" <yairk@zerto.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>
Subject: Re: [Qemu-devel] [RFC PATCH] replication agent module
Date: Wed, 08 Feb 2012 10:49:22 +0200
Message-ID: <4F323712.1030409@redhat.com>
In-Reply-To: <4F3211D0.3070502@zerto.com>

On 02/08/2012 08:10 AM, Ori Mamluk wrote:
> On 07/02/2012 17:47, Paolo Bonzini wrote:
>> On 02/07/2012 03:48 PM, Ori Mamluk wrote:
>>>> The current streaming code in QEMU only deals with the former.
>>>> Streaming to a remote server would not be supported.
>>>>
>>> I need it at the same time. The Rephub reads either the full volume or
>>> parts of, and concurrently protects new IOs.
>>
>> Why can't QEMU itself stream the full volume in the background, and
>> send that together with any new I/O? Is it because the rephub knows
>> which parts are out-of-date and need recovery? In that case, as a
>> first approximation the rephub can pass the sector at which streaming
>> should start.
> Yes - it's because rephub knows. The parts that need recovery may be a
> series of random IOs that were lost because of a network outage
> somewhere along the replication pipe.
> Easy to think of it as a bitmap holding the not-yet-replicated IOs. The
> rephub occasionally reads those areas to 'sync' them, so in effect the
> rephub needs read access - it's not really to trigger streaming from an
> offset.
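
To make the bitmap idea above concrete, here is a minimal sketch of how such
tracking could look. It is only an illustration: the class name, the method
names and the 64 KiB block size are assumptions of mine, not taken from the
patch or from the rephub.

```python
class DirtyBitmap:
    """Tracks which fixed-size blocks still need to be replicated.

    Hypothetical illustration only; names and block size are assumptions.
    """

    def __init__(self, volume_size, block_size=64 * 1024):
        self.block_size = block_size
        # Round up so a partial last block is still tracked.
        self.num_blocks = (volume_size + block_size - 1) // block_size
        self.dirty = set()  # indices of blocks not yet replicated

    def mark_lost(self, offset, length):
        """Record a write that never reached the replica."""
        if length <= 0:
            return
        first = offset // self.block_size
        last = (offset + length - 1) // self.block_size
        # Clip to the end of the volume.
        last = min(last, self.num_blocks - 1)
        self.dirty.update(range(first, last + 1))

    def next_sync_range(self):
        """Return (offset, length) of one dirty block to read, or None."""
        if not self.dirty:
            return None
        block = min(self.dirty)
        return block * self.block_size, self.block_size

    def mark_synced(self, offset, length):
        """Clear the blocks covered by a range that was read and sent."""
        first = offset // self.block_size
        last = (offset + length - 1) // self.block_size
        self.dirty.difference_update(range(first, last + 1))


bitmap = DirtyBitmap(volume_size=1024 * 1024)
bitmap.mark_lost(70_000, 4096)        # a write lost during an outage
pending = bitmap.next_sync_range()    # the range the hub would read next
if pending is not None:
    bitmap.mark_synced(*pending)      # after the read was replicated
```

The point of the sketch is that the sync reads are driven by scattered dirty
ranges, which is why a single "start streaming from sector N" parameter does
not cover this case.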
>>
>> But I'm also starting to wonder whether it would be simpler to use
>> existing replication code. DRBD is more feature-rich, and you can use
>> it over loopback or NBD devices (respectively raw and non-raw), and
>> also store the replication metadata on a file using the loopback
>> device. Ceph even has a userspace library and support within QEMU.
>>
> I think there are two immediate problems that drbd poses:
> 1. Our replication is not a simple mirror - it maintains history. I.e.
> you can recover to any point in time in the last X hours (usually 24) at
> a granularity of about 5 seconds.
> To be able to do that and keep the replica consistent we need to be
> notified for each IO.

Can you please elaborate on the exact details?
In theory, you can build a setup where the drbd (or nbd) copy on the 
destination side writes to an intermediate image, every such write is 
trapped locally on the destination, and you do not have to propagate it 
immediately to the disk image the VM sees.
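
A rough sketch of what I mean by trapping writes on the destination, assuming
each write arrives with a timestamp. The class and method names are made up
for illustration and the image is just an in-memory byte string; this is not
drbd or nbd code.

```python
class WriteJournal:
    """Holds trapped writes next to a base image (hypothetical sketch)."""

    def __init__(self, base_image):
        self.base = bytes(base_image)
        self.entries = []  # (timestamp, offset, data) in arrival order

    def trap_write(self, timestamp, offset, data):
        """Record a replicated write without touching the base image."""
        if offset < 0 or offset + len(data) > len(self.base):
            raise ValueError("write outside of image")
        self.entries.append((timestamp, offset, bytes(data)))

    def image_at(self, point_in_time):
        """Rebuild the image as it looked at the given timestamp."""
        image = bytearray(self.base)
        # Apply, in arrival order, every write up to the requested time.
        for timestamp, offset, data in self.entries:
            if timestamp <= point_in_time:
                image[offset:offset + len(data)] = data
        return bytes(image)


journal = WriteJournal(b"\x00" * 8)
journal.trap_write(10, 0, b"AA")
journal.trap_write(20, 1, b"BB")
old_view = journal.image_at(10)   # sees only the first write
new_view = journal.image_at(25)   # sees both writes
```

Since the base image is never modified, any point in time covered by the
journal can be reconstructed, at the cost of keeping every write for the
retention window.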

> 2. drbd is 'below' all the Qemu block layers - if the protected volume
> is qcow2 then drbd doesn't get the raw IOs, right?

That's one of the major caveats of drbd/iscsi/nbd: there is no support 
for block-level snapshots[1]. I wonder whether the SCSI protocol has 
something like this, so that we would get efficient replication of 
qcow2/lvm snapshots whose base is already shared. If we gain such 
functionality, we will benefit from it for the storage VM motion 
solution too.

Another issue w/ drbd is that a continuous backup solution needs to 
take consistent snapshots, which means calling file system freeze and 
syncing it w/ the current block IO transfer. Neither DRBD nor the other 
protocols do that. Of course DRBD can be enhanced, but it will take a 
lot more time.

A third requirement, similar to the above, is to group snapshots of 
several VMs so that a consistent _cross vm application view_ is created. 
This demands some control over IO tagging.

To summarize, IMHO drbd (which I used successfully 6 years ago and which 
I love) is not a drop-in replacement for this case.
I recommend that we either make this fit the nbd/iscsi case and improve 
our VM storage motion on the way or, worst case, develop proprietary 
logic that can live outside of qemu using an IO tapping interface, 
similar to the guidelines Ori outlined.

Thanks,
Dor

[1] Check the far too basic approach for snapshots: 
http://www.drbd.org/users-guide/s-lvm-snapshots.html
>
> Ori
>
