From: Kevin Wolf <kwolf@redhat.com>
To: Xingbo Wu <wuxb45@gmail.com>
Cc: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] disk image: self-organized format or raw file
Date: Fri, 15 Aug 2014 12:46:52 +0200
Message-ID: <20140815104652.GB3770@noname.redhat.com>
In-Reply-To: <CABPa+v1vAww=2EZwDskpkqE-0-gwiX+JgPtWs8EatJM8yD2aEA@mail.gmail.com>

On 14.08.2014 at 22:53, Xingbo Wu wrote:
> >> >> The main trick of QED was to introduce a dirty flag, which allowed
> >> >> fdatasync() to be called less often because it was okay for image
> >> >> metadata to become inconsistent. After a crash, you then have to
> >> >> repair the image.
> >> >>
> >> >
> >> > I'm very curious about this dirty flag trick. I was surprised when I
> >> > observed very fast 'sync write' performance on QED.
> >> > If it skips the fdatasync when processing the device 'flush' command from
> >> > guest, it literally cheats the guest as the data can be lost. Am I correct?
> >> > Does the repairing make sure all the data written before the last
> >> > successful 'flush'
> >> > can be recovered?
> >> > To my understanding, the 'flush' command in guest asks for persistence.
> >> > Data has to be persistent on host storage after flush except for the
> >> > image opened with 'cache=unsafe' mode.
> >> >
> >>
> >> I have some different ideas. Please correct me if I make any mistake.
> >> The trick may not cause true consistency issues. The relaxed write
> >> ordering (less fdatasync) seems to be safe.
> >> The analysis of this is described here:
> >> http://lists.nongnu.org/archive/html/qemu-devel/2010-09/msg00515.html
> >
> > Yes, specifically point 3. Without the dirty flag, you would have to
> > ensure that the file size is updated first and then the L2 table entry
> > is written. (This would still allow cluster leaks that cannot be
> > reclaimed, but at least no data corruption.)
> >
> >> In my opinion the reason why the ordering is irrelevant is that any
> >> uninitialized block could exist in a block device.
> >> Unordered l1 updates and l2 alloc-writes are also safe because
> >> uninitialized blocks in a file are always zero or beyond the EOF.
> >
> > Yes. This holds true because QED (unlike qcow2) cannot be used directly
> > on block devices. This is a real limitation.
> >
> 
> I don't know much about the best practices in virtualization. Could
> you give me some examples? Thanks.
> Do some products provide resizeable (automatically?) Logical Volumes
> and put one qcow2 on each LV?
> Anyway, does someone use a physical disk to hold only one qcow2 image
> for some special usage?

I would be surprised if someone used a whole physical disk for a single
qcow2 image, but some people always do crazier things than you can
imagine...

Anyway, oVirt uses LVs to store qcow2 images on. It resizes the LVs on
the fly as they fill up. This seems to be working quite well.
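
For the dirty flag scheme quoted at the top of this mail, a toy model may
help to show where the syncs go. This is only a sketch under assumptions:
the header layout, the ToyImage class and its method names are all made up
for illustration and are not QED's on-disk format or QEMU code, and repair()
is a placeholder for a real metadata scan.

```python
import os
import struct

# Hypothetical header layout for this toy: 4-byte magic + 1-byte dirty flag.
HEADER = struct.Struct("<4sB")
MAGIC = b"TOY1"
_sync = getattr(os, "fdatasync", os.fsync)  # fdatasync() is not available everywhere


class ToyImage:
    def __init__(self, path):
        is_new = not os.path.exists(path)
        self.fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        self.sync_count = 0
        self.marked_dirty = False
        if is_new:
            self._set_dirty(False)
        # A dirty flag found at open time means the last user did not close cleanly.
        magic, dirty = HEADER.unpack(os.pread(self.fd, HEADER.size, 0))
        assert magic == MAGIC
        self.needs_repair = bool(dirty)

    def _flush(self):
        _sync(self.fd)
        self.sync_count += 1

    def _set_dirty(self, dirty):
        os.pwrite(self.fd, HEADER.pack(MAGIC, int(dirty)), 0)
        self._flush()  # the flag itself must be durable
        self.marked_dirty = dirty

    def update_metadata(self, offset, payload):
        if not self.marked_dirty:
            self._set_dirty(True)  # one sync before the first metadata change
        os.pwrite(self.fd, payload, HEADER.size + offset)  # no sync per update

    def repair(self):
        # Placeholder: a real format would rescan and fix its tables here.
        self._set_dirty(False)
        self.needs_repair = False

    def close(self, crash=False):
        if not crash and self.marked_dirty:
            self._flush()           # make the metadata durable first
            self._set_dirty(False)  # then declare the image clean
        os.close(self.fd)
```

The point of the sketch is that metadata updates after the first one cost
no extra sync, and the price is paid at open time: an image that still
carries the flag has to be repaired before its metadata can be trusted.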

Kevin
