All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alberto Garcia <berto@igalia.com>
To: "Denis V. Lunev" <den@openvz.org>, qemu-devel@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	qemu-block@nongnu.org, Max Reitz <mreitz@redhat.com>
Subject: Re: [Qemu-devel] [RFC] Proposed qcow2 extension: subcluster allocation
Date: Thu, 13 Apr 2017 13:58:12 +0200	[thread overview]
Message-ID: <w51shlcv7sb.fsf@maestria.local.igalia.com> (raw)
In-Reply-To: <2b915695-29b5-df8d-4d89-080eeaaaff13@openvz.org>

On Wed 12 Apr 2017 06:54:50 PM CEST, Denis V. Lunev wrote:
> My opinion about this approach is very negative as the problem could
> be (partially) solved in a much better way.

Hmm... it seems to me that (some of) the problems you are describing are
different from the ones this proposal tries to address. Not that I
disagree with them! I think you are giving useful feedback :)

> 1) current L2 cache management seems very wrong to me. Each cache
>     miss means that we have to read entire L2 cache block. This means
>     that in the worst case (when dataset of the test does not fit L2
>     cache size we read 64kb of L2 table for each 4 kb read).
>
>     The situation is MUCH worse once we are starting to increase
>     cluster size. For 1 Mb blocks we have to read 1 Mb on each cache
>     miss.
>
>     The situation can be cured immediately once we will start reading
>     L2 cache with 4 or 8kb chunks. We have patchset for this for our
>     downstream and preparing it for upstream.

Correct, although the impact of this depends on whether you are using
SDD or HDD.

With an SSD what you want is to minimize is the number of unnecessary
reads, so reading small chunks will likely increase the performance when
there's a cache miss.

With an HDD what you want is to minimize the number of seeks. Once you
have moved the disk head to the location where the cluster is, reading
the whole cluster is relatively inexpensive, so (leaving the memory
requirements aside) you generally want to read as much as possible.

> 2) yet another terrible thing in cluster allocation is its allocation
>     strategy.
>     Current QCOW2 codebase implies that we need 5 (five) IOPSes to
>     complete COW operation. We are reading head, writing head, reading
>     tail, writing tail, writing actual data to be written. This could
>     be easily reduced to 3 IOPSes.

That sounds right, but I'm not sure if this is really incompatible with
my proposal :)

>     Another problem is the amount of data written. We are writing
>     entire cluster in write operation and this is also insane. It is
>     possible to perform fallocate() and actual data write on normal
>     modern filesystem.

But that only works when filling the cluster with zeroes, doesn't it? If
there's a backing image you need to bring all the contents from there.

Berto

  reply	other threads:[~2017-04-13 11:58 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-04-06 15:01 [Qemu-devel] [RFC] Proposed qcow2 extension: subcluster allocation Alberto Garcia
2017-04-06 16:40 ` Eric Blake
2017-04-07  8:49   ` Alberto Garcia
2017-04-07 12:41   ` Kevin Wolf
2017-04-07 14:24     ` Alberto Garcia
2017-04-21 21:09   ` [Qemu-devel] proposed qcow2 extension: cluster reservations [was: " Eric Blake
2017-04-22 17:56     ` Max Reitz
2017-04-24 11:45       ` Kevin Wolf
2017-04-24 12:46       ` Alberto Garcia
2017-04-07 12:20 ` [Qemu-devel] " Stefan Hajnoczi
2017-04-07 12:24   ` Alberto Garcia
2017-04-07 13:01   ` Kevin Wolf
2017-04-10 15:32     ` Stefan Hajnoczi
2017-04-07 17:10 ` Max Reitz
2017-04-10  8:42   ` Kevin Wolf
2017-04-10 15:03     ` Max Reitz
2017-04-11 12:56   ` Alberto Garcia
2017-04-11 14:04     ` Max Reitz
2017-04-11 14:31       ` Alberto Garcia
2017-04-11 14:45         ` [Qemu-devel] [Qemu-block] " Eric Blake
2017-04-12 12:41           ` Alberto Garcia
2017-04-12 14:10             ` Max Reitz
2017-04-13  8:05               ` Alberto Garcia
2017-04-13  9:02                 ` Kevin Wolf
2017-04-13  9:05                   ` Alberto Garcia
2017-04-11 14:49         ` [Qemu-devel] " Kevin Wolf
2017-04-11 14:58           ` Eric Blake
2017-04-11 14:59           ` Max Reitz
2017-04-11 15:08             ` Eric Blake
2017-04-11 15:18               ` Max Reitz
2017-04-11 15:29                 ` Kevin Wolf
2017-04-11 15:29                   ` Max Reitz
2017-04-11 15:30                 ` Eric Blake
2017-04-11 15:34                   ` Max Reitz
2017-04-12 12:47           ` Alberto Garcia
2017-04-12 16:54 ` Denis V. Lunev
2017-04-13 11:58   ` Alberto Garcia [this message]
2017-04-13 12:44     ` Denis V. Lunev
2017-04-13 13:05       ` Kevin Wolf
2017-04-13 13:09         ` Denis V. Lunev
2017-04-13 13:36           ` Alberto Garcia
2017-04-13 14:06             ` Denis V. Lunev
2017-04-13 13:21       ` Alberto Garcia
2017-04-13 13:30         ` Denis V. Lunev
2017-04-13 13:59           ` Kevin Wolf
2017-04-13 15:04           ` Alberto Garcia
2017-04-13 15:17             ` Denis V. Lunev
2017-04-18 11:52               ` Alberto Garcia
2017-04-18 17:27                 ` Denis V. Lunev
2017-04-13 13:51         ` Kevin Wolf
2017-04-13 14:15           ` Alberto Garcia
2017-04-13 14:27             ` Kevin Wolf
2017-04-13 16:42               ` [Qemu-devel] [Qemu-block] " Roman Kagan
2017-04-13 14:42           ` [Qemu-devel] " Denis V. Lunev
2017-04-12 17:55 ` Denis V. Lunev
2017-04-12 18:20   ` Eric Blake
2017-04-12 19:02     ` Denis V. Lunev
2017-04-13  9:44       ` Kevin Wolf
2017-04-13 10:19         ` Denis V. Lunev
2017-04-14  1:06           ` [Qemu-devel] [Qemu-block] " John Snow
2017-04-14  4:17             ` Denis V. Lunev
2017-04-18 11:22               ` Kevin Wolf
2017-04-18 17:30                 ` Denis V. Lunev
2017-04-14  7:40             ` Roman Kagan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=w51shlcv7sb.fsf@maestria.local.igalia.com \
    --to=berto@igalia.com \
    --cc=den@openvz.org \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.