Re: [Qemu-devel] [PATCH][RFC][0/2] REF+/REF- optimization

From: Kevin Wolf <kwolf@redhat.com>
To: Frediano Ziglio <freddy77@gmail.com>
Cc: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH][RFC][0/2] REF+/REF- optimization
Date: Tue, 13 Sep 2011 12:37:51 +0200	[thread overview]
Message-ID: <4E6F327F.203@redhat.com> (raw)
In-Reply-To: <1315900388-6448-1-git-send-email-freddy77@gmail.com>

Am 13.09.2011 09:53, schrieb Frediano Ziglio:
> These patches try to trade-off between leaks and speed for clusters
> refcounts.
> 
> Refcount increments (REF+ or refp) are handled in a different way from
> decrements (REF- or refm). The reason it that posting or not flushing
> a REF- cause "just" a leak while posting a REF+ cause a corruption.
> 
> To optimize REF- I just used an array to store offsets then when a
> flush is requested or array reach a limit (currently 1022) the array
> is sorted and written to disk. I use an array with offset instead of
> ranges to support compression (an offset could appear multiple times
> in the array).
> I consider this patch quite ready.

Ok, first of all let's clarify what this optimises. I don't think it
changes anything at all for the writeback cache modes, because these
already do most operations in memory only. So this must be about
optimising some operations with cache=writethrough. REF- isn't about
normal cluster allocation, it is about COW with internal snapshots or
bdrv_discard. Do you have benchmarks for any of them?

I strongly disagree with your approach for REF-. We already have a
cache, and introducing a second one sounds like a bad idea. I think we
could get a very similar effect if we introduced a
qcow2_cache_entry_mark_dirty_wb() that marks a given refcount block as
dirty, but at the same time tells the cache that even in write-through
mode it can still treat this block as write-back. This should require
much less code changes.

But let's measure the effects first, I suspect that for cluster
allocation it doesn't help much because every REF- comes with a REF+.

> To optimize REF+ I mark a range as allocated and use this range to
> get new ones (avoiding writing refcount to disk). When a flush is
> requested or in some situations (like snapshot) this cache is disabled
> and flushed (written as REF-).
> I do not consider this patch ready, it works and pass all io-tests
> but for instance I would avoid allocating new clusters for refcount
> during preallocation.

The only question here is if improving cache=writethrough cluster
allocation performance is worth the additional complexity in the already
complex refcounting code.

The alternative that was discussed before is the dirty bit approach that
is used in QED and would allow us to use writeback for all refcount
blocks, regardless of REF- or REF+. It would be an easier approach
requiring less code changes, but it comes with the cost of requiring an
fsck after a qemu crash.

> End speed up is quite visible allocating clusters (more then 20%).

What benchmark do you use for testing this?

Kevin