On Tue, 2020-11-24 at 20:59 +0200, Maxim Levitsky wrote:
> On Tue, 2020-11-24 at 19:59 +0100, Alberto Garcia wrote:
> > On Tue 24 Nov 2020 10:17:23 AM CET, Kevin Wolf wrote:
> > > We can then continue work to find a minimal reproducer and merge the
> > > test case in the early 6.0 cycle.
> > 
> > I haven't been able to reproduce the problem yet, do you have any
> > findings?
> > 
> > Berto
> > 
> 
> I have a working reproducer script. I'll send it in an hour or so.
> Best regards,
> 	Maxim Levitsky

I have attached a minimal reproducer for this issue.
I can convert this to an iotest if you think that this is worth it.


So these are the exact conditions for the corruption to happen:

1. The image must have at least 5 refcount blocks
   (1 more than the refcount block cache size, which is 4 by default).


2. An IO pattern that fully populates the 4-entry refcount block cache:

 The easiest way to do it is to have 4 L2 entries populated in the base image,
 such that each entry references a physical cluster that is served by a
 different refcount block.

 Then discard these entries in the snapshot, triggering discards in the
 base file during the commit, which will populate the refcount block cache.


3. A discard of a cluster that belongs to the 5th refcount block (done in the
   exact same way as the discards above).
   It should be done soon afterwards, before the L2 cache is flushed due to
   some unrelated IO.

   This triggers the corruption:

The call stack is:

qcow2_free_any_cluster->
	qcow2_free_clusters->
		update_refcount:

			// This sets the dependency between flushing the refcount cache and the L2 cache.
			if (decrease)
				qcow2_cache_set_dependency(bs, s->refcount_block_cache, s->l2_table_cache);


			ret = alloc_refcount_block(bs, cluster_index, &refcount_block);
				return load_refcount_block(bs, refcount_block_offset, refcount_block);
					return qcow2_cache_get(...
						qcow2_cache_do_get
							/* because of a cache miss, we have to evict an entry */
							ret = qcow2_cache_entry_flush(bs, c, i);
							if (c->depends) {
								/* this flushes the L2 cache */
								ret = qcow2_cache_flush_dependency(bs, c);
							}
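
To make the sequence above easier to follow, here is a toy Python model of it.
This is not qemu code: ToyCache and simulate are made-up names, the LRU
eviction policy and the 16-entry L2 cache are assumptions, and the model only
shows when the L2 flush gets triggered, not the on-disk damage that follows.

```python
from collections import OrderedDict


class ToyCache:
    """Tiny write-back cache with an optional flush dependency (toy model)."""

    def __init__(self, name, size, log):
        self.name = name
        self.size = size
        self.log = log                 # shared list recording flush events
        self.entries = OrderedDict()   # key -> dirty flag, oldest first
        self.depends = None            # cache that must be flushed before us

    def set_dependency(self, other):
        self.depends = other

    def flush(self):
        """Write back everything that is dirty."""
        if any(self.entries.values()):
            self.log.append(f"flush {self.name}")
        for key in self.entries:
            self.entries[key] = False

    def _entry_flush(self, key):
        if not self.entries[key]:
            return                     # clean entry, nothing to write back
        if self.depends is not None:
            # Writing back a dirty entry flushes the dependency first.
            self.depends.flush()
            self.depends = None
        self.log.append(f"write back {self.name}[{key}]")
        self.entries[key] = False

    def get(self, key, dirty=False):
        if key not in self.entries:
            if len(self.entries) >= self.size:
                victim = next(iter(self.entries))   # evict the oldest entry
                self._entry_flush(victim)
                del self.entries[victim]
            self.entries[key] = False
        self.entries.move_to_end(key)
        self.entries[key] = self.entries[key] or dirty


def simulate(num_refblocks, cache_size=4):
    """Discard one cluster per refcount block and return the flush log."""
    log = []
    l2 = ToyCache("l2", 16, log)
    refblocks = ToyCache("refblock", cache_size, log)
    for i in range(num_refblocks):
        l2.get(0, dirty=True)          # the discard dirties an L2 table
        refblocks.set_dependency(l2)   # done on a refcount decrease
        refblocks.get(i, dirty=True)   # a miss on a full cache evicts
    return log
```

With 4 refcount blocks, simulate(4) returns an empty log, because nothing is
evicted. With 5, simulate(5) returns ['flush l2', 'write back refblock[0]']:
the miss on the 5th block evicts a dirty entry, and that flushes the L2 cache
from inside the refcount update.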


The attached reproducer works with almost any cluster size and refcount block size.
Cluster sizes below 4K don't work because the commit is done by the mirror job, which works at 4K granularity,
so it doesn't issue any discards due to various alignment restrictions.

If I patch qemu to make the mirror job work at 512B granularity, the test reproduces the problem for small clusters as well.

The reproducer creates a qcow2 image in the current directory, and it needs about 11G with the default parameters
(64K cluster size, 16 bit refcounts).
For 4K cluster size and 64 bit refcounts, it needs only 11M.
(This can be changed by editing the script)
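
For a rough idea of where these sizes come from, here is a small Python sketch
of the refcount block arithmetic. The function names are made up for
illustration. It only computes the lowest host offset that is served by the
5th refcount block, so it gives a lower bound, not the exact amount of space
that the script uses.

```python
def refblock_coverage(cluster_size, refcount_bits):
    """Bytes of the image file covered by one refcount block.

    A refcount block is one cluster in size and holds one refcount entry
    per host cluster.
    """
    entries = cluster_size * 8 // refcount_bits
    return entries * cluster_size


def first_offset_of_block(n, cluster_size, refcount_bits):
    """Lowest host offset served by the n-th refcount block (1-based)."""
    return (n - 1) * refblock_coverage(cluster_size, refcount_bits)


GiB = 1024 ** 3
MiB = 1024 ** 2

# Default parameters: 64K clusters, 16 bit refcounts.
print(refblock_coverage(65536, 16) // GiB)         # 2 (GiB per block)
print(first_offset_of_block(5, 65536, 16) // GiB)  # 8 (GiB)

# Small variant: 4K clusters, 64 bit refcounts.
print(refblock_coverage(4096, 64) // MiB)          # 2 (MiB per block)
print(first_offset_of_block(5, 4096, 64) // MiB)   # 8 (MiB)
```

So the 5th refcount block starts at 8G with the default parameters and at 8M
with the small ones, which is in line with the 11G and 11M figures above.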

Best regards,
	Maxim Levitsky