On 06.04.2017 15:04, Stefan Hajnoczi wrote: > On Mon, Apr 03, 2017 at 06:09:33PM +0200, Max Reitz wrote: >> + BDRVQcow2State *s = bs->opaque; >> + uint64_t aligned_cur_size = align_offset(current_size, cluster_size); >> + uint64_t creftable_length; >> + uint64_t i; >> + >> + /* new total size of L2 tables */ >> + nl2e = aligned_total_size / cluster_size; >> + nl2e = align_offset(nl2e, cluster_size / sizeof(uint64_t)); >> + meta_size += nl2e * sizeof(uint64_t); >> + >> + /* Subtract L2 tables which are already present */ >> + for (i = 0; i < s->l1_size; i++) { >> + if (s->l1_table[i] & L1E_OFFSET_MASK) { >> + meta_size -= cluster_size; >> + } >> + } > > Allocated L1 table entries are 'A', unallocated L1 table entries in > the existing image are '-', and new L1 table entries after growth are > '+': > > |-A-AAA--+++++| > > This for loop includes '-' entries. Should we count them or just the > '+' entries? Hm, good question, thanks. I suppose we should count them if we wanted to fully allocate the image now. But that's not the intention, we just want to fully allocate the new area. While you can make the calculation faster if you take that into account, it also gets a bit more complicated: We can subtract all L2 tables that do not point to the new area. But there may be L2 tables already which do point there which we therefore do not have to allocate. So we'd still need to do this loop, but the starting index would be the first L2 table index to point into the area. So the above loop may calculate more space to be required than actually is the case; I could argue that's fine because this function may overcommit (especially if the image wasn't fully allocated before). But modifying it to ignore all L2 tables before the new area doesn't seem to be too complicated, so I'll do it. >> - /* total size of refcount tables */ >> - nreftablee = nrefblocke / refblock_size; >> - nreftablee = align_offset(nreftablee, cluster_size / sizeof(uint64_t)); >> - meta_size += nreftablee * sizeof(uint64_t); >> + /* Do not add L1 table size because the only caller of this path >> + * (qcow2_truncate) has increased its size already. */ >> >> - return aligned_total_size + meta_size; >> + /* Calculate size of the additional refblocks (this assumes that all of >> + * the existing image is covered by refblocks, which is extremely >> + * likely); this may result in overallocation because parts of the newly >> + * added space may be covered by existing refblocks, but that is fine. >> + * >> + * This only considers the newly added space. Since we cannot update the >> + * reftable in-place, we will have to able to store both the old and the >> + * new one at the same time, though. Therefore, we need to add the size >> + * of the old reftable here. >> + */ >> + creftable_length = ROUND_UP(s->refcount_table_size * sizeof(uint64_t), >> + cluster_size); >> + nrefblocke = ((aligned_total_size - aligned_cur_size) + meta_size + >> + creftable_length + cluster_size) >> + / (cluster_size - rces - >> + rces * sizeof(uint64_t) / cluster_size); >> + meta_size += DIV_ROUND_UP(nrefblocke, refblock_size) * cluster_size; >> + >> + /* total size of the new refcount table (again, may be too much because >> + * it assumes that the new area is not covered by any refcount blocks >> + * yet) */ >> + nreftablee = s->max_refcount_table_index + 1 + >> + nrefblocke / refblock_size; >> + nreftablee = align_offset(nreftablee, cluster_size / sizeof(uint64_t)); >> + meta_size += nreftablee * sizeof(uint64_t); >> + >> + return (aligned_total_size - aligned_cur_size) + meta_size; > > This calculation is an approximation. Here is a simpler approximation: > > current_usage = qcow2_calc_size_usage(current_size, ...); > new_usage = qcow2_calc_size_usage(new_size, ...); > delta = new_usage - current_usage; > > (Perhaps the new refcount_table clusters need to be added to new_size > too.) > > Is there an advantage of the more elaborate calculation implemented in > this patch? Now that you mention it... Well, the important thing is it's a conservative approximation. It's supposed to never underestimate the correct result. Now the existing image doesn't need to be fully allocated. However, your snippet assumes that it is. Often, this actually wouldn't be an issue. But it is for clusters that are "shared" between the existing image and the new area: 1. The old final L2 table may point into the new area. Your code assumes it is already present but it may not be. 2. current_size need not be aligned to clusters. If it isn't, the new area will reuse a data cluster from the existing image that now needs to be allocated. 3. Theoretically: The L1 table may be not be actually allocated. We have to make sure it is. In practice: We call qcow2_grow_l1_table() before doing any preallocation, and that always fully allocates the (new) L1 table. So we're good. 4. Of course, just as always, it gets *very* funny with refcount information. Maybe the existing image is sparsely allocated in a way that its allocated cluster count is aligned to refblocks so that it completely fills up all the refblocks it has (and those completely fill up, say, one reftable cluster). Now the calculation above might assume that we have more allocated clusters and therefore enough free entries in the last refblock to put everything there. But when actually doing the allocation... Surprise, you need to allocate one new refblock and a hole new reftable because that new refblock doesn't fit into the old one. So if I implemented your snippet and wanted to keep conservative, I'd have to add the following cluster counts for each of these: 1. 1 2. 1 3. -- 4. 1 (refblock) + number of existing reftable clusters + 1 (resized reftable) So this is not really good. We could add checks so we keep the count lower: 1. Check whether current_size is aligned to L2 boundaries. If it isn't, check whether the final L2 table is allocated. If it isn't, add 1. 2. Check whether current_size is aligned to clusters. If it isn't, check whether the final cluster is allocated. If it isn't, add 1. 4. Errr... This seems hard (like all refcount stuff). Maybe check whether the super-conservative estimate above would fit into the last refblock, if it does, add 0; otherwise, add $number_of_refblocks_required plus the number of reftable clusters required for that, however we can calculate that[1]. [1]: This brings me to another issue. When we have to resize the reftable, we create a copy, so we have to be able to store both the old and the new reftable at the same time. Your above snippet doesn't take this into account, so we'll have to add the number of existing reftable clusters to it; unless we don't have to resize the reftable... So all in all we could either use your snippet in a super-conservative approach, or we get probably the same complexity as I already have if we add all of the checks proposed above. The issue with the "super-conservative approach" is that I'm not even sure I found all possible corner cases. Max