On 11.04.2017 17:08, Eric Blake wrote:
> On 04/11/2017 09:59 AM, Max Reitz wrote:
> 
>>
>> Good point, but that also means that (with (2)) you can only use
>> subcluster configurations where the L2 entry size increases by a power
>> of two. Unfortunately, only one of those configurations itself is a
>> power of two, and that is 32.
>>
>> (With 32 subclusters, you take up 64 bits, which means an L2 entry will
>> take 128 bits; with any higher 2^n, you'd take up 2^{n+1} bits and the
>> L2 entry would take 2^{n+1} + 64 which is impossible to be a power of two.)
> 
> Or we add padding. If you want 64 subclusters, you burn 256 bits per
> entry, even though only 192 of those bits are used.

Hm, yeah, although you have to keep in mind that the padding is almost
the same size as the data bits we need, effectively doubling the size
of the L2 tables:

padding = 2^{n+2} - 2^{n+1} - 64 (=2^6)
        = 2^{n+1} - 64

So that's not so nice, but if it's the only thing we can do...
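
To put numbers on that, here is a rough sketch of the arithmetic. It
assumes 2 bits per subcluster on top of the existing 64-bit L2 entry,
as in the figures above; the function names are made up for
illustration and are not anything in the qcow2 code.

```python
SECTOR_BITS = 512 * 8      # one 512-byte sector
BASE_ENTRY_BITS = 64       # the existing L2 entry

def l2_entry_layout(n):
    """Return (used, padded, padding) in bits for 2**n subclusters.

    Assumes 2 bits per subcluster, so the subcluster field takes
    2**(n+1) bits.
    """
    used = 2 ** (n + 1) + BASE_ENTRY_BITS
    # Round up to the next power of two that is >= used.
    padded = 1 << (used - 1).bit_length()
    return used, padded, padded - used

def straddles_sector(entry_bits):
    """True if back-to-back entries of this size, starting on a sector
    boundary, can cross a 512-byte sector border."""
    return SECTOR_BITS % entry_bits != 0

for n in (5, 6, 7):
    used, padded, padding = l2_entry_layout(n)
    print(2 ** n, used, padded, padding,
          straddles_sector(used), straddles_sector(padded))
```

For 64 subclusters this gives 192 used bits, 256 padded bits and 64
bits of padding; the unpadded 192-bit entry can cross a sector border,
the padded one cannot.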

>> I don't know how useful non-power-of-two subcluster configurations are.
>> Probably not at all.
>>
>> Since using subcluster would always result in the L2 table taking more
>> than 512 bytes, you could therefore never guarantee that there is no
>> entry overlapping a sector border (except with 32 subclusters).
> 
> Yes, there's definite benefits to keeping whatever structure we end up
> with aligned so that it naturally falls into sector boundaries, even if
> it means more padding bits.

Then again, I'm not even sure we really need atomicity for L2 entries +
subcluster bits. I don't think you'd ever have to modify both at the
same time (if you just say the subclusters are all unallocated when
allocating the cluster itself, and then you write which subclusters are
actually allocated afterwards).
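
Roughly what I have in mind, as a toy in-memory model. The class and
function names are hypothetical, the bitmap uses one bit per subcluster
just to keep it short, and none of this is actual qcow2 code:

```python
class L2Slot:
    """Toy stand-in for one L2 entry plus its subcluster bits."""
    def __init__(self):
        self.cluster_offset = 0      # 0 means "cluster not allocated"
        self.subcluster_bitmap = 0   # bit i set means subcluster i allocated

def allocate_cluster(slot, cluster_offset, written_subclusters):
    """Update the slot in two separate steps and return the state that
    would be visible after each step."""
    states = []

    # Step 1: write the L2 entry only; all subclusters stay unallocated.
    slot.cluster_offset = cluster_offset
    slot.subcluster_bitmap = 0
    states.append((slot.cluster_offset, slot.subcluster_bitmap))

    # Step 2: write the subcluster bits only; the L2 entry is untouched.
    bitmap = 0
    for i in written_subclusters:
        bitmap |= 1 << i
    slot.subcluster_bitmap = bitmap
    states.append((slot.cluster_offset, slot.subcluster_bitmap))

    return states

slot = L2Slot()
print(allocate_cluster(slot, 0x50000, [0, 3]))
```

Neither step touches both fields, and the state in between just
describes an allocated cluster with no subclusters allocated yet.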

(This also applies to your remark on caching, I think.)

Atomicity certainly makes things easier, though.

Max