On 11.04.2017 16:49, Kevin Wolf wrote:

[...]

>>>>> By the way, if you'd only allow multiple of 1s overhead
>>>>> (i.e. multiples of 32 subclusters), I think (3) would be pretty much
>>>>> the same as (2) if you just always write the subcluster information
>>>>> adjacent to the L2 table. Should be just the same caching-wise and
>>>>> performance-wise.
>>>>
>>>> Then (3) is effectively the same as (2), just that the subcluster
>>>> bitmaps are at the end of the L2 cluster, and not next to each entry.
>>>
>>> Exactly. But it's a difference in implementation, as you won't have to
>>> worry about having changed the L2 table layout; maybe that's a
>>> benefit.
>>
>> I'm not sure if that would simplify or complicate things, but it's worth
>> considering.
> 
> Note that 64k between an L2 entry and the corresponding bitmap is enough
> to make an update not atomic any more. They need to be within the same
> sector to get atomicity.

Good point, but that also means that (with (2)) you can only use
subcluster configurations where the enlarged L2 entry size is still a
power of two. Unfortunately, only one of the power-of-two subcluster
counts satisfies that, and that is 32.

(With 32 subclusters, you take up 64 bits, which means an L2 entry will
take 128 bits; with any higher 2^n, you'd take up 2^{n+1} bits and the
L2 entry would take 2^{n+1} + 64 bits, which cannot be a power of two
for n > 5.)
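
To double-check the arithmetic above, here is a quick sketch in Python.
It assumes two bits per subcluster (as implied by "32 subclusters take
up 64 bits") and a 64-bit base L2 entry; the function names are made up
for illustration and are not anything from the qcow2 code.

```python
SUBCLUSTER_BITS = 2    # assumption: two bits of state per subcluster
BASE_ENTRY_BITS = 64   # assumption: the plain L2 entry stays 64 bits


def l2_entry_bits(subclusters):
    """Size of an L2 entry with its subcluster bitmap stored inline."""
    return BASE_ENTRY_BITS + SUBCLUSTER_BITS * subclusters


def is_power_of_two(x):
    return x > 0 and (x & (x - 1)) == 0


def valid_subcluster_counts(limit):
    """Subcluster counts up to limit that give a power-of-two entry size."""
    return [n for n in range(1, limit + 1)
            if is_power_of_two(l2_entry_bits(n))]


counts = valid_subcluster_counts(512)
print(counts)                                        # [32, 96, 224, 480]
print([n for n in counts if is_power_of_two(n)])     # [32]
```

So under these assumptions the counts that keep the entry size a power
of two are 32, 96, 224, 480, ..., and 32 is the only one that is itself
a power of two.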

I don't know how useful non-power-of-two subcluster configurations are.
Probably not at all.

Since using subclusters would always result in the L2 table taking more
than 512 bytes, you could therefore never guarantee that no entry
overlaps a sector border (except with 32 subclusters).
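
A small sketch of the sector-border problem, again in Python. It
assumes 512-byte sectors, a sector-aligned L2 table and tightly packed
entries; straddling_entries() is a hypothetical helper, not an existing
function.

```python
SECTOR_SIZE = 512  # assumption: atomic write unit in bytes


def straddling_entries(entry_bytes, num_entries):
    """Indices of packed entries whose bytes span two sectors."""
    result = []
    for i in range(num_entries):
        first = i * entry_bytes          # offset of the entry's first byte
        last = first + entry_bytes - 1   # offset of the entry's last byte
        if first // SECTOR_SIZE != last // SECTOR_SIZE:
            result.append(i)
    return result


# 32 subclusters -> 128-bit (16-byte) entries: none cross a border.
print(straddling_entries(16, 4096))   # []

# 64 subclusters -> 192-bit (24-byte) entries: some do.
print(straddling_entries(24, 64))     # [21, 42]
```

With 24-byte entries, entry 21 covers bytes 504..527 and so crosses the
first sector border, which is exactly the case where the update would
no longer be atomic.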

Max