On Sat, Oct 9, 2021, 9:34 AM Matthew Wilcox <willy@infradead.org> wrote:
On Sat, Oct 09, 2021 at 12:19:03AM +0000, Hyeonggon Yoo wrote:
>  - Is there a reason that SLUB does not implement cache coloring?
>    it will help utilizing hardware cache. Especially in block layer,
>    they are literally *squeezing* its performance now.

Have you tried turning off cache colouring in SLAB and seeing if
performance changes?  My impression is that it's useful for caches
with low associativity (direct mapped / 2-way / 4-way), but loses
its effectiveness for caches with higher associativity.  For example,
my laptop:

 L1 Data Cache: 48KB, 12-way associative, 64 byte line size
 L1 Instruction Cache: 32KB, 8-way associative, 64 byte line size
 L2 Unified Cache: 1280KB, 20-way associative, 64 byte line size
 L3 Unified Cache: 12288KB, 12-way associative, 64 byte line size

I very much doubt that cache colouring is still useful for this machine.
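
For reference, the geometry behind those numbers can be worked out with a small helper. This is only a sketch: the function names cache_sets() and cache_way_span() are made up for illustration, and it assumes plain modulo set indexing, which real last-level caches often do not use.

```c
#include <assert.h>
#include <stddef.h>

/* Number of sets in a set-associative cache: size / (ways * line size). */
static size_t cache_sets(size_t size_bytes, size_t ways, size_t line_bytes)
{
	assert(ways > 0 && line_bytes > 0);
	return size_bytes / (ways * line_bytes);
}

/*
 * Bytes covered by a single way. Under simple modulo indexing (an
 * assumption), two addresses this far apart map to the same set.
 */
static size_t cache_way_span(size_t size_bytes, size_t ways)
{
	assert(ways > 0);
	return size_bytes / ways;
}
```

For the L1 data cache above, cache_sets(48 * 1024, 12, 64) gives 64 sets and cache_way_span(48 * 1024, 12) gives 4096 bytes, so up to twelve lines at the same page offset can sit in one set before anything is evicted. A direct-mapped cache of the same size would hold only one, which is where colouring would matter most.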

And what was the result of that benchmark?

How many cores does your processor have?
And is it NUMA or UMA?

As I mentioned, the color scheme is shared between CPUs in the same node.
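
To make the sharing concrete, here is a rough userspace sketch of a color rotation. It is a simplified model, not the actual SLAB code: struct color_state and next_slab_offset() are hypothetical names, and the fields only loosely mirror the idea of a color count, a color step, and a next-color counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified model; not the kernel's real structures. */
struct color_state {
	unsigned int colour;      /* number of distinct colors available */
	unsigned int colour_off;  /* byte step between colors, e.g. line size */
	unsigned int colour_next; /* next color to hand out */
};

/* Return the byte offset for the next new slab and advance the rotation. */
static size_t next_slab_offset(struct color_state *s)
{
	size_t offset;

	assert(s->colour > 0);
	offset = (size_t)s->colour_next * s->colour_off;
	s->colour_next = (s->colour_next + 1) % s->colour;
	return offset;
}
```

With one such state shared by every CPU in a node, the offsets a given CPU sees depend on how allocations from the other CPUs interleave with its own. Per-cpu coloring would, under this model, mean giving each CPU its own struct color_state.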

I think we need to measure performance again after per-cpu coloring is implemented.