On Tue, 2010-04-06 at 10:41 -0500, Christoph Lameter wrote:
> On Tue, 6 Apr 2010, Zhang, Yanmin wrote:
>
> > Thanks. I tried 2 and 4 times and didn't see much improvement.
> > I checked /proc/vmallocinfo and it has no pcpu_get_vm_areas entry
> > when I use 4 times PERCPU_DYNAMIC_RESERVE.
>
> > I used perf to collect dtlb misses and LLC misses. The dtlb miss data is
> > not stable. Sometimes we have more dtlb misses but get a better result.
> >
> > The LLC miss data is more stable. Only LLC-load-misses is a clear sign now.
> > LLC-store-misses shows no big difference.
>
> LLC-load-miss is exactly what condition?
I don't know. I just said it's a clear sign. Otherwise, there is no clear sign.
The function statistics collected by perf with event LLC-load-misses are very
scattered.

>
> The cacheline environment in the hotpath should only include the following
> cache lines (without debugging and counters):
>
> 1. The first cacheline from the kmem_cache structure
>
> (This is different from the situation before the 2.6.34 changes. Earlier
> some critical values (object length etc) were available
> from the kmem_cache_cpu structure. The cacheline containing the percpu
> structure array was needed to determine the kmem_cache_cpu address!)
>
> 2. The first cacheline from kmem_cache_cpu
>
> 3. The first cacheline of the data object (free pointer)
>
> And in case of a kfree/kmem_cache_free:
>
> 4. Cacheline that contains the page struct of the page the object resides
> in.
I agree with your analysis, but we still have no answer.

>
> Can you post the .config you are using and the bootup messages?
>
Please see the 2 attachments. CONFIG_SLUB_STATS has no big impact on results.

Yanmin
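
P.S. As a rough illustration of the cacheline list quoted above, here is a
minimal sketch that counts the distinct cache lines one hotpath call would
touch. It is not kernel code: the 64-byte line size, the function names, and
the addresses passed in are all assumptions made up for the example.

```python
CACHELINE = 64  # assumed cache line size in bytes


def cacheline(addr):
    """Return the index of the cache line that contains addr."""
    return addr // CACHELINE


def hotpath_lines(kmem_cache, kmem_cache_cpu, obj, page_struct=None):
    """Distinct cache lines touched by one alloc (or free) in the hotpath.

    All arguments are hypothetical start addresses:
      kmem_cache      first cacheline of the kmem_cache structure
      kmem_cache_cpu  first cacheline of kmem_cache_cpu
      obj             first cacheline of the data object (free pointer)
      page_struct     page struct of the object's page, kfree path only
    """
    touched = {cacheline(kmem_cache), cacheline(kmem_cache_cpu), cacheline(obj)}
    if page_struct is not None:
        touched.add(cacheline(page_struct))
    return touched
```

With four made-up addresses that sit on different lines, the alloc path gives
3 lines and the free path gives 4, matching the list above. If two of the
structures happen to share a line, the count drops accordingly.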