Date: Wed, 9 Nov 2016 16:38:08 -0800 (PST)
From: David Rientjes
To: Andrew Morton
Cc: Greg Thelen, Aruna Ramakrishna, Christoph Lameter, Joonsoo Kim,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [patch] mm, slab: faster active and free stats
In-Reply-To: <20161108151727.b64035da825c69bced88b46d@linux-foundation.org>
Message-ID:
References: <20161108151727.b64035da825c69bced88b46d@linux-foundation.org>

On Tue, 8 Nov 2016, Andrew Morton wrote:

> > Reading /proc/slabinfo or monitoring slabtop(1) can become very expensive
> > if there are many slab caches and if there are very lengthy per-node
> > partial and/or free lists.
> >
> > Commit 07a63c41fa1f ("mm/slab: improve performance of gathering slabinfo
> > stats") addressed the per-node full lists, which showed a significant
> > improvement when no objects were freed. This patch has the same
> > motivation and optimizes the remainder of the use cases where there are
> > very lengthy partial and free lists.
> >
> > This patch maintains per-node active_slabs (full and partial) and
> > free_slabs counts rather than iterating the lists at runtime when
> > reading /proc/slabinfo.
>
> Are there any nice numbers you can share?
Yes, please add this to the description: when allocating 100GB of slab from a test cache where every slab page is on the partial list, reading /proc/slabinfo (which includes all other slab caches on the system) takes ~247ms on average over 48 samples. With this patch, the same read takes ~0.856ms on average.