From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: 
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753223AbbAEBhi (ORCPT); Sun, 4 Jan 2015 20:37:38 -0500
Received: from LGEMRELSE7Q.lge.com ([156.147.1.151]:57173 "EHLO lgemrelse7q.lge.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752979AbbAEBhh (ORCPT); Sun, 4 Jan 2015 20:37:37 -0500
X-Original-SENDERIP: 10.177.220.158
X-Original-MAILFROM: iamjoonsoo.kim@lge.com
From: Joonsoo Kim
To: Andrew Morton
Cc: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Jesper Dangaard Brouer
Subject: [PATCH 0/6] mm/slab: optimize allocation fastpath
Date: Mon, 5 Jan 2015 10:37:25 +0900
Message-Id: <1420421851-3281-1-git-send-email-iamjoonsoo.kim@lge.com>
X-Mailer: git-send-email 1.7.9.5
Sender: linux-kernel-owner@vger.kernel.org
List-ID: 
X-Mailing-List: linux-kernel@vger.kernel.org

SLAB always disables irqs before executing any object alloc/free
operation. This is really painful in terms of performance. A benchmark
that does alloc/free repeatedly shows that each alloc/free pair is
roughly 2 times slower than SLUB's (27 ns vs 14 ns).

To improve performance, this patchset implements an allocation fastpath
that works without disabling irqs, similar to the way SLUB implements
its allocation fastpath. A transaction id (tid) is introduced and
updated on every operation. In the allocation fastpath, an object in
the array cache is read speculatively. Then, the pointer to the object's
position in the array cache and the transaction id are updated
simultaneously through this_cpu_cmpxchg_double(). If the tid is
unchanged up to this update, it is guaranteed that no concurrent client
allocated/freed an object on this cpu, so the allocation can succeed
without disabling irqs.

The above-mentioned benchmark shows that alloc/free fastpath
performance improves by roughly 22% (27 ns -> 21 ns).
Unfortunately, I cannot optimize the free fastpath, because
speculatively writing the pointer of the object being freed into the
array cache is not possible. If anyone has a good idea to optimize the
free fastpath, please let me know.

Thanks.

Joonsoo Kim (6):
  mm/slab: fix gfp flags of percpu allocation at boot phase
  mm/slab: remove kmemleak_erase() call
  mm/slab: clean-up __ac_get_obj() to prepare future changes
  mm/slab: rearrange irq management
  mm/slab: cleanup ____cache_alloc()
  mm/slab: allocation fastpath without disabling irq

 include/linux/kmemleak.h |    8 --
 mm/slab.c                |  257 +++++++++++++++++++++++++++++++---------------
 2 files changed, 176 insertions(+), 89 deletions(-)

-- 
1.7.9.5