From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Wed, 9 Oct 2019 15:19:01 -0700
From: Andrew Morton
To: "Uladzislau Rezki (Sony)"
Cc: Daniel Wagner, Sebastian Andrzej Siewior, Thomas Gleixner,
	linux-mm@kvack.org, LKML, Peter Zijlstra, Hillf Danton,
	Michal Hocko, Matthew Wilcox, Oleksiy Avramchenko,
	Steven Rostedt
Subject: Re: [PATCH 1/1] mm/vmalloc: remove preempt_disable/enable when do preloading
Message-Id: <20191009151901.1be5f7211db291e4bd2da8ca@linux-foundation.org>
In-Reply-To: <20191009164934.10166-1-urezki@gmail.com>
References: <20191009164934.10166-1-urezki@gmail.com>

On Wed, 9 Oct 2019 18:49:34 +0200 "Uladzislau Rezki (Sony)" wrote:

> Get rid of preempt_disable() and preempt_enable() when the
> preload is done for splitting purposes. The reason is that
> calling spin_lock() with preemption disabled is forbidden in
> a CONFIG_PREEMPT_RT kernel.
>
> Therefore, with this change we no longer guarantee that a CPU
> is preloaded; instead, we minimize the cases when it is not.
>
> For example, I ran a special test case that follows the preload
> pattern and path. 20 "unbind" threads ran it, each doing
> 1000000 allocations. On average, a CPU was not preloaded in
> only 3.5 out of 1000000 allocations. So it can happen, but the
> number is rather negligible.
>
> ...
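For reference, the change boils down to the following before/after
shape of the preload path in alloc_vmap_area(). This is a sketch
reconstructed from the quoted function below, not the literal diff:

	/*
	 * Before: preemption is disabled around the per-CPU preload so
	 * that the __this_cpu ops stay on one CPU, but that means the
	 * spin_lock() below is taken with preemption off, which is
	 * forbidden on PREEMPT_RT where spinlocks can sleep.
	 */
	preempt_disable();
	if (!__this_cpu_read(ne_fit_preload_node)) {
		preempt_enable();
		pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node);
		preempt_disable();

		if (__this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva)) {
			if (pva)
				kmem_cache_free(vmap_area_cachep, pva);
		}
	}

	spin_lock(&vmap_area_lock);
	preempt_enable();

	/*
	 * After: plain this_cpu ops with preemption left enabled. A
	 * migration between the read and the cmpxchg only means some
	 * other CPU ends up preloaded (or the spare object is freed),
	 * which the test above saw roughly 3.5 times per 1000000
	 * allocations.
	 */
	if (!this_cpu_read(ne_fit_preload_node)) {
		pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node);

		if (this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva)) {
			if (pva)
				kmem_cache_free(vmap_area_cachep, pva);
		}
	}

	spin_lock(&vmap_area_lock);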
A few questions about the resulting alloc_vmap_area():

: static struct vmap_area *alloc_vmap_area(unsigned long size,
: 				unsigned long align,
: 				unsigned long vstart, unsigned long vend,
: 				int node, gfp_t gfp_mask)
: {
: 	struct vmap_area *va, *pva;
: 	unsigned long addr;
: 	int purged = 0;
:
: 	BUG_ON(!size);
: 	BUG_ON(offset_in_page(size));
: 	BUG_ON(!is_power_of_2(align));
:
: 	if (unlikely(!vmap_initialized))
: 		return ERR_PTR(-EBUSY);
:
: 	might_sleep();
:
: 	va = kmem_cache_alloc_node(vmap_area_cachep,
: 			gfp_mask & GFP_RECLAIM_MASK, node);

Why does this use GFP_RECLAIM_MASK?  Please add a comment explaining
this.

: 	if (unlikely(!va))
: 		return ERR_PTR(-ENOMEM);
:
: 	/*
: 	 * Only scan the relevant parts containing pointers to other objects
: 	 * to avoid false negatives.
: 	 */
: 	kmemleak_scan_area(&va->rb_node, SIZE_MAX, gfp_mask & GFP_RECLAIM_MASK);
:
: retry:
: 	/*
: 	 * Preload this CPU with one extra vmap_area object. It is used
: 	 * when fit type of free area is NE_FIT_TYPE. Please note, it
: 	 * does not guarantee that an allocation occurs on a CPU that
: 	 * is preloaded, instead we minimize the case when it is not.
: 	 * It can happen because of migration, because there is a race
: 	 * until the below spinlock is taken.
: 	 *
: 	 * The preload is done in non-atomic context, thus it allows us
: 	 * to use more permissive allocation masks to be more stable under
: 	 * low memory condition and high memory pressure.
: 	 *
: 	 * Even if it fails we do not really care about that. Just proceed
: 	 * as it is. "overflow" path will refill the cache we allocate from.
: 	 */
: 	if (!this_cpu_read(ne_fit_preload_node)) {

Readability nit: the local `pva' should be defined here, rather than
having function-wide scope.

: 		pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node);

Why doesn't this honour gfp_mask?  If it's not a bug, please add a
comment explaining why.

The kmem_cache_alloc() in adjust_va_to_fit_type() also omits the
caller's gfp_mask.  If that is not a bug, please document the
unexpected behaviour.

:
: 		if (this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva)) {
: 			if (pva)
: 				kmem_cache_free(vmap_area_cachep, pva);
: 		}
: 	}
:
: 	spin_lock(&vmap_area_lock);
:
: 	/*
: 	 * If an allocation fails, the "vend" address is
: 	 * returned. Therefore trigger the overflow path.
: 	 */

As for the intent of this patch, why not preallocate the vmap_area
outside the spinlock and then use it within the spinlock?  Does
spin_lock() disable preemption on RT?  I forget, but it doesn't matter
much anyway: doing this would make the code better in the regular
kernel too, I think.  Something like this:

	struct vmap_area *pva = NULL;

	...

	if (!this_cpu_read(ne_fit_preload_node))
		pva = kmem_cache_alloc_node(vmap_area_cachep, ...);

	spin_lock(&vmap_area_lock);

	if (pva && __this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva))
		kmem_cache_free(vmap_area_cachep, pva);
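Fleshed out slightly, that suggestion could look as follows.  This is
a sketch only: it assumes the rest of the retry path stays as it is,
and it resets the spare object on each retry pass so a second pass
cannot double-free an already-installed preload.

	struct vmap_area *va, *pva;

	...

retry:
	/*
	 * Reset the spare on each pass; an earlier pass may already
	 * have installed or freed it.
	 */
	pva = NULL;

	/*
	 * Allocate in non-atomic context, before the lock is taken.
	 * No per-CPU stability is needed here; a stale read of
	 * ne_fit_preload_node at worst costs one extra allocation or
	 * one missed preload.
	 */
	if (!this_cpu_read(ne_fit_preload_node))
		pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node);

	spin_lock(&vmap_area_lock);

	/*
	 * Install the spare under the lock; if this CPU is already
	 * preloaded (e.g. the task migrated after the read above),
	 * free the extra object.  Whether the non-preempt-safe
	 * __this_cpu_cmpxchg() is usable here hinges on the RT
	 * question above: the lock must pin the task to one CPU.
	 */
	if (pva && __this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva))
		kmem_cache_free(vmap_area_cachep, pva);

	/* ... rest of the locked section unchanged ... */

Compared with the current code, this keeps kmem_cache_alloc_node() in
non-atomic context but moves the install of the spare under the lock,
so the CPU that takes the lock is the one that ends up preloaded.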