From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Simmons, James A."
To: "'Julia Lawall'"
CC: Oleg Drokin, "devel@driverdev.osuosl.org", Greg Kroah-Hartman,
    "kernel-janitors@vger.kernel.org", "linux-kernel@vger.kernel.org",
    "HPDD-discuss@lists.01.org"
Subject: RE: [HPDD-discuss] [PATCH 2/11] Staging: lustre: fld: Use kzalloc and kfree
Date: Fri, 1 May 2015 20:18:56 +0000
Message-ID: <524505df3433441494cf082a425f2ee7@EXCHCS32.ornl.gov>
References: <1430495482-933-1-git-send-email-Julia.Lawall@lip6.fr>
    <1430495482-933-11-git-send-email-Julia.Lawall@lip6.fr>
X-Mailing-List: linux-kernel@vger.kernel.org

>> >From: Julia Lawall
>> >
>> >Replace OBD_ALLOC, OBD_ALLOC_WAIT, OBD_ALLOC_PTR, and OBD_ALLOC_PTR_WAIT by
>> >kzalloc/kcalloc, and OBD_FREE and OBD_FREE_PTR by kfree.
>>
>> Nak: James Simmons
>>
>> A simple replace will not work. The OBD_ALLOC and OBD_FREE functions allocate memory
>> anywhere from one page to 4MB in size. You can't use kmalloc for the 4MB allocations.
>> Currently lustre uses a 4 page water mark to determine if we allocate using vmalloc.
>> Even using kmalloc for 4 pages has shown high failure rates on some systems. It gets even
>> more messy with 64K page systems like ppc64 boxes. Now I'm not suggesting to port the larger
>> allocations to vmalloc either, since issues have been found with using vmalloc. For example,
>> when using large stripe count files the MDS rpc generated crosses the 4 page line and vmalloc
>> is used. Using vmalloc caused a global spinlock to be taken, which causes metadata operations
>> to be serialized on the MDS servers.
>
>It's not the LARGE functions that do the switching?  For example OBD_ALLOC
>ends up at __OBD_MALLOC_VERBOSE, which as far as I can see calls kmalloc
>(with __GFP_ZERO, and hence the use of kzalloc).

Yes, the LARGE functions do the switching. I was also expecting patches to remove the
OBD_ALLOC_LARGE functions, which is not the case here.

I do have one question still. The macro __OBD_MALLOC_VERBOSE made it possible to simulate
memory allocation failures at a given percentage rate. Does something exist in the kernel
to duplicate that functionality? Once these macros are gone we lose the ability to simulate
high rates of memory allocation failure.
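
For readers following the thread, the two behaviours discussed above (switching
allocators at a 4 page water mark, and failing a configurable percentage of
allocations) can be modelled with a small userspace sketch. This is not the
Lustre code: every name below is hypothetical, a 4K page size is assumed, calloc
stands in for both kmalloc and vmalloc, and treating a request of exactly 4 pages
as still on the kmalloc path is an assumption about where the line falls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names throughout; a userspace model, not Lustre or kernel code. */
#define SKETCH_PAGE_SIZE   4096u                    /* assumed 4K pages */
#define SKETCH_LARGE_LIMIT (4u * SKETCH_PAGE_SIZE)  /* the 4 page water mark */

enum sketch_path { SKETCH_PATH_KMALLOC, SKETCH_PATH_VMALLOC };

/* Percentage of allocations that are forced to fail, 0..100. */
static unsigned int sketch_fail_percent;
/* State of a small deterministic generator so runs are reproducible. */
static unsigned int sketch_rng_state = 1;

static void sketch_set_fail_percent(unsigned int percent)
{
	assert(percent <= 100);
	sketch_fail_percent = percent;
}

/* Which allocator would serve a request of this size. */
static enum sketch_path sketch_pick_path(size_t size)
{
	return size > SKETCH_LARGE_LIMIT ? SKETCH_PATH_VMALLOC
					 : SKETCH_PATH_KMALLOC;
}

/* Returns nonzero when this call should be failed artificially. */
static int sketch_should_fail(void)
{
	sketch_rng_state = sketch_rng_state * 1103515245u + 12345u;
	return ((sketch_rng_state >> 16) % 100u) < sketch_fail_percent;
}

/* Zeroed allocation with failure injection; reports the chosen path if asked. */
static void *sketch_alloc(size_t size, enum sketch_path *path)
{
	if (path)
		*path = sketch_pick_path(size);
	if (sketch_should_fail())
		return NULL;
	return calloc(1, size);
}
```

Calling sketch_set_fail_percent(25) before a test run would make roughly a
quarter of the sketch_alloc() calls return NULL, which is the kind of knob the
question above is about; memory returned by sketch_alloc() is released with free().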