From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S936402AbcJZWYi (ORCPT ); Wed, 26 Oct 2016 18:24:38 -0400 Received: from mail-yw0-f181.google.com ([209.85.161.181]:34958 "EHLO mail-yw0-f181.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S935063AbcJZWYf (ORCPT ); Wed, 26 Oct 2016 18:24:35 -0400 MIME-Version: 1.0 In-Reply-To: <20161026123945.9341e2a2c1f3a9f95ef41cce@linux-foundation.org> References: <325247067.2674.1477398550882.JavaMail.zimbra@efficios.com> <1477420604-28918-1-git-send-email-danielmentz@google.com> <20161026180951.GG15216@arm.com> <20161026123945.9341e2a2c1f3a9f95ef41cce@linux-foundation.org> From: Daniel Mentz Date: Wed, 26 Oct 2016 15:24:33 -0700 Message-ID: Subject: Re: [PATCH] lib/genalloc.c: Start search from start of chunk To: Andrew Morton Cc: Will Deacon , linux-kernel@vger.kernel.org, Catalin Marinas , Greg Kroah-Hartman , Mathieu Desnoyers , Russell King Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Oct 26, 2016 at 12:39 PM, Andrew Morton wrote: > On Wed, 26 Oct 2016 19:09:51 +0100 Will Deacon wrote: > >> On Tue, Oct 25, 2016 at 11:36:44AM -0700, Daniel Mentz wrote: >> > gen_pool_alloc_algo() iterates over the chunks of a pool trying to find >> > a contiguous block of memory that satisfies the allocation request. >> > >> > The shortcut >> > >> > if (size > atomic_read(&chunk->avail)) >> > continue; >> > >> > makes the loop skip over chunks that do not have enough bytes left to >> > fulfill the request. There are two situations, though, where an >> > allocation might still fail: >> > >> > (1) The available memory is not contiguous, i.e. the request cannot be >> > fulfilled due to external fragmentation. >> > >> > (2) A race condition. Another thread runs the same code concurrently and >> > is quicker to grab the available memory. >> > >> > In those situations, the loop calls pool->algo() to search the entire >> > chunk, and pool->algo() returns some value that is >= end_bit to >> > indicate that the search failed. This return value is then assigned to >> > start_bit. The variables start_bit and end_bit describe the range that >> > should be searched, and this range should be reset for every chunk that >> > is searched. Today, the code fails to reset start_bit to 0. As a >> > result, prefixes of subsequent chunks are ignored. Memory allocations >> > might fail even though there is plenty of room left in these prefixes of >> > those other chunks. >> >> Please add a CC stable. > > You sure? The changelog doesn't describe the end-user impacts very > well (it should always do so, please) but I'm thinking "little impact"? Sorry for not describing the end-user impact. This bug does actually not affect our specific application since we are using genalloc with only a single chunk (through arch/arm/mm/dma-mapping.c). I found this bug by coincident while reviewing the code for a different reason. While trying to determine the user impact, I found some places where people add multiple chunks. Whether or not these users hit this bug depends on the patterns in which they allocate memory through genalloc and whether an allocation request can't be fulfilled due to external fragmentation. I'd say it's unlikely for end-users to hit this bug but if they do, the user impact is probably significant.