From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andrew Morton
Subject: Re: [PATCH -next] ashmem: Fix ashmem_shrink deadlock.
Date: Thu, 16 May 2013 09:45:59 -0700
Message-ID: <20130516094559.4d2c9212.akpm@linux-foundation.org>
References: <1367416573-5430-1-git-send-email-rlove@google.com>
 <20130513214216.GA23743@kroah.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Received: from mail.linuxfoundation.org ([140.211.169.12]:42660 "EHLO
 mail.linuxfoundation.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1751538Ab3EPQqG (ORCPT ); Thu, 16 May 2013 12:46:06 -0400
Sender: linux-next-owner@vger.kernel.org
To: Robert Love
Cc: Raul Xiong, Neil Zhang, Greg Kroah-Hartman, Shankar Brahadeeswaran,
 Dan Carpenter, LKML, Bjorn Bringert, devel, Hugh Dickins,
 Anjana V Kumar, linux-next

On Thu, 16 May 2013 09:44:49 -0400 Robert Love wrote:

> On Thu, May 16, 2013 at 4:15 AM, Raul Xiong wrote:
> > The issue happens in such sequence:
> > ashmem_mmap acquired ashmem_mutex --> ashmem_mutex:shmem_file_setup
> > called kmem_cache_alloc --> shrink due to low memory --> ashmem_shrink
> > tries to acquire the same ashmem_mutex -- it blocks here.
> >
> > I think this reports the bug clearly. Please have a look.
>
> There is no debate about the nature of the bug. Only the fix.
>
> My mutex_trylock patch fixes the problem. I prefer that solution.
>
> Andrew's suggestion of GFP_ATOMIC won't work as we'd have to propagate
> that down into shmem and elsewhere.

s/won't work/impractical/

A better approach would be to add a new __GFP_NOSHRINKERS, but it's all
variations on a theme.

> Using PF_MEMALLOC will work.
> You'd want to define something like:
>
> static int set_memalloc(void)
> {
>         if (current->flags & PF_MEMALLOC)
>                 return 0;
>         current->flags |= PF_MEMALLOC;
>         return 1;
> }
>
> static void clear_memalloc(int memalloc)
> {
>         if (memalloc)
>                 current->flags &= ~PF_MEMALLOC;
> }
>
> and then set/clear PF_MEMALLOC around every memory allocation and
> function that descends into a memory allocation. As said I prefer my
> solution but if someone wants to put together a patch with this
> approach, fine by me.

The mutex_trylock(ashmem_mutex) will actually have the best performance,
because it skips the least amount of memory reclaim opportunities.

But it still sucks!  The real problem is that there exists a lock
called "ashmem_mutex", taken by both the high-level mmap() and by the
low-level shrinker.  And taken by everything else too!  The ashmem
locking is pretty crude...

What is the mutex_lock() in ashmem_mmap() actually protecting?  I don't
see much, apart from perhaps some incidental races around the contents
of the file's ashmem_area, and those could/should be protected by a
per-object lock, not a global one?
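
For readers following the thread, the trylock idea discussed above can be
illustrated with a rough userspace sketch. This is not the ashmem patch
itself: it uses POSIX threads instead of kernel mutexes, and the names
area_mutex, lru_count, shrink_sketch() and mmap_path_sketch() are all
hypothetical. Returning -1 when the lock is busy is likewise an assumption
of the sketch. The point is only that a shrinker which tries the lock and
backs off cannot deadlock against a caller that already holds it.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the single global ashmem_mutex. */
pthread_mutex_t area_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical count of reclaimable pages on an LRU list. */
size_t lru_count = 8;

/*
 * Shrinker sketch: try the lock instead of blocking on it.
 * Returns -1 (assumed convention for this sketch) when the lock is busy,
 * otherwise the number of pages "reclaimed".
 */
long shrink_sketch(size_t nr_to_scan)
{
	size_t freed;

	if (pthread_mutex_trylock(&area_mutex) != 0)
		return -1;	/* lock held elsewhere: skip this reclaim pass */

	freed = nr_to_scan < lru_count ? nr_to_scan : lru_count;
	lru_count -= freed;

	pthread_mutex_unlock(&area_mutex);
	return (long)freed;
}

/*
 * mmap-path sketch: holds the global lock across an allocation.  The
 * direct call to shrink_sketch() simulates reclaim re-entering the
 * shrinker while the lock is still held by the same thread.
 */
long mmap_path_sketch(size_t nr_to_scan)
{
	long ret;
	int rc;

	rc = pthread_mutex_lock(&area_mutex);
	assert(rc == 0);

	ret = shrink_sketch(nr_to_scan);	/* would block forever with a plain lock */

	pthread_mutex_unlock(&area_mutex);
	return ret;
}
```

With a blocking lock in shrink_sketch(), the call from mmap_path_sketch()
would never return. With the trylock, that reclaim pass is simply skipped,
which is the lost reclaim opportunity mentioned above. A per-object lock
would avoid the contention altogether, but that is a larger change than
this sketch attempts.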