From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752792AbaEWPHL (ORCPT ); Fri, 23 May 2014 11:07:11 -0400
Received: from mail-pb0-f41.google.com ([209.85.160.41]:61072 "EHLO
	mail-pb0-f41.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1752054AbaEWPHJ (ORCPT );
	Fri, 23 May 2014 11:07:09 -0400
From: Kevin Hilman
To: Vlastimil Babka
Cc: Shawn Guo, Andrew Morton, Joonsoo Kim, David Rientjes, Hugh Dickins,
	Greg Thelen, LKML, linux-mm@kvack.org, Minchan Kim, Mel Gorman,
	Bartlomiej Zolnierkiewicz, Michal Nazarewicz, Christoph Lameter,
	Rik van Riel, Olof Johansson, Stephen Warren, linux-arm-kernel
Subject: Re: [PATCH v2] mm, compaction: properly signal and act upon lock and need_sched() contention
References: <1399904111-23520-1-git-send-email-vbabka@suse.cz>
	<1400233673-11477-1-git-send-email-vbabka@suse.cz>
	<537F082F.50501@suse.cz>
Date: Fri, 23 May 2014 08:07:06 -0700
In-Reply-To: <537F082F.50501@suse.cz> (Vlastimil Babka's message of "Fri, 23 May 2014 10:34:55 +0200")
Message-ID: <7hvbswfs9x.fsf@paris.lan>
User-Agent: Gnus/5.130008 (Ma Gnus v0.8) Emacs/24.3 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Vlastimil Babka writes:

> On 05/23/2014 04:48 AM, Shawn Guo wrote:
>> On 23 May 2014 07:49, Kevin Hilman wrote:
>>> On Fri, May 16, 2014 at 2:47 AM, Vlastimil Babka wrote:
>>>> Compaction uses compact_checklock_irqsave() function to periodically check for
>>>> lock contention and need_resched() to either abort async compaction, or to
>>>> free the lock, schedule and retake the lock. When aborting, cc->contended is
>>>> set to signal the contended state to the caller. Two problems have been
>>>> identified in this mechanism.
>>>
>>> This patch (or later version) has hit next-20140522 (in the form
>>> commit 645ceea9331bfd851bc21eea456dda27862a10f4) and according to my
>>> bisect, appears to be the culprit of several boot failures on ARM
>>> platforms.
>>
>> On i.MX6 where CMA is enabled, the commit causes the drivers calling
>> dma_alloc_coherent() fail to probe. Tracing it a little bit, it seems
>> dma_alloc_from_contiguous() always return page as NULL after this
>> commit.
>>
>> Shawn
>>
>
> Really sorry, guys :/
>
> -----8<-----
> From: Vlastimil Babka
> Date: Fri, 23 May 2014 10:18:56 +0200
> Subject: mm-compaction-properly-signal-and-act-upon-lock-and-need_sched-contention-fix2
>
> Step 1: Change function name and comment between v1 and v2 so that the return
> value signals the opposite thing.
> Step 2: Change the call sites to reflect the opposite return value.
> Step 3: ???
> Step 4: Make a complete fool of yourself.
> Signed-off-by: Vlastimil Babka

Tested-by: Kevin Hilman

I verified that this fixes the boot failures I've seen on ARM (i.MX6 and
Marvell Armada 370).

Thanks for the quick fix.

Kevin
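
The failure mode that the fix's changelog jokes about (a helper renamed so
that its return value means the opposite, with the call sites converted to
match) can be sketched in a few lines of C. This is only an illustration
under assumptions: struct ctl, should_abort_buggy(), should_abort_fixed()
and isolate() are hypothetical names, not kernel code, and the sketch
assumes the piece left behind was the helper body, which the thread does
not spell out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a compaction control structure. */
struct ctl {
	bool async;      /* async mode may give up instead of waiting */
	bool contended;  /* set when the helper decides to abort */
};

/*
 * Documented contract (the "new" meaning): return true when the caller
 * should abort. This body still returns the old polarity, where true
 * meant "fine to continue". Assumed bug, for illustration only.
 */
static bool should_abort_buggy(struct ctl *c, bool need_resched)
{
	if (need_resched && c->async) {
		c->contended = true;
		return false;	/* wrong under the new contract */
	}
	return true;		/* wrong: reports "abort" with no contention */
}

/* Same helper with the return values matching the documented contract. */
static bool should_abort_fixed(struct ctl *c, bool need_resched)
{
	if (need_resched && c->async) {
		c->contended = true;
		return true;
	}
	return false;
}

/* Call site already converted to the new contract. */
static int isolate(struct ctl *c, bool need_resched,
		   bool (*should_abort)(struct ctl *, bool))
{
	if (should_abort(c, need_resched))
		return 0;	/* gave up, nothing isolated */
	return 1;		/* pretend one page was isolated */
}
```

With the buggy helper, isolate() gives up even when nothing is contended,
which would be consistent with an allocation path that always comes back
empty-handed. With the fixed helper it only gives up in async mode when a
reschedule is pending.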