From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752601AbaEAVgO (ORCPT ); Thu, 1 May 2014 17:36:14 -0400 Received: from mail-pd0-f181.google.com ([209.85.192.181]:51027 "EHLO mail-pd0-f181.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752433AbaEAVfw (ORCPT ); Thu, 1 May 2014 17:35:52 -0400 Date: Thu, 1 May 2014 14:35:48 -0700 (PDT) From: David Rientjes X-X-Sender: rientjes@chino.kir.corp.google.com To: Andrew Morton cc: Mel Gorman , Rik van Riel , Vlastimil Babka , Joonsoo Kim , Greg Thelen , Hugh Dickins , linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: [patch v2 4/4] mm, thp: do not perform sync compaction on pagefault In-Reply-To: Message-ID: References: User-Agent: Alpine 2.02 (DEB 1266 2009-07-14) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Synchronous memory compaction can be very expensive: it can iterate an enormous amount of memory without aborting, constantly rescheduling, waiting on page locks and lru_lock, etc, if a pageblock cannot be defragmented. Unfortunately, it's too expensive for pagefault for transparent hugepages and it's much better to simply fallback to pages. On 128GB machines, we find that synchronous memory compaction can take O(seconds) for a single thp fault. Now that async compaction remembers where it left off without strictly relying on sync compaction, this makes thp allocations best-effort without causing egregious latency during pagefault. Signed-off-by: David Rientjes --- mm/page_alloc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -2656,7 +2656,7 @@ rebalance: /* Wait for some write requests to complete then retry */ wait_iff_congested(preferred_zone, BLK_RW_ASYNC, HZ/50); goto rebalance; - } else { + } else if (!(gfp_mask & __GFP_NO_KSWAPD)) { /* * High-order allocations do not necessarily loop after * direct reclaim and reclaim/compaction depends on compaction From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pa0-f51.google.com (mail-pa0-f51.google.com [209.85.220.51]) by kanga.kvack.org (Postfix) with ESMTP id BD8AB6B0038 for ; Thu, 1 May 2014 17:35:51 -0400 (EDT) Received: by mail-pa0-f51.google.com with SMTP id fb1so4297364pad.10 for ; Thu, 01 May 2014 14:35:51 -0700 (PDT) Received: from mail-pa0-x22c.google.com (mail-pa0-x22c.google.com [2607:f8b0:400e:c03::22c]) by mx.google.com with ESMTPS id yd10si21961655pab.330.2014.05.01.14.35.50 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Thu, 01 May 2014 14:35:50 -0700 (PDT) Received: by mail-pa0-f44.google.com with SMTP id ey11so4271189pad.17 for ; Thu, 01 May 2014 14:35:50 -0700 (PDT) Date: Thu, 1 May 2014 14:35:48 -0700 (PDT) From: David Rientjes Subject: [patch v2 4/4] mm, thp: do not perform sync compaction on pagefault In-Reply-To: Message-ID: References: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-linux-mm@kvack.org List-ID: To: Andrew Morton Cc: Mel Gorman , Rik van Riel , Vlastimil Babka , Joonsoo Kim , Greg Thelen , Hugh Dickins , linux-kernel@vger.kernel.org, linux-mm@kvack.org Synchronous memory compaction can be very expensive: it can iterate an enormous amount of memory without aborting, constantly rescheduling, waiting on page locks and lru_lock, etc, if a pageblock cannot be defragmented. Unfortunately, it's too expensive for pagefault for transparent hugepages and it's much better to simply fallback to pages. On 128GB machines, we find that synchronous memory compaction can take O(seconds) for a single thp fault. Now that async compaction remembers where it left off without strictly relying on sync compaction, this makes thp allocations best-effort without causing egregious latency during pagefault. Signed-off-by: David Rientjes --- mm/page_alloc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -2656,7 +2656,7 @@ rebalance: /* Wait for some write requests to complete then retry */ wait_iff_congested(preferred_zone, BLK_RW_ASYNC, HZ/50); goto rebalance; - } else { + } else if (!(gfp_mask & __GFP_NO_KSWAPD)) { /* * High-order allocations do not necessarily loop after * direct reclaim and reclaim/compaction depends on compaction -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org