From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S933034AbbEEUDA (ORCPT ); Tue, 5 May 2015 16:03:00 -0400
Received: from mail.linuxfoundation.org ([140.211.169.12]:60351
	"EHLO mail.linuxfoundation.org" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1752939AbbEEUC5 (ORCPT );
	Tue, 5 May 2015 16:02:57 -0400
Date: Tue, 5 May 2015 13:02:55 -0700
From: Andrew Morton
To: Mel Gorman
Cc: Waiman Long, Nathan Zimmer, Dave Hansen, Scott Norton,
	Daniel J Blueman, Linux-MM, LKML
Subject: Re: [PATCH 0/13] Parallel struct page initialisation v4
Message-Id: <20150505130255.49ff76bbf0a3b32d884ab2ce@linux-foundation.org>
In-Reply-To: <20150505104514.GC2462@suse.de>
References: <1430231830-7702-1-git-send-email-mgorman@suse.de>
	<554030D1.8080509@hp.com>
	<5543F802.9090504@hp.com>
	<554415B1.2050702@hp.com>
	<20150504143046.9404c572486caf71bdef0676@linux-foundation.org>
	<20150505104514.GC2462@suse.de>
X-Mailer: Sylpheed 3.4.1 (GTK+ 2.24.23; x86_64-pc-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 5 May 2015 11:45:14 +0100 Mel Gorman wrote:

> On Mon, May 04, 2015 at 02:30:46PM -0700, Andrew Morton wrote:
> > > Before the patch, the boot time from elilo prompt to ssh login was 694s.
> > > After the patch, the boot up time was 346s, a saving of 348s (about 50%).
> >
> > Having to guesstimate the amount of memory which is needed for a
> > successful boot will be painful.  Any number we choose will be wrong
> > 99% of the time.
> >
> > If the kswapd threads have started, all we need to do is to wait: take
> > a little nap in the allocator's page==NULL slowpath.
> >
> > I'm not seeing any reason why we can't start kswapd much earlier -
> > right at the start of do_basic_setup()?
>
> It doesn't even have to be kswapd, it just should be a thread pinned to
> a node. The difficulty is that dealing with the system hashes means the
> initialisation has to happen before vfs_caches_init_early() when there is
> no scheduler.

I bet we can run vfs_caches_init_early() after sched_init().  Might need
a few little fixups.

> Those allocations could be delayed further but then there is
> the possibility that the allocations would not be contiguous and they'd
> have to rely on CMA to make the attempt. That potentially alters the
> performance of the large system hashes at run time.

hm, why?  If the kswapd threads are running and busily creating free
pages then alloc_pages(order=10) can detect this situation and stall
for a while, waiting for kswapd to create an order-10 page.

Alternatively, the page allocator can go off and synchronously
initialize some pageframes itself.  Keep doing that until the
allocation attempt succeeds.

Such an approach is much more robust than trying to predict how much
memory will be needed.
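
To make the second alternative concrete, here is a minimal userspace
sketch of "initialize some pageframes, retry, repeat until the
allocation succeeds".  It is a toy model under stated assumptions, not
kernel code: struct toy_zone, toy_try_alloc(), toy_init_more(),
toy_alloc_pages() and both constants are hypothetical names invented
for illustration, the allocator is a simple bump allocator, and freeing
is not modelled at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: every name below is hypothetical, none is a kernel API. */
#define TOY_NR_PAGES   4096	/* page frames in the toy zone */
#define TOY_INIT_BATCH 256	/* frames initialised per slow-path step */

struct toy_zone {
	size_t initialised;	/* frames [0, initialised) are initialised */
	size_t allocated;	/* frames [0, allocated) are handed out or skipped */
};

/* Fast path: succeeds only if an aligned, initialised block is available. */
static bool toy_try_alloc(struct toy_zone *z, unsigned int order, size_t *first)
{
	size_t nr = (size_t)1 << order;
	/* Round up so the block is naturally aligned for its order. */
	size_t start = (z->allocated + nr - 1) & ~(nr - 1);

	assert(z->allocated <= z->initialised);
	if (start + nr > z->initialised)
		return false;
	*first = start;
	z->allocated = start + nr;
	return true;
}

/* Initialise one more batch of frames; false once none are left. */
static bool toy_init_more(struct toy_zone *z)
{
	size_t left = TOY_NR_PAGES - z->initialised;

	if (left == 0)
		return false;
	z->initialised += left < TOY_INIT_BATCH ? left : TOY_INIT_BATCH;
	return true;
}

/*
 * Slow path: rather than predicting up front how much memory boot will
 * need, initialise frames on demand and retry until the request fits.
 */
bool toy_alloc_pages(struct toy_zone *z, unsigned int order, size_t *first)
{
	while (!toy_try_alloc(z, order, first)) {
		if (!toy_init_more(z))
			return false;	/* every frame is initialised: really OOM */
	}
	return true;
}
```

Starting from a zone with nothing initialised, an order-10 request in
this model drives four batches of initialisation and then succeeds; a
request only fails once every frame has been initialised and the block
still does not fit.  No boot-time estimate of memory demand appears
anywhere in the loop.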