Date: Mon, 9 Mar 2015 22:29:36 +1100
From: Dave Chinner <david@fromorbit.com>
To: Linus Torvalds
Cc: Ingo Molnar, Mel Gorman, Andrew Morton, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs@oss.sgi.com, ppc-dev
Subject: Re: [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur
Message-ID: <20150309112936.GD26657@destitution>
References: <1425741651-29152-1-git-send-email-mgorman@suse.de>
	<1425741651-29152-5-git-send-email-mgorman@suse.de>
	<20150307163657.GA9702@gmail.com>
	<20150308100223.GC15487@gmail.com>

On Sun, Mar 08, 2015 at 11:35:59AM -0700, Linus Torvalds wrote:
> On Sun, Mar 8, 2015 at 3:02 AM, Ingo Molnar wrote:
> But:
>
> > As a second hack (not to be applied), could we change:
> >
> >   #define _PAGE_BIT_PROTNONE      _PAGE_BIT_GLOBAL
> >
> > to:
> >
> >   #define _PAGE_BIT_PROTNONE      (_PAGE_BIT_GLOBAL+1)
> >
> > to double check that the position of the bit does not matter?
>
> Agreed. We should definitely try that.
>
> Dave?

As Mel has already mentioned, I'm in Boston for LSFMM and don't have
access to the test rig I've used to generate this.

> Also, is there some sane way for me to actually see this behavior on a
> regular machine with just a single socket? Dave is apparently running
> in some fake-numa setup, I'm wondering if this is easy enough to
> reproduce that I could see it myself.

Should be - I don't actually use 500TB of storage to generate this -
50GB on an SSD is all you need from the storage side. I just use a
sparse backing file to make it look like a 500TB device. :P

i.e. create an XFS filesystem on a 500TB sparse file with
"mkfs.xfs -d size=500t,file=1 /path/to/file.img", mount it on loopback
or as a virtio,cache=none device for the guest vm and then use fsmark
to generate several million files spread across many, many directories
such as:

$ fs_mark -D 10000 -S0 -n 100000 -s 1 -L 32 -d \
  /mnt/scratch/0 -d /mnt/scratch/1 -d /mnt/scratch/2 -d \
  /mnt/scratch/3 -d /mnt/scratch/4 -d /mnt/scratch/5 -d \
  /mnt/scratch/6 -d /mnt/scratch/7

That should only take a few minutes to run - if you throw 8p at it then
it should run at >100k files/s being created. Then unmount and run
"xfs_repair -o bhash=101703 /path/to/file.img" on the resultant image
file.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
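
[The steps above can be strung together roughly as follows. This is a
minimal sketch, assuming a local loopback mount rather than the
virtio,cache=none guest setup Dave mentions; /path/to/file.img and
/mnt/scratch are the placeholder paths from the email, and the
fake-NUMA VM configuration itself is not covered.]

    #!/bin/sh
    # Run as root. Needs ~50GB of real free space despite the 500TB size.
    IMG=/path/to/file.img
    MNT=/mnt/scratch

    # Make a 500TB XFS filesystem on a sparse image file.
    mkfs.xfs -d size=500t,file=1 "$IMG"

    # Loopback mount it and create the fs_mark target directories.
    mkdir -p "$MNT"
    mount -o loop "$IMG" "$MNT"
    mkdir -p "$MNT"/0 "$MNT"/1 "$MNT"/2 "$MNT"/3 \
             "$MNT"/4 "$MNT"/5 "$MNT"/6 "$MNT"/7

    # Create several million small files spread across many directories;
    # one worker per -d directory, so 8 directories uses 8 threads.
    fs_mark -D 10000 -S0 -n 100000 -s 1 -L 32 \
            -d "$MNT"/0 -d "$MNT"/1 -d "$MNT"/2 -d "$MNT"/3 \
            -d "$MNT"/4 -d "$MNT"/5 -d "$MNT"/6 -d "$MNT"/7

    # Unmount and run repair on the resulting image file.
    umount "$MNT"
    xfs_repair -o bhash=101703 "$IMG"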