Message-ID: <41B92662.40600@yahoo.com.au>
Date: Fri, 10 Dec 2004 15:30:26 +1100
From: Nick Piggin
To: Christoph Lameter
CC: Linus Torvalds, Hugh Dickins, akpm@osdl.org, Benjamin Herrenschmidt,
    linux-mm@kvack.org, linux-ia64@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: page fault scalability patch V12 [0/7]: Overview and performance tests
References: <41B8060A.4050402@yahoo.com.au>

Christoph Lameter wrote:
> On Thu, 9 Dec 2004, Nick Piggin wrote:
>
>>>For more than 8 cpus the page fault rate increases by orders
>>>of magnitude. For more than 64 cpus the improvement in performance
>>>is 10 times better.
>>
>>Those numbers are pretty impressive. I thought you'd said with earlier
>>patches that performance was about doubled from 8 to 512 CPUs. Did I
>>remember correctly? If so, where is the improvement coming from? The
>>per-thread RSS I guess?
>
> Right. The per-thread RSS seems to have made a big difference for high CPU
> counts. Also I was conservative in the estimates in the earlier post since I
> did not have the numbers for the very high cpu counts.

Ah OK.
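(Aside, for anyone following the thread who hasn't read the patches: the
"per-thread RSS" being discussed is essentially the split-counter pattern
sketched below. This is only a compilable userspace illustration of the
concept; the names here (rss_counter, fault_path, NTASKS) are made up for
the example, it is not Christoph's actual kernel code.)

/* Sketch of split ("per-thread") RSS accounting: each thread bumps only
 * its own cache-line-padded counter in its fault path, so the hot path
 * takes no lock and bounces no shared cache line. Whoever needs the
 * total walks the array and sums it.
 */
#include <pthread.h>
#include <stdio.h>

#define NTASKS          4
#define FAULTS_PER_TASK 1000000

struct rss_counter {
	long count;
	char pad[64 - sizeof(long)];	/* keep each counter on its own cache line */
};

static struct rss_counter rss[NTASKS];

static void *fault_path(void *arg)
{
	long id = (long)arg;
	long i;

	for (i = 0; i < FAULTS_PER_TASK; i++)
		rss[id].count++;	/* private counter: no lock, no contention */
	return NULL;
}

static long total_rss(void)
{
	long sum = 0;
	int i;

	for (i = 0; i < NTASKS; i++)
		sum += rss[i].count;	/* the rare reader pays the summing cost */
	return sum;
}

int main(void)
{
	pthread_t threads[NTASKS];
	long i;

	for (i = 0; i < NTASKS; i++)
		pthread_create(&threads[i], NULL, fault_path, (void *)i);
	for (i = 0; i < NTASKS; i++)
		pthread_join(threads[i], NULL);

	printf("total rss: %ld\n", total_rss());
	return 0;
}

The cost of summing is pushed onto the comparatively rare readers of the
total, which is presumably why it pays off so well at high CPU counts.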

>>On another note, these patches are basically only helpful to new
>>anonymous page faults. I guess this is the main thing you are concerned
>>about at the moment, but I wonder if you would see improvements with
>>my patch to remove the ptl from the other types of faults as well?
>
> I can try that, but I am frankly a bit sceptical since the ptl protects
> many other variables. It may be more efficient to have the ptl in these
> cases than doing the atomic ops all over the place. Do you have any
> numbers you could post? I believe I sent you a copy of the code that I
> use for performance tests last week or so.

Yep, I have your test program. No real numbers because the biggest thing I
have to test on is a 4-way - there is improvement, but it is not as
impressive as your 512-way tests! :) (A rough sketch of the locking
trade-off we are talking about is at the end of this mail.)

>>The downside of my patch - well, the main downsides - compared to yours
>>are its intrusiveness, and the extra cost involved in copy_page_range,
>>which yours appears not to require.
>
> Is the patch known to be okay for ia64? I can try to see how it
> does.

I think it just needs one small fix to the swapping code, and then it
should be pretty stable. So in fact it would probably work for you as is
(if you don't swap), but I'd rather have something more stable before I
ask you to test. I'll try to find time to do that in the next few days.

>>As I've said earlier though, I wouldn't mind your patches going in. At
>>least they should probably get into -mm soon, when Andrew has time (and
>>after the 4level patches are sorted out). That wouldn't stop my patch
>>(possibly) being merged some time after that if and when it was found
>>worthy...
>
> I'd certainly be willing to poke around and see how beneficial this is.
> If it turns out to accelerate other functionality of the vm then you
> have my full support.

Great, thanks.
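
P.S. The ptl-versus-atomics trade-off above, in sketch form. Again, this
is only a compilable userspace illustration under made-up names (pte_t
here is a plain word, install_new_anon_pte() is not a real kernel
function); it is not the actual patch:

/* Sketch of installing a pte without taking the page_table_lock: the
 * fault path writes the new pte only if the slot is still empty, using
 * cmpxchg. If another CPU populated it first, we back off instead of
 * installing a duplicate.
 */
#include <stdio.h>

typedef unsigned long pte_t;	/* stand-in for the real pte type */
#define PTE_NONE ((pte_t)0)

static pte_t page_table[512];	/* one "page table" worth of slots */

/* Returns 1 if we installed the pte, 0 if someone else beat us to it. */
static int install_new_anon_pte(pte_t *ptep, pte_t newpte)
{
	pte_t old;

	old = __sync_val_compare_and_swap(ptep, PTE_NONE, newpte);
	if (old != PTE_NONE) {
		/* Lost the race: another thread already populated this pte.
		 * The page_table_lock approach would have serialized here
		 * instead. */
		return 0;
	}
	return 1;
}

int main(void)
{
	pte_t fake_pte = 0x1234;	/* pretend this maps a freshly zeroed page */

	if (install_new_anon_pte(&page_table[7], fake_pte))
		printf("installed pte in slot 7\n");
	if (!install_new_anon_pte(&page_table[7], fake_pte))
		printf("second attempt correctly lost the race\n");
	return 0;
}

The cmpxchg only protects that one pte slot, which is exactly Christoph's
point above: everything else the ptl used to cover (rss accounting and so
on) has to be made safe by other means, such as the per-thread RSS
counters.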