From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S261306AbULWVD2 (ORCPT ); Thu, 23 Dec 2004 16:03:28 -0500 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S261304AbULWVD1 (ORCPT ); Thu, 23 Dec 2004 16:03:27 -0500 Received: from omx2-ext.sgi.com ([192.48.171.19]:15766 "EHLO omx2.sgi.com") by vger.kernel.org with ESMTP id S261298AbULWVDD (ORCPT ); Thu, 23 Dec 2004 16:03:03 -0500 Date: Thu, 23 Dec 2004 13:02:57 -0800 (PST) From: Christoph Lameter X-X-Sender: clameter@schroedinger.engr.sgi.com To: Andi Kleen cc: linux-kernel@vger.kernel.org Subject: Re: Prezeroing V2 [0/3]: Why and When it works In-Reply-To: Message-ID: References: <41C20E3E.3070209@yahoo.com.au.suse.lists.linux.kernel> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org On Thu, 23 Dec 2004, Andi Kleen wrote: > > 1. Aggregating zeroing operations to only apply to pages of higher order, > > which results in many pages that will later become order 0 to be > > zeroed in one go. For that purpose the existing clear_page function is > > extended and made to take an additional argument specifying the order of > > the page to be cleared. > > But if you do that you should really use a separate function that > can use cache bypassing stores. > > Normal clear_page cannot use that because it would be a loss > when the data is soon used. Clear_page is used both in the cache hot and no cache wanted case now. > So the two changes don't really make sense. Which two changes? If an arch can do zeroing without touching the cpu caches then that can be done with a zero driver. > Also I must say I'm still suspicious regarding your heuristic > to trigger gang faulting - with bad luck it could lead to a lot > more memory usage to specific applications that do very sparse > usage of memory. Gang faulting is not part of this patch. Please keep the issues separate. > There should be at least an madvise flag to turn it off and a sysctl > and it would be better to trigger only on a longer sequence of > consecutive faulted pages. Again this is not related to this patchset. Look at the V13 of the page fault scalability patch and you will find a /proc/sys/vm setting to manipulate things. This is V2 of the prezeroing patch. > How about some numbers on i386? Umm. Yeah. I only have smallish i386 machines here. Maybe next year ;-)