linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Christoph Lameter <clameter@sgi.com>
To: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: "Luck, Tony" <tony.luck@intel.com>, Robin Holt <holt@sgi.com>,
	Adam Litke <agl@us.ibm.com>,
	linux-ia64@vger.kernel.org, torvalds@osdl.org,
	linux-mm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Increase page fault rate by prezeroing V1 [0/3]: Overview
Date: Tue, 21 Dec 2004 11:55:27 -0800 (PST)	[thread overview]
Message-ID: <Pine.LNX.4.58.0412211154100.1313@schroedinger.engr.sgi.com> (raw)
In-Reply-To: <41C20E3E.3070209@yahoo.com.au>


The patches increasing the page fault rate (introduction of atomic pte operations
and anticipatory prefaulting) do so by reducing the locking overhead and are
therefore mainly of interest for applications running in SMP systems with a high
number of cpus. The single thread performance does just show minor increases.
Only the performance of multi-threaded applications increase significantly.

The most expensive operation in the page fault handler is (apart of SMP
locking overhead) the zeroing of the page that is also done in the page fault
handler. Others have seen this too and have tried provide a way to provide
zeroed pages to the page fault handler:

http://marc.theaimsgroup.com/?t=109914559100004&r=1&w=2
http://marc.theaimsgroup.com/?t=109777267500005&r=1&w=2
http://marc.theaimsgroup.com/?l=linux-kernel&m=104931944213955&w=2

The problem so far has been that simple zeroing of pages simply shifts
the time spend somewhere else. Plus one would not want to zero hot
pages.

This patch addresses those issues by making it more effective to zero pages by:

1. Aggregating zeroing operations to mainly apply to larger order pages
which results in many later order 0 pages to be zeroed in one go.
For that purpose a new achitecture specific function zero_page(page, order)
is introduced.

2. Hardware support for offloading zeroing from the cpu. This avoids
the invalidation of the cpu caches by extensive zeroing operations.

The result is a significant increase of the page fault performance even for
single threaded applications:

w/o patch:
 Gb Rep Threads   User      System     Wall flt/cpu/s fault/wsec
   4   3    1    0.146s     11.155s  11.030s 69584.896  69566.852

w/patch
 Gb Rep Threads   User      System     Wall flt/cpu/s fault/wsec
   1   1    1    0.014s      0.110s   0.012s524292.194 517665.538

This is a performance increase by a factor 8!

The performance can only be upheld if enough zeroed pages are available.
In a heavy memory intensive benchmark the system will run out of these very
fast but the efficient algorithm for page zeroing still makes this a winner
(8 way system with 6 GB RAM, no hardware zeroing support):

w/o patch:

Gb Rep Threads   User      System     Wall flt/cpu/s fault/wsec
 4   3    1    0.146s     11.155s  11.030s 69584.896  69566.852
 4   3    2    0.170s     14.909s   7.097s 52150.369  98643.687
 4   3    4    0.181s     16.597s   5.079s 46869.167 135642.420
 4   3    8    0.166s     23.239s   4.037s 33599.215 179791.120

w/patch
Gb Rep Threads   User      System     Wall flt/cpu/s fault/wsec
 4   3    1    0.183s      2.750s   2.093s268077.996 267952.890
 4   3    2    0.185s      4.876s   2.097s155344.562 263967.292
 4   3    4    0.150s      6.617s   2.097s116205.793 264774.080
 4   3    8    0.186s     13.693s   3.054s 56659.819 221701.073

The patch is composed of 3 parts:

[1/3] Introduce __GFP_ZERO
	Modifies the page allocator to be able to take the __GFP_ZERO flag
	and returns zeroed memory on request. Modifies locations throughout
	the linux sources that retrieve a page and then zeroe it to request
	a zeroed page.
	Adds new low level zero_page functions for i386, ia64 and x86_64.
	(x64_64 untested)

[2/3] Page Zeroing
	Adds management of ZEROED and NOT_ZEROED pages and a background daemon
	called scrubd. scrubd is disable by default but can be enabled
	by writing an order number to /proc/sys/vm/scrub_start. If a page
	is coalesced of that order then the scrub daemon will start zeroing
	until all pages of order /proc/sys/vm/scrub_stop and higher are
	zeroed.

[3/3]	SGI Altix Block Transfer Engine Support
	Implements a driver to shift the zeroing off the cpu into hardware.
	With hardware support there will be minimal impact of zeroing
	on the performance of the system.



       reply	other threads:[~2004-12-21 19:56 UTC|newest]

Thread overview: 87+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <B8E391BBE9FE384DAA4C5C003888BE6F02900FBD@scsmsx401.amr.corp.intel.com>
     [not found] ` <41C20E3E.3070209@yahoo.com.au>
2004-12-21 19:55   ` Christoph Lameter [this message]
2004-12-21 19:56     ` Increase page fault rate by prezeroing V1 [1/3]: Introduce __GFP_ZERO Christoph Lameter
2004-12-21 19:57     ` Increase page fault rate by prezeroing V1 [2/3]: zeroing and scrubd Christoph Lameter
2005-01-01  2:22       ` Nick Piggin
2005-01-01  2:55         ` pmarques
2004-12-21 19:57     ` Increase page fault rate by prezeroing V1 [3/3]: Altix SN2 BTE Zeroing Christoph Lameter
2004-12-22 12:46       ` Robin Holt
2004-12-22 19:56         ` Christoph Lameter
2004-12-23 19:29     ` Prezeroing V2 [0/3]: Why and When it works Christoph Lameter
2004-12-23 19:33       ` Prezeroing V2 [1/4]: __GFP_ZERO / clear_page() removal Christoph Lameter
2004-12-23 19:33         ` Prezeroing V2 [2/4]: add second parameter to clear_page() for all arches Christoph Lameter
2004-12-24  8:33           ` Pavel Machek
2004-12-24 16:18             ` Christoph Lameter
2004-12-24 16:27               ` Pavel Machek
2004-12-24 17:02                 ` David S. Miller
2004-12-24 17:05           ` David S. Miller
2004-12-27 22:48             ` David S. Miller
2005-01-03 17:52             ` Christoph Lameter
2005-01-01 10:24           ` Geert Uytterhoeven
2005-01-04 23:12             ` Prezeroing V3 [0/4]: Discussion and i386 performance tests Christoph Lameter
2005-01-04 23:13               ` Prezeroing V3 [1/4]: Allow request for zeroed memory Christoph Lameter
2005-01-04 23:45                 ` Dave Hansen
2005-01-05  1:16                   ` Christoph Lameter
2005-01-05  1:26                     ` Linus Torvalds
2005-01-05 23:11                       ` Christoph Lameter
2005-01-05  0:34                 ` Linus Torvalds
2005-01-05  0:47                   ` Andrew Morton
2005-01-05  1:15                     ` Christoph Lameter
2005-01-08 21:12                 ` Hugh Dickins
2005-01-08 21:56                   ` David S. Miller
2005-01-21 20:09                     ` alloc_zeroed_user_highpage to fix the clear_user_highpage issue Christoph Lameter
2005-02-09  9:58                       ` [Patch] Fix oops in alloc_zeroed_user_highpage() when page is NULL Michael Ellerman
2005-02-10  0:38                         ` Christoph Lameter
2005-01-21 20:12                     ` Extend clear_page by an order parameter Christoph Lameter
2005-01-21 22:29                       ` Paul Mackerras
2005-01-21 23:48                         ` Christoph Lameter
2005-01-22  0:35                           ` Paul Mackerras
2005-01-22  0:43                             ` Andrew Morton
2005-01-22  1:08                               ` Paul Mackerras
2005-01-22  1:20                               ` Roman Zippel
2005-01-22  1:25                               ` Paul Mackerras
2005-01-22  1:54                                 ` Christoph Lameter
2005-01-22  2:53                                   ` Paul Mackerras
2005-01-23  7:45                       ` Andrew Morton
2005-01-24 16:37                         ` Christoph Lameter
2005-01-24 20:23                           ` David S. Miller
2005-01-24 20:33                             ` Christoph Lameter
2005-01-21 20:15                     ` A scrub daemon (prezeroing) Christoph Lameter
2005-01-10 17:16                   ` Prezeroing V3 [1/4]: Allow request for zeroed memory Christoph Lameter
2005-01-10 18:13                     ` Linus Torvalds
2005-01-10 20:17                       ` Christoph Lameter
2005-01-10 23:53                       ` Prezeroing V4 [0/4]: Overview Christoph Lameter
2005-01-10 23:54                         ` Prezeroing V4 [1/4]: Arch specific page zeroing during page fault Christoph Lameter
2005-01-11  0:41                           ` Chris Wright
2005-01-11  0:46                             ` Christoph Lameter
2005-01-11  0:49                               ` Chris Wright
2005-01-10 23:55                         ` Prezeroing V4 [2/4]: Zeroing implementation Christoph Lameter
2005-01-10 23:55                         ` Prezeroing V4 [3/4]: Altix SN2 BTE zero driver Christoph Lameter
2005-01-10 23:56                         ` Prezeroing V4 [4/4]: Extend clear_page to take an order parameter Christoph Lameter
2005-01-04 23:14               ` Prezeroing V3 [2/4]: Extension of " Christoph Lameter
2005-01-05 23:25                 ` Christoph Lameter
2005-01-06 13:52                   ` Andi Kleen
2005-01-06 17:47                     ` Christoph Lameter
2005-01-04 23:15               ` Prezeroing V3 [3/4]: Page zeroing through kscrubd Christoph Lameter
2005-01-04 23:16               ` Prezeroing V3 [4/4]: Driver for hardware zeroing on Altix Christoph Lameter
2005-01-05  2:16                 ` Andi Kleen
2005-01-05 16:24                   ` Christoph Lameter
2004-12-23 19:34         ` Prezeroing V2 [3/4]: Add support for ZEROED and NOT_ZEROED free maps Christoph Lameter
2004-12-23 19:35         ` Prezeroing V2 [4/4]: Hardware Zeroing through SGI BTE Christoph Lameter
2004-12-23 20:08         ` Prezeroing V2 [1/4]: __GFP_ZERO / clear_page() removal Brian Gerst
2004-12-24 16:24           ` Christoph Lameter
2004-12-23 19:49       ` Prezeroing V2 [0/3]: Why and When it works Arjan van de Ven
2004-12-23 20:57       ` Matt Mackall
2004-12-23 21:01       ` Paul Mackerras
2004-12-23 21:11       ` Paul Mackerras
2004-12-23 21:37         ` Andrew Morton
2004-12-23 23:00           ` Paul Mackerras
2004-12-23 21:48         ` Linus Torvalds
2004-12-23 22:34           ` Zwane Mwaikambo
2004-12-24  9:14           ` Arjan van de Ven
2004-12-24 18:21             ` Linus Torvalds
2004-12-24 18:57               ` Arjan van de Ven
2004-12-27 22:50               ` David S. Miller
2004-12-28 11:53                 ` Marcelo Tosatti
2004-12-24 16:17           ` Christoph Lameter
2004-12-24 18:31     ` Increase page fault rate by prezeroing V1 [0/3]: Overview Andrea Arcangeli
2005-01-03 17:54       ` Christoph Lameter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Pine.LNX.4.58.0412211154100.1313@schroedinger.engr.sgi.com \
    --to=clameter@sgi.com \
    --cc=agl@us.ibm.com \
    --cc=holt@sgi.com \
    --cc=linux-ia64@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@vger.kernel.org \
    --cc=nickpiggin@yahoo.com.au \
    --cc=tony.luck@intel.com \
    --cc=torvalds@osdl.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).