From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Minchan Kim <minchan.kim@gmail.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Peter Zijlstra <peterz@infradead.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	cl@linux-foundation.org,
	"hugh.dickins" <hugh.dickins@tiscali.co.uk>,
	Nick Piggin <nickpiggin@yahoo.com.au>,
	Ingo Molnar <mingo@elte.hu>
Subject: Re: [RFC][PATCH 6/8] mm: handle_speculative_fault()
Date: Wed, 6 Jan 2010 16:06:14 +0900
Message-ID: <20100106160614.ff756f82.kamezawa.hiroyu@jp.fujitsu.com>
In-Reply-To: <alpine.LFD.2.00.1001052007090.3630@localhost.localdomain>

[-- Attachment #1: Type: text/plain, Size: 4341 bytes --]

On Tue, 5 Jan 2010 20:20:56 -0800 (PST)
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> 
> 
> On Wed, 6 Jan 2010, KAMEZAWA Hiroyuki wrote:
> > > 
> > > Of course, your other load with MADV_DONTNEED seems to be horrible, and 
> > > has some nasty spinlock issues, but that looks like a separate deal (I 
> > > assume that load is just very hard on the pgtable lock).
> > 
> > It's zone->lock, I guess. My test program avoids pgtable lock problem.
> 
> Yeah, I should have looked more at your callchain. That's nasty. Much 
> worse than the per-mm lock. I thought the page buffering would avoid the 
> zone lock becoming a huge problem, but clearly not in this case.
> 
For my peace of mind, I rewrote the test program as

  while (1) {
	touch memory
	barrier
	madvise DONTNEED all ranges, by cpu 0 only
	barrier
  }
so that madvise() is serialized.

With that, zone->lock disappears from the profile, and I don't see a big
difference between the XADD rwsem and my tricky patch. I think this is a
reasonable result, and fixing rwsem is the sane way.
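
The loop structure described above can be sketched in a bounded form. This is
only an illustration, not the attached test program: the names (sketch_run,
sketch_worker, the SKETCH_* constants) and the sizes are made up for the
example, threads are not pinned to cpus, and the thread that calls madvise()
is picked with PTHREAD_BARRIER_SERIAL_THREAD instead of a cpu == 0 check. It
also relies on the Linux behaviour that anonymous private pages read back as
zero after MADV_DONTNEED.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/mman.h>

/* All names and sizes below are illustrative assumptions. */
#define SKETCH_THREADS	4
#define SKETCH_PAGE	4096
#define SKETCH_REGION	(64 * SKETCH_PAGE)
#define SKETCH_ROUNDS	3

static pthread_barrier_t sketch_barrier;
static char *sketch_area;

static void *sketch_worker(void *data)
{
	int id = *(int *)data;
	char *start, *end, *c;
	int round;

	assert(id >= 0 && id < SKETCH_THREADS);
	start = sketch_area + (size_t)id * SKETCH_REGION;
	end = start + SKETCH_REGION;

	for (round = 0; round < SKETCH_ROUNDS; round++) {
		/* Phase 1: every thread faults in its own region. */
		for (c = start; c < end; c += SKETCH_PAGE)
			*c = 1;

		/*
		 * Exactly one waiter is told it is the "serial" thread;
		 * it alone drops the pages of every region.
		 */
		if (pthread_barrier_wait(&sketch_barrier) ==
		    PTHREAD_BARRIER_SERIAL_THREAD)
			madvise(sketch_area,
				(size_t)SKETCH_THREADS * SKETCH_REGION,
				MADV_DONTNEED);

		/* Nobody touches memory again until the drop is done. */
		pthread_barrier_wait(&sketch_barrier);
	}
	return NULL;
}

/*
 * Returns how many pages still read non-zero after the last drop,
 * or -1 on setup failure (a sketch: no cleanup on the error paths).
 */
int sketch_run(void)
{
	pthread_t threads[SKETCH_THREADS];
	int ids[SKETCH_THREADS];
	size_t len = (size_t)SKETCH_THREADS * SKETCH_REGION;
	size_t off;
	int i, stale = 0;

	sketch_area = mmap(NULL, len, PROT_READ | PROT_WRITE,
			   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (sketch_area == MAP_FAILED)
		return -1;
	if (pthread_barrier_init(&sketch_barrier, NULL, SKETCH_THREADS) != 0)
		return -1;

	for (i = 0; i < SKETCH_THREADS; i++) {
		ids[i] = i;
		if (pthread_create(&threads[i], NULL, sketch_worker, &ids[i]) != 0)
			return -1;
	}
	for (i = 0; i < SKETCH_THREADS; i++)
		pthread_join(threads[i], NULL);

	/* Dropped anonymous private pages read back as zero on Linux. */
	for (off = 0; off < len; off += SKETCH_PAGE)
		if (sketch_area[off] != 0)
			stale++;

	pthread_barrier_destroy(&sketch_barrier);
	munmap(sketch_area, len);
	return stale;
}
```

Calling sketch_run() from a small main() and checking for a return value of 0
is enough to see that only one thread at a time is inside madvise() and that
every page was dropped after the final round.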

The next target will be clear_page()? hehe.
What catches my eye is the cost of memcg... (>_<

Thank you all, 
-Kame
==
[XADD rwsem]
[root@bluextal memory]#  /root/bin/perf stat -e page-faults,cache-misses --repeat 5 ./multi-fault-all 8

 Performance counter stats for './multi-fault-all 8' (5 runs):

       33029186  page-faults                ( +-   0.146% )
      348698659  cache-misses               ( +-   0.149% )

   60.002876268  seconds time elapsed   ( +-   0.001% )

# Samples: 815596419603
#
# Overhead          Command             Shared Object  Symbol
# ........  ...............  ........................  ......
#
    41.51%  multi-fault-all  [kernel]                  [k] clear_page_c
     9.08%  multi-fault-all  [kernel]                  [k] down_read_trylock
     6.23%  multi-fault-all  [kernel]                  [k] up_read
     6.17%  multi-fault-all  [kernel]                  [k] __mem_cgroup_try_charg
     4.76%  multi-fault-all  [kernel]                  [k] handle_mm_fault
     3.77%  multi-fault-all  [kernel]                  [k] __mem_cgroup_commit_ch
     3.62%  multi-fault-all  [kernel]                  [k] __rmqueue
     2.30%  multi-fault-all  [kernel]                  [k] _raw_spin_lock
     2.30%  multi-fault-all  [kernel]                  [k] page_fault
     2.12%  multi-fault-all  [kernel]                  [k] mem_cgroup_charge_comm
     2.05%  multi-fault-all  [kernel]                  [k] bad_range
     1.78%  multi-fault-all  [kernel]                  [k] _raw_spin_lock_irq
     1.53%  multi-fault-all  [kernel]                  [k] lookup_page_cgroup
     1.44%  multi-fault-all  [kernel]                  [k] __mem_cgroup_uncharge_
     1.41%  multi-fault-all  ./multi-fault-all         [.] worker
     1.30%  multi-fault-all  [kernel]                  [k] get_page_from_freelist
     1.06%  multi-fault-all  [kernel]                  [k] page_remove_rmap



[async page fault]
[root@bluextal memory]#  /root/bin/perf stat -e page-faults,cache-misses --repeat 5 ./multi-fault-all 8

 Performance counter stats for './multi-fault-all 8' (5 runs):

       33345089  page-faults                ( +-   0.555% )
      357660074  cache-misses               ( +-   1.438% )

   60.003711279  seconds time elapsed   ( +-   0.002% )


    40.94%  multi-fault-all  [kernel]                  [k] clear_page_c
     6.96%  multi-fault-all  [kernel]                  [k] vma_put
     6.82%  multi-fault-all  [kernel]                  [k] page_add_new_anon_rmap
     5.86%  multi-fault-all  [kernel]                  [k] __mem_cgroup_try_charg
     4.40%  multi-fault-all  [kernel]                  [k] __rmqueue
     4.14%  multi-fault-all  [kernel]                  [k] find_vma_speculative
     3.97%  multi-fault-all  [kernel]                  [k] handle_mm_fault
     3.52%  multi-fault-all  [kernel]                  [k] _raw_spin_lock
     3.46%  multi-fault-all  [kernel]                  [k] __mem_cgroup_commit_ch
     2.23%  multi-fault-all  [kernel]                  [k] bad_range
     2.16%  multi-fault-all  [kernel]                  [k] mem_cgroup_charge_comm
     1.96%  multi-fault-all  [kernel]                  [k] _raw_spin_lock_irq
     1.75%  multi-fault-all  [kernel]                  [k] mem_cgroup_add_lru_lis
     1.73%  multi-fault-all  [kernel]                  [k] page_fault
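
To put the two perf stat runs above side by side, the page-fault counts can be
turned into a rate and a relative difference. The helper names (fault_rate,
percent_change, print_comparison) are made up for this sketch; the figures are
copied from the two outputs above. It works out to roughly 550,000 and 556,000
faults per second, a difference of just under 1%.

```c
#include <assert.h>
#include <stdio.h>

/* Faults per second for one perf stat run. */
double fault_rate(double page_faults, double seconds)
{
	assert(seconds > 0.0);
	return page_faults / seconds;
}

/* Relative change of b versus a, in percent. */
double percent_change(double a, double b)
{
	assert(a != 0.0);
	return (b - a) / a * 100.0;
}

/* Print both rates and how far the async run is from the XADD rwsem run. */
void print_comparison(void)
{
	/* Figures copied from the two perf stat outputs above. */
	double xadd = fault_rate(33029186.0, 60.002876268);
	double async = fault_rate(33345089.0, 60.003711279);

	printf("XADD rwsem:       %.0f faults/sec\n", xadd);
	printf("async page fault: %.0f faults/sec\n", async);
	printf("difference:       %+.2f%%\n", percent_change(xadd, async));
}
```

The reported run-to-run variation (+- 0.146% and +- 0.555% on page-faults)
is worth keeping in mind when reading the printed difference.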

[-- Attachment #2: multi-fault-all.c --]
[-- Type: text/x-csrc, Size: 1877 bytes --]

/*
 * multi-fault-all.c :: causes 60 seconds of parallel page faults from
 * multiple threads.
 * % gcc -O2 -o multi-fault-all multi-fault-all.c -lpthread
 * % ./multi-fault-all <# of cpus>
 */

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>	/* atoi() */
#include <unistd.h>	/* sleep() */
#include <pthread.h>
#include <sched.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <signal.h>

#define NR_THREADS	8
pthread_t threads[NR_THREADS];
/*
 * To avoid contention on the page table lock, the FAULT area is
 * sparse. If FAULT_LENGTH is too large for your cpus, decrease it.
 */
#define MMAP_LENGTH	(8 * 1024 * 1024)
#define FAULT_LENGTH	(2 * 1024 * 1024)
void *mmap_area[NR_THREADS];
#define PAGE_SIZE	4096

pthread_barrier_t barrier;
int name[NR_THREADS];

void segv_handler(int sig)
{
	sleep(100);
}

int num;
void *worker(void *data)
{
	cpu_set_t set;
	int i, cpu;

	cpu = *(int *)data;

	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	sched_setaffinity(0, sizeof(set), &set);

	while (1) {
		char *c;
		char *start = mmap_area[cpu];
		char *end = start + FAULT_LENGTH;
		pthread_barrier_wait(&barrier);
		//printf("fault into %p-%p\n",start, end);

		for (c = start; c < end; c += PAGE_SIZE)
			*c = 0;

		pthread_barrier_wait(&barrier);
		/* Only cpu 0 drops the pages, so madvise() is serialized. */
		for (i = 0; cpu==0 && i < num; i++)
			madvise(mmap_area[i], FAULT_LENGTH, MADV_DONTNEED);
		pthread_barrier_wait(&barrier);
	}
	return NULL;
}

int main(int argc, char *argv[])
{
	int i, ret;

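	if (argc < 2) {
		fprintf(stderr, "usage: %s <# of cpus>\n", argv[0]);
		return 1;
	}

	num = atoi(argv[1]);
	/* threads[], mmap_area[] and name[] hold at most NR_THREADS entries. */
	if (num < 1 || num > NR_THREADS) {
		fprintf(stderr, "# of cpus must be 1..%d\n", NR_THREADS);
		return 1;
	}
	pthread_barrier_init(&barrier, NULL, num);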

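	/* MAP_ANONYMOUS ignores fd, but -1 is the portable value to pass. */
	mmap_area[0] = mmap(NULL, (size_t)MMAP_LENGTH * num, PROT_WRITE|PROT_READ,
				MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (mmap_area[0] == MAP_FAILED) {
		perror("mmap");
		return 1;
	}
	for (i = 1; i < num; i++) {
		mmap_area[i] = (char *)mmap_area[i - 1] + MMAP_LENGTH;
	}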

	for (i = 0; i < num; ++i) {
		name[i] = i;
		ret = pthread_create(&threads[i], NULL, worker, &name[i]);
		/* pthread_create() returns a positive error number, not -1. */
		if (ret != 0) {
			fprintf(stderr, "pthread_create: error %d\n", ret);
			return 1;
		}
	}
	sleep(60);
	return 0;
}
