linux-mm.kvack.org archive mirror
From: Nadav Amit <nadav.amit@gmail.com>
To: "open list:MEMORY MANAGEMENT" <linux-mm@kvack.org>
Cc: Mel Gorman <mgorman@suse.de>, Andy Lutomirski <luto@kernel.org>
Subject: TLB batching breaks MADV_DONTNEED
Date: Tue, 18 Jul 2017 22:05:23 -0700	[thread overview]
Message-ID: <B672524C-1D52-4215-89CB-9FF3477600C9@gmail.com> (raw)

Something seems to be really wrong with all these TLB flush batching
mechanisms that are all around the kernel. Here is another example,
which was not addressed by the recently submitted patches.

Consider what happens when two MADV_DONTNEED operations run
concurrently. According to the man page: "After a successful
MADV_DONTNEED operation ... subsequent accesses of pages in the range
will succeed, but will result in ... zero-fill-on-demand pages for
anonymous private mappings."

However, the test below, which does MADV_DONTNEED in two threads, reads
"8" and not "0" from the memory following the MADV_DONTNEED. It happens
because one of the threads clears the PTE but defers the TLB flush for
some time (until it finishes changing 16k PTEs). The main thread then
sees the PTE as already non-present and therefore does not flush the
TLB.

I think there is a need for a batching scheme that considers whether
mmap_sem is taken for write/read/not at all, as well as the nature of
the PTE change. Unfortunately, I do not have the time to do it right
now.

Am I missing something? Thoughts?


---


#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <pthread.h>
#include <string.h>

#define PAGE_SIZE	(4096)
#define N_PAGES		(65536)

volatile int sync_step = 0;
volatile char *p;

static inline unsigned long rdtsc(void)
{
	unsigned long hi, lo;

	/* RDTSC returns the timestamp counter in EDX:EAX */
	__asm__ __volatile__ ("rdtsc" : "=a"(lo), "=d"(hi));
	return lo | (hi << 32);
}

static inline void wait_rdtsc(unsigned long cycles)
{
	unsigned long tsc = rdtsc();

	while (rdtsc() - tsc < cycles);
}

void *big_madvise_thread(void *ign)
{
	sync_step = 1;
	while (sync_step != 2);
	madvise((void*)p, PAGE_SIZE * N_PAGES, MADV_DONTNEED);
	return NULL;
}

int main(void)
{
	pthread_t aux_thread;

	p = mmap(0, PAGE_SIZE * N_PAGES, PROT_READ|PROT_WRITE,
		 MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);

	memset((void*)p, 8, PAGE_SIZE * N_PAGES);

	pthread_create(&aux_thread, NULL, big_madvise_thread, NULL);
	while (sync_step != 1);

	*p = 8;		// Cache the translation in the TLB
	sync_step = 2;
	wait_rdtsc(100000);
	madvise((void*)p, PAGE_SIZE, MADV_DONTNEED);
	printf("Result : %d\n", *p);
	return 0;
}



Thread overview: 4+ messages
2017-07-19  5:05 Nadav Amit [this message]
2017-07-19  8:23 ` TLB batching breaks MADV_DONTNEED Mel Gorman
2017-07-19 18:14   ` Nadav Amit
2017-07-19 20:08     ` Mel Gorman
