From: Matt Mackall <mpm@selenic.com>
To: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Gerald Schaefer <gerald.schaefer@de.ibm.com>,
	akpm@linux-foundation.org
Subject: Re: [PATCH] fix/improve generic page table walker
Date: Wed, 11 Mar 2009 12:24:23 -0500
Message-ID: <1236792263.3205.45.camel@calx>
In-Reply-To: <20090311144951.58c6ab60@skybase>

On Wed, 2009-03-11 at 14:49 +0100, Martin Schwidefsky wrote:
> From: Martin Schwidefsky <schwidefsky@de.ibm.com>
> 
> On s390 the /proc/pid/pagemap interface is currently broken. This is
> caused by the unconditional loop over all pgd/pud entries as specified
> by the address range passed to walk_page_range. The tricky bit here
> is that the pgd++ in the outer loop may only be done if the page table
> really has 4 levels. For the pud++ in the second loop the page table needs
> to have at least 3 levels. With the dynamic page tables on s390 we can have
> page tables with 2, 3 or 4 levels. Which means that the pgd and/or the
> pud pointer can get out-of-bounds causing all kinds of mayhem.

Without delving into the s390 code, I'm not sure why this should be a
problem. After all, x86 also has 2, 3, or 4 levels (chosen at compile
time) in a way that's transparent to the walker.
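
To illustrate what "transparent to the walker" can mean, here is a rough
userspace sketch of compile-time folding. It assumes the asm-generic
style of folding, where the folded level's addr_end helper simply
returns the end it was given, so the inner loop always runs exactly
once per outer entry. Every toy_* name and the shift value are
hypothetical, and this is a model of the idea, not kernel or s390 code.

```c
#include <assert.h>

/* Assumed toy geometry: one top-level entry covers 4 MiB. */
#define TOY_PGDIR_SHIFT 22
#define TOY_PGDIR_SIZE  (1UL << TOY_PGDIR_SHIFT)
#define TOY_PGDIR_MASK  (~(TOY_PGDIR_SIZE - 1))

/* Next top-level boundary after addr, or end if that comes first. */
static unsigned long toy_pgd_addr_end(unsigned long addr, unsigned long end)
{
    unsigned long boundary = (addr + TOY_PGDIR_SIZE) & TOY_PGDIR_MASK;

    /* The "- 1" keeps the comparison correct if boundary wraps to 0. */
    return (boundary - 1 < end - 1) ? boundary : end;
}

/* Folded level: the whole sub-range belongs to the single "entry". */
static unsigned long toy_pud_addr_end(unsigned long addr, unsigned long end)
{
    (void)addr;
    return end;
}

struct toy_counts {
    unsigned int pgd_steps;
    unsigned int pud_steps;
};

/* Walk [addr, end) and count how often each loop body runs. */
static struct toy_counts toy_walk(unsigned long addr, unsigned long end)
{
    struct toy_counts c = { 0, 0 };
    unsigned long next, pud_addr;

    assert(addr < end);
    do {
        next = toy_pgd_addr_end(addr, end);
        c.pgd_steps++;

        pud_addr = addr;
        do {
            /* With folding this jumps straight to next. */
            pud_addr = toy_pud_addr_end(pud_addr, next);
            c.pud_steps++;
        } while (pud_addr != next);

        addr = next;
    } while (addr != end);

    return c;
}
```

In this model the folded loop never advances a pointer past its single
entry, because the choice is fixed at compile time. A table whose depth
varies at run time would need some other way to get the same guarantee.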

> The proposed solution is to fast-forward over the hole between the start
> address and the first vma and the hole between the last vma and the end
> address. The pgd/pud/pmd/pte loops are used only for the address range
> between the first and last vma. This guarantees that the page table
> pointers stay in range for s390. For the other architectures this is
> a small optimization.

I've gone to lengths to keep VMAs out of the equation, so I can't say
I'm excited about this solution.
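
For reference, the fast-forward idea in the quoted proposal amounts to
something like the following userspace sketch. It assumes a sorted,
non-overlapping array of half-open ranges standing in for VMAs;
struct toy_vma, struct toy_range and clamp_to_vmas() are hypothetical
names, and this is not the code from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a VMA: the half-open range [start, end). */
struct toy_vma {
    unsigned long start;
    unsigned long end;
};

/* Result range; start == end means there is nothing to walk. */
struct toy_range {
    unsigned long start;
    unsigned long end;
};

/*
 * Clamp [start, end) to the span between the first and the last VMA
 * that intersects it. vmas[] must be sorted and non-overlapping.
 */
static struct toy_range clamp_to_vmas(const struct toy_vma *vmas, size_t n,
                                      unsigned long start, unsigned long end)
{
    struct toy_range r = { start, start };
    size_t first = 0, last;

    assert(start <= end);

    /* Fast-forward over the hole before the first intersecting VMA. */
    while (first < n && vmas[first].end <= start)
        first++;
    if (first == n || vmas[first].start >= end)
        return r;   /* the whole range lies in a hole */

    /* Find the last VMA that begins before end. */
    last = first;
    while (last + 1 < n && vmas[last + 1].start < end)
        last++;

    r.start = vmas[first].start > start ? vmas[first].start : start;
    r.end = vmas[last].end < end ? vmas[last].end : end;
    return r;
}
```

Only the clamped range would then be handed to the pgd/pud/pmd/pte
loops. Holes between the first and last VMA are still walked.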

-- 
http://selenic.com : development and support for Mercurial and Linux



