From: Henning Schild <henning.schild@siemens.com>
To: Philippe Gerum <rpm@xenomai.org>
Cc: Jan Kiszka <jan.kiszka@siemens.com>, Xenomai <xenomai@xenomai.org>
Subject: Re: [Xenomai] [PATCH v2] ipipe x86 mm: handle huge pages in memory pinning
Date: Tue, 2 Feb 2016 12:41:41 +0100
Message-ID: <20160202124141.4201d657@md1em3qc>
In-Reply-To: <56AB9D2B.1030905@xenomai.org>

On Fri, 29 Jan 2016 18:11:07 +0100
Philippe Gerum <rpm@xenomai.org> wrote:

> On 01/28/2016 09:53 PM, Henning Schild wrote:
> > On Thu, 28 Jan 2016 11:53:08 +0100
> > Philippe Gerum <rpm@xenomai.org> wrote:
> >   
> >> On 01/27/2016 02:41 PM, Henning Schild wrote:  
> >>> In 4.1 huge page mapping of io memory was introduced, enable ipipe
> >>> to handle that when pinning kernel memory.
> >>>
> >>> change that introduced the feature
> >>> 0f616be120c632c818faaea9adcb8f05a7a8601f
> >>>
> >>> Signed-off-by: Henning Schild <henning.schild@siemens.com>
> >>> ---
> >>>  arch/x86/mm/fault.c | 8 ++++++++
> >>>  1 file changed, 8 insertions(+)
> >>>
> >>> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
> >>> index fd5bbcc..ca1e75b 100644
> >>> --- a/arch/x86/mm/fault.c
> >>> +++ b/arch/x86/mm/fault.c
> >>> @@ -211,11 +211,15 @@ static inline pmd_t *vmalloc_sync_one(pgd_t *pgd, unsigned long address)
> >>>  	pud_k = pud_offset(pgd_k, address);
> >>>  	if (!pud_present(*pud_k))
> >>>  		return NULL;
> >>> +	if (pud_large(*pud))
> >>> +		return pud_k;
> >>>  
> >>>  	pmd = pmd_offset(pud, address);
> >>>  	pmd_k = pmd_offset(pud_k, address);
> >>>  	if (!pmd_present(*pmd_k))
> >>>  		return NULL;
> >>> +	if (pmd_large(*pmd))
> >>> +		return pmd_k;
> >>>  
> >>>  	if (!pmd_present(*pmd))
> >>>  		set_pmd(pmd, *pmd_k);
> >>> @@ -400,6 +404,8 @@ static inline int vmalloc_sync_one(pgd_t *pgd, unsigned long address)
> >>>  
> >>>  	if (pud_none(*pud) || pud_page_vaddr(*pud) != pud_page_vaddr(*pud_ref))
> >>>  		BUG();
> >>> +	if (pud_large(*pud))
> >>> +		return 0;
> >>>  
> >>>  	pmd = pmd_offset(pud, address);
> >>>  	pmd_ref = pmd_offset(pud_ref, address);
> >>> @@ -408,6 +414,8 @@ static inline int vmalloc_sync_one(pgd_t *pgd, unsigned long address)
> >>>  
> >>>  	if (pmd_none(*pmd) || pmd_page(*pmd) != pmd_page(*pmd_ref))
> >>>  		BUG();
> >>> +	if (pmd_large(*pmd))
> >>> +		return 0;
> >>>  
> >>>  	pte_ref = pte_offset_kernel(pmd_ref, address);
> >>>  	if (!pte_present(*pte_ref))
> >>>     
> >>
> >> I'm confused. Assuming the purpose of that patch is to exclude huge
> >> I/O mappings from pte pinning, why does the changes to the x86_32
> >> version of the vmalloc_sync_one() helper actually prevent such
> >> pinning, while the x86_64 version does not?  
> > 
> > No the purpose is to include them just like they were before.
> > vanilla vmalloc_sync_one just must not be called on huge mappings
> > because it cant handle them. The patch is supposed to make the
> > function return successfully, stopping early when huge pages are
> > detected.
> > 
> > It changes the implementation of both x86_32 and x86_64.
> >   
> 
> Sorry, your answer confuses me even more. vmalloc_sync_one() _does_
> the pinning, by copying over the kernel mapping, early in the course
> of the routine for x86_64, late for x86_32.
> 
> Please explain why your changes prevent huge I/O mappings from being
> pinned into the current page directory in the x86_32 implementation,
> but still allow this to be done in the x86_64 version. The section of
> code you patched in the latter case is basically a series of sanity
> checks done after the pinning took place, not before.

There is no difference between 32 and 64 bit. After the patch the
memory gets pinned just like it was before. What you call "sanity
checks" are required whenever vmalloc_sync_one is called on a range
that contains huge pages: they make sure that the function does not dig
deeper and treat a huge page as a pagetable.
The initial problem was that the huge page itself was accessed as if it
were a pagetable. An offset into it was dereferenced, which caused a
#PF. The upstream kernel never seems to take this path for areas that
contain huge pages, but we do. That is why we have to introduce these
checks in the pagetable walker.
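
To make the failure mode concrete, here is a minimal user-space sketch
of a walker that stops at a large entry instead of descending into it.
This is a toy model and not kernel code: struct entry, F_PRESENT,
F_LARGE and toy_walk() are made-up names for illustration only, and the
two-level layout is an assumption that merely stands in for the real
pud/pmd/pte hierarchy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model, NOT kernel code: two table levels with 4 entries each. */
#define ENTRIES   4
#define F_PRESENT 0x1u
#define F_LARGE   0x2u	/* hypothetical flag, stands in for "huge page" */

struct entry {
	unsigned flags;
	struct entry *next;	/* lower-level table, only valid if !F_LARGE */
	uintptr_t frame;	/* base of the mapped memory for a leaf entry */
};

enum walk_result { WALK_NOT_PRESENT, WALK_LARGE_LEAF, WALK_SMALL_LEAF };

/*
 * Walk the toy table for the index pair (hi, lo). A large entry maps
 * memory directly, so the walk must stop there: following e->next would
 * treat the mapped memory as a table, which is the bug described above.
 */
static enum walk_result toy_walk(const struct entry *top, unsigned hi,
				 unsigned lo, uintptr_t *frame)
{
	const struct entry *e;

	assert(hi < ENTRIES && lo < ENTRIES);

	e = &top[hi];
	if (!(e->flags & F_PRESENT))
		return WALK_NOT_PRESENT;
	if (e->flags & F_LARGE) {
		/* Early exit: nothing below this level to look at. */
		*frame = e->frame;
		return WALK_LARGE_LEAF;
	}

	e = &e->next[lo];
	if (!(e->flags & F_PRESENT))
		return WALK_NOT_PRESENT;
	*frame = e->frame;
	return WALK_SMALL_LEAF;
}
```

Without the F_LARGE test, the second level lookup would index into
e->next of an entry that has no lower-level table at all, which is the
toy equivalent of dereferencing an offset into the huge page.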

> On a more general note, a better approach would be to filter out calls
> to vmalloc_sync_one() for huge pages directly from
> __ipipe_pin_mapping_globally().
> 


