From: Philippe Gerum <rpm@xenomai.org>
To: Henning Schild <henning.schild@siemens.com>,
	Gilles Chanteperdrix <gilles.chanteperdrix@xenomai.org>
Cc: Jan Kiszka <jan.kiszka@siemens.com>, Xenomai <xenomai@xenomai.org>
Subject: Re: [Xenomai] [PATCH v2] ipipe x86 mm: handle huge pages in memory pinning
Date: Tue, 2 Feb 2016 14:58:41 +0100
Message-ID: <56B0B611.4070505@xenomai.org>
In-Reply-To: <20160202130849.44ee20d5@md1em3qc>

On 02/02/2016 01:08 PM, Henning Schild wrote:
> On Fri, 29 Jan 2016 19:39:48 +0100
> Gilles Chanteperdrix <gilles.chanteperdrix@xenomai.org> wrote:
> 
>> On Fri, Jan 29, 2016 at 06:11:07PM +0100, Philippe Gerum wrote:
>>> On 01/28/2016 09:53 PM, Henning Schild wrote:  
>>>> On Thu, 28 Jan 2016 11:53:08 +0100
>>>> Philippe Gerum <rpm@xenomai.org> wrote:
>>>>   
>>>>> On 01/27/2016 02:41 PM, Henning Schild wrote:  
>>>>>> In 4.1 huge page mapping of io memory was introduced, enable
>>>>>> ipipe to handle that when pinning kernel memory.
>>>>>>
>>>>>> change that introduced the feature
>>>>>> 0f616be120c632c818faaea9adcb8f05a7a8601f
>>>>>>
>>>>>> Signed-off-by: Henning Schild <henning.schild@siemens.com>
>>>>>> ---
>>>>>>  arch/x86/mm/fault.c | 8 ++++++++
>>>>>>  1 file changed, 8 insertions(+)
>>>>>>
>>>>>> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
>>>>>> index fd5bbcc..ca1e75b 100644
>>>>>> --- a/arch/x86/mm/fault.c
>>>>>> +++ b/arch/x86/mm/fault.c
>>>>>> @@ -211,11 +211,15 @@ static inline pmd_t *vmalloc_sync_one(pgd_t *pgd, unsigned long address)
>>>>>>  	pud_k = pud_offset(pgd_k, address);
>>>>>>  	if (!pud_present(*pud_k))
>>>>>>  		return NULL;
>>>>>> +	if (pud_large(*pud))
>>>>>> +		return pud_k;
>>>>>>  
>>>>>>  	pmd = pmd_offset(pud, address);
>>>>>>  	pmd_k = pmd_offset(pud_k, address);
>>>>>>  	if (!pmd_present(*pmd_k))
>>>>>>  		return NULL;
>>>>>> +	if (pmd_large(*pmd))
>>>>>> +		return pmd_k;
>>>>>>  
>>>>>>  	if (!pmd_present(*pmd))
>>>>>>  		set_pmd(pmd, *pmd_k);
>>>>>> @@ -400,6 +404,8 @@ static inline int vmalloc_sync_one(pgd_t *pgd, unsigned long address)
>>>>>>  
>>>>>>  	if (pud_none(*pud) || pud_page_vaddr(*pud) != pud_page_vaddr(*pud_ref))
>>>>>>  		BUG();
>>>>>> +	if (pud_large(*pud))
>>>>>> +		return 0;
>>>>>>  
>>>>>>  	pmd = pmd_offset(pud, address);
>>>>>>  	pmd_ref = pmd_offset(pud_ref, address);
>>>>>> @@ -408,6 +414,8 @@ static inline int vmalloc_sync_one(pgd_t *pgd, unsigned long address)
>>>>>>  
>>>>>>  	if (pmd_none(*pmd) || pmd_page(*pmd) != pmd_page(*pmd_ref))
>>>>>>  		BUG();
>>>>>> +	if (pmd_large(*pmd))
>>>>>> +		return 0;
>>>>>>  
>>>>>>  	pte_ref = pte_offset_kernel(pmd_ref, address);
>>>>>>  	if (!pte_present(*pte_ref))
>>>>>>     
>>>>>
>>>>> I'm confused. Assuming the purpose of that patch is to exclude
>>>>> huge I/O mappings from pte pinning, why do the changes to the
>>>>> x86_32 version of the vmalloc_sync_one() helper actually prevent
>>>>> such pinning, while the x86_64 version does not?  
>>>>
>>>> No, the purpose is to include them just like they were before.
>>>> Vanilla vmalloc_sync_one() simply must not be called on huge mappings
>>>> because it can't handle them. The patch is supposed to make the
>>>> function return successfully, stopping early when huge pages are
>>>> detected.
>>>>
>>>> It changes the implementation of both x86_32 and x86_64.
>>>>   
>>>
>>> Sorry, your answer confuses me even more. vmalloc_sync_one() _does_
>>> the pinning, by copying over the kernel mapping, early in the
>>> course of the routine for x86_64, late for x86_32.
>>>
>>> Please explain why your changes prevent huge I/O mappings from being
>>> pinned into the current page directory in the x86_32
>>> implementation, but still allow this to be done in the x86_64
>>> version. The section of code you patched in the latter case is
>>> basically a series of sanity checks done after the pinning took
>>> place, not before.
>>>
>>> On a more general note, a better approach would be to filter out
>>> calls to vmalloc_sync_one() for huge pages directly from
>>> __ipipe_pin_mapping_globally().  
>>
>> Since the ioremap/vmalloc range is either based on huge pages or not,
>> globally (since the process mappings are copied from the kernel
>> mapping), we can probably even avoid the call to __ipipe_pin_mapping
>> for huge page mappings. After all, since the mainline kernel is able
>> to avoid calls to vmalloc_sync_one() for huge page mappings, the
>> I-pipe should be able to do the same.
> 
> As said, my patch preserves the old ipipe behaviour, making it able to
> deal with huge pages. Whether or not the pinning is required and on
> which regions, is a valid but totally different question. Down in these
> low-level functions you cannot tell why the pinning was requested. It
> has to be dealt with at a higher level.
> But you are right, the kernel never ends up in vmalloc_sync_one for
> these regions. Hence we can probably expect that an ioremapped region
> is always mapped anyway, since vmalloc_sync_one will ultimately also
> be used when handling a #PF.
> 
> I just traced the change in lib/ioremap.c to
> d41282baf7c1f2bb517d6df95571e7da865f5e37
> Not too helpful, at least to me.

So why did you trace it?
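
To make the caller-side filtering discussed above concrete, here is a
minimal user-space sketch in C. It is a toy model, not I-pipe code:
struct toy_pmd, toy_sync_one() and toy_pin_range() are hypothetical
names standing in for pmd_t, vmalloc_sync_one() and
__ipipe_pin_mapping_globally(), and it assumes that huge mappings can
simply be skipped, which is exactly the point still under discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a PMD-level entry; not the kernel's pmd_t. */
struct toy_pmd {
	bool present;	/* entry is populated */
	bool large;	/* entry maps a huge page directly, no PTE table below */
	int target;	/* opaque id of what the entry points to */
};

/*
 * Hypothetical model of vmalloc_sync_one(): pin one kernel mapping by
 * copying the reference entry into a process table when it is missing.
 * Returns 0 on success, -1 if the reference entry is not populated.
 */
static int toy_sync_one(struct toy_pmd *pmd, const struct toy_pmd *pmd_ref)
{
	if (!pmd_ref->present)
		return -1;
	if (!pmd->present)
		*pmd = *pmd_ref;	/* the copy is the actual pinning */
	/* Plays the role of the BUG() sanity checks in the real code. */
	assert(pmd->target == pmd_ref->target);
	return 0;
}

/*
 * Hypothetical model of the caller-side filter: walk a range of
 * entries and skip huge mappings before toy_sync_one() is ever called.
 * Returns the number of entries toy_sync_one() reported as pinned.
 */
static size_t toy_pin_range(struct toy_pmd *proc, const struct toy_pmd *ref,
			    size_t nr)
{
	size_t i, synced = 0;

	for (i = 0; i < nr; i++) {
		if (ref[i].present && ref[i].large)
			continue;	/* huge mapping: filtered out here */
		if (toy_sync_one(&proc[i], &ref[i]) == 0)
			synced++;
	}
	return synced;
}
```

With a reference table holding one regular entry, one huge entry and
one empty entry, only the regular entry gets copied into the process
table. The low-level helper never sees the huge mapping, so it needs no
huge-page special cases of its own.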

-- 
Philippe.

