All of lore.kernel.org
 help / color / mirror / Atom feed
From: Christophe LEROY <christophe.leroy@c-s.fr>
To: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>,
	akpm@linux-foundation.org, linux-mm@kvack.org,
	aneesh.kumar@linux.vnet.ibm.com,
	Nicholas Piggin <npiggin@gmail.com>,
	Michael Ellerman <mpe@ellerman.id.au>,
	linuxppc-dev@lists.ozlabs.org
Cc: LKML <linux-kernel@vger.kernel.org>
Subject: Re: How to handle PTE tables with non contiguous entries ?
Date: Tue, 18 Sep 2018 13:53:33 +0200	[thread overview]
Message-ID: <ca282523-5184-ae79-ecfc-5e6048562420@c-s.fr> (raw)
In-Reply-To: <87pnxbgh8b.fsf@linux.ibm.com>



Le 18/09/2018 à 13:47, Aneesh Kumar K.V a écrit :
> Christophe LEROY <christophe.leroy@c-s.fr> writes:
> 
>> Le 17/09/2018 à 11:03, Aneesh Kumar K.V a écrit :
>>> Christophe Leroy <christophe.leroy@c-s.fr> writes:
>>>
>>>> Hi,
>>>>
>>>> I'm having a hard time figuring out the best way to handle the following
>>>> situation:
>>>>
>>>> On the powerpc8xx, handling 16k size pages requires to have page tables
>>>> with 4 identical entries.
>>>
>>> I assume that hugetlb page size? If so isn't that similar to FSL hugetlb
>>> page table layout?
>>
>> No, it is not for 16k hugepage size with a standard page size of 4k.
>>
>> Here I'm trying to handle the case of CONFIG_PPC_16K_PAGES.
>> As of today, it is implemented by using the standard Linux page layout,
>> ie one PTE entry for each 16k page. This forbids the use the 8xx HW
>> assistance.
>>
>>>
>>>>
>>>> Initially I was thinking about handling this by simply modifying
>>>> pte_index() which changing pte_t type in order to have one entry every
>>>> 16 bytes, then replicate the PTE value at *ptep, *ptep+1,*ptep+2 and
>>>> *ptep+3 both in set_pte_at() and pte_update().
>>>>
>>>> However, this doesn't work because many many places in the mm core part
>>>> of the kernel use loops on ptep with single ptep++ increment.
>>>>
>>>> Therefore did it with the following hack:
>>>>
>>>>     /* PTE level */
>>>> +#if defined(CONFIG_PPC_8xx) && defined(CONFIG_PPC_16K_PAGES)
>>>> +typedef struct { pte_basic_t pte, pte1, pte2, pte3; } pte_t;
>>>> +#else
>>>>     typedef struct { pte_basic_t pte; } pte_t;
>>>> +#endif
>>>>
>>>> @@ -181,7 +192,13 @@ static inline unsigned long pte_update(pte_t *p,
>>>>            : "cc" );
>>>>     #else /* PTE_ATOMIC_UPDATES */
>>>>            unsigned long old = pte_val(*p);
>>>> -       *p = __pte((old & ~clr) | set);
>>>> +       unsigned long new = (old & ~clr) | set;
>>>> +
>>>> +#if defined(CONFIG_PPC_8xx) && defined(CONFIG_PPC_16K_PAGES)
>>>> +       p->pte = p->pte1 = p->pte2 = p->pte3 = new;
>>>> +#else
>>>> +       *p = __pte(new);
>>>> +#endif
>>>>     #endif /* !PTE_ATOMIC_UPDATES */
>>>>
>>>>     #ifdef CONFIG_44x
>>>>
>>>>
>>>> @@ -161,7 +161,11 @@ static inline void __set_pte_at(struct mm_struct
>>>> *mm, unsigned long addr,
>>>>            /* Anything else just stores the PTE normally. That covers all
>>>> 64-bit
>>>>             * cases, and 32-bit non-hash with 32-bit PTEs.
>>>>             */
>>>> +#if defined(CONFIG_PPC_8xx) && defined(CONFIG_PPC_16K_PAGES)
>>>> +       ptep->pte = ptep->pte1 = ptep->pte2 = ptep->pte3 = pte_val(pte);
>>>> +#else
>>>>            *ptep = pte;
>>>> +#endif
>>>>
>>>>
>>>>
>>>> But I'm not too happy with it as it means pte_t is not a single type
>>>> anymore so passing it from one function to the other is quite heavy.
>>>>
>>>>
>>>> Would someone have an idea of an elegent way to handle that ?
>>>>
>>>> Thanks
>>>> Christophe
>>>
>>> Why would pte_update bother about updating all the 4 entries?. Can you
>>> help me understand the issue?
>>
>> Because the 8xx HW assistance expects 4 identical entries for each 16k
>> page, so everytime a PTE is updated the 4 entries have to be updated.
>>
> 
> What you suggested in the original mail is what matches that best isn't it?
> That is a linux pte update involves updating 4 slot. Hence a linux pte
> consist of 4 unsigned long?
> 

Yes indeed.
It seems hopeless to avoid carrying the 4 longs from one function to the 
other allthough that's four times the same thing.

Christophe

WARNING: multiple messages have this Message-ID (diff)
From: Christophe LEROY <christophe.leroy@c-s.fr>
To: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>,
	akpm@linux-foundation.org, linux-mm@kvack.org,
	aneesh.kumar@linux.vnet.ibm.com,
	Nicholas Piggin <npiggin@gmail.com>,
	Michael Ellerman <mpe@ellerman.id.au>,
	linuxppc-dev@lists.ozlabs.org
Cc: LKML <linux-kernel@vger.kernel.org>
Subject: Re: How to handle PTE tables with non contiguous entries ?
Date: Tue, 18 Sep 2018 13:53:33 +0200	[thread overview]
Message-ID: <ca282523-5184-ae79-ecfc-5e6048562420@c-s.fr> (raw)
In-Reply-To: <87pnxbgh8b.fsf@linux.ibm.com>



Le 18/09/2018 A  13:47, Aneesh Kumar K.V a A(C)critA :
> Christophe LEROY <christophe.leroy@c-s.fr> writes:
> 
>> Le 17/09/2018 A  11:03, Aneesh Kumar K.V a A(C)critA :
>>> Christophe Leroy <christophe.leroy@c-s.fr> writes:
>>>
>>>> Hi,
>>>>
>>>> I'm having a hard time figuring out the best way to handle the following
>>>> situation:
>>>>
>>>> On the powerpc8xx, handling 16k size pages requires to have page tables
>>>> with 4 identical entries.
>>>
>>> I assume that hugetlb page size? If so isn't that similar to FSL hugetlb
>>> page table layout?
>>
>> No, it is not for 16k hugepage size with a standard page size of 4k.
>>
>> Here I'm trying to handle the case of CONFIG_PPC_16K_PAGES.
>> As of today, it is implemented by using the standard Linux page layout,
>> ie one PTE entry for each 16k page. This forbids the use the 8xx HW
>> assistance.
>>
>>>
>>>>
>>>> Initially I was thinking about handling this by simply modifying
>>>> pte_index() which changing pte_t type in order to have one entry every
>>>> 16 bytes, then replicate the PTE value at *ptep, *ptep+1,*ptep+2 and
>>>> *ptep+3 both in set_pte_at() and pte_update().
>>>>
>>>> However, this doesn't work because many many places in the mm core part
>>>> of the kernel use loops on ptep with single ptep++ increment.
>>>>
>>>> Therefore did it with the following hack:
>>>>
>>>>     /* PTE level */
>>>> +#if defined(CONFIG_PPC_8xx) && defined(CONFIG_PPC_16K_PAGES)
>>>> +typedef struct { pte_basic_t pte, pte1, pte2, pte3; } pte_t;
>>>> +#else
>>>>     typedef struct { pte_basic_t pte; } pte_t;
>>>> +#endif
>>>>
>>>> @@ -181,7 +192,13 @@ static inline unsigned long pte_update(pte_t *p,
>>>>            : "cc" );
>>>>     #else /* PTE_ATOMIC_UPDATES */
>>>>            unsigned long old = pte_val(*p);
>>>> -       *p = __pte((old & ~clr) | set);
>>>> +       unsigned long new = (old & ~clr) | set;
>>>> +
>>>> +#if defined(CONFIG_PPC_8xx) && defined(CONFIG_PPC_16K_PAGES)
>>>> +       p->pte = p->pte1 = p->pte2 = p->pte3 = new;
>>>> +#else
>>>> +       *p = __pte(new);
>>>> +#endif
>>>>     #endif /* !PTE_ATOMIC_UPDATES */
>>>>
>>>>     #ifdef CONFIG_44x
>>>>
>>>>
>>>> @@ -161,7 +161,11 @@ static inline void __set_pte_at(struct mm_struct
>>>> *mm, unsigned long addr,
>>>>            /* Anything else just stores the PTE normally. That covers all
>>>> 64-bit
>>>>             * cases, and 32-bit non-hash with 32-bit PTEs.
>>>>             */
>>>> +#if defined(CONFIG_PPC_8xx) && defined(CONFIG_PPC_16K_PAGES)
>>>> +       ptep->pte = ptep->pte1 = ptep->pte2 = ptep->pte3 = pte_val(pte);
>>>> +#else
>>>>            *ptep = pte;
>>>> +#endif
>>>>
>>>>
>>>>
>>>> But I'm not too happy with it as it means pte_t is not a single type
>>>> anymore so passing it from one function to the other is quite heavy.
>>>>
>>>>
>>>> Would someone have an idea of an elegent way to handle that ?
>>>>
>>>> Thanks
>>>> Christophe
>>>
>>> Why would pte_update bother about updating all the 4 entries?. Can you
>>> help me understand the issue?
>>
>> Because the 8xx HW assistance expects 4 identical entries for each 16k
>> page, so everytime a PTE is updated the 4 entries have to be updated.
>>
> 
> What you suggested in the original mail is what matches that best isn't it?
> That is a linux pte update involves updating 4 slot. Hence a linux pte
> consist of 4 unsigned long?
> 

Yes indeed.
It seems hopeless to avoid carrying the 4 longs from one function to the 
other allthough that's four times the same thing.

Christophe

  reply	other threads:[~2018-09-18 11:53 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-09-10 14:34 How to handle PTE tables with non contiguous entries ? Christophe Leroy
2018-09-10 20:05 ` Dan Malek
2018-09-10 20:05   ` Dan Malek
2018-09-11  5:28   ` Christophe LEROY
2018-09-11  5:28     ` Christophe LEROY
2018-09-10 21:06 ` Nicholas Piggin
2018-09-11  5:39   ` Christophe LEROY
2018-09-11  5:39     ` Christophe LEROY
2018-09-17  9:03 ` Aneesh Kumar K.V
2018-09-17  9:47   ` Christophe LEROY
2018-09-17  9:47     ` Christophe LEROY
2018-09-18 11:47     ` Aneesh Kumar K.V
2018-09-18 11:47       ` Aneesh Kumar K.V
2018-09-18 11:53       ` Christophe LEROY [this message]
2018-09-18 11:53         ` Christophe LEROY

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ca282523-5184-ae79-ecfc-5e6048562420@c-s.fr \
    --to=christophe.leroy@c-s.fr \
    --cc=akpm@linux-foundation.org \
    --cc=aneesh.kumar@linux.ibm.com \
    --cc=aneesh.kumar@linux.vnet.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=mpe@ellerman.id.au \
    --cc=npiggin@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.