* [LSF/MM ATTEND] Huge Page Futures
@ 2016-01-25  1:57 Mike Kravetz
  2016-01-25 11:01 ` Kirill A. Shutemov
  2016-01-28 15:05 ` Aneesh Kumar K.V
  0 siblings, 2 replies; 11+ messages in thread
From: Mike Kravetz @ 2016-01-25  1:57 UTC (permalink / raw)
  To: lsf-pc; +Cc: linux-mm, linux-fsdevel

In a search of the archives, it appears huge page support in one form or
another has been a discussion topic in almost every LSF/MM gathering. Based
on patches submitted this past year, huge pages are still an area of active
development.  And, it appears this level of activity will continue in the
coming year.

I propose a "Huge Page Futures" session to discuss large works in progress
as well as work people are considering for 2016.  Areas of discussion would
minimally include:

- Krill Shutemov's THP new refcounting code and the push for huge page
  support in the page cache.

- Matt Wilcox's huge page support in DAX enabled filesystems, but perhaps
  more interesting is the desire for supporting PUD pages.  This seems to
  raise the question of supporting transparent PUD pages elsewhere.

- Other suggestions?

My interest in attending also revolves around huge pages.  This past year
I have added functionality to hugetlbfs.  hugetlbfs is not dead, and is
very much in use by some DB implementations.  Proposed future work I will
be attempting includes:
- Adding userfaultfd support to hugetlbfs
- Adding shared page table (PMD) support to DAX much like that which exists
  for hugetlbfs

-- 
Mike Kravetz

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* Re: [LSF/MM ATTEND] Huge Page Futures
  2016-01-25  1:57 [LSF/MM ATTEND] Huge Page Futures Mike Kravetz
@ 2016-01-25 11:01 ` Kirill A. Shutemov
  2016-01-25 13:50   ` Mike Kravetz
  2016-01-28 15:05 ` Aneesh Kumar K.V
  1 sibling, 1 reply; 11+ messages in thread
From: Kirill A. Shutemov @ 2016-01-25 11:01 UTC (permalink / raw)
  To: Mike Kravetz; +Cc: lsf-pc, linux-mm, linux-fsdevel

On Sun, Jan 24, 2016 at 05:57:12PM -0800, Mike Kravetz wrote:
> In a search of the archives, it appears huge page support in one form or
> another has been a discussion topic in almost every LSF/MM gathering. Based
> on patches submitted this past year, huge pages is still an area of active
> development.  And, it appears this level of activity will  continue in the
> coming year.
> 
> I propose a "Huge Page Futures" session to discuss large works in progress
> as well as work people are considering for 2016.  Areas of discussion would
> minimally include:
> 
> - Krill Shutemov's THP new refcounting code and the push for huge page
>   support in the page cache.

s/Krill/Kirill/ :]

I am working on huge pages in tmpfs first and will look at huge pages for
real filesystems later.

> 
> - Matt Wilcox's huge page support in DAX enabled filesystems, but perhaps
>   more interesting is the desire for supporting PUD pages.  This seems to
>   beg the question of supporting transparent PUD pages elsewhere.
> 
> - Other suggestions?
> 
> My interest in attending also revolves around huge pages.  This past year
> I have added functionality to hugetlbfs.  hugetlbfs is not dead, and is
> very much in use by some DB implementations.  Proposed future work I will
> be attempting includes:
> - Adding userfaultfd support to hugetlbfs
> - Adding shared page table (PMD) support to DAX much like that which exists
>   for hugetlbfs

Shared page tables for hugetlbfs are a rather ugly hack.

Do you have any thoughts on how it's going to be implemented? It would be
nice to have some design overview or, better, a proof-of-concept patch before
the summit to be able to analyze the implications for the kernel.

-- 
 Kirill A. Shutemov


* Re: [LSF/MM ATTEND] Huge Page Futures
  2016-01-25 11:01 ` Kirill A. Shutemov
@ 2016-01-25 13:50   ` Mike Kravetz
  2016-01-27 17:49     ` Mike Kravetz
  0 siblings, 1 reply; 11+ messages in thread
From: Mike Kravetz @ 2016-01-25 13:50 UTC (permalink / raw)
  To: Kirill A. Shutemov; +Cc: lsf-pc, linux-mm, linux-fsdevel

On 01/25/2016 03:01 AM, Kirill A. Shutemov wrote:
> On Sun, Jan 24, 2016 at 05:57:12PM -0800, Mike Kravetz wrote:
>> In a search of the archives, it appears huge page support in one form or
>> another has been a discussion topic in almost every LSF/MM gathering. Based
>> on patches submitted this past year, huge pages is still an area of active
>> development.  And, it appears this level of activity will  continue in the
>> coming year.
>>
>> I propose a "Huge Page Futures" session to discuss large works in progress
>> as well as work people are considering for 2016.  Areas of discussion would
>> minimally include:
>>
>> - Krill Shutemov's THP new refcounting code and the push for huge page
>>   support in the page cache.
> 
> s/Krill/Kirill/ :]

Sorry!

> 
> I work on huge pages in tmpfs first and will look on huge pages for real
> filesystems later.
> 
>>
>> - Matt Wilcox's huge page support in DAX enabled filesystems, but perhaps
>>   more interesting is the desire for supporting PUD pages.  This seems to
>>   beg the question of supporting transparent PUD pages elsewhere.
>>
>> - Other suggestions?
>>
>> My interest in attending also revolves around huge pages.  This past year
>> I have added functionality to hugetlbfs.  hugetlbfs is not dead, and is
>> very much in use by some DB implementations.  Proposed future work I will
>> be attempting includes:
>> - Adding userfaultfd support to hugetlbfs
>> - Adding shared page table (PMD) support to DAX much like that which exists
>>   for hugetlbfs
> 
> Shared page tables for hugetlbfs is rather ugly hack.
> 
> Do you have any thoughts how it's going to be implemented? It would be
> nice to have some design overview or better proof-of-concept patch before
> the summit to be able analyze implications for the kernel.
> 

Good to know the hugetlbfs implementation is considered a hack.  I just
started looking at this, and was going to use hugetlbfs as a starting
point.  I'll reconsider that decision.

BTW, this request comes from the same DB people taking advantage of shared
page tables today.  This will be as important (if not more) with the larger
sizes of pmem.

-- 
Mike Kravetz


* Re: [LSF/MM ATTEND] Huge Page Futures
  2016-01-25 13:50   ` Mike Kravetz
@ 2016-01-27 17:49     ` Mike Kravetz
  2016-01-28  8:49       ` Hugh Dickins
  2016-01-28  9:21       ` [Lsf-pc] " Mel Gorman
  0 siblings, 2 replies; 11+ messages in thread
From: Mike Kravetz @ 2016-01-27 17:49 UTC (permalink / raw)
  To: Kirill A. Shutemov; +Cc: lsf-pc, linux-mm, linux-fsdevel

On 01/25/2016 05:50 AM, Mike Kravetz wrote:
> On 01/25/2016 03:01 AM, Kirill A. Shutemov wrote:
>> On Sun, Jan 24, 2016 at 05:57:12PM -0800, Mike Kravetz wrote:
>>> In a search of the archives, it appears huge page support in one form or
>>> another has been a discussion topic in almost every LSF/MM gathering. Based
>>> on patches submitted this past year, huge pages is still an area of active
>>> development.  And, it appears this level of activity will  continue in the
>>> coming year.
>>>
>>> I propose a "Huge Page Futures" session to discuss large works in progress
>>> as well as work people are considering for 2016.  Areas of discussion would
>>> minimally include:
>>>
>>> - Krill Shutemov's THP new refcounting code and the push for huge page
>>>   support in the page cache.
>>
>> s/Krill/Kirill/ :]
> 
> Sorry!
> 
>>
>> I work on huge pages in tmpfs first and will look on huge pages for real
>> filesystems later.
>>
>>>
>>> - Matt Wilcox's huge page support in DAX enabled filesystems, but perhaps
>>>   more interesting is the desire for supporting PUD pages.  This seems to
>>>   beg the question of supporting transparent PUD pages elsewhere.
>>>
>>> - Other suggestions?
>>>
>>> My interest in attending also revolves around huge pages.  This past year
>>> I have added functionality to hugetlbfs.  hugetlbfs is not dead, and is
>>> very much in use by some DB implementations.  Proposed future work I will
>>> be attempting includes:
>>> - Adding userfaultfd support to hugetlbfs
>>> - Adding shared page table (PMD) support to DAX much like that which exists
>>>   for hugetlbfs
>>
>> Shared page tables for hugetlbfs is rather ugly hack.
>>
>> Do you have any thoughts how it's going to be implemented? It would be
>> nice to have some design overview or better proof-of-concept patch before
>> the summit to be able analyze implications for the kernel.
>>
> 
> Good to know the hugetlbfs implementation is considered a hack.  I just
> started looking at this, and was going to use hugetlbfs as a starting
> point.  I'll reconsider that decision.

Kirill, can you (or others) explain your reasons for saying the hugetlbfs
implementation is an ugly hack?  I do not have enough history/experience
with this to say what is most offensive.  I would be happy to start by
cleaning up issues with the current implementation.

If we do shared page tables for DAX, it makes sense that it and hugetlbfs
should be similar (or common) if possible.

-- 
Mike Kravetz

> 
> BTW, this request comes from the same DB people taking advantage of shared
> page tables today.  This will be as important (if not more) with the larger
> sizes of pmem.
> 


* Re: [LSF/MM ATTEND] Huge Page Futures
  2016-01-27 17:49     ` Mike Kravetz
@ 2016-01-28  8:49       ` Hugh Dickins
  2016-01-28 19:06         ` Mike Kravetz
  2016-01-28  9:21       ` [Lsf-pc] " Mel Gorman
  1 sibling, 1 reply; 11+ messages in thread
From: Hugh Dickins @ 2016-01-28  8:49 UTC (permalink / raw)
  To: Mike Kravetz; +Cc: Kirill A. Shutemov, lsf-pc, linux-mm, linux-fsdevel

On Wed, 27 Jan 2016, Mike Kravetz wrote:
> On 01/25/2016 05:50 AM, Mike Kravetz wrote:
> > On 01/25/2016 03:01 AM, Kirill A. Shutemov wrote:
> >> On Sun, Jan 24, 2016 at 05:57:12PM -0800, Mike Kravetz wrote:
> >>> In a search of the archives, it appears huge page support in one form or
> >>> another has been a discussion topic in almost every LSF/MM gathering. Based
> >>> on patches submitted this past year, huge pages is still an area of active
> >>> development.  And, it appears this level of activity will  continue in the
> >>> coming year.
> >>>
> >>> I propose a "Huge Page Futures" session to discuss large works in progress
> >>> as well as work people are considering for 2016.  Areas of discussion would
> >>> minimally include:
> >>>
> >>> - Krill Shutemov's THP new refcounting code and the push for huge page
> >>>   support in the page cache.
> >>
> >> s/Krill/Kirill/ :]
> > 
> > Sorry!
> > 
> >>
> >> I work on huge pages in tmpfs first and will look on huge pages for real
> >> filesystems later.
> >>
> >>>
> >>> - Matt Wilcox's huge page support in DAX enabled filesystems, but perhaps
> >>>   more interesting is the desire for supporting PUD pages.  This seems to
> >>>   beg the question of supporting transparent PUD pages elsewhere.
> >>>
> >>> - Other suggestions?

I would like to attend LSF/MM 2016, and think I can contribute
to Mike's "Huge Page Futures" topic.  I remain very focussed on my
"huge tmpfs" THP pagecache implementation in tmpfs - which has proved
a success within Google over the last year - but I hope that, once I'm at
the conference, I can turn my attention to some of the other mm topics too.


> >>>
> >>> My interest in attending also revolves around huge pages.  This past year
> >>> I have added functionality to hugetlbfs.  hugetlbfs is not dead, and is
> >>> very much in use by some DB implementations.  Proposed future work I will
> >>> be attempting includes:
> >>> - Adding userfaultfd support to hugetlbfs
> >>> - Adding shared page table (PMD) support to DAX much like that which exists
> >>>   for hugetlbfs
> >>
> >> Shared page tables for hugetlbfs is rather ugly hack.
> >>
> >> Do you have any thoughts how it's going to be implemented? It would be
> >> nice to have some design overview or better proof-of-concept patch before
> >> the summit to be able analyze implications for the kernel.
> >>
> > 
> > Good to know the hugetlbfs implementation is considered a hack.  I just
> > started looking at this, and was going to use hugetlbfs as a starting
> > point.  I'll reconsider that decision.
> 
> Kirill, can you (or others) explain your reasons for saying the hugetlbfs
> implementation is an ugly hack?  I do not have enough history/experience
> with this to say what is most offensive.  I would be happy to start by
> cleaning up issues with the current implementation.

I disagree that the hugetlbfs shared pagetables are an ugly hack.
What they are is a dark backwater that very few people are aware of,
which we therefore can very easily break or be broken by.

I have regretted bringing them into mm for that reason, and have
thought that they're next in line for the axe, after those non-linear
vmas which Kirill dispatched without tears last year.  But if you're
intent on making more use of them, exposing them to the light of day
is a fair alternative to consider.

Hugh

> 
> If we do shared page tables for DAX, it makes sense that it and hugetlbfs
> should be similar (or common) if possible.
> 
> -- 
> Mike Kravetz
> 
> > 
> > BTW, this request comes from the same DB people taking advantage of shared
> > page tables today.  This will be as important (if not more) with the larger
> > sizes of pmem.


* Re: [Lsf-pc] [LSF/MM ATTEND] Huge Page Futures
  2016-01-27 17:49     ` Mike Kravetz
  2016-01-28  8:49       ` Hugh Dickins
@ 2016-01-28  9:21       ` Mel Gorman
  2016-01-28 18:24         ` Mike Kravetz
  1 sibling, 1 reply; 11+ messages in thread
From: Mel Gorman @ 2016-01-28  9:21 UTC (permalink / raw)
  To: Mike Kravetz; +Cc: Kirill A. Shutemov, linux-fsdevel, linux-mm, lsf-pc

On Wed, Jan 27, 2016 at 09:49:57AM -0800, Mike Kravetz wrote:
> On 01/25/2016 05:50 AM, Mike Kravetz wrote:
> >> Do you have any thoughts how it's going to be implemented? It would be
> >> nice to have some design overview or better proof-of-concept patch before
> >> the summit to be able analyze implications for the kernel.
> >>
> > 
> > Good to know the hugetlbfs implementation is considered a hack.  I just
> > started looking at this, and was going to use hugetlbfs as a starting
> > point.  I'll reconsider that decision.
> 
> Kirill, can you (or others) explain your reasons for saying the hugetlbfs
> implementation is an ugly hack?  I do not have enough history/experience
> with this to say what is most offensive.  I would be happy to start by
> cleaning up issues with the current implementation.
> 

Historically, it was considered a hack because it had special handling in
a number of paths in the VM. Of course THP also has similar handling now
so it's less of a concern but there are differences that cause base pages,
transparent hugepages and hugetlbfs pages to all be special cases. That
does not sit comfortably with everyone.

For a long time, it was considered ugly because a fault on private child
mappings was so unreliable and a fork could cause a parent to unexpectedly
fail a fault and die. These days it's different as only the child can die
so while it's less of a concern, hugetlbfs pages allow a child to be killed
if enough huge pages are not available.

It was also considered ugly because application-awareness was required in
so many cases. Granted, libhugetlbfs can hide some of that ugliness but
even that was considered hacky.

The fact that hugetlbfs pages cannot be swapped even without mlock is
another fact that makes them different to the rest of the VM. It has its
own reservation scheme that is different to everything else.

One thing that crippled it to some extent with the label was the fact that
fixing swap on it was effectively impossible because of the Power
architecture. For a long time, once huge pages had been installed on that
architecture, it was impossible to remap them at a different size. The
limitation has been relaxed to some extent but those around long enough
remember it.

So it is a bit of a hack that behaves differently to other page types.
It's fairly complex and while the semantics used to be a lot uglier than
it is now, the "ugly hack" label has stuck.

> If we do shared page tables for DAX, it makes sense that it and hugetlbfs
> should be similar (or common) if possible.
> 

It's been a long time since I looked at shared page tables so I can't
remember why but it was a difficult area. A few years were spent on it so
if shared page tables are being considered, I would make damn sure first
that they actually help on modern hardware before jumping into that hole.

-- 
Mel Gorman
SUSE Labs


* Re: [LSF/MM ATTEND] Huge Page Futures
  2016-01-25  1:57 [LSF/MM ATTEND] Huge Page Futures Mike Kravetz
  2016-01-25 11:01 ` Kirill A. Shutemov
@ 2016-01-28 15:05 ` Aneesh Kumar K.V
  2016-01-28 19:28   ` Mike Kravetz
  1 sibling, 1 reply; 11+ messages in thread
From: Aneesh Kumar K.V @ 2016-01-28 15:05 UTC (permalink / raw)
  To: Mike Kravetz, lsf-pc; +Cc: linux-mm, linux-fsdevel

Mike Kravetz <mike.kravetz@oracle.com> writes:

> In a search of the archives, it appears huge page support in one form or
> another has been a discussion topic in almost every LSF/MM gathering. Based
> on patches submitted this past year, huge pages is still an area of active
> development.  And, it appears this level of activity will  continue in the
> coming year.
>
> I propose a "Huge Page Futures" session to discuss large works in progress
> as well as work people are considering for 2016.  Areas of discussion would
> minimally include:
>
> - Krill Shutemov's THP new refcounting code and the push for huge page
>   support in the page cache.

I am also interested in this discussion. We had some nice challenges
w.r.t. the powerpc implementation of THP.

>
> - Matt Wilcox's huge page support in DAX enabled filesystems, but perhaps
>   more interesting is the desire for supporting PUD pages.  This seems to
>   beg the question of supporting transparent PUD pages elsewhere.
>

I am also looking at switching powerpc hugetlbfs to GENERAL_HUGETLB. To
support 16GB pages I would need huge pages at the PUD/PGD level. Can you
elaborate on why supporting huge PUD pages is a challenge?

-aneesh


* Re: [Lsf-pc] [LSF/MM ATTEND] Huge Page Futures
  2016-01-28  9:21       ` [Lsf-pc] " Mel Gorman
@ 2016-01-28 18:24         ` Mike Kravetz
  0 siblings, 0 replies; 11+ messages in thread
From: Mike Kravetz @ 2016-01-28 18:24 UTC (permalink / raw)
  To: Mel Gorman; +Cc: Kirill A. Shutemov, linux-fsdevel, linux-mm, lsf-pc

On 01/28/2016 01:21 AM, Mel Gorman wrote:
> On Wed, Jan 27, 2016 at 09:49:57AM -0800, Mike Kravetz wrote:
>> On 01/25/2016 05:50 AM, Mike Kravetz wrote:
>>>> Do you have any thoughts how it's going to be implemented? It would be
>>>> nice to have some design overview or better proof-of-concept patch before
>>>> the summit to be able analyze implications for the kernel.
>>>>
>>>
>>> Good to know the hugetlbfs implementation is considered a hack.  I just
>>> started looking at this, and was going to use hugetlbfs as a starting
>>> point.  I'll reconsider that decision.
>>
>> Kirill, can you (or others) explain your reasons for saying the hugetlbfs
>> implementation is an ugly hack?  I do not have enough history/experience
>> with this to say what is most offensive.  I would be happy to start by
>> cleaning up issues with the current implementation.
>>
> 
> Historically, it was considered a hack because it had special handling in
> a number of paths in the VM. Of course THP also has similar handling now
> so it's less of a concern but there are differences that cause base pages,
> transparent hugepages and hugetlbfs pages to all be special cases. That
> does not sit comfortably with everyone.
> 
> For a long time, it was considered ugly because a fault on private child
> mappings was so unreliable and a fork could cause a parent to unexpectedly
> fail a fault and die. These days it's different as only the child can die
> so while it's less of a concern, hugetlbfs pages allow a child to be killed
> if enough huge pages are not available.
> 
> It was also considered ugly because application-awareness was required in
> so many cases. Granted, libhugetlbfs can hide some of that ugliness but
> even that was considered hacky.
> 
> The fact that hugetlbfs pages cannot be swapped even without mlock is
> another fact that makes them different to the rest of the VM. It has its
> own reservation scheme that is different to everything else.
> 
> One thing that crippled it to some extent with the label was the fact that
> fixing swap on it was effectively impossible because of the Power
> architecture. For a long time, once huge pages had been installed on that
> architecture, it was impossible to remap them at a different size. The
> limitation has been relaxed to some extent but those around long enough
> remember it.
> 
> So it is a bit of a hack that behaves differently to other page types.
> It's fairly complex and while the semantics used to be a lot uglier than
> it is now, the "ugly hack" label has stuck.

Thanks Mel.  I understand most of the issues you mention above.  However,
some DB providers make extensive use of hugetlbfs for maximum performance.
My question had more to do with shared page tables (below) than hugetlbfs
in general.

> 
>> If we do shared page tables for DAX, it makes sense that it and hugetlbfs
>> should be similar (or common) if possible.
>>
> 
> It's been a long time since I looked at shared page tables so I can't
> remember why but it was a difficult area. A few years were spent on it so
> if shared page tables are being considered, I would make damn sure first
> that they actually help on modern hardware before jumping into that hole.

IIUC, the only sharing today is in hugetlbfs at the PMD level.  Sharing
at this level for 2M huge pages still provides significant memory savings
for some large DB implementations.  Think of systems with TBs of memory,
and using large shared mappings.  There can be 10,000 or more processes
sharing these mappings.  You can expect that pmem will provide more
opportunities for sharing of large mappings.  My intention was to explore
the possibility of providing this type of sharing (at a minimum) to huge
pages for DAX mappings.  I was only looking at this from the space saving
angle.
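
To put rough numbers on the space-saving argument, here is a
back-of-the-envelope sketch.  It assumes x86-64 paging (4 KiB page-table
pages holding 512 eight-byte entries, so one PMD page maps 1 GiB worth of
2 MiB huge pages).  The 1 TiB mapping size is an illustrative assumption,
the process count is the figure mentioned above, and the function name is
hypothetical.  It counts only PMD-level pages and ignores the upper levels.

```python
# Back-of-the-envelope estimate of PMD page-table memory for a large shared
# hugetlbfs mapping, with and without PMD sharing.
# Assumptions (x86-64): 4 KiB page-table pages, 8-byte entries, 2 MiB huge pages.

PAGE_TABLE_PAGE = 4096                      # bytes per page-table page
ENTRIES_PER_TABLE = PAGE_TABLE_PAGE // 8    # 512 entries per table
HUGE_PAGE = 2 * 1024**2                     # 2 MiB mapped per PMD entry
PMD_PAGE_COVERAGE = ENTRIES_PER_TABLE * HUGE_PAGE   # 1 GiB per PMD page


def pmd_bytes(mapping_bytes, processes, shared):
    """Bytes of PMD pages needed to map `mapping_bytes` in `processes` processes."""
    pmd_pages = -(-mapping_bytes // PMD_PAGE_COVERAGE)   # ceiling division
    copies = 1 if shared else processes                  # one copy if shared
    return pmd_pages * PAGE_TABLE_PAGE * copies


mapping = 1024**4          # hypothetical 1 TiB shared mapping
procs = 10_000             # process count mentioned above

private = pmd_bytes(mapping, procs, shared=False)
shared = pmd_bytes(mapping, procs, shared=True)
print(f"unshared PMD pages: {private / 1024**3:.1f} GiB")
print(f"shared PMD pages:   {shared / 1024**2:.1f} MiB")
```

Under these assumptions the unshared case needs roughly 39 GiB of PMD
pages across all processes, against about 4 MiB when the PMD pages are
shared.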

-- 
Mike Kravetz


* Re: [LSF/MM ATTEND] Huge Page Futures
  2016-01-28  8:49       ` Hugh Dickins
@ 2016-01-28 19:06         ` Mike Kravetz
  0 siblings, 0 replies; 11+ messages in thread
From: Mike Kravetz @ 2016-01-28 19:06 UTC (permalink / raw)
  To: Hugh Dickins; +Cc: Kirill A. Shutemov, lsf-pc, linux-mm, linux-fsdevel

On 01/28/2016 12:49 AM, Hugh Dickins wrote:
> On Wed, 27 Jan 2016, Mike Kravetz wrote:
>> On 01/25/2016 05:50 AM, Mike Kravetz wrote:
>>> On 01/25/2016 03:01 AM, Kirill A. Shutemov wrote:
>>>> On Sun, Jan 24, 2016 at 05:57:12PM -0800, Mike Kravetz wrote:
>>>>> - Adding shared page table (PMD) support to DAX much like that which exists
>>>>>   for hugetlbfs
>>>>
>>>> Shared page tables for hugetlbfs is rather ugly hack.
>>>>
>>>> Do you have any thoughts how it's going to be implemented? It would be
>>>> nice to have some design overview or better proof-of-concept patch before
>>>> the summit to be able analyze implications for the kernel.
>>>>
>>>
>>> Good to know the hugetlbfs implementation is considered a hack.  I just
>>> started looking at this, and was going to use hugetlbfs as a starting
>>> point.  I'll reconsider that decision.
>>
>> Kirill, can you (or others) explain your reasons for saying the hugetlbfs
>> implementation is an ugly hack?  I do not have enough history/experience
>> with this to say what is most offensive.  I would be happy to start by
>> cleaning up issues with the current implementation.
> 
> I disagree that the hugetlbfs shared pagetables are an ugly hack.
> What they are is a dark backwater that very few people are aware of,
> which we therefore can very easily break or be broken by.
> 
> I have regretted bringing them into mm for that reason, and have
> thought that they're next in line for the axe, after those non-linear
> vmas which Kirill dispatched without tears last year.  But if you're
> intent on making more use of them, exposing them to the light of day
> is a fair alternative to consider.

It is interesting to note that at least one DB vendor (my employer) is
very aware of hugetlbfs shared pagetables, and takes advantage of them
in their DB architecture.  Their primary concern is the memory savings
that sharing provides.  I agree with you that very few people know
about them.  I didn't know they existed until informed by the DB team
and I looked at the code.

-- 
Mike Kravetz


* Re: [LSF/MM ATTEND] Huge Page Futures
  2016-01-28 15:05 ` Aneesh Kumar K.V
@ 2016-01-28 19:28   ` Mike Kravetz
  2016-01-29 10:01     ` Kirill A. Shutemov
  0 siblings, 1 reply; 11+ messages in thread
From: Mike Kravetz @ 2016-01-28 19:28 UTC (permalink / raw)
  To: Aneesh Kumar K.V, lsf-pc; +Cc: linux-mm, linux-fsdevel

On 01/28/2016 07:05 AM, Aneesh Kumar K.V wrote:
> Mike Kravetz <mike.kravetz@oracle.com> writes:
> 
>> In a search of the archives, it appears huge page support in one form or
>> another has been a discussion topic in almost every LSF/MM gathering. Based
>> on patches submitted this past year, huge pages is still an area of active
>> development.  And, it appears this level of activity will  continue in the
>> coming year.
>>
>> I propose a "Huge Page Futures" session to discuss large works in progress
>> as well as work people are considering for 2016.  Areas of discussion would
>> minimally include:
>>
>> - Krill Shutemov's THP new refcounting code and the push for huge page
>>   support in the page cache.
> 
> I am also interested in this discussion. We had some nice challenge
> w.r.t to powerpc implementation of THP.
> 
>>
>> - Matt Wilcox's huge page support in DAX enabled filesystems, but perhaps
>>   more interesting is the desire for supporting PUD pages.  This seems to
>>   beg the question of supporting transparent PUD pages elsewhere.
>>
> 
> I am also looking at switching powerpc hugetlbfs to GENERAL_HUGETLB. To
> support 16GB pages I would need hugepage at PUD/PGD. Can you elaborate
> why supporting huge PUD page is a challenge ?

For hugetlbfs it should not be an issue.  However, page fault handling for
hugetlbfs is already a special case today.  Is that what you were asking?

Matt's work adds THP for PUD sized huge pages to DAX mappings.  The thought
that popped into my head is "Does it make sense to try and expand THP for
PUD sized pages elsewhere?".  Perhaps that is nonsense and a silly question
to ask.

-- 
Mike Kravetz

> 
> -aneesh
> 

* Re: [LSF/MM ATTEND] Huge Page Futures
  2016-01-28 19:28   ` Mike Kravetz
@ 2016-01-29 10:01     ` Kirill A. Shutemov
  0 siblings, 0 replies; 11+ messages in thread
From: Kirill A. Shutemov @ 2016-01-29 10:01 UTC (permalink / raw)
  To: Mike Kravetz; +Cc: Aneesh Kumar K.V, lsf-pc, linux-mm, linux-fsdevel

On Thu, Jan 28, 2016 at 11:28:33AM -0800, Mike Kravetz wrote:
> On 01/28/2016 07:05 AM, Aneesh Kumar K.V wrote:
> > Mike Kravetz <mike.kravetz@oracle.com> writes:
> > 
> >> In a search of the archives, it appears huge page support in one form or
> >> another has been a discussion topic in almost every LSF/MM gathering. Based
> >> on patches submitted this past year, huge pages is still an area of active
> >> development.  And, it appears this level of activity will  continue in the
> >> coming year.
> >>
> >> I propose a "Huge Page Futures" session to discuss large works in progress
> >> as well as work people are considering for 2016.  Areas of discussion would
> >> minimally include:
> >>
> >> - Krill Shutemov's THP new refcounting code and the push for huge page
> >>   support in the page cache.
> > 
> > I am also interested in this discussion. We had some nice challenge
> > w.r.t to powerpc implementation of THP.
> > 
> >>
> >> - Matt Wilcox's huge page support in DAX enabled filesystems, but perhaps
> >>   more interesting is the desire for supporting PUD pages.  This seems to
> >>   beg the question of supporting transparent PUD pages elsewhere.
> >>
> > 
> > I am also looking at switching powerpc hugetlbfs to GENERAL_HUGETLB. To
> > support 16GB pages I would need hugepage at PUD/PGD. Can you elaborate
> > why supporting huge PUD page is a challenge ?
> 
> For hugetlbfs it should not be an issue.  However, page fault handling for
> hugetlbfs is already a special case today.  Is that what you were asking?
> 
> Matt's work adds THP for PUD sized huge pages to DAX mappings.  The thought
> that popped into my head is "Does it make sense to try and expand THP for
> PUD sized pages elsewhere?".  Perhaps that is nonsense and a silly question
> to ask.

I don't think it makes much sense on x86-64. But if an architecture has a
more reasonable page size at the PUD level, who knows...
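
For reference, the arithmetic behind that remark, as a small sketch.  It
assumes the x86-64 layout of 4 KiB base pages and 9 address bits (512
entries) per page-table level; the helper name is made up for
illustration.  Other architectures use different base page sizes and
level widths, so their numbers differ.

```python
BASE_PAGE_SHIFT = 12   # 4 KiB base pages (x86-64 assumption)
BITS_PER_LEVEL = 9     # 512 entries per table (x86-64 assumption)


def leaf_size(level):
    """Bytes mapped by one leaf entry at `level` (0 = PTE, 1 = PMD, 2 = PUD)."""
    return 1 << (BASE_PAGE_SHIFT + level * BITS_PER_LEVEL)


for name, level in (("PTE", 0), ("PMD", 1), ("PUD", 2)):
    size = leaf_size(level)
    print(f"{name}: {size // 1024} KiB = {size // leaf_size(0)} base pages")
```

With these parameters a PUD-sized page is 1 GiB, that is 262,144
contiguous base pages, 512 times the size of a PMD-sized page.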

-- 
 Kirill A. Shutemov


Thread overview: 11+ messages
2016-01-25  1:57 [LSF/MM ATTEND] Huge Page Futures Mike Kravetz
2016-01-25 11:01 ` Kirill A. Shutemov
2016-01-25 13:50   ` Mike Kravetz
2016-01-27 17:49     ` Mike Kravetz
2016-01-28  8:49       ` Hugh Dickins
2016-01-28 19:06         ` Mike Kravetz
2016-01-28  9:21       ` [Lsf-pc] " Mel Gorman
2016-01-28 18:24         ` Mike Kravetz
2016-01-28 15:05 ` Aneesh Kumar K.V
2016-01-28 19:28   ` Mike Kravetz
2016-01-29 10:01     ` Kirill A. Shutemov
