Re: [PATCH v1 2/7] mm: Prepare for DAX huge pages

* Re: [PATCH v1 2/7] mm: Prepare for DAX huge pages
       [not found]     ` <20141008155758.GK5098@wil.cx>
@ 2014-10-08 19:43       ` Kirill A. Shutemov
  2014-10-09 20:40         ` Matthew Wilcox
  0 siblings, 1 reply; 6+ messages in thread
From: Kirill A. Shutemov @ 2014-10-08 19:43 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Matthew Wilcox, linux-fsdevel, linux-kernel, linux-mm

On Wed, Oct 08, 2014 at 11:57:58AM -0400, Matthew Wilcox wrote:
> On Wed, Oct 08, 2014 at 06:21:24PM +0300, Kirill A. Shutemov wrote:
> > On Wed, Oct 08, 2014 at 09:25:24AM -0400, Matthew Wilcox wrote:
> > > From: Matthew Wilcox <willy@linux.intel.com>
> > > 
> > > DAX wants to use the 'special' bit to mark PMD entries that are not backed
> > > by struct page, just as for PTEs. 
> > 
> > Hm. I don't see where you use PMD without special set.
> 
> Right ... I don't currently insert PMDs that point to huge pages of DRAM,
> only to huge pages of PMEM.

Looks like you don't need pmd_{mk,}special() then. It seems you have all
inforamtion you need -- vma -- to find out what's going on. Right?

PMD bits is not something we can assigning to a feature without a need.

> > > @@ -1104,9 +1103,20 @@ int do_huge_pmd_wp_page(struct mm_struct *mm, struct vm_area_struct *vma,
> > >  	if (unlikely(!pmd_same(*pmd, orig_pmd)))
> > >  		goto out_unlock;
> > >  
> > > -	page = pmd_page(orig_pmd);
> > > -	VM_BUG_ON_PAGE(!PageCompound(page) || !PageHead(page), page);
> > > -	if (page_mapcount(page) == 1) {
> > > +	if (pmd_special(orig_pmd)) {
> > > +		/* VM_MIXEDMAP !pfn_valid() case */
> > > +		if ((vma->vm_flags & (VM_WRITE|VM_SHARED)) !=
> > > +				     (VM_WRITE|VM_SHARED)) {
> > > +			pmdp_clear_flush(vma, haddr, pmd);
> > > +			ret = VM_FAULT_FALLBACK;
> > 
> > No private THP pages with THP? Why?
> > It should be trivial: we already have a code path for !page case for zero
> > page and it shouldn't be too hard to modify do_dax_pmd_fault() to support
> > COW.
> > 
> > I remeber I've mentioned that you don't think it's reasonable to allocate
> > 2M page on COW, but that's what we do for anon memory...
> 
> I agree that it shouldn't be too hard, but I have no evidence that it'll
> be a performance win to COW 2MB pages for MAP_PRIVATE.  I'd rather be
> cautious for now and we can explore COWing 2MB chunks in a future patch.

I would rather make it other way around: use the same apporoach as for
anon memory until data shows it's doesn't make any good. Then consider
switching COW for *both* anon and file THP to fallback path.
This way we will get consistent behaviour for both types of mappings.

-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 6+ messages in thread