All of lore.kernel.org
 help / color / mirror / Atom feed
From: Song Liu <songliubraving@fb.com>
To: Oleg Nesterov <oleg@redhat.com>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Linux MM <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	"Matthew Wilcox" <matthew.wilcox@oracle.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Kernel Team <Kernel-team@fb.com>,
	"William Kucharski" <william.kucharski@oracle.com>,
	"srikar@linux.vnet.ibm.com" <srikar@linux.vnet.ibm.com>
Subject: Re: [PATCH v12 5/6] khugepaged: enable collapse pmd for pte-mapped THP
Date: Mon, 12 Aug 2019 14:36:11 +0000	[thread overview]
Message-ID: <0BD4D364-51BE-4844-8270-9A664C3E6216@fb.com> (raw)
In-Reply-To: <20190812130659.GA31560@redhat.com>



> On Aug 12, 2019, at 6:06 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> 
> On 08/09, Song Liu wrote:
>> 
>> +void collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr)
>> +{
>> +	unsigned long haddr = addr & HPAGE_PMD_MASK;
>> +	struct vm_area_struct *vma = find_vma(mm, haddr);
>> +	struct page *hpage = NULL;
>> +	pmd_t *pmd, _pmd;
>> +	spinlock_t *ptl;
>> +	int count = 0;
>> +	int i;
>> +
>> +	if (!vma || !vma->vm_file ||
>> +	    vma->vm_start > haddr || vma->vm_end < haddr + HPAGE_PMD_SIZE)
>> +		return;
>> +
>> +	/*
>> +	 * This vm_flags may not have VM_HUGEPAGE if the page was not
>> +	 * collapsed by this mm. But we can still collapse if the page is
>> +	 * the valid THP. Add extra VM_HUGEPAGE so hugepage_vma_check()
>> +	 * will not fail the vma for missing VM_HUGEPAGE
>> +	 */
>> +	if (!hugepage_vma_check(vma, vma->vm_flags | VM_HUGEPAGE))
>> +		return;
>> +
>> +	pmd = mm_find_pmd(mm, haddr);
>> +	if (!pmd)
>> +		return;
>> +
>> +	/* step 1: check all mapped PTEs are to the right huge page */
>> +	for (i = 0, addr = haddr; i < HPAGE_PMD_NR; i++, addr += PAGE_SIZE) {
>> +		pte_t *pte = pte_offset_map(pmd, addr);
>> +		struct page *page;
>> +
>> +		if (pte_none(*pte) || !pte_present(*pte))
>> +			continue;
> 
> 		if (!pte_present(*pte))
> 			return;
> 
> you can't simply flush pmd if this page is swapped out.

hmm... how about
		if (pte_none(*pte))
			continue;

		if (!pte_present(*pte))
			return;

If the page hasn't faulted in for this mm, i.e. pte_none(), we
can flush the pmd. 

> 
>> +
>> +		page = vm_normal_page(vma, addr, *pte);
>> +
>> +		if (!page || !PageCompound(page))
>> +			return;
>> +
>> +		if (!hpage) {
>> +			hpage = compound_head(page);
>> +			/*
>> +			 * The mapping of the THP should not change.
>> +			 *
>> +			 * Note that uprobe may change the page table,
> 
> Not only uprobe can cow the page. Debugger can do. Or mmap(PROT_WRITE, MAP_PRIVATE).
> 
> uprobe() is "special" because it a) it works with a foreign mm and b)
> it can't stop the process which uses this mm. Otherwise it could simply
> update the page returned by get_user_pages_remote(FOLL_FORCE), just we
> would need to add FOLL_WRITE and if we do this we do not even need SPLIT,
> that is why, say, __access_remote_vm() works without SPLIT.

Will update the comment in next version. 

Thanks!
Song



  reply	other threads:[~2019-08-12 14:37 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-07 23:37 [PATCH v12 0/6] THP aware uprobe Song Liu
2019-08-07 23:37 ` [PATCH v12 1/6] mm: move memcmp_pages() and pages_identical() Song Liu
2019-08-07 23:37 ` [PATCH v12 2/6] uprobe: use original page when all uprobes are removed Song Liu
2019-08-07 23:37 ` [PATCH v12 3/6] mm, thp: introduce FOLL_SPLIT_PMD Song Liu
2019-08-08 16:37   ` Oleg Nesterov
2019-08-08 17:16     ` Song Liu
2019-08-09 16:35       ` Oleg Nesterov
2019-08-09 16:50         ` Song Liu
2019-08-07 23:37 ` [PATCH v12 4/6] uprobe: use FOLL_SPLIT_PMD instead of FOLL_SPLIT Song Liu
2019-08-07 23:37 ` [PATCH v12 5/6] khugepaged: enable collapse pmd for pte-mapped THP Song Liu
2019-08-08 16:33   ` Oleg Nesterov
2019-08-08 17:05     ` Song Liu
2019-08-09 15:24       ` Oleg Nesterov
2019-08-09 16:30         ` Song Liu
2019-08-09 18:01           ` Song Liu
2019-08-12 12:11             ` Kirill A. Shutemov
2019-08-12 13:22               ` Oleg Nesterov
2019-08-12 14:40                 ` Kirill A. Shutemov
2019-08-12 21:04                   ` Song Liu
2019-08-13 14:44                     ` Song Liu
2019-08-15 10:16                       ` Oleg Nesterov
2019-08-15 16:27                         ` Song Liu
2019-08-13 13:30                   ` Oleg Nesterov
2019-08-13 14:05                     ` Oleg Nesterov
2019-08-13 15:05                       ` Kirill A. Shutemov
2019-08-13 16:24                         ` Oleg Nesterov
2019-08-16 14:54                           ` Kirill A. Shutemov
2019-08-12 13:06             ` Oleg Nesterov
2019-08-12 14:36               ` Song Liu [this message]
2019-08-07 23:37 ` [PATCH v12 6/6] uprobe: collapse THP pmd after removing all uprobes Song Liu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0BD4D364-51BE-4844-8270-9A664C3E6216@fb.com \
    --to=songliubraving@fb.com \
    --cc=Kernel-team@fb.com \
    --cc=akpm@linux-foundation.org \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=matthew.wilcox@oracle.com \
    --cc=oleg@redhat.com \
    --cc=srikar@linux.vnet.ibm.com \
    --cc=william.kucharski@oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.