All of lore.kernel.org
 help / color / mirror / Atom feed
From: Miaohe Lin <linmiaohe@huawei.com>
To: Mike Kravetz <mike.kravetz@oracle.com>
Cc: <linux-mm@kvack.org>, <linux-kernel@vger.kernel.org>,
	Muchun Song <songmuchun@bytedance.com>,
	David Hildenbrand <david@redhat.com>,
	Sven Schnelle <svens@linux.ibm.com>,
	Michal Hocko <mhocko@suse.com>, Peter Xu <peterx@redhat.com>,
	Naoya Horiguchi <naoya.horiguchi@linux.dev>,
	"Aneesh Kumar K . V" <aneesh.kumar@linux.vnet.ibm.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	"Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>,
	Davidlohr Bueso <dave@stgolabs.net>,
	Prakash Sangappa <prakash.sangappa@oracle.com>,
	James Houghton <jthoughton@google.com>,
	Mina Almasry <almasrymina@google.com>,
	Pasha Tatashin <pasha.tatashin@soleen.com>,
	Axel Rasmussen <axelrasmussen@google.com>,
	Ray Fucillo <Ray.Fucillo@intersystems.com>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH 1/3] hugetlb: fix vma lock handling during split vma and range unmapping
Date: Tue, 18 Oct 2022 14:25:31 +0800	[thread overview]
Message-ID: <e62238e3-352f-2fef-d09e-1e2469c9fc95@huawei.com> (raw)
In-Reply-To: <Y04VxvTbJTlxWfwl@monkey>

On 2022/10/18 10:56, Mike Kravetz wrote:
> On 10/15/22 09:25, Miaohe Lin wrote:
>> Sorry for late respond. It's a really busy week. :)
>>
>> On 2022/10/5 9:17, Mike Kravetz wrote:
>>> The hugetlb vma lock hangs off the vm_private_data field and is specific
>>> to the vma.  When vm_area_dup() is called as part of vma splitting,  the
>>
>> Oh, I checked vm_area_dup() from callsite of copy_vma and dup_mmap but split_vma
>> is missed... And yes, vma splitting can occur but vma merging won't for hugetlb
>> vma. Thanks for catching this, Mike.
>>
>>> vma lock pointer is copied to the new vma.  This will result in issues
>>> such as double freeing of the structure.  Update the hugetlb open vm_ops
>>> to allocate a new vma lock for the new vma.
>>>
>>> The routine __unmap_hugepage_range_final unconditionally unset
>>> VM_MAYSHARE to prevent subsequent pmd sharing.  hugetlb_vma_lock_free
>>> attempted to anticipate this by checking both VM_MAYSHARE and VM_SHARED.
>>> However, if only VM_MAYSHARE was set we would miss the free.  With the
>>> introduction of the vma lock, a vma can not participate in pmd sharing
>>> if vm_private_data is NULL.  Instead of clearing VM_MAYSHARE in
>>> __unmap_hugepage_range_final, free the vma lock to prevent sharing.  Also,
>>> update the sharing code to make sure vma lock is indeed a condition for
>>> pmd sharing.  hugetlb_vma_lock_free can then key off VM_MAYSHARE and not
>>> miss any vmas.
>>>
>>> Fixes: "hugetlb: add vma based lock for pmd sharing"
>>> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
>>> ---
>>>  mm/hugetlb.c | 43 +++++++++++++++++++++++++++----------------
>>>  mm/memory.c  |  4 ----
>>>  2 files changed, 27 insertions(+), 20 deletions(-)
>>>
>>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>>> index 4443e87e814b..0129d371800c 100644
>>> --- a/mm/hugetlb.c
>>> +++ b/mm/hugetlb.c
>>> @@ -4612,7 +4612,14 @@ static void hugetlb_vm_op_open(struct vm_area_struct *vma)
>>>  		kref_get(&resv->refs);
>>>  	}
>>>  
>>> -	hugetlb_vma_lock_alloc(vma);
>>> +	/*
>>> +	 * vma_lock structure for sharable mappings is vma specific.
>>> +	 * Clear old pointer (if copied via vm_area_dup) and create new.
>>> +	 */
>>> +	if (vma->vm_flags & VM_MAYSHARE) {
>>> +		vma->vm_private_data = NULL;
>>> +		hugetlb_vma_lock_alloc(vma);
>>> +	}
>>
>> IMHO this would lead to memoryleak. Think about the below move_vma() flow:
>> move_vma
>>   copy_vma
>>     new_vma = vm_area_dup(vma);
>>     new_vma->vm_ops->open(new_vma); --> new_vma has its own vma lock.
>>   is_vm_hugetlb_page(vma)
>>     clear_vma_resv_huge_pages
>>       hugetlb_dup_vma_private --> vma->vm_private_data is set to NULL
>>       				  without put ref. So vma lock is *leaked*?
> 
> You are right, that could lead to a leak.
> 
> I have an idea about setting vma->vm_private_data to NULL for VM_MAYSHARE
> vmas in routines like hugetlb_dup_vma_private().  We can check
> hugetlb_vma_lock->vma and only set to NULL if,
> 
> 	vma->(hugetlb_vma_lock)vma->vm_private_data->vma != vma

Looks feasible. Thanks for your work, Mike.

Thanks,
Miaohe Lin

> 
> Got sidetracked chasing down another leak today.  Will send a patch
> implementing this idea soon.
> 
> Thanks for looking at this!
> 


  reply	other threads:[~2022-10-18  6:25 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-05  1:17 [PATCH 0/3] hugetlb: fixes for new vma lock series Mike Kravetz
2022-10-05  1:17 ` [PATCH 1/3] hugetlb: fix vma lock handling during split vma and range unmapping Mike Kravetz
2022-10-15  1:25   ` Miaohe Lin
2022-10-18  2:56     ` Mike Kravetz
2022-10-18  6:25       ` Miaohe Lin [this message]
2022-10-05  1:17 ` [PATCH 2/3] hugetlb: take hugetlb vma_lock when clearing vma_lock->vma pointer Mike Kravetz
2022-10-05  3:48   ` kernel test robot
2022-10-05  6:58   ` kernel test robot
2022-10-06  3:30   ` Mike Kravetz
2022-10-15  1:32     ` Miaohe Lin
2022-10-05  1:17 ` [PATCH 3/3] hugetlb: allocate vma lock for all sharable vmas Mike Kravetz
2022-10-15  1:40   ` Miaohe Lin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e62238e3-352f-2fef-d09e-1e2469c9fc95@huawei.com \
    --to=linmiaohe@huawei.com \
    --cc=Ray.Fucillo@intersystems.com \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=almasrymina@google.com \
    --cc=aneesh.kumar@linux.vnet.ibm.com \
    --cc=axelrasmussen@google.com \
    --cc=dave@stgolabs.net \
    --cc=david@redhat.com \
    --cc=jthoughton@google.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=mike.kravetz@oracle.com \
    --cc=naoya.horiguchi@linux.dev \
    --cc=pasha.tatashin@soleen.com \
    --cc=peterx@redhat.com \
    --cc=prakash.sangappa@oracle.com \
    --cc=songmuchun@bytedance.com \
    --cc=svens@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.