All of lore.kernel.org
 help / color / mirror / Atom feed
From: Joerg Roedel <joro@8bytes.org>
To: Toshi Kani <toshi.kani@hpe.com>
Cc: mhocko@suse.com, akpm@linux-foundation.org, tglx@linutronix.de,
	mingo@redhat.com, hpa@zytor.com, bp@suse.de,
	catalin.marinas@arm.com, guohanjun@huawei.com,
	will.deacon@arm.com, wxf.wang@hisilicon.com, willy@infradead.org,
	cpandya@codeaurora.org, linux-mm@kvack.org, x86@kernel.org,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, stable@vger.kernel.org
Subject: Re: [PATCH v2 2/2] x86/mm: implement free pmd/pte page interfaces
Date: Thu, 26 Apr 2018 16:19:26 +0200	[thread overview]
Message-ID: <20180426141926.GN15462@8bytes.org> (raw)
In-Reply-To: <20180314180155.19492-3-toshi.kani@hpe.com>

Hi Toshi, Andrew,

this patch(-set) is broken in several ways, please see below.

On Wed, Mar 14, 2018 at 12:01:55PM -0600, Toshi Kani wrote:
> Implement pud_free_pmd_page() and pmd_free_pte_page() on x86, which
> clear a given pud/pmd entry and free up lower level page table(s).
> Address range associated with the pud/pmd entry must have been purged
> by INVLPG.

An INVLPG before actually unmapping the page is useless, as other cores
or even speculative instruction execution can bring the TLB entry back
before the code actually unmaps the page.

>  int pud_free_pmd_page(pud_t *pud)
>  {
> -	return pud_none(*pud);
> +	pmd_t *pmd;
> +	int i;
> +
> +	if (pud_none(*pud))
> +		return 1;
> +
> +	pmd = (pmd_t *)pud_page_vaddr(*pud);
> +
> +	for (i = 0; i < PTRS_PER_PMD; i++)
> +		if (!pmd_free_pte_page(&pmd[i]))
> +			return 0;
> +
> +	pud_clear(pud);

TLB flush needed here, before the page is freed.

> +	free_page((unsigned long)pmd);
> +
> +	return 1;
>  }
>  
>  /**
> @@ -724,6 +739,15 @@ int pud_free_pmd_page(pud_t *pud)
>   */
>  int pmd_free_pte_page(pmd_t *pmd)
>  {
> -	return pmd_none(*pmd);
> +	pte_t *pte;
> +
> +	if (pmd_none(*pmd))
> +		return 1;
> +
> +	pte = (pte_t *)pmd_page_vaddr(*pmd);
> +	pmd_clear(pmd);

Same here, TLB flush needed.

Further this needs synchronization with other page-tables in the system
when the kernel PMDs are not shared between processes. In x86-32 with
PAE this causes a BUG_ON() being triggered at arch/x86/mm/fault.c:268
because the page-tables are not correctly synchronized.

> +	free_page((unsigned long)pte);
> +
> +	return 1;
>  }
>  #endif	/* CONFIG_HAVE_ARCH_HUGE_VMAP */

WARNING: multiple messages have this Message-ID (diff)
From: joro@8bytes.org (Joerg Roedel)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH v2 2/2] x86/mm: implement free pmd/pte page interfaces
Date: Thu, 26 Apr 2018 16:19:26 +0200	[thread overview]
Message-ID: <20180426141926.GN15462@8bytes.org> (raw)
In-Reply-To: <20180314180155.19492-3-toshi.kani@hpe.com>

Hi Toshi, Andrew,

this patch(-set) is broken in several ways, please see below.

On Wed, Mar 14, 2018 at 12:01:55PM -0600, Toshi Kani wrote:
> Implement pud_free_pmd_page() and pmd_free_pte_page() on x86, which
> clear a given pud/pmd entry and free up lower level page table(s).
> Address range associated with the pud/pmd entry must have been purged
> by INVLPG.

An INVLPG before actually unmapping the page is useless, as other cores
or even speculative instruction execution can bring the TLB entry back
before the code actually unmaps the page.

>  int pud_free_pmd_page(pud_t *pud)
>  {
> -	return pud_none(*pud);
> +	pmd_t *pmd;
> +	int i;
> +
> +	if (pud_none(*pud))
> +		return 1;
> +
> +	pmd = (pmd_t *)pud_page_vaddr(*pud);
> +
> +	for (i = 0; i < PTRS_PER_PMD; i++)
> +		if (!pmd_free_pte_page(&pmd[i]))
> +			return 0;
> +
> +	pud_clear(pud);

TLB flush needed here, before the page is freed.

> +	free_page((unsigned long)pmd);
> +
> +	return 1;
>  }
>  
>  /**
> @@ -724,6 +739,15 @@ int pud_free_pmd_page(pud_t *pud)
>   */
>  int pmd_free_pte_page(pmd_t *pmd)
>  {
> -	return pmd_none(*pmd);
> +	pte_t *pte;
> +
> +	if (pmd_none(*pmd))
> +		return 1;
> +
> +	pte = (pte_t *)pmd_page_vaddr(*pmd);
> +	pmd_clear(pmd);

Same here, TLB flush needed.

Further this needs synchronization with other page-tables in the system
when the kernel PMDs are not shared between processes. In x86-32 with
PAE this causes a BUG_ON() being triggered at arch/x86/mm/fault.c:268
because the page-tables are not correctly synchronized.

> +	free_page((unsigned long)pte);
> +
> +	return 1;
>  }
>  #endif	/* CONFIG_HAVE_ARCH_HUGE_VMAP */

  parent reply	other threads:[~2018-04-26 14:19 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-03-14 18:01 [PATCH v2 0/2] fix memory leak / panic in ioremap huge pages Toshi Kani
2018-03-14 18:01 ` Toshi Kani
2018-03-14 18:01 ` [PATCH v2 1/2] mm/vmalloc: Add interfaces to free unmapped page table Toshi Kani
2018-03-14 18:01   ` Toshi Kani
2018-03-14 22:38   ` Andrew Morton
2018-03-14 22:38     ` Andrew Morton
2018-03-15 14:27     ` Kani, Toshi
2018-03-15 14:27       ` Kani, Toshi
2018-03-14 18:01 ` [PATCH v2 2/2] x86/mm: implement free pmd/pte page interfaces Toshi Kani
2018-03-14 18:01   ` Toshi Kani
2018-03-15  7:39   ` Chintan Pandya
2018-03-15  7:39     ` Chintan Pandya
2018-03-15 14:51     ` Kani, Toshi
2018-03-15 14:51       ` Kani, Toshi
2018-04-26 14:19   ` Joerg Roedel [this message]
2018-04-26 14:19     ` Joerg Roedel
2018-04-26 16:21     ` Kani, Toshi
2018-04-26 16:21       ` Kani, Toshi
2018-04-26 17:23       ` joro
2018-04-26 17:23         ` joro at 8bytes.org
2018-04-26 17:49         ` Kani, Toshi
2018-04-26 17:49           ` Kani, Toshi
2018-04-26 20:07           ` joro
2018-04-26 20:07             ` joro at 8bytes.org
2018-04-26 22:30             ` Kani, Toshi
2018-04-26 22:30               ` Kani, Toshi
2018-04-27  7:37               ` joro
2018-04-27  7:37                 ` joro at 8bytes.org
2018-04-27 11:39                 ` Michal Hocko
2018-04-27 11:39                   ` Michal Hocko
2018-04-27 11:46                   ` joro
2018-04-27 11:46                     ` joro at 8bytes.org
2018-04-27 11:52                 ` Chintan Pandya
2018-04-27 11:52                   ` Chintan Pandya
2018-04-27 12:48                   ` joro
2018-04-27 12:48                     ` joro at 8bytes.org
2018-04-27 13:42                     ` Chintan Pandya
2018-04-27 13:42                       ` Chintan Pandya
2018-04-27 14:31                 ` Kani, Toshi
2018-04-27 14:31                   ` Kani, Toshi
2018-04-28  9:02                   ` joro
2018-04-28  9:02                     ` joro at 8bytes.org
2018-04-28 20:54                     ` Kani, Toshi
2018-04-28 20:54                       ` Kani, Toshi
2018-04-30  7:30                       ` Chintan Pandya
2018-04-30  7:30                         ` Chintan Pandya
2018-04-30 13:43                         ` Kani, Toshi
2018-04-30 13:43                           ` Kani, Toshi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180426141926.GN15462@8bytes.org \
    --to=joro@8bytes.org \
    --cc=akpm@linux-foundation.org \
    --cc=bp@suse.de \
    --cc=catalin.marinas@arm.com \
    --cc=cpandya@codeaurora.org \
    --cc=guohanjun@huawei.com \
    --cc=hpa@zytor.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=mingo@redhat.com \
    --cc=stable@vger.kernel.org \
    --cc=tglx@linutronix.de \
    --cc=toshi.kani@hpe.com \
    --cc=will.deacon@arm.com \
    --cc=willy@infradead.org \
    --cc=wxf.wang@hisilicon.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.