linux-kernel.vger.kernel.org archive mirror
* [PATCH] docs: x86: Remove obsolete information about x86_64 vmalloc() faulting
@ 2021-06-22  3:19 Peilin Ye
  2021-07-16  6:09 ` Peilin Ye
  0 siblings, 1 reply; 4+ messages in thread
From: Peilin Ye @ 2021-06-22  3:19 UTC (permalink / raw)
  To: x86, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Jonathan Corbet
  Cc: H. Peter Anvin, Joerg Roedel, Cong Wang, Zefang Han,
	Wei Lin Chang, linux-kernel, linux-doc, Peilin Ye

x86_64 vmalloc() mappings are no longer "lazily synchronized" among page
tables via page fault handling since commit 7f0a002b5a21 ("x86/mm: remove
vmalloc faulting").  Subsequently, commit 6eb82f994026 ("x86/mm:
Pre-allocate P4D/PUD pages for vmalloc area") rendered it unnecessary to
synchronize, whether lazily or not, x86_64 vmalloc() mappings at runtime,
since the corresponding P4D or PUD pages are now preallocated during
system initialization by preallocate_vmalloc_pages().  Drop the obsolete
"lazily synchronized" description to avoid confusion.

It is worth noting, however, that there is still a slight complication for
x86_32; see commit 4819e15f740e ("x86/mm/32: Bring back vmalloc faulting
on x86_32") for details.

Signed-off-by: Peilin Ye <yepeilin.cs@gmail.com>
---
Hi all,

I was trying to understand vmalloc() when I saw this "lazily synchronized"
statement, which confused me for a while.  Please correct me if my
understanding is wrong or out of date.

Thank you,
Peilin Ye

 Documentation/x86/x86_64/mm.rst | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/Documentation/x86/x86_64/mm.rst b/Documentation/x86/x86_64/mm.rst
index ede1875719fb..9798676bb0bf 100644
--- a/Documentation/x86/x86_64/mm.rst
+++ b/Documentation/x86/x86_64/mm.rst
@@ -140,10 +140,6 @@ The direct mapping covers all memory in the system up to the highest
 memory address (this means in some cases it can also include PCI memory
 holes).
 
-vmalloc space is lazily synchronized into the different PML4/PML5 pages of
-the processes using the page fault handler, with init_top_pgt as
-reference.
-
 We map EFI runtime services in the 'efi_pgd' PGD in a 64Gb large virtual
 memory window (this size is arbitrary, it can be raised later if needed).
 The mappings are not part of any other kernel PGD and are only available
-- 
2.25.1



* Re: [PATCH] docs: x86: Remove obsolete information about x86_64 vmalloc() faulting
  2021-06-22  3:19 [PATCH] docs: x86: Remove obsolete information about x86_64 vmalloc() faulting Peilin Ye
@ 2021-07-16  6:09 ` Peilin Ye
  2021-07-19 12:34   ` Joerg Roedel
  0 siblings, 1 reply; 4+ messages in thread
From: Peilin Ye @ 2021-07-16  6:09 UTC (permalink / raw)
  To: x86, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Jonathan Corbet
  Cc: H. Peter Anvin, Joerg Roedel, Cong Wang, Zefang Han,
	Wei Lin Chang, linux-kernel, linux-doc

Hi all,

> diff --git a/Documentation/x86/x86_64/mm.rst b/Documentation/x86/x86_64/mm.rst
> index ede1875719fb..9798676bb0bf 100644
> --- a/Documentation/x86/x86_64/mm.rst
> +++ b/Documentation/x86/x86_64/mm.rst
> @@ -140,10 +140,6 @@ The direct mapping covers all memory in the system up to the highest
>  memory address (this means in some cases it can also include PCI memory
>  holes).
>  
> -vmalloc space is lazily synchronized into the different PML4/PML5 pages of
> -the processes using the page fault handler, with init_top_pgt as
> -reference.

This information is out-of-date, and it took me quite some time of
ftrace'ing before I figured it out...  I think it would be beneficial to
update, or at least remove it.

As a proof that I understand what I am talking about, on my x86_64 box:

  1. I allocated a vmalloc() area containing linear address `addr`;
  2. I manually pagewalked `addr` in different page tables, including
     `init_mm.pgd`;
  3. The corresponding PGD entries for `addr` in the different page
     tables all immediately pointed at the same PUD table (my box uses
     4-level paging), at the same physical address;
  4. No "lazy synchronization" via page fault handling happened at all,
     since it is the same PUD table pre-allocated by
     preallocate_vmalloc_pages() during boot time.

Commit 6eb82f994026 ("x86/mm: Pre-allocate P4D/PUD pages for vmalloc
area") documented this clearly:

"""
Doing this at boot makes sure no synchronization of that area is
necessary at runtime.
"""

Should we remove this sentence, or update it?  Any ideas?

Sincerely,
Peilin Ye



* Re: [PATCH] docs: x86: Remove obsolete information about x86_64 vmalloc() faulting
  2021-07-16  6:09 ` Peilin Ye
@ 2021-07-19 12:34   ` Joerg Roedel
  2021-07-20  4:50     ` Peilin Ye
  0 siblings, 1 reply; 4+ messages in thread
From: Joerg Roedel @ 2021-07-19 12:34 UTC (permalink / raw)
  To: Peilin Ye
  Cc: x86, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Jonathan Corbet, H. Peter Anvin, Cong Wang, Zefang Han,
	Wei Lin Chang, linux-kernel, linux-doc

Hi,

On Fri, Jul 16, 2021 at 02:09:58AM -0400, Peilin Ye wrote:
> This information is out-of-date, and it took me quite some time of
> ftrace'ing before I figured it out...  I think it would be beneficial to
> update, or at least remove it.
> 
> As a proof that I understand what I am talking about, on my x86_64 box:
> 
>   1. I allocated a vmalloc() area containing linear address `addr`;
>   2. I manually pagewalked `addr` in different page tables, including
>      `init_mm.pgd`;
>   3. The corresponding PGD entries for `addr` in the different page
>      tables all immediately pointed at the same PUD table (my box uses
>      4-level paging), at the same physical address;
>   4. No "lazy synchronization" via page fault handling happened at all,
>      since it is the same PUD table pre-allocated by
>      preallocate_vmalloc_pages() during boot time.

Yes, this is the story for x86-64, because all PUD/P4D pages for the vmalloc
area are pre-allocated at boot. So no faulting or synchronization needs
to happen.

On x86-32 this is a bit different. Pre-allocation of PMD/PTE pages is
not an option there (even less when 4MB large-pages with 2-level paging
come into the picture).

So what happens there is that vmalloc related changes to the init_mm.pgd
are synchronized to all page-tables in the system. But this
synchronization is subject to race conditions in a way that another CPU
might vmalloc an area below a PMD which is not fully synchronized yet.

When this happens there is a fault, which is handled as a vmalloc()
fault on x86-32 just as before. So vmalloc faults still exist on 32-bit;
they are just less likely than they used to be.

Regards,

	Joerg


* Re: [PATCH] docs: x86: Remove obsolete information about x86_64 vmalloc() faulting
  2021-07-19 12:34   ` Joerg Roedel
@ 2021-07-20  4:50     ` Peilin Ye
  0 siblings, 0 replies; 4+ messages in thread
From: Peilin Ye @ 2021-07-20  4:50 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: x86, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Jonathan Corbet, H. Peter Anvin, Cong Wang, Zefang Han,
	Wei Lin Chang, linux-kernel, linux-doc

Hi Joerg,

On Mon, Jul 19, 2021 at 02:34:31PM +0200, Joerg Roedel wrote:
> On Fri, Jul 16, 2021 at 02:09:58AM -0400, Peilin Ye wrote:
> > This information is out-of-date, and it took me quite some time of
> > ftrace'ing before I figured it out...  I think it would be beneficial to
> > update, or at least remove it.
> > 
> > As a proof that I understand what I am talking about, on my x86_64 box:
> > 
> >   1. I allocated a vmalloc() area containing linear address `addr`;
> >   2. I manually pagewalked `addr` in different page tables, including
> >      `init_mm.pgd`;
> >   3. The corresponding PGD entries for `addr` in the different page
> >      tables all immediately pointed at the same PUD table (my box uses
> >      4-level paging), at the same physical address;
> >   4. No "lazy synchronization" via page fault handling happened at all,
> >      since it is the same PUD table pre-allocated by
> >      preallocate_vmalloc_pages() during boot time.
> 
> Yes, this is the story for x86-64, because all PUD/P4D pages for the vmalloc
> area are pre-allocated at boot. So no faulting or synchronization needs
> to happen.
> 
> On x86-32 this is a bit different. Pre-allocation of PMD/PTE pages is
> not an option there (even less when 4MB large-pages with 2-level paging
> come into the picture).
> 
> So what happens there is that vmalloc related changes to the init_mm.pgd
> are synchronized to all page-tables in the system. But this
> synchronization is subject to race conditions in a way that another CPU
> might vmalloc an area below a PMD which is not fully synchronized yet.
> 
> When this happens there is a fault, which is handled as a vmalloc()
> fault on x86-32 just as before. So vmalloc faults still exist on 32-bit;
> they are just less likely than they used to be.

Thanks a lot for the information!  I will improve my commit message and
send a v2 soon.

I think for this patch, removing that out-of-date statement is
sufficient, since mm.rst is x86-64-specific, but maybe we should
document this behavior for x86-32 somewhere as well...

Thank you,
Peilin Ye


