All of lore.kernel.org
 help / color / mirror / Atom feed
* The usage of page_mapping() in architecture code
@ 2018-02-11  6:43 Huang, Ying
  2018-02-11  6:47   ` Huang, Ying
  2018-02-11  8:17 ` Russell King - ARM Linux
  0 siblings, 2 replies; 6+ messages in thread
From: Huang, Ying @ 2018-02-11  6:43 UTC (permalink / raw)
  To: Andrew Morton, Minchan Kim, Michal Hocko, Johannes Weiner,
	Mel Gorman, Dave Hansen, Chen Liqin, Russell King,
	Yoshinori Sato, James E.J. Bottomley, Guan Xuetao,
	David S. Miller, Chris Zankel, Vineet Gupta, Ley Foon Tan,
	Ralf Baechle, Andi Kleen
  Cc: linux-mm

Hi, All,

To optimize the scalability of swap cache, it is made more dynamic
than before.  That is, after being swapped off, the address space of
the swap device will be freed too.  So the usage of page_mapping()
need to be audited to make sure the address space of the swap device
will not be used after it is freed.  For most cases it is OK, because
to call page_mapping(), the page, page table, or LRU list will be
locked.  But I found at least one usage isn't safe.  When
page_mapping() is called in architecture specific code to flush dcache
or sync between dcache and icache.

The typical usage models are,


1) Check whether page_mapping() is NULL, which is safe

2) Call mapping_mapped() to check whether the backing file is mapped
   to user space.

3) Iterate all vmas via the interval tree (mapping->i_mmap) to flush dcache


2) and 3) isn't safe, because no lock to prevent swap device from
swapping off is held.  But I found the code is for file address space
only, not for swap cache.  For example, for flush_dcache_page() in
arch/parisc/kernel/cache.c,


void flush_dcache_page(struct page *page)
{
	struct address_space *mapping = page_mapping(page);
	struct vm_area_struct *mpnt;
	unsigned long offset;
	unsigned long addr, old_addr = 0;
	pgoff_t pgoff;

	if (mapping && !mapping_mapped(mapping)) {
		set_bit(PG_dcache_dirty, &page->flags);
		return;
	}

	flush_kernel_dcache_page(page);

	if (!mapping)
		return;

	pgoff = page->index;

	/* We have carefully arranged in arch_get_unmapped_area() that
	 * *any* mappings of a file are always congruently mapped (whether
	 * declared as MAP_PRIVATE or MAP_SHARED), so we only need
	 * to flush one address here for them all to become coherent */

	flush_dcache_mmap_lock(mapping);
	vma_interval_tree_foreach(mpnt, &mapping->i_mmap, pgoff, pgoff) {
		offset = (pgoff - mpnt->vm_pgoff) << PAGE_SHIFT;
		addr = mpnt->vm_start + offset;

		/* The TLB is the engine of coherence on parisc: The
		 * CPU is entitled to speculate any page with a TLB
		 * mapping, so here we kill the mapping then flush the
		 * page along a special flush only alias mapping.
		 * This guarantees that the page is no-longer in the
		 * cache for any process and nor may it be
		 * speculatively read in (until the user or kernel
		 * specifically accesses it, of course) */

		flush_tlb_page(mpnt, addr);
		if (old_addr == 0 || (old_addr & (SHM_COLOUR - 1))
				      != (addr & (SHM_COLOUR - 1))) {
			__flush_cache_page(mpnt, addr, page_to_phys(page));
			if (old_addr)
				printk(KERN_ERR "INEQUIVALENT ALIASES 0x%lx and 0x%lx in file %pD\n", old_addr, addr, mpnt->vm_file);
			old_addr = addr;
		}
	}
	flush_dcache_mmap_unlock(mapping);
}


if page is an anonymous page in swap cache, "mapping &&
!mapping_mapped()" will be true, so we will delay flushing.  But if my
understanding of the code were correct, we should call
flush_kernel_dcache() because the kernel may access the page during
swapping in/out.

The code in other architectures follow the similar logic.  Would it be
better for page_mapping() here to return NULL for anonymous pages even
if they are in swap cache?  Of course we need to change the function
name.  page_file_mapping() appears a good name, but that has been used
already.  Any suggestion?

Is my understanding correct?  Could you help me on this?

Best Regards,
Huang, Ying

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* The usage of page_mapping() in architecture code
  2018-02-11  6:43 The usage of page_mapping() in architecture code Huang, Ying
@ 2018-02-11  6:47   ` Huang, Ying
  2018-02-11  8:17 ` Russell King - ARM Linux
  1 sibling, 0 replies; 6+ messages in thread
From: Huang, Ying @ 2018-02-11  6:47 UTC (permalink / raw)
  To: Andrew Morton, Minchan Kim, Michal Hocko, Johannes Weiner,
	Mel Gorman, Dave Hansen, Chen Liqin, Russell King,
	Yoshinori Sato, James E.J. Bottomley, Guan Xuetao,
	David S. Miller, Chris Zankel, Vineet Gupta, Ley Foon Tan,
	Ralf Baechle, Andi Kleen
  Cc: linux-mm, linux-kernel

Sorry for bothering, forget to Cc LKML in the original email.

Hi, All,

To optimize the scalability of swap cache, it is made more dynamic
than before.  That is, after being swapped off, the address space of
the swap device will be freed too.  So the usage of page_mapping()
need to be audited to make sure the address space of the swap device
will not be used after it is freed.  For most cases it is OK, because
to call page_mapping(), the page, page table, or LRU list will be
locked.  But I found at least one usage isn't safe.  When
page_mapping() is called in architecture specific code to flush dcache
or sync between dcache and icache.

The typical usage models are,


1) Check whether page_mapping() is NULL, which is safe

2) Call mapping_mapped() to check whether the backing file is mapped
   to user space.

3) Iterate all vmas via the interval tree (mapping->i_mmap) to flush dcache


2) and 3) isn't safe, because no lock to prevent swap device from
swapping off is held.  But I found the code is for file address space
only, not for swap cache.  For example, for flush_dcache_page() in
arch/parisc/kernel/cache.c,


void flush_dcache_page(struct page *page)
{
	struct address_space *mapping = page_mapping(page);
	struct vm_area_struct *mpnt;
	unsigned long offset;
	unsigned long addr, old_addr = 0;
	pgoff_t pgoff;

	if (mapping && !mapping_mapped(mapping)) {
		set_bit(PG_dcache_dirty, &page->flags);
		return;
	}

	flush_kernel_dcache_page(page);

	if (!mapping)
		return;

	pgoff = page->index;

	/* We have carefully arranged in arch_get_unmapped_area() that
	 * *any* mappings of a file are always congruently mapped (whether
	 * declared as MAP_PRIVATE or MAP_SHARED), so we only need
	 * to flush one address here for them all to become coherent */

	flush_dcache_mmap_lock(mapping);
	vma_interval_tree_foreach(mpnt, &mapping->i_mmap, pgoff, pgoff) {
		offset = (pgoff - mpnt->vm_pgoff) << PAGE_SHIFT;
		addr = mpnt->vm_start + offset;

		/* The TLB is the engine of coherence on parisc: The
		 * CPU is entitled to speculate any page with a TLB
		 * mapping, so here we kill the mapping then flush the
		 * page along a special flush only alias mapping.
		 * This guarantees that the page is no-longer in the
		 * cache for any process and nor may it be
		 * speculatively read in (until the user or kernel
		 * specifically accesses it, of course) */

		flush_tlb_page(mpnt, addr);
		if (old_addr == 0 || (old_addr & (SHM_COLOUR - 1))
				      != (addr & (SHM_COLOUR - 1))) {
			__flush_cache_page(mpnt, addr, page_to_phys(page));
			if (old_addr)
				printk(KERN_ERR "INEQUIVALENT ALIASES 0x%lx and 0x%lx in file %pD\n", old_addr, addr, mpnt->vm_file);
			old_addr = addr;
		}
	}
	flush_dcache_mmap_unlock(mapping);
}


if page is an anonymous page in swap cache, "mapping &&
!mapping_mapped()" will be true, so we will delay flushing.  But if my
understanding of the code were correct, we should call
flush_kernel_dcache() because the kernel may access the page during
swapping in/out.

The code in other architectures follow the similar logic.  Would it be
better for page_mapping() here to return NULL for anonymous pages even
if they are in swap cache?  Of course we need to change the function
name.  page_file_mapping() appears a good name, but that has been used
already.  Any suggestion?

Is my understanding correct?  Could you help me on this?

Best Regards,
Huang, Ying

^ permalink raw reply	[flat|nested] 6+ messages in thread

* The usage of page_mapping() in architecture code
@ 2018-02-11  6:47   ` Huang, Ying
  0 siblings, 0 replies; 6+ messages in thread
From: Huang, Ying @ 2018-02-11  6:47 UTC (permalink / raw)
  To: Andrew Morton, Minchan Kim, Michal Hocko, Johannes Weiner,
	Mel Gorman, Dave Hansen, Chen Liqin, Russell King,
	Yoshinori Sato, James E.J. Bottomley, Guan Xuetao,
	David S. Miller, Chris Zankel, Vineet Gupta, Ley Foon Tan,
	Ralf Baechle, Andi Kleen
  Cc: linux-mm, linux-kernel

Sorry for bothering, forget to Cc LKML in the original email.

Hi, All,

To optimize the scalability of swap cache, it is made more dynamic
than before.  That is, after being swapped off, the address space of
the swap device will be freed too.  So the usage of page_mapping()
need to be audited to make sure the address space of the swap device
will not be used after it is freed.  For most cases it is OK, because
to call page_mapping(), the page, page table, or LRU list will be
locked.  But I found at least one usage isn't safe.  When
page_mapping() is called in architecture specific code to flush dcache
or sync between dcache and icache.

The typical usage models are,


1) Check whether page_mapping() is NULL, which is safe

2) Call mapping_mapped() to check whether the backing file is mapped
   to user space.

3) Iterate all vmas via the interval tree (mapping->i_mmap) to flush dcache


2) and 3) isn't safe, because no lock to prevent swap device from
swapping off is held.  But I found the code is for file address space
only, not for swap cache.  For example, for flush_dcache_page() in
arch/parisc/kernel/cache.c,


void flush_dcache_page(struct page *page)
{
	struct address_space *mapping = page_mapping(page);
	struct vm_area_struct *mpnt;
	unsigned long offset;
	unsigned long addr, old_addr = 0;
	pgoff_t pgoff;

	if (mapping && !mapping_mapped(mapping)) {
		set_bit(PG_dcache_dirty, &page->flags);
		return;
	}

	flush_kernel_dcache_page(page);

	if (!mapping)
		return;

	pgoff = page->index;

	/* We have carefully arranged in arch_get_unmapped_area() that
	 * *any* mappings of a file are always congruently mapped (whether
	 * declared as MAP_PRIVATE or MAP_SHARED), so we only need
	 * to flush one address here for them all to become coherent */

	flush_dcache_mmap_lock(mapping);
	vma_interval_tree_foreach(mpnt, &mapping->i_mmap, pgoff, pgoff) {
		offset = (pgoff - mpnt->vm_pgoff) << PAGE_SHIFT;
		addr = mpnt->vm_start + offset;

		/* The TLB is the engine of coherence on parisc: The
		 * CPU is entitled to speculate any page with a TLB
		 * mapping, so here we kill the mapping then flush the
		 * page along a special flush only alias mapping.
		 * This guarantees that the page is no-longer in the
		 * cache for any process and nor may it be
		 * speculatively read in (until the user or kernel
		 * specifically accesses it, of course) */

		flush_tlb_page(mpnt, addr);
		if (old_addr == 0 || (old_addr & (SHM_COLOUR - 1))
				      != (addr & (SHM_COLOUR - 1))) {
			__flush_cache_page(mpnt, addr, page_to_phys(page));
			if (old_addr)
				printk(KERN_ERR "INEQUIVALENT ALIASES 0x%lx and 0x%lx in file %pD\n", old_addr, addr, mpnt->vm_file);
			old_addr = addr;
		}
	}
	flush_dcache_mmap_unlock(mapping);
}


if page is an anonymous page in swap cache, "mapping &&
!mapping_mapped()" will be true, so we will delay flushing.  But if my
understanding of the code were correct, we should call
flush_kernel_dcache() because the kernel may access the page during
swapping in/out.

The code in other architectures follow the similar logic.  Would it be
better for page_mapping() here to return NULL for anonymous pages even
if they are in swap cache?  Of course we need to change the function
name.  page_file_mapping() appears a good name, but that has been used
already.  Any suggestion?

Is my understanding correct?  Could you help me on this?

Best Regards,
Huang, Ying

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: The usage of page_mapping() in architecture code
  2018-02-11  6:43 The usage of page_mapping() in architecture code Huang, Ying
  2018-02-11  6:47   ` Huang, Ying
@ 2018-02-11  8:17 ` Russell King - ARM Linux
  2018-02-11  8:39   ` Huang, Ying
  1 sibling, 1 reply; 6+ messages in thread
From: Russell King - ARM Linux @ 2018-02-11  8:17 UTC (permalink / raw)
  To: Huang, Ying
  Cc: Andrew Morton, Minchan Kim, Michal Hocko, Johannes Weiner,
	Mel Gorman, Dave Hansen, Chen Liqin, Yoshinori Sato,
	James E.J. Bottomley, Guan Xuetao, David S. Miller, Chris Zankel,
	Vineet Gupta, Ley Foon Tan, Ralf Baechle, Andi Kleen, linux-mm

On Sun, Feb 11, 2018 at 02:43:39PM +0800, Huang, Ying wrote:
> Hi, All,
> 
> To optimize the scalability of swap cache, it is made more dynamic
> than before.  That is, after being swapped off, the address space of
> the swap device will be freed too.  So the usage of page_mapping()
> need to be audited to make sure the address space of the swap device
> will not be used after it is freed.  For most cases it is OK, because
> to call page_mapping(), the page, page table, or LRU list will be
> locked.  But I found at least one usage isn't safe.  When
> page_mapping() is called in architecture specific code to flush dcache
> or sync between dcache and icache.
> 
> The typical usage models are,
> 
> 
> 1) Check whether page_mapping() is NULL, which is safe
> 
> 2) Call mapping_mapped() to check whether the backing file is mapped
>    to user space.
> 
> 3) Iterate all vmas via the interval tree (mapping->i_mmap) to flush dcache
> 
> 
> 2) and 3) isn't safe, because no lock to prevent swap device from
> swapping off is held.  But I found the code is for file address space
> only, not for swap cache.  For example, for flush_dcache_page() in
> arch/parisc/kernel/cache.c,

This code is required with virtually cached architectures to flush
every mapping alias - unlike physically cached architectures where
flushing one mapping suffices.

> 
> 
> void flush_dcache_page(struct page *page)
> {
> 	struct address_space *mapping = page_mapping(page);
> 	struct vm_area_struct *mpnt;
> 	unsigned long offset;
> 	unsigned long addr, old_addr = 0;
> 	pgoff_t pgoff;
> 
> 	if (mapping && !mapping_mapped(mapping)) {
> 		set_bit(PG_dcache_dirty, &page->flags);
> 		return;
> 	}
> 
> 	flush_kernel_dcache_page(page);
> 
> 	if (!mapping)
> 		return;
> 
> 	pgoff = page->index;
> 
> 	/* We have carefully arranged in arch_get_unmapped_area() that
> 	 * *any* mappings of a file are always congruently mapped (whether
> 	 * declared as MAP_PRIVATE or MAP_SHARED), so we only need
> 	 * to flush one address here for them all to become coherent */
> 
> 	flush_dcache_mmap_lock(mapping);
> 	vma_interval_tree_foreach(mpnt, &mapping->i_mmap, pgoff, pgoff) {
> 		offset = (pgoff - mpnt->vm_pgoff) << PAGE_SHIFT;
> 		addr = mpnt->vm_start + offset;
> 
> 		/* The TLB is the engine of coherence on parisc: The
> 		 * CPU is entitled to speculate any page with a TLB
> 		 * mapping, so here we kill the mapping then flush the
> 		 * page along a special flush only alias mapping.
> 		 * This guarantees that the page is no-longer in the
> 		 * cache for any process and nor may it be
> 		 * speculatively read in (until the user or kernel
> 		 * specifically accesses it, of course) */
> 
> 		flush_tlb_page(mpnt, addr);
> 		if (old_addr == 0 || (old_addr & (SHM_COLOUR - 1))
> 				      != (addr & (SHM_COLOUR - 1))) {
> 			__flush_cache_page(mpnt, addr, page_to_phys(page));
> 			if (old_addr)
> 				printk(KERN_ERR "INEQUIVALENT ALIASES 0x%lx and 0x%lx in file %pD\n", old_addr, addr, mpnt->vm_file);
> 			old_addr = addr;
> 		}
> 	}
> 	flush_dcache_mmap_unlock(mapping);
> }
> 
> 
> if page is an anonymous page in swap cache, "mapping &&
> !mapping_mapped()" will be true, so we will delay flushing.  But if my
> understanding of the code were correct, we should call
> flush_kernel_dcache() because the kernel may access the page during
> swapping in/out.
> 
> The code in other architectures follow the similar logic.  Would it be
> better for page_mapping() here to return NULL for anonymous pages even
> if they are in swap cache?  Of course we need to change the function
> name.  page_file_mapping() appears a good name, but that has been used
> already.  Any suggestion?

flush_dcache_page() does nothing for anonymous pages (see cachetlb.txt,
it's only defined to do anything for page cache pages.)

flush_anon_page() deals with anonymous pages.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: The usage of page_mapping() in architecture code
  2018-02-11  8:17 ` Russell King - ARM Linux
@ 2018-02-11  8:39   ` Huang, Ying
  2018-02-11 13:21     ` Russell King - ARM Linux
  0 siblings, 1 reply; 6+ messages in thread
From: Huang, Ying @ 2018-02-11  8:39 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Andrew Morton, Minchan Kim, Michal Hocko, Johannes Weiner,
	Mel Gorman, Dave Hansen, Chen Liqin, Yoshinori Sato,
	James E.J. Bottomley, Guan Xuetao, David S. Miller, Chris Zankel,
	Vineet Gupta, Ley Foon Tan, Ralf Baechle, Andi Kleen, linux-mm,
	Andrea Arcangeli, Hugh Dickins

Hi, Russell,

Russell King - ARM Linux <linux@armlinux.org.uk> writes:

> On Sun, Feb 11, 2018 at 02:43:39PM +0800, Huang, Ying wrote:
[snip]
>> 
>> 
>> if page is an anonymous page in swap cache, "mapping &&
>> !mapping_mapped()" will be true, so we will delay flushing.  But if my
>> understanding of the code were correct, we should call
>> flush_kernel_dcache() because the kernel may access the page during
>> swapping in/out.
>> 
>> The code in other architectures follow the similar logic.  Would it be
>> better for page_mapping() here to return NULL for anonymous pages even
>> if they are in swap cache?  Of course we need to change the function
>> name.  page_file_mapping() appears a good name, but that has been used
>> already.  Any suggestion?
>
> flush_dcache_page() does nothing for anonymous pages (see cachetlb.txt,
> it's only defined to do anything for page cache pages.)
>
> flush_anon_page() deals with anonymous pages.

Thanks for your information!  But I found this isn't followed exactly in
the code.  For example, in get_mergeable_page() in mm/ksm.c,

	if (PageAnon(page)) {
		flush_anon_page(vma, page, addr);
		flush_dcache_page(page);
	} else {
		put_page(page);

flush_dcache_page() is called for anonymous pages too.

Best Regards,
Huang, Ying

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: The usage of page_mapping() in architecture code
  2018-02-11  8:39   ` Huang, Ying
@ 2018-02-11 13:21     ` Russell King - ARM Linux
  0 siblings, 0 replies; 6+ messages in thread
From: Russell King - ARM Linux @ 2018-02-11 13:21 UTC (permalink / raw)
  To: Huang, Ying
  Cc: Andrew Morton, Minchan Kim, Michal Hocko, Johannes Weiner,
	Mel Gorman, Dave Hansen, Chen Liqin, Yoshinori Sato,
	James E.J. Bottomley, Guan Xuetao, David S. Miller, Chris Zankel,
	Vineet Gupta, Ley Foon Tan, Ralf Baechle, Andi Kleen, linux-mm,
	Andrea Arcangeli, Hugh Dickins

On Sun, Feb 11, 2018 at 04:39:07PM +0800, Huang, Ying wrote:
> Hi, Russell,
> 
> Russell King - ARM Linux <linux@armlinux.org.uk> writes:
> 
> > On Sun, Feb 11, 2018 at 02:43:39PM +0800, Huang, Ying wrote:
> [snip]
> >> 
> >> 
> >> if page is an anonymous page in swap cache, "mapping &&
> >> !mapping_mapped()" will be true, so we will delay flushing.  But if my
> >> understanding of the code were correct, we should call
> >> flush_kernel_dcache() because the kernel may access the page during
> >> swapping in/out.
> >> 
> >> The code in other architectures follow the similar logic.  Would it be
> >> better for page_mapping() here to return NULL for anonymous pages even
> >> if they are in swap cache?  Of course we need to change the function
> >> name.  page_file_mapping() appears a good name, but that has been used
> >> already.  Any suggestion?
> >
> > flush_dcache_page() does nothing for anonymous pages (see cachetlb.txt,
> > it's only defined to do anything for page cache pages.)
> >
> > flush_anon_page() deals with anonymous pages.
> 
> Thanks for your information!  But I found this isn't followed exactly in
> the code.  For example, in get_mergeable_page() in mm/ksm.c,
> 
> 	if (PageAnon(page)) {
> 		flush_anon_page(vma, page, addr);
> 		flush_dcache_page(page);
> 	} else {
> 		put_page(page);
> 
> flush_dcache_page() is called for anonymous pages too.

... and flush_dcache_page() will be a no-op here, as per its
documentation.  Any flushing required here will have been taken care of
with flush_anon_page().

If flush_dcache_page() were to do flushing here, it would repeat the
flushing that flush_anon_page() has just done, so its pointless.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2018-02-11 13:22 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-02-11  6:43 The usage of page_mapping() in architecture code Huang, Ying
2018-02-11  6:47 ` Huang, Ying
2018-02-11  6:47   ` Huang, Ying
2018-02-11  8:17 ` Russell King - ARM Linux
2018-02-11  8:39   ` Huang, Ying
2018-02-11 13:21     ` Russell King - ARM Linux

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.