linux-kernel.vger.kernel.org archive mirror
* Re: Bad use of highmem with buffer_migrate_page?
       [not found]   ` <4FAD89DC.2090307@codeaurora.org>
@ 2012-07-05  9:28     ` Rabin Vincent
  2012-07-05 10:05       ` Marek Szyprowski
  0 siblings, 1 reply; 4+ messages in thread
From: Rabin Vincent @ 2012-07-05  9:28 UTC (permalink / raw)
  To: Marek Szyprowski, Michal Nazarewicz
  Cc: Laura Abbott, linaro-mm-sig, linux-arm-msm, linux-arm-kernel,
	linux-mm, LKML

On Sat, May 12, 2012 at 3:21 AM, Laura Abbott <lauraa@codeaurora.org> wrote:
> On 5/11/2012 1:30 AM, Marek Szyprowski wrote:
>> On Thursday, May 10, 2012 10:08 PM Laura Abbott wrote:
>>> I did a backport of the Contiguous Memory Allocator to a 3.0.8 tree. I
>>> wrote a fairly simple test case that, in 1MB chunks, allocs up to 40MB
>>> from a reserved area, maps, writes, unmaps and then frees in an infinite
>>> loop. When running this with another program in parallel to put some
>>> stress on the filesystem, I hit data aborts in the filesystem/journal
>>> layer, although not always the same backtrace. As an example:
>>>
>>> [<c02907a4>] (__ext4_check_dir_entry+0x20/0x184) from [<c029e1a8>]
>>> (add_dirent_to_buf+0x70/0x2ac)
>>> [<c029e1a8>] (add_dirent_to_buf+0x70/0x2ac) from [<c029f3f0>]
>>> (ext4_add_entry+0xd8/0x4bc)
>>> [<c029f3f0>] (ext4_add_entry+0xd8/0x4bc) from [<c029fe90>]
>>> (ext4_add_nondir+0x14/0x64)
>>> [<c029fe90>] (ext4_add_nondir+0x14/0x64) from [<c02a04c4>]
>>> (ext4_create+0xd8/0x120)
>>> [<c02a04c4>] (ext4_create+0xd8/0x120) from [<c022e134>]
>>> (vfs_create+0x74/0xa4)
>>> [<c022e134>] (vfs_create+0x74/0xa4) from [<c022ed3c>]
>>> (do_last+0x588/0x8d4)
>>> [<c022ed3c>] (do_last+0x588/0x8d4) from [<c022fe64>]
>>> (path_openat+0xc4/0x394)
>>> [<c022fe64>] (path_openat+0xc4/0x394) from [<c0230214>]
>>> (do_filp_open+0x30/0x7c)
>>> [<c0230214>] (do_filp_open+0x30/0x7c) from [<c0220cb4>]
>>> (do_sys_open+0xd8/0x174)
>>> [<c0220cb4>] (do_sys_open+0xd8/0x174) from [<c0105ea0>]
>>> (ret_fast_syscall+0x0/0x30)
>>>
>>> Every panic had the same issue where a struct buffer_head [1] had a
>>> b_data that was unexpectedly NULL.
>>>
>>> During the course of CMA, buffer_migrate_page could be called to migrate
>>> from a CMA page to a new page. buffer_migrate_page calls set_bh_page[2]
>>> to set the new page for the buffer_head. If the new page is a highmem
>>> page though, the bh->b_data ends up as NULL, which could produce the
>>> panics seen above.
>>>
>>> This seems to indicate that highmem pages are not appropriate for
>>> use as pages to migrate to. The following made the problem go away for
>>> me:
>>>
>>> --- a/mm/page_alloc.c
>>> +++ b/mm/page_alloc.c
>>> @@ -5753,7 +5753,7 @@ static struct page *
>>>    __alloc_contig_migrate_alloc(struct page *page, unsigned long private,
>>>                                int **resultp)
>>>    {
>>> -       return alloc_page(GFP_HIGHUSER_MOVABLE);
>>> +       return alloc_page(GFP_USER | __GFP_MOVABLE);
>>>    }
>>>
>>>
>>> Does this seem like an actual issue or is this an artifact of my
>>> backport to 3.0? I'm not familiar enough with the filesystem layer to be
>>> able to tell where highmem can actually be used.
>>
>>
>> I will need to investigate this further as this issue doesn't appear on
>> v3.3+ kernels, but I remember I saw something similar when I tried CMA
>> backported to v3.0.

The problem is still present on latest mainline.  The filesystem layer
expects that the pages in the block device's mapping are not in highmem
(the mapping's gfp mask is set in bdget()), but CMA replaces lowmem
pages with highmem pages leading to the crashes.

The above fix should work, but perhaps the following is preferable since
it should allow moving highmem pages to other highmem pages?

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 4403009..4a4f921 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5635,7 +5635,12 @@ static struct page *
 __alloc_contig_migrate_alloc(struct page *page, unsigned long private,
 			     int **resultp)
 {
-	return alloc_page(GFP_HIGHUSER_MOVABLE);
+	gfp_t gfp_mask = GFP_USER | __GFP_MOVABLE;
+
+	if (PageHighMem(page))
+		gfp_mask |= __GFP_HIGHMEM;
+
+	return alloc_page(gfp_mask);
 }

 /* [start, end) must belong to a single zone. */

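The selection logic in the patch above can be sketched in userspace as follows. This is only a model, not kernel code: `model_gfp_t`, the `MODEL_GFP_*` bit values, `struct model_page` and `model_migrate_gfp_mask()` are hypothetical names invented for illustration, and the real gfp_t flag values differ. It shows that a lowmem source page never gets the highmem flag, while a highmem source page may still be replaced by another highmem page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits for this model only; the kernel's real values differ. */
typedef unsigned int model_gfp_t;
#define MODEL_GFP_USER    0x1u
#define MODEL_GFP_MOVABLE 0x2u
#define MODEL_GFP_HIGHMEM 0x4u

/* Stand-in for struct page: records only whether the page is in highmem. */
struct model_page {
	bool highmem;
};

/* Mirrors the decision made in the patched __alloc_contig_migrate_alloc(). */
static model_gfp_t model_migrate_gfp_mask(const struct model_page *old_page)
{
	model_gfp_t gfp_mask = MODEL_GFP_USER | MODEL_GFP_MOVABLE;

	assert(old_page != NULL);

	/* Only a page that already lives in highmem may be replaced by highmem. */
	if (old_page->highmem)
		gfp_mask |= MODEL_GFP_HIGHMEM;

	return gfp_mask;
}
```

Calling `model_migrate_gfp_mask()` on a page with `highmem == false` returns a mask without `MODEL_GFP_HIGHMEM`, which is the property the block device's mapping relies on.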

* RE: Bad use of highmem with buffer_migrate_page?
  2012-07-05  9:28     ` Bad use of highmem with buffer_migrate_page? Rabin Vincent
@ 2012-07-05 10:05       ` Marek Szyprowski
  2012-07-05 10:45         ` Rabin Vincent
  0 siblings, 1 reply; 4+ messages in thread
From: Marek Szyprowski @ 2012-07-05 10:05 UTC (permalink / raw)
  To: 'Rabin Vincent', 'Michal Nazarewicz'
  Cc: 'Laura Abbott',
	linaro-mm-sig, linux-arm-msm, linux-arm-kernel, linux-mm,
	'LKML'

Hello,

On Thursday, July 05, 2012 11:28 AM Rabin Vincent wrote:

> On Sat, May 12, 2012 at 3:21 AM, Laura Abbott <lauraa@codeaurora.org> wrote:
> > On 5/11/2012 1:30 AM, Marek Szyprowski wrote:
> >> On Thursday, May 10, 2012 10:08 PM Laura Abbott wrote:
> >>> I did a backport of the Contiguous Memory Allocator to a 3.0.8 tree. I
> >>> wrote a fairly simple test case that, in 1MB chunks, allocs up to 40MB
> >>> from a reserved area, maps, writes, unmaps and then frees in an infinite
> >>> loop. When running this with another program in parallel to put some
> >>> stress on the filesystem, I hit data aborts in the filesystem/journal
> >>> layer, although not always the same backtrace. As an example:
> >>>
> >>> [<c02907a4>] (__ext4_check_dir_entry+0x20/0x184) from [<c029e1a8>]
> >>> (add_dirent_to_buf+0x70/0x2ac)
> >>> [<c029e1a8>] (add_dirent_to_buf+0x70/0x2ac) from [<c029f3f0>]
> >>> (ext4_add_entry+0xd8/0x4bc)
> >>> [<c029f3f0>] (ext4_add_entry+0xd8/0x4bc) from [<c029fe90>]
> >>> (ext4_add_nondir+0x14/0x64)
> >>> [<c029fe90>] (ext4_add_nondir+0x14/0x64) from [<c02a04c4>]
> >>> (ext4_create+0xd8/0x120)
> >>> [<c02a04c4>] (ext4_create+0xd8/0x120) from [<c022e134>]
> >>> (vfs_create+0x74/0xa4)
> >>> [<c022e134>] (vfs_create+0x74/0xa4) from [<c022ed3c>]
> >>> (do_last+0x588/0x8d4)
> >>> [<c022ed3c>] (do_last+0x588/0x8d4) from [<c022fe64>]
> >>> (path_openat+0xc4/0x394)
> >>> [<c022fe64>] (path_openat+0xc4/0x394) from [<c0230214>]
> >>> (do_filp_open+0x30/0x7c)
> >>> [<c0230214>] (do_filp_open+0x30/0x7c) from [<c0220cb4>]
> >>> (do_sys_open+0xd8/0x174)
> >>> [<c0220cb4>] (do_sys_open+0xd8/0x174) from [<c0105ea0>]
> >>> (ret_fast_syscall+0x0/0x30)
> >>>
> >>> Every panic had the same issue where a struct buffer_head [1] had a
> >>> b_data that was unexpectedly NULL.
> >>>
> >>> During the course of CMA, buffer_migrate_page could be called to migrate
> >>> from a CMA page to a new page. buffer_migrate_page calls set_bh_page[2]
> >>> to set the new page for the buffer_head. If the new page is a highmem
> >>> page though, the bh->b_data ends up as NULL, which could produce the
> >>> panics seen above.
> >>>
> >>> This seems to indicate that highmem pages are not appropriate for
> >>> use as pages to migrate to. The following made the problem go away for
> >>> me:
> >>>
> >>> --- a/mm/page_alloc.c
> >>> +++ b/mm/page_alloc.c
> >>> @@ -5753,7 +5753,7 @@ static struct page *
> >>>    __alloc_contig_migrate_alloc(struct page *page, unsigned long private,
> >>>                                int **resultp)
> >>>    {
> >>> -       return alloc_page(GFP_HIGHUSER_MOVABLE);
> >>> +       return alloc_page(GFP_USER | __GFP_MOVABLE);
> >>>    }
> >>>
> >>>
> >>> Does this seem like an actual issue or is this an artifact of my
> >>> backport to 3.0? I'm not familiar enough with the filesystem layer to be
> >>> able to tell where highmem can actually be used.
> >>
> >>
> >> I will need to investigate this further as this issue doesn't appear on
> >> v3.3+ kernels, but I remember I saw something similar when I tried CMA
> >> backported to v3.0.
> 
> The problem is still present on latest mainline.  The filesystem layer
> expects that the pages in the block device's mapping are not in highmem
> (the mapping's gfp mask is set in bdget()), but CMA replaces lowmem
> pages with highmem pages leading to the crashes.
> 
> The above fix should work, but perhaps the following is preferable since
> it should allow moving highmem pages to other highmem pages?

> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 4403009..4a4f921 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5635,7 +5635,12 @@ static struct page *
>  __alloc_contig_migrate_alloc(struct page *page, unsigned long private,
>  			     int **resultp)
>  {
> -	return alloc_page(GFP_HIGHUSER_MOVABLE);
> +	gfp_t gfp_mask = GFP_USER | __GFP_MOVABLE;
> +
> +	if (PageHighMem(page))
> +		gfp_mask |= __GFP_HIGHMEM;
> +
> +	return alloc_page(gfp_mask);
>  }
> 
>  /* [start, end) must belong to a single zone. */


The patch looks fine and does its job well. Could you resend it as a complete
patch with commit message and signed-off-by/reported-by lines? I will handle
merging it to mainline then.

Best regards
-- 
Marek Szyprowski
Samsung Poland R&D Center


* Re: Bad use of highmem with buffer_migrate_page?
  2012-07-05 10:05       ` Marek Szyprowski
@ 2012-07-05 10:45         ` Rabin Vincent
  2012-07-05 19:21           ` Michał Nazarewicz
  0 siblings, 1 reply; 4+ messages in thread
From: Rabin Vincent @ 2012-07-05 10:45 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: 'Michal Nazarewicz', 'Laura Abbott',
	linaro-mm-sig, linux-arm-msm, linux-arm-kernel, linux-mm,
	'LKML'

On Thu, Jul 05, 2012 at 12:05:45PM +0200, Marek Szyprowski wrote:
> On Thursday, July 05, 2012 11:28 AM Rabin Vincent wrote:
> > The problem is still present on latest mainline.  The filesystem layer
> > expects that the pages in the block device's mapping are not in highmem
> > (the mapping's gfp mask is set in bdget()), but CMA replaces lowmem
> > pages with highmem pages leading to the crashes.
> > 
> > The above fix should work, but perhaps the following is preferable since
> > it should allow moving highmem pages to other highmem pages?
> 
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 4403009..4a4f921 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -5635,7 +5635,12 @@ static struct page *
> >  __alloc_contig_migrate_alloc(struct page *page, unsigned long private,
> >  			     int **resultp)
> >  {
> > -	return alloc_page(GFP_HIGHUSER_MOVABLE);
> > +	gfp_t gfp_mask = GFP_USER | __GFP_MOVABLE;
> > +
> > +	if (PageHighMem(page))
> > +		gfp_mask |= __GFP_HIGHMEM;
> > +
> > +	return alloc_page(gfp_mask);
> >  }
> > 
> >  /* [start, end) must belong to a single zone. */
> 
> 
> The patch looks fine and does its job well. Could you resend it as a complete
> patch with commit message and signed-off-by/reported-by lines? I will handle
> merging it to mainline then.

Thanks, here it is:

8<----
From 8a94126eb3aa2824866405fb78bb0b8316f8fd00 Mon Sep 17 00:00:00 2001
From: Rabin Vincent <rabin@rab.in>
Date: Thu, 5 Jul 2012 15:52:23 +0530
Subject: [PATCH] mm: cma: don't replace lowmem pages with highmem

The filesystem layer expects pages in the block device's mapping to not
be in highmem (the mapping's gfp mask is set in bdget()), but CMA can
currently replace lowmem pages with highmem pages, leading to crashes in
filesystem code such as the one below:

  Unable to handle kernel NULL pointer dereference at virtual address 00000400
  pgd = c0c98000
  [00000400] *pgd=00c91831, *pte=00000000, *ppte=00000000
  Internal error: Oops: 817 [#1] PREEMPT SMP ARM
  CPU: 0    Not tainted  (3.5.0-rc5+ #80)
  PC is at __memzero+0x24/0x80
  ...
  Process fsstress (pid: 323, stack limit = 0xc0cbc2f0)
  Backtrace:
  [<c010e3f0>] (ext4_getblk+0x0/0x180) from [<c010e58c>] (ext4_bread+0x1c/0x98)
  [<c010e570>] (ext4_bread+0x0/0x98) from [<c0117944>] (ext4_mkdir+0x160/0x3bc)
   r4:c15337f0
  [<c01177e4>] (ext4_mkdir+0x0/0x3bc) from [<c00c29e0>] (vfs_mkdir+0x8c/0x98)
  [<c00c2954>] (vfs_mkdir+0x0/0x98) from [<c00c2a60>] (sys_mkdirat+0x74/0xac)
   r6:00000000 r5:c152eb40 r4:000001ff r3:c14b43f0
  [<c00c29ec>] (sys_mkdirat+0x0/0xac) from [<c00c2ab8>] (sys_mkdir+0x20/0x24)
   r6:beccdcf0 r5:00074000 r4:beccdbbc
  [<c00c2a98>] (sys_mkdir+0x0/0x24) from [<c000e3c0>] (ret_fast_syscall+0x0/0x30)

Fix this by replacing only highmem pages with highmem.

Reported-by: Laura Abbott <lauraa@codeaurora.org>
Signed-off-by: Rabin Vincent <rabin@rab.in>
---
 mm/page_alloc.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 4403009..4a4f921 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5635,7 +5635,12 @@ static struct page *
 __alloc_contig_migrate_alloc(struct page *page, unsigned long private,
 			     int **resultp)
 {
-	return alloc_page(GFP_HIGHUSER_MOVABLE);
+	gfp_t gfp_mask = GFP_USER | __GFP_MOVABLE;
+
+	if (PageHighMem(page))
+		gfp_mask |= __GFP_HIGHMEM;
+
+	return alloc_page(gfp_mask);
 }
 
 /* [start, end) must belong to a single zone. */
-- 
1.7.9.5

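The fault address 00000400 in the oops above would be consistent with a buffer at offset 0x400 inside a highmem page, if set_bh_page() stores only the in-page offset in b_data when the page has no permanent kernel mapping. That behaviour is an assumption of the sketch below, which is a userspace model and not the kernel's code: `struct model_page`, `struct model_bh`, `model_set_bh_page()` and `MODEL_PAGE_SIZE` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u

/* Stand-in for struct page. */
struct model_page {
	bool highmem; /* true if the page has no permanent kernel mapping */
	char *kaddr;  /* kernel virtual address; meaningful for lowmem only */
};

/* Stand-in for struct buffer_head, reduced to the two fields of interest. */
struct model_bh {
	struct model_page *b_page;
	char *b_data;
};

/* Assumed behaviour: a highmem page yields a b_data that is just the offset. */
static void model_set_bh_page(struct model_bh *bh, struct model_page *page,
			      size_t offset)
{
	assert(offset < MODEL_PAGE_SIZE);

	bh->b_page = page;
	if (page->highmem)
		bh->b_data = (char *)(uintptr_t)offset; /* no usable address */
	else
		bh->b_data = page->kaddr + offset;
}
```

In this model, a buffer at offset 0 of a highmem page gets a NULL b_data, matching the original report, and a buffer at offset 0x400 gets the address 0x400, so code that writes through b_data faults at that low address.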


* Re: Bad use of highmem with buffer_migrate_page?
  2012-07-05 10:45         ` Rabin Vincent
@ 2012-07-05 19:21           ` Michał Nazarewicz
  0 siblings, 0 replies; 4+ messages in thread
From: Michał Nazarewicz @ 2012-07-05 19:21 UTC (permalink / raw)
  To: Rabin Vincent
  Cc: Marek Szyprowski, Laura Abbott, linaro-mm-sig, linux-arm-msm,
	linux-arm-kernel, linux-mm, LKML

2012/7/5 Rabin Vincent <rabin@rab.in>:
> From 8a94126eb3aa2824866405fb78bb0b8316f8fd00 Mon Sep 17 00:00:00 2001
> From: Rabin Vincent <rabin@rab.in>
> Date: Thu, 5 Jul 2012 15:52:23 +0530
> Subject: [PATCH] mm: cma: don't replace lowmem pages with highmem
>
> The filesystem layer expects pages in the block device's mapping to not
> be in highmem (the mapping's gfp mask is set in bdget()), but CMA can
> currently replace lowmem pages with highmem pages, leading to crashes in
> filesystem code such as the one below:
>
>   Unable to handle kernel NULL pointer dereference at virtual address 00000400
>   pgd = c0c98000
>   [00000400] *pgd=00c91831, *pte=00000000, *ppte=00000000
>   Internal error: Oops: 817 [#1] PREEMPT SMP ARM
>   CPU: 0    Not tainted  (3.5.0-rc5+ #80)
>   PC is at __memzero+0x24/0x80
>   ...
>   Process fsstress (pid: 323, stack limit = 0xc0cbc2f0)
>   Backtrace:
>   [<c010e3f0>] (ext4_getblk+0x0/0x180) from [<c010e58c>] (ext4_bread+0x1c/0x98)
>   [<c010e570>] (ext4_bread+0x0/0x98) from [<c0117944>] (ext4_mkdir+0x160/0x3bc)
>    r4:c15337f0
>   [<c01177e4>] (ext4_mkdir+0x0/0x3bc) from [<c00c29e0>] (vfs_mkdir+0x8c/0x98)
>   [<c00c2954>] (vfs_mkdir+0x0/0x98) from [<c00c2a60>] (sys_mkdirat+0x74/0xac)
>    r6:00000000 r5:c152eb40 r4:000001ff r3:c14b43f0
>   [<c00c29ec>] (sys_mkdirat+0x0/0xac) from [<c00c2ab8>] (sys_mkdir+0x20/0x24)
>    r6:beccdcf0 r5:00074000 r4:beccdbbc
>   [<c00c2a98>] (sys_mkdir+0x0/0x24) from [<c000e3c0>] (ret_fast_syscall+0x0/0x30)
>
> Fix this by replacing only highmem pages with highmem.
>
> Reported-by: Laura Abbott <lauraa@codeaurora.org>
> Signed-off-by: Rabin Vincent <rabin@rab.in>

Acked-by: Michal Nazarewicz <mina86@mina86.com>

> ---
>  mm/page_alloc.c |    7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 4403009..4a4f921 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5635,7 +5635,12 @@ static struct page *
>  __alloc_contig_migrate_alloc(struct page *page, unsigned long private,
>                              int **resultp)
>  {
> -       return alloc_page(GFP_HIGHUSER_MOVABLE);
> +       gfp_t gfp_mask = GFP_USER | __GFP_MOVABLE;
> +
> +       if (PageHighMem(page))
> +               gfp_mask |= __GFP_HIGHMEM;
> +
> +       return alloc_page(gfp_mask);
>  }
>
>  /* [start, end) must belong to a single zone. */


