All of lore.kernel.org
* Re: [PATCH] mm: Add nr_free_highatomimic to fix incorrect watermatk routine
       [not found] <1567157153-22024-1-git-send-email-sangwoo2.park@lge.com>
@ 2019-08-30 11:09 ` Michal Hocko
  2019-09-02  4:34   ` 박상우
  2019-09-05 13:59 ` Vlastimil Babka
  1 sibling, 1 reply; 5+ messages in thread
From: Michal Hocko @ 2019-08-30 11:09 UTC (permalink / raw)
  To: Sangwoo
  Cc: hannes, arunks, guro, richard.weiyang, glider, jannh,
	dan.j.williams, akpm, alexander.h.duyck, rppt, gregkh,
	janne.huttunen, pasha.tatashin, vbabka, osalvador, mgorman,
	khlebnikov, linux-mm, linux-kernel

On Fri 30-08-19 18:25:53, Sangwoo wrote:
> The highatomic migratetype blocks can grow to up to 1% of total memory,
> and they are reserved only for high-order ( > 0 order) allocations, so
> this reserve is subtracted during the watermark check if the allocation
> type isn't alloc_harder.
> 
> This has a problem. Pages allocated from the highatomic reserve are
> already excluded from NR_FREE_PAGES, so subtracting the whole reserve
> subtracts the allocated highatomic pages twice.
> This causes allocation failures even though there are enough free pages.
> 
> We observed this with random testing on my target (8GB RAM).
> 
> 	Binder:6218_2: page allocation failure: order:0, mode:0x14200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null)
> 	Binder:6218_2 cpuset=background mems_allowed=0

How come this order-0 sleepable allocation fails? The upstream kernel
doesn't fail those allocations unless the process context is killed by
the oom killer.

Also please note that atomic reserves are released when the memory
pressure is high and we cannot reclaim any other memory. Have a look at
unreserve_highatomic_pageblock called from should_reclaim_retry.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 5+ messages in thread

* RE: Re: [PATCH] mm: Add nr_free_highatomimic to fix incorrect watermatk routine
  2019-08-30 11:09 ` [PATCH] mm: Add nr_free_highatomimic to fix incorrect watermatk routine Michal Hocko
@ 2019-09-02  4:34   ` 박상우
  2019-09-02  6:09     ` Michal Hocko
  0 siblings, 1 reply; 5+ messages in thread
From: 박상우 @ 2019-09-02  4:34 UTC (permalink / raw)
  To: Michal Hocko
  Cc: hannes, arunks, guro, richard.weiyang, glider, jannh,
	dan.j.williams, akpm, alexander.h.duyck, rppt, gregkh,
	janne.huttunen, pasha.tatashin, vbabka, osalvador, mgorman,
	khlebnikov, linux-mm, linux-kernel

[-- Attachment #1: Type: text/html, Size: 5150 bytes --]


* Re: Re: [PATCH] mm: Add nr_free_highatomimic to fix incorrect watermatk routine
  2019-09-02  4:34   ` 박상우
@ 2019-09-02  6:09     ` Michal Hocko
  0 siblings, 0 replies; 5+ messages in thread
From: Michal Hocko @ 2019-09-02  6:09 UTC (permalink / raw)
  To: 박상우
  Cc: hannes, arunks, guro, richard.weiyang, glider, jannh,
	dan.j.williams, akpm, alexander.h.duyck, rppt, gregkh,
	janne.huttunen, pasha.tatashin, vbabka, osalvador, mgorman,
	khlebnikov, linux-mm, linux-kernel

On Mon 02-09-19 13:34:54, 박상우 wrote:
> > On Fri 30-08-19 18:25:53, Sangwoo wrote:
> >> The highatomic migratetype blocks can grow to up to 1% of total
> >> memory, and they are reserved only for high-order ( > 0 order)
> >> allocations, so this reserve is subtracted during the watermark
> >> check if the allocation type isn't alloc_harder.
> >>
> >> This has a problem. Pages allocated from the highatomic reserve are
> >> already excluded from NR_FREE_PAGES, so subtracting the whole
> >> reserve subtracts the allocated highatomic pages twice.
> >> This causes allocation failures even though there are enough free
> >> pages.
> >>
> >> We observed this with random testing on my target (8GB RAM).
> >>
> >>  Binder:6218_2: page allocation failure: order:0, mode:0x14200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null)
> >>  Binder:6218_2 cpuset=background mems_allowed=0
> >
> > How come this order-0 sleepable allocation fails? The upstream kernel
> > doesn't fail those allocations unless the process context is killed by
> > the oom killer.
> 
> Most call stacks are from zsmalloc, as shown below.

What makes those allocations special so that they fail unlike any other
normal order-0 requests? Also do you see the same problem with the
current upstream kernel? Is it possible this is an Android specific
issue?

>  Call trace:
>   dump_backtrace+0x0/0x1f0
>   show_stack+0x18/0x20
>   dump_stack+0xc4/0x100
>   warn_alloc+0x100/0x198
>   __alloc_pages_nodemask+0x116c/0x1188
>   do_swap_page+0x10c/0x6f0
>   handle_pte_fault+0x12c/0xfe0
>   handle_mm_fault+0x1d0/0x328
>   do_page_fault+0x2a0/0x3e0
>   do_translation_fault+0x44/0xa8
>   do_mem_abort+0x4c/0xd0
>   el1_da+0x24/0x84
>   __arch_copy_to_user+0x5c/0x220
>   binder_ioctl+0x20c/0x740
>   compat_SyS_ioctl+0x128/0x248
>   __sys_trace_return+0x0/0x4
-- 
Michal Hocko
SUSE Labs


* Re: [PATCH] mm: Add nr_free_highatomimic to fix incorrect watermatk routine
       [not found] <1567157153-22024-1-git-send-email-sangwoo2.park@lge.com>
  2019-08-30 11:09 ` [PATCH] mm: Add nr_free_highatomimic to fix incorrect watermatk routine Michal Hocko
@ 2019-09-05 13:59 ` Vlastimil Babka
  1 sibling, 0 replies; 5+ messages in thread
From: Vlastimil Babka @ 2019-09-05 13:59 UTC (permalink / raw)
  To: Sangwoo, hannes, arunks, guro, richard.weiyang, glider, jannh,
	dan.j.williams, akpm, alexander.h.duyck, rppt, gregkh,
	janne.huttunen, pasha.tatashin, Michal Hocko, osalvador, mgorman,
	khlebnikov
  Cc: linux-mm, linux-kernel

On 8/30/19 11:25 AM, Sangwoo wrote:
> The highatomic migratetype blocks can grow to up to 1% of total memory,
> and they are reserved only for high-order ( > 0 order) allocations, so
> this reserve is subtracted during the watermark check if the allocation
> type isn't alloc_harder.
> 
> This has a problem. Pages allocated from the highatomic reserve are
> already excluded from NR_FREE_PAGES, so subtracting the whole reserve
> subtracts the allocated highatomic pages twice.
> This causes allocation failures even though there are enough free pages.

This is known; the comment in __zone_watermark_ok says "This will
over-estimate the size of the atomic reserve but it avoids a search."
It was discussed during review and wasn't considered a large issue
thanks to unreserving on demand before OOM happens.

> @@ -919,6 +923,9 @@ static inline void __free_one_page(struct page *page,
>  	VM_BUG_ON(migratetype == -1);
>  	if (likely(!is_migrate_isolate(migratetype)))
>  		__mod_zone_freepage_state(zone, 1 << order, migratetype);
> +	if (is_migrate_highatomic(migratetype) ||
> +		is_migrate_highatomic_page(page))
> +		__mod_zone_page_state(zone, NR_FREE_HIGHATOMIC_PAGES, 1 << order);

I suspect the counter will eventually become imbalanced, at least due
to merging of a highatomic pageblock and a non-highatomic pageblock. To
get it right, it would have to be complicated in a way similar to how
we handle MIGRATE_ISOLATE and MIGRATE_CMA. It wasn't considered serious
enough to warrant these complications.



* [PATCH] mm: Add nr_free_highatomimic to fix incorrect watermatk routine
@ 2019-09-05  1:46 Park Sangwoo
  0 siblings, 0 replies; 5+ messages in thread
From: Park Sangwoo @ 2019-09-05  1:46 UTC (permalink / raw)
  To: mhocko
  Cc: hannes, arunks, guro, richard.weiyang, glider, jannh,
	dan.j.williams, akpm, alexander.h.duyck, rppt, gregkh,
	janne.huttunen, pasha.tatashin, vbabka, osalvador, mgorman,
	khlebnikov, linux-mm, linux-kernel

> On Wed 04-09-19 15:54:57, Park Sangwoo wrote:
> > On Tue 03-09-19 18:59:59, Park Sangwoo wrote:
> > > >> On Mon 02-09-19 13:34:54, Sangwoo wrote:
> > > >>>> On Fri 30-08-19 18:25:53, Sangwoo wrote:
> > > >>>>> The highatomic migratetype blocks can grow to up to 1% of
> > > >>>>> total memory, and they are reserved only for high-order
> > > >>>>> ( > 0 order) allocations, so this reserve is subtracted
> > > >>>>> during the watermark check if the allocation type isn't
> > > >>>>> alloc_harder.
> > > >>>>>
> > > >>>>> This has a problem. Pages allocated from the highatomic
> > > >>>>> reserve are already excluded from NR_FREE_PAGES, so
> > > >>>>> subtracting the whole reserve subtracts the allocated
> > > >>>>> highatomic pages twice.
> > > >>>>> This causes allocation failures even though there are enough
> > > >>>>> free pages.
> > > >>>>>
> > > >>>>> We observed this with random testing on my target (8GB RAM).
> > > >>>>>
> > > >>>>>  Binder:6218_2: page allocation failure: order:0, mode:0x14200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null)
> > > >>>>>  Binder:6218_2 cpuset=background mems_allowed=0
> > > >>>>
> > > >>>> How come this order-0 sleepable allocation fails? The upstream
> > > >>>> kernel doesn't fail those allocations unless the process
> > > >>>> context is killed by the oom killer.
> > > >>>
> > > >>> Most call stacks are from zsmalloc, as shown below.
> > > >>
> > > >> What makes those allocations special so that they fail unlike any other
> > > >> normal order-0 requests? Also do you see the same problem with the
> > > >> current upstream kernel? Is it possible this is an Android specific
> > > >> issue?
> > > >
> > > > There is another case of an order-0 allocation failure.
> > > > ----
> > > > hvdcp_opti: page allocation failure: order:0, mode:0x1004000(GFP_NOWAIT|__GFP_COMP), nodemask=(null)
> > > 
> > > This is an atomic allocation and failing that one is not a problem
> > > usually. High atomic reservations might prevent GFP_NOWAIT allocation
> > > from suceeding but I do not see that as a problem. This is the primary
> > > purpose of the reservation. 
> > 
> > Thanks, your answer helped me. However, my suggestion is not to modify
> > how the high atomic region is used and managed, but to calculate the
> > exact free size of the highatomic reserve so that the failures shared
> > earlier do not occur.
> > 
> > In the __zone_watermark_ok(...) function, if it is not an atomic
> > allocation, the high atomic size is excluded:
> > 
> > bool __zone_watermark_ok(struct zone *z,
> > ...
> > {
> >     ...
> >     if (likely(!alloc_harder)) {
> >         free_pages -= z->nr_reserved_highatomic;
> >     ...
> > }
> > 
> > However, free_pages already excludes the pages allocated from the
> > highatomic reserve. If the highatomic reserve is small (under 4GB
> > RAM), this may not be a problem. But the larger the memory size, the
> > greater the chance of problems, because the highatomic reserve can
> > grow up to 1% of memory.
> 
> I still do not understand. NR_FREE_PAGES should include the amount of
> highatomic reserves, right? So reducing free_pages for normal
> allocations just makes sense. Or what do I miss?

You are right, but the z->nr_reserved_highatomic value is the total size
of the highatomic migratetype per zone:

nr_reserved_highatomic = (# of allocated highatomic pages) + (# of pages on the highatomic free lists)

The allocated highatomic pages are already excluded from NR_FREE_PAGES.
So, if nr_reserved_highatomic is subtracted from NR_FREE_PAGES, the
allocated highatomic pages are subtracted twice.

So I propose that only the pages on the highatomic free lists be
subtracted from NR_FREE_PAGES:

    if (likely(!alloc_harder)) {
-       free_pages -= z->nr_reserved_highatomic;
+       free_pages -= zone_page_state(z, NR_FREE_HIGHATOMIC_PAGES);
    } else {

> 
> I am sorry but I find your reasoning really hard to follow.



end of thread, other threads:[~2019-09-05 13:59 UTC | newest]

Thread overview: 5+ messages
     [not found] <1567157153-22024-1-git-send-email-sangwoo2.park@lge.com>
2019-08-30 11:09 ` [PATCH] mm: Add nr_free_highatomimic to fix incorrect watermatk routine Michal Hocko
2019-09-02  4:34   ` 박상우
2019-09-02  6:09     ` Michal Hocko
2019-09-05 13:59 ` Vlastimil Babka
2019-09-05  1:46 Park Sangwoo
