linux-kernel.vger.kernel.org archive mirror
From: Heesub Shin <heesub.shin@samsung.com>
To: Joonsoo Kim <iamjoonsoo.kim@lge.com>, Minchan Kim <minchan.kim@lge.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Rik van Riel <riel@redhat.com>,
	Laura Abbott <lauraa@codeaurora.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Michal Nazarewicz <mina86@mina86.com>,
	Mel Gorman <mgorman@suse.de>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Marek Szyprowski <m.szyprowski@samsung.com>
Subject: Re: [RFC PATCH 2/3] CMA: aggressively allocate the pages on cma reserved memory when not used
Date: Thu, 15 May 2014 11:45:31 +0900
Message-ID: <53742A4B.4090901@samsung.com>
In-Reply-To: <20140515015301.GA10116@js1304-P5Q-DELUXE>

Hello,

On 05/15/2014 10:53 AM, Joonsoo Kim wrote:
> On Tue, May 13, 2014 at 12:00:57PM +0900, Minchan Kim wrote:
>> Hey Joonsoo,
>>
>> On Thu, May 08, 2014 at 09:32:23AM +0900, Joonsoo Kim wrote:
>>> CMA was introduced to provide physically contiguous pages at runtime.
>>> For this purpose, it reserves memory at boot time. Although it reserves
>>> memory, this reserved memory can be used for movable memory allocation
>>> requests. This use case is beneficial to systems that need the CMA
>>> reserved memory infrequently, and it is one of the main purposes of
>>> introducing CMA.
>>>
>>> But there is a problem in the current implementation: it behaves
>>> just like a plain reserved-memory approach. The pages on cma reserved
>>> memory are hardly used for movable memory allocation. This is caused by
>>> the combination of allocation and reclaim policy.
>>>
>>> The pages on cma reserved memory are allocated only if there is no
>>> movable memory, that is, as a fallback allocation. So this fallback
>>> allocation starts only under heavy memory pressure. Even under
>>> memory pressure, movable allocations easily succeed, since there are
>>> many pages on cma reserved memory. But this is not the case for unmovable
>>> and reclaimable allocations, because they can't use the pages on cma
>>> reserved memory. These allocations regard the system's free memory as
>>> (free pages - free cma pages) on watermark checking, that is, free
>>> unmovable pages + free reclaimable pages + free movable pages. Because
>>> we have already exhausted movable pages, the only free pages we have
>>> are of the unmovable and reclaimable types, and this is a really small
>>> amount. So the watermark check fails. That wakes up kswapd to make
>>> enough free memory for unmovable and reclaimable allocations, and
>>> kswapd does so. So before we fully utilize the pages on cma reserved
>>> memory, kswapd starts to reclaim memory and tries to bring free memory
>>> above the high watermark. This watermark check by kswapd doesn't take
>>> free cma pages into account, so many movable pages are reclaimed. After
>>> that, we have a lot of movable pages again, so the fallback allocation
>>> doesn't happen again. To conclude, the amount of free memory in
>>> meminfo, which includes free CMA pages, hovers around 512 MB if I
>>> reserve 512 MB of memory for CMA.
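
The watermark arithmetic described above can be sketched as a toy Python model. This is not kernel code: `watermark_ok`, its parameters, and the page counts are all hypothetical, chosen only to show why a request that cannot use CMA pages may fail the check while plenty of CMA pages are still free.

```python
def watermark_ok(free_pages, free_cma_pages, watermark, can_use_cma):
    """Toy watermark check (hypothetical helper, not the kernel function).

    Requests that cannot use CMA pages must not count free CMA pages
    as available memory, so those pages are subtracted before comparing.
    """
    usable = free_pages if can_use_cma else free_pages - free_cma_pages
    return usable > watermark


# Illustrative numbers only (in pages): almost all free memory is CMA.
free_pages = 130_000
free_cma_pages = 128_000
watermark = 4_000

movable_ok = watermark_ok(free_pages, free_cma_pages, watermark,
                          can_use_cma=True)
unmovable_ok = watermark_ok(free_pages, free_cma_pages, watermark,
                            can_use_cma=False)

print("movable request passes:", movable_ok)
print("unmovable request passes:", unmovable_ok)
```

With these made-up numbers the movable request passes while the unmovable one does not, which is the situation that wakes kswapd in the description above.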
>>>
>>> I found this problem in the following experiment.
>>>
>>> 4 CPUs, 1024 MB, VIRTUAL MACHINE
>>> make -j24
>>>
>>> CMA reserve:		0 MB		512 MB
>>> Elapsed-time:		234.8		361.8
>>> Average-MemFree:	283880 KB	530851 KB
>>>
>>> To solve this problem, I can think of the following 2 possible solutions.
>>> 1. allocate the pages on cma reserved memory first, and if they are
>>>     exhausted, allocate movable pages.
>>> 2. interleaved allocation: try to allocate specific amounts of memory
>>>     from cma reserved memory and then allocate from free movable memory.
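
As a rough illustration of option 2, here is a minimal Python sketch of interleaved allocation. It is only a toy model under stated assumptions: the function name, the two page pools, and the batch sizes are hypothetical and are not taken from the patch.

```python
def allocate_interleaved(n_requests, free_movable, free_cma,
                         movable_batch, cma_batch):
    """Serve movable requests by alternating between two page pools.

    Takes up to `cma_batch` pages from the CMA pool, then up to
    `movable_batch` pages from the movable pool, and repeats. When one
    pool is empty, the other keeps serving. Returns each page's source.
    """
    if movable_batch < 1 or cma_batch < 1:
        raise ValueError("batch sizes must be at least 1")
    sources = []
    while len(sources) < n_requests and (free_movable > 0 or free_cma > 0):
        take = min(cma_batch, free_cma, n_requests - len(sources))
        sources += ["cma"] * take
        free_cma -= take
        take = min(movable_batch, free_movable, n_requests - len(sources))
        sources += ["movable"] * take
        free_movable -= take
    return sources


# Hypothetical ratio: 1 CMA page for every 2 movable pages.
pattern = allocate_interleaved(6, free_movable=10, free_cma=10,
                               movable_batch=2, cma_batch=1)
print(pattern)
```

Unlike a pure fallback policy, the CMA pool is consumed steadily alongside the movable pool instead of only after movable pages run out.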
>>
>> I love this idea, but I don't like the code.
>> In the allocation path, just try to allocate pages round-robin; that is
>> the allocator's role. If one of the migratetypes is full, just pass the
>> job to the reclaimer with a hint (i.e., "Hey reclaimer, a non-movable
>> allocation failed, so it is pointless to reclaim MIGRATE_CMA pages") so
>> that the reclaimer can filter them out during page scanning.
>> We already have a tool to achieve this (i.e., isolate_mode_t).
>
> Hello,
>
> I agree with keeping the fast allocation path as simple as possible.
> I will remove the runtime computation for determining the ratio in
> __rmqueue_cma() and will instead use a pre-computed value calculated
> on another path.
>
> I am not sure whether your second suggestion (the "Hey reclaimer" part)
> is good or not. At first thought, it could be helpful in the
> situation where many free cma pages remain. But it would not be helpful
> when there are neither free movable nor free cma pages. In general, most
> workloads mainly use movable pages for page cache or anonymous mappings.
> Although reclaim is triggered by a non-movable allocation failure, the
> reclaimed pages are used mostly by movable allocations. We can handle
> these allocation requests even if we reclaim the pages just in lru
> order. If we rotate the lru list to find movable pages, it could cause
> more useful pages to be evicted.
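
The LRU concern can be made concrete with a small Python model. This is a sketch under stated assumptions, not kernel code: the `reclaim` function, the `(name, is_cma)` page tuples, and the rotate-on-skip behaviour are hypothetical simplifications of a reclaimer that filters out CMA pages while scanning.

```python
from collections import deque


def reclaim(lru, nr_to_reclaim, skip_cma):
    """Reclaim pages from the cold end (left) of a toy LRU list.

    Each page is a (name, is_cma) tuple. With skip_cma set, CMA pages
    are rotated to the hot end instead of being reclaimed, mimicking a
    reclaimer that filters them out during scanning.
    """
    lru = deque(lru)
    reclaimed = []
    scanned = 0
    limit = len(lru)  # scan each page at most once
    while lru and len(reclaimed) < nr_to_reclaim and scanned < limit:
        page = lru.popleft()
        scanned += 1
        name, is_cma = page
        if skip_cma and is_cma:
            lru.append(page)  # rotate: kept, but loses its LRU position
        else:
            reclaimed.append(name)
    return reclaimed, list(lru)


# Coldest page first, hottest page last.
pages = [("A", True), ("B", True), ("C", False), ("D", True), ("E", False)]

in_order, _ = reclaim(pages, 2, skip_cma=False)
filtered, remaining = reclaim(pages, 2, skip_cma=True)
print("plain LRU order evicts:", in_order)
print("CMA-filtered scan evicts:", filtered)
```

In this toy list, plain LRU order evicts the two coldest pages, while the filtered scan reaches all the way to the hottest page to find two non-CMA victims.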
>
> This is just my quick thought, so please correct me if I am wrong.

We have an out-of-tree implementation that takes exactly the approach
Minchan described. It works, but it definitely has the side effects you
pointed out: it distorts the LRU and evicts hot pages. For various
reasons I am not attaching code fragments in this thread, but it should
be easy for you to implement yourself. I wonder whether it could also
help in your case.

Thanks,
Heesub

>
>>
>> And couldn't we do it in zone_watermark_ok by setting/resetting ALLOC_CMA?
>> If possible, that would be better because it is the generic function
>> that checks free pages and triggers the reclaim/compaction logic.
>
> I guess your *it* means the ratio computation, right?
> I don't like putting it in zone_watermark_ok(). Although it needs to
> refer to the free cma pages value, which is also used in
> zone_watermark_ok(), this computation is for determining the ratio, not
> for triggering reclaim/compaction. And zone_watermark_ok() is on a
> hotter path, so putting this logic into zone_watermark_ok() does not
> look better to me.
>
> I will think of a better place to do it.
>
> Thanks.
