* compaction: trying to understand the code
@ 2010-08-17 11:08 Iram Shahzad
  2010-08-17 11:10 ` Mel Gorman
  0 siblings, 1 reply; 33+ messages in thread
From: Iram Shahzad @ 2010-08-17 11:08 UTC (permalink / raw)
  To: Mel Gorman; +Cc: linux-mm

Hi

I am trying to understand the following code in isolate_migratepages
function. I have a question regarding this.

---
        while (unlikely(too_many_isolated(zone))) {
                congestion_wait(BLK_RW_ASYNC, HZ/10);

                if (fatal_signal_pending(current))
                        return 0;
        }

---

I have seen that in some cases this while loop never exits
because too_many_isolated keeps returning true for ever.
And hence the process hangs. Is this intended behaviour?
What is it that is supposed to change the "too_many_isolated" situation?
In other words, what is it that is supposed to increase the "inactive"
or decrease the "isolated" so that isolated > inactive becomes false?

Best regards
Iram


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* Re: compaction: trying to understand the code
  2010-08-17 11:08 compaction: trying to understand the code Iram Shahzad
@ 2010-08-17 11:10 ` Mel Gorman
  2010-08-18  8:19   ` Iram Shahzad
  0 siblings, 1 reply; 33+ messages in thread
From: Mel Gorman @ 2010-08-17 11:10 UTC (permalink / raw)
  To: Iram Shahzad; +Cc: linux-mm

On Tue, Aug 17, 2010 at 08:08:54PM +0900, Iram Shahzad wrote:
> Hi
>
> I am trying to understand the following code in isolate_migratepages
> function. I have a question regarding this.
>
> ---
> while (unlikely(too_many_isolated(zone))) {
>  congestion_wait(BLK_RW_ASYNC, HZ/10);
>
>  if (fatal_signal_pending(current))
>   return 0;
> }
>
> ---
>
> I have seen that in some cases this while loop never exits
> because too_many_isolated keeps returning true for ever.
> And hence the process hangs. Is this intended behaviour?

No. Under what circumstances does it get stuck forever? The logic is
similar to what's in page reclaim, except that there parallel processes
such as kswapd or direct reclaimers would eventually release isolated pages.

> What is it that is supposed to change the "too_many_isolated" situation?

Parallel reclaimers or compaction processes releasing the pages they
have isolated from the LRU.

> In other words, what is it that is supposed to increase the "inactive"
> or decrease the "isolated" so that isolated > inactive becomes false?
>

See places that update the NR_ISOLATED_ANON and NR_ISOLATED_FILE
counters.

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab


* Re: compaction: trying to understand the code
  2010-08-17 11:10 ` Mel Gorman
@ 2010-08-18  8:19   ` Iram Shahzad
  2010-08-18 15:41     ` Wu Fengguang
  0 siblings, 1 reply; 33+ messages in thread
From: Iram Shahzad @ 2010-08-18  8:19 UTC (permalink / raw)
  To: Mel Gorman; +Cc: linux-mm

>> In other words, what is it that is supposed to increase the "inactive"
>> or decrease the "isolated" so that isolated > inactive becomes false?
>>
> 
> See places that update the NR_ISOLATED_ANON and NR_ISOLATED_FILE
> counters.

Many thanks for the advice.
So far as I understand, to come out of the loop, somehow NR_ISOLATED_*
has to be decremented. And the code that decrements it is called here:
mm/migrate.c migrate_pages() -> unmap_and_move()

In compaction.c, migrate_pages() is called only after returning from 
isolate_migratepages().
So if it is looping inside isolate_migratepages() function, migrate_pages()
will not be called and hence there is no chance for NR_ISOLATED_*
to be decremented. Am I wrong?

Best regards
Iram



* Re: compaction: trying to understand the code
  2010-08-18  8:19   ` Iram Shahzad
@ 2010-08-18 15:41     ` Wu Fengguang
  2010-08-19  7:09       ` Iram Shahzad
  0 siblings, 1 reply; 33+ messages in thread
From: Wu Fengguang @ 2010-08-18 15:41 UTC (permalink / raw)
  To: Iram Shahzad; +Cc: Mel Gorman, linux-mm, KOSAKI Motohiro

On Wed, Aug 18, 2010 at 05:19:21PM +0900, Iram Shahzad wrote:
> >>In other words, what is it that is supposed to increase the "inactive"
> >>or decrease the "isolated" so that isolated > inactive becomes false?
> >>
> >
> >See places that update the NR_ISOLATED_ANON and NR_ISOLATED_FILE
> >counters.
> 
> Many thanks for the advice.
> So far as I understand, to come out of the loop, somehow NR_ISOLATED_*
> has to be decremented. And the code that decrements it is called here:
> mm/migrate.c migrate_pages() -> unmap_and_move()
> 
> In compaction.c, migrate_pages() is called only after returning from
> isolate_migratepages().
> So if it is looping inside isolate_migratepages() function, migrate_pages()
> will not be called and hence there is no chance for NR_ISOLATED_*
> to be decremented. Am I wrong?

The loop should be waiting for the _other_ processes (doing direct
reclaims) to proceed.  When there are _lots of_ ongoing page
allocations/reclaims, it makes sense to wait for them to calm down a bit?

Thanks,
Fengguang


* Re: compaction: trying to understand the code
  2010-08-18 15:41     ` Wu Fengguang
@ 2010-08-19  7:09       ` Iram Shahzad
  2010-08-19  7:45         ` Wu Fengguang
                           ` (2 more replies)
  0 siblings, 3 replies; 33+ messages in thread
From: Iram Shahzad @ 2010-08-19  7:09 UTC (permalink / raw)
  To: Wu Fengguang; +Cc: Mel Gorman, linux-mm, KOSAKI Motohiro

> The loop should be waiting for the _other_ processes (doing direct
> reclaims) to proceed.  When there are _lots of_ ongoing page
> allocations/reclaims, it makes sense to wait for them to calm down a bit?

I have noticed that if I run another process, it helps the loop to exit.
So is this (i.e. hanging until another process helps) intended behaviour?

Also, the other process does help the loop to exit, but the loop is then
entered again and the compaction never finishes. That is, the process
appears to hang. Is this intended behaviour?
What will improve this situation?

Thanks
Iram



* Re: compaction: trying to understand the code
  2010-08-19  7:09       ` Iram Shahzad
@ 2010-08-19  7:45         ` Wu Fengguang
  2010-08-19  7:46         ` Mel Gorman
  2010-08-19 16:00         ` Minchan Kim
  2 siblings, 0 replies; 33+ messages in thread
From: Wu Fengguang @ 2010-08-19  7:45 UTC (permalink / raw)
  To: Iram Shahzad; +Cc: Mel Gorman, linux-mm, KOSAKI Motohiro

On Thu, Aug 19, 2010 at 03:09:38PM +0800, Iram Shahzad wrote:
> > The loop should be waiting for the _other_ processes (doing direct
> > reclaims) to proceed.  When there are _lots of_ ongoing page
> > allocations/reclaims, it makes sense to wait for them to calm down a bit?
> 
> I have noticed that if I run other process, it helps the loop to exit.
> So is this (ie hanging until other process helps) intended behaviour?
>
> Also, the other process does help the loop to exit, but again it enters
> the loop and the compaction is never finished. That is, the process
> looks like hanging. Is this intended behaviour?
> What will improve this situation?
 
What's your /proc/vmstat?  Does your system have thousands of
processes allocating memory concurrently? I'd like to make sure the
too_many_isolated() test is working as expected..

Thanks,
Fengguang


* Re: compaction: trying to understand the code
  2010-08-19  7:09       ` Iram Shahzad
  2010-08-19  7:45         ` Wu Fengguang
@ 2010-08-19  7:46         ` Mel Gorman
  2010-08-19  8:08           ` Wu Fengguang
  2010-08-20  5:45           ` Iram Shahzad
  2010-08-19 16:00         ` Minchan Kim
  2 siblings, 2 replies; 33+ messages in thread
From: Mel Gorman @ 2010-08-19  7:46 UTC (permalink / raw)
  To: Iram Shahzad; +Cc: Wu Fengguang, linux-mm, KOSAKI Motohiro

On Thu, Aug 19, 2010 at 04:09:38PM +0900, Iram Shahzad wrote:
>> The loop should be waiting for the _other_ processes (doing direct
>> reclaims) to proceed.  When there are _lots of_ ongoing page
>> allocations/reclaims, it makes sense to wait for them to calm down a bit?
>
> I have noticed that if I run other process, it helps the loop to exit.
> So is this (ie hanging until other process helps) intended behaviour?
>

No, it's not but I'm not immediately seeing how it would occur either.
too_many_isolated() should only be true when there are multiple
processes running that are isolating pages be it due to reclaim or
compaction. These should be finishing their work after some time so
while a process may stall in too_many_isolated(), it should not stay
there forever.

The loop around isolate_migratepages() puts back LRU pages it failed to
migrate so it's not the case that the compacting process is isolating a
large number of pages and then calling too_many_isolated() against itself.

> Also, the other process does help the loop to exit, but again it enters
> the loop and the compaction is never finished. That is, the process
> looks like hanging. Is this intended behaviour?

Infinite loops are never intended behaviour.

> What will improve this situation?

What is your test scenario? Who or what has these pages isolated that is
allowing too_many_isolated() to be true?

I'm not seeing how processes could isolate a large number of pages and
hold onto them for a long time but knowing the test scenario might help.

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab


* Re: compaction: trying to understand the code
  2010-08-19  7:46         ` Mel Gorman
@ 2010-08-19  8:08           ` Wu Fengguang
  2010-08-19  8:15             ` Mel Gorman
  2010-08-20  5:45           ` Iram Shahzad
  1 sibling, 1 reply; 33+ messages in thread
From: Wu Fengguang @ 2010-08-19  8:08 UTC (permalink / raw)
  To: Mel Gorman; +Cc: Iram Shahzad, linux-mm, KOSAKI Motohiro

On Thu, Aug 19, 2010 at 03:46:02PM +0800, Mel Gorman wrote:
> On Thu, Aug 19, 2010 at 04:09:38PM +0900, Iram Shahzad wrote:
> >> The loop should be waiting for the _other_ processes (doing direct
> >> reclaims) to proceed.  When there are _lots of_ ongoing page
> >> allocations/reclaims, it makes sense to wait for them to calm down a bit?
> >
> > I have noticed that if I run other process, it helps the loop to exit.
> > So is this (ie hanging until other process helps) intended behaviour?
> >
> 
> No, it's not but I'm not immediately seeing how it would occur either.
> too_many_isolated() should only be true when there are multiple
> processes running that are isolating pages be it due to reclaim or
> compaction. These should be finishing their work after some time so
> while a process may stall in too_many_isolated(), it should not stay
> there forever.
> 
> The loop around isolate_migratepages() puts back LRU pages it failed to
> migrate so it's not the case that the compacting process is isolating a
> large number of pages and then calling too_many_isolated() against itself.

It seems the compaction process isolates 128MB of pages at a time? That
sounds risky: too_many_isolated() can easily become true, which will stall
direct reclaim processes. I'm not seeing how exactly it makes
compaction itself stall infinitely though.

> > Also, the other process does help the loop to exit, but again it enters
> > the loop and the compaction is never finished. That is, the process
> > looks like hanging. Is this intended behaviour?
> 
> Infinite loops are never intended behaviour.

Yup.

Thanks,
Fengguang


* Re: compaction: trying to understand the code
  2010-08-19  8:08           ` Wu Fengguang
@ 2010-08-19  8:15             ` Mel Gorman
  2010-08-19  8:29               ` Wu Fengguang
  0 siblings, 1 reply; 33+ messages in thread
From: Mel Gorman @ 2010-08-19  8:15 UTC (permalink / raw)
  To: Wu Fengguang; +Cc: Iram Shahzad, linux-mm, KOSAKI Motohiro

On Thu, Aug 19, 2010 at 04:08:31PM +0800, Wu Fengguang wrote:
> On Thu, Aug 19, 2010 at 03:46:02PM +0800, Mel Gorman wrote:
> > On Thu, Aug 19, 2010 at 04:09:38PM +0900, Iram Shahzad wrote:
> > >> The loop should be waiting for the _other_ processes (doing direct
> > >> reclaims) to proceed.  When there are _lots of_ ongoing page
> > >> allocations/reclaims, it makes sense to wait for them to calm down a bit?
> > >
> > > I have noticed that if I run other process, it helps the loop to exit.
> > > So is this (ie hanging until other process helps) intended behaviour?
> > >
> > 
> > No, it's not but I'm not immediately seeing how it would occur either.
> > too_many_isolated() should only be true when there are multiple
> > processes running that are isolating pages be it due to reclaim or
> > compaction. These should be finishing their work after some time so
> > while a process may stall in too_many_isolated(), it should not stay
> > there forever.
> > 
> > The loop around isolate_migratepages() puts back LRU pages it failed to
> > migrate so it's not the case that the compacting process is isolating a
> > large number of pages and then calling too_many_isolated() against itself.
> 
> It seems the compaction process isolates 128MB pages at a time?

It should be one pageblock at a time for source migration and one pageblock
for target pages. Look at the values for low_pfn and end_pfn here:

static unsigned long isolate_migratepages(struct zone *zone,
                                        struct compact_control *cc)
{
        unsigned long low_pfn, end_pfn;
        struct list_head *migratelist = &cc->migratepages;

        /* Do not scan outside zone boundaries */
        low_pfn = max(cc->migrate_pfn, zone->zone_start_pfn);

        /* Only scan within a pageblock boundary */
        end_pfn = ALIGN(low_pfn + pageblock_nr_pages, pageblock_nr_pages);

....

and the loop around that looks like

        while ((ret = compact_finished(zone, cc)) == COMPACT_CONTINUE) {
                unsigned long nr_migrate, nr_remaining;

                if (!isolate_migratepages(zone, cc))
                        continue;

                nr_migrate = cc->nr_migratepages;
                migrate_pages(&cc->migratepages, compaction_alloc,
                                                (unsigned long)cc, 0);
                update_nr_listpages(cc);
                nr_remaining = cc->nr_migratepages;

                count_vm_event(COMPACTBLOCKS);
                count_vm_events(COMPACTPAGES, nr_migrate - nr_remaining);
                if (nr_remaining)
                        count_vm_events(COMPACTPAGEFAILED, nr_remaining);

                /* Release LRU pages not migrated */
                if (!list_empty(&cc->migratepages)) {
                        putback_lru_pages(&cc->migratepages);
                        cc->nr_migratepages = 0;
                }

        }

Where is it isolating 128MB?

> That
> sounds risky, too_many_isolated() can easily be true, which will stall
> direct reclaim processes. I'm not seeing how exactly it makes
> compaction itself stall infinitely though.
> 
> > > Also, the other process does help the loop to exit, but again it enters
> > > the loop and the compaction is never finished. That is, the process
> > > looks like hanging. Is this intended behaviour?
> > 
> > Infinite loops are never intended behaviour.
> 
> Yup.
> 

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab


* Re: compaction: trying to understand the code
  2010-08-19  8:15             ` Mel Gorman
@ 2010-08-19  8:29               ` Wu Fengguang
  0 siblings, 0 replies; 33+ messages in thread
From: Wu Fengguang @ 2010-08-19  8:29 UTC (permalink / raw)
  To: Mel Gorman; +Cc: Iram Shahzad, linux-mm, KOSAKI Motohiro

On Thu, Aug 19, 2010 at 04:15:54PM +0800, Mel Gorman wrote:
> On Thu, Aug 19, 2010 at 04:08:31PM +0800, Wu Fengguang wrote:
> > On Thu, Aug 19, 2010 at 03:46:02PM +0800, Mel Gorman wrote:
> > > On Thu, Aug 19, 2010 at 04:09:38PM +0900, Iram Shahzad wrote:
> > > >> The loop should be waiting for the _other_ processes (doing direct
> > > >> reclaims) to proceed.  When there are _lots of_ ongoing page
> > > >> allocations/reclaims, it makes sense to wait for them to calm down a bit?
> > > >
> > > > I have noticed that if I run other process, it helps the loop to exit.
> > > > So is this (ie hanging until other process helps) intended behaviour?
> > > >
> > > 
> > > No, it's not but I'm not immediately seeing how it would occur either.
> > > too_many_isolated() should only be true when there are multiple
> > > processes running that are isolating pages be it due to reclaim or
> > > compaction. These should be finishing their work after some time so
> > > while a process may stall in too_many_isolated(), it should not stay
> > > there forever.
> > > 
> > > The loop around isolate_migratepages() puts back LRU pages it failed to
> > > migrate so it's not the case that the compacting process is isolating a
> > > large number of pages and then calling too_many_isolated() against itself.
> > 
> > It seems the compaction process isolates 128MB pages at a time?
> 
> It should be one pageblock at a time for source migration and one pageblock
> for target pages. Look at the values for low_pfn and end_pfn here;

Ah, sorry! I confused it with the section size.

Thanks,
Fengguang


* Re: compaction: trying to understand the code
  2010-08-19  7:09       ` Iram Shahzad
  2010-08-19  7:45         ` Wu Fengguang
  2010-08-19  7:46         ` Mel Gorman
@ 2010-08-19 16:00         ` Minchan Kim
  2010-08-20  5:31           ` Iram Shahzad
  2 siblings, 1 reply; 33+ messages in thread
From: Minchan Kim @ 2010-08-19 16:00 UTC (permalink / raw)
  To: Iram Shahzad; +Cc: Wu Fengguang, Mel Gorman, linux-mm, KOSAKI Motohiro

On Thu, Aug 19, 2010 at 04:09:38PM +0900, Iram Shahzad wrote:
> >The loop should be waiting for the _other_ processes (doing direct
> >reclaims) to proceed.  When there are _lots of_ ongoing page
> >allocations/reclaims, it makes sense to wait for them to calm down a bit?
> 
> I have noticed that if I run other process, it helps the loop to exit.
> So is this (ie hanging until other process helps) intended behaviour?
> 
> Also, the other process does help the loop to exit, but again it enters
> the loop and the compaction is never finished. That is, the process
> looks like hanging. Is this intended behaviour?
> What will improve this situation?
> 
I don't know why too many pages are isolated.
Could you apply the patch below for debugging and report the result?

diff --git a/mm/compaction.c b/mm/compaction.c
index 94cce51..17f339f 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -215,6 +215,7 @@ static void acct_isolated(struct zone *zone, struct compact_control *cc)
 static bool too_many_isolated(struct zone *zone)
 {
 
+       int overflow = 0;
        unsigned long inactive, isolated;
 
        inactive = zone_page_state(zone, NR_INACTIVE_FILE) +
@@ -222,7 +223,13 @@ static bool too_many_isolated(struct zone *zone)
        isolated = zone_page_state(zone, NR_ISOLATED_FILE) +
                                        zone_page_state(zone, NR_ISOLATED_ANON);
 
-       return isolated > inactive;
+       if (isolated > inactive)
+               overflow = 1;
+
+       if (overflow)
+               show_mem();     
+
+       return overflow;
 }


> Thanks
> Iram

-- 
Kind regards,
Minchan Kim


* Re: compaction: trying to understand the code
  2010-08-19 16:00         ` Minchan Kim
@ 2010-08-20  5:31           ` Iram Shahzad
  2010-08-20  5:34             ` Wu Fengguang
  0 siblings, 1 reply; 33+ messages in thread
From: Iram Shahzad @ 2010-08-20  5:31 UTC (permalink / raw)
  To: Minchan Kim; +Cc: Wu Fengguang, Mel Gorman, linux-mm, KOSAKI Motohiro

[-- Attachment #1: Type: text/plain, Size: 230 bytes --]

> Could you apply below patch for debugging and report it?

The Mem-info gets printed forever. So I have picked the first 2 of them
and then another 2 after some time. These 4 Mem-infos are shown in
the attached log.

Thanks
Iram

[-- Attachment #2: too_many_isolated_log.txt --]
[-- Type: text/plain, Size: 4292 bytes --]

Mem-info:
Normal per-cpu:
CPU    0: hi:  186, btch:  31 usd: 184
active_anon:40345 inactive_anon:0 isolated_anon:8549
 active_file:2713 inactive_file:10418 isolated_file:1871
 unevictable:0 dirty:0 writeback:0 unstable:0
 free:53713 slab_reclaimable:533 slab_unreclaimable:1076
 mapped:9461 shmem:2349 pagetables:1574 bounce:0
Normal free:214852kB min:2884kB low:3604kB high:4324kB active_anon:161380kB inactive_anon:0kB active_file:10852kB inactive_file:41672kB unevictable:0kB isolated(anon):34196kB isolated(file):7484kB present:520192kB mlocked:0kB dirty:0kB writeback:0kB mapped:37844kB shmem:9396kB slab_reclaimable:2132kB slab_unreclaimable:4304kB kernel_stack:1880kB pagetables:6296kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
Normal: 31*4kB 29*8kB 20*16kB 23*32kB 21*64kB 19*128kB 19*256kB 20*512kB 20*1024kB 3*2048kB 41*4096kB = 214852kB
15491 total pagecache pages
131072 pages of RAM
54242 free pages
18897 reserved pages
1609 slab pages
84316 pages shared
0 pages swap cached
Mem-info:
Normal per-cpu:
CPU    0: hi:  186, btch:  31 usd: 184
active_anon:40345 inactive_anon:0 isolated_anon:8549
 active_file:2713 inactive_file:10418 isolated_file:1871
 unevictable:0 dirty:0 writeback:0 unstable:0
 free:53713 slab_reclaimable:533 slab_unreclaimable:1076
 mapped:9461 shmem:2349 pagetables:1574 bounce:0
Normal free:214852kB min:2884kB low:3604kB high:4324kB active_anon:161380kB inactive_anon:0kB active_file:10852kB inactive_file:41672kB unevictable:0kB isolated(anon):34196kB isolated(file):7484kB present:520192kB mlocked:0kB dirty:0kB writeback:0kB mapped:37844kB shmem:9396kB slab_reclaimable:2132kB slab_unreclaimable:4304kB kernel_stack:1880kB pagetables:6296kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
Normal: 26*4kB 27*8kB 19*16kB 22*32kB 20*64kB 19*128kB 19*256kB 20*512kB 20*1024kB 3*2048kB 41*4096kB = 214704kB
15491 total pagecache pages
131072 pages of RAM
54258 free pages
18897 reserved pages
1609 slab pages
84296 pages shared
0 pages swap cached


[snip]


Mem-info:
Normal per-cpu:
CPU    0: hi:  186, btch:  31 usd: 100
active_anon:40429 inactive_anon:0 isolated_anon:8581
 active_file:2719 inactive_file:10423 isolated_file:1871
 unevictable:0 dirty:0 writeback:0 unstable:0
 free:53777 slab_reclaimable:534 slab_unreclaimable:1070
 mapped:9461 shmem:2349 pagetables:1574 bounce:0
Normal free:215108kB min:2884kB low:3604kB high:4324kB active_anon:161716kB inactive_anon:0kB active_file:10876kB inactive_file:41692kB unevictable:0kB isolated(anon):34324kB isolated(file):7484kB present:520192kB mlocked:0kB dirty:0kB writeback:0kB mapped:37844kB shmem:9396kB slab_reclaimable:2136kB slab_unreclaimable:4280kB kernel_stack:1872kB pagetables:6296kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
Normal: 31*4kB 29*8kB 20*16kB 21*32kB 22*64kB 19*128kB 20*256kB 20*512kB 20*1024kB 3*2048kB 41*4096kB = 215108kB
15491 total pagecache pages
131072 pages of RAM
54221 free pages
18897 reserved pages
1604 slab pages
84289 pages shared
0 pages swap cached
Mem-info:
Normal per-cpu:
CPU    0: hi:  186, btch:  31 usd: 100
active_anon:40429 inactive_anon:0 isolated_anon:8581
 active_file:2719 inactive_file:10423 isolated_file:1871
 unevictable:0 dirty:0 writeback:0 unstable:0
 free:53777 slab_reclaimable:534 slab_unreclaimable:1070
 mapped:9461 shmem:2349 pagetables:1574 bounce:0
Normal free:215108kB min:2884kB low:3604kB high:4324kB active_anon:161716kB inactive_anon:0kB active_file:10876kB inactive_file:41692kB unevictable:0kB isolated(anon):34324kB isolated(file):7484kB present:520192kB mlocked:0kB dirty:0kB writeback:0kB mapped:37844kB shmem:9396kB slab_reclaimable:2136kB slab_unreclaimable:4280kB kernel_stack:1872kB pagetables:6296kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
Normal: 31*4kB 29*8kB 20*16kB 21*32kB 22*64kB 19*128kB 20*256kB 20*512kB 20*1024kB 3*2048kB 41*4096kB = 215108kB
15491 total pagecache pages
131072 pages of RAM
54222 free pages
18897 reserved pages
1603 slab pages
84289 pages shared
0 pages swap cached


* Re: compaction: trying to understand the code
  2010-08-20  5:31           ` Iram Shahzad
@ 2010-08-20  5:34             ` Wu Fengguang
  2010-08-20  9:35               ` Mel Gorman
  0 siblings, 1 reply; 33+ messages in thread
From: Wu Fengguang @ 2010-08-20  5:34 UTC (permalink / raw)
  To: Iram Shahzad; +Cc: Minchan Kim, Mel Gorman, linux-mm, KOSAKI Motohiro

You do run lots of tasks: kernel_stack=1880kB.

And you have lots of free memory, so page reclaim has never run and
inactive_anon=0. This is where compaction differs from vmscan: in
vmscan, inactive_anon is reasonably large and is only compared
directly with isolated_anon.

Thanks,
Fengguang

On Fri, Aug 20, 2010 at 01:31:03PM +0800, Iram Shahzad wrote:
> > Could you apply below patch for debugging and report it?
> 
> The Mem-info gets printed forever. So I have picked the first 2 of them
> and then another 2 after some time. These 4 Mem-infos are shown in
> the attached log.
> 
> Thanks
> Iram

> Mem-info:
> Normal per-cpu:
> CPU    0: hi:  186, btch:  31 usd: 184
> active_anon:40345 inactive_anon:0 isolated_anon:8549
>  active_file:2713 inactive_file:10418 isolated_file:1871
>  unevictable:0 dirty:0 writeback:0 unstable:0
>  free:53713 slab_reclaimable:533 slab_unreclaimable:1076
>  mapped:9461 shmem:2349 pagetables:1574 bounce:0
> Normal free:214852kB min:2884kB low:3604kB high:4324kB active_anon:161380kB inactive_anon:0kB active_file:10852kB inactive_file:41672kB unevictable:0kB isolated(anon):34196kB isolated(file):7484kB present:520192kB mlocked:0kB dirty:0kB writeback:0kB mapped:37844kB shmem:9396kB slab_reclaimable:2132kB slab_unreclaimable:4304kB kernel_stack:1880kB pagetables:6296kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 0 0
> Normal: 31*4kB 29*8kB 20*16kB 23*32kB 21*64kB 19*128kB 19*256kB 20*512kB 20*1024kB 3*2048kB 41*4096kB = 214852kB
> 15491 total pagecache pages
> 131072 pages of RAM
> 54242 free pages
> 18897 reserved pages
> 1609 slab pages
> 84316 pages shared
> 0 pages swap cached
> Mem-info:
> Normal per-cpu:
> CPU    0: hi:  186, btch:  31 usd: 184
> active_anon:40345 inactive_anon:0 isolated_anon:8549
>  active_file:2713 inactive_file:10418 isolated_file:1871
>  unevictable:0 dirty:0 writeback:0 unstable:0
>  free:53713 slab_reclaimable:533 slab_unreclaimable:1076
>  mapped:9461 shmem:2349 pagetables:1574 bounce:0
> Normal free:214852kB min:2884kB low:3604kB high:4324kB active_anon:161380kB inactive_anon:0kB active_file:10852kB inactive_file:41672kB unevictable:0kB isolated(anon):34196kB isolated(file):7484kB present:520192kB mlocked:0kB dirty:0kB writeback:0kB mapped:37844kB shmem:9396kB slab_reclaimable:2132kB slab_unreclaimable:4304kB kernel_stack:1880kB pagetables:6296kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 0 0
> Normal: 26*4kB 27*8kB 19*16kB 22*32kB 20*64kB 19*128kB 19*256kB 20*512kB 20*1024kB 3*2048kB 41*4096kB = 214704kB
> 15491 total pagecache pages
> 131072 pages of RAM
> 54258 free pages
> 18897 reserved pages
> 1609 slab pages
> 84296 pages shared
> 0 pages swap cached
> 
> 
> [snip]
> 
> 
> Mem-info:
> Normal per-cpu:
> CPU    0: hi:  186, btch:  31 usd: 100
> active_anon:40429 inactive_anon:0 isolated_anon:8581
>  active_file:2719 inactive_file:10423 isolated_file:1871
>  unevictable:0 dirty:0 writeback:0 unstable:0
>  free:53777 slab_reclaimable:534 slab_unreclaimable:1070
>  mapped:9461 shmem:2349 pagetables:1574 bounce:0
> Normal free:215108kB min:2884kB low:3604kB high:4324kB active_anon:161716kB inactive_anon:0kB active_file:10876kB inactive_file:41692kB unevictable:0kB isolated(anon):34324kB isolated(file):7484kB present:520192kB mlocked:0kB dirty:0kB writeback:0kB mapped:37844kB shmem:9396kB slab_reclaimable:2136kB slab_unreclaimable:4280kB kernel_stack:1872kB pagetables:6296kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 0 0
> Normal: 31*4kB 29*8kB 20*16kB 21*32kB 22*64kB 19*128kB 20*256kB 20*512kB 20*1024kB 3*2048kB 41*4096kB = 215108kB
> 15491 total pagecache pages
> 131072 pages of RAM
> 54221 free pages
> 18897 reserved pages
> 1604 slab pages
> 84289 pages shared
> 0 pages swap cached
> Mem-info:
> Normal per-cpu:
> CPU    0: hi:  186, btch:  31 usd: 100
> active_anon:40429 inactive_anon:0 isolated_anon:8581
>  active_file:2719 inactive_file:10423 isolated_file:1871
>  unevictable:0 dirty:0 writeback:0 unstable:0
>  free:53777 slab_reclaimable:534 slab_unreclaimable:1070
>  mapped:9461 shmem:2349 pagetables:1574 bounce:0
> Normal free:215108kB min:2884kB low:3604kB high:4324kB active_anon:161716kB inactive_anon:0kB active_file:10876kB inactive_file:41692kB unevictable:0kB isolated(anon):34324kB isolated(file):7484kB present:520192kB mlocked:0kB dirty:0kB writeback:0kB mapped:37844kB shmem:9396kB slab_reclaimable:2136kB slab_unreclaimable:4280kB kernel_stack:1872kB pagetables:6296kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 0 0
> Normal: 31*4kB 29*8kB 20*16kB 21*32kB 22*64kB 19*128kB 20*256kB 20*512kB 20*1024kB 3*2048kB 41*4096kB = 215108kB
> 15491 total pagecache pages
> 131072 pages of RAM
> 54222 free pages
> 18897 reserved pages
> 1603 slab pages
> 84289 pages shared
> 0 pages swap cached

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: compaction: trying to understand the code
  2010-08-19  7:46         ` Mel Gorman
  2010-08-19  8:08           ` Wu Fengguang
@ 2010-08-20  5:45           ` Iram Shahzad
  2010-08-20  5:50             ` Wu Fengguang
  1 sibling, 1 reply; 33+ messages in thread
From: Iram Shahzad @ 2010-08-20  5:45 UTC (permalink / raw)
  To: Mel Gorman; +Cc: Wu Fengguang, linux-mm, KOSAKI Motohiro

> What is your test scenario? Who or what has these pages isolated that is
> allowing too_many_isolated() to be true?

I have a test app that attempts to create fragmentation. Then I run
echo 1 > /proc/sys/vm/compact_memory
That is all.
The test app mallocs 2MB 100 times, memsets them.
Then it frees the even numbered 2MB blocks.
That is, 2MB*50 remains malloced and 2MB*50 gets freed.

Thanks
Iram


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: compaction: trying to understand the code
  2010-08-20  5:45           ` Iram Shahzad
@ 2010-08-20  5:50             ` Wu Fengguang
  2010-08-20  6:13               ` Iram Shahzad
  0 siblings, 1 reply; 33+ messages in thread
From: Wu Fengguang @ 2010-08-20  5:50 UTC (permalink / raw)
  To: Iram Shahzad; +Cc: Mel Gorman, linux-mm, KOSAKI Motohiro, Ying Han

On Fri, Aug 20, 2010 at 01:45:56PM +0800, Iram Shahzad wrote:
> > What is your test scenario? Who or what has these pages isolated that is
> > allowing too_many_isolated() to be true?
> 
> I have a test app that attempts to create fragmentation. Then I run
> echo 1 > /proc/sys/vm/compact_memory
> That is all.

That's all? Is your system otherwise idle? (for example, freshly booted
and not running many processes)

> The test app mallocs 2MB 100 times, memsets them.
> Then it frees the even numbered 2MB blocks.
> That is, 2MB*50 remains malloced and 2MB*50 gets freed.
 
We are interested in the test app, can you share it? :)

Thanks,
Fengguang

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: compaction: trying to understand the code
  2010-08-20  5:50             ` Wu Fengguang
@ 2010-08-20  6:13               ` Iram Shahzad
  0 siblings, 0 replies; 33+ messages in thread
From: Iram Shahzad @ 2010-08-20  6:13 UTC (permalink / raw)
  To: Wu Fengguang; +Cc: Mel Gorman, linux-mm, KOSAKI Motohiro, Ying Han

[-- Attachment #1: Type: text/plain, Size: 298 bytes --]

> That's all? Is your system otherwise idle? (for example, freshly booted
> and not running many processes)

Sorry, I didn't mean that. There are other processes running.
I just meant my test doesn't do anything else.

> We are interested in the test app, can you share it? :)

Attached.

Thanks
Iram

[-- Attachment #2: mfragprog.c --]
[-- Type: application/octet-stream, Size: 1368 bytes --]

#include <stdlib.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <asm/types.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdint.h>
#include <errno.h>
#include <string.h>

#define TRYNUMMAX (1024*50)
static void *p[TRYNUMMAX] = {(void *)1, };
static size_t size;
static int trynum;

static void mfrag(void)
{
	int i;

	fprintf(stderr, "size, trynum: %zu %d\n", size, trynum);

	for (i=0; i<trynum; i++) {
		p[i] = NULL;
	}

	for (i=0; i<trynum; i++) {
		p[i] = malloc(size);
		if (p[i]) {
			fprintf(stderr, "(%s:%s:%d) Success %zu %d %p\n", __FILE__, __FUNCTION__, __LINE__, size, i, p[i]);
			memset(p[i], 'a', size);
		}
		else {
			fprintf(stderr, "(%s:%s:%d) Fail %zu %d\n", __FILE__, __FUNCTION__, __LINE__, size, i);
			break;
		}
	}

	fprintf(stderr, "%d allocs done\n", i);

	for (i=0; i<trynum; i+=2) {
		if (p[i]) {
			free(p[i]);
			p[i] = NULL;
		}
	}

	fprintf(stderr, "frag done\n");
}

int main (int argc, char **argv)
{
	if (argc != 3) {
		fprintf(stderr, "usage: %s <size> <trynum>\n", argv[0]);
		exit(1);
	}

	size = atoi(argv[1]);
	trynum = atoi(argv[2]);
	if (trynum > TRYNUMMAX) {
		trynum = TRYNUMMAX;
	}
	
	mfrag();
	
	while (1) {
		fprintf(stdout, "(%s:%s:%d)\n", __FILE__, __FUNCTION__, __LINE__);
		sleep(3);
	}
}

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: compaction: trying to understand the code
  2010-08-20  5:34             ` Wu Fengguang
@ 2010-08-20  9:35               ` Mel Gorman
  2010-08-20 10:22                 ` Minchan Kim
  2010-08-20 10:23                 ` Wu Fengguang
  0 siblings, 2 replies; 33+ messages in thread
From: Mel Gorman @ 2010-08-20  9:35 UTC (permalink / raw)
  To: Wu Fengguang; +Cc: Iram Shahzad, Minchan Kim, linux-mm, KOSAKI Motohiro

On Fri, Aug 20, 2010 at 01:34:47PM +0800, Wu Fengguang wrote:
> You do run lots of tasks: kernel_stack=1880kB.
> 
> And you have lots of free memory, page reclaim has never run, so
> inactive_anon=0. This is where compaction is different from vmscan.
> In vmscan, inactive_anon is reasonably large, and will only be
> compared directly with isolated_anon.
> 

True, the key observation here was that compaction is being run via the
proc trigger. Normally it would be run as part of the direct reclaim
path when kswapd would already be awake. too_many_isolated() needs to be
different for compaction to take the whole system into account. What
would be the best alternative? Here is one possibility. A reasonable
alternative would be that when inactive < active that isolated can't be
more than num_online_cpus() * 2 (i.e. one compactor per online cpu).

diff --git a/mm/compaction.c b/mm/compaction.c
index 94cce51..1e000b7 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -215,14 +215,16 @@ static void acct_isolated(struct zone *zone, struct compact_control *cc)
 static bool too_many_isolated(struct zone *zone)
 {
 
-	unsigned long inactive, isolated;
+	unsigned long active, inactive, isolated;
 
+	active = zone_page_state(zone, NR_ACTIVE_FILE) +
+					zone_page_state(zone, NR_INACTIVE_ANON);
 	inactive = zone_page_state(zone, NR_INACTIVE_FILE) +
 					zone_page_state(zone, NR_INACTIVE_ANON);
 	isolated = zone_page_state(zone, NR_ISOLATED_FILE) +
 					zone_page_state(zone, NR_ISOLATED_ANON);
 
-	return isolated > inactive;
+	return (inactive > active) ? isolated > inactive : false;
 }
 
 /*

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: compaction: trying to understand the code
  2010-08-20  9:35               ` Mel Gorman
@ 2010-08-20 10:22                 ` Minchan Kim
  2010-08-22 15:31                   ` Minchan Kim
  2010-08-20 10:23                 ` Wu Fengguang
  1 sibling, 1 reply; 33+ messages in thread
From: Minchan Kim @ 2010-08-20 10:22 UTC (permalink / raw)
  To: Mel Gorman; +Cc: Wu Fengguang, Iram Shahzad, linux-mm, KOSAKI Motohiro

On Fri, Aug 20, 2010 at 6:35 PM, Mel Gorman <mel@csn.ul.ie> wrote:
> On Fri, Aug 20, 2010 at 01:34:47PM +0800, Wu Fengguang wrote:
>> You do run lots of tasks: kernel_stack=1880kB.
>>
>> And you have lots of free memory, page reclaim has never run, so
>> inactive_anon=0. This is where compaction is different from vmscan.
>> In vmscan, inactive_anon is reasonably large, and will only be
>> compared directly with isolated_anon.
>>
>
> True, the key observation here was that compaction is being run via the
> proc trigger. Normally it would be run as part of the direct reclaim
> path when kswapd would already be awake. too_many_isolated() needs to be
> different for compaction to take the whole system into account. What
> would be the best alternative? Here is one possibility. A reasonable
> alternative would be that when inactive < active that isolated can't be
> more than num_online_cpus() * 2 (i.e. one compactor per online cpu).
>
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 94cce51..1e000b7 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -215,14 +215,16 @@ static void acct_isolated(struct zone *zone, struct compact_control *cc)
>  static bool too_many_isolated(struct zone *zone)
>  {
>
> -       unsigned long inactive, isolated;
> +       unsigned long active, inactive, isolated;
>
> +       active = zone_page_state(zone, NR_ACTIVE_FILE) +
> +                                       zone_page_state(zone, NR_INACTIVE_ANON);
>        inactive = zone_page_state(zone, NR_INACTIVE_FILE) +
>                                        zone_page_state(zone, NR_INACTIVE_ANON);
>        isolated = zone_page_state(zone, NR_ISOLATED_FILE) +
>                                        zone_page_state(zone, NR_ISOLATED_ANON);
>
> -       return isolated > inactive;
> +       return (inactive > active) ? isolated > inactive : false;
>  }
>
>  /*
>

1. active : 1000 inactive : 1000
2. parallel reclaiming -> active : 1000 inactive : 500 isolated : 500
3. too_many_isolated returns false.

But in this case there are already many isolated pages, so it should
return true.

How about this?
too_many_isolated()
{
      return (isolated > nr_zones * nr_nodes * nr_online_cpu *
SWAP_CLUSTER_MAX);
}
-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: compaction: trying to understand the code
  2010-08-20  9:35               ` Mel Gorman
  2010-08-20 10:22                 ` Minchan Kim
@ 2010-08-20 10:23                 ` Wu Fengguang
  1 sibling, 0 replies; 33+ messages in thread
From: Wu Fengguang @ 2010-08-20 10:23 UTC (permalink / raw)
  To: Mel Gorman; +Cc: Iram Shahzad, Minchan Kim, linux-mm, KOSAKI Motohiro

On Fri, Aug 20, 2010 at 05:35:59PM +0800, Mel Gorman wrote:
> On Fri, Aug 20, 2010 at 01:34:47PM +0800, Wu Fengguang wrote:
> > You do run lots of tasks: kernel_stack=1880kB.
> > 
> > And you have lots of free memory, page reclaim has never run, so
> > inactive_anon=0. This is where compaction is different from vmscan.
> > In vmscan, inactive_anon is reasonably large, and will only be
> > compared directly with isolated_anon.
> > 
> 
> True, the key observation here was that compaction is being run via the
> proc trigger. Normally it would be run as part of the direct reclaim
> path when kswapd would already be awake. too_many_isolated() needs to be
> different for compaction to take the whole system into account. What
> would be the best alternative? Here is one possibility. A reasonable
> alternative would be that when inactive < active that isolated can't be
> more than num_online_cpus() * 2 (i.e. one compactor per online cpu).
>
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 94cce51..1e000b7 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -215,14 +215,16 @@ static void acct_isolated(struct zone *zone, struct compact_control *cc)
>  static bool too_many_isolated(struct zone *zone)
>  {
>  
> -	unsigned long inactive, isolated;
> +	unsigned long active, inactive, isolated;
>  
> +	active = zone_page_state(zone, NR_ACTIVE_FILE) +
> +					zone_page_state(zone, NR_INACTIVE_ANON);

s/NR_INACTIVE_ANON/NR_ACTIVE_ANON/

>  	inactive = zone_page_state(zone, NR_INACTIVE_FILE) +
>  					zone_page_state(zone, NR_INACTIVE_ANON);
>  	isolated = zone_page_state(zone, NR_ISOLATED_FILE) +
>  					zone_page_state(zone, NR_ISOLATED_ANON);
>  
> -	return isolated > inactive;
> +	return (inactive > active) ? isolated > inactive : false;

Note that for the anon LRU, inactive_ratio may be a large number.
(inactive > active) is not easy to satisfy, and is not stable even when
inactive_ratio=1.

Thanks,
Fengguang

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: compaction: trying to understand the code
  2010-08-20 10:22                 ` Minchan Kim
@ 2010-08-22 15:31                   ` Minchan Kim
  2010-08-22 23:23                     ` Wu Fengguang
  2010-08-23  7:16                     ` Mel Gorman
  0 siblings, 2 replies; 33+ messages in thread
From: Minchan Kim @ 2010-08-22 15:31 UTC (permalink / raw)
  To: Mel Gorman, Andrew Morton
  Cc: Wu Fengguang, Iram Shahzad, linux-mm, KOSAKI Motohiro

On Fri, Aug 20, 2010 at 07:22:16PM +0900, Minchan Kim wrote:
> On Fri, Aug 20, 2010 at 6:35 PM, Mel Gorman <mel@csn.ul.ie> wrote:
> > On Fri, Aug 20, 2010 at 01:34:47PM +0800, Wu Fengguang wrote:
> >> You do run lots of tasks: kernel_stack=1880kB.
> >>
> >> And you have lots of free memory, page reclaim has never run, so
> >> inactive_anon=0. This is where compaction is different from vmscan.
> >> In vmscan, inactive_anon is reasonably large, and will only be
> >> compared directly with isolated_anon.
> >>
> >
> > True, the key observation here was that compaction is being run via the
> > proc trigger. Normally it would be run as part of the direct reclaim
> > path when kswapd would already be awake. too_many_isolated() needs to be
> > different for compaction to take the whole system into account. What
> > would be the best alternative? Here is one possibility. A reasonable
> > alternative would be that when inactive < active that isolated can't be
> > more than num_online_cpus() * 2 (i.e. one compactor per online cpu).
> >
> > diff --git a/mm/compaction.c b/mm/compaction.c
> > index 94cce51..1e000b7 100644
> > --- a/mm/compaction.c
> > +++ b/mm/compaction.c
> > @@ -215,14 +215,16 @@ static void acct_isolated(struct zone *zone, struct compact_control *cc)
> >  static bool too_many_isolated(struct zone *zone)
> >  {
> >
> > -       unsigned long inactive, isolated;
> > +       unsigned long active, inactive, isolated;
> >
> > +       active = zone_page_state(zone, NR_ACTIVE_FILE) +
> > +                                       zone_page_state(zone, NR_INACTIVE_ANON);
> >        inactive = zone_page_state(zone, NR_INACTIVE_FILE) +
> >                                        zone_page_state(zone, NR_INACTIVE_ANON);
> >        isolated = zone_page_state(zone, NR_ISOLATED_FILE) +
> >                                        zone_page_state(zone, NR_ISOLATED_ANON);
> >
> > -       return isolated > inactive;
> > +       return (inactive > active) ? isolated > inactive : false;
> >  }
> >
> >  /*
> >
> 
> 1. active : 1000 inactive : 1000
> 2. parallel reclaiming -> active : 1000 inactive : 500 isolated : 500
> 3. too_many_isolated returns false.
> 
> But in this case there are already many isolated pages, so it should
> return true.
> 
> How about this?
> too_many_isolated()
> {
>       return (isolated > nr_zones * nr_nodes * nr_online_cpu *
> SWAP_CLUSTER_MAX);
> }

The above is no good at all.
How about this?

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: compaction: trying to understand the code
  2010-08-22 15:31                   ` Minchan Kim
@ 2010-08-22 23:23                     ` Wu Fengguang
  2010-08-23  1:58                       ` Minchan Kim
                                         ` (2 more replies)
  2010-08-23  7:16                     ` Mel Gorman
  1 sibling, 3 replies; 33+ messages in thread
From: Wu Fengguang @ 2010-08-22 23:23 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Mel Gorman, Andrew Morton, Iram Shahzad, linux-mm, KOSAKI Motohiro

> From: Minchan Kim <minchan.kim@gmail.com>
> Date: Mon, 23 Aug 2010 00:20:44 +0900
> Subject: [PATCH] compaction: handle active and inactive fairly in too_many_isolated
> 
> Iram reported that compaction's too_many_isolated() loops forever.
> (http://www.spinics.net/lists/linux-mm/msg08123.html)
> 
> The meminfo from when this happened shows inactive anon at zero.
> That is because the system had seen no memory pressure until then.
> While all anon pages were on the active LRU, compaction could select
> pages from the active LRU as well as the inactive LRU. That differs
> from vmscan's isolation, which is why we have two too_many_isolated()
> implementations.
> 
> While compaction can isolate pages from both the active and inactive
> lists, the current implementation of too_many_isolated() only considers
> inactive pages. That caused Iram's problem.
> 
> This patch treats active and inactive fairly, because we can't predict
> where compaction will isolate pages from or how many it will isolate.
> 
> This patch replaces (nr_isolated > nr_inactive) with
> nr_isolated > (nr_active + nr_inactive) / 2.

The change looks good, thanks. However I'm not sure if it's enough.

I wonder where the >40MB of isolated pages come from.  inactive_anon
remains 0 and free remains high over a long time, so it seems there
are no concurrent direct reclaims at all. Are the pages isolated by
the compaction process itself?

Thanks,
Fengguang

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: compaction: trying to understand the code
  2010-08-22 23:23                     ` Wu Fengguang
@ 2010-08-23  1:58                       ` Minchan Kim
  2010-08-23  3:03                         ` Iram Shahzad
  2010-08-23  7:18                       ` Mel Gorman
  2010-08-23 17:14                       ` Minchan Kim
  2 siblings, 1 reply; 33+ messages in thread
From: Minchan Kim @ 2010-08-23  1:58 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Mel Gorman, Andrew Morton, Iram Shahzad, linux-mm, KOSAKI Motohiro

On Mon, Aug 23, 2010 at 8:23 AM, Wu Fengguang <fengguang.wu@intel.com> wrote:
>> From: Minchan Kim <minchan.kim@gmail.com>
>> Date: Mon, 23 Aug 2010 00:20:44 +0900
>> Subject: [PATCH] compaction: handle active and inactive fairly in too_many_isolated
>>
>> Iram reported that compaction's too_many_isolated() loops forever.
>> (http://www.spinics.net/lists/linux-mm/msg08123.html)
>>
>> The meminfo from when this happened shows inactive anon at zero.
>> That is because the system had seen no memory pressure until then.
>> While all anon pages were on the active LRU, compaction could select
>> pages from the active LRU as well as the inactive LRU. That differs
>> from vmscan's isolation, which is why we have two too_many_isolated()
>> implementations.
>>
>> While compaction can isolate pages from both the active and inactive
>> lists, the current implementation of too_many_isolated() only considers
>> inactive pages. That caused Iram's problem.
>>
>> This patch treats active and inactive fairly, because we can't predict
>> where compaction will isolate pages from or how many it will isolate.
>>
>> This patch replaces (nr_isolated > nr_inactive) with
>> nr_isolated > (nr_active + nr_inactive) / 2.
>
> The change looks good, thanks. However I'm not sure if it's enough.

Thanks.

>
> I wonder where the >40MB of isolated pages come from.  inactive_anon
> remains 0 and free remains high over a long time, so it seems there
> are no concurrent direct reclaims at all. Are the pages isolated by
> the compaction process itself?

Agreed. I wonder too.

Currently compaction isolates pages in batches of 32 until it reaches
pageblock_nr_pages, so I can't understand how 40MB of isolated pages
came about.

Iram, how do you run test_app?

1) synchronous test
1.1 start test_app
1.2 wait for test_app to finish (i.e., wait until memory is fragmented)
1.3 echo 1 > /proc/sys/vm/compact_memory

2) asynchronous test
2.1 start test_app
2.2 do not wait for test_app to finish
2.3 echo 1 > /proc/sys/vm/compact_memory (maybe your test app and
compaction ran in parallel)

Which one is your scenario?


>
> Thanks,
> Fengguang
>



-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: compaction: trying to understand the code
  2010-08-23  1:58                       ` Minchan Kim
@ 2010-08-23  3:03                         ` Iram Shahzad
  2010-08-23  9:10                           ` Minchan Kim
  0 siblings, 1 reply; 33+ messages in thread
From: Iram Shahzad @ 2010-08-23  3:03 UTC (permalink / raw)
  To: Minchan Kim, Wu Fengguang
  Cc: Mel Gorman, Andrew Morton, linux-mm, KOSAKI Motohiro

> Iram, how do you run test_app?
>
> 1) synchronous test
> 1.1 start test_app
> 1.2 wait for test_app to finish (i.e., wait until memory is fragmented)
> 1.3 echo 1 > /proc/sys/vm/compact_memory
>
> 2) asynchronous test
> 2.1 start test_app
> 2.2 do not wait for test_app to finish
> 2.3 echo 1 > /proc/sys/vm/compact_memory (maybe your test app and
> compaction ran in parallel)

It's synchronous.
First I confirm that the test app has completed its fragmentation work
by looking at the printf output. Only then do I run
echo 1 > /proc/sys/vm/compact_memory.

After completing the fragmentation work, my test app sleeps in a useless
while loop, which I think is not important.

Thanks
Iram


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: compaction: trying to understand the code
  2010-08-22 15:31                   ` Minchan Kim
  2010-08-22 23:23                     ` Wu Fengguang
@ 2010-08-23  7:16                     ` Mel Gorman
  2010-08-23  9:07                       ` Minchan Kim
  1 sibling, 1 reply; 33+ messages in thread
From: Mel Gorman @ 2010-08-23  7:16 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Andrew Morton, Wu Fengguang, Iram Shahzad, linux-mm, KOSAKI Motohiro

On Mon, Aug 23, 2010 at 12:31:21AM +0900, Minchan Kim wrote:
> <SNIP>
> 
> From 560e8898295c663f02aede07b3d55880eba16c69 Mon Sep 17 00:00:00 2001
> From: Minchan Kim <minchan.kim@gmail.com>
> Date: Mon, 23 Aug 2010 00:20:44 +0900
> Subject: [PATCH] compaction: handle active and inactive fairly in too_many_isolated
> 
> Iram reported that compaction's too_many_isolated() loops forever.
> (http://www.spinics.net/lists/linux-mm/msg08123.html)
> 
> The meminfo from when this happened shows inactive anon at zero.
> That is because the system had seen no memory pressure until then.
> While all anon pages were on the active LRU, compaction could select
> pages from the active LRU as well as the inactive LRU. That differs
> from vmscan's isolation, which is why we have two too_many_isolated()
> implementations.
> 
> While compaction can isolate pages from both the active and inactive
> lists, the current implementation of too_many_isolated() only considers
> inactive pages. That caused Iram's problem.
> 
> This patch treats active and inactive fairly, because we can't predict
> where compaction will isolate pages from or how many it will isolate.
> 
> This patch replaces (nr_isolated > nr_inactive) with
> nr_isolated > (nr_active + nr_inactive) / 2.
> 
> Cc: Mel Gorman <mel@csn.ul.ie>
> Cc: Wu Fengguang <fengguang.wu@intel.com>
> Signed-off-by: Minchan Kim <minchan.kim@gmail.com>

Seems reasonable to me.

Acked-by: Mel Gorman <mel@csn.ul.ie>

Want to repost this as a standalone patch?

> ---
>  mm/compaction.c |    9 +++++----
>  1 files changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 94cce51..0864839 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -214,15 +214,16 @@ static void acct_isolated(struct zone *zone, struct compact_control *cc)
>  /* Similar to reclaim, but different enough that they don't share logic */
>  static bool too_many_isolated(struct zone *zone)
>  {
> -
> -       unsigned long inactive, isolated;
> +       unsigned long active, inactive, isolated;
>  
>         inactive = zone_page_state(zone, NR_INACTIVE_FILE) +
>                                         zone_page_state(zone, NR_INACTIVE_ANON);
> +       active = zone_page_state(zone, NR_ACTIVE_FILE) +
> +                                       zone_page_state(zone, NR_ACTIVE_ANON);
>         isolated = zone_page_state(zone, NR_ISOLATED_FILE) +
>                                         zone_page_state(zone, NR_ISOLATED_ANON);
> -
> -       return isolated > inactive;
> +
> +       return isolated > (inactive + active) / 2;
>  }
> 

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: compaction: trying to understand the code
  2010-08-22 23:23                     ` Wu Fengguang
  2010-08-23  1:58                       ` Minchan Kim
@ 2010-08-23  7:18                       ` Mel Gorman
  2010-08-23 17:14                       ` Minchan Kim
  2 siblings, 0 replies; 33+ messages in thread
From: Mel Gorman @ 2010-08-23  7:18 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Minchan Kim, Andrew Morton, Iram Shahzad, linux-mm, KOSAKI Motohiro

On Mon, Aug 23, 2010 at 07:23:16AM +0800, Wu Fengguang wrote:
> > From: Minchan Kim <minchan.kim@gmail.com>
> > Date: Mon, 23 Aug 2010 00:20:44 +0900
> > Subject: [PATCH] compaction: handle active and inactive fairly in too_many_isolated
> > 
> > Iram reported that compaction's too_many_isolated() loops forever.
> > (http://www.spinics.net/lists/linux-mm/msg08123.html)
> > 
> > The meminfo from when this happened shows inactive anon at zero.
> > That is because the system had seen no memory pressure until then.
> > While all anon pages were on the active LRU, compaction could select
> > pages from the active LRU as well as the inactive LRU. That differs
> > from vmscan's isolation, which is why we have two too_many_isolated()
> > implementations.
> > 
> > While compaction can isolate pages from both the active and inactive
> > lists, the current implementation of too_many_isolated() only considers
> > inactive pages. That caused Iram's problem.
> > 
> > This patch treats active and inactive fairly, because we can't predict
> > where compaction will isolate pages from or how many it will isolate.
> > 
> > This patch replaces (nr_isolated > nr_inactive) with
> > nr_isolated > (nr_active + nr_inactive) / 2.
> 
> The change looks good, thanks. However I'm not sure if it's enough.
> 
> > I wonder where the >40MB of isolated pages come from.  inactive_anon
> remains 0 and free remains high over a long time, so it seems there
> are no concurrent direct reclaims at all.

When the proc trigger is used, it is not necessary for there to be any
memory pressure for compaction to be running. It's unlikely in this case
that there are many processes direct compacting but the check should
still not potentially loop.

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: compaction: trying to understand the code
  2010-08-23  7:16                     ` Mel Gorman
@ 2010-08-23  9:07                       ` Minchan Kim
  0 siblings, 0 replies; 33+ messages in thread
From: Minchan Kim @ 2010-08-23  9:07 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Andrew Morton, Wu Fengguang, Iram Shahzad, linux-mm, KOSAKI Motohiro

On Mon, Aug 23, 2010 at 4:16 PM, Mel Gorman <mel@csn.ul.ie> wrote:
> On Mon, Aug 23, 2010 at 12:31:21AM +0900, Minchan Kim wrote:
>> <SNIP>
>>
>> From 560e8898295c663f02aede07b3d55880eba16c69 Mon Sep 17 00:00:00 2001
>> From: Minchan Kim <minchan.kim@gmail.com>
>> Date: Mon, 23 Aug 2010 00:20:44 +0900
>> Subject: [PATCH] compaction: handle active and inactive fairly in too_many_isolated
>>
>> Iram reported that compaction's too_many_isolated loops forever.
>> (http://www.spinics.net/lists/linux-mm/msg08123.html)
>>
>> The meminfo at the time showed that inactive anon was zero.
>> That's because the system had no memory pressure until then.
>> While all anon pages were on the active lru, compaction could select
>> pages from the active lru as well as the inactive lru. That differs
>> from vmscan's isolation, so we have two too_many_isolated functions.
>>
>> While compaction can isolate pages from both active and inactive,
>> the current implementation of too_many_isolated only considers inactive.
>> That caused Iram's problem.
>>
>> This patch treats active and inactive fairly.
>> That's because we can't predict from where and how many pages
>> compaction will isolate.
>>
>> This patch changes (nr_isolated > nr_inactive) to
>> nr_isolated > (nr_active + nr_inactive) / 2.
>>
>> Cc: Mel Gorman <mel@csn.ul.ie>
>> Cc: Wu Fengguang <fengguang.wu@intel.com>
>> Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
>
> Seems reasonable to me.
>
> Acked-by: Mel Gorman <mel@csn.ul.ie>

Thanks.

>
> Want to repost this as a standalone patch?

Yes. It is enough to stand alone.
I will repost the patch with the part about Iram's report removed.

We still need to dig into Iram's problem regardless of this patch.

-- 
Kind regards,
Minchan Kim


* Re: compaction: trying to understand the code
  2010-08-23  3:03                         ` Iram Shahzad
@ 2010-08-23  9:10                           ` Minchan Kim
  2010-08-26  8:51                             ` Mel Gorman
  0 siblings, 1 reply; 33+ messages in thread
From: Minchan Kim @ 2010-08-23  9:10 UTC (permalink / raw)
  To: Iram Shahzad
  Cc: Wu Fengguang, Mel Gorman, Andrew Morton, linux-mm, KOSAKI Motohiro

On Mon, Aug 23, 2010 at 12:03 PM, Iram Shahzad
<iram.shahzad@jp.fujitsu.com> wrote:
>> Iram. How do you execute test_app?
>>
>> 1) synchronous test
>> 1.1 start test_app
>> 1.2 wait test_app job done (ie, wait memory is fragment)
>> 1.3 echo 1 > /proc/sys/vm/compact_memory
>>
>> 2) asynchronous test
>> 2.1 start test_app
>> 2.2 not wait test_app job done
>> 2.3 echo 1 > /proc/sys/vm/compact_memory(Maybe your test app and
>> compaction were executed parallel)
>
> It's synchronous.
> First I confirm that the test app has completed its fragmentation work
> by looking at the printf output. Then only I run echo 1 >
> /proc/sys/vm/compact_memory.
>
> After completing fragmentation work, my test app sleeps in a useless while
> loop
> which I think is not important.

Thanks. It seems no other process is entering direct reclaim.
I tested your test_app but failed to reproduce your problem.
Actually I suspected a leak where NR_ISOLATE_XXX is never decreased,
but my system worked well.
I also couldn't find such a spot by code review alone. If there really
were one, Mel would have found it during his stress tests.

Hmm.. Mystery.
Maybe we need some tracepoints to debug this.
-- 
Kind regards,
Minchan Kim


* Re: compaction: trying to understand the code
  2010-08-22 23:23                     ` Wu Fengguang
  2010-08-23  1:58                       ` Minchan Kim
  2010-08-23  7:18                       ` Mel Gorman
@ 2010-08-23 17:14                       ` Minchan Kim
  2010-08-24  0:27                         ` Wu Fengguang
  2 siblings, 1 reply; 33+ messages in thread
From: Minchan Kim @ 2010-08-23 17:14 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Mel Gorman, Andrew Morton, Iram Shahzad, linux-mm, KOSAKI Motohiro

On Mon, Aug 23, 2010 at 07:23:16AM +0800, Wu Fengguang wrote:
> > From: Minchan Kim <minchan.kim@gmail.com>
> > Date: Mon, 23 Aug 2010 00:20:44 +0900
> > Subject: [PATCH] compaction: handle active and inactive fairly in too_many_isolated
> > 
> > Iram reported that compaction's too_many_isolated loops forever.
> > (http://www.spinics.net/lists/linux-mm/msg08123.html)
> > 
> > The meminfo at the time showed that inactive anon was zero.
> > That's because the system had no memory pressure until then.
> > While all anon pages were on the active lru, compaction could select
> > pages from the active lru as well as the inactive lru. That differs
> > from vmscan's isolation, so we have two too_many_isolated functions.
> > 
> > While compaction can isolate pages from both active and inactive,
> > the current implementation of too_many_isolated only considers inactive.
> > That caused Iram's problem.
> > 
> > This patch treats active and inactive fairly.
> > That's because we can't predict from where and how many pages
> > compaction will isolate.
> > 
> > This patch changes (nr_isolated > nr_inactive) to
> > nr_isolated > (nr_active + nr_inactive) / 2.
> 
> The change looks good, thanks. However I'm not sure if it's enough.
> 
> I wonder where the >40MB isolated pages come about.  inactive_anon
> remains 0 and free remains high over a long time, so it seems there
> are no concurrent direct reclaims at all. Are the pages isolated by
> the compaction process itself?

I think it can't happen without kswapd or direct reclaim.
But I think direct reclaim doesn't happen because Iram had no activity on the
system at that time. So I am just guessing at the following scenario.

1. trigger compaction by proc
2. isolate some pages and then migrate_pages
3. migrate_pages calls cond_resched
4. someone needs a big page (I am not sure about this part)
5. kswapd: shrink anon active list due to inactive_anon_is_low
6. kswapd: isolate_lru_pages for order > 0 (e.g., 0.5M page) so 0.5M * 32 = 16M are isolated
7. kswapd: shrink_zone: shrink anon active list due to inactive_anon_is_low
8. kswapd: isolate_lru_pages for order > 0 (e.g., 0.5M page) so 0.5M * 32 are isolated again.

Does it make sense?
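
For reference, the "0.5M * 32 = 16M" figure in the kswapd steps above can
be reproduced with a small user-space sketch. It is only an illustration:
the 4 KiB page size and the batch size of 32 are assumptions, the helper
names are made up, and the result is an upper bound that assumes every
scanned page drags in its whole surrounding block.

```c
/* Assumed constants; neither is taken from the attachments in this thread. */
#define PAGE_SIZE_BYTES   4096UL  /* assumption: 4 KiB base pages */
#define SWAP_CLUSTER_MAX  32UL    /* the "32" in the scenario above */

/* Pages in one high-order block, e.g. order 7 -> 128 pages (0.5 MiB at 4 KiB). */
static unsigned long pages_per_block(unsigned int order)
{
	return 1UL << order;
}

/*
 * Upper bound on bytes isolated by one batch, assuming each of the
 * SWAP_CLUSTER_MAX scanned pages pulls in a full block of the given order.
 */
static unsigned long max_isolated_bytes(unsigned int order)
{
	return SWAP_CLUSTER_MAX * pages_per_block(order) * PAGE_SIZE_BYTES;
}
```

With order 7 this gives 32 * 128 * 4096 bytes, i.e. 16 MiB per batch.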

-- 
Kind regards,
Minchan Kim


* Re: compaction: trying to understand the code
  2010-08-23 17:14                       ` Minchan Kim
@ 2010-08-24  0:27                         ` Wu Fengguang
  2010-08-24  5:07                           ` Iram Shahzad
  0 siblings, 1 reply; 33+ messages in thread
From: Wu Fengguang @ 2010-08-24  0:27 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Mel Gorman, Andrew Morton, Iram Shahzad, linux-mm, KOSAKI Motohiro

On Tue, Aug 24, 2010 at 01:14:16AM +0800, Minchan Kim wrote:
> On Mon, Aug 23, 2010 at 07:23:16AM +0800, Wu Fengguang wrote:
> > > From: Minchan Kim <minchan.kim@gmail.com>
> > > Date: Mon, 23 Aug 2010 00:20:44 +0900
> > > Subject: [PATCH] compaction: handle active and inactive fairly in too_many_isolated
> > > 
> > > Iram reported that compaction's too_many_isolated loops forever.
> > > (http://www.spinics.net/lists/linux-mm/msg08123.html)
> > > 
> > > The meminfo at the time showed that inactive anon was zero.
> > > That's because the system had no memory pressure until then.
> > > While all anon pages were on the active lru, compaction could select
> > > pages from the active lru as well as the inactive lru. That differs
> > > from vmscan's isolation, so we have two too_many_isolated functions.
> > > 
> > > While compaction can isolate pages from both active and inactive,
> > > the current implementation of too_many_isolated only considers inactive.
> > > That caused Iram's problem.
> > > 
> > > This patch treats active and inactive fairly.
> > > That's because we can't predict from where and how many pages
> > > compaction will isolate.
> > > 
> > > This patch changes (nr_isolated > nr_inactive) to
> > > nr_isolated > (nr_active + nr_inactive) / 2.
> > 
> > The change looks good, thanks. However I'm not sure if it's enough.
> > 
> > I wonder where the >40MB isolated pages come about.  inactive_anon
> > remains 0 and free remains high over a long time, so it seems there
> > are no concurrent direct reclaims at all. Are the pages isolated by
> > the compaction process itself?
> 
> I think it can't happen without kswapd or direct reclaim.
> But I think direct reclaim doesn't happen because Iram had no activity on the
> system at that time. So I am just guessing at the following scenario.
> 
> 1. trigger compaction by proc
> 2. isolate some pages and then migrate_pages
> 3. migrate_pages calls cond_resched
> 4. someone needs a big page (I am not sure about this part)
> 5. kswapd: shrink anon active list due to inactive_anon_is_low
> 6. kswapd: isolate_lru_pages for order > 0 (e.g., 0.5M page) so 0.5M * 32 = 16M are isolated
> 7. kswapd: shrink_zone: shrink anon active list due to inactive_anon_is_low
> 8. kswapd: isolate_lru_pages for order > 0 (e.g., 0.5M page) so 0.5M * 32 are isolated again.
> 
> Does it make sense?

One question is, why won't kswapd proceed after isolating all the pages?
If it were done with the isolated pages, we would see growing inactive_anon
numbers.

/proc/vmstat should give more clues on any possible page reclaim
activities. Iram, would you help post it?
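
To compare individual counters across several /proc/vmstat snapshots, a
small helper along these lines may be handy. It is a user-space sketch,
not kernel code; the function name vmstat_lookup is made up, and it only
assumes the "name value" one-per-line layout of /proc/vmstat.

```c
#include <stdio.h>
#include <string.h>

/*
 * Look up one counter in /proc/vmstat-style text ("name value" per line).
 * Returns 0 and stores the value on success, -1 if the name is absent
 * or its value cannot be parsed.
 */
static int vmstat_lookup(const char *text, const char *name, unsigned long *value)
{
	size_t len = strlen(name);
	const char *line = text;

	while (line && *line) {
		/* Require the full name followed by a space, not just a prefix. */
		if (strncmp(line, name, len) == 0 && line[len] == ' ')
			return sscanf(line + len, "%lu", value) == 1 ? 0 : -1;

		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}
```

The text can come from a saved snapshot file or from reading /proc/vmstat
into a buffer; counters such as nr_isolated_anon, pgscan_kswapd_normal and
pgscan_direct_normal are the ones relevant to the question above.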

Thanks,
Fengguang


* Re: compaction: trying to understand the code
  2010-08-24  0:27                         ` Wu Fengguang
@ 2010-08-24  5:07                           ` Iram Shahzad
  2010-08-24  6:52                             ` Minchan Kim
  0 siblings, 1 reply; 33+ messages in thread
From: Iram Shahzad @ 2010-08-24  5:07 UTC (permalink / raw)
  To: Wu Fengguang, Minchan Kim
  Cc: Mel Gorman, Andrew Morton, linux-mm, KOSAKI Motohiro

[-- Attachment #1: Type: text/plain, Size: 806 bytes --]

> One question is, why won't kswapd proceed after isolating all the pages?
> If it were done with the isolated pages, we would see growing inactive_anon
> numbers.
> 
> /proc/vmstat should give more clues on any possible page reclaim
> activities. Iram, would you help post it?

I am not sure which point in time you are interested in, so I am
attaching /proc/vmstat logs taken at 3 points.

too_many_isolated_vmstat_before_frag.txt
   This one is taken before I ran my test app which attempts
   to make fragmentation
too_many_isolated_vmstat_before_compaction.txt
   This one is taken after running the test app and before
   running compaction.
too_many_isolated_vmstat_during_compaction.txt
   This one is taken a few minutes after running compaction.
   To take this I ran compaction in background.

Thanks
Iram

[-- Attachment #2: too_many_isolated_vmstat_before_frag.txt --]
[-- Type: text/plain, Size: 1269 bytes --]

nr_free_pages 79896
nr_inactive_anon 0
nr_active_anon 14688
nr_inactive_file 10444
nr_active_file 2718
nr_unevictable 0
nr_mlock 0
nr_anon_pages 12341
nr_mapped 9430
nr_file_pages 15511
nr_dirty 0
nr_writeback 0
nr_slab_reclaimable 528
nr_slab_unreclaimable 1073
nr_page_table_pages 1479
nr_kernel_stack 235
nr_unstable 0
nr_bounce 0
nr_vmscan_write 0
nr_writeback_temp 0
nr_isolated_anon 0
nr_isolated_file 0
nr_shmem 2349
pgpgin 4
pgpgout 0
pswpin 0
pswpout 0
pgalloc_normal 54208
pgalloc_high 0
pgalloc_movable 0
pgfree 134220
pgactivate 2718
pgdeactivate 0
pgfault 88952
pgmajfault 555
pgrefill_normal 0
pgrefill_high 0
pgrefill_movable 0
pgsteal_normal 0
pgsteal_high 0
pgsteal_movable 0
pgscan_kswapd_normal 0
pgscan_kswapd_high 0
pgscan_kswapd_movable 0
pgscan_direct_normal 0
pgscan_direct_high 0
pgscan_direct_movable 0
pginodesteal 0
slabs_scanned 0
kswapd_steal 0
kswapd_inodesteal 0
pageoutrun 0
allocstall 0
pgrotated 0
compact_blocks_moved 0
compact_pages_moved 0
compact_pagemigrate_failed 0
compact_stall 0
compact_fail 0
compact_success 0
unevictable_pgs_culled 0
unevictable_pgs_scanned 0
unevictable_pgs_rescued 0
unevictable_pgs_mlocked 0
unevictable_pgs_munlocked 0
unevictable_pgs_cleared 0
unevictable_pgs_stranded 0
unevictable_pgs_mlockfreed 0

[-- Attachment #3: too_many_isolated_vmstat_before_compaction.txt --]
[-- Type: text/plain, Size: 1271 bytes --]

nr_free_pages 54098
nr_inactive_anon 0
nr_active_anon 40354
nr_inactive_file 10433
nr_active_file 2729
nr_unevictable 0
nr_mlock 0
nr_anon_pages 38007
nr_mapped 9469
nr_file_pages 15511
nr_dirty 0
nr_writeback 0
nr_slab_reclaimable 528
nr_slab_unreclaimable 1070
nr_page_table_pages 1582
nr_kernel_stack 236
nr_unstable 0
nr_bounce 0
nr_vmscan_write 0
nr_writeback_temp 0
nr_isolated_anon 0
nr_isolated_file 0
nr_shmem 2349
pgpgin 4
pgpgout 0
pswpin 0
pswpout 0
pgalloc_normal 105927
pgalloc_high 0
pgalloc_movable 0
pgfree 160167
pgactivate 2729
pgdeactivate 0
pgfault 141220
pgmajfault 555
pgrefill_normal 0
pgrefill_high 0
pgrefill_movable 0
pgsteal_normal 0
pgsteal_high 0
pgsteal_movable 0
pgscan_kswapd_normal 0
pgscan_kswapd_high 0
pgscan_kswapd_movable 0
pgscan_direct_normal 0
pgscan_direct_high 0
pgscan_direct_movable 0
pginodesteal 0
slabs_scanned 0
kswapd_steal 0
kswapd_inodesteal 0
pageoutrun 0
allocstall 0
pgrotated 0
compact_blocks_moved 0
compact_pages_moved 0
compact_pagemigrate_failed 0
compact_stall 0
compact_fail 0
compact_success 0
unevictable_pgs_culled 0
unevictable_pgs_scanned 0
unevictable_pgs_rescued 0
unevictable_pgs_mlocked 0
unevictable_pgs_munlocked 0
unevictable_pgs_cleared 0
unevictable_pgs_stranded 0
unevictable_pgs_mlockfreed 0

[-- Attachment #4: too_many_isolated_vmstat_during_compaction.txt --]
[-- Type: text/plain, Size: 1283 bytes --]

nr_free_pages 53673
nr_inactive_anon 0
nr_active_anon 40498
nr_inactive_file 10427
nr_active_file 2735
nr_unevictable 0
nr_mlock 0
nr_anon_pages 38151
nr_mapped 9469
nr_file_pages 15511
nr_dirty 0
nr_writeback 0
nr_slab_reclaimable 536
nr_slab_unreclaimable 1070
nr_page_table_pages 1588
nr_kernel_stack 237
nr_unstable 0
nr_bounce 0
nr_vmscan_write 0
nr_writeback_temp 0
nr_isolated_anon 8592
nr_isolated_file 1862
nr_shmem 2349
pgpgin 4
pgpgout 0
pswpin 0
pswpout 0
pgalloc_normal 117872
pgalloc_high 0
pgalloc_movable 0
pgfree 182402
pgactivate 2735
pgdeactivate 0
pgfault 182499
pgmajfault 555
pgrefill_normal 0
pgrefill_high 0
pgrefill_movable 0
pgsteal_normal 0
pgsteal_high 0
pgsteal_movable 0
pgscan_kswapd_normal 0
pgscan_kswapd_high 0
pgscan_kswapd_movable 0
pgscan_direct_normal 0
pgscan_direct_high 0
pgscan_direct_movable 0
pginodesteal 0
slabs_scanned 0
kswapd_steal 0
kswapd_inodesteal 0
pageoutrun 0
allocstall 0
pgrotated 0
compact_blocks_moved 327
compact_pages_moved 10454
compact_pagemigrate_failed 0
compact_stall 0
compact_fail 0
compact_success 0
unevictable_pgs_culled 0
unevictable_pgs_scanned 0
unevictable_pgs_rescued 0
unevictable_pgs_mlocked 0
unevictable_pgs_munlocked 0
unevictable_pgs_cleared 0
unevictable_pgs_stranded 0
unevictable_pgs_mlockfreed 0


* Re: compaction: trying to understand the code
  2010-08-24  5:07                           ` Iram Shahzad
@ 2010-08-24  6:52                             ` Minchan Kim
  2010-08-26  8:05                               ` Iram Shahzad
  0 siblings, 1 reply; 33+ messages in thread
From: Minchan Kim @ 2010-08-24  6:52 UTC (permalink / raw)
  To: Iram Shahzad
  Cc: Wu Fengguang, Mel Gorman, Andrew Morton, linux-mm, KOSAKI Motohiro

On Tue, Aug 24, 2010 at 2:07 PM, Iram Shahzad
<iram.shahzad@jp.fujitsu.com> wrote:
>> One question is, why won't kswapd proceed after isolating all the pages?
>> If it were done with the isolated pages, we would see growing inactive_anon
>> numbers.
>>
>> /proc/vmstat should give more clues on any possible page reclaim
>> activities. Iram, would you help post it?
>
> I am not sure which point of time are you interested in, so I am
> attaching /proc/vmstat log of 3 points.
>
> too_many_isolated_vmstat_before_frag.txt
>  This one is taken before I ran my test app which attempts
>  to make fragmentation
> too_many_isolated_vmstat_before_compaction.txt
>  This one is taken after running the test app and before
>  running compaction.
> too_many_isolated_vmstat_during_compaction.txt
>  This one is taken a few minutes after running compaction.
>  To take this I ran compaction in background.
>
> Thanks
> Iram
>

Hmm.. Reclaim never happens. Strange.
In addition, pgpgin is always 4.

pgpgin 4
pgpgout 0

Is that possible?
What kind of filesystem do you use?
Do you boot from NFS?
Does your system have any non-mainline (i.e., not merged into the Linux
kernel tree) driver, file system or other feature?

Maybe your config file can answer these questions.
Thanks.
-- 
Kind regards,
Minchan Kim


* Re: compaction: trying to understand the code
  2010-08-24  6:52                             ` Minchan Kim
@ 2010-08-26  8:05                               ` Iram Shahzad
  0 siblings, 0 replies; 33+ messages in thread
From: Iram Shahzad @ 2010-08-26  8:05 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Wu Fengguang, Mel Gorman, Andrew Morton, linux-mm, KOSAKI Motohiro

> What kind of filesystem do you use?
> Do you boot from NFS?
> Does your system have any non-mainline (i.e., not merged into the Linux
> kernel tree) driver, file system or other feature?


I do not boot from NFS.
My system does have non-mainline file system and drivers.
I thought file system and drivers were irrelevant to this problem,
are they?

Thanks
Iram



* Re: compaction: trying to understand the code
  2010-08-23  9:10                           ` Minchan Kim
@ 2010-08-26  8:51                             ` Mel Gorman
  0 siblings, 0 replies; 33+ messages in thread
From: Mel Gorman @ 2010-08-26  8:51 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Iram Shahzad, Wu Fengguang, Andrew Morton, linux-mm, KOSAKI Motohiro

On Mon, Aug 23, 2010 at 06:10:02PM +0900, Minchan Kim wrote:
> On Mon, Aug 23, 2010 at 12:03 PM, Iram Shahzad
> <iram.shahzad@jp.fujitsu.com> wrote:
> >> Iram. How do you execute test_app?
> >>
> >> 1) synchronous test
> >> 1.1 start test_app
> >> 1.2 wait test_app job done (ie, wait memory is fragment)
> >> 1.3 echo 1 > /proc/sys/vm/compact_memory
> >>
> >> 2) asynchronous test
> >> 2.1 start test_app
> >> 2.2 not wait test_app job done
> >> 2.3 echo 1 > /proc/sys/vm/compact_memory(Maybe your test app and
> >> compaction were executed parallel)
> >
> > It's synchronous.
> > First I confirm that the test app has completed its fragmentation work
> > by looking at the printf output. Then only I run echo 1 >
> > /proc/sys/vm/compact_memory.
> >
> > After completing fragmentation work, my test app sleeps in a useless while
> > loop
> > which I think is not important.
> 
> Thanks. It seems no other process is entering direct reclaim.
> I tested your test_app but failed to reproduce your problem.
> Actually I suspected a leak where NR_ISOLATE_XXX is never decreased,
> but my system worked well.
> I also couldn't find such a spot by code review alone. If there really
> were one, Mel would have found it during his stress tests.
> 

My test machines have been tied up, which has delayed me reviewing these
patches. I reran standardish compaction stress tests and didn't spot an
NR_ISOLATE_XXX leak. While none of those tests depend on the proc trigger,
they share the core logic, so I don't think we're looking at a leak issue;
all the difficulty is in too_many_isolated().
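
To make the difference between the two checks concrete, here is a
user-space sketch of both predicates as the patch description states
them. The struct and function names are hypothetical, and feeding it
the system-wide /proc/vmstat numbers is an assumption: the kernel check
works on per-zone counters, which may differ.

```c
#include <stdbool.h>

/* LRU and isolation counters in pages (hypothetical struct for illustration). */
struct lru_counts {
	unsigned long active_anon, inactive_anon;
	unsigned long active_file, inactive_file;
	unsigned long isolated_anon, isolated_file;
};

/* Check before the patch: isolated pages compared against inactive only. */
static bool too_many_isolated_old(const struct lru_counts *c)
{
	unsigned long inactive = c->inactive_anon + c->inactive_file;
	unsigned long isolated = c->isolated_anon + c->isolated_file;

	return isolated > inactive;
}

/* Check after the patch: isolated compared against half of active + inactive. */
static bool too_many_isolated_new(const struct lru_counts *c)
{
	unsigned long active = c->active_anon + c->active_file;
	unsigned long inactive = c->inactive_anon + c->inactive_file;
	unsigned long isolated = c->isolated_anon + c->isolated_file;

	return isolated > (active + inactive) / 2;
}
```

Plugging in Iram's during-compaction vmstat (active_anon 40498,
inactive_anon 0, active_file 2735, inactive_file 10427, isolated_anon
8592, isolated_file 1862): isolated is 10454 and inactive is 10427, so
the old check is true, while (43233 + 10427) / 2 = 26830 makes the new
check false.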

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab


end of thread, other threads:[~2010-08-26  8:51 UTC | newest]

Thread overview: 33+ messages
2010-08-17 11:08 compaction: trying to understand the code Iram Shahzad
2010-08-17 11:10 ` Mel Gorman
2010-08-18  8:19   ` Iram Shahzad
2010-08-18 15:41     ` Wu Fengguang
2010-08-19  7:09       ` Iram Shahzad
2010-08-19  7:45         ` Wu Fengguang
2010-08-19  7:46         ` Mel Gorman
2010-08-19  8:08           ` Wu Fengguang
2010-08-19  8:15             ` Mel Gorman
2010-08-19  8:29               ` Wu Fengguang
2010-08-20  5:45           ` Iram Shahzad
2010-08-20  5:50             ` Wu Fengguang
2010-08-20  6:13               ` Iram Shahzad
2010-08-19 16:00         ` Minchan Kim
2010-08-20  5:31           ` Iram Shahzad
2010-08-20  5:34             ` Wu Fengguang
2010-08-20  9:35               ` Mel Gorman
2010-08-20 10:22                 ` Minchan Kim
2010-08-22 15:31                   ` Minchan Kim
2010-08-22 23:23                     ` Wu Fengguang
2010-08-23  1:58                       ` Minchan Kim
2010-08-23  3:03                         ` Iram Shahzad
2010-08-23  9:10                           ` Minchan Kim
2010-08-26  8:51                             ` Mel Gorman
2010-08-23  7:18                       ` Mel Gorman
2010-08-23 17:14                       ` Minchan Kim
2010-08-24  0:27                         ` Wu Fengguang
2010-08-24  5:07                           ` Iram Shahzad
2010-08-24  6:52                             ` Minchan Kim
2010-08-26  8:05                               ` Iram Shahzad
2010-08-23  7:16                     ` Mel Gorman
2010-08-23  9:07                       ` Minchan Kim
2010-08-20 10:23                 ` Wu Fengguang
