linux-kernel.vger.kernel.org archive mirror
* [PATCH 1/1] mm/swap.c: flush lru_add pvecs on compound page arrival
@ 2016-06-08 14:35 Lukasz Odzioba
  2016-06-08 15:04 ` Michal Hocko
  2016-06-08 15:31 ` Dave Hansen
  0 siblings, 2 replies; 13+ messages in thread
From: Lukasz Odzioba @ 2016-06-08 14:35 UTC (permalink / raw)
  To: linux-kernel, linux-mm, akpm, kirill.shutemov, mhocko, aarcange,
	vdavydov, mingli199x, minchan
  Cc: dave.hansen, lukasz.anaczkowski, lukasz.odzioba

When the application does not exit cleanly (e.g. SIGTERM) we might
end up with some pages in lru_add_pvec, which is ok. With THP
enabled, huge pages may also end up on the per-CPU lru_add_pvecs.
On systems with a lot of processors we end up with quite a lot
of memory pending addition to the LRU lists - in the worst case
up to CPUS * PAGE_SIZE * PAGEVEC_SIZE, which on a machine
with 200+ CPUs means GBs in practice.

We are able to reproduce this problem with the following program:

void main() {
{
	size_t size = 55 * 1000 * 1000; // smaller than  MEM/CPUS
	void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
		MAP_PRIVATE | MAP_ANONYMOUS , -1, 0);
	if (p != MAP_FAILED)
		memset(p, 0, size);
	//munmap(p, size); // uncomment to make the problem go away
}
}

When we run it, it leaves a significant amount of memory on the pvecs.
This memory will not be reclaimed if we hit OOM, so when we run the
above program in a loop:
	$ for i in `seq 100`; do ./a.out; done
many processes (95% in my case) will be killed by the OOM killer.

This patch flushes lru_add_pvecs on compound page arrival making
the problem less severe - kill rate drops to 0%.

Suggested-by: Michal Hocko <mhocko@suse.com>
Tested-by: Lukasz Odzioba <lukasz.odzioba@intel.com>
Signed-off-by: Lukasz Odzioba <lukasz.odzioba@intel.com>
---
 mm/swap.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/mm/swap.c b/mm/swap.c
index 9591614..3fe4f18 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -391,9 +391,8 @@ static void __lru_cache_add(struct page *page)
 	struct pagevec *pvec = &get_cpu_var(lru_add_pvec);
 
 	get_page(page);
-	if (!pagevec_space(pvec))
+	if (!pagevec_add(pvec, page) || PageCompound(page))
 		__pagevec_lru_add(pvec);
-	pagevec_add(pvec, page);
 	put_cpu_var(lru_add_pvec);
 }
 
-- 
1.8.3.1


* Re: [PATCH 1/1] mm/swap.c: flush lru_add pvecs on compound page arrival
  2016-06-08 14:35 [PATCH 1/1] mm/swap.c: flush lru_add pvecs on compound page arrival Lukasz Odzioba
@ 2016-06-08 15:04 ` Michal Hocko
  2016-06-09  8:01   ` Odzioba, Lukasz
  2016-06-08 15:31 ` Dave Hansen
  1 sibling, 1 reply; 13+ messages in thread
From: Michal Hocko @ 2016-06-08 15:04 UTC (permalink / raw)
  To: Lukasz Odzioba
  Cc: linux-kernel, linux-mm, akpm, kirill.shutemov, aarcange,
	vdavydov, mingli199x, minchan, dave.hansen, lukasz.anaczkowski

On Wed 08-06-16 16:35:37, Lukasz Odzioba wrote:
> When the application does not exit cleanly (i.e. SIGTERM) we might

I do not see how a SIGTERM would make any difference. But see below.

> end up with some pages in lru_add_pvec, which is ok. With THP
> enabled huge pages may also end up on per cpu lru_add_pvecs.
> In the systems with a lot of processors we end up with quite a lot
> of memory pending for addition to LRU cache - in the worst case
> scenario up to CPUS * PAGE_SIZE * PAGEVEC_SIZE, which on machine
> with 200+CPUs means GBs in practice.

It is 56kB per CPU for normal pages which is not really that bad.
28MB for THP only cache is a lot though.

> We are able to reproduce this problem with the following program:
> 
> void main() {
> {
> 	size_t size = 55 * 1000 * 1000; // smaller than  MEM/CPUS
> 	void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
> 		MAP_PRIVATE | MAP_ANONYMOUS , -1, 0);
> 	if (p != MAP_FAILED)
> 		memset(p, 0, size);
> 	//munmap(p, size); // uncomment to make the problem go away

Is this really true? Both munmap and exit_mmap do the same
lru_add_drain() which flushes only the local CPU cache so munmap
shouldn't make any difference.

> }
> 
> When we run it it will leave significant amount of memory on pvecs.
> This memory will be not reclaimed if we hit OOM, so when we run
> above program in a loop:
> 	$ for i in `seq 100`; do ./a.out; done
> many processes (95% in my case) will be killed by OOM.
> 
> This patch flushes lru_add_pvecs on compound page arrival making
> the problem less severe - kill rate drops to 0%.

I believe this deserves more explanation. What do you think about the
following?
"
The primary point of the LRU add cache is to save the zone lru_lock
contention with a hope that more pages will belong to the same zone
and so their addition can be batched. The huge page is already a
form of batched addition (it will add 512 pages worth of memory in one go)
so skipping the batching seems like a safer option when compared to a
potential excess in the caching which can be quite large and much
harder to fix because lru_add_drain_all is way too expensive and
it is not really clear what would be a good moment to call it.
"

Does this sound better?

> 
> Suggested-by: Michal Hocko <mhocko@suse.com>
> Tested-by: Lukasz Odzioba <lukasz.odzioba@intel.com>
> Signed-off-by: Lukasz Odzioba <lukasz.odzioba@intel.com>
> ---
>  mm/swap.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/mm/swap.c b/mm/swap.c
> index 9591614..3fe4f18 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -391,9 +391,8 @@ static void __lru_cache_add(struct page *page)
>  	struct pagevec *pvec = &get_cpu_var(lru_add_pvec);
>  
>  	get_page(page);
> -	if (!pagevec_space(pvec))
> +	if (!pagevec_add(pvec, page) || PageCompound(page))
>  		__pagevec_lru_add(pvec);
> -	pagevec_add(pvec, page);
>  	put_cpu_var(lru_add_pvec);
>  }
>  
> -- 
> 1.8.3.1
> 

-- 
Michal Hocko
SUSE Labs


* Re: [PATCH 1/1] mm/swap.c: flush lru_add pvecs on compound page arrival
  2016-06-08 14:35 [PATCH 1/1] mm/swap.c: flush lru_add pvecs on compound page arrival Lukasz Odzioba
  2016-06-08 15:04 ` Michal Hocko
@ 2016-06-08 15:31 ` Dave Hansen
  2016-06-08 16:06   ` Michal Hocko
  2016-06-09  8:50   ` Odzioba, Lukasz
  1 sibling, 2 replies; 13+ messages in thread
From: Dave Hansen @ 2016-06-08 15:31 UTC (permalink / raw)
  To: Lukasz Odzioba, linux-kernel, linux-mm, akpm, kirill.shutemov,
	mhocko, aarcange, vdavydov, mingli199x, minchan
  Cc: lukasz.anaczkowski

On 06/08/2016 07:35 AM, Lukasz Odzioba wrote:
> diff --git a/mm/swap.c b/mm/swap.c
> index 9591614..3fe4f18 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -391,9 +391,8 @@ static void __lru_cache_add(struct page *page)
>  	struct pagevec *pvec = &get_cpu_var(lru_add_pvec);
>  
>  	get_page(page);
> -	if (!pagevec_space(pvec))
> +	if (!pagevec_add(pvec, page) || PageCompound(page))
>  		__pagevec_lru_add(pvec);
> -	pagevec_add(pvec, page);
>  	put_cpu_var(lru_add_pvec);
>  }

Lukasz,

Do we have any statistics that tell us how many pages are sitting in the
lru pvecs?  Although this helps the problem overall, don't we still have
a problem with memory being held in such an opaque place?

I think if we're going to be hacking around this area, we should also
add something to vmstat or zoneinfo to spell out how many of these
things there are.


* Re: [PATCH 1/1] mm/swap.c: flush lru_add pvecs on compound page arrival
  2016-06-08 15:31 ` Dave Hansen
@ 2016-06-08 16:06   ` Michal Hocko
  2016-06-08 16:34     ` Dave Hansen
  2016-06-09  8:50   ` Odzioba, Lukasz
  1 sibling, 1 reply; 13+ messages in thread
From: Michal Hocko @ 2016-06-08 16:06 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Lukasz Odzioba, linux-kernel, linux-mm, akpm, kirill.shutemov,
	aarcange, vdavydov, mingli199x, minchan, lukasz.anaczkowski

On Wed 08-06-16 08:31:21, Dave Hansen wrote:
> On 06/08/2016 07:35 AM, Lukasz Odzioba wrote:
> > diff --git a/mm/swap.c b/mm/swap.c
> > index 9591614..3fe4f18 100644
> > --- a/mm/swap.c
> > +++ b/mm/swap.c
> > @@ -391,9 +391,8 @@ static void __lru_cache_add(struct page *page)
> >  	struct pagevec *pvec = &get_cpu_var(lru_add_pvec);
> >  
> >  	get_page(page);
> > -	if (!pagevec_space(pvec))
> > +	if (!pagevec_add(pvec, page) || PageCompound(page))
> >  		__pagevec_lru_add(pvec);
> > -	pagevec_add(pvec, page);
> >  	put_cpu_var(lru_add_pvec);
> >  }
> 
> Lukasz,
> 
> Do we have any statistics that tell us how many pages are sitting the
> lru pvecs?  Although this helps the problem overall, don't we still have
> a problem with memory being held in such an opaque place?

Is it really worth bothering when we are talking about 56kB per CPU
(after this patch)?
-- 
Michal Hocko
SUSE Labs


* Re: [PATCH 1/1] mm/swap.c: flush lru_add pvecs on compound page arrival
  2016-06-08 16:06   ` Michal Hocko
@ 2016-06-08 16:34     ` Dave Hansen
  2016-06-09 12:21       ` Michal Hocko
  0 siblings, 1 reply; 13+ messages in thread
From: Dave Hansen @ 2016-06-08 16:34 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Lukasz Odzioba, linux-kernel, linux-mm, akpm, kirill.shutemov,
	aarcange, vdavydov, mingli199x, minchan, lukasz.anaczkowski,
	Shutemov, Kirill

On 06/08/2016 09:06 AM, Michal Hocko wrote:
>> > Do we have any statistics that tell us how many pages are sitting the
>> > lru pvecs?  Although this helps the problem overall, don't we still have
>> > a problem with memory being held in such an opaque place?
> Is it really worth bothering when we are talking about 56kB per CPU
> (after this patch)?

That was the logic why we didn't have it up until now: we didn't
*expect* it to get large.  A code change blew it up by 512x, and we had
no instrumentation to tell us where all the memory went.

I guess we don't have any other ways to group pages than compound pages,
and _that_ one is covered now... for one of the 5 classes of pvecs.

Is there a good reason we don't have to touch the other 4 pagevecs, btw?


* RE: [PATCH 1/1] mm/swap.c: flush lru_add pvecs on compound page arrival
  2016-06-08 15:04 ` Michal Hocko
@ 2016-06-09  8:01   ` Odzioba, Lukasz
  0 siblings, 0 replies; 13+ messages in thread
From: Odzioba, Lukasz @ 2016-06-09  8:01 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel, linux-mm, akpm, kirill.shutemov, aarcange,
	vdavydov, mingli199x, minchan, Hansen, Dave, Anaczkowski, Lukasz

On Wed 08-06-16 17:04:00, Michal Hocko wrote:
> I do not see how a SIGTERM would make any difference. But see below.

This is how we first encountered this problem: by hitting Ctrl-C while
running a parallel, memory-intensive workload, which ended up
not calling munmap on the allocated memory.

> Is this really true? Both munmap and exit_mmap do the same
> lru_add_drain() which flushes only the local CPU cache so munmap
> shouldn't make any difference.

Damn, I forgot to escape the # in #pragma parallel; it should be:
void main(){
#pragma parallel
{
(...)

And then yes, exit_mmap will flush just the local CPU cache, but not the
rest. Flushing all of them at exit would be another way of fixing the
problem, but I concluded that it would hurt performance of short-running
processes like scripts if we do it synchronously, and we would be racing
with the next processes if we do it asynchronously. I have not tested it
though.

> I believe this deserves a more explanation. What do you think about the
> following.
> "
> The primary point of the LRU add cache is to save the zone lru_lock
> contention with a hope that more pages will belong to the same zone
> and so their addition can be batched. The huge page is already a
> form of batched addition (it will add 512 worth of memory in one go)
> so skipping the batching seems like a safer option when compared to a
> potential excess in the caching which can be quite large and much
> harder to fix because lru_add_drain_all is way to expensive and
> it is not really clear what would be a good moment to call it.
>"
>
> Does this sound better?

Far better, thanks.

Lukas


* RE: [PATCH 1/1] mm/swap.c: flush lru_add pvecs on compound page arrival
  2016-06-08 15:31 ` Dave Hansen
  2016-06-08 16:06   ` Michal Hocko
@ 2016-06-09  8:50   ` Odzioba, Lukasz
  2016-06-09 15:41     ` Dave Hansen
  1 sibling, 1 reply; 13+ messages in thread
From: Odzioba, Lukasz @ 2016-06-09  8:50 UTC (permalink / raw)
  To: Hansen, Dave, linux-kernel, linux-mm, akpm, kirill.shutemov,
	mhocko, aarcange, vdavydov, mingli199x, minchan
  Cc: Anaczkowski, Lukasz

On 08-06-16 17:31:00, Dave Hansen wrote:
> Do we have any statistics that tell us how many pages are sitting the
> lru pvecs?  Although this helps the problem overall, don't we still have
> a problem with memory being held in such an opaque place?

From what I observed the problem is mainly with lru_add_pvec; the
rest are nearly empty most of the time. I added debug code to
lru_add_drain_all() to see the sizes of the lru pvecs when I debugged this.

Among lru_add_pvec, lru_rotate_pvecs, lru_deactivate_file_pvecs,
lru_deactivate_pvecs and activate_page_pvecs, almost all (3-4GB) of the
missing memory was in lru_add_pvec; the rest were almost always empty.

Below are more detailed logs of each list size, in the same order as above
(list length -> sum of all pages on the list, for lru_add_pvec and
activate_page_pvecs). In the summary line
"combined_len = X, combined_size = Y MB", X is the combined length of
lru_add_pvec over all CPUs and Y is the sum of all page sizes in
lru_add_pvec.

After compaction, before running workload, lru_add_drain_all:
cpu(0) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(1) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(2) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(3) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(4) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(5) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(6) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(7) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(8) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(9) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(10) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(11) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(12) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(13) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(14) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(15) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(16) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(17) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(18) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(19) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(20) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(21) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(22) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(23) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(24) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(25) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(26) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(27) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(28) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(29) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(30) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(31) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(32) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(33) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(34) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(35) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(36) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(37) = (2 -> 8192 B,0,0,0,0,0->0 B)
cpu(38) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(39) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(40) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(41) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(42) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(43) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(44) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(45) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(46) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(47) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(48) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(49) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(50) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(51) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(52) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(53) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(54) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(55) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(56) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(57) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(58) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(59) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(60) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(61) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(62) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(63) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(64) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(65) = (2 -> 8192 B,0,0,0,0,0->0 B)
cpu(66) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(67) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(68) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(69) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(70) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(71) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(72) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(73) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(74) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(75) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(76) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(77) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(78) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(79) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(80) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(81) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(82) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(83) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(84) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(85) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(86) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(87) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(88) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(89) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(90) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(91) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(92) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(93) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(94) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(95) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(96) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(97) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(98) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(99) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(100) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(101) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(102) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(103) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(104) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(105) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(106) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(107) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(108) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(109) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(110) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(111) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(112) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(113) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(114) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(115) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(116) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(117) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(118) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(119) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(120) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(121) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(122) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(123) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(124) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(125) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(126) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(127) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(128) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(129) = (10 -> 40960 B,0,0,0,0,0->0 B)
cpu(130) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(131) = (3 -> 12288 B,0,0,0,0,0->0 B)
cpu(132) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(133) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(134) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(135) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(136) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(137) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(138) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(139) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(140) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(141) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(142) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(143) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(144) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(145) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(146) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(147) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(148) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(149) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(150) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(151) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(152) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(153) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(154) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(155) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(156) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(157) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(158) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(159) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(160) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(161) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(162) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(163) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(164) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(165) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(166) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(167) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(168) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(169) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(170) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(171) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(172) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(173) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(174) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(175) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(176) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(177) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(178) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(179) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(180) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(181) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(182) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(183) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(184) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(185) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(186) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(187) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(188) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(189) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(190) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(191) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(192) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(193) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(194) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(195) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(196) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(197) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(198) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(199) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(200) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(201) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(202) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(203) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(204) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(205) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(206) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(207) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(208) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(209) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(210) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(211) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(212) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(213) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(214) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(215) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(216) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(217) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(218) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(219) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(220) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(221) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(222) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(223) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(224) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(225) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(226) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(227) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(228) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(229) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(230) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(231) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(232) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(233) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(234) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(235) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(236) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(237) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(238) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(239) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(240) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(241) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(242) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(243) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(244) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(245) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(246) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(247) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(248) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(249) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(250) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(251) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(252) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(253) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(254) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(255) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(256) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(257) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(258) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(259) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(260) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(261) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(262) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(263) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(264) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(265) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(266) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(267) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(268) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(269) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(270) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(271) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(272) = (8 -> 32768 B,0,0,0,0,0->0 B)
cpu(273) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(274) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(275) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(276) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(277) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(278) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(279) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(280) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(281) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(282) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(283) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(284) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(285) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(286) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(287) = (0 -> 0 B,0,0,0,0,0->0 B)
combined_len = 25, combined_size = 0 MB

<start and interrupt workload here>

After interrupting workload, lru_add_drain_all:
cpu(0) = (13 -> 53248 B,0,0,0,0,0->0 B)
cpu(1) = (10 -> 20971520 B,0,0,0,0,0->0 B)
cpu(2) = (4 -> 6295552 B,0,0,0,0,0->0 B)
cpu(3) = (11 -> 8417280 B,0,0,0,0,0->0 B)
cpu(4) = (1 -> 4096 B,0,0,0,0,0->0 B)
cpu(5) = (1 -> 4096 B,0,0,0,0,0->0 B)
cpu(6) = (2 -> 8192 B,0,0,0,0,0->0 B)
cpu(7) = (7 -> 14680064 B,0,0,0,0,0->0 B)
cpu(8) = (10 -> 20971520 B,0,0,0,0,0->0 B)
cpu(9) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(10) = (12 -> 25165824 B,0,0,0,0,0->0 B)
cpu(11) = (12 -> 25165824 B,0,0,0,0,0->0 B)
cpu(12) = (4 -> 8388608 B,0,0,0,0,0->0 B)
cpu(13) = (13 -> 27262976 B,0,0,0,0,0->0 B)
cpu(14) = (14 -> 29360128 B,0,0,0,0,0->0 B)
cpu(15) = (2 -> 4194304 B,0,0,0,0,0->0 B)
cpu(16) = (11 -> 2138112 B,0,0,0,0,0->0 B)
cpu(17) = (14 -> 29360128 B,0,0,0,0,0->0 B)
cpu(18) = (6 -> 12582912 B,0,0,0,0,0->0 B)
cpu(19) = (6 -> 12582912 B,0,0,0,0,0->0 B)
cpu(20) = (9 -> 18874368 B,0,0,0,0,0->0 B)
cpu(21) = (14 -> 29360128 B,0,0,0,0,0->0 B)
cpu(22) = (6 -> 8396800 B,0,0,0,0,0->0 B)
cpu(23) = (9 -> 14688256 B,0,0,0,0,0->0 B)
cpu(24) = (5 -> 20480 B,0,0,0,0,0->0 B)
cpu(25) = (13 -> 25169920 B,0,0,0,0,0->0 B)
cpu(26) = (4 -> 16384 B,0,0,0,0,0->0 B)
cpu(27) = (12 -> 25165824 B,0,0,0,0,0->0 B)
cpu(28) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(29) = (12 -> 25165824 B,0,0,0,0,0->0 B)
cpu(30) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(31) = (5 -> 6299648 B,0,0,0,0,0->0 B)
cpu(32) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(33) = (9 -> 18874368 B,0,0,0,0,0->0 B)
cpu(34) = (1 -> 2097152 B,0,0,0,0,0->0 B)
cpu(35) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(36) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(37) = (1 -> 2097152 B,0,0,0,0,0->0 B)
cpu(38) = (6 -> 10489856 B,0,0,0,0,0->0 B)
cpu(39) = (7 -> 14680064 B,0,0,0,0,0->0 B)
cpu(40) = (3 -> 6291456 B,0,0,0,0,0->0 B)
cpu(41) = (1 -> 2097152 B,0,0,0,0,0->0 B)
cpu(42) = (4 -> 8388608 B,0,0,0,0,0->0 B)
cpu(43) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(44) = (4 -> 8388608 B,0,0,0,0,0->0 B)
cpu(45) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(46) = (7 -> 28672 B,0,0,0,0,0->0 B)
cpu(47) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(48) = (4 -> 8388608 B,0,0,0,0,0->0 B)
cpu(49) = (10 -> 6320128 B,0,0,0,0,0->0 B)
cpu(50) = (14 -> 57344 B,0,0,0,0,0->0 B)
cpu(51) = (12 -> 8421376 B,0,0,0,0,0->0 B)
cpu(52) = (12 -> 49152 B,0,0,0,0,0->0 B)
cpu(53) = (1 -> 2097152 B,0,0,0,0,0->0 B)
cpu(54) = (3 -> 12288 B,0,0,0,0,0->0 B)
cpu(55) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(56) = (1 -> 2097152 B,0,0,0,0,0->0 B)
cpu(57) = (2 -> 8192 B,0,0,0,0,0->0 B)
cpu(58) = (10 -> 18878464 B,0,0,0,0,0->0 B)
cpu(59) = (11 -> 45056 B,0,0,0,0,0->0 B)
cpu(60) = (10 -> 40960 B,0,0,0,0,0->0 B)
cpu(61) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(62) = (11 -> 23068672 B,0,0,0,0,0->0 B)
cpu(63) = (14 -> 29360128 B,0,0,0,0,0->0 B)
cpu(64) = (2 -> 2101248 B,0,0,0,0,0->0 B)
cpu(65) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(66) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(67) = (6 -> 6303744 B,0,0,0,0,0->0 B)
cpu(68) = (13 -> 27262976 B,0,0,0,0,0->0 B)
cpu(69) = (14 -> 29360128 B,0,0,0,0,0->0 B)
cpu(70) = (3 -> 6291456 B,0,0,0,0,0->0 B)
cpu(71) = (9 -> 18874368 B,0,0,0,0,0->0 B)
cpu(72) = (12 -> 25165824 B,0,0,0,0,0->0 B)
cpu(73) = (5 -> 10485760 B,0,0,0,0,0->0 B)
cpu(74) = (3 -> 6291456 B,0,0,0,0,0->0 B)
cpu(75) = (7 -> 14680064 B,0,0,0,0,0->0 B)
cpu(76) = (11 -> 23068672 B,0,0,0,0,0->0 B)
cpu(77) = (1 -> 4096 B,0,0,0,0,0->0 B)
cpu(78) = (7 -> 14680064 B,0,0,0,0,0->0 B)
cpu(79) = (1 -> 2097152 B,0,0,0,0,0->0 B)
cpu(80) = (11 -> 23068672 B,0,0,0,0,0->0 B)
cpu(81) = (3 -> 6291456 B,0,0,0,0,0->0 B)
cpu(82) = (12 -> 25165824 B,0,0,0,0,0->0 B)
cpu(83) = (6 -> 12582912 B,0,0,0,0,0->0 B)
cpu(84) = (5 -> 10485760 B,0,0,0,0,0->0 B)
cpu(85) = (4 -> 8388608 B,0,0,0,0,0->0 B)
cpu(86) = (9 -> 18874368 B,0,0,0,0,0->0 B)
cpu(87) = (6 -> 12582912 B,0,0,0,0,0->0 B)
cpu(88) = (2 -> 2101248 B,0,0,0,0,0->0 B)
cpu(89) = (8 -> 16777216 B,0,0,0,0,0->0 B)
cpu(90) = (9 -> 12595200 B,0,0,0,0,0->0 B)
cpu(91) = (13 -> 27262976 B,0,0,0,0,0->0 B)
cpu(92) = (7 -> 14680064 B,0,0,0,0,0->0 B)
cpu(93) = (10 -> 20971520 B,0,0,0,0,0->0 B)
cpu(94) = (1 -> 2097152 B,0,0,0,0,0->0 B)
cpu(95) = (1 -> 4096 B,0,0,0,0,0->0 B)
cpu(96) = (1 -> 2097152 B,0,0,0,0,0->0 B)
cpu(97) = (5 -> 10485760 B,0,0,0,0,0->0 B)
cpu(98) = (12 -> 25165824 B,0,0,0,0,0->0 B)
cpu(99) = (9 -> 18874368 B,0,0,0,0,0->0 B)
cpu(100) = (10 -> 20971520 B,0,0,0,0,0->0 B)
cpu(101) = (1 -> 2097152 B,0,0,0,0,0->0 B)
cpu(102) = (6 -> 12582912 B,0,0,0,0,0->0 B)
cpu(103) = (9 -> 18874368 B,0,0,0,0,0->0 B)
cpu(104) = (7 -> 14680064 B,0,0,0,0,0->0 B)
cpu(105) = (3 -> 6291456 B,0,0,0,0,0->0 B)
cpu(106) = (10 -> 20971520 B,0,0,0,0,0->0 B)
cpu(107) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(108) = (10 -> 20971520 B,0,0,0,0,0->0 B)
cpu(109) = (13 -> 27262976 B,0,0,0,0,0->0 B)
cpu(110) = (4 -> 8388608 B,0,0,0,0,0->0 B)
cpu(111) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(112) = (13 -> 27262976 B,0,0,0,0,0->0 B)
cpu(113) = (9 -> 18874368 B,0,0,0,0,0->0 B)
cpu(114) = (8 -> 32768 B,0,0,0,0,0->0 B)
cpu(115) = (7 -> 12587008 B,0,0,0,0,0->0 B)
cpu(116) = (9 -> 2129920 B,0,0,0,0,0->0 B)
cpu(117) = (3 -> 4198400 B,0,0,0,0,0->0 B)
cpu(118) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(119) = (10 -> 18878464 B,0,0,0,0,0->0 B)
cpu(120) = (7 -> 12587008 B,0,0,0,0,0->0 B)
cpu(121) = (8 -> 8404992 B,0,0,0,0,0->0 B)
cpu(122) = (12 -> 18886656 B,0,0,0,0,0->0 B)
cpu(123) = (1 -> 2097152 B,0,0,0,0,0->0 B)
cpu(124) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(125) = (2 -> 8192 B,0,0,0,0,0->0 B)
cpu(126) = (6 -> 24576 B,0,0,0,0,0->0 B)
cpu(127) = (13 -> 20983808 B,0,0,0,0,0->0 B)
cpu(128) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(129) = (3 -> 12288 B,0,0,0,0,0->0 B)
cpu(130) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(131) = (13 -> 27262976 B,0,0,0,0,0->0 B)
cpu(132) = (10 -> 16785408 B,0,0,0,0,0->0 B)
cpu(133) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(134) = (6 -> 12582912 B,0,0,0,0,0->0 B)
cpu(135) = (2 -> 4194304 B,0,0,0,0,0->0 B)
cpu(136) = (9 -> 36864 B,0,0,0,0,0->0 B)
cpu(137) = (4 -> 4202496 B,0,0,0,0,0->0 B)
cpu(138) = (3 -> 6291456 B,0,0,0,0,0->0 B)
cpu(139) = (5 -> 10485760 B,0,0,0,0,0->0 B)
cpu(140) = (1 -> 2097152 B,0,0,0,0,0->0 B)
cpu(141) = (7 -> 10493952 B,0,0,0,0,0->0 B)
cpu(142) = (3 -> 6291456 B,0,0,0,0,0->0 B)
cpu(143) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(144) = (5 -> 10485760 B,0,0,0,0,0->0 B)
cpu(145) = (5 -> 10485760 B,0,0,0,0,0->0 B)
cpu(146) = (14 -> 27267072 B,0,0,0,0,0->0 B)
cpu(147) = (10 -> 20971520 B,0,0,0,0,0->0 B)
cpu(148) = (4 -> 8388608 B,0,0,0,0,0->0 B)
cpu(149) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(150) = (8 -> 16777216 B,0,0,0,0,0->0 B)
cpu(151) = (4 -> 8388608 B,0,0,0,0,0->0 B)
cpu(152) = (11 -> 23068672 B,0,0,0,0,0->0 B)
cpu(153) = (4 -> 8388608 B,0,0,0,0,0->0 B)
cpu(154) = (12 -> 18886656 B,0,0,0,0,0->0 B)
cpu(155) = (8 -> 16777216 B,0,0,0,0,0->0 B)
cpu(156) = (2 -> 4194304 B,0,0,0,0,0->0 B)
cpu(157) = (13 -> 23076864 B,0,0,0,0,0->0 B)
cpu(158) = (11 -> 23068672 B,0,0,0,0,0->0 B)
cpu(159) = (6 -> 12582912 B,0,0,0,0,0->0 B)
cpu(160) = (3 -> 6291456 B,0,0,0,0,0->0 B)
cpu(161) = (4 -> 8388608 B,0,0,0,0,0->0 B)
cpu(162) = (9 -> 18874368 B,0,0,0,0,0->0 B)
cpu(163) = (8 -> 16777216 B,0,0,0,0,0->0 B)
cpu(164) = (10 -> 20971520 B,0,0,0,0,0->0 B)
cpu(165) = (5 -> 10485760 B,0,0,0,0,0->0 B)
cpu(166) = (5 -> 10485760 B,0,0,0,0,0->0 B)
cpu(167) = (11 -> 23068672 B,0,0,0,0,0->0 B)
cpu(168) = (6 -> 12582912 B,0,0,0,0,0->0 B)
cpu(169) = (8 -> 12591104 B,0,0,0,0,0->0 B)
cpu(170) = (7 -> 12587008 B,0,0,0,0,0->0 B)
cpu(171) = (6 -> 12582912 B,0,0,0,0,0->0 B)
cpu(172) = (14 -> 29360128 B,0,0,0,0,0->0 B)
cpu(173) = (10 -> 20971520 B,0,0,0,0,0->0 B)
cpu(174) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(175) = (10 -> 20971520 B,0,0,0,0,0->0 B)
cpu(176) = (5 -> 10485760 B,0,0,0,0,0->0 B)
cpu(177) = (9 -> 18874368 B,0,0,0,0,0->0 B)
cpu(178) = (8 -> 14684160 B,0,0,0,0,0->0 B)
cpu(179) = (5 -> 8392704 B,0,0,0,0,0->0 B)
cpu(180) = (8 -> 14684160 B,0,0,0,0,0->0 B)
cpu(181) = (4 -> 6295552 B,0,0,0,0,0->0 B)
cpu(182) = (3 -> 2105344 B,0,0,0,0,0->0 B)
cpu(183) = (9 -> 16781312 B,0,0,0,0,0->0 B)
cpu(184) = (2 -> 8192 B,0,0,0,0,0->0 B)
cpu(185) = (3 -> 4198400 B,0,0,0,0,0->0 B)
cpu(186) = (12 -> 23072768 B,0,0,0,0,0->0 B)
cpu(187) = (9 -> 14688256 B,0,0,0,0,0->0 B)
cpu(188) = (4 -> 6295552 B,0,0,0,0,0->0 B)
cpu(189) = (14 -> 29360128 B,0,0,0,0,0->0 B)
cpu(190) = (4 -> 8388608 B,0,0,0,0,0->0 B)
cpu(191) = (13 -> 27262976 B,0,0,0,0,0->0 B)
cpu(192) = (3 -> 6291456 B,0,0,0,0,0->0 B)
cpu(193) = (7 -> 14680064 B,0,0,0,0,0->0 B)
cpu(194) = (5 -> 10485760 B,0,0,0,0,0->0 B)
cpu(195) = (6 -> 12582912 B,0,0,0,0,0->0 B)
cpu(196) = (8 -> 14684160 B,0,0,0,0,0->0 B)
cpu(197) = (6 -> 8396800 B,0,0,0,0,0->0 B)
cpu(198) = (12 -> 49152 B,0,0,0,0,0->0 B)
cpu(199) = (2 -> 4194304 B,0,0,0,0,0->0 B)
cpu(200) = (6 -> 12582912 B,0,0,0,0,0->0 B)
cpu(201) = (14 -> 25174016 B,0,0,0,0,0->0 B)
cpu(202) = (8 -> 16777216 B,0,0,0,0,0->0 B)
cpu(203) = (10 -> 16785408 B,0,0,0,0,0->0 B)
cpu(204) = (6 -> 10489856 B,0,0,0,0,0->0 B)
cpu(205) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(206) = (6 -> 4210688 B,0,0,0,0,0->0 B)
cpu(207) = (8 -> 16777216 B,0,0,0,0,0->0 B)
cpu(208) = (12 -> 14700544 B,0,0,0,0,0->0 B)
cpu(209) = (2 -> 4194304 B,0,0,0,0,0->0 B)
cpu(210) = (2 -> 2101248 B,0,0,0,0,0->0 B)
cpu(211) = (3 -> 6291456 B,0,0,0,0,0->0 B)
cpu(212) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(213) = (4 -> 8388608 B,0,0,0,0,0->0 B)
cpu(214) = (5 -> 10485760 B,0,0,0,0,0->0 B)
cpu(215) = (8 -> 16777216 B,0,0,0,0,0->0 B)
cpu(216) = (10 -> 20971520 B,0,0,0,0,0->0 B)
cpu(217) = (4 -> 8388608 B,0,0,0,0,0->0 B)
cpu(218) = (5 -> 10485760 B,0,0,0,0,0->0 B)
cpu(219) = (10 -> 20971520 B,0,0,0,0,0->0 B)
cpu(220) = (9 -> 18874368 B,0,0,0,0,0->0 B)
cpu(221) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(222) = (14 -> 29360128 B,0,0,0,0,0->0 B)
cpu(223) = (5 -> 8392704 B,0,0,0,0,0->0 B)
cpu(224) = (7 -> 14680064 B,0,0,0,0,0->0 B)
cpu(225) = (6 -> 12582912 B,0,0,0,0,0->0 B)
cpu(226) = (11 -> 23068672 B,0,0,0,0,0->0 B)
cpu(227) = (7 -> 14680064 B,0,0,0,0,0->0 B)
cpu(228) = (14 -> 29360128 B,0,0,0,0,0->0 B)
cpu(229) = (10 -> 20971520 B,0,0,0,0,0->0 B)
cpu(230) = (6 -> 12582912 B,0,0,0,0,0->0 B)
cpu(231) = (7 -> 14680064 B,0,0,0,0,0->0 B)
cpu(232) = (7 -> 14680064 B,0,0,0,0,0->0 B)
cpu(233) = (4 -> 8388608 B,0,0,0,0,0->0 B)
cpu(234) = (10 -> 20971520 B,0,0,0,0,0->0 B)
cpu(235) = (6 -> 12582912 B,0,0,0,0,0->0 B)
cpu(236) = (9 -> 18874368 B,0,0,0,0,0->0 B)
cpu(237) = (5 -> 10485760 B,0,0,0,0,0->0 B)
cpu(238) = (12 -> 25165824 B,0,0,0,0,0->0 B)
cpu(239) = (12 -> 49152 B,0,0,0,0,0->0 B)
cpu(240) = (13 -> 20983808 B,0,0,0,0,0->0 B)
cpu(241) = (11 -> 23068672 B,0,0,0,0,0->0 B)
cpu(242) = (3 -> 4198400 B,0,0,0,0,0->0 B)
cpu(243) = (5 -> 8392704 B,0,0,0,0,0->0 B)
cpu(244) = (10 -> 18878464 B,0,0,0,0,0->0 B)
cpu(245) = (4 -> 16384 B,0,0,0,0,0->0 B)
cpu(246) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(247) = (7 -> 12587008 B,0,0,0,0,0->0 B)
cpu(248) = (6 -> 10489856 B,0,0,0,0,0->0 B)
cpu(249) = (7 -> 14680064 B,0,0,0,0,0->0 B)
cpu(250) = (13 -> 25169920 B,0,0,0,0,0->0 B)
cpu(251) = (7 -> 12587008 B,0,0,0,0,0->0 B)
cpu(252) = (3 -> 4198400 B,0,0,0,0,0->0 B)
cpu(253) = (11 -> 20975616 B,0,0,0,0,0->0 B)
cpu(254) = (8 -> 12591104 B,0,0,0,0,0->0 B)
cpu(255) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(256) = (8 -> 16777216 B,0,0,0,0,0->0 B)
cpu(257) = (13 -> 25169920 B,0,0,0,0,0->0 B)
cpu(258) = (10 -> 20971520 B,0,0,0,0,0->0 B)
cpu(259) = (7 -> 8400896 B,0,0,0,0,0->0 B)
cpu(260) = (10 -> 20971520 B,0,0,0,0,0->0 B)
cpu(261) = (13 -> 25169920 B,0,0,0,0,0->0 B)
cpu(262) = (7 -> 14680064 B,0,0,0,0,0->0 B)
cpu(263) = (7 -> 2121728 B,0,0,0,0,0->0 B)
cpu(264) = (2 -> 4194304 B,0,0,0,0,0->0 B)
cpu(265) = (7 -> 14680064 B,0,0,0,0,0->0 B)
cpu(266) = (12 -> 12607488 B,0,0,0,0,0->0 B)
cpu(267) = (2 -> 4194304 B,0,0,0,0,0->0 B)
cpu(268) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(269) = (3 -> 6291456 B,0,0,0,0,0->0 B)
cpu(270) = (11 -> 45056 B,0,0,0,0,0->0 B)
cpu(271) = (3 -> 6291456 B,0,0,0,0,0->0 B)
cpu(272) = (10 -> 40960 B,0,0,0,0,0->0 B)
cpu(273) = (10 -> 20971520 B,0,0,0,0,0->0 B)
cpu(274) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(275) = (6 -> 6303744 B,0,0,0,0,0->0 B)
cpu(276) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(277) = (9 -> 14688256 B,0,0,0,0,0->0 B)
cpu(278) = (5 -> 10485760 B,0,0,0,0,0->0 B)
cpu(279) = (7 -> 14680064 B,0,0,0,0,0->0 B)
cpu(280) = (10 -> 16785408 B,0,0,0,0,0->0 B)
cpu(281) = (7 -> 14680064 B,0,0,0,0,0->0 B)
cpu(282) = (3 -> 6291456 B,0,0,0,0,0->0 B)
cpu(283) = (4 -> 8388608 B,0,0,0,0,0->0 B)
cpu(284) = (7 -> 14680064 B,0,0,0,0,0->0 B)
cpu(285) = (4 -> 8388608 B,0,0,0,0,0->0 B)
cpu(286) = (6 -> 10489856 B,0,0,0,0,0->0 B)
cpu(287) = (5 -> 10485760 B,0,0,0,0,0->0 B)
combined_len = 1869, combined_size = 3093 MB

After interrupting workload and after compaction,
lru_add_drain_all:
cpu(0) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(1) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(2) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(3) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(4) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(5) = (8 -> 32768 B,0,0,0,0,0->0 B)
cpu(6) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(7) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(8) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(9) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(10) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(11) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(12) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(13) = (2 -> 8192 B,0,0,0,0,0->0 B)
cpu(14) = (11 -> 45056 B,0,0,0,0,0->0 B)
cpu(15) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(16) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(17) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(18) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(19) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(20) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(21) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(22) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(23) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(24) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(25) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(26) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(27) = (7 -> 28672 B,0,0,0,0,0->0 B)
cpu(28) = (3 -> 12288 B,0,0,0,0,0->0 B)
cpu(29) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(30) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(31) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(32) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(33) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(34) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(35) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(36) = (6 -> 24576 B,0,0,0,0,0->0 B)
cpu(37) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(38) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(39) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(40) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(41) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(42) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(43) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(44) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(45) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(46) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(47) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(48) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(49) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(50) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(51) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(52) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(53) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(54) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(55) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(56) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(57) = (2 -> 8192 B,0,0,0,0,0->0 B)
cpu(58) = (10 -> 40960 B,0,0,0,0,0->0 B)
cpu(59) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(60) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(61) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(62) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(63) = (4 -> 16384 B,0,0,0,0,0->0 B)
cpu(64) = (7 -> 28672 B,0,0,0,0,0->0 B)
cpu(65) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(66) = (4 -> 16384 B,0,0,0,0,0->0 B)
cpu(67) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(68) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(69) = (7 -> 28672 B,0,0,0,0,0->0 B)
cpu(70) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(71) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(72) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(73) = (4 -> 16384 B,0,0,0,0,0->0 B)
cpu(74) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(75) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(76) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(77) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(78) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(79) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(80) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(81) = (3 -> 12288 B,0,0,0,0,0->0 B)
cpu(82) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(83) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(84) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(85) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(86) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(87) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(88) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(89) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(90) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(91) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(92) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(93) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(94) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(95) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(96) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(97) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(98) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(99) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(100) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(101) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(102) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(103) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(104) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(105) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(106) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(107) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(108) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(109) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(110) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(111) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(112) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(113) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(114) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(115) = (2 -> 8192 B,0,0,0,0,0->0 B)
cpu(116) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(117) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(118) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(119) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(120) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(121) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(122) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(123) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(124) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(125) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(126) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(127) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(128) = (7 -> 28672 B,0,0,0,0,0->0 B)
cpu(129) = (12 -> 49152 B,0,0,0,0,0->0 B)
cpu(130) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(131) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(132) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(133) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(134) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(135) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(136) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(137) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(138) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(139) = (13 -> 53248 B,0,0,0,0,0->0 B)
cpu(140) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(141) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(142) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(143) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(144) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(145) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(146) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(147) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(148) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(149) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(150) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(151) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(152) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(153) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(154) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(155) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(156) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(157) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(158) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(159) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(160) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(161) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(162) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(163) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(164) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(165) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(166) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(167) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(168) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(169) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(170) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(171) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(172) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(173) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(174) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(175) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(176) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(177) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(178) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(179) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(180) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(181) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(182) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(183) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(184) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(185) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(186) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(187) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(188) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(189) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(190) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(191) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(192) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(193) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(194) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(195) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(196) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(197) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(198) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(199) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(200) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(201) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(202) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(203) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(204) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(205) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(206) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(207) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(208) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(209) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(210) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(211) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(212) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(213) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(214) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(215) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(216) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(217) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(218) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(219) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(220) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(221) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(222) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(223) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(224) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(225) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(226) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(227) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(228) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(229) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(230) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(231) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(232) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(233) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(234) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(235) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(236) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(237) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(238) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(239) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(240) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(241) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(242) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(243) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(244) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(245) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(246) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(247) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(248) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(249) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(250) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(251) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(252) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(253) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(254) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(255) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(256) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(257) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(258) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(259) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(260) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(261) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(262) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(263) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(264) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(265) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(266) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(267) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(268) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(269) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(270) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(271) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(272) = (2 -> 8192 B,0,0,0,0,0->0 B)
cpu(273) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(274) = (11 -> 45056 B,0,0,0,0,0->0 B)
cpu(275) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(276) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(277) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(278) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(279) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(280) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(281) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(282) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(283) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(284) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(285) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(286) = (0 -> 0 B,0,0,0,0,0->0 B)
cpu(287) = (0 -> 0 B,0,0,0,0,0->0 B)
combined_len = 125, combined_size = 0 MB


This is not a very representative example, because I compacted
memory before running the workload, but as I said, from what I have
seen there is no big problem with the rest of the pvecs,
unless we encounter some other real-world issue.

> I think if we're going to be hacking around this area, we should also
> add something to vmstat or zoneinfo to spell out how many of these
> things there are.

Agree, I can try to prepare another patch for that.

Thanks,
Lukas


* Re: [PATCH 1/1] mm/swap.c: flush lru_add pvecs on compound page arrival
  2016-06-08 16:34     ` Dave Hansen
@ 2016-06-09 12:21       ` Michal Hocko
  2016-06-16 18:08         ` Odzioba, Lukasz
  0 siblings, 1 reply; 13+ messages in thread
From: Michal Hocko @ 2016-06-09 12:21 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Lukasz Odzioba, linux-kernel, linux-mm, akpm, kirill.shutemov,
	aarcange, vdavydov, mingli199x, minchan, lukasz.anaczkowski,
	Shutemov, Kirill

On Wed 08-06-16 09:34:01, Dave Hansen wrote:
> On 06/08/2016 09:06 AM, Michal Hocko wrote:
> >> > Do we have any statistics that tell us how many pages are sitting in the
> >> > lru pvecs?  Although this helps the problem overall, don't we still have
> >> > a problem with memory being held in such an opaque place?
> > Is it really worth bothering when we are talking about 56kB per CPU
> > (after this patch)?
> 
> That was the logic why we didn't have it up until now: we didn't
> *expect* it to get large.  A code change blew it up by 512x, and we had
> no instrumentation to tell us where all the memory went.
> 
> I guess we don't have any other ways to group pages than compound pages,
> and _that_ one is covered now...

exactly and that is why I am not sure it is needed. I do not expect we
would ever change the pagevec size or have a different way of grouping
pages on the LRU list.

That being said I am not objecting to the counter, I am just not sure it
is worth it.

> for one of the 5 classes of pvecs.
> 
> Is there a good reason we don't have to touch the other 4 pagevecs, btw?

I agree it would be better to do the same for others as well. Even if
this is not an immediate problem for those.

-- 
Michal Hocko
SUSE Labs


* Re: [PATCH 1/1] mm/swap.c: flush lru_add pvecs on compound page arrival
  2016-06-09  8:50   ` Odzioba, Lukasz
@ 2016-06-09 15:41     ` Dave Hansen
  2016-06-13 21:01       ` Odzioba, Lukasz
  0 siblings, 1 reply; 13+ messages in thread
From: Dave Hansen @ 2016-06-09 15:41 UTC (permalink / raw)
  To: Odzioba, Lukasz, linux-kernel, linux-mm, akpm, kirill.shutemov,
	mhocko, aarcange, vdavydov, mingli199x, minchan
  Cc: Anaczkowski, Lukasz

On 06/09/2016 01:50 AM, Odzioba, Lukasz wrote:
> On 08-06-16 17:31:00, Dave Hansen wrote:
>> Do we have any statistics that tell us how many pages are sitting in the
>> lru pvecs?  Although this helps the problem overall, don't we still have
>> a problem with memory being held in such an opaque place?
> 
> From what I observed the problem is mainly with lru_add_pvec, the
> rest is near empty for most of the time. I added debug code to
>  lru_add_drain_all(), to see sizes of the lru pvecs when I debugged this.
> 
> Among lru_add_pvec, lru_rotate_pvecs, lru_deactivate_file_pvecs, 
> lru_deactivate_pvecs, activate_page_pvecs almost all (3-4GB) of the 
> missing memory was in lru_add_pvec, the rest was almost always empty.

Does your workload put large pages in and out of those pvecs, though?
If your system doesn't have any activity, then all we've shown is that
they're not a problem when not in use.  But what about when we use them?

Have you, for instance, tried this on a system with memory pressure?


* RE: [PATCH 1/1] mm/swap.c: flush lru_add pvecs on compound page arrival
  2016-06-09 15:41     ` Dave Hansen
@ 2016-06-13 21:01       ` Odzioba, Lukasz
  0 siblings, 0 replies; 13+ messages in thread
From: Odzioba, Lukasz @ 2016-06-13 21:01 UTC (permalink / raw)
  To: Hansen, Dave, linux-kernel, linux-mm, akpm, kirill.shutemov,
	mhocko, aarcange, vdavydov, mingli199x, minchan
  Cc: Anaczkowski, Lukasz

On 09-06-16 17:42:00, Dave Hansen wrote:
> Does your workload put large pages in and out of those pvecs, though?
> If your system doesn't have any activity, then all we've shown is that
> they're not a problem when not in use.  But what about when we use them?

It doesn't. To use them extensively I guess we would have to
craft a separate program for each one, which is not trivial.

> Have you, for instance, tried this on a system with memory pressure?

Not then, but here are example snapshots from a system that is using swap to
handle allocation requests, with the patch applied (notation: pages = total size in kB):
LRU_add              336 =     1344kB
LRU_rotate           158 =      632kB
LRU_deactivate         0 =        0kB
LRU_deact_file         0 =        0kB
LRU_activate           1 =        4kB
---
LRU_add             3262 =    13048kB
LRU_rotate           142 =      568kB
LRU_deactivate         0 =        0kB
LRU_deact_file         0 =        0kB
LRU_activate           6 =       24kB
---
LRU_add             3689 =    14756kB
LRU_rotate            81 =      324kB
LRU_deactivate         0 =        0kB
LRU_deact_file         0 =        0kB
LRU_activate          19 =       76kB

On an idle OS we have:
LRU_add             1038 =     4152kB
LRU_rotate             0 =        0kB
LRU_deactivate         0 =        0kB
LRU_deact_file         0 =        0kB
LRU_activate           0 =        0kB

I know these are not representative overall.

Thanks,
Lukas


* RE: [PATCH 1/1] mm/swap.c: flush lru_add pvecs on compound page arrival
  2016-06-09 12:21       ` Michal Hocko
@ 2016-06-16 18:08         ` Odzioba, Lukasz
  2016-06-16 18:19           ` Michal Hocko
  0 siblings, 1 reply; 13+ messages in thread
From: Odzioba, Lukasz @ 2016-06-16 18:08 UTC (permalink / raw)
  To: Michal Hocko, Hansen, Dave
  Cc: linux-kernel, linux-mm, akpm, kirill.shutemov, aarcange,
	vdavydov, mingli199x, minchan, Anaczkowski, Lukasz, Shutemov,
	Kirill

On Thu 09-06-16 02:22 PM, Michal Hocko wrote:
> I agree it would be better to do the same for others as well. Even if
> this is not an immediate problem for those.

I am not able to find any clear reason why we shouldn't do it for the rest.
OK, so what do we do now? I'll send v2 with the proposed changes.
Do we then still want to have stats on those pvecs?
In my opinion it's not worth it now.

Thanks,
Lukas


* Re: [PATCH 1/1] mm/swap.c: flush lru_add pvecs on compound page arrival
  2016-06-16 18:08         ` Odzioba, Lukasz
@ 2016-06-16 18:19           ` Michal Hocko
  2016-06-16 20:03             ` Odzioba, Lukasz
  0 siblings, 1 reply; 13+ messages in thread
From: Michal Hocko @ 2016-06-16 18:19 UTC (permalink / raw)
  To: Odzioba, Lukasz
  Cc: Hansen, Dave, linux-kernel, linux-mm, akpm, kirill.shutemov,
	aarcange, vdavydov, mingli199x, minchan, Anaczkowski, Lukasz,
	Shutemov, Kirill

On Thu 16-06-16 18:08:57, Odzioba, Lukasz wrote:
> On Thu 09-06-16 02:22 PM, Michal Hocko wrote:
> > I agree it would be better to do the same for others as well. Even if
> > this is not an immediate problem for those.
> 
> I am not able to find clear reasons why we shouldn't do it for the rest.
> Ok so what do we do now? I'll send v2 with proposed changes.
> Then do we still want  to have stats on those pvecs?
> In my opinion it's not worth it now.

I think the fix has a higher priority - we also want to backport it to
stable trees IMO. We can discuss the stats and how to present them
later.
-- 
Michal Hocko
SUSE Labs


* RE: [PATCH 1/1] mm/swap.c: flush lru_add pvecs on compound page arrival
  2016-06-16 18:19           ` Michal Hocko
@ 2016-06-16 20:03             ` Odzioba, Lukasz
  0 siblings, 0 replies; 13+ messages in thread
From: Odzioba, Lukasz @ 2016-06-16 20:03 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Hansen, Dave, linux-kernel, linux-mm, akpm, kirill.shutemov,
	aarcange, vdavydov, mingli199x, minchan, Anaczkowski, Lukasz,
	Shutemov, Kirill

On Thu 16-06-16 08:19 PM, Michal Hocko wrote:
>
> On Thu 16-06-16 18:08:57, Odzioba, Lukasz wrote:
> > I am not able to find clear reasons why we shouldn't do it for the rest.
> > Ok so what do we do now? I'll send v2 with proposed changes.
> > Then do we still want  to have stats on those pvecs?
> > In my opinion it's not worth it now.
>
> I think the fix has a higher priority - we also want to backport it to
> stable trees IMO. We can discuss the stats and how to present them
> later.

I will send the patch tomorrow. In the meantime I was able to trigger a
similar problem on lru_deactivate by using MADV_FREE:

LRU_add              588 =    18704kB
LRU_rotate             0 =        0kB
LRU_deactivate       165 =   309304kB
LRU_deact_file         0 =        0kB
LRU_activate           0 =        0kB

Thanks,
Lukas
