* memory.force_empty is deprecated
@ 2016-11-04  8:24 Zhao Hui Ding
  2016-11-04 15:21 ` Johannes Weiner
  0 siblings, 1 reply; 9+ messages in thread
From: Zhao Hui Ding @ 2016-11-04  8:24 UTC (permalink / raw)
  To: linux-mm

Hello,

I'm Zhaohui from the IBM Spectrum LSF development team. I got the message
below when running LSF on SUSE11.4, so I would like to share our use case
and ask for suggestions on how to do without memory.force_empty.

memory.force_empty is deprecated and will be removed. Let us know if it is 
needed in your usecase at linux-mm@kvack.org

LSF is a batch workload scheduler. It uses cgroups for resource
enforcement and accounting of batch jobs. For each job, LSF creates a
cgroup directory and puts the job's PIDs into the cgroup.

When we implemented the LSF cgroup integration, we found that creating a
new cgroup is much slower than renaming an existing one: hundreds of
milliseconds versus less than 10 milliseconds.
To speed up job cleanup, LSF doesn't delete the cgroup when a job is done.
Instead, LSF resets the memory usage by writing "0" to memory.force_empty.
The subsequent job renames the cgroup and reuses it.
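
For illustration, the reuse workflow described above could be sketched
roughly as follows. This is not LSF's actual code: the CG_ROOT path, the
job names, and the recycle_cgroup helper are assumptions, and the knob is
the cgroup v1 memory.force_empty file.

```shell
#!/bin/sh
# Sketch of the "recycle instead of recreate" workflow (cgroup v1).
# CG_ROOT is a hypothetical parent directory for per-job cgroups.
CG_ROOT="${CG_ROOT:-/sys/fs/cgroup/memory/lsf}"

# recycle_cgroup OLD_JOB NEW_JOB
# Drop the finished job's leftover charges, then rename the directory
# so the next job reuses it instead of creating a new cgroup.
recycle_cgroup() {
    old="$CG_ROOT/$1"
    new="$CG_ROOT/$2"

    # cgroupfs files report size 0, so test the content, not the size.
    if [ -n "$(cat "$old/tasks")" ]; then
        echo "cgroup $1 still has tasks" >&2
        return 1
    fi

    # Ask the kernel to reclaim pages still charged to the empty cgroup.
    echo 0 > "$old/memory.force_empty" || return 1

    # Renaming the existing directory is the cheap step.
    mv "$old" "$new"
}
```

A call such as `recycle_cgroup job1 job2` would be issued once job1 has
finished and before job2's PIDs are written to the "tasks" file.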

If memory.force_empty is removed, how can we achieve the same goal?

Looking forward to your reply.

Thanks & Regards,

Zhaohui Ding (丁肇辉), Ph.D
Senior Product Architect, IBM Platform LSF Product Line
IBM China Systems and Technology Laboratory in Beijing
Addr: Building 28, ZhongGuanCun Software Park, No.8 Dong Bei Wang West 
Road
Office : (86-10) 82450903   Mobile : (86) 186-1198-2179



* Re: memory.force_empty is deprecated
  2016-11-04  8:24 memory.force_empty is deprecated Zhao Hui Ding
@ 2016-11-04 15:21 ` Johannes Weiner
  2016-11-17 10:13   ` Zhao Hui Ding
  2016-11-17 10:39   ` Balbir Singh
  0 siblings, 2 replies; 9+ messages in thread
From: Johannes Weiner @ 2016-11-04 15:21 UTC (permalink / raw)
  To: Zhao Hui Ding; +Cc: Tejun Heo, cgroups, linux-mm

Hi,

On Fri, Nov 04, 2016 at 04:24:25PM +0800, Zhao Hui Ding wrote:
> Hello,
>
> I'm Zhaohui from the IBM Spectrum LSF development team. I got the message
> below when running LSF on SUSE11.4, so I would like to share our use case
> and ask for suggestions on how to do without memory.force_empty.
>
> memory.force_empty is deprecated and will be removed. Let us know if it is
> needed in your usecase at linux-mm@kvack.org
>
> LSF is a batch workload scheduler. It uses cgroups for resource
> enforcement and accounting of batch jobs. For each job, LSF creates a
> cgroup directory and puts the job's PIDs into the cgroup.
>
> When we implemented the LSF cgroup integration, we found that creating a
> new cgroup is much slower than renaming an existing one: hundreds of
> milliseconds versus less than 10 milliseconds.

Cgroup creation/deletion is not expected to be an ultra-hot path, but
I'm surprised it takes longer than actually reclaiming leftover pages.

By the time the jobs conclude, how much is usually left in the group?

That said, is it even necessary to pro-actively remove the leftover
cache from the group before starting the next job? Why not leave it
for the next job to reclaim it lazily should memory pressure arise?
It's easy to reclaim page cache, and the first to go as it's behind
the next job's memory on the LRU list.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* Re: memory.force_empty is deprecated
  2016-11-04 15:21 ` Johannes Weiner
@ 2016-11-17 10:13   ` Zhao Hui Ding
  2016-11-22 15:20       ` Michal Hocko
  2016-11-17 10:39   ` Balbir Singh
  1 sibling, 1 reply; 9+ messages in thread
From: Zhao Hui Ding @ 2016-11-17 10:13 UTC (permalink / raw)
  To: Johannes Weiner; +Cc: cgroups, linux-mm, Tejun Heo, Xue Bin Min

Thanks for the reply.

When the job has finished, "tasks" is empty, but "memory.stat" still
shows leftover cache (inactive_file and active_file):
# cat tasks
# cat memory.stat 
cache 81920
rss 0
mapped_file 0
pgpgin 9440
pgpgout 9420
swap 0
inactive_anon 0
active_anon 0
inactive_file 77824
active_file 4096
unevictable 0
hierarchical_memory_limit 9223372036854775807
hierarchical_memsw_limit 9223372036854775807
total_cache 81920
total_rss 0
total_mapped_file 0
total_pgpgin 9440
total_pgpgout 9420
total_swap 0
total_inactive_anon 0
total_active_anon 0
total_inactive_file 77824
total_active_file 4096
total_unevictable 0

After writing 0 to memory.force_empty, the cache is gone:
# echo 0 > memory.force_empty 
# cat memory.stat 
cache 0
rss 0
mapped_file 0
pgpgin 9440
pgpgout 9440
swap 0
inactive_anon 0
active_anon 0
inactive_file 0
active_file 0
unevictable 0
hierarchical_memory_limit 9223372036854775807
hierarchical_memsw_limit 9223372036854775807
total_cache 0
total_rss 0
total_mapped_file 0
total_pgpgin 9440
total_pgpgout 9440
total_swap 0
total_inactive_anon 0
total_active_anon 0
total_inactive_file 0
total_active_file 0
total_unevictable 0
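
As a cross-check of the two listings above: pgpgin minus pgpgout gives
9440 - 9420 = 20 pages still charged before the write, and 20 pages of
4096 bytes is 81920 bytes, which matches both the cache value and
inactive_file + active_file (77824 + 4096). After the write, pgpgout has
caught up with pgpgin and the difference is 0. The sketch below does the
same arithmetic; the 4096-byte page size and the helper names are
assumptions, and it only reads a cgroup v1 memory.stat file.

```shell
#!/bin/sh
# charged_bytes STAT_FILE [PAGE_SIZE]
# Estimate the bytes still charged from the pgpgin/pgpgout counters.
# PAGE_SIZE defaults to 4096, which is an assumption about the machine.
charged_bytes() {
    awk -v page="${2:-4096}" '
        $1 == "pgpgin"  { in_pages = $2 }
        $1 == "pgpgout" { out_pages = $2 }
        END { print (in_pages - out_pages) * page }
    ' "$1"
}

# stat_value STAT_FILE KEY
# Print the value of one memory.stat field (exact key match).
stat_value() {
    awk -v key="$2" '$1 == key { print $2 }' "$1"
}
```

Comparing `charged_bytes memory.stat` with `stat_value memory.stat cache`
before and after the write is a quick way to confirm that nothing is
left charged to the group.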

We cannot leave it to be reclaimed lazily, because when a new job reuses
the cgroup, "cache" is not cleaned automatically.
We need a mechanism that cleans memory.stat.

Thanks & Regards,
--Zhaohui




* Re: memory.force_empty is deprecated
  2016-11-04 15:21 ` Johannes Weiner
  2016-11-17 10:13   ` Zhao Hui Ding
@ 2016-11-17 10:39   ` Balbir Singh
  2016-11-18  6:28     ` Zhao Hui Ding
  1 sibling, 1 reply; 9+ messages in thread
From: Balbir Singh @ 2016-11-17 10:39 UTC (permalink / raw)
  To: Johannes Weiner, Zhao Hui Ding; +Cc: Tejun Heo, cgroups, linux-mm



On 05/11/16 02:21, Johannes Weiner wrote:
> Hi,
>
> On Fri, Nov 04, 2016 at 04:24:25PM +0800, Zhao Hui Ding wrote:
>> Hello,
>>
>> I'm Zhaohui from the IBM Spectrum LSF development team. I got the message
>> below when running LSF on SUSE11.4, so I would like to share our use case
>> and ask for suggestions on how to do without memory.force_empty.
>>
>> memory.force_empty is deprecated and will be removed. Let us know if it is
>> needed in your usecase at linux-mm@kvack.org
>>
>> LSF is a batch workload scheduler. It uses cgroups for resource
>> enforcement and accounting of batch jobs. For each job, LSF creates a
>> cgroup directory and puts the job's PIDs into the cgroup.
>>
>> When we implemented the LSF cgroup integration, we found that creating a
>> new cgroup is much slower than renaming an existing one: hundreds of
>> milliseconds versus less than 10 milliseconds.
> 

We added force_empty a long time back so that we could force delete
cgroups. There was no definitive way of removing references to the cgroup
from page_cgroup otherwise.

> Cgroup creation/deletion is not expected to be an ultra-hot path, but
> I'm surprised it takes longer than actually reclaiming leftover pages.
> 
> By the time the jobs conclude, how much is usually left in the group?
> 
> That said, is it even necessary to pro-actively remove the leftover
> cache from the group before starting the next job? Why not leave it
> for the next job to reclaim it lazily should memory pressure arise?
> It's easy to reclaim page cache, and the first to go as it's behind
> the next job's memory on the LRU list.

It might actually make sense to migrate all tasks out and check what
the leftovers look like -- should be easy to reclaim. Also be mindful
if you are using v1 and have use_hierarchy set.
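
The check suggested above could look roughly like this on a v1
hierarchy. It is a sketch under assumptions: the function names are
hypothetical, the caller supplies both cgroup directories, and only the
"tasks" and "memory.stat" files are touched.

```shell
#!/bin/sh
# drain_tasks SRC_CGROUP DST_CGROUP
# Move every task listed in SRC's "tasks" file into DST, one ID per write.
drain_tasks() {
    # Snapshot the list first, since it changes as tasks are moved.
    pids="$(cat "$1/tasks")" || return 1
    for pid in $pids; do
        # A task may exit before we get to it; ignore that failure.
        echo "$pid" > "$2/tasks" 2>/dev/null || true
    done
}

# leftovers CGROUP
# Print what is still charged to the group after the tasks have left.
leftovers() {
    awk '$1 == "cache" || $1 == "rss" || $1 == "swap" { print $1, $2 }' \
        "$1/memory.stat"
}
```

Running `drain_tasks "$job" "$parent"` followed by `leftovers "$job"`
shows whether anything besides page cache remains in the group.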

Balbir Singh.


* Re: memory.force_empty is deprecated
  2016-11-17 10:39   ` Balbir Singh
@ 2016-11-18  6:28     ` Zhao Hui Ding
  2016-11-18 22:08       ` Balbir Singh
  2016-11-22 15:21       ` Michal Hocko
  0 siblings, 2 replies; 9+ messages in thread
From: Zhao Hui Ding @ 2016-11-18  6:28 UTC (permalink / raw)
  To: Balbir Singh; +Cc: cgroups, Johannes Weiner, linux-mm, Tejun Heo

Thank you. 
Do you mean memory.force_empty won't be deprecated and removed?

Regards,
--Zhaohui




* Re: memory.force_empty is deprecated
  2016-11-18  6:28     ` Zhao Hui Ding
@ 2016-11-18 22:08       ` Balbir Singh
  2016-11-22 15:21       ` Michal Hocko
  1 sibling, 0 replies; 9+ messages in thread
From: Balbir Singh @ 2016-11-18 22:08 UTC (permalink / raw)
  To: Zhao Hui Ding; +Cc: cgroups, Johannes Weiner, linux-mm, Tejun Heo

On Fri, Nov 18, 2016 at 5:28 PM, Zhao Hui Ding <dingzhh@cn.ibm.com> wrote:
> Thank you.
> Do you mean memory.force_empty won't be deprecated and removed?
>
> Regards,
> --Zhaohui

No, I am not implying that. That is decided by the maintainers.

Balbir Singh.


* Re: memory.force_empty is deprecated
@ 2016-11-22 15:20       ` Michal Hocko
  0 siblings, 0 replies; 9+ messages in thread
From: Michal Hocko @ 2016-11-22 15:20 UTC (permalink / raw)
  To: Zhao Hui Ding; +Cc: Johannes Weiner, cgroups, linux-mm, Tejun Heo, Xue Bin Min

On Thu 17-11-16 18:13:18, Zhao Hui Ding wrote:
[...]
> We cannot leave it to be reclaimed lazily, because when a new job reuses
> the cgroup, "cache" is not cleaned automatically.
> We need a mechanism that cleans memory.stat.

Could you clarify why, please?
-- 
Michal Hocko
SUSE Labs

* Re: memory.force_empty is deprecated
  2016-11-18  6:28     ` Zhao Hui Ding
  2016-11-18 22:08       ` Balbir Singh
@ 2016-11-22 15:21       ` Michal Hocko
  1 sibling, 0 replies; 9+ messages in thread
From: Michal Hocko @ 2016-11-22 15:21 UTC (permalink / raw)
  To: Zhao Hui Ding; +Cc: Balbir Singh, cgroups, Johannes Weiner, linux-mm, Tejun Heo

On Fri 18-11-16 14:28:21, Zhao Hui Ding wrote:
> Thank you. 
> Do you mean memory.force_empty won't be deprecated and removed?

The knob will most likely stay in the v1 memcg user api. The warning is
mostly to inform users that it will not be added to the v2 api unless
there is a strong usecase.
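
Since the knob is only expected on v1, a tool that has to run on both
hierarchies could probe for it instead of assuming it exists. The sketch
below is one possible shape, not a recommendation from this thread: the
function name is hypothetical, and the fallback is simply the slower
delete-and-recreate path mentioned at the start of the discussion.

```shell
#!/bin/sh
# reset_or_recreate CGROUP_DIR
# Use memory.force_empty when the file is present (cgroup v1);
# otherwise fall back to removing and recreating the cgroup directory.
reset_or_recreate() {
    cg="$1"
    if [ -e "$cg/memory.force_empty" ]; then
        echo 0 > "$cg/memory.force_empty"
    else
        # rmdir only succeeds once no tasks are left in the group.
        rmdir "$cg" && mkdir "$cg"
    fi
}
```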
-- 
Michal Hocko
SUSE Labs
